When one tries to use category theory for applied work, a number of questions arise: Is it simply too difficult for me to use, given my level of technical skill? Is it fruitful enough, and is the ratio of fruitfulness to effort high enough for all this to make sense?
I recently discovered Bruno Gavranović, a graduate student in Glasgow, whose work is promising in this sense. They are trying hard to keep things simple while also making sure that there are non-trivial applications. Here is one of his essays and papers (March 2021, so it's not the most recent one, but probably the most central):
www.brunogavranovic.com/posts/2021-03-03-Towards-Categorical-Foundations-Of-Neural-Networks.html
(I am posting this here because some people who read this blog are interested in applied category theory and like it, not because I am trying to convince those who have formed a negative opinion of the subject. I am non-committal myself: I have not decided whether applied category theory has a high enough ratio of fruitfulness to effort, but this particular work seems to be one of the best shots in this sense, so I am going to try to go deeper into their work.)
Update: their collection of papers at the intersection of Category Theory and Machine Learning: github.com/bgavran/Category_Theory_Machine_Learning
no subject
Date: 2022-12-06 06:01 pm (UTC)
Since our work is based on category theory, you might wonder what the aforementioned concepts are, what category theory even is, or why you would even want to abstract away some details of neural networks. This is a question that deserves a proper answer. For now I’ll just say that our paper really answers the following question in a very precise way: “What is the minimal structure, in some suitable sense, that you need to have to perform learning?”. This is certainly valuable. Why? If you try answering that question you might discover, just as we did, that this structure ends up encapsulating some strange types of learning, with hints of even meta-learning. For instance, after defining our framework for neural networks on Euclidean spaces we realized that it includes learning not just in Euclidean spaces, but also on Boolean circuits. This is pretty strange: how can you “differentiate” a Boolean circuit? It turns out you can, and this falls under the same framework of Reverse derivative categories.
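A rough illustration of what "differentiating" a Boolean circuit could mean. This is only a sketch under an assumption: it uses the classical Boolean difference (flip one input bit and XOR the two outputs) as the partial derivative, and pulls an output change back to the inputs with AND. The paper's own definition may differ in detail, and the names `boolean_difference` and `reverse_derivative` are hypothetical, chosen for this example.

```python
def boolean_difference(f, i):
    """Partial 'derivative' of f in input i: f(x with bit i = 1) XOR f(x with bit i = 0)."""
    def df(bits):
        hi = list(bits)
        hi[i] = 1
        lo = list(bits)
        lo[i] = 0
        return f(hi) ^ f(lo)
    return df


def reverse_derivative(f, n):
    """Map (input bits, output change) to a change on each of the n inputs.

    Assumption: this plays the role of 'transpose Jacobian times delta' over bits.
    """
    def rf(bits, delta):
        return [boolean_difference(f, i)(bits) & delta for i in range(n)]
    return rf


def and_gate(bits):
    return bits[0] & bits[1]


def xor_gate(bits):
    return bits[0] ^ bits[1]


r_and = reverse_derivative(and_gate, 2)
r_xor = reverse_derivative(xor_gate, 2)

print(r_and([1, 0], 1))  # -> [0, 1]: only flipping the second input changes AND here
print(r_xor([1, 0], 1))  # -> [1, 1]: flipping either input changes XOR
```

For AND the result is the "product rule" one would expect (each input's sensitivity is the other input), which is at least suggestive of why the same framework could cover both cases.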
Another thing we discovered is that all the optimizers (standard gradient descent, momentum, Nesterov momentum, Adagrad, Adam etc.) are the same kind of structure neural networks themselves are - giving us hints that optimizers are in some sense “hardwired meta-learners”, just as Learning to Learn by Gradient Descent by Gradient Descent describes.
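To make the "same kind of structure" remark a little more concrete, here is a minimal sketch, not the paper's formal construction. It assumes an optimizer can be viewed as a map with its own internal state, `(state, param, grad) -> (new_state, new_param)`, so that plain gradient descent and momentum differ only in what state they carry. The `Optimizer` type and the `minimise` helper are hypothetical names introduced for this example.

```python
from typing import Callable, Tuple

# Hypothetical shape for an optimizer: (state, param, grad) -> (new_state, new_param)
Optimizer = Callable[[float, float, float], Tuple[float, float]]


def gradient_descent(lr: float) -> Optimizer:
    def step(state, param, grad):
        # Plain gradient descent carries no useful state; it is passed through unchanged.
        return state, param - lr * grad
    return step


def momentum(lr: float, gamma: float) -> Optimizer:
    def step(velocity, param, grad):
        # The state is a running velocity that accumulates past gradients.
        velocity = gamma * velocity + grad
        return velocity, param - lr * velocity
    return step


def minimise(opt: Optimizer, grad_fn, param: float, state: float = 0.0, steps: int = 200) -> float:
    """Run the optimizer for a fixed number of steps, threading its state through."""
    for _ in range(steps):
        state, param = opt(state, param, grad_fn(param))
    return param


def grad_fn(x):
    # Gradient of the toy loss (x - 3)^2, whose minimum is at x = 3.
    return 2.0 * (x - 3.0)


print(minimise(gradient_descent(lr=0.1), grad_fn, param=0.0))
print(minimise(momentum(lr=0.1, gamma=0.5), grad_fn, param=0.0))
```

Both optimizers plug into the same loop; the only difference is the state each one threads through, which is loosely analogous to a network layer carrying its own parameters.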
Of course, I still didn’t tell you what this framework is, nor did I tell you how we defined neural networks. I’ll do that briefly now.
no subject
Date: 2022-12-06 06:05 pm (UTC)
I looked at it earlier this year, because I had to rewrite one of the optimizers (ADAM) so that it would work on tree-like structures.
Here is my rather crude and non-idiomatic rewrite:
https://github.com/anhinga/julia-flux-drafts/blob/main/arxiv-1606-09470-section3/May-August-2022/v0-1/TreeADAM.jl
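For readers who do not use Julia, the idea of running ADAM over tree-like structures can be sketched in Python. This is not a translation of the linked file: it assumes "tree-like" means nested dictionaries with float leaves, uses the standard Adam update with bias correction, and the helpers `tree_map` and `adam_step` are hypothetical names for this example.

```python
import math


def tree_map(fn, *trees):
    """Apply fn leaf-wise to nested dicts that all share the same shape."""
    first = trees[0]
    if isinstance(first, dict):
        return {k: tree_map(fn, *(t[k] for t in trees)) for k in first}
    return fn(*trees)


def adam_step(params, grads, m, v, t, lr=0.001, beta1=0.9, beta2=0.999, eps=1e-8):
    """One Adam step on trees; t is the 1-based step count. Returns (params, m, v)."""
    # Update the running first and second moment estimates, leaf by leaf.
    m = tree_map(lambda m_, g: beta1 * m_ + (1 - beta1) * g, m, grads)
    v = tree_map(lambda v_, g: beta2 * v_ + (1 - beta2) * g * g, v, grads)

    def update(p, m_, v_):
        m_hat = m_ / (1 - beta1 ** t)  # bias-corrected first moment
        v_hat = v_ / (1 - beta2 ** t)  # bias-corrected second moment
        return p - lr * m_hat / (math.sqrt(v_hat) + eps)

    return tree_map(update, params, m, v), m, v


params = {"a": 1.0, "b": {"c": -2.0}}
grads = {"a": 0.5, "b": {"c": -4.0}}
zeros = tree_map(lambda _: 0.0, params)

new_params, m, v = adam_step(params, grads, zeros, zeros, t=1)
print(new_params)
```

On the first step each leaf moves by roughly `lr` in the direction opposite to the sign of its gradient, regardless of the gradient's magnitude, which is a quick sanity check for an Adam implementation.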