dmm: (Default)
I have been looking at a recent, rather remarkable paper which includes the DeepDream creator among its authors, and I decided to check whether I had missed any of his works; it turns out there is this paper I really should be aware of. It really resonates with some of the things I have been exploring this year.


arxiv.org/abs/2007.00970

"We present a novel method for learning the weights of an artificial neural network - a Message Passing Learning Protocol (MPLP). In MPLP, we abstract every operations occurring in ANNs as independent agents. Each agent is responsible for ingesting incoming multidimensional messages from other agents, updating its internal state, and generating multidimensional messages to be passed on to neighbouring agents. We demonstrate the viability of MPLP as opposed to traditional gradient-based approaches on simple feed-forward neural networks, and present a framework capable of generalizing to non-traditional neural network architectures. MPLP is meta learned using end-to-end gradient-based meta-optimisation. We further discuss the observed properties of MPLP and hypothesize its applicability on various fields of deep learning."

dmm: (Default)
When one tries to use category theory for applied work, a number of questions arise: Is it simply too difficult for me to use at all, given my level of technical skill? Is it fruitful enough, and is the fruitfulness-to-effort ratio high enough for all this to make sense?

I recently discovered Bruno Gavranović, a graduate student in Glasgow, whose work is promising in this sense. He is really trying hard to keep things simple while also making sure that there are non-trivial applications. Here is one of his essays and papers (March 2021, so not the most recent, but probably the most central):

www.brunogavranovic.com/posts/2021-03-03-Towards-Categorical-Foundations-Of-Neural-Networks.html

(I am posting this here because some people who read this blog are interested in applied category theory and like it, not because I am trying to convince those who have formed a negative opinion of the subject. I am non-committal myself: I have not decided whether applied category theory has a strong enough fruitfulness-to-effort ratio, but this particular entry seems to be one of the best shots in this sense, so I am going to try to go deeper into his work.)

Update: his collection of papers at the intersection of Category Theory and Machine Learning: github.com/bgavran/Category_Theory_Machine_Learning
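
For readers who want a concrete feel for what the essay is about: as I understand it, the central gadget is the parametric lens, a forward map paired with a backward map that returns gradients for both the parameters and the input, with layers composing by plugging these pairs together. Here is a minimal sketch of that composition (the names and the toy scalar layer are mine, not Gavranović's code); backpropagation through the composite then falls out of the composition rule.

from dataclasses import dataclass
from typing import Callable

@dataclass
class Lens:
    """A parametric lens: forward(p, x) -> y, and
    backward(p, x, dy) -> (dp, dx).  Layer composition is lens composition."""
    forward: Callable
    backward: Callable

def compose(f: Lens, g: Lens) -> Lens:
    """Sequential composition (run f, then g); parameters are paired."""
    def fwd(params, x):
        p_f, p_g = params
        return g.forward(p_g, f.forward(p_f, x))
    def bwd(params, x, dy):
        p_f, p_g = params
        y_mid = f.forward(p_f, x)
        dp_g, d_mid = g.backward(p_g, y_mid, dy)
        dp_f, dx = f.backward(p_f, x, d_mid)
        return (dp_f, dp_g), dx
    return Lens(fwd, bwd)

# Example: a scalar "linear layer" y = p * x as a lens.
linear = Lens(
    forward=lambda p, x: p * x,
    backward=lambda p, x, dy: (dy * x, dy * p),  # (dp, dx)
)
two_layers = compose(linear, linear)
y = two_layers.forward((2.0, 3.0), 5.0)              # 3 * (2 * 5) = 30
(dp1, dp2), dx = two_layers.backward((2.0, 3.0), 5.0, 1.0)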
dmm: (Default)
Another important paper from one of François Fleuret's collaborations: arxiv.org/abs/2209.00588

Previous important papers include "Transformers are RNNs: Fast Autoregressive Transformers with Linear Attention", arxiv.org/abs/2006.16236, and "Flatten the Curve: Efficiently Training Low-Curvature Neural Networks", arxiv.org/abs/2206.07144.
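
For context, the "Transformers are RNNs" paper replaces softmax attention with a kernel feature map φ (they use elu(x) + 1), so attention becomes φ(Q)(φ(K)ᵀV) with a matching normaliser and can be computed in time linear in sequence length; in the causal case the φ(K)ᵀV summary can be maintained as a running state, which is the RNN view. A rough sketch of the non-causal version (the feature map and formula follow the paper, but the code itself is my own):

import numpy as np

def phi(x):
    # Feature map from the paper: elu(x) + 1 (keeps values positive).
    return np.where(x > 0, x + 1.0, np.exp(x))

def linear_attention(Q, K, V, eps=1e-6):
    """O(n*d^2) attention: softmax(QK^T)V is replaced by phi(Q)(phi(K)^T V),
    normalised by phi(Q)(phi(K)^T 1).  Q, K: (n, d);  V: (n, d_v)."""
    Qp, Kp = phi(Q), phi(K)
    kv = Kp.T @ V                      # (d, d_v) summary of keys and values
    z = Kp.sum(axis=0)                 # (d,) normaliser
    return (Qp @ kv) / (Qp @ z + eps)[:, None]

# Tiny sanity check with random inputs.
rng = np.random.default_rng(0)
Q, K, V = rng.normal(size=(8, 4)), rng.normal(size=(8, 4)), rng.normal(size=(8, 3))
out = linear_attention(Q, K, V)        # shape (8, 3)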
dmm: (Default)
Andrej Karpathy reproduces and explores the first convolutional neural net (1989):

karpathy.github.io/2022/03/14/lecun1989/
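
For reference, the network reproduced there (LeCun et al., 1989) is tiny by modern standards: a 16×16 input, two 12-channel convolutional layers with 5×5 kernels and stride 2, a 30-unit hidden layer, 10 outputs, and tanh activations. A rough PyTorch approximation (it ignores the original's hand-designed sparse connectivity between the two convolutional layers and its scaled tanh):

import torch
import torch.nn as nn

class Net1989(nn.Module):
    """Approximation of the 1989 zip-code net:
    16x16 input -> 12@8x8 -> 12@4x4 -> 30 -> 10, tanh activations."""
    def __init__(self):
        super().__init__()
        self.h1 = nn.Conv2d(1, 12, kernel_size=5, stride=2, padding=2)
        self.h2 = nn.Conv2d(12, 12, kernel_size=5, stride=2, padding=2)
        self.h3 = nn.Linear(12 * 4 * 4, 30)
        self.out = nn.Linear(30, 10)

    def forward(self, x):                # x: (batch, 1, 16, 16)
        x = torch.tanh(self.h1(x))       # -> (batch, 12, 8, 8)
        x = torch.tanh(self.h2(x))       # -> (batch, 12, 4, 4)
        x = torch.tanh(self.h3(x.flatten(1)))
        return self.out(x)

y = Net1989()(torch.zeros(2, 1, 16, 16))  # -> shape (2, 10)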

dmm: (Default)
It turns out that this (the "Deep Learning for AI" Turing Lecture paper) is open access: dl.acm.org/doi/10.1145/3448250

The nuances in that lecture are very interesting and shed light on the disagreement between Hinton et al. and Schmidhuber et al. (this one is written from the Hinton et al. side, obviously; their emphasis is that technical aspects are equally important and not subservient to "pioneering theory"; e.g., a number of relatively recent pre-2012 developments, such as the practical understanding of the role of ReLU, are what made the AlexNet breakthrough possible, and things like "the very efficient use of multiple GPUs by Alex Krizhevsky" are also key, not just the neural architecture ideas).

There is a whole section on Transformers; I am going to include it in the comments verbatim.

The journal publication is dated July 2021, and there are references in the paper newer than 2018; I don't know how heavily the text itself has been edited since 2018.
dmm: (Default)
I am very interested in sparsity in neural nets, and I am super happy that the community around sparsity is growing; the first workshop on the subject has just taken place.

The key organizer is D. C. Mocanu; I did an experimental PyTorch project building upon his work a couple of years ago.
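
For context, Mocanu's best-known line of work here is Sparse Evolutionary Training (SET): layers stay sparse throughout training, and every epoch a fraction of the smallest-magnitude weights is dropped while the same number of connections is regrown at random positions. A minimal sketch of one such prune-and-regrow step on a dense weight tensor plus a 0/1 mask (my own simplification, not his code; it does not even exclude just-pruned positions from regrowth):

import torch

def set_prune_and_regrow(weight, mask, drop_fraction=0.3):
    """One SET-style topology update on a sparse layer (modifies tensors in place).
    weight, mask: tensors of the same shape; mask holds 0/1 values."""
    active = mask.bool()
    n_drop = int(drop_fraction * active.sum().item())
    if n_drop == 0:
        return weight, mask

    # 1. Prune the n_drop active weights with the smallest magnitude.
    magnitudes = weight.abs().masked_fill(~active, float("inf"))
    drop_idx = torch.topk(magnitudes.flatten(), n_drop, largest=False).indices
    mask.view(-1)[drop_idx] = 0.0
    weight.view(-1)[drop_idx] = 0.0

    # 2. Regrow the same number of connections at random empty positions.
    empty_idx = (mask.view(-1) == 0).nonzero(as_tuple=True)[0]
    grow_idx = empty_idx[torch.randperm(len(empty_idx))[:n_drop]]
    mask.view(-1)[grow_idx] = 1.0
    weight.view(-1)[grow_idx] = torch.randn(n_drop) * 0.01  # small re-init
    return weight, mask

# Usage: apply after each epoch, and multiply weights by the mask at every step.
w = torch.randn(64, 32)
m = (torch.rand(64, 32) < 0.1).float()   # roughly 10% dense layer
w = w * m
w, m = set_prune_and_regrow(w, m)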

I learned about this via the ML Collective reading group mailing list. The links are in the comments.
