Date: 2022-12-06 06:01 pm (UTC)
From: [personal profile] dmm
'Deep Learning is notoriously an ad-hoc field. Despite its tremendous success, we lack a unifying perspective for this growing body of work. We have entire paradigms of how to learn effectively, but it's still hard to state precisely what a neural network is in a way that covers all the use cases. This is the contribution of our paper. We're taking a step forward in that regard by creating a foundation for neural networks in terms of three things: 1) Parameterized maps, 2) Bidirectional data structures (Lenses/Optics) and 3) Reverse derivative categories.
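
[Aside: a minimal sketch of the first two ingredients in plain Python; the names ParaLens and dense are illustrative, not from the paper.]

    from dataclasses import dataclass
    from typing import Any, Callable, Tuple

    # A lens pairs a forward map with a backward map that carries a
    # "change" on the output back to a "change" on the input.
    @dataclass
    class Lens:
        get: Callable[[Any], Any]        # forward: X -> Y
        put: Callable[[Any, Any], Any]   # backward: (X, dY) -> dX

    # A parameterized map is a map P x X -> Y; a network layer is then a
    # parameterized lens: a forward pass plus a reverse derivative for
    # both the input and the parameters.
    @dataclass
    class ParaLens:
        get: Callable[[Any, Any], Any]                   # (P, X) -> Y
        put: Callable[[Any, Any, Any], Tuple[Any, Any]]  # (P, X, dY) -> (dP, dX)

    # Example: a one-dimensional "dense layer" y = p * x.
    dense = ParaLens(
        get=lambda p, x: p * x,
        put=lambda p, x, dy: (x * dy, p * dy),  # reverse derivative: (dP, dX)
    )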

Since our work is based on category theory, you might wonder what the aforementioned concepts are, what category theory even is, or why you would want to abstract away details of neural networks at all. These are questions that deserve a proper answer. For now I'll just say that our paper answers the following question in a very precise way: "What is the minimal structure, in some suitable sense, that you need in order to perform learning?" This is certainly valuable. Why? If you try answering that question you might discover, just as we did, that this structure ends up encapsulating some strange types of learning, with hints of even meta-learning. For instance, after defining our framework for neural networks on Euclidean spaces, we realized that it covers learning not just on Euclidean spaces, but also on Boolean circuits. This is pretty strange: how can you "differentiate" a Boolean circuit? It turns out you can, and this falls under the same framework of reverse derivative categories.
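
[Aside: a hedged sketch to make the Boolean-circuit claim concrete. Treating gates as polynomials over Z2 (XOR is addition, AND is multiplication), the reverse derivative is the transposed Jacobian with all arithmetic taken mod 2, as in the reverse derivative categories literature; the function names here are mine, not the paper's.]

    # Reverse derivatives of the basic gates, viewed as polynomials over Z2.
    def r_xor(a: int, b: int, dy: int) -> tuple:
        # d(a + b)/da = 1 and d(a + b)/db = 1 over Z2
        return (dy % 2, dy % 2)

    def r_and(a: int, b: int, dy: int) -> tuple:
        # d(a * b)/da = b and d(a * b)/db = a over Z2
        return ((b * dy) % 2, (a * dy) % 2)

    # "Differentiating" the circuit f(a, b) = AND(a, XOR(a, b)) by the
    # chain rule, accumulating fan-out contributions mod 2:
    def r_circuit(a: int, b: int, dy: int) -> tuple:
        m = (a + b) % 2               # forward: the intermediate XOR value
        da1, dm = r_and(a, m, dy)     # backward through AND
        da2, db = r_xor(a, b, dm)     # backward through XOR
        return ((da1 + da2) % 2, db)  # the two paths into `a` add mod 2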

Another thing we discovered is that all the optimizers (standard gradient descent, momentum, Nesterov momentum, Adagrad, Adam, etc.) are the same kind of structure as neural networks themselves, giving us hints that optimizers are in some sense "hardwired meta-learners", just as "Learning to Learn by Gradient Descent by Gradient Descent" describes.
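
[Aside: a hedged illustration of "optimizers have the same shape as networks": plain gradient descent and momentum both fit the forward/backward lens pattern sketched above, with the backward pass doing the parameter update; the learning rate and momentum coefficient below are arbitrary example values.]

    # Gradient descent as a bidirectional map on parameters: the forward
    # pass exposes the current parameters, the backward pass consumes a
    # gradient and produces updated parameters.
    def sgd(lr: float = 0.1):
        def get(p):
            return p
        def put(p, dp):
            return p - lr * dp
        return get, put

    # Momentum carries extra state (the velocity v) but keeps the same
    # forward/backward shape, so it composes just like a layer does.
    def momentum(lr: float = 0.1, beta: float = 0.9):
        def get(state):
            p, _v = state
            return p
        def put(state, dp):
            p, v = state
            v_new = beta * v + dp
            return (p - lr * v_new, v_new)
        return get, put

    # Usage: one update step on a scalar parameter.
    get, put = momentum()
    state = (1.0, 0.0)       # (parameter, velocity)
    state = put(state, 0.5)  # apply gradient dp = 0.5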

Of course, I still haven't told you what this framework is, nor how we define neural networks in it. I'll do that briefly now.'