I am reading this paper: arxiv.org/abs/2104.04657
"In this paper, we introduce a new type of generalized neural network where neurons and synapses maintain multiple states. We show that classical gradient-based backpropagation in neural networks can be seen as a special case of a two-state network where one state is used for activations and another for gradients, with update rules derived from the chain rule. In our generalized framework, networks have neither explicit notion of nor ever receive gradients. The synapses and neurons are updated using a bidirectional Hebb-style update rule parameterized by a shared low-dimensional "genome". We show that such genomes can be meta-learned from scratch, using either conventional optimization techniques, or evolutionary strategies, such as CMA-ES. Resulting update rules generalize to unseen tasks and train faster than gradient descent based optimizers for several standard computer vision and synthetic tasks."
"In this paper, we introduce a new type of generalized neural network where neurons and synapses maintain multiple states. We show that classical gradient-based backpropagation in neural networks can be seen as a special case of a two-state network where one state is used for activations and another for gradients, with update rules derived from the chain rule. In our generalized framework, networks have neither explicit notion of nor ever receive gradients. The synapses and neurons are updated using a bidirectional Hebb-style update rule parameterized by a shared low-dimensional "genome". We show that such genomes can be meta-learned from scratch, using either conventional optimization techniques, or evolutionary strategies, such as CMA-ES. Resulting update rules generalize to unseen tasks and train faster than gradient descent based optimizers for several standard computer vision and synthetic tasks."
no subject
Date: 2021-04-26 08:27 pm (UTC)

"There are many interesting directions for future exploration. Perhaps, the most important one is the question of scale. Here one intriguing direction is the connection between the number of states and the learning capabilities. Another possible approach is extending the space of update rules, such as allowing injection of randomness for robustness, or providing an ability for neurons to self-regulate based on current state. Finally the ability to extend existing genomes to produce ever better learners, might help us scale even further. Another intriguing direction is incorporating the weight updates on both forward and backward passes. The former can be seen as a generalization of unsupervised learning, thus merging both supervised and unsupervised learning in one gradient-free framework."
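On "extending existing genomes to produce ever better learners": the outer loop is just black-box search over a low-dimensional genome vector. Below is a minimal sketch of that shape, with a plain isotropic evolution strategy standing in for the CMA-ES mentioned in the abstract, and with inner_train_and_eval as a purely hypothetical placeholder (in a real run it would unroll the Hebb-style update rule under the candidate genome on a task and return a score to maximize).

import numpy as np

rng = np.random.default_rng(0)
GENOME_DIM = 16              # low-dimensional genome, as described in the abstract
POP, ELITE, SIGMA = 32, 8, 0.1

def inner_train_and_eval(genome_vec: np.ndarray) -> float:
    # Placeholder fitness; replace with: build a network, run a few learning
    # steps under this genome's update rule, return e.g. validation accuracy.
    return -np.sum((genome_vec - 1.0) ** 2)

mean = np.zeros(GENOME_DIM)
for gen in range(50):
    pop = mean + SIGMA * rng.normal(size=(POP, GENOME_DIM))        # sample candidate genomes
    fitness = np.array([inner_train_and_eval(g) for g in pop])
    elite = pop[np.argsort(fitness)[-ELITE:]]                      # keep the best genomes
    mean = elite.mean(axis=0)                                      # move the search distribution
print("best fitness:", fitness.max())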
no subject
Date: 2021-04-26 08:54 pm (UTC)

When we see this kind of bootstrapping ("eating one's own dog food"), it will be a sign that the field of meta-learning is maturing.