(Of course, anyone who has looked inside something like an ADAM optimizer knows that it is a compact neural-like machine with a non-standard architecture, so various metalearning things can be done with it.
I looked at it earlier this year, because I had to rewrite one so that it would work on tree-like structures.
Here is my rather crude and non-idiomatic rewrite:
https://github.com/anhinga/julia-flux-drafts/blob/main/arxiv-1606-09470-section3/May-August-2022/v0-1/TreeADAM.jl
)
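(To give a flavor of what such a rewrite involves, here is a minimal sketch, not the code from the linked TreeADAM.jl, of the standard Adam update rule applied recursively over a nested, tree-like parameter structure. The names AdamLeaf, init_state, and adam_step!, and the use of Dicts for the tree nodes, are made up for this illustration.

```julia
# Adam state for a single leaf (one array of parameters).
struct AdamLeaf
    m::Array{Float64}   # first-moment estimate
    v::Array{Float64}   # second-moment estimate
end

AdamLeaf(p::AbstractArray) = AdamLeaf(zero(p), zero(p))

# Build a state tree mirroring the shape of the parameter tree.
init_state(p::AbstractArray) = AdamLeaf(p)
init_state(p::AbstractDict)  = Dict(k => init_state(v) for (k, v) in p)

# One Adam step on a leaf: the usual bias-corrected update.
function adam_step!(p::AbstractArray, g::AbstractArray, s::AdamLeaf, t;
                    lr=1e-3, β1=0.9, β2=0.999, ϵ=1e-8)
    @. s.m = β1 * s.m + (1 - β1) * g
    @. s.v = β2 * s.v + (1 - β2) * g^2
    m̂ = s.m ./ (1 - β1^t)
    v̂ = s.v ./ (1 - β2^t)
    @. p -= lr * m̂ / (sqrt(v̂) + ϵ)
    return p
end

# One Adam step on an inner node: recurse over matching branches.
function adam_step!(p::AbstractDict, g::AbstractDict, s::AbstractDict, t; kwargs...)
    for k in keys(p)
        adam_step!(p[k], g[k], s[k], t; kwargs...)
    end
    return p
end

# Usage: a small tree of parameters and a matching tree of gradients.
# In practice the gradients would come from a framework like Flux/Zygote each step;
# fixed random gradients are used here only to keep the example self-contained.
params = Dict(:layer1 => randn(3, 3), :nested => Dict(:bias => randn(3)))
grads  = Dict(:layer1 => randn(3, 3), :nested => Dict(:bias => randn(3)))
state  = init_state(params)
for t in 1:10
    adam_step!(params, grads, state, t)
end
```

The point of writing it this way is that the leaf update is exactly ordinary Adam, while the tree structure is handled by one extra recursive method; the actual rewrite in the repository above is organized differently and is admittedly cruder.)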