[personal profile] dmm
It turns out that this is open-access: dl.acm.org/doi/10.1145/3448250

The nuances in that lecture are very interesting and shed some light on the disagreement between Hinton et al. and Schmidhuber et al. (this account is written from the Hinton et al. side, obviously). Their emphasis is that technical aspects are equally important, not subservient to "pioneering theory": for example, a number of relatively recent pre-2012 developments, such as the practical understanding of the role of ReLU, are what made the AlexNet breakthrough possible, and things like "the very efficient use of multiple GPUs by Alex Krizhevsky" were also key, not just the neural architecture ideas.
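A minimal sketch of the ReLU point (not from the paper, just a standard NumPy illustration): saturating activations like the sigmoid have gradients that shrink toward zero for large inputs, while ReLU passes a unit gradient through for any positive input, which is part of why deep networks became much easier to train in practice.

```python
# Illustration (assumed example, not from the lecture): compare how much
# gradient each activation lets through during backpropagation.
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def sigmoid_grad(x):
    # Derivative of the sigmoid: at most 0.25, and nearly 0 for |x| large,
    # so gradients shrink as they pass through many saturated layers.
    s = sigmoid(x)
    return s * (1.0 - s)

def relu_grad(x):
    # Derivative of ReLU: exactly 1 for every positive input,
    # so the gradient is not attenuated on the active path.
    return (x > 0).astype(float)

xs = np.array([-6.0, -2.0, 0.5, 2.0, 6.0])
print("sigmoid'(x):", sigmoid_grad(xs))  # small values, vanishing at the ends
print("relu'(x):   ", relu_grad(xs))     # 0 for negative inputs, 1 otherwise
```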

There is a whole section on Transformers; I am going to include it in the comments verbatim.

The journal publication is dated July 2021, and there are references in the paper that are newer than 2018; I don't know how heavily the text itself has been edited since 2018.

Date: 2021-10-27 09:55 pm (UTC)
From: [personal profile] timelets
Thanks!
