Dataflow matrix machines (by Anhinga anhinga) ([personal profile] dmm) wrote 2021-08-16 02:37 pm (UTC)

In Session 1:


A Qualitative Study of the Dynamic Behavior for Adaptive Gradient Algorithms, Chao Ma (Princeton University), Lei Wu (Princeton University), Weinan E (Princeton University)

Paper Highlight, by Pankaj Mehta

The paper connects the continuous-time limits of the adaptive gradient methods RMSProp and Adam to the sign gradient descent (signGD) algorithm, and identifies three types of typical phenomena in the training processes of these adaptive algorithms. By analyzing the signGD flow, the paper explains the fast initial convergence of these adaptive gradient algorithms in the limit of a learning rate approaching 0 with fixed momentum parameters. The connection, the convergence analysis, and the experiments verifying the three qualitative patterns are original and technically sound.
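To see why the signGD connection explains fast initial convergence, here is a minimal sketch (my own illustration, not code from the paper): sign gradient descent takes steps of fixed size per coordinate regardless of gradient magnitude, so on a flat direction of an ill-conditioned quadratic it makes steady progress where plain gradient descent crawls. The quadratic, step counts, and learning rates below are all assumed for the demonstration.

```python
import numpy as np

def sign_gd(x0, h, lr, steps):
    # sign gradient descent on f(x) = 0.5 * sum(h * x**2):
    # each coordinate moves a fixed distance lr per step,
    # independent of the gradient's magnitude
    x = x0.copy()
    for _ in range(steps):
        grad = h * x              # gradient of the quadratic
        x -= lr * np.sign(grad)
    return x

def gd(x0, h, lr, steps):
    # plain gradient descent: progress on a coordinate is
    # proportional to that coordinate's (possibly tiny) gradient
    x = x0.copy()
    for _ in range(steps):
        x -= lr * (h * x)
    return x

h = np.array([100.0, 1.0])        # ill-conditioned curvature (assumed)
x0 = np.array([1.0, 1.0])
lr = 0.01                         # GD's lr is capped by the stiff direction

x_sign = sign_gd(x0, h, lr, steps=50)
x_gd = gd(x0, h, lr, steps=50)

# On the flat direction (h = 1), signGD has moved farther toward the
# optimum than plain GD after the same number of steps.
```

This scale-invariance of the per-coordinate step is the mechanism by which, in the small-learning-rate limit with fixed momentum parameters, Adam and RMSProp behave like signGD early in training.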
