dmm: (Default)
[personal profile] dmm
In the last few years, people have discovered that, in addition to the traditional machine learning trade-off between underfitting and overfitting, there is often also a good zone "to the right of overfitting": a "superoverfitting zone" where heavily overparameterized models are often surprisingly good. This is what the "double descent" terminology stands for: the second descent is not toward the sweet spot between underfitting and overfitting, but lies to the right of the "overfitting boundary". This is the so-called "interpolation mode", where training loss is zero, yet generalization beyond the training data is (counter-intuitively) also pretty good.
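To make the "right of overfitting" picture concrete, here is a minimal sketch, assuming a toy setup that is not taken from any particular paper: a noisy linear target, fixed random ReLU features, and minimum-norm least squares as the fitting procedure. All names and constants (`run_trial`, `N_TRAIN`, `WIDTHS`, the noise level) are illustrative choices. The number of random features plays the role of model size; once it exceeds the number of training points, the fit interpolates the training data.

```python
import numpy as np

rng = np.random.default_rng(0)

def random_relu_features(X, W, b):
    """Map inputs to fixed random ReLU features: max(0, X W^T + b)."""
    return np.maximum(0.0, X @ W.T + b)

def run_trial(n_train, n_test, d, width, noise_std):
    """Fit min-norm least squares on `width` random features.

    Returns (train_mse, test_mse) for one random draw of data and features.
    """
    w_true = rng.normal(size=d)                  # hidden linear target (toy assumption)
    X_train = rng.normal(size=(n_train, d))
    X_test = rng.normal(size=(n_test, d))
    y_train = X_train @ w_true + noise_std * rng.normal(size=n_train)
    y_test = X_test @ w_true                     # noiseless test targets

    W = rng.normal(size=(width, d))              # random, untrained first layer
    b = rng.normal(size=width)
    Phi_train = random_relu_features(X_train, W, b)
    Phi_test = random_relu_features(X_test, W, b)

    # For an underdetermined system (width > n_train), lstsq returns the
    # minimum-norm solution, which fits the training data exactly.
    coef, *_ = np.linalg.lstsq(Phi_train, y_train, rcond=None)
    train_mse = np.mean((Phi_train @ coef - y_train) ** 2)
    test_mse = np.mean((Phi_test @ coef - y_test) ** 2)
    return train_mse, test_mse

N_TRAIN = 30
WIDTHS = [5, 15, 30, 60, 300]   # model sizes below, at, and above N_TRAIN

results = {}
for width in WIDTHS:
    trials = [run_trial(N_TRAIN, 500, 5, width, 0.5) for _ in range(20)]
    train_mse, test_mse = np.median(trials, axis=0)   # median over 20 trials
    results[width] = (train_mse, test_mse)
    print(f"width={width:4d}  train_mse={train_mse:.3e}  test_mse={test_mse:.3e}")
```

In a setup like this one would typically expect the test error to be worst near `width == N_TRAIN` (the "overfitting boundary") and to come back down for much larger widths, while the training error stays at essentially zero. The exact numbers depend on the seed and the constants, so treat the printed table as an illustration rather than a measurement.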

This is a good way to explain the nice performance of really huge models. But it also means that if one has just a tiny bit of training data, then moderately sized, but still large, models might do pretty well. And it turns out that if that tiny bit of training data describes the problem precisely, the model may find the precise overall solution to the problem. This is what the authors of a recent paper called "grokking" (following "Stranger in a Strange Land").

What is interesting here is that software engineering tasks often belong to this class. The training data are constraints in the form of tests that should pass, and these might be relatively compact; the model in question might then be able to derive the desired software. That said, the "grokking paper" (which I reference in the comments) solves a set of much more narrowly defined mathematical problems, and it remains to be seen how general this approach turns out to be.
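The idea that a compact set of tests can pin down the desired program exactly can be illustrated with a toy sketch. Note the swap: this uses brute-force search over a tiny hypothesis space rather than a trained model, and the function family `(a * x + b) % M`, the modulus `M`, and all helper names are hypothetical choices made for illustration only.

```python
from itertools import product

M = 7  # small prime modulus (hypothetical toy setting)

def make_candidate(a, b):
    """One candidate 'program' from the family f(x) = (a*x + b) mod M."""
    return lambda x: (a * x + b) % M

def search(tests):
    """Return every (a, b) whose candidate passes all (input, output) tests."""
    return [
        (a, b)
        for a, b in product(range(M), repeat=2)
        if all(make_candidate(a, b)(x) == y for x, y in tests)
    ]

# The "desired software", known only through the tests it must pass.
hidden = make_candidate(3, 5)

one_test = [(1, hidden(1))]
two_tests = [(1, hidden(1)), (4, hidden(4))]

survivors_one = search(one_test)    # one test leaves several candidates
survivors_two = search(two_tests)   # two tests leave a single candidate

print(len(survivors_one), "candidates pass one test")
print(survivors_two, "is the only candidate passing both tests")
```

With a single test, several candidates survive and most of them are wrong on other inputs. With two tests, only the hidden function remains, so it is correct on every input, including ones no test ever covered. Whether anything like this carries over from a 49-element search space to real software is exactly the open question raised above.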

Dataflow matrix machines (by Anhinga anhinga)