openai.com/blog/ai-and-efficiency/
arxiv.org/abs/2005.04305
"Three factors drive the advance of AI: algorithmic innovation, data, and the amount of compute available for training. Algorithmic progress has traditionally been more difficult to quantify than compute and data. In this work, we argue that algorithmic progress has an aspect that is both straightforward to measure and interesting: reductions over time in the compute needed to reach past capabilities. We show that the number of floating-point operations required to train a classifier to AlexNet-level performance on ImageNet has decreased by a factor of 44x between 2012 and 2019. This corresponds to algorithmic efficiency doubling every 16 months over a period of 7 years. By contrast, Moore's Law would only have yielded an 11x cost improvement. We observe that hardware and algorithmic efficiency gains multiply and can be on a similar scale over meaningful horizons, which suggests that a good model of AI progress should integrate measures from both."
no subject
Date: 2020-05-20 08:55 pm (UTC)
"Scaling Laws for Neural Language Models" https://arxiv.org/abs/2001.08361
"We study empirical scaling laws for language model performance on the cross-entropy loss. The loss scales as a power-law with model size, dataset size, and the amount of compute used for training, with some trends spanning more than seven orders of magnitude. Other architectural details such as network width or depth have minimal effects within a wide range. Simple equations govern the dependence of overfitting on model/dataset size and the dependence of training speed on model size. These relationships allow us to determine the optimal allocation of a fixed compute budget. Larger models are significantly more sample-efficient, such that optimally compute-efficient training involves training very large models on a relatively modest amount of data and stopping significantly before convergence."
***
"Fine-Tuning Language Models from Human Preferences" https://arxiv.org/abs/1909.08593
"[...] For stylistic continuation we achieve good results with only 5,000 comparisons evaluated by humans.[...]"
https://openai.com/blog/fine-tuning-gpt-2/
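The "comparisons evaluated by humans" are used to fit a learned reward model, which the language model is then fine-tuned against (with PPO, in the paper). Below is a minimal sketch of the pairwise-preference (Bradley-Terry) reward loss commonly used for this step; note the paper itself collects 4-way comparisons, and "reward_model" here is a hypothetical stand-in mapping token ids to a scalar reward.

import torch
import torch.nn.functional as F

def preference_loss(reward_model, preferred_ids, rejected_ids):
    # reward_model: hypothetical module mapping token ids -> scalar reward per sample
    r_pref = reward_model(preferred_ids)   # shape: (batch,)
    r_rej = reward_model(rejected_ids)     # shape: (batch,)
    # Maximize log P(preferred beats rejected) = log sigmoid(r_pref - r_rej)
    return -F.logsigmoid(r_pref - r_rej).mean()

# Toy usage with a stand-in "reward model" (for illustration only)
dummy = lambda ids: ids.float().mean(dim=-1)
pref = torch.tensor([[3, 4, 5]])
rej = torch.tensor([[1, 2, 3]])
print(preference_loss(dummy, pref, rej))   # small loss, since dummy already prefers "pref"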