The next AI revolution
Sep. 12th, 2024 01:59 pm
OpenAI is finally releasing its next set of models. These models take time to ponder and reason internally before answering. This is what has been known under the mysterious codenames "Q*" and "Strawberry", and is now released as the "o1 series of models".
They promise that a preview version will be available today for ChatGPT+ users.
Links in the comments.
no subject
Date: 2024-09-12 06:04 pm (UTC)
https://openai.com/index/learning-to-reason-with-llms/
no subject
Date: 2024-09-12 06:06 pm (UTC)
https://www.jasonwei.net/
https://scholar.google.com/citations?user=wA5TK_0AAAAJ&hl=en
He was the lead author of the Chain-of-Thought paper (NeurIPS 2022); he worked at Google Brain at the time:
https://proceedings.neurips.cc/paper_files/paper/2022/file/9d5609613524ecf4f15af0f7b31abca4-Paper-Conference.pdf
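For readers who have not seen the technique: chain-of-thought prompting simply shows the model a worked, step-by-step exemplar before the real question, so that it imitates the reasoning style. Below is a minimal sketch in Python; the exemplar is the canonical tennis-ball example from the paper itself, while the cot_prompt helper and the final question are my own illustrative additions, and no real model API is called.

# Minimal illustration of few-shot chain-of-thought prompting (Wei et al., 2022).
# The exemplar is the canonical example from the paper; everything else here
# is an illustrative sketch, not the paper's exact prompt pipeline.

COT_EXEMPLAR = (
    "Q: Roger has 5 tennis balls. He buys 2 more cans of tennis balls. "
    "Each can has 3 tennis balls. How many tennis balls does he have now?\n"
    "A: Roger started with 5 balls. 2 cans of 3 tennis balls each is "
    "6 tennis balls. 5 + 6 = 11. The answer is 11.\n\n"
)

def cot_prompt(question: str) -> str:
    """Prepend a worked exemplar so the model reasons step by step
    before stating its final answer."""
    return COT_EXEMPLAR + f"Q: {question}\nA:"

if __name__ == "__main__":
    print(cot_prompt("A jug holds 4 liters. How many jugs fill a 20-liter tank?"))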
no subject
Date: 2024-09-12 06:45 pm (UTC)
"Great post from the author of Libratus (superhuman poker AI) and Diplomacy AI champion. One can argue that winning those games isn't very useful - but similar techniques apply well to boosting LLM reasoning."
https://x.com/polynoamial/status/1834280155730043108 (Noam Brown)
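OpenAI has not published how o1 spends its extra "thinking" compute, so the sketch below is only one hedged illustration of the general idea behind such techniques: trading more inference compute for better answers via best-of-n sampling against a verifier. The functions sample_answer and score are hypothetical placeholders for a model call and a learned scorer, not any real OpenAI API.

import random

# Hypothetical sketch: sample n candidate solutions, keep the one a
# verifier/scorer rates highest. Larger n = more inference compute.

def sample_answer(question: str) -> str:
    """Placeholder for one stochastic model completion."""
    return f"candidate {random.random():.3f} for {question!r}"

def score(question: str, answer: str) -> float:
    """Placeholder for a learned verifier or reward model."""
    return random.random()

def best_of_n(question: str, n: int) -> str:
    """Spending more compute (larger n) raises the expected quality
    of the best surviving candidate."""
    candidates = [sample_answer(question) for _ in range(n)]
    return max(candidates, key=lambda a: score(question, a))

if __name__ == "__main__":
    print(best_of_n("What is 17 * 24?", n=8))

Larger n buys better expected quality at linearly higher inference cost; that trade-off is the "second curve" discussed in the next comment.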
no subject
Date: 2024-09-12 06:48 pm (UTC)
"This may be the most important figure in LLM research since the OG Chinchilla scaling law in 2022. The key insight is 2 curves working in tandem. Not one.
People have been predicting a stagnation in LLM capability by extrapolating the training scaling law, yet they didn't foresee that inference scaling is what truly beats the diminishing return.
I posted in February that no self-improving LLM algorithm was able to gain much beyond 3 rounds. No one was able to reproduce AlphaGo's success in the realm of LLM, where more compute would carry the capability envelope beyond human level.
Well, we have turned the page.
[Image]"
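For context: the "OG Chinchilla scaling law" is the parametric training-loss fit from Hoffmann et al. (2022), shown below. OpenAI has shown the second, inference-time curve only as a plot, without a published formula, so the log-linear form here is a schematic assumption of mine, not a reported fit.

% First curve: training compute (Chinchilla, Hoffmann et al. 2022).
% Loss as a function of parameter count N and training tokens D,
% with fitted constants E, A, B and exponents \alpha, \beta.
L(N, D) = E + \frac{A}{N^{\alpha}} + \frac{B}{D^{\beta}}

% Second curve (schematic assumption, not a published fit):
% accuracy growing roughly log-linearly in test-time compute C_{\mathrm{inf}}.
\mathrm{acc}(C_{\mathrm{inf}}) \approx a + b \log C_{\mathrm{inf}}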
no subject
Date: 2024-09-12 07:13 pm (UTC)
https://www.oneusefulthing.org/p/something-new-on-openais-strawberry
no subject
Date: 2024-09-12 09:57 pm (UTC)
The summary:
1) This is System 2 thinking (in the terms of Kahneman's "Thinking, Fast and Slow")
And that's why it is so important that the underlying LLM (the underlying System 1 thinking) is rapidly becoming cheaper and faster...
If we now focus on scaling inference, the cost and speed of a single inference step become super-important (see the back-of-the-envelope sketch at the end of this comment)...
2) This is probably "The 4th Deep Learning Revolution" (roughly speaking, a synthesis of AlphaZero and GPT-4):
AlexNet, GPT-3, GPT-4, o1
So, 2012-2020-2023-2024
7.5 years, 3 years, 18 months: the intervals between revolutions keep shortening, shrinking at least two-fold per cycle.
June 2025 is my (conservative) estimate for the next one (9 months from now, but it might happen much earlier).
March 2026 is my (somewhat conservative) estimate for "technological singularity" (18 months from now, but it might happen much earlier).
I have both o1-preview and o1-mini, so I can evaluate them now.
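To make the cost point in 1) concrete, here is a back-of-the-envelope sketch; the token counts and per-token price are invented for illustration and are not OpenAI's actual numbers.

# Back-of-the-envelope sketch (invented numbers, not OpenAI pricing):
# a "System 2" reply that thinks for `reasoning_tokens` hidden tokens
# multiplies the per-reply cost by roughly (answer + reasoning) / answer,
# so cheaper and faster System 1 steps directly buy more deliberation.

def reply_cost(answer_tokens: int, reasoning_tokens: int,
               usd_per_token: float) -> float:
    """Total cost of one reply: visible answer tokens plus hidden reasoning tokens."""
    return (answer_tokens + reasoning_tokens) * usd_per_token

plain = reply_cost(answer_tokens=300, reasoning_tokens=0, usd_per_token=1e-5)
deliberate = reply_cost(answer_tokens=300, reasoning_tokens=10_000, usd_per_token=1e-5)
print(f"plain: ${plain:.4f}  deliberate: ${deliberate:.4f}  "
      f"ratio: {deliberate / plain:.1f}x")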
no subject
Date: 2024-09-24 04:21 pm (UTC)
Currently it is a "GPT-2 level breakthrough in reasoning" on top of a "GPT-4 level System 1".