The next AI revolution
Sep. 12th, 2024 01:59 pm
OpenAI is finally releasing its next set of models. These models take time to ponder and reason internally before answering. This is what has been known under the mysterious codenames "Q*" and "Strawberry", and is now released as the "o1 series of models".
They promise that a preview version will be available today for ChatGPT+ users.
Links in the comments.
no subject
Date: 2024-09-12 06:04 pm (UTC)
https://openai.com/index/learning-to-reason-with-llms/
no subject
Date: 2024-09-12 06:06 pm (UTC)
https://www.jasonwei.net/
https://scholar.google.com/citations?user=wA5TK_0AAAAJ&hl=en
He was the lead author of the Chain-of-Thought paper (NeurIPS 2022); he worked at Google Brain at the time:
https://proceedings.neurips.cc/paper_files/paper/2022/file/9d5609613524ecf4f15af0f7b31abca4-Paper-Conference.pdf
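For readers who have not seen the technique: chain-of-thought prompting simply shows the model a worked, step-by-step exemplar before the real question, so that it imitates the reasoning style. Below is a minimal sketch in Python; the exemplar is the canonical tennis-ball example from the paper itself, while the cot_prompt helper and the final question are my own illustrative additions, and no real model API is called.

# Minimal illustration of few-shot chain-of-thought prompting (Wei et al., 2022).
# The exemplar is the canonical example from the paper; everything else here
# is an illustrative sketch, not the paper's exact prompt pipeline.

COT_EXEMPLAR = (
    "Q: Roger has 5 tennis balls. He buys 2 more cans of tennis balls. "
    "Each can has 3 tennis balls. How many tennis balls does he have now?\n"
    "A: Roger started with 5 balls. 2 cans of 3 tennis balls each is "
    "6 tennis balls. 5 + 6 = 11. The answer is 11.\n\n"
)

def cot_prompt(question: str) -> str:
    """Prepend a worked exemplar so the model reasons step by step
    before stating its final answer."""
    return COT_EXEMPLAR + f"Q: {question}\nA:"

if __name__ == "__main__":
    print(cot_prompt("A jug holds 4 liters. How many jugs fill a 20-liter tank?"))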
no subject
Date: 2024-09-12 06:45 pm (UTC)
"Great post from the author of Libratus (superhuman poker AI) and Diplomacy AI champion. One can argue that winning those games isn't very useful - but similar techniques apply well to boosting LLM reasoning."
https://x.com/polynoamial/status/1834280155730043108 (Noam Brown)
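OpenAI has not published how o1 spends its extra "thinking" compute, so the sketch below is only one hedged illustration of the general idea behind such techniques: trading more inference compute for better answers via best-of-n sampling against a verifier. The functions sample_answer and score are hypothetical placeholders for a model call and a learned scorer, not any real OpenAI API.

import random

# Hypothetical sketch: sample n candidate solutions, keep the one a
# verifier/scorer rates highest. Larger n = more inference compute.

def sample_answer(question: str) -> str:
    """Placeholder for one stochastic model completion."""
    return f"candidate {random.random():.3f} for {question!r}"

def score(question: str, answer: str) -> float:
    """Placeholder for a learned verifier or reward model."""
    return random.random()

def best_of_n(question: str, n: int) -> str:
    """Spending more compute (larger n) raises the expected quality
    of the best surviving candidate."""
    candidates = [sample_answer(question) for _ in range(n)]
    return max(candidates, key=lambda a: score(question, a))

if __name__ == "__main__":
    print(best_of_n("What is 17 * 24?", n=8))

Larger n buys better expected quality at linearly higher inference cost; that trade-off is the "second curve" discussed in the next comment.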
no subject
Date: 2024-09-12 06:48 pm (UTC)
"This may be the most important figure in LLM research since the OG Chinchilla scaling law in 2022. The key insight is 2 curves working in tandem. Not one.
People have been predicting a stagnation in LLM capability by extrapolating the training scaling law, yet they didn't foresee that inference scaling is what truly beats the diminishing return.
I posted in February that no self-improving LLM algorithm was able to gain much beyond 3 rounds. No one was able to reproduce AlphaGo's success in the realm of LLM, where more compute would carry the capability envelope beyond human level.
Well, we have turned the page.
[Image]"
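For context: the "OG Chinchilla scaling law" is the parametric training-loss fit from Hoffmann et al. (2022), shown below. OpenAI has shown the second, inference-time curve only as a plot, without a published formula, so the log-linear form here is a schematic assumption of mine, not a reported fit.

% First curve: training compute (Chinchilla, Hoffmann et al. 2022).
% Loss as a function of parameter count N and training tokens D,
% with fitted constants E, A, B and exponents \alpha, \beta.
L(N, D) = E + \frac{A}{N^{\alpha}} + \frac{B}{D^{\beta}}

% Second curve (schematic assumption, not a published fit):
% accuracy growing roughly log-linearly in test-time compute C_{\mathrm{inf}}.
\mathrm{acc}(C_{\mathrm{inf}}) \approx a + b \log C_{\mathrm{inf}}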
no subject
Date: 2024-09-12 07:13 pm (UTC)
https://www.oneusefulthing.org/p/something-new-on-openais-strawberry
no subject
Date: 2024-09-12 09:57 pm (UTC)
The summary:
1) This is System 2 thinking (in the terms of Kahneman's "Thinking, Fast and Slow")
And that's why it is so important that the underlying LLM (the underlying System 1 thinking) is rapidly becoming cheaper and faster...
If we now focus on scaling inference, the cost and speed of a single inference step become super-important (see the back-of-the-envelope sketch at the end of this comment)...
2) This is probably "The 4th Deep Learning Revolution" (roughly speaking, a synthesis of AlphaZero and GPT-4):
AlexNet, GPT-3, GPT-4, o1
So, 2012-2020-2023-2024
7.5 years, 3 years, 18 months: the intervals between revolutions keep shortening, shrinking at least two-fold per cycle.
June 2025 is my (conservative) estimate for the next one (9 months from now, but it might happen much earlier).
March 2026 is my (somewhat conservative) estimate for "technological singularity" (18 months from now, but it might happen much earlier).
I have both o1-preview and o1-mini, so I can evaluate them now.
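To make the cost point in 1) concrete, here is a back-of-the-envelope sketch; the token counts and per-token price are invented for illustration and are not OpenAI's actual numbers.

# Back-of-the-envelope sketch (invented numbers, not OpenAI pricing):
# a "System 2" reply that thinks for `reasoning_tokens` hidden tokens
# multiplies the per-reply cost by roughly (answer + reasoning) / answer,
# so cheaper and faster System 1 steps directly buy more deliberation.

def reply_cost(answer_tokens: int, reasoning_tokens: int,
               usd_per_token: float) -> float:
    """Total cost of one reply: visible answer tokens plus hidden reasoning tokens."""
    return (answer_tokens + reasoning_tokens) * usd_per_token

plain = reply_cost(answer_tokens=300, reasoning_tokens=0, usd_per_token=1e-5)
deliberate = reply_cost(answer_tokens=300, reasoning_tokens=10_000, usd_per_token=1e-5)
print(f"plain: ${plain:.4f}  deliberate: ${deliberate:.4f}  "
      f"ratio: {deliberate / plain:.1f}x")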
no subject
Date: 2024-09-24 04:21 pm (UTC)
Currently it is a "GPT-2 level breakthrough in reasoning" on top of a "GPT-4 level System 1".