"Narrow AGI" this year?
Jan. 6th, 2025 09:04 pm

"Narrow AGI" is mostly an AGI-level artificial software engineer, an AGI-level artificial mathematician, and an AGI-level artificial AI researcher (and probably a single entity combining these three application areas, because a strong AI researcher has to be a decent software engineer and a decent mathematician).
It seems that at least OpenAI (and, perhaps, other entities) should have this by the middle of 2025, if not earlier, at least for their internal use (assuming no major disasters, that is, assuming that the San Francisco Bay Area is intact, and AI companies continue functioning normally).
What do we know about the technical aspects? We see o1 performance (and can experience it directly), we see the claimed (and partially confirmed) numbers for the demo versions of o3 and o3-mini, in math and in software engineering. We know that the jump from o1 to o3 took about 3 months. Two more jumps like that would probably be sufficient (and one can add "scaffolding" on top of that).
Another thing we know is that Sam Altman sounds much more confident recently. I've come to these conclusions a number of days ago, but now it turns out that Sam's mood has also shifted in a similar fashion. I'll put some links in the comments.
Jan 19 update: Sam Altman will allegedly do a closed-door government briefing on Jan 30 (that is apparently not a very big secret and has been leaked; the main topic is presumably as follows: many people in the leading AI labs have approximately the same degree of techno-optimism as I have myself, and so their timelines are tentatively quite short). www.axios.com/2025/01/19/ai-superagent-openai-meta
no subject
Date: 2025-01-07 02:40 am (UTC)
https://www.lesswrong.com/posts/T5p9NEAyrHedC2znD/we-know-how-to-build-agi-sam-altman
https://www.lesswrong.com/posts/QHtd2ZQqnPAcknDiQ/o3-oh-my
no subject
Date: 2025-01-07 02:49 am (UTC)
> We are now confident we know how to build AGI as we have traditionally understood it.
And what does he mean by "as we have traditionally understood it"?
> if you could hire an AI as a remote employee to be a great software engineer, I think a lot of people would say, “OK, that’s AGI-ish.”
So he is talking about "narrow AGI", namely software engineering and math, the two areas of recent breakthroughs. And these two areas do lead to the ability to make productive "artificial AI researchers", which is why Sam is saying:
> We are beginning to turn our aim beyond that [beyond AGI], to superintelligence in the true sense of the word. We love our current products, but we are here for the glorious future. With superintelligence, we can do anything else. Superintelligent tools could massively accelerate scientific discovery and innovation well beyond what we are capable of doing on our own [...]
If one wants superintelligence, AGI-level artificial AI researchers are one way to move in that direction. And it's the main way: via making better and better artificial AI researchers, towards superintelligent AI researchers, and then towards other superintelligent systems.
Although it's not the only way to get there... But if one has AGI-level artificial AI researchers, it's very natural to start an accelerated movement towards smarter and smarter artificial AI researchers beyond the human level.
no subject
Date: 2025-01-07 02:56 am (UTC)
https://x.com/__nmca__/status/1870170098989674833
SWE-bench verified, 71.7%. Very nice, a big jump in state-of-the-art (but also very clear that this is not an AGI level yet, the AGI level would be close to 100% on this one).
Codeforces Elo 2727, that's overwhelmingly good (within the top 200 competitive coders in the world).
https://x.com/__nmca__/status/1870170112290107540
25% on the famous new FrontierMath test, which was completely unapproachable to AI models until now.
(Very nice scores on the famous ARC-AGI benchmark too; human-level results have been achieved, which was not possible until now either.)
no subject
Date: 2025-01-07 03:03 am (UTC)
https://www.interconnects.ai/p/openais-o3-the-2024-finale-of-ai
(This might be a better technical link, actually.)
no subject
Date: 2025-01-07 04:23 am (UTC)
But none of them is a truly good software engineer yet (although many of them are quite useful for a number of software engineering tasks). Among the available models, the best ones are o1 (and o1-pro, of course, is even better) and the new Sonnet 3.5 (20241022, especially within something like Cursor). If one wants to evaluate the state of the art, one should play with those...
o3 might actually be a good enough software engineer already, judging by its rather spectacular numbers, but it is not available to outsiders yet... So far, all the OpenAI demos I remember have been fair; they have never oversold the results... Whether they are going to release the full o3 (rather than just o3-mini) to the public, and at what price point, is an interesting question... Sam has complained that, unexpectedly enough, they are losing money on the expensive Pro $200/month plan (while making nice money on the standard Plus $20/month plan): https://x.com/sama/status/1876104315296968813 (What happens is quite obvious: people don't care much about $20/month, and many of them use the service pretty lightly, but the subset willing to upgrade to $200/month is very different, and uncapped usage for that subset ends up costing too much in compute.)
no subject
Date: 2025-01-07 01:29 pm (UTC)
https://docs.github.com/en/copilot/using-github-copilot/using-claude-sonnet-in-github-copilot
no subject
Date: 2025-01-21 09:02 am (UTC)
Jan 21: making this post public
no subject
Date: 2025-01-28 07:15 am (UTC)
*****
The Year of the Dragon is ending tomorrow: https://dmm.dreamwidth.org/79583.html (Feb 10, 2024 - Jan 28, 2025)
> And it will be the year of a "newly born, growing dragon"; while the outgoing year is the year of a "dying, departing rabbit".
>
> And that, I feel, is when "everything" will begin; everything points to the coming "year of critical upheavals", the year when the world changes radically...
>
> It would be good for us to survive it successfully and enter a new phase...
What actually happened was pretty radical, but less radical than I had anticipated; the actual "big transition" is very likely to happen in 2025.
no subject
Date: 2025-02-11 02:04 am (UTC)
The numbers
no subject
Date: 2025-03-15 04:23 am (UTC)
We now expect a lot of things before Summer (including GPT-5)
no subject
Date: 2025-03-30 02:18 pm (UTC)
https://dmm.dreamwidth.org/77617.html (scroll all the way down)
no subject
Date: 2025-08-07 07:11 pm (UTC)
[...]
GPT-5