[personal profile] dmm
Lately, the most interesting conceptual AI advances seem to come from "prosaic alignment" start-ups. These are companies which believe that the current trend of improving Transformer models is likely to lead straight to AGI, and that a better understanding of the nature and properties of these models is key to AI safety (and, of course, also key to better AI capabilities).

And it is often the case that the key elements of this work are done by people "on the edge", "in the penumbra" of those alignment start-ups.

In the previous post I mentioned the key new understanding of large Transformer models as simulators. That work was done "while at Conjecture", but it is not listed as coming directly from Conjecture (one of those "prosaic alignment" start-ups). I think the key people involved are still at Conjecture, but they seem to be trying to keep some distance between Conjecture and this work. I am continuing to take notes on those materials and commit them to GitHub (see the links in the comments to the previous post).

Here is another one of those stories. Grokking is a phenomenon where small Transformers look at a part of a mathematical structure for quite a while, and then rather suddenly transition to understanding the whole of that mathematical structure, including the part they never see in training. It was discovered in 2021 and has been the subject of a number of follow-up attempts to understand it.
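To make "seeing only a part of a mathematical structure" concrete, here is a minimal sketch of how such a dataset might be set up. The post does not name a specific task, so everything here is an assumption for illustration: the structure is addition modulo a prime, and the function name `modular_addition_split`, the modulus `p=113`, and the 30% training fraction are all hypothetical choices. The model would only ever train on `train`; grokking is the late, sudden jump in accuracy on `test`.

```python
import random


def modular_addition_split(p=113, train_fraction=0.3, seed=0):
    """Split the full table of (a + b) mod p into seen and unseen parts.

    Hypothetical helper for illustration; p, train_fraction and seed
    are assumed values, not taken from the post.
    """
    # The whole "mathematical structure": every input pair (a, b).
    pairs = [(a, b) for a in range(p) for b in range(p)]

    # Shuffle deterministically so the split is reproducible.
    rng = random.Random(seed)
    rng.shuffle(pairs)

    n_train = int(train_fraction * len(pairs))

    def with_label(pair):
        a, b = pair
        return (a, b), (a + b) % p

    # The model sees only `train`; `test` is the part it never sees.
    train = [with_label(pair) for pair in pairs[:n_train]]
    test = [with_label(pair) for pair in pairs[n_train:]]
    return train, test


train, test = modular_addition_split()
print(len(train), "training examples,", len(test), "held-out examples")
```

The split is over input pairs, so no held-out pair ever appears in training. Doing well on `test` therefore requires the model to have picked up the rule itself rather than memorized the table.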

The recent breakthrough was made in mid-August by Neel Nanda, who left Anthropic (perhaps the most famous of the "prosaic alignment" start-ups) a few months ago. It looks like he has more or less solved the mysteries behind this phenomenon. I am going to keep studying his writings. The links are in the comments.

Date: 2022-10-16 06:39 pm (UTC)
From: [personal profile] juan_gandhi

OMG, is it for real?! Amazing.
