dmm | On 9 months since the appearance of GPT-3

Over the last nine months, my professional work has been entirely colored by the breakthrough that came with the invention of GPT-3, when it turned out that this thing already has quite magical properties.

So, in the comments, I want to trace how it went and what I tried to do about it (including on GitHub).

The revolution caused, or at least radically accelerated, by the appearance of GPT-3 and subsequent work is in full swing, and I am not sure whether anyone manages to keep track of all the important developments in this area. I am not attempting a survey; this is rather an attempt to recall my own personal trajectory.


From:

dmm

Feb 28-March 2:

I publish "9 months since GPT-3 revolution": https://anhinga-anhinga.livejournal.com/84392.html and there is useful discussion in the comments there.

I publish this blog post, "К 9 месяцам с появления GPT-3" ("On 9 months since the appearance of GPT-3"), and these comments.

So here we are, 9 months after the GPT-3 revolution...

(It might be that nothing more needs to be done by people like me: some smart people elsewhere might make enough breakthroughs in the next few months to "solve AI", in which case we can only hope that they are thinking well about "AI safety", since we can't really participate. But if not, continuing this line of research should be of interest, so I am going to continue working on it. I have recently been spending time looking at what people are writing on "AI safety", e.g. https://dmm.dreamwidth.org/36635.html and some other texts; I think we should at least ponder "AI safety" issues, especially if we are doing work which might turn out to be relevant to the "transition to 'true AI'" and which might affect the dynamics and properties of this "transition".)

So, one might want to just experiment with various aspects of DMMs, matrix multiplications, and other ideas that come to mind in the context of DMMs, Transformers, attention-based models, and their interplay. One should do this within one of the modern ultra-flexible frameworks for differentiable programming, such as Julia Flux or JAX. This line of exploration has a good chance of being fruitful.

Edited Date: 2021-03-02 08:49 am (UTC)


Dataflow matrix machines (by Anhinga anhinga)
