dmm | Nine months since the appearance of GPT-3

My professional activity over the last nine months has been entirely colored by the breakthrough that came with the invention of GPT-3, when it turned out that this thing already has quite magical properties.

So, in the comments, I want to trace how it went and what I have tried to do about it (including on GitHub).

The revolution caused, or at least radically accelerated, by the appearance of GPT-3 and the work that followed is in full swing, and I am not sure anyone manages to keep track of all the important developments in this area. I am not attempting a survey; this is rather an attempt to recall my own personal trajectory.

From: dmm

We are reasonably close to the end of the story. There is one more thread: the relationship between DMMs and Differentiable Programming, since that relationship is the context in which I am thinking about my current activity.

This is also where JAX comes into the picture, and from here on it will feature in it on par with Julia.

November 30: DeepMind announces that it has "solved" the problem of protein folding. It seems to be solved in the sense that the system appears to have reached performance on par with the best human labs while being much faster. However, it is not clear how to start addressing the question of whether this system can also solve protein folding problems which humans can't solve at all: how do we even start testing that?

In any case, the system, AlphaFold 2, is a hybrid model with attention at its center, so my prediction that hybrids with attention are the future seems to be holding up well: https://deepmind.com/blog/article/alphafold-a-solution-to-a-50-year-old-grand-challenge-in-biology

December 4: DeepMind publishes a blog post describing the DeepMind JAX ecosystem: https://deepmind.com/blog/article/using-jax-to-accelerate-our-research

This made me realize by late December that JAX and Julia Flux are of comparable flexibility. Both are next-generation, ultra-flexible machine learning frameworks. They allow one to compute gradients of programs written in large subsets of Python and Julia respectively (in both cases there are some requirements for immutability of arrays, so one is encouraged to move towards functional programming with immutable data). They allow one to compute gradients with respect to tree-like structures and not just with respect to flat "tensors", and they do a lot to speed things up and to allow interoperability with all kinds of things. People have flamewars over JAX vs Julia Flux and assert the superiority of whichever of the two systems they like, but honestly, while these two systems are different, they are pretty competitive with each other; the trade-offs are rather complicated.
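To make the point about tree-like structures a bit more concrete, here is a minimal JAX sketch. The tiny two-layer model, the parameter names (`layer1`, `layer2`, `w`, `b`), the shapes, and the learning rate are all made up for illustration; only `jax.grad`, `jax.tree_util.tree_map`, and the `.at[...].set(...)` update syntax are actual JAX features. The gradient comes back as a nested dict of the same shape as the parameters, and updates are done functionally, by building new values rather than mutating arrays in place.

```python
import jax
import jax.numpy as jnp


def loss(params, x, y):
    # `params` is a nested dict (a "pytree"), not a flat tensor.
    hidden = jnp.tanh(x @ params["layer1"]["w"] + params["layer1"]["b"])
    pred = hidden @ params["layer2"]["w"] + params["layer2"]["b"]
    return jnp.mean((pred - y) ** 2)


# Hypothetical toy parameters and data, chosen only for illustration.
params = {
    "layer1": {"w": 0.1 * jnp.ones((3, 4)), "b": jnp.zeros(4)},
    "layer2": {"w": 0.1 * jnp.ones((4, 1)), "b": jnp.zeros(1)},
}
x = jnp.ones((2, 3))
y = jnp.zeros((2, 1))

# Gradient with respect to the whole tree: `grads` is a nested dict
# with the same keys and array shapes as `params`.
grads = jax.grad(loss)(params, x, y)

# Functional update: build a new tree instead of mutating the old one.
new_params = jax.tree_util.tree_map(lambda p, g: p - 0.1 * g, params, grads)

# JAX arrays are immutable; "in-place" edits return a new array.
w_old = params["layer1"]["w"]
w_new = w_old.at[0, 0].set(1.0)  # w_old itself is unchanged
```

As far as I understand, the Julia Flux side looks similar in spirit: the gradient of a model is returned as a nested structure mirroring the model, not as one flat vector.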

Edited Date: 2021-03-02 08:10 am (UTC)

Dataflow matrix machines (by Anhinga anhinga)
