Some new papers
Jun. 23rd, 2021 11:12 am

"Aurochs: An Architecture for Dataflow Threads", by a team from Stanford.
They say they learned to do dataflow-style acceleration for hash tables, B-trees, and things like that. This might be an answer to my long-standing desire to have good parallelization for less regular and less uniform computations. And the best thing is that this is a software solution: one does not have to build specialized processors to take advantage of it:
conferences.computer.org/iscapub/pdfs/ISCA2021-4ghucdBnCWYB7ES2Pe4YdT/333300a402/333300a402.pdf
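Nothing in the sketch below is from the paper; it is only a minimal illustration of the kind of irregular, non-uniform workload meant here, where each lookup chases a chain of unpredictable length and so does not fit a flat, uniform loop:

BUCKETS = 8

def build_chained_table(pairs):
    # Separate-chaining hash table: each bucket holds a list of (key, value) pairs.
    table = [[] for _ in range(BUCKETS)]
    for key, value in pairs:
        table[hash(key) % BUCKETS].append((key, value))
    return table

def probe(table, key):
    # Each probe walks its bucket's chain: a serial run of dependent reads
    # whose length varies from key to key.
    for k, v in table[hash(key) % BUCKETS]:
        if k == key:
            return v
    return None

table = build_chained_table((f"key{i}", i) for i in range(100))
# The batch as a whole is parallel, but each element's work (chain length,
# cache behaviour) differs: that is the non-uniformity in question.
print([probe(table, f"key{i}") for i in range(0, 100, 7)])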
"Thinking Like Transformers", by a team from Israel
"What is the computational model behind a Transformer? Where recurrent neural networks have direct parallels in finite state machines, allowing clear discussion and thought around architecture variants or trained models, Transformers have no such familiar parallel. In this paper we aim to change that, proposing a computational model for the transformer-encoder in the form of a programming language."
arxiv.org/abs/2106.06981
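A rough sketch of what such a programming-language view might look like: attention read as a "select" step (which positions attend to which) plus an "aggregate" step (pool the selected values). The function names and the reverse example below are my own illustration in plain Python, not the paper's actual syntax:

def select(keys, queries, predicate):
    # Boolean "attention pattern": row q marks which key positions
    # query position q attends to.
    return [[predicate(k, q) for k in keys] for q in queries]

def aggregate(selected, values, default=0):
    # For each query position, average the values at the selected key
    # positions (uniform attention over the selected set).
    out = []
    for row in selected:
        picked = [v for sel, v in zip(row, values) if sel]
        out.append(sum(picked) / len(picked) if picked else default)
    return out

# Example: reverse a sequence by making position q attend to position n - 1 - q.
tokens = list("hello")
n = len(tokens)
positions = list(range(n))
sel = select(positions, positions, lambda k, q: k == n - 1 - q)
codes = aggregate(sel, [ord(t) for t in tokens])
print("".join(chr(int(c)) for c in codes))  # prints "olleh"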
So much is happening, and there is suddenly much less time; I can't manage to read everything I usually read... Something changed rather abruptly about ten days ago, the dynamics are completely different now, feels like a transition period...
(A comment points to "Improving Transformer Models by Reordering their Sublayers", https://www.aclweb.org/anthology/2020.acl-main.270/ and https://arxiv.org/abs/1911.03864)