Some new papers
Jun. 23rd, 2021 11:12 am

"Aurochs: An Architecture for Dataflow Threads", by a team from Stanford.
They say they learned to do dataflow-style acceleration for hash tables, B-trees, and things like that. This might be an answer to my long-standing desire to have good parallelization for less regular and less uniform computations. And the best thing is that this is a software solution: one does not have to build specialized processors to take advantage of it:
conferences.computer.org/iscapub/pdfs/ISCA2021-4ghucdBnCWYB7ES2Pe4YdT/333300a402/333300a402.pdf
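Nothing in the sketch below is from the paper; it is only a minimal illustration of the kind of irregular, non-uniform workload meant here, where each lookup chases a chain of unpredictable length and so does not fit a flat, uniform loop:

BUCKETS = 8

def build_chained_table(pairs):
    # Separate-chaining hash table: each bucket holds a list of (key, value) pairs.
    table = [[] for _ in range(BUCKETS)]
    for key, value in pairs:
        table[hash(key) % BUCKETS].append((key, value))
    return table

def probe(table, key):
    # Each probe walks its bucket's chain: a serial run of dependent reads
    # whose length varies from key to key.
    for k, v in table[hash(key) % BUCKETS]:
        if k == key:
            return v
    return None

table = build_chained_table((f"key{i}", i) for i in range(100))
# The batch as a whole is parallel, but each element's work (chain length,
# cache behaviour) differs: that is the non-uniformity in question.
print([probe(table, f"key{i}") for i in range(0, 100, 7)])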
"Thinking Like Transformers", by a team from Israel
"What is the computational model behind a Transformer? Where recurrent neural networks have direct parallels in finite state machines, allowing clear discussion and thought around architecture variants or trained models, Transformers have no such familiar parallel. In this paper we aim to change that, proposing a computational model for the transformer-encoder in the form of a programming language."
arxiv.org/abs/2106.06981
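A rough sketch of what such a programming-language view might look like: attention read as a "select" step (which positions attend to which) plus an "aggregate" step (pool the selected values). The function names and the reverse example below are my own illustration in plain Python, not the paper's actual syntax:

def select(keys, queries, predicate):
    # Boolean "attention pattern": row q marks which key positions
    # query position q attends to.
    return [[predicate(k, q) for k in keys] for q in queries]

def aggregate(selected, values, default=0):
    # For each query position, average the values at the selected key
    # positions (uniform attention over the selected set).
    out = []
    for row in selected:
        picked = [v for sel, v in zip(row, values) if sel]
        out.append(sum(picked) / len(picked) if picked else default)
    return out

# Example: reverse a sequence by making position q attend to position n - 1 - q.
tokens = list("hello")
n = len(tokens)
positions = list(range(n))
sel = select(positions, positions, lambda k, q: k == n - 1 - q)
codes = aggregate(sel, [ord(t) for t in tokens])
print("".join(chr(int(c)) for c in codes))  # prints "olleh"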
So much is happening, and there is suddenly much less time; I can't manage to read everything I usually read... Something changed rather abruptly about ten days ago, the dynamics are completely different now, feels like a transition period...
(A comment points to "Improving Transformer Models by Reordering their Sublayers", https://www.aclweb.org/anthology/2020.acl-main.270/ and https://arxiv.org/abs/1911.03864)