<?xml version='1.0' encoding='utf-8' ?>

<rss version='2.0' xmlns:lj='http://www.livejournal.org/rss/lj/1.0/' xmlns:atom10='http://www.w3.org/2005/Atom'>
<channel>
  <title>Dataflow matrix machines (by Anhinga anhinga)</title>
  <link>https://dmm.dreamwidth.org/</link>
  <description>Dataflow matrix machines (by Anhinga anhinga) - Dreamwidth Studios</description>
  <lastBuildDate>Sun, 05 Feb 2023 06:26:58 GMT</lastBuildDate>
  <generator>LiveJournal / Dreamwidth Studios</generator>
  <lj:journal>dmm</lj:journal>
  <lj:journaltype>personal</lj:journaltype>
  <image>
    <url>https://v2.dreamwidth.org/11549465/3235132</url>
    <title>Dataflow matrix machines (by Anhinga anhinga)</title>
    <link>https://dmm.dreamwidth.org/</link>
    <width>100</width>
    <height>100</height>
  </image>

<item>
  <guid isPermaLink='true'>https://dmm.dreamwidth.org/68826.html</guid>
  <pubDate>Sun, 05 Feb 2023 06:26:58 GMT</pubDate>
  <title>Technical interview with Neel Nanda</title>
  <link>https://dmm.dreamwidth.org/68826.html</link>
  <description>&lt;a href=&quot;https://www.lesswrong.com/posts/r2yTwkGt3kbQG2mXi/axrp-episode-19-mechanistic-interpretability-with-neel-nanda&quot;&gt;www.lesswrong.com/posts/r2yTwkGt3kbQG2mXi/axrp-episode-19-mechanistic-interpretability-with-neel-nanda&lt;/a&gt;&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;&lt;img src=&quot;https://www.dreamwidth.org/tools/commentcount?user=dmm&amp;ditemid=68826&quot; width=&quot;30&quot; height=&quot;12&quot; alt=&quot;comment count unavailable&quot; style=&quot;vertical-align: middle;&quot;/&gt; comments</description>
  <comments>https://dmm.dreamwidth.org/68826.html</comments>
  <category>ai safety</category>
  <category>anthropic ai</category>
  <category>machine learning</category>
  <category>transformers</category>
  <category>artificial intelligence</category>
  <category>understanding internals of ai</category>
  <lj:security>public</lj:security>
  <lj:reply-count>3</lj:reply-count>
</item>
<item>
  <guid isPermaLink='true'>https://dmm.dreamwidth.org/66967.html</guid>
  <pubDate>Mon, 26 Dec 2022 17:38:19 GMT</pubDate>
  <title>&quot;MPLP: Learning a Message Passing Learning Protocol&quot;</title>
  <link>https://dmm.dreamwidth.org/66967.html</link>
  <description>I have been looking at a recent rather remarkable paper which includes the DeepDream creator among its authors, and I&apos;ve decided to check whether I missed any of his works; and it turns out there is this paper I really should be aware of. This really resonates with some of the things I have been exploring this year.&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;&lt;a href=&quot;https://arxiv.org/abs/2007.00970&quot;&gt;arxiv.org/abs/2007.00970&lt;/a&gt;&lt;br /&gt;&lt;br /&gt;&amp;quot;We present a novel method for learning the weights of an artificial neural network - a Message Passing Learning Protocol (MPLP). In MPLP, we abstract every operations occurring in ANNs as independent agents. Each agent is responsible for ingesting incoming multidimensional messages from other agents, updating its internal state, and generating multidimensional messages to be passed on to neighbouring agents. We demonstrate the viability of MPLP as opposed to traditional gradient-based approaches on simple feed-forward neural networks, and present a framework capable of generalizing to non-traditional neural network architectures. MPLP is meta learned using end-to-end gradient-based meta-optimisation. We further discuss the observed properties of MPLP and hypothesize its applicability on various fields of deep learning.&amp;quot;&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;&lt;img src=&quot;https://www.dreamwidth.org/tools/commentcount?user=dmm&amp;ditemid=66967&quot; width=&quot;30&quot; height=&quot;12&quot; alt=&quot;comment count unavailable&quot; style=&quot;vertical-align: middle;&quot;/&gt; comments</description>
  <comments>https://dmm.dreamwidth.org/66967.html</comments>
  <category>artificial intelligence</category>
  <category>zzznah</category>
  <category>understanding internals of ai</category>
  <category>neural networks</category>
  <category>transformers</category>
  <category>machine learning</category>
  <lj:security>public</lj:security>
  <lj:reply-count>4</lj:reply-count>
</item>
<item>
  <guid isPermaLink='true'>https://dmm.dreamwidth.org/65784.html</guid>
  <pubDate>Tue, 06 Dec 2022 17:38:50 GMT</pubDate>
  <title>&quot;Towards Categorical Foundations of Learning&quot;</title>
  <link>https://dmm.dreamwidth.org/65784.html</link>
  <description>When one tries to use category theory for applied work, a number of questions arise: Is it just too difficult for me to use at all, given my level of technical skill? Is it fruitful enough, and is the fruitfulness/effort ratio high enough for all this to make sense?&lt;br /&gt;&lt;br /&gt;I recently discovered &lt;strong&gt;Bruno Gavranović&lt;/strong&gt;, a graduate student in Glasgow, whose work is promising in this sense. They are really trying hard to keep things simple and also trying to make sure that there are non-trivial applications. Here is one of his essays and papers (March 2021, so it&apos;s not the most recent one, but probably the most central):&lt;br /&gt;&lt;br /&gt;&lt;a href=&quot;https://www.brunogavranovic.com/posts/2021-03-03-Towards-Categorical-Foundations-Of-Neural-Networks.html&quot;&gt;www.brunogavranovic.com/posts/2021-03-03-Towards-Categorical-Foundations-Of-Neural-Networks.html&lt;/a&gt;&lt;br /&gt;&lt;br /&gt;(I am posting this here because there are people who read this blog who are interested in applied category theory and like it, not because I am trying to convince those who formed a negative opinion of this subject. 
I am non-committal myself; I have not decided whether applied category theory has a high enough fruitfulness/effort ratio, but this particular entry seems to be one of the best shots in this sense, so I am going to try to go deeper into their work.)&lt;br /&gt;&lt;br /&gt;&lt;strong&gt;Update: &lt;/strong&gt;their collection of papers at the intersection of Category Theory and Machine Learning: &lt;a href=&quot;https://github.com/bgavran/Category_Theory_Machine_Learning&quot;&gt;github.com/bgavran/Category_Theory_Machine_Learning&lt;/a&gt;&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;&lt;img src=&quot;https://www.dreamwidth.org/tools/commentcount?user=dmm&amp;ditemid=65784&quot; width=&quot;30&quot; height=&quot;12&quot; alt=&quot;comment count unavailable&quot; style=&quot;vertical-align: middle;&quot;/&gt; comments</description>
  <comments>https://dmm.dreamwidth.org/65784.html</comments>
  <category>artificial intelligence</category>
  <category>neural networks</category>
  <category>mathematics</category>
  <category>category theory</category>
  <category>machine learning</category>
  <lj:security>public</lj:security>
  <lj:reply-count>9</lj:reply-count>
</item>
<item>
  <guid isPermaLink='true'>https://dmm.dreamwidth.org/64931.html</guid>
  <pubDate>Thu, 17 Nov 2022 01:44:04 GMT</pubDate>
  <title>Conferences; research updates</title>
  <link>https://dmm.dreamwidth.org/64931.html</link>
  <description>This week, Nov 17-18, Thu-Fri, 8am-11:45am Boston time, &lt;b&gt;&amp;quot;Quantum physics and the first-person perspective&amp;quot;&lt;/b&gt;: &lt;a href=&quot;https://www.essentiafoundation.org/quantum-physics-and-the-first-person-perspective/seeing/&quot;&gt;www.essentiafoundation.org/quantum-physics-and-the-first-person-perspective/seeing/&lt;/a&gt;&lt;br /&gt;&lt;br /&gt;&lt;strong&gt;JuliaCon 2023&lt;/strong&gt;, &lt;a href=&quot;https://juliacon.org/2023/&quot;&gt;juliacon.org/2023/&lt;/a&gt;: the call for proposals is posted, deadline Dec 18: &lt;a href=&quot;https://pretalx.com/juliacon2023/cfp&quot;&gt;pretalx.com/juliacon2023/cfp&lt;/a&gt;&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;I&apos;ve spent more quality time focusing on two breakthroughs in understanding the nature and the behavior of machine learning models which came from the &amp;quot;penumbra&amp;quot; of &amp;quot;prosaic alignment&amp;quot; start-ups and which &lt;strong&gt;I wrote about in my previous two posts&lt;/strong&gt;. 
&lt;br /&gt;&lt;br /&gt;&lt;strong&gt;&amp;quot;Grokking is (more or less) solved.&amp;quot;&lt;/strong&gt; I took brief notes between Oct 21 and Oct 23: &lt;a href=&quot;https://github.com/anhinga/2022-notes/tree/main/Grokking-is-solved&quot;&gt;github.com/anhinga/2022-notes/tree/main/Grokking-is-solved&lt;/a&gt;&lt;br /&gt;&lt;br /&gt;&lt;strong&gt;&amp;quot;Generative autoregressive models are simulators.&amp;quot;&lt;/strong&gt; I took extensive notes between Oct 5 and Oct 23: &lt;a href=&quot;https://github.com/anhinga/2022-notes/tree/main/Generative-autoregressive-models-are-similators&quot;&gt;github.com/anhinga/2022-notes/tree/main/Generative-autoregressive-models-are-similators&lt;/a&gt;&lt;br /&gt;&lt;br /&gt;I am continuing to develop thoughts related to these topics, and I am going to gradually write more about them in the comments.&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;&lt;img src=&quot;https://www.dreamwidth.org/tools/commentcount?user=dmm&amp;ditemid=64931&quot; width=&quot;30&quot; height=&quot;12&quot; alt=&quot;comment count unavailable&quot; style=&quot;vertical-align: middle;&quot;/&gt; comments</description>
  <comments>https://dmm.dreamwidth.org/64931.html</comments>
  <category>understanding internals of ai</category>
  <category>julia</category>
  <category>technological singularity</category>
  <category>transformers</category>
  <category>ai safety</category>
  <category>conference</category>
  <category>artificial intelligence</category>
  <category>physics</category>
  <category>philosophy</category>
  <category>machine learning</category>
  <category>anthropic ai</category>
  <lj:security>public</lj:security>
  <lj:reply-count>14</lj:reply-count>
</item>
<item>
  <guid isPermaLink='true'>https://dmm.dreamwidth.org/64571.html</guid>
  <pubDate>Sun, 16 Oct 2022 14:54:41 GMT</pubDate>
  <title>Grokking is (more or less) solved</title>
  <link>https://dmm.dreamwidth.org/64571.html</link>
  <description>The most interesting conceptual AI advances seem lately to come from &amp;quot;prosaic alignment&amp;quot; start-ups. These are companies which believe that the current trend of improving Transformer models is likely to lead straight to AGI, and that better understanding of the nature and properties of these models is key to AI&amp;nbsp;safety (and, of course, it&apos;s also key to better AI&amp;nbsp;capabilities).&lt;br /&gt;&lt;br /&gt;And it is often the case that the key elements of work are done by people &amp;quot;on the edge&amp;quot;, &amp;quot;in the penumbra&amp;quot; of those alignment start-ups.&lt;br /&gt;&lt;br /&gt;In the previous post I mentioned the key new understanding of large Transformer models as &lt;em&gt;&lt;strong&gt;simulators&lt;/strong&gt;&lt;/em&gt;. That work was done &amp;quot;while at Conjecture&amp;quot;, but is not listed as directly coming from Conjecture (one of those &amp;quot;prosaic alignment&amp;quot; start-ups). I think the key people involved are still at Conjecture, but they seem to be trying to keep some distance between Conjecture and this work. I am continuing to take notes on those materials and commit them to GitHub (see links in the comments to the previous post).&lt;br /&gt;&lt;br /&gt;Here is another one of those stories. Grokking is a phenomenon where small Transformers look at a part of a mathematical structure for quite a while, and then rather suddenly transition to understanding the whole of that mathematical structure, including the part they never see in training. It was discovered in 2021 and has been the subject of a number of follow-up attempts to understand it.&lt;br /&gt;&lt;br /&gt;The recent breakthrough was made in mid-August by Neel Nanda, who left Anthropic (perhaps the most famous of the &amp;quot;prosaic alignment&amp;quot; start-ups) a few months ago. And it looks like he has more or less solved the mysteries behind this phenomenon. 
I am going to continue studying his writings. The links are in the comments.&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;&lt;img src=&quot;https://www.dreamwidth.org/tools/commentcount?user=dmm&amp;ditemid=64571&quot; width=&quot;30&quot; height=&quot;12&quot; alt=&quot;comment count unavailable&quot; style=&quot;vertical-align: middle;&quot;/&gt; comments</description>
  <comments>https://dmm.dreamwidth.org/64571.html</comments>
  <category>anthropic ai</category>
  <category>ai safety</category>
  <category>transformers</category>
  <category>machine learning</category>
  <category>artificial intelligence</category>
  <category>understanding internals of ai</category>
  <lj:security>public</lj:security>
  <lj:reply-count>5</lj:reply-count>
</item>
<item>
  <guid isPermaLink='true'>https://dmm.dreamwidth.org/64434.html</guid>
  <pubDate>Wed, 21 Sep 2022 07:25:52 GMT</pubDate>
  <title>Generative autoregressive models are simulators</title>
  <link>https://dmm.dreamwidth.org/64434.html</link>
  <description>Here, finally, what seems to be the right approach to understanding the nature of models like GPT-3, and the assorted magic associated with them, has emerged:&lt;br /&gt;&lt;br /&gt;&lt;a href=&quot;https://www.lesswrong.com/posts/vJFdjigzmcXMhNTsx/simulators&quot;&gt;www.lesswrong.com/posts/vJFdjigzmcXMhNTsx/simulators&lt;/a&gt;&lt;br /&gt;&lt;br /&gt;It says that one should stop thinking about these models in terms of older AI systems.&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;&lt;img src=&quot;https://www.dreamwidth.org/tools/commentcount?user=dmm&amp;ditemid=64434&quot; width=&quot;30&quot; height=&quot;12&quot; alt=&quot;comment count unavailable&quot; style=&quot;vertical-align: middle;&quot;/&gt; comments</description>
  <comments>https://dmm.dreamwidth.org/64434.html</comments>
  <category>physics</category>
  <category>artificial intelligence</category>
  <category>philosophy</category>
  <category>machine learning</category>
  <category>understanding internals of ai</category>
  <category>ai safety</category>
  <category>transformers</category>
  <category>technological singularity</category>
  <lj:security>public</lj:security>
  <lj:reply-count>9</lj:reply-count>
</item>
<item>
  <guid isPermaLink='true'>https://dmm.dreamwidth.org/63823.html</guid>
  <pubDate>Wed, 07 Sep 2022 15:34:08 GMT</pubDate>
  <title>&quot;Transformers are Sample Efficient World Models&quot;</title>
  <link>https://dmm.dreamwidth.org/63823.html</link>
  <description>Another important paper from one of Fran&amp;ccedil;ois Fleuret&apos;s collaborations: &lt;a href=&quot;https://arxiv.org/abs/2209.00588&quot;&gt;arxiv.org/abs/2209.00588&lt;/a&gt;&lt;br /&gt;&lt;br /&gt;Previous important papers include &amp;quot;Transformers are RNNs: Fast Autoregressive Transformers with Linear Attention&amp;quot;, &lt;a href=&quot;https://arxiv.org/abs/2006.16236&quot;&gt;arxiv.org/abs/2006.16236&lt;/a&gt; and &amp;quot;Flatten the Curve: Efficiently Training Low-Curvature Neural Networks&amp;quot;, &lt;a href=&quot;https://arxiv.org/abs/2206.07144&quot;&gt;arxiv.org/abs/2206.07144&lt;/a&gt;&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;&lt;img src=&quot;https://www.dreamwidth.org/tools/commentcount?user=dmm&amp;ditemid=63823&quot; width=&quot;30&quot; height=&quot;12&quot; alt=&quot;comment count unavailable&quot; style=&quot;vertical-align: middle;&quot;/&gt; comments</description>
  <comments>https://dmm.dreamwidth.org/63823.html</comments>
  <category>artificial intelligence</category>
  <category>neural networks</category>
  <category>transformers</category>
  <category>machine learning</category>
  <lj:security>public</lj:security>
  <lj:reply-count>1</lj:reply-count>
</item>
<item>
  <guid isPermaLink='true'>https://dmm.dreamwidth.org/62501.html</guid>
  <pubDate>Tue, 09 Aug 2022 03:13:17 GMT</pubDate>
  <title>Open source code generator (an alternative to OpenAI Codex)</title>
  <link>https://dmm.dreamwidth.org/62501.html</link>
  <description>&lt;a href=&quot;https://github.com/salesforce/CodeGen&quot;&gt;github.com/salesforce/CodeGen&lt;/a&gt;&lt;br /&gt;&lt;br /&gt;One can also run one of these models via HuggingFace; it is based on the &amp;quot;A Conversational Paradigm for Program Synthesis&amp;quot; paper, &lt;a href=&quot;https://arxiv.org/abs/2203.13474&quot;&gt;arxiv.org/abs/2203.13474&lt;/a&gt;&lt;br /&gt;&lt;br /&gt;Someone has even created a fake GitHub Copilot based on that (useful for those who prefer VSCode): &lt;a href=&quot;https://github.com/moyix/fauxpilot&quot;&gt;github.com/moyix/fauxpilot&lt;/a&gt;&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;&lt;img src=&quot;https://www.dreamwidth.org/tools/commentcount?user=dmm&amp;ditemid=62501&quot; width=&quot;30&quot; height=&quot;12&quot; alt=&quot;comment count unavailable&quot; style=&quot;vertical-align: middle;&quot;/&gt; comments</description>
  <comments>https://dmm.dreamwidth.org/62501.html</comments>
  <category>artificial intelligence</category>
  <category>openai codex</category>
  <category>understanding internals of ai</category>
  <category>program synthesis</category>
  <category>transformers</category>
  <category>github copilot</category>
  <category>machine learning</category>
  <lj:security>public</lj:security>
  <lj:reply-count>6</lj:reply-count>
</item>
<item>
  <guid isPermaLink='true'>https://dmm.dreamwidth.org/62298.html</guid>
  <pubDate>Sat, 23 Jul 2022 20:51:52 GMT</pubDate>
  <title>Learned optimizers and related topics</title>
  <link>https://dmm.dreamwidth.org/62298.html</link>
  <description>&lt;a href=&quot;https://github.com/google/learned_optimization&quot;&gt;github.com/google/learned_optimization&lt;/a&gt; - &amp;quot;Meta-learning optimizers and more with JAX&amp;quot; &lt;br /&gt;&lt;br /&gt;This is used by various interesting papers, including the famous &amp;quot;persistent evolution strategies&amp;quot; paper (which I don&apos;t understand) and the tempting paper &amp;quot;Gradients are Not All You Need&amp;quot;, &lt;a href=&quot;https://arxiv.org/abs/2111.05803&quot;&gt;arxiv.org/abs/2111.05803&lt;/a&gt;.&lt;br /&gt;&lt;br /&gt;Moreover, it is used by the super-interesting, must-read paper &amp;quot;Practical tradeoffs between memory, compute, and performance in learned optimizers&amp;quot;,&amp;nbsp;&lt;a href=&quot;https://arxiv.org/abs/2203.11860&quot;&gt;arxiv.org/abs/2203.11860&lt;/a&gt;, which is being published at the following conference:&amp;nbsp;&lt;a href=&quot;https://lifelong-ml.cc/&quot;&gt;lifelong-ml.cc/&lt;/a&gt; (Conference on Lifelong Learning Agents - CoLLAs 2022, Aug 18-24)&lt;br /&gt;&lt;br /&gt;&lt;img src=&quot;https://www.dreamwidth.org/tools/commentcount?user=dmm&amp;ditemid=62298&quot; width=&quot;30&quot; height=&quot;12&quot; alt=&quot;comment count unavailable&quot; style=&quot;vertical-align: middle;&quot;/&gt; comments</description>
  <comments>https://dmm.dreamwidth.org/62298.html</comments>
  <category>machine learning</category>
  <category>jax</category>
  <category>learned optimizers</category>
  <category>artificial intelligence</category>
  <category>conference</category>
  <lj:security>public</lj:security>
  <lj:reply-count>9</lj:reply-count>
</item>
<item>
  <guid isPermaLink='true'>https://dmm.dreamwidth.org/62181.html</guid>
  <pubDate>Sat, 16 Jul 2022 19:20:25 GMT</pubDate>
  <title>AutoML Conference 2022 (July 25-27)</title>
  <link>https://dmm.dreamwidth.org/62181.html</link>
  <description>1st International Conference on Automated Machine Learning: &lt;a href=&quot;https://automl.cc/&quot;&gt;automl.cc/&lt;/a&gt;&lt;br /&gt;&lt;br /&gt;Follows ICML 2022 🇺🇦 &lt;a href=&quot;https://icml.cc/&quot;&gt;icml.cc/&lt;/a&gt; (one can attend virtually as well)&lt;br /&gt;&lt;br /&gt;Neural Architecture Search is prominent and includes a competition: &lt;a href=&quot;https://sites.google.com/view/zero-cost-nas-competition/home&quot;&gt;sites.google.com/view/zero-cost-nas-competition/home&lt;/a&gt;&lt;br /&gt;&lt;br /&gt;The most notable keynote is by Jeff Clune, &amp;quot;AI-generating algorithms: the fastest path to AGI?&amp;quot;&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;&lt;img src=&quot;https://www.dreamwidth.org/tools/commentcount?user=dmm&amp;ditemid=62181&quot; width=&quot;30&quot; height=&quot;12&quot; alt=&quot;comment count unavailable&quot; style=&quot;vertical-align: middle;&quot;/&gt; comments</description>
  <comments>https://dmm.dreamwidth.org/62181.html</comments>
  <category>artificial intelligence</category>
  <category>conference</category>
  <category>machine learning</category>
  <lj:security>public</lj:security>
  <lj:reply-count>3</lj:reply-count>
</item>
<item>
  <guid isPermaLink='true'>https://dmm.dreamwidth.org/59553.html</guid>
  <pubDate>Mon, 09 May 2022 14:38:22 GMT</pubDate>
  <title>&quot;JAX vs Julia (vs PyTorch)&quot;</title>
  <link>https://dmm.dreamwidth.org/59553.html</link>
  <description>&lt;a href=&quot;https://kidger.site/thoughts/jax-vs-julia/&quot;&gt;kidger.site/thoughts/jax-vs-julia/&lt;/a&gt;&lt;br /&gt;&lt;br /&gt;The various correspondences between JAX and Julia constructions he lists there are quite useful for people practicing either JAX or Julia.&lt;br /&gt;&lt;br /&gt;(I am having a good time with both JAX and Julia this year.)&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;&lt;img src=&quot;https://www.dreamwidth.org/tools/commentcount?user=dmm&amp;ditemid=59553&quot; width=&quot;30&quot; height=&quot;12&quot; alt=&quot;comment count unavailable&quot; style=&quot;vertical-align: middle;&quot;/&gt; comments</description>
  <comments>https://dmm.dreamwidth.org/59553.html</comments>
  <category>jax</category>
  <category>julia</category>
  <category>machine learning</category>
  <category>differentiable programming</category>
  <lj:security>public</lj:security>
  <lj:reply-count>2</lj:reply-count>
</item>
<item>
  <guid isPermaLink='true'>https://dmm.dreamwidth.org/47832.html</guid>
  <pubDate>Fri, 20 Aug 2021 12:59:24 GMT</pubDate>
  <title>Compact Transformers</title>
  <link>https://dmm.dreamwidth.org/47832.html</link>
  <description>For those of us (like myself) who&apos;d like to experiment with changing Transformer architecture on a home personal computer.&lt;br /&gt;&lt;br /&gt;Links are in the comments.&lt;br /&gt;&lt;br /&gt;&lt;img src=&quot;https://www.dreamwidth.org/tools/commentcount?user=dmm&amp;ditemid=47832&quot; width=&quot;30&quot; height=&quot;12&quot; alt=&quot;comment count unavailable&quot; style=&quot;vertical-align: middle;&quot;/&gt; comments</description>
  <comments>https://dmm.dreamwidth.org/47832.html</comments>
  <category>compact ml models</category>
  <category>transformers</category>
  <category>machine learning</category>
  <lj:security>public</lj:security>
  <lj:reply-count>12</lj:reply-count>
</item>
<item>
  <guid isPermaLink='true'>https://dmm.dreamwidth.org/47385.html</guid>
  <pubDate>Mon, 16 Aug 2021 12:55:29 GMT</pubDate>
  <title>MSML21: Mathematical and Scientific Machine Learning</title>
  <link>https://dmm.dreamwidth.org/47385.html</link>
  <description>Starts in 5 minutes:&lt;br /&gt;&lt;br /&gt;&lt;a href=&quot;https://msml21.github.io/&quot;&gt;msml21.github.io/&lt;/a&gt;&lt;br /&gt;&lt;br /&gt;No registration is needed - they are just handling it in a relaxed fashion&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;&lt;img src=&quot;https://www.dreamwidth.org/tools/commentcount?user=dmm&amp;ditemid=47385&quot; width=&quot;30&quot; height=&quot;12&quot; alt=&quot;comment count unavailable&quot; style=&quot;vertical-align: middle;&quot;/&gt; comments</description>
  <comments>https://dmm.dreamwidth.org/47385.html</comments>
  <category>mathematics</category>
  <category>machine learning</category>
  <category>conference</category>
  <lj:security>public</lj:security>
  <lj:reply-count>17</lj:reply-count>
</item>
</channel>
</rss>
