dmm | Recent Entries

Both 70B and 8B versions are very impressive on initial blind comparisons: chat.lmsys.org/?leaderboard

If Llama 3-70B-Instruct does indeed turn out to be more or less equivalent to early GPT-4, this would have wide-ranging implications.

One can use Llama 3-70B-Instruct at www.meta.ai/ for free.

Impressions of the total solar eclipse.

Lots of different impressions, but the most vivid one is this.

Cameras take photographs that are no good at all; or, at least, those are the photographs that people customarily publish. In reality, the new moon completely covering the sun is, of course, not black, as in all the photographs, but the same color as the sunset sky around it: a very beautiful, moderately dark blue, not even especially dark, quite magical. For some reason, cameras are unable to capture this (because of the contrast with the corona, perhaps), but one would think the images could at least be processed in Photoshop rather than published looking so unlike what is actually there...

github.com/GBirkel/ljdump

garote.dreamwidth.org/330489.html

Vernor Vinge died at 79 on March 20 (after a long decline from Parkinson's disease):

en.wikipedia.org/wiki/Vernor_Vinge

The history of the creation of "Attention Is All You Need", arxiv.org/abs/1706.03762

It's pretty intense; it's very interesting what it took to achieve that.

It was pretty informative throughout (I rarely watch long videos, especially if a transcript is available, but I watched this one; the last 10 min were particularly crazy in a good way).

The places where he demurred or hedged were also quite interesting; this did provide a good window into all this...

twitter.com/lexfridman/status/1769755831619219527

1 year since the GPT-4 launch. Everything is progressing rapidly; the last few weeks have been intense.

GPT-4 now has some real competition from Claude 3 Opus and from Gemini models.

It also has some problems: in its default multimodal configuration, the system prompt is too long, and that can interfere with its thinking.

There is a URL for a non-multimodal "Classic" version, which might be better in this sense: chat.openai.com/g/g-YyyyMT9XH-chatgpt-classic

"In January 1941, President Franklin Roosevelt came to this chamber to speak to the nation."

"What makes our moment rare is that freedom and democracy are under attack, both at home and overseas, at the very same time."

A unanimous judgment (stating that this is a federal issue which cannot be decided by states), but 4 justices wrote separate opinions objecting to the overreach of the decision (starting on page 14).

A 20 page PDF: www.supremecourt.gov/opinions/23pdf/23-719_19m2.pdf

www.sciphijournal.org/index.php/2017/11/12/why-the-culture-wins-an-appreciation-of-iain-m-banks/

An overview of BlueSky and atproto by Steve Klabnik:

steveklabnik.com/writing/how-does-bluesky-work

This does look very attractive...

Some visual illusions are very strong; this one might be the strongest I've seen yet:

twitter.com/ComputingByArts/status/1743675596268601546

One can place the cursor at various parts of the image to verify that the wire frame is not moving relative to the screen.

I don't know if there is a detailed neuroscience understanding of this one.

May 11, 2025 update: Another version of that, presumably: x.com/algekalipso/status/1921416201642930357

And this will be the year of the "newly born, growing dragon"; and the outgoing year is the year of the "dying, departing rabbit".

And this is when, I feel, "everything" will begin; everything points to the coming "year of critical upheavals", a year when the world will change radically...

It would be good for us to get through it successfully and enter a new phase...

The Lean theorem prover seems to be the most convenient and practical one (good enough for mathematicians to use it in real work; I am seeing plenty of such stories lately, and there is a bit of controversy on whether to use Lean 3 or Lean 4).

The best quote according to "leanprover" twitter:

>Our favorite quote from the paper: "Monadic syntax is excellent for expressing stochastic algorithms, and working over finitely supported distributions avoids the need for integrability side conditions during proofs."

Links are in the comments.

A 4-3 decision to disqualify you-know-who from the primary ballot.

"The court stays its ruling until January 4, 2024, subject to any further appellate proceedings."

Now he-who-must-not-be-named has a choice whether to appeal this in federal courts, but a loss there might result in a wider disqualification.

People can enjoy reading the 213-page decision, but the gist of it is on pages 6-9 (pages 7-10 of the PDF file).

The main difference between classical Transformers and the new generation (which includes GPT-4) is that the new generation seems to consist of "mixtures-of-experts": each feedforward layer is subdivided into "experts", and only a few of those "experts" are activated for each token during inference.

I think this is key to both GPT-4 and Mixtral 8x7B (which I suspect is approximately an open-source mini-GPT-4 and which is the new leading open-source model roughly equivalent to GPT-3.5 in performance).

Of course, GPT-4 might have some extra magic secret sauce besides being (according to rumors) a "mixture-of-experts" and scale (given how difficult it has been to even reproduce its performance so far).

Hugging Face published a very nice tutorial recently: huggingface.co/blog/moe
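A minimal sketch of the routing idea described above, in plain Python with no dependencies. All names here (`ToyMoELayer`, `router`, `top_k`) and the random weights are hypothetical and chosen only for illustration; real models apply this per token inside each Transformer block and add training-time details such as load balancing. The choice of 8 experts with 2 active mirrors what has been publicly reported for Mixtral 8x7B; the GPT-4 configuration is not public.

```python
import math
import random


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def matvec(matrix, vector):
    """Multiply a matrix (given as a list of rows) by a vector."""
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]


def relu(vector):
    return [max(0.0, v) for v in vector]


class ToyMoELayer:
    """Hypothetical toy layer: several small feedforward 'experts' plus a linear router."""

    def __init__(self, dim, hidden, num_experts, top_k, seed=0):
        rng = random.Random(seed)

        def rand_matrix(rows, cols):
            return [[rng.gauss(0.0, 0.1) for _ in range(cols)] for _ in range(rows)]

        self.top_k = top_k
        # One router row per expert: it scores how well the expert suits the input.
        self.router = rand_matrix(num_experts, dim)
        # Each expert is an ordinary two-layer feedforward net (dim -> hidden -> dim).
        self.experts = [
            (rand_matrix(hidden, dim), rand_matrix(dim, hidden))
            for _ in range(num_experts)
        ]

    def forward(self, x):
        scores = matvec(self.router, x)
        # Keep only the top_k highest-scoring experts; the others are never evaluated.
        chosen = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
        chosen = chosen[: self.top_k]
        # Renormalize the router scores over the chosen experts only.
        weights = softmax([scores[i] for i in chosen])
        out = [0.0] * len(x)
        for w, i in zip(weights, chosen):
            w_in, w_out = self.experts[i]
            y = matvec(w_out, relu(matvec(w_in, x)))
            out = [o + w * v for o, v in zip(out, y)]
        return out, chosen


layer = ToyMoELayer(dim=4, hidden=8, num_experts=8, top_k=2)
output, active = layer.forward([1.0, -0.5, 0.25, 2.0])
print(len(output), active)
```

The point of the design is that the parameter count grows with the number of experts, while the compute per token grows only with `top_k`.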

arxiv.org/abs/2103.06376

page 2: "Semi-ring dictionaries realize the well-known connection between relations and tensors" (from "In-Database Learning with Sparse Tensors" 2016-2018 paper)
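One way to see the connection in that quote, as a rough sketch in plain Python rather than in the paper's own language: a sparse tensor and a relation with multiplicities can both be stored as a dictionary from key tuples to values in a semiring, and then a single contraction routine serves as both matrix multiplication and a relational join. The function name `sd_product` and the sample data are hypothetical, and the semiring is fixed to ordinary numbers with `+` and `*`.

```python
from collections import defaultdict


def sd_product(left, right):
    """Contract dictionaries keyed by (i, j) and (j, k) over the shared key j.

    Values are combined with * and accumulated with +, so any semiring-like
    value type that supports those two operations would work.
    """
    # Index the right-hand side by its first key component.
    by_j = defaultdict(list)
    for (j, k), v in right.items():
        by_j[j].append((k, v))

    result = defaultdict(int)
    for (i, j), u in left.items():
        for k, v in by_j.get(j, []):
            result[(i, k)] += u * v
    return dict(result)


# Read as sparse matrices: keys are (row, column), values are entries.
A = {(0, 0): 2, (0, 1): 3, (1, 1): 4}
B = {(0, 0): 1, (1, 0): 5}
print(sd_product(A, B))

# Read as relations: keys are tuples, values are multiplicities (bag semantics).
enrolled = {("alice", "math"): 1, ("bob", "math"): 1, ("bob", "art"): 1}
located = {("math", "room1"): 1, ("art", "room2"): 1}
joined = sd_product(enrolled, located)
print(joined)
```

In the first call the result is the sparse matrix product; in the second it is the join of the two relations on the shared course column, projected onto (student, room).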

"Are there evolutionary models which try to explain why the tendency to engage in self-deception and other numerous cognitive biases are not pruned away by evolutionary selection?"

They switched me to the new integrated mode. This is supposed to have all kinds of upgrades; in particular, it is now possible to read and write images in one session.

I still don't know if it is possible to competently edit images via this workflow (I'll try, and I'll read what other people say, but it might be difficult without a more image-oriented input system). Nevertheless, just as in dmm.dreamwidth.org/76698.html, I read in the avatar icon I am using here, talk to the new GPT-4 about it, and then ask it to produce images based on that; see the comments for the conversation and images.

Profile

Dataflow matrix machines (by Anhinga anhinga)

Neuromorphic Computations with Linear Streams

Page generated Jul. 8th, 2025 09:35 am

Recent Entries

Llama 3 models are pretty spectacular

Solar eclipse

New "ljdump" for Dreamwidth and LJ

Vernor Vinge

WIRED published a story on Transformer invention

New Sam Altman @ Lex Fridman

ChatGPT Classic/GPT-4 Classic

This was a war-like State of the Union address

Today's court decision

Overview of "The Culture" by Iain M. Banks

BlueSky architecture

February 24

One of the really strong visual motion illusions

Chinese New Year begins on February 10

DeepMind has formalized a theoretical result related to AI safety in Lean

Colorado Supreme Court decision

"Mixture-of-Experts" and Transformers

"Functional Collection Programming with Semi-Ring Dictionaries"

Evolution of Cognitive Biases (asking the AI)

New integrated mode for GPT-4 in ChatGPT+