Started experimenting with GPT-4
Mar. 21st, 2023 01:19 am
Via ChatGPT+, at twenty bucks a month. In short, the difference between this thing and ChatGPT is huge, and there is no point in dragging my feet any longer.
Right away she opened with a warning: "GPT-4 currently has a cap of 25 messages every 3 hours. Expect significantly lower caps, as we adjust for demand."
But when I asked her for advice on my project, she performed very nicely, technically competent in the specific context, and by no means everything she said was commonplace (and, in any case, she clearly sees the field more broadly than I do; I "know in principle" almost all of it, and with some of it I might even disagree, but, on the other hand, it is all on point, and most of it I would not have recalled on my own).
no subject
Date: 2023-03-21 05:27 am (UTC)
https://github.com/anhinga/DMM-synthesis-lab-journal/blob/main/drafts/GPT-4-conversations/first-conversation.md
no subject
Date: 2023-03-21 05:28 am (UTC)
Improve efficiency and scalability: Investigate methods to optimize the performance of Zygote.jl when working with your neural machines. This might involve exploring ways to make gradient computation more efficient, either by implementing custom adjoints or by developing new techniques for optimizing the computational graph.
Benchmark against existing methods: Compare the performance of your synthesized neural machines with traditional machine learning frameworks or other state-of-the-art methods in neural architecture search. This will help you understand the strengths and weaknesses of your approach, as well as provide insights into potential areas of improvement.
Experiment with different problem domains: Extend your work to other problem domains and datasets to test the versatility and generalizability of your neural architecture search algorithm. This will help you better understand the capabilities of your synthesized neural machines and potentially identify new applications.
Investigate alternative search algorithms: While you've already proposed a novel algorithm for neural architecture search, you could consider exploring other search algorithms, such as genetic algorithms, reinforcement learning, or Bayesian optimization, to understand the trade-offs between different approaches and possibly improve the search process.
Study the properties of synthesized neural machines: Analyze the properties of the neural machines that your algorithm synthesizes, such as their topology, weight distribution, and learning dynamics. This could provide insights into the underlying principles governing the success of your approach and help you refine your algorithm.
Collaborate with the community: Share your findings and code with the research community, either by publishing a paper or by creating a public repository for your project. This will encourage others to build on your work, provide feedback, and potentially lead to new collaborations.
Fine-tune the training process: Explore various regularization techniques, learning rate schedules, and optimization algorithms to improve the training process and the performance of the synthesized neural machines. Experimenting with different hyperparameters and training strategies might lead to better generalization and more compact architectures.
no subject
Date: 2023-03-21 06:01 am (UTC)
Sounds like standard BS to me. I'd love to experiment, though; not sure yet.
Can it produce LaTeX? I bet it can.
no subject
Date: 2023-03-21 06:16 am (UTC)
Almost everything is on point, and she talks like a competent specialist... What she said in the last paragraph is exactly what I have been thinking about, only she phrases it better than I do...
Now let me look at the LaTeX question.
no subject
Date: 2023-03-21 06:29 am (UTC)
I wrote "I have the following markdown text; please generate equivalent LaTeX:" and pasted in this markdown: https://github.com/anhinga/2022-notes/blob/main/Generative-autoregressive-models-are-similators/README.md
She replied with the following (I have not checked yet whether it is all correct):
Here is the equivalent LaTeX code for your provided Markdown text:
\section{Generative autoregressive models are simulators}
A radical novel understanding of the nature of GPT-3-like models seems to be crystallizing.
\textit{These models are simulators, they generate various entities (which tend to be rather short-lived at the moment),
and users are interacting with those entities. This novel understanding explains a lot of observable phenomena and
opens new promising directions of work.}
A lot of people were seeing glimpses of this for a few years now. E.g. when I was playing with GPT-2 in 2019, I noticed that
it looked like a virtual personality was created at first (sampled from a distribution of virtual personalities
conditioned by the initial prompt), and then that personality was speaking to me. Or, at least, that was my impression.
\begin{center}
\line(1,0){250}
\end{center}
Fast-forwarding to September 2022:
The following text has been published by \textbf{``janus''} on Sep 2, 2022:
\url{https://www.lesswrong.com/posts/vJFdjigzmcXMhNTsx/simulators}
It has been further publicized and discussed by Scott Alexander on Sep 19:
\url{https://astralcodexten.substack.com/p/janus-gpt-wrangling}
This both crystallizes a lot of that new understanding, shows that this viewpoint is gradually proliferating in leading
organizations focusing on better understanding the true nature of modern generative Transformers and on leveraging this
improved understanding, and helps to disseminate this viewpoint (due to the popularity of Scott Alexander's substack in
the relevant circles of professionals).
This subdirectory is for the materials related to this new understanding.
\begin{center}
\line(1,0){250}
\end{center}
\subsection{My reading notes:}
\href{reading-janus-gpt-wrangling.md}{reading-janus-gpt-wrangling.md} - written (this page also contains links to some tools and papers by ``Janus'' including their \textbf{loom} tool and ``Prompt Programming for Large Language Models: Beyond the Few-Shot Paradigm'' paper, \url{https://arxiv.org/abs/2102.07350})
\href{reading-simulators-essay.md}{reading-simulators-essay.md} - written
\begin{center}
\line(1,0){250}
\end{center}
\subsection{Thinking further about this topic}
\textit{Work in progress}
*****
She also said:
Make sure to include the necessary packages in your LaTeX document preamble for the hyperlink and URL functions, such as:
\usepackage{hyperref}
\usepackage{url}
Please note that due to the limitations of the text format here, you may need to adjust the formatting slightly (e.g., line breaks) when copying this code into your LaTeX editor.
Now let me see how it came out...
no subject
Date: 2023-03-21 01:16 pm (UTC)
You mean one can hand it the entire project from GitHub to tear apart?
no subject
Date: 2023-03-21 02:49 pm (UTC)
no subject
Date: 2023-03-21 06:36 am (UTC)
I am ready to give her credit for this LaTeX assignment.
no subject
Date: 2023-03-21 07:02 am (UTC)
I am asking her:
Great! Almost there, but I'd like some minor changes. I'd like paragraphs to start on the left margin without indent, to have some modest vertical space between paragraphs, and to have the whole LaTeX document generated starting from \documentclass
She is saying:
Sure! Here's the full LaTeX document with the requested formatting changes:
\documentclass{article}
\usepackage{hyperref}
\usepackage{url}
% Remove paragraph indent
\setlength{\parindent}{0pt}
% Add vertical space between paragraphs
\setlength{\parskip}{6pt plus 2pt minus 1pt}
\begin{document}
\section{Generative autoregressive models are simulators}
A radical novel understanding of the nature of GPT-3-like models seems to be crystallizing.
\textit{These models are simulators, they generate various entities (which tend to be rather short-lived at the moment),
and users are interacting with those entities. This novel understanding explains a lot of observable phenomena and
opens new promising directions of work.}
A lot of people were seeing glimpses of this for a few years now. E.g. when I was playing with GPT-2 in 2019, I noticed that
it looked like a virtual personality was created at first (sampled from a distribution of virtual personalities
conditioned by the initial prompt), and then that personality was speaking to me. Or, at least, that was my impression.
\begin{center}
\line(1,0){250}
\end{center}
Fast-forwarding to September 2022:
The following text has been published by \textbf{``janus''} on Sep 2, 2022:
\url{https://www.lesswrong.com/posts/vJFdjigzmcXMhNTsx/simulators}
It has been further publicized and discussed by Scott Alexander on Sep 19:
\url{https://astralcodexten.substack.com/p/janus-gpt-wrangling}
This both crystallizes a lot of that new understanding, shows that this viewpoint is gradually proliferating in leading
organizations focusing on better understanding the true nature of modern generative Transformers and on leveraging this
improved understanding, and helps to disseminate this viewpoint (due to the popularity of Scott Alexander's substack in
the relevant circles of professionals).
This subdirectory is for the materials related to this new understanding.
\begin{center}
\line(1,0){250}
\end{center}
\subsection{My reading notes:}
\href{reading-janus-gpt-wrangling.md}{reading-janus-gpt-wrangling.md} - written (this page also contains links to some tools and papers by ``Janus'' including their \textbf{loom} tool and ``Prompt Programming for Large Language Models: Beyond the Few-Shot Paradigm'' paper, \url{https://arxiv.org/abs/2102.07350})
\href{reading-simulators-essay.md}{reading-simulators-essay.md} - written
\begin{center}
\line(1,0){250}
\end{center}
\subsection{Thinking further about this topic}
\textit{Work in progress}
\end{document}
This LaTeX code includes the document class, necessary packages, and the requested formatting changes. Make sure to copy the entire code into your LaTeX editor and adjust the formatting as needed.
*****
Let me see if it works...
no subject
Date: 2023-03-21 01:11 pm (UTC)
I wonder if I can delegate formatting the whole book to it. User chaource formats his perfectly; I'm absolutely pathetic.
no subject
Date: 2023-03-21 02:50 pm (UTC)
no subject
Date: 2023-03-21 03:50 pm (UTC)
I couldn't find a word counter; it's 250 pages, and at 2000 characters per page and 5 letters per word that works out to 100 thousand words.
no subject
Date: 2023-03-21 04:53 pm (UTC)
So I would expect that this can be done (at any rate, if the 32-thousand-token window is enabled).
no subject
Date: 2023-03-21 10:46 pm (UTC)
But even so, one can convert an initial fragment and then tell her to continue in the same manner...
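Spelling out the arithmetic from the comment above (the tokens-per-word conversion is my own rough assumption, using the common rule of thumb of about 3/4 of an English word per token):
250 pages × 2000 characters = 500,000 characters
500,000 characters / 5 letters per word = 100,000 words
100,000 words / 0.75 ≈ 133,000 tokens
So even the 32,768-token window would hold only a fraction of the book at a time, and converting it fragment by fragment seems unavoidable.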
no subject
Date: 2023-03-21 02:56 pm (UTC)
no subject
Date: 2023-03-21 07:05 am (UTC)
no subject
Date: 2023-03-21 04:14 pm (UTC)
For example (IOCCC 2020/burton):
int main(int b,char**i){long long n=B,a=I^n,r=(a/b&a)>>4,y=atoi(*++i),_=(((a^n/b)*(y>>T)|y>>S)&r)|(a^r);printf("%.8s\n",(char*)&_);}
The trick is that there is no page on the site that shows both the program text and its explanation.
The correct answer should be: "I don't know how B, I, T, S are defined; they are probably macros."
To be fair to her, one can precede the program with the values:
#define B 6945503773712347754LL
#define I 5859838231191962459LL
#define T 0
#define S 7
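A quick way to see what these constants hide (a sketch of my own, not part of the IOCCC entry): the program's first step is a = I ^ n with n = B, and the bytes of that value can be printed as text, the same way the program itself prints _ at the end.
#include <stdio.h>
#define B 6945503773712347754LL
#define I 5859838231191962459LL
/* print the 8 bytes of a = I ^ B as characters, reusing the
   printf("%.8s\n", ...) trick from the obfuscated program */
int main(void) {
    long long a = I ^ B;
    printf("%.8s\n", (char *)&a);
    return 0;
}
If my hand computation is right, I ^ B equals 0x3131313131313131, so on a little-endian machine this prints "11111111": the constants are packed ASCII bytes, not numbers meant to be taken at face value.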
no subject
Date: 2023-03-21 05:01 pm (UTC)
What does this program do?
int main(int b,char**i){long long n=B,a=I^n,r=(a/b&a)>>4,y=atoi(*++i),_=(((a^n/b)*(y>>T)|y>>S)&r)|(a^r);printf("%.8s\n",(char*)&_);}
This C program accepts a single command-line argument and prints an 8-character string based on the input and some bitwise operations. Here's a breakdown of the program:
int main(int b,char**i) is the main function definition, where b is the number of command-line arguments (including the program name itself) and i is an array of pointers to the arguments.
long long n=B,a=I^n,r=(a/b&a)>>4 declares three long long variables and initializes them:
n is initialized with the value of B (66 in ASCII).
a is initialized with the result of bitwise XOR between I (73 in ASCII) and n.
r is initialized with the result of bitwise AND between a/b and a, right-shifted by 4 bits.
y=atoi(*++i) increments the pointer i to point to the first command-line argument (excluding the program name) and converts that argument to an integer using the atoi() function. The result is stored in the integer variable y.
_=(((a^n/b)*(y>>T)|y>>S)&r)|(a^r) performs a series of bitwise operations and arithmetic calculations on the variables a, n, r, y, T (84 in ASCII), and S (83 in ASCII). The result is stored in the integer variable _.
printf("%.8s\n",(char*)&_); prints the first 8 characters of the variable _ as a string, followed by a newline character.
The program expects one command-line argument (a number) and prints an 8-character string based on the input number and a series of bitwise operations.
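(Notice that instead of the correct answer anticipated above, "I don't know how B, I, T, S are defined", she confabulated values for them, reading the one-letter macro names as ASCII character codes: 'B' is 66, 'I' is 73, 'T' is 84, 'S' is 83.)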
no subject
Date: 2023-03-21 05:39 pm (UTC)
no subject
Date: 2023-03-21 05:07 pm (UTC)
I SAY:
What does this program do?
#define B 6945503773712347754LL
#define I 5859838231191962459LL
#define T 0
#define S 7
int main(int b,char**i){long long n=B,a=I^n,r=(a/b&a)>>4,y=atoi(*++i),_=(((a^n/b)*(y>>T)|y>>S)&r)|(a^r);printf("%.8s\n",(char*)&_);}
SHE SAYS:
This C program accepts a single command-line argument (a number) and prints an 8-character string based on the input and some bitwise operations. The main difference from the previous version is that this version uses macros to define the values of B, I, T, and S.
Here's a breakdown of the program:
The #define lines create four macros with specified values:
B is set to 6945503773712347754LL.
I is set to 5859838231191962459LL.
T is set to 0.
S is set to 7.
int main(int b,char**i) is the main function definition, where b is the number of command-line arguments (including the program name itself) and i is an array of pointers to the arguments.
long long n=B,a=I^n,r=(a/b&a)>>4 declares three long long variables and initializes them:
n is initialized with the value of B.
a is initialized with the result of bitwise XOR between I and n.
r is initialized with the result of bitwise AND between a/b and a, right-shifted by 4 bits.
y=atoi(*++i) increments the pointer i to point to the first command-line argument (excluding the program name) and converts that argument to an integer using the atoi() function. The result is stored in the integer variable y.
_=(((a^n/b)*(y>>T)|y>>S)&r)|(a^r) performs a series of bitwise operations and arithmetic calculations on the variables a, n, r, y, T, and S. The result is stored in the integer variable _.
printf("%.8s\n",(char*)&_); prints the first 8 characters of the variable _ as a string, followed by a newline character.
The program expects one command-line argument (a number) and prints an 8-character string based on the input number and a series of bitwise operations.
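Her gloss of the printf line deserves one extra step: "%.8s" applied to (char *)&_ prints the raw bytes of a 64-bit integer as characters, so the output depends on byte order. A minimal illustration of the trick (my own sketch with a made-up constant, assuming a little-endian machine):
#include <stdio.h>
/* the low five bytes of x spell 'h','e','l','l','o'; the upper bytes are zero */
int main(void) {
    long long x = 0x6f6c6c6568LL;
    printf("%.8s\n", (char *)&x); /* prints "hello" on a little-endian machine */
    return 0;
}
On a big-endian machine the first byte in memory would be zero, so this would print an empty line; the obfuscated program depends on the same byte-order behavior for its output.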