dmm | New Year resolution

Entry tags:

New Year resolution

To read my https://twitter.com/home more regularly (that's absolutely the best source of info at the moment).

A small fraction of today's catch:

New work by Janus

A new involved take on AI safety/alignment

(What's the right way to organize all that information?)

Links are in the comments (I think the new work by Janus is more important even for alignment, and is just overall more important of the two topics of this post)...

Flat | Top-Level Comments Only

> A new involved take on AI safety/alignment

https://theinsideview.ai/david

via https://twitter.com/MichaelTrazzi/status/1611824181636644864

"David Krueger On Academic Alignment"

"happy to finally share my 3h conversation with David Krueger, Cambridge Professor, about the AI Alignment research going on at his lab, coordination, takeoff speeds and why he doesn't have a research agenda

transcript: https://theinsideview.ai/david
youtube: [...]"

A mess, though, not a neatly organized thing...

https://www.lesswrong.com/posts/3ZdfoezXhKZHakEzy/ethan-caballero-on-broken-neural-scaling-laws-deception-and

https://arxiv.org/abs/2210.14891 "Broken Neural Scaling Laws"

*****

"Unifying Grokking and Double Descent" https://openreview.net/forum?id=JqtHMZtqWm

*****

"Assistance with large language models", https://openreview.net/forum?id=OE9V81spp6B

Edited 2023-01-08 06:18 (UTC)

https://neurips2022.mlsafety.org/

https://theinsideview.ai/david#coordination-is-neglected-and-there-are-many-low-hanging-fruits

'you shouldn’t just say, “Here are my timelines,” and it’s a number, because timelines aren’t a number, they’re distribution. You might want to invest in something that looks like more likely to pay off over long timelines, even if you think timelines are likely to be short. You still have a significant amount of mass on long timelines'

> New work by Janus

https://www.lesswrong.com/posts/nmMorGE4MS4txzr8q/simulators-seminar-sequence-1-background-and-shared

https://www.lesswrong.com/posts/TTn6vTcZ3szBctvgb/simulators-seminar-sequence-2-semiotic-physics

"Meta: Over the past few months, we've held a seminar series on the "Simulators theory" by janus. As the theory is actively under development, the purpose of the series is to discover central structures and open problems. Our aim with this sequence is to share some of our discussions with a broader audience and to encourage new research on the questions we uncover. Below, we outline the broader rationale and shared assumptions of the participants of the seminar."

I am now reading Janus twitter:

https://twitter.com/ComputingByArts/status/1611958021906731009

https://twitter.com/repligate/status/1611934083780673538

"Natural language is unfathomably versatile in what it can specify and inspire, and it's been long waiting for a more literate entity than mankind to come into its full power as a programming language ;)"

My October write-up on the "Simulators theory": https://github.com/anhinga/2022-notes/tree/main/Generative-autoregressive-models-are-similators

(I used to tentatively call it "Janus paradigm" or "Simulators paradigm".)

This new "Janus sequence" is focusing on alignment a lot (good!)

"Conditional on having a foundational understanding of simulators (supported by theorems and empirical results), we hope to be able to construct a simulation (or a set of simulations) that reliably produces useful alignment research."

Janus in the comments to the first post:

"It's like this: magic exists now. The amount of magic in the world is increasing, allowing for increasingly powerful spells and artifacts, such as CLONE MIND. This is concerning for obvious reasons. One would hope that the protagonists, whose goal it is to steer this autocatalyzing explosion of psychic energy through the needle of an eye to utopia, will become competent at magic."

https://www.lesswrong.com/posts/TTn6vTcZ3szBctvgb/simulators-seminar-sequence-2-semiotic-physics

"The term “semiotic physics” here refers to the study of the fundamental forces and laws that govern the behavior of signs and symbols. Similar to how the study of physics helps us understand and make use of the laws that govern the physical universe, semiotic physics studies the fundamental forces that govern the symbolic universe of GPT, a universe that reflects and intersects with the universe of our own cognition. We transfer concepts from dynamical systems theory, such as attractors and basins of attraction, to the semiotic universe and spell out examples and implications of the proposed perspective."

On semantics on natural language, Footnote 12:

"My (Jan's) take is that the central confusion arises because people are confused about neuroscience. The sentence "The current king of France is bald." does not refer to a king of France in the physical universe; it refers to a certain pattern of neural activations in someone's cortex. That pattern is a part of the physical universe (and thus fits into the framework of Russell et al), but it's not "simple" in the way that the early philosophers of language would have liked it to be."

(But a more usual ad hoc "vector semantics" is used instead by the authors.)

https://generative.ink/prophecies/

Footnote 23:

"Both dramatic tension and tragedy are powerful forces in the semiotic universe, and they can work against our attempts to control the behavior of the language model. For example, if we introduce a prompt that describes a group of brilliant and determined alignment researchers, we might want the language model to generate a continuation that includes a working solution to the alignment problem. However, the principles of dramatic tension and tragedy might guide the language model towards generating a continuation that includes an overlooked flaw in the proposed solution which leads to the instantiation of a misaligned superintelligence.

Thus, we need to be aware of the various forces and constraints that govern the semiotic universe, and use them to our advantage when we are trying to control the behavior of the language model. A deep understanding of how these stylistic devices are commonly used in human-generated text and how they can be influenced by various forms of training will be necessary to control and leverage the laws of semiotic physics."

Further things related to Janus ("repligate"):

https://liberaugmen.com/

via https://twitter.com/jd_pressman/status/1612106688122990592

"Basically everything in Liber Augmen is between 1024 and 2048 characters and it's pretty much nonstop expression of complex ideas. Does it suffer for that sometimes? Yeah, but not as much as you'd think."

and https://twitter.com/repligate/status/1612167786930864128

"Liber Augmen is a masterpiece of compressed insight and the only good work of this genre I've seen since the Sequences. Read it, you fools."

https://www.lesswrong.com/posts/cTdAXCb7WK6eR4fxf/what-specific-thing-would-you-do-with-ai-alignment-research

https://www.lesswrong.com/posts/a2io2mcxTWS4mxodF/results-from-a-survey-on-tool-use-and-workflows-in-alignment

Flat | Top-Level Comments Only

New Year resolution

no subject

no subject

no subject

no subject

no subject

no subject

no subject

no subject

no subject

no subject

no subject

no subject

no subject

no subject

no subject

no subject

no subject

no subject