New Year resolution
Jan. 8th, 2023 12:39 am
To read my https://twitter.com/home more regularly (that's absolutely the best source of info at the moment).
A small fraction of today's catch:
New work by Janus
A new involved take on AI safety/alignment
(What's the right way to organize all that information?)
Links are in the comments (I think the new work by Janus is more important even for alignment, and is just overall the more important of the two topics of this post)...
no subject
Date: 2023-01-08 08:30 am (UTC)
"Both dramatic tension and tragedy are powerful forces in the semiotic universe, and they can work against our attempts to control the behavior of the language model. For example, if we introduce a prompt that describes a group of brilliant and determined alignment researchers, we might want the language model to generate a continuation that includes a working solution to the alignment problem. However, the principles of dramatic tension and tragedy might guide the language model towards generating a continuation that includes an overlooked flaw in the proposed solution which leads to the instantiation of a misaligned superintelligence.
Thus, we need to be aware of the various forces and constraints that govern the semiotic universe, and use them to our advantage when we are trying to control the behavior of the language model. A deep understanding of how these stylistic devices are commonly used in human-generated text and how they can be influenced by various forms of training will be necessary to control and leverage the laws of semiotic physics."