~32:00 residual stream is really messy, so interpretation via meaningful paths through the model is the only viable way (thankfully, it works OK).
(perhaps people who conjecture about "holographic storage" within residual stream are right, who knows; one can consider improving it in various ways: a) towards detangling, b) alternatively, towards better holography)
no subject
(perhaps people who conjecture about "holographic storage" within residual stream are right, who knows; one can consider improving it in various ways: a) towards detangling, b) alternatively, towards better holography)