Let's understand Large Language Models better
This is a good starting point:
"A Mathematical Framework for Transformer Circuits", Dec 2021
https://transformer-circuits.pub/2021/framework/index.html
But one should really study the follow-up paper, "In-context Learning and Induction Heads" (2022): https://transformer-circuits.pub/2022/in-context-learning-and-induction-heads/index.html