Date: 2023-10-30 03:48 pm (UTC)
From: [personal profile] dmm
Lots of copying though; it's a frequent motif

And another frequent motif is that these things are good at fixing the weirdness of tokenizers

2:01:00: for more complicated models, it is useful to think of attention heads as doing a lot of skip trigrams, with other things done on top of that
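
To make "skip trigram" concrete: it is a pattern of the form [source token] ... [destination token] -> [next token], where the source can be any earlier position. A minimal sketch, assuming a whitespace-split toy sentence; it only enumerates the patterns present in the text and does not model how an attention head computes them. The names `skip_trigrams` and `copying` are hypothetical, chosen for illustration.

```python
from collections import Counter


def skip_trigrams(tokens):
    """Count (source, destination, next) triples in a token list.

    The source may be any position before the destination; the next
    token is the one immediately after the destination.
    """
    counts = Counter()
    for i in range(1, len(tokens) - 1):      # destination position
        for j in range(i):                   # any earlier source position
            counts[(tokens[j], tokens[i], tokens[i + 1])] += 1
    return counts


def copying(counts):
    """Keep only triples where the predicted next token equals the source."""
    return {t for t in counts if t[0] == t[2]}


tokens = "keep in mind keep in".split()      # toy example, an assumption
counts = skip_trigrams(tokens)
copies = copying(counts)
```

The `copying` filter picks out the triples where the predicted token repeats the source token, which is the copying motif mentioned above.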
Dataflow matrix machines (by Anhinga anhinga)