(in Chinese) Transformer/self-attention explained
(in Chinese) Decoder, self-attention (GPT-2): part 1, part 2
(in Chinese) XLNet (YouTube): part 1, part 2, slides
Dissecting BERT encoder
Dissecting decoder
BERT (Chris McCormick)
BERT (Curiousily)