Category
Transformers
12 articles
BioBERT
Biomedical NLP, Healthcare AI, Language Models
Cross-attention
Attention Mechanisms
Diffusion Transformer (DiT)
Diffusion Models, Generative AI, Image Generation
DistilBERT
Deep Learning, Models, Natural Language Processing
Grouped-Query Attention
Deep Learning, Machine Learning
KV Cache
Deep Learning, Inference, Machine Learning
Multi-Head Self-Attention
Attention Mechanisms, Deep Learning, Machine Learning
RMSNorm
2019 in artificial intelligence, Deep learning architectures, Normalization techniques
Rotary Position Embedding
Deep learning, Large Language Models, Position encoding
Rotary position embedding (RoPE)
Deep Learning, Large Language Models, Position Encoding
Self-attention
Attention Mechanisms, Deep Learning, Machine Learning
Switch Transformer
Google, Language Models, Mixture of Experts