Calvin's Notebook
Gradient Checkpointing ♻️
Memory-Bound Inference
Normalization-Free Transformers
NVIDIA GPU Hardware for Deep Learning
Deep Learning Engineering: Overview
Rotary Position Embeddings (RoPE) and Context Extension
Engineering Concerns for Deep Learning Training Loops
Weight Tying 🔗