Transformer Systems · Skill Path
LLM Inference Optimization
A focused path for understanding KV cache mechanics, shared-KV attention variants, efficient attention, and KV cache serving optimization.
Transformer Systems · 4 steps · Advanced to expert · 8.5 hr · 0 review decks
- 01 KV Cache Mechanics · Lesson · Inference And Serving Mechanics · 2 hr · Reading
- 02 MQA And GQA · Lesson · Inference And Serving Mechanics · 1.5 hr · Reading
- 03 FlashAttention V1 · Lesson · Efficient Attention And KV Cache · 3 hr · Reading
- 04 KV Cache Optimization · Lesson · Efficient Attention And KV Cache · 2 hr · Reading