Transformer Systems · Skill Path

LLM Inference Optimization

A focused path for understanding KV cache mechanics, shared-KV attention variants, efficient attention, and KV cache serving optimization.

Transformer Systems · 4 steps · Advanced to expert · 8.5 hr · 0 review decks
  1. KV Cache Mechanics · Lesson · Inference And Serving Mechanics · 2 hr · Reading
  2. MQA And GQA · Lesson · Inference And Serving Mechanics · 1.5 hr · Reading
  3. FlashAttention V1 · Lesson · Efficient Attention And KV Cache · 3 hr · Reading
  4. KV Cache Optimization · Lesson · Efficient Attention And KV Cache · 2 hr · Reading