Catalog

Official lessons for deep ML systems study

Lesson pages turn the source posts into structured study units with metadata, prerequisites, and review hooks. Use them directly, or follow them through a roadmap or course.

Official Collection

AI Agent Frameworks

14 lessons

This lesson is part of the local ML Learning Lab collection.

Lesson Beginner on Agent Architecture Foundations

OpenHands Core Concepts

Core concepts, architecture, and engineering challenges in OpenHands.

AI Agents
35 min
Reading

#ai-agent #openhands #architecture #agent-engineering #open-devin

Lesson Intermediate on Agent Architecture Foundations

OpenHands ReAct To CodeAct

CodeAct paper fundamentals, ReAct vs CodeAct framing, and design concepts behind OpenHands.

AI Agents
50 min
Reading

#ai-agent #openhands #codeact #react #agent-frameworks

Lesson Intermediate on Agent Architecture Foundations

OpenHands Startup Lifecycle

Startup lifecycle and run_controller initialization flow in OpenHands.

AI Agents
1 hr
Reading

#ai-agent #openhands #startup #eventstream #agent-controller

Lesson Intermediate on Agent Architecture Foundations

OpenHands Service Boundary

OpenHands server architecture, Socket.IO flow, and session orchestration internals.

AI Agents
45 min
Reading

#ai-agent #openhands #services #websocket #fastapi

Lesson Intermediate on Conversation, Events, And Tool Calls

OpenHands Conversation Lifecycle

OpenHands interaction and session internals: ConversationManager, WebSession, AgentSession, and oh_user_action flow.

AI Agents
1.2 hr
Reading

#ai-agent #openhands #session #conversation #eventstream

Lesson Intermediate on Conversation, Events, And Tool Calls

OpenHands EventStream

OpenHands EventStream internals: subscription, event distribution, Action/Observation flow, and AgentThinkAction.

AI Agents
55 min
Review deck

#ai-agent #openhands #eventstream #action #observation

Lesson Advanced on Conversation, Events, And Tool Calls

OpenHands Function Calling

OpenHands function-calling internals: tool design, action mapping, and robust parsing of LLM tool calls.

AI Agents
1.5 hr
Review deck

#ai-agent #openhands #function-call #tools #llm

Lesson Advanced on Agent Loop And Controller

OpenHands Agent State And LLM Adapter

OpenHands Agent internals: state management, agent types, state lifecycle, and LLM adapter design.

AI Agents
1.3 hr
Reading

#ai-agent #openhands #agent #state #llm

Lesson Advanced on Agent Loop And Controller

OpenHands CodeActAgent

OpenHands CodeActAgent internals: design principles, tools, context engineering, and workflow.

AI Agents
1.5 hr
Reading

#ai-agent #openhands #codeact #agent #llm

Lesson Advanced on Agent Loop And Controller

OpenHands AgentController

OpenHands AgentController internals: lifecycle control, routing, callbacks, observability, and robust execution control.

AI Agents
1.5 hr
Reading

#ai-agent #openhands #agent-controller #state-machine #event-stream

Lesson Advanced on Runtime, Memory, And Microagents

OpenHands Runtime Sandbox

OpenHands Runtime internals: sandbox execution, runtime types, event flow, and core code paths.

AI Agents
1.1 hr
Reading

#ai-agent #openhands #runtime #sandbox #docker

Lesson Advanced on Runtime, Memory, And Microagents

OpenHands Runtime Components

OpenHands runtime deep dive: plugin system, execution system, and browser environment internals.

AI Agents
1.4 hr
Reading

#ai-agent #openhands #runtime #plugins #browserenv

Lesson Advanced on Runtime, Memory, And Microagents

OpenHands Memory And Condensation

OpenHands memory internals: layered memory architecture, View/ConversationMemory/Condenser workflow, and implementation details.

AI Agents
1.3 hr
Reading

#ai-agent #openhands #memory #context #rag

Lesson Advanced on Runtime, Memory, And Microagents

OpenHands Microagents And Delegation

OpenHands microagents deep dive: architecture, delegation workflow, trigger mechanisms, and implementation details.

AI Agents
1.4 hr
Reading

#ai-agent #openhands #microagents #multi-agent #delegation

Official Collection

Transformer Systems

36 lessons

Lesson Intermediate on Foundations And Data Flow

Exploring the Transformer Series (1): Attention Mechanism

A deep dive into attention mechanism foundations: seq2seq background, CNN/RNN limitations, attention principles, and historical evolution to Transformer.

Transformer Systems
1.5 hr
Reading

#transformer #attention #seq2seq #rnn #cnn

Lesson Intermediate on Foundations And Data Flow

Exploring the Transformer Series (6) --- Token

Tokenization fundamentals in Transformers: vocabulary construction, tokenizers, BPE/WordPiece/Unigram, and newer token-free directions.

Transformer Systems
1.7 hr
Reading

#transformer #token #tokenizer #vocabulary #bpe #wordpiece

Lesson Intermediate on Foundations And Data Flow

Exploring the Transformer Series (7) --- Embedding

Transformer embedding fundamentals: from vectorization to trainable embeddings, implementation details, and modern text-embedding model evolution.

Transformer Systems
1.5 hr
Reading

#transformer #embedding #vectorization #word2vec #bert #text-embedding

Lesson Intermediate on Foundations And Data Flow

Exploring the Transformer Series (3) --- Data Processing

Transformer data processing pipeline: dataset choices, vocabulary/tokenizers, batch construction, masks, and training data loading in Harvard code.

Transformer Systems
1.3 hr
Reading

#transformer #data-processing #dataset #tokenization #padding

Lesson Intermediate on Foundations And Data Flow

Exploring the Transformer Series (2) --- Overall Architecture

Transformer overall architecture: workflow, attention modules, construction from Harvard code, and theoretical perspectives.

Transformer Systems
2 hr
Reading

#transformer #architecture #attention #llm #encoder-decoder

Lesson Advanced on Attention And Positional Information

Exploring the Transformer Series (8) --- Position Encoding

Transformer positional encoding: why it is needed, design evolution, sinusoidal encoding analysis, and NoPE debates.

Transformer Systems
2 hr
Reading

#transformer #position-encoding #rope #nope #attention #length-extrapolation

Lesson Advanced on Attention And Positional Information

Exploring the Transformer Series (17) --- RoPE

RoPE positional encoding, derivation, properties, extrapolation, and implementation.

Transformer Systems
2 hr
Reading

#transformer #rope #position-encoding #rotary-embedding #llm #attention

Lesson Advanced on Attention And Positional Information

Exploring the Transformer Series (9) --- Position Encoding Classification

APE vs RPE in Transformers: differences, representative methods, and relative-position design patterns.

Transformer Systems
1.5 hr
Reading

#transformer #position-encoding #ape #rpe #xlnet #t5

Lesson Advanced on Attention And Positional Information

Exploring the Transformer Series (10) --- Self-Attention

Self-attention in Transformers: principles, implementation details, scaling/softmax analysis, and modern optimization directions.

Transformer Systems
2.5 hr
Reading

#transformer #self-attention #qkv #softmax #llama3 #linear-attention

Lesson Advanced on Attention And Positional Information

Exploring the Transformer Series (11) --- Mask

Transformer masks: padding mask, sequence/causal mask, implementation details, data flow, and advanced sample-packing strategies.

Transformer Systems
1.7 hr
Review deck

#transformer #mask #padding-mask #causal-mask #self-attention #sample-packing

Lesson Intermediate on Attention And Positional Information

Exploring the Transformer Series (12) --- Multi-head Self-Attention

Multi-head self-attention in Transformers: motivation, principles, implementation details, and modern head-composition improvements.

Transformer Systems
1.5 hr
Reading

#transformer #multi-head-self-attention #attention #qkv #mha #optimization

Lesson Intermediate on Transformer Blocks And Training

Exploring the Transformer Series (4) --- Encoder & Decoder

Transformer encoder and decoder internals: architecture, data flow, cross-attention, and decoder-only design tradeoffs.

Transformer Systems
1.5 hr
Reading

#transformer #encoder #decoder #cross-attention #decoder-only

Lesson Advanced on Transformer Blocks And Training

Exploring the Transformer Series (5) --- Training & Inference

Transformer training and inference in practice: teacher forcing, masks, dropout, label smoothing, learning rate scheduling, and parallelism.

Transformer Systems
2 hr
Reading

#transformer #training #reasoning #teacher-forcing #dropout #label-smoothing

Lesson Advanced on Transformer Blocks And Training

Exploring the Transformer Series (13) --- FFN

Feed-Forward Networks (FFN) in Transformers: structure, implementation, function, knowledge utilization, and optimization evolution.

Transformer Systems
2 hr
Reading

#transformer #ffn #feed-forward-network #mlp #activation #knowledge

Lesson Advanced on Transformer Blocks And Training

Exploring the Transformer Series (14) --- Residual Networks and Normalization

Residual connections and normalization in Transformers: ResNet intuition, BatchNorm vs LayerNorm, Pre-Norm vs Post-Norm, implementations, and recent variants.

Transformer Systems
1.8 hr
Reading

#transformer #residual-connection #normalization #layernorm #batchnorm #resnet

Lesson Intermediate on Transformer Blocks And Training

Exploring the Transformer Series (15) --- Sampling and Output

Transformer generator heads, softmax, decoding strategies, sampling parameters, logits analysis, and weight sharing.

Transformer Systems
1.5 hr
Reading

#transformer #sampling #generator #softmax #beam-search #top-k

Lesson Advanced on Transformer Blocks And Training

Exploring the Transformer Series (16) --- Resource Consumption

Transformer parameter counts, memory usage, activations, FLOPs, KV cache, and optimization directions.

Transformer Systems
2 hr
Reading

#transformer #parameters #memory #activations #flops #kv-cache

Lesson Advanced on Inference And Serving Mechanics

Exploring the Transformer Series (20) --- KV Cache

Autoregressive inference redundancy, KV cache, prefill vs decode, implementation, and resource usage.

Transformer Systems
2 hr
Reading

#transformer #kv-cache #inference #prefill #decode #memory

Lesson Advanced on Inference And Serving Mechanics

Exploring the Transformer Series (27) --- MQA & GQA

MQA and GQA: MHA review, shared KV heads, grouped-query attention, implementation details, memory and speed tradeoffs, conversion, and optimization variants.

Transformer Systems
1.5 hr
Reading

#transformer #mqa #gqa #attention #kv-cache #mha

Lesson Advanced on Inference And Serving Mechanics

Exploring the Transformer Series (23) --- Length Extrapolation

Length extrapolation in Transformers and LLMs: position encoding methods, RoPE extrapolation, PI, NTK-aware interpolation, YaRN, and Giraffe.

Transformer Systems
1.5 hr
Reading

#transformer #length-extrapolation #position-encoding #rope #llm #context-window

Lesson Expert on Efficient Attention And KV Cache

Exploring the Transformer Series (18) --- FlashAttention

FlashAttention, online softmax, tiling, IO-awareness, and memory-efficient exact attention.

Transformer Systems
3 hr
Reading

#transformer #flashattention #attention #softmax #tiling #memory

Lesson Expert on Efficient Attention And KV Cache

Exploring the Transformer Series (19) --- FlashAttention V2 and Its Upgrades

FlashAttention V2, Flash-Decoding, Flash-Mask, and FlashAttention-3.

Transformer Systems
2 hr
Reading

#transformer #flashattention #flashattention-v2 #flash-decoding #flash-mask #flashattention-3

Lesson Advanced on Efficient Attention And KV Cache

Exploring the Transformer Series (24) --- KV Cache Optimization

KV Cache optimization: metrics, memory crisis, formula-based compression, stage-aware optimization, memory management, and scheduling.

Transformer Systems
2 hr
Reading

#transformer #kv-cache #optimization #inference #memory #prefill

Lesson Expert on Efficient Attention And KV Cache

Exploring the Transformer Series (25) --- KV Cache Optimization for Handling Long Text Sequences

KV cache optimization for long text sequences: sparsification, token reuse, prefix reuse, retrieval-based schemes, and long-context KV management.

Transformer Systems
3 hr
Reading

#transformer #kv-cache #optimization #long-context #inference #sparsification

Lesson Expert on Efficient Attention And KV Cache

Exploring the Transformer Series (26) --- KV Cache Optimization: PD Separation or Merging

KV cache optimization through prefill/decode (PD) separation or merging: static batching, ORCA, Sarathi, DistServe, SplitWise, MemServe, TetriInfer, and Mooncake.

Transformer Systems
2 hr
Reading

#transformer #kv-cache #prefill #decode #parallelism #inference

Lesson Expert on MoE, Adaptation, And Compression

Exploring the Transformer Series (21) --- MoE

Mixture-of-Experts (MoE): conditional computation, routing, experts, load balancing, implementation, and parallel inference.

Transformer Systems
3 hr
Reading

#transformer #moe #routing #sparsity #parallelism #mixtral

Lesson Advanced on MoE, Adaptation, And Compression

Exploring the Transformer Series (22) --- LoRA

LoRA: PEFT, low-rank adaptation, rank, initialization, implementation, optimization, and continual learning.

Transformer Systems
2.5 hr
Reading

#transformer #lora #peft #low-rank #fine-tuning #dora

Lesson Intermediate on MoE, Adaptation, And Compression

Exploring the Transformer Series (34) --- Quantization Fundamentals

Quantization fundamentals for Transformer LLMs: compression background, numerical representations, PTQ/QAT workflows, calibration, granularity, and acceleration.

Transformer Systems
1.5 hr
Reading

#transformer #quantization #llm-compression #ptq #qat #model-quantization

Lesson Advanced on MoE, Adaptation, And Compression

Exploring the Transformer Series (35) --- Fundamentals of Large Model Quantization

Large model quantization fundamentals: outliers, superweights, massive activations, PTQ, QAT, and common quantization strategies.

Transformer Systems
2 hr
Reading

#transformer #quantization #llm #outlier #ptq #qat

Lesson Expert on MoE, Adaptation, And Compression

Exploring the Transformer Series (36) --- Large Model Quantization Scheme

Large model quantization schemes across 8-bit, 4-bit, and low-bit settings, including LLM.int8(), ZeroQuant, SmoothQuant, GPTQ, AWQ, LLM-QAT, QLoRA, FlatQuant, SqueezeLLM, SpQR, BitNet, and OneBit.

Transformer Systems
3 hr
Reading

#transformer #quantization #llm-compression #ptq #qat #low-bit-quantization

Lesson Expert on Advanced Decoding And DeepSeek Systems

Exploring the Transformer Series (28) --- DeepSeek MLA

DeepSeek MLA: low-rank KV compression, weight absorption, decoupled RoPE, resource tradeoffs, implementation details, and conversions from GQA and MHA.

Transformer Systems
2.5 hr
Reading

#transformer #mla #deepseek #attention #kv-cache #rope

Lesson Expert on Advanced Decoding And DeepSeek Systems

Exploring the Transformer Series (29) --- DeepSeek MoE

DeepSeek MoE: load balancing, fine-grained and shared experts, DeepSeek V1/V2/V3 routing, MoD, LoRA hybrids, and efficient fine-tuning.

Transformer Systems
2.5 hr
Reading

#transformer #moe #deepseek #routing #load-balancing #experts

Lesson Expert on Advanced Decoding And DeepSeek Systems

Exploring the Transformer Series (30) --- Speculative Decoding

Speculative decoding, speculative sampling, blockwise parallel decoding, token tree verification, and Hugging Face implementation details.

Transformer Systems
2 hr
Reading

#transformer #speculative-decoding #speculative-sampling #parallel-decoding #inference #sampling

Lesson Expert on Advanced Decoding And DeepSeek Systems

Exploring the Transformer Series (31) --- Medusa

Medusa: multi-decoding heads, tree attention, typical acceptance, sparse tree construction, training strategies, and decoding flow.

Transformer Systems
2 hr
Reading

#transformer #medusa #speculative-decoding #inference #tree-attention #decode

Lesson Advanced on Advanced Decoding And DeepSeek Systems

Exploring the Transformer Series (32) --- Lookahead Decoding

Lookahead decoding: Jacobi decoding, n-gram pool, 2D window, parallel verification, and llama.cpp implementation details.

Transformer Systems
1.5 hr
Reading

#transformer #lookahead-decoding #jacobi-decoding #speculative-decoding #parallel-decoding #inference

Lesson Expert on Advanced Decoding And DeepSeek Systems

Exploring the Transformer Series (33) --- DeepSeek MTP

DeepSeek MTP: EAGLE, HASS, classical multi-token prediction, DeepSeek’s causal-chain design, formulas, and the vLLM implementation.

Transformer Systems
2 hr
Reading

#transformer #deepseek #mtp #multi-token-prediction #eagle #hass
