Exploring the Transformer Series (35) --- Fundamentals of Large Model Quantization
Large model quantization fundamentals: outliers, superweights, massive activations, PTQ, QAT, and common quantization strategies.
Medusa: multi-decoding heads, tree attention, typical acceptance, sparse tree construction, training strategies, and decoding flow.
Length extrapolation in Transformers and LLMs: position encoding methods, RoPE extrapolation, PI, NTK-aware interpolation, YaRN, and Giraffe.
RoPE positional encoding, derivation, properties, extrapolation, and implementation.
Transformer overall architecture: workflow, attention modules, construction from Harvard code, and theoretical perspectives.
OpenHands Agent internals: state management, agent types, state lifecycle, and LLM adapter design.
OpenHands CodeActAgent internals: design principles, tools, context engineering, and workflow.
OpenHands function-calling internals: tool design, action mapping, and robust parsing of LLM tool calls.