Exploring the Transformer Series (35): Fundamentals of Large Model Quantization
Large model quantization fundamentals: outliers, super weights, massive activations, post-training quantization (PTQ), quantization-aware training (QAT), and common quantization strategies.
Large model quantization schemes across 8-bit, 4-bit, and low-bit settings, including LLM.int8(), ZeroQuant, SmoothQuant, GPTQ, AWQ, LLM-QAT, QLoRA, FlatQuant, SqueezeLLM, SpQR, BitNet, and OneBit.
Quantization fundamentals for Transformer LLMs: compression background, numerical representations, PTQ/QAT workflows, calibration, granularity, and acceleration.