Exploring the Transformer Series (34) --- Quantization Fundamentals
Quantization fundamentals for Transformer LLMs: compression background, numerical representations, PTQ/QAT workflows, calibration, granularity, and acceleration.