SkillArea overview
Reference · skill-area:model-compression
Model Compression overview
Reducing model size for deployment — knowledge distillation, pruning, quantization (INT8/INT4/GPTQ/AWQ), and low-rank adaptation for efficient on-device or edge inference.
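Of the techniques listed, quantization is the most compact to illustrate. The following is a minimal sketch of symmetric per-tensor INT8 post-training quantization in plain Python; the function names and the example weights are illustrative, not from any particular library such as GPTQ or AWQ.

```python
def quantize_int8(weights):
    """Map float weights to int8 range [-127, 127] using one shared scale.

    Sketch only: real quantizers work per-channel and handle the
    all-zero case; here we assume at least one nonzero weight.
    """
    scale = max(abs(w) for w in weights) / 127.0
    q = [max(-127, min(127, round(w / scale))) for w in weights]
    return q, scale

def dequantize_int8(q, scale):
    """Recover approximate float weights from the int8 values."""
    return [v * scale for v in q]

# Hypothetical weight tensor, flattened to a list for simplicity.
weights = [0.31, -1.27, 0.05, 0.9]
q, scale = quantize_int8(weights)
approx = dequantize_int8(q, scale)
```

Each weight is stored in 8 bits instead of 32, a 4x size reduction, at the cost of a reconstruction error bounded by half the scale per element; INT4 schemes push this further with finer-grained (per-group) scales.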
Attributes
displayName
Model Compression
description
Reducing model size for deployment — knowledge distillation, pruning, quantization (INT8/INT4/GPTQ/AWQ), and low-rank adaptation for efficient on-device or edge inference.
expertiseLevels
- intermediate
- expert
Outgoing edges
applies_to (2)
- domain:ml-ai · Domain · ML/AI
- specialization:ml-inference-serving · Specialization · ML Inference Serving
prerequisite_for_learning (1)
- skill-area:inference-optimization · SkillArea · Inference Optimization
Incoming edges
prerequisite_for_learning (1)
- skill-area:inference-optimization · SkillArea · Inference Optimization