II. SkillArea overview
Reference · live · skill-area:model-optimisation
Model Optimisation overview
Optimizing ML model performance and efficiency — quantization, pruning, distillation, compilation (ONNX, TensorRT), and hardware-specific tuning to reduce inference latency and cost without sacrificing quality.
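As a rough illustration of quantization, the first technique named above, the sketch below maps floating-point weights to 8-bit integer codes using a single scale factor (symmetric per-tensor quantization). The function names and sample weights are hypothetical and chosen only for illustration. Toolchains such as TensorRT and ONNX Runtime use their own calibration procedures and kernels, which this sketch does not attempt to reproduce.

```python
def quantize_int8(values):
    """Symmetric per-tensor quantization: return (integer codes, scale)."""
    max_abs = max(abs(v) for v in values)
    # One scale for the whole tensor; fall back to 1.0 for an all-zero input.
    scale = max_abs / 127.0 if max_abs > 0 else 1.0
    # Round to the nearest integer code and clamp to the symmetric int8 range.
    codes = [max(-127, min(127, round(v / scale))) for v in values]
    return codes, scale


def dequantize(codes, scale):
    """Map integer codes back to approximate floating-point values."""
    return [c * scale for c in codes]


# Hypothetical weights, for illustration only.
weights = [0.5, -1.27, 0.0, 1.0]
codes, scale = quantize_int8(weights)
restored = dequantize(codes, scale)

# Rounding to the nearest code bounds the per-weight error by about scale / 2.
max_error = max(abs(w - r) for w, r in zip(weights, restored))
print(codes, scale, max_error)
```

The integer codes take a quarter of the storage of 32-bit floats, at the price of a per-weight rounding error of at most about half the scale. Whether that error is acceptable for a given model is an empirical question that has to be checked against task metrics.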
Attributes
displayName
Model Optimisation
description
Optimizing ML model performance and efficiency — quantization, pruning,
distillation, compilation (ONNX, TensorRT), and hardware-specific tuning
to reduce inference latency and cost without sacrificing quality.
domains
expertiseLevels
- intermediate
- expert
Outgoing edges
applies_to (1)
- domain:ml-ops · Domain · MLOps
Incoming edges
prerequisite_for_learning (2)
- skill-area:model-serving · SkillArea · Model Serving
- skill-area:machine-learning · SkillArea · Machine Learning
tool_used_by (2)
- tool:tensorrt · Tool · TensorRT
- tool:onnx-runtime · Tool · ONNX Runtime