LibraryAgent overview
Reference · livelib-agent:gpu-programming--ml-inference-optimizer
ml-inference-optimizer overview
Agent specializing in GPU-accelerated ML model optimization for production inference. Expert in TensorRT engine building, quantization strategies (PTQ, QAT), kernel fusion patterns, dynamic batching design, ONNX model optimization, inference serving patterns, and latency/throughput tradeoffs.
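Of the techniques listed above, dynamic batching design is the one most easily illustrated without GPU tooling. The sketch below is a minimal, illustrative micro-batcher in plain Python (not the Triton/TensorRT API): requests queue up until either a size cap is hit or the oldest request has waited too long, then the whole group is released for a single forward pass. The class and parameter names (`DynamicBatcher`, `max_batch_size`, `max_wait_ms`) are hypothetical, chosen for this example.

```python
import time
from collections import deque


class DynamicBatcher:
    """Toy dynamic-batching queue (illustrative sketch, not a real serving API).

    A batch is released when either condition holds:
      * the queue holds at least max_batch_size requests, or
      * the oldest queued request has waited max_wait_ms milliseconds.
    This is the basic latency/throughput tradeoff: a larger batch size
    raises GPU utilization, a shorter wait bounds tail latency.
    """

    def __init__(self, max_batch_size=8, max_wait_ms=5.0):
        self.max_batch_size = max_batch_size
        self.max_wait_ms = max_wait_ms
        self._queue = deque()  # (enqueue_time, request) pairs

    def submit(self, request):
        """Enqueue one inference request."""
        self._queue.append((time.monotonic(), request))

    def next_batch(self):
        """Return a list of requests if a release condition holds, else None."""
        if not self._queue:
            return None
        oldest_ts, _ = self._queue[0]
        waited_ms = (time.monotonic() - oldest_ts) * 1000.0
        if len(self._queue) >= self.max_batch_size or waited_ms >= self.max_wait_ms:
            n = min(len(self._queue), self.max_batch_size)
            return [self._queue.popleft()[1] for _ in range(n)]
        return None
```

In a real serving stack this logic lives inside the scheduler (e.g. Triton's dynamic batcher), and the released batch would be padded or bucketed to a shape the compiled engine accepts.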
Attributes
displayName
ml-inference-optimizer
description
Agent specializing in GPU-accelerated ML model optimization for production inference. Expert in TensorRT engine building, quantization strategies (PTQ, QAT), kernel fusion patterns, dynamic batching design, ONNX model optimization, inference serving patterns, and latency/throughput tradeoffs.
libraryPath
library/specializations/gpu-programming/agents/ml-inference-optimizer/AGENT.md
specialization
gpu-programming
Outgoing edges
lib_applies_to_domain (1)
- domain:scientific-computing · Domain · Scientific Computing
lib_belongs_to_specialization (1)
- specialization:gpu-programming · Specialization
lib_involves_role (2)
- role:computational-scientist · Role · Computational Scientist
- role:ml-engineer · Role · Machine Learning Engineer
lib_requires_skill_area (2)
- skill-area:cuda-kernels · SkillArea · CUDA Kernel Programming
- skill-area:compute-shaders · SkillArea · Compute Shaders
Incoming edges
uses_agent (2)
- lib-process:gpu-programming--custom-cuda-operator-development · LibraryProcess · specializations/gpu-programming/custom-cuda-operator-development
- lib-process:gpu-programming--ml-inference-optimization · LibraryProcess · specializations/gpu-programming/ml-inference-optimization