II. LibraryAgent overview
Reference · livelib-agent:gpu-programming--gpu-performance-engineer
gpu-performance-engineer overview
Expert agent for GPU performance analysis and optimization. Specialist in Nsight profiling, roofline model analysis, occupancy optimization, memory bandwidth optimization, and architecture-specific tuning.
Attributes
displayName
gpu-performance-engineer
description
Expert agent for GPU performance analysis and optimization. Specialist in Nsight profiling, roofline model analysis, occupancy optimization, memory bandwidth optimization, and architecture-specific tuning.
libraryPath
library/specializations/gpu-programming/agents/gpu-performance-engineer/AGENT.md
specialization
gpu-programming
Outgoing edges
lib_applies_to_domain (1)
- domain:scientific-computing · Domain · Scientific Computing
lib_belongs_to_specialization (1)
- specialization:gpu-programming · Specialization
lib_involves_role (2)
- role:computational-scientist · Role · Computational Scientist
- role:ml-engineer · Role · Machine Learning Engineer
lib_requires_skill_area (2)
- skill-area:cuda-kernels · SkillArea · CUDA Kernel Programming
- skill-area:compute-shaders · SkillArea · Compute Shaders
Incoming edges
uses_agent (12)
- lib-process:gpu-programming--atomic-operations-synchronization · LibraryProcess · specializations/gpu-programming/atomic-operations-synchronization
- lib-process:gpu-programming--cuda-stream-concurrency · LibraryProcess · specializations/gpu-programming/cuda-stream-concurrency
- lib-process:gpu-programming--custom-cuda-operator-development · LibraryProcess · specializations/gpu-programming/custom-cuda-operator-development
- lib-process:gpu-programming--dynamic-parallelism-implementation · LibraryProcess · specializations/gpu-programming/dynamic-parallelism-implementation
- lib-process:gpu-programming--gpu-image-video-processing · LibraryProcess · specializations/gpu-programming/gpu-image-video-processing
- lib-process:gpu-programming--gpu-performance-regression-testing · LibraryProcess · specializations/gpu-programming/gpu-performance-regression-testing
- lib-process:gpu-programming--hip-porting-cross-platform · LibraryProcess · specializations/gpu-programming/hip-porting-cross-platform
- lib-process:gpu-programming--occupancy-optimization · LibraryProcess · specializations/gpu-programming/occupancy-optimization
- lib-process:gpu-programming--performance-profiling-analysis · LibraryProcess · specializations/gpu-programming/performance-profiling-analysis
- lib-process:gpu-programming--reduction-scan-implementation · LibraryProcess · specializations/gpu-programming/reduction-scan-implementation
- lib-process:gpu-programming--stencil-computation-optimization · LibraryProcess · specializations/gpu-programming/stencil-computation-optimization
- lib-process:gpu-programming--warp-efficiency-optimization · LibraryProcess · specializations/gpu-programming/warp-efficiency-optimization