displayName: AI Inference Cost Review
workflowKind: governance
triggerType: scheduled
typicalCadence: bi-weekly
complexity: cross-team
description: >-
  Reviews AI and LLM inference costs across the organization to optimize
  spend while maintaining quality: analyzing API cost breakdowns by model,
  feature, and team with token-level granularity; evaluating
  prompt-engineering efficiency by measuring token counts against
  output-quality metrics; reviewing caching-layer effectiveness, including
  semantic cache hit rates and cost avoidance; assessing model-selection
  appropriateness by comparing quality-to-cost ratios across model tiers
  for each use case; identifying opportunities to shift workloads from
  expensive frontier models to fine-tuned smaller models; tracking cost
  trends against usage growth to detect non-linear cost scaling; reviewing
  batch versus real-time inference allocation for latency-tolerant
  workloads; and benchmarking per-request costs against industry norms.
  Produces an AI cost dashboard, an optimization recommendation report,
  and a model-tier allocation review. Excludes model training and
  fine-tuning.