versionRange
>=2024-01-01
displayName
Fireworks AI
vendor
Fireworks AI
authMethods
authMethodNotes
Standard `Authorization: Bearer <api-key>`. OpenAI-compatible surface.
Account-scoped keys; per-deployment keys also supported.
endpoints
base
https://api.fireworks.ai/inference/v1
chat_completions
https://api.fireworks.ai/inference/v1/chat/completions
completions
https://api.fireworks.ai/inference/v1/completions
models
https://api.fireworks.ai/inference/v1/models
pricing
See https://fireworks.ai/pricing — pricing varies per open-weights
model and tier.
pricingTiers
name
serverless
rateLimit
Per-account RPM caps; fair-use throttling
priceMultiplier
1
description
Pay-per-token serverless inference (default).
name
on-demand-deployment
rateLimit
GPU-hour billed; throughput limited only by deployed GPU count
priceMultiplier
1
description
Dedicated on-demand GPU deployments — billed per GPU-hour.
name
enterprise-reserved
rateLimit
By contract
priceMultiplier
1
description
Reserved enterprise capacity.
rateLimitSignalingProtocol
OpenAI-compatible. 429 with `retry-after`; standard `x-ratelimit-*`
headers when present. Errors envelope mirrors OpenAI.
dataResidencyOptions
vendorFeatures
slaTier
fireworks-no-public-sla
regions