Choose the plan that's right for you

Developer

Powerful speed and reliability to start your project

100 requests/min rate limit
Up to 100 deployed models
Custom PEFT add-ons
Pay per usage

Business

A plan that scales with your production usage

Everything from the Developer plan
Custom rate limits
Team collaboration features
API telemetry and metrics
Dedicated email support

Enterprise

Personalized configurations for serving at scale

Everything from the Business plan
Custom pricing
Unlimited rate limits
Unlimited deployed models
Custom base models
Dedicated and self-hosted deployments
Specialized enterprise support
Text Models

Per-token pricing applies only to non-enterprise deployments. Contact us for dedicated deployment pricing options.

Input tokens are counted from the prompt you supply in the request; output tokens are counted from the completion the model generates.

Base model parameter count | $/1M input tokens | $/1M output tokens
Up to 16B                  | $0.20             | $0.80
16.1B - 80B                | $0.70             | $2.80
Mixtral 8x7B               | $0.40             | $1.60
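
As a rough sketch of how the per-token rates combine, the cost of a request is the sum of the input and output token charges. The token counts below are hypothetical values chosen for illustration, and the example assumes the up-to-16B tier:

```python
# Hypothetical example: estimate the cost of one request to a model in the
# "up to 16B" tier ($0.20 per 1M input tokens, $0.80 per 1M output tokens).
INPUT_PRICE_PER_M_TOKENS = 0.20   # $ per 1M input tokens
OUTPUT_PRICE_PER_M_TOKENS = 0.80  # $ per 1M output tokens

input_tokens = 1_500   # tokens in the prompt (assumed)
output_tokens = 400    # tokens in the generated completion (assumed)

cost = (input_tokens / 1_000_000) * INPUT_PRICE_PER_M_TOKENS \
     + (output_tokens / 1_000_000) * OUTPUT_PRICE_PER_M_TOKENS
print(f"Estimated request cost: ${cost:.6f}")  # Estimated request cost: $0.000620
```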
Image Models

For image generation models like SDXL, we charge based on the number of inference steps (denoising iterations).

Model              | $/step
SDXL               | $0.0002
SDXL w/ ControlNet | $0.0003
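
A minimal sketch of per-step billing follows; the step count is an assumed value for illustration, not a recommended setting:

```python
# Hypothetical example: per-image cost when billing by inference step.
SDXL_PER_STEP = 0.0002             # $ per step, base SDXL
SDXL_CONTROLNET_PER_STEP = 0.0003  # $ per step, SDXL with ControlNet

steps = 30  # assumed number of denoising steps for this generation
print(f"SDXL: ${steps * SDXL_PER_STEP:.4f}")                           # SDXL: $0.0060
print(f"SDXL w/ ControlNet: ${steps * SDXL_CONTROLNET_PER_STEP:.4f}")  # SDXL w/ ControlNet: $0.0090
```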
Multi-Modal

For multi-modal models like LLaVA, each image is billed as 576 prompt tokens.
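
To make the image-token accounting concrete, here is a minimal sketch; the text token count and the choice of the up-to-16B text tier are assumptions for illustration:

```python
# Hypothetical example: billing a multi-modal (LLaVA-style) request with one image.
IMAGE_PROMPT_TOKENS = 576        # each image is billed as 576 prompt tokens
INPUT_PRICE_PER_M_TOKENS = 0.20  # $ per 1M input tokens (up-to-16B tier, assumed)

text_tokens = 200  # tokens in the text part of the prompt (assumed)
num_images = 1
billed_input_tokens = text_tokens + num_images * IMAGE_PROMPT_TOKENS
cost = billed_input_tokens / 1_000_000 * INPUT_PRICE_PER_M_TOKENS
print(billed_input_tokens, f"${cost:.6f}")  # 776 $0.000155
```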

