Choose the plan that's right for you
Developer
Powerful speed and reliability to start your project
100 requests/min rate limit
Up to 100 deployed models
Custom PEFT add-ons
Pay per usage
Business
A plan that scales with your production usage
Everything from the Developer plan
Custom rate limits
Team collaboration features
API telemetry and metrics
Dedicated email support
Enterprise
Personalized configurations for serving at scale
Everything from the Business plan
Custom pricing
Unlimited rate limits
Unlimited deployed models
Custom base models
Dedicated and self-hosted deployments
Specialized enterprise support
Text Models
Per-token pricing applies only to non-enterprise deployments. Contact us for dedicated deployment pricing options.
Input tokens are counted from the prompt you supply in the request; output tokens are the completion tokens the model generates.
| Base model parameter count | $/1M input tokens | $/1M output tokens |
|---|---|---|
| Up to 16B | $0.20 | $0.80 |
| 16.1B - 80B | $0.70 | $2.80 |
| Mixtral 8x7B | $0.40 | $1.60 |
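As an illustration of how per-token billing works, the sketch below computes the cost of a single request from its input and output token counts using the rates in the table above. The tier keys and example token counts are chosen here purely for demonstration and are not official names or defaults.

```python
# Illustrative cost calculation for per-token text model pricing.
# Rates come from the table above (USD per 1M tokens); the tier keys
# are labels invented for this example.
RATES = {
    "up_to_16b":    {"input": 0.20, "output": 0.80},
    "16.1b_to_80b": {"input": 0.70, "output": 2.80},
    "mixtral_8x7b": {"input": 0.40, "output": 1.60},
}

def request_cost(tier: str, input_tokens: int, output_tokens: int) -> float:
    """Return the USD cost of one request under the per-token rates above."""
    rate = RATES[tier]
    return (input_tokens * rate["input"] + output_tokens * rate["output"]) / 1_000_000

# Example: a 2,000-token prompt with a 500-token completion on Mixtral 8x7B
# costs 2,000 x $0.40/1M + 500 x $1.60/1M = $0.0008 + $0.0008 = $0.0016.
print(f"${request_cost('mixtral_8x7b', 2_000, 500):.4f}")  # -> $0.0016
```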
Image Models
For image generation models like SDXL, we charge based on the number of inference steps (denoising iterations).
| SDXL, $/step | SDXL w/ ControlNet, $/step |
|---|---|
| $0.0002 | $0.0003 |
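For example, a 30-step SDXL generation would cost 30 × $0.0002 = $0.006, or 30 × $0.0003 = $0.009 with ControlNet (30 steps is an illustrative count, not a prescribed default).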
Multi-Modal
For multi-modal models like LLaVA, each image is billed as 576 prompt tokens.
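For instance, an image attached to a request billed at the up-to-16B input rate would add 576 × $0.20/1M ≈ $0.000115 to that request's input cost.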