Our serverless inference stack is currently in a closed beta. Join the waitlist to get early access.
Supported Models
Access the latest open-source models through a single API. The catalog spans text generation, code, multimodal, speech, embeddings, and image generation.
Showing 29 models
Text generation

| Model | Provider | Description | Parameters | Context | Pricing |
| --- | --- | --- | --- | --- | --- |
| Llama 3.3 70B | Meta | Highly capable open-source model for complex reasoning | 70B | 128K | $0.35/1M, $0.40/1M |
| Mistral Large | Mistral | Top-tier reasoning and multilingual capabilities | 123B | 128K | $0.50/1M, $0.50/1M |
| DeepSeek V3 | DeepSeek | Frontier-class model with MoE architecture | 671B | 128K | $0.80/1M, $0.80/1M |
| Qwen 2.5 72B | Alibaba | Strong multilingual and coding capabilities | 72B | 128K | $0.40/1M, $0.45/1M |
| Mixtral 8x22B | Mistral | Efficient MoE architecture with expert routing | 141B MoE | 64K | $0.50/1M, $0.50/1M |
| Llama 3.1 405B | Meta | Largest Llama 3.1 model, frontier performance | 405B | 128K | $1.00/1M, $1.00/1M |
| Llama 3.1 70B | Meta | - | 70B | 128K | $0.35/1M, $0.40/1M |
| Llama 3.1 8B | Meta | Fast and efficient for simple tasks | 8B | 128K | $0.10/1M, $0.10/1M |
| Qwen 2.5 32B | Alibaba | - | 32B | 128K | $0.25/1M, $0.30/1M |
| Qwen 2.5 7B | Alibaba | - | 7B | 128K | $0.08/1M, $0.08/1M |
| Mistral Nemo | Mistral | - | 12B | 128K | $0.15/1M, $0.15/1M |
| Gemma 2 27B | Google | - | 27B | 8K | $0.20/1M, $0.25/1M |
| Gemma 2 9B | Google | - | 9B | 8K | $0.10/1M, $0.10/1M |
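To make the per-1M pricing concrete, here is a minimal sketch of a cost estimate for a single request. It rests on assumptions the listing does not state: that "/1M" means per one million tokens, and that the two prices shown for each text model are the input price followed by the output price. The names `TextPricing`, `PRICING`, and `estimate_cost` are hypothetical helpers, and the dictionary keys are display names, not API model IDs.

```python
from dataclasses import dataclass

TOKENS_PER_UNIT = 1_000_000  # assumption: "/1M" means per one million tokens


@dataclass(frozen=True)
class TextPricing:
    """Per-1M-token prices in USD, copied from the listing."""
    input_per_million: float   # assumption: first listed price is for input tokens
    output_per_million: float  # assumption: second listed price is for output tokens


# A few entries from the listing; keys are display names, not API model IDs.
PRICING = {
    "Llama 3.3 70B": TextPricing(0.35, 0.40),
    "Llama 3.1 8B": TextPricing(0.10, 0.10),
    "DeepSeek V3": TextPricing(0.80, 0.80),
}


def estimate_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """Return the estimated USD cost of one request."""
    if input_tokens < 0 or output_tokens < 0:
        raise ValueError("token counts must be non-negative")
    price = PRICING[model]
    total = (
        input_tokens * price.input_per_million
        + output_tokens * price.output_per_million
    )
    return total / TOKENS_PER_UNIT


if __name__ == "__main__":
    cost = estimate_cost("Llama 3.3 70B", input_tokens=2_000_000, output_tokens=500_000)
    print(f"Estimated cost: ${cost:.2f}")
```

Under these assumptions, 2M input tokens and 500K output tokens on Llama 3.3 70B come to about $0.90. If the two prices turn out to mean something else, only the two field comments and the formula need to change.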
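Code

| Model | Provider | Description | Parameters | Context | Pricing |
| --- | --- | --- | --- | --- | --- |
| DeepSeek Coder V2 | DeepSeek | Specialized for code generation and understanding | 236B MoE | 128K | $0.60/1M, $0.60/1M |
| Codestral | Mistral | Optimized for code completion and generation | 22B | 32K | $0.25/1M, $0.25/1M |
| Qwen 2.5 Coder 32B | Alibaba | - | 32B | 128K | $0.30/1M, $0.35/1M |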
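Multimodal

| Model | Provider | Description | Parameters | Context | Pricing |
| --- | --- | --- | --- | --- | --- |
| Llama 3.2 Vision 90B | Meta | Vision-language model for image understanding | 90B | 128K | $0.55/1M, $0.55/1M |
| Llama 3.2 Vision 11B | Meta | - | 11B | 128K | $0.15/1M, $0.15/1M |
| Qwen2 VL 72B | Alibaba | Advanced vision-language understanding | 72B | 32K | $0.45/1M, $0.50/1M |
| Pixtral Large | Mistral | Multimodal model for vision and text | 124B | 128K | $0.55/1M, $0.55/1M |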
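Speech

| Model | Provider | Description | Parameters | Context | Pricing |
| --- | --- | --- | --- | --- | --- |
| Whisper Large V3 | OpenAI | State-of-the-art speech-to-text | 1.5B | 30s audio | $0.006/min |
| SeamlessM4T V2 | Meta | Multilingual speech translation | 2.3B | 30s audio | $0.008/min |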
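The speech entries list "30s audio" where text models list a context length. As a rough sketch, assuming that figure is the maximum clip length per request and that the listed price is charged per minute of audio with no rounding, a longer recording could be planned as follows. `plan_windows` and `estimate_audio_cost` are hypothetical helpers, not part of any published API.

```python
import math

WINDOW_SECONDS = 30  # assumption: "30s audio" is the maximum clip length per request

# Per-minute prices in USD from the listing; keys are display names, not API model IDs.
PRICE_PER_MINUTE = {
    "Whisper Large V3": 0.006,
    "SeamlessM4T V2": 0.008,
}


def plan_windows(duration_seconds: float) -> list[tuple[float, float]]:
    """Split a recording into consecutive (start, end) windows of at most 30 s."""
    if duration_seconds <= 0:
        raise ValueError("duration must be positive")
    count = math.ceil(duration_seconds / WINDOW_SECONDS)
    return [
        (i * WINDOW_SECONDS, min((i + 1) * WINDOW_SECONDS, duration_seconds))
        for i in range(count)
    ]


def estimate_audio_cost(model: str, duration_seconds: float) -> float:
    """Assumption: cost scales linearly with audio duration, with no rounding."""
    return PRICE_PER_MINUTE[model] * duration_seconds / 60


if __name__ == "__main__":
    windows = plan_windows(95)
    print(len(windows), "windows:", windows)
    print(f"${estimate_audio_cost('Whisper Large V3', 600):.3f} for 10 minutes")
```

A 95-second recording splits into four windows, the last one five seconds long. Non-overlapping windows keep the sketch simple; cutting at silences or overlapping windows may give better transcripts at word boundaries.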
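Embeddings

| Model | Provider | Description | Parameters | Context | Pricing |
| --- | --- | --- | --- | --- | --- |
| BGE Large EN v1.5 | BAAI | High-quality English embeddings | 335M | 512 tokens | $0.02/1M |
| BGE M3 | BAAI | Multilingual embeddings with long context | 568M | 8K tokens | $0.03/1M |
| GTE Qwen2 7B | Alibaba | Large embedding model for retrieval | 7B | 32K tokens | $0.05/1M |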
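The three embedding models differ mainly in token limit and price, so a natural question is which one is the cheapest that can take a given input in one pass. The sketch below shows one way to answer it. It assumes "8K" and "32K" mean 8,192 and 32,768 tokens and that prices are per one million tokens; `EmbeddingModel`, `CATALOG`, and `cheapest_fitting` are hypothetical names introduced for illustration.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class EmbeddingModel:
    name: str                 # display name from the listing, not an API model ID
    max_tokens: int           # assumption: 8K = 8,192 and 32K = 32,768 tokens
    price_per_million: float  # USD; assumption: "/1M" means per 1M tokens


CATALOG = [
    EmbeddingModel("BGE Large EN v1.5", 512, 0.02),
    EmbeddingModel("BGE M3", 8_192, 0.03),
    EmbeddingModel("GTE Qwen2 7B", 32_768, 0.05),
]


def cheapest_fitting(token_count: int) -> Optional[EmbeddingModel]:
    """Return the cheapest model whose token limit covers the input, or None."""
    candidates = [m for m in CATALOG if m.max_tokens >= token_count]
    return min(candidates, key=lambda m: m.price_per_million, default=None)


if __name__ == "__main__":
    for tokens in (300, 4_000, 20_000, 50_000):
        model = cheapest_fitting(tokens)
        print(tokens, "->", model.name if model else "no single model fits; split the input")
```

The selection looks only at length and price. BGE Large EN v1.5 is described as an English model, so multilingual text would need a language check before this step.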
Image generation

| Model | Provider | Description | Parameters | Context | Pricing |
| --- | --- | --- | --- | --- | --- |
| FLUX.1 Schnell | Black Forest Labs | Fast high-quality image generation | 12B | - | $0.003/image |
| FLUX.1 Dev | Black Forest Labs | Development-optimized image generation | 12B | - | $0.025/image |
| SDXL Turbo | Stability AI | Real-time image generation | 6.6B | - | $0.002/image |
| Stable Diffusion 3 Medium | Stability AI | Balanced quality and speed | 2B | - | $0.015/image |
Need a different model?
We're constantly adding new models based on customer demand. Let us know which models you'd like to see, and we'll prioritize adding them to the platform.
Ready to get started?
Request access and start using these models in minutes. No credit card required.