Pricing
Simple, transparent pricing.
Pay for what you use. Per-second billing on compute, per-token on inference. No hidden fees.
train-llama-ft (H100)
| Item | Value |
|---|---|
| Runtime | 00:04:23 |
| Rate | $2.49/hr |
| Per second | $0.00069 |
| Seconds used | 263 |
| Current cost | $0.18 |
Billed exactly for time used.
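The arithmetic behind the train-llama-ft example can be reproduced with a short sketch. The function name `compute_cost` and the half-up rounding to the cent are assumptions made for illustration; the page does not state how fractional cents are rounded.

```python
from decimal import Decimal, ROUND_HALF_UP


def compute_cost(rate_per_hour: str, seconds_used: int) -> Decimal:
    """Return the dollar cost of a per-second billed job, rounded to the cent.

    Rounding half-up to two decimals is an assumption, not a documented rule.
    """
    per_second = Decimal(rate_per_hour) / Decimal(3600)  # $2.49/hr is about $0.00069/s
    cost = per_second * seconds_used
    return cost.quantize(Decimal("0.01"), rounding=ROUND_HALF_UP)


# A runtime of 00:04:23 is 4 * 60 + 23 = 263 seconds.
cost = compute_cost("2.49", 4 * 60 + 23)
print(cost)  # 0.18
```

Using `Decimal` rather than floats keeps the per-second rate exact enough that the rounded total matches the displayed figure.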
// Inference
Run models via API.
Pay per token. No GPU management.
| Model | Input / 1M | Output / 1M |
|---|---|---|
| Llama 3.1 8B | $0.10 | $0.10 |
| Llama 3.1 70B | $0.35 | $0.40 |
| Llama 3.1 405B | $1.00 | $1.00 |
| Mixtral 8x22B | $0.50 | $0.50 |
| Qwen 2.5 72B | $0.40 | $0.45 |
More models in the dashboard.
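To estimate a per-token bill from the table above, multiply each token count by its price per million. The sketch below copies the listed prices; the function name `inference_cost` and the sample workload are hypothetical.

```python
# Dollars per 1M tokens, copied from the table above: (input, output).
PRICES_PER_MILLION = {
    "Llama 3.1 8B": (0.10, 0.10),
    "Llama 3.1 70B": (0.35, 0.40),
    "Llama 3.1 405B": (1.00, 1.00),
    "Mixtral 8x22B": (0.50, 0.50),
    "Qwen 2.5 72B": (0.40, 0.45),
}


def inference_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """Estimate the dollar cost of a workload on one model."""
    input_price, output_price = PRICES_PER_MILLION[model]
    return (input_tokens * input_price + output_tokens * output_price) / 1_000_000


# Hypothetical workload: 2M input tokens and 500k output tokens on Llama 3.1 70B.
print(f"${inference_cost('Llama 3.1 70B', 2_000_000, 500_000):.2f}")  # $0.90
```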
Reserved GPU capacity. Consistent latency.
GPUs:
| GPU | VRAM | Price |
|---|---|---|
| L40S | 48 GB | $1.49/hr |
| A100 80GB | 80 GB | $1.99/hr |
| H100 | 80 GB | $3.29/hr |
| H200 | 141 GB | $3.69/hr |
| B200 | 192 GB | $5.89/hr |
Deploy any Hugging Face model.
// Training & Compute
Run code on GPUs.
Run Python or Docker. Per-second billing.
| GPU | VRAM | Price |
|---|---|---|
| L40S | 48 GB | $1.49/hr |
| A100 80GB | 80 GB | $1.99/hr |
| H100 | 80 GB | $3.29/hr |
| H200 | 141 GB | $3.69/hr |
| B200 | 192 GB | $5.89/hr |
Billed per second. No minimum.
Full root access. SSH in seconds.
GPUs:
| GPU | VRAM | Price |
|---|---|---|
| L40S | 48 GB | $1.05/hr |
| A100 | 80 GB | $1.39/hr |
| H100 | 80 GB | $2.49/hr |
| H200 | 141 GB | $3.19/hr |
| B200 | 192 GB | $4.29/hr |
Storage: $0.10/GB/month.
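A rough monthly estimate combines GPU hours at the rates above with the storage charge. This is a sketch under assumptions: the function name `monthly_estimate` is hypothetical, and storage is assumed to be charged for a full month on the provisioned size, since the page does not describe proration.

```python
# Hourly rates copied from the table above.
VM_RATES_PER_HOUR = {
    "L40S": 1.05,
    "A100": 1.39,
    "H100": 2.49,
    "H200": 3.19,
    "B200": 4.29,
}
STORAGE_PER_GB_MONTH = 0.10  # from "Storage: $0.10/GB/month"


def monthly_estimate(gpu: str, gpu_hours: float, storage_gb: float) -> float:
    """Estimate one month's cost for a single GPU plus storage.

    Assumes storage is billed for the whole month (no proration).
    """
    return VM_RATES_PER_HOUR[gpu] * gpu_hours + STORAGE_PER_GB_MONTH * storage_gb


# Hypothetical usage: one H100 for 100 hours with 500 GB of storage.
print(f"${monthly_estimate('H100', 100, 500):.2f}")  # $299.00
```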
8 to 8,000 GPUs with InfiniBand. Custom pricing.
| GPU | VRAM | Price |
|---|---|---|
| NVIDIA GB300 | 288 GB | Custom |
| NVIDIA B300 | 288 GB | Custom |
| NVIDIA GB200 | 192 GB | Custom |
| NVIDIA B200 | 192 GB | Custom |
| NVIDIA H200 | 141 GB | Custom |
| NVIDIA H100 | 80 GB | Custom |
Minimum 8 GPUs. InfiniBand included.
// Savings Calculator
See how much you could save.
Add your current GPU workloads and compare costs against Lyceum pricing.
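The comparison can be approximated by hand. The sketch below is not the calculator's actual logic, which the page does not describe; the function name `monthly_savings` and the $4.00/hr "current" rate are placeholders, and the $3.29/hr figure is the H100 rate from the Training & Compute table.

```python
def monthly_savings(current_rate: float, new_rate: float, hours: float) -> tuple[float, float]:
    """Return (dollars saved, percent saved) for one workload over a month."""
    current_cost = current_rate * hours
    new_cost = new_rate * hours
    saved = current_cost - new_cost
    percent = saved / current_cost * 100 if current_cost else 0.0
    return saved, percent


# Placeholder input: a workload currently costing $4.00/hr (not a quoted price),
# compared with the listed H100 rate of $3.29/hr over 200 hours.
saved, percent = monthly_savings(4.00, 3.29, 200)
print(f"${saved:.2f} saved ({percent:.2f}%)")  # $142.00 saved (17.75%)
```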
Start building today.
No credit card required. Pay only for what you use.