Pricing

Simple, transparent pricing.

Pay for what you use. Per-second billing on compute, per-token on inference. No hidden fees.

Example: train-llama-ft on H100

Runtime: 00:04:23 (263 seconds)

Rate: $2.49/hr ($0.00069 per second)

Current cost: $0.18

Billed exactly for time used.
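
The arithmetic behind the billing example above can be sketched as follows. This is a minimal illustration, not Lyceum's billing code: the function name `per_second_cost` and the choice to round the total half-up to the nearest cent are assumptions.

```python
from decimal import Decimal, ROUND_HALF_UP

SECONDS_PER_HOUR = Decimal(3600)


def per_second_cost(hourly_rate_usd: str, seconds_used: int) -> Decimal:
    """Return the cost in USD of running for `seconds_used` seconds.

    Hypothetical helper: the hourly rate is divided into a per-second
    rate, multiplied by the seconds used, and rounded to cents.
    """
    rate_per_second = Decimal(hourly_rate_usd) / SECONDS_PER_HOUR
    total = rate_per_second * seconds_used
    # Assumption: the displayed cost is rounded half-up to the nearest cent.
    return total.quantize(Decimal("0.01"), rounding=ROUND_HALF_UP)


# The example job above: H100 at $2.49/hr for 4 min 23 s (263 seconds).
print(per_second_cost("2.49", 263))  # 0.18
```

At $2.49/hr the per-second rate is about $0.00069, so 263 seconds comes to roughly $0.18, matching the figures shown above.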

// Inference

Run models via API.

Pay per token. No GPU management.

Model	Input / 1M tokens	Cached / 1M tokens	Output / 1M tokens
DeepSeek V4 Pro	$1.75	$0.44	$3.50
DeepSeek V4 Flash	$0.15	$0.04	$0.30
GLM-5.2	$1.50	$0.38	$4.50
Kimi K2.7 Code	$1.25	$0.31	$4.50
MiniMax M3	$0.40	$0.10	$2.00
Kimi K2.6	$1.00	$0.25	$4.00
Qwen3.5 397B A17B	$0.60	-	$3.60
gpt-oss 120B	$0.15	-	$0.60
MiniMax M2.5	$0.30	-	$1.20
Qwen3.5 9B	$0.15	$0.04	$0.20
Qwen3 Coder 30B A3B	$0.06	$0.01	$0.25
Qwen3 Embedding 8B	$0.01	-	-
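
To estimate the cost of a request from the table above, multiply each token count by its per-million rate. The sketch below is illustrative only. The function name `inference_cost` is hypothetical, and two billing rules are assumptions the page does not state: that `input_tokens` counts only uncached input, and that cached tokens are charged at the full input rate when no cached price is listed.

```python
from decimal import Decimal

MILLION = Decimal(1_000_000)

# USD per 1M tokens, copied from the table above.
# None means the table lists no cached price for that model.
PRICES = {
    "DeepSeek V4 Flash": {
        "input": Decimal("0.15"),
        "cached": Decimal("0.04"),
        "output": Decimal("0.30"),
    },
    "gpt-oss 120B": {
        "input": Decimal("0.15"),
        "cached": None,
        "output": Decimal("0.60"),
    },
}


def inference_cost(model, input_tokens, output_tokens, cached_tokens=0):
    """Estimate the USD cost of a request (hypothetical helper).

    Assumption: `input_tokens` counts uncached input tokens only.
    """
    price = PRICES[model]
    # Assumption: without a listed cached price, cached tokens
    # are billed at the regular input rate.
    cached_rate = price["cached"] if price["cached"] is not None else price["input"]
    total = (
        price["input"] * input_tokens
        + cached_rate * cached_tokens
        + price["output"] * output_tokens
    )
    return total / MILLION


estimate = inference_cost(
    "DeepSeek V4 Flash",
    input_tokens=2_000_000,
    output_tokens=500_000,
    cached_tokens=1_000_000,
)
print(f"${estimate:.2f}")
```

Under these assumptions, 2M uncached input tokens, 1M cached tokens, and 500K output tokens on DeepSeek V4 Flash come to $0.30 + $0.04 + $0.15 = $0.49.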

More models in the dashboard.

Reserved GPU capacity. Consistent latency.

Deploy any Hugging Face model.

// Training & Compute

Run Python or Docker. Billed per second, with no minimum.

Full root access. SSH in seconds.

Storage: $0.10/GB/month.

Clusters of 8 to 8,000 GPUs (minimum 8), InfiniBand included. Custom pricing.

// Savings Calculator

Add your current GPU workloads and compare costs against Lyceum pricing.

Workloads

	Provider	GPU	VRAM	# GPUs	Duration	Mode	$/GPU/hr	Total Cost
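
The comparison the calculator performs can be approximated by hand. In this sketch the `Workload` class, its field names, and the $3.50/GPU/hr competitor rate are hypothetical and chosen only for illustration. The $2.49/hr H100 rate is the one shown in the billing example on this page, and the calculator's Mode column is not modeled.

```python
from dataclasses import dataclass
from decimal import Decimal


@dataclass
class Workload:
    """One calculator row (hypothetical structure)."""
    provider: str
    gpu: str
    num_gpus: int
    hours: Decimal
    rate_per_gpu_hour: Decimal  # USD per GPU per hour

    def total_cost(self) -> Decimal:
        # Total cost = number of GPUs x duration x hourly rate per GPU.
        return self.num_gpus * self.hours * self.rate_per_gpu_hour


def savings(current: Workload, lyceum_rate: Decimal) -> Decimal:
    """Cost difference if the same workload ran at `lyceum_rate`."""
    lyceum_cost = current.num_gpus * current.hours * lyceum_rate
    return current.total_cost() - lyceum_cost


# Hypothetical workload: 8 H100s for 24 hours at an assumed $3.50/GPU/hr.
current = Workload("OtherCloud", "H100", 8, Decimal(24), Decimal("3.50"))
print(current.total_cost())                # 672.00
print(savings(current, Decimal("2.49")))   # 193.92
```

With these assumed inputs the workload costs $672.00 at the hypothetical rate and $478.08 at $2.49/GPU/hr, a difference of $193.92.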

No credit card required. Pay only for what you use.