Everything you need to run AI workloads.
Inference APIs, serverless execution, GPU VMs, and managed clusters - designed to work individually or together.
Serverless Inference
Coming Soon · Pay-per-token API access to open-source models
from lyceum import Client

client = Client("your-api-key")

response = client.chat.completions.create(
    model="meta-llama/Llama-3-70B",
    messages=[{"role": "user", "content": "Hello!"}],
    stream=True,
)

for chunk in response:
    print(chunk.choices[0].delta.content)

Dedicated Endpoints
Reserved GPU capacity for production models
Serverless Training
Run code on GPUs without managing infrastructure
GPU Virtual Machines
Full root access, ready in seconds
Large-Scale Clusters
8 to 8,000 GPUs with InfiniBand
Built for teams that ship.
We handle the infrastructure complexity so you can focus on building. No capacity planning, no vendor lock-in, no surprises on your bill.
Per-second billing
Pay only for compute you actually use. Jobs that finish early don't cost you for time you didn't need.
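The difference is easy to put in numbers. A quick sketch of per-second versus round-up-to-the-hour billing for a job that finishes early (the hourly rate is a made-up figure for illustration, not an actual Lyceum price):

```python
import math

# Hypothetical GPU rate, for illustration only.
HOURLY_RATE = 2.00  # $/GPU-hour

def per_second_cost(seconds: float) -> float:
    """Bill exactly the seconds of compute used."""
    return seconds * HOURLY_RATE / 3600

def per_hour_cost(seconds: float) -> float:
    """Round up to whole hours, as hourly billing does."""
    return math.ceil(seconds / 3600) * HOURLY_RATE

job = 37 * 60  # a job that finishes in 37 minutes
print(f"per-second: ${per_second_cost(job):.2f}")  # charges ~$1.23
print(f"per-hour:   ${per_hour_cost(job):.2f}")    # charges $2.00 for a full hour
```

The gap compounds across many short jobs: with hourly rounding, every run that finishes mid-hour pays for time it never used.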
Docker-native
Any Docker container runs on Lyceum with no modifications. No proprietary SDKs, no vendor lock-in.
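Docker-native means a standard Dockerfile is all you need; a minimal sketch (the base image and script name here are illustrative, nothing Lyceum-specific is required):

```dockerfile
# A plain PyTorch container - no proprietary SDK, no modifications.
FROM pytorch/pytorch:2.2.0-cuda12.1-cudnn8-runtime
WORKDIR /app
COPY train.py .
CMD ["python", "train.py"]
```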
EU data centres
GPUs hosted in European data centres. Full GDPR compliance, data residency in the EU.
Instant availability
GPUs provision in seconds, not hours. No capacity planning, no procurement queues.
GDPR Compliant · EU Data Residency
Transparent pricing.
Pay for what you use. No hidden fees.
Per-token pricing. No minimum spend.
View all models
Per-second billing. No minimum commitment.
Long-term contracts from 3 months.
We'll get back to you within 24 hours.
From Our Magazine
Insights on GPU infrastructure, cost optimization, and AI deployment.
Sovereign AI: Navigating EU Data Residency
How to build AI infrastructure that meets European data sovereignty requirements.
Migration · Migrating from AWS to Dedicated GPUs
A practical guide to moving your ML workloads from hyperscalers to dedicated GPU infrastructure.
Cost Optimization · Stopping the Bleed: The $15B GPU Overprovisioning Crisis
Why most teams are paying for GPUs they don't need, and how to fix it.
Cost Optimization · How to Right-Size GPU Instances for ML Workloads
Match your GPU resources to actual workload requirements and stop overspending.
GPU Memory · Eliminating CUDA OOM: Expert Memory Management for LLMs
Practical techniques to prevent out-of-memory errors when training large language models.
GPU Memory · GPU Utilization Too Low: How to Fix Compute Bottlenecks
Diagnose and fix the common causes of underutilized GPU resources.
Ready to ship faster?
Request access to run your first GPU job in under a minute. No credit card required.