FLUX.2 Klein: specs, benchmarks, and how to run it on Lyceum
Sub-second image generation and multi-reference editing from Black Forest Labs.
Justus Amen
June 16, 2026 · GTM at Lyceum Technology
FLUX.2 Klein is a compact, high-speed image generation model developed by Black Forest Labs. It comes in 4B and 9B parameter variants, both distilled to just four inference steps for sub-second generation. Unlike its predecessors, FLUX.2 Klein unifies text-to-image, single-reference, and multi-reference editing into a single architecture. The model is served via our OpenAI-compatible Serverless Inference API, allowing you to generate images instantly on EU-hosted infrastructure with zero egress fees.
Get started: call FLUX.2 Klein on Lyceum
Generate images with FLUX.2 Klein on Lyceum Technology using our OpenAI-compatible Serverless Inference API. Because this is an image generation model, you will send a POST request directly to the images/generations endpoint rather than using the standard chat completions base URL. The integration requires zero infrastructure management, allowing you to focus entirely on prompt engineering.
import requests

# Send the prompt to the image generation endpoint.
response = requests.post(
    "https://api.lyceum.technology/api/v2/external/images/generations",
    headers={"Authorization": "Bearer <your lyceum api key>"},
    json={"model": "lyc-flux-2-klein", "prompt": "a sunset over the ocean", "aspect_ratio": "1:1"},
)
response.raise_for_status()  # stop here on an HTTP error instead of parsing an error body
print(response.json()["image_url"])
Pricing and region for FLUX.2 Klein
When you call this model on Lyceum, you get straightforward, usage-based pricing with no hidden base fees. FLUX.2 Klein costs $0.0001 per image generated. The model has no specific tier in our catalogue; it is positioned as a cost-efficient, fast-inference option for visual workloads.
All inference for this model is hosted in the eu-north1 region. This ensures that your prompts and generated assets remain within European borders, satisfying strict data residency requirements. By utilizing Lyceum's owned GPU infrastructure, you avoid the egress fees and unpredictable capacity constraints often associated with hyperscaler platforms.
What FLUX.2 Klein is good at
Sub-second generation speed
FLUX.2 Klein was engineered by Black Forest Labs specifically for latency-critical applications. Because the model is distilled down to just four inference steps, it achieves sub-second generation times on modern hardware. According to benchmarks published by InferenceBench, the 4B variant generates a photorealistic 1024x1024 image in 0.57 seconds on a single H100 GPU. This makes it a strong choice for interactive workflows, real-time user interfaces, and rapid prototyping where users cannot afford to wait for traditional multi-step diffusion.
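The 0.57-second figure is a model-side benchmark; the latency your users see also includes network and queueing time. A minimal sketch for measuring that end-to-end number yourself is shown here. The helper names `timed` and `median_latency` are illustrative and not part of any Lyceum SDK.

```python
import statistics
import time
from typing import Callable, Tuple, TypeVar

T = TypeVar("T")


def timed(fn: Callable[[], T]) -> Tuple[T, float]:
    """Run fn once and return (result, elapsed wall-clock seconds)."""
    start = time.perf_counter()
    result = fn()
    return result, time.perf_counter() - start


def median_latency(fn: Callable[[], object], runs: int = 5) -> float:
    """Median wall-clock latency of fn over several sequential runs."""
    if runs < 1:
        raise ValueError("runs must be at least 1")
    return statistics.median(timed(fn)[1] for _ in range(runs))
```

To use it, wrap your own image request in a zero-argument function and pass it to `median_latency`. The median is less sensitive than the mean to a single slow cold-start request.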
Unified multi-reference editing
Unlike previous generations of open-weight models that required separate architectures for different tasks, FLUX.2 Klein unifies text-to-image generation and image editing into a single compact architecture. It supports both single-reference and multi-reference editing natively. This allows developers to build applications where users can upload character sheets or style boards, and the model will maintain consistent identities and aesthetics across multiple generated outputs without requiring complex fine-tuning.
Production-grade typography
One of the most significant upgrades over the FLUX.1 family is the model's ability to render legible text. FLUX.2 Klein utilizes an advanced text embedder that drastically improves prompt adherence and typography. Whether you are generating UI mockups, storefront signs, or infographics, the model reliably produces accurate text strings, allowing teams to generate production-ready assets directly from text prompts.
Benchmarks and how it compares
FLUX.2 Klein benchmark results
Evaluating image generation models requires balancing generation speed, cost, and visual quality. FLUX.2 Klein was designed to dominate the speed and efficiency metrics while maintaining a high baseline of photorealism.
| Model | Inference Steps | Time per 1024px Image (H100) | License / Access |
|---|---|---|---|
| FLUX.2 Klein 4B | 4 | 0.57 seconds | Apache 2.0 |
| FLUX.1 Schnell | 4 | ~0.60 seconds | Apache 2.0 |
| Z-Image Turbo | 8 | ~1.20 seconds | Open Weights |
| FLUX.2 [dev] (32B) | 28 | ~4.50 seconds | Non-Commercial |
Source: InferenceBench Blog and Artificial Analysis Image Arena.
When comparing FLUX.2 Klein to its predecessor, FLUX.1 Schnell, the architectural improvements are immediately apparent. Both models target the high-speed, low-step generation niche, but FLUX.2 Klein delivers noticeably better fine detail and text rendering. While FLUX.1 Schnell occasionally struggles with garbled text and softer textures, FLUX.2 Klein leverages its updated text embedder to produce crisp typography and sharper subjects at virtually the same generation speed.
Compared to Z-Image Turbo, FLUX.2 Klein 4B needs half the inference steps (4 versus 8), and the table above shows roughly half the per-image latency. For teams migrating off hyperscaler credits, replacing a heavier diffusion pipeline with FLUX.2 Klein on Lyceum Technology can reduce compute costs, since fewer steps mean less GPU time per image.
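To make the comparison concrete, the short sketch below turns the latencies from the table above into relative ratios and a rough per-GPU throughput. It assumes images are generated one at a time with no batching, and several table values are approximate, so treat the results as estimates.

```python
from typing import Dict

# Seconds per 1024px image on an H100, copied from the table above.
# Entries marked "~" in the table are approximate.
LATENCY_S: Dict[str, float] = {
    "FLUX.2 Klein 4B": 0.57,
    "FLUX.1 Schnell": 0.60,
    "Z-Image Turbo": 1.20,
    "FLUX.2 [dev] (32B)": 4.50,
}


def latency_ratio(baseline: str, latencies: Dict[str, float]) -> Dict[str, float]:
    """How many times slower each model is than the baseline model."""
    base = latencies[baseline]
    return {name: round(t / base, 2) for name, t in latencies.items()}


def images_per_hour(latency_s: float) -> int:
    """Sequential throughput of one GPU, assuming no batching or overhead."""
    return int(3600 / latency_s)


ratios = latency_ratio("FLUX.2 Klein 4B", LATENCY_S)
print(ratios["Z-Image Turbo"])       # 2.11
print(ratios["FLUX.2 [dev] (32B)"])  # 7.89
print(images_per_hour(LATENCY_S["FLUX.2 Klein 4B"]))  # 6315
```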
Using it in production
Production configuration for FLUX.2 Klein
Integrating FLUX.2 Klein into a production environment requires understanding how to format your API requests for optimal results. Because our platform provides an OpenAI-compatible Serverless Inference API, the transition is straightforward, but image generation endpoints have specific parameter requirements.
When calling the images/generations endpoint, you must specify the model as lyc-flux-2-klein. The primary input is your prompt, which should be descriptive and clear. You can also control the dimensions of the output using the aspect_ratio parameter, which accepts standard ratios like 1:1, 16:9, or 9:16.
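The three request fields described above can be assembled and checked before any network call is made. The sketch below is one way to do that. Treating the three ratios named in the text as an allowlist is an assumption, since the API may accept others, and `build_payload` is a hypothetical helper name.

```python
MODEL = "lyc-flux-2-klein"

# Ratios named in the text. Assumption: used here as a conservative
# allowlist; the API may accept additional values.
KNOWN_ASPECT_RATIOS = {"1:1", "16:9", "9:16"}


def build_payload(prompt: str, aspect_ratio: str = "1:1") -> dict:
    """Return the JSON body for an images/generations request."""
    cleaned = prompt.strip()
    if not cleaned:
        raise ValueError("prompt must not be empty")
    if aspect_ratio not in KNOWN_ASPECT_RATIOS:
        raise ValueError(f"unsupported aspect_ratio: {aspect_ratio!r}")
    return {"model": MODEL, "prompt": cleaned, "aspect_ratio": aspect_ratio}
```

The returned dictionary can be passed as the `json=` argument of the `requests.post` call from the getting-started example. Validating locally catches an empty prompt or a mistyped ratio before a request is billed or rejected.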
From a unit economics perspective, FLUX.2 Klein is exceptionally cost-effective. At a flat rate of $0.0001 per image, running a high-volume application becomes highly predictable. For example, generating 10,000 images for a dynamic marketing campaign or a user-facing avatar creator will cost exactly $1.00. There are no hidden compute duration charges or idle fees.
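The arithmetic behind that example is simple enough to encode directly. The sketch below uses the flat $0.0001 rate quoted above and `Decimal` to avoid floating-point rounding in currency values; the function names are illustrative.

```python
from decimal import Decimal

PRICE_PER_IMAGE_USD = Decimal("0.0001")  # flat per-image rate quoted above


def estimate_cost_usd(num_images: int) -> Decimal:
    """Total cost in USD for a given number of generated images."""
    if num_images < 0:
        raise ValueError("num_images must be non-negative")
    return PRICE_PER_IMAGE_USD * num_images


def images_for_budget(budget_usd: Decimal) -> int:
    """Largest whole number of images that fits within a budget."""
    return int(budget_usd // PRICE_PER_IMAGE_USD)


print(estimate_cost_usd(10_000))         # 1.0000
print(images_for_budget(Decimal("25")))  # 250000
```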
All requests are routed to our eu-north1 region. This geographic routing gives applications in Europe low-latency connections and keeps processing within the region for data privacy purposes. Because this model is billed per image with no idle fees, you pay only for the exact number of images your users generate, which suits bursty, unpredictable workloads.
Running FLUX.2 Klein on EU-sovereign infrastructure
Why run FLUX.2 Klein on Lyceum
Deploying visual AI models in enterprise environments often introduces significant compliance and infrastructure challenges. By running FLUX.2 Klein on Lyceum Technology, engineering teams can bypass the complexities of managing their own GPU clusters while ensuring complete data sovereignty.
Data privacy is a critical requirement for European companies. When you use Lyceum's Serverless Inference API, your prompts and generated images are processed entirely within the eu-north1 region. This provides a clear path to GDPR compliance, which can be considerably harder to demonstrate when data is routed through US-based API providers. Lyceum's owned GPU infrastructure ensures that your workloads run on secure, European-hosted hardware.
Furthermore, Lyceum offers open-stack transparency. We use optimized inference engines like vLLM and NVIDIA Dynamo, which deliver high throughput and low latency without locking you into a proprietary ecosystem. Because the stack is open, your workloads stay portable by design.
The developer experience is built around reducing friction. With our OpenAI-compatible API, you can swap out your existing image generation provider by pointing your requests at the images/generations endpoint and updating your API key and model name. There is no need to learn a new SDK. Combined with our aggressive pricing of $0.0001 per image and zero egress fees, Lyceum provides a cost-effective, compliant platform for scaling FLUX.2 Klein.
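One way to keep a provider swap down to a configuration change is to read the endpoint, key, and model name from the environment. The sketch below illustrates that pattern using only the standard library. The environment variable names and the `ImageProvider` class are hypothetical, and it assumes the request fields match the getting-started example; other providers may name their fields differently.

```python
import json
import os
import urllib.request
from dataclasses import dataclass
from typing import Mapping

DEFAULT_ENDPOINT = "https://api.lyceum.technology/api/v2/external/images/generations"


@dataclass(frozen=True)
class ImageProvider:
    endpoint: str
    api_key: str
    model: str


def provider_from_env(env: Mapping[str, str] = os.environ) -> ImageProvider:
    """Load provider settings; IMAGE_API_* names are illustrative only."""
    return ImageProvider(
        endpoint=env.get("IMAGE_API_URL", DEFAULT_ENDPOINT),
        api_key=env["IMAGE_API_KEY"],  # required, raises KeyError if missing
        model=env.get("IMAGE_MODEL", "lyc-flux-2-klein"),
    )


def build_request(provider: ImageProvider, prompt: str,
                  aspect_ratio: str = "1:1") -> urllib.request.Request:
    """Build (but do not send) the POST request for one image."""
    body = json.dumps(
        {"model": provider.model, "prompt": prompt, "aspect_ratio": aspect_ratio}
    ).encode("utf-8")
    return urllib.request.Request(
        provider.endpoint,
        data=body,
        headers={
            "Authorization": f"Bearer {provider.api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )
```

Sending the request is then a call to `urllib.request.urlopen(build_request(...))`, and switching providers means changing environment variables rather than application code.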