Image Ultra: specs, benchmarks, and how to run it on Lyceum
Sub-second text-to-image generation for real-time applications.
Maximilian Niroomand
June 19, 2026 · CTO & Co-Founder at Lyceum Technology
Image Ultra is a high-speed text-to-image model optimized for real-time generation, delivering results in under one second. Built for applications that need instant visual feedback, such as interactive design tools, gaming, and rapid prototyping, it prioritizes low latency while preserving core image quality. Lyceum Technology serves Image Ultra through its OpenAI-compatible Serverless Inference API, so developers can integrate sub-second image generation with zero infrastructure management. All inference runs entirely on EU-hosted GPUs, which keeps data resident in the EU and supports GDPR compliance for European enterprises.
Get started: call Image Ultra on Lyceum
Integrating Image Ultra into your application requires minimal setup. Because Lyceum Technology provides an OpenAI-compatible API, you can generate images using standard HTTP requests without managing underlying GPU infrastructure. For image generation models, you must use the dedicated images/generations endpoint rather than the standard chat completions endpoint.
Unlike text models, where you might use the OpenAI Python client with a custom base URL, image generation on Lyceum requires a direct HTTP POST request. Pass your Lyceum API key as a Bearer token in the Authorization header.
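import requests

# POST to the dedicated image endpoint; the API key goes in the Bearer header.
response = requests.post(
    "https://api.lyceum.technology/api/v2/external/images/generations",
    headers={"Authorization": "Bearer <your lyceum api key>"},
    json={"model": "lyc-image-ultra", "prompt": "a sunset over the ocean", "aspect_ratio": "1:1"},
    timeout=30,  # do not hang indefinitely on network problems
)
response.raise_for_status()  # surface HTTP errors before reading the body
print(response.json()["image_url"])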
Pricing and region for Image Ultra
Image Ultra is available exclusively in the eu-north1 region, so all data processing remains within European borders. It is billed on a pay-per-use basis at $0.005 per image. There is no Fast or Standard tier to choose for this model, because it is already optimized for speed and cost-efficiency. There are no minimum commitments, no subscription fees, and no idle compute costs. You pay only for the images you successfully generate, which keeps costs predictable for production workloads.
What Image Ultra is good at
Sub-second generation latency
The primary advantage of Image Ultra is its speed. Using distillation techniques similar to Adversarial Diffusion Distillation (ADD), the approach behind SDXL Turbo, it cuts the number of denoising steps from the 30 or more typical of standard diffusion sampling to fewer than four. Fewer steps means less compute per image, which is what lets the model return high-quality images in under one second. For engineering teams, this makes Image Ultra viable for real-time applications where traditional diffusion models would introduce unacceptable delays and break the user experience.
Cost-efficient scaling for high-volume workloads
Because the model requires significantly less compute per generation, it is cost-effective at scale. At $0.005 per image, applications that generate images in high volume can operate with low, predictable unit costs. This is particularly valuable for dynamic e-commerce cataloging, personalized marketing asset generation, or user-generated content moderation pipelines where thousands of images must be processed daily without inflating cloud infrastructure budgets.
Interactive and real-time workflows
Image Ultra excels in environments where users expect immediate visual feedback. This includes live UI mockups, real-time gaming asset generation, and interactive AI chat interfaces. The sub-second response time prevents workflow interruption, enabling a fluid user experience that heavier, multi-step models cannot support. Developers can wire the API directly to user input fields, generating visual concepts on the fly as the user types, which fundamentally changes how end-users interact with generative design tools.
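Wiring generation to a text field naively would fire one billed request per keystroke. A common way to avoid that is to debounce the input so only the latest text is sent once the user pauses. The sketch below is a minimal illustration, not part of any Lyceum SDK: the `Debouncer` class, its `wait` default, and the `callback` you pass in (which would wrap your actual API call) are all assumptions.

```python
import asyncio


class Debouncer:
    """Run an async callback only after input has been quiet for `wait` seconds.

    Hypothetical helper for illustration; submit() must be called from
    inside a running asyncio event loop.
    """

    def __init__(self, callback, wait=0.3):
        self._callback = callback  # async function taking the prompt text
        self._wait = wait          # quiet period in seconds (illustrative)
        self._task = None

    def submit(self, text):
        # Cancel the pending or in-flight call so only the newest text wins.
        if self._task is not None and not self._task.done():
            self._task.cancel()
        self._task = asyncio.create_task(self._run(text))

    async def _run(self, text):
        await asyncio.sleep(self._wait)   # wait for the user to pause
        await self._callback(text)        # then generate for the latest text
```

Call `submit()` from your input handler on every change. The quiet period trades a little responsiveness for far fewer requests, so it is worth tuning against the sub-second generation time.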
Benchmarks and how it compares
Sub-second image model benchmarks
Image Ultra competes in the ultra-fast, distilled diffusion category. Artificial Analysis does not publicly track Elo scores for the proprietary lyc-image-ultra endpoint, but its performance profile is comparable to leading fast models such as SDXL Turbo and FLUX.1 Schnell. The table below lists step counts and reported latencies for this class of models.
| Model | Inference steps | Reported latency | Target use case |
|---|---|---|---|
| SDXL Turbo | 1 | ~207 ms (A100) | Real-time generation |
| FLUX.1 Schnell | 4 | ~1.4 s (API median) | Fast prototyping |
| Image Ultra (Lyceum) | Fewer than 4 | < 1.0 s | Low-latency API serving |
Source: SDXL Turbo latency from developer benchmarks on an A100; FLUX.1 Schnell API median latency from Artificial Analysis. The figures were measured under different conditions, so treat them as indicative rather than a like-for-like comparison.
Comparison to sibling catalogue models
When evaluating Image Ultra against standard diffusion models in the Lyceum catalogue, the primary differentiator is the speed-to-quality ratio. Standard models may take 5 to 12 seconds per image but deliver superior photorealism, accurate anatomy, and reliable text rendering. Image Ultra sacrifices that top-tier fidelity to guarantee sub-second delivery. For infrastructure leads, this means Image Ultra is the default choice for interactive applications, while heavier models should be reserved for asynchronous batch processing or final-asset production where latency is not a constraint.
Using it in production
Production configuration for Image Ultra
Deploying Image Ultra in production requires understanding its API mechanics. Unlike text generation models, image generation does not stream tokens. You send a single synchronous POST request to the images/generations endpoint, and the API returns a JSON payload containing the image_url. Your application must be prepared for this blocking call, for example by setting a client-side timeout and keeping the request off latency-sensitive threads, though the sub-second latency of Image Ultra keeps the wait short.
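One way to keep that request/response handling in one place is to separate building the request from reading the reply. The sketch below reuses the endpoint, model name, and `image_url` field from the Get started example. The helper names, the `timeout` value, and the allow-list of aspect ratios (only the three ratios this article names) are assumptions for illustration, not documented API behavior.

```python
API_URL = "https://api.lyceum.technology/api/v2/external/images/generations"
MODEL = "lyc-image-ultra"
# Ratios named in this article; the full supported set is an assumption.
KNOWN_ASPECT_RATIOS = {"1:1", "16:9", "9:16"}


def build_request(api_key: str, prompt: str, aspect_ratio: str = "1:1") -> dict:
    """Return keyword arguments suitable for requests.post(**kwargs)."""
    if not prompt.strip():
        raise ValueError("prompt must not be empty")
    if aspect_ratio not in KNOWN_ASPECT_RATIOS:
        raise ValueError(f"unsupported aspect ratio: {aspect_ratio!r}")
    return {
        "url": API_URL,
        "headers": {"Authorization": f"Bearer {api_key}"},
        "json": {
            "model": MODEL,
            "prompt": prompt.strip(),
            "aspect_ratio": aspect_ratio,
        },
        "timeout": 10,  # seconds; illustrative value, tune for your client
    }


def extract_image_url(payload: dict) -> str:
    """Read image_url from the decoded JSON body, failing loudly if absent."""
    try:
        return payload["image_url"]
    except KeyError:
        raise ValueError("response has no image_url field") from None


# Usage with the requests library (not executed here):
#   response = requests.post(**build_request(key, "a sunset over the ocean"))
#   response.raise_for_status()
#   url = extract_image_url(response.json())
```

Validating the payload before sending means malformed input fails locally instead of costing a network round trip.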
Key points to address in your production setup:
Prompt engineering: Keep descriptions concise. Distilled models respond better to direct, keyword-rich prompts than to lengthy conversational instructions. Place the most important subjects at the beginning of the prompt string.
Aspect ratio: The model supports standard ratios (e.g., 1:1, 16:9, 9:16). Stick to these defaults to avoid structural warping or unexpected cropping.
Error handling: Implement retry logic for network timeouts, even though the model itself processes requests in under one second.
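The retry advice can be sketched as a small wrapper with exponential backoff and jitter. This is a generic pattern, not a Lyceum-provided helper: `call_with_retry`, its defaults, and the `send` callable (which would wrap your POST) are assumptions. It retries only the exception types you pass in `retry_on`, so other errors such as HTTP 4xx responses propagate immediately.

```python
import random
import time


def call_with_retry(send, max_attempts=3, base_delay=0.2,
                    retry_on=(TimeoutError, ConnectionError),
                    sleep=time.sleep):
    """Call send() until it succeeds or max_attempts is reached.

    Waits base_delay * 2**attempt plus up to 50% jitter between attempts.
    All names and default values are illustrative assumptions.
    """
    for attempt in range(max_attempts):
        try:
            return send()
        except retry_on:
            if attempt == max_attempts - 1:
                raise  # out of attempts: surface the last error to the caller
            delay = base_delay * (2 ** attempt)
            sleep(delay + random.uniform(0, delay / 2))
```

With the `requests` library, pass `retry_on=(requests.exceptions.Timeout, requests.exceptions.ConnectionError)`, since those classes do not derive from the built-in `TimeoutError` and `ConnectionError`. Keeping the backoff short suits a model that normally answers in under a second.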
Cost estimation at scale
Because Image Ultra is billed at a flat rate of $0.005 per image, production costs are straightforward to forecast: multiply the number of images by the unit price. If your application generates 100,000 images per month, the total compute cost is 100,000 × $0.005 = $500.00. The model is hosted in the eu-north1 region, and this price includes all infrastructure overhead. Lyceum Technology also charges no egress fees for retrieving the generated images, so high-traffic applications do not incur hidden network costs.
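The same arithmetic is easy to script for capacity planning. The sketch below assumes only the flat $0.005 per-image rate quoted above; the function names are illustrative, and `Decimal` is used so the results are not affected by floating-point rounding.

```python
from decimal import Decimal

PRICE_PER_IMAGE_USD = Decimal("0.005")  # flat rate quoted in this article


def monthly_cost_usd(images_per_month: int) -> Decimal:
    """Total cost in USD for a given monthly image volume."""
    if images_per_month < 0:
        raise ValueError("images_per_month must be non-negative")
    return PRICE_PER_IMAGE_USD * images_per_month


def images_for_budget(budget_usd: Decimal) -> int:
    """Largest number of images a given USD budget covers."""
    if budget_usd < 0:
        raise ValueError("budget_usd must be non-negative")
    return int(budget_usd // PRICE_PER_IMAGE_USD)
```

For example, `monthly_cost_usd(100_000)` reproduces the $500 figure, and `images_for_budget(Decimal("100"))` shows how many images a $100 budget covers.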
Running Image Ultra on EU-sovereign infrastructure
Why run Image Ultra on Lyceum
For European teams building AI applications, data residency is often a strict regulatory requirement. Many hyperscalers and US-based API providers route inference traffic through North American data centers, creating compliance risks for enterprise workloads. Lyceum Technology addresses this by hosting Image Ultra entirely within the eu-north1 region. Your prompts and generated assets never leave the European Union, which gives you a clear, auditable basis for your GDPR and AI Act compliance work.
Owned infrastructure advantage
Unlike API wrappers that rent compute from larger cloud providers, Lyceum operates its own GPU infrastructure. This structural advantage allows Lyceum to offer Image Ultra at highly competitive rates without the margin stacking seen elsewhere. You get raw GPU performance, provisioned via 40+ supply-side partners across Europe, without the hyperscaler premium. This approach eliminates the capacity bottlenecks and block-reservation requirements common on public clouds.
Frictionless integration
Switching to Lyceum requires little engineering effort. Because the API follows OpenAI conventions, migrating an existing image generation pipeline mostly means pointing it at the Lyceum images/generations endpoint and swapping in your Lyceum API key; as noted earlier, image requests are sent as a direct HTTP POST rather than through the OpenAI Python client. Whether you are building an interactive design tool or a serverless GPU inference pipeline, Lyceum provides the speed of Image Ultra with the security of EU-sovereign infrastructure. You avoid vendor lock-in while benefiting from per-image billing and enterprise-grade reliability.