Inference, served clean. Frontier models, answered with conviction.

Kataleptic — from the Stoic kataleptikē phantasia, the impression that grasps the world with certainty — is an OpenAI-compatible inference endpoint serving forty-eight frontier open-source and proprietary models through one billing surface, with EU or US residency on demand.

Claim $5 free credit See the price list

· Inference · Perception · Generation · Reasoning · Agency · Inference · Perception · Generation · Reasoning · Agency · Inference · Perception · Generation · Reasoning · Agency · · Inference · Perception · Generation · Reasoning · Agency · Inference · Perception · Generation · Reasoning · Agency · Inference · Perception · Generation · Reasoning · Agency ·

§ 01

One endpoint.

Forty-eight models, three backends, two regions, one OpenAI-compatible surface. No proxy hops, no SDK rewrites, no surprise rate limits.

Catalogue Live

Forty-eight curated models

GPT-5.6 Sol/Terra/Luna, DeepSeek V4 Pro/Flash and V3.1/V3.2/R1, Llama 3.3/4-Maverick, Mistral Large 3 / Medium 3.5 / Nemo, GPT-5/5.2/5.4/5.4-mini/5.5, GPT-OSS 120B, Cohere Command-A(+), Phi-4 Mini Reasoning, Kimi K2.5/K2.6/K2.7-Code, Grok 4.3 and 4.1 Fast, Gemma 3 27B, GLM-4 9B, Qwen 3 8B, Qwen 2.5 Coder, plus realtime voice, image, video and embeddings.

See the price list
Residency Live

EU or US, on demand

Every Azure-backed model is mirrored in Sweden Central. Flip a single per-account toggle and the next request terminates inside the EU — same models, same prices, no separate billing.

Open the account dashboard
Sovereign Live

Sovereign fleet on dedicated metal

For models we self-host, inference runs on our own fleet of GPU servers in our own colocation, orchestrated by xinity-ai. Open-weight only, transparent stack, no upstream provider in the loop.

Live status & latency

§ 02

A line of code,
then you're inferring.

~/kataleptic · cURL

$ curl https://api.kataleptic.com/v1/chat/completions \
  -H "Authorization: Bearer dg_…" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "llama-3.3-70b",
    "messages": [
      { "role": "user",
        "content": "Compose a haiku about sovereignty." }
    ],
    "stream": true
  }'

> data: {"id":"chatcmpl-…","provider":"duguet-ai",
> data:   "choices":[{"delta":{"content":"Quiet servers hum,"}}]}
> data: {"choices":[{"delta":{"content":" data stays where it was born—"}}]}
> data: {"choices":[{"delta":{"content":" sovereignty in code."}}]}
> data: [DONE]

§ 03

The price list

Transparent per-million-token pricing. No top-up fee, no hidden request surcharge. Billed by the token, paid from prepaid credit. First $5 on us.

Where a model supports prompt caching, re-sent context bills at the cached rate shown beneath the input price — one tenth of fresh input, and we charge nothing extra to write the cache. Agents that resend a conversation every turn pay a fraction of the headline number.

Model	Context	Input / MTok	Output / MTok	Backend
— Open-source chat —
GLM-4 9B `glm4-9b`	128k	$0.05	$0.08	Sovereign fleet · xinity
Qwen 3 8B `qwen3-8b`	32k	$0.05	$0.08	Sovereign fleet · xinity
Qwen 2.5 Coder 7B code `qwen2.5-coder-7b`	32k	$0.05	$0.08	Sovereign fleet · xinity
Mistral Nemo 12B `mistral-nemo-12b`	128k	$0.10	$0.15	Sovereign fleet · xinity
Gemma 3 27B `gemma3-27b`	128k	$0.15	$0.20	Sovereign fleet · xinity
Llama 3.3 70B `llama-3.3-70b`	131k	$0.30	$0.40	Azure MaaS
DeepSeek V3.2 `deepseek-v3-2`	164k	$0.40	$0.60	Azure MaaS
Mistral Large 3 `mistral-large-3`	131k	$0.40	$1.20	Azure MaaS
Cohere Command-A new `cohere-command-a`	131k	$2.50	$10.00	Azure MaaS
DeepSeek V3.1 tools new `deepseek-v3-1`	131k	$0.27	$1.10	Azure MaaS
Llama 4 Maverick 17B vision new `llama-4-maverick`	1M	$0.50	$1.50	Azure MaaS
— Reasoning —
GPT-OSS 120B reasoning `gpt-oss-120b`	128k	$0.15	$0.60	Azure MaaS
DeepSeek R1 reasoning `deepseek-r1`	164k	$0.50	$2.00	Azure MaaS
Kimi K2.5 reasoning `kimi-k2.5`	131k	$0.60	$2.40	Azure MaaS
Kimi K2.6 reasoning `kimi-k2.6`	131k	$0.60	$2.40	Azure MaaS
xAI Grok 4.1 Fast reasoning new `grok-4-1-fast`	131k	$0.20	$0.50	Azure MaaS
Phi-4 Mini Reasoning reasoning new `phi-4-mini-reasoning`	128k	$0.10	$0.30	Azure MaaS
— Proprietary —
GPT-5 `gpt-5`	400k	$2.50$0.25 cached	$10.00	Azure OpenAI
GPT-5.2 `gpt-5.2`	400k	$2.50$0.25 cached	$10.00	Azure OpenAI
GPT-5.4 Mini vision new `gpt-5.4-mini`	400k	$0.25$0.025 cached	$2.00	Azure OpenAI
GPT-5.4 vision new `gpt-5.4`	1M	$2.50$0.25 cached	$15.00	Azure OpenAI
GPT-5.5 vision flagship new `gpt-5.5`	1M	$2.50$0.25 cached	$15.00	Azure OpenAI
GPT-5.6 Luna vision new `gpt-5.6-luna`	400k	$1.00$0.10 cached	$6.00	Azure OpenAI
GPT-5.6 Terra vision new `gpt-5.6-terra`	1M	$2.50$0.25 cached	$15.00	Azure OpenAI
GPT-5.6 Sol vision flagship new `gpt-5.6-sol`	1M	$5.00$0.50 cached	$30.00	Azure OpenAI
— Embeddings —
Nomic Embed `nomic-embed`	8k	$0.02	—	Sovereign fleet · xinity
— Image generation · per image, 1024×1024 —
GPT-Image-1 Mini new `gpt-image-1-mini`	—	$0.011 / image		Azure OpenAI
FLUX.2 Pro new `flux-2-pro`	—	$0.040 / image		Azure MaaS
GPT-Image-2 flagship new `gpt-image-2`	—	$0.080 / image · hd ×2		Azure OpenAI
— Speech recognition · per minute of audio —
Whisper Large V3 Turbo new fastest `whisper-large-v3-turbo`	—	$0.003 / min		Sovereign fleet · xinity
Whisper Large V3 new `whisper-large-v3`	—	$0.005 / min		Sovereign fleet · xinity
GPT-4o Transcribe new `gpt-4o-transcribe`	—	$0.006 / min		Azure OpenAI
GPT-4o Transcribe (diarized) new `gpt-4o-transcribe-diarize`	—	$0.024 / min		Azure OpenAI
— Video generation · per second of output, async job —
Sora 2 new `sora-2`	—	$0.20 / sec		Azure OpenAI

A note on price: we aren't chasing the absolute floor. We undercut the major aggregators by 20–40 % on most SKUs and don't charge a top-up fee.

§ Volume discounts Auto-applied · rolling 30-day window · no contracts

$0–$500 / moPay-as-you-golisted price
$500+ / moScale5 % off
$2,000+ / moVolume10 % off
$10,000+ / moEnterprise15 % off · talk to us

§ 04

The workshop

From first principles.

We build the smallest piece that solves the actual problem. Fewer abstractions, more measurements, and a clear paper trail from input to output.

ii.

Sovereign by architecture.

Compute, weights, and data stay where you can audit them. Self-host the entire stack, or use ours under a contract that forbids egress.

iii.

Open where it matters.

We prefer open weights and open formats; we contribute the generalisable parts back. Proprietary work earns its keep on results, not on lock-in.

iv.

Production-shaped.

Every release ships with throughput numbers, latency p50/p99, failure modes, and a runbook. If it cannot be operated, it is not finished.

Claim your API key.

$5 in free credit, no card. Drop-in replacement for OpenAI or OpenRouter clients — point base URL at https://api.kataleptic.com/v1 and go.

Already have an account but lost the key? Recover by email →

One inbox, read by a human · [email protected]