OpenAI compatible API. Attested gateway. Public status.

Nebius Token Factory

Nebius Token Factory models on TrustedRouter with prices, routes, policy notes, and source links.

Verify gateway
1 URLbase_url migration
100smodels and routes
0prompt logs by default

nebius

No logs

All providers

ProviderNebius Token Factory
Models31 public models
Prepaid routes29
BYOK routes31
Zero data retentionyes
Confidential computenot claimed
Provider E2EEnot claimed
Policy noteMarked ZDR via TrustedRouter's arrangement — Nebius RETAINS inputs/outputs by default (for speculative decoding); zero retention is an opt-in control, which the deployed Nebius account has enabled. Nebius does not train on customer data.
Policy source

Measured performance

256 samples

Continuously sampled across Nebius Token Factory's routed models — p50 TTFT, throughput, and success rate. Unsupported route and probe-configuration rows are separated from provider downtime. No prompt or output content stored.

p50 TTFT1629 ms
Throughput
Uptime96.88%
Modelp50 TTFTp50 TTFBThroughputUptimeConfig excludedSamples
nvidia/Llama-3_1-Nemotron-Ultra-253B-v1 773 ms 772 ms 100.00% 7
NousResearch/Hermes-4-70B 816 ms 814 ms 100.00% 16
Qwen/Qwen3-235B-A22B-Instruct-2507 1068 ms 965 ms 100.00% 11
openai/gpt-oss-120b 1123 ms 1122 ms 100.00% 10
meta-llama/Llama-3.3-70B-Instruct 1142 ms 1076 ms 100.00% 8
Qwen/Qwen3-30B-A3B-Instruct-2507 1171 ms 1068 ms 100.00% 10
NousResearch/Hermes-4-405B 1183 ms 1087 ms 100.00% 6
google/gemma-3-27b-it 1210 ms 1178 ms 100.00% 11
Qwen/Qwen3-32B 1277 ms 1192 ms 100.00% 9
Qwen/Qwen2.5-VL-72B-Instruct 1286 ms 1246 ms 100.00% 16
deepseek-ai/DeepSeek-V3.2 1333 ms 1314 ms 100.00% 9
moonshotai/Kimi-K2.5-fast 1467 ms 1441 ms 83.33% 12
Qwen/Qwen3-Next-80B-A3B-Thinking-fast 1629 ms 1626 ms 100.00% 12
deepseek-ai/DeepSeek-V3.2-fast 1646 ms 1619 ms 100.00% 18
Qwen/Qwen3-235B-A22B-Thinking-2507-fast 1671 ms 1568 ms 100.00% 16
openai/gpt-oss-120b-fast 1676 ms 1572 ms 100.00% 6
deepseek-ai/DeepSeek-V4-Pro 1745 ms 1743 ms 92.31% 13
MiniMaxAI/MiniMax-M2.5-fast 1760 ms 1757 ms 80.00% 5
moonshotai/Kimi-K2.5 1786 ms 1683 ms 100.00% 1 probe_config_error 15
Qwen/Qwen3.5-397B-A17B-fast 1804 ms 1699 ms 100.00% 7
nvidia/nemotron-3-super-120b-a12b 1818 ms 1715 ms 100.00% 7
zai-org/GLM-5 1978 ms 1975 ms 100.00% 12
Qwen/Qwen3-Next-80B-A3B-Thinking 2065 ms 1963 ms 100.00% 13
zai-org/GLM-5.1 7485 ms 7381 ms 42.86% 7

Full provider & model leaderboard.

Provider models

Models served by Nebius Token Factory.

Each row links to pricing, provider, benchmark, and API pages for the model.

Model Context Endpoints Prompt Completion Routes
MiniMaxAI/MiniMax-M2.5
MiniMax M2.5
204,800 2 $0.33/1M $1.32/1M prepaid BYOK
MiniMaxAI/MiniMax-M2.5-fast
MiniMax M2.5 fast
204,800 2 $0.66/1M $2.64/1M prepaid BYOK
NousResearch/Hermes-4-405B
Hermes 4 405B
131,072 2 $1.1/1M $3.3/1M prepaid BYOK
NousResearch/Hermes-4-70B
Hermes 4 70B
131,072 2 $0.143/1M $0.44/1M prepaid BYOK
PrimeIntellect/INTELLECT-3
INTELLECT 3
131,072 2 $1.1/1M $3.3/1M prepaid BYOK
Qwen/Qwen2.5-VL-72B-Instruct
Qwen2.5 VL 72B Instruct
32,768 2 $0.22/1M $0.77/1M prepaid BYOK
Qwen/Qwen3-235B-A22B-Instruct-2507
Qwen3 235B A22B Instruct 2507
131,072 2 $0.22/1M $0.66/1M prepaid BYOK
Qwen/Qwen3-235B-A22B-Thinking-2507-fast
Qwen3 235B A22B Thinking 2507 fast
131,072 2 $0.44/1M $1.76/1M prepaid BYOK
Qwen/Qwen3-30B-A3B-Instruct-2507
Qwen3 30B A3B Instruct 2507
131,072 2 $0.11/1M $0.33/1M prepaid BYOK
Qwen/Qwen3-32B
Qwen3 32B
131,072 2 $0.11/1M $0.33/1M prepaid BYOK
Qwen/Qwen3-Next-80B-A3B-Thinking
Qwen3 Next 80B A3B Thinking
131,072 2 $0.165/1M $1.65/1M prepaid BYOK
Qwen/Qwen3-Next-80B-A3B-Thinking-fast
Qwen3 Next 80B A3B Thinking fast
131,072 2 $0.33/1M $3.3/1M prepaid BYOK
Qwen/Qwen3.5-397B-A17B
Qwen3.5 397B A17B
262,144 2 $0.66/1M $3.96/1M prepaid BYOK
Qwen/Qwen3.5-397B-A17B-fast
Qwen3.5 397B A17B fast
262,144 2 $0.66/1M $3.96/1M prepaid BYOK
deepseek-ai/DeepSeek-V3.2
DeepSeek V3.2
163,840 2 $0.55/1M $1.65/1M prepaid BYOK
deepseek-ai/DeepSeek-V3.2-fast
DeepSeek V3.2 fast
163,840 2 $0.825/1M $2.475/1M prepaid BYOK
deepseek-ai/DeepSeek-V4-Pro
DeepSeek V4 Pro
1,048,576 2 $1.859/1M $3.718/1M prepaid BYOK
google/gemma-2-2b-it
gemma 2 2b it
8,192 1 $0.022/1M $0.066/1M BYOK
google/gemma-3-27b-it
Google: Gemma 3 27B
131,072 2 $0.1309/1M $0.22/1M prepaid BYOK
meta-llama/Llama-3.3-70B-Instruct
Llama 3.3 70B Instruct
131,072 2 $0.143/1M $0.44/1M prepaid BYOK
meta-llama/Meta-Llama-3.1-8B-Instruct
Meta Llama 3.1 8B Instruct
128,000 1 $0.022/1M $0.066/1M BYOK
moonshotai/Kimi-K2.5
Kimi K2.5
262,144 2 $0.66/1M $3.3/1M prepaid BYOK
moonshotai/Kimi-K2.5-fast
Kimi K2.5 fast
262,144 2 $1.1/1M $5.28/1M prepaid BYOK
nvidia/Llama-3_1-Nemotron-Ultra-253B-v1
Llama 3_1 Nemotron Ultra 253B v1
128,000 2 $0.66/1M $1.98/1M prepaid BYOK
nvidia/NVIDIA-Nemotron-3-Nano-30B-A3B
NVIDIA Nemotron 3 Nano 30B A3B
131,072 2 $0.11/1M $0.33/1M prepaid BYOK
nvidia/Nemotron-3-Nano-Omni
Nemotron 3 Nano Omni
131,072 2 $0.165/1M $0.495/1M prepaid BYOK
nvidia/nemotron-3-super-120b-a12b
nemotron 3 super 120b a12b
131,072 2 $0.66/1M $1.98/1M prepaid BYOK
openai/gpt-oss-120b
OpenAI: gpt-oss-120b
131,072 2 $0.165/1M $0.66/1M prepaid BYOK
openai/gpt-oss-120b-fast
gpt oss 120b fast
131,072 2 $0.33/1M $1.32/1M prepaid BYOK
zai-org/GLM-5
GLM 5
202,800 2 $1.1/1M $3.52/1M prepaid BYOK
zai-org/GLM-5.1
GLM 5.1
204,800 2 $1.54/1M $4.84/1M prepaid BYOK

Sign in

Choose a sign in method.