OpenAI compatible API. Attested gateway. Public status.
Nebius Token Factory
Nebius Token Factory models on TrustedRouter with prices, routes, policy notes, and source links.
1 URLbase_url migration
100smodels and routes
0prompt logs by default
nebius
No logs
| Provider | Nebius Token Factory |
|---|---|
| Models | 31 public models |
| Prepaid routes | 29 |
| BYOK routes | 31 |
| Zero data retention | yes |
| Confidential compute | not claimed |
| Provider E2EE | not claimed |
| Policy note | Marked ZDR via TrustedRouter's arrangement — Nebius RETAINS inputs/outputs by default (for speculative decoding); zero retention is an opt-in control, which the deployed Nebius account has enabled. Nebius does not train on customer data. Policy source |
Measured performance
256 samplesContinuously sampled across Nebius Token Factory's routed models — p50 TTFT, throughput, and success rate. Unsupported route and probe-configuration rows are separated from provider downtime. No prompt or output content stored.
| p50 TTFT | 1629 ms |
|---|---|
| Throughput | — |
| Uptime | 96.88% |
| Model | p50 TTFT | p50 TTFB | Throughput | Uptime | Config excluded | Samples |
|---|---|---|---|---|---|---|
| nvidia/Llama-3_1-Nemotron-Ultra-253B-v1 | 773 ms | 772 ms | — | 100.00% | — | 7 |
| NousResearch/Hermes-4-70B | 816 ms | 814 ms | — | 100.00% | — | 16 |
| Qwen/Qwen3-235B-A22B-Instruct-2507 | 1068 ms | 965 ms | — | 100.00% | — | 11 |
| openai/gpt-oss-120b | 1123 ms | 1122 ms | — | 100.00% | — | 10 |
| meta-llama/Llama-3.3-70B-Instruct | 1142 ms | 1076 ms | — | 100.00% | — | 8 |
| Qwen/Qwen3-30B-A3B-Instruct-2507 | 1171 ms | 1068 ms | — | 100.00% | — | 10 |
| NousResearch/Hermes-4-405B | 1183 ms | 1087 ms | — | 100.00% | — | 6 |
| google/gemma-3-27b-it | 1210 ms | 1178 ms | — | 100.00% | — | 11 |
| Qwen/Qwen3-32B | 1277 ms | 1192 ms | — | 100.00% | — | 9 |
| Qwen/Qwen2.5-VL-72B-Instruct | 1286 ms | 1246 ms | — | 100.00% | — | 16 |
| deepseek-ai/DeepSeek-V3.2 | 1333 ms | 1314 ms | — | 100.00% | — | 9 |
| moonshotai/Kimi-K2.5-fast | 1467 ms | 1441 ms | — | 83.33% | — | 12 |
| Qwen/Qwen3-Next-80B-A3B-Thinking-fast | 1629 ms | 1626 ms | — | 100.00% | — | 12 |
| deepseek-ai/DeepSeek-V3.2-fast | 1646 ms | 1619 ms | — | 100.00% | — | 18 |
| Qwen/Qwen3-235B-A22B-Thinking-2507-fast | 1671 ms | 1568 ms | — | 100.00% | — | 16 |
| openai/gpt-oss-120b-fast | 1676 ms | 1572 ms | — | 100.00% | — | 6 |
| deepseek-ai/DeepSeek-V4-Pro | 1745 ms | 1743 ms | — | 92.31% | — | 13 |
| MiniMaxAI/MiniMax-M2.5-fast | 1760 ms | 1757 ms | — | 80.00% | — | 5 |
| moonshotai/Kimi-K2.5 | 1786 ms | 1683 ms | — | 100.00% | 1 probe_config_error |
15 |
| Qwen/Qwen3.5-397B-A17B-fast | 1804 ms | 1699 ms | — | 100.00% | — | 7 |
| nvidia/nemotron-3-super-120b-a12b | 1818 ms | 1715 ms | — | 100.00% | — | 7 |
| zai-org/GLM-5 | 1978 ms | 1975 ms | — | 100.00% | — | 12 |
| Qwen/Qwen3-Next-80B-A3B-Thinking | 2065 ms | 1963 ms | — | 100.00% | — | 13 |
| zai-org/GLM-5.1 | 7485 ms | 7381 ms | — | 42.86% | — | 7 |
Provider models
Models served by Nebius Token Factory.
Each row links to pricing, provider, benchmark, and API pages for the model.
| Model | Context | Endpoints | Prompt | Completion | Routes |
|---|---|---|---|---|---|
MiniMaxAI/MiniMax-M2.5MiniMax M2.5 |
204,800 | 2 | $0.33/1M | $1.32/1M | prepaid BYOK |
MiniMaxAI/MiniMax-M2.5-fastMiniMax M2.5 fast |
204,800 | 2 | $0.66/1M | $2.64/1M | prepaid BYOK |
NousResearch/Hermes-4-405BHermes 4 405B |
131,072 | 2 | $1.1/1M | $3.3/1M | prepaid BYOK |
NousResearch/Hermes-4-70BHermes 4 70B |
131,072 | 2 | $0.143/1M | $0.44/1M | prepaid BYOK |
PrimeIntellect/INTELLECT-3INTELLECT 3 |
131,072 | 2 | $1.1/1M | $3.3/1M | prepaid BYOK |
Qwen/Qwen2.5-VL-72B-InstructQwen2.5 VL 72B Instruct |
32,768 | 2 | $0.22/1M | $0.77/1M | prepaid BYOK |
Qwen/Qwen3-235B-A22B-Instruct-2507Qwen3 235B A22B Instruct 2507 |
131,072 | 2 | $0.22/1M | $0.66/1M | prepaid BYOK |
Qwen/Qwen3-235B-A22B-Thinking-2507-fastQwen3 235B A22B Thinking 2507 fast |
131,072 | 2 | $0.44/1M | $1.76/1M | prepaid BYOK |
Qwen/Qwen3-30B-A3B-Instruct-2507Qwen3 30B A3B Instruct 2507 |
131,072 | 2 | $0.11/1M | $0.33/1M | prepaid BYOK |
Qwen/Qwen3-32BQwen3 32B |
131,072 | 2 | $0.11/1M | $0.33/1M | prepaid BYOK |
Qwen/Qwen3-Next-80B-A3B-ThinkingQwen3 Next 80B A3B Thinking |
131,072 | 2 | $0.165/1M | $1.65/1M | prepaid BYOK |
Qwen/Qwen3-Next-80B-A3B-Thinking-fastQwen3 Next 80B A3B Thinking fast |
131,072 | 2 | $0.33/1M | $3.3/1M | prepaid BYOK |
Qwen/Qwen3.5-397B-A17BQwen3.5 397B A17B |
262,144 | 2 | $0.66/1M | $3.96/1M | prepaid BYOK |
Qwen/Qwen3.5-397B-A17B-fastQwen3.5 397B A17B fast |
262,144 | 2 | $0.66/1M | $3.96/1M | prepaid BYOK |
deepseek-ai/DeepSeek-V3.2DeepSeek V3.2 |
163,840 | 2 | $0.55/1M | $1.65/1M | prepaid BYOK |
deepseek-ai/DeepSeek-V3.2-fastDeepSeek V3.2 fast |
163,840 | 2 | $0.825/1M | $2.475/1M | prepaid BYOK |
deepseek-ai/DeepSeek-V4-ProDeepSeek V4 Pro |
1,048,576 | 2 | $1.859/1M | $3.718/1M | prepaid BYOK |
google/gemma-2-2b-itgemma 2 2b it |
8,192 | 1 | $0.022/1M | $0.066/1M | BYOK |
google/gemma-3-27b-itGoogle: Gemma 3 27B |
131,072 | 2 | $0.1309/1M | $0.22/1M | prepaid BYOK |
meta-llama/Llama-3.3-70B-InstructLlama 3.3 70B Instruct |
131,072 | 2 | $0.143/1M | $0.44/1M | prepaid BYOK |
meta-llama/Meta-Llama-3.1-8B-InstructMeta Llama 3.1 8B Instruct |
128,000 | 1 | $0.022/1M | $0.066/1M | BYOK |
moonshotai/Kimi-K2.5Kimi K2.5 |
262,144 | 2 | $0.66/1M | $3.3/1M | prepaid BYOK |
moonshotai/Kimi-K2.5-fastKimi K2.5 fast |
262,144 | 2 | $1.1/1M | $5.28/1M | prepaid BYOK |
nvidia/Llama-3_1-Nemotron-Ultra-253B-v1Llama 3_1 Nemotron Ultra 253B v1 |
128,000 | 2 | $0.66/1M | $1.98/1M | prepaid BYOK |
nvidia/NVIDIA-Nemotron-3-Nano-30B-A3BNVIDIA Nemotron 3 Nano 30B A3B |
131,072 | 2 | $0.11/1M | $0.33/1M | prepaid BYOK |
nvidia/Nemotron-3-Nano-OmniNemotron 3 Nano Omni |
131,072 | 2 | $0.165/1M | $0.495/1M | prepaid BYOK |
nvidia/nemotron-3-super-120b-a12bnemotron 3 super 120b a12b |
131,072 | 2 | $0.66/1M | $1.98/1M | prepaid BYOK |
openai/gpt-oss-120bOpenAI: gpt-oss-120b |
131,072 | 2 | $0.165/1M | $0.66/1M | prepaid BYOK |
openai/gpt-oss-120b-fastgpt oss 120b fast |
131,072 | 2 | $0.33/1M | $1.32/1M | prepaid BYOK |
zai-org/GLM-5GLM 5 |
202,800 | 2 | $1.1/1M | $3.52/1M | prepaid BYOK |
zai-org/GLM-5.1GLM 5.1 |
204,800 | 2 | $1.54/1M | $4.84/1M | prepaid BYOK |