OpenAI compatible API · Attested · Public status

Nebius Token Factory performance

Name: Nebius Token Factory TrustedRouter performance measurements
Creator: TrustedRouter
License: https://www.apache.org/licenses/LICENSE-2.0

Measured time to first token (TTFT), time to first byte (TTFB), effective throughput, uptime, and sampled model routes for Nebius Token Factory.
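To illustrate how TTFB and TTFT can differ for one streamed response, here is a minimal sketch. It is not TrustedRouter's probe code. It assumes TTFB stops at the first streamed chunk of any kind and TTFT at the first chunk that carries text; the function name `time_stream` and the injectable `clock` are hypothetical.

```python
import time
from typing import Callable, Iterable, Optional, Tuple


def time_stream(
    chunks: Iterable[str],
    clock: Callable[[], float] = time.monotonic,
) -> Tuple[Optional[float], Optional[float]]:
    """Return (ttfb_ms, ttft_ms) for a lazily evaluated stream of text chunks.

    Assumption: an empty chunk stands for a keep-alive or metadata-only event,
    so it counts toward TTFB but not toward TTFT.
    """
    start = clock()
    ttfb_ms: Optional[float] = None
    ttft_ms: Optional[float] = None
    for chunk in chunks:
        elapsed_ms = (clock() - start) * 1000
        if ttfb_ms is None:
            ttfb_ms = elapsed_ms  # first chunk of any kind
        if chunk:
            ttft_ms = elapsed_ms  # first chunk that carries text
            break
    return ttfb_ms, ttft_ms
```

Passing a fake `clock` makes the function testable without a network call. In a real probe the iterable would wrap a streaming response, so the request is sent on the first iteration.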

Verify gateway

One base URL to migrate

100s of models and routes

0 prompt or output logs. Always.

`nebius`

53 samples

Provider overview

Provider performance is sampled continuously. TrustedRouter reports unsupported-route and probe-configuration rows separately from provider downtime. Prompt and output content is not stored.

p50 TTFT	2462 ms
p95 TTFT	9987 ms
p50 TTFB	2353 ms
Effective throughput	112 tok/s n=7
Uptime	98.11%
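The summary figures above can be sketched from raw samples as follows. This is an illustration, not TrustedRouter's implementation: the nearest-rank percentile method and the helper names are assumptions. The 52-of-53 split used in the usage note is inferred from the page itself (53 samples in total, and one route at 50.00% over 2 samples), which is consistent with the reported 98.11%.

```python
import math
from typing import Optional, Sequence


def percentile(samples: Sequence[float], q: float) -> float:
    """Nearest-rank percentile for q in (0, 100]; assumed method, others exist."""
    if not samples:
        raise ValueError("no samples")
    ordered = sorted(samples)
    rank = max(math.ceil(q / 100 * len(ordered)), 1)
    return ordered[rank - 1]


def uptime_percent(successes: int, failures: int) -> Optional[float]:
    """Share of successful availability samples, in percent.

    Assumption: config-excluded rows are dropped before counting, so only
    genuine provider successes and failures reach this function.
    """
    total = successes + failures
    if total == 0:
        return None  # no availability samples, shown as a dash in the table
    return round(100 * successes / total, 2)
```

For example, `uptime_percent(52, 1)` returns `98.11`, and `percentile([659, 726, 1066, 1351], 50)` returns `726`.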

Measured model routes

Model	p50 TTFT	p50 TTFB	Effective throughput	Uptime	Config excluded	Availability samples
openbmb/MiniCPM-V-4_5	659 ms	659 ms	114 tok/s n=1	100.00%	—	3
nvidia/Llama-3_1-Nemotron-Ultra-253B-v1	726 ms	725 ms	—	100.00%	—	2
Qwen/Qwen3-30B-A3B-Instruct-2507	1066 ms	1066 ms	—	100.00%	—	3
NousResearch/Hermes-4-70B	1351 ms	1351 ms	—	100.00%	—	2
moonshotai/Kimi-K2.7-Code	1514 ms	1513 ms	—	100.00%	—	3
MiniMaxAI/MiniMax-M3	1563 ms	1563 ms	112 tok/s n=1	100.00%	—	1
nvidia/Nemotron-3-Nano-Omni	1618 ms	1617 ms	—	100.00%	—	1
openai/gpt-oss-120b	1619 ms	1619 ms	225 tok/s n=1	100.00%	—	2
moonshotai/Kimi-K2.6	1759 ms	1759 ms	—	100.00%	—	4
Qwen/Qwen2.5-VL-72B-Instruct	1855 ms	1854 ms	—	100.00%	—	1
nvidia/nemotron-3-super-120b-a12b	2047 ms	2046 ms	—	100.00%	—	2
zai-org/GLM-5.2	2073 ms	2073 ms	—	100.00%	—	1
NousResearch/Hermes-4-405B	2441 ms	2441 ms	—	100.00%	—	1
google/gemma-3-27b-it	2462 ms	2462 ms	—	100.00%	—	5
nvidia/NVIDIA-Nemotron-3-Nano-30B-A3B	2723 ms	2722 ms	—	100.00%	—	1
Qwen/Qwen3-235B-A22B-Instruct-2507	2736 ms	2735 ms	—	100.00%	—	3
nvidia/Cosmos3-Super-Reasoner	2738 ms	2738 ms	62 tok/s n=1	100.00%	—	3
moonshotai/kimi-k3	3184 ms	3184 ms	80 tok/s n=2	100.00%	—	1
Qwen/Qwen3-Next-80B-A3B-Thinking	3330 ms	3330 ms	—	100.00%	—	3
Qwen/Qwen3-32B	3411 ms	3411 ms	—	100.00%	—	1
deepseek-ai/DeepSeek-V4-Pro	3489 ms	3489 ms	—	100.00%	—	1
meta-llama/Llama-3.3-70B-Instruct	6942 ms	6942 ms	—	100.00%	—	7
zai-org/GLM-5.1	4479 ms	4479 ms	—	50.00%	—	2
nvidia/nemotron-3-ultra-550b-a55b	—	—	112 tok/s n=1	—	—	0
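For readers who want to work with the table programmatically, the following sketch parses one row into typed fields. It assumes the rows are tab-separated exactly as shown, with seven columns and a dash for missing values; `RouteRow` and `parse_row` are hypothetical names, not part of any TrustedRouter API.

```python
import re
from dataclasses import dataclass
from typing import Optional

MISSING = "\u2014"  # the dash the table uses for "no value"


@dataclass
class RouteRow:
    model: str
    p50_ttft_ms: Optional[int]
    p50_ttfb_ms: Optional[int]
    throughput_tok_s: Optional[int]
    throughput_n: Optional[int]
    uptime_pct: Optional[float]
    availability_samples: int


def _ms(cell: str) -> Optional[int]:
    """Turn '1619 ms' into 1619, or the missing marker into None."""
    return None if cell == MISSING else int(cell.split()[0])


def parse_row(line: str) -> RouteRow:
    """Parse one tab-separated table row; the 'Config excluded' cell is ignored."""
    model, ttft, ttfb, tput, uptime, _config, samples = line.rstrip("\n").split("\t")
    match = re.fullmatch(r"(\d+) tok/s n=(\d+)", tput)
    return RouteRow(
        model=model,
        p50_ttft_ms=_ms(ttft),
        p50_ttfb_ms=_ms(ttfb),
        throughput_tok_s=int(match.group(1)) if match else None,
        throughput_n=int(match.group(2)) if match else None,
        uptime_pct=None if uptime == MISSING else float(uptime.rstrip("%")),
        availability_samples=int(samples),
    )
```

Summing `availability_samples` over the parsed rows is a quick consistency check against the 53 samples reported for the provider.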