OpenAI-compatible API · Attested · Public status

OpenAI: gpt-oss-120b Performance

Name: OpenAI: gpt-oss-120b TrustedRouter performance measurements
Creator: TrustedRouter
License: https://www.apache.org/licenses/LICENSE-2.0

TrustedRouter performance signals and provider route posture for OpenAI: gpt-oss-120b.

One base URL to migrate

100s of models and routes

No prompt logs by default

`openai/gpt-oss-120b`

Open weights

AI IQ: 103. Public AI IQ rank for gpt-oss-120b: #57.

Measured performance

Continuously sampled p50/p95 time-to-first-token (TTFT), time-to-first-byte (TTFB), throughput, and success rate (the Uptime column) for OpenAI: gpt-oss-120b. Rows for unsupported routes and probe-configuration problems are counted separately from provider downtime, and no prompt or output content is stored.

Provider	p50 TTFT	p95 TTFT	p50 TTFB	Throughput	Uptime	Config excluded	Samples
baseten	2063 ms	24191 ms	2063 ms	—	100.00%	—	19
nebius	3269 ms	13713 ms	3269 ms	—	100.00%	—	5
together	3349 ms	12794 ms	3349 ms	—	88.89%	—	9
crusoe	4432 ms	18843 ms	4431 ms	—	100.00%	—	13
siliconflow	4438 ms	12948 ms	4438 ms	—	100.00%	—	9
deepinfra	4892 ms	14074 ms	4892 ms	—	100.00%	—	5
parasail	5068 ms	11434 ms	5068 ms	—	100.00%	—	7
cerebras	5121 ms	17391 ms	5121 ms	—	100.00%	—	33
tinfoil	5172 ms	15360 ms	5172 ms	51 tok/s	100.00%	—	87
fireworks	7020 ms	14279 ms	7020 ms	—	96.43%	—	28
novita	7350 ms	7553 ms	7350 ms	—	100.00%	—	3
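
As a rough illustration of how columns like these can be derived from raw probe results, the sketch below computes p50/p95 TTFT and an uptime percentage for one provider. The nearest-rank percentile method, the `(ttft_ms, succeeded)` tuple shape, and the function names are assumptions made for this example; the page does not say how TrustedRouter actually aggregates its samples.

```python
import math


def percentile(samples_ms, q):
    """Nearest-rank percentile: the smallest sample with at least q% of
    all samples at or below it. (Assumed method, not confirmed by the page.)"""
    if not samples_ms:
        raise ValueError("no samples")
    ordered = sorted(samples_ms)
    rank = max(1, math.ceil(q / 100 * len(ordered)))
    return ordered[rank - 1]


def summarize(outcomes):
    """Summarize one provider.

    outcomes: list of (ttft_ms or None, succeeded) tuples (hypothetical shape).
    Latency percentiles use successful probes only; uptime uses all probes.
    """
    latencies = [ms for ms, ok in outcomes if ok and ms is not None]
    successes = sum(1 for _, ok in outcomes if ok)
    return {
        "p50_ttft_ms": percentile(latencies, 50),
        "p95_ttft_ms": percentile(latencies, 95),
        "uptime_pct": round(100 * successes / len(outcomes), 2),
        "samples": len(outcomes),
    }
```

Under this method, a provider with only three samples has a p95 equal to its slowest sample, so low-sample rows are best read with caution. Eight successes out of nine probes rounds to 88.89%, which is consistent with the together row, and 27 out of 28 rounds to 96.43%, which is consistent with the fireworks row.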

Provider diversity

23 routes.

More routes give the auto router more room to fail over around provider 429 and 5xx responses.
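
The failover idea can be sketched as a loop that tries routes in order and moves on whenever a provider answers with 429 or a 5xx status. This is a minimal sketch, not the router's actual implementation: `send` is a hypothetical callable supplied by the caller, and the ordering of `routes` is assumed to be decided elsewhere.

```python
def is_retryable(status):
    """429 (rate limited) and 5xx (server error) are worth retrying elsewhere."""
    return status == 429 or 500 <= status <= 599


def call_with_failover(routes, send):
    """Try each route in order until one returns a non-retryable status.

    routes: ordered list of route identifiers (hypothetical names).
    send:   hypothetical callable, send(route) -> (status_code, body).
    Returns (route, status, body, attempts), where attempts records every
    (route, status) pair that was tried.
    """
    attempts = []
    for route in routes:
        status, body = send(route)
        attempts.append((route, status))
        if not is_retryable(status):
            return route, status, body, attempts
    raise RuntimeError(f"all routes failed: {attempts}")
```

With more routes in the list, the loop has more chances to reach a healthy provider before it runs out. Non-retryable client errors such as 400 are returned immediately, since another provider is unlikely to accept the same malformed request.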

Streaming

Gateway overhead is measured separately.

Public status separates TLS/health overhead from full model latency so slow LLMs do not inflate the router metric.
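
One way to keep the two measurements apart is to tag each sample by probe type and aggregate the series independently, so the router metric is computed from health probes only. The `kind` and `latency_ms` field names below are assumptions for illustration and are not taken from the status page.

```python
from statistics import median


def router_and_model_medians(samples):
    """Aggregate health-probe and model-probe latencies separately.

    samples: list of dicts with 'kind' ('health' or 'model') and
    'latency_ms' (hypothetical field names).
    """
    health = [s["latency_ms"] for s in samples if s["kind"] == "health"]
    model = [s["latency_ms"] for s in samples if s["kind"] == "model"]
    return {
        # Gateway metric: TLS/health probes only, never model calls.
        "gateway_p50_ms": median(health) if health else None,
        # Model metric: full LLM round trips, reported on their own.
        "model_p50_ms": median(model) if model else None,
    }
```

Because the two lists never mix, a multi-second model response cannot raise the gateway median.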

Status

Metadata rollups.

Status samples store latency, outcome, provider, model, route, cost, and region metadata only.
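
A metadata-only sample can be modeled as a record with exactly the seven fields listed above, plus a constructor that discards everything else. The field types, the `cost_usd` unit, and the `to_sample` helper are assumptions for this sketch; the page names the fields but does not publish a schema.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class StatusSample:
    latency_ms: int
    outcome: str      # e.g. "success" or "error" (assumed values)
    provider: str
    model: str
    route: str
    cost_usd: float   # unit is an assumption
    region: str


ALLOWED_FIELDS = set(StatusSample.__dataclass_fields__)


def to_sample(raw):
    """Build a sample from a raw probe record, keeping metadata only.

    Any extra keys, such as prompt or output text, are dropped.
    """
    return StatusSample(**{k: v for k, v in raw.items() if k in ALLOWED_FIELDS})
```

Filtering against an allow-list, rather than deleting known sensitive keys, means a newly added content field is excluded by default.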