OpenAI compatible API. Attested gateway. Public status.

Phala performance

Measured TTFT, TTFB, throughput, uptime, and sampled model routes for Phala.

Verify gateway
1 URLbase_url migration
100smodels and routes
0prompt logs by default

phala

305 samples

Provider overview

Continuously sampled provider performance. TrustedRouter reports unsupported route and probe-configuration rows separately from provider downtime. Prompt and output content is not stored.

p50 TTFT1971 ms
p95 TTFT ms
p50 TTFB ms
Throughput44 tok/s
Uptime97.38%

Measured model routes

Modelp50 TTFTp50 TTFBThroughputUptimeConfig excludedSamples
qwen/qwen-2.5-7b-instruct 926 ms 883 ms 100.00% 15
z-ai/glm-4.7-flash 963 ms 872 ms 100.00% 16
qwen/qwen2.5-vl-72b-instruct 993 ms 946 ms 100.00% 14
openai/gpt-oss-120b 1413 ms 1411 ms 100.00% 13
deepseek/deepseek-chat-v3.1 1475 ms 1372 ms 100.00% 28
google/gemma-3-27b-it 1616 ms 1578 ms 100.00% 18
qwen/qwen3-vl-30b-a3b-instruct 1770 ms 1769 ms 100.00% 17
qwen/qwen3-30b-a3b-instruct-2507 1805 ms 1802 ms 100.00% 23
openai/gpt-oss-20b 1971 ms 1968 ms 100.00% 18
moonshotai/kimi-k2.5 2068 ms 2056 ms 94.74% 19
moonshotai/kimi-k2.6 2135 ms 1859 ms 44 tok/s 96.55% 29
z-ai/glm-4.7 2297 ms 2193 ms 94.74% 19
minimax/minimax-m2.5 2329 ms 2225 ms 100.00% 16
z-ai/glm-5.1 2417 ms 2393 ms 73.68% 19
z-ai/glm-5 2514 ms 2511 ms 100.00% 17
deepseek/deepseek-v3.2 2841 ms 2840 ms 100.00% 9
qwen/qwen3.5-397b-a17b 2885 ms 2851 ms 100.00% 15

Sign in

Choose a sign in method.