TrustedOS: the OS for AI clouds.
Dynamo and vLLM schedule your GPUs. TrustedOS runs your inference business — attested capacity, objective routing, metering, and high-margin models your customers can buy on your hardware.
Sell new models on top of GPU-hours and commodity tokens.
You already sell GPU-hours and commodity tokens. TrustedOS adds a higher-margin product line on the same silicon: composite models your customers can't get anywhere else, sold under your brand.
Composite models are token multipliers. One customer request fans out inside the attested gateway to a panel of models, a judge, and a synthesizer — every inner call is billable inference that lands on your capacity first.
More products per GPU-hour, more revenue per accelerator, one integration.
from openai import OpenAI

client = OpenAI(
    base_url="https://api.trustedrouter.com/v1",
    api_key="sk-tr-v1-...",
)

# Example prompt; replace with your own task.
task = "Summarize the trade-offs between two caching strategies."

# A composite model id packages a whole graph:
# panel of open models -> judge -> synthesizer.
# Inner calls route to your capacity first.
resp = client.chat.completions.create(
    model="trustedrouter/prometheus-1.0",
    messages=[{"role": "user", "content": task}],
)

# 1 customer request above
# = N model calls on your accelerators.
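To make the multiplier concrete, here is a minimal sketch of how one request could expand into inner calls. It is not the gateway's code: `run_composite`, `make_stub`, the stub model names, and the panel size of three are all hypothetical, and it assumes the judge and the synthesizer each run exactly once.

```python
from typing import Callable, List, Tuple

# A "model" here is just a function from prompt text to response text.
Model = Callable[[str], str]


def run_composite(task: str, panel: List[Model], judge: Model,
                  synthesizer: Model) -> Tuple[str, int]:
    """Run a panel -> judge -> synthesizer graph and count inner calls."""
    inner_calls = 0

    # Every panel member answers the same task.
    drafts = []
    for model in panel:
        drafts.append(model(task))
        inner_calls += 1

    # ASSUMPTION: one judge call over all drafts, then one synthesis call.
    verdict = judge("\n".join(drafts))
    inner_calls += 1

    answer = synthesizer(verdict)
    inner_calls += 1

    return answer, inner_calls


def make_stub(name: str) -> Model:
    """Hypothetical stand-in for a real model endpoint."""
    return lambda prompt: f"[{name}] {prompt[:20]}"


panel = [make_stub(f"open-model-{i}") for i in range(3)]
answer, n = run_composite("Compare two retrieval designs.", panel,
                          make_stub("judge"), make_stub("synthesizer"))
print(n)  # 5 inner calls for 1 customer request
```

Under these assumptions N is the panel size plus two, so a wider panel raises the billable inner calls per customer request.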
Host our models.
Offer TrustedRouter composite models — Iris, Prometheus, Zeus, and custom models — to your own customers, under your brand. Prometheus 1.0 scores 69.2 on DRACO at ~$34 per run; the best single frontier model we tested scored 65.3 at ~$250.
You keep the compute margin on inner calls; we take a model royalty. Published evals, reproducible harness.
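The comparison above reduces to simple arithmetic. The sketch below uses only the scores and approximate per-run costs quoted in this section; the dictionary layout and the `cost_per_point` helper are illustrative, not part of any published harness.

```python
# Figures quoted above; both costs are approximate ("~").
runs = {
    "Prometheus 1.0": {"draco_score": 69.2, "cost_usd": 34.0},
    "best single frontier model": {"draco_score": 65.3, "cost_usd": 250.0},
}


def cost_per_point(run: dict) -> float:
    """Dollars spent per DRACO point for one run."""
    return run["cost_usd"] / run["draco_score"]


for name, run in runs.items():
    print(f"{name}: ${cost_per_point(run):.2f} per DRACO point")

composite = runs["Prometheus 1.0"]
frontier = runs["best single frontier model"]

# How many times more the frontier run costs, and the score difference.
cost_ratio = frontier["cost_usd"] / composite["cost_usd"]
score_gap = composite["draco_score"] - frontier["draco_score"]
print(f"frontier run costs {cost_ratio:.1f}x as much; "
      f"composite scores {score_gap:+.1f} points")
```

On these figures the composite works out to roughly $0.49 per point against roughly $3.83, with the frontier run costing about 7.4 times as much.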
Become an attested provider.
Providers are tiered by trust posture: confidential compute, zero-retention, standard. Attested capacity qualifies for trustedrouter/e2e and /zdr traffic that standard providers can never receive.
Routing is earned, not bought: we probe uptime, latency, and throughput continuously and publish what we measure.
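As a rough illustration of tier-gated routing, the sketch below filters providers by trust tier before any traffic is assigned. The tier names come from the text above; the exact mapping from tier to traffic class is an assumption made for this example, as are the provider names and the `eligible_providers` helper.

```python
from enum import Enum


class Tier(Enum):
    CONFIDENTIAL_COMPUTE = "confidential compute"
    ZERO_RETENTION = "zero-retention"
    STANDARD = "standard"


# ASSUMPTION: which tiers may serve which traffic class. The text only
# states that standard providers never receive e2e or zdr traffic.
ELIGIBLE = {
    "e2e": {Tier.CONFIDENTIAL_COMPUTE},
    "zdr": {Tier.CONFIDENTIAL_COMPUTE, Tier.ZERO_RETENTION},
    "standard": {Tier.CONFIDENTIAL_COMPUTE, Tier.ZERO_RETENTION,
                 Tier.STANDARD},
}


def eligible_providers(traffic_class: str, providers: dict) -> list:
    """Return the names of providers allowed to serve a traffic class."""
    allowed = ELIGIBLE[traffic_class]
    return sorted(name for name, tier in providers.items() if tier in allowed)


# Hypothetical fleet.
providers = {
    "cloud-a": Tier.CONFIDENTIAL_COMPUTE,
    "cloud-b": Tier.ZERO_RETENTION,
    "cloud-c": Tier.STANDARD,
}
print(eligible_providers("e2e", providers))  # ['cloud-a']
print(eligible_providers("zdr", providers))  # ['cloud-a', 'cloud-b']
```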
Run TrustedOS.
The full attested stack on your infrastructure: self-hosted gateway and control plane, a model marketplace with owner payouts, objective routing, and metering — the business layer above your serving engine.
Per-model kernel optimization is in private beta; benchmarks publish with it.
The name is a claim we can prove.
In the security world, the “trusted OS” is the operating system inside a trusted execution environment. We picked the name on purpose: TLS terminates inside attested hardware — AMD SEV-SNP and Intel TDX today, with a confidential-computing path to H100/H200-class GPUs — and the gateway fails closed if the measurement doesn't match.
Your customers' compliance teams can verify for themselves the stack you run. Don't trust the policy: verify the code.
# Nonce-bound attestation from the live gateway
NONCE=$(openssl rand -hex 16)
curl -s "https://api.trustedrouter.com/attestation?nonce=$NONCE" \
| jq .
# Response includes:
# eat_nonce — your nonce, replay-protected
# image_digest — SHA-256 of the running container
# pcrs — platform measurements at boot
#
# Match the digest against the published build:
# https://trust.trustedrouter.com
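The two checks described in the comments above can be scripted once the JSON response is in hand. The sketch below is an assumption-heavy illustration: it assumes `eat_nonce` and `image_digest` are top-level string fields and that the digest may carry an optional `sha256:` prefix, and the `check_report` and `normalize_digest` helpers are hypothetical. It only compares fields; it does not validate the hardware attestation signature chain.

```python
import secrets


def normalize_digest(digest: str) -> str:
    """Lowercase and drop an optional 'sha256:' prefix (assumed format)."""
    digest = digest.strip().lower()
    prefix = "sha256:"
    return digest[len(prefix):] if digest.startswith(prefix) else digest


def check_report(report: dict, nonce: str, published_digest: str) -> list:
    """Return a list of problems; an empty list means both checks passed."""
    problems = []

    # Replay protection: the report must echo the nonce we generated.
    if str(report.get("eat_nonce", "")) != nonce:
        problems.append("eat_nonce does not match the nonce we sent")

    # Build identity: the running image must match the published digest.
    reported = normalize_digest(str(report.get("image_digest", "")))
    if reported != normalize_digest(published_digest):
        problems.append("image_digest does not match the published build")

    return problems


# Same shape as `openssl rand -hex 16`.
nonce = secrets.token_hex(16)

# Placeholder digest and report for illustration only; in practice the
# report is the JSON returned by the attestation endpoint.
published = "ab" * 32
sample_report = {"eat_nonce": nonce,
                 "image_digest": "sha256:" + published,
                 "pcrs": {}}
print(check_report(sample_report, nonce, published))  # []

tampered = dict(sample_report, image_digest="cd" * 32)
print(check_report(tampered, nonce, published))
```

Returning a list of problems rather than a boolean lets a compliance script report every mismatch in one pass.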
The boring details that make this real.
Commercial shape? A platform license on active accelerators plus a small share on marketplace-originated traffic only. Your direct traffic is yours — we take nothing on it. Model royalties apply when you resell our composite models. Providers are onboarding at full rate today.
White-label? Yes. Your brand, your customers, your billing relationship. There's no exclusivity in either direction.
Custom silicon? For wafer-scale and dataflow architectures the value is objective routing, fast model onboarding, and demand — sized to your compiler, not GPU kernels.
Source-available? The control plane and gateway are BUSL-1.1: read, build, and verify the exact code — the hash you compute is the hash the enclave reports. Production deployment runs under a commercial license.
Who's behind this? The team building TrustedRouter — the attested LLM gateway with public status, live attestation, and published evals. Start at trust.trustedrouter.com and read the code.
Sell new models on your capacity
Providers onboarding now
Tell us what you run and what you'd like to offer. We'll reach out to get your capacity and models online.
Questions
Isn't NVIDIA Dynamo already the “inference OS”?
Keep Dynamo — and vLLM, SGLang, llm-d. They schedule GPUs inside your cluster: batching, KV cache, disaggregation. TrustedOS is the layer above: objective routing across capacity, composite models, metering, trust tiers, and demand. They compose; they don't compete.
We run custom silicon, not GPUs. Does this apply?
Yes — but differently. Wafer-scale and dataflow architectures have no CUDA-style kernels, so we don't pitch kernel optimization there. For non-GPU fleets TrustedOS brings objective routing, fast model onboarding, and composite-model demand that fans inner calls onto your capacity.
What can we offer today?
Objective routing (price/throughput/latency with fallbacks), privacy-tier routing (zdr/e2e/eu), composite and custom models, prepaid metering with per-key budgets, BYOK, and multi-region attested gateways on two clouds. Self-hosted TrustedOS, marketplace payouts, and per-model kernels (private beta) extend the platform from there.
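For readers who want a feel for objective routing with fallbacks, here is a minimal sketch. It is not the TrustedOS router: the `Provider` fields, the `route` function, the fleet, and all numbers are hypothetical. It ranks healthy providers by the chosen objective; the first name is the primary target and the rest form the fallback order.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Provider:
    name: str
    price_per_mtok: float   # USD per million tokens (lower is better)
    throughput_tps: float   # tokens per second (higher is better)
    latency_ms: float       # time to first token (lower is better)
    healthy: bool = True


# Each objective maps to a sort key where a smaller value ranks first.
SORT_KEYS = {
    "price": lambda p: p.price_per_mtok,
    "throughput": lambda p: -p.throughput_tps,
    "latency": lambda p: p.latency_ms,
}


def route(objective: str, providers: List[Provider]) -> List[str]:
    """Return provider names in try-order: primary first, then fallbacks."""
    candidates = [p for p in providers if p.healthy]
    ranked = sorted(candidates, key=SORT_KEYS[objective])
    return [p.name for p in ranked]


# Hypothetical fleet; "gamma" is excluded because it is unhealthy.
fleet = [
    Provider("alpha", 0.60, 900.0, 220.0),
    Provider("beta", 0.45, 400.0, 350.0),
    Provider("gamma", 0.80, 1500.0, 120.0, healthy=False),
]
print(route("price", fleet))       # ['beta', 'alpha']
print(route("throughput", fleet))  # ['alpha', 'beta']
```

A caller would try each name in order and move to the next on failure, which is one simple way to read "with fallbacks".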
Is the code open?
Source-available under BUSL-1.1: anyone can read, build, and verify the exact code behind the attestation claims — the hash you compute is the hash the enclave reports. Production deployment runs under a commercial license from Lore Hex Corp.
How do we start?
Use the form on the TrustedOS page. Tell us what you run and what you'd like to offer — host composite models under your brand, or qualify capacity for the attested trust tier — and we'll get your capacity and models online.