For neoclouds, AI clouds, and hardware providers

TrustedOS: the OS for AI clouds.

Dynamo and vLLM schedule your GPUs. TrustedOS runs your inference business — attested capacity, objective routing, metering, and high-margin models your customers can buy on your hardware.


30+ providers routed today

3 attested regions, multi-cloud

0 prompt logs, verifiable

AMD SEV-SNP: Attested confidential VMs on GCP Confidential Space.

Intel TDX: Trust domains in our live gateway regions.

NVIDIA CC: Confidential-computing path for H100/H200-class GPUs.

AWS Nitro Enclaves: Same attested code path on a second cloud.

The opportunity

Sell new models on top of GPU-hours and commodity tokens.

You already sell GPU-hours and commodity tokens. TrustedOS adds a higher-margin product line on the same silicon: composite models your customers can't get anywhere else, sold under your brand.

Composite models are token multipliers. One customer request fans out inside the attested gateway to a panel of models, a judge, and a synthesizer — every inner call is billable inference that lands on your capacity first.

More products per GPU-hour, more revenue per accelerator, one integration.


One request, many inner calls (Python)

from openai import OpenAI

client = OpenAI(
    base_url="https://api.trustedrouter.com/v1",
    api_key="sk-tr-v1-...",
)

# A composite model id packages a whole graph:
# panel of open models -> judge -> synthesizer.
# Inner calls route to your capacity first.
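task = "Compare two approaches to KV-cache reuse."  # example prompt
resp = client.chat.completions.create(
    model="trustedrouter/prometheus-1.0",
    messages=[{"role": "user", "content": task}],
)
print(resp.choices[0].message.content)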

# 1 customer request above
# = N model calls on your accelerators.
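
To make the fan-out concrete, here is a minimal local sketch of a panel -> judge -> synthesizer graph. It is not TrustedRouter's implementation: `CallCounter`, `run_composite`, and the stub models are hypothetical names, and the stubs return canned strings instead of running inference. The only point is to show how one request turns into several inner calls.

```python
from dataclasses import dataclass, field
from typing import Callable, List

# Hypothetical stand-in for a model endpoint: takes a prompt, returns text.
Model = Callable[[str], str]


@dataclass
class CallCounter:
    """Records each inner call so the fan-out per request is visible."""
    calls: List[str] = field(default_factory=list)

    def wrap(self, name: str, model: Model) -> Model:
        def counted(prompt: str) -> str:
            self.calls.append(name)
            return model(prompt)
        return counted


def run_composite(task: str, panel: List[Model], judge: Model, synthesizer: Model) -> str:
    """Panel drafts answers, the judge reviews them, the synthesizer writes the result."""
    drafts = [member(task) for member in panel]
    numbered = "\n".join(f"[{i}] {d}" for i, d in enumerate(drafts))
    verdict = judge(f"Task: {task}\nDrafts:\n{numbered}")
    return synthesizer(f"Task: {task}\nDrafts:\n{numbered}\nJudge: {verdict}")


# Stub models (assumptions for illustration; real ones would be inference calls).
counter = CallCounter()
panel = [
    counter.wrap("panel-a", lambda p: "draft from A"),
    counter.wrap("panel-b", lambda p: "draft from B"),
    counter.wrap("panel-c", lambda p: "draft from C"),
]
judge = counter.wrap("judge", lambda p: "prefer [1]")
synthesizer = counter.wrap("synth", lambda p: "final answer")

answer = run_composite("Explain KV-cache reuse.", panel, judge, synthesizer)
inner_calls = len(counter.calls)  # 3 panel + 1 judge + 1 synthesizer
```

With a three-model panel, the single call to `run_composite` records five inner calls in `counter.calls`.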

Models

Host our models.

Offer TrustedRouter composite models — Iris, Prometheus, Zeus, and custom models — to your own customers, under your brand. Prometheus 1.0 scores 69.2 on DRACO at ~$34 per run; the best single frontier model we tested scored 65.3 at ~$250.
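
The comparison above reduces to simple arithmetic. The sketch below uses only the two score and cost figures quoted in the text; summarizing them as "cost per DRACO point" is an illustrative assumption, not a published metric.

```python
# Figures quoted above (costs are approximate, as stated in the text).
prometheus = {"score": 69.2, "cost_usd": 34.0}
frontier = {"score": 65.3, "cost_usd": 250.0}


def cost_per_point(entry: dict) -> float:
    """Dollars spent per benchmark point for one run."""
    return entry["cost_usd"] / entry["score"]


score_gain = prometheus["score"] - frontier["score"]        # benchmark points
cost_ratio = frontier["cost_usd"] / prometheus["cost_usd"]  # how many times cheaper per run

print(f"score gain: {score_gain:.1f} points")
print(f"cost ratio: {cost_ratio:.1f}x")
print(f"$/point: {cost_per_point(prometheus):.2f} vs {cost_per_point(frontier):.2f}")
```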

You keep the compute margin on inner calls; we take a model royalty. Published evals, reproducible harness.

Trust

Become an attested provider.

Providers are tiered by trust posture: confidential compute, zero-retention, standard. Attested capacity qualifies for trustedrouter/e2e and /zdr traffic that standard providers can never receive.

Routing is earned, not bought: we probe uptime, latency, and throughput continuously and publish what we measure.
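
The tiering described above can be sketched as an eligibility filter. The three tier names come from the text; the mapping from traffic class to minimum tier (`zdr` needs zero-retention or better, `e2e` needs confidential compute) is an assumption for illustration, as are the `Tier`, `MIN_TIER`, and `eligible` names and the sample providers.

```python
from enum import IntEnum
from typing import Dict, List


class Tier(IntEnum):
    """Trust postures named in the text, ordered weakest to strongest."""
    STANDARD = 0
    ZERO_RETENTION = 1
    CONFIDENTIAL_COMPUTE = 2


# Assumed minimum tier per traffic class (hypothetical mapping).
MIN_TIER: Dict[str, Tier] = {
    "default": Tier.STANDARD,
    "zdr": Tier.ZERO_RETENTION,
    "e2e": Tier.CONFIDENTIAL_COMPUTE,
}


def eligible(providers: Dict[str, Tier], traffic_class: str) -> List[str]:
    """Return providers whose tier meets the class minimum, sorted by name."""
    minimum = MIN_TIER.get(traffic_class)
    if minimum is None:
        return []  # fail closed on an unrecognized class
    return sorted(name for name, tier in providers.items() if tier >= minimum)


# Made-up providers, one per tier.
providers = {
    "alpha": Tier.STANDARD,
    "bravo": Tier.ZERO_RETENTION,
    "charlie": Tier.CONFIDENTIAL_COMPUTE,
}

e2e_pool = eligible(providers, "e2e")
zdr_pool = eligible(providers, "zdr")
```

Under these assumptions the standard provider never appears in `e2e_pool` or `zdr_pool`, which mirrors the claim that standard capacity cannot receive that traffic.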

Platform

Run TrustedOS.

The full attested stack on your infrastructure: self-hosted gateway and control plane, a model marketplace with owner payouts, objective routing, and metering — the business layer above your serving engine.

Per-model kernel optimization is in private beta; benchmarks publish with it.

| | Dynamo · vLLM · llm-d (keep them) | TrustedOS (the layer above) |
|---|---|---|
| Job | Schedules GPUs: batching, KV cache, disaggregation | Runs the business: routing objectives, models, metering, trust |
| Routing | KV-aware, inside one cluster | Objective-based across capacity: price, throughput, latency, privacy tier |
| Monetization | Bring your own billing | Prepaid metering, per-key budgets, spend alerts, usage broadcast |
| Demand | Serving only | Routed traffic plus composite models your customers can buy on your hardware |
| Trust | Separate concern | Hardware attestation, fail-closed gateways, public evidence |
| Hardware | NVIDIA-first | Neutral: GPU fleets today; custom silicon via routing, onboarding, and demand |
| Relationship | You run it | Runs alongside it; TrustedOS complements your serving engine |
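
As a rough illustration of the Routing row, objective-based selection can be sketched as ranking healthy endpoints by one metric and using the rest of the ranking as fallbacks. This is a minimal sketch, not the TrustedOS router: the `Endpoint` fields, the `route` function, and every number in the sample fleet are made up for the example.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Endpoint:
    name: str
    price_per_mtok: float   # USD per million tokens (lower is better)
    throughput_tps: float   # tokens per second (higher is better)
    latency_ms: float       # time to first token (lower is better)
    healthy: bool = True


# Sort keys per objective; throughput is negated so smaller is always better.
OBJECTIVES = {
    "price": lambda e: e.price_per_mtok,
    "throughput": lambda e: -e.throughput_tps,
    "latency": lambda e: e.latency_ms,
}


def route(endpoints: List[Endpoint], objective: str) -> List[str]:
    """Return healthy endpoints ordered best-first; later entries are fallbacks."""
    key = OBJECTIVES[objective]  # raises KeyError on an unknown objective
    ranked = sorted((e for e in endpoints if e.healthy), key=key)
    return [e.name for e in ranked]


# Hypothetical fleet; the unhealthy endpoint is skipped despite its low price.
fleet = [
    Endpoint("gpu-east", price_per_mtok=0.60, throughput_tps=90.0, latency_ms=400.0),
    Endpoint("gpu-west", price_per_mtok=0.45, throughput_tps=70.0, latency_ms=650.0),
    Endpoint("wafer-1", price_per_mtok=0.80, throughput_tps=900.0, latency_ms=150.0),
    Endpoint("gpu-down", price_per_mtok=0.10, throughput_tps=50.0, latency_ms=300.0, healthy=False),
]

by_price = route(fleet, "price")
by_latency = route(fleet, "latency")
```

A caller would try `by_price[0]` first and move down the list on failure. A privacy-tier filter could be applied before ranking so that only qualifying capacity is considered.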

Why “Trusted” OS

The name is a claim we can prove.

In the security world, the “trusted OS” is the operating system inside a trusted execution environment. We picked the name on purpose: TLS terminates inside attested hardware — AMD SEV-SNP and Intel TDX today, with a confidential-computing path to H100/H200-class GPUs — and the gateway fails closed if the measurement doesn't match.

Your customers' compliance teams can verify the stack you run for themselves. Don't trust the policy — verify the code.


Verify the gateway (curl + jq)

# Nonce-bound attestation from the live gateway
NONCE=$(openssl rand -hex 16)
curl -s "https://api.trustedrouter.com/attestation?nonce=$NONCE" \
  | jq .

# Response includes:
#   eat_nonce     — your nonce, replay-protected
#   image_digest  — SHA-256 of the running container
#   pcrs          — platform measurements at boot
#
# Match the digest against the published build:
#   https://trust.trustedrouter.com
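
The same checks can be scripted. The sketch below assumes the response is a flat JSON object with the `eat_nonce` and `image_digest` fields listed above; that shape, the `check_attestation` helper, and the placeholder digest are assumptions. It compares fields only and does not validate the hardware-signed evidence, which a full verifier would also need to do.

```python
import hmac
import secrets
from typing import Any, Mapping


def check_attestation(doc: Mapping[str, Any], nonce: str, expected_digest: str) -> None:
    """Raise ValueError unless the nonce and image digest both match (fail closed)."""
    got_nonce = str(doc.get("eat_nonce", ""))
    got_digest = str(doc.get("image_digest", "")).lower()
    # compare_digest gives a constant-time comparison on the encoded bytes.
    if not hmac.compare_digest(got_nonce.encode(), nonce.encode()):
        raise ValueError("nonce mismatch: response may be replayed")
    if not hmac.compare_digest(got_digest.encode(), expected_digest.lower().encode()):
        raise ValueError("image digest does not match the published build")


# Offline example with a made-up response; no network call is made here.
nonce = secrets.token_hex(16)      # same shape as `openssl rand -hex 16`
published = "sha256:" + "ab" * 32  # placeholder, not a real digest
sample = {"eat_nonce": nonce, "image_digest": published, "pcrs": {}}
check_attestation(sample, nonce, published)  # returns None when both fields match
```

In practice `sample` would be the parsed body of the attestation request shown above, and `published` would be copied from the published build page.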

Working with us

The boring details that make this real.

Commercial shape? A platform license on active accelerators plus a small share on marketplace-originated traffic only. Your direct traffic is yours — we take nothing on it. Model royalties apply when you resell our composite models. Providers are onboarding at full rate today.

White-label? Yes. Your brand, your customers, your billing relationship. There's no exclusivity in either direction.

Custom silicon? For wafer-scale and dataflow architectures the value is objective routing, fast model onboarding, and demand — sized to your compiler, not GPU kernels.

Source-available? The control plane and gateway are BUSL-1.1: read, build, and verify the exact code — the hash you compute is the hash the enclave reports. Production deployment runs under a commercial license.

Who's behind this? The team building TrustedRouter — the attested LLM gateway with public status, live attestation, and published evals. Start at trust.trustedrouter.com and read the code.

Sell new models on your capacity

providers onboarding now

Tell us what you run and what you'd like to offer. We'll reach out to get your capacity and models online.

Questions

Isn't NVIDIA Dynamo already the 'inference OS'?

Keep Dynamo — and vLLM, SGLang, llm-d. They schedule GPUs inside your cluster: batching, KV cache, disaggregation. TrustedOS is the layer above: objective routing across capacity, composite models, metering, trust tiers, and demand. They compose; they don't compete.

We run custom silicon, not GPUs. Does this apply?

Yes — but differently. Wafer-scale and dataflow architectures have no CUDA-style kernels, so we don't pitch kernel optimization there. For non-GPU fleets TrustedOS brings objective routing, fast model onboarding, and composite-model demand that fans inner calls onto your capacity.

What can we offer today?

Objective routing (price/throughput/latency with fallbacks), privacy-tier routing (zdr/e2e/eu), composite and custom models, prepaid metering with per-key budgets, BYOK, and multi-region attested gateways on two clouds. Self-hosted TrustedOS, marketplace payouts, and per-model kernels (private beta) extend the platform from there.
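
Prepaid metering with per-key budgets and spend alerts can be pictured with a small in-memory model. This is a sketch under stated assumptions, not the TrustedOS metering service: the `PrepaidMeter` class, the 80% alert threshold, and the integer micro-dollar amounts are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class PrepaidMeter:
    """Prepaid balance with per-key budgets; amounts are integer micro-dollars."""
    balance: int
    budgets: Dict[str, int]
    alert_fraction: float = 0.8  # assumed alert threshold
    spent: Dict[str, int] = field(default_factory=dict)
    alerts: List[str] = field(default_factory=list)

    def charge(self, key: str, cost: int) -> bool:
        """Debit one request; refuse it if the balance or key budget would be exceeded."""
        used = self.spent.get(key, 0)
        budget = self.budgets.get(key, 0)  # unknown keys have no budget
        if cost > self.balance or used + cost > budget:
            return False
        self.balance -= cost
        self.spent[key] = used + cost
        # Record a spend alert once per key when usage crosses the threshold.
        if self.spent[key] >= self.alert_fraction * budget and key not in self.alerts:
            self.alerts.append(key)
        return True


meter = PrepaidMeter(balance=1_000_000, budgets={"key-a": 500_000, "key-b": 100_000})
first = meter.charge("key-a", 300_000)   # accepted: 60% of the key budget
second = meter.charge("key-a", 150_000)  # accepted: 90% of the key budget, alert recorded
third = meter.charge("key-a", 100_000)   # refused: would exceed the key budget
```

Refusing a request before it runs is what makes the balance prepaid: the debit happens only when both the account balance and the key budget can cover the cost.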

Is the code open?

Source-available under BUSL-1.1: anyone can read, build, and verify the exact code behind the attestation claims — the hash you compute is the hash the enclave reports. Production deployment runs under a commercial license from Lore Hex Corp.

How do we start?

Use the form on the TrustedOS page. Tell us what you run and what you'd like to offer — host composite models under your brand, or qualify capacity for the attested trust tier — and we'll get your capacity and models online.