Simple pricing that scales with usage
Start free, upgrade when usage grows. Transparent plans for builders, teams, and enterprise.
Monthly stays flexible. Switch to annual to save 20%.
Free
- 1 project, 1 environment
- Basic routing + budgets
- Core metrics (tokens, latency, errors)
- Monthly usage cap (~$20 equiv.)
- Community support
Pro
- Multi-project routing policies
- Fallback, caching, rate limits
- Prompt versioning + limited A/B
- Webhooks + basic alerts
- Usage-based overages
Team
- Up to 20 seats included
- Team RBAC + audit v1
- Lower usage rates
- Shared policy/prompt library
- 99.9% SLA
Enterprise
- SSO/SAML + fine-grained RBAC
- Full audit logs and retention
- Tenant isolation + residency options
- PII redaction and safety policies
- 99.95%+ SLA and dedicated support
Modeled across 180+ composite workload profiles
Three realistic scenarios, modeled bills
Numbers below come from the published price list, typical usage, and composite workload assumptions. Real bills vary with cache hit rate, token distribution, and model-pool strategy.
Side-project chatbot · 50K req/mo · 800 avg tokens · cost goal · cache on
- Pro plan base: $29
- 50K req routing (in plan): $0
- Token spend (cache 32%): $5–8
- Trace retention 7d: $0
B2B copilot · 1.2M req/mo · 1.5K avg tokens · balanced strategy · agent runs
- Team plan base: $199
- 0–1M req @ $1.50/M: $1.50
- 1–1.2M req @ $1.20/M: $0.24
- Token + cache + trace: $420–620
- Agent runtime (15K runs): $190–270
Bank support copilot · 25M req/mo · SSO + BAA + EU residency · 99.95% SLA
- Annual Enterprise license: $20k+
- Volume usage (committed): Per contract
- Dedicated SE + Slack: Included
- BAA + DPA + MSA: Included
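The first two scenarios can be totaled directly from their line items. The sketch below only adds up the figures listed above. The names `SIDE_PROJECT`, `B2B_COPILOT`, and `total_range` are illustrative and not part of any product API. The Enterprise scenario is left out because its usage is priced per contract.

```python
# Line items as (label, low USD, high USD), copied from the modeled scenarios.
SIDE_PROJECT = [
    ("Pro plan base", 29.00, 29.00),
    ("50K req routing (in plan)", 0.00, 0.00),
    ("Token spend (cache 32%)", 5.00, 8.00),
    ("Trace retention 7d", 0.00, 0.00),
]

B2B_COPILOT = [
    ("Team plan base", 199.00, 199.00),
    ("0-1M req @ $1.50/M", 1.50, 1.50),
    ("1-1.2M req @ $1.20/M", 0.24, 0.24),
    ("Token + cache + trace", 420.00, 620.00),
    ("Agent runtime (15K runs)", 190.00, 270.00),
]


def total_range(items):
    """Sum the low and high ends of every line item, rounded to cents."""
    low = sum(item[1] for item in items)
    high = sum(item[2] for item in items)
    return round(low, 2), round(high, 2)


if __name__ == "__main__":
    for name, items in [("Side-project chatbot", SIDE_PROJECT),
                        ("B2B copilot", B2B_COPILOT)]:
        low, high = total_range(items)
        print(f"{name}: ${low:,.2f} to ${high:,.2f} per month")
```

Under these line items the side-project chatbot lands at roughly $34 to $37 per month and the B2B copilot at roughly $811 to $1,091 per month.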
Put pricing inside real workload shapes
The numbers below are realistic planning benchmarks built from composite workload profiles, deterministic replay assumptions, and the public pricing model. They are not audited customer production claims.
| Workload | Baseline cost | Routed cost | Savings | Policy |
|---|---|---|---|---|
| Support copilot (2.8M requests/mo · 980 avg tokens · 38% repeatable intents) | $7,840 | $5,360 | 31.6% | intent router + semantic cache + two-step fallback |
| Knowledge QA / RAG (640K queries/mo · 2.7K avg tokens · 21% low-confidence retrieval) | $11,420 | $8,030 | 29.7% | retrieval-confidence routing + eval-backed fallback |
| Sales / CRM agent (1.1M generations/mo · 620 avg tokens · 6 quality gates) | $3,880 | $2,960 | 23.7% | lead-tier routing + quality gates + CRM sync |
| Developer assistant (360K agent runs/mo · 11 tool calls/run · 42% sandboxed writes) | $18,200 | $13,940 | 23.4% | agent runtime + scoped MCP tools + trace replay |
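The Savings column follows from the two cost columns as (baseline − routed) / baseline. A minimal sketch that recomputes it from the table's own figures is shown here. `WORKLOADS` and `savings_pct` are illustrative names, and the costs are the modeled numbers above, not customer data.

```python
# (baseline USD, routed USD) per workload, copied from the table above.
WORKLOADS = {
    "Support copilot": (7840, 5360),
    "Knowledge QA / RAG": (11420, 8030),
    "Sales / CRM agent": (3880, 2960),
    "Developer assistant": (18200, 13940),
}


def savings_pct(baseline: float, routed: float) -> float:
    """Percentage saved relative to the baseline cost, to one decimal place."""
    return round((baseline - routed) / baseline * 100, 1)


if __name__ == "__main__":
    for name, (baseline, routed) in WORKLOADS.items():
        saved = baseline - routed
        print(f"{name}: ${saved:,} saved per month "
              f"({savings_pct(baseline, routed)}%)")
```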
What's in each tier
| Feature | Free | Pro | Team | Enterprise |
|---|---|---|---|---|
| Routing | | | | |
| Multi-model routing | Included | Included | Included | Included |
| Auto fallback chains | Included | Included | Included | Included |
| Per-request budget caps | 1 policy | Unlimited | Unlimited | Unlimited |
| Semantic cache | Not included | Included | Included | Included |
| Custom routing policies | Not included | Included | Included | Included |
| Agents | | | | |
| Agent runtime (sandboxed) | Trial | Included | Included | Included |
| Tool / MCP registry | Not included | Included | Included | Included |
| Streaming + advisor mode | Not included | Included | Included | Included |
| Observability | | | | |
| Trace retention | 24h | 7d | 30d | 90d |
| OpenTelemetry export | Not included | Included | Included | Included |
| Webhooks | Not included | Included | Included | Included |
| Team & access | | | | |
| Seats | 1 | 5 included | 20 included | Unlimited |
| Team RBAC + audit | Not included | Not included | Included | Included |
| SSO / SAML | Not included | Not included | Not included | Included |
| SCIM provisioning | Not included | Not included | Not included | Included |
| Compliance | | | | |
| PII detection / redaction | Detect-only | Included | Included | Included |
| Tenant isolation | Not included | Not included | Included | Included |
| Data residency (US / EU / APAC) | Not included | Not included | Not included | Included |
| BAA (HIPAA) | Not included | Not included | Not included | Included |
| DPA + custom MSA | Not included | Not included | Included | Included |
| Support & SLA | | | | |
| Support channel | Community | Priority email | Dedicated SE + Slack | |
| Response SLA | — | 1 business day | 4 hours | 1 hour (P0) |
| Uptime SLA | — | — | 99.9% | 99.95%+ |
What you pay beyond included plan limits
Each component is priced separately, and you are charged only for what you use. Free has a monthly cap; Pro and Team overages are billed at the rates below; Enterprise runs on committed-use contracts.
| Component | Unit | Price |
|---|---|---|
| Base routing — tier 1 | 0–1M req / mo | $1.50 / 1M |
| Base routing — tier 2 | 1–10M req / mo | $1.20 / 1M |
| Base routing — tier 3 | 10M+ req / mo | $1.00 / 1M |
| Semantic cache — write | per 1M tokens | $0.10 |
| Semantic cache — read | per 1M tokens | $0.01 |
| Cache storage | per GB / mo | $0.15 |
| Tracing | per trace | $0.00015 |
| Trace storage | per GB / mo | $0.08 |
| Agent runtime — compute | per second | $0.0012 |
| Agent runtime — memory | per GB-hour | $0.08 |
| Evaluations — interactive | < 1k / mo | $0.002 |
| Evaluations — batch | ≥ 1k / mo | $0.0015 |
Prices in USD. For volume / committed-use discounts see the Enterprise plan.
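The base routing tiers are graduated: each band of requests is charged at its own rate, as the B2B copilot scenario shows ($1.50 for the first 1M plus $0.24 for the next 0.2M). A minimal sketch of that calculation follows. `ROUTING_TIERS` and `routing_cost` are illustrative names. The function applies the table rates from the first request and ignores any plan-included allowance, since the page does not quantify one.

```python
# Graduated tiers from the price table:
# (upper bound in requests per month, USD per 1M requests).
ROUTING_TIERS = [
    (1_000_000, 1.50),     # tier 1: 0-1M req / mo
    (10_000_000, 1.20),    # tier 2: 1-10M req / mo
    (float("inf"), 1.00),  # tier 3: 10M+ req / mo
]


def routing_cost(requests: int) -> float:
    """Base routing charge in USD for one month, billed band by band."""
    cost = 0.0
    lower = 0
    for upper, price_per_million in ROUTING_TIERS:
        if requests <= lower:
            break
        # Only the requests that fall inside this band use this band's rate.
        in_tier = min(requests, upper) - lower
        cost += in_tier / 1_000_000 * price_per_million
        lower = upper
    return round(cost, 2)


if __name__ == "__main__":
    for monthly_requests in (1_200_000, 25_000_000):
        print(f"{monthly_requests:,} req -> ${routing_cost(monthly_requests):.2f}")
```

For 1.2M requests this gives $1.74, matching the two routing lines in the B2B copilot scenario.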
Four commitments
No surprise bills
Hard budget caps per request, per project, and per account. Hit 80% → email; hit 100% → router refuses to spend more.
Pay for value, not seats
Free seats up to plan limit; only usage scales. Adding a teammate doesn't auto-bill you.
Downgrade any time
Annual customers get pro-rated refunds. No multi-year lock-ins on Pro / Team plans.
Compliance is included
PII detection and redaction are part of every paid plan. Per the tier table, team audit and the DPA start at Team. None of these is sold as a separate upsell.
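The "No surprise bills" thresholds (alert at 80%, refuse at 100%) can be sketched as a small guard. This illustrates the stated behavior and is not the product's implementation. `Budget`, `authorize`, and the `notify` callback are hypothetical names, and refusing any request that would push spend past the cap is an assumption about how "refuses to spend more" is enforced.

```python
from dataclasses import dataclass


@dataclass
class Budget:
    limit_usd: float
    spent_usd: float = 0.0
    alerted: bool = False  # ensures the 80% alert fires only once


def authorize(budget: Budget, cost_usd: float, notify=print) -> bool:
    """Record the spend and return True, or return False if it would exceed the cap."""
    if budget.spent_usd + cost_usd > budget.limit_usd:
        # Assumption: a request that would cross 100% is refused outright.
        return False
    budget.spent_usd += cost_usd
    if not budget.alerted and budget.spent_usd >= 0.8 * budget.limit_usd:
        budget.alerted = True
        notify(f"Budget at {budget.spent_usd / budget.limit_usd:.0%} of cap")
    return True


if __name__ == "__main__":
    project = Budget(limit_usd=10.0)
    for cost in (7.0, 1.5, 2.0):
        allowed = authorize(project, cost)
        print(f"spend ${cost:.2f}: {'allowed' if allowed else 'refused'}")
```

The same guard could be instantiated once per request, per project, and per account to mirror the three cap levels named above.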
Regulated workloads, custom SLA, residency on demand
From $20k/yr. Includes SSO/SAML, SCIM, BAA, custom MSA, a dedicated SE, and a direct Slack channel. Committed-use discounts and multi-year contracts are available.
- ✓ Data residency: US / EU / APAC (US + EU in public beta)
- ✓ 99.95%+ uptime SLA with credits
- ✓ 1-hour P0 response, dedicated Slack channel
- ✓ HIPAA BAA, custom DPA, signed MSA
- ✓ Committed-use discount tiers per contract
- ✓ Private deployment / VPC peering options
Composite buyer language about pricing
In the composite benchmark, migrating 14 AI workloads behind one routing layer cut token spend by 31% and reduced P95 latency by 22% while keeping every decision traceable for audit review.
The router, guardrails, and agent runtime model replaced four homegrown systems in the rollout plan, taking a HIPAA-bound triage copilot from mirror traffic to governed launch in 47 days.
The marketplace pricing model shows how dimensional usage, model quality, and seat controls can become one invoice instead of three disconnected billing systems.
Procurement and POC FAQ
Can we validate savings on our own traffic before buying?
Yes. A typical POC mirrors 1–2 weeks of traffic, builds a single-model baseline, then replays candidate policies without changing end-user behavior.
How do you separate public claims from private diligence material?
Public pages only show product architecture, methodology, and modeled examples. Customer-specific benchmarks, contracts, and compliance evidence are shared by request or under NDA.
What data is retained in traces?
By default traces keep request metadata, routing decisions, token counts, model choice, errors, and timing. Prompt and output content retention can be shortened or disabled per policy.
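As a rough illustration of that default, a trace can be modeled as a record whose metadata fields are always kept and whose prompt and output content is optional. Every field name below is hypothetical; the page lists the categories of retained data but not a schema.

```python
from dataclasses import dataclass, replace
from typing import Optional


@dataclass(frozen=True)
class TraceRecord:
    # Metadata kept by default (field names are assumptions).
    request_id: str
    routing_decision: str
    model: str
    input_tokens: int
    output_tokens: int
    latency_ms: float
    error: Optional[str] = None
    # Content whose retention a policy may shorten or disable.
    prompt: Optional[str] = None
    output: Optional[str] = None


def apply_content_policy(trace: TraceRecord, keep_content: bool) -> TraceRecord:
    """Return the trace unchanged, or a copy with prompt and output dropped."""
    if keep_content:
        return trace
    return replace(trace, prompt=None, output=None)


if __name__ == "__main__":
    trace = TraceRecord("req-1", "cheap-model-first", "model-a",
                        120, 45, 310.0, prompt="hello", output="hi there")
    print(apply_content_policy(trace, keep_content=False))
```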
Do you support regulated workloads?
The platform is designed for regulated review with SSO, RBAC, audit export, PII controls, residency options, DPA workflows, and BAA templates for eligible enterprise customers.
Which teams need to be involved in evaluation?
The strongest evaluations include the product owner, platform/FinOps, security, legal/procurement, and one engineering owner who can compare traces against the current stack.
Ready to start?
The Free plan starts instantly · Pro / Team come with a 14-day full refund · Enterprise goes straight to the founding team.