Cost Optimization Guide

Reduce AI API costs by 18-35% through best practices and smart routing.

Core Strategies

Smart Routing (save 18-25%): automatically select the most cost-effective model.

Semantic Caching (save 30-40%): cache similar queries to reduce redundant calls.

Batch Processing (save 15-20%): combine requests to reduce per-call costs.
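To make the idea behind semantic caching concrete, here is a minimal sketch of a similarity-based cache. It is an illustration only, not how the service implements `cache: true`: the `embed` function is a toy bag-of-words vectorizer standing in for a real embedding model, and the `SemanticCache` class and its `0.8` threshold are hypothetical.

```javascript
// Toy semantic cache. Illustrative only: a real system would use an
// embedding model instead of this bag-of-words stand-in.
function embed(text) {
  const counts = new Map();
  for (const word of text.toLowerCase().match(/[a-z0-9]+/g) ?? []) {
    counts.set(word, (counts.get(word) ?? 0) + 1);
  }
  return counts;
}

// Cosine similarity between two sparse word-count vectors.
function cosineSimilarity(a, b) {
  let dot = 0;
  for (const [word, count] of a) dot += count * (b.get(word) ?? 0);
  const norm = (v) => Math.sqrt([...v.values()].reduce((s, c) => s + c * c, 0));
  const denom = norm(a) * norm(b);
  return denom === 0 ? 0 : dot / denom;
}

class SemanticCache {
  constructor(threshold = 0.8) {
    this.threshold = threshold; // hypothetical similarity cutoff
    this.entries = [];
  }

  // Return the cached response of the most similar stored query,
  // or undefined when nothing reaches the threshold.
  get(query) {
    const vector = embed(query);
    let best;
    let bestScore = this.threshold;
    for (const entry of this.entries) {
      const score = cosineSimilarity(vector, entry.vector);
      if (score >= bestScore) {
        best = entry;
        bestScore = score;
      }
    }
    return best ? best.response : undefined;
  }

  set(query, response) {
    this.entries.push({ vector: embed(query), response });
  }
}

const cache = new SemanticCache();
cache.set("What is the refund policy?", "30 days");
console.log(cache.get("what is the refund policy")); // "30 days"
console.log(cache.get("How do I reset my password?")); // undefined
```

A lower threshold raises the hit rate but also the risk of returning an answer to a different question, so the cutoff is a trade-off between savings and accuracy.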

Implementation Guide

1. Enable Cost-Optimized Routing

const response = await sky.route({
  goal: "cost",                    // Optimize for cost
  strategy: "cost-optimized",      // Aggressive cost reduction
  messages: [/* ...your messages */],
  cache: true,                     // Enable semantic caching
});

2. Set Cost Limits

const response = await sky.route({
  goal: "cost",
  messages: [/* ...your messages */],
  limits: {
    maxCostPerRequest: 0.01,       // Max $0.01 per request
    maxTokens: 500,                // Limit output tokens
  }
});
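When picking a value for `maxCostPerRequest`, it can help to estimate the worst-case cost of a request from its token counts. The sketch below shows the arithmetic; `estimateCostUsd` is a hypothetical helper, and the per-million-token prices are placeholders rather than real rates.

```javascript
// Estimate the cost of one request from token counts.
// Prices are HYPOTHETICAL placeholders in USD per 1M tokens.
function estimateCostUsd({ inputTokens, outputTokens, inputPricePerMTok, outputPricePerMTok }) {
  return (inputTokens * inputPricePerMTok + outputTokens * outputPricePerMTok) / 1_000_000;
}

const worstCase = estimateCostUsd({
  inputTokens: 2000,       // assumed prompt size
  outputTokens: 500,       // same value as maxTokens in the limits example
  inputPricePerMTok: 1.0,  // placeholder price
  outputPricePerMTok: 4.0, // placeholder price
});

console.log(worstCase.toFixed(4)); // "0.0040"
console.log(worstCase <= 0.01);    // true: fits under a $0.01 per-request limit
```

If the estimate exceeds the limit you have in mind, lowering `maxTokens` or shortening the prompt are the two inputs you control directly.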

3. Batch Processing

// Send multiple queries concurrently (one sky.route call per query)
const queries = ["Query 1", "Query 2", "Query 3"];
const responses = await Promise.all(
  queries.map(query => 
    sky.route({
      goal: "cost",
      messages: [{ role: "user", content: query }],
      cache: true
    })
  )
);
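With a large list of queries, `Promise.all` starts every request at once and rejects as soon as any single request fails. A possible variant, sketched below, processes the list in fixed-size chunks and uses `Promise.allSettled` so one failure does not discard the other results. `runInChunks` is a hypothetical helper, and `fakeRoute` is a stand-in for `sky.route` so the example runs on its own.

```javascript
// Run `worker` over `items`, at most `chunkSize` at a time.
// Returns one { status, value | reason } record per item, in input order.
async function runInChunks(items, chunkSize, worker) {
  const results = [];
  for (let i = 0; i < items.length; i += chunkSize) {
    const chunk = items.slice(i, i + chunkSize);
    const settled = await Promise.allSettled(chunk.map((item) => worker(item)));
    results.push(...settled);
  }
  return results;
}

// Stand-in for sky.route (hypothetical, for demonstration only).
async function fakeRoute(query) {
  if (query === "") throw new Error("empty query");
  return { content: `answer to: ${query}` };
}

runInChunks(["Query 1", "", "Query 3"], 2, fakeRoute).then((results) => {
  for (const result of results) {
    if (result.status === "fulfilled") console.log(result.value.content);
    else console.error("failed:", result.reason.message);
  }
});
```

In real use the worker would be something like `(query) => sky.route({ goal: "cost", messages: [{ role: "user", content: query }], cache: true })`.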

Monitor Costs

Use the Analytics API to track and optimize costs:

const analytics = await sky.analytics.usage({
  startDate: "2024-12-01",
  endDate: "2024-12-15",
  groupBy: "model"
});

console.log("Total cost:", analytics.total_cost_usd);
console.log("By model:", analytics.by_model);
console.log("Cache savings:", analytics.cache_stats.savings_usd);
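The fields above can be combined into a single cache savings rate. This is a rough sketch: `cacheSavingsRate` is a hypothetical helper, the sample values are made up, and it assumes `total_cost_usd` is what was actually billed, so that billed cost plus `cache_stats.savings_usd` approximates the cost without caching.

```javascript
// Fraction of the would-be spend that caching avoided.
// ASSUMPTION: total_cost_usd excludes the amount saved by the cache.
function cacheSavingsRate(analytics) {
  const spent = analytics.total_cost_usd;
  const saved = analytics.cache_stats.savings_usd;
  const withoutCache = spent + saved;
  return withoutCache === 0 ? 0 : saved / withoutCache;
}

// Made-up sample shaped like the response fields used above.
const sample = { total_cost_usd: 75, cache_stats: { savings_usd: 25 } };
console.log(`${(cacheSavingsRate(sample) * 100).toFixed(1)}%`); // "25.0%"
```

Tracking this rate over time shows whether caching is still paying off as query patterns change.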

Best Practices

Use nano/mini models for simple tasks

Enable caching for repetitive queries

Limit output tokens

Batch non-real-time requests

Regularly review cost analytics

Set up budget alerts
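As one way to approach budget alerts, the sketch below checks which budget thresholds were crossed between two spend readings (for example, two successive values of `total_cost_usd`). `crossedThresholds` and the default thresholds are hypothetical; this runs client-side and does not represent a built-in alerting feature.

```javascript
// Return the budget fractions crossed between two spend readings.
// Thresholds are fractions of the budget (hypothetical defaults).
function crossedThresholds(previousSpendUsd, currentSpendUsd, budgetUsd, thresholds = [0.5, 0.8, 1.0]) {
  return thresholds.filter(
    (t) => previousSpendUsd < t * budgetUsd && currentSpendUsd >= t * budgetUsd
  );
}

// Spend rose from $40 to $85 against a $100 budget.
console.log(crossedThresholds(40, 85, 100)); // [ 0.5, 0.8 ]
```

Comparing against the previous reading means each threshold fires once, when it is first crossed, rather than on every check.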

Smart routing and these best practices achieve 25% cost savings on average.
