Definitions

Point fields (from /api/metrics)

  • t: ISO 8601 timestamp of the bucket start (hourly buckets for 24h/7d, daily buckets for 30d)
  • costUSD: Sum of Run.costUSD in bucket (rounded to 5 decimals)
  • errorRate: count of runs with status != 'success' divided by total runs in bucket
  • p95LatencyMs: 95th percentile of latencyMs in bucket
  • runs: total runs in bucket
  • tokensIn: sum of inputTokens
  • tokensOut: sum of outputTokens
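As a rough sketch of how one point's bucket start and summed fields could be derived from raw runs: the helper names (bucket_start, sum_fields) and the dict-shaped run records are assumptions made for illustration, not the service's actual implementation. Timestamps are assumed to be timezone-aware.

```python
from datetime import datetime, timezone


def bucket_start(ts: datetime, window: str) -> str:
    """Truncate an aware timestamp to its bucket start (hypothetical helper)."""
    ts = ts.astimezone(timezone.utc)
    if window in ("24h", "7d"):
        # Hourly buckets for the 24h and 7d windows.
        start = ts.replace(minute=0, second=0, microsecond=0)
    elif window == "30d":
        # Daily buckets for the 30d window.
        start = ts.replace(hour=0, minute=0, second=0, microsecond=0)
    else:
        raise ValueError(f"unknown window: {window}")
    # Match the format of the sample point: milliseconds plus a trailing Z.
    return start.strftime("%Y-%m-%dT%H:%M:%S.000Z")


def sum_fields(runs: list) -> dict:
    """Sum the additive point fields over the runs of one bucket.

    Each run is assumed to be a dict with costUSD, inputTokens, outputTokens.
    """
    return {
        "costUSD": round(sum(r["costUSD"] for r in runs), 5),
        "runs": len(runs),
        "tokensIn": sum(r["inputTokens"] for r in runs),
        "tokensOut": sum(r["outputTokens"] for r in runs),
    }
```

errorRate and p95LatencyMs are not plain sums, so they are left out of this sketch.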

Aggregates

  • costByProvider: sum of costUSD by provider
  • costByModel: sum of costUSD by model
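Both aggregates are the same group-and-sum over a different key. A minimal sketch, assuming each run is a dict carrying provider, model and costUSD keys (the cost_by helper is hypothetical):

```python
from collections import defaultdict


def cost_by(runs: list, key: str) -> dict:
    """Sum costUSD grouped by the given run field (e.g. "provider" or "model")."""
    totals = defaultdict(float)
    for run in runs:
        totals[run[key]] += run["costUSD"]
    return dict(totals)


# Made-up runs for illustration only.
runs = [
    {"provider": "a", "model": "m1", "costUSD": 0.5},
    {"provider": "a", "model": "m2", "costUSD": 0.25},
    {"provider": "b", "model": "m3", "costUSD": 0.25},
]
cost_by_provider = cost_by(runs, "provider")
cost_by_model = cost_by(runs, "model")
```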

Reliability metrics

  • Error rate: fraction of runs in the bucket whose status is not 'success'
  • Success rate: 1 - errorRate
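The two reliability metrics can be sketched as follows. Returning 0.0 for an empty bucket is an assumption; the source does not say what the API reports when a bucket has no runs.

```python
def error_rate(statuses: list) -> float:
    """Fraction of runs whose status is anything other than 'success'."""
    if not statuses:
        return 0.0  # Assumption: an empty bucket counts as error-free.
    errors = sum(1 for status in statuses if status != "success")
    return errors / len(statuses)


def success_rate(statuses: list) -> float:
    """Complement of the error rate."""
    return 1 - error_rate(statuses)
```

For three successful runs and one failed run, error_rate gives 0.25 and success_rate gives 0.75.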

Percentiles

  • p50/p95/p99: computed over the set of latencyMs values in the bucket.
  • The current API returns only p95LatencyMs.
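The source does not say which percentile method the API uses. The sketch below uses the nearest-rank method as an assumption; interpolating methods can give slightly different values on small buckets.

```python
import math


def percentile(values: list, p: int):
    """Nearest-rank percentile: the smallest value with at least p% of values at or below it.

    Returns None for an empty bucket (an assumption, not documented behavior).
    """
    if not values:
        return None
    ordered = sorted(values)
    # Multiply before dividing so exact multiples of 100 stay exact.
    rank = math.ceil(p * len(ordered) / 100)
    return ordered[max(rank, 1) - 1]


latencies_ms = [100, 200, 300, 400, 850]  # made-up latencyMs values
p50 = percentile(latencies_ms, 50)
p95 = percentile(latencies_ms, 95)
```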

Cost normalization

  • Cost per 1k tokens (example): costUSD / ((tokensIn + tokensOut) / 1000), defined only when tokensIn + tokensOut > 0.
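A minimal sketch of the normalization with a guard for the zero-token case. Returning None when there are no tokens is an assumption; callers may prefer 0 or to skip the point.

```python
def cost_per_1k_tokens(cost_usd: float, tokens_in: int, tokens_out: int):
    """Cost normalized to 1,000 tokens, or None when the bucket has no tokens."""
    total_tokens = tokens_in + tokens_out
    if total_tokens <= 0:
        return None
    return cost_usd / (total_tokens / 1000)
```

With costUSD 0.12345, tokensIn 1200 and tokensOut 800, this comes to 0.12345 / 2 = 0.061725 USD per 1k tokens.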

Examples

{ "t": "2025-01-01T01:00:00.000Z", "costUSD": 0.12345, "errorRate": 0.01, "p95LatencyMs": 850, "runs": 42, "tokensIn": 1200, "tokensOut": 800 }
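A client could read such a point roughly as follows. This is a sketch: parse_point is a hypothetical helper, and it assumes t always ends in "Z" as it does in the sample.

```python
import json
from datetime import datetime

SAMPLE = (
    '{ "t": "2025-01-01T01:00:00.000Z", "costUSD": 0.12345, "errorRate": 0.01, '
    '"p95LatencyMs": 850, "runs": 42, "tokensIn": 1200, "tokensOut": 800 }'
)


def parse_point(raw: str) -> dict:
    """Decode one point and convert t into an aware datetime."""
    point = json.loads(raw)
    # datetime.fromisoformat() accepts a trailing "Z" only on Python 3.11+,
    # so rewrite it as an explicit UTC offset first.
    point["t"] = datetime.fromisoformat(point["t"].replace("Z", "+00:00"))
    return point


point = parse_point(SAMPLE)
```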