Overview

Privacy‑first LLM observability and testing. No prompts or model outputs are required or stored. Only usage, cost, latency, status, provider/model, and lightweight metadata are collected. Optional client‑side hashing is supported.

Observability: Real‑time telemetry via Convex (runs_live) and durable history in Postgres (Run).
Testing & Comparison: Experiments and evaluations (schemas present; UI in progress).
Optimization: Cost and latency tracking, model/provider breakdowns, and per‑minute rollups.
Privacy‑by‑Design: Do not send prompts or responses. If you choose to compute a promptHash, keep the salt client‑side. The server also accepts an optional promptPreview field, but you should generally omit it.
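
As an illustration of the client‑side hashing mentioned above, here is a minimal sketch assuming a Node.js runtime. The helper name computePromptHash, the RunTelemetry shape, and the choice of HMAC‑SHA256 are assumptions made for this example, not the SDK's documented API; only the promptHash field name comes from this document.

```typescript
import { createHmac } from "node:crypto";

// Hypothetical helper (not part of the documented SDK): derives a keyed
// hash of the prompt so the raw text never leaves the client.
// The salt stays on the client and is never sent to the server.
export function computePromptHash(prompt: string, salt: string): string {
  return createHmac("sha256", salt).update(prompt, "utf8").digest("hex");
}

// Hypothetical payload shape: metadata only, no prompt or response text.
export interface RunTelemetry {
  provider: string;
  model: string;
  latencyMs: number;
  status: "ok" | "error";
  promptHash?: string;
}

// Example values are placeholders chosen for illustration.
export const telemetry: RunTelemetry = {
  provider: "openrouter",
  model: "example/model-a",
  latencyMs: 412,
  status: "ok",
  promptHash: computePromptHash("What is the capital of France?", "client-only-salt"),
};
```

Identical prompts hashed with the same salt map to the same promptHash, so runs can be grouped without the server ever seeing the text. Changing the salt changes every hash, which is why the salt should stay stable and private on the client.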

Product: visibility into cost, reliability, and model trade‑offs with zero content retention.
Platform/Infra: standardized telemetry, rate limiting, and auth; privacy and compliance defaults.
Research/Applied: run comparisons, track tokens/cost/latency, and prep for evals.

Ingestion API with Zod validation, auth by shared ingest key, and per‑project rate limiting.
Server‑side cost verification: trusts OpenRouter usage.total_cost when it is present; otherwise computes cost from a pricing registry.
Realtime via Convex: runs_live + kpis_1m rollups; dashboard subscribes to live queries.
Postgres history with Prisma models (Run and related models). Optional Convex→PG sync webhook.
SDKs (TS, Python) with zero‑config auto‑extraction of tokens and cost.
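
The cost verification rule listed above can be sketched as follows. This is an illustration under stated assumptions, not the server's actual code: the verifyCost function, the Usage and Price types, the PRICING registry, and the per‑million‑token prices are all hypothetical. Only the rule itself (trust OpenRouter's usage.total_cost, otherwise compute from a pricing registry) comes from this document.

```typescript
// Hypothetical usage shape; total_cost is only trusted for OpenRouter.
interface Usage {
  prompt_tokens: number;
  completion_tokens: number;
  total_cost?: number;
}

// Hypothetical registry entry: USD per one million tokens.
interface Price {
  inputPerMTok: number;
  outputPerMTok: number;
}

// Placeholder prices for illustration only, not real provider pricing.
const PRICING: Record<string, Price> = {
  "example/model-a": { inputPerMTok: 1.0, outputPerMTok: 2.0 },
};

type CostSource = "provider" | "registry" | "unknown";

export function verifyCost(
  provider: string,
  model: string,
  usage: Usage,
): { costUsd: number | null; source: CostSource } {
  // Rule 1: trust the cost reported by OpenRouter when it is present.
  if (provider === "openrouter" && typeof usage.total_cost === "number") {
    return { costUsd: usage.total_cost, source: "provider" };
  }
  // Rule 2: otherwise compute the cost from the pricing registry.
  const price = PRICING[model];
  if (price === undefined) {
    // Unknown model: report no cost rather than guessing.
    return { costUsd: null, source: "unknown" };
  }
  const costUsd =
    (usage.prompt_tokens / 1_000_000) * price.inputPerMTok +
    (usage.completion_tokens / 1_000_000) * price.outputPerMTok;
  return { costUsd, source: "registry" };
}
```

Returning the source alongside the amount lets a dashboard distinguish provider‑reported costs from registry estimates. How the real server handles models missing from the registry is not stated here; returning null is one conservative option.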

See also: 02-quickstart.md, 03-architecture.md, 05-apis.md.