Overview

What is RunForge

  • Privacy‑first LLM observability and testing. No prompts or model outputs are required or stored. Only usage, cost, latency, status, provider/model, and lightweight metadata are collected. Optional client‑side hashing is supported.
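To make the metadata-only idea concrete, here is a minimal sketch of what a content-free run record could look like. The `RunRecord` class, its field names, the `FORBIDDEN_KEYS` set, and `to_payload` are all hypothetical illustrations, not the RunForge schema or SDK API; see 05-apis.md for the actual fields.

```python
from dataclasses import dataclass, field, asdict
from typing import Optional

# Hypothetical metadata keys that would indicate prompt/response content leaking in.
FORBIDDEN_KEYS = {"prompt", "response", "messages", "completion"}


@dataclass
class RunRecord:
    """Illustrative content-free record: usage, cost, latency, status, provider/model."""
    provider: str
    model: str
    status: str
    latency_ms: int
    input_tokens: int
    output_tokens: int
    cost_usd: float
    metadata: dict = field(default_factory=dict)
    prompt_hash: Optional[str] = None  # optional, computed client-side


def to_payload(record: RunRecord) -> dict:
    """Build a dict for sending, refusing metadata that looks like content."""
    leaked = FORBIDDEN_KEYS & set(record.metadata)
    if leaked:
        raise ValueError(f"metadata must not carry content fields: {sorted(leaked)}")
    payload = asdict(record)
    if payload["prompt_hash"] is None:
        del payload["prompt_hash"]  # omit the optional field entirely when unused
    return payload
```

Guarding the metadata on the client is one way to keep the "no content" promise from depending on server behavior.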

Pillars

  • Observability: Real‑time telemetry via Convex (runs_live) and durable history in Postgres (Run).
  • Testing & Comparison: Experiments and evaluations (schemas present; UI in progress).
  • Optimization: Cost and latency tracking, model/provider breakdowns, and per‑minute rollups.
  • Privacy‑by‑Design: Do not send prompts or responses. If you choose to compute a promptHash, keep the salt client‑side. The server also accepts an optional promptPreview field, but you should generally omit it.
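As an illustration of client-side hashing with a locally held salt, the sketch below uses HMAC-SHA256 from the Python standard library. The function name `compute_prompt_hash` and the choice of HMAC-SHA256 are assumptions made for this example; the document does not specify which hash the SDKs use.

```python
import hashlib
import hmac


def compute_prompt_hash(prompt: str, salt: bytes) -> str:
    """Return a hex digest of the prompt keyed by a salt that never leaves the client.

    Hypothetical helper: HMAC-SHA256 is an assumption, not the documented algorithm.
    """
    return hmac.new(salt, prompt.encode("utf-8"), hashlib.sha256).hexdigest()


# Usage: the salt stays in client-side configuration; only the digest is sent.
salt = b"example-client-side-salt"  # placeholder value for illustration only
digest = compute_prompt_hash("Summarize this document.", salt)
```

Because the salt is kept on the client, the server can group identical prompts by digest without being able to recover or brute-force their content as easily as with an unsalted hash.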

Personas & value

  • Product: visibility into cost, reliability, and model trade‑offs with zero content retention.
  • Platform/Infra: standardized telemetry, rate limiting, and auth; privacy and compliance defaults.
  • Research/Applied: run comparisons, track tokens/cost/latency, and prep for evals.

Key capabilities (as implemented)

  • Ingestion API with Zod validation, auth by shared ingest key, and per‑project rate limiting.
  • Server‑side cost verification: trusts OpenRouter usage.total_cost when present; otherwise computes cost from a pricing registry.
  • Realtime via Convex: runs_live + kpis_1m rollups; dashboard subscribes to live queries.
  • Postgres history with Prisma models (Run et al.). Optional Convex→PG sync webhook.
  • SDKs (TS, Python) with zero‑config auto‑extraction of tokens and cost.
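The cost verification rule above can be sketched as a small function: use the OpenRouter-reported `usage.total_cost` if it is there, otherwise price the tokens from a registry. Everything here is an assumption for illustration: the `PRICING` table, its per-million-token units, the placeholder provider/model names and rates, and the `verify_cost` signature are hypothetical, and the real server logic may differ.

```python
from typing import Optional

# Hypothetical pricing registry: (provider, model) -> (input, output) USD per 1M tokens.
# The entry below is a made-up placeholder, not a real price.
PRICING = {
    ("example-provider", "example-model"): (1.0, 2.0),
}


def verify_cost(
    provider: str,
    model: str,
    input_tokens: int,
    output_tokens: int,
    usage: Optional[dict] = None,
) -> Optional[float]:
    """Return the cost in USD, or None when it cannot be determined."""
    # Trust the provider-reported total for OpenRouter when it is present.
    if provider == "openrouter" and usage and usage.get("total_cost") is not None:
        return float(usage["total_cost"])

    # Otherwise compute from the registry; unknown models yield None.
    rates = PRICING.get((provider, model))
    if rates is None:
        return None
    input_rate, output_rate = rates
    return (input_tokens * input_rate + output_tokens * output_rate) / 1_000_000
```

Returning `None` for unknown models, rather than zero, keeps a missing price distinguishable from a genuinely free run.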

See also: 02-quickstart.md, 03-architecture.md, 05-apis.md.