SDK Auto-Extraction: Zero-Config Usage Guide

Overview

The RunForge SDK now supports zero-configuration, automatic extraction of tokens, costs, and metadata from LLM API responses. Wrap your existing LLM calls with runforge.track() and all metrics flow to the dashboard automatically.

Quick Start

TypeScript/JavaScript

import { RunForge } from '@runforge/sdk-ts'

// Initialize with your API key
const runforge = new RunForge({ 
  apiKey: process.env.RUNFORGE_API_KEY,
  projectId: 'your-project-id'
})

// Wrap any LLM call - everything else is automatic!
const result = await runforge.track({ experiment: 'chat-v2' }, () =>
  openai.chat.completions.create({ 
    model: 'gpt-4o-mini', 
    messages: [{ role: 'user', content: 'Hello!' }] 
  })
)
// ✅ Tokens, costs, latency automatically tracked

Python

import os

from runforge import RunForge

# Initialize with your API key
runforge = RunForge(
    api_key=os.environ['RUNFORGE_API_KEY'],
    project_id='your-project-id'
)

# Wrap any LLM call - everything else is automatic!
result = runforge.track(
    {"experiment": "chat-v2"},
    lambda: openai.chat.completions.create(
        model="gpt-4o-mini",
        messages=[{"role": "user", "content": "Hello!"}]
    )
)
# ✅ Tokens, costs, latency automatically tracked

Supported Providers

🥇 OpenRouter (Highest Accuracy)

  • Direct cost extraction from usage.total_cost
  • Real-time pricing from provider
  • No estimation required

// OpenRouter calls automatically extract exact costs
const result = await runforge.track({ model: 'openai/gpt-4o-mini' }, () =>
  openrouter.chat.completions.create({ 
    model: 'openai/gpt-4o-mini', 
    messages 
  })
)
// Cost comes directly from OpenRouter - 100% accurate

🤖 OpenAI Direct

  • Token extraction from usage.prompt_tokens/completion_tokens
  • Server-side cost calculation using pricing registry
  • Streaming support with stream_options.include_usage

// OpenAI calls extract tokens and calculate costs
const result = await runforge.track({}, () =>
  openai.chat.completions.create({ 
    model: 'gpt-4o-mini', 
    messages,
    stream_options: { include_usage: true } // For streaming
  })
)
// Tokens extracted, cost calculated automatically

🧠 Anthropic Direct

  • Token extraction from usage.input_tokens/output_tokens
  • Server-side cost calculation using pricing registry
  • Model-specific pricing for Claude variants

// Anthropic calls extract tokens and calculate costs
const result = await runforge.track({}, () =>
  anthropic.messages.create({
    model: 'claude-3-5-sonnet-20241022',
    messages
  })
)
// Input/output tokens extracted, cost calculated automatically

How It Works

1. Automatic Provider Detection

// Provider detected from model name
'openai/gpt-4o-mini' → 'openai'
'gpt-4o-mini'        → 'openai'  
'claude-3-opus'      → 'anthropic'
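
The mapping above can be pictured as a small lookup on the model name. The sketch below is illustrative only (detectProvider is not an exported SDK function); it just shows the kind of rule the SDK applies.

// Minimal sketch of name-based provider detection (illustrative, not the SDK's internals)
function detectProvider(model: string): string {
  if (model.includes('/')) return model.split('/')[0] // 'openai/gpt-4o-mini' → 'openai'
  if (model.startsWith('gpt-')) return 'openai'
  if (model.startsWith('claude-')) return 'anthropic'
  return 'unknown'
}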

2. Usage Data Extraction

  • OpenRouter: usage.total_cost (exact from provider)
  • OpenAI: usage.prompt_tokens + usage.completion_tokens
  • Anthropic: usage.input_tokens + usage.output_tokens
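
Because the field names differ per provider, extraction amounts to a small normalization step. The sketch below assumes a plain response object; extractUsage and the returned field names are illustrative, not part of the SDK's API.

// Rough sketch of per-provider usage normalization (illustrative only)
function extractUsage(provider: string, response: any) {
  const usage = response?.usage
  if (!usage) return null // no usage data: latency and status are still tracked
  if (provider === 'openrouter') return { totalCostUsd: usage.total_cost }
  if (provider === 'openai') return { inputTokens: usage.prompt_tokens, outputTokens: usage.completion_tokens }
  if (provider === 'anthropic') return { inputTokens: usage.input_tokens, outputTokens: usage.output_tokens }
  return null
}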

3. Server-Side Cost Verification

  • OpenRouter costs trusted as-is (costSource: "provider")
  • Other providers recalculated server-side (costSource: "catalog")
  • Unknown models marked as estimated (costEstimated: true)
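
In other words, the server trusts provider-reported costs as-is and otherwise derives the cost from token counts and the pricing registry. A rough sketch of that decision, with assumed field names (not the actual implementation):

// Illustrative server-side cost resolution
function resolveCost(
  provider: string,
  usage: { totalCostUsd?: number; inputTokens?: number; outputTokens?: number },
  price?: { usdPerMTokIn: number; usdPerMTokOut: number } // from the pricing registry
) {
  if (provider === 'openrouter' && usage.totalCostUsd != null) {
    return { costUsd: usage.totalCostUsd, costSource: 'provider', costEstimated: false }
  }
  if (price) {
    const costUsd =
      ((usage.inputTokens ?? 0) / 1_000_000) * price.usdPerMTokIn +
      ((usage.outputTokens ?? 0) / 1_000_000) * price.usdPerMTokOut
    return { costUsd, costSource: 'catalog', costEstimated: false }
  }
  return { costUsd: 0, costSource: 'catalog', costEstimated: true } // unknown model
}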

4. Privacy-First Design

  • Never stores prompts or responses
  • Only extracts usage metadata
  • Safe for sensitive workloads
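
For illustration, a tracked run is reduced to a record like the one below before it leaves your process. The exact wire format is not documented here; the point is that no prompt or completion text appears in it.

// Example of the kind of metadata-only record that is reported (shape is an assumption)
const reported = {
  provider: 'openai',
  model: 'gpt-4o-mini',
  inputTokens: 12,
  outputTokens: 148,
  costUsd: 0.00011,
  latencyMs: 842,
  status: 'ok',
  metadata: { experiment: 'chat-v2' }
}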

Advanced Usage

Custom Metadata

const result = await runforge.track({
  experiment: 'chat-v2',
  user_id: 'user123',
  temperature: 0.7,
  custom_field: 'value'
}, () => llmCall())

Error Tracking

try {
  const result = await runforge.track({ experiment: 'test' }, () => {
    throw new Error('Rate limited')
  })
} catch (error) {
  // Error automatically tracked with latency and status
}

Streaming Support

// For OpenAI streaming with usage
const stream = await runforge.track({}, () =>
  openai.chat.completions.create({
    model: 'gpt-4o-mini',
    messages,
    stream: true,
    stream_options: { include_usage: true }
  })
)
// Usage data extracted from final chunk
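
Consuming the tracked stream is unchanged. A brief sketch, assuming track returns the provider stream as in the example above and using the OpenAI Node SDK's async-iterable stream, where the final chunk carries usage when include_usage is set:

// Iterate the stream as usual
for await (const chunk of stream) {
  process.stdout.write(chunk.choices[0]?.delta?.content ?? '')
  // The last chunk includes `usage` when stream_options.include_usage is enabled
}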

Async Functions (Python)

from openai import AsyncOpenAI

client = AsyncOpenAI()

# Supports both sync and async functions
async def async_llm_call():
    return await client.chat.completions.create(model="gpt-4o", messages=messages)

result = await runforge.track({"experiment": "async"}, async_llm_call)

Configuration Options

SDK Initialization

TypeScript

const runforge = new RunForge({
  apiKey: 'your-api-key',           // Required
  endpoint: 'https://your-domain/api/ingest',  // Optional
  projectId: 'project-id'           // Optional
})

Python

runforge = RunForge(
    api_key='your-api-key',          # Required
    endpoint='https://your-domain/api/ingest',  # Optional  
    project_id='project-id'          # Optional
)

Cost Accuracy

Provider   | Token Accuracy | Cost Accuracy | Source
-----------|----------------|---------------|------------------
OpenRouter | ✅ Exact       | ✅ Exact      | Provider API
OpenAI     | ✅ Exact       | 🟡 Calculated | Pricing Registry
Anthropic  | ✅ Exact       | 🟡 Calculated | Pricing Registry
Others     | 🟡 Estimated   | 🟡 Estimated  | Fallback
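
As a worked example of a 🟡 Calculated cost (the per-token prices below are assumptions for illustration, not current OpenAI rates):

// 1,200 prompt tokens at an assumed $0.15 per 1M tokens     → $0.00018
//   350 completion tokens at an assumed $0.60 per 1M tokens → $0.00021
// reported cost ≈ $0.00039, with costSource: "catalog"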

Migration from Manual Configuration

Before (Manual)

// Old way - manual configuration required
const call = withLLM(
  openai.chat.completions.create.bind(openai.chat.completions),
  { 
    provider: 'openai', 
    model: 'gpt-4o', 
    price: { inUsdPerMTokIn: 5, inUsdPerMTokOut: 15 }
  },
  { apiKey: process.env.RUNFORGE_API_KEY }
)

After (Zero-Config)

// New way - completely automatic
const result = await runforge.track({ experiment: 'test' }, () =>
  openai.chat.completions.create({ model: 'gpt-4o', messages })
)

Troubleshooting

No Usage Data

If the SDK can't extract usage data:

  • Latency and status are still tracked
  • Tokens and cost are recorded as zero
  • Check the provider response format

Incorrect Costs

  • OpenRouter costs are always exact
  • Other providers use server-side pricing registry
  • Check model name spelling and casing

Network Failures

  • The SDK silently handles network failures
  • A tracking failure never breaks your LLM calls
  • Metrics for that call are lost, but your application continues

Debug Mode

// Enable debug logging (if available)
process.env.RUNFORGE_DEBUG = '1'

Examples

See the complete examples:

  • TypeScript: examples/auto-extraction-demo.ts
  • Python: examples/auto-extraction-demo.py

Run them locally:

# TypeScript
npx tsx examples/auto-extraction-demo.ts

# Python  
python3 examples/auto-extraction-demo.py