RunForge Troubleshooting Guide

Quick Problem Solver

Not Seeing Any Data in Dashboard?

Most likely causes:
1. Wrong project selected - Check the project dropdown
2. Tracking not configured - Verify your SDK setup
3. API key issues - Confirm your RunForge ingest key is correct
4. Time period - Make sure you're looking at the right date range

Quick fix: Make a test AI call and wait 30 seconds, then refresh your dashboard.

Getting "Invalid API Key" Errors?

Most likely causes:
1. Copied key incorrectly - API keys are long and easy to mistype
2. Using the wrong key type - Make sure you're using the provider's API key, not RunForge's
3. Key has been deactivated - Check your provider's dashboard
4. Insufficient credits/quota - Your provider account may be out of credit

Quick fix: Go to your provider's website, generate a fresh API key, and update it in RunForge.

Costs Showing as $0.00?

Most likely causes:
1. Provider doesn't return costs - Some providers don't send cost data
2. Model not in pricing database - Newer models may not have pricing info yet
3. Routing through OpenRouter - Should show exact costs; if not, check your configuration
4. Free tier usage - Some providers don't charge for certain usage levels

Quick fix: Check if you're getting token counts. If yes, costs will be estimated server-side.
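
As a rough illustration of how a token-based estimate can be computed, the sketch below multiplies token counts by per-million-token prices. The `PRICING` table, its prices, and the `estimateCostUsd` helper are hypothetical; they are not RunForge's actual pricing database or API.

```typescript
// Hypothetical per-million-token prices in USD; real values come from your provider.
type Pricing = { inputPerMTok: number; outputPerMTok: number };

const PRICING: Record<string, Pricing> = {
  "example-model": { inputPerMTok: 0.5, outputPerMTok: 1.5 },
};

// Returns the estimated cost in USD, or null when the model has no pricing entry.
function estimateCostUsd(model: string, inputTokens: number, outputTokens: number): number | null {
  const price = PRICING[model];
  if (!price) return null; // unknown model: no price to apply
  return (inputTokens / 1_000_000) * price.inputPerMTok
       + (outputTokens / 1_000_000) * price.outputPerMTok;
}

console.log(estimateCostUsd("example-model", 2_000_000, 1_000_000)); // 2.5
console.log(estimateCostUsd("brand-new-model", 1000, 500)); // null
```

A model that is missing from the table yields no estimate, which is one way a $0.00 cost can appear even when token counts are present.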

Common Setup Issues

Problem: "RunForge SDK not found" or Import Errors

TypeScript/JavaScript

Symptoms: Cannot find module '@runforge/sdk' or similar import errors

Solutions:

1. Install the SDK:

npm install @runforge/sdk
# or
yarn add @runforge/sdk

2. Check your import statement:

// Correct
import { RunForge } from '@runforge/sdk';

// Also correct (if using CommonJS)
const { RunForge } = require('@runforge/sdk');

3. Verify that your package.json includes the dependency.

4. Clear node_modules and reinstall if you are still having issues:

rm -rf node_modules package-lock.json
npm install

Python

Symptoms: ModuleNotFoundError: No module named 'runforge'

Solutions:

1. Install the SDK:

pip install runforge
# or for specific environments
pip3 install runforge
conda install runforge

2. Check your Python environment:
```
python -m pip list | grep runforge
```

3. Fix virtual environment issues:

# Activate your virtual environment first
source venv/bin/activate  # On Windows: venv\Scripts\activate
pip install runforge

Problem: Environment Variables Not Loading

Symptoms: process.env.RUNFORGE_API_KEY is undefined, or similar environment variable issues

Solutions:

1. Check your .env file location - It should be in your project root

2. Load environment variables:

// Add this at the top of your file
import 'dotenv/config';
// or
require('dotenv').config();

3. Verify .env file format:

# Recommended format (no spaces around =)
RUNFORGE_API_KEY=your-key-here
OPENAI_API_KEY=your-openai-key-here

# Avoid (spaces around = break shell sourcing and some .env loaders)
RUNFORGE_API_KEY = your-key-here

4. Check .gitignore - Make sure .env is listed so you don't commit secrets
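
To catch a missing variable at startup rather than at the first tracked call, a small check like the following can help. This is a minimal sketch; `missingEnvVars` is a hypothetical helper, not part of the RunForge SDK.

```typescript
// Returns the names of required variables that are missing or blank.
function missingEnvVars(required: string[], env: Record<string, string | undefined>): string[] {
  return required.filter((name) => {
    const value = env[name];
    return value === undefined || value.trim() === "";
  });
}

// Example with a plain object; in a Node.js app you would pass process.env instead.
const missing = missingEnvVars(
  ["RUNFORGE_API_KEY", "OPENAI_API_KEY"],
  { RUNFORGE_API_KEY: "your-key-here", OPENAI_API_KEY: "  " },
);
console.log(missing); // lists OPENAI_API_KEY, which is blank
```

Failing fast with the list of missing names usually gives a clearer error than an `undefined` key surfacing later as an authentication failure.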

Problem: Cannot Connect to RunForge Endpoint

Symptoms: Connection errors, timeouts, or "ECONNREFUSED" errors when making tracked calls

Solutions:

1. Check if RunForge is running:

# If running locally
curl http://localhost:3000/api/health

2. Verify your endpoint URL:

// Local development
endpoint: 'http://localhost:3000/api/ingest'

// Production (example)
endpoint: 'https://your-runforge-instance.com/api/ingest'

3. Rule out firewall/network issues:
- Make sure port 3000 is open (or whatever port you're using)
- Check if your network blocks local connections
- Try from a different network to isolate the issue

4. Rule out Docker/container issues:
- If running RunForge in Docker, make sure ports are mapped correctly
- Use host.docker.internal instead of localhost when connecting from containers (on Linux this may need --add-host=host.docker.internal:host-gateway)
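
The same connectivity check can be run from inside your application, which tests the network path your tracked calls actually use. The sketch below assumes the `/api/health` path shown in the curl example; `probeRunForge`, `healthUrlFor`, and the `FetchLike` type are hypothetical names for illustration.

```typescript
// Minimal shape of the fetch function this sketch needs (the global fetch satisfies it).
type FetchLike = (url: string, init?: { signal?: AbortSignal }) => Promise<{ ok: boolean; status: number }>;

type ProbeResult = { reachable: boolean; detail: string };

// Derives the health URL from the ingest endpoint, assuming the /api/health path.
function healthUrlFor(ingestEndpoint: string): string {
  return new URL("/api/health", ingestEndpoint).toString();
}

async function probeRunForge(ingestEndpoint: string, fetchImpl: FetchLike, timeoutMs = 3000): Promise<ProbeResult> {
  const controller = new AbortController();
  const timer = setTimeout(() => controller.abort(), timeoutMs);
  try {
    const res = await fetchImpl(healthUrlFor(ingestEndpoint), { signal: controller.signal });
    return { reachable: res.ok, detail: `HTTP ${res.status}` };
  } catch (error) {
    // Connection refused, DNS failure, or timeout all land here.
    return { reachable: false, detail: error instanceof Error ? error.message : String(error) };
  } finally {
    clearTimeout(timer);
  }
}
```

On Node.js 18 or later you could call it as `probeRunForge('http://localhost:3000/api/ingest', fetch)` and log the returned `detail` to tell a refused connection apart from a timeout.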

Provider-Specific Issues

OpenAI Problems

"You exceeded your current quota"

What it means: You've hit your OpenAI usage limit

Solutions:
1. Check your usage at platform.openai.com
2. Increase your usage limit in OpenAI's settings
3. Add more credit to your OpenAI account
4. Wait for the next billing cycle if you've hit your monthly limit
5. Check for runaway processes that might be using lots of tokens

"That model is currently overloaded"

What it means: OpenAI's servers are busy

Solutions:

1. Wait and retry - usually resolves in a few minutes

2. Use a different model temporarily:

const fallbackModels = ['gpt-4o-mini', 'gpt-3.5-turbo'];

3. Implement retry logic:

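// Retries only when the provider reports overload; other errors are rethrown immediately.
// Assumption: overload surfaces as code 'model_overloaded' or HTTP status 503; adjust to your SDK.
async function retryOpenAI<T>(apiCall: () => Promise<T>, maxRetries = 3): Promise<T> {
  for (let i = 0; ; i++) {
    try {
      return await apiCall();
    } catch (error: any) {
      const overloaded = error?.code === 'model_overloaded' || error?.status === 503;
      if (!overloaded || i >= maxRetries - 1) throw error;
      // Linear backoff: wait 1s, then 2s, ...
      await new Promise(resolve => setTimeout(resolve, 1000 * (i + 1)));
    }
  }
}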
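
The fallback-model idea from step 2 can be combined with retries by trying each model in order. A minimal sketch follows; `callWithFallback` and the `call` callback are hypothetical placeholders for your own provider call, not a RunForge or OpenAI API.

```typescript
// Tries each model in order and returns the first successful result.
async function callWithFallback<T>(models: string[], call: (model: string) => Promise<T>): Promise<T> {
  let lastError: unknown = new Error("No models provided");
  for (const model of models) {
    try {
      return await call(model);
    } catch (error) {
      lastError = error; // remember the failure and move on to the next model
    }
  }
  throw lastError;
}

// Demo with a stub that fails for the first model.
const stub = async (model: string) => {
  if (model === "primary-model") throw new Error("overloaded");
  return `answered by ${model}`;
};
callWithFallback(["primary-model", "backup-model"], stub).then(console.log); // answered by backup-model
```

This sketch falls back on any error. In practice you would probably fall back only on overload or availability errors, since an authentication failure will fail on every model.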

"Invalid request: model not found"

What it means: The model name is wrong or not available

Solutions:

1. Check model name spelling:

// Correct
model: 'gpt-4o-mini'
// Incorrect  
model: 'gpt-4-mini' // Missing 'o'

2. Use currently available models - check OpenAI's documentation

3. Update to newer model names if using deprecated models

OpenRouter Problems

"Insufficient credits"

What it means: Your OpenRouter account is out of credits

Solutions:
1. Add credits at openrouter.ai
2. Check your current balance in the OpenRouter dashboard
3. Set up auto-reload to avoid interruptions
4. Monitor spending with alerts

"Model not found" on OpenRouter

What it means: The model isn't available or the name is wrong

Solutions:

1. Check available models at openrouter.ai/models

2. Use the correct format:

// Correct OpenRouter format
model: 'anthropic/claude-3-haiku'
model: 'openai/gpt-4o-mini'

// Incorrect
model: 'claude-3-haiku' // Missing provider prefix

3. Verify model is currently available - some models have downtime
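
A quick format check can catch a missing provider prefix before the request is sent. The sketch below is an assumption about what a reasonable check looks like; `hasProviderPrefix` is a hypothetical helper that validates only the shape of the name.

```typescript
// Checks only the "provider/model" shape; it cannot tell whether the model actually exists.
function hasProviderPrefix(model: string): boolean {
  return /^[a-z0-9][a-z0-9._-]*\/[a-z0-9][a-z0-9._:-]*$/i.test(model);
}

console.log(hasProviderPrefix("anthropic/claude-3-haiku")); // true
console.log(hasProviderPrefix("claude-3-haiku")); // false
```

A name that passes this check can still be rejected by OpenRouter, so the model list at openrouter.ai/models remains the authoritative source.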

Anthropic Problems

"Invalid API key"

What it means: Your Anthropic API key is invalid or inactive

Solutions:
1. Generate a new key at console.anthropic.com
2. Check key permissions - make sure it has the right access
3. Verify account status - ensure your Anthropic account is in good standing

"Output blocked by content policy"

What it means: Claude's safety filters blocked the response

Solutions:
1. Rephrase your prompt to be less likely to trigger filters
2. Add appropriate context to clarify legitimate use cases
3. Try a different approach to your query
4. Contact Anthropic support if you believe it's a false positive

Data and Tracking Issues

Problem: Requests Not Appearing in Dashboard

Debugging steps:
1. Check project selection - Make sure you're viewing the right project
2. Verify time range - Dashboard might be filtered to a different time period
3. Look at browser network tab - Are requests to RunForge failing?
4. Check RunForge logs - Look for errors in your RunForge instance logs

Common causes:
- Wrong project ID in your SDK configuration
- Ingest API key mismatch
- Network connectivity between your app and RunForge
- Rate limiting - Too many requests being rejected

Problem: Costs Are Wildly Inaccurate

Investigation steps:
1. Check which provider you're using:
   - OpenRouter: Should be exact costs
   - Direct providers: May be estimated
2. Verify model names match pricing database
3. Look at token counts - Are they realistic?
4. Compare with provider billing - Check actual charges

Common causes:
- Wrong model name leading to incorrect pricing lookup
- Outdated pricing data for newer models
- Currency conversion issues
- Free tier usage not being accounted for

Problem: Dashboard Shows Errors But App Works Fine

What it means: Your AI calls are succeeding, but RunForge tracking is failing

Investigation:
1. Check network connectivity from your app to RunForge
2. Verify RunForge is healthy - Can you access the dashboard?
3. Look at your tracking code - Are you handling tracking errors properly?

Best practice:

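async function robustTracking<T>(metadata: any, aiCall: () => Promise<T>): Promise<T> {
  let pending: Promise<T> | undefined;
  // Memoize the call so the AI request runs at most once, even if tracking fails
  const callOnce = () => (pending ??= aiCall());
  try {
    return await runforge.track(metadata, callOnce);
  } catch (error: any) {
    // If the AI call itself failed, this rethrows its error instead of retrying
    const result = await (pending ?? callOnce());
    console.warn('RunForge tracking failed:', error?.message);
    return result;
  }
}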

Performance Issues

Problem: Slow Response Times

Debugging steps:
1. Check which model you're using - Some models are inherently slower
2. Look at token counts - Very long prompts or responses take longer
3. Test without RunForge - Is the slowdown from tracking or the AI provider?
4. Check your internet connection and geographic location

Common solutions:
- Switch to faster models:

// Slower but higher quality
model: 'gpt-4-turbo'

// Faster but still good quality  
model: 'gpt-4o-mini'

- Reduce max_tokens for faster responses
- Use streaming for long responses to improve perceived speed
- Optimize prompts to be more concise
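
To tell whether a slowdown comes from tracking or from the provider (debugging step 3), you can time the same call with and without tracking. The `timed` helper below is a hypothetical sketch, and the stub stands in for a real provider call.

```typescript
// Runs an async function and reports how long it took, in milliseconds.
async function timed<T>(label: string, fn: () => Promise<T>): Promise<{ result: T; ms: number }> {
  const start = Date.now();
  const result = await fn();
  const ms = Date.now() - start;
  console.log(`${label}: ${ms} ms`);
  return { result, ms };
}

// Demo with a stub that resolves after about 20 ms.
const slowCall = () => new Promise<string>((resolve) => setTimeout(() => resolve("done"), 20));
timed("stub call", slowCall);
```

Timing the bare provider call and then the same call wrapped in your tracking code, several times each, gives a rough picture of how much latency tracking adds.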

Problem: High Error Rates

Investigation:
1. What types of errors are occurring?
   - Rate limits: Too many requests too quickly
   - Authentication: API key issues
   - Network: Connection problems
   - Provider issues: Service outages

Solutions by error type:

Rate Limiting:

// Add exponential backoff
// Assumption: rate limits surface as code 'rate_limit_exceeded' or HTTP status 429; adjust to your SDK.
async function withBackoff<T>(apiCall: () => Promise<T>, maxRetries = 5): Promise<T> {
  for (let i = 0; ; i++) {
    try {
      return await apiCall();
    } catch (error: any) {
      const rateLimited = error?.code === 'rate_limit_exceeded' || error?.status === 429;
      if (!rateLimited || i >= maxRetries - 1) throw error;
      // Delays of 1s, 2s, 4s, ... capped at 30s
      const delay = Math.min(1000 * Math.pow(2, i), 30000);
      await new Promise(resolve => setTimeout(resolve, delay));
    }
  }
}

Authentication Issues:
- Rotate API keys - Generate fresh ones
- Check key permissions - Ensure they have necessary access
- Verify account status - Make sure accounts are in good standing

Network Issues:
- Implement proper timeouts
- Add retry logic for temporary failures
- Use multiple providers as fallbacks
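
For the timeout advice above, one common approach is to race the call against a timer. The `withTimeout` helper below is a hypothetical sketch, not part of any SDK mentioned in this guide.

```typescript
// Rejects if `work` has not settled within `ms` milliseconds.
// Note: this stops waiting but does not cancel the underlying request.
function withTimeout<T>(work: Promise<T>, ms: number): Promise<T> {
  let timer: ReturnType<typeof setTimeout> | undefined;
  const timeout = new Promise<never>((_, reject) => {
    timer = setTimeout(() => reject(new Error(`Timed out after ${ms} ms`)), ms);
  });
  // Whichever settles first wins; the timer is cleared either way.
  return Promise.race([work, timeout]).finally(() => clearTimeout(timer));
}
```

If your HTTP client supports cancellation (for example through an `AbortSignal`), passing a signal is preferable because it also releases the connection.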

Integration Problems

Problem: RunForge Works in Development but Not Production

Common causes:
1. Environment variables not set in production
2. Network access blocked - Firewalls, security groups
3. Different dependencies - Production might have different package versions
4. SSL/TLS issues - Certificate problems in production

Solutions:

1. Check environment variables:

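# In production environment (reports whether the key is set without printing the secret)
[ -n "$RUNFORGE_API_KEY" ] && echo "RUNFORGE_API_KEY is set" || echo "RUNFORGE_API_KEY is missing"
echo "$RUNFORGE_ENDPOINT"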

2. Test network connectivity:

curl -I https://your-runforge-instance.com/api/health

3. Compare package versions between environments

4. Check logs for specific error messages in production
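
For step 3, a small diff of installed versions makes mismatches easy to spot. The sketch below assumes you have already collected a name-to-version map for each environment (for example from `npm ls --depth=0 --json`); `diffVersions` and the version numbers are hypothetical.

```typescript
type Versions = Record<string, string>;

// Lists packages whose versions differ (or are missing) between two environments.
function diffVersions(dev: Versions, prod: Versions): string[] {
  const names = Object.keys({ ...dev, ...prod }).sort();
  return names
    .filter((name) => dev[name] !== prod[name])
    .map((name) => `${name}: dev=${dev[name] ?? "missing"} prod=${prod[name] ?? "missing"}`);
}

console.log(diffVersions(
  { "@runforge/sdk": "1.2.3", openai: "4.0.0" },
  { "@runforge/sdk": "1.2.3", openai: "4.1.0", dotenv: "16.0.0" },
));
// Reports dotenv (missing in dev) and openai (4.0.0 vs 4.1.0)
```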

Problem: Works Sometimes, Fails Other Times

This suggests:
- Rate limiting - Some requests succeed before hitting limits
- Network instability - Intermittent connectivity issues
- Provider capacity - AI services sometimes overloaded
- Race conditions - Multiple requests interfering with each other

Solutions:
- Add comprehensive error handling and retry logic
- Implement circuit breaker pattern for failing services
- Use multiple providers for redundancy
- Add proper logging to identify patterns in failures
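
The circuit breaker pattern mentioned above can be sketched in a few lines. This is a simplified illustration with hypothetical names and default values, not a production implementation: after a number of consecutive failures it rejects calls immediately until a cooldown has passed.

```typescript
// Minimal circuit breaker: after `threshold` consecutive failures the circuit opens
// and calls are rejected until `cooldownMs` has passed, then a trial call is allowed.
class CircuitBreaker {
  private failures = 0;
  private openedAt: number | null = null;
  private readonly threshold: number;
  private readonly cooldownMs: number;
  private readonly now: () => number;

  constructor(threshold = 3, cooldownMs = 30_000, now: () => number = Date.now) {
    this.threshold = threshold;
    this.cooldownMs = cooldownMs;
    this.now = now; // injectable clock, mainly to make testing easier
  }

  async run<T>(call: () => Promise<T>): Promise<T> {
    if (this.openedAt !== null && this.now() - this.openedAt < this.cooldownMs) {
      throw new Error("Circuit open: skipping call");
    }
    try {
      const result = await call();
      this.failures = 0; // a success closes the circuit
      this.openedAt = null;
      return result;
    } catch (error) {
      this.failures += 1;
      if (this.failures >= this.threshold) this.openedAt = this.now();
      throw error;
    }
  }
}
```

You would typically keep one breaker per provider and wrap each call, for example `breaker.run(() => aiCall())`, so a struggling service is skipped quickly instead of timing out on every request.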

Getting Help

Before Asking for Help

Check the logs - Look for specific error messages
Test with minimal example - Isolate the problem
Verify your setup - Double-check API keys, endpoints, etc.
Check provider status pages - Is the AI service having issues?

Information to Include When Asking for Help

Always include:
- Exact error messages (copy/paste, don't paraphrase)
- Your SDK version and environment (Node.js version, Python version, etc.)
- Code snippet showing how you're using RunForge (remove API keys!)
- What you expected vs what actually happened
- When the problem started - Was it ever working?
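
Since snippets and logs often contain keys, a quick redaction pass before pasting can help. The patterns below are heuristics chosen for illustration, not an exhaustive list of key formats, and `redactSecrets` is a hypothetical helper; still review the output by eye.

```typescript
// Masks values that look like secrets before you paste a snippet or log into a help request.
function redactSecrets(text: string): string {
  return text
    // Upper-case NAME=value or NAME: "value" pairs whose name suggests a secret
    .replace(/\b([A-Z0-9_]*(?:API_KEY|TOKEN|SECRET)[A-Z0-9_]*)(\s*[=:]\s*)(["']?)[^\s"']+\3/g, "$1$2$3[REDACTED]$3")
    // Bare tokens with the common "sk-" prefix
    .replace(/\bsk-[A-Za-z0-9_-]{8,}\b/g, "[REDACTED]");
}

console.log(redactSecrets("RUNFORGE_API_KEY=abc123")); // RUNFORGE_API_KEY=[REDACTED]
console.log(redactSecrets("apiKey: 'sk-abcdefgh12345678'")); // apiKey: '[REDACTED]'
```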

Example good help request:

I'm getting this error when trying to track OpenAI calls:

Error: "Invalid API key provided"

My setup:
- @runforge/sdk version 1.2.3
- Node.js 18.17.0  
- OpenAI calls work fine without RunForge tracking

Code snippet:
const runforge = new RunForge({
  apiKey: process.env.RUNFORGE_API_KEY,
  endpoint: 'http://localhost:3000/api/ingest'
});

This started happening yesterday; it was working fine before.

Where to Get Help

RunForge Documentation - Check other guides first
Provider Documentation - For provider-specific issues
Community Forums - GitHub issues, Discord, etc.
Support Channels - Email support for urgent issues

Prevention Tips

Avoid Common Mistakes

Always use environment variables for API keys
Implement proper error handling - Don't let tracking failures break your app
Set reasonable rate limits - Don't overwhelm providers
Monitor your usage - Set up alerts before problems occur
Test in staging first - Don't deploy untested changes to production
Keep dependencies updated - But test updates before deploying
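
For the "Set reasonable rate limits" tip, a token bucket is one simple client-side approach. The sketch below uses hypothetical names and limits; choose values that match your provider's documented quotas.

```typescript
// Token bucket: allows bursts up to `capacity` and refills at `refillPerSecond`.
class TokenBucket {
  private tokens: number;
  private last: number;
  private readonly capacity: number;
  private readonly refillPerSecond: number;
  private readonly now: () => number;

  constructor(capacity: number, refillPerSecond: number, now: () => number = Date.now) {
    this.capacity = capacity;
    this.refillPerSecond = refillPerSecond;
    this.now = now; // injectable clock, mainly to make testing easier
    this.tokens = capacity;
    this.last = now();
  }

  // Returns true if a request may be sent now, consuming one token.
  tryAcquire(): boolean {
    const current = this.now();
    const elapsedSeconds = (current - this.last) / 1000;
    this.tokens = Math.min(this.capacity, this.tokens + elapsedSeconds * this.refillPerSecond);
    this.last = current;
    if (this.tokens < 1) return false;
    this.tokens -= 1;
    return true;
  }
}
```

Before each provider call, check `bucket.tryAcquire()` and delay or queue the request when it returns false, rather than letting the provider reject it.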

Health Checks

Regular monitoring you should set up:

// Health check function
// Assumes RUNFORGE_ENDPOINT is your ingest URL and `openai` is an initialized OpenAI client.
async function healthCheck() {
  // Test RunForge connectivity; the health route is /api/health, not a path under /api/ingest
  try {
    const response = await fetch(new URL('/api/health', RUNFORGE_ENDPOINT));
    console.log('RunForge:', response.ok ? '✅' : '❌');
  } catch (error: any) {
    console.log('RunForge: ❌', error?.message);
  }

  // Test provider connectivity (note: this is a real, billable request)
  try {
    await openai.chat.completions.create({
      model: 'gpt-4o-mini',
      messages: [{ role: 'user', content: 'test' }],
      max_tokens: 5
    });
    console.log('OpenAI: ✅');
  } catch (error: any) {
    console.log('OpenAI: ❌', error?.message);
  }
}

// Run health check periodically
setInterval(healthCheck, 300000); // Every 5 minutes

Monthly review checklist:
- [ ] Are API keys still valid and not expired?
- [ ] Are provider accounts in good standing with sufficient credits?
- [ ] Are error rates within acceptable ranges?
- [ ] Are costs trending as expected?
- [ ] Are there any new models or features to take advantage of?

Remember: Most issues are configuration problems, not bugs in RunForge or the AI providers. Double-check your setup before assuming there's a deeper problem!