Custom Provider Setup Guide

What You'll Learn

How to connect other OpenAI-compatible AI services to RunForge, including local models, enterprise providers, and specialized AI services.

Why Use Custom Providers?

Cost Savings

  • Local models: Run AI on your own hardware with no per-request costs
  • Enterprise deals: Custom pricing from providers not in standard lists
  • Regional providers: Often cheaper in specific geographic regions
  • Specialized services: Purpose-built models that may be more cost-effective

Control and Privacy

  • On-premises deployment: Keep all data within your infrastructure
  • Custom models: Use fine-tuned models specific to your domain
  • Regional compliance: Meet data residency requirements
  • Enterprise features: SLAs, dedicated support, custom terms

Access to New Models

  • Early access: New providers before they're officially supported
  • Experimental models: Research or beta models
  • Niche providers: Specialized in your industry or use case
  • Open source models: Self-hosted versions of popular models

Compatible Services

  • Together AI: Fast inference for open-source models
  • Fireworks AI: High-performance model serving
  • Replicate: Easy access to thousands of models
  • Perplexity: Search-enhanced language models
  • Groq: Ultra-fast inference hardware
  • Mistral AI: European provider with strong models
  • Cohere: Enterprise-focused AI platform

Self-Hosted Solutions

  • Ollama: Run models locally on your machine
  • Text Generation WebUI: Web interface for local models
  • FastChat: Self-hosted chatbot with OpenAI API
  • LocalAI: Local OpenAI API alternative
  • LM Studio: Desktop app for running local models

Enterprise Platforms

  • Azure OpenAI: Microsoft's managed OpenAI service
  • AWS Bedrock: Amazon's managed AI service
  • Google Vertex AI: Google's enterprise AI platform
  • IBM Watson: Enterprise AI with custom models

Before You Start

  • ✅ Account with your chosen provider (or local setup complete)
  • ✅ API endpoint URL and authentication method
  • ✅ RunForge project created (see Getting Started Guide)
  • ✅ Understanding of your provider's API format
  • ✅ 10-15 minutes of time

Step 1: Gather Provider Information

Required Information

For any custom provider, you'll need:

  1. Base URL: The API endpoint (e.g., https://api.together.xyz/v1)
  2. Authentication: API key, token, or other auth method
  3. Model names: Exact model identifiers used by the provider
  4. API format: Confirm it's OpenAI-compatible
  5. Pricing: Cost per token (if available)

Finding Provider Details

Check the provider's documentation for:

  • API reference or developer docs
  • OpenAI compatibility information
  • Authentication requirements
  • Available model list
  • Pricing information

Example for Together AI:

  • Base URL: https://api.together.xyz/v1
  • Auth: Bearer token via API key
  • Models: meta-llama/Llama-2-7b-chat-hf, etc.
  • Compatible: Yes (OpenAI chat completions format)
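
Before writing any application code, you can sanity-check compatibility by pointing the standard OpenAI SDK at the provider's base URL and listing models. A minimal sketch, assuming the provider implements the standard /models endpoint (most OpenAI-compatible services do):

import OpenAI from 'openai';

const client = new OpenAI({
  apiKey: process.env.TOGETHER_API_KEY!,
  baseURL: 'https://api.together.xyz/v1'
});

// If this call succeeds, the provider speaks the OpenAI API format;
// it also shows the exact model identifiers to use in requests
async function checkCompatibility() {
  for await (const model of client.models.list()) {
    console.log(model.id);
  }
}

checkCompatibility();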

Step 2: Add Custom Provider to RunForge

  1. Open RunForge and sign in
  2. Select your project from the project dropdown
  3. Go to Settings → API Keys

Configure Custom Provider

  1. Click "Add API Key" or "Add Provider Key"
  2. Select "Custom" from the provider dropdown
  3. Fill out the configuration:
  4. Provider Name: "Together AI" (or whatever you're adding)
  5. Base URL: https://api.together.xyz/v1 (your provider's endpoint)
  6. API Key: Your provider's API key
  7. Key Name: "Production Together Key"
  8. Description: "Together AI for open-source models"

  9. Advanced options (if available):

  10. Headers: Custom headers required by provider
  11. Auth type: Bearer token, Basic auth, custom
  12. Rate limits: Provider-specific limits

  13. Test the connection and Save

Step 3: Configure Your Application

TypeScript/JavaScript Setup

Use the OpenAI SDK with custom base URL:

npm install openai @runforge/sdk

Environment variables (.env file):

# Custom Provider (e.g., Together AI)
TOGETHER_API_KEY=your-together-api-key-here
TOGETHER_BASE_URL=https://api.together.xyz/v1

# RunForge Configuration
RUNFORGE_API_KEY=your-runforge-ingest-key-here
RUNFORGE_PROJECT_ID=your-project-id-here
RUNFORGE_ENDPOINT=http://localhost:3000/api/ingest

Basic usage example:

import OpenAI from 'openai';
import { RunForge } from '@runforge/sdk';

// Configure for custom provider
const togetherAI = new OpenAI({
  apiKey: process.env.TOGETHER_API_KEY!,
  baseURL: process.env.TOGETHER_BASE_URL!,
});

const runforge = new RunForge({
  apiKey: process.env.RUNFORGE_API_KEY!,
  projectId: process.env.RUNFORGE_PROJECT_ID!,
  endpoint: process.env.RUNFORGE_ENDPOINT!
});

async function generateWithCustomProvider(prompt: string) {
  const result = await runforge.track(
    { 
      experiment: 'together-ai-test',
      provider: 'together',
      model: 'llama-2-7b-chat' // Custom tracking info
    },
    () => togetherAI.chat.completions.create({
      model: 'meta-llama/Llama-2-7b-chat-hf', // Provider's exact model name
      messages: [
        { role: 'system', content: 'You are a helpful assistant.' },
        { role: 'user', content: prompt }
      ],
      temperature: 0.7,
      max_tokens: 500
    })
  );

  return result.choices[0].message.content;
}

Python Setup

pip install openai runforge

Environment setup:

export TOGETHER_API_KEY=your-together-api-key-here
export TOGETHER_BASE_URL=https://api.together.xyz/v1
export RUNFORGE_API_KEY=your-runforge-ingest-key-here
export RUNFORGE_PROJECT_ID=your-project-id-here
export RUNFORGE_ENDPOINT=http://localhost:3000/api/ingest

Basic usage example:

import os
from openai import OpenAI
from runforge import RunForge

# Configure for custom provider
client = OpenAI(
    api_key=os.environ['TOGETHER_API_KEY'],
    base_url=os.environ['TOGETHER_BASE_URL']
)

rf = RunForge(
    api_key=os.environ['RUNFORGE_API_KEY'],
    project_id=os.environ['RUNFORGE_PROJECT_ID'],
    endpoint=os.environ['RUNFORGE_ENDPOINT']
)

def generate_with_custom_provider(prompt):
    result = rf.track(
        {
            "experiment": "together-ai-test",
            "provider": "together",
            "model": "llama-2-7b-chat"
        },
        lambda: client.chat.completions.create(
            model="meta-llama/Llama-2-7b-chat-hf",
            messages=[
                {"role": "system", "content": "You are a helpful assistant."},
                {"role": "user", "content": prompt}
            ],
            temperature=0.7,
            max_tokens=500
        )
    )

    return result.choices[0].message.content

Step 4: Provider-Specific Configuration

Together AI

const together = new OpenAI({
  apiKey: process.env.TOGETHER_API_KEY!,
  baseURL: 'https://api.together.xyz/v1'
});

// Popular models:
// - meta-llama/Llama-2-70b-chat-hf (high quality)
// - meta-llama/Llama-2-13b-chat-hf (balanced)  
// - meta-llama/Llama-2-7b-chat-hf (fast/cheap)

Fireworks AI

const fireworks = new OpenAI({
  apiKey: process.env.FIREWORKS_API_KEY!,
  baseURL: 'https://api.fireworks.ai/inference/v1'
});

// Popular models:
// - accounts/fireworks/models/llama-v2-70b-chat
// - accounts/fireworks/models/mistral-7b-instruct-4k

Replicate

// Replicate uses a different API format - requires special handling
import Replicate from 'replicate';

const replicate = new Replicate({
  auth: process.env.REPLICATE_API_TOKEN!
});

// Custom wrapper for RunForge tracking
async function replicateWithTracking(model: string, input: any) {
  return await runforge.track(
    { experiment: 'replicate-test', provider: 'replicate' },
    () => replicate.run(model, { input })
  );
}
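
Calling the wrapper then looks like the sketch below. The model name is illustrative; Replicate models are addressed as owner/name, and language models there often return an array of text chunks rather than a single string, so normalize the output before use:

// Model name and input shape are illustrative — check the model's page
// on Replicate for its exact input schema
const output = await replicateWithTracking('meta/llama-2-7b-chat', {
  prompt: 'Summarize the benefits of self-hosted models.'
});

// Many Replicate language models return an array of text chunks
const text = Array.isArray(output) ? output.join('') : String(output);
console.log(text);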

Local Ollama Setup

const ollama = new OpenAI({
  apiKey: 'ollama', // Ollama doesn't require a real API key
  baseURL: 'http://localhost:11434/v1' // Default Ollama endpoint
});

// Available models depend on what you've downloaded locally
// Common ones: llama2, codellama, mistral, phi
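
A minimal call looks the same as for any other OpenAI-compatible provider. This sketch assumes you have already downloaded the model with `ollama pull llama2` and that Ollama is serving locally:

// Assumes `ollama pull llama2` has been run and Ollama is running locally
const reply = await ollama.chat.completions.create({
  model: 'llama2',
  messages: [{ role: 'user', content: 'Why is the sky blue?' }]
});

console.log(reply.choices[0].message.content);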

Azure OpenAI

const azureOpenAI = new OpenAI({
  apiKey: process.env.AZURE_OPENAI_API_KEY!,
  baseURL: `https://${process.env.AZURE_RESOURCE_NAME}.openai.azure.com/openai/deployments/${process.env.AZURE_DEPLOYMENT_NAME}`,
  defaultQuery: { 'api-version': '2023-12-01-preview' },
  defaultHeaders: {
    'api-key': process.env.AZURE_OPENAI_API_KEY!
  }
});
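
One Azure-specific wrinkle: requests are routed by the deployment name embedded in the URL, so where other providers take a model ID you conventionally pass the deployment name. A minimal sketch:

// Azure routes by the deployment in the URL path; passing the deployment
// name as the model value keeps request logs consistent
const azureReply = await azureOpenAI.chat.completions.create({
  model: process.env.AZURE_DEPLOYMENT_NAME!,
  messages: [{ role: 'user', content: 'Hello from Azure' }]
});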

Step 5: Handle Provider-Specific Features

Different Authentication Methods

Bearer Token (most common):

const provider = new OpenAI({
  apiKey: process.env.PROVIDER_API_KEY!,
  baseURL: 'https://api.provider.com/v1'
});

Custom Headers:

const provider = new OpenAI({
  apiKey: process.env.PROVIDER_API_KEY!,
  baseURL: 'https://api.provider.com/v1',
  defaultHeaders: {
    'X-Custom-Auth': process.env.PROVIDER_SECRET!,
    'User-Agent': 'MyApp/1.0'
  }
});

Basic Auth:

// The OpenAI SDK sends `Authorization: Bearer <apiKey>` by default, so plain
// "user:password" in apiKey won't work — override the header with base64 credentials
const provider = new OpenAI({
  apiKey: 'unused', // still required by the SDK constructor
  baseURL: 'https://api.provider.com/v1',
  defaultHeaders: {
    'Authorization': `Basic ${Buffer.from(`${username}:${password}`).toString('base64')}`
  }
});

Non-Standard Response Formats

Some providers may return slightly different response formats:

async function handleCustomResponse(provider: OpenAI, model: string, messages: any[]) {
  try {
    // Type as `any` so non-standard fields can be probed without compile errors
    const result: any = await runforge.track(
      { experiment: 'custom-provider' },
      () => provider.chat.completions.create({
        model: model,
        messages: messages
      })
    );

    // Standard OpenAI format
    if (result.choices && result.choices[0].message) {
      return result.choices[0].message.content;
    }

    // Handle custom format
    if (result.text) {
      return result.text;
    }

    // Handle other variations
    if (result.output) {
      return result.output;
    }

    throw new Error('Unexpected response format');

  } catch (error) {
    console.error('Provider error:', error);
    throw error;
  }
}

Step 6: Test and Validate

Testing Checklist

  1. Authentication works: No auth errors
  2. Models are available: Can make successful requests
  3. Response format: Compatible with your application
  4. RunForge tracking: Data appears in dashboard
  5. Cost tracking: Costs are recorded (may be estimated)
  6. Error handling: Graceful failure for network issues

Validation Script

async function validateCustomProvider() {
  const testPrompt = "Hello! Please respond with 'Provider test successful' if you can see this.";

  try {
    console.log('Testing custom provider...');

    const response = await generateWithCustomProvider(testPrompt);
    console.log('Response:', response);

    if (response && response.toLowerCase().includes('successful')) {
      console.log('✅ Provider test passed');
    } else {
      console.log('⚠️  Provider responded but format may be unexpected');
    }

    console.log('Check RunForge dashboard for tracking data');

  } catch (error) {
    console.error('❌ Provider test failed:', error.message);
  }
}

validateCustomProvider();

Common Integration Patterns

Multi-Provider Fallback

const providers = [
  { name: 'primary', client: primaryProvider, models: ['model-1'] },
  { name: 'backup', client: backupProvider, models: ['backup-model-1'] },
  { name: 'fallback', client: fallbackProvider, models: ['fallback-model'] }
];

async function robustGeneration(prompt: string) {
  for (const [attempt, provider] of providers.entries()) {
    try {
      return await runforge.track(
        { 
          experiment: 'multi-provider', 
          variant: provider.name,
          fallback_attempt: attempt > 0 // only flag retries, not the primary attempt
        },
        () => provider.client.chat.completions.create({
          model: provider.models[0],
          messages: [{ role: 'user', content: prompt }]
        })
      );
    } catch (error) {
      console.log(`${provider.name} failed, trying next provider...`);
      continue;
    }
  }

  throw new Error('All providers failed');
}

Cost-Aware Model Selection

const modelPricing = {
  'expensive-but-good': 0.02, // per 1K tokens
  'balanced': 0.01,
  'cheap-and-fast': 0.002
};

function selectModelByBudget(maxCostPerRequest: number, estimatedTokens: number) {
  // Entries are listed from highest to lowest quality, so the first model
  // whose estimated request cost fits the budget is the best affordable one
  for (const [model, pricePer1K] of Object.entries(modelPricing)) {
    const estimatedCost = (estimatedTokens / 1000) * pricePer1K;
    if (estimatedCost <= maxCostPerRequest) {
      return model;
    }
  }

  return 'cheap-and-fast'; // Default to cheapest
}
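
For example, with a roughly 2,000-token request:

// ~2,000 tokens with a $0.05 budget: the best model fits ($0.04 estimated)
selectModelByBudget(0.05, 2000); // 'expensive-but-good'

// Same request with a $0.01 budget: only the cheapest model fits ($0.004)
selectModelByBudget(0.01, 2000); // 'cheap-and-fast'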

Load Balancing

class LoadBalancedProvider {
  private providers: OpenAI[] = [];
  private currentIndex = 0;

  addProvider(provider: OpenAI) {
    this.providers.push(provider);
  }

  async generate(messages: any[]) {
    // Capture the index before advancing so tracking records the provider actually used
    const index = this.currentIndex;
    const provider = this.providers[index];
    this.currentIndex = (this.currentIndex + 1) % this.providers.length;

    return await runforge.track(
      { 
        experiment: 'load-balanced',
        provider_index: index 
      },
      () => provider.chat.completions.create({
        model: 'your-model',
        messages: messages
      })
    );
  }
}
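
Usage is just a matter of registering clients. A sketch using the two illustrative clients from Step 4 (note that the class above hardcodes 'your-model', so the registered providers would need to serve a model under a common name):

// Requests round-robin across Together AI and Fireworks
const balancer = new LoadBalancedProvider();
balancer.addProvider(togetherAI);
balancer.addProvider(fireworks);

const reply = await balancer.generate([
  { role: 'user', content: 'Name one advantage of load balancing.' }
]);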

Troubleshooting Custom Providers

Authentication Issues

Symptoms: 401, 403, or "Invalid API key" errors

Solutions:

  1. Verify the API key is correct and active
  2. Check if special headers are required
  3. Confirm you have access to the specific models
  4. Review the provider's documentation for auth requirements

Model Not Found

Symptoms: "Model not found" or 404 errors
Solutions:

  1. Check exact model name spelling and casing
  2. Verify the model is available in your region/account
  3. Confirm the model hasn't been deprecated
  4. Try listing available models via the provider's API

Incompatible Response Format

Symptoms: Parsing errors, missing fields in response

Solutions:

  1. Log the full response to understand the format
  2. Add response format handling for this provider
  3. Check if the provider has an OpenAI compatibility mode
  4. Consider using the provider's native SDK instead

Cost Tracking Issues

Symptoms: $0.00 costs in the RunForge dashboard

Solutions:

  1. Add manual cost estimation based on tokens
  2. Use the provider's usage API to get actual costs
  3. Configure RunForge with the provider's pricing info
  4. Consider using providers with exact cost reporting
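
For manual estimation, a minimal sketch that derives cost from the usage block most OpenAI-compatible providers return. The per-token rates here are placeholders — substitute your provider's published pricing:

// Placeholder rates — substitute your provider's published pricing
const PROMPT_COST_PER_1K = 0.0002;
const COMPLETION_COST_PER_1K = 0.0002;

function estimateCost(usage: { prompt_tokens: number; completion_tokens: number }) {
  return (
    (usage.prompt_tokens / 1000) * PROMPT_COST_PER_1K +
    (usage.completion_tokens / 1000) * COMPLETION_COST_PER_1K
  );
}

// Most OpenAI-compatible responses include a usage block
const completion = await togetherAI.chat.completions.create({
  model: 'meta-llama/Llama-2-7b-chat-hf',
  messages: [{ role: 'user', content: 'Hello!' }]
});

if (completion.usage) {
  console.log(`Estimated cost: $${estimateCost(completion.usage).toFixed(6)}`);
}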

Best Practices

Documentation

  • Document your setup: Keep notes on configuration
  • Track model performance: Compare quality across providers
  • Monitor costs: Many custom providers don't report exact costs
  • Version your configuration: Keep track of API changes

Monitoring

  • Set up alerts: Budget, error rate, performance alerts
  • Track provider uptime: Monitor availability across providers
  • Quality assurance: Test outputs periodically
  • Cost comparison: Regular analysis of provider costs

Security

  • Secure API keys: Use environment variables and secret management
  • Network security: Ensure encrypted connections (HTTPS)
  • Data privacy: Understand where your data is processed
  • Compliance: Verify provider meets your regulatory requirements

Next Steps