Custom Provider Setup Guide

What You'll Learn

How to connect other OpenAI-compatible AI services to RunForge, including local models, enterprise providers, and specialized AI services.

Why Use Custom Providers?

Cost Savings

  • Local models: Run AI on your own hardware with no per-request costs
  • Enterprise deals: Custom pricing from providers not in standard lists
  • Regional providers: Often cheaper in specific geographic regions
  • Specialized services: Purpose-built models that may be more cost-effective

Control and Privacy

  • On-premises deployment: Keep all data within your infrastructure
  • Custom models: Use fine-tuned models specific to your domain
  • Regional compliance: Meet data residency requirements
  • Enterprise features: SLAs, dedicated support, custom terms

Access to New Models

  • Early access: New providers before they're officially supported
  • Experimental models: Research or beta models
  • Niche providers: Specialized in your industry or use case
  • Open source models: Self-hosted versions of popular models

Compatible Services

  • Together AI: Fast inference for open-source models
  • Fireworks AI: High-performance model serving
  • Replicate: Easy access to thousands of models
  • Perplexity: Search-enhanced language models
  • Groq: Ultra-fast inference hardware
  • Mistral AI: European provider with strong models
  • Cohere: Enterprise-focused AI platform

Self-Hosted Solutions

  • Ollama: Run models locally on your machine
  • Text Generation WebUI: Web interface for local models
  • FastChat: Self-hosted chatbot with OpenAI API
  • LocalAI: Local OpenAI API alternative
  • LM Studio: Desktop app for running local models

Enterprise Platforms

  • Azure OpenAI: Microsoft's managed OpenAI service
  • AWS Bedrock: Amazon's managed AI service
  • Google Vertex AI: Google's enterprise AI platform
  • IBM Watson: Enterprise AI with custom models

Before You Start

  • ✅ Account with your chosen provider (or local setup complete)
  • ✅ API endpoint URL and authentication method
  • ✅ RunForge project created (see Getting Started Guide)
  • ✅ Understanding of your provider's API format
  • ✅ 10-15 minutes of time

Step 1: Gather Provider Information

Required Information

For any custom provider, you'll need:

  1. Base URL: The API endpoint (e.g., https://api.together.xyz/v1)
  2. Authentication: API key, token, or other auth method
  3. Model names: Exact model identifiers used by the provider
  4. API format: Confirm it's OpenAI-compatible
  5. Pricing: Cost per token (if available)

Finding Provider Details

Check the provider's documentation for:

  • API reference or developer docs
  • OpenAI compatibility information
  • Authentication requirements
  • Available model list
  • Pricing information

Example for Together AI:

  • Base URL: https://api.together.xyz/v1
  • Auth: Bearer token via API key
  • Models: meta-llama/Llama-2-7b-chat-hf, etc.
  • Compatible: Yes (OpenAI chat completions format)
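
Before writing any application code, you can sanity-check compatibility by pointing the standard OpenAI SDK at the provider's base URL and listing models. A minimal sketch, assuming the provider implements the standard /models endpoint (most OpenAI-compatible services do):

import OpenAI from 'openai';

const client = new OpenAI({
  apiKey: process.env.TOGETHER_API_KEY!,
  baseURL: 'https://api.together.xyz/v1'
});

// If this call succeeds, the provider speaks the OpenAI API format;
// it also shows the exact model identifiers to use in requests
async function checkCompatibility() {
  for await (const model of client.models.list()) {
    console.log(model.id);
  }
}

checkCompatibility();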

Step 2: Add Custom Provider to RunForge

  1. Open RunForge and sign in
  2. Select your project from the project dropdown
  3. Go to Settings → API Keys

Configure Custom Provider

  1. Click "Add API Key" or "Add Provider Key"
  2. Select "Custom" from the provider dropdown
  3. Fill out the configuration:
  4. Provider Name: "Together AI" (or whatever you're adding)
  5. Base URL: https://api.together.xyz/v1 (your provider's endpoint)
  6. API Key: Your provider's API key
  7. Key Name: "Production Together Key"
  8. Description: "Together AI for open-source models"

  9. Advanced options (if available):

  10. Headers: Custom headers required by provider
  11. Auth type: Bearer token, Basic auth, custom
  12. Rate limits: Provider-specific limits

  13. Test the connection and Save

Step 3: Configure Your Application

TypeScript/JavaScript Setup

Use the OpenAI SDK with custom base URL:

npm install openai @runforge/sdk

Environment variables (.env file):

# Custom Provider (e.g., Together AI)
TOGETHER_API_KEY=your-together-api-key-here
TOGETHER_BASE_URL=https://api.together.xyz/v1

# RunForge Configuration
RUNFORGE_API_KEY=your-runforge-ingest-key-here
RUNFORGE_PROJECT_ID=your-project-id-here
RUNFORGE_ENDPOINT=http://localhost:3000/api/ingest

Basic usage example:

import OpenAI from 'openai';
import { RunForge } from '@runforge/sdk';

// Configure for custom provider
const togetherAI = new OpenAI({
  apiKey: process.env.TOGETHER_API_KEY!,
  baseURL: process.env.TOGETHER_BASE_URL!,
});

const runforge = new RunForge({
  apiKey: process.env.RUNFORGE_API_KEY!,
  projectId: process.env.RUNFORGE_PROJECT_ID!,
  endpoint: process.env.RUNFORGE_ENDPOINT!
});

async function generateWithCustomProvider(prompt: string) {
  const result = await runforge.track(
    { 
      experiment: 'together-ai-test',
      provider: 'together',
      model: 'llama-2-7b-chat' // Custom tracking info
    },
    () => togetherAI.chat.completions.create({
      model: 'meta-llama/Llama-2-7b-chat-hf', // Provider's exact model name
      messages: [
        { role: 'system', content: 'You are a helpful assistant.' },
        { role: 'user', content: prompt }
      ],
      temperature: 0.7,
      max_tokens: 500
    })
  );

  return result.choices[0].message.content;
}

Python Setup

pip install openai runforge

Environment setup:

export TOGETHER_API_KEY=your-together-api-key-here
export TOGETHER_BASE_URL=https://api.together.xyz/v1
export RUNFORGE_API_KEY=your-runforge-ingest-key-here
export RUNFORGE_PROJECT_ID=your-project-id-here
export RUNFORGE_ENDPOINT=http://localhost:3000/api/ingest

Basic usage example:

import os
from openai import OpenAI
from runforge import RunForge

# Configure for custom provider
client = OpenAI(
    api_key=os.environ['TOGETHER_API_KEY'],
    base_url=os.environ['TOGETHER_BASE_URL']
)

rf = RunForge(
    api_key=os.environ['RUNFORGE_API_KEY'],
    project_id=os.environ['RUNFORGE_PROJECT_ID'],
    endpoint=os.environ['RUNFORGE_ENDPOINT']
)

def generate_with_custom_provider(prompt):
    result = rf.track(
        {
            "experiment": "together-ai-test",
            "provider": "together",
            "model": "llama-2-7b-chat"
        },
        lambda: client.chat.completions.create(
            model="meta-llama/Llama-2-7b-chat-hf",
            messages=[
                {"role": "system", "content": "You are a helpful assistant."},
                {"role": "user", "content": prompt}
            ],
            temperature=0.7,
            max_tokens=500
        )
    )

    return result.choices[0].message.content

Step 4: Provider-Specific Configuration

Together AI

const together = new OpenAI({
  apiKey: process.env.TOGETHER_API_KEY!,
  baseURL: 'https://api.together.xyz/v1'
});

// Popular models:
// - meta-llama/Llama-2-70b-chat-hf (high quality)
// - meta-llama/Llama-2-13b-chat-hf (balanced)  
// - meta-llama/Llama-2-7b-chat-hf (fast/cheap)

Fireworks AI

const fireworks = new OpenAI({
  apiKey: process.env.FIREWORKS_API_KEY!,
  baseURL: 'https://api.fireworks.ai/inference/v1'
});

// Popular models:
// - accounts/fireworks/models/llama-v2-70b-chat
// - accounts/fireworks/models/mistral-7b-instruct-4k

Replicate

// Replicate uses a different API format - requires special handling
import Replicate from 'replicate';

const replicate = new Replicate({
  auth: process.env.REPLICATE_API_TOKEN!
});

// Custom wrapper for RunForge tracking
async function replicateWithTracking(model: string, input: any) {
  return await runforge.track(
    { experiment: 'replicate-test', provider: 'replicate' },
    () => replicate.run(model, { input })
  );
}
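
Calling the wrapper then looks like the sketch below. The model name is illustrative; Replicate models are addressed as owner/name, and language models there often return an array of text chunks rather than a single string, so normalize the output before use:

// Model name and input shape are illustrative — check the model's page
// on Replicate for its exact input schema
const output = await replicateWithTracking('meta/llama-2-7b-chat', {
  prompt: 'Summarize the benefits of self-hosted models.'
});

// Many Replicate language models return an array of text chunks
const text = Array.isArray(output) ? output.join('') : String(output);
console.log(text);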

Local Ollama Setup

const ollama = new OpenAI({
  apiKey: 'ollama', // Ollama doesn't require a real API key
  baseURL: 'http://localhost:11434/v1' // Default Ollama endpoint
});

// Available models depend on what you've downloaded locally
// Common ones: llama2, codellama, mistral, phi
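
A minimal call looks the same as for any other OpenAI-compatible provider. This sketch assumes you have already downloaded the model with `ollama pull llama2` and that Ollama is serving locally:

// Assumes `ollama pull llama2` has been run and Ollama is running locally
const reply = await ollama.chat.completions.create({
  model: 'llama2',
  messages: [{ role: 'user', content: 'Why is the sky blue?' }]
});

console.log(reply.choices[0].message.content);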

Azure OpenAI

const azureOpenAI = new OpenAI({
  apiKey: process.env.AZURE_OPENAI_API_KEY!,
  baseURL: `https://${process.env.AZURE_RESOURCE_NAME}.openai.azure.com/openai/deployments/${process.env.AZURE_DEPLOYMENT_NAME}`,
  defaultQuery: { 'api-version': '2023-12-01-preview' },
  defaultHeaders: {
    'api-key': process.env.AZURE_OPENAI_API_KEY!
  }
});
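
One Azure-specific wrinkle: requests are routed by the deployment name embedded in the URL, so where other providers take a model ID you conventionally pass the deployment name. A minimal sketch:

// Azure routes by the deployment in the URL path; passing the deployment
// name as the model value keeps request logs consistent
const azureReply = await azureOpenAI.chat.completions.create({
  model: process.env.AZURE_DEPLOYMENT_NAME!,
  messages: [{ role: 'user', content: 'Hello from Azure' }]
});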

Step 5: Handle Provider-Specific Features

Different Authentication Methods

Bearer Token (most common):

const provider = new OpenAI({
  apiKey: process.env.PROVIDER_API_KEY!,
  baseURL: 'https://api.provider.com/v1'
});

Custom Headers:

const provider = new OpenAI({
  apiKey: process.env.PROVIDER_API_KEY!,
  baseURL: 'https://api.provider.com/v1',
  defaultHeaders: {
    'X-Custom-Auth': process.env.PROVIDER_SECRET!,
    'User-Agent': 'MyApp/1.0'
  }
});

Basic Auth:

// The OpenAI SDK sends `Authorization: Bearer <apiKey>` by default, so plain
// "user:password" in apiKey won't work — override the header with base64 credentials
const provider = new OpenAI({
  apiKey: 'unused', // still required by the SDK constructor
  baseURL: 'https://api.provider.com/v1',
  defaultHeaders: {
    'Authorization': `Basic ${Buffer.from(`${username}:${password}`).toString('base64')}`
  }
});

Non-Standard Response Formats

Some providers may return slightly different response formats:

async function handleCustomResponse(provider: OpenAI, model: string, messages: any[]) {
  try {
    // Type as `any` so non-standard fields can be probed without compile errors
    const result: any = await runforge.track(
      { experiment: 'custom-provider' },
      () => provider.chat.completions.create({
        model: model,
        messages: messages
      })
    );

    // Standard OpenAI format
    if (result.choices && result.choices[0].message) {
      return result.choices[0].message.content;
    }

    // Handle custom format
    if (result.text) {
      return result.text;
    }

    // Handle other variations
    if (result.output) {
      return result.output;
    }

    throw new Error('Unexpected response format');

  } catch (error) {
    console.error('Provider error:', error);
    throw error;
  }
}

Step 6: Test and Validate

Testing Checklist

  1. Authentication works: No auth errors
  2. Models are available: Can make successful requests
  3. Response format: Compatible with your application
  4. RunForge tracking: Data appears in dashboard
  5. Cost tracking: Costs are recorded (may be estimated)
  6. Error handling: Graceful failure for network issues

Validation Script

async function validateCustomProvider() {
  const testPrompt = "Hello! Please respond with 'Provider test successful' if you can see this.";

  try {
    console.log('Testing custom provider...');

    const response = await generateWithCustomProvider(testPrompt);
    console.log('Response:', response);

    if (response && response.toLowerCase().includes('successful')) {
      console.log('✅ Provider test passed');
    } else {
      console.log('⚠️  Provider responded but format may be unexpected');
    }

    console.log('Check RunForge dashboard for tracking data');

  } catch (error) {
    console.error('❌ Provider test failed:', error.message);
  }
}

validateCustomProvider();

Common Integration Patterns

Multi-Provider Fallback

const providers = [
  { name: 'primary', client: primaryProvider, models: ['model-1'] },
  { name: 'backup', client: backupProvider, models: ['backup-model-1'] },
  { name: 'fallback', client: fallbackProvider, models: ['fallback-model'] }
];

async function robustGeneration(prompt: string) {
  for (const [attempt, provider] of providers.entries()) {
    try {
      return await runforge.track(
        { 
          experiment: 'multi-provider', 
          variant: provider.name,
          fallback_attempt: attempt > 0 // only flag retries, not the primary attempt
        },
        () => provider.client.chat.completions.create({
          model: provider.models[0],
          messages: [{ role: 'user', content: prompt }]
        })
      );
    } catch (error) {
      console.log(`${provider.name} failed, trying next provider...`);
      continue;
    }
  }

  throw new Error('All providers failed');
}

Cost-Aware Model Selection

const modelPricing = {
  'expensive-but-good': 0.02, // per 1K tokens
  'balanced': 0.01,
  'cheap-and-fast': 0.002
};

function selectModelByBudget(maxCostPerRequest: number, estimatedTokens: number) {
  // Entries are listed from highest to lowest quality, so the first model
  // whose estimated request cost fits the budget is the best affordable one
  for (const [model, pricePer1K] of Object.entries(modelPricing)) {
    const estimatedCost = (estimatedTokens / 1000) * pricePer1K;
    if (estimatedCost <= maxCostPerRequest) {
      return model;
    }
  }

  return 'cheap-and-fast'; // Default to cheapest
}
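
For example, with a roughly 2,000-token request:

// ~2,000 tokens with a $0.05 budget: the best model fits ($0.04 estimated)
selectModelByBudget(0.05, 2000); // 'expensive-but-good'

// Same request with a $0.01 budget: only the cheapest model fits ($0.004)
selectModelByBudget(0.01, 2000); // 'cheap-and-fast'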

Load Balancing

class LoadBalancedProvider {
  private providers: OpenAI[] = [];
  private currentIndex = 0;

  addProvider(provider: OpenAI) {
    this.providers.push(provider);
  }

  async generate(messages: any[]) {
    // Capture the index before advancing so tracking records the provider actually used
    const index = this.currentIndex;
    const provider = this.providers[index];
    this.currentIndex = (this.currentIndex + 1) % this.providers.length;

    return await runforge.track(
      { 
        experiment: 'load-balanced',
        provider_index: index 
      },
      () => provider.chat.completions.create({
        model: 'your-model',
        messages: messages
      })
    );
  }
}
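
Usage is just a matter of registering clients. A sketch using the two illustrative clients from Step 4 (note that the class above hardcodes 'your-model', so the registered providers would need to serve a model under a common name):

// Requests round-robin across Together AI and Fireworks
const balancer = new LoadBalancedProvider();
balancer.addProvider(togetherAI);
balancer.addProvider(fireworks);

const reply = await balancer.generate([
  { role: 'user', content: 'Name one advantage of load balancing.' }
]);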

Troubleshooting Custom Providers

Authentication Issues

Symptoms: 401, 403, or "Invalid API key" errors

Solutions:

  1. Verify the API key is correct and active
  2. Check if special headers are required
  3. Confirm you have access to the specific models
  4. Review the provider's documentation for auth requirements

Model Not Found

Symptoms: "Model not found" or 404 errors
Solutions:

  1. Check exact model name spelling and casing
  2. Verify the model is available in your region/account
  3. Confirm the model hasn't been deprecated
  4. Try listing available models via the provider's API

Incompatible Response Format

Symptoms: Parsing errors, missing fields in response

Solutions:

  1. Log the full response to understand the format
  2. Add response format handling for this provider
  3. Check if the provider has an OpenAI compatibility mode
  4. Consider using the provider's native SDK instead

Cost Tracking Issues

Symptoms: $0.00 costs in the RunForge dashboard

Solutions:

  1. Add manual cost estimation based on tokens
  2. Use the provider's usage API to get actual costs
  3. Configure RunForge with the provider's pricing info
  4. Consider using providers with exact cost reporting
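
For manual estimation, a minimal sketch that derives cost from the usage block most OpenAI-compatible providers return. The per-token rates here are placeholders — substitute your provider's published pricing:

// Placeholder rates — substitute your provider's published pricing
const PROMPT_COST_PER_1K = 0.0002;
const COMPLETION_COST_PER_1K = 0.0002;

function estimateCost(usage: { prompt_tokens: number; completion_tokens: number }) {
  return (
    (usage.prompt_tokens / 1000) * PROMPT_COST_PER_1K +
    (usage.completion_tokens / 1000) * COMPLETION_COST_PER_1K
  );
}

// Most OpenAI-compatible responses include a usage block
const completion = await togetherAI.chat.completions.create({
  model: 'meta-llama/Llama-2-7b-chat-hf',
  messages: [{ role: 'user', content: 'Hello!' }]
});

if (completion.usage) {
  console.log(`Estimated cost: $${estimateCost(completion.usage).toFixed(6)}`);
}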

Best Practices

Documentation

  • Document your setup: Keep notes on configuration
  • Track model performance: Compare quality across providers
  • Monitor costs: Many custom providers don't report exact costs
  • Version your configuration: Keep track of API changes

Monitoring

  • Set up alerts: Budget, error rate, performance alerts
  • Track provider uptime: Monitor availability across providers
  • Quality assurance: Test outputs periodically
  • Cost comparison: Regular analysis of provider costs

Security

  • Secure API keys: Use environment variables and secret management
  • Network security: Ensure encrypted connections (HTTPS)
  • Data privacy: Understand where your data is processed
  • Compliance: Verify provider meets your regulatory requirements

Next Steps