Skip to content

[Feature] Together AI Platform Integration with Native Provider Support #82

@Rohitjoshi9023

Description

@Rohitjoshi9023

Summary

Add native support for Together AI platform as a first-class provider in the Copilot SDK. This includes seamless integration with Together AI's diverse model catalog, real-time inference API, and enterprise-grade features like batch processing and usage monitoring.

Problem / Use Case

  1. Growing Together AI Adoption: More teams are migrating to Together AI for cost-effective, open-source model inference at scale
  2. Feature Parity: Currently, accessing Together AI requires manual API integration without built-in fallback chains or routing strategies (e.g., falling back to open-source models when proprietary APIs are rate-limited)
  3. Consistency Gaps: No standardized way to handle Together AI alongside other providers (OpenAI, Anthropic, etc.) in a single SDK

Proposed Solution

1. Together AI Provider Implementation

  • Add TogetherAIProvider class with full support for:
    • Text Generation: 70+ open-source models (Llama, Mistral, Falcon, etc.)
    • Batch Processing API: Asynchronous batch job submission for large-scale processing
    • Embeddings: Via Together AI embeddings endpoint
    • Streaming: Real-time token streaming with proper error handling

2. Model Registry Integration

  • Register Together AI models in the SDK's model registry with metadata:
    • Input/output token limits
    • Pricing per 1M tokens
    • Context window size
    • Quantization levels available

3. Fallback & Routing Compatibility

  • Enable fallback chains across Together AI models (e.g., fallback from Llama-70B to Llama-7B on rate limits)
  • Support routing strategies (priority, round-robin) specifically optimized for Together AI's load balancing

4. Authentication & Configuration

  • Environment variable support: TOGETHER_API_KEY
  • Configurable base URL for self-hosted Together inference endpoints
  • Request timeout and retry logic aligned with Together's SLA

Alternatives Considered

  1. Using Together AI's SDK directly - Lacks centralized error handling and routing across multiple providers
  2. Generic HTTP client - Would require duplicating error handling and rate limit logic
  3. Async wrapper layer - Insufficient for seamless integration with existing fallback/routing infrastructure

Additional Context

  • Together AI API Reference: https://docs.together.ai/reference
  • Community request for multi-provider support with open-source models
  • Competitive advantage: Better support for cost-sensitive applications## Summary

Metadata

Metadata

Assignees

No one assigned

    Labels

    enhancementNew feature or request

    Type

    No type
    No fields configured for issues without a type.

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions