Add OpenAI API compatible LLM support with automatic fallback#51
Conversation
FYI: Feature Request: Add OpenAI-Compatible LLM API Endpoints for Universal Access

Feature Summary
Implement support for OpenAI-compatible LLM API endpoints (e.g., ChatGPT, Perplexity, and others) alongside the existing Gemini integration. This would allow users to configure and use a variety of LLM providers through a standardized OpenAI API format, enabling broader accessibility and flexibility.

Problem Statement
The current implementation is limited to Google's Gemini API, which restricts users who may not have access to a Gemini API key or who prefer alternative providers. This creates barriers for users with keys from other services such as OpenAI's ChatGPT or free/open-source alternatives, limiting the app's universality and adoption as an open-source tool for clipping and processing content from platforms like TikTok.

Proposed Solution
Add configurable LLM API endpoints that follow the OpenAI API specification as a standard interface. This would involve:
This approach would make the app more inclusive, as users could easily switch between providers based on availability, cost, or performance.

Use Cases
This feature would benefit a wide range of users by removing API provider lock-in and promoting a truly universal, open-source clipping platform.
Specific use cases:
Co-authored-by: sdntsng <19806109+sdntsng@users.noreply.github.com>
Pull request overview
This PR implements multi-provider LLM support for Vinci Clips, enabling the use of OpenAI, Perplexity, and other OpenAI-compatible APIs alongside the existing Google Gemini integration. The implementation includes intelligent provider selection with automatic fallback, maintains backward compatibility with existing Gemini configurations, and provides comprehensive documentation.
Changes:
- Added new LLM abstraction layer to support multiple providers (Gemini, OpenAI, OpenAI-compatible APIs)
- Implemented automatic provider fallback when primary provider fails
- Added configuration endpoint to check current LLM provider status
- Updated all LLM-dependent routes to use the new service abstraction
- Comprehensive documentation for setup and migration
Reviewed changes
Copilot reviewed 10 out of 11 changed files in this pull request and generated 9 comments.
| File | Description |
|---|---|
| `backend/src/services/llmService.js` | New abstraction layer providing a unified interface for all LLM providers with fallback logic |
| `backend/src/routes/upload.js` | Updated to use the LLM service for transcription instead of direct Gemini API calls |
| `backend/src/routes/import.js` | Updated to use the LLM service for URL-imported video transcription |
| `backend/src/routes/retry-transcription.js` | Updated to use the LLM service for retry operations |
| `backend/src/routes/analyze.js` | Updated to use the LLM service for transcript analysis |
| `backend/src/routes/llm-info.js` | New endpoint providing LLM provider configuration information |
| `backend/src/routes/index.js` | Registered the new LLM info route |
| `backend/package.json` | Added OpenAI SDK dependency (v5.23.1) |
| `backend/package-lock.json` | Dependency lock file updated for the OpenAI package |
| `backend/.env.example` | Updated with comprehensive LLM provider configuration examples |
| `LLM_PROVIDERS.md` | New comprehensive documentation for LLM provider setup and usage |
Files not reviewed (1)
- backend/package-lock.json: Language not supported
```javascript
const content = response.choices[0].message.content;
const parsed = JSON.parse(content);

// OpenAI might return an object with clips array, normalize to array
return Array.isArray(parsed) ? parsed : (parsed.clips || []);
```
The OpenAI response normalization assumes the response will either be an array or have a 'clips' property. If OpenAI returns a different structure or an empty object, this will return an empty array without any error indication, potentially masking issues with the LLM response format. Consider adding validation to ensure the response contains expected data.
Suggested change:

```diff
-const content = response.choices[0].message.content;
-const parsed = JSON.parse(content);
-// OpenAI might return an object with clips array, normalize to array
-return Array.isArray(parsed) ? parsed : (parsed.clips || []);
+const content = response &&
+  Array.isArray(response.choices) &&
+  response.choices[0] &&
+  response.choices[0].message &&
+  response.choices[0].message.content;
+if (!content) {
+  throw new Error('OpenAI response is missing message content for clip analysis.');
+}
+let parsed;
+try {
+  parsed = JSON.parse(content);
+} catch (err) {
+  throw new Error(`Failed to parse OpenAI JSON response for clip analysis: ${err.message}`);
+}
+// Normalize known-good formats and validate structure
+if (Array.isArray(parsed)) {
+  return parsed;
+}
+if (parsed && typeof parsed === 'object') {
+  if (Array.isArray(parsed.clips)) {
+    return parsed.clips;
+  }
+  throw new Error('OpenAI JSON response for clip analysis is missing expected "clips" array.');
+}
+throw new Error('OpenAI JSON response for clip analysis has an unexpected format.');
```
```javascript
// Use the LLM service for transcription
const transcriptContent = await llmService.transcribeAudio(mp3DestPath, mp3FileName);
```
The error message references "Gemini API or upload error" but the code now uses the LLM service abstraction which could be using OpenAI or other providers. The error message should be provider-agnostic to avoid confusion.
```javascript
router.get('/provider-info', async (req, res) => {
  try {
    const providerInfo = llmService.getProviderInfo();
    res.status(200).json({
      status: 'success',
      data: providerInfo
    });
  } catch (error) {
    console.error('Error getting provider info:', error);
    res.status(500).json({
      status: 'error',
      error: 'Failed to get LLM provider information'
    });
  }
```
The getProviderInfo method is synchronous and doesn't throw errors, so wrapping it in a try-catch with async/await is unnecessary. The async keyword on the route handler is also not needed since there are no await calls. This could be simplified to a synchronous handler.
Suggested change:

```diff
-router.get('/provider-info', async (req, res) => {
-  try {
-    const providerInfo = llmService.getProviderInfo();
-    res.status(200).json({
-      status: 'success',
-      data: providerInfo
-    });
-  } catch (error) {
-    console.error('Error getting provider info:', error);
-    res.status(500).json({
-      status: 'error',
-      error: 'Failed to get LLM provider information'
-    });
-  }
+router.get('/provider-info', (req, res) => {
+  const providerInfo = llmService.getProviderInfo();
+  res.status(200).json({
+    status: 'success',
+    data: providerInfo
+  });
```
```javascript
async analyzeTranscript(transcriptText, videoDuration, maxTimeFormatted) {
  const prompt = `Given the following transcript, propose 3-5 video clips that would make engaging short content. The video is ${Math.floor(videoDuration / 60)}:${String(Math.floor(videoDuration % 60)).padStart(2, '0')} long.

CRITICAL CONSTRAINTS:
- Video duration is EXACTLY ${maxTimeFormatted} - DO NOT suggest any timestamps beyond this
- Each clip should be 30-90 seconds total duration
- All timestamps must be in MM:SS format and within 0:00 to ${maxTimeFormatted}

You can suggest two types of clips:

1. SINGLE SEGMENT clips: One continuous segment from start time to end time
2. MULTI-SEGMENT clips: Multiple segments that when combined tell a coherent story

For single segments: provide 'start' and 'end' times in MM:SS format.
For multi-segments: provide an array of segments in 'segments' field, each with 'start' and 'end' times.

VALIDATION RULES:
- Every timestamp must be ≤ ${maxTimeFormatted}
- Total duration must be 30-90 seconds
- Focus on complete thoughts or exchanges
- Ensure segments make sense when combined

Output format: JSON array where each object has:
- 'title': descriptive title
- For single segments: 'start' and 'end' fields
- For multi-segments: 'segments' array with objects containing 'start' and 'end'

Transcript: ${transcriptText}`;
```
The transcript text is directly injected into the prompt without any length validation or truncation. Very long transcripts could exceed the LLM's token limits, causing API failures. Consider adding validation to check transcript length and either truncate or chunk long transcripts before analysis.
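One way to address this is a small guard in front of the prompt builder. This is a sketch only: `truncateTranscript` is a hypothetical helper, and the character budget is an assumed rough proxy for the configured model's token limit, not a value from this PR.

```javascript
// Assumed budget: roughly 4 characters per token on average; tune per model.
const MAX_TRANSCRIPT_CHARS = 48000;

function truncateTranscript(text, maxChars = MAX_TRANSCRIPT_CHARS) {
  if (typeof text !== 'string') {
    throw new TypeError('transcript must be a string');
  }
  if (text.length <= maxChars) return text;
  // Cut at the last sentence boundary inside the budget so the prompt
  // does not end mid-word, then mark the truncation explicitly.
  const slice = text.slice(0, maxChars);
  const lastStop = slice.lastIndexOf('. ');
  const kept = lastStop > 0 ? slice.slice(0, lastStop + 1) : slice;
  return kept + '\n[transcript truncated]';
}
```

Chunking (analyzing the transcript in windows and merging clip suggestions) would preserve more content, but even this simple cap turns a hard API failure into a predictable, visible degradation.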
```javascript
      content: prompt
    }
  ],
  response_format: { type: 'json_object' },
```
The response_format parameter with type 'json_object' requires the model to support JSON mode. Not all OpenAI-compatible APIs support this feature, and some older models like gpt-3.5-turbo may not reliably support it. This could cause failures when using alternative providers or certain OpenAI models. Consider adding error handling or making this parameter optional based on the provider/model being used.
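A sketch of gating the parameter on the configured model. The model-name patterns here are assumptions for illustration (OpenAI documents JSON mode for its newer chat models); a real implementation might instead expose an explicit opt-in flag for OpenAI-compatible providers.

```javascript
// Models assumed to support response_format: { type: 'json_object' }.
// Extend or replace this list based on the providers actually deployed.
const JSON_MODE_MODELS = [/^gpt-4o/, /^gpt-4-turbo/, /^gpt-3\.5-turbo-1106/];

function buildCompletionOptions(model, messages) {
  const options = { model, messages };
  if (JSON_MODE_MODELS.some((re) => re.test(model))) {
    options.response_format = { type: 'json_object' };
  }
  // For models without JSON mode, the prompt's "Output format: JSON array"
  // instruction plus the parsing/validation layer must carry the load.
  return options;
}
```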
```javascript
} catch (error) {
  console.error(`LLM analysis failed with ${this.provider}:`, error);
  // Try fallback if primary fails
  if (this.provider === 'openai' && this.geminiClient) {
    console.log('Falling back to Gemini...');
    return await this.analyzeWithGemini(prompt);
  } else if (this.provider === 'gemini' && this.openaiClient) {
    console.log('Falling back to OpenAI...');
    return await this.analyzeWithOpenAI(prompt);
  }
  throw error;
}
```
The fallback mechanism could create an infinite recursion if analyzeWithGemini or analyzeWithOpenAI throw an error that doesn't originate from the primary provider check. Consider tracking whether a fallback has already been attempted to prevent potential recursive calls.
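One way to make the single-retry guarantee explicit is to isolate the try/fallback shape in a helper that only ever attempts the fallback once. This is a sketch, not the PR's code; `analyzeWithFallback` is a hypothetical name, and the two callbacks stand in for the service's provider-specific methods.

```javascript
// Runs the primary provider; on failure, tries the fallback exactly once.
// Because the fallback path never re-enters this function, no error can
// bounce between providers indefinitely.
async function analyzeWithFallback(runPrimary, runFallback) {
  try {
    return await runPrimary();
  } catch (primaryError) {
    if (!runFallback) throw primaryError;
    try {
      return await runFallback();
    } catch (fallbackError) {
      // Surface both failures instead of masking the original cause.
      throw new Error(
        `Primary provider failed (${primaryError.message}); ` +
        `fallback also failed (${fallbackError.message})`
      );
    }
  }
}
```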
```javascript
initializeProviders() {
  // Initialize Gemini (default/fallback)
  if (process.env.GEMINI_API_KEY) {
    this.geminiClient = new GoogleGenerativeAI(process.env.GEMINI_API_KEY);
    // Don't initialize file manager here - create it when needed
  }

  // Initialize OpenAI-compatible clients
  if (process.env.OPENAI_API_KEY || process.env.LLM_API_KEY) {
    const config = {
      apiKey: process.env.OPENAI_API_KEY || process.env.LLM_API_KEY,
    };

    // Support custom base URL for OpenAI-compatible APIs
    if (process.env.LLM_BASE_URL) {
      config.baseURL = process.env.LLM_BASE_URL;
    }

    this.openaiClient = new OpenAI(config);
  }
}
```
No validation is performed on environment variables during initialization. Invalid API keys, malformed base URLs, or unsupported model names will only be detected at runtime when the first API call is made. Consider adding validation in initializeProviders to fail fast with clear error messages during service startup.
```javascript
if (process.env.LLM_BASE_URL) {
  config.baseURL = process.env.LLM_BASE_URL;
}
```
The LLM_BASE_URL environment variable is not validated for security. A malicious or misconfigured URL could redirect API requests to an untrusted server, potentially exposing API keys or sensitive data. Consider validating that the base URL uses HTTPS (except for localhost) and optionally maintaining a whitelist of allowed domains for production environments.
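A minimal sketch of such a check, assuming an HTTPS-only policy with a local-development exception (`validateBaseUrl` is a hypothetical helper name; a production allowlist of permitted domains would be stricter):

```javascript
// Rejects malformed URLs and plain-HTTP endpoints, except for local hosts
// commonly used for self-hosted OpenAI-compatible servers.
function validateBaseUrl(rawUrl) {
  const url = new URL(rawUrl); // throws TypeError on malformed input
  const isLocal = ['localhost', '127.0.0.1', '::1', '[::1]'].includes(url.hostname);
  if (url.protocol !== 'https:' && !isLocal) {
    throw new Error(
      `LLM_BASE_URL must use HTTPS for non-local hosts (got ${rawUrl})`
    );
  }
  return url.toString();
}
```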
```javascript
// Export singleton instance
module.exports = new LLMService();
```
The service is exported as a singleton instance, which means the provider configuration is determined at module load time. If environment variables change after the service is first imported, the configuration won't update. This could be problematic in testing scenarios or if configuration needs to change at runtime. Consider exporting the class and allowing consumers to instantiate as needed, or provide a method to reinitialize the service.
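A sketch of the lazy-singleton alternative. The `LLMService` class below is a trivial stand-in so the example is self-contained (the real one initializes Gemini/OpenAI clients from the environment), and `resetLLMService` is a hypothetical helper, not part of the current module.

```javascript
// Stand-in for the real class; the actual LLMService reads process.env
// (GEMINI_API_KEY, OPENAI_API_KEY, LLM_PROVIDER, ...) in its constructor.
class LLMService {
  constructor() {
    this.provider = process.env.LLM_PROVIDER || 'gemini';
  }
}

let instance = null;

// Lazy accessor keeps the convenience of a shared instance...
function getLLMService() {
  if (!instance) instance = new LLMService();
  return instance;
}

// ...while tests can rebuild it after changing environment variables.
function resetLLMService() {
  instance = null;
}

module.exports = { LLMService, getLLMService, resetLLMService };
```

Callers would swap `require('./llmService')` for `require('./llmService').getLLMService()`, which defers configuration from module load time to first use.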
Overview
This PR implements OpenAI API compatible LLM support for Vinci Clips, addressing the feature request to remove limitations of being Gemini-only and enable users to use ChatGPT, Perplexity, and other LLM providers.
What's New
Multi-Provider LLM Support
Intelligent Provider Selection
- Set via the `LLM_PROVIDER` environment variable

Zero Breaking Changes
Existing Gemini configurations continue working without modification. All current functionality is preserved.
Configuration Examples
Basic OpenAI Setup
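The example block here did not survive extraction; a minimal sketch of what this setup likely looks like, using only the environment variables that appear in `initializeProviders` (the key is a placeholder):

```shell
# Basic OpenAI setup (sketch; verify variable names against .env.example)
LLM_PROVIDER=openai
OPENAI_API_KEY=sk-your-key-here
```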
Perplexity AI Setup
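Again the original example is missing; a sketch using the generic `LLM_API_KEY`/`LLM_BASE_URL` variables from `initializeProviders`. The base URL shown is Perplexity's documented OpenAI-compatible endpoint, but check `LLM_PROVIDERS.md` for the values this project actually expects:

```shell
# Perplexity AI via the OpenAI-compatible client (sketch)
LLM_PROVIDER=openai
LLM_API_KEY=pplx-your-key-here
LLM_BASE_URL=https://api.perplexity.ai
```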
Hybrid Setup (Recommended)
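A sketch of the hybrid configuration, assuming (per the Audio Transcription Note below) that a Gemini key stays set for transcription and automatic fallback while OpenAI handles analysis:

```shell
# Hybrid setup (sketch): OpenAI primary, Gemini for transcription + fallback
LLM_PROVIDER=openai
OPENAI_API_KEY=sk-your-key-here
GEMINI_API_KEY=your-gemini-key-here
```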
Technical Implementation
New LLM Abstraction Layer
- `backend/src/services/llmService.js` - Unified interface for all LLM providers

Updated Routes
All LLM-dependent routes now use the abstraction layer:
- `analyze.js` - Transcript analysis for clip suggestions
- `upload.js` - Audio transcription from uploaded files
- `import.js` - Transcription for imported videos
- `retry-transcription.js` - Retry failed transcriptions

New API Endpoint
- `GET /clips/llm/provider-info` - Returns current provider configuration and available providers

Audio Transcription Note
Audio transcription currently requires Gemini due to its file upload API. OpenAI Whisper integration is planned for future releases. For now, you can use a hybrid setup that keeps a Gemini key configured for transcription alongside your OpenAI-compatible provider.
Testing
The implementation has been thoroughly tested with:
Documentation
- `LLM_PROVIDERS.md` - Comprehensive documentation for LLM provider setup and usage
- Updated `.env.example` with all configuration options

Benefits
Closes #[issue-number]