@dzakwanalifi dzakwanalifi commented Oct 24, 2025

Problem

PageIndex currently only supports OpenAI models, limiting user choice and potentially increasing costs.

Solution

Add a unified interface that supports both OpenAI GPT-4 and Gemini 2.5 Flash models, including structured output.

Key Changes

  • Add LLM provider abstraction layer in utils.py
  • Support structured output with Pydantic models for Gemini
  • Add provider configuration in config.yaml
  • Fix JSON parsing error handling in page_index.py
  • Update function signatures for better model parameter handling
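To make the JSON parsing point concrete, here is a minimal sketch of tolerant parsing of a model reply. It is not the code from page_index.py: the helper name `parse_llm_json` and its `default` parameter are hypothetical, and the assumption is that replies sometimes arrive wrapped in a Markdown code fence or are not valid JSON at all.

```python
import json
import re
from typing import Any, Optional

# Built indirectly so the fence marker never appears literally in this snippet.
FENCE = "`" * 3
_FENCED = re.compile(FENCE + r"(?:json)?\s*(.*?)" + FENCE, re.DOTALL)


def parse_llm_json(text: str, default: Optional[Any] = None) -> Any:
    """Hypothetical helper: parse a model reply as JSON, else return `default`."""
    # Prefer the contents of a fenced block if the reply contains one.
    match = _FENCED.search(text)
    candidate = match.group(1) if match else text
    try:
        return json.loads(candidate.strip())
    except json.JSONDecodeError:
        # Malformed output is reported through the default instead of raising.
        return default
```

Returning a caller-supplied default keeps one malformed reply from aborting a whole indexing run; a caller that prefers to retry can pass `None` and check for it.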

Code Changes

  • utils.py: Added LLMProvider class with unified interface
  • config.yaml: Added provider configuration option
  • page_index.py: Enhanced error handling and Gemini integration
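For reviewers who want a picture of the abstraction, the sketch below shows one way a unified provider interface could look. Only the class name `LLMProvider` comes from this PR; the fields, the `complete` method, `get_provider`, the model identifiers, and the offline echo backend are all assumptions made for illustration, and no real SDK is called.

```python
from dataclasses import dataclass
from typing import Callable, Dict

# A backend is any callable taking (model, prompt) and returning text.
Backend = Callable[[str, str], str]


@dataclass
class LLMProvider:
    """Illustrative stand-in; the actual class in utils.py may differ."""
    name: str
    model: str
    backend: Backend

    def complete(self, prompt: str) -> str:
        # Callers use the same method regardless of the provider behind it.
        return self.backend(self.model, prompt)


def _echo_backend(model: str, prompt: str) -> str:
    # Placeholder for a real SDK call, so the sketch runs offline.
    return f"[{model}] {prompt}"


# Assumed default model identifiers, chosen for illustration only.
DEFAULT_MODELS: Dict[str, str] = {
    "openai": "gpt-4",
    "gemini": "gemini-2.5-flash",
}


def get_provider(name: str = "openai", backend: Backend = _echo_backend) -> LLMProvider:
    """Hypothetical factory: map a provider name to a configured LLMProvider."""
    try:
        model = DEFAULT_MODELS[name]
    except KeyError:
        raise ValueError(f"unknown provider: {name!r}") from None
    return LLMProvider(name=name, model=model, backend=backend)
```

Injecting the backend as a callable keeps provider-specific SDK code out of the call sites and makes the interface easy to test without network access.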

Use Case

Users can now choose between providers:

# OpenAI (default)
python run_pageindex.py --pdf_path doc.pdf

# Gemini
python run_pageindex.py --pdf_path doc.pdf --provider gemini

This lets users trade off cost and performance per run. OpenAI remains the default, so existing invocations keep working unchanged.
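The commands above imply a `--provider` flag that falls back to OpenAI when omitted. A minimal sketch of that behavior with `argparse` follows; it is not the parser from run_pageindex.py, and the `build_parser` name, the restricted `choices`, and treating `--pdf_path` as required are assumptions.

```python
import argparse
from typing import List, Optional


def build_parser() -> argparse.ArgumentParser:
    """Hypothetical parser mirroring the two flags shown in the commands."""
    parser = argparse.ArgumentParser(description="Sketch of provider selection")
    parser.add_argument("--pdf_path", required=True, help="PDF to index")
    parser.add_argument(
        "--provider",
        choices=["openai", "gemini"],
        default="openai",  # omitting the flag keeps the existing behavior
        help="LLM provider to use",
    )
    return parser


def parse_args(argv: Optional[List[str]] = None) -> argparse.Namespace:
    # Passing argv explicitly makes the parser easy to exercise in tests.
    return build_parser().parse_args(argv)
```

Restricting `choices` means a mistyped provider name fails at argument parsing instead of partway through a run.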

Add support for both OpenAI and Gemini providers with a unified interface.
Includes structured output support and improved error handling.