GAIA Benchmark Agent

This repository contains an AI agent designed to solve Level 1 questions from the GAIA (General AI Assistants) benchmark. The agent is built with LangChain and LangGraph and calls a set of tools to reason, search, and process information when answering a question. It is deployed as a Gradio web interface for interactive runs and evaluation.

Features

- Agent Architecture: A LangGraph-based agent that combines reasoning with tool usage (e.g., web search, Wikipedia lookup, Python execution, audio transcription, image description).
- Tool Integration: Custom tools for web research, file handling, multimedia processing, and more.
- Evaluation Runner: Fetches questions from a remote API, runs the agent, caches results, and submits them for scoring.
- Gradio Interface: User-friendly UI to run evaluations and submit answers.
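
To make the "reasoning plus tool usage" idea concrete, here is a minimal sketch in plain Python. It is not the repository's LangGraph code: the `TOOLS` registry, the `plan` function, and the `run_agent` helper are hypothetical names chosen for illustration, and the planning step is a hard-coded stand-in for what a language model would decide.

```python
from typing import Callable, Dict, List, Tuple

# Hypothetical tool registry: names and behaviors are illustrative only.
TOOLS: Dict[str, Callable[[str], str]] = {
    "calculator": lambda expr: str(sum(int(part) for part in expr.split("+"))),
    "echo": lambda text: text,
}


def plan(question: str) -> List[Tuple[str, str]]:
    """Stand-in for the model's reasoning step: choose tools and their inputs."""
    if "+" in question:
        return [("calculator", question)]
    return [("echo", question)]


def run_agent(question: str) -> Dict[str, object]:
    """Execute each planned tool call and record which tools were used."""
    tools_used: List[str] = []
    observation = ""
    for tool_name, tool_input in plan(question):
        observation = TOOLS[tool_name](tool_input)
        tools_used.append(tool_name)
    return {"question": question, "answer": observation, "tools_used": tools_used}


if __name__ == "__main__":
    print(run_agent("2 + 3"))
```

Recording `tools_used` next to each answer mirrors the kind of per-question table the app displays. In a real graph-based agent, the model would pick the next tool on each step rather than following a fixed plan.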

Installation

Clone the repository:

```
git clone https://github.com/RicPiz/GAIA-Agents-AI.git
cd GAIA-Agents-AI
```

Install dependencies:
```
pip install -r requirements.txt
```
Set up environment variables:
- Create a .env file in the root directory.
- Add your OpenAI API key:
```
OPENAI_API_KEY=your_openai_api_key
```
- (Optional) Set SPACE_ID for Hugging Face Spaces integration.
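
As a quick sanity check on this configuration, the sketch below validates the two variables named above. It assumes the values are already present in the process environment (for example, because the app loads `.env` at startup, which is an assumption about this project); `load_settings` is a hypothetical helper, not a function from the repository.

```python
import os
from typing import Dict, Mapping, Optional


def load_settings(env: Mapping[str, str] = os.environ) -> Dict[str, Optional[str]]:
    """Return the settings the app needs, failing early if the API key is missing."""
    api_key = env.get("OPENAI_API_KEY", "").strip()
    if not api_key:
        raise RuntimeError("OPENAI_API_KEY is not set; add it to .env or the environment")
    # SPACE_ID is optional, so an absent or empty value becomes None.
    space_id = env.get("SPACE_ID") or None
    return {"openai_api_key": api_key, "space_id": space_id}


if __name__ == "__main__":
    # A placeholder mapping is passed here so the demo runs without real credentials.
    print(load_settings({"OPENAI_API_KEY": "your_openai_api_key"}))
```

Failing at startup with a clear message is usually easier to debug than an authentication error raised in the middle of an evaluation run.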

Usage

Run the Gradio app:
```
python app.py
```
Open the provided URL in your browser (e.g., http://127.0.0.1:7860).
Log in with your Hugging Face account.
Click "Run Evaluation (Cache Answers)" to process all questions.
Click "Submit Cached Answers" to send results to the scoring API.

The app will display status updates and a table of questions, answers, and tools used.
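
The two buttons separate answering from submitting, so cached answers can be reviewed before they are scored. A rough sketch of that two-phase flow follows. The JSON cache file, the field names (`task_id`, `submitted_answer`, `username`), and both function names are assumptions made for illustration; the actual scoring API and cache format may differ.

```python
import json
import tempfile
from pathlib import Path
from typing import Callable, Dict, List


def run_evaluation(
    questions: List[Dict[str, str]],
    agent: Callable[[str], str],
    cache_path: Path,
) -> List[Dict[str, str]]:
    """Answer every question and write the results to a JSON cache file."""
    rows = [
        {
            "task_id": item["task_id"],
            "question": item["question"],
            "answer": agent(item["question"]),
        }
        for item in questions
    ]
    cache_path.write_text(json.dumps(rows, indent=2), encoding="utf-8")
    return rows


def build_submission(cache_path: Path, username: str) -> Dict[str, object]:
    """Load cached answers and shape them into a payload for a scoring API."""
    rows = json.loads(cache_path.read_text(encoding="utf-8"))
    answers = [
        {"task_id": row["task_id"], "submitted_answer": row["answer"]} for row in rows
    ]
    return {"username": username, "answers": answers}


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        cache = Path(tmp) / "answers.json"
        sample = [{"task_id": "t1", "question": "What is the capital of France?"}]
        # A trivial stand-in agent; the real agent would call its tools here.
        run_evaluation(sample, lambda q: "Paris", cache)
        print(build_submission(cache, "example-user"))
```

Because the submission is built only from the cache file, a failed upload can be retried without running the agent again.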

License

This project is licensed under the MIT License. See the LICENSE file for details.
