Multi-Agent architecture system that generates relevant AI and Generative AI (GenAI) use cases for a given Company or Industry.

JANNATHA-MANISH/GENAI-MultiAgentXpert

Multi-Agent Automation System

Project Overview

The Multi-Agent Automation System is a modular Python-based system designed to automate web scraping, AI-powered use case generation, and Kaggle dataset collection. The system utilizes three independent agents, each performing a specific task:

  1. Agent 1: Web Scraping - Extracts text from websites based on a list of keywords.
  2. Agent 2: Use Case Generator - Uses AI to generate relevant use cases from the extracted text.
  3. Agent 3: Resource Collector - Uses the Kaggle API to search and download datasets related to the use cases.

The system follows a Retrieval-Augmented Generation (RAG) approach for data retrieval and processing, uses LangChain to handle the flow of data between agents, and calls the Gemini AI API to generate AI-based outputs.

Technologies Used:

RAG (Retrieval-Augmented Generation): Improves the quality of AI generation by retrieving relevant data before the model processes a request.

LangChain: Builds the system's workflow and manages the agents and their tasks.

Gemini AI API: Generates use cases and handles text processing tasks; access requires an API key.

Together, these tools let the system scrape websites, generate use cases, and retrieve datasets for further analysis.
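To make the retrieve-then-generate idea concrete, here is a minimal sketch of the retrieval step. It is not the project's retriever: it ranks passages by simple word overlap as a stand-in for whatever retrieval LangChain performs in the real system, and the function names (`tokenize`, `retrieve`, `build_prompt`) are hypothetical.

```python
import re
from typing import List, Set


def tokenize(text: str) -> Set[str]:
    """Lowercase the text and return its set of alphanumeric tokens."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def retrieve(query: str, passages: List[str], k: int = 2) -> List[str]:
    """Return the k passages sharing the most tokens with the query."""
    query_tokens = tokenize(query)
    ranked = sorted(
        passages,
        key=lambda p: len(query_tokens & tokenize(p)),
        reverse=True,
    )
    return ranked[:k]


def build_prompt(query: str, passages: List[str], k: int = 2) -> str:
    """Prepend the retrieved passages to the task as context for the model."""
    context = "\n".join(f"- {p}" for p in retrieve(query, passages, k))
    return f"Context:\n{context}\n\nTask: {query}"
```

The resulting prompt string would then be sent to the generation model; that call is left out here because it depends on the API client the project uses.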

The system is coordinated by a central controller script (main_agent.py), and it stores the output in various files within the data/ directory.
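The controller's role can be sketched as follows. This is an illustration under assumptions, not the contents of main_agent.py: the three agents are passed in as plain callables with hypothetical signatures, and the CSV header follows the "Dataset Name, Link" sample shown later in this README.

```python
import csv
from pathlib import Path
from typing import Callable, List, Tuple


def run_pipeline(
    urls: List[str],
    scrape: Callable[[List[str]], str],
    generate: Callable[[str], List[str]],
    collect: Callable[[List[str]], List[Tuple[str, str]]],
    data_dir: str = "data",
) -> None:
    """Run the three agents in order and save each result under data_dir."""
    out = Path(data_dir)
    out.mkdir(parents=True, exist_ok=True)

    # Agent 1: scrape the URLs and save the raw text.
    text = scrape(urls)
    (out / "extracted_text.txt").write_text(text, encoding="utf-8")

    # Agent 2: turn the extracted text into use cases.
    use_cases = generate(text)
    (out / "use_cases.txt").write_text("\n".join(use_cases), encoding="utf-8")

    # Agent 3: collect (name, link) pairs for datasets matching the use cases.
    links = collect(use_cases)
    with open(out / "resource_links.csv", "w", newline="", encoding="utf-8") as f:
        writer = csv.writer(f)
        writer.writerow(["Dataset Name", "Link"])
        writer.writerows(links)
```

Passing the agents in as functions keeps the controller testable without network access or API keys; stub functions can stand in for the real agents.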


Architecture Diagram

Below is the architecture diagram of the system:

                          +-------------------------------+
                          |       Main Controller        |
                          |         main_agent.py        |
                          +-------------------------------+
                                     |     |     |
        -----------------------------+     |     +--------------------------------
        |                                  |                                  |
+------------------------+    +------------------------+    +------------------------+
|   Agent 1: Web Scraper |    | Agent 2: Use Case Gen  |    | Agent 3: Resource Coll |
|    agent1_webscrap.py  |    |  agent2_usecase.py     |    |  agent3_resource.py    |
+------------------------+    +------------------------+    +------------------------+
        |                         |                         |
        |                         |                         |
+------------------------+   +------------------------+   +------------------------+
|   Extracted Text File  |   |  Generated Use Cases   |   |   Downloaded Datasets  |
|   extracted_text.txt   |   | use_cases.txt          |   |   resource_links.csv   |
+------------------------+   +------------------------+   +------------------------+
                                     |
                          +------------------------+
                          |                        |
                          |   keywords.txt         |
                          +------------------------+

Data Storage Files

The following files are created by the system to store the output:

| File Name | Data Type | Description |
|---|---|---|
| extracted_text.txt | Plain Text | Extracted text from web pages based on keywords. |
| use_cases.txt | Plain Text | Generated use cases using AI (from extracted text). |
| keywords.txt | Plain Text | List of keywords for web scraping. |
| resource_links.csv | CSV | Resource links related to Kaggle datasets. |

Sample Input and Output

Sample Input (if needed):

Indian Companies by Industry

| Company Name | Industry | Segment | Wikipedia Link |
|---|---|---|---|
| Tata Motors | Automotive | Manufacturing of commercial and passenger vehicles | Visit Wikipedia |
| Mahindra & Mahindra | Automotive | Manufacturing of SUVs, tractors, and commercial vehicles | Visit Wikipedia |
| Bajaj Auto | Automotive | Manufacturing of motorcycles, scooters, and three-wheelers | Visit Wikipedia |
| Larsen & Toubro (L&T) | Construction | Engineering, construction, and infrastructure development | Visit Wikipedia |
| DLF Limited | Real Estate | Residential and commercial property development | Visit Wikipedia |
| Godrej Properties | Real Estate | Residential and commercial real estate projects | Visit Wikipedia |
| Reliance Industries | Energy | Oil refining, petrochemicals, and renewable energy | Visit Wikipedia |
| Indian Oil Corporation | Energy | Oil refining, distribution, and marketing | Visit Wikipedia |
| NTPC Limited | Energy | Power generation and renewable energy | Visit Wikipedia |
| Infosys | IT Services | Software development, consulting, and IT outsourcing | Visit Wikipedia |
| TCS (Tata Consultancy Services) | IT Services | IT services, consulting, and business solutions | Visit Wikipedia |
| Wipro | IT Services | IT services, consulting, and digital transformation | Visit Wikipedia |
| HDFC Bank | Finance | Retail banking, corporate banking, and loans | Visit Wikipedia |
| ICICI Bank | Finance | Retail banking, corporate banking, insurance | Visit Wikipedia |
| Bajaj Finserv | Finance | Financial services including lending, insurance, and wealth management | Visit Wikipedia |
| Apollo Hospitals | Healthcare | Multispecialty hospitals and healthcare services | Visit Wikipedia |
| Fortis Healthcare | Healthcare | Multispecialty hospitals and diagnostics | Visit Wikipedia |
| Dr. Reddy's Laboratories | Pharmaceuticals | Manufacturing of generic drugs, active pharmaceutical ingredients (APIs) | Visit Wikipedia |
| Cipla | Pharmaceuticals | Manufacturing of generic drugs and respiratory care products | Visit Wikipedia |
| Flipkart | E-commerce/Retail | Online retail platform for electronics, fashion, groceries, etc. | Visit Wikipedia |

Education Companies That Require GenAI

| Company Name | Industry | Segment | Wikipedia Link |
|---|---|---|---|
| BYJU'S | Education | Online learning platform for K-12 students | Visit Wikipedia |
| Vedantu | Education | Online tutoring platform for school students | Not available on Wikipedia |
| Unacademy | Education | Online education platform for competitive exams | Not available on Wikipedia |
| Simplilearn | Education | Online certification training courses in technology and business | Not available on Wikipedia |
| Toppr (Acquired by BYJU'S) | Education | Online learning app for K-12 students | Not available on Wikipedia |

Sample Output:

extracted_text.txt
This file contains the raw text extracted from websites based on the keywords provided.

Extracted Text
"Startup ideas are crucial for growing a business..."
"Entrepreneurship requires a strong vision and strategy..."

use_cases.txt
This file contains the AI-generated use cases derived from the extracted text.

Use Case
"AI for business growth"
"Use of AI in marketing"

keywords.txt
This file contains a list of keywords that the scraper will use to identify relevant content for extraction from websites.

Keyword
startup ideas
entrepreneurship
AI use cases
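As an illustration of how a keyword list like this could drive extraction, the sketch below keeps only the paragraphs that mention at least one keyword. The matching rule (case-insensitive substring) and the function names are assumptions; the actual scraper may select content differently.

```python
from typing import List


def load_keywords(text: str) -> List[str]:
    """Parse one keyword per line, skipping blank lines and normalizing case."""
    return [line.strip().lower() for line in text.splitlines() if line.strip()]


def filter_by_keywords(paragraphs: List[str], keywords: List[str]) -> List[str]:
    """Keep paragraphs that contain at least one keyword (case-insensitive)."""
    lowered = [k.strip().lower() for k in keywords if k.strip()]
    return [p for p in paragraphs if any(k in p.lower() for k in lowered)]
```

With the sample keywords above, a paragraph beginning "Startup ideas are crucial..." would be kept, while a paragraph that mentions none of the keywords would be dropped.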

resource_links.csv
This file contains the Kaggle dataset resources related to the generated use cases.

| Dataset Name | Link |
|---|---|
| "Business Growth AI" | https://www.kaggle.com/dataset-xyz |
| "Marketing AI" | https://www.kaggle.com/dataset-abc |


File Structure

/Multi-agent architecture
│
├── main_agent.py            # Main controller that triggers agent execution
├── agents/                  # Folder containing agent scripts
│   ├── agent1_webscrap.py   # Web scraping agent
│   ├── agent2_usecase.py    # Use case generation agent
│   ├── agent3_resources.py  # Resource collection agent
│
├── data/                    # Folder where output files are saved
│   ├── extracted_text.txt   # Raw text scraped from URLs
│   ├── use_cases.txt        # Generated use cases
│   ├── keywords.txt         # Extracted keywords
│   ├── resource_links.csv   # Collected resource links in CSV format
│
├── .env                     # Contains API keys and environment variables
├── requirements.txt         # Python dependencies
├── myenv                    # Virtual environment directory

Sample Input & Output

Sample Input:

When running main_agent.py, you will provide one or more URLs as input:

python main_agent.py "https://example1.com" "https://example2.com"

Sample Output:

extracted_text.txt:

Text extracted from the website https://example1.com:
Lorem ipsum dolor sit amet, consectetur adipiscing elit.

Text extracted from the website https://example2.com:
Sed do eiusmod tempor incididunt ut labore et dolore magna aliqua.

use_cases.txt:

Use Case 1: Text extraction from websites.
Description: This use case involves extracting relevant data from multiple sources.
...

keywords.txt:

Text, Website, Extraction, Use Case, AI, Automation

resource_links.csv:

URL,Category,Description
https://resource1.com,AI,Resource for AI research
https://resource2.com,Web Scraping,Guides for efficient web scraping
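A file in this shape can be read back with Python's standard csv module. The sketch below assumes the `URL,Category,Description` header from this sample (the earlier sample uses different columns) and groups links by category; `links_by_category` is a hypothetical helper name.

```python
import csv
import io
from typing import Dict, List

# Sample content copied from the resource_links.csv example above.
SAMPLE = """URL,Category,Description
https://resource1.com,AI,Resource for AI research
https://resource2.com,Web Scraping,Guides for efficient web scraping
"""


def links_by_category(csv_text: str) -> Dict[str, List[str]]:
    """Group the URL column by the Category column."""
    grouped: Dict[str, List[str]] = {}
    for row in csv.DictReader(io.StringIO(csv_text)):
        grouped.setdefault(row["Category"], []).append(row["URL"])
    return grouped
```

To read the real file instead of the inline sample, pass the text of data/resource_links.csv to the same function.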

Execution Instructions

1. Clone the Repository

First, clone the repository to your local machine:

git clone https://github.com/JANNATHA-MANISH/GENAI-MultiAgentXpert.git
cd GENAI-MultiAgentXpert

2. Set Up the Environment

Create a virtual environment (optional but recommended):

python -m venv menv

Activate the virtual environment:

  • On Windows (Command Prompt):
    menv\Scripts\activate
  • On Windows (PowerShell):
    .\menv\Scripts\activate
  • On macOS/Linux:
    source menv/bin/activate

3. Install Dependencies

Install the required Python dependencies:

pip install -r requirements.txt

4. Configure API Keys

Create a .env file in the root directory and add your API keys for Google Generative AI, Kaggle, and Jina:

GOOGLE_API_KEY=your-google-api-key
KAGGLE_API_KEY=your-kaggle-api-key
JINA_API_KEY=your-jina-api-key
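Before running the agents, it can help to confirm that every key is present. The following is a standard-library sketch, not the project's loader: a package such as python-dotenv would normally read the file, and `parse_env` and `missing_keys` are hypothetical helpers that handle only simple `KEY=value` lines.

```python
from typing import Dict, List, Mapping, Sequence

# Key names taken from the .env example above.
REQUIRED_KEYS = ("GOOGLE_API_KEY", "KAGGLE_API_KEY", "JINA_API_KEY")


def parse_env(text: str) -> Dict[str, str]:
    """Parse KEY=value lines, ignoring blank lines and # comments."""
    values: Dict[str, str] = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Strip stray spaces around the key and value.
        values[key.strip()] = value.strip()
    return values


def missing_keys(
    env: Mapping[str, str], required: Sequence[str] = REQUIRED_KEYS
) -> List[str]:
    """Return the required keys that are absent or empty."""
    return [key for key in required if not env.get(key)]
```

A controller could call `missing_keys(parse_env(open(".env").read()))` at startup and exit with a clear message if the list is not empty.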

5. Run the System

To run the multi-agent automation system, simply execute the main_agent.py script:

python main_agent.py

This will trigger the sequence of tasks:

  • Web scraping (Agent 1)
  • Use case generation (Agent 2)
  • Resource collection (Agent 3)

To scrape specific sites, pass their URLs to main_agent.py as command-line arguments:

python main_agent.py "https://example1.com" "https://example2.com"
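One way a script can accept URLs like this is with argparse. This is a sketch of the general technique rather than the actual argument handling in main_agent.py; `parse_urls` is a hypothetical name, and the rule of silently skipping arguments that are not http(s) URLs is an assumption.

```python
import argparse
from typing import List, Optional
from urllib.parse import urlparse


def parse_urls(argv: Optional[List[str]] = None) -> List[str]:
    """Read zero or more URLs from the command line and keep the valid ones."""
    parser = argparse.ArgumentParser(description="Run the multi-agent pipeline.")
    parser.add_argument("urls", nargs="*", help="URLs to scrape")
    args = parser.parse_args(argv)  # argv=None falls back to sys.argv[1:]

    valid = []
    for url in args.urls:
        parts = urlparse(url)
        # Accept only absolute http(s) URLs with a host name.
        if parts.scheme in ("http", "https") and parts.netloc:
            valid.append(url)
    return valid
```

Using `nargs="*"` lets the same entry point support both invocations shown above: with no arguments the list is empty, and with arguments it holds the URLs to scrape.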

Requirements

  • Python 3.9 or later
  • Libraries listed in requirements.txt
  • A virtual environment for isolating dependencies (optional but recommended)

License

This project is licensed under the MIT License - see the LICENSE file for details.

