A Python wrapper and command-line interface (CLI) for Google's Gemini Pro 1.0 and upcoming 1.5 models. This toolkit gives users easy access to the multimodal text and vision APIs, along with a simple chat interface for the multi-turn text API (ChatBot), covering the full suite of Google's Gemini Pro and, soon, Gemini Ultra LLMs.

ling123456wwwww/Gemini-AI-Wrapper-and-CLI

 
 

Google Gemini AI

Gemini AI Wrapper and CLI

maintained - yes contributions - welcome

Note

A stable release (v1.2.0) will be made available on Friday, March 8, 2024. Please check back here for updates.

Overview

This toolkit provides a straightforward interface to Google's Gemini Pro 1.0 and upcoming 1.5 AI models. It supports text generation, image captioning and analysis, and multi-turn chat (chatbot) by wrapping the underlying API calls in simpler commands. It is aimed at everyday users, developers, and researchers who want to add these capabilities to their projects without handling direct API communication, and it covers the full suite of Google's Gemini Pro and, soon, Gemini Ultra large language models.

Key Features

  • Chat Functionality: Chat with Gemini's advanced conversational models.
  • Image Captioning: Analyze images and generate descriptive captions or insights.
  • Text Generation: Generate creative and contextually relevant text based on prompts.
  • Command-Line Interface (CLI): Access Gemini AI functionalities directly from the command line for quick integrations and testing.
  • Python Wrapper: Enables seamless interaction with the full suite of Gemini models offered by Google using just two lines of code.

Prerequisites

  • Python 3.x
  • An API key from Google AI Studio

Dependencies

The following Python packages are required:

  • requests: For making HTTP requests to Google's Gemini API.

The following Python packages are optional:

  • python-dotenv: For managing API keys and other environment variables.
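
To give a sense of what the wrapper abstracts away, here is a minimal sketch of the kind of HTTP request a single-turn text prompt might translate to. The endpoint, the `gemini-pro` model name, and the body layout follow Google's public REST documentation for Gemini Pro 1.0 as best understood at the time of writing; they are assumptions, not taken from this repository, and should be checked against the current API reference. `build_text_request` is a hypothetical helper name.

```python
import json

# Assumption: endpoint and model name as publicly documented for Gemini Pro 1.0;
# neither value is taken from this repository.
BASE_URL = "https://generativelanguage.googleapis.com/v1beta/models"


def build_text_request(prompt: str, model: str = "gemini-pro") -> tuple[str, dict]:
    """Return the (url, json_body) pair for a single-turn text prompt."""
    url = f"{BASE_URL}/{model}:generateContent"
    # One "contents" entry holding one text part.
    body = {"contents": [{"parts": [{"text": prompt}]}]}
    return url, body


url, body = build_text_request("Write a haiku about autumn.")
print(url)
print(json.dumps(body, indent=2))

# Sending it would need the requests package and a valid key, for example:
#   requests.post(url, params={"key": api_key}, json=body, timeout=30)
```

The sketch only builds the request and does not send it, so it runs without a key or network access.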

Installation

Follow these steps to set up the Gemini AI Wrapper and CLI on your system:

Clone the repository:

git clone https://github.com/RMNCLDYO/Gemini-AI-Wrapper-and-CLI.git
cd Gemini-AI-Wrapper-and-CLI

Install required Python packages:

pip install -r requirements.txt

Configuration

  1. To use the Gemini API, you'll need an API key. If you don't already have one, create a key in Google AI Studio.
  2. Once you have your API key, create a new file named .env in the root directory (main folder), or rename the example.env file in the root directory of this project to .env.
  3. Add your API key to the .env file as follows:
    API_KEY=your_api_key_here
    
  4. The program will automatically load and use your API key when chatting with the language model.
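
As an illustration of step 4, the sketch below shows one way a program could pick up `API_KEY`: check the process environment first, then fall back to reading the `.env` file. It uses only the standard library and is not this project's actual loader, which may rely on python-dotenv instead. `load_api_key` is a hypothetical helper name.

```python
import os
from pathlib import Path


def load_api_key(env_path: str = ".env", name: str = "API_KEY") -> str:
    """Return the key from the environment, falling back to a .env file."""
    value = os.environ.get(name)
    if value:
        return value

    path = Path(env_path)
    if path.is_file():
        for raw in path.read_text(encoding="utf-8").splitlines():
            line = raw.strip()
            # Skip blank lines, comments, and anything that is not KEY=VALUE.
            if not line or line.startswith("#") or "=" not in line:
                continue
            key, _, val = line.partition("=")
            if key.strip() == name:
                # Drop optional surrounding quotes.
                return val.strip().strip("'\"")

    raise RuntimeError(f"{name} not found in the environment or {env_path}")
```

Failing with a clear error when the key is missing is a design choice in this sketch; it surfaces a configuration problem before any request is attempted.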

CLI Usage

Start a Chat Session:

python gemini_cli.py chat

Generate Text from a Prompt:

python gemini_cli.py text "Your text prompt here"

Generate Caption from an Image:

python gemini_cli.py vision path/to/your/image.jpg "Vision prompt"
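
The three commands above share a subcommand layout. The following sketch shows how such a layout could be expressed with `argparse`; it mirrors the command shapes shown here but is not the actual contents of `gemini_cli.py`, and the names `build_parser`, `mode`, `image`, and `prompt` are assumptions made for illustration.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Build a parser with chat, text, and vision subcommands."""
    parser = argparse.ArgumentParser(prog="gemini_cli.py")
    sub = parser.add_subparsers(dest="mode", required=True)

    # chat takes no arguments; the session is interactive.
    sub.add_parser("chat", help="start a multi-turn chat session")

    text = sub.add_parser("text", help="generate text from a prompt")
    text.add_argument("prompt")

    vision = sub.add_parser("vision", help="caption or analyze an image")
    vision.add_argument("image")
    vision.add_argument("prompt")
    return parser


# Parse an explicit argument list instead of sys.argv for demonstration.
args = build_parser().parse_args(["vision", "path/to/your/image.jpg", "Vision prompt"])
print(args.mode, args.image, args.prompt)
```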

Python Wrapper Usage

Start a Chat Session:

from gemini_chat import ChatAPI

ChatAPI().chat()

Generate Text from a Prompt:

from gemini_text import TextAPI

TextAPI().text("Your text prompt here")

Generate Caption from an Image:

from gemini_vision import VisionAPI

VisionAPI().vision("path/to/your/image.jpg", "Vision prompt")

VisionAPI - Limitations and Requirements

  • Images must be in one of the following image data MIME types:

    • PNG - image/png
    • JPEG - image/jpeg
    • WEBP - image/webp
    • HEIC - image/heic
    • HEIF - image/heif
  • Maximum of 16 individual images.

  • Maximum of 4MB of data, including images and text.

  • No specific limits to the number of pixels in an image; however, larger images are scaled down to fit a maximum resolution of 3072 x 3072 while preserving their original aspect ratio.

Prompts with a single image tend to yield better results.
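
The limits above can be checked locally before a request is sent. The sketch below is one possible pre-flight check, not part of this project: `check_vision_request` is a hypothetical helper, the file type is inferred from the extension rather than the file contents, and "4MB" is read as 4 * 1024 * 1024 bytes of raw (unencoded) data, which is an assumption.

```python
import os

# Extensions mapped to the MIME types listed above.
ALLOWED_MIME_TYPES = {
    ".png": "image/png",
    ".jpg": "image/jpeg",
    ".jpeg": "image/jpeg",
    ".webp": "image/webp",
    ".heic": "image/heic",
    ".heif": "image/heif",
}
MAX_IMAGES = 16
MAX_TOTAL_BYTES = 4 * 1024 * 1024  # assumption: "4MB" read as 4 MiB


def check_vision_request(image_paths: list[str], prompt: str) -> list[str]:
    """Return a list of problems; an empty list means no limit was exceeded."""
    problems = []
    if len(image_paths) > MAX_IMAGES:
        problems.append(f"{len(image_paths)} images given; the limit is {MAX_IMAGES}")

    # The size limit covers text as well as images.
    total = len(prompt.encode("utf-8"))
    for path in image_paths:
        ext = os.path.splitext(path)[1].lower()
        if ext not in ALLOWED_MIME_TYPES:
            problems.append(f"{path}: unsupported extension {ext!r}")
            continue
        if not os.path.isfile(path):
            problems.append(f"{path}: file not found")
            continue
        total += os.path.getsize(path)

    if total > MAX_TOTAL_BYTES:
        problems.append(f"payload is {total} bytes; the limit is {MAX_TOTAL_BYTES}")
    return problems
```

Returning every problem at once, rather than stopping at the first, lets a caller report all issues in a single pass.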

Contributing

Contributions are welcome!

Please refer to CONTRIBUTING.md for detailed guidelines on how to contribute to this project.

Reporting Issues

Encountered a bug? We'd love to hear about it. Please follow these steps to report any issues:

  1. Check if the issue has already been reported.
  2. Use the Bug Report template to create a detailed report.
  3. Submit the report here.

Your report will help us make the project better for everyone.

Feature Requests

Got an idea for a new feature? Feel free to suggest it. Here's how:

  1. Check if the feature has already been suggested or implemented.
  2. Use the Feature Request template to create a detailed request.
  3. Submit the request here.

Your suggestions for improvements are always welcome.

Versioning and Changelog

Stay up-to-date with the latest changes and improvements in each version:

  • CHANGELOG.md provides detailed descriptions of each release.

Security

Your security is important to us. If you discover a security vulnerability, please follow our responsible disclosure guidelines found in SECURITY.md. Please refrain from disclosing any vulnerabilities publicly until said vulnerability has been reported and addressed.

License

Licensed under the MIT License. See LICENSE for details.
