⚙️ Backend

Overview

Backend is an umbrella module that encapsulates a unified way to work with the following functionalities:

  • Chat Models (via the ChatModel class)
  • Embedding Models (coming soon)
  • Audio Models (coming soon)
  • Image Models (coming soon)

BeeAI framework's backend is designed with a provider-based architecture, allowing you to switch between different AI service providers while maintaining a consistent API.
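
For example, switching providers is usually just a matter of changing the identifier passed to ChatModel.from_name; a minimal sketch using model names that appear in the examples below:

from beeai_framework.backend.chat import ChatModel

# The prefix before the colon selects the provider adapter; the rest is the model ID.
ollama_llm = ChatModel.from_name("ollama:llama3.1")
watsonx_llm = ChatModel.from_name("watsonx:ibm/granite-3-8b-instruct")
# Both objects expose the same ChatModel interface (create, create_structure, ...).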

Note

Location within the framework: beeai_framework/backend.


Supported providers

The following table lists the supported providers. Each provider requires specific configuration through environment variables. Ensure all required variables are set before initializing a provider.

| Name | Chat | Embedding | Dependency | Environment Variables |
| --- | --- | --- | --- | --- |
| Ollama | ✅ | | ollama-ai-provider | OLLAMA_CHAT_MODEL, OLLAMA_BASE_URL |
| OpenAI | ✅ | | openai | OPENAI_CHAT_MODEL, OPENAI_API_BASE, OPENAI_API_KEY, OPENAI_ORGANIZATION |
| Watsonx | ✅ | | @ibm-cloud/watsonx-ai | WATSONX_CHAT_MODEL, WATSONX_EMBEDDING_MODEL, WATSONX_API_KEY, WATSONX_PROJECT_ID, WATSONX_SPACE_ID, WATSONX_VERSION, WATSONX_REGION |
| Groq | ✅ | | | GROQ_CHAT_MODEL, GROQ_API_KEY |
| Amazon Bedrock | Coming soon! | | | AWS_CHAT_MODEL, AWS_EMBEDDING_MODEL, AWS_ACCESS_KEY_ID, AWS_SECRET_ACCESS_KEY, AWS_REGION, AWS_SESSION_TOKEN |
| Google Vertex | Coming soon! | | | GOOGLE_VERTEX_CHAT_MODEL, GOOGLE_VERTEX_EMBEDDING_MODEL, GOOGLE_VERTEX_PROJECT, GOOGLE_VERTEX_ENDPOINT, GOOGLE_VERTEX_LOCATION |
| Azure OpenAI | Coming soon! | | | AZURE_OPENAI_CHAT_MODEL, AZURE_OPENAI_EMBEDDING_MODEL, AZURE_OPENAI_API_KEY, AZURE_OPENAI_API_ENDPOINT, AZURE_OPENAI_API_RESOURCE, AZURE_OPENAI_API_VERSION |
| Anthropic | Coming soon! | | | ANTHROPIC_CHAT_MODEL, ANTHROPIC_EMBEDDING_MODEL, ANTHROPIC_API_KEY, ANTHROPIC_API_BASE_URL, ANTHROPIC_API_HEADERS |
| xAI | ✅ | | | XAI_CHAT_MODEL, XAI_API_KEY |
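
Each provider reads its credentials and defaults from the environment variables listed above. A minimal sketch of configuring Ollama from Python before creating a model (the values shown, including the default local Ollama URL, are only illustrative):

import os

from beeai_framework.backend.chat import ChatModel

# Illustrative values; in practice, export these in your shell or load them from a .env file.
os.environ["OLLAMA_CHAT_MODEL"] = "llama3.1"
os.environ["OLLAMA_BASE_URL"] = "http://localhost:11434"

llm = ChatModel.from_name("ollama:llama3.1")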

Tip

If you don't see your provider, please raise an issue here. Meanwhile, you can use the Ollama adapter.


Backend initialization

The Backend class serves as a central entry point to access models from your chosen provider.

import asyncio
import json
import sys
import traceback

from pydantic import BaseModel, Field

from beeai_framework import ToolMessage
from beeai_framework.adapters.watsonx.backend.chat import WatsonxChatModel
from beeai_framework.backend.chat import ChatModel
from beeai_framework.backend.message import MessageToolResultContent, UserMessage
from beeai_framework.cancellation import AbortSignal
from beeai_framework.errors import AbortError, FrameworkError
from beeai_framework.tools.weather.openmeteo import OpenMeteoTool

# Settings can be passed here during initialization or pre-configured via environment variables
llm = WatsonxChatModel(
    "ibm/granite-3-8b-instruct",
    # settings={
    #     "project_id": "WATSONX_PROJECT_ID",
    #     "api_key": "WATSONX_API_KEY",
    #     "api_base": "WATSONX_API_URL",
    # },
)


async def watsonx_from_name() -> None:
    watsonx_llm = ChatModel.from_name(
        "watsonx:ibm/granite-3-8b-instruct",
        # {
        #     "project_id": "WATSONX_PROJECT_ID",
        #     "api_key": "WATSONX_API_KEY",
        #     "api_base": "WATSONX_API_URL",
        # },
    )
    user_message = UserMessage("what states are part of New England?")
    response = await watsonx_llm.create(messages=[user_message])
    print(response.get_text_content())


async def watsonx_sync() -> None:
    user_message = UserMessage("what is the capital of Massachusetts?")
    response = await llm.create(messages=[user_message])
    print(response.get_text_content())


async def watsonx_stream() -> None:
    user_message = UserMessage("How many islands make up the country of Cape Verde?")
    response = await llm.create(messages=[user_message], stream=True)
    print(response.get_text_content())


async def watsonx_stream_abort() -> None:
    user_message = UserMessage("What is the smallest of the Cape Verde islands?")

    try:
        response = await llm.create(messages=[user_message], stream=True, abort_signal=AbortSignal.timeout(0.5))

        if response is not None:
            print(response.get_text_content())
        else:
            print("No response returned.")
    except AbortError as err:
        print(f"Aborted: {err}")


async def watson_structure() -> None:
    class TestSchema(BaseModel):
        answer: str = Field(description="your final answer")

    user_message = UserMessage("How many islands make up the country of Cape Verde?")
    response = await llm.create_structure(schema=TestSchema, messages=[user_message])
    print(response.object)


async def watson_tool_calling() -> None:
    watsonx_llm = ChatModel.from_name(
        "watsonx:ibm/granite-3-8b-instruct",
    )
    user_message = UserMessage("What is the current weather in Boston?")
    weather_tool = OpenMeteoTool()
    response = await watsonx_llm.create(messages=[user_message], tools=[weather_tool])
    tool_call_msg = response.get_tool_calls()[0]
    print(tool_call_msg.model_dump())
    tool_response = await weather_tool.run(json.loads(tool_call_msg.args))
    tool_response_msg = ToolMessage(
        MessageToolResultContent(
            result=tool_response.get_text_content(), tool_name=tool_call_msg.tool_name, tool_call_id=tool_call_msg.id
        )
    )
    print(tool_response_msg.to_plain())
    final_response = await watsonx_llm.create(messages=[user_message, tool_response_msg], tools=[])
    print(final_response.get_text_content())


async def main() -> None:
    print("*" * 10, "watsonx_from_name")
    await watsonx_from_name()
    print("*" * 10, "watsonx_sync")
    await watsonx_sync()
    print("*" * 10, "watsonx_stream")
    await watsonx_stream()
    print("*" * 10, "watsonx_stream_abort")
    await watsonx_stream_abort()
    print("*" * 10, "watson_structure")
    await watson_structure()
    print("*" * 10, "watson_tool_calling")
    await watson_tool_calling()


if __name__ == "__main__":
    try:
        asyncio.run(main())
    except FrameworkError as e:
        traceback.print_exc()
        sys.exit(e.explain())

Source: examples/backend/providers/watsonx.py

All provider examples can be found in examples/backend/providers.


Chat model

The ChatModel class represents a Chat Large Language Model and provides methods for text generation, streaming responses, and more. You can initialize a chat model in multiple ways:

Method 1: Using the generic factory method

from beeai_framework.backend.chat import ChatModel

ollama_chat_model = ChatModel.from_name("ollama:llama3.1")

Method 2: Creating a specific provider model directly

from beeai_framework.adapters.ollama.backend.chat import OllamaChatModel

ollama_chat_model = OllamaChatModel("llama3.1")

Chat model configuration

You can configure various parameters for your chat model:

Coming soon
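
In the meantime, a minimal sketch based on the Watsonx example above, which passes an optional settings dictionary to the model constructor; the exact keys accepted depend on the provider adapter, so the commented-out key below is only an assumption:

from beeai_framework.adapters.ollama.backend.chat import OllamaChatModel

# Settings may also be pre-configured via environment variables (see the provider table above).
ollama_chat_model = OllamaChatModel(
    "llama3.1",
    # settings={"base_url": "http://localhost:11434"},  # assumed key; check your adapter
)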

Text generation

The most basic usage is to generate text responses:

from beeai_framework.adapters.ollama.backend.chat import OllamaChatModel
from beeai_framework.backend.message import UserMessage

ollama_chat_model = OllamaChatModel("llama3.1")
response = await ollama_chat_model.create(
    messages=[UserMessage("what states are part of New England?")]
)

print(response.get_text_content())

Note

Execution parameters (those passed directly to model.create(...)) take precedence over those defined via configuration.

Streaming responses

For applications requiring real-time responses:

from beeai_framework.adapters.ollama.backend.chat import OllamaChatModel
from beeai_framework.backend.message import UserMessage

llm = OllamaChatModel("llama3.1")
user_message = UserMessage("How many islands make up the country of Cape Verde?")
response = await llm.create(messages=[user_message], stream=True)

print(response.get_text_content())

Structured generation

Generate structured data according to a schema:

Coming soon

Source: /examples/backend/structured.py
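
Until that example is published, here is a minimal sketch adapted from the watson_structure function above, assuming the Ollama adapter exposes the same create_structure method:

from pydantic import BaseModel, Field

from beeai_framework.adapters.ollama.backend.chat import OllamaChatModel
from beeai_framework.backend.message import UserMessage


class AnswerSchema(BaseModel):
    answer: str = Field(description="your final answer")


llm = OllamaChatModel("llama3.1")

# create_structure validates the model output against the provided Pydantic schema.
response = await llm.create_structure(
    schema=AnswerSchema,
    messages=[UserMessage("How many islands make up the country of Cape Verde?")],
)
print(response.object)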

Tool calling

Integrate external tools with your AI model:

Coming soon

Source: /examples/backend/toolCalling.py
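
Until that example is published, the flow can be sketched from the watson_tool_calling function above, swapped here to the Ollama adapter (assuming the chosen model supports tool calling):

import json

from beeai_framework import ToolMessage
from beeai_framework.backend.chat import ChatModel
from beeai_framework.backend.message import MessageToolResultContent, UserMessage
from beeai_framework.tools.weather.openmeteo import OpenMeteoTool

llm = ChatModel.from_name("ollama:llama3.1")
user_message = UserMessage("What is the current weather in Boston?")
weather_tool = OpenMeteoTool()

# Ask the model; it may respond with a tool call instead of plain text.
response = await llm.create(messages=[user_message], tools=[weather_tool])
tool_call = response.get_tool_calls()[0]

# Execute the requested tool and wrap the result as a ToolMessage.
tool_result = await weather_tool.run(json.loads(tool_call.args))
tool_message = ToolMessage(
    MessageToolResultContent(
        result=tool_result.get_text_content(),
        tool_name=tool_call.tool_name,
        tool_call_id=tool_call.id,
    )
)

# Send the tool result back so the model can produce a final natural-language answer.
final_response = await llm.create(messages=[user_message, tool_message], tools=[])
print(final_response.get_text_content())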


Embedding model

The EmbeddingModel class provides functionality for generating vector embeddings from text.

Embedding model initialization

You can initialize an embedding model in multiple ways:

Method 1: Using the generic factory method

Coming soon

Method 2: Creating a specific provider model directly

Coming soon

Embedding model usage

Generate embeddings for one or more text strings:

Coming soon

Advanced usage

If your preferred provider isn't directly supported, you can use the LangChain adapter as a bridge.

This allows you to leverage any provider that has LangChain compatibility.

Coming soon

Source: /examples/backend/providers/langchain.py


Troubleshooting

Common issues and their solutions:

  1. Authentication errors: Ensure all required environment variables are set correctly
  2. Model not found: Verify that the model ID is correct and available for the selected provider

Examples

  • All backend examples can be found here.