
Commit

removed obsolete sections
gdcsinaptik committed Jan 17, 2025
1 parent aa7ee42 commit ec86585
Showing 12 changed files with 174 additions and 170 deletions.
6 changes: 4 additions & 2 deletions README.md
@@ -154,16 +154,18 @@ PandaAI is available under the MIT expat license, except for the `pandasai/ee` directory
If you are interested in managed PandaAI Cloud or self-hosted Enterprise Offering, [contact us](https://forms.gle/JEUqkwuTqFZjhP7h8).

## Resources

- [Docs](https://pandas-ai.readthedocs.io/en/latest/) for comprehensive documentation
- [Examples](examples) for example notebooks
- [Discord](https://discord.gg/KYKj9F2FRH) for discussion with the community and PandaAI team

> **Beta Notice**
> Release v3 is currently in beta. This documentation reflects the features and functionality in progress and may change before the final release.
## 🤝 Contributing

Contributions are welcome! Please check the outstanding issues and feel free to open a pull request.
For more information, please check out the [contributing guidelines](CONTRIBUTING.md).

### Thank you!

[![Contributors](https://contrib.rocks/image?repo=sinaptik-ai/pandas-ai)](https://github.com/sinaptik-ai/pandas-ai/graphs/contributors)
[![Contributors](https://contrib.rocks/image?repo=sinaptik-ai/pandas-ai)](https://github.com/sinaptik-ai/pandas-ai/graphs/contributors)
2 changes: 1 addition & 1 deletion docs/mint.json
@@ -64,7 +64,7 @@
},
{
"group": "Natural Language",
"pages": ["v3/overview-nl", "v3/large-language-models", "v3/chat-and-cache", "v3/output-formats"],
"pages": ["v3/overview-nl", "v3/large-language-models", "v3/chat-and-output"],
"version": "v3"
},
{
84 changes: 61 additions & 23 deletions docs/v3/agent.mdx
@@ -12,14 +12,13 @@ You can train PandaAI to understand your data better and to improve its performance

## Prerequisites

Before you start training PandaAI, you need to set your PandaAI API key. You can generate your API key by signing up at [https://app.pandabi.ai](https://app.pandabi.ai).

Then you can set your API key as an environment variable:
Before you start training PandaAI, you need to set your PandaAI API key.
You can generate your API key by signing up at [https://app.pandabi.ai](https://app.pandabi.ai).

```python
import os
import pandasai as pai

os.environ["PANDABI_API_KEY"] = "YOUR_PANDABI_API_KEY"
pai.api_key.set("your-pai-api-key")
```

It is important that you set the API key, or it will fail with the following error: `No vector store provided. Please provide a vector store to train the agent`.
@@ -37,10 +36,10 @@ The training uses by default the `BambooVectorStore` to store the training data,
As an alternative, if you want to use a local vector store (enterprise only for production use cases), you can use the `ChromaDB`, `Qdrant` or `Pinecone` vector stores (see examples below).

```python
import pandasai as pai
from pandasai import Agent

# Set your PandaAI API key (you can generate one by signing up at https://app.pandabi.ai)
os.environ["PANDABI_API_KEY"] = "YOUR_PANDABI_API_KEY"
pai.api_key.set("your-pai-api-key")

agent = Agent("data.csv")
agent.train(docs="The fiscal year starts in April")
Expand All @@ -65,19 +64,22 @@ agent = Agent("data.csv")

# Train the model
query = "What is the total sales for the current fiscal year?"
response = """
import pandas as pd
# The following code is passed as a string to the response variable
response = '\n'.join([
'import pandas as pd',
'',
'df = dfs[0]',
'',
'# Calculate the total sales for the current fiscal year',
'total_sales = df[df[\'date\'] >= pd.to_datetime(\'today\').replace(month=4, day=1)][\'sales\'].sum()',
'result = { "type": "number", "value": total_sales }'
])

df = dfs[0]
# Calculate the total sales for the current fiscal year
total_sales = df[df['date'] >= pd.to_datetime('today').replace(month=4, day=1)]['sales'].sum()
result = { "type": "number", "value": total_sales }
"""
agent.train(queries=[query], codes=[response])

response = agent.chat("What is the total sales for the last fiscal year?")
print(response)

# The model will use the information provided in the training to generate a response
```
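If you prefer one of the local vector stores mentioned above instead of the default `BambooVectorStore`, a minimal sketch looks like the following; note that the `pandasai.ee.vectorstores` import path is an assumption based on earlier releases, so check the current package layout before relying on it:

```python
from pandasai import Agent
from pandasai.ee.vectorstores import ChromaDB  # assumed import path (enterprise extension)

# Use a local ChromaDB collection instead of the default BambooVectorStore
vector_store = ChromaDB()

# Instantiate the agent with the local vector store and train it as before
agent = Agent("data.csv", vectorstore=vector_store)
agent.train(docs="The fiscal year starts in April")
```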

Expand Down Expand Up @@ -114,15 +116,17 @@ agent = Agent("data.csv", vectorstore=vector_store)

# Train the model
query = "What is the total sales for the current fiscal year?"
response = """
import pandas as pd
df = dfs[0]
# The following code is passed as a string to the response variable
response = '\n'.join([
'import pandas as pd',
'',
'df = dfs[0]',
'',
'# Calculate the total sales for the current fiscal year',
'total_sales = df[df[\'date\'] >= pd.to_datetime(\'today\').replace(month=4, day=1)][\'sales\'].sum()',
'result = { "type": "number", "value": total_sales }'
])

# Calculate the total sales for the current fiscal year
total_sales = df[df['date'] >= pd.to_datetime('today').replace(month=4, day=1)]['sales'].sum()
result = { "type": "number", "value": total_sales }
"""
agent.train(queries=[query], codes=[response])

response = agent.chat("What is the total sales for the last fiscal year?")
@@ -149,3 +153,37 @@ vector_store = BambooVectorStore(api_key="YOUR_PANDABI_API_KEY")
# Instantiate the agent with the custom vector store
agent = Agent(connector, config={...}, vectorstore=vector_store)
```
## Custom Head

In some cases, you might want to provide custom data samples to the conversational agent to improve its understanding and responses. For example, you might want to:
- Provide better examples that represent your data patterns
- Avoid sharing sensitive information
- Guide the agent with specific data scenarios

You can do this by passing a custom head to the agent:

```python
import pandas as pd
import pandasai as pai

# Your original dataframe
df = pd.DataFrame({
'sensitive_id': [1001, 1002, 1003, 1004, 1005],
'amount': [150, 200, 300, 400, 500],
'category': ['A', 'B', 'A', 'C', 'B']
})

# Create a custom head with anonymized data
head_df = pd.DataFrame({
'sensitive_id': [1, 2, 3, 4, 5],
'amount': [100, 200, 300, 400, 500],
'category': ['A', 'B', 'C', 'A', 'B']
})

# Use the custom head
smart_df = pai.SmartDataframe(df, config={
"custom_head": head_df
})
```

The agent will use your custom head instead of the default first 5 rows of the dataframe when analyzing and responding to queries.
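As a quick usage sketch, you can then query the dataframe as usual; only the anonymized head is shared with the LLM, while the generated code still runs against your original data:

```python
# The LLM only ever sees head_df; the generated code executes on the full df
response = smart_df.chat("What is the average amount per category?")
print(response)
```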
70 changes: 0 additions & 70 deletions docs/v3/chat-and-cache.mdx

This file was deleted.

42 changes: 39 additions & 3 deletions docs/v3/output-formats.mdx → docs/v3/chat-and-output.mdx
@@ -1,12 +1,48 @@
---
title: 'Output formats'
description: 'Understanding the different output formats supported by PandaAI'
title: "Chat and output formats"
description: "Learn how to use PandaAI's powerful chat functionality and the output formats for natural language data analysis"
---

PandaAI supports multiple output formats for responses, each designed to handle different types of data and analysis results effectively. This document outlines the available output formats and their use cases.
<Note title="Beta Notice">
Release v3 is currently in beta. This documentation reflects the features and functionality in progress and may change before the final release.
</Note>

## Chat

The `.chat()` method is PandaAI's core feature that enables natural language interaction with your data. It allows you to:
- Query your data using plain English
- Generate visualizations and statistical analyses
- Work with multiple DataFrames simultaneously

For a more UI-based data analysis experience, check out our [Data Platform](/v3/ai-dashboards).

### Basic Usage

```python
import pandasai as pai

df_customers = pai.load("company/customers")

response = df_customers.chat("Which are our top 5 customers?")
```

### Chat with multiple DataFrames

```python
import pandasai as pai

df_customers = pai.load("company/customers")
df_orders = pai.load("company/orders")
df_products = pai.load("company/products")

response = pai.chat('Who are our top 5 customers and what products do they buy most frequently?', df_customers, df_orders, df_products)
```

## Available Output Formats

PandaAI supports multiple output formats for responses, each designed to handle different types of data and analysis results effectively. This document outlines the available output formats and their use cases.


### DataFrame Response
Used when the result is a pandas DataFrame. This format preserves the tabular structure of your data and allows for further data manipulation.
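For example (a sketch reusing the `company/customers` dataset from the chat examples above), a question that asks for tabular output comes back as a DataFrame response you can keep working with:

```python
import pandasai as pai

df_customers = pai.load("company/customers")

# A question that asks for a table returns a DataFrame response
response = df_customers.chat("Return the top 5 customers by total spend as a table")
print(response)
```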

6 changes: 5 additions & 1 deletion docs/v3/dataframes.mdx
@@ -3,8 +3,12 @@ title: 'Semantic Dataframes'
description: 'Working with semantic dataframes in PandaAI'
---

<Note title="Beta Notice">
Release v3 is currently in beta. This documentation reflects the features and functionality in progress and may change before the final release.
</Note>

Once you have turned raw data into semantic enhanced dataframes with the [semantic layer](/v3/semantic-layer), you can load them as either materialized or virtualized dataframes, depending on the data source.
Using the `.chat` method, you can ask questions and get responses and charts.
Using the [`.chat`](/v3/chat-and-output) method, you can ask questions and get responses and charts.
These dataframes can be [shared with your team](/v3/share-dataframes) by pushing them to our [data platform](/v3/ai-dashboards).

## Materialized Dataframes
2 changes: 1 addition & 1 deletion docs/v3/getting-started.mdx
@@ -47,7 +47,7 @@ Depending on your question, it can return different objects:
- chart
- number

Find out more about output data formats [here](/v3/output-formats)
Find out more about output data formats [here](/v3/chat-and-output#available-output-formats).
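As a rough sketch (assuming the `pai.read_csv` helper and a hypothetical local sales file), different questions produce different return types:

```python
import pandasai as pai

df = pai.read_csv("./data/sales.csv")  # hypothetical file path

# A question about a single figure typically returns a number
total = df.chat("What is the total revenue?")

# A question about a breakdown typically returns a dataframe
by_region = df.chat("Show revenue by region as a table")

# A question about a trend typically returns a chart
df.chat("Plot monthly revenue over time")
```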

### Create and load dataframes

11 changes: 5 additions & 6 deletions docs/v3/large-language-models.mdx
@@ -9,20 +9,19 @@ Release v3 is currently in beta. This documentation reflects the features and functionality in progress and may change before the final release.

PandaAI supports multiple LLMs.
To make the library lightweight, the default LLM is BambooLLM, developed by the PandaAI team itself.
To use other LLMs, you need to install the corresponding llm extension. Once a LLM extension is installed, you can configure it simply using `pai.config.set()`.
Then, every time you use the `.chat()` method, it will use the configured LLM.
To use other LLMs, you need to install the corresponding LLM extension.
Once an LLM extension is installed, you can configure it simply using [`pai.config.set()`](/v3/overview-nl#configure-the-nl-layer).
Then, every time you use the [`.chat()`](/v3/chat-and-output) method, it will use the configured LLM.
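As an illustration, configuring OpenAI as the LLM might look like the sketch below; the extension package name `pandasai-openai` and the `OpenAI` import path are assumptions, so check the extension's documentation for the exact names:

```python
# pip install pandasai-openai  (assumed extension package name)
import pandasai as pai
from pandasai_openai import OpenAI  # assumed import path

llm = OpenAI(api_token="your-openai-api-key")

# Register the LLM once; every subsequent .chat() call will use it
pai.config.set({"llm": llm})
```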

## BambooLLM

BambooLLM is the default LLM for PandaAI, fine-tuned for data analysis.
You can get your free API key by signing up at [pandabi.ai](https://app.pandabi.ai).
You can get your free API key by signing up at [app.pandabi.ai](https://app.pandabi.ai).

```python
import pandasai as pai

# set up BambooLLM
# replace "YOUR_PANDABI_API_KEY" with your API key from https://app.pandabi.ai
os.environ["PANDABI_API_KEY"] = "YOUR_PANDABI_API_KEY"
pai.api_key.set("api-key")
```

## OpenAI models
10 changes: 2 additions & 8 deletions docs/v3/overview-nl.mdx
@@ -10,9 +10,9 @@ Release v3 is currently in beta. This documentation reflects the features and functionality in progress and may change before the final release.
## How does the PandaAI NL Layer work?

The Natural Language Layer uses generative AI to transform natural language queries into production-ready code generated by LLMs.
When you use the `.chat` method on a [semantic dataframe](/v3/dataframes), PandaAI passes to the LLM the question, the table headers, and 5-10 rows of the Dataframe.
When you use the [`.chat`](/v3/chat-and-output) method on a [semantic dataframe](/v3/dataframes), PandaAI passes to the LLM the question, the table headers, and 5-10 rows of the Dataframe.
It then instructs the LLM to generate the most relevant code, whether Python or SQL. The code is then executed locally.
There are different output formats supported by PandaAI, which can be found [here](/v3/output-formats).
There are different output formats supported by PandaAI, which can be found [here](/v3/chat-and-output#available-output-formats).

The NL Layer is also one of the core components of our [Data Platform](/v3/ai-dashboards), which allows you to turn raw data into collaborative AI dashboards with built-in conversational agents in a single line of code.
@@ -33,7 +33,6 @@
```python
pai.config.set({
"llm": "openai",
"save_logs": True,
"verbose": False,
"enable_cache": True,
"max_retries": 3
})
```
@@ -53,11 +52,6 @@ pai.config.set({
- **Default**: `False`
- **Description**: Whether to print the logs in the console as PandaAI is executed.

#### enable_cache
- **Type**: `bool`
- **Default**: `True`
- **Description**: Whether to enable caching. If set to True, PandaAI will cache the results of the LLM to improve the response time. If set to False, PandaAI will always call the LLM. Learn more about [caching](/v3/chat-and-cache#cache).

#### max_retries
- **Type**: `int`
- **Default**: `3`
4 changes: 4 additions & 0 deletions docs/v3/semantic-layer.mdx
@@ -3,6 +3,10 @@ title: 'Semantic Layer'
description: 'Turn raw data into semantic-enhanced and clean dataframes'
---

<Note title="Beta Notice">
Release v3 is currently in beta. This documentation reflects the features and functionality in progress and may change before the final release.
</Note>

## What's the Semantic Layer?

The semantic layer allows you to turn raw data into [dataframes](/v3/dataframes) you can ask questions to and [share with your team](/v3/share-dataframes) as conversational AI dashboards. It serves several important purposes:
