
Commit

removed obsolete sections
gdcsinaptik committed Jan 17, 2025
1 parent aa7ee42 commit ec86585
Showing 12 changed files with 174 additions and 170 deletions.
6 changes: 4 additions & 2 deletions README.md
@@ -154,16 +154,18 @@ PandaAI is available under the MIT expat license, except for the `pandasai/ee` directory
If you are interested in managed PandaAI Cloud or self-hosted Enterprise Offering, [contact us](https://forms.gle/JEUqkwuTqFZjhP7h8).

## Resources

- [Docs](https://pandas-ai.readthedocs.io/en/latest/) for comprehensive documentation
- [Examples](examples) for example notebooks
- [Discord](https://discord.gg/KYKj9F2FRH) for discussion with the community and PandaAI team

> **Beta Notice**
> Release v3 is currently in beta. This documentation reflects the features and functionality in progress and may change before the final release.
## 🤝 Contributing

Contributions are welcome! Please check the outstanding issues and feel free to open a pull request.
For more information, please check out the [contributing guidelines](CONTRIBUTING.md).

### Thank you!

[![Contributors](https://contrib.rocks/image?repo=sinaptik-ai/pandas-ai)](https://github.com/sinaptik-ai/pandas-ai/graphs/contributors)
[![Contributors](https://contrib.rocks/image?repo=sinaptik-ai/pandas-ai)](https://github.com/sinaptik-ai/pandas-ai/graphs/contributors)
2 changes: 1 addition & 1 deletion docs/mint.json
@@ -64,7 +64,7 @@
},
{
"group": "Natural Language",
"pages": ["v3/overview-nl", "v3/large-language-models", "v3/chat-and-cache", "v3/output-formats"],
"pages": ["v3/overview-nl", "v3/large-language-models", "v3/chat-and-output"],
"version": "v3"
},
{
84 changes: 61 additions & 23 deletions docs/v3/agent.mdx
@@ -12,14 +12,13 @@ You can train PandaAI to understand your data better and to improve its performance

## Prerequisites

Before you start training PandaAI, you need to set your PandaAI API key. You can generate your API key by signing up at [https://app.pandabi.ai](https://app.pandabi.ai).

Then you can set your API key as an environment variable:
Before you start training PandaAI, you need to set your PandaAI API key.
You can generate your API key by signing up at [https://app.pandabi.ai](https://app.pandabi.ai).

```python
import os
import pandasai as pai

os.environ["PANDABI_API_KEY"] = "YOUR_PANDABI_API_KEY"
pai.api_key.set("your-pai-api-key")
```

It is important that you set the API key, or it will fail with the following error: `No vector store provided. Please provide a vector store to train the agent`.
@@ -37,10 +36,10 @@ The training uses by default the `BambooVectorStore` to store the training data,
As an alternative, if you want to use a local vector store (enterprise only for production use cases), you can use the `ChromaDB`, `Qdrant` or `Pinecone` vector stores (see examples below).

```python
import pandasai as pai
from pandasai import Agent

# Set your PandaAI API key (you can generate one by signing up at https://app.pandabi.ai)
os.environ["PANDABI_API_KEY"] = "YOUR_PANDABI_API_KEY"
pai.api_key.set("your-pai-api-key")

agent = Agent("data.csv")
agent.train(docs="The fiscal year starts in April")
Expand All @@ -65,19 +64,22 @@ agent = Agent("data.csv")

# Train the model
query = "What is the total sales for the current fiscal year?"
response = """
import pandas as pd
# The following code is passed as a string to the response variable
response = '\n'.join([
'import pandas as pd',
'',
'df = dfs[0]',
'',
'# Calculate the total sales for the current fiscal year',
'total_sales = df[df[\'date\'] >= pd.to_datetime(\'today\').replace(month=4, day=1)][\'sales\'].sum()',
'result = { "type": "number", "value": total_sales }'
])

df = dfs[0]
# Calculate the total sales for the current fiscal year
total_sales = df[df['date'] >= pd.to_datetime('today').replace(month=4, day=1)]['sales'].sum()
result = { "type": "number", "value": total_sales }
"""
agent.train(queries=[query], codes=[response])

response = agent.chat("What is the total sales for the last fiscal year?")
print(response)

# The model will use the information provided in the training to generate a response
```
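If you prefer one of the local vector stores mentioned above instead of the default `BambooVectorStore`, a minimal sketch looks like the following; note that the `pandasai.ee.vectorstores` import path is an assumption based on earlier releases, so check the current package layout before relying on it:

```python
from pandasai import Agent
from pandasai.ee.vectorstores import ChromaDB  # assumed import path (enterprise extension)

# Use a local ChromaDB collection instead of the default BambooVectorStore
vector_store = ChromaDB()

# Instantiate the agent with the local vector store and train it as before
agent = Agent("data.csv", vectorstore=vector_store)
agent.train(docs="The fiscal year starts in April")
```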

Expand Down Expand Up @@ -114,15 +116,17 @@ agent = Agent("data.csv", vectorstore=vector_store)

# Train the model
query = "What is the total sales for the current fiscal year?"
response = """
import pandas as pd
df = dfs[0]
# The following code is passed as a string to the response variable
response = '\n'.join([
'import pandas as pd',
'',
'df = dfs[0]',
'',
'# Calculate the total sales for the current fiscal year',
'total_sales = df[df[\'date\'] >= pd.to_datetime(\'today\').replace(month=4, day=1)][\'sales\'].sum()',
'result = { "type": "number", "value": total_sales }'
])

# Calculate the total sales for the current fiscal year
total_sales = df[df['date'] >= pd.to_datetime('today').replace(month=4, day=1)]['sales'].sum()
result = { "type": "number", "value": total_sales }
"""
agent.train(queries=[query], codes=[response])

response = agent.chat("What is the total sales for the last fiscal year?")
@@ -149,3 +153,37 @@ vector_store = BambooVectorStore(api_key="YOUR_PANDABI_API_KEY")
# Instantiate the agent with the custom vector store
agent = Agent(connector, config={...}, vectorstore=vector_store)
```
## Custom Head

In some cases, you might want to provide custom data samples to the conversational agent to improve its understanding and responses. For example, you might want to:
- Provide better examples that represent your data patterns
- Avoid sharing sensitive information
- Guide the agent with specific data scenarios

You can do this by passing a custom head to the agent:

```python
import pandas as pd
import pandasai as pai

# Your original dataframe
df = pd.DataFrame({
'sensitive_id': [1001, 1002, 1003, 1004, 1005],
'amount': [150, 200, 300, 400, 500],
'category': ['A', 'B', 'A', 'C', 'B']
})

# Create a custom head with anonymized data
head_df = pd.DataFrame({
'sensitive_id': [1, 2, 3, 4, 5],
'amount': [100, 200, 300, 400, 500],
'category': ['A', 'B', 'C', 'A', 'B']
})

# Use the custom head
smart_df = pai.SmartDataframe(df, config={
"custom_head": head_df
})
```

The agent will use your custom head instead of the default first 5 rows of the dataframe when analyzing and responding to queries.
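As a quick usage sketch, you can then query the dataframe as usual; only the anonymized head is shared with the LLM, while the generated code still runs against your original data:

```python
# The LLM only ever sees head_df; the generated code executes on the full df
response = smart_df.chat("What is the average amount per category?")
print(response)
```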
70 changes: 0 additions & 70 deletions docs/v3/chat-and-cache.mdx

This file was deleted.

42 changes: 39 additions & 3 deletions docs/v3/output-formats.mdx → docs/v3/chat-and-output.mdx
@@ -1,12 +1,48 @@
---
title: 'Output formats'
description: 'Understanding the different output formats supported by PandaAI'
title: "Chat and output formats"
description: "Learn how to use PandaAI's powerful chat functionality and the output formats for natural language data analysis"
---

PandaAI supports multiple output formats for responses, each designed to handle different types of data and analysis results effectively. This document outlines the available output formats and their use cases.
<Note title="Beta Notice">
Release v3 is currently in beta. This documentation reflects the features and functionality in progress and may change before the final release.
</Note>

## Chat

The `.chat()` method is PandaAI's core feature that enables natural language interaction with your data. It allows you to:
- Query your data using plain English
- Generate visualizations and statistical analyses
- Work with multiple DataFrames simultaneously

For a more UI-based data analysis experience, check out our [Data Platform](/v3/ai-dashboards).

### Basic Usage

```python
import pandasai as pai

df_customers = pai.load("company/customers")

response = df_customers.chat("Which are our top 5 customers?")
```

### Chat with multiple DataFrames

```python
import pandasai as pai

df_customers = pai.load("company/customers")
df_orders = pai.load("company/orders")
df_products = pai.load("company/products")

response = pai.chat('Who are our top 5 customers and what products do they buy most frequently?', df_customers, df_orders, df_products)
```

## Available Output Formats

PandaAI supports multiple output formats for responses, each designed to handle different types of data and analysis results effectively. This document outlines the available output formats and their use cases.


### DataFrame Response
Used when the result is a pandas DataFrame. This format preserves the tabular structure of your data and allows for further data manipulation.
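For example (a sketch reusing the `company/customers` dataset from the chat examples above), a question that asks for tabular output comes back as a DataFrame response you can keep working with:

```python
import pandasai as pai

df_customers = pai.load("company/customers")

# A question that asks for a table returns a DataFrame response
response = df_customers.chat("Return the top 5 customers by total spend as a table")
print(response)
```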

6 changes: 5 additions & 1 deletion docs/v3/dataframes.mdx
@@ -3,8 +3,12 @@ title: 'Semantic Dataframes'
description: 'Working with semantic dataframes in PandaAI'
---

<Note title="Beta Notice">
Release v3 is currently in beta. This documentation reflects the features and functionality in progress and may change before the final release.
</Note>

Once you have turned raw data into semantic enhanced dataframes with the [semantic layer](/v3/semantic-layer), you can load them as either materialized or virtualized dataframes, depending on the data source.
Using the `.chat` method, you can ask questions and get responses and charts.
Using the [`.chat`](/v3/chat-and-output) method, you can ask questions and get responses and charts.
These dataframes can be [shared with your team](/v3/share-dataframes) by pushing them to our [data platform](/v3/ai-dashboards).

## Materialized Dataframes
2 changes: 1 addition & 1 deletion docs/v3/getting-started.mdx
@@ -47,7 +47,7 @@ Depending on your question, it can return different objects:
- chart
- number

Find out more about output data formats [here](/v3/output-formats)
Find out more about output data formats [here](/v3/chat-and-output#available-output-formats).
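As a rough sketch (assuming the `pai.read_csv` helper and a hypothetical local sales file), different questions produce different return types:

```python
import pandasai as pai

df = pai.read_csv("./data/sales.csv")  # hypothetical file path

# A question about a single figure typically returns a number
total = df.chat("What is the total revenue?")

# A question about a breakdown typically returns a dataframe
by_region = df.chat("Show revenue by region as a table")

# A question about a trend typically returns a chart
df.chat("Plot monthly revenue over time")
```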

### Create and load dataframes

11 changes: 5 additions & 6 deletions docs/v3/large-language-models.mdx
@@ -9,20 +9,19 @@ Release v3 is currently in beta. This documentation reflects the features and functionality in progress and may change before the final release.

PandaAI supports multiple LLMs.
To make the library lightweight, the default LLM is BambooLLM, developed by the PandaAI team itself.
To use other LLMs, you need to install the corresponding llm extension. Once a LLM extension is installed, you can configure it simply using `pai.config.set()`.
Then, every time you use the `.chat()` method, it will use the configured LLM.
To use other LLMs, you need to install the corresponding LLM extension.
Once an LLM extension is installed, you can configure it simply using [`pai.config.set()`](/v3/overview-nl#configure-the-nl-layer).
Then, every time you use the [`.chat()`](/v3/chat-and-output) method, it will use the configured LLM.
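As an illustration, configuring OpenAI as the LLM might look like the sketch below; the extension package name `pandasai-openai` and the `OpenAI` import path are assumptions, so check the extension's documentation for the exact names:

```python
# pip install pandasai-openai  (assumed extension package name)
import pandasai as pai
from pandasai_openai import OpenAI  # assumed import path

llm = OpenAI(api_token="your-openai-api-key")

# Register the LLM once; every subsequent .chat() call will use it
pai.config.set({"llm": llm})
```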

## BambooLLM

BambooLLM is the default LLM for PandaAI, fine-tuned for data analysis.
You can get your free API key by signing up at [pandabi.ai](https://app.pandabi.ai).
You can get your free API key by signing up at [app.pandabi.ai](https://app.pandabi.ai).

```python
import pandasai as pai

# set up BambooLLM
# replace "YOUR_PANDABI_API_KEY" with your API key from https://app.pandabi.ai
os.environ["PANDABI_API_KEY"] = "YOUR_PANDABI_API_KEY"
pai.api_key.set("api-key")
```

## OpenAI models
10 changes: 2 additions & 8 deletions docs/v3/overview-nl.mdx
@@ -10,9 +10,9 @@ Release v3 is currently in beta. This documentation reflects the features and functionality in progress and may change before the final release.
## How does the PandaAI NL Layer work?

The Natural Language Layer uses generative AI to transform natural language queries into production-ready code generated by LLMs.
When you use the `.chat` method on a [semantic dataframe](/v3/dataframes), PandaAI passes to the LLM the question, the table headers, and 5-10 rows of the Dataframe.
When you use the [`.chat`](/v3/chat-and-output) method on a [semantic dataframe](/v3/dataframes), PandaAI passes to the LLM the question, the table headers, and 5-10 rows of the Dataframe.
It then instructs the LLM to generate the most relevant code, whether Python or SQL. The code is then executed locally.
There are different output formats supported by PandaAI, which can be found [here](/v3/output-formats).
There are different output formats supported by PandaAI, which can be found [here](/v3/chat-and-output#available-output-formats).

The NL Layer is also one of the core components of our [Data Platform](/v3/ai-dashboards), which allows you to turn raw data into collaborative AI dashboards with built-in conversational agents in a single line of code.
@@ -33,7 +33,6 @@
```python
pai.config.set({
"llm": "openai",
"save_logs": True,
"verbose": False,
"enable_cache": True,
"max_retries": 3
})
```
@@ -53,11 +52,6 @@ pai.config.set({
- **Default**: `False`
- **Description**: Whether to print the logs in the console as PandaAI is executed.

#### enable_cache
- **Type**: `bool`
- **Default**: `True`
- **Description**: Whether to enable caching. If set to True, PandaAI will cache the results of the LLM to improve the response time. If set to False, PandaAI will always call the LLM. Learn more about [caching](/v3/chat-and-cache#cache).

#### max_retries
- **Type**: `int`
- **Default**: `3`
4 changes: 4 additions & 0 deletions docs/v3/semantic-layer.mdx
@@ -3,6 +3,10 @@ title: 'Semantic Layer'
description: 'Turn raw data into semantic-enhanced and clean dataframes'
---

<Note title="Beta Notice">
Release v3 is currently in beta. This documentation reflects the features and functionality in progress and may change before the final release.
</Note>

## What's the Semantic Layer?

The semantic layer allows you to turn raw data into [dataframes](/v3/dataframes) you can ask questions to and [share with your team](/v3/share-dataframes) as conversational AI dashboards. It serves several important purposes:
