Assistant improvements: Table annotations, and few-shot examples (#664)
* Table annotations
* Few-shot prompt examples
* View Assistant history
* Better 'relevant table' detection and UI
* Improved prompts
* Cmd+shift+F shortcut for formatting SQL
chrisclark authored Aug 29, 2024
1 parent 84c2eef commit 2d7736d
Showing 43 changed files with 1,532 additions and 294 deletions.
47 changes: 40 additions & 7 deletions HISTORY.rst
@@ -7,13 +7,44 @@ This project adheres to `Semantic Versioning <https://semver.org/>`_.

vNext
===========================
* `#660`_: Userspace connection migration. This should be an invisible change, but represents a significant refactor of how connections function.
Instead of a weird blend of DatabaseConnection models and underlying Django models (which were the original Explorer connections),
this migrates all connections to DatabaseConnection models and implements proper foreign keys to them on the Query and QueryLog models.
A data migration creates new DatabaseConnection models based on the configured settings.EXPLORER_CONNECTIONS.
Going forward, admins can create new Django-backed DatabaseConnection models by registering the connection in EXPLORER_CONNECTIONS, and then creating a
DatabaseConnection model using the Django admin or the user-facing /connections/new/ form, and entering the Django DB alias and setting the connection type to "Django Connection"

* `#664`_: Improvements to the AI SQL Assistant:

- Table Annotations: Write persistent table annotations with descriptive information that will get injected into the
prompt for the assistant. For example, if a table is commonly joined to another table through a non-obvious foreign
key, you can tell the assistant about it in plain English, as an annotation to that table. Every time that table is
deemed 'relevant' to an assistant request, that annotation will be included alongside the schema and sample data.
- Few-Shot Examples: Using the small checkbox on the bottom-right of any saved query, you can designate certain
queries as 'few-shot examples'. When making an assistant request, any designated few-shot examples that reference
the same tables as your assistant request will get included as 'reference sql' in the prompt for the LLM.
- Autocomplete / multiselect when selecting which tables' information to send to the SQL Assistant. Much easier and
more keyboard-focused.
- Relevant tables are added client-side visually, in real time, based on what's in the SQL editor and/or any tables
mentioned in the assistant request. The dependency on sql_metadata is therefore removed, as server-side SQL parsing
is no longer necessary.
- Ability to view Assistant request/response history.
- Improved system prompt that emphasizes the particular SQL dialect being used.
- Addresses issue #657.

* `#660`_: Userspace connection migration.

- This should be an invisible change, but represents a significant refactor of how connections function. Instead of a
weird blend of DatabaseConnection models and underlying Django models (which were the original Explorer
connections), this migrates all connections to DatabaseConnection models and implements proper foreign keys to them
on the Query and QueryLog models. A data migration creates new DatabaseConnection models based on the configured
settings.EXPLORER_CONNECTIONS. Going forward, admins can create new Django-backed DatabaseConnection models by
registering the connection in EXPLORER_CONNECTIONS, and then creating a DatabaseConnection model using the Django
admin or the user-facing /connections/new/ form, and entering the Django DB alias and setting the connection type
to "Django Connection".
- The Query.connection and QueryLog.connection fields are deprecated and will be removed in a future release. They
are kept around in this release in case there is an unforeseen issue with the migration. Preserving the fields for
now ensures there is no data loss in the event that a rollback to an earlier version is required.

* Fixed a bug when validating connections to uploaded files. Also added basic locking when downloading files from S3.

* Keyboard shortcut for formatting the SQL in the editor.

- Cmd+Shift+F (Windows: Ctrl+Shift+F)
- The format button has been moved to be a small icon towards the bottom-right of the SQL editor.

`5.2.0`_ (2024-08-19)
===========================
@@ -643,6 +674,8 @@ Initial Release
.. _#651: https://github.com/explorerhq/sql-explorer/pull/651
.. _#659: https://github.com/explorerhq/sql-explorer/pull/659
.. _#662: https://github.com/explorerhq/sql-explorer/pull/662
.. _#660: https://github.com/explorerhq/sql-explorer/pull/660
.. _#664: https://github.com/explorerhq/sql-explorer/pull/664

.. _#269: https://github.com/explorerhq/sql-explorer/issues/269
.. _#288: https://github.com/explorerhq/sql-explorer/issues/288
17 changes: 14 additions & 3 deletions docs/features.rst
@@ -5,7 +5,19 @@ SQL Assistant
-------------
- Built in integration with OpenAI (or the LLM of your choosing)
to quickly get help with your query, with relevant schema
automatically injected into the prompt. Simple, effective.
automatically injected into the prompt.
- The assistant tries hard to get relevant context into the prompt to the LLM, alongside your explicit request. You
can choose tables to include explicitly, and any tables you reference in your SQL are included as
well. When a table is "included", the prompt contains the table's schema, 3 sample rows, any Table
Annotations you have added, and any designated "few-shot examples". More on each of those below.
- Table Annotations: Write persistent table annotations with descriptive information that will get injected into the
prompt for the assistant. For example, if a table is commonly joined to another table through a non-obvious foreign
key, you can tell the assistant about it in plain English, as an annotation to that table. Every time that table is
deemed 'relevant' to an assistant request, that annotation will be included alongside the schema and sample data.
- Few-shot examples: Using the small checkbox on the bottom-right of any saved query, you can designate queries as
"Assistant Examples". When making an assistant request, the 'included tables' are intersected with the tables
referenced by designated Example queries; the matching queries are injected into the prompt, and the LLM is told
that these are good reference queries.
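
As a rough illustration of the few-shot selection described above, the sketch below intersects the tables included in a request with the tables each designated example query references. The names `ExampleQuery`, `tables`, and `select_few_shot_examples` are hypothetical and are not taken from the Explorer codebase, and the case-insensitive comparison is an assumption; the actual implementation may differ.

```python
from dataclasses import dataclass, field


@dataclass
class ExampleQuery:
    """Hypothetical stand-in for a saved query flagged as an Assistant Example."""
    title: str
    sql: str
    tables: set = field(default_factory=set)  # tables the query references


def select_few_shot_examples(included_tables, examples):
    """Return the examples that reference at least one included table."""
    # Assumption: table names are compared case-insensitively.
    included = {t.lower() for t in included_tables}
    return [q for q in examples if included & {t.lower() for t in q.tables}]


examples = [
    ExampleQuery("Orders per user", "SELECT user_id, COUNT(*) FROM orders GROUP BY 1", {"orders"}),
    ExampleQuery("Active products", "SELECT * FROM products WHERE active", {"products"}),
]

selected = select_few_shot_examples(["Orders", "users"], examples)
print([q.title for q in selected])
```

Only the query that shares a table with the request is selected; an example whose tables do not overlap with the included tables is left out of the prompt.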

Database Support
----------------
@@ -222,8 +234,7 @@ Power tips
view.
- Command+Enter and Ctrl+Enter will execute a query when typing in
the SQL editor area.
- Hit the "Format" button to format and clean up your SQL (this is
non-validating -- just formatting).
- Cmd+Shift+F (Windows: Ctrl+Shift+F) to format the SQL in the editor.
- Use the Query Logs feature to share one-time queries that aren't
worth creating a persistent query for. Just run your SQL in the
playground, then navigate to ``/logs`` and share the link
2 changes: 1 addition & 1 deletion explorer/admin.py
@@ -7,7 +7,7 @@

@admin.register(Query)
class QueryAdmin(admin.ModelAdmin):
    list_display = ("title", "description", "created_by_user",)
    list_display = ("title", "description", "created_by_user", "few_shot")
    list_filter = ("title",)
    raw_id_fields = ("created_by_user",)
    actions = [generate_report_action()]
6 changes: 6 additions & 0 deletions explorer/app_settings.py
@@ -152,11 +152,17 @@
EXPLORER_AI_API_KEY = getattr(settings, "EXPLORER_AI_API_KEY", None)

EXPLORER_ASSISTANT_BASE_URL = getattr(settings, "EXPLORER_ASSISTANT_BASE_URL", "https://api.openai.com/v1")

# Deprecated. Will be removed in a future release. Please use EXPLORER_ASSISTANT_MODEL_NAME instead
EXPLORER_ASSISTANT_MODEL = getattr(settings, "EXPLORER_ASSISTANT_MODEL",
                                   # Return the model name and max_tokens it supports
                                   {"name": "gpt-4o",
                                    "max_tokens": 128000})

EXPLORER_ASSISTANT_MODEL_NAME = getattr(settings, "EXPLORER_ASSISTANT_MODEL_NAME",
                                        EXPLORER_ASSISTANT_MODEL["name"])


EXPLORER_DB_CONNECTIONS_ENABLED = getattr(settings, "EXPLORER_DB_CONNECTIONS_ENABLED", False)
EXPLORER_USER_UPLOADS_ENABLED = getattr(settings, "EXPLORER_USER_UPLOADS_ENABLED", False)
EXPLORER_PRUNE_LOCAL_UPLOAD_COPY_DAYS_INACTIVITY = getattr(settings,
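
The fallback chain for `EXPLORER_ASSISTANT_MODEL_NAME` in this hunk can be sketched in isolation as follows. This is a minimal illustration, not the project's code: `resolve_model_name` is a hypothetical helper, `SimpleNamespace` stands in for `django.conf.settings`, and the model names other than the `gpt-4o` default are placeholders.

```python
from types import SimpleNamespace

# Built-in default, mirroring the deprecated dict-style setting.
DEFAULT_MODEL = {"name": "gpt-4o", "max_tokens": 128000}


def resolve_model_name(settings):
    """Mimic the getattr chain: new setting, then deprecated dict, then default."""
    legacy = getattr(settings, "EXPLORER_ASSISTANT_MODEL", DEFAULT_MODEL)
    return getattr(settings, "EXPLORER_ASSISTANT_MODEL_NAME", legacy["name"])


# SimpleNamespace stands in for django.conf.settings (an assumption for this demo).
nothing_set = SimpleNamespace()
legacy_only = SimpleNamespace(EXPLORER_ASSISTANT_MODEL={"name": "legacy-model", "max_tokens": 8192})
both_set = SimpleNamespace(
    EXPLORER_ASSISTANT_MODEL={"name": "legacy-model", "max_tokens": 8192},
    EXPLORER_ASSISTANT_MODEL_NAME="new-model",
)

for s in (nothing_set, legacy_only, both_set):
    print(resolve_model_name(s))
```

Under this reading, projects that still define the deprecated `EXPLORER_ASSISTANT_MODEL` dict keep working, and the new setting takes precedence once it is defined.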
42 changes: 42 additions & 0 deletions explorer/assistant/forms.py
@@ -0,0 +1,42 @@
from django import forms
from explorer.assistant.models import TableDescription
from explorer.ee.db_connections.utils import default_db_connection


class TableDescriptionForm(forms.ModelForm):
    class Meta:
        model = TableDescription
        fields = "__all__"
        widgets = {
            "database_connection": forms.Select(attrs={"class": "form-select"}),
            "description": forms.Textarea(attrs={"class": "form-control", "rows": 3}),
        }

    def __init__(self, *args, **kwargs):
        super().__init__(*args, **kwargs)
        if not self.instance.pk:  # Check if this is a new instance
            # Set the default value for database_connection
            self.fields["database_connection"].initial = default_db_connection()

        if self.instance and self.instance.table_name:
            choices = [(self.instance.table_name, self.instance.table_name)]
        else:
            choices = []

        f = forms.ChoiceField(
            choices=choices,
            widget=forms.Select(attrs={"class": "form-select", "data-placeholder": "Select table"})
        )

        # We don't actually care about validating the 'choices' that the ChoiceField does by default.
        # Really we are just using that field type in order to get a valid pre-populated Select widget on the client
        # But also it can't be blank!
        def valid_value_new(v):
            return bool(v)

        f.valid_value = valid_value_new

        self.fields["table_name"] = f

        if self.instance and self.instance.table_name:
            self.fields["table_name"].initial = self.instance.table_name
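
The `valid_value` override in the form above replaces the check on a single field instance rather than on the class. A minimal sketch of that pattern, using a hypothetical `ChoiceLikeField` stand-in rather than Django's `ChoiceField` so that it runs without Django installed:

```python
class ChoiceLikeField:
    """Hypothetical stand-in for a choice field; this is not Django's ChoiceField."""

    def __init__(self, choices):
        self.choices = list(choices)

    def valid_value(self, value):
        # Default behaviour: the value must be one of the declared choices.
        return value in {key for key, _label in self.choices}

    def clean(self, value):
        if not self.valid_value(value):
            raise ValueError(f"{value!r} is not a valid choice")
        return value


field = ChoiceLikeField(choices=[])

# Replace the check on this one instance only: any non-empty value passes.
# A plain function assigned to an instance attribute is not bound, so it takes no `self`.
field.valid_value = lambda v: bool(v)

print(field.clean("orders"))
```

Other instances of the class keep the default membership check, which is why the override can live inside a single form's `__init__` without affecting other forms.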
17 changes: 17 additions & 0 deletions explorer/assistant/models.py
@@ -1,5 +1,6 @@
from django.db import models
from django.conf import settings
from explorer.ee.db_connections.models import DatabaseConnection


class PromptLog(models.Model):
@@ -8,6 +9,7 @@ class Meta:
        app_label = "explorer"

    prompt = models.TextField(blank=True)
    user_request = models.TextField(blank=True)
    response = models.TextField(blank=True)
    run_by_user = models.ForeignKey(
        settings.AUTH_USER_MODEL,
@@ -19,3 +21,18 @@ class Meta:
    duration = models.FloatField(blank=True, null=True)  # seconds
    model = models.CharField(blank=True, max_length=128, default="")
    error = models.TextField(blank=True, null=True)
    database_connection = models.ForeignKey(to=DatabaseConnection, on_delete=models.SET_NULL, blank=True, null=True)


class TableDescription(models.Model):

    class Meta:
        app_label = "explorer"
        unique_together = ("database_connection", "table_name")

    database_connection = models.ForeignKey(to=DatabaseConnection, on_delete=models.CASCADE)
    table_name = models.CharField(max_length=512)
    description = models.TextField()

    def __str__(self):
        return f"{self.database_connection.alias} - {self.table_name}"
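
Because `TableDescription` is unique per `(database_connection, table_name)`, an annotation lookup can be keyed on that pair. The sketch below uses a plain dict in place of the model and a hypothetical `build_table_context` helper; the sample annotation, column names, and prompt layout are assumptions for illustration, not Explorer's actual prompt format.

```python
# Hypothetical in-memory stand-in for TableDescription rows, keyed like
# the unique_together constraint: (connection alias, table name).
annotations = {
    ("default", "orders"): "Join to users via orders.purchaser_ref = users.id.",
}


def build_table_context(connection_alias, table_name, schema, sample_rows):
    """Assemble one table's prompt context (illustrative layout only)."""
    parts = [
        f"Table: {table_name}",
        f"Schema: {schema}",
        f"Sample rows: {sample_rows[:3]}",  # at most 3 sample rows
    ]
    note = annotations.get((connection_alias, table_name))
    if note:
        parts.append(f"Annotation: {note}")
    return "\n".join(parts)


context = build_table_context(
    "default", "orders", "id INTEGER, purchaser_ref INTEGER", [(1, 7), (2, 7), (3, 9), (4, 2)]
)
print(context)
```

A table without an annotation simply yields its schema and sample rows, so annotations stay optional.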
16 changes: 16 additions & 0 deletions explorer/assistant/urls.py
@@ -0,0 +1,16 @@
from django.urls import path
from explorer.assistant.views import (TableDescriptionListView,
                                      TableDescriptionCreateView,
                                      TableDescriptionUpdateView,
                                      TableDescriptionDeleteView,
                                      AssistantHelpView,
                                      AssistantHistoryApiView)

assistant_urls = [
    path("assistant/", AssistantHelpView.as_view(), name="assistant"),
    path("assistant/history/", AssistantHistoryApiView.as_view(), name="assistant_history"),
    path("table-descriptions/", TableDescriptionListView.as_view(), name="table_description_list"),
    path("table-descriptions/new/", TableDescriptionCreateView.as_view(), name="table_description_create"),
    path("table-descriptions/<int:pk>/update/", TableDescriptionUpdateView.as_view(), name="table_description_update"),
    path("table-descriptions/<int:pk>/delete/", TableDescriptionDeleteView.as_view(), name="table_description_delete"),
]