Skip to content

Commit

Permalink
All connections are db-backed (#660)
Browse files Browse the repository at this point in the history
* make all connections db-backed
  • Loading branch information
chrisclark authored Aug 21, 2024
1 parent 26c33ac commit 84c2eef
Show file tree
Hide file tree
Showing 55 changed files with 755 additions and 806 deletions.
1 change: 1 addition & 0 deletions .gitignore
Original file line number Diff line number Diff line change
Expand Up @@ -28,3 +28,4 @@ chinook.sqlite
model_data.json
tst1
tst1-journal
tst2
10 changes: 10 additions & 0 deletions HISTORY.rst
Original file line number Diff line number Diff line change
Expand Up @@ -5,6 +5,16 @@ Change Log
This document records all notable changes to `SQL Explorer <https://github.com/explorerhq/sql-explorer>`_.
This project adheres to `Semantic Versioning <https://semver.org/>`_.

vNext
===========================
* `#660`_: Userspace connection migration. This should be an invisible change, but represents a significant refactor of how connections function.
Instead of a weird blend of DatabaseConnection models and underlying Django models (which were the original Explorer connections),
this migrates all connections to DatabaseConnection models and implements proper foreign keys to them on the Query and QueryLog models.
A data migration creates new DatabaseConnection models based on the configured settings.EXPLORER_CONNECTIONS.
Going forward, admins can create new Django-backed DatabaseConnection models by registering the connection in EXPLORER_CONNECTIONS, and then creating a
DatabaseConnection model using the Django admin or the user-facing /connections/new/ form, and entering the Django DB alias and setting the connection type to "Django Connection"


`5.2.0`_ (2024-08-19)
===========================
* `#651`_: Ability to append an upload to a previously uploaded file/sqlite DB as a new table
Expand Down
4 changes: 2 additions & 2 deletions docs/development.rst
Original file line number Diff line number Diff line change
Expand Up @@ -36,10 +36,10 @@ Install the dev requirements:

And then:

``python manage.py test --settings=tests.settings``
``python manage.py test --settings=explorer.tests.settings``

Or with coverage:

``coverage run --source='.' manage.py test --settings=tests.settings``
``coverage run --source='.' manage.py test --settings=explorer.tests.settings``
``coverage combine``
``coverage report``
27 changes: 3 additions & 24 deletions docs/install.rst
Original file line number Diff line number Diff line change
Expand Up @@ -30,20 +30,6 @@ Add to your ``INSTALLED_APPS``, located in the ``settings.py`` file in your proj
'explorer',
)
Configure your settings to something like:

.. code-block:: python
EXPLORER_CONNECTIONS = { 'Default': 'readonly' }
EXPLORER_DEFAULT_CONNECTION = 'readonly'
The first setting lists the connections you want to allow Explorer to
use. The keys of the connections dictionary are friendly names to show
Explorer users, and the values are the actual database aliases used in
``settings.DATABASES``. It is highly recommended to setup read-only roles
in your database, add them in your project's ``DATABASES`` setting and
use these read-only connections in the ``EXPLORER_CONNECTIONS``.

Add the following to your urls.py (all Explorer URLs are restricted
via the ``EXPLORER_PERMISSION_VIEW`` and ``EXPLORER_PERMISSION_CHANGE``
settings. See Settings section below for further documentation.):
Expand All @@ -57,15 +43,6 @@ settings. See Settings section below for further documentation.):
path('explorer/', include('explorer.urls')),
]
If you want to quickly use django-sql-explorer with the existing default
connection **and know what you are doing** (or you are on development), you
can use the following settings:

.. code-block:: python
EXPLORER_CONNECTIONS = { 'Default': 'default' }
EXPLORER_DEFAULT_CONNECTION = 'default'
Run migrate to create the tables:

``python manage.py migrate``
Expand All @@ -78,7 +55,9 @@ And run the server:

``python manage.py runserver``

You can now browse to http://127.0.0.1:8000/explorer/ and get exploring!
You can now browse to http://127.0.0.1:8000/explorer/. Add a database connection at /explorer/connections/new/, and you
are ready to start exploring! If you have a database in your settings.DATABASES you would like to query, you can create
a connection with the same alias and name and set the Engine to "Django Database".

Note that Explorer expects STATIC_URL to be set appropriately. This isn't a problem
with vanilla Django setups, but if you are using e.g. Django Storages with S3, you
Expand Down
15 changes: 4 additions & 11 deletions docs/settings.rst
Original file line number Diff line number Diff line change
Expand Up @@ -93,24 +93,17 @@ Generate DB schema asynchronously. Requires Celery and ``EXPLORER_TASKS_ENABLED`
EXPLORER_ASYNC_SCHEMA = False
Default connection
******************

The name of the Django database connection to use. Ideally set this to a connection with read only permissions

.. code-block:: python
EXPLORER_DEFAULT_CONNECTION = None # Must be set for the app to work, as this is required
Database connections
********************

A dictionary of ``{'Friendly Name': 'django_db_alias'}``.
If you want to create a DatabaseConnection to a DB that is registered in settings.DATABASES, then add the alias to this
dictionary (with a friendly name of your choice), and then create a new connection (../connections/new/) with the alias
and name set to the alias of the Django database, and the Engine set to "Django Database".

.. code-block:: python
EXPLORER_CONNECTIONS = {} # At a minimum, should be set to something like { 'Default': 'readonly' } or similar. See connections.py for more documentation.
EXPLORER_CONNECTIONS = {}
Permission view
Expand Down
3 changes: 2 additions & 1 deletion explorer/app_settings.py
Original file line number Diff line number Diff line change
Expand Up @@ -4,6 +4,8 @@


EXPLORER_CONNECTIONS = getattr(settings, "EXPLORER_CONNECTIONS", {})

# Deprecated as of 6.0. Will be removed in a future version.
EXPLORER_DEFAULT_CONNECTION = getattr(
settings, "EXPLORER_DEFAULT_CONNECTION", None
)
Expand Down Expand Up @@ -75,7 +77,6 @@
EXPLORER_RECENT_QUERY_COUNT = getattr(
settings, "EXPLORER_RECENT_QUERY_COUNT", 5
)
EXPLORER_ASYNC_SCHEMA = getattr(settings, "EXPLORER_ASYNC_SCHEMA", False)

DEFAULT_EXPORTERS = [
("csv", "explorer.exporters.CSVExporter"),
Expand Down
47 changes: 18 additions & 29 deletions explorer/apps.py
Original file line number Diff line number Diff line change
@@ -1,7 +1,6 @@
from django.apps import AppConfig
from django.core.exceptions import ImproperlyConfigured
from django.db import connections as djcs
from django.utils.translation import gettext_lazy as _
from django.db import transaction, DEFAULT_DB_ALIAS, connections


class ExplorerAppConfig(AppConfig):
Expand All @@ -10,36 +9,26 @@ class ExplorerAppConfig(AppConfig):
verbose_name = _("SQL Explorer")
default_auto_field = "django.db.models.AutoField"

def ready(self):
from explorer.schema import build_async_schemas
_validate_connections()
build_async_schemas()

# SQL Explorer DatabaseConnection models store connection info but are always translated into "django-style" connections
# before use, because we use Django's DB engine to run queries, gather schema information, etc.

def _get_default():
from explorer.app_settings import EXPLORER_DEFAULT_CONNECTION
return EXPLORER_DEFAULT_CONNECTION
# In general this isn't a problem; we cough up a django-style connection and all is well. The exception is when using
# the `with transaction.atomic(using=...):` context manager. The atomic() function takes a connection *alias* argument
# and the retrieves the connection from settings.DATABASES. But of course if we are providing a user-created connection
# alias, Django doesn't find it.

# The solution is to monkey-patch the `get_connection` function within transaction.atomic to make it aware of the
# user-created connections.

def _get_explorer_connections():
from explorer.app_settings import EXPLORER_CONNECTIONS
return EXPLORER_CONNECTIONS
# This code should be double-checked against new versions of Django to make sure the original logic is still correct.

def new_get_connection(using=None):
from explorer.ee.db_connections.models import DatabaseConnection
if using is None:
using = DEFAULT_DB_ALIAS
if using in connections:
return connections[using]
return DatabaseConnection.objects.get(alias=using).as_django_connection()

def _validate_connections():

# Validate connections, when using settings.EXPLORER_CONNECTIONS
# Skip if none are configured, as the app will use user-configured connections (DatabaseConnection models)
if _get_explorer_connections().values() and _get_default() not in _get_explorer_connections().values():
raise ImproperlyConfigured(
f"EXPLORER_DEFAULT_CONNECTION is {_get_default()}, "
f"but that alias is not present in the values of "
f"EXPLORER_CONNECTIONS"
)

for name, conn_name in _get_explorer_connections().items():
if conn_name not in djcs:
raise ImproperlyConfigured(
f"EXPLORER_CONNECTIONS contains ({name}, {conn_name}), "
f"but {conn_name} is not a valid Django DB connection."
)
transaction.get_connection = new_get_connection
27 changes: 11 additions & 16 deletions explorer/assistant/utils.py
Original file line number Diff line number Diff line change
@@ -1,7 +1,6 @@
from explorer import app_settings
from explorer.schema import schema_info
from explorer.models import ExplorerValue
from explorer.utils import get_valid_connection
from django.db.utils import OperationalError


Expand Down Expand Up @@ -35,8 +34,8 @@ def extract_response(r):
return r[-1].content


def tables_from_schema_info(connection, table_names):
schema = schema_info(connection)
def tables_from_schema_info(db_connection, table_names):
schema = schema_info(db_connection)
return [table for table in schema if table[0] in set(table_names)]


Expand Down Expand Up @@ -65,8 +64,7 @@ def sample_rows_from_table(connection, table_name):
Returns:
A list of rows with field values truncated if they exceed 500 characters/bytes.
"""
conn = get_valid_connection(connection)
cursor = conn.cursor()
cursor = connection.cursor()
try:
cursor.execute(f"SELECT * FROM {table_name} LIMIT {ROW_SAMPLE_SIZE}")
ret = [[header[0] for header in cursor.description]]
Expand Down Expand Up @@ -125,30 +123,27 @@ def fits_in_window(string: str) -> bool:
return num_tokens_from_string(string) < (app_settings.EXPLORER_ASSISTANT_MODEL["max_tokens"] * 0.95)


def build_prompt(request_data, included_tables):
def build_prompt(db_connection, assistant_request, included_tables, query_error=None, sql=None):
user_prompt = ""
djc = db_connection.as_django_connection()
user_prompt += f"## Database Vendor / SQL Flavor is {djc.vendor}\n\n"

db_vendor = get_valid_connection(request_data.get("connection")).vendor
user_prompt += f"## Database Vendor / SQL Flavor is {db_vendor}\n\n"
if query_error:
user_prompt += f"## Query Error ##\n\n{query_error}\n\n"

db_error = request_data.get("db_error")
if db_error:
user_prompt += f"## Query Error ##\n\n{db_error}\n\n"

sql = request_data.get("sql")
if sql:
user_prompt += f"## Existing SQL ##\n\n{sql}\n\n"

results_sample = sample_rows_from_tables(request_data["connection"],
results_sample = sample_rows_from_tables(djc,
included_tables)
if fits_in_window(user_prompt + results_sample):
user_prompt += f"## Table Structure with Sampled Data ##\n\n{results_sample}\n\n"
else: # If it's too large with sampling, then provide *just* the structure
table_struct = tables_from_schema_info(request_data["connection"],
table_struct = tables_from_schema_info(db_connection,
included_tables)
user_prompt += f"## Table Structure ##\n\n{table_struct}\n\n"

user_prompt += f"## User's Request to Assistant ##\n\n{request_data['assistant_request']}\n\n"
user_prompt += f"## User's Request to Assistant ##\n\n{assistant_request}\n\n"

prompt = {
"system": ExplorerValue.objects.get_item(ExplorerValue.ASSISTANT_SYSTEM_PROMPT).value,
Expand Down
10 changes: 9 additions & 1 deletion explorer/assistant/views.py
Original file line number Diff line number Diff line change
Expand Up @@ -4,6 +4,7 @@
import json

from explorer.telemetry import Stat, StatNames
from explorer.ee.db_connections.models import DatabaseConnection
from explorer.assistant.models import PromptLog
from explorer.assistant.utils import (
do_req, extract_response,
Expand All @@ -18,7 +19,14 @@ def run_assistant(request_data, user):
extra_tables = request_data.get("selected_tables", [])
included_tables = get_table_names_from_query(sql) + extra_tables

prompt = build_prompt(request_data, included_tables)
connection_id = request_data.get("connection_id")
try:
conn = DatabaseConnection.objects.get(id=connection_id)
except DatabaseConnection.DoesNotExist:
return "Error: Connection not found"

prompt = build_prompt(conn, request_data.get("assistant_request"),
included_tables, request_data.get("db_error"), request_data.get("sql"))

start = timezone.now()
pl = PromptLog(
Expand Down
62 changes: 0 additions & 62 deletions explorer/connections.py

This file was deleted.

Loading

0 comments on commit 84c2eef

Please sign in to comment.