2 changes: 1 addition & 1 deletion .gitignore
Original file line number Diff line number Diff line change
Expand Up @@ -67,4 +67,4 @@ target/checksum.txt
repo
.cpcache
.lsp
.clj-kondo
.clj-kondo
14 changes: 8 additions & 6 deletions Dockerfile
@@ -1,5 +1,5 @@
#Copied from https://github.com/dacort/metabase-athena-driver/blob/d7572cd99551ea998a011f8f00a1e39c1eaa59b8/Dockerfile
ARG METABASE_VERSION=v0.46.6.2
ARG METABASE_VERSION=v0.50.26
Contributor

yeah, this suggestion might have been a mistake. I figured that if the latest master of relferreira built against 0.50.26, this would as well

Author

Ah yes, fair assumption tbh


FROM clojure:openjdk-11-tools-deps-slim-buster AS stg_base

Expand Down Expand Up @@ -36,20 +36,22 @@ WORKDIR /build/metabase
# Now build the driver
FROM stg_base as stg_build
RUN clojure \
-Sdeps "{:aliases {:sparksql-databricks {:extra-deps {com.metabase/sparksql-databricks {:local/root \"/build/driver\"}}}}}" \
-X:build:sparksql-databricks \
-Sdeps "{:aliases {:sparksql-databricks-v2 {:extra-deps {com.metabase/sparksql-databricks {:local/root \"/build/driver\"}}}}}" \
-X:build:sparksql-databricks-v2 \
build-drivers.build-driver/build-driver! \
"{:driver :sparksql-databricks, :project-dir \"/build/driver\", :target-dir \"/build/driver/target\"}"
"{:driver :sparksql-databricks-v2, :project-dir \"/build/driver\", :target-dir \"/build/driver/target\"}"

# We create an export stage to make it easy to export the driver
FROM scratch as stg_export
COPY --from=stg_build /build/driver/target/sparksql-databricks.metabase-driver.jar /
COPY --from=stg_build /build/driver/target/sparksql-databricks-v2.metabase-driver.jar /

# Now we can run Metabase with our built driver
FROM metabase/metabase:${METABASE_VERSION} AS stg_runner

# A metabase user/group is manually added in https://github.com/metabase/metabase/blob/master/bin/docker/run_metabase.sh
# Make the UID and GID match
COPY --chown=2000:2000 --from=stg_build \
/build/driver/target/sparksql-databricks.metabase-driver.jar \
/build/driver/target/sparksql-databricks-v2.metabase-driver.jar \
/plugins/sparksql-databricks.metabase-driver.jar

RUN wget https://github.com/relferreira/metabase-sparksql-databricks-driver/releases/download/1.6.0/sparksql-databricks.metabase-driver.jar -O /plugins/sparksql-databricks.metabase-driver-old.jar
91 changes: 5 additions & 86 deletions README.md
@@ -1,97 +1,16 @@
# Metabase Driver: Spark Databricks
The credits are a bit complicated: this driver was originally developed by Fernando Goncalves and Rajesh Kumar Ravi, but their original
repository is no longer around. GitHub user [relferreira](https://github.com/relferreira) kindly maintains an updated version [here](https://github.com/relferreira/metabase-sparksql-databricks-driver/tree/master). However, his version
does not support OAuth secrets, a limitation solved by [shrodingers](https://github.com/shrodingers) at [Brigad](https://github.com/Brigad/metabase-sparksql-databricks-driver).

**Credits**: This repository is only an updated version of the work of Fernando Goncalves and Rajesh Kumar Ravi.
This work combines the two somewhat actively maintained repositories above: it merges the two solutions and updates the driver to work with Metabase 0.50.23.

## Installation

To build a dockerized Metabase including the Databricks driver from this repository, simply run:

```
docker build -t metabase:0.46.6.2-db -f Dockerfile .
docker build -t metabase:0.50.23-databricks -f Dockerfile .
```

The Metabase Databricks driver gets built and included in a final Metabase Docker image.
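The resulting image runs like any stock Metabase image; a minimal sketch (the container name and published port are illustrative):

```shell
# Start the freshly built image and expose the Metabase UI on port 3000
docker run -d -p 3000:3000 --name metabase metabase:0.50.23-databricks
```

Metabase is then reachable at `http://localhost:3000`.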

### To be fixed for >= v0.46:

To run the tests for this driver, run the following:

```
docker build -t metabase/databricks-test --target stg_test .
docker run --rm --name mb-test metabase/databricks-test
```

or, if you have Clojure on your local machine, just:

```
clojure -X:test
```

# Connecting

## Parameters

![Connection Parameters](docs/parameters.png)

- Display Name: an identifying name for your database in Metabase
- Host: your Databricks URL (adb-XXXXXXXXX.azuredatabricks.net)
- Port: usually 443
- Database Name: usually `default`
- Username: usually `token`
- Password: personal access token created in the Databricks dashboard
- Additional JDBC connection string options:
- SQL Warehouse (Endpoint): you can find it at `/sql/warehouses/` at the `Connection details` tab. It should have the following pattern: `;transportMode=http;ssl=1;AuthMech=3;httpPath=/sql/1.0/endpoints/<SQL WAREHOUSE ID>;UID=token;PWD=<ACCESS TOKEN>`
- Cluster Endpoint: you will find it at your cluster's details page. It should have the following pattern: `;transportMode=http;ssl=1;httpPath=sql/protocolv1/o/<ORG ID>/<CLUSTER ID>;AuthMech=3;UID=token;PWD=<ACCESS TOKEN>`

## Building the driver (the fast way)

Use the `Dockerfile` on this repo:

```bash
docker build -t metabase:metabase-head-databricks-1.6.0 .
```

Then you can push it to a Docker registry of your own and use the image!

Example of running:

```bash
docker run -d -p 3000:3000 --name metabase metabase:metabase-head-databricks-1.6.0
```

And access `http://localhost:3000`.

## Building the driver (advanced way)

### Prereq: Install Metabase as a local maven dependency, compiled for building drivers

Clone the [Metabase repo](https://github.com/metabase/metabase) first if you haven't already done so.

```bash
cd /path/to/metabase/
./bin/build
```

### Build the Spark Databricks driver

```bash
# (In the sparksql-databricks driver directory)
clojure -X:build :project-dir "\"$(pwd)\""
```

### Copy it to your plugins dir and restart Metabase

```bash
mkdir -p /path/to/metabase/plugins/
cp target/sparksql-databricks.metabase-driver.jar /path/to/metabase/plugins/
java -jar /path/to/metabase/metabase.jar
```

_or:_

```bash
mkdir -p /path/to/metabase/plugins
cp target/sparksql-databricks.metabase-driver.jar /path/to/metabase/plugins/
cd /path/to/metabase_source
lein run
```
Binary file modified docs/parameters.png
53 changes: 35 additions & 18 deletions resources/metabase-plugin.yaml
@@ -1,41 +1,58 @@
info:
name: Metabase Databricks Spark SQL Driver
name: Metabase Databricks Spark SQL Driver (v2)
version: 1.0.0-SNAPSHOT
description: Allows Metabase to connect to Databricks Spark SQL databases.
driver:
- name: hive-like
lazy-load: true
abstract: true
parent: sql-jdbc
- name: sparksql-databricks
display-name: Spark SQL (Databricks)
- name: sparksql-databricks-v2
display-name: Databricks SQL (v2)
lazy-load: true
parent: hive-like
connection-properties:
- merge:
- host
- placeholder: "<account>.cloud.databricks.com"
- merge:
- port
- default: 443
- host
- placeholder: "<account>.cloud.databricks.com"
helper-text: "The hostname of your Databricks account"
- name: app-id
display-name: Databricks client id
placeholder: "9af18267-60e7-4061-b2d5-e2414af88b0b"
required: true
helper-text: "The id of the service principal you generated an Oauth token for (see : https://docs.databricks.com/en/dev-tools/authentication-oauth.html)"
- name: app-secret
display-name: Databricks OAuth secret
placeholder: "doseXXXXXXXXXXXX"
required: true
helper-text: "The secret of the service principal you generated an Oauth token for (see : https://docs.databricks.com/en/dev-tools/authentication-oauth.html)"
- name: http-path
display-name: HTTP Path
placeholder: "/sql/1.0/warehouses/<id>"
helper-text: "The path to the Databricks SQL endpoint (see : https://docs.databricks.com/en/integrations/compute-details.html)"
required: true
- name: catalog
display-name: Catalog
placeholder: "<catalog-name>"
required: true
- merge:
- dbname
- placeholder: default
- merge:
- user
- default: token
- merge:
- password
- placeholder: "<user_token>"
- required: false
display-name: Schema / Database (Optional)
- advanced-options-start
- merge:
- additional-options
- name: jdbc-flags
placeholder: ";transportMode=http;ssl=1;httpPath=<cluster-http-path>;AuthMech=3;UID=token;PWD=<token>"
placeholder: ";transportMode=http;ssl=1;"
- merge:
- additional-options
- name: port
display-name: HTTP Port
placeholder: "443"
default: 443
- default-advanced-options
connection-properties-include-tunnel-config: false
init:
- step: load-namespace
namespace: metabase.driver.sparksql-databricks
namespace: metabase.driver.sparksql-databricks-v2
- step: register-jdbc-driver
class: metabase.driver.FixedSparkDriver
6 changes: 6 additions & 0 deletions scripts/extract_plugin.sh
@@ -0,0 +1,6 @@
SCRIPT_DIR=$(cd -- "$(dirname -- "${BASH_SOURCE[0]}")" &>/dev/null && pwd)

docker buildx build --build-arg METABASE_VERSION=v0.49.7 --target stg_export --platform "linux/amd64" -t metabase:databricks-plugin "$SCRIPT_DIR/.."
container_id=$(docker create "metabase:databricks-plugin" /bin/bash)
docker cp "$container_id:/sparksql-databricks-v2.metabase-driver.jar" "$SCRIPT_DIR/../dist/databricks-sql.metabase-driver.jar"
docker rm "$container_id"
13 changes: 7 additions & 6 deletions src/metabase/driver/connection.clj
Expand Up @@ -37,15 +37,16 @@

(defn decorate-and-fix
[impl]
(decorator
java.sql.Connection
impl
(getHoldability
(when impl
(decorator
java.sql.Connection
impl
(getHoldability
[]
ResultSet/CLOSE_CURSORS_AT_COMMIT)
(setReadOnly
(setReadOnly
[read-only?]
(when (.isClosed this)
(throw (SQLException. "Connection is closed")))
(when read-only?
(throw (SQLException. "Enabling read-only mode is not supported"))))))
(throw (SQLException. "Enabling read-only mode is not supported")))))))
17 changes: 8 additions & 9 deletions src/metabase/driver/hive_like.clj
Expand Up @@ -3,7 +3,7 @@
[buddy.core.codecs :as codecs]
[clojure.string :as str]
[honey.sql :as sql]
[java-time :as t]
[java-time.api :as t]
[metabase.driver :as driver]
[metabase.driver.sql-jdbc.connection :as sql-jdbc.conn]
[metabase.driver.sql-jdbc.execute :as sql-jdbc.execute]
Expand Down Expand Up @@ -75,10 +75,6 @@
#"map" :type/Dictionary
#".*" :type/*))

(defmethod sql.qp/honey-sql-version :hive-like
[_driver]
2)

(defmethod sql.qp/current-datetime-honeysql-form :hive-like
[_]
(h2x/with-database-type-info :%now "timestamp"))
Expand All @@ -96,7 +92,7 @@
(defn- trunc-with-format [format-str expr]
(str-to-date format-str (date-format format-str expr)))

(defmethod sql.qp/date [:hive-like :default] [_ _ expr] (h2x/->timestamp expr))
(defmethod sql.qp/date [:hive-like :default] [_ _ expr] expr)
(defmethod sql.qp/date [:hive-like :minute] [_ _ expr] (trunc-with-format "yyyy-MM-dd HH:mm" (h2x/->timestamp expr)))
(defmethod sql.qp/date [:hive-like :minute-of-hour] [_ _ expr] [:minute (h2x/->timestamp expr)])
(defmethod sql.qp/date [:hive-like :hour] [_ _ expr] (trunc-with-format "yyyy-MM-dd HH" (h2x/->timestamp expr)))
Expand Down Expand Up @@ -264,6 +260,9 @@
(sql-jdbc.execute/set-parameter driver ps i (t/local-date-time t (t/local-time 0))))

;; TIMEZONE FIXME — not sure what timezone the results actually come back as
;;
;; Also, pretty sure Spark SQL doesn't have a TIME type anyway.
;; https://spark.apache.org/docs/latest/sql-ref-datatypes.html
(defmethod sql-jdbc.execute/read-column-thunk [:hive-like Types/TIME]
[_ ^ResultSet rs _rsmeta ^Integer i]
(fn []
Expand All @@ -273,11 +272,11 @@
(defmethod sql-jdbc.execute/read-column-thunk [:hive-like Types/DATE]
[_ ^ResultSet rs _rsmeta ^Integer i]
(fn []
(when-let [t (.getDate rs i)]
(t/zoned-date-time (t/local-date t) (t/local-time 0) (t/zone-id "UTC")))))
(when-let [s (.getString rs i)]
(u.date/parse s))))

(defmethod sql-jdbc.execute/read-column-thunk [:hive-like Types/TIMESTAMP]
[_ ^ResultSet rs _rsmeta ^Integer i]
(fn []
(when-let [t (.getTimestamp rs i)]
(t/zoned-date-time (t/local-date-time t) (t/zone-id "UTC")))))
(t/zoned-date-time (t/local-date-time t) (t/zone-id "UTC")))))