feat: JSON indexing for EQL V2 #263

Merged 44 commits on Jul 1, 2025
d1fa669
feat: add new `Type::Associated` variant & EqlTraits
freshtonic Jun 13, 2025
5e4fcf4
feat: declarative type environment
freshtonic Jun 13, 2025
3bab51c
feat: macros for building `TypeEnv`s
freshtonic Jun 13, 2025
abaa2e7
feat: add ability to put type bounds on EQL columns in `schema` macro
freshtonic Jun 13, 2025
5a8f2bd
feat: support bounded type variables and associated types in `Unifier`
freshtonic Jun 13, 2025
43e86b0
feat: SQL operator and function definitions that support EQL types
freshtonic Jun 13, 2025
0867d69
feat: infer function types using declared SQL/EQL functions
freshtonic Jun 13, 2025
762a96f
feat: infer binary operator types using declared SQL/EQL operators
freshtonic Jun 13, 2025
1f3a451
chore: various refactorings
freshtonic Jun 13, 2025
0d96253
chore: make schema delta functionality aware of bounds on EQL column …
freshtonic Jun 13, 2025
f7084a7
fix: assorted fixups (due to out of sequence rebasing)
freshtonic Jun 13, 2025
c9100d2
fix: add select_jsonb_path_query
freshtonic Jun 13, 2025
402a810
WIP: get proxy to use new types
freshtonic Jun 13, 2025
63c7682
WIP: just enough to get the proxy to compile against the new EQL types
freshtonic Jun 13, 2025
874f3a3
fix: add all EQLTraits to EQL col
tobyhede Jun 16, 2025
af16330
Add test for jsonb_path_query inference
tobyhede Jun 16, 2025
6e7312b
feat: jsonb_path_query
tobyhede Jun 19, 2025
df568d7
feat: support `SETOF` in the type system
freshtonic Jun 17, 2025
296456a
docs: RustDoc on Type
freshtonic Jun 17, 2025
7f8c54b
fix: SQL function renaming fails when return type is Native
freshtonic Jun 23, 2025
902efae
chore: eql-mapper rustdoc
freshtonic Jun 24, 2025
a03a397
more docs
freshtonic Jun 24, 2025
8079f6a
refactor: remove `Type::Constructor` variant
freshtonic Jun 24, 2025
3f7a7f3
ref(eql-mapper): simplify `Projection` representation
freshtonic Jun 25, 2025
4528d8b
update to eql-2.0.6
tobyhede Jun 25, 2025
54cac15
fix: update proxy integration
tobyhede Jun 25, 2025
be957c5
feat: jsonb_path_query
tobyhede Jun 25, 2025
d8dc263
fix(eql-mapper): broken test in eql-mapper-macros
freshtonic Jun 26, 2025
57b70ee
ref(eql-mapper): remove unused `provenance` mod
freshtonic Jun 26, 2025
8b62144
chore: put cipherstash-client in Cargo workspace
freshtonic Jun 26, 2025
bbdfe05
ref(eql-mapper): rust doc and trivial refactorings
freshtonic Jun 26, 2025
3aa47ea
chore: clippy
freshtonic Jun 26, 2025
77cde08
chore: fmt
freshtonic Jun 26, 2025
ccdec51
fix: bad conflict resolutions during rebase
freshtonic Jun 26, 2025
ca4c58b
fix(tests): proxy jsonb integration tests
freshtonic Jun 27, 2025
eb1c8ef
clippy: Rust 1.88.0 has added a bunch of new default lints
freshtonic Jun 27, 2025
7e24733
fix(fmt): looks like some new formatting too
freshtonic Jun 27, 2025
61457f5
fix: BorrowMutError in `sql_function_types`
freshtonic Jun 30, 2025
eeeb2c1
fix(tests): elixir test flake
freshtonic Jun 30, 2025
bfb0864
test: jsonb_path_exists
tobyhede Jun 30, 2025
d06fc81
test: select_jsonb_path_query_with_unknown
tobyhede Jun 30, 2025
ae77500
fix: clippy
tobyhede Jun 30, 2025
5bfe6ee
fix: revert to string parsing for simple query
tobyhede Jun 30, 2025
b748d5e
test: jsonb_path_query_first
tobyhede Jun 30, 2025
154 changes: 104 additions & 50 deletions Cargo.lock

Large diffs are not rendered by default.

12 changes: 10 additions & 2 deletions Cargo.toml
@@ -9,6 +9,13 @@ edition = "2021"
[profile.dev]
incremental = true
debug = true
opt-level = 0
split-debuginfo = "unpacked" # or "unpacked" on macOS

[profile.dev.package.sqltk]
opt-level = 0
debug = true
split-debuginfo = "unpacked" # or "unpacked" on macOS

# [profile.dev.package]# aws-lc-sys.opt-level = 3
# proc-macro2.opt-level = 3
@@ -17,8 +24,8 @@ debug = true
# sqlparser.opt-level = 3
# syn.opt-level = 3

[profile.dev.build-override]
opt-level = 3
# [profile.dev.build-override]
# opt-level = 3

[profile.test]
incremental = true
@@ -36,6 +43,7 @@ debug = true

[workspace.dependencies]
sqltk = { version = "0.10.0" }
cipherstash-client = "0.23.0"
thiserror = "2.0.9"
tokio = { version = "1.44.2", features = ["full"] }
tracing = "0.1"
22 changes: 12 additions & 10 deletions README.md
@@ -29,17 +29,19 @@

[Read the announcement](https://cipherstash.com/blog/introducing-proxy)

CipherStash Proxy provides transparent, *searchable* encryption for your existing Postgres database.

CipherStash Proxy provides a transparent proxy to your existing Postgres database.
CipherStash Proxy:
* Automatically encrypts and decrypts data with zero changes to SQL
* Supports queries over *encrypted* values:
- equality
- comparison
- ordering
- grouping
* Is written in Rust for high performance and strongly-typed mapping of SQL statements.
* Manages keys using CipherStash ZeroKMS, offering up to 14x the performance of AWS KMS

Proxy:
* Automatically encrypts and decrypts the columns you specify
* Supports most query types over encrypted values
* Runs in a Docker container
* Is written in Rust and uses a formal type system for SQL mapping
* Works with CipherStash ZeroKMS and offers up to 14x the performance of AWS KMS

Behind the scenes, it uses the [Encrypt Query Language](https://github.com/cipherstash/encrypt-query-language/) to index and search encrypted data.
Behind the scenes, CipherStash Proxy uses the [Encrypt Query Language](https://github.com/cipherstash/encrypt-query-language/) to index and search encrypted data.

## Table of contents

@@ -54,7 +56,7 @@ Behind the scenes, it uses the [Encrypt Query Language](https://github.com/ciphe
> [!IMPORTANT]
> **Prerequisites:** Before you start you need to have this software installed:
> - [Docker](https://www.docker.com/) — see Docker's [documentation for installing](https://docs.docker.com/get-started/get-docker/)


Get up and running in local dev in < 5 minutes:

2 changes: 1 addition & 1 deletion mise.toml
@@ -31,7 +31,7 @@ CS_PROXY__HOST = "proxy"
# Misc
DOCKER_CLI_HINTS = "false" # Please don't show us What's Next.

CS_EQL_VERSION = "eql-2.0.4"
CS_EQL_VERSION = "eql-2.0.6"

[tools]
"cargo:cargo-binstall" = "latest"
20 changes: 10 additions & 10 deletions packages/cipherstash-proxy-integration/Cargo.toml
@@ -4,13 +4,22 @@ version = "0.1.0"
edition = "2021"

[dependencies]
bytes = "1.10.1"
cipherstash-client = { workspace = true, features = ["tokio"] }
cipherstash-config = "0.2.3"
cipherstash-proxy = { path = "../cipherstash-proxy/" }
chrono = { version = "0.4.39", features = ["clock"] }
clap = "4.5.32"
fake = { version = "4", features = ["chrono", "derive"] }

hex = "0.4.3"
postgres-types = { version = "0.2.9", features = ["derive"] }
rand = "0.9"
recipher = "0.1.3"
rustls = { version = "0.23.20", default-features = false, features = ["std"] }
serde = "1.0"
serde_json = "1.0"
tap = "1.0.1"
temp-env = "0.3.6"
tokio = { workspace = true }
tokio-postgres = { version = "0.7", features = [
@@ -21,14 +30,5 @@ tokio-postgres-rustls = "0.13.0"
tokio-rustls = "0.26.0"
tracing = { workspace = true }
tracing-subscriber = { workspace = true }
webpki-roots = "1.0"

[dev-dependencies]
cipherstash-client = { version = "0.22.0", features = ["tokio"] }
cipherstash-config = "0.2.3"
clap = "4.5.32"
fake = { version = "4", features = ["chrono", "derive"] }
hex = "0.4.3"
postgres-types = { version = "0.2.9", features = ["derive"] }
tap = "1.0.1"
uuid = { version = "1.11.0", features = ["serde", "v4"] }
webpki-roots = "1.0"
45 changes: 42 additions & 3 deletions packages/cipherstash-proxy-integration/src/common.rs
@@ -5,6 +5,7 @@ use rustls::{
client::danger::ServerCertVerifier, crypto::aws_lc_rs::default_provider,
pki_types::CertificateDer, ClientConfig,
};
use serde_json::Value;
use std::sync::{Arc, Once};
use tokio_postgres::{types::ToSql, Client, NoTls};
use tracing_subscriber::{filter::Directive, EnvFilter, FmtSubscriber};
@@ -105,7 +106,7 @@ pub async fn connect_with_tls(port: u16) -> Client {

tokio::spawn(async move {
if let Err(e) = connection.await {
eprintln!("connection error: {}", e);
eprintln!("connection error: {e}");
}
});
client
@@ -117,7 +118,7 @@ pub async fn connect(port: u16) -> Client {

tokio::spawn(async move {
if let Err(e) = connection.await {
eprintln!("connection error: {}", e);
eprintln!("connection error: {e}");
}
});

@@ -175,7 +176,18 @@ where
rows.iter()
.filter_map(|row| {
if let tokio_postgres::SimpleQueryMessage::Row(r) = row {
r.get(0).and_then(|val| val.parse::<T>().ok())
r.get(0).and_then(|val| {
// Simple queries return every value in text format, and PostgreSQL
// renders booleans as "t" or "f", which `bool::from_str` rejects.
// Map those two values to "true"/"false" before parsing.
match val {
"t" => "true".parse::<T>().ok(),
"f" => "false".parse::<T>().ok(),
_ => val.parse::<T>().ok(),
}
})
} else {
None
}
@@ -199,6 +211,33 @@ pub async fn simple_query_with_null(sql: &str) -> Vec<Option<String>> {
.collect()
}

pub async fn insert(sql: &str, params: &[&(dyn ToSql + Sync)]) {
let client = connect_with_tls(PROXY).await;
client.query(sql, params).await.unwrap();
}

pub async fn insert_jsonb() -> Value {
let id = random_id();

let encrypted_jsonb = serde_json::json!({
"id": id,
"string": "hello",
"number": 42,
"nested": {
"number": 1815,
"string": "world",
},
"array_string": ["hello", "world"],
"array_number": [42, 84],
});

let sql = "INSERT INTO encrypted (id, encrypted_jsonb) VALUES ($1, $2)".to_string();

insert(&sql, &[&id, &encrypted_jsonb]).await;

encrypted_jsonb
}

///
/// Configure the client TLS settings.
/// These are the settings for connecting to the database with TLS.
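The boolean handling added to the simple-query helper in `common.rs` can be exercised in isolation. The following is a minimal std-only sketch, not code from this PR: `parse_simple_value` is a hypothetical name, and it assumes the helper only needs `FromStr` on the target type.

```rust
use std::str::FromStr;

/// Parse one text-format column value from a simple-query row.
/// Hypothetical helper for illustration; not part of the PR.
fn parse_simple_value<T: FromStr>(val: &str) -> Option<T> {
    // PostgreSQL's text format renders booleans as "t" / "f",
    // which `bool::from_str` does not accept, so normalise them first.
    let normalised = match val {
        "t" => "true",
        "f" => "false",
        other => other,
    };
    normalised.parse::<T>().ok()
}
```

One consequence of matching on the raw string rather than on the target type: a text column whose value is exactly `t` or `f` is also rewritten, so `parse_simple_value::<String>("t")` yields `"true"`. That is probably harmless for the randomly generated test values here, but it may be worth a comment in the helper.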
@@ -9,6 +9,10 @@ mod tests {

macro_rules! test_insert_with_literal {
($name: ident, $type: ident, $pg_type: ident) => {
test_insert_with_literal!($name, $type, $pg_type, false);
};

($name: ident, $type: ident, $pg_type: ident, $cast: expr) => {
#[tokio::test]
pub async fn $name() {
trace();
@@ -22,8 +26,14 @@ mod tests {

let expected = vec![encrypted_val.clone()];

let insert_sql = format!("INSERT INTO encrypted (id, {encrypted_col}) VALUES ($1, '{encrypted_val}')");
let select_sql = format!("SELECT {encrypted_col} FROM encrypted WHERE id = $1");
let cast_to_type: &str = if $cast {
&format!("::{}", stringify!($pg_type))
} else {
""
};

let insert_sql = format!("INSERT INTO encrypted (id, {encrypted_col}) VALUES ($1, '{encrypted_val}'{cast_to_type})");
let select_sql = format!("SELECT {encrypted_col}{cast_to_type} FROM encrypted WHERE id = $1");

execute_query(&insert_sql, &[&id]).await;
let actual = query_by::<$type>(&select_sql, &id).await;
@@ -36,6 +46,10 @@ mod tests {

macro_rules! test_insert_simple_query_with_literal {
($name: ident, $type: ident, $pg_type: ident) => {
test_insert_simple_query_with_literal!($name, $type, $pg_type, false);
};

($name: ident, $type: ident, $pg_type: ident, $cast: expr) => {
#[tokio::test]
pub async fn $name() {
trace();
@@ -48,8 +62,14 @@ mod tests {
let encrypted_col = format!("encrypted_{}", stringify!($pg_type));
let encrypted_val = crate::value_for_type!($type, random_limited());

let insert_sql = format!("INSERT INTO encrypted (id, {encrypted_col}) VALUES ({id}, '{encrypted_val}')");
let select_sql = format!("SELECT {encrypted_col} FROM encrypted WHERE id = {id}");
let cast_to_type: &str = if $cast {
&format!("::{}", stringify!($pg_type))
} else {
""
};

let insert_sql = format!("INSERT INTO encrypted (id, {encrypted_col}) VALUES ({id}, '{encrypted_val}'{cast_to_type})");
let select_sql = format!("SELECT {encrypted_col}{cast_to_type} FROM encrypted WHERE id = {id}");


let expected = vec![encrypted_val];
@@ -69,7 +89,7 @@ mod tests {
test_insert_with_literal!(insert_with_literal_bool, bool, bool);
test_insert_with_literal!(insert_with_literal_text, String, text);
test_insert_with_literal!(insert_with_literal_date, NaiveDate, date);
test_insert_with_literal!(insert_with_literal_jsonb, Value, jsonb);
test_insert_with_literal!(insert_with_literal_jsonb, Value, jsonb, true);

test_insert_simple_query_with_literal!(insert_simple_query_with_literal_int2, i16, int2);
test_insert_simple_query_with_literal!(insert_simple_query_with_literal_int4, i32, int4);
@@ -78,7 +98,12 @@ mod tests {
test_insert_simple_query_with_literal!(insert_simple_query_with_literal_bool, bool, bool);
test_insert_simple_query_with_literal!(insert_simple_query_with_literal_text, String, text);
test_insert_simple_query_with_literal!(insert_simple_query_with_literal_date, NaiveDate, date);
test_insert_simple_query_with_literal!(insert_simple_query_with_literal_jsonb, Value, jsonb);
test_insert_simple_query_with_literal!(
insert_simple_query_with_literal_jsonb,
Value,
jsonb,
true
);

// -----------------------------------------------------------------

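The test macros above gain an optional fourth argument by adding a short arm that delegates to the full arm with `false`. A reduced sketch of that pattern, using only the standard library, might look like the following. `literal_sql!` is a hypothetical macro written for illustration; it returns the two SQL strings instead of running them, and it assumes the `encrypted_<pg_type>` column naming used in these tests.

```rust
/// Build the INSERT and SELECT statements for a literal-insert test.
/// Hypothetical macro for illustration; not part of the PR.
macro_rules! literal_sql {
    // Short form: no cast, delegates to the full form.
    ($pg_type:ident, $value:expr) => {
        literal_sql!($pg_type, $value, false)
    };
    // Full form: `$cast` decides whether `::<pg_type>` is appended
    // to both the inserted literal and the selected column.
    ($pg_type:ident, $value:expr, $cast:expr) => {{
        let col = format!("encrypted_{}", stringify!($pg_type));
        let cast = if $cast {
            format!("::{}", stringify!($pg_type))
        } else {
            String::new()
        };
        let insert = format!(
            "INSERT INTO encrypted (id, {col}) VALUES ($1, '{}'{cast})",
            $value
        );
        let select = format!("SELECT {col}{cast} FROM encrypted WHERE id = $1");
        (insert, select)
    }};
}
```

With this shape, `literal_sql!(jsonb, value, true)` produces statements with the `::jsonb` cast, while every existing three-argument call site in the tests could stay unchanged.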
1 change: 1 addition & 0 deletions packages/cipherstash-proxy-integration/src/lib.rs
@@ -18,6 +18,7 @@ mod pipeline;
mod schema_change;
mod select;
mod simple_protocol;
mod support;

#[macro_export]
macro_rules! value_for_type {
4 changes: 2 additions & 2 deletions packages/cipherstash-proxy-integration/src/map_literals.rs
@@ -55,12 +55,12 @@ mod tests {
let encrypted_jsonb = serde_json::json!({"key": "value"});

let sql = format!(
"INSERT INTO encrypted (id, encrypted_jsonb) VALUES ($1, '{encrypted_jsonb}')",
"INSERT INTO encrypted (id, encrypted_jsonb) VALUES ($1, '{encrypted_jsonb}'::jsonb)",
);

client.query(&sql, &[&id]).await.unwrap();

let sql = "SELECT id, encrypted_jsonb FROM encrypted WHERE id = $1";
let sql = "SELECT id, encrypted_jsonb::jsonb FROM encrypted WHERE id = $1";
let rows = client.query(sql, &[&id]).await.unwrap();

assert_eq!(rows.len(), 1);
2 changes: 1 addition & 1 deletion packages/cipherstash-proxy-integration/src/migrate/mod.rs
@@ -47,7 +47,7 @@ mod tests {
let config = match TandemConfig::load(&args) {
Ok(config) => config,
Err(err) => {
eprintln!("Configuration Error: {}", err);
eprintln!("Configuration Error: {err}");
panic!();
}
};
54 changes: 54 additions & 0 deletions packages/cipherstash-proxy-integration/src/select/indexing.rs
@@ -0,0 +1,54 @@
#[cfg(test)]
mod tests {
use crate::common::{
connect_with_tls, insert, query_by, random_id, simple_query, trace, PROXY,
};
use tokio_postgres::types::{FromSql, ToSql};
use tracing::info;

#[derive(Debug, ToSql, FromSql, PartialEq)]
#[postgres(name = "domain_type_with_check")]
pub struct Domain(String);

///
/// Tests a range query over an encrypted column after creating an index
/// with the `eql_v2.encrypted_operator_class` operator class
///
#[tokio::test]
async fn select_with_index() {
trace();

// let id = random_id();
// let encrypted_val = Domain("ZZ".to_string());

// CREATE INDEX ON encrypted (e eql_v2.encrypted_operator_class);
// SELECT ore.e FROM ore WHERE id = 42 INTO ore_term;

for n in 1..=10 {
let id = random_id();

let encrypted_text = format!("hello_{n}");

let sql = "INSERT INTO encrypted (id, encrypted_text) VALUES ($1, $2)";
insert(&sql, &[&id, &encrypted_text]).await;
}

let client = connect_with_tls(PROXY).await;

let sql = "CREATE INDEX ON encrypted (encrypted_text eql_v2.encrypted_operator_class)";
let _ = client.simple_query(sql).await;

// let sql =
// "EXPLAIN ANALYZE SELECT encrypted_text FROM encrypted WHERE encrypted_text <= '{\"hm\": \"abc\"}'::jsonb::eql_v2_encrypted";
// let result = simple_query::<String>(sql).await;

let sql = "EXPLAIN ANALYZE SELECT encrypted_text FROM encrypted WHERE encrypted_text <= $1";

let encrypted_text = "hello_10".to_string();
let result = query_by::<String>(sql, &encrypted_text).await;

info!("Result: {:?}", result);

// let expected = vec![encrypted_val];
// assert_eq!(expected, result);
}
}
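As written, `select_with_index` logs the `EXPLAIN ANALYZE` output but asserts nothing, since the expected-value check is commented out. One possible assertion is to check whether the plan mentions an index-based scan node. The sketch below is an assumption, not part of this PR: `plan_uses_index` is a hypothetical helper, and it assumes the plan arrives as one `String` per output line.

```rust
/// Returns true when any line of `EXPLAIN` output names an index-based scan node.
/// Hypothetical helper for illustration; not part of the PR.
fn plan_uses_index(plan_lines: &[String]) -> bool {
    // Node names PostgreSQL prints for scans that go through an index.
    const INDEX_NODES: [&str; 3] = ["Index Scan", "Index Only Scan", "Bitmap Index Scan"];

    plan_lines
        .iter()
        .any(|line| INDEX_NODES.iter().any(|node| line.contains(node)))
}
```

A test could then end with `assert!(plan_uses_index(&result))`. Note that on a table with only a handful of rows the planner may still choose a sequential scan, so such an assertion could need more rows or a planner setting change to be reliable.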