48 changes: 48 additions & 0 deletions supabase/migrations/20251021110000_task_ticket_vector_search.sql
@@ -0,0 +1,48 @@
CREATE OR REPLACE FUNCTION get_similar_tasks(
query_embedding vector(768),
match_count INT DEFAULT 3
)
RETURNS TABLE (
task_id UUID,
title TEXT,
description TEXT,
similarity FLOAT
) AS $$
BEGIN
RETURN QUERY
SELECT
t.id AS task_id,
t.title,
t.description,
1 - (t.description_embedding <=> query_embedding) AS similarity
FROM tasks t
WHERE t.description_embedding IS NOT NULL
ORDER BY t.description_embedding <=> query_embedding
Comment on lines +17 to +20
⚠️ Potential issue | 🟠 Major

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
set -euo pipefail

echo "=== List all migration files in order ==="
fd -t f '.*\.sql$' supabase/migrations 2>/dev/null | sort || echo "No migrations directory found"

echo
echo "=== Check base table creation migrations ==="
echo "--- Content of 20251021030000_task_schema.sql ---"
cat supabase/migrations/20251021030000_task_schema.sql 2>/dev/null || echo "File not found"

echo
echo "--- Content of 20251021040000_tickets_schema.sql ---"
cat supabase/migrations/20251021040000_tickets_schema.sql 2>/dev/null || echo "File not found"

echo
echo "=== Check vector search migration ==="
echo "--- Content of 20251021110000_task_ticket_vector_search.sql (lines 1-50) ---"
head -50 supabase/migrations/20251021110000_task_ticket_vector_search.sql 2>/dev/null || echo "File not found"

echo
echo "=== Check for embedding column additions elsewhere ==="
rg -nP --type=sql 'ALTER\s+TABLE\s+(public\.)?tasks\s+ADD\s+COLUMN.*embedding|ALTER\s+TABLE\s+(public\.)?tickets\s+ADD\s+COLUMN.*embedding' supabase/migrations 2>/dev/null || echo "No ALTER TABLE ADD COLUMN statements for embedding columns"

echo
echo "=== Check for any reference to description_embedding creation ==="
rg -nP --type=sql 'description_embedding' supabase/migrations 2>/dev/null || echo "No description_embedding references found"

Repository: AOSSIE-Org/Ell-ena

Length of output: 11657


🏁 Script executed:

#!/bin/bash
set -euo pipefail

echo "=== Content of 20251021100000_generate_missing_embeddings.sql ==="
cat supabase/migrations/20251021100000_generate_missing_embeddings.sql 2>/dev/null || echo "File not found"

echo
echo "=== Content of 20251021120000_add_hnsw_indexes.sql ==="
cat supabase/migrations/20251021120000_add_hnsw_indexes.sql 2>/dev/null || echo "File not found"

Repository: AOSSIE-Org/Ell-ena

Length of output: 2048


Add explicit checks for embedding columns before defining vector search functions.

Lines 17–20 and 42–45 reference description_embedding columns that do not exist in the base table schemas (20251021030000_task_schema.sql and 20251021040000_tickets_schema.sql). Migration 20251021100000_generate_missing_embeddings.sql does not create these columns; it only processes meeting embeddings. This creates a hard dependency on external code (PR #166) that must be applied first. Without it, CREATE FUNCTION still succeeds, because PL/pgSQL does not resolve column references until the function body runs, so the missing column only surfaces as an error on the first call to get_similar_tasks or get_similar_tickets. The subsequent index creation in 20251021120000_add_hnsw_indexes.sql fails outright at apply time.

Add guards to fail fast with a clear message:

+DO $$
+BEGIN
+  IF NOT EXISTS (
+    SELECT 1
+    FROM information_schema.columns
+    WHERE table_schema = 'public'
+      AND table_name = 'tasks'
+      AND column_name = 'description_embedding'
+  ) THEN
+    RAISE EXCEPTION 'Missing public.tasks.description_embedding. Apply migration from PR #166 first.';
+  END IF;
+
+  IF NOT EXISTS (
+    SELECT 1
+    FROM information_schema.columns
+    WHERE table_schema = 'public'
+      AND table_name = 'tickets'
+      AND column_name = 'description_embedding'
+  ) THEN
+    RAISE EXCEPTION 'Missing public.tickets.description_embedding. Apply migration from PR #166 first.';
+  END IF;
+END $$;
+
 CREATE OR REPLACE FUNCTION get_similar_tasks(
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@supabase/migrations/20251021110000_task_ticket_vector_search.sql` around
lines 17 - 20, The migration uses tasks.description_embedding (and the
equivalent tickets column) without guaranteeing those columns exist; add
explicit guards at the top of the migration/function(s) that check
information_schema.columns (or pg_catalog) for the presence of
description_embedding on the tasks and tickets tables and raise a clear,
fast-failing error if missing (e.g., "description_embedding column missing on
tasks — apply migration X first"). Place these checks before any use of
description_embedding (before the similarity SELECTs / ORDER BY and before
creating indexes) so the migration fails loudly rather than producing silent
runtime errors; perform the same check for tickets where referenced.

LIMIT match_count;
END;
$$ LANGUAGE plpgsql;


CREATE OR REPLACE FUNCTION get_similar_tickets(
query_embedding vector(768),
match_count INT DEFAULT 3
)
RETURNS TABLE (
ticket_id UUID,
title TEXT,
description TEXT,
similarity FLOAT
) AS $$
BEGIN
RETURN QUERY
SELECT
tk.id AS ticket_id,
tk.title,
tk.description,
1 - (tk.description_embedding <=> query_embedding) AS similarity
FROM tickets tk
WHERE tk.description_embedding IS NOT NULL
ORDER BY tk.description_embedding <=> query_embedding
LIMIT match_count;
END;
$$ LANGUAGE plpgsql;
5 changes: 5 additions & 0 deletions supabase/migrations/20251021120000_add_hnsw_indexes.sql
@@ -0,0 +1,5 @@
CREATE INDEX IF NOT EXISTS idx_tasks_embedding
ON tasks USING hnsw (description_embedding vector_cosine_ops);

CREATE INDEX IF NOT EXISTS idx_tickets_embedding
ON tickets USING hnsw (description_embedding vector_cosine_ops);
Comment on lines +1 to +5
⚠️ Potential issue | 🟠 Major

🧩 Analysis chain

🏁 Script executed:

# Check if there are any version constraints or documentation in the repo
git ls-files | grep -E "(requirements|docker|package.json|supabase|version)" | head -20

Repository: AOSSIE-Org/Ell-ena

Length of output: 1081


🌐 Web query:

pgvector HNSW index support version history

💡 Result:

pgvector HNSW index support (and major HNSW-related changes) by version:

  • 0.5.0 (2023-08-28): Added HNSW index type (USING hnsw). [1]
  • 0.5.1 (2023-10-10): Improved HNSW index build performance; MVCC snapshot check for index scans. [1]
  • 0.6.0 (2024-01-29): Parallel index builds for HNSW; improved HNSW performance; reduced memory usage and WAL for HNSW builds; additional fixes. [1][2]
  • 0.6.2 (2024-03-18): Reduced lock contention with parallel HNSW builds. [1]
  • 0.7.0 (2024-04-29): Added HNSW support for L1 distance (plus other new types/features). [1]
  • 0.7.1–0.7.4 (2024-06-03 to 2024-08-05): On-disk HNSW build improvements and fixes for parallel-build locking. [1]
  • 0.8.0 (2024-10-30): Added iterative index scans (works with HNSW); improved HNSW scan performance, inserts, and on-disk builds; better cost estimation for filtered queries. [1][3]
  • 0.8.1 (2025-09-04): (No HNSW-specific item called out.) [1]
  • 0.8.2 (2026-02-25): Fixed buffer overflow with parallel HNSW build; Postgres 18 EXPLAIN-related fix mentions index searches. [1]

Sources: pgvector PGXN changelog (canonical per-version list), plus PostgreSQL.org release announcements for 0.6.0 and 0.8.0. [1][2][3]

References:
[1] (pgxn.org)
[2] (postgresql.org)
[3] (postgresql.org)


🌐 Web query:

pgvector vector_cosine_ops when introduced version

💡 Result:

vector_cosine_ops exists at least as early as pgvector (PGXN “vector”) v0.1.2 (2021-05-05)—the v0.1.2 README already documents creating an ivfflat index with vector_cosine_ops. [1]

PGXN’s source archive listing also shows v0.1.2 as the earliest available pgvector release in that archive, so it’s not possible (from PGXN alone) to verify whether it was introduced in v0.1.0 or v0.1.1. [2]

Sources: [1] [2]


🏁 Script executed:

# Look for Supabase configuration files that might specify pgvector version
find supabase -type f -name "config.toml" -o -name "*.json" -o -name "*.md" | head -20

Repository: AOSSIE-Org/Ell-ena

Length of output: 206


🏁 Script executed:

# Check the migration directory for context and other migrations
ls -la supabase/migrations/ | head -30

Repository: AOSSIE-Org/Ell-ena

Length of output: 1237


🏁 Script executed:

# Look at the migration file itself and nearby migrations for version hints
cat -n supabase/migrations/20251021120000_add_hnsw_indexes.sql

Repository: AOSSIE-Org/Ell-ena

Length of output: 321


🏁 Script executed:

# Check if there's a .env or docker file with version constraints
cat supabase/.gitignore

Repository: AOSSIE-Org/Ell-ena

Length of output: 133


🏁 Script executed:

# Look for any README or docs mentioning pgvector version
find . -maxdepth 3 -name "README*" -o -name "*.md" | xargs grep -l "pgvector\|vector" 2>/dev/null | head -10

Repository: AOSSIE-Org/Ell-ena

Length of output: 131


🏁 Script executed:

# Check README.md for pgvector or version information
cat README.md | grep -A 5 -B 5 "pgvector\|vector\|version" -i | head -50

Repository: AOSSIE-Org/Ell-ena

Length of output: 2343


🏁 Script executed:

# Check BACKEND.md for setup or version information
cat BACKEND.md | grep -A 5 -B 5 "pgvector\|vector\|version" -i | head -50

Repository: AOSSIE-Org/Ell-ena

Length of output: 771


🏁 Script executed:

# Look at the vector search migration to see if there are version hints
cat -n supabase/migrations/20251021090000_meeting_vector_search.sql

Repository: AOSSIE-Org/Ell-ena

Length of output: 4913


🏁 Script executed:

# Check the task schema migration to see when vector type is first used
cat -n supabase/migrations/20251021030000_task_schema.sql | head -80

Repository: AOSSIE-Org/Ell-ena

Length of output: 3184


Ensure pgvector >= 0.5.0 is deployed before applying these indexes.

The migration uses USING hnsw syntax (lines 2 and 5), which requires pgvector 0.5.0 or later. Earlier versions will fail at apply time. No version constraint is documented in the repository.

Before rollout, verify the deployed vector extension version:

SELECT extversion FROM pg_extension WHERE extname = 'vector';

Expected: 0.5.0 or later. Consider adding a version check comment to the migration or documenting the minimum pgvector version requirement.
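
If a runtime check is preferred over a comment, one possible shape is a guard block at the top of 20251021120000_add_hnsw_indexes.sql. This is a minimal sketch, not the project's existing code: it assumes the extension is installed under the name vector (as in the query above) and that extversion is a plain dotted numeric string such as 0.8.0. The variable name installed_version and the error messages are illustrative.

```sql
-- Hypothetical guard (not part of the PR): abort the migration early when
-- pgvector is missing or older than 0.5.0, the first release with HNSW.
DO $$
DECLARE
  installed_version TEXT;  -- illustrative name
BEGIN
  SELECT extversion
    INTO installed_version
    FROM pg_extension
   WHERE extname = 'vector';

  IF installed_version IS NULL THEN
    RAISE EXCEPTION 'Extension "vector" (pgvector) is not installed.';
  END IF;

  -- Integer arrays compare element by element, so 0.10.0 sorts after 0.5.0.
  -- Assumes a purely numeric version string; a suffix would make the cast fail.
  IF string_to_array(installed_version, '.')::int[] < ARRAY[0, 5, 0] THEN
    RAISE EXCEPTION 'pgvector % is too old; HNSW indexes require 0.5.0 or later.',
      installed_version;
  END IF;
END $$;
```

Comparing integer arrays rather than the raw text avoids the string-ordering trap where '0.10.0' sorts before '0.5.0'.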

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@supabase/migrations/20251021120000_add_hnsw_indexes.sql` around lines 1 - 5,
The migration creates HNSW vector indexes (idx_tasks_embedding and
idx_tickets_embedding) using "USING hnsw", which requires pgvector >= 0.5.0;
update the migration to guard against older pgvector by either adding a
pre-check comment or a runtime/version check before creating the indexes: verify
the installed vector extension version (extname = 'vector') and abort or skip
with a clear message if extversion < 0.5.0, or document the minimum pgvector
version requirement in the migration header so deployers know to upgrade before
applying these CREATE INDEX statements.