fix(etl): bulk-update Jan Aushadhi prices via atomic RPC #2001
merge_jan_aushadhi_price used PostgREST .upsert(), which compiles to
INSERT ... ON CONFLICT. Even for existing rows the full INSERT was
validated against medicines.generic_name (NOT NULL), but the batch
payload only carries {id, jan_aushadhi_price}, so every batch failed and
fell back to ~100x slower row-by-row PATCHes.
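The failure mode can be reproduced without PostgREST. The sketch below uses Python's built-in sqlite3 as a stand-in for Postgres; that substitution is an assumption made only to keep the example self-contained (SQLite also checks NOT NULL before it resolves the conflict target). The trimmed-down medicines table and the helper names upsert_price and update_price are hypothetical, not the project's schema or loader code.

```python
import sqlite3

# In-memory stand-in for the real table: one NOT NULL column, one price column.
conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE medicines (
        id TEXT PRIMARY KEY,
        generic_name TEXT NOT NULL,
        jan_aushadhi_price REAL
    )
    """
)
conn.execute("INSERT INTO medicines (id, generic_name) VALUES ('m1', 'Paracetamol')")


def upsert_price(med_id, price):
    """INSERT ... ON CONFLICT carrying only {id, jan_aushadhi_price}."""
    conn.execute(
        """
        INSERT INTO medicines (id, jan_aushadhi_price) VALUES (?, ?)
        ON CONFLICT (id) DO UPDATE
        SET jan_aushadhi_price = excluded.jan_aushadhi_price
        """,
        (med_id, price),
    )


def update_price(med_id, price):
    """Plain UPDATE keyed by id; returns the number of rows changed."""
    cur = conn.execute(
        "UPDATE medicines SET jan_aushadhi_price = ? WHERE id = ?",
        (price, med_id),
    )
    return cur.rowcount


# The upsert is rejected even though row 'm1' already exists,
# because the proposed INSERT row has no generic_name.
try:
    upsert_price("m1", 12.5)
    upsert_error = None
except sqlite3.IntegrityError as exc:
    upsert_error = str(exc)

# The UPDATE never builds an INSERT row, so NOT NULL columns are not involved.
updated = update_price("m1", 12.5)
row = conn.execute(
    "SELECT generic_name, jan_aushadhi_price FROM medicines WHERE id = 'm1'"
).fetchone()
```

Here upsert_error holds the NOT NULL violation, while the UPDATE changes one row and leaves generic_name as it was.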
Add a bulk_update_jan_aushadhi_price(jsonb) RPC that does a single
atomic UPDATE of jan_aushadhi_price keyed by id — no INSERT, so NOT NULL
columns are never touched and no other field can be corrupted. The
loader now calls it per batch, reconciles a short row count as failed,
and keeps the row-by-row path as a fallback.
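A minimal sketch of the per-batch flow described above, assuming the RPC returns the number of rows it updated. The names apply_price_batch, call_rpc and patch_row are hypothetical placeholders for illustration; the real loader's retry helper and Supabase client calls are not shown.

```python
import logging
from typing import Callable

logger = logging.getLogger("etl.sketch")


def apply_price_batch(
    batch: list,
    call_rpc: Callable[[list], object],
    patch_row: Callable[[dict], bool],
) -> dict:
    """Try one bulk RPC for the batch; fall back to per-row updates if it raises."""
    try:
        result = call_rpc(batch)
    except Exception:
        # Fallback path: one update per row, counting each success.
        logger.exception("bulk RPC failed; falling back to row-by-row updates")
        updated = sum(1 for row in batch if patch_row(row))
        return {"updated": updated, "failed": len(batch) - updated}

    is_count = isinstance(result, int) and not isinstance(result, bool)
    if is_count and 0 <= result <= len(batch):
        # A short count means some ids matched nothing; report them as failed.
        updated = result
    else:
        # Unrecognized response: treat the batch as fully applied, but log loudly.
        logger.error("unrecognized RPC response %r; assuming full batch", result)
        updated = len(batch)
    return {"updated": updated, "failed": len(batch) - updated}
```

Passing the RPC and the per-row update in as callables keeps the counting logic testable with simple fakes, without a database connection.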
Closes RatLoopz#1966
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: b2c6d3cf4f
Codex review (P1): Postgres grants EXECUTE on new functions to PUBLIC, so the SECURITY DEFINER writer was callable by anon/authenticated via PostgREST, letting any caller overwrite medicines.jan_aushadhi_price and bypass the medicines_service_write RLS policy. Switch the function to SECURITY INVOKER so RLS still applies, then REVOKE EXECUTE from PUBLIC and GRANT only to service_role. The ETL connects as service_role (allowed by RLS); anon/authenticated now get permission denied.
🎉 Congratulations @shashank03-dev! Your Pull Request "fix(etl): bulk-update Jan Aushadhi prices via atomic RPC" has been successfully merged by @dipexplorer. Thank you for your valuable contribution to SahiDawa! 🇮🇳 Follow us on LinkedIn: https://www.linkedin.com/company/ratloopz/ to get a shoutout.
📋 PR Summary & Link
The price back-fill (merge_jan_aushadhi_price in apps/etl/src/loaders/supabase_loader.py) was calling .upsert(). PostgREST turns an upsert into INSERT ... ON CONFLICT, which means Postgres validates the full row before it ever gets to the conflict clause. Our batch payload only carries id and jan_aushadhi_price, so every row tripped the NOT NULL constraint on generic_name (and brand_name, manufacturer). The batch threw, and the loader dropped down to one PATCH per row, which is roughly 100x slower.

I added a Postgres function, bulk_update_jan_aushadhi_price(p_updates jsonb), that runs a single UPDATE of jan_aushadhi_price matched on id. No INSERT happens, so the not-null columns are never looked at and nothing else on the row can change. The loader calls it once per batch through the retry helper that was already there, and keeps the row-by-row update as a fallback if the RPC ever fails.

Two things worth flagging for review:
- The payload uses jan_aushadhi_price as the JSON key instead of price like the issue example. It matches the payload the loader already builds and makes it obvious which column gets written.
- A short row count from the RPC is counted as failed instead of pretending everything went through. That keeps checked == updated + skipped + failed honest.

📸 Proof of Work (Screenshots / Logs)
No UI here, it's an ETL/DB change. Two kinds of proof.
Tests:
New tests cover the back-fill going through the RPC with an {id, jan_aushadhi_price}-only payload (never an upsert), the retry-then-fallback path, a short row count getting counted as failed, and an unrecognized RPC response being treated as a full batch with a loud log.

I also ran it against a real Postgres 16 to be sure the migration and function behave, not just the test fake. Created the actual medicines table with its not-null columns, applied this PR's migration, then put one row through the old path and the new one.

The old .upsert() reproduces the exact not-null error from the issue, and it does so even though the row already exists. The RPC fills the price, hands back a count, and leaves generic_name alone.

One operational note: the function needs this PR's migration (20260616174742_add_bulk_update_jan_aushadhi_price_rpc.sql) applied to the environment. If it isn't there, the loader logs an error and falls back to the old row-by-row path rather than failing the run outright.

🏷️ PR Type
type: bug
type: performance

✅ Checklist
- … Closes #1966)
- … main and resolved any conflicts