
DataStore PG Dump #191

Open · wants to merge 7 commits into canada-v2.10

Conversation

JVickery-TBS

Adds a .sql format option to the datastore dump endpoint, which streams back the contents of a customized pg_dump of the table. Is this useful at all? I have no idea, but the idea was to give users who wanted to use datastore_search_sql an option to get the SQL table so they can query it locally.
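
As a rough illustration only (this is not the code in this PR), streaming a per-table pg_dump back through a Flask response can look roughly like the sketch below; the `resource_id` handling, `db_url`, and pg_dump flags are assumptions.

```python
# Rough sketch only (not the code in this PR): stream a per-table pg_dump
# back through a Flask response. The resource_id handling, db_url, and
# pg_dump flags here are assumptions for illustration.
import subprocess

from flask import Response


def dump_sql(resource_id, db_url):
    cmd = [
        'pg_dump',
        # Double quotes keep the hyphenated UUID table name intact
        # in pg_dump's table pattern.
        '--table', f'"{resource_id}"',
        '--no-owner',
        '--no-privileges',
        db_url,
    ]
    proc = subprocess.Popen(cmd, stdout=subprocess.PIPE)

    def generate():
        try:
            # Yield the dump in fixed-size chunks instead of buffering
            # the whole file in memory.
            for chunk in iter(lambda: proc.stdout.read(16384), b''):
                yield chunk
        finally:
            proc.stdout.close()
            proc.wait()

    return Response(
        generate(),
        mimetype='application/sql',
        headers={
            'Content-Disposition':
                f'attachment; filename="{resource_id}.sql"',
        },
    )
```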

- pg_dump sql format for datastore dump.
# Conflicts:
#	ckanext/datastore/blueprint.py
### RESOLVED.
- Datastore pg_dump endpoint.
- Add change log file.
@wardi
Member

wardi commented Feb 14, 2025

This is an interesting idea; if someone has a local Postgres, it would make it easier for them to work with the data, with the correct data types and everything.

Just curious, do the column comments (the data dictionary) also get exported by dumping this way?

- Use subprocess.run for timeouts.
- Added max execution for sql dump config.
- Made sql dump pluggable.
- Used max buffer sizes for subprocess and byte chunks.
@JVickery-TBS
Author

@wardi okay, I have done all of the above: changed to subprocess.run to be able to handle timeouts and properly kill the subprocess, and made the SQL dump pluggable.
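
For illustration only (not the actual patch), the subprocess.run timeout handling can take roughly the shape below; the config key name and error type are assumptions.

```python
# Rough sketch only (not the actual patch): subprocess.run with a timeout
# both enforces a maximum execution time and cleans up the child process.
# The config key name and error type are assumptions for illustration.
import subprocess

from ckan.plugins import toolkit


def run_pg_dump(cmd):
    timeout = toolkit.asint(
        toolkit.config.get('ckanext.datastore.sql_dump_timeout', 300))
    try:
        # On timeout, subprocess.run kills and reaps the child itself,
        # so no orphaned pg_dump is left holding a DB connection.
        completed = subprocess.run(
            cmd, capture_output=True, check=True, timeout=timeout)
    except subprocess.TimeoutExpired:
        raise toolkit.ValidationError(
            {'sql_dump': ['pg_dump exceeded the maximum execution time']})
    return completed.stdout
```

In the streamed case the same idea applies with Popen plus an explicit kill on timeout; this just shows the run-based shape.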

I am currently having some issues in our setup with the full-text search stuff on giant DS tables, so I am just fixing that up and will then see whether the column comments get exported or not.

@JVickery-TBS
Author

@wardi yup! The comments from the Data Dictionary do in fact get exported:

```sql
COMMENT ON COLUMN "public"."0690bcb7-42a6-40b4-9ab2-bf4ca4f4ebb3"."ref_number" IS '{"_info":{"label_en":"Reference Number","label_fr":"Reference Number fr","notes_en":"Reference Number des","notes_fr":"Reference Number des fr","type_override":""}}';
COMMENT ON COLUMN "public"."0690bcb7-42a6-40b4-9ab2-bf4ca4f4ebb3"."amendment_number" IS '{"_info":{"label_en":"Amendment Number","label_fr":"Amendment Number fr","notes_en":"Amendment Number des","notes_fr":"Amendment Number des fr","type_override":""}}';
COMMENT ON COLUMN "public"."0690bcb7-42a6-40b4-9ab2-bf4ca4f4ebb3"."amendment_date" IS '{"_info":{"label_en":"","label_fr":"","notes_en":"","notes_fr":"","type_override":""}}';
COMMENT ON COLUMN "public"."0690bcb7-42a6-40b4-9ab2-bf4ca4f4ebb3"."agreement_type" IS '{"_info":{"label_en":"","label_fr":"","notes_en":"","notes_fr":"","type_override":""}}';
COMMENT ON COLUMN "public"."0690bcb7-42a6-40b4-9ab2-bf4ca4f4ebb3"."recipient_type" IS '{"_info":{"label_en":"","label_fr":"","notes_en":"","notes_fr":"","type_override":""}}';
```
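
As a rough illustration (not part of this PR), once a dump like this is restored into a local Postgres, the Data Dictionary can be read back out of the column comments; the DSN and resource id below are placeholders.

```python
# Illustration only (not part of this PR): after restoring the dump into a
# local Postgres, read the Data Dictionary back from the column comments.
# The DSN and resource id are placeholders.
import json

import psycopg2

RESOURCE_ID = '0690bcb7-42a6-40b4-9ab2-bf4ca4f4ebb3'

conn = psycopg2.connect('dbname=local_copy')
with conn, conn.cursor() as cur:
    cur.execute(
        """
        SELECT a.attname, col_description(a.attrelid, a.attnum)
        FROM pg_attribute a
        WHERE a.attrelid = %s::regclass
          AND a.attnum > 0
          AND NOT a.attisdropped
        ORDER BY a.attnum
        """,
        (f'"{RESOURCE_ID}"',))
    for column, comment in cur.fetchall():
        # Each comment stores the Data Dictionary entry as JSON under "_info".
        info = json.loads(comment)['_info'] if comment else {}
        print(column, info.get('label_en', ''))
conn.close()
```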

JVickery-TBS requested a review from wardi on February 24, 2025 at 16:37
@wardi
Member

wardi commented Mar 15, 2025

May not be accepted upstream. There are no psql CLI tools in the default CKAN docker container, and there might be problems with opening too many connections to the DB if we're using external tools to make new connections as a result of a web request.

@JVickery-TBS
Author

@wardi yeah, I figured I would not put this upstream. Will show this to the Data and Biz team this Friday and see what they think.
