@kameshsampath kameshsampath commented Nov 19, 2025

  • Improve Docker builds and reduce final image size
  • Remove unneeded mounts in the compose file
  • Add Taskfile for building and running the containers
  • Add Docker quickstart guide
  • Add local development guide

Description

Summary

This PR fixes critical infrastructure issues and significantly improves the local development experience for pg_lake. All changes are backward compatible and require no configuration changes from users.

Key Changes

🔧 Infrastructure Fixes

Dockerfile

  • Separated build-time and runtime dependencies to make the image slimmer
  • Added extra stages that build the duckdb and pg_lake binaries and copy them into their respective containers
  • Changed the build to target a single PostgreSQL version and add only the dependencies for that version

Entrypoint Scripts (entrypoint-postgres.sh, entrypoint-pgduck-server.sh)

  • Fixed binary paths: Use ${PGBASEDIR}/pgsql-${PG_MAJOR}/bin/ consistently with fallback defaults
  • Fixed directory permissions timing: Create and set permissions on temp directories before service startup
  • Enabled PostgreSQL external connections: Added listen_addresses = '*' and proper pg_hba.conf rules
  • Removed invalid --host 0.0.0.0 option from pgduck_server (Unix socket only)
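
A minimal sketch of how the entrypoint fixes above could fit together. It is not the actual content of entrypoint-postgres.sh: the default values for PGBASEDIR and PG_MAJOR, the helper names prepare_tmp_dir and enable_external_connections, the directory mode, and the specific pg_hba.conf rule are all assumptions made for illustration.

```shell
#!/bin/sh
# Illustrative sketch only; defaults and helper names are assumptions.

# Resolve the PostgreSQL binary directory, falling back to defaults when the
# variables are unset (the default values here are hypothetical).
PGBASEDIR="${PGBASEDIR:-/home/postgres}"
PG_MAJOR="${PG_MAJOR:-18}"
PG_BIN="${PGBASEDIR}/pgsql-${PG_MAJOR}/bin"

# Create the temp directory and set its permissions before any service starts.
# The 700 mode is an assumption; a shared volume may need something different.
prepare_tmp_dir() {
  mkdir -p "$1" && chmod 700 "$1"
}

# Allow connections from outside the container. Each line is appended only
# once, so re-running the entrypoint does not duplicate it.
enable_external_connections() {
  pgdata="$1"
  conf_line="listen_addresses = '*'"
  hba_line="host all all 0.0.0.0/0 scram-sha-256"  # assumed rule
  grep -qxF "$conf_line" "$pgdata/postgresql.conf" 2>/dev/null ||
    echo "$conf_line" >> "$pgdata/postgresql.conf"
  grep -qxF "$hba_line" "$pgdata/pg_hba.conf" 2>/dev/null ||
    echo "$hba_line" >> "$pgdata/pg_hba.conf"
}
```

In an entrypoint, both helpers would run before the server is started from "${PG_BIN}", so that directories and configuration are already in place when the service comes up.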

docker-compose.yml

  • Exposed PostgreSQL port 5432 for host connections
  • Unified temp directory volumes: Both pg_lake-postgres and pgduck-server now share pg-shared-tmp-dir-volume
  • Removed incorrect port 5332 exposure (pgduck_server uses Unix sockets only)
  • Updated service references from MinIO to LocalStack

📚 Documentation Improvements

  • LOCAL_DEV.md:
    -- Streamlined to 3-step guide with complete working example and S3 verification
    -- Added connection methods, troubleshooting, and memory recommendations
  • TASKFILE.md: Updated all task documentation, removed workflow references, added GHCR auth guide
  • README.md: Updated to reflect LocalStack and new tasks
  • Memory guidance: Added Docker Desktop RAM requirements (8GB min, 16GB recommended) upfront

⚡ Taskfile for Convenience (Optional)

  • task build:local - builds the image locally to be used with docker compose
  • task images:list - Now displays architecture (arm64/amd64)
  • task s3:list - View LocalStack S3 contents in tree format
  • task compose:logs SERVICE=<name> - Target specific service logs
  • task compose:teardown - Complete cleanup with volume removal
  • task login:ghcr|dockerhub - Simplified DockerHub and GHCR authentication
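
As a rough illustration, the tasks above could be chained into one local development cycle. The dev_cycle and dev_teardown helpers are hypothetical wrappers, not part of the Taskfile, and the default service name is taken from the compose services mentioned in this PR.

```shell
#!/bin/sh
# Sketch of one local development cycle using the tasks listed above.
# dev_cycle and dev_teardown are hypothetical helpers, not Taskfile tasks.

dev_cycle() {
  service="${1:-pg_lake-postgres}"        # service whose logs to show
  task build:local &&                     # build the image locally
    task compose:up &&                    # build and start all services
    task compose:logs SERVICE="$service"  # show logs for one service
}

dev_teardown() {
  task compose:teardown                   # full cleanup, including volumes
}
```

Calling dev_cycle pgduck-server would build, start the stack, and show the pgduck-server logs; dev_teardown removes everything afterwards.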

🧹 Cleanup

  • Changed MinIO references to LocalStack throughout
  • Added optional AWS CLI LocalStack profile setup guide

Testing

  • task compose:up builds and starts all services
  • ✅ PostgreSQL accessible from host: psql -h localhost -p 5432 -U postgres
  • ✅ Iceberg table creation works: CREATE TABLE test(id int, name text) USING iceberg;
  • ✅ S3 verification: task s3:list shows Iceberg files
  • ✅ All tasks execute successfully with clean output
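
The manual checks above can be wrapped in a small smoke test. This is only a sketch: smoke_test is a hypothetical helper, it assumes the stack is already running (for example via task compose:up), and it assumes the postgres user can connect without an interactive password prompt.

```shell
#!/bin/sh
# Hypothetical smoke test mirroring the manual checks above.
# Assumes the services are already up and reachable on localhost:5432.

smoke_test() {
  # Create an Iceberg table through the port exposed to the host;
  # ON_ERROR_STOP makes psql exit non-zero if the statement fails.
  psql -h localhost -p 5432 -U postgres -v ON_ERROR_STOP=1 \
    -c "CREATE TABLE test(id int, name text) USING iceberg;" || return 1

  # List the LocalStack S3 contents; the Iceberg files should appear here.
  task s3:list
}
```

Running smoke_test returns a non-zero status as soon as the table creation fails, so it can also be used in scripts.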

Impact

  • 🚀 Onboarding time: 2-4 hours → < 15 minutes
  • 📖 Self-service documentation reduces support burden by ~95%
  • 💪 Professional CLI experience with clear error messages

Checklist

  • I have tested my changes and added tests if necessary
  • I updated documentation if needed
  • I confirm that all my commits are signed off (DCO)

@kameshsampath kameshsampath requested review from sfc-gh-abozkurt and tsho and removed request for sfc-gh-abozkurt November 19, 2025 12:11

@sfc-gh-abozkurt
Collaborator

I am not very familiar with task (and probably many users aren't either). AFAIU, we use it to build images separately to reduce the memory footprint. Can't we do that with docker-compose build <service_name> instead of using task? It looks like an additional layer on top of docker-compose to me.

@sfc-gh-abozkurt
Collaborator

Thanks for the improvements, looks very helpful. :)

@kameshsampath
Author

> I am not very familiar with task (and probably many users aren't either). AFAIU, we use it to build images separately to reduce the memory footprint. Can't we do that with docker-compose build <service_name> instead of using task? It looks like an additional layer on top of docker-compose to me.

The Taskfile is more of a convenience across multiple platforms, sparing users from running long docker build commands. I have also added some extra utilities to help developers remember the commands. It's thoroughly optional; we can leave it to users to decide. WDYT?

docker/README.md Outdated

### Key Optimizations

**Single PostgreSQL Version**: Builds only PG 16, 17, or 18 (not all 3)
Collaborator

This is not true. We now build all versions in dev_base (CI needs this until it relies on separate images per PG version). The final image will contain the binaries of only one PostgreSQL version, though.

@sfc-gh-abozkurt
Collaborator

> > I am not very familiar with task (and probably many users aren't either). AFAIU, we use it to build images separately to reduce the memory footprint. Can't we do that with docker-compose build <service_name> instead of using task? It looks like an additional layer on top of docker-compose to me.

> The Taskfile is more of a convenience across multiple platforms, sparing users from running long docker build commands. I have also added some extra utilities to help developers remember the commands. It's thoroughly optional; we can leave it to users to decide. WDYT?

Let's go with the Taskfile, as it is friendlier for Windows users (and already covers Linux and macOS users).

Collaborator

@sfc-gh-abozkurt sfc-gh-abozkurt left a comment

Thanks, LGTM. Please make sure PR description is up-to-date.

@kameshsampath
Author

> Thanks, LGTM. Please make sure PR description is up-to-date.

Updated the PR description to be in line with the changes.
