This repository provides a streamlined setup for running AWS Glue ETL libraries locally with support for AWS SSO, based on Amazon Linux 2023, in a DevContainer. It resolves common challenges faced when configuring AWS Glue locally, as discussed in various resources.
- DevContainer Integration: Fully configured for use with Visual Studio Code or compatible tools.
- AWS SSO Support: Properly handles AWS SSO credential management to facilitate seamless local testing.
- Jupyter Environment: Preconfigured Jupyter server for interactive development.
- Spark Configuration: Includes PySpark setup for ETL development.
- References and Improvements: Inspired by community discussions and solutions.
- Visual Studio Code installed on your local machine.
- Docker installed and running.
- AWS SSO configuration in your ~/.aws/credentials or environment variables.
- Familiarity with AWS Glue and ETL processes.
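Before building the container, it can help to confirm that an SSO profile is actually defined on the host. The following is a minimal sketch, not part of this repository. It assumes the profiles live in ~/.aws/config, which is where `aws configure sso` writes them; if your setup keeps them elsewhere, adjust the path. The helper name `sso_profiles` is hypothetical.

```python
import configparser
from pathlib import Path

# Keys that mark a profile as SSO-based in the AWS shared config file.
SSO_KEYS = ("sso_session", "sso_start_url")


def sso_profiles(config_text):
    """Return the names of profiles in an AWS config file that use SSO."""
    parser = configparser.ConfigParser(interpolation=None, strict=False)
    parser.read_string(config_text)
    profiles = []
    for section in parser.sections():
        if section == "default":
            name = "default"
        elif section.startswith("profile "):
            name = section[len("profile "):].strip()
        else:
            # Skip non-profile sections such as [sso-session ...].
            continue
        if any(key in parser[section] for key in SSO_KEYS):
            profiles.append(name)
    return profiles


if __name__ == "__main__":
    # Assumed location; the AWS CLI writes SSO profiles here by default.
    config_path = Path.home() / ".aws" / "config"
    if config_path.exists():
        print("SSO profiles:", sso_profiles(config_path.read_text()))
    else:
        print(f"{config_path} not found")
```

An empty list suggests that no SSO profile is configured yet, in which case the credentials mounted into the container are unlikely to work.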
- Clone this repository:
  git clone <repository-url>
  cd <repository-folder>
- Open the repository in Visual Studio Code.
- When prompted, open the folder in the DevContainer. Alternatively, you can manually rebuild the container:
  - Press Ctrl+Shift+P (or Cmd+Shift+P on Mac) to open the Command Palette.
  - Select Remote-Containers: Rebuild and Reopen in Container.
- The DevContainer will start and initialize:
  - AWS credentials will be mounted into the container.
  - The Jupyter server will be started automatically.
Once the DevContainer is running, JupyterLab will be accessible at http://localhost:8889. It is preconfigured with all necessary libraries and paths for AWS Glue development.
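If the page does not load, a quick way to tell whether the Jupyter server has come up is to test the port from the host. This is a minimal sketch using only the Python standard library. The function name `port_is_open` is hypothetical, and the check only confirms that something accepts TCP connections on the port, not that it is JupyterLab.

```python
import socket


def port_is_open(host="localhost", port=8889, timeout=1.0):
    """Return True if a TCP connection to host:port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unresolved hosts.
        return False


if __name__ == "__main__":
    state = "reachable" if port_is_open() else "not reachable"
    print(f"localhost:8889 is {state}")
```

The server may need a few seconds after the container starts, so a "not reachable" result right after a rebuild is not necessarily an error.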
- DevContainer: Configuration is defined in devcontainer.json for a seamless setup.
- Jupyter: Automatically starts on container launch via the postStartCommand script.
- Spark: Configured with paths to PyGlue.zip and py4j for full Glue functionality.
- Network Ports:
  - 4040: Spark UI
  - 18080: Spark History Server
  - 8998: Livy Server
  - 8889: JupyterLab
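To confirm that the Spark and Glue paths described above are picked up, a notebook cell or script inside the container can check that the modules import and then create a Glue context. This is a sketch under assumptions: the helper names `missing_modules` and `make_glue_context` are hypothetical, and the module list assumes that PyGlue.zip provides the `awsglue` package, as it does in the AWS Glue libraries.

```python
import importlib.util

# Top-level modules expected on the Python path inside the container.
REQUIRED_MODULES = ("pyspark", "py4j", "awsglue")


def missing_modules(names=REQUIRED_MODULES):
    """Return the module names that cannot be found on the Python path."""
    return [name for name in names if importlib.util.find_spec(name) is None]


def make_glue_context():
    """Create a GlueContext on top of a local SparkContext."""
    # Imported lazily so the availability check above works anywhere.
    from awsglue.context import GlueContext
    from pyspark.context import SparkContext

    return GlueContext(SparkContext.getOrCreate())


if __name__ == "__main__":
    missing = missing_modules()
    if missing:
        print("Not importable:", ", ".join(missing))
    else:
        glue_context = make_glue_context()
        print("Spark version:", glue_context.spark_session.version)
```

While a Spark context is active, the Spark UI should be available on port 4040 from the list above.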
Modify the devcontainer.json and Dockerfile to customize dependencies and environment configurations.
Feel free to open issues or submit pull requests to improve this repository.