KathrynLin
Collaborator

Summary

Changes

  • Added Amazon EMR EC2 Integration section describing cluster, instance, and step management capabilities
  • Added EMR EC2 Handler Tools table with detailed descriptions of:
    • manage_aws_emr_clusters: Comprehensive EMR cluster lifecycle management
    • manage_aws_emr_ec2_instances: Dynamic scaling and instance management
    • manage_aws_emr_ec2_steps: Big data processing job orchestration
  • Included key operations and requirements for each tool

Checklist

If an item doesn't seem to apply to your change, please leave it unchecked.

  • I have reviewed the contributing guidelines
  • I have performed a self-review of this change
  • Changes have been tested
  • Changes are documented

Is this a breaking change? (Y/N)

RFC issue number:

Checklist:

  • Migration process documented
  • Implement warnings (if it can live side by side)

Acknowledgment

By submitting this pull request, I confirm that you can use, modify, copy, and redistribute this contribution, under the terms of the project license.

LiyuanLD and others added 30 commits June 26, 2025 14:45
feat(dataprocessing-mcp-server): AWS Glue Data Catalog mcp tools
Fix(dataprocessing-mcp-server): fix README.md and unit tests
…ibute access type check and add more unit tests for test coverage
chore: add more unit tests
feat(dataprocessing-mcp-server): AWS Glue Interactive Sessions and Workflows mcp tools
test(dataprocessing-mcp-server): Glue Commons MCP: Validation test fix and reformat files
fix(dataprocessing-mcp-server): fix unit test in test_server.py and type check failure in main branch
* Add EMR EC2 instance management functionality

- Add EMREc2InstanceHandler for comprehensive EMR instance operations
- Support for instance fleets and instance groups management
- Include add, modify, and list operations for EMR instances
- Add comprehensive test coverage for EMR functionality
- Support for both read and write operations with proper validation

* feat: integrate EMR EC2 instance handler into dataprocessing MCP server

- Add EMREc2InstanceHandler import and initialization
- Update server description to include EMR EC2 instance management
- Add comprehensive EMR usage examples and workflows in server instructions
- Enable EMR instance fleet and instance group operations through MCP tools

* refactor: improve EMR EC2 instance handler implementation

- Refactor EMR EC2 instance handler for better code organization
- Update EMR models with improved structure and validation
- Enhance test coverage and test organization
- Fix import ordering in server.py for better code style
- Improve error handling and response formatting

---------

Co-authored-by: Kathryn Lin <[email protected]>
…ols (#12)

* Add Glue ETL Handler for Glue Job Tools

* Reformat Glue ETL Handler and tests with precommit

* Fix test_initialization_parameters

* chore(dataprocessing-mcp-server): Add missing docstrings for Glue ETL Handler tests

---------

Co-authored-by: Chris Kha <[email protected]>
…ndlers (#15)

* feat: integrate EMR EC2 instance handler into dataprocessing MCP server

- Add EMREc2InstanceHandler import and initialization
- Update server description to include EMR EC2 instance management
- Add comprehensive EMR usage examples and workflows in server instructions
- Enable EMR instance fleet and instance group operations through MCP tools

* test: add comprehensive edge case tests for EMR EC2 instance handler

- Add extensive test coverage for error handling scenarios
- Test parameter validation and edge cases
- Add tests for empty response handling
- Test tag verification edge cases
- Add tests for complex parameter combinations
- Test string conversion scenarios
- Improve test coverage for modify operations without cluster_id
- Add tests for missing response fields and empty values
- Test pagination marker handling in list operations
- Enhance overall test robustness and code coverage

* feat: add EMR EC2 steps handler

* WIP: EMR EC2 steps handler improvements and tests

* feat: integrate EMR EC2 steps handler into server configuration

- Add EMREc2StepsHandler initialization in main server setup
- Include EMR EC2 steps management workflows in server instructions
- Enable step management operations through MCP tools

* style: improve code formatting in EMR EC2 steps handler tests

- Fix line length issues for better readability
- Improve multi-line assertion formatting
- Clean up whitespace inconsistencies

---------

Co-authored-by: Kathryn Lin <[email protected]>
…ue Interactive Session and AWS Glue Workflows mcp tools (#16)

* fix(dataprocessing-mcp-server): Add unit tests and update readme for AWS Glue Interactive Session and AWS Glue Workflows mcp tools

* fix(dataprocessing-mcp-server): Add unit tests and update readme for AWS Glue Interactive Session and AWS Glue Workflows mcp tools

---------

Co-authored-by: raghav-aws <[email protected]>
naikvaib and others added 6 commits June 30, 2025 16:28
* fix unit test in test_server.py

* fix: fix all type check failure in main

* feat: Add EMR EC2 instance management to dataprocessing-mcp-server (#9)

* Add EMR EC2 instance management functionality

- Add EMREc2InstanceHandler for comprehensive EMR instance operations
- Support for instance fleets and instance groups management
- Include add, modify, and list operations for EMR instances
- Add comprehensive test coverage for EMR functionality
- Support for both read and write operations with proper validation

* feat: integrate EMR EC2 instance handler into dataprocessing MCP server

- Add EMREc2InstanceHandler import and initialization
- Update server description to include EMR EC2 instance management
- Add comprehensive EMR usage examples and workflows in server instructions
- Enable EMR instance fleet and instance group operations through MCP tools

* refactor: improve EMR EC2 instance handler implementation

- Refactor EMR EC2 instance handler for better code organization
- Update EMR models with improved structure and validation
- Enhance test coverage and test organization
- Fix import ordering in server.py for better code style
- Improve error handling and response formatting

---------

Co-authored-by: Kathryn Lin <[email protected]>

* chore(dataprocessing-mcp-server): add more unit test for AWS Glue Data Catalog (#13)

* feat(dataprocessing-mcp-server): Add Glue ETL Handler for Glue Job Tools (#12)

* Add Glue ETL Handler for Glue Job Tools

* Reformat Glue ETL Handler and tests with precommit

* Fix test_initialization_parameters

* chore(dataprocessing-mcp-server): Add missing docstrings for Glue ETL Handler tests

---------

Co-authored-by: Chris Kha <[email protected]>

* chore(dataprocessing-mcp-server): Add unit tests for Glue Commons mcp tools (#14)

Co-authored-by: Chris Kha <[email protected]>

* feat(dataprocessing-mcp-server): Add AWS Athena Query Tools

* fix: fix unit test case to remove count

* fix: remove content and isError params from athena models

* Remove content and isError from Athena Models

* fix ruff for athena_models

---------

Co-authored-by: LiyuanLD <[email protected]>
Co-authored-by: Fangqing Lin <[email protected]>
Co-authored-by: Kathryn Lin <[email protected]>
Co-authored-by: Liyuan Lin <[email protected]>
Co-authored-by: Christopher Kha <[email protected]>
Co-authored-by: Chris Kha <[email protected]>
* feat(dataprocessing-mcp-server): add AWS Glue Crawler MCP tools

* chore(dataprocessing-mcp-server): enable AWS Glue Data Catalog classifier and scheduler tools
* feat: add comprehensive EMR EC2 cluster management handler

- Add EMREc2ClusterHandler with full CRUD operations for EMR clusters
- Support cluster lifecycle: create, describe, modify, terminate, list
- Add security configuration management (create, delete, describe, list)
- Include cluster attribute modification and waiting capabilities
- Add comprehensive response models for all cluster operations
- Integrate handler into server configuration
- Add extensive unit tests with 95%+ coverage
- Support pagination, filtering, and error handling
- Include MCP-managed tagging for resource tracking

* style: improve code formatting and consistency in EMR EC2 cluster handler

- Convert double quotes to single quotes for consistency
- Improve import organization and formatting
- Enhance code readability with better line breaks
- Maintain consistent parameter formatting
- Update test file formatting to match handler style
- No functional changes, only style improvements

---------

Co-authored-by: Kathryn Lin <[email protected]>
… tools (#22)

* fix unit test in test_server.py

* fix: fix all type check failure in main

* feat: Add EMR EC2 instance management to dataprocessing-mcp-server (#9)

* Add EMR EC2 instance management functionality

- Add EMREc2InstanceHandler for comprehensive EMR instance operations
- Support for instance fleets and instance groups management
- Include add, modify, and list operations for EMR instances
- Add comprehensive test coverage for EMR functionality
- Support for both read and write operations with proper validation

* feat: integrate EMR EC2 instance handler into dataprocessing MCP server

- Add EMREc2InstanceHandler import and initialization
- Update server description to include EMR EC2 instance management
- Add comprehensive EMR usage examples and workflows in server instructions
- Enable EMR instance fleet and instance group operations through MCP tools

* refactor: improve EMR EC2 instance handler implementation

- Refactor EMR EC2 instance handler for better code organization
- Update EMR models with improved structure and validation
- Enhance test coverage and test organization
- Fix import ordering in server.py for better code style
- Improve error handling and response formatting

---------

Co-authored-by: Kathryn Lin <[email protected]>

* chore(dataprocessing-mcp-server): add more unit test for AWS Glue Data Catalog (#13)

* feat(dataprocessing-mcp-server): Add Glue ETL Handler for Glue Job Tools (#12)

* Add Glue ETL Handler for Glue Job Tools

* Reformat Glue ETL Handler and tests with precommit

* Fix test_initialization_parameters

* chore(dataprocessing-mcp-server): Add missing docstrings for Glue ETL Handler tests

---------

Co-authored-by: Chris Kha <[email protected]>

* chore(dataprocessing-mcp-server): Add unit tests for Glue Commons mcp tools (#14)

Co-authored-by: Chris Kha <[email protected]>

* feat(dataprocessing-mcp-server): Add AWS Athena Query Tools

* fix: fix unit test case to remove count

* fix: remove content and isError params from athena models

* Remove content and isError from Athena Models

* fix ruff for athena_models

* feat(dataprocessing-mcp-server): Add Athena Workgroup and DataCatalog tools

* chore:(dataprocessing-mcp-server): Add Unit test cases for Athena Models

---------

Co-authored-by: LiyuanLD <[email protected]>
Co-authored-by: Fangqing Lin <[email protected]>
Co-authored-by: Kathryn Lin <[email protected]>
Co-authored-by: Liyuan Lin <[email protected]>
Co-authored-by: Christopher Kha <[email protected]>
Co-authored-by: Chris Kha <[email protected]>
- Add Amazon EMR EC2 Integration section with cluster, instance, and step management capabilities
- Add EMR EC2 Handler Tools table with manage_aws_emr_clusters, manage_aws_emr_ec2_instances, and manage_aws_emr_ec2_steps tools
- Include comprehensive descriptions of key operations and requirements for each tool
- Add comprehensive EMR EC2 Cluster Management section with 10 example operations
- Include cluster lifecycle operations: create, describe, list, modify, terminate
- Include security configuration operations: create, delete, describe, list
- Provide practical examples for each operation with proper parameters
@naikvaib naikvaib self-requested a review as a code owner July 9, 2025 00:49
naikvaib pushed a commit that referenced this pull request Jul 14, 2025
* feat: AWS MCP Server

* fix: Fix couple of code scanning issues

* fix: fix pre-commit hooks checks

* fix: add missing importlib_resources dependency on pre-commit hook

* fix: ensure pip is installed before attempting to install importlib

* fix: ensure aws-mcp-server package is installed before generating confirm list

* fix: disable rag test temporarily

* Update Docker image

* fix: add missing resume_token when stopping early after processing first page

* Update call_aws tool prompt

* fix: trivy scan

* fix: skip trivy when trivy-results.sarif is in PR

Signed-off-by: Scott Schreckengaust <[email protected]>

* fix: skip py.typed for licenses

Signed-off-by: Scott Schreckengaust <[email protected]>

* fix: add trivy-results.sarif to gitignore

Signed-off-by: Scott Schreckengaust <[email protected]>

* fix: add trivy-results.sarif for aws-mcp-server

Signed-off-by: Scott Schreckengaust <[email protected]>

* fix: if statement without brackets

Signed-off-by: Scott Schreckengaust <[email protected]>

* fix: use the commit from the checkout

Signed-off-by: Scott Schreckengaust <[email protected]>

* fix: switch diff to HEAD

Signed-off-by: Scott Schreckengaust <[email protected]>

* fix: get depth of PR

Signed-off-by: Scott Schreckengaust <[email protected]>

* fix: use GITHUB_OUTPUT

Signed-off-by: Scott Schreckengaust <[email protected]>

* fix: check for existing trivy results first

Signed-off-by: Scott Schreckengaust <[email protected]>

* fix: view before and after lfs pull

Signed-off-by: Scott Schreckengaust <[email protected]>

* fix: git lfs checkout

Signed-off-by: Scott Schreckengaust <[email protected]>

* fix: git lfs ls-files debugging

Signed-off-by: Scott Schreckengaust <[email protected]>

* add test workflow

* add test workflow

* add .gitattributes

---------

Signed-off-by: Scott Schreckengaust <[email protected]>

* Removed consent, implemented AWSCLI customizations support, README and refactoring changes

* fix: pyright issues

* Remove custom client-side filtering and apply filter directly (#5)

* Update README

* Remove constraints

* feat: Adding error logging using mcp context so that clients can log errors (#8)

* Adding error logging using mcp context so that clients can log errors appropriately

* pyright fixes

* Fixing UTs

* Using AwsMcpServerErrorResponse in suggest_aws_commands tool

* Using isinstance for type checking

---------

Co-authored-by: Sagnik Dutta <[email protected]>

* feat: call_aws prompt update to decrease filter/query use (#10)

Co-authored-by: Baris Kurt <[email protected]>

* feat: Throw error when JMESPATH expression for --query cannot be parsed (#9)

* fix: aws-mcp readme updates (#11)

Co-authored-by: Roman Shevchuk <[email protected]>

* fix: remove MAX_OUTPUT_TOKENS env variable (#13)

* Remove MAX_OUTPUT_TOKENS

* Update README.md

* Remove is_counting tool parameter (#14)

* fix: allow high-level s3 commands (#15)

Co-authored-by: Shirui Yang <[email protected]>

* fix: Updating README (#16)

* fix: Updating README

* fix: precommit check

* fix: Removing coverage report which was added accidentally

* test: increase test coverage (#17)

* test: add kb test

* Add more tests to increase coverage

* Remove functional testing for private methods

---------

Co-authored-by: Shirui Yang <[email protected]>
Co-authored-by: Arne Wouters <[email protected]>

* feat: build embeddings (#7)

Implement embeddings generation CI.

1. During build time, check if aws-mcp-server files were changed
2. Download latest GH artifact for 'main' branch
3. Unpack its source and check existence of embeddings artifact
4. If it is present - check its awscli version (stated in the name).
   4.1 If local awscli version is same as stated in the embeddings title - use this file
   4.2 If local awscli version is different - generate new embeddings
5. If it is not present - generate embeddings
6. During uv build, package embeddings into final distribution
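
The reuse-or-regenerate decision in the list above (the "if it is present" and "if it is not present" branches) could be sketched roughly as follows. This is only an illustration, not the CI code from this commit: the function name `embeddings_action` and the artifact naming scheme `embeddings-<version>.npz` are assumptions, since the commit message only says the awscli version is stated in the artifact name.

```python
import re
from typing import Optional

# Hypothetical naming scheme: the source only says the awscli version is
# "stated in the name" of the embeddings artifact.
_ARTIFACT_RE = re.compile(r"^embeddings-(?P<version>\d+(?:\.\d+)*)\.npz$")


def embeddings_action(artifact_name: Optional[str], local_awscli_version: str) -> str:
    """Return 'reuse' or 'generate' for a downloaded embeddings artifact.

    artifact_name is None when no embeddings artifact was found in the
    latest 'main' build.
    """
    if artifact_name is None:
        # Artifact not present: generate embeddings.
        return "generate"
    match = _ARTIFACT_RE.match(artifact_name)
    if match is None:
        # Assumption: a name we cannot parse is treated like a missing artifact.
        return "generate"
    if match.group("version") == local_awscli_version:
        # Same awscli version as the local one: use the existing file.
        return "reuse"
    # Different awscli version: generate new embeddings.
    return "generate"
```

Keeping the decision in a small pure function like this would make it straightforward to unit test separately from the download and packaging steps.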

---------

Signed-off-by: GitHub <[email protected]>
Co-authored-by: Roman Shevchuk <[email protected]>

* feat: generate embeddings for all AWS CLI commands (#18)

Co-authored-by: Azat Nizametdinov <[email protected]>

* feat: rebuild embeddings if core kb files changed (#20)

* feat: rebuild embeddings if generation logic changed

* fix: change compare base to main

* chore: change in kb folder

* fix: remove unnecessary step

* fix: debug

* fix: debug2

* chore: change in core kb

* fix: use fromJson

* chore: change kb file

* fix: log

* fix: log2

* feat: verification

* feat: final

* feat: pin awscli version to latest

* fix: update uv.lock

---------

Co-authored-by: Roman Shevchuk <[email protected]>

* fix: remove print statement (#21)

* chore: Ignore Semgrep Finding python37-compatibility-importlib2 (#23)

Ignore the following Semgrep finding because AWS MCP Server requires Python version >= 3.10
Semgrep Finding: python.lang.compatibility.python37.python37-compatibility-importlib2
Found 'importlib.resources', which is a module only available on Python 3.7+. This does not work in lower versions, and therefore is not backwards compatible. Use importlib_resources instead for older Python versions.

Co-authored-by: Azat Nizametdinov <[email protected]>

* chore: Improve README and add Legal statements

Co-authored-by: Shirui Yang <[email protected]>

* chore: add missing tests for embeddings CI (#25)

* chore: add missing tests for embeddings CI

* chore: address comments

---------

Co-authored-by: Roman Shevchuk <[email protected]>

* chore: add logging for the source of credentials (#19)

Co-authored-by: Shirui Yang <[email protected]>

* feat: Using service reference API for getting readonly operations list (#24)

* fix: Adding cached read only policy to prevent failure during server startup due to missing iam:GetPolicy

* fix: Updating UTs + fixing import

* feat: Using service reference API for getting readonly operations list

* fix: Adding timeout to requests call

* Updating readme

* chore: Removing keyword search RAG code (#26)

* Removing keyword search RAG code

* Updating README according to comments

* Removing unused methods and tests

* Rewrite suggest_aws_commands description in README

---------

Co-authored-by: Shirui Yang <[email protected]>

* chore: update README following Legal review (#27)

Co-authored-by: Shirui Yang <[email protected]>

* chore: add __main__ handler to server.py (#28)

Co-authored-by: Roman Shevchuk <[email protected]>

* chore: Adding support for custom cli commands in readonly mode (#29)

* chore: Adding support for custom cli commands in readonly mode

* Renaming variables

* Adding disclaimer for file system operations (#30)

* feat: Add AWS_MCP_WORKING_DIR to prevent file operations in unexpected locations (#32)

* feat: Add AWS_MCP_WORKING_DIR to prevent file operations in unexpected locations

When using relative paths, commands like aws s3 sync and aws s3 cp could create/overwrite/delete
files in unexpected locations without a controlled working directory

* feat: Check that AWS_MCP_WORKING_DIR is absolute path

---------

Co-authored-by: Azat Nizametdinov <[email protected]>
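
As a rough illustration of the absolute-path check described for AWS_MCP_WORKING_DIR, a validation step might look like the sketch below. The function name `validate_working_dir` and the error message are hypothetical; the commit message does not show the actual implementation or say what happens when the variable is unset.

```python
from pathlib import Path


def validate_working_dir(value: str) -> Path:
    """Return the working directory as a Path, rejecting relative paths.

    A relative working directory would bring back the problem the commit
    describes: commands such as `aws s3 sync` or `aws s3 cp` resolving
    paths against an unexpected location.
    """
    path = Path(value)
    if not path.is_absolute():
        raise ValueError(
            f"AWS_MCP_WORKING_DIR must be an absolute path, got: {value!r}"
        )
    return path
```

A caller would typically read the value from the environment once at startup and fail fast on a `ValueError`, rather than validating on every command.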

* Override default creds chain with specific env var (#31)

Co-authored-by: Shirui Yang <[email protected]>

* chore: better readme (#33)

* Improve installation instructions and nicefy README

* chore: remove json comments

* feat: Add AWS_MCP_WORKING_DIR to prevent file operations in unexpected locations (#32)

* feat: Add AWS_MCP_WORKING_DIR to prevent file operations in unexpected locations

When using relative paths, commands like aws s3 sync and aws s3 cp could create/overwrite/delete
files in unexpected locations without a controlled working directory

* feat: Check that AWS_MCP_WORKING_DIR is absolute path

---------

Co-authored-by: Azat Nizametdinov <[email protected]>

* Override default creds chain with specific env var (#31)

Co-authored-by: Shirui Yang <[email protected]>

* Improve installation instructions and nicefy README

* chore: resolve merge conflicts

* chore: added missing env variables

* chore: minor tweaks

---------

Co-authored-by: Roman Shevchuk <[email protected]>
Co-authored-by: Azat Nizametdinov <[email protected]>
Co-authored-by: Azat Nizametdinov <[email protected]>
Co-authored-by: Shirui Yang <[email protected]>
Co-authored-by: Shirui Yang <[email protected]>

* Recommend AWS MCP server as the default MCP server for interacting with AWS (#34)

Co-authored-by: Claudiu Popa <[email protected]>

* chore: rebrand aws-mcp to aws-api-mcp (awslabs#35)

* chore: rebrand aws-mcp to aws-api-mcp

* Update docs/servers/aws-api-mcp-server.md

Co-authored-by: Arne Wouters <[email protected]>

* Update README.md

Co-authored-by: Arne Wouters <[email protected]>

* Update docs/index.md

Co-authored-by: Arne Wouters <[email protected]>

---------

Co-authored-by: Roman Shevchuk <[email protected]>
Co-authored-by: Arne Wouters <[email protected]>

* fix: Updating custom readonly operations (awslabs#39)

* Final README and CONTRIBUTING update (awslabs#38)

Co-authored-by: Shirui Yang <[email protected]>

* feat: update core MCP server with AWS API MCP (awslabs#40)

* feat: update core MCP server with AWS API MCP

* Update index.md

* Update alias in CODEOWNERS

* Add allowlist for customizations (awslabs#41)

Co-authored-by: Bidesh Thapaliya <[email protected]>

* feat: Increasing top_k to 5 for suggest_aws_command (awslabs#44)

* Increasing top_k to 5 for suggest_aws_command

* fix: format

---------

Co-authored-by: Baris Kurt <[email protected]>

* feat: Add validation for unsupported outfile parameters (awslabs#43)

Co-authored-by: Azat Nizametdinov <[email protected]>

* fix: Adding support for readonly operations where translated operation names are different from IAM operation names (awslabs#42)

* fix: Adding support for readonly operations where translated operation names are different from IAM operation names

* Addressing comments

* Include ISO partition (awslabs#45)

Co-authored-by: Claudiu Popa <[email protected]>

* Add note on tool name change (awslabs#46)

Co-authored-by: Shirui Yang <[email protected]>

* fix: replace more print statements with logger (awslabs#37)

* fix: replace more print statements with logger

* update tests

* Fix circular dependency and put logger setup in the correct place

* Add progress bar for generating embeddings

* chore: fix docusaurus build

* fix: addressing comments and changes based on latest decisions (awslabs#48)

Co-authored-by: Roman Shevchuk <[email protected]>

* feat: lower python version to 3.12 and update Docker image to use it (awslabs#50)

* feat: lower default python to 3.12

* fix: update README

---------

Co-authored-by: Roman Shevchuk <[email protected]>

* fix: always package embeddings even for unrelated changes

* fix: Adding security disclaimer in the readme (awslabs#49)

* Add more security disclaimers for least-privilege (awslabs#51)

Co-authored-by: Claudiu Popa <[email protected]>

* fix: update tool prompt with working and home directory and encourage absolute paths (awslabs#53)

* Update tool prompt with working and home directory and encourage absolute paths

* Resolve tilde character manually

* Add support for logging to a file (awslabs#52)

The MCP server was not logging anything to a file, which made monitoring
and forensics of the server's actions impossible. With the logs
stored in a well-defined location, the user is able to track what the
server did on their behalf.

Co-authored-by: Claudiu Popa <[email protected]>

* Update README with better links and instructions (awslabs#56)

Co-authored-by: Claudiu Popa <[email protected]>

* Remove API classification from interpretation (awslabs#55)

Co-authored-by: Claudiu Popa <[email protected]>

* chore: update README.md (awslabs#54)

* Update README.md

* Update README.md

* Update README.md (awslabs#57)

* chore: fix AWS_REGION docs

---------

Signed-off-by: Scott Schreckengaust <[email protected]>
Signed-off-by: GitHub <[email protected]>
Co-authored-by: Roman Shevchuk <[email protected]>
Co-authored-by: Arne Wouters <[email protected]>
Co-authored-by: Scott Schreckengaust <[email protected]>
Co-authored-by: Arne Wouters <[email protected]>
Co-authored-by: Shirui Yang <[email protected]>
Co-authored-by: Sagnik Dutta <[email protected]>
Co-authored-by: Sagnik Dutta <[email protected]>
Co-authored-by: Barış Kurt <[email protected]>
Co-authored-by: Baris Kurt <[email protected]>
Co-authored-by: Shirui Yang <[email protected]>
Co-authored-by: Azat Nizametdinov <[email protected]>
Co-authored-by: Azat Nizametdinov <[email protected]>
Co-authored-by: Claudiu Popa <[email protected]>
Co-authored-by: Claudiu Popa <[email protected]>
Co-authored-by: Arne Wouters <[email protected]>
Co-authored-by: Bidesh Thapaliya <[email protected]>
Co-authored-by: Bidesh Thapaliya <[email protected]>
Labels

None yet

Projects

None yet

6 participants