Deploy a "Streamable-HTTP" MCP server on AWS Lambda and Amazon API Gateway for Amazon Bedrock spend analysis
This project deploys a Model Context Protocol (MCP) server as a containerized application on AWS Lambda, exposed to clients through Amazon API Gateway. The MCP protocol now supports Streamable HTTP, which means the server operates as an independent process that can handle multiple client connections. The server in this repo sends an `Mcp-Session-Id` field in the HTTP response header when accepting a new connection; the client then includes this header field in every subsequent request it sends to the server, enabling session management.
The MCP server and client in this repo are both written in TypeScript. The server is built as a container image, deployed on Lambda, and exposed as an endpoint via API Gateway.
Note: as of 4/22/2025, Lambda supports HTTP response streaming for the Node.js managed runtime. API Gateway does not support Server-Sent Events (SSE), which is where Streamable HTTP comes in handy: we can now deploy an MCP server on Lambda and API Gateway.
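The session handshake described above can be sketched as follows. This is an illustrative sketch only, not the repo's actual server code; the helper name `getOrCreateSession` and the in-memory `sessions` map are hypothetical:

```typescript
import { randomUUID } from "node:crypto";

// Hypothetical in-memory session store: session id -> per-session state.
const sessions = new Map<string, { createdAt: number }>();

// On each request, reuse the session named by the Mcp-Session-Id header,
// or mint a new one (as the server does when accepting a new connection).
function getOrCreateSession(headers: Record<string, string | undefined>): string {
  const existing = headers["mcp-session-id"]; // Node lowercases header names
  if (existing && sessions.has(existing)) {
    return existing; // client echoed a known session id
  }
  const id = randomUUID();
  sessions.set(id, { createdAt: Date.now() });
  return id; // server returns this in the Mcp-Session-Id response header
}
```

A real server would attach this id to the `Mcp-Session-Id` response header and expire stale sessions rather than keeping them forever.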
The MCP server in this repo provides a tool to get a summary of the spend on Amazon Bedrock in a given AWS account.
The following diagram illustrates the architecture of the MCP server deployment:
Image credit: https://github.com/aal80/simple-mcp-server-on-lambda
The architecture consists of:
- API Gateway: Provides the HTTP API endpoint
- Lambda Function: Runs the containerized MCP server
- ECR: Stores the Docker container image
- CloudWatch Logs: Collects and stores logs from both the server and Bedrock usage
- Bedrock: The underlying model service that the MCP server interacts with
- Node.js & npm: Install Node.js version 18.x or later (which includes npm). This is required for running the server and client locally.
- Python 3.11: Required for the deployment script (`deploy.py`).
- Docker: Install Docker Desktop or Docker Engine. Required for building the container image for AWS Lambda deployment.
- AWS Account & CLI: Required for deployment:
  - An active AWS account
  - AWS CLI installed and configured with appropriate credentials
  - Boto3 Python package
- Set up model invocation logs in Amazon CloudWatch.
- Ensure that the IAM user/role being used has read-only access to Amazon Cost Explorer and Amazon CloudWatch; this is required for the MCP server to retrieve data from these services. See here and here for sample policy examples that you can use and modify as per your requirements.
You can run both the server and client locally for development and testing purposes.
- Install dependencies:

  ```bash
  npm install
  ```

- Start the server:

  ```bash
  npx tsx src/server.ts
  ```

  The server will start on `http://localhost:3000` by default.

- Run the client with the same command:

  ```bash
  npx tsx src/client.ts
  ```
Example:

```bash
# Initialize
curl -XPOST "http://localhost:3000/prod/mcp" \
  -H "Content-Type: application/json" \
  -H "Accept: application/json" \
  -H "Accept: text/event-stream" \
  -d '{
    "jsonrpc": "2.0",
    "method": "initialize",
    "params": {
      "clientInfo": { "name": "curl-client", "version": "1.0" },
      "protocolVersion": "2025-03-26",
      "capabilities": {}
    },
    "id": "init-1"
  }'
```
If successful, you should see output similar to the following:

```
event: message
id: 2d11fbfc-f4e8-4738-8de8-60b59324459d_1745362335448_hj5mjalf
data: {"result":{"protocolVersion":"2024-11-05","capabilities":{"logging":{},"tools":{"listChanged":true},"prompts":{"listChanged":true},"resources":{"listChanged":true}},"serverInfo":{"name":"bedrock-usage-stats-http-server","version":"1.0.1"}},"jsonrpc":"2.0","id":"init-1"}
```
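The streamed response above arrives as an SSE-style frame (`event:`, `id:`, and `data:` lines), with the JSON-RPC payload in the `data` field. A minimal sketch of extracting that payload from one frame (illustrative only, not the repo's client code):

```typescript
// Parse one SSE-style frame into its fields; the JSON-RPC payload lives in "data".
function parseSseFrame(frame: string): { event?: string; id?: string; data?: unknown } {
  const out: { event?: string; id?: string; data?: unknown } = {};
  for (const line of frame.split("\n")) {
    const sep = line.indexOf(": ");
    if (sep === -1) continue; // skip blank or malformed lines
    const field = line.slice(0, sep);
    const value = line.slice(sep + 2);
    if (field === "event") out.event = value;
    else if (field === "id") out.id = value;
    else if (field === "data") out.data = JSON.parse(value);
  }
  return out;
}

const frame =
  "event: message\n" +
  "id: abc_1\n" +
  'data: {"result":{"protocolVersion":"2024-11-05"},"jsonrpc":"2.0","id":"init-1"}';
const msg = parseSseFrame(frame);
// msg.event === "message"; (msg.data as any).id === "init-1"
```

A production parser would also handle multi-line `data:` fields and frames separated by blank lines, which the SSE format allows.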
- Container Image: The Node.js/Express application is packaged into a Docker container image using Node.js 18.
- AWS ECR: The Docker image is stored in Amazon Elastic Container Registry (ECR).
- AWS Lambda: A Lambda function is created using the container image from ECR.
- API Gateway: An HTTP API is created with API key authentication and usage plans.
- CloudWatch Logs: The server integrates with CloudWatch Logs for monitoring and debugging.
The server implements the following tools:
- greet: A simple greeting tool that returns a personalized message
- multi-greet: A tool that sends multiple greetings with delays between them
- bedrock-logs: A tool for querying AWS Bedrock usage logs
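The three tools above amount to a name-to-handler dispatch table. The sketch below is hypothetical (the handler bodies and `callTool` helper are illustrative stand-ins; `bedrock-logs` is stubbed, since the real tool queries CloudWatch Logs):

```typescript
// Hypothetical tool registry mirroring the three tools listed above.
type ToolHandler = (args: Record<string, string>) => Promise<string>;

const sleep = (ms: number) => new Promise<void>((resolve) => setTimeout(resolve, ms));

const tools: Record<string, ToolHandler> = {
  // greet: returns a personalized message
  greet: async ({ name }) => `Hello, ${name}!`,
  // multi-greet: several greetings with a delay between them
  "multi-greet": async ({ name }) => {
    const parts: string[] = [];
    for (const salutation of ["Hi", "Hello", "Welcome"]) {
      parts.push(`${salutation}, ${name}!`);
      await sleep(10); // the real tool uses longer delays
    }
    return parts.join(" ");
  },
  // bedrock-logs: stubbed here; the real tool queries AWS Bedrock usage logs
  "bedrock-logs": async ({ region }) => `querying Bedrock logs in ${region}...`,
};

async function callTool(name: string, args: Record<string, string>): Promise<string> {
  const handler = tools[name];
  if (!handler) throw new Error(`Unknown tool: ${name}`);
  return handler(args);
}
```

In the actual server these handlers are registered through the MCP SDK so clients can list and invoke them over the protocol.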
- Install `uv` and the Python dependencies needed for deployment to Lambda:

  ```bash
  # Install uv if you don't have it
  curl -LsSf https://astral.sh/uv/install.sh | sh
  export PATH="$HOME/.local/bin:$PATH"

  # Create a virtual environment and install dependencies
  uv venv --python 3.11 && source .venv/bin/activate && uv pip install --requirement pyproject.toml
  ```
- Run the deployment script:

  ```bash
  python deploy.py \
    --function-name bedrock-spend-mcp-server \
    --role-arn <lambda-role-arn> \
    --region us-east-1 \
    --memory 2048 \
    --timeout 300 \
    --api-gateway \
    --api-name mcp-server-api \
    --stage-name prod
  ```
The script will:
- Build the Docker image for the correct Lambda architecture (`linux/amd64`)
- Create the ECR repository if it doesn't exist
- Authenticate Docker with ECR
- Push the image to ECR
- Create the Lambda execution IAM role and attach policies
- Create or update the Lambda function
- Create or update the API Gateway with API key authentication
- Set up usage plans and throttling
- Deployment output: upon successful completion, the script will print a summary including:
  - ECR Image URI
  - IAM Role ARN
  - Lambda Function ARN
  - API Gateway URL
  - API Key
- Note the API URL printed in the output; you should see something similar to:

  ```
  API Gateway successfully deployed!
  API URL: https://<api-id>.execute-api.us-east-1.amazonaws.com/prod
  ```
This deployment supports optional Lambda-based request authorization for the API Gateway.
- Authorizer Lambda: A separate Python Lambda function (`src/auth/auth.py`) is deployed alongside the main MCP server Lambda.
- API Gateway configuration: When authorization is enabled during deployment, the API Gateway routes are configured to first trigger this authorizer Lambda for incoming requests.
- Token check: The authorizer Lambda expects an `Authorization` header in the format `Bearer <token>` (e.g., `Authorization: Bearer MyTestToken`). It extracts the token part (currently, it accepts any token following the `Bearer` prefix due to placeholder validation logic).
- Allow/Deny: If the header is present and correctly formatted, the authorizer returns an "Allow" response to API Gateway. Otherwise, it returns "Deny".
- Backend invocation: If allowed, API Gateway proceeds to invoke the main MCP server Lambda (`bedrock` in the example below). If denied, API Gateway returns a `{"message":"Forbidden"}` response directly to the client.
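The allow/deny decision described above boils down to a header-format check. The sketch below mirrors that placeholder logic in TypeScript for illustration (the actual authorizer is the Python function in `src/auth/auth.py`, and a real deployment should replace the placeholder validation with a proper token check):

```typescript
// Illustrative TypeScript mirror of the placeholder check in src/auth/auth.py.
// Returns "Allow" when the header looks like "Bearer <token>", otherwise "Deny".
function authorize(authorizationHeader: string | undefined): "Allow" | "Deny" {
  if (!authorizationHeader) return "Deny"; // header missing entirely
  const [scheme, token] = authorizationHeader.split(" ");
  // Placeholder validation: any non-empty token after "Bearer" is accepted.
  if (scheme === "Bearer" && token && token.length > 0) return "Allow";
  return "Deny";
}
```

A real authorizer would validate the token itself (e.g., verify a JWT signature) instead of accepting any value after the `Bearer` prefix.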
To deploy the API Gateway with the Lambda authorizer enabled, use the `--enable-authorizer` flag and provide a name for the authorizer function using `--authorizer-function-name`:

```bash
# Add other options such as --region and --image-uri as needed
python deploy.py \
  --function-name bedrock \
  --role-arn <your-lambda-execution-role-arn> \
  --api-gateway \
  --enable-authorizer \
  --authorizer-function-name my-api
```
This command will:
- Deploy the main containerized Lambda (`bedrock`).
- Deploy the authorizer Lambda (`my-api`) from `src/auth/auth.py`.
- Configure the API Gateway (`bedrock-api`) to use `my-api` as the authorizer for all routes.
Use the `Authorization: Bearer <your-token>` header when making requests:

```bash
# Replace <api-id> and <region> with your deployment output
# Replace MyTestToken with your actual token if you implement real validation
API_URL="https://li8mz4qlzc.execute-api.us-east-1.amazonaws.com/prod/mcp" # Example URL
curl -XPOST "$API_URL" \
  -H "Content-Type: application/json" \
  -H "Accept: application/json, text/event-stream" \
  -H "Authorization: Bearer MyTestToken" \
  -d '{"jsonrpc": "2.0","method": "initialize","params": {"clientInfo": { "name": "curl-client", "version": "1.0" },"protocolVersion": "2025-03-26","capabilities": {}},"id": "init-1"}' | cat
```

If authorization is not enabled during deployment, you can omit the `-H "Authorization: Bearer MyTestToken"` header.
After deployment, you can connect to the server using the `client.ts` script:

- Set the required environment variables:

  ```bash
  # note the extra "/mcp" at the end of the API URL
  export MCP_SERVER_URL="https://<api-id>.execute-api.<region>.amazonaws.com/prod/mcp"
  ```

- Install dependencies:

  ```bash
  npm install
  ```

- Run the client:

  ```bash
  npx tsx src/client.ts
  ```
The client will automatically:
- Initialize a connection with the server
- Handle session management
- Provide an interactive interface for using the available tools
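The connection steps above reduce to building a JSON-RPC request and echoing the server's session id on follow-up calls. The helper below is a hypothetical sketch of that header/payload handling, not the repo's actual `client.ts` code:

```typescript
// Build the headers and body for one MCP request; if a session id is already
// known, echo it back in the Mcp-Session-Id header so the server can route
// the request to the existing session.
function buildMcpRequest(
  body: object,
  sessionId?: string
): { headers: Record<string, string>; body: string } {
  const headers: Record<string, string> = {
    "Content-Type": "application/json",
    Accept: "application/json, text/event-stream",
  };
  if (sessionId) headers["Mcp-Session-Id"] = sessionId;
  return { headers, body: JSON.stringify(body) };
}

// First call: initialize, with no session id yet.
const init = buildMcpRequest({
  jsonrpc: "2.0",
  method: "initialize",
  params: {
    clientInfo: { name: "curl-client", version: "1.0" },
    protocolVersion: "2025-03-26",
    capabilities: {},
  },
  id: "init-1",
});
```

The client would pass these headers and body to `fetch` against `MCP_SERVER_URL`, read the `Mcp-Session-Id` response header, and supply it on every subsequent request.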
- To get a summary of the Amazon Bedrock spend over the last few days, run the `bedrock-report` command on the client CLI:

  ```
  > bedrock-report <region> <log-group-name> <number-of-days> <aws-account-id>
  ```

  The above command should produce output similar to the following:
```
Tool result: Bedrock Daily Usage Report (Past 17 days - us-east-1)
Total Requests: 13060
Total Input Tokens: 2992387
Total Completion Tokens: 254124
Total Tokens: 3246511

--- Daily Totals ---
2025-04-06: Requests=8330, Input=1818253, Completion=171794, Total=1990047
2025-04-07: Requests=4669, Input=936299, Completion=71744, Total=1008043
2025-04-10: Requests=4, Input=4652, Completion=370, Total=5022
2025-04-11: Requests=6, Input=17523, Completion=1201, Total=18724
2025-04-13: Requests=27, Input=67524, Completion=4406, Total=71930
2025-04-14: Requests=24, Input=148136, Completion=4609, Total=152745

--- Region Summary ---
us-east-1: Requests=13060, Input=2992387, Completion=254124, Total=3246511

--- Model Summary ---
nova-lite-v1:0: Requests=93, Input=177416, Completion=30331, Total=207747
titan-embed-text-v1: Requests=62, Input=845, Completion=0, Total=845
nova-micro-v1:0: Requests=27, Input=63396, Completion=10225, Total=73621
llama3-3-70b-instruct-v1:0: Requests=3749, Input=780568, Completion=58978, Total=839546
claude-3-5-sonnet-20241022-v2:0: Requests=5353, Input=846616, Completion=82570, Total=929186
command-r-plus-v1:0: Requests=3644, Input=659689, Completion=40900, Total=700589
nova-pro-v1:0: Requests=40, Input=116939, Completion=13144, Total=130083
claude-3-5-haiku-20241022-v1:0: Requests=88, Input=342266, Completion=17606, Total=359872
claude-3-haiku-20240307-v1:0: Requests=4, Input=4652, Completion=370, Total=5022

--- User Summary ---
arn:aws:sts::012345678091:assumed-role/role-name/i-0ed8662e2ec5052df: Requests=314, Input=705514, Completion=71676, Total=777190
arn:aws:sts::012345678091:assumed-role/role-name/i-0e7fa4b21ef43662a: Requests=1422, Input=232289, Completion=20468, Total=252757
arn:aws:sts::012345678091:assumed-role/role-name/i-0a0528a4884da8642: Requests=11324, Input=2054584, Completion=161980, Total=2216564
```
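The per-day and per-model totals in a report like this can be derived by grouping invocation-log records and summing token counts. The sketch below is illustrative: the record shape and field names are assumptions, since the real tool reads model invocation logs from CloudWatch:

```typescript
// Assumed minimal shape of one model-invocation log record.
interface InvocationRecord {
  date: string;          // e.g. "2025-04-06"
  modelId: string;       // e.g. "nova-lite-v1:0"
  inputTokens: number;
  completionTokens: number;
}

interface Totals { requests: number; input: number; completion: number; total: number }

// Group records by a key (day, model, region, user...) and sum token counts.
function summarize(
  records: InvocationRecord[],
  keyOf: (r: InvocationRecord) => string
): Map<string, Totals> {
  const out = new Map<string, Totals>();
  for (const r of records) {
    const key = keyOf(r);
    const t = out.get(key) ?? { requests: 0, input: 0, completion: 0, total: 0 };
    t.requests += 1;
    t.input += r.inputTokens;
    t.completion += r.completionTokens;
    t.total += r.inputTokens + r.completionTokens;
    out.set(key, t);
  }
  return out;
}
```

Calling `summarize(records, r => r.date)` yields the daily totals section, and `summarize(records, r => r.modelId)` yields the model summary.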
- The client sometimes fails to call tools; restarting the client or re-establishing the connection resolves this.
This project was developed with the support of the following technologies and services:
- AWS Lambda for serverless computing
- Amazon API Gateway for API management
- Amazon Bedrock for foundation models
- Model Context Protocol for the communication protocol
- Node.js and TypeScript for the implementation
- Express.js for the web server framework