Merged
Changes from 2 commits
5 changes: 5 additions & 0 deletions packages/aws_bedrock_agentcore/changelog.yml
Original file line number Diff line number Diff line change
@@ -1,4 +1,9 @@
# newer versions go on top
- version: "0.3.0"
  changes:
    - description: Add alerting rule templates for the gateway, identity, memory, browser tool and code interpreter.
      type: enhancement
      link: https://github.com/elastic/integrations/pull/16705
- version: "0.2.0"
  changes:
    - description: Add browser tool, and code interpreter metrics with dashboards.
@@ -0,0 +1,34 @@
{
"id": "aws-bedrock-agentcore-browser-errors",
"type": "alerting_rule_template",
"attributes": {
"name": "[AWS Bedrock AgentCore] Browser errors",
"tags": ["AWS", "Amazon Bedrock AgentCore", "Browser"],
gpop63 (Contributor), Jan 1, 2026:

nit: I think Muthu suggested keeping just the service name in tags, e.g. AWS Bedrock AgentCore.

Contributor Author:

Updated. That said, I see Browser and Gateway as high-level components rather than internal ones such as memory or disk, so including their names in the tags would not be wrong. Since the number of alert templates is currently limited, though, I am leaving the component names out.

"ruleTypeId": ".es-query",
"schedule": {
"interval": "5m"
},
"params": {
"searchType": "esqlQuery",
"timeWindowSize": 5,
"timeWindowUnit": "m",
"esqlQuery": {
"esql": "// Alert triggers when errors occur during AWS Bedrock AgentCore Browser operations.\n//\n// Browser tool enables agents to interact with web pages programmatically,\n// automating web-based tasks. Operations include:\n// - StartBrowserSession: Initiates a new browser automation session\n// - StopBrowserSession: Terminates a browser session\n// - ConnectBrowserAutomationStream: Establishes automation stream connection\n// - ConnectBrowserLiveViewStream: Establishes live view stream\n// - GetBrowserSession/ListBrowserSessions: Session management operations\n// - UpdateBrowserStream: Updates a browser stream\n//\n// This alert monitors both user errors (4xx) and system errors (5xx).\n//\n// The alert is grouped by cloud account, region, and resource to pinpoint\n// specific browser sessions experiencing issues.\n//\n// To reduce alert noise, increase the threshold (e.g., `total_errors > 5`).\n// For more details, see: https://docs.aws.amazon.com/bedrock-agentcore/latest/devguide/browser.html\n\nFROM metrics-aws_bedrock_agentcore.metrics-*\n| WHERE aws.dimensions.Operation IN (\"StartBrowserSession\", \"StopBrowserSession\", \"ConnectBrowserAutomationStream\", \"ConnectBrowserLiveViewStream\", \"GetBrowserSession\", \"ListBrowserSessions\", \"UpdateBrowserStream\")\n| STATS total_user_errors = sum(aws.bedrock_agentcore.metrics.UserErrors.sum), total_system_errors = sum(aws.bedrock_agentcore.metrics.SystemErrors.sum) BY cloud.account.id, cloud.region, aws.dimensions.Resource\n| EVAL total_errors = COALESCE(total_user_errors, TO_LONG(0)) + COALESCE(total_system_errors, TO_LONG(0))\n| WHERE total_errors > 0"
Contributor:

"For more details, see: https://docs.aws.amazon.com/bedrock-agentcore/latest/devguide/browser.html"

Do we need to link the documentation here? A link is useful when the docs have a topic covering the metrics behind the alerting rule, but I don't see any mention of those metrics on this page. WDYT?

Contributor Author:

Addressed by pointing to the correct URL.

},
"groupBy": "row",
"timeField": "event.ingested"
},
"alertDelay": {
"active": 1
}
},
"managed": true,
"coreMigrationVersion": "8.8.0",
"typeMigrationVersion": "10.1.0"
}
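
A note on the `COALESCE` step in the query above: as far as I understand ES|QL, `sum()` over a field that is absent from every row in a group yields null, and null plus a number stays null, so without the fallback a group that reports only one error type would likely be dropped by the final `WHERE`. A minimal Python sketch of the same arithmetic follows. `total_errors` and `should_alert` are hypothetical helpers, and `None` stands in for an ES|QL null.

```python
from typing import Optional


def total_errors(user_errors: Optional[int], system_errors: Optional[int]) -> int:
    """Mirror COALESCE(x, 0) + COALESCE(y, 0): treat a missing sum as zero."""
    user = 0 if user_errors is None else user_errors
    system = 0 if system_errors is None else system_errors
    return user + system


def should_alert(
    user_errors: Optional[int],
    system_errors: Optional[int],
    threshold: int = 0,
) -> bool:
    """Mirror the final `WHERE total_errors > threshold` filter."""
    return total_errors(user_errors, system_errors) > threshold
```

With the default threshold of 0, a group that has only system errors still alerts; raising `threshold` to 5 matches the noise-reduction suggestion in the query comment.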

@@ -0,0 +1,34 @@
{
"id": "aws-bedrock-agentcore-browser-session-throttles",
"type": "alerting_rule_template",
"attributes": {
"name": "[AWS Bedrock AgentCore] Browser session throttles",
"tags": ["AWS", "Amazon Bedrock AgentCore", "Browser"],
"ruleTypeId": ".es-query",
"schedule": {
"interval": "5m"
},
"params": {
"searchType": "esqlQuery",
"timeWindowSize": 5,
"timeWindowUnit": "m",
"esqlQuery": {
"esql": "// Alert triggers when throttling occurs during AWS Bedrock AgentCore Browser operations.\n//\n// Browser session throttling indicates that requests are being rate-limited,\n// which can impact agent ability to perform web automation tasks.\n//\n// Common causes of browser throttling:\n// - High volume of browser session requests\n// - Concurrent session limits exceeded\n// - Resource quota constraints\n//\n// The alert is grouped by cloud account, region, and resource to identify\n// which browser resources are being throttled.\n//\n// To reduce alert noise, increase the threshold (e.g., `total_throttles > 10`).\n// For more details, see: https://docs.aws.amazon.com/bedrock-agentcore/latest/devguide/browser.html\n\nFROM metrics-aws_bedrock_agentcore.metrics-*\n| WHERE aws.dimensions.Operation IN (\"StartBrowserSession\", \"StopBrowserSession\", \"ConnectBrowserAutomationStream\", \"ConnectBrowserLiveViewStream\")\n| STATS total_throttles = sum(aws.bedrock_agentcore.metrics.Throttles.sum) BY cloud.account.id, cloud.region, aws.dimensions.Resource\n| WHERE total_throttles > 0"
},
"groupBy": "row",
"timeField": "event.ingested"
},
"alertDelay": {
"active": 1
}
},
"managed": true,
"coreMigrationVersion": "8.8.0",
"typeMigrationVersion": "10.1.0"
}
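
Every template in this PR pairs a 5-minute schedule with a 5-minute query window. A small sketch of a check for that invariant follows. `check_template` is a hypothetical helper, not part of any Elastic tooling, and it assumes the interval is always written as a number followed by a single unit letter, as in these files.

```python
import json
import re
from typing import List


def check_template(raw: str) -> List[str]:
    """Return a list of problems found in one alerting rule template."""
    doc = json.loads(raw)
    attrs = doc.get("attributes", {})
    params = attrs.get("params", {})
    problems = []

    if doc.get("type") != "alerting_rule_template":
        problems.append("unexpected type")
    if attrs.get("ruleTypeId") != ".es-query":
        problems.append("unexpected ruleTypeId")

    # Compare the schedule interval (e.g. "5m") with the query time window (5, "m").
    interval = attrs.get("schedule", {}).get("interval", "")
    match = re.fullmatch(r"(\d+)([smhd])", interval)
    window = (params.get("timeWindowSize"), params.get("timeWindowUnit"))
    if not match or (int(match.group(1)), match.group(2)) != window:
        problems.append("schedule interval differs from time window")

    # An empty query string would make the rule meaningless.
    if not params.get("esqlQuery", {}).get("esql", "").strip():
        problems.append("empty ES|QL query")
    return problems
```

An empty list means the template passed all four checks. The checks are deliberately narrow; they only cover properties visible in the files of this diff.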

@@ -0,0 +1,34 @@
{
"id": "aws-bedrock-agentcore-code-interpreter-errors",
"type": "alerting_rule_template",
"attributes": {
"name": "[AWS Bedrock AgentCore] Code interpreter errors",
"tags": ["AWS", "Amazon Bedrock AgentCore", "Code Interpreter"],
"ruleTypeId": ".es-query",
"schedule": {
"interval": "5m"
},
"params": {
"searchType": "esqlQuery",
"timeWindowSize": 5,
"timeWindowUnit": "m",
"esqlQuery": {
"esql": "// Alert triggers when errors occur during AWS Bedrock AgentCore Code Interpreter operations.\n//\n// Code Interpreter enables agents to execute code within secure, isolated sessions.\n// This alert monitors both user errors (client-side, 4xx) and system errors (server-side, 5xx)\n// across all Code Interpreter operations including:\n// - StartCodeInterpreterSession: Initiates a new code execution session\n// - InvokeCodeInterpreter: Executes code within an active session\n// - StopCodeInterpreterSession: Terminates an active session\n// - CodeInterpreterSession: Session lifecycle metrics\n//\n// The alert is grouped by cloud account, region, and resource to pinpoint the\n// specific Code Interpreter experiencing issues.\n//\n// To reduce alert noise, increase the threshold (e.g., `total_errors > 5`) to only\n// alert on sustained error patterns rather than isolated incidents.\n// For more details, see: https://docs.aws.amazon.com/bedrock-agentcore/latest/devguide/code-interpreter-observability.html\n\nFROM metrics-aws_bedrock_agentcore.metrics-*\n| WHERE aws.dimensions.Operation IN (\"StartCodeInterpreterSession\", \"InvokeCodeInterpreter\", \"StopCodeInterpreterSession\", \"CodeInterpreterSession\")\n| STATS total_user_errors = sum(aws.bedrock_agentcore.metrics.UserErrors.sum), total_system_errors = sum(aws.bedrock_agentcore.metrics.SystemErrors.sum) BY cloud.account.id, cloud.region, aws.dimensions.Resource\n| EVAL total_errors = COALESCE(total_user_errors, TO_LONG(0)) + COALESCE(total_system_errors, TO_LONG(0))\n| WHERE total_errors > 0"
},
"groupBy": "row",
"timeField": "event.ingested"
},
"alertDelay": {
"active": 1
}
},
"managed": true,
"coreMigrationVersion": "8.8.0",
"typeMigrationVersion": "10.1.0"
}

@@ -0,0 +1,34 @@
{
"id": "aws-bedrock-agentcore-code-interpreter-high-latency",
"type": "alerting_rule_template",
"attributes": {
"name": "[AWS Bedrock AgentCore] Code interpreter high latency",
"tags": ["AWS", "Amazon Bedrock AgentCore", "Code Interpreter"],
"ruleTypeId": ".es-query",
"schedule": {
"interval": "5m"
},
"params": {
"searchType": "esqlQuery",
"timeWindowSize": 5,
"timeWindowUnit": "m",
"esqlQuery": {
"esql": "// Alert triggers when high execution duration is detected during AWS Bedrock AgentCore Code Interpreter operations.\n//\n// Code Interpreter executes code within secure, isolated sessions. High duration\n// can indicate complex computations, resource constraints, or inefficient code.\n//\n// Duration measures the average execution time for code interpreter operations\n// in milliseconds.\n//\n// High latency in code execution can impact:\n// - Agent response times\n// - User experience\n// - Session timeouts\n//\n// The alert is grouped by cloud account, region, and resource.\n//\n// To adjust sensitivity, change the threshold (default: 30000ms = 30 seconds).\n// For more details, see: https://docs.aws.amazon.com/bedrock-agentcore/latest/devguide/code-interpreter-observability.html\n\nFROM metrics-aws_bedrock_agentcore.metrics-*\n| WHERE aws.dimensions.Operation IN (\"InvokeCodeInterpreter\", \"CodeInterpreterSession\")\n| STATS avg_duration_ms = avg(aws.bedrock_agentcore.metrics.Duration.avg) BY cloud.account.id, cloud.region, aws.dimensions.Resource\n| WHERE avg_duration_ms > 30000"
},
"groupBy": "row",
"timeField": "event.ingested"
},
"alertDelay": {
"active": 1
}
},
"managed": true,
"coreMigrationVersion": "8.8.0",
"typeMigrationVersion": "10.1.0"
}
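
One thing worth keeping in mind with this query: `avg(...Duration.avg)` averages the per-document averages, so every document counts equally regardless of how many invocations it summarizes. The Python sketch below uses made-up sample numbers and hypothetical names to show how that can differ from an invocation-weighted mean. Whether the difference matters in practice depends on how uneven traffic is across collection periods.

```python
from statistics import mean

# Hypothetical per-period samples: (average duration in ms, number of invocations).
samples = [(70000.0, 1), (1000.0, 99)]

# What avg(Duration.avg) computes: each period counts equally.
avg_of_avgs = mean(duration for duration, _ in samples)

# Invocation-weighted mean, for comparison.
total_invocations = sum(count for _, count in samples)
weighted_avg = sum(duration * count for duration, count in samples) / total_invocations

THRESHOLD_MS = 30000  # same default as the template
print(avg_of_avgs > THRESHOLD_MS)   # does the unweighted mean cross the threshold?
print(weighted_avg > THRESHOLD_MS)  # does the weighted mean cross it?
```

In this contrived case a single slow, low-traffic period pushes the unweighted mean over the threshold while the weighted mean stays far below it.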

@@ -0,0 +1,34 @@
{
"id": "aws-bedrock-agentcore-code-interpreter-throttles",
"type": "alerting_rule_template",
"attributes": {
"name": "[AWS Bedrock AgentCore] Code interpreter throttles",
"tags": ["AWS", "Amazon Bedrock AgentCore", "Code Interpreter"],
"ruleTypeId": ".es-query",
"schedule": {
"interval": "5m"
},
"params": {
"searchType": "esqlQuery",
"timeWindowSize": 5,
"timeWindowUnit": "m",
"esqlQuery": {
"esql": "// Alert triggers when throttling occurs during AWS Bedrock AgentCore Code Interpreter operations.\n//\n// Code Interpreter enables agents to execute code within secure, isolated sessions.\n// Throttling indicates that code execution requests are being rate-limited.\n//\n// Operations monitored:\n// - StartCodeInterpreterSession: Session creation throttles\n// - InvokeCodeInterpreter: Code execution throttles\n// - StopCodeInterpreterSession: Session termination throttles\n//\n// Common causes of throttling:\n// - High volume of code execution requests\n// - Concurrent session limits exceeded\n// - Compute resource constraints\n//\n// The alert is grouped by cloud account, region, and resource.\n//\n// To reduce alert noise, increase the threshold (e.g., `total_throttles > 10`).\n// For more details, see: https://docs.aws.amazon.com/bedrock-agentcore/latest/devguide/code-interpreter-observability.html\n\nFROM metrics-aws_bedrock_agentcore.metrics-*\n| WHERE aws.dimensions.Operation IN (\"StartCodeInterpreterSession\", \"InvokeCodeInterpreter\", \"StopCodeInterpreterSession\")\n| STATS total_throttles = sum(aws.bedrock_agentcore.metrics.Throttles.sum) BY cloud.account.id, cloud.region, aws.dimensions.Resource\n| WHERE total_throttles > 0"
},
"groupBy": "row",
"timeField": "event.ingested"
},
"alertDelay": {
"active": 1
}
},
"managed": true,
"coreMigrationVersion": "8.8.0",
"typeMigrationVersion": "10.1.0"
}

@@ -0,0 +1,34 @@
{
"id": "aws-bedrock-agentcore-gateway-errors",
"type": "alerting_rule_template",
"attributes": {
"name": "[AWS Bedrock AgentCore] Gateway errors",
"tags": ["AWS", "Amazon Bedrock AgentCore", "Gateway"],
"ruleTypeId": ".es-query",
"schedule": {
"interval": "5m"
},
"params": {
"searchType": "esqlQuery",
"timeWindowSize": 5,
"timeWindowUnit": "m",
"esqlQuery": {
"esql": "// Alert triggers when errors occur during AWS Bedrock AgentCore Gateway invocations.\n//\n// Gateway provides a unified API endpoint that enables agents to securely connect\n// to enterprise tools and external resources. It acts as a proxy that handles\n// authentication, authorization, and routing of requests.\n//\n// This alert monitors both:\n// - User errors (4xx): Client-side errors like invalid requests, unauthorized access\n// - System errors (5xx): Server-side errors indicating infrastructure issues\n//\n// The alert is grouped by cloud account, region, and resource to pinpoint the\n// specific gateway experiencing issues.\n//\n// To reduce alert noise, increase the threshold (e.g., `total_errors > 5`) to only\n// alert on sustained error patterns.\n// For more details, see: https://docs.aws.amazon.com/bedrock-agentcore/latest/devguide/observability-gateway-metrics.html\n\nFROM metrics-aws_bedrock_agentcore.metrics-*\n| WHERE aws.dimensions.Operation == \"InvokeGateway\"\n| STATS total_user_errors = sum(aws.bedrock_agentcore.metrics.UserErrors.sum), total_system_errors = sum(aws.bedrock_agentcore.metrics.SystemErrors.sum) BY cloud.account.id, cloud.region, aws.dimensions.Resource\n| EVAL total_errors = COALESCE(total_user_errors, TO_LONG(0)) + COALESCE(total_system_errors, TO_LONG(0))\n| WHERE total_errors > 0"
},
"groupBy": "row",
"timeField": "event.ingested"
},
"alertDelay": {
"active": 1
}
},
"managed": true,
"coreMigrationVersion": "8.8.0",
"typeMigrationVersion": "10.1.0"
}

@@ -0,0 +1,34 @@
{
"id": "aws-bedrock-agentcore-gateway-high-latency",
"type": "alerting_rule_template",
"attributes": {
"name": "[AWS Bedrock AgentCore] Gateway high latency",
"tags": ["AWS", "Amazon Bedrock AgentCore", "Gateway"],
"ruleTypeId": ".es-query",
"schedule": {
"interval": "5m"
},
"params": {
"searchType": "esqlQuery",
"timeWindowSize": 5,
"timeWindowUnit": "m",
"esqlQuery": {
"esql": "// Alert triggers when high latency is detected during AWS Bedrock AgentCore Gateway invocations.\n//\n// Gateway serves as a unified API endpoint that enables agents to securely connect to\n// enterprise tools and resources. High latency can indicate network issues, slow\n// downstream services, or resource constraints.\n//\n// Latency measures the average time elapsed between receiving the gateway request\n// and returning the response, measured in milliseconds.\n//\n// The alert is grouped by cloud account, region, and resource to pinpoint the\n// specific gateway experiencing high latency.\n//\n// To adjust sensitivity, change the threshold in the WHERE clause (default: 5000ms = 5 seconds).\n// For more details, see: https://docs.aws.amazon.com/bedrock-agentcore/latest/devguide/observability-gateway-metrics.html\n\nFROM metrics-aws_bedrock_agentcore.metrics-*\n| WHERE aws.dimensions.Operation == \"InvokeGateway\"\n| STATS avg_latency_ms = avg(aws.bedrock_agentcore.metrics.Latency.avg) BY cloud.account.id, cloud.region, aws.dimensions.Resource\n| WHERE avg_latency_ms > 5000"
},
"groupBy": "row",
"timeField": "event.ingested"
},
"alertDelay": {
"active": 1
}
},
"managed": true,
"coreMigrationVersion": "8.8.0",
"typeMigrationVersion": "10.1.0"
}

@@ -0,0 +1,34 @@
{
"id": "aws-bedrock-agentcore-gateway-throttles",
"type": "alerting_rule_template",
"attributes": {
"name": "[AWS Bedrock AgentCore] Gateway throttles",
"tags": ["AWS", "Amazon Bedrock AgentCore", "Gateway"],
"ruleTypeId": ".es-query",
"schedule": {
"interval": "5m"
},
"params": {
"searchType": "esqlQuery",
"timeWindowSize": 5,
"timeWindowUnit": "m",
"esqlQuery": {
"esql": "// Alert triggers when throttling occurs during AWS Bedrock AgentCore Gateway invocations.\n//\n// Gateway throttling indicates that requests are being rate-limited due to exceeding\n// allowed TPS (transactions per second) or quota limits. This can impact agent\n// performance and user experience.\n//\n// Common causes of throttling:\n// - High request volume exceeding service quotas\n// - Burst traffic patterns\n// - Insufficient provisioned capacity\n//\n// The alert is grouped by cloud account, region, and resource to identify\n// which gateways are being throttled.\n//\n// To reduce alert noise, increase the threshold (e.g., `total_throttles > 10`).\n// Consider requesting a service quota increase if sustained throttling occurs.\n// For more details, see: https://docs.aws.amazon.com/bedrock-agentcore/latest/devguide/observability-gateway-metrics.html\n\nFROM metrics-aws_bedrock_agentcore.metrics-*\n| WHERE aws.dimensions.Operation == \"InvokeGateway\"\n| STATS total_throttles = sum(aws.bedrock_agentcore.metrics.Throttles.sum) BY cloud.account.id, cloud.region, aws.dimensions.Resource\n| WHERE total_throttles > 0"
},
"groupBy": "row",
"timeField": "event.ingested"
},
"alertDelay": {
"active": 1
}
},
"managed": true,
"coreMigrationVersion": "8.8.0",
"typeMigrationVersion": "10.1.0"
}
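
The query comments suggest raising the threshold to reduce noise. Because the threshold lives in the last `WHERE` clause of the embedded ES|QL string, one way to derive a quieter variant is a small text rewrite. This is only a sketch: `raise_threshold` is a hypothetical helper, and it assumes the query ends with `| WHERE <name> > <integer>`, as the templates in this diff do.

```python
import re


def raise_threshold(esql: str, column: str, new_value: int) -> str:
    """Replace the integer in a trailing `| WHERE <column> > N` clause."""
    pattern = re.compile(r"(\|\s*WHERE\s+" + re.escape(column) + r"\s*>\s*)\d+\s*$")
    # A function replacement avoids ambiguity between group references and digits.
    updated, count = pattern.subn(lambda m: m.group(1) + str(new_value), esql)
    if count != 1:
        raise ValueError(f"no trailing threshold on {column!r} found")
    return updated


query = (
    "FROM metrics-aws_bedrock_agentcore.metrics-*\n"
    '| WHERE aws.dimensions.Operation == "InvokeGateway"\n'
    "| STATS total_throttles = sum(aws.bedrock_agentcore.metrics.Throttles.sum) "
    "BY cloud.account.id, cloud.region, aws.dimensions.Resource\n"
    "| WHERE total_throttles > 0"
)
print(raise_threshold(query, "total_throttles", 10))
```

The earlier `WHERE` on `aws.dimensions.Operation` is left alone because the pattern is tied to the named column and anchored to the end of the string.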

@@ -0,0 +1,29 @@
{
"id": "aws-bedrock-agentcore-identity-throttles",
"type": "alerting_rule_template",
"attributes": {
"name": "[AWS Bedrock AgentCore] Identity throttles",
"tags": ["AWS", "Amazon Bedrock AgentCore", "Identity"],
"ruleTypeId": ".es-query",
"schedule": {
"interval": "5m"
},
"params": {
"searchType": "esqlQuery",
"timeWindowSize": 5,
"timeWindowUnit": "m",
"esqlQuery": {
"esql": "// Alert triggers when throttling occurs during AWS Bedrock AgentCore Identity operations.\n//\n// Identity service handles authentication and token management for agents.\n// Throttling in identity operations can indicate:\n// - Workload access token fetch throttles: Rate limiting on workload identity tokens\n// - Resource access token fetch throttles: Rate limiting on OAuth2 token fetches\n// - API key fetch throttles: Rate limiting on API key retrievals\n//\n// Token fetch throttling can prevent agents from accessing protected resources\n// and may cause cascading failures in agent workflows.\n//\n// The alert is grouped by cloud account and region to identify affected environments.\n//\n// To reduce alert noise, increase the threshold (e.g., `total_throttles > 10`).\n\nFROM metrics-aws_bedrock_agentcore.metrics-*\n| STATS workload_throttles = sum(aws.bedrock_agentcore.metrics.WorkloadAccessTokenFetchThrottles.sum), resource_throttles = sum(aws.bedrock_agentcore.metrics.ResourceAccessTokenFetchThrottles.sum), apikey_throttles = sum(aws.bedrock_agentcore.metrics.ApiKeyFetchThrottles.sum) BY cloud.account.id, cloud.region\n| EVAL total_throttles = COALESCE(workload_throttles, TO_LONG(0)) + COALESCE(resource_throttles, TO_LONG(0)) + COALESCE(apikey_throttles, TO_LONG(0))\n| WHERE total_throttles > 0"
},
"groupBy": "row",
"timeField": "event.ingested"
},
"alertDelay": {
"active": 1
}
},
"managed": true,
"coreMigrationVersion": "8.8.0",
"typeMigrationVersion": "10.1.0"
}
