Merged
Changes from 3 commits
5 changes: 4 additions & 1 deletion packages/aws_bedrock_agentcore/_dev/build/docs/README.md
@@ -64,4 +64,7 @@ The metrics include the following dimensions for enhanced filtering and analysis
- `SessionId`: The session identifier for agent invocations

{{event "metrics"}}
{{fields "metrics"}}
{{fields "metrics"}}

## Alerting Rule Template
{{alertRuleTemplates}}
5 changes: 5 additions & 0 deletions packages/aws_bedrock_agentcore/changelog.yml
@@ -1,4 +1,9 @@
# newer versions go on top
- version: "0.3.0"
  changes:
    - description: Add alerting rule templates for the gateway, identity, memory, browser tool and code interpreter.
      type: enhancement
      link: https://github.com/elastic/integrations/pull/16705
- version: "0.2.0"
  changes:
    - description: Add browser tool, and code interpreter metrics with dashboards.
71 changes: 71 additions & 0 deletions packages/aws_bedrock_agentcore/docs/README.md
@@ -194,3 +194,74 @@ An example event for `metrics` looks as following:
| data_stream.namespace | Data stream namespace. | constant_keyword | | |
| data_stream.type | Data stream type. | constant_keyword | | |
| event.module | Name of the module this data is coming from. If your monitoring agent supports the concept of modules or plugins to process events of a given source (e.g. Apache logs), `event.module` should contain the name of this module. | constant_keyword | | |


## Alerting Rule Template
Alert rule templates provide pre-defined configurations for creating alert rules in Kibana.

For more information, refer to the [Elastic documentation](https://www.elastic.co/docs/reference/fleet/alerting-rule-templates).

Alert rule templates require Elastic Stack version 9.2.0 or later.

The following alert rule templates are available:

**[AWS Bedrock AgentCore] Agent runtime high latency**



**[AWS Bedrock AgentCore] Agent runtime system errors**



**[AWS Bedrock AgentCore] Agent runtime user errors**



**[AWS Bedrock AgentCore] Browser errors**



**[AWS Bedrock AgentCore] Browser session throttles**



**[AWS Bedrock AgentCore] Code interpreter errors**



**[AWS Bedrock AgentCore] Code interpreter high latency**



**[AWS Bedrock AgentCore] Code interpreter throttles**



**[AWS Bedrock AgentCore] Gateway errors**



**[AWS Bedrock AgentCore] Gateway high latency**



**[AWS Bedrock AgentCore] Gateway throttles**



**[AWS Bedrock AgentCore] Identity throttles**



**[AWS Bedrock AgentCore] Identity token fetch failures**



**[AWS Bedrock AgentCore] Memory errors**



**[AWS Bedrock AgentCore] Memory high latency**



@@ -3,7 +3,7 @@
"type": "alerting_rule_template",
"attributes": {
"name": "[AWS Bedrock AgentCore] Agent runtime high latency",
"tags": ["AWS", "Amazon Bedrock AgentCore", "Agent runtime"],
"tags": ["AWS", "Amazon Bedrock AgentCore"],
"ruleTypeId": ".es-query",
"schedule": {
"interval": "5m"
@@ -3,7 +3,7 @@
"type": "alerting_rule_template",
"attributes": {
"name": "[AWS Bedrock AgentCore] Agent runtime system errors",
"tags": ["AWS", "Amazon Bedrock AgentCore", "Agent runtime"],
"tags": ["AWS", "Amazon Bedrock AgentCore"],
"ruleTypeId": ".es-query",
"schedule": {
"interval": "5m"
@@ -3,7 +3,7 @@
"type": "alerting_rule_template",
"attributes": {
"name": "[AWS Bedrock AgentCore] Agent runtime user errors",
"tags": ["AWS", "Amazon Bedrock AgentCore", "Agent runtime"],
"tags": ["AWS", "Amazon Bedrock AgentCore"],
"ruleTypeId": ".es-query",
"schedule": {
"interval": "5m"
@@ -0,0 +1,34 @@
{
"id": "aws-bedrock-agentcore-browser-errors",
"type": "alerting_rule_template",
"attributes": {
"name": "[AWS Bedrock AgentCore] Browser errors",
"tags": ["AWS", "Amazon Bedrock AgentCore"],
"ruleTypeId": ".es-query",
"schedule": {
"interval": "5m"
},
"params": {
"searchType": "esqlQuery",
"timeWindowSize": 5,
"timeWindowUnit": "m",
"esqlQuery": {
"esql": "// Alert triggers when errors occur during AWS Bedrock AgentCore Browser operations.\n//\n// Browser tool enables agents to interact with web pages programmatically,\n// automating web-based tasks. Operations include:\n// - StartBrowserSession: Initiates a new browser automation session\n// - StopBrowserSession: Terminates a browser session\n// - ConnectBrowserAutomationStream: Establishes automation stream connection\n// - ConnectBrowserLiveViewStream: Establishes live view stream\n// - GetBrowserSession/ListBrowserSessions: Session management operations\n//\n// This alert monitors both user errors (4xx) and system errors (5xx).\n//\n// The alert is grouped by cloud account, region, and resource to pinpoint\n// specific browser sessions experiencing issues.\n//\n// To reduce alert noise, increase the threshold (e.g., `total_errors > 5`).\n// For more details, see: https://docs.aws.amazon.com/bedrock-agentcore/latest/devguide/browser.html\n\nFROM metrics-aws_bedrock_agentcore.metrics-*\n| WHERE aws.dimensions.Operation IN (\"StartBrowserSession\", \"StopBrowserSession\", \"ConnectBrowserAutomationStream\", \"ConnectBrowserLiveViewStream\", \"GetBrowserSession\", \"ListBrowserSessions\", \"UpdateBrowserStream\")\n| STATS total_user_errors = sum(aws.bedrock_agentcore.metrics.UserErrors.sum), total_system_errors = sum(aws.bedrock_agentcore.metrics.SystemErrors.sum) BY cloud.account.id, cloud.region, aws.dimensions.Resource\n| EVAL total_errors = COALESCE(total_user_errors, TO_LONG(0)) + COALESCE(total_system_errors, TO_LONG(0))\n| WHERE total_errors > 0"

Review comment (Contributor) on the line `For more details, see: https://docs.aws.amazon.com/bedrock-agentcore/latest/devguide/browser.html`:

Do we need to link the documentation on metrics here? A link is useful when the docs have a topic related to alerting rule metrics, but I don't see any mention of alerting metrics. WDYT?

Reply (Contributor Author):

Addressed by pointing to the correct URL.

},
"groupBy": "row",
"timeField": "event.ingested"
},
"alertDelay": {
"active": 1
}
},
"managed": true,
"coreMigrationVersion": "8.8.0",
"typeMigrationVersion": "10.1.0"
}
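
The template above follows the same layout as the other new files in this PR: rule metadata under `attributes`, and the query, time window, and grouping under `attributes.params`. As a minimal sketch of how those fields relate, the Python below parses an abbreviated copy of the template and summarizes its schedule and lookback window. The `summarize_template` helper and the shortened `esql` placeholder are illustrative assumptions, not part of the package or of any Kibana API.

```python
import json

# Abbreviated copy of the browser-errors template. The ES|QL text is a
# shortened placeholder for readability, not the full query from the package.
TEMPLATE_JSON = """
{
  "id": "aws-bedrock-agentcore-browser-errors",
  "type": "alerting_rule_template",
  "attributes": {
    "name": "[AWS Bedrock AgentCore] Browser errors",
    "tags": ["AWS", "Amazon Bedrock AgentCore"],
    "ruleTypeId": ".es-query",
    "schedule": {"interval": "5m"},
    "params": {
      "searchType": "esqlQuery",
      "timeWindowSize": 5,
      "timeWindowUnit": "m",
      "esqlQuery": {"esql": "FROM metrics-aws_bedrock_agentcore.metrics-* | LIMIT 1"},
      "groupBy": "row",
      "timeField": "event.ingested"
    },
    "alertDelay": {"active": 1}
  },
  "managed": true
}
"""


def summarize_template(raw: str) -> dict:
    """Hypothetical helper: pull out the fields a reviewer usually checks."""
    template = json.loads(raw)
    attrs = template["attributes"]
    params = attrs["params"]
    return {
        "id": template["id"],
        "name": attrs["name"],
        "rule_type": attrs["ruleTypeId"],
        # How often the rule runs.
        "check_every": attrs["schedule"]["interval"],
        # How far back each run looks, e.g. 5 + "m" -> "5m".
        "lookback": f'{params["timeWindowSize"]}{params["timeWindowUnit"]}',
        "group_by": params["groupBy"],
        "time_field": params["timeField"],
    }


summary = summarize_template(TEMPLATE_JSON)
for key, value in summary.items():
    print(f"{key}: {value}")
```

In every template added here the schedule interval and the lookback window are both five minutes, so each run evaluates the documents ingested since roughly the previous run.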






@@ -0,0 +1,34 @@
{
"id": "aws-bedrock-agentcore-browser-session-throttles",
"type": "alerting_rule_template",
"attributes": {
"name": "[AWS Bedrock AgentCore] Browser session throttles",
"tags": ["AWS", "Amazon Bedrock AgentCore"],
"ruleTypeId": ".es-query",
"schedule": {
"interval": "5m"
},
"params": {
"searchType": "esqlQuery",
"timeWindowSize": 5,
"timeWindowUnit": "m",
"esqlQuery": {
"esql": "// Alert triggers when throttling occurs during AWS Bedrock AgentCore Browser operations.\n//\n// Browser session throttling indicates that requests are being rate-limited,\n// which can impact an agent's ability to perform web automation tasks.\n//\n// Common causes of browser throttling:\n// - High volume of browser session requests\n// - Concurrent session limits exceeded\n// - Resource quota constraints\n//\n// The alert is grouped by cloud account, region, and resource to identify\n// which browser resources are being throttled.\n//\n// To reduce alert noise, increase the threshold (e.g., `total_throttles > 10`).\n// For more details, see: https://docs.aws.amazon.com/bedrock-agentcore/latest/devguide/browser.html\n\nFROM metrics-aws_bedrock_agentcore.metrics-*\n| WHERE aws.dimensions.Operation IN (\"StartBrowserSession\", \"StopBrowserSession\", \"ConnectBrowserAutomationStream\", \"ConnectBrowserLiveViewStream\")\n| STATS total_throttles = sum(aws.bedrock_agentcore.metrics.Throttles.sum) BY cloud.account.id, cloud.region, aws.dimensions.Resource\n| WHERE total_throttles > 0"
},
"groupBy": "row",
"timeField": "event.ingested"
},
"alertDelay": {
"active": 1
}
},
"managed": true,
"coreMigrationVersion": "8.8.0",
"typeMigrationVersion": "10.1.0"
}






@@ -0,0 +1,34 @@
{
"id": "aws-bedrock-agentcore-code-interpreter-errors",
"type": "alerting_rule_template",
"attributes": {
"name": "[AWS Bedrock AgentCore] Code interpreter errors",
"tags": ["AWS", "Amazon Bedrock AgentCore"],
"ruleTypeId": ".es-query",
"schedule": {
"interval": "5m"
},
"params": {
"searchType": "esqlQuery",
"timeWindowSize": 5,
"timeWindowUnit": "m",
"esqlQuery": {
"esql": "// Alert triggers when errors occur during AWS Bedrock AgentCore Code Interpreter operations.\n//\n// Code Interpreter enables agents to execute code within secure, isolated sessions.\n// This alert monitors both user errors (client-side, 4xx) and system errors (server-side, 5xx)\n// across all Code Interpreter operations including:\n// - StartCodeInterpreterSession: Initiates a new code execution session\n// - InvokeCodeInterpreter: Executes code within an active session\n// - StopCodeInterpreterSession: Terminates an active session\n// - CodeInterpreterSession: Session lifecycle metrics\n//\n// The alert is grouped by cloud account, region, and resource to pinpoint the\n// specific Code Interpreter experiencing issues.\n//\n// To reduce alert noise, increase the threshold (e.g., `total_errors > 5`) to only\n// alert on sustained error patterns rather than isolated incidents.\n// For more details, see: https://docs.aws.amazon.com/bedrock-agentcore/latest/devguide/code-interpreter-observability.html\n\nFROM metrics-aws_bedrock_agentcore.metrics-*\n| WHERE aws.dimensions.Operation IN (\"StartCodeInterpreterSession\", \"InvokeCodeInterpreter\", \"StopCodeInterpreterSession\", \"CodeInterpreterSession\")\n| STATS total_user_errors = sum(aws.bedrock_agentcore.metrics.UserErrors.sum), total_system_errors = sum(aws.bedrock_agentcore.metrics.SystemErrors.sum) BY cloud.account.id, cloud.region, aws.dimensions.Resource\n| EVAL total_errors = COALESCE(total_user_errors, TO_LONG(0)) + COALESCE(total_system_errors, TO_LONG(0))\n| WHERE total_errors > 0"
},
"groupBy": "row",
"timeField": "event.ingested"
},
"alertDelay": {
"active": 1
}
},
"managed": true,
"coreMigrationVersion": "8.8.0",
"typeMigrationVersion": "10.1.0"
}






@@ -0,0 +1,34 @@
{
"id": "aws-bedrock-agentcore-code-interpreter-high-latency",
"type": "alerting_rule_template",
"attributes": {
"name": "[AWS Bedrock AgentCore] Code interpreter high latency",
"tags": ["AWS", "Amazon Bedrock AgentCore"],
"ruleTypeId": ".es-query",
"schedule": {
"interval": "5m"
},
"params": {
"searchType": "esqlQuery",
"timeWindowSize": 5,
"timeWindowUnit": "m",
"esqlQuery": {
"esql": "// Alert triggers when high execution duration is detected during AWS Bedrock AgentCore Code Interpreter operations.\n//\n// Code Interpreter executes code within secure, isolated sessions. High duration\n// can indicate complex computations, resource constraints, or inefficient code.\n//\n// Duration measures the average execution time for code interpreter operations\n// in milliseconds.\n//\n// High latency in code execution can impact:\n// - Agent response times\n// - User experience\n// - Session timeouts\n//\n// The alert is grouped by cloud account, region, and resource.\n//\n// To adjust sensitivity, change the threshold (default: 30000ms = 30 seconds).\n// For more details, see: https://docs.aws.amazon.com/bedrock-agentcore/latest/devguide/code-interpreter-observability.html\n\nFROM metrics-aws_bedrock_agentcore.metrics-*\n| WHERE aws.dimensions.Operation IN (\"InvokeCodeInterpreter\", \"CodeInterpreterSession\")\n| STATS avg_duration_ms = avg(aws.bedrock_agentcore.metrics.Duration.avg) BY cloud.account.id, cloud.region, aws.dimensions.Resource\n| WHERE avg_duration_ms > 30000"
},
"groupBy": "row",
"timeField": "event.ingested"
},
"alertDelay": {
"active": 1
}
},
"managed": true,
"coreMigrationVersion": "8.8.0",
"typeMigrationVersion": "10.1.0"
}






@@ -0,0 +1,34 @@
{
"id": "aws-bedrock-agentcore-code-interpreter-throttles",
"type": "alerting_rule_template",
"attributes": {
"name": "[AWS Bedrock AgentCore] Code interpreter throttles",
"tags": ["AWS", "Amazon Bedrock AgentCore"],
"ruleTypeId": ".es-query",
"schedule": {
"interval": "5m"
},
"params": {
"searchType": "esqlQuery",
"timeWindowSize": 5,
"timeWindowUnit": "m",
"esqlQuery": {
"esql": "// Alert triggers when throttling occurs during AWS Bedrock AgentCore Code Interpreter operations.\n//\n// Code Interpreter enables agents to execute code within secure, isolated sessions.\n// Throttling indicates that code execution requests are being rate-limited.\n//\n// Operations monitored:\n// - StartCodeInterpreterSession: Session creation throttles\n// - InvokeCodeInterpreter: Code execution throttles\n// - StopCodeInterpreterSession: Session termination throttles\n//\n// Common causes of throttling:\n// - High volume of code execution requests\n// - Concurrent session limits exceeded\n// - Compute resource constraints\n//\n// The alert is grouped by cloud account, region, and resource.\n//\n// To reduce alert noise, increase the threshold (e.g., `total_throttles > 10`).\n// For more details, see: https://docs.aws.amazon.com/bedrock-agentcore/latest/devguide/code-interpreter-observability.html\n\nFROM metrics-aws_bedrock_agentcore.metrics-*\n| WHERE aws.dimensions.Operation IN (\"StartCodeInterpreterSession\", \"InvokeCodeInterpreter\", \"StopCodeInterpreterSession\")\n| STATS total_throttles = sum(aws.bedrock_agentcore.metrics.Throttles.sum) BY cloud.account.id, cloud.region, aws.dimensions.Resource\n| WHERE total_throttles > 0"
},
"groupBy": "row",
"timeField": "event.ingested"
},
"alertDelay": {
"active": 1
}
},
"managed": true,
"coreMigrationVersion": "8.8.0",
"typeMigrationVersion": "10.1.0"
}






@@ -0,0 +1,34 @@
{
"id": "aws-bedrock-agentcore-gateway-errors",
"type": "alerting_rule_template",
"attributes": {
"name": "[AWS Bedrock AgentCore] Gateway errors",
"tags": ["AWS", "Amazon Bedrock AgentCore"],
"ruleTypeId": ".es-query",
"schedule": {
"interval": "5m"
},
"params": {
"searchType": "esqlQuery",
"timeWindowSize": 5,
"timeWindowUnit": "m",
"esqlQuery": {
"esql": "// Alert triggers when errors occur during AWS Bedrock AgentCore Gateway invocations.\n//\n// Gateway provides a unified API endpoint that enables agents to securely connect\n// to enterprise tools and external resources. It acts as a proxy that handles\n// authentication, authorization, and routing of requests.\n//\n// This alert monitors both:\n// - User errors (4xx): Client-side errors like invalid requests, unauthorized access\n// - System errors (5xx): Server-side errors indicating infrastructure issues\n//\n// The alert is grouped by cloud account, region, and resource to pinpoint the\n// specific gateway experiencing issues.\n//\n// To reduce alert noise, increase the threshold (e.g., `total_errors > 5`) to only\n// alert on sustained error patterns.\n// For more details, see: https://docs.aws.amazon.com/bedrock-agentcore/latest/devguide/observability-gateway-metrics.html\n\nFROM metrics-aws_bedrock_agentcore.metrics-*\n| WHERE aws.dimensions.Operation == \"InvokeGateway\"\n| STATS total_user_errors = sum(aws.bedrock_agentcore.metrics.UserErrors.sum), total_system_errors = sum(aws.bedrock_agentcore.metrics.SystemErrors.sum) BY cloud.account.id, cloud.region, aws.dimensions.Resource\n| EVAL total_errors = COALESCE(total_user_errors, TO_LONG(0)) + COALESCE(total_system_errors, TO_LONG(0))\n| WHERE total_errors > 0"
},
"groupBy": "row",
"timeField": "event.ingested"
},
"alertDelay": {
"active": 1
}
},
"managed": true,
"coreMigrationVersion": "8.8.0",
"typeMigrationVersion": "10.1.0"
}
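
The error templates in this PR share one aggregation shape: sum `UserErrors` and `SystemErrors` per account, region, and resource, treat a missing sum as zero via `COALESCE`, and alert when the combined total exceeds a threshold. The Python below is a rough model of that logic over in-memory rows, assuming (as the `COALESCE` in the query suggests) that a sum over only missing values comes back as null. The row keys, the sample data, and the `total_errors_by_resource` function are hypothetical names used for illustration only.

```python
from collections import defaultdict


def total_errors_by_resource(rows, threshold=0):
    """Model of the STATS / EVAL / WHERE steps of the error queries.

    rows: iterable of dicts with keys account, region, resource,
    user_errors, system_errors (the two error values may be None).
    Returns {(account, region, resource): total_errors} for groups whose
    total is strictly greater than threshold.
    """
    # Per-group sums start as None so that "no data at all" stays
    # distinguishable from "summed to zero", mirroring a null aggregate.
    sums = defaultdict(lambda: {"user_errors": None, "system_errors": None})
    for row in rows:
        key = (row["account"], row["region"], row["resource"])
        for field in ("user_errors", "system_errors"):
            value = row.get(field)
            if value is not None:
                sums[key][field] = (sums[key][field] or 0) + value

    alerts = {}
    for key, group in sums.items():
        # Equivalent of COALESCE(x, 0): a missing sum counts as zero.
        user = group["user_errors"] if group["user_errors"] is not None else 0
        system = group["system_errors"] if group["system_errors"] is not None else 0
        total = user + system
        if total > threshold:
            alerts[key] = total
    return alerts


# Made-up sample rows for two gateways in one account and region.
SAMPLE_ROWS = [
    {"account": "111111111111", "region": "us-east-1", "resource": "gateway-a",
     "user_errors": 2, "system_errors": None},
    {"account": "111111111111", "region": "us-east-1", "resource": "gateway-a",
     "user_errors": 1, "system_errors": 3},
    {"account": "111111111111", "region": "us-east-1", "resource": "gateway-b",
     "user_errors": None, "system_errors": None},
]

print(total_errors_by_resource(SAMPLE_ROWS))
print(total_errors_by_resource(SAMPLE_ROWS, threshold=6))
```

Raising `threshold` plays the same role as changing `total_errors > 0` to `total_errors > 5` in the query: isolated errors stop producing alerts, while groups with larger totals still do.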






@@ -0,0 +1,34 @@
{
"id": "aws-bedrock-agentcore-gateway-high-latency",
"type": "alerting_rule_template",
"attributes": {
"name": "[AWS Bedrock AgentCore] Gateway high latency",
"tags": ["AWS", "Amazon Bedrock AgentCore"],
"ruleTypeId": ".es-query",
"schedule": {
"interval": "5m"
},
"params": {
"searchType": "esqlQuery",
"timeWindowSize": 5,
"timeWindowUnit": "m",
"esqlQuery": {
"esql": "// Alert triggers when high latency is detected during AWS Bedrock AgentCore Gateway invocations.\n//\n// Gateway serves as a unified API endpoint that enables agents to securely connect to\n// enterprise tools and resources. High latency can indicate network issues, slow\n// downstream services, or resource constraints.\n//\n// Latency measures the average time elapsed between receiving the gateway request\n// and returning the response, measured in milliseconds.\n//\n// The alert is grouped by cloud account, region, and resource to pinpoint the\n// specific gateway experiencing high latency.\n//\n// To adjust sensitivity, change the threshold in the WHERE clause (default: 5000ms = 5 seconds).\n// For more details, see: https://docs.aws.amazon.com/bedrock-agentcore/latest/devguide/observability-gateway-metrics.html\n\nFROM metrics-aws_bedrock_agentcore.metrics-*\n| WHERE aws.dimensions.Operation == \"InvokeGateway\"\n| STATS avg_latency_ms = avg(aws.bedrock_agentcore.metrics.Latency.avg) BY cloud.account.id, cloud.region, aws.dimensions.Resource\n| WHERE avg_latency_ms > 5000"
},
"groupBy": "row",
"timeField": "event.ingested"
},
"alertDelay": {
"active": 1
}
},
"managed": true,
"coreMigrationVersion": "8.8.0",
"typeMigrationVersion": "10.1.0"
}
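
The latency templates take the mean of the per-document `Latency.avg` (or `Duration.avg`) values for each account, region, and resource, then compare that mean with a fixed millisecond threshold. The sketch below models the calculation in Python. The `high_latency_resources` function, the row keys, and the sample values are assumptions made for illustration, and skipping rows that lack a latency value is an assumption about how missing data is handled.

```python
from collections import defaultdict
from statistics import mean


def high_latency_resources(rows, threshold_ms=5000.0):
    """Model of `STATS avg(...) BY ...` followed by `WHERE avg > threshold`.

    rows: iterable of dicts with keys account, region, resource,
    latency_avg_ms (may be None). Returns the mean latency for every
    group whose mean is strictly greater than threshold_ms.
    """
    groups = defaultdict(list)
    for row in rows:
        value = row.get("latency_avg_ms")
        if value is not None:  # assumption: rows without a value are skipped
            key = (row["account"], row["region"], row["resource"])
            groups[key].append(value)

    result = {}
    for key, values in groups.items():
        avg_latency = mean(values)
        if avg_latency > threshold_ms:
            result[key] = avg_latency
    return result


# Made-up sample rows: gateway-a averages above 5 seconds, gateway-b does not.
SAMPLE_ROWS = [
    {"account": "111111111111", "region": "us-east-1", "resource": "gateway-a",
     "latency_avg_ms": 4000.0},
    {"account": "111111111111", "region": "us-east-1", "resource": "gateway-a",
     "latency_avg_ms": 7000.0},
    {"account": "111111111111", "region": "us-east-1", "resource": "gateway-b",
     "latency_avg_ms": 1200.0},
]

print(high_latency_resources(SAMPLE_ROWS))
print(high_latency_resources(SAMPLE_ROWS, threshold_ms=6000.0))
```

Because the input values are already averages, the result is a mean of means: one slow document and one fast document weigh equally regardless of how many requests each represents. That matches the query as written and is worth keeping in mind when tuning the 5000 ms (gateway) or 30000 ms (code interpreter) defaults.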





