Commit
Provide the LLM the ability to fetch all or specific container logs within a pod (#207)

- Adds a handful of tools for the LLM to get all container logs from a pod
- Tags LLM tests and runs these only once per push
nherment authored Nov 19, 2024
1 parent a81ac05 commit c0eba0d
Showing 9 changed files with 242 additions and 5 deletions.
2 changes: 1 addition & 1 deletion .github/workflows/build-and-test.yaml
@@ -39,7 +39,7 @@ jobs:
        env:
          OPENAI_API_KEY: ${{ secrets.OPENAI_API_KEY }}
        run: |
-          poetry run pytest
+          poetry run pytest -m "not llm"
      - name: Test the binary
        shell: bash
36 changes: 36 additions & 0 deletions .github/workflows/llm-evaluation.yaml
@@ -0,0 +1,36 @@
name: Evaluate LLM test cases

on: [push]

jobs:
  build:
    strategy:
      matrix:
        python-version: ["3.12"]

    runs-on: ubuntu-latest

    steps:
      - uses: actions/checkout@v2

      - name: Set up Python ${{ matrix.python-version }}
        uses: actions/setup-python@v2
        with:
          python-version: ${{ matrix.python-version }}

      - name: Install Python dependencies and build
        # if you change something here, you must also change it in .github/workflows/build-binaries-and-brew.yaml
        run: |
          python -m pip install --upgrade pip setuptools pyinstaller
          curl -sSL https://install.python-poetry.org | python3 - --version 1.4.0
          poetry config virtualenvs.create false
          poetry install --no-root
          poetry run python -m playwright install --with-deps firefox
      - name: Run tests
        shell: bash
        env:
          OPENAI_API_KEY: ${{ secrets.OPENAI_API_KEY }}
        run: |
          poetry run pytest -m "llm"
27 changes: 23 additions & 4 deletions holmes/plugins/toolsets/kubernetes.yaml
@@ -28,14 +28,35 @@ toolsets:
        description: "Run `kubectl logs --previous` on a single Kubernetes pod. Used to fetch logs for a pod that crashed and see logs from before the crash. Never give a deployment name or a resource that is not a pod."
        command: "kubectl logs {{ name}} -n {{ namespace }} --previous"

      - name: "kubectl_previous_logs_all_containers"
        description: "Run `kubectl logs --previous` on a single Kubernetes pod. Used to fetch logs for a pod that crashed and see logs from before the crash."
        command: "kubectl logs {{pod_name}} -n {{ namespace }} --previous --all-containers"

      - name: "kubectl_container_previous_logs"
        description: "Run `kubectl logs --previous` on a single container of a Kubernetes pod. Used to fetch logs for a pod that crashed and see logs from before the crash."
        command: "kubectl logs {{pod_name}} -c {{container_name}} -n {{ namespace }} --previous"

      - name: "kubectl_logs"
        description: "Run `kubectl logs` on a single Kubernetes pod. Never give a deployment name or a resource that is not a pod."
-        command: "kubectl logs {{ name}} -n {{ namespace }}"
+        command: "kubectl logs {{name}} -n {{ namespace }}"

      - name: "kubectl_logs_all_containers"
        description: "Run `kubectl logs` on all containers within a single Kubernetes pod."
        command: "kubectl logs {{pod_name}} -n {{ namespace }} --all-containers"

      - name: "kubectl_container_logs"
        description: "Run `kubectl logs` on a single container within a Kubernetes pod. This is to get the logs of a specific container in a multi-container pod."
        command: "kubectl logs {{pod_name}} -c {{container_name}} -n {{ namespace }} "


      - name: "kubectl_logs_grep"
        description: "Search for a specific term in the logs of a single Kubernetes pod. Only provide a pod name, not a deployment or other resource."
        command: "kubectl logs {{ name }} -n {{ namespace }} | grep {{ search_term }}"

      - name: "kubectl_events"
        description: "Retrieve the events for a specific Kubernetes resource. `resource_type` can be any kubernetes resource type: 'pod', 'service', 'deployment, 'job'', 'node', etc."
        command: "kubectl events --for {{resource_type}}/{{ pod_name }} -n {{ namespace }}"

- name: "kubectl_memory_requests_all_namespaces"
description: "Fetch and display memory requests for all pods across all namespaces in MiB, summing requests across multiple containers where applicable and handling binary, decimal, and millibyte units correctly."
command: |
@@ -120,9 +141,7 @@ toolsets:
print namespace, name, sum_memory(requests) " Mi";
}' | sort -k3 -nr
-      - name: "kubectl_events"
-        description: "Retrieve the events for a specific Kubernetes resource. `resource_type` can be any kubernetes resource type: 'pod', 'service', 'deployment, 'job'', 'node', etc."
-        command: "kubectl events --for {{resource_type}}/{{ pod_name }} -n {{ namespace }}"
# NOTE: this is only possible for probes with a healthz endpoint - we do this to avoid giving the LLM generic
# http GET capabilities which are more powerful than we want to expose
#- name: "check_liveness_probe"
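The tool definitions above are command templates with `{{ ... }}` placeholders. As a rough illustration of how such a template might turn into a concrete `kubectl` invocation, here is a minimal sketch. The `render_command` helper and its regex-based substitution are assumptions made for this example and are not taken from the Holmes code base, which may well use a real template engine. The template strings and parameter values are copied from this commit.

```python
import re
import shlex

# Command templates copied from the toolset diff above.
TEMPLATES = {
    "kubectl_logs_all_containers": "kubectl logs {{pod_name}} -n {{ namespace }} --all-containers",
    "kubectl_container_logs": "kubectl logs {{pod_name}} -c {{container_name}} -n {{ namespace }} ",
    "kubectl_logs_grep": "kubectl logs {{ name }} -n {{ namespace }} | grep {{ search_term }}",
}

# Matches {{name}}, {{ name }} and {{ name}} alike.
PLACEHOLDER = re.compile(r"\{\{\s*(\w+)\s*\}\}")


def render_command(template: str, params: dict) -> str:
    """Hypothetical helper: fill placeholders with shell-quoted values."""

    def substitute(match):
        key = match.group(1)
        if key not in params:
            raise KeyError(f"missing parameter: {key}")
        # Quote each value so a search term with spaces stays a single argument.
        return shlex.quote(params[key])

    return PLACEHOLDER.sub(substitute, template).strip()


params = {
    "pod_name": "customer-orders-67889fd856-k94k7",
    "container_name": "fastapi-app",
    "namespace": "default",
}
print(render_command(TEMPLATES["kubectl_container_logs"], params))
```

Shell-quoting the values is a design choice of this sketch; it happens to reproduce the `grep 'render time'` form recorded in the `kubectl_logs_grep` fixture of this commit.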
4 changes: 4 additions & 0 deletions pytest.ini
@@ -0,0 +1,4 @@

[pytest]
markers =
    llm: Evaluate LLM behaviour (prompt, tools, etc.)
@@ -0,0 +1,100 @@
{"toolset_name":"kubernetes/core","tool_name":"kubectl_container_logs","match_params":{"pod_name":"customer-orders-67889fd856-k94k7","container_name":"fastapi-app","namespace":"default"}}
stdout:
INFO: Started server process [1]
INFO: Waiting for application startup.
INFO: Application startup complete.
INFO: Uvicorn running on http://0.0.0.0:8000 (Press CTRL+C to quit)
INFO: 10.244.0.16:46364 - "GET /metrics HTTP/1.1" 200 OK
INFO: 10.244.0.16:33610 - "GET /metrics HTTP/1.1" 200 OK
INFO: 10.244.0.16:47000 - "GET /metrics HTTP/1.1" 200 OK
INFO: 10.244.0.16:53562 - "GET /metrics HTTP/1.1" 200 OK
INFO:app:Received request for checkout page.
INFO:app:Connecting to promotions database to see if we should try to upsell user
INFO:app:Connecting to database at None
INFO:app:Fetching data using stored procedure: sp_CheckUserNotifications
INFO: 10.244.0.16:59206 - "GET /metrics HTTP/1.1" 200 OK
INFO:app:Database call completed in 8.01 seconds.
INFO:app:Promotions result: True
INFO:app:Page rendered in 8.01 seconds.
INFO: 127.0.0.1:34748 - "GET / HTTP/1.1" 200 OK
INFO: 10.244.0.16:56156 - "GET /metrics HTTP/1.1" 200 OK
INFO: 10.244.0.16:41600 - "GET /metrics HTTP/1.1" 200 OK
INFO: 10.244.0.16:35976 - "GET /metrics HTTP/1.1" 200 OK
INFO: 10.244.0.16:35584 - "GET /metrics HTTP/1.1" 200 OK
INFO:app:Received request for checkout page.
INFO:app:Connecting to promotions database to see if we should try to upsell user
INFO:app:Connecting to database at None
INFO:app:Fetching data using stored procedure: sp_CheckUserNotifications
INFO:app:Database call completed in 7.00 seconds.
INFO:app:Promotions result: True
INFO:app:Page rendered in 7.00 seconds.
INFO: 127.0.0.1:59258 - "GET / HTTP/1.1" 200 OK
INFO: 10.244.0.16:39944 - "GET /metrics HTTP/1.1" 200 OK
INFO: 10.244.0.16:39850 - "GET /metrics HTTP/1.1" 200 OK
INFO: 10.244.0.16:55216 - "GET /metrics HTTP/1.1" 200 OK
INFO: 10.244.0.16:51152 - "GET /metrics HTTP/1.1" 200 OK
INFO:app:Received request for checkout page.
INFO:app:Connecting to promotions database to see if we should try to upsell user
INFO:app:Connecting to database at None
INFO:app:Fetching data using stored procedure: sp_CheckUserNotifications
INFO: 10.244.0.16:47072 - "GET /metrics HTTP/1.1" 200 OK
INFO:app:Database call completed in 9.00 seconds.
INFO:app:Promotions result: True
INFO:app:Page rendered in 9.00 seconds.
INFO: 127.0.0.1:39504 - "GET / HTTP/1.1" 200 OK
INFO: 10.244.0.16:42586 - "GET /metrics HTTP/1.1" 200 OK
INFO: 10.244.0.16:52628 - "GET /metrics HTTP/1.1" 200 OK
INFO: 10.244.0.16:38852 - "GET /metrics HTTP/1.1" 200 OK
INFO: 10.244.0.16:40626 - "GET /metrics HTTP/1.1" 200 OK
INFO:app:Received request for checkout page.
INFO:app:Connecting to promotions database to see if we should try to upsell user
INFO:app:Connecting to database at None
INFO:app:Fetching data using stored procedure: sp_CheckUserNotifications
INFO:app:Database call completed in 7.01 seconds.
INFO:app:Promotions result: True
INFO:app:Page rendered in 7.01 seconds.
INFO: 127.0.0.1:49094 - "GET / HTTP/1.1" 200 OK
INFO: 10.244.0.16:34684 - "GET /metrics HTTP/1.1" 200 OK
INFO: 10.244.0.16:43422 - "GET /metrics HTTP/1.1" 200 OK
INFO: 10.244.0.16:49774 - "GET /metrics HTTP/1.1" 200 OK
INFO: 10.244.0.16:57556 - "GET /metrics HTTP/1.1" 200 OK
INFO:app:Received request for checkout page.
INFO:app:Connecting to promotions database to see if we should try to upsell user
INFO:app:Connecting to database at None
INFO:app:Fetching data using stored procedure: sp_CheckUserNotifications
INFO: 10.244.0.16:58876 - "GET /metrics HTTP/1.1" 200 OK
INFO:app:Database call completed in 8.01 seconds.
INFO:app:Promotions result: True
INFO:app:Page rendered in 8.01 seconds.
INFO: 127.0.0.1:45622 - "GET / HTTP/1.1" 200 OK
INFO: 10.244.0.16:44866 - "GET /metrics HTTP/1.1" 200 OK
INFO: 10.244.0.16:54794 - "GET /metrics HTTP/1.1" 200 OK
INFO: 10.244.0.16:39550 - "GET /metrics HTTP/1.1" 200 OK
INFO: 10.244.0.16:49456 - "GET /metrics HTTP/1.1" 200 OK
INFO:app:Received request for checkout page.
INFO:app:Connecting to promotions database to see if we should try to upsell user
INFO:app:Connecting to database at None
INFO:app:Fetching data using stored procedure: sp_CheckUserNotifications
INFO:app:Database call completed in 8.01 seconds.
INFO:app:Promotions result: True
INFO:app:Page rendered in 8.01 seconds.
INFO: 127.0.0.1:55750 - "GET / HTTP/1.1" 200 OK
INFO: 10.244.0.16:55426 - "GET /metrics HTTP/1.1" 200 OK
INFO: 10.244.0.16:55114 - "GET /metrics HTTP/1.1" 200 OK
INFO: 10.244.0.16:33410 - "GET /metrics HTTP/1.1" 200 OK
INFO: 10.244.0.16:40844 - "GET /metrics HTTP/1.1" 200 OK
INFO:app:Received request for checkout page.
INFO:app:Connecting to promotions database to see if we should try to upsell user
INFO:app:Connecting to database at None
INFO:app:Fetching data using stored procedure: sp_CheckUserNotifications
INFO: 10.244.0.16:38884 - "GET /metrics HTTP/1.1" 200 OK
INFO:app:Database call completed in 7.01 seconds.
INFO:app:Promotions result: True
INFO:app:Page rendered in 7.01 seconds.
INFO: 127.0.0.1:50872 - "GET / HTTP/1.1" 200 OK
INFO: 10.244.0.16:40396 - "GET /metrics HTTP/1.1" 200 OK
INFO: 10.244.0.16:40466 - "GET /metrics HTTP/1.1" 200 OK
INFO: 10.244.0.16:58458 - "GET /metrics HTTP/1.1" 200 OK
INFO: 10.244.0.16:34996 - "GET /metrics HTTP/1.1" 200 OK

stderr:
65 changes: 65 additions & 0 deletions tests/fixtures/test_chat/8_multi_container_pod/kubectl_logs.txt
@@ -0,0 +1,65 @@
{"toolset_name":"kubernetes/core","tool_name":"kubectl_logs","match_params":{"name":"customer-orders-67889fd856-k94k7","namespace":"default"}}
stdout:

<html>
<head>
<title>Checkout Status</title>
</head>
<body>
<h1>Success!</h1>
<p>Promotions: True</p>
</body>
</html>

<html>
<head>
<title>Checkout Status</title>
</head>
<body>
<h1>Success!</h1>
<p>Promotions: True</p>
</body>
</html>

<html>
<head>
<title>Checkout Status</title>
</head>
<body>
<h1>Success!</h1>
<p>Promotions: True</p>
</body>
</html>

<html>
<head>
<title>Checkout Status</title>
</head>
<body>
<h1>Success!</h1>
<p>Promotions: True</p>
</body>
</html>

<html>
<head>
<title>Checkout Status</title>
</head>
<body>
<h1>Success!</h1>
<p>Promotions: True</p>
</body>
</html>

<html>
<head>
<title>Checkout Status</title>
</head>
<body>
<h1>Success!</h1>
<p>Promotions: True</p>
</body>
</html>

stderr:
Defaulted container "curl-sidecar" out of: curl-sidecar, fastapi-app
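The fixture files in this commit appear to share one layout: a JSON header line naming the toolset, tool, and matching parameters, followed by `stdout:` and `stderr:` sections. The sketch below parses that layout. The `ToolFixture` class and `parse_fixture` function are hypothetical names introduced for illustration, and the layout rules are inferred only from the fixtures shown here, not from the test harness itself.

```python
import json
from dataclasses import dataclass


@dataclass
class ToolFixture:
    toolset_name: str
    tool_name: str
    match_params: dict
    stdout: str
    stderr: str


def parse_fixture(text: str) -> ToolFixture:
    """Hypothetical parser: JSON header line, then stdout:/stderr: sections."""
    lines = text.splitlines()
    meta = json.loads(lines[0])
    sections = {"stdout": [], "stderr": []}
    current = None
    for line in lines[1:]:
        stripped = line.strip()
        if stripped in ("stdout:", "stderr:"):
            # Assumption: section markers sit alone on their own line.
            current = stripped[:-1]
        elif current is not None:
            sections[current].append(line)
        # Lines before the first marker (e.g. a "Command ... failed" notice) are skipped.
    return ToolFixture(
        toolset_name=meta["toolset_name"],
        tool_name=meta["tool_name"],
        match_params=meta["match_params"],
        stdout="\n".join(sections["stdout"]).strip(),
        stderr="\n".join(sections["stderr"]).strip(),
    )


# Shortened copy of the kubectl_logs fixture above.
sample = (
    '{"toolset_name":"kubernetes/core","tool_name":"kubectl_logs",'
    '"match_params":{"name":"customer-orders-67889fd856-k94k7","namespace":"default"}}\n'
    "stdout:\n"
    "<h1>Success!</h1>\n"
    "\n"
    "stderr:\n"
    'Defaulted container "curl-sidecar" out of: curl-sidecar, fastapi-app\n'
)

fixture = parse_fixture(sample)
print(fixture.tool_name, "->", fixture.stderr)
```

The `stderr` text is the interesting part of this fixture: plain `kubectl logs` defaulted to the `curl-sidecar` container, so the application logs from `fastapi-app` never reach the LLM unless a container-aware tool is used.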
@@ -0,0 +1,6 @@
{"toolset_name":"kubernetes/core","tool_name":"kubectl_logs_grep","match_params":{"name":"customer-orders-67889fd856-k94k7","namespace":"default","search_term":"render time"}}
Command `kubectl logs customer-orders-67889fd856-k94k7 -n default | grep 'render time'` failed with return code 1
stdout:

stderr:
Defaulted container "curl-sidecar" out of: curl-sidecar, fastapi-app
6 changes: 6 additions & 0 deletions tests/fixtures/test_chat/8_multi_container_pod/test_case.yaml
@@ -0,0 +1,6 @@
user_prompt: "How are the page render times for pod customer-orders-67889fd856-k94k7"
expected_output: "Page render times for `customer-orders-67889fd856-k94k7` range from 7.00 to 9.00 seconds."
evaluation:
  answer_relevancy: .5
  faithfulness: .5
  contextual_precision: 0
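The expected output can be checked against the `fastapi-app` container logs recorded in the `kubectl_container_logs` fixture. The following is a minimal sketch of that arithmetic, using a shortened excerpt of those log lines. The `render_time_range` function and its regular expression are illustrative assumptions, not part of the test suite.

```python
import re

# Excerpt of the fastapi-app container logs from the fixture above.
LOG_EXCERPT = """\
INFO:app:Database call completed in 8.01 seconds.
INFO:app:Page rendered in 8.01 seconds.
INFO:app:Page rendered in 7.00 seconds.
INFO:app:Page rendered in 9.00 seconds.
INFO:app:Page rendered in 7.01 seconds.
"""

RENDER_TIME = re.compile(r"Page rendered in (\d+(?:\.\d+)?) seconds")


def render_time_range(log_text: str) -> tuple:
    """Return the (min, max) page render time found in the log text."""
    times = [float(value) for value in RENDER_TIME.findall(log_text)]
    if not times:
        raise ValueError("no render times found in log text")
    return min(times), max(times)


low, high = render_time_range(LOG_EXCERPT)
print(f"Page render times range from {low:.2f} to {high:.2f} seconds.")
# prints "Page render times range from 7.00 to 9.00 seconds."
```

Only the `Page rendered in ...` lines are matched; the `Database call completed in ...` lines carry the same figures in this fixture but are deliberately ignored.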
1 change: 1 addition & 0 deletions tests/test_chat.py
@@ -21,6 +21,7 @@
def idfn(test_case:AskHolmesTestCase):
    return test_case.id

+@pytest.mark.llm
@pytest.mark.parametrize("test_case", test_cases, ids=idfn)
def test_ask_holmes_with_tags(test_case:AskHolmesTestCase):
