Merge remote-tracking branch 'origin/master' into curl-in-base
Sheeproid committed Nov 24, 2024
2 parents b008386 + c0eba0d commit 2b6a62b
Showing 18 changed files with 532 additions and 95 deletions.
2 changes: 1 addition & 1 deletion .github/workflows/build-and-test.yaml
@@ -39,7 +39,7 @@ jobs:
         env:
           OPENAI_API_KEY: ${{ secrets.OPENAI_API_KEY }}
         run: |
-          poetry run pytest
+          poetry run pytest -m "not llm"
       - name: Test the binary
         shell: bash
36 changes: 36 additions & 0 deletions .github/workflows/llm-evaluation.yaml
@@ -0,0 +1,36 @@
name: Evaluate LLM test cases

on: [push]

jobs:
  build:
    strategy:
      matrix:
        python-version: ["3.12"]

    runs-on: ubuntu-latest

    steps:
      - uses: actions/checkout@v2

      - name: Set up Python ${{ matrix.python-version }}
        uses: actions/setup-python@v2
        with:
          python-version: ${{ matrix.python-version }}

      - name: Install Python dependencies and build
        # if you change something here, you must also change it in .github/workflows/build-binaries-and-brew.yaml
        run: |
          python -m pip install --upgrade pip setuptools pyinstaller
          curl -sSL https://install.python-poetry.org | python3 - --version 1.4.0
          poetry config virtualenvs.create false
          poetry install --no-root
          poetry run python -m playwright install --with-deps firefox
      - name: Run tests
        shell: bash
        env:
          OPENAI_API_KEY: ${{ secrets.OPENAI_API_KEY }}
        run: |
          poetry run pytest -m "llm"
3 changes: 3 additions & 0 deletions helm/holmes/templates/holmes.yaml
@@ -45,6 +45,9 @@ spec:
           {{- if .Values.additionalEnvVars -}}
           {{ toYaml .Values.additionalEnvVars | nindent 10 }}
           {{- end }}
+          {{- if .Values.additional_env_vars -}}
+          {{ toYaml .Values.additional_env_vars | nindent 10 }}
+          {{- end }}
           lifecycle:
             preStop:
               exec:
1 change: 1 addition & 0 deletions helm/holmes/values.yaml
@@ -2,6 +2,7 @@ certificate: "" # base64 encoded
 logLevel: INFO

 additionalEnvVars: []
+additional_env_vars: []
 imagePullSecrets: []

 allowedToolsets: "kubernetes/core,findings,internet"
2 changes: 2 additions & 0 deletions holmes/plugins/prompts/_general_instructions.jinja2
@@ -33,3 +33,5 @@ Special cases and how to reply:
 * as a special case of that, if you try to investigate by running a tool and the tool gives you output that permissions are missing *to run the tool* then say "I tried to investigate but I am missing permissions to run the tool <tool_name>. <details and exact logs of the error message>"
 * that is different than - for example - fetching a pod's logs and seeing that the pod itself has permission errors. in that case, you explain that permission errors are the cause of the problem and give details
 * Issues are a subset of findings. When asked about an issue or a finding and you have an id, use the tool `fetch_finding_by_id`.
+* For any question, try to make the answer specific to the user's cluster.
+** For example, if asked to port forward, find out the app or pod port (kubectl describe) and provide a port forward command specific to the user's question
286 changes: 198 additions & 88 deletions holmes/plugins/toolsets/kubernetes.yaml

Large diffs are not rendered by default.

4 changes: 4 additions & 0 deletions pytest.ini
@@ -0,0 +1,4 @@

[pytest]
markers =
    llm: Evaluate LLM behaviour (prompt, tools, etc.)
@@ -0,0 +1,54 @@
{"toolset_name":"kubernetes/core","tool_name":"kubectl_describe","match_params":{"kind":"pod","name":"my_grafana_4j981","namespace":"default"}}
Name:             my_grafana_4j981
Namespace:        default
Priority:         0
Service Account:  my_grafana_4j981-service-account
Node:             ip-172-31-21-139.us-east-2.compute.internal/172.31.21.139
Start Time:       Mon, 04 Nov 2024 10:28:53 +0100
Labels:           app=grafana
                  pod-template-hash=6958c5bdd8
Annotations:      <none>
Status:           Running
IP:               172.31.25.172
IPs:
  IP:           172.31.25.172
Controlled By:  ReplicaSet/my_grafana_4j981
Containers:
  runner:
    Container ID:   containerd://b1d346ba710299dd3e1c1745c362062570488b57356072dbc4637cbf6b77ccb2
    Image:          robustadev/grafana:0.18.0
    Image ID:       docker.io/robustadev/grafana@sha256:273035ec62f104da1452d65fc30cfcb0085e8a49ce73b9ffa043f747f3afc31b
    Port:           3000
    Host Port:      <none>
    State:          Running
      Started:      Mon, 04 Nov 2024 10:29:17 +0100
    Ready:          True
    Restart Count:  0
    Limits:
      memory:  1Gi
    Requests:
      cpu:     250m
      memory:  1Gi
    Mounts:
      /etc/robusta/auth from auth-config-secret (rw)
      /etc/robusta/config from playbooks-config-secret (rw)
      /var/run/secrets/kubernetes.io/serviceaccount from kube-api-access-prfkr (ro)
Conditions:
  Type                        Status
  PodReadyToStartContainers   True
  Initialized                 True
  Ready                       True
  ContainersReady             True
  PodScheduled                True
QoS Class:       Burstable
Node-Selectors:  <none>
Tolerations:     node.kubernetes.io/not-ready:NoExecute op=Exists for 300s
                 node.kubernetes.io/unreachable:NoExecute op=Exists for 300s
Events:
  Type    Reason     Age    From               Message
  ----    ------     ----   ----               -------
  Normal  Scheduled  4m28s  default-scheduler  Successfully assigned default/nginxreplica to aks-nodepool1-26081864-vmss000004
  Normal  Pulling    4m28s  kubelet            Pulling image "grafana"
  Normal  Pulled     4m28s  kubelet            Successfully pulled image "grafana" in 272.563572ms
  Normal  Created    4m28s  kubelet            Created container grafana
  Normal  Started    4m28s  kubelet            Started container grafana
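
The fixture above appears to pair a one-line JSON header (`toolset_name`, `tool_name`, `match_params`) with the raw output the mocked tool should return. The following is a minimal sketch of how such a file could be split into those two parts. `ToolFixture` and `parse_tool_fixture` are hypothetical names for illustration, not the repository's actual loader.

```python
import json
from dataclasses import dataclass


@dataclass
class ToolFixture:
    """One mocked tool call: which call to match, and what to return."""
    toolset_name: str
    tool_name: str
    match_params: dict
    output: str


def parse_tool_fixture(text: str) -> ToolFixture:
    # Assumption: the first line is the JSON header and everything after
    # it is the canned tool output.
    header_line, _, output = text.partition("\n")
    header = json.loads(header_line)
    return ToolFixture(
        toolset_name=header["toolset_name"],
        tool_name=header["tool_name"],
        match_params=header.get("match_params", {}),
        output=output,
    )


# Shortened copy of the fixture shown above.
sample = (
    '{"toolset_name":"kubernetes/core","tool_name":"kubectl_describe",'
    '"match_params":{"kind":"pod","name":"my_grafana_4j981","namespace":"default"}}\n'
    "Name: my_grafana_4j981\n"
    "Namespace: default\n"
)
fixture = parse_tool_fixture(sample)
print(fixture.tool_name, fixture.match_params)
```

Keeping the header on a single line means the remainder of the file can hold arbitrary tool output without any escaping.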
@@ -1,9 +1,16 @@
 user_prompt: 'what is the command to port-forward to << { "type": "pod", "name": "my_grafana_4j981" } >>'
 #user_prompt: "what is the command to port-forward to my grafana service?"
-expected_output: "kubectl port-forward service/my_grafana_4j981 3000:3000"
+expected_output: |
+  To port-forward to the pod `my_grafana_4j981`, use the following command:
+  ```bash
+  kubectl port-forward pod/my_grafana_4j981 3000:3000 -n default
+  ```
+  This command forwards port 3000 on your local machine to port 3000 on the pod.
 retrieval_context:
-  - "The tool kubectl_get_all` reports that a grafana service is running but does not have an external IP address"
-  - "The tool kubectl_get_all` reports that the name of the grafana service is my_grafana_4j981"
-  - "By default grafana runs on port 3000. We can assume the user's grafana instance runs on that port"
+  - "The grafana service is running but does not have an external IP address"
+  - "The name of the grafana service is my_grafana_4j981"
+  - "Grafana is running on port 3000"
 evaluation:
-  faithfulness: 0.2
+  faithfulness: 0.5
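
The updated expected output is specific to the pod: it uses the pod name, namespace and container port that `kubectl describe` reports. As a rough illustration of that lookup, the sketch below derives the same command from describe-style text. `port_forward_command` is a hypothetical helper and the regular expressions are assumptions about the output layout. In the test itself the LLM does this reasoning, not code.

```python
import re


def port_forward_command(describe_output: str) -> str:
    """Build a pod-specific port-forward command from `kubectl describe pod` text."""

    def first_match(pattern: str) -> str:
        match = re.search(pattern, describe_output, re.MULTILINE)
        if match is None:
            raise ValueError(f"pattern not found: {pattern}")
        return match.group(1)

    name = first_match(r"^\s*Name:\s*(\S+)")
    namespace = first_match(r"^\s*Namespace:\s*(\S+)")
    # Anchoring at the start of the line skips the "Host Port:" field.
    port = first_match(r"^\s*Port:\s*(\d+)")
    return f"kubectl port-forward pod/{name} {port}:{port} -n {namespace}"


# Shortened describe output, based on the fixture for this test case.
describe_output = """\
Name: my_grafana_4j981
Namespace: default
Containers:
  runner:
    Port: 3000
    Host Port: <none>
"""
print(port_forward_command(describe_output))
```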
@@ -0,0 +1,8 @@
{"toolset_name":"kubernetes/core","tool_name":"kubectl_events","match_params":{"resource_type":"pod","pod_name":"nginx-6958c5bdd8-69gtn","namespace":"default"}}
Type    Reason     Age    From               Message
----    ------     ----   ----               -------
Normal  Scheduled  4m28s  default-scheduler  Successfully assigned default/nginxreplica to aks-nodepool1-26081864-vmss000004
Normal  Pulling    4m28s  kubelet            Pulling image "nginx"
Normal  Pulled     4m28s  kubelet            Successfully pulled image "nginx" in 272.563572ms
Normal  Created    4m28s  kubelet            Created container nginxreplica
Normal  Started    4m28s  kubelet            Started container nginxreplica
7 changes: 7 additions & 0 deletions tests/fixtures/test_chat/7_get_pod_events/kubectl_events.txt
@@ -0,0 +1,7 @@
{"toolset_name":"kubernetes/core","tool_name":"kubectl_events","match_params":{"resource_type":"pod","pod_name":"robusta-runner-746d848db9-f8ns9","namespace":"default"}}
LAST SEEN   TYPE     REASON      OBJECT                                MESSAGE
35m         Normal   Pulling     Pod/robusta-runner-746d848db9-f8ns9   Pulling image "robustadev/robusta-runner:0.19.0"
35m         Normal   Scheduled   Pod/robusta-runner-746d848db9-f8ns9   Successfully assigned default/robusta-runner-746d848db9-f8ns9 to kind-control-plane
34m         Normal   Pulled      Pod/robusta-runner-746d848db9-f8ns9   Successfully pulled image "robustadev/robusta-runner:0.19.0" in 13.508s (58.067s including waiting). Image size: 281790224 bytes.
34m         Normal   Created     Pod/robusta-runner-746d848db9-f8ns9   Created container runner
34m         Normal   Started     Pod/robusta-runner-746d848db9-f8ns9   Started container runner
@@ -0,0 +1,2 @@
{"toolset_name":"kubernetes/core","tool_name":"kubectl_find_resource","match_params":{"kind":"pod","keyword":"robusta-runner"}}
default robusta-runner-746d848db9-f8ns9 1/1 Running 0 16m 10.244.0.11 kind-control-plane <none> <none> app=robusta-runner,pod-template-hash=746d848db9,robustaComponent=runner
25 changes: 25 additions & 0 deletions tests/fixtures/test_chat/7_get_pod_events/test_case.yaml
@@ -0,0 +1,25 @@
user_prompt: "Fetch all kubernetes events related to the robusta-runner pod"
expected_output: |
  The `robusta-runner-746d848db9-f8ns9` pod in the `default` namespace has the following events:
  1. **Pulling**: Pulling image `robustadev/robusta-runner:0.19.0`.
  2. **Scheduled**: Successfully assigned default/robusta-runner-746d848db9-f8ns9 to kind-control-plane
  3. **Pulled**: Successfully pulled image "robustadev/robusta-runner:0.19.0" in 13.508s (58.067s including waiting). Image size: 281790224 bytes.
  4. **Created**: Created container `runner`.
  5. **Started**: Started container `runner`.
retrieval_context:
  - |
    Here are the events:
    Pulling image "robustadev/robusta-runner:0.19.0"
    Successfully assigned default/robusta-runner-746d848db9-f8ns9 to kind-control-plane
    Successfully pulled image "robustadev/robusta-runner:0.19.0" in 13.508s (58.067s including waiting). Image size: 281790224 bytes.
    Created container runner
    Started container runner
evaluation:
  answer_relevancy: .5
  faithfulness: .5
  contextual_precision: .5
  contextual_recall: .5
  contextual_relevancy: .5
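
The `evaluation` block above assigns a number to each metric. Assuming each number is a minimum acceptable score (the diff does not say so explicitly), a check against those thresholds could be sketched as follows. `failed_metrics` and the example scores are hypothetical and are not taken from the repository or from a real run.

```python
def failed_metrics(scores: dict, thresholds: dict) -> list:
    """Return the metrics whose score falls below the configured minimum."""
    return [
        name
        for name, minimum in thresholds.items()
        # Assumption: a metric with no reported score counts as failing.
        if scores.get(name, float("-inf")) < minimum
    ]


# Thresholds copied from the `evaluation` block of the test case.
thresholds = {
    "answer_relevancy": 0.5,
    "faithfulness": 0.5,
    "contextual_precision": 0.5,
    "contextual_recall": 0.5,
    "contextual_relevancy": 0.5,
}
# Made-up scores, for illustration only.
scores = {
    "answer_relevancy": 0.9,
    "faithfulness": 0.4,
    "contextual_precision": 0.5,
    "contextual_recall": 0.7,
}
print(failed_metrics(scores, thresholds))
```

A score equal to its threshold passes in this sketch. Whether the real evaluator treats the boundary the same way is not shown in the diff.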
@@ -0,0 +1,100 @@
{"toolset_name":"kubernetes/core","tool_name":"kubectl_container_logs","match_params":{"pod_name":"customer-orders-67889fd856-k94k7","container_name":"fastapi-app","namespace":"default"}}
stdout:
INFO: Started server process [1]
INFO: Waiting for application startup.
INFO: Application startup complete.
INFO: Uvicorn running on http://0.0.0.0:8000 (Press CTRL+C to quit)
INFO: 10.244.0.16:46364 - "GET /metrics HTTP/1.1" 200 OK
INFO: 10.244.0.16:33610 - "GET /metrics HTTP/1.1" 200 OK
INFO: 10.244.0.16:47000 - "GET /metrics HTTP/1.1" 200 OK
INFO: 10.244.0.16:53562 - "GET /metrics HTTP/1.1" 200 OK
INFO:app:Received request for checkout page.
INFO:app:Connecting to promotions database to see if we should try to upsell user
INFO:app:Connecting to database at None
INFO:app:Fetching data using stored procedure: sp_CheckUserNotifications
INFO: 10.244.0.16:59206 - "GET /metrics HTTP/1.1" 200 OK
INFO:app:Database call completed in 8.01 seconds.
INFO:app:Promotions result: True
INFO:app:Page rendered in 8.01 seconds.
INFO: 127.0.0.1:34748 - "GET / HTTP/1.1" 200 OK
INFO: 10.244.0.16:56156 - "GET /metrics HTTP/1.1" 200 OK
INFO: 10.244.0.16:41600 - "GET /metrics HTTP/1.1" 200 OK
INFO: 10.244.0.16:35976 - "GET /metrics HTTP/1.1" 200 OK
INFO: 10.244.0.16:35584 - "GET /metrics HTTP/1.1" 200 OK
INFO:app:Received request for checkout page.
INFO:app:Connecting to promotions database to see if we should try to upsell user
INFO:app:Connecting to database at None
INFO:app:Fetching data using stored procedure: sp_CheckUserNotifications
INFO:app:Database call completed in 7.00 seconds.
INFO:app:Promotions result: True
INFO:app:Page rendered in 7.00 seconds.
INFO: 127.0.0.1:59258 - "GET / HTTP/1.1" 200 OK
INFO: 10.244.0.16:39944 - "GET /metrics HTTP/1.1" 200 OK
INFO: 10.244.0.16:39850 - "GET /metrics HTTP/1.1" 200 OK
INFO: 10.244.0.16:55216 - "GET /metrics HTTP/1.1" 200 OK
INFO: 10.244.0.16:51152 - "GET /metrics HTTP/1.1" 200 OK
INFO:app:Received request for checkout page.
INFO:app:Connecting to promotions database to see if we should try to upsell user
INFO:app:Connecting to database at None
INFO:app:Fetching data using stored procedure: sp_CheckUserNotifications
INFO: 10.244.0.16:47072 - "GET /metrics HTTP/1.1" 200 OK
INFO:app:Database call completed in 9.00 seconds.
INFO:app:Promotions result: True
INFO:app:Page rendered in 9.00 seconds.
INFO: 127.0.0.1:39504 - "GET / HTTP/1.1" 200 OK
INFO: 10.244.0.16:42586 - "GET /metrics HTTP/1.1" 200 OK
INFO: 10.244.0.16:52628 - "GET /metrics HTTP/1.1" 200 OK
INFO: 10.244.0.16:38852 - "GET /metrics HTTP/1.1" 200 OK
INFO: 10.244.0.16:40626 - "GET /metrics HTTP/1.1" 200 OK
INFO:app:Received request for checkout page.
INFO:app:Connecting to promotions database to see if we should try to upsell user
INFO:app:Connecting to database at None
INFO:app:Fetching data using stored procedure: sp_CheckUserNotifications
INFO:app:Database call completed in 7.01 seconds.
INFO:app:Promotions result: True
INFO:app:Page rendered in 7.01 seconds.
INFO: 127.0.0.1:49094 - "GET / HTTP/1.1" 200 OK
INFO: 10.244.0.16:34684 - "GET /metrics HTTP/1.1" 200 OK
INFO: 10.244.0.16:43422 - "GET /metrics HTTP/1.1" 200 OK
INFO: 10.244.0.16:49774 - "GET /metrics HTTP/1.1" 200 OK
INFO: 10.244.0.16:57556 - "GET /metrics HTTP/1.1" 200 OK
INFO:app:Received request for checkout page.
INFO:app:Connecting to promotions database to see if we should try to upsell user
INFO:app:Connecting to database at None
INFO:app:Fetching data using stored procedure: sp_CheckUserNotifications
INFO: 10.244.0.16:58876 - "GET /metrics HTTP/1.1" 200 OK
INFO:app:Database call completed in 8.01 seconds.
INFO:app:Promotions result: True
INFO:app:Page rendered in 8.01 seconds.
INFO: 127.0.0.1:45622 - "GET / HTTP/1.1" 200 OK
INFO: 10.244.0.16:44866 - "GET /metrics HTTP/1.1" 200 OK
INFO: 10.244.0.16:54794 - "GET /metrics HTTP/1.1" 200 OK
INFO: 10.244.0.16:39550 - "GET /metrics HTTP/1.1" 200 OK
INFO: 10.244.0.16:49456 - "GET /metrics HTTP/1.1" 200 OK
INFO:app:Received request for checkout page.
INFO:app:Connecting to promotions database to see if we should try to upsell user
INFO:app:Connecting to database at None
INFO:app:Fetching data using stored procedure: sp_CheckUserNotifications
INFO:app:Database call completed in 8.01 seconds.
INFO:app:Promotions result: True
INFO:app:Page rendered in 8.01 seconds.
INFO: 127.0.0.1:55750 - "GET / HTTP/1.1" 200 OK
INFO: 10.244.0.16:55426 - "GET /metrics HTTP/1.1" 200 OK
INFO: 10.244.0.16:55114 - "GET /metrics HTTP/1.1" 200 OK
INFO: 10.244.0.16:33410 - "GET /metrics HTTP/1.1" 200 OK
INFO: 10.244.0.16:40844 - "GET /metrics HTTP/1.1" 200 OK
INFO:app:Received request for checkout page.
INFO:app:Connecting to promotions database to see if we should try to upsell user
INFO:app:Connecting to database at None
INFO:app:Fetching data using stored procedure: sp_CheckUserNotifications
INFO: 10.244.0.16:38884 - "GET /metrics HTTP/1.1" 200 OK
INFO:app:Database call completed in 7.01 seconds.
INFO:app:Promotions result: True
INFO:app:Page rendered in 7.01 seconds.
INFO: 127.0.0.1:50872 - "GET / HTTP/1.1" 200 OK
INFO: 10.244.0.16:40396 - "GET /metrics HTTP/1.1" 200 OK
INFO: 10.244.0.16:40466 - "GET /metrics HTTP/1.1" 200 OK
INFO: 10.244.0.16:58458 - "GET /metrics HTTP/1.1" 200 OK
INFO: 10.244.0.16:34996 - "GET /metrics HTTP/1.1" 200 OK

stderr:
65 changes: 65 additions & 0 deletions tests/fixtures/test_chat/8_multi_container_pod/kubectl_logs.txt
@@ -0,0 +1,65 @@
{"toolset_name":"kubernetes/core","tool_name":"kubectl_logs","match_params":{"name":"customer-orders-67889fd856-k94k7","namespace":"default"}}
stdout:

<html>
<head>
<title>Checkout Status</title>
</head>
<body>
<h1>Success!</h1>
<p>Promotions: True</p>
</body>
</html>

<html>
<head>
<title>Checkout Status</title>
</head>
<body>
<h1>Success!</h1>
<p>Promotions: True</p>
</body>
</html>

<html>
<head>
<title>Checkout Status</title>
</head>
<body>
<h1>Success!</h1>
<p>Promotions: True</p>
</body>
</html>

<html>
<head>
<title>Checkout Status</title>
</head>
<body>
<h1>Success!</h1>
<p>Promotions: True</p>
</body>
</html>

<html>
<head>
<title>Checkout Status</title>
</head>
<body>
<h1>Success!</h1>
<p>Promotions: True</p>
</body>
</html>

<html>
<head>
<title>Checkout Status</title>
</head>
<body>
<h1>Success!</h1>
<p>Promotions: True</p>
</body>
</html>

stderr:
Defaulted container "curl-sidecar" out of: curl-sidecar, fastapi-app
@@ -0,0 +1,6 @@
{"toolset_name":"kubernetes/core","tool_name":"kubectl_logs_grep","match_params":{"name":"customer-orders-67889fd856-k94k7","namespace":"default","search_term":"render time"}}
Command `kubectl logs customer-orders-67889fd856-k94k7 -n default | grep 'render time'` failed with return code 1
stdout:

stderr:
Defaulted container "curl-sidecar" out of: curl-sidecar, fastapi-app
6 changes: 6 additions & 0 deletions tests/fixtures/test_chat/8_multi_container_pod/test_case.yaml
@@ -0,0 +1,6 @@
user_prompt: "How are the page render times for pod customer-orders-67889fd856-k94k7"
expected_output: "Page render times for `customer-orders-67889fd856-k94k7` range from 7.00 to 9.00 seconds."
evaluation:
  answer_relevancy: .5
  faithfulness: .5
  contextual_precision: 0
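
In this test case the render times appear only in the `fastapi-app` container logs. The plain `kubectl_logs` and `kubectl_logs_grep` fixtures default to the `curl-sidecar` container and contain no timing lines. A minimal sketch of how the expected range (7.00 to 9.00 seconds) follows from the `fastapi-app` log lines is shown below. `render_time_range` is a hypothetical helper, not part of the repository.

```python
import re

RENDER_TIME = re.compile(r"Page rendered in (\d+(?:\.\d+)?) seconds")


def render_time_range(log_text: str):
    """Return (min, max) page render time in seconds, or None if no line matches."""
    times = [float(value) for value in RENDER_TIME.findall(log_text)]
    if not times:
        return None
    return min(times), max(times)


# A few lines taken from the fastapi-app container log fixture.
fastapi_logs = """\
INFO:app:Page rendered in 8.01 seconds.
INFO:     127.0.0.1:34748 - "GET / HTTP/1.1" 200 OK
INFO:app:Page rendered in 7.00 seconds.
INFO:app:Page rendered in 9.00 seconds.
"""
# The curl-sidecar output holds only the rendered HTML, no timing lines.
sidecar_logs = "<h1>Success!</h1>\n<p>Promotions: True</p>\n"

print(render_time_range(fastapi_logs))
print(render_time_range(sidecar_logs))
```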
3 changes: 2 additions & 1 deletion tests/test_chat.py
@@ -15,12 +15,13 @@

 TEST_CASES_FOLDER = Path("tests/fixtures/test_chat")

-test_cases = load_ask_holmes_test_cases(TEST_CASES_FOLDER, expected_number_of_test_cases=6)
+test_cases = load_ask_holmes_test_cases(TEST_CASES_FOLDER)


 def idfn(test_case:AskHolmesTestCase):
     return test_case.id

+@pytest.mark.llm
 @pytest.mark.parametrize("test_case", test_cases, ids=idfn)
 def test_ask_holmes_with_tags(test_case:AskHolmesTestCase):
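
With `expected_number_of_test_cases` removed, the loader is no longer told how many fixture folders to expect, so new cases such as `7_get_pod_events` and `8_multi_container_pod` are presumably picked up by scanning the folder. The real `load_ask_holmes_test_cases` is not shown in this diff. The sketch below only illustrates the general idea, using a hypothetical `discover_test_case_ids` helper and assuming each case lives in a sub-folder containing a `test_case.yaml`.

```python
import tempfile
from pathlib import Path


def discover_test_case_ids(folder: Path) -> list:
    """Return the names of sub-folders that contain a test_case.yaml file."""
    return sorted(
        child.name
        for child in folder.iterdir()
        if child.is_dir() and (child / "test_case.yaml").is_file()
    )


# Build a throwaway fixture tree to demonstrate the scan.
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    for name in ("7_get_pod_events", "8_multi_container_pod"):
        (root / name).mkdir()
        (root / name / "test_case.yaml").write_text("user_prompt: example\n")
    (root / "notes").mkdir()  # no test_case.yaml, so it is skipped
    case_ids = discover_test_case_ids(root)

print(case_ids)
```

Folder names double as test IDs here, which mirrors how `idfn` returns `test_case.id` for the parametrized test.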
