
Commit

Merge remote-tracking branch 'origin/master' into llm_eval_improvements
nherment committed Nov 27, 2024
2 parents 0829be5 + 743e62c commit 740fe68
Showing 13 changed files with 124 additions and 64 deletions.
2 changes: 2 additions & 0 deletions Dockerfile
@@ -70,6 +70,8 @@ RUN python -m playwright install firefox --with-deps
# We're installing here libexpat1, to upgrade the package to include a fix to 3 high CVEs. CVE-2024-45491,CVE-2024-45490,CVE-2024-45492
RUN apt-get update \
&& apt-get install -y \
curl \
jq \
git \
apt-transport-https \
gnupg2 \
39 changes: 19 additions & 20 deletions README.md
@@ -1,19 +1,18 @@
<div align="center">
<h1 align="center">Solve Prometheus alerts faster with an AI assistant</h1>
<h1 align="center">Solve cloud alerts faster with an AI assistant</h1>
<h2 align="center">HolmesGPT - AI Agent for On-Call Engineers 🔥</h2>
<p align="center">
<a href="#examples"><strong>Examples</strong></a> |
<a href="#ways-to-use-holmesgpt"><strong>Examples</strong></a> |
<a href="#key-features"><strong>Key Features</strong></a> |
<a href="#installation"><strong>Installation</strong></a> |
<a href="https://www.youtube.com/watch?v=TfQfx65LsDQ"><strong>YouTube Demo</strong></a>
</p>
</div>

Transforms your existing cloud alerts from this 👇
Improve developer experience and reduce mean-time-to-respond (MTTR) by transforming alerts from this 👇

![Screenshot 2024-10-31 at 12 01 12 2](https://github.com/user-attachments/assets/931ebd71-ccd2-4b7b-969d-a061a99cec2d)


To this 👇

![Screenshot 2024-10-31 at 11 40 09](https://github.com/user-attachments/assets/9e2c7a23-b942-4720-8a98-488323e092ca)
@@ -34,7 +33,8 @@ To this 👇
## Ways to Use HolmesGPT

<details>
<summary> AI analysis in Robusta UI</summary>
<summary> Analyze your alerts in a free UI</summary>

Includes free use of the Robusta AI model.

![Screenshot 2024-10-31 at 11 40 09](https://github.com/user-attachments/assets/2e90cc7b-4b0a-4386-ab4f-0d36692b549c)
@@ -44,7 +44,7 @@ Includes free use of the Robusta AI model.
</details>

<details>
<summary>Root cause for Prometheus alerts in Slack</summary>
<summary>Add root-cause-analysis to Prometheus alerts in Slack</summary>

Investigate Prometheus alerts right from Slack with the official [Robusta integration](https://docs.robusta.dev/holmes_chart_dependency/configuration/ai-analysis.html).

@@ -62,7 +62,9 @@ Note - if on Mac OS and using the Docker image, you will need to use `http://doc


<details>
<summary>Free-text questions (CLI)</summary>
<summary>Query observability data in human language</summary>

Via the Holmes CLI or [a free UI (video)](https://www.loom.com/share/3cdcd94ed6bc458888b338493b108d1d?t=0)

```bash
holmes ask "what pods are in crashloopbackoff in my cluster and why?"
@@ -164,19 +166,16 @@ plugins:
```
</details>


### Bring your own LLM
<details>
<summary>Bring your own LLM</summary>
<summary>Importing Holmes as a Python library and bringing your own LLM</summary>

You can use Holmes as a library and pass in your own LLM implementation. This is particularly useful if LiteLLM or the default Holmes implementation does not suit you.

See an example implementation [here](examples/custom_llm.py).
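
As a rough sketch of the idea: a custom LLM is an object exposing a `completion()` method that returns a LiteLLM `ModelResponse`. The class and parameter names below are illustrative assumptions, not Holmes' actual interface; treat [examples/custom_llm.py](examples/custom_llm.py) as the authoritative reference.

```python
from typing import Any, Dict, List, Optional

import litellm


class MyCustomLLM:
    """Hypothetical adapter that routes Holmes' completion calls to an OpenAI-compatible endpoint."""

    def __init__(self, base_url: str, api_key: str):
        self.base_url = base_url
        self.api_key = api_key

    def completion(
        self,
        messages: List[Dict[str, Any]],
        tools: Optional[List[Dict[str, Any]]] = None,
        **kwargs: Any,
    ) -> litellm.ModelResponse:
        # Delegate to any backend you like; here we simply proxy through LiteLLM.
        return litellm.completion(
            model="openai/my-internal-model",  # placeholder model name
            api_base=self.base_url,
            api_key=self.api_key,
            messages=messages,
            tools=tools,
            **kwargs,
        )
```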


</details>

Like what you see? Checkout [other use cases](#other-use-cases) or get started by [installing HolmesGPT](#installation).
Like what you see? Discover [more use cases](#more-use-cases) or get started by [installing HolmesGPT](#installation).

## Installation

@@ -372,13 +371,6 @@ To work with Azure AI, you need to provide the below variables:

</details>

**Trusting custom Certificate Authority (CA) certificate:**

If your llm provider url uses a certificate from a custom CA, in order to trust it, base-64 encode the certificate, and store it in an environment variable named ``CERTIFICATE``




### Getting an API Key

HolmesGPT requires an LLM API Key to function. The most common option is OpenAI, but many [LiteLLM-compatible](https://docs.litellm.ai/docs/providers/) models are supported. To use an LLM, set `--model` (e.g. `gpt-4o` or `bedrock/anthropic.claude-3-5-sonnet-20240620-v1:0`) and `--api-key` (if necessary). Depending on the provider, you may need to set environment variables too.
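
For instance, two hypothetical invocations (substitute your own model name and key):

```bash
# OpenAI: pass the key explicitly, or set OPENAI_API_KEY instead
holmes ask "what pods are unhealthy and why?" --model=gpt-4o --api-key=<your-openai-api-key>

# AWS Bedrock: credentials usually come from the standard AWS environment variables rather than --api-key
holmes ask "what pods are unhealthy and why?" --model=bedrock/anthropic.claude-3-5-sonnet-20240620-v1:0
```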
@@ -493,6 +485,13 @@ In particular, note that [vLLM does not yet support function calling](https://gi

</details>

**Additional LLM Configuration:**

<details>
<summary>Trusting custom Certificate Authority (CA) certificate</summary>
If your LLM provider's URL uses a certificate from a custom CA, base64-encode that certificate and store it in an environment variable named <b>CERTIFICATE</b> so that Holmes trusts it.
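
For example, a rough sketch (the certificate path is a placeholder, and `base64` flags differ by platform; on macOS use `base64 -i`):

```bash
# base64-encode the custom CA certificate and expose it to Holmes
export CERTIFICATE="$(base64 -w0 /path/to/custom-ca.crt)"
```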
</details>

### Enabling Integrations

<details>
@@ -524,7 +523,7 @@ HolmesGPT can consult webpages containing runbooks or other relevant information
HolmesGPT uses playwright to scrape webpages and requires playwright to be installed and working through `playwright install`.
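
For example, mirroring what the project's Dockerfile does (adjust to your environment):

```bash
# download the browser binaries Playwright drives; the Docker image installs Firefox this way
playwright install firefox --with-deps
```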
</details>

## Other Use Cases
## More Use Cases

HolmesGPT was designed for incident response, but it is a general DevOps assistant too. Here are some examples:

47 changes: 36 additions & 11 deletions holmes/core/llm.py
@@ -71,13 +71,45 @@ def check_llm(self, model:str, api_key:Optional[str]):
if not lookup:
raise Exception(f"Unknown provider for model {model}")
provider = lookup[1]
api_key_env_var = f"{provider.upper()}_API_KEY"
if api_key:
os.environ[api_key_env_var] = api_key
model_requirements = litellm.validate_environment(model=model)
if provider == "watsonx":
# NOTE: LiteLLM's validate_environment does not currently include checks for IBM WatsonX.
# The following WatsonX-specific variables are set based on documentation from:
# https://docs.litellm.ai/docs/providers/watsonx
# Required variables for WatsonX:
# - WATSONX_URL: Base URL of your WatsonX instance (required)
# - WATSONX_APIKEY or WATSONX_TOKEN: IBM Cloud API key or IAM auth token (one is required)
model_requirements = {'missing_keys': [], 'keys_in_environment': True}
if api_key:
os.environ["WATSONX_APIKEY"] = api_key
if not "WATSONX_URL" in os.environ:
model_requirements['missing_keys'].append("WATSONX_URL")
model_requirements['keys_in_environment'] = False
if not "WATSONX_APIKEY" in os.environ and not "WATSONX_TOKEN" in os.environ:
model_requirements['missing_keys'].extend(["WATSONX_APIKEY", "WATSONX_TOKEN"])
model_requirements['keys_in_environment'] = False
# WATSONX_PROJECT_ID is required because we don't let user pass it to completion call directly
if not "WATSONX_PROJECT_ID" in os.environ:
model_requirements['missing_keys'].append("WATSONX_PROJECT_ID")
model_requirements['keys_in_environment'] = False
# https://docs.litellm.ai/docs/providers/watsonx#usage---models-in-deployment-spaces
# using custom watsonx deployments might require to set WATSONX_DEPLOYMENT_SPACE_ID env
if "watsonx/deployment/" in self.model:
logging.warning(
"Custom WatsonX deployment detected. You may need to set the WATSONX_DEPLOYMENT_SPACE_ID "
"environment variable for proper functionality. For more information, refer to the documentation: "
"https://docs.litellm.ai/docs/providers/watsonx#usage---models-in-deployment-spaces"
)
else:
#
api_key_env_var = f"{provider.upper()}_API_KEY"
if api_key:
os.environ[api_key_env_var] = api_key
model_requirements = litellm.validate_environment(model=model)

if not model_requirements["keys_in_environment"]:
raise Exception(f"model {model} requires the following environment variables: {model_requirements['missing_keys']}")


def _strip_model_prefix(self) -> str:
"""
Helper function to strip 'openai/' prefix from model name if it exists.
@@ -125,14 +157,7 @@ def completion(self, messages: List[Dict[str, Any]], tools: Optional[List[Tool]]
drop_params=drop_params
)



if isinstance(result, ModelResponse):
response = result.choices[0]
response_message = response.message
# when asked to run tools, we expect no response other than the request to run tools unless bedrock
if response_message.content and ('bedrock' not in self.model and logging.DEBUG != logging.root.level):
logging.warning(f"got unexpected response when tools were given: {response_message.content}")
return result
else:
raise Exception(f"Unexpected type returned by the LLM {type(result)}")
4 changes: 2 additions & 2 deletions holmes/core/tool_calling_llm.py
@@ -117,7 +117,7 @@ def call(
messages, max_context_size, maximum_output_token
)

logging.debug(f"sending messages {messages}")
logging.debug(f"sending messages={messages}\n\ntools={tools}")
try:
full_response = self.llm.completion(
messages=parse_messages_tags(messages),
@@ -127,7 +127,7 @@
response_format=response_format,
drop_params=True,
)
logging.debug(f"got response {full_response}")
logging.debug(f"got response {full_response.to_json()}")
# catch a known error that occurs with Azure and replace the error message with something more obvious to the user
except BadRequestError as e:
if (
41 changes: 20 additions & 21 deletions holmes/main.py
@@ -51,7 +51,7 @@

class Verbosity(Enum):
NORMAL = 0
LOG_QUERIES = 1
LOG_QUERIES = 1 # TODO: currently unused
VERBOSE = 2
VERY_VERBOSE = 3

@@ -65,24 +65,7 @@ def cli_flags_to_verbosity(verbose_flags: List[bool]) -> Verbosity:
else:
return Verbosity.VERY_VERBOSE

def init_logging(verbose_flags: List[bool] = None):
verbosity = cli_flags_to_verbosity(verbose_flags)

if verbosity == Verbosity.VERY_VERBOSE:
logging.basicConfig(level=logging.DEBUG, format="%(message)s", handlers=[RichHandler(show_level=False, show_time=False)])
else:
logging.basicConfig(level=logging.INFO, format="%(message)s", handlers=[RichHandler(show_level=False, show_time=False)])

if verbosity.value >= Verbosity.NORMAL.value:
logging.info(f"verbosity is {verbosity}")

if verbosity.value >= Verbosity.LOG_QUERIES.value:
# TODO
pass

if verbosity.value >= Verbosity.VERBOSE.value:
logging.getLogger().setLevel(logging.DEBUG)

def suppress_noisy_logs():
# disable INFO logs from OpenAI
logging.getLogger("httpx").setLevel(logging.WARNING)
# disable INFO logs from LiteLLM
@@ -94,8 +77,24 @@ def init_logging(verbose_flags: List[bool] = None):
logging.getLogger("openai._base_client").setLevel(logging.INFO)
logging.getLogger("httpcore").setLevel(logging.INFO)
logging.getLogger("markdown_it").setLevel(logging.INFO)
# Suppress UserWarnings from the slack_sdk module
# suppress UserWarnings from the slack_sdk module
warnings.filterwarnings("ignore", category=UserWarning, module="slack_sdk.*")

def init_logging(verbose_flags: List[bool] = None):
verbosity = cli_flags_to_verbosity(verbose_flags)

if verbosity == Verbosity.VERY_VERBOSE:
logging.basicConfig(level=logging.DEBUG, format="%(message)s", handlers=[RichHandler(show_level=False, show_time=False)])
elif verbosity == Verbosity.VERBOSE:
logging.basicConfig(level=logging.INFO, format="%(message)s", handlers=[RichHandler(show_level=False, show_time=False)])
logging.getLogger().setLevel(logging.DEBUG)
suppress_noisy_logs()
else:
logging.basicConfig(level=logging.INFO, format="%(message)s", handlers=[RichHandler(show_level=False, show_time=False)])
suppress_noisy_logs()

logging.debug(f"verbosity is {verbosity}")

return Console()

# Common cli options
@@ -138,7 +137,7 @@ def init_logging(verbose_flags: List[bool] = None):
[],
"--verbose",
"-v",
help="Verbose output. You can pass multiple times to increase the verbosity. e.g. -v or -vv or -vvv or -vvvv",
help="Verbose output. You can pass multiple times to increase the verbosity. e.g. -v or -vv or -vvv",
)
opt_echo_request: bool = typer.Option(
True,
14 changes: 7 additions & 7 deletions poetry.lock

Some generated files are not rendered by default.

2 changes: 1 addition & 1 deletion pyproject.toml
@@ -28,7 +28,7 @@ supabase = "^2.5"
colorlog = "^6.8.2"
strenum = "^0.4.15"
markdown = "^3.6"
litellm = "^1.50.2"
litellm = "^1.52.6"
certifi = "^2024.7.4"
urllib3 = "^1.26.19"
boto3 = "^1.34.145"