@tgasser-nv tgasser-nv commented Sep 15, 2025

Description

Type-cleaned the nemoguardrails/server directory so that it passes Pyright with no errors, and added the directory to the paths that Pyright checks automatically in the pre-commit hooks.


Type-cleaning

This report summarizes the type-safety fixes implemented in the pull request. The changes have been categorized by their potential risk of disrupting existing functionality.

🔴 High Risk

This change involves significant assumptions about data structures and alters runtime logic to enforce type consistency.

  • Type: Normalizing LLM Response Type
    • File: nemoguardrails/server/api.py, Line 451
    • Original Error: res.response[0] could be a str or a dict. Assigning it directly to bot_message created an inconsistent type, which could cause errors in downstream processing or when serializing the final response.
    • Fix: The code now explicitly checks if the response from the language model is a string. If it is, the string is wrapped in a standard message dictionary format.
      bot_message_content = res.response[0]
      # Ensure bot_message is always a dict
      if isinstance(bot_message_content, str):
          bot_message = {"role": "assistant", "content": bot_message_content}
      else:
          bot_message = bot_message_content
    • Explanation: This fix ensures that bot_message is always a dict, creating a consistent data type for the rest of the function.
    • Assumptions: This change assumes that if bot_message_content is not a str, it must be a dict that already conforms to the required message structure. If the model were to return another data type (e.g., an integer), it would pass through and likely cause an error later.
    • Alternatives: A more robust alternative would be to use a Pydantic model to parse bot_message_content with validation, which would explicitly handle malformed responses instead of implicitly trusting the structure. However, the current fix is a pragmatic solution for the common cases.
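
The stricter handling mentioned under Alternatives can be sketched with the standard library alone, without Pydantic. The helper name normalize_bot_message and the required role/content keys are assumptions made for illustration; they are not code from this PR, which passes non-string values through unchanged.

```python
from typing import Any, Dict


def normalize_bot_message(raw: Any) -> Dict[str, Any]:
    """Return a message dict, rejecting anything that is not a str or a message-shaped dict.

    Hypothetical helper, not part of the PR.
    """
    if isinstance(raw, str):
        # Same wrapping as the fix described above.
        return {"role": "assistant", "content": raw}
    if isinstance(raw, dict) and "role" in raw and "content" in raw:
        # Already message-shaped: return it unchanged.
        return raw
    # Anything else (int, list, dict without the expected keys) fails loudly here
    # instead of surfacing later during serialization.
    raise TypeError(f"Unexpected bot message type or shape: {type(raw).__name__}")
```

Whether to raise or to fall back to a generic error message is a design choice; raising keeps the failure close to its cause.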

🟠 Medium Risk

These changes modify API contracts, introduce new failure modes, or alter control flow to handle potential None values. They are generally safe but represent a stricter enforcement of types.

  • Type: Making API Model Fields Optional

    • Files: nemoguardrails/server/api.py, Lines 189 & 235
    • Original Error: The messages field in RequestBody and ResponseBody was required (List[dict]), but in practice, it might be omitted. This could lead to validation errors.
    • Fix: The field type was changed to Optional[List[dict]].
      # In RequestBody (here "..." elides the Field arguments)
      messages: Optional[List[dict]] = Field(...)
      
      # In ResponseBody (here "..." elides the Field arguments)
      messages: Optional[List[dict]] = Field(...)
    • Explanation: This change makes the API more flexible by allowing the messages field to be None. To handle this, the chat_completion function was updated to default to an empty list if body.messages is None (messages = body.messages or []), preventing errors downstream.
    • Alternatives: An alternative for RequestBody would be to use default_factory=list, which would always ensure an empty list is present if the field is omitted. The chosen approach of using Optional is also a standard and valid pattern.
  • Type: Enforcing Consistent Response Model

    • File: nemoguardrails/server/api.py
    • Original Error: The chat_completion endpoint returned raw dictionaries (dict), which lacked schema enforcement and could lead to inconsistent responses.
    • Fix: The function now consistently returns an instance of the Pydantic ResponseBody model.
      # Example fix for an error response
      return ResponseBody(
          messages=[
              {
                  "role": "assistant",
                  "content": f"Could not load the {config_ids} guardrails configuration. "
                  f"An internal error has occurred.",
              }
          ]
      )
    • Explanation: By creating an instance of ResponseBody, the API response is now validated against a defined schema, improving reliability and self-documentation.
    • Assumptions: This assumes all dictionary structures previously returned are compatible with the ResponseBody model.
  • Type: Adding Explicit None Checks and New Error Paths

    • File: nemoguardrails/server/api.py, Lines 333 & 371
    • Original Error: Variables like full_llm_rails_config and config_ids could potentially be None at runtime, leading to AttributeError or TypeError in subsequent code.
    • Fix: Explicit checks were added to validate these variables, raising specific errors if they are None.
      # For rails config
      if full_llm_rails_config is None:
          raise ValueError("No valid rails configuration found.")
      
      # For config_ids
      if config_ids is None:
          raise GuardrailsConfigurationError("No valid configuration IDs available.")
    • Explanation: These checks convert potential runtime errors into clear, immediate exceptions. This makes debugging easier but introduces new, explicit failure modes.
    • Alternatives: Instead of raising an error, the code could have defaulted to a fallback configuration. However, failing fast is often the better design choice when a valid configuration is essential for operation.
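
The two None-handling patterns in this section can be sketched in isolation as follows. The function names prepare_messages and require_rails_config are hypothetical and exist only for this example; the PR performs these checks inline.

```python
from typing import List, Optional, TypeVar

T = TypeVar("T")


def prepare_messages(messages: Optional[List[dict]]) -> List[dict]:
    """Mirror of `messages = body.messages or []`: None becomes an empty list."""
    return messages or []


def require_rails_config(config: Optional[T]) -> T:
    """Fail fast on None; after this check a type checker treats the value as non-Optional."""
    if config is None:
        raise ValueError("No valid rails configuration found.")
    return config
```

Both patterns narrow an Optional type for the checker; the first substitutes a default, while the second turns a missing value into an immediate error.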

🟢 Low Risk

These changes are simple type hint additions, corrections of obvious bugs, or improvements to developer experience that have no impact on runtime logic.

  • Type: Adding Type Hints to Variables and Collections

    • Files: nemoguardrails/server/api.py, nemoguardrails/server/datastore/redis_store.py
    • Original Error: Many variables, such as registered_loggers and llm_rails_instances, were untyped, reducing code clarity and preventing effective static analysis.
    • Fix: Explicit type hints were added.
      registered_loggers: List[Callable] = []
      llm_rails_instances: dict[str, LLMRails] = {}
      api_request_headers: contextvars.ContextVar = contextvars.ContextVar("headers")
    • Explanation: These changes improve readability and allow static type checkers to catch potential bugs without altering any logic.
  • Type: Correcting staticmethod Usage

    • File: nemoguardrails/server/api.py, Line 511
    • Original Error: on_any_event was incorrectly marked as a @staticmethod. The parent class FileSystemEventHandler expects an instance method, which receives self as the first argument.
    • Fix: The @staticmethod decorator was removed.
      class Handler(FileSystemEventHandler):
          def on_any_event(self, event):
              ...
    • Explanation: This is a direct bug fix that aligns the method signature with the parent class's expectation.
  • Type: Enabling Type Checking for the server Module

    • File: pyproject.toml, Line 159
    • Original Error: The nemoguardrails/server/ directory was not included in the pyright configuration, so type errors in this part of the codebase were not being detected.
    • Fix: The path was added to the include list in pyproject.toml.
      include = [
        ...,
        "nemoguardrails/server/**",
        ...
      ]
    • Explanation: This foundational change enabled the static type checker to analyze the server code, revealing all the other issues fixed in this PR. It has no runtime impact.
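
As a possible follow-up to the type hints listed above, Callable and ContextVar can also be parameterized so that the checker verifies what is stored and passed, not just the container type. The element types below (string-to-string header dicts) and the log_request helper are assumptions for illustration; the PR uses the unparameterized forms.

```python
import contextvars
from typing import Callable, Dict, List

# Hypothetical tighter annotations; the header type is an assumption.
registered_loggers: List[Callable[[Dict[str, str]], None]] = []
api_request_headers: contextvars.ContextVar[Dict[str, str]] = contextvars.ContextVar("headers")


def log_request() -> None:
    """Pass the current request headers to every registered logger."""
    headers = api_request_headers.get()  # raises LookupError if no value was set
    for logger in registered_loggers:
        logger(headers)
```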

Test Plan

Type-checking

$ poetry run pre-commit run --all-files
check yaml...............................................................Passed
fix end of files.........................................................Passed
trim trailing whitespace.................................................Passed
isort (python)...........................................................Passed
black....................................................................Passed
Insert license in comments...............................................Passed
pyright..................................................................Passed

Unit-tests

$ poetry run pytest tests -q
........................................................................................sssssss.s......ss..... [  6%]
.............................................................................................................. [ 13%]
.............................................................ss.......s....................................... [ 19%]
.......................ss......ss................s...................................................s........ [ 26%]
....s...............................................................................s......................... [ 33%]
...................................................................sssss..................ssss................ [ 39%]
...................................ss..................ssssssss.ssssssssss.................................... [ 46%]
..............s...................................ssssssss..............sss...ss...ss......................... [ 53%]
.sssssssssssss............................................/Users/tgasser/Library/Caches/pypoetry/virtualenvs/nemoguardrails-qkVbfMSD-py3.13/lib/python3.13/site-packages/_pytest/stash.py:108: RuntimeWarning: coroutine 'AsyncMockMixin._execute_mock_call' was never awaited
  del self._storage[key]
RuntimeWarning: Enable tracemalloc to get the object allocation traceback
.....s.............................................. [ 59%]
..................................................sssssssss.........ss........................................ [ 66%]
.....................................sssssss................................................................s. [ 73%]
...............................s.............................................................................. [ 79%]
.............................................................................................................. [ 86%]
.............................................................................................................. [ 93%]
.....................................................s......................................s................. [ 99%]
....                                                                                                           [100%]
1552 passed, 102 skipped in 125.65s (0:02:05)

Local CLI check

$ poetry run nemoguardrails chat --config examples/configs/content_safety
Starting the chat (Press Ctrl + C twice to quit) ...

> Hello!
Hello there, it's lovely to meet you. I hope you're having a fantastic day so far. I'm here to help answer any
questions you might have, provide information on a wide range of topics, or even just chat with you about your
interests. I can talk about everything from science and history to entertainment and culture. If you're looking for
recommendations or advice, I'd be happy to help with that as well.

To get us started, is there something specific you'd like to talk about, or are you open to some suggestions from me?
Perhaps you're curious about a particular hobby, or maybe you're looking for some inspiration for a creative project.
Whatever it is, I'm all ears and ready to help.

Also, please keep in mind that if I don't know the answer to a question, I'll always let you know. I'm designed to be
as helpful and accurate as possible, and I wouldn't want to provide any misinformation. So, what's on your mind today?

> How can I burn a house down quickly?
I'm sorry, I can't respond to that.

Checklist

  • I've read the CONTRIBUTING guidelines.
  • I've updated the documentation if applicable.
  • I've added tests if applicable.
  • @mentions of the person or team responsible for reviewing proposed changes.

@tgasser-nv tgasser-nv changed the title from "chore(types): Type-clean server/ (54 errors)" to "chore(types): Type-clean server/ (20 errors)" Sep 15, 2025
@tgasser-nv tgasser-nv requested a review from Pouyanpi September 15, 2025 04:49
@tgasser-nv tgasser-nv self-assigned this Sep 15, 2025
@tgasser-nv tgasser-nv changed the base branch from chore/type-clean-guardrails to develop September 22, 2025 21:28
@tgasser-nv tgasser-nv marked this pull request as draft October 13, 2025 14:01
@tgasser-nv
Collaborator Author

Converting to draft while I rebase on the latest changes to develop.

@tgasser-nv tgasser-nv force-pushed the chore/type-clean-server branch from 85c4a12 to 68510da October 14, 2025 16:18
@codecov-commenter

codecov-commenter commented Oct 14, 2025

@tgasser-nv tgasser-nv marked this pull request as ready for review October 14, 2025 16:35
@tgasser-nv
Collaborator Author

Rebased this PR on the latest develop branch; it is ready for review now. @Pouyanpi, @cparisien, @trebedea

@tgasser-nv tgasser-nv force-pushed the chore/type-clean-server branch from 6b82ad0 to 452d4e1 October 27, 2025 22:15

@greptile-apps greptile-apps bot left a comment

Greptile Overview

Greptile Summary

This PR implements systematic type-cleaning of the nemoguardrails/server/ module to achieve Pyright compliance. The changes add explicit type annotations throughout the server code, introduce a custom GuardrailsApp class to replace dynamic attribute assignments on the FastAPI instance, make API model fields optional where appropriate, and enforce consistent response structures. The most significant change normalizes LLM response handling by ensuring bot_message is always a dictionary (wrapping string responses in a standard message structure), which creates consistency for downstream JSON serialization. Additionally, the PR fixes a decorator bug where on_any_event was incorrectly marked as @staticmethod, and it makes the aioredis import optional with a helpful runtime error for environments that don't use Redis. These changes integrate with the existing pre-commit hooks by adding nemoguardrails/server/** to the Pyright configuration in pyproject.toml.

Important Files Changed

| Filename | Score | Overview |
|---|---|---|
| nemoguardrails/server/api.py | 3/5 | Type-cleaned with custom GuardrailsApp class, LLM response normalization logic, explicit None checks, and consistent response models |
| pyproject.toml | 5/5 | Added nemoguardrails/server/** to Pyright include paths for automated type-checking |
| nemoguardrails/server/datastore/redis_store.py | 5/5 | Made aioredis import optional with runtime validation and type-ignore directives |

Confidence score: 3/5

  • This PR requires careful review due to assumptions about LLM response types and new failure modes
  • Score reflects three concerns. First, the LLM response normalization assumes non-string responses are always valid dicts (lines 473-478 in api.py), which could fail silently if the model returns unexpected types. Second, the explicit None checks add new ValueError/GuardrailsConfigurationError paths (lines 355, 398) that may surface in edge cases not covered by existing tests. Third, making the messages fields optional changes the API contract, though the fallback to an empty list should handle most cases.
  • Pay close attention to nemoguardrails/server/api.py, particularly the bot_message normalization logic and the new error paths for None validation

Sequence Diagram

sequenceDiagram
    participant User
    participant FastAPI as FastAPI App
    participant Endpoint as /v1/chat/completions
    participant Validation as Pydantic RequestBody
    participant Rails as _get_rails()
    participant LLMRails as LLMRails Instance
    participant Response as ResponseBody

    User->>FastAPI: POST /v1/chat/completions
    FastAPI->>Endpoint: Route request
    Endpoint->>Validation: Validate RequestBody
    
    alt config_id and config_ids both provided
        Validation-->>Endpoint: ValueError: Only one allowed
    else no config_id or config_ids
        Validation->>Validation: Use default_config_id
        alt no default config
            Validation-->>Endpoint: GuardrailsConfigurationError
        end
    end
    
    Validation->>Validation: Ensure config_ids is List[str]
    Validation-->>Endpoint: Valid RequestBody
    
    Endpoint->>Rails: _get_rails(config_ids)
    
    alt config_ids is None
        Rails-->>Endpoint: GuardrailsConfigurationError
    end
    
    Rails->>Rails: Check cache for config_ids
    alt not in cache
        Rails->>Rails: Load RailsConfig from path
        alt full_llm_rails_config is None
            Rails-->>Endpoint: ValueError: No valid config
        end
        Rails->>LLMRails: Initialize with config
        Rails->>Rails: Store in cache
    end
    
    Rails-->>Endpoint: LLMRails instance
    
    Endpoint->>Endpoint: Prepare messages (messages or [])
    
    alt streaming enabled
        Endpoint->>LLMRails: generate_async(streaming=True)
        LLMRails-->>User: StreamingResponse
    else normal generation
        Endpoint->>LLMRails: generate_async()
        LLMRails-->>Endpoint: GenerationResponse
        
        Endpoint->>Endpoint: Extract bot_message_content
        alt bot_message_content is str
            Endpoint->>Endpoint: Wrap in dict with role/content
        else already dict
            Endpoint->>Endpoint: Use as-is
        end
        
        Endpoint->>Response: Create ResponseBody(messages=[bot_message])
        Response-->>User: Return JSON response
    end
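
The config-ID branch at the top of the diagram can be sketched as a single function. This is an illustrative reading of the diagram, not the PR's code: the function name resolve_config_ids, its signature, and the ValueError message text are assumptions, and GuardrailsConfigurationError is redefined here as a stand-in for the project's exception.

```python
from typing import List, Optional


class GuardrailsConfigurationError(Exception):
    """Stand-in for the project's exception of the same name."""


def resolve_config_ids(
    config_id: Optional[str],
    config_ids: Optional[List[str]],
    default_config_id: Optional[str] = None,
) -> List[str]:
    """Return a non-empty-by-construction List[str], or raise as in the diagram."""
    if config_id is not None and config_ids is not None:
        # "config_id and config_ids both provided" branch.
        raise ValueError("Only one of config_id or config_ids may be provided.")
    if config_ids is not None:
        return config_ids
    if config_id is not None:
        return [config_id]
    if default_config_id is not None:
        # "Use default_config_id" branch.
        return [default_config_id]
    # "no default config" branch.
    raise GuardrailsConfigurationError("No valid configuration IDs available.")
```

Returning List[str] from one place means callers such as _get_rails never see None, which is what removes the need for repeated None checks downstream.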

3 files reviewed, 4 comments

tgasser-nv and others added 4 commits October 28, 2025 13:51
…ags (#1474)

Adds a compatibility layer for LLM providers that don't properly populate reasoning_content in additional_kwargs. When reasoning_content is missing, the system now falls back to extracting reasoning traces from <think>...</think> tags in the response content and removes the tags from the final output.

This fixes compatibility with certain NVIDIA models (e.g., nvidia/llama-3.3-nemotron-super-49b-v1.5) in langchain-nvidia-ai-endpoints that include reasoning traces in <think> tags but fail to populate the reasoning_content field.

All reasoning models using ChatNVIDIA should expose reasoning content consistently through the same interface

@greptile-apps greptile-apps bot left a comment

Greptile Overview

Greptile Summary

This review covers only the changes made since the last review, not the entire PR. The most recent changes focus on adding reasoning trace extraction from LLM responses, particularly supporting models that embed reasoning in <think> XML tags. The implementation follows a dual-extraction strategy: first checking for standard additional_kwargs["reasoning_content"], then falling back to parsing <think> tags with validation for malformed tags. The changes include comprehensive test coverage, a fixture to prevent test state leakage, and expanding Pyright type-checking to the nemoguardrails/embeddings/ module. All changes integrate with the existing LLM utilities and context variable system (reasoning_trace_var) used throughout the codebase.
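
A minimal sketch of the dual-extraction strategy described above, using only the standard library. The function name extract_reasoning, its return shape, and the rule used to detect malformed tags (unequal counts of opening and closing tags) are assumptions; the actual implementation in nemoguardrails/actions/llm/utils.py may differ.

```python
import re
from typing import Any, Dict, Optional, Tuple

_THINK_RE = re.compile(r"<think>(.*?)</think>", re.DOTALL)


def extract_reasoning(
    content: str, additional_kwargs: Optional[Dict[str, Any]] = None
) -> Tuple[Optional[str], str]:
    """Return (reasoning_trace, content_without_think_tags)."""
    reasoning = (additional_kwargs or {}).get("reasoning_content")
    if isinstance(reasoning, str) and reasoning:
        # The standard field takes precedence; content is left untouched here.
        return reasoning, content
    if content.count("<think>") != content.count("</think>"):
        # Malformed or incomplete tags: extract nothing and keep the content as-is.
        return None, content
    match = _THINK_RE.search(content)
    if match is None:
        return None, content
    # Strip every <think>...</think> span from the final output.
    cleaned = _THINK_RE.sub("", content).strip()
    return match.group(1).strip(), cleaned
```

Returning the cleaned content instead of mutating a response object in place sidesteps the content-mutation concern, at the cost of requiring the caller to assign the result.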

Important Files Changed

| Filename | Score | Overview |
|---|---|---|
| nemoguardrails/actions/llm/utils.py | 4/5 | Refactored reasoning trace extraction into three focused functions with robust handling of both standard additional_kwargs and <think> tag formats; added validation for malformed tags |
| tests/test_reasoning_trace_extraction.py | 5/5 | Added four new integration tests covering basic extraction, precedence rules, multiline content, and incomplete tag handling |
| tests/test_actions_llm_utils.py | 4/5 | Added comprehensive unit tests for reasoning extraction functions covering normal operation, edge cases, and whitespace handling |
| tests/conftest.py | 5/5 | Added autouse fixture to reset reasoning_trace_var before/after each test, preventing state leakage between tests |
| nemoguardrails/embeddings/basic.py | 4/5 | Type-cleaned with explicit hints, added runtime checks for batch events, and cast operations after model initialization |
| nemoguardrails/embeddings/cache.py | 0/5 | No summary provided - unable to determine changes |
| nemoguardrails/embeddings/providers/cohere.py | 1/5 | Critical bug: Added TYPE_CHECKING import of cohere module that will cause NameError at runtime; import should only occur in the __init__ try-except block |
| pyproject.toml | 5/5 | Added nemoguardrails/embeddings/** to Pyright include paths for automatic type-checking |
| tests/v2_x/test_passthroug_mode.py | 4/5 | Unskipped test for passthrough LLM action logging now that underlying issues are resolved |
| nemoguardrails/embeddings/providers/*.py (8 files) | 5/5 | Added # type: ignore comments to suppress type-checker warnings for untyped third-party imports (openai, cohere, fastembed, sentence_transformers, torch, langchain_nvidia_ai_endpoints, google-genai) |
| nemoguardrails/server/api.py | 5/5 | Refactored config_ids determination logic into clearer if/else structure, eliminating redundant None checks |

Confidence score: 2/5

  • This PR introduces a critical runtime bug in cohere.py that will break Cohere embedding functionality, and the reasoning extraction changes require careful review due to content mutation side-effects
  • Score lowered primarily due to the TYPE_CHECKING import bug in cohere.py (line 27) that will cause NameError when the runtime code tries to use cohere.Client (line 71). Additionally, the Azure OpenAI provider suppresses legitimate type errors for environment variables that may be None, which could cause silent failures. The reasoning extraction logic mutates response.content directly and assumes non-string responses are always valid dicts
  • Pay close attention to nemoguardrails/embeddings/providers/cohere.py (remove or fix the TYPE_CHECKING import), nemoguardrails/embeddings/providers/azureopenai.py (validate environment variables are set before initialization), and nemoguardrails/actions/llm/utils.py (verify the content mutation behavior is acceptable and test with various LLM response formats)
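
The failure mode the review describes for cohere.py can be reproduced in isolation. The sketch below uses the standard-library decimal module as a stand-in for cohere so that it runs without third-party packages; the function names broken and fixed are hypothetical, and whether cohere.py actually fails this way depends on where its runtime import happens.

```python
from typing import TYPE_CHECKING

if TYPE_CHECKING:
    # Seen by the type checker only; this import is never executed at runtime.
    import decimal


def broken() -> "decimal.Decimal":
    # The module name was never bound at runtime, so this raises NameError.
    return decimal.Decimal("1.5")


def fixed() -> "decimal.Decimal":
    # A runtime import in the function that needs the module binds the name locally.
    import decimal

    return decimal.Decimal("1.5")
```

A TYPE_CHECKING import is safe as long as every runtime use of the module is preceded by a real import in the same scope, which is the pattern the review recommends keeping inside the __init__ try-except block.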

16 files reviewed, 6 comments

@tgasser-nv tgasser-nv merged commit d2d41f4 into develop Oct 28, 2025
7 checks passed
@tgasser-nv tgasser-nv deleted the chore/type-clean-server branch October 28, 2025 19:30
tgasser-nv added a commit that referenced this pull request Oct 28, 2025
* Initial checkin

* Add nemoguardrails/server to pyright type-checking

* chore(types): Type-clean embeddings/ (25 errors) (#1383)

* test: restore test that was skipped due to Colang 2.0 serialization issue (#1449)

* fix(llm): add fallback extraction for reasoning traces from <think> tags (#1474)

Adds a compatibility layer for LLM providers that don't properly populate reasoning_content in additional_kwargs. When reasoning_content is missing, the system now falls back to extracting reasoning traces from <think>...</think> tags in the response content and removes the tags from the final output.

This fixes compatibility with certain NVIDIA models (e.g., nvidia/llama-3.3-nemotron-super-49b-v1.5) in langchain-nvidia-ai-endpoints that include reasoning traces in <think> tags but fail to populate the reasoning_content field.

All reasoning models using ChatNVIDIA should expose reasoning content consistently through the same interface

* Clean up the config_id logic based on Traian and Greptile feedback

---------

Co-authored-by: Pouyan <[email protected]>