feat(RHINENG-25223): Update workloads filtering logic to use OR operator #3839

Open
computercamplove wants to merge 13 commits into master from RHINENG-25223

Conversation

@computercamplove computercamplove commented Mar 26, 2026

Overview

This PR addresses RHINENG-25223.
Currently, the backend implementation of the Workload filter uses AND logic. This task updates the backend to use OR logic so that the Workload filter behaves consistently with the other existing Inventory filters (e.g., System Type, Status, Tags, and Data Collector).
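
To make the difference concrete, here is a minimal sketch in plain Python rather than the SQLAlchemy filters the backend builds. The host records, the `is_sap_system` and `has_controller_1_2_3` helpers, and `filter_hosts` are hypothetical and exist only for illustration.

```python
# Hypothetical in-memory hosts; the layout is illustrative, not the real schema.
hosts = [
    {"id": "h1", "workloads": {"sap": {"sap_system": True}}},
    {"id": "h2", "workloads": {"ansible": {"controller_version": "1.2.3"}}},
    {
        "id": "h3",
        "workloads": {
            "sap": {"sap_system": True},
            "ansible": {"controller_version": "1.2.3"},
        },
    },
    {"id": "h4", "workloads": {}},
]


def is_sap_system(host):
    return host["workloads"].get("sap", {}).get("sap_system") is True


def has_controller_1_2_3(host):
    return host["workloads"].get("ansible", {}).get("controller_version") == "1.2.3"


def filter_hosts(hosts, predicates, combine):
    """Return IDs of hosts for which `combine` (all or any) of the predicates holds."""
    return [h["id"] for h in hosts if combine(p(h) for p in predicates)]


predicates = [is_sap_system, has_controller_1_2_3]

and_ids = filter_hosts(hosts, predicates, all)  # old behavior: every filter must match
or_ids = filter_hosts(hosts, predicates, any)   # new behavior: any filter may match
```

Under OR, a host that satisfies several workload filters (`h3` here) is still returned only once.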

PR Checklist

  • Keep PR title short, ideally under 72 characters
  • Descriptive comments provided in complex code blocks
  • Include raw query examples in the PR description, if adding/modifying SQL query
  • Tests: validate optimal/expected output
  • Tests: validate exceptions and failure scenarios
  • Tests: edge cases
  • Recovers or fails gracefully during potential resource outages (e.g. DB, Kafka)
  • Uses type hinting, if convenient
  • Documentation, if this PR changes the way other services interact with host inventory
  • Links to related PRs

Secure Coding Practices Documentation Reference

You can find documentation on this checklist here.

Secure Coding Checklist

  • Input Validation
  • Output Encoding
  • Authentication and Password Management
  • Session Management
  • Access Control
  • Cryptographic Practices
  • Error Handling and Logging
  • Data Protection
  • Communication Security
  • System Configuration
  • Database Security
  • File Management
  • Memory Management
  • General Coding Practices

Summary by Sourcery

Adjust system profile workload filtering to use OR logic across multiple workload criteria while keeping standard filters behavior unchanged.

New Features:

  • Support OR-based combination of multiple workload-related system profile filters.

Bug Fixes:

  • Ensure workload filters return hosts matching any of the specified workloads instead of requiring all of them.

Enhancements:

  • Refactor system profile filtering to distinguish workload filters from standard fields and group workload conditions appropriately.

Tests:

  • Add tests verifying that multiple workload filters are combined with OR semantics and update existing fuzzy match cases to reflect the new behavior.
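
A rough sketch of the grouping this summary describes, written with plain Python predicates instead of SQLAlchemy clauses. It assumes that separate standard filters continue to be ANDed with each other and that all workload filters form a single OR group; the host fields and the `host_matches` function are hypothetical.

```python
def host_matches(host, standard_predicates, workload_predicates):
    """Standard predicates must all hold; workload predicates form one OR group."""
    standard_ok = all(p(host) for p in standard_predicates)
    # With no workload filters, the OR group is skipped rather than treated as False.
    workload_ok = (
        any(p(host) for p in workload_predicates) if workload_predicates else True
    )
    return standard_ok and workload_ok


# Hypothetical hosts with one standard field and a set of workload names.
hosts = [
    {"id": "a", "arch": "x86_64", "workloads": {"sap"}},
    {"id": "b", "arch": "x86_64", "workloads": {"ansible"}},
    {"id": "c", "arch": "aarch64", "workloads": {"sap"}},
    {"id": "d", "arch": "x86_64", "workloads": set()},
]

standard = [lambda h: h["arch"] == "x86_64"]
workload = [
    lambda h: "sap" in h["workloads"],
    lambda h: "ansible" in h["workloads"],
]

with_workloads = [h["id"] for h in hosts if host_matches(h, standard, workload)]
without_workloads = [h["id"] for h in hosts if host_matches(h, standard, [])]
```

Host `c` matches a workload filter but fails the standard filter, so it is excluded; host `d` is returned only when no workload filter is supplied.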

@github-actions

SC Environment Impact Assessment

Overall Impact: NONE

No SC Environment-specific impacts detected in this PR.

What was checked

This PR was automatically scanned for:

  • Database migrations
  • ClowdApp configuration changes
  • Kessel integration changes
  • AWS service integrations (S3, RDS, ElastiCache)
  • Kafka topic changes
  • Secrets management changes
  • External dependencies

@computercamplove

WIP - need to update IQE tests


@sourcery-ai sourcery-ai bot left a comment


Hey - I've found 3 issues, and left some high level feedback:

  • Given that _build_workloads_filter is now used for both workload and non-workload fields (via _process_standard_group), consider renaming it (and possibly moving it) to better reflect its broader purpose and avoid confusion for future maintainers.
Prompt for AI Agents
Please address the comments from this code review:

## Overall Comments
- Given that `_build_workloads_filter` is now used for both workload and non-workload fields (via `_process_standard_group`), consider renaming it (and possibly moving it) to better reflect its broader purpose and avoid confusion for future maintainers.

## Individual Comments

### Comment 1
<location path="tests/test_api_hosts_get.py" line_range="2792-2801" />
<code_context>
+def test_query_multiple_workloads_uses_or_logic(db_create_host, api_get, sp_filter_param):
</code_context>
<issue_to_address>
**issue (testing):** Add a case where a single host matches multiple workloads to verify deduplication and semantics.

The current test confirms OR semantics across the three workloads and excludes unrelated workloads. To strengthen it, please add a case where a single host satisfies at least two workload predicates (e.g., `sap_system=True` and `ansible.controller_version=1.2.3`) and assert that:

- The host ID appears only once in `response_ids`, and
- The total result count reflects deduplication.

This will validate both the OR logic and that multi-matching hosts don’t produce duplicate results.
</issue_to_address>

### Comment 2
<location path="tests/test_api_hosts_get.py" line_range="2772-2781" />
<code_context>
+@pytest.mark.parametrize(
</code_context>
<issue_to_address>
**suggestion (testing):** Consider adding a negative test for multiple workload filters that should return an empty result set.

Right now all `sp_filter_param` cases ensure at least one host matches, which validates the OR behavior. Please add one parametrized case where valid workload filters collectively match no hosts (e.g., a `sap_sids` and `ansible.controller_version` that don’t exist on any created host), and assert a 200 response with an empty `results` list. This protects against regressions where the OR/AND logic returns all hosts or errors instead of an empty set.

Suggested implementation:

```python
    response_status, response_data = api_get(url)
    assert response_status == 400
    assert "Param filter must be appended with [] to accept multiple values." in response_data["detail"]


@pytest.mark.parametrize(
    "sp_filter_param",
    (
        (
            "[workloads][sap][sids][contains][]=NONEXISTENT_SID"
            "&filter[system_profile][workloads][sap][sap_system][]=true"
            "&filter[system_profile][workloads][ansible][controller_version][]=9.9.9"
        ),
    ),
)
def test_get_hosts_sp_workload_filters_no_matches(api_get, sp_filter_param):
    url = build_hosts_url(f"?filter[system_profile]{sp_filter_param}")
    response_status, response_data = api_get(url)

    assert response_status == 200
    assert response_data["results"] == []


@pytest.mark.parametrize(

```

1. This new test assumes there is an existing `build_hosts_url` helper (or equivalent) used elsewhere in this file to construct the hosts URL. If the helper is named differently, update the call in `test_get_hosts_sp_workload_filters_no_matches` accordingly.
2. The specific workload filters (`NONEXISTENT_SID` and `9.9.9`) should be chosen such that they do not match any of the hosts created in the fixtures for this test module. If test data fixtures include those values, adjust the filter values to something guaranteed not to exist.
3. If the API response schema differs (e.g., results are nested or paginated differently), adjust the final assertion to check the correct path to the hosts list, ensuring it asserts that the result set is empty while status is 200.
</issue_to_address>

### Comment 3
<location path="api/filtering/db_custom_filters.py" line_range="398" />
<code_context>
+    return field_name in WORKLOADS_FIELDS or field_name == "workloads"
+
+
+def _process_workload_group(grouped_filter_param):
+    """Workloads always use OR conjunction for multiple values."""
+    if isinstance(grouped_filter_param, list):
</code_context>
<issue_to_address>
**issue (complexity):** Consider replacing the two separate workload/standard group-processing helpers with a single helper parameterized by `is_workload` to unify the control flow and reduce duplication.

You can simplify the new logic by collapsing the two thin helpers into a single helper that takes `is_workload` as a parameter, so the control flow stays in one place while preserving the new workload-grouping behavior.

For example:

```python
def _build_group_filter(grouped_filter_param, *, is_workload: bool):
    if isinstance(grouped_filter_param, list):
        if is_workload:
            # Workloads: OR across values
            return or_(*(_build_workloads_filter(f) for f in grouped_filter_param))

        # Standard fields: AND for arrays, OR otherwise
        field_filter = _get_field_filter_for_deepest_param(
            system_profile_spec(), grouped_filter_param[0]
        )
        conjunction = and_ if field_filter == "array" else or_
        # and_/or_ take clauses as positional arguments, so unpack the generator
        return conjunction(*(_build_workloads_filter(f) for f in grouped_filter_param))

    # Single filter object: workloads and standard share the same builder
    return _build_workloads_filter(grouped_filter_param)
```

Then `build_system_profile_filter` can express the full behavior in one coherent loop, while still separating workload filters for a final OR-group:

```python
def build_system_profile_filter(system_profile_param: dict) -> tuple:
    standard_filters = []
    workload_filters = []

    filter_param_list = _unique_paths(system_profile_param, ["operating_system"])

    for grouped_filter_param in filter_param_list:
        is_workload = _is_workload_filter(grouped_filter_param)
        group_filter = _build_group_filter(grouped_filter_param, is_workload=is_workload)

        if is_workload:
            workload_filters.append(group_filter)
        else:
            standard_filters.append(group_filter)

    system_profile_filter = tuple(standard_filters)
    if workload_filters:
        system_profile_filter += (or_(*workload_filters),)

    return system_profile_filter
```

This keeps:

- The new semantics of grouping all workload filters into a single `or_(*workload_filters)` term.
- The existing rule “arrays use AND, everything else uses OR” for non-workload fields.

But it removes duplicated branching on `isinstance(..., list)` and the extra `_process_workload_group` / `_process_standard_group` indirection, making the control flow easier to follow.
</issue_to_address>


"[workloads][sap][sids][contains][]=ABC&filter[system_profile][workloads][sap][sids][contains][]=GHI",
"[sap][sids][contains][]=ABC&filter[system_profile][sap][sids][contains][]=GHI",
"[workloads][sap][sids][contains][]=HIJ&filter[system_profile][workloads][sap][sids][contains][]=GHI",
"[sap][sids][contains][]=HIJ&filter[system_profile][sap][sids][contains][]=GHI",

Updated the 'not found' tests to use a non-existent workload SAP SID value, because the filtering logic was refactored from AND to OR.
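
A small illustration of why the old 'not found' cases no longer come back empty once the filters are ORed. The host's SID set below is an assumption made for the example, not taken from the test fixtures, and `matches_all` / `matches_any` are hypothetical helpers.

```python
host_sids = {"ABC", "DEF"}  # hypothetical SIDs on a created host


def matches_all(sids, wanted):
    """AND semantics: every requested SID must be present."""
    return all(s in sids for s in wanted)


def matches_any(sids, wanted):
    """OR semantics: one requested SID is enough."""
    return any(s in sids for s in wanted)


old_case = ["ABC", "GHI"]  # one existing SID, one missing
new_case = ["HIJ", "GHI"]  # both missing

old_case_under_and = matches_all(host_sids, old_case)
old_case_under_or = matches_any(host_sids, old_case)
new_case_under_or = matches_any(host_sids, new_case)
```

With AND, the old case finds nothing because `GHI` is missing. With OR, the same case matches on `ABC`, so a 'not found' test needs values that are all absent.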

@computercamplove added the do not merge label and removed the help wanted label Mar 27, 2026
@computercamplove added the ready for review label and removed the do not merge label Mar 30, 2026
@computercamplove

/retest

