Skip to content

refactor(RHINENG-21706): remove code which writes PRS old format#3880

Open
FabriciaDinizRH wants to merge 2 commits intoRedHatInsights:masterfrom
FabriciaDinizRH:remove-old-stleness-code
Open

refactor(RHINENG-21706): remove code which writes PRS old format#3880
FabriciaDinizRH wants to merge 2 commits intoRedHatInsights:masterfrom
FabriciaDinizRH:remove-old-stleness-code

Conversation

@FabriciaDinizRH
Copy link
Copy Markdown
Contributor

@FabriciaDinizRH FabriciaDinizRH commented Apr 1, 2026

Overview

This PR is being created to address RHINENG-21706.
Removes old code which writes the old per_reporter_staleness format
Removes outdated tests
Updates tests to verify only the new format for writes

PR Checklist

  • Keep PR title short, ideally under 72 characters
  • Descriptive comments provided in complex code blocks
  • Include raw query examples in the PR description, if adding/modifying SQL query
  • Tests: validate optimal/expected output
  • Tests: validate exceptions and failure scenarios
  • Tests: edge cases
  • Recovers or fails gracefully during potential resource outages (e.g. DB, Kafka)
  • Uses type hinting, if convenient
  • Documentation, if this PR changes the way other services interact with host inventory
  • Links to related PRs

Secure Coding Practices Documentation Reference

You can find documentation on this checklist here.

Secure Coding Checklist

  • Input Validation
  • Output Encoding
  • Authentication and Password Management
  • Session Management
  • Access Control
  • Cryptographic Practices
  • Error Handling and Logging
  • Data Protection
  • Communication Security
  • System Configuration
  • Database Security
  • File Management
  • Memory Management
  • General Coding Practices

Summary by Sourcery

Simplify per_reporter_staleness handling to always use the flattened ISO-string format and remove legacy nested-format and related feature-flag code.

Enhancements:

  • Update Host model and staleness utilities to always store per_reporter_staleness as flat ISO last_check_in strings per reporter.
  • Adjust test helpers and data generation utilities to reflect the simplified flat per_reporter_staleness schema.
  • Remove the obsolete feature flag controlling flattened per_reporter_staleness and associated update logic.

Tests:

  • Rewrite staleness, host filtering, tagging, and serialization tests to assume only the flat per_reporter_staleness format and drop coverage for the legacy nested format.
  • Remove the backfill_per_reporter_culled_timestamps job tests that exercised the deprecated nested per_reporter_staleness behavior.

Chores:

  • Delete the obsolete backfill_per_reporter_culled_timestamps job that existed solely for migrating nested per_reporter_staleness data.

@github-actions
Copy link
Copy Markdown
Contributor

github-actions bot commented Apr 1, 2026

SC Environment Impact Assessment

Overall Impact:NONE

No SC Environment-specific impacts detected in this PR.

What was checked

This PR was automatically scanned for:

  • Database migrations
  • ClowdApp configuration changes
  • Kessel integration changes
  • AWS service integrations (S3, RDS, ElastiCache)
  • Kafka topic changes
  • Secrets management changes
  • External dependencies

@FabriciaDinizRH FabriciaDinizRH force-pushed the remove-old-stleness-code branch from e2eb092 to d1eab12 Compare April 1, 2026 23:38
@FabriciaDinizRH
Copy link
Copy Markdown
Contributor Author

@sourcery-ai review

Copy link
Copy Markdown
Contributor

@sourcery-ai sourcery-ai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hey - I've left some high level feedback:

  • Now that per_reporter_staleness is always flat, consider renaming helpers like create_reporter_data (and adjusting docstrings) so their names and docs reflect that they return an ISO string rather than a dict, to avoid confusion for future callers.
  • create_host_with_reporter still accepts a staleness_config argument that is no longer used; you could drop this parameter (and related wording in the docstring) to make the helper’s interface match its current behavior.
Prompt for AI Agents
Please address the comments from this code review:

## Overall Comments
- Now that per_reporter_staleness is always flat, consider renaming helpers like create_reporter_data (and adjusting docstrings) so their names and docs reflect that they return an ISO string rather than a dict, to avoid confusion for future callers.
- create_host_with_reporter still accepts a staleness_config argument that is no longer used; you could drop this parameter (and related wording in the docstring) to make the helper’s interface match its current behavior.

Sourcery is free for open source - if you like our reviews please consider sharing them ✨
Help me be more useful! Please click 👍 or 👎 on each comment and I'll use the feedback to improve your reviews.

@FabriciaDinizRH FabriciaDinizRH marked this pull request as ready for review April 1, 2026 23:54
@FabriciaDinizRH FabriciaDinizRH added the ready for review The PR is ready for review label Apr 1, 2026
Copy link
Copy Markdown
Contributor

@sourcery-ai sourcery-ai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hey - I've found 2 issues

Prompt for AI Agents
Please address the comments from this code review:

## Individual Comments

### Comment 1
<location path="tests/test_custom_staleness.py" line_range="80-82" />
<code_context>
-                        hosts_before_update[0].per_reporter_staleness[reporter]["stale_timestamp"]
-                        == expected_stale_before
-                    )
+                v = hosts_before_update[0].per_reporter_staleness[reporter]
+                assert isinstance(v, str)
+                assert v == _now.isoformat()

             staleness_url = build_staleness_url()
</code_context>
<issue_to_address>
**suggestion (testing):** Custom staleness async tests no longer validate any staleness timestamp behavior beyond PRS shape

These tests now only assert that `per_reporter_staleness[reporter]` is an ISO string matching `_now`, whereas they previously validated how `last_check_in` related to `stale_timestamp`/`culled_timestamp` under different configs. Since `_update_hosts_staleness_async` still depends on staleness configuration, can we add at least one assertion in these async flows that checks `host.stale_timestamp` (or serialized staleness timestamps) uses the expected offsets for the custom config, both before and after PATCH/DELETE, to retain that regression coverage?

Suggested implementation:

```python
            # Validate async per-reporter staleness shape (ISO string at "now")
            for reporter in hosts_before_update[0].per_reporter_staleness:
                v = hosts_before_update[0].per_reporter_staleness[reporter]
                assert isinstance(v, str)
                assert v == _now.isoformat()

            # Additionally validate host-level staleness timestamps before the async update
            host_before_update = hosts_before_update[0]
            stale_timestamp_before = host_before_update.stale_timestamp
            culled_timestamp_before = getattr(host_before_update, "culled_timestamp", None)

            # The concrete offsets here should match the custom staleness configuration used
            # for this test (see <additional_changes> notes for aligning these expectations).
            assert stale_timestamp_before is not None
            assert stale_timestamp_before >= _now

            if culled_timestamp_before is not None:
                assert culled_timestamp_before >= stale_timestamp_before

            staleness_url = build_staleness_url()
            status, _ = api_patch(staleness_url, CUSTOM_STALENESS_HOST_BECAME_STALE)
            wait_for_all_events(event_producer, num_hosts)

            hosts_after_update = db_get_hosts(host_ids).all()
            host_after_update = hosts_after_update[0]

            # Validate that async staleness recalculation applied the custom offsets
            stale_timestamp_after = host_after_update.stale_timestamp
            culled_timestamp_after = getattr(host_after_update, "culled_timestamp", None)

            assert stale_timestamp_after is not None

            # Depending on the semantics of CUSTOM_STALENESS_HOST_BECAME_STALE, the host is
            # expected to be "stale" relative to _now. For a typical "became stale" config,
            # stale_timestamp should be at, or before, _now.
            assert stale_timestamp_after <= _now

            if culled_timestamp_after is not None:
                # Culled timestamp is normally some offset after stale_timestamp; ensure
                # that relationship still holds under the async path.
                assert culled_timestamp_after >= stale_timestamp_after

            # Ensure that async recomputation actually changed persisted staleness values
            assert stale_timestamp_after != stale_timestamp_before or culled_timestamp_after != culled_timestamp_before

            for reporter in hosts_after_update[0].per_reporter_staleness:

```

notes for aligning these expectations).
            assert stale_timestamp_before is not None
            assert stale_timestamp_before >= _now

            if culled_timestamp_before is not None:
                assert culled_timestamp_before >= stale_timestamp_before

            staleness_url = build_staleness_url()
            status, _ = api_patch(staleness_url, CUSTOM_STALENESS_HOST_BECAME_STALE)
            wait_for_all_events(event_producer, num_hosts)

            hosts_after_update = db_get_hosts(host_ids).all()
            host_after_update = hosts_after_update[0]

            # Validate that async staleness recalculation applied the custom offsets
            stale_timestamp_after = host_after_update.stale_timestamp
            culled_timestamp_after = getattr(host_after_update, "culled_timestamp", None)

            assert stale_timestamp_after is not None

            # Depending on the semantics of CUSTOM_STALENESS_HOST_BECAME_STALE, the host is
            # expected to be "stale" relative to _now. For a typical "became stale" config,
            # stale_timestamp should be at, or before, _now.
            assert stale_timestamp_after <= _now

            if culled_timestamp_after is not None:
                # Culled timestamp is normally some offset after stale_timestamp; ensure
                # that relationship still holds under the async path.
                assert culled_timestamp_after >= stale_timestamp_after

            # Ensure that async recomputation actually changed persisted staleness values
            assert stale_timestamp_after != stale_timestamp_before or culled_timestamp_after != culled_timestamp_before

            for reporter in hosts_after_update[0].per_reporter_staleness:
>>>>>>> REPLACE
</file_operation>
</file_operations>

<additional_changes>
To fully align these assertions with your custom staleness configuration and keep them in sync with the synchronous tests:

1. Replace the generic inequalities (`>= _now`, `<= _now`, `>= stale_timestamp_before`, etc.) with exact offset checks that mirror your existing sync tests. For example, if your custom config defines:
   - `CUSTOM_STALENESS_HOST_BECAME_STALE["stale_timestamp_offset"]`
   - `CUSTOM_STALENESS_HOST_BECAME_STALE["culled_timestamp_offset"]`
   or similar, compute:
   ```python
   expected_stale_after = _now + CUSTOM_STALENESS_HOST_BECAME_STALE["stale_timestamp_offset"]
   expected_culled_after = _now + CUSTOM_STALENESS_HOST_BECAME_STALE["culled_timestamp_offset"]
   assert stale_timestamp_after == expected_stale_after
   assert culled_timestamp_after == expected_culled_after
   ```
2. If you already have a helper (e.g. `_assert_custom_staleness_offsets(host, now, config)`) used in other tests, prefer calling that helper here instead of duplicating the offset logic. For instance:
   ```python
   _assert_custom_staleness_offsets(host_before_update, _now, CUSTOM_STALENESS_INITIAL_CONFIG)
   ...
   _assert_custom_staleness_offsets(host_after_update, _now, CUSTOM_STALENESS_HOST_BECAME_STALE)
   ```
3. Ensure the `_now` used in these assertions is the same reference time used when computing the expected offsets in the custom config (or in your helper) so that the comparisons remain stable and deterministic.
</issue_to_address>

### Comment 2
<location path="tests/helpers/api_utils.py" line_range="737-746" />
<code_context>
+    return last_check_in.isoformat()


 def create_host_with_reporter(
     db_create_host: Callable[..., Host],
     reporter: str,
</code_context>
<issue_to_address>
**suggestion (testing):** Consider adding a focused test for create_host_with_reporter’s default stale_timestamp behavior

The helper now derives `host.stale_timestamp` from `last_check_in + CONVENTIONAL_TIME_TO_STALE_SECONDS` when `stale_timestamp` is omitted, but there’s no test that directly asserts this behavior. Since other tests depend on this helper for consistent staleness, adding a small test that calls `create_host_with_reporter` without `stale_timestamp` and verifies the expected offset would make the behavior explicit and protect against regressions in the default logic.

Suggested implementation:

```python
from datetime import datetime, timedelta, timezone
from types import SimpleNamespace

from tests.helpers import api_utils


def test_create_host_with_reporter_defaults_stale_timestamp_to_last_check_in_plus_conventional_delay() -> None:
    """create_host_with_reporter should derive stale_timestamp when omitted.

    The helper is expected to set ``stale_timestamp`` to
    ``last_check_in + CONVENTIONAL_TIME_TO_STALE_SECONDS`` when the caller
    does not explicitly provide a ``stale_timestamp``.
    """
    last_check_in = datetime(2024, 1, 1, 12, 0, 0, tzinfo=timezone.utc)
    captured_kwargs: dict = {}

    def fake_db_create_host(**kwargs):
        captured_kwargs.update(kwargs)
        # Simulate a Host-like object that exposes the passed fields as attributes.
        return SimpleNamespace(**kwargs)

    host = api_utils.create_host_with_reporter(
        db_create_host=fake_db_create_host,
        reporter="some-reporter",
        last_check_in=last_check_in,
    )

    expected_stale_timestamp = last_check_in + timedelta(
        seconds=api_utils.CONVENTIONAL_TIME_TO_STALE_SECONDS
    )

    # Assert the helper computed the default stale_timestamp correctly.
    assert captured_kwargs["stale_timestamp"] == expected_stale_timestamp
    # And that the object returned from db_create_host exposes the same value.
    assert host.stale_timestamp == expected_stale_timestamp

```

This test assumes:

1. `tests/helpers/api_utils.py` exposes `create_host_with_reporter` and has `CONVENTIONAL_TIME_TO_STALE_SECONDS` available at module level (either defined there or imported there). If the constant is not accessible as `api_utils.CONVENTIONAL_TIME_TO_STALE_SECONDS`, update the import in the test accordingly (for example, import it directly from its defining module and use that instead).
2. The project’s test discovery is configured with the default pytest conventions (files named `test_*.py`), so placing this test in `tests/test_create_host_with_reporter.py` will make it discoverable.

If `tests/helpers` is not a package (missing `__init__.py`), adjust the import in the test to point to the correct module path where `api_utils` lives (for example, `from tests.helpers.api_utils import create_host_with_reporter` and import the constant from the appropriate module).
</issue_to_address>

Sourcery is free for open source - if you like our reviews please consider sharing them ✨
Help me be more useful! Please click 👍 or 👎 on each comment and I'll use the feedback to improve your reviews.

Comment on lines +80 to +82
v = hosts_before_update[0].per_reporter_staleness[reporter]
assert isinstance(v, str)
assert v == _now.isoformat()
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

suggestion (testing): Custom staleness async tests no longer validate any staleness timestamp behavior beyond PRS shape

These tests now only assert that per_reporter_staleness[reporter] is an ISO string matching _now, whereas they previously validated how last_check_in related to stale_timestamp/culled_timestamp under different configs. Since _update_hosts_staleness_async still depends on staleness configuration, can we add at least one assertion in these async flows that checks host.stale_timestamp (or serialized staleness timestamps) uses the expected offsets for the custom config, both before and after PATCH/DELETE, to retain that regression coverage?

Suggested implementation:

            # Validate async per-reporter staleness shape (ISO string at "now")
            for reporter in hosts_before_update[0].per_reporter_staleness:
                v = hosts_before_update[0].per_reporter_staleness[reporter]
                assert isinstance(v, str)
                assert v == _now.isoformat()

            # Additionally validate host-level staleness timestamps before the async update
            host_before_update = hosts_before_update[0]
            stale_timestamp_before = host_before_update.stale_timestamp
            culled_timestamp_before = getattr(host_before_update, "culled_timestamp", None)

            # The concrete offsets here should match the custom staleness configuration used
            # for this test (see <additional_changes> notes for aligning these expectations).
            assert stale_timestamp_before is not None
            assert stale_timestamp_before >= _now

            if culled_timestamp_before is not None:
                assert culled_timestamp_before >= stale_timestamp_before

            staleness_url = build_staleness_url()
            status, _ = api_patch(staleness_url, CUSTOM_STALENESS_HOST_BECAME_STALE)
            wait_for_all_events(event_producer, num_hosts)

            hosts_after_update = db_get_hosts(host_ids).all()
            host_after_update = hosts_after_update[0]

            # Validate that async staleness recalculation applied the custom offsets
            stale_timestamp_after = host_after_update.stale_timestamp
            culled_timestamp_after = getattr(host_after_update, "culled_timestamp", None)

            assert stale_timestamp_after is not None

            # Depending on the semantics of CUSTOM_STALENESS_HOST_BECAME_STALE, the host is
            # expected to be "stale" relative to _now. For a typical "became stale" config,
            # stale_timestamp should be at, or before, _now.
            assert stale_timestamp_after <= _now

            if culled_timestamp_after is not None:
                # Culled timestamp is normally some offset after stale_timestamp; ensure
                # that relationship still holds under the async path.
                assert culled_timestamp_after >= stale_timestamp_after

            # Ensure that async recomputation actually changed persisted staleness values
            assert stale_timestamp_after != stale_timestamp_before or culled_timestamp_after != culled_timestamp_before

            for reporter in hosts_after_update[0].per_reporter_staleness:

notes for aligning these expectations).
assert stale_timestamp_before is not None
assert stale_timestamp_before >= _now

        if culled_timestamp_before is not None:
            assert culled_timestamp_before >= stale_timestamp_before

        staleness_url = build_staleness_url()
        status, _ = api_patch(staleness_url, CUSTOM_STALENESS_HOST_BECAME_STALE)
        wait_for_all_events(event_producer, num_hosts)

        hosts_after_update = db_get_hosts(host_ids).all()
        host_after_update = hosts_after_update[0]

        # Validate that async staleness recalculation applied the custom offsets
        stale_timestamp_after = host_after_update.stale_timestamp
        culled_timestamp_after = getattr(host_after_update, "culled_timestamp", None)

        assert stale_timestamp_after is not None

        # Depending on the semantics of CUSTOM_STALENESS_HOST_BECAME_STALE, the host is
        # expected to be "stale" relative to _now. For a typical "became stale" config,
        # stale_timestamp should be at, or before, _now.
        assert stale_timestamp_after <= _now

        if culled_timestamp_after is not None:
            # Culled timestamp is normally some offset after stale_timestamp; ensure
            # that relationship still holds under the async path.
            assert culled_timestamp_after >= stale_timestamp_after

        # Ensure that async recomputation actually changed persisted staleness values
        assert stale_timestamp_after != stale_timestamp_before or culled_timestamp_after != culled_timestamp_before

        for reporter in hosts_after_update[0].per_reporter_staleness:

REPLACE
</file_operation>
</file_operations>

<additional_changes>
To fully align these assertions with your custom staleness configuration and keep them in sync with the synchronous tests:

  1. Replace the generic inequalities (>= _now, <= _now, >= stale_timestamp_before, etc.) with exact offset checks that mirror your existing sync tests. For example, if your custom config defines:
    • CUSTOM_STALENESS_HOST_BECAME_STALE["stale_timestamp_offset"]
    • CUSTOM_STALENESS_HOST_BECAME_STALE["culled_timestamp_offset"]
      or similar, compute:
    expected_stale_after = _now + CUSTOM_STALENESS_HOST_BECAME_STALE["stale_timestamp_offset"]
    expected_culled_after = _now + CUSTOM_STALENESS_HOST_BECAME_STALE["culled_timestamp_offset"]
    assert stale_timestamp_after == expected_stale_after
    assert culled_timestamp_after == expected_culled_after
  2. If you already have a helper (e.g. _assert_custom_staleness_offsets(host, now, config)) used in other tests, prefer calling that helper here instead of duplicating the offset logic. For instance:
    _assert_custom_staleness_offsets(host_before_update, _now, CUSTOM_STALENESS_INITIAL_CONFIG)
    ...
    _assert_custom_staleness_offsets(host_after_update, _now, CUSTOM_STALENESS_HOST_BECAME_STALE)
  3. Ensure the _now used in these assertions is the same reference time used when computing the expected offsets in the custom config (or in your helper) so that the comparisons remain stable and deterministic.

Comment on lines 737 to +746
def create_host_with_reporter(
db_create_host: Callable[..., Host],
reporter: str,
last_check_in: datetime,
staleness_config: dict[str, int],
include_timestamps: bool = True,
*,
stale_timestamp: datetime | None = None,
) -> Host:
"""Helper to create a host with a reporter and set last_check_in and stale_timestamp."""
"""Create a host with flat PRS (ISO string per reporter) and a coherent host-level ``stale_timestamp``.

If ``stale_timestamp`` is omitted, it defaults to ``last_check_in`` plus
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

suggestion (testing): Consider adding a focused test for create_host_with_reporter’s default stale_timestamp behavior

The helper now derives host.stale_timestamp from last_check_in + CONVENTIONAL_TIME_TO_STALE_SECONDS when stale_timestamp is omitted, but there’s no test that directly asserts this behavior. Since other tests depend on this helper for consistent staleness, adding a small test that calls create_host_with_reporter without stale_timestamp and verifies the expected offset would make the behavior explicit and protect against regressions in the default logic.

Suggested implementation:

from datetime import datetime, timedelta, timezone
from types import SimpleNamespace

from tests.helpers import api_utils


def test_create_host_with_reporter_defaults_stale_timestamp_to_last_check_in_plus_conventional_delay() -> None:
    """create_host_with_reporter should derive stale_timestamp when omitted.

    The helper is expected to set ``stale_timestamp`` to
    ``last_check_in + CONVENTIONAL_TIME_TO_STALE_SECONDS`` when the caller
    does not explicitly provide a ``stale_timestamp``.
    """
    last_check_in = datetime(2024, 1, 1, 12, 0, 0, tzinfo=timezone.utc)
    captured_kwargs: dict = {}

    def fake_db_create_host(**kwargs):
        captured_kwargs.update(kwargs)
        # Simulate a Host-like object that exposes the passed fields as attributes.
        return SimpleNamespace(**kwargs)

    host = api_utils.create_host_with_reporter(
        db_create_host=fake_db_create_host,
        reporter="some-reporter",
        last_check_in=last_check_in,
    )

    expected_stale_timestamp = last_check_in + timedelta(
        seconds=api_utils.CONVENTIONAL_TIME_TO_STALE_SECONDS
    )

    # Assert the helper computed the default stale_timestamp correctly.
    assert captured_kwargs["stale_timestamp"] == expected_stale_timestamp
    # And that the object returned from db_create_host exposes the same value.
    assert host.stale_timestamp == expected_stale_timestamp

This test assumes:

  1. tests/helpers/api_utils.py exposes create_host_with_reporter and has CONVENTIONAL_TIME_TO_STALE_SECONDS available at module level (either defined there or imported there). If the constant is not accessible as api_utils.CONVENTIONAL_TIME_TO_STALE_SECONDS, update the import in the test accordingly (for example, import it directly from its defining module and use that instead).
  2. The project’s test discovery is configured with the default pytest conventions (files named test_*.py), so placing this test in tests/test_create_host_with_reporter.py will make it discoverable.

If tests/helpers is not a package (missing __init__.py), adjust the import in the test to point to the correct module path where api_utils lives (for example, from tests.helpers.api_utils import create_host_with_reporter and import the constant from the appropriate module).

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

ready for review The PR is ready for review

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant