@swiatekm (Contributor) commented Oct 27, 2025

What does this PR do?

It makes the coordinator responsible for creating and removing the working directories of components. Until now, CommandRuntime and ServiceRuntime did this on their own, whereas the OtelManager didn't do it at all. We move the logic for creating and removing the directories into the component module itself, and call it from the Coordinator. The new logic is as follows.

  • When we generate a new component model, we ensure each component has a working directory.
  • When we get a STOPPED state for a component AND that component is not present in the component model, we delete its working directory.

The condition that the component must not be present in the current model fixes an issue where the working directory would be deleted when the component was being moved between runtimes.

Why is it important?

  1. Currently, the Otel runtime does not remove the working directories at all. The behavior should be the same no matter where the component runs.
  2. If a component is moved from the process runtime to the otel runtime, its working directory is incorrectly deleted. This causes filestream to re-ingest all logs, for example.

Note that I've included an integration test from #10544, which specifically checks point 2 from above.

Checklist

  • I have read and understood the pull request guidelines of this project.
  • My code follows the style guidelines of this project
  • I have commented my code, particularly in hard-to-understand areas
  • [ ] I have made corresponding changes to the documentation
  • [ ] I have made corresponding changes to the default configuration files
  • I have added tests that prove my fix is effective or that my feature works
  • I have added an entry in ./changelog/fragments using the changelog tool
  • I have added an integration test or an E2E test

How to test this PR locally

Related issues

@swiatekm added the bug (Something isn't working), backport-8.19 (Automated backport to the 8.19 branch), and backport-9.2 (Automated backport to the 9.2 branch) labels Oct 27, 2025
@swiatekm force-pushed the feat/coordinator-component-teardown branch from 5ac04a9 to 5230db6 October 28, 2025 11:38
@swiatekm force-pushed the feat/coordinator-component-teardown branch 2 times, most recently from 4794d7e to c472e05 October 28, 2025 16:12
@swiatekm force-pushed the feat/coordinator-component-teardown branch from c7ed40c to f7105b6 October 28, 2025 19:19
@swiatekm marked this pull request as ready for review October 28, 2025 19:20
@swiatekm requested a review from a team as a code owner October 28, 2025 19:20
@swiatekm requested review from blakerouse and pchila October 28, 2025 19:20
@cmacknz (Member) commented Oct 28, 2025

Did a first pass and this looks good. I want to see CI pass before approving and will take a second look with fresher eyes tomorrow (someone else approving also covers this).

There is also the requirement that we don't change the run directory names from what they were before this change. I double-checked manually and confirmed they are the same.

@pierrehilbert added the Team:Elastic-Agent-Control-Plane (Label for the Agent Control Plane team) label Oct 29, 2025
@elasticmachine (Collaborator)

Pinging @elastic/elastic-agent-control-plane (Team:Elastic-Agent-Control-Plane)

@swiatekm requested a review from cmacknz October 29, 2025 14:16
@blakerouse (Contributor) previously approved these changes Oct 29, 2025

This change looks good to me. My only comment is really on the changelog.

@cmacknz previously approved these changes Oct 29, 2025
@swiatekm dismissed stale reviews from cmacknz and blakerouse via 31e0394 October 30, 2025 11:30
@elasticmachine (Collaborator)

💚 Build Succeeded

History

cc @swiatekm

@blakerouse (Contributor) left a comment

Thanks for updating the changelog.

@swiatekm merged commit 41fd24f into main Oct 30, 2025
21 checks passed
@swiatekm deleted the feat/coordinator-component-teardown branch October 30, 2025 14:24
mergify bot pushed a commit that referenced this pull request Oct 30, 2025
* Move component setup and teardown code to the Component struct

* Move teardown to coordinator

* Integration test to check data re-ingest when switching runtimes

* Add integration test for component working dirs

* Check workdir creation time in tests

* Fix linter warnings

* Fix a minor issue in endpoint integration tests

* Add changelog entry

* Completely remove workdir handling from command runtime

* Fix changelog summary

* Use fleet in the integration test

---------

Co-authored-by: Lee E. Hinman <[email protected]>
(cherry picked from commit 41fd24f)
swiatekm added a commit that referenced this pull request Oct 30, 2025
cmacknz pushed a commit that referenced this pull request Oct 30, 2025
…10925)

* Move component setup and teardown code to the Component struct

* Move teardown to coordinator

* Integration test to check data re-ingest when switching runtimes

* Add integration test for component working dirs

* Check workdir creation time in tests

* Fix linter warnings

* Fix a minor issue in endpoint integration tests

* Add changelog entry

* Completely remove workdir handling from command runtime

* Fix changelog summary

* Use fleet in the integration test

---------


(cherry picked from commit 41fd24f)

Co-authored-by: Mikołaj Świątek <[email protected]>
Co-authored-by: Lee E. Hinman <[email protected]>
swiatekm added a commit that referenced this pull request Oct 30, 2025
…to coordinator (#10924)

* Move component working directory management to coordinator (#10857)

* Move component setup and teardown code to the Component struct

* Move teardown to coordinator

* Integration test to check data re-ingest when switching runtimes

* Add integration test for component working dirs

* Check workdir creation time in tests

* Fix linter warnings

* Fix a minor issue in endpoint integration tests

* Add changelog entry

* Completely remove workdir handling from command runtime

* Fix changelog summary

* Use fleet in the integration test

---------

Co-authored-by: Lee E. Hinman <[email protected]>
(cherry picked from commit 41fd24f)

* Fix imports

---------

Co-authored-by: Mikołaj Świątek <[email protected]>
Co-authored-by: Lee E. Hinman <[email protected]>

Labels

  • backport-8.19: Automated backport to the 8.19 branch
  • backport-9.2: Automated backport to the 9.2 branch
  • bug: Something isn't working
  • Team:Elastic-Agent-Control-Plane: Label for the Agent Control Plane team

Development

Successfully merging this pull request may close these issues.

[beats receivers] switch monitoring runtime from process to otel deletes run directories

7 participants