Skip to content

Conversation

HankStat
Copy link

@HankStat HankStat commented Oct 14, 2025

Which issue(s) does the PR fix:

Fixes #57144

Problem

The TestEventHandler test is flaky due to a race condition that causes a "directory not empty" error during cleanup.
When the test completes, the Go testing framework's os.RemoveAll command on the temporary directory races against the App's shutdown process. The test fails if the directory removal attempt occurs before the App has fully closed its file handles.

Solution

Inside the startApp function, I add app.Close() with t.Cleanup().
The Go testing framework guarantees that cleanup functions are executed after a test completes but before temporary directories are removed. This change ensures that the app fully closes and releases all file handles before the directory is deleted, which eliminates the race condition.

Verification

The test now correctly and consistently passes on my local machine after running it 100 times using a loop to simulate the flaky behavior. In addition, after running 100 times, the TestEventHandler test took an average of 1.18 seconds to finish.

cd integrations/event-handler 
go test -v -race . -run ^TestEventHandler$ -count=100

Copy link

github-actions bot commented Oct 14, 2025

All contributors have signed the CLA ✍️ ✅
Posted by the CLA Assistant Lite bot.

@HankStat
Copy link
Author

I have read the CLA Document and I hereby sign the CLA

@HankStat HankStat marked this pull request as ready for review October 14, 2025 03:57
Add a t.Cleanup to close the app. This ensures the
temporary directory is empty before the test framework calls os.RemoveAll
on it.

Signed-off-by: Yong-Han Chen <[email protected]>
@HankStat HankStat force-pushed the flaky-TestEventHandler branch from 9ea168a to f6278a6 Compare October 14, 2025 04:15
Copy link
Contributor

@rosstimothy rosstimothy left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for attempting to help us improve out tests! I'm however still seeing the same failures reported in the linked issue with this branch using the following command to run the test in a limited environment.

$ cd integrations/event-handler
$ go test -c -race

$ docker run --cpus="0.35" -v/teleport:/teleport golang:1.25.3-bookworm /teleport/integrations/event-handler/event-handler.test -test.count=10000 -test.timeout=120m -test.failfast  -test.run "TestEventHandler"

require.NoError(t, err)

t.Cleanup(func() {
t.Log("Close the app...")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we remove this log line to reduce noise.

Suggested change
t.Log("Close the app...")

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Projects

None yet

Development

Successfully merging this pull request may close these issues.

TestEventHandler flakiness

2 participants