Skip to content

Conversation

@lucliu1108
Copy link
Contributor

@lucliu1108 lucliu1108 commented Oct 24, 2025

What

Ticket: https://issues.apache.org/jira/browse/KAFKA-19807
Follow up on #20757, add tests that:

  1. Test member with topology that triggers internal topic auto creation
  2. Test member heartbeat under dynamic configuration
    (streams.num.standby.replicas) change
  3. Test membership expiring and rejoining
  4. Test member heartbeat before and after group coordinator's restart

Reviewers: @lucasbru

@github-actions github-actions bot added triage PRs from the community core Kafka Broker tests Test fixes (including flaky tests) labels Oct 24, 2025
@lucasbru lucasbru requested a review from Copilot October 24, 2025 07:54
@lucasbru lucasbru self-assigned this Oct 24, 2025
@lucasbru lucasbru self-requested a review October 24, 2025 07:54
Copy link
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull Request Overview

This PR adds comprehensive integration tests for the StreamsGroupHeartbeat RPC, validating group coordinator behavior across various scenarios including internal topic creation, dynamic configuration changes, member expiration/rejoining, and coordinator restarts.

Key changes:

  • Added four new integration tests for different StreamsGroupHeartbeat scenarios
  • Implemented helper methods for topology creation and task ID conversion
  • Configured test-specific cluster properties for heartbeat and session timeouts

Tip: Customize your code reviews with copilot-instructions.md. Create the file or learn how to get started.

Copy link
Member

@lucasbru lucasbru left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@lucliu1108 You need to add the right header to the file

@lucasbru
Copy link
Member

@lucliu1108 You need to add the right header to the file (meant to comment in this PR)

@github-actions github-actions bot removed the triage PRs from the community label Oct 25, 2025
Copy link
Member

@lucasbru lucasbru left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

mostly looking good to me, thanks!

Just minor comments

}, "Second member rebalance heartbeat did not succeed within the timeout period.")

// Verify initial state with no standby tasks
assertEquals(0, streamsGroupHeartbeatResponse1.data.standbyTasks().size(), "Member 1 should have no standby tasks initially")
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

indentation seems off

TestUtils.waitUntilTrue(() => {
val expiredResponse = connectAndReceive[StreamsGroupHeartbeatResponse](expiredHeartbeatRequest)
expiredResponse.data.errorCode == Errors.UNKNOWN_MEMBER_ID.code() &&
expiredResponse.data.memberEpoch() == 0
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks like there is a && missing here. Part of the boolean condition is just dropped.


// Send heartbeat with topology containing internal topics
var streamsGroupHeartbeatResponse: StreamsGroupHeartbeatResponse = null
TestUtils.waitUntilTrue(() => {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In the base class, there is a helper method streamsGroupHeartbeat, I think you may have defined it yourself. Couldn't we reuse it here?

}, "Second member heartbeat after config change did not succeed within the timeout period.")

// Verify that at least one member has active tasks
val member1HasActive = streamsGroupHeartbeatResponse1.data.activeTasks().size() > 0
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: member1HasActiveTasks

val member1HasStandby = streamsGroupHeartbeatResponse1.data.standbyTasks().size() > 0
val member2HasStandby = streamsGroupHeartbeatResponse2.data.standbyTasks().size() > 0
assertTrue(member1HasStandby || member2HasStandby, "At least one member should have standby tasks after config change")

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could add this to be more precise:

// With 2 members and streams.num.standby.replicas=1, each active task should have 1 standby
val totalActiveTasks = member1HasActive.size + member2HasActive.size
val totalStandbyTasks = member1HasStandby.size + member2HasStandby.size
assertEquals(totalActiveTasks, totalStandbyTasks, "Each active task should have one standby")

@lucliu1108 lucliu1108 requested a review from lucasbru October 26, 2025 21:32
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

core Kafka Broker tests Test fixes (including flaky tests)

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants