Conversation

@manmathbh
Contributor

Description

This PR adds comprehensive unit tests for the hash package, including tests for the Hash128 function and its helper functions.

Changes

  • Added unit tests for Hash128 function
  • Added tests for rotl64 helper function
  • Added tests for fmix64 helper function
  • Added avalanche effect test for fmix64
  • Added test for unaligned data input handling
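For readers without the diff open, a rotl64 test of the kind listed above might look roughly like the following. This is only a sketch: the rotl64 body and its (uint64, uint8) signature are assumptions, not copied from murmur3.go, and the check simply compares the helper against the standard library's bits.RotateLeft64.

```go
package main

import (
	"fmt"
	"math/bits"
)

// rotl64 is an assumed definition of the helper: rotate x left by r bits.
// The real helper in murmur3.go may use a different signature.
func rotl64(x uint64, r uint8) uint64 {
	// In Go, shifting a uint64 by 64 yields 0, so r == 0 returns x unchanged.
	return (x << r) | (x >> (64 - r))
}

// rotl64Matches reports whether rotl64 agrees with bits.RotateLeft64
// for a handful of inputs and every rotation amount from 0 to 63.
func rotl64Matches() bool {
	inputs := []uint64{0, 1, 0x8000000000000000, 0x0123456789abcdef, ^uint64(0)}
	for _, x := range inputs {
		for r := uint8(0); r < 64; r++ {
			if rotl64(x, r) != bits.RotateLeft64(x, int(r)) {
				return false
			}
		}
	}
	return true
}

func main() {
	fmt.Println(rotl64Matches())
}
```

Comparing against the standard library avoids hand-computed expected values, which are easy to get wrong for rotations.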

Related Issues

Fixes #1529

Notes

This PR replaces #1530. The changes have been moved from the main branch to a feature branch following best practices.

Testing

All tests pass locally. The unaligned data test ensures Hash128 handles unaligned input without panicking on architectures that enforce memory alignment.

Copilot AI review requested due to automatic review settings November 26, 2025 18:09
@gemini-code-assist

Summary of Changes

Hello @manmathbh, I'm Gemini Code Assist! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request significantly enhances the reliability and robustness of the hash package by introducing a comprehensive suite of unit tests. These tests validate the core Hash128 function across diverse inputs and edge cases, verify the behavior of its internal helper functions, and specifically address potential issues like unaligned memory access. The addition of benchmarks also provides insight into the function's performance characteristics.

Highlights

  • Hash128 Function Tests: Comprehensive unit tests have been added for the Hash128 function, covering various input scenarios including empty data, single bytes, different seeds, exact and multiple block sizes, and all possible tail lengths.
  • Helper Function Tests: Dedicated unit tests are introduced for the rotl64 and fmix64 helper functions to ensure their correctness and expected behavior.
  • Avalanche Effect Test: An avalanche effect test has been added for fmix64 to verify that a small change in input leads to a significant change in output, indicating good mixing properties.
  • Unaligned Data Handling Test: A specific test case is included to ensure that Hash128 correctly handles unaligned data inputs without panicking, which is crucial for robustness on architectures that enforce memory alignment.
  • Performance Benchmarks: Benchmarks for Hash128 have been added to measure its performance with small, medium, and large data inputs.
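As an illustration of what an avalanche check measures, here is a minimal sketch. It assumes fmix64 is the standard 64-bit finalizer from the reference MurmurHash3 (the constants below come from that reference, not from this repository), and meanFlippedBits is a hypothetical helper name. For a well-mixing function, flipping one input bit should change about half of the 64 output bits on average.

```go
package main

import (
	"fmt"
	"math/bits"
)

// fmix64 is the 64-bit finalizer from the reference MurmurHash3.
// Assumption: the package's helper matches this definition.
func fmix64(k uint64) uint64 {
	k ^= k >> 33
	k *= 0xff51afd7ed558ccd
	k ^= k >> 33
	k *= 0xc4ceb9fe1a85ec53
	k ^= k >> 33
	return k
}

// meanFlippedBits flips each of the 64 input bits in turn and returns the
// average number of output bits that change relative to the unflipped input.
func meanFlippedBits(input uint64) float64 {
	base := fmix64(input)
	total := 0
	for bit := 0; bit < 64; bit++ {
		flipped := fmix64(input ^ (1 << uint(bit)))
		total += bits.OnesCount64(base ^ flipped)
	}
	return float64(total) / 64
}

func main() {
	// A value near 32 suggests good mixing; a test would use a tolerance band.
	fmt.Printf("%.2f\n", meanFlippedBits(0x0123456789abcdef))
}
```

A real test would probably assert a loose band around 32 rather than an exact value, since the count varies from input to input.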

@gemini-code-assist gemini-code-assist bot left a comment

Code Review

This pull request introduces a comprehensive and well-structured test suite for the hash package, significantly improving its test coverage. The tests cover a wide range of scenarios, including various input sizes, determinism, helper functions, and benchmarks. The addition of an avalanche test for fmix64 and a test for unaligned memory access is particularly valuable. However, the unaligned access test reveals a critical portability issue in the existing murmur3.go implementation, which will cause panics on certain hardware architectures. While the fix for this is outside the scope of this PR, the test is crucial for identifying this problem.

Comment on lines 303 to 318
func TestHash128_UnalignedInput(t *testing.T) {
	// Create a buffer and then a sub-slice that is not 8-byte aligned.
	buf := make([]byte, 33)
	for i := range buf {
		buf[i] = byte(i)
	}
	unalignedData := buf[1:] // len=32

	defer func() {
		if r := recover(); r != nil {
			t.Errorf("Hash128 panicked on unaligned data: %v", r)
		}
	}()

	Hash128(unalignedData, 0)
}

critical

This test correctly identifies a critical portability issue in the Hash128 implementation. The use of unsafe.Pointer in murmur3.go to perform unaligned 64-bit reads can cause the program to fault on architectures that enforce memory alignment (such as some ARM and MIPS targets).

While this test may pass on x86 platforms, it will fail on others, making the function unreliable in a cross-platform environment. This is a critical issue.

To fix this, the implementation in murmur3.go should be updated to use encoding/binary for safe unaligned reads. For example:

// In murmur3.go, inside the loop:
k1 := binary.LittleEndian.Uint64(data[i*16 : i*16+8])
k2 := binary.LittleEndian.Uint64(data[i*16+8 : i*16+16])

Although fixing murmur3.go is outside the scope of this PR, this test is essential for exposing the bug. A follow-up PR to fix the implementation is highly recommended.
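To make the suggestion concrete, the following self-contained sketch reads 16-byte blocks from a deliberately offset sub-slice. The readBlock helper is hypothetical and exists only for illustration; binary.LittleEndian.Uint64 decodes byte by byte, so it does not depend on the slice's starting address being 8-byte aligned.

```go
package main

import (
	"encoding/binary"
	"fmt"
)

// readBlock returns the two little-endian 64-bit lanes of the i-th
// 16-byte block of data. Hypothetical helper, not part of murmur3.go.
func readBlock(data []byte, i int) (k1, k2 uint64) {
	off := i * 16
	k1 = binary.LittleEndian.Uint64(data[off : off+8])
	k2 = binary.LittleEndian.Uint64(data[off+8 : off+16])
	return k1, k2
}

func main() {
	// Same setup as TestHash128_UnalignedInput: skip the first byte so the
	// sub-slice starts one byte past the start of the allocation.
	buf := make([]byte, 33)
	for i := range buf {
		buf[i] = byte(i)
	}
	unaligned := buf[1:]

	for i := 0; i < len(unaligned)/16; i++ {
		k1, k2 := readBlock(unaligned, i)
		fmt.Printf("block %d: k1=%#016x k2=%#016x\n", i, k1, k2)
	}
}
```

The explicit little-endian decode also keeps the hash output identical on big-endian hosts, which a raw pointer cast would not.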

Contributor Author

Fixed in the latest commit!
I've updated murmur3.go to use encoding/binary.LittleEndian.Uint64 for safe unaligned reads, as recommended. The implementation is now portable across all architectures, including ARM and MIPS.

Test results:
  • All 8 test suites pass (36 total test cases)
  • TestHash128_UnalignedInput now passes without panics
  • No unsafe.Pointer usage for unaligned memory access
Commit: 737f9f6

Copilot finished reviewing on behalf of manmathbh November 26, 2025 18:11

Copilot AI left a comment

Pull request overview

This PR adds comprehensive unit tests for the MurmurHash3 implementation in the hash package. The tests validate the Hash128 function and its internal helper functions (rotl64 and fmix64), ensuring correct hash computation, deterministic behavior, and proper handling of edge cases including unaligned memory access.

Key Changes

  • Added 318 lines of unit tests covering Hash128 function with various input sizes and seeds
  • Added tests for internal helper functions (rotl64, fmix64) with expected values
  • Added benchmark tests for small, medium, and large data inputs

Comment on lines 111 to 112
h1_1, h2_1 := Hash128(testData, seed)
h1_2, h2_2 := Hash128(testData, seed)

Copilot AI Nov 26, 2025

[nitpick] Naming: Variable names like h1_1, h2_1, h1_2, h2_2 don't follow Go naming conventions which typically avoid underscores. Consider using camelCase names like h1First, h2First, h1Second, h2Second or h1Call1, h2Call1, h1Call2, h2Call2 for better readability.
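
A determinism check using the suggested camelCase names could be sketched as below. Note that hash128 here is a hypothetical stand-in built on FNV-1a so the example compiles on its own; it is not the package's MurmurHash3 implementation, and the uint32 seed type is an assumption.

```go
package main

import (
	"encoding/binary"
	"fmt"
	"hash/fnv"
)

// hash128 is a hypothetical stand-in for Hash128 so the sketch runs on its
// own. It uses FNV-1a, not MurmurHash3, and the seed type is an assumption.
func hash128(data []byte, seed uint32) (uint64, uint64) {
	var seedBytes [4]byte
	binary.LittleEndian.PutUint32(seedBytes[:], seed)

	h := fnv.New128a()
	h.Write(seedBytes[:])
	h.Write(data)
	sum := h.Sum(nil) // 16 bytes for the 128-bit variant
	return binary.BigEndian.Uint64(sum[:8]), binary.BigEndian.Uint64(sum[8:])
}

// isDeterministic hashes the same input twice and compares both lanes.
func isDeterministic(data []byte, seed uint32) bool {
	h1First, h2First := hash128(data, seed)
	h1Second, h2Second := hash128(data, seed)
	return h1First == h1Second && h2First == h2Second
}

func main() {
	fmt.Println(isDeterministic([]byte("test data"), 42))
}
```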

@codecov

codecov bot commented Nov 26, 2025

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 38.86%. Comparing base (3097c3e) to head (4e37da6).
⚠️ Report is 22 commits behind head on main.

Files with missing lines     Coverage Δ
pkg/utils/hash/murmur3.go    100.00% <100.00%> (+100.00%) ⬆️

... and 4 files with indirect coverage changes



- Rename test functions to follow Go conventions for unexported functions
- Add TestHash128_UnalignedInput to verify handling of unaligned data
- Address feedback from gemini-code-assist and Copilot

Signed-off-by: manmathbh <[email protected]>
@hzxuzhonghu
Member

@manmathbh I would suggest first testing the core package and functions

Copilot AI review requested due to automatic review settings December 5, 2025 12:52
Copilot finished reviewing on behalf of manmathbh December 5, 2025 12:56

Copilot AI left a comment

Pull request overview

Copilot reviewed 2 out of 2 changed files in this pull request and generated 3 comments.


Comment on lines 111 to 112
h1_1, h2_1 := Hash128(testData, seed)
h1_2, h2_2 := Hash128(testData, seed)

Copilot AI Dec 5, 2025

[nitpick] Variable names h1_1, h2_1, h1_2, h2_2 use underscores which is not idiomatic Go naming. Consider using camelCase like h1First, h2First, h1Second, h2Second to follow Go naming conventions.

Contributor Author

Fixed in commit 4e37da6!

Comment on lines 125 to 126
h1_1, h2_1 := Hash128(data1, seed)
h1_2, h2_2 := Hash128(data2, seed)

Copilot AI Dec 5, 2025

[nitpick] Variable names h1_1, h2_1, h1_2, h2_2 use underscores which is not idiomatic Go naming. Consider using camelCase like h1Data1, h2Data1, h1Data2, h2Data2 to follow Go naming conventions.

Contributor Author

Fixed in commit 4e37da6!

data := append(baseData, []byte("extra tail bytes")[:tailLen]...)
h1, h2 := Hash128(data, seed)

// Verify it doesn't panic and produces some output

Copilot AI Dec 5, 2025

[nitpick] The assertion if h1 == 0 && h2 == 0 && len(data) > 0 assumes that a hash of non-empty data can never be (0, 0). While extremely unlikely, a hash function could legitimately produce (0, 0) for some inputs. Consider removing this check or adding a note that it's testing for the hash function not being broken (returning zero for all inputs), rather than a strict correctness check.

Suggested change
// Verify it doesn't panic and produces some output
// This check is a sanity check to catch a hash function that is broken
// (e.g., always returning zero). It is not a strict correctness check,
// as a valid hash function could theoretically return (0, 0) for some input.
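
An alternative to the zero comparison, offered only as a sketch, is to check that the hashes for the different tail lengths are not all identical. That catches a constant-output hash without assuming (0, 0) is impossible. The names distinctOutputs, hashFunc and fnvPair are hypothetical, and fnvPair is an FNV-1a stand-in rather than the package's Hash128.

```go
package main

import (
	"fmt"
	"hash/fnv"
)

// hashFunc mirrors the assumed shape of Hash128: data and seed in,
// two 64-bit lanes out. The uint32 seed type is an assumption.
type hashFunc func(data []byte, seed uint32) (uint64, uint64)

// distinctOutputs hashes base plus every prefix of tail and returns how many
// distinct (h1, h2) pairs were produced.
func distinctOutputs(hash hashFunc, base, tail []byte, seed uint32) int {
	seen := make(map[[2]uint64]struct{})
	for tailLen := 0; tailLen <= len(tail); tailLen++ {
		// Copy base first so append never writes into a shared backing array.
		data := append(append([]byte{}, base...), tail[:tailLen]...)
		h1, h2 := hash(data, seed)
		seen[[2]uint64{h1, h2}] = struct{}{}
	}
	return len(seen)
}

// fnvPair is a hypothetical stand-in hash built on FNV-1a (not MurmurHash3).
func fnvPair(data []byte, seed uint32) (uint64, uint64) {
	h := fnv.New64a()
	h.Write([]byte{byte(seed), byte(seed >> 8), byte(seed >> 16), byte(seed >> 24)})
	h.Write(data)
	h1 := h.Sum64()
	h.Write([]byte{0xff}) // extend the stream to derive a second lane
	return h1, h.Sum64()
}

func main() {
	broken := func(_ []byte, _ uint32) (uint64, uint64) { return 0, 0 }
	base, tail := []byte("base data"), []byte("extra tail bytes")
	fmt.Println(distinctOutputs(broken, base, tail, 0))  // a constant hash yields one pair
	fmt.Println(distinctOutputs(fnvPair, base, tail, 0)) // a working hash yields many
}
```

A test built this way would fail when the count is 1, which is the "always returns the same value" failure the original assertion was trying to catch.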

Contributor Author

Fixed in commit 4e37da6!

Fixes the critical portability bug identified by gemini-code-assist.
The previous unsafe.Pointer cast to [2]uint64 could panic on
architectures with strict memory alignment requirements (ARM, MIPS).

Using encoding/binary.LittleEndian.Uint64 provides safe, portable
unaligned reads while preserving the algorithm's behavior.

Removed duplicate murmur3_test.go test file to resolve test name
conflicts. All tests now consolidated in hash_test.go.

All 8 test suites pass including the critical TestHash128_UnalignedInput.

Signed-off-by: manmathbh <[email protected]>
@manmathbh
Contributor Author

Fixed the critical portability bug and addressed all code review feedback:

Bug Fix:
  • Replaced unsafe unaligned memory access in murmur3.go with encoding/binary.LittleEndian.Uint64
  • Now safe on ARM, MIPS, and other architectures with strict alignment requirements

Code Review Feedback Addressed:
  • Updated variable naming to follow Go conventions (camelCase)
  • Improved test comments for clarity
  • Added DCO sign-off

Test Results:
  • All 8 test suites pass (36 total test cases)
  • TestHash128_UnalignedInput confirms the fix works correctly

Ready for review!

@hzxuzhonghu
Member

/lgtm
/approve

@kmesh-bot
Collaborator

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: hzxuzhonghu


@hzxuzhonghu hzxuzhonghu merged commit 352bd8d into kmesh-net:main Dec 8, 2025
14 of 15 checks passed
Development

Successfully merging this pull request may close these issues.

Add unit tests for hash package

3 participants