Skip to content

"Lin Bytes test with Thread" may fail because it is Thread unsafe #544

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
jmid opened this issue Mar 26, 2025 · 0 comments · Fixed by #546
Closed

"Lin Bytes test with Thread" may fail because it is Thread unsafe #544

jmid opened this issue Mar 26, 2025 · 0 comments · Fixed by #546
Labels
test suite reliability Issue concerns tests that should behave more predictably

Comments

@jmid
Copy link
Collaborator

jmid commented Mar 26, 2025

Up until #540 that improved the error finding ability of the Thread modes, we have considered "Lin Bytes test with Thread"
a positive test run with a low count of 250.

Now, the tooling has improved and the test may fail. It just did so on merging #542 to main running MSVC bytecode with trunk:
https://github.com/ocaml-multicore/multicoretests/actions/runs/14088717379/job/39459605635

random seed: 362773306
generated error fail pass / total     time test name

[ ]    0    0    0    0 / 5000     0.0s Lin Bytes test with Domain
[ ]    0    0    0    0 / 5000     0.0s Lin Bytes test with Domain (generating)
[ ]  509    0    0  509 / 5000   110.1s Lin Bytes test with Domain
[ ] 1112    0    0 1112 / 5000   170.6s Lin Bytes test with Domain
[✓] 1244    0    1 1243 / 5000   188.3s Lin Bytes test with Domain

[ ]    0    0    0    0 /  250     0.0s Lin Bytes test with Thread
[✗]   41    0    1   40 /  250    28.6s Lin Bytes test with Thread

[ ]    0    0    0    0 / 1000     0.0s Lin Bytes stress test with Domain
[ ]  931    0    0  931 / 1000    13.7s Lin Bytes stress test with Domain
[✓] 1000    0    0 1000 / 1000    14.7s Lin Bytes stress test with Domain

--- Info -----------------------------------------------------------------------

Negative test Lin Bytes test with Domain failed as expected (40 shrink steps):

[...]

--- Failure --------------------------------------------------------------------

Test Lin Bytes test with Thread failed (61 shrink steps):

                                  |           
                       Bytes.fill t 2 1 '\128'
                                  |           
                     .------------------------.
                     |                        |           
           Bytes.set t 0 '\128'    Bytes.is_valid_utf_8 t 
           Bytes.fill t 1 3 'U'                           


+++ Messages ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++

Messages for test Lin Bytes test with Thread:

  Results incompatible with sequential execution

                                        |                  
                        Bytes.fill t 2 1 '\128' : Ok (())  
                                        |                  
                     .------------------------------------.
                     |                                    |                  
      Bytes.set t 0 '\128' : Ok (())        Bytes.is_valid_utf_8 t : true    
      Bytes.fill t 1 3 'U' : Ok (())                                         

================================================================================
failure (1 tests failed, 0 tests errored, ran 3 tests)
File "src/bytes/dune", line 12, characters 7-16:
12 |  (name lin_tests)
            ^^^^^^^^^
(cd _build/default/src/bytes && .\lin_tests.exe --verbose)
Command exited with code 1.

I believe the problem here is that for Bytes.is_valid_utf_8 t : true a '\128' byte can only appear in valid utf-8
after a 110xxxxx 1110xxxx or 11110xxx prefix https://en.wikipedia.org/wiki/UTF-8#Description
This is not the case above, so I guess a Thread context switching situation is happening where

  • Bytes.is_valid_utf_8 t is started in the second leg, that checks index 0 before Bytes.set t 0 '\128'
  • Then context switches to the first leg, performing both the set and Bytes.fill t 1 3 'U' : Ok (())
    effectively making this part of the string valid utf-8 and
  • finally it context switches back to the second leg to confirm this in Bytes.is_valid_utf_8 t

... without the entire string never being valid utf-8 under any interleaving 😮

Locally, by bumping up the count and running under a 5.3.0+bytecode switch I've been able to confirm this,
albeit it triggers rarely (I also commented out the other two Bytes test):

$ dune exec src/bytes/lin_tests.exe -- -v
random seed: 180290958
generated error fail pass / total     time test name
[✗]  955    0    1  954 / 5000     9.1s Lin Bytes test with Thread

--- Failure --------------------------------------------------------------------

Test Lin Bytes test with Thread failed (30 shrink steps):

                                  |           
                        Bytes.set t 2 '\128'  
                                  |           
                     .------------------------.
                     |                        |           
          Bytes.is_valid_utf_8 t    Bytes.set t 0 '\128'  
                                   Bytes.fill t 1 3 '\025'


+++ Messages ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++

Messages for test Lin Bytes test with Thread:

  Results incompatible with sequential execution

                                        |                  
                         Bytes.set t 2 '\128' : Ok (())    
                                        |                  
                     .------------------------------------.
                     |                                    |                  
       Bytes.is_valid_utf_8 t : true       Bytes.set t 0 '\128' : Ok (())    
                                          Bytes.fill t 1 3 '\025' : Ok (())  

================================================================================
failure (1 tests failed, 0 tests errored, ran 1 tests)

It is impressive that we are able to find such a corner-case for the Bytes.is_valid_utf_8 predicate given that

  • (a) we target 27 Bytes bindings
  • (b) it involves a specific byte '\128' and
  • (c) calls to both is_valid_utf_8 and the right overwriting index in Bytes.fill

to observe a Thread-mode difference... 🙂 It also explains why this triggers somewhat rarely... 🤔

@jmid jmid added the test suite reliability Issue concerns tests that should behave more predictably label Mar 26, 2025
@jmid jmid closed this as completed in #546 Mar 28, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
test suite reliability Issue concerns tests that should behave more predictably
Projects
None yet
Development

Successfully merging a pull request may close this issue.

1 participant