-
Notifications
You must be signed in to change notification settings - Fork 16
Improve context switching chance in Lin_thread
and STM_thread
modes
#540
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
FTR, the PR initially used the magic constant of I have experimented a bit more by timing The initial
Now, with
The above seems like a good compromise between not affecting time usage significantly, while giving the error rate a significant bump upwards 🙂 Both of the above were run with OCaml 5.3.0. For comparison, I've timed the same run on OCaml 5.2.0 (where
Going from ~9m to ~5m wall clock time shows the I also timed the run on OCaml 5.2.0 with rep_count:3 to assess the overhead of
so the Finally, for completeness, below follows the repeated stats from the initial PR description. Revised STM numbers with
|
CI summary for ca10290: All 36 workflows passed. CI summary for 292a3e0:
Out of 36 workflows, 1 failed with a genuine (supposedly Cygwin) error |
292a3e0
to
0ef30bb
Compare
CI summary for 0ef30bb: all 36 workflows passed. |
CI summary for merge to |
This fixes #338 - or at least gives it a good kick in the right direction.
Both
Lin_thread
andSTM_thread
are currently marked as experimental - and for good reason:The chance of them triggering unfortunate, error-triggering
Thread
context switches is currently slim.In hindsight, releasing them earlier - even as
experimental
- was probably premature.Recall that a
Thread
context switch may happenIIUC:
CHECK_SIGNALS
instruction inserted inwhile
andfor
loops.This PR improves on the situation by utilizing a terrific hack by @edwintorok in the form of a
Gc.Memprof
callback:Upon allocations, with a certain probability,
Memprof
will trigger a callback which can be utilized to trigger an explicitThread.yield
.To get an idea how well this works, I've run stats on our usual {int,int64} x {ref,CList} negative tests, repeating each one 10.000 times, and recording how many of these failed the "interface is
Thread
-safe" property. Since theThread
failures are still relatively rare, each of the 10.000 test inputs isUtil.repeat
ed 25 times, to bring up the probability of a failure.For
CList
, theDomain
-unsafeadd_node
does not do any allocations in the central "unsafe window", effectively making itThread
safe. I've therefore introduced an even worseadd_node_thread
by moving an allocation to this part,for the purpose of testing...
Each test is run on both 5.3.0, 5.2.0, and 4.14.2 - with only 5.2.0 not having
Gc.Memprof
support: It is available in 4.14 and was restored in 5.3. By running on 5.2 theMemprof.{start,stop}
calls thereby degrade to essentially noops, and enable an easy comparison with/without the improvement.STM numbers with
Util.repeat 25
:Lin numbers with
Util.repeat 25
:From the above it is clear that 5.3 and 4.14 works remarkably better than the current version.
Even in bytecode mode this represents a good step forward. However, for some reason, in bytecode mode
both
Lin_thread
andSTM_thread
fail to trigger anint64 ref
issue. I've not yet figured out why this is the case.Overview of commits
Lin
andSTM
modesexperimental
alert attributesMemprof.{start,stop}
so the 5th commit wraps them in a handlersrc/neg_tests
tests to adjust for the improved behaviour