### Current behavior 😯

#### Background
Fast pseudorandom numbers are used in a few places where they probably need not be cryptographically secure, but where the generated bytes are intended to be different across any kind of concurrent tasks, including across separate processes. Existing uses in gitoxide--but not necessarily all of their callers--are as follows:

1. `gix_utils::backoff::Quadratic::default_with_random` supplies a series of increasing timings that are adjusted with pseudorandom jitter, so concurrent tasks don't all retry at the same time. Its `randomize` helper computes (`gix-utils/src/backoff.rs`, line 4 at ffb73b5):

   ```rust
   let new_value = (fastrand::usize(750..=1250) * backoff_ms) / 1000;
   ```

   `default_with_random` is public, so there may be non-gitoxide crates that use it directly, probably in the same kind of scenarios it is used here. Here, it is used to avoid some cases of the thundering herd problem when accessing a network, at least in `gix-testtools`, as well as in `gix-lock`, which runs in production.

   (Per gitpython-developers/gitdb#115 (comment) and #1815, this was recently renamed from `Exponential` to `Quadratic`. But that was just a name change, with no effect on this issue.)

2. `gix_fs::capabilities::Capabilities::probe` includes a check for the ability to create symlinks in the `git_dir`, done by its `probe_symlink` helper with a filename produced by (`gix-fs/src/capabilities.rs`, lines 106 to 107 at ffb73b5):

   ```rust
   let link_path = root.join(format!("__file_link{rand}"));
   ```

   As discussed in #1789, a collision in the filename, even across processes, causes the probe to wrongly report that symlinks are not supported.

3. In addition to its use of backoff with pseudorandom jitter mentioned above, `gix_testtools::spawn_git_daemon` also shuffles the order of ports to try listening on (`tests/tools/src/lib.rs`, lines 206 to 208 at ffb73b5):

   ```rust
   let addr_at = |port| std::net::SocketAddr::from(([127, 0, 0, 1], port));
   ```

   Unlike (1) and (2), I think it is not really a problem here if the same orders are sometimes used. But performance is slightly better if that is avoided.
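To make the jitter arithmetic in case (1) concrete, here is a minimal sketch with the pseudorandom factor passed in as a parameter so the scaling is visible. The names are illustrative, not gitoxide's API; in the real helper the factor comes from `fastrand::usize(750..=1250)`.

```rust
// Scale a nominal delay to a pseudorandom 75%..=125% of its value, so
// concurrent retries spread out instead of firing in lockstep.
// `permille` stands in for the 750..=1250 value drawn from the PRNG.
fn randomize(backoff_ms: usize, permille: usize) -> usize {
    (permille * backoff_ms) / 1000
}
```

If two processes draw the same `permille` sequence, their delays line up exactly, which is the thundering herd risk this issue is about.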
#### The bug
As currently implemented, we occasionally generate the same sequence of pseudorandom numbers across concurrently running processes. This was observed in #1789 to affect (2) when running parallel tests, but in addition to affecting the symlink capability probe, it should be expected occasionally to affect cases (1) and (3). Case (1) is overall the most important.
See "The cause" subsection below on why this happens, and the "Steps to reproduce" section or #1789 for details of the recently observed failure due to (2).
Although I was only able to cite one occurrence of (2), I vaguely recalled having seen similar errors. Since the test case that failed only started asserting that the probe returned `true` a couple of months ago, occasional failures may have been different or absent in the past. For these reasons, I estimate that (2) has occurred on CI about once a month for an extended time, but this estimate is of extremely low confidence.
In contrast to (2), I would not expect our existing automated tests ever to surface (1) or (3):
- While case (1) is the most important, it is probably both less severe and far less common than the previous known thundering herd condition (praetorian-inc/noseyparker#179, addf446, #1412). That bug was not reproduced by the project's automated tests (though I believe a manual approach with `hyperfine` was able to reproduce it). I would expect this issue to be encountered less often than that.
- Case (3) only occurs in tests, though it could occur in other projects' test suites if they use `gix-testtools` (or in non-test code in the rare case that another project uses `gix-testtools` for something other than tests). But it is hard to observe because it doesn't really cause a problem: if and when the collision occurs, the effect is at most a small slowdown.
In contrast, while (2) is not limited to tests, it probably most often occurs in them. It is also more observable when it does occur, because a single collision causes a test failure, rather than requiring multiple collisions to cause a serious problem as in (1), or never causing a serious problem as in (3).
#### The cause
This happens because we use `fastrand` with the per-thread generators it provides, which use seeds produced on most platforms by hashing together monotonic clock time and thread ID, but nothing that inherently varies across concurrently running processes.

Thread IDs, as returned by `std::thread::current().id()`, are often small numbers like 1. They are frequently the same between threads of separate, concurrently running processes.
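To illustrate why colliding seeds matter, here is a minimal sketch of the Wyrand step, the algorithm behind `fastrand`, written from the published wyrand construction (this is not `fastrand`'s exact source). Two generators that start from the same seed emit exactly the same sequence, so two processes whose clock-plus-thread-ID hashes collide will draw identical "random" numbers.

```rust
// One Wyrand step: bump the state by a fixed odd constant, then mix it
// with a 64x64 -> 128-bit multiply and fold the halves together.
fn wyrand(state: &mut u64) -> u64 {
    *state = state.wrapping_add(0xa076_1d64_78bd_642f);
    let t = (*state as u128) * ((*state ^ 0xe703_7ed1_a0b4_28db) as u128);
    ((t >> 64) ^ t) as u64
}
```

The whole output stream is a pure function of the 8-byte seed, which is why this issue focuses entirely on how that seed is chosen.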
### Expected behavior 🤔
To avoid even rare cases of the thundering herd problem when used to generate jitter for backoff--as well as to avoid other glitches such as collisions when probing for the ability to create symlinks--even pseudorandom numbers that do not need to be cryptographically secure should be generated in a way that does not depend on state that is likely to be shared across concurrently running pseudorandom generations.
#### Possible solutions
That leaves open a wide range of possible solutions. Assuming everything that needs cryptographically secure random numbers is already implemented in a crate that uses CSPRNGs rather than fastrand--which seems to be the case--I think that:
- An extremely fast, cryptographically insecure algorithm may still be the way to go here. The `fastrand` crate could still be used, unless something else turns out to be more convenient.
- However, a higher-entropy seed should be used.
I don't know if we could continue using the per-thread PRNGs `fastrand` arranges, though. A `fastrand::Rng` can be reseeded, including an `Rng` that `fastrand` provides for a thread. But this would have to be done on each thread whose generator is used. Furthermore, since there could be many threads, and higher-entropy sources are probably always at least a little slower than the above approach, each thread's PRNG should not be seeded until it is first used.
At first glance, it feels like it would be enough to do even slightly better with the seed. However, I think this actually requires as unpredictable a seed--or, rather, 8 bytes of seed material per thread, to continue using shared thread-local PRNGs--as can practically be obtained without heavy delays.
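A lazily seeded thread-local generator along those lines might be sketched as follows. To stay self-contained this uses std's `RandomState` as a stand-in entropy source (it is seeded from OS randomness); a real implementation would likely use the `getrandom` crate instead, and all names here are hypothetical.

```rust
use std::cell::Cell;
use std::collections::hash_map::RandomState;
use std::hash::{BuildHasher, Hasher};

thread_local! {
    // No entropy is fetched until this thread first asks for a seed.
    static SEED: Cell<Option<u64>> = Cell::new(None);
}

// Return this thread's seed, fetching it from the entropy source only on
// first use, so threads that never generate random numbers pay nothing.
fn per_thread_seed() -> u64 {
    SEED.with(|cell| {
        cell.get().unwrap_or_else(|| {
            // RandomState draws fresh OS-derived randomness; finishing an
            // empty hash yields a value determined by its random keys.
            let seed = RandomState::new().build_hasher().finish();
            cell.set(Some(seed));
            seed
        })
    })
}
```

The cached `Option` doubles as the "was this thread seeded yet?" flag that the comments below also arrive at.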
##### Why not just mix in the PID?
Suppose we do something like fastrand does for its per-thread PRNGs, but also mix in the current process ID. That seems like the simple and obvious thing to do, given that the observed problem is that processes are not distinguished. PIDs are reused, but not concurrently.
I think this is insufficient for two reasons:
1. Some targets (WASM?) may not have access to PIDs, so this is not a complete solution.
2. Concurrent processes can run in different PID namespaces. If different instances are run in separate containers, even within the same system, they could occasionally have not only the same monotonic clock times and thread IDs, as has been observed, but also the same process IDs.
The second reason is more important because it suggests that seeding with monotonic clock time, thread ID, and PID would not be desirable even if a backup strategy were used for cases where the PID is unavailable.
Another way to describe the second reason is that just mixing in the PID would turn a collision that happens between separate processes into a collision that (still) happens across separate containers. This might, in practice, fix (2), if the containers don't write to the same volume. But it would not be a complete fix for (1).
However, this would be a significant mitigation even in (1). So if all other approaches turn out to involve dependencies that need to be avoided in some cases, then a new default feature could be added for the primary solution, and mixing in the PID might then be considered as a partial fallback solution when that feature is disabled.
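For concreteness, the mix-in-the-PID idea could be sketched like this, roughly in the style of `fastrand`'s default seeding but with the process ID added (this is illustrative std-only code, not `fastrand`'s actual implementation):

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};
use std::time::Instant;

// Hash monotonic clock time, thread ID, and process ID together into a
// seed. As discussed above, this still collides across PID namespaces.
fn seed_with_pid() -> u64 {
    let mut hasher = DefaultHasher::new();
    Instant::now().hash(&mut hasher);
    std::thread::current().id().hash(&mut hasher);
    std::process::id().hash(&mut hasher);
    hasher.finish()
}
```

Note that every input here can coincide between containers in separate PID namespaces, which is exactly the second objection.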
##### We can use `getrandom` for the seed
The `getrandom` library obtains random data from the operating system in a way that is intended to be as robust and unpredictable as possible, typically using the best source of randomness that is available in each environment.
There are a number of tradeoffs between robustness and speed, and that library navigates them. For example, on some Unix-like systems, it does a blocking read on `/dev/random` to avoid returning data that are insufficiently random because the system entropy pool is not yet adequately seeded, but then uses `/dev/urandom` to avoid further blocking that could be excessive.
For some platforms, `getrandom` requires explicit build configuration to work, but that is rare: I think it only applies to WASM targets with `unknown-unknown`, and only under conditions where `fastrand` is already using a fixed seed.
To be clear, I am not advocating generating our random numbers with `getrandom`, but only the seed. Unless a reason is found for the uses listed above to use cryptographically secure random numbers, the algorithm being seeded could continue to be a cryptographically insecure but extremely fast one, and `fastrand` could continue to be used. If its interface is less convenient when the seeds are computed separately, then some other libraries could be considered (there are other Rust libraries that use Wyrand, the algorithm `fastrand` uses).
##### Can we directly read `/dev/urandom` instead on Unix-like systems?
Since we don't need cryptographically secure random numbers, it seems like the `getrandom` crate dependency could be avoided on Unix-like systems by reading seed bytes directly from `/dev/urandom` instead, and that this would on rare occasion even be slightly faster. It seems like that would be okay even if the system entropy pool is not, and was never, adequately seeded.
My guess is that this would be okay, but I am not sure. Besides (or sometimes together with) containers, one way someone might run the same or related programs many times concurrently is when deploying a task to many virtual machines. Normally that would be fine, but the task may be deployed early after booting the virtual machines, which might be in use only for that task. This is relevant because, as the `getrandom` readme says:
> Sometimes, early in the boot process, the OS has not collected enough entropy to securely seed its RNG. This is especially common on virtual machines, where standard "random" events are hard to come by.
So I worry that this approach would fix the problem across processes and containers but not across virtual machines, and that this could actually, in rare cases, cause problems in practice. Still, I think this is not a problem for our use case, because:
- Even if the OS cannot securely seed its pseudorandom number generator, reading may still give different data, especially if the separate reads are on the same machine, but even, I think, on different machines.
- The code in gitoxide that uses random numbers seems unlikely to be run "early in the boot process." At minimum, for a thundering herd problem involving a network resource, it would have to run after network interfaces are operational.
But I am not sure. This is assuming we have non-adversarial unpredictability even when we don't have adversarial unpredictability. That assumption may be wrong. Since implementing more strategies is more complicated than implementing fewer strategies, and it will be either hard or infeasible to test in the test suite, it should probably only be done if we are confident it will work properly. I am not confident.
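For reference, the direct read this subsection considers (and ultimately hesitates about) would be only a few lines on Unix-like systems; the function name is illustrative, and a portable implementation would still need a fallback for targets without `/dev/urandom`:

```rust
use std::fs::File;
use std::io::Read;

// Pull 8 seed bytes straight from /dev/urandom, with no getrandom crate
// dependency. Errors are returned so the caller can choose a fallback.
fn urandom_seed() -> std::io::Result<u64> {
    let mut buf = [0u8; 8];
    File::open("/dev/urandom")?.read_exact(&mut buf)?;
    Ok(u64::from_ne_bytes(buf))
}
```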
##### Why we shouldn't just mix in some weird thing that happens to vary
We can't rely on things that seem to change, but are not required to change, being different, and the available oddities will vary by target. Even attempting this would be complicated and probably involve more code than it would take to reimplement substantial fragments of libraries.
(Secondarily, I would worry about using anything that is not routinely exposed already and that might be in use to seed ASLR, or to reveal its seed data. This includes the numerical addresses of objects, as the running process sees them, but it may include other information that varies across separate processes in a system-dependent way.)
##### Is it a bug in `fastrand` that its seeds clash across processes?
Edit: Clarified the title of this subsection.
I don't know if this would be considered a bug in how `fastrand` seeds its per-thread PRNGs by default. My guess is that it may not be, because while that functionality does depend on the `std` feature, it is intentionally pretty minimalistic, whereas even mixing in the process ID seems like it would be more extensive and not always available. I'm not sure, though.
### Git behavior
#### The `csprng_bytes` and `git_rand` functions
Outside of operations performed with CSPRNGs in cryptographic libraries, and outside of the test suite, Git uses pseudorandom numbers in a few ways. It obtains them, primarily from cryptographic libraries, through its `csprng_bytes` function and its associated `git_rand` convenience function.
`git_rand` makes a 4-byte integer from data from `csprng_bytes`. Both functions are declared in `wrapper.h` and defined in `wrapper.c`. In both functions, the parameter is not a seed, but an option controlling how robust the output is required to be.
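The `git_rand` pattern of building an integer out of CSPRNG-produced bytes can be sketched in Rust as follows; this is a hypothetical stand-in with the byte source injected, not Git's actual C code:

```rust
// Fill a 4-byte buffer from whatever byte source is supplied (in Git,
// csprng_bytes), then reinterpret it as a native-endian integer.
fn rand_u32(fill: impl FnOnce(&mut [u8])) -> u32 {
    let mut buf = [0u8; 4];
    fill(&mut buf);
    u32::from_ne_bytes(buf)
}
```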
When Git probes for the ability to create symlinks, it does not use those facilities.
#### Quality of pseudorandom numbers
As its name implies, `csprng_bytes` is primarily a cryptographically secure pseudorandom number generator. It uses several strategies. For at least one of those strategies, it is possible to fall back to less secure output when secure output is not available.

Whether this is done depends on whether the `CSPRNG_BYTES_INSECURE` flag is passed. Currently it is the only recognized flag, so calls to `csprng_bytes`--or `git_rand`, which passes the flag along--pass either `0` or `CSPRNG_BYTES_INSECURE`.

Even when `CSPRNG_BYTES_INSECURE` is passed, such as when choosing delays in `reftable/stack.c`, I believe the pseudorandom data are well distributed and unlikely to cause a thundering herd condition or other clash-related problems.
Separately from that flag, if `git` is compiled without any of the cryptographic libraries it prefers to use for generating pseudorandom numbers, it will attempt to read from `/dev/urandom`. If this were somehow to happen during early boot, then the data could be predictable on some systems, but I think they would not be the same across runs, as discussed above in "Expected behavior."
The `RtlGenRandom` case (`HAVE_RTLGENRANDOM` in `csprng_bytes`) applies to Windows. To the best of my knowledge, it is always available to `git` on Windows, but I have not investigated this.
#### The symlink probe
This is for comparison to case (2) presented above in the "Current behavior" section.
Git probes to see if it is able to create a symbolic link like this, creating a regular file in a collision-resistant way and then deleting it and attempting to make a symlink of the same name:
```c
/* Check if symlink is supported in the work tree */
path = git_path_buf(&buf, "tXXXXXX");
if (!close(xmkstemp(path)) &&
    !unlink(path) &&
    !symlink("testing", path) &&
    !lstat(path, &st1) &&
    S_ISLNK(st1.st_mode))
	unlink(path); /* good */
else
	git_config_set("core.symlinks", "false");
```
That uses the `xmkstemp` function, defined in `wrapper.c`. `xmkstemp` delegates to `mkstemp` (which is usually part of libc). `xmkstemp` checks for failure and bails out with a useful error message, but it does not customize how the temporary file is created, nor how its name is chosen.

An `mkstemp` implementation usually uses a PRNG, though sometimes not a very high-quality one. However, it also performs an operation that is only expected to fail due to a collision, and it retries a number of times with different names when a collision occurs.
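That retry-on-collision pattern can be sketched in Rust as follows. This is a stand-in for the loop inside an `mkstemp` implementation, not Git's code; `next_name` is a hypothetical pseudorandom name source, and the retry bound is arbitrary:

```rust
use std::fs::OpenOptions;
use std::io::ErrorKind;
use std::path::{Path, PathBuf};

// Exclusive creation (O_EXCL semantics via create_new) fails only if the
// name already exists, so a colliding PRNG costs a retry, not a wrong
// result -- unlike gitoxide's symlink probe in case (2).
fn create_unique(dir: &Path, mut next_name: impl FnMut() -> String) -> std::io::Result<PathBuf> {
    for _ in 0..100 {
        let path = dir.join(next_name());
        match OpenOptions::new().write(true).create_new(true).open(&path) {
            Ok(_) => return Ok(path),
            Err(e) if e.kind() == ErrorKind::AlreadyExists => continue, // collision: retry
            Err(e) => return Err(e),
        }
    }
    Err(ErrorKind::AlreadyExists.into())
}
```

This is why a weak PRNG is tolerable in `mkstemp` but not in the symlink probe: the creation step itself detects collisions.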
Some other `mkstemp`-related functions that do not delegate to the system `mkstemp` function are defined in `wrapper.c`, in both the upstream Git repository and in Git for Windows. In particular, `git_mkstemp_mode` and `git_mkstemps_mode` accept a mode with which to create the temporary file. These functions are implemented fully in `wrapper.c`. They use `csprng_bytes` to fill in the `X` placeholders in their `pattern` arguments.

I believe it is only on Windows that the `mkstemp` implementation is part of Git itself, where it is defined in `mingw.c`. It delegates to `git_mkstemp_mode`, passing a mode of `0600`.
### Steps to reproduce 🕹
This is hard to reproduce intentionally, and to my knowledge only (2) has been observed so far. This is as reported in #1789. In that question, I had thought this was a Windows-specific problem, but I no longer expect that to be so. For completeness, the CI output shown there is:
```
FAIL [ 0.018s] gix-worktree-state-tests::worktree state::checkout::overwriting_files_and_lone_directories_works

──── STDOUT: gix-worktree-state-tests::worktree state::checkout::overwriting_files_and_lone_directories_works

running 1 test
test state::checkout::overwriting_files_and_lone_directories_works ... FAILED

failures:

failures:
    state::checkout::overwriting_files_and_lone_directories_works

test result: FAILED. 0 passed; 1 failed; 0 ignored; 0 measured; 14 filtered out; finished in 0.00s

──── STDERR: gix-worktree-state-tests::worktree state::checkout::overwriting_files_and_lone_directories_works

thread 'state::checkout::overwriting_files_and_lone_directories_works' panicked at gix-worktree-state\tests\state\checkout.rs:182:9:
The probe must detect to be able to generate symlinks
note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace
```

That assertion was introduced in d715e4a (#1652), in `gix-worktree-state/tests/state/checkout.rs` (line 182 at ffb73b5):

```rust
assert!(opts.fs.symlink, "The probe must detect to be able to generate symlinks");
```
So if the problem occurred before that when running the test suite, on CI or otherwise, it would either have caused a different (possibly less obvious) error, or not caused an error in the tests.
The most important case is (1), and it might be good to try to reproduce it intentionally. Maybe an approach with `hyperfine` could attempt this. I am not sure.
I am extremely impressed by this analysis! It seems more exhaustive than I could ever have imagined, and it feels like I learned a lot that I can only hope to recall when applicable.
As for this particular problem, focussing on reliably-enough solving (2) seems easiest, but when attempting it I'd hope that `gix-utils` could gain a feature that supports this.

To me it boils down to calling `seed(<reasonably-random-value>)` per thread before generating any random numbers with `fastrand`.

Maybe this could be combined with a way to obtain handles to uniquely opened files (or symlinks) in `gix-fs`, possibly emulating what Git does to some extent.

`getrandom` is already part of the dependency tree, so using it as a direct dependency shouldn't be a problem.
> As for this particular problem, focussing on reliably-enough solving (2) seems easiest, but when attempting it I'd hope that `gix-utils` could gain a feature that supports this.
It seems to me that seeding more robustly would solve both. Since adding `getrandom` is acceptable, I think this is probably the easiest approach, even if only (2) were a goal.
> To me it boils down to calling `seed(<reasonably-random-value>)` per thread before generating any random numbers with `fastrand`.
Is arranging to call `fastrand::seed(getrandom::u64())` on each thread easier than defining our own thread-local `fastrand::Rng` that we initialize with `getrandom::u64()`?
Since all global RNG calls are automatically done per thread, I'd think that arranging one call per thread to `seed()` should overall be simpler. However, I wouldn't know what offers these "once per thread" semantics.

So probably it will be easier overall to maintain a thread-local `fastrand::Rng` and use that state to know if it was initialized already. It's a bit sad that there seems to be no other way but to funnel all RNG through our own functions that take care of the on-demand seeding, in `gix-utils` maybe.