diff --git a/README.md b/README.md
index 9542c90..2ed37f6 100644
--- a/README.md
+++ b/README.md
@@ -3,12 +3,12 @@
 ## INTRODUCTION
 A C++ compiler exhibits non-deterministic behavior if, for the same input
 program, the object code generated by the compiler differs from run to run. In
-this work, we first explore the causes of such non-determinism. Then we
+this work, I first explore the causes of such non-determinism. Then we
 outline the scenarios where non-determinism is observed and examine why such
-behavior is undesirable. We then present a case study on our work to uncover
-and fix non-deterministic behavior in the LLVM compiler. Finally, we report on
-the impact that our work has had on the LLVM community, the total number of
-bugs found and how we fixed them.
+behavior is undesirable. I then present a case study on my work to uncover
+and fix non-deterministic behavior in the LLVM compiler. Finally, I report on
+the impact that my work has had on the LLVM community, the total number of
+bugs found and how I fixed them.
 
 ## RELEVANCE
 Millions of C++ developers around the world use compilers to develop their
@@ -17,7 +17,7 @@ compiler. So, having a robust compiler becomes supremely important. However,
 the behavior of a compiler may not always be deterministic. For the same input
 program, it may generate different code in different scenarios. This
 non-determinism can make debugging difficult, result in hard-to-reproduce bugs,
-cause unexpected runtime crashes or unpredictable performance. Our work
+cause unexpected runtime crashes or unpredictable performance. My work
 attempts to uncover non-deterministic behavior in the LLVM C++ compiler thereby
 making LLVM more robust.
 
@@ -31,32 +31,30 @@ input program. Or there might be differences in behavior between asserts and
 non-asserts version of the same compiler. Or even back-to-back runs of the same
 compiler can produce different object code for the same input.
 
-We have identified three main causes of non-deterministic behavior in a C++
+I have identified three main causes of non-deterministic behavior in a C++
 compiler:
 1. Iteration of unordered containers
-
 2. Hashing of pointer keys
-
 3. Use of non-stable sort functions
 
 All three arise due to poor understanding of the behavior of various containers
 and algorithms. The detection of such non-deterministic behavior is often
 challenging since the compiler may not always behave in an expected way. In
-LLVM we try to uncover non-determinism in 2 ways:
+LLVM I try to uncover non-determinism in 2 ways:
 
 ### 1. Iteration order non-determinism
-We implemented a “reverse iteration” mode for all supported unordered
-containers in LLVM. The CMake flag LLVM_REVERSE_ITERATION enables the reverse
-iteration mode. This mode makes all supported containers iterate in reverse, by
-default. The idea is to compare the output of a reverse iteration compiler with
-that of a forward iteration compiler to weed out iteration order randomness.
-This mode is transparent to the user and comes with almost zero runtime cost.
+I implemented a “reverse iteration” mode for all supported unordered containers
+in LLVM. The CMake flag LLVM_REVERSE_ITERATION enables the reverse iteration
+mode. This mode makes all supported containers iterate in reverse, by default.
+The idea is to compare the output of a reverse iteration compiler with that of
+a forward iteration compiler to weed out iteration order randomness. This mode
+is transparent to the user and comes with almost zero runtime cost.
 
 The following upstream buildbot tracks this mode:
 http://lab.llvm.org:8011/builders/reverse-iteration
 
 ### 2. Sorting order non-determinism
-We added a wrapper function to LLVM called llvm::sort which randomly shuffles a
+I added a wrapper function to LLVM called llvm::sort which randomly shuffles a
 container before invoking std::sort. The idea is that randomly shuffling a
 container would weed out non-deterministic sorting order of keys with the same
 values.
@@ -64,29 +62,29 @@ values.
 The following upstream buildbot tracks this mode:
 http://lab.llvm.org:8011/builders/llvm-clang-x86_64-expensive-checks-win
 
-We also outline some best practices we followed in LLVM to avoid or fix
-non-deterministic behavior:
+Some best practices I followed in LLVM to avoid or fix non-deterministic
+behavior are:
 1. Sort the container before iteration
 2. Use a stronger sort predicate
 3. Use a stable sort function
 4. Use an ordered container
 
 ## COMPLETION STATUS
-Our work to enable reverse iteration and random shuffling a container is
-complete and available upstream in the latest 6.0 release of LLVM. We have so
+My work to enable reverse iteration and random shuffling a container is
+complete and available upstream in the latest 6.0 release of LLVM. I have so
 far uncovered and fixed 42 iteration order bugs and 44 sorting order bugs. The
 upstream buildbots regularly catch non-determinism bugs and the community
-promptly fixes them. As a result of our work, the LLVM community has become
-more diligent in their use of containers and sorting algorithms. We have also
-added coding standards for LLVM compiler developers on the correct use of
-unordered containers and sorting algorithms:
+promptly fixes them. As a result of my work, the LLVM community has become more
+diligent in their use of containers and sorting algorithms. I have also added
+coding standards for LLVM compiler developers on the correct use of unordered
+containers and sorting algorithms:
 [1] https://llvm.org/docs/CodingStandards.html#beware-of-non-determinism-due-to-ordering-of-pointers
 
 [2] https://llvm.org/docs/CodingStandards.html#beware-of-non-deterministic-sorting-order-of-equal-elements
 
 ## REFERENCES
-Our work has featured several times in the LLVM weekly newsletters and other
+My work has featured several times in the LLVM weekly newsletters and other
 places:
 [1] https://bugs.swift.org/browse/SR-6154
 
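Note on the "Sorting order non-determinism" text in this diff: the shuffle-before-sort idea, and the "use a stronger sort predicate" practice that addresses what it exposes, can be illustrated with a small standalone sketch. This is not LLVM's `llvm::sort` implementation. `Entry`, `shuffleThenSort`, `byKeyOnly`, and `byKeyThenName` are hypothetical names introduced only for this illustration, and the explicit seed parameter is an assumption made so that runs can be compared.

```cpp
#include <algorithm>
#include <random>
#include <string>
#include <utility>
#include <vector>

// Hypothetical record type: several entries may share the same integer key.
using Entry = std::pair<int, std::string>;

// Hypothetical helper (not LLVM's code): shuffle a copy of the input with a
// seeded generator, then sort it with the given predicate. If the predicate
// leaves equal elements unordered, different seeds may yield different results.
template <typename Cmp>
std::vector<Entry> shuffleThenSort(std::vector<Entry> v, unsigned seed, Cmp cmp) {
  std::mt19937 rng(seed);
  std::shuffle(v.begin(), v.end(), rng);
  std::sort(v.begin(), v.end(), cmp);
  return v;
}

// Weak predicate: compares keys only, so entries with equal keys may end up
// in any relative order after std::sort.
inline bool byKeyOnly(const Entry &a, const Entry &b) {
  return a.first < b.first;
}

// Stronger predicate: breaks ties on the name, so distinct entries are always
// ordered and the sorted result does not depend on the input order.
inline bool byKeyThenName(const Entry &a, const Entry &b) {
  return a.first != b.first ? a.first < b.first : a.second < b.second;
}
```

Calling `shuffleThenSort` with `byKeyThenName` returns the same vector for every seed as long as the entries are distinct. With `byKeyOnly`, the relative order of entries that share a key is unspecified and may change from seed to seed, which is the kind of difference the shuffling is meant to surface.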