Add a description of atomic instructions so that readers can distinguish
between using fetch and..., which is only a programming tool, and its
actual execution as an atomic operation, which depends on the
compiler.
Simplify the rmw_example code to provide more flexible examples.
- Initially, all worker threads will be initialized. The main thread
will ask all workers to start running. If there is no job or the job is
completed, the worker will become idle. Next, the main thread will
continue to add more jobs and ask the worker to start running again.
Meanwhile, the main thread will also wait for the results of the work.
- Use the struct `tpool_future` to record all the information required
for the job.
Co-authored-by: Chih-Wei Chien <[email protected]>
concurrency-primer.tex: 36 additions & 33 deletions (+36 -33)
@@ -396,12 +396,12 @@ \section{Read-modify-write}
396
396
In \secref{atomicity}, there is a need for atomicity to ensure that a group of operations is not only sequentially executed but also completes without being interrupted by operations from other threads.
397
397
This establishes correct order of operations from different threads.
\captionof{figure}{Exchange, Test and Set, Fetch and…, Compare and Swap can all be transformed into atomic RMW operations, ensuring that operations like $t1 \to t2 \to t3$ will become an atomic step.}
401
-
\label{fig:atomic_rmw}
401
+
\label{fig:atomic-rmw}
402
402
403
403
Atomic loads and stores are all well and good when we do not need to consider the previous state of atomic variables, but sometimes we need to read a value, modify it, and write it back as a single atomic step.
404
-
As shown in \fig{fig:atomic_rmw}, the modification is based on the previous state that is visible for reading, and the result is then written back.
404
+
As shown in \fig{fig:atomic-rmw}, the modification is based on the previous state that is visible for reading, and the result is then written back.
405
405
A complete \introduce{read-modify-write} operation is performed atomically to ensure visibility to subsequent operations.
406
406
407
407
Furthermore, for communication between concurrent threads, a shared resource is required, as shown in \fig{fig:atomicity}
@@ -411,16 +411,16 @@ \section{Read-modify-write}
411
411
412
412
As discussed earlier, the process of accessing shared resources responsible for communication must also ensure both order and non-interference.
413
413
To prevent the recursive protection of shared resources,
414
-
atomic operations can be introduced for the shared resources responsible for communication, as shown in \fig{fig:atomic_types}.
414
+
atomic operations can be introduced for the shared resources responsible for communication, as shown in \fig{fig:atomic-types}.
415
415
416
416
There are a few common \introduce{read-modify-write} (\textsc{RMW}) operations that turn these operations into a single atomic step.
417
417
In \cplusplus{}, they are represented as member functions of \cpp|std::atomic<T>|.
\captionof{figure}{Test and Set (Left) and Compare and Swap (Right) leverage their functionality of checking and their atomicity to make other RMW operations perform atomically.
422
422
The red color represents atomic RMW operations, while the blue color represents RMW operations that behave atomically.}
423
-
\label{fig:atomic_types}
423
+
\label{fig:atomic-types}
424
424
425
425
\subsection{Exchange}
426
426
\label{exchange}
@@ -441,7 +441,7 @@ \subsection{Test and set}
441
441
\introduce{Test-and-set} operations are not limited to just \textsc{RMW} functions;
442
442
they can also be utilized for constructing a simple spinlock.
443
443
In this scenario, the flag acts as a shared resource for communication between threads.
444
-
Thus, spinlock implemented with \introduce{Test-and-set} operations ensures that entire \textsc{RMW} operations on shared resources are performed atomically, as shown in \fig{fig:atomic_types}.
444
+
Thus, a spinlock implemented with \introduce{Test-and-set} operations ensures that entire \textsc{RMW} operations on shared resources are performed atomically, as shown in \fig{fig:atomic-types}.
445
445
\label{spinlock}
446
446
\begin{ccode}
447
447
atomic_flag af = ATOMIC_FLAG_INIT;
@@ -464,7 +464,7 @@ \subsection{Fetch and…}
464
464
or bitwise \textsc{AND}, \textsc{OR}, \textsc{XOR}) and return its previous value,
465
465
all as part of a single atomic operation.
466
466
Compared with \introduce{Exchange} (\secref{exchange}), when programmers only need to make a simple modification to the shared variable,
467
-
they can use \introduce{Fetchand…}.
467
+
they can use \introduce{Fetch-and…}.
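As a minimal sketch of this family in the \clang{}11 atomic library, the two helpers below use \monobox{atomic\_fetch\_add} and \monobox{atomic\_fetch\_or}. The helper names \monobox{take\_ticket} and \monobox{set\_bits\_were\_set} are hypothetical and serve only as illustration; the point is that each call modifies the shared variable and hands back the value it held immediately before the modification.

```c
#include <stdatomic.h>

/* Hypothetical helper: hand out unique, increasing ticket numbers.
 * atomic_fetch_add returns the value held *before* the addition,
 * so two threads can never receive the same ticket. */
static unsigned take_ticket(atomic_uint *next_ticket)
{
    return atomic_fetch_add(next_ticket, 1);
}

/* Hypothetical helper: set `bits` in a shared mask and report whether
 * any of them were already set, based on the previous value returned
 * by atomic_fetch_or. */
static int set_bits_were_set(atomic_uint *mask, unsigned bits)
{
    unsigned prev = atomic_fetch_or(mask, bits);
    return (prev & bits) != 0;
}
```

Because the previous value is returned by the same atomic step, no separate load is needed, and no other thread can slip in between the read and the write.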
468
468
469
469
\subsection{Compare and swap}
470
470
\label{cas}
@@ -501,50 +501,57 @@ \subsection{Compare and swap}
501
501
Subsequently, update the expected value with the current shared value and retry the modification in a loop.
502
502
This iterative process allows \textsc{CAS} to serve as a communication mechanism between threads,
503
503
ensuring that entire \textsc{RMW} operations on shared resources are performed atomically.
504
-
As shown in \fig{fig:atomic_types}, compared with \introduce{Test-and-set} \secref{Testandset},
504
+
As shown in \fig{fig:atomic-types}, compared with \introduce{Test-and-set} \secref{Testandset},
505
505
a thread that employs \textsc{CAS} can directly use the shared resource to check.
506
506
It uses atomic \textsc{CAS} to ensure that Modify is atomic,
507
507
coupled with a while loop to ensure that the entire \textsc{RMW} can behave atomically.
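The retry loop described above can be sketched as follows. This is an illustration under assumptions, not code from the example program: \monobox{atomic\_store\_max} is a hypothetical helper that raises a shared integer to at least a given value, an update for which the library offers no dedicated fetch-and-… function.

```c
#include <stdatomic.h>

/* Hypothetical helper: atomically raise *shared to at least `candidate`.
 * Returns the value observed just before the update (or the current
 * value if no update was needed). */
static int atomic_store_max(atomic_int *shared, int candidate)
{
    int expected = atomic_load(shared);
    /* When the CAS fails, it copies the current shared value into
     * `expected`, so the condition is re-evaluated against fresh data
     * before the next attempt. The weak variant may also fail
     * spuriously, which the loop tolerates. */
    while (expected < candidate &&
           !atomic_compare_exchange_weak(shared, &expected, candidate))
        ;
    return expected;
}
```

Only the compare-and-swap itself is atomic; the loop is what makes the whole read, compare, and write sequence behave as one atomic update.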
508
508
509
+
~\\
510
+
However, atomic \textsc{RMW} operations here are merely a programming tool that helps programmers achieve logical correctness.
511
+
Their actual execution as atomic operations depends on how the compiler translates them into actual atomic instructions for different hardware instruction sets.
512
+
At the instruction level, \introduce{Exchange}, \introduce{Fetch-and-Add}, \introduce{Test-and-set} and \textsc{CAS} are different styles of atomic \textsc{RMW} instructions.
513
+
An ISA may provide only some of them,
514
+
leaving the rest to compilers to synthesize atomic \textsc{RMW} operations.
515
+
For example, in IA-32/64 and IBM System/360/z architectures,
516
+
\introduce{Test-and-set} functionality is directly supported by hardware instructions.
517
+
x86 provides XCHG and XADD for \introduce{Exchange} and \introduce{Fetch-and-Add}, whereas \introduce{Test-and-set} is implemented with XCHG.
518
+
Arm, in another style, provides LL/SC (Load Linked/Store Conditional) flavor instructions for all the operations,
519
+
with \textsc{CAS} added in Armv8/v9-A.
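To illustrate how one \textsc{RMW} operation can be synthesized from another, the sketch below expresses the idea at the C level. It is an assumption-laden illustration, not a description of what any particular compiler emits: \monobox{tas\_via\_exchange} and \monobox{fetch\_add\_via\_cas} are hypothetical helper names.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Test-and-set expressed as an exchange: store `true` and return the
 * previous value, all in one atomic step. */
static bool tas_via_exchange(atomic_bool *flag)
{
    return atomic_exchange(flag, true);
}

/* Fetch-and-add expressed as a CAS loop, as could be done on a target
 * that offers compare-and-swap but no dedicated add instruction. */
static int fetch_add_via_cas(atomic_int *obj, int delta)
{
    int expected = atomic_load(obj);
    /* On failure, `expected` is refreshed with the current value and
     * the sum is recomputed before retrying. */
    while (!atomic_compare_exchange_weak(obj, &expected, expected + delta))
        ;
    return expected; /* the value before the addition */
}
```

Either helper yields the same observable result as the corresponding library call; only the underlying primitive differs.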
520
+
509
521
\subsection{Example}
510
522
\label{rmw_example}
511
-
Following example code is a simplify implementation of thread pool to demonstrate the use of \clang{}11 atomic library.
523
+
The following example code is a simplified implementation of a thread pool, which demonstrates the use of the \clang{}11 atomic library.
512
524
513
525
\inputminted{c}{./examples/rmw_example.c}
514
526
515
-
%Compile the code with \monobox{gcc rmw\_example.c -o rmw\_example -Wall -Wextra -std=c11 -pthread} and execute the program.
516
-
%A thread pool has three states: idle, cancelled and running.
517
-
%It is initialized with \monobox{N\_THREADS} (default 8) of threads.
518
-
%\monobox{N\_JOBS} (default 16) of jobs are added, and the pool is then set to running.
519
-
%A job is simply echoing its job ID.
520
-
%\monobox{sleep(1)} is used to ensure that the second batch of jobs is added after the first batch is finished; otherwise, jobs may not be consumed as expected.
521
-
%Thread pool is then destroyed right after starting running.
522
527
The program prints the following to standard output:
523
528
\begin{ccode}
524
-
PI calculated with 101 terms: 3.141592653589793
529
+
PI calculated with 100 terms: 3.141592653589793
525
530
\end{ccode}
526
531
527
532
\textbf{Exchange}
528
-
In function \monobox{thread\_pool\_destroy}, \monobox{atomic\_exchange(\&thrd\_pool->state, cancelled)} reads current state and replaces it with "cancelled". A warning message is printed if the pool is destroyed when still running.
529
-
If the exchange is not performed atomically, we may initially get the state as "running". Subsequently, a thread could set the state to "cancelled" after finishing the last one, resulting in a false warning.
533
+
In function \monobox{thread\_pool\_destroy}, \monobox{atomic\_exchange(\&thrd\_pool->state, cancelled)} reads the current state and replaces it with ``cancelled''.
534
+
A warning message is printed if the pool is destroyed while workers are still ``running''.
535
+
If the exchange is not performed atomically, we may initially read the state as ``running''. Subsequently, a thread could set the state to ``cancelled'' after finishing the last job, resulting in a false warning.
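A minimal sketch of this pattern follows. It is not the code from \monobox{rmw\_example.c}: the enumeration \monobox{pool\_state}, its ordering, and the helper \monobox{cancel\_pool} are assumptions made for illustration.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Assumed state names; the real example may define them differently. */
enum pool_state { idle, running, cancelled };

/* Hypothetical helper: mark the pool as cancelled and report whether
 * it was still running at that exact moment. The read of the old state
 * and the write of the new one form a single atomic step, so no other
 * thread can change the state in between. */
static bool cancel_pool(atomic_int *state)
{
    int prev = atomic_exchange(state, cancelled);
    return prev == running;
}
```

A caller could print the warning only when \monobox{cancel\_pool} returns true, which avoids the false warning that a separate load followed by a store would allow.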
530
536
531
537
\textbf{Test and set}
532
-
In the example, the scenario is as follows:
533
-
First, the main thread initially acquire a lock \monobox{future->flag} and then set it true,
534
-
which is akin to creating a job and then transfer its ownership to the worker.
535
-
Subsequently, the main thread will be blocked until the worker clear the flag.
536
-
This inidcate the main thread will wail until the worker completes the job and return the ownership back to the main thread, which ensure correct cooperation.
538
+
In this example, the scenario is as follows:
539
+
First, the main thread acquires the lock \monobox{future->flag} by setting it to true,
540
+
which is akin to creating a job and then transferring its ownership to the worker.
541
+
Subsequently, the main thread will be blocked until the worker clears the flag.
542
+
This indicates that the main thread will wait until the worker completes the job and returns ownership back to the main thread, which ensures correct cooperation.
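The hand-off described above can be sketched as follows. The structure \monobox{future\_sketch} and its three helpers are hypothetical names chosen for illustration and are simpler than the structure used in the example program.

```c
#include <stdatomic.h>

/* Hypothetical, simplified future: a flag plus a result slot. */
struct future_sketch {
    atomic_flag flag; /* set while the worker owns the job */
    int result;
};

/* Main thread: set the flag, handing ownership to the worker. */
static void future_submit(struct future_sketch *f)
{
    atomic_flag_test_and_set(&f->flag);
}

/* Worker: publish the result, then clear the flag to return ownership. */
static void future_complete(struct future_sketch *f, int result)
{
    f->result = result;
    atomic_flag_clear(&f->flag);
}

/* Main thread: spin until test-and-set observes a cleared flag, which
 * means the worker has finished. The same call sets the flag again,
 * so the main thread re-acquires ownership. */
static int future_wait(struct future_sketch *f)
{
    while (atomic_flag_test_and_set(&f->flag))
        ;
    return f->result;
}
```

With the default sequentially consistent ordering, the write to \monobox{result} before \monobox{atomic\_flag\_clear} is visible to the thread whose test-and-set observes the cleared flag.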
537
543
538
544
\textbf{Fetch and…}
539
-
In the function \monobox{thread\_pool\_destroy}, \monobox{atomic\_fetch\_and} is utilized as a means to set the state to idle.
545
+
In the function \monobox{thread\_pool\_destroy}, \monobox{atomic\_fetch\_and} is utilized as a means to set the state to ``idle''.
540
546
Yet, in this case, it is not necessary, as the pool needs to be reinitialized for further use regardless.
541
547
Its return value could be further utilized, for instance, to report the previous state and perform additional actions.
542
548
543
549
\textbf{Compare and swap}
544
550
Once threads are created in the thread pool as workers, they will continuously search for jobs to do.
545
-
Jobs are taken from the tail of job queue.
546
-
To claim a job without it being taken by another worker halfway through, we need to atomically change the pointer to the last job. Otherwise the last job is under races.
547
-
The while loop in function \monobox{worker},
551
+
Jobs are taken from the tail of the job queue.
552
+
To take a job without it being taken by another worker halfway through, we need to atomically change the pointer to the last job.
553
+
Otherwise, workers would race for the last job.
554
+
The while loop in the function \monobox{worker},
548
555
\begin{ccode}
549
556
while (!atomic_compare_exchange_weak(&thrd_pool->head->prev, &job,
Unless specified otherwise, atomic operations in the \clang{}11 atomic library use \monobox{memory\_order\_seq\_cst} as the default memory order. Operations suffixed with \monobox{\_explicit} accept an additional argument to specify which memory order to use.
576
583
How to leverage memory orders to optimize performance will be covered later in \secref{lock-example}.
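As a small illustration of the two spellings, the sketch below increments a counter once with the default ordering and once with an explicitly relaxed ordering. The helper names are hypothetical, and choosing \monobox{memory\_order\_relaxed} here assumes that only the atomicity of the counter matters, not its ordering relative to other memory accesses.

```c
#include <stdatomic.h>

/* Default form: equivalent to passing memory_order_seq_cst. */
static unsigned count_seq_cst(atomic_uint *counter)
{
    return atomic_fetch_add(counter, 1);
}

/* Explicit form: the extra argument selects the memory order.
 * Relaxed ordering still guarantees an atomic increment, but imposes
 * no ordering on surrounding loads and stores. */
static unsigned count_relaxed(atomic_uint *counter)
{
    return atomic_fetch_add_explicit(counter, 1, memory_order_relaxed);
}
```

Both functions return the previous value and never lose an increment; they differ only in the ordering guarantees they provide to other memory operations.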
577
584
578
-
You may have noticed that there is padding after \monobox{\_Atomic(job\_t *) prev} in \monobox{struct idle\_job} in the example.
579
-
It is used for preventing \introduce{false sharing} in a cache line.
580
-
Further discussion on cache effects and false sharing is provided in \secref{false-sharing}.
581
-
582
585
\section{Atomic operations as building blocks}
583
586
584
587
Atomic loads, stores, and \textsc{RMW} operations are the building blocks for every single concurrency tool.