You should read the 5 objectives A-E for correct and efficient concurrent programming (derived from material in Tanenbaum's text) in my notes for Lecture 14.
What is a critical region? (in this context, critical section is a synonym for critical region)
C/C++ "block": code delimited by "{", "}" brace pairs: { ...a C/C++ block of statements... }
Definition 1 (Tanenbaum and other OS books): "That part of a program where shared memory is accessed"
Critical region for Mutex M: code between the lock(M) and corresponding unlock(M):
lock(M); ..; critical region or section ..; unlock(M);
The practice of programming with mutexes or locks is:
- Described by objective A: "Prevent race condition bugs by choosing critical regions and use a system that enforces 1" ("No two processes [threads] may be simultaneously inside their critical regions.")
- Called "Easy Concurrency" by Lampson and Rinard.
- You (the programmer) must cleverly choose what sequences of operations to put between corresponding lock and unlock operations, so that race condition bugs will not occur if the system meets the guarantee described by 1.
- (Lampson and Rinard describe the use of invariants in this methodology. The principle is that you figure out facts that you want the program to satisfy whenever it is not in a critical section, write down a precise description of them, and carefully program the critical sections so that, if you assume these facts whenever the program is about to enter a critical region, then these facts will be true whenever the program leaves the critical section.)
- Illustrated by our bracketing the statement sequences (A1), (A2), (A3) and (B1)(B2) (B3) by lock(M) and unlock(M) statements.
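The invariant methodology described above can be sketched with POSIX threads. This is only an illustration under stated assumptions: the two counters, the invariant "deposits == receipts", and the names worker, run_two_workers and ROUNDS are hypothetical and do not come from the lecture or from Lampson and Rinard.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

/* Hypothetical shared state. Invariant: deposits == receipts whenever
   no thread is inside a critical region for M. */
static long deposits = 0;
static long receipts = 0;
static pthread_mutex_t M = PTHREAD_MUTEX_INITIALIZER;

enum { ROUNDS = 100000 };

static void *worker(void *unused)
{
    (void)unused;
    for (int i = 0; i < ROUNDS; i++) {
        pthread_mutex_lock(&M);        /* lock(M) */
        assert(deposits == receipts);  /* invariant may be assumed on entry */
        deposits++;                    /* invariant is temporarily broken */
        receipts++;                    /* invariant is restored before leaving */
        pthread_mutex_unlock(&M);      /* unlock(M) */
    }
    return NULL;
}

/* Runs two workers to completion and returns the final value of deposits. */
long run_two_workers(void)
{
    pthread_t a, b;
    pthread_create(&a, NULL, worker, NULL);
    pthread_create(&b, NULL, worker, NULL);
    pthread_join(a, NULL);
    pthread_join(b, NULL);
    return deposits;
}
```

Because every update sits between lock and unlock, no thread can observe the state while the invariant is broken, which is exactly what condition 1 is meant to buy.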
The execution of user threads as well as kernel threads can be viewed, at the lowest software abstraction level, as the execution of machine language instructions in sequence. However, interrupts and exceptions cause other instructions (belonging to the interrupt handler, other kernel functions, and other threads if a context switch is performed) to be executed between successive instructions.
Way 2: Take advantage of store-synchronous operations when using lock variables:
(This was discussed in the lecture because it wasn't covered enough before in the course. You can read about it in earlier textbook chapters. The examples of interrupts or exceptions used were: timer, floating point overflow, and page fault.)
Two kinds of interrupts or exceptions:
1) The interrupted instruction is retried after the interrupt is handled.
2) Although the interrupting event occurs in the middle of the currently executing instruction, the hardware makes that instruction complete all its work and sets the saved PC so that after the interrupt is handled, the next sequential instruction is executed.
In either case, the variables manipulated by the code sequence can be modified by the instructions that execute as a result of the interrupt.
All hardware today suitable for general computing has instructions to "disable" and "enable" most interrupts. On the PC (i486) architecture, the disable interrupt instruction is cli and the enable interrupt instruction is sti.
When a condition causing an interrupt occurs while interrupts are disabled, the interrupt controller (a piece of hardware) "remembers" the interrupt and continues to present the interrupt signal to the CPU. The interrupt will occur right after the interrupt enable instruction is executed. (Some computers have levels of interrupts. Each interrupt cause has a level. When an interrupt occurs, the CPU interrupt level is set to the level of the interrupt's cause. Now, interrupts with a lower or equal level are disabled, but an interrupt with a higher level will be taken. As before, all interrupts are "remembered" by the interrupt controller.)
Disabling interrupts has the advantage of being simple and efficient. It is often used in low level kernel code (interrupt handlers are called with interrupts disabled). Tanenbaum describes two disadvantages:
When the computer has more than one CPU, disabling interrupts on one CPU does not stop other CPUs from accessing memory. So, variables accessed by code between the disable and enable operations might also be accessed by another CPU. In Linux, the solution to this problem is to use "spin locks" in addition to disabling interrupts.
There's another disadvantage: If interrupts are disabled for too long, some I/O devices will lose data because the data is not transferred from their buffers quickly enough. Kernel programmers who disable interrupts must make sure they are re-enabled quickly.
Instructions to disable interrupts are privileged, so they don't work (and cause exceptions instead) when run by user processes. If user processes used privileged mode, they could disrupt other processes or the whole system operation, either on purpose or by accident. (In MS-DOS, everything ran in privileged mode.)
Caution: Simpleminded applications of this don't work correctly. It took 15-20 years for computer scientists to figure out "Peterson's solution" after the problem was first formulated.
Way 3: Use special atomic "test-and-set", "compare-and-exchange", etc. instructions built into modern computer instruction sets to help OS writers implement synchronization
In current computers, "store" machine instructions take a word of data to store together with the address of the memory word in which to store the data. Such store instructions are atomic "store-synchronous" operations, which means that once they are started, they complete. They are never interrupted partway, so it never happens that only half a word is stored. (In the Java language specification, exactly which operations on variables are atomic is defined very carefully. For example, stores of int or char values are atomic, but stores of 64-bit long and double values that are not declared volatile need not be. Writers of multithreaded floating point Java applications beware! See http://java.sun.com/docs/books/jls/second_edition/html/memory.doc.html#28654)
We will take advantage of the following fact about store-synchronous operations: If, say, two threads run the operations "VarX := ValueA;" and "VarX := ValueB;" concurrently, then after they are both finished, the (shared) variable VarX will have a well-defined value that is either ValueA or ValueB, and that value will not change until a new write to VarX is performed.
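The fact about concurrent stores can be tried out with a small sketch. It assumes C11 atomics and POSIX threads as a portable way to get word-sized atomic stores from C; the names VALUE_A, VALUE_B, store_value and race_two_stores are hypothetical.

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>
#include <stddef.h>

enum { VALUE_A = 111, VALUE_B = 222 };

static atomic_int VarX;   /* the shared variable both threads store into */

static void *store_value(void *arg)
{
    /* One atomic store: "VarX := value" */
    atomic_store(&VarX, *(const int *)arg);
    return NULL;
}

/* Races the two stores and returns whatever VarX holds afterwards. */
int race_two_stores(void)
{
    int a = VALUE_A, b = VALUE_B;
    pthread_t ta, tb;
    atomic_store(&VarX, 0);
    pthread_create(&ta, NULL, store_value, &a);
    pthread_create(&tb, NULL, store_value, &b);
    pthread_join(ta, NULL);
    pthread_join(tb, NULL);
    int final = atomic_load(&VarX);
    /* Which store wins is up to the scheduler, but the result is never a mixture. */
    assert(final == VALUE_A || final == VALUE_B);
    return final;
}
```

Which of the two values survives can differ from run to run; the guarantee is only that it is one of them.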
Simpleminded implementation of mutex lock and unlock that Fails, ... and why:
typedef int Mutex;
/* *M == 0 means the mutex is unlocked, *M == 1 means it is locked */
void lock( Mutex *M )
{
    while( *M != 0 /*operation 1*/ )
        /*do nothing*/ ;
    *M = 1 /*operation 2*/ ;
    return /*operation 4*/ ;
}
void unlock( Mutex *M )
{
    *M = 0; /*operation 3*/
}
Suppose thread A runs: lock(&M); critical_region_A; unlock(&M);
and thread B runs: lock(&M); critical_region_B; unlock(&M);
The expected interleaving is:
(A1)(A2)(B1)(B1)(B1)...(B1)(A4) critical_region_A operations interleaved with (B1)s (A3)(B1)(B2) critical_region_B (B3)
However, the scheduler might choose to run thread B right after test (A1) causes thread A's next step to be (A2):
(A1)(B1)(B2)(B4) some critical_region_B ops (A2)(A4) critical regions A and B running CONCURRENTLY (BAD!!) ....
What happened: In between the time thread A tested *M!=0, finding it false, and the time A set *M to 1, thread B was scheduled and ALSO ran *M!=0, finding it false!! Hence, BOTH threads "think" the mutex is unlocked.
(This discussion formalises the remarks under Lock Variables, page 104.)
We will see how "strict alternation" maintains condition 1 at the expense of condition 3.
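A minimal sketch of strict alternation between two threads follows. It assumes C11 sequentially consistent atomics for the shared turn variable and POSIX threads; the names alternate, run_strict_alternation, inside and TURNS_EACH are hypothetical. Taking condition 3 to be Tanenbaum's "no process running outside its critical region may block other processes", the weakness shows up in the code: a thread that has just left its critical region cannot enter again until the other thread has taken its turn, even if the other thread is busy elsewhere.

```c
#include <assert.h>
#include <pthread.h>
#include <sched.h>
#include <stdatomic.h>
#include <stddef.h>

enum { TURNS_EACH = 10000 };

static atomic_int turn;       /* 0: thread 0 may enter, 1: thread 1 may enter */
static int inside;            /* number of threads inside the critical region */
static long shared_counter;   /* only touched inside the critical region */

static void *alternate(void *arg)
{
    int self = *(const int *)arg;
    int other = 1 - self;
    for (int i = 0; i < TURNS_EACH; i++) {
        while (atomic_load(&turn) != self)
            sched_yield();            /* busy wait until it is our turn */
        inside++;
        assert(inside == 1);          /* condition 1: we are alone in here */
        shared_counter++;             /* critical region */
        inside--;
        atomic_store(&turn, other);   /* hand the turn to the other thread */
    }
    return NULL;
}

/* Runs both threads to completion and returns the final counter value. */
long run_strict_alternation(void)
{
    int id0 = 0, id1 = 1;
    pthread_t t0, t1;
    atomic_store(&turn, 0);
    shared_counter = 0;
    pthread_create(&t0, NULL, alternate, &id0);
    pthread_create(&t1, NULL, alternate, &id1);
    pthread_join(t0, NULL);
    pthread_join(t1, NULL);
    return shared_counter;
}
```

The sketch only terminates because both threads want exactly the same number of turns; if one thread stopped early, the other would wait forever for a turn that never comes.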
We will see that careful usage of essentially 3 variables enables Peterson's solution to achieve mutual exclusion without alternation in a system that merely has synchronous store operations.
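The three variables can be sketched as follows, in the form usually given in Tanenbaum's text (two "interested" flags and a turn variable). One assumption is made loudly here: the variables are declared as C11 atomics with the default sequentially consistent ordering, because on many modern CPUs and compilers plain stores and loads may be reordered in ways that break the algorithm. The names contend, run_peterson, inside and ENTRIES_EACH are hypothetical.

```c
#include <assert.h>
#include <pthread.h>
#include <sched.h>
#include <stdatomic.h>
#include <stddef.h>

enum { ENTRIES_EACH = 20000 };

static atomic_int interested[2];  /* interested[i] != 0: thread i wants to enter */
static atomic_int turn;           /* the thread that wrote turn last must wait */
static int inside;                /* number of threads inside the critical region */
static long shared_counter;       /* only touched inside the critical region */

static void enter_region(int self)
{
    int other = 1 - self;
    atomic_store(&interested[self], 1);   /* announce interest */
    atomic_store(&turn, self);            /* last writer of turn yields */
    while (atomic_load(&turn) == self && atomic_load(&interested[other]))
        sched_yield();                    /* busy wait */
}

static void leave_region(int self)
{
    atomic_store(&interested[self], 0);   /* no longer interested */
}

static void *contend(void *arg)
{
    int self = *(const int *)arg;
    for (int i = 0; i < ENTRIES_EACH; i++) {
        enter_region(self);
        inside++;
        assert(inside == 1);              /* mutual exclusion holds */
        shared_counter++;                 /* critical region */
        inside--;
        leave_region(self);
    }
    return NULL;
}

/* Runs both threads to completion and returns the final counter value. */
long run_peterson(void)
{
    int id0 = 0, id1 = 1;
    pthread_t t0, t1;
    atomic_store(&interested[0], 0);
    atomic_store(&interested[1], 0);
    atomic_store(&turn, 0);
    shared_counter = 0;
    pthread_create(&t0, NULL, contend, &id0);
    pthread_create(&t1, NULL, contend, &id1);
    pthread_join(t0, NULL);
    pthread_join(t1, NULL);
    return shared_counter;
}
```

Unlike strict alternation, a thread whose partner is not interested passes through enter_region immediately, so one thread may enter many times in a row.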
These instructions or particular sequences of instructions (for example, the "lock" prefix followed by "cmpxchg" in the IA-32) make 2 or more memory accesses, but the hardware prevents any concurrent access to the targeted memory locations between them. See the textbooks for information on how to use them.
Closing Remark
Higher level synchronization functions (mutex locks, semaphores, monitors, wait queues) are typically provided in modern systems. They are provided for user level processes and threads by system call libraries, or by language features of interpreted languages like Java. Similar (but not THE SAME) functions are provided INSIDE OS kernels for use by other kernel code. They are used in 2 ways: (1) To provide synchronization and mutual exclusion for the operation of the kernel. (2) To enable kernel programmers to implement the synchronization system calls called from the above libraries. These kernel functions will then be IMPLEMENTED with some of the 3 ways of doing low level synchronization described above. The kernel will also use store-synchronous variables or special atomic synchronization instructions within spin locks that provide practical short intervals of mutual exclusion between kernel threads, both on the same and on separate processors.
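To make Way 3 and the spin locks mentioned above a little more concrete, here is a minimal user-level sketch of a test-and-set spin lock. It assumes C11's atomic_flag, which compilers typically implement with the hardware's atomic read-modify-write instructions; it is not how any particular kernel does it. The names spin_lock, spin_unlock, spinner, run_spinners and ROUNDS are hypothetical.

```c
#include <assert.h>
#include <pthread.h>
#include <sched.h>
#include <stdatomic.h>
#include <stddef.h>

enum { ROUNDS = 50000 };

static atomic_flag M = ATOMIC_FLAG_INIT;  /* clear: unlocked, set: locked */
static int inside;                        /* number of threads inside the critical region */
static long shared_counter;               /* only touched inside the critical region */

static void spin_lock(atomic_flag *m)
{
    /* Test and set happen as ONE atomic step, so the gap between
       "operation 1" and "operation 2" of the failing lock is gone. */
    while (atomic_flag_test_and_set(m))
        sched_yield();   /* user-level courtesy; a kernel spin lock would just spin */
}

static void spin_unlock(atomic_flag *m)
{
    atomic_flag_clear(m);
}

static void *spinner(void *unused)
{
    (void)unused;
    for (int i = 0; i < ROUNDS; i++) {
        spin_lock(&M);
        inside++;
        assert(inside == 1);   /* mutual exclusion holds */
        shared_counter++;      /* critical region */
        inside--;
        spin_unlock(&M);
    }
    return NULL;
}

/* Runs two spinners to completion and returns the final counter value. */
long run_spinners(void)
{
    pthread_t a, b;
    shared_counter = 0;
    pthread_create(&a, NULL, spinner, NULL);
    pthread_create(&b, NULL, spinner, NULL);
    pthread_join(a, NULL);
    pthread_join(b, NULL);
    return shared_counter;
}
```

Compared with the simpleminded lock, the only change is that testing the flag and setting it can no longer be separated by a context switch or by another CPU.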