Lecture 16

Here are the 3 criteria given in AOS Chapter 7 for a solution to the "critical region problem" to be acceptable.   They are edited and expressed as outlines for improved logical clarity and discussion in the lecture.
  1. Mutual Exclusion:
    1. If process P is executing in its critical section, then no other processes can be executing in their critical sections.
  2. Progress:
  3. Bounded Waiting:
    1. Suppose now some process P requests to enter its critical section.
    2. There exists a numerical bound on the number of times OTHER processes are allowed to enter their own critical sections BEFORE process P's request is granted.
The progress criterion prevents an infinite loop executed by one process in non-critical-section code from stopping a different process from entering a critical region.  But it does not prevent an infinite loop inside a critical region from making the system hang!  Smart programmers make sure the code they put into critical regions never executes for long periods of time!

You should read the 5 objectives A-E for correct and efficient concurrent programming (derived from material in Tanenbaum's text) in my notes for Lecture 14.

What is a critical region? (in this context, critical section is a synonym for critical region)

Objective B: How to implement mutual exclusion using various hardware "gadgets" or features built into current computers:
Way 1:  Disable interrupts.
The execution of user threads as well as kernel threads can be viewed, at the lowest software abstraction level, as the execution of machine language instructions in sequence.  However, interrupts and exceptions cause other instructions (belonging to the interrupt handler, other kernel functions, and other threads if a context switch is performed) to be executed between successive instructions.

(This was discussed in the lecture because it wasn't covered enough before in the course.  You can read about it in earlier textbook chapters.  The examples of interrupts or exceptions used were: timer, floating point overflow, and page fault.)

Two kinds of interrupts or exceptions:

1) The interrupted instruction is retried after the interrupt is handled.
2) Although the interrupting event occurs in the middle of the currently executing instruction, the hardware makes that instruction complete all its work and sets the saved PC so that after the interrupt is handled, the next sequential instruction is executed.
In either case, the variables manipulated by the code sequence can be modified by the instructions that execute as a result of the interrupt.

All hardware today suitable for general computing has instructions to "disable" and "enable" most interrupts.  On the PC (i486) architecture, the disable interrupt instruction is cli, and the enable interrupt instruction is sti.

When a condition causing an interrupt occurs while interrupts are disabled, the interrupt controller (a piece of hardware) "remembers" the interrupt and continues to present the interrupt signal to the CPU.   The interrupt will occur right after the interrupt enable instruction is executed.   (Some computers have levels of interrupts.  Each interrupt cause has a level.  When an interrupt occurs, the CPU interrupt level is set to the level of the interrupt's cause.  Now, interrupts with a lower or equal level are disabled but an interrupt with a higher level will be taken.  As before, all interrupts are "remembered" by the interrupt controller.)
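A user-level analogy (not kernel code) may make the "remembering" behaviour concrete. POSIX signals that are blocked with sigprocmask behave much like disabled interrupts: a signal raised while blocked is held pending and is delivered once it is unblocked. The sketch below assumes a POSIX system and a single-threaded process; the function name pending_signal_demo and the choice of SIGUSR1 are illustrative only and do not come from the lecture.

```c
#define _POSIX_C_SOURCE 200809L
#include <signal.h>
#include <stddef.h>

/* Counts how many times the handler has run. */
static volatile sig_atomic_t handler_runs = 0;

static void on_usr1(int signo)
{
    (void)signo;
    handler_runs++;
}

/*
 * Returns 1 if the signal was held back while blocked and delivered
 * after unblocking, 0 if not, and -1 on a setup error.
 * (Hypothetical helper written for this illustration.)
 */
int pending_signal_demo(void)
{
    struct sigaction sa;
    sigset_t block;

    sa.sa_handler = on_usr1;
    sa.sa_flags = 0;
    sigemptyset(&sa.sa_mask);
    if (sigaction(SIGUSR1, &sa, NULL) != 0)
        return -1;

    handler_runs = 0;
    sigemptyset(&block);
    sigaddset(&block, SIGUSR1);

    /* "Disable": block SIGUSR1, loosely analogous to cli. */
    if (sigprocmask(SIG_BLOCK, &block, NULL) != 0)
        return -1;

    raise(SIGUSR1);                        /* the "interrupt" arrives */
    int runs_while_blocked = handler_runs; /* handler has not run yet */

    /* "Enable": unblock SIGUSR1, loosely analogous to sti. */
    if (sigprocmask(SIG_UNBLOCK, &block, NULL) != 0)
        return -1;

    int runs_after_unblock = handler_runs; /* pending signal was delivered */
    return runs_while_blocked == 0 && runs_after_unblock == 1;
}
```

The analogy is loose: signals are a software mechanism managed by the kernel, whereas cli and sti act on the CPU itself and are privileged.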

Disabling interrupts has the advantage of being simple and efficient.  It is often used in low level kernel code (interrupt handlers are called with interrupts disabled).  Tanenbaum describes two disadvantages:

When the computer has more than one CPU, disabling interrupts on one CPU does not stop other CPUs from accessing memory.  So, variables accessed by code between the disable and enable operations might also be accessed by the other CPU.  In Linux, the solution to this problem is to use "spin locks" in addition to disabling interrupts.

Instructions to disable interrupts are privileged, so they don't work (and cause exceptions instead) when run by user processes.   If user processes used privileged mode, they could disrupt other processes or the whole system operation, either on purpose or by accident.  (In MSDOS, everything ran in privileged mode.)

There's another disadvantage:  If interrupts are disabled for too long, some I/O devices will lose data because the data is not transferred from their buffers quickly enough.  Kernel programmers who disable interrupts must make sure they are re-enabled quickly.
Way 2: Take advantage of store-synchronous operations when using lock variables:
Caution: Simpleminded applications of this idea don't work correctly.  It took 15-20 years for computer scientists to figure out "Peterson's solution" after the problem was first formulated.

In current computers, "store" machine instructions take a word of data to store together with the address of the memory word in which to store the data.   Such store instructions are atomic "store-synchronous" operations, which means that once they are started they complete.  They are never interrupted in a way that leaves only half a word stored.  (In the Java language specification, the details of which operations on variables are atomic are defined very carefully.  For example, stores of integers or characters are atomic, but stores of 64-bit long or double values are not guaranteed to be.  Writers of multithreaded floating point Java applications beware!   See http://java.sun.com/docs/books/jls/second_edition/html/memory.doc.html#28654 )

We will take advantage of the following fact about store-synchronous operations:  If, say, two threads run the operations "VarX := ValueA;" and "VarX := ValueB;" concurrently, then after they are both finished, the (shared) variable VarX will have a well-defined value that is either ValueA or ValueB, and that value will not change until a new write to VarX is performed.
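The fact above can be sketched in C. This is a minimal illustration, assuming POSIX threads and C11 atomics are available; the names race_two_stores, VALUE_A and VALUE_B are made up for the example. C11 atomic_store is used because it is the portable way to ask the compiler for an indivisible whole-word store.

```c
#include <pthread.h>
#include <stdatomic.h>
#include <stddef.h>

#define VALUE_A 0x11111111
#define VALUE_B 0x22222222

static atomic_int VarX;   /* the shared variable */

static void *store_value(void *arg)
{
    /* One atomic store: the whole word is written, never half of it. */
    atomic_store(&VarX, *(const int *)arg);
    return NULL;
}

/*
 * Runs "VarX := ValueA" and "VarX := ValueB" in two concurrent threads
 * and returns the final value of VarX, or -1 if a thread could not start.
 */
int race_two_stores(void)
{
    static const int a = VALUE_A, b = VALUE_B;
    pthread_t ta, tb;

    atomic_store(&VarX, 0);
    if (pthread_create(&ta, NULL, store_value, (void *)&a) != 0)
        return -1;
    if (pthread_create(&tb, NULL, store_value, (void *)&b) != 0) {
        pthread_join(ta, NULL);
        return -1;
    }
    pthread_join(ta, NULL);
    pthread_join(tb, NULL);

    /* Which store "won" depends on scheduling, but the result is
       always exactly one of the two stored values. */
    return atomic_load(&VarX);
}
```

Which of the two values is returned can differ from run to run; what never happens is a mixture of the two bit patterns.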

Simpleminded implementation of mutex lock and unlock that Fails,  ... and why:
 

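typedef volatile int Mutex;  /* C has no built-in "boolean" type; volatile makes each test reread memory */
// Mutex M==0 means it is unlocked, M==1 means it is locked
// NOTE: this lock is deliberately broken.  The test (operation 1) and the
// set (operation 2) are separate steps, so another thread can run between them.
void lock( Mutex *M )
{
    while( *M != 0 /*operation 1*/ )
        /*do nothing*/  ;
    *M = 1 /*operation 2*/ ;
    return /*operation 4*/ ;
}
void unlock( Mutex *M )
{
    *M = 0; /*operation 3*/
}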

Suppose thread A runs:  lock(&M); critical_region_A; unlock(&M);
and thread B runs: lock(&M); critical_region_B; unlock(&M);

The expected interleaving is:
(A1)(A2)(B1)(B1)(B1)...(B1)(A4) critical_region_A operations interleaved with (B1)s (A3)(B1)(B2) critical_region_B (B3)

However, the scheduler might switch to thread B right after test (A1) has found the mutex unlocked, so that thread A's next step, whenever it runs again, will be (A2):
(A1)(B1)(B2)(B4) some critical_region_B ops (A2)(A4) critical regions A and B running CONCURRENTLY (BAD!!) ....

What happened: In between the time thread A tested *M!=0 finding it false and the time A set *M to 1, thread B was scheduled and ALSO ran *M!=0 finding it false!! Hence, BOTH threads "think" the mutex is unlocked.
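The bad interleaving can be replayed deterministically in a single thread by executing the numbered operations by hand in the order (A1)(B1)(B2)(B4)(A2)(A4). This is only a simulation written for illustration: the struct sim_thread and the op* helper names are hypothetical, and no real scheduler is involved.

```c
/* State of one simulated thread while it executes lock(). */
struct sim_thread {
    int saw_unlocked;  /* what operation 1 observed: 1 if *M was 0 */
    int in_critical;   /* 1 once lock() has returned (operation 4) */
};

static int M = 0;      /* the shared mutex variable, 0 = unlocked */

/* Operation 1: test the mutex and remember the result. */
static void op1_test(struct sim_thread *t) { t->saw_unlocked = (M == 0); }

/* Operation 2: set the mutex, but only if the earlier test passed. */
static void op2_set(struct sim_thread *t) { if (t->saw_unlocked) M = 1; }

/* Operation 4: return from lock() and enter the critical region. */
static void op4_return(struct sim_thread *t)
{
    if (t->saw_unlocked)
        t->in_critical = 1;
}

/* Replays the failing schedule and returns how many threads
   ended up inside their critical regions at the same time. */
int replay_bad_interleaving(void)
{
    struct sim_thread A = {0, 0}, B = {0, 0};

    M = 0;
    op1_test(&A);     /* (A1) A sees the mutex unlocked          */
    op1_test(&B);     /* (B1) B ALSO sees the mutex unlocked     */
    op2_set(&B);      /* (B2) B locks it                         */
    op4_return(&B);   /* (B4) B enters its critical region       */
    op2_set(&A);      /* (A2) A acts on its stale test result    */
    op4_return(&A);   /* (A4) A enters its critical region too   */

    return A.in_critical + B.in_critical;   /* 2: exclusion is violated */
}
```

The key point is that thread A's decision in (A2) rests on a test result that was already out of date by the time it was used.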

(This discussion formalises the remarks under Lock Variables, page 104.)

We will see how "strict alternation" maintains condition 1 at the expense of condition 3.

We will see that careful usage of essentially 3 variables enables Peterson's solution to achieve mutual exclusion without alternation in a system that merely has synchronous store operations.
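As a preview, here is a sketch of the usual textbook form of Peterson's solution for two threads, using three shared variables (two "interested" flags and one "turn" variable). It is not taken from the lecture. It assumes POSIX threads and C11 atomics; sequentially consistent atomic operations are used because plain loads and stores may be reordered by modern compilers and CPUs, which would break the algorithm. The names peterson_enter, peterson_leave, peterson_demo and the iteration count are illustrative.

```c
#include <pthread.h>
#include <sched.h>
#include <stdatomic.h>
#include <stddef.h>

#define PETERSON_ITERATIONS 100000

static atomic_int interested[2];  /* interested[i] == 1: thread i wants in */
static atomic_int turn;           /* which thread must wait on a tie       */
static long shared_counter;       /* data protected by the lock            */

static void peterson_enter(int self)
{
    int other = 1 - self;
    atomic_store(&interested[self], 1);  /* announce interest           */
    atomic_store(&turn, other);          /* let the other thread go first */
    /* Wait only while the other thread is interested AND has priority. */
    while (atomic_load(&interested[other]) && atomic_load(&turn) == other)
        sched_yield();                   /* busy wait, but give up the CPU */
}

static void peterson_leave(int self)
{
    atomic_store(&interested[self], 0);
}

static void *peterson_worker(void *arg)
{
    int self = *(const int *)arg;
    for (int i = 0; i < PETERSON_ITERATIONS; i++) {
        peterson_enter(self);
        shared_counter++;                /* critical region */
        peterson_leave(self);
    }
    return NULL;
}

/* Runs two workers and returns the final counter, or -1 on error. */
long peterson_demo(void)
{
    static const int ids[2] = {0, 1};
    pthread_t t[2];

    shared_counter = 0;
    atomic_store(&interested[0], 0);
    atomic_store(&interested[1], 0);
    atomic_store(&turn, 0);

    if (pthread_create(&t[0], NULL, peterson_worker, (void *)&ids[0]) != 0)
        return -1;
    if (pthread_create(&t[1], NULL, peterson_worker, (void *)&ids[1]) != 0) {
        pthread_join(t[0], NULL);
        return -1;
    }
    pthread_join(t[0], NULL);
    pthread_join(t[1], NULL);
    return shared_counter;
}
```

If mutual exclusion holds, no increment is lost and the counter ends at exactly twice PETERSON_ITERATIONS. Note that neither thread is forced to alternate: a thread whose partner is not interested enters immediately.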

Way 3: Use special atomic "test-and-set", "compare-and-exchange" etc. built into modern computer instruction sets to help OS writers implement synchronization
These instructions or particular sequences of instructions (for example, the "lock" prefix followed by "cmpxchg" in the IA-32) make 2 or more memory accesses, but the hardware prevents any concurrent access to the targeted memory locations between them. See the textbooks for information on how to use them.
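A minimal sketch of a spin lock built on test-and-set follows. It assumes C11, whose atomic_flag_test_and_set is typically compiled to an atomic hardware instruction of this kind (the exact instruction depends on the target CPU), and POSIX threads for the demonstration. The names spin_lock, spin_unlock, tas_demo and the thread and iteration counts are illustrative.

```c
#include <pthread.h>
#include <sched.h>
#include <stdatomic.h>
#include <stddef.h>

#define TAS_THREADS    4
#define TAS_ITERATIONS 50000

static atomic_flag lock_flag = ATOMIC_FLAG_INIT;  /* clear = unlocked */
static long tas_counter;                          /* protected data   */

static void spin_lock(void)
{
    /* Atomically set the flag and get its OLD value in one step.
       Old value set means someone else holds the lock: try again. */
    while (atomic_flag_test_and_set(&lock_flag))
        sched_yield();
}

static void spin_unlock(void)
{
    atomic_flag_clear(&lock_flag);
}

static void *tas_worker(void *arg)
{
    (void)arg;
    for (int i = 0; i < TAS_ITERATIONS; i++) {
        spin_lock();
        tas_counter++;        /* critical region */
        spin_unlock();
    }
    return NULL;
}

/* Runs TAS_THREADS workers and returns the final counter, or -1 on error. */
long tas_demo(void)
{
    pthread_t t[TAS_THREADS];
    int started = 0;

    tas_counter = 0;
    atomic_flag_clear(&lock_flag);
    while (started < TAS_THREADS) {
        if (pthread_create(&t[started], NULL, tas_worker, NULL) != 0)
            break;
        started++;
    }
    for (int i = 0; i < started; i++)
        pthread_join(t[i], NULL);
    return started == TAS_THREADS ? tas_counter : -1;
}
```

Unlike the simpleminded lock, the test and the set here cannot be separated by another thread, so two threads can never both observe the flag as clear.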
Closing Remark
Higher level synchronization functions (mutex locks, semaphores, monitors, wait queues) are typically provided in modern systems. They are provided for user level processes and threads by system call libraries, or by language features of interpreted languages like Java.

Similar (but not THE SAME) functions are provided INSIDE OS kernels for use by other kernel code. They are used in 2 ways:
(1) To provide synchronization and mutual exclusion for the operation of the kernel.
(2) To enable kernel programmers to implement the synchronization system calls called from the above libraries.

These kernel functions will then be IMPLEMENTED with some of the 3 ways of doing low level synchronization described above. The kernel will also use store-synchronous variables or special atomic synchronization instructions within spin locks that provide practical short intervals of mutual exclusion between kernel threads, both on the same and on separate processors.