cs357 Lecture Notes Spring 2000 Week 9, Tuesday For today you should have read Chapter 6, Sections 6.1 through 6.4 You should have picked up hw6, scheduled meeting times with your partner, and gotten started on at least the first experiment. Homework 6 is due next Tuesday. For Thursday you should read the POSIX handout about mutexes. Also you should read the rest of Chapter Six. Wednesday: cs fete! Chapter 6: Synchronization -------------------------- In general we cannot make any claims about the order in which concurrent code runs. Thread A Thread B -------- -------- a1 b1 a2 b2 We can say for sure that a2 > a1 > means "happens after" We can sat for sure that b2 > b1 But, we cannot compare any of the as to any of the bs. Shared variables can impose constraints on the ordering. On the exam we saw something like: Thread A Thread B -------- -------- b1 while (x == 5) { x = 6 // wait b2 } a1 Now, if we can prove that the initial state of x is not 5 and that B is the only thread that can make the value of x be 5, then we can prove that a1 > b1. But we can still make no claim about a1 <> b2. This is an example of a simple form of synchronization. It is not a particularly useful form because it only allow us to make a few simple claims, and it is not easy to generalize it to solve the problems we are interested in. synchonization: together time any mechanism that creates a temporal relationship between statements of distinct threads Mutual exclusion ---------------- Remember that if two threads access a shared variable concurrently, the outcome is often nondeterministic. (Except in special cases, e.g. all accesses are reads, or all writes are writing the same value.) Non-deterministic is not always bad. But correctness of programs often demands determinism. The fundamental source of the problem is that we cannot control when interrupts happen. Any non-atomic operating might get interrupted. Interrupts off! -------------- One simple solution is to turn interrupts off before doing something non-atomic and potentially non-deterministic: Thread A Thread B -------- -------- temp1 = x temp2 = x x = temp1 + 1 x = temp2 + 1 Obviously it would be bad if either thread got preemtped. between statements. Thread A Thread B -------- -------- interrupts off interrupts off temp1 = x temp2 = x x = temp1 + 1 x = temp2 + 1 interrupts on interrupts on This solves the problem, but it raises some difficulties: 1) we don't want to give users the ability to turn off interrupts a) it would violate CPU protection, because they could run forever b) it would violate interprocess protection, since if a process neglected to turn interrupts back on, it would crash the OS c) if a process leaves interrupts off too long, it might cause an I/O buffer to overflow, causing I/O operations to fail. 2) it's hard to turn interrupts off on a multiprocessor Nevertheless, this is sometimes an appropriate mechanism inside the kernel, where other mechanisms might not be available, or might be unacceptably inefficient, and where the code is more carefully controlled. Mutual exclusion problem ------------------------ Before we look at other solutions, it is useful to state the problem more carefully. For many problems, what we really want is not necessarily that threads be immune from interruption. There is a more conservative requirement that serves the same purpose, but might be cheaper to implement. We can identify sections of code in different threads, called critical sections, that should never run concurrently with each other. In the previous example, the critical section is between "enter" and "leave". Thread A Thread B -------- -------- enter enter temp1 = x temp2 = x x = temp1 + 1 x = temp2 + 1 leave leave The requirement is that if one of these threads is running in its critical section, then the other one (or others) cannot. This is sufficient to guarantee determinism, but it is weaker than the no interrupt requirement, since we are free to interrupt A in its critical section if necessary. If we do, we are free to run any other thread, including B, as long as B does not try to enter its critical section. In that case --- A was running in its critical section and was interrupted, and now B is trying to get in --- B will have to wait at "enter" until A has resumed and left the critical section. Software solution for two processes ----------------------------------- The book's algorithm 3 flag[i] indicates that thread i would like to enter it's critical section. Thread A Thread B -------- -------- i = 0 i = 1 // my id j = 1 j = 0 // other id Enter section flag[0] = true flag[1] = true // set my flag turn = 1 turn = 0 // set turn to other while (flag[1] && while (flag[0] && // wait turn = 1) turn = 0) Critical section assert (!flag[1] || assert (!flag[0] || turn = 0) turn = 1) temp1 = x temp2 = x x = temp1 + 1 x = temp2 + 1 Laeve section flag[0] = false flag[1] = false Can we prove this is correct? Only thread i writes flag[i], so flag[i] is not a shared variable. Turn is a shared variable and both threads write it concurrently, so the outcome is non-deterministic, but it doesn't matter which value "sticks". Inside the critical section, both threads can assert that "either the other thread is not trying to enter or the other thread is trying to enter but it is my turn." This proves that mutual exclusion is guaranteed. In addition, we might want to prove the properties of progress and bounded waiting. Progress: if a thread wants to enter, it might have to wait for another thread that is trying to enter, but it shouldn't have to wait for a thread that is not trying to enter (this is the problem with Algorithm 1, which requires strict alternation) Bounded waiting: if a thread is waiting to enter, there has to be a bound on the number of times other threads enter before it does (can't break ties by choosing the thread with a smaller id, for example) Bakery algorithm ---------------- We can generalize the previous solution to n processes, at the cost of some additional complexity. It is hard to solve this problem in software, and this is a relatively simple synchronization problem! Can we get a little help from the hardware? Lock ---- Yes! If we can read and write a shared variable atomically, then we can build a lock. A lock is a shared value plus two operations: aquire (lock): when control returns from this method, we know that we have the lock (and no one else does) release (lock): we're done, and someone else can have the lock. Here's a very simple implementation of a lock that turns out not to be correct: typedef struct { int value; } Lock; Lock *make_lock () { Lock *lock = (Lock *) malloc (sizeof(Lock)); lock->value = 0; return lock; } void acquire (Lock *lock) { while (lock->value == 1); lock->value = 1; } void release (Lock *lock) { lock->value = 0; } Why isn't this correct? What could go wrong? What if we could read a value from memory and write it at the same time? (Why does the value have to be in memory?) The Intel x86 instruction set provides Bit Test and Set For fun, I compiled lock.c to get lock.s gcc -S lock.c yields acquire: pushl %ebp // save the value of the reg movl %esp,%ebp // get the stack pointer nop .p2align 4,,7 .L3: movl 8(%ebp),%eax // move lock into reg cmpl $1,(%eax) // compare lock->value to 1 je .L5 // if it's 1, jump to bottom jmp .L4 // otherwise, jump out .p2align 4,,7 .L5: jmp .L3 // jump from bottom to top .p2align 4,,7 .L4: movl 8(%ebp),%eax // move lock into reg movl $1,(%eax) // set lock->value to 1 .L2: leave // restore the saved reg, I think ret I rewrote it as follows: acquire: pushl %ebp movl %esp,%ebp nop movl 8(%ebp), %eax // move lock into reg (once!) .p2align 4,,7 .L2: btsl $0,0(%eax) // bit test and set lock->value jc .L2 // if value = 1, jump to top .p2align 4,,7 .L3: leave // if we get here, we know that ret // value was 0 and that we set it I couldn't help doing a little optimization while I was in there. Note: this solution does not provide bounded waiting. The book shows how to augment it. Downside: atomic test and set is hard to implement on multiprocessors.