cs357 Lecture Notes Spring 2000 Week 8, Tuesday For today you should have picked up homework 5, scheduled meeting times with your partner, and gotten started on at least the first experiment. Homework 5 is due this Thursday. Today (and Thursday)... 1) Chapter 5: CPU Scheduling 2) Homework 5 intro 3) Exam discussion 4) Homework 4 results Homework 4 results ------------------ Virtual address space looks like what we thought. (Draw the picture!) Children's stacks have about 2 MB each. (Express values in human-comprehensible terms!) Can't get them to crash, segmentation fault happens first. (Most people were good about reporting accurately, even if they did not have an explanation.) (Do a little math: how many recursions before the seg fault? Is that enough to consume 2 MB?) Tug of war: lousy results! (Plot run time versus n!) CPU Scheduling continued... Goals ----- 1) minimize the average turnaround time for all processes 2) satisfy real-time constraints, like requirements for interactive behavior 3) maintain high utilization of the CPU and the other hardware by overlapping the CPU of one process with the I/O of another 4) behave predictably and fairly Strategies (heuristics) ----------------------- FIFO: first in first out pro: simple, and meets some definition of fairness con: bad average turnaround when short waits for long no priority mechanism: hard to meet real-time constraints SJF: shortest (time to completion or blocking) first pro: provably optimal (minimal) average turnaround time for a set of processes all available at t0 con: we generally don't know the duration of each process's next CPU burst, and also the model assumed in the proof does not consider the dependency of successive bursts Round Robin: give each process a little bit of time pro: generally good when there is large variability in duration con: overhead becomes signficant as quantum gets smalle pessimal when jobs are the same length External priorities: pro: satisfies real time requirements con: burden on users/programmers starvation / bidding wars Multi-level feedback queue: pro: reduces user burden con: can get fooled, manipulated occasional pathological behavior (example: priority increases with I/O; what happens to a thrashing process?) can be expensive to keep track of needed info Non-deterministic scheduling: Example: lottery scheduling. Problem: for any deterministic system you can contruct pathological cases that yield bad performance Solution: a little randomness goes a long way Con: unpredictable execution model for programmers Any idea what Linux is doing? Intro to Homework 5. Real-time scheduling -------------------- When processes arrive, we generally need to know 1) their resource requirements 2) their deadlines 3) sometimes a cost function (how bad would it be if we missed the deadline) The goal of the real time scheduler is either 1) find any schedule that misses no deadlines (or maybe the best such schedule) 2) if not possible, minimize the total cost of all misses Some real-time schedulers involve recurring tasks, in which case they sometimes break the time line into cycles that are the LCM of the cycle times of recurring tasks. In the case where there is guaranteed to be a schedule that misses no deadlines, and the cost of a context switch is zero, there is a simple algorithm that is guaranteed to find a successful schedule: Work on the Process with the Nearest Deadline First Sound familiar? Problem: if the schedule is untenable, this algorithm is often pessimal. Dependencies ------------ Complication: there are often dependencies among tasks. 1) the sooner you issue an I/O command, the sooner it completes 2) users might not submit the next process until this one completes These dependencies limit the usefulness of simple queueing models. Instead, designers use heuristics like: 1) give processes priority if you expect them to do I/O soon (and why might you expect that?) 2) degrade the priority of things that have been running for a long time a) they are less likely to be interactive b) the relative cost of delays is declining Evaluation techniques --------------------- 1) Analysis: only works for very simple strategies/models 2) Simulation: useful for checking out wide range of possibilities. depends on workload model or traces 3) Implementation: most reliable, but also depends on workload (Which do Waldspurger and Weihl do?) (What is their workload or workload model?) (What is their performance metric?) Exam 2 solutions ---------------- Part One: Fork void recurse_and_fork (int n) { pid_t pid; if (n == 0) return; printf ("Hello from level %d\n", n); pid = fork (); recurse_and_fork (n-1); } int main () { recurse_and_fork (3); } (4 points) Is this program legal? If not, why not? If so, what is the output? Legal. The output is Hello from level 3. Hello from level 2. Hello from level 2. Hello from level 1. Hello from level 1. Hello from level 1. Hello from level 1. (2 points) How many child processes did the program create? 3000 (2 points) Did the child processes run concurrently with the parent process? Yes. (2 points) Did the child processes run concurrently with each other? No. (2 points) Name all the system calls that are invoked by the parent process inside the loop. fork, wait (2 points) Name all the system calls that are invoked by the child processes. exit (3 points) How much system time is charged to the parent process for each child it creates? 0.41 / 3000 = 140 us (3 points) How much system time is charged to each child process? 0.88 / 3000 = 290 us (4 points) Based on these results, do you think the child processes are being charged for some part of the fork and wait system calls? Give evidence to support your claim, but also indicate any alternative explanations. It seems likely that the child is being charged for part of the fork time (or maybe the wait time), since 290 us is a lot for just exiting, and it seems unlikely that it takes longer to do nothing and then exit than it does to create a new process. (4 points) Let's say that you ran the program again and got the following result... What changes, if any, would you make to the answers you have given above (questions 2-9)? I would modify my estimates to reflect the variability in the results. Instead of saying 140 us, I would say "somewhere in the range from 90 us to 140 us". (4 points) If you needed a more accurate estimate of the per-process costs of forking, how would you do it? Increasing the number of processes would yield a more precise estimate of what we are already measuring, but it would not yield a more accurate estimate of the fork time. To do that I would have to redesign the experiment to get at just the fork time. *Part Two: True or False When a process executes an I/O operation, it moves from the ready state into the waiting/blocked state. If two threads make unsynchronized access to a shared variable, then the program is nondeterministic. A context switch between threads of the same program is generally slower than a context switch between processes. When an interrupt occurs, the running process moves into the ready state, and the scheduler must choose a new process to run. Interprocess protection prevents one thread from reading and writing the memory of other threads. The sum of a process's CPU time and system time is usually roughly equal to its wall clock time. In a virtual memory system, a process can read and write any address in the physical address space. All are false, except Number 4, which was sufficiently ambiguous that I decided to accept either answer. *Part Three: Short answer questions (4 points) Assume that the quantum allocated to a process is 2 ms, and that each time the timer exipires it takes the OS 35 s to handle the interrupt, select a process and start it. How much overhead would be imposed on a single process running by itself? 35 us out of every 2 ms + 35 us = 1.72 % So many people misinterpreted this question that I decided not to count it. (4 points) Linux implements fork using copy-on-write. Other systems provide versions of fork that copy all the pages in the parent's address space at the time of the fork. Name one advantage of each scheme. Advantages of copy-on-write: fork time is reduced, child can begin execution sooner, any pages that are never copied yield net benefit. Advantages of copy-at-fork-time: simpler to implement, no need to incur overhead of future traps. Note: neither one _necessarily_ yields less overhead; it depends on how many of the pages are eventually copied. (4 points) How does a virtual address system facilitate the implementation of copy-on-write? By marking a page read-only you can force future writes to cause a trap, giving the OS a chance to copy the page, update the page table, and return control to the process. (4 points) How can you tell whether an operation like x++ is atomic? Why might it be important to know? The only way to tell is by looking at the implementation or documentation, or possibly by testing it many times for non-atomic behavior (although this is a one-way test). It is important to know you have to know in order to tell whether a program that contains x++ is deterministic. (4 points) What is the difference between a process and a thread? Processes have their own address spaces. Threads share part of their address spaces, usually the text, static segment and heap. (4 points) Name one advantage and two disadvantages of user-level threads over kernel-level threads. This comes right out of the book, page 105 Advantage of user-level: faster context switch (no kernel intervention required) Disadvantages: unfair scheduling one thread's I/O blocks the whole process * Part Four: Synchronization (4 points) Initial state: x = 1 Thread A Thread B -------- -------- x++ x++ x++ print x Output = 2, 3 or 4 (4 points) Initial state: x = 1, y = 3 Thread A Thread B -------- -------- x = y y = x print x x = 7 print y Output = 11, 33, 71, 73 (4 points) Initial state: x = 5 Thread A Thread B -------- -------- while (x == 5) { x = 6 // wait print x } x = 7 print x Output = 77, 67