Software Systems Spring 2008 For today, you should have: 1) Started on Homework 4 2) Skimmed Chapters 14 and 15 of the Cow Book. 3) Done the readers-writers problem. 4) Read the handout from Standish 5) prepared for a quiz Outline: 1) Selected exam solutions. 2) Proposal feedback. 3) Homework 4. 4) Memory management. For next time you should: 1) Solve the next semaphore puzzle (no starve readers-writers). 2) Read about the Lea allocator and answer questions below. 3) Work on Homework 4. 4) Work on your project. Exam 1 ------ Total points = 56. 50+ 1 45-49 6 40-44 2 35-39 1 30-34 1 The exam was a little long and replace.c was too hard, so overall I think the scores are good. One trouble spot: arithmetic and sweating the details. Next week's quiz: arithmetic all-or-nothing! sim.c ----- What is a Heap, and why is it useful for discrete-event simulation? http://en.wikipedia.org/wiki/Heap_%28data_structure%29 Code review: sim.c has no great surplus of comments :) But first, a comment by Guy Steele, Jr, a prominent programming language designer. from http://www.dreamsongs.com/ObjectsHaveNotFailedNarr.html Fred Brooks, in Chapter 9 of The Mythical Man-Month, said: Show me your flowchart and conceal your tables, and I shall continue to be mystified. Show me your tables, and I won't usually need your flowchart; it'll be obvious. That was in 1975. Eric Raymond, in The Cathedral and the Bazaar, paraphrased Brooks' remark into more modern language: Show me your code and conceal your data structures, and I shall continue to be mystified. Show me your data structures, and I won't usually need your code; it'll be obvious. That was in 1997, and Raymond was discussing a project coded in C, a procedural language. But for an object-oriented language, I think this aphorism should be updated again: Show me your interfaces (the contracts for your classes) and I won't usually need your data structures; they'll be irrelevant. So, here are the field declarations and method interfaces in sim.c // EVENT typedef struct event { int type; double time, size; struct queue *queue; struct customer *customer; } Event; Event *make_event (int type, double time, Queue *queue) // HEAP typedef struct heap { Event **v; int next, size; } Heap; Heap *make_heap (int size) { int heap_is_empty (Heap *h) { void heap_add (Heap *h, Event *e) { Event *heap_peek (Heap *h) { Event *heap_remove (Heap *h) { // QUEUE typedef struct queue { double lambda, mu, rho; int busy; int length, size; Customer **customers; int first, next; } Queue; Queue *make_queue () int queue_is_full (Queue *queue) { int queue_is_empty (Queue *queue) { void add_to_queue(Queue *queue, Customer *customer) { Customer *get_from_queue (Queue *queue) { // CUSTOMER typedef struct customer { int id; double arrival, service, departure; } Customer; Customer *make_customer () // EVENT HANDLERS void handle_arrival (Event *e) { void handle_departure (Event *e) { void handle_observation (Event *e) { void handle_event (Event *e) { void start_queue (Queue *queue) void end_queue (Queue *queue) { Homework 4 ---------- We've been a little loose in our discussion of "average" values. Averaged across what? Customers? Time? How to compute time averages. 1) instrument handle_event 2) sample at regular intervals (not generally a good idea) 3) use PASTA and add observation events 4) use PASTA and use arrival events Why are the actual values different from the predicted values? 1) warm up and cool off 2) actual workload vs. nominal workload 3) randomness How can you deal with each of these factors? Getting your project started ---------------------------- WARNING: mixed metaphors ahead! Most projects have two phases, "wandering in the desert" and "on rails". The difference is whether, when you sit down to work on the project, you know what to do. Wandering is no fun, and it is very hard to get motivated for it. The difference between success and failure is getting through the wandering phase. Tips for successful wandering: 1) Keep several balls in the air, so you have useful work to do while you wait. And so you have the best chance of finding a fruitful path. 2) Some reading is good, but avoid the fallacy that passing your eyes over paper is productive work. Do _something_ concrete early. 3) Write as you go. Start writing your final report now. a) part of the introduction b) bibliography and related work. You will find it surprisingly useful to explain your project to yourself, and each other. 4) Use brainstorming techniques. I know they are corny, but they work. What does an Operating System do? --------------------------------- 1) abstraction timesharing is the illusion of unshared hardware VM creates the illusion of a uniform address space 2) coordination (of shared resources) VM also prevents processes from accessing each other's memory other devices also need coordination 3) allocation (of resources) scheduling is allocation of time page replacement is allocation of physical memory (between processes) memory management is allocation of memory within a process In the context of memory, "allocation" might refer to 1) the OS allocating an additional frame to a process. Allocation in units of (non-contiguous) pages. 2) a program allocating space in VM for a particular purpose. Allocation in units of (contiguous) bytes. Memory management ----------------- Within a process, there are several kinds of allocation. At compile time (static allocation): 1) the code generator lays out the program text in memory. 2) if the compiler can tell how big something is it can allocate at compile time in the static segment. At run time (dynamic allocation): 1) local variables are allocated on the stack usually the entire stack frame is allocated at once. 2) constructors allocate space in the heap variable size, contiguous allocation unpredictable Deallocation: 1) objects in the static section are never deallocated 2) stack frames are deallocated when functions complete normal functions complete in FILO order (a co-routine is a function whose call/complete pattern is not stack structured) 3) dynamically-allocated objects can be deallocated any time after the last access. How does the OS know? 1) the programmer might explicitly "free" the object. 2) the compiler might analyze the program and add deallocation code (not common in practice) 3) the run-time system might detect that an object cannot be accessed (how?) #1 is called explicit memory management #3 is called garbage collection. Explicit memory management: 1) potentially optimal. 2) error prone. 3) often in conflict with modularity. Garbage collection: 1) conservative (does not deallocate as soon as possible). sometimes needs a little help. 2) takes time. 3) used to be in conflict with real time constraints. Contiguous allocation --------------------- If the size of the things being allocated is always the same, contiguous allocation is easy (it reduces to the non-contiguous problem) Variable size allocation is challenging. Standish presents the baseline strategy 1) two-way linked list of free blocks 2) fit policy 3) coalescing Lea presents an alternative with better performance. Garbage collection ------------------ Two general strategies: 1) mark and sweep periodically stop the program and traverse the reference graph deallocate anything you _didn't_ visit 2) reference counting keep track of the number of references to each object when the last reference goes away, deallocate Reference counting 1) more overhead 2) cannot collect circular data structures (without help) Mark and sweep 1) used to impose noticeable delays current versions are incremental The Lea Allocator ----------------- Read the article at http://g.oswego.edu/dl/html/malloc.html I highly recommending printing it so you can mark it up. What are Lea's metrics for a good allocator? Which are the most important? What workloads does Lea think are most important? What are boundary tags, and how do they permit coalescing? Why would you ever want to defer coalescing? What are the two ways of allocating more space from the OS? What is the wilderness, and how is it different from other chunks?