Software Systems
Spring 2008

For today, you should have:

1) finished Homework 3

2) flipped through Chapter 11 of the Cow Book just to see what's there,
   and then read Chapter 12 and done Exercise 12-2.

3) read from Tanenbaum and answered the questions

Outline:

1) paging
2) reading questions
3) workload characterization
4) performance evaluation overview
5) project planning

For next time you should:

1) prepare for the exam

2) read Breslau et al, "Web Caching...", and answer the questions below.
   There will be a section about this paper on the exam.  You can work
   together on this, but I won't answer questions about the paper until
   after the exam.

   http://wb/ss/handouts/breslau99web.pdf

3) think about projects: proposal due Monday 3 March


Workload characterization
-------------------------

wget wb/ss/code/lifetimes.tgz
tar -xzf lifetimes.tgz
cd lifetimes
ls

ctc.lifetimes        # data from the SP2 at the Cornell Theory Center
t3e.batch.lifetimes  # data from the T3E at SDSC (batch jobs)
t3e.inter.lifetimes  # data from the T3E at SDSC (interactive jobs)
ucb.lifetimes        # data from workstations at Berkeley
process              # a script for processing the data

Here is the content of the script named process:

# replace this with the path to your copy of cdf
cdf=/home/downey/bin/cdf

for file in *.lifetimes
do
    echo $file
    $cdf -t logx $file > $file.cdf
    $cdf -t loglog $file > $file.ccdf
done

Edit the script so it knows where you put cdf, and then run it:

bash process

Let's start by looking at the data from UCB:

wc ucb.lifetimes
xgraph ucb.lifetimes.cdf

1) The x axis is log2(process lifetime in seconds).
2) Left of 0 is less than 1 second.
3) Resolution is coarse for short jobs.

What is the median lifetime?  What percentage of processes run longer
than 1 second?

xgraph ucb.lifetimes.ccdf

1) The vertical axis is now log2(Prob{lifetime > x}), that is,
   log2(1 - CDF(x)).
2) This view of the data makes the tail more visible.
3) A straight line on this graph is evidence of a Pareto distribution.
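As an aside: cdf here is the instructor's own tool, so if you don't have
a copy, a rough stand-in for the logx transform can be sketched with
sort and awk.  This sketch assumes each line of a .lifetimes file holds
one process lifetime in seconds; check that against the actual data
format, and note that the real tool's options and output may differ.

```shell
# Hypothetical stand-in for "cdf -t logx" (the real tool may behave
# differently).  Assumes one lifetime, in seconds, per input line;
# emits "log2(lifetime)  cumulative-fraction" pairs for xgraph.

printf '1\n2\n4\n8\n' > sample.lifetimes   # tiny made-up data file

sort -n sample.lifetimes | awk '
    { x[NR] = $1 }
    END { for (i = 1; i <= NR; i++)
              printf "%g %g\n", log(x[i]) / log(2), i / NR }
' > sample.lifetimes.cdf

cat sample.lifetimes.cdf
```

The loglog variant would be similar, printing log2(1 - i/NR) on the
y axis instead (and skipping the final point, where the complementary
CDF is zero and the log is undefined).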
What percentage of processes run longer than 2^5 seconds?

Now let's look at the data from CTC:

xgraph ctc.lifetimes.cdf
xgraph ctc.lifetimes.ccdf

What do we make of this?

We can also look at all the cdfs:

xgraph *.cdf
xgraph *.ccdf

What conclusions can we draw from these datasets?


Performance Evaluation
----------------------

Measurement

1) what do you want to know?
2) what environments do you care about?
3) what do you have access to?
4) what instruments do you have?

Workload characterization

1) what generalizations can we make from measurements?
2) what simplifications can we make?

Modeling

1) what features of the system do we need to include?
2) what can we leave out?
3) what kind of analysis/simulation are we planning?

Analysis

1) what can we derive/prove about the system?
2) how do our assumptions affect these results?

Simulation

1) if we relax the assumptions needed for analysis, do the results change?
2) are there additional results we can get?

Implementation

1) what do our results tell us about real systems?
2) if we have developed new algorithms, can we implement them?

Validation

1) does the implementation behave as we expect?
2) what metrics are important?
3) what benchmarks do we use?
4) how does the actual workload affect performance?


Project ideas
-------------

Strategies for finding a project:

0) Skim the parts of the textbook we haven't gotten to yet so you have
   an idea of what's coming.  In particular, the topics that are coming
   up are memory allocation and garbage collection, disk drive
   performance and scheduling, file system implementation, I/O
   implementation, TCP/IP performance, distributed hash tables
   (peer-to-peer network implementation), and possibly out-of-core
   algorithms.

1) If there is a part of an operating system, network or database that
   interests you, look it up in Wikipedia and follow pointers to
   additional reading.
2) If you read something in the press, or someplace like Slashdot, and
   you want to look deeper, that can lead to an interesting project.
   Example:

   http://www.theregister.co.uk/2005/02/18/intel_tcpip_attack/

3) If you have an idea for a hardware/software product, we can try to
   build and evaluate an implementation.

4) If you would like to start in on publishable work in this area, I
   can help you find the research frontier.

5) If there is a company that is doing something interesting, maybe we
   can make a deal!

Some ideas I have kicked around:

1) Perform a measurement study of some part of the Olin computing
   infrastructure.

   wireless network workload / performance
   laptop / hard drive reliability -- data loss modeling
   what can you measure with/without cooperation of infrastructure?

2) Design and implement a software system for local use.

   a backup/restore system for your laptops
   general interaction between laptops and other systems
   load sharing, storage sharing, distributed processing

3) Implement some OS feature on a microcontroller.

   process abstraction (timesharing, virtual memory)
   real-time scheduling
   joint POE project to build a boot loader for an enhanced PIC

4) Investigate and improve some part of Linux.

   where are the real performance bottlenecks?
   eliminate thrashing
   implement an application-specific memory allocation system
   (or evaluate existing ones)
   evaluate alternative file system implementations

5) Develop a higher-level abstraction.

   replace the file system abstraction with something like a
   searchable database

6) Perform a simulation study of a real-life "system".

   optimize some of the algorithms we use in real life
   example: virtual postal addresses
   time-motion study of the dining hall

7) Investigate alternative architectures.

   build or evaluate a system that uses flash memory instead of disk
   use a simulation study to evaluate hybrid drives


Web caching
-----------

Questions on Breslau et al, "Web Caching...":
1) Of the performance evaluation activities we have discussed (workload
   characterization, measurement, modeling, analysis, simulation,
   implementation, validation), which ones do the authors present in
   this paper?

2) Do the authors find that page access frequencies obey Zipf's law?

3) What is the primary analytic result this paper presents?  How does
   this contribution relate to prior work?

4) What is the relationship between the size of an object and how
   frequently it is accessed?

5) What conclusion is Figure 1 meant to support?

6) What conclusion is Figure 6 meant to support?

7) The analysis in Section III is based on a model the authors admit is
   unrealistic.  Describe one simplification this model is based on.

8) What are the four replacement algorithms the authors evaluate?

9) What conclusions can we draw from Figure 9?
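As background for question 2 (this sketch is not from the paper): a
Zipf-like distribution says the i-th most popular page is requested
with probability proportional to 1/i^alpha.  A quick look at the
normalized frequencies for a made-up example with 5 pages and alpha = 1:

```shell
# Background sketch of a Zipf-like popularity distribution (made-up
# example, not data from the paper): the i-th most popular page is
# requested with probability proportional to 1/i^alpha.
awk 'BEGIN {
    N = 5; alpha = 1
    for (i = 1; i <= N; i++) { w[i] = 1 / i^alpha; total += w[i] }
    for (i = 1; i <= N; i++) printf "rank %d: %.3f\n", i, w[i] / total
}'
```

Note how top-heavy this is: with alpha = 1, the single most popular
page accounts for over 40% of requests in this small example, which is
one intuition for why even a small cache can achieve a high hit ratio.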