Software Systems
Spring 2008

For today you should have:

1) read Patterson's article:

   http://wb/ss/handouts/patterson04latency.pdf

   (wb stands for "whiteboard", which is an alternative to
   blackboard.  from off campus, you need the full name, which
   is wb.olin.edu or ece.olin.edu)

For next time:

1) do Homework 1

2) read the handout from LBoS and write a solution to the last puzzle

3) get your Linux partition squared away

4) read pages 11-16 of the cow book; you can get a scanned version from

   http://wb/ss/handouts/oualline97chap02.pdf

   compile and run hello.c

   Note on page 16, Step 4: to run the program you just compiled,
   you probably need to specify the path:  ./hello

5) get a hand-held calculator?

6) get two 3-ring binders, one for notes, one for LBoS

Outline:

1) class organization
2) LB model of data transfer
3) start in on Homework 1


Five topics, one uber-topic, five kinds of work
-----------------------------------------------

The topics are Operating Systems, Networks, Run-time systems,
Synchronization and System-level programming in C.

The uber-topic is performance evaluation, which includes workload
characterization, experimental design, measurement, modeling,
analysis, simulation, implementation, and verification.

The kinds of work are

reading: I will give you reading questions for the textbook
    and papers, and we will have reading exercises.

homeworks: reinforce the ideas and practice the techniques
    with short well-defined projects.

project: apply the techniques to a long-term, open-ended project.

synchronization puzzles: use sync primitives to solve
    mind-bending puzzles.

programming: we'll learn C and get a little programming practice.


LB model of data transfer
-------------------------

The LB model is based on the observation that in many systems,
the time to transfer a data object from one place to another
is roughly linear with the size of the data.
So we can characterize the line with:

latency: the time to transfer some standard size chunk,
    usually the smallest relevant size
    measured in units of time

bandwidth: the marginal rate at which additional data is sent
    measured in units of data size / time

Graphically, if we plot transfer time against size, latency is
the height of the line at size = min_size, and bandwidth is the
inverse of the slope of the line.

Complications:

1) depending on context, latency might measure a one-way
   transfer (in which case it is often called delay) or a
   round trip (in which case it is often called a round trip time)

2) bandwidth, strictly speaking, is measured in Hz, because it
   measures the width of a band, which is a range of frequencies.
   "data rate" or "capacity" would probably be more correct.

   The reason the terms are used interchangeably is that Shannon's
   theorem relates them:

       C = B log2 (1 + S/N)

   where:

       C is the maximum information-carrying capacity of a signal
         in bits/second
       B is the width of the band in Hz
       S/N is the signal-to-noise ratio (as a plain power ratio,
         not in dB)

   In other words, the data rate is the bandwidth multiplied by
   a factor that depends on noise.

3) The actual data rate an application achieves, which is
   sometimes called throughput, is often related to bandwidth,
   but the relationship can be complicated.

4) Some systems are only roughly linear, and some are not
   very linear at all.

   See notes01.fig1.eps

Nevertheless:

1) the LB model is used frequently.

2) it's often good enough (which is all we can ask from a model).

3) latency and bandwidth are usually the most important
   metrics of system performance.


Sometimes you only care about one of them
-----------------------------------------

For example, in networks:

1) interactive applications tend to send lots of short messages,
   so performance depends on latency.
2) moving large files tends to depend on bandwidth.
3) often startup depends on latency, steady state on bandwidth
   (which is why estimates of remaining time converge from above)

In operating systems:

1) sizes are chosen to amortize latency.
2) therefore, both parameters matter more often than by chance.

Mnemonic of the day:

"Bandwidth is for big things, Latency is for little things"