Chapter 6
---------

6.1 Allocation
--------------

resource allocation: primarily link bandwidth and buffer space on routers

contention: one packet delays another

congestion: packets are getting dropped because of insufficient buffer
(queue) space

congestion control: preventing congestion, or dealing with it when it
happens

(why is routing around busy links insufficient to avoid congestion?)
(what's a connectionless flow?)
(what's the difference between soft state and hard state?)

Router-centric vs. host-centric
    Who is primarily responsible for detecting congestion and acting on it?

Reservation-based vs. feedback-based
    In virtual circuit networks, hosts reserve resources for each flow
    ahead of time, so congestion is avoided, at the cost of
    underutilizing resources.  In a connectionless network, it is
    generally not possible to avoid congestion; we can only deal with it.

Window-based vs. rate-based
    Window: you can send x bits now.  Await further instructions.
    Rate: you can send x bits per second until further notice.

Evaluation
----------

Measurements of a single flow:

Throughput: rate at which bits are getting through (averaged over an
interval of time)

Delay: time between leaving the source and arriving at the destination
host (averaged over a number of packets or bits)

Problem 1: the two goals conflict; how do we balance them?  One (not
very good) possibility: maximize throughput/delay.

Problem 2: these measurements pertain to a single flow; how do we
aggregate across flows?  Total throughput over all flows?  Average
delay?

The problem with most kinds of aggregation is that they neglect some
notion of fairness.  Unfortunately, fairness is not well defined
either.

6.2 Queueing
------------

queue discipline: who gets transmitted next?
    basic = FIFO

drop discipline: if the queue is full, who gets kicked out?
    basic = drop the last one in (tail drop)

Priority queueing: multiple FIFO queues
    Problems: 1) starvation  2) allocating priority

6.2.2 Fair queueing
-------------------

1) Separate queue for each flow
2) Round-robin service -- one packet from each queue

(why isn't that adequate?)
(how, ideally, would we fix it?)
(why is the ideal not feasible?)

FQ = approximates bitwise round-robin, but maintains packet boundaries.

WFQ = each flow gets a weight, which is abstractly the number of bits
per rotation.

What does "separating policy from mechanism" mean?

6.3 Congestion
--------------

TCP congestion control: send packets, react to events
    events = timeouts and obstinate (i.e., duplicate) ACKs

congestion collapse:
    1) blast the network
    2) everything gets dropped and times out
    3) go to 1

self-clocking: ACK arrivals pace new transmissions; the protocol is
dynamic and usually has no initial information about flows or the
network

6.3.1 Additive increase/multiplicative decrease
-----------------------------------------------

AdvertisedWindow: protects the receiver; provided by the receiver
CongestionWindow: protects the network; continuously estimated

Fundamental assumption: if a packet is dropped, it is probably due to
congestion.

(how do we know when a packet is dropped?)

Another fundamental assumption: things are constantly changing, but
they are not changing TOO fast (information about recent history is
still relevant).

After every successful transfer, increment the CW.  After every
dropped packet, halve the CW.  This turns out to be a necessary
condition for stability.

Intuition: being too high is much worse than being too low.  Need to
get out fast!

6.3.2 Slow start
----------------

Increase exponentially, rather than linearly.

(What exactly is this slower than?)

This happens at the beginning of a connection.
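A minimal sketch (not real TCP) combining the AIMD rules of 6.3.1 with slow start's exponential growth.  Windows are counted in packets, and the threshold and starting values here are illustrative assumptions, not taken from the notes:

```python
def rtt_update(cwnd, ssthresh):
    """Advance the congestion window by one round trip of successful ACKs."""
    for _ in range(int(cwnd)):        # one ACK per packet in flight
        if cwnd < ssthresh:
            cwnd += 1                 # slow start: window doubles per RTT
        else:
            cwnd += 1 / cwnd          # additive increase: roughly +1 per RTT
    return cwnd

def on_drop(cwnd):
    """Multiplicative decrease: halve the window on a detected loss."""
    return max(cwnd / 2, 1)

cwnd = 1.0
for _ in range(3):                    # three clean round trips from cold start
    cwnd = rtt_update(cwnd, ssthresh=8)
print(cwnd)           # 8.0 -- exponential growth: 1 -> 2 -> 4 -> 8
print(on_drop(cwnd))  # 4.0 -- a loss halves the window
```

Note how the stability argument shows up in the asymmetry: growth is gradual (additive per RTT once past the threshold), but the reaction to a drop is immediate and multiplicative.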
Slow start also happens when a timeout occurs:
    1) threshold = target window = 1/2 current window
    2) current window = 1
    3) increase current window exponentially until we get back to the target
    4) increase linearly thereafter

(why are so many packets lost during startup?)
(tradeoff between aggressive start and linear start)

Is there a better way to get an initial estimate of the available
bandwidth?  Maybe -- packet-pair.

6.3.3 Fast retransmit
---------------------

How do we usually detect a lost packet?  What's the alternative?

1) the receiver sends an ACK after every arrival, but only
   acknowledges data that arrived in order
2) out-of-order arrivals yield obstinate ACKs
3) obstinate ACKs probably indicate loss
4) and we can tell what's missing

Fast recovery
-------------

After a fast retransmit, just set current window = 1/2 current window
(skipping slow start).

6.4 Congestion avoidance
------------------------

DECbit: changes both routers and hosts
RED: changes just the routers
Vegas: changes just the hosts

Which can we implement?  Which can Cisco implement?  Which can no one
implement?

6.4.1 DECbit
------------

(what does DEC stand for?)

packet headers have a bit, initially zero
(they call it a congestion bit; it should be a contention bit)

If the packet goes through a busy router, the bit gets set.  The bit
gets copied into the ACK.  Hosts can keep track of how many of their
recent packets went through busy routers (on the outgoing path?).

How busy is busy?
    1) smoothed average of queue length
    2) arbitrary threshold (based on very careful analysis of an
       unreasonable workload and a meaningless performance metric)

How do hosts react?  If the percentage of marked packets is small,
increase the window, and vice versa.  Again, there are free parameters
that have to be tuned.

6.4.2 RED
---------

Random early detection.  Routers monitor their utilization and send
implicit messages to hosts.

(What's an implicit message?)

Again, we have a smoothed queue length.  If it is small, never drop.
If it is large (but the queue is not yet full), always drop.  In
between, there is some probability of dropping.  See figure on page
479.
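The drop rule just described can be sketched as follows.  The thresholds and maximum probability are made-up numbers, and real RED additionally spreads drops out by tracking how many packets have arrived since the last drop:

```python
import random

MIN_TH, MAX_TH, MAX_P = 5.0, 15.0, 0.1   # assumed parameters

def smoothed(avg, sample, weight=0.1):
    """Exponentially weighted moving average of the instantaneous queue length."""
    return (1 - weight) * avg + weight * sample

def drop_probability(avg_qlen):
    """Map the smoothed queue length to a drop probability."""
    if avg_qlen < MIN_TH:
        return 0.0                        # small: never drop
    if avg_qlen >= MAX_TH:
        return 1.0                        # large (but not full): always drop
    # in between: probability ramps linearly from 0 up to MAX_P
    return MAX_P * (avg_qlen - MIN_TH) / (MAX_TH - MIN_TH)

def should_drop(avg_qlen):
    return random.random() < drop_probability(avg_qlen)

print(drop_probability(4))    # 0.0
print(drop_probability(10))   # 0.05
print(drop_probability(20))   # 1.0
```

The smoothing matters: drops are driven by the average queue length, not the instantaneous one, so short bursts pass through while persistent queues get early, probabilistic drops.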
Even with RED, we might wind up in tail drop mode.  (How?)

Again, there are free parameters that have to be tuned.  That is hard
to do without a good network simulation, and a good simulation depends
on a good workload model.

Small optimization: on ATM, if you are going to drop one cell of an
AAL PDU, then you might as well drop them all.

6.4.3 Vegas
-----------

Entirely source-based: try to detect the incipient stages of
congestion.

1) observe an increase in RTT: queueing delays are increasing!
2) observe a leveling off in actual throughput, even as the congestion
   window increases

How can we estimate actual throughput?
    1) Divide the number of bytes outstanding in the network by the RTT.
    2) Choose a reference packet.  Record the time when it starts to
       transmit and when it gets ACKed.  Measure the number of bits
       that are sent during that RTT.

For a given congestion window, what throughput do we expect in the
absence of contention?

    ExpectedRate = CongestionWindow / BaseRTT

where BaseRTT is the shortest (possible? observed?) RTT.

    Diff = ExpectedRate - ActualRate

This is positive by definition.  If it's too big, we are spinning our
wheels.  If it's too small, we don't have enough data in the network,
which makes it harder to respond to transient increases in capacity
(so I've heard).  No clues about how to choose the thresholds.

Evaluation
----------

All hosts need not run the same congestion avoidance algorithm.  Some
are probably more polite than others.  Why not run "TCP Vogon"?
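The Vegas Diff computation above can be sketched as follows.  The window size, BaseRTT, measured rate, and the alpha/beta thresholds are all illustrative assumptions; as noted, the notes give no guidance on choosing the thresholds:

```python
def vegas_diff(cwnd_bytes, base_rtt, actual_rate):
    """ExpectedRate - ActualRate; positive when extra data is queueing."""
    expected_rate = cwnd_bytes / base_rtt    # ExpectedRate = CongestionWindow / BaseRTT
    return expected_rate - actual_rate

# assumed numbers: 32 KB window, 100 ms BaseRTT, measured 250 KB/s
diff = vegas_diff(32_000, 0.100, 250_000)
print(round(diff))    # 70000 bytes/s of data "spinning its wheels" in queues

ALPHA, BETA = 30_000, 60_000    # assumed thresholds, in bytes/s
if diff > BETA:
    action = "decrease window"  # queues are building; back off before loss
elif diff < ALPHA:
    action = "increase window"  # too little data in flight
else:
    action = "hold"
print(action)                   # decrease window
```

The point of the two thresholds is the band in between: Vegas deliberately keeps a small, nonzero amount of extra data queued, so it can notice both when queues grow and when capacity opens up.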