Chapter 9
----------

9.1 DNS
-------

DNS = Domain Name System

name space -- set of possible names
    flat = any string
    hierarchical = strings separated by delimiters
        (usually with restrictions on what can be in each field,
        or requirements for the number of fields, etc.)

bindings -- mapping between names and some other piece of information
    (what's the info in this case?)

resolution -- looking up a name and getting the info

centralized database vs distributed database
    1) centralized: obvious issues of scalability and availability
    2) distributed: obvious problem with consistency

9.1.1 Domain hierarchy
----------------

Like US mail addresses, ordered from most specific to most general.
Processed backwards.

Root of tree is the "big six" plus one for each country.
    edu, com, gov, mil, org, net
U.S.-centric (f**king A!)

Some political controversies over names of country domains.

9.1.2 Name servers
------------

Tree is divided into zones. A zone is often but not always a level
or a subtree. Each zone is usually an administrative domain.

May have multiple name servers per zone. Also, one name server
may "implement" multiple zones.
(What does it mean to say that a server implements a zone?)

A DNS entry is a 5-tuple
    <Name, Value, Type, Class, TTL>

Type says how to interpret Value. It is one of
    A      Value is the IP address of the Named host
    NS     Value is the name of a name server that should be asked
           about the host
    CNAME  Value is the canonical name for the host
    MX     Value is a host that will accept mail addressed to the
           given host

TTL is a time (really!) = how long we can cache this entry.

It might take several conversations with a name server to find
out what's what.

Interdomain abstraction
-----------------------

Aliases are used to hide the name of the host that provides
certain general services like web servers.

MX entries are used to hide the name of the machine that is
accepting mail. (Why?)
(Why two mechanisms? Why not have aliases for mail.wellesley.edu?)

9.1.3 Name resolution
---------------

How do we find the local name server?
Not all name servers have entries for all names, of course. If we
can find a server that knows the rightmost field of the destination
name, we can go from there.

Root name servers have entries for all the top level domains.
In any zone, there is a local name server that knows the address
of a root server.

The local name server acts as a proxy. It goes out and makes as
many queries as necessary. (How many are usually necessary?)
And then it reports back to you and also makes an entry in its
cache for future reference.

9.2 Applications
-----------

SMTP: simple mail transfer protocol
HTTP: hypertext transfer protocol
SNMP: simple network management protocol

All built on top of TCP or UDP.

What's in the transport protocol?
    1) who talks first when connection made
       a) or how reliability/connections are handled, if based on UDP
    2) how do we delimit requests and replies
    3) authentication, if any

What's in the companion format protocol?
    1) what is the format of a request?
    2) what is the format of a reply?
    3) what is the format of status/error information?

SMTP uses RFC 822
HTTP uses HTML
SNMP uses ASN.1/BER and MIB (yikes!)

9.2.1 Electronic mail
---------------

SMTP
    queries: HELO, MAIL FROM, RCPT TO, DATA, QUIT
    replies: 250 OK, 550 no such user here

RFC 822
    To:
    From:
    Date: date format

Reliable delivery
-----------------

Mail handler transfers to mail daemon (usually local).
Local daemon uses DNS to find a mail gateway that will accept
messages for that destination. Maybe it's the recipient host, but
usually it's a gateway in the recipient's domain.
(Why gateways?)

Different reliability models:

TCP/IP uses end-to-end explicit ACK. If the packet arrives, you
get an explicit ACK. If not, you might get an error message.

SMTP guarantees delivery or error message within 5 days.
Guarantee is made one hop at a time. When a gateway accepts
a message for delivery, it should write it to disk before it
acknowledges it. (Why?) (Failure modes?)

9.2.2 HTTP
----

Transfer protocol:
    queries: GET, HEAD, PUT, DELETE
    replies: 1xx, 2xx, 3xx, 4xx, 5xx

URL format

HTML: hypertext markup language
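
A minimal sketch of the transfer-protocol side of HTTP, assuming HTTP/1.1 framing: a request is a request line plus header lines, each ended by CRLF, and a blank line marks the end of the request. The first digit of the reply code gives its class. The function names and the host www.example.com are made up for illustration; nothing here touches the network.

```python
# Reply classes keyed by the first digit of the status code.
STATUS_CLASSES = {
    1: "informational",
    2: "success",
    3: "redirection",
    4: "client error",
    5: "server error",
}


def build_request(method: str, path: str, host: str) -> bytes:
    """Build a bare-bones request: request line, two headers, blank line."""
    lines = [
        f"{method} {path} HTTP/1.1",
        f"Host: {host}",
        "Connection: close",
        "",  # blank line: this is how the end of the request is delimited
        "",
    ]
    return "\r\n".join(lines).encode("ascii")


def parse_status_line(line: str) -> tuple:
    """Split 'HTTP/1.1 404 Not Found' into (version, code, reason)."""
    version, code, reason = line.split(" ", 2)
    return version, int(code), reason


def status_class(code: int) -> str:
    """Map a reply code such as 404 to its class via the leading digit."""
    return STATUS_CLASSES[code // 100]
```

This answers "how do we delimit requests and replies" for HTTP: with line endings and a blank line, not with length fields.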

Opinion
-------

HTTP/HTML are a fundamentally broken protocol pair because
they are currently being used in ways that undermine their
underlying principles:

1) URLs are supposed to identify objects (content)

   extensions to HTTP/HTML, like frames, have undermined that

   many site managers prefer not to give clients references to
   internal objects that might go away, because request failures
   look bad

   but the alternative is a poorer object identification system

2) HTML is a markup language, not a page description language

   markup = "This is a heading, make it whatever font you are
   using for headings. Leave some space above and below it."

   page description language = "make this 12 point Times and
   put it exactly here"

   LaTeX is a ML, PostScript is a PDL

HTTP performance
----------------

persistent connections
    1) less TCP startup/shutdown overhead
    2) (what's the second, more important reason?)

Who has to work harder? Who shuts down? When?

Caching: many levels
    browser, local, ISP

    Browser: check cache before requesting
    Local: browser sends explicit request to local proxy
    ISP: router has to snoop and intercept

At any level, the cache is responsible for consistency:
    1) pages have expiration dates
    2) proxies can make HEAD requests to check for changes
       or use conditional GET

9.2.3 SNMP
----

This is a pretty general recipe for a distributed database:

Transport: GET, SET
Format: names of variables, format of results

Names of variables defined by MIB (management information base)
Format of results defined by ASN.1 (Chapter 7)
    Abstract Syntax Notation One
    Solves the big-endian, little-endian problem.
    Floating-point formats? Marshalling of structures?

9.3 Multimedia Applications
-----------------------

Way back in Chapter 1, we talked about the semantic gap between
the basic service provided by the network and the set of services
commonly needed by applications.

UDP provides a pretty bare-bones process-to-process TP.
TCP adds many of the functions most applications need.
Multimedia applications fall in between. They need more than UDP,
but TCP contains some things that are unnecessary, some that are
implemented inappropriately and some that are undesirable.
(Name one of each)

One solution to this problem is for multimedia applications to
build their own protocols on top of UDP (the same way HTTP and
SMTP are built on TCP).

Quickly it becomes clear that there are common features needed
by all MA. These features can be factored out and implemented
once and for all as an intermediate layer between the application
and UDP.

9.3.2 RTP
---

Real-time transport protocol

Common requirements

1) sender and receiver agree on (maybe negotiate) data format
   and encoding scheme

2) timestamping data: when should this bit get played?

   if everything were delivered exactly on time, we could play it
   as soon as it arrives

   since there is variation in delay (jitter) we send everything
   a little early

   in other words, we maintain a playback buffer

   if it gets there late, it will still be in time

   if it gets there promptly, there is a delay between when it
   arrives and when it plays

   if it gets there super late, then everything subsequent will
   be delayed, although we can sometimes catch up by shortening
   silences

3) facilitate synchronization: sound with image, or multiple
   image or multiple sound

4) monitor packet loss so that applications can respond
   appropriately (as opposed to TCP, where this functionality
   resides lower in the protocol stack)
   (What might different applications do?)

5) indicate frame boundaries

   useful for shortening (or lengthening) silences, as mentioned
   above

6) identify stream sources

(Why do headers have to be small?)
(Why do packets tend to be small?)

Padding scheme is an interesting example of how RTP really
chisels on header bits. One bit indicates that there is padding.
(How do we figure out how much?)
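
A sketch of how the padding bit works in practice, assuming the fixed 12-byte RTP header layout from RFC 3550 (not spelled out in these notes): the header spends one bit on "padding present", and the count lives in the last byte of the packet, so it costs nothing when there is no padding. The function name parse_rtp and the returned field names are made up for illustration.

```python
import struct


def parse_rtp(packet: bytes) -> dict:
    """Parse the fixed RTP header and strip padding, if the P bit is set."""
    if len(packet) < 12:
        raise ValueError("RTP fixed header is 12 bytes")
    # Two flag bytes, 16-bit sequence number, 32-bit timestamp, 32-bit SSRC.
    b0, b1, seq, timestamp, ssrc = struct.unpack("!BBHII", packet[:12])
    has_padding = bool(b0 & 0x20)
    has_extension = bool(b0 & 0x10)
    csrc_count = b0 & 0x0F

    # Skip the list of contributing sources (4 bytes each).
    offset = 12 + 4 * csrc_count
    if has_extension:
        # Extension header: 16-bit id, then 16-bit length in 32-bit words.
        (ext_words,) = struct.unpack("!H", packet[offset + 2:offset + 4])
        offset += 4 + 4 * ext_words

    payload = packet[offset:]
    if has_padding:
        # The last byte says how many padding bytes there are, itself included.
        pad_len = payload[-1] if payload else 0
        if pad_len == 0 or pad_len > len(payload):
            raise ValueError("bad padding count")
        payload = payload[:-pad_len]

    return {
        "version": b0 >> 6,
        "marker": bool(b1 & 0x80),
        "payload_type": b1 & 0x7F,
        "seq": seq,
        "timestamp": timestamp,
        "ssrc": ssrc,
        "payload": payload,
    }
```

The marker bit is the one typically used to flag frame boundaries (requirement 5), and ssrc identifies the stream source (requirement 6).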
RTCP
----

Real-time transport control protocol

RTCP is to RTP as ICMP is to IP

A few highlights
----------------

The idea of a mixer is very cool.

Think of an audio stream as a sequence of talkbursts. If they
don't overlap, then the mixer just intersperses them, using the
timestamps for synchronization.
(Wait, I thought the timestamps measured arbitrary ticks, and
different streams might use different granularities).

What if they do overlap? Multiple talkbursts can be merged at
the application level. Not clear that this can happen at RTP
level. Even so, the book implies that the merged stream is
smaller than the sum of the two streams. Hmm.

Clock sync
----------

RTCP packets map time-of-day timestamps to tick timestamps.
(How do you deal with clock error, skew, drift, etc.?)
Distributed clock sync algorithms (maybe).
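
A rough sketch of how that mapping lets a receiver line up streams with different tick granularities. It assumes each stream's most recent report gives one (time-of-day, tick) pair and that the tick rate is known from the payload format. The SenderReport class, its field names, and the rates 8000 and 90000 are illustrative assumptions. Clock error and drift are ignored here.

```python
from dataclasses import dataclass


@dataclass
class SenderReport:
    wallclock: float  # time of day, in seconds, when the report was sent
    rtp_ts: int       # the stream's tick timestamp at that same instant
    clock_rate: int   # ticks per second (assumed known from the payload format)


def to_wallclock(sr: SenderReport, rtp_ts: int) -> float:
    """Convert a tick timestamp to time of day using one report as the anchor."""
    # Tick counters are 32 bits and wrap, so take a signed 32-bit difference.
    delta = (rtp_ts - sr.rtp_ts) & 0xFFFFFFFF
    if delta >= 0x80000000:
        delta -= 0x100000000
    return sr.wallclock + delta / sr.clock_rate


# Two streams from the same sender, with different tick granularities.
audio = SenderReport(wallclock=100.0, rtp_ts=16000, clock_rate=8000)
video = SenderReport(wallclock=100.5, rtp_ts=90000, clock_rate=90000)

# Both of these map to the same time of day, so they should play together.
audio_time = to_wallclock(audio, 24000)
video_time = to_wallclock(video, 135000)
```

Once both streams are on the time-of-day axis, the different granularities stop mattering; what is left is the clock error question raised above.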