Chapter 9
----------
9.1 DNS
-------
DNS = Domain Name System
name space -- set of possible names
flat = any string
hierarchical = strings separated by delimiters
(usually with restrictions on what can
be in each field, or requirements for
the number of fields, etc.)
bindings -- mapping between names and some other piece
of information
(what's the info in this case?)
resolution -- looking up a name and getting the info
centralized database vs distributed database
1) obvious issues of scalability and availability
2) obvious problem with consistency
9.1.1 Domain hierarchy
----------------
Like U.S. mail addresses, ordered from most specific to
most general.
Processed backwards.
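A tiny sketch of what "processed backwards" means, in Python. The function name is made up for these notes; the host name is the one used later in this chapter.

```python
def labels_from_root(name):
    """Split a domain name into labels, most general first."""
    # A trailing dot (the explicit root) is dropped before splitting.
    return list(reversed(name.rstrip(".").split(".")))


print(labels_from_root("mail.wellesley.edu"))
```

A resolver would consume this list left to right: first find someone who knows "edu", then "wellesley", and so on.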
Top level of the tree is the "big six" plus one for each country.
edu, com, gov, mil, org, net
U.S.-centric (f**king A!)
Some political controversies over names of country domains.
9.1.2 Name servers
------------
Tree is divided into zones.
A zone is often but not always a level or a subtree.
Each zone is usually an administrative domain.
May have multiple name servers per zone. Also, one
name server may "implement" multiple zones.
(What does it mean to say that a server implements
a zone?)
A DNS entry is a 5-tuple
< Name, Value, Type, Class, TTL >
Type says how to interpret Value. It is one of
A      Value is the IP address of the named host
NS     Value is the name of a name server that should be asked about the host
CNAME  Value is the canonical name for the host
MX     Value is a host that will accept mail addressed to the given host
TTL is a time (really!) = how long we can cache this entry.
It might take several conversations with a name server to
find out what's what.
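The 5-tuple and the TTL rule can be sketched like this. It is a toy, not how any real name server stores records: the Record and Cache names, the example.edu host and the 192.0.2.10 address are all invented, and the current time is passed in explicitly so the behavior is easy to follow.

```python
from collections import namedtuple

# The 5-tuple from the notes. Class is almost always "IN" (Internet).
Record = namedtuple("Record", "name value type cls ttl")


class Cache:
    """Toy cache that forgets each record once its TTL has elapsed."""

    def __init__(self):
        self._entries = {}  # (name, type) -> (record, expiry time)

    def put(self, record, now):
        self._entries[(record.name, record.type)] = (record, now + record.ttl)

    def get(self, name, rtype, now):
        entry = self._entries.get((name, rtype))
        if entry is None:
            return None
        record, expires = entry
        if now >= expires:
            del self._entries[(name, rtype)]  # stale: must ask again
            return None
        return record


cache = Cache()
cache.put(Record("www.example.edu", "192.0.2.10", "A", "IN", 300), now=0)
print(cache.get("www.example.edu", "A", now=100))  # still cached
print(cache.get("www.example.edu", "A", now=300))  # expired, prints None
```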
Interdomain abstraction
-----------------------
Aliases are used to hide the name of the host that
provides certain general services like web servers.
MX entries are used to hide the name of the machine
that is accepting mail.
(Why?)
(Why two mechanisms? Why not have aliases for
mail.wellesley.edu?)
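To make the two mechanisms concrete, here is a sketch of both lookups against a hand-written table. Every name and address in RECORDS is hypothetical, and a real resolver would ask name servers rather than read a dictionary.

```python
# Hypothetical records: (name, type) -> value. All names and addresses are made up.
RECORDS = {
    ("www.example.edu", "CNAME"): "host7.example.edu",
    ("host7.example.edu", "A"): "192.0.2.7",
    ("example.edu", "MX"): "mailgw.example.edu",
    ("mailgw.example.edu", "A"): "192.0.2.25",
}


def address_of(name):
    """Follow CNAME aliases until an A record is found (or give up)."""
    seen = set()
    while (name, "CNAME") in RECORDS and name not in seen:
        seen.add(name)  # guards against an alias loop
        name = RECORDS[(name, "CNAME")]
    return RECORDS.get((name, "A"))


def mail_address_for(domain):
    """Find the host that accepts mail for a domain, then its address."""
    exchanger = RECORDS.get((domain, "MX"))
    return address_of(exchanger) if exchanger else None


print(address_of("www.example.edu"))
print(mail_address_for("example.edu"))
```

Note that the alias is attached to a host name, while the MX entry is attached to the domain that appears after the @ in a mail address.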
9.1.3 Name resolution
---------------
How do we find the local name server?
Not all name servers have entries for all names, of course.
If we can find a server that knows the rightmost field of
the destination address, we can go from there.
Root name servers have entries for all the top level domains.
In any zone, there is a local name server that knows the
name of a root server.
The local name server acts as a proxy.
It goes out and makes as many queries as necessary.
(How many are usually necessary?)
And then it reports back to you and also makes an entry
in its cache for future reference.
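The proxy behavior described above can be sketched as a loop over referrals. This is a simulation with three pretend servers held in a dictionary; the server names, host name and address are assumptions made for the example, and no packets are sent.

```python
# Each toy server maps a name suffix to either a referral ("NS", next server)
# or a final answer ("A", address). All names and addresses are invented.
SERVERS = {
    "root": {"edu": ("NS", "edu-server")},
    "edu-server": {"example.edu": ("NS", "example-server")},
    "example-server": {"www.example.edu": ("A", "192.0.2.10")},
}


def best_match(table, name):
    """Return the entry for the longest suffix of name that the table holds."""
    labels = name.split(".")
    for i in range(len(labels)):
        suffix = ".".join(labels[i:])
        if suffix in table:
            return table[suffix]
    return None


def resolve(name, cache):
    """Act like the local name server: query down from the root, then cache.

    Returns (address or None, number of queries sent).
    """
    if name in cache:
        return cache[name], 0
    server, queries = "root", 0
    while True:
        queries += 1
        entry = best_match(SERVERS[server], name)
        if entry is None:
            return None, queries
        kind, value = entry
        if kind == "A":
            cache[name] = value
            return value, queries
        server = value  # referral: ask the next server down


cache = {}
print(resolve("www.example.edu", cache))  # first lookup walks the tree
print(resolve("www.example.edu", cache))  # second lookup is answered from cache
```

A real local server would also honor the TTL on each cached entry; that is left out here to keep the loop visible.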
9.2 Applications
-----------
SMTP: simple mail transfer protocol
HTTP: hypertext transfer protocol
SNMP: simple network management protocol
All built on top of TCP or UDP.
What's in the transport protocol?
1) who talks first when connection made
a) or how reliability/connections are handled, if based on UDP
2) how do we delimit requests and replies
3) authentication, if any
What's in the companion format protocol?
1) what is the format of a request?
2) what is the format of a reply?
3) what is the format of status/error information?
SMTP uses RFC 822
HTTP uses HTML
SNMP uses ASN.1/BER and MIB (yikes!)
9.2.1 Electronic mail
---------------
SMTP
queries: HELO, MAIL FROM, RCPT TO, DATA, QUIT
replies: 250 OK, 550 no such user here
RFC 822
To:
From:
Date: date format
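A sketch of the two halves side by side: the command sequence the client sends, and a message with To/From/Date headers built with Python's standard email package. The function names, host name and addresses are placeholders invented for the example, and nothing here opens a connection.

```python
from datetime import datetime, timezone
from email.message import EmailMessage
from email.utils import format_datetime


def build_message(sender, recipient, when, body):
    """Build a header-plus-body message in the RFC 822 style."""
    msg = EmailMessage()
    msg["To"] = recipient
    msg["From"] = sender
    msg["Date"] = format_datetime(when)  # the standard mail date format
    msg.set_content(body)
    return msg


def smtp_commands(client_host, sender, recipient):
    """List the commands a client sends, in order, for one message."""
    return [
        f"HELO {client_host}",
        f"MAIL FROM:<{sender}>",
        f"RCPT TO:<{recipient}>",
        "DATA",  # the message text follows, ended by a line holding only "."
        "QUIT",
    ]


when = datetime(2000, 1, 3, 12, 0, tzinfo=timezone.utc)
msg = build_message("alice@example.edu", "bob@example.org", when, "hello\n")
print("\n".join(smtp_commands("client.example.edu", "alice@example.edu", "bob@example.org")))
print(msg.as_string())
```

The server answers each command with a reply such as 250 OK before the client sends the next one.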
Reliable delivery
-----------------
Mail handler transfers to mail daemon (usually local)
local daemon uses DNS to find a mail gateway that will
accept messages for that destination.
Maybe it's the recipient host, but usually it's a gateway
in the recipient's domain.
(Why gateways?)
Different reliability models:
TCP/IP uses end-to-end explicit ACK.
If the packet arrives, you get an explicit ACK.
If not, you might get an error message.
SMTP guarantees delivery or error message within 5 days.
Guarantee is made one hop at a time.
When a gateway accepts a message for delivery, it should
write it to disk before it acknowledges it.
(Why?)
(Failure modes?)
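The "write it to disk before it acknowledges it" rule, as a sketch. The spool layout, the function name and the message id are assumptions; a real mail daemon does much more (queueing, retries, locking).

```python
import os
import tempfile


def accept_message(spool_dir, message_id, data):
    """Store a message durably, then return the reply to send upstream."""
    path = os.path.join(spool_dir, message_id)
    try:
        with open(path, "wb") as f:
            f.write(data)
            f.flush()
            os.fsync(f.fileno())  # force the bytes to disk before we promise anything
    except OSError:
        return "451 local error, try again later"
    return "250 OK"  # only now is this hop responsible for the message


spool = tempfile.mkdtemp()
print(accept_message(spool, "msg-0001", b"Subject: hi\r\n\r\nhello\r\n"))
```

The ordering is the whole point: once 250 OK goes out, the previous hop is free to delete its copy.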
9.2.2 HTTP
----
Transfer protocol:
queries: GET, HEAD, PUT, DELETE
replies: 1xx, 2xx, 3xx, 4xx, 5xx
URL format
HTML: hyper text markup language
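What the queries and replies look like on the wire, as a sketch. The helper names and the example.edu host are made up; the request format assumes HTTP/1.1, where a Host header is required.

```python
def build_request(method, path, host):
    """Format a minimal HTTP/1.1 request: request line, headers, blank line."""
    return f"{method} {path} HTTP/1.1\r\nHost: {host}\r\n\r\n"


def status_class(status_line):
    """Map a status line such as 'HTTP/1.1 404 Not Found' to its reply class."""
    code = int(status_line.split()[1])
    classes = {1: "informational", 2: "success", 3: "redirection",
               4: "client error", 5: "server error"}
    return classes.get(code // 100, "unknown")


print(build_request("GET", "/index.html", "www.example.edu"), end="")
print(status_class("HTTP/1.1 404 Not Found"))
```

The blank line is how the request is delimited: it marks the end of the headers.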
Opinion
-------
HTTP/HTML are a fundamentally broken protocol pair because
they are currently being used in ways that undermine their
underlying principles:
1) URLs are supposed to identify objects (content)
extensions to HTTP/HTML, like frames, have undermined that
many site managers prefer not to give clients references
to internal objects that might go away, because request
failures look bad
but the alternative is a poorer object identification system
2) HTML is a markup language, not a page description language
markup = "This is a heading, make it whatever font you
are using for headings. Leave some space above
and below it."
page description language = "make this 12 point Times and
put it exactly here"
LaTeX is an ML, PostScript is a PDL
HTTP performance
----------------
persistent connections
1) less TCP startup/shutdown overhead
2) (what's the second, more important reason?)
Who has to work harder? Who shuts down? When?
Caching: many levels
browser, local, ISP
Browser: check cache before requesting
Local: browser sends explicit local request to proxy
ISP: router has to snoop and intercept
At any level, the cache is responsible for consistency:
1) pages have expiration dates
2) proxies can make HEAD requests to check for changes or
use conditional GET
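The two consistency rules can be sketched together: serve from cache until the expiration time, then revalidate with a conditional GET. Origin and ProxyCache are invented classes standing in for a web server and a proxy, the single fixed lifetime is a simplification, and times are plain numbers passed in by the caller.

```python
class Origin:
    """Stand-in for a web server holding one page."""

    def __init__(self, body, modified):
        self.body, self.modified = body, modified

    def conditional_get(self, if_modified_since):
        # 304 means "your copy is still good"; no body is sent.
        if if_modified_since is not None and self.modified <= if_modified_since:
            return 304, None, self.modified
        return 200, self.body, self.modified


class ProxyCache:
    def __init__(self, origin, lifetime):
        self.origin, self.lifetime = origin, lifetime
        self.body = self.modified = self.expires = None

    def get(self, now):
        """Return (body, how) where how is 'fresh', 'revalidated' or 'fetched'."""
        if self.body is not None and now < self.expires:
            return self.body, "fresh"  # unexpired: no request at all
        status, body, modified = self.origin.conditional_get(self.modified)
        self.expires = now + self.lifetime
        if status == 304:
            return self.body, "revalidated"
        self.body, self.modified = body, modified
        return body, "fetched"


origin = Origin("v1", modified=0)
proxy = ProxyCache(origin, lifetime=60)
for t in (10, 30, 80):
    print(t, proxy.get(t))
```

Revalidation costs a round trip but not a transfer of the page, which is why conditional GET is cheaper than fetching again.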
9.2.3 SNMP
----
This is a pretty general recipe for a distributed database:
Transport: GET, SET
Format: names of variables, format of results
Names of variables defined by MIB (management information base)
Format of results defined by ASN.1 (Chapter 7)
Abstract Syntax Notation
Solves the big-endian, little-endian problem.
Floating-point formats?
Marshalling of structures?
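How BER sidesteps the byte-order problem for integers, as a sketch: every value goes out as tag, length, then content bytes in one fixed (big-endian) order. The function names are made up, and only the short length form is handled.

```python
def ber_encode_integer(value):
    """Encode an int as a BER INTEGER: tag 0x02, length, big-endian content."""
    # Smallest number of bytes that holds value in two's complement.
    length = 1
    while not -(1 << (8 * length - 1)) <= value < (1 << (8 * length - 1)):
        length += 1
    if length > 127:
        raise ValueError("long-form lengths are not handled in this sketch")
    content = value.to_bytes(length, "big", signed=True)
    return bytes([0x02, length]) + content


def ber_decode_integer(data):
    """Inverse of ber_encode_integer for short-form lengths."""
    if data[0] != 0x02:
        raise ValueError("not an INTEGER")
    length = data[1]
    return int.from_bytes(data[2:2 + length], "big", signed=True)


print(ber_encode_integer(5).hex())    # 020105
print(ber_encode_integer(128).hex())  # 02020080
```

128 needs two content bytes because a lone 0x80 would read back as -128.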
9.3 Multimedia Applications
-----------------------
Way back in Chapter 1, we talked about the semantic gap
between the basic service provided by the network and
the set of services commonly needed by applications.
UDP provides a pretty bare-bones process-to-process TP.
TCP adds many of the functions most applications need.
Multimedia applications fall in between. They need more
than UDP, but TCP contains some things that are
unnecessary, some that are implemented inappropriately
and some that are undesirable.
(Name one of each)
One solution to this problem is for multimedia applications
to build their own protocols on top of UDP (the same way
HTTP and SMTP are built on TCP).
Quickly it becomes clear that there are common features
needed by all MA.
These features can be factored out and implemented once
and for all as an intermediate layer between the application
and UDP.
9.3.2 RTP
---
Real time transport protocol
Common requirements
1) sender and receiver agree on (maybe negotiate) data format
and encoding scheme
2) timestamping data: when should this bit get played?
if everything were delivered exactly on time, we could
play it as soon as it arrives
since there is variation in delay (jitter) we send
everything a little early
in other words, we maintain a playback buffer
if it gets there a little late, it will still be in time
if it gets there promptly, there is a delay between
when it arrives and when it plays
if it gets there super late, then everything subsequent
will be delayed, although we can sometimes catch up by
shortening silences
3) facilitate synchronization: sound with image, or multiple
image or multiple sound
4) monitor packet loss so that applications can respond
appropriately
(as opposed to TCP, where this functionality resides lower
in the protocol stack)
(What might different applications do?)
5) indicate frame boundaries
useful for shortening (or lengthening) silences, as
mentioned above
6) identify stream sources
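The playback buffer from item 2 can be sketched as a per-packet decision. The function name and the fixed playout_delay are assumptions; what to do with a packet marked "late" (drop it, or stall and delay everything after it) is left to the application.

```python
def playout_schedule(packets, playout_delay):
    """Decide, for each packet, when to play it or whether it missed its slot.

    packets: list of (timestamp, arrival_time) in the same time units.
    A packet stamped t is due at t + playout_delay.
    """
    result = []
    for timestamp, arrival in packets:
        due = timestamp + playout_delay
        if arrival <= due:
            # Third field: how long the packet waits in the buffer.
            result.append((timestamp, "play", due - arrival))
        else:
            # Third field: how far past its slot the packet arrived.
            result.append((timestamp, "late", arrival - due))
    return result


packets = [(0, 40), (20, 95), (40, 150), (60, 400)]
for row in playout_schedule(packets, playout_delay=100):
    print(row)
```

A larger playout_delay tolerates more jitter but adds that much latency to everything, which matters for conversation.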
(Why do headers have to be small?)
(Why do packets tend to be small?)
Padding scheme is an interesting example of how RTP really
chisels on header bits.
One bit indicates that there is padding.
(How do we figure out how much?)
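A sketch of the padding scheme as specified for RTP in RFC 3550: the header spends one bit (P) on it, and the count lives in the last byte of the packet, with the count including that byte itself. The function name is made up, and the sketch assumes the 12-byte fixed header with no CSRC entries or extension.

```python
def strip_rtp_padding(packet):
    """Return the packet without trailing padding, per the P bit in byte 0."""
    if len(packet) < 12:
        raise ValueError("shorter than the fixed RTP header")
    has_padding = bool(packet[0] & 0x20)  # P is the third-highest bit of byte 0
    if not has_padding:
        return packet
    count = packet[-1]  # last byte says how many padding bytes, itself included
    if count == 0 or count > len(packet) - 12:
        raise ValueError("bad padding count")
    return packet[:-count]  # note: the P bit in the header is left as it was


header = bytes([0xA0]) + bytes(11)           # version 2, P bit set
packet = header + b"abc" + b"\x00\x00\x03"   # 3 payload bytes, 3 padding bytes
print(strip_rtp_padding(packet)[12:])        # b'abc'
```

No length field is needed in the header, which is the bit-chiseling the notes admire.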
RTCP
----
Real-time transport control protocol
RTCP is to RTP as ICMP is to IP
A few highlights
----------------
The idea of a mixer is very cool.
Think of an audio stream as a sequence of talkbursts.
If they don't overlap, then the mixer just intersperses
them, using the timestamps for synchronization.
(Wait, I thought the timestamps measured arbitrary
ticks, and different streams might use different
granularities).
What if they do overlap? Multiple talkbursts can be
merged at the application level. Not clear that this
can happen at RTP level.
Even so, the book implies that the merged stream is smaller
than the sum of the two streams. Hmm.
Clock sync
----------
RTCP packets map time-of-day timestamps to tick timestamps.
(How do you deal with clock error, skew, drift, etc.?)
Distributed clock sync algorithms (maybe).
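The mapping itself is simple arithmetic once a report gives one (time-of-day, tick) pair for a stream. A sketch, assuming the tick rate of each stream is known to the receiver; the function and parameter names are invented, and clock error between senders is ignored.

```python
def rtp_to_wallclock(rtp_ts, sr_rtp_ts, sr_wallclock, clock_rate):
    """Convert an RTP tick timestamp to time-of-day seconds.

    sr_rtp_ts and sr_wallclock are the pair carried in one sender report:
    the same instant expressed in ticks and in seconds.
    """
    # RTP timestamps are 32 bits and wrap; take the signed difference.
    diff = (rtp_ts - sr_rtp_ts) & 0xFFFFFFFF
    if diff >= 0x80000000:
        diff -= 0x100000000
    return sr_wallclock + diff / clock_rate


# Audio at 8000 ticks/s and video at 90000 ticks/s land on the same instant.
print(rtp_to_wallclock(168000, 160000, 1000.0, 8000))
print(rtp_to_wallclock(945000, 900000, 1000.5, 90000))
```

Because each stream is converted with its own rate, different tick granularities stop mattering once everything is on the time-of-day axis.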