Chapter 4
---------

4.1 Internetworking
---------------

With lower-case i, any network made up of smaller, heterogeneous
networks.

What used to be bridges are now routers, where the primary
distinction is that routers have to convert from one network
format to another.

Canonical example: Ethernet onto FDDI or ATM.

With capital I, the Internet is specifically the global internet
that uses IP (Internet Protocol) as its lingua franca.

Figure 4.2 on page 249 shows a protocol diagram for an Internet
path.

Hosts are running protocols on top of IP.

Routers need to be able to "speak" IP as well as two different
network frame formats.

IP packets are encapsulated inside network-level frames.


4.1.2 Service model
-------------

datagrams -- all packets contain all info needed for delivery.
	     no setup before sending
             no way of knowing if there is a path, or if the
	       recipient is up

unreliable -- packets can get lost, reordered, or delivered more
	      than once!

best-effort -- if a router discovers that something has gone
	       wrong, it has no obligation to inform either party
	       (although sometimes is does)


Header format
-------------

Notice: things that handle IP packets tend to be written
in software, so

1) byte alignment is nice
2) CRC is less appealing

Header is 20 bytes (5 words) long; most fields are byte-aligned

Version (4 bits):  IPv4 or IPv6

Hlen: length of header including options (in words)

      Example option: source routing... up to nine router IP addresses

                      route tracing... same thing

TOS: type of service.  In theory there is min delay, min cost, etc.

Length: total length of packet including header (in bytes)

Indent: unique identifier used when packet is fragmented

Flags: most important, M=are there More fragments in this packet?

Offset: where in the fragmented packet does this fragment fall
	(specified in 8-byte units)

	(Why?)

TTL: literally time to live, but actually number of hops to live,
     but actually number of well-behaved routers to live

Protocol: used at the next level up  TCP = 6, UDP = 17

Checksum: just protects the header, fast algorithm for software

Source addr and dest addr:  32 buts each, as explained below

Options, padding.


Fragmentation
-------------

Each network has a maximum frame size.

MTU = maximum transmission unit = maximum frame size - IP header size

We _could_ require IP to use a packet size = min packet size over
   all network types, but that's bad because

1) new network types all the time, and IP needs to be universal

2) not all paths include all network types.  We want hosts to
   be able to use large packet sizes when possible.

Therefore, hosts can send any packet size (usually up the the
MTU of the network they are connected to directly), and
the network fragments if necessary along the way.

The fragments are themselves valid IP packets that are delivered
independently to the destination.

Once fragmented, packets are not reassembled until they arrive
at the destination.

If one fragment is lost, the whole packet is dropped, since there
is no one we can get just the dropped fragment from.

Fragments might themselves get fragmented, if they pass through
a link with an even smaller MTU.

Not a particularly efficient process, since
reassembly is non-trivial (possibly multi-level).

Better if hosts do MTU discovery before sending.


4.1.3 Global addresses
----------------

Every interface on every host has a 32-bit IP address that is
globally unique (this is now only almost true).

How many addresses are there?  How many interfaces are out there now?


Hierarchical: 32-bit address contains a network part and a host part.

Pro: simplifies routing information... the router need only
     look at the network part

Con: not the most efficient allocation scheme.  Not all networks
     have the maximum number of hosts on them, so many IP addresses
     are wasted.

     (Actually it's even worse than that, because ranges of network
     numbers have been allocated to institutions that are not using
     them.)

The split between the network part and the host part depends on
the class of the address.

Addresses are written as 4 decimal numbers with . between them

first byte    class   split
0  -127	      A	      1 byte network, 3 bytes host
128-195	      B	      2 byte network, 2 bytes host
196-255	      C	      3 byte network, 1 bytes host

Rocky is (was) 137.146.194.45

First bit is 1, second is 0, so this is a Class B address.

137.146 is Colby, there can be 65536 interfaces on campus.

Colby uses one byte to identify Ethernet,
one byte for each host on the net.


Recurrence relations
--------------------

In the spirit of Math-on-demand, here's a few tips about
recurrence relations.

See Sedgewick, pages 76-78.

Problem for next time:

Given the following recurrence relation:

f(n) = a f(n/2) + b lg n

f(1) = f1

Assume that there is a solution of the form:

f(n) = c n lg n + d lg n + en

Find a general solution by plugging the assumed
solution into the recurrence and solving for
c, d and e in terms of a, b, and f1.

Is there some condition on a, b, and f1 that
is required in order for this functional form
to solve the recurrence relation?


4.1.5 ARP
---

Translate the IP address and encapsulate the IP packet

One possibility is to use physical addresses as the host
part of the IP address, but that's not generally possible.

Instead, each host maintains a table.

1) updated dynamically

2) entries time out after about 15 minutes

3) if you don't know, broadcast


If you get an ARP query, you (incidentally) find out the
physical address of the sender.

1) if the sender is already in your table, refresh

2) if the sender is asking about you, reply, and also
   add an entry to your table (on the assumption that
   you will need it soon)

3) otherwise, do nothing


ATMARP
------

Divide net into subnets.

Each subnet gets an ARP server.

Each node configured with physical address of ARP server.

Everyone has a VC to the ARP server.

Everyone registers with the ARP server when they create their VC.

Server answers all ARP queries.


4.1.6 DHCP
----

Dynamic host configuration protocol

Can't allocate IP addresses at manufacture time.

      (Why not?)

Have to configure upon installation on a network.

In addition to their own IP addresses, hosts need to
be configured with the IP address of a default server.

Administrator can assign addresses and configure manually.

Alternative:

1) create a DHCP server somewhere on the network

   (in this case, think campus-wide)

2) on startup, host broadcasts DHCPDISCOVER on its subnet

   if DHCP server is on the subnet, it replies

   otherwise, a DHCP relay on the subnet relays (by unicast)
   the message to the DHCP server (it knows the IP address)

3) DHCP might have a table of static IP addresses, in which
   case it will look up the physical address of the sender

   otherwise it will choose an IP address dynamically from
   the range of available addresses


This makes things more manageable, mostly.

Problem: IP address of a given host changes more often
	 than might be desireable, especially if this
	 host provides a service of some kind (like
	 web pages)

	 There might be bad interactions with DNS,
	 about which more later.


4.1.7 ICMP
----

Internet Control Message Protocol

Something goes wrong, routers (or recipients) can send
messages back to sender

1) host unreachable!

2) reassembly failed!

3) ttl = 0, time expired!

4) checksum failed!


Quick note: both DHCP and ICMP use UDP, which is a protocol
that runs on top of IP, so we are getting just a little ahead
of ourselves, but that's ok.


4.1.8 Virtual private networks
------------------------

You might not want everyone to be able to send packets to
you from a shared link.

On nets that support VC, this is easy to do.  You just
configure your router to refuse to create VCs from addresses
you don't like.


Tunnel
------

In IP land (no virtual circuits), we can get some of the same
behavior with tunnels.

Tunnel entrance takes incoming packets, encapsulates them
and sends them to the other end.

Tunnel exit unwraps the packets and delivers them locally.

Tunnel exit can restrict access to local network by refusing
to unpack packet from undesireables, or packets that aren't
wrapped at all.

Benefits:

1) privacy, security

2) multicast:  multicast packet travels through non-multicast
               intermediate network incognito

3) non-IP protocols over IP (similar to above)

Downside:

1) space overhead, time on router overhead

2) management

3) confounds path discovery, since intermediate routers don't
   decrement ttl of inner packet


4.2 Routing
-----------

See the table on page 266.  How did it get built?

Building the table is routing.  When you hear routing,
think, "building routes," which means "building routing tables."

You can't route a packet.  You can forward a packet using
a routing table that got built by the routing mechanism.


(What information is in the forwarding table that is often
not in the routing table?)


(Why might/should be routing table and the forwarding table be
different data structures?)


Fundamental problem with decentralized algorithms -- possibility
of inconsistent state...

   I think you have the fastest route to the dest;
   you think I do.

   We keep handing the packet back and forth.

Best we can promise is that this situation will resolve itself
in a reasonable amount of time.


4.2.2 Distance vector routing (RIP)
-----------------------------

Everyone knows who their neighbors are and their cost,
which is one.

Distance to non-neighbors is initially infinity.

Periodically, everyone sends that information to their
neighbors, who update their own table.

Eventually, the system converges (everyone has a consistent
view of the topology and knows the shortest distance to
everyone).


Updates are periodic (seconds to minutes) and triggered
(tell your neighbors if something changes).


How might a node detect that one of its neighbors (or
the intervening link) has failed?

What does it do in that case?

In the graph on page 284, if E goes down, A sends messages
to everyone.  If B gets the message before C, and C sends
an update to B before B gets the message from A, all heck
breaks loose.

(What is the error in the book?)


4.2.3 Link state (OSPF)
-----------------

Instead of everyone telling the neighbors everything
they know, everyone tells everyone only what they know well!

Then everyone has global information and they can use global
algorithm to find minimal paths.

Problem: how can you multicast when you have not yet got
routes established?

Answer: reliable flooding.

1) send to all your neighbors.

2) they send to all of theirs, and so on.

3) packets have sequence numbers, so you only pass on new ones.

4) every connection is reliable (uses ACKS)


Total amount of traffic is more, per update, but updates need
to happen less often, since convergence is fast and reliable.

Once hosts have global info, they use Dijkstra's shortest path
algorithm.

Downside:

1) amount of info stored at each router is large


Ultimately, neither of these algorithms scales well, which is
why we need Section 4.3:  Global Internet.


4.2.4 Metrics
-------

So far, we have been using costs = 1, which means we are
choosing the shortest path in number of hops, but that
is not necessarily best.

Neither is the first draft of ARPANET

1) cost = queue length

2) cost = average delay of the last few packets

   delay = latency + transmit time + (depart - arrival)

latency, in this context, means the fixed part of the
	 time it takes to send a packet, regardless of size

transmit time is the variable part of the time it takes
	 to send a packer, depending on size

(depart - arrival) is usually queue delay, but also includes
	timeouts and resend times

Latency and transmit time are static, (depart - arrival) is
updated periodically (presumably when we recompute routes).

I assume that transmit time is based on some average packet
size.

3) total hack!

What is instability?

On one level is means that routes are changing more than you
want.

The more interesting meaning is that routes oscillate forever
and never converge.

   COMPRESS
   effect of queue delays to reduce the appeal of
   long paths

   SMOOTH
   by replacying delay with utilization, and averaging
   over a longer period of time, and bounding rate of
   change over time.  This supposedly reduces instability.

   SIMPLIFY STATIC INFO
   instead of latency and transmit time, just look at
   the type of the link -- in other words, the precomputed
   functions on page 303 take into account the link
   characteristics.


Empricial Internet Behavior
---------------------------

The box on page 304 describes an interesting phenomenon.

Although the Internet is made up of deterministic, man-made
pieces, the system as a whole cannot be understood using
the conventional tools of reductionism.

It is best treated as a complex system, which means

1) it is made up of multiple components

2) the interactions of those components contribute something essential
   to the behavior of the system, making it impossible to understand
   the system by looking at the parts in isolation.

complex != complicated

complex means made up of multiple components, not monolothic

	therefore, you either are or aren't; there's no "so complex..."

complicated has an underlying element of human confusion, which comes
in degrees, as in "the system is so complicated that we can't make
good predictions about it"

Alternatively, "the system is complex, so the best way to make
predictions about it is to model the components and also the
interactions among the components."

Let's read Paxson's paper!


Let's skip routing for mobile hosts


4.3 Global Internet
---------------

Two problems

1) routing algorithms don't scale up to 10,000 networks

   routers might have 10,000 entries!

2) address allocation is getting tight


4.3.1 Subnetting
----------

You might have noticed that I have been vague for the last
week about what I mean by a network.

Is it a single Ethernet on a campus or is it the whole
campus.

The answer is that from the point of view of a router
that is off campus, the entire Colby campus is one network.

So all traffic headed for Colby is going to go to the same
place, which is fine becauce

1) there is only one way to get here anyway,

and

2) even if there were more than one way, any will do.


Once the packet gets onto campus, OUR routers use additional
bits of the "host" part of the address to forward the packet
to the right subnet (which usually corresponds to a physical
network).

We can use as many bits as we want, even non-contiguous bits
(although the latter is a bad idea).

The subnet mask indicates which bits it is.

Only the routers (and host) in our domain need to know which
part of our IP addresses idenifies the subnet.

Why do hosts have to know the subnet mask?


Can have more than one physical network per subnet!

    (we saw this in Chapter 3 -- bridges forward from one
    physical network to another)

Can have more than one subnet per physical network!

    (this just means that hosts that are actually on
    the same network will think they are not and send
    packets through the router when they don't have to)


There are several good things about subnetting, but the
biggest benefit is that it greatly reduces the amount
of information off-campus routers need to have about our
network.


4.3.2 CIDR
----

Classless interdomain routing.

We have already blurred the line between the network
part of an IP address and the host part.

CIDR does the same thing, except in the other direction.

If you (a router) use only a prefix of the network
part, and ignore the low order bits, you are effectively
aggregating a bunch of networks.

Anything that starts with 137.0011 is going to the same
place.

If you hand out IP addresses in clever ways, you can
make this work.

For example, give a set of contiguous, power-of-two
addresses to a network servive provider and let them
hand out network addresses to customers.

For the point-of-view of other networks, you can send
packets for any of MCIs customers to the same place.

This is sometimes called supernetting, since you are
taking a bunch of network and treating them as one
big unit.

CIDR prefixes can be any length.

Routers match IP addresses with entries in the forwarding
table by choosing the longest prefix match.

There might be something like:

For example, maybe Denmark is one big supernet, but
it includes networks in Denmark but also in Greenland.

A router in Reykjavik might have entries that say...

Anything going to Denmark, send it here.

Except things in Greenland (a Danish colony), which go
somewhere else.


4.3.3 BGP
---

That's still not good enough.  Even with CIDR, there
are still something like 50,000 prefixes out there that
every router would have to know (and that was a couple
of years ago).

1) the world is broken into AUTONOMOUS SYSTEMS.  Forget
   about getting packets to the right network; just get them
   to the right AS.

2) most AS don't participate in interdomain routing, so
   that simplifies the routing process if not the forwarding
   process

   (we can assign a unique ID to all the transit AS)

3) forget about minimal paths, just try to get a loop-free
   path

4) appoint only one router in each AS to participate in
   routing -- the speaker

5) when you advertise a path, advertise the complete path
   so that someone deciding whether to use it can check
   for loops.

6) withing each AS, aggregating works well because finding
   a good path means finding the right border router, and
   there usually aren't that many per AS


Injection
---------

In a stub AS, all the routers know that the border
router is the default router.

       (I don't understand why the book indicates
       that this information is injected by the border
       router into the intradomain routing process.)

In a multihomed AS, is it actually useful for the border
routers to advertise routes (with local costs) to the
other routers in the domain, so that internal routers
can choose the best border router for a given destination
from a given place in the AS.

In a transit AS, you might want to use IBGP, which uses
BGP to identify the best border router and intradomain
routing to get there.


NAT
---

NAT is evil.

Instead of maintaining the very desireable property of
globally unique addresses, NAT makes it possible to hand
out addresses within an AS that are not unique.

The often begin with 10.

Imagine that Colby got a single Class C address.

Within campus, we use addresses that begin with 10,
so there are 2^24 local addresses.

Other people might be using the same ones elsewhere.

Somewhere between us and the outside world is a NAT box.

It translates our local addresses into globally unique
addresses, using our pool of 256 Class C addresses.

As long as there are no more than 256 people on campus
trying to send or receive packets to or from off campus,
we're fine.

Otherwise we might have to queue for a free UID.

Other (big) problem.  Embedded addresses.

NAT box has to know when and how to translate addresses
that are embedded inside messages, according to higher-level
protocols.

NAT is evil.


4.3.5 IPv6
----

Need to change the size of addresses.

Therefore...

Need to change the size/format of the header.

Therefore...

All bets are off!  It's a free-for-all!

Step right up, pile on the cruft!

IPv6 has been acting as a feature filter for a decade.

Some of the good ideas (except big addresses) have been
implemented in IPv4.

That may have delayed actual implementation.

   (What is anarchy, and what is the implication for Internet?)


Addressing
----------

128 bits, one address for every atom in the universe.

Even with suboptimal allocation, that's 1500/square foot.

Notation:  colon-separated 4-digit hex numbers


Aggregatable Global Unicast Address
-----------------------------------

Aggregatable:  you can use CIDR because they are handed
	       out such that common prefix implies common
	       route.

Global:        globally unique

Unicast:       this address identifies a single machine
	       (not a set of machines)


IPv4-compatible IPv6 address:  zero extend the IPv4 address.

   even IPv4 routers can refer to you.

IPv4-mapped IPv6 address: prefix FF and then zero-extend

   IPv6 routers refer to you this way, but you are clueless.


Transition
----------

Goal: hosts and routers that only understand IPv4 can
      continue to lead productive lives.

      You just have to talk LOUDLY AND SLOWLY.

For a long time (maybe forever), new routers will be
expected to be bilingual (dual stacks).

To get IPv4 packets from one IPv6 compliant zone to another,
we can use tunneling.

Remember? 

The last IPv6 router wraps the packet inside
an IPv4 packets and forwards it to the first IPv6 router
at the destination.

Any IPv6 router or host along the way can do the unpacking,
as long as there are no more IPv4 routers.

If the recipient has an IPv4-mapped IPv6 address, then
and IPv6 router or host can do the unwrapping and generate
an IPv4 header with no prior configuration.

Otherwise, the wrapper has to know the IPv4 address of
either the recipient or an IPv6 router between the
recipient and the last IPv4 router.


Aggregration
------------

Internet has a natural (intrinsic) hierarchy:

A small number of direct providers (backbones) connected to

A larger number of indirect providers connected to

A very large number of subscribers (nontransit AS).

And so...

God assigns prefixes to registries (gubment or NGO)

Registries assign prefixes to direct providers.

Direct providers assign prefixes to indirect providers.

Indirect providers assign prefixes to subscribers (like Colby)

Subscribers hand out the rest according to local policy.


See the picture on page 335.

There is a natural mapping between aggregated addresses
and trees.


Packet format
-------------

Version.

Traffic class: QoS

Flow label: remember day one we talked about multiple flows
            between the same pair of machines?

   (Why would the net care which flow a packet belonged to?)

PayloadLen

NextHeader: either an indicator of options to follow, or
	    a protocol identifier.

HopLimit = ttl

40 bytes total!


Autoconfiguration
-----------------

Currently, a new host needs to know:

It's IP address, the local subnet mask, the address of a name server.

DHCP servers can solve the first problem.  Similar approach
might solve the second, but:

1) depends on the consistency and availability of a server!

Instead, we can use a unique interfece ID (like an Ethernet
address) as a host ID and get the network address by snooping.