Clink: a tool for estimating Internet link characteristics

by Allen B. Downey

Papers

The paper "Using pathchar to Estimate Internet Link Characteristics," appeared at SIGCOMM '99. It is is available in gzipped Postscript (242 KB).

Here's what the talk looked like, from my point of view: photo (50 KB).

I also presented this work as a poster at SIGMETRICS '99.

I presented an extended version of my SIGCOMM talk at Cisco Systems. The slides I used are available in gzipped Postscript (185 KB).

pathchar

Pathchar is a tool written by Van Jacobon that tries to estimate the characteristics of links along an Internet path. There is a web page at CAIDA that describes pathchar.

At this time (19 July 1999), an alpha version of pathchar is available in binary form for FreeBSD, Linix, Solaris and some other platforms (pathchar binary distribution).

There is also a set of slides there that Jacobson used to present pathchar at a conference.

Datasets

The datasets that appear in the SIGCOMM paper are available as a gzipped tar file (1.8 MB) or an uncompressed tar file (6.2 MB).

The files contain the output generated by pathchar and the dump files created by pathchar. The format of the dump files is not documented, but I have inferred that the columns are:

The ttl (number of hops the probe traversed).
A flag of some kind. I discarded all data with flag != 0.
The size of the packet (including the UDP and IP headers, but not the Ethernet header) in bytes.
The round trip time in milliseconds.

clink

clink (Characterize Links) is a utility I wrote that does the same thing as pathchar -- it estimates the latency and bandwidth of Internet links by sending UDP packets from a single source and measuring round-trip times. The basic mechanism is similar to ping and traceroute, except that clink generally has to send many more packets.

The interface of clink is based on the interface of pathchar, and the underlying mechanism is based on Jacobson's description of pathchar. No pathchar source code is included in clink.

Here is the documentation for clink.

clink is based on trout, which is a simple version of traceroute written by Allen Downey, but which is heavily based on the version of traceroute Richard Stevens presents in his book UNIX Network Programming: Volume 1, Second Edition.

In Summer 1998, I wrote the data processing part of clink, which became one of the source code files below (process.c). I used the alpha version of pathchar to collect data, and wrote a paper describing my experiences, and proposing some improvements to pathchar's data processing.

In Summer 1999, I wrote the data collection part of clink, which is the source code file called collect.c, and assembled the pieces into this distribution. I am currently working on (1) implementing the improvements I suggested in my SIGCOMM paper, and (2) testing additional improvements.

So far, the primary differences between pathchar and clink are:

1) clink uses the even-odd technique described in the SIGCOMM paper to generate interval estimates for bandwidth.

2) when clink encounters a routing instability, it collects data for ALL the paths it encounters, until one of the paths generates enough data to yield an estimate.

The source code for clink is available here in a tar file (82 KB) or at gzipped tar file (23 KB). The next major addition to clink will be adaptive data collection as described in the SIGCOMM paper.

pchar

There is another program called pchar, written by Bruce Mah at Sandia National Labs. It is also an Open Source reimplementation of pathchar, available from Bruce's web page.

The abstract, introduction, and conclusions of the SIGCOMM paper are below.

Abstract

We evaluate pathchar, a tool that infers the characteristics of links along an Internet path (latency, bandwidth, queue delays). Looking at two example paths, we identify circumstances where pathchar is likely to succeed, and develop techniques to improve the accuracy of pathchar's estimates and reduce the time it takes to generate them. The most successful of these techniques is a form of adaptive data collection that reduces the number of measurements pathchar needs by more than 90% for some links.

Introduction

pathchar is a new tool, written by Van Jacobson at Lawrence Berkeley Laboratory (LBL), that tries to infer the characteristics of individual links along an Internet path by measuring the round trip time of packets sent from a single host. An alpha version of pathchar is available for FreeBSD, Linux, OSF and Solaris here. We explain the basic mechanism and evaluate its accuracy on two paths whose link characteristics are known. Based on these observations, we propose techniques to improve the accuracy of pathchar and to reduce the number of measurements (and time) it takes to generate its estimates. The contributions of this paper are

An evaluation of pathchar and some insight into when it can or cannot be expected to be useful.
A technique for generating intervals for the estimates pathchar generates.
A technique for determining dynamically the number of measurements needed to achieve a given accuracy.

Conclusions

Based on our evaluation of the alpha version of pathchar we conclude:

Estimating link latencies is relatively easy, since latencies in wide-area networks are large compared to pathchar's measurement errors.
Estimating bandwidths is harder, because the difference in round-trip time between the largest packet and the smallest is small compared to the measurement errors. The higher the bandwidth, the more difficult it is to estimate.
The key to getting a good bandwidth estimate is to send enough probes that one of them traverses the entire path without incurring any queue delays. As the length of the path increases, the number of probes required increases quickly.
A single busy link, by imposing queue delays on the majority of probes, makes it difficult to resolve the characteristics of links on the other side. On the other hand, slow links are not necessarily a barrier to accurate measurement, as long as their performance is consistent.
pathchar's attempt to characterize the distribution of queue times for individual links may be in vain, since network traffic conditions change significantly while pathchar is running.

Of the new techniques we tested, only adaptive data collection seems to work well: it greatly reduces the amount of data required, without affecting the accuracy of the estimated characteristics. None of the techniques we tried significantly improved pathchar's accuracy. Our experiments have uncovered a few anomalies.

The measured bandwidth of T1 links is consistently within a few percent of 1.43 Mb/s, when the nominal bandwidth is 1.54 Mb/s.
The measured bandwidth of the DS3 links is consistently too low, with a median value close to half of the nominal bandwidth.

It is not easy to explain these results, and we cannot attribute them to noise. In order for a fast link to masquerade as a slow link, it has to impose delays that are proportional to the packet size. Random noise would not yield estimates that are always too low, or as consistent as the T1 measurements.