Raghu Kacker, Hung-kung Liu, Nien Fan Zhang, James Yen, David Banks,
Craig Hunt, Sri Kumar, Hsin Fang
The Statistical Engineering Division and the Advanced Network Technologies Division are collaborating in support of industrial consortia that are developing metrics and tools to measure the performance of Internet services. The consortia include the Cross-Industry Working Team (XIWT) and the IP Performance Metrics Working Group (IPPM-WG) of the Internet Engineering Task Force (IETF). The activities of these consortia are driven by (1) the pressing need to develop metrology to enable customer service level agreements with Internet Service Providers (ISPs), (2) the need for faster methods for troubleshooting network problems and identifying network-wide inefficiencies, and (3) the need for resource allocation techniques to meet the Quality of Service (QoS) requirements of emerging applications. Because the behavior of the Internet changes over both space and time, the metrics and measurement methods are inherently statistical.
A number of measurement tools have emerged to generate data on service quality. XIWT uses a "pinger" tool developed at Stanford University. Eight monitoring sites continually "ping" each other at half-hour intervals and also ping forty-six other remote sites. The data consist of end-to-end return delay times as logged by the source. For each set of 10 pings, the average, minimum, and maximum delays are recorded. A delay larger than a threshold is logged as lost. The data are currently archived at NIST for XIWT, but this is a temporary arrangement. The ITL team carried out an exploratory analysis of pinger data from NIST to three sites for the period May 1 to Sept 4, 1998 and noted the following: (1) large packets are more likely to be lost; (2) certain anomalies were observed; (3) a weekly trend was clear; (4) the distribution of delays appears to be heavy-tailed (that is, large delays may occur with non-negligible probability); and (5) autocorrelations are persistent but not strong in absolute value. Next, XIWT asked us to look at bi-directional data between pairs of monitoring sites and to plot the three quartiles of delay at noon for each month. The boxplots indicate improved performance from June to November 1998. When outliers are removed, the correlation between the delays in the two directions is about 0.5 for an important pair of sites.
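The per-set record described above (average, minimum, and maximum delay of 10 pings, with delays beyond a threshold logged as lost) can be sketched in Python as follows. This is only an illustration, not the pinger tool's own code: the function name `summarize_batch`, the millisecond unit, the example threshold, and the choice to exclude lost pings from the three summary statistics are all assumptions.

```python
from statistics import mean
from typing import Dict, Optional, Sequence


def summarize_batch(
    delays_ms: Sequence[float], loss_threshold_ms: float
) -> Dict[str, Optional[float]]:
    """Summarize one set of ping delays (hypothetical helper).

    A delay larger than loss_threshold_ms is counted as lost.
    Assumption: lost pings are left out of the average, minimum,
    and maximum rather than being included at their logged value.
    """
    received = [d for d in delays_ms if d <= loss_threshold_ms]
    summary: Dict[str, Optional[float]] = {
        "sent": len(delays_ms),
        "lost": len(delays_ms) - len(received),
        "avg": None,
        "min": None,
        "max": None,
    }
    if received:  # leave the statistics as None if every ping was lost
        summary["avg"] = mean(received)
        summary["min"] = min(received)
        summary["max"] = max(received)
    return summary


# Example: ten delays, one of which exceeds an assumed 1000 ms threshold.
batch = [20.0, 22.0, 21.0, 5000.0, 19.0, 23.0, 20.0, 21.0, 22.0, 22.0]
result = summarize_batch(batch, loss_threshold_ms=1000.0)
```

Keeping the lost count separate from the delay statistics means a single timed-out ping does not inflate the recorded maximum or average for the set.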
Future plans include development of statistical software for data management so that data analysis can be done easily and with fast turnaround, evaluation of various IP measurement tools for active measurement, and exploration of possibilities for adaptive modeling and prediction. Because the Internet may change so quickly, traditional adaptive modeling methods may not be useful. Nonetheless, adaptive prediction remains a "holy grail," because imminent or likely problems could then be predicted and hence ameliorated.
Figure 3: To eliminate time-of-day effects, only the delay times at around noon were used. The distribution of these noontime delays was then summarized in a separate boxplot for each month (Jun.-Nov.). The boxplots show that for pings of size 10,000 (NIST is the exclusive sender of such pings), the delays are much shorter in Oct. and Nov. than in the previous months. No such clear differences are evident for pings of size 100 and 1000.
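The noontime summary behind the boxplots can be sketched in Python along the following lines. This is a minimal illustration under stated assumptions, not the analysis code used for the figure: the function name `noon_quartiles_by_month`, the 30-minute window taken to mean "around noon", the `(timestamp, delay)` record layout, and the "inclusive" quantile method are all hypothetical choices.

```python
from collections import defaultdict
from datetime import datetime
from statistics import quantiles
from typing import Dict, Iterable, List, Tuple


def noon_quartiles_by_month(
    records: Iterable[Tuple[datetime, float]], window_minutes: int = 30
) -> Dict[Tuple[int, int], Tuple[float, ...]]:
    """Return {(year, month): (Q1, median, Q3)} for delays logged near noon.

    Assumption: "around noon" means within window_minutes of 12:00.
    Months with fewer than two noontime delays are skipped.
    """
    by_month: Dict[Tuple[int, int], List[float]] = defaultdict(list)
    for timestamp, delay in records:
        minutes_from_noon = abs(timestamp.hour * 60 + timestamp.minute - 12 * 60)
        if minutes_from_noon <= window_minutes:
            by_month[(timestamp.year, timestamp.month)].append(delay)
    return {
        month: tuple(quantiles(delays, n=4, method="inclusive"))
        for month, delays in sorted(by_month.items())
        if len(delays) >= 2
    }


# Example with made-up delays: five noon readings in June 1998 plus one
# early-morning reading that the time-of-day filter should discard.
records = [
    (datetime(1998, 6, day, 12, 0), delay)
    for day, delay in zip(range(1, 6), [10.0, 20.0, 30.0, 40.0, 50.0])
]
records.append((datetime(1998, 6, 6, 3, 0), 999.0))
summary = noon_quartiles_by_month(records)
```

The three values per month are the quartiles that a boxplot draws as the box edges and the median line, so comparing them across months gives the same picture as the figure in tabular form.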
Date created: 7/20/2001