This is a purely informative rendering of an RFC that includes verified errata. This rendering may not be used as a reference.

The following 'Verified' errata have been incorporated in this document: EID 6870
Network Working Group                                         D.L. Mills
Request for Comments: 957                               M/A-COM Linkabit
                                                          September 1985

              Experiments in Network Clock Synchronization


Status of this Memo

   This RFC discusses some experiments in clock synchronization in the
   ARPA-Internet community, and requests discussion and suggestions for
   improvements.  Distribution of this memo is unlimited.

Table of Contents

   1.      Introduction
   2.      Design of the Synchronization Algorithm
   2.1.    The Logical Clock
   2.2.    Linear Phase Adjustments
   2.3.    Nonlinear Phase Adjustments
   3.      Synchronizing Network Clocks
   3.1.    Reference Clocks and Reference Hosts
   3.2.    Distribution of Timing Information
   4.      Experimental Validation of the Design
   4.1.    Experiment Design
   4.2.    Experiment Execution
   4.3.    Discussion of Results
   4.3.1.  On Power-Grid Clocks
   4.3.2.  On Clocks Synchronized via Network Links
   4.3.3.  On the Accuracy of Radio Clocks
   4.3.3.1. The Spectracom 8170 WWVB Radio Clock
   4.3.3.2. The True Time 468-DC GOES Radio Clock
   4.3.3.3. The Heath GC-1000 WWV Radio Clock
   4.3.4.  On Handling Disruptions
   4.4.    Additional Experiments
   5.      Summary and Conclusions
   6.      References

List of Figures

   Figure 1. Clock Registers
   Figure 2. Network Configuration

Experiments in Network Clock Synchronization


List of Tables

   Table 1. Experiment Hosts
   Table 2. Link Measurements
   Table 3. First Derivative of Delay
   Table 4. GOES Radio Clock Offsets
   Table 5. WWV Radio Clock Offsets
   Table 6. ISI-MCON-GW Clock Statistics
   Table 7. LL-GW Clock Statistics
   Table 8. LL-GW Clock Statistics

1.  Introduction

   One of the services frequently neglected in computer network design
   is a high-quality, time-of-day clock capable of generating accurate
   timestamps with small residual errors compared to intrinsic one-way
   network delays.  Such a service would be useful for tracing the
   progress of complex transactions, synchronizing cached data bases,
   monitoring network performance and isolating problems.

   Several mechanisms have been specified in the Internet protocol suite
   to record and transmit the time at which an event takes place,
   including the ICMP Timestamp message [6], Time Protocol [7], Daytime
   protocol [8] and IP Timestamp option [9].  A new Network Time
   Protocol [12] has been proposed as well.  Additional information on
   network time synchronization can be found in the References at the
   end of this document.  Synchronization protocols are described in [3]
   and [12] and synchronization algorithms in [2], [5], [10] and [11].
   Experimental results on measured roundtrip delays in the Internet are
   discussed in [4].  A comprehensive mathematical treatment of clock
   synchronization can be found in [1].

   Several mechanisms have been specified in the Internet protocol suite
   to record and transmit the time at which an event takes place,
   including the ICMP Timestamp message [6], Time protocol [7], Daytime
   protocol [8] and IP Timestamp option [9].  Issues on time
   synchronization are discussed in [4] and synchronization algorithms
   in [2] and [5].  Experimental results on measured roundtrip delays in
   the Internet are discussed in [2].  A comprehensive mathematical
   treatment of the subject can be found in [1], while an interesting
   discussion on mutual-synchonization techniques can be found in [10].

   There are several ways accurate timestamps can be generated.  One is
   to provide at every service point an accurate, machine-readable clock
   synchronized to a central reference, such as the National Bureau of
   Standards (NBS).  Such clocks are readily available in several models
   ranging in accuracies of a few hundred milliseconds to less than a
Experiments in Network Clock Synchronization


   millisecond and are typically synchronized to special ground-based or
   satellite-based radio broadcasts.  While the expense of the clocks
   themselves, currently in the range $300 to $3000, can often be
   justified, all require carefully sited antennas well away from
   computer-generated electromagnetic noise, as well as shielded
   connections to the clocks.  In addition, these clocks can require a
   lengthy synchonization period upon power-up, so that a battery-backup
   power supply is required for reliable service in the event of power
   interruptions.

   If the propagation delays in the network are stable or can be
   predicted accurately, timestamps can be generated by a central
   server, equipped with a clock such as described above, in response to
   requests from remote service points.  However, there are many
   instances where the trans-network delay to obtain a timestamp would
   be intolerable, such as when timestamping a message before
   transmission.  In addition, propagation delays are usually not
   predictable with precisions in the order required, due to
   probabilistic queuing and channel-contention delays.

   In principle, a clock of sufficient accuracy can be provided at each
   service point using a stable, crystal-controlled clock which is
   corrected from time to time by messages from a central server.
   Suitable inexpensive, crystal-controlled clock interfaces are
   available for virtually any computer.  The interesting problem
   remaining is the design of the synchronization algorithm and protocol
   used to transmit the corrections.  In this document one such design
   will be described and its performance assessed.  This design has been
   incorprated as an integral part of the network routing and control
   protocols of the Distributed Computer Network (DCnet) architecture
   [5], clones of which have been established at several sites in the US
   and Europe.  These protocols have been in use since 1979 and been
   continuously tested and refined since then.

2.  Design of the Synchronization Algorithm

   The synchronization algorithm is distributed in nature, with protocol
   peers maintained in every host on the network.  Peers communicate
   with each other on a pairwise basis using special control messages,
   called Hello messages, exchanged periodically over the ordinary data
   links between them.  The Hello messages contain information necessary
   for each host to calculate the delay and offset between the local
   clock of the host and the clock of every other host on the network
   and are also used to drive the routing algorithm.

   The synchronization algorithm includes several features to improve
   the accuracy and stability of the local clock in the case of host or
Experiments in Network Clock Synchronization


   link failures.  In following sections the design of the algorithm is
   summarized.  Full design details are given in [5] along with a formal
   description of the Hello protocol.

2.1.  The Logical Clock

   In the DCnet model each service point, or host, is equipped with a
   hardware clock, usually in the form of an off-the-shelf interface.
   Using this and software registers, a logical clock is constructed
   including a 48-bit Clock Register, which increments at a 1000 Hz
   rate, a 32-bit Clock-Adjust Register, which is used to slew the Clock
   Register in response to raw corrections received over the net, and a
   Counter Register, which is used in some interface designs as an
   auxilliary counter.  The configuration and decimal point of these
   registers are shown in Figure 1.

           Clock Register

           0               16               32
           +---------------+---------------+---------------+
           |               |               |               |
           +---------------+---------------+---------------+
                                           A
                                     decimal point

           Clock-Adjust Register

                           0               16
                           +---------------+---------------+
                           |               |               |
                           +---------------+---------------+
                                           A
                                     decimal point

           Counter Register

                           0              16
                           +---------------+
                           |               |
                           +---------------+
                                           A
                                     decimal point

                       Figure 1. Clock Registers

   The Clock Register and Clock-Adjust Register are implemented in
   memory.  In typical clock interface designs such as the DEC KMV11-A
Experiments in Network Clock Synchronization


   the Counter Register is implemented in the interface as a buffered
   counter driven by a crystal oscillator.  A counter overflow is
   signalled by an interrupt, which results in an increment of the Clock
   Register at bit 15 and the propagation of carries as required.  The
   time of day is determined by reading the Counter Register, which does
   not disturb its counting process, and adding its value to that of the
   Clock Register with decimal points aligned.

   In other interface designs such as the simple LSI-11 event-line
   mechanism, each tick of the clock is signalled by an interrupt at
   intervals of 10, 16-2/3 or 20 ms, depending on interface and clock
   source.  When this occurs the appropriate number of milliseconds,
   expressed to 32 bits in precision, is added to the Clock Register
   with decimal points aligned.

   It should be noted at this point that great care in operating system
   design is necessary in order to preserve the full accuracy of
   timestamps with respect to the application program, which must be
   protected from pre-emption, excessive device latencies and so forth.
   In addition, the execution times of all sequences operating with the
   interrupt system disabled must be strictly limited.  Since the PDP11
   operating system most often used in the DCnet (the "Fuzzball"
   operating system) has been constructed with these considerations
   foremost in mind, it has been especially useful for detailed network
   performance testing and evaluation.  Other systems, in particular the
   various Unix systems, have not been found sufficiently accurate for
   this purpose.

   Left uncorrected, the host logical clock runs at the rate of its
   intrinsic oscillator, whether derived from a crystal or the power
   frequency.  The correction mechanism uses the Clock-Adjust Register,
   which is updated from time to time as raw corrections are received.
   The corrections are computed using roundtrip delays and offsets
   derived from the routing algorithm, described later in this document,
   which are relatively noisy compared to the precision of the logical
   clock.  A carefully designed smoothing mechansim insures stability,
   as well as isolation from large transients that occur due to link
   retransmissions, host reboots and similar disruptions.

Experiments in Network Clock Synchronization


2.2.  Linear Phase Adjustments

   The correction is introduced as a signed 32-bit integer in
   milliseconds.  If the magnitude of the correction is less than 128
   ms, the low-order 16 bits replaces bits 0-15 in the Clock-Adjust
   register. At suitable intervals, depending on the jitter of the
   intrinsic oscillator, the value of this register is divided by a
   fixed value, forming a quotient which is first added to the Clock
   Register, then subtracted from the Clock-Adjust Register.  This
   technique has several advantages:

      1.  The clock never runs backwards;  that is, successive
          timestamps always increase monotonically.

      2.  In the event of loss of correction information, the clock
          slews to the last correction received.

      3.  The rate of slew is proportional to the magnitude of the last
          correction.  This allows rapid settling in case of large
          corrections, but provides high stability in case of small
          corrections.

      4.  The sequence of computations preserves the highest precision
          and minimizes the propagation of round-off errors.

   Experience has indicated the choice of 256 as appropriate for the
   dividend above, which yields a maximum slew-rate magnitude less than
   0.5 ms per adjustment interval and a granularity of about 2.0
   microseconds, which is of the same order as the intrinsic tolerance
   of the crystal oscillators used in typical clock interfaces.  In the
   case of crystal-derived clocks, an adjustment interval of four
   seconds has worked well, which yields a maximum slew-rate magnitude
   of 125 microseconds per second.  In the case of power-frequency
   clocks or especially noisy links, the greatly increased jitter
   requires shorter adjustment intervals in the range of 0.5 second,
   which yields a maximum slew-rate magnitude of 1.0 ms per second.

   In most cases, independent corrections are generated over each link
   at intervals of 30 seconds or less.  Using the above choices a single
   sample error of 128 ms causes an error at the next sample interval no
   greater than about 7.5 ms with the longer adjustment interval and 30
   ms with the shorter.  The number of adjustment intervals to reduce
   the residual error by half is about 177, or about 12 minutes with the
   longer interval and about 1.5 minutes with the shorter.  This
   completely characterizes the linear dynamics of the mechanism.

Experiments in Network Clock Synchronization


2.3.  Nonlinear Phase Adjustments

   When the magnitude of the correction exceeds 128 ms, the possiblity
   exists that the clock is so far out of synchronization with the
   reference host that the best action is an immediate and wholesale
   replacement of Clock Register contents, rather than a graduated
   slewing as described above.  In practice the necessity to do this is
   rare and occurs when the local host or reference host is rebooted,
   for example. This is fortunate, since step changes in the clock can
   result in the clock apparently running backward, as well as incorrect
   delay and offset measurements of the synchronization mechanism
   itself.

   However, it sometimes happens that, due to link retransmissions or
   occasional host glitches, a single correction sample will be computed
   with magnitude exceeding 128 ms.  In practice this happens often
   enough that a special procedure has been incorporated into the
   design.  If a sample exceeding the limit is received, its value is
   saved temporarily and does not affect the Clock-Adjust Register.  In
   addition, a timer is initialized, if not already running, to count
   down to zero in a specified time, currently 30 seconds.

   If the timer is already running when a new correction sample with
   magnitude exceeeding 128 ms arrives, its value and the saved sample
   value are averaged with equal weights to form a new saved sample
   value. If a new correction sample arrives with magnitude less than
   128 ms, the timer is stopped and the saved sample value discarded.
   If the timer counts down to zero, the saved sample value replaces the
   contents of the Clock Register and the Clock-Adjust Register is set
   to zero.  This procedure has the effect that occasional spikes in
   correction values are discarded, but legitimate step changes are
   prefiltered and then used to reset the clock after no more than a
   30-second delay.

3.  Synchronizing Network Clocks

   The algorithms described in the previous section are designed to
   achieve a high degree of accuracy and stability of the logical clocks
   in each participating host.  In this section algorithms will be
   described which synchronize these logical clocks to each other and to
   standard time derived from NBS broadcasts.  These algorithms are
   designed to minimize the cumulative errors using linear synchronizing
   techniques. See [10] for a discussion of algorithms using nonlinear
   techniques.

Experiments in Network Clock Synchronization


3.1.  Reference Clocks and Reference Hosts

   The accuracy of the entire network of logical clocks depends on the
   accuracy of the device used as the reference clock.  In the DCnet
   clones the reference clock takes the form of a precision crystal
   oscillator which is synchronized via radio or satellite with the NBS
   standard clocks in Boulder, CO.  The date and time derived from the
   oscillator can be sent continuously or read upon command via a serial
   asynchronous line.  Stand-alone units containing the oscillator,
   synchronizing receiver and controlling microprocessor are available
   from a number of manufacturers.

   The device driver responsible for the reference clock uses its
   serial-line protocol to derive both an "on-time" timestamp relative
   to the logical clock of the reference host and an absolute time
   encoded in messages sent by the clock.  About once every 30 seconds
   the difference between these two quantities is calculated and used to
   correct the logical clock according to the mechanisms described
   previously.  The corrected logical clock is then used to correct all
   other logical clocks in the network.  Note the different
   nomenclature:  The term "reference clock" applies to the physical
   clock itself, while the term "reference host" applies to the logical
   clock of the host to which it is connected. Each has an individual
   address, delay and offset in synchronizing messages.

   There are three different commercial units used as reference clocks
   in DCnet clones.  One of these uses the LF radio broadcasts on 60 KHz
   from NBS radio WWVB, another the HF radio broadcasts on 5, 10 and 15
   MHz from NBS radio WWV or WWVH and the third the UHF broadcasts from
   a GOES satellite.  The WWVB and GOES clocks claim accuracies in the
   one-millisecond range.  The WWV clock claims accuracies in the 100-ms
   range, although actual accuracies have been measured somewhat better
   than that.

   All three clocks include some kind of quality indication in their
   messages, although of widely varying detail.  At present this
   indication is used only to establish whether the clock is
   synchronized to the NBS clocks and convey this information to the
   network routing algorithm as described later.  All clocks include
   some provision to set the local-time offset and propagation delay,
   although for DCnet use all clocks are set at zero offset relative to
   Universal Time (UT).  Due to various uncertaincies in propagation
   delay, serial-line speed and interrupt latencies, the absolute
   accuracy of timestamps derived from a reference host equipped with a
   WWVB or GOES reference clock is probably no better than a couple of
   milliseconds.

Experiments in Network Clock Synchronization


3.2.  Distribution of Timing Information

   The timekeeping accuracy at the various hosts in the network depends
   critically on the precision whith which corrections can be
   distributed throughout the network.  In the DCnet design a
   distributed routing algorithm is used to determine minimum-delay
   routes between any two hosts in the net.  The algorithms used, which
   are described in detail in [5] and only in outline form here, provide
   reachability and delay information, as well as clock offsets, between
   neighboring hosts by means of periodic exchanges of routing packets
   called Hello messages. This information is then incorporated into a
   set of routing tables maintained by each host and spread throughout
   the network by means of the Hello messages.

   The detailed mechanisms to accomplish these functions have been
   carefully designed to minimize timing uncertaincies.  For instance,
   all timestamping is done in the drivers themselves, which are given
   the highest priority for resource access.  The mechanism to measure
   roundtrip delays on the individual links is insensitive to the delays
   inherent in the processing of the Hello message itself, as well as
   the intervals between transmissions.  Finally, care is taken to
   isolate and discard transient timing errors that occur when a host is
   rebooted or a link is restarted.

   The routing algorithm uses a table called the Host Table, which
   contains for each host in the network the computed roundtrip delay
   and clock offset, in milliseconds.  In order to separately identify
   each reference clock, if there is more than one in the network, a
   separate entry is used for each clock, as well as each host.  The
   delay and offset fields of the host itself are set to zero, as is the
   delay field of each attached reference clock.  The offset field of
   each attached reference clock is recomputed periodically as described
   above.

   Hello messages containing a copy of the Host Table are sent
   periodically to each neighbor host via the individual links
   connecting them.  In the case of broadcast networks the Hello message
   is broadcast to all hosts sharing the same cable.  The Hello message
   also contains a timestamp inserted at the time of transmission, as
   well as information used to accurately compute the roundtrip delay on
   point-to-point links.

   A host receiving a Hello message processes the message for each host
   in turn, including those corresponding to the reference clocks.  It
   adds the delay field in the message to the previously determined
   roundtrip link delay and compares this with the entry already in its
   Host Table.  If the sum is greater than the delay field in the Host
Experiments in Network Clock Synchronization


   Table, nothing further is done.  If the sum is less, an update
   procedure is executed.  The update procedure, described in detail in
   [5], causes the new delay to replace the old and the routing to be
   amended accordingly.

   The update procedure also causes a new correction sample to be
   computed as the difference between the timestamp in the Hello message
   and the local clock, which is used to correct the local clock as
   described above.  In addition, the sum of this correction sample plus
   the offset field in the Hello message replaces the offset field in
   the Hello Table.  The effect of these procedures is that the local
   clock is corrected relative to the neighbor clock only if the
   neighbor is on the path of least delay relative to the selected
   reference clock.  If there is no route to the reference clock, as
   determined by the routing algorithm, no corrections are computable
   and the local clock free-runs relative to the last received
   correction.

   As the result of the operation of the routing algorithm, all local
   clocks in the network will eventually stabilize at a constant offset
   relative to the reference clock, depending upon the drift rates of
   the individual oscillators.  For most applications the offset is
   small and can be neglected.  For the most precise measurements the
   computed offset in the Host Table entry associated with any host,
   including the reference clock, can be used to adjust the apparent
   time relative to the local clock of that host.  However, since the
   computed offsets are relatively noisy, it is necessary to smooth them
   using some algorithm depending upon application.  For this reason,
   the computed offsets are provided separately from the local time.

4.  Experimental Validation of the Design

   The original DCnet was established as a "port expander" net connected
   to an ARPAnet IMP in 1978.  It was and is intended as an experimental
   testbed for the development of protocols and measurement of network
   performance.  Since then the DCnet network-layer protocols have
   evolved to include multi-level routing, logical addressing,
   multicasting and time synchronization.  Several DCnet clones have
   been established in the US and Europe and connected to the DARPA
   Internet system.  The experiments described below were performed
   using the DCnet clone at Linkabit East, near Washington, DC, and
   another at Ford Motor Division, near Detroit, MI.  Other clones at
   Ford Aerospace and the Universities of Maryland and Michigan were
   used for calibration and test, while clones at various sites in
   Norway and Germany were used for occasional tests.  Additional

Experiments in Network Clock Synchronization


   ARPANET gateways of the WIDEBAND/EISN satellite system were also
   included in the experiments in order to determine the feasability of
   synchronizing clocks across the ARPANET.

   There were four principal issues of interest in the experiments:

      1.  What are the factors affecting accuracy of a network of clocks
          using the power grid as the basic timing source, together with
          corrections broadcast from a central point?

      2.  What are the factors affecting accuracy of a network of clocks
          synchronized via links used also to carry ordinary data.

      3.  How does the accuracy of the various radio clocks - WWVB, GOES
          and WWV compare?

      4.  What is the best way to handle disruptions, such as a leap
          second?

   These issues will be discussed in turn after presentation of the
   experiment design and execution.

4.1.  Experiment Design

   Figure 2 shows the configuration used in a series of tests conducted
   during late June and early July of 1985.  The tests involved six
   hosts, three reference clocks and several types of communication
   links.  The tests were designed to coincide with the insertion of a
   leap second in the standard time broadcast by NBS, providing an
   interesting test of system stability in the face of such disruptions.
   The test was also designed to test the feasability of using the power
   grid as a reference clock, with corrections broadcast as described
   above, but not used to adjust the local clock.

Experiments in Network Clock Synchronization


      ARPAnet                              |
      - - - - - - - - - - - - - - - - - -  |  - - - - - - - - - -
                                       56K |
      +---------+     +---------+     +----+----+ 1.2 +---------+
      |   WWV   | 1.2 |         | 4.8 |         +-----+  WWVB   |
      |  radio  +-----+  DCN6   +-----+  DCN1   |async|  radio  |
      |  clock  |async|         |DDCMP|         +--+  |  clock  |
      +---------+     +---------+     +----+----+  |  +---------+
                       Ethernet            |       |
      DCnet     ===o===============o=======o===    | 1822/DH
                   |               |               |
              +----+----+     +----+----+     +----+----+
      power   |         |     |         |     |         |    power
      freq <--+  DCN3   |     |  DCN5   |     |  DCN7   +--> freq
      60 Hz   |         |     |         |     |         |    60 Hz
              +---------+     +----+----+     +---------+
                               9.6 |
      - - - - - - - - - - - - - -  |  - - - - - - - - - - - - - -
                                   | DDCMP
                              +----+----+     +---------+
                              |         | 1.2 |  GOES   |
      FORDnet                 |  FORD1  +-----+satellite|
                              |         |async|  clock  |
                              +---------+     +---------+

                    Figure 2. Network Configuration

   Only those hosts and links directly participating in the tests are
   shown in Figure 2.  All hosts shown operate using the DCnet protocols
   and timekeeping algorithms summarized in this document and detailed
   in [5].  The DCnet hosts operate as one self-contained net of the
   Internet systems, while the FORDnet hosts operate as another with
   distinct net numbers.  The gateway functions connecting the two nets
   are distributed in the DCN5 and FORD1 hosts and the link connecting
   them.  This means that, although the clock offsets of individual
   DCnet hosts are visible to other DCnet hosts and the clock offsets of
   individual FORDnet hosts are visible to other FORDnet hosts, only the
   clock offset of the gateway host on one net is visible to hosts on
   the other net.

   In Figure 2 the links are labelled with both the intrinsic speed, in
   kilobits per second, as well as the link protocol type.  The DDCMP
   links use microprocessor-based DMA interfaces that retransmit in case
   of message failure.  The 1822/DH link connecting DCN1 and DCN7
   operates at DMA speeds over a short cable.  The Ethernet link uses

Experiments in Network Clock Synchronization


   DMA interfaces that retransmit only in case of collisions.  The
   asynchronous links are used only to connect the reference clocks to
   the hosts over a short cable.

   While all hosts and links were carrying normal traffic throughout the
   test period, the incidence of retransmissions was very low, perhaps
   no more than a few times per day on any link.  However, the DDCMP
   link protocol includes the use of short control messages exhanged
   between the microprocessors about once per second in the absence of
   link traffic. These messages, together with retransmissions when they
   occur, cause small uncertaincies in Hello message delays that
   contribute to the total measurement error.  An additional uncertaincy
   (less than 0.5 per-cent on average) in Hello message length can be
   introduced when the link protocol makes use of character-stuffing or
   bit-stuffing techniques to achieve code transparency, such as with
   the LAPB link-level protocol of X.25.  However, the particular links
   used in the tests use a count field in the header, so that no
   stuffing is required.

   Although the timekeeping algorithms have been carefully designed to
   be insensitive to traffic levels, it sometimes happens that an
   intense burst of traffic results in a shortage of memory buffers in
   the various hosts.  In the case of the Ethernet interfaces, which
   have internal buffers, this can result in additional delays while the
   message is held in the interface pending delivery to the host.
   Conditions where these delays become significant occur perhaps once
   or twice a day in the present system and were observed occasionally
   during the tests.  As described above, the correction-sample
   processing incorporates a filtering procedure that discards the vast
   majority of glitches due to this and other causes.

4.2.  Experiment Execution

   The series of experiments conducted in late June and early July of
   1985 involved collecting data on the delays and offsets of the six
   hosts and three reference clocks shown in Figure 2.  In order to
   accomplish this, a special program was installed in a Unix 4.2bsd
   system connected to the Ethernet link but not shown in Figure 2.  The
   program collected each 128-octet Hello message broadcast from DCN1
   every 16 seconds and appended it bit-for-bit to the data base.  The
   total volume of raw data collected amounted to almost 0.7 megabyte
   per day.

   The raw Hello-message data were processed to extract only the
   timestamp and measured clock offsets for the hosts shown in Table 1
   and then reformatted as an ASCII file, one line per Hello message.

Experiments in Network Clock Synchronization


        Host    Clock   Drift   Experiment Use
        Name    ID      (ppm)
        ------------------------------------------------------
        DCN1    WWVB    -2.5    WWVB reference host
        DCN3    -       60-Hz   power-grid (unlocked)
        DCN5    DCN1    6.8     Ethernet host
        DCN6    DCN1    -1.7    DDCMP host, WWV reference host
        DCN7    DCN1    60-Hz   power-grid (locked)
        FORD1   GOES    17.9    GOES reference host
        WWV     -       -       WWV reference clock
        WWVB    -       -       WWVB reference clock

                       Table 1. Experiment Hosts

   In Table 1 the Clock ID column shows the reference host selected as
   the master clock for each host shown.  In this particular
   configuration host DCN1 was locked to host WWVB, while hosts DCN5,
   DCN6 and DCN7 were locked to DCN1.  Although the offset of GOES can
   not be directly determined from the Hello messages exchanged between
   DCnet and FORDnet hosts, the offset of FORD1 relative to GOES was
   determined by observation to be in the order of a millisecond, so for
   all practical purposes the offset of FORD1 represents the offset of
   GOES.  In addition, since the WWVB clock was considered by experience
   the most accurate and reliable and the offset of DCN1 relative to
   WWVB was negligible, DCN1 was considered the reference clock with
   offset zero relative to the NBS clocks.

   During the setup phase of the experiments the intrinsic drift rates
   of the crystal oscillators in the four hosts DCN1, DCN5, DCN6 and
   FORD1 equipped with them was measured as shown in the "Drift" column
   in Table 1.  The two hosts DCN3 and DCN7 are equipped with
   line-frequency clocks. For experimental purposes DCN3 was unlocked
   and allowed to free-run at the line-frequency rate, while DCN7
   remained locked.

   An ASCII file consisting of about 0.2 megabyte of reformatted data,
   was collected for each Universal-Time (UT) day of observation
   beginning on 28 June and continuing through 8 July.  Each file was
   processed by a program that produces an eight-color display of
   measured offsets as a function of time of observation.  Since the
   display technique uses a bit-map display and each observation
   overwrites the bit-map in an inclusive-OR fashion, the sample
   dispersion is immediately apparent. Over eight samples per pixel on
   the time axis are available in a 24-hour collection period.  On the
   other hand, the fine granularity of almost four samples per minute
   allows zooming the display to focus on interesting short-term
   fluctuations, such as in the case of the WWV clock.

Experiments in Network Clock Synchronization


4.3.  Discussion of Results

   Each of the four previously mentioned issues of interest will be
   discussed in following subsections.

4.3.1.  On Power-Grid Clocks

   Telephone interviews with operators and supervisors of the Potomac
   Electric Power Company (PEPCO), the electric utility serving the
   Washington, DC, area, indicate that there are three major operating
   regions or grids, one east of the Rockies, a second west of the
   Rockies and a third in parts of Texas.  The member electric utilities
   in each grid operate on a synchronous basis, so that clocks anywhere
   within the grid should keep identical time.  However, in the rare
   case when a utility drops off the grid, no attempt is made to
    
re-establish correct time upon rejoining the grid. In the much more

EID 6870 (Verified) is as follows:

Section: 4.3.1.

Original Text:

...
re-establish correct time upon rejoining the grrd. In the much more
...

Corrected Text:

...
re-establish correct time upon rejoining the grid. In the much more
...
Notes:
The word grid is misspelled as grrd.
common case when areas within the grid are isolated due to local thunderstorms, for example, clock synchronization is also disrupted. The experiments provided an opportunity to measure with exquisite precision the offset between a clock connected to the eastern grid (DCN3) and the NBS clocks. The results, confirmed by the telephone interviews, show a gradual gain in time of between four and six seconds during the interval from about 1700 local time to 0800 the next morning, followed by a more rapid loss in time between 0800 and 1700. If the time was slewed uniformly throughout these extremes, the rate would be about 100 ppm. The actual slewing rates depend on the demand, which itself is a function of weather, day of the week and season of the year. Similar effects occur in the western and Texas grids, with more extreme variations in the Texas grid due to the smaller inertia of the system, and less extreme variations in the western grid, due to smaller extremes in temperature, less total industrial demand and a larger fraction of hydro-electric generation. The uilities consider timekeeping a non-tariffed service provided as a convenience to the customer. In the eastern grid a control station in Ohio manually establishes the baseline system output, which indirectly affects the clock offset and slewing rate. The local time is determined at the control station with respect to a WWVB radio clock. The maximum slewing rate is specified as .025 Hz (about 400 ppm), which is consistent with the maximum rates observed. In the western grid the baseline system output is adjusted automatically using a servomechanism driven by measured offsets from the NBS clocks. Experiments in Network Clock Synchronization Given the considerations above, it would seem feasable for hosts to synchronize logical clocks to a particular power grid, but only if corrections were transmitted often enough to maintain the required accuracy and these corrections were delivered to the hosts essentially at the same time. Assuming a worst-case 400-ppm slewing rate and one minute between correction broadcasts, for example, it would in principle be possible to achieve accuracies in the 20-ms range. There are a number of prediction and smoothing techniques that could be used to inhance accuracy and reduce the overhead of the broadcasts. Host DCN3, which uses a line-frequency clock interface, was unlocked during the experiment period so that the offset between the PEPCO clock, which is locked to the eastern power grid, could be measured with respect to the reference host DCN1. Host DCN7, which uses the same interface, remained locked to DCN1. In spite of the previously noted instability of the power grid, DCN7 remained typically within 30 ms of DCN1 and only infrequently exceeded 100 ms in the vicinity of large changes in system load that occured near 0800 and 1700 local time. Over the seven-day period from 2 July through 8 July the mean offset was less than a millisecond with standard deviation about 24 ms, while the maximum was 79 ms and minimum -116 ms. Experiments were also carried out using ICMP Timestamp messages with hosts known to use line-frequency clock interfaces in California, Norway and Germany. The results indicated that the western power grid is rather more stable than the eastern grid and that the overseas grids are rather less stable. In the Oslo, Munich and Stuttgart areas, for example, the diurnal variation was observed to exceed ten seconds. 4.3.2. On Clocks Synchronized via Network Links As mentioned previously, all network links used to synchronize the clocks were carrying normal data traffic throughout the experiment period. It would therefore be of interest to investigate how this affects the accuracy of the individual clocks. Table 2 summarizes the mean and standard deviation of the measured offsets between the WWVB radio clock and various hosts as shown in Figure 2. Measurements were made over the 24-hour period for each of several days during the experimental period. Each entry shown in Table 2 includes the mean of the statistic over these days, together with the maximum variation. Experiments in Network Clock Synchronization Host Mean Deviation Link Type and Speed ----------------------------------------------------------- DCN1 .08/.02 0.53/.02 WWVB radio clock (1200 bps) DCN5 -13.61/.04 1.1/0.4 Ethernet (10 Mbps) DCN6 0.27/.18 5.8/1.0 DDCMP (4800 bps) FORD1 38.5/1.6 2.5/0.5 DDCMP (9600 bps) Table 2. Link Measurements The departure of the mean shown in Table 2 from zero is related to the drift of the crystal oscillator used in the hardware interface (see Table 1). As described previously, FORD1 was synchonized to the GOES radio clock with neglible offset, so that the mean and standard deviation shown can be accurately interpreted to apply to the GOES radio clock as well. The results show that the uncertaincies inherent in the synchronization algorithm and protocols is in the same order as that of the reference clocks and is related to the speed of the links connected the reference hosts to the other hosts in the network. Further discussion on the FORD1/GOES statistics can be found in the next section. Further insight into the error process can be seen in Table 3, which shows the first derivative of delay. Host Dev Max Min Error ------------------------------------- DCN3 2.3 12 -17 10 DCN5 1.5 45 -45 5 DCN6 9 94 -54 40 DCN7 1.4 6 -7 5 FORD1 3.4 68 -51 15 Table 3. First Derivative of Delay The mean and standard deviation of delay were computed for all hosts on a typical day during the experimental period. In all cases the magnitude of the mean was less than one. The standard deviation, maximum and minimum for each link is summarized by host in Table 3. A common characteristic of the distribution in most cases was that only a handful of samples approached the maximum or minimum extrema, while the vast majority of samples were much less than this. The "Error" colum in Table 3 indicates the magnitude of the estimated error when these extrema are discarded. Experiments in Network Clock Synchronization A very interesting feature of the observations was the unexpectedly low standard deviation of DCN3, which was locked to the power grid and thus would be expected to show wide variations. Upon analysis, this turned out to be a natural consequence of the fact that the Hello messages are generated as the result of interrupts based on the line frequency when the local clock had just been incremented by 1/60th of a second. The synchronizing protocol and implementation were carefully constructed to minimize the loss of accuracy due to sharing of the network links between data and control traffic, as long as sufficient resources (in particular, packet buffers) are available. Since the various network links shown in Figure 2 operate over a wide range of rates, it is possible that undisciplined bursts of traffic can swamp a host or gateway and precipitate a condition of buffer starvation. While most hosts using paths through the experimental configuration were relatively well-disciplined in their packetization and retransmission policies, some Unix 4.2bsd systems were notorious exceptions. On occasion these hosts were observed sending floods of packets, with only a small amount of data per packet, together with excessive retransmissions. As expected, this caused massive congestion, unpredictable link delays and occasional clock synchronizing errors. The synchronizing algorithms described above successfully cope with almost all instances of congestion as described, since delay-induced errors tend to be isolated, while inherent anti-spike and smoothing properties of the synchronizing algorithm help to preserve accuracies in any case. Only one case was found during the ten-day experiment period where a host was mistakenly synchronized outside the linear-tracking window due to congestion. Even in this case the host was quickly resynchronized to the correct time when the congestion was cleared. 4.3.3. On the Accuracy of Radio Clocks One of the more potent motivations for the experiments was to assess the accuracy of the various radio clocks and to determine whether the WWV radio clock was an appropriate replacement for the expensive WWVB or GOES clocks. A secondary consideration, discussed further in the next section, was how the various clocks handled disruptions due to power interruptions, leap seconds and so forth. Experiments in Network Clock Synchronization 4.3.3.1. The Spectracom 8170 WWVB Radio Clock As the result of several years of experience with the WWVB radio clock, which is manufactured by Spectracom Corporation as Model 8170, it was chosen as the reference for comparison for the GOES and WWV radio clocks. Washington, DC, is near the 100-microvolt/meter countour of the WWVB transmitter at Boulder, CO, well in excess of the 25-microvolt/meter sensitivity of the receiver. The antenna is located in a favorable location on the roof of a four-storey building in an urban area. Using the data from the instruction manual, the propagation delay for the path from Boulder to Washington is about 8 ms, while the intrinsic receiver delay is about 17 ms. The clock is read via a 1200-bps asynchronous line, which introduces an additional delay of about 7 ms between the on-time transition of the first character and the interrupt at the middle of the first stop bit. Thus, the WWVB radio clock indications should be late by 8 + 17 + 7 = 32 ms relative to NBS standard time. While it is possible to include this delay directly in the clock indication, this was not done in the experiments. In order to account for this, 32 ms should be subtracted from all indications derived from this clock. The uncertaincy in the indication due to all causes is estimated to be a couple of milliseconds. 4.3.3.2. The True Time 468-DC GOES Radio Clock The GOES radio clock is manufactured by True Time Division of Kinemetrics, Incorporated, as Model 468-DC. It uses the Geosynchronous Orbiting Environmental Satellite (GOES), which includes an NBS-derived clock channel. Early in the experiment period there was some ambiguity as to the exact longitude of the satellite and also whether the antenna was correctly positioned. This was reflected in the rather low quality-of-signal indications and occasional signal loss reported by the clock and also its apparent offset compared with the other radio clocks. Table 4 shows a summary of offset statistics for the GOES radio clock by day (all day numbers refer to July, 1985). Experiments in Network Clock Synchronization Day Mean Dev Max Min ------------------------------------ 2 31.6 9.4 53 -76 3 19.8 22.1 53 -64 4 42.8 17.1 >150 19 5 39.3 2.2 54 -45 6 37.8 2.7 53 19 7 62.2 13.0 89 22 8 38.2 2.8 90 -7 Table 4. GOES Radio Clock Offsets On all days except days 5, 6 and 8 long periods of poor-quality signal reception were evident. Since the antenna and satellite configuration are known to be marginal, these conditions are not considered representative of the capabilities of the clock. When the data from these days are discarded, the mean offset is 38.4 ms with standard deviation in the range 2.2 to 2.8. The maximum offset is 90 ms and the minimum is -45 ms; however, only a very small number of samples are this large - most excursions are limited to 10 ms of the mean. In order to compute the discrepancy between the GOES and WWVB clocks, it is necessary to subtract the WWVB clock delay from the mean offsets computed above. Thus, the GOES clock indications are 38.4 - 32 = 6.4 ms late with respect to the WWVB clock indications. which is probably within the bounds of experiment error. 4.3.3.3. The Heath GC-1000 WWV Radio Clock The WWV radio clock is manufactured by Heath Company as Model GC-1000. It uses a three-channel scanning WWV/WWVH receiver on 5, 10 and 15 MHz together with a microprocessor-based controller. The receiver is connected to an 80-meter dipole up about 15 meters and located in a quiet suburban location. Signal reception from the Fort Collins transmitters was average to poor during the experiment period due to low sunspot activity together with a moderate level of geomagnetic disturbances, but was best during periods of darkness over the path. The clock locked at one of the frequencies for varying periods up to an hour from two to several times a day. The propagation delay on the path between Fort Collins and Washington is estimated at about 10 ms and can vary up to a couple of milliseconds over the day and night. While it is possible to include this delay in the clock indications, which are already corrected for Experiments in Network Clock Synchronization the intrinsic receiver delay, this was not done in the experiments. During periods of lock, the clock indications are claimed to be accurate to within 100 ms. Table 5 shows a summary of offset statistics for the WWV radio clock by day (all day numbers refer to July, 1985). Day Mean Dev Max Min ------------------------------------ 2 -31 36 110 -119 3 -42 38 184 -141 4 -21 38 61 -133 5 -31 37 114 -136 6 -48 42 53 -160 7 -100 80 86 -315 8 -71 70 115 -339 Table 5. WWV Radio Clock Offsets On inspection of the detailed plots of offsets versus time the data reveal an interesting sawtooth variation with period about 25 cycles per hour and amplitude about 90 ms. Once the clock has locked for some time the variation decreases in frequency and sometimes disappears. This behavior is precisely what would be expected of a phase-locked oscillator and accounts for the rather large standard deviations in Table 5. On inspection of the plots of offsets versus time, it is apparent that by far the best accuracies are obtained at or in the periods of lock, which is most frequent during periods of darkness over the propagation path, which occured roughly between 0800 UT and 1100 UT during the experiment period. Excluding all data except that collected during this period, the mean offset is -21.3 ms with standard deviation in the range 29-31. The maximum offset is 59 ms and the minimum is -118 ms. In order to compute the discrepancy between the WWV and WWVB clocks, it is necessary to subtract the total of the propagation delay plus WWVB clock delay from the mean offsets computed above. Thus, the WWV clock indications are -21.3 - 10 - 32 = -72.3 ms late (72.3 ms early) with respect to the WWVB clock indications. Considering the large standard deviations noted above, it is probably not worthwhile to include this correction in the WWV clock indications. On exceptional occasions excursions in offset over 300 ms relative to the WWVB clock were observed. Close inspection of the data showed that this was due to an extended period (a day or more) in which lock Experiments in Network Clock Synchronization was not achieved on any frequency. The master oscillator uses a 3.6-MHz crystal oscillator trimmed by a digital/analog converter and register which is loaded by the microprocessor. The occasional excursions in offset were apparently due to incorrect register values as the result of noisy reception conditions and excessive intervals between lock. On occasion the oscillator frequency was observed in error over 4 ppm due to this cause, which could result in a cumulative error of almost 400 ms per day if uncorrected. 4.3.4. On Handling Disruptions The experiment period was intentionally selected to coincide with the insertion of a leap second in the worldwide time broadcasts. The intent was to examine the resulting behavior of the various radio clocks and the synchronization algorithm when an additional second was introduced at 2400 UT on 30 June. As it turned out, radio reception conditions at the time of insertion were quite poor on all WWV frequencies, the WWVB frequency and the GOES frequency. Thus, all three clocks took varying periods up to several hours to resynchonize and correct the indicated time. In fact, the only time signals heard around the time of interest were those from Canadian radio CHU, but the time code of the Canadian broadcasts is incompatible with the of the US broadcasts. As mentioned above, the WWVB clock was used as the master during the experiment period. About two hours after insertion of the leap second the clock resynchronized and all hosts in the experimental network were corrected shortly afterwards. Since the magnitude of the correction exceeded 128 ms, the correction was of a step nature, but was not performed simultaneously in all hosts due to the individual timing of the Hello messages. Thus, if timing-critical network operations happened to take place during the correction process, inconsistent timestamps could result. The lesson drawn from this experience is quite clear. Accurate time synchronization requires by its very nature long integration times, so that epochal events which disrupt the process must be predicted in advance and applied in all hosts independently. In principle, this would not be hard to do and could even be integrated into the operation of the step-correction procedure described earlier, perhaps in the form of bits included in Hello messages which trigger a one-second correction at the next rollover from 2400 to 0000 hours. In order for such an out-of-band correction to be effective, advance notice of the leap second must be available. At present, this information is not available in the broadcast format and must be Experiments in Network Clock Synchronization obtained via the news media. In fact, there are spare bits in the broadcast format that could be adapted for this purpose, but this would require reprogramming both the transmitting and receiving equipment. Nevertheless, this feature should be considered for future systems. 4.4. Additional Experiments A set of experiments was performed using two WIDEBAND/EISN gateways equipped with WWVB radio clocks and connected to the ARPANET. These experiments were designed to determine the limits of accuracy when comparing these clocks via ARPANET paths. One of the gateways (ISI-MCON-GW) is located at the Information Sciences Institute near Los Angeles, while the other (LL-GW) is located at Lincoln Laboratories near Boston. Both gateways consist of PDP11/44 computers running the EPOS operating system and clock-interface boards with oscillators phase-locked to the WWVB clock. The clock indications of the WIDEBAND/EISN gateways were compared with the DCNet WWVB reference clock using ICMP Timestamp messages [6], which record the individual timestamps with a precision of a millisecond. This technique is not as accurate as the one described in Section 3, since the protocol implementation involves the user-process level, which can be subject to minor delays due to process scheduling and interprocess-message queueing. However, calibration measurements made over several of the links shown in Figure 2 indicate that the measurement errors are dominated by the individual link variations and not by the characteristics of the measurement technique itself. Measurements were made separately with each gateway by sending an ICMP Timestamp Request message from the ARPANET address of DCN1 to the ARPANET address of the gateway and computing the round-trip delay and clock offset from the ICMP Timestamp Reply message. This process was continued for 1000 message exchanges, which took about seven minutes. Table 6 shows the statistics obtained with ISI-MCON-GW and Table 7 those with LL-GW (all numbers are milliseconds). Experiments in Network Clock Synchronization ISI-MCON-GW Mean Dev Max Min -------------------------------------------- Offset -16 40 126 -908 Delay 347 59 902 264 Table 6. ISI-MCON-GW Clock Statistics LL-GW (a) Mean Dev Max Min -------------------------------------------- Offset -23 15 32 -143 Delay 310 25 536 252 Table 7. LL-GW Clock Statistics The smaller values of standard deviation and extreme for LL-GW are probably due to the shorter ARPANET path involved. The confidence in the mean offset can be estimated by dividing the standard deviation by the square root of the number of samples (1000), which suggests that the mean offsets are accurate to within a couple of miliseconds. The mean offsets of the WIDEBAND/EISN clocks as a group relative to the DCN1 clock may thus indicate a minor discrepancy in the setting of the delay-compensation switches. It is well known that ARPANET paths exhibit wide variations in delays, with occasional delays reaching surprising values up to many seconds. In order to improve the estimates a few samples were removed from both the offset and delay data, including all those with magnitude greater than one second. The above experiments involve a burst of activity over a relatively short time during which the ratio of the measurement traffic to other network traffic may be nontrivial. Another experiment with LL-GW was designed with intervals of ten seconds between ICMP messages and operated over a period of about three hours. The results are shown in Table 8. LL-GW (b) Mean Dev Max Min -------------------------------------------- Offset -16 93 990 -874 Delay 371 108 977 240 Table 8. LL-GW Clock Statistics Experiments in Network Clock Synchronization Note that the standard deviations and extrema are higher than in the previous experiments, but the mean offset is about the same. The results of these experiments suggest that time synchronization via ARPANET paths can yield accuracies to the order of a few milliseconds, but only if relatively large numbers of samples are available. The number of samples can be reduced and the accuracy improved by using the techniques of Section 3 modified for ICMP Timestamp messages and the longer, more noisy paths involved. 5. Summary and Conclusions The experiments described above were designed to verify the correct operation of the DCnet time-synchronization algorithms and protocols under a variety of scenarios, including the use of line-frequency clocks, three types of radio clocks and various types of interprocessor links. They involved the collection and processing of many megabytes of data collected over a ten-day period that included the insertion of a leap second in the standard NBS time scale. Among the lessons learned were the following: 1. The algorithms and protocols operate as designed, yielding accuracies throughout the experimental net in the order of a few milliseconds to a few tens of milliseconds, depending on the topology and link type. 2. Glitches due to congestion, rebooted hosts and link failures are acceptably low, even in the face of massive congestion resulting from inappropriate host implementations elsewhere in the Internet. 3. A synchronization scenario where the clocks in all hosts are locked to the line frequency and corrections are broadcast from a central time standard will work only if all hosts are on the same power grid, which is unlikely in the present Internet configuration, but may be appropriate for some applications. 4. In spite of the eastern power grid wandering over as much as six seconds in a day, it is possible to achieve accuracies in the 30-ms range using line-frequency interface clocks and corrections broadcast on the local net. 5. Radio clocks can vary widely in accuracy depending on signal reception conditions. Absolute time can be determined to within a couple of milliseconds using WWVB and GOES radio clocks, but only if they are calibrated using an independent Experiments in Network Clock Synchronization standard such as a portable clock. The inexpensive WWV clocks perform surprisingly well most of the time, but can be in error up to a significant fraction of a second under some conditions. 6. Adjustments in the time scale due to leap seconds must be anticipated before they occur. The synchronization protocol must include a mechanism to broadcast an adjustment in advance of its occurance, so that it can be incorporated in each host simultaneously. There is a need to incorporate advance notice of leap seconds in the broadcast time code. 7. Time synchronization via ARPANET paths can yield accuracies in the order of a few milliseconds, but only if relatively large numbers of samples are available. Further work is needed to develop efficient protocols capable of similar accuracies but using smaller numbers of samples. 6. References 1. Lindsay, W.C., and A.V. Kantak. Network Synchronization of Random Signals. IEEE Trans. Comm. COM-28, 8 (August 1980), 1260-1266. 2. Mills, D.L. Time Synchronization in DCNET Hosts. DARPA Internet Project Report IEN-173, COMSAT Laboratories, February 1981. 3. Mills, D.L. DCNET Internet Clock Service. DARPA Network Working Group Report RFC-778, COMSAT Laboratories, April 1981. 4. Mills, D.L. Internet Delay Experiments. DARPA Network Working Group Report RFC-889, M/A-COM Linkabit, December 1983. 5. Mills, D.L. DCN Local-Network Protocols. DARPA Network Working Group Report RFC-891, M/A-COM Linkabit, December 1983. 6. Postel, J. Internet Control Message Protocol. DARPA Network Working Group Report RFC-792, USC Information Sciences Institute, September 1981. 7. Postel, J. Time Protocol. DARPA Network Working Group Report RFC-868, USC Information Sciences Institute, May 1983. 8. Postel, J. Daytime Protocol. DARPA Network Working Group Report RFC-867, USC Information Sciences Institute, May 1983. Experiments in Network Clock Synchronization 9. Su, Z. A Specification of the Internet Protocol (IP) Timestamp Option. DARPA Network Working Group Report RFC-781. SRI International, May 1981. 10. Marzullo, K., and S. Owicki. Maintaining the Time in a Distributed System. ACM Operating Systems Review 19, 3 (July 1985), 44-54. 11. Mills, D.L. Algorithms for Synchronizing Network Clocks. DARPA Network Working Group Report RFC-956, M/A-COM Linkabit, September 1985. 12. Mills, D.L. Network Time Protocol (NTP). DARPA Network Working Group Report RFC-958, M/A-COM Linkabit, September 1985.