Written by: Eirik Norrheim Larsen (Development Engineer)
In a world where self-driving cars and autonomous systems are set to take over human operation, there is significant pressure on sensors and data collection. These systems must be able to gather high-resolution data at a high frequency, which places immense demands on the system’s network, bandwidth, and processing power. Achieving high-precision environmental models requires excellent synchronization between sensors. As safety systems in autonomous vehicles continue to improve, vehicle speeds will likely increase as well. The higher the speed, the more crucial well-synchronized systems become, as even a millisecond could have fatal consequences.
There are many stages between a sensor collecting data and a system performing an action. Systems range from just a few sensors and a processing unit to large, complex installations spanning wide areas. A processing unit typically handles tasks such as sensor data filtering, manipulation, data fusion, neural networks, path planning, and much more. All of this can occur in systems operating at rates up to 20 Hz, and likely beyond as development continues. If the data coming from the sensors is not synchronized, data fusion might take longer, the fused model might be incorrect, or fusion might be impossible altogether. In the worst case, this could have fatal consequences for an autonomous system, highlighting the critical importance of high-precision synchronization for all components.
A real-time system operates within strict time frames. In a user application on a computer, time is important, but not critical. If a program is very slow, warnings might appear, and users might lose interest in using it. This can happen on a regular computer due to background processes occupying resources, and a program opening 10ms or even 100ms later than planned might not be a significant issue. In a real-time system, time is not just important – it is absolutely critical. If a real-time system misses its deadlines, it results in an error, a system failure. These deadlines vary significantly based on system requirements but are often under 1ms, making highly synchronized components a necessity.
PTP is a message-based system used over Ethernet. Ethernet is one of the most commonly used communication protocols in a system due to its widespread usage, speed, stability, ample bandwidth, and relatively easy scalability to add more devices. Every device in the system must be PTP compatible, meaning that PTP is a standard in the hardware and cannot be added later, with a minor exception for computers. Computers must have PTP-compatible network interface cards (NICs), but they don’t necessarily need hardware-integrated PTP functionality because PTP can run over software on most computers. This is known as “software timestamping”, but it is mostly impractical in a PTP system due to its relative imprecision compared to hardware integration.
PTP comes in two versions in common use today: PTPv2 (IEEE 1588) and gPTP (IEEE 802.1AS). A further variant, called “The White Rabbit Project,” is under development led by CERN; it offers even higher synchronization precision, reaching down into the sub-nanosecond range. PTPv2 is the general-purpose type and the successor to the first PTP version. gPTP is intended for the autonomous industry, featuring a slightly simplified message flow with a more fixed configuration. PTP messages are sent at a low level in the network stack (Layer 2/3), contain little overhead, and require minimal processing to synchronize with. However, this only applies to devices with hardware-integrated PTP.
In a PTP system, there exists a GM (Grand Master), which serves as the reference clock for the entire system. All other clocks in the system synchronize with this clock. Most devices with PTP capability have the potential to act as a GM. A GM can be statically assigned as a system configuration or selected based on the “Best Master Clock Algorithm” (BMCA). BMCA is a messaging process that determines which device in the system has the most precise on-board clock. Clock vendors assign each clock a quality ranking based on its precision, and BMCA compares these rankings. If a new device is connected to the network or an existing GM goes offline, BMCA can be rerun, and a new GM will be chosen. BMCA significantly enhances system redundancy. gPTP does not use BMCA, so in that case the GM must be configured statically.
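The heart of BMCA is a field-by-field comparison of the clock-quality datasets that candidates advertise in their Announce messages. A minimal sketch of that ordering, with hypothetical example values, could look like this; lower values win in every field, so a simple tuple comparison captures the ranking:

```python
# Sketch of the Best Master Clock Algorithm's core comparison.
# BMCA ranks candidates field by field: priority1, clockClass,
# clockAccuracy, offsetScaledLogVariance, priority2, and finally the
# unique clockIdentity as a tie-breaker. All values below are illustrative.

def bmca_rank(clock):
    """Return the comparison key for one Announce-message dataset."""
    return (
        clock["priority1"],                # user-configurable override
        clock["clockClass"],               # traceability, e.g. GPS-locked
        clock["clockAccuracy"],            # vendor-stated accuracy enum
        clock["offsetScaledLogVariance"],  # clock stability estimate
        clock["priority2"],                # second user-configurable field
        clock["clockIdentity"],            # unique ID, final tie-breaker
    )

candidates = [
    # A free-running default clock:
    {"priority1": 128, "clockClass": 248, "clockAccuracy": 0xFE,
     "offsetScaledLogVariance": 0xFFFF, "priority2": 128,
     "clockIdentity": "00:1b:19:00:00:01"},
    # A clock locked to a primary reference (better class and accuracy):
    {"priority1": 128, "clockClass": 6, "clockAccuracy": 0x21,
     "offsetScaledLogVariance": 0x4E5D, "priority2": 128,
     "clockIdentity": "00:1b:19:00:00:02"},
]

grandmaster = min(candidates, key=bmca_rank)
print(grandmaster["clockIdentity"])  # the reference-locked clock wins
```

Because `priority1` is compared first, an operator can force any device to win the election simply by configuring it with a lower `priority1` value.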
When a system exceeds a handful of devices, signals often need to be split due to port limitations on the GM device. There are primarily three ways to split a signal from the GM: using a regular switch, a PTP transparent switch, or a Boundary Clock. A typical Ethernet switch merely sends messages back and forth between devices. The switch receives packets, buffers them, and sends them to the appropriate port, treating PTP packets like any other packet. A PTP transparent switch, however, recognizes the PTP protocol and can prioritize packets accordingly. In a standard switch under heavy load, a PTP packet might be buffered for a long time before being transmitted, leading to poor synchronization. Besides prioritizing PTP messages for quick handling, the PTP transparent switch also measures the time the packet spends buffered on the switch. This residence time is included in a field of the PTP message called the “correction field,” making the calculation of “path delay” more accurate.
The last type of signal-splitting device is the Boundary Clock. It is a full-fledged PTP member and has its own high-precision clock on board. It receives PTP packets on its slave port, the port facing the GM, and synchronizes its own clock accordingly. The master/slave roles of the ports are assigned via BMCA and can change while the system is running, such as when a new GM is selected. On its master ports, it sends out PTP packets just like an ordinary GM, making a Boundary Clock a sort of local GM. The advantage of a Boundary Clock is that PTP messages are prioritized similarly to a transparent switch, but unlike a transparent switch, the GM only needs to send it one message, which can relieve message traffic on the GM. Boundary Clocks tend to be more expensive than transparent switches and are significantly pricier than regular switches.
In a standard PTP system, a total of four messages are primarily sent in a cycle between the master (GM) and the slave. The first message is a “sync” message sent from the master to the slave. The slave logs the time it receives the sync message. Following that is a “follow up” message containing the time the sync message was sent from the master. This method is called a “two-step” variant of PTP and is the most common and precise synchronization method. There is also a “one-step” variant where both sync and transmission times are in the same message, but it is less precise and therefore less used. These two messages provide a specific offset between the master and slave, but they lack information about the time it takes for messages to travel from the master to the slave.
The next two messages are used to calculate the “path delay,” which refers to the time it takes to transmit a message between the master and slave. This is done in the same message cycle after the slave receives the “follow up” message. The slave then sends a “delay request” message to the master while logging the time the message is sent. This message is just a notification, and the master logs the time it receives this message. Then, the master sends this time back to the slave in a message called “delay response.” With these timestamps, time differences and both offset and delay between master and slave can be calculated, as shown in Figure 2.
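The four timestamps from one message cycle plug into the standard PTP equations, which assume the network path is symmetric. A minimal sketch with made-up nanosecond values:

```python
# The four timestamps of one two-step PTP cycle (nanoseconds, hypothetical):
#   t1 = master sends Sync        t2 = slave receives Sync
#   t3 = slave sends Delay_Req    t4 = master receives Delay_Req
# Assuming a symmetric path, the standard PTP equations are:
#   path_delay = ((t2 - t1) + (t4 - t3)) / 2
#   offset     = ((t2 - t1) - (t4 - t3)) / 2

def offset_and_delay(t1, t2, t3, t4):
    master_to_slave = t2 - t1  # contains path delay + slave offset
    slave_to_master = t4 - t3  # contains path delay - slave offset
    path_delay = (master_to_slave + slave_to_master) / 2
    offset = (master_to_slave - slave_to_master) / 2
    return offset, path_delay

# Example: the slave clock runs 500 ns ahead, the real path delay is 1000 ns.
offset, delay = offset_and_delay(t1=0, t2=1500, t3=10_000, t4=10_500)
print(offset, delay)  # 500.0 1000.0
```

Note how the offset cancels out of the delay equation and the delay cancels out of the offset equation; this is exactly why both message pairs are needed.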
The message cycle has a regular interval, usually occurring approximately every two seconds, but this can be adjusted in some cases. If the components in the system have poor clocks that drift easily, more frequent synchronizations are necessary. Clock adjustment is achieved by altering the clock frequency rather than setting the time forward or backward. Changing the clock speed avoids potential crashes of applications and inaccurate timestamps of data. A control technique is required to adjust the speed since the setpoint, i.e., the corrected time, continuously changes. The most standard way to regulate this is by using a PI controller, although techniques like linear regression are also commonly used.
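The PI-controller idea can be sketched in a few lines: each measured offset is turned into a small frequency correction instead of a step change. The gains and the simulated drift below are illustrative, not the defaults of any real PTP daemon:

```python
# Toy PI servo in the style PTP daemons use to discipline a clock:
# instead of stepping the time, each measured offset nudges the clock
# frequency. Gains and simulation values are illustrative only.

def make_pi_servo(kp=0.7, ki=0.3):
    integral = 0.0
    def step(offset_ns):
        nonlocal integral
        integral += ki * offset_ns
        # Frequency correction in parts-per-billion: a positive offset
        # (slave ahead of the master) slows the clock down.
        return -(kp * offset_ns + integral)
    return step

# Simulate a slave clock drifting +200 ppb, sampled once per second
# (over 1 s, a drift in ppb adds roughly that many ns of offset).
servo = make_pi_servo()
drift_ppb = 200.0
offset_ns = 400.0
for _ in range(50):
    correction = servo(offset_ns)
    offset_ns += drift_ppb + correction

print(abs(offset_ns) < 1.0)  # offset has been regulated away
```

The integral term is what lets the servo cancel a constant frequency drift: at steady state it settles at the negative of the drift, holding the offset at zero.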
To use PTP on Linux, the “Linuxptp” package is the primary choice. It contains several programs/services for PTP. The most notable is “ptp4l,” which enables PTP on the network interface card. Many PTP settings can be changed in the “ptp4l” config file, ranging from transmission intervals to the regulation technique used to adjust the clock rate. Linuxptp also includes programs/services to synchronize the system clock on the computer and/or to synchronize the PTP system time via GPS or NTP. The latter is the common way computers synchronize time over the internet.
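As a rough sketch, a minimal Linuxptp setup might look like the following. The interface name `eth0` and file paths are placeholders; which timestamping modes your NIC supports should be checked first:

```shell
ethtool -T eth0        # list the NIC's hardware timestamping capabilities

# /etc/ptp4l.conf (excerpt) -- a few commonly tuned options:
#   [global]
#   twoStepFlag      1    # use the two-step Sync/Follow_Up variant
#   logSyncInterval  0    # Sync interval of 2^0 = 1 second
#   clock_servo      pi   # PI controller for frequency adjustment
#   priority1        128  # BMCA input; lower wins the GM election

sudo ptp4l -i eth0 -f /etc/ptp4l.conf -m   # run PTP on the NIC, log to stdout
sudo phc2sys -s eth0 -w -m                 # steer the system clock from the NIC's PTP clock
```

Here `ptp4l` disciplines the NIC's hardware clock against the network, while `phc2sys` carries that time over to the operating system clock so ordinary applications benefit as well.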
Even if all the device and sensor clocks in a network are synchronized, it can be challenging to synchronize the actions of each device with one another. In a setup combining LiDAR and cameras, for example, the cameras might need to capture images at specific rotation angles of the LiDARs, or the cameras may have dynamically changing exposure times. If there is a very static cycle in which sensors need to be triggered, it might be best to set a frequency for the sensors and potentially offset it. However, a critical question arises: offset from what?
ToS (Top of Second) is a common and intuitive way to synchronize frequencies, cycles, and actions across various sensor types. Assuming the sensor’s time is nearly exact (thanks to PTP synchronization), whole seconds are used as reference points. For instance, a LiDAR operating at 10 Hz might start at 0 degrees at 00:00:00:00 (s:ms:µs:ns) and return to 0 degrees after 10 revolutions at exactly 01:00:00:00. By sharing a consistent reference time, different sensor types can be set up to operate at certain frequencies with a static offset from ToS.
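Computing ToS-aligned trigger instants is simple integer arithmetic. A small sketch, with illustrative names and values:

```python
# Top-of-Second scheduling sketch: given a sensor frequency and a static
# offset from the whole second, compute the next trigger instant on the
# ToS-aligned grid. All timestamps are in nanoseconds.

def next_trigger(now_ns, freq_hz, offset_ns=0):
    """Next trigger instant >= now_ns on the ToS-aligned grid."""
    period_ns = 1_000_000_000 // freq_hz
    # Trigger slots sit at whole_second + k * period + offset.
    slot = (now_ns - offset_ns + period_ns - 1) // period_ns  # ceiling division
    return slot * period_ns + offset_ns

# A 10 Hz LiDAR with no offset, and a camera offset 5 ms into each period:
now = 2_000_000_000 + 123_456_789          # 2.123456789 s since some epoch
print(next_trigger(now, 10))               # 2200000000 -> fires at 2.200 s
print(next_trigger(now, 10, 5_000_000))    # 2205000000 -> fires at 2.205 s
```

Because every device derives its slots from the same whole-second boundaries, the LiDAR and camera stay phase-locked to each other without ever exchanging trigger messages.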
If sensors are to be triggered sporadically or controlled by a program, it’s advisable to command sensors using “when” and not “now”. When a program determines that a camera should trigger in 200ms, it is better to send a message to the camera and command it to capture an image at the specified time. If the computer sending the message is synchronized on the same PTP network, the time will be exactly the same as on the sensor. One should avoid commanding a sensor with “now” because this message could be delayed through the OS, network stack, hardware, switches, etc., resulting in poor precision.
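The “when” pattern can be sketched as follows; the command format and `send_command` hook are hypothetical placeholders, and the stand-in sensor simply waits on its own (assumed PTP-synchronized) clock:

```python
import time

# Sketch of "when"-based triggering, assuming the controller and the sensor
# share a PTP-synchronized clock. Instead of sending "capture now", the
# controller picks an absolute timestamp a safe margin in the future; the
# sensor fires when its own clock reaches that instant, so delays in the
# OS, network stack, and switches no longer affect the capture time.
# The message format and send_command hook are hypothetical.

LEAD_TIME_NS = 200_000_000  # 200 ms margin covering worst-case delivery delay

def schedule_capture(send_command):
    target_ns = time.time_ns() + LEAD_TIME_NS
    send_command({"action": "capture", "at_ns": target_ns})
    return target_ns

def fake_sensor(command):
    """Stand-in for a real sensor: wait until the commanded instant."""
    while time.time_ns() < command["at_ns"]:
        pass  # a real sensor would arm a hardware timer instead
    return time.time_ns() - command["at_ns"]  # firing error in nanoseconds

command = {}
target_ns = schedule_capture(command.update)
error_ns = fake_sensor(command)
```

The lead time only needs to exceed the worst-case delivery delay; any jitter below that bound is absorbed, because the firing moment is decided by the sensor's clock, not by when the message happens to arrive.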
There can be several reasons to choose PTP, but why choose it over a well-established solution like GPS? Using GPS is one of the most common ways to synchronize a network. A GPS receiver provides highly accurate time in the form of NMEA messages (other protocols exist as well), typically via UART. It also has a signal called PPS (Pulse Per Second), which provides a ToS pulse. Moreover, GPS serves not only as an internal reference time but is synchronized with the outside world as well, making tasks like remote system control much easier. Using GPS often requires each sensor to have its own receiver or for the sensors to be located very close together, since, for example, UART doesn’t work over long distances. With multiple GPS units, there will be multiple reference times, and while they should theoretically agree fairly well, there can be differences. A system with GPS often involves a much more complex hardware setup and, in many cases, might not be as precise. However, very high precision can be achieved with a GPS solution, especially using PPS. A GPS solution, though, might not work for systems underground or where accommodating a good antenna is challenging.
PTP is a straightforward protocol that requires minimal additional hardware in a system. It essentially requires selecting PTP-compatible devices and being a bit more careful with large amounts of data over the network. If sensors can receive power over Ethernet (PoE) and the network is designed to support this, cable quantities can be reduced to only one Ethernet cable per sensor. PTP has only one reference clock (GM), which means that a well-designed system can be extremely precise, down to 2-3 nanoseconds! If the system’s reference clock goes down, another clock takes over as the GM, making PTP highly redundant, particularly if designed strategically with Boundary Clocks and switches. A PTP solution can also be synchronized to world clocks via GPS or NTP.
With careful planning, a PTP system can become an extremely precisely synchronized network. The network can get by with minimal hardware and offers great potential for multiple layers of redundancy. A PTP network can be nearly “plug and play.” It can face challenges under heavy data loads, especially if network components like switches are not appropriately dimensioned or designed for PTP systems. PTP is also a well-integrated protocol with little overhead that requires minimal processing, making it very lightweight. With easy setup, extremely high precision, and simpler and more affordable hardware, PTP has found its place in a growing world of autonomy and will likely remain a popular choice for time-critical systems for quite some time.