Networks

Ensuring real-time distributed computing at ITER

Many of the control systems at ITER require quick response and a high degree of determinism. If commands go out late, the state of the machine may have changed in the interim, rendering actions useless—and maybe even detrimental.

Dozens of computers take part in the tight control loops of measuring, processing, ordering and acting. Most of the control systems (and some other systems) at ITER depend on a platform that enables collaborative processing at a rate of around 10,000 operations per second, with a guaranteed maximum delay.

"The requirement for reaction time is 100 microseconds," says Bertrand Bauvir, Leader of the Central Control Integration Section. "We can take data in at 10,000 times per second, process it and send commands out. One challenge is to perform the necessary amount of computation on time, because some of the algorithms are fairly complex."
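
To make the timing constraint concrete: taking data in 10,000 times per second leaves 100 microseconds for each cycle of measuring, processing, ordering and acting. The short Python sketch below works through that arithmetic. The per-stage durations and the function names are hypothetical, chosen only to illustrate how a cycle budget might be checked; they are not ITER figures.

```python
# Sketch: does one control cycle fit inside the period implied by the rate?
# All stage durations below are hypothetical, for illustration only.

def cycle_period_us(rate_hz: float) -> float:
    """Return the time available per cycle, in microseconds."""
    return 1_000_000 / rate_hz


def fits_budget(stage_durations_us: dict[str, float], rate_hz: float) -> bool:
    """Return True if the summed stage durations fit within one cycle."""
    return sum(stage_durations_us.values()) <= cycle_period_us(rate_hz)


RATE_HZ = 10_000  # "10,000 times per second", as quoted in the article

# Hypothetical split of one cycle across the steps named in the article.
stages_us = {"measure": 10.0, "network": 20.0, "process": 50.0, "act": 10.0}

if __name__ == "__main__":
    print(f"period: {cycle_period_us(RATE_HZ):.0f} us")
    print(f"used:   {sum(stages_us.values()):.0f} us")
    print(f"fits:   {fits_budget(stages_us, RATE_HZ)}")
```

With these assumed numbers the cycle uses 90 of the 100 available microseconds, which shows how little headroom a complex algorithm has once network transit is accounted for.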

Off-the-shelf technology and skilled software developers

The Central Control Integration Section is responsible for the integration and verification of hardware and software components delivered by ITER Member states to guarantee that each piece is able to cooperate seamlessly over the deterministic communication infrastructure. A large part of the integration work involves maintaining the communication network that guarantees the travel time of a message from one computer to another. This guarantee must hold even for multicasting—when a message has to be replicated and sent to more than one other computer.

Fortunately, many other organizations have had similar needs—and the real-time networking required by ITER can be achieved with careful software and network design, based on well-vetted commercial-off-the-shelf Ethernet network components.

One interesting fact about potential bottlenecks on the network is that it takes more time for light to travel through the optical fibres connecting the buildings to the server room than it takes for an Ethernet switch to replicate and forward a message to multiple destinations. This is because light travels about 1.5 times more slowly through optical fibre than in a vacuum (roughly 200,000 kilometres per second), which gives a best-case latency of about 5 microseconds per kilometre of fibre. By contrast, high-performance cut-through Ethernet switches can forward a packet in less than a microsecond.
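
The figure of about 5 microseconds per kilometre follows directly from the speed of light and the factor of 1.5. The Python sketch below reproduces it and estimates the fibre length at which propagation delay matches a switch's forwarding time. The 0.5 microsecond switch figure is a hypothetical stand-in for "sub-microsecond", and the function names are illustrative.

```python
# Sketch: propagation delay in optical fibre versus switch forwarding time.

C_VACUUM_KM_PER_S = 299_792.458  # speed of light in a vacuum
SLOWDOWN_FACTOR = 1.5            # approximate factor quoted in the article
SWITCH_FORWARDING_US = 0.5       # hypothetical sub-microsecond switch figure


def fibre_latency_us(distance_km: float, factor: float = SLOWDOWN_FACTOR) -> float:
    """Best-case one-way propagation delay through fibre, in microseconds."""
    return distance_km * factor / C_VACUUM_KM_PER_S * 1_000_000


def break_even_distance_m(switch_us: float = SWITCH_FORWARDING_US) -> float:
    """Fibre length, in metres, whose delay equals the switch forwarding time."""
    return switch_us / fibre_latency_us(1.0) * 1000


if __name__ == "__main__":
    print(f"latency per km: {fibre_latency_us(1.0):.2f} us")
    print(f"break-even fibre length: {break_even_distance_m():.0f} m")
```

Under these assumptions, any fibre run longer than roughly 100 metres contributes more delay than the switch itself, which is why cable distance between buildings matters in the latency budget.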

Another large part of real-time distributed computing is the operating system running on each computer. Most operating systems do not guarantee a response time, so ITER relies on a real-time version of Red Hat Linux that has been widely used for years.

"All contributions from the Member states have to meet our performance and latency requirements—and all software must behave in a way that does not hinder the performance of other systems connected to the real-time communication network," says Bauvir. "We set standards, starting with how to physically connect to the network and how to communicate. Suppliers have to buy the right computers and run the right operating system. They also have to use our software libraries for communication."

But even with standards and guidelines—and good software engineering in different parts of the world—the overall performance of the distributed real-time control function depends on each component collaborating seamlessly. So when a control system component is delivered, Bauvir and colleagues perform verification and commissioning, including quality assessments and functional checks.

Bertrand Bauvir is looking forward to seeing it all work. "Until now we have been integrating what we call utilities," he says. "These are control systems for monitoring and distributing electrical supply, cooling water, and so on. In the next six months we will begin to integrate and commission the first equipment that participates in the real-time control of the machine. We will be hooking up the very first real-time computers; many more will follow in the coming years."