Partnerships help prepare for operation
ITER has been nurturing partnerships with big tech companies and the larger scientific community to help store, distribute and analyze all the data that will be produced from experiments.
"In the end, what the Member states are paying for is the data," says Peter Kroul, Computing Center Officer. The job at the ITER Scientific Data and Computer Centre (SDCC) is to deliver on that commitmentâstoring, securing, processing and distributing the vast amounts of scientific data produced throughout the lifetime of the project. But the Data Centre will not do it alone; it will be assisted by partners with deep experience in managing and sharing very high volumes of information. ITER data management challenges are comparable to those of CERN, synchrotrons, telescopes and other large scientific installationsâone of which is that at least one copy of all data generated during experiments will be stored onsite. Since researchers will need to analyze data across different experimental phases, it must be possible to quickly compare the results of the latest pulses with some of the earlier ones. This means very fast data access at any time, to any scientific data ever produced over the lifetime of the project. "We have worked with IBM and partner B4Restore since 2020 to run a proof of concept of the long-term data storage and high-performance storage," says Kroul. "We get access to their technological road maps to foresee, for instance, how storage technology is advancing, so we can better forecast the systems and space needed for ITER's Scientific Data and Computing Centre. As an important partner, we have also benefitted from testing some of their latest systems before they are released to the market." "Knowing where IBM and other companies will be in the next few years helps us predict how we'll be able to squeeze a growing amount of data into our limited facility. A discipline like capacity management at some point will become very essential to the daily operation of the Data Centre. We need to get used to frequently adding capacity, removing outdated storage systems and replacing them with the latest technologyâand we have to manage this while in operation and without downtime."
The current ITER high-performance computer, with more than 300 physical servers and 9,000 compute cores.
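To make the capacity-management discipline Kroul describes more concrete, here is a minimal sketch of how such a storage forecast might be framed: project the cumulative archive volume year by year and flag when another tranche of storage would have to be installed to stay ahead of demand. All figures in it (annual ingest, tranche size, initial capacity) are placeholder assumptions, not ITER planning numbers.

```python
# Minimal capacity-forecasting sketch (illustrative numbers only, not ITER figures).
# Projects cumulative data volume year by year and reports when another storage
# tranche would have to be installed to keep the archive from overflowing.

def forecast_capacity(years, pb_per_year, tranche_pb, initial_capacity_pb):
    """Yield (year, stored_pb, installed_pb, tranches_added) for each year."""
    stored = 0.0
    installed = initial_capacity_pb
    for year in range(1, years + 1):
        stored += pb_per_year          # assumed constant ingest rate, in petabytes
        tranches = 0
        while stored > installed:      # expand before the archive overflows
            installed += tranche_pb
            tranches += 1
        yield year, stored, installed, tranches

if __name__ == "__main__":
    # Assumed values: 30 PB/year ingest, 50 PB expansion tranches, 60 PB initial capacity.
    for year, stored, installed, added in forecast_capacity(10, 30, 50, 60):
        note = f" (+{added} tranche{'s' if added > 1 else ''})" if added else ""
        print(f"Year {year:2d}: {stored:6.1f} PB stored / {installed:6.1f} PB installed{note}")
```

In practice the ingest rate itself changes as diagnostics improve, which is exactly why access to the partners' technology road maps matters for the forecast.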
Offsite storage and distribution

ITER's Scientific Data and Computing Centre must guarantee 99.99% availability, which means downtime must be kept under one hour per year (0.01% of a year is roughly 53 minutes). To support this stringent requirement, at least one extra copy of all data will be stored offsite at a fast-retrieval distribution centre to ensure each Member State gets immediate access to the data it requests.

That infrastructure is being constructed in a data centre in Marseille and is expected to be fully operational by mid-2024. Two geographically separated fibre optic links will connect the distribution centre to the ITER site, with one set of cables serving as a hot standby. Another redundant pair of cabling systems will connect it to the research network backbone funded by the European Union.

"We have coordination meetings with other organizations from the ITER Members because we're using the same research networks that constitute the backbone of the scientific internet," says David Fernandez, Leader of the IT System & Operation Section.

The distribution centre will be a hub not only for all continental and intercontinental data traffic but also for the cloud providers, which will host some applications and possibly provide extra computational power as needed.

"A year ago, we finalized the first test of integrating our on-site computing clusters with both Google Cloud and Microsoft Azure," says Kroul. "And that was a successful test, meaning we managed to seamlessly integrate our on-site facility directly into these cloud operators so that we could offload some of the computational jobs to services off site, and do so in a manner that is transparent to the scientists. We did this with both Google and Microsoft, and it was very impressive. The speed was almost the same as if the service were on site, and sometimes faster, even though we had to send the job to Google or Microsoft in the cloud, spin up the resources and then get the call back. With Google we ran several important large computations using over 5,000 cloud-based cores, which saved us months of onsite resources and work."

While the cloud comes at an incremental cost, it is convenient and easy to use on an as-needed basis to provide hybrid burst capacity for onsite computation jobs. If the load is too high and researchers don't have time to wait for high-performance computing resources to become available on site, the job can be offloaded to the cloud.
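The "hybrid burst" pattern Kroul describes boils down to a routing decision: if the estimated wait for onsite high-performance computing resources is too long, the job goes to a cloud partition instead, transparently to the scientist. The sketch below is a simplified illustration of that decision; the thresholds, wait estimate and job parameters are hypothetical, and it is not the actual ITER scheduler.

```python
# Simplified illustration of hybrid cloud bursting (hypothetical; not ITER's scheduler).
# If the estimated wait for onsite HPC resources exceeds what the researcher can
# accept, the job is dispatched to a cloud partition instead.

from dataclasses import dataclass

@dataclass
class Job:
    name: str
    cores: int
    max_wait_hours: float   # how long the researcher is willing to queue

def estimated_onsite_wait_hours(cores_requested: int, free_cores: int, queue_depth: int) -> float:
    """Crude wait estimate: no wait if enough cores are free, otherwise grow with the backlog."""
    if cores_requested <= free_cores:
        return 0.0
    return 0.5 * queue_depth + cores_requested / 1000.0

def dispatch(job: Job, free_cores: int, queue_depth: int) -> str:
    """Route a job onsite or to the cloud, based on the estimated queue wait."""
    wait = estimated_onsite_wait_hours(job.cores, free_cores, queue_depth)
    if wait <= job.max_wait_hours:
        return f"{job.name}: submitted onsite (estimated wait {wait:.1f} h)"
    return f"{job.name}: burst to cloud partition ({job.cores} cores requested)"

if __name__ == "__main__":
    # Placeholder cluster state: 1,200 free cores and 8 jobs already queued.
    print(dispatch(Job("plasma_equilibrium", 800, 2.0), free_cores=1200, queue_depth=8))
    print(dispatch(Job("neutronics_scan", 5000, 1.0), free_cores=1200, queue_depth=8))
```

The second case mirrors the situation Kroul mentions, where large computations were run on more than 5,000 cloud-based cores rather than waiting for onsite resources to free up.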
Outside ITER Headquarters, cooling and electrical equipment has been installed for the ITER Scientific Data and Computing Centre, including this extra fuel tank for extended generator operation.
Quick retrieval and deep analysis

A data rate of at least 50 gigabytes per second is expected during full deuterium-tritium operation at ITER. But that figure may grow even higher: as sensors and cameras become more advanced, they will generate much more data than was predicted in the initial phases of the project. On the retrieval side, the data rate must be at least as high as the rate at which data is stored.

"When we get the connectivity to Marseille, we can start performing data challenges," says Fernandez. "These will be tests to demonstrate the feasibility of data replication to the offsite data centre within the timeframe requirements. Similarly, when we have connectivity into the international research networks at a high speed, transatlantic data challenges will also be attempted. These tests will be run with several partners. As of today, this includes ESnet [the Energy Sciences Network] and the United States Domestic Agency US ITER."

Depending on the queries scientists want to make, it might be necessary to retrieve data from different sources. To enable that kind of operation, the right software needs to be deployed and the data needs to be appropriately structured so that, for example, a query does not require opening a thousand different files simultaneously. The infrastructure has to perform well enough to support these dispersed retrievals without creating bottlenecks.

Finally, ITER is keeping an eye on how artificial intelligence (AI) can be used for data analysis. AI is still relatively new and the need for intensive analysis is still a few years off, so no commitments have been made yet. However, the group in charge of ITER's Scientific Data and Computing Centre has already begun discussions with big tech companies to see how AI software and hardware might be used.

"To give you an example, we have been talking with Google and NVIDIA about how AI and machine learning could help us manage and analyze data," says Kroul. "It looks very promising."
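One way to picture the structuring problem described above: if every diagnostic signal of a pulse lives in its own small file, a cross-signal query has to open and close thousands of files. The sketch below groups many signals of one pulse into a single chunked file so the same query opens one file and reads only the chunks it needs. HDF5 via h5py is used here purely as a stand-in; the article does not specify ITER's actual storage formats or tools, and the signal counts and sizes are placeholders.

```python
# Illustrative sketch only: grouping many per-signal records of one pulse into a
# single chunked HDF5 file, so a cross-signal query opens one file instead of
# thousands. h5py is a stand-in here, not necessarily ITER's format.

import numpy as np
import h5py

def write_pulse(path: str, pulse_id: int, n_signals: int = 1000, n_samples: int = 100_000) -> None:
    """Store all signals of one pulse in a single file, one chunked dataset per signal."""
    with h5py.File(path, "w") as f:
        grp = f.create_group(f"pulse_{pulse_id}")
        for i in range(n_signals):
            data = np.random.default_rng(i).standard_normal(n_samples).astype("float32")
            # Chunking lets a reader fetch a time window without loading the whole signal.
            grp.create_dataset(f"signal_{i:04d}", data=data, chunks=(10_000,))

def read_window(path: str, pulse_id: int, signal: str, start: int, stop: int) -> np.ndarray:
    """Read only the requested time window of one signal: one file open, a few chunks."""
    with h5py.File(path, "r") as f:
        return f[f"pulse_{pulse_id}/{signal}"][start:stop]

if __name__ == "__main__":
    write_pulse("pulse_12345.h5", 12345, n_signals=50)   # smaller sizes for a quick demo
    window = read_window("pulse_12345.h5", 12345, "signal_0007", 20_000, 30_000)
    print(window.shape)   # (10000,), fetched without touching the other 49 signals
```

Comparing the latest pulse with an earlier one then means two such targeted reads rather than scanning thousands of files, which is the kind of access pattern the retrieval infrastructure has to support without bottlenecks.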