TRIUMF operates one of ten international Tier-1 data-intensive computing centres for the ATLAS detector's distributed computing network, part of the world's largest and most advanced scientific computing grid. The Tier-1 Centre is part of TRIUMF's major collaboration on the ATLAS detector at CERN's Large Hadron Collider (LHC), which includes significant contributions to the detector design and construction. The ATLAS detector is best known for its central role in the 2012 discovery of the Higgs boson.
The ATLAS detector generates vast amounts of data, about 10 petabytes per year. This is equivalent to the storage capacity of 200,000 Blu-ray discs, which if stacked in their cases would form a tower 1200 metres tall. The storage, processing and physics modelling of this data are as crucial to particle physics discovery as the ATLAS detector itself. To manage this huge amount of data, the ATLAS collaboration operates a sophisticated distributed network of ten Tier-1 computer centres and about 100 smaller Tier-2 facilities used primarily for data analysis and simulations. The ATLAS network is part of the Worldwide LHC Computing Grid (WLCG), the global network that stores, distributes and analyses all LHC data.
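The disc comparison above can be checked with a little arithmetic. The sketch below assumes decimal units (1 PB = 10^15 bytes, 1 GB = 10^9 bytes); the variable names are illustrative only, and the three input figures are the ones quoted in the text.

```python
# Back-of-the-envelope check of the storage comparison.
# Assumption: decimal units (1 PB = 10**15 bytes, 1 GB = 10**9 bytes).
PETABYTE = 10**15
GIGABYTE = 10**9

annual_data_bytes = 10 * PETABYTE  # about 10 PB per year (from the text)
disc_count = 200_000               # number of discs (from the text)
tower_height_m = 1200              # height of the stacked cases (from the text)

# Capacity each disc must hold for the comparison to work out.
gb_per_disc = annual_data_bytes / disc_count / GIGABYTE

# Thickness each cased disc must have for the tower height to work out.
mm_per_case = tower_height_m * 1000 / disc_count

print(f"Implied capacity per disc: {gb_per_disc:.0f} GB")
print(f"Implied thickness per case: {mm_per_case:.0f} mm")
```

Under these assumptions the figures imply 50 GB per disc, which matches the capacity of a dual-layer Blu-ray disc, and a thickness of 6 mm per cased disc.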
A founding member of the WLCG, TRIUMF hosts Canada's only Tier-1 centre. All of ATLAS' Tier-1 Centres are run by national physics laboratories or other facilities capable of providing the experimental physics culture and infrastructure to support ATLAS' stringent 24/7 data storage, distribution, reprocessing and security requirements. Since coming fully online in 2007, TRIUMF's Tier-1 centre, supported by ten highly qualified TRIUMF staff, has performed with close to 100 percent uptime, supplying a full ten percent of ATLAS' global computational Tier-1 resources. This exceptional performance has enabled the centre to provide additional capacity during critical times in ATLAS' science program, including the data reprocessing and modelling that enabled the 2012 confirmation of the discovery of the Higgs boson.
In 2017, TRIUMF's Tier-1 centre began transitioning to Compute Canada's new state-of-the-art facility at Simon Fraser University. The move, to be completed by the end of 2018, includes a full upgrade and expansion of all the centre's compute and storage technologies. TRIUMF personnel will continue to operate the Tier-1 centre, including providing primary scientific services, management and technical support.
How It Works
ATLAS' success depends on its highly structured, secure and fault-tolerant globally distributed computing system for capturing, processing and analyzing ATLAS data. As a core part of this network, the Canadian ATLAS Tier-1 centre is a large-scale, data-intensive facility that is maintained 24/7, with at least one highly qualified staff member on call at all times.
Light Path to CERN
The TRIUMF-operated Tier-1 centre is connected to CERN via a dedicated high-speed, high-bandwidth light fibre link provided by CANARIE, Canada's advanced research network. In 2001, TRIUMF and CANARIE were involved in one of the first transatlantic light path tests to CERN, demonstrating the potential for operating ATLAS computing as a globally distributed network.
Distributed ATLAS Software
ATLAS has an online system of distributed, grid-based software, and the TRIUMF group provides user support to ATLAS as a whole and to members of the ATLAS-Canada collaboration. TRIUMF's user-support specialists developed high-level tools for the CERN Virtual Machine File System (CernVM-FS) and played a key role in its deployment and wide-scale adoption. This provides ATLAS' thousands of scientists with easier, more efficient access to, and use of, ATLAS' complex online analysis software, and ensures that the entire collaboration works with validated software configurations.
A Tiered Approach to Big Data
ATLAS' computing demand is always increasing, whether for long-term data storage or the reprocessing of ever greater amounts of raw data. The TRIUMF-ATLAS Tier-1 Centre has grown from an initial 112 cores (each the equivalent of a powerful desktop computer) and about 10 terabytes of storage in 2006, to 7700 cores, 11 petabytes of disk storage and 31 petabytes of tape storage, making it one of the largest computing capacities in Canada dedicated to a single scientific project.
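To put that growth in proportion, the following sketch computes the growth factors from the figures quoted above. It assumes decimal units and, since the text does not say whether the initial 10 terabytes were disk or tape, compares that figure against both the current disk capacity and the combined disk and tape capacity; the variable names are illustrative.

```python
# Growth of the Tier-1 centre, using only the figures quoted in the text.
# Assumption: decimal units (1 TB = 10**12 bytes, 1 PB = 10**15 bytes).
TERABYTE = 10**12
PETABYTE = 10**15

cores_2006 = 112
cores_now = 7700

storage_2006 = 10 * TERABYTE  # "about 10 terabytes" (type not specified)
disk_now = 11 * PETABYTE
tape_now = 31 * PETABYTE

core_growth = cores_now / cores_2006
disk_growth = disk_now / storage_2006
total_storage_growth = (disk_now + tape_now) / storage_2006

print(f"Cores grew by a factor of about {core_growth:.0f}")
print(f"Disk alone is about {disk_growth:.0f} times the initial storage")
print(f"Disk plus tape is about {total_storage_growth:.0f} times the initial storage")
```

On these numbers the core count grew roughly 69-fold, while storage grew by three orders of magnitude, which reflects the data-intensive rather than compute-bound nature of the centre's role.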
The Tier-1 centre's capacity will continue to grow significantly in the coming years. The larger Compute Canada Simon Fraser University location will provide the TRIUMF-operated Tier-1 centre with cost efficiencies, infrastructure support and room to grow as ATLAS prepares for its High Luminosity Era, when the experiment will generate ten times as much data.
ATLAS' global, distributed computing network meets the detector's computational needs by operating in a nested, tiered fashion, with different roles and responsibilities for each layer.
Tier-0: Located at CERN, the Tier-0 centre is the hub-of-the-wheel in the ATLAS computing network, collecting and distributing ATLAS raw data 24/7 when the experiment is running. CERN keeps the primary copy of all ATLAS raw data, and a secondary copy is distributed among the Tier-1 centres.
Tier-1: Ten Tier-1 centres operate 24/7, receiving raw ATLAS data for storage and reprocessing, and applying the latest calibrations and reconstruction algorithms. Data from the ATLAS detector is collected as electronic signals, and the Tier-1 centres use algorithms to reconstruct the raw data as particles with particular trajectories and energies. This reprocessed data is distributed back to CERN, to the other Tier-1 centres, and to Tier-2 centres worldwide.
Tier-2: About 100 ATLAS Tier-2 centres, most based at universities, are the primary sites for ATLAS-team scientists to model and analyze the data. In Canada, Compute Canada provides resources for university-based Tier-2 centres. The TRIUMF Tier-1 centre provides support and expert knowledge for the Canadian Tier-2 centres, and TRIUMF scientists have been responsible for coordinating ATLAS' distributed computing efforts in Canada since the beginning of the project.
Tier-1 centres also provide significant computing resources for large-scale simulations to supplement the capacity of Tier-2 centres. These simulations involve billions of events to produce statistically reliable results and to properly model all of the physics processes needed to make discoveries. The Tier-1 centres also store all ATLAS data indefinitely, providing built-in redundancy, the equivalent of external hard drives for a personal computer.
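The division of raw-data custody between Tier-0 and the Tier-1 centres can be pictured with a toy model. The sketch below is purely illustrative: the `Site` class, the `distribute_raw` function, the site names and the round-robin placement are all assumptions made for the example, not a description of how ATLAS actually assigns data to sites. It shows only the property described above, namely that CERN holds the primary copy of every raw dataset while the ten Tier-1 centres together hold one secondary copy.

```python
from dataclasses import dataclass, field


@dataclass
class Site:
    """A hypothetical computing site holding a set of dataset names."""
    name: str
    tier: int
    datasets: set = field(default_factory=set)


def distribute_raw(datasets, tier0, tier1_sites):
    """Keep every raw dataset at Tier-0 and place each one at a single Tier-1.

    Round-robin placement is an assumption made for this sketch only.
    """
    for i, dataset in enumerate(datasets):
        tier0.datasets.add(dataset)  # primary copy stays at CERN
        target = tier1_sites[i % len(tier1_sites)]
        target.datasets.add(dataset)  # one share of the secondary copy


tier0 = Site("CERN", 0)
tier1 = [Site(f"T1-{n}", 1) for n in range(10)]  # ten Tier-1 centres
raw = [f"raw-{n:03d}" for n in range(25)]        # made-up dataset names

distribute_raw(raw, tier0, tier1)

# Taken together, the Tier-1 holdings reproduce the full Tier-0 copy.
secondary = set().union(*(site.datasets for site in tier1))
print(secondary == tier0.datasets)
```

In this model no single Tier-1 centre holds all the raw data, yet the loss of the Tier-0 copy could be recovered from the Tier-1 centres collectively, which is the redundancy the tiered design is meant to provide.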