Canadian researchers are starting to build a cross-country backbone to link high-performance computing facilities, a network they hope will be as powerful as the resources they enjoy in the lab.
The Shared Hierarchical Academic Research Computing Network (SHARCNet) on Monday said it had established a dedicated transport connection running from the University of Guelph in Ontario to the Western Canada Research Grid’s (WestGrid) optical infrastructure in Alberta and British Columbia. The so-called “lightpath” will run over the CA*net 4 optical network managed by CANARIE. The Ontario Research and Innovation Optical Network (ORION), whose engineers completed the installation, announced the project at its annual Ontario Research and Education Summit in Toronto.
SHARCNet, a high-performance computing (HPC) resource serving a cluster of post-secondary schools across southern Ontario, and WestGrid, which commonly ranks among the biggest supercomputers in Canada, hope the lightpath will be the beginning of a pan-Canadian network that will link a handful of other labs. HPC labs are typically used to perform large, complex mathematical calculations and simulations in a growing variety of disciplines, including physics, meteorology and life sciences.
HPC labs have started turning to lightpath software as a way of setting up networks in a condominium fashion, whereby resources are shared and managed by more than one party. Carleton University, for example, is already using a CA*net 4 lightpath to bypass Internet bottlenecks for high-speed data transfer between its Ottawa campus and CERN, the international particle physics lab in Geneva. WestGrid was also an early adopter of CA*net 4, connecting $44 million in computing instrumentation.
Hugh Couchman, SHARCNet’s scientific director, said the initial push for a link with WestGrid came from one of his former post-doctoral students, who was struggling to send a 5TB simulation from WestGrid to a facility at Queen’s University in Kingston, Ont.
“Whereas many of the so-called HPC consortia like WestGrid or SHARCNet have good internal networks, their connection to the outside world is through some campus infrastructure to the Internet,” he said. “It just seems crazy to have these incredible computational resources sitting on the end of such a thin pipe.”
One of the problems with long-haul networks is large round-trip time and the effect that has on network performance, said Rob Simmonds, WestGrid’s chief technology officer.
“A lot of people have concentrated on computing resources, but less so on the data requirements – the data required by large simulation and the data produced by researchers,” he said. “People need to take networking seriously.”
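The round-trip-time effect Simmonds describes can be illustrated with a rough calculation. The sketch below assumes a single TCP stream that can send at most one window of data per round trip; the 60 ms round-trip time, 64 KiB window and 1 Gb/s link speed are illustrative assumptions, not figures from SHARCNet or WestGrid. Only the 5TB dataset size comes from the example Couchman cites.

```python
def window_limited_throughput_bps(window_bytes: float, rtt_seconds: float) -> float:
    """Upper bound on one TCP stream: at most one window per round trip."""
    return window_bytes * 8 / rtt_seconds


def bandwidth_delay_product_bytes(link_bps: float, rtt_seconds: float) -> float:
    """Bytes that must be in flight to keep a link of this speed full."""
    return link_bps * rtt_seconds / 8


def transfer_time_hours(size_bytes: float, throughput_bps: float) -> float:
    """Time to move a dataset at a sustained throughput, in hours."""
    return size_bytes * 8 / throughput_bps / 3600


# Assumed, illustrative values (not taken from the article).
RTT_SECONDS = 0.060        # hypothetical cross-country round-trip time
WINDOW_BYTES = 64 * 1024   # hypothetical untuned TCP window
LINK_BPS = 1e9             # hypothetical dedicated 1 Gb/s path

# Dataset size from the article: a 5TB simulation.
DATASET_BYTES = 5e12

untuned_bps = window_limited_throughput_bps(WINDOW_BYTES, RTT_SECONDS)
needed_window = bandwidth_delay_product_bytes(LINK_BPS, RTT_SECONDS)

print(f"Window-limited throughput: {untuned_bps / 1e6:.1f} Mb/s")
print(f"Window needed to fill the link: {needed_window / 1e6:.1f} MB")
print(f"5TB at window-limited rate: {transfer_time_hours(DATASET_BYTES, untuned_bps):.0f} h")
print(f"5TB at full link rate: {transfer_time_hours(DATASET_BYTES, LINK_BPS):.1f} h")
```

Under these assumptions the untuned stream tops out at roughly 8.7 Mb/s no matter how fast the link is, so the 5TB transfer would take about 53 days instead of about 11 hours at the full link rate. The gap comes from the round-trip time rather than from raw capacity.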
Couchman said establishing the lightpath connection was “astonishingly quick,” the result of a few weeks’ work following a meeting in mid-May. The next step will be testing the network and figuring out how to expand it to other HPC consortia, he said, and then developing a roadmap for upgrades.
“(Researchers) don’t want to invest a lot of time in building the required infrastructure when they have the idea now,” he said. “You really do need to put some capability on the ground so researchers can see what you can do and start using it right away.”
While WestGrid and SHARCNet each have staff who can look after the lightpath connection between their clusters, a pan-Canadian network would need someone to handle the overall maintenance and support. Couchman suggested the solution might come out of the C3 Association Inc., a national advocacy group dedicated to Canadian HPC issues, which gathers major resource providers for discussions about once a month.
“Support is a concern, and something that needs to be planned carefully,” Simmonds agreed, adding that WestGrid already has experience dealing with multiple organizations such as Netera Alliance and BCNet, which are providing provincial connections in the west. “You need to make sure that the network is engineered in such a way that you can isolate which organization has a problem.”
Potential members of a pan-Canadian HPC cluster could include the Atlantic Computational Excellence Network (ACEnet), le Réseau québécois de calcul de haute performance (RQCHP), the Consortium Laval-UQAM-McGill and Eastern Quebec for High Performance Computing (CLUMEQ), and the High Performance Computing Virtual Laboratory (HPCVL) in Eastern Ontario, led by Queen’s University.