Sometimes, high-tech isn’t for the faint of heart. Faced with a tight budget, one researcher at the University of Western Ontario decided to take a do-it-yourself approach to a supercomputing project. Four years later, he considers it an overwhelming success, but says it was a bumpy ride.
“It turns out to be a complex affair,” says Alessandro Forte, now an earth sciences professor with the University of Quebec in Montreal (UQAM). He remains an adjunct professor at Western and still supervises the project.
In 1999, Forte sought out federal funding to help run simulations of the earth’s heat and landmass movements. His options at the time were limited, he says.
Joining a larger consortium or working entirely with a private-sector partner would have meant losing autonomy and probably his research focus, so Forte proposed building his own parallel cluster with low-priced, high-performance equipment. “You can draw on all of the power of the PCs by interconnecting them intelligently,” he says.
At first, choosing the right hardware proved to be difficult.
“Equipment that we initially considered when we wrote the project became sunset equipment by the end of the process.”
Forte considered using new 64-bit Alpha systems from Digital, which was bought out by Compaq (now part of Hewlett-Packard) at about the same time. In 2001, however, Forte opted for systems running on the first-generation Itanium processor. With a budget of about $250,000 from the Canadian Foundation for Innovation, he bought six X460 servers from IBM Corp., each with four Itanium processors running the Linux operating system.
Forte worked with Mississauga, Ont.-based Infinity Technologies Inc. and IBM Canada staff to build and maintain the cluster.
For IBM, the project was a good opportunity to test out the firm’s research, says Dominic Lam, national eServer manager for IBM Canada’s deep computing business.
“It fit well within our deep computing model,” he says.
Forte’s project involves building simulations of what occurs under the earth’s surface, including the movement of heat and land over time. This includes looking at continental movement and changes in sea levels. By studying the planet’s “heat engine,” Forte can examine conditions within specific land masses, as well as the earth’s ties to outer space. For instance, looking at the effects of gravitational fields from the sun or other planets could lead to further insights about climate change.
“The cluster itself was used to solve problems in computational fluid dynamics,” Forte says.
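The article doesn’t describe Forte’s actual codes, but the heat-flow simulations mentioned above are, at their core, diffusion calculations. As a toy illustration of the kind of arithmetic involved — every name and parameter here is hypothetical, and real mantle-convection models are vastly more elaborate — a one-dimensional explicit finite-difference heat step might look like this:

```python
# Toy 1-D heat diffusion via explicit finite differences.
# Purely illustrative; not Forte's research code.

def diffuse(temps, alpha=0.1, steps=100):
    """Relax a temperature profile; alpha below 0.5 keeps the scheme stable."""
    t = list(temps)
    for _ in range(steps):
        # Each interior point moves toward the average of its neighbours;
        # the endpoints are held fixed (constant-temperature boundaries).
        t = (
            [t[0]]
            + [t[i] + alpha * (t[i - 1] - 2 * t[i] + t[i + 1])
               for i in range(1, len(t) - 1)]
            + [t[-1]]
        )
    return t

# A hot spot in the middle of a cold rod gradually flattens out.
print(diffuse([0.0, 0.0, 100.0, 0.0, 0.0], steps=50))
```

On a cluster, a large grid of this sort would be split into sub-regions, with each node advancing its own piece and exchanging boundary values with its neighbours.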
Forte and his research colleagues drew widespread media attention in 2001 after publishing an article in the journal Nature. Their findings established a connection between the movement of continents and occurrences deep within the earth — a connection not acknowledged previously.
This sort of work would be impossible without clustering, Forte says. And doing it on a budget sometimes proved difficult.
Using Linux helped cut costs associated with using proprietary platforms, according to Lam. The open-source operating system is also popular among researchers and allows for cheap, scalable and interoperable computing. “Researchers don’t have multimillion-dollar (budgets) to fund their projects,” he says.
Clusters — sometimes called Beowulf clusters — originated at the U.S. National Aeronautics and Space Administration in the mid-1990s. Running several systems in tandem allows researchers to sub-divide tasks among computers and solve problems more quickly and effectively.
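The divide-and-conquer idea behind clustering can be sketched in a few lines. In the minimal example below — hypothetical, with local worker processes standing in for the separate machines of a real Beowulf cluster, which would communicate over a network via something like MPI — a job is split into chunks, each “node” computes its share, and the partial results are combined:

```python
# Sketch of cluster-style task subdivision using local processes
# as stand-ins for cluster nodes. Illustrative only.

from concurrent.futures import ProcessPoolExecutor

def partial_sum_of_squares(chunk):
    """One worker's share of the job."""
    return sum(x * x for x in chunk)

def clustered_sum_of_squares(data, workers=4):
    """Sub-divide the task among 'nodes', then combine the partial results."""
    items = list(data)
    chunks = [items[i::workers] for i in range(workers)]
    with ProcessPoolExecutor(max_workers=workers) as pool:
        return sum(pool.map(partial_sum_of_squares, chunks))

if __name__ == "__main__":
    # Same answer a single machine would give, computed in four pieces.
    print(clustered_sum_of_squares(range(1000)))
```

The speed-up comes from the chunks being processed concurrently; on a true cluster, each chunk would also fit in one node’s memory, which matters for data-heavy workloads like Forte’s.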
“One way to deal with data is to dump them into the system memory as opposed to leaving them on the hard drive,” Lam says. “Large memory is a big requirement in Alessandro’s research.”
While the process sounds good in theory, it doesn’t always work out in practice — at least, not at first. Forte chalks up problems that originated early on in the project to using leading-edge technology that may not have all the bumps worked out.
“We had problems finding compilers for our programs,” he says.
Forte says scientists have it hard enough programming their own applications for research without having to deal with converting them into machine language, “unless you’re a real guru in writing computer code.”
Working with IBM, though, Forte hooked up with Intel Corp., the manufacturer of Itanium chips, and eventually found an Itanium-specific compiler.
For Forte, it was important to find software solutions that didn’t limit the system’s usability in other locations. He says it’s important for researchers to be able to work with applications no matter where they are.
“If it’s been created to work on one machine, it’s not very portable.”
Using Linux helps keep those doors open, according to Lam. “It gave researchers the ability to collaborate with other researchers,” he says. “Everyone’s based on the same open standard.”
Forte is currently working on building another cluster at UQAM, potentially connecting it to the one at Western. He expects to see greater use of the technology in the future, as hardware prices fall and researchers find new ways to co-operate over distances. He points to Sharcnet, which connects clusters from universities across southern Ontario, as an example of how computing power can be enhanced even further.
For Lam, the project allowed IBM to test out its research and business strategies. This past spring the company renamed its server operations the “deep computing” business unit, drawing on earlier work, most notably the success of the Deep Blue chess computer against world champion Garry Kasparov.
“(We asked) whether IBM wants to be a chess machine manufacturer or to do something else with it,” Lam says.
Forte says he looks forward to working on future clusters, even if it means working with what might be considered “test-bed” technology. For him, the results are worth a few technical glitches. “We’re willing to take the risk in order to get the most powerful machines we can,” he says. “In the past, we may have been one of 100 users on a Cray supercomputer and we’d have to line up and take a number.”