A Quebec biotechnology firm is clustering Linux-based servers in an effort to mine genetic data that could accelerate the search for cures for diseases.
Galileo Genomics Inc. said Wednesday it was using IBM’s eServer xSeries to run its e-mail and file sharing, while a pilot project will use a Linux cluster to analyze the Quebec Founder Population. Galileo, which is based in Quebec City, has for three years been recruiting DNA donors across the province to correlate the makeup of more than six million related people. The donors are organized in trios: for example, a patient with a disease and his or her parents, or another relative who could complete the trio.
The firm has created software to analyze genome-wide scans across the DNA, which requires considerable computing power, according to Galileo senior director of IT Jean-Francois Levesque. Right now Galileo is rewriting some of its software to prepare it for various streams of research, which means forecasting its needs becomes very difficult, he said.
“There’s a lot of variation in compute time,” he said. “We can go from fairly short to extremely long. Because we’re still working on optimizing the software, it’s still not clear to us how much compute power we’ll actually need for a full-run production.”
Sal Causi, business development executive within IBM Canada’s life sciences unit, said one of the biggest inhibitors for any biotech or biopharmaceutical company is the cost of research, and compute power is a big part of that.
“A cluster approach really works well when you’re trying to look at a great deal of data in a parallel fashion,” he said. “They can go through mountains of data all at the same time. You may use a supercomputer in collecting data, or starting to build a database of proteins that should relate to each other.”
Levesque said Galileo chose Linux because it is already working with a number of universities on code and algorithms, and these institutions are increasingly relying on open source. It’s also easier to use, he said, because the company doesn’t require a Windows-style user interface to perform its calculations. “It gives us a base that we can easily port different types of applications or algorithms to their systems,” he said. “Obviously in terms of licensing it’s a little bit less expensive than Windows.”
Galileo chose IBM through an RFP about six months ago when the mixture of equipment it was using became too complex to manage and caused reliability issues, Levesque said.
“Obviously if there’s downtime in the lab it costs money, but downtime in a calculation that may take weeks may require you to re-start all over again,” he said. “That can be really expensive.”
Causi said the volume of biological data is doubling every six months, which means customers need the flexibility of popping in some more Linux-based blade servers as needed.
“Linux technology has come a long ways in the last couple of years,” he said. “One of its biggest advantages, of course, is its scalability, and its low cost in terms of being able to put together a lot of compute power.”
Results of the Quebec Founder Population analysis could help identify the causes of diabetes, schizophrenia and asthma, Levesque said.