Canada builds global disease research database

TORONTO — Canada will play host to what scientists are calling the world’s largest database of protein-related information that could speed drug research and help cure diseases.

The Blueprint Initiative, a research program led out of Mount Sinai Hospital’s Samuel Lunenfeld Research Institute, has been live for just over a year, but on Wednesday Sun Microsystems of Canada launched a Global Centre of Excellence in Systems Biology based on the program. The centre, housed in an 11,000 sq. ft. building within the recently opened 1.3 million sq. ft. Medical and Related Sciences (MaRS) Discovery District, will be complemented with similar Blueprint facilities in Asia and Europe over the next two years.

The centre will use Sun’s hardware and software tools in a grid computing environment to host the Blueprint Initiative’s database development project. Dubbed the Biomolecular Interaction Network Database (BIND), it will be offered as a Web-accessible repository of all manner of genomic and proteomic research. While the mapping of the human genome has helped scientists better understand DNA, proteomics promises to get at the root causes of disease by figuring out how and why proteins interact with one another.

Dr. Christopher Hogue, the Blueprint Initiative’s principal investigator, likened the current wealth of information about proteins and genes to an unassembled piece of IKEA furniture: scientists can recognize the parts, but not how they function. As an example, he showed an image of two proteins, one blue and one green, which were linked together. A series of yellow areas mapped mutations that could lead to the development of breast cancer. Part of Hogue’s work involves the development of open source software that will encode schematic diagrams into BIND and make it easier for scientists around the world to get at relevant data.

“I intend to work toward that, well, until the end of my career,” Hogue said.

As with many corporate enterprises, the challenge is not in finding data but in aggregating it intelligently. Hogue said BIND would offer self-service forms through which researchers could enter the results of their own research projects, but Blueprint staff would also assist them in creating records. Hogue calls these staff members “curators,” who analyze and help choose what’s included. A separate database, called PreBIND, will include biological data that hasn’t yet been curated. The Blueprint Initiative expects to index more than 200,000 records over the next five years. The project, which has received funding from Sun, Foundry Networks and MDS Proteomics in addition to several public sector sources, employs 40 indexers, 24 programmers and 10 system administrators.

Hogue said BIND has migrated many pieces of data originally written in ASN.1 to XML, and expects the database will capture a variety of text, structural and gene sequencing information. The goal is to help researchers avoid duplicating efforts as they conduct experiments, or learn from previous experiments.

“A pocket dictionary has more than 40,000 words, many of which you probably won’t use,” he said by way of analogy. “To learn all those words would be asking you to enlarge your vocabulary eight times over.”

In addition to its $6.27 million in-kind donation, Sun is providing the Blueprint Initiative with 108 Sun Fire V60 Linux servers, which will be clustered together to ensure capacity is fully utilized. Stephane Boisvert, president of Sun’s Canadian operation, said the company would also assist the Blueprint Initiative with code optimization and Java tool development.

“Our corporate culture is very similar, in that we both strive for excellence,” he said. “We’ve been doing this kind of work before it was fashionable to be in life sciences.”

Some of Sun’s other Canadian biotechnology customers include Montreal-based Caprion, which also studies proteomics using the company’s hardware and software.

“If you’ve tried to simulate a cell, you can imagine the computational requirements (the database) will have, now and in the future.”

Comment: info@itbusiness.ca
