IBM’s Toronto Software Lab is helping Big Blue prepare a version of its DB2 database that could store XML data more easily.
Early previews of the database, code-named Viper, are being provided this week to attendees of the XML 2005 Conference in Atlanta, though the finished product is not expected until next year. Key to Viper’s features is the ability to manage and integrate native Extensible Markup Language (XML) data alongside relational data.
Right now most databases store XML as a file or a “blob” in a cell, which can require reformatting as a large object in the system. Extracting that XML data for analytic purposes can be time-consuming, and the results are not necessarily true to the way the data was organized.
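To make the cost of the “blob” approach concrete, here is a minimal sketch, not taken from IBM or any product mentioned here. It uses Python's built-in sqlite3 and ElementTree modules as stand-ins, and the `orders` table, its columns and the `total_quantity` helper are hypothetical. Because the XML sits in the cell as opaque text, the application has to fetch and parse every document itself before it can answer even a simple question.

```python
import sqlite3
import xml.etree.ElementTree as ET

# Hypothetical table: one relational column plus an XML document kept as opaque text.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (customer TEXT, doc TEXT)")
conn.executemany(
    "INSERT INTO orders VALUES (?, ?)",
    [
        ("acme", "<order><item sku='A1' qty='2'/><item sku='B7' qty='1'/></order>"),
        ("globex", "<order><item sku='A1' qty='5'/></order>"),
    ],
)


def total_quantity(sku):
    """Sum the quantity ordered for one SKU by parsing every stored document."""
    total = 0
    for (doc,) in conn.execute("SELECT doc FROM orders"):
        root = ET.fromstring(doc)  # the database cannot look inside the text
        for item in root.iter("item"):
            if item.get("sku") == sku:
                total += int(item.get("qty"))
    return total


print(total_quantity("A1"))
```

Every row is read and parsed regardless of whether it contains the SKU, which is the kind of overhead a database that understands XML natively is meant to avoid.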
Berni Schiefer, manager of DB2 UDB performance, benchmarks and solutions development at IBM’s Toronto Software Lab, said Viper will allow applications to use either XQuery or SQL to retrieve the data, rather than requiring separate applications to pull from relational and XML repositories.
“Think of it as having two streams coming into a box and two coming out. DB2 provides wiring that lets you take incoming streams and mix and match them with outgoing streams,” he said.
With traditional databases, “you almost have to think ahead of all the variations you might expect – we call that ‘schema chaos,’” Schiefer said. “When you have schema chaos, the traditional relational model is not the ideal fit.”
For Viper beta testers such as San Mateo, Calif.-based SkyTide, which makes software to analyze XML data, the benefits could be significant, according to John Morrell, the company’s vice-president of product marketing.
“One of the big gaps in the market [is the ability] to effectively manage XML data they’re acquiring with a lot of the other relational data,” he said. “As it grows from five to 15 per cent of corporate data, it becomes an important asset. You need to manage it.”
IBM’s chief rival in the database market is still Oracle, which, according to IDC data released earlier this year, continues to command 41.3 per cent of the market. Oracle has spent the last three years pushing 10g, a database focused on allowing enterprises to balance computing workloads by creating grids of IT resources.
Viper doesn’t compete with that approach, but will offer autonomic or self-managing capabilities that will let the database configure itself for various environments, such as SAP, Microsoft or Oracle. Schiefer said it will also make sure the right amount of memory is available for various jobs, such as an overnight batch run.
“Think of a car with a fuel injection system where the computer knows how much gas to shoot into the engine at the right time to maximize the amount of energy,” he said. “We’re constantly taking sensors in the database where the demands are and using a metric that we call ‘unit of goodness per byte of memory’ to know who would be the most valuable recipient in order to optimize the throughput.”
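Schiefer did not describe how the metric is computed, so the following is only an illustrative sketch of the general idea of handing memory to whichever consumer gains the most per byte. It is not IBM's algorithm: the `allocate_memory` function, the consumer names and the benefit curves are all assumptions made up for this example.

```python
def allocate_memory(consumers, total_bytes, chunk=1024):
    """Greedily hand out memory in fixed chunks to the consumer with the
    highest estimated benefit per byte for its next chunk.

    consumers maps a name to a function that takes the bytes already held
    and returns the estimated benefit of one more chunk (hypothetical).
    """
    allocation = {name: 0 for name in consumers}
    remaining = total_bytes
    while remaining >= chunk:
        # Score each consumer by the benefit per byte of receiving the next chunk.
        scores = {
            name: gain(allocation[name]) / chunk
            for name, gain in consumers.items()
        }
        best = max(scores, key=scores.get)
        if scores[best] <= 0:
            break  # nobody would benefit from more memory
        allocation[best] += chunk
        remaining -= chunk
    return allocation


# Hypothetical consumers with diminishing returns as they receive more memory.
consumers = {
    "sort_heap": lambda used: 100 / (1 + used / 1024),
    "buffer_pool": lambda used: 60 / (1 + used / 4096),
}
print(allocate_memory(consumers, 8 * 1024))
```

With these made-up curves, the sort heap wins the first chunk, but its benefit falls off faster, so the buffer pool ends up with the larger share.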
IBM’s other main database rival, Microsoft, released the 2005 version of its SQL Server database last week.