Driven by steady growth in the need for long-term data storage, and in particular by the growing volume of data that must be kept unchanged, a new approach to storage is gaining attention. It’s called content-addressed storage (CAS).
Consulting firm Capgemini S.A. has called it a quiet revolution. Enterprise Strategy Group Inc. (ESG), a Milford, Mass., research firm, estimates sales of “CAS and CAS-like solutions” hit almost US$1 billion last year and expects that number to roughly double in 2006, says Tony Asaro, senior analyst at ESG.
“CAS came about because of the need for managing the massive amount of fixed content that is out there,” says Tanya Loughlin, manager of strategic programs at Hopkinton, Mass.-based storage vendor EMC Corp. and chair of the CAS Community, an organization set up to promote content-addressed storage and educate users about it.
“It all comes down to having a lower-cost tier of storage to keep online infrequently accessed data,” Asaro says.
Examples are medical records – which are storage-intensive and must by law be kept for long periods – and business records that legislation such as Canada’s Bill C-198 and the U.S.’s Sarbanes-Oxley Act require to be kept unaltered for prescribed lengths of time.
Traditional storage technology is designed primarily for data that changes – the data generated by a payroll system or an inventory system, say. “No one had ever built a storage system and taken into account the nature and characteristics of a system designed for fixed content,” says Ken Steinhardt, director of technology analysis at EMC.
Content-addressed storage – sometimes called object-based storage – is good for this because it stores data as objects with metadata attached. This metadata includes a sort of digital signature that can be used to verify that the data has not changed. This is important for two reasons, according to Steinhardt. For one, it satisfies the regulatory need for an assurance that information hasn’t been altered. For another, it allows the storage system to check the data for integrity.
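The idea of an object carrying a signature that proves it has not changed can be sketched in a few lines. This is only an illustration, not how Centera or any other product is implemented: the use of SHA-256 and the names `StoredObject`, `content_address`, `store` and `is_unaltered` are assumptions made for the example.

```python
import hashlib
from dataclasses import dataclass, field


@dataclass
class StoredObject:
    """A fixed-content object plus the metadata kept alongside it."""
    data: bytes
    address: str                                   # digest derived from the content
    metadata: dict = field(default_factory=dict)   # e.g. patient ID, document type


def content_address(data: bytes) -> str:
    """Derive the address from the bytes themselves (SHA-256 is an assumption)."""
    return hashlib.sha256(data).hexdigest()


def store(data: bytes, **metadata) -> StoredObject:
    """Wrap the data with its content-derived address and any metadata."""
    return StoredObject(data=data, address=content_address(data), metadata=metadata)


def is_unaltered(obj: StoredObject) -> bool:
    """Recompute the digest and compare it with the address recorded at write time."""
    return content_address(obj.data) == obj.address
```

Because the address is computed from the content, any change to the stored bytes makes the recomputed digest disagree with the recorded one, which is what lets both an auditor and the storage system itself detect alteration.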
All storage media are vulnerable to “bit flipping” – over time, small errors can be introduced into the data they store. If data is stored long enough, enough errors can accumulate to become a problem. EMC’s Centera content-addressed storage can actually check stored data for such errors and correct them.
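The article does not say how Centera detects and corrects these errors. One common approach, shown here purely as an assumption, is to keep more than one copy of each object and periodically compare every copy against the content-derived address, overwriting any copy that no longer matches. The function names `digest` and `scrub` are hypothetical.

```python
import hashlib
from typing import List


def digest(data: bytes) -> str:
    """Content-derived address (SHA-256 is an assumption for this sketch)."""
    return hashlib.sha256(data).hexdigest()


def scrub(address: str, replicas: List[bytes]) -> List[bytes]:
    """Return the replicas with every corrupted copy replaced by an intact one.

    Raises ValueError if no replica still matches the address.
    """
    good = next((r for r in replicas if digest(r) == address), None)
    if good is None:
        raise ValueError("no intact replica left for " + address)
    return [r if digest(r) == address else good for r in replicas]
```

A single flipped bit changes the digest, so the damaged copy is identified without knowing in advance which copy is the correct one.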
The metadata has other uses. Often archival data must be stored for a fixed period – usually a number of years. CAS can use metadata to record how long data needs to be kept and ensure that no data can be deleted before its time. CAS systems can also delete data automatically when the required retention period ends – or simply notify an administrator who makes the final decision.
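A retention rule of this kind might look like the following sketch. The class names `Record`, `RetentionStore` and `RetentionError`, and the choice to pass the current time in explicitly, are assumptions made for illustration; real products expose their own interfaces.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Dict, List


class RetentionError(Exception):
    """Raised when a delete is attempted before the retention period ends."""


@dataclass
class Record:
    address: str
    stored_at: datetime
    retention: timedelta          # how long the object must be kept

    def expires_at(self) -> datetime:
        return self.stored_at + self.retention


class RetentionStore:
    def __init__(self) -> None:
        self._records: Dict[str, Record] = {}

    def put(self, record: Record) -> None:
        self._records[record.address] = record

    def delete(self, address: str, now: datetime) -> None:
        """Refuse to delete anything whose retention period is still running."""
        record = self._records[address]
        if now < record.expires_at():
            raise RetentionError(address + " must be kept until "
                                 + record.expires_at().isoformat())
        del self._records[address]

    def expired(self, now: datetime) -> List[str]:
        """Addresses eligible for deletion, e.g. to show to an administrator."""
        return [a for a, r in self._records.items() if now >= r.expires_at()]
```

The `expired` method covers the second policy described above: the system can either delete everything it returns or simply report the list and leave the decision to an administrator.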
CAS assigns a unique name to each data object and stores it once. Storage is location-independent, so nobody needs to worry about the physical location of the data, and it can even be moved around without affecting the applications that use it. This makes CAS a natural fit with information lifecycle management (ILM), says Christina Casten, EMC’s director of strategic programs and chair of the Storage Networking Industry Association (SNIA) committee that is working on an Extensible Access Method (XAM) interface standard for CAS.
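Location independence can be illustrated with a small catalog that maps each object's address to whichever storage tier currently holds it. Everything here is a hypothetical sketch: the `Catalog` class, the tier names and the use of SHA-256 are assumptions, not part of any vendor's product or of the XAM work.

```python
import hashlib
from typing import Dict


class Catalog:
    """Maps content addresses to whichever tier currently holds the bytes."""

    def __init__(self, tiers: Dict[str, Dict[str, bytes]]) -> None:
        self.tiers = tiers                      # tier name -> {address: data}
        self.location: Dict[str, str] = {}      # address -> tier name

    def put(self, data: bytes, tier: str) -> str:
        address = hashlib.sha256(data).hexdigest()
        if address not in self.location:        # each object is stored once
            self.tiers[tier][address] = data
            self.location[address] = tier
        return address

    def get(self, address: str) -> bytes:
        """Applications ask by address only; they never name a location."""
        return self.tiers[self.location[address]][address]

    def move(self, address: str, new_tier: str) -> None:
        """Relocate an object, as an ILM policy might, without changing its address."""
        old_tier = self.location[address]
        if old_tier == new_tier:
            return
        self.tiers[new_tier][address] = self.tiers[old_tier].pop(address)
        self.location[address] = new_tier
```

An application that holds only the address keeps working after `move`, because the address depends on the content rather than on where the content sits.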
ILM moves data around to the most appropriate storage media at different points in its life – depending on how often the data will be used, for instance. “You can actually keep more intelligence in the CAS system that helps you manage more of those things,” says Paul O’Brien, director of information lifecycle management for Hewlett-Packard Co.’s StorageWorks products.
CAS can also do away with the space-wasting proliferation of copies of the same data throughout an organization, O’Brien notes. “Rather than keeping 30 copies of the same PowerPoint presentation, you can keep a single copy or a single instance of that document,” he explains. Different users and applications appear to have their own copies, but the data is stored only once.
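Single-instance storage of this kind can be sketched as two tables: one holding each distinct object once, keyed by its content address, and one mapping every user's file name to an address. The `SingleInstanceStore` class and its method names are assumptions for illustration only.

```python
import hashlib
from typing import Dict, Tuple


class SingleInstanceStore:
    def __init__(self) -> None:
        self.objects: Dict[str, bytes] = {}             # address -> data, stored once
        self.names: Dict[Tuple[str, str], str] = {}     # (owner, filename) -> address

    def save(self, owner: str, filename: str, data: bytes) -> str:
        address = hashlib.sha256(data).hexdigest()
        self.objects.setdefault(address, data)          # no second copy if already present
        self.names[(owner, filename)] = address
        return address

    def open(self, owner: str, filename: str) -> bytes:
        """Each user appears to have a private copy of the file."""
        return self.objects[self.names[(owner, filename)]]


# Thirty users save the same presentation; only one copy of the bytes is kept.
shared = SingleInstanceStore()
deck = b"quarterly results presentation"
for n in range(30):
    shared.save("user%d" % n, "results.ppt", deck)
```

After the loop, `shared.names` holds 30 entries while `shared.objects` holds one.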
O’Brien says this can reduce a typical organization’s storage needs by 20 to 30 per cent. HP hopes to save more space with new technology due this summer that will store single instances of data at the block level, so if several PowerPoint presentations differ by only a few slides, the sections they have in common are stored just once.
CAS is also easy to manage, because it dispenses with the complex file systems used in most storage. Though CAS is more costly than the tape and compact disc systems it mainly aims to replace, Asaro says, simplified management results in lower total cost of ownership than conventional storage systems offer.
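The article gives no details of HP's block-level technology, so the following is only a generic sketch of the idea. It assumes fixed-size blocks (real systems may split data differently) and uses a tiny block size so the effect is visible; `BlockStore` and its methods are hypothetical names.

```python
import hashlib
from typing import Dict, List


class BlockStore:
    """Stores each distinct block once; a file is a list of block addresses."""

    def __init__(self, block_size: int = 4) -> None:
        self.block_size = block_size            # tiny, for illustration only
        self.blocks: Dict[str, bytes] = {}      # block address -> block bytes

    def write(self, data: bytes) -> List[str]:
        recipe = []
        for i in range(0, len(data), self.block_size):
            block = data[i:i + self.block_size]
            address = hashlib.sha256(block).hexdigest()
            self.blocks.setdefault(address, block)   # shared blocks stored once
            recipe.append(address)
        return recipe

    def read(self, recipe: List[str]) -> bytes:
        """Reassemble a file from its list of block addresses."""
        return b"".join(self.blocks[a] for a in recipe)


# Two "presentations" that differ only in their last section.
blocks = BlockStore(block_size=4)
first = blocks.write(b"AAAABBBBCCCC")
second = blocks.write(b"AAAABBBBDDDD")
```

The two files have six blocks between them, but only four distinct blocks are stored, because the first two sections are identical.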
At Guelph General Hospital in Guelph, Ont., an 18-month-old EMC Centera CAS system has helped keep pace with growing storage requirements for diagnostic images and simplified storage management, says Jason Winter, the hospital’s manager of information technology.
Previously, the hospital used DVD jukeboxes to store the images. Winter was looking for quicker, easier access to images and lower overall cost. Although the CAS technology itself cost more, the move lowered total cost of ownership through simplified management. Not only are day-to-day operations simpler, Winter says – “Centera is a black box for us” – but dealing with growing storage requirements is easier than before because the system is scalable. And Winter says he is more confident than before that images will be stored reliably for long periods.
Today, interest in CAS is driven largely by compliance legislation, helped along by the burgeoning storage requirements for material such as health records and video files. Since EMC pioneered the technology with Centera about three and a half years ago, Capgemini says, other vendors such as HP and StorageTek (now part of Sun), as well as smaller companies, have been quick to jump on the bandwagon.
As with many emerging technology areas, there is a need for standards. “There are a lot of products in the field and they all do things a little differently,” notes Jered Floyd, vice-president of development at CAS vendor Permabit Inc. in Cambridge, Mass. The SNIA’s XAM effort is aimed at providing a standard interface to all CAS technology. The committee is aiming for a draft standard in about a year, Floyd says.