Meaningful collection and identification of data may be the Rosetta Stone of any corporate organization, but what do you do with the data about data?
Metadata, or “data about data,” as Norman Paskin, founder of the International DOI Foundation, calls it, is another, extremely large superset of information that must also have its correct place in the world.
For Paskin, who spoke Thursday at Access Copyright’s Digital Rights Conference in Toronto, metadata is a way to define a relationship between different levels of data.
The problem is that there’s just so much of it. There are numerous classifications for metadata, including the Digital Object Identifier, or DOI. Paskin, the man responsible for DOI, sees the need for a metadata dictionary or framework that could put the classifications in context.
He’s not the first. A European initiative called Indecs (interoperability of data in e-commerce systems), of which the DOI Foundation is a member, has been working on the problem for several years. A metadata dictionary would operate somewhat like an English-to-French dictionary, only one that covers every metadata language.
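One way to picture such a dictionary is as a shared vocabulary that every metadata scheme is mapped to once, so that any scheme can be translated into any other by passing through the shared terms. The sketch below is illustrative only: the scheme names, field names, and the `translate` function are hypothetical and are not taken from Indecs or the DOI Foundation.

```python
# Hypothetical mappings from each scheme's own field names to shared terms.
# Neither scheme nor any field name here comes from a real standard.
TO_SHARED = {
    "scheme_a": {"title": "name", "creator": "contributor", "id": "identifier"},
    "scheme_b": {"Titel": "name", "Urheber": "contributor", "Kennung": "identifier"},
}


def translate(record, source, target):
    """Translate a record from one scheme to another via the shared terms."""
    to_shared = TO_SHARED[source]
    # Invert the target mapping: shared term -> target field name.
    from_shared = {term: field for field, term in TO_SHARED[target].items()}

    translated = {}
    for field, value in record.items():
        term = to_shared.get(field)
        if term is None:
            continue  # field has no entry in the dictionary; drop it
        target_field = from_shared.get(term)
        if target_field is not None:
            translated[target_field] = value
    return translated


record = {"title": "Moby-Dick", "creator": "Herman Melville", "shelf": "B12"}
print(translate(record, "scheme_a", "scheme_b"))
```

With this arrangement, each new scheme needs only one mapping to the shared terms rather than a separate mapping to every other scheme.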
In an ideal world, you would start from scratch and construct all metadata so that it was interoperable, said Brian Green, manager of U.K.-based Book Industry Communication, an organization that develops standards for e-commerce in the books and periodicals industry. But since that isn’t feasible, a universal hub that can make disparate metadata work together is the only reasonable solution. “It’s time for some standards-based implementations, and we look forward to them,” said Green.
He cited the work that has already been done by Indecs, as well as that done by InterParty, another European effort that is continuing the work. InterParty, however, is focused on building a nomenclature for corporations and individuals to make it easier to recognize them in contracts involving intellectual property and digital rights.
Metadata is everywhere, yet easily overlooked by most. Every book published is marked with an ISBN (International Standard Book Number), for example, which is overseen by the International Organization for Standardization (ISO).
The problem with ISBN, however, is that the numbers are running out. So, effective Jan. 1, 2007, ISBN will move from a 10-digit format to a 13-digit one.
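For readers curious about the mechanics, the conversion of an existing 10-digit number can be sketched as follows. The details are not from the conference and are supplied here as background: the old check digit is dropped, the prefix 978 is added, and a new check digit is computed using alternating weights of 1 and 3. The function name is hypothetical, and the sketch does not verify the original ISBN-10 check digit.

```python
def isbn10_to_isbn13(isbn10: str) -> str:
    """Convert a 10-digit ISBN to its 13-digit form (illustrative sketch)."""
    chars = isbn10.replace("-", "").replace(" ", "")
    if len(chars) != 10:
        raise ValueError("expected 10 characters after removing separators")

    # Drop the old check digit (which may be 'X') and add the 978 prefix.
    core = "978" + chars[:9]
    if not core.isdigit():
        raise ValueError("the first nine characters must be digits")

    # Weights alternate 1, 3, 1, 3, ... across the 12 digits.
    total = sum(int(ch) * (1 if i % 2 == 0 else 3) for i, ch in enumerate(core))
    check = (10 - total % 10) % 10
    return core + str(check)


print(isbn10_to_isbn13("0-306-40615-2"))  # 9780306406157
```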
“An object has to be able to be described in order to be understood as a unique identity,” said Jane Thacker, who has led the ISBN re-development. Thacker, based in Ottawa, manages the International Secretariat of the ISO’s committee on identification and description of information resources (also known as ISO TC46/SC9).
Thacker is also leading the development of the International Standard Text Code, another metadata standard due to come into effect next year, which will catalogue textual works such as articles, poems, screenplays and short stories. That should help automate some of the contractual work around intellectual property, she said.
Norm Friesen, the director of CanCore, is also trying to remove some of the complexity around metadata. CanCore, the Canadian centre dedicated to e-learning metadata, has modified the IEEE’s LOM (Learning Object Metadata) standard to make it easier to use for schools and universities.
The problem with e-learning, especially in the education system, is that it’s a relatively young discipline and therefore unable to take advantage of the economies of scale enjoyed by its corporate cousins. It’s “often still grassroots and somewhat hobbyist,” said Friesen, who’s worked with EduSource (the Canadian network of learning object repositories) and the University of New Brunswick.
CanCore is helping to deploy LOM across Canada, which should in turn help assign metadata to pieces of online learning courses so that they can be separated and repackaged more easily.
The variety of metadata may be daunting, but the ultimate goal is to make sure all of it is relatable. It remains a behind-the-scenes discipline that is being guided and redefined by organizations like the DOI Foundation. As metadata mechanic Paskin says, “You don’t need to know what’s under the hood.”