E-commerce players flocking to semantic Web systems

Despite the recession, luggage retailer Ebags.com enjoyed phenomenal 2010 holiday sales — some 33 per cent higher than the previous year. (The online retail sector as a whole reported a 15 per cent gain this past holiday season.) Both Black Friday and Cyber Monday sales set all-time records, according to Ebags Inc. co-founder Peter Cobb.

Cobb credits many of these gains to his company’s deployment of Endeca Technologies Inc.’s online retail platform, which uses semantic technology to analyze shoppers’ keyword choices and clicks, and then winnows down results from categories to subcategories and microcategories. The end result? “Guiding the shopper to the perfect bag very quickly,” Cobb says.

Endeca’s Web site navigation software allows shoppers to use type, brand, price and size filters to get to relevant choices, Cobb explains. “With over 500 brands and 40,000 bags, we recognized a few years ago how important semantic search and guidance was to the shopping experience.”
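The filter-driven narrowing Cobb describes can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not Endeca's software: the Bag record, the filter_bags helper and the three-item catalog are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Bag:
    """Hypothetical catalog record with the four facets named in the article."""
    name: str
    bag_type: str
    brand: str
    price: float
    size: str


def filter_bags(catalog, max_price=None, **facets):
    """Return bags that match every requested facet and cost at most max_price."""
    results = []
    for bag in catalog:
        if max_price is not None and bag.price > max_price:
            continue
        # Every remaining facet must match the bag's attribute exactly.
        if all(getattr(bag, key) == value for key, value in facets.items()):
            results.append(bag)
    return results


# Invented sample data, for illustration only.
CATALOG = [
    Bag("Weekender", "duffel", "Acme", 89.0, "medium"),
    Bag("Commuter", "backpack", "Acme", 59.0, "small"),
    Bag("Globetrotter", "duffel", "Zenith", 149.0, "large"),
]

matches = filter_bags(CATALOG, bag_type="duffel", max_price=100)
print([bag.name for bag in matches])
```

Each added filter shrinks the result set, which is the winnowing from categories to subcategories that the shopper experiences.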

By providing highly detailed descriptions of products and their attributes, and linkages between categories, the semantic technology has also enabled Ebags to attain higher placement on Web search engine results pages, according to the e-retailer’s chief technology officer, Chris Cummings.

In the late 1990s, Tim Berners-Lee, now widely known as the father of the World Wide Web, announced his vision of a “semantic Web” that would help people find exactly the information, answer or product they were looking for. This would happen, he hoped, without users having to design complex queries or try dozens of different keyword combinations or sort through thousands of irrelevant URLs.

To help make this happen, the World Wide Web Consortium (W3C), under Berners-Lee’s direction, has developed standards that allow computer platforms and software agents to identify, access and integrate information from disparate Web sites and domains, as well as from various information silos within an enterprise.

Using the W3C standard Resource Description Framework (RDF), for example, retailers and manufacturers could pass detailed product information back and forth, says Jay Myers, lead Web development engineer at BestBuy.com. “Right now, a lot of our vendors provide product information in spreadsheets, which makes it hard to distill.”
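To make the contrast with spreadsheets concrete, here is a minimal sketch of how a product record might be expressed as RDF triples and written out in the N-Triples line format. The namespaces, the SKU and the attribute names are hypothetical, and the code uses plain Python rather than any particular RDF library.

```python
# Hypothetical namespaces; a real vendor would publish its own URIs.
PRODUCTS = "http://example.com/products/"
VOCAB = "http://example.com/schema#"


def uri(value):
    """Wrap a URI in angle brackets, as N-Triples requires."""
    return f"<{value}>"


def literal(value):
    """Quote a value as a plain string literal, escaping backslashes and quotes."""
    escaped = str(value).replace("\\", "\\\\").replace('"', '\\"')
    return f'"{escaped}"'


def product_triples(sku, attributes):
    """Turn one product's attributes into (subject, predicate, object) triples."""
    subject = uri(PRODUCTS + sku)
    return [(subject, uri(VOCAB + key), literal(val)) for key, val in attributes.items()]


def to_ntriples(triples):
    """Serialize triples one per line, each terminated by ' .'."""
    return "\n".join(f"{s} {p} {o} ." for s, p, o in triples)


triples = product_triples("tv-1234", {"brand": "Acme", "screenSizeInches": 42})
print(to_ntriples(triples))
```

Because every statement names its subject and property with a URI, a receiving system can merge records from many vendors without guessing what a spreadsheet column means.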

BestBuy.com isn’t currently taking full advantage of the W3C RDF’s capabilities; that’s still a future goal, according to Myers. Indeed, Berners-Lee’s dream is still a long way from reality, although it’s getting closer. Many business decision-makers remain skeptical that the paybacks of adopting semantic technology will make up for the costs and risks. What’s needed is a killer app that will persuade a critical mass of business users to invest in semantic Web software, says Phil Simon, a consultant and the author of The Next Wave of Technologies.

What it’s all about

Semantic software uses a variety of techniques to analyze and describe the meaning of data objects and their inter-relationships. These include a dictionary of generic and, often, industry-specific definitions of terms, as well as analysis of grammar and context to resolve language ambiguities such as words with multiple meanings.

For example, the phrase “there are 40 rows in the table” uses rows as a noun, whereas “she rows five times a week” uses rows as a verb. Likewise, the word stock has one meaning in the phrase “I used beef bones for my soup stock,” another in “the supermarket keeps a lot of stock on hand” and yet another in “analysts are bearish on the stock.”
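One simple way to resolve the "stock" example is to compare the words around the ambiguous term with a small set of cue words for each sense. The sketch below assumes a hand-built, hypothetical sense dictionary; commercial semantic engines rely on far richer dictionaries and grammatical analysis.

```python
import re

# Hypothetical cue words for each sense of "stock".
SENSES = {
    "stock": {
        "broth": {"soup", "bones", "beef", "simmer"},
        "inventory": {"supermarket", "warehouse", "shelves", "hand"},
        "equity": {"analysts", "bearish", "shares", "market"},
    }
}


def disambiguate(word, sentence):
    """Pick the sense whose cue words overlap most with the sentence."""
    context = set(re.findall(r"[a-z]+", sentence.lower()))
    senses = SENSES[word]
    return max(senses, key=lambda sense: len(senses[sense] & context))


for sentence in (
    "I used beef bones for my soup stock",
    "the supermarket keeps a lot of stock on hand",
    "analysts are bearish on the stock",
):
    print(sentence, "->", disambiguate("stock", sentence))
```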

Advice for going semantic

  • Remember that collaboration between subject matter experts and IT staff is crucial when developing a semantic ontology
  • Make sure you have a specific business mission before you build an ontology; otherwise, it will wind up being a useless exercise
  • Resist the urge to jump in with both feet right away; it’s better to go slowly, implement projects that solve real problems and win converts along the way

Resolving language ambiguities ensures that a shopper who does a search using the phrase “used red cars” will also get results from Web sites that use slightly different terms with similar meanings, such as “pre-owned red automobiles,” for example.
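A rough sketch of that kind of matching: map each word to a canonical concept, then compare concepts rather than raw keywords. The synonym table here is a tiny hypothetical stand-in for a real dictionary or ontology.

```python
# Hypothetical synonym table mapping surface words to canonical concepts.
SYNONYMS = {
    "used": "used", "pre-owned": "used", "secondhand": "used",
    "car": "car", "cars": "car", "automobile": "car", "automobiles": "car",
}


def normalize(phrase):
    """Reduce a phrase to its set of canonical concepts."""
    return frozenset(SYNONYMS.get(word, word) for word in phrase.lower().split())


def matches(query, listing_title):
    """True when every concept in the query also appears in the listing."""
    return normalize(query) <= normalize(listing_title)


print(matches("used red cars", "Pre-owned red automobiles"))
print(matches("used red cars", "New red cars"))
```

The first call succeeds even though the two phrases share only one literal word; the second fails because "new" does not map to the "used" concept.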

It also makes it possible for a user to, say, type in a complex query like “progressive rock songs from the 1970s with odd time signatures and atmospheric feels” at a music Web site like iTunes or Amazon.com and get back Pink Floyd, says Simon.

Once defined, content is tagged with descriptive metadata or “markups” and is mapped into an ontology. Ontologies are schemas that describe data objects and their relationships. Developing them is typically a collaborative effort involving technicians who understand semantic schemas and subject matter experts who understand business language.

Semantic Web technology refers to products and architectures that support semantic searches, queries, publishing and retrieval based on W3C standards. These include Web Ontology Language (otherwise known as OWL), the Resource Description Framework (RDF) and the SPARQL Protocol and RDF Query Language (SPARQL), as well as existing Web standards like XML and HTTP.
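For readers who have not seen a SPARQL query, the core idea is pattern matching over triples, with variables (written with a leading question mark) standing in for unknown values. The sketch below imitates that idea in plain Python on a handful of invented triples; it is not a SPARQL engine, and the data and property names are hypothetical.

```python
# Roughly equivalent SPARQL, for comparison:
#   SELECT ?product WHERE { ?product ex:brand "Acme" . ?product ex:color "red" . }

# Invented sample triples: (subject, predicate, object).
TRIPLES = [
    ("bag1", "brand", "Acme"),
    ("bag1", "color", "red"),
    ("bag2", "brand", "Acme"),
    ("bag2", "color", "blue"),
    ("bag3", "brand", "Zenith"),
    ("bag3", "color", "red"),
]


def match_pattern(pattern, triple, bindings):
    """Extend bindings so the pattern equals the triple, or return None."""
    extended = dict(bindings)
    for term, value in zip(pattern, triple):
        if term.startswith("?"):
            # A variable must keep the same value everywhere it appears.
            if extended.setdefault(term, value) != value:
                return None
        elif term != value:
            return None
    return extended


def query(patterns, triples):
    """Return every set of variable bindings that satisfies all patterns."""
    solutions = [{}]
    for pattern in patterns:
        solutions = [
            extended
            for bindings in solutions
            for triple in triples
            if (extended := match_pattern(pattern, triple, bindings)) is not None
        ]
    return solutions


results = query([("?product", "brand", "Acme"), ("?product", "color", "red")], TRIPLES)
print([row["?product"] for row in results])
```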

The hidden helper

Ebags.com’s Cummings admits that he’s not all that familiar with semantic technology. However, he is very aware that Endeca’s semantic-based online retail platform has played a major role in increasing Ebags’ sales. “Since it was deployed, our conversion rates have doubled,” he reports. (Conversion is the term used to describe what happens when a shopper who clicks on a link to an e-commerce site actually buys something.)

Indeed, business users, and even some IT executives, don’t always realize that their e-commerce or enterprise software platforms are using semantic technology. However, they definitely appreciate the paybacks.

In addition to stronger sales numbers, other benefits of semantic technology can include more clicks from Web search engines, higher customer satisfaction rankings and, internally, more timely and effective decision-making and faster responses to competitors and market changes.

One of the earliest applications of semantic technology has been to help business users more easily find and access the information they need, no matter where the data is located and no matter who owns it.

Michael Lang, CEO of Cambridge Semantics Inc., a Boston-based maker of semantic middleware and plug-ins, is betting that semantic platforms will supplant traditional business intelligence systems. The main reason he’s expecting this to happen, he says, is that semantic technology eliminates the need to extract, transform and load all relevant data from disparate information silos into data warehouses and marts that need to be constantly updated.

With semantic technology, all of that happens on the fly and in the background.

According to Lynda Moulton, an analyst at Gilbane Group, a Cambridge, Mass.-based research arm of Outsell Inc., semantic technology can provide significant benefits for enterprises that are confronted with data that has some combination of the following characteristics:

  • It’s voluminous, with millions of unstructured documents
  • It’s complex in scope and depth
  • It’s valuable to end users, but in small, disparate pieces

  • It’s needed by highly paid and highly skilled professionals for use in their areas of expertise
  • It’s undifferentiated for e-discovery and research purposes. That means, for example, that the information lacks metadata and is not available in a structured format that supports intelligent searches
  • It’s likely to have an impact on the bottom line, indirectly or directly, when discovered

Semantic technology can process such information so that it can be “aggregated, federated, pinpointed or analyzed to reveal concepts or meanings” that are logistically impossible for human beings to obtain manually, Moulton says. Early adopters of semantic technology included companies in the publishing and life sciences industries; they’re now being followed by enterprises “whose content has grown to proportions unmanageable by humans,” says Moulton.

Competing for clicks

Semantic technologies can “make search engines better or more precise in finding relevant content,” says Moulton. So if your company operated a retail Web site, that would mean that semantically-enabled searches would do a better job of leading shoppers to your site and then helping them find products they want to buy.

BestBuy.com, for example, realized “high ROI in terms of increased store and product visibility on the Web,” Myers says. While adding semantic metadata to product pages on some 1,100 store blogs was no small task, Myers’ team saved a great deal of technical grunt work by using GoodRelations, an ontology that German university professor Martin Hepp developed specifically for e-commerce.

GoodRelations provides a standardized vocabulary (the semantic Web term for ontology) for product, price and company data. This information can be embedded into existing Web pages, then processed by other computers, applications and search engines that support W3C standards, which is what makes richer product information available to those search engines. It also provides the potential for cross-domain semantic querying across e-commerce sites, as long as other e-commerce companies incorporate the vocabulary into their data, too. So far, only a handful of retailers have done so, including BestBuy.com and, more recently, Overstock.com.
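As a rough idea of what embedding looks like, the sketch below generates an RDFa-style HTML fragment that describes one offer using GoodRelations terms such as gr:Offering and gr:UnitPriceSpecification. The product name, price and the offering_rdfa helper are hypothetical, and the markup layout is an assumption modeled on the vocabulary, not a copy of BestBuy.com's pages.

```python
from html import escape

# GoodRelations vocabulary namespace.
GR_NS = "http://purl.org/goodrelations/v1#"


def offering_rdfa(name, price, currency):
    """Build an HTML fragment that annotates one offer with GoodRelations terms."""
    return (
        f'<div xmlns:gr="{GR_NS}" typeof="gr:Offering">\n'
        f'  <span property="gr:name">{escape(name)}</span>\n'
        f'  <div rel="gr:hasPriceSpecification">\n'
        f'    <div typeof="gr:UnitPriceSpecification">\n'
        f'      <span property="gr:hasCurrency" content="{escape(currency)}"></span>\n'
        f'      <span property="gr:hasCurrencyValue">{price:.2f}</span>\n'
        f'    </div>\n'
        f'  </div>\n'
        f'</div>'
    )


# Invented example product.
fragment = offering_rdfa("Weekender Duffel & Tote", 89, "USD")
print(fragment)
```

A shopper sees an ordinary product page, while a crawler that understands the vocabulary can read the name, price and currency as structured data.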

While Myers could give no hard numbers on time savings, he said that, in contrast with most deployments of new methodologies and technologies, “we spent very little overhead time implementing GoodRelations in our markup.” After an “initial introduction,” developers typically found working with GoodRelations as easy as coding standard HTML, Myers says.

BestBuy.com is exploiting the power and precision of semantic search not only to help shoppers find what they want but also to bring their attention to specific types of products, such as “long-tail” items that don’t generate huge sales, Myers explains. And early last year, his team developed a program, based on semantic Web standards, that makes it easy for store managers to publish information about “open box” or returned products on the store’s WordPress blog. Because these products are slightly cheaper, they are much in demand among customers with budget restrictions, Myers points out.

Semantic Web platforms from vendors such as Expert System, Cambridge Semantics, Sinequa and Lexalytics allow users to query both internal enterprise data and Web sources, including blogs, social networks like Facebook and other Web 2.0 media.

Answering employees’ questions

Bouygues Construction is using Sinequa’s Context Engine to put employees in touch with in-house experts who can answer their questions in a broad range of areas, says Eric Juin, the worldwide construction firm’s e-services and knowledge management director. “It could be a lawyer, an engineer or an executive, anywhere in the world.” The semantic platform identifies and categorizes all experience within the company, worldwide, by analyzing vast quantities of unstructured information, including training materials, project documentation and other internal sources, as well as Web-based newspapers and scientific publications, Juin says.

The platform is also being used to help knowledge workers quickly find information that resides either on internal systems or on the Web, Juin says. The semantic engine pores through documents, as well as comments from internal experts, and scores the material in terms of relevance to the user’s query, he adds.
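Relevance scoring can take many forms. The sketch below uses plain term frequency as a stand-in, purely to show the idea of ranking documents against a query; it is not Sinequa's algorithm, and the document names and text are invented.

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def score(query, document):
    """Share of the document's tokens that are query terms."""
    tokens = tokenize(document)
    if not tokens:
        return 0.0
    counts = Counter(tokens)
    return sum(counts[term] for term in set(tokenize(query))) / len(tokens)


def rank(query, documents):
    """Return names of documents with a nonzero score, most relevant first."""
    scored = [(score(query, text), name) for name, text in documents.items()]
    return [name for value, name in sorted(scored, reverse=True) if value > 0]


# Invented internal documents.
DOCUMENTS = {
    "safety_memo": "Crane safety checklist for site crane operators",
    "hr_note": "Holiday schedule for office staff",
    "training": "Operator training covers crane load limits",
}

print(rank("crane safety", DOCUMENTS))
```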

Juin says that while no hard ROI numbers are available, there’s plenty of anecdotal evidence that the platform has helped Bouygues employees to avoid mistakes on construction sites by allowing them to rapidly contact people who can answer their questions. These anecdotes helped his staff cost-justify the deployment to management, he adds. Not that the project was expensive. It cost “just a [small percentage]” of the cost of Bouygues’ ERP project, Juin says.

Tips from the trenches

Data housekeeping is a critical preliminary step, experts agree. “The extent to which content is enriched with good metadata [means] you can start to build applications that deliver on the promise of semantic Web,” says Geoffrey Bock, an analyst at Gilbane Group.

Consultant Simon says he’s worked on a number of projects implementing “breakthrough” information technologies, and he has learned that if you don’t do housekeeping chores like cleansing and deduplicating data, “you just have better access to bad data.”
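The housekeeping Simon has in mind can start small. This sketch normalizes whitespace and letter case, then drops records that repeat the same key; the field names and sample records are hypothetical, and real cleansing usually needs fuzzier matching than this.

```python
def clean(record):
    """Collapse runs of whitespace and lowercase every field value."""
    return {key: " ".join(str(value).split()).lower() for key, value in record.items()}


def deduplicate(records, key_fields):
    """Keep the first cleaned record seen for each combination of key fields."""
    seen = set()
    unique = []
    for record in map(clean, records):
        key = tuple(record.get(field, "") for field in key_fields)
        if key not in seen:
            seen.add(key)
            unique.append(record)
    return unique


# Invented product records with inconsistent case and spacing.
RECORDS = [
    {"sku": "A1", "name": "Red  Duffel "},
    {"sku": "a1", "name": "red duffel"},
    {"sku": "B2", "name": "Blue Tote"},
]

print(deduplicate(RECORDS, key_fields=["sku"]))
```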

Lee Feigenbaum, vice president of technology at Cambridge Semantics, advises IT and business people to work together to determine a project where semantic technology would yield “differential value.” Will it speed up the development cycle? Enable end users to infer new data? Improve experiences for customers or partners?

Take things slow, at least at first, Simon advises. The project will reach critical mass as people get used to it and start to realize the benefits, he says.

Best Buy is doing just that. Its semantic Web deployment, which is about a year old now, is very much a work in progress, Myers says — as is the semantic Web itself.

“There are lots of semantic tools and open-source projects out there, plus SPARQL is a really powerful query language,” Myers says. “This gives me hope that semantic technology is at least one of the answers to the problem of Big Data. We have this large mass of data under our noses that we’re not utilizing. If we can find a way to gain insight from it and pass that on to customers and business partners, that’s a big competitive advantage.”

Don’t forget to check back in a couple of weeks for Part 2 of this story, which will cover products and frameworks you can use for semantic technology, as well as the relevant W3C standards.

Horwitt, a freelance reporter and former Computerworld senior editor, is based in Waban, Mass. Contact her at ehorwitt@verizon.net.
