Can science help in the search for lost Web pages?

The most memorable scene in In Search of Lost Time, Marcel Proust’s sprawling, seven-volume novel of life among the French aristocracy, occurs early on, when the narrator bites into a spongy piece of cake called a Madeleine. Instantly his mind goes back to an earlier, more innocent time when he ate such cakes as a child, which in turn helps launch the story of his rise through the social order. Although it’s hard to imagine Proust surfing the Internet, he would no doubt have been quick to spot the one thing missing that would make the online experience complete. The Web, unfortunately, is bereft of Madeleines.

Oh, there are bookmarks, and a history of recently visited URLs in most browsers, but for some reason they’re not good enough to help the majority of enterprise IT users relocate important information. That’s why researchers at Virginia Polytechnic Institute and State University are studying the way people search for data they’ve seen before, in the hope of developing tools that would let users return to Web pages more quickly and easily. These tools would perform a function similar to mnemonics, which give the mind clues for recalling easily forgotten details.

To some IT managers, this branch of cognitive science may seem a bit ominous, but it can’t be avoided when you consider that so many projects are related to information retrieval of some kind. Unfortunately, the Virginia researchers’ conclusions so far seem remarkably obvious — that Web surfers tend to find an initial page that led them to the desired information and then sift through their memory based on context, and that annotation makes things easier.

This reflects my own methods. If I bookmark anything, it’s only to spare me the bother of typing. For pages I might visit only a handful of times, I tend to jot things down on notes left on my desk, related to a particular project. Later on, I may not remember the URL, but I will remember the project, and through a series of keywords on a search engine I can usually find what I need.

To some extent, relocation is also assisted through the use of various content aggregation pages, which are becoming the phone books of the Internet. The real issue that makes research in this area critical, however, is the proliferation of action-specific data — the kind that won’t be linked to an aggregator’s page — and the evolution of computers from desk-bound to mobile.

For example, enterprises have spent the last several years moving away from paper-based processes to Web-based ones. That includes not merely transaction-oriented portals accessible through the public Web, but business-to-business pages created through Web services standards or pages available through complex enterprise intranets. At the same time, we are shifting from a desktop world to one in which, in many cases, substantial portions of the workforce access the Internet through laptops, personal digital assistants or mobile phones. Annotation may be less effective when you are nowhere near your desk. Being on the road also takes us out of a familiar context, distracting us with new visual details that make certain forms of recollection more difficult.

Software may yet address these problems, and search engine firms are creating ever-more sophisticated algorithms to complement our mental processes. The key will be paying more attention to user behaviour and isolating the barriers to rediscovery. Without that, these Madeleines will be forever half-baked.
