Behind the ‘beer and diapers’ data mining legend

Are beer and diapers on your shopping list this week?

Probably not, because, after all, this is Canada and grocery stores aren’t licensed to sell liquor. But a popular 10-year-old legend that beer and diaper sales spike between the hours of 5 p.m. and 7 p.m. has led to the multi-million-dollar industry of data mining and warehousing.

Conventional wisdom suggested that fathers would stop by a store on the way home to pick up supplies for their children and libations for themselves. The beer/diaper discovery was made by Thomas Blischok, then vice-president of industry consulting for NCR Corp. NCR has since spun off its data mining business under the name Teradata and Blischok is now CEO of his own data firm, MindMeld Inc.

Blischok marked the 10-year anniversary of his epiphany on Tuesday by recounting the tale and pointing to some potential future trends for the industry.

The original study was carried out by NCR for American retail chain Osco Drugs. Osco did not in fact move beer and diapers to the same aisle, but the NCR data did result in some fundamental changes to the ways in which they sell to customers. By examining 1.2 million market baskets in 25 stores, NCR identified 30 different shopping experiences, such as a correlation between fruit juice and cough medication sales.
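For readers curious about what examining market baskets can look like in practice, here is a minimal sketch in Python. It is not NCR's method, which the article does not describe. The baskets are made up for illustration and the function name `pair_stats` is hypothetical. It counts how often two items appear in the same basket and reports support (the share of baskets containing both), confidence (how often the second item appears when the first does) and lift (how much more often the pair occurs than chance would predict).

```python
from collections import Counter
from itertools import combinations


def pair_stats(baskets):
    """Return support, confidence and lift for every item pair bought together."""
    n = len(baskets)
    item_counts = Counter()
    pair_counts = Counter()
    for basket in baskets:
        items = sorted(set(basket))  # de-duplicate and fix the pair ordering
        item_counts.update(items)
        pair_counts.update(combinations(items, 2))

    stats = {}
    for (a, b), together in pair_counts.items():
        support = together / n
        stats[(a, b)] = {
            "support": support,
            "confidence_a_to_b": together / item_counts[a],
            # lift > 1 means the pair co-occurs more often than independence predicts
            "lift": support / ((item_counts[a] / n) * (item_counts[b] / n)),
        }
    return stats


# Made-up baskets for illustration only; not data from the Osco study.
BASKETS = [
    ["beer", "diapers", "chips"],
    ["beer", "diapers"],
    ["juice", "cough medicine"],
    ["juice", "cough medicine", "diapers"],
    ["beer", "chips"],
    ["juice", "bread"],
]

if __name__ == "__main__":
    ranked = sorted(pair_stats(BASKETS).items(), key=lambda kv: -kv[1]["lift"])
    for pair, s in ranked:
        print(pair, round(s["support"], 2), round(s["lift"], 2))
```

A high lift only says two items tend to show up together. It does not say why, and it does not say that moving them to the same aisle would raise sales.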

Osco removed approximately 5,000 slow-moving SKUs from its inventory, but because it re-arranged the remaining merchandise, consumers actually thought that Osco’s selection had increased. “What they began to achieve was putting the right merchandise in the right quantities at the right time,” said Blischok.

The industry is now on the cusp of what Blischok called the second generation of data mining. “We will see the basic re-invention of the interaction between the corporation and consumer,” he said.

The Royal Bank of Canada is an example of this re-invention, according to Lore McGuire, RBC’s director of enterprise CRM solutions. “Our objective is, rather than selling products to customers, is really to service the needs of our customers and help them achieve their financial goals,” she said. “It’s a subtle difference, but it’s a huge mindshift change in how we target our customers.”

But there remains the temptation to view data mining as some sort of magical process that can instantly produce an iron-clad formula for improving sales, said Mike Rote, director of Teradata’s Data Mining Lab in San Diego.

“Hundreds of years ago, an English researcher discovered an increase in birth rate seemed to coincide with an increase in stork activity. That’s a great example of how things occurred together, but perhaps there wasn’t necessarily a causal relationship of any sort,” said Rote.

In the last decade, data mining tools have reached a level of sophistication where it may be possible to determine a causal relationship between the sales of different products. But, as McGuire pointed out, there’s still no substitute for actually talking to customers. “Asking the client the question directly is way better than building any predictive model to try to anticipate what their answer might be,” she said.

Now data mining applications have extended beyond trying to predict customer patterns into more scientific endeavours. A lab in San Diego, for example, is using the software in its bioinformatic research.

“(They are) doing some phenomenal work in the area of gene expression analysis. We have tissues from diseased mice versus healthy mice and we’re trying to ascertain what is the difference in terms of the genes expressed. We use data mining technology to gain insight into just this problem,” said Rote.

One indication that the data mining market is starting to reach maturity is that Microsoft is taking more of an interest in it, said Andrew Braunberg, data warehouse and business intelligence analyst for Current Analysis Inc. The software giant typically doesn’t pursue a market worth less than US$1 billion, he said, but Microsoft has included some data mining tools in the latest version of its SQL database.

“The problem was that for a long time the tail was wagging the dog in this industry. . . . There’s some cool technology out there, but I think it was cool technology chasing business problems. . . . An end user needs to have a clear business problem at hand,” said Braunberg.

The next generation of data mining tools is embedding algorithms directly into databases so models can be built directly against the data, said Braunberg. “We think this is an important trend because we think it’s going to expose data mining to a much larger potential audience and make it easier to use for mainstream IT personnel, particularly DBAs (database administrators).”

IDC estimates the data mining market was worth US$450 million in 2001, but could grow to US$822 million by 2006.

