By watching for certain keywords, law enforcement agencies can already identify e-mails that might contain clues to criminal activity and corporations can flag employee messages that could cause legal problems.
Keywords have limitations, though – people trying to avoid detection may steer clear of language likely to attract attention. So a Queen’s University researcher is exploring ways to spot suspicious e-mails even when writers try not to give themselves away.
Dr. David Skillicorn’s work is based on the idea that when people are trying to hide something, they write differently than people who have nothing to hide. That’s more true of e-mail than of more formal documents, he adds, because few of us go back and edit our e-mails.
One difference might be the complete absence of words that a writer fears would draw a law enforcement agency’s attention, but that most people would occasionally use innocently (as in “my presentation yesterday really bombed”). Another, Skillicorn says, is that research shows people speak and write differently when they feel guilt about a subject, for instance using fewer first-person pronouns such as “I” and “we.”
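As a rough illustration of the pronoun signal, the share of first-person pronouns in a message can be measured in a few lines of Python. This is only a sketch, not Skillicorn's software: the function name `first_person_rate` and the pronoun list beyond "I" and "we" are assumptions made for the example.

```python
import re

# Assumed pronoun list; the article itself names only "I" and "we".
FIRST_PERSON = {"i", "me", "my", "mine", "myself",
                "we", "us", "our", "ours", "ourselves"}

def first_person_rate(text):
    """Return the fraction of words in a message that are first-person pronouns."""
    # Split on anything that is not a letter, so "I'm" yields "i" and "m".
    tokens = re.findall(r"[a-z]+", text.lower())
    if not tokens:
        return 0.0  # an empty message has no pronouns to count
    return sum(token in FIRST_PERSON for token in tokens) / len(tokens)
```

A low rate on its own proves nothing. It would only be meaningful when compared with the rates found in a large body of ordinary mail.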
“If you’re up to no good,” he says, “it’s very hard for you to write something that looks ordinary.”
Skillicorn doesn’t know all the ways suspicious e-mails might read differently from innocent ones. The beauty of his approach is that he doesn’t need to know. His software is designed simply to look for messages that are different, based on word frequencies, from the mass of e-mails. It needn’t understand the reasons for the differences.
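The general idea of flagging messages whose word frequencies differ from the mass of e-mail can be sketched in Python. The article does not describe Skillicorn's actual algorithms, so everything here is an assumption chosen for simplicity: the function names, the `vocab_size` parameter, and the use of plain Euclidean distance from the average word profile.

```python
import math
import re
from collections import Counter

def tokenize(text):
    """Lowercase a message and split it into word tokens."""
    return re.findall(r"[a-z]+", text.lower())

def frequency_vector(tokens, vocabulary):
    """Relative frequency of each vocabulary word within one message."""
    counts = Counter(tokens)
    total = len(tokens) or 1  # avoid dividing by zero on empty messages
    return [counts[word] / total for word in vocabulary]

def unusualness_scores(messages, vocab_size=50):
    """Score each message by its distance from the corpus-average word profile.

    Hypothetical scoring rule: a higher score means the message's word
    frequencies are further from those of the collection as a whole.
    """
    tokenized = [tokenize(m) for m in messages]
    corpus_counts = Counter(t for tokens in tokenized for t in tokens)
    # Track only the most common words across the whole collection.
    vocabulary = [w for w, _ in corpus_counts.most_common(vocab_size)]
    vectors = [frequency_vector(tokens, vocabulary) for tokens in tokenized]
    # Average profile: the mean frequency of each vocabulary word.
    mean = [sum(column) / len(vectors) for column in zip(*vectors)]
    return [math.dist(vector, mean) for vector in vectors]
```

Note that the scoring never asks why a message is different. It only ranks messages by how far they sit from the norm, which is the property the approach relies on.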
A related trick, he says, is to examine patterns in who e-mails whom. As an example, in criminal networks it is common to find several people communicating regularly with the same person, but never with each other. This is meant to ensure that if one lawbreaker is caught, he or she is unlikely to lead authorities to too many others. But it can also be a clue to suspicious activity.
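The pattern described above, several people writing to one person but never to each other, can be searched for with a small graph check. The Python sketch below is an assumption-laden illustration, not the researcher's method: the function name `find_hubs`, the `min_contacts` threshold, and the input format of (sender, recipient) pairs are all invented for the example, and it ignores how regularly people write.

```python
from collections import defaultdict
from itertools import combinations

def find_hubs(emails, min_contacts=3):
    """Return people whose contacts all write to them but never to each other.

    `emails` is an iterable of (sender, recipient) pairs. Direction is
    ignored: any message between two people counts as a link.
    """
    contacts = defaultdict(set)
    for sender, recipient in emails:
        if sender != recipient:
            contacts[sender].add(recipient)
            contacts[recipient].add(sender)

    hubs = []
    for person, neighbours in contacts.items():
        if len(neighbours) < min_contacts:
            continue  # too few contacts to call this a hub
        # A hub's contacts must have no links among themselves.
        if all(b not in contacts[a] for a, b in combinations(neighbours, 2)):
            hubs.append(person)
    return sorted(hubs)
```

As the article cautions for the approach generally, a match here is only a prompt for further screening. Plenty of innocent arrangements, such as a help desk answering unrelated customers, have the same shape.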
To test his approach, Skillicorn has been working with archives of e-mail from Enron Corp., the company at the heart of one of the most prominent scandals in recent U.S. business history. In some respects, he notes, the Enron e-mails are not a good sample for analysis, because Enron employees seemed to have no compunction about what they were doing. “People should feel some guilt or at least some self-consciousness when they’re being deceptive,” he says. Even so, there were more indicators of deception in e-mail going out of Enron than in e-mail coming in.
With funding from Science and Engineering Research Canada (still known by the acronym for its old name, NSERC), Skillicorn has developed the algorithms to spot unusual e-mails. The next step is to scale the technology up to handle larger volumes of mail, he says.
The software will flag as unusual some e-mails that have nothing to do with criminal activity or deception, he admits, and further screening will be needed to identify messages that are actually cause for concern. “You have to think of this as the first stage of a process that has several stages.”
Commercialization is not high on Skillicorn’s list of priorities. Such technology has obvious applications in surveillance by law enforcement and security bodies, but Skillicorn suspects agencies like the U.S. National Security Agency have little need of his help. “I infer from things they say around me that some of this stuff they already do,” he says.
Darrell Evans, executive director of the B.C. Freedom of Information and Privacy Association in Vancouver, says it is widely believed, and probably true, that governments monitor large amounts of e-mail. “I just assume it happens, and I think you have to assume it happens,” he says.
Given that, Evans says research like Skillicorn’s doesn’t represent a significant new privacy threat. “It’s no more of a concern than looking for keywords. It depends who’s doing the looking and for what purpose . . . which I have, of course, concerns about.”
But Evans also says that the more sophisticated tools for monitoring and searching e-mail become, the easier it is for governments and others to locate specific personal information they may be seeking.
Companies might also use the technology to scan employee e-mail for potentially damaging or inappropriate messages.
Skillicorn says the technique might have other applications, such as scanning for Web pages that are intended to deceive people. He does not believe, however, it will help in fighting spam.