The year is 2012.
As soon as you walk into the airport, the machines are watching. Are you a tourist — or a terrorist posing as one?
As you answer a few questions at the security checkpoint, the systems begin sizing you up. An array of sensors — video, audio, laser, infrared — feeds a stream of real-time data about you to a computer that uses specially developed algorithms to spot suspicious people.
The system interprets your gestures and facial expressions, analyzes your voice and virtually probes your body to determine your temperature, heart rate, respiration rate and other physiological characteristics — all in an effort to determine whether you are trying to deceive.
Fail the test, and you’ll be pulled aside for more aggressive interrogation and searches.
That scenario may sound like science fiction, but the U.S. Department of Homeland Security (DHS) is deadly serious about making it a reality.
Interest in the use of what some researchers call behavioral profiling (the DHS prefers the term “assessing culturally neutral behaviors”) for deception detection intensified last July, when the department’s human factors division asked researchers to develop technologies to support Project Hostile Intent, an initiative to build systems that automatically identify and analyze behavioral and physiological cues associated with deception.
That project is part of a broader initiative called the Future Attribute Screening Technologies Mobile Module, which seeks to create self-contained, automated screening systems that are portable and relatively easy to implement.
The DHS has aggressive plans for the technology. The schedule calls for an initial demonstration for the Transportation Security Administration (TSA) early this year, followed by test deployments in 2010. By 2012, if all goes well, the agency hopes to begin deploying automated test systems at airports, border checkpoints and other points of entry.
If successful, the technology could also be used in private-sector areas such as building-access control and job-candidate screening. Critics, however, say that the system will take much longer to develop than the department is predicting — and that it might never work at all.
In the Details
“It’s a good idea fraught with difficulties,” says Bruce Schneier, chief technology officer at security consultancy BT Counterpane in Santa Clara, Calif.
Schneier says that focusing on suspicious people is a better idea than trying to detect suspicious objects. The metal-detecting magnetometers that airport screeners have relied on for more than 30 years are easily defeated, he says.
But he thinks the technology needed for Project Hostile Intent to succeed is still at least 15 years out. “We can’t even do facial recognition,” he says. “Don’t hold your breath.”
But Sharla Rausch, director of the DHS’s human factors division, says the agency is already seeing positive results. In a controlled lab setting, she says, accuracy rates are in the range of 78 to 81%.
The tests are still producing too many false positives, however. “In an operational setting, we need to be at a higher level than that,” Rausch says, and she’s confident that results will improve. At this point, though, it’s still unclear how well the systems will work in real-world settings.
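To see why false positives matter at this accuracy level, consider a rough base-rate sketch in Python. It assumes, purely for illustration, that a lab accuracy of about 80% applies equally to deceptive and honest travelers, and that one traveler in 10,000 is actually deceptive. Neither assumption comes from the DHS, and the function name is hypothetical.

```python
def screening_outcomes(travelers, deceiver_rate, accuracy):
    """Estimate who gets flagged by a screener described by one accuracy figure.

    Assumption (not from the DHS): `accuracy` is both the chance of flagging
    a deceptive traveler and the chance of clearing an honest one.
    """
    deceivers = travelers * deceiver_rate
    honest = travelers - deceivers
    true_positives = deceivers * accuracy
    false_positives = honest * (1 - accuracy)
    flagged = true_positives + false_positives
    # Share of flagged travelers who really are deceptive.
    share = true_positives / flagged if flagged else 0.0
    return true_positives, false_positives, share


if __name__ == "__main__":
    # Hypothetical inputs: 1 million travelers, 1 in 10,000 deceptive.
    tp, fp, share = screening_outcomes(1_000_000, 1 / 10_000, 0.80)
    print(f"correctly flagged: {tp:.0f}")
    print(f"wrongly flagged:   {fp:.0f}")
    print(f"share of flagged travelers who are deceptive: {share:.4%}")
```

Under those assumed numbers, roughly 200,000 honest travelers would be flagged for every 80 deceptive ones, which is why a lab accuracy rate alone says little about how a system would perform in operation.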
Measuring Hostile Intent
Current research focuses on three key areas. The first is recognition of gestures and so-called “microfacial expressions” (a poker player might call them “tells”) that flash across a person’s face in about one-third of a second. Some researchers say microexpressions can betray a person when he is trying to deceive.
The second area is analysis of variations in speech, such as pitch and loudness, for indicators of untruthfulness.
The third is measurement of physiological characteristics such as blood pressure, pulse, skin moisture and respiration that have been associated with polygraphs, or lie detectors.
By combining the results for all of these modalities, the DHS hopes to improve the overall predictive accuracy rate beyond what the polygraph — or any other means of testing an individual indicator — can deliver.
That’s not a very high bar. The validity of polygraphs has long been questioned by scientists, and despite decades of research and refinements, the results of lie-detector tests remain inadmissible in court.
While the U.S. Department of Defense’s Defense Academy for Credibility Assessment (DACA; formerly the Polygraph Institute) puts the median accuracy rate for properly administered polygraphs in the mid-80% range, others say that number is closer to 50% in the real world and that the results depend heavily on the skills of the examiner.
Schneier goes even further. He says lie detectors rely on “fake technology” that works only in the movies. They remain on the scene, he says, because people want them to work.
The presumption that combining the predictive results from the three areas being studied will increase predictive accuracy is also untested. “We can’t find any indicators that this stuff is being combined [in current research]. The feeling is that [the DHS is] doing some groundbreaking stuff here,” says Rausch.
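The intuition behind combining modalities can be sketched with a simple majority vote. The Python below is an illustration only, not the DHS's method: it assumes three detectors whose errors are statistically independent, which is exactly the kind of assumption that remains untested, and it reuses the 67% figure reported for speech analysis as a hypothetical accuracy for all three. The function name is invented for this example.

```python
from itertools import product


def majority_vote_accuracy(accuracies):
    """Probability that most detectors are right, assuming independent errors."""
    total = 0.0
    # Enumerate every combination of "detector was right / wrong".
    for outcome in product([True, False], repeat=len(accuracies)):
        prob = 1.0
        for correct, acc in zip(outcome, accuracies):
            prob *= acc if correct else 1 - acc
        # Count the combination only if a strict majority was right.
        if sum(outcome) > len(accuracies) / 2:
            total += prob
    return total


if __name__ == "__main__":
    # Hypothetical: three independent detectors, each 67% accurate.
    print(round(majority_vote_accuracy([0.67, 0.67, 0.67]), 3))  # 0.745
```

Under the independence assumption, three 67% detectors yield about 74.5% when combined by majority vote. If the detectors tend to fail on the same people, for example because they all react to nervousness, the gain shrinks and can disappear.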
Many researchers are already tackling different pieces of the Hostile Intent puzzle. Julia Hirschberg, a computer science professor at Columbia University, is investigating how deception can be detected by picking up on speech characteristics that vary when someone is lying.
The research, funded by a DHS grant, has identified 250 “acoustic, intonational and lexical features” that may indicate when a subject is lying.
So far, the best accuracy rate is 67%. She admits that’s “not great,” but it’s better than human observation alone, she claims.
The results may not apply to real-world situations, however. Her work is based on lab experiments in which the subject presses a pedal when he is lying, and machine-learning systems process the results. “It’s not ideal,” she acknowledges.
Moreover, the accuracy rate in predicting deception varies with cultural background as well as personality type. Hirschberg says she has identified four or five personality types that could affect how the results should be interpreted.
Adjusting for personality type might improve accuracy in cases where the type can be identified, but it’s doubtful that interviewers in an airport or border setting will have the insight necessary to do so.
Dimitris Metaxas, a professor of computer science and biomedical engineering at Rutgers University, has received funding from both the DHS and the DACA to use technology to track and interpret the meaning of microexpressions and gestures. “I’m trying to find the expressions and body movements that are not normal and could be linked to deception,” he says.
Metaxas says his research focuses on movements of the eyebrows and mouth as well as various head and shoulder gestures, but he wouldn’t be more specific. That’s because the exact indicators that he is interested in remain secret.
Although the DHS’s Rausch believes that microexpressions are involuntary, she doesn’t want people to know exactly what expressions the agency will be measuring, just in case.
“Every system can be broken,” Metaxas points out.
Objections and Obstacles
Skeptics say that no tech-based system will work.
The Ekman Group has trained TSA staffers on techniques to help them recognize and interpret microexpressions. The consultancy was founded by Paul Ekman, a pioneer in research linking microexpressions to deception. At the TSA, trained officers use the techniques as part of the organization’s Screening Passengers Through Observation Techniques program.
John Yuille, the Ekman Group’s director, doesn’t think the technique can be automated. The discipline is a “social science,” he says, and microexpressions merely represent “clues to truthfulness” that require human interpretation. “Our methodology is not amenable to technological intervention,” Yuille says.
Metaxas says that what’s holding him back at this point isn’t technology. “The basic technology to track the face, I’ve solved that problem,” he says, claiming an accuracy rate of 70 to 80% with cameras positioned at distances up to nine feet from the subject.
The challenge is optimizing the algorithms that relate those expressions to deception. To do that, he needs more data from psychologists. The theories linking microexpressions to deception are largely based on academic research. Although they have been tested in lab settings, they have not been scientifically proved in large-scale, real-world studies.
Rules must also be applied in the correct context. For example, a measurement of something like a microexpression must be associated with what was being said at the time, and the meaning of what was said must be correctly interpreted, says Hirschberg.
The system must also be able to determine whether there is a mismatch between a given expression or gesture and what was said.
“That is very difficult [for a computer] to do,” she says, so in the lab, the matching work has been done manually.
In an effort to refine the algorithms, Metaxas has collaborated with Judee Burgoon, a professor of communication, family studies and human development at the University of Arizona.
She says the lack of rigorous research validating the use of microexpressions as indicators of deception “gives everyone pause.” It’s not known whether microexpressions correspond with underlying emotions or whether those emotional states correspond to deception, she says.
Although it is believed that microexpressions are involuntary, it’s unclear whether subjects can “game the system,” as they have done with polygraphs. And many researchers in the field believe that indicators of deception are culturally dependent.
That means analysis that doesn’t take cultural background into account could amount to ethnic, rather than behavioral, profiling. That’s ironic, since using machines to analyze the data is supposed to help eliminate biases associated with human decision-making.
In fact, the development of “culturally neutral” indicators is a stated goal of Project Hostile Intent. Rausch believes that researchers can identify microexpressions and other indicators that are universal or “cross-cultural.” That won’t happen in time for the initial test systems.
But by 2011, says Rausch, the DHS should have test systems that use only culturally neutral indicators.
For Metaxas, the challenge now is to prove that the fundamental assumptions linking microexpressions to deception are correct. “What I hope I can do is validate and verify the psychology,” he says.
To do that he needs to conduct further tests involving interviews in real-world situations. But that won’t be easy. Privacy and security concerns have prevented Metaxas and other researchers from monitoring interrogations or conducting interviews in real-world settings such as airports or immigration points.
Even the DHS faces obstacles in testing the technology in the field, Rausch acknowledges. And in real-world testing, says Hirschberg, there’s another problem: “You don’t really know when the person’s lying.”
With an aggressive timeline for deployment, Rausch is well aware of the challenges, and she cautions that the technology is far from complete. “We’re very much in a basic research stage,” she says.
Beyond Hostile Intent
Project Hostile Intent is just one of the programs that the DHS’s human factors division is pursuing. Another is violent-intent modeling. By applying social behavior theory to terrorism, the division is hoping to assist analysts who must manually sift through thousands of publications, news feeds and other data.
Researchers are developing indicators for potential violent behavior, which are used in computerized architectural frameworks that help analysts extract relevant data as they review documents.
“Computers help in running the models. As you put the data together, you get likelihood coefficients for violent behavior. Our goal is to get that automated for the analysts,” says Rausch.
The “information-extraction tools” will assist analysts by identifying important information as they’re reading it, but they won’t replace analysts. “We’re doing it in a way that’s consistent with the way analysts think,” Rausch says.
Another developing area is biometrics. Research is focused on developing mobile readers that can perform facial, fingerprint and iris recognition. “As we push out in years, we’ll get into remote biometric [sensors],” as well as more refined, “10-print” fingerprint recognition, says Rausch.
The systems will tap into “huge databases for identification and verification,” she says.
Other TSA Technologies
The TSA may eventually use the behavior profiling systems that come from Project Hostile Intent, but that’s just one part of the agency’s transportation security strategy. The layered approach includes “a technology factor, a human factor and shared intelligence,” a spokesperson says.
The TSA’s passenger screening technology hasn’t changed since the magnetometer, a metal detector, was introduced in 1973, but the agency is working on other technologies, including a so-called advanced technology X-ray.
This high-resolution X-ray system provides clearer images of the contents of carry-on baggage and offers multiple viewing angles. The machines are already widely used in Europe. The TSA has purchased 250 of them and plans to have a total of 500 installed by the end of 2008.
That’s a fraction of the 751 checkpoints and 2,000 lanes in service, but 500 machines is enough to cover 75% of the security lanes at the nation’s largest airports, which represent 45% of all travelers.
Another technology is the puffer machine. The subject walks into this phone booth-like device, and translucent bifold doors close around him. The machine then blasts the subject with a burst of compressed air and analyzes it for trace amounts of explosives.
The puffer is already in testing in some airports but hasn’t worked well. “They’re OK, but I think we’ll go more in the direction of whole-body imaging,” says a spokesperson.
In whole-body imaging, a machine bombards the subject with radio-frequency energy as he walks through and creates a very accurate image of his body (perhaps too accurate) in order to detect any foreign objects. “There’s a whole lot of privacy issues with this,” a spokesperson acknowledges.
The TSA is testing two technologies: one, called backscatter, uses a privacy algorithm that changes the image to a “chalk outline” of the body, while the other, called millimeter wave, creates what looks like a negative.
To address privacy concerns, facial images are blurred, and images aren’t saved. In addition, the screener who sees the passenger never sees the images.
The machines are already in use in Phoenix, where passengers can choose a pat-down instead, and will show up at Los Angeles International Airport and John F. Kennedy International Airport soon. “You’ll see more whole-body imaging [in 2008],” a spokesperson says.
Caveats and Ethical Issues
Even if Project Hostile Intent ultimately succeeds, it will not be a panacea for preventing terrorism, says Schneier. The risk can be reduced, but not eliminated, he says. “If we had perfect security in airports, terrorists would go bomb shopping malls,” he says. “You’ll never be secure by defending targets.”
Assuming that the system gets off the ground, Project Hostile Intent also faces challenges from privacy advocates.
Although the system would use remote sensors that are physically “noninvasive,” and there are no plans to store the information, the amount of personal data that would be gathered concerns privacy advocates — as does the possibility of false positives.
“We are not going to catch any terrorists, but a lot of innocent people, especially racial and ethnic minorities, are going to be trapped in a web of suspicion,” says Barry Steinhardt, director of the Technology and Liberty Project at the American Civil Liberties Union in Washington.
But Steinhardt isn’t really worried. He says Project Hostile Intent is just the latest in a long string of expensive and failed initiatives at the DHS and the TSA. “I’ve done hundreds of interviews about these [airline-passenger screening] schemes,” he says.
“They never work.” Steinhardt adds that “hundreds of billions” of dollars have been wasted on such initiatives since 9/11. “Show me it works before [we] debate the civil liberties consequences,” he says.