One of the most memorable scenes in Ridley Scott’s 1982 film Blade Runner involves the interrogation of a suspect who may not be human at all. Technology’s advance has rendered humanoid “replicants,” as they are known in the film’s icky parlance, indistinguishable from the real thing. Indeed, the latest models are so much like us that their true nature can only be exposed via a series of complicated and seemingly abstract questions, which eventually cause the robot to betray the fact that it’s made not of organs and arteries, but circuits and motors.
In the film’s depiction of this futuristic Turing test, the examiner, a blade runner named Holden, describes a vivid imaginary scene. In the story, Leon (the suspected replicant) is walking in a desert when he happens upon a tortoise. He reaches down and flips the creature onto its back. Holden describes the tortoise’s legs beating uselessly in the hot sun as it struggles to right itself. “But you’re not helping,” he notes.
“What do you mean I’m not helping?” says Leon, his agitation with the scenario clearly mounting. The examiner merely smiles. “They’re just questions, Leon.”
More than three decades later, at a university laboratory in Enschede, a city in the eastern Netherlands, a woman sits down at a computer screen to take a similar test, with the roles reversed. Whereas in Scott’s scenario a human aims to expose the lies of a machine, here it’s the computer that asks the questions.
Brad, as the bot is named, hopes to spot when its human interviewee is telling untruths. It’s part of a research study by Merijn Bruijnes and Sabine Strofer from the University of Twente that aims to find out whether bots like Brad might be used as police interviewers capable of catching criminals in a lie.
Unlike the sophisticated law enforcement robots we know from science fiction, the RoboCops and Terminators of our fictional landscape, Bruijnes and Strofer’s bot is closer to a rudimentary chatbot. Brad appears as a crudely animated avatar on screen: a white man in his early 40s, handsome in a dark gray button-down shirt, his fringe swept back. “What is your link to the university?” he begins.
The woman is hooked up to a tangle of wires, which monitor her skin conductance. If she sweats, Brad will know she’s lying–or so the theory goes.
Earlier that day, the woman had been asked to provide cover for a colleague who was off sick from work. One of her tasks involved working through his bulging inbox. There, among the nagging column of unanswered emails, she found a contract that the sender said needed to be signed urgently. Wanting to help her co-worker, the woman signed the contract by faking his signature. It was a benign act, but nevertheless a fraudulent one.
“Did you see this contract?” asks Brad coolly, pulling up the paperwork on screen. “No,” the woman replies.
Then, the gotcha: “Is that your signature?” asks Brad, pointing to her handwriting.
In Blade Runner the replicant cracks under the pressure of this kind of questioning. As the interrogation reaches its climax, Leon pulls out a concealed pistol and shoots Holden twice in the chest. There are no such theatrics in the Dutch research center. The only clue to the woman’s lie is a film of minuscule sweat droplets forming imperceptibly on her fingertips.
“The question we wanted to answer was simple,” explained Strofer. “Can physiological cues to deception be measured with a virtual interviewer?”
It’s a timely question. Governments and agencies are increasingly interested in quick, reliable, and minimally intrusive ways to detect deception, especially among large groups of people. Autonomous interview systems are an attractive proposition as first-level screening tools at airports and other crowded destinations, where the need to keep people moving quickly is at odds with a high security risk. A sufficiently advanced bot could potentially help to uncover a concealed weapon, or unmask malicious intent.
“The advantage is obvious,” said Strofer. “By replacing a real human with an autonomous virtual interviewer, many more people can be interviewed in a short period of time, which can be very useful for crowded and vulnerable places.”
Physiological lie detection is part art, part science. When lying, human beings give off various clues, some obvious (covering the mouth or glancing to the right when telling a lie), some subtle. When we lie, for example, our sympathetic nervous system activity increases. Our heart rates pick up. Our hands become clammy. Small sensors placed on the fingers (of the sort used during traditional polygraph tests) can detect these tiny fluctuations in temperature and moisture.
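To make the mechanism concrete, here is a minimal, purely illustrative sketch of the underlying idea: compare skin conductance while a subject answers a question against a baseline taken just beforehand, and flag any notable rise. The sampling rate, threshold, and simulated readings below are all assumptions made for the example; this is not the setup or analysis used in the Twente study.

```python
# Toy skin-conductance "response" detector (illustrative only).
# All numbers here are hypothetical, not drawn from the research described above.
import numpy as np

SAMPLE_RATE_HZ = 4             # assumed low-rate electrodermal sampling
BASELINE_SECONDS = 10          # window before the question is asked
RESPONSE_SECONDS = 5           # window while the subject answers
THRESHOLD_MICROSIEMENS = 0.05  # arbitrary rise that counts as a "response"

def shows_response(trace_us: np.ndarray, question_onset_s: float) -> bool:
    """Return True if conductance rises notably after the question onset."""
    onset = int(question_onset_s * SAMPLE_RATE_HZ)
    baseline = trace_us[max(0, onset - BASELINE_SECONDS * SAMPLE_RATE_HZ):onset]
    answer = trace_us[onset:onset + RESPONSE_SECONDS * SAMPLE_RATE_HZ]
    if baseline.size == 0 or answer.size == 0:
        return False
    return float(answer.mean() - baseline.mean()) > THRESHOLD_MICROSIEMENS

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Simulate 60 seconds of conductance (microsiemens) with a small bump at t=30s.
    t = np.arange(0, 60, 1 / SAMPLE_RATE_HZ)
    trace = 2.0 + 0.01 * rng.standard_normal(t.size)
    trace[t >= 30] += 0.08  # simulated sweat response after a probing question
    print(shows_response(trace, question_onset_s=30.0))  # True
```

Real polygraph scoring is far more involved than a single threshold, of course; the sketch only shows why a clammy answer stands out against a calm baseline.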
The theory is not without controversy. Detractors contest the idea that there’s a consistent pattern of physiological reactions unique to deception. They argue that an honest person may be nervous when answering truthfully; conversely, a dishonest person may be relaxed. Despite these reservations, law enforcement agencies around the world use polygraph tests as a trusted tool of interrogation. In face-to-face encounters with real humans, proponents maintain, the stress of telling a lie cannot be physically suppressed: the psychological processes inevitably evoke physiological cues.
If true, could a bot elicit a similar response in a liar?
During Bruijnes and Strofer’s test, 79 participants (mostly volunteers from the university) were told either to answer every question the bot posed truthfully or to lie in response to each one. Following the test, participants indicated whether they believed that Brad was human- or computer-controlled. The results were surprising.
Bruijnes and Strofer discovered that if subjects believed Brad was being remotely controlled by a human operator, they exhibited the telltale signs of deception that police might expect from an interview with a human officer. They gave themselves away. If, however, interviewees believed that Brad was an AI, they felt less stress about being caught. The physiological signals disappeared.
“It appears as though physiological deception detection only works when people assume that they are talking to a real human,” said Strofer. She attributes this to the interviewee’s lack of faith in the bot’s competencies. “Deceivers who assume an unconscious computer does not ‘understand’ the conversation on a social level appear to feel no urgency to look sincere and maintain a normal, unsuspicious conversation,” she said. In other words, people simply don’t believe bots are up to the job.
Despite our disbelief, science fiction is filled with cautionary tales about rogue robots being used in law enforcement. In 1987’s RoboCop, for example, the ED-209, a hulking, bipedal robot that resembles a cross between an anti-aircraft turret and a Tyrannosaurus rex, is programmed to shoot criminals who fail to drop their weapons after repeated warnings. During a demonstration to the executive board of its creator, Omni Consumer Products, the autonomous robot fails to recognize that its volunteer target has dropped his gun. After two more warnings, the robot kills the man in a hail of bullets.
Brad is a long way from the kind of fully autonomous police robot that might present a threat to a human’s life. As the writer Adam Elkus put it in Slate, if AI programs are the majestic lions and eagles of the artificial ecosystem, then bots are “the disgusting yet evolutionarily successful cockroaches and termites.” In Brad’s case, he wasn’t even properly autonomous. During the test, a human manipulated the avatar behind the scenes, selecting lines of dialogue for Brad to speak–for example: “That does not matter” or “Just answer my question, please”–in order to better convince the interviewee that the virtual avatar could understand them and respond appropriately.
Nevertheless, the use of such techniques in live police operations raises difficult questions, particularly about the potential that a more advanced version of Brad might have for harming its targets. In early 2016, Microsoft’s AI chatbot Tay became a racist monster in fewer than 48 hours, prompting the company to remove the software from the internet. Unless programmed to work with a highly limited number of interview questions, which could prove too restrictive to be useful, any autonomous piece of software has the potential to cause distress when interacting with humans, particularly if it is elevated to a position of institutional authority, such as that of the police.
Even if we developed more competent AI interrogators, people are, it turns out, terrible at distinguishing between AI and human operation. Tony Chemero, the author of Radical Embodied Cognitive Science, once carried out an experiment with Sony’s Aibo, in which the robot dog performed tricks for an audience. After the show, audience members were asked to guess whether the machine was acting through software or remote control. People were more likely to say it was software when in fact a skilled human operator was in control. “We should not underestimate the tendency of people to over-attribute capabilities to machines,” said Colin Allen, a professor of cognitive science and the philosophy of science at Indiana University. “We are easily fooled.”
For now, however, there appears to be little faith in the current capacity of true bots to read and interpret the subtleties of human communication. For all the grand advances in artificial intelligence, and all the talk of autonomous androids able to socialize and converse with humans, none of the test participants who believed that Brad was fully autonomous thought the avatar would be able to detect their lies. They were right. “Our ability to do this kind of thing in 2016 is very rudimentary,” said Gary Marcus, a professor of psychology at New York University. “It will be years before machines have enough understanding of human language and behavior to be truly effective in this kind of screening.”
Even when machines have the capacity to understand human language to the extent that they can be useful in police interrogations, there will be a need to simultaneously develop a robust ethical framework in which they operate. Should a police bot know when it is causing distress to an interviewee and back off? How does one program a police bot to act sensitively, while still pursuing an effective line of questioning? There are plenty of cases where even experienced police officers have misstepped in this regard.
“Humans live with a lot of ambiguity that we may not tolerate in machines, and we acquire an intuitive sense of what’s ethically acceptable by watching how others behave and react to situations,” said Allen. “In other words, we learn what is and isn’t acceptable, ethically speaking. Either machines will have to have similar learning capacities, or they will have to have very tightly constrained spheres of action, remaining bolted to the factory floor, so to speak.”
It seems unlikely that bots will replace detectives in interview rooms any time soon. “What I could imagine, however, is that virtual avatars possibly could be used as first-level screening tools to detect deception in places where it is crucial to keep an eye on large groups of people,” said Strofer.
Perhaps, but the detection systems involved will need to improve first. Even if you overlook the various controversies surrounding the reliability of the polygraph, rigging members of a crowd up to equipment that can measure pulse, respiration rate, and skin conductance is wholly impractical. Beyond these physical challenges, subjects would also have to be tricked into believing that, behind every bot, there is a human operator. “Looking human-like is not enough,” said Strofer. “Interviewees have to believe that a real human hides behind the virtual avatar. Before police departments would be able to employ this kind of approach in real scenarios in the future, follow-up research needs to be carried out.”
Bruijnes remains skeptical about the use of bots in police work, even in more mundane roles such as processing victims. “It is hard to predict the future,” he said, “but I can imagine cases where virtual characters can be used to interact with the public, for example by being the first contact of an organization like the police. However, I think it is unlikely that they would ever completely replace the human, not because such technology might not exist in the future, but because, when we find ourselves in trouble, we typically want to be helped by other people.”
How We Get To Next was a magazine that explored the future of science, technology, and culture from 2014 to 2019. This article is part of our Talking With Bots section, which asks: What does it mean now that our technology is smart enough to hold a conversation?