Charles Arthur on the scientific paper casting doubt on phone lie detection technology used by insurance companies
It may seem contrary - even churlish - to doubt a technology claimed to have prevented millions of pounds of fraudulent insurance and benefit claims around the world. Yet that's what Francisco Lacerda, a professor of linguistics at Stockholm University, and Anders Eriksson, professor of phonetics at Gothenburg University, have done in a scientific paper.
They say the system, used to try to detect people lying in phone calls made to 25 UK councils and a number of car insurers, is no more reliable than flipping a coin - and that millions of pounds have been spent on a technology that has not been validated scientifically, and for which the claims about its function are "at the astrology end of the validity spectrum".
The claims publicly made for the voice risk analysis (VRA) software being used by trained operators at some local councils since May 2007 sound impressive. "Phone lie detector led to 160 Birmingham benefit cheat investigations", said the Birmingham Post. The Department for Work and Pensions has already spent £1.5m installing 150 "seats" of the software - plus training from its UK reseller, Amersham-based DigiLog, for each group - in councils, as part of two sets of pilot tests of the VRA system.
Insurance claim
Highway Insurance, which has used DigiLog's product since 2002, claimed in 2007 that the system had "successfully prevented more than £11m in potentially fraudulent motor insurance claims" because "Highway has screened nearly 19,000 motor claims cases since 2002, with more than 15% repudiated or withdrawn."
That suggests the system works. But perhaps the wording is important: it says they were potentially, not demonstrably, fraudulent. Scientists say telling people they are being monitored by a "lie detector" (real or not) makes them more likely to be truthful. The example cited in Lacerda and Eriksson's paper is of prison inmates interviewed about their drug use, and then tested by urinalysis and hair samples - an objective method. With "lie detection", only 14% lied; without it, 40% did.
The software is from Nemesysco, an Israeli company, which licenses DigiLog to sell it in the UK. Sales to the government are handled jointly by DigiLog (which does the staff training) and Capita. Nemesysco claims it applies "layered voice analysis" (LVA): "LVA uses a patented technology to detect 'brain activity traces' using the voice as a medium. By utilising a wide-range spectrum analysis to detect minute involuntary changes in the speech waveform itself, LVA can detect anomalies in brain activity and classify them in terms of stress, excitement, deception and varying emotional states".
In the UK, the system is known as VRA. Callers to Harrow council to make a housing benefit claim are warned their call may be subjected to voice analysis. The DigiLog software monitors the line the operator is on: if it reckons patterns in the voice indicate some form of stress, the operator hears a beep. Thus alerted, the operator is trained to begin asking questions that may uncover the truth.
Harrow visits anyone who chooses not to take part in a VRA call; it says there has been only one complaint since its introduction, "indicating that customers do not feel intimidated by the process". It claims the technology has saved it about £110,000 in benefits payments, helped identify 126 incorrectly awarded single-person council tax discounts - worth £40,000 - and prompted reviews of 304 claims. Of these, 47 were no longer valid, saving another £70,000. Birmingham city council is equivocal: no prosecutions have followed VRA's use - and in some cases, the benefits paid have even been raised.
Yet nobody testing the system seems to have tried generating the beep in the operator's ear by the electronic equivalent of a coin flip. Measuring the difference in effectiveness between random beeps and the proper system (without telling the operator) would be a scientific "blind" test: that could show whether the system is worth its cost, or whether it is just the more assertive questions, allied to the "lie detector" warning, that make the difference.
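To see how such a blind comparison could be scored, consider a minimal simulation in Python. Every constant below is a hypothetical assumption for illustration, not a measured figure; in particular, setting the system's hit rate on fraudulent calls equal to the coin flip's beep rate models the "chance level" performance the researchers allege.

```python
import random

random.seed(1)

# Hypothetical rates -- assumptions for illustration, not measurements.
FRAUD_RATE = 0.10          # assumed share of claims that are fraudulent
BEEP_RATE_RANDOM = 0.50    # the coin-flip arm beeps half the time
HIT_RATE_REAL = 0.50       # assumed chance the real system beeps on a fraud
FALSE_ALARM_REAL = 0.45    # assumed chance it beeps on an honest caller
WITHDRAW_IF_PROBED = 0.60  # chance assertive questioning elicits withdrawal

def trial(real_system, n_calls=100_000):
    """Return the fraction of fraudulent claims withdrawn in one arm."""
    withdrawn = frauds = 0
    for _ in range(n_calls):
        fraud = random.random() < FRAUD_RATE
        frauds += fraud
        if real_system:
            beep_p = HIT_RATE_REAL if fraud else FALSE_ALARM_REAL
        else:
            beep_p = BEEP_RATE_RANDOM
        # A beep prompts assertive questioning, which may elicit withdrawal.
        if random.random() < beep_p and fraud and random.random() < WITHDRAW_IF_PROBED:
            withdrawn += 1
    return withdrawn / frauds

print(f"real VRA arm:  {trial(True):.1%} of fraudulent claims withdrawn")
print(f"coin-flip arm: {trial(False):.1%} of fraudulent claims withdrawn")
```

If the system's hit rate on fraudulent calls were genuinely above chance, the real arm would pull ahead; if the two arms come out level, the beep is doing nothing a coin flip would not, and any savings come from the questioning and the deterrent warning alone.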
In the absence of such scientific investigation, the next best step is to analyse the software. In a paper titled "Charlatanry in forensic speech science: a problem to be taken seriously", published in the International Journal of Speech, Language and the Law, Eriksson and Lacerda analysed the code in the 2003 patent for Nemesysco's software. They say it comprises about 500 lines in Microsoft's simple Visual Basic programming language. That code carries out the signal analysis, they say, and then offers the multiple levels of "certainty" to operators trying to decide whether someone is being truthful.
Call their bluff
"At best, this thing is giving you an indication of how [voice] pitch is changing," Lacerda told the Guardian. "But there's so much contamination by other [noise] factors that it's a rather crude measure." In the paper - which has been withdrawn from the website of its publisher, Equinox Publishing, after complaints from Nemesysco's founder that it contains personal attacks - the scientists say the scientific provability of the Nemesysco code is akin to astrology. The deterrent effect "is no proof of validity, just a demonstration that it is possible to take advantage of a bluff".
That chimes with one specialist, who spoke on condition of anonymity. "Nobody seems to have done any sensible research into this," he says. "[The clients have] all talked to salesmen rather than scientists. Study after study shows low validity, and chance level for reliability. But people won't listen. They don't try them in controlled trials; they make a public announcement they're using it, then feel happy they've got a 30% fall in claims. It's called the 'bogus pipeline effect'. People are frightened [of the threat]."
Stress at work
But Lior Koskas, the business development manager of DigiLog, says the VRA system cannot be separated from its user, because the system only picks up stress. He does not claim it spots "lies" on its own. "Only when the technology and an operator trained by us spots it, then can we say there's a risk someone is lying." Has there been a scientific "blind test" of the system? "No," Koskas says, "you can't say you're using something if you aren't."
He adds that the technology "hasn't been scientifically validated", but he rejects Lacerda and Eriksson's criticisms. "With any technology you will have opinions," he says. "But how many of these scientists have tested it properly? They talk about the technology in isolation, as though you don't need anything from the operator except turning it on or off. But the majority of the training course is about linguistic analysis training, learning to listen. Anybody using this [technology] in the UK doesn't use it in isolation."
What would Lacerda advise the government and companies considering spending money on the system to do? "Spend it on educating the people who are going to interview people, because that would be much more valid and ethically sensible."
Yossi Pinkas, Nemesysco's vice-president of sales and marketing, insists the system "can't be tested in a lab environment, because you're testing emotion". To him, Lacerda and Eriksson's analysis is flawed because "there's no scientific field of 'voice analysis', only voice recognition".