An International Survey of Face-Matching Training
Related papers
Passport Officers' Errors in Face Matching
Photo-ID is widely used in security settings, despite research showing that viewers find it very difficult to match unfamiliar faces. Here we test participants with specialist experience and training in the task: passport-issuing officers. First, we ask officers to compare photos to live ID-card bearers, and observe high error rates, including 14% false acceptance of 'fraudulent' photos. Second, we compare passport officers with a set of student participants, and find equally poor levels of accuracy in both groups. Finally, we observe that passport officers show no performance advantage over the general population on a standardised face-matching task. Across all tasks, we observe very large individual differences: while average performance of passport staff was poor, some officers performed very accurately, though this was not related to length of experience or training. We propose that improvements in security could be made by emphasising personnel selection.
A high priority for many governments around the world is the ability to identify people, and facial recognition systems are one of the leading technologies currently being employed in support of this. Contrary to popular belief, the current generation of facial recognition systems used for identification require a human (facial reviewer) in the decision-making loop. It is the facial reviewer’s role to adjudicate the output of the system, to compare a target image with a set of images returned during a search (one-to-many unfamiliar face matching) to make the final decision. Unfortunately, concerns about human one-to-many unfamiliar face matching performance have been consistently raised and these concerns have been justified by the results of empirical research highlighting marked individual differences in performance. The overarching aim of this research, therefore, was to identify and better understand the key variables that impact on the one-to-many unfamiliar face matching performance of facial recognition system users in the operational context. This thesis reports the results of five studies conducted to achieve this aim. Study 1, a series of briefings with Australian government agencies engaged in processing and investigative work, identified the key variables impacting on one-to-many unfamiliar face matching performance to be: candidate list size, the determinants of expertise in face matching, and decision aids. A comprehensive survey (Study 2) of facial reviewers was then conducted to better understand these variables and situate them within the operational context. After a review of the extant literature relevant to each variable to identify research gaps, three empirical studies were conducted. Each empirical study employed ecologically valid stimuli and experimental method, informed by the results of Study 1 and Study 2.
Study 3 investigated the impact of candidate list size on one-to-many unfamiliar face matching performance, across a range of operational contexts. The impact of expertise on one-to-many unfamiliar face matching performance was explored in Study 4, using a sample of facial reviewers from several Australian government agencies. Finally, Study 5 examined the impact of a range of facial recognition system decision aids (and combinations of them) on one-to-many unfamiliar face matching performance. The results of this research highlight the impact of candidate list size on one-to-many unfamiliar face matching performance; the complexities of defining what constitutes an expert in the field of facial review; and the impact that facial recognition system decision aids can have on the decision-making process. The thesis concludes with a series of recommendations regarding one-to-many unfamiliar face matching and the use of facial recognition systems in the operational context, as well as future research directions designed to contribute to the development of the field of facial review.
The Effect of Image Quality and Forensic Expertise in Facial Image Comparisons
Journal of Forensic Sciences, 2014
Images of perpetrators in surveillance video footage are often used as evidence in court. In this study, identification accuracy was compared for forensic experts and untrained persons in facial image comparisons as well as the impact of image quality. Participants viewed thirty image pairs and were asked to rate the level of support garnered from their observations for concluding whether or not the two images showed the same person. Forensic experts reached their conclusions with significantly fewer errors than did untrained participants. They were also better than novices at determining when two high-quality images depicted the same person. Notably, lower image quality led to more careful conclusions by experts, but not for untrained participants. In summary, the untrained participants had more false negatives and false positives than experts, which in the latter case could lead to a higher risk of an innocent person being convicted for an untrained witness.
The Glasgow Face Matching Test
Behavior Research Methods, 2010
We describe a new test for unfamiliar face matching, the Glasgow Face Matching Test (GFMT). Viewers are shown pairs of faces, photographed in full-face view but with different cameras, and are asked to make same/different judgments. The full version of the test comprises 168 face pairs, and we also describe a shortened version with 40 pairs. We provide normative data for these tests derived from large subject samples. We also describe associations between the GFMT and other tests of matching and memory. The new test correlates moderately with face memory but more strongly with object matching, a result that is consistent with previous research highlighting a link between object and face matching, specific to unfamiliar faces. The test is available free for scientific use.
The Oxford Face Matching Test: Short Form Alternative
Quarterly Journal of Experimental Psychology, 2023
A recently published test of face perception, the Oxford Face Matching Test, asks participants to make two judgements: whether two faces are of the same individual; and how perceptually similar the two faces are. In the present study, we sought to determine to what extent the test can be shortened by removing the perceptual similarity judgements, and whether this impacts test performance. In Experiment 1, participants completed two versions of the test, with and without similarity judgements, in separate sessions in counterbalanced order. The version without similarity judgements took approximately 40% less time to complete. Performance on the matching judgements did not differ across versions and the correlation in accuracy across the two versions was comparable to the originally reported test-retest reliability value. Experiment 2 validated the version without similarity judgements against other measures, demonstrating moderate relationships with other face matching, memory and self-report face perception measures. These data indicate that a test version without the similarity judgements can substantially reduce administration time without impacting on test performance.
A new atlas for the evaluation of facial features: advantages, limits, and applicability
International Journal of Legal Medicine, 2011
Methods for the verification of the identity of offenders in cases involving video-surveillance images in criminal investigation events are currently under scrutiny by several forensic experts around the globe. The anthroposcopic, or morphological, approach based on facial features is the most frequently used by international forensic experts. However, a specific set of applicable features has not yet been agreed on by the experts. Furthermore, population frequencies of such features have not been recorded, and only a few validation tests have been published. To combat and prevent crime in Europe, the European Commission funded an extensive research project dedicated to the optimization of methods for facial identification of persons in photographs. Within this research project, standardized photographs of 900 males between 20 and 31 years of age from Germany, Italy, and Lithuania were acquired. Based on these photographs, 43 facial features were described and evaluated in detail. These efforts led to the development of a new model of a morphologic atlas, called DMV atlas (“Düsseldorf Milan Vilnius,” from the participating cities). This study is the first attempt at verifying the feasibility of this atlas as a preliminary step to personal identification by exploring the intra- and interobserver error. The analysis yielded mismatch percentages from 19% to 39%, which reflect the subjectivity of the approach and suggest caution in verifying personal identity only from the classification of facial features. Nonetheless, the use of the atlas leads to a significant improvement of consistency in the evaluation.
Diverse types of expertise in facial recognition
Scientific Reports
Facial recognition errors can jeopardize national security, criminal justice, public safety and civil rights. Here, we compare the most accurate humans and facial recognition technology in a detailed lab-based evaluation and international proficiency test for forensic scientists involving 27 forensic departments from 14 countries. We find striking cognitive and perceptual diversity between naturally skilled super-recognizers, trained forensic examiners and deep neural networks, despite them achieving equivalent accuracy. Clear differences emerged in super-recognizers’ and forensic examiners’ perceptual processing, errors, and response patterns: super-recognizers were fast, biased to respond ‘same person’ and misidentified people with extreme confidence, whereas forensic examiners were slow, unbiased and strategically avoided misidentification errors. Further, these human experts and deep neural networks disagreed on the similarity of faces, pointing to differences in their represent...
Diverse routes to expertise in facial recognition
Facial recognition errors jeopardize national security, criminal justice, public safety and civil rights. Here, we compare the most accurate humans and facial recognition technology in a detailed lab-based evaluation and international proficiency test for forensic scientists involving 27 forensic departments from 14 countries. We find striking cognitive and perceptual diversity between naturally skilled super-recognizers, trained forensic examiners and deep neural networks, despite them achieving equivalent accuracy. Clear differences emerged in super-recognizers’ and forensic examiners’ perceptual processing, errors, and response patterns: super-recognizers were fast, biased to respond ‘same person’ and misidentified people with extreme confidence, whereas forensic examiners were slow, unbiased and strategically avoided misidentification errors. Further, these human experts and algorithms disagreed on the similarity of faces, pointing to differences in their face representations. O...