A well-executed exercise in snake oil evaluation
In the umpteenth chapter of UK governments battling encryption, Priti Patel in September 2021 launched the “Safety Tech Challenge”. It was to give five companies £85K each to develop “innovative technologies to keep children safe when using end-to-end encrypted messaging services”. Tasked with evaluating the outcomes was the REPHRAIN project, the consortium given £7M to address online harms. I had been part of the UKRI 2020 panel awarding this grant, and believed then, as I do now, that it concerns a politically laden and technically difficult task, handed to a group of eminently sensible scientists.1 While the call had strongly invited teams to promise the impossible in order to satisfy the political goals, this team (and some other consortia too) wisely declined to do so, and remained realistic.
The evaluation results have now come back, and the REPHRAIN team have done a very decent job given that they had to evaluate five different brands of snake oil with their hands tied behind their backs. In doing so, they have made a valuable contribution to the development of trustworthy AI in the important application area of online (child) safety technology.
The Safety Tech Challenge
The Safety Tech Challenge was always intellectually dishonest. The essence of end-to-end encryption (E2EE) is that nothing2 can be known about encrypted information by anyone other than the sender and receiver. Not whether the last bit is a 0, not whether the message is CSAM (child sexual abuse material).3 The final REPHRAIN report indeed states there is “no published research on computational tools that can prevent CSAM in E2EE”.
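As a toy illustration of what “nothing” means here, consider a one-time pad (a minimal Python sketch, purely to illustrate the principle; real E2EE protocols such as Signal’s are far more elaborate): the ciphertext is statistically independent of the plaintext, so the only thing an observer learns is the length.

```python
import secrets

def otp_encrypt(plaintext: bytes) -> tuple[bytes, bytes]:
    """One-time pad: XOR the message with a fresh random key of the same length."""
    key = secrets.token_bytes(len(plaintext))
    ciphertext = bytes(p ^ k for p, k in zip(plaintext, key))
    return ciphertext, key

def otp_decrypt(ciphertext: bytes, key: bytes) -> bytes:
    return bytes(c ^ k for c, k in zip(ciphertext, key))

# Without the key, every plaintext of the same length is equally likely for a given
# ciphertext: an observer learns nothing beyond the length (footnotes 2 and 3).
ct, key = otp_encrypt(b"entirely innocent holiday photo")
assert otp_decrypt(ct, key) == b"entirely innocent holiday photo"
```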
In terms of technologies, there is really no such thing as “in the context of E2EE” either: the messages are agnostic as to whether they are about to be encrypted (on the sender side) or have just been decrypted (on the receiving side), and nothing meaningful can be done4 in between; any technologies that can be developed are agnostic of when they get invoked.
Maybe a more honest formulation would have been to look for technologies that can keep users safe from specific kinds of abuse in services where the providers do not engage in complete surveillance of all service users’ activities. This would also remove another intellectual dishonesty in the challenge, namely the suggestion that any techniques developed would apply specifically and only to CSAM, rather than being (ab/re)usable for identifying and restricting other, potentially less obviously undesirable, content – reminders of this are a refrain in the REPHRAIN evaluations. It would also have eliminated a number of the projects before £85K of public money each was spent on full surveillance solutions.
The common solution to achieving detection and blocking objectives “in the context of” E2EE is to employ client-side scanning (CSS), either at the sending or the receiving end of E2EE. Doing so at a server would not be possible without breaking the E2EE property. CSS had been proposed by Apple for iPhones in August 2021 and was indefinitely paused after a few weeks of negative press, relating to unclear or unwelcome side effects of this approach. Depending on how its invocation is controlled and on what gets reported to whom and how (including through side channels such as AI models that get trained along the way), CSS may be viewed as breaking E2EE altogether.
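To make the placement concrete, here is a minimal sketch of hash-list-based CSS (in Python, with hypothetical names, and an ordinary cryptographic hash standing in for the perceptual hashes real systems use): the scan runs on the plaintext before encryption, so E2EE neither helps nor hinders it, and everything interesting – what gets reported, and to whom – happens outside the encrypted channel.

```python
import hashlib

# Hypothetical hash list of previously identified material. Real deployments use
# perceptual hashes (so that re-encoded or slightly altered copies still match);
# SHA-256 stands in here only to keep the sketch self-contained.
KNOWN_HASHES = {"<hex digest of a known item>"}  # placeholder values

def client_side_scan(plaintext: bytes) -> bool:
    """Runs on the user's device, on the plaintext, before any encryption happens."""
    return hashlib.sha256(plaintext).hexdigest() in KNOWN_HASHES

def send_message(plaintext: bytes, encrypt, report):
    # The scan sees the full plaintext; what is reported, and to whom, is a policy
    # choice made entirely outside the E2EE channel.
    if client_side_scan(plaintext):
        report(plaintext)
    return encrypt(plaintext)  # E2EE only protects the message from this point on
```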
Of the five projects awarded under the Safety Tech Challenge, four looked on initial description to resort to client-side scanning (GalaxKey, SafeToWatch, Cyacomb, and T3K). Only DragonflAI promised the provably impossible, to “allow nudity and age to be detected together within E2EE”. Cyacomb and DragonflAI each received an additional £129,500 from the challenge fund later on.
Establishing the evaluation criteria
The REPHRAIN team took a community approach to the evaluation of the projects. Draft evaluation criteria were published for consultation in March 2022 and the community was given two weeks to respond (I have included links to all 4 published responses I’m aware of). Revised criteria were published in July and also included in the final evaluation report.
The draft criteria contained the curious phrase “any risks relating to client-side scanning are beyond the scope of this evaluation”. My feedback suggested this potentially excluded all five projects, trivialising the exercise – that would have been some way to navigate the intellectual dishonesty, which was also (though not in those words) highlighted by the response from the US Center for Democracy and Technology. However, REPHRAIN chose to take some distance from definitions of E2EE in the final criteria, and their consideration of CSS-related risks ended up as one of the strong points of the overall report.
The responses, for example that by tech lawyer Neil Brown, also reflected a lack of consensus in the community on the relationships between the concepts of privacy, security, human rights, data protection, and legal compliance. The REPHRAIN team’s response was to introduce a separate top-level category (and an additional team member) for human rights, which includes privacy and data protection. My only quibble regarding this aspect is that the report does recognise the need for data protection impact assessments (DPIAs), but not the fact that, given its “risk to the rights and freedoms of individuals” definition (GDPR Art 35.1), a DPIA would necessarily cover (or at least fully inform) the entire human rights dimension of the evaluation.
Alongside the CSS aspect, my biggest worry was the team being asked to evaluate these projects on the basis of insufficient technical information. The papers typically produced by the tech industry, “white papers”, don’t live up to the peer-review standard of academic papers, and in particular tend to avoid substantiating the claims of success that are always included. The REPHRAIN team would be given reports and progress documents, but no access to code or testing. Hence this was a “hands tied behind their backs” evaluation.
In my opinion, despite the severe limitations on both the realism of the competition and the technical inputs provided for the evaluation, the REPHRAIN team have delivered an excellent evaluation. In essence, this arose from applying their expertise and awareness of foundational deficiencies, including of known attacks against CSS and AI models, to what little the projects revealed for evaluation. Overall, they state that they were prevented from evaluating the effectiveness, robustness, performance, and scalability of the tools – as was to be expected from the limitations put on the evaluation process. It is also fair to note that, these being proof-of-concept tools, not all such properties could be expected to be achieved at this stage.
A number of themes recur across most of the individual evaluations.
Most of the projects lack awareness that they are supposed to offer an acceptable compromise, set against the obvious extreme solution of total surveillance of all communication (images and, in some projects, also text), checked for undesirable material of whatever kind. The broader questions of proportionality and necessity are not addressed by any of the projects. Most also view the reusability of their technology (or its existing use) in contexts other than CSAM as a great sales pitch rather than a scope creep risk. Several also employ their surveillance methods without user awareness – either of the scanning taking place, or of any reports sent as a consequence of a (potentially false) positive test. Most also appeared unaware that there are legal, ethical, and technical (including security-related) issues with using any freshly evaluated inputs for further training of AI models.
While the evaluation team were not given a chance to establish false positive and false negative rates, or to verify any claims in that area, they could observe that the developers were generally unprepared to mitigate false positives: no consideration of scalability (in light of the base rate fallacy), and no auditability, transparency, or possibility of appeal against wrong decisions.
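To illustrate the base rate issue with made-up but not implausible numbers: even a scanner with a 0.1% false positive rate and a 99% detection rate, applied at messaging scale, produces flags that are overwhelmingly wrong.

```python
# Illustrative arithmetic only – all rates below are assumptions, not measured values.
daily_messages = 1_000_000_000     # messages scanned per day across a service
prevalence     = 1e-6              # fraction that actually contain targeted material
false_pos_rate = 0.001             # 0.1% of benign messages wrongly flagged
true_pos_rate  = 0.99              # 99% of targeted material correctly flagged

true_positives  = daily_messages * prevalence * true_pos_rate           # ~990
false_positives = daily_messages * (1 - prevalence) * false_pos_rate    # ~1,000,000

precision = true_positives / (true_positives + false_positives)
print(f"Share of flagged messages that are genuine: {precision:.2%}")   # ~0.10%
```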
None of the projects is in a position to claim it could reliably deal with new CSAM. CSAM is obviously highly sensitive material, with ethical and legal issues around obtaining it even for projects that aim to combat it and that claim high degrees of confidentiality.5 As a consequence, Cyacomb is based only on recognising the Internet Watch Foundation’s collection of previously identified CSAM, an approach that is already used by various large internet companies, and that would also have supported the Apple CSS approach. Other projects used dubious proxies for CSAM: adult material (SafeToWatch), or nudity detection in combination with age estimation from faces (DragonflAI and T3K), which would lead to high false positive rates as well as (in the absence of faces!) false negatives.
SafeToWatch is slightly different from the other products, in that it only looks at material as it is produced by an identified user, to see whether it might be unwise to share, and it then only advises; this avoids some of the surveillance-related issues, but it still shares some of the same AI-related problems.
Given that Yoti was involved in GalaxKey and DragonflAI, it should not be surprising to see that face-based age estimation is used in authentication. One of the funniest sentences in the report is “to mitigate errors in age estimation, users have to identify themselves with a government approved document like a passport or driving license” – i.e., the age estimation is a red herring.6
In addition to these recurring themes, the REPHRAIN team was also able to point out a number of additional security risks, for example relating to what communication is necessary.
Conclusion
Overall, I would say the REPHRAIN evaluation team have done an excellent job. Without access to code or experimental data, they highlighted major weaknesses along multiple dimensions in all of these projects. While they conclude generously that “the PoC tools reported here that provide a context to mitigate the creation of CSAM and prevent its publication prior to uploading are innovative”, the “innovation” to me seems of limited value when little has been achieved to overcome the technical and socio-technical problems associated with AI-based CSS, and the headline “E2EE” played next to no role in either the products or the evaluations. However, this report provides excellent value in exploring quality criteria for online child safety tech, and I heartily recommend a read of the full report.
Footnotes
[1] I have known some of these for years and have an active collaboration with one, started in 2021.
[2] To be more precise, no contents of a message; information about the length of the message may follow from the length of the encrypted text.
[3] To be more precise, nothing can be found out with knowledge of the encrypted message with a likelihood meaningfully higher than from guessing without having that encrypted message.
[4] This refers to a mathematical impossibility.
[5] At least, when there’s scrutiny of technical claims in the offing.
[6] I have argued before that age estimation should be illegal under data protection legislation because of lack of fairness: it systematically discriminates against people who don’t look their age. Yoti have emailed me in response, setting out in detail how they establish their bias and accuracy and keep those low and high, respectively – but not addressing the fundamental point that some people, however few and however disjoint from any specific known category, will be unfairly assessed.
This guest post was written by Professor Eerke Boiten, Cyber Technology Institute, De Montfort University. Photo by Alok Sharma.