Statistical disclosure control Research Papers
Disclosure control planning is characterised by over-reliance on theoretical models, inappropriate disclosure scenarios, worst-case planning, confusion over subjective versus objective risk management, and an unwillingness to consider the evidence base. This is most striking in the case of access to sensitive data for scientific purposes: most research on statistical disclosure control (SDC) has little or no value for this group, because confidentiality for scientific users is best managed by a range of procedural and technical options, of which statistical methods are both the least important and the least desirable. In the last ten years or so, this procedural perspective has become increasingly dominant amongst the designers and managers of data access systems for the social sciences. However, the research management community has been less successful in getting this message out to other stakeholders. This paper summarises the case for an evidence-based, holistic approach to data access management. In particular, it considers:
- the universality of the 'intruder' model, despite a substantial body of evidence that an 'idiot' model is more realistic, relevant, useful, and better aligned with legal requirements
- the focus on quantifiable measures of risk, when uncertainty is the true problem
- the legal, institutional and practical definition of 'identification'
- assessing genuine user and stakeholder needs
- the low importance of statistical factors in the design of data access systems
- engrained institutional attitudes to risk
The common themes are the use of evidence, the integration of statistical and non-statistical approaches, the effective use of limited resources, and the importance of grounding strategy in realistic expectations of risk and uncertainty. Although most relevant to the scientific research community, where the difference between worst-case and evidence-based planning is starkest, there are also general lessons for the dissemination of confidential data, for international data sharing, and even for the provision of uncontrolled (public) data.