On the matter of the British Library cyber incident


The introduction: For nearly three months, the British Library has been close to unusable because of what has invariably been called “a cyber incident”. Lots of people have asked me in recent months: “what on earth is going on with the BL and why isn’t it getting more attention?”

At the start of this week, the BL announced the partial restoration of its capabilities. So it seems a good time to take stock of one of the most impactful cyber incidents in British history.

The apology: This is the first post on this Substack for, well, a very long time. Apologies to those who supported me when it started. Many of you will know there were various circumstances which meant 2023 was a hard year for me to sustain it. I won’t make any promises about how often I will post, but I will try to reactivate it, based on an article at least once a month. Consider this January’s offering. And feedback on content and ideas for more is always welcome.

The caveat: This post is based on open-source information, my own judgments, and nothing else. I used to run the UK’s National Cyber Security Centre, but I stopped doing that in 2020 and left public service. I have not asked former colleagues about the case. Nor have I spoken to the British Library. No one should assume anything I say here reflects the position of the Government or any part of it.

The other apology: Pursuant to that caveat, I am acutely conscious that this is an article about an extremely hard-pressed organisation trying its best to serve its users under the most extraordinary pressure. Being at the centre of a cyber crisis is absolutely horrible. It normally also means something has gone wrong, somewhere. In commenting on some of those potential causes of the problems, I do not mean to criticise those working flat out to fix things. I apologise if any of this post inadvertently comes across that way. Indeed I’d want to thank BL staff for what appears to have been an extraordinary effort in a long slog to get to this important recovery point this week. I would encourage anyone else commenting on this or other cyber incidents to remember the human beings at the centre of the crisis. A paper from the Royal United Services Institute this week rightly identified psychological damage to staff as a consequence of these types of attacks. We should always remember this.

To the issue at hand, and first, some facts. In early January, Alex Scroxton at the indispensable Computer Weekly wrote a superb overview of the British Library cyber incident. In the interests of brevity, the following points are the most important:

Two key points flow from these events. The first is that it can safely be inferred that neither the BL nor anyone else paid the ransom (though no one has, to my knowledge, commented officially on this). If the ransom had been paid but the criminals had failed, for whatever reason, to restore access to the BL’s network we would know about that by now, one way or the other.

The second, and most important part of the whole story, is that for more than two and a half months this vital national resource has been essentially unusable. At the start of the crisis it seems that nothing at all worked: the basic staff computers, the phones, and even the public Wi-Fi for a bit. But the longer-term damage was caused by the total inaccessibility of the main BL catalogue, described by the BL’s own boss as “one of the most important datasets for researchers around the world” with its record of some 170 million items dating back centuries.

A particular problem is understood to be that most of the collection is stored in a giant facility belonging to the BL in West Yorkshire. Users are supposed to order from the catalogue and the item will be transported south in a few days. Without the catalogue, this became impossible. Whilst some workarounds could be done in the BL’s magnificent London headquarters, if the text you wanted was in Yorkshire, and it probably was, no one had any way of knowing where it was, or how to get it.

Although plenty of people have asked why this episode hasn’t received more national attention, it is clearly one of the worst cyber incidents in British history. So what are the lessons of it?

For me, there are three. None are new, but not all of them receive enough attention. And the last one needs to resonate thunderously throughout all organisations.

The British Library cyber crisis has nothing and everything to do with geopolitics. Nothing, in that the only motivation for it is money. Everything, in that the only reason it can happen with impunity is that the Rhysida group, like nearly all the major ransomware groups, is based in Russia.

It is well documented that the Russian state has no interest in shutting these groups down and putting the leaders in prison, providing they don’t harm Russian interests and cooperate with the state when required. It is against current Russian law for the state to extradite its own citizens (this must be the first time I’ve linked to Tass). So these people are almost certain never to appear in a British court, and very unlikely to face even a Russian one anytime soon.

But, like all comparable democracies, the British state is configured to treat this type of incident as an arrestable and prosecutable crime. “Cyber crime is just the same as other crime” is something I heard a lot from law enforcement colleagues in Government. But there is one crucial difference. For the first time in human history, it is possible to inflict sustained, large-scale criminal damage on another country without the perpetrator or a single accomplice setting foot in it.

We have consistently underestimated just how much cyber crime breaks our model of policing. In rule of law democracies, the contract between citizen and police is based in part on an assumption that when someone is a victim of crime, the police will pursue the perpetrator. And with cyber crime, there are some in the UK we can go after. And with the Russians, every so often some idiotic cyber criminal goes on holiday to a Western country, or contracts for a criminal service with someone in East London, and the police can do what police are supposed to do. But these are the exceptions.

What police forces are doing increasingly well - normally via multinational operations led by the FBI - is orchestrating takedowns of digital infrastructure used by the criminals. But these interventions, while welcome, are invariably whack-a-mole operations and the criminals reappear in another guise with new infrastructure.

Can anything be done? Things got so bad in 2021, with the attack on Colonial Pipeline in the US, alongside serious healthcare disruption in the US and Europe, that President Biden used his Geneva summit with Vladimir Putin in June of that year to demand Russia clamp down on the rampant ransomware crime emanating from its territory. For a brief period, this seemed to have some effect, with the somewhat theatrically broadcast arrest of the REvil gang, one of the most notorious groups.

But then came the invasion of Ukraine. A dictatorship willing to defy the White House over the invasion of a neighbour is unlikely to be swayed by American demands about criminals on its own territory. And a West that would support Ukraine but not take direct military action on its behalf is not going to take direct action against individuals protected within Russia’s vast borders. Both the Russian state and the criminals know that.

Therefore, the brief period when some of Russia’s ransomware thugs flew a bit too close to the sun and became a nuisance to the Kremlin is now over. All the evidence of 2023 suggests that the criminal safe haven has been fully restored. There will come a time in the future when Washington, London, Brussels and others can talk to Moscow about dealing with this scourge. But that time is not now, or soon.

It is of no benefit to pretend otherwise. Australia’s otherwise hugely impressive response to the disastrous theft by cyber criminals of more than a third of the population’s medical records - what I’ve called elsewhere (£) a masterclass in devaluing a stolen dataset to the criminal - provides a case in point. In a press conference in November 2022, the head of the Australian Federal Police claimed the identities of the hackers were known to the AFP and pledged to bring the perpetrators to justice in Canberra via cooperation with Russian law enforcement. Any reasonable Australian watching could have concluded the police thought they had a good chance of locking up the villains. But, as was widely predicted at the time, this has not happened, and there appears next to no chance that it ever will.

It is always hard for Governments and public authorities to admit they can’t do something, especially when the ‘thing’ is being able to catch and convict criminals who’ve laid waste to something that’s very important to lots of citizens. But the lesson from Australia, the British Library, and countless other ransomware crises is that normal policing doesn’t work in most of these cases because the suspects are safely holed up in Russia.

So we should stop pretending that conventional policing can do much about this, and look instead at other things we might be able to do. This article is long enough already without prescribing in detail what the approach should be: that is for another day. However, here are three starting points:

Many of the reforms in that report have merit and deserve consideration. But the Committee’s overarching point is that countering ransomware needs serious political leadership and attention.

That sort of strategic review of our approach to ransomware requires us to look hard at our own national vulnerabilities. Here the BL crisis provides some valuable lessons.

Harm happens in cyberspace because we have a three decades-long legacy of weak security in our software, hardware and wider digital infrastructure. Famously, the Internet was not built with security in mind - and we’re plagued with poor incentives for providers and users to do anything about it. That is slowly changing, but it is improving much more for newer technologies than for our existing tech stack.

As all IT security professionals know, legacy systems in old organisations pose the hardest problems. There are no really transformative options until new systems come along. There are only mitigations. These mitigations require a lot of high quality technical and human resources. So they are expensive. They also require a lot of skilled people, as well as management attention and sponsorship. But it’s hard to explain the benefits of these measures to hard-pressed management facing many other pressures. And security reforms are often unpopular with staff and users because they add complexity to everyday work.

So it’s easy to see why some organisations are incentivised to take cyber security and resilience seriously, and some aren’t. Any service where public safety is at risk will invest heavily in security, safety and resilience, and test it all the time. The system and the organisation will probably be inspected. A regulatory licence to operate might well depend on that evaluation. Put simply, no one should ever do something where their physical safety is dependent only on a computer staying connected, and most regulatory systems rightly don’t allow this.

So, for example, when part of the UK’s National Air Traffic Control system failed (accidentally) last August, there was no risk to the safety of the planes already in the air because of the way much-tested backups work. Because the air traffic control system is such an obvious part of critical national infrastructure it is highly likely that Government agencies will pay attention to and assist with the cyber protection and resilience of the service. Similarly, in the private sector, banks invest heavily in cyber security capabilities and people because they know the risks of large scale financial loss are existential. Moreover, they can afford to. And the Government and regulator will want to help too, to avoid systemic risk within the financial system.

But consider the British Library in this context. It is a very important national institution, for sure. But if you’re tasked with identifying the most important national IT networks for protection against attack, the British Library will not get anywhere near the top of the list for attention. As we have seen, no one gets hurt or dies if the BL goes down. The health service will still function. So will the banks. The lights will still be on. People’s bills will still be accurate. The data of vulnerable populations will not have leaked. And so on.

As a cultural institution, the BL is important and famous. It is also a public body. It is not, however, a political or budgetary priority. Constrained by public sector budgets and salaries, it will find it hard to source the people and capabilities it needs for cyber security (the British Treasury was widely mocked for advertising for a head of cyber security with an annual salary of between £51,000 and £57,000 when the industry standard is multiples of that figure). It is hard to imagine the BL being able to pay more, or finding it easy to recruit cyber security professionals.

This matters, because it is within hundreds of networks like the one the BL depended on that serious national risk lies.

The history of cyber security is pockmarked with warnings of mass casualty digital apocalypses threatening civilisation as we know it. It turns out that’s the wrong problem: as the brilliant work of Lennart Maschmeyer has shown, hacking into, say, a power grid and depriving civilians of supply even for a short time via cyber means is possible, but it is painfully slow and hugely resource intensive for the aggressor. Moreover, for cyber security and other security reasons these systems are better protected than ‘normal business’ networks, and have manual or other backups. That’s why cyber attacks don’t directly kill people.

It turns out, however, that our more immediate cyber security problem is that by crippling these so-called ‘normal business’ networks an aggressor can hugely harm a society without that much effort. We now know you can shut down a crucial oil pipeline in the United States not by attacking the pipeline, but by shutting down the ordinary software systems that support its administration. It turns out you can cripple the entire healthcare system of a rich EU nation not by touching hospital equipment or systems but by locking out the network of the body that allocates doctors’ appointments and schedules surgeries. And it turns out that you can bring part of the British academic sector to a crashing halt by taking a massive library catalogue offline.

So what else can an aggressor do to networks that don’t look to be of ‘strategic’ importance? That is the question we should be asking ourselves in the light of the BL fiasco. We should then be moving resources, expertise and monitoring accordingly as best we can. We also need to think about how we better incentivise the leaders of these organisations to improve basic security and resilience, because ransomware attacks are not, in general, sophisticated.

This is an election year in the UK, and after the votes are counted we can expect someone to try to form a stable administration with a five-year horizon. Much is made of short-termism in politics, but we have to work with the world as it is, not as we’d like it to be. In that spirit, here are two planning assumptions on national cyber risk for the next five-year Parliament:

  1. a devastating, highly sophisticated, threat-to-life cyber attack against the UK in the next five years is unlikely, and if it happens, its impact will be mitigated so long as we continue to ensure that safety-critical systems are not wholly dependent on computer networks;
  2. by way of contrast, serious economic and social disruption, including an incident that could threaten public order or safety arising from a cyber operation (the disruption of healthcare administration, the criminal justice system, or food or oil distribution being some examples) is very likely. Indeed, an incident of the severity of the BL attack is likely in each of the next five years.

This lesson of national vulnerability from the BL case, and these assumptions, would make a good starting point for the sort of serious discussion about ransomware that is urgently needed.

And the one thing above all else that would make a difference to the problem is finding a way of forcing organisations to be able to recover more quickly than the British Library did.

To understand why, we have to divert back briefly to ransoms. As noted earlier, the British state doesn’t pay ransoms, and most other Governments don’t. Throughout this crisis the Government did not come under any serious pressure to pay (unlike, for example, the Irish Government during the healthcare cyber crisis of 2021 because of the huge impact on health services).

The private sector is another matter. Because there is no requirement in most jurisdictions, including the UK, to report when a ransom has been paid, there are no reliable figures for how many organisations pay (the cyber security company Coveware has made as decent a fist as any of tracking trends over time, and the latest figures show a significant decrease to fewer than half of organisations in 2022, down from seven out of every eight a few years earlier).

The blunt reason why the private sector often pays, but Governments hardly ever do, is that Governments can throw far more resources and support at recovery. That was certainly the case in Ireland, where the military and a number of major cyber security companies were deployed with no expense spared. Private companies cannot afford to surge in capabilities like this, and they can’t call in the Army. And unlike the state, they can go bankrupt.

So for private organisations, paying can be more effective than not paying. In this case, the cost to the BL was far more than the ransom. This is not always the case: a BBC File on Four documentary in 2021 tracked the impressive response of the Harris Federation of London, a major schools provider. They held their nerve and the overall cost to them was less than the ransom demanded. And paying does not mean avoidance of harm: Colonial Pipeline paid the ransom, but the pipeline was still out for several days.

But Governments do not want to pay ransoms and, certainly in Britain, it is unlikely that taxpayers want them to. The crucial point is that not paying the ransom only works if the organisation can recover quickly.

The heroic efforts of Irish healthcare workers, and IT professionals from the civil service, the military and the private sector, got the system back up and running to some sort of basically acceptable level in about the same amount of time as it takes a victim who paid to recover. Similarly, the Joint Committee on National Security Strategy heard from the leader of Redcar and Cleveland about how staff from the National Cyber Security Centre slept in the council’s offices during a ransomware crisis to ensure that the system dealing with the cases of at-risk children was recovered quickly.

Moreover, we need to ask ourselves: what if there is no ransom? What if a hostile hacker working for a nation state does exactly the same thing as a ransomware attacker, but the objective is to damage the UK by destroying the network, rather than to extort money by temporarily locking it?

In such a scenario, recovery is the only option. Ransomware highlights our digital vulnerabilities to others who have motives even worse and more strategically damaging than the criminals. And if there is no effective system backup that can easily be deployed, or no way of restoring the old system in some way - in other words, if there’s no way of recovering quickly - then we’re stuffed. Recovery capability is paramount for national security.

Here, the obvious point to make about the British Library is that it has taken - and is taking - an inordinately long time for its catalogue, one of its most important services, to be restored. That’s even, presumably, with help from Government experts and others. Of all the high profile ransomware cases throughout the world, it is hard to think of many that have dragged on for this long with this degree of severity.

This slowness to recover is the most painful and most important lesson from the British Library cyber incident.

There are, no doubt, very good specific reasons for it. A 170 million item catalogue is bound to be very complicated. A replica backup would no doubt be very expensive and hard to maintain. (And, as stated at the start, this analysis implies no criticism of those working round the clock and over Christmas to try to get services back up and running; it is impossible to retrofit a solution that did not exist before the crisis.)

But faced with the likelihood and potency of this threat to myriad public and private entities, we must no longer accept a situation where important national organisations, public or private, cannot withstand the loss of their enterprise computer network for such a long period of time. If we tolerate this, the likely consequences in terms of economic and social disruption will prove intolerable. Planning for the loss of a key network, and being able to recover quickly from it, needs to be a core part of good public and corporate governance that every organisation models and practices.

The way to get to this point is not to indulge in the classic British tradition of holding a what-went-wrong-and-who-can-we-hang-out-to-dry inquiry. This is not the Post Office IT scandal. There is not a single allegation of malice, bad faith or wilful negligence. Instead, an organisation with a reputation for being well-run and held in high public esteem found itself without the systems and plans in place to recover from being the victims of criminals. They deserve sympathy and support.

But we have to figure out why. What constraints were there (and what incentives weren’t) that prevented this otherwise capable organisation from protecting itself and recovering quickly? Where else is this a risk? And what can be done about it?

The American answer to this conundrum has been to establish a Cyber Safety Review Board, based on the successful model in aviation safety. The aim is not to hunt for blame but to look at the rational explanations for why things went wrong and make constructive recommendations to address them. Such an approach could work in this case. The last thing we need is hours of theatrical hearings in a courtroom or committee room of Parliament, with exhausted witnesses defensive and humiliated. That makes for good TV and terrible public policy.

The UK has, by and large, suffered less major harm from ransomware than most comparable nations. But the British Library case is a warning. The critical lessons of it are:

This work is not easy. But it is vital, and urgent. And it is doable, with the right focus and leadership. Otherwise, in the well-chosen title of the Parliamentary report, national security is a hostage to fortune.
