Computer Security: Data Centre Nightmares

What a bad weekend. I didn’t sleep well. Not. At. All. I tossed and turned. Sweated. Woke up and fell asleep again. I had data centre nightmares.

You know, computer security comes with clear mantras. One is “Defence-in-depth”, where security controls are applied at every level of the hardware and software stack, e.g., agile and timely updating and vulnerability management, secure and professional software development, an inventory known as a Software Bill of Materials (SBOM), tested business continuity and disaster recovery plans, logging and intrusion detection, access control, network segregation and compartmentalisation, firewalls and email quarantines, data diodes, bastion hosts, gateways and proxies. A second mantra is “KISS” ─ “Keep it simple, stupid”. It tells us not to overcomplicate things, to avoid unnecessary complexity and not to deviate too far from the “standard”.

But nothing is “KISS” anymore. Gone are the days when the accelerator sector, the physics experiments and the IT department used a multitude of dedicated PCs ─ PC farms ─ to do the job. The same PCs that could be found in offices. And, security-wise, PCs were easy back then: the motherboard and its “BIOS”, the operating system, and your favourite application. Three layers to secure. Easy. Although we had separate computer centres in the past, this is not affordable anymore. The combined requirements of the accelerator sector, experiments and IT, as well as the user community, are simply too large.

A modern data centre, on the other hand, is complex. Instead of three layers, we have five: the motherboard (but now running a full-blown operating system), a hypervisor, one or several virtual machines benefitting from the multiple CPUs on the motherboard, and the containers inside running ─ finally! ─ your favourite application. And since everything is virtualised, the same hardware runs a multitude of other applications in parallel. This is called being “agile” or “elastic”, and it allows for load balancing, business continuity and disaster recovery. It accommodates the infrastructure for “Big Data” ─ machine learning and, just around the corner, ChatGPT. It provides public/hybrid/private cloud resources, as well as GPUs, and it will eventually enable quantum computing. It is, to use the German phrase, “eine eierlegende Wollmilchsau” (an egg-laying, wool-bearing, milk-giving pig: the one machine that does it all). Enter the third mantra of computer security: “AC/DC”, or rather, “all convenient and damn cheap”. After all, nobody prioritises security over convenience and value for money. An “AC/DC” data centre is therefore complex and not without significant security challenges – my worst nightmare… Let’s start dreaming.

Dreaming of dedicated networks
Let’s try. One network for the hardware and its BIOS, now called IPMI or BMC ─ a fully fledged operating system. One network for the provisioning of the virtual machines and containers. One network for CERN’s Intranet ─ the Campus Network. Several networks for running the accelerators, infrastructure and experiments. “Security” would require those networks to be physically separate from one another, as sharing the same hardware (routers, switches) ─ e.g. to spin up VLANs ─ might come with flaws that, when exploited, could allow a hacker to jump from one network to another. I start tossing and turning in bed.
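
If the dream of truly dedicated networks is to hold, someone has to check that no machine quietly sits on several of them at once. Below is a minimal sketch of such a check, in Python: the subnets, host names and addresses are all invented placeholders for illustration, not CERN’s real addressing plan.

```python
#!/usr/bin/env python3
"""Minimal segregation check. Every subnet, host and address below is an
invented placeholder for illustration only."""
import ipaddress

# Assumption: one dedicated range per duty (illustrative values, not CERN's).
NETWORKS = {
    "ipmi-bmc":     ipaddress.ip_network("10.0.0.0/16"),
    "provisioning": ipaddress.ip_network("10.1.0.0/16"),
    "campus":       ipaddress.ip_network("10.2.0.0/16"),
    "accelerator":  ipaddress.ip_network("10.3.0.0/16"),
}

# Hypothetical inventory: host name -> addresses of all its interfaces.
INVENTORY = {
    "hypervisor-01": ["10.0.1.10", "10.1.1.10"],
    "admin-console": ["10.0.2.20", "10.1.2.20", "10.2.2.20"],
    "webserver-07":  ["10.2.3.30"],
}

for host, addrs in INVENTORY.items():
    touched = {name for name, net in NETWORKS.items()
               for a in addrs if ipaddress.ip_address(a) in net}
    if len(touched) > 1:
        print(f"{host} spans {sorted(touched)}: a potential bridge between networks")
```

Any host flagged by such a check is exactly the kind of bridge a hacker would love.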

Also, the networks need to be managed: DHCP, DNS, NTP. Ideally, there should be one such system for each network. Unfortunately, those systems need to be kept consistent, either by interconnecting them or by running just one central instance. One system to rule them all. One system to fail. My mind is racing.
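
Keeping those per-network services consistent is something one would rather verify than assume. The sketch below, assuming a Linux machine with the `dig` command available and using made-up resolver addresses and a placeholder record name, asks each network’s DNS server the same question and complains if the answers differ.

```python
#!/usr/bin/env python3
"""Cross-check that each network's DNS server gives the same answer for one
critical name. Resolver addresses and the test name are placeholders; requires
the `dig` command on the machine running the check."""
import subprocess

RESOLVERS = {                      # assumption: one resolver per network
    "ipmi-bmc":     "10.0.0.53",
    "provisioning": "10.1.0.53",
    "campus":       "10.2.0.53",
}
TEST_NAME = "ntp.example.org"      # a record that must look identical everywhere

def lookup(server: str, name: str) -> frozenset:
    """Return the set of A records 'name' resolves to when asking 'server'."""
    out = subprocess.run(
        ["dig", f"@{server}", name, "A", "+short", "+time=2", "+tries=1"],
        capture_output=True, text=True, timeout=10,
    )
    return frozenset(out.stdout.split())

answers = {net: lookup(ip, TEST_NAME) for net, ip in RESOLVERS.items()}
if len(set(answers.values())) > 1:
    print("WARNING: resolvers disagree about", TEST_NAME, "->", answers)
else:
    print("All resolvers agree on", TEST_NAME)
```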

And it might not even matter. By using hypervisors, we are already bridging networks. Unless we have separate hypervisors for separate duties, which would violate the third mantra ─ by not being elastic, convenient or cheap. Unfortunately, we have already seen cases in which vulnerabilities in the underlying CPUs have undermined the hypervisor’s isolation, allowing data to leak from one virtual machine to another, bridging networks, and more: “Spectre”, “Meltdown” and “Foreshadow” (2018), “Fallout” (2019), “Hertzbleed” (2022), “Downfall” and “Inception” (both 2023). I’m sweating.
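
At least modern Linux kernels tell you where you stand: for many of these hardware flaws, the mitigation status is exposed under sysfs. A small sketch, assuming a hypervisor running a reasonably recent Linux kernel, simply reads those entries and flags anything still reported as vulnerable.

```python
#!/usr/bin/env python3
"""Read the kernel's own verdict on hardware flaws such as Spectre, Meltdown,
Foreshadow (l1tf) or MDS. Linux-only: recent kernels expose this under sysfs."""
from pathlib import Path

SYSFS = Path("/sys/devices/system/cpu/vulnerabilities")

if not SYSFS.is_dir():
    print("No sysfs vulnerability reporting here (non-Linux or an old kernel).")
else:
    for entry in sorted(SYSFS.iterdir()):
        status = entry.read_text().strip()
        marker = "!! " if status.lower().startswith("vulnerable") else "   "
        print(f"{marker}{entry.name:25} {status}")
```

On a patched machine, most entries typically read “Mitigation: …” or “Not affected”; anything beginning with “Vulnerable” deserves a closer look.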

But it can get worse. Our hardware, our computer centre, is supposed to serve. And sometimes it must serve several masters. The accelerator sector and the experiments and at the same time the Campus Network. Or, even worse, the Campus Network and the Internet. Full exposure. One server, one service, one application (like the “e-logbook”) visible to the accelerator control room, the experiments, the Campus Network, and the whole wide world. Ready to fall prey to ransomware. Waking up.
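
A first, very local symptom of such “full exposure” is a service that listens on every network interface of its server instead of on one dedicated address. The quick sketch below, assuming a Linux host with iproute2’s `ss` command available, lists exactly those wildcard listeners.

```python
#!/usr/bin/env python3
"""Flag TCP services that listen on *all* interfaces of this host, i.e. prime
candidates for unintended exposure. Assumes Linux with iproute2's `ss`."""
import subprocess

out = subprocess.run(["ss", "-ltn"], capture_output=True, text=True, check=True)
for line in out.stdout.splitlines()[1:]:      # first line is the column header
    cols = line.split()
    if len(cols) < 4:
        continue
    local = cols[3]                           # the "Local Address:Port" column
    if local.startswith(("0.0.0.0:", "[::]:", "*:")):
        print("listens on every interface:", local)
```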

Hallucinating about controlled administration and provisioning
But let’s forget those complications. I’m falling asleep again. This time I’m dreaming of data centre administration. Ideally, admins have one console for the IPMI/BMC network and one for provisioning. But who wants to have two consoles on their desk? Three, if you count the one for the Campus Network. Mantra three: Nobody. So, we bridge the networks once more. One console ─ an office computer ─ to administer them all. Ideally reachable from the Internet for remote maintenance. Not that this comes with any risk... Tossing and turning again.

And we have not yet mentioned provisioning, that is, the use of tools like Puppet and Ansible to “push” virtual machines and containers out from the storage systems and databases and deploy them on the hypervisors, thereby “orchestrating” the data centre. But this orchestration, the storage systems and the databases must also be available to our user community: CERN allows its community to run their own services, their own virtual machines and their own containers. So, we ultimately bridge the provisioning network and the Intranet once more. Sweating. Lots of sweating.

The cloud trance
The above configuration ─ with all its complications ─ can also be called a “private cloud”. But modern people don’t stop here. Enter public clouds. Connecting our data centre with that of Amazon, Google, Microsoft or Oracle. Bridging our networks, sigh, and theirs. Via the Internet. And using, in parallel, other Internet resources: publicly shared virtual machines, commonly available containers, shared (open source) software libraries and packages. The Internet is full of all sorts of useful things. And malicious things, too. Compromised virtual machines, malicious containers, vulnerable software. All of them channelled straight into our data centre. No filtering, nothing. Aarrgh. I’m wide awake again.
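
The “filtering” my dream is missing could start with something as mundane as an admission check: accept a virtual machine or container image only if it comes from an approved registry and is pinned to an immutable digest. The sketch below is illustrative only; the registry names and image references are invented placeholders, not a real CERN policy.

```python
#!/usr/bin/env python3
"""A first line of filtering: accept an image reference only if it comes from
an approved registry and is pinned to an immutable digest. Registry names and
image references are invented examples."""
import re

APPROVED_REGISTRIES = ("registry.example.internal", "gitlab-registry.example.org")
DIGEST = re.compile(r"@sha256:[0-9a-f]{64}$")

def admissible(image_ref: str) -> bool:
    """True if the registry is approved and the reference is digest-pinned."""
    registry = image_ref.split("/", 1)[0]
    return registry in APPROVED_REGISTRIES and bool(DIGEST.search(image_ref))

for ref in (
    "registry.example.internal/physics/analysis@sha256:" + "0" * 64,
    "docker.io/random-user/miner:latest",                   # unknown source
    "registry.example.internal/physics/analysis:latest",    # approved, unpinned
):
    print("ACCEPT" if admissible(ref) else "REJECT", ref)
```

Digest pinning matters because a mutable tag like “latest” can change under your feet between testing and deployment.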

Data centre nightmares
Voilà. I’m having data centre nightmares. Common hardware vulnerabilities threaten the security of data centres. As do basic services crossing network boundaries (DNS, SSO/LDAP, orchestration, storage, DBs, etc.). A rapidly growing cacophony of dependencies, agility, heterogeneity and complexity violates the second mantra: “KISS” (“Keep it simple, stupid”) becomes “AC/DC” (“all convenient and damn cheap”). And that’s without mentioning the increasing dependency on external cloud services and software imports… So, if you have any bright ideas ─ and please don’t suggest sleeping pills ─ let us know at Computer.Security@cern.ch.

_____

Do you want to learn more about computer security incidents and issues at CERN? Follow our Monthly Report. For further information, questions or help, check our website or contact us at Computer.Security@cern.ch.