What Caused The Largest IT Outage In History And What Can Be Done?

July 22, 2024 12:52 pm

On July 19, 2024, a seemingly routine cybersecurity update led to a catastrophic IT outage, causing chaos worldwide. The culprit? A patch for CrowdStrike’s Falcon Sensor program, which inadvertently introduced a coding error that sent 8.5 million Windows computers into a “Blue Screen of Death” (BSOD) boot loop, rendering them unusable until a specific file (channel file 291) could be neutralized.

The Role of CrowdStrike

CrowdStrike, a prominent US-based cybersecurity company, serves government agencies, corporations, and various other organisations with cost-effective and scalable security solutions. Their Falcon Sensor program, a cloud-based service providing malware protection and antivirus support, is a cornerstone of their offerings. CrowdStrike even has ties with Australian security agencies. In 2019, CrowdStrike initiated a contract with the Australian Signals Directorate worth $640,000, as well as a limited tender contract with the Department of Defence worth $954,323. That said, Prime Minister Anthony Albanese says there’s “no impact to critical infrastructure, government services or triple-0 services”.

What Went Wrong?

The update, intended to enhance security, backfired due to a logic error in the code, affecting any Windows system running Falcon Sensor version 7.11 or above. This bug triggered operating system crashes, causing widespread disruptions in airports, supermarkets, and media outlets, among other sectors. CrowdStrike responded quickly, deploying a fix within hours. However, the scale of the issue meant that many users continued to experience problems, and the speed of recovery varied. Many systems required manual intervention to delete the faulty file, a process complicated by the need for remote access or physical presence.

The irony of this event is that the company entrusted with protecting our digital assets through regular updates ended up bricking millions of devices and causing hundreds of millions, if not billions, of dollars’ worth of damage and lost revenue for its customers.

How To Fix It

If your organisation has a dedicated IT team, listen to their directions first and foremost, but verify they are in fact part of your organisation. Hopefully you have a regular contact you can trust, but even then it’s best to avoid text-only communications.

More specifically, you can boot the machine into Safe Mode and delete any files matching C:\Windows\System32\drivers\CrowdStrike\C-00000291*.sys

If you can’t access the System32 folder, you may need a system admin to log in either remotely or locally.

If your machine has BitLocker enabled, you’ll need to use your BitLocker Recovery Key.
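
For IT teams that prefer to script the deletion step rather than do it by hand, here is a minimal Python sketch. It assumes Python is available on the affected machine (often not the case in Safe Mode) and that the channel files sit in the folder named in this article. The function name `remove_channel_files` and its `dry_run` flag are illustrative only and are not part of any CrowdStrike or Microsoft tool.

```python
from pathlib import Path

# File pattern taken from the article. The folder is passed in as an
# argument so the function can be tried on a scratch directory first.
CHANNEL_FILE_PATTERN = "C-00000291*.sys"


def remove_channel_files(driver_dir, dry_run=True):
    """Return the matching channel files; delete them unless dry_run is True."""
    matches = sorted(Path(driver_dir).glob(CHANNEL_FILE_PATTERN))
    if not dry_run:
        for path in matches:
            path.unlink()
    return matches


if __name__ == "__main__":
    # Assumed location, as given in the article.
    target = Path(r"C:\Windows\System32\drivers\CrowdStrike")
    # Dry run: only list what would be deleted.
    for found in remove_channel_files(target, dry_run=True):
        print(f"would delete: {found}")
```

The dry run is the default so that nothing is removed until someone has checked the list of matches. Passing `dry_run=False` performs the deletion.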

This issue is so widespread that Microsoft has even created a dedicated bootable recovery tool to delete the specific channel file 291.

Whilst some online guides suggest disabling the endpoint security software entirely, this is not recommended as it could leave your device vulnerable.

Policy Implications

Governments around the world took note of the widespread disruption, prompting discussions on regulatory measures to ensure higher standards of cybersecurity. Lawmakers debated the necessity of mandatory third-party audits for critical software updates, stricter penalties for negligent coding practices, and the establishment of public-private partnerships to enhance national cyber resilience. This event also sparked interest in revising cybersecurity policies to include more stringent requirements for incident reporting and response times, aiming to protect both public and private sectors from future crises.

Consumer Confidence

For many users, the incident eroded trust in CrowdStrike and similar cybersecurity firms. The perception of infallibility in top-tier security solutions was shattered, leading businesses and individuals to seek alternative measures to safeguard their data. Some companies switched providers, while others opted to diversify their cybersecurity tools to avoid reliance on a single solution. The breach in confidence also encouraged more users to educate themselves on cybersecurity best practices, fostering a more informed and vigilant digital community.

A Lesson for Long-Term Cyber Resilience

When a single company or software platform dominates the market, it creates a fragile system with a single point of failure. Experts have long advocated for greater redundancy and the use of decentralized and heterogeneous federated systems to enhance long-term cyber resilience. Redundancy involves creating backup systems and alternative pathways for data and operations, ensuring that if one system fails, others can take over without significant disruption. By leveraging different systems and protocols, these networks can better withstand attacks that exploit specific vulnerabilities in one type of technology.
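
To make the redundancy idea concrete, here is a minimal Python sketch of failover between independent systems. The names `call_with_failover`, `primary` and `backup` are hypothetical stand-ins for two separately built services. A real deployment would involve far more than this (health checks, data replication, monitoring), so treat it only as an illustration of the principle.

```python
from typing import Callable, Sequence


def call_with_failover(providers: Sequence[Callable[[], str]]) -> str:
    """Try each provider in order and return the first successful result."""
    errors = []
    for provider in providers:
        try:
            return provider()
        except Exception as exc:  # a real system would catch narrower errors
            errors.append(exc)
    raise RuntimeError(f"all {len(errors)} providers failed")


def primary() -> str:
    # Hypothetical primary system that is currently down.
    raise ConnectionError("primary unavailable")


def backup() -> str:
    # Hypothetical independent backup built on a different stack.
    return "served by backup"


print(call_with_failover([primary, backup]))  # prints "served by backup"
```

The point of the design is that the backup shares nothing with the primary, so a fault that takes down one does not automatically take down the other.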

Future Cyber Threats: The Rise of AGI and ASI

One of the most significant looming threats is the advent of Artificial General Intelligence (AGI) and, eventually, Artificial Superintelligence (ASI). Such systems would be able to understand, learn, and apply intelligence across a wide range of domains, potentially surpassing human capabilities in almost every field. The potential for AGI and ASI to be weaponized poses a serious cyber threat. These systems could autonomously discover and exploit vulnerabilities in critical infrastructure, launch sophisticated cyber-attacks, and create malicious code that evolves to evade detection and neutralization. The implications of such attacks are profound: an AGI or ASI system could disrupt national security, manipulate financial markets, and cripple essential services such as healthcare and transportation. The ability of these systems to operate at speeds and scales far beyond human capabilities makes them formidable adversaries in the cyber domain. As we stand on the brink of an era defined by intelligent machines, the imperative to safeguard our digital future has never been more urgent.

Don’t Panic

In moments of crisis, it’s natural to feel overwhelmed and fearful, but it’s crucial to remember that no amount of fear can improve the situation. Be patient with service providers and focus on what you can do to strengthen your own digital resilience.
