In the wake of a widespread outage affecting Microsoft Windows systems that disrupted operations across various sectors globally, cybersecurity firm CrowdStrike has issued a detailed statement and provided workarounds to address the issue. The outage, which stemmed from a defect in a single content update for Windows hosts, caused significant disruptions, including grounded flights and stalled financial transactions.
CrowdStrike’s Statement on the Falcon Content Update
CrowdStrike has clarified that the issue was not a result of a security incident or cyberattack but was due to a defect found in a recent content update for Windows hosts. “Mac and Linux hosts are not impacted,” the company stated, emphasizing that the problem was isolated to specific Windows systems.
“The issue has been identified, isolated, and a fix has been deployed,” CrowdStrike announced. “We refer customers to the support portal for the latest updates and will continue to provide complete and continuous updates on our website.”
Detailed Technical Alert and Workarounds
At 9:22 am ET on July 19, 2024, CrowdStrike provided a comprehensive technical alert with more information about the issue and steps organizations can take to mitigate its effects. The company is actively working with global customers to ensure their security and stability.
Summary of the Issue
CrowdStrike is aware of reports of crashes on Windows hosts related to the Falcon Sensor. Affected hosts experience a blue screen error (commonly known as the “blue screen of death”).
Key details include:
- Windows hosts that have not been impacted do not require any action, as the problematic channel file has been reverted.
- Windows hosts brought online after 0527 UTC will also not be impacted.
- Hosts running Windows 7/2008 R2 are not impacted.
- Mac- or Linux-based hosts are unaffected.
- The problematic channel file “C-00000291*.sys” with a timestamp of 0409 UTC has been replaced with a good version timestamped at 0527 UTC.
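The timestamp rule in the last item can be sketched in a few lines of Python. This is an illustrative sketch only, not a CrowdStrike tool. It assumes the UTC times in the alert fall on July 19, 2024 (the date of the alert), and the function name and return labels are hypothetical.

```python
from datetime import datetime, timezone
from fnmatch import fnmatchcase

# Assumption: the 0409 and 0527 UTC times refer to July 19, 2024,
# the date of the technical alert.
PROBLEMATIC_TIME = datetime(2024, 7, 19, 4, 9, tzinfo=timezone.utc)
REVERTED_TIME = datetime(2024, 7, 19, 5, 27, tzinfo=timezone.utc)
PATTERN = "C-00000291*.sys"


def classify_channel_file(name: str, modified: datetime) -> str:
    """Label a file using the timestamps given in the alert.

    `modified` must be a timezone-aware datetime. The labels returned
    here are hypothetical names chosen for this sketch.
    """
    # Compare names case-insensitively, as Windows file names are.
    if not fnmatchcase(name.lower(), PATTERN.lower()):
        return "unrelated"
    # Normalize to UTC and drop seconds so the comparison is per minute.
    stamp = modified.astimezone(timezone.utc).replace(second=0, microsecond=0)
    if stamp == PROBLEMATIC_TIME:
        return "problematic"
    if stamp >= REVERTED_TIME:
        return "reverted"
    # The alert does not describe files with other timestamps.
    return "unknown"
```

Files whose timestamps match neither time in the alert are deliberately labeled "unknown" rather than guessed at.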
Current Action and Workaround Steps
CrowdStrike Engineering has identified and reverted the content deployment related to the issue. For hosts that are still crashing and unable to stay online to receive the updated channel file, the following workaround steps are recommended:
For Individual Hosts:
- Reboot the Host: Give the host an opportunity to download the reverted channel file. If it crashes again:
- Boot Windows into Safe Mode or the Windows Recovery Environment: Using a wired network and Safe Mode with Networking can help remediation.
- Navigate to the CrowdStrike Directory: Go to %WINDIR%\System32\drivers\CrowdStrike.
- Delete the Problematic File: Locate the file matching “C-00000291*.sys” and delete it.
- Boot the Host Normally: Note that BitLocker-encrypted hosts may require a recovery key.
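For administrators who prefer to script the locate-and-delete step, the logic might look like the following Python sketch. It is not an official CrowdStrike procedure. It assumes Python is available on the host, which may not be the case in Safe Mode or the Windows Recovery Environment, and that the Windows directory is resolved from the WINDIR environment variable; the drive letter may differ in a recovery environment. The function names are hypothetical, and the function defaults to a dry run that only lists matches.

```python
import os
from pathlib import Path

PATTERN = "C-00000291*.sys"


def default_crowdstrike_dir() -> Path:
    """Build %WINDIR%\\System32\\drivers\\CrowdStrike from the environment.

    Assumption: falls back to C:\\Windows when WINDIR is not set.
    """
    windir = os.environ.get("WINDIR", r"C:\Windows")
    return Path(windir) / "System32" / "drivers" / "CrowdStrike"


def remove_channel_files(directory: Path, dry_run: bool = True) -> list[Path]:
    """Return files matching the pattern; delete them unless dry_run is set."""
    matches = sorted(directory.glob(PATTERN))
    if not dry_run:
        for path in matches:
            path.unlink()  # Requires administrator rights on a real host.
    return matches
```

Calling `remove_channel_files(default_crowdstrike_dir())` first, and reviewing the returned list before repeating the call with `dry_run=False`, keeps the deletion an explicit second step.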
For Public Cloud or Virtual Environments:
Option 1:
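- Detach the Operating System Disk Volume: Detach it from the impacted virtual server.
- Create a Snapshot or Backup: Snapshot or back up the disk volume before proceeding, as a precaution.
- Attach the Volume to a New Virtual Server: Navigate to the CrowdStrike directory on the attached volume and delete the file matching “C-00000291*.sys”.
- Detach and Reattach the Volume: Detach the volume from the new virtual server and reattach it to the impacted virtual server.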
Option 2:
- Roll Back to a Snapshot: Restore a snapshot taken before 0409 UTC.
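Choosing which snapshot to roll back to comes down to a timestamp comparison, sketched below in Python. The sketch assumes the 0409 UTC cutoff falls on July 19, 2024 (the date of the alert) and that snapshot creation times are already available as timezone-aware values. The function name and the snapshot names in the usage note are hypothetical, and no cloud provider API is called.

```python
from datetime import datetime, timezone
from typing import Optional

# Assumption: 0409 UTC refers to July 19, 2024, the date of the alert.
CUTOFF = datetime(2024, 7, 19, 4, 9, tzinfo=timezone.utc)


def latest_safe_snapshot(snapshots: dict[str, datetime]) -> Optional[str]:
    """Return the name of the newest snapshot taken before the cutoff.

    `snapshots` maps a snapshot name to its timezone-aware creation
    time. Returns None when no snapshot predates the cutoff.
    """
    safe = {name: taken for name, taken in snapshots.items() if taken < CUTOFF}
    if not safe:
        return None
    return max(safe, key=lambda name: safe[name])
```

Given hypothetical snapshots named "nightly" (taken at 0100 UTC), "hourly" (0400 UTC), and "post-update" (0430 UTC), the function selects "hourly", the most recent one created before the problematic file was distributed.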
Global Impact of the Outage
The outage has had a profound impact worldwide, with banks, media outlets, and emergency services among those affected. Major airlines including Delta Air Lines, United Airlines, and American Airlines halted departures, leading to significant travel disruptions.
Windows computers and tablets crashed in countries from the U.S. to China and Australia, with many devices showing the notorious blue screen of death. Financial institutions and corporations reported similar issues, all traced back to the CrowdStrike update.
Microsoft’s Response
Microsoft acknowledged the issue, stating, “We are working to restore services for those still experiencing disruptions as quickly as possible.” The company clarified that the CrowdStrike error was separate from its own cloud services outage experienced overnight.
Looking Forward
CrowdStrike’s proactive communication and the deployment of workarounds have been crucial in mitigating the fallout from this outage. As the situation stabilizes, the focus will likely shift to ensuring such incidents are prevented in the future and that response mechanisms are further refined.
This incident underscores how a single defective update to widely deployed software can cascade across sectors, and the need for swift, effective responses to technical issues in an increasingly interconnected digital world.