On 29 April 2026, a Linux kernel bug called Copy Fail (CVE-2026-31431) went public with a working exploit attached. 732 bytes of Python. Any user account on the box, root in seconds. No network access, no kernel debugger, no special configuration. Just a basic shell and a copy of the script.
By the time most operators had read the headline, every Webnestify-managed server already had the immediate mitigation deployed. The patched kernel followed across the fleet within the day.
Here’s what the bug actually does, why this disclosure was unusual, and what the day-of response looked like from inside a managed-infrastructure shop.
What Copy Fail actually does
Plain-English version: Linux keeps an in-memory working copy of every file the system is using. That cache is called the page cache, and programs read from those copies rather than from disk. The kernel is supposed to be the only thing that can write to them.
Copy Fail breaks that rule. By feeding the kernel’s crypto API a carefully crafted sequence of operations, an unprivileged user can write a few bytes into the in-memory copy of any file the kernel has cached. Pick /usr/bin/su (it runs as root). Inject shellcode 4 bytes at a time. Run su. The kernel loads the corrupted in-memory version of the binary. Your shellcode executes as UID 0.
The on-disk file is never touched.
Why this one was different from a normal kernel CVE
Three things made this disclosure worse than the usual Linux kernel advisory.
- It hit basically every distribution. The bug was introduced by a 2017 performance optimization in algif_aead. Anything built since then is in scope. At disclosure, Ubuntu 24.04 LTS (6.17.0-1007-aws), Amazon Linux 2023 (6.18.8-9.213.amzn2023), RHEL 10.1 (6.12.0-124.45.1.el10_1), and SUSE 16 (6.12.0-160000.9-default) all shipped vulnerable stock cloud kernels.
- The exploit needed almost nothing. No network access. No kernel debug features. No setuid helper trick. Just a regular user account. The AF_ALG crypto interface that the exploit pivots through is enabled in essentially every mainstream distro’s default config.
- It punches through container boundaries. The page cache is shared per host kernel, not per container. A process inside a container can corrupt cached binaries the host kernel reads, which is also a container escape and a Kubernetes node-compromise primitive.
How the exploit actually works
For anyone who wants the mechanism: the bug lives in the authencesn crypto template, a wrapper around AEAD encryption that adds Extended Sequence Number support for IPsec.
When a user splice()s a file into an AF_ALG socket, the kernel passes direct references to the file’s page cache pages into the crypto operation rather than copying them. That was the 2017 optimization. Those pages end up in a scatterlist that the crypto template treats as writable.
The authencesn template uses its destination buffer as scratch space during decryption. One of those scratch writes lands 4 bytes past the end of the legitimate output region (specifically, assoclen + cryptlen past the AEAD tag), directly inside the page cache page of whatever file got spliced in.
The reference PoC uses this primitive to inject shellcode into /usr/bin/su 4 bytes per call, then triggers execve("/usr/bin/su"). Because su is setuid-root, the corrupted in-memory binary runs as UID 0. Theori and Xint Code Research published the full writeup and the reference exploit on disclosure day.
732 bytes of Python.
The disclosure timeline
| Date | Event |
|---|---|
| 23 March 2026 | Reported to the Linux kernel security team by Xint Code Research / Theori (research credited to Taeyang Lee) |
| 25 March 2026 | Patches reviewed on the kernel security mailing list |
| 1 April 2026 | Patch committed to mainline as a664bf3d603d |
| 22 April 2026 | CVE-2026-31431 assigned |
| 29 April 2026 | Public disclosure on oss-security; working PoC released the same day |
The patch reverts the 2017 in-place optimization in algif_aead, so page cache pages can no longer reach the writable destination scatterlist of a crypto operation.
What we did the day it dropped
I was watching the oss-security thread on the morning of 29 April when the public PoC went live. Within roughly 90 minutes, every Webnestify-managed server had the immediate mitigation deployed:
echo "install algif_aead /bin/false" > /etc/modprobe.d/disable-algif.conf
rmmod algif_aead
That blocks the exploit path without a reboot. The patched kernel rolled out across the fleet through the next maintenance windows, which I scheduled at off-peak hours per server’s traffic profile so nobody noticed.
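Across a fleet, those two commands are typically pushed over SSH. The sketch below is illustrative only, not the tooling Webnestify actually used: it assumes a plain-text hosts file with one hostname per line and key-based root SSH, and the `rollout_mitigation` function, `hosts.txt`, and the `DRY_RUN` switch are all hypothetical names.

```shell
# Hypothetical fleet rollout of the algif_aead mitigation.
# Assumptions: one host per line in the hosts file, key-based root SSH.
MITIGATION='echo "install algif_aead /bin/false" > /etc/modprobe.d/disable-algif.conf; rmmod algif_aead 2>/dev/null || true'

rollout_mitigation() {
  hosts_file=$1
  while IFS= read -r host; do
    # Skip blank lines and comment lines.
    case $host in ''|'#'*) continue ;; esac
    if [ "${DRY_RUN:-0}" = "1" ]; then
      echo "would run on $host"
    else
      # -n stops ssh from consuming the rest of the hosts file on stdin.
      ssh -n -o BatchMode=yes "root@$host" "$MITIGATION" \
        && echo "ok $host" || echo "FAILED $host"
    fi
  done < "$hosts_file"
}

# Usage: DRY_RUN=1 rollout_mitigation hosts.txt
```

Running it with `DRY_RUN=1` first is a cheap way to confirm the host list before anything touches production. The `|| true` after `rmmod` keeps hosts where the module was never loaded from being reported as failures.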
Customers got an email before most of them had seen the news. Verbatim:
You may have already seen the news about a Linux kernel vulnerability nicknamed “copy.fail” (CVE-2026-31431), disclosed on 29 April with a public exploit released the same day. It’s a serious flaw in a crypto module affecting nearly every Linux server on the internet, and coverage has been picking up over the last 48 hours.
Your server is already protected. We deployed the immediate mitigation as soon as the disclosure landed, and we’ve now also installed the patched kernel across our entire managed fleet for full long-term protection.
Scheduled reboot: to activate the new kernel, your server will reboot during a planned maintenance window at off-peak hours. The reboot itself takes only a minute or two, and we’ve timed it to minimise any disruption to your sites and visitors. No action is required from you.
Keeping your servers continuously patched and protected against zero-days like this is a core part of the Webnestify managed-hosting service. We monitor disclosures around the clock and respond before issues can affect your sites.
The work was already done by the time the explanation landed in inboxes, which is the order it should happen in.
Shoutouts to xCloud and RunCloud
Patching my own fleet is the bare minimum. Once the PoC was confirmed I also reached out to a few teams I know well, partly because their customers are friends and partly because the blast radius shrinks when the wider community moves together.
Nobin (CTO of xCloud) and the xCloud team moved fast. I sent them a short video walkthrough of the PoC behaviour and they had patches rolling across their managed infrastructure the same day. No committee, no delay.
Raj at RunCloud did the same. Servers connected to the RunCloud platform got the fix without drama.
That’s how this is supposed to work. A live exploit drops on a Wednesday morning, and the operators who actually care move first.
If you’re managing your own server
If you’re not on managed hosting, here’s the order of operations. (If you don’t already have the basics like SSH key auth, sudo users, and UFW locked down, start with the Linux server security fundamentals walkthrough before any of this.)
Mitigate first, even if you can’t reboot right away:
echo "install algif_aead /bin/false" > /etc/modprobe.d/disable-algif.conf
rmmod algif_aead
That kills the exploit path. Most workloads don’t use algif_aead directly. Some hardware crypto offload tooling does, so test in staging if you run anything exotic.
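To confirm the mitigation took, it helps to check both the on-disk override and the running kernel. A minimal sketch, assuming algif_aead is built as a loadable module; if your kernel has it built in, it will not appear in /proc/modules and this check will not tell you anything useful. The function name `check_algif_mitigation` is illustrative.

```shell
# Report whether the algif_aead mitigation is in effect.
# Both paths are parameters so the check can be pointed at test fixtures.
check_algif_mitigation() {
  conf_dir=${1:-/etc/modprobe.d}
  modules_file=${2:-/proc/modules}
  status=0

  # 1. Is there an "install algif_aead /bin/false" override on disk?
  if grep -qsE '^[[:space:]]*install[[:space:]]+algif_aead[[:space:]]+/bin/false' "$conf_dir"/*.conf; then
    echo "config: override present"
  else
    echo "config: override MISSING"
    status=1
  fi

  # 2. Is the module still loaded in the running kernel?
  if grep -qs '^algif_aead ' "$modules_file"; then
    echo "runtime: algif_aead still loaded"
    status=1
  else
    echo "runtime: algif_aead not loaded"
  fi
  return "$status"
}

# Usage: check_algif_mitigation && echo "mitigation in place"
```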
Then patch. Update to a kernel that includes mainline commit a664bf3d603d:
# Ubuntu / Debian
apt update && apt upgrade linux-image-generic
# RHEL / Amazon Linux / Fedora
dnf update kernel
# SUSE
zypper update kernel-default
Reboot to activate the new kernel. The mitigation can stay in place afterwards; it does no harm.
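After the reboot, it is worth confirming that the running kernel is at least the version your distribution lists as fixed. A rough sketch, assuming GNU `sort -V` is available; `version_ge` is a hypothetical helper and `FIXED` is a placeholder you fill in from your distro's advisory, not a value taken from this post. Compare like with like, since each distro has its own version suffix scheme.

```shell
# version_ge A B: succeed when version A is the same as or newer than B.
# Relies on `sort -V` (version sort) from GNU coreutils.
version_ge() {
  [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

# Usage after reboot (FIXED is a placeholder, see your distro advisory):
#   FIXED="x.y.z-n"
#   if version_ge "$(uname -r)" "$FIXED"; then
#     echo "patched kernel running"
#   else
#     echo "still on $(uname -r), reboot or update needed"
#   fi
```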
For container hosts, block AF_ALG socket creation via seccomp regardless of patch state. The default Docker seccomp profile already does this. Many Kubernetes distributions don’t, so it’s worth checking your node profile.
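One way to check a node profile is to probe from inside a container. The sketch below assumes `python3` is available in the container image and simply tries to open an AF_ALG socket; `af_alg_probe` is an illustrative name. Note that "reachable" only means the socket family can be opened, not that the host is vulnerable, and "blocked" can also mean the kernel lacks AF_ALG support altogether.

```shell
# Probe whether this process may create an AF_ALG socket.
# Prints one of: reachable, blocked, unknown.
AF_ALG_PROBE_PY='
import socket
try:
    s = socket.socket(socket.AF_ALG, socket.SOCK_SEQPACKET, 0)
except AttributeError:
    print("unknown")    # Python build or platform without AF_ALG
except OSError:
    print("blocked")    # denied by seccomp, or no kernel support
else:
    s.close()
    print("reachable")
'

af_alg_probe() {
  if ! command -v python3 >/dev/null 2>&1; then
    echo "unknown"
    return 0
  fi
  python3 -c "$AF_ALG_PROBE_PY"
}

# Usage inside a container: af_alg_probe
```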
Watch your logs for unusual execve calls on /usr/bin/su from non-root accounts. The exploit is invisible to file-integrity tooling, but the behaviour around it tends not to look like normal user activity. If you don’t already have a behaviour-based defence layer on the box, CrowdSec is the one I install on day one: community-sourced blocklists plus local heuristics, no SaaS dependency.
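If auditd is installed, one way to get those su executions into a searchable log is a file watch rule. This is a sketch under that assumption, not something the response described here relied on; the `su_audit_rule` helper, the `su_exec` key, and the rules file name are all made up for illustration.

```shell
# Emit an auditd watch rule that logs every execution of a binary.
#   -w PATH : watch this path
#   -p x    : trigger on execute
#   -k KEY  : tag events so they can be searched later
su_audit_rule() {
  printf '%s\n' "-w ${1:-/usr/bin/su} -p x -k ${2:-su_exec}"
}

# Usage (as root, with auditd installed):
#   su_audit_rule > /etc/audit/rules.d/su-exec.rules
#   augenrules --load
#   ausearch -k su_exec -i
```

The rule records every run of su, legitimate ones included, so the value comes from reviewing which accounts show up rather than from the rule firing at all.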
What this changes about managed hosting
Copy Fail wasn’t unusual in being a Linux kernel CVE. Those land at a steady clip. It was unusual in two specific ways: the public PoC dropped on disclosure day with no grace period, and the exploit evades every standard host-integrity tool an operator would normally lean on, because the on-disk file never changes.
That combination is the actual job of managed hosting. Not “is the server up”. Not “is the disk full”. Watching disclosure feeds at 9am on a Wednesday so that by the time the customer reads about it on Hacker News, the patch is already deployed and the explanation is already in their inbox.
Most hosting providers fail this test. Their version of “managed” is keeping the box running and rebooting it when monitoring screams, which is necessary but isn’t sufficient when the disclosure timeline collapses to zero. If you want a clearer picture of how the threat surface has shifted for the kind of business that depends on a working website, the cybersecurity threats modern businesses actually face writeup covers what’s changed in the last few years.
Want this kind of cover for your stack?
If Copy Fail made you wonder who’s actually watching your servers, that’s the right question. For most setups the answer is “nobody, until something breaks”, which is fine until a Wednesday like 29 April.
Webnestify is a small managed-infrastructure shop for agencies and growing businesses. The full picture of what we run for customers lives on the solutions page, and if you want a one-off check rather than ongoing management, the cloud infrastructure audit is the lightweight way in. When something serious drops, you hear from me directly on Signal, WhatsApp, or email, usually before you’ve seen the news.
Apply for a discovery call if you want to walk through your stack and risk profile. Thirty minutes, no slide deck.
P.S. — If you’d rather start by understanding the stack first, I’ve put 130+ free tutorials on the Webnestify YouTube channel. Same depth, no commitment.
References
- copy.fail, the official disclosure site
- Xint Code Research writeup, full technical analysis by Theori / Xint
- theori-io/copy-fail-CVE-2026-31431, reference PoC
- oss-security disclosure thread, original public announcement
- Debian security tracker, distro patch status