Category: Deep Dives

Long form technical deep dives into one mechanism at a time: cloud, kernel, IoT, and privacy internals.

  • How Bluetooth LE Pairing Breaks: KNOB, BLESA, Just Works, and Sniffed Keys

    Bluetooth LE pairing is the handshake where two devices agree on a secret key so the rest of their conversation can be encrypted. Done right, that key is fresh, strong, and tied to the two devices that actually meant to talk. Done wrong, the key is weak, unauthenticated, or trivially guessed, and a third radio in range can read everything or pretend to be one of the endpoints. The mechanism is small, it runs in milliseconds, and most users never see it. That is exactly why its failures are so quiet. This post walks the pairing flow one step at a time, then walks the real attacks that break it: an entropy downgrade that shrinks a key to a single byte, a spoofing trick that abuses reconnection, an association model that encrypts without authenticating, and a passive sniff that recovers the key offline.

    What Bluetooth LE pairing actually does

    When two Bluetooth Low Energy devices first meet, they have no shared secret. A smartphone and an Acme smart lock are radios shouting into the same crowded band, and anyone nearby can hear the same packets. Pairing is the procedure that turns that open exchange into a private channel. It does three jobs in sequence: the devices announce what input and output hardware they have, they pick an association model based on those capabilities, and they run that model to establish a key. Everything that follows, every encrypted read and write, rests on the key that pairing produced.

    There are two generations of this procedure, and the difference matters for every attack below.

    LE legacy pairing versus LE Secure Connections

    LE legacy pairing is the original mechanism, shipped with the first Low Energy specification. In legacy pairing the two devices first agree on a Temporary Key, or TK. They use the TK to derive a Short Term Key, the STK, which protects the rest of the exchange long enough to hand over a Long Term Key, the LTK, that gets stored and reused on later connections. The weakness baked into this design is the TK. Depending on the association model, the TK is either a six digit number the user typed, or it is simply zero. A small number space is a guessable number space, and that is the thread an attacker pulls.

    LE Secure Connections arrived in Bluetooth 4.2 and replaces the heart of the exchange with Elliptic Curve Diffie Hellman key agreement on the P-256 curve. Instead of agreeing on a tiny Temporary Key and stretching it, both sides contribute to a shared secret that an eavesdropper cannot reconstruct from the packets alone, because the private scalars never go over the air. Just Works, Passkey Entry, and Out of Band exist in both generations, and Secure Connections adds a fourth model, Numeric Comparison. Under Secure Connections these models protect a real key agreement rather than a six digit guess. If you remember one defensive fact from this entire post, it is that requiring LE Secure Connections removes the foundation that the legacy attacks stand on.

    The four association models

    An association model is how the two devices authenticate the key they are agreeing on, given whatever buttons and screens they happen to have. The specification defines four, and which one runs is decided automatically from the input and output capabilities each side advertises.

    • Just Works is the model for devices with no screen and no keypad, which describes most cheap sensors, tags, and beacons. It runs the key agreement with no human in the loop and no value to compare. In legacy pairing the Temporary Key under Just Works is set to zero. It provides encryption, but it authenticates nothing.
    • Passkey Entry has one device display a six digit number and the user types it into the other, or the same number is keyed into both. That shared six digit value feeds the authentication, so an attacker who does not know the digits cannot complete the handshake in place of either device. Under legacy pairing the digits can still be brute forced from a capture afterward, as the fourth attack below shows.
    • Numeric Comparison, available only under LE Secure Connections, shows a six digit value on both screens and asks the user to confirm they match. Because the value is derived from both sides of an authenticated key agreement, confirming it rules out a radio sitting in the middle. This is the strongest model when both devices have a display.
    • Out of Band moves the authentication data over a different channel entirely, commonly NFC. If a strong secret travels over that side channel, the over the air handshake can be authenticated against it, and a remote attacker who only hears the Bluetooth radio has nothing to work with.

    The split that runs through all four is whether the model provides authentication, often called MITM protection, or only encryption. Just Works gives encryption with no authentication. The other three, used correctly, give both. An attacker reads that capability advertisement as a menu, and Just Works is the cheapest item on it.
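
The selection logic can be sketched in a few lines of Python. This is a simplified reading of the capability mapping, not the specification's full table: the names `IOCap` and `select_model` are hypothetical, the Out of Band rule is collapsed into one flag (the real rule differs between legacy and Secure Connections), and `mitm_requested` stands for "either side set the MITM flag".

```python
from enum import Enum


class IOCap(Enum):
    DISPLAY_ONLY = "DisplayOnly"
    DISPLAY_YES_NO = "DisplayYesNo"
    KEYBOARD_ONLY = "KeyboardOnly"
    NO_INPUT_NO_OUTPUT = "NoInputNoOutput"
    KEYBOARD_DISPLAY = "KeyboardDisplay"


# Capabilities that let a user type a passkey.
HAS_KEYBOARD = {IOCap.KEYBOARD_ONLY, IOCap.KEYBOARD_DISPLAY}
# Capabilities that let a user see a value and answer yes or no.
CAN_CONFIRM = {IOCap.DISPLAY_YES_NO, IOCap.KEYBOARD_DISPLAY}


def select_model(initiator, responder, *, secure_connections,
                 mitm_requested, oob_available=False):
    """Pick an association model (simplified sketch, see assumptions above)."""
    if oob_available:
        return "Out of Band"
    if not mitm_requested:
        # Nobody asked for authentication, so the IO capabilities are ignored.
        return "Just Works"
    if IOCap.NO_INPUT_NO_OUTPUT in (initiator, responder):
        # One side has no screen and no keypad: nothing can be compared or typed.
        return "Just Works"
    if secure_connections and initiator in CAN_CONFIRM and responder in CAN_CONFIRM:
        return "Numeric Comparison"
    if initiator in HAS_KEYBOARD or responder in HAS_KEYBOARD:
        return "Passkey Entry"
    # Two displays and no keyboard: nothing to type, nothing to confirm (legacy).
    return "Just Works"


phone = IOCap.KEYBOARD_DISPLAY
sensor = IOCap.NO_INPUT_NO_OUTPUT
print(select_model(phone, sensor, secure_connections=True, mitm_requested=True))
```

Even with Secure Connections and the MITM flag requested, a phone pairing with a screenless sensor lands on Just Works in this sketch, which is why a device has to check the resulting security level rather than assume the request was honored.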

    How the pairing breaks

    Four weaknesses, each attacking a different assumption in the flow above. The first downgrades the key. The second skips authentication on reconnect. The third never had authentication to begin with. The fourth recovers the key by listening.

    1. The KNOB attack: negotiating the key down to one byte

    The Key Negotiation of Bluetooth attack, KNOB, was published by Daniele Antonioli, Nils Ole Tippenhauer, and Kasper Rasmussen, and it targets a step most descriptions of pairing skip entirely: the negotiation of how long the encryption key will be. Before encryption starts on a Bluetooth BR/EDR link, the Classic transport that dual mode chips run alongside LE, the two devices agree on the entropy of the key, anywhere from 16 bytes down to a legacy minimum of 1 byte. The problem is that this entropy negotiation is itself unauthenticated. It happens before the strong key protects anything, and nothing signs or checks the proposed length.

    An attacker positioned between the two devices rewrites that negotiation in flight. When one side proposes 16 bytes, the attacker lowers the proposal to 1 byte before it reaches the other side, and does the same in reverse. Both devices believe they negotiated honestly and both accept a key with only 8 bits of entropy. That is 256 possible keys. The attacker brute forces it, decrypts the traffic, and injects valid ciphertext, all without either victim noticing, because the downgrade is invisible at the application layer.
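
A toy model makes the arithmetic concrete. Everything here is a stand-in chosen for illustration: `negotiate`, `reduce_entropy`, and `keystream_xor` are hypothetical helpers, the cipher is SHA-256 in counter mode rather than the real link cipher, the entropy reduction simply zeroes the unused bytes rather than following the specification's reduction, and the attacker is assumed to know the first bytes of the plaintext.

```python
import hashlib
import os


def negotiate(proposed, responder_max, responder_min=1):
    """Responder accepts the smaller of the proposal and its own maximum."""
    size = min(proposed, responder_max)
    return size if size >= responder_min else None


def reduce_entropy(key, n_bytes):
    """Toy reduction: keep n_bytes of the key and zero the rest."""
    return key[:n_bytes] + bytes(len(key) - n_bytes)


def keystream_xor(key, data):
    """Toy stream cipher (SHA-256 in counter mode), NOT the Bluetooth cipher."""
    stream = bytearray()
    counter = 0
    while len(stream) < len(data):
        stream += hashlib.sha256(key + counter.to_bytes(4, "big")).digest()
        counter += 1
    return bytes(a ^ b for a, b in zip(data, stream))


def brute_force(ciphertext, known_prefix):
    """Try all 256 one-byte keys and keep the one that decrypts sensibly."""
    for guess in range(256):
        candidate = bytes([guess]) + bytes(15)
        if keystream_xor(candidate, ciphertext).startswith(known_prefix):
            return candidate
    return None


full_key = os.urandom(16)
honest_size = negotiate(16, responder_max=16)    # what the devices intended
attacked_size = negotiate(1, responder_max=16)   # proposal rewritten 16 -> 1 in flight

session_key = reduce_entropy(full_key, attacked_size)
plaintext = b"unlock door 42"
ciphertext = keystream_xor(session_key, plaintext)

recovered = brute_force(ciphertext, known_prefix=b"unlock")
print(honest_size, attacked_size, recovered == session_key)
```

The 16 byte key still exists on both devices. The attack never touches it; it only decides how much of it is allowed to matter.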

    The devices agreed on a key. They never agreed on how hard that key would be to guess, and the one negotiation they trusted to set that was the one nobody was protecting.

    This is tracked as CVE-2019-9506, scored 8.1 High, and its official description is precise: the specification up to and including version 5.1 permits a sufficiently low encryption key length and does not prevent an attacker from influencing the key negotiation. The researchers tested more than 14 chips from Intel, Broadcom, Apple, and Qualcomm, and nearly all accepted 1 byte of entropy. The full method is documented on the KNOB attack project page. The fix the Bluetooth SIG shipped is a floor: their security notice on key negotiation recommends enforcing a minimum encryption key length of 7 octets so the downgrade has nowhere low to go.

    2. BLESA: spoofing a device on reconnect

    Pairing is the expensive first meeting. Reconnection is the cheap reunion. Once two devices have stored a Long Term Key, every later session is supposed to skip the full handshake and just resume encryption using that stored key. The Bluetooth Low Energy Spoofing Attack, BLESA, comes from researchers at Purdue’s PurSec Lab with EPFL, and it lives entirely in that reconnection step.

    The researchers analyzed the reconnection procedure as written in the specification and found that authentication on reconnect is effectively optional and, worse, poorly enforced by real stacks. When a previously paired device reappears, the client is supposed to insist the connection actually use the keys they share. Instead, several implementations would accept data from a peer that claims to be the known device without the peer proving it holds the Long Term Key. An attacker who has learned the identity the server advertises can therefore impersonate the server, the Acme lock or a fitness sensor, and feed the client spoofed data on reconnect. The client trusts it because reconnection is the step everyone designed to be frictionless.
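
The difference between a permissive and a strict client can be modeled in a short Python sketch. It is an illustration of the policy, not of any real stack: `Peer`, `Client`, and `require_key_proof` are hypothetical names, and the HMAC challenge is a stand-in for starting link encryption with the stored Long Term Key, which is how a real device demonstrates possession.

```python
import hashlib
import hmac
import os
from dataclasses import dataclass
from typing import Optional


@dataclass
class Peer:
    """A device answering a reconnection. `ltk` is None for an impostor."""
    address: str
    ltk: Optional[bytes]

    def prove_key(self, challenge: bytes) -> bytes:
        # An impostor has no LTK, so it can only answer with a guess (zeros here).
        key = self.ltk if self.ltk is not None else bytes(16)
        return hmac.new(key, challenge, hashlib.sha256).digest()


class Client:
    def __init__(self, bonded_address: str, ltk: bytes, require_key_proof: bool):
        self.bonded_address = bonded_address
        self.ltk = ltk
        self.require_key_proof = require_key_proof

    def accept_data(self, peer: Peer, value: bytes) -> Optional[bytes]:
        """Return the value if the client trusts it, otherwise None."""
        if peer.address != self.bonded_address:
            return None
        if self.require_key_proof:
            challenge = os.urandom(16)
            expected = hmac.new(self.ltk, challenge, hashlib.sha256).digest()
            if not hmac.compare_digest(peer.prove_key(challenge), expected):
                return None  # peer claimed the address but not the key
        return value


ltk = os.urandom(16)
address = "AA:BB:CC:DD:EE:FF"  # hypothetical bonded address
real_lock = Peer(address, ltk)
spoofed_lock = Peer(address, None)  # cloned address, no key

lax = Client(address, ltk, require_key_proof=False)
strict = Client(address, ltk, require_key_proof=True)

print(lax.accept_data(spoofed_lock, b"state=unlocked"))     # accepted
print(strict.accept_data(spoofed_lock, b"state=unlocked"))  # rejected
```

The permissive client checks only the claimed address, which anyone in range can copy. The strict client makes the stored key do the work it was stored for.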

    The work won a Best Paper award at the USENIX Workshop on Offensive Technologies in 2020, and the paper estimates it could affect well over a billion devices. The reconnection assumption is the dangerous part: the whole point of storing an LTK was to avoid repeating the full handshake, and skipping the proof that the peer still holds that key is exactly the hole. The full analysis is in the BLESA paper at USENIX WOOT 2020.

    3. Just Works: encryption with nobody on the other end verified

    Just Works is not a bug. It is a documented model that does precisely what its design says, which is to encrypt without authenticating. The trouble is what that combination means against an active attacker. Because no value is compared and no secret is shared out of band, there is nothing in the handshake that distinguishes the real Acme lock from a radio impersonating it. An attacker who is present during pairing can sit between the phone and the lock, pair with each side separately, and relay between them. Both ends get an encrypted channel. Both encrypted channels terminate at the attacker.

    This is the classic man in the middle, and Just Works is open to it by construction. The reason it is everywhere is hardware economics: a sensor with no screen and no keypad cannot run Passkey Entry or Numeric Comparison, so the capability negotiation falls through to Just Works as the only option both devices support. The defense is not to disable encryption but to refuse Just Works where it matters, by requiring the MITM protection flag during pairing so a device that can only offer Just Works is rejected for sensitive functions rather than silently accepted.

    It helps to be precise about what Just Works does and does not give you, because the marketing word is encryption and people stop reading there. The channel is encrypted, so a purely passive listener under LE Secure Connections cannot simply read the plaintext off the air. What is missing is any guarantee about who sits at the other end of that encrypted channel. Encryption answers the question: is this traffic readable by an outsider? Authentication answers the question: am I talking to the device I think I am? Just Works answers only the first, and an active attacker exploits the gap between the two by being the device you think you are. The phone encrypts faithfully to the attacker, and the attacker encrypts faithfully to the lock, and both sides report an encrypted link the whole time.
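
The relay is easy to see with an unauthenticated Diffie Hellman exchange. The sketch below uses a toy multiplicative group so it runs on the standard library alone; it is not P-256 and not the Bluetooth key derivation, and the names `keypair` and `shared_key` are hypothetical.

```python
import hashlib
import secrets

# Toy parameters for illustration only. Real LE Secure Connections uses ECDH on P-256.
P = 2**127 - 1  # a Mersenne prime
G = 3


def keypair():
    private = secrets.randbelow(P - 2) + 1
    return private, pow(G, private, P)


def shared_key(private, peer_public):
    secret = pow(peer_public, private, P)
    return hashlib.sha256(secret.to_bytes(16, "big")).digest()[:16]


phone_priv, phone_pub = keypair()
lock_priv, lock_pub = keypair()

# The attacker runs one exchange toward each victim.
to_phone_priv, to_phone_pub = keypair()
to_lock_priv, to_lock_pub = keypair()

# Each victim receives the attacker's public key in place of the real peer's.
# With no value to compare, neither can tell the difference.
phone_key = shared_key(phone_priv, to_phone_pub)
lock_key = shared_key(lock_priv, to_lock_pub)

# The attacker derives both keys and can decrypt, read, and re-encrypt in between.
attacker_phone_key = shared_key(to_phone_priv, phone_pub)
attacker_lock_key = shared_key(to_lock_priv, lock_pub)

print(phone_key == attacker_phone_key, lock_key == attacker_lock_key)
print(phone_key == lock_key)
```

The key agreement works perfectly on both legs. What Numeric Comparison adds is a value derived from both public keys that the user checks on both screens, and that value would differ on the two legs here.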

    4. Passive sniffing of LE legacy pairing

    The first three attacks need an active radio in the conversation. The last one just listens. Mike Ryan’s tool crackle attacks LE legacy pairing offline, and it works because of the Temporary Key. Under Just Works the TK is zero. Under Passkey Entry the TK is the six digit value, a number in the range 0 to 999999 padded out to 128 bits, which sounds like a lot until you count it: one million possibilities is a number a laptop chews through in under a second.

    An attacker captures the legacy pairing exchange off the air, including the confirm values the two devices send. Then they compute the confirm value for every candidate TK and keep the one that matches what they captured. Recovering the TK unwinds the rest: the TK yields the Short Term Key, the STK protects the handover of the Long Term Key, and once the LTK is in hand the attacker decrypts the entire session and every future session that reuses it. No injection, no jamming, just a recording and a brute force. The technique is described in the crackle project on GitHub. The single line of defense is the generational one: LE Secure Connections replaces the guessable Temporary Key with a Diffie Hellman exchange that produces nothing for crackle to brute force.
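
The search itself is a plain loop. In the sketch below, `confirm` is a stand-in built on SHA-256 so the example needs only the standard library; the real legacy confirm function is built on AES-128 keyed with the TK. The helper names, the `pairing_info` placeholder, and the example passkey are all assumptions for illustration, and this is not crackle's code.

```python
import hashlib
import os


def confirm(tk: int, rand: bytes, pairing_info: bytes) -> bytes:
    """Stand-in for the legacy confirm function (the real one uses AES-128)."""
    tk_bytes = tk.to_bytes(16, "big")  # six digit value padded out to 128 bits
    return hashlib.sha256(tk_bytes + rand + pairing_info).digest()[:16]


def recover_tk(captured_confirm: bytes, rand: bytes, pairing_info: bytes,
               max_tk: int = 999_999):
    """Try every possible TK and return the one that reproduces the capture."""
    for candidate in range(max_tk + 1):
        if confirm(candidate, rand, pairing_info) == captured_confirm:
            return candidate
    return None


# What the sniffer records: the random value and the confirm value are sent
# over the air, and the pairing PDUs and addresses are visible in the capture.
rand = os.urandom(16)
pairing_info = b"pairing request, pairing response, addresses"  # placeholder bytes
secret_passkey = 271828  # hypothetical passkey the user typed
captured = confirm(secret_passkey, rand, pairing_info)

print(recover_tk(captured, rand, pairing_info))

# Under Just Works the TK is zero, so the very first candidate matches.
captured_just_works = confirm(0, rand, pairing_info)
print(recover_tk(captured_just_works, rand, pairing_info))
```

Nothing in the loop depends on being present during the session. The capture can be attacked at any later time, which is why a recording of one legacy pairing is enough.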

    Why this lands hard on IoT

    These attacks would be academic if the affected devices were just headphones. They are not. Bluetooth Low Energy is the radio of choice for the cheapest, longest lived, least patched hardware in circulation, and that population maps almost exactly onto the weaknesses above.

    Smart locks are the sharpest example. A lock that uses Just Works, or that does not enforce authenticated reconnection, can be spoofed or relayed by an attacker in radio range, and the failure mode is a door that opens. Medical devices such as glucose monitors and insulin pumps carry data and sometimes control that a spoofing or eavesdropping attack turns into a safety problem, not just a privacy one. Wearables and fitness sensors leak a continuous stream of personal data over links that frequently fall back to Just Works because the band on your wrist has no keypad. And trackers, the small tags people attach to keys and bags, are designed to be silent and to reconnect automatically, which is precisely the reconnection behavior BLESA abuses.

    The common thread is constraint. These devices are too small for a screen, too cheap for careful firmware, and too long lived to be reliably updated, so they default to the weakest models and the oldest pairing generation. The economics that make them cheap are the same economics that make them vulnerable.

    There is a patching problem layered on top. When CVE-2019-9506 landed, the operating system vendors that ship general purpose devices, phones and laptops, pushed enforcement of a minimum key length fairly quickly. A standalone Acme lock or a budget fitness band has no such pipeline. Its firmware was flashed once at the factory and may never be touched again, and many such products have no mechanism to update the Bluetooth stack at all. So a weakness in the specification does not just affect devices for a patch cycle; it affects them for the entire service life of hardware that was never built to be fixed. An attacker does not need a fresh vulnerability against this population. The old ones never closed.

    Closing the gaps

    Every attack above has a corresponding control, and they stack. None of them requires inventing anything; they require refusing the weak defaults the specification still permits for compatibility.

    Require LE Secure Connections

    This is the single highest leverage change. Secure Connections mode replaces the legacy Temporary Key and STK chain with ECDH key agreement, which removes the guessable secret that crackle brute forces and strengthens the foundation under every association model. Devices can refuse to pair in legacy mode, and security sensitive products should. The legacy fallback exists for old peers; a lock or a medical device has no business honoring it.

    Enforce a minimum key length

    KNOB works because the entropy floor sits at 1 byte. Following the Bluetooth SIG guidance and enforcing a minimum encryption key length, 7 octets for BR/EDR, means an attacker who rewrites the negotiation cannot push it down to a brute forceable size. Platform vendors shipped exactly this enforcement after CVE-2019-9506, and devices should reject any negotiated key below the floor rather than accept whatever the negotiation lands on.
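
As a policy check this is a one line comparison, and the point is where it sits: after negotiation, before encryption starts. The sketch below assumes the 7 octet floor from the guidance above; the constant and function names are hypothetical.

```python
MIN_KEY_SIZE_OCTETS = 7    # floor recommended in the guidance quoted above
MAX_KEY_SIZE_OCTETS = 16


def accept_negotiated_key_size(negotiated: int,
                               floor: int = MIN_KEY_SIZE_OCTETS) -> bool:
    """Fail closed: refuse any negotiated size below the configured floor."""
    if not 1 <= negotiated <= MAX_KEY_SIZE_OCTETS:
        raise ValueError(f"key size out of range: {negotiated}")
    return negotiated >= floor


for size in (1, 6, 7, 16):
    print(size, accept_negotiated_key_size(size))
```

A product that has no old peers to support can raise the floor to 16 and remove the negotiation from the attacker's reach entirely.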

    Mandate authenticated reconnection

    BLESA exists because reconnection skipped the proof that the peer still holds the shared key. The fix is to make that proof mandatory: on every reconnect, require the link to actually use the stored Long Term Key and reject a peer that cannot demonstrate it. The convenience of a frictionless reunion is not worth accepting data from a device that never proved it is the one you paired with.

    Set the MITM protection flags

    During pairing, devices exchange authentication requirement flags, and one of them requests MITM protection. Setting it forces the capability negotiation toward an authenticated model, Passkey Entry, Numeric Comparison, or Out of Band, instead of letting it slide into Just Works. A device that can only offer Just Works then fails closed for sensitive operations rather than getting an unauthenticated channel by default. Pair this with Out of Band where you have a side channel like NFC, and the over the air handshake gets anchored to a secret the remote attacker never hears.

    These controls reinforce one another. Secure Connections kills the offline brute force, the key length floor kills the entropy downgrade, authenticated reconnection kills the spoof, and the MITM flag keeps the whole thing from quietly falling back to the model that authenticates nobody. The same discipline that protects a Bluetooth lock applies to the firmware underneath it: if the device's boot chain also has to be trusted, a secure boot process that verifies each stage is the embedded sibling of these radio side defenses.

    The assumption that breaks

    Strip the four attacks down and they share one root. Pairing assumes that the two devices negotiating the key are the only two in the conversation, and that the negotiation about the key is itself trustworthy. Both halves of that assumption fail. Just Works and weak reconnection break the first half, because a third radio can insert itself into a handshake that never proves who is on the other end. KNOB breaks the second half directly: the one negotiation that decides how strong the key will be is the one nobody bothered to authenticate, so an attacker edits it in transit and both victims sign off on a key they would never have chosen.

    The flaw is never a broken cipher. The ciphers are fine. The flaw is a step that was trusted without being checked: an entropy field nobody signed, a reconnection where nobody proved the key again, a model that encrypts to whoever shows up. That gap between what a protocol assumes about its participants and what an attacker can actually arrange in the same radio band is the kind of weakness you find by asking what each step trusts and why it still trusts it, rather than by scanning for a known bad signature. It is exactly the kind of assumption an autonomous researcher built to test assumptions is meant to surface. Learn more about that approach on our about page. Require Secure Connections, floor the key length, prove the peer on reconnect, and never let the handshake fall back to trusting a stranger.

    Frequently asked questions

    What is the KNOB attack and what CVE tracks it?

    KNOB, the Key Negotiation of Bluetooth attack, lets a nearby attacker rewrite the encryption key length negotiation between two BR/EDR devices because that negotiation is unauthenticated, forcing a key with as little as 1 byte (8 bits) of entropy that is then trivially brute forced. It is tracked as CVE-2019-9506, scored 8.1 High, and the full method is documented on the KNOB attack project page.

    How does BLESA spoof a Bluetooth Low Energy device?

    BLESA, the Bluetooth Low Energy Spoofing Attack, abuses reconnection. After two devices pair and store a Long Term Key, later sessions resume without a full handshake, and the researchers found that authentication on reconnect is optional and poorly enforced in real stacks. An attacker can impersonate a previously paired device and feed spoofed data to the client. The analysis is in the BLESA paper from USENIX WOOT 2020.

    Why is the Just Works association model insecure?

    Just Works is the pairing model for devices with no screen or keypad, and it provides encryption without authentication. Because no value is compared and no secret travels out of band, nothing in the handshake distinguishes the real device from an impostor, so an attacker present during pairing can sit in the middle, pair with each side, and relay between them. The model is described in the Bluetooth SIG security overview.

    Can someone decrypt Bluetooth LE by just listening?

    Yes, against LE legacy pairing. Mike Ryan’s tool crackle recovers the Temporary Key, which is zero under Just Works or a value from 0 to 999999 under the six digit models, by brute forcing all candidates against captured confirm values in under a second. The recovered key unwinds the Short Term Key and then the Long Term Key, decrypting the whole session. The fix is LE Secure Connections. See the crackle project on GitHub.

  • What Is Sigreturn Oriented Programming and Why One Gadget Owns the CPU

    Sigreturn oriented programming is a binary exploitation technique that turns one tiny piece of borrowed code into total control of the CPU. On Linux, when a signal is delivered, the kernel writes a snapshot of every register onto the user stack and trusts that snapshot completely when the handler returns. An attacker who can write to that stack forges the snapshot, triggers the return path, and the kernel obediently loads attacker chosen values into rax, rdi, rsp, rip, and the rest, all in a single step. Where a normal exploit hunts for a dozen scattered gadgets to set up one system call, this one needs almost nothing. This post walks the mechanism one step at a time: how a signal frame gets onto the stack, why the kernel never checks whether it is genuine, how a forged frame becomes a syscall chain that spawns a shell, and what actually stops it.

    What sigreturn oriented programming actually exploits

    The whole technique rests on one feature of how Unix systems deliver signals, and on one assumption the kernel makes about that feature. When a process receives a signal, the kernel does not just jump to the handler and forget where it was. It first saves the entire interrupted execution state so that, after the handler runs, the process can pick up exactly where it left off. That saved state is the signal frame, and it lives on the user stack, in memory the process can read and write like any other.

    The frame is not a vague summary of the process. It is a full register dump. On x86-64 Linux the saved context, a structure the kernel calls a ucontext wrapping a sigcontext, holds the values of the general purpose registers, the stack pointer, the instruction pointer, and the flags. Every register that defines what the CPU will do next is sitting there in plain memory, written by the kernel, waiting to be put back. The technique was first described in full by Erik Bosman and Herbert Bos of Vrije Universiteit Amsterdam in their 2014 paper, which named the saved frame as the entire attack surface.

    How signal delivery sets up the frame

    Walk the normal, benign path first so the abuse is obvious later. A process is running. A signal arrives, say SIGALRM or SIGSEGV. The kernel pauses the process, builds the signal frame on the user stack, and arranges for the registered handler to run. The handler does its work. When it returns, it does not return like an ordinary function. Instead, control flows to a small trampoline that invokes a special system call named rt_sigreturn.

    That syscall has exactly one job: take the signal frame currently on top of the stack and restore the process from it. The kernel reads the saved ucontext, copies every saved register back into the live CPU registers, restores the signal mask, and resumes execution at the saved instruction pointer. As the manual for sigreturn(2) puts it, the call restores the process context, meaning the processor flags and registers, including the stack pointer and the instruction pointer. After it runs, the process is bit for bit back where it was before the signal, and none the wiser.

    Here is the load bearing detail. The kernel does not keep its own private, trusted copy of that frame. It put the frame on the user stack, and when rt_sigreturn runs, it reads the frame back from the user stack. It does not check a cookie. It does not verify that a signal was ever actually delivered. It does not confirm that the bytes it is about to load are the same bytes it wrote. It reads whatever is at the top of the stack, interprets those bytes as a saved register set, and loads them into the CPU. The kernel assumes that a frame on the stack is one the kernel itself placed there. That assumption is the whole game.

    The forged frame

    Now suppose an attacker has a stack write, the classic precondition for any return oriented attack: a buffer overflow, a format string write, or any primitive that lets them lay out bytes on the stack and steer the return address. Instead of building a long chain of return addresses the way classic return oriented programming does, they write something simpler. They write a fake signal frame.

    The layout is fixed and public, so forging it is mechanical rather than clever. The attacker fills in the saved register slots with the exact values they want the CPU to hold: a chosen rip to control where execution goes, a chosen rsp to control the stack, a chosen rax to select a system call, and chosen argument registers rdi, rsi, and rdx to fill in that call’s parameters. They do not have to find a gadget that loads each register one at a time. They just write the value they want into the slot that the kernel will copy into that register. Tooling makes this trivial in practice. The pwntools exploitation library ships a SigreturnFrame class that builds the byte layout for you, so a practitioner writes frame.rdi = ... and frame.rip = ... rather than memorizing offsets.

    With the fake frame in place, the attacker needs only to make the program execute rt_sigreturn. On x86-64 that means getting the syscall number 15, which is 0xf, into rax and then reaching a syscall instruction. The kernel sees the syscall, treats the top of the stack as a genuine signal frame, and loads every forged register at once. One step, and the entire CPU state belongs to the attacker.

    The kernel does not ask whether the signal frame is real. It reads the stack, trusts the bytes, and loads them into every register the CPU has.

    Why one gadget is enough

    To appreciate why this technique matters, contrast it with the attack it descends from. Classic return oriented programming chains together short instruction sequences that already exist in the target binary, each ending in a ret, to assemble behavior the attacker wants without injecting any new code. To set up a single system call that way, you typically need a gadget to load rdi, another for rsi, another for rdx, another for rax, and then a syscall. If the binary is small or stripped down, some of those gadgets may simply not exist, and the whole approach stalls. Gadget availability is the limiting factor.

    Sigreturn oriented programming collapses all of that into one move. It does not load registers one at a time from scattered gadgets. It loads the entire register set in a single rt_sigreturn, sourced from a frame the attacker fully controls. That has three consequences that make it unusually strong.

    It is close to universal

    The rt_sigreturn path is part of the kernel’s signal machinery, not a quirk of any one program. The two ingredients the attacker needs, a way to set rax to 15 and a syscall instruction, are minimal and turn up almost everywhere. Bosman and Bos titled their paper around portability for this reason: an exploit built on signal frames travels across different binaries with little or no change, because it leans on a syscall convention rather than on whatever odd gadgets a particular binary happens to contain. Where classic chains are bespoke to each target, a sigreturn payload is close to write once.

    It barely cares about gadget scarcity

    Because the register values come from the forged frame rather than from gadgets, a binary that is too lean for a normal return oriented chain can still fall to this one. You are no longer searching the binary’s instruction stream for a way to control rsi. You wrote rsi directly into the frame. The attack sidesteps the exact scarcity that defeats classic chains, which is why it is so often the answer when a target offers almost nothing to work with.

    It hands you full register control

    Setting registers precisely is the hard part of many exploits, and here it is free. One rt_sigreturn sets all of them to known values in one shot, which makes the next step, invoking a system call with carefully chosen arguments, completely deterministic.

    Chaining syscalls into a shell

    The payoff of full register control is the ability to make any system call you like with any arguments you like. The canonical goal is a shell, which on x86-64 means calling execve("/bin/sh", NULL, NULL). The syscall number for execve is 59, which is 0x3b.

    The attacker forges a signal frame whose saved registers describe that call exactly. They set rax to 59 to select execve. They set rdi to the address of the string /bin/sh in memory. They set rsi and rdx to zero for the empty argument and environment pointers. Crucially, they set the saved rip to the address of a syscall instruction. When rt_sigreturn restores this frame, every one of those registers snaps into place and execution jumps straight to the syscall, which now runs execve with the attacker’s arguments. A shell pops.
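
Built by hand, the frame is a flat run of 64 bit slots. The sketch below packs one with the standard library. The slot order is my reading of the x86-64 Linux layout (it mirrors what exploitation tooling emits) and should be checked against the target's kernel headers. `build_frame`, `BINSH_ADDR`, and `SYSCALL_ADDR` are hypothetical, and the addresses are placeholders rather than values from a real binary.

```python
import struct

# Assumed slot order of the x86-64 signal frame, one 8 byte value per slot.
FRAME_SLOTS = [
    "uc_flags", "uc_link", "ss_sp", "ss_flags", "ss_size",
    "r8", "r9", "r10", "r11", "r12", "r13", "r14", "r15",
    "rdi", "rsi", "rbp", "rbx", "rdx", "rax", "rcx", "rsp", "rip",
    "eflags", "csgsfs", "err", "trapno", "oldmask", "cr2",
    "fpstate", "reserved", "sigmask",
]

SYS_EXECVE = 59  # 0x3b on x86-64


def build_frame(**registers):
    """Pack a forged frame; unspecified slots are zero."""
    values = {name: 0 for name in FRAME_SLOTS}
    values["csgsfs"] = 0x33  # user mode code segment selector
    for name, value in registers.items():
        if name not in values:
            raise KeyError(f"unknown frame slot: {name}")
        values[name] = value
    return struct.pack(f"<{len(FRAME_SLOTS)}Q",
                       *(values[name] for name in FRAME_SLOTS))


BINSH_ADDR = 0x404000    # hypothetical address of the "/bin/sh" string
SYSCALL_ADDR = 0x401000  # hypothetical address of a syscall instruction

frame = build_frame(
    rax=SYS_EXECVE,   # select execve
    rdi=BINSH_ADDR,   # first argument: path
    rsi=0,            # second argument: argv = NULL
    rdx=0,            # third argument: envp = NULL
    rip=SYSCALL_ADDR, # resume at a syscall instruction
)
print(len(frame), hex(struct.unpack_from("<Q", frame, 0x90)[0]))
```

The whole "gadget chain" for the system call is five assignments. The only parts that depend on the target are the two addresses.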

    Often a single sigreturn is not the end but a stage. A common pattern when there is nowhere known to put the /bin/sh string, or no executable place to land, is to chain frames. The first forged frame makes a syscall such as read, to place attacker data at a known writable location, or mmap, to create such a location, and it sets the saved rsp so that when that syscall returns, the stack is positioned at the next forged frame, which performs the next step. This works when the saved rip points at a syscall instruction that is followed by a ret, so that control comes back to the attacker's stack after the call. Each rt_sigreturn both performs a syscall and repositions the stack for the one after it, so a series of frames becomes a syscall chain that does setup work and then spawns the shell.

    This chaining is why the technique is so flexible in practice. The saved rsp in each frame is the thread that ties the stages together: it lets the attacker walk the stack pointer forward through a prepared sequence of frames without needing any gadget that adjusts the stack. A frame can call mprotect to make a writable region executable, then the next frame can jump into freshly written shellcode, then a final frame can clean up. The attacker is, in effect, scripting the kernel’s own restore path into a small virtual machine where each instruction is one forged frame and one syscall. That is a long way from the brittle, binary specific gadget hunting that classic chains demand.

    Where the syscall and the string come from

    The technique still needs two concrete addresses: somewhere to find a syscall instruction to put in the saved rip, and somewhere to find or place the /bin/sh string. This is where a known fixed location matters. Historically the vsyscall page on x86-64 Linux sat at the constant address 0xffffffffff600000 and was executable, which gave attackers a syscall gadget at a hardcoded spot regardless of address randomization. An mmap region created with a fixed address, or any leaked address that reveals where executable bytes and writable memory live, serves the same purpose. The sigreturn frame supplies the registers, but the attacker still has to point rip at real executable code, so a stable or leaked location is the other half of the recipe. This dependence on a known address is the same kind of memory layout problem you see in a kernel use after free, where control of where a stale object lives is what turns a dangling reference into a write primitive.

    How it relates to and differs from classic ROP

    It helps to be precise about the family relationship. Both classic return oriented programming and the sigreturn variant are code reuse attacks. Neither injects new executable code into the process, which is the point: they defeat the no execute protections that made plain shellcode on the stack stop working. Both rely on the attacker controlling the stack and the return address. So far they are siblings.

    The difference is where the register values come from. Classic chains source each register value from a separate gadget already present in the binary, then string those gadgets together with return addresses, so the chain’s power is bounded by what gadgets the binary contains. The sigreturn variant sources every register value from a single forged data structure that the kernel itself will faithfully load, so its power is bounded only by the attacker’s ability to write a frame and trigger one syscall. One is a sequence of borrowed instructions. The other is a single borrowed kernel mechanism that hands over the whole CPU at once. In CTF and real exploitation practice the two are routinely combined: a short classic chain sets rax to 15 and reaches a syscall, and that single act detonates the forged frame.

    Mitigations

    Because the flaw is an assumption rather than a memory bug, the defenses are a layered set rather than a single patch. None of them is a silver bullet, and one common belief about defense is simply wrong.

    Signal cookies, the direct fix

    The most targeted defense is the one Bosman and Bos proposed in the original paper: a signal cookie, sometimes called a sigreturn cookie. The idea is to make the kernel able to tell its own frames from forged ones. When the kernel writes a real signal frame, it also stores a value derived from a random secret combined with the address where the frame sits, in effect a canary bound to that stack location. On rt_sigreturn the kernel recomputes and checks that value before trusting the frame. An attacker who forges a frame cannot produce the right cookie without knowing the secret, so the forged frame is rejected. This directly attacks the trusted bytes problem at its root, and variants of this mitigation have appeared in some kernels. The elegance of the approach is that it changes the trust model rather than the layout: the kernel stops assuming that a frame on the stack is its own and starts proving it, which is exactly the assumption the attack abused.
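
    The check can be sketched in a few lines. This is a toy model, not kernel code: it assumes a secret the attacker cannot read and uses HMAC-SHA256 as the step that binds the secret to the frame address. That choice is illustrative and is not the construction from the paper or from any kernel. The names make_cookie and frame_is_genuine are hypothetical.

```python
import hashlib
import hmac
import os
import struct

# Stand-in for a secret held where user space cannot read it.
KERNEL_SECRET = os.urandom(32)


def make_cookie(frame_addr, secret=KERNEL_SECRET):
    """Cookie written next to a genuine frame, bound to that frame's address."""
    message = struct.pack("<Q", frame_addr)
    return hmac.new(secret, message, hashlib.sha256).digest()[:8]


def frame_is_genuine(frame_addr, stored_cookie, secret=KERNEL_SECRET):
    """Recompute the cookie at sigreturn time and compare in constant time."""
    return hmac.compare_digest(make_cookie(frame_addr, secret), stored_cookie)
```

    A cookie copied from a real frame at one address fails at any other address, and a frame the attacker wrote from scratch has no valid cookie at all.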

    Vsyscall emulation and reduced fixed locations

    The old executable vsyscall page at its constant address was a gift to attackers, so modern kernels emulate it rather than letting code execute there directly. Since Linux 3.3 an attempt to run instructions in that page traps instead of executing, which removes one reliable, ASLR-proof source of a syscall gadget. This does not stop the technique, but it removes a convenient fixed foothold and forces the attacker to find an executable address some other way.
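
    If you want to see how a given machine exposes that page, the process memory map lists it under the name [vsyscall] when it is present. The sketch below only parses map text and reports the start address and permission string. The helper name vsyscall_entry is hypothetical, and what the permissions mean for a particular kernel configuration is left for you to check against that kernel’s documentation.

```python
def vsyscall_entry(maps_text):
    """Return (start_address, perms) for the [vsyscall] line, or None if absent."""
    for line in maps_text.splitlines():
        fields = line.split()
        if fields and fields[-1] == "[vsyscall]":
            start, _, _end = fields[0].partition("-")
            return int(start, 16), fields[1]
    return None


def read_own_maps(path="/proc/self/maps"):
    """Read this process's memory map; empty string where /proc is unavailable."""
    try:
        with open(path) as handle:
            return handle.read()
    except OSError:
        return ""


if __name__ == "__main__":
    print(vsyscall_entry(read_own_maps()))
```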

    Control flow integrity and hardware shadow stacks

    Control flow integrity aims to ensure that indirect control transfers only land at intended targets, which constrains the return oriented building blocks the attacker stitches together. At the hardware level, Intel’s Control-flow Enforcement Technology (CET) adds a shadow stack: a protected copy of return addresses that the CPU checks, so a corrupted return address on the normal stack no longer redirects execution unnoticed. These raise the cost of the surrounding chain that gets you to the syscall in the first place, though they target control flow hijacking broadly rather than the signal frame trust specifically.

    Full RELRO and hardening the rest of the path

    Defenses that close off the primitives an attacker uses to reach rt_sigreturn matter too. Full RELRO maps the global offset table read only after startup, removing a popular write target that exploits use to hijack control flow, which makes the initial stack write or redirect harder to obtain. Hardening the overflow or write primitive that the attack depends on shrinks the opening before signal frames ever enter the picture.

    The ASLR misconception

    It is tempting to assume address space layout randomization stops this. It does not, not on its own. Randomization hides where code and the stack live, which raises the bar, but the sigreturn technique only needs the attacker to write a frame to a stack they already control and to point rip at one executable address. Once any information leak gives up a single address inside a mapping, every other address in that mapping follows by fixed offset, and randomization is spent for that region. Treat ASLR as one delaying layer that a leak cancels, not as a defense against forged signal frames. This is the same trap as trusting a design because it merely looks leak resistant: the moment one address escapes, the assumption underneath collapses.
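
    The arithmetic behind that claim is short enough to write down. The sketch assumes the attacker knows two offsets inside the same mapped image, which they can read from their own copy of the binary or library: the offset of whatever was leaked and the offset of the instruction they want. The function name rebase and the example addresses are hypothetical.

```python
PAGE_MASK = 0xFFF


def rebase(leaked_runtime_addr, leaked_symbol_offset, target_offset):
    """Turn one leaked address into the runtime address of another location."""
    base = leaked_runtime_addr - leaked_symbol_offset
    if base & PAGE_MASK:
        # Mappings start on page boundaries, so a misaligned base means bad offsets.
        raise ValueError("computed base is not page aligned")
    return base + target_offset


# One leaked pointer is enough to place a syscall gadget in the same image.
gadget = rebase(0x7F12_3456_7ABC, 0x7ABC, 0x1234)
```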

    The assumption that breaks

    Strip away the frame layouts and the syscall numbers and one assumption is left holding everything up. The kernel assumes that a signal frame sitting on the user stack is one the kernel itself put there. The rt_sigreturn path was designed as the kernel’s private return road, a way to undo a context switch it had performed moments earlier, and it was built on the premise that only the kernel ever lays a frame down. So it reads the bytes and loads them into every register without a second look. The attacker never breaks that mechanism. They simply place a frame of their own on the stack and let the kernel do exactly what it was always going to do.

    The bug is not a corrupt syscall or a broken handler. The bug is a trust boundary drawn in the wrong place: the kernel trusted the contents of memory it had already handed to the process, and that memory is precisely what an attacker controls. That gap, between what a system assumes about who wrote some bytes and who actually can, is the kind of flaw you find by asking what each component trusts and why it still trusts it, rather than by scanning for a known bad pattern. It is exactly the kind of assumption an autonomous researcher built to test assumptions is meant to surface. Verify the frames you restore, narrow the fixed locations an attacker can lean on, and remember that a leak turns randomization back into a known address. Learn more about that approach on our about page.

    Frequently asked questions

    What makes sigreturn oriented programming so powerful?

    When a signal is delivered, the Linux kernel saves a full register snapshot, the signal frame, onto the user stack, and the rt_sigreturn syscall restores every register from it without checking that the frame is genuine. An attacker who controls the stack forges that frame and sets rax, rdi, rsp, and rip all at once with a single gadget, instead of hunting for one gadget per register the way classic chains do. The technique was introduced by Erik Bosman and Herbert Bos in Framing Signals: A Return to Portable Shellcode.

    How does SROP differ from classic ROP?

    Both are code reuse attacks that need a stack write and never inject new code, so they survive no execute protections. The difference is where register values come from. Classic return oriented programming sources each value from a separate gadget already in the binary, so it is limited by gadget availability. The sigreturn variant sources every value from one forged signal frame the kernel faithfully loads, so it works even on lean binaries. See the overview of sigreturn oriented programming for the comparison.

    How does a forged frame spawn a shell?

    The attacker builds a signal frame whose saved registers describe execve("/bin/sh", NULL, NULL): rax set to 59, rdi pointing at the /bin/sh string, rsi and rdx set to zero, and the saved rip aimed at a syscall instruction. Triggering rt_sigreturn with syscall number 15 loads the whole frame and runs the call. The SigreturnFrame helper in pwntools builds this byte layout automatically.

    Does ASLR stop sigreturn oriented programming?

    Not on its own. Address randomization hides where code and the stack live, but the attack only needs to write a frame to a stack the attacker already controls and to point rip at one executable address. A single information leak reveals that address and the layout is known. The real fix is a signal cookie that binds a secret to the frame’s stack location, the mitigation described in the sigreturn(2) manual and the original research, so the kernel can reject forged frames.

  • How a Container Escape Works: The cgroups v1 release_agent Technique

    How a Container Escape Works: The cgroups v1 release_agent Technique

    A container escape happens when a process running inside a container breaks out of its restricted view of the system and starts acting on the host directly, usually as root. The reason this is even possible comes down to one fact people forget: a container is not a virtual machine. It is an ordinary Linux process that the kernel has wrapped in restricted namespaces and cgroups, and that process shares the exact same kernel as the host and every other container on the box. There is no hypervisor in between. This post walks one specific, real escape end to end, the cgroups v1 release_agent technique tracked as CVE-2022-0492, and then steps back to the wider family of escapes that all rely on the same shared kernel boundary.

    Why a container escape is possible at all

    Start with what a container actually is, because the whole escape hinges on it. When you run docker run or start a pod, you do not boot a second machine. The kernel takes a normal process and changes what it can see. Namespaces give it a private view of process IDs, mount points, network interfaces, and user IDs, so from inside it looks like the process owns the system. Cgroups (control groups) cap how much CPU, memory, and IO it can use. Capabilities and seccomp filters trim which privileged operations and syscalls it is allowed to make. Stack those together and you get isolation that feels like a separate machine.

    But every one of those layers is enforced by the same kernel the host runs. A virtual machine gets a virtual CPU and virtual hardware from a hypervisor, and the guest kernel is genuinely separate; to escape a VM you have to defeat the hypervisor itself. A container has none of that. The isolation is just bookkeeping inside one shared kernel. So if you can reach a kernel interface that was never namespaced, or you hold a capability that the kernel trusts more than it should, or the kernel has a bug you can hit from inside, the boundary is not a wall. It is a convention, and conventions can be talked out of.

    That is the mental model for the rest of this post. The container assumes the kernel will keep enforcing its restricted view. CVE-2022-0492 is what happens when one kernel interface forgets to check who is allowed to touch it.

    It is worth being precise about capabilities here, because the whole vulnerability turns on a subtlety in how they work. A capability is a slice of root’s power that the kernel can hand out one piece at a time instead of granting everything at once. CAP_NET_BIND_SERVICE lets a process bind a low port. CAP_SYS_ADMIN is the grab bag that covers mounting filesystems, setting hostnames, and a long list of other administrative actions, which is why it is often described as the new root. A default container runtime hands a container a deliberately small set and drops the dangerous ones. The kernel then checks, at the moment of each privileged action, whether the calling process holds the capability that action requires. The escape we are about to walk is, at bottom, a story about the kernel failing to ask whether the caller’s CAP_SYS_ADMIN was the host’s real one or a namespace local copy.

    The cgroup v1 release_agent mechanism

    To understand the escape you first have to understand a perfectly legitimate cgroups feature that was never meant to be reachable from inside a container.

    What release_agent and notify_on_release do

    In cgroups version 1, two special files matter here. The first is notify_on_release, a flag set to 0 or 1 that every control group carries. The second is release_agent, which exists only at the root of a cgroup hierarchy and holds a path to a program. The deal is simple. When notify_on_release is set to 1 on a cgroup and the last process in that cgroup exits, leaving it empty, the kernel runs the program named in release_agent to clean up. This is a real housekeeping mechanism documented in the cgroups(7) manual page. It exists so userspace can react when a group empties out.

    The critical detail is who runs that program and where. The kernel invokes the release_agent binary itself, from the host context, as a fully privileged root process with all capabilities, in the host’s namespaces. It is not run inside the container. It is run by the kernel on the host. So if an attacker inside a container can write a path of their choosing into a release_agent file and then cause a cgroup to empty, the kernel will execute their chosen program as root on the host. That is the entire escape in one sentence. Everything else is about getting permission to write that file.

    The capability that was supposed to guard it

    Writing to release_agent is obviously dangerous, so the kernel gates it behind a capability. The relevant capability is CAP_SYS_ADMIN, the broad administrative capability that container runtimes strip from containers by default precisely because it is so powerful. A normal Docker or Kubernetes container does not hold CAP_SYS_ADMIN, so under default settings it cannot write release_agent, and the housekeeping feature stays a housekeeping feature.

    For years that was the assumed boundary. If you wanted to abuse release_agent, you needed CAP_SYS_ADMIN, and if you had CAP_SYS_ADMIN you were already a heavily privileged container that could do plenty of damage anyway. The interesting question, and the one CVE-2022-0492 answers, is whether a container could obtain a working CAP_SYS_ADMIN over a cgroup mount without the host ever granting it.

    The classic escape with a privileged container

    It helps to see the abuse in its original, non vulnerability form first, because the vulnerability simply removes the precondition. Picture our invented note taking service, Acme Notes, which runs each customer’s background jobs in a container. Suppose an attacker has found a way to run as root inside one of those job containers, and the container was started privileged so it does hold CAP_SYS_ADMIN. The escape is a short sequence:

    mkdir /tmp/cgrp && mount -t cgroup -o rdma cgroup /tmp/cgrp
    mkdir /tmp/cgrp/x
    echo 1 > /tmp/cgrp/x/notify_on_release
    host_path=$(sed -n 's/.*\perdir=\([^,]*\).*/\1/p' /etc/mtab)
    echo "$host_path/cmd" > /tmp/cgrp/release_agent
    echo '#!/bin/sh' > /cmd
    echo "cat /etc/shadow > $host_path/output" >> /cmd
    chmod a+x /cmd
    sh -c "echo 0 > /tmp/cgrp/x/cgroup.procs"

    Read it top to bottom. You mount a cgroup v1 controller (the rdma controller is a common pick) so its hierarchy is writable. You make a child cgroup x and turn on notify_on_release for it. You find the container’s path on the host filesystem by reading the overlay mount info, then write host_path/cmd into release_agent, so the kernel will look for the agent at a path that resolves to a file inside your container. You drop a small script at /cmd that does whatever you want, here dumping the host’s /etc/shadow back to a place you can read. Finally a short lived sh writes 0 into the child’s cgroup.procs, which cgroup v1 treats as a request to move the writing process itself into that cgroup, and then exits, emptying the cgroup. The kernel sees the empty group, reads release_agent, and runs your script through its host path as root on the host. You just executed code outside the container.
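
    The sed one liner that recovers host_path is the least readable step, so here is the same idea spelled out. This sketch assumes the container root is an overlay mount whose options include upperdir=, as in the sequence above. The function name overlay_upperdir and the sample path in the usage note are hypothetical, and the parser ignores escaped characters in mount options. Unlike the sed expression, it only looks at mounts whose filesystem type is overlay.

```python
def overlay_upperdir(mtab_text):
    """Return the upperdir option of the first overlay mount, or None."""
    for line in mtab_text.splitlines():
        fields = line.split()
        # mtab fields: device, mount point, fstype, options, dump, pass
        if len(fields) >= 4 and fields[2] == "overlay":
            for option in fields[3].split(","):
                if option.startswith("upperdir="):
                    return option[len("upperdir="):]
    return None
```

    Given a line such as overlay / overlay rw,lowerdir=...,upperdir=/var/lib/docker/overlay2/abc123/diff,workdir=... 0 0, the function returns the upperdir value, which is the host side directory holding the files the container has written.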

    The kernel was never tricked into running the wrong file. It ran exactly the file it was told to, as root, on the host, because nothing checked that the process which named that file had any business naming it.

    CVE-2022-0492: the missing check

    The classic technique above needs a privileged container with CAP_SYS_ADMIN. CVE-2022-0492 is the discovery that an unprivileged container could reach the same write through a back door, because the kernel’s permission check on release_agent was wrong.

    What Unit 42 found

    The vulnerability was disclosed in early 2022 by Yiqi Sun and Kevin Wang, with the most detailed public writeup published by Palo Alto Networks’ Unit 42 research team. The flaw lived in the cgroup_release_agent_write function in kernel/cgroup/cgroup-v1.c. That function is what runs when something writes to a release_agent file, and it was supposed to confirm the writer was sufficiently privileged before accepting the new path. It did not. The function failed to verify that the writing process held CAP_SYS_ADMIN in the initial user namespace. The official CVE record for CVE-2022-0492 describes it as allowing the cgroups v1 release_agent feature to escalate privileges and bypass namespace isolation, and NVD scores it CVSS v3.1 7.8, high severity.

    What makes the finding sharp is that the underlying release_agent abuse was already public and understood as a privileged container trick. The contribution was noticing that user namespaces had quietly changed the threat model: a feature whose guard assumed only a genuinely privileged process could ever reach it was now reachable by any process that could spin up its own user namespace and call its bluff. The bug had reportedly been present since the relevant code path was introduced years earlier, sitting in plain sight, dangerous only once unprivileged user namespaces became common enough to weaponize. That is the recurring texture of this class of flaw. Nothing crashed, nothing leaked, the code did exactly what it said. It just trusted the wrong namespace.

    Why an unprivileged user namespace is the key

    This is the part that turns a missing check into a real escape. Linux user namespaces let an unprivileged process create a new user namespace in which it is root and holds a full set of capabilities, including CAP_SYS_ADMIN, but only over resources owned by that new namespace. The whole point of user namespaces is that this capability is local and fake from the host’s point of view. You are root in your little bubble; the host still sees you as nobody. Inside that new user namespace you are allowed to create new mount and cgroup namespaces and mount a fresh cgroup v1 hierarchy, and within that hierarchy you have a writable release_agent file.

    Now the two pieces meet. The attacker holds CAP_SYS_ADMIN, but only in the new user namespace, which should not count for a host level action like setting a release agent. The kernel’s job in cgroup_release_agent_write was to notice that and refuse. Because the check was missing, the kernel accepted the write from a process whose CAP_SYS_ADMIN was the local, namespaced, supposedly harmless kind. The attacker then runs the same notify_on_release sequence, empties the cgroup, and the kernel dutifully executes their script as real root on the host. An unprivileged container, given that user namespaces are enabled and no extra hardening blocks the steps, escapes to the host.

    The distinction the kernel missed is the difference between two functions with very similar names. ns_capable asks whether the caller holds a capability in some particular user namespace, which a process that just created its own user namespace always satisfies, because it minted itself a full capability set when it created the namespace. capable asks whether the caller holds the capability in the host’s original user namespace, the one no unprivileged process can fake its way into. The release agent write must demand the second kind, because the program it stores gets run as host root. The vulnerable code effectively settled for the first kind, or for no kind at all, which is why a process that was root only inside its own bubble could set a file that the kernel would then honor with the real thing. The gap between those two questions is the entire CVE.
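
    A toy model makes the two questions easy to compare. This is a simplified sketch, not the kernel’s logic: it omits the rule about namespace owners and treats capabilities as plain strings. The classes UserNS and Process and the helper unshare_user_ns are hypothetical names; only ns_capable and capable mirror real kernel function names.

```python
from dataclasses import dataclass
from typing import Optional

CAP_SYS_ADMIN = "CAP_SYS_ADMIN"


@dataclass(eq=False)
class UserNS:
    name: str
    parent: Optional["UserNS"] = None


INIT_USER_NS = UserNS("init_user_ns")


@dataclass
class Process:
    user_ns: UserNS
    caps: frozenset  # capabilities held relative to user_ns


def ns_capable(proc, target_ns, cap):
    """Does proc hold cap over target_ns? True only from target_ns or an ancestor."""
    ns = target_ns
    while ns is not None:
        if ns is proc.user_ns:
            return cap in proc.caps
        ns = ns.parent
    return False


def capable(proc, cap):
    """Does proc hold cap in the host's initial user namespace?"""
    return ns_capable(proc, INIT_USER_NS, cap)


def unshare_user_ns(proc):
    """Model an unprivileged unshare: a child namespace with a full local cap set."""
    child = UserNS("bubble", parent=proc.user_ns)
    return Process(child, frozenset({CAP_SYS_ADMIN}))
```

    In this model a process with no capabilities can call unshare_user_ns and immediately pass ns_capable for its own namespace, while capable still returns False for it. A release_agent write guarded by the second question refuses that process; one guarded by the first, or by nothing, does not.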

    One more nuance makes the escape practical rather than theoretical. The attacker has to name a program the kernel can actually find and run from the host context. Because the kernel resolves the release_agent path on the host, the attacker reads the container’s location on the host filesystem out of the mount information, usually the overlay upperdir, and writes a path that lands inside files they already control from within the container. So the script the kernel executes as root is a file the attacker wrote inside the container, reached by its true host path. No file is smuggled across the boundary; the same bytes are simply addressed two ways.

    This is fundamentally a privilege escalation dressed as a container escape. The container gains an authority it was never assigned by exploiting an interface that trusted a capability it should have distrusted.

    The kernel fix

    The fix is almost anticlimactically small, which is what makes it instructive. The patch landed in mainline as commit 24f6008564183aa120d07c03d9289519c2fe02af and added the check that should always have been there. Before accepting a write to release_agent, the function now confirms the caller is operating in the initial user namespace and holds genuine CAP_SYS_ADMIN, using capable(CAP_SYS_ADMIN) against the host’s init_user_ns rather than the namespace local ns_capable check that a user namespace could satisfy. If the writer’s user namespace is not init_user_ns, or it lacks real CAP_SYS_ADMIN, the write is rejected with EPERM. That single distinction, host capability versus namespaced capability, is the whole bug and the whole fix. The fix shipped in 5.17 and was backported to the maintained stable trees.

    The broader family of container escapes

    The release_agent trick is one entry in a catalog, and it is worth knowing the neighbors, because they all share the shared kernel premise even when the specific door differs.

    Privileged containers

    A container started with --privileged is barely a container at all. It keeps almost all capabilities, including CAP_SYS_ADMIN, and it can see host devices. The classic release_agent escape works directly from such a container with no vulnerability required, and so do many other tricks, because a privileged container is one short step from being a host root shell. The lesson is that privileged is a decision to drop the boundary, not a convenience flag.

    A mounted docker.sock

    Mounting the Docker daemon socket, /var/run/docker.sock, into a container hands that container the ability to talk to the Docker daemon, which runs as root on the host. From inside, the process can ask the daemon to start a new container that mounts the host’s root filesystem and runs as root, then read or write anything on the host. There is no kernel bug here at all. The container was simply given a control channel to a privileged host service.
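
    Checking for this from inside a container is straightforward. The sketch below tests only whether a path is a Unix socket the current process can write to. It assumes the conventional socket location and does not talk to the daemon. The function name exposed_docker_socket is hypothetical.

```python
import os
import stat


def exposed_docker_socket(path="/var/run/docker.sock"):
    """True if path is a Unix socket this process has write permission on."""
    try:
        info = os.stat(path)
    except OSError:
        return False
    return stat.S_ISSOCK(info.st_mode) and os.access(path, os.W_OK)


if __name__ == "__main__":
    print("docker.sock reachable:", exposed_docker_socket())
```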

    Exposed host mounts

    Bind mounting sensitive host paths into a container, the host root, /etc, the Docker directory, or device nodes, gives the container direct reach into host state. Write access to the right host file, such as a script the host runs on a schedule or a configuration the host trusts, is escape enough. The boundary leaks wherever a writable path crosses it.

    A vulnerable shared kernel

    Because the kernel is shared, any kernel bug reachable from inside a container is a candidate escape. Dirty COW (CVE-2016-5195) and Dirty Pipe (CVE-2022-0847) are the famous examples: both let an unprivileged process overwrite files it should only be able to read by abusing a flaw in how the kernel handles copy on write or pipe page memory, and both can be fired from inside a container to overwrite a host owned file and gain root. A different flavor of the same family is a kernel use after free, where freed kernel memory is reclaimed and reused to corrupt state the attacker controls. The common thread is unmistakable: one kernel, shared by host and container, so a kernel bug is a host bug.

    Defending against the release_agent escape

    The good news is that the same hardening that blocks most of this family blocks the release_agent path too. Patch the kernel so cgroup_release_agent_write enforces the real capability check. Keep the default seccomp and the default AppArmor or SELinux profiles in place, because they deny the mount and write steps the exploit needs; Unit 42 noted the escape only works against containers running without those protections. Drop CAP_SYS_ADMIN and run unprivileged. Where you do not need them, disabling unprivileged user namespaces removes the mechanism that hands an unprivileged container its local CAP_SYS_ADMIN in the first place. And prefer cgroups v2, which does not carry the release_agent and notify_on_release interface in the form this exploit abuses. None of these is exotic. They are the defaults, and the escape mostly works where the defaults were removed.
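
    Two of those conditions can be read straight out of /proc from inside a container. The sketch below parses text rather than touching the system, so it can be pointed at /proc/self/status and /proc/mounts or at captured copies. It relies on CAP_SYS_ADMIN being capability number 21 and on cgroup v1 mounts reporting the filesystem type cgroup while v2 reports cgroup2. The function names are hypothetical, and the sketch does not check seccomp, AppArmor, SELinux, or the kernel patch level.

```python
CAP_SYS_ADMIN_BIT = 21


def has_cap_sys_admin(status_text):
    """True if the CapEff mask in /proc/<pid>/status text includes CAP_SYS_ADMIN."""
    for line in status_text.splitlines():
        if line.startswith("CapEff:"):
            mask = int(line.split()[1], 16)
            return bool((mask >> CAP_SYS_ADMIN_BIT) & 1)
    return False


def uses_cgroup_v1(mounts_text):
    """True if any mount in /proc/mounts text has filesystem type 'cgroup' (v1)."""
    for line in mounts_text.splitlines():
        fields = line.split()
        if len(fields) >= 3 and fields[2] == "cgroup":
            return True
    return False
```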

    The assumption that breaks

    Underneath all of it sits one assumption, and it is the same assumption every time. A container assumes the kernel boundary holds. It behaves as though its namespaces and cgroups are a wall around it, as solid as the virtual hardware around a VM. But the kernel is not a wall around the container. It is the floor under both the container and the host, one shared surface, and the container is standing on it right next to everything it is supposed to be isolated from. The moment a single capability is trusted too far, or a single kernel interface forgets to ask who is calling, or a single host path is left writable across the line, the boundary was never there. It was a set of checks, and one missing check, the absent CAP_SYS_ADMIN test in cgroup_release_agent_write, was enough to collapse the whole thing into a root shell on the host.

    That is the shape of bug you only find by asking what each layer trusts and why it still trusts it, rather than by scanning for a known bad pattern. The vulnerability was not a crash or a corrupted pointer. It was an interface that assumed a capability meant what it used to mean before user namespaces made capabilities local and cheap. Finding it meant questioning a boundary everyone treated as settled. That is exactly the kind of assumption an autonomous researcher built to test assumptions is meant to catch: not the malformed input, but the quiet premise that the wall is a wall. Read more about that approach on our about page.

    Frequently asked questions

    What is a container escape?

    A container escape is when a process inside a container breaks out of its restricted namespaces and cgroups and acts on the host directly, usually as root. It is possible because a container is not a virtual machine; it is an ordinary process that shares the same kernel as the host, so one over trusted capability or one writable host interface can collapse the boundary. The Unit 42 analysis of CVE-2022-0492 walks a real example end to end.

    How does the cgroups v1 release_agent escape work?

    In cgroups v1 a hierarchy can hold a release_agent file naming a program the kernel runs as root on the host when a cgroup with notify_on_release set to 1 becomes empty. If an attacker can write a path into release_agent and then empty a cgroup, the kernel executes their chosen script as root outside the container. The man7 cgroups documentation describes the legitimate release agent and notify_on_release mechanism this abuses.

    What did CVE-2022-0492 actually break?

    The cgroup_release_agent_write function in kernel/cgroup/cgroup-v1.c failed to verify that the process writing release_agent held real CAP_SYS_ADMIN in the initial user namespace. An unprivileged container could create a user namespace where it holds a local, supposedly harmless CAP_SYS_ADMIN, mount a writable cgroupfs, and write the file the kernel trusted. The flaw is documented at CVE-2022-0492 and scored CVSS 7.8 high by NVD.

    How do you defend against this container escape?

    Patch the kernel so cgroup_release_agent_write enforces the real capability check (the fix landed in commit 24f6008564183aa120d07c03d9289519c2fe02af), keep the default seccomp and AppArmor or SELinux profiles, drop CAP_SYS_ADMIN, avoid privileged containers and mounted docker.sock, and disable unprivileged user namespaces where you do not need them. The Sysdig writeup on CVE-2022-0492 covers detection and mitigation.

  • What Is Web Cache Deception and How a Crafted URL Leaks Private Pages

    What Is Web Cache Deception and How a Crafted URL Leaks Private Pages

    Web cache deception is an attack where a CDN or caching proxy is tricked into storing a victim’s authenticated, private response under a URL the attacker can fetch for themselves. The attacker lures a logged in victim to a crafted link like https://app.acmenotes.com/account/settings/nonexistent.css. The origin server ignores the extra suffix and serves the victim’s real account page, full of personal data and tokens. The cache, looking at the same URL, sees a .css ending and decides this must be a harmless stylesheet worth saving. It stores the private page under that key. The attacker then requests the very same URL, the cache serves its stored copy, and the victim’s private response lands in the attacker’s browser. This post walks the mechanism one step at a time: why the origin and the cache read the same URL differently, how a cache decides what to store, what the attacker actually walks away with, how this differs from web cache poisoning, and how to close the gap.

    The disagreement at the heart of web cache deception

    A cache sits between your users and your origin server to make things fast. When many people ask for the same stylesheet, script, or image, there is no reason to bother the origin every time. The cache keeps a copy of the response and hands it out to everyone who asks for that URL. This works beautifully for content that is the same for every visitor and stays the same for a while. Static assets are the textbook case.

    The whole arrangement rests on one quiet assumption: that a given URL means the same thing to the cache as it does to the origin. Web cache deception is what happens when that assumption is false. The cache and the origin both look at /account/settings/nonexistent.css and reach different conclusions about what it is. The origin routes by path prefix and decides this is the account settings page. The cache classifies by file extension and decides this is a CSS file. One of them is serving private, per user content. The other is treating that content as a public asset safe to store and replay. The attack lives entirely in that gap.

    How the origin reads the path

    Most application servers do not match a request against a literal file on disk. They route. A framework looks at the leading part of the path, matches it to a handler, and treats whatever trails behind as a parameter, a path variable, or simply noise it can ignore. A request to /account/settings hits the settings handler. A request to /account/settings/nonexistent.css very often hits the exact same handler, because the router matched on /account/settings and never cared about the /nonexistent.css tacked on the end. The origin happily renders the logged in user’s settings page and returns it with a 200 OK. As Omer Gil described the condition in his original 2017 research, the requirement is simply that the server returns the content of the real page for the decorated URL rather than a 404. The suffix is invisible to the application but very visible to everything downstream.

    How the cache reads the same path

    The cache makes its decision on different grounds. A common and reasonable cache rule says: anything ending in a known static extension is cacheable. CSS, JS, PNG, GIF, ICO, WOFF, and a long tail of others. The logic is that files with those extensions are assets, assets do not contain secrets, and caching them is pure speed with no downside. So when the response to /account/settings/nonexistent.css comes back, the cache looks at the URL, sees .css, and stores the response. It often does this even when the origin’s own caching headers said not to, because an extension based rule can be configured to override or ignore Cache-Control. The cache is not reading the body. It does not know it just filed a page full of one specific user’s data under a public key. Omer Gil’s PayPal report listed more than forty extensions that PayPal’s cache would store this way, from css and js down to ico and swf.

    Two independent, defensible decisions have now combined into a vulnerability. The origin decided the suffix was meaningless. The cache decided the suffix was authoritative. Neither component is broken on its own. The bug is the disagreement between them.
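
    The disagreement can be shown in a few lines. This is a minimal sketch, not any real framework or CDN: the route table, the extension list, and both function names are assumptions made up for illustration.

```python
import posixpath

# Hypothetical route table: path prefix -> handler name (illustrative only).
ROUTES = {"/account/settings": "settings_page", "/static": "static_files"}

# Assumed subset of extensions a cache might treat as static.
STATIC_EXTENSIONS = {".css", ".js", ".png", ".gif", ".ico", ".woff"}


def origin_handler(path: str) -> str:
    """Prefix routing: the longest matching prefix wins, trailing segments are ignored."""
    matches = [p for p in ROUTES if path == p or path.startswith(p + "/")]
    if not matches:
        return "not_found"
    return ROUTES[max(matches, key=len)]


def cache_would_store(path: str) -> bool:
    """Extension rule: cacheable if the path ends in a known static extension."""
    _, ext = posixpath.splitext(path)
    return ext.lower() in STATIC_EXTENSIONS


path = "/account/settings/nonexistent.css"
print(origin_handler(path), cache_would_store(path))  # settings_page True
```

    The same string yields a dynamic handler on one side and a cacheable asset on the other, which is the whole precondition for the attack.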

    It helps to see why each side made the choice it did. The origin’s router is built for flexibility. Modern frameworks encourage clean, expressive routes, and matching on a leading prefix while ignoring trailing junk is a feature, not an oversight. It lets developers write one handler for /account/settings and not worry about every odd thing a browser or proxy might append. The cache, for its part, was tuned for a world where the URL is an honest signal of content type. For most of the web’s history, a path ending in .css really was a stylesheet, and trusting the extension was a cheap, reliable shortcut. Each component optimized for its own job under a reasonable assumption about the other. The attacker simply found the one input where those two reasonable assumptions point in opposite directions.

    The cache key is not the whole URL

    To see why the attacker can retrieve what the victim triggered, you have to look at the cache key. A cache does not index its stored responses by the full request. It builds a key, usually from the URL path and some chosen query parameters, and crucially that key does not include the victim’s session cookie. Cookies are exactly the thing that makes a response personal, and they are normally left out of the key so that the cache can serve one stored copy to many users.

    That omission is the engine of the attack. The victim’s request to /account/settings/nonexistent.css carried their session cookie, so the origin rendered their private page. But the cache filed that private response under a key built only from the path. When the attacker later requests the identical URL, with no cookie or with their own, they produce the same cache key. The cache matches the key, sees a stored response, and serves it without ever consulting the origin. The session that authorized the content is long gone from the picture. The attacker did not need the victim’s cookie because the cache already stripped the cookie out of the key and kept the response.

    The victim’s credentials fetch the private page once. The cache then serves that page to anyone who knows the URL, because the thing that made it private was never part of the key.
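
    A toy model makes the role of the key concrete. Everything here is an assumption for illustration: the in-memory dictionary stands in for a shared cache, and the origin function stands in for an application that renders a page for whoever owns the session.

```python
from typing import Dict, Optional

cache: Dict[str, str] = {}  # shared cache, keyed on the path only


def origin(path: str, session_user: Optional[str]) -> str:
    """Toy origin: renders a per user page for whoever owns the session."""
    if session_user is None:
        return "login required"
    return f"settings for {session_user}"


def fetch(path: str, session_user: Optional[str] = None) -> str:
    key = path  # the session cookie is deliberately NOT part of the key
    if key in cache:
        return cache[key]  # cache hit: the origin is never consulted
    body = origin(path, session_user)
    if path.endswith(".css"):  # extension based caching rule
        cache[key] = body
    return body


url = "/account/settings/nonexistent.css"
victim_view = fetch(url, session_user="alice")  # primes the cache
attacker_view = fetch(url)                      # no cookie at all
print(attacker_view)  # settings for alice
```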

    Beyond the file extension trick

    The clean .css suffix is the original and most intuitive form, but the same disagreement shows up in subtler shapes. The PortSwigger Web Security Academy catalogs several, and they all reduce to the cache and the origin parsing the URL by different rules.

    Static directory rules

    Caches are often told to store anything under a particular directory prefix, like /static, /assets, or /resources. The intent is to cache the asset folder wholesale. If the origin’s router is loose about where that prefix appears, an attacker can craft a path that the cache sees as living under /assets while the origin still routes it to a dynamic, authenticated handler. No file extension is needed at all. The cacheable signal is the directory, and the path confusion smuggles private content into it.

    Delimiter and path parameter discrepancies

    Different stacks disagree about which characters end a path and which are just data. A semicolon is a meaningful path parameter delimiter in some Java servers and inert punctuation elsewhere. An encoded character like %2f may be decoded to a slash by one component and left literal by another. When the cache truncates the URL at a delimiter the origin ignores, or matches an extension the origin treats as part of an earlier path segment, the two views split apart again. The attacker’s job is to find a single character or encoding that the origin reads one way and the cache reads another, then build the gap from that seam. OWASP files this whole family under path confusion, and its testing guide points testers at exactly these decorated URLs.
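
    A small sketch of the semicolon case, under stated assumptions: the origin is modeled as dropping everything after a semicolon in each segment, and the cache is modeled as reading the extension from the raw path. Neither function reflects a specific server or CDN.

```python
def origin_route(path: str) -> str:
    """Assumed origin behavior: ';' starts a path parameter that is dropped before routing."""
    return "/".join(segment.split(";", 1)[0] for segment in path.split("/"))


def cache_extension(path: str) -> str:
    """Assumed cache behavior: ';' is ordinary data, so the raw path decides the extension."""
    last = path.rsplit("/", 1)[-1]
    return last.rsplit(".", 1)[-1] if "." in last else ""


crafted = "/account/settings;x.css"
print(origin_route(crafted))     # /account/settings
print(cache_extension(crafted))  # css
```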

    Normalization gaps

    Caches and origins also resolve path traversal sequences and normalize encoded characters differently. If a cache collapses ..%2f before keying but the origin resolves it after routing, or the reverse, an attacker can present a path that appears to sit under a static prefix to one and under a dynamic route to the other. Same root cause, different mechanical lever: the two parsers do not agree on what the bytes in the path mean.
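
    One direction of that gap can be sketched as follows. The assumptions are loud ones: the cache is modeled as matching a /static/ prefix on the raw, undecoded path, and the origin is modeled as decoding and then normalizing before it routes. Real components vary in which of them decodes first.

```python
import posixpath
from urllib.parse import unquote


def cache_sees_static(raw_path: str) -> bool:
    """Assumed cache rule: match the static prefix on the raw path, with no decoding."""
    return raw_path.startswith("/static/")


def origin_resolved_path(raw_path: str) -> str:
    """Assumed origin behavior: percent decode first, then collapse dot segments."""
    return posixpath.normpath(unquote(raw_path))


crafted = "/static/..%2faccount/settings"
print(cache_sees_static(crafted))     # True
print(origin_resolved_path(crafted))  # /account/settings
```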

    What the attacker actually walks away with

    The stored response is whatever the victim would have seen on that authenticated page, and that is rarely just cosmetic. Account pages routinely embed the things that matter most. Personally identifiable information sits in the page body: names, email addresses, postal addresses, phone numbers, partial card numbers, balances. Gil’s PayPal disclosure noted the leak could expose exactly this class of data: names, account balances, card digits, transaction history, and more. That alone is a serious data breach with no further work.

    It frequently gets worse, because authenticated pages also carry security tokens in their markup. A CSRF token printed into a hidden form field is meant to prove that a request came from the real user. If the page holding that token gets cached and handed to an attacker, the token leaks, and a defense against forged requests becomes a gift to the forger. Some pages expose session identifiers, API keys, or single use links in the same way. Once any of those land in the cached copy, the attacker can escalate from reading the victim’s data to acting as the victim, which is the path to full account takeover.

    The delivery is also low effort for the attacker. There is no malware and no exploit chain to detonate; there is a link. The attacker sends the victim a crafted URL through email, a chat message, or a link on any page the victim visits, exactly the way a phishing link travels. The victim does not have to type anything, log in again, or approve a prompt. They are already authenticated, and clicking the link silently fires off the request that primes the cache. From the victim’s point of view nothing dramatic happens; the page they land on may even look normal or show a missing stylesheet for a fraction of a second. The damage is invisible until the attacker fetches the stored copy. This is part of why the attack is so durable: the visible footprint on the victim’s side is close to nothing.

    The reach of the attack is not narrow. Gil reported that when he tested high profile sites, a meaningful fraction were exploitable. Later academic work has kept confirming the prevalence at scale. A 2024 study, Hidden Web Caches Discovery by Matteo Golinelli and Bruno Crispo, used a timing based method to find caches that do not even announce themselves through response headers, measuring roughly 5.8 percent of the Tranco top 50,000 sites running such hidden caches, of which over a thousand were susceptible to web cache deception. A cache you cannot see in the headers is still a cache that can store and leak a private page. That last point is worth dwelling on, because it undercuts a common defensive instinct. Teams often reason about caching by reading response headers, assuming that if they do not see a cache status header they are not being cached. A hidden cache breaks that assumption outright. The infrastructure may cache silently, and the only way to know is to probe its behavior rather than trust what it advertises.

    What separates this from cache poisoning

    Web cache deception is constantly confused with web cache poisoning, and they are genuinely different attacks pointed in opposite directions. PortSwigger draws the line cleanly: poisoning manipulates the cache key to inject malicious content into a cached response that is then served to other users, while deception exploits cache rules to trick the cache into storing sensitive content that the attacker then retrieves for themselves.

    Read that again by the direction of harm. In cache poisoning, the attacker is the source of bad content and the victims are everyone else. The attacker finds an unkeyed input, some header or parameter the origin reflects into the response but the cache leaves out of the key, and they use it to plant a malicious payload under a popular URL. The next thousand visitors who request that URL get the attacker’s poisoned response. The flow runs from attacker, into the cache, out to the crowd.

    In web cache deception, the direction reverses. The victim is the source of the sensitive content and the attacker is the single beneficiary. The attacker lures one logged in victim to fetch their own private page, the cache stores it, and the attacker pulls that one stored copy back out. Nothing malicious is injected. The response is completely legitimate; it is simply the wrong person’s response, served to the wrong person. Poisoning is about controlling what a shared cache serves. Deception is about reading what a shared cache should never have stored. The mechanics rhyme, because both exploit a mismatch between what the cache keys and what actually varies the response, but the payload, the victim, and the goal are inverted.

    This origin versus intermediary disagreement is a recurring pattern rather than a one off. It is the same shape as HTTP request smuggling, where the front end and the back end disagree about where one request ends and the next begins. In both cases there is no single broken component, only two components that parse the same bytes by different rules, and an attacker who lives in the gap between their interpretations.

    Closing the gap

    Because the root cause is a disagreement, the fixes all work by removing the disagreement or refusing to act on it.

    The strongest move is to make caching decisions on what the origin actually returns, not on what the URL looks like. A cache that respects the origin’s Cache-Control: no-store and private directives will not store an authenticated page no matter what extension is glued to the path, because the origin that produced the response asked for it not to be stored. Send those headers on every response that contains per user data, and configure the cache to honor them rather than override them with a blanket extension rule.
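
    As a rough illustration of a storage decision driven by the response rather than the URL, the sketch below parses a Cache-Control value and refuses to store when no-store or private is present. The function name is hypothetical, and a real shared cache weighs many more directives than these two.

```python
def may_store(cache_control: str) -> bool:
    """Shared cache decision that honors the origin: never store no-store or private."""
    directives = {
        part.strip().split("=", 1)[0].lower()
        for part in cache_control.split(",")
        if part.strip()
    }
    return not ({"no-store", "private"} & directives)


print(may_store("private, no-store"))      # False
print(may_store("public, max-age=3600"))   # True
```

    Note that the URL never appears in this function. A path ending in .css changes nothing if the origin marked the response as private.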

    Next, close the parsing gap at the origin. If a request decorated with /nonexistent.css or a stray delimiter is not a real route, the application should return a 404 or a redirect, not silently serve the underlying page. A router that rejects the decorated path denies the cache anything worth storing. On the cache side, verify that the response Content-Type matches the extension in the URL before storing; a page served as text/html under a .css URL is the exact contradiction the cache should refuse.
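
    The Content-Type check can be sketched with the standard library. This is an assumed, simplified version of the idea: it uses Python’s mimetypes table as the source of expected types, and the function name is invented for this example.

```python
import mimetypes


def content_type_matches_extension(path: str, content_type: str) -> bool:
    """Return False when the URL extension promises one type and the response carries another."""
    expected, _ = mimetypes.guess_type(path)
    if expected is None:
        return True  # unknown or missing extension: this check has no opinion
    actual = content_type.split(";", 1)[0].strip().lower()
    return actual == expected


# An HTML page served under a .css URL is the contradiction to refuse.
print(content_type_matches_extension(
    "/account/settings/nonexistent.css", "text/html; charset=utf-8"))  # False
```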

    Finally, narrow the cache rules and align the two parsers. Prefer caching by explicit, known safe paths over broad extension or directory rules that match anything ending a certain way. Where the cache and the origin must both parse a URL, make sure they normalize delimiters, encodings, and traversal sequences identically, so there is no seam for an attacker to pry open. Each of these turns the two disagreeing views back into one.

    It is worth testing for this directly rather than assuming you are safe. Pick an authenticated page, request it with a static suffix like /nonexistent.css appended, and watch what comes back. If the origin still returns the private page with a 200 OK, request the same decorated URL a second time without any session cookie and see whether the private content comes back from the cache. If it does, you have reproduced the attack against your own application, and you know precisely which of the fixes above is missing. Run the same probe against the delimiter and directory variants, because a site can be hardened against the plain extension trick while still leaking through a semicolon or a static directory prefix. This kind of hands on probing is what OWASP’s path confusion guidance asks testers to do, and it surfaces the disagreement far more reliably than reading configuration files and hoping the cache and the origin agree.
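
    The two request probe described above can be written as a small function. This is a sketch under assumptions: fetch is a callable you supply (for example a thin wrapper around your HTTP client), marker is any string that only appears on the authenticated page, and the fake site below exists only so the probe can be exercised offline.

```python
from typing import Callable, Dict, Optional, Tuple

# fetch(url, cookie) -> (status, body); a hypothetical interface supplied by the tester.
Fetch = Callable[[str, Optional[str]], Tuple[int, str]]


def probe_cache_deception(fetch: Fetch, page_url: str, cookie: str,
                          marker: str, suffix: str = "/nonexistent.css") -> str:
    """Prime the decorated URL with a session, then replay it without one."""
    decorated = page_url.rstrip("/") + suffix
    status, body = fetch(decorated, cookie)
    if status != 200 or marker not in body:
        return "origin rejected the decorated path"
    status, body = fetch(decorated, None)
    if status == 200 and marker in body:
        return "vulnerable: private content replayed without a cookie"
    return "origin served the decorated path, but the cache did not replay it"


# Stand-in for a vulnerable site, used only to exercise the probe offline.
_store: Dict[str, str] = {}


def fake_vulnerable_fetch(url: str, cookie: Optional[str]) -> Tuple[int, str]:
    if url in _store:
        return 200, _store[url]
    body = "balance: 42" if cookie else "please log in"
    if url.endswith(".css"):
        _store[url] = body
    return 200, body


result = probe_cache_deception(
    fake_vulnerable_fetch, "https://app.example/account/settings",
    "session=abc", "balance:")
print(result)
```

    Swapping the suffix argument for a delimiter or a static directory variant lets the same function cover the other shapes of the attack.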

    The assumption that breaks

    Strip away the extensions and the delimiters and one assumption is holding the whole thing up. The cache assumes that a URL means the same thing to it as it does to the origin, and that anything dressed as a static asset is safe to store and replay to anyone. The origin assumes the cache will only keep what is genuinely public. Neither side ever checks the other, and the request itself never carries a signal that says this response was personal. So the two views drift apart on a single crafted path, and the gap between them is exactly wide enough to slip one user’s private page into a public slot.

    The bug is not a broken cache or a careless framework. The bug is two correct components trusting that they agree on what a URL means when they do not, and a trust boundary everyone assumed sat at the response when it actually sat in the disagreement over the path. That kind of flaw does not show up by scanning for a known bad string. You find it by asking what each component assumes about the request, and whether the component on the other side shares that assumption. In practice that means three things: honor the origin’s caching headers, make your router reject the decorated path, and keep the cache and the origin reading the same URL the same way. Asking that kind of question is exactly what an autonomous researcher built to test assumptions is meant to do. Learn more about that approach on our about page.

    Frequently asked questions

    What causes a web cache deception vulnerability?

    It is caused by the cache and the origin server disagreeing about what a URL means. The origin routes by path prefix and serves a private page for a decorated URL like /account/settings/nonexistent.css, ignoring the suffix, while the cache classifies the same URL by its .css extension and stores the response as a public asset. Neither component is broken alone; the bug is the mismatch. Omer Gil first described this condition in his 2017 Web Cache Deception Attack research.

    How does the attacker retrieve the victim’s private data?

    Through the cache key. A cache indexes stored responses by the URL path, not by the victim’s session cookie, since cookies are normally excluded so one copy can be served to many users. The victim’s authenticated request stores their private page under a cookieless key, and the attacker then requests the identical URL, produces the same key, and the cache serves the stored copy without consulting the origin. The PortSwigger Web Security Academy explains how cache keys and cache rules combine to enable this.

    How is web cache deception different from web cache poisoning?

    They point in opposite directions. Web cache poisoning manipulates the cache key to inject malicious content into a cached response that is then served to many other users, so the attacker is the source and the crowd is the victim. Web cache deception tricks the cache into storing one victim’s sensitive response, which the attacker alone retrieves, so the victim is the source and nothing malicious is injected. PortSwigger draws this exact distinction in its web cache deception writeup.

    How do you prevent web cache deception?

    Make caching decisions on what the origin returns, not on what the URL looks like. Send Cache-Control: no-store and private on every authenticated response and configure the cache to honor them rather than override them with an extension rule. Have the router return a 404 for decorated paths like /account/settings/nonexistent.css, verify the Content-Type matches the extension, and align how the cache and origin normalize delimiters and encodings. OWASP covers testing for this under Test for Path Confusion.

  • How DNS Rebinding Works and Reaches Inside Your Private Network

    A DNS rebinding attack is a trick where a web page you open in your browser quietly turns into a client for a device on your own home or office network. The page is served from a name the attacker controls, say rebind.acmeattacker.com, and your browser treats every request to that name as belonging to one origin. The attacker also controls the DNS server for that name, so a moment after the page loads they change the answer. The same name that first resolved to a public server now resolves to a private address like 192.168.1.1 or 127.0.0.1. Your browser keeps thinking it is talking to the same origin, because the hostname never changed, and the attacker’s JavaScript starts speaking directly to your router, your media server, or a service bound to localhost that was never meant to face the internet. This post walks the mechanism one step at a time: why the same origin policy trusts the hostname, how a short DNS time to live lets the attacker swap the IP underneath it, why DNS pinning only partly closes the gap, the answer tricks attackers use, the real devices this has hit, and the defenses that actually hold.

    Why the browser trusts a name it cannot pin down

    The same origin policy is the rule that keeps one site’s JavaScript from reading another site’s data. Two pages share an origin when their scheme, host, and port all match. A script on https://app.acmenotes.com can read responses from https://app.acmenotes.com and is blocked from reading https://api.bank.example. The host comparison is a string comparison on the hostname. As the MDN same origin policy reference describes it, origin is defined by scheme, host, and port, and the host part is the textual name. Nowhere in that comparison does the browser ask which IP address the name currently resolves to.

    That choice is deliberate and mostly reasonable. A single hostname legitimately moves across IP addresses all the time. Load balancers rotate backends, content networks return the nearest edge, failover swaps a dead server for a live one. If the same origin policy were pinned to an IP address, ordinary sites would break every time DNS handed back a different answer. So the policy trusts the name and assumes the name keeps meaning the same thing for the life of the page.

    It is worth being precise about what the origin actually is, because the whole attack lives in the definition. An origin is the triple of scheme, host, and port. The host is a domain name or an IP literal, and when it is a name, the browser stores and compares the name itself. The browser does resolve that name to an address in order to open a socket, but the resolved address is an implementation detail of the network layer, not part of the security identity. Two requests to https://rebind.acmeattacker.com are same origin with each other by definition, no matter what each one resolved to at the moment it was sent. The attacker is not breaking the comparison. They are feeding it two different machines under one honest name.
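
    The comparison is simple enough to sketch. This is an approximation of the origin triple for illustration, not the browser’s actual implementation, and the helper names are made up here. The point to notice is that no DNS lookup appears anywhere in it.

```python
from typing import Optional, Tuple
from urllib.parse import urlsplit

DEFAULT_PORTS = {"http": 80, "https": 443}


def origin(url: str) -> Tuple[str, Optional[str], Optional[int]]:
    """Approximate origin triple: (scheme, host string, port). No IP address involved."""
    parts = urlsplit(url)
    port = parts.port or DEFAULT_PORTS.get(parts.scheme)
    return (parts.scheme, parts.hostname, port)


def same_origin(a: str, b: str) -> bool:
    return origin(a) == origin(b)


# Same name before and after a rebind: still the same origin.
print(same_origin("http://rebind.acmeattacker.com/page",
                  "http://rebind.acmeattacker.com:80/api"))   # True
# The private address requested directly: a different origin.
print(same_origin("http://rebind.acmeattacker.com/page",
                  "http://192.168.1.1/api"))                  # False
```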

    DNS is the part of the system that turns a name into an address, and it was built to be changeable on purpose. A record carries a time to live, the number of seconds a resolver may cache the answer before it has to ask again. Set that number to one second and you have told every resolver on the path that this answer expires almost immediately. The attacker who runs the authoritative DNS server for their own domain decides that number. They can answer one way now and a completely different way a second later, and the protocol considers both answers correct.

    Put those two facts side by side and the gap appears. The browser fixes the origin on the name. DNS lets the attacker change what the name points at. The browser never rechecks.

    The time of check versus time of use gap

    The cleanest way to see DNS rebinding is as a time of check to time of use bug, the classic shape where a system validates something once and then relies on that validation after the thing has changed. Here the check is the initial DNS lookup and the same origin decision that rides on it. The use is every later request the page makes to that same hostname.

    Walk the sequence. The victim visits rebind.acmeattacker.com. Their browser asks the attacker’s DNS server for the address and gets back a normal public IP, say the attacker’s own web server, with a time to live of one second. The page loads, the malicious script runs, the origin is now fixed on that hostname. So far nothing is unusual and nothing private has been touched.

    The script then waits, or makes a request that it knows will force a fresh lookup once the one second cache entry expires. The browser asks the attacker’s DNS server again. This time the answer is 192.168.1.1, the victim’s own router. From the browser’s point of view nothing about the origin has changed. The scheme is the same, the host string is the same, the port is the same. So it sends the request, including any work the script wants done, straight to the router. The check happened against the public server. The use lands on the private one. The window between them is the whole attack.
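
    The sequence can be replayed in a toy model. Both classes are hypothetical stand-ins: the resolver answers with a public address once and a private one afterwards, and the browser model only caches answers for their time to live. The clock is passed in explicitly so nothing has to sleep, and 203.0.113.10 is a documentation address standing in for the attacker’s server.

```python
import itertools
from typing import Dict, Tuple


class AttackerDNS:
    """Toy authoritative server: the first answer is public, every later one is private."""

    def __init__(self, public_ip: str, private_ip: str) -> None:
        self._answers = itertools.chain([public_ip], itertools.repeat(private_ip))

    def resolve(self, name: str) -> Tuple[str, int]:
        return next(self._answers), 1  # (address, time to live in seconds)


class ToyBrowser:
    """Caches answers by name until they expire; the origin stays tied to the name."""

    def __init__(self, dns: AttackerDNS) -> None:
        self.dns = dns
        self.cache: Dict[str, Tuple[str, float]] = {}

    def connect(self, name: str, now: float) -> str:
        ip, expires = self.cache.get(name, ("", -1.0))
        if now >= expires:  # cached answer expired: ask again
            ip, ttl = self.dns.resolve(name)
            self.cache[name] = (ip, now + ttl)
        return ip


browser = ToyBrowser(AttackerDNS("203.0.113.10", "192.168.1.1"))
name = "rebind.acmeattacker.com"
first = browser.connect(name, now=0.0)   # the check: public server
second = browser.connect(name, now=0.5)  # still cached
third = browser.connect(name, now=2.0)   # the use: private router
print(first, second, third)
```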

    The reason the attacker bothers with this dance, rather than just pointing their page at 192.168.1.1 directly, is that the same origin policy would stop the direct approach cold. A page served from https://rebind.acmeattacker.com cannot read responses from http://192.168.1.1, because those are plainly different origins. The browser would let the request go out but hide the response from the script, which is useless to the attacker. Rebinding exists precisely to make the private address wear the attacker’s hostname, so the response comes back to a script that is allowed to read it. The attacker is borrowing the victim’s own browser as a proxy that sits inside the network and, crucially, is trusted to read what it gets back.

    What the attacker needs from the victim is almost nothing. There is no exploit of the browser, no malware, no breached account. The victim only has to open a tab, which an attacker arranges with an advertisement, a link, or any embedded frame on a page the victim already visits. The page can keep the victim busy with ordinary looking content while the script quietly cycles through internal addresses in the background. By the time anything is noticeable, the requests have already been made and the responses already read.

    The browser never lied about the origin. The origin simply stopped meaning what it meant at the instant the browser decided to trust it.

    DNS pinning and why it is incomplete

    Browsers noticed this years ago and added a countermeasure called DNS pinning. The idea is simple. Once the browser has resolved a hostname and started using it, hold onto that first IP address for the lifetime of the page even if the DNS record’s time to live says the answer has expired. If the browser refuses to follow the rebind, the second lookup never reaches the router, and the attack dies.

    Pinning helps, but it was never a complete fix, for reasons that are structural rather than bugs to be patched away. The browser cannot pin forever. A page can stay open for hours, connections drop and get reestablished, and a pin that lasted indefinitely would break legitimate failover. So pins expire. An attacker who is willing to wait, or who can make the original connection fail, gets a fresh lookup and a fresh chance to rebind.

    Pinning also lives in only one place. The browser may pin, but it is not the only component resolving names and caching answers. The operating system has its own resolver cache, the local network may run its own, and these layers do not coordinate their pins. An answer that one layer considers expired another may serve fresh. The gaps between independent caches are exactly where a patient rebind slips through.

    There is a deeper limit too. Pinning binds a name to an address inside one browser process for one page session, but the attacker controls time. A rebinding script does not have to win in the first second. It can hold the tab open, throttle its own requests, and simply outlast whatever pin the browser is willing to maintain. The economics here favor the attacker the same way they do elsewhere in security. The defender has to keep the pin perfect across every cache and every reconnect. The attacker only needs the pin to lapse once.

    Multiple A records and the 0.0.0.0 trick

    Attackers found ways to make rebinding faster and more reliable than waiting on a cache to expire. One is to return multiple A records in a single answer. The attacker’s DNS server replies with two addresses for the name at once, their public server and the target’s private address. The browser connects to the public one first because that is where the page is served. Then the attacker makes their own server stop answering on that port. The browser, holding a name that still has a valid private address in the same record set, fails over to the private address without any new lookup at all. The rebind happens inside one cached answer, so pinning on the time to live buys nothing.
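
    A minimal sketch of that failover, assuming a client that simply tries the addresses in a record set in order. Real browsers have more elaborate connection logic, so treat this as the shape of the trick rather than a model of any specific browser.

```python
from typing import Callable, Iterable, Optional


def connect_with_failover(addresses: Iterable[str],
                          is_reachable: Callable[[str], bool]) -> Optional[str]:
    """Try each address from one cached answer in order; no new DNS lookup happens."""
    for ip in addresses:
        if is_reachable(ip):
            return ip
    return None


records = ["203.0.113.10", "192.168.1.1"]     # one answer, two A records
reachable = {"203.0.113.10", "192.168.1.1"}

page_load = connect_with_failover(records, reachable.__contains__)
reachable.discard("203.0.113.10")             # attacker stops answering on that port
later_request = connect_with_failover(records, reachable.__contains__)
print(page_load, later_request)  # 203.0.113.10 192.168.1.1
```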

    A related family of tricks abuses how some systems treat special addresses. The address 0.0.0.0 is not a normal destination. On many operating systems a connection to 0.0.0.0 is routed to localhost, so a service bound to 127.0.0.1 can be reached through it. This has been the basis of a long running class of issues, often discussed as the 0.0.0.0 problem, where a public page reaches a service the developer believed was safely bound to localhost only. Combine that with rebinding and a service that listens on the loopback interface, assuming nothing on the wider network can talk to it, is suddenly reachable from a tab the user opened by accident.

    What DNS rebinding actually reached in the wild

    This is not theoretical. The most thoroughly documented modern survey is researcher Brannon Dorsey’s 2018 writeup, Attacking Private Networks from the Internet with DNS Rebinding, which walked the full chain against consumer hardware and named the devices. The pattern across all of them is the same. Each device exposed an HTTP control interface on the local network with no authentication, on the unstated assumption that only software already inside the home would ever reach it.

    Google Home and Chromecast devices exposed an undocumented REST API on port 8008 that required no authentication and could launch apps, play content, reboot the device, and scan for nearby WiFi networks, which in turn enabled rough geolocation of the home. Roku devices exposed an External Control API on port 8060 with the same no authentication shape, tracked as CVE-2018-11314. Sonos WiFi speakers exposed debugging endpoints and a UPnP server that allowed network reconnaissance commands, tracked as CVE-2018-11316. Radio Thermostat CT50 and CT80 units exposed a completely unauthenticated control API, CVE-2018-11315, where the demonstrated impact was setting the temperature in a victim’s home to 95 degrees. WiFi routers were the highest value target, because the same UPnP and admin interfaces let an attacker rewrite the router’s own DNS server or add port forwarding rules, which turns a single accidental page view into a lasting foothold on the whole network.

    The Transmission case and a real CVE

    The starkest single example is CVE-2018-5702, found by Tavis Ormandy of Google Project Zero in the Transmission BitTorrent client. Transmission exposes a remote procedure call interface over HTTP for its web and desktop front ends. Its access control relied on a custom header, X-Transmission-Session-Id, which is not on the browser’s list of forbidden headers and so could be obtained and replayed by a malicious page. Through a rebinding attack a web page could reach the local Transmission daemon and issue RPC commands. As Ormandy described the impact, an attacker could set script-torrent-done-enabled and have an arbitrary command run when a torrent finished, or set download-dir to the user’s home directory and upload a torrent named to overwrite a file like .bashrc. That is remote code execution reached from an ordinary browser tab. Ormandy reported it and supplied a fix the following day, which landed as a Host header validation patch on the Transmission project. The fix is worth noting because it points straight at the right defense.

    Rebinding reaches more than home gadgets

    The technique generalizes to anything that trusts the network it sits on. Internal admin panels that skip authentication because they are only reachable on a corporate subnet are reachable through a rebind from any employee’s browser. In cloud environments the same idea targets the instance metadata service, the link local endpoint at 169.254.169.254 that hands out credentials to a workload. A rebind that lands on that address from inside a victim’s network or browser context is closely related to the broader class of server side request forgery, where a trusted client is steered into making a request it should never make. The rebind is the steering mechanism. The metadata service is the prize.

    Finding the target from inside the browser

    Before an attacker can rebind onto a useful service they have to know it is there, and the same browser that runs their script can also do the scouting. A script can try to load resources from a range of private addresses and ports and watch how long each attempt takes or whether it errors. A closed port fails fast, an open one behaves differently, and the timing alone leaks which internal hosts and services are alive. The private IP space is small and predictable. Home networks cluster on 192.168.0.0/16 and 10.0.0.0/8, routers sit on the first usable address, and well known services answer on well known ports. The attacker does not need to guess much.

    Once a live service is mapped, the rebind is aimed at exactly that address and port, and the generic page becomes a targeted client. This is why rebinding pairs so naturally with browser based port scanning. The scan tells the attacker where to point the rebind, and the rebind turns a discovered service into one the script is allowed to read from. Neither half requires anything beyond an open tab.

    A worked example on Acme Notes

    Make it concrete with our invented app. Suppose Acme Notes ships a small desktop helper that runs a local sync agent, listening on 127.0.0.1:7000 to talk to the Acme Notes app at app.acmenotes.com. The team bound it to localhost on the reasonable belief that only software already on the machine could reach it, so they did not add authentication to its control endpoint. A user installs the helper and later, on an unrelated tab, opens a page that an attacker controls.

    That page is served from sync.acmeattacker.com on port 7000, matching the agent’s port because the port is part of the origin. The name first resolves to the attacker’s public server, and the page loads a script. The script waits for the one second time to live to expire and for any browser pin to lapse, the name rebinds to 127.0.0.1, and now the script is talking to the local sync agent under a hostname the browser trusts. Because the agent never checked who was calling, it answers, and the script can read sync state, change settings, or point the agent at a server the attacker runs. The takeaway is not that Acme Notes wrote bad code. It is that binding to localhost was treated as authentication when it is only a filter on connection origin, and rebinding is built specifically to defeat that filter.

    Defenses that actually hold

    Because rebinding works by changing the IP under a trusted name, the durable defenses are the ones that stop trusting the network position and start checking something the attacker cannot forge.

    Validate the Host header

    This is the single most important server side fix, and it is the one the Transmission patch used. A service that is meant to answer only for localhost or its own hostname should inspect the Host header on every request and reject anything that does not match an allowlist of expected values, returning a 403 Forbidden. When the rebind lands, the browser still sends Host: rebind.acmeattacker.com, because that is the hostname in the URL the script requested. The service sees a host it does not serve and refuses. The attacker cannot change the Host header to a forged value from JavaScript, because the browser sets it from the URL. This check costs almost nothing and defeats the core of the attack.
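
    A sketch of the check, assuming a loopback only service. The allowlist contents and function names are illustrative, and this is not the Transmission patch itself, only the same idea in miniature.

```python
from typing import Optional

# Assumed allowlist for a service that should only ever be addressed as loopback.
ALLOWED_HOSTS = frozenset({"localhost", "127.0.0.1", "[::1]"})


def host_allowed(host_header: Optional[str]) -> bool:
    """Accept only requests whose Host header names this service, ignoring the port."""
    if not host_header:
        return False
    host = host_header.strip().lower()
    if host.startswith("["):            # bracketed IPv6 literal, e.g. [::1]:7000
        host = host.split("]", 1)[0] + "]"
    elif ":" in host:                   # strip an optional port
        host = host.rsplit(":", 1)[0]
    return host in ALLOWED_HOSTS


def status_for(host_header: Optional[str]) -> int:
    return 200 if host_allowed(host_header) else 403


print(status_for("127.0.0.1:7000"))                # 200
print(status_for("rebind.acmeattacker.com:7000"))  # 403
```

    The rebound request arrives at the right IP address but still carries the attacker’s hostname, so the allowlist comparison fails regardless of where the packet was delivered.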

    Require real authentication, not network position

    A service that demands a credential the attacker’s page does not have is safe from rebinding even if the request reaches it. Binding to 127.0.0.1 is not authentication. It is a filter on where connections may originate, and rebinding is precisely a way to originate from there. Bind to localhost if you like, but also require an authenticated session, and do not rely on a header that scripts can obtain and replay. The whole Transmission issue was a custom header standing in for real access control.
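
    One way to sketch a real credential, assuming the agent generates a random token at install time and shares it with the legitimate app out of band, for example through a file only the local user can read. The Bearer scheme and both function names are assumptions made for illustration:

```python
import hmac
import secrets


def new_agent_token():
    """Generate a random secret that a remote page has no way to learn."""
    return secrets.token_urlsafe(32)


def is_authorized(headers, expected_token):
    """Accept the request only if it carries the expected bearer token."""
    supplied = headers.get("Authorization", "")
    prefix = "Bearer "
    if not supplied.startswith(prefix):
        return False
    # Constant time comparison avoids leaking how much of the token matched.
    return hmac.compare_digest(
        supplied[len(prefix):].encode("utf-8"),
        expected_token.encode("utf-8"),
    )
```

    The difference from the Transmission header is that nothing the service returns to an unauthenticated caller reveals this value.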

    Filter private answers at the resolver

    A DNS resolver can refuse to return private addresses in answers for public names. If a name under a public domain tries to resolve to 192.168.x.x, 10.x.x.x, 127.0.0.1, or the link local metadata address, the resolver drops or rewrites that answer, and the rebind never completes. Tools like dnsmasq and several home and enterprise resolvers offer exactly this rebinding protection, and some public resolvers strip private ranges out of responses by default. It is a network level safety net rather than a per service fix, but it stops a large fraction of attacks before they reach any device.
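
    The filtering rule itself is small. The following is a sketch of the decision only, not of how any particular resolver implements it, and the local suffix exemption list is an assumption added so that genuinely internal names keep working:

```python
import ipaddress


def is_rebind_candidate(answer_ip):
    """True when an address is private, loopback, or link local."""
    ip = ipaddress.ip_address(answer_ip)
    return ip.is_private or ip.is_loopback or ip.is_link_local


def filter_answers(name, answers, local_suffixes=(".lan", ".internal")):
    """Drop private answers for public names; leave local names untouched."""
    if name.rstrip(".").lower().endswith(local_suffixes):
        return list(answers)
    return [a for a in answers if not is_rebind_candidate(a)]
```

    Note that Python’s is_private also covers documentation and other reserved ranges, which is stricter than the RFC 1918 ranges alone.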

    Adopt Private Network Access in the browser

    The browser platform itself is closing the gap. The Private Network Access specification, formerly known as CORS-RFC1918, restricts a public page from making requests into a private network unless the private service explicitly opts in through a CORS preflight. A request from a public origin to a private IP triggers a preflight that the target must answer with the right header, and a router or media server that was never built to opt in will not send it. This shifts the default from quietly allowing public to private requests toward refusing them, which is exactly the assumption rebinding has always exploited.
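
    For a private service that does want to be reachable from one specific public origin, the opt in can be sketched as below. The header names are the ones the Private Network Access draft describes, browser support varies by version, and the trusted origin set and function name are assumptions for illustration:

```python
# Hypothetical set of public origins this private service agrees to talk to.
TRUSTED_ORIGINS = frozenset({"https://app.acmenotes.com"})


def preflight_response_headers(request_headers, trusted=TRUSTED_ORIGINS):
    """Build response headers for an OPTIONS preflight, or {} to refuse."""
    origin = request_headers.get("Origin", "")
    if origin not in trusted:
        return {}
    headers = {"Access-Control-Allow-Origin": origin, "Vary": "Origin"}
    wants_private = request_headers.get(
        "Access-Control-Request-Private-Network", ""
    ).lower() == "true"
    if wants_private:
        headers["Access-Control-Allow-Private-Network"] = "true"
    return headers
```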

    The assumption that breaks

    Underneath every variation of this attack sits one assumption. The same origin policy trusts the hostname, and it assumes the hostname keeps pointing at the same machine for as long as the page lives. That assumption is convenient and almost always true, which is why it survived. DNS was designed to let a name move between addresses, and the browser cannot tell a legitimate move from a malicious one, because both look like a name resolving to a new IP. The browser is not broken. It is honoring a contract that DNS never promised to keep.

    So the defenses that last are the ones that stop deciding trust from where a request appears to come from. Check the Host header, demand a real credential, and refuse private answers for public names. Each of those replaces a trust in network position with a check the attacker cannot satisfy. The gap between what a component assumes about a name and what an attacker can actually arrange with DNS is the kind of flaw you find by asking what each layer trusts and why it keeps trusting it after the situation has moved, rather than by scanning for a known bad string. It is exactly the kind of assumption an autonomous researcher built to test assumptions is meant to catch. Learn more about that approach on our about page.

    Frequently asked questions

    What is DNS rebinding in simple terms?

    It is an attack where a web page served from a hostname the attacker controls swaps that name’s IP address right after the page loads. The browser keeps treating requests as one origin because the hostname never changed, so attacker JavaScript can reach private services like a router or a localhost daemon. It works because the same origin policy compares the host as a name and never rechecks which IP that name currently resolves to.

    Why does a short DNS TTL matter for the attack?

    The time to live tells resolvers how many seconds they may cache an answer before asking again. The attacker sets it to about one second so the browser quickly performs a fresh lookup and receives a private address like 192.168.1.1 in place of the original public one. The result is a time of check to time of use gap, walked step by step in Brannon Dorsey’s writeup Attacking Private Networks from the Internet with DNS Rebinding.

    Has DNS rebinding led to a real vulnerability?

    Yes. CVE-2018-5702 was a remote code execution flaw in the Transmission BitTorrent client found by Tavis Ormandy, where a page could reach the local RPC interface through a rebind and run commands. Researchers also documented unauthenticated control of Google Home, Chromecast, Roku, Sonos, and routers via the same technique. The Transmission fix added Host header validation to reject requests that do not match the expected name.

    How do you defend against DNS rebinding?

    Validate the Host header on every request and reject names you do not serve, since the browser still sends the attacker’s hostname after the rebind. Require real authentication instead of trusting that a request came from localhost or a private subnet, and have resolvers drop private IPs from answers for public names. Browsers are also restricting public to private requests through the Private Network Access specification, formerly CORS-RFC1918.

  • What Is Subdomain Takeover and Why a Forgotten DNS Record Is Dangerous

    What Is Subdomain Takeover and Why a Forgotten DNS Record Is Dangerous

    A subdomain takeover happens when a DNS record on a domain you own keeps pointing at a cloud resource that no longer exists, and an attacker registers that same resource name on the provider to serve their own content from your trusted subdomain. The record is still there. The thing it pointed at is gone. Somebody else claims the empty slot, and now status.acmenotes.com answers with a page the attacker wrote, on a name your users already trust. This post walks the mechanism one step at a time: how a DNS record outlives the resource behind it, why the gap is claimable, what control of a trusted subdomain actually unlocks, and how to close the window for good.

    The dangling pointer at the heart of a subdomain takeover

    A domain name is a tree. acmenotes.com is the apex, and below it you hang names like www, blog, status, and app. Each of those names needs a DNS record to tell the world where it lives. The most common kind for a hosted service is a CNAME, which is an alias. It says, in effect, do not look here, look over there instead.

    Say your team puts the marketing status page on a managed host. You create:

    status.acmenotes.com.  CNAME  acme-status.someprovider.io.

    Now any browser that asks for status.acmenotes.com is told to go ask acme-status.someprovider.io, and the provider serves the page. This works because you registered the resource name acme-status on that provider, and the provider mapped it back to your content. Two things are now linked: the DNS alias you control, and the resource slot the provider holds for you.

    Months later the status page is retired. An engineer deletes the resource on the provider, closes the account, and moves on. The provider releases the name acme-status back into its pool of available names. But the CNAME in your DNS zone is never touched. It still says status.acmenotes.com aliases to acme-status.someprovider.io. The alias now points at a slot that belongs to nobody. That is a dangling DNS record, and OWASP describes the condition plainly: a DNS record, typically a CNAME, points to a cloud resource or third party service that has been deprovisioned or no longer exists.

    The trouble is structural, not careless. Cloud resources are short lived and DNS records are persistent. Teams spin up and tear down services constantly, and the records that point at them tend to pile up unless somebody deletes them on purpose. The pointer outlives the thing it pointed at.

    Why the empty slot is claimable

    An attacker enumerating your subdomains looks for exactly this shape. They resolve status.acmenotes.com, follow the alias to acme-status.someprovider.io, and ask for the page. Instead of your content they get a provider error that says the resource is not configured. Each provider has a recognizable fingerprint for that state. On Amazon S3 the bucket returns The specified bucket does not exist. On GitHub Pages the response reads There isn't a GitHub Pages site here. On Heroku it is No such app. On some Azure endpoints the name simply fails to resolve at all and the DNS layer returns NXDOMAIN. That distinctive error is the signal that the alias is dangling and the slot is open.
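
    The fingerprint check is plain substring matching. A short sketch using only the three error strings quoted above, with a function name invented for the example:

```python
# Error strings quoted in the text; a real list would be longer.
FINGERPRINTS = {
    "Amazon S3": "The specified bucket does not exist",
    "GitHub Pages": "There isn't a GitHub Pages site here.",
    "Heroku": "No such app",
}


def match_fingerprint(body):
    """Return the provider whose unclaimed-slot error appears in body, or None."""
    for provider, marker in FINGERPRINTS.items():
        if marker in body:
            return provider
    return None
```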

    From there the takeover is just a registration. The attacker creates their own account on the provider and registers the resource name your record still points at, acme-status. The provider has no memory that this name was once yours. It hands the name to whoever asks first. The moment the attacker holds acme-status.someprovider.io, your CNAME resolves to their content. They did not touch your DNS. They did not breach your account. They claimed the address your own record was still advertising. The can I take over xyz project catalogs which providers leave this door open and the exact error string each one shows when a slot is unclaimed.

    Two conditions have to line up for this to work, and both are common. First, your external DNS server has a subdomain record configured to point at a resource or endpoint that is no longer active. Second, the provider hosting that endpoint does not handle ownership verification properly, so it lets a new account register the name without proving any connection to your domain. When a provider does verify ownership, the second condition fails and the slot stays safe even though the record dangles. When it does not, the dangling record is enough on its own.

    It is not only CNAME records

    The alias case is the most frequent, but the same shape appears across record types, and the impact climbs as you move up the tree. A dangling A record that pins a subdomain to an IP address can be taken over if that address is released back into a cloud provider’s shared pool and the attacker manages to acquire it. A dangling MX record can route mail for the subdomain to a host the attacker controls, which lets them receive password resets and verification mails sent to that name. The worst case is a dangling NS record. Nameserver delegation hands authority for a whole zone to another server. If that server is deprovisioned and the delegation is left in place, an attacker who claims it gains control over the entire DNS zone under that name, not just one page. An NS takeover is less likely but has the highest impact, because it is full control of the subtree rather than a single endpoint.

    The attacker never breaks into your domain. Your domain keeps pointing at an address you abandoned, and the attacker simply moves into it.

    What control of a trusted subdomain unlocks

    Serving a page from status.acmenotes.com sounds like vandalism, a defacement at worst. It is far more than that, because the rest of your application has been built to trust names under acmenotes.com. The browser, your cookies, your login flow, and your content policy all make decisions based on the domain. A taken over subdomain steps inside that trust boundary and quietly inherits a pile of privileges it was never supposed to have.

    Phishing that passes every glance test

    The simplest payoff is a login page. The attacker serves a pixel perfect copy of your sign in form at status.acmenotes.com and mails the link to your users. Everything a careful user checks holds up. The domain is really yours. The TLS certificate is valid, because the attacker controls the subdomain and can request one from any certificate authority on the spot. There is no typosquatting tell, no lookalike character, no foreign domain. The credentials users type go straight to the attacker. This is the same trust that makes phishing on a controlled subdomain so much more effective than a random external link.

    Cookies scoped to the parent domain

    Cookies are where this turns from convincing into mechanical. A cookie set with Domain=.acmenotes.com is sent by the browser to every subdomain under it, including the one the attacker now owns. If a session cookie or a preference cookie is scoped to the parent domain and is not marked HttpOnly, JavaScript running on the attacker’s page can read it directly with document.cookie. The attacker did not need to defeat your login. The browser handed them the session cookie because, as far as it can tell, the request came from a legitimate part of acmenotes.com. Parent domain cookie scoping was a convenience for sharing sessions across app and www. It now shares them with the attacker too.

    Even cookies marked HttpOnly are not fully out of reach. HttpOnly hides a cookie from scripts, not from the server that receives it, so if the takeover gives the attacker a host that can see request headers, a parent scoped cookie arrives in the Cookie header of every request to the subdomain. The attacker can also set their own cookies on the parent domain from the controlled subdomain, which opens session fixation. The boundary everyone assumed sat at the domain edge actually ran between subdomains, and one of those subdomains just changed hands.
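
    The parent domain scoping described above comes down to a suffix rule. This is a simplified sketch of that rule, ignoring IP address hosts and the public suffix list, with a function name chosen for the example:

```python
def cookie_domain_matches(request_host, cookie_domain):
    """True when a cookie with this Domain attribute is sent to request_host."""
    host = request_host.lower().rstrip(".")
    domain = cookie_domain.lower().lstrip(".")
    # Either the same name, or a subdomain separated by a dot.
    return host == domain or host.endswith("." + domain)
```

    The taken over status subdomain satisfies the rule exactly as app and www do, which is why the browser has no reason to withhold the cookie.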

    OAuth and SSO redirect abuse

    Login flows lean on a list of trusted return addresses. When a user signs in through OAuth or single sign on, the identity provider sends the token or authorization code back to a redirect_uri, and it will only send it to a destination on an approved allowlist. Teams frequently approve patterns rather than exact addresses, allowlisting anything under *.acmenotes.com so they do not have to update the list every time they add a subdomain. A taken over subdomain matches that wildcard. The attacker starts an authentication flow with redirect_uri=https://status.acmenotes.com/callback, the identity provider sees a host that passes the allowlist, and it delivers the authorization code or token to a page the attacker controls. The fix the standards push is exact match redirect URIs precisely because wildcard allowlists turn any one weak subdomain into a token leak.
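
    The two allowlist styles can be compared side by side. In this sketch the registered callback URL and both function names are assumptions made for illustration:

```python
from fnmatch import fnmatchcase
from urllib.parse import urlsplit

# Hypothetical exact callback registered for the invented Acme Notes app.
REGISTERED = frozenset({"https://app.acmenotes.com/callback"})


def wildcard_allows(redirect_uri, pattern="*.acmenotes.com"):
    """Pattern style check: any host under the parent domain passes."""
    host = urlsplit(redirect_uri).hostname or ""
    return fnmatchcase(host, pattern)


def exact_allows(redirect_uri, registered=REGISTERED):
    """Exact match check: only a registered URI passes, compared as a whole string."""
    return redirect_uri in registered
```

    With the wildcard, the callback on the taken over subdomain passes. With exact match it fails, and the only accepted destination is the one the team registered on purpose.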

    Bypassing a Content Security Policy allowlist

    A Content Security Policy is a list of sources a browser is allowed to load scripts and other content from. Many policies list a wildcard like script-src https://*.acmenotes.com so that internal subdomains can host assets. The policy is meant to be a wall against injected scripts from anywhere else. A taken over subdomain sits inside the wildcard, so a script served from status.acmenotes.com satisfies the policy. If the attacker also has an HTML injection or cross site scripting foothold on the main app, the CSP that should have blocked their payload now waves it through, because the source is an allowlisted subdomain they happen to own. The same wildcard that bypasses the OAuth allowlist bypasses the script allowlist. To see how a policy like that grades, and to spot a wildcard before an attacker does, paste your response headers into our free security headers and CSP analyzer.
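
    Spotting the wildcard in your own policy is easy to automate. A rough sketch that splits a policy string on semicolons and whitespace, which covers common policies but is not a full CSP parser:

```python
def wildcard_sources(csp_header, directive="script-src"):
    """Return the source expressions in one directive that contain a wildcard."""
    for part in csp_header.split(";"):
        tokens = part.strip().split()
        if tokens and tokens[0].lower() == directive:
            return [token for token in tokens[1:] if "*" in token]
    return []
```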

    Defeating same site assumptions

    A lot of web security quietly rests on the idea that everything under one registrable domain is one trust zone. Same site cookie rules lean on it, and so do CORS allowlists that permit any origin under the parent, frames that are trusted because they share the domain, and internal tools that skip a permission check for requests coming from a sibling subdomain. Each of those is a reasonable shortcut right up until one subdomain is controlled by someone outside the organization. After the takeover the attacker speaks from inside the same site, and every assumption built on that sameness now works in their favor.

    How one weak subdomain chains into a full compromise

    The individual effects above are bad, but the real danger is that they combine. Walk a plausible chain on our invented app, Acme Notes. The main app at app.acmenotes.com sets a session cookie scoped to .acmenotes.com so the marketing site and the app can share a login. It also ships a Content Security Policy that allowlists script-src https://*.acmenotes.com for shared widgets, and its single sign on flow approves any redirect_uri under *.acmenotes.com. None of those three choices is reckless on its own. Each one is a normal convenience.

    Now the attacker takes over the retired status.acmenotes.com. They host a script there. Because the subdomain matches the CSP wildcard, that script loads inside the main app wherever they find a place to reference it, and once it runs there it can read any session cookie that is not marked HttpOnly. If the cookie is marked HttpOnly and stays out of reach of scripts, they pivot to the login flow instead, starting an authentication request with redirect_uri=https://status.acmenotes.com/callback, which the wildcard allowlist accepts, and the identity provider delivers the authorization code to their page. Three separate trust shortcuts, each defensible alone, become one path from a forgotten DNS record to a stolen session. That is why a single dangling subdomain rarely stays a small problem.

    How attackers find a dangling record before you do

    None of this requires luck. The reconnaissance is routine. An attacker collects the subdomains of a target from certificate transparency logs, which publicly record the certificates that publicly trusted authorities issue for a name, from passive DNS datasets, and from brute forcing common names. Then they resolve each one and check where the alias lands. Any subdomain whose CNAME points at a provider and returns one of the known not configured fingerprints is a candidate. Tooling automates the whole sweep, matching responses against the same fingerprint list that the can I take over xyz project maintains. The economics favor the attacker. They scan thousands of names cheaply, and they only need one forgotten record. You have to remember all of them.

    It is worth naming where this sits relative to neighboring bugs. A subdomain takeover is not server side request forgery, where a server is tricked into making a request on the attacker’s behalf, and it is not the credential theft path from a cloud instance metadata service. But it rhymes with both. All three come from a component trusting a name or a location more than the situation deserves. Here the trusted thing is the domain label, and the betrayal is that the label kept its meaning after the resource behind it disappeared.

    Preventing a subdomain takeover

    The good news is that this class of bug has a clean root cause, which means it has a clean fix. The window only exists because of an ordering mistake during decommissioning. Close that ordering and the window never opens.

    Deprovision in the right order

    The single most important habit is sequencing. When you retire a service, the order is fixed:

    • First serve a maintenance page or redirect from the subdomain, so nothing breaks abruptly.
    • Then update or remove the DNS record so the name no longer points at the provider slot.
    • Allow time for DNS to propagate so caches expire.
    • Only then decommission the cloud resource.

    The common mistake is doing these steps in reverse, deleting the cloud resource first. That creates an immediate window for takeover that persists until someone notices the dangling record. Delete the pointer before you release the thing it points at, and there is never an empty slot for anyone to claim.
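
    The ordering rule can be turned into a guard in a teardown script. This sketch assumes the zone is available as a simple mapping from record name to CNAME target, and both function names are invented for the example:

```python
def records_pointing_at(zone, target):
    """List the record names whose CNAME still aliases to target."""
    want = target.rstrip(".").lower()
    return [
        name for name, cname in zone.items()
        if cname.rstrip(".").lower() == want
    ]


def safe_to_delete(zone, target):
    """A resource is safe to release only when no record points at it."""
    return not records_pointing_at(zone, target)
```

    A teardown job that refuses to run while safe_to_delete is false makes the reverse order impossible rather than merely discouraged.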

    Inventory every record and tie it to an owner

    You cannot protect records you do not know you have. Keep a live inventory of every DNS record in every zone, and link each one to the resource and the team that owns it. When a resource is torn down, that link is what tells you which record has to go with it. Records without a known owner are exactly the ones that rot into dangling aliases, so treat an unowned record as a finding, not a footnote.

    Claim and verify the resource you point at

    Wherever a provider offers domain verification, use it. A claimed and verified resource cannot be silently re registered by a stranger, because the provider checks ownership before handing the name out. This shrinks the set of providers where a free registration is enough to steal the slot, and it is the difference between an alias that is merely unused and one that is actually open for the taking.

    Monitor for the dangling state continuously

    Treat detection as ongoing, not a one time audit. For every CNAME in your zone, resolve the target on a schedule and check that it still exists and returns the content you expect, rather than a provider error page. Watch for two signals in particular. The first is NXDOMAIN, where the aliased target no longer resolves at all. The second is a known service fingerprint in the response body, one of those distinctive not configured error strings that says the resource has been removed. A weekly automated scan that flags either condition turns a silent dangling record into an alert before an attacker finds it. If you already have a CNAME target and the error page it serves, our free subdomain takeover fingerprint checker matches them against the known service fingerprints so you can confirm a dangling slot fast. Certificate transparency logs help here too, since they reveal subdomains you may have forgotten you ever created.
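
    The scheduled check reduces to the two signals named above. In this sketch the DNS lookup and the HTTP fetch are passed in as functions, so the logic can be read and tested without a network. The function names and the shape of the zone mapping are assumptions:

```python
def classify_alias(name, target, resolve, fetch, markers):
    """Return None if the alias looks healthy, else a short reason string."""
    if not resolve(target):
        return f"{name}: target {target} does not resolve (NXDOMAIN)"
    body = fetch(name)
    for marker in markers:
        if marker in body:
            return f"{name}: provider fingerprint found: {marker!r}"
    return None


def scan(zone, resolve, fetch, markers):
    """Check every CNAME in the zone and collect the ones that need attention."""
    findings = []
    for name, target in sorted(zone.items()):
        reason = classify_alias(name, target, resolve, fetch, markers)
        if reason:
            findings.append(reason)
    return findings
```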

    The assumption that breaks

    Step back from the records and the fingerprints and one assumption is left holding everything up. DNS assumes that the record still points at something you own. A CNAME is a promise about a relationship between two names, and the relationship is only safe while you control both ends. The system has no way to notice when one end quietly slips away. The provider forgets you the instant you delete the resource. Your zone keeps advertising the alias as if nothing changed. Nothing in the protocol reconciles those two views, so the gap between them sits open, advertised to the whole internet, waiting.

    The bug is not a broken DNS server or a sloppy provider. The bug is a pointer that outlived the resource it pointed at, and a trust boundary that everyone drew at the domain edge when it actually ran between the subdomains. That gap between what a system assumes about a name and what an attacker can actually arrange is the kind of flaw you find by asking what each component trusts and why it still trusts it, rather than by scanning for a known bad string. It is exactly the kind of assumption an autonomous researcher built to test assumptions is meant to catch. Delete the record before you release the resource, verify what you point at, and watch your aliases for the day one of them stops pointing home. Learn more about that approach on our about page.

    Frequently asked questions

    What causes a subdomain takeover?

    It is caused by a dangling DNS record. A subdomain has a CNAME aliasing it to a cloud resource, and when that resource is deleted or the account is closed, the provider releases the name but the DNS record is never removed. The alias now points at an empty slot anyone can register. The OWASP Subdomain Takeover Prevention Cheat Sheet describes this dangling record as the core condition.

    How does an attacker claim the dangling subdomain?

    They enumerate your subdomains, follow each alias to its provider target, and look for a not configured error such as The specified bucket does not exist on S3 or There isn't a GitHub Pages site here. on GitHub Pages. That error means the slot is free. The attacker then registers the same resource name on the provider, and your unchanged CNAME immediately serves their content. The can I take over xyz project catalogs the vulnerable providers and their exact fingerprints.

    Why is a taken over subdomain so dangerous?

    Because the subdomain sits inside the trust boundary of your domain. The attacker can host a convincing phishing login on a real name with a valid certificate, read cookies scoped to the parent domain, match wildcard OAuth redirect allowlists to steal tokens, and satisfy a Content Security Policy that allowlists *.yourdomain.com. Every assumption built on names being under one trusted domain now works in the attacker’s favor.

    How do you prevent a subdomain takeover?

    Deprovision in the right order: remove or update the DNS record before you delete the cloud resource, never the reverse. Keep an inventory of every DNS record tied to its owner, use provider domain verification to claim the resources you point at, and monitor every CNAME on a schedule for NXDOMAIN or a known service fingerprint. This maps to the weakness MITRE tracks as CWE-350, relying on a name resolving to something you still control.

  • What Is HTTP Request Smuggling

    What Is HTTP Request Smuggling

    An HTTP request looks like one clean unit of work: a method, a path, some headers, and a body. But on the modern web your request almost never reaches a single server. It passes through a front end first, a proxy or load balancer or content delivery network, which then forwards it to a back end. HTTP request smuggling is what happens when those two servers read the same bytes and disagree about where one request stops and the next one begins. When they disagree, an attacker can hide a second request inside the first, and the back end will glue it onto whatever victim request arrives next. This post walks the mechanics precisely: why two length headers fight, what one smuggled prefix does to the next person in line, how HTTP/2 reopens the wound through downgrades, and how to shut it.

    One connection, two readers, two opinions

    The front end and the back end usually keep a connection open between themselves and reuse it for many requests from many users. This is normal and efficient. It also means the back end is reading a continuous stream of bytes and slicing it into requests on its own. The front end already sliced the same stream. As long as both slice it at the same byte, everything is fine and nobody notices the machinery underneath.

    The attack lives in the moment they slice at different bytes. If the front end thinks request A ended at byte 100 but the back end thinks it ended at byte 80, then 20 bytes the front end believed were part of A are sitting at the front of the back end’s buffer, waiting. Those 20 bytes are attacker chosen. When the next real request arrives, the back end reads the leftover 20 bytes first, then the victim’s bytes, and treats the whole thing as one request. The victim’s request has been prefixed with the attacker’s smuggled content, and the victim never sent it.

    Request smuggling is not a parsing bug in one server. It is a disagreement between two servers about a question they both think has an obvious answer: where does this request end?

    Why a request has two ways to say how long it is

    To send a body in HTTP/1.1 you have to tell the server how many bytes to read. The protocol gives you two ways to do that, and that redundancy is the whole problem.

    The first way is Content-Length. You count the bytes of the body and put the number in a header. Content-Length: 11 means read exactly eleven bytes after the blank line, and that is the body. Simple and exact.

    The second way is Transfer-Encoding: chunked. Instead of declaring the total up front, you send the body as a series of chunks. Each chunk starts with its own size written in hexadecimal on its own line, then the chunk data, then a line ending. A chunk of size zero, followed by an empty line, marks the end of the body. So a chunked body that carries the text q=smuggling looks like this:

    Transfer-Encoding: chunked
    
    b
    q=smuggling
    0
    
    

    The b is hexadecimal for 11, the length of q=smuggling. The 0 on its own line is the terminator. The reader is supposed to stop there. Everything before the 0 chunk is the body, and everything after it is the start of the next request.
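
    A small decoder makes the stopping rule concrete. This is a teaching sketch that assumes two byte CRLF line endings and no chunk extensions or trailer fields, and the function name is invented:

```python
def read_chunked(data):
    """Decode one chunked body and return (body, bytes_left_over)."""
    body = b""
    pos = 0
    while True:
        line_end = data.index(b"\r\n", pos)
        size = int(data[pos:line_end].decode("ascii"), 16)  # size is hexadecimal
        pos = line_end + 2
        if size == 0:
            # The zero chunk must be followed by an empty line.
            if data[pos:pos + 2] != b"\r\n":
                raise ValueError("expected empty line after last chunk")
            return body, data[pos + 2:]
        body += data[pos:pos + size]
        pos += size
        if data[pos:pos + 2] != b"\r\n":
            raise ValueError("chunk data not followed by a line ending")
        pos += 2
```

    Whatever comes back as the second value is what this reader treats as the start of the next request.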

    Two ways to declare length is one way too many. What is a server supposed to do when a single request arrives carrying both a Content-Length and a Transfer-Encoding: chunked header that point at different boundaries? The standard has an answer. RFC 9112 section 6.3 says that when both are present, Transfer-Encoding wins and Content-Length is ignored. The same standard warns that a request carrying both may be an attempt at request smuggling. The trouble is that not every server in the chain obeys the rule, and the ones that disagree are the ones you can attack.
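
    The same RFC section also permits a server to reject a request that carries both headers instead of picking a winner, and that is the stricter choice sketched here. Treating any Transfer-Encoding value other than plain chunked as a rejection is a simplification for the example, and the function name and return labels are invented:

```python
def framing_decision(headers):
    """Decide how to frame a request body from (name, value) header pairs."""
    lengths = {
        value.strip() for name, value in headers
        if name.strip().lower() == "content-length"
    }
    encodings = [
        value.strip().lower() for name, value in headers
        if name.strip().lower() == "transfer-encoding"
    ]
    if lengths and encodings:
        return "reject"  # both length mechanisms present: ambiguous
    if len(lengths) > 1:
        return "reject"  # conflicting Content-Length values
    if encodings:
        return "chunked" if encodings == ["chunked"] else "reject"
    if lengths:
        return "content-length"
    return "no-body"
```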

    Walking one HTTP request smuggling example byte by byte

    The cleanest way to see HTTP request smuggling is to follow one example slowly. The variants are named after which header each server trusts. CL.TE means the front end honors Content-Length and the back end honors Transfer-Encoding. Watch what that mismatch does to one crafted request.

    The attacker sends a single request that includes both length headers on purpose:

    POST / HTTP/1.1
    Host: acme-notes.example
    Content-Length: 58
    Transfer-Encoding: chunked
    
    0
    
    GET /admin HTTP/1.1
    Host: acme-notes.example
    Foo: x

    Now read it twice, once as each server.

    The front end trusts Content-Length: 58. It counts 58 bytes of body after the blank line, assuming the usual two byte line endings: five for the 0 chunk, its line ending, and the empty line that follows, and 53 for the three smuggled lines with nothing after the final x. That count reaches the very end of what the attacker sent, so as far as the front end is concerned everything after the headers is opaque body data belonging to one request. It never parses GET /admin as a request of its own, so none of its path rules fire. It forwards the whole thing, every byte, to the back end on the shared connection. The front end believes it forwarded one ordinary POST.

    The back end trusts Transfer-Encoding: chunked and ignores the Content-Length entirely. It reads the body as chunks. The very first chunk it sees is 0, the terminator. So the back end decides the body is empty and the POST is finished at that point. But the bytes after the 0 chunk did not vanish. The back end now has this still sitting in its buffer, unread:

    GET /admin HTTP/1.1
    Host: acme-notes.example
    Foo: x

    The back end treats those leftover bytes as the beginning of the next request on the connection. They are not attributed to the attacker. They get stitched onto whatever arrives next. The smuggled GET /admin is the prefix, and it is deliberately incomplete: the attacker ends it at Foo: x with no line ending, so the header is left open. That open header swallows the first line of the next victim’s request, so the fused result still parses as one valid request.
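
    The two readings can be reproduced in a few lines. This sketch builds a CL.TE request of the same shape, computing Content-Length so that it covers the whole body, and models each server as a function that reports which bytes it leaves unread. Both readers are deliberately minimal and are not how any real proxy or server is written:

```python
CRLF = b"\r\n"
SMUGGLED = CRLF.join([b"GET /admin HTTP/1.1", b"Host: acme-notes.example", b"Foo: x"])
BODY = b"0" + CRLF + CRLF + SMUGGLED
REQUEST = CRLF.join([
    b"POST / HTTP/1.1",
    b"Host: acme-notes.example",
    b"Content-Length: " + str(len(BODY)).encode("ascii"),
    b"Transfer-Encoding: chunked",
    b"",
    BODY,
])


def split_head_body(raw):
    """Split a raw request at the first empty line."""
    head, _, body = raw.partition(CRLF + CRLF)
    return head, body


def front_end_leftover(raw):
    """Reader that trusts Content-Length: return bytes beyond the declared body."""
    head, body = split_head_body(raw)
    length = 0
    for line in head.split(CRLF)[1:]:
        name, _, value = line.partition(b":")
        if name.strip().lower() == b"content-length":
            length = int(value.strip())
    return body[length:]


def back_end_leftover(raw):
    """Reader that trusts chunked framing; handles only an immediate zero chunk."""
    _, body = split_head_body(raw)
    terminator = b"0" + CRLF + CRLF
    if not body.startswith(terminator):
        raise ValueError("sketch only handles an empty chunked body")
    return body[len(terminator):]
```

    The front end reader leaves nothing behind, while the back end reader leaves exactly the smuggled lines in its buffer.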

    What the prefix does to the next victim

    Say an ordinary user sends a normal request a moment later:

    GET / HTTP/1.1
    Host: acme-notes.example
    Cookie: session=victim-session-here
    ...

    The back end already had the smuggled prefix waiting. So what it actually parses is the attacker’s lines followed by the victim’s lines fused together. The Foo: header absorbs the victim’s request line, and the request the back end runs is the attacker’s GET /admin carrying the victim’s session cookie. The victim asked for the home page and instead drove a request the attacker authored. Depending on the app, this poisons the response queue so the victim gets back a page meant for someone else, or it captures the victim’s own request data into a place the attacker can read, or it slips a request past the front end’s access rules because the front end only ever saw the harmless looking POST.
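
    The fusion is easy to reproduce by concatenating the two byte strings and parsing the result once. The parser below is a simplified stand in that keeps the first value of a repeated header, which is an assumption, since real servers differ on duplicates:

```python
PREFIX = b"GET /admin HTTP/1.1\r\nHost: acme-notes.example\r\nFoo: x"
VICTIM = (
    b"GET / HTTP/1.1\r\n"
    b"Host: acme-notes.example\r\n"
    b"Cookie: session=victim-session-here\r\n"
    b"\r\n"
)


def parse_request(raw):
    """Return (request_line, headers) for the first request in raw."""
    head = raw.split(b"\r\n\r\n", 1)[0].decode("ascii")
    request_line, *header_lines = head.split("\r\n")
    headers = {}
    for line in header_lines:
        name, _, value = line.partition(":")
        # Keep the first occurrence of each header name.
        headers.setdefault(name.strip().lower(), value.strip())
    return request_line, headers


request_line, headers = parse_request(PREFIX + VICTIM)
```

    The request line is the attacker’s, the victim’s own request line has become the value of Foo, and the victim’s Cookie header rides along as if it belonged to the attacker’s request.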

    That last point is the sharp one. Front ends are often where access control and request filtering live. They block /admin, strip dangerous headers, enforce rate limits. A smuggled request never passes the front end as a request at all. It rides inside the body of a request the front end approved, then becomes a request only after it is already past the gate. The control was real. It was just looking at the wrong bytes. This is the same shape of problem we describe in our web security glossary: a check that runs on a different view of the data than the action it is meant to protect.

    It helps to be precise about the three ways a smuggled prefix turns into damage, because they are not the same attack and they do not need the same conditions.

    • Bypassing front end controls. The smuggled request reaches paths and methods the front end was supposed to refuse. The attacker smuggles a request to a restricted route, and because the front end only inspected the approved outer request, the inner one runs with no filter between it and the back end.
    • Capturing another user’s request. The attacker smuggles a prefix that ends in an open-ended field, such as a comment body or a search parameter, so the victim’s incoming request, cookies and all, is absorbed as that field’s value and stored where the attacker can later read it back.
    • Poisoning the response queue. Once the boundary between requests is off by one, the back end’s responses fall out of step with who asked for them. The attacker’s smuggled request consumes a response slot, and the next user receives a response meant for a different request. Chain this with a reflected input or a cached page and a single smuggle can serve a poisoned response to many users.

    The mirror image, and the obfuscation trick

    TE.CL is the same idea flipped. The front end honors Transfer-Encoding and the back end honors Content-Length, so the attacker sends a chunked body alongside a Content-Length that is too short. The front end forwards the whole chunked body, the back end stops reading at the declared length, and the bytes after that point are read as a new request. The roles swap but the outcome is identical: a prefix left in the back end’s buffer.

    TE.TE is sneakier. Both servers support Transfer-Encoding, so in theory they agree. The attacker breaks that agreement by obfuscating the header so that one server recognizes it and the other does not. A header written as Transfer-Encoding: xchunked, or with odd spacing, or duplicated, or with a tab in a place a strict parser rejects but a lenient one accepts, can make one server fall back to Content-Length while the other still reads chunks. The instant one server stops honoring Transfer-Encoding, you are back to a CL versus TE split, and the smuggle works again. The lesson is that small differences in how strictly each server parses a single header line are enough to desync the chain.
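
    To make the disagreement concrete, here is a sketch with two hypothetical parsers. Neither models a real server. One is strict about the header name and value, the other is forgiving, and the interesting inputs are the ones where they give different answers.

```python
def strict_is_chunked(header_line: str) -> bool:
    """Accept only the exact header name and the exact 'chunked' coding."""
    name, colon, value = header_line.partition(":")
    if not colon or name.lower() != "transfer-encoding":
        return False  # a space before the colon makes the name invalid here
    return value.strip(" \t").lower() == "chunked"


def lenient_is_chunked(header_line: str) -> bool:
    """Tolerate whitespace around the name and look for 'chunked' anywhere in the value."""
    name, colon, value = header_line.partition(":")
    if not colon or name.strip().lower() != "transfer-encoding":
        return False
    return "chunked" in value.lower()


candidates = [
    "Transfer-Encoding: chunked",    # plain form, both parsers agree
    "Transfer-Encoding: xchunked",   # misspelled coding
    "Transfer-Encoding : chunked",   # space before the colon
]

# Any header the two parsers read differently is a desync candidate.
disagreements = [h for h in candidates
                 if strict_is_chunked(h) != lenient_is_chunked(h)]
for header in disagreements:
    print(repr(header))
```

    Put the strict parser in front and the lenient one behind, or the other way round, and each disagreement is a spot where one server falls back to Content-Length while the other reads chunks.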

    HTTP/2 was supposed to fix this, and then it did not

    HTTP/2 removes the ambiguity at its root. It does not send headers and bodies as a text stream you have to slice. Each message body is carried in binary data frames, and every frame has a built in length field. The protocol knows exactly where a message ends because the framing tells it, not because two text headers happen to agree. End to end HTTP/2 has no place for a length disagreement to hide. If the whole chain spoke HTTP/2 from the browser to the back end, this class of bug would mostly be over.
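
    The built in length field is easy to see in code. This sketch reads the fixed 9 byte frame header that HTTP/2 puts in front of every frame: a 24 bit payload length, a type, flags, and a 31 bit stream identifier. The frame here is built by hand for illustration, not captured from a real connection.

```python
import struct

FRAME_HEADER_LEN = 9      # fixed size of every HTTP/2 frame header
TYPE_DATA = 0x0           # frame type that carries message body bytes
FLAG_END_STREAM = 0x1     # set on the last frame of a message


def parse_frame_header(buf: bytes):
    """Return (payload_length, frame_type, flags, stream_id) from a frame header."""
    if len(buf) < FRAME_HEADER_LEN:
        raise ValueError("an HTTP/2 frame header is 9 bytes")
    hi, lo, frame_type, flags, stream = struct.unpack("!BHBBI", buf[:FRAME_HEADER_LEN])
    length = (hi << 16) | lo            # 24 bit payload length
    stream_id = stream & 0x7FFFFFFF     # the top bit is reserved
    return length, frame_type, flags, stream_id


# A hand built DATA frame on stream 1 carrying the five bytes b"hello".
frame = bytes([0, 0, 5, TYPE_DATA, FLAG_END_STREAM, 0, 0, 0, 1]) + b"hello"
length, frame_type, flags, stream_id = parse_frame_header(frame)
body = frame[FRAME_HEADER_LEN:FRAME_HEADER_LEN + length]
print(length, frame_type, bool(flags & FLAG_END_STREAM), stream_id, body)
```

    The reader never has to scan the body to find its end. The length and the END_STREAM flag say where the message stops.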

    The chain does not speak HTTP/2 the whole way. Most front ends accept HTTP/2 from the internet and then rewrite each request as HTTP/1.1 before handing it to the back end, because the back end still speaks the older protocol. That rewrite is called a downgrade, and it is where James Kettle’s research, presented as HTTP/2: The Sequel is Always Worse, showed the bug coming back to life.

    When the front end downgrades, it has to invent the HTTP/1.1 length headers from the HTTP/2 frame data. It writes a Content-Length, or it copies across a Transfer-Encoding the request carried. If the front end does this carelessly, the back end is once again reading length from a text header that may not match reality.

    H2.CL and H2.TE

    H2.CL is the downgrade version of a Content-Length desync. In HTTP/2 the true body length is fixed by the data frames, so the content-length field a client sends is just a claim the server is supposed to validate against the frames. If the front end fails to check it and trusts the attacker supplied value during the downgrade, it writes that wrong Content-Length into the HTTP/1.1 request it forwards. The back end then reads too few or too many bytes, and the leftover becomes a smuggled prefix, exactly as in CL.TE.

    H2.TE is the Transfer-Encoding version. The HTTP/2 standard says a request carrying a transfer-encoding header should be treated as malformed and rejected, because chunked encoding has no meaning inside HTTP/2 framing. A front end that forwards that header anyway hands the back end a Transfer-Encoding: chunked on a downgraded request. The back end honors it, reads the body as chunks regardless of the front end’s idea of the length, and desyncs. Same prefix, same poisoned queue, reached through a header the front end should have thrown away.

    The reason the downgrade case is worth so much attention is that it widened the target list. Pure HTTP/1.1 smuggling needs two HTTP/1.1 servers that parse length differently, which careful operators had started to fix. The downgrade reopened the bug on chains that looked modern and safe from the outside, where the public facing server speaks HTTP/2 and only the hop you cannot see still speaks HTTP/1.1. Kettle’s research also showed that HTTP/2 carries its own smuggling surface beyond length, because attackers can smuggle through header names, header values, and even the pseudo headers that HTTP/2 uses for the method and path, all of which have to be flattened into a single text line during a downgrade. Anywhere a special character survives that flattening, a new request boundary can be forged.
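
    The checks a careful front end would run before rewriting a request can be sketched in a few lines. This is an illustration under assumptions, not the logic of any real proxy: validate_for_downgrade is a hypothetical helper, headers arrive as (name, value) pairs, and the DATA frame lengths have already been collected.

```python
def validate_for_downgrade(headers, data_frame_lengths):
    """Return reasons to reject an HTTP/2 request; an empty list means it may be rewritten.

    headers: list of (name, value) pairs, pseudo headers such as ":path" included.
    data_frame_lengths: payload length of each DATA frame received for the request.
    """
    problems = []
    body_length = sum(data_frame_lengths)
    for name, value in headers:
        lname = name.lower()
        # CR, LF, or NUL would forge a new line once flattened into HTTP/1.1 text.
        if any(ch in name or ch in value for ch in "\r\n\0"):
            problems.append(f"control character in {lname!r}")
        if lname == "transfer-encoding":
            problems.append("transfer-encoding is malformed in HTTP/2")
        if lname == "content-length":
            if not (value.isascii() and value.isdigit()) or int(value) != body_length:
                problems.append("content-length does not match the DATA frames")
    return problems


base = [(":method", "POST"), (":path", "/")]
h2_cl = validate_for_downgrade(base + [("content-length", "0")], [23])
h2_te = validate_for_downgrade(base + [("transfer-encoding", "chunked")], [10])
forged = validate_for_downgrade([(":method", "GET"), (":path", "/ HTTP/1.1\r\nHost: internal")], [])
clean = validate_for_downgrade(base + [("content-length", "5")], [2, 3])
print(h2_cl, h2_te, forged, clean, sep="\n")
```

    Each of the first three calls models one of the downgrade tricks above, and each comes back with a reason to refuse. Only the last request, whose claimed length matches its frames, is safe to flatten into text.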

    Defenses that actually hold

    The fixes are not clever payloads to block. There is no signature for a smuggled request, because every byte in it is valid on its own and the attack is purely in how two servers slice the stream. So the defenses do not try to spot bad content. They are about making the two servers agree on boundaries, or refusing to forward anything the two of them might read differently.

    • Reject ambiguous requests instead of guessing. A request that carries both Content-Length and Transfer-Encoding is not a request to interpret, it is a request to refuse. RFC 9112 lets a server reject it outright, and it requires the server to close the connection after responding to such a request so no leftover bytes can poison the next one. Closing the connection is the part that breaks the smuggle, because the prefix has nowhere to wait.
    • Make the front end normalize and own the framing. The front end should rewrite every request into one unambiguous form before forwarding, with exactly one length header that it computed itself, so the back end never has to choose. If the front end will not honor a Transfer-Encoding it should strip it, not pass it along for the back end to honor differently.
    • Reject the Transfer-Encoding you will not honor. A front end that does not implement chunked the way the back end does should reject requests that use it, including obfuscated spellings, rather than forwarding a header it parses loosely.
    • Use HTTP/2 end to end where you can. If the connection to the back end also speaks HTTP/2, there is no downgrade and no place to forge a length header. When you must downgrade, validate the content-length against the real frame data and drop any transfer-encoding the HTTP/2 standard says is malformed.
    • Reuse back end connections carefully. Much of the impact comes from one shared connection carrying many users. Some deployments reduce blast radius by not pooling back end connections across users, so a leftover prefix cannot land on a stranger’s request.
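
    The first three defenses reduce to one decision per request: which single framing applies, or refuse. A minimal sketch, assuming a hypothetical framing_decision helper that sees the raw header lines. It is deliberately stricter than the standard requires, for example it refuses any repeated Content-Length.

```python
def framing_decision(header_lines):
    """Return 'content-length', 'chunked', 'none', or 'reject' for one request."""
    lengths, encodings = [], []
    for line in header_lines:
        name, colon, value = line.partition(":")
        # No colon, an empty name, or whitespace around the name is malformed.
        if not colon or not name or name != name.strip():
            return "reject"
        value = value.strip(" \t")
        if name.lower() == "content-length":
            lengths.append(value)
        elif name.lower() == "transfer-encoding":
            encodings.append(value.lower())
    if lengths and encodings:
        return "reject"                      # both framings declared
    if len(lengths) > 1 or len(encodings) > 1:
        return "reject"                      # duplicated framing header
    if encodings:
        return "chunked" if encodings[0] == "chunked" else "reject"
    if lengths:
        ok = lengths[0].isascii() and lengths[0].isdigit()
        return "content-length" if ok else "reject"
    return "none"


print(framing_decision(["Host: a", "Content-Length: 6", "Transfer-Encoding: chunked"]))
print(framing_decision(["Host: a", "Transfer-Encoding: xchunked"]))
print(framing_decision(["Host: a", "Content-Length: 6"]))
```

    A caller that gets reject back should answer with an error and close the connection, so that nothing left in the buffer can attach itself to the next request.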

    These are not hypothetical. In 2025 Akamai disclosed CVE-2025-32094, a request smuggling flaw that James Kettle reported in its edge platform in March of that year. It chained an HTTP/1.x OPTIONS request, an Expect: 100-continue header, and obsolete line folding so that two in path Akamai servers read one request two different ways. Akamai fixed it across the platform with no known exploitation, but the cause is the same one this post has circled the whole time: two servers, one stream, two opinions about where a request ends.

    The assumption underneath

    Every link in this chain is built by people doing something reasonable. The front end forwards requests to be fast. The back end reads length from a header because that is how the protocol works. The standard offers two ways to declare length because both are genuinely useful. None of those choices is wrong on its own. The bug is the assumption that connects them: that the front end and the back end will always agree on where a request ends, because the question feels like it has one obvious answer.

    It does not. The attack lives entirely in the gap between two readers of the same bytes, a gap nobody put there on purpose and nobody tested for, because each server was certain the other saw what it saw. That is the kind of flaw you find by asking what each component assumes about the one next to it, then arranging for the assumption to be false, rather than by scanning for a known bad string. It is the same trust in a validated view of a request that powers bugs like server side request forgery, and it is exactly the class of bug an autonomous researcher that tests assumptions is built to surface. The two servers think they agree. The whole exploit is proof that they do not.

    Frequently asked questions

    What is HTTP request smuggling in simple terms?

    It is an attack that works when a front end server and a back end server read the same bytes on a shared connection but disagree about where one request ends and the next begins. The attacker crafts a request that the front end treats as finished while the back end thinks part of it is the start of a new request. Those leftover bytes wait in the back end buffer and get stitched onto the next user’s request, so the back end runs a request the attacker wrote. The PortSwigger Web Security Academy covers the mechanics in depth in its request smuggling guide.

    Why do the two length headers cause the problem?

    HTTP/1.1 gives two ways to declare how long a body is. Content-Length states the byte count up front, while Transfer-Encoding: chunked sends the body as sized chunks ending in a zero length chunk. When one request carries both headers and they point at different boundaries, servers can disagree. RFC 9112 section 6.3 says Transfer-Encoding wins and warns the request may be a smuggling attempt, but not every server obeys, and the ones that disagree are the ones an attacker chains against.

    What are CL.TE, TE.CL, and TE.TE?

    They name which header each server trusts. CL.TE means the front end honors Content-Length and the back end honors Transfer-Encoding, so the back end stops at the zero chunk and leaves the rest as a smuggled prefix. TE.CL is the reverse. TE.TE is when both support chunked, so the attacker obfuscates the Transfer-Encoding header, for example with odd spacing or a misspelling, so one server stops honoring it and the chain desyncs again.

    Doesn’t HTTP/2 prevent request smuggling?

    End to end HTTP/2 mostly does, because it carries each body in binary frames with a built in length, leaving no room for two text headers to disagree. The risk returns when a front end accepts HTTP/2 from the internet and downgrades each request to HTTP/1.1 for the back end. If it forges a wrong content-length (H2.CL) or forwards a transfer-encoding it should have rejected as malformed (H2.TE), the back end desyncs just like before. Using HTTP/2 to the back end too, or validating length against the real frames, removes the gap.

  • How TLS Fingerprinting Works: JA3, JA4, and the ClientHello

    How TLS Fingerprinting Works: JA3, JA4, and the ClientHello

    Before a single byte of HTTP travels, before any JavaScript runs, before a cookie is set, a web client has already told the server a great deal about itself. The very first message of a TLS connection, the ClientHello, is sent in the clear, and the exact way it is built is specific to the software that built it. TLS fingerprinting is the practice of reading that first message and turning it into a short, stable identifier for the client. A real Chrome browser, a Python script using requests, and a piece of malware calling home to its controller each produce a different shape of ClientHello, and that shape gives them away. This post takes the idea apart from the packet up: why the handshake is a fingerprint at all, how the original JA3 method computed one, why JA3 broke, how JA4 fixed it, and what all of this means for catching bots and malware versus the privacy of ordinary users.

    Why the handshake is a fingerprint

    A TLS connection opens with a negotiation. The client speaks first with a ClientHello, a plaintext message that lists everything the client is willing and able to do so the server can pick a common option. That list is not a single fixed value. It is an ordered set of choices, and every TLS library makes those choices a little differently.

    The ClientHello carries, among other things, the highest TLS version the client supports, the ordered list of cipher suites it offers, a list of extensions, the elliptic curves it will accept for key exchange, and the elliptic curve point formats it understands. None of this is secret. It cannot be, because the server needs to read it to agree on parameters before encryption is set up. The values themselves are mundane. What identifies the client is the combination and the order: which ciphers, in which sequence, which extensions, advertised which way.

    This matters because the choices come from the TLS stack, not from the application on top of it. OpenSSL, BoringSSL, the schannel library on Windows, the network stack inside Chrome, and the Go standard library each assemble a ClientHello in their own house style. So the fingerprint reflects the runtime, not the label the client puts on itself. A script can set its HTTP user agent header to the exact string a real Chrome sends, but the header is added later, inside the encrypted HTTP request. The TLS handshake underneath was already built by Python’s stack, and it does not look like Chrome at all. That gap between what a client claims and what its handshake reveals is the entire reason the technique is useful.

    It helps to picture where this sits in the connection. The TCP handshake completes, then the client sends the ClientHello as the very first TLS record. The server reads it, replies with a ServerHello that picks one cipher and one set of parameters, both sides derive keys, and only then does the channel turn encrypted. So the ClientHello is the last fully readable thing the client ever sends on a healthy connection. A passive observer between the two parties cannot read the page that is requested or the data that comes back, but it can read that opening message in full. TLS fingerprinting is the discipline of getting the most identity out of that one readable message.

    The client picks a user agent string to tell you what it is. The handshake tells you what it really is, and the handshake was sent before the client had a chance to lie.

    How JA3 computes a TLS fingerprinting hash

    The first widely used method for this came from Salesforce in 2017 and is called JA3. Its idea is simple enough to follow by hand. JA3 reads five fields out of the ClientHello, always in the same order:

    • TLS version, the version number from the handshake.
    • Cipher suites, the ordered list of ciphers the client offers.
    • Extensions, the list of TLS extensions, in the order they appear.
    • Elliptic curves, the supported curves, sometimes called supported groups.
    • Elliptic curve point formats, the point format list.

    JA3 takes the decimal values from each field, joins the values inside a field with a dash, and joins the five fields with a comma. The result is one long string in a fixed layout: TLSVersion,Ciphers,Extensions,EllipticCurves,EllipticCurvePointFormats. A real example of that intermediate string looks like this:

    769,47-53-5-10-49161-49162-49171-49172-50-56-19-4,0-10-11,23-24-25,0

    Here 769 is the TLS version, the long middle run is the cipher list, 0-10-11 is the extension list, 23-24-25 is the curve list, and the trailing 0 is the single point format. If a field is empty, JA3 keeps the comma and leaves the field blank, so a client with no extensions produces a string like 769,4-5-10-9-100-98-3-6-19-18-99,,, with the empty positions preserved. That last detail is part of the fingerprint too, because the absence of extensions is itself a property of the client.

    The final step is a hash. JA3 runs the whole comma joined string through MD5 and keeps the 32 character result. The string above becomes:

    769,47-53-5-10-49161-49162-49171-49172-50-56-19-4,0-10-11,23-24-25,0
      -> ada70206e40642a3e4461f35503241d5

    MD5 is a poor choice for security where collisions matter, but here it is only a compact label for a string, so its weakness is not the point. The point is that the same client software, run again, produces the same five fields in the same order and therefore the same hash. A different client produces a different one.
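
    The whole construction fits in a few lines of Python. This is a sketch of the steps described above, not the reference implementation, and it assumes the five fields have already been pulled out of the ClientHello as lists of integers with GREASE values removed. The names ja3_string and ja3_hash are chosen here for illustration.

```python
import hashlib


def ja3_string(version, ciphers, extensions, curves, point_formats):
    """Join values inside a field with dashes and the five fields with commas."""
    fields = [[version], ciphers, extensions, curves, point_formats]
    return ",".join("-".join(str(v) for v in field) for field in fields)


def ja3_hash(ja3: str) -> str:
    """MD5 of the comma joined string, as 32 hex characters."""
    return hashlib.md5(ja3.encode("ascii")).hexdigest()


example = ja3_string(
    769,
    [47, 53, 5, 10, 49161, 49162, 49171, 49172, 50, 56, 19, 4],
    [0, 10, 11],
    [23, 24, 25],
    [0],
)
# A client with no extensions, curves, or point formats keeps its commas.
bare = ja3_string(769, [4, 5, 10, 9, 100, 98, 3, 6, 19, 18, 99], [], [], [])

print(example)
print(ja3_hash(example))
print(bare)
```

    Run against the first example, the printed hash can be compared with the value quoted above. Empty lists fall out naturally as empty fields, which is how the trailing commas survive.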

    It is worth being precise about what JA3 deliberately leaves out. It does not read the server name indication, the actual hostname being requested, even though that field is present and readable in many ClientHellos. It does not read the contents of every extension, only which extensions are present. And it does not touch anything above TLS. The aim is a fingerprint of the client stack, not of the destination or the request, so two connections from the same software to two different sites share a JA3 hash. That is the property that makes it useful for spotting one tool across many targets, and it is also why JA3 alone cannot tell you what the client was doing, only what it was.

    GREASE and the server side twin

    Two refinements are worth knowing. First, modern clients inject GREASE values, which are deliberately reserved placeholder numbers sprinkled into the cipher and extension lists to keep servers from getting rigid about what they accept. JA3 ignores GREASE values entirely so that a client which uses GREASE still maps to one stable hash rather than a new one each connection. Second, there is a mirror method called JA3S that fingerprints the server’s response from its version, chosen cipher, and extensions. Pairing the client JA3 with the server JA3S describes a whole conversation, which is handy when the same client always talks to the same controller.
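
    GREASE values follow a fixed pattern, which is what makes them easy to drop. In the 16 bit case both bytes are equal and each byte ends in the hex digit A, giving 0x0A0A, 0x1A1A, and so on up to 0xFAFA. A short sketch of the filter, with is_grease and strip_grease as illustrative names:

```python
def is_grease(value: int) -> bool:
    """True for the 16 reserved GREASE code points: 0x0A0A, 0x1A1A, ... 0xFAFA."""
    return (value & 0x0F0F) == 0x0A0A and (value >> 8) == (value & 0xFF)


def strip_grease(values):
    """Drop GREASE placeholders while keeping the remaining values in order."""
    return [v for v in values if not is_grease(v)]


# An offered cipher list with GREASE placeholders at both ends.
offered = [0x2A2A, 0x1301, 0x1302, 0x1303, 0xDADA]
print([hex(v) for v in strip_grease(offered)])
```

    Because the placeholder a client picks can change from one connection to the next, filtering it out first is what keeps the fingerprint stable.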

    Where JA3 is genuinely useful

    The reason security teams cared about JA3 is that it identifies software by how it speaks, not by where it connects or what it claims. That property has three concrete uses.

    Malware and command and control detection. A piece of malware is usually built against one TLS library and offers one fixed handshake. It does not matter if the malware rotates its server IP every hour, uses domain generation algorithms to invent new hostnames, or even hides its controller behind a public service. The JA3 hash of the malware’s own handshake stays the same. Salesforce’s JA3 documentation lists 6734f37431670b3ab4292b8f60f29984 as the fingerprint of Trickbot, which means a sensor can flag that traffic by how it connects rather than by chasing an endless list of addresses. Threat intelligence feeds publish lists of JA3 hashes tied to known malware families for exactly this.

    Bot detection by mismatch. The strongest signal is a contradiction. When an HTTP request carries a user agent header that says Chrome 120, but the TLS handshake under it matches the fingerprint of Python’s requests library or a plain curl build, the two stories do not agree. A browser stack and a scripting stack assemble different ClientHellos, so a request that claims to be a browser while handshaking like a script is almost certainly automated. A web application firewall or content delivery network can compare the claimed client to the observed fingerprint and act on the gap.
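
    The mismatch check itself is a small comparison. The sketch below assumes a lookup table from fingerprint to client family already exists. The keys in KNOWN_STACKS are made-up placeholders, not real JA3 or JA4 values, and the user agent matching is far cruder than a production parser.

```python
# Placeholder fingerprint -> stack family map. These keys are invented labels;
# a real deployment would load fingerprints it has actually observed.
KNOWN_STACKS = {
    "fp-chrome-example": "chrome",
    "fp-python-requests-example": "python-requests",
    "fp-curl-example": "curl",
}


def claimed_family(user_agent: str) -> str:
    """Very rough guess at what the user agent header claims to be."""
    ua = user_agent.lower()
    if "python-requests" in ua:
        return "python-requests"
    if ua.startswith("curl/"):
        return "curl"
    if "chrome/" in ua:
        return "chrome"
    return "unknown"


def verdict(user_agent: str, tls_fingerprint: str) -> str:
    """Compare the claimed client with the stack the handshake points to."""
    observed = KNOWN_STACKS.get(tls_fingerprint, "unknown")
    claimed = claimed_family(user_agent)
    if "unknown" in (observed, claimed):
        return "unknown"
    return "consistent" if observed == claimed else "mismatch"


chrome_ua = "Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 Chrome/120.0.0.0 Safari/537.36"
print(verdict(chrome_ua, "fp-chrome-example"))
print(verdict(chrome_ua, "fp-python-requests-example"))
```

    An unknown result is kept separate from a mismatch on purpose. A fingerprint nobody has catalogued is a reason to look closer, not proof of automation.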

    Allow listing in locked down networks. In an environment where only a known set of applications should ever make outbound TLS connections, you can record the fingerprints of the approved software and alert on anything else. A new fingerprint is a new piece of software talking, which is worth a look.

    If you want the broader picture of how servers profile clients across many layers, our writeup on how browser fingerprinting works covers the JavaScript and HTTP signals that sit above the handshake. TLS fingerprinting is the layer beneath all of that, the one that fires first.

    Why JA3 broke

    JA3 had a structural weakness, and two separate forces pushed on it until it gave way. The weakness is that JA3 reads the extension list in the order it appears in the ClientHello. Order is part of the hash. So anything that changes the order changes the hash, even when the client’s actual capabilities are identical.

    The first force was an evasion that costs almost nothing. Because order drives the hash, a client that wants to dodge a JA3 blocklist only has to shuffle its extension list. The set of extensions is the same, the handshake still works, but the bytes are reordered and the hash is new. For an attacker this is close to free. A list of sixteen extensions can be arranged in sixteen factorial ways, which is more than twenty trillion orderings, so a single piece of software can wear an effectively unlimited number of JA3 faces. A blocklist built on a fixed hash cannot keep up with a client that changes the hash on a whim.
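
    The arithmetic and the evasion can both be checked directly. The extension numbers below are an illustrative list of sixteen values, not a capture from any particular client, and ordered_hash is a stand-in that hashes only the extension field rather than a full JA3 string.

```python
import hashlib
import math
import random

orderings = math.factorial(16)
print(orderings)  # 20922789888000


def ordered_hash(extensions):
    """Hash the extension list exactly in the order given, as JA3 does."""
    return hashlib.md5("-".join(str(e) for e in extensions).encode("ascii")).hexdigest()


# Sixteen distinct extension numbers, chosen for illustration.
extensions = [0, 23, 65281, 10, 11, 35, 16, 5, 13, 18, 51, 45, 43, 27, 17513, 21]

rng = random.Random(0)  # seeded so the run is repeatable
seen = set()
for _ in range(1000):
    shuffled = extensions[:]
    rng.shuffle(shuffled)
    seen.add(ordered_hash(shuffled))

print(len(seen), "distinct hashes from 1000 shuffles of one client")
```

    The set of extensions never changes in that loop. Only the order does, and that alone is enough to produce a new hash on nearly every iteration.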

    The second force was not an attack at all. In early 2023 Chrome began randomizing the order of its TLS extensions on purpose. The change was merged a release or two earlier and rolled out with Chrome 110. The stated reason was healthy: by shuffling the order on every connection, Chrome forces servers and middleboxes to stop depending on the exact byte layout of its ClientHello, which keeps the wider TLS ecosystem flexible. The side effect was that the single common JA3 hash for Chrome shattered. Overnight a huge share of legitimate traffic stopped matching its old fingerprint, and the same twenty trillion orderings that helped attackers now scattered ordinary users too. JA3 went from a useful client label to noise for the most common browser on the internet.

    How JA4 fixes the order problem

    JA4, from FoxIO, is the answer to that breakage, and the core fix is almost obvious once you see the failure. If order is the problem, remove order from the parts where it is not meaningful. JA4 sorts the cipher list and sorts the extension list before hashing them. A shuffled ClientHello and an unshuffled one, with the same underlying capabilities, sort to the same sequence and therefore produce the same fingerprint. The evasion of reordering, and Chrome’s deliberate randomization, both stop mattering because the sorted output is identical either way.
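
    A small comparison shows why sorting is enough. This is a simplified sketch in the spirit of JA4, not its reference implementation: it formats each value as four hex digits, sorts, hashes with SHA256, and keeps the first 12 characters. The real extension hash covers more than this, and the extension numbers are illustrative.

```python
import hashlib


def ordered_hash(values):
    """JA3 style: the result depends on the order the values arrive in."""
    return hashlib.md5("-".join(str(v) for v in values).encode("ascii")).hexdigest()


def sorted_hash(values):
    """JA4 style, simplified: format as hex, sort, hash, truncate to 12 characters."""
    joined = ",".join(sorted(f"{v:04x}" for v in values))
    return hashlib.sha256(joined.encode("ascii")).hexdigest()[:12]


extensions = [0, 23, 65281, 10, 11, 35, 16, 5]   # illustrative values
reordered = list(reversed(extensions))            # same set, different order

print(ordered_hash(extensions) == ordered_hash(reordered))  # False
print(sorted_hash(extensions) == sorted_hash(reordered))    # True
```

    Any permutation of the same values sorts to the same sequence, so neither a hostile shuffle nor a browser’s deliberate one can move the second hash.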

    JA4 also changes the shape of the output to be readable rather than a single opaque hash. A JA4 fingerprint comes in three parts joined by underscores. A real example:

    t13d1516h2_8daaf6152771_b186095e22b6

    The first segment is human readable metadata. Reading it left to right: t means TLS over TCP, 13 means TLS version 1.3, d means a server name indication was present so this is a connection to a named domain, 15 is the count of cipher suites with GREASE excluded, 16 is the count of extensions, and h2 is the first and last characters of the first application layer protocol the client offers in ALPN, here HTTP/2. The second segment, 8daaf6152771, is a truncated SHA256 hash of the sorted cipher list. The third segment, b186095e22b6, is a truncated hash of the sorted extensions, leaving out SNI and ALPN because the first segment already records them, plus the signature algorithms in their original order.
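
    Because the first segment has fixed positions, it can be taken apart with plain string slicing. A minimal sketch, assuming the 10 character layout described above. Ja4Prefix and parse_ja4 are names invented here, and the two hash segments are left untouched.

```python
from typing import NamedTuple


class Ja4Prefix(NamedTuple):
    transport: str          # "t" for TLS over TCP
    tls_version: str        # "13" for TLS 1.3
    sni_present: bool       # True when the flag is "d"
    cipher_count: int
    extension_count: int
    alpn: str               # first and last characters of the ALPN value


def parse_ja4(fingerprint: str) -> Ja4Prefix:
    """Split a JA4 string on underscores and decode the readable first segment."""
    prefix, _cipher_hash, _extension_hash = fingerprint.split("_")
    if len(prefix) != 10:
        raise ValueError("expected a 10 character first segment")
    return Ja4Prefix(
        transport=prefix[0],
        tls_version=prefix[1:3],
        sni_present=prefix[3] == "d",
        cipher_count=int(prefix[4:6]),
        extension_count=int(prefix[6:8]),
        alpn=prefix[8:10],
    )


parsed = parse_ja4("t13d1516h2_8daaf6152771_b186095e22b6")
print(parsed)
```

    With the fields split out, grouping traffic by version, by extension count, or by the absence of SNI is an ordinary filter over parsed values rather than a hash lookup.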

    Three design choices stand out. Sorting is what defeats the shuffle, both the malicious kind and Chrome’s well meaning kind. Adding ALPN is new information that JA3 never captured, since the protocols a client offers are another property of the client stack. And because the leading segment is plain text, an analyst can group and hunt on individual pieces, for example every TLS 1.3 client that offers a certain count of extensions, without decoding a hash.

    The readable prefix earns its keep in practice. Suppose a feed of traffic is dominated by ordinary browsers and you want to find the odd one out. With a single opaque MD5 you can only test for exact matches against a known list. With JA4 you can ask coarser questions directly off the string: show every client that offered TLS 1.3 with no server name indication, which is unusual for a browser visiting a website and common for automated tooling. The counts and flags in that first segment give you a way to slice traffic before you ever compare a hash, so a new variant that has never been catalogued can still stand out by its shape. JA4 also extends to QUIC and HTTP/3, where the same handshake idea rides on UDP, which is something the older method was never built to cover.

    JA4 is one of a family

    JA4 by itself fingerprints the TLS client. FoxIO published it as the lead member of a suite called JA4+, where each method fingerprints a different part of a connection: JA4S for the server’s TLS response, JA4H for the HTTP client, JA4X for the certificate, JA4SSH for SSH sessions, and several more for TCP, latency, and DHCP. The JA4X variant works over the X.509 certificate the server presents, and if you want to see what fields live inside one of those certificates, our free X.509 certificate decoder breaks a certificate down into its issuer, validity dates, extensions, and public key. The stated uses for the suite read like a defender’s job list: scanning for threat actors, malware detection, session hijacking prevention, grouping related actors, and detecting reverse shells, among others. The JA4 TLS method itself is published under a BSD license, while the rest of the suite carries the FoxIO license that allows internal use but asks for a license to resell.

    The privacy and evasion angle, told honestly

    Everything that makes TLS fingerprinting good at catching bots also makes it a tracking tool. A fingerprint identifies a client before any cookie is set and survives a private browsing window, since it comes from the TLS stack rather than from stored state. Two people on the same network running the same browser build share a fingerprint, which limits how precisely it pins down one person, but it still sorts traffic into groups by software without anyone’s consent. This is the same tension that shows up across client identification, and it is the reason Chrome’s randomization was framed as ecosystem hygiene rather than as an anti tracking feature, even though it carried both effects.

    Evasion is real and worth naming plainly. There exist tools that rebuild a script’s handshake to match a real browser’s, so that a request claiming to be Chrome also handshakes like Chrome and slips past a mismatch check. The existence of these tools is the reason no serious defender treats a fingerprint as proof on its own. A fingerprint is one signal among several, strong because it fires early and is hard to fake casually, weak because a determined party can copy a known good handshake. This post will not walk through how to build such a forgery. The defensive takeaway is the useful one: combine the fingerprint with other evidence, watch for the contradiction between the claimed client and the observed one, and treat a perfect browser fingerprint from an unexpected source as a question rather than an answer. For more terms in this area, see our web security glossary.

    The assumption that breaks

    Step back from the cipher lists and the hash construction and one assumption is doing all the work. A client connecting over TLS assumes that encryption hides it. The padlock is up, the channel is private, the payload is unreadable to anyone in the middle. All of that is true for the contents of the conversation. It is not true for the handshake that set the conversation up. The ClientHello is sent in the open by necessity, and its construction is a property of the software, so the very act of asking for a private channel announces who is asking.

    That is the gap that JA3 and JA4 read. The client believed the encrypted channel covered its identity, and it was wrong, because the metadata of the handshake identifies the software before a single encrypted byte is exchanged. A real browser, a script wearing a browser’s name, and a malware sample each make the same request for privacy in a different accent, and the accent is the fingerprint. Testing that assumption, the quiet belief that the tunnel hides the traveler, is exactly where the signal lives.

    Frequently asked questions

    What is TLS fingerprinting?

    It is the practice of identifying client software from the way it builds its first TLS handshake message, the ClientHello, which is sent in the clear before any HTTP or JavaScript. Methods like JA3 and JA4 read fields such as the TLS version, the offered cipher suites, the extension list, and the supported curves, then turn that combination into a short stable identifier. Because the values come from the TLS library rather than the application, the fingerprint reflects the real runtime even when the client sets a misleading user agent string.

    How is a JA3 hash computed?

    JA3 reads five fields from the ClientHello in a fixed order: TLS version, cipher suites, extensions, elliptic curves, and elliptic curve point formats. It joins the values inside each field with dashes and the five fields with commas, producing a string like 769,47-53-5-10,0-10-11,23-24-25,0, then runs that string through MD5 to get a 32 character hash. Empty fields keep their commas, and GREASE placeholder values are ignored so a client still maps to one stable hash. The method comes from Salesforce, documented at github.com/salesforce/ja3.

    Why did JA3 stop working and how does JA4 fix it?

    JA3 hashes the extension list in the order it appears, so reordering the extensions changes the hash without changing the client. Attackers exploited that to dodge blocklists, and from Chrome 110 in 2023 Chrome began randomizing its extension order on purpose, which shattered the common Chrome JA3 hash. JA4, from FoxIO, sorts the cipher and extension lists before hashing so a shuffled and an unshuffled handshake produce the same fingerprint. The technical format is published at github.com/FoxIO-LLC/ja4.

    How does TLS fingerprinting catch bots and malware?

    Malware usually offers one fixed handshake from the TLS library it was built with, so its fingerprint stays the same even when it rotates server IP addresses or hostnames, which lets sensors flag it by how it connects. For bots, the strongest signal is a mismatch: a request whose user agent claims to be a browser but whose handshake matches a script like Python requests or curl is almost certainly automated. Fingerprints are one signal among several, since evasion tools exist that copy a real browser handshake.

  • How Browser Fingerprinting Identifies You Without a Cookie

    How Browser Fingerprinting Identifies You Without a Cookie

    Clear your cookies, open an incognito window, and most people assume they are starting fresh and anonymous. They are not. Long before you log in or accept a consent banner, the page has already read several dozen facts about your machine and combined them into a stable identifier. That technique is called browser fingerprinting, and it works without storing anything on your device at all. There is nothing to delete, because the identifier is not saved on your side. It is computed on the server from signals your browser hands over for free, every visit, by design. This post takes the method apart signal by signal: what each one is, how many bits of identifying information it carries, how a handful of medium entropy signals multiply into something unique among millions, and why that matters for tracking, fraud, and deanonymization.

    Why a fingerprint exists when nothing is stored

    A cookie is a value the server asks your browser to keep and send back later. You can see cookies, count them, and erase them. A fingerprint is the opposite. The server does not ask you to store anything. It reads attributes your browser already exposes to make legitimate web pages work, and it derives an identifier from the exact combination of those attributes. A page needs your screen size to lay itself out. It needs your language to pick a translation. It can query your graphics stack to decide whether to use hardware acceleration. Each of these is reasonable on its own. The fingerprint is what you get when a script collects all of them at once and treats the bundle as a name.

    Because nothing is written to your disk, the usual privacy reflexes do not touch it. Clearing cookies removes saved values, not the shape of your device. A private window blocks cookie persistence and history, not the screen resolution your monitor reports. The fingerprint survives both because it was never stored in the first place. It is recomputed from scratch on each visit, and as long as your machine and browser stay roughly the same, the result stays roughly the same.

    It helps to separate two jobs the fingerprint does. First, recognition: deciding whether the browser in front of the server right now is one it has seen before. Second, linkage: tying together two separate sessions that the user believed were unrelated, such as a logged in visit and an anonymous one. A cookie does both jobs only as long as it survives. A fingerprint does both jobs without ever needing your cooperation, and that is the whole point. The server is reading you, not asking you to carry a tag.

    The signals: what a page reads about you

    Open the developer console on any page and most of these are one line of JavaScript away. None of them require a permission prompt. Here is the core set, grouped by how a script gets at them.

    The easy attributes from the navigator and screen objects

    The navigator object is a grab bag of properties the browser exposes about itself. The classic one is the user agent string, read with navigator.userAgent, which spells out the browser name, version, rendering engine, and operating system. A typical value looks like this:

    Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7)
    AppleWebKit/537.36 (KHTML, like Gecko)
    Chrome/126.0.0.0 Safari/537.36

    Alongside it sit navigator.language and navigator.languages for your locale preferences, navigator.platform, navigator.hardwareConcurrency for the number of logical CPU cores, and navigator.deviceMemory for a rough memory figure. The screen object gives width, height, available width and height, and color depth. A single call to Intl.DateTimeFormat().resolvedOptions().timeZone returns your time zone as a clean string like America/New_York. Each of these is cheap to read and stable from one visit to the next.
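
    The attributes above can be gathered in a few lines. The sketch below is illustrative: `collectBasicSignals` is a hypothetical helper that takes the environment as a parameter, so in a page you would pass `globalThis`, and the field names of the returned object are our own.

```javascript
// Collect the cheap attributes named above from a browser-like environment.
// Pass `globalThis` in a real page; a stub object works for experiments.
function collectBasicSignals(env) {
  const { navigator, screen, Intl } = env;
  return {
    userAgent: navigator.userAgent,
    language: navigator.language,
    languages: Array.from(navigator.languages || []),
    platform: navigator.platform,
    cores: navigator.hardwareConcurrency,
    memoryGb: navigator.deviceMemory, // undefined where the browser does not expose it
    screen: [screen.width, screen.height, screen.availWidth, screen.availHeight, screen.colorDepth],
    timeZone: new Intl.DateTimeFormat().resolvedOptions().timeZone,
  };
}
```

    None of these reads triggers a permission prompt, which is why they form the baseline of most fingerprinting scripts.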

    Font enumeration

    The exact set of fonts installed on a machine is surprisingly varied, because it reflects the operating system, the applications you have installed, and the language packs you have added. A script cannot ask for the full list directly anymore, but it can probe. It renders a string in a font it wants to test, measures the width and height of the result, and compares that against the measurement for a known fallback font. If the size differs, the requested font is present. Run that probe across a few hundred candidate fonts and the script reconstructs which ones you have. The presence or absence pattern is the signal.
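
    A rough sketch of the probing logic, under stated assumptions: `detectFonts` and `canvasMeasurer` are hypothetical names, the probe string and the 72px size are arbitrary choices, and only width is compared here for brevity even though real probes often check height too.

```javascript
// Decide which candidate fonts are installed by comparing the measured text
// width against the width produced by the fallback font alone.
// `measure(fontFamily)` is a hypothetical callback returning a width in pixels.
function detectFonts(candidates, fallback, measure) {
  const baseline = measure(fallback);
  return candidates.filter((font) => measure(`"${font}", ${fallback}`) !== baseline);
}

// One possible browser measurer, built on the canvas measureText() call.
function canvasMeasurer(ctx, probe = 'mmmmmmmmmmlli') {
  return (fontFamily) => {
    ctx.font = `72px ${fontFamily}`;
    return ctx.measureText(probe).width;
  };
}
```

    In a page this might be called as `detectFonts(['Calibri', 'Futura'], 'monospace', canvasMeasurer(document.createElement('canvas').getContext('2d')))`. A font whose metrics happen to equal the fallback would be missed, which is one reason real scripts tend to test against more than one fallback.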

    Canvas rendering

    This is where fingerprinting stops reading labels and starts measuring hardware. The HTML5 <canvas> element lets a script draw text and shapes, then read the resulting pixels back out. The trick, first described by Keaton Mowery and Hovav Shacham in their 2012 paper Pixel Perfect: Fingerprinting Canvas in HTML5, is that two machines asked to draw the exact same instructions do not produce the exact same pixels. A script draws a line of text, often with a mix of letters and an emoji, over a colored background, then calls toDataURL() to get the rendered image as a string and hashes it.

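    // Draw a fixed string on an off screen canvas, then hash the rendered pixels.
    const c = document.createElement('canvas');
    const ctx = c.getContext('2d');
    ctx.textBaseline = 'top';
    ctx.font = '14px Arial';
    ctx.fillText('Cwm fjordbank glyphs vext quiz 😀', 2, 2);
    // SHA-256 through Web Crypto: digest() is async and needs a secure context.
    const bytes = new TextEncoder().encode(c.toDataURL());
    const digest = await crypto.subtle.digest('SHA-256', bytes);
    const hash = Array.from(new Uint8Array(digest), (b) => b.toString(16).padStart(2, '0')).join('');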

    The canvas does not even need to be visible on the page. The output differs because the work of turning instructions into pixels runs through your GPU, your graphics driver, your installed fonts, and your operating system font rasterizer. Anti aliasing, sub pixel smoothing, and how an emoji is drawn all vary across an Intel integrated chip, an Nvidia card, and an Apple GPU. The differences are invisible to your eye and consistent on the same machine, which is exactly what a tracker wants.

    This signal moved from research curiosity to mass deployment fast. In 2014 the measurement study The Web Never Forgets found canvas fingerprinting scripts on about 5.5 percent of the 100,000 most visited sites, the vast majority of them served by the bookmarking company AddThis. The finding drew attention because users had no way to see it happening and no setting to refuse it. That is the recurring shape of the problem. The data is read silently, the cost to the site is near zero, and the user is not told. Mowery and Shacham’s own small study measured around 5.7 bits of entropy from canvas alone, which is not enough to name you by itself but plenty as one term in a larger product of signals.

    WebGL

    WebGL goes one level deeper into the graphics stack. A script can ask the WebGL context for the renderer string through the WEBGL_debug_renderer_info extension, and many browsers return the literal name of your graphics chip, something like ANGLE (Apple, Apple M2, OpenGL 4.1). Beyond the name, a script can render a 3D scene off screen and read the pixels back, the same idea as canvas but exercising more of the hardware. Shading, depth handling, and floating point rounding in the GPU pipeline differ across devices and show up in the output.
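
    Reading the renderer string takes only a couple of calls. In this sketch `readWebglRenderer` is a hypothetical helper; the extension name and the two `UNMASKED_*` constants are the standard ones exposed by `WEBGL_debug_renderer_info`.

```javascript
// Read the unmasked vendor and renderer strings from a WebGL context.
// Returns null when WebGL or the extension is unavailable or blocked.
function readWebglRenderer(gl) {
  if (!gl) return null;
  const ext = gl.getExtension('WEBGL_debug_renderer_info');
  if (!ext) return null;
  return {
    vendor: gl.getParameter(ext.UNMASKED_VENDOR_WEBGL),
    renderer: gl.getParameter(ext.UNMASKED_RENDERER_WEBGL),
  };
}
```

    In a page the argument would come from `document.createElement('canvas').getContext('webgl')`. Browsers that harden against fingerprinting may withhold the extension or return a generic string, which is why the helper tolerates a missing value.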

    AudioContext

    Audio fingerprinting applies the same logic to sound. A script creates an OfflineAudioContext, which processes audio as fast as it can without ever sending anything to your speakers. It generates a known waveform with an OscillatorNode, usually routes it through a DynamicsCompressorNode to magnify small differences, and reads the resulting samples back. Those samples are 32 bit floating point numbers. Because different audio stacks and CPUs round and process the signal with tiny differences, the values diverge at the far decimal places. Hash that buffer and you get another stable identifier, derived this time from your audio processing pipeline rather than your graphics. You never hear a thing.
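
    The pipeline described above might look like the following sketch. The waveform type, the 1000 Hz frequency, the 5000 sample buffer, and both function names are illustrative assumptions, and summing absolute sample values stands in for hashing the buffer. `renderAudioSamples` only works in a browser; `summarizeSamples` runs anywhere.

```javascript
// Browser only: render a compressed oscillator tone off line and return the samples.
async function renderAudioSamples() {
  const ctx = new OfflineAudioContext(1, 5000, 44100); // channels, length, sample rate
  const osc = ctx.createOscillator();
  osc.type = 'triangle';
  osc.frequency.value = 1000;
  const compressor = ctx.createDynamicsCompressor();
  osc.connect(compressor);
  compressor.connect(ctx.destination);
  osc.start(0);
  const buffer = await ctx.startRendering();
  return buffer.getChannelData(0); // Float32Array of rendered samples
}

// Reduce the samples to one number; tiny per-device differences survive the sum.
function summarizeSamples(samples) {
  let total = 0;
  for (const s of samples) total += Math.abs(s);
  return total;
}
```

    A page would combine the two as `summarizeSamples(await renderAudioSamples())`. Nothing is connected to a real output device, so the whole run is silent.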

    The math: how browser fingerprinting turns weak signals into a unique name

    No single signal here identifies you. Plenty of people use Chrome on macOS in the New York time zone. The power of browser fingerprinting comes from combining many signals that are each only moderately revealing. To see why, you need one idea from information theory: entropy, measured in bits.

    The surprisal of an observation with probability p is -log2(p) bits. If half of all browsers share some attribute value, observing that value costs an attacker -log2(0.5) = 1 bit, and it cuts the candidate pool in half. If one in eight browsers share a value, that is -log2(1/8) = 3 bits, and it cuts the pool to an eighth. The entropy of a whole attribute is the average surprisal across all its possible values, written H = -Σ p(x) log2 p(x). The key property is that bits from independent signals add together. Each bit halves the number of people you could be.
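
    Both formulas fit in a few lines. The sketch below uses our own helper names, `surprisal` and `entropy`, and a made up three value distribution for the last line.

```javascript
// Surprisal of one observation with probability p, in bits.
const surprisal = (p) => -Math.log2(p);

// Shannon entropy of a distribution given as an array of probabilities.
function entropy(probabilities) {
  return probabilities
    .filter((p) => p > 0)
    .reduce((sum, p) => sum + p * surprisal(p), 0);
}

console.log(surprisal(0.5));             // 1 bit: the pool is halved
console.log(surprisal(1 / 8));           // 3 bits: the pool shrinks to an eighth
console.log(entropy([0.5, 0.25, 0.25])); // 1.5 bits on average
```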

    Independence is the catch in that sentence, and it is worth being precise about. Bits add cleanly only when the signals do not correlate. In practice many do. Your operating system is implied by your user agent, hinted at by your font set, and reflected again in your canvas output. A tracker who naively sums the entropy of correlated signals overcounts, because the second signal tells them less once the first is known. Serious fingerprinting work measures the joint entropy of the whole bundle rather than adding the parts, which is why the headline numbers below come from full fingerprints, not from stacking the per signal figures.
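
    A toy example makes the overcount concrete. The sample below is invented: the font set is fully determined by the operating system, so the second signal adds nothing once the first is known. `entropyOf` is a hypothetical helper that estimates entropy from observed counts.

```javascript
// Entropy in bits of the values produced by keyFn over a list of observations.
function entropyOf(records, keyFn) {
  const counts = new Map();
  for (const r of records) {
    const k = keyFn(r);
    counts.set(k, (counts.get(k) || 0) + 1);
  }
  let bits = 0;
  for (const n of counts.values()) {
    const p = n / records.length;
    bits -= p * Math.log2(p);
  }
  return bits;
}

// Hypothetical sample where the font set is fully determined by the OS.
const sample = [
  { os: 'mac', fonts: 'A' }, { os: 'mac', fonts: 'A' },
  { os: 'win', fonts: 'B' }, { os: 'win', fonts: 'B' },
];

const summed = entropyOf(sample, (r) => r.os) + entropyOf(sample, (r) => r.fonts);
const joint = entropyOf(sample, (r) => `${r.os}|${r.fonts}`);
console.log(summed, joint); // 2 1
```

    Adding the parts claims 2 bits, while the pair taken together carries only 1. Real signals are rarely this perfectly correlated, but the direction of the error is the same.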

    A fingerprint does not need any single signal that names you. It needs enough independent signals that, multiplied together, only one person on earth fits all of them at once.

    Put numbers on it. To be unique among the roughly 5 billion internet users alive today you need log2(5,000,000,000) bits of identifying information, which is about 32.2, so 33 whole bits. That sounds like a lot until you tally what your browser gives away. In the original 2010 Panopticlick study, run by Peter Eckersley at the Electronic Frontier Foundation across 470,161 browsers, the measured entropy of individual signals looked like this:

    • User agent string: about 10.0 bits
    • Browser plugins: about 15.4 bits
    • Installed fonts: about 13.9 bits
    • Screen resolution and color depth: about 4.8 bits
    • Time zone: about 3.0 bits

    Added naively, those five figures come to about 47 bits, far more than a sample of under half a million browsers can hold, since such a sample tops out just under 19 bits. The gap is the correlation problem described above: plugins, fonts, and user agent largely describe the same machine. Measured jointly, Eckersley found that the full fingerprint carried at least 18.1 bits of entropy in that sample, which meant that for a randomly chosen browser only about one in 286,777 others would share its fingerprint. In practice 94.2 percent of the browsers that ran Flash or Java were outright unique. The dataset was a fraction of the global population, so 18.1 bits was enough to single out almost everyone in it.
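
    The arithmetic behind the 33 bit target and the 18.1 bit result can be checked directly. The helper names here are our own.

```javascript
// Bits needed to single out one member of a population.
const bitsToIdentify = (population) => Math.log2(population);

// Size of the pool in which a fingerprint carrying `bits` of entropy is expected to repeat.
const oneIn = (bits) => 2 ** bits;

console.log(bitsToIdentify(5e9).toFixed(1)); // "32.2", so 33 whole bits
console.log(Math.round(oneIn(18.1)));        // a pool of roughly 281 thousand
```

    The rounded 18.1 bits gives a slightly smaller pool than the quoted one in 286,777, which presumably reflects the unrounded entropy figure.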

    The signals have shifted since then but the conclusion got stronger. Plugins and Flash are gone, which removed two of the richest old signals. In their place, canvas and WebGL became the heavy hitters. The 2016 AmIUnique study by Pierre Laperdrix and colleagues collected 118,934 fingerprints and found 89.4 percent of them unique, with canvas rendering now one of the most discriminating attributes. The reason a script reads your GPU through canvas, WebGL, and audio is that hardware variation is a deep, stable well of entropy that survives browser updates far better than a version number does.

    Why a fingerprint stays stable

    An identifier is only useful for tracking if it is the same tomorrow. Fingerprints are not perfectly stable. You update your browser and the user agent changes. You plug in an external monitor and the screen resolution changes. Eckersley measured this churn and found fingerprints shifted often, yet a simple heuristic could still guess the previous version of about 65 percent of changed fingerprints, and 99.1 percent of those guesses were correct, because usually only one attribute moves at a time while the rest hold.
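
    A simplified sketch of that kind of relinking follows. It is not Eckersley’s exact algorithm: `relink` is a hypothetical helper that accepts a match only when exactly one stored fingerprint differs in at most one attribute, and the attribute names and values are invented.

```javascript
// Return the single stored fingerprint that differs from `current` in at most
// `maxChanged` attributes, or null when there is no match or more than one.
function relink(current, stored, maxChanged = 1) {
  const keys = Object.keys(current);
  const close = stored.filter((old) =>
    keys.filter((k) => old[k] !== current[k]).length <= maxChanged);
  return close.length === 1 ? close[0] : null;
}

const known = [
  { ua: 'Chrome/125', screen: '1920x1080', tz: 'America/New_York', fonts: 'f1' },
  { ua: 'Firefox/126', screen: '1440x900', tz: 'Europe/Paris', fonts: 'f2' },
];
const today = { ua: 'Chrome/126', screen: '1920x1080', tz: 'America/New_York', fonts: 'f1' };

console.log(relink(today, known) === known[0]); // true
```

    Refusing to guess when several candidates are equally close trades coverage for accuracy, which matches the shape of the measured result: fewer guesses, but very few wrong ones.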

    The hardware derived signals are the anchor. Your GPU, your audio chip, and your installed fonts change far less often than your browser version. Canvas and WebGL hashes can stay identical across browser updates because they reflect silicon and drivers, not software labels. A tracker that sees most of your fingerprint stay constant while one field drifts can follow you across the change. The bundle is sticky even when its parts are not.

    Why this is a privacy and security threat

    The first harm is plain tracking. A fingerprint is a cookie that you cannot clear and did not consent to. Advertising and analytics networks use it to recognize you across sites and sessions even after you delete cookies or switch to a private window. It works in the exact moments people reach for privacy, which is what makes it worse than a cookie rather than equal to one.

    The second harm is deanonymization. Suppose you use one browser profile for an ordinary logged in account and the same browser, in incognito, for something you want kept separate. If both sessions produce the same fingerprint, a service that sees both can tie them to one device. The anonymity you expected from a fresh window evaporates, because the device itself was the identifier the whole time. The same linkage works across sites that share data with a common third party. If an advertising network is embedded on two unrelated sites and both reads return the same fingerprint, that network can join your activity on both, no cookie required and no account needed.

    There is a quieter harm that compounds the first two. Because a fingerprint is read passively, it can be collected before any consent dialog appears and without leaving an obvious trace in the browser. A user inspecting cookies and storage sees nothing unusual, because the identifier lives on the server side, derived from a few script calls that look like ordinary feature detection. The absence of a visible artifact is part of what makes the technique hard to govern. You cannot easily audit what you cannot see being stored. Fingerprinting is also not the only data that leaks about you without a prompt: a photo you share can carry hidden EXIF metadata such as the GPS coordinates where it was taken, which you can inspect and strip with our free EXIF metadata viewer and scrubber before you post it.

    The third use cuts the other way, toward defense, and it is worth being honest about. The same fingerprint that tracks you also helps fraud and account takeover systems. When your bank sees a login from an account it knows, on a device whose fingerprint it has seen many times, it can wave you through. When the same account suddenly logs in from a device with a fingerprint never seen before, that is a signal worth a second factor. Fingerprinting is a tracking threat and an anti fraud tool at the same time, and which one it is depends entirely on who is doing it and why. The mechanics are identical.

    Defenses, and their honest limits

    You cannot turn off fingerprinting the way you can clear a cookie, but you can shrink your entropy or muddy the signal. The approaches split into two camps, and both have real limits.

    • Randomization. Some browsers add small noise to canvas, audio, and WebGL output so the hash differs on each read. Brave does this by default. The catch is that a fingerprint that changes every visit can itself be a recognizable trait, and a determined tracker can sometimes average the noise out across reads.
    • Uniformity. The Tor Browser takes the other path. It tries to make every user look identical by standardizing the window size, blocking or faking many signals, and prompting before a canvas can be read. If everyone in the crowd looks the same, no fingerprint stands out. The cost is a more restricted browsing experience, and the protection only holds while you behave like the standard configuration. Resize the window or install an extension and you start to stand out again.
    • Built in browser modes. Firefox ships a resist fingerprinting mode alongside its fingerprinting protection features, and Safari trims the data it exposes. These help, but vendors balance privacy against breaking real sites, so the protection is partial by design. Each blocked signal that a normal site relies on is a site that might break.

    The uncomfortable truth is that better fingerprinting protection can make you more unique, not less, if it makes your browser behave unlike anyone else’s. Privacy here is a crowd problem. You are safest when you look like everyone around you, and most hardening makes you look different. For the full vocabulary around tracking, identifiers, and the attack surface of the browser, our web security glossary is a good companion. The research mindset that exposes fingerprinting in the first place, asking what a system quietly assumes and then testing it, is the same one behind how researchers find vulnerabilities.

    The assumption that breaks

    Every privacy tool aimed at the casual user rests on one assumption: that your identity online is a thing you store, so deleting what you stored makes you anonymous again. Clear the cookies, open a private window, wipe the history, and you are someone new. Browser fingerprinting breaks that assumption at the root. There was never anything stored on your side to delete. The identifier is your device, read live from the screen, the graphics chip, the fonts, the audio stack, the clock. You did not save it and you cannot erase it, because it is not a record. It is a measurement.

    That is the gap worth sitting with. The thing people trust to make them anonymous, clearing local state, targets the wrong layer entirely. The fingerprint lives one level below, in the physical and configured reality of the machine, and that level does not reset when you clear your cookies. Anonymity online was supposed to be something you could reclaim by forgetting. Fingerprinting quietly turned it into something the device remembers for you.

    Frequently asked questions

    Does clearing cookies or using incognito stop browser fingerprinting?

    No. A fingerprint is not stored on your device, so there is nothing to clear. It is recomputed on each visit from attributes your browser exposes, like screen size, time zone, installed fonts, and how your GPU renders a canvas. A private window blocks cookie and history persistence, not these signals, so the same fingerprint reappears. The Electronic Frontier Foundation explains this in its Cover Your Tracks project.

    How many bits of information does it take to identify a browser uniquely?

    Identity is measured in entropy, in bits, where each bit halves the pool of people you could be. To be unique among roughly 5 billion internet users you need about 33 bits. In the 2010 Panopticlick study across 470,161 browsers, the full fingerprint carried at least 18.1 bits of entropy, enough that only about one in 286,777 other browsers would share a given fingerprint, and 94.2 percent of browsers running Flash or Java were outright unique in that sample.

    What is canvas fingerprinting?

    Canvas fingerprinting asks the browser to draw text and shapes onto a hidden HTML5 canvas, then reads the pixels back with toDataURL() and hashes them. Two machines given identical drawing instructions produce slightly different pixels because of differences in GPU, graphics driver, fonts, and the operating system rasterizer. The technique was first described by Keaton Mowery and Hovav Shacham in their 2012 paper Pixel Perfect, and canvas is now one of the most discriminating fingerprint signals.

    Is browser fingerprinting only used for tracking?

    No. The same signals power tracking and deanonymization on the privacy invading side, and fraud detection and account takeover protection on the defensive side. A bank can recognize a known device by its fingerprint and challenge a login from a device it has never seen. The mechanics are identical. Whether fingerprinting is a threat or a safeguard depends on who collects it and why, which Mozilla covers in its MDN guide on fingerprinting.

  • How a Device Decides to Trust Its Own Firmware

    How a Device Decides to Trust Its Own Firmware

    A phone, a router, a smart camera, and a car all start the same way. Power arrives, a CPU comes alive, and within microseconds the chip has to answer one question before it does anything else: should I run the code sitting in flash, or has someone swapped it for their own? Secure boot is the machinery that answers that question. It builds a chain of checks that starts in a tiny piece of code burned into the silicon, which not even the manufacturer can change afterward, then extends trust outward one stage at a time until a full operating system is running. This post walks that chain from the bottom up. We start at the immutable boot ROM and the hardware root of trust, follow how each stage verifies the next with a signature check, and then look at the real places attackers break the chain, not by cracking the cryptography, but by stopping the check from running at all.

    What secure boot is actually deciding

    Strip away the acronyms and secure boot is a single repeated decision. At every handoff during startup, the code that is currently in control measures the code it is about to run, checks that measurement against a trusted reference, and refuses to continue if they do not match. The reference is a digital signature. The trusted party is the device maker, who signed each firmware image with a private key that never leaves their build infrastructure. The device holds the matching public key, or a fingerprint of it, and uses that to confirm the signature was made by the right party and that not one byte of the image has changed since.

    That sounds simple, and conceptually it is. The hard part is the very first link. To check a signature you need a trusted public key. To trust that public key you need something that was itself never tampered with. You cannot verify your way down forever, so the chain has to terminate in something the attacker physically cannot rewrite. That something is the hardware root of trust, and everything else hangs off it.

    The hardware root of trust: where trust has to start

    The root of trust is not software in the usual sense. It is a small block of code fixed permanently in the chip during manufacturing, called the boot ROM, plus a place to store the device maker’s public key fingerprint that can be written once and never again. When the CPU comes out of reset, the program counter does not point at flash. It points at this boot ROM. The very first instruction the processor runs is code the attacker has no way to modify, because it was etched into the silicon mask. This is the anchor. If an attacker could change the boot ROM, the whole scheme would collapse, so the design makes that physically impossible rather than merely difficult.

    eFuses and one time programmable memory

    The boot ROM needs the device maker’s public key to check the next stage, but baking a full 4096 bit key into the ROM is wasteful and inflexible. Instead the chip stores only a cryptographic hash of the public key, often a SHA-256 or SHA-384 digest, in a bank of eFuses. An eFuse is a microscopic link that the factory can blow exactly once by passing current through it, flipping a bit permanently, on most chips from zero to one. This kind of storage is called one time programmable, or OTP. Once the key hash is fused in, there is no electrical way to roll a blown fuse back to its original state. The key fingerprint becomes a permanent property of that physical chip.

    The flow at first power on goes like this. The boot ROM reads the actual public key from flash, where it sits alongside the signed firmware. It hashes that key and compares the result against the fingerprint locked in the eFuses. If they match, the key is genuine and can be trusted to verify signatures. If they do not match, the boot ROM stops. This indirection is deliberate. The chip commits to a tiny fixed value, the hash, while the full key lives in cheaper rewritable storage. An attacker can replace the key in flash, but then its hash no longer matches the fuses, and the boot ROM rejects it.
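
    That comparison can be sketched in a few lines. The key bytes here are stand-in strings rather than a real key encoding, `keyMatchesFuses` is a hypothetical name, and SHA-256 is assumed as the fused digest.

```javascript
const crypto = require('node:crypto');

// Accept the public key read from flash only if its SHA-256 digest equals the fused value.
function keyMatchesFuses(publicKeyBytes, fusedDigest) {
  const digest = crypto.createHash('sha256').update(publicKeyBytes).digest();
  return digest.length === fusedDigest.length && crypto.timingSafeEqual(digest, fusedDigest);
}

// Stand-ins: the factory fuses the digest of the genuine key exactly once.
const genuineKey = Buffer.from('genuine-public-key-bytes');
const fuses = crypto.createHash('sha256').update(genuineKey).digest();

console.log(keyMatchesFuses(genuineKey, fuses));                  // true
console.log(keyMatchesFuses(Buffer.from('attacker-key'), fuses)); // false
```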

    The fuse does not store a secret. It stores a public fingerprint that can never be unsaid, and that permanence is the entire point. Everything the device will ever trust traces back to a value the attacker cannot rewrite.

    Walking the chain upward, one signature at a time

    With a trusted key in hand, the boot ROM can verify the next piece of code. That next piece is usually the first stage bootloader, a small program in flash whose job is to bring up enough of the system to load the larger pieces that follow. The image is shipped with a signature: the device maker hashed the bootloader, signed that hash with their private key, and appended the result. The boot ROM hashes the bootloader it found in flash and uses the now trusted public key to check that the signature matches that hash. Match means the bootloader is authentic and unmodified, so control passes to it. Mismatch means stop.

    Here is the structural idea that makes secure boot work. Each stage, once verified, becomes trusted, and it carries the same responsibility forward. The first stage bootloader verifies the second stage. The second stage verifies the operating system kernel. On a device with a richer software stack, the chain can keep going into a hypervisor or a trusted execution environment. Each link uses the same pattern: hash the next image, verify its signature against a key that the current trusted stage already vouches for, refuse to continue on failure. Trust flows in one direction only, from the silicon outward, and it is never assumed, only checked and passed along.

    [ Boot ROM ]        immutable, in silicon
         |  verifies signature of
         v
    [ First stage bootloader ]   in flash, signed
         |  verifies signature of
         v
    [ Second stage bootloader ]  in flash, signed
         |  verifies signature of
         v
    [ OS kernel ]                in flash, signed
         |
         v
    [ Applications ]
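
    The chain in the diagram can be sketched as a loop that stops at the first failed check. Several simplifications are assumed: one Ed25519 key pair generated on the spot verifies every stage, whereas a real design may have each stage carry the key for the next and may use RSA or ECDSA; the stage names and image contents are placeholders.

```javascript
const crypto = require('node:crypto');

// Hypothetical vendor key pair; on a real device only the public half is present.
const { publicKey, privateKey } = crypto.generateKeyPairSync('ed25519');
const signImage = (image) => ({ image, signature: crypto.sign(null, image, privateKey) });

// Walk the stages in order and stop at the first one that fails verification.
function bootChain(stages, trustedKey) {
  const booted = [];
  for (const { name, image, signature } of stages) {
    if (!crypto.verify(null, image, trustedKey, signature)) {
      return { ok: false, failedAt: name, booted };
    }
    booted.push(name);
  }
  return { ok: true, failedAt: null, booted };
}

const stages = [
  { name: 'bootloader1', ...signImage(Buffer.from('bl1 code')) },
  { name: 'bootloader2', ...signImage(Buffer.from('bl2 code')) },
  { name: 'kernel', ...signImage(Buffer.from('kernel code')) },
];

console.log(bootChain(stages, publicKey).ok); // true
stages[1].image = Buffer.from('bl2 code, patched');
console.log(bootChain(stages, publicKey).failedAt); // bootloader2
```

    Nothing after the failed stage runs, which is the enforcement property: the kernel is never reached once the second bootloader fails its check.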

    A useful contrast helps here. Some systems do measured boot instead of, or alongside, secure boot. Measured boot does not stop a bad image from running. It records a hash of each stage into a secure log, often inside a security chip, so a later party can inspect the log and decide whether the device is in a known good state. Secure boot is enforcement: a bad stage never runs. Measured boot is evidence: a bad stage runs but leaves a record. Many designs use both, because they answer different questions.
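
    One common way to keep such a record is a hash chain, in the style of the extend operation on a TPM platform configuration register. The sketch below assumes SHA-256 and an all zero starting value; `extend` and `measureBoot` are hypothetical names and the stage contents are placeholders.

```javascript
const crypto = require('node:crypto');
const sha256 = (data) => crypto.createHash('sha256').update(data).digest();

// Extend a running measurement: new = SHA-256(old || SHA-256(stage)).
function extend(register, stageImage) {
  return sha256(Buffer.concat([register, sha256(stageImage)]));
}

// Measure every stage in order, starting from an all zero register.
function measureBoot(stageImages) {
  return stageImages.reduce(extend, Buffer.alloc(32));
}

const good = measureBoot([Buffer.from('bl1'), Buffer.from('bl2'), Buffer.from('kernel')]);
const bad = measureBoot([Buffer.from('bl1'), Buffer.from('evil'), Buffer.from('kernel')]);
console.log(good.equals(bad)); // false
```

    The modified stage still runs in this model. What changes is the final value, which a later verifier can compare against the value expected for a known good boot.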

    Anti rollback: blocking the downgrade trick

    Signature checking alone has a gap. Suppose version 5 of the firmware shipped with a security fix, but version 3 was also signed by the same valid key a year earlier and had a flaw. An attacker who keeps a copy of the old version 3 image can flash it back. Its signature is still valid, because the key has not changed, so a naive secure boot accepts it. The attacker has downgraded the device to a vulnerable but properly signed build. This is a rollback attack, and it defeats the purpose of patching.

    The defense is an anti rollback counter, a monotonic version number stored in OTP fuses or other secure non volatile memory. Each firmware image carries its own version number, and the counter records the lowest version the device will still accept. When a new version boots, it can burn the counter forward to its own version. From then on, the boot process refuses any image whose version is below the stored counter, even if that image is perfectly signed. Because the counter lives in fuses that only move in one direction, the attacker cannot wind it back. The old signed image becomes unbootable on that device. This is why secure boot designs care about a monotonic counter as much as about signatures: the signature proves who made the image, and the counter proves it is recent enough to trust.
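
    A small model of that rule follows. `RollbackCounter` and `allowBoot` are hypothetical names, the signature check is assumed to have passed already, and a private class field stands in for fuses that can only move forward.

```javascript
// Model of a fuse backed counter: it can move forward but never back.
class RollbackCounter {
  #value = 0;
  get value() { return this.#value; }
  advanceTo(version) {
    if (version > this.#value) this.#value = version;
  }
}

// Boot decision for an image whose signature has already been verified.
function allowBoot(imageVersion, counter) {
  if (imageVersion < counter.value) return false; // signed but too old
  counter.advanceTo(imageVersion);                // burn forward after accepting
  return true;
}

const counter = new RollbackCounter();
console.log(allowBoot(3, counter)); // true
console.log(allowBoot(5, counter)); // true
console.log(allowBoot(3, counter)); // false, version 3 is now below the counter
```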

    Where the secure boot chain actually breaks

    Now the interesting part. In almost every real world bypass, the cryptography stays intact. Nobody factors the RSA key or finds a hash collision. Attackers go after the assumption underneath the whole scheme: that the verify step always runs, and always runs correctly. Break that assumption and the strongest signature in the world never gets checked. Here are the recurring weak points, described as concepts rather than as a recipe.

    Stages that were never signed in the first place

    The simplest break is a chain with a missing link. A designer signs the bootloader and the kernel but forgets to verify a later component: a configuration blob, a device tree, a secondary processor’s firmware, or a recovery image. Any stage that loads code without checking a signature is an open door. The attacker does not need to defeat the strong links. They walk through the unverified one and gain control inside the trusted boot flow. Secure boot is only as strong as its weakest handoff, and a single unsigned stage anywhere in the sequence voids the guarantee for everything that runs after it. This is the same lesson as ordinary software privilege escalation, where one component that trusts input it should have checked hands an attacker more power than they were supposed to have.

    Debug interfaces left wide open

    Chips ship with hardware debug ports for development: JTAG, serial wire debug known as SWD, and a serial console over UART. These let an engineer halt the processor, read and write memory, and single step through code. They are essential during development and are supposed to be disabled or locked before a device ships. When they are left enabled, secure boot becomes almost beside the point. An attacker with a few dollars of wiring can attach a debugger, halt the CPU partway through boot, and either patch the comparison that decides whether a signature matched or simply jump past the check entirely. The signature is still valid and still present. It is just never the thing that decides what runs.

    A UART console deserves its own mention because it is so often overlooked. A serial port that drops to an interactive bootloader prompt, or that prints enough internal state to map the boot flow, gives an attacker both a foothold and a blueprint. Many embedded compromises start with nothing more exotic than soldering three wires to test pads and watching what the device says about itself as it boots.

    Fault injection: glitching the check into passing

    The most striking attacks accept that the signature check runs, then make it lie. Fault injection, also called glitching, deliberately pushes the chip outside its safe operating range for a few nanoseconds at a precise moment. A sharp dip or spike on the power supply, a sudden change in the clock, or a focused electromagnetic pulse can cause a single instruction to misbehave. The processor might skip an instruction, or compute the wrong result for a comparison. If that corrupted instruction happens to be the branch that says jump to failure if the signature did not match, the device sails on as if the check passed.

    This is not a theoretical worry. Security researchers have publicly demonstrated voltage glitching that bypasses secure boot on the popular ESP32 microcontroller, timing the glitch to land exactly when the boot ROM performs its verification. On Nordic Semiconductor’s nRF52 chips, a fault injection attack presented at Black Hat Europe in 2020 by the researcher behind LimitedResults defeated the APPROTECT feature that is meant to lock the debug port, effectively resurrecting full SWD debug access on a chip that was supposed to be sealed. Researchers have also used electromagnetic fault injection against the Linux kernel authentication stage of Android secure boot on an ARM Cortex A53, getting the device to accept an unsigned kernel some fraction of the time. The pattern across all of these is identical. The math was never attacked. The hardware was nudged into not running the math.

    TOCTOU: verify one image, run another

    There is a subtler failure that does not need a single physical fault. It is a time of check to time of use problem, usually shortened to TOCTOU. The boot code reads an image, verifies its signature, and then, in a separate step, loads the image into memory and runs it. If the storage can change between the verify and the load, an attacker can present a good image during the check and swap in a malicious one before it actually executes. The check passed honestly. It just validated a copy that is no longer the one being run. This shows up when verification reads from a location that direct memory access or a second processor can still write to, or when the image is verified in place and then copied with no re check. The fix is to verify the exact bytes you are about to execute, after they are in memory you control, and never give anything else a window to touch them in between.
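
    The difference between the vulnerable and the fixed pattern is small enough to show side by side. In this sketch a digest comparison stands in for full signature verification, and `racyStorage` is a made up stand-in for flash that an attacker can rewrite after the first read.

```javascript
const crypto = require('node:crypto');
const digestOf = (bytes) => crypto.createHash('sha256').update(bytes).digest('hex');

// Storage that an attacker swaps after the first read.
function racyStorage(goodImage, evilImage) {
  let reads = 0;
  return { read: () => Buffer.from(reads++ === 0 ? goodImage : evilImage) };
}

// Vulnerable: verifies one read, then executes a second read.
function bootUnsafe(storage, expectedDigest) {
  if (digestOf(storage.read()) !== expectedDigest) return null;
  return storage.read().toString();
}

// Safer: copy once into private memory, verify that copy, execute that copy.
function bootSafe(storage, expectedDigest) {
  const copy = storage.read();
  if (digestOf(copy) !== expectedDigest) return null;
  return copy.toString();
}

const expected = digestOf(Buffer.from('good'));
console.log(bootUnsafe(racyStorage('good', 'evil'), expected)); // evil
console.log(bootSafe(racyStorage('good', 'evil'), expected));   // good
```

    Both functions run the same check. Only the second one guarantees that the bytes checked are the bytes executed.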

    Rollback and key handling mistakes

    Even with everything else right, weak key handling unravels the chain. If the anti rollback counter is never actually advanced, old signed images with known flaws stay bootable. If a device maker’s signing key leaks, every device that trusts it will happily run attacker firmware, and revoking a key fingerprint that is fused into millions of chips ranges from painful to impossible. If the same key signs every product line with no segmentation, one leak compromises the entire fleet. These are not glamorous attacks, but they are common, because key management is operationally hard and the consequences are permanent in a way that software bugs are not.
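
    The counter mistake is easy to see in a small model. The Python sketch below is illustrative only: RollbackCounter and boot are hypothetical names, signature_ok is passed in as an assumed result of a separate signature check, and the advance flag exists purely to reproduce the bug where the counter is checked but never moved forward.

```python
class RollbackCounter:
    """Hypothetical model of a fuse backed counter that only moves forward."""

    def __init__(self, value: int = 0):
        self._value = value

    @property
    def value(self) -> int:
        return self._value

    def advance_to(self, new_value: int) -> None:
        if new_value < self._value:
            raise ValueError("a fused counter cannot be rewound")
        self._value = new_value


def boot(image_version: int, signature_ok: bool,
         counter: RollbackCounter, advance: bool = True) -> bool:
    """Return True if the image is allowed to run."""
    if not signature_ok:
        return False
    if image_version < counter.value:
        return False  # validly signed, but older than the enforced minimum
    if advance:
        counter.advance_to(image_version)
    return True


# Mistake: the counter is compared but never advanced, so v1 still boots after v3.
lazy = RollbackCounter()
boot(3, True, lazy, advance=False)
print(boot(1, True, lazy))    # True

# Enforced: once v3 has booted, the older signed v1 is refused.
strict = RollbackCounter()
boot(3, True, strict)
print(boot(1, True, strict))  # False
```

    Real devices typically advance the counter only after the new image has proven it boots correctly, so a bad update does not permanently lock out the last working version. That policy is left out of this sketch.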

    Why the secure boot chain holds or fails as a whole

    Look back across the breaks and a single shape emerges. The boot ROM is immutable, the fuses cannot be rewound, the signatures are cryptographically sound, and the chain is logically airtight. Attackers ignore all of that and target the seams. An unsigned stage means a link that never checks. An open JTAG port means the check can be patched out. A glitch means the check runs but produces the wrong answer. A TOCTOU window means the check validated the wrong bytes. In each case the cryptography is fine and the device is still owned, because the thing that failed was the guarantee that verification happens, on the real payload, every single time.

    This is the same way the most interesting software vulnerabilities get found. You do not start from a list of known bad inputs. You ask what each component is assuming about the thing that calls it or the thing it loads, then you find a way to make that assumption false. We dig into that mindset in our piece on how attackers find vulnerabilities. Hardware and software both reward the same question: where does this system trust something it never actually verified, and what happens when I stand in that gap?

    What a defender should take away

    If you build or buy devices that depend on secure boot, the checklist follows straight from the failure modes above. Verify every stage, with no unsigned component anywhere in the load order, including recovery paths and secondary processors. Disable or permanently lock JTAG, SWD, and UART debug access in production, and treat a chip whose debug lock can itself be glitched off as a chip whose debug lock you do not really have. Burn and enforce anti rollback counters so old signed images cannot come back. Verify the exact bytes you execute, after they are in memory you control, to close TOCTOU windows. Treat fault injection as a real threat for any device an attacker can physically hold, and prefer chips with hardened verification and glitch detection. And guard the signing keys as the crown jewels they are, because a fused root of trust is forever, in both directions.

    None of these controls is exotic on its own. The failures happen at the joins, where a reasonable looking design quietly assumed that a check would run when it did not, or that it covered bytes that had since been swapped. That is the heart of it. Secure boot does not fail because the cryptography is weak. It fails because an attacker found a way to make the verify step not run, run on the wrong thing, or return the wrong answer on a chip that was pushed past its limits. The cryptography assumes the check happens. The attacker breaks the assumption, not the math. Testing that assumption, asking whether the verify step truly runs every time on the real payload, is where the real security work lives.

    Frequently asked questions

    What is the hardware root of trust in secure boot?

    It is the part of the chain that an attacker physically cannot rewrite: the immutable boot ROM, which is the first code the CPU runs at reset and is etched into the silicon, plus a one time programmable store such as eFuses that holds a hash of the device maker’s public key. The boot ROM uses that fused fingerprint to confirm the verification key is genuine before it checks any signature, so trust starts from a value no one can change after manufacturing.

    Why store a key hash in eFuses instead of the full key?

    An eFuse is a link the factory blows once, flipping a bit permanently, so the storage is one time programmable and cannot be rolled back. Storing a SHA-256 or SHA-384 hash of the public key costs 256 or 384 fuse bits, far fewer than a full 4096 bit key would need, while still pinning the chip to one trusted key. The complete key lives in cheaper rewritable flash, and the boot ROM rejects it if its hash does not match the fingerprint locked in the fuses. ARM describes these trust anchors in its platform security documentation.
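
    The fuse arithmetic and the pinning check can both be shown with the Python standard library. This is a rough sketch, not any vendor’s boot ROM logic: fuse_bits_needed and key_matches_fuses are hypothetical helpers, and the 512 byte placeholder stands in for a 4096 bit public key read from flash.

```python
import hashlib
import hmac


def fuse_bits_needed(hash_name: str) -> int:
    """Bits of one time programmable storage needed to hold one digest."""
    return hashlib.new(hash_name).digest_size * 8


def key_matches_fuses(public_key: bytes, fused_digest: bytes,
                      hash_name: str = "sha256") -> bool:
    """Accept a key loaded from flash only if its hash equals the fused value."""
    digest = hashlib.new(hash_name, public_key).digest()
    return hmac.compare_digest(digest, fused_digest)


# Placeholder bytes standing in for a 4096 bit (512 byte) public key.
maker_key = bytes(range(256)) * 2
fused = hashlib.sha256(maker_key).digest()  # what the factory would burn

print(len(maker_key) * 8)                          # 4096
print(fuse_bits_needed("sha256"))                  # 256
print(fuse_bits_needed("sha384"))                  # 384
print(key_matches_fuses(maker_key, fused))         # True
print(key_matches_fuses(b"attacker key", fused))   # False
```

    With SHA-256 the fused fingerprint is sixteen times smaller than the key it pins, and swapping the key in flash changes the digest, so the substituted key is rejected before any signature is checked.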

    How do attackers bypass secure boot without breaking the cryptography?

    They stop the verify step from running or make it lie. Common breaks include a later stage that was never signed, debug ports such as JTAG, SWD, or UART left enabled so the check can be patched out, and fault injection or glitching that nudges the chip into skipping the comparison. There is also TOCTOU, where the code verifies one image and then loads a different one. In each case the signature math is sound and the device is still compromised.

    What is an anti rollback counter and why does it matter?

    It is a monotonic version number stored in fuses or secure non volatile memory that only ever moves forward. Without it, an attacker can reflash an older firmware version that is still validly signed but has a known flaw, undoing a security patch. With it, the boot code refuses to run any image whose version is below the stored value, so old signed builds become unbootable once the counter has advanced past them. NIST covers this rollback prevention in SP 800-193.