How a Device Decides to Trust Its Own Firmware

A phone, a router, a smart camera, and a car all start the same way. Power arrives, a CPU comes alive, and within microseconds the chip has to answer one question before it does anything else: should I run the code sitting in flash, or has someone swapped it for their own? Secure boot is the machinery that answers that question. It builds a chain of checks that starts in a tiny piece of code burned into the silicon, which no one can change after manufacturing, then extends trust outward one stage at a time until a full operating system is running. This post walks that chain from the bottom up. We start at the immutable boot ROM and the hardware root of trust, follow how each stage verifies the next with a signature check, and then look at the real places attackers break the chain, not by cracking the cryptography, but by stopping the check from running at all or making it run on the wrong thing.

What secure boot is actually deciding

Strip away the acronyms and secure boot is a single repeated decision. At every handoff during startup, the code that is currently in control measures the code it is about to run, checks that measurement against a trusted reference, and refuses to continue if they do not match. The reference is a digital signature. The trusted party is the device maker, who signed each firmware image with a private key that never leaves their build infrastructure. The device holds the matching public key, or a fingerprint of it, and uses that to confirm the signature was made by the right party and that not one byte of the image has changed since.

That sounds simple, and conceptually it is. The hard part is the very first link. To check a signature you need a trusted public key. To trust that public key you need something that was itself never tampered with. You cannot verify your way down forever, so the chain has to terminate in something the attacker physically cannot rewrite. That something is the hardware root of trust, and everything else hangs off it.

The hardware root of trust: where trust has to start

The root of trust is not software in the usual sense. It is a small block of code fixed permanently in the chip during manufacturing, called the boot ROM, plus a place to store the device maker’s public key fingerprint that can be written once and never again. When the CPU comes out of reset, the program counter does not point at flash. It points at this boot ROM. The very first instruction the processor runs is code the attacker has no way to modify, because it was fixed in the silicon when the chip was fabricated. This is the anchor. If an attacker could change the boot ROM, the whole scheme would collapse, so the design makes that physically impossible rather than merely difficult.

eFuses and one time programmable memory

The boot ROM needs the device maker’s public key to check the next stage, but baking a full 4096 bit key into the ROM is wasteful and inflexible. Instead the chip stores only a cryptographic hash of the public key, often a SHA-256 or SHA-384 digest, in a bank of eFuses. An eFuse is a microscopic link that can be blown exactly once by passing current through it, permanently changing the value of one bit. This kind of storage is called one time programmable, or OTP. Once the key hash is fused in, there is no electrical way to roll a blown fuse back to its original state. The key fingerprint becomes a permanent property of that physical chip.

The flow at first power on goes like this. The boot ROM reads the actual public key from flash, where it sits alongside the signed firmware. It hashes that key and compares the result against the fingerprint locked in the eFuses. If they match, the key is genuine and can be trusted to verify signatures. If they do not match, the boot ROM stops. This indirection is deliberate. The chip commits to a tiny fixed value, the hash, while the full key lives in cheaper rewritable storage. An attacker can replace the key in flash, but then its hash no longer matches the fuses, and the boot ROM rejects it.
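
The key check described above can be sketched in a few lines of Python. This is a minimal illustration, not boot ROM code: the key bytes, the `FUSED_DIGEST` constant, and the function name `key_matches_fuses` are all hypothetical, and SHA-256 is assumed as the fingerprint hash.

```python
import hashlib
import hmac

def key_matches_fuses(public_key_bytes: bytes, fused_digest: bytes) -> bool:
    """Return True only if the key read from flash hashes to the fused fingerprint."""
    computed = hashlib.sha256(public_key_bytes).digest()
    # compare_digest runs in constant time, so it does not leak how many bytes matched
    return hmac.compare_digest(computed, fused_digest)

# Factory provisioning (hypothetical): hash the genuine key and burn the digest once.
genuine_key = b"-----hypothetical device maker public key-----"
FUSED_DIGEST = hashlib.sha256(genuine_key).digest()

# At boot: the key found in flash is untrusted until it hashes to the fused value.
attacker_key = b"-----attacker public key-----"
print(key_matches_fuses(genuine_key, FUSED_DIGEST))   # True
print(key_matches_fuses(attacker_key, FUSED_DIGEST))  # False
```

The device never has to keep the key itself in immutable storage. It only has to keep 32 bytes that any candidate key must hash to.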

The fuse does not store a secret. It stores a public fingerprint that can never be unsaid, and that permanence is the entire point. Everything the device will ever trust traces back to a value the attacker cannot rewrite.

Walking the chain upward, one signature at a time

With a trusted key in hand, the boot ROM can verify the next piece of code. That next piece is usually the first stage bootloader, a small program in flash whose job is to bring up enough of the system to load the larger pieces that follow. The image is shipped with a signature: the device maker hashed the bootloader, signed that hash with their private key, and appended the resulting signature. The boot ROM hashes the bootloader it found in flash and uses the now trusted public key to check that the signature matches that hash. Match means the bootloader is authentic and unmodified, so control passes to it. Mismatch means stop.

Here is the structural idea that makes secure boot work. Each stage, once verified, becomes trusted, and it carries the same responsibility forward. The first stage bootloader verifies the second stage. The second stage verifies the operating system kernel. On a device with a richer software stack, the chain can keep going into a hypervisor or a trusted execution environment. Each link uses the same pattern: hash the next image, verify its signature against a key that the current trusted stage already vouches for, refuse to continue on failure. Trust flows in one direction only, from the silicon outward, and it is never assumed, only checked and passed along.

[ Boot ROM ]        immutable, in silicon
     |  verifies signature of
     v
[ First stage bootloader ]   in flash, signed
     |  verifies signature of
     v
[ Second stage bootloader ]  in flash, signed
     |  verifies signature of
     v
[ OS kernel ]                in flash, signed
     |
     v
[ Applications ]
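
As a rough sketch of that chain, the Python below walks a list of stages and stops at the first one whose signature does not verify. Everything here is an assumption made for illustration: the stage names, the `boot` function, and above all the signature scheme, which is a toy (textbook RSA with no padding and a tiny modulus) chosen only so the example runs with the standard library. It also uses a single key for every stage, which real designs need not do. It requires Python 3.8 or later for the modular inverse.

```python
import hashlib

# TOY signature scheme, for illustration only: textbook RSA without padding and
# a ~150-bit modulus built from two known Mersenne primes. Real devices use a
# vetted library with full-size keys and a proper scheme such as RSA-PSS or ECDSA.
P, Q = 2**61 - 1, 2**89 - 1
N, E = P * Q, 65537
D = pow(E, -1, (P - 1) * (Q - 1))   # private exponent, stays with the device maker

def digest_int(image: bytes) -> int:
    return int.from_bytes(hashlib.sha256(image).digest(), "big") % N

def sign(image: bytes) -> int:
    """Build server side: sign the image hash with the private key."""
    return pow(digest_int(image), D, N)

def verify(image: bytes, signature: int) -> bool:
    """Device side: check the signature using only the public values (N, E)."""
    return pow(signature, E, N) == digest_int(image)

def boot(stages):
    """Walk the chain and return the names of the stages that were allowed to run."""
    ran = ["boot_rom"]                  # immutable, trusted by construction
    for name, image, signature in stages:
        if not verify(image, signature):
            break                       # refuse to continue on failure
        ran.append(name)
    return ran

images = [("bl1", b"first stage"), ("bl2", b"second stage"), ("kernel", b"kernel")]
good = [(name, image, sign(image)) for name, image in images]
print(boot(good))       # ['boot_rom', 'bl1', 'bl2', 'kernel']

tampered = list(good)
tampered[1] = ("bl2", b"second stage + implant", good[1][2])  # old signature reused
print(boot(tampered))   # ['boot_rom', 'bl1']
```

Note that the tampered second stage never runs, and neither does the untouched kernel behind it. A failure anywhere stops everything downstream.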

A useful contrast helps here. Some systems do measured boot instead of, or alongside, secure boot. Measured boot does not stop a bad image from running. It records a hash of each stage into a secure log, often inside a security chip, so a later party can inspect the log and decide whether the device is in a known good state. Secure boot is enforcement: a bad stage never runs. Measured boot is evidence: a bad stage runs but leaves a record. Many designs use both, because they answer different questions.
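
The evidence side can be sketched with a hash chain, similar in spirit to how a TPM extends a platform configuration register. The function names and the sample images below are hypothetical, and SHA-256 is assumed.

```python
import hashlib

def extend(register: bytes, image: bytes) -> bytes:
    """Fold one stage's measurement into the running register."""
    measurement = hashlib.sha256(image).digest()
    return hashlib.sha256(register + measurement).digest()

def measure_boot(images) -> bytes:
    register = bytes(32)                 # assumed to start at all zeros on reset
    for image in images:
        # Measured boot records the stage, then lets it run whatever it contains.
        register = extend(register, image)
    return register

known_good = measure_boot([b"bl1 v5", b"bl2 v5", b"kernel v5"])
observed = measure_boot([b"bl1 v5", b"bl2 v5 + implant", b"kernel v5"])
print(observed == known_good)   # False
```

Because each value depends on every earlier one, a modified stage changes the final register, and a later stage cannot wind the register back to hide that.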

Anti rollback: blocking the downgrade trick

Signature checking alone has a gap. Suppose version 5 of the firmware shipped with a security fix, but version 3 was also signed by the same valid key a year earlier and had a flaw. An attacker who keeps a copy of the old version 3 image can flash it back. Its signature is still valid, because the key has not changed, so a naive secure boot accepts it. The attacker has downgraded the device to a vulnerable but properly signed build. This is a rollback attack, and it defeats the purpose of patching.

The defense is an anti rollback counter, a monotonic version number stored in OTP fuses or other secure non volatile memory. Each firmware image carries a version number that is covered by its signature, so an attacker cannot edit it without invalidating the image. When a new version boots successfully, it can burn the counter forward to its own version. From then on, the boot process refuses any image whose version is below the stored counter, even if that image is perfectly signed. Because the counter lives in fuses that only move in one direction, the attacker cannot wind it back. The old signed image becomes unbootable on that device. This is why secure boot designs care about a monotonic counter as much as about signatures: the signature proves who made the image, and the counter proves it is recent enough to trust.
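
A small model makes the one-way property concrete. The sketch below assumes a thermometer-coded counter, where the stored version is simply the number of blown fuses. That encoding, the class name `RollbackFuses`, and the bank size of 32 are illustrative assumptions, not a description of any particular chip.

```python
class RollbackFuses:
    """Thermometer-coded counter: the version is the number of blown fuses."""

    def __init__(self, size: int = 32):
        self._blown = [False] * size

    @property
    def minimum_version(self) -> int:
        return sum(self._blown)

    def advance_to(self, version: int) -> None:
        if version > len(self._blown):
            raise ValueError("not enough fuses left")
        # Blowing is one-way: this loop can set bits, and nothing ever clears them.
        for i in range(version):
            self._blown[i] = True

def may_boot(image_version: int, signature_ok: bool, fuses: RollbackFuses) -> bool:
    """An image needs a valid signature AND a version at or above the counter."""
    return signature_ok and image_version >= fuses.minimum_version

fuses = RollbackFuses()
print(may_boot(3, True, fuses))   # True: nothing burned yet
fuses.advance_to(5)               # version 5 booted and burned the counter
print(may_boot(3, True, fuses))   # False: validly signed, but too old
fuses.advance_to(3)               # an attempt to go backwards changes nothing
print(fuses.minimum_version)      # 5
```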

Where the secure boot chain actually breaks

Now the interesting part. In almost every real world bypass, the cryptography stays intact. Nobody factors the RSA key or finds a hash collision. Attackers go after the assumption underneath the whole scheme: that the verify step always runs, and always runs correctly. Break that assumption and the strongest signature in the world never gets checked. Here are the recurring weak points, described as concepts rather than as a recipe.

Stages that were never signed in the first place

The simplest break is a chain with a missing link. A designer signs the bootloader and the kernel but forgets to verify a later component: a configuration blob, a device tree, a secondary processor’s firmware, or a recovery image. Any stage that loads code without checking a signature is an open door. The attacker does not need to defeat the strong links. They walk through the unverified one and gain control inside the trusted boot flow. Secure boot is only as strong as its weakest handoff, and a single unsigned stage anywhere in the sequence voids the whole guarantee. This is the same lesson as ordinary software privilege escalation, where one component that trusts input it should have checked hands an attacker more power than they were supposed to have.
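
One way to make this reviewable is to write the load order down as data and flag every handoff that has no check. The manifest format below is entirely hypothetical, including the `Component` fields and the sample entries. It only illustrates the kind of audit a team might run against its own boot flow.

```python
from dataclasses import dataclass

@dataclass
class Component:
    name: str
    loaded_by: str
    verified_before_load: bool   # does the loader check a signature first?

def unverified_components(manifest):
    """Return the names of components that are loaded without any check."""
    return [c.name for c in manifest if not c.verified_before_load]

manifest = [
    Component("bl1", loaded_by="boot_rom", verified_before_load=True),
    Component("bl2", loaded_by="bl1", verified_before_load=True),
    Component("device_tree", loaded_by="bl2", verified_before_load=False),
    Component("kernel", loaded_by="bl2", verified_before_load=True),
    Component("recovery", loaded_by="bl1", verified_before_load=False),
]

print(unverified_components(manifest))   # ['device_tree', 'recovery']
```

A flag in a manifest is only a claim. The audit is useful as a checklist, but each entry still has to be confirmed against what the loader actually does.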

Debug interfaces left wide open

Chips ship with hardware debug ports for development: JTAG, serial wire debug known as SWD, and a serial console over UART. These let an engineer halt the processor, read and write memory, and single step through code. They are essential during development and are supposed to be disabled or locked before a device ships. When they are left enabled, secure boot becomes almost beside the point. An attacker with a few dollars of wiring can attach a debugger, halt the CPU partway through boot, and either patch the comparison that decides whether a signature matched or simply jump past the check entirely. The signature is still valid and still present. It is just never the thing that decides what runs.

A UART console deserves its own mention because it is so often overlooked. A serial port that drops to an interactive bootloader prompt, or that prints enough internal state to map the boot flow, gives an attacker both a foothold and a blueprint. Many embedded compromises start with nothing more exotic than soldering three wires to test pads and watching what the device says about itself as it boots.

Fault injection: glitching the check into passing

The most striking attacks accept that the signature check runs, then make it lie. Fault injection, also called glitching, deliberately pushes the chip outside its safe operating range for a few nanoseconds at a precise moment. A sharp dip or spike on the power supply, a sudden change in the clock, or a focused electromagnetic pulse can cause a single instruction to misbehave. The processor might skip an instruction, or compute the wrong result for a comparison. If that corrupted instruction happens to be the branch that says jump to failure if the signature did not match, the device sails on as if the check passed.
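
The effect of a single corrupted comparison can be shown with a simulation. The sketch below does not model real hardware: the `glitched` flag simply forces a comparison to report a match, and the function names are hypothetical. It contrasts a lone check with a redundant one, which is one commonly used software hardening.

```python
import hashlib
import hmac

def compare(computed: bytes, expected: bytes, glitched: bool) -> bool:
    """A comparison whose result a fault can force to 'match' (simulation only)."""
    return True if glitched else hmac.compare_digest(computed, expected)

def single_check(image: bytes, expected: bytes, faults: set) -> bool:
    digest = hashlib.sha256(image).digest()
    return compare(digest, expected, 0 in faults)

def double_check(image: bytes, expected: bytes, faults: set) -> bool:
    # Recompute and compare twice; both results must agree before booting.
    first = compare(hashlib.sha256(image).digest(), expected, 0 in faults)
    second = compare(hashlib.sha256(image).digest(), expected, 1 in faults)
    return first and second

expected = hashlib.sha256(b"signed firmware").digest()
evil = b"attacker firmware"

print(single_check(evil, expected, faults=set()))    # False: the check works
print(single_check(evil, expected, faults={0}))      # True: one glitch, bypassed
print(double_check(evil, expected, faults={0}))      # False: one glitch is not enough
print(double_check(evil, expected, faults={0, 1}))   # True: two timed glitches still win
```

Redundancy raises the cost of the attack rather than removing it, since an attacker who can land two precisely timed faults still gets through.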

This is not a theoretical worry. Security researchers have publicly demonstrated voltage glitching that bypasses secure boot on the popular ESP32 microcontroller, timing the glitch to land exactly when the boot ROM performs its verification. On Nordic Semiconductor’s nRF52 chips, a fault injection attack presented at Black Hat Europe in 2020 by the researcher behind LimitedResults defeated the APPROTECT feature that is meant to lock the debug port, effectively resurrecting full SWD debug access on a chip that was supposed to be sealed. Researchers have also used electromagnetic fault injection against the Linux kernel authentication stage of Android secure boot on an ARM Cortex A53, getting the device to accept an unsigned kernel some fraction of the time. The pattern across all of these is identical. The math was never attacked. The hardware was nudged into not running the math.

TOCTOU: verify one image, run another

There is a subtler failure that does not need a single physical fault. It is a time of check to time of use problem, usually shortened to TOCTOU. The boot code reads an image, verifies its signature, and then, in a separate step, loads the image into memory and runs it. If the storage can change between the verify and the load, an attacker can present a good image during the check and swap in a malicious one before it actually executes. The check passed honestly. It just validated a copy that is no longer the one being run. This shows up when verification reads from a location that direct memory access or a second processor can still write to, or when the image is verified in place and then copied with no re check. The fix is to verify the exact bytes you are about to execute, after they are in memory you control, and never give anything else a window to touch them in between.
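
The difference between the two orderings is easy to see in a simulation. In the sketch below, `SwappingFlash` is a hypothetical stand-in for storage an attacker can change between reads, and a plain SHA-256 digest comparison stands in for full signature verification to keep the example short.

```python
import hashlib

class SwappingFlash:
    """Attacker-controlled storage: returns the good image on the first read only."""

    def __init__(self, good: bytes, evil: bytes):
        self._good, self._evil, self._reads = good, evil, 0

    def read(self) -> bytes:
        self._reads += 1
        return self._good if self._reads == 1 else self._evil

def boot_vulnerable(flash, expected_digest: bytes):
    # Time of check: the first read is verified.
    if hashlib.sha256(flash.read()).digest() != expected_digest:
        return None
    # Time of use: a second read is executed, and it may not be the same bytes.
    return flash.read()

def boot_safe(flash, expected_digest: bytes):
    image = bytes(flash.read())          # copy once into memory we control
    if hashlib.sha256(image).digest() != expected_digest:
        return None
    return image                         # run the exact bytes that were verified

good, evil = b"signed kernel", b"attacker kernel"
expected = hashlib.sha256(good).digest()

print(boot_vulnerable(SwappingFlash(good, evil), expected))   # b'attacker kernel'
print(boot_safe(SwappingFlash(good, evil), expected))         # b'signed kernel'
```

Both functions perform an honest check. Only the second one guarantees that the bytes it checked are the bytes it hands over for execution.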

Rollback and key handling mistakes

Even with everything else right, weak key handling unravels the chain. If the anti rollback counter is never actually advanced, old signed images with known flaws stay bootable. If a device maker’s signing key leaks, every device that trusts it will happily run attacker firmware, and revoking a key fingerprint that is fused into millions of chips ranges from painful to impossible. If the same key signs every product line with no segmentation, one leak compromises the entire fleet. These are not glamorous attacks, but they are common, because key management is operationally hard and the consequences are permanent in a way that software bugs are not.

Why the secure boot chain holds or fails as a whole

Look back across the breaks and a single shape emerges. The boot ROM is immutable, the fuses cannot be rewound, the signatures are cryptographically sound, and the chain is logically airtight. Attackers ignore all of that and target the seams. An unsigned stage means a link that never checks. An open JTAG port means the check can be patched out. A glitch means the check runs but produces the wrong answer. A TOCTOU window means the check validated the wrong bytes. In each case the cryptography is fine and the device is still owned, because the thing that failed was the guarantee that verification happens, on the real payload, every single time.

This is the same way the most interesting software vulnerabilities get found. You do not start from a list of known bad inputs. You ask what each component is assuming about the thing that calls it or the thing it loads, then you find a way to make that assumption false. We dig into that mindset in our piece on how attackers find vulnerabilities. Hardware and software both reward the same question: where does this system trust something it never actually verified, and what happens when I stand in that gap?

What a defender should take away

If you build or buy devices that depend on secure boot, the checklist follows straight from the failure modes above. Verify every stage, with no unsigned component anywhere in the load order, including recovery paths and secondary processors. Disable or permanently lock JTAG, SWD, and UART debug access in production, and treat a chip whose debug lock can itself be glitched off as a chip whose debug lock you do not really have. Burn and enforce anti rollback counters so old signed images cannot come back. Verify the exact bytes you execute, after they are in memory you control, to close TOCTOU windows. Treat fault injection as a real threat for any device an attacker can physically hold, and prefer chips with hardened verification and glitch detection. And guard the signing keys as the crown jewels they are, because a fused root of trust is forever, in both directions.

None of these controls is exotic on its own. The failures happen at the joins, where a reasonable looking design quietly assumed that a check would run when it did not, or ran on bytes that were no longer there. That is the heart of it. Secure boot does not fail because the cryptography is weak. It fails because an attacker found a way to make the verify step not run, or run on the wrong thing, or be skipped by a chip that was pushed past its limits. The cryptography assumes the check happens. The attacker breaks the assumption, not the math, and testing that assumption, asking whether the verify step truly runs every time on the real payload, is where the real security work lives.

Frequently asked questions

What is the hardware root of trust in secure boot?

It is the part of the chain that an attacker physically cannot rewrite. It is the immutable boot ROM, the first code the CPU runs at reset, etched into the silicon, plus a one time programmable store such as eFuses that holds a hash of the device maker’s public key. The boot ROM uses that fused fingerprint to confirm the verification key is genuine before it checks any signature, so trust starts from a value no one can change after manufacturing.

Why store a key hash in eFuses instead of the full key?

An eFuse is a link the factory blows once, flipping a bit permanently, so the storage is one time programmable and cannot be rolled back. Storing a short SHA-256 or SHA-384 hash of the public key costs far fewer fuses than a full 4096 bit key while still pinning the chip to one trusted key. The complete key lives in cheaper rewritable flash, and the boot ROM rejects it if its hash does not match the fingerprint locked in the fuses. ARM describes these trust anchors in its platform security documentation.

How do attackers bypass secure boot without breaking the cryptography?

They stop the verify step from running or make it lie. Common breaks include a later stage that was never signed, debug ports such as JTAG, SWD, or UART left enabled so the check can be patched out, and fault injection or glitching that nudges the chip into skipping the comparison. There is also TOCTOU, where the code verifies one image and then loads a different one. In each case the signature math is sound and the device is still compromised.

What is an anti rollback counter and why does it matter?

It is a monotonic version number stored in fuses or secure non volatile memory that only ever moves forward. Without it, an attacker can reflash an older firmware version that is still validly signed but has a known flaw, undoing a security patch. With it, the boot code refuses any image whose version is below the stored value, so old signed builds become unbootable. NIST covers this rollback prevention in SP 800-193.