In October 2023 Google, Cloudflare, and AWS disclosed a denial of service technique that broke every prior request rate record at once, and they all named the same root cause: a feature of HTTP/2 working exactly as designed. The attack is called HTTP/2 Rapid Reset, tracked as CVE-2023-44487, and the clever part is that it never breaks a single protocol rule. It opens a request, lets the server start the expensive work, then cancels it a moment later, faster than the server’s own limits were meant to allow. This post walks the mechanism at the frame level.
Why HTTP/2 multiplexes in the first place
HTTP/1.1 handles one request at a time on a connection. Pipelining exists, but responses must still come back in order, so a slow response at the front holds up everything queued behind it. That is head of line blocking, and it is why fetching many things at once meant opening several separate TCP connections side by side.
HTTP/2 fixed that by putting many independent streams inside one TCP connection. Each stream is one request and response, identified by a number, and the bytes of all streams are chopped into frames that interleave on the wire. A HEADERS frame starts a stream and carries the method, path, and headers. DATA frames carry the body. Because every frame is tagged with its stream id, the server can work on streams 1, 3, and 5 at once and return their responses in whatever order they finish. RFC 9113 defines all of this.
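To make the framing concrete, here is a minimal sketch of the fixed 9 byte header that RFC 9113 puts in front of every frame: a 24 bit payload length, an 8 bit type, an 8 bit flags field, and a reserved bit followed by the 31 bit stream id. The function names are made up for illustration and are not taken from any particular HTTP/2 library.

```python
import struct
from typing import Tuple

# Frame type codes from RFC 9113.
FRAME_DATA = 0x0
FRAME_HEADERS = 0x1


def pack_frame_header(length: int, frame_type: int, flags: int, stream_id: int) -> bytes:
    """Encode the 9-byte header that precedes every HTTP/2 frame (illustrative helper)."""
    if not 0 <= length < 2 ** 24:
        raise ValueError("length must fit in 24 bits")
    if not 0 <= stream_id < 2 ** 31:
        raise ValueError("stream id must fit in 31 bits")
    # 24-bit length, then type (1 byte), flags (1 byte), reserved bit + stream id (4 bytes).
    return length.to_bytes(3, "big") + struct.pack(">BBI", frame_type, flags, stream_id)


def unpack_frame_header(header: bytes) -> Tuple[int, int, int, int]:
    """Decode a 9-byte frame header into (length, type, flags, stream_id)."""
    length = int.from_bytes(header[:3], "big")
    frame_type, flags, stream_id = struct.unpack(">BBI", header[3:9])
    return length, frame_type, flags, stream_id & 0x7FFFFFFF  # mask off the reserved bit


# A HEADERS frame header for stream 1 with a 16-byte payload and END_HEADERS (0x4) set.
header = pack_frame_header(16, FRAME_HEADERS, 0x4, 1)
```

Because the stream id travels in every header, a receiver can route each frame to its stream without caring what order the frames arrive in.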
The limit that was supposed to keep this safe
If a client could open unlimited streams on one connection, a server would drown in concurrent work. So HTTP/2 has a setting called SETTINGS_MAX_CONCURRENT_STREAMS. The server advertises a number, commonly around 100, and the client may not have more than that many streams open at once. Only streams in the open or half closed state count against the limit, so a client wanting more parallelism waits for a current stream to finish first. That ceiling is the safety valve the whole design leans on.
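The counting rule can be sketched in a few lines. This is a simplified model, assuming a server that tracks each stream's state in a dictionary; the `StreamState` enum and helper names are illustrative, not a real library's API.

```python
from enum import Enum
from typing import Dict


class StreamState(Enum):
    IDLE = "idle"
    OPEN = "open"
    HALF_CLOSED_LOCAL = "half-closed (local)"
    HALF_CLOSED_REMOTE = "half-closed (remote)"
    CLOSED = "closed"


# Only these states count toward SETTINGS_MAX_CONCURRENT_STREAMS.
COUNTED_STATES = {
    StreamState.OPEN,
    StreamState.HALF_CLOSED_LOCAL,
    StreamState.HALF_CLOSED_REMOTE,
}


def active_streams(streams: Dict[int, StreamState]) -> int:
    """Number of streams that currently occupy a concurrency slot."""
    return sum(1 for state in streams.values() if state in COUNTED_STATES)


def may_open_stream(streams: Dict[int, StreamState], max_concurrent: int) -> bool:
    """True if one more stream fits under the advertised limit."""
    return active_streams(streams) < max_concurrent


streams = {
    1: StreamState.OPEN,
    3: StreamState.HALF_CLOSED_REMOTE,
    5: StreamState.CLOSED,  # finished or reset: no longer counted
}
```

Note that a closed stream drops out of the count immediately, however it came to be closed.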
The HTTP/2 Rapid Reset trick, frame by frame
Now the cancel. HTTP/2 lets either side abandon a stream instantly by sending a RST_STREAM frame for that stream id. It is normal and useful: a browser navigates away, a fetch gets aborted, so the client tells the server to stop wasting effort. The frame is tiny, and the stream is closed the moment it is sent.
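How tiny is tiny? A sketch of the wire format, assuming the RFC 9113 layout: RST_STREAM is frame type 0x3, its payload is a single 4 byte error code, and CANCEL is error code 0x8. With the 9 byte frame header that comes to 13 bytes. The helper names here are illustrative only.

```python
import struct
from typing import Tuple

FRAME_RST_STREAM = 0x3  # frame type code from RFC 9113
ERROR_CANCEL = 0x8      # CANCEL error code from RFC 9113


def build_rst_stream(stream_id: int, error_code: int = ERROR_CANCEL) -> bytes:
    """Encode a complete RST_STREAM frame: 9-byte header plus 4-byte error code."""
    header = (4).to_bytes(3, "big") + struct.pack(">BBI", FRAME_RST_STREAM, 0, stream_id)
    return header + struct.pack(">I", error_code)


def parse_rst_stream(frame: bytes) -> Tuple[int, int]:
    """Decode an RST_STREAM frame into (stream_id, error_code)."""
    length = int.from_bytes(frame[:3], "big")
    frame_type, _flags, stream_id = struct.unpack(">BBI", frame[3:9])
    if frame_type != FRAME_RST_STREAM or length != 4:
        raise ValueError("not a well-formed RST_STREAM frame")
    (error_code,) = struct.unpack(">I", frame[9:13])
    return stream_id & 0x7FFFFFFF, error_code


frame = build_rst_stream(1)
```

Thirteen bytes is all it takes to tell the peer that a stream is finished.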
Here is the detail the attack hangs on. When a client sends RST_STREAM, that stream immediately stops counting against MAX_CONCURRENT_STREAMS. The slot frees at once. So the client can open a stream and cancel it in the same breath, and the canceled stream never occupied a slot long enough to matter. The concurrency limit was meant to bound how much work a client can have in flight at once. It says nothing about how fast streams are created and discarded, so the reset slips straight past it:
    HEADERS     stream 1   (GET /expensive-search?q=...)
    RST_STREAM  stream 1   error_code = CANCEL
    HEADERS     stream 3   (GET /expensive-search?q=...)
    RST_STREAM  stream 3   error_code = CANCEL
    ... (repeat in a tight loop)
Watch what the two ends do. On the server, that one HEADERS frame is enough to begin a request. The server parses it, allocates stream state, runs middleware, maybe proxies the whole thing to an upstream back end that now starts its own work. A moment later the RST_STREAM arrives. The stream is marked canceled, but the request it kicked off is already moving through the stack, and the upstream may keep grinding for an answer nobody will read. The client paid almost nothing, two small frames. The server, and everything behind it, paid the full price of a request.
The concurrency limit counts streams that are open. The attacker’s streams are never open long enough to be counted, yet each one still sets the full cost of a request in motion.
Because the cancel frees the slot instantly, the client is not capped at 100 requests in flight. It is capped only by how fast it can write frame pairs onto the connection, far faster than the server can finish the work each pair triggers. One connection becomes a firehose of back end requests, inside the rules the entire time.
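The accounting gap is easy to see in a toy model. The sketch below is an in-memory simulation under simplifying assumptions, not real server code: the server starts one unit of work as soon as it accepts a HEADERS event, and a slot frees the moment a RST_STREAM event is processed. All names are made up for the illustration.

```python
from typing import Iterable, Iterator, Tuple

Event = Tuple[str, int]  # (frame kind, stream id)


def simulate(events: Iterable[Event], max_concurrent: int = 100) -> Tuple[int, int, int]:
    """Return (work_started, peak_open_streams, refused_streams) for an event sequence."""
    open_streams = set()
    started = peak = refused = 0
    for kind, stream_id in events:
        if kind == "HEADERS":
            if len(open_streams) >= max_concurrent:
                refused += 1  # the limit only bites when slots are actually full
                continue
            open_streams.add(stream_id)
            started += 1      # work begins as soon as HEADERS is accepted
            peak = max(peak, len(open_streams))
        elif kind == "RST_STREAM":
            open_streams.discard(stream_id)  # slot frees at once
    return started, peak, refused


def open_then_reset(n: int) -> Iterator[Event]:
    """Client that cancels every stream right after opening it."""
    for i in range(n):
        stream_id = 2 * i + 1  # client-initiated streams use odd ids
        yield ("HEADERS", stream_id)
        yield ("RST_STREAM", stream_id)


def open_only(n: int) -> Iterator[Event]:
    """Client that opens streams and never cancels (no responses complete in this model)."""
    for i in range(n):
        yield ("HEADERS", 2 * i + 1)


print(simulate(open_then_reset(10_000)))  # (10000, 1, 0)
print(simulate(open_only(10_000)))        # (100, 100, 9900)
```

The client that never cancels is stopped at 100 units of work. The client that cancels everything starts 10,000 while the server never sees more than one stream open.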
Why HTTP/1.1 could not do this
The same idea does not work on HTTP/1.1, and the reason is structural. HTTP/1.1 has no cheap in protocol cancel. The only way to abandon a request mid flight is to tear down the whole TCP connection, which costs the attacker a full handshake before the next request. And without multiplexing, one connection processes one request at a time, so head of line blocking stops an attacker from stacking pending requests onto it. To flood a server over HTTP/1.1 you need a flood of connections, each one visible, countable, and rate limitable at the network layer. HTTP/2 collapsed all of that onto a single connection and handed the client a free, instant cancel. That is what made the requests per second numbers explode.
The records back this up. AWS reported peaks around 155 million requests per second. Cloudflare measured 201 million, nearly triple its previous record. Google absorbed 398 million, the largest it had ever seen. The striking part is the source: Cloudflare noted the attack came from a botnet of only about 20,000 machines. The amplification was in how many requests each connection could conjure before the server could push back, not in the number of attackers. It is a cousin of algorithmic resource attacks like a hash flooding attack, and a relative of the protocol level desyncs behind HTTP request smuggling.
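A quick back of the envelope calculation shows the scale. It uses Cloudflare's two figures from above and assumes the load was spread evenly across the botnet. The comparison ceiling is purely hypothetical: it assumes a well behaved client held to 100 concurrent streams and a 50 ms average response time, a number chosen only for illustration.

```python
CLOUDFLARE_PEAK_RPS = 201_000_000  # peak reported by Cloudflare
BOTNET_MACHINES = 20_000           # approximate botnet size reported by Cloudflare

# Average request rate per machine, assuming an even spread.
rps_per_machine = CLOUDFLARE_PEAK_RPS / BOTNET_MACHINES


def concurrency_bound_rps(max_concurrent: int, response_time_ms: int) -> float:
    """Throughput ceiling for one connection that must wait for responses.

    With at most `max_concurrent` requests in flight and each taking
    `response_time_ms`, the connection completes at most this many per second.
    """
    return max_concurrent * 1000 / response_time_ms


# Hypothetical honest client: 100 streams, 50 ms per response (assumed values).
honest_ceiling = concurrency_bound_rps(100, 50)

print(rps_per_machine)  # 10050.0
print(honest_ceiling)   # 2000.0
```

Under those assumptions each attacking machine averaged roughly five times what a single limit-respecting connection could push, and it did so without waiting for any response.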
Mitigations that hold
There is no malicious payload to filter; every frame is valid. The defenses are about accounting: watch the resets, and stop doing free work for a client that abuses the cancel.
- Count and rate limit resets per connection. Track how many RST_STREAM frames a connection sends. A healthy client cancels occasionally; one that cancels almost everything it opens is running the attack. Set the threshold strictly, since a loose limit lets a flood through before it trips.
- Close connections that cross the reset threshold. Once a connection’s cancel rate looks abusive, send GOAWAY and tear it down. Forcing the attacker back to a TCP handshake restores the cost HTTP/2 had removed and pushes the fight to the network layer, where connection floods are an old, handled problem.
- Cap total streams per connection lifetime. Limit how many streams a single connection may ever create, not just how many run at once. The attack depends on recycling one connection through endless streams, so a lifetime cap bounds the damage any one connection can do.
- Cancel the upstream work too. Make sure a canceled stream actually aborts the upstream request and frees the query behind it, so a reset does not leave orphaned work running for nobody.
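The first three items can be sketched as one small per connection tracker. This is a minimal illustration, not a drop-in for any real server: the class name, the thresholds (a reset ratio of 0.5, a 20 stream warm-up, a lifetime cap of 1,000), and the string verdicts are all assumptions chosen for the example, and real deployments would tune them.

```python
from dataclasses import dataclass


@dataclass
class ConnectionGuard:
    """Per-connection accounting for resets and lifetime stream count (illustrative)."""
    max_reset_ratio: float = 0.5      # hypothetical: resets / opened above this is abusive
    min_streams: int = 20             # hypothetical: do not judge before this many streams
    max_lifetime_streams: int = 1000  # hypothetical: cap on streams ever opened
    opened: int = 0
    client_resets: int = 0

    def on_headers(self) -> str:
        """Call when the client opens a stream. Returns 'OK' or 'GOAWAY'."""
        self.opened += 1
        if self.opened > self.max_lifetime_streams:
            return "GOAWAY"  # lifetime cap reached: retire the connection
        return "OK"

    def on_client_rst_stream(self) -> str:
        """Call when the client resets a stream. Returns 'OK' or 'GOAWAY'."""
        self.client_resets += 1
        enough_data = self.opened >= self.min_streams
        if enough_data and self.client_resets / self.opened > self.max_reset_ratio:
            return "GOAWAY"  # cancels almost everything it opens
        return "OK"


# A connection that resets every stream trips the guard on its 20th reset.
guard = ConnectionGuard()
verdicts = []
for _ in range(20):
    guard.on_headers()
    verdicts.append(guard.on_client_rst_stream())
print(verdicts.count("OK"), verdicts[-1])  # 19 GOAWAY
```

The warm-up count is a trade-off: it avoids punishing a client that cancels its first request, at the cost of letting a short burst through before the ratio is evaluated.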
The assumption that broke
The cancel was not a bug. It was an efficiency feature, added so clients could stop paying for work they no longer wanted, and the attack ran it in the other direction to make the server pay instead. A feature built to reduce waste became an amplifier on demand. HTTP/2 assumed a canceled stream costs nothing, because the client asked to stop. That holds when clients are honest. It falls apart in bad faith, because the cost of a request is set when the HEADERS frame lands, not when the response is read, and the cancel arrives too late to call it back. The concurrency limit guarded the wrong moment: it counted what was open, not what had already been started, and that gap is the whole attack. That kind of flaw, a feature whose safety rests on an unstated good faith assumption, is exactly what an autonomous researcher built to test an application’s assumptions and prove findings with evidence is meant to surface.
Frequently asked questions
What is the HTTP/2 Rapid Reset attack?
It is a denial of service technique tracked as CVE-2023-44487, disclosed in October 2023 by Google, Cloudflare, and AWS. The attacker opens an HTTP/2 stream with a HEADERS frame so the server starts work, then immediately cancels it with a RST_STREAM frame, in a tight loop. Because a canceled stream stops counting against the connection’s concurrency limit at once, one connection can trigger far more backend requests than the limit was meant to allow.
Why could HTTP/1.1 not be used for this attack?
HTTP/1.1 has no cheap in protocol cancel. The only way to abandon a request is to close the whole TCP connection, which forces a new handshake before the next request. It also lacks multiplexing, so one connection handles one request at a time and head of line blocking stops an attacker from stacking many pending requests on it. HTTP/2 removed both limits by putting many streams on one connection and adding an instant cancel frame.
How big were the record HTTP/2 Rapid Reset attacks?
AWS reported peaks near 155 million requests per second, Cloudflare measured about 201 million, and Google absorbed roughly 398 million requests per second, the largest it had recorded. Cloudflare noted the traffic came from a botnet of only about 20,000 machines. The amplification came from how many requests each connection could generate, not from the number of attacking machines.
How do you mitigate the HTTP/2 Rapid Reset attack?
Count RST_STREAM frames per connection and rate limit them with a strict threshold, then close any connection that crosses it by sending GOAWAY. Cap the total number of streams a single connection may create over its whole lifetime, not just how many run at once. Make sure a canceled stream actually cancels the upstream work so a reset does not leave orphaned requests running.
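The last point, making a reset actually stop the upstream work, can be sketched with asyncio task cancellation. This is a toy model under stated assumptions: `upstream_query` stands in for an expensive back end call, `handle_stream` stands in for a stream handler that receives a reset right after starting work, and neither is taken from a real framework.

```python
import asyncio
from typing import List


async def upstream_query(log: List[str]) -> None:
    """Stand-in for an expensive upstream request (hypothetical)."""
    try:
        await asyncio.sleep(10)  # pretend to wait on a slow back end
        log.append("finished")
    except asyncio.CancelledError:
        log.append("aborted")    # release connections, queries, etc. here
        raise                    # re-raise so the task is marked cancelled


async def handle_stream(log: List[str]) -> bool:
    """Start upstream work, then simulate an RST_STREAM arriving right away."""
    task = asyncio.create_task(upstream_query(log))
    await asyncio.sleep(0)       # yield once so the upstream work actually starts
    task.cancel()                # what the reset handler should do
    try:
        await task
    except asyncio.CancelledError:
        pass                     # expected: the upstream task was cancelled
    return task.cancelled()


log: List[str] = []
was_cancelled = asyncio.run(handle_stream(log))
print(was_cancelled, log)  # True ['aborted']
```

The point of the pattern is that cancellation propagates: the handler does not just mark the stream closed, it interrupts the work the stream started so nothing keeps running for a response nobody will read.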
