Dependency Confusion Attack Explained

A dependency confusion attack happens when your package manager looks in two places for the same package name, a private registry you control and a public one anyone can publish to, and an attacker plants a package with that exact name on the public side. The resolver sees a higher version number out in public, decides it is newer, and pulls the attacker’s code instead of yours. The install hook then runs on a developer laptop or a build server before anyone reads a line of it.

How a dependency confusion attack actually works

Many companies build internal libraries. Say a fictional company, acme.example, keeps a package called acme-billing-utils in a private registry. Developers add it to a manifest and their package manager fetches it. So far nothing is wrong.

The trouble starts with how the client resolves names. If the same client is also configured to check the public registry, then for any name it cannot find privately, or sometimes for every name, it asks the public registry too. The attacker registers acme-billing-utils on the public registry and gives it version 99.0.0. Your real internal copy is on version 1.4.2. When the resolver compares the two, the public version wins on precedence, and the build pulls the wrong one.

The attacker never breaks into your registry. They wait outside it, publish a higher version of a name you already trust, and let your own resolver hand them the build.

Version precedence is the lever

Package managers treat a higher version as the one you want. That rule is sensible most of the time, since you usually want the newest fix. It turns against you the moment two registries can answer for one name. The attacker does not need to guess your version. They publish something absurd like 99.0.0, and wherever the requested version range allows it, the comparison falls their way.
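The precedence rule can be shown with a small simulation. This is a minimal sketch, not how any real package manager is implemented: the function naive_resolve, the Candidate type, and the two in-memory registries are hypothetical, and version parsing handles only plain MAJOR.MINOR.PATCH strings.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Candidate:
    name: str
    version: tuple  # (major, minor, patch)
    source: str     # which registry offered it


def parse_version(text: str) -> tuple:
    """Parse a plain MAJOR.MINOR.PATCH string; pre-release tags are not handled."""
    major, minor, patch = (int(part) for part in text.split("."))
    return (major, minor, patch)


def naive_resolve(name: str, registries: dict) -> Candidate:
    """Pick the highest version of `name` offered by any registry.

    Every registry is treated as equally trustworthy, so the highest
    number wins no matter who published it. That is the unsafe part.
    """
    candidates = [
        Candidate(name, parse_version(version), source)
        for source, packages in registries.items()
        for version in packages.get(name, [])
    ]
    if not candidates:
        raise LookupError(f"{name} not found in any registry")
    return max(candidates, key=lambda c: c.version)


# Hypothetical registry contents, keyed by registry label.
registries = {
    "private": {"acme-billing-utils": ["1.4.0", "1.4.2"]},
    "public": {"acme-billing-utils": ["99.0.0"]},
}

chosen = naive_resolve("acme-billing-utils", registries)
print(chosen.source, ".".join(map(str, chosen.version)))  # public 99.0.0
```

Removing the "public" entry from the dictionary makes the same call return the private 1.4.2, which is the whole argument for never letting two registries answer for one internal name.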

Install hooks run code, not just copy files

Installing a package is not only a download. Many ecosystems run a script at install time. In the npm world a postinstall script runs automatically. In Python, code in setup.py executes when the package is built or installed from a source distribution. That script runs with the same rights as the person or process doing the install. On a developer laptop that means access to local files, environment variables, and tokens. On a CI build server it can mean access to deploy keys and signing material. The effect is the same as command injection, since an install hook runs arbitrary commands the moment the package lands.
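For defenders, the useful first step is knowing which packages declare an install-time script at all. The sketch below reads a package.json document and reports the npm lifecycle scripts that run during an install. The helper find_install_hooks and the sample manifest are hypothetical, and the script path in the sample is a placeholder with no content behind it.

```python
import json

# npm lifecycle script names that run automatically during `npm install`.
INSTALL_HOOKS = ("preinstall", "install", "postinstall")


def find_install_hooks(manifest_text: str) -> dict:
    """Return the install-time scripts declared in a package.json document."""
    manifest = json.loads(manifest_text)
    scripts = manifest.get("scripts", {})
    return {hook: scripts[hook] for hook in INSTALL_HOOKS if hook in scripts}


# Hypothetical manifest used only to exercise the function.
sample_manifest = """
{
  "name": "acme-billing-utils",
  "version": "99.0.0",
  "scripts": {
    "postinstall": "node ./collect.js",
    "test": "node ./test.js"
  }
}
"""

hooks = find_install_hooks(sample_manifest)
print(hooks)  # {'postinstall': 'node ./collect.js'}
```

The test script is ignored because it only runs when someone asks for it. Only the three hook names listed run as a side effect of installing.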

A tiny illustrative example

Here is the shape of the problem, written for defenders. Nothing below is a working payload. It shows how a manifest and a registry view line up so the wrong package gets chosen.

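# package.json on a developer machine at acme.example
# (an open-ended range; a caret range such as ^1.4.0 would cap at below 2.0.0 and never accept 99.0.0)
{
  "name": "acme-internal-app",
  "dependencies": {
    "acme-billing-utils": ">=1.4.0"
  }
}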

# What the private registry holds
acme-billing-utils  1.4.2   (your real internal package)

# What the attacker publishes to the PUBLIC registry
acme-billing-utils  99.0.0  (same name, much higher version)

# A package can declare an install hook that runs automatically at install time
{
  "name": "acme-billing-utils",
  "version": "99.0.0",
  "scripts": {
    "postinstall": "node ./collect.js"
  }
}

The resolver compares 1.4.2 against 99.0.0, picks the higher one, downloads it from the public registry, and runs postinstall. The collection script in this sketch is left empty on purpose. The point is that arbitrary code ran before any review, on whichever machine did the install.

Where it bites

Two places take the damage first.

CI build servers. These run installs constantly, often with broad permissions and long lived credentials. A build agent that pulls a poisoned package can leak deploy keys, cloud tokens, or source for every project it touches.
Developer laptops. A developer running an install brings the attacker’s code onto a machine that holds SSH keys, cloud sessions, and access to internal services. One install can become a foothold inside the network.

The public research that gave this class its name appeared in 2021, when a researcher published packages under the internal package names of several organisations to the public registries and watched their builds reach out and run the planted code. We avoid restating specific company names or counts, since the point stands without them: the names were real and the technique worked widely.

How to detect it

Audit which names resolve publicly. List every internal package name, then check whether that name returns anything from the public registry. A private name that resolves in public is a name an attacker can claim.
Watch for unexpected high versions. An internal package on 1.4.2 that suddenly offers 99.0.0 from a public source is a clear warning. Diff resolved versions against what your private registry actually serves.
Monitor install hooks. Log when postinstall or setup.py code runs during a build, and flag scripts that reach the network or read credentials. A package that never needed a hook before and now ships one deserves a look.
Inspect lockfiles. Check the resolved source URL for each dependency. If an internal name resolved against the public registry, the lockfile records it.
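The lockfile check above can be automated. The following is a minimal sketch for an npm package-lock.json that uses the "packages" map (lockfileVersion 2 or 3), where each entry records a "resolved" tarball URL. The private host npm.acme.example, the set of internal names, and the helper find_leaked_resolutions are assumptions made for illustration.

```python
import json
from urllib.parse import urlparse

# Hypothetical values: substitute your own registry host and internal names.
PRIVATE_HOST = "npm.acme.example"
INTERNAL_NAMES = {"acme-billing-utils"}


def find_leaked_resolutions(lockfile_text: str) -> list:
    """Return (name, version, host) for internal names resolved off the private host."""
    lock = json.loads(lockfile_text)
    findings = []
    for path, entry in lock.get("packages", {}).items():
        # Keys look like "node_modules/<name>"; the root project uses "".
        name = path.rpartition("node_modules/")[2]
        resolved = entry.get("resolved")
        if name not in INTERNAL_NAMES or not resolved:
            continue
        host = urlparse(resolved).hostname or "unknown"
        if host != PRIVATE_HOST:
            findings.append((name, entry.get("version", "?"), host))
    return findings


# A hand-written lockfile fragment showing the bad case.
sample_lock = json.dumps({
    "lockfileVersion": 3,
    "packages": {
        "": {"name": "acme-internal-app"},
        "node_modules/acme-billing-utils": {
            "version": "99.0.0",
            "resolved": "https://registry.npmjs.org/acme-billing-utils/-/acme-billing-utils-99.0.0.tgz",
        },
    },
})

findings = find_leaked_resolutions(sample_lock)
print(findings)  # [('acme-billing-utils', '99.0.0', 'registry.npmjs.org')]
```

Run as a CI step, a non-empty result would fail the build before the install happens. The same finding also covers the unexpected high version case, since the version is reported alongside the host.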

How to prevent it

Claim your internal names in public. Reserve every internal package name on the public registries with an empty placeholder you control. An attacker cannot publish a name you already hold.
Scope packages to a namespace. Publish internal packages under an organisation scope, for example @acme/billing-utils, and bind that scope to your private registry only. A scoped name will not silently resolve elsewhere.
Pin and lock with integrity hashes. Commit a lockfile, pin exact versions, and verify integrity hashes so a swapped artifact fails the check.
Point the client at one trusted source. Configure a single trusted registry, or set a per scope registry, so the client never falls back to the public registry for internal names.
Disable or vet install scripts. Turn off automatic install scripts where you can, and review the ones you must keep. Many builds run fine with scripts off.
Use an internal mirror or proxy. Route every install through a proxy that serves your private packages first and only fetches vetted public ones, so the resolver never faces an open choice between two registries.
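Two of these controls live in client configuration and can be checked mechanically. The sketch below audits the text of an npm .npmrc file for a scope bound to the private registry and for install scripts being disabled. The functions parse_npmrc and audit_npmrc, the scope @acme, and the registry URL are hypothetical. The sketch reads a single file and does not model how npm merges project, user, and global configuration.

```python
def parse_npmrc(text: str) -> dict:
    """Parse simple key=value lines from an .npmrc, skipping comments and blanks."""
    settings = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith(("#", ";")):
            continue
        key, sep, value = line.partition("=")
        if sep:
            settings[key.strip()] = value.strip()
    return settings


def audit_npmrc(text: str, scope: str, private_registry: str) -> list:
    """Return human-readable findings; an empty list means both checks passed."""
    settings = parse_npmrc(text)
    findings = []
    bound_to = settings.get(f"{scope}:registry", "").rstrip("/")
    if bound_to != private_registry.rstrip("/"):
        findings.append(f"{scope} is not bound to {private_registry}")
    if settings.get("ignore-scripts", "false").lower() != "true":
        findings.append("install scripts are not disabled (ignore-scripts is not true)")
    return findings


# Hypothetical project-level .npmrc contents.
good = """
; internal packages live under the @acme scope
@acme:registry=https://npm.acme.example/
ignore-scripts=true
"""
bad = "registry=https://registry.npmjs.org/\n"

print(audit_npmrc(good, "@acme", "https://npm.acme.example"))  # []
print(audit_npmrc(bad, "@acme", "https://npm.acme.example"))
```

The second call reports both findings, because that file neither binds the scope nor turns off install scripts.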

This bug shares a root cause with the rest of the injection and input family: untrusted material crossing into a place that trusts it. The install hook turns a naming gap into command execution, which is why the fix lives in both resolution config and script policy.

Why this rewards understanding the build, not a payload list

A dependency confusion attack is not found by firing known payloads at a target. It depends on how one organisation configures its registries, which names live only in private, and whether the client can ever fall back to public. You find it by understanding what the build assumes about where a name resolves, then testing whether that assumption holds.

That is the kind of assumption an autonomous researcher that tests how an app is meant to work is built to question. We are early and still building, so we make no promises here. If you want to see how that approach reads, our about page explains it.

Frequently asked questions

What is a dependency confusion attack?

It is a software supply chain attack where a package manager checks both a private registry and a public one for the same package name. An attacker publishes a package with your internal name and a higher version number on the public registry. The resolver treats the higher version as newer and pulls the attacker’s code, whose install hook then runs on your developer machines or build servers.

Why does the attacker’s package get chosen over the real one?

Package managers treat a higher version as the one you want, which is usually correct. The attacker exploits that rule by publishing an absurd version such as 99.0.0 in public while your real internal package sits on a much lower version. When two registries can answer for one name, the public copy wins on version precedence.

How does the malicious code actually run?

Installing a package is not only a download. Many ecosystems run a script at install time, such as an npm postinstall script or code inside a Python setup.py file. That script runs with the same rights as the user or build process doing the install, so it can read tokens, keys, and environment variables before anyone reviews the package.

How do I prevent a dependency confusion attack?

Claim your internal package names on the public registries so an attacker cannot register them, scope packages to an organisation namespace bound to your private registry, pin versions and verify integrity hashes in a committed lockfile, point the client at a single trusted registry or per scope registries, disable or vet install scripts, and route installs through an internal mirror. The CWE entry on reliance on an insufficiently trustworthy component (CWE-1357) covers the wider class at https://cwe.mitre.org/data/definitions/1357.html.