Category: Injection and Input

Injection and input handling flaws: XSS, SQL injection, and related classes.

  • What is command injection? Examples explained

    Command injection is one of the oldest and most dangerous web bugs, and it is also one of the easiest to understand once you see it in action. It happens when an app takes input from a user, drops that input into a system command, and runs the whole thing in a shell. If the app trusts the input too much, the user can append their own commands and make the server run them.

    What command injection means

    The short version is this: your app wanted to run one command, but the attacker tricked it into running two. The first is the command you intended. The second is whatever the attacker tacked on. The shell happily runs both because, to the shell, it is just text.

    The root cause is mixing two things that should stay apart: data (the value a user typed) and code (the command the server runs). When user data flows straight into a command string, the data can change what command runs. That is the whole bug in one sentence.

    The app meant to run one command. The attacker made it run two. The shell cannot tell your intent from their input, so it runs both.

    A simple command injection example

    Let us invent a small app called Acme Netcheck. It is a network tool with one feature: you give it a hostname, and it pings that host so you can see if the host is reachable. The form has one field named host, and the backend runs a ping for you.

    Here is the kind of code that causes the problem. This is written to show the mistake, not to copy:

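    # DANGEROUS: user input goes straight into a shell command
    import os

    def netcheck(request):
        host = request.form["host"]
        command = "ping -c 1 " + host      # attacker-controlled text joins the command
        output = os.popen(command).read()  # os.popen hands the whole string to a shell
        return output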
    

    If a normal user types example.com, the server builds and runs this:

    ping -c 1 example.com
    

    That works as intended. The trouble starts when someone types something that is not just a hostname. On a typical shell, a semicolon ends one command and starts another. So an attacker types this into the same field:

    example.com; whoami
    

    Now the server builds and runs this:

    ping -c 1 example.com; whoami
    

    The shell runs the ping, then runs whoami, and the app returns the output of both. The attacker just learned which user the web server runs as. They did not break into anything clever. They only added a semicolon and a second command to a field that was supposed to hold a hostname.

    Other command injection examples that work the same way

    The semicolon is one of several shell characters that chain or redirect commands. These all let an attacker smuggle a second command into a single input field:

    • example.com && whoami runs whoami only if the ping succeeds.
    • example.com | whoami pipes the output of ping into whoami, which ignores that input and runs anyway.
    • $(whoami) or `whoami` runs the inner command and pastes its result back in.

    These are command injection examples you will see again and again because the cause is identical every time: input was treated as part of a command instead of as plain text.
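
    The payloads above can be tried safely with a small sketch. It assumes a POSIX system with /bin/sh, and it swaps ping for the harmless echo and whoami for echo INJECTED. The run_vulnerable helper is a made-up name for illustration, not code from Acme Netcheck.

```python
import subprocess

def run_vulnerable(host: str) -> str:
    """Build the command by concatenation and run it through a shell (the bug)."""
    command = "echo checking " + host
    result = subprocess.run(command, shell=True, capture_output=True, text=True)
    return result.stdout

# Stand-ins for the payloads in the list above, with whoami replaced by echo.
payloads = [
    "example.com; echo INJECTED",
    "example.com && echo INJECTED",
    "example.com | echo INJECTED",
    "$(echo INJECTED)",
]

for payload in payloads:
    output = run_vulnerable(payload)
    # If the shell had treated the payload as plain text, the words
    # "echo INJECTED" would appear in the output. They never do, because
    # the shell executed them as a second command instead.
    print(repr(payload), "->", repr(output))
```

    In every case the output contains INJECTED on its own, which shows the shell ran the attacker's part as a command rather than printing it.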

    Why command injection is so serious

    With SQL injection, an attacker reaches your database. With command injection, the attacker reaches the operating system itself, running as whatever user your app runs as. That is a wider blast radius. Once they can run shell commands on your server, they can:

    • Read files the app can read, including config files and secrets like API keys and database passwords.
    • Reach other machines on the internal network that the server can talk to but you cannot reach from outside.
    • Install a backdoor or a reverse shell so they can come back later.

    A field meant to hold a hostname turned into full control of a server. That is why injection has a place in the OWASP Top 10 and OS command injection (CWE-78) is a regular entry in the CWE Top 25.

    How to fix command injection

    The strongest fix is to stop building shell command strings out of user input. Most of the time you do not need a shell at all.

    Do not shell out when an API exists

    If you only need to read a file, use the file API in your language. If you need to make an HTTP request, use an HTTP library. Reaching for a shell command to do a job your language already does is the start of most of these bugs. No shell means no shell injection.
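
    As a rough sketch of that idea, the snippet below reads a file through Python's pathlib instead of shelling out to cat. The read_note helper and the file name are invented for illustration. The name deliberately contains a semicolon to show that the file API treats it as plain data.

```python
import tempfile
from pathlib import Path

def read_note(path: Path) -> str:
    """Read a file through the file API. No shell is involved at any point."""
    return path.read_text(encoding="utf-8")

# A file name that would be dangerous inside a shell string such as "cat " + name.
with tempfile.TemporaryDirectory() as tmp:
    note = Path(tmp) / "notes; whoami.txt"
    note.write_text("hello from the file API\n", encoding="utf-8")
    content = read_note(note)

print(content)
```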

    If you must run a program, pass arguments as a list

    When you genuinely need to run an external program, call it directly and pass each argument as a separate list item instead of as one big string. Most languages support this. In Python it looks like this:

    # Safer: no shell, arguments passed as a list
    import subprocess
    host = request.form["host"]
    output = subprocess.run(
        ["ping", "-c", "1", host],
        capture_output=True, text=True
    ).stdout
    

    Here host is handed to ping as a single argument. There is no shell to interpret the semicolon, so example.com; whoami is passed to ping as one odd hostname, which fails to resolve. The second command never runs.
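
    The difference can be seen side by side in a minimal sketch. It assumes a POSIX system where echo is available both as a shell builtin and as a program, and it uses echo as a harmless stand-in for ping.

```python
import subprocess

payload = "example.com; echo INJECTED"

# Vulnerable: one string, interpreted by a shell.
via_shell = subprocess.run(
    "echo checking " + payload, shell=True, capture_output=True, text=True
).stdout

# Safer: no shell, each argument is a separate list item.
via_list = subprocess.run(
    ["echo", "checking", payload], capture_output=True, text=True
).stdout

print(repr(via_shell))  # the second echo ran as its own command
print(repr(via_list))   # the payload came back as literal text
```

    With the list form the semicolon is just a character inside one argument, so nothing after it is ever executed.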

    Validate input with an allowlist

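    Defense in depth helps too. Decide exactly what valid input looks like and reject everything else. For a hostname, you can allow only letters, digits, dots, and hyphens, require the first character to be a letter or digit so the value can never be read as a command-line option such as -f, and reject anything else before the value goes near a command:

    import re
    host = request.form["host"]
    # First character must be a letter or digit; the rest may add dots and hyphens
    if not re.fullmatch(r"[A-Za-z0-9][A-Za-z0-9.-]*", host):
        return "Invalid host", 400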
    

    An allowlist describes what you accept. A blocklist tries to list every bad character and always misses some. Prefer the allowlist.
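
    One way to package the allowlist is a small helper that can be tested on its own. This is a sketch under two assumptions that go beyond the text above: the value must start with a letter or digit, so it cannot be mistaken for a command-line option, and it is capped at 253 characters, the maximum length of a DNS name. The names HOST_PATTERN and is_valid_host are hypothetical.

```python
import re

# Must start with a letter or digit so the value can never look like an option.
# The {0,252} cap keeps the total length at or below 253 characters.
HOST_PATTERN = re.compile(r"[A-Za-z0-9][A-Za-z0-9.-]{0,252}")

def is_valid_host(host: str) -> bool:
    """Return True only for values made of letters, digits, dots, and hyphens."""
    return HOST_PATTERN.fullmatch(host) is not None

for candidate in ["example.com", "example.com; whoami", "$(whoami)", "-f", ""]:
    print(repr(candidate), is_valid_host(candidate))
```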

    Lower the impact when things go wrong

    Run the app as a low privilege user, not as root. Limit what that user can read and which machines it can reach. None of this fixes the bug, but it shrinks the damage if one slips through. You can read more patterns like this in our guide to injection and input bugs.

    How to spot it in your own code

    Search your codebase for the places where commands get run. Look for os.system, os.popen, subprocess calls with shell=True, backticks, exec, and eval. For each one, ask a single question: does any part of this command come from a request, a form, a URL, a header, or a file an outside user can influence? If yes, treat it as suspect and fix it with the steps above.
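
    For a Python codebase, that search can be roughed out with the standard ast module. The sketch below is a starting point, not a complete scanner: it only sees calls written with their full dotted name, only catches a literal shell=True, and the function names are made up for this example.

```python
import ast

RISKY_CALLS = {"os.system", "os.popen", "eval", "exec"}

def dotted_name(node: ast.AST) -> str:
    """Return 'os.system' for an attribute chain, or '' if it is not a plain name."""
    parts = []
    while isinstance(node, ast.Attribute):
        parts.append(node.attr)
        node = node.value
    if isinstance(node, ast.Name):
        parts.append(node.id)
        return ".".join(reversed(parts))
    return ""

def find_risky_calls(source: str) -> list:
    """Return sorted (line number, call name) pairs worth a manual review."""
    findings = []
    for node in ast.walk(ast.parse(source)):
        if not isinstance(node, ast.Call):
            continue
        name = dotted_name(node.func)
        shell_true = any(
            kw.arg == "shell"
            and isinstance(kw.value, ast.Constant)
            and kw.value.value is True
            for kw in node.keywords
        )
        if name in RISKY_CALLS or (name.startswith("subprocess.") and shell_true):
            findings.append((node.lineno, name))
    return sorted(findings)

sample = '''import os, subprocess
os.system("ls " + path)
subprocess.run(["ls", path])
subprocess.run("ls " + path, shell=True)
'''
print(find_risky_calls(sample))
```

    Each hit is only a place to look. The question from the paragraph above still has to be answered by reading the code: can an outside user influence any part of that command?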

    Command injection survives because the dangerous code reads as harmless. Joining a string and running it looks fine in review. The bug only shows when someone tries the input you did not expect. This is exactly the kind of assumption an autonomous researcher that tests how an app really behaves is built to find. To see how we think about bugs like this, read more about UnboundCompute.

  • What is SQL injection and how does it work?

    SQL injection is one of the oldest bugs on the web, and it still shows up in real applications today. At its core, SQL injection happens when an application builds a database query by gluing user input directly into the query text, so an attacker can send input that changes what the query means. This post explains what SQL injection is and how it works, with a small login example you can read in a minute.

    What is SQL injection in plain terms

    A web app talks to its database using SQL, a language for asking questions like “find the user with this email.” When the app writes that question, it often needs to drop in a value the user typed, like an email or a password. The safe way is to keep that value as data. The unsafe way is to paste it straight into the query string. When the app pastes it in, the user controls part of the query, not just part of the answer.

    Here is the key idea. The database cannot tell the difference between the query the developer meant to write and the extra query syntax the attacker typed. It just runs whatever text it receives. So if the input contains quotes, operators, or SQL keywords, those become part of the command.

    If user input can change the structure of a query instead of just the values inside it, the user is writing your SQL for you.

    How does SQL injection work in a login query

    Imagine a login form on an invented app called Acme Notes. The server takes the email and password and builds a query by string concatenation. In a backend language this might look like the following.

    query = "SELECT id FROM users WHERE email = '" + email + "' AND password = '" + password + "'"

    If a normal user types alice@example.com and hunter2, the final query is exactly what the developer expected.

    SELECT id FROM users WHERE email = 'alice@example.com' AND password = 'hunter2'

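    Now look at what happens when an attacker types ' OR '1'='1' -- into the email field (with a space after the two dashes, which some databases such as MySQL require) and fills the password with anything. The concatenation produces this.

    SELECT id FROM users WHERE email = '' OR '1'='1' -- ' AND password = 'anything'

    The attacker’s quote closed the email string early, the added OR '1'='1' is a condition that is always true, and the -- starts a SQL comment that discards the rest of the line, including the password check. The comment matters: AND binds tighter than OR, so without it the password condition would still be attached to the always-true branch and the bypass would usually fail. The query no longer asks “is this the right email and password.” It asks something the developer never wrote, and it now matches every row in users. Depending on how the app handles the rows that come back, this typically logs the attacker in as the first user returned, without knowing any real credentials. The same trick, with different syntax, can read data the attacker should never see.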

    Why the quote matters

    The single quote is the turning point. Inside the query, a quote marks the start and end of a text value. When user input is allowed to contain its own quote, it can break out of the value and into the command. Everything after the breakout is treated as SQL, not as data. That is the whole mechanism in one sentence.
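
    The breakout can be reproduced with Python's built-in sqlite3 module and an in-memory database. This is a sketch for illustration only. The table layout and the vulnerable_login name are assumptions, the password is stored in plaintext purely to mirror the concatenated query above, and the injected input ends with a SQL comment (two dashes and a space) so the password check is cut off.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, email TEXT, password TEXT)")
conn.execute("INSERT INTO users (email, password) VALUES ('alice@example.com', 'hunter2')")

def vulnerable_login(email: str, password: str):
    """Build the query by concatenation, exactly the pattern to avoid."""
    query = (
        "SELECT id FROM users WHERE email = '" + email
        + "' AND password = '" + password + "'"
    )
    return conn.execute(query).fetchone()

print(vulnerable_login("alice@example.com", "hunter2"))  # real credentials
print(vulnerable_login("alice@example.com", "wrong"))    # wrong password
print(vulnerable_login("' OR '1'='1' -- ", "anything"))  # injected input
```

    The first and third calls both return Alice's row, even though the third one supplied no valid email or password.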

    What is the purpose of an SQL injection and what can an attacker do

    The purpose of an SQL injection, from the attacker’s side, is to make the database run commands the application never intended. Once they can shape the query, the range of damage is wide.

    • Bypass login, as shown above, by forcing a condition to be true.
    • Read other people’s data, like dumping every row in the users table or pulling password hashes, order history, or private notes.
    • Change or delete data, by injecting an UPDATE or DELETE when the query allows it.
    • Probe blindly, where the app shows no data but behaves differently for true and false conditions, so the attacker reads the database one yes or no answer at a time.
    • Reach further in, since on some setups a database account has enough rights to read files or run system commands.

    The common thread is trust. The app trusted that the email field held an email. The attacker proved that assumption wrong.

    Why SQL injection still happens

    This bug class has been understood for over twenty years, so it is fair to ask why it keeps appearing. A few honest reasons.

    • String building feels natural. Concatenating a query reads like normal code, and it works in testing because testers type ordinary input.
    • It hides in corners. The main login form might be safe while a search filter, an export feature, or an admin report still pastes input into a query.
    • ORMs are not a free pass. Many query builders are safe by default, but most also offer a raw query escape hatch, and that is where the bug sneaks back in.
    • Inputs you forget about. Headers, cookies, and JSON fields all reach the database too, not just visible form boxes.

    How to fix and prevent it

    The fix is direct, and it is the same idea every time. Keep user input as data, never as query structure. The standard tool for that is parameterized queries, also called prepared statements.

    Use parameterized queries

    With a parameterized query, you write the SQL once with placeholders, then pass the values separately. The database treats those values as pure data, so a quote in the input is just a quote, not a command. Here is the same login, done safely.

    query = "SELECT id FROM users WHERE email = ? AND password_hash = ?"
    db.execute(query, [email, password_hash])

    Now if someone sends a payload like ' OR '1'='1, the database looks for a user whose email is literally that string. It finds none, and the login fails as it should. The attacker lost the ability to change the query’s shape.
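
    Here is a runnable sketch of the parameterized version, again using Python's sqlite3 with an in-memory database. The schema, the placeholder hash value, and the safe_login name are assumptions made for the example. The ? placeholder style is the one sqlite3 uses; other drivers may use a different marker.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, email TEXT, password_hash TEXT)")
conn.execute(
    "INSERT INTO users (email, password_hash) VALUES (?, ?)",
    ("alice@example.com", "placeholder-hash"),
)

def safe_login(email: str, password_hash: str):
    """The SQL text is fixed; the values travel separately as parameters."""
    query = "SELECT id FROM users WHERE email = ? AND password_hash = ?"
    return conn.execute(query, (email, password_hash)).fetchone()

print(safe_login("alice@example.com", "placeholder-hash"))  # matching row
print(safe_login("' OR '1'='1' -- ", "anything"))           # no row
```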

    Back it up with more layers

    • Hash passwords and compare hashes, so a query never holds a raw password to begin with.
    • Validate input against what you expect, such as an email format, to reject obvious junk early. Treat this as a helper, not the main defense.
    • Limit database rights, so the account the app uses cannot drop tables or read files it never needs.
    • Review the raw query paths, since those escape hatches are where injection survives. Search the code for places that build a query from a string.

    If you want more on this family of bugs and how to catch them, the injection and input category collects related explainers.

    How to tell if your app has this bug

    Finding SQL injection is less about throwing payloads and more about understanding which inputs reach a query and what the app assumes about them. A scanner can flag the obvious cases. The harder ones live in the assumptions, like a report filter that quietly trusts a sort parameter, or a search field that an ORM passes through as raw SQL. Those need someone, or something, that reads how the app is meant to work and then tests where that logic could break.
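
    The sort parameter case deserves a sketch of its own, because placeholders only stand in for values and cannot stand in for a column name. A common approach is to check the requested column against a fixed allowlist before it goes into the query text. The table, the column names, and the list_notes helper below are hypothetical.

```python
import sqlite3

ALLOWED_SORT_COLUMNS = {"title", "created_at"}  # hypothetical column names

def list_notes(conn: sqlite3.Connection, sort: str) -> list:
    """Values go through placeholders; identifiers go through an allowlist."""
    if sort not in ALLOWED_SORT_COLUMNS:
        raise ValueError("unsupported sort column")
    # Safe to interpolate: sort is now one of our own hard-coded strings.
    query = f"SELECT title FROM notes ORDER BY {sort}"
    return [row[0] for row in conn.execute(query)]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE notes (title TEXT, created_at TEXT)")
conn.executemany(
    "INSERT INTO notes VALUES (?, ?)",
    [("beta", "2024-01-02"), ("alpha", "2024-01-03")],
)

print(list_notes(conn, "title"))
print(list_notes(conn, "created_at"))
```

    A request such as sort=title; DROP TABLE notes is rejected with a ValueError before any SQL is built.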

    SQL injection is a clear example of one trusted assumption, that an input is only data, turning into full control of a query. This is exactly the kind of bug an autonomous researcher that tests an application’s assumptions is built to find and then prove with real evidence. You can read more about that approach on the about page.

  • What is XSS and how does it work? With examples

    If you have ever wondered what XSS is and how it works, the short answer is this: cross site scripting happens when an app takes input from one user and hands it to another user’s browser as code instead of plain text. The browser runs that code with the same trust it gives the real site. That means an attacker can read cookies, change the page, or act as the victim.

    What cross site scripting actually is

    A web page mixes two kinds of content. There is the markup and script the site author wrote, and there is data, like a comment, a search term, or a username. The browser cannot tell them apart on its own. It trusts whatever the server sends. Cross site scripting is the bug where attacker data crosses over and becomes script.

    Here is the core idea. A user types a comment. The app stores it. Later the app prints that comment back into the HTML of a page. If the app prints it raw, and the comment contains a <script> tag, the tag runs in the next reader’s browser. The attacker never touched that reader. The site delivered the payload for them.

    The browser runs attacker text as code because the app never told it where the data ends and the markup begins.

    Why it matters

    Script that runs on the page runs as the logged in user. It can read document.cookie if the session cookie is not protected, submit forms, or quietly change account settings. The attacker does not need the victim’s password. They borrow the victim’s open session.

    The three types of XSS, with simple examples

    People sort cross site scripting into three buckets based on where the bad input lives and how it reaches the browser. The examples below use an invented app called Acme Notes, a small site where people post public notes and comments. None of these target a real system.

    Stored XSS

    Stored XSS means the payload is saved in the database and served to everyone who views the page. It is the worst of the three because one submission can hit many users. This is a clear stored XSS example.

    Imagine the comment box on Acme Notes. A visitor submits this in the comment field:

    <script>alert('xss')</script>

    The app saves the text as is. When the note page renders, it builds the HTML like this on the server:

    <div class="comment">
      <script>alert('xss')</script>
    </div>

    Now every person who opens that note runs the script. The alert('xss') is harmless on its own. It only pops a box. But a real attacker would swap it for code that reads the session cookie and sends it to a server they control. Same hole, worse payload.

    Reflected XSS

    Reflected XSS means the payload is not stored. It rides in the request, usually in a URL, and the server reflects it straight back into the response. The victim has to open a crafted link. This is a plain reflected XSS example.

    Say Acme Notes has a search page that shows what you searched for:

    https://acme-notes.example/search?q=hello

    The page prints: You searched for: hello. If the app prints the q value raw, an attacker can build a link where q is a script:

    https://acme-notes.example/search?q=<script>alert('xss')</script>

    Anyone who clicks that link runs the script in their own browser, on the real Acme Notes domain. The attacker sends the link by email or chat. The bug is on the page, but the trigger is the click.
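
    The server side of that search page might look roughly like the sketch below. It is written in Python with only the standard library, and render_search_page_raw is a made-up name. It is vulnerable on purpose so the reflected payload is visible in the output.

```python
from urllib.parse import urlsplit, parse_qs

def render_search_page_raw(url: str) -> str:
    """Vulnerable on purpose: echoes the q parameter into HTML with no encoding."""
    params = parse_qs(urlsplit(url).query)
    q = params.get("q", [""])[0]
    return "<p>You searched for: " + q + "</p>"

normal = render_search_page_raw("https://acme-notes.example/search?q=hello")
crafted = render_search_page_raw(
    "https://acme-notes.example/search?q=<script>alert('xss')</script>"
)

print(normal)
print(crafted)  # the script tag arrives in the page intact
```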

    DOM based XSS

    DOM based XSS happens fully in the browser. The server may send clean HTML, but client JavaScript reads attacker input and writes it into the page in an unsafe way. The dangerous step is in the script the site already ships.

    Suppose Acme Notes shows a welcome banner using the part of the URL after the #:

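    // Decode the fragment so names with spaces display properly
    // (browsers percent-encode characters such as spaces and angle brackets in it)
    const name = decodeURIComponent(location.hash.slice(1));
    document.getElementById('banner').innerHTML = 'Hi ' + name;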

    Now an attacker shares this link:

    https://acme-notes.example/#<img src=x onerror=alert('xss')>

    The innerHTML assignment turns the text into real elements. The broken image fires its onerror handler, and the script runs. The server never saw the payload, because the part after # never leaves the browser. That is why server side filters miss it.

    What is XSS and how does it work under the hood

    Every variant of cross site scripting comes from one root cause. The app treats untrusted input as trusted output. The fix is to keep data as data the whole way through. Two habits do most of the work, output encoding and avoiding unsafe sinks, and a Content Security Policy backs them up.

    Output encoding

    Encode data for the exact spot where it lands. When you put user text inside HTML, convert the characters that have meaning in HTML so the browser shows them instead of running them:

    • < becomes &lt;
    • > becomes &gt;
    • & becomes &amp;
    • " becomes &quot;

    After encoding, the earlier stored payload renders as visible text:

    <div class="comment">
      &lt;script&gt;alert('xss')&lt;/script&gt;
    </div>

    The reader sees the literal characters and nothing runs. Most template engines do this for you if you use their normal output syntax instead of a raw or unescaped output. Encoding depends on context. HTML body, an HTML attribute, JavaScript, and a URL each need their own encoding rules. Inside an attribute value, for example, the single quote must be encoded as well (as &#x27;), or input can close the attribute and add its own. Use a library that knows the difference rather than rolling your own escapes.
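
    In Python, the standard library's html.escape covers the HTML body and quoted attribute cases. The sketch below applies it to the stored comment from earlier; render_comment is a hypothetical helper, not part of any framework. By default html.escape also encodes both quote characters, so the output shows &#x27; where the example above left the single quotes as they were.

```python
from html import escape

def render_comment(comment: str) -> str:
    """Encode user text before placing it inside the HTML body."""
    return '<div class="comment">' + escape(comment) + "</div>"

rendered = render_comment("<script>alert('xss')</script>")
print(rendered)
```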

    Avoid the unsafe sinks

    For DOM based bugs, stop feeding untrusted input into sinks that parse HTML or run code. Reach for safe ones instead:

    • Use textContent instead of innerHTML when you only need text.
    • Avoid eval, setTimeout with a string, and document.write on user input.
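    • Set attributes with setAttribute rather than building HTML strings by hand, and never let user input fill event handler attributes such as onclick or supply a javascript: URL for href.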

    Content Security Policy

    A Content Security Policy is a response header that tells the browser which scripts are allowed to run. It is a second line of defense, not a replacement for encoding. A strict policy blocks inline scripts and scripts from domains you did not approve:

    Content-Security-Policy: default-src 'self'; script-src 'self'

    With that header, an injected inline <script> is refused even if it slips into the page. Pair it with the HttpOnly flag on session cookies so script cannot read them through document.cookie. Layered defenses mean one mistake does not hand over the account.
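
    How the header and the cookie flag get set depends on your framework. As a framework-neutral sketch, here is a minimal WSGI app in Python that sends both, using only the standard library. The app function, the cookie name, and the session value are placeholders for illustration. The Secure flag is an addition beyond the text above and assumes the site is served over HTTPS.

```python
from http.cookies import SimpleCookie

CSP = "default-src 'self'; script-src 'self'"

def app(environ, start_response):
    """Minimal WSGI app that sends a CSP header and an HttpOnly session cookie."""
    cookie = SimpleCookie()
    cookie["session"] = "example-session-id"  # placeholder value
    cookie["session"]["httponly"] = True      # hides the cookie from document.cookie
    cookie["session"]["secure"] = True        # assumption: the site uses HTTPS
    headers = [
        ("Content-Type", "text/html; charset=utf-8"),
        ("Content-Security-Policy", CSP),
        ("Set-Cookie", cookie["session"].OutputString()),
    ]
    start_response("200 OK", headers)
    return [b"<p>Hello from Acme Notes</p>"]

# Call the app once with a stand-in for the server's start_response callback.
captured = {}

def fake_start_response(status, headers):
    captured["status"] = status
    captured["headers"] = dict(headers)

body = app({}, fake_start_response)
print(captured["headers"]["Content-Security-Policy"])
print(captured["headers"]["Set-Cookie"])
```

    To try it in a browser, the same app object can be served locally with wsgiref.simple_server.make_server from the standard library.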

    How to spot it before an attacker does

    Finding cross site scripting is partly about knowing where input flows. Trace every place the app reads input, then follow it to every place that input is written back out. Comment fields, search boxes, profile names, URL parameters, and error messages that echo your input are common starting points. For a wider tour of input bugs, see our injection and input category.

    The hard cases are the ones that depend on how the app assumes its own data behaves, like a field that is encoded in one view and printed raw in another. Those gaps show up when you understand what the app expects, not just when you throw a list of payloads at it. This is exactly the kind of bug an autonomous researcher that tests an app’s assumptions is built to find and then prove with real evidence. You can read more about that approach on our about page.