If you search for an AI vulnerability scanner, you will find a lot of tools that promise to find every bug in your app. Most of them work the same way underneath: they match your app against a list of known patterns and hand you back a long report of maybes. UnboundCompute is a different kind of tool, and this post is an honest look at how it differs and where it stands today.
We are early. The product is being built. So this is not a sales pitch. It is a comparison of two ways of looking for bugs, and an explanation of why we chose the harder one.
What a traditional vulnerability scanner actually does
A classic scanner crawls your app, collects every URL, form, and parameter it can reach, then fires a fixed set of test payloads at each one. It watches the response for signs that something went wrong: a reflected string here, a database error message there, a slow response that hints at an injected sleep command.
This works for whole classes of well-known bugs. If a field echoes back <script>alert(1)</script> without encoding, a scanner will catch it. If a search box passes ' OR '1'='1 straight into a query, it will often catch that too. That is real value, and pattern matching is good at finding the obvious mistakes quickly.
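To make the mechanism concrete, here is a minimal sketch of a payload-driven check in Python. It is an illustration under stated assumptions, not how any particular scanner is implemented: the three handlers are hypothetical stand-ins for endpoints, and the signals are reduced to two simple string checks.

```python
import html

# Fixed payload list; each payload is paired with the surface signal it looks for.
PAYLOADS = [
    ("<script>alert(1)</script>", "reflected"),
    ("' OR '1'='1", "db_error"),
]


def echo_handler(value: str) -> str:
    """Hypothetical endpoint that reflects input without encoding it."""
    return f"<p>You searched for {value}</p>"


def safe_handler(value: str) -> str:
    """Hypothetical endpoint that HTML-encodes input before reflecting it."""
    return f"<p>You searched for {html.escape(value)}</p>"


def leaky_query_handler(value: str) -> str:
    """Hypothetical endpoint that leaks a database error when input has a quote."""
    if "'" in value:
        return "You have an error in your SQL syntax"
    return "<p>No results</p>"


def scan(handler) -> list[str]:
    """Send every payload to the handler and collect surface signals."""
    hits = []
    for payload, signal in PAYLOADS:
        body = handler(payload)
        if signal == "reflected" and payload in body:
            hits.append("possible XSS: payload reflected unencoded")
        if signal == "db_error" and "SQL syntax" in body:
            hits.append("possible SQL injection: database error in response")
    return hits
```

Calling scan(echo_handler) flags the reflected script tag, scan(leaky_query_handler) flags the error message, and scan(safe_handler) returns nothing. Note that the function never asks what the endpoint is for; it only looks at what comes back.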
The trouble starts past the obvious. A scanner does not know what your app is for. It does not know that a user on a free plan should never reach /api/v1/exports/full, or that order id=1043 belongs to a different account. It sees a request that returns 200 OK and moves on. To the scanner, a working feature and a broken access control check look identical.
Why the report is full of maybes
Because a scanner guesses from surface signals, it has to play it safe. If a payload causes any change at all, it tends to flag it so it does not miss a real bug. The result is a report with many items marked “possible” or “medium confidence,” and a real chance that most of them are false positives. Someone on your team then spends a day or two checking each one by hand to find the few that are real.
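A small sketch shows how the play-it-safe rule produces noise. It assumes a hypothetical search page that encodes its input correctly, and a deliberately crude heuristic that flags any response that differs from a baseline. Real scanners use finer signals, so treat this as an exaggeration of the idea rather than a model of a specific product.

```python
import html


def search_page(query: str) -> str:
    """Hypothetical search page that safely encodes the query it displays."""
    return f"<h1>Results for {html.escape(query)}</h1>"


def flag_on_any_change(handler, baseline: str, payloads: list[str]) -> list[dict]:
    """Flag every payload whose response differs from the baseline response."""
    normal = handler(baseline)
    return [
        {"payload": payload, "confidence": "possible"}
        for payload in payloads
        if handler(payload) != normal
    ]


candidates = flag_on_any_change(
    search_page,
    "notes",
    ["<script>alert(1)</script>", "' OR '1'='1"],
)
```

Both payloads land in candidates because the page text changed, even though the page encodes them and neither is exploitable. Someone still has to open each one and check.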
That is the core problem. The scanner did the easy part and left the hard part, proving the bug, to you.
A scanner tells you where something might be wrong. The expensive work, proving whether it really is, still lands on a human.
How an AI vulnerability scanner that reasons is different
UnboundCompute is built around a different loop. Instead of matching payloads against a list, it tries to understand the app first, then form ideas about where the logic could break, then run experiments to test those ideas, and only report a finding once it has proof. The loop, in short: understand, assume, experiment, verify, chain.
Here is what that looks like in practice on an invented example. Say a typical SaaS app called Acme Notes lets users share a note by id:
GET /api/notes/4471
Authorization: Bearer <user A token>
A pattern matcher checks that the response is valid and moves on. A researcher that reasons about the app notices the id is a plain number and forms an assumption: the server might be trusting the id in the URL without checking who owns the note. So it designs an experiment. It logs in as a second user, takes that user’s token, and asks for a note id that belongs to user A:
GET /api/notes/4471
Authorization: Bearer <user B token>
If user B gets back user A’s private note, that assumption was correct. The tool does not stop at a hunch. It confirms the note content belongs to a different account, records the exact request and response as evidence, and only then reports it. That is an access control bug a payload list would never spot, because nothing in the request looks malicious. The request is perfectly well-formed. The problem is what the app assumed.
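The experiment above can be sketched in a few lines of Python. Everything here is hypothetical: AcmeNotes is an in-memory stand-in for the invented app, the tokens and the check_owner switch exist only so the example runs without a network, and cross_account_experiment is our illustration of the idea, not UnboundCompute's actual code.

```python
from dataclasses import dataclass


@dataclass
class Response:
    status: int
    body: str


class AcmeNotes:
    """In-memory stand-in for the invented Acme Notes API."""

    def __init__(self, check_owner: bool):
        self.check_owner = check_owner
        self.tokens = {"token-a": "user_a", "token-b": "user_b"}
        self.notes = {4471: {"owner": "user_a", "text": "A's private note"}}

    def get_note(self, note_id: int, token: str) -> Response:
        user = self.tokens.get(token)
        note = self.notes.get(note_id)
        if user is None:
            return Response(401, "unauthorized")
        if note is None:
            return Response(404, "not found")
        # The ownership check is the step a vulnerable server skips.
        if self.check_owner and note["owner"] != user:
            return Response(403, "forbidden")
        return Response(200, note["text"])


def cross_account_experiment(app, note_id, owner_token, other_token):
    """Return evidence if a non-owner can read the owner's note, else None."""
    owner_view = app.get_note(note_id, owner_token)
    other_view = app.get_note(note_id, other_token)
    # Proof requires that the non-owner received the same content the owner sees.
    leaked = (
        owner_view.status == 200
        and other_view.status == 200
        and other_view.body == owner_view.body
    )
    if not leaked:
        return None
    return {
        "request": f"GET /api/notes/{note_id} as {other_token}",
        "response_status": other_view.status,
        "response_body": other_view.body,
    }
```

Against AcmeNotes(check_owner=False) the function returns the recorded request and response. Against AcmeNotes(check_owner=True) it returns None, so there is nothing to report.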
Proof before report
The rule that changes the output is simple: a finding is only reported when it is proven with concrete evidence. No proof, no report. This flips the work. Instead of handing you candidates to verify, the tool does the verification itself and hands you the ones that survived. The output is signal rather than a stack of maybes.
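As a rough sketch, the rule amounts to a gate between the experiment stage and the report. The Candidate type and its fields below are assumptions made for illustration; the real data model may look nothing like this.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Candidate:
    title: str
    # Recorded request and response, present only if the experiment succeeded.
    evidence: Optional[dict] = None


def report(candidates: list[Candidate]) -> list[Candidate]:
    """Keep only candidates that carry concrete evidence."""
    return [c for c in candidates if c.evidence]


candidates = [
    Candidate("Note readable across accounts", {"status": 200, "body": "A's private note"}),
    Candidate("Search box might be injectable"),  # a hunch, never proven
]
proven = report(candidates)
```

Only the first candidate survives. The hunch stays out of the report until an experiment backs it up.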
A confirmed finding can also be turned into a repeatable check, so the same test keeps running and tells you if the bug ever comes back after a fix or a refactor.
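One way to picture such a check, assuming the cross-account note example from earlier: a small function that replays the non-owner request and passes only if access is refused. The fetch callables are hypothetical stand-ins for a real HTTP client, and treating 403 or 404 as a refusal is our assumption about how a fixed server would respond.

```python
from typing import Callable, Tuple

# A fetch function takes (note_id, token) and returns (status, body).
Fetch = Callable[[int, str], Tuple[int, str]]


def check_note_is_private(fetch: Fetch, note_id: int, other_token: str) -> bool:
    """Pass only if a non-owner is refused access to the note."""
    status, _body = fetch(note_id, other_token)
    return status in (403, 404)


def fixed_fetch(note_id: int, token: str) -> Tuple[int, str]:
    """Stand-in for the app after the fix: only the owner can read the note."""
    if token == "token-a":
        return (200, "A's private note")
    return (403, "forbidden")


def regressed_fetch(note_id: int, token: str) -> Tuple[int, str]:
    """Stand-in for the app after a refactor dropped the ownership check."""
    return (200, "A's private note")
```

The check passes against fixed_fetch and fails against regressed_fetch, which is the signal you want if a later change quietly reintroduces the bug.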
A short comparison
- How it finds bugs. A scanner matches known patterns. UnboundCompute forms an idea about the app’s logic and tests it.
- What it understands. A scanner sees URLs and parameters. The researcher tries to learn what the app is meant to do and where that intent could break.
- What it reports. A scanner reports candidates, many of them false positives. UnboundCompute reports findings it has already proven.
- Who proves the bug. With a scanner, a human triages the list. Here, the tool runs the experiment and keeps the evidence.
- The kind of bug it catches. Scanners are strong on known payload bugs. The researcher reaches logic and access control flaws that have no fixed payload.
- After the fix. A proven finding becomes a repeatable check that watches for the bug returning.
We go deeper on this split in scanners vs research, since it is the line that matters most when you are choosing a tool.
Where we are honest about the limits
None of this means scanners are useless. They are fast, cheap, and good at sweeping for the common, known issues. If you have never run one, run one. The point is that pattern matching has a ceiling, and the highest-impact bugs usually live above it, in the assumptions an app makes about who you are and what you are allowed to do.
It also does not mean UnboundCompute is finished. It is not. We are building it, and we are not going to dress that up with customer counts or benchmark charts we do not have. What we can say is an early, encouraging signal: a frontier model drove the full methodology on its own and identified and verified real access control and injection issues in test applications it had not seen before. That is a hint that the approach works, not a promise of a result on your app.
Which one should you use
Think of them as different jobs. A vulnerability scanner is a smoke detector for the known stuff, cheap to run and worth keeping on. An autonomous researcher is closer to a person who reads your app, asks “what if the server trusts this id,” and goes and checks. They answer different questions.
If you take one thing from this, take the difference between a maybe and a proof. A maybe costs you time. A proof saves it. That gap is exactly what an autonomous researcher that tests assumptions is built to close. You can read more about who we are and where we are headed on our about page.
