We Pointed an Autonomous AI Pentester at a Deliberately Broken API. It Came Back With a Root Shell

Komodo Research
20 hours ago

AigentX, our autonomous web-application penetration testing agent, ran black-box against OWASP crAPI and confirmed 35 exploitable findings, 15 of them Critical, including a chain that turns a free signup account into uid=0(root) and a permanently forged admin identity.

Every finding below carries a request, a response, and a reproduction. The full report is one click away.

35 Confirmed Findings · 15 Critical · 11 High · 4 Kill Chains · 0 False Positives

Most “AI found N vulnerabilities” write-ups never let you check the work. This one does. We ran AigentX against OWASP crAPI (the Completely Ridiculous API), a deliberately vulnerable, multi-service application that anyone in AppSec can stand up and verify against.

Then we did something most vendors won’t: we cross-checked the results with an independent AI source-code review, and we’ll tell you exactly where the two agreed and where they didn’t.

Why crAPI?

crAPI models a real automotive-services platform: user identity, a community feed, a shop, and a mechanic portal. It is split across a Spring Boot identity service, a Django workshop service, and a Go community service, behind a reverse proxy and backed by PostgreSQL and MongoDB.

That polyglot, multi-container shape is the point. Vulnerabilities hide in the seams between services, not just inside one file. It is also a known-answer target, which lets us ask the only question that matters for an autonomous pentester: does it find what is really there, prove it, and stop short of inventing what is not?

What AigentX Did

AigentX runs the full dynamic pentest loop: reconnaissance, endpoint discovery, authentication, vulnerability hypothesis, exploitation, and validation against a running target, with no access to source code. It works black-box, the way a real attacker does.

Against crAPI, it mapped the service topology, authenticated as an ordinary registered user, worked endpoint by endpoint through injection, access-control, SSRF, business-logic, and authentication testing, and then chained the confirmed primitives into end-to-end attacks.

The tally: 35 findings, 15 Critical, 11 High, 9 Medium, plus four working attack chains. There is no informational padding and no “potential, unconfirmed” tier. Every finding ships with request/response evidence and a reproduction.

The Headline: A Free Account to Root

The single most serious result is a remote code execution chain that starts from an ordinary user token and ends with a root shell and the keys to every identity in the system.

crAPI’s identity service stores a `conversion_params` value for uploaded videos, intended as options for an `ffmpeg` pipeline, and passes it unsanitized into a shell command. AigentX reached the internally restricted conversion endpoint through a second bug, the SSRF in `contact_mechanic`, and used it to run arbitrary commands, then to steal the RSA key that signs every JWT.
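The defensive counterpart to this bug class can be sketched in a few lines. The Python below is an illustration under stated assumptions, not crAPI's code (the identity service is Spring Boot): the function name `build_ffmpeg_argv` and the `ALLOWED_OPTIONS` allowlist are hypothetical. It tokenizes the stored value, accepts only allowlisted option/value pairs, and returns an argument list meant to be executed without a shell, so metacharacters such as `;` or `$(...)` are never interpreted.

```python
import shlex

# Hypothetical allowlist: option name -> set of permitted values.
ALLOWED_OPTIONS = {
    "-c:v": {"libx264", "copy"},
    "-preset": {"fast", "medium", "slow"},
}


def build_ffmpeg_argv(src, dst, conversion_params):
    """Return an argv list for ffmpeg, or raise ValueError on unexpected input.

    src and dst are assumed to be server-generated paths, not user input.
    """
    # shlex.split raises ValueError on unbalanced quotes.
    tokens = shlex.split(conversion_params)
    if len(tokens) % 2 != 0:
        raise ValueError("options must come in name/value pairs")

    argv = ["ffmpeg", "-i", src]
    for name, value in zip(tokens[0::2], tokens[1::2]):
        # Unknown option names map to an empty tuple, so they are rejected too.
        if value not in ALLOWED_OPTIONS.get(name, ()):
            raise ValueError(f"option not permitted: {name} {value}")
        argv += [name, value]
    argv.append(dst)
    return argv
```

The returned list would then be passed to something like `subprocess.run(argv)`, which does not invoke a shell by default. A value such as `-c:v libx264; id` is rejected here rather than executed, because `libx264;` and `id` do not form an allowlisted pair.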

With the private key in hand, an attacker can mint valid RS256 tokens for any user or role indefinitely. Authentication does not just fail once. It collapses permanently until the key is rotated.
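Why a stolen signing key is a standing compromise, and why rotation ends it, can be shown with a small sketch. To stay runnable with only the Python standard library, it swaps in HS256 (HMAC-SHA256) for RS256. With RS256 the verifier would hold only the public key, but the consequence of a leaked signing key is the same. The key values, claim names, and helper functions here are all hypothetical and are not taken from crAPI.

```python
import base64
import hashlib
import hmac
import json


def b64url(data: bytes) -> str:
    """Base64url-encode without padding, as JWTs do."""
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode("ascii")


def b64url_decode(text: str) -> bytes:
    """Decode base64url text, restoring the stripped padding."""
    return base64.urlsafe_b64decode(text + "=" * (-len(text) % 4))


def sign_token(claims: dict, key: bytes) -> str:
    """Build a header.payload.signature token (HS256 stand-in for RS256)."""
    header = {"alg": "HS256", "typ": "JWT"}
    signing_input = (
        b64url(json.dumps(header).encode()) + "." + b64url(json.dumps(claims).encode())
    )
    sig = hmac.new(key, signing_input.encode("ascii"), hashlib.sha256).digest()
    return signing_input + "." + b64url(sig)


def verify_token(token: str, key: bytes):
    """Return the claims if the signature matches the key, else None."""
    try:
        signing_input, sig_part = token.rsplit(".", 1)
        expected = hmac.new(key, signing_input.encode("ascii"), hashlib.sha256).digest()
        if not hmac.compare_digest(expected, b64url_decode(sig_part)):
            return None
        return json.loads(b64url_decode(signing_input.split(".")[1]))
    except (ValueError, IndexError):
        return None


# Whoever holds the signing key decides what the claims say.
leaked_key = b"signing-key-v1"  # hypothetical stolen key
forged = sign_token({"sub": "someone@example.com", "role": "admin"}, leaked_key)

accepted_before_rotation = verify_token(forged, leaked_key)
accepted_after_rotation = verify_token(forged, b"signing-key-v2")
```

The verifier has no way to tell a forged token from a legitimate one while the old key is trusted, so `accepted_before_rotation` carries the attacker's chosen claims. Once the service trusts only the new key, the same token fails verification.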

The Standout Findings

The Four Kill Chains

Individually these are findings. Chained, they are an incident. AigentX assembled four end-to-end attacks from confirmed primitives:

We Cross-Checked the Coverage With AI SAST

A pentest is only as good as its coverage, so we pressure-tested AigentX's. We ran the same crAPI codebase through an independent AI source-code review, a full white-box read of every service, and compared it against AigentX's black-box results.

The source review did surface more, and it's worth being precise about what kind of "more." The extras fell into two buckets. The first is code paths that aren't reachable from outside the application, such as a handful of JWT-validation variants buried deep in the verification logic. The second is deployment-configuration hygiene: TLS settings, secrets committed to manifests, and container hardening.

Both are real and both are worth fixing. Neither is an exploitable gap an external attacker could have walked through; they're the completeness that source access exists to provide. On the attack surface that actually faces the internet, the two methods agreed, and only the dynamic run could prove which of those issues an attacker can reach today.

Want to see what AigentX finds in your environment?

Book a demo to see how AigentX maps, exploits, and validates real attack paths against your application.
