Exploit Arena

Web3 bug bounty protocol with sandboxed exploit verification

The problem it solves

Bug Bounties Are Broken

$100M+ sits in bounty pools across platforms like Immunefi and HackerOne, yet the system fails in three core ways:

Manual triage is the bottleneck.
Every submission needs a human security expert — expensive, slow, and unscalable as smart contract deployments multiply.

Smart contracts can’t wait.
They’re immutable and hold real assets, so a flaw that ships stays exploitable. The Ronin Bridge hack cost $625M; the DAO hack cost $60M. Thorough audits catch many vulnerabilities before deployment, but they take weeks and cost tens of thousands of dollars.

No standard severity scoring on-chain.
Traditional security has CVSS (used by NIST, CERT, and major security teams). Web3 bounty platforms rely on subjective judgment.


ExploitArena

ExploitArena fixes this with a fully automated, trustless pipeline:

  • AI attacker agents explore smart contract repositories in isolated E2B cloud sandboxes, write PoC exploits, and must confirm the exploit works in the sandbox before submitting — no theoretical submissions.

  • Independent verifier agents reproduce the exploit in their own sandbox environments and compute a CVSS v4.0 severity score (the same standard used by NIST’s NVD).

  • On-chain BountyEscrow automatically resolves outcomes: a 3-of-5 verifier supermajority triggers a payout scaled to CVSS severity. If no exploit is confirmed before the deadline, the full amount is refunded to the developer. No admin intervention, no manual triage.
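
The resolution rule in the last bullet can be sketched as a pure function. This is a Python model for illustration, not the contract itself. The names `resolve`, `Outcome`, and `THRESHOLD` are hypothetical, and the linear payout rule (pool times score, divided by 10.0) plus the return of any remainder to the developer are assumptions: the description only says the payout is "scaled to CVSS severity".

```python
from dataclasses import dataclass

THRESHOLD = 3            # confirmations needed out of 5 verifiers (from the text)
MAX_SCORE_TENTHS = 100   # CVSS 10.0 stored as an integer number of tenths


@dataclass(frozen=True)
class Outcome:
    status: str        # "pending", "paid", or "refunded"
    to_hunter: int     # amount sent to the exploit submitter
    to_developer: int  # amount returned to the bounty creator


def resolve(pool: int, confirmations: int, score_tenths: int,
            now: int, deadline: int) -> Outcome:
    """Model of the escrow outcome; the linear payout rule is an assumption."""
    if confirmations >= THRESHOLD:
        # Integer math only: scale the pool by score / 10.0.
        payout = pool * score_tenths // MAX_SCORE_TENTHS
        return Outcome("paid", payout, pool - payout)
    if now > deadline:
        # No confirmed exploit by the deadline: full refund.
        return Outcome("refunded", 0, pool)
    return Outcome("pending", 0, 0)
```

For example, `resolve(1000, 3, 93, now=10, deadline=100)` pays 930 to the submitter under this assumed rule, while two confirmations after the deadline refund the full 1000.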


Challenges we ran into

Key Technical Challenges

Getting AI agents to produce verified working exploits
The hardest constraint was requiring attacker agents to actually run and confirm their exploit inside the sandbox (not just generate plausible-looking code). This meant engineering a tight tool loop: shell execution, contract compilation, and a local Hardhat node inside E2B, with a SKILLS.md prompt that enforced the distinction between “I think this works” and “I ran it and saw the balance drained.”
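
One way to enforce the "ran it and saw the balance drained" rule is a submission gate that looks only at recorded execution results, never at the agent's own claim. A minimal sketch, assuming a hypothetical `ExploitRun` record produced by the sandbox tooling; the field names are illustrative and not taken from the project, and a balance drop is only one possible success condition.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class ExploitRun:
    exit_code: int       # exit status of the exploit script in the sandbox
    balance_before: int  # target contract balance (wei) before the run
    balance_after: int   # target contract balance (wei) after the run


def may_submit(run: Optional[ExploitRun]) -> bool:
    """Allow submission only when an executed run shows funds actually left."""
    if run is None:          # nothing was executed: a theoretical claim only
        return False
    if run.exit_code != 0:   # the script crashed or the transaction reverted
        return False
    return run.balance_after < run.balance_before
```

Keeping the gate outside the model's prompt means a persuasive but unexecuted exploit cannot pass it.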

CVSS v4.0 on-chain
Mapping the multi-dimensional CVSS v4.0 scoring computation to deterministic on-chain arithmetic (fixed-point only, no floats) required careful design and unit testing against the expected outputs of the official CVSS calculator.
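
The float-free style can be illustrated with a small sketch: intermediate values are integers at an assumed precision of 1000 units per CVSS point, results are rounded half up to integer tenths (9.3 becomes 93), and the qualitative band follows the CVSS v4.0 severity scale (Low 0.1 to 3.9, Medium 4.0 to 6.9, High 7.0 to 8.9, Critical 9.0 to 10.0). This is Python for readability and does not reproduce the CVSS v4.0 scoring computation itself; `SCALE` and the function names are hypothetical.

```python
SCALE = 1000  # assumed internal precision: 1 CVSS point = 1000 units


def mul_fixed(a: int, b: int) -> int:
    """Multiply two non-negative SCALE-based fixed-point numbers, rounding down."""
    return a * b // SCALE


def round_to_tenths(value: int) -> int:
    """Round a SCALE-based value half up to integer tenths (9.35 -> 94)."""
    return (value + SCALE // 20) // (SCALE // 10)


def severity_band(score_tenths: int) -> str:
    """Qualitative rating from the CVSS v4.0 severity scale."""
    if not 0 <= score_tenths <= 100:
        raise ValueError("score must be between 0.0 and 10.0")
    if score_tenths == 0:
        return "None"
    if score_tenths <= 39:
        return "Low"
    if score_tenths <= 69:
        return "Medium"
    if score_tenths <= 89:
        return "High"
    return "Critical"
```

Because every operation is integer division with an explicit rounding rule, each step can be checked against the reference calculator one value at a time.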

Multi-agent isolation
Running 5+ independent verifiers in truly isolated E2B sandboxes — each cloning the target repo fresh, injecting the exploit, and running independently in parallel — required building a sandbox management layer with proper lifecycle and cleanup guarantees.
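
The lifecycle and cleanup guarantee can be sketched with `asyncio`: each verifier gets its own sandbox handle, and a `finally` block releases it whether the verifier succeeds, fails, or raises. `FakeSandbox` below is a stand-in written for this example and is not the E2B SDK; treating a crashed verifier as "not reproduced" is also an assumption.

```python
import asyncio
from typing import Awaitable, Callable, List


class FakeSandbox:
    """Stand-in for a real sandbox handle; NOT the E2B SDK."""

    def __init__(self, name: str) -> None:
        self.name = name
        self.closed = False

    async def close(self) -> None:
        self.closed = True


Verifier = Callable[[FakeSandbox], Awaitable[bool]]


async def run_isolated(sandbox: FakeSandbox, verify: Verifier) -> bool:
    """Run one verifier and always release its sandbox, even on failure."""
    try:
        return await verify(sandbox)
    except Exception:
        return False  # assumption: a crashed verifier counts as "not reproduced"
    finally:
        await sandbox.close()


async def run_all(sandboxes: List[FakeSandbox], verify: Verifier) -> List[bool]:
    """Run every verifier concurrently, one sandbox each; results keep input order."""
    return await asyncio.gather(*(run_isolated(s, verify) for s in sandboxes))
```

In a real deployment the same shape would wrap whatever create and teardown calls the sandbox provider exposes, typically with a per-verifier timeout added.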

Trustless verifier authorization
The BountyEscrow contract accepts votes only from authorized verifier agents, which prevents Sybil attacks, while keeping authorization open enough for new verifiers to join. The vote-tallying logic rejects double votes and replayed votes.
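
The acceptance rules can be modeled in a few lines. This is a Python sketch with hypothetical names (`VoteBook`, `cast`, `confirmations`), not the contract's code: votes are keyed by bounty and verifier, so an unauthorized sender is rejected, a second vote from the same verifier on the same bounty is rejected, and a vote recorded for one bounty is never counted toward another. How verifiers are added to the authorized set is left out, since the description does not specify it.

```python
from typing import Dict, Set


class VoteBook:
    """Model of vote acceptance; names and structure are illustrative."""

    def __init__(self, authorized: Set[str]) -> None:
        self.authorized = set(authorized)
        # bounty_id -> verifier address -> vote (True means exploit confirmed)
        self.votes: Dict[int, Dict[str, bool]] = {}

    def cast(self, bounty_id: int, verifier: str, confirmed: bool) -> None:
        if verifier not in self.authorized:
            raise PermissionError("verifier is not authorized")
        ballot = self.votes.setdefault(bounty_id, {})
        if verifier in ballot:
            # Covers both a changed second vote and a replay of the first one.
            raise ValueError("verifier already voted on this bounty")
        ballot[verifier] = confirmed

    def confirmations(self, bounty_id: int) -> int:
        """Count the verifiers who confirmed the exploit for this bounty."""
        return sum(self.votes.get(bounty_id, {}).values())
```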