OpenAI EVMbench Results: How Claude, GPT-5 and Gemini Ranked on Crypto Security

2026/02/19 20:30

TLDR

  • OpenAI released EVMbench, a benchmark that tests AI models on finding and fixing smart contract security flaws
  • Built with Paradigm and OtterSec, it draws on 120 real vulnerabilities from 40 audits
  • Anthropic’s Claude Opus 4.6 ranked first with a detect award of $37,824
  • OpenAI’s GPT-5.2 placed second at $31,623, Google’s Gemini 3 Pro third at $25,112
  • Crypto hackers stole $3.4 billion in 2025, making the need for AI security tools more pressing

OpenAI has launched a new benchmark called EVMbench, built to test how well AI models can detect, exploit, and fix vulnerabilities in smart contracts.

The tool was created alongside crypto investment firm Paradigm and security firm OtterSec. Results were published in a research paper on Wednesday, February 18.

Smart contracts are permanent pieces of code that run on blockchains like Ethereum. They control billions of dollars across lending platforms and decentralized exchanges. Once deployed, they cannot easily be changed, so a single flaw can lead to major losses.

EVMbench used 120 real vulnerabilities pulled from 40 smart contract audits, most sourced from open-source security competitions.

Each AI model was scored using a “detect award,” which estimates the dollar value an AI could theoretically recover by correctly identifying a flaw in a contract.

How Each AI Model Ranked

Anthropic’s Claude Opus 4.6 took the top spot with an average detect award of $37,824.

OpenAI’s own OC-GPT-5.2 came in second at $31,623. Google’s Gemini 3 Pro placed third at $25,112.

The benchmark tested three core skills: finding security bugs, exploiting those bugs in a controlled setting, and patching the broken code without disrupting the contract.

Why OpenAI Built This Tool

Crypto attackers stole $3.4 billion in 2025, a slight increase from the year before. OpenAI said testing AI performance in “economically meaningful environments” is becoming more important as AI adoption grows.

OpenAI also noted it expects AI agents to play a growing role in stablecoin payments. Circle CEO Jeremy Allaire predicted in January that billions of AI agents will be transacting with stablecoins within five years.

What Comes Next

Dragonfly managing partner Haseeb Qureshi posted on X that smart contracts were never designed for human intuition. He said signing large transactions still feels “terrifying” due to threats like drainer wallets, unlike a standard bank transfer.

Qureshi believes AI-managed wallets will eventually handle these risks for everyday users. He compared the pairing to GPS meeting the smartphone.

OpenAI said it hopes EVMbench becomes a long-term standard for tracking AI progress in blockchain security.

Claude Opus 4.6's top detect award score is the latest data point from the published study.

The post OpenAI EVMbench Results: How Claude, GPT-5 and Gemini Ranked on Crypto Security appeared first on Blockonomi.

