OpenAI and Paradigm launch smart contract security evaluation system

2026/02/19 13:15

OpenAI has introduced a new system called EVMbench, designed to measure how well artificial intelligence agents can find and fix security flaws in crypto smart contracts.

Summary
  • OpenAI has introduced EVMbench, a new framework designed to measure how well AI agents can detect, fix, and exploit smart contract vulnerabilities.
  • Developed with Paradigm, the benchmark is built on real audit data and focuses on practical, high-risk security scenarios.
  • Early results show strong progress in exploit tasks, while detection and patching are still challenging.

The company announced on Feb. 18 that it has developed EVMbench in partnership with Paradigm. The benchmark focuses on contracts built for the Ethereum Virtual Machine and is meant to test how AI systems perform in real financial settings.

OpenAI said smart contracts currently secure more than $100 billion in open-source crypto assets, making security testing increasingly important as AI tools become more capable.

Testing how AI handles real security risks

EVMbench evaluates AI agents across three main tasks: detecting vulnerabilities, fixing flawed code, and carrying out simulated attacks. The system is built using 120 high-risk issues drawn from 40 past security audits, many of them from public auditing competitions.

Additional scenarios were taken from reviews of the Tempo blockchain, a payments-focused network designed for stablecoin use. These cases were added to reflect how smart contracts are used in financial applications.

To build the test environment, OpenAI adapted existing exploit scripts and created new ones where needed. All exploit tests run in isolated systems rather than on live networks, and only previously disclosed vulnerabilities are included.

In detection mode, agents review contract code and try to identify known security flaws. In patch mode, they must fix those flaws without breaking the software. In exploit mode, agents attempt to drain funds from vulnerable contracts in a controlled setting.
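The three modes can be pictured as three different pass conditions. The sketch below is illustrative only: the `Outcome` fields, the `score` function, and the scoring rules are assumptions made for this example, not EVMbench's published grading logic.

```python
from dataclasses import dataclass
from enum import Enum


class Mode(Enum):
    DETECT = "detect"
    PATCH = "patch"
    EXPLOIT = "exploit"


@dataclass
class Outcome:
    """Hypothetical record of one agent attempt; every field is illustrative."""
    mode: Mode
    known_issues: frozenset = frozenset()     # disclosed issue IDs from the audit
    reported_issues: frozenset = frozenset()  # issue IDs the agent flagged (detect)
    tests_pass: bool = False                  # contract still behaves as intended (patch)
    exploit_blocked: bool = False             # known attack no longer works (patch)
    funds_drained: bool = False               # funds taken in the sandbox (exploit)


def score(outcome: Outcome) -> float:
    """Assumed scoring: recall for detection, all-or-nothing for patch and exploit."""
    if outcome.mode is Mode.DETECT:
        if not outcome.known_issues:
            return 0.0
        found = outcome.known_issues & outcome.reported_issues
        return len(found) / len(outcome.known_issues)
    if outcome.mode is Mode.PATCH:
        # A fix only counts if it stops the attack without breaking the software.
        return 1.0 if (outcome.tests_pass and outcome.exploit_blocked) else 0.0
    return 1.0 if outcome.funds_drained else 0.0


detect = Outcome(Mode.DETECT,
                 known_issues=frozenset({"H-01", "H-02"}),
                 reported_issues=frozenset({"H-02", "L-07"}))
patch = Outcome(Mode.PATCH, tests_pass=True, exploit_blocked=False)
exploit = Outcome(Mode.EXPLOIT, funds_drained=True)

print(score(detect), score(patch), score(exploit))  # 0.5 0.0 1.0
```

Under these assumed rules, a patch that keeps the software working but leaves the flaw exploitable earns nothing, which mirrors the requirement that fixes must not break the contract.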

Early results and industry impact

OpenAI said a custom testing framework was developed to ensure results can be reproduced and verified.

The company tested several advanced models using EVMbench. In exploit mode, GPT-5.3-Codex achieved a score of 72.2%, compared with 31.9% for GPT-5, released six months earlier. Detection and patching scores were lower, showing that many vulnerabilities are still difficult for AI systems to handle.
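As a quick check on the size of that jump, the two reported exploit scores can be compared directly. The variable names are illustrative; only the two percentages come from the article.

```python
# Exploit-mode scores reported by OpenAI, in percent.
codex_score = 72.2  # GPT-5.3-Codex
gpt5_score = 31.9   # GPT-5

gap_points = round(codex_score - gpt5_score, 1)  # difference in percentage points
ratio = round(codex_score / gpt5_score, 2)       # relative improvement

print(f"{gap_points} points, {ratio}x")  # 40.3 points, 2.26x
```

That works out to a gain of roughly 40 percentage points, or a little more than double the earlier score.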

Researchers observed that agents performed best when goals were clear, such as draining funds. Performance dropped when tasks required deeper analysis, such as reviewing large codebases or fixing subtle bugs.

OpenAI acknowledged that EVMbench does not fully reflect real-world conditions. Many major crypto projects undergo more extensive reviews than those included in the dataset. Some timing-based and multi-chain attacks are also outside the system’s scope.

The company said the benchmark is intended to support defensive use of AI in cybersecurity. As AI tools become more powerful, they could be used by both attackers and auditors. Measuring their capabilities is seen as a way to reduce risk and encourage responsible deployment.

Alongside the release, OpenAI said it is expanding security programs and investing $10 million in API credits to support open-source and infrastructure protection. All EVMbench tools and datasets have been made public to support further research.
