    Phronews: Artificial Intelligence & The Future

    OpenAI Benchmarks AI Models for Smart Contract Security Testing in Blockchain Applications

    By fariehan, February 27, 2026

    On February 18, 2026, OpenAI and crypto investment firm Paradigm jointly launched EVMbench, an open-source benchmark that tests how well AI agents can detect, patch, and exploit vulnerabilities in Ethereum-based smart contracts.

    The timing is apt: smart contracts currently secure over $100 billion in open-source crypto assets.

    What Is EVMbench and How Does It Test Smart Contract Security?

    EVMbench draws on 120 curated vulnerabilities across 40 professional audits, most pulled from Code4rena, a platform where security researchers race to find bugs in live codebases. Until now, no standardized tool existed to measure AI performance in this environment. 

    To fix that, OpenAI open-sourced the full dataset, tooling, and evaluation harness so developers can consistently test models as AI capabilities evolve. 

    Specifically, EVMbench tests agents across three modes: Detect (identify vulnerabilities), Patch (fix them without breaking functionality), and Exploit (execute fund-draining attacks inside a sandboxed environment). To keep things safe, the system runs all tests in an isolated environment so no real money is ever at risk. 
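
    To make the three modes concrete, here is a minimal Python sketch of how a harness might grade one agent run per mode. It is not EVMbench's actual code: the names Mode, TaskResult, and score, along with every field and the scoring rules themselves, are assumptions made for illustration.

```python
from dataclasses import dataclass
from enum import Enum


class Mode(Enum):
    DETECT = "detect"
    PATCH = "patch"
    EXPLOIT = "exploit"


@dataclass
class TaskResult:
    """Outcome of one agent run (every field here is hypothetical)."""
    mode: Mode
    known_vulns: frozenset = frozenset()     # ground-truth issue ids from the audit
    reported_vulns: frozenset = frozenset()  # issue ids the agent reported
    tests_pass: bool = False                 # existing functionality still works
    exploit_blocked: bool = False            # reference exploit fails after the patch
    funds_drained: bool = False              # sandboxed attack moved the funds


def score(result: TaskResult) -> float:
    """Return a score in [0, 1] under the assumed grading rules."""
    if result.mode is Mode.DETECT:
        # Assumed rule: fraction of known issues the agent reported.
        if not result.known_vulns:
            return 0.0
        found = result.known_vulns & result.reported_vulns
        return len(found) / len(result.known_vulns)
    if result.mode is Mode.PATCH:
        # Assumed rule: the fix must stop the exploit AND keep tests green.
        return 1.0 if (result.tests_pass and result.exploit_blocked) else 0.0
    # Exploit mode, assumed rule: success means funds left the sandboxed contract.
    return 1.0 if result.funds_drained else 0.0


# An agent that reports one of three known issues gets partial Detect credit.
partial = TaskResult(Mode.DETECT, frozenset({"a", "b", "c"}), frozenset({"a"}))
print(round(score(partial), 2))
```

    Under these assumed rules, a patch that blocks the attack but breaks existing tests scores zero, mirroring the requirement to fix bugs without breaking functionality.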

    How OpenAI GPT-5.3-Codex Scores Against Real Blockchain Vulnerabilities

    So far, the results show significant progress. In Exploit mode, GPT-5.3-Codex scored 72.2%, up from GPT-5’s 31.9% just six months earlier. Paradigm partner Alpin Yukseloglu noted that when the project started, top models could exploit fewer than 20% of critical bugs. Today, that figure sits above 70%.
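
    For a sense of scale, the gap between the two Exploit-mode scores can be worked out directly. The short Python snippet below uses only the two figures reported above; the variable names are illustrative.

```python
# Exploit-mode scores as reported in the article, in percent.
gpt_5_3_codex = 72.2
gpt_5 = 31.9

# Absolute improvement, in percentage points.
point_gain = round(gpt_5_3_codex - gpt_5, 1)

# Newer score as a multiple of the earlier one.
relative_gain = round(gpt_5_3_codex / gpt_5, 2)

print(f"Gain: {point_gain} percentage points ({relative_gain}x the earlier score)")
# prints: Gain: 40.3 percentage points (2.26x the earlier score)
```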

    Nevertheless, detection and patching remain harder problems. Agents frequently stop after finding a single issue, and Patch success rates still fall short of full coverage.

    Why Blockchain Developers Need AI Security Auditing Tools Right Now

    Manual auditing cannot keep up as blockchain adoption accelerates. EVMbench lets developers run vulnerability sweeps in hours rather than days, freeing human auditors to focus on the most complex edge cases.

    OpenAI also committed $10 million in API credits to support defensive cybersecurity research for open-source and critical infrastructure projects.

    Still, the same AI capabilities that help defenders can also accelerate cyberattacks. OpenAI acknowledged this directly and says it is taking an evidence-based approach: accelerating defensive capabilities while putting safeguards in place to slow misuse.

    Ultimately, open-sourcing EVMbench means any developer can now test AI models against the same standards that top security researchers use. 
