
Microsoft, Meta, and Nvidia just got a major wake-up call: security researchers found critical remote code execution flaws in their AI inference engines. In this article, we explore the recent discovery of these serious security flaws.
We cover how the remote code execution vulnerabilities spread through code reuse, detail the affected frameworks and risks, and look at the patches and steps that can harden AI systems.
The ShadowMQ Vulnerability Pattern
Security researchers at Oligo Security found a dangerous pattern called ShadowMQ. It started in Meta’s Llama Stack framework.
The issue? The unsafe pairing of unauthenticated ZeroMQ sockets with Python’s pickle deserialization for data handling. Pickle can execute arbitrary code during unpickling, and an exposed socket opens that door to remote attackers over the network.
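To see why this matters, here is a minimal sketch of how unpickling attacker-controlled bytes runs code. The `Malicious` class and the harmless `eval` payload are illustrative stand-ins, not taken from any of the affected frameworks:

```python
import pickle

class Malicious:
    # pickle calls __reduce__ to learn how to rebuild an object;
    # an attacker can make it return any callable plus its arguments.
    def __reduce__(self):
        # harmless stand-in; a real payload would call os.system, etc.
        return (eval, ("6 * 7",))

blob = pickle.dumps(Malicious())  # the bytes an attacker would send
result = pickle.loads(blob)       # eval("6 * 7") runs during unpickling
print(result)                     # prints 42
```

The receiver never has to define or import `Malicious`: the payload replaces the object entirely with a call to `eval`, which is exactly why pickling untrusted network data is so dangerous.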
The flaw spread through copied code. Developers lifted files from one project into another without a full security review, so the vulnerable pattern moved from one repository to the next.
ShadowMQ was first spotted in October 2024 during routine scans of Meta’s Llama Stack, when Oligo noticed unauthenticated ZMQ sockets deserializing untrusted data via pickle. Teams must now watch for these hidden chains of copied code.
Affected Frameworks and Real-World Impact
The bugs hit major players. Meta’s Llama Stack got CVE-2024-50050 with a CVSS score of 8.0. Nvidia’s TensorRT-LLM has CVE-2025-23254 at 9.3 severity. Microsoft’s Sarathi-Serve remains vulnerable, with no CVE assigned yet.
Open-source engines like vLLM and SGLang are also affected. These tools power AI in big setups: users include xAI, AMD, Intel, and clouds like AWS and Azure, and universities such as MIT and Stanford rely on them too.
Enterprises are already feeling the heat from ShadowMQ exposures. Oligo’s scan revealed over 4,200 publicly reachable ZeroMQ ports running the vulnerable code. Many belong to Fortune 500 companies and government clouds.
Attackers could steal proprietary models or plant cryptocurrency miners, and exploits might lead to full takeovers; one compromised node could spread harm across an entire cluster. Reported AI threats rose 40 percent in 2025, part of a trend in which flaws hit development pipelines hard.
Real-world testing proved the danger in minutes. Oligo researchers gained remote shells on unpatched clusters with a single payload. They extracted full Llama-3.1-70B weights in under 20 minutes. This shows why inference engines now top the list of high-value targets for nation-state actors.
Patches Released and Security Lessons
Good news came with quick fixes. Meta switched to JSON serialization. Nvidia added HMAC checks in version 0.18.2. vLLM now defaults to its safe V1 engine. Modular Max Server uses msgpack instead.
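Nvidia’s mitigation authenticates messages before they are processed. The idea can be sketched as follows; the pre-shared key handling and helper names are assumptions for illustration, not TensorRT-LLM’s actual code:

```python
import hashlib
import hmac

SHARED_KEY = b"pre-shared-secret"  # assumption: both endpoints hold this key

def sign(payload: bytes) -> bytes:
    # Prepend a SHA-256 HMAC tag so the receiver can authenticate the sender.
    tag = hmac.new(SHARED_KEY, payload, hashlib.sha256).digest()
    return tag + payload

def verify(message: bytes) -> bytes:
    # Split off the 32-byte tag and recompute it over the payload.
    tag, payload = message[:32], message[32:]
    expected = hmac.new(SHARED_KEY, payload, hashlib.sha256).digest()
    if not hmac.compare_digest(tag, expected):  # constant-time comparison
        raise ValueError("message failed authentication")
    return payload
```

A node that receives a message without a valid tag rejects it before any deserialization happens, which closes the door on unauthenticated senders.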
Microsoft’s Sarathi-Serve needs urgent review: it is a research tool, but it runs in production environments. Oligo urges all users to update now.
To avoid repeats of ShadowMQ: audit copied code, use safe data formats like JSON or msgpack, test network exposure regularly, never unpickle untrusted data, and educate dev teams on safe serialization.
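The “safe data formats” advice amounts to a small drop-in swap. A minimal sketch with hypothetical helper names, not any framework’s actual API:

```python
import json

def encode_msg(obj) -> bytes:
    # JSON expresses only data (dicts, lists, strings, numbers, bools),
    # so decoding it can never execute attacker-supplied code.
    return json.dumps(obj).encode("utf-8")

def decode_msg(blob: bytes):
    return json.loads(blob.decode("utf-8"))

msg = {"op": "infer", "tokens": [1, 2, 3]}
round_tripped = decode_msg(encode_msg(msg))
print(round_tripped == msg)  # prints True
```

Unlike pickle, the worst a malformed JSON payload can do is raise a parse error, which is exactly the failure mode you want for untrusted input.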
As engines scale, security must keep pace. Firms now push for vetted reuse. This could shape safer standards ahead.
See Also: https://phronews.com/nvidia-q3-2025-earnings-ai-bubble-debate/
