
Nvidia has unveiled the RTX Spark, a new superchip that brings personal AI agents directly onto Windows laptops and compact desktops, built around 1 petaflop of AI compute and up to 128GB of unified memory. 

Announced at Nvidia’s GTC keynote during Computex in Taipei, the chip marks the company’s first move into the consumer PC market as a silicon designer, taking on Intel, AMD, Apple, and Qualcomm on their home turf.

Nvidia promises that laptops built on the chip can run language models with up to 120 billion parameters and 1 million-token context windows without sending a single query to the cloud. Devices powered by RTX Spark are expected to ship in fall 2026 from ASUS, Dell, HP, Lenovo, Microsoft Surface, MSI, Acer, and GIGABYTE.

What Is Inside the Chip

RTX Spark integrates three compute domains (a Blackwell RTX GPU, a 20-core Grace CPU, and an NPU) on a single system-on-chip built on TSMC’s 3nm process, with roughly 70 billion transistors. All three share up to 128GB of LPDDR5X unified memory. The GPU carries 6,144 CUDA cores, the same count as the RTX 5070 laptop GPU, with fifth-generation Tensor Cores supporting FP4, the 4-bit floating-point format behind the headline AI performance number. The CPU is a custom Grace design, co-developed with MediaTek, with ten Cortex-X925 performance cores at 4.0 GHz and ten A725 efficiency cores at 2.85 GHz.
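A rough back-of-the-envelope check shows why the 4-bit format matters for the 120-billion-parameter claim. The sketch below is an illustration only, not an Nvidia figure: it counts memory for model weights alone and ignores the context cache, activations, and whatever the operating system needs. The function and variable names are made up for this example.

```python
def weight_footprint_gb(num_params: float, bits_per_param: int) -> float:
    """Approximate memory for model weights alone, in decimal gigabytes."""
    return num_params * bits_per_param / 8 / 1e9


PARAMS = 120e9            # 120 billion parameters, the article's upper bound
UNIFIED_MEMORY_GB = 128   # maximum unified memory quoted for RTX Spark

# Weights-only footprint at 16-bit, 8-bit, and 4-bit precision.
footprints = {bits: weight_footprint_gb(PARAMS, bits) for bits in (16, 8, 4)}

for bits, gb in footprints.items():
    verdict = "fits" if gb <= UNIFIED_MEMORY_GB else "does not fit"
    print(f"{bits}-bit weights: {gb:.0f} GB ({verdict} in {UNIFIED_MEMORY_GB} GB)")
```

Under these assumptions, 16-bit weights would need about 240GB, while 4-bit weights come to about 60GB, leaving room in a 128GB machine for the rest of the workload.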

To put the 1-petaflop figure in context, a petaflop equals one quadrillion floating-point operations per second. A decade ago, that level of performance required room-sized supercomputers.

Why Local AI Matters Here

The most significant feature of RTX Spark is where the computation runs: on the device itself. By keeping large models and long conversations on the machine, RTX Spark PCs respond faster and avoid sending sensitive data to outside servers unless the user chooses otherwise. For anyone who has had privacy concerns about cloud-based AI assistants, this may be a material change.

To manage what agents can actually do, Nvidia and Microsoft co-developed the OpenShell Runtime, which defines agent permissions, isolates them from sensitive data, and routes requests between local and cloud models based on the user’s privacy settings. A new set of security guardrails ensures that local agents only access the tools and data the user explicitly grants them. 
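Nvidia has not published the OpenShell Runtime’s interface in this announcement, so the following is only a conceptual sketch of what privacy-based routing and explicit tool grants could look like. Every name here (`PrivacySettings`, `AgentRequest`, `route`, `tool_allowed`) is hypothetical and is not part of any Nvidia or Microsoft API.

```python
from dataclasses import dataclass
from enum import Enum
from typing import FrozenSet, Optional


class Target(Enum):
    LOCAL = "local"
    CLOUD = "cloud"


@dataclass(frozen=True)
class PrivacySettings:
    # Hypothetical user preferences; the default keeps everything on the device.
    allow_cloud: bool = False
    allow_sensitive_to_cloud: bool = False


@dataclass(frozen=True)
class AgentRequest:
    prompt: str
    touches_sensitive_data: bool
    tool: Optional[str] = None  # tool the agent wants to call, if any


def route(request: AgentRequest, settings: PrivacySettings) -> Target:
    """Choose where a request runs, defaulting to the local model."""
    if not settings.allow_cloud:
        return Target.LOCAL
    if request.touches_sensitive_data and not settings.allow_sensitive_to_cloud:
        return Target.LOCAL
    return Target.CLOUD


def tool_allowed(request: AgentRequest, granted_tools: FrozenSet[str]) -> bool:
    """An agent may use only the tools the user has explicitly granted."""
    return request.tool is None or request.tool in granted_tools


settings = PrivacySettings(allow_cloud=True)
private = AgentRequest("Summarize my tax return", touches_sensitive_data=True, tool="files")
public = AgentRequest("Explain FP4 precision", touches_sensitive_data=False)

print(route(private, settings).value)
print(route(public, settings).value)
print(tool_allowed(private, frozenset({"calendar"})))
```

The design choice worth noting is the default: with no settings changed, every request stays local, and a request that touches sensitive data goes to the cloud only after a second, separate opt-in.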

“For forty years, you launched apps. Click. Type. With RTX Spark and Microsoft Windows, you ask, and the PC does the work,” CEO Jensen Huang said.

Where RTX Spark Stands in a Crowded Market

RTX Spark’s primary differentiator is full compatibility with the CUDA ecosystem, the industry standard for AI development. On raw AI compute, the chip delivers roughly 1,000 TOPS, compared with 38 TOPS for Apple’s Neural Engine, although Apple’s M-series chips currently hold advantages in memory bandwidth and single-core CPU performance.

Nvidia has also gone further than announcing a single chip. The company shared a multi-generation roadmap that includes a second platform, Vera Rubin Spark, which will pair a new Vera CPU with a Rubin GPU and LPDDR6 memory. A third generation, Rosa Feynman Spark, follows with stacked Feynman GPUs.

This roadmap tells OEM partners and software developers that Nvidia’s commitment to the consumer PC market is not a one-cycle experiment. Whether the software ecosystem grows fast enough to match the hardware ambition remains to be seen.


I’m Precious Amusat, Phronews’ Content Writer. I conduct in-depth research and write on the latest developments in the tech industry, including trends in big tech, startups, cybersecurity, artificial intelligence and their global impacts. When I’m off the clock, you’ll find me cheering on women’s footy, curled up with a romance novel, or binge-watching crime thrillers.
