The always-on agent needs its own computer
In a single week, the AI industry acknowledged that agents are a fundamentally different computing paradigm. Perplexity shipped a $200/month always-on agent running on dedicated Mac mini hardware. Microsoft built Copilot Cowork as a persistent cloud agent with access to your entire enterprise graph. And Nvidia is redesigning its CPUs specifically for agentic workloads, with Jensen Huang calling them 'the bottleneck.' The chat-window era is ending — agents need persistent infrastructure at every layer of the stack, and builders who treat them as fancy API wrappers are building for yesterday.
CNBC
Nvidia previews agentic CPU pivot ahead of GTC 2026, teases 'world-surprising' chip
Nvidia is positioning its Vera CPUs as critical infrastructure for agentic AI workloads, with Jensen Huang calling CPUs 'the bottleneck' and promising a 'world-surprising' chip reveal at Monday's GTC keynote.
cnbc.com
For twenty years, computing moved in one direction: off the device and into the cloud. This week, AI agents started pulling it back.
The Decoder reported that Perplexity launched what it calls Personal Computer. The name is the point. For $200/month, you get a Mac mini running an always-on AI agent with persistent access to your local files and applications. Not an API you call. Not a chat window you open. A machine that runs continuously, working through your tasks whether you're watching or not. Over 100 enterprise customers requested access within a weekend of the consumer launch.
The same week, Fortune reported that Microsoft launched Copilot Cowork, a persistent cloud agent built on Anthropic's Claude that lives inside your M365 infrastructure. It accesses Outlook, Teams, Calendar, SharePoint, and Excel across your organisation. It runs multi-step tasks over time, not prompt-and-response exchanges. Priced at $30/user/month, bundled into a new E7 tier at $99.
And then there's the silicon layer. CNBC reported that Nvidia is positioning its Vera CPUs specifically for agentic inference workloads, with Jensen Huang calling CPUs 'the bottleneck.' A CPU-only rack on the GTC show floor this Monday signals something worth paying attention to: agents need different compute than training. Not more GPUs. Different architecture entirely.
The pattern builders should notice
Three companies at three different layers of the stack all arrived at the same conclusion in the same week: agents need their own persistent infrastructure.
This is a category break from the chat-window model. A chatbot needs a request, a response, and a connection that lasts a few seconds. An always-on agent needs auth that persists for days, file access that survives restarts, and compute that runs whether anyone is asking it questions or not. That's not an API wrapper. That's a computing environment.
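The core of that distinction, in a minimal sketch: a chat endpoint holds state only for the duration of a request, while an agent's task state has to outlive any single process. All names below (`AgentState`, the JSON-on-disk scheme) are illustrative assumptions, not any vendor's design.

```python
import json
import os
import tempfile

class AgentState:
    """Durable task queue, persisted to disk after every step."""

    def __init__(self, path):
        self.path = path
        self.data = {"tasks": [], "completed": []}
        if os.path.exists(path):
            # A restarted process reloads exactly where the last one stopped.
            with open(path) as f:
                self.data = json.load(f)

    def enqueue(self, task):
        self.data["tasks"].append(task)
        self._flush()

    def run_one_step(self):
        """Process one task; a crash between steps loses nothing."""
        if not self.data["tasks"]:
            return None
        task = self.data["tasks"].pop(0)
        self.data["completed"].append(task)
        self._flush()
        return task

    def _flush(self):
        with open(self.path, "w") as f:
            json.dump(self.data, f)

# Simulate a restart: a second process picks up where the first left off.
path = os.path.join(tempfile.mkdtemp(), "state.json")
a = AgentState(path)
a.enqueue("summarise inbox")
a.enqueue("file expense report")
a.run_one_step()          # first process does one step, then "crashes"

b = AgentState(path)      # restart: fresh process, same state file
remaining = b.run_one_step()
```

A real agent would swap the JSON file for a database and add locking, but the property it demonstrates is the one that matters: the work survives the process.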
The way I see it, the product engineering implications are immediate. If you're building AI features as stateless API calls behind a chat interface, you're designing for a paradigm these companies have already moved past. Microsoft is pricing persistent agents at enterprise scale. Perplexity is shipping dedicated hardware. Nvidia is redesigning silicon. These aren't experiments. These are production bets with real revenue attached.
The question that matters: what does your agent need that a chat endpoint doesn't provide? Persistent memory. Long-running task management. Durable auth tokens. File system access. Background execution. Billing models that account for compute time, not tokens consumed. The infrastructure layer for AI is diverging from the web application stack, and the tooling hasn't caught up.
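The billing point in particular inverts the usual metering logic. A hedged sketch of what compute-time billing looks like, with the `ComputeMeter` class and the hourly rate both hypothetical, not any provider's actual scheme:

```python
class ComputeMeter:
    """Accumulates billable seconds across background task runs."""

    def __init__(self, rate_per_hour):
        self.rate_per_hour = rate_per_hour
        self.billable_seconds = 0.0

    def record(self, seconds):
        # Unlike token billing, this accrues whether or not a user
        # is in the loop asking questions.
        self.billable_seconds += seconds

    def invoice(self):
        hours = self.billable_seconds / 3600
        return round(hours * self.rate_per_hour, 2)

# An always-on agent accrues time even with nobody watching:
meter = ComputeMeter(rate_per_hour=0.50)  # hypothetical $/hour
meter.record(2 * 3600)    # 2h of background research overnight
meter.record(30 * 60)     # 30 min of file reorganisation
bill = meter.invoice()    # 2.5h at $0.50/h
```

The unit of account shifts from "tokens consumed per response" to "hours the machine worked for you", which is exactly the framing a dedicated Mac mini or a persistent cloud agent implies.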
Monday's GTC keynote will tell us whether Nvidia's 'world-surprising' chip accelerates this split further. But the direction is already clear. The next generation of AI products won't live in a browser tab. They'll need their own computer.
Read the original on CNBC