The CPU Inference Opportunity
Why Intel and AMD could benefit from the shift to on-device AI; a look at the technical catalysts, the math on market opportunity, and the risks
Something interesting is happening in the AI chip market: CPUs are starting to beat GPUs for certain AI workloads. A May 2025 research paper showed that for small language models under 3 billion parameters, multi-threaded CPU execution achieves 1.31x speedups over GPU execution.

This matters because the AI market is bifurcating. While NVIDIA dominates large-scale training and inference for massive models, there's a growing segment of AI that runs locally on devices. Small Language Models are getting remarkably good: Microsoft's Phi-4, a 14 billion parameter model, now rivals DeepSeek-R1 (671 billion parameters) on reasoning benchmarks. That's a 48x size difference.

The thesis here is simple: as AI models get smaller and more efficient, and as CPUs gain dedicated AI acceleration (NPUs), Intel and AMD could capture a meaningful share of the AI inference market that doesn't require expensive discrete GPUs.
The Math
The Edge AI market is projected to grow from $24.9 billion in 2025 to $118.7 billion by 2033, a 21.7% CAGR over eight years. Here's the breakdown of the opportunity:

Within that broader market, the AI PC subset is projected to reach approximately $50 billion by 2027. Given that Intel and AMD hold a combined PC market share of roughly 85%, this translates to a potential AI PC revenue opportunity of $40-45 billion.
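The arithmetic behind those figures is easy to verify. A minimal sketch using only the numbers cited above (the implied CAGR works out to roughly 21.6%, consistent with the cited ~21.7%):

```python
# Sanity-check the projected Edge AI market CAGR from the figures above:
# $24.9B (2025) growing to $118.7B (2033), i.e. over 8 years.
start, end, years = 24.9, 118.7, 8

cagr = (end / start) ** (1 / years) - 1
print(f"Implied CAGR: {cagr:.1%}")  # ~21.6%

# The AI PC slice: ~$50B by 2027 at ~85% combined Intel+AMD share.
ai_pc_market = 50.0
x86_share = 0.85
print(f"x86 AI PC opportunity: ${ai_pc_market * x86_share:.1f}B")  # $42.5B
```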
Of course, this assumes Intel and AMD can execute. NVIDIA's CUDA ecosystem is a formidable moat with 4+ million developers. But the opportunity is real, especially in the segment where power efficiency and on-device processing matter most.
Why This Could Work
"Families don't think quarter-to-quarter – they think about next year and 5 or 10 years after that." That quote from First Citizens' Vice Chairwoman applies equally well to the AI chip market. The shift to on-device AI isn't happening overnight, but the trajectory is clear.
Three things are converging:
1. Small Language Models are reaching production quality. Phi-4 matches models 5x its size. A 4-bit quantized 7B model now runs at roughly 15 tokens/second on consumer hardware while retaining about 95% of full-precision quality. Two years ago, this wasn't possible.
2. NPUs are becoming standard. Microsoft's Copilot+ PC requires 40+ TOPS from the NPU. Intel's new Panther Lake delivers 50 TOPS. AMD's Ryzen AI 9 HX 475 hits 60 TOPS. This is no longer optional hardware.
3. Power efficiency drives adoption. NPUs operate at 35-70% lower power than GPUs for the same workloads. For laptops targeting all-day battery life, this matters enormously.
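The quantization claim in point 1 comes down to simple arithmetic. A minimal sketch of the weight-memory footprint (standard weight-only byte counts; ignores activation and KV-cache memory, and the few percent of overhead from per-group quantization scales):

```python
# Rough memory footprint of a 7B-parameter model at different weight precisions.
# 4-bit quantization stores each weight in half a byte, which is what makes
# consumer-hardware inference feasible.
params = 7e9

for name, bytes_per_weight in [("FP16", 2.0), ("INT8", 1.0), ("INT4", 0.5)]:
    gb = params * bytes_per_weight / 1e9
    print(f"{name}: ~{gb:.1f} GB of weights")
```

At 4 bits, a 7B model needs only ~3.5 GB, small enough to fit comfortably in a laptop's system memory, which is why this class of model is the natural target for NPU-equipped CPUs.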
Apple Shows What's Possible
Apple Silicon is the benchmark for what Intel and AMD need to achieve. The M4's 38 TOPS Neural Engine delivers superior real-world performance despite lower theoretical specs than competitors.
Higher TOPS doesn't equal better performance. Apple's tight hardware-software integration delivers roughly 2x the real-world performance at lower theoretical specs. This is the gap Intel and AMD must close. The key lesson from Apple: unified memory architecture matters more than raw compute. Apple's M5 delivers 153GB/s of memory bandwidth; Intel's Lunar Lake manages only 68GB/s. For LLM inference, memory bandwidth, not compute TOPS, is the true bottleneck: generating each token requires streaming essentially all of the model's weights through memory.
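That bottleneck can be made concrete with a back-of-the-envelope bound: for a bandwidth-limited decoder, single-stream tokens/second is capped at bandwidth divided by the bytes of weights read per token. A sketch using the bandwidth figures above and an assumed 4-bit 7B model (~3.5 GB of weights; real throughput lands below these ceilings):

```python
# Theoretical decode-speed ceiling when LLM inference is memory-bandwidth
# bound: each generated token streams (roughly) all model weights once.
weights_gb = 7e9 * 0.5 / 1e9  # 7B params at 4 bits/weight ≈ 3.5 GB

for chip, bandwidth_gbs in [("Lunar Lake", 68), ("Apple M5", 153)]:
    max_tps = bandwidth_gbs / weights_gb
    print(f"{chip}: <= {max_tps:.0f} tokens/s ceiling")
```

The ratio of the two ceilings (~19 vs. ~44 tokens/s) tracks the bandwidth ratio directly, which is why closing the memory-bandwidth gap matters more than adding TOPS.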
Intel: The Turnaround Play
Intel just launched Panther Lake at CES 2026, built on their new 18A process. The specs are competitive:

• 50 TOPS NPU (meets Copilot+ requirements)
• 180 total platform TOPS across CPU, GPU, and NPU
• 200+ partner designs already committed
• 60% better multithread performance vs. predecessors

In the datacenter, Intel's AMX (Advanced Matrix Extensions) delivers meaningful AI acceleration: a 4.5x improvement over the previous generation for FP16 workloads, with LLM inference jumping from 28 to 57 tokens/second with AMX enabled.

The challenge: Intel is discontinuing its Gaudi AI accelerator line and pivoting to "Jaguar Shores" rack-scale systems in 2026. This transition creates uncertainty, but it also signals a strategic shift toward integrated solutions, where Intel has historically been stronger.
Financially: Q3 2025 revenue of $13.7B beat expectations. Analysts project 2026 EPS of ~$0.59, a significant recovery from 2024's losses. The forward P/E of 68x suggests the market expects meaningful earnings improvement.
AMD: The Momentum Play
AMD has stronger momentum right now. Their Ryzen AI Max+ series offers something no other Windows processor can match: 128GB unified memory capable of running 235B parameter LLMs locally.
The XDNA 2 architecture in their latest chips delivers 5x compute capacity and 2x power efficiency over the previous generation. Their NPU hits 60 TOPS on the Ryzen AI 9 HX 475, leading among x86 processors.
But the real story is datacenter momentum. AMD reports that 7 of the 10 largest AI model builders now run production workloads on Instinct GPUs. The customer list includes OpenAI (6GW deployment planned), Oracle (50,000 MI450 GPUs), and Meta (MI300X for Llama production).
ROCm 7.0 delivers 3.5x inference improvement over ROCm 6, with downloads up 10x year-over-year. The software ecosystem gap with NVIDIA is narrowing, though CUDA's moat remains formidable.
Financially: Q3 2025 revenue hit $9.25B (35.6% YoY growth), with datacenter at $4.34B. Analysts project 36% earnings growth in 2026. AMD's market cap is now 2.5x Intel's.
3 Engines for Returns
For both Intel and AMD, I see three potential engines of return, similar to what's worked for compounders in other industries:
Engine 1: Market Growth. The Edge AI market at 21.7% CAGR provides the rising tide. AI PCs are becoming the standard, not the exception. Even capturing a modest share of this growth drives revenue expansion.
Engine 2: Margin Expansion. As Intel executes on 18A and AMD continues scaling XDNA, manufacturing efficiencies should improve. Intel is targeting a non-GAAP OpEx reduction to $16B by 2026. AMD's datacenter mix shift improves overall margins.
Engine 3: Multiple Expansion. Intel trades at a significant discount to its historical multiple. AMD trades below NVIDIA despite similar growth rates. If execution continues, multiples could expand as the market recognizes the AI opportunity.
None of these engines are extraordinary on their own. But together, they compound. A company with 15% revenue growth, 3% margin expansion, and 5% P/E expansion delivers 20%+ annual returns. This is the math that has worked for JPM over the past few years despite being the most well-followed bank in the country.
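The compounding is multiplicative, not additive, which is why modest engines add up to 20%+. A minimal sketch of the illustrative numbers above:

```python
# The three return engines compound multiplicatively:
# total ≈ (1 + growth) * (1 + margin expansion) * (1 + multiple expansion) - 1
revenue_growth = 0.15
margin_expansion = 0.03
multiple_expansion = 0.05

total = (1 + revenue_growth) * (1 + margin_expansion) * (1 + multiple_expansion) - 1
print(f"Combined annual return: {total:.1%}")  # ~24.4%, comfortably above 20%
```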
The Risks
The thesis has real risks:
CUDA's moat. NVIDIA has 4+ million developers and 40,000+ companies in its ecosystem. Switching costs are real. AMD and Intel solutions run 2-3x slower on training workloads due to software, not hardware.
Memory bandwidth gap. Intel's 68GB/s vs. Apple's 153GB/s is a significant disadvantage. Until Intel/AMD solve this (potentially with on-package memory), Apple Silicon will deliver better real-world performance.
Qualcomm competition. The Snapdragon X2 Elite hits 80-85 TOPS, challenging x86 in laptops. Qualcomm is capturing ~25% of the premium laptop segment.
Execution risk. Intel's 18A process needs to work. AMD's ROCm needs to keep improving. Neither is guaranteed.
Bottom Line
The CPU inference opportunity is real, but it's an evolution, not a revolution.
AMD has stronger current momentum with unified memory leadership (128GB), major datacenter wins (OpenAI, Oracle, Meta), and improving software ecosystem (ROCm 10x downloads YoY). The stock reflects some of this optimism.
Intel is the turnaround play with more upside if execution succeeds. Panther Lake's 200+ partner designs show the AI PC opportunity is real. The 18A process could be transformative. But execution risk is higher.
The key question isn't whether CPU inference will matter. It will. The question is how quickly Intel and AMD can close the software ecosystem gap with NVIDIA and deliver the unified memory architectures that Apple has proven essential for efficient local AI.
Watch for: SLM improvements (Phi-5, Gemma 4), memory bandwidth advances in next-gen chips, ROCm/OpenVINO adoption metrics, and enterprise deployment of local SLMs for privacy-sensitive applications. These will tell us whether the thesis is playing out.
Disclaimer: This is not investment advice. Do your own research. Cyclical industries can be volatile.


