The Debate: Small, Large or World Models?
By Strategic Machines
Seeking a Path
You don’t want to miss this emerging debate on language models. And just like our last post on the elusive value of GenAI for the enterprise, a little history is needed. TL;DR: This debate is all about value.
Amid the hype around ever-larger models chasing artificial general intelligence (AGI), a quieter shift is happening: smaller models are proving more practical for everyday work. The paradox is clear. Frontier models grab headlines for acing exams and solving complex problems, yet in corporate settings small language models (SLMs) handle much of the heavy lifting. They're faster, cheaper, and more specialized, powering AI agents in assembly-line workflows.
For instance, companies like Gong use SLMs to analyze sales calls, summarizing data before handing off to larger models for final insights. Alice, another SLM-powered product, helps identify your ideal buyers and books meetings for you. Meta even distills knowledge from big models into smaller ones for ad targeting, citing cost efficiency. Startups like Aurelian automate 911 responses and Hark Audio clips podcasts, all with SLMs at the core. RunConcierge, from Tata Consultancy Services, relies on a curated SLM to deliver accurate information to 55,000 runners in the NYC Marathon, avoiding hallucinations that could cause chaos. Christopher Mims covers this small-model shift as well in this article. Resources like GitHub cookbooks make fine-tuning SLMs accessible, democratizing custom AI.
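To make that last point concrete, here is a minimal sketch of what a LoRA-style fine-tune of a small open model can look like with the Hugging Face transformers, peft, and datasets libraries. The base model, the domain_corpus.jsonl file, and the hyperparameters are illustrative placeholders, not drawn from any particular cookbook or from the companies above.

```python
# Minimal sketch: LoRA fine-tuning a small open model on a domain corpus.
# Model name, data file, and hyperparameters are placeholders.
from datasets import load_dataset
from peft import LoraConfig, get_peft_model
from transformers import (AutoModelForCausalLM, AutoTokenizer,
                          DataCollatorForLanguageModeling, Trainer,
                          TrainingArguments)

base = "Qwen/Qwen2.5-0.5B"  # placeholder: any small causal LM
tok = AutoTokenizer.from_pretrained(base)
tok.pad_token = tok.pad_token or tok.eos_token
model = AutoModelForCausalLM.from_pretrained(base)

# Attach low-rank adapters so only a small fraction of weights is trained.
model = get_peft_model(
    model,
    LoraConfig(r=8, lora_alpha=16, task_type="CAUSAL_LM",
               target_modules=["q_proj", "v_proj"]),
)

# Hypothetical domain corpus: one JSON line per example with a "text" field,
# e.g. order-handling transcripts or product FAQs.
ds = load_dataset("json", data_files="domain_corpus.jsonl")["train"]
ds = ds.map(lambda ex: tok(ex["text"], truncation=True, max_length=512),
            remove_columns=ds.column_names)

trainer = Trainer(
    model=model,
    args=TrainingArguments(output_dir="slm-finetune",
                           per_device_train_batch_size=4,
                           num_train_epochs=1, learning_rate=2e-4),
    train_dataset=ds,
    data_collator=DataCollatorForLanguageModeling(tok, mlm=False),
)
trainer.train()
model.save_pretrained("slm-finetune/adapter")  # saves adapter weights only
```

The exact recipe matters less than the scale: a sub-billion-parameter model, a modest domain corpus, and a single GPU are often enough for a bounded task.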
But the debate goes deeper. Yann LeCun, Meta's chief AI scientist, Turing Award winner, and a pioneer of convolutional neural networks, calls LLMs a "dead end" for human-level intelligence. He compares them to a cat's mind and favors the cat for its grounded, real-world understanding. LeCun argues that world models, not LLMs, will dominate. These architectures maintain abstract representations of the world's state, predict the outcomes of actions, and enable hierarchical planning with built-in safety. He illustrates the limits of LLMs with a simple example: humans can mentally rotate a floating cube, but LLMs can't model physics intuitively. LeCun has prototyped world models at Meta and may launch a startup, frustrated by the field's fixation on LLMs.
Andrej Karpathy echoes parts of this view, noting that LLMs excel in areas that are "hard to specify but possible to verify," like creativity, reasoning, math, and code. This positions them to displace traditional software only in untapped markets, where verification ensures reliability. Meanwhile, LLMs are still searching for their 'killer use case' inside companies, as both McKinsey and MIT have noted.
So here we are. Models are emerging and evolving along different paths, and ultimately each is seeking a path that delivers real value. We've been watching this evolution (and debate) from the beginning, and we believe the applications can be parsed along three distinct dimensions:
Small models shine in targeted tasks with bounded contexts: product orders, inventory management and logistics.
Large models suit broad reasoning with next-best actions that are "hard to specify but possible to verify". You can try a live demo here, with an AI reservation agent helping to book a room (a rough sketch of this small-versus-large split follows this list).
And world models could very well unlock planning in robotics or simulations, blending sensory data beyond text.
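As a rough illustration of the first two dimensions, the sketch below routes bounded, structured requests to a small model and escalates open-ended reasoning to a larger one. The Router class, the intent heuristic, and the stand-in callables are all hypothetical; in a real deployment the small and large models would sit behind whatever serving stack you already use.

```python
# Minimal sketch of a small-first routing pattern: bounded, structured tasks
# go to a small model; open-ended reasoning escalates to a large one.
# Both "models" are stand-ins (plain callables), since the point is the split
# of responsibilities, not any particular vendor API.
from dataclasses import dataclass
from typing import Callable

BOUNDED_INTENTS = {"order_status", "inventory_lookup", "shipment_eta"}

@dataclass
class Router:
    small_model: Callable[[str], str]   # cheap, fast, fine-tuned SLM
    large_model: Callable[[str], str]   # broad-reasoning LLM, used sparingly

    def classify(self, request: str) -> str:
        """Toy intent heuristic; in practice this could itself be an SLM."""
        text = request.lower()
        if "order" in text:
            return "order_status"
        if "inventory" in text:
            return "inventory_lookup"
        if "shipment" in text:
            return "shipment_eta"
        return "open_ended"

    def handle(self, request: str) -> str:
        intent = self.classify(request)
        if intent in BOUNDED_INTENTS:
            return self.small_model(request)   # bounded context, low cost
        return self.large_model(request)       # hard to specify, verify output

# Usage with stand-in models:
router = Router(small_model=lambda q: f"[SLM] {q}",
                large_model=lambda q: f"[LLM] {q}")
print(router.handle("Where is order 1142?"))
print(router.handle("Draft a plan to renegotiate our logistics contracts."))
```

The design choice worth noting is that the cheap path is the default and the expensive model is the exception, which is how the cost math tends to favor SLMs in production.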
We're not so bold as to declare that LLMs are dead or that world models will arrive anytime soon, but we do urge clients to sort through the opportunities in front of them right now and pin down where each class of model delivers the greatest value. Plan carefully and embrace rewiring operations with these models: small for efficiency, large for complex reasoning, world models for the horizon. The disruption may unsettle some roles, but it's the path to human-scale AI.
Give us a call. We'd welcome the chance to share our insights on navigating this shift.
And for our friends, colleagues and clients in the United States, Happy Thanksgiving!