
When it comes to AI, bigger isn’t always better

News · September 12, 2025 · Artifice Prime

Enterprise AI tends to default to large language models (LLMs), overlooking small language models (SLMs). But bigger isn’t always better. Often, a smaller, more specialized model can do the work faster and more efficiently.

What complicates things is that neither an LLM nor an SLM alone may give you everything you need, especially in complex enterprise environments. In both cases, structure is essential. That’s where knowledge graphs come in. Knowledge graphs add the context and connections that make these models truly useful.

The value of SLM thinking in enterprise AI

Let’s start with SLMs versus LLMs. Developers were already warming to small language models, but most of the discussion has focused on technical or security advantages. In reality, for many enterprise use cases, smaller, domain-specific models often deliver faster, more relevant results than general-purpose LLMs.

Why? Because most business problems are narrow by nature. You don’t need a model that has read T.S. Eliot or that can plan your next holiday. You need a model that understands your lead times, logistics constraints, and supplier risk. That’s what makes the output meaningful—not intelligence in general, but intelligence grounded in your context.

Reasoning models, by the way, already work this way: quietly and efficiently. Even cutting-edge systems like DeepSeek use a “mixture of experts” approach, calling on specialized internal components (like a math engine) to solve targeted problems, rather than activating the entire neural network every time.

This modular strategy mirrors how enterprises actually operate. Instead of relying on one monolithic model, you deploy multiple small language models, each focused on a specific domain, such as finance, ops, or customer service. Their outputs are then synthesized by a generalist coordinator model, possibly routed through an AI agent that knows which “expert” to call on when. The result is a flexible, efficient architecture that aligns with real-world organizational structures.
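The coordinator-plus-experts pattern described above can be sketched in a few lines. This is a hypothetical illustration: the domain keywords, model stubs, and routing function are invented stand-ins for real SLM endpoints and a proper intent classifier.

```python
# Hypothetical sketch: a coordinator routing queries to
# domain-specific SLMs. The keyword-based intent detection is a
# crude stand-in for an embedding classifier or agent framework.

def classify_intent(query: str) -> str:
    """Map a query to a business domain via illustrative keywords."""
    keywords = {
        "finance": ["invoice", "budget", "revenue"],
        "ops": ["shipment", "supplier", "lead time"],
        "support": ["refund", "complaint", "ticket"],
    }
    q = query.lower()
    for domain, words in keywords.items():
        if any(w in q for w in words):
            return domain
    return "general"

def route(query: str, experts: dict, generalist) -> str:
    """Send the query to the matching expert, else the generalist."""
    domain = classify_intent(query)
    model = experts.get(domain, generalist)
    return model(query)

# Stub "models" standing in for deployed SLM endpoints.
experts = {
    "finance": lambda q: f"[finance-slm] {q}",
    "ops": lambda q: f"[ops-slm] {q}",
}
generalist = lambda q: f"[general-llm] {q}"
```

A supply-chain question would land on the ops expert, while anything unrecognized falls through to the generalist—mirroring the delegation structure of a real organization.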

Which, of course, is how humans solve problems too. A physicist might struggle with a tax question, while you or I could give a passable, but vague, answer. Combine the two and you get both precision and coverage. AI works the same way. It performs best when there are clear boundaries of expertise and smart systems for delegation.

Just like in e-commerce or IT architecture, organizations are increasingly finding success with best-of-breed strategies, using the right tool for the right job and connecting them through orchestrated workflows. I contend that AI follows a similar path, moving from proof-of-concept to practical value by embracing this modular, integrated approach.

Plus, SLMs aren’t just cheaper than larger models; they can also outperform them. Take Microsoft’s Phi-2, a compact model trained on high-quality math and code data. Phi-2 outperforms much larger models, sometimes dramatically so, but only within its specialized domain. Its strength comes not from size, but from the focus and precision of its training data.

The key challenge with massive models trained on diverse data sets is that adding new data can degrade previously accurate outputs, as shifting weights alter earlier responses. SLMs avoid this issue by design, maintaining their narrow, focused expertise.

Making models work together optimally

But specialization brings its own challenge: orchestration. Managing multiple small models, and perhaps one or two LLMs, requires precise intent recognition and smart routing. When a user asks a question, the system must correctly interpret it and send it to the right model to deliver a reliable answer.

Because even the most advanced LLMs lack true meta-awareness, this routing logic is often hard-coded by data scientists, making full automation of task delegation tricky and adding to the cost of the solution. In response, many enterprises are adopting a hybrid approach. They start with a general-purpose LLM, identify where it falls short, and then deploy SLMs to fill those gaps.
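That hybrid pattern can be made concrete with a small sketch. Everything here—the gap registry, the domain detector, and the model stubs—is an illustrative assumption, not a real API; the point is the shape of the control flow: generalist by default, specialist only where the LLM is known to underperform.

```python
# Hypothetical sketch of the hybrid pattern: default to a
# general-purpose LLM, but override it with a specialist SLM for
# domains flagged as weak spots during evaluation.

KNOWN_GAPS = {"tax", "logistics"}  # domains where the LLM falls short

def detect_domain(query: str) -> str:
    """Illustrative keyword-based domain detection."""
    q = query.lower()
    if "vat" in q or "tax" in q:
        return "tax"
    if "shipment" in q or "route" in q:
        return "logistics"
    return "general"

def hybrid_answer(query: str, general_llm, specialists: dict) -> str:
    """Use a specialist only for known gaps that have one deployed."""
    domain = detect_domain(query)
    if domain in KNOWN_GAPS and domain in specialists:
        return specialists[domain](query)
    return general_llm(query)
```

Note the second condition: a gap without a deployed specialist still falls back to the generalist, which is exactly how the incremental rollout described above proceeds in practice.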

A broader issue is the dominance of generative AI in public discourse, which has somewhat overshadowed decades of valuable non-generative tools. As teams improve at tackling real enterprise-scale data problems, we’re likely to see a shift toward a more balanced, pragmatic toolbox—one that blends statistical models, optimization techniques, structured data, and specialized LLMs or SLMs, depending on the task.

In many ways, we’ve been here before. It all echoes the “feature engineering” era of machine learning when success didn’t come from a single breakthrough, but from carefully crafting workflows, tuning components, and picking the right technique for each challenge. It wasn’t glamorous, but it worked. And that’s where I believe we’re heading again: toward a more mature, layered approach to AI. Ideally, one with less hype, more integration, and a renewed focus on combining what works to solve real business problems, and without getting too caught up in the trend lines.

The need for other tools

After all, success doesn’t come from a single model. Just as you wouldn’t run a bank on a database alone, you can’t build enterprise AI on raw intelligence in isolation. You need an orchestration layer: search, retrieval, validation, routing, reasoning, and more.

And I believe graph technology is key to making any version of AI actually work. There’s growing momentum around pairing structured graph data with AI systems, where graphs act like domain-specific “textbooks,” boosting accuracy and dramatically reducing hallucinations.

Crucially, graphs provide a structure that allows non-technical users to query complex data in intuitive ways, without needing to understand graph theory. LLMs often struggle with long context windows, and simply injecting more data rarely solves the problem. But graphs excel at grouping related information and surfacing insights across multiple levels of abstraction. Graphs enable better answers to high-impact business questions, like “What are the key themes in my business?” or “Where are my biggest operational challenges?”
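A toy example shows how that rolling-up across abstraction levels works. The triples and theme names below are invented for illustration; a production system would use a graph database, but the aggregation idea is the same.

```python
# Minimal sketch: a knowledge graph as (subject, relation, object)
# triples, with a query that rolls incidents up to themes under a
# top-level node -- answering "what are my key themes?" without any
# graph-theory knowledge on the user's part.

from collections import defaultdict

triples = [
    ("late_delivery", "instance_of", "supply_chain_risk"),
    ("supplier_x_delay", "instance_of", "supply_chain_risk"),
    ("invoice_dispute", "instance_of", "finance_issue"),
    ("supply_chain_risk", "subtheme_of", "operational_challenges"),
    ("finance_issue", "subtheme_of", "operational_challenges"),
]

def key_themes(triples, top_level="operational_challenges"):
    """Count how many incidents sit under each theme below top_level."""
    children = defaultdict(list)
    for s, _r, o in triples:
        children[o].append(s)
    return {theme: len(children[theme]) for theme in children[top_level]}

print(key_themes(triples))
# {'supply_chain_risk': 2, 'finance_issue': 1}
```

Two hops of structure are enough to turn a pile of incident records into a ranked answer to a high-level business question.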

Techniques like retrieval-augmented generation (RAG), intelligent search, and graph-based logic are what make AI outputs usable, trustworthy, and truly aligned to task. A knowledge graph that draws on the latest advances, such as vector search, dynamic algorithms, and especially graph-based RAG (or GraphRAG), can feed context with unprecedented precision.
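The GraphRAG idea can be sketched as a three-step pipeline: find seed entities for the query, expand their graph neighborhood, and pack the resulting facts into the prompt. This is a rough, assumed-shape illustration—the graph, the naive string-match retrieval (standing in for vector search), and the prompt format are all invented.

```python
# A rough GraphRAG-style sketch: seed retrieval, neighborhood
# expansion, and prompt assembly. String matching here stands in
# for the vector search a real system would use.

GRAPH = {
    "supplier_x": [("supplies", "widget_a"), ("located_in", "rotterdam")],
    "widget_a": [("lead_time_days", "14")],
    "rotterdam": [("port_status", "congested")],
}

def retrieve_seeds(query: str) -> list:
    """Naive entity linking: match node names appearing in the query."""
    return [n for n in GRAPH if n.replace("_", " ") in query.lower()]

def expand(seeds: list, depth: int = 2) -> list:
    """Breadth-first expansion collecting (subject, rel, obj) facts."""
    facts, frontier = [], list(seeds)
    for _ in range(depth):
        nxt = []
        for node in frontier:
            for rel, obj in GRAPH.get(node, []):
                facts.append((node, rel, obj))
                nxt.append(obj)
        frontier = nxt
    return facts

def build_prompt(query: str) -> str:
    """Feed the expanded subgraph to the model as grounded context."""
    facts = expand(retrieve_seeds(query))
    context = "\n".join(f"{s} --{r}--> {o}" for s, r, o in facts)
    return f"Context:\n{context}\n\nQuestion: {query}"
```

Asking about supplier_x pulls in not just its direct facts but second-hop context (the congested port, the 14-day lead time), which is precisely the grounding that reduces hallucination.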

The strongest case for the future of generative AI? Focused small language models, continuously enriched by a living knowledge graph. Yes, SLMs are still early-stage. The tools are immature, infrastructure is catching up, and they don’t yet offer the plug-and-play simplicity of something like an OpenAI API. But momentum is building, particularly in regulated sectors like law enforcement where vendors with deep domain expertise are already driving meaningful automation with SLMs. As the ecosystem matures, others will follow.

What we’re heading toward is a more integrated AI stack where graphs, SLMs, and classic AI techniques combine into systems that are not just powerful, but purposeful. Just as no one talks about the AI in a calculator, the best AI may soon become an invisible but indispensable part of tools that simply work.

Generative AI Insights provides a venue for technology leaders to explore and discuss the challenges and opportunities of generative artificial intelligence. The selection is wide-ranging, from technology deep dives to case studies to expert opinion, but also subjective, based on our judgment of which topics and treatments will best serve InfoWorld’s technically sophisticated audience. InfoWorld does not accept marketing collateral for publication and reserves the right to edit all contributed content. Contact doug_dineley@foundryco.com.

Original Link:https://www.infoworld.com/article/4041552/when-it-comes-to-ai-bigger-isnt-always-better.html
Originally Posted: Fri, 12 Sep 2025 09:00:00 +0000


Artifice Prime

Artifice Prime is an AI enthusiast with over 25 years of experience as a Linux sys admin. They have an interest in artificial intelligence, its use as a tool to further humankind, and its impact on society.
