While it’s tempting to brush aside seemingly minimal AI model token costs, those are only one line item in the total cost of ownership (TCO) calculation. Still, managing model costs is the right place to start in getting control over the end sum, and choosing the right-sized model for a given task is the imperative first step. It’s also important to remember that when it comes to AI models, bigger is not always better and smaller is not always smarter.

“Small language models (SLMs) and large language models (LLMs) are both AI-based models, but they serve different purposes,” says Atalia Horenshtien, head of the data and AI practice in North America at Customertimes, a digital consultancy firm. “SLMs are compact models, efficient, and tailored for specific tasks and domains. LLMs are massive models that require significant resources, shine in more complex scenarios, and fit general and versatile cases,” Horenshtien adds.

While it makes sense in terms of performance to choose the right-sized model for the job, some would argue that model size isn’t much of a cost argument, even though large models cost more than smaller ones.

“Focusing on the price of using an LLM seems a bit misguided. If it is for internal use within a company, the cost usually is less than 1% of what you pay your employees. OpenAI, for example, charges $60 per month for an Enterprise GPT license for an employee if you sign up for a few hundred. Most white-collar employees are paid more than 100x that, and even more as fully loaded costs,” says Kaj van de Loo, CPTO, CTO, and chief innovation officer at UserTesting.

Instead, this argument goes, the cost should be viewed in a different light. “Do you think using an LLM will make the employee more than 1% more productive? I do, in every case I have come across. It [focusing on the price] is like trying to make a business case for using email or video conferencing.
It is not worth the time,” van de Loo adds.

Size Matters but Maybe Not as You Expect

On the surface, arguing about model sizes seems a bit like splitting hairs. After all, a small language model is still typically large. An SLM is generally defined as having fewer than 10 billion parameters, but that leaves a lot of leeway: sometimes an SLM can have only a few thousand parameters, although most people will define an SLM as having between 1 billion and 10 billion parameters. For reference, medium language models (MLMs) are generally defined as having between 10 billion and 100 billion parameters, while large language models have more than 100 billion parameters. Sometimes MLMs are lumped into the LLM category, too, because what’s a few extra billion parameters, really? Suffice it to say, they’re all big, with some being bigger than others.

In case you’re wondering, parameters are a model’s internal variables or learning control settings. They enable models to learn, but adding more of them adds more complexity, too.

“Borrowing from hardware terminology, an LLM is like a system’s general-purpose CPU, while SLMs often resemble ASICs — application-specific chips optimized for specific tasks,” says Eran Yahav, an associate professor in the computer science department at the Technion – Israel Institute of Technology and a distinguished expert in AI and software development. Yahav has a research background in static program analysis, program synthesis, and program verification from his roles at IBM Research and Technion. Currently, he is CTO and co-founder of Tabnine, an AI coding assistant for software developers.

To reduce the drawbacks and capture the advantages of both large and small models, many companies do not choose one size over the other. “In practice, systems leverage both: SLMs excel in cost, latency, and accuracy for specific tasks, while LLMs ensure versatility and adaptability,” adds Yahav.
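The “leverage both” pattern Yahav describes is often implemented as a simple router: narrow, well-defined tasks go to a cheap small model, and everything else falls through to a large one. Here is a minimal sketch of that idea; the model names, parameter counts, prices, and task labels are all illustrative assumptions, not real product figures.

```python
# Illustrative SLM/LLM router. All names and prices below are hypothetical.
from dataclasses import dataclass


@dataclass
class Model:
    name: str
    params_billions: float
    usd_per_million_tokens: float


# Hypothetical catalog; real sizes and prices vary widely by provider.
SLM = Model("small-8b", 8, 0.20)
LLM = Model("large-400b", 400, 5.00)

# Narrow, well-defined tasks that a fine-tuned small model handles well.
SIMPLE_TASKS = {"classify", "extract", "summarize_short"}


def route(task: str) -> Model:
    """Send narrow tasks to the SLM; default everything else to the LLM."""
    return SLM if task in SIMPLE_TASKS else LLM


def estimated_cost(task: str, tokens: int) -> float:
    """Rough per-request cost for a task given its token volume."""
    model = route(task)
    return tokens / 1_000_000 * model.usd_per_million_tokens
```

Under these assumed prices, a million tokens of classification costs $0.20 through the small model versus $5.00 through the large one, which is the cost-latency trade-off the quote points at.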
As a general rule, the main differences between model sizes come down to performance, use cases, and resource consumption. But creative use of any sized model can easily blur the line between them.

“SLMs are faster and cheaper, making them appealing for specific, well-defined use cases. They can, however, be fine-tuned to outperform LLMs and used to build an agentic workflow, which brings together several different ‘agents’ — each of which is a model — to accomplish a task. Each model has a narrow task, but collectively they can outperform an LLM,” explains Mark Lawyer, RWS’ president of regulated industries and linguistic AI.

There’s a caveat in defining SLMs versus LLMs in terms of task-specific performance, too. “The distinction between large and small models isn’t clearly defined yet,” says Roman Eloshvili, founder and CEO of XData Group, a B2B software development company that exclusively serves banks. “You could say that many SLMs from major players are essentially simplified versions of LLMs, just less powerful due to having fewer parameters. And they are not always designed exclusively for narrow tasks, either.”

The ongoing evolution of generative AI further muddies the issue. “Advancements in generative AI have been so rapid that models classified as SLMs today were considered LLMs just a year ago. Interestingly, many modern LLMs leverage a mixture-of-experts architecture, where smaller specialized language models handle specific tasks or domains. This means that behind the scenes, SLMs often play a critical role in powering the functionality of LLMs,” says Rogers Jeffrey Leo John, co-founder and CTO of DataChat, a no-code generative AI platform for instant analytics.

In for a Penny, in for a Pound

SLMs are the clear favorite when the bottom line is the top consideration. They are also the only choice when a small form factor comes into play.
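The agentic workflow Lawyer describes can be reduced to a very small sketch: several narrow “agents,” each with one job, chained into a pipeline. In practice each agent would be a fine-tuned small model; here they are stand-in functions, and every name and stubbed behavior below is an illustrative assumption.

```python
# Minimal sketch of an agentic pipeline of narrow agents. Each function
# stands in for a small fine-tuned model with a single narrow task.
from typing import Callable


def extract_terms(text: str) -> dict:
    # Agent 1: pull key fields out of raw text (stubbed as word-splitting).
    return {"source": text, "terms": text.split()}


def normalize_terms(doc: dict) -> dict:
    # Agent 2: normalize terminology (stubbed as lowercasing).
    doc["terms"] = [t.lower() for t in doc["terms"]]
    return doc


def compose_output(doc: dict) -> str:
    # Agent 3: assemble the final result from the normalized fields.
    return " ".join(doc["terms"])


# Each agent is narrow; collectively they accomplish the whole task.
PIPELINE: list[Callable] = [extract_terms, normalize_terms, compose_output]


def run_pipeline(text: str) -> str:
    result = text
    for agent in PIPELINE:
        result = agent(result)
    return result
```

The point of the pattern is the shape, not the stubs: because each stage is small and single-purpose, each can be served by a cheap SLM, yet the chain as a whole can take on work that would otherwise be sent to one large model.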
“Since the SLMs are smaller, their inference cycle is faster. They also require less compute, and they’re likely your only option if you need to run the model on an