AI Companies Are Copying Each Other To Make Cheap Models

The cost of developing large language models (LLMs) is falling fast. Across the industry, AI companies are using a technique called distillation to produce compact, efficient models at a fraction of the cost of the originals. The trend has generated excitement in parts of the AI sector, but it raises serious concerns for Nvidia and other hardware providers: worries about future chip demand recently erased roughly $600 billion of Nvidia's market value.

Distillation lets a large “teacher” model train a smaller “student” model, producing a system that retains much of the teacher's capability while requiring far less computing power. Google researchers Geoffrey Hinton, Oriol Vinyals, and Jeff Dean introduced the technique in a 2015 paper that attracted little notice at the time but has since become crucial to AI development.
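To make the mechanics concrete, here is a minimal sketch of a distillation training step in PyTorch. The temperature value and the function names are illustrative assumptions, not any particular lab's recipe; the key idea is that the student learns to match the teacher's softened output distribution rather than hard labels.

```python
# Minimal teacher-student distillation step (sketch, PyTorch).
# Temperature and helper names are illustrative assumptions.
import torch
import torch.nn.functional as F

def distillation_loss(student_logits, teacher_logits, temperature=2.0):
    """Soft-label loss: KL divergence between the student's and the
    teacher's temperature-softened output distributions."""
    soft_targets = F.softmax(teacher_logits / temperature, dim=-1)
    log_probs = F.log_softmax(student_logits / temperature, dim=-1)
    # Scale by T^2 so gradient magnitudes stay consistent across temperatures.
    return F.kl_div(log_probs, soft_targets, reduction="batchmean") * temperature**2

def train_step(student, teacher, batch, optimizer):
    """One update: the frozen teacher labels a batch, the student learns from it."""
    with torch.no_grad():            # the teacher is never updated
        teacher_logits = teacher(batch)
    student_logits = student(batch)
    loss = distillation_loss(student_logits, teacher_logits)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()
```

Because the soft targets also encode how the teacher ranks the wrong answers, the student can approach the teacher's accuracy with far fewer parameters than training on hard labels alone would allow.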

The Chinese AI firm DeepSeek stunned the industry by training models competitive with OpenAI's for a reported $5 million. A UC Berkeley research team then built two models for under $1,000 each, and researchers from Stanford University, the University of Washington, and the Allen Institute for AI produced a reasoning model for even less.

Distillation also lets AI companies build specialized models. Meta's open-source Llama, for example, can be distilled into a version that excels at U.S. tax law or logical reasoning. DeepSeek has used its R1 reasoning model as a teacher for Llama, boosting the performance of the smaller models.

The technique is spreading quickly: Hugging Face already hosts more than 30,000 distilled models. Distillation yields specialized models, but they give up some of the original's general-purpose performance. As open-source models grow more powerful, distillation promises to reshape the AI landscape by making advanced AI both accessible and affordable.
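For readers who want to try one of those models, a few lines of the Hugging Face transformers library will download and run a distilled checkpoint. The model ID below is one of DeepSeek's published R1-to-Llama distillations and is used purely as an example; any distilled causal-LM checkpoint on the Hub follows the same pattern.

```python
# Loading and running a distilled model from the Hugging Face Hub.
# The model ID is an assumed example; substitute any distilled checkpoint.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "deepseek-ai/DeepSeek-R1-Distill-Llama-8B"  # example checkpoint
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id)

# Generate a short completion to confirm the distilled model runs locally.
inputs = tokenizer("Explain model distillation in one sentence.", return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=50)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```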
