Source: Quantum
In the global AI race, leading AI companies such as OpenAI, Microsoft and Meta are turning to a development process called "distillation" to build AI models that are cheaper for consumers and businesses to adopt.
The technique attracted widespread attention after DeepSeek used it to build powerful and efficient AI models based on open-source systems released by competitors Meta and Alibaba. The breakthrough shook confidence in Silicon Valley's leadership in AI and triggered a sharp drop in the stocks of large American technology companies.
With distillation, companies take a large language model, known as the "teacher" model, which generates the next likely word in a sentence. The teacher model generates data that is then used to train a smaller "student" model, quickly transferring the knowledge and predictions of the large model to the smaller one.
While distillation has been widely used for many years, recent advances have convinced industry experts that the technique will increasingly be a boon for startups seeking cheap and effective ways to build applications.
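To make the teacher-student idea concrete, here is a minimal sketch of one common form of knowledge distillation, assuming PyTorch and Hugging Face-style models whose outputs expose a `.logits` field; the function names and training loop are illustrative, not any particular company's pipeline.

```python
# Minimal knowledge-distillation sketch (assumes PyTorch; names are illustrative).
import torch
import torch.nn.functional as F

def distillation_loss(student_logits, teacher_logits, temperature=2.0):
    """Soft-target loss: the student matches the teacher's next-token distribution."""
    soft_targets = F.softmax(teacher_logits / temperature, dim=-1)
    log_probs = F.log_softmax(student_logits / temperature, dim=-1)
    # KL divergence between teacher and student distributions, scaled by T^2
    return F.kl_div(log_probs, soft_targets, reduction="batchmean") * temperature ** 2

def train_step(teacher, student, optimizer, input_ids):
    """One update: the frozen teacher predicts, the student learns from its predictions."""
    with torch.no_grad():
        teacher_logits = teacher(input_ids).logits  # teacher's next-token predictions
    student_logits = student(input_ids).logits
    loss = distillation_loss(student_logits, teacher_logits)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()
```

In practice the soft-target loss above is often mixed with an ordinary next-token cross-entropy loss on the training text, but the core transfer mechanism is the same: the student is trained to reproduce the teacher's output distribution.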
"Distillation is really magical," said Olivier Godement, head of product for OpenAI's platform. "What this process essentially does is take a large, cutting-edge model of intelligence and use it to train a smaller model…that is very powerful at a specific task, and it is cheap and very fast."
Large language models like OpenAI’s GPT-4, Google’s Gemini, and Meta’s Llama require massive amounts of data and computing power to develop and maintain. While the companies don’t disclose how much it costs to train the large models, it’s likely in the hundreds of millions of dollars.
Distillation makes the power of these models available to developers and businesses at a very low price, allowing app developers to quickly run AI models on devices like laptops and smartphones.
Developers can use OpenAI's platform to perform distillation and learn from the large language models that power products like ChatGPT. Microsoft, OpenAI's biggest backer, used GPT-4 to distill its Phi family of small language models as part of its commercial partnership, after investing nearly $14 billion in the company.
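For illustration, a rough sketch of how a developer might use the OpenAI platform's distillation workflow follows; it assumes the OpenAI Python SDK, the `gpt-4o` and `gpt-4o-mini` model names and the file ID are placeholder choices, and the exact parameters may differ from the current API.

```python
# Hedged sketch of distillation on the OpenAI platform (model names, file ID,
# and workflow details are illustrative assumptions, not an official recipe).
from openai import OpenAI

client = OpenAI()

# 1. Collect outputs from a large "teacher" model, storing the completions
#    on the platform so they can later be exported as training data.
response = client.chat.completions.create(
    model="gpt-4o",  # large "teacher" model (illustrative choice)
    messages=[{"role": "user", "content": "Summarize this email: ..."}],
    store=True,  # persist the completion for later reuse
    metadata={"task": "email-summarization"},
)

# 2. After exporting the stored completions as a training file, fine-tune a
#    smaller "student" model on them (the file ID below is a placeholder).
job = client.fine_tuning.jobs.create(
    training_file="file-REPLACE_WITH_EXPORTED_ID",
    model="gpt-4o-mini",  # smaller "student" model (illustrative choice)
)
print(job.id)
```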
However, OpenAI said it believed DeepSeek had distilled its models to train its rival products, a move that violated its terms of service. DeepSeek has not yet publicly responded to the claim.
While distillation can be used to build high-performing models, experts add that the technique has limits.
“Distillation presents a very interesting trade-off; if you make models smaller, you inevitably reduce their capabilities,” said Ahmed Awadallah of Microsoft Research. He said the distilled model can be used to summarize emails, “but it’s really not very good at other things.”
David Cox, vice president of AI models at IBM Research, said most companies don’t need huge models to run their products, and streamlined models are powerful enough for scenarios such as customer service chatbots, or to run on small devices such as mobile phones.
“As long as you can reduce costs and get the capabilities you want, why not do it?” he added.
This poses a challenge to the business models of many leading AI companies. Even when developers use distilled models from companies like OpenAI, those models cost far less to run, are cheaper to create, and therefore generate less revenue. Model developers like OpenAI typically charge less for the use of distilled models because they require less compute.
However, OpenAI's Godement believes that large language models will still be used for "high-intelligence and high-risk tasks" because "companies are willing to pay more for high levels of accuracy and reliability." Large models are also needed to discover new capabilities, which can then be distilled into smaller ones, he added.
Still, the company is working to prevent its large models from being extracted and used to train rival products. OpenAI has teams that monitor usage, and if it suspects a user is generating large amounts of data to export and train competitors, it can remove that user’s access, as it has done with accounts it believes were linked to DeepSeek. But most of these actions are taken after the fact.
"OpenAI has been working to prevent data from being distilled for a long time, but it's very difficult to avoid it completely," said Douwe Kiela, CEO of Contextual AI, a startup that builds information retrieval tools for businesses.
Distillation is also a win for advocates of open models, whose technology is made freely available to developers. DeepSeek has also opened its latest models to developers.
"We will immediately use distillation and incorporate it into our products," said Yann LeCun, chief AI scientist at Meta. "That's the idea of open source. As long as these processes are open, you can benefit from other people's development."
Distillation also means that model developers can spend billions of dollars improving the capabilities of their AI systems only to see competitors catch up quickly, as DeepSeek's recent releases show. This raises questions about the first-mover advantage of building large language models, whose capabilities can now be replicated in a matter of months.
“In this world that’s changing so fast…you might actually spend a lot of money doing it the hard way and pretty soon everyone else in the space will follow,” IBM’s Cox said. “So it’s an interesting but tricky business environment.”