Source: FT Chinese
During the Spring Festival holiday a year ago, on February 15, 2024 (local time), OpenAI released its text-to-video model Sora. The smooth camera movement and realistic rendering in several Sora-generated videos plunged China's domestic large-model industry, then still at the stage of imitating and following, into shock and pessimism. For a while, "surrender theory" ran rampant: investors and big companies stepped forward to urge entrepreneurs to give up their illusions and turn to applications, arguing that large-model entrepreneurship was a "dead end."
Who would have thought that just one year later, during this Spring Festival, everyone would be discussing a domestic large model called DeepSeek. Beyond dominating discussion and social feeds in tech circles, its app has begun to reach ordinary households, and more and more ordinary people are using DeepSeek to draw up weight-loss meal plans, polish holiday greetings, write acrostic poems, and even tell fortunes.
So far, DeepSeek has launched three generations of models. In May last year, DeepSeek, a subsidiary of the quantitative fund High-Flyer (Huanfang Quantitative), released DeepSeek-V2, which claimed performance comparable to GPT-4 at only about 1% of GPT-4's price. That low price set off a year-long price war among domestic large models. In December, DeepSeek released a new large model, DeepSeek-V3, which cut the training cost to several million US dollars and was hailed as a "price butcher." The newly released DeepSeek-R1 takes direct aim at OpenAI o1. With the launch of its "deep thinking" and "online search" functions, DeepSeek has topped the free app charts in both China and the United States.
The technological progression across DeepSeek's three generations of models is very clear. V2 brought the price down, but its performance advantage was not obvious. V3 gradually caught up on performance while keeping costs ultra-low. With R1, performance has drawn level with the most advanced large models in the world while the price remains low, and DeepSeek has truly broken out beyond tech circles. The rhythm is clear, and the interval between updates is getting shorter and shorter: it took more than half a year to go from V2 to V3, but barely a month to go from V3 to R1.
In my outlook at the beginning of this year, I mentioned DeepSeek, a startup that only emerged in May last year. At the time, I predicted that DeepSeek would become a game-changer in the large-model market in the new year. Its low cost and low price prove that, even with limited computing power and chips, domestic large models are not without a way forward. And by taking on a number of deep-pocketed large companies "single-handed" as a startup, it has refuted the earlier AI "surrender theory" and given other entrepreneurs the confidence to keep digging deep in the field of large models.
More importantly, DeepSeek is not only low-priced but also completely open source, which breaks the Matthew effect that technology and capital giants enjoy in model training. This has been one of the deepest concerns about artificial intelligence over the past two years: chips are getting more and more expensive, training costs keep rising, the world's few leading large models are gradually becoming closed or even "oligopolistic," computing power and data are increasingly concentrated in the hands of a few companies, and the entry ticket to AI is getting ever more expensive. Most people might end up as mere bystanders in this AI technology revolution.
The Stargate plan, which Trump announced shortly after taking office, will further strengthen this effect once it is implemented. This ambitious AI infrastructure plan is worth up to 500 billion US dollars and is led by SoftBank, Oracle and OpenAI. With the US government as the backer, big companies in the lead, and huge sums of capital then entering the market, the ultimate goal of the project is obvious: to rely on an arms race in capital, chips and computing power to keep the United States the permanent leader of this AI technology revolution.
The emergence of DeepSeek has at least dispelled the anxiety that the Stargate plan brought to the Chinese AI community. When piling up capital and computing power is no longer the only way to advance technology, what does that mean for entrepreneurs and developers? Everyone in tech circles should be able to foresee the answer.
It is precisely for this reason that DeepSeek is now being praised in public discourse as "innovation at the level of national destiny." Leaving aside whether linking "national destiny" to a startup is mere flattery, the past experiences of Huawei and TikTok show that elevating a company to a political level and placing it at the forefront of great-power rivalry will do no good to a Chinese technology community whose pessimism has only just reversed.
Now that DeepSeek has used open source to make the AI entry ticket affordable to all, the rapid fall in AI costs will drive a further explosion of AI applications and innovation across industries. Current discussion and reflection should return to what this means for the market and for technology itself: how to create a healthy market environment in which technological innovation receives positive feedback, and how entrepreneurs and ordinary people can use AI to change the world around them.