Author: teafox; Source: Tea Fox Sees the World
This is an era of information explosion and serious information pollution, so I do not believe any information easily. When the news is overwhelmingly positive, I become even more "suspicious". For me, the best way to filter out information pollution is to cross-check sources and look at both the positive and the negative side.
In the past 48 hours, I have spent almost all my time on X, browsing news about DeepSeek. In the English-speaking world, the coverage is overwhelmingly positive. At a moment like this, I especially want to see negative comments. After searching around, I did find some.
They fall mainly into two categories:
The first category is opposition for its own sake. Some overseas anti-China commentators give a negative take on any news about China, and this kind of information is simply garbage. Still, it is worth a look: at least you learn what garbage looks like.

The other category is negative commentary from industry insiders. The first and most hawkish voice actually comes from a Chinese American industry insider, Alexandr Wang.
First of all, this person's name is a bit strange. When I first saw it, I thought CNBC had misspelled it. Alexander is the most common spelling in English and the internationally accepted one. Alexandr is the spelling used in some Eastern European languages (such as Russian and Czech). It is a bit odd for a Chinese American to have an Eastern European name, but when I looked closely, it was indeed Alexandr.
Secondly, this Mr. Wang not only has a strange name, but also has a unique background. He was born in 1997 and is the founder and CEO of Scale AI. At the age of 24, Alexandr Wang became the youngest self-made billionaire in the world. According to Forbes, as of July 2024, he is worth $2 billion.
He is the son of Chinese immigrants. Both of his parents worked as physicists at Los Alamos National Laboratory, the birthplace of nuclear weapons. It is very rare for people of Chinese descent to work at such an institution.
Alexandr has been passionate about mathematics and computer programming since childhood. He qualified for the US Math Olympiad Program in 2013. As a teenager, he worked as a software programmer at Quora. He then studied computer science at MIT, but dropped out to found Scale AI, becoming an AI prodigy in Silicon Valley.
Alexandr said that DeepSeek has at least 50,000 Nvidia H100 graphics cards but cannot talk about them because of the sanctions. Afterwards, I watched the CNBC interview several times. His actual words were "as my understanding", that is, a personal impression offered without any solid evidence.
According to DeepSeek, training the model used only 2,048 H800 graphics cards. The H800 is a cut-down version of the H100 and costs only about one-third as much as the H100 (about 30,000 US dollars). It is precisely this reliance on lower-end hardware that shows the value of DeepSeek's innovation.

So, why is Alexandr Wang panicking?
I am not an AI expert, but based on the large amount of information I have read in the past few days, DeepSeek may be a giant black swan hovering over Silicon Valley.
1/ Currently, it is extremely expensive to train top AI models. Giants such as OpenAI need large data centers with tens of thousands of H100 graphics cards. Each card costs at least $30,000, and the total price is more than a billion dollars. In addition, the power consumption is amazing, and an entire power plant is needed to provide electricity. They spend hundreds of millions of dollars just on training models.
2/ DeepSeek suddenly appeared and said, "Haha, what if we spend $5 million to do this?" They didn't just say it, they really did it. DeepSeek's model even beat GPT-4 and Claude in many tasks. The artificial intelligence world in Silicon Valley was stunned for a moment, and AI genius Alexandr Wang was incoherent.
3/ How did DeepSeek do it? They rethought everything from scratch. Traditional AI training is like writing every number with 32 bits of precision. DeepSeek asked, "What if we use only 8 bits?" It turns out that is still accurate enough, and all of a sudden the memory required drops by 75%.
4/ Then there is their "multi-token" system. Regular AI reads like a first grader, one word at a time: "Geese... geese... geese... qu... xiang... xiang... xiang... tian... ge". DeepSeek reads several words at once. It is 2x faster and 90% more accurate, which matters when you are dealing with billions of words.
5/ But here is the really clever part: instead of using one big AI that tries to know everything (like having one person be a doctor, lawyer, engineer, and carpenter at the same time), they built a "mixture of experts" system that activates only the specific experts a task needs, so most of the parameters stay idle at any given moment.
6/ And the traditional model? All 1.8 trillion parameters are active all the time. Meanwhile, DeepSeek has a total of 671 billion parameters, with only 37 billion active at a time. It's like having a large team, but only calling in the experts you really need for each task.
7/ The results are astounding. Training cost: hundreds of millions of dollars → $5 million. GPUs required: 100,000 → 2,000. API cost: 95% cheaper. It can run on ordinary gaming graphics cards instead of data center hardware.
8/ The craziest part - DeepSeek is open source (completely free). Anyone can use it, and the code is public. The technical paper explains everything. It is not magic, just incredibly clever engineering. One of the most popular memes on X right now is that OpenAI has become "Closed AI" and DeepSeek has taken its place as the real "Open AI".
9/ Why is DeepSeek important? Because it breaks the myth that only big tech companies can get involved in AI. You no longer need a multi-billion dollar data center; a few good gaming graphics cards will do.
10/ For Nvidia, this is terrible. Their entire business model is based on expensive graphics cards with 90% profit margins, such as the H100, which costs up to $30,000 or $40,000. If everyone can suddenly do AI with ordinary gaming graphics cards... then you know the problem.
11/ The key point: DeepSeek's team has fewer than 200 people, yet Meta's team salaries alone exceed DeepSeek's entire training budget... and Meta's model is not as good as DeepSeek's.
12/ This is a classic disruption story: incumbents optimize existing processes, while disruptors rethink fundamental approaches.
13/ DeepSeek is like an earthquake, with huge aftershocks: AI development has become easier, competition has intensified, the "moats" of large technology companies look more like ditches, and hardware requirements (and costs) have dropped significantly.
14/ Of course, giants such as OpenAI will not sit idly by. But the old order is about to be overturned, and the approach of working miracles through sheer brute force no longer holds.
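The arithmetic behind points 3, 6 and 7 above can be checked in a few lines of Python. This is only a rough sketch under stated assumptions: the numbers are the ones quoted in the thread, the function names are hypothetical, and the toy router simply picks the highest-scoring experts to illustrate the idea of activating only a few of them. It is not DeepSeek's actual routing method.

```python
# Back-of-the-envelope checks for figures quoted in the thread above.
# Assumption: all helper names are hypothetical and for illustration only.

def memory_saving(size_before, size_after):
    """Fraction of memory saved when each stored number shrinks in size."""
    return 1 - size_after / size_before

def active_fraction(active_params, total_params):
    """Share of the model's parameters that is used at any one time."""
    return active_params / total_params

def top_k_experts(scores, k):
    """Toy router: return the indices of the k highest-scoring experts."""
    ranked = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    return sorted(ranked[:k])

saving = memory_saving(32, 8)           # point 3: 32 -> 8 per number
share = active_fraction(37e9, 671e9)    # point 6: 37 billion of 671 billion
gpu_ratio = 100_000 / 2_000             # point 7: GPUs required
chosen = top_k_experts([0.1, 0.7, 0.05, 0.9, 0.3], k=2)  # made-up scores

print(f"memory saved: {saving:.0%}")
print(f"active share: {share:.1%}")
print(f"GPU ratio: {gpu_ratio:.0f}x")
print(f"experts chosen: {chosen}")
```

The first figure reproduces the 75% memory reduction quoted in point 3, and the second shows that only about 5.5% of the 671 billion parameters are in use at any one time.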
DeepSeek, this black swan, flaps its wings, and the entire Silicon Valley will be affected. The effects can be summarized as follows.
AI startup crisis: DeepSeek's high performance may cause a large number of AI startups that buy Nvidia graphics cards to go bankrupt, releasing a large number of second-hand GPUs. For Alexandr Wang, CEO and founder of Scale AI, this is a life-and-death struggle, and it is understandable that he speaks ill of others.
Data center business suffers: The business model of large data center operators renting Nvidia graphics cards will be impacted.
Tech giants slow down purchases: Tech giants may reduce purchases of Nvidia graphics cards due to inventory backlogs.
Nvidia's prospects are worrying: The combination of the above factors may lead to an overall decline in Nvidia's business.
On X, a prominent verified finance account posted: "deepseek better not be the real deal...", accompanied by a chilling chart.

75-year high: The chart shows that the US stock market is at its highest point in 75 years.
Magnificent 7: This term refers to the seven best-performing technology giants in the US stock market, which have largely driven the rise of the US stock market.
Two bubbles: The Nifty 50 bubble in the 1960s and the Internet bubble in the 1990s. Both bubbles led to stock market crashes. Now that DeepSeek is here, will the US stock market collapse?

Finally, let's take a look at the paper that the DeepSeek team just published on arXiv, the preprint site hosted by Cornell University. Every author is worth remembering. Most of them are young people under the age of 30 from top universities in China, and some are still studying for a doctorate. None of them has an overseas academic background. This once again suggests that China has caught up with the United States in the quality of university education, and that China will have an absolute numerical advantage in STEM graduates in the next few decades.
As Liang Wenfeng, the founder of DeepSeek, said: Our value lies in the team, and through this process we continue to grow and accumulate expertise. Building a team that can continuously innovate is our real moat.