The Pulse #122: DeepSeek rocks the tech industry
Almost unknown Chinese lab releases AI model that’s open, free, and as good as ChatGPT’s best models. Oh, and it’s also cheaper to operate. This has sent shockwaves through the AI sector.
The Pulse is a series covering insights, patterns, and trends within Big Tech and startups. Notice an interesting event or trend? Send me a message.
This week, a massive event shook the tech industry: a lesser-known Chinese AI lab shocked the markets and tech professionals with the DeepSeek AI model, which feels on a par with OpenAI’s most capable publicly available model, ChatGPT o1. OpenAI has a more advanced o3 model, but it’s in preview and isn’t publicly available yet. DeepSeek is released as an open model: it is free to use within the DeepSeek app, and anyone can download and host it.
Major AI companies are coming to terms with the fact that a small team in China with supposedly little funding, and no access to NVIDIA’s latest AI chips, could pull this feat off. It shatters both the image of OpenAI’s invincibility and the notion that the US leads the AI race, and it raises the question of whether open models will turn advanced LLMs into a commodity.
Today, we cover:
1. The first “thinking model” that feels fast – and is a hit
2. About 4x cheaper — and possibly more efficient? — than ChatGPT
3. Open model spreads fast
4. OpenAI’s need to remain fully closed highlighted by DeepSeek
5. How did DeepSeek do it, and why give it away for free?
6. Geopolitics and export controls
7. Google feared open source AI will win
1. The first “thinking model” that feels fast – and is a hit
On Monday, NVIDIA’s valuation plummeted from $3.5 trillion to $2.9 trillion: an almost $600B reduction in its market cap, on a 17% drop in the stock price. This was reported as the biggest ever fall in market value by a US company. The cause? A new Large Language Model (LLM) called DeepSeek, built by a Chinese AI startup, which has been an overnight sensation. On the same day, the DeepSeek app (built by the same company) hit the #1 spot in the US app stores on both iOS and Android, making it more downloaded than ChatGPT, which was relegated to #2. DeepSeek has remained #1 since.
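As a quick sanity check, the rounded figures quoted above are consistent with each other. A minimal Python sketch, using only the two valuations from this paragraph (the variable names are my own):

```python
# Rounded figures quoted in the article, in trillions of USD.
cap_before = 3.5
cap_after = 2.9

loss = cap_before - cap_after        # absolute drop in market cap
pct_drop = loss / cap_before * 100   # drop relative to the starting value

print(f"Market cap lost: ${loss * 1000:.0f}B")   # prints "Market cap lost: $600B"
print(f"Percentage drop: {pct_drop:.1f}%")       # prints "Percentage drop: 17.1%"
```

Because both inputs are rounded to one decimal place, the result only confirms that “almost $600B” and “17%” describe the same move; it is not a precise market figure.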
What’s the cause of DeepSeek’s sudden popularity? It’s thanks to the company updating the app to enable its “DeepThink (R1)” mode, which uses its DeepSeek-R1 model. This model is similar to OpenAI’s o1 model in that it takes more ‘thinking time’ to respond, using more compute to serve up a better response.
A big difference is that DeepSeek displays the model’s “chain of thought”, whereas OpenAI hides what happens during the “thinking” phase. So, the model feels much more “snappy” than OpenAI’s o1, more transparent, and more relatable. And frankly, it’s a far better experience to watch the model “think out loud” for 30 seconds than to watch ChatGPT’s spinner for 30 seconds.
Here’s a good example of what happens when asking a question that trips a lot of LLMs up: “if a chicken says ‘all chickens are liars’ is the chicken telling the truth?” DeepSeek “thinks” for nearly a minute, spitting out pages’ worth of internal monologue.
In the end, the answer it generates concludes the question is a paradox. The output is pretty similar to what OpenAI’s o1 produces. The difference is that o1, which takes around the same time (38 seconds) to “think”, doesn’t show anything to the user.
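To make the difference concrete, here is a minimal Python sketch of how a client could separate the visible “thinking” from the final answer. It assumes the raw model output wraps its reasoning in <think>...</think> tags, which is how output from the openly released DeepSeek-R1 weights is commonly formatted. That detail, the function name split_reasoning, and the sample string are assumptions for illustration; none of this is taken from the DeepSeek app itself.

```python
import re


def split_reasoning(raw: str) -> tuple[str, str]:
    """Split raw model output into (chain_of_thought, final_answer).

    Assumes the reasoning is wrapped in <think>...</think> tags.
    If no such block is found, the whole text is treated as the answer.
    """
    match = re.search(r"<think>(.*?)</think>", raw, flags=re.DOTALL)
    if match is None:
        return "", raw.strip()
    thoughts = match.group(1).strip()
    answer = raw[match.end():].strip()
    return thoughts, answer


# Made-up sample text, not real model output.
sample = (
    "<think>If the chicken is truthful, then all chickens lie, "
    "including this one...</think>"
    "This is a version of the liar paradox."
)
thoughts, answer = split_reasoning(sample)
print("THINKING:", thoughts)
print("ANSWER:", answer)
```

An interface that shows the first part as it arrives feels responsive, while one that displays only the second part leaves the user looking at a spinner for the same duration.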
DeepSeek: free, OpenAI: $20-200/month. An obvious reason for the DeepSeek app’s popularity is that it’s free and offers virtually the same functionality as paid ChatGPT plans, which cost $20/month for limited access, and $200/month for unlimited access to the advanced o1 and o1-mini models. DeepSeek offers all of this for free, while somehow dealing with what look like enormous loads. The key to this is that DeepSeek seems to be an order of magnitude cheaper to operate than existing models, like OpenAI’s.
2. About 4x cheaper — and possibly more efficient? — than ChatGPT
The team behind DeepSeek found dozens of approaches to improve the efficiency of their model – and published these optimizations in a paper titled DeepSeek-V3 Technical Report. Novel optimization methods include: