The last few days in the AI industry have been as exciting as late 2022, when OpenAI’s ChatGPT got all the attention. DeepSeek, a small and largely unknown Chinese AI company founded in July 2023 by Liang Wenfeng, has managed to rattle the entire AI industry with its latest AI models. Let’s go through, point by point, why this is being called AI’s Sputnik moment and how exactly DeepSeek has shocked the world.
- After ChatGPT launched and became a huge hit, China was expected to come up with something comparable. However, China’s first AI launch, by Baidu, was not well received, and it was widely assumed that China would lag behind the US.
- But DeepSeek’s latest AI models, the general-purpose V3 (released in late December 2024) and the reasoning-focused R1 (released on 20 January 2025), have impressed AI experts and rapidly captured their attention.
- DeepSeek’s AI assistant overtook ChatGPT to become the top-rated free application on the iOS App Store.
- It is claimed to have been built at a much lower cost, using around $6m worth of computing power from Nvidia H800 chips, which are less capable than the advanced chips being deployed by US AI companies.
- The news wiped nearly $600bn off Nvidia’s market value, the biggest one-day loss for a company in US history.
- In terms of performance, R1 is reported to be on par with OpenAI’s o1 model. It is also said to be 20-50 times cheaper than o1, depending on the task. For instance, DeepSeek-R1’s API costs just $0.55 per million input tokens and $2.19 per million output tokens, compared with $15 and $60, respectively, for OpenAI’s o1 API.
- DeepSeek’s models are open source, similar to Llama from Meta. This allows developers to freely access, modify and deploy them, reducing the financial barriers to entry and promoting wider adoption of advanced AI technologies.
- DeepSeek’s models use a mixture-of-experts (MoE) architecture, activating only a small fraction of their parameters for each token they process. This selective activation significantly reduces computational cost and improves efficiency.
- DeepSeek’s strong performance also raises questions about how effective the US ban on exporting advanced chips to China has been.
- DeepSeek was hit by large-scale malicious attacks, which led it to limit new registrations for the time being.
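To make the pricing comparison above concrete, here is a minimal sketch that works out the cost of a single API request at the per-million-token prices quoted in the list. The function names and the example request size (2,000 input tokens, 1,000 output tokens) are assumptions made for illustration only; actual prices change over time and should be checked against each provider’s pricing page.

```python
# Per-million-token prices in USD, as quoted in the list above.
R1_INPUT_PRICE, R1_OUTPUT_PRICE = 0.55, 2.19
O1_INPUT_PRICE, O1_OUTPUT_PRICE = 15.00, 60.00


def request_cost(input_tokens, output_tokens, input_price, output_price):
    """Return the USD cost of one request given per-million-token prices."""
    return (input_tokens * input_price + output_tokens * output_price) / 1_000_000


def price_ratio(input_tokens, output_tokens):
    """Return how many times more the request costs on o1 than on R1."""
    r1 = request_cost(input_tokens, output_tokens, R1_INPUT_PRICE, R1_OUTPUT_PRICE)
    o1 = request_cost(input_tokens, output_tokens, O1_INPUT_PRICE, O1_OUTPUT_PRICE)
    return o1 / r1


if __name__ == "__main__":
    # Hypothetical request size, chosen only for illustration.
    tokens_in, tokens_out = 2_000, 1_000
    r1 = request_cost(tokens_in, tokens_out, R1_INPUT_PRICE, R1_OUTPUT_PRICE)
    o1 = request_cost(tokens_in, tokens_out, O1_INPUT_PRICE, O1_OUTPUT_PRICE)
    print(f"R1: ${r1:.5f}  o1: ${o1:.5f}  ratio: {price_ratio(tokens_in, tokens_out):.1f}x")
```

At these list prices the ratio comes out at roughly 27 times for any mix of input and output tokens, which sits inside the 20-50 times range. The wider range presumably reflects factors the list prices alone do not capture, such as how many tokens each model spends on a given task.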
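The mixture-of-experts idea mentioned in the list can be illustrated with a toy example. What follows is a generic top-k routing sketch in plain Python, not DeepSeek’s actual router: the expert functions, the router weights and the choice of `top_k=2` are all hypothetical. The point it shows is that a router scores every expert, but only the few highest-scoring experts are actually run for a given input.

```python
import math


def softmax(scores):
    """Convert raw router scores into probabilities."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def make_expert(scale):
    """Hypothetical stand-in for an expert network: it just scales its input."""
    return lambda x: [scale * v for v in x]


def moe_forward(x, router_weights, experts, top_k=2):
    """Route input x to the top_k experts and mix their outputs.

    Returns the combined output and the indices of the experts that ran.
    """
    # The router scores every expert (one dot product each), which is cheap.
    scores = [sum(w * v for w, v in zip(row, x)) for row in router_weights]
    probs = softmax(scores)
    # Only the top_k experts are evaluated; the rest are skipped entirely.
    chosen = sorted(range(len(experts)), key=lambda i: probs[i], reverse=True)[:top_k]
    norm = sum(probs[i] for i in chosen)
    output = [0.0] * len(x)
    for i in chosen:
        gate = probs[i] / norm  # renormalise gates over the chosen experts
        expert_out = experts[i](x)
        output = [o + gate * e for o, e in zip(output, expert_out)]
    return output, chosen


if __name__ == "__main__":
    experts = [make_expert(s) for s in (1.0, 2.0, 3.0, 4.0)]
    router_weights = [[0.0, 0.0], [3.0, 0.0], [1.0, 0.0], [2.0, 0.0]]
    output, chosen = moe_forward([1.0, 0.0], router_weights, experts)
    print("experts used:", chosen, "output:", output)
```

In this toy setup only two of the four experts do any work per input, so the compute spent per input is a fraction of what running every expert would cost, even though all four sets of parameters exist in the model.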
Let’s also look at some of the reactions to this disruptive tech event:
- US President Donald Trump said it was a “wake-up call” for US companies, which must focus on “competing to win”.
- In a statement to CBS News, Nvidia offered praise for DeepSeek. “DeepSeek is an excellent AI advancement and a perfect example of test-time scaling,” the company said in an email. “DeepSeek’s work illustrates how new models can be created using that technique, leveraging widely available models and compute that is fully export-control compliant.”
- “The fact that this technology is supposed to take less energy and is more cost-effective than U.S.-based models have U.S. technology investors very concerned,” Jay Woods, chief global strategist at Freedom Capital Markets, said.
- “DeepSeek has proven that cutting-edge AI models can be developed with limited compute resources,” says Wei Sun, principal AI analyst at Counterpoint Research.
- “In China, DeepSeek’s advances are being celebrated as a testament to the country’s growing technological prowess and self-reliance,” says Marina Zhang, an associate professor at the University of Technology Sydney.
These are exciting times. With DeepSeek, China has shown the world that it is not going to settle for second place in the AI race, and it has leapfrogged ahead with this latest launch. This event will prompt AI leaders and companies to rethink their strategies, investors to rethink their investments, and politicians to rethink how they too could get a piece of the AI pie.