What is DeepSeek? The Chinese ChatGPT Rival Taking the World by Storm

DeepSeek is the buzzy new AI model taking the world by storm. The Chinese startup has impressed the tech sector with its robust large language model, built on open-source technology.

DeepSeek has also sent shockwaves through the AI industry, showing that it’s possible to develop a powerful AI for millions in hardware and training, when American companies like OpenAI, Google, and Microsoft have invested billions.

What is DeepSeek?

DeepSeek is the brainchild of investor and entrepreneur Liang Wenfeng, a Chinese national who studied electronic information and communication engineering at Zhejiang University. Liang began his career in AI by using it for quantitative trading, co-founding the Hangzhou, China-based hedge fund High-Flyer Quantitative Investment Management in 2015. In 2023, Liang launched DeepSeek, focusing on advancing artificial general intelligence.

DeepSeek launched its first large language model, DeepSeek-Coder, on November 29, 2023.

But it wasn’t until January 20, 2025, with the release of DeepSeek-R1, that the company upended the AI industry.

With a team of just 200 people and a budget of $6 million, DeepSeek released its free, open-source model, which was on par with OpenAI’s much-ballyhooed o1 model, a project that cost as much as $600 million and took an estimated 3,500 people two years to build.

Unlike big tech companies with big payrolls in the West, DeepSeek optimized its hiring to focus on recently graduated students: “Three to five years of work experience is the maximum, and those with more than eight years of work experience are basically rejected,” a headhunter told 36kr, a popular Chinese tech site.

And whereas models from OpenAI and other dominant AI developers were mainly available as subscription products, DeepSeek’s code is open source and available for public scrutiny, and the model can be downloaded for free to a local computer via AI playground Hugging Face or used as a phone app.

DeepSeek’s underlying technology was considered a massive breakthrough in AI and its release sent shockwaves through the US tech sector, wiping out $1 trillion in value in one day.

What’s so special about DeepSeek?

DeepSeek’s success comes from its approach to model design and training. Rather than running the entire network on every input, DeepSeek’s Mixture-of-Experts system splits the model into many specialized sub-networks, or “experts,” and routes each piece of input to only a few of them, so that only about 37 billion of its 671 billion parameters are active for each token it processes. This approach significantly improves efficiency, reducing computational costs while still delivering top-tier performance across applications.
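As a rough illustration of the selective activation described above, here is a minimal sketch of top-k expert routing in plain Python. It is not DeepSeek’s implementation: the eight toy experts, the router scores, and the function names `route_top_k` and `moe_forward` are all hypothetical, chosen only to show how a router can pick a few experts and skip the rest.

```python
import math

def softmax(scores):
    """Turn raw router scores into probabilities that sum to 1."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def route_top_k(gate_scores, k):
    """Return (expert_index, weight) pairs for the k highest-scoring experts."""
    probs = softmax(gate_scores)
    top = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)[:k]
    norm = sum(probs[i] for i in top)  # renormalize over the chosen experts
    return [(i, probs[i] / norm) for i in top]

def moe_forward(x, experts, gate_scores, k=2):
    """Run only the selected experts and blend their outputs by weight."""
    return sum(w * experts[i](x) for i, w in route_top_k(gate_scores, k))

# Eight toy "experts" (hypothetical stand-ins for large sub-networks).
experts = [lambda x, a=a: a * x for a in range(1, 9)]

# Made-up router scores for a single token; a real router computes these.
gate_scores = [0.1, 2.0, -1.0, 0.3, 1.5, 0.0, -0.5, 0.2]

selected = route_top_k(gate_scores, k=2)
output = moe_forward(3.0, experts, gate_scores, k=2)

# Share of parameters active per token, using the figures quoted in the article.
active_fraction = 37 / 671
print(selected, output, f"{active_fraction:.1%}")
```

With the figures quoted in the article, 37 billion out of 671 billion works out to roughly 5.5% of the model’s parameters being used for any one token.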

DeepSeek enhances its training process using Group Relative Policy Optimization, a reinforcement learning technique in which the model generates a group of candidate answers to the same prompt and each answer is scored relative to the rest of the group. Answers that beat the group average are reinforced, and weaker ones are discouraged. This allows the AI to refine its reasoning more effectively, producing higher-quality training data.
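The core idea of scoring answers relative to their group can be sketched in a few lines of Python. This is a simplified illustration, not DeepSeek’s training code: the reward values are invented, the function name `group_relative_advantages` is hypothetical, and using the population standard deviation for normalization is an assumption.

```python
from statistics import mean, pstdev

def group_relative_advantages(rewards, eps=1e-8):
    """Score each sampled answer relative to the other answers in its group.

    Answers above the group mean get a positive advantage (reinforced);
    answers below it get a negative one (discouraged).
    """
    mu = mean(rewards)
    sigma = pstdev(rewards)  # assumption: population std; eps avoids divide-by-zero
    return [(r - mu) / (sigma + eps) for r in rewards]

# Invented rewards for eight answers to one prompt (1.0 = correct, 0.0 = wrong).
rewards = [1.0, 0.0, 0.0, 1.0, 1.0, 0.0, 0.0, 0.0]
advantages = group_relative_advantages(rewards)

for r, a in zip(rewards, advantages):
    print(f"reward={r:.1f}  advantage={a:+.3f}")
```

Because the baseline is simply the group’s own average, this style of scoring does not need a separate model to estimate how good an answer was expected to be.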

DeepSeek has also demonstrated a commitment to open-source accessibility by releasing its models under the MIT license, which allows users to download, deploy, and customize the AI model, distinguishing it from competitors that maintain closed and proprietary systems. Open source also allows developers to improve upon and share their work with others, who can then build on that work in an endless cycle of evolution and improvement.

DeepSeek’s development is helped by a stockpile of Nvidia A100 chips combined with less expensive hardware. Some estimates put the number of Nvidia GPUs DeepSeek has access to at around 50,000, compared with the 500,000 OpenAI used to train ChatGPT.

Reactions to DeepSeek

Many AI technologists have lauded DeepSeek’s powerful, efficient, and low-cost model, while critics have raised concerns about data privacy and security.

“We are living in a timeline where a non-US company is keeping the original mission of OpenAI alive—truly open, frontier research that empowers all. It makes no sense,” Nvidia Senior Research Manager Dr. Jim Fan wrote on X (formerly Twitter). “The most entertaining outcome is the most likely.”

This is the DeepSeek R1 Reasoning Engine running Grok-1 Open Source.

The Reasoning Engine allows for new life to be given to older models.

It is absolutely fascinating how it works.

Look in and see: pic.twitter.com/FErN8TrOF8

— Brian Roemmele (@BrianRoemmele) January 28, 2025

Even OpenAI CEO Sam Altman acknowledged that DeepSeek is impressive.

“We will obviously deliver much better models and also it’s legit invigorating to have a new competitor!” Altman said on X.

Days later, though, OpenAI claimed to have found evidence that DeepSeek used OpenAI’s proprietary models to train its own rival model.

Critics have also raised questions about DeepSeek’s terms of service, cybersecurity practices, and potential ties to the Chinese government. Others have highlighted the extensive amount of user data collected by DeepSeek, including device models, operating systems, keystroke patterns, and IP addresses—data that’s stored on DeepSeek’s China-based servers, according to the firm’s privacy policy.

As a general news and also security awareness:
Deepseek is a new LLM and it’s powerful, but there is a caveat, they collect keystroke patterns, this is not common and can be used to identify yourself in the future in any device or website as keystroke patterns are like individual… pic.twitter.com/8pn1EkzN2K

— Raphael de Monticello (@RaphaMonticello) January 23, 2025

“Privacy is an issue because it’s China. It’s always about collecting data from users. So user beware,” Kevin Surace, CEO at AI software developer Appvance, told Decrypt. “It will force everyone to rethink how we train models and how much power is required for inference.”

What does the future hold for DeepSeek?

DeepSeek’s rapid rise challenges the dominance of Western tech giants and raises significant questions about the future of AI—who builds it, who controls it, and how open and affordable for all it should be.

But questions remain about the long-term implications of DeepSeek and whether U.S. President Trump will respond to China’s apparent overnight dominance in the AI sector with a TikTok-style ban. Did High-Flyer misrepresent its use of GPUs to make DeepSeek seem more efficient than it actually is? Was DeepSeek’s sudden public launch timed to drive down Nvidia’s stock for the benefit of well-positioned investors?

As competitors, including Meta and Perplexity AI, scramble to adapt to DeepSeek’s methodology, the full impact of this AI breakthrough remains uncertain. But one thing is clear: DeepSeek shook up the tech industry by proving yet again that sometimes, resource constraints force innovative breakthroughs and that powerful technology can be built without multi-billion-dollar price tags.

Source: https://decrypt.co/resources/what-is-deepseek-the-chinese-chatgpt-rival-taking-the-world-by-storm
