DeepSeek AI's Large Language Model Is Redefining Expectations in the Artificial Intelligence Space
DeepSeek AI's first large language model, R1, launched in January 2025, is arguably one of the hottest names in artificial intelligence right now. The company behind it, Hangzhou-based DeepSeek AI, was founded by Liang Wenfeng and is funded by the hedge fund High-Flyer. Despite being relatively new, it now competes with companies that have been around far longer, such as OpenAI and Meta.
What makes DeepSeek AI's large language model stand out right now is its price-to-performance ratio. According to reports, DeepSeek's V3 model was trained for approximately $6 million, a fraction of the estimated $100 million-plus OpenAI spent to create GPT-4. Moreover, DeepSeek reportedly built V3 using fewer AI chips than the more expensive models required, including some less powerful chips available under the United States' trade restrictions. Despite these constraints, V3's performance matched that of the more expensive models.
DeepSeek is challenging AI norms with a better way to train models for less money
One reason DeepSeek has been able to cut the cost of creating its AI models is its use of mixture-of-experts (MoE) layers. In an MoE layer, a router activates only a small subset of the model's expert sub-networks for each input token, so compute and energy scale with the experts actually used rather than with the model's full parameter count. DeepSeek has also released its R1 model under an open-weight license, allowing other developers to analyze and build on R1's architecture, subject to certain conditions.
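To make the routing idea concrete, here is a minimal, illustrative sketch of top-k MoE gating in NumPy. The dimensions, the linear "experts," and the router are toy assumptions for demonstration only, not DeepSeek's actual architecture:

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy sizes (illustrative assumptions, not DeepSeek's configuration)
d_model, n_experts, top_k = 8, 4, 2

# Each "expert" is stood in for by a small linear map
experts = [rng.standard_normal((d_model, d_model)) for _ in range(n_experts)]
gate_w = rng.standard_normal((d_model, n_experts))  # router weights

def moe_layer(x):
    """Route each token to its top-k experts and mix their outputs.

    Only top_k of n_experts run per token, so per-token compute scales
    with top_k rather than with the total number of experts.
    """
    logits = x @ gate_w                              # (tokens, n_experts)
    chosen = np.argsort(logits, axis=-1)[:, -top_k:]  # top-k expert indices
    out = np.zeros_like(x)
    for t in range(x.shape[0]):
        scores = logits[t, chosen[t]]
        weights = np.exp(scores) / np.exp(scores).sum()  # softmax over top-k
        for w, e in zip(weights, chosen[t]):
            out[t] += w * (x[t] @ experts[e])
    return out, chosen

tokens = rng.standard_normal((3, d_model))
y, routing = moe_layer(tokens)
print(y.shape)        # output keeps the token shape: (3, 8)
print(routing.shape)  # each of 3 tokens consulted only 2 of 4 experts
```

The saving comes from the loop body: each token multiplies against only `top_k` expert matrices, while the remaining experts hold parameters (capacity) that cost nothing for that token.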
DeepSeek conducted its training on China-compliant Nvidia GPUs in a custom-built cluster called Fire-Flyer 2, which contained 5,000 GPUs. Using Fire-Flyer 2 and optimizing how many GPUs each calculation required allowed DeepSeek to minimize the resources needed for training.
In addition to its technology, recruitment is a key component of DeepSeek's competitive advantage. The company attracts top Chinese AI researchers while also drawing talent from non-traditional backgrounds. This diversity brings broader domain knowledge into the team, which can translate into better-performing models.
DeepSeek AI large language model causes shockwaves in the global tech community
DeepSeek's growing popularity is already causing ripples throughout the industry. Following the announcement of DeepSeek's success, Nvidia's stock plummeted, erasing roughly $600 billion in market capitalization, the largest single-day decline in a company's market value in U.S. history. Investors viewed DeepSeek's success as a direct threat to Nvidia's dominance.
Additionally, DeepSeek's open-weight strategy and the speed at which it developed R1 have caused competitors to reassess their own approaches to developing AI models. DeepSeek has also prompted discussions about the accessibility of AI models, hardware-independent approaches, and how to build models that are "smarter" rather than simply larger.
With more affordable tools available, DeepSeek AI's large language model demonstrates that you do not need to be a large corporation to make innovative advances in AI. Whether this leads to a broader supply of affordable AI models remains to be seen, but DeepSeek has certainly altered the landscape.