Xiong Jie and Sérgio Amadeu: Why Did DeepSeek Go Open Source? It May Be Closely Tied to Leadership in Artificial Intelligence
THE DISPUTE FOR LEADERSHIP IN ARTIFICIAL INTELLIGENCE, CHINA, AND OPEN SOURCE
Why is technological leadership important? How to define technological leadership in AI? Artificial Intelligence (AI) is a transversal technology, and its advancements have profound impacts on the economy, society, and national security. Technological leadership, first and foremost, provides a series of competitive advantages, as inventions and innovations grant their developers gains and benefits that others do not possess. Secondly, technological leadership is a critical geopolitical factor, as it allows for the influence of global standards, norms, and regulations. Thirdly, technological leadership can drive an innovation ecosystem that consolidates long-term development. Fourthly, leadership can enhance security in an international context of threats, including military ones. Fifthly, leadership enables the direction of technology to benefit social, environmental, and political objectives.
From a technopolitical perspective, where technoscience is not neutral and has implications for power relations and social organization (Winner, 2020), leadership in AI is not merely about developing the most advanced technology but also about creating a sociotechnical environment that realizes broader social values and objectives, ensuring that innovation follows certain purposes. The trajectory of AI development may prioritize increasing productivity for the economic system or may aim to find socially just and environmentally sustainable solutions. It may seek to concentrate power and reinforce international asymmetries or contribute to the distribution of knowledge and equitable development. It may stifle the inventiveness of populations and cultures or ensure technodiversity. It may be tied to the concentration or distribution of power.
Currently, AI leadership resides in the United States, under the direction of the so-called Big Techs. These companies control indispensable resources for the development of existing AI, particularly AI dominated by the deep learning approach. This approach is based on the use of statistics and probability for the classification and extraction of patterns from large amounts of data. To perform these operations, AI developers rely on significant computational power. Training an advanced AI model like OpenAI's GPT costs millions of dollars and requires many hours of processing on specialized hardware designed for these tasks. Such chips are generally called "AI accelerators"; those aimed at running already-trained models are known as "AI inference chips" or "inference accelerators." They achieve better results in less time than general-purpose processors. For example, Google's Tensor Processing Units (TPUs) are optimized for both training and inference. Neural Processing Units (NPUs), also called neural network accelerators, are common in mobile devices and edge computing. Graphics Processing Units (GPUs) are used for both training and inference. Currently, these chips are essential for applications such as image recognition, natural language processing, and other real-time AI tasks.
The U.S. government has, for some time, adopted a policy of restricting access to cutting-edge chips, primarily aimed at delaying AI development in China and other countries considered adversaries. The goal is to maintain U.S. leadership in AI. With Donald Trump's inauguration in January 2025, the policy of technological blockade was intensified. Additionally, the U.S. president announced a $500 billion investment in the Stargate project. Trump's plan is to develop physical and virtual AI infrastructures in the United States, in collaboration with companies like Oracle, OpenAI, and SoftBank, to "fuel the next generation of AI". Companies such as Nvidia, Arm, and Microsoft are partners in the project, which is beginning to be implemented in Texas and will, over the next four years, include "colossal data centers" across various regions of the United States.
American tech elites, represented by figures like Elon Musk, believe that artificial intelligence is approaching the "singularity"—the emergence of Artificial General Intelligence (AGI). They argue that AGI will completely surpass and replace human labor in all intellectual domains, and that if the United States is the first to achieve AGI, its technological hegemony will become unassailable. However, neither ChatGPT nor DeepSeek has shown any signs of approaching AGI. They are useful tools for processing natural language and demonstrate limited reasoning abilities within specific domains, but there is no evidence that they—or any known AI research—are nearing AGI.
THE OPEN SOURCE TURNAROUND
In May 2024, a small Chinese company called DeepSeek launched its Large Language Model (LLM) inspired by Llama, a model licensed under a restricted research agreement prohibiting commercial use. What stood out in the open-source model, DeepSeek V2, was its unprecedented cost-effectiveness. DeepSeek had reduced the cost of inference to just 1 yuan per million tokens, approximately one-seventh of Llama3 70B and significantly less than GPT-4. Tokens are basic units of text that language models use to process and understand human language. Depending on the context and language, tokens can be thought of as "chunks" of words, syllables, or even individual characters. AI models convert text into tokens, which are represented numerically. These numbers are then processed by the model to generate responses or perform tasks. Therefore, the number of tokens in a text directly affects the cost and processing time. The more tokens, the more complex and time-consuming the inference.
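To make the link between token count and cost concrete, here is a minimal sketch in Python. The whitespace tokenizer and the names `toy_tokenize`, `encode`, and `inference_cost_yuan` are illustrative assumptions, not DeepSeek's implementation; production models use subword tokenizers, so real token counts will differ. The only figure taken from the text is the price of 1 yuan per million tokens.

```python
from typing import Dict, List

# DeepSeek V2 inference price cited in the text (yuan per million tokens).
PRICE_YUAN_PER_MILLION_TOKENS = 1.0


def toy_tokenize(text: str) -> List[str]:
    """Split text on whitespace (a stand-in for a real subword tokenizer)."""
    return text.split()


def encode(tokens: List[str], vocab: Dict[str, int]) -> List[int]:
    """Map each token to an integer ID, assigning a new ID on first sight."""
    ids = []
    for token in tokens:
        if token not in vocab:
            vocab[token] = len(vocab)
        ids.append(vocab[token])
    return ids


def inference_cost_yuan(
    n_tokens: int,
    price_per_million: float = PRICE_YUAN_PER_MILLION_TOKENS,
) -> float:
    """Cost grows linearly with the number of tokens processed."""
    return n_tokens / 1_000_000 * price_per_million


vocab: Dict[str, int] = {}
tokens = toy_tokenize("open source is a development process")
token_ids = encode(tokens, vocab)
print(tokens)                           # the text as tokens
print(token_ids)                        # the same tokens as numbers
print(inference_cost_yuan(2_500_000))   # cost of processing 2.5 million tokens
```

Because the cost function is linear, doubling the number of tokens doubles the bill, which is why a lower per-token price matters so much at scale.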
DeepSeek, like all Chinese companies, was and is subject to the U.S. government's blockade on cutting-edge chips. This led DeepSeek's leader and his team to focus more on research and optimization. Liang Wenfeng, in an interview in July 2024, stated, "Our starting point is not to seize the opportunity to make a fortune but to advance to the forefront of technology to promote the development of the entire ecosystem." The Chinese company's attempt to lead AI development is evident. To achieve this, DeepSeek did not limit itself to organizing data and running on available clouds. The team worked hard to find solutions in the face of the scarcity of cutting-edge chips. This required altering architectures and experimenting with new procedures, as well as extensive applied mathematics.
The young leader of DeepSeek, Liang Wenfeng, stated, "What we lack in terms of innovation is definitely not capital but confidence and knowledge on how to organize a high density of talent to achieve effective innovation." He continued, "Innovation is not entirely business-driven; it also requires curiosity and creativity. We are stuck in the inertia of the past, but this is also temporary." Liang Wenfeng's idea is to copy less and study more: to bet on open models not merely to use them but to improve them and to find paths that require fewer computational resources.
Open source is fundamental to DeepSeek's strategy but may not be for other Chinese companies like Tencent, Baidu, and Alibaba, among others. However, open source allows knowledge to be distributed globally, generating possibilities for new discoveries at a faster and more inclusive pace. Liang Wenfeng stated:
"Actually, nothing is lost with open source and the publication of papers. For the technical team, being followed is a great sense of accomplishment. In fact, open source is more of a cultural behavior than a commercial one. Giving is actually an extra honor. A company that does this will also have cultural appeal."
Open source is not a technology. It is a development process based on knowledge sharing. Generally, it encourages the organization of communities willing to collaboratively solve problems and maintain solutions by updating them. Language models like Mistral 7B (Mistral AI) and Falcon (Technology Innovation Institute) are open and licensed under Apache 2.0. The reinforcement learning library Stable-Baselines3 is also open, under an MIT license. There are numerous other open models in the field of AI. So why did DeepSeek's model become so relevant? Because it disrupted the global race for AI leadership. How? By drastically reducing the computational costs of a large language model.
Open source is fundamental for distributing knowledge but does not solve the problem of the computational infrastructure needed to train and run models. DeepSeek presented an open model with high performance and lower processing requirements.
DeepSeek-R1 has already demonstrated stronger reasoning capabilities than OpenAI's o1 model, while its costs (including both training and usage) are significantly lower. By open-sourcing its model, DeepSeek has facilitated the democratization of large language models, enabling smaller companies, countries with less developed technological and digital infrastructure, and even individuals to train their own “sovereign AI” based on DeepSeek, without relying on Big Tech products or handing over their data to these companies. Indonesia and India have already begun building their own AI infrastructure using DeepSeek as a foundation. Prior to this, only the United States and China had the capability to access large language models at such a high level.
DEEPSEEK R1'S BET ON REINFORCEMENT LEARNING
"DeepSeek-R1-Zero chose an unprecedented path, the path of 'pure' reinforcement learning, which completely abandoned the predefined Chain of Thought (CoT) model and supervised fine-tuning (SFT), relying solely on simple reward and punishment signals to optimize the model's behavior."
In an analysis conducted by Tencent's team on the findings of DeepSeek's R1 model, they suggested that it might be necessary to rethink the role of supervised learning in AI development. Perhaps they were focused on making AI mimic how humans think rather than betting more on the native problem-solving capabilities of reinforcement learning systems. In reinforcement learning, rewards and punishments are mathematically expressed in the model. The agent (which can be an algorithm or a system) makes decisions based on a policy that seeks to maximize cumulative rewards over time. Rewards are numerical values that the agent receives for performing actions in a given state of the environment.
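The idea of an agent maximizing cumulative numerical rewards can be sketched in a few lines of Python. This is a toy illustration under stated assumptions, not DeepSeek's training procedure: the environment is a hypothetical three-action "bandit" with fixed rewards (one of them negative, standing in for a punishment), and the names `discounted_return` and `run_bandit` are introduced here for illustration only.

```python
import random
from typing import List


def discounted_return(rewards: List[float], gamma: float) -> float:
    """Cumulative reward over time: r0 + gamma*r1 + gamma^2*r2 + ..."""
    total = 0.0
    for reward in reversed(rewards):
        total = reward + gamma * total
    return total


def run_bandit(arm_rewards: List[float], steps: int, epsilon: float, seed: int) -> List[float]:
    """Epsilon-greedy policy: usually take the best-looking action, sometimes explore."""
    rng = random.Random(seed)
    estimates = [0.0] * len(arm_rewards)  # the agent's value estimate per action
    counts = [0] * len(arm_rewards)
    for _ in range(steps):
        if rng.random() < epsilon:
            action = rng.randrange(len(arm_rewards))  # explore a random action
        else:
            action = max(range(len(arm_rewards)), key=lambda a: estimates[a])  # exploit
        reward = arm_rewards[action]  # numerical signal returned by the environment
        counts[action] += 1
        # Incremental average: move the estimate toward the observed reward.
        estimates[action] += (reward - estimates[action]) / counts[action]
    return estimates


print(discounted_return([1.0, 1.0, 1.0], 0.5))  # 1.75

# Hypothetical environment: action 0 is penalized, action 1 is the best choice.
estimates = run_bandit([-1.0, 1.0, 0.2], steps=500, epsilon=0.1, seed=0)
best_action = max(range(len(estimates)), key=lambda a: estimates[a])
print(best_action, estimates)
```

No labeled examples tell the agent which action is correct; it discovers the rewarding action purely from the reward and penalty signals, which is the contrast with supervised learning that the paragraph draws.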
Machine learning is a field of artificial intelligence that allows computers to identify patterns and make decisions based on data without being explicitly programmed to do so. Machine learning relies on algorithms that extract patterns from large amounts of data and adjust their parameters to improve predictive capabilities over time. These algorithms can be divided into three main categories: supervised learning (when the model learns from labeled data), unsupervised learning (when the model identifies patterns without predefined labels), and reinforcement learning (when the model learns through trial and error, receiving rewards or penalties based on its actions). Deep learning is a subset of machine learning that uses artificial neural networks with multiple layers to process data in a hierarchical and sophisticated manner.
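As a small illustration of "adjusting parameters to improve predictive capabilities," the following Python sketch shows the supervised case: a single parameter `w` is learned from labeled pairs by gradient descent. The data, the function name `fit_slope`, and the learning rate are assumptions chosen for the example.

```python
from typing import List, Tuple


def fit_slope(
    data: List[Tuple[float, float]],
    learning_rate: float = 0.05,
    epochs: int = 50,
) -> float:
    """Learn w in y ≈ w * x from labeled (x, y) pairs by gradient descent."""
    w = 0.0
    n = len(data)
    for _ in range(epochs):
        # Gradient of the mean squared error with respect to w.
        grad = sum(2 * (w * x - y) * x for x, y in data) / n
        w -= learning_rate * grad  # step against the gradient to reduce the error
    return w


labeled = [(1.0, 2.0), (2.0, 4.0), (3.0, 6.0)]  # labels follow y = 2x
w = fit_slope(labeled)
print(round(w, 4))  # approaches 2.0
```

Deep learning applies the same principle to networks with millions or billions of parameters arranged in layers, rather than to a single slope.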
Due to these innovations, the training cost of DeepSeek R1 was drastically reduced, to only 1/10 to 1/20 of ChatGPT's cost: where OpenAI's model spent $20, DeepSeek performed the same activity for just $1. In January 2025, the DeepSeek model cost only 16 yuan per million tokens, while ChatGPT cost up to 438 yuan, a difference of roughly 27 times. This means that organizations can use DeepSeek's model at a lower cost while achieving greater efficiency.
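The arithmetic behind these comparisons can be checked with a short Python sketch. The helper names `cost_ratio` and `savings_fraction` are illustrative; the prices are the figures quoted in the paragraph above.

```python
def cost_ratio(expensive: float, cheap: float) -> float:
    """How many times more the expensive option costs than the cheap one."""
    if cheap <= 0:
        raise ValueError("cheap price must be positive")
    return expensive / cheap


def savings_fraction(expensive: float, cheap: float) -> float:
    """Share of spending avoided by switching to the cheaper option."""
    if expensive <= 0:
        raise ValueError("expensive price must be positive")
    return 1 - cheap / expensive


# Per-million-token prices in yuan, January 2025, as quoted in the text.
CHATGPT_YUAN = 438.0
DEEPSEEK_YUAN = 16.0

print(cost_ratio(CHATGPT_YUAN, DEEPSEEK_YUAN))  # 27.375, i.e. roughly 27 times
print(cost_ratio(20.0, 1.0))                    # 20.0, the $20 versus $1 example
print(savings_fraction(CHATGPT_YUAN, DEEPSEEK_YUAN))
```

A ratio of about 27 corresponds to avoiding more than 96% of the per-token spending, which is the scale of difference that made smaller organizations take notice.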
COMPUTATIONAL POWER AND THE GEOPOLITICS OF AI
The plummeting stock prices of Nvidia and other Big Techs were heralded by many as the end of U.S. leadership in AI. This does not seem to be accurate. The sharp decline in the value of the powerful GPU manufacturer was driven by the sudden sale of a massive volume of its shares following the news that DeepSeek had managed to develop a large language model at 10% of OpenAI's costs. This could change the course of AI. The growing dependence on high-processing chips might be shifting. Based on this reasoning and fear, speculators took the opportunity to sell their positions in Nvidia and other companies.
The dependence on cutting-edge chips did not end with the innovations coming from China. Chips built on process nodes below 2 nanometers represent a crucial advancement for artificial intelligence. They ensure greater processing capacity with lower energy consumption. As AI models become more complex and require billions or trillions of parameters, computational efficiency remains a critical factor. Smaller process nodes allow for greater transistor density, improving calculation speed and energy efficiency and reducing both operational costs and the need for cooling. This evolution is fundamental for the large-scale implementation of AI, from data centers to mobile devices, including military applications.
It is important to note that nanochips expand embedded applications in devices and favor their use in IoT, healthcare, robotics, and autonomous vehicles. Another promise is that with more advanced and smaller chips, AI models can be run locally, reducing dependence on the cloud and ensuring faster and more secure responses. In the geopolitical context, the race for smaller chips intensifies technological disputes between powers like the U.S. and China, as control over this technology defines competitiveness in the digital economy and cybersecurity.
The United States maintains its leadership in the development and manufacturing of chips and semiconductors through a combination of technological dominance, strategic investments, and control of supply chains. American companies like NVIDIA, Intel, AMD, and Qualcomm lead the design of advanced chips. The U.S. government reinforces its position with subsidies and incentives, such as the CHIPS and Science Act, which allocates billions of dollars to strengthen domestic semiconductor production and reduce dependence on Asia.
In addition to technological superiority, the U.S. uses sanctions and export controls to limit access to critical technologies by strategic rivals like China. The Department of Commerce imposes severe restrictions on the export of advanced semiconductor manufacturing equipment, such as ASML's machines and chip design software from Cadence and Synopsys. These restrictions make it difficult for China to develop its own advanced chips and reinforce the U.S. position in the sector. Simultaneously, Washington invests in strategic alliances, such as the "Chip 4 Alliance" (with Japan, South Korea, and Chinese Taiwan), ensuring that its allies follow U.S. guidelines to restrict technology transfer to countries considered competitors. This consolidated strategy allows the U.S. to maintain its hegemony in the semiconductor industry, essential for the digital economy and national security.
While the United States is making every effort to restrict China’s access to advanced chips (below 7nm) and their production capabilities, China is continuously developing its ability to independently manufacture these high-end chips. Semiconductor Manufacturing International Corporation (SMIC) has already demonstrated the capability to produce 7nm chips and is believed to be likely capable of producing 5nm chips. Companies like Shanghai Micro Electronics Equipment (SMEE) are actively developing extreme ultraviolet (EUV) lithography technology to replace the lithography machines monopolized by ASML, which have been restricted from being sold to China.
On the other hand, in the field of mature-process chips used in the automotive and industrial sectors, where the technology is not the most cutting-edge but demand is significantly higher, China’s chip industry has already established a large-scale and complete industrial chain. In 2024, China’s total chip exports exceeded 1 trillion RMB (approximately 139 billion USD). It is foreseeable that once Chinese companies achieve technological breakthroughs in advanced process nodes, their existing supply chain advantages will significantly reduce the prices of high-end chips. Moreover, chip process scaling is constrained by physical limits and cannot be improved indefinitely. It is only a matter of time before China catches up with the United States.
CONCLUSION
"Nvidia's leadership is not just the result of one company's efforts but the combined efforts of the entire Western technology community and industry. They can see the next generation of technological trends and have a roadmap. AI development in China also requires this ecosystem. Many domestic chips cannot develop due to the lack of supporting technical communities and only second-hand information, so China needs someone at the forefront of technology." (Liang Wenfeng, 2024)
The founder of DeepSeek, Liang Wenfeng, stated, "The problem we face has never been money but the ban on cutting-edge chips." Even if the trend of data concentration and the need for increasing computational power—which requires increasingly sophisticated chips—shifts and loses momentum, international capitalism does not seem to alter its fundamental asymmetries. Undoubtedly, the technoscientific development of China allows countries technologically dependent on the U.S. to structure strategies that benefit their development. Having sovereign, controllable, world-class large language models was once out of reach for countries outside the United States and China—especially those in the Global South. Now, DeepSeek has democratized this technology, opening up new possibilities for Global South countries in this field. At the same time, it has also presented new tasks and challenges for the governments of these nations.
What the DeepSeek phenomenon points to is the importance of open source for strengthening international collaborative chains that can reduce inequalities and large knowledge asymmetries. However, open source does not solve the problem of building sovereign infrastructures essential for local and national development. Therefore, it falls to states seeking to improve their techno-economic position to reduce the power of Big Techs, control the fundamental inputs of AI—especially data from their populations—and invest in solutions that reduce the environmental impact and labor precarization that automated systems have generated in capitalist countries. Betting on quality education for youth requires encouraging technodiversity and converting the cultural vitality of peoples into technological expressions.
This article is an exclusive contribution to Guancha.cn (Observer Network). Its content reflects the authors' personal views only and does not represent the platform's position. It may not be reproduced without authorization.
Editor: Zheng Lehuan