DeepSeek rattles U.S. markets

Photo courtesy of Georgia Tech

Chinese AI startup DeepSeek launched two large-language models (LLMs) on Jan. 20: DeepSeek-R1-Zero and DeepSeek-R1-Distill. Almost immediately, the app topped Apple’s App Store download charts, as the new models rivaled the performance of leading LLMs at a fraction of the cost.

Since its launch, DeepSeek has averaged 1.8 million daily active users. For Tech students like Rashmith Repala, third-year CS, it has changed the way they learn.

“I utilized DeepSeek to study for a recent Entrepreneurial Finance exam,” Repala said. “I really like the reasoning part of it, …  I like reading through how the model thinks, it helps me learn the concepts much better.”

Traditional language models rely on supervised fine-tuning (SFT), in which they learn from large sets of labeled data. This often means developers need to secure extensive datasets to train their LLMs, a process that can be both costly and time-consuming. DeepSeek trained its models differently, using large-scale reinforcement learning (RL): the model taught itself to solve complex mathematical and coding tasks through iterative trial and error, keeping the strategies that led to correct answers. Instead of learning from human-labeled examples, DeepSeek’s models develop their own reasoning to draw conclusions.
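As a rough illustration (a hypothetical sketch, not DeepSeek’s actual training code), verifiable tasks like math let a simple automatic check stand in for human-labeled answers, which is what makes this style of RL comparatively cheap:

```python
# Hypothetical sketch of a rule-based reward for RL training.
# Verifiable tasks (math, code) can be graded automatically,
# so no human-labeled training examples are needed.

def reward(model_answer: str, correct_answer: str) -> float:
    """Return 1.0 if the model's final answer matches the known
    solution, 0.0 otherwise."""
    return 1.0 if model_answer.strip() == correct_answer.strip() else 0.0

# Over many attempts, the training loop reinforces whatever chain of
# reasoning earned a reward and discourages the rest: trial and error.
print(reward("14", "14"))  # 1.0 -> reinforce this reasoning path
print(reward("12", "14"))  # 0.0 -> discourage it
```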

DeepSeek-R1-Zero, with 671 billion parameters, achieved a 71% pass rate on the 2024 American Invitational Mathematics Examination (AIME), which features questions on algebra, geometry, number theory and combinatorics. That result lets DeepSeek-R1-Zero rival OpenAI’s o1-0912 model, which scored 74.4% on the same exam.

Even the smaller models scored relatively well on the exam: the 7B-parameter DeepSeek-R1-Distill achieved 55.5% accuracy, and the smallest, 1.5B-parameter model achieved 28.9%. For perspective, QwQ-32B-Preview, a 32B-parameter model trained for logic and math, scored 50%.

Another differentiating factor of DeepSeek is its release as open source under the MIT license. Unlike proprietary models such as OpenAI’s o1, the DeepSeek-R1 models can be downloaded, enhanced, taken apart or copied. A group of researchers at UC Berkeley claims to have replicated the core technology of DeepSeek-R1, albeit at a smaller scale, and further innovation seems imminent as a growing community builds on the publicly available code base.
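Because the weights are public, anyone with capable hardware can download and run a distilled model locally. The minimal sketch below uses the Hugging Face transformers library; the model ID and generation settings are assumptions based on DeepSeek’s published 1.5B distill, so check the official repositories before relying on them:

```python
# Minimal sketch: load and run an open-weight DeepSeek-R1 distilled
# model with Hugging Face transformers. The model ID below is assumed
# to match DeepSeek's published 1.5B distill.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "deepseek-ai/DeepSeek-R1-Distill-Qwen-1.5B"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id)

# Ask a simple question and print the model's full response,
# including its step-by-step reasoning.
inputs = tokenizer("What is 7 * 8?", return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=200)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```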

The success of DeepSeek came at a considerable cost to U.S. technology companies, which had previously dominated the AI sector. On the first day of trading after the DeepSeek news, leading AI chipmaker Nvidia suffered the largest single-day loss of market value in stock market history, shedding roughly $590 billion in market capitalization.

The tech world seems divided, with some praising the innovation and others expressing uncertainty. “I think DeepSeek has had some real innovation,” said Satya Nadella, CEO of Microsoft. On the other hand, numerous independent experts have highlighted how the model censors politically sensitive information that portrays China in a negative light.

These developments come amid growing fear of Chinese companies gathering U.S. data. On Jan. 18, two days before the DeepSeek news, TikTok was temporarily banned in the U.S. over concerns about how its Chinese parent company, ByteDance, handles user data. The ban has since been lifted to give ByteDance time to find a buyer for its U.S. operations.

Following the DeepSeek news, Texas Gov. Greg Abbott banned DeepSeek on personal and state-owned devices used by state employees and contractors.

“Texas will not allow the Chinese Communist Party to infiltrate our state’s critical infrastructure through data-harvesting AI,” said Abbott.

However, growing data distrust extends beyond DeepSeek. OpenAI, a U.S.-based company, faces a growing number of lawsuits over its data collection and handling, several brought by news publishers and authors who allege that its models were trained on their work without consent. OpenAI’s advantage, though, is that it raises fewer national security concerns than DeepSeek.

“I still do not trust DeepSeek with my data. I do not upload personal documents because I do not know where the data will be saved or stored. Even though there are doubts about my data privacy [with] OpenAI, I trust it more than DeepSeek,” said Repala. He shares a common sentiment: ChatGPT, OpenAI’s chatbot, still sees nearly 300 million weekly active users despite the lawsuits.

There has yet to be any large-scale U.S. legislative action regarding DeepSeek, but the model has already had major effects on the U.S. market. Its long-term impact remains to be seen, but DeepSeek undeniably marks a turning point in AI.
