
DeepSeek: From Buzz to Bias

Feb 10

5 min read


As of today, everyone is talking about DeepSeek. Yet, until recently, few had even heard of it. That all changed on January 20, 2025, when DeepSeek unveiled its R1 model—a direct rival to OpenAI's o1. Notably, this was also the day Donald Trump was inaugurated as the 47th President of the United States.


There’s no need for me to elaborate on DeepSeek’s extraordinary capabilities, as I assume you are already familiar with how it has shaken Silicon Valley’s dominance in global AI. By January 27, DeepSeek’s rise had wiped nearly $1 trillion from U.S. stock markets.


Since then, I’ve been inundated with interview requests to explain what transpired. Yet, despite all the coverage, more questions remain than answers. The conversation has shifted from hype-filled buzz to debates about bias, while propaganda machines on both sides of the ocean operate at full throttle. With that in mind, I’ve compiled and addressed some of the most common “buzz and biased” questions I received at the end of January.


What made DeepSeek beat ChatGPT?


We could dive into the technical specifics—its innovative Mixture-of-Experts architecture, memory optimizations, low-level programming beyond CUDA, use of synthetic data, and automated reinforcement learning—but let’s skip that. After all, you can always "ChatGPT it."
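For readers who do want a feel for one of those ingredients: a Mixture-of-Experts layer routes each input to only a few "expert" subnetworks instead of running the whole model, which is where much of the compute saving comes from. The following is a toy Python sketch of top-k gating, not DeepSeek's actual implementation; every name, size, and weight here is made up for illustration.

```python
import numpy as np

def moe_forward(x, gate_w, experts, top_k=2):
    """Toy Mixture-of-Experts layer: route the input to its top-k experts.

    x       : (d,) input vector
    gate_w  : (n_experts, d) gating weights
    experts : list of callables, each mapping a (d,) vector to a (d,) vector
    """
    logits = gate_w @ x                    # one gating score per expert
    top = np.argsort(logits)[-top_k:]      # indices of the k highest-scoring experts
    weights = np.exp(logits[top])
    weights /= weights.sum()               # softmax over only the selected experts
    # Only the chosen experts are evaluated; the rest of the network stays idle.
    return sum(w * experts[i](x) for w, i in zip(weights, top))

rng = np.random.default_rng(0)
d, n_experts = 4, 8
gate_w = rng.normal(size=(n_experts, d))
# Each "expert" is just a random linear map in this sketch.
experts = [(lambda W: (lambda v: W @ v))(rng.normal(size=(d, d)))
           for _ in range(n_experts)]
y = moe_forward(rng.normal(size=d), gate_w, experts)
print(y.shape)  # (4,)
```

With top_k=2 out of 8 experts, only a quarter of the expert parameters are touched per input, which is the basic trade-off MoE architectures exploit at scale.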


What sets DeepSeek apart is its extraordinary combination of power, affordability (even free), and accessibility due to its open-source nature. While ChatGPT is undeniably powerful, it lacks the same level of accessibility and affordability. DeepSeek’s groundbreaking optimizations are impressive, though they could just as easily have come from OpenAI, Anthropic, or Meta.


The real surprise? A Chinese company achieved this first. It revealed a persistent bias among business leaders who still underestimate China’s ability to innovate at the forefront of emerging trends. As a result, much of the global discourse has focused more on the viability of Silicon Valley’s scaling-law models and on the stock market losses than on the transformative potential DeepSeek brings to the world.


Did DeepSeek come out of nowhere?


The answer is no. DeepSeek V2 had already been recognized as the 7th most powerful large language model (LLM) in China, according to SUPERCLUE. Its founder and CEO, Liang Wenfeng, has spent nearly a decade developing AI-driven strategies for quantitative trading at his hedge fund. Remarkably, DeepSeek was built without the fanfare and hype surrounding OpenAI. This achievement demonstrates how a dedicated and intelligent research team—even one based in China—can rival the elite and expensive minds of Silicon Valley. Interestingly, Sam Altman remarked a year ago that it would be "hopeless" for a small startup with limited resources to compete with OpenAI in training foundational models. Since the emergence of DeepSeek, his perspective has shifted.


What I find most impressive isn’t that DeepSeek was relatively unknown, but that its team is entirely Chinese, with no overseas experience, and minimal prior work history. This was truly a David versus Goliath battle—not in terms of financial resources, but in expertise—and David emerged victorious. It was precisely their lack of conventional experience that allowed them to think outside the box and pioneer these groundbreaking innovations.


It’s time to finally dispel the myth that Chinese innovation is constrained by China’s "governance system," or that China can only mimic or steal to stay competitive. DeepSeek’s success was driven by a determination to prove these stereotypes wrong and showcase the untapped potential of Chinese innovation. To further solidify this point, they open-sourced their model, inviting the world to look under the hood and see their ingenuity firsthand.


Did DeepSeek really cost only 5.5 million USD?


The short answer is no. While it’s true that the model’s final training likely required only a few thousand NVIDIA H800 chips (a less expensive alternative to the H100), DeepSeek’s hedge fund had access to significantly more chips during the pre-training phase. Comparing OpenAI’s $1 billion training costs to DeepSeek’s $5.5 million is, therefore, an unfair comparison. However, this does highlight the ingenuity of Chinese developers, who managed to train a leading AI model despite limited access to the same high-end chips available to OpenAI.


Before DeepSeek V3’s release, most AI analysts estimated that China was 12 to 18 months behind the U.S. in AI capabilities. Today, that gap can be measured in just a few months, at most. This rapid advancement has effectively ignited a new U.S.-China AI rivalry. I expect this will also push Chinese companies further to accelerate efforts in developing a domestic AI chip market. The AI chip sector deserves as much attention as the LLM market, as they will inevitably evolve in tandem. Additionally, we can anticipate an increase in chip protectionism from the U.S., along with retaliatory measures from China, further intensifying the US-China tech war tensions in 2025.


Is DeepSeek censored?


The answer is clearly yes. Critics often point to DeepSeek’s responses to geopolitical topics—such as Tiananmen in 1989, Hong Kong, the Uyghurs, or Taiwan’s status—citing its standard reply: “Sorry, that’s beyond my current scope. Let’s talk about something else.” This has fueled claims that DeepSeek is censored. However, such accusations risk missing the bigger picture.


Consider this: Have we tried asking ChatGPT these same politically sensitive questions about China in 2024? Most likely we didn’t. The reality is that DeepSeek’s reasoning capabilities are immense, so focusing solely on its censorship overlooks its broader potential. Moreover, China’s censorship rules are a legal requirement for DeepSeek, as for any other internet service operating in China.


Importantly, DeepSeek is fully open-source. If you’re concerned about using Chinese servers, you can download the open-sourced model weights and run it on your own machine, or access it through platforms like Microsoft Azure or Perplexity. It’s time to move past the debate about censorship and focus on the remarkable innovations DeepSeek brings to the table.


Did DeepSeek use OpenAI’s output data?


The answer is yes. At the end of January, OpenAI and Microsoft began investigating whether DeepSeek had employed a technique called “distillation.” DeepSeek also engaged in web data collection—commonly referred to as “scraping”—a practice OpenAI had used before them. Interestingly, a week later, OpenAI chose not to pursue legal action against DeepSeek, and Microsoft even made DeepSeek available on its Azure platform. In many ways, DeepSeek played the role of a Robin Hood for the AI world, democratizing large language models and liberating both China and global AI developers from Silicon Valley’s billion-dollar castles and moats.
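To make the term concrete: "distillation" means training a smaller or newer model to reproduce the output distribution of a stronger "teacher" model. Below is a minimal Python sketch of the classic soft-label distillation loss, not DeepSeek's or OpenAI's actual training recipe; the temperature value and logits are illustrative.

```python
import math

def softmax(logits, T=1.0):
    """Temperature-scaled softmax; a higher T softens the distribution."""
    exps = [math.exp(z / T) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

def distill_loss(teacher_logits, student_logits, T=2.0):
    """KL divergence between softened teacher and student distributions.

    A student model is trained to minimize this quantity, i.e. to imitate
    the teacher's full output distribution rather than just its top answer.
    """
    p = softmax(teacher_logits, T)   # teacher's soft targets
    q = softmax(student_logits, T)   # student's predictions
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q))

teacher = [2.0, 1.0, 0.1]
# A student whose logits already match the teacher incurs zero loss.
print(distill_loss(teacher, teacher))          # 0.0
# A uniform (untrained) student incurs a positive loss.
print(distill_loss(teacher, [0.0, 0.0, 0.0]))  # positive
```

The allegation against DeepSeek, in these terms, is that OpenAI model outputs served as the teacher signal; the sketch only shows why such outputs are valuable training data.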


This isn’t simply a China versus U.S. AI conflict; it’s a broader battle between open-source and proprietary approaches. DeepSeek embodies the push for AI democratization, making cutting-edge models accessible to all. The U.S. chip export restrictions, such as the ban on selling NVIDIA’s H100 chips to China, inadvertently nudged China towards embracing open-source innovation. But beyond that, this approach reflects a cultural shift—rooted in what is known as “ecosystem thinking.”


Why was this unexpected?


Geopolitics has often blinded us to the potential of Chinese grassroots innovation. China is rapidly closing the gap with Silicon Valley’s LLMs, and a breakthrough was inevitable; it could have come from any of a dozen impressive Chinese models. Surprisingly, DeepSeek wasn’t even counted among the “six little AI dragons” that secured major funding: Moonshot AI, Minimax, Baichuan, Zhipu.ai, 01.AI and Stepfun. Why don’t Moonshot’s Kimi 1.5 or Alibaba’s Qwen 2.5-Max, both of which rival DeepSeek today, receive the same level of attention? The answer lies in the media’s preference for stories driven by hype or controversy rather than nuanced trends.


Looking ahead, Chinese generative AI is poised to revolutionize industries in 2025 by delivering optimized LLMs that require fewer resources. DeepSeek isn’t a fluke—it’s the product of a Chinese innovation methodology. While China matches the U.S. in capital, talent, infrastructure, and data, what sets it apart is an inverted approach to innovation. Chinese AI models aren’t designed to “sustain the technology leadership” of a handful of elite firms and drive their stock prices; they’re built on “sustainable technologies” focused on scalability and real-world impact. It’s a classic case of the hunted adapting to outwit the privileged poachers.
