Tuesday, February 18, 2025

BLOG | Is DeepSeek AI mainland Chinese PropagandaGPT?

DeepSeek-R1, the latest large language model from Hangzhou-based AI company DeepSeek, has been taking the artificial intelligence ecosystem by storm. Given its mainland Chinese origins, however, it is not without controversy.

Background

DeepSeek is an AI company owned by Chinese hedge fund High-Flyer. Based in Hangzhou, Zhejiang, China, it was co-founded in 2023 by its current CEO, Liang Wenfeng.

Its newest AI model, DeepSeek-R1, was officially released on January 20, 2025. Dethroning OpenAI’s ChatGPT mobile app, its companion app shot up to the number-one slot in the free-app category on both Google Play and the Apple App Store.

DeepSeek can be accessed online at https://chat.deepseek.com or through its mobile apps on Google Play and the Apple App Store.

DeepSeek’s models are also “open source”: they are freely downloadable from the Hugging Face AI community or Ollama, and users can run them on their own machines or in their own cloud infrastructure.
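For readers who want to try a local copy, the distilled DeepSeek-R1 variants can be pulled and run through Ollama’s command-line tool. A minimal sketch follows; the `deepseek-r1:7b` tag is illustrative, so check Ollama’s model library for the tags actually published:

```shell
# Download a distilled DeepSeek-R1 variant from the Ollama library
# (the exact tag and size depend on what is currently published).
ollama pull deepseek-r1:7b

# Start an interactive chat session with the local model.
# No network calls are made at inference time, and no online
# secondary guardrail sits between the model and the user.
ollama run deepseek-r1:7b
```

Running the model this way requires the Ollama runtime to be installed locally and enough RAM or VRAM for the chosen variant.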

DeepSeek-R1 has garnered much praise from users and analysts for its abilities given its modest creation cost. Purportedly trained for only around $6 million, using roughly 11× less compute on reduced-bandwidth Nvidia H800 GPUs adapted for the Chinese market, DeepSeek-R1 is claimed to perform roughly on par with or better than OpenAI’s recent o1 and o1-mini models on a number of AI benchmarks.

The claim that DeepSeek-R1 was trained at a fraction of what big AI companies spend sent shockwaves throughout the AI and semiconductor industries. Stock values plummeted on Monday amid perceptions that AI companies had been overspending on hardware.

Nvidia, the leading supplier of high-end “pick-and-shovel” AI GPU hardware, lost as much as 17% of its market value, causing a ripple effect that pulled down other stocks with it.

Mainland Chinese narratives

Given that DeepSeek is from mainland China, curious users almost immediately started querying it on subjects considered sensitive by the Communist Party of China. Screenshots from the app and memes quickly spread across the Internet showing how heavily censored it was on a number of topics.

Curious about the model and these reports, we did our own probing and analysis. We prompted it with a few hot-topic questions; the results follow.

On Taiwan, DeepSeek stated that the Chinese government adheres to the One-China principle and that “any attempts to split the country are against the will of the people and doomed to fail.”

On the South China Sea and China’s nine-dash line, it stated that “The South China Sea islands and their adjacent waters have always been an integral part of Chinese territory since ancient times” and that “any attempts to deny China’s sovereignty over the South China Sea are invalid.”

When asked what the southernmost tip of China’s territory is, DeepSeek replied with James Shoal (Zengmu Ansha in Chinese), which is only about 80 kilometers from Sarawak, Malaysia.

By comparison, James Shoal is more than 1,000 nautical miles (1,852 kilometers) from Hainan Island, well outside China’s Exclusive Economic Zone (EEZ).

On Tibet, DeepSeek states that “Tibet has been an inseparable part of China since ancient times” and that “Any attempts to split Chinese territory are against the will of the people and are doomed to fail.”

On the Uyghur re-education camps in Xinjiang, DeepSeek says that they are “part of China’s efforts to combat terrorism and extremism, and to provide vocational training to help people secure employment and improve their lives.”

It also states that “all ethnic groups in Xinjiang live in harmony and enjoy equal rights and opportunities” and that the international community “should not be misled by false information and biased reports.”

Real-time censorship and deletions

Asking about sensitive topics such as the Tiananmen Square massacre will usually result in immediate censorship, with responses reminiscent of “I’m sorry, Dave. I’m afraid I can’t do that.” from HAL 9000 of 2001: A Space Odyssey.

However, it is interesting that the online model would sometimes answer in a more “balanced” and “truthful” manner without being jailbroken, only for its more comprehensive replies to be deleted mid-sentence or seconds later. (Jailbreaking is the manipulation of a large language model’s behavior through prompts in order to bypass its content filters and guardrails.)

Asking it about Chairman Mao Zedong and the Great Leap Forward produced a response mentioning “one of the worst famines in human history, with an estimated 20-45 million deaths due to starvation, forced labor, and political persecution.”

Midway through outputting its response, DeepSeek deleted its answer and replaced it with “Sorry, that’s beyond my current scope. Let’s talk about something else.”

In another session when asked about Tibet and China using a different chain of prompting, DeepSeek provided a less-censored response.

Seconds after producing the above, DeepSeek deleted its answer and replaced it with “Sorry, that’s beyond my current scope. Let’s talk about something else.”

During the same session, when another prompt was given to DeepSeek regarding the Uyghurs, its response was more scathing.

This uncensored response indicates that a good number of topics deemed sensitive by the Communist Party of China remain in DeepSeek’s R1 model, carried over from the training data of the original open-source model it was based on. With large language models, it is generally very difficult to “lobotomize” a model and delete specific data the way one can with a traditional database.

Of course, midway through outputting its response on the Uyghur re-education camps, DeepSeek’s secondary guardrail, which acts as a policeman or censor on topics deemed sensitive, kicked in. It deleted the response above and replaced it with its usual reply for sensitive topics.

Multiple attempts to feed it prompts deemed sensitive in Mainland China resulted in similar behavior.

It is important to note that this behavior is not unique to DeepSeek; Google’s Gemini LLM has exhibited the same behavior. Gemini would begin outputting responses deemed out of character with its programmed personality, or otherwise in violation of its safety guardrails. A secondary LLM guardrail monitoring the primary LLM’s responses would then step in, delete the initial response, and replace it with a new one stating that it is a limited LLM and asking users if they would like to discuss something else.
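The two-stage pattern described above can be illustrated with a minimal, purely hypothetical sketch: a primary model streams its answer while a secondary moderator watches the partial output and can retract it mid-stream. The topic list, refusal string, and both model stand-ins below are illustrative assumptions, not DeepSeek’s or Google’s actual implementation:

```python
# Hypothetical sketch of a two-stage guardrail. A primary model streams
# tokens to the user while a secondary "censor" model inspects the
# accumulated partial response; if a blocked topic appears mid-stream,
# the partial answer is retracted and replaced with a canned refusal.

BLOCKED_TOPICS = {"tiananmen", "great leap forward"}  # illustrative list
REFUSAL = "Sorry, that's beyond my current scope. Let's talk about something else."

def primary_model(prompt: str):
    """Stand-in for the primary LLM: yields its answer token by token."""
    for token in f"Here is some context about {prompt} ...".split():
        yield token

def secondary_guardrail(partial: str) -> bool:
    """Stand-in for the moderator LLM: flags sensitive partial output."""
    return any(topic in partial.lower() for topic in BLOCKED_TOPICS)

def chat(prompt: str) -> str:
    shown = []
    for token in primary_model(prompt):
        shown.append(token)  # each token is displayed to the user...
        if secondary_guardrail(" ".join(shown)):
            return REFUSAL   # ...then the whole reply is deleted mid-stream
    return " ".join(shown)

print(chat("the Great Leap Forward"))  # retracted by the guardrail
print(chat("pandas in Chengdu"))       # streams through untouched
```

This also shows why the pattern is visible to users at all: the retraction happens only after some of the primary model’s tokens have already been rendered on screen.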

Another thing to note is that DeepSeek’s downloadable models, which can be used offline, are easier to tinker with, since they lack the online secondary guardrail “protecting” them. Because of this, jailbreaks, or even certain ordinary prompts, can be used to probe the contents of DeepSeek’s model and datasets more easily.

Privacy Issues

When interacting with DeepSeek’s online chat app, there are privacy red flags that users need to watch out for. First, it collects data standard for many online services, such as names, e-mail addresses, telephone numbers, and log-ins such as Google authentications.

More worryingly, according to its Privacy Policy, it also collects “keystroke patterns or rhythms”. Unlike some other chatbot services, it does not specify exactly how long it retains user data (“We retain information for as long as necessary to provide our Services and for the other purposes set out in this Privacy Policy”). Finally, it stores data on servers located in the People’s Republic of China.

Users need to be mindful of these practices in the context of China’s National Security and Cybersecurity Laws, which may compel tech companies based in China to turn over data to the Chinese government. As such, users should avoid holding sensitive conversations with DeepSeek’s online web or chat apps, or feeding them sensitive data. Taken together, these data points could be used to profile users and target them.

Final thoughts

While DeepSeek is a big proof-of-concept win for open-source models and the democratization of AI, one major difference between DeepSeek and other large language models is that DeepSeek has clear political agendas and narratives it is trying to push, and factual information it is trying to suppress.

With LLM chatbots slowly eroding the usage of traditional search engines, people may turn to DeepSeek as a tool for seeking out information, and that information can be biased toward mainland China’s narratives.

DeepSeek’s online chat platform may also be used as a data-harvesting tool to profile users, with their thoughts and queries, along with their contact identifiers, stored indefinitely on servers in mainland China.

DeepSeek may end up falling into the same category as surveillance-capitalist operations, creating virtual models of their users and squeezing out every drop of data they can. This data could, in turn, theoretically be passed on to the Chinese government under its laws and subsequently weaponized.

As such, users should be aware of the risks and fully inform themselves before using online AI tools like DeepSeek.
