DeepSeek’s hidden warning to AI safety

January 29, 2025

This month, the debut of DeepSeek R1 stunned Wall Street and Silicon Valley, spooking investors and impressing tech leaders. But amid all the talk, many overlooked a critical detail about how the new Chinese AI model works: a nuance that has researchers worried about humanity’s ability to control sophisticated new artificial-intelligence systems.

It all hinges on how DeepSeek R1 was trained, and on a surprising behavior exhibited by an early version of the model, which researchers described in the technical paper accompanying the release.

During testing, the researchers noticed that the model spontaneously switched between English and Chinese while it was solving problems. When they forced it to stick to one language, the system’s ability to solve the same problems diminished.

The discovery rang alarm bells for some AI safety researchers. Currently, the most capable AI systems “think” in languages legible to humans, writing out their reasoning before reaching a conclusion. That has been a boon for safety teams, whose most effective guardrails involve monitoring a model’s so-called “chain of thought” for signs of dangerous behavior. But DeepSeek’s results raised the possibility of a decoupling on the horizon: one in which new AI capabilities could be gained by freeing models from the constraints of human language altogether.
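
That kind of guardrail is easy to picture in code. Below is a minimal, hypothetical sketch of chain-of-thought monitoring: scan the model’s human-readable reasoning trace for red-flag patterns before trusting its answer. The patterns and function names are illustrative, not any lab’s actual system, and the whole approach only works while the trace stays legible.

```python
import re

# Illustrative red-flag patterns; real monitors are far more nuanced.
RED_FLAGS = [
    r"bypass (the )?safety",
    r"hide (this|my) (reasoning|intent)",
    r"the user must not know",
]

def monitor_chain_of_thought(reasoning_trace: str) -> list[str]:
    """Return every red-flag pattern found in a human-readable trace."""
    return [p for p in RED_FLAGS if re.search(p, reasoning_trace, re.IGNORECASE)]

trace = "First I will hide my reasoning, then answer the question."
flags = monitor_chain_of_thought(trace)
if flags:
    print("Flagged for review:", flags)  # meaningless once the trace stops being words
```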

To be sure, DeepSeek’s language switching is not itself cause for alarm. What worries researchers is the new innovation that caused it. DeepSeek’s paper describes a novel training method in which the model was rewarded purely for getting correct answers, regardless of how comprehensible its thinking process was to humans. The worry is that this incentive-based approach could eventually lead AI systems to develop entirely inscrutable ways of reasoning.
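
To make that incentive concrete: a purely outcome-based reward scores only the final answer and leaves the legibility of the reasoning unrewarded. The sketch below is a toy version of the idea, not DeepSeek’s actual pipeline, which layers reinforcement learning on top of such verifiable rewards.

```python
def outcome_only_reward(answer: str, reference: str) -> float:
    """Reward 1.0 for a correct final answer, 0.0 otherwise.

    Nothing in this signal scores the reasoning trace, so a model
    trained against it is free to drift toward illegible reasoning
    whenever that helps it answer correctly more often.
    """
    return 1.0 if answer.strip() == reference.strip() else 0.0

# A legibility-preserving alternative would also grade the trace, e.g.:
# reward = correctness - illegibility_penalty(trace)  # hypothetical helper
```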

If the AI industry ends up sacrificing legibility in its pursuit of more powerful systems, “it will take away what was looking like it could have been an easy win” for AI safety, says Sam Bowman, who leads a research department at Anthropic, an AI company, focused on “aligning” AI to human preferences. “Otherwise, we would forfeit an ability we might have had to keep an eye on them.”

Read more: What to know about DeepSeek, the Chinese AI company causing stock market chaos

Thinking without words

The idea of an AI creating its own distinct language is not as strange as it sounds.

Last December, Meta researchers set out to test the hypothesis that human language is not the optimal format for reasoning, and that large language models (or LLMs, the systems underpinning OpenAI’s ChatGPT and DeepSeek’s R1) might reason more efficiently and accurately if they were freed from that linguistic constraint.

Instead of carrying out its reasoning in words, the Meta researchers designed a model that reasoned using numerical patterns inside its neural network: raw sequences of numbers representing its internal reasoning state. This model, they discovered, began to generate what they called “continuous thoughts.” The numbers were completely opaque and inscrutable to human eyes. But the strategy, they found, created emergent advanced reasoning patterns in the model, and those patterns led to higher scores on several logical reasoning tasks compared to models that reasoned in human language.
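
In code, the core trick can be sketched in a few lines: instead of decoding the model’s final hidden state into a word, feed it straight back in as the next input. The snippet below assumes a Hugging Face-style causal transformer that accepts `inputs_embeds`; it is a simplified illustration of the idea, not Meta’s implementation.

```python
import torch

def latent_reasoning_steps(model, input_embeds: torch.Tensor, n_steps: int) -> torch.Tensor:
    """Append n_steps "continuous thoughts" instead of decoded tokens."""
    embeds = input_embeds  # shape: (batch, seq_len, hidden_dim)
    for _ in range(n_steps):
        out = model(inputs_embeds=embeds, output_hidden_states=True)
        last_hidden = out.hidden_states[-1][:, -1:, :]  # final position's state
        # The hidden state becomes the next "token" directly: it is never
        # decoded into the vocabulary, so the thought never becomes words.
        embeds = torch.cat([embeds, last_hidden], dim=1)
    return embeds
```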

Though the Meta research project was very different from DeepSeek’s, its findings dovetailed with the Chinese firm’s research in one crucial way.

Both DeepSeek and Meta showed that “human legibility imposes a tax” on the performance of AI systems, according to Jeremie Harris, the CEO of Gladstone AI, a firm that advises governments on AI safety issues. “In the limit, there’s no reason why [the AI’s thought process] should be easy for humans to read,” says Harris.

And that possibility has some safety experts on edge.

“It seems like the writing is on the wall that there is this other avenue available [for AI research], where you just optimize for the best reasoning you can get,” Bowman says. “I expect people to scale this up, and the risk is that we wind up with models where we can’t say what they are trying to do, what their values are, or how they will make hard decisions once we set them up as agents.”

For their part, the Meta researchers argued that their research need not end up sidelining humans. “It would be ideal for LLMs to have the freedom to reason without any language constraints, and then to translate their findings into language only when necessary,” they wrote in the paper. (Meta did not respond to a request for comment on the suggestion that the research could lead in a dangerous direction.)

Read more: Why DeepSeek is sparking debates over national security, just like TikTok

Language

Of course, even human-legible AI reasoning is not without its problems.

When AI systems explain their thinking in plain English, they may appear to be faithfully showing their work. But some experts are not sure these explanations actually reveal how the AI really makes its decisions. It may be like asking a politician for the motivations behind a policy: they can produce an explanation that sounds good but has little to do with the real decision-making process.

Having AI explain itself in human terms is not perfect, but many researchers believe it is better than the alternative: AI developing its own internal language that we cannot understand. Scientists are also working on other ways to peer inside AI systems, much as doctors use brain scans to study human thinking. But these methods are still new and have not yet given us a reliable way to make AI systems safer.
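
One of the simplest of those “brain scan” techniques is a linear probe: train a small classifier to read a property of interest straight off a model’s internal activations. The toy sketch below uses random stand-in activations and a synthetic label purely to show the mechanic.

```python
import torch

hidden = torch.randn(1000, 768)      # stand-in activations from some layer
labels = (hidden[:, 0] > 0).long()   # synthetic property to detect

probe = torch.nn.Linear(768, 2)
opt = torch.optim.Adam(probe.parameters(), lr=1e-2)
loss_fn = torch.nn.CrossEntropyLoss()

for _ in range(200):
    opt.zero_grad()
    loss_fn(probe(hidden), labels).backward()
    opt.step()

acc = (probe(hidden).argmax(dim=1) == labels).float().mean().item()
print(f"probe accuracy: {acc:.2f}")  # high accuracy => the property is linearly readable
```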

As a result, many researchers remain skeptical of encouraging AI to reason in ways other than human language.

“If we don’t pursue this path, we’ll be in a better position for safety,” says Bowman. “If we do, we will have removed what looks like our best point of leverage on some scary open problems in alignment that we have not yet solved.”
