We finally know the cost of China's amazing DeepSeek model




Remember when DeepSeek briefly shook the entire artificial intelligence industry by launching its large language model, R1, which was trained on a fraction of the money that OpenAI and other big players were pouring into their models? Thanks to a new paper published by DeepSeek AI in the journal Nature, we finally know what it took to train DeepSeek R1: $294,000 and 512 Nvidia H800 chips. The reason for the low cost, it seems, is the team's use of trial-and-error reinforcement learning techniques.

Typically, AI models tasked with reasoning must be trained on human-annotated data and demonstrations to "learn" how to solve certain problems, which is both expensive and time-consuming as models are given more challenging tasks. DeepSeek found it could improve the reasoning and outputs of its model simply by incentivizing it to perform a trial-and-error process until it arrived at the correct answer.

In an article accompanying the paper, assistant professor Daphne Ippolito and PhD student Zhang explain the reinforcement method by comparing it to a child playing a video game: as the child moves their avatar through the game world, they learn through trial and error that some actions (such as collecting gold coins) earn points, while others (such as running into enemies) cost them. Similarly, the model was given a high score when it produced correct answers and a low score when it gave wrong ones.

Previous research has shown that a prompting approach, such as asking an LLM to provide a step-by-step explanation of how it reaches its answer, yields more accurate responses. But the DeepSeek team discovered a way to get better answers through reinforcement, by applying a scoring system to the outputs R1 produced. This works particularly well with math and programming questions, which usually have a verifiably correct answer. By using this method instead of human-guided reasoning, the LLM was able to reach a correct result on its own as it sought the highest score.
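The scoring idea can be sketched in a few lines. The snippet below is a toy illustration only, not DeepSeek's actual training code: all the function names are invented for this example. It assigns a simple rule-based reward to candidate answers to a question with a verifiable result (1.0 for an exact match, 0.0 otherwise) and keeps the highest-scoring candidate, which is the kind of signal a reinforcement learning loop would use to steer the model toward correct outputs.

```python
# Toy sketch of rule-based reward scoring for verifiable answers.
# Illustrative only; not DeepSeek's implementation.

def reward(candidate: str, correct_answer: str) -> float:
    """Return 1.0 if the candidate matches the known correct answer
    exactly, else 0.0 -- a verifiable, rule-based score."""
    return 1.0 if candidate.strip() == correct_answer.strip() else 0.0

def best_candidate(candidates: list[str], correct_answer: str) -> tuple[str, float]:
    """Score every sampled answer and keep the highest-reward one,
    mimicking how a reinforcement signal favors correct outputs."""
    scored = [(c, reward(c, correct_answer)) for c in candidates]
    return max(scored, key=lambda pair: pair[1])

# Example: three sampled answers to "What is 17 * 3?"
samples = ["41", "51", "52"]
best, score = best_candidate(samples, "51")
print(best, score)  # -> 51 1.0
```

This is also why the method fits math and code so well: the reward can be computed automatically (an exact match, a passing test suite), with no human grader in the loop.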

Although the outputs of this method appear more accurate, they make the machine's "thinking" process a bit harder for humans trying to follow it. When asked to produce a reasoning trail for its answer, the model would sometimes switch back and forth between English and Chinese. It also produced explanations that ran to 10,000 words or more. The method also worked well only for questions with clear right or wrong answers, rather than more nuanced or subjective prompts.

Regardless, it's an interesting window into how DeepSeek managed to compete on a smaller budget. Still, the company itself is surrounded by plenty of skepticism because of its perceived ties to the Chinese government. Recently, researchers showed the Washington Post that the company's model would refuse to produce code, or would produce code with major security flaws, when the prompter indicated they were working with groups the Chinese government considers sensitive. The researchers also found that the model generated less secure code when asked to do work for Tibet, Taiwan, the Falun Gong religious movement, or the Islamic State.




