DeepSeek claims its reasoning model beats OpenAI’s o1 on certain benchmarks




Chinese AI lab DeepSeek has released an open version of DeepSeek-R1, its so-called reasoning model, which it claims performs as well as OpenAI’s o1 on certain AI benchmarks.

R1 is available from the AI development platform Hugging Face under an MIT license, meaning it can be used commercially without restrictions. According to DeepSeek, R1 beats o1 on the AIME, MATH-500, and SWE-bench Verified benchmarks. AIME is drawn from the American Invitational Mathematics Examination, a competition-level math test, while MATH-500 is a set of word problems. SWE-bench Verified, meanwhile, focuses on programming tasks.

Being a reasoning model, R1 effectively fact-checks itself, which helps it avoid some of the pitfalls that usually trip up models. Reasoning models take a little longer, typically seconds to minutes longer, to arrive at solutions than a typical non-reasoning model. The upside is that they tend to be more reliable in areas such as physics, science, and mathematics.

R1 contains 671 billion parameters, DeepSeek revealed in a technical report. Parameters roughly correspond to a model’s problem-solving skills, and models with more parameters generally perform better than those with fewer.

671 billion parameters is a huge number, but DeepSeek has also released “distilled” versions of R1 ranging in size from 1.5 billion to 70 billion parameters. The smallest can run on a laptop. The full R1 requires more powerful hardware, but it is available through DeepSeek’s API at prices 90% to 95% cheaper than OpenAI’s o1.

There is a downside to R1. Being a Chinese model, it is subject to benchmarking by China’s internet regulator to ensure that its responses “embody core socialist values.” R1 will not answer questions about Tiananmen Square, for example, or Taiwan’s autonomy.

R1 refusing to answer a question. Image credits: DeepSeek

Many Chinese AI systems, including other reasoning models, decline to respond to topics that might raise the ire of the country’s regulators, such as speculation about the Xi Jinping regime.

R1 arrives days after the outgoing Biden administration proposed harsher export rules and restrictions on AI technologies for Chinese ventures. Companies in China were already barred from buying advanced AI chips, but if the new rules take effect as written, they will face tougher restrictions on both the semiconductor technology and the models needed to build advanced AI systems.

In a policy document last week, OpenAI urged the U.S. government to support the development of American AI, lest Chinese models match or surpass it in capability. In an interview with The Information, Chris Lehane, OpenAI’s VP of policy, singled out High Flyer Capital Management, DeepSeek’s parent company, as an organization of particular concern.

So far, at least three Chinese labs, DeepSeek, Alibaba, and Kimi (which is owned by Chinese unicorn Moonshot AI), have produced models that they claim rival o1. (Of note, DeepSeek was the first, announcing a preview of R1 in late November.) In a post on X, Dean Ball, an AI researcher at George Mason University, said the trend suggests Chinese AI labs will continue to be “fast followers.”

“The impressive performance of DeepSeek’s distilled models (…) means that very capable reasoners will continue to proliferate widely and be runnable on local hardware, far from the eyes of any top-down control regime,” Ball wrote.


