
- Hugging Face’s Thomas Wolf says it is getting difficult to know which AI model is the best as traditional AI benchmarks become saturated. To move forward, Wolf said, the AI industry should rely on two new kinds of benchmarks: agentic evaluations and use-case-specific ones.
Thomas Wolf, co-founder and chief science officer of AI company Hugging Face, believes we may need new ways to measure AI models.
Wolf told the audience at Fortune’s Brainstorm AI conference in London that as AI models advance, it is increasingly difficult to tell which one is best.
“It’s hard to know which model is the best,” he said, noting the nominal differences between recent releases from OpenAI and Google. “It seems like they are all, in fact, very close.”
“The world of benchmarks has evolved a lot. We used to have these very academic benchmarks that mostly measured the knowledge of the model. I think the most famous is MMLU (Massive Multitask Language Understanding), which was essentially a set of graduate-level or PhD-level questions the model had to answer,” he said. “These benchmarks are mostly saturated now.”
Over the past year, a growing chorus of voices from academia, industry, and policy has argued that common AI benchmarks, such as MMLU, GLUE, and HellaSwag, may have reached saturation and no longer reflect real-world usefulness.
In a paper published in February, researchers at the European Commission’s Joint Research Centre asked, “Can we trust AI benchmarks?” They found “systemic flaws in current benchmarking practices,” including misaligned incentives, design failures, gamed results, and data contamination.
To move forward, Wolf said, the AI industry should rely on two main types of benchmarks in 2025: one to evaluate models’ agentic capabilities, where LLMs are expected to carry out tasks, and another designed specifically for each model use case.
Hugging Face is already working on the latter.
The company’s new tool, YourBench, aims to help users determine which model to use for a specific task. Users feed documents into the tool, which then automatically creates a benchmark tailored to that type of work; users can run it against different models to see which is best for their use case.
“Just because these models all perform similarly on these academic benchmarks, it doesn’t really mean they are the same,” Wolf said.
Open source’s ‘ChatGPT moment’
Founded by Wolf, Clément Delangue, and Julien Chaumond in 2016, Hugging Face has long been a champion of open-source AI.
Often referred to as the GitHub of machine learning, the company provides an open platform that enables developers, researchers, and institutions to build and share models, datasets, and applications at scale. Users can also browse models and datasets uploaded by others.
Wolf told Brainstorm AI attendees that Hugging Face’s “business model is really aligned with open source” and that the company’s goal is to have “the maximum number of people participating in this kind of open community and sharing models.”
Wolf predicted that open-source AI will continue to thrive, especially after DeepSeek’s success earlier this year.
After its launch late last year, the Chinese AI model DeepSeek R1 sent shockwaves through the AI world when it matched, or even exceeded, the performance of leading models.
Wolf said DeepSeek was open-source AI’s “ChatGPT moment.”
“Just as ChatGPT was the moment the whole world discovered AI, DeepSeek was the moment the whole world discovered that there is this kind of open community,” he said.
This story was originally featured on Fortune.com