OpenAI’s GPT-5 announcement last week was supposed to be a victory lap, proof that the company was still the undisputed leader in artificial intelligence. It was anything but. Over the weekend, a wave of customer frustration turned the launch into something closer to a crisis of confidence. Users mourned the loss of favorite models, which had doubled as therapists, friends, and romantic partners. Developers complained of degraded performance. Industry watchers dubbed GPT-5 “late, overhyped, and underwhelming.”
For many, the culprit was hiding in plain sight: a new real-time model “router” that automatically decides which of several GPT-5 variants should handle each job. Many users assumed GPT-5 was a single model trained from scratch. In fact, it is a network of models, some weaker and cheaper, others stronger and more expensive, stitched together. Experts say this approach may well be the future of artificial intelligence as large language models keep advancing and become ever more resource-intensive. But with GPT-5’s debut, OpenAI surfaced some of the approach’s challenges, and learned some hard lessons about how user expectations have evolved in the AI era.
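The basic idea of a router can be sketched in a few lines. The following is a minimal, illustrative sketch, assuming a hypothetical two-tier setup with one cheap fast model and one expensive reasoning model; the heuristics and model names are invented for illustration and are not OpenAI’s.

```python
# Toy sketch of a real-time model router. The tier names and the
# routing heuristic are hypothetical, not OpenAI's actual system.

def needs_deep_reasoning(query: str) -> bool:
    """Crude heuristic: escalate long or reasoning-flavored queries."""
    reasoning_markers = ("prove", "step by step", "derive", "debug")
    return len(query.split()) > 50 or any(
        marker in query.lower() for marker in reasoning_markers
    )

def route(query: str) -> str:
    """Pick one model tier for the whole query."""
    if needs_deep_reasoning(query):
        return "expensive-thinking-model"
    return "cheap-fast-model"

print(route("What is the capital of France?"))
print(route("Prove step by step that sqrt(2) is irrational."))
```

In a production system the routing decision would itself be a learned model rather than keyword rules, which is part of why, as the experts quoted below note, getting it right is far from trivial.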
For all the promised benefits of model routing, many GPT-5 users bristled at what they saw as a loss of control; some even suggested that OpenAI might be deliberately trying to pull the wool over their eyes.
In response to the backlash, OpenAI moved quickly to restore the previous flagship model, GPT-4o, for paying users. The company also said it was committed to improving model routing, raised usage limits, and promised continuing updates to restore users’ trust and stability.
Anand Chaudhry, co-founder of AI firm FirstQuadrant, summed up the situation: “When routing works, it feels like magic. When it breaks, it feels like a downgrade.”
The promise and the inconsistency of model routing
Jiaxuan You, an assistant professor of computer science at the University of Illinois Urbana-Champaign, told Fortune his lab has studied both the promise and the inconsistency of model routing. In GPT-5’s case, he said, he believes (though he cannot confirm) that the router sometimes sends different parts of the same query to different models. A cheaper, faster model may produce one part of an answer while a slower, reasoning-focused model produces another, and when the system stitches those responses together, subtle inconsistencies slip in.
The idea of model routing is intuitive, he explained, but “making it actually work is not trivial at all.” Perfecting a router, he added, can be as hard as building an Amazon-grade recommendation system, which takes years and many domain experts to refine. “GPT-5 is supposed to be built with orders of magnitude more resources,” he noted, adding that even when the router picks a smaller model, it should not produce inconsistent answers.
Still, You believes routing is here to stay. “The community also believes model routing is promising,” he said, pointing to both technical and economic reasons. Technically, single-model performance appears to be hitting a plateau. He pointed to the commonly cited scaling laws, which say that as a model gets more data and compute, it improves. “But we all know the model will not improve without limit,” he said. “Over the past year, we have all witnessed the capability of a single model already saturating.”
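The diminishing returns You describes fall directly out of the power-law form of the scaling laws. The sketch below uses the rough parameter-scaling constants reported by Kaplan et al. (2020) purely to illustrate the shape of the curve, not to model any real system.

```python
# Illustrative sketch of neural scaling laws: test loss falls as a
# power law in parameter count, L(N) = (N_c / N) ** alpha, so each
# extra order of magnitude of scale buys a smaller absolute gain.
# Constants are rough values from Kaplan et al. (2020), for shape only.

ALPHA = 0.076      # scaling exponent for parameter count
N_C = 8.8e13       # critical parameter-count constant

def loss(n_params: float) -> float:
    """Power-law loss as a function of model size."""
    return (N_C / n_params) ** ALPHA

sizes = [1e9, 1e10, 1e11, 1e12]                        # 1B -> 1T parameters
gains = [loss(a) - loss(b) for a, b in zip(sizes, sizes[1:])]
print(gains)  # each gain is positive but strictly smaller than the last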
Economically, routing lets AI providers keep using older models rather than discarding them when a new one launches. Queries about current events require frequently updated models, but stable facts remain accurate for years. Routing some queries to older models avoids wasting the enormous time, compute, and money already spent training them.
There are hard physical limits, too. GPUs have become the bottleneck for today’s largest models, and chip technology is approaching the maximum memory that can be packed onto a single die. In practice, You explained, those physical limits mean the next model cannot simply be ten times larger.
An old idea, now front and center
William Falcon, founder and CEO of AI developer platform Lightning AI, notes that the idea of using an ensemble of models is not new; it has been around since roughly 2018. And because OpenAI’s models are a black box, he adds, we cannot rule out that GPT-4 also quietly used a model routing system.
“I think they may just be more transparent about it now,” he said. Either way, GPT-5 launched with the model routing system front and center. The blog post introducing the model called it OpenAI’s “smartest, fastest, most useful model yet,” with built-in thinking. In the official ChatGPT blog post, OpenAI confirmed that GPT-5 inside ChatGPT runs on a system of coordinated models, with a behind-the-scenes router that switches to deeper reasoning when needed. The GPT-5 system card goes further, explicitly naming multiple model variants, gpt-5-main and gpt-5-main-mini for speed, along with gpt-5-thinking, gpt-5-thinking-mini, and gpt-5-thinking-pro, and explaining how the unified system automatically switches among them.
In a pre-launch press briefing, OpenAI CEO Sam Altman described the model router as a way to fix what had become a bewildering list of models to choose from. Altman called the previous model picker a “very confusing mess.”
But Falcon said the bigger problem is that GPT-5 simply did not feel like a leap. “GPT-1 to 2 to 3 to 4: each time it was a huge jump.”
Will multiple models add up to AGI?
The debate over model routing has fed into the perennial buzz about whether artificial general intelligence, or AGI, will soon be developed. OpenAI officially defines AGI as “highly autonomous systems that outperform humans at most economically valuable work,” though Altman notably said last week that it is “not a super useful term.”
“What about the promised AGI?” wrote Chaoyang He, an AI researcher and co-founder of TensorOpera, in a post on X criticizing GPT-5. “Even a powerful company like OpenAI lacks the ability to train one large model, forcing them to resort to a real-time router.”
Robert Nishihara, co-founder of AI infrastructure company Anyscale, says scaling still drives progress in AI, but the idea of one all-powerful model remains far-fetched. “It is hard to build one model that is the best at everything,” he said. That is why GPT-5 currently runs as a network of models stitched together by a router, not a single monolith.
OpenAI has said it hopes to unify these into a single model in the future, but Nishihara points out that hybrid systems have real advantages: you can upgrade one piece at a time without disrupting the rest, and you get most of the benefits without the cost and complexity of retraining an entire giant model. For that reason, Nishihara believes routing will stick around.
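The upgrade-one-piece-at-a-time advantage Nishihara describes can be made concrete with a toy model registry: swapping one tier leaves every other tier untouched. The tier names and version strings below are hypothetical.

```python
# Toy model registry for a hybrid system. Upgrading one tier is a
# single entry swap; no retraining, and other tiers keep serving.
# All names and versions are illustrative.

registry = {
    "fast": "fast-model-v1",
    "thinking": "thinking-model-v1",
    "news": "fresh-model-2025-08",  # refreshed often for current events
}

def upgrade(tier: str, new_version: str) -> None:
    """Replace a single tier; every other tier is untouched."""
    registry[tier] = new_version

upgrade("thinking", "thinking-model-v2")
print(registry["thinking"])  # thinking-model-v2
print(registry["fast"])      # fast-model-v1
```

Contrast this with a monolithic model, where improving any one capability means retraining, re-evaluating, and redeploying the whole thing.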
Chaoyang He agrees. In theory, scaling laws still hold: more data and compute make models better. But in practice, he believes development will oscillate between the two approaches, routing specialized models together and then trying to fuse them into one. The deciding factors will be engineering costs, compute bottlenecks, energy limits, and business pressure.
An overly literal notion of AGI may need adjusting as well. “If anyone builds anything close to AGI, I do not know that it would literally be one set of weights,” Falcon said, referring to the “brains” behind LLMs. “If it is a collection of models that together look like AGI, there is nothing wrong with that. Nobody is a purist here.”