Google's recent decision to hide the raw reasoning tokens of its flagship model, Gemini 2.5 Pro, has sparked a fierce backlash from developers who relied on that transparency to build and debug applications.
The change, which echoes a similar move by OpenAI, replaces the model's step-by-step reasoning with a simplified summary. The response highlights a critical tension between creating a polished user experience and providing the observable, trustworthy tools that enterprises need.
As businesses integrate large language models (LLMs) into more complex and mission-critical systems, the debate over how much of a model's inner workings should be exposed is becoming a defining issue for the industry.
A fundamental reduction in AI transparency
To solve complex problems, advanced AI models generate an internal monologue, also referred to as a "chain of thought" (CoT). This is a series of intermediate steps (for example, a plan, a draft of code, a self-correction) that the model produces before arriving at its final answer. It may reveal, for instance, how the model processes data, which pieces of information it uses, and how it evaluates its own code.
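For readers unfamiliar with how developers actually consume these traces, here is a minimal sketch of separating a chain of thought from a final answer. It assumes a model that wraps its reasoning in `<think>...</think>` tags, as some open models such as DeepSeek-R1 do; hosted APIs expose (or hide) traces differently, so treat the format as illustrative.

```python
import re

def split_reasoning(raw_output: str) -> tuple[str, str]:
    """Separate the chain-of-thought trace from the final answer.

    Assumes reasoning is delimited by <think>...</think> tags
    (the convention used by DeepSeek-R1-style open models).
    """
    match = re.search(r"<think>(.*?)</think>", raw_output, re.DOTALL)
    reasoning = match.group(1).strip() if match else ""
    # Everything outside the think block is the user-facing answer.
    answer = re.sub(r"<think>.*?</think>", "", raw_output, flags=re.DOTALL).strip()
    return reasoning, answer

raw = "<think>12 * 4 = 48, then add 2.</think>The answer is 50."
reasoning, answer = split_reasoning(raw)
print(reasoning)  # 12 * 4 = 48, then add 2.
print(answer)     # The answer is 50.
```

When the trace is hidden server-side, `reasoning` simply comes back empty, which is exactly the observability gap developers are complaining about.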
For developers, this reasoning trail often serves as an essential diagnostic and debugging tool. When a model returns an incorrect or unexpected output, the thought process reveals where it went wrong. It also happened to be one of Gemini 2.5 Pro's key advantages over OpenAI's o1 and o3.
In Google's AI developer forum, users called the removal of this feature a "massive regression." Without it, developers are left in the dark. One described being forced to "guess" why the model failed, leading to "incredibly frustrating, repetitive loops trying to fix things."
Beyond debugging, this transparency is crucial for building sophisticated AI systems. Developers rely on the CoT to fine-tune prompts and system instructions, which are the primary ways to steer a model's behavior. The feature is especially important for building agentic workflows, in which the AI must carry out a series of tasks. One developer noted that the CoTs "helped enormously in tuning agentic workflows correctly."
For enterprises, this move can be problematic. Black-box AI models that hide their reasoning introduce significant risk, making it difficult to trust their outputs in high-stakes scenarios. The trend, started by OpenAI's o-series reasoning models and now adopted by Google, creates a clear opening for open-source alternatives such as DeepSeek-R1 and QwQ-32B.
Models that provide full access to their reasoning chains give enterprises more control and transparency over model behavior. The decision for a CTO or AI lead is no longer simply about which model has the highest benchmark scores. It is now a strategic choice between a top-performing but opaque model and a more transparent one that can be integrated with greater confidence.
Google's response
In response to the outcry, members of the Google team explained the rationale. Logan Kilpatrick, a senior product manager at Google DeepMind, explained that the change was "purely cosmetic" and does not affect the model's internal performance. He noted that for the consumer-facing Gemini app, hiding the lengthy thought process makes for a cleaner user experience. "The % of people who will or do read thoughts in the Gemini app is very small," he said.
For developers, the new summaries were intended as a first step toward programmatically accessing reasoning traces through the API, which was not possible before.
Google also acknowledged the value of raw thoughts for developers. "I hear that you all want raw thoughts, the value is clear, and there are use cases that require them," Kilpatrick wrote, adding that bringing the feature back to the developer-focused AI Studio is "something we can explore."
Google's reaction to the developer backlash suggests a middle ground is possible, perhaps through a "developer mode" that re-enables access to raw thoughts. The need for observability will only grow as AI models evolve into more autonomous agents that use tools and execute complex, multi-step plans.
As Kilpatrick concluded in his remarks, "…I can easily imagine that raw thoughts become a critical requirement of all AI systems given the increasing complexity and need for observability + tracing."
Are reasoning tokens overrated?
However, experts suggest there are deeper dynamics at play than just user experience. Subbarao Kambhampati, an AI professor at Arizona State University, questions whether the "intermediate tokens" a reasoning model produces before the final answer can be used as a reliable guide to understanding how the model solves problems. A paper he recently co-authored argues that treating intermediate tokens as "reasoning traces" or "thoughts" can have dangerous implications.
Models often go off in endless and unintelligible directions in their thinking process. Several experiments show that models trained on false reasoning traces and correct results can learn to solve problems just as well as models trained on well-curated reasoning traces. Moreover, the latest generation of reasoning models is trained through reinforcement learning algorithms that only verify the final result and do not evaluate the model's "reasoning trace."
"The fact that intermediate token sequences often look like better-formatted and spelled human scratch work … doesn't tell us much about whether they are used for anywhere near the same purposes that humans use them for, let alone about whether they can be used as an interpretable window into what the LLM is 'thinking,' or as a reliable justification of the final answer," the researchers write.
"Most users can't make anything out of the volumes of raw intermediate tokens that these models spew out," Kambhampati told VentureBeat. "As we mention, DeepSeek R1 produces 30 pages of pseudo-English in solving a simple planning problem!"
Nevertheless, Kambhampati suggests that post-hoc summaries or explanations are likely to be more comprehensible to end users. "The issue becomes to what extent they are actually indicative of the internal operations that LLMs went through," he said. "For example, as a teacher, I might solve a new problem with many false starts and backtracks, but explain the solution in the way I think facilitates student comprehension."
Hiding CoT also serves as a competitive moat. Raw reasoning traces are incredibly valuable training data. As Kambhampati notes, a competitor can use these traces to perform "distillation," the process of training a smaller, cheaper model to mimic the capabilities of a more powerful one. Hiding the raw thoughts makes it much harder for rivals to copy a model's secret sauce, a crucial advantage in a resource-intensive industry.
The debate over chain of thought is a preview of a much larger conversation about the future of AI. There is still much to learn about the inner workings of reasoning models, how we can leverage them, and how far model providers are willing to go to let developers access them.
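To make the distillation concern concrete, here is a minimal sketch of how harvested reasoning traces could be packaged as supervised fine-tuning data for a smaller model. The chat-style JSONL schema and the `<think>` delimiter are illustrative assumptions, not any specific vendor's format, and the single example triple is invented for demonstration.

```python
import json

def to_sft_record(prompt: str, trace: str, answer: str) -> dict:
    """Format one (prompt, reasoning trace, answer) triple as a
    chat-style fine-tuning example. The student model is trained
    to reproduce the teacher's trace along with its answer."""
    target = f"<think>{trace}</think>{answer}"
    return {"messages": [
        {"role": "user", "content": prompt},
        {"role": "assistant", "content": target},
    ]}

# Hypothetical harvested triple; real distillation datasets would
# contain thousands of traces scraped from a stronger model's API.
examples = [("What is 6 * 7?", "6 * 7 = 42.", "42")]

with open("distill.jsonl", "w") as f:
    for prompt, trace, answer in examples:
        f.write(json.dumps(to_sft_record(prompt, trace, answer)) + "\n")
```

Once a provider returns only summaries instead of raw traces, the `trace` field above is simply unavailable, which is precisely why hiding it blunts this kind of copying.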