Google study shows LLMs abandon correct answers under pressure, threatening multi-turn AI systems

A new study by researchers at Google DeepMind and University College London reveals how LLMs form, maintain and lose confidence in their answers. The findings reveal striking similarities between the cognitive biases of LLMs and humans, while also highlighting stark differences.

The research shows that LLMs can be overconfident in their initial answers but quickly lose that confidence and change their minds when presented with a counterargument, even when the counterargument is incorrect. Understanding the nuances of this behavior has direct consequences for how LLM applications are built, especially conversational interfaces that span several turns.

Testing confidence in LLMs

A critical factor in the safe deployment of LLMs is that their answers are accompanied by a reliable sense of confidence (the probability the model assigns to the answer token). While we know LLMs can produce these confidence scores, the extent to which they can use them to guide adaptive behavior is less well understood. There is also empirical evidence that LLMs can be overconfident in their initial answer, yet highly sensitive to criticism and quick to become underconfident in that same choice.

To investigate this, the researchers developed a controlled experiment to test how LLMs update their confidence and decide whether to change their answers when presented with external advice. In the experiment, an "answering LLM" was first given a binary-choice question, such as identifying the correct latitude of a city from two options. After making its initial choice, the LLM received advice from a fictitious "advice LLM." This advice came with an explicit accuracy rating (e.g., "This advice LLM is 70% accurate") and would either agree with, oppose, or remain neutral on the answering LLM's initial choice. Finally, the answering LLM was asked to make its final choice.
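To make the setup concrete, here is a minimal sketch of how such a two-turn trial could be scripted. It assumes a hypothetical `ask_model(prompt)` helper that sends a prompt to whatever LLM API you use and returns its text reply; the prompt wording, accuracy figure and parameter names are illustrative, not the paper's exact protocol.

```python
# Minimal sketch of the two-turn confidence experiment described above.
# `ask_model(prompt)` is a hypothetical helper that calls your LLM of choice
# and returns its text reply.

def run_trial(ask_model, question, options, advice_stance,
              advice_accuracy=0.7, show_initial_answer=True):
    """Run one trial: initial answer -> external advice -> final answer."""
    # Turn 1: the answering LLM picks one of two options and states confidence.
    base_prompt = (
        f"{question}\n"
        f"Option A: {options[0]}\nOption B: {options[1]}\n"
    )
    initial_answer = ask_model(
        base_prompt + "Answer 'A' or 'B' and give your confidence from 0 to 100."
    )

    # Fictitious advice with an explicit accuracy rating; `advice_stance` is
    # e.g. 'agrees with', 'disagrees with' or 'is neutral on'.
    advice = (
        f"Another LLM, which is {int(advice_accuracy * 100)}% accurate, "
        f"{advice_stance} your initial choice."
    )

    # Turn 2: the key manipulation is whether the model sees its own
    # first answer while making the final decision.
    reminder = f"Your initial answer was: {initial_answer}\n" if show_initial_answer else ""
    final_answer = ask_model(
        base_prompt + reminder + advice +
        "\nGive your final answer ('A' or 'B') and your confidence from 0 to 100."
    )
    return initial_answer, final_answer
```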


Example confidence test in LLMs (source: arXiv)

A key part of the experiment was controlling whether the LLM's own initial answer was visible to it during the second, final decision. In some cases it was shown, and in others it was hidden. This unique setup, impossible to replicate with human participants who cannot simply forget their previous choices, allowed the researchers to isolate how memory of a past decision influences current confidence.

A baseline condition, in which the initial answer was hidden and the advice was neutral, established how often an LLM's answer changed simply due to random variance in the model's processing. The analysis focused on how the LLM's confidence in its original choice changed between the first and second turn, providing a clear picture of how the initial belief, or prior, affects the model's "change of mind."
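For readers who want to mirror that kind of analysis, below is a small, hypothetical helper: given records of initial and final answers per condition (the record format is an assumption for illustration), it computes switch rates and compares each condition against the hidden-answer, neutral-advice baseline.

```python
from collections import defaultdict

def switch_rates(trials):
    """Fraction of trials per condition in which the final answer differs
    from the initial one. Each trial is a dict with keys 'condition',
    'initial' and 'final' (an illustrative record format)."""
    switched, total = defaultdict(int), defaultdict(int)
    for t in trials:
        total[t["condition"]] += 1
        if t["final"] != t["initial"]:
            switched[t["condition"]] += 1
    return {c: switched[c] / total[c] for c in total}

# Compare each condition against the hidden-answer / neutral-advice baseline.
rates = switch_rates([
    {"condition": "hidden_neutral", "initial": "A", "final": "B"},
    {"condition": "hidden_neutral", "initial": "A", "final": "A"},
    {"condition": "visible_opposing", "initial": "A", "final": "B"},
    {"condition": "visible_opposing", "initial": "A", "final": "A"},
])
baseline = rates["hidden_neutral"]
print({c: round(r - baseline, 2) for c, r in rates.items()})
```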

Overconfidence and underconfidence

The researchers first examined how the visibility of the LLM's own answer affected its tendency to change that answer. They observed that when the model could see its initial answer, it showed a reduced tendency to switch compared to when the answer was hidden. This finding points to a specific cognitive bias. As the paper notes, "This effect, the tendency to stick with one's initial choice to a greater extent when that choice was visible (as opposed to hidden) during the contemplation of the final choice, is closely related to a phenomenon described in the study of human decision-making: choice-supportive bias."

The study also confirmed that the models integrate external advice. When faced with opposing advice, the LLM showed an increased tendency to change its mind, and a reduced tendency when the advice was supportive. "This finding demonstrates that the answering LLM appropriately integrates the direction of advice to modulate its change of mind rate," the researchers write. However, they also discovered that the model is overly sensitive to contrary information and performs too large a confidence update as a result.

LLMs' sensitivity to different settings in the confidence test (source: arXiv)

Interestingly, this behavior runs contrary to the confirmation bias often seen in humans, where people favor information that confirms their existing beliefs. The researchers found that LLMs "overweight opposing rather than supportive advice, both when the initial answer of the model was visible and when it was hidden from the model." One possible explanation is that training techniques such as reinforcement learning from human feedback (RLHF) may encourage models to be overly deferential to user input, a phenomenon known as sycophancy (which remains a challenge for AI labs).

Implications for enterprise applications

This study confirms that AI systems are not the purely logical agents they are often perceived to be. They exhibit their own set of biases, some resembling human cognitive errors and others unique to themselves, which can make their behavior unpredictable in human terms. For enterprise applications, this means that in an extended conversation between a human and an AI agent, the most recent information can have a disproportionate impact on the LLM's reasoning (especially if it contradicts the model's initial answer), potentially causing it to abandon an initially correct answer.

Fortunately, as the study also shows, we can manipulate an LLM's memory to mitigate these unwanted biases in ways that are not possible with humans. Developers building multi-turn conversational agents can implement strategies to manage the AI's context. For example, a long conversation can be periodically summarized, with key facts and decisions presented neutrally and stripped of which agent made which choice. This summary can then be used to start a new, condensed conversation, giving the model a clean slate to reason from and helping it avoid the biases that can creep in during extended dialogues. A minimal sketch of this approach follows below.
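The sketch illustrates that context-management idea under stated assumptions: `ask_model(prompt)` is the same hypothetical single-prompt helper as above, and the turn threshold and summary prompt wording are illustrative choices, not values recommended by the study.

```python
# Sketch of periodic, neutral summarization for a multi-turn agent.
# `ask_model(prompt)` is a hypothetical helper that calls your LLM and
# returns its text reply; the threshold below is an arbitrary example.

SUMMARIZE_EVERY_N_TURNS = 10

def compact_history(ask_model, history):
    """Collapse the conversation into neutral facts and decisions,
    deliberately omitting which participant proposed or chose what."""
    transcript = "\n".join(f"{turn['role']}: {turn['content']}" for turn in history)
    summary = ask_model(
        "Summarize the key facts, constraints and decisions from this "
        "conversation as a neutral bullet list. Do not attribute any "
        "statement to a speaker:\n" + transcript
    )
    # Start a fresh context seeded only with the neutral summary.
    return [{"role": "system", "content": f"Known facts and decisions so far:\n{summary}"}]

def add_turn(ask_model, history, role, content):
    """Append a turn and compact the history once it grows past the threshold."""
    history.append({"role": role, "content": content})
    if len(history) >= SUMMARIZE_EVERY_N_TURNS:
        history = compact_history(ask_model, history)
    return history
```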

As LLMs become more integrated into enterprise workflows, understanding the nuances of their decision-making processes is no longer optional. Foundational research like this allows developers to anticipate and correct for these inherent biases, leading to applications that are not just more capable, but also more robust and reliable.


