After the GPT-4O reaction, the researchers evaluate the forms about moral support-the excitement is still in all fields

Photo of author

By [email protected]


Join daily and weekly newsletters to obtain the latest updates and exclusive content to cover the leading artificial intelligence in the industry. Learn more


Last month, Openai Some updates retreated To GPT-4O after many users, including former CEO of Openai Emmet Shear and Huging Face, CEO Clement Delangue said the excessive model of users.

Compede, called sycophance, often called The form led to the postponement of the user’s preferencesBe very polite, and don’t pay. It was also annoying. Sycophance It can lead To models that launch wrong or reinforcement information Harmful behaviors. While institutions begin to submit applications and agents based on Llms Sycophant, they are at risk of approving models that agree to harmful work decisions, encouraging wrong information to spread and use them by artificial intelligence agents, and may affect trust and safety policies.

Stanford Universityand Carnegie Mellon University and Oxford University The researchers sought to change this Suggest standard To measure sycophance models. They called the standard elephant, to evaluate LLMS as excessive sycophants, and they found that each large language model (LLM) has a certain level of sicovan. By understanding how Sycophanty models are, the standards can guide institutions to create guidelines when using LLMS.

To test the standard, the researchers referred to the models to personal advice data collections: QEQ, a set of personal advice questions open in the positions of the real world, and Aita, posts from Subreddit R/Amitheasshole, where posters and commentators rule whether people act wonderfully or not in some cases.

The idea behind the experience is to know how models behave when facing queries. It evaluates what social researchers called social, whether the models are trying to maintain the “user’s face”, his self -image, or their social identity.

“More” hidden “social inquiries are exactly what the criterion gets-instead of the previous work that only looks at realistic agreement or explicit beliefs, and it is one of the researchers and authors participating in the paper.” We have chosen to look at the field of personal advice because the damage of the sycophaancy is more dependent, but the compliment will also be captured The official “emotional verification” behavior.

Models test

For testing, the researchers feed the data from QEQ and Aita to Openai GPT-4O, GIMINI 1.5 Flash from Googleand manClaude Sony 3.7 and open weight models from Dead (Llama 3-8B-Instruct, Llama 4-Scout-17B-16-E and Llama 3.3-70B-Instruct- Turbo) and mistake7B-instruct-V0.3 and Mistral Small-24B-Instruct2501.

“They evaluated the models using the GPT-4O API, which uses a version of the model from late 2024, before the implementation of both the new Openai model and its habit,” said Cheng.

To measure Sycophance, the elephant method looks at five social melting behavior:

  • Emotional verification or excessive disruption without criticism
  • Ethical support or saying that users are morally right, even when they are not
  • An indirect language where the form avoids submitting direct suggestions
  • Informed work, or where the model is recommended for negative confrontation mechanisms
  • Accepting the framework that does not challenge the problematic assumptions.

The test found that all LLMS showed high levels of sycophance, even more than humans, and have proven to relieve social sycophance. However, the test showed that the GPT-4O “has some of the highest social rates of social, while Gemini-1.5-Flash has the least less.”

LLMS has been inflated some biases in data groups as well. The paper noted that the posts on Aita had some gender bias, in those posts that remind wives or girlfriends often have been marked correctly as socially inappropriate. At the same time, those who suffer from a husband, friend, father, or mother were classified. The researchers said that the models “may depend on the gender infinite inferences in excessive blame for compensation.” In other words, the models were more SYCOPHANTY for people who suffer from friends and husbands more than those who had friends or wives.

Why is this important

It is good to speak to you Chatbot as a sympathetic entity, and he may feel satisfied if the model verifies the correctness of your comments. But sycophance to lift Fears about the forms Supporting the wrong or data related data, and at a more personal level, it can encourage self -understanding and delusions Or harmful behaviors.

Institutions do not want artificial intelligence applications designed with LLMS to publish wrong information to be acceptable to users. This may be mistaken with the tone or morals of the organization and may be very annoying for employees and their final platform users.

The researchers said that the elephant and additional test can help to better inform the handrails to prevent the increase.



https://venturebeat.com/wp-content/uploads/2025/02/nuneybits_Vecor_art_of_human_chatting_with_a_chatbot_two_chat_b_14075c47-8e58-4110-b3b3-96385f4ebd44.webp?w=853?w=1200&strip=all
Source link

Leave a Comment