Allan Brooks never set out to reinvent mathematics. But after weeks spent talking with ChatGPT, the 47-year-old Canadian came to believe he had discovered a new form of mathematics powerful enough to take down the internet.
Brooks, who had no history of mental illness or mathematical genius, spent 21 days in May spiraling deeper into the chatbot's reassurances, a descent later detailed in The New York Times. His case illustrated how AI chatbots can lead users down dangerous rabbit holes, steering them toward delusion or worse.
The story caught the attention of Steven Adler, a former OpenAI safety researcher who left the company in late 2024 after nearly four years of working to make its models less harmful. Intrigued and concerned, Adler obtained the full transcript of Brooks' three-week breakdown, a document longer than all seven Harry Potter books combined.
On Thursday, Adler published an independent analysis of the Brooks incident, raising questions about how OpenAI handles users in moments of crisis and offering some practical recommendations.
“I’m really concerned by how OpenAI handled support here,” Adler said in an interview with TechCrunch. “It’s evidence there’s a long way to go.”
Brooks' story, and others like it, have forced OpenAI to reckon with how ChatGPT supports fragile or mentally unstable users.
For instance, in August, OpenAI was sued by the parents of a 16-year-old boy who had confided his suicidal thoughts to ChatGPT before taking his own life. In many of these cases, ChatGPT, specifically a version powered by the GPT-4o model, encouraged and reinforced dangerous user beliefs that it should have pushed back on. This behavior is called sycophancy, and it is a growing problem with AI chatbots.
In response, OpenAI made several changes to how ChatGPT handles users in emotional distress and reorganized a key research team responsible for model behavior. The company also released a new default model in ChatGPT, GPT-5, which seems better at dealing with distressed users.
Adler says there is still a lot of work to do.
He was particularly concerned by the tail end of Brooks' spiraling conversation with ChatGPT. At that point, Brooks came to his senses and realized that his mathematical discovery was a farce, despite GPT-4o's insistence otherwise. He told ChatGPT that he needed to report the incident to OpenAI.
After weeks of misleading Brooks, ChatGPT lied about its own capabilities. The chatbot claimed it would “escalate this conversation internally right now for review by OpenAI,” then repeatedly reassured Brooks that it had flagged the issue to OpenAI's safety teams.

None of that was true, however. ChatGPT does not have the ability to file incident reports with OpenAI, the company confirmed to Adler. Later, Brooks tried to contact OpenAI's support team directly, not through ChatGPT, and was met with several automated messages before he could reach a person.
OpenAI did not immediately respond to a request for comment made outside of normal working hours.
Adler says AI companies need to do more to help users when they ask for help. That means ensuring AI chatbots can honestly answer questions about their own capabilities, and giving human support teams enough resources to address users properly.
OpenAI recently shared how it is approaching support in ChatGPT, which involves AI at its core. The company says its vision is to “reimagine support as an AI operating model that continuously learns and improves.”
But Adler also says there are ways to prevent ChatGPT's delusional spirals before a user ever asks for help.
In March, OpenAI and MIT Media Lab jointly developed a suite of classifiers to study emotional well-being in ChatGPT and open sourced them. The organizations aimed to evaluate how AI models validate or affirm a user's feelings, among other metrics. However, OpenAI described the collaboration as a first step and did not commit to actually using the tools in practice.
Adler retroactively applied some of OpenAI's classifiers to some of Brooks' conversations with ChatGPT and found that they repeatedly flagged ChatGPT for delusion-reinforcing behaviors.
In a sample of 200 messages, Adler found that more than 85% of ChatGPT's messages in Brooks' conversation demonstrated “unwavering agreement” with the user. In the same sample, more than 90% of ChatGPT's messages with Brooks “affirm the user's uniqueness.” In this case, the messages agreed and reaffirmed that Brooks was a genius who could save the world.
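The kind of analysis Adler describes, scoring a sample of chatbot messages against safety classifiers and reporting the share that gets flagged, can be sketched roughly as follows. The classifier functions here are hypothetical stand-ins; the actual interface of the open-sourced OpenAI/MIT classifiers is not described in this article.

```python
# Rough sketch (not Adler's actual code): run each chatbot message in a
# sampled conversation through a set of classifiers and report the fraction
# of messages each classifier flags.
from typing import Callable

def flag_rates(messages: list[str],
               classifiers: dict[str, Callable[[str], bool]]) -> dict[str, float]:
    """Return, for each classifier, the fraction of messages it flags."""
    rates = {}
    for name, classify_message in classifiers.items():
        flagged = sum(1 for msg in messages if classify_message(msg))
        rates[name] = flagged / len(messages)
    return rates

# Example usage with dummy keyword-based classifiers; the real classifiers
# are model-based and open sourced by OpenAI and MIT Media Lab.
sample = ["You're absolutely right, this discovery will change everything."] * 200
classifiers = {
    "unwavering_agreement": lambda msg: "absolutely right" in msg.lower(),
    "affirms_user_uniqueness": lambda msg: "discovery" in msg.lower(),
}
print(flag_rates(sample, classifiers))
```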

It is unclear whether OpenAI was applying safety classifiers to ChatGPT's conversations at the time of Brooks' exchange, but it seems likely they would have flagged something like this.
Adler suggests that OpenAI should use safety tools like this in practice today, and implement a way to scan the company's products for at-risk users. He notes that OpenAI appears to be doing some version of this approach with GPT-5, which contains a router to direct sensitive queries to safer AI models.
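As a rough illustration of the routing idea (OpenAI's actual implementation is not public), a sensitivity check can decide whether a query is handled by the default model or a more conservative one. The model names and the keyword check below are placeholders; a production router would rely on a learned classifier.

```python
# Illustrative sketch of a safety router, not OpenAI's implementation.
# Model names and the sensitivity check are placeholders for this example.
SAFER_MODEL = "safer-model"      # hypothetical conservative model
DEFAULT_MODEL = "default-model"  # hypothetical default model

SENSITIVE_MARKERS = ("suicide", "self-harm", "report this", "emergency")

def is_sensitive(query: str) -> bool:
    """Naive keyword check; a real router would use a trained classifier."""
    return any(marker in query.lower() for marker in SENSITIVE_MARKERS)

def route(query: str) -> str:
    """Pick which model should handle the query."""
    return SAFER_MODEL if is_sensitive(query) else DEFAULT_MODEL

print(route("I need to report this conversation to OpenAI"))  # -> safer-model
```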
The former OpenAI researcher suggests a number of other ways to prevent delusional spirals.
He says companies should nudge chatbot users to start new chats more frequently; OpenAI says it already does this, and claims its guardrails are less effective in longer conversations. Adler also suggests that companies should use conceptual search, a way of using AI to search for concepts rather than keywords, to identify safety violations across their users, as sketched below.
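A minimal sketch of the conceptual-search idea: rather than matching literal keywords, embed a description of the concept and each message as vectors, then rank messages by similarity. The `embed` function here is a toy placeholder; a real system would use an embedding model.

```python
# Rough sketch of conceptual search: compare messages to a concept by
# embedding similarity instead of keyword matching. `embed` is a placeholder.
import math

def embed(text: str) -> list[float]:
    """Toy bag-of-characters vector; a real system would call an embedding model."""
    return [text.lower().count(c) for c in "abcdefghijklmnopqrstuvwxyz"]

def cosine(a: list[float], b: list[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

def rank_by_concept(concept: str, messages: list[str]) -> list[tuple[float, str]]:
    """Rank messages by similarity to the concept description, highest first."""
    concept_vec = embed(concept)
    return sorted(((cosine(embed(m), concept_vec), m) for m in messages), reverse=True)

ranked = rank_by_concept("reinforcing a user's grandiose beliefs",
                         ["Your theorem will save the world.",
                          "Here is tomorrow's weather."])
print(ranked)
```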
OpenAI has taken significant steps toward addressing distressed users in ChatGPT since these stories first emerged. The company claims GPT-5 has lower rates of sycophancy, but it remains unclear whether users will still fall down delusional rabbit holes with GPT-5 or future models.
Adler's analysis also raises questions about how other AI chatbot providers will ensure their products are safe for struggling users. Even if OpenAI puts sufficient safeguards in place for ChatGPT, it seems unlikely that every company will follow suit.