Claude AI can now end conversations it considers harmful or abusive



Anthropic has announced a new experimental safety feature that allows Claude Opus 4 and 4.1 to end conversations in rare, persistently harmful or abusive scenarios. The step reflects the company's growing focus on what it calls "model welfare": the idea that protecting AI systems, even if they are not sentient, is a prudent part of cautious, ethical design.

According to Anthropic's internal research, the models were trained to cut off dialogues after repeated harmful requests, such as sexual content involving minors or instructions that could facilitate terrorism, especially when the AI had already refused and tried to redirect the conversation. In both simulated testing and real user interactions, the AI showed what Anthropic describes as apparent distress, which informed the decision to give Claude the ability to end these interactions.

Also read: Meta under fire over leaked AI guidelines on "sensual" chats with minors


When the feature is triggered, users cannot send additional messages in that particular chat, but they are free to start a new conversation or to edit earlier messages and retry from that branch. Crucially, other active conversations remain unaffected.

Anthropic stresses that this is a last-resort measure, intended only after multiple refusals and redirection attempts have failed. The company explicitly instructs Claude not to end chats when the user may be at imminent risk of self-harm or of harming others, particularly in conversations about sensitive topics such as mental health.

Anthropic frames the new ability as part of an exploratory project on model welfare, a broader initiative that explores low-cost safety interventions in case AI models ever develop any form of preferences or vulnerabilities. The company says it remains "highly uncertain about the potential moral status of Claude and other LLMs (large language models)."

Also read: Why professionals say you should think twice before using AI as a therapist

A new look at AI safety

Although the feature will rarely be triggered and mainly affects extreme edge cases, it marks a milestone in how Anthropic approaches AI safety. The new conversation-ending tool contrasts with earlier safeguards that focused only on protecting users or preventing misuse. Here, the AI is treated as having interests of its own: Claude can effectively say "this conversation is not healthy" and end it to protect the wellbeing of the model itself.

Anthropic's approach has sparked a broader discussion about whether AI systems should be granted protections to reduce possible "distress" or unexpected behavior. While some critics argue that the models are just machines, others welcome the move as an opportunity to open a more serious conversation about AI alignment and ethics.

"We are treating this feature as an ongoing experiment and will continue refining our approach," the company said in a post.


