Salesforce is tackling one of the most persistent challenges for artificial intelligence in business applications: the gap between an AI system's raw intelligence and its ability to perform consistently in unpredictable enterprise environments, which the company calls “jagged intelligence.”
In a comprehensive research announcement today, Salesforce AI Research revealed several new benchmarks, models and frameworks designed to make future AI agents more intelligent, trusted and versatile for enterprise use. The innovations aim to improve both the capabilities and the consistency of AI systems, particularly when they are deployed as autonomous agents in complex business settings.
“While LLMs may excel at standardized tests, plan intricate trips, and generate sophisticated poetry, their brilliance often stumbles when faced with the need for reliable and consistent task execution in dynamic, unpredictable enterprise environments,” Silvio Savarese, Salesforce's chief scientist and head of AI research, said during the press conference.
The initiative represents Salesforce's push toward what Savarese calls “Enterprise General Intelligence” (EGI): AI designed specifically for business complexity rather than the more theoretical pursuit of artificial general intelligence (AGI).
“We define EGI as purpose-built AI agents for business, optimized not just for capability but for consistency, too,” Savarese explained. “While AGI may conjure images of superintelligent machines surpassing human intelligence, businesses aren't waiting for that distant, illusory future. They're applying these foundational concepts now to solve real-world challenges at scale.”
How Salesforce is measuring and fixing AI's inconsistency problem in enterprise settings
A central focus of the research is measuring and addressing AI's inconsistent performance. Salesforce introduced the SIMPLE dataset, a public benchmark of 225 straightforward reasoning questions designed to measure how jagged an AI system's capabilities really are.
“Today's AI is jagged, so we need to work on that. But how can we work on something without measuring it first? That's exactly what this SIMPLE benchmark is,” explained Shelby Heinecke, senior manager of research at Salesforce, during the press conference.
For enterprise applications, this inconsistency is not merely an academic concern. A single misstep by an AI agent could disrupt operations, erode customer trust, or cause substantial financial damage.
“For businesses, AI isn't a casual pastime; it's a mission-critical tool that requires predictability,” Savarese noted in his commentary.
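To make the idea of measuring inconsistency concrete, here is a minimal sketch of one way to quantify it. It is not the SIMPLE benchmark's scoring method, which the announcement does not describe; the function name `consistency_report`, the input format, and the "gap" metric are all assumptions made for illustration. The sketch compares average accuracy over repeated runs with the share of questions a system answers correctly every time.

```python
from collections import defaultdict


def consistency_report(runs):
    """Summarize repeated-run results (hypothetical metric, for illustration).

    runs: iterable of (question_id, is_correct) pairs, with several
    attempts recorded per question.
    """
    by_question = defaultdict(list)
    for question_id, is_correct in runs:
        by_question[question_id].append(bool(is_correct))

    n = len(by_question)
    # Average of per-question pass rates: the usual "accuracy" view.
    mean_accuracy = sum(sum(v) / len(v) for v in by_question.values()) / n
    # Share of questions answered correctly on every attempt.
    always_correct = sum(all(v) for v in by_question.values()) / n
    return {
        "mean_accuracy": mean_accuracy,
        "always_correct": always_correct,
        # A large gap means the system is capable but unreliable.
        "gap": mean_accuracy - always_correct,
    }


# Toy data: q1 always right, q2 right half the time, q3 always wrong.
runs = [
    ("q1", True), ("q1", True),
    ("q2", True), ("q2", False),
    ("q3", False), ("q3", False),
]
report = consistency_report(runs)
print(report)
```

In this toy data the average accuracy is 0.5, but only one question in three is answered correctly on every attempt, which is the kind of gap between capability and consistency the article describes.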
Inside CRMArena: Salesforce's test bed
Perhaps the most significant innovation is CRMArena, a new benchmarking framework designed to simulate realistic customer relationship management scenarios. It enables comprehensive testing of AI agents in professional contexts, addressing the gap between academic benchmarks and real-world business requirements.
“Recognizing that current AI models often fall short of the intricate demands of enterprise environments, we've introduced CRMArena: a novel benchmarking framework designed to simulate realistic CRM scenarios,” Savarese said.
The framework evaluates agent performance across three key personas: service agents, analysts and managers. Early testing revealed that even with guided prompting, leading agents succeed less than 65% of the time at function calling for these personas' use cases.
“CRMArena is essentially a tool that's been introduced internally for improving agents,” Savarese explained. “It allows us to stress test these agents, understand when they're failing, and then use the lessons we learn from those failures to improve our agents.”
New embedding models that understand enterprise context better than ever
Among the technical innovations announced, Salesforce highlighted SFR-Embedding, a new model for deeper contextual understanding that leads the Massive Text Embedding Benchmark (MTEB) across 56 datasets.
“SFR-Embedding is not just research. It's coming to Data Cloud very, very soon,” Heinecke noted.
A specialized version, SFR-Embedding-Code, was also introduced for developers, enabling high-quality code search and streamlining development. According to Salesforce, the largest model leads the Code Information Retrieval (CoIR) benchmark, while the smaller models (400M, 2B) offer efficient, cost-effective alternatives.
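For readers unfamiliar with embedding-based search, the sketch below shows the general retrieval step such models support: a query and each code snippet are represented as vectors, and results are ranked by cosine similarity. It does not call SFR-Embedding-Code or any Salesforce API. The three-dimensional vectors, the snippet names, and the `search` helper are made up for illustration; a real system would obtain much longer vectors from an embedding model.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def search(query_vec, corpus, top_k=2):
    """Return the names of the top_k corpus entries most similar to the query."""
    ranked = sorted(
        corpus.items(),
        key=lambda item: cosine(query_vec, item[1]),
        reverse=True,
    )
    return [name for name, _ in ranked[:top_k]]


# Hypothetical, hand-written "embeddings" of three code snippets.
corpus = {
    "parse_json": [0.9, 0.1, 0.0],
    "send_email": [0.0, 0.2, 0.9],
    "load_yaml": [0.8, 0.3, 0.1],
}
# Hypothetical embedding of a query such as "read a config file".
query = [1.0, 0.2, 0.0]

results = search(query, corpus)
print(results)
```

With these toy vectors the two parsing-related snippets rank ahead of the unrelated email helper. How well this works in practice depends on the embedding model that produces the vectors.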
Why smaller AI models may excel
Salesforce also announced xLAM V2 (Large Action Model), a family of models specifically designed to predict actions rather than just generate text. These models start at just 1 billion parameters, a fraction of the size of many leading language models.
“What's special about our xLAM models is that if you take a look at our model sizes, we've got a 1B model, and we go all the way up to a 70B model. That 1B model, for example, is a fraction of the size of many of today's large language models,” Heinecke explained. “This small model packs so much power in terms of being able to take the next action.”
Unlike standard language models, these action models are specifically trained to predict and execute the next steps in a task sequence, which makes them particularly valuable for autonomous agents that need to interact with enterprise systems.
“Large action models are LLMs under the hood, and the way we build them is that we take an LLM and fine-tune it on what we call action trajectories,” Heinecke added.
Enterprise AI safety: How Salesforce's trust layer establishes guardrails for business use
To address enterprise concerns about AI safety and reliability, Salesforce introduced SFR-Guard, a family of models trained on both publicly available data and CRM-specialized internal data. These models strengthen the company's Trust Layer, which provides guardrails for AI agent behavior.
“Agentforce's guardrails establish clear boundaries for agent behavior based on business needs, policies and standards, ensuring agents act within predefined limits,” the company stated in its announcement.
The company also launched ContextualJudgeBench, a new benchmark for evaluating LLM-based judge models in context. It tests more than 2,000 challenging response pairs for accuracy, conciseness, faithfulness, and appropriate refusal to answer.
Looking beyond text, Salesforce unveiled TACO, a multimodal action model family designed to tackle complex, multi-step problems through chains of thought-and-action (CoTA). This approach enables AI to interpret and respond to complex queries that involve multiple media types, with Salesforce claiming an improvement of up to 20% on the challenging MMVet benchmark.
Co-innovation in action: How Salesforce's enterprise AI roadmap is shaped
Itai Asseo, a director of incubation at Salesforce AI Research, emphasized the importance of customer co-innovation in developing enterprise-ready AI solutions.
“When we're talking to customers, one of the main pain points is that when dealing with enterprise data, there's a very low tolerance for answers that are not accurate and not relevant,” Asseo explained. “We've made a lot of progress, whether it's with reasoning engines, with RAG techniques, or with other methods around LLMs.”
Asseo cited examples of customer incubation yielding significant improvements in AI performance. He described results from applying the Atlas reasoning engine, including some advanced retrieval-augmented generation techniques, together with the team's methodology and architecture, and compared them with what customers saw from Salesforce's major competitors.
The road to Enterprise General Intelligence: What's next for Salesforce AI
Salesforce's research push comes at a critical moment in enterprise AI adoption, as businesses increasingly seek AI systems that combine advanced capabilities with dependable performance.
While the entire tech industry pursues ever more capable models, Salesforce's focus on the consistency gap highlights a more nuanced approach to AI development, one that prioritizes real-world business requirements over academic benchmarks.
The technologies announced Thursday will begin rolling out in the coming months, with SFR-Embedding heading to Data Cloud first, while other innovations will power future versions of Agentforce.
As Savarese noted at the press conference, “It's not about replacing humans. It's about being in charge.” In the race for enterprise AI dominance, Salesforce is betting that consistency and reliability, not just raw intelligence, will ultimately determine the winners of the business AI revolution.