Google researchers have developed a new framework for AI agents that outperforms competitors from OpenAI, Perplexity, and others on key benchmarks.
The new agent, called the Test-Time Diffusion Deep Researcher (TTD-DR), is inspired by the way humans write: drafting, searching for information, and making iterative revisions.
The system uses diffusion mechanisms and evolutionary algorithms to produce more comprehensive and accurate research on complex topics.
For enterprises, this framework could power a new generation of research assistants for high-value tasks that standard retrieval-augmented generation (RAG) systems struggle with, such as producing competitive analyses or market-entry reports.
According to the paper's authors, real-world business use cases were the primary target for the system.
The limits of current deep research agents
Deep research (DR) agents are designed to handle complex queries that go beyond simple search. They use large language models (LLMs) for planning, call tools such as web search to gather information, and then synthesize the results into a detailed report with the help of test-time scaling techniques such as chain-of-thought (CoT) reasoning, best-of-N sampling, and Monte Carlo tree search.
However, many of these systems share fundamental design limitations. Most publicly available DR agents apply test-time scaling algorithms and tools without a structure that reflects human cognitive behavior. Open-source agents often follow a linear or parallel pipeline for planning, research, and content generation, which makes it hard for the different stages of research to interact with and correct one another.

This can cause the agent to lose the global context of the research and miss critical connections between different pieces of information.
As the paper's authors note, this points to fundamental limitations in how current DR agents work and highlights the need for a more coherent, purpose-built framework for DR agents that can imitate or exceed the capabilities of human researchers.
A new approach inspired by human writing and diffusion
Unlike the linear process of most AI agents, human researchers work iteratively. They usually start with a high-level plan, produce a preliminary draft, and then go through multiple rounds of revision. During these revisions, they search for new information to strengthen their arguments and fill in gaps.
Google's researchers observed that this human process can be emulated with a diffusion model augmented by a retrieval component. (Diffusion models, often used for image generation, start with noise and gradually refine it into a detailed picture.)
In this analogy, the researchers explain, a pre-trained diffusion model first produces a noisy draft, and a denoising module, aided by retrieval tools, refines that draft into a higher-quality, more accurate output.
TTD-DR is built on this scheme. The framework treats the research report as a diffusion process, in which a "noisy" preliminary draft is progressively refined into a polished final report.

This is achieved through two core mechanisms. The first, which the researchers call denoising with retrieval, starts from a preliminary draft and improves it iteratively. At each step, the agent uses the current draft to formulate new search queries, retrieves external information, and integrates it back into the draft.
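The paper does not include reference code, but the loop described above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation; draft_report, generate_queries, web_search, and revise_draft are hypothetical stand-ins for LLM prompts and a search tool.

```python
# Hypothetical sketch of a retrieval-denoising loop: start with a rough
# draft, then repeatedly search for missing information and fold it back in.
# The helper functions below are toy stand-ins, not the paper's code.

def draft_report(question: str) -> str:
    return f"[rough draft answering: {question}]"        # noisy preliminary draft

def generate_queries(question: str, draft: str) -> list[str]:
    return [f"evidence needed for: {question}"]          # gaps left by the current draft

def web_search(query: str) -> str:
    return f"[search results for: {query}]"              # retrieved external information

def revise_draft(question: str, draft: str, evidence: list[str]) -> str:
    return draft + "\n" + "\n".join(evidence)            # integrate evidence into the draft

def denoise_with_retrieval(question: str, steps: int = 3) -> str:
    draft = draft_report(question)
    for _ in range(steps):
        queries = generate_queries(question, draft)
        evidence = [web_search(q) for q in queries]
        draft = revise_draft(question, draft, evidence)
    return draft                                         # polished final report
```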
The second mechanism, self-evolution, ensures that each component of the agent (the planner, the question generator, and the answer synthesizer) independently works to improve its own output. In comments to VentureBeat, Rujun Han, a Google research scientist and co-author of the paper, explained that this component-level evolution is crucial because it makes the "report more effective." It resembles an evolutionary process in which each part of the system gradually gets better at its specific task, supplying high-quality context to the main revision loop.
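To make the second mechanism concrete, here is an equally rough sketch of component-level self-evolution, assuming a simple generate-score-revise-select scheme; propose_variants, judge, revise, and merge_best are illustrative placeholders rather than the paper's actual algorithm.

```python
# Hypothetical sketch of self-evolution for a single agent component:
# generate several candidate outputs, score them, revise, and keep the
# strongest result for the next stage. All helpers are toy stand-ins.
import random

def propose_variants(component_input: str, n: int = 3) -> list[str]:
    return [f"{component_input} (variant {i})" for i in range(n)]   # candidate outputs

def judge(candidate: str) -> float:
    return random.random()                                          # stand-in for an LLM judge score

def revise(candidate: str, score: float) -> str:
    return candidate + " [revised]"                                 # stand-in for critique-guided revision

def merge_best(candidates: list[str], scores: list[float]) -> str:
    return max(zip(scores, candidates))[1]                          # keep the highest-scoring variant

def self_evolve(component_input: str, rounds: int = 2) -> str:
    candidates = propose_variants(component_input)
    scores = [judge(c) for c in candidates]
    for _ in range(rounds):
        candidates = [revise(c, s) for c, s in zip(candidates, scores)]
        scores = [judge(c) for c in candidates]
    return merge_best(candidates, scores)
```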

The authors write that the intricate interplay and unified combination of these two algorithms is crucial to achieving high-quality research results. This iterative process yields reports that are not only more accurate but also more logically coherent. Han also noted that because the model was evaluated on helpfulness, which includes fluency and coherence, the performance gains directly measure its ability to produce well-organized business documents.
According to the paper, the resulting research agent can generate helpful, comprehensive reports on complex research questions across a range of industry domains, including finance, biomedicine, entertainment, and technology, putting it in the same category as deep research products from OpenAI, Perplexity, and Grok.
TTD-DR at work
To build and test their framework, the researchers used Google's Agent Development Kit (ADK), an extensible platform for orchestrating complex agent workflows, with Gemini 2.5 Pro as the core LLM (though it can be swapped for other models).
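The paper's agent itself is not released as code, but a minimal starting point with ADK might look like the sketch below. It assumes the google-adk Python package and its built-in google_search tool; the agent name and instruction text are illustrative, not taken from the paper.

```python
# Minimal ADK agent sketch (assumes the google-adk package is installed
# and Gemini API credentials are configured).
from google.adk.agents import Agent
from google.adk.tools import google_search

research_agent = Agent(
    name="deep_researcher",
    model="gemini-2.5-pro",  # the core LLM used in the paper; swappable for other models
    description="Drafts a research report and iteratively revises it with web search.",
    instruction=(
        "Plan the report, write a preliminary draft, then repeatedly search "
        "for missing information and revise the draft until it is complete."
    ),
    tools=[google_search],
)
```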
They evaluated TTD-DR against leading commercial and open-source systems, including OpenAI Deep Research, Perplexity Deep Research, Grok DeepSearch, and the open-source GPT-Researcher.
The evaluation focused on two main areas. For generating comprehensive long-form reports, they used the DeepConsult benchmark, a set of business and consulting prompts, along with their own long-form research dataset. For answering multi-hop questions that require extensive search and reasoning, they tested the agent on challenging academic and real-world benchmarks such as Humanity's Last Exam (HLE) and GAIA.
The results show TTD-DR consistently outperforming its competitors. In side-by-side comparisons with OpenAI Deep Research on long-form reports, TTD-DR achieved win rates of 69.1% and 74.5% on two different datasets. It also surpassed the OpenAI system on three separate benchmarks that require multi-hop reasoning to find concise answers, by margins of 4.8%, 7.7%, and 1.7%.

The future of test-time diffusion
While the current research focuses on text-based reports generated with web search, the framework is designed to be highly adaptable. Han stressed that the team plans to extend the work by integrating more tools for complex enterprise tasks.
A similar test-time diffusion process could be used to write complex software, build a detailed financial model, or design a multi-stage marketing campaign, with a preliminary "draft" of the project iteratively refined using new information and feedback from various specialized tools.
"All of these tools can be combined naturally in our framework," Han said, suggesting that this iterative approach could become a foundational architecture for a wide range of complex AI agents.