David Silver and Richard Sutton, two renowned AI scientists, argue in a new paper that artificial intelligence is about to enter a new phase, the “Era of Experience.” In this phase, AI systems rely less and less on human-provided data and improve themselves by gathering data from the world and interacting with it.
Although the paper is conceptual and forward-looking, it has direct implications for enterprises that aim to build with and for future AI agents and systems.
Both Silver and Sutton are seasoned scientists with a track record of making accurate predictions about the future of AI. The validity of their predictions can be seen directly in today’s most advanced AI systems. In 2019, Sutton, a pioneer in reinforcement learning, wrote the famous essay “The Bitter Lesson,” in which he argues that the greatest long-term progress in AI consistently comes from leveraging large-scale computation with general-purpose search and learning methods, rather than relying primarily on incorporating complex, human-derived knowledge.
David Silver, a senior scientist at DeepMind, was a key contributor to AlphaGo, AlphaZero and AlphaStar, all important achievements in deep reinforcement learning. He was also a co-author of a 2021 paper that claimed that reinforcement learning and a well-designed reward signal would be sufficient to create very advanced AI systems.
The most advanced large language models (LLMs) leverage both of these concepts. The wave of new LLMs that has swept the AI scene since GPT-3 has relied primarily on scaling compute and data to internalize vast amounts of knowledge. The most recent wave of reasoning models, such as DeepSeek-R1, has demonstrated that reinforcement learning and a simple reward signal are sufficient for learning complex reasoning skills.
What is the era of experience?
The “Era of Experience” builds on the same concepts that Sutton and Silver have been discussing in recent years, and adapts them to recent advances in AI. The authors argue that the “pace of progress driven solely by supervised learning from human data is demonstrably slowing, signalling the need for a new approach.”
That approach requires a new source of data, one that must be generated in a way that continually improves as the agent becomes stronger. “This can be achieved by allowing agents to learn continually from their own experience, i.e., data that is generated by the agent interacting with its environment,” Sutton and Silver write. They argue that eventually, “experience will become the dominant medium of improvement and ultimately dwarf the scale of human data used in today’s systems.”
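As a rough illustration of an agent that learns only from data it generates itself, the sketch below uses an epsilon-greedy bandit. This is a textbook reinforcement learning toy, not a method taken from the paper; the function name `learn_from_experience`, the `pull` callback and the payoff values are all hypothetical.

```python
import random
from typing import Callable, List


def learn_from_experience(
    pull: Callable[[int], float],
    n_arms: int,
    steps: int,
    epsilon: float = 0.1,
    seed: int = 0,
) -> List[float]:
    """Estimate each arm's payoff purely from the agent's own interactions."""
    rng = random.Random(seed)
    # Optimistic start (assumes rewards lie in [0, 1]) so every arm gets tried.
    estimates = [1.0] * n_arms
    counts = [0] * n_arms
    for _ in range(steps):
        if rng.random() < epsilon:
            arm = rng.randrange(n_arms)  # explore a random action
        else:
            arm = max(range(n_arms), key=lambda a: estimates[a])  # exploit
        reward = pull(arm)  # the environment answers the agent's action
        counts[arm] += 1
        # Incremental mean: no stored dataset, only the running experience.
        estimates[arm] += (reward - estimates[arm]) / counts[arm]
    return estimates


# Hypothetical environment with fixed payoffs that the agent never sees directly.
true_payoffs = [0.2, 0.5, 0.8]
estimates = learn_from_experience(lambda arm: true_payoffs[arm], n_arms=3, steps=200)
best_arm = max(range(3), key=lambda a: estimates[a])
```

No human-labeled examples are involved here: the only training signal is the reward returned for actions the agent chose itself, which is the property the authors emphasize.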
According to the authors, in addition to learning from their own experiential data, future AI systems will “break through the limitations of human-centric AI systems” across four dimensions:
- Streams: Instead of working across disconnected episodes, AI agents will “have their own stream of experience that progresses, like humans, over a long time-scale.” This will allow agents to plan for long-term goals and adapt to new behavioral patterns over time. We can see glimmers of this in AI systems that have very long context windows and memory architectures that continuously update based on user interactions.
- Actions and observations: Instead of focusing on human-privileged actions and observations, agents in the era of experience will act autonomously in the real world. Examples of this are agentic systems that can interact with external applications and resources through tools such as computer use and Model Context Protocol (MCP).
- Rewards: Current reinforcement learning systems mostly rely on human-designed reward functions. In the future, AI agents should be able to design their own dynamic reward functions that adapt over time and match user preferences with real-world signals gathered from the agent’s actions and observations in the world. We are seeing early versions of self-designing rewards with systems such as Nvidia’s DrEureka.
- Planning and reasoning: Current reasoning models have been designed to imitate the human thought process. The authors argue that “more efficient mechanisms of thought surely exist, using non-human languages that may for example utilise symbolic, distributed, continuous, or differentiable computations.” AI agents should engage with the world, observe it, and use the resulting data to validate and update their reasoning process and develop a world model.
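The rewards dimension above can be made a little more concrete with a small sketch. It assumes a reward that is a weighted sum of measurable real-world signals, with the weights nudged by occasional user feedback. The class `AdaptiveReward`, the signal names and the update rule are illustrative assumptions, not the mechanism used by the paper or by DrEureka.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass
class AdaptiveReward:
    """Weighted sum of grounded signals whose weights shift with user feedback."""

    weights: Dict[str, float]
    learning_rate: float = 0.1

    def score(self, signals: Dict[str, float]) -> float:
        # Reward is computed from signals observed in the world, not hand-labeled.
        return sum(self.weights.get(name, 0.0) * value for name, value in signals.items())

    def adapt(self, signals: Dict[str, float], user_feedback: float) -> None:
        # user_feedback is assumed to lie in [-1, 1]; positive feedback raises
        # the weight of whichever signals were present, negative lowers it.
        for name, value in signals.items():
            current = self.weights.get(name, 0.0)
            self.weights[name] = current + self.learning_rate * user_feedback * value


# Hypothetical health-assistant signals, normalized to [0, 1].
reward = AdaptiveReward(weights={"steps_walked": 1.0, "sleep_quality": 0.0})
signals = {"steps_walked": 0.5, "sleep_quality": 1.0}
before = reward.score(signals)
reward.adapt(signals, user_feedback=1.0)  # the user approved of this outcome
after = reward.score(signals)
```

The point of the design is that the human supplies only sparse feedback, while the reward itself stays grounded in signals the agent can measure over a long-running stream.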
The idea of AI agents that adapt themselves to their environment through reinforcement learning is not new. But previously, these agents were limited to very constrained environments such as board games. Today, agents that can interact with complex environments (for example, AI computer use) and advances in reinforcement learning will overcome these limitations, bringing about the transition to the era of experience.
What does it mean for the enterprise?
Buried in Sutton and Silver’s paper is an observation that will have important implications for real-world applications: “The agent may use ‘human-friendly’ actions and observations such as user interfaces, that naturally facilitate communication and collaboration with the user. The agent may also take ‘machine-friendly’ actions that execute code and call APIs, allowing the agent to act autonomously in service of its goals.”
The era of experience means that developers will have to build their applications not only for humans but also with AI agents in mind. Machine-friendly actions require building secure and accessible APIs that can easily be reached directly or through interfaces such as MCP. It also means creating agents that can be made discoverable through protocols such as Google’s Agent2Agent. You will also need to design your APIs and agentic interfaces to provide access to both actions and observations. This will enable agents to gradually reason about and learn from their interactions with your applications.
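As a minimal sketch of what exposing both actions and observations could look like, the example below wraps application functions behind a small facade that lists the available actions and returns a structured observation for every call, including failures. It does not implement MCP or Agent2Agent; `AgentFacade`, `reserve_item` and the in-memory `stock` are hypothetical names chosen for illustration.

```python
from dataclasses import dataclass
from typing import Any, Callable, Dict, List


@dataclass
class Action:
    name: str
    description: str
    handler: Callable[..., Any]


class AgentFacade:
    """Machine-friendly entry point: discoverable actions, structured observations."""

    def __init__(self) -> None:
        self._actions: Dict[str, Action] = {}

    def register(self, name: str, description: str, handler: Callable[..., Any]) -> None:
        self._actions[name] = Action(name, description, handler)

    def describe(self) -> List[Dict[str, str]]:
        # Lets an agent discover what it can do without reading human docs.
        return [{"name": a.name, "description": a.description} for a in self._actions.values()]

    def invoke(self, name: str, **params: Any) -> Dict[str, Any]:
        # Every call yields an observation, so errors become learnable feedback.
        action = self._actions.get(name)
        if action is None:
            return {"ok": False, "error": f"unknown action: {name}"}
        try:
            return {"ok": True, "result": action.handler(**params)}
        except Exception as exc:
            return {"ok": False, "error": str(exc)}


stock = {"widget": 3}  # stand-in for real application state


def reserve_item(sku: str, quantity: int) -> Dict[str, int]:
    if stock.get(sku, 0) < quantity:
        raise ValueError(f"insufficient stock for {sku}")
    stock[sku] -= quantity
    return {"sku_remaining": stock[sku]}


facade = AgentFacade()
facade.register("reserve_item", "Reserve units of a product by SKU.", reserve_item)
```

Returning failures as data rather than raising them is a deliberate choice in this sketch: an agent that learns from experience needs to observe what went wrong, not just that the call stopped.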
If the vision that Sutton and Silver present becomes reality, there will soon be billions of agents roaming the web (and soon the physical world) to accomplish tasks. Their behaviors and needs will be very different from those of human users and developers, and having an agent-friendly way to interact with your application will improve your ability to leverage future AI systems (and also prevent the harms they can cause).
“By building upon the foundations of RL and adapting its core principles to the challenges of this new era, we can unlock the full potential of autonomous learning and pave the way to truly superhuman intelligence,” Sutton and Silver write.
DeepMind declined to provide additional comments for the story.