OpenAI, a research laboratory focused on the development of Artificial General Intelligence, has released Procgen Benchmark: 16 simple-to-use, procedurally generated environments designed to provide a direct measure of how quickly a reinforcement learning agent learns generalizable skills.

OpenAI researchers say they have found that all environments in Procgen Benchmark require training on 500 to 1,000 different levels before agents can generalize to new levels. This suggests, OpenAI continues, that standard reinforcement learning benchmarks need much more diversity within each environment.

Procgen Benchmark has now become the standard research platform used by the OpenAI RL team. The organization also hopes the platform will accelerate the community's work toward creating better and more efficient reinforcement learning algorithms.

The Procgen Benchmark platform consists of 16 games, each a unique environment, designed to measure both sample efficiency and generalization in reinforcement learning. This benchmark, OpenAI explains, is ideal for evaluating generalization, since distinct training and test sets can be generated in each environment.
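As an illustration of how such train/test splits can be set up, the procgen package exposes these environments through the standard Gym interface; the sketch below follows the argument names in the project's README (num_levels, start_level) and should be read as an example, not official usage:

```python
import gym

# Training set: 200 procedurally generated CoinRun levels
# (argument names follow the procgen README; illustrative sketch).
train_env = gym.make(
    "procgen:procgen-coinrun-v0",
    num_levels=200,   # restrict training to this many unique levels
    start_level=0,    # seed range for the training levels
)

# Test set: an unrestricted pool of levels starting from a disjoint
# seed, so the agent is evaluated on levels it never saw in training.
test_env = gym.make(
    "procgen:procgen-coinrun-v0",
    num_levels=0,     # 0 = unlimited levels
    start_level=200,  # disjoint from the training seed range
)
```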

It is also well suited to assessing sample efficiency, since all the environments pose diverse and compelling challenges, and this diversity requires agents to learn robust policies. In practice, the ability to generalize becomes an integral component of success, as agents face ever-changing levels.

This is not the only tool recently released by OpenAI, which explains: reinforcement learning agents need to explore their environments to learn optimal behaviors. In practice, they operate on the "trial and error" principle: they try things, see what works and what does not, and increase the likelihood of behaviors that work while decreasing the likelihood of those that do not.
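A minimal sketch of this trial-and-error loop, using an epsilon-greedy tabular Q-learning agent on a toy Gym task (the environment name, hyperparameters, and the classic Gym step API are all illustrative assumptions):

```python
import gym
import numpy as np

# Illustrative trial-and-error loop: epsilon-greedy tabular Q-learning
# on a toy task. Environment name, hyperparameters, and the classic
# (obs, reward, done, info) Gym step API are assumptions of this sketch.
env = gym.make("FrozenLake-v0")
q_table = np.zeros((env.observation_space.n, env.action_space.n))
alpha, gamma, epsilon = 0.1, 0.99, 0.1  # learning rate, discount, exploration

for episode in range(1000):
    state = env.reset()
    done = False
    while not done:
        # Try something: explore at random with probability epsilon,
        # otherwise exploit the action that has worked best so far.
        if np.random.rand() < epsilon:
            action = env.action_space.sample()
        else:
            action = int(np.argmax(q_table[state]))
        next_state, reward, done, _ = env.step(action)
        # See what works: nudge the value estimate toward the observed
        # outcome, making rewarded behavior more likely in the future.
        q_table[state, action] += alpha * (
            reward + gamma * np.max(q_table[next_state]) - q_table[state, action]
        )
        state = next_state
```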

However, OpenAI also points out that exploration is risky: agents may try dangerous behaviors that lead to unacceptable errors. This, in short, is what is known as the problem of safe exploration.

For this purpose, OpenAI has released Safety Gym, a suite of environments and tools for measuring the progress of agents that respect safety constraints during training. OpenAI also provides a standardized method for comparing algorithms and assessing how well agents avoid serious errors while they are still learning.
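By way of a sketch, Safety Gym registers environments such as Safexp-PointGoal1-v0 and reports constraint violations through a separate cost signal in the step info dictionary; the names below follow the project's README but should be treated as assumptions:

```python
import gym
import safety_gym  # noqa: F401 -- importing registers the Safexp-* environments

# Sketch: task reward and safety cost are reported separately, so
# algorithms can be compared on how well they avoid constraint
# violations during training (names follow the Safety Gym README).
env = gym.make("Safexp-PointGoal1-v0")
obs = env.reset()
total_reward, total_cost, done = 0.0, 0.0, False
while not done:
    obs, reward, done, info = env.step(env.action_space.sample())
    total_reward += reward
    total_cost += info.get("cost", 0.0)  # constraint violations this step
print(f"episode return: {total_reward:.2f}, safety cost: {total_cost:.2f}")
```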

If deep reinforcement learning is to be applied in the real world, whether in robotics or Internet-based activities, OpenAI points out, it will be important to have algorithms that are safe even while they are being trained: think, for example, of a self-driving car.

More information is available on the OpenAI website.
