To Learn To Deal With Uncertainty, This AI Plays Pong

AI is endowing robots, autonomous vehicles and countless of other forms of tech with new abilities and levels of self-sufficiency. Yet these models faithfully “make decisions” based on whatever data is fed into them, which could have dangerous consequences. For instance, if an autonomous car is driving down a highway and the sensor picks up a confusing signal (e.g., a paint smudge that is incorrectly interpreted as a lane marking), this could cause the car to swerve into another lane unnecessarily.

But in the ever-evolving world of AI, researchers are developing new ways to address challenges like this. One group of researchers has devised a new algorithm that allows the AI model to account for uncertain data, which they describe in a study published February 15 in IEEE Transactions on Neural Networks and Learning Systems.

“While we would like robots to work seamlessly in the real world, the real world is full of uncertainty,” says Michael Everett, a post-doctoral associate at MIT who helped develop the new approach. “It’s important for a system to be aware of what it knows and what it is unsure about, which has been a major challenge for modern AI.”

His team focused on a type of AI called reinforcement learning (RL), whereby the model tries to learn the “value” of taking each action in a given scenario through trial-and-error. They developed a secondary algorithm, called Certified Adversarial Robustness for deep RL (CARRL), that can be built on top of an existing RL model.

“Our key innovation is that rather than blindly trusting the measurements, as is done today [by AI models], our algorithm CARRL thinks through all possible measurements that could have been made, and makes a decision that considers the worst-case outcome,” explains Everett.

In their study, the researchers tested CARRL across several different tasks, including collision avoidance simulations and Atari pong. For younger readers who may not be familiar with it, Atari pong is a classic computer game whereby an electronic paddle is used to direct a ping pong on the screen. In the test scenario, CARRL helped move the paddle slightly higher or lower to compensate for the possibility that the ball could approach at a slightly different point than what the input data indicated. All the while, CARRL would try to ensure that the ball would make contact with at least some part of paddle.

Across all test scenarios, the RL model was better at compensating for potential inaccurate or “noisy” data with CARRL, than without CARRL.

But the results also show that, like with humans, too much self-doubt and uncertainty can be unhelpful. In the collision avoidance scenario, for example, indulging in too much uncertainty caused the main moving object in the simulation to avoid both the obstacle and its goal. “There is definitely a limit to how ‘skeptical’ the algorithm can be without becoming overly conservative,” Everett says.

This research was funded by Ford Motor Company, but Everett notes that it could be applicable under many other commercial applications requiring safety-aware AI, including aerospace, healthcare, or manufacturing domains.

“This work is a step toward my vision of creating ‘certifiable learning machines’—systems that can discover how to explore and perform in the real world on their own, while still having safety and robustness guarantees,” says Everett. “We’d like to bring CARRL into robotic hardware while continuing to explore the theoretical challenges at the interface of robotics and AI.”

Leave a Reply

Your email address will not be published. Required fields are marked *

%d bloggers like this: