A new algorithm could help robots do a better job planning for the complexities of the real world. Interestingly, the code was tested in a video game.
Robots often have a tough time planning their actions, in part because they don’t intuitively know where to focus their attention and which information and objects can safely be ignored. This can lead to a problem known as state-space explosion, in which a robot faces far too many options to search through.
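To see why the search space blows up, consider a planner that looks ahead several steps: with b possible actions per step and a lookahead of d steps, the number of action sequences grows as b^d. A toy calculation (the numbers are illustrative, not from the study):

```python
# Illustrative only: exponential growth of a planner's search space.
# With b actions available at each step and a lookahead of d steps,
# a naive planner must consider b**d distinct action sequences.
def num_sequences(branching_factor: int, depth: int) -> int:
    return branching_factor ** depth

# Even modest numbers explode quickly:
print(num_sequences(5, 10))   # 5 actions, 10 steps:  9765625
print(num_sequences(20, 10))  # 20 actions, 10 steps: 10240000000000
```

Shrinking the branching factor, rather than the lookahead, is what keeps the robot’s capabilities intact while making the search tractable.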
“It’s a really tough problem,” said Stefanie Tellex, assistant professor of computer science at Brown. “We want robots that have capabilities to do all kinds of different things, but then the space of possible actions becomes enormous. We don’t want to limit the robot’s capabilities, so we have to find ways to shrink the search space.”
One of Tellex’s graduate students, David Abel, has led the effort to solve this problem. Tellex and her students hope their algorithm will narrow the options down to something reasonable.
The algorithm created at Brown enhances a robot’s performance using what the researchers call goal-based action priors: sets of objects and actions that are likely helpful to achieve the goal. Priors can be provided by an outside expert, but ideally, the software itself learns how to extract priors by trial and error.
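One minimal way to picture a goal-based action prior is as a mapping from a goal and a few observed state features to the subset of actions worth searching. The sketch below is a hypothetical illustration (the action and feature names are invented, not from the Brown code):

```python
# Hypothetical sketch of a goal-based action prior: given a goal and
# a set of observed state features, return only the actions that are
# likely helpful, so the planner ignores the rest.
ALL_ACTIONS = {"move", "place_block", "destroy_block", "jump"}

def action_prior(goal: str, features: set[str]) -> set[str]:
    relevant = {"move"}  # moving is almost always relevant
    if goal == "cross_chasm" and "at_trench_edge" in features:
        relevant.add("place_block")
    if goal == "mine_gold" and "gold_under_blocks" in features:
        relevant.add("destroy_block")
    return relevant

# The planner searches only the pruned set instead of ALL_ACTIONS:
print(sorted(action_prior("cross_chasm", {"at_trench_edge"})))
# ['move', 'place_block']
```

Here the rules are hand-written, as an outside expert might provide them; the learning described below replaces these hand-written conditions with ones extracted from experience.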
The researchers say the video game Minecraft was perfect for testing how well the algorithm selected priors.
“Minecraft is a really good model of a lot of these robot problems,” Tellex said. “There’s a huge space of possible actions somebody playing this game can do, and it’s really cheap and easy to collect a ton of training data. It’s much harder to do that in the real world.”
The researchers began by building small domains of just a few blocks in a model of Minecraft. They then gave a character a goal, such as mining buried gold or crossing a chasm. The character used the algorithm to try different methods to achieve the goal and learn the appropriate priors.
“It’s able to learn that if you’re standing next to a trench and you’re trying to walk across, you can place blocks in the trench. Otherwise don’t place blocks,” Tellex said. “If you’re trying to mine some gold under some blocks, destroy the blocks. Otherwise don’t destroy blocks.”
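One simple way to picture this kind of trial-and-error learning is frequency counting: record which actions appeared in successful attempts for a given goal and state feature, and keep the ones that show up often. This is a hedged sketch of the idea, not the researchers’ actual method, and the episode data below is invented:

```python
from collections import defaultdict

# Hypothetical sketch: estimate action priors from experience by
# counting how often each action appears in successful attempts for a
# given (goal, state-feature) pair, then keeping frequent actions.
def learn_priors(episodes, threshold=0.5):
    counts = defaultdict(int)   # (goal, feature, action) -> successes
    totals = defaultdict(int)   # (goal, feature) -> total successes
    for goal, feature, action, succeeded in episodes:
        if succeeded:
            totals[(goal, feature)] += 1
            counts[(goal, feature, action)] += 1
    priors = defaultdict(set)
    for (goal, feature, action), n in counts.items():
        if n / totals[(goal, feature)] >= threshold:
            priors[(goal, feature)].add(action)
    return priors

# Invented training episodes matching the examples in the quote:
episodes = [
    ("cross_chasm", "at_trench", "place_block", True),
    ("cross_chasm", "at_trench", "place_block", True),
    ("cross_chasm", "at_trench", "jump", False),
    ("mine_gold", "gold_buried", "destroy_block", True),
]
priors = learn_priors(episodes)
print(priors[("cross_chasm", "at_trench")])   # {'place_block'}
```

The payoff is that "place blocks" only survives as a candidate action when the character is actually standing at a trench with the goal of crossing it, exactly the kind of conditional rule Tellex describes.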
After the algorithm had tried the task several times and learned the priors in one domain, the researchers would move it to another. The Brown team found that the algorithm was, in fact, able to apply what it had learned, completing tasks in new domains faster than standard planning algorithms.
The researchers then tested their algorithm using a real robot. The goal: help a human bake brownies. The algorithm was given several action priors for the task. For example, one prior let the robot know that eggs often need to be beaten with a whisk. So when a carton of eggs appeared in the robot’s workspace, it was able to anticipate the cook’s need for a whisk and hand him one.
Tellex says the results indicate that goal-based action priors are a viable strategy to help robots function well in unstructured environments.