Old dog training methods teach robots new tricks

Nov 3, 2020

This article is published in collaboration with Futurity.

Sony's Aibo robotic dogs are displayed during the 2018 CES in Las Vegas, Nevada, U.S. January 8, 2018.

Positive reinforcement is helping to programme robots.

Image: REUTERS/Steve Marcus

Jill Rosen

Senior Media Representative, Johns Hopkins University


This article is part of: Pioneers of Change Summit

A team of researchers has been teaching robots new skills using techniques normally meant for training dogs.
Using positive reinforcement, they taught a robot in days what typically takes a month.
The team imagines these findings could help train household robots as well as self-driving cars.

With a training technique commonly used to teach dogs to sit and stay, researchers showed a robot how to teach itself several new tricks, including stacking blocks.

With the method, the robot, named Spot, was able to learn in days what typically takes a month.

By using positive reinforcement, an approach familiar to anyone who’s used treats to change a dog’s behavior, the team dramatically improved the robot’s skills and did it quickly enough to make training robots for real-world work a more feasible enterprise.

“The question here was how do we get the robot to learn a skill?” says lead author Andrew Hundt, a PhD student working in Johns Hopkins University’s Computational Interaction and Robotics Laboratory. “I’ve had dogs so I know rewards work and that was the inspiration for how I designed the learning algorithm.”

The research appears in IEEE Robotics and Automation Letters.

Teaching a robot to learn

Unlike humans and animals, which are born with highly intuitive brains, computers are blank slates and must learn everything from scratch. But true learning is often accomplished through trial and error, and roboticists are still figuring out how robots can learn efficiently from their mistakes.

With a technique used to train dogs, researchers showed a robot how to teach itself new tricks, including stacking blocks. Image: IEEE Robotics and Automation Letters

The team accomplished that by devising a reward system that works for a robot the way treats work for a dog. Where a dog might get a cookie for a job well done, the robot earned numeric points.

Hundt recalled how he once taught his terrier mix puppy named Leah the command “leave it,” so she could ignore squirrels on walks. He used two types of treats, ordinary trainer treats and something even better, like cheese.

When Leah was excited and sniffing around the treats, she got nothing. But when she calmed down and looked away, she got the good stuff. “That’s when I gave her the cheese and said, ‘Leave it! Good Leah!'”

Similarly, to stack blocks, Spot the robot needed to learn how to focus on constructive actions. As the robot explored the blocks, it quickly learned that correct behaviors for stacking earned high points, but incorrect ones earned nothing. Reach out but don’t grasp a block? No points. Knock over a stack? Definitely no points. Spot earned the most by placing the last block on top of a four-block stack.
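The point scheme described above can be sketched in a few lines of Python. This is only an illustration, not the team's actual reward function: the outcome names, the point values, and the idea of scaling points with stack height are all assumptions made for the example. The article states only that unhelpful actions earn nothing and that finishing the stack earns the most.

```python
from enum import Enum, auto


class Outcome(Enum):
    """Hypothetical labels for what one robot action achieved."""
    REACHED_NO_GRASP = auto()   # reached out but did not grasp a block
    TOPPLED_STACK = auto()      # knocked over the stack
    GRASPED_BLOCK = auto()      # picked up a block
    PLACED_BLOCK = auto()       # set a block on the stack


# Assumed point values; the article does not give the real numbers.
GRASP_POINTS = 1.0
PLACE_POINTS = 2.0


def reward(outcome: Outcome, stack_height_after: int) -> float:
    """Return the points earned by one action.

    stack_height_after is the height of the stack once the action is done.
    """
    if outcome in (Outcome.REACHED_NO_GRASP, Outcome.TOPPLED_STACK):
        return 0.0  # unhelpful or destructive actions earn nothing
    if outcome is Outcome.GRASPED_BLOCK:
        return GRASP_POINTS
    # Placements are worth more the taller the stack gets, so the
    # final placement pays the most.
    return PLACE_POINTS * stack_height_after


print(reward(Outcome.REACHED_NO_GRASP, 1))
print(reward(Outcome.PLACED_BLOCK, 4))
```

Under these assumed values, a placement that brings the stack to four blocks scores higher than any earlier step, while reaching without grasping and toppling the stack both score zero.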

High score!

The training tactic not only worked but also took just days to teach the robot what used to take weeks. The team was able to reduce the practice time by first training a simulated robot, a process that is a lot like playing a video game, and then running tests with Spot.

“The robot wants the higher score,” Hundt says. “It quickly learns the right behavior to get the best reward. In fact, it used to take a month of practice for the robot to achieve 100% accuracy. We were able to do it in two days.”
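To make the idea of practicing in simulation for a higher score concrete, here is a minimal sketch of trial-and-error learning in a toy block-stacking simulator. It uses tabular Q-learning, a standard textbook method chosen here for brevity; it is not the team's algorithm. The three actions, the point values, and every parameter are hypothetical.

```python
import random

# Hypothetical action set for the toy simulator.
REACH_ONLY, SWIPE, PLACE = 0, 1, 2
ACTIONS = (REACH_ONLY, SWIPE, PLACE)
GOAL_HEIGHT = 4  # the trial ends in success at a four-block stack


def step(height, action):
    """Toy simulator: return (next_height, points, done)."""
    if action == PLACE:
        height += 1
        # Assumed scoring: taller stacks earn more points.
        return height, float(height), height == GOAL_HEIGHT
    if action == SWIPE:
        return 0, 0.0, True    # stack toppled: no points, trial over
    return height, 0.0, False  # reached without grasping: nothing changes


def greedy(q, height):
    """Pick the action with the highest learned score for this height."""
    return max(ACTIONS, key=lambda a: q[height][a])


def train(episodes=500, alpha=0.5, gamma=0.9, epsilon=0.2,
          max_steps=20, seed=0):
    """Learn a score table by repeated practice in the simulator."""
    rng = random.Random(seed)
    q = [[0.0] * len(ACTIONS) for _ in range(GOAL_HEIGHT + 1)]
    for _ in range(episodes):
        height = 0
        for _ in range(max_steps):
            # Mostly exploit what has been learned, sometimes explore.
            if rng.random() < epsilon:
                action = rng.choice(ACTIONS)
            else:
                action = greedy(q, height)
            nxt, points, done = step(height, action)
            target = points if done else points + gamma * max(q[nxt])
            q[height][action] += alpha * (target - q[height][action])
            height = nxt
            if done:
                break
    return q


q_table = train()
policy = [greedy(q_table, h) for h in range(GOAL_HEIGHT)]
print(policy)
```

Before training, every entry in the table is zero and the greedy choice defaults to the first action, reaching without grasping. After practice, the learned policy should prefer placing a block at every stack height, because that is the only action that ever earns points in this toy setup.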

Positive reinforcement did more than help the robot teach itself to stack blocks. With the point system, the robot just as quickly learned several other tasks, even how to play a simulated navigation game. The ability to learn from mistakes in all types of situations is critical for designing a robot that could adapt to new environments.

“At the start the robot has no idea what it’s doing but it will get better and better with each practice. It never gives up and keeps trying to stack and is able to finish the task 100% of the time,” Hundt says.

The team imagines these findings could help train household robots to do laundry and wash dishes, tasks that could be popular on the open market and help seniors live independently. The approach could also help design improved self-driving cars.

“Our goal is to eventually develop robots that can do complex tasks in the real world—like product assembly, caring for the elderly, and surgery,” says coauthor Gregory D. Hager, a professor of computer science.

“We don’t currently know how to program tasks like that—the world is too complex. But work like this shows us that there is promise to the idea that robots can learn how to accomplish such real-world tasks in a safe and efficient way,” Hager says.

Source: Johns Hopkins University


License and Republishing

World Economic Forum articles may be republished in accordance with the Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International Public License, and in accordance with our Terms of Use.

The views expressed in this article are those of the author alone and not the World Economic Forum.