How do robots ‘see’ the world?
This article is published in collaboration with The Conversation.
The world has gone mad for robots, with articles appearing almost every day about the coming robot revolution. But is all the hype, excitement and sometimes fear justified? Is the robot revolution really coming?
The answer is probably that in some areas of our lives we will see more robots soon. But realistically, we are not going to see dozens of robots out and about in our streets or wandering around our offices in the very near future.
One of the main reasons is simply that robots do not yet have the ability to really see the world. But before talking about how robots of the future might see, first we should consider what we actually mean by seeing.
I see you
Most of us have two eyes and we use those eyes to collect light that reflects off the objects around us. Our eyes convert that light into electrical signals that are sent down our optic nerves and immediately processed by our brain.
Our brain somehow works out what is around us from all of those electrical impulses and from our experiences. It builds up a representation of the world and we use that to navigate, to help us pick things up, to enable us to see one another’s faces, and to do a million other things we take for granted.
That whole activity, from collecting the light in our eyes, to having an understanding of the world around us, is what is meant by seeing.
Researchers have estimated that up to 50% of our brain is involved in the process of seeing. Nearly all of the world’s animals have eyes and can see in some way. Most of these animals, insects in particular, have far simpler brains than humans and they function well.
This shows that some forms of seeing can be achieved without the massive computing power of our mammalian brains. Evolution has clearly found seeing to be quite useful.
Robot vision
It is therefore unsurprising that many robotics researchers predict that if a robot can see, we are likely to actually see a boom in robotics and robots may finally become the helpers of humans that so many people have desired.
How then do we get a robot to see? The first part is straightforward. We use a video camera, just like the one in your smartphone, to collect a constant stream of images. Camera technology for robots is a large research field in itself, but for now just think of a standard video camera. We pass those images to a computer and then we have options.
Since the 1970s, robot vision engineers have thought about features in images. These might be lines, or interesting points like corners or certain textures. The engineers write algorithms to find these features and track them from image frame to image frame in the video stream.
This step is essentially reducing the amount of data from the millions of pixels in an image to a few hundred or thousand features.
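As a rough illustration of that reduction, here is a minimal sketch in plain Python. It is not the method of any particular robot; it assumes a tiny synthetic image (a bright square on a dark background) and uses a Shi-Tomasi-style score, the smaller eigenvalue of the local gradient matrix, to keep only corner-like pixels. The function names, the image size and the threshold are all made up for the example.

```python
import math

def make_test_image(size=12, lo=4, hi=7):
    """Hypothetical test image: a bright square on a dark background."""
    return [[1.0 if lo <= y <= hi and lo <= x <= hi else 0.0
             for x in range(size)] for y in range(size)]

def gradients(img):
    """Central-difference brightness gradients; border pixels stay at zero."""
    h, w = len(img), len(img[0])
    ix = [[0.0] * w for _ in range(h)]
    iy = [[0.0] * w for _ in range(h)]
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            ix[y][x] = (img[y][x + 1] - img[y][x - 1]) / 2.0
            iy[y][x] = (img[y + 1][x] - img[y - 1][x]) / 2.0
    return ix, iy

def corner_features(img, threshold=0.1):
    """Return (row, col) pixels whose 3x3 neighbourhood changes in two directions."""
    h, w = len(img), len(img[0])
    ix, iy = gradients(img)
    features = []
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            sxx = syy = sxy = 0.0
            for dy in (-1, 0, 1):
                for dx in (-1, 0, 1):
                    gx, gy = ix[y + dy][x + dx], iy[y + dy][x + dx]
                    sxx += gx * gx
                    syy += gy * gy
                    sxy += gx * gy
            # Smaller eigenvalue of [[sxx, sxy], [sxy, syy]]:
            # zero on flat areas and straight edges, positive near corners.
            score = (sxx + syy) / 2.0 - math.sqrt(((sxx - syy) / 2.0) ** 2 + sxy ** 2)
            if score > threshold:
                features.append((y, x))
    return features

image = make_test_image()
features = corner_features(image)
print(len(image) * len(image[0]), "pixels reduced to", len(features), "feature pixels")
```

Flat regions and straight edges score zero, so only pixels next to the four corners of the square survive. A real system would add further steps, such as keeping only the strongest response in each neighbourhood and tracking features between frames.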
In the recent past, when computing power was limited, this was an essential step in the process. The engineers then think about what the robot is likely to see and what it will need to do. They write software that will recognise patterns in the world to help the robot understand what is around it.
The local environment
The software may create a very basic map of the environment as the robot operates or it may try to match the features that it finds with a library of features that the software is looking for.
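Matching against a library can be sketched as a nearest-neighbour lookup. The example below is only an assumption about how such a lookup might work: the three-number "descriptors", the library entries and the distance cut-off are all invented for illustration.

```python
import math

def distance(a, b):
    """Euclidean distance between two descriptors of equal length."""
    return math.sqrt(sum((p - q) ** 2 for p, q in zip(a, b)))

def match_features(observed, library, max_distance=0.5):
    """For each observed descriptor, return the closest library label,
    or None when nothing in the library is close enough."""
    matches = []
    for descriptor in observed:
        label, best = min(
            ((name, distance(descriptor, reference)) for name, reference in library.items()),
            key=lambda item: item[1],
        )
        matches.append(label if best <= max_distance else None)
    return matches

# Hypothetical library written by the engineer ahead of time.
LIBRARY = {
    "door_corner": (0.9, 0.1, 0.0),
    "table_edge": (0.1, 0.8, 0.2),
    "floor_tile": (0.0, 0.1, 0.9),
}

observed = [(0.85, 0.15, 0.05), (0.05, 0.15, 0.8), (0.5, 0.5, 0.5)]
print(match_features(observed, LIBRARY))
```

The third observation is not close to anything in the library, so it is left unmatched. This is the limitation the approach carries: the robot can only recognise what a human thought to put in the library.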
In essence the robots are being programmed by a human to see things that a human thinks the robot is going to need to see. There have been many successful examples of this type of robot vision system, but practically no robot that you find today is capable of navigating in the world using vision alone.
Such systems are not yet reliable enough to keep a robot from bumping into things or falling over for long enough to give the robot a practical use. The driverless cars that are talked about in the media use either lasers or radar to supplement their vision systems.
In the past five to ten years a new robot vision research community has started to take shape. These researchers have demonstrated systems that are not programmed as such but instead learn how to see.
They have developed robot vision systems whose structure is inspired by how scientists think animals see. That is, they use the concept of layers of neurons, just like in an animal brain. The engineers program the structure of the system but they do not develop the algorithm that runs on that system. That is left to the robot to work out for itself.
This technique is known as machine learning and, because we now have easy access to significant computer power at a reasonable cost, these techniques are beginning to work! Investment in these technologies is accelerating fast.
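To make the idea of "structure programmed, behaviour learned" concrete, here is a toy sketch of a single artificial neuron, the unit from which such layers are built. It uses the classic perceptron update rule, which is a deliberate simplification and not what modern vision systems run. The training examples are hypothetical: two yes/no inputs, with the answer "yes" only when both are present.

```python
def predict(weights, bias, inputs):
    """Fire (1) when the weighted sum of the inputs is positive, else 0."""
    total = sum(w * x for w, x in zip(weights, inputs)) + bias
    return 1 if total > 0 else 0

def train(examples, epochs=20, rate=1):
    """Perceptron rule: nudge the weights whenever a prediction is wrong."""
    weights = [0] * len(examples[0][0])
    bias = 0
    for _ in range(epochs):
        for inputs, target in examples:
            error = target - predict(weights, bias, inputs)
            weights = [w + rate * error * x for w, x in zip(weights, inputs)]
            bias += rate * error
    return weights, bias

# Hypothetical labelled examples: the programmer supplies these,
# but never writes the rule that separates them.
EXAMPLES = [((0, 0), 0), ((0, 1), 0), ((1, 0), 0), ((1, 1), 1)]

weights, bias = train(EXAMPLES)
for inputs, target in EXAMPLES:
    print(inputs, "->", predict(weights, bias, inputs), "(expected", str(target) + ")")
```

The programmer fixes the structure (two inputs, one output, a threshold), while the numbers in `weights` and `bias` are found from the examples. Real systems stack many such units into layers and learn from far more data.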
The hive mind
The significance of having robots learn is that they can easily share their learning. One robot will not have to learn from scratch like a newborn animal. A new robot can be given the experiences of other robots and can build upon those.
One robot may learn what a cat looks like and transfer that knowledge to thousands of other robots. More significantly, one robot may solve a complex task such as navigating its way around a part of a city and instantly share that with all the other robots.
Equally important is that robots which share experiences may learn together. For example, one thousand robots may each observe a different cat, share that data with one another via the internet and together learn to classify all cats. This is an example of distributed learning.
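One very simple way to picture distributed learning is for each robot to summarise what it has seen and for the summaries to be merged. The sketch below assumes each robot reduces its cat observations to an average feature vector plus a count; the robots, the two-number features and the function names are all hypothetical, and real systems merge far richer models than an average.

```python
def local_summary(samples):
    """What one robot shares: the mean of its observations and how many it saw."""
    n = len(samples)
    dims = len(samples[0])
    mean = tuple(sum(s[d] for s in samples) / n for d in range(dims))
    return mean, n

def merge_summaries(summaries):
    """Combine the robots' summaries, weighting each by its observation count."""
    total = sum(n for _, n in summaries)
    dims = len(summaries[0][0])
    return tuple(sum(mean[d] * n for mean, n in summaries) / total for d in range(dims))

# Hypothetical observations from three robots, each watching different cats.
robot_a = [(1.0, 2.0), (3.0, 4.0)]
robot_b = [(5.0, 6.0)]
robot_c = [(7.0, 0.0), (9.0, 2.0), (8.0, 1.0)]

shared = merge_summaries([local_summary(r) for r in (robot_a, robot_b, robot_c)])
print(shared)
```

Because each summary is weighted by its count, the merged result matches what a single robot would have computed had it seen every cat itself, yet no robot had to send its raw observations to the others.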
The fact that robots of the future will be capable of shared and distributed learning has profound implications and is scaring some, while exciting others.
It is quite possible that your credit card transactions are being checked for fraud right now by a data centre self-learning machine. These systems can spot possible fraud that no human could ever detect. A hive mind being used for good.
The real robot revolution
There are numerous applications for robots that can see. It’s hard to think of a part of our life where such a robot could not help.
The first uses of robots that can see are likely to be in industries that either have labour shortage issues, such as agriculture, or are inherently unattractive to humans and maybe hazardous.
Examples include searching through rubble after disasters, evacuating people from dangerous situations or working in confined and difficult-to-access spaces.
Applications that require very long periods of attention, something humans find hard, will also be ripe to be done by a robot that can see. Our future home-based robot companions will be far more useful if they can see us.
And in an operating theatre near you, it is likely that a seeing robot will soon be assisting surgeons. The robot’s superior vision and super-precise and steady arms and hands will allow surgeons to focus on what they are best at: deciding what to do.
Even that decision-making ability may be superseded by a hive mind of robot doctors. The robots will have it all stitched up!
Publication does not imply endorsement of views by the World Economic Forum.
Author: Jonathan is Professor in Robotics at Queensland University of Technology (QUT).
Image: A woman talks to a humanoid robot named Han developed by Hanson Robotics. REUTERS/Tyrone Siu.
License and Republishing
World Economic Forum articles may be republished in accordance with the Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International Public License, and in accordance with our Terms of Use.