May 29, 2013 at 1:32 PM ET
It's a sci-fi trope that's been used in dozens of movies and books: The robot with the high-precision eyes, blazing-fast silicon brain, and lightning-quick limbs that detects what someone is doing as soon as they start — and usually stops them. But what if that robot wasn't bent on the destruction of the human race, but just wanted to help you put away your groceries?
That's what Cornell University's Personal Robotics lab is working on: A robot that sees what you're doing and comes to your aid, whether that means opening a door, pouring a drink or just moving out of your way.
It works by viewing its environment through images and depth-mapping hardware — in this case, hacker favorite the Microsoft Kinect. The system views the room and categorizes objects: Tabletop, container, wall and so on. When a human enters the scene, the robot doesn't just "think" of him or her as an object, however, but as something that moves other objects around.
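That categorization step can be pictured with a toy sketch. Everything below — the feature names, thresholds, and categories — is invented for illustration; it is not the Cornell lab's actual classifier, just one way a depth-map segment could be binned into coarse labels like "tabletop" or "wall":

```python
# Toy illustration: label scene segments from depth data by crude
# geometric cues. Features and thresholds are made up for this sketch.

def categorize_segment(height_m, flatness, encloses_volume):
    """Guess a coarse category for one segmented region of the scene."""
    if flatness > 0.9 and 0.5 < height_m < 1.2:
        return "tabletop"    # flat surface at roughly waist height
    if encloses_volume:
        return "container"   # hollow shape that can hold other objects
    if flatness > 0.9 and height_m > 1.8:
        return "wall"        # large flat surface taller than a person
    return "object"          # everything else

segments = [
    {"height_m": 0.75, "flatness": 0.95, "encloses_volume": False},
    {"height_m": 0.10, "flatness": 0.30, "encloses_volume": True},
    {"height_m": 2.40, "flatness": 0.97, "encloses_volume": False},
]
labels = [categorize_segment(**s) for s in segments]
print(labels)  # ['tabletop', 'container', 'wall']
```

A real system would derive such labels from point-cloud features rather than hand-set thresholds, but the output — a scene parsed into named, interactable regions — is the same kind of thing the robot reasons over.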
"Our algorithm makes a guess on what the person has been doing in the past, and considers several possible activities that the person may be doing in the future," professor Ashutosh Saxena, who is developing the system with Ph.D. student Hema Koppula, told NBC News in an email.
By weighing different options for what's likely to happen given where a person is and how they're moving, the system guesses your next action. If you're by a table with a glass on it, it will guess that you're more likely to reach for the glass than for the table. And once you have the glass, you might just put it back down, but you might put it in the dishwasher, which the robot has also mapped out as a potential target.
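One way to picture that weighing step is as a simple score over candidate targets. This is a minimal sketch under assumptions of my own (the distance-based reachability term, the made-up transition frequencies), not the published algorithm: each object near the person gets a score combining how reachable it is and how often that object followed the current activity in training examples.

```python
import math

# Minimal sketch of anticipating a next target: combine proximity to
# the person's hand with how often each object followed the current
# activity in (hypothetical) training data, then normalize.

def score_targets(hand_pos, objects, transition_freq, current_activity):
    scores = {}
    for name, pos in objects.items():
        dist = math.dist(hand_pos, pos)
        reachability = 1.0 / (1.0 + dist)  # nearer objects score higher
        prior = transition_freq.get((current_activity, name), 0.01)
        scores[name] = reachability * prior
    total = sum(scores.values())
    return {name: s / total for name, s in scores.items()}

objects = {"glass": (0.3, 0.0, 0.9),
           "table": (0.0, 0.0, 0.7),
           "dishwasher": (2.0, 1.0, 0.8)}
probs = score_targets(hand_pos=(0.2, 0.1, 0.9),
                      objects=objects,
                      transition_freq={("reaching", "glass"): 0.6,
                                       ("reaching", "dishwasher"): 0.3},
                      current_activity="reaching")
likely = max(probs, key=probs.get)  # 'glass': close by and often reached for
```

The glass wins here because it is both nearby and a common target of "reaching"; the dishwasher stays a live, lower-probability option, which matches the article's point that the robot tracks several possible futures at once.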
The challenge, according to Saxena, is "visual complexity in the environment." The more stuff there is, the harder it is to categorize and parse all those objects. But better "eyes" could help there, and Saxena acknowledged that the new version of the Kinect could "tremendously" improve the system's ability to understand its surroundings, as well as track multiple people.
Once it has a pretty good idea of what you're going to do, it can either help or adjust its own actions. It could help by opening the dishwasher for you, for instance — physically or remotely. Or if it was going to pour you some coffee, it might wait until you're done moving things around the table.
If it's not sure what you're up to, it will keep on watching until it figures it out. "The anticipations become more accurate as more observations are made," Saxena notes in a demonstration video.
But it's flexible as well: Though it's trained by watching videos of people doing various actions (lifting, pouring, opening doors), it can put those together in novel ways and predict an action it has never actually seen done before.
The system could be hugely helpful for robots that may one day assist elderly and disabled folks in their own homes. It would be a pain, after all, to have to constantly announce what you want your household robot to do.
And while the project looked first into household tasks at a human size and time scale, the robot could just as easily "anticipate" the trajectory of a ball or car, or even identify objects in the environment that could harm it or others.
Saxena and Koppula will be presenting their work at the International Conference on Machine Learning, which takes place in Atlanta in June, and then at the "Robotics: Science and Systems" conference in Berlin later that month.
Devin Coldewey is a contributing writer for NBC News Digital. His personal website is coldewey.cc.