Robot See, Robot Do: This AI Learns by Watching YouTube Videos

When humans want to learn how to, say, make a sandwich, we often watch someone else do it first and then imitate the actions ourselves. Robots may learn the same way in the future, if research from DARPA and the University of Maryland pans out. The school's roboticists have created a system that translates basic actions seen in YouTube cooking videos into real-life actions carried out by a Baxter humanoid robot. No, Baxter isn't quite ready to make you a risotto, but the system does reliably discern which tool to use, how to grip it, and what to do with it.

For instance, by watching a video of a cook brushing melted butter onto a corn cob, the robot learns that it must hold the brush in its left hand with a "power grip" and the corn in a "precision grip," and then use the brush to perform the action "spread" on the corn. The AI already knows what the objects, grips, and actions are, but it works out how to combine them just from watching the video, or perhaps, in the future, from watching you. Check out the UM paper (PDF) for more detail on what exactly the robot sees and decides.
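
To make the corn example concrete, here is a minimal Python sketch of how such a parsed command could be represented: a grip for each object plus one action that uses the tool on the target. The class names, field names, and the `to_steps` helper are all hypothetical, chosen for illustration; they are not taken from the UM paper or from any Baxter software. The article does not say which hand holds the corn, so that field is left unset.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Grasp:
    """One held object (hypothetical structure, for illustration only)."""
    obj: str                    # e.g. "brush"
    grip: str                   # "power" or "precision"
    hand: Optional[str] = None  # "left", "right", or None if unspecified


@dataclass(frozen=True)
class ActionCommand:
    """A tool, a target, and the action performed with the tool on the target."""
    tool: Grasp
    target: Grasp
    action: str


def to_steps(cmd: ActionCommand) -> List[str]:
    """Flatten a command into an ordered list of plain-text steps."""
    steps = []
    for g in (cmd.tool, cmd.target):
        hand = f"{g.hand} hand" if g.hand else "free hand"
        steps.append(f"grasp {g.obj} with {hand} using {g.grip} grip")
    steps.append(f"{cmd.action} {cmd.tool.obj} on {cmd.target.obj}")
    return steps


# The corn example from the article; the hand holding the corn is not stated there.
command = ActionCommand(
    tool=Grasp(obj="brush", grip="power", hand="left"),
    target=Grasp(obj="corn", grip="precision"),
    action="spread",
)

for step in to_steps(command):
    print(step)
```

The sketch only covers the "what" that the article says the AI already knows; working out these values from video is the hard part and is not shown here.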

—Devin Coldewey