Sep. 14, 2012 at 2:53 PM ET
Computers have gotten pretty good at matching photos with objects, but the simple sketches humans make on the backs of napkins or in drawing games like Pictionary baffle them. A new research paper describes a computer system that can recognize these drawings almost as well as a human can.
The researchers, led by Mathias Eitz of the Berlin Institute of Technology, trained the computer by exposing it to 20,000 sketches in 250 categories solicited from people online. Seagulls, snails and blimps were represented, but abstract notions and verbs like dignity or punt (such as might be found in the "difficult" category in Pictionary) were left out.
These thousands of sketches were each broken down further into a "bag of features," using statistical analysis to determine where curves and lines should be placed in certain drawings. Humans have similar, but not fixed, ideas on how something is drawn, as this group of sheep sketches, grouped according to similarity, shows.
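To make the "bag of features" idea concrete, here is a minimal Python sketch of how such a representation is commonly built: each small piece of a drawing is described by a short list of numbers, matched to the closest entry in a fixed "vocabulary" of typical pieces, and the drawing as a whole is summarized by how often each vocabulary entry appears. This is an illustration under assumptions, not the researchers' actual method; the function names, the two-number descriptors and the three-entry vocabulary are all hypothetical.

```python
from collections import Counter
import math


def nearest_word(descriptor, vocabulary):
    """Return the index of the vocabulary entry closest to the descriptor."""
    return min(
        range(len(vocabulary)),
        key=lambda i: math.dist(descriptor, vocabulary[i]),
    )


def bag_of_features(descriptors, vocabulary):
    """Summarize one sketch as a normalized histogram of vocabulary matches."""
    counts = Counter(nearest_word(d, vocabulary) for d in descriptors)
    total = len(descriptors)
    return [counts[i] / total if total else 0.0 for i in range(len(vocabulary))]


# Hypothetical toy data: a three-entry vocabulary and four local
# descriptors taken from a single sketch (values invented for illustration).
vocabulary = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0)]
sketch = [(0.1, 0.0), (0.9, 0.1), (1.0, -0.1), (0.0, 0.8)]

histogram = bag_of_features(sketch, vocabulary)
print(histogram)
```

Two drawings of the same object tend to produce similar histograms even when their strokes fall in different places, which is what makes a summary of this kind useful input for a classifier.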
The goal isn't to reach 100 percent recognition. When tested in the study, even humans recognized the sketches correctly only 73 percent of the time. The computer succeeded 56 percent of the time: not quite as good as a person, but far better than chance, which with 250 categories would be right only about 0.4 percent of the time.
You can try out the technology yourself using the researchers' iPhone app (which, since the research is ongoing, should be considered a work in progress).
The researchers tested using simple, low-resolution sketches, but they are already experimenting with improvements to the recognition process. The order of the strokes people used to draw something could be a factor, for instance, since few people draw the tail of the sheep before the body, or the stem of a tomato before the fruit.
Ultimately the researchers hope to better understand how humans can easily recognize even the crudest sketches of objects and faces. Such information would not only be interesting in its own right, but could also help improve computer-vision models, allowing them to recognize unfamiliar or stylized objects.
The entire paper can be read online here (PDF). Aside from the mathematical methods section, it isn't very technical and is accessible to any curious layperson. It was presented at the SIGGRAPH conference on computer graphics and interactive techniques.
Devin Coldewey is a contributing writer for NBC News Digital. His personal website is coldewey.cc.