In the annals of customer feedback, one Netflix subscriber might merit a place in the record books. This anonymous individual took the trouble to rate more than 5,000 movies in a single day. That works out to roughly five movies a minute for 16 hours straight.
At most companies, the behavior of one unusual customer might fuel discussions in the hallways. But this user performed for a far larger crowd. Three years ago, Netflix released data for this prolific rater and hundreds of thousands of other customers, scrubbed of any names, and offered $1 million to the first person or group to improve the accuracy of its movie recommendations by 10 percent. Specifically, the winning team had to come closer than Netflix in calculating how many stars each user would give to a particular flick. More than 50,000 people registered for the prize and downloaded the Netflix data. And on Sept. 21, Netflix CEO Reed Hastings handed a $1 million check to the winners, BellKor's Pragmatic Chaos, a seven-member international coalition.
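For readers who want to see what "10 percent closer" means in practice: the contest scored entries by root-mean-square error (RMSE), the typical gap between predicted and actual stars. The sketch below is only an illustration of that arithmetic. The function names and the sample ratings are invented for the example and do not come from Netflix or the contest data.

```python
import math


def rmse(predicted, actual):
    """Root-mean-square error between predicted and actual star ratings."""
    if len(predicted) != len(actual) or not actual:
        raise ValueError("need two equal-length, non-empty sequences")
    squared = [(p - a) ** 2 for p, a in zip(predicted, actual)]
    return math.sqrt(sum(squared) / len(squared))


def improvement_pct(baseline_rmse, candidate_rmse):
    """Percentage by which the candidate's error is lower than the baseline's."""
    return 100.0 * (baseline_rmse - candidate_rmse) / baseline_rmse


# Invented ratings, purely for illustration.
actual = [4, 3, 5, 2, 4]
baseline = [3.5, 3.5, 3.5, 3.5, 3.5]   # hypothetical existing system
candidate = [3.9, 3.2, 4.4, 2.6, 3.8]  # hypothetical challenger

print(improvement_pct(rmse(baseline, actual), rmse(candidate, actual)))
```

A challenger clears the bar when `improvement_pct` reaches 10 or more on ratings it has never seen.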
The winning team, which includes scientists from AT&T Research and Yahoo's Israel lab as well as computer scientists from Austria and Canada, blended more than 700 different statistical models into their formula. They studied every conceivable angle. They looked at the correlation of one movie to the next (Which "Godfather Part II" lovers are most likely to rent "Scarface"?). They studied users' moods. (Once people start panning movies, it turns out, they give lower-than-normal ratings even to movies they like.) And they found that when people rank lots of movies in a single day, from dozens to 5,000, they think differently than they do when rating a movie they just saw. If a user tends to glorify movies as time passes, the algorithm can make allowances, predicting higher ratings as the months go by. "The frequency of ratings is very useful," says Martin Piotte, a Montreal-based researcher on the team.
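The idea of blending can be shown with just two models instead of 700. The sketch below finds the single weight that makes a mix of two sets of predictions land as close as possible to the real ratings, in the least-squares sense. It is a toy illustration, not the team's method, which the article does not describe; the function names are made up for the example.

```python
def best_blend_weight(pred_a, pred_b, actual):
    """Weight w that minimizes the squared error of w*pred_a + (1-w)*pred_b."""
    num = sum((a - b) * (y - b) for a, b, y in zip(pred_a, pred_b, actual))
    den = sum((a - b) ** 2 for a, b in zip(pred_a, pred_b))
    if den == 0:
        # The two models agree everywhere, so every weight gives the same blend.
        return 0.5
    return num / den


def blend(pred_a, pred_b, w):
    """Mix two prediction lists using weight w for the first model."""
    return [w * a + (1 - w) * b for a, b in zip(pred_a, pred_b)]


# Hypothetical predictions from two models that err in opposite directions.
model_a = [4.0, 2.0]
model_b = [2.0, 4.0]
stars = [3.0, 3.0]

w = best_blend_weight(model_a, model_b, stars)
print(w, blend(model_a, model_b, w))
```

In practice the weight would be fitted on one batch of ratings and checked on another, since a blend tuned and tested on the same data looks better than it is.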
Nineteen trillion variables
How much the timing matters, however, depends on the film. Some movies, it turns out, are more popular just after they've been watched. Others appear to rise as time passes. "Memento," for example, the 2000 thriller about a man who suffers from short-term memory loss, gets about 3.4 stars, roughly average, the day people see it. But when they rate it later, it skyrockets to four stars. "That's a big variation," says Piotte. "Patch Adams," the 1998 Robin Williams film about an unconventional doctor, sinks upon further reflection, Piotte says. Each movie develops its own curve, and the time variations can be incorporated into the predictions.
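One simple way to picture a movie's "curve" is to average its ratings separately for each stretch of elapsed time. The sketch below does that with three buckets: same day, within a month, and later. The record layout `(movie, days_elapsed, stars)`, the bucket boundaries and the sample rows are all assumptions made for illustration; the article does not say how the team modeled these curves.

```python
from collections import defaultdict


def rating_curve(ratings, bucket_edges=(1, 30)):
    """Mean stars per (movie, bucket).

    ratings: iterable of (movie, days_elapsed, stars) tuples.
    With the default edges, bucket 0 is the same day, bucket 1 is
    1 to 29 days later, and bucket 2 is 30 days or more.
    """
    sums = defaultdict(lambda: [0.0, 0])  # (movie, bucket) -> [total, count]
    for movie, days, stars in ratings:
        bucket = sum(days >= edge for edge in bucket_edges)
        cell = sums[(movie, bucket)]
        cell[0] += stars
        cell[1] += 1
    return {key: total / count for key, (total, count) in sums.items()}


# Invented rows for a hypothetical film that improves with hindsight.
sample = [
    ("Movie A", 0, 3),
    ("Movie A", 0, 4),
    ("Movie A", 45, 4),
    ("Movie A", 60, 4),
]
print(rating_curve(sample))
```

A predictor could then add the gap between a movie's bucket average and its overall average to whatever a time-blind model would have guessed.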
The challenge facing researchers is to account for scores of different factors, from movies and customers to the time of day, without overwhelming the number-crunching capability of computers. The winning team, according to Piotte, grappled with a possible 19 trillion variables. The team found ways to reduce the complexity. But still, Netflix must weigh the advantages of more precise recommendations against the cost of additional computing expenditures. "There are hundreds of algorithms [in the winning formula]," says Netflix Chief Product Officer Neil Hunt. "We've selected just a small number of them for now, maybe two or three."
The complexity is bound to grow in coming months. While announcing the prize winners, Netflix officials also unveiled plans for a second Netflix Prize. Details, they say, will be revealed in coming days. But it's clear that data miners will be combing through much richer sets of user info. While the first set of data included only the renting and recommending behavior of anonymous customers, the second contest, says Hunt, will include demographic information, such as geography, age and gender, along with details of movies the users have previously rented. The goal, he says, is to be able to size up customers when they first arrive on the site, without waiting for them to establish a data footprint. "We want to predict people earlier in the cycle," says Hunt.
In the second contest, the company will award two $500,000 prizes: one after six months, the other after a year and a half. Four of the seven members of BellKor's Pragmatic Chaos say they plan to continue crunching Netflix data. Judging from the first contest, they'll have plenty of company. One user on the Netflix Prize site, who goes by the name of Newman, laments: "Please, no more! I don't want to spend more time on these contests, but I probably can't resist!"
For the data-mining community, the first Netflix Prize has been a sensation, attracting marketing buzz and brainpower to the prediction of consumer behavior. It's a crucial frontier for online retailers such as Amazon and Wal-Mart as well as online advertisers. "I'm a little surprised more companies aren't launching [these kinds of competitions]," says Darren Vengroff, chief scientist at RichRelevance, a San Francisco startup specializing in customized recommendations. "I bet [Netflix] hurried to announce the second contest to make sure no one else beat them to it. They want to keep their franchise."