Friday, October 19, 2007

The madness of Barbarella


Friday night, time for collaborative filtering! Game theorist and Ajax programmer Rob Brown, a guy with perhaps the worst web site and yet most brilliant essay on the madness of crowds we've ever seen, notes it's been a full year since Netflix offered 1 million bucks to anyone smart enough to improve its movie recommendations. The target is a 10% improvement in the accuracy of its rating predictions; 12 months later, the world's greatest minds are only at 8.5%.

Brown tried to solve the problem by plotting each film and each viewer as a point in space, tied to a suburban metaphor:
The idea is that I needed to put each movie, and each user, into a "neighborhood," which roughly equates to "genre." There is a science fiction neighborhood, a comedy neighborhood, a horror neighborhood, and so on. But the neighborhoods have blurry boundaries, just as real neighborhoods typically do. "Alien" would be somewhere between the science fiction and horror neighborhoods, while "The Hitchhiker's Guide to the Galaxy" would be somewhere between the science fiction and comedy neighborhoods. Each user would live in a neighborhood, closest to the type of movies they prefer, and furthest from those they dislike.
Nice mathematical construct, but we see one glaring problem. Consumers are not single points on a map. They are schizophrenics with different mind modalities, leaping from mental neighborhood to neighborhood. Even us. Tonight, we're in drama mode. Tomorrow, we may long for Barbarella. And Sunday, we're hosting a kid party and want a Disney fix to keep the tots at bay. How does any software sort through the madness of crowds in our own heads?
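
For the code-minded, here is a minimal sketch of the neighborhood idea quoted above. It is not Brown's actual algorithm: the two-dimensional coordinates, the `predict_stars` helper, and the rule that turns distance into stars are all assumptions we made up for illustration.

```python
import math

# Hypothetical map: genre neighborhoods are regions around these centers.
GENRE_CENTERS = {
    "science fiction": (0.0, 0.0),
    "horror": (2.0, 0.0),
    "comedy": (0.0, 2.0),
}

# Hypothetical movie locations, placed between the genres they blend.
MOVIES = {
    "Alien": (1.0, 0.0),  # between science fiction and horror
    "The Hitchhiker's Guide to the Galaxy": (0.0, 1.0),  # sci-fi and comedy
}


def predict_stars(user_home, movie_location, scale=2.0):
    """Guess a 1-5 star rating: the closer the movie, the higher the score.

    The linear falloff and the `scale` value are illustrative assumptions.
    """
    distance = math.dist(user_home, movie_location)
    return max(1.0, min(5.0, 5.0 - scale * distance))


# A viewer who lives on the horror side of the science fiction border.
horror_fan = (1.5, 0.0)

for title, location in MOVIES.items():
    print(title, round(predict_stars(horror_fan, location), 2))
```

Note that the viewer gets exactly one home address, which is the very thing we are complaining about.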

This problem is obvious to anyone who's signed up for Netflix. The video rental site gives you a little test, to help set up the personalization, and asks you to rate different movie categories from 1 to 5 stars. When we got to "children's films," we paused. High School Musical was personally horrible. But our kids were glued to it, and we had a nice family evening ... so we both love it and hate it. How many stars is that? 3?

The honest fix for sites like Netflix and Amazon would be a "customer modality" dial. It would look like a big radio knob, and you could crank it to your mode of the moment. Father. Lover. Family guy. Documentary-adventurer. Go ahead, collaborative filter us all you want. Sometimes we just need a big red button for Barbarella.
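
What might the dial look like under the hood? A rough sketch, assuming each customer keeps one taste point per mode instead of a single home address. The mode coordinates, the movie coordinates, and the `recommend` helper are all hypothetical; neither Netflix nor Amazon works this way as far as we know.

```python
import math

# Hypothetical movie locations on a made-up two-dimensional taste map.
MOVIES = {
    "Barbarella": (0.0, 0.0),
    "High School Musical": (4.0, 0.0),
}

# One taste point per mode for the same customer (assumed values).
MODES = {
    "father": (3.5, 0.5),
    "lover": (0.5, 0.5),
}


def recommend(mode, movies=MOVIES, modes=MODES):
    """Rank movie titles by distance from the taste point of the chosen mode."""
    home = modes[mode]
    return sorted(movies, key=lambda title: math.dist(home, movies[title]))


# Crank the dial and the same customer gets a different top pick.
print(recommend("father")[0])
print(recommend("lover")[0])
```

The filtering math stays the same; the dial only chooses which version of you it filters for.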

3 comments:

rob said...

Hi Ben, Rob here. Just took a look at my access logs, and saw your site scroll by, so decided to stop in.

Sorry about my lame top level page on my site. :)

I'll admit that initially, I thought that people (and possibly movies as well) might need to be in multiple locations.

But I would argue that my algorithm makes it unnecessary. Imagine I like zombie movies, westerns, science fiction, and kung fu. Now let's say I live right at the corner where these 4 neighborhoods meet. I am pretty near all of them, so it does a good job in predicting I like movies of those genres. Now if a movie comes along that is all 4 of them, such as "Serenity" was, it will be under me, so there is a good chance I'll love it.

Now in a two-dimensional world, it's unlikely all of these neighborhoods touch each other, but possible, just as 4 states touch each other at one point in the southwest. In 3 dimensions, it's just as possible for 8 to similarly touch. Make it a 12-dimensional world, and, well, problem solved. Just about every neighborhood can share a border with every other one (of course, these borders are fuzzy, but you get the idea). It's just a matter of increasing the dimensions till you get diminishing returns.

Ben Kunz said...

Rob, thank you for the clarification. (And your article is brilliant.) I think I understand that in a multi-dimensional model, one user can touch different genre neighborhoods -- so the "relevance" of the interest would be mapped between user A and multiple interests B, C, D etc.

I differ though in that what is presented in personalization may be problematic -- because users may be in different modes of interest, and thus want to see different aspects of personalization at any given time. If I am planning a family movie night, I would love to see a series of movies related to kids... but not on husband-wife date night or guys-drinking-beer night.

I love your ideas for mapping the relevance. But selecting the presentation of that in the personalization interface seems the hardest part of the problem to solve. Or ... there are really two problems, mapping the interest, and then selecting the aspect to personalize at the moment.

rob said...

Yeah, you make a very good point, that from a user interface point of view, they may want to say "right now, concentrate on this side of my tastes".

I guess my argument was simply that, for the purpose of the contest (which of course didn't allow modifying the user interface), users don't really need to have homes in multiple neighborhoods...with enough dimensions there is a special little side street for someone who loves both Reservoir Dogs and My Little Pony.

Anyway thanks for the kind words on the article.