MovieLists Dataset


User content curation is becoming an important source of preference data, as well as providing information regarding the items being curated. One popular approach involves the creation of lists. On Twitter, these lists might contain user accounts relevant to a particular topic, whereas on a community site such as the Internet Movie Database (IMDb), this might take the form of lists of sharing common characteristics. While list curation implicitly involves substantial combined effort on the part of users, researchers have rarely looked at mining the outputs of this kind of crowdsourcing activity. Here we study a large collection of movie lists from IMDb. We apply network analysis methods to a graph that reflects the degree to which pairs of movies are "co-listed", that is, assigned to the same lists. This allows us to uncover a more nuanced categorisation of movies that goes beyond simple metadata, such as genre or era.

software

IIIF drag and drop link