CHI Wikipedia research

Just filling in this earlier post, part 1. We're pretty excited about our first set of Wikipedia research coming out at CHI 2007. See the related post on the Augmented Social Cognition blog.

In essence, we presented a paper with a model and findings about conflict and coordination costs in Wikipedia at the global, article, and user levels. My contribution to this work was mostly in getting the analysis to work: we spared no expense analyzing a complete dump of the English Wikipedia (though the paper is based on a dump from July 2006). Thanks to the excellent Hadoop MapReduce system for making that possible. Hadoop was considerably less mature when we started the project, but it is rapidly turning into a very powerful tool for large-scale data processing. Nothing quite like computing the revision-to-revision difference of every edit publicly available for Wikipedia in 4 hours or so, eh?
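For the curious, here is roughly what a revision-to-revision diff looks like when phrased as a map step and a reduce step. This is only a minimal sketch in plain Python, not our actual Hadoop job: the record fields (`page_id`, `timestamp`, `text`), the function names, and the word-level diff via `difflib` are all assumptions made for illustration, and `run_local` just imitates the shuffle in memory.

```python
import difflib
from collections import defaultdict


def map_revision(rev):
    # Map step: key each revision by its page so all revisions of one
    # article end up in the same reduce call. Field names are hypothetical.
    yield rev["page_id"], (rev["timestamp"], rev["text"])


def reduce_page(page_id, values):
    # Reduce step: order one page's revisions by timestamp and diff each
    # revision against the one before it (the first against empty text).
    prev = ""
    for timestamp, text in sorted(values):
        old_words, new_words = prev.split(), text.split()
        matcher = difflib.SequenceMatcher(None, old_words, new_words, autojunk=False)
        added = removed = 0
        for tag, i1, i2, j1, j2 in matcher.get_opcodes():
            if tag in ("replace", "delete"):
                removed += i2 - i1
            if tag in ("replace", "insert"):
                added += j2 - j1
        yield page_id, timestamp, added, removed
        prev = text


def run_local(revisions):
    # Stand-in for the framework's shuffle: group mapped values by key,
    # then hand each group to the reducer.
    groups = defaultdict(list)
    for rev in revisions:
        for key, value in map_revision(rev):
            groups[key].append(value)
    results = []
    for key in sorted(groups):
        results.extend(reduce_page(key, groups[key]))
    return results
```

Calling `run_local` on a list of revision dicts returns one `(page_id, timestamp, words_added, words_removed)` tuple per edit. Because each page is diffed independently of every other page, this kind of job spreads nicely across a cluster.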

Keep your browsers tuned to the ASC blog for future results in this space.

