Wikipedia: An Effective Anarchy

Dariusz Jemielniak, Ph.D.
Kozminski University
[email protected]

Paper presented at the Society for Applied Anthropology conference in Baltimore, MD (USA), 27-31 March, 2012 (work in progress)

This paper is the first report from a virtual ethnographic study (Hine, 2000; Kozinets, 2010) of the Wikipedia community, conducted in 2006-2012 with participative methods and relying on a narrative analysis of the Wikipedia organization (Czarniawska, 2000; Boje, 2001; Jemielniak & Kostera, 2010). It serves as a general introduction to the Wikipedia community, and is also a basis for a discussion of a book in progress on the topic.

Contrary to a common misconception, Wikipedia was not the first “wiki” in the world. “Wiki” (from the Hawaiian word for “quick” or “fast”, and named after the “Wiki Wiki Shuttle” at Honolulu International Airport) is a website technology based on a philosophy of tracking changes added by the users, with a simplified markup language (allowing easy additions of, e.g., bold, italics, or tables, without the need to learn full HTML syntax). It was originally created and made public in 1995 by Ward Cunningham, as WikiWikiWeb. WikiWikiWeb was an attractive choice among enterprises and was used for communication, collaborative idea development, documentation, intranets, knowledge management, etc. It was growing steadily in popularity when Jimmy “Jimbo” Wales, then the CEO of Bomis Inc., started up his encyclopedic project in 2000: Nupedia.

Nupedia was meant to be an online encyclopedia, with free content, written by experts. In an attempt to meet the standards set by professional encyclopedias, the creators of Nupedia based it on a peer-review process rather than on wiki-type software. The website relied on an assumption that content should be generated by scholars for free, as a form of pro bono service.
Since Nupedia was developing very slowly, Larry Sanger, who was hired by Wales to oversee Nupedia’s development and was its editor-in-chief, picked up an idea pitched by Ben Kovitz1 and proposed using wiki software and philosophy for encyclopedic content creation. This idea resonated perfectly with Wales’ vision of making a publicly editable and accessible encyclopedia, and in January 2001 Wikipedia.com was launched, originally as a “content feeder” for Nupedia. It turned out to be an almost instantaneous success. While the first year brought about 20 thousand articles, in the second almost 100 thousand articles were developed. Meanwhile, either Nupedia had trouble reaching out to the academic community with its message, or the academic community did not care enough about its mission. In September 2003 it was closed down, with a proud total of 24 articles finished and 74 in various earlier stages of development. At that time English Wikipedia already had more than 150 thousand articles.

Illustration by HenkvD, http://en.wikipedia.org/wiki/File:EnwikipediaArt.PNG

1 See more on the milestone event in the accounts of Kovitz and Sanger: http://en.wikipedia.org/wiki/User:BenKovitz#The_conversation_at_the_taco_stand

In the meantime, many other language versions of Wikipedia started to spin off. The first one was German (started as a sub-domain of Wikipedia.com as early as 16 March 2001).
As of December 2011, these are the largest Wikipedias:

Rank  Language    Number of articles  Articles per 1 thousand native speakers
1     English     3.9 million         10.4
2     German      1.3 million         13.2
3     French      1.2 million         10.4
4     Dutch       1 million           43.4
5     Italian     885 thousand        14.3
6     Polish      864 thousand        21.6
7     Spanish     858 thousand        2.15
8     Russian     808 thousand        5.6
9     Japanese    791 thousand        6.5
10    Portuguese  710 thousand        3

English Wikipedia compared to the other Wikipedias among the 10 largest in the world (own work). Based on: http://meta.wikimedia.org/wiki/List_of_largest_wikis

The total number of articles in all Wikipedias (in over 270 languages) exceeds 20 million. As can be observed, English Wikipedia is by far the leader in this ranking. However, positions 2 to 10 are relatively less dispersed (below the top 10 there is a huge gap, since Swedish Wikipedia, currently 11th, has 419 thousand articles). When the number of articles per capita (of native speakers) is considered, among the 10 largest Wikipedias Dutch takes the lead by far, and the Polish one, being second, also stands out significantly. English Wikipedia seems to be average in this ranking (number 5, ex aequo with the French), which is even more surprising given that English is the most popular second language in the world, a practical lingua franca, while neither Dutch nor Polish is popular as a second language at all. Actually, there are approximately 29 thousand English Wikipedia users who declare themselves native speakers2, and 27 thousand who contribute to the project as declared non-natives.
Based on these rough estimates, which indicate that there are about as many non-native editors as native ones, and taking into account that English Wikipedia had a head start over the others (especially in media coverage), the number of articles on English Wikipedia is, in fact, surprisingly low in comparison to other major projects. Also, the page and editor growth rate on the English Wikipedia has been on a slight decline, along with increasing coordination costs (Suh, Convertino, Chi, & Pirolli, 2009). It should be noted, however, that the sheer number of articles is not a good single indicator of an encyclopedia’s quality: many of the articles present on English Wikipedia are much better developed, as well as richer in sources and data, than the same articles on other Wikipedias. Therefore it could be speculated that at some level of development editors decide to deepen and develop existing articles, rather than seed new “stubs” (as short new articles are called on Wikipedia). Also, since the number of articles is the simplest and the most visible measure of encyclopedic size, some projects massage the result by employing bots (software scripts) to create new articles in some categories automatically, on topics such as thousands of villages in China, based on available geographic lists and maps. This expansion strategy has been used, to a varying extent, by most projects, and dates back as far as October 2002, when a bot was used to add 30 thousand stubs on American cities and towns to the English Wikipedia in little over a week. Clearly, non-human contributors have quite a significant influence on the development of Wikipedia (Niederer & van Dijck, 2010; Geiger, 2011; Halfaker & Riedl, 2012). Thus, all the metrics should be taken with a grain of salt3.
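The per-capita ranking discussed above can be checked directly from the table values. The following is a minimal sketch in Python, not part of the original study: the names PER_CAPITA and competition_rank are illustrative, and the figures are simply copied from the December 2011 table.

```python
# Articles per 1 thousand native speakers, copied from the December 2011 table.
PER_CAPITA = {
    "English": 10.4, "German": 13.2, "French": 10.4, "Dutch": 43.4,
    "Italian": 14.3, "Polish": 21.6, "Spanish": 2.15, "Russian": 5.6,
    "Japanese": 6.5, "Portuguese": 3.0,
}


def competition_rank(values):
    """Rank from highest to lowest; equal values share a place (ex aequo)."""
    return {
        name: 1 + sum(other > value for other in values.values())
        for name, value in values.items()
    }


ranks = competition_rank(PER_CAPITA)
for name in sorted(PER_CAPITA, key=lambda n: (ranks[n], n)):
    print(f"{ranks[name]:>2}  {name:<10} {PER_CAPITA[name]}")
```

With these inputs Dutch and Polish take the first two places, while English and French share the fifth, which matches the reading of the table given in the text.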
It should be noted, however, that Wikipedic growth clearly slows down (Suh, et al., 2009), and some studies indicate that it actually follows an S-curve (Lam & Riedl, 2011).

2 As of 28 January 2012, 1751 users declare a basic level of English, 5523 an intermediate level, 11035 an advanced level, 5919 a near-native level (myself included), 4028 a professional level, and 29357 a native level. Categories are taken from http://en.wikipedia.org/wiki/Category:User_en and apply only to Wikipedians who decide to declare their command of languages. However, it is a wide custom to do so among regular editors.

3 The meanders of statistical measures of Wikipedia’s growth get even more complicated when one has to decide what constitutes an article that can be counted. For quite a while only articles with at least one comma were counted in the official statistics. This obviously disfavored non-European languages. Currently, a different measure is applied in MediaWiki software (from version 1.18 and up), but former installations are still in use. See more: http://www.mediawiki.org/wiki/Manual:Article_count

Quite interestingly as well, the criteria of notability of a topic for an encyclopedia vary significantly across the projects. In a study of 25 Wikipedias, only 1% of articles were shared across all of them, and as many as 74% were present in one language only (Hecht & Gergle, 2010). Yet, there is some rational and meritocratic development path across the projects, as the differences between average and featured (considered to be exemplary) articles are significantly greater than between languages (Hammwöhner, 2007), and different language projects, as long as they reach reasonable maturity, develop following similar network patterns (Zlatić, Božičević, Štefančić, & Domazet, 2006). In spite of its tremendous growth and undisputed maturity, the development of the English Wikipedia does not seem to slow down.
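The effect of the comma-based counting rule described in footnote 3 can be illustrated with a short sketch. It is a simplification under stated assumptions: the function name and the two sample sentences are hypothetical, and checking only for the ASCII comma is an assumption about how the old rule behaved, not a copy of the MediaWiki code.

```python
def counts_as_article(text: str) -> bool:
    """Assumed old rule: a page is counted only if it contains an ASCII comma."""
    return "," in text


# Hypothetical one-sentence pages used only for illustration.
samples = {
    "English": "Warsaw is the capital of Poland, a country in Central Europe.",
    # Japanese text normally uses the ideographic comma (U+3001), not the ASCII one.
    "Japanese": "ワルシャワは、ポーランドの首都である。",
}

for language, text in samples.items():
    print(language, counts_as_article(text))
```

Under this assumption a perfectly ordinary Japanese sentence is not counted, which is one way to see why the rule disfavored languages that do not use the European comma.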
One interesting measure of Wikipedic stability is the number of weeks between each 10 million edits (this measure is quite useful, since it is independent of the number of contributors). While in the beginning (in 2005) it took 211 weeks for the number of edits to increase from 10 million to 20 million, and then an additional 17 weeks to increase to 30 million, the pace soon stabilized at about 7 weeks and has stayed so ever since (in 2007 it sped up to a little below 6 weeks, and from 2011 onwards it has been 8 weeks, but nevertheless it is amazingly consistent).

Number of weeks between every 10-million edit, by year (2005-2012), based on http://en.wikipedia.org/wiki/User:Katalaveno/TBE (data retrieved on 2 September 2012)

Currently, the total number of edits on the English Wikipedia exceeds 500 million4. Other Wikimedia5 projects flourish as well. Quoting from Wikipedia: “As of now, Wikipedia is the world's sixth-most-popular website according to Alexa Internet, with a combined total of over 22 million articles across all 284 active language editions” (see [[WP:History of Wikipedia]]). The total number of edits on all Wikimedia projects as of February 2012 exceeds 1.5 billion6.

The Wikimedia community7 has evolved over the years, and some changes are not as optimistic as the viewing statistics. For instance, according to the Editor Trends Study of the Wikimedia Foundation8 (based on the five largest Wikipedias), until 2005 nearly 40% of editors stayed active a year after they started editing. However, among editors who joined after 2007, only up to 15% stayed active a year after they started editing. Clearly, their integration with the Wikimedia community has become harder. This may be related to increased decentralized and informal gatekeeping, that is, small-scale interactions between community members aimed at discouraging the newcomers (Shaw, 2012).
The community undertakes many efforts to change this unfavorable trend (see e.g. [[WP:Teahouse]]).

4 As checked on February 1, 2012 on http://en.wikichecker.com/

5 Wikipedias are the most popular projects within the Wikimedia Foundation, but there are many others, such as Wiktionaries, Wikinews, Wikiversities, Wikibooks, Wikiquotes, and others (all organized on similar community principles and run from Wikimedia Foundation servers). It should perhaps be noted that Wikileaks.org has no relation to Wikimedia projects, and since 2010 does not even use MediaWiki software. Whenever I write of Wikipedia, I refer to the English and the Polish Wikipedic projects, while when I mention Wikimedia, I refer to all communities run under the Wikimedia Foundation umbrella.

6 As checked on February 1, 2012 on http://toolserver.org/~emijrp/wikimediacounter/

7 Although some researchers (Buss & Strauss, 2009) do not consider wikis to form online communities, because of the shift of focus from social networking to knowledge creation, it is clear that because of the interactions, socializing, group identity, perception of belonging, etc., Wikimedia projects, especially for registered users, satisfy the criteria as well as any other community in the open source and open collaboration movement (M. Castells, 2001; Von Krogh, Spaeth, & Lakhani, 2003; S. Weber, 2004).

8 Retrieved from http://strategy.wikimedia.org/wiki/Editor_Trends_Study on February 1, 2012.

9 Retrieved from http://upload.wikimedia.org/wikipedia/commons/7/76/Editor_Survey_Report_-_April_2011.pdf on February 1, 2012.

According to the Editor Survey Report9 from April 2011, as many as 91% of Wikipedia editors are male. These results may not be fully accurate, since they rely on a voluntary online survey advertised to 31699 registered users and resulting in 5073 complete and valid responses, and self-selection may have been an important factor (male editors may have responded more or less frequently than female editors). Similarly, a study of self-declarations of gender, showing only 16% of female editors (Lam et al., 2011), may be distorted, since more females may choose not to reveal their gender in a community perceived as male-dominated. Yet, the results of different studies consistently show that the number of female editors is surprisingly small, especially given that the gender gap is much smaller among non-editing readers (Glott, Schmidt, & Ghosh, 2010; Bywater, 2011). This is a very extreme inequality, which is difficult to explain just by the geeky past of the movement, or just through an unsurprising reproduction of social and economic inequalities (Morell, 2010)10. Some studies suggest that it may be a result of the high level of conflict on Wikipedia, or of a generally critical cooperative environment (Collier & Bear, 2012). Clearly, gendered perceptions of editor roles and careers could be at play, too (Bourne & Özbilgin, 2008), even though the phenomenon is far from explored and requires more Wikipedia-focused studies from different angles, as well as studies of collaboration technologies and gender more generally (Zhang & Kramarae, 2008). Contrary to the stereotype, Wikipedians are not necessarily all tech-savvy and geeky, and many of them perceive their work as creative or even artistic rather than technical and administrative (compare: Jemielniak, 2008). Possibly as a side-effect, Wikipedia biographies of women are more likely to be missing than articles on men, when compared to Britannica, even though Wikipedia provides better coverage and more comprehensive articles (Reagle & Rhue, 2011). Also, articles about gender inequalities and about feminist topics may be more likely to be deleted (Carstensen, 2009)11.
It should be noted, however, that there is also a huge variation in recognizing people’s notability across different language Wikipedia projects (Callahan & Herring, 2011).

10 It may be worth noting that besides significant efforts from the Wikimedia Foundation to close the gender gap, there are also other initiatives, such as http://wikichix.org, which was created in 2006 to allow female editors of Wikipedia to discuss how they perceive the gender bias, how to minimize it, etc. To read more on systemic bias on Wikipedia, visit [[WP:BIAS]].

11 For an account of an “edit war” in a topic related to gender studies, see an appendix.

Another interesting finding from the Editor Survey Report is that as many as 8% of Wikipedia editors have a Ph.D. degree, 18% a Master’s, and 35% a Bachelor’s level degree, so the stereotypical image of a Wikipedian being most commonly a high school student is clearly false. Only 36% of Wikipedia editors are able to write computer programs by themselves, so the geeky stereotype is also partly false. Quite interestingly also, Wikipedia editors are not very young on average (32 years old), and only 27% of editors are 21 or younger. It should be noted, however, that these results differ significantly from another major study, where 49% of respondents reported this age (Glott, et al., 2010). In general, most quantitative studies of Wikipedia have to bear some assumptions and are open to one methodological bias or another, since the bias-free recruitment of subjects is a challenge.

Motivations for participation among Wikipedians vary a lot, and their interpretation is also highly dependent on the paradigm and discipline of the researcher approaching the topic. Some studies indicate that a dominant motivator is the possibility of gaining recognition in a community (Forte & Bruckman, 2005), particularly attractive when paired with a relatively low transactional cost of entry and participation (Ciffolilli, 2003).
Others involve self-fulfillment, mere fun of the activity (quite typical in knowledge work: Hunter, Jemielniak, & Postuła, 2010), a drive to acquire and share knowledge, even if just to boost one’s ego (Rafaeli & Ariel, 2008), sustaining a positive self-image, or a passion to contribute to the common good and altruism (Ciffolilli, 2003; Baytiyeh & Pfaffman, 2010; Yang & Lai, 2010), and a sense of accomplishment (Kuznetsov, 2006), which are not uncommon also for other online knowledge sharing activities (Lee & Jang, 2010). As research on other virtual communities shows, many users decide to participate also because of the attractive possibilities of building social status in the community (Lampel & Bhalla, 2007), or simply because of a developed feeling of belonging (Lampe, Wash, Velasquez, & Ozkaya, 2010). Other motivations may also be ideological (e.g. a strong belief that “information should be free”) and principle-driven (Nov, 2007; Coleman, 2013).

Incidentally, die-hard Wikipedians differ in their participation from the very beginning, and display a different editing pattern even as beginners (Panciera, Halfaker, & Terveen, 2009). This could indicate that there is a significant selection bias into the Wikipedia community. As other research shows, Wikipedians are sometimes perceived as a different kind of people by the newcomers, which definitely affects the retention of new editors (Antin, 2011). They also clearly take the role of gatekeepers towards the newcomers, especially in the case of breaking news articles (Keegan & Gergle, 2010). It is worth noting, however, that user retention, contrary to popular belief, does not have only positive effects. In fact, as other studies show, some moderate levels of membership turnover on Wikipedia actually positively affect the outcome of collaborative article production (Ransbotham & Kane, 2011).

Basic Rules and Norms

The English Wikipedia, and to a smaller extent also the Polish Wikipedia, have a huge number of rules, norms, policies, and guidelines. Somewhat atypically compared with many other virtual communities, which rely on governance in the hands of a few leaders (Lessig, 1999; Butler, Sproull, Kiesler, & Kraut, 2007), all of them are established democratically by the community itself. The issue of formalization itself is interesting and is discussed later in this book, and one of the appendices includes a verbatim quotation of the “five pillars” of Wikipedia and the five corresponding policies, cited from the English Wikipedia, to give the reader a feel of the actual content and phrasing of the norms. Here I am going to briefly describe only the chief policies, rules and norms guiding everyday editing on most Wikimedia projects, which should be familiar to most Wikipedians, even if they are just beginners.

Possibly the most distinctive norm that makes Wikipedia different from other internet fora is a hard-and-fast rule: no personal attacks (often abbreviated to NPA12). Editors are not allowed to verbally attack others. Even being a target of a personal attack cannot be an excuse to fight back: the proper code of conduct requires that the attacked editor tries to reason with the attacker and ignores the abuse. Retaliation most likely leads to both parties involved being blocked. If an intervention is indeed necessary, an administrator can be asked for help (or act on his or her own, when spotting aggressive behavior). Editors are requested to comment only on the content, and never on the contributors.

A similarly strong, related norm is that of civility (abbrev. CIV). It requires editors to treat each other with respect and consideration, and without personal or snide comments. In a quite religious way, Wikipedians are generally advised to forgive and forget (abbrev. FORG) when they are insulted or wronged.
The actual practice of enforcing the civility and no personal attacks rules varies a lot, depending on the project, the people involved, and the situation. Sometimes one snappy comment may be a reason for blocking a misbehaving user; sometimes even a relatively major offense goes without a reaction. For instance, once I noticed13 that a Wikimedia Foundation employee, who was also an administrator on the English Wikipedia, called his disputant a troll. The thing was that although this disputant had a long history of criticizing Wikipedia, including in the mainstream media, the disputant was actually making a point in the discussion, and in my judgment was not trolling at all. I suggested retracting the comment, but I was refused. Nobody else reacted. Clearly, the norm was bent because of the status of both people involved. Yet, even though power imbalance sometimes does lead to such outcomes, it is not typical. I myself have blocked other admins when they did not mince their words, and witnessed others doing so often enough to believe that status and position in the community more often than not do not shield from policy enforcement. In general, all users, irrespective of their experience and standing in the community, are expected to adhere to the rules of etiquette (abbrev. EQ). In any case it should be noted that blocking users is meant to be a preventive rather than a punitive measure: it is not supposed to be used as a direct punishment for breaking the norms, but rather to lead to interactions in which norms are not broken in the future.

12 All abbreviated rules correspond with their internet addresses. This rule, for example, can be found under [[WP:NPA]], which, as already explained in the introduction, is a typical transcription of the address: http://en.wikipedia.org/wiki/Wikipedia:NPA

A word which is treated with enormous esteem on Wikipedia is consensus (abbrev. CON). It describes a fundamental model of decision-making used on Wikipedia.
The norm of consensus requires all editors to seek a solution acceptable for the community. It does not mean unanimity in all cases, but it emphasizes a need to strive for hypothetical unity whenever possible. It also does not encourage polling and voting. In fact, voting is considered to be anti-consensual, as it does not allow all views in a discussion to be fully acknowledged. One of the norms explicitly says that Wikipedia is not a democracy (abbrev. DEM), another reiterates that Polling is not a substitute for a discussion (abbrev. POLL), and a page [[WP:Straw polls]] explains in detail14:

Having the option of settling a dispute by taking a poll, instead of the careful consideration, dissection and eventual synthesis of each side's arguments, actually undermines the progress in dispute resolution that Wiki has allowed. This is a strength, not a failing, and is one of the most important things that make Wiki special, and while taking a poll is very often a lot easier than helping each other find a mutually agreeable position, it's almost never better.

13 I purposefully refrain from linking to the discussion, as the described person is editing under his own name, and is still a WMF employee.

14 Retrieved from [[WP:Straw_polls]] on 10 September 2012
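The [[WP:...]] notation used for policy pages throughout this paper (see footnote 12) maps mechanically onto a web address. The following is a minimal sketch of that mapping, assuming the English Wikipedia base address shown in that footnote. The function name is hypothetical, and replacing spaces with underscores is an assumption based on the [[WP:Straw polls]] and [[WP:Straw_polls]] spellings used above.

```python
BASE = "http://en.wikipedia.org/wiki/"  # base address taken from footnote 12


def shortcut_to_url(link: str) -> str:
    """Turn a shortcut such as '[[WP:NPA]]' into its web address."""
    inner = link.strip()
    if not (inner.startswith("[[WP:") and inner.endswith("]]")):
        raise ValueError(f"not a [[WP:...]] shortcut: {link!r}")
    # Drop the '[[WP:' prefix and ']]' suffix; spaces become underscores (assumed).
    page = inner[len("[[WP:"):-2].strip().replace(" ", "_")
    return f"{BASE}Wikipedia:{page}"


print(shortcut_to_url("[[WP:NPA]]"))
```

For [[WP:NPA]] this prints the same address that footnote 12 gives for the rule.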