William Beutler on Wikipedia

Archive for the ‘Wikipedia in the news’ Category

The Wikimedia Foundation is Losing its Chief. What Happens Next?

on March 28, 2013 at 9:35 am

Big news in the world of Wikipedia yesterday: Sue Gardner, the executive director of the Wikimedia Foundation (the non-profit behind Wikipedia and other wiki-based projects) announced she will be stepping down from the role, which she has held since June 2007. Gardner, in a post on the Wikimedia blog:

I feel that although [Wikipedia is] in good shape, with a promising future, the same is not true for the internet itself. (This is thing number two.) Increasingly, I’m finding myself uncomfortable about how the internet’s developing, who’s influencing its development, and who is not. Last year we at Wikimedia raised an alarm about SOPA/PIPA, and now CISPA is back. Wikipedia has experienced censorship at the hands of industry groups and governments, and we are –increasingly, I think– seeing important decisions made by unaccountable, non-transparent corporate players, a shift from the open web to mobile walled gardens, and a shift from the production-based internet to one that’s consumption-based. There are many organizations and individuals advocating for the public interest online – what’s good for ordinary people – but other interests are more numerous and powerful than they are. I want that to change. And that’s what I want to do next.

In January 2012, you may remember that Wikipedia went into “blackout” mode for 24 hours in protest of legislation before the U.S. Congress (SOPA/PIPA), so this explains that much. The rest of the statement is a little harder to puzzle out; the “non-transparent corporate players” in those circumstances were opposed by other corporate players, and both were fighting over government regulations. The line about “mobile walled gardens” sounds like Facebook, and a “consumption-based” Internet sounds like a jab at tablets, of all things, but I suppose we’ll have to see. These are obviously broad statements, and Gardner hasn’t actually announced her next move.

The move won’t be happening too soon, though: Gardner will be in the position for (at least) another six months, while she works with the Wikimedia Foundation’s Board of Trustees to find a successor, she writes in the post.

Whether Wikipedia is really “in good shape” is a matter for debate, especially considering Gardner had made a personal cause of trying to fix Wikipedia’s absurd gender imbalance, not to mention the overall downward drift in editor retention and activity.

She also leaves with some organizational questions unresolved: just last October, the board approved her plan to shift and “narrow” the non-profit organization’s focus to primarily software development; whereas the Foundation once had “fellows” focused on community-building, it has shifted to a grant-making process, which is still on its first go-round.

Speaking of development, the great white whale continues to be what’s called the VisualEditor, an editing interface intended to be much easier for users than the current system, which is fairly similar to coding HTML. (It’s not as difficult as real programming, but still too much effort for most.) It’s been nearly two years in the making, and has finally rolled out into testing just this year.

Speaking of whales, Sue was the first leader to follow the much better-known Jimmy Wales, who still sits on the Board of Trustees*. Gardner came from the CBC in Canada, and was not an original part of “the movement,” but she came to identify with it and became quite popular with the overall Wikimedia community. It’s not at all clear who should or will succeed her, but it is clear that a lot rides on the decision.

Photo licensed under Creative Commons by Ariel Kanterewicz, via Wikimedia Commons.

*This post originally stated that Wales rotates off the Board later this year; it’s since been pointed out to me that, while all members’ terms are limited, reappointments are allowed, and the Board is expected to reappoint Wales again next time.

Get Your Freakonomics On

on February 26, 2013 at 9:19 am

Wikipedia seems like an ideal topic for Freakonomics, the podcast based on the popular book(s) of the same name by Steven Levitt and Stephen J. Dubner. But as long as I’ve been listening, this week’s episode—“Women Are Not Men”—is the first I can recall that includes Wikipedia as a focus. Given the title, you may have guessed the subject: Wikipedia’s gender gap (previously discussed on The Wikipedian).

The segment includes a nice bit on how editing of Wikipedia works, and it includes a brief interview with veteran Wikipedian Sarah Stierch, former Wikipedian-in-Residence at the Smithsonian and creator of the Wikipedia Teahouse, a project designed to help new editors. And she knows from the trials of being a new editor, as she freely admits:

My first article was deleted. I can proudly say that. I wrote about a guy in a band that I knew—that’s no longer on Wikipedia.

I’d be surprised if there are any longtime Wikipedia editors who have not had early articles deleted. Anyway, it’s a worthy segment, and I’m fairly sympathetic to its hypothesis about the gender gap at that. The Wikipedia segment begins at 4:50.

The Other Senkaku Islands Dispute

on February 5, 2013 at 2:52 pm

My friend and colleague Pete Hunt writes in Foreign Policy today about the dispute on Wikipedia over the Senkaku Islands, and how it parallels the one in the real world. An excerpt:

Regular editing dust-ups might suggest that the Senkaku Islands article and its “dispute” offshoot are dubious resources of little value. In fact, both articles nicely summarize the controversy and provide a long list of citations and references that can advance further research. While news accounts of the islands focus on recent diplomatic incidents and their international implications, these Wikipedia articles provide historical context and a more detailed explanation of the arguments underlying each side’s claims to the territory. The vitriol exchanged by editors might be ugly, but it’s also evidence of a transparent and ongoing screening process.

Actually, now that I think about it, the Wikipedia dispute may be going better than the one in real life.

First Wikipedian (Officially Representing a Presidential Library)

on January 24, 2013 at 7:03 pm

Via the NYT Arts Beat blog:

Gerald R. Ford may have governed during a time of economic stagnation, but his library has just laid claim to a cutting-edge distinction: becoming the first presidential depository to employ an official “Wikipedian in residence.”

Michael Barera, a master’s student at the University of Michigan’s School of Information who has been editing Wikipedia articles for five years, started the job last week, The Chronicle of Higher Education reported. He is charged with improving the Wikipedia presence of the Gerald R. Ford Presidential Library and Museum, which is housed at the university’s Ann Arbor campus.

He’s the first official representative to Wikipedia at a presidential library, and surely not the last. Since Liam Wyatt became the first Wikipedian-in-Residence (WiR) at the British Museum, in spring 2010, the concept of an in-house Wikipedian has spread far and wide. So far, these have all been at non-profits, but I won’t be surprised if that isn’t always the case.

(Hat tip: cultural-partners email list.)

Remembering Aaron Swartz

on January 14, 2013 at 7:36 pm

In certain corners of the Internet, it’s nearly impossible at the moment to avoid discussion of the death on Friday of Aaron Swartz, the “American computer programmer, writer, archivist, political organizer, and Internet activist”—to quote the current iteration of his rapidly-expanding Wikipedia article. Really, make that many corners of the Internet: from technology blogs to online magazines to mainstream newspapers, Swartz’s apparent suicide has been felt widely. And there’s good reason: Swartz’s career would be incredible even if he had not accomplished it all by the age of 26. But there is one reason why I’m writing about him now, in this space, and that’s because he was a Wikipedian.

Aaron Swartz (User:AaronSw) was not just any Wikipedian. He was one of the longest-running contributors, first joining Wikipedia in August 2003 and making his last edit just the day before he died. Using a tool for the analysis of Wikipedia user accounts, I found the complete list of articles he created: a total of 199, including some fairly important ones. Among them: Civil liberties in the United States, United States Court of Appeals for the Ninth Circuit and Arrested Development (TV series). He’s also the creator of dozens of articles about political and policy figures, writers, lawyers and government officials. Like most Wikipedia editors who are content creators, his Wikipedia interests matched his real-life ones. (He even edited his own biography at least once, although unlike most he left an exceedingly polite and deferential note about it.)

Speaking of content creators, in late 2006—around the time that I first began editing Wikipedia—Swartz published a widely-read and influential essay series, arguably titled “Wikimedia at the Crossroads”, after the first installment. However, it is best-known for its second, “Who Edits Wikipedia?”, in which Swartz analyzed the number of characters added by different editors, using code of his own writing, looking to answer his essay’s titular question. One of his most startling findings was that the contributors with the most edits across all of Wikipedia in fact added the least content to the analyzed page (Alan Alda, amusingly enough) while editors with fewer edits added more content:

Edit by edit, I watched the page evolve. The changes I saw largely fell into three groups. A tiny handful — probably around 5 out of nearly 400 — were “vandalism”: confused or malicious people adding things that simply didn’t fit, followed by someone undoing their change. The vast majority, by far, were small changes: people fixing typos, formatting, links, categories, and so on, making the article a little nicer but not adding much in the way of substance. Finally, a much smaller amount were genuine additions: a couple sentences or even paragraphs of new information added to the page.

…Almost every time I saw a substantive edit, I found the user who had contributed it was not an active user of the site. They generally had made less than 50 edits (typically around 10), usually on related pages. Most never even bothered to create an account.
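For the curious, the kind of tally Swartz describes can be approximated in a few lines of Python. This is only a rough sketch and not his actual code: the function names, the made-up three-editor history, and the use of Python's standard difflib module to decide which characters count as "added" are all my own assumptions for illustration.

```python
from collections import Counter
from difflib import SequenceMatcher


def chars_added(old: str, new: str) -> int:
    """Count characters in `new` that were inserted or replaced relative to `old`."""
    matcher = SequenceMatcher(None, old, new, autojunk=False)
    added = 0
    for tag, _i1, _i2, j1, j2 in matcher.get_opcodes():
        # "delete" and "equal" spans contribute no new text.
        if tag in ("insert", "replace"):
            added += j2 - j1
    return added


def contribution_by_editor(revisions):
    """Tally edit counts and characters added per editor.

    `revisions` is a chronological list of (editor, full_page_text) pairs,
    a hypothetical stand-in for a real page history.
    """
    edits = Counter()
    added = Counter()
    previous = ""
    for editor, text in revisions:
        edits[editor] += 1
        added[editor] += chars_added(previous, text)
        previous = text
    return edits, added


# Invented history: two drive-by writers and one busy tidier.
history = [
    ("anon_1", "Alan Alda is an american actor."),
    ("regular", "Alan Alda is an American actor."),
    ("regular", "[[Alan Alda]] is an American actor."),
    ("anon_2", "[[Alan Alda]] is an American actor. He starred in M*A*S*H."),
]

edits, added = contribution_by_editor(history)
for editor in edits:
    print(editor, "edits:", edits[editor], "chars added:", added[editor])
```

Ranking editors by `edits` and by `added` gives two different leaderboards, which is the whole point of the essay: in this toy history the "regular" has the most edits but contributes the fewest new characters.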

Thus was born the observation that Wikipedia’s editorial community includes both highly active, long-serving facilitators and itinerant, subject matter-expert writers, and their interplay is crucial to Wikipedia’s continued development and its future. When we talk about the lack of new editors (or trouble retaining current editors) on Wikipedia, we’re still talking about this very subject, or at least we should be. The fact that Aaron Swartz was 19 or 20 at the time he wrote this nearly boggles the mind. That we’ll never know what he might have contributed under different circumstances boggles too.

As a brief aside, Swartz’s last sustained edits to Wikipedia, in November, were to Wikipedia’s bibliography of David Foster Wallace, a favorite author of Swartz’s and also mine. Swartz once even wrote a brilliant essay attempting to explain what happens after the end of Wallace’s 1,000-page novel Infinite Jest, from which nearly everyone who reads it comes away persuaded and envious (and yes, I mean myself). Like Wallace, Swartz suffered from depression and wrote about it (more openly than DFW ever did) but couldn’t write his way out of it, and it eventually overtook him.

Aaron Swartz’s untimely passing is devastating for those who knew and loved him, and disconcerting for those who knew him only through his public career. You can read remembrances by many of them, including Wikimedia deputy director Erik Moeller (once the winner of a Wikimedia Foundation board election Swartz contested), Wikimedia board member Samuel Klein, and dozens of Wikipedia regulars commenting on the Talk page of Swartz’s Wikipedia account. And anyone who likes can add the following box to their own:

Aaron Swartz Wikipedia memorial

Many more remembrances can be found online, with comments from friends and acquaintances beyond Wikipedia, among them Cory Doctorow, Lawrence Lessig, John Gruber, Matthew Yglesias and Matt Stoller, as well as from his family, and a page for anyone who wants to contribute something. Sure, it’s not quite “anyone can edit” like the online encyclopedia he cared deeply about and strived to make better, but it will have to do. And Wikipedia will, too.

Related: Death of a Wikipedian; March 23, 2012

The Top 10 Wikipedia Stories of 2012 (Part 2)

on December 31, 2012 at 9:02 am

For the past two years The Wikipedian has compiled a list of the top 10 news stories about Wikipedia (2010, 2011), focusing on topics that made mainstream news coverage and those which affected Wikipedia and the larger Wikimedia community more than any other. Part 1 ran on Friday; here’s the dramatic conclusion:

♦     ♦     ♦

5. The Gibraltarpedia controversy — Like the tenth item in our list, file this one under prominent members of the UK Wikimedia chapter behaving badly. In September, board member Roger Bamkin resigned following complaints that he had used Wikipedia resources for personal gain—at just about the worst possible time.

Bamkin was the creator of an actually pretty interesting project, Gibraltarpedia, an effort to integrate the semi-autonomous territory of Gibraltar with Wikipedia as closely as possible, writing every possible Wikipedia article about the territory, and posting QR codes around the peninsula connecting visitors to those articles. It was closely modeled on a similar project, with which Bamkin was also involved, called Monmouthpedia, which had won acclaim for doing the same for the Welsh town of Monmouth.

Problem is, the government of Gibraltar was a client of Bamkin’s, and Bamkin arranged for many of these improved articles to appear on the front page of Wikipedia (through a feature of Wikipedia called “Did you know”). Too many of them, enough that restrictions were imposed on his ability to nominate new ones. At a time when the community was already debating the propriety of consultant relationships involving Wikipedia (more about this below) Bamkin’s oversight offended many within the community, and was even the subject of external news coverage (now of course the subject of a “Controversy” section on Gibraltarpedia’s own Wikipedia page).

(Note: A previous version of this section erroneously implied that Bamkin was not involved with Monmouthpedia, and was then board chair as opposed to trustee. Likewise, it suggested that disclosure was the primary concern regarding DYK; however, the controversy focused on issues of volume and process. These errors have been corrected.)

4. Wikipedia’s gender imbalance — This one is down one spot from last year, but the undeniable fact that Wikipedia is overwhelmingly male (like 6-1 overwhelmingly) seems to have replaced Wikipedia’s falling editor retention as the primary focus of concerns about the long-term viability of Wikipedia’s mission. The topic was given center stage during the opening plenary at the annual Wikimedia conference, Wikimania DC, and has been the subject of continuing news coverage and even the focus of interesting-if-hard-to-decipher infographics. Like Wikipedia’s difficulty keeping and attracting new editors, the Wikimedia Foundation is working on addressing this as well, and no one knows precisely how much it matters or what to do about it. For further reading: over the last several weeks, my colleague Rhiannon Ruff has been writing an ongoing series about Wikipedia and women (here and here).

3. Wikipedia’s relationship with PR — I’m reluctant to put this one so high up, because one could say that I have a conflict of interest with “conflict of interest” as a topic (more here). But considering how much space this took up at the Wikipedia Signpost and on Jimmy Wales’ Talk page over the past 12 months, it would be a mistake to move it back.

This one is a continuation from last year’s #8, when a British PR firm called Bell Pottinger got caught making a wide range of anonymous edits to their clients’ articles. The discussion continued into early 2012, including a smart blog post by Edelman’s Phil Gomes that focused the discussion on how Wikipedia and PR might get along, a public relations organization in the UK developing a set of guidelines for the first time, and a similar organization in the US releasing a survey purporting to demonstrate problems with Wikipedia articles about companies, though it wasn’t quite that.

For the first time since 2009, the topics of “paid editing” and “paid advocacy” drew significant focus. New projects sprang up, including WikiProject Cooperation (to help facilitate outside requests) and WikiProject Paid Advocacy Watch (to keep tabs on said activity). Jimmy Wales spelled out his views in as much detail as he had before, and the Wikipedia Signpost ran a series of interviews over several months (called “Does Wikipedia Pay?”), covering the differing views and roles editors play around the topic. But after all that, no new policies or guidelines were passed, and discussion has quieted a bit for now.

2. Britannica admits defeat — In the year of our lord 2012, Encyclopædia Britannica announced that it would stop publishing a print edition and go online-only. Which means that Britannica essentially has ceased to exist. The 244-year-old encyclopedia, the world’s most famous until about 2005 or so, has no real web presence to speak of: its website (which is littered with annoying ads) only makes previews of articles available, and plans to allow reader input have never gone anywhere. Wikipedia actually had nothing to do with Britannica’s decline, as I pointed out earlier this month (Microsoft’s late Encarta started that), but the media narrative is already set: Britannica loses, Wikipedia wins. Britannica’s future is uncertain and the end is always near, while Wikipedia’s time horizon is very, very long.

Wikipedia SOPA blackout announcement

1. Wikipedia’s non-neutral protest on U.S. Internet law — Without question, the most significant and widely-covered Wikipedia-related topic in the past year was the 24-hour voluntary blackout of Wikipedia and its sister sites on Wednesday, January 18. Together with a few other websites, notably Reddit, Wikipedia shut itself down temporarily to protest a set of laws under consideration in the U.S. House and Senate, called the Stop Online Piracy Act (SOPA) and PROTECT IP Act (PIPA), supported by southern California (the music and movie industry) and opposed by northern California (i.e. the Silicon Valley).

The topic basically hit everyone’s hot buttons, and very different ones at that: the content companies who believe that online piracy is harming their business, and the Internet companies who feared that if the bills became law it would lead to censorship. You can imagine which side Wikipedia took.

But here’s the problem: Wikipedia is not one entity; it’s kind of two (the Foundation and volunteer community), and it’s kind of thousands (everyone who considers themselves a Wikipedian). While there seemed to be a majority in favor of the protest, the decision was arrived at very quickly, and many felt that even though they agreed with the message, it was not Wikipedia’s place to insert itself into a matter of public controversy. And one of Wikipedia’s core content policies is that it treats its subject matter with a “neutral point of view”—so how could anyone trust Wikipedia would be neutral about SOPA or PIPA?

But the decision had been made, and the Foundation (which controls the servers) had made the call, and even if you didn’t like it, it was only for 24 hours. And it certainly seemed to be effective: the blackout received the abovementioned crazy news attention, and both bills failed to win wide support in Congress (at least, for now). And it was a moment where Wikipedia both recognized its own power and, perhaps, was a little frightened of itself. For that alone, it was the biggest Wikipedia story of 2012.

The Top 10 Wikipedia Stories of 2012 (Part 1)

on December 28, 2012 at 12:18 pm

In these waning days of 2012, let’s take this opportunity—for a third year in a row—to look back and come up with a list of the most important Wikipedia news and events in the last 12 months. Like our first installment in 2010 and our follow-up in 2011, the list will be arbitrary but hopefully also entertaining. There is no methodology to be found here, just my own opinion based on watching Wikipedia, its sister projects and parent organization, and also thumbing through the Wikipedia Signpost, Google News and other news sites this past week. So what are we waiting for?

Wait, wait, one more thing: this post ended up being much longer than I expected, and so I’ve decided to split this in two. Today we publish the first five items in the list, 10-6. On Monday 12/31 we’ll publish the final five. Enjoy!

♦     ♦     ♦

10. Wikipedia bans a prominent contributor — Let’s start with something that did not make the news outside of the Wikipedia / Wikimedia community at all, but which took up a great deal of oxygen within it. It’s the story of a prominent editor and administrator who goes by the handle Fæ. In April of this year, he was elected to lead a new organization within the community based on his leadership of the UK chapter. The move was not without controversy: Fæ’s actions both on Wikipedia and the sister site Wikimedia Commons (best known as a vast image repository) and interactions with editors became the subject of intense scrutiny, and even an ArbCom case (the Arbitration Committee is sort of like Wikipedia’s Supreme Court). Fæ ended up resigning his adminship—he basically jumped to avoid being pushed—and the end result had him banned from editing Wikipedia, which he still is. Not that he’s gone away—he’s still a contributor to Commons, and a very active one.

This might sound like a lot of insider nonsense, and I’m not about to dissuade you from this viewpoint. (Sayre’s law applies in spades.) But the key issue involved is about governance: is the Wikimedia community’s organizational structure and personnel capable of the kind of leadership necessary to maintain and build on this important project? The Fæ incident (along with other incidents in this list) suggests the answer may be no.

9. Confusing software development — Not all of Wikipedia’s contributors are focused on editing articles. Some are also developers, working on the open source software to keep Wikimedia sites running and, perhaps, improving. Some (but not all) are paid staff and contractors, and the hybrid part-volunteer, part-professional organizational structure can make it difficult to get projects off the ground.

One longtime project that has yet to see wide implementation is a “visual editor” for Wikipedia articles, to make editing much easier for users. Everyone knows that the editing interface for Wikipedia articles feels like software programming, and almost surely turns away some potential contributors (though it’s not the main reason people don’t contribute, as a 2011 Wikimedia survey showed). But the visual editor is a bigger technical challenge than one might think (as recently explained by The Next Web), and the outcome of a current trial run (also not the first) is anyone’s guess.

Another project, announced with a great deal of hype but which no one really seems to understand, is Wikidata. It calls itself a “common data repository” which by itself sounds fairly reasonable, but no one really knows how it will work in practice, even those now developing it. Wikidata could be a terrifically innovative invention and the very future of Wikimedia… but first we need to find out what it does.

Other projects have been released, but have received thoughtful criticism for adding little value while diverting resources from more worthy projects. For example, a feature briefly existed asking you to choose whether a smiley face or frowny face best represented your Wikipedia experience. Uh, OK? Some projects have been better-received: the Wikipedia iPhone app, for example, is a definite improvement over the mobile site. But there are some odd decisions here, as well: does Wikipedia really need an app for the failed BlackBerry PlayBook?

8. Sum of human knowledge gets more human knowledge — If you’ve ever seen a [citation needed] tag on Wikipedia—and I know you have—then you know that, well, citations are needed. And while citations do actually kind of grow on trees (if by “trees” we mean “the Internet”) there is a lot of information out there which isn’t readily searchable on Google, and sometimes that information costs money. This year, some of those paid services cracked the door open just a bit.

The interesting story to the HighBeam Research partnership is that there really isn’t one. First of all, HighBeam is a news database which charges for reader access to its vast collection of articles. But in March, a volunteer Wikipedia editor who goes by the name Ocaasi reached out to HighBeam and asked if they would be willing to grant free access to Wikipedia editors. They said yes, and supplied one-year, renewable accounts to editors with at least one year’s experience and 1,000 edits. For Wikipedia, it meant greater access to information. For HighBeam, it meant a 600% increase in links to the site in the first few months of the project. Seems like a fair trade.

More recently, the Wikimedia Foundation announced an agreement with the academic paper storehouse JSTOR, making one-year accounts available to 100 of the most-active Wikipedia editors. With almost 240 editors petitioning for access, if you haven’t spoken up yet, chances are you’re a bit too late.

7. The first person to 1 million edits — OK, how about a fun one? In April, a Wikipedia editor named Justin Knapp, who uses the handle Koavf, became the first person to make 1 million edits to Wikipedia. To the surprise of everyone, perhaps none more than Knapp himself, this made him an overnight international celebrity of the Warhol variety. Jimmy Wales even declared April 20 “Justin Knapp Day” on Wikipedia.

It’s worth pointing out that most editors with many, many edits to their name typically are involved in janitorial-style editing activities, such as fighting vandals or re-organizing categories. And many very active editors spend a lot of time squabbling with others on the so-called “drama boards” such as Administrators’ noticeboard/Incidents. Not Knapp: his edits over time have overwhelmingly focused on creating new articles, plus researching and improving content in existing ones. In short: Wikipedia doesn’t need more editors—it needs more Justin Knapps.

Also, this is one I actually played a small role in, as verified by Knapp’s own timeline of events. I’d happened to see someone note the fact on Jimmy Wales’ Talk page that day, which I tweeted, and which was then picked up by Gawker’s Adrian Chen, and the rest is history. Actually, then Knapp kept right on editing Wikipedia. As of this writing, he’s closing in on 1.25 million edits.

6. Philip Roth’s Complaint — Wikipedia has been extraordinarily sensitive to complaints by living people the subject of articles ever since a 2005 incident where a veteran newspaper editor found his article maliciously vandalized to implicate him in the murder of the brothers Kennedy.

In what was arguably the biggest row since then, in September 2012 the celebrated, prickly author of Portnoy’s Complaint, American Pastoral and numerous other novels took to the pages of The New Yorker to issue “An Open Letter to Wikipedia” complaining that the site had the inspiration for his 2000 novel The Human Stain all wrong. And this wasn’t his first resort: Roth’s first attempt had been to authorize his biographer to change the article directly, which was rebuffed. His consternation here: not inexplicable.

But Roth’s complaint was not really with Wikipedia. Several book reviewers had speculated (apparently incorrectly) about the real-life basis for the novel’s central figure, and it was these speculations which had been introduced to Wikipedia. Roth’s publicity campaign brought the issue to much wider attention, which got his personal explanation of the novel’s inspiration into Wikipedia. However, in a twist on the Streisand effect, the controversy is now the subject of a longish and somewhat peevish section written by editors perhaps irked by Roth’s campaign. So he got what he wanted, plus more that he didn’t. Shall we call it the Roth effect?

♦     ♦     ♦

Look here on Monday for the thrilling conclusion to The Top 10 Wikipedia Stories of 2012!

The Agony and Ecstasy of Wikidata

on April 12, 2012 at 8:31 am

Although Wikipedia is by far the best-known of the Wikimedia collaborative projects, it is just one of many. Just this last week, Wikimedia Deutschland announced its latest contribution: Wikidata (also @Wikidata, and see this interview in the Wikipedia Signpost). Still under development, its temporary homepage announces:

Wikidata aims to create a free knowledge base about the world that can be read and edited by humans and machines alike. It will provide data in all the languages of the Wikimedia projects, and allow for the central access to data in a similar vein as Wikimedia Commons does for multimedia files. Wikidata is proposed as a new Wikimedia hosted and maintained project.

Possible Wikidata logo

One of a few Wikidata logos under consideration.

Upon its announcement, I tweeted my initial impression, that it sounded like Wikipedia’s answer to Wolfram Alpha, the commercial “answer engine” created by Stephen Wolfram in 2009. It seems to partly be that but also more, and its apparent ambition—not to mention the speculation surrounding it—is causing a stir.

Already touted by TechCrunch as “Wikipedia’s next big thing” (incorrectly identifying Wikipedia as its primary driver, I pedantically note), Wikidata will create a central database for the countless numbers, statistics and figures currently found in Wikipedia’s articles. The centralized collection of data will allow for quick updates and uniformity of statistical information across Wikipedia.

Currently when new information replaces old, as is the case when census surveys, election results and quarterly reports are published, Wikipedians must manually update the old data in all the articles in which it appears, across every language. Wikidata would create the possibility of a quick, computer-led update to replace all out-of-date information. Additionally, it is expected that Wikidata will allow visitors to search and access information in a less labor-intensive way. As TechCrunch suggests:

Wikidata will also enable users to ask different types of questions, like which of the world’s ten largest cities have a female mayor?, for example. Queries like this are today answered by user-created Wikipedia Lists – that is, manually created structured answers. Wikidata, on the [other] hand, will be able to create these lists automatically.
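To make the idea concrete, here is a toy sketch in Python of what "one central store, many readers" might look like. To be clear, this is not Wikidata's actual data model or API, which is still under development; the `City` record, the `store` dictionary, the two language templates and all of the figures are invented placeholders, assumed purely for illustration.

```python
from dataclasses import dataclass


@dataclass
class City:
    name: str
    population: int
    mayor_gender: str


# A single shared store of facts (hypothetical cities, invented numbers).
store = {
    "city_a": City("City A", 9_000_000, "female"),
    "city_b": City("City B", 7_000_000, "male"),
    "city_c": City("City C", 5_000_000, "female"),
    "city_d": City("City D", 1_000_000, "female"),
}

# Each language edition renders from the store instead of keeping its own copy.
templates = {
    "en": "{name} has a population of {population:,}.",
    "de": "{name} hat {population} Einwohner.",
}


def render(lang: str, key: str) -> str:
    """Build one sentence of an 'article' from the shared record."""
    city = store[key]
    return templates[lang].format(name=city.name, population=city.population)


def largest_with_female_mayor(n: int) -> list:
    """Answer a list-style question directly from the data."""
    largest = sorted(store.values(), key=lambda c: c.population, reverse=True)[:n]
    return [c.name for c in largest if c.mayor_gender == "female"]


# One central update after a new census...
store["city_a"].population = 9_100_000

# ...and every language edition picks it up without a manual edit.
print(render("en", "city_a"))
print(render("de", "city_a"))
print(largest_with_female_mayor(3))
```

The design point is simply that the number lives in one place: changing it once changes every rendering, and the "which of the largest cities have a female mayor" list falls out of a query rather than being maintained by hand.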

Though this project (which is funded by the Allen Institute for Artificial Intelligence, the Gordon and Betty Moore Foundation, and Google) is expected to take about a year to develop, the blogosphere is already buzzing.

It’s probably fair to say that the overall response has been very positive. In a long post summarizing Wikidata’s aims, Yahoo! Labs researcher Nicolas Torzec identifies himself as one who excitedly awaits the changes Wikidata promises:

By providing and integrating Wikipedia with one common source of structured data that anyone can edit and use, Wikidata should enable higher consistency and quality within Wikipedia articles, increase the availability of information in and across Wikipedias, and decrease the maintenance effort for the editors working on Wikipedia. At the same time, it will also enable new types of Wikipedia pages and applications, including dynamically-generated timelines, maps, and charts; automatically-generated lists and aggregates; semantic search; light question & answering; etc. And because all these data will be available as Open Data in a machine-readable form, they will also benefit thrid-party [sic] knowledge-based projects at large Web companies such as Google, Bing, Facebook and Yahoo!, as well as at smaller Web startups…

Asked for comment by CNet, Andrew Lih, author of The Wikipedia Revolution, called it a “logical progression” for Wikipedia, even as he worries that Wikidata will drive away Wikipedians who are less tech-savvy, as it complicates the way in which information is recorded.

Also cautious is SEO blogger Pat Marcello, who warns that human error is still a very real possibility. She writes:

Wikidata is going to be just like Wikipedia in that it will be UGC (user-generated content) in many instances. So, how reliable will it be? I mean, when I write something — anything from a blog post to a book, I want the data I use in that work to be 100% accurate. I fear that just as with Wikipedia, the information you get may not be 100%, and with the volume of data they plan to include, there’s no way to vette [sic] all of the information.

Fair enough, but of course the upside is that corrections can be easily made. If one already uses Wikipedia, this tradeoff is very familiar.

The most critical voice so far is Mark Graham, an English geographer (and a fellow participant in the January 2010 WikiWars conference) who published “The Problem with Wikidata” on The Atlantic’s website this week:

This is a highly significant and hugely important change to the ways that Wikipedia works. Until now, the Wikipedia community has never attempted any sort of consistency across all languages. …

It is important that different communities are able to create and reproduce different truths and worldviews. And while certain truths are universal (Tokyo is described as a capital city in every language version that includes an article about Japan), others are more messy and unclear (e.g. should the population of Israel include occupied and contested territories?).

The reason that Wikidata marks such a significant moment in Wikipedia’s history is the fact that it eliminates some of the scope for culturally contingent representations of places, processes, people, and events. However, even more concerning is that fact that this sort of congealed and structured knowledge is unlikely to reflect the opinions and beliefs of traditionally marginalized groups.

The comments on the article are interesting, with some voices sharing Graham’s concerns, while others argue his concerns are overstated:

While there are exceptions, most of the information (and bias) in Wikipedia articles is contained within the prose and will be unaffected by Wikidata. … It’s quite possible that Wikidata will initially provide a lopsided database with a heavy emphasis on the developed world. But Wikipedia’s increasing focus on globalization and the tremendous potential of the open editing model make it one of the best candidates for mitigating that factor within the Semantic Web.

Wikimedia and Wikipedia’s slant toward the North, the West, and English speakers is well covered in Wikipedia’s own list of its systemic biases, and Wikidata can’t help but face the same challenges. Meanwhile, another commenter argued:

The sky is falling! Or not, take your pick. Other commenters have made more informed posts than this, but does Wikidata’s existence force Wikipedia to use it? Probably not. … But if Wikidata has a graph of the Israel boundary–even multiple graphs–I suppose that the various Wikipedia authors could use one, or several, or none and make their own…which might get edited by someone else.

Under the canny (partial) title of “Who Will Be Mostly Right … ?” on the blog Data Liberate, Richard Wallis writes:

I share some of [Graham's] concerns, but also draw comfort from some of the things Denny said in Berlin – “WikiData will not define the truth, it will collect the references to the data…. WikiData created articles on a topic will point to the relevant Wikipedia articles in all languages.” They obviously intend to capture facts described in different languages, the question is will they also preserve the local differences in assertion. In a world where we still can not totally agree on the height of our tallest mountain, we must be able to take account of and report differences of opinion.

Evidence that those behind Wikidata have anticipated a response similar to Graham’s can be found on the blog Too Big to Know, where technologist David Weinberger shared a snippet of an IRC chat he had with a Wikimedian:

[11:29] hi. I’m very interested in wikidata and am trying to write a brief blog post, and have a n00b question.
[11:29] go ahead!
[11:30] When there’s disagreement about a fact, will there be a discussion page where the differences can be worked through in public?
[11:30] two-fold answer
[11:30] 1. there will be a discussion page, yes
[11:31] 2. every fact can always have references accompanying it. so it is not about “does berlin really have 3.5 mio people” but about “does source X say that berlin has 3.5 mio people”
[11:31] wikidata is not about truth
[11:31] but about referenceable facts

The compiled phrase “Wikidata is not about truth, but about referenceable facts” is an intentional echo of Wikipedia’s oft-debated but longstanding allegiance to “verifiability, not truth”. Unsurprisingly, this familiar debate is playing itself out around Wikidata already.

Thanks for research assistance to Morgan Wehling.

Regarding the Uncertain Future of Encyclopædia Britannica

on March 14, 2012 at 5:01 pm

Yesterday, Encyclopædia Britannica made the startling announcement that it would discontinue its print edition after 244 years. Once the current edition has sold out, the sets will become collector’s items. Which is essentially what they are now, if it’s not too uncharitable to point out. Britannica is not finished as an operation, however: it will continue to publish on the web. It’s a startling announcement, sure, but it makes more sense than if it went on as if nothing had changed. Britannica’s editors acknowledged as much in a post on their blog:

A momentous event? In some ways, yes; the set is, after all, nearly a quarter of a millennium old. But in a larger sense this is just another historical data point in the evolution of human knowledge.

But Britannica’s grip on the evolution of human knowledge isn’t what it used to be—you can see where I’m going, right? As a well-known quote from Jimbo Wales goes:

Imagine a world in which every single person on the planet is given free access to the sum of all human knowledge. That’s what we’re doing.

Since its launch in 2001, and especially since a (much-debated) 2005 Nature article comparing the two, Wikipedia has been a thorn in Britannica’s side. And its influence has long since surpassed that of its much older rival. A Quantcast comparison suggests that Wikipedia’s traffic is 30x that of Britannica. And as I tweeted last night, news organizations have been quick to note the competition.

Under the title “Death By Wikipedia: Encyclopedia Britannica Stops Printing”, ReadWriteWeb observes:

The usefulness of such reference materials has been on the decline for years, especially since the advent of Wikipedia. Whatever flaws its open, crowd-sourced editorial model may invite, Wikipedia is generally regarded as a comprehensive and mostly-accurate source of information, which can be accessed for free.

And in a Venture Beat article titled “Encyclopaedia Britannica wiped out by Wikipedia, selling final print edition” we find:

The extremely thorough Wikipedia article on Encyclopaedia Britannica … serves as the perfect example of why Wikipedia is coming out on top.

It’s true—Wikipedia’s article about Encyclopædia Britannica is very thorough. Britannica’s article about Wikipedia is not bad, but it is far more limited than Wikipedia’s article about itself, and Britannica has those annoying pop-up advertisements that do nothing for readers.

Yet Britannica president Jorge Cauz tells The Washington Post:

This has nothing to do with Wikipedia or Google. … This has to do with the fact that now Britannica sells its digital products to a large number of people.

This is a little bit like Microsoft saying Windows 8 has nothing to do with the iPad, merely the shift in consumer purchasing habits toward the tablet and mobile markets. That’s not to say the statement is necessarily untrue, just that it’s incomplete. I don’t know a great deal about Britannica’s current business model, but it’s safe to say that non-print revenues have become far more important as Britannica’s print sales have fallen. Whether they will succeed is another question; PC World doesn’t think so, pointing out the closure of (speak of the devil) Microsoft’s online encyclopedia Encarta in 2009, which I wrote about at the time:

Microsoft shuttered its digital multimedia encyclopedia, Encarta, in 2009, and the last trace of it, the online dictionary, closed last year. Encarta, though a digital product, was also made obsolete by Wikipedia’s free availability, constantly updated content and thousands of editors, contributors and volunteers from around the world.

At The Atlantic, expert on evolution and Bloggingheads impresario Robert Wright offers this (small) consolation:

Maybe, long after even the electronic edition of Britannica is gone, the idea of Britannica can remain for us what it once was for me–a kind of Platonic ideal that we aspire to evolve toward even if we can never reach it, something that has a kind of reality even if we can never touch it.

As someone who devoured Britannica in my school library when growing up, not to mention someone who relied on Britannica as a college student in the late 1990s (before Britannica added a pay wall), much the same way as students today (notoriously) rely on Wikipedia, I’m sorry to see it go. But we no longer live in a world where a 30,000 page, 32-volume encyclopedia can be printed on an annual basis for profit. In fact, even Britannica sees itself as a collector’s item now; as Cauz tells the News Observer:

This is going to be as rare as the first edition, because the last print run of our last copyright was one of the smallest print runs.

I’d love to own one myself, but at $1,395.00 for the “Final Print Edition”, I’m afraid I’ll have to pass. And perhaps Cauz is wrong; maybe the death of Britannica will be more like the Death of Superman.

Wikipedia Gets on its SOPA Box

on January 17, 2012 at 9:46 am

Wikipedia SOPA blackout announcement
The Wikimedia Foundation announced on Monday that the English-language Wikipedia will go offline for 24 hours, starting at midnight tonight on the East Coast, in protest of the Stop Online Piracy Act (SOPA) and a related bill, the PROTECT IP Act (PIPA). The move follows a similar protest by the Italian-language Wikipedia last year, protesting proposed anti-privacy laws in Italy.

Over the past week, volunteer Wikipedia editors debated the proposition and ultimately decided to go forward. The decision was accepted by the Foundation, which will implement it late tonight. An official public explanation includes the following:

Over the course of the past 72 hours, over 1800 Wikipedians have joined together to discuss proposed actions that the community might wish to take against SOPA and PIPA. This is by far the largest level of participation in a community discussion ever seen on Wikipedia, which illustrates the level of concern that Wikipedians feel about this proposed legislation. The overwhelming majority of participants support community action to encourage greater public action in response to these two bills. Of the proposals considered by Wikipedians, those that would result in a “blackout” of the English Wikipedia, in concert with similar blackouts on other websites opposed to SOPA and PIPA, received the strongest support.

The decision is not one that all are happy about. After all, Wikipedia’s core content guidelines emphasize a Neutral point of view in its approach to encyclopedia topics, so isn’t this a questionable decision?

Just this morning, a participant on a Wikipedia-related discussion group wrote:

Now that we have taken the necessary first step to regard the English Wikipedia and other Wikimedia projects as high-profile platforms for political statements, we ought to consider what other critical humanitarian problems we could use our considerable visibility and reputation to address. We could draw attention to the crises in Sudan or Nigeria, drone attacks against civilians in Afghanistan, the permanent occupation of the Palestinian territories, the Iranian effort to develop nuclear capabilities, police misconduct in virtually any country, the treatment of women and women’s rights in Saudi Arabia and elsewhere, and the list could go on and on.

Well, considering that it was a matter of debate, it surely is questionable and does not reflect the views of all Wikipedians. But I think it’s also fair to say that it reflects the majority of participants.

Wikipedia has its philosophical roots in the free software movement, which is the very antithesis of what SOPA and PIPA are about, so this particular viewpoint should surprise no one. Meanwhile, Wikipedia is well aware that it has its own systemic biases and has organized a project to answer them. In this case, however, Wikipedia’s bias shows through and most participants find this to be a good thing.

I’ll have to put myself more in the skeptic’s camp—not because I support SOPA, which I’m pretty sure I don’t—but because I would prefer that Wikipedia not become a platform for political activism. That said, I don’t think it will lead to similar efforts in the near future and, considering it’s already received significant news coverage, I think there is no question it will be effective in raising awareness about the issue.

For Wikipedians who are uncomfortable with the effort, there’s not much else to do. The band they’re in is playing a different tune, and we’ll see you on the dark side of the Wikipedia blackout.

The Top 10 Wikipedia Stories of 2011

on December 31, 2011 at 10:07 pm

A year ago, I wrote a blog post called “The Top 10 Wikipedia Stories of 2010”. Perhaps, then, I should write a follow-up this year? For some reason, I’m having a harder time of it. Was 2011 less of a newsworthy year for Wikipedia? Not if this Google Insights for Search analysis of Wikipedia-related news stories is to be believed: if anything, Wikipedia was a more prominent news generator this year than last. Make what you will of the proprietary, nontransparent methodology of Google’s news judgment, but at least it seems Wikipedia has been plenty newsworthy.

It’s my personal judgment that Wikipedia was somehow less newsworthy than it was last year. Maybe that speaks to the absence of WikiLeaks / Wikipedia confusion in the public discussion, or maybe it speaks to the fact that I think some of the big topics simply repeat.

Whichever is the case, I say let’s do what we did last year, and count down through the most important and / or impactful news stories about the year in Wikipedia, using my own proprietary, nontransparent methodology, which is to say these are my personal judgments:

10. Superinjunctions — In May, Wikipedia was one of several websites (notably also Twitter) that came into conflict with UK court orders, or “superinjunctions”, seeking to suppress scandalous gossip about sports and film celebrities (I know, right?). Wikipedia servers, like Twitter’s, are based in the U.S. and so are protected by the First Amendment. But that doesn’t mean some won’t try.

9. Wikipedia and education — This was on the list last year, and even though there was no singular event to point to, I’m going to include it again. Wikipedia remains a major subject of controversy at both the university and secondary levels, and while teacher attitudes are changing, and Wikipedia is making efforts to work with them, much confusion remains and resistance continues to exist. (But is probably futile.)

8. Wikipedia meddling — Politicians don’t fare well when they try to edit Wikipedia. Nor do some famous newspaper columnists. You know who seems to do an even worse job of this? PR firms. As I’ve written about more than once, it’s not impossible to contribute to Wikipedia on a topic you are close to without getting burned, but those who are determined to subvert Wikipedia will keep getting burned.

7. Drawbacks of Wikipedia’s openness — It’s not just politicians who sometimes run afoul of Wikipedia… their supporters do, too. This summer, Sarah Palin said something about Paul Revere that was factually inaccurate, and anonymous someones presumed to be in her corner tried to change relevant Wikipedia articles… and then a few days later, Michele Bachmann said something about John Wayne’s hometown that was incorrect and John Quincy Adams’ status as a founding father that basically is too, and unhelpful Wikipedia edits commenced. Oh, and of course Stephen Colbert was there to fan the flames. To paraphrase a real founding father, if eternal vigilance is the price of liberty, so too is it the price of an online encyclopedia anyone can edit.

6. But how open is it, really? — This will come up again later, but many Wikipedians have become concerned that Wikipedia is too difficult to use, for reasons related both to the once-revolutionary but now-creaky collaborative tools (i.e. the MediaWiki software that powers Wikipedia and its sister sites) and to the often-insular community that defines it. Over Thanksgiving weekend, search engine-focused blogger Danny Sullivan published a blog post blasting Wikipedia for being “closed” and “unfriendly” and, even though he wasn’t very friendly (read: a total jerk) in his brief on-site activity, his point that Wikipedia is difficult to use is not incorrect. Wikipedia volunteer developers have created multiple versions of an Article Feedback Tool, something called “WikiLove”, a rather condescending smiley face / frowny face tool still in testing, and there are more user interface (UI) changes in store. But if the community itself is the issue, that’s a much trickier question.

5. Integration with museums and archives — One of the most interesting things happening on Wikipedia these days is the GLAM (Galleries, Libraries, Archives and Museums) project, in which researchers collaborate with the aforementioned institutions to make their material more easily accessed by Wikipedians for use on Wikipedia. Started by Liam Wyatt, who received considerable attention in 2010 for a stint as “Wikipedian in residence” at the British Museum, the project has grown far beyond him. In the U.S., the Smithsonian and National Archives are now participants, with attention paid by The Atlantic, among other news organizations. If Wikipedia’s reputation for accuracy and depth improves in the years ahead, the GLAM project will play a big part.

4. Wikipedia’s gender imbalance — As I asked in February: “Could it really be that just 13% of Wikipedia editors are women?” Well, nobody knows for sure, but this is the percentage of women who participated in the Wikimedia Foundation’s most recent editors survey, and in 2011 the issue attracted renewed attention. A story in the New York Times by the publication’s lead wiki-watcher, Noam Cohen, led to new internal discussion over the site’s gender balance, a renewed outreach effort by Wikimedia executive director Sue Gardner, and a Wikipedia “fork” of the Change the Ratio campaign spearheaded by my friend Amy Senger. Has it worked? Well… who’s to say just yet? It seems unlikely that Wikipedia participation will reflect the actual gender balance of the wider world (and I would say it needn’t actually do that), but all parties would probably be happy to see a measurable uptick when the next survey rolls around.

3. Wikipedia occupies itself — In early October, the Italian-language Wikipedia edition turned off the lights temporarily in protest against a proposed law that would require websites to issue corrections, or face penalties. The protest received worldwide coverage; the proposed law has not become law. According to Google Insights, this was in fact the most-searched Wikipedia-related news story of the year, but I’m exercising my own editorial discretion here. Meanwhile on the (much more widely read) English-language Wikipedia, similar measures have been considered in response to the U.S. Stop Online Piracy Act (SOPA); however, nothing has come of it (yet).

2. Falling editor retention — I begin with the caveat that this should probably be number one; this might seem a bit esoteric to the outsider, but in fact this is a proxy for questions about the long-term survivability of Wikipedia as a project, and is such a huge topic that I can’t properly wrap my head around it.

In August, I wrote a response to a Gawker post titled “Wikipedia is Slowly Dying”, which argued that Wikipedia had lost its mojo, and that the “cognitive surplus” that helped build it had now moved on to places like Facebook and Twitter. This is wrong for reasons I only partly articulated at the time, but there’s no question that Wikipedia has fewer editors than it did last year, and the year before, and the year before.

The Wikimedia Foundation’s own research shows that new editors face longer articles offering fewer clear opportunities to get involved (which shouldn’t be a surprise, given the site’s impressive growth) and have a harder time making their edits stick.

The above chart, also prepared by the Wikimedia Foundation, shows participation is clearly in flux: the explosive growth of participation crested several years ago and has been in slow decline since. No one really knows what’s going on with the direction of Wikipedia’s participation rate, regardless of gender, but it has been a major topic of discussion and will continue to be.

1. Wikipedia’s 10th anniversary — My choice for the top story last year was also about Wikipedia—the controversy over its ubiquitous fundraising banners—and so it is again. As much as Wikipedia strives to avoid self-referentiality in its own encyclopedia pages, the one thing Wikipedians have in common (and they often do not have much) is a fascination with Wikipedia. And this year was a big milestone: the 10th anniversary since Jimmy Wales (and, oh yeah, Larry Sanger) started up a “wiki” encyclopedia, very much as an afterthought.

To celebrate the milestone, Wikipedia held events around the world, and it happened to be a good time to be a Wikipedia commentator: I was interviewed for Ukrainian TV, and I collaborated with the creative agency JESS3 to produce a web video called “The State of Wikipedia”, narrated by Jimbo himself. As of this writing, it has more than 135,000 views on YouTube, making it one of the bigger things I did this year. Here’s looking forward to an interesting 2012.

How to Stop the Next Bell Pottinger

on December 12, 2011 at 10:57 pm

I’m somewhat late by now to one of the bigger Wikipedia-related stories to come along in recent months: the revelation of secretive Wikipedia edits by a London-based PR firm called Bell Pottinger. As reported by the BBC and The Independent and others, Bell Pottinger was caught airbrushing client entries, adding promotional material and removing critical information. Of course, the company’s own Wikipedia profile is now disproportionately about this incident, at least for the time being.

In a swift and thorough investigation, Wikipedia’s volunteers determined that Bell Pottinger employed at least ten accounts, and probably more, to edit more than 100 separate pages. These changes included adding “promotional/excessive language”, including “puffery” and in some cases “unambiguous advertising” by accounts with such innocuous-sounding names as “Biggleswiki”. (Ask not for whom the Bell Pottinger tolls, it tolls for Biggleswiki.)

In spite of myself, I was amused: why is it that supposedly smart, sophisticated PR professionals seem to think the best approach to Wikipedia is duplicity?

Problem is, I think that narrative may be driving the response a bit too much. While the coverage has been mostly responsible, noting that Bell Pottinger committed “possible breaches of conflict of interest guidelines”, it is easy to come away with the impression that any interaction with Wikipedia articles by interested parties is inherently illegitimate. Not unlike the widely-reported incidence of U.S. congressional staff edits to Wikipedia in 2006, or similar incidents uncovered with a tool called WikiScanner in 2007, it ends up stigmatizing editors who would make legitimate edits.

The BBC writes: “While anyone is free to edit the encyclopaedia, the site’s guidelines urge users to steer clear of topics in which they have a personal or business interest.” This is not true for personal interests, and while true for business interests, anyone who knows the site well also knows that it is not the full picture. At least the BBC also quoted Wikipedian David Gerard, noting the investigation would focus on whether the edits were carried out in “bad faith”. More Gerard: “We’re having a close look. What the team is going to do is look at Bell Pottinger’s clients and see what edits have been made.” It so happens these details actually do matter. And even Jimmy Wales, amid more forceful denunciations of the bad actors, told The Independent: “There are ethical PR companies out there.” Not that you ever hear about them.

♦     ♦     ♦

As some readers will know, I’ve long been interested in the topic of COI (“Conflict of interest”) editing at Wikipedia. I don’t spend a great deal of time dwelling on the topic here, but indeed it has been a professional focus as well. Over the past few years I have developed best practices for clients, mostly large companies and organizations with existing articles, to facilitate the improvement of those Wikipedia articles in a constructive manner, following Wikipedia’s rules. As noted on the About page of this blog: “My goal has been and will always be to improve such articles while working within consensus.” I’ve carried many of these on my back—these projects are not difficult to find—and helped clients engage under their own name as well. I’m proud of all these, not least because so many find it so surprising.

It shouldn’t be this way. Earlier this year, I teamed up with creative agency JESS3 and marketing automation firm Eloqua to produce a “white hat” guide for marketers and business professionals titled “The Grande Guide to Wikipedia”—a how-to for constructive interaction with the Wikipedia community. The feedback was positive, but I heard more from Wikipedians than from marketing professionals. I have no doubt that furtive, undisclosed edits are common at most firms, not because they seek to do harm (like Bell Pottinger), but because editing transparently seems like too much trouble.

Another reason, and I want to be careful here, is that statements by Jimmy Wales have created the impression that anyone who works for a marketing firm is unwelcome. This goes back to the business involving Gregory Kohs and the MyWikiBiz controversy, where Wales’ “shoot on sight” comments remained effectively the only quote on the matter for a long time. Kohs, openly hostile to Wikipedia and vocal about his intent to subvert it, was, for a long time, the only model. No doubt this unfortunate turn of history kept well-meaning COI editors in the shadows.

But I’m not alone in thinking that this needs to change. Recently, a social media marketer named David King wrote a very good blog post titled “Why Wikipedia Needs Marketers”, which included this astute observation:

The volume of [Wikipedia] content is growing, but the active contributors to maintain, update and police those articles is shrinking. As this trend continues, vandalism, bias, outdated information and blatant factual errors will run even more rampant.

Marketers are the most motivated to maintain Wikis on subjects important to them and invest the time in providing quality, well-verified content. We can fill this gap if we can learn to support Wikipedia’s encyclopedic goals and follow the rules.

The response to his post was, perhaps surprisingly, very positive—with encouraging replies in the comments from respected editors including Lori Phillips, FT2 and Wikimedia Foundation reader relations head Philippe Beaudette. King was subsequently invited to expand on the theme at The Wikipedia Signpost, where he continued:

COI contributors introduce bias, but I’m also concerned of the bias without them. Some of our most knowledgeable and motivated contributors are COIs. Does that mean we open the doors wide? Absolutely not. COIs are like political lobbyists. We’re needed but our participation needs to be a delicate and well regulated one. But through teamwork, education, awareness, process, a better ecosystem we could change the tides.

I half-agree with this. I think the analogy of lobbyists is incorrect; “COI editors” should regulate their own contributions, as Wikipedia’s Conflict of interest guideline itself says: “Where advancing outside interests is more important to an editor than advancing the aims of Wikipedia, that editor stands in a conflict of interest.” Conflict of interest is not a fait accompli; a conscientious editor can and should acknowledge the potential for conflict of interest, and take steps to mitigate that. This should include seeking consensus for making edits outside of what the COI guideline describes as patently “non-controversial edits”.

But he’s right that such edits should also be well-regulated, although they are not now. In practice, following the advice of the Paid editing essay and seeking consensus at the Conflict of interest/Noticeboard (COI/N) or at various WikiProjects can present significant delays, another non-trivial obstacle for marketing and PR professionals who might then choose to just edit without providing adequate disclosure.

♦     ♦     ♦

David King is also right that there needs to be a better ecosystem, both to support and to regulate such editing activity. But such a system is unlikely to happen on its own. The answer may lie in an accommodation not unlike the one that accepts the role of ethical PR professionals on Wikipedia. To wit: although the spirit of Wikipedia is for it to be volunteer-edited, there are cases where COI editors, whether paid representatives or smart employees, can help address problem areas with certain articles. Likewise, the Wikimedia Foundation plays no role in setting editorial policy, but it can and should play a role in facilitating responsible COI activity.

There are good, active editors at COI/N who frequently catch bad actors (and infrequently help good ones) but unless their ranks are expanded significantly, they would have a difficult time handling the volume, were marketers to wise up and learn to follow Wikipedia’s rules. Why not help them out?

I suggest that a model already exists: through outreach efforts described in the Wikimedia Foundation’s Strategic Plan (PDF) and embodied in the Wikimedia Ambassador Program, resources could be put toward meeting PR professionals halfway. I don’t think the Foundation needs to seek more such editors, in part because they are already here. But it can provide a safe harbor for assistance requests and advice to ensure COI compliance, and make it safe to follow the rules. Yes, there are plenty of how-tos on pages scattered around the website, but if Danny Sullivan is right about one thing, it’s that Wikipedia is confounding to the uninitiated.

Five years ago, Wikipedia was definitely not ready for this. Today I think it is. And I wouldn’t necessarily call it traditional public relations, and certainly not marketing, because Wikipedia is a unique medium with its own rules. I suggest thinking of it as Wikipedia relations, or wiki relations for short. Hesitant Wikipedians should see it as a mark of how far the project has come: while volunteers remain the core of Wikipedia’s community, there is room for professional representatives of outside interests to work constructively in this space.

Returning to Jimmy Wales’ comments above, ethical PR firms and COI editors do exist. With some effort by the Wikipedia community and the Wikimedia Foundation, more can be encouraged, and Wikipedia would be better for it.

Johann Hari and the Terrible, Horrible, No Good, Very Bad Wikipedia Edits

on September 15, 2011 at 12:04 pm

Unless you follow the media, and more specifically the British media, you may be wholly unaware that there is such a person named Johann Hari, or that he has been a wunderkind columnist and correspondent, or that a lot of people find him kind of insufferable, and in that case you almost certainly don’t know that he got himself in a big heap of trouble this summer, over charges of plagiarism and meddling with Wikipedia.

Understandably, most of the criticism has been focused on the plagiarism charges. After all, that’s a crime against journalism, and by definition journalists are the ones writing about it most widely. What he did in those cases was not remotely OK, but at the moment I’m a little more animated by his improper Wikipedia activity. After all, that’s a crime against Wikipedia, and by definition The Wikipedian blogs about Wikipedia.

The matter is news again today because Hari has published a public apology in the pages of The Independent, his employer. He is sorry for everything he has done, he’s returning his prestigious Orwell Prize (which he probably was going to lose anyway) and he’s taking a sabbatical to go back to journalism school. I guess it’s a start.

About the Wikipedia controversy, Hari devotes just one full paragraph:

The other thing I did wrong was that several years ago I started to notice some things I didn’t like in the Wikipedia entry about me, so I took them out. To do that, I created a user-name that wasn’t my own. Using that user-name, I continued to edit my own Wikipedia entry and some other people’s too. I took out nasty passages about people I admire – like Polly Toynbee, George Monbiot, Deborah Orr and Yasmin Alibhai-Brown. I factually corrected some other entries about other people. But in a few instances, I edited the entries of people I had clashed with in ways that were juvenile or malicious: I called one of them anti-Semitic and homophobic, and the other a drunk. I am mortified to have done this, because it breaches the most basic ethical rule: don’t do to others what you don’t want them to do to you. I apologise to the latter group unreservedly and totally.

Hari’s Wikipedia article contains this brief account:

Several journalists, including Cristina Odone in The Daily Telegraph and Nick Cohen in The Spectator, concluded that a Wikipedia editor, ‘David r from meth productions’, who claimed to be ‘David Rose’, were in fact made by Hari. Writing in The Daily Telegraph, Odone noted that, after she had fallen out with Hari, Rose began making misleading edits to her Wikipedia article accusing her of anti-Semitism and homophobia. Nick Cohen said that misleading edits were made to his own Wikipedia article by the same editor after he had published criticism of Hari’s work. … The Times leader writer Oliver Kamm later attributed to ‘David Rose’ a change in his Wikipedia biography that he regarded as “merely an unsubstantiated judgement” but which had been made not long after a “spat” with Hari.

I am not one who believes, as a general rule, that someone should never edit their own Wikipedia article. Indeed, I’m kind of the expert on how to do it and not bring grief to yourself. But by his own admission, Hari’s editing of his own page amounts to what Wikipedia informally calls whitewashing. Hari also did not disclose that he was behind the “David r from meth productions” account, which is also, obviously, a problem. And it’s all the worse—and by worse I just mean “embarrassing”—if you’ve read any of his surreptitiously self-serving arguments in the archives of his Talk page.

But embarrassment is the bare minimum of regret Hari should feel about his “juvenile or malicious” edits to Wikipedia articles about his media adversaries. This is the part that really gets me. Others may disagree, but I see a vast gulf between sneakily trying to make yourself look better and sneakily making others look worse. And I think there’s a big difference between being an anonymous Internet critic—although it’s a type known to take things too far—and using the veil of anonymity (or in the case of Wikipedia, pseudonymity) to smear a person’s reputation.

Calling someone a “douchebag” is rude, and you may be wrong, but that’s your opinion. Calling someone a “drunk” is a specific charge of bad behavior, about which one is either right (and maybe still an asshole) or wrong, and that’s unforgivable. I don’t know which is the case, but either reflects very poorly on his character. This is the one thing that I think no apology, leave of absence, or media training, can fix.

Update: In the comments, a reader points out that Hari’s edits are even worse than I’ve described them, and he’s right. He points to apparent sustained anonymous vindictiveness on Hari’s part, and I add that Hari’s self-support included some rather absurd sock puppetry, neither of which I was aware of at the time I first wrote this. Had I the time, I would follow this up in more detail. But the upshot remains the same: as a public figure, Hari may or may not be finished—but as a respectable one, he certainly is.

Rick Santorum’s Wikipedia Problem and its Discontents

on August 10, 2011 at 9:16 am

When former U.S. Senator Rick Santorum started gearing up to launch his presidential campaign earlier this year, there was one question he could not avoid. It had to do with the matter of alt-weekly editor and advice columnist Dan Savage, who has for years positioned himself as Santorum’s most prominent critic. Many politicians have fierce opponents, but few did what Savage did in 2003, and that was hold a contest to give an alternate meaning to the word “santorum”. I hope you’ll forgive me for declining to quote the winning definition, but you can find it here, and suffice it to say that it has stuck. So much so, in fact, that eight years later Savage’s term has come to dominate the web search results for Rick Santorum’s name.

In news stories this year it was mostly described—by ABC News, Roll Call, Slate, and Huffington Post, among others—as Santorum’s “Google problem”. Indeed, one of the top three results for Santorum’s name is Dan Savage’s website promoting the campaign. But Google and Wikipedia are often joined at the hip, and one of the top results has been a Wikipedia article, not about Rick Santorum per se, but in fact about the campaign against him… or about the word itself… it hasn’t always been clear. And by mid-summer 2011, the article—then called Santorum (neologism)—had grown to several thousand words, and had itself become the focus of controversy among Wikipedians.

This blog post traces the history of the article’s evolution in some detail—not exhaustive, but getting there—because it’s an interesting window into how Wikipedia deals with controversial topics. Wikipedians can’t always agree, and in fact the article in question still remains a matter of dispute. But after 200,000 words and numerous debates in various forums around Wikipedia, the community has arrived at something approaching a satisfactory conclusion. Below, I aim to show how things got out of control, and how the Wikipedia community worked it out.

·     ·     ·

August 2006—To start from the beginning, let’s start from the beginning. The first version of this article was created five years ago this week, simply as Santorum.

(I should take a moment here to point out that—spoiler alert—because the article today is called Campaign for “santorum” neologism that is what appears at the top of all historical versions of the article; generally speaking, for each version I’ll link here, I will boldface the article’s name at the time upon each reference.)

At this point the article was just a few paragraphs, outlining the circumstances that led to Savage’s coinage and a few examples of the term’s usage in the U.S. media. Prior to becoming its own article, most of the relevant material had been contained in a sub-section of the article about Savage’s sex advice column: Savage Love#Santorum.

It didn’t take very long at all before editors questioned the topic’s suitability for a standalone article—what Wikipedia calls “notability”. In fact, the same day the article was first created, it was nominated for deletion. The reason for the nomination is one that would be echoed many times over the next half-decade:

The neologism referred to, created by Savage Love, does not have any evidence of real currency as a neologism. It should be treated as a political act by Savage Love, and described under that article.

The nomination failed and the article remained, as it certainly had received some media attention, but it was decided a renaming was in order. The suggestion was made that it be called Santorum (neologism), or possibly Santorum (sexual slang). Recent followers of this controversy might assume that the former was selected, because that was the name of the article for a long while. However, it was the latter, with a large reason being that Wikipedia has an explicit policy against creating articles about neologisms.

But that hardly settled the matter; the next issue concerned which Wikipedia page readers should find when they search for the word “santorum”, which now was considered to have—and here you could say that Savage had already won—two legitimate meanings. So the question was taken to a “straw poll”. For now, the article was still called Santorum, but what would the average Internet user be looking for when they looked up that term? How should the ambiguity be handled—in Wikipedia terminology, “disambiguated”? And what exactly should they call the article about the coinage?

Related to the word “Santorum”, the options included, and I quote:

  • Santorum should be an article about Savage’s attempt to define the word “santorum”
  • Santorum should be a disambiguation page, with its “traditional” content
  • Santorum should be a disambiguation page, with some other content (explain)
  • Santorum should be a redirect to Rick Santorum, and Rick Santorum should have a dablink…
  • Santorum should be a redirect to Rick Santorum, with no reference to the Savage neologism in the Rick Santorum article

Related to the article about Savage’s coinage, the options included, and I quote:

  • The article on the Savage neologism should be titled Santorum (neologism)
  • The article on the Savage neologism should be titled Santorum (sexual slang)
  • The Savage neologism needs no article; sufficiently covered at Savage Love#Santorum

And the result was… inconclusive. Nevertheless, a proposal was made, and subsequently accepted, to keep Rick Santorum as it always was, to call the Savage Love-inspired article Santorum (neologism), and to make Santorum a disambiguation page with links to relevant pages, among other details. The best summary of the considerations involved was stated by User:Dpbsmith, a veteran and still-active editor, who wrote:

Frankly I’ll support anything meeting these criterion:
A user who types in “santorum” as the Go word intending to find information about the Senator can find it very easily.
A user who types in “santorum” as the Go word intending to find information about the neologism can find it easily.
A user who types in “santorum” as the Go word is not presented immediately with the details of the neologism, but must click on a link, and the link must have some kind of label that communicates that fact that they are about to read about a political attack on the the [sic] Senator.
There should be no implication that Wikipedia endorses the neologism as somehow being “the real meaning” of the word.

Oh, did I mention there was also then a page called Santorum controversy, which is now called Santorum controversy regarding homosexuality, that also came up in the discussion? Well, now I have. Just wanted to be clear about that.

·     ·     ·

Late 2006-Early 2007—Although the matter seemed to have been handled appropriately, that didn’t stop editors from raising objections—even the very same objections—in the months following. In fact, someone had changed the article’s title back to Santorum (sexual slang) by the time the article came up for a second deletion debate in December 2006. The nominator focused on the fact that the media hits for the article were trivial—sure, The Daily Show and The Economist had used it, but neither had focused on it as a topic—while several less well-known sources appeared to be joining Savage’s campaign to popularize the term. Meanwhile, the nominator’s first argument was that the primary information was already covered in the Santorum controversy article (now you see why I mentioned it). Following a week’s worth of debate involving approximately two dozen Wikipedians and several thousand words…

The result was hopeless, hopeless lack of consensus.

(Emphasis in the original.) Lack of consensus to delete an article always means that it stays, and so it did. Some editors had suggested moving the article’s content to Wiktionary, Wikipedia’s dictionary sister project, where in fact the term had registered its own entry (without controversy) several months ahead of Wikipedia.

Later in December, one of the editors involved in the previous debate suggested moving the article from Santorum (sexual slang) to the oddly-titled Santorum (sexual slang activism), though the article stayed put. In January, a suggestion was made to merge the article back into the Savage Love entry, but that didn’t happen either.

·     ·     ·

Late 2007—Debate continued. In September, someone renamed it to Santorum (fluid)—ugh—and it was returned to Santorum (neologism), as it was then called. By this point, the article had grown substantially, was attracting the efforts of serious Wikipedians, and was… well, it was actually getting pretty good. In September 2007, the article was nominated for “Good article” (GA) status, and it looked like this. Later that day, the reviewing editor failed the article for including unsourced and “poorly sourced” material—The Onion in particular was singled out, although it was really an interview with Savage in the sister publication, AV Club—and for being a “BLP liability”.

That is to say, the article skirted the line of Wikipedia’s Biographies of living persons (BLP) policy, which aims to keep out scurrilous and weakly-sourced material about living persons that could be damaging to a living person’s reputation. As you might imagine, that had long been an issue; one couldn’t write about this topic without it being an issue. One could argue that Savage’s campaign was all about damaging Santorum’s reputation—I presume Dan Savage would agree to that—and yet it was nonetheless notable. Many editors then, and to this day, wished it would simply go away. And yet some wanted to make it as “good” as possible.

·     ·     ·

2008-2010—We can skip ahead, because after October 2007, fewer than 160 edits occurred in the three years intervening, and it was not changed substantially in that time. Santorum had lost his re-election bid in late 2006, re-entered private life in January 2007, and ceased to make headlines. In December 2007, the article looked like this. In January 2011, it looked like this. It was the same old back-and-forth, and not much happened.

·     ·     ·

Early 2011—As Santorum started making moves to run for president, activity picked up. In mid-February, Roll Call was first to write about Santorum’s “Google problem”, and this was dutifully added. The article continued to draw attention (including from vandals) through the end of February, until it was put under temporary “semi-protection”. When Stephen Colbert mentioned the controversy on his show, a not-so-brief summary was added, then removed, with the point made that “not everything Colbert says needs to be repeated in Wikipedia”. (Imagine that!) March and April were months of relative calm before the proverbial storm: nearly 1,000 direct edits, from May to this writing, lay just ahead.

·     ·     ·

May 2011—In early May, a very active and respected editor-administrator, User:Cirt, began a series of more than 300 edits to the article, starting with a long-overdue link to Wiktionary. By this point, the article contained some 1,600 words, excluding links and references. Cirt announced his intention to add “some research in additional secondary sources”, and four days later he had expanded the article to some 4,300 words. On the discussion page, one editor objected:

Expanding an article about a vile attack on a living person – it’s twice the size now and refs have gone from 33 to 95 – has got to be against the spirit of least of our BLP policy. My proposal, and my intention, stated right now, is to return this article to the content it had on May 9th.

This kicked off the first sustained debate in years—one that has arguably not yet come to a close. A proposal was made to “stub” the article, meaning to reduce the article’s length to a mere stub of an entry. The argument went that, because the arguably unfair subject obviously met Wikipedia’s previously-determined standards for inclusion, a possible solution was to reduce the article to the shortest possible version. This proposal quickly failed, with Cirt himself citing an earlier comment by veteran Wikipedian (and current Wikimedia Foundation fellow) Steven Walling:

The BLP policy is not a blank check for deleting anything negative related to a living individual. Criticism, commentary, and even base mockery of a public figure like a Senator is protected free speech in the United States. While it would be ridiculous for anyone to try and make Wikipedia a platform for creating the kind of meme Savage did, it is perfectly prudent for Wikipedia to neutrally report on the overwhelming amount of coverage given to the topic.

Remember that part about using Wikipedia as a platform—it will come up later. Meanwhile, Cirt continued to add significant information about media usage and analysis of the term and events surrounding Savage’s campaign, all backed up with acceptable references. In particular, he focused on adding uses of “santorum”, in slang dictionaries and even erotica, to support the article’s focus as legitimately about the neologism, and not Savage’s campaign per se.

For those who did not wish for Wikipedia to contribute to the so-called problem of making Savage’s campaign seem more important than it arguably was, it must have been more frustrating still to observe that the article was quite well-written and scrupulously followed Wikipedia’s style and sourcing guidelines. Cirt was nothing if not sophisticated. Many had the impression that the article itself was now an attack on Santorum, although that conclusion was only in the eye of the beholder. Cirt knew what he was doing and, for lack of a better phrase, Cirt knew exactly what he was doing. One editor objected:

I realize you will defend this bloated attack piece with all your skills (that is actually what I find most disturbing) but you have to realize or at least have noticed that many experienced editors disagree with your massive expansion of it and at some point it will require wider input and a community RFC.

By the end of May, the article had grown to more than five times the length of the article Santorum controversy regarding homosexuality and more than two-thirds the length of the primary Rick Santorum biographical article. Discrepancies of this sort have been well observed, most significantly on the Internet forum Something Awful, but no Wikipedia policy exists to require proportionality among articles.

At its greatest length, on May 31, the article surpassed 5,500 words, including headers but excluding photo captions, links and references—a total of over 77,000 bytes of data.

·     ·     ·

June 2011-Present—Were I to recount in full the debates and discussions that began in late May and have continued since (most of them in June), this blog post would be three times its already considerable length. Instead I will attempt to summarize, although “considerable length” is unavoidable still.

From early June, Cirt pretty much stopped editing the article. To a significant extent, he’d become part of the issue, not just regarding this article but others as well, as can be seen on the discussion page for Cirt’s user account.

Among the many solutions offered around this time, one focused not on the article content itself, but rather its visibility on search engine results pages (SERPs). The editor offered, even if just for the sake of argument:

While I don’t really like the precedent, there’s nothing to say that every article needs to be indexed by search engines. … The majority of the concerns here seem to be focused on how people are coming across this article (via Google bombing, etc.), not necessarily that the article exists. … Both sides have legitimate points in their favor, so a compromise might be best here.

Other editors agreed it would set a bad precedent, and the suggestion did not go any further.

By now the topic had come to involve some of Wikipedia’s most influential editors, and a lengthy debate opened on Jimmy Wales’ discussion page. Wales’ take was as follows:

My only thought about the whole thing is that WP:COATRACK applies in spades. There is zero reason for this page to exist. It is arguable whether this nonsense even belongs in his biography at all, but at a bare minimum, a merger to his main article seems appropriate.

The “Coatrack” argument—one of many analogies Wikipedians have created over the years to illustrate key concepts—is not a policy or a guideline, but an informal essay, yet one with much currency. It states:

A coatrack article is a Wikipedia article that ostensibly discusses the nominal subject, but in reality is a cover for a tangentially related biased subject. The nominal subject is used as an empty coat-rack, which ends up being mostly obscured by the “coats”. The existence of a “hook” in a given article is not a good reason to “hang” irrelevant and biased material there.

In retrospect, it’s a little surprising that the “Coatrack” issue hadn’t been raised in any significant way before—and Wales is not considered infallible, nor is he always that involved in day-to-day Wikipedia issues—but this may yet have been a turning point. The next day, the highly respected User:SlimVirgin opened an RfC (Request for Comment) called “Proposal to rename, redirect, and merge content”. This led to the article being renamed, for a time, Santorum Google problem. Later, it was pointed out that “Google is not the only search engine in the world”, and so the search (as it were) continued.

The argument that the “neologism” had not evolved organically, but was the result of an organized campaign by Savage and his allies, had begun to exert some influence. For one thing, it was now quite clear that the majority of sources focused on the political campaign to bring relevance to the term, as opposed to the term’s relevance itself. In this way, one might say that Savage’s campaign had become a little too successful. Yes, the term was notable, but the controversy itself had become even more so.

Prior to the renaming mentioned above, editors in an adjacent thread had discussed several alternative names for the article. These included:

  • Santorum neologism controversy
  • Dan Savage santorum neologism controversy
  • Dan Savage santorum neologism campaign
  • Santorum neologism campaign
  • Spreading santorum (the name of Savage’s website)

Here one can start to see where the article’s current title would eventually emerge. Meanwhile, the article faced two more AfD (Articles for deletion) nominations, the first under its old name and the second under its current one. These were the fourth and fifth nominations overall, and surely the most futile.

As part of the ongoing RfC discussion in June, it had been strongly suggested that the article needed to be condensed, especially as Cirt’s expansion had contributed so significantly to the controversy. Besides the article expansion, in mid-May Cirt had created a new “footer” template, Template:Sexual slang, which further linked Rick Santorum’s name to dozens of NSFW topics. That template still exists, but on June 11 the link to Santorum (neologism) was removed. Again, it’s hard to say if this was another turning point, but a discussion about this template on Wales’ discussion page supports the notion that a consensus was coming into view: the article in its present form had itself become part of the campaign—that Wikipedia was being used as a platform for the campaign in just the manner Walling had warned against.

A day later, a request for arbitration (RfAr)—a petition to the Arbitration Committee, Wikipedia’s equivalent of the Supreme Court—was opened against Cirt on the basis that his concerted efforts on the subject constituted “political activism”. On June 18 the request was rejected, but not before several dozen editors had contributed more than 28,000 words of opinion. One committee member wrote:

Decline for now, I’m inclined to think that this is more of a content dispute, and the community is able to cope with it.

On June 17, the community finally hit on a name that stuck: Campaign for “santorum” neologism. Initially, this was only intended as an interim move while further discussion took place. Among the names considered at this time, not all were serious, but most were:

  • Dan Savage santorum campaign
  • Dan Savage campaign
  • Dan Savage’s verbal attack on Rick Santorum
  • Santorum (sexual slang)
  • Santorum neologism campaign
  • Santorum neologism controversy
  • Rick Santorum and homosexuality
  • Rick Santorum homosexuality controversy
  • Savage Santorum campaign
  • Dan Savage santorum neologism controversy
  • Dan Savage santorum neologism campaign
  • Spreading Santorum
  • Rick Santorum’s Google problem
  • Rick Santorum’s “Google problem”
  • Santorum Google problem
  • Rick Santorum Google problem
  • ‘Spreading santorum’ campaign
  • Campaign for “santorum” neologism
  • Dan Savage campaign for “santorum” neologism
  • Savage–Santorum affair (a reply: “Oh Please God No.”)
  • Savage–Santorum controversy
  • santorum (neologism)
  • The problem Rick Santorum is facing because every search engine in the world’s top search results says santorum is an anal sex by-product
  • Santorum (googlebomb)
  • SEO Campaign for “santorum” neologism
  • Santorum (cyberattack)
  • Santorum (cyberbullying)
  • Santorum (SEO attack)
  • Dan Savage’s “spreading santorum” campaign against Rick Santorum’s anti-gay stance
  • Santorum Google ranking problem
  • Dan Savage Google-bomb Attack on Rick Santorum
  • Campaign to attack Santorum’s name
  • Campaign to create ‘santorum’ neologism
  • Campaign to associate Santorum to neologism

In the end, inertia and the current title’s inherent virtues won out. Of the eventual “winner”—Campaign for “santorum” neologism—a veteran Wikipedian commented:

This one is growing on me – neutral, correct, to-the-point, and succinctly informative to readers both familiar and unfamiliar with the subject as to what the article will be about.

All that was left was to whittle the article down from its extreme length to a shape that covered the topic adequately, balancing relevance with discretion. While many edits were to follow, the key edit was made on June 21, when SlimVirgin replaced a 4,800-word version of the article (minus links and references) with a 1,400-word version. This is substantially the version of the article that remains in place today.

·     ·     ·

Comparing the late May version of the article, at its longest point, to the trimmed-down and refocused current version, here’s what we find:

  • The earlier version focused on the term in and of itself, with the opening sentence including a definition and describing its use. The current version focuses on the events, explaining the aim of Savage’s campaign—though the definition remains.
  • Excluding the lead section, references and external links, there are only three sections in the current version, compared with seven in the earlier (not including “See also” and “Further reading”, which were also removed).
  • The content of the “Background” section was almost entirely removed, leaving just the key facts about Rick Santorum’s statements in the 2003 Associated Press interview.
  • The section about the website “Spreading Santorum” was removed, with its details folded into the “Campaign by Dan Savage” section.
  • Almost all of the “Recognition and usage” section was removed.
  • “Media analysis” and “Political impact” were combined into one, shorter, summarized section, focusing on the reception of the campaign in the media and its political impact.
  • Santorum’s response to the controversy was kept in the current article, albeit in condensed form.

Up to the present day, in the Talk page discussions alone (including the RfC discussion), more than 200,000 words have been written about the article. That is probably well short of the true number.

Perhaps surprisingly, the impact on Rick Santorum’s Wikipedia article was not that great—the article had long summarized the events in a short final paragraph concluding a heading relating to his statements about homosexuality—83 words at this count.

Meanwhile, Santorum’s “Google” problem continues. Conduct a logged-out search today, and here are the top three results:

And let’s not imagine the argument is completely over on Campaign for “santorum” neologism. Visit today, and one will find at the very top:

Images courtesy Wikipedia and Wikimedia Commons, licensed under Creative Commons. Additional research and analysis provided by Rhiannon Ruff.

Is Wikipedia “Slowly Dying”?

on August 5, 2011 at 11:27 am

Here’s a provocative blog post from Gawker’s Adrian Chen yesterday: “Is Wikipedia Slowly Dying?”. It’s based on a provocative comment by none other than Wikipedia’s Jimmy Wales at Wikimania, the annual conference for Wikipedia and its sister wiki sites. Of course, that’s not quite what Wales said, but the Associated Press story Chen’s post is based on is not so far off:

“We are not replenishing our ranks,” said Wales. “It is not a crisis, but I consider it to be important.”

Administrators of the Internet’s fifth most visited website are working to simplify the way users can contribute and edit material. “A lot of it is convoluted,” Wales said. “A lot of editorial guidelines … are impenetrable to new users.”

It’s also not a new concern. In March the Wikimedia Foundation published its latest study of editor participation, showing a decline compared with a couple of years ago, although the project certainly still has more contributors than it did a couple of years before that. In my post on the subject, “Trendy Thinking: Contemplating Wikipedia Contributorship”, I included a Wikimedia-generated chart that shows what Wales is talking about:

From 2001 through 2006, participation grew exponentially, slowed at its peak in 2007, and has decreased at a steady rate in the years since. A number of theories have been floated to explain the decline. Via the AP, Wales offers a very common one: with almost 3.7 million articles in the English-language edition, the project of building Wikipedia has mostly already been done. But he also offers one that I hadn’t really considered before:

Wales said the typical profile of a contributor is “a 26-year-old geeky male” who moves on to other ventures, gets married and leaves the website.

There is some evidence for this in the survey results. Turn to page five of an earlier survey report (PDF) and you’ll see that more than 75% of editors (technically, survey respondents who called themselves editors) are younger than 30, and of the remaining quarter, half again are in their thirties. It may be that only 12.5% of Wikipedia editors are older than 40.
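For anyone who wants to check the arithmetic behind that 12.5% figure, here is a rough sketch. It treats the rounded shares quoted above as exact inputs, which the survey report does not promise, and the variable names are purely illustrative.

```python
# Rough shares taken from the paragraph above; the survey's exact figures may differ.
under_30 = 0.75                       # "more than 75%" treated here as exactly 75%
remaining = 1.0 - under_30            # everyone aged 30 or older
in_thirties = remaining / 2           # "half again are in their thirties"
forty_plus = remaining - in_thirties  # whoever is left is 40 or older

print(f"30 or older: {remaining:.1%}")
print(f"In their thirties: {in_thirties:.1%}")
print(f"40 or older: {forty_plus:.1%}")
```

Because the under-30 share is "more than" 75%, the 12.5% result is best read as an upper bound rather than a precise estimate.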

This situation points toward a perhaps unlikely but perhaps untapped editor group: retired persons. In fact, it was my expectation to find a higher percentage of older editors—something like a reverse bell curve—showing greater participation by the young and old, with those in the middle with careers and young children contributing less frequently. In my personal experience on the site, some dedicated editors—some of the best, in my estimation—are middle-aged or older. Yet the survey plausibly explains why they are statistically less common:

The last group is characterised by the fact that its members started to use / contribute to Wikipedia at a comparably old age. However, since the age range of this group is very broad, it covers persons that grew up with the Internet as well as persons that had to learn to use new media past their school and university time.

Someone who was 39 when Wikipedia was created is now 49 or 50, and actuarial realities will continue to produce a general population that is ever-more Internet-savvy, and therefore ever-more inclined to edit Wikipedia. That is to say, those who were once young editors may return as old editors.

Back at Gawker, the comment section offers another complaint to which Wales only alludes. The pseudonymous SoCalMalaise writes:

I used to write and edit Wikipedia a lot. Some long articles are almost entirely written by me. It was a way to fine tune both my research and writing skills and enjoy the novelty of writing something that thousands (millions?) of people read. But soon I found that your work is frequently stifled by so-called “administrators” who are usually high school or college students with sub-par research and writing skills. These trolls have created a Kafka-esque labyrinth of self-contradictory “policies” and “guidelines” that they used to remove sentences, paragraphs, sections or even entire articles that skilled writers have volunteered to put down. They cherry-pick various parts of their rules as an excuse to act out their God complexes and strike out content. … And I’m not talking about a few bad apples. These people are everywhere! The whole writing-for-Wikipedia thing became very frustrating and just not worth my time.

It’s difficult to generalize from any one person’s experience, and who knows what common-but-non-obvious mistakes SoCalMalaise might have made, but the sentiment is certainly not unheard-of.

Thing is, for every complaint about overzealous editors and sticklers for arcane rules, there’s a complaint about uninformed editors who show little respect for common-sense rules. I have to admit, I’m more sympathetic to the latter complaint—it is sticklers for policies and guidelines who enforce a minimum level of quality required for new additions, and therefore maintain a semblance of article quality. Myself, I spent a lot of time learning how Wikipedia works. It took several years before I was able to contribute at a high level, creating new entries or significantly improving existing ones. I am polite when I find someone is doing it wrong, although I know also that some are not.

Meanwhile, the organized core of the community has spent a lot of time, especially recently, trying to figure out how to retain those who give Wikipedia a try. There is the WikiLove campaign, which has received some media attention, but I’ll have to explain my skepticism another time. I’ve also heard that new account registrants are sometimes asked to identify areas of interest, which sounds like an interesting idea, but as far as I can tell it hasn’t been widely deployed.

Ultimately, whether Wikipedia’s declining user base represents a problem is not a question that exists in a vacuum. The question is really whether Wikipedia has enough editors to keep getting better or, at the very least, maintain its current level of quality. There are multiple answers here. As I’ve pointed out before, the Wikipedia community’s rapid response to breaking news is impressive: if you want a good primer on the United States debt ceiling crisis, Wikipedia has a very strong and evolving summary. But Wikipedia sometimes fares poorly with articles on many pre-Internet topics, especially in the social sciences: if you want to know about Money market funds, I’m not sure I can recommend Wikipedia.

It’s worth taking stock of the fact that Wikipedia’s decline among editors is a bit more than gradual, but does not now appear to be accelerating. The next two years will be telling, but I suspect that Wikipedia’s contributor base will find its floor, and my guess—though it is only that—is that we’re probably somewhere near it. Wikipedia is no longer the new hotness, and let’s face it, it’s an encyclopedia. To most it is far less thrilling and far more challenging than YouTube or Facebook, and we shouldn’t expect that Wikipedia’s participation will look anything like theirs. It’s no less popular as a destination for readers, and it would take a very significant drop in article quality for that to change. (Like, say, if Wikipedia’s vandal patrol disappeared tomorrow… if anyone, send your WikiLove to them.)

I think the current situation also raises a question that many Wikipedians are loath to consider: the professionalization of some aspects of Wikipedia. This doesn’t necessarily mean hiring editors, but it could mean working out partnerships to share in the responsibility of maintenance and development of software and perhaps even some content. It’s an article of faith that much of Wikipedia’s early growth and unique characteristics derive from its volunteer force, but as any business professor can tell you, the skill set that launches a viable company is not the same skill set that brings that company to maturity. There is precedent for this; Wikipedia needs the Wikimedia Foundation, which does have a paid staff, although they avoid organized involvement in matters of content, except as individuals. Ultimately, Wikipedia must remain in the hands of its volunteer editors—to change that would be too fundamental a shift. But as Wikipedia grows more complex, it’s not hard to think they could use greater support.

Michele Bachmann, Sarah Palin and the Boring Truth About Wikipedia Vandalism

on June 29, 2011 at 10:02 am

The Wikipedian was traveling for most of this past month, and so I’ve missed out on a few interesting Wikipedia-related stories of late. None was more frustrating (and entertaining) than the case of Sarah Palin’s supporters’ edits to and arguments about Paul Revere’s famous ride. In case you missed it (or, as it is so often abbreviated in campaign e-mail blasts, “ICYMI”) Palin stated in early June that Revere had warned the British—not the American revolutionaries—and a few of her supporters attempted to change the Paul Revere article to more closely reflect her version of events.

Yes, I missed that one, but maybe I’m not too late: according to nearly back-to-back posts by left-wing bloggers at ThinkProgress and Raw Story, the same thing is happening to various Wikipedia articles following erroneous statements by newly-declared Republican presidential candidate Michele Bachmann. At issue:

  • During her campaign announcement speech, Bachmann referred to the late film actor John Wayne’s hometown as Waterloo, Iowa, when in fact it was Winterset, Iowa. As an aside, I seriously doubt, as widely asserted, that she was thinking of John Wayne Gacy (who is most closely associated with Chicago) and, for what it’s worth, Bachmann later pointed out that Wayne’s parents met in Waterloo.
  • Later, interviewed by ABC News, Bachmann referred to John Quincy Adams as a “founding father” although the U.S. president was only a child during the American Revolution (he was, of course, the son of founding father John Adams). The last I heard, she was sticking to her guns on this one, as little sense as that makes.

As reported by ThinkProgress and Raw Story, the Wikipedia articles about John Wayne and John Quincy Adams were undoubtedly changed, more than once, to reflect Bachmann’s erroneous statements. I’ll tell you what, though: upon closer inspection, I think this hardly rises to the same level as the Palin-Revere controversy, and really says more about the partisan / ideological online media than it does about Michele Bachmann or her political supporters—let alone Wikipedia.

To wit: On Monday, an IP editor (meaning one who has not registered for an account and so is represented by their IP address) from Pennsylvania changed John Wayne’s birthplace to “Waterloo” from “Winterset”. It was changed back pretty quickly. On the discussion page, there was little actual debate of the issue—and it started anyway with a sarcastic post by someone clearly not a Bachmann fan.

The next day, on the John Quincy Adams page, an IP editor (using the IP address 128.200.11.106, associated with UC-Irvine) added “a founding father” as a subordinate clause in the very first sentence. This too was removed, and a brief, detached conversation occurred on that discussion page as well.

I decided to look at the edit history of the IP editors responsible for the above edits. It turned out the editor responsible for the Wayne edit had made no prior edits and has made none since. The editor responsible for the JQA edit has possibly edited a few times before (IP addresses can be shared, so identity is difficult to establish). On the discussion page associated with the IP address, an established editor politely suggested that the individual create an account, whereupon the IP editor replied:

Are you joking? It was obviously vandalism, so why try to act like I was acting in good faith?

Yeah, that’s about right. You won’t hear it from ThinkProgress or Raw Story, but the Palin-Revere controversy was a much bigger deal, kicking up a much more heated debate, lasting more than a week and encompassing several related discussion threads. And whereas actual Sarah Palin fans seem to have become involved there, there is no reason to think that actual Bachmann supporters are involved here. The best take on it comes from an editor, BusterD, who wrote on the JQA discussion page:

Up to this point, what is reported is not actually happening. A few ip editors have been injecting the phrase “founding father”, sometimes as a clear jest and sometimes modifying the father who is considered one of the founders, but most of what’s going on is normal ip vandalism which occurs when an historical figure gets mentioned in the media. Semi-protection is now in force; nobody has been editing the page in any but the most minor ways. Sure would be a good time to get cites on everything and tighten the page up some.

That’s exactly right. Activity on Wikipedia articles, whether helpful or unhelpful, is often driven by what’s in the news, and this case seems to be no different. General mischief on Wikipedia is an everyday fact of life, and the idle hands motivated to cause such trouble frequently draw inspiration from the headlines. Wikipedia’s Recent changes patrol (and a few automated scripts) keep the most obvious at bay; most of it is caught within minutes. Politically motivated edits are usually much more subtle and focused on specific politicians rather than general topics momentarily associated with them. It seems clear that the Bachmann-related edits were not done to make a point but simply for the lulz.

Whether these incidents say anything about the respective supporters of Michele Bachmann vs. those of Sarah Palin, I pass no judgment. As to the blog-first-ask-questions-later nature of the political mediasphere, well, I think this post speaks for itself.

Audrey Tomason: Newly Minted Star of Washington, and Wikipedia?

on May 10, 2011 at 11:01 am

Washington, DC (and those outside the Beltway who share its mindset) can’t get enough of celebrity and celebrities. This is why it imports them each April for the White House Correspondents’ Dinner. This is why phrases such as “famous for DC” and the blog Famous DC and the saying “Washington is Hollywood for ugly people” exist. And it explains, at least in part, the sudden prominence of one Audrey Tomason, the subject of several recent “who is she?” news treatments from the Washington Post, Daily Beast, Daily Mail and elsewhere. She is also now the subject of a one week-old Wikipedia article that has been viewed more than 42,000 times:

Audrey Tomason Wikipedia article

And yet it’s not even agreed that she warrants a standalone Wikipedia article: there is so little information available that one of the few facts currently included is that she “regularly donates to the ‘Tufts Fund for Arts, Sciences and Engineering.’” An outright majority of sources in the article are from Tufts University (three annual report links, one alumni magazine) and one is simply a link to a brief appearance on C-SPAN in which she introduces somebody else. That’s awfully thin.

Wikipedia often chooses to delete articles about people notable for only one event, and in this case one might argue she is only possibly notable for appearing in a famous photograph. On the other hand, the Daily Mail reports that she is Director of Counterterrorism for the National Security Council, which sounds pretty important, although Wikipedia editors have expressed skepticism about the report. As one has pointed out, at this point she is more Internet meme than public figure.

So, will the article survive? It’s too soon to say; for now editors are taking a wait-and-see approach. The answer ultimately may be up to the United States federal government, and whether they are willing to let her talk to the press. Chances are slim, and as the Washington Post points out, Wikipedia itself could even play a role:

If it’s true that Tomason’s job is of the clandestine nature, it’s reasonable to think that this photo will not be good for her career. Neither will her new Wikipedia page.

Osama bin Laden is No Longer a BLP

on May 2, 2011 at 7:35 am

That is to say, as the world knows by now, the Wikipedia article about Osama bin Laden no longer describes a living person, and he is no longer subject to Wikipedia’s policy for Biographies of living persons (BLP).*

Osama bin Laden, finally dead (on Wikipedia)

Quite something to see this template attached to this particular article. As I type this just before 9am Eastern Time, Wikipedia editors have been extremely active overnight; since early reports of President Obama’s announcement, there have been more than 430 edits (by my count) to the main bin Laden page and 999 edits to an all-new article: Death of Osama bin Laden. And, of course, there was the obligatory circumstance wherein someone accurately updated the article to reflect his death without providing a citation, leading another editor to revert the change pending verification. And within a few minutes, it was.

*Of course it’s still covered by BLP insofar as other individuals mentioned on the page are concerned, but can we set that aside and take some satisfaction in this moment already?

USA Congressional Staff Edits to Wikipedia: The Saga Continues

on April 12, 2011 at 12:03 pm

Last week I was asked by Politico’s Marin Cogan to provide some commentary about a situation on Wikipedia whereby a congressional staffer had tampered with her boss’ entry. This became “Rep. David Rivera’s war with Wikipedia” in last Thursday’s paper.

As the article explained, David Rivera’s press secretary, Leslie Veiga, had created an account using her real initials and last name (otherwise, she would’ve gotten away with it) in order to delete a number of negative subjects from the entry and replace them with conspicuously favorable language. Both actions are officially discouraged by site policies, but no official action was needed: the changes were rolled back, the offending account was issued a warning, and the unhelpful editing activity ceased.

Now a new section about the incident has been added to Rivera’s article, although its inclusion has been disputed (Wikipedia dislikes self-referentiality unless unavoidable, and its relevance to Rivera’s overall career is unclear) so it’s not necessarily there “forever,” as Gawker suggested. Then again, as I told Cogan: “All Wikipedia aims to do is reflect what is public knowledge and has been widely reported.” And it seems to have been covered widely enough.

As hinted above, the cynical view is that Veiga’s biggest mistake was the one thing that was laudable about her actions: her transparency. The truth is that she could have been transparent and made helpful suggestions in accordance with Wikipedia’s conflict of interest guideline… but this requires much more knowledge about Wikipedia than most staffers have. (As Politico mentions, I deal with this subject professionally and have written about how it can be done properly.) And none of this is new: the fact of congressional staff editing Wikipedia was first widely reported in early 2006 and is now memorialized in the Wikipedia article “USA Congressional staff edits to Wikipedia”.

What most staffers seem to do instead is what most uninitiated contributors do, and that is edit without creating an account, thereby displaying their IP address. The U.S. House and U.S. Senate have dedicated IP addresses serving members’ offices on Capitol Hill (I used to think there was a single IP address for each, but now I’m not so sure; if anyone knows for sure, please speak up in the comments). As Cogan writes:

The House IP address … frequently shows up in the edit histories of members, committees and constitutional amendments. Wiki editors repeatedly blocked the House IP for limited periods of time until 2009, when they apparently gave up the effort.

By following these edit histories, you can make some guesses about which offices might be doing the same as Rivera’s staffer. To be clear: most of these edits are not so blatantly self-serving as were Veiga’s; most are only mildly self-serving, such as the staffer from Rep. Jimmy Duncan’s office, who apparently tried to add his Facebook page and YouTube channel (for which one could actually make a decent case, but few know to do) only to be reverted and warned.

The Talk page associated with the IP address is also enlightening (that’s how I found the Duncan edits) and sometimes amusing; this comment (under the header “Wow”) is my favorite:

Look at all those edits of mudslinging your opponents and painting yourselves in some golden light. I expected better from our government.

Uh huh… right. And of course there is the page listing all contributions made from the House IP address, where one can find all manner of subjects that Hill staffers are interested in, besides just their bosses. Among non-political recent edits:

As you can see by the repetition of collegiate topics, one may surmise that more than a few are largely concerned with themselves. One edit from late March was undoubtedly self-centered: Congressional staffer. But their bosses do seem to be among the greatest focus. And about the fact that, in late March, edits were made to the article titled Liar, perhaps the less said the better.

P.S. Just over two years ago, I covered this topic in a post titled “Did Rep. Hinojosa Get a Free Pass on Biased Wikipedia Edits?” (Yes, for a while.)

P.P.S. Just over one year ago, I had an article published in Campaigns & Elections’ Politics Magazine about very nearly the same topic: edits made by political campaigns, how they are most often bad and some pointers about how to make them good.

The Wikipedian Mystique: Do Women Participate Enough in Wikipedia?

on February 7, 2011 at 4:57 pm

Could it really be that just 13% of Wikipedia editors are women? That statistic comes from a survey of Wikipedia users (whether contributing or just reading) sponsored by the Wikimedia Foundation, first previewed in fall 2009 and eventually published in full in March 2010. Last week, Wikimedia executive director Sue Gardner announced plans to try raising this number to 25% by 2015. Thanks to coverage by Noam Cohen in The New York Times, the topic has dominated Interweb discussion of Wikipedia since then.

This participatory imbalance is not a new phenomenon, and hardly unique to Wikipedia. Cohen points to op-ed pages, and the same is considered to be true in their virtual equivalent, the political blogosphere. While there are some very prominent female contributors to all of the above, most surveys tend to show that men nevertheless lead these sectors.

On the other hand, as a female colleague pointed out to me, if you were to look at online forums about health care, animals, or the environment, the gender balance is likely to flip. The same is true with regard to professions; some are predominantly male or female, and many fall somewhere in between. Some combination of biological programming and social reinforcement produces a society with masculine and feminine traits. However, just because many stereotypes have a basis in reality does not mean they should be taken for granted or used as an excuse. Just because something is natural doesn’t make it right.

Among the many words expended on the topic, probably the best is by veteran Wikipedia contributor Kat Walsh; the entirety of it is worth reading, but here is the conclusion:

The big problem is that the current Wikipedia community is what came about by letting things develop naturally–trying to influence it in another direction is no longer the easiest path, and requires conscious effort to change. How do you become more inclusive without breaking the qualities that make the project happen to begin with? (Any easy, obvious answer to this question is probably wrong.) That Wikipedia works at all is an improbable thing; that it works, for the most part, well, nearly miraculous. Wikipedia’s culture doesn’t have to be hostile or unfriendly to a group for it to be underrepresented–it merely has to be not one of the most attractive options.

It so happens that “unfriendliness” has been identified as one possible reason. And it’s not that Wikipedia doesn’t have policies designed to address this issue: Wikipedia:Civility and Wikipedia:No personal attacks are core, non-negotiable site policies, augmented by further guidelines such as Wikipedia:Please do not bite the newcomers. The message is simple: Be polite to other editors, or you can be blocked. However, any experienced editor also knows that enforcement is uneven. Wikipedia is a very big place, where many editors are used to working in isolation. If someone comes along and starts behaving abusively, it can often feel like there is nowhere to turn. Even if you do know where to go for help, one actually must petition for a resolution, and this can be an unpleasant process. It’s also probably worth pointing out that this is already an issue on the presumably male-dominated website, so it is far from just women who feel this way.

Another issue worth considering is that no one actually knows for sure how many women are on the site. Anonymity on Wikipedia is guaranteed; hence the survey. But it’s trickier than that still, as I found out personally.

An early draft of the script for The State of Wikipedia video included the same detail from the survey Cohen cites. To make sure I had the details right, I sought the input of Erik Zachte, a data analyst for the Wikimedia Foundation and curator of information at the great Infodisiac website.

What he pointed out is that the survey had a significant problem with self-selection bias; more than a quarter of survey respondents came from Russia, for example. Among survey respondents, it is true that somewhat less than 13% were female contributors. Slice it another way, and among contributors to the website, slightly more than 16% were female. Meanwhile, just 25% of survey-takers identified themselves as female. Therefore, the information concerning women on Wikipedia is considerably less likely to be accurate compared with men, but it still seems probable the percentage of female contributors is somewhere south of the 25% Gardner would like it to be.

The question then is what exactly she plans to do about it, and that discussion is underway now. If you want to be part of it, the Wikimedia Foundation has set up a mailing list to address the topic that is open to the public, and the Wikipedians you will find there are likely to be among the most thoughtful and welcoming. I certainly have my doubts that much will come of it, or that we’ll be able to reliably measure it. Wikipedia is a challenge to most people, from all walks of life, and any effort to artificially boost participation from any one group over the other is likely to meet with failure. If any solutions do arise, my guess is that they will not necessarily be gender-specific.

As a final note, I find some irony in the fact that one reason put forth to explain why women don’t participate in Wikipedia is that they may not feel confident in their contributions, because on this particular topic, I don’t feel confident in my observations. Just for the record, on one hand I find that I am writing something because it’s a big topic and I don’t want to let it pass me by entirely; on the other hand, I think there is far more to be said about the subject than even a lengthy blog post can address. So I publish this now, unsure whether I’ve actually said anything worthwhile. Or maybe I’m overthinking it.

How I Spent Wikipedia’s Tenth Anniversary

on January 16, 2011 at 11:10 am

Alas, I did not make it to the local meetup in Washington, DC, where I live, but I did something else, something as fun as it was unexpected—I was on Ukrainian television.

Friday afternoon, a small TV crew led by reporter Maksym Drabok visited my apartment in Lanier Heights to record me talking about Wikipedia and even editing Wikipedia. Fortunately, I had some material about University of Oregon head football coach Chip Kelly waiting to be added, so I used the occasion to add a few more citations to his biographical article (it still needs more). Also featured in the segment was the article Thomas Boylston Adams, about the ne’er-do-well son of second U.S. president John Adams, which I created in April 2008 (while watching the HBO miniseries John Adams).

Also also featured: my home office, me in a wiki-related T-shirt, and—you guessed it—this very blog. Here’s the segment in full:

Last but not least, thanks very much to Maksym Drabok and INTER TV for the opportunity.

The Top 10 Wikipedia Stories of 2010

on December 30, 2010 at 6:50 pm

The year 2010 will be over and out in another day’s time, which means there is no time like the present to look back on the year that was at Wikipedia. Instead of some kind of highfalutin’ think piece on what the past year, like, meant, let’s make this an easy-to-write, easier-to-read listicle outlining the biggest stories of the year involving Wikipedia—at least from an English-speaking, North American perspective. (For it is this perspective from which I am most qualified to write.)

For better or worse, here are the stories that defined Wikipedia, on-site and off, in 2010:

10. Wikipedia backups discovered — This occurred just in the past few weeks, and has not received a great deal of attention outside of Wikipedia circles, but to Wikipedia enthusiasts, it’s a big one. In mid-December, Wikimedia Foundation developer Tim Starling found several files dating back to Wikipedia’s first three months of existence. These had long been presumed to be gone for good, but now Wikipedia’s earliest days are much easier to reconstruct. Joseph Reagle of Harvard’s Berkman Center extracted the first 10,000 edits and has placed them on his own website for viewing, and in the future a more accessible reconstruction may be created, similar to the one at nostalgia.wikipedia.org.

9. Cuba’s Wikipedia copycat — EcuRed is the Castro regime’s attempt to emulate Wikipedia. At least, in terms of look and feel: EcuRed may well be built using wiki software, but content updates are strictly reserved for unknown pre-approved editors. The entry for Estados Unidos is amusing. Surprisingly, there is no entry for Capitalismo, only Imperialismo, fase superior del capitalismo. Translated from Spanish, the website’s front page proclaims it was “born from the desire to create and disseminate knowledge with everyone and for everyone from Cuba and the world.” It would probably be more correct to say that it was born of a desire to create and disseminate propaganda for Fidel and Raúl Castro and their cronies.

8. Mike Godwin vs. the FBI — This was just weird. During the summer, the FBI sent a cease-and-desist letter to Wikipedia demanding that they remove occurrences of the FBI seal from Wikipedia articles about the agency. According to the FBI, use of the logo conflicted with the law. According to Wikimedia Foundation general counsel Mike Godwin, the law cited was about preventing people from impersonating FBI officials. Godwin’s sardonic reply—“While we appreciate your desire to revise the statute to reflect your expansive vision of it, the fact is that we must work with the actual language of the statute, not the aspirational version”—amused many. Two months later, Godwin resigned his position at Wikimedia. Were the two incidents connected? That was the whisper, but neither Mike nor the Foundation has clarified the reasons for his departure. It’s entirely possible that the two are not connected, but the whispering hasn’t been refuted. The FBI seal’s presence on Wikipedia, and Mike Godwin’s famed wit elsewhere, live on.

7. Wikimedia expansion to India — Wikipedians are all too aware of the fact that most of their contributions come from the rich, Western nations in the Anglosphere and Western Europe, but they yearn for participation to grow much beyond. As in the global economy, much growth may be found in the BRICs. Among industrializing countries, interest in Wikipedia has been especially strong in India, which is being rewarded with the first non-U.S. office of the Wikimedia Foundation. (For what it’s worth, I myself attended a Wikipedia-oriented conference in Bangalore this past January.)

6. Wikipedia gets a new look — Bet you didn’t notice this until months after it happened, but in the first half of 2010, Wikipedia received its first major redesign in several years. Gone was the “Monobook” skin and in was the “Vector” look. Why change? Wikipedia is always looking for ways to make the site easier to read—and easier to edit—and there had been concern for some time that the site design was becoming outdated, even in some ways confusing. Perhaps the biggest change involved moving the search field from the lefthand sidebar to the top right corner, a placement more common among popular websites. And the result? The number of individuals contributing during the second half of 2010 has been mostly flat, and even down slightly. Whatever drives people to contribute to Wikipedia, or stay away, is a force more powerful than web design.

5. Flagged revisions, er, pending changes — For years, the German-language Wikipedia has maintained a unique system for improving the reliability of its pages: contributions by new and infrequent users are held for review by more trusted editors. The result has been an encyclopedia taken far more seriously by academics in that country, so Wikipedians on the larger (and looser) English Wikipedia decided to give it a try. First called “flagged revisions” and later changed to the arguably more intuitive “pending changes” (yes, there was a debate about this), a number of articles were protected in this manner. The result was inconclusive: while a clear majority of participants voted to continue employing some form of pending changes, there was no consensus on just how to do it. For now, the project lies dormant.

4. Wikipedia in education — This is not one story, and it’s not unique to the past calendar year: encyclopedias have been staples of term paper bibliographies for decades (at least) but the rise of Wikipedia has turned this on its head. Where teachers were once content to let students cite Britannica on any number of subjects, many (if not most) now ban students from using Wikipedia in assignments. But 2010 may be the year in which educators learned to stop worrying and accommodate (if not love) Wikipedia. Time and debate have allowed more professional educators to see that Wikipedia is a legitimate starting point for research, and Wikipedia’s own imperfections provide numerous teachable moments. ZDNet education writer Christopher Dawson’s well-argued “Teachers: Please stop prohibiting the use of Wikipedia” is a good example of the former, while classroom projects at UC Berkeley and the University of Rhode Island show there is great promise for the latter.

3. Larry Sanger reports Wikimedia to the FBI — The Federal Bureau of Investigation and Wikimedia Foundation sure got to know each other this year. In April, estranged Wikipedia co-founder Larry Sanger sent a missive to the FBI reporting the Wikimedia Foundation for hosting “child pornography” and other obscene images on Wikipedia sister site Wikimedia Commons. Among the contested images were nude artistic works depicting the underaged and sexually explicit images featuring adults. Wikipedia’s commitment to the free availability of information can be controversial; name a body part or disease and you are going to see a picture of it on that Wikipedia page. There is even a specific policy related to this question, called “Wikipedia is not censored”. But does this mean that anything goes? Even after Sanger clarified that he understood no actual photographs of child sexual molestation* were in the site’s collection, some images were deleted, and the FBI pursued no action in any case. Although resolved for now, you can bet the controversy over the line between “censorship” and “editorial policy” will come up again.

2. Wikileaks and Wikipedia confusion — You may protest that Wikileaks has nothing to do with Wikipedia. In fact, I wrote “Wikileaks: No Wiki, Just Leaks” over the summer, when the mysterious online outfit published its Afghan War Diary. But the mere presence of the word “wiki” in the not-a-wiki site’s name has become a potential PR problem for Wikipedia. When Wikileaks re-entered the news with the publication of leaked U.S. diplomatic cables in the fall, Jimmy Wales openly criticized Wikileaks, telling Charlie Rose: “If I had some information, the last thing I would ever do with it is send it to Wikileaks.” Even Larry Sanger published a critical commentary about Wikileaks on his own site; although Sanger only tangentially referenced Wikipedia in his comment, the press took up that angle regardless. As long as Wikileaks remains a well-known and much-criticized public entity, Wikipedia will have to keep repeating the message that the two organizations have nothing to do with one another. Which leads us to #1…

1. The face of Wikipedia fundraising — It was perhaps fortuitous that the latest round of Wikileaks debate occurred at the same time the Wikimedia Foundation was undertaking the most sustained and visible PR push in its history. Since late November, Wikimedia sites have featured large banners across the top, asking readers to donate money toward its goal of raising $16 million—the largest amount yet requested, though still not quite enough to cover 2011’s expected operating budget. Most banners featured Wales’ face prominently, asking readers to consider his “personal appeal” to contribute. While effective, they’ve also been a source of annoyance and subject of derision. The New York Observer headline, “Staring Contest with Jimmy Wales To Go On Indefinitely”, was among the politer expressions of this viewpoint. On the other hand, they are working: at the campaign’s outset, Wikimedia collected in one week what they took in over a month last year. As of this writing, the organization had just about a million dollars left to go. Not too shabby. And Henry Blodget will get a chance to recycle his call for Wikipedia to deploy advertising next year.

That was the year that was, at Wikipedia and the Wikimedia Foundation. Next year will be another. If you think I’ve missed or messed up anything important, please share in the comments. See you in 2011!

All images via Wikimedia Commons.

*Updated, per comments.

The Earliest Known Record of Wikipedia Journalism

on October 12, 2010 at 6:56 am

I’d gotten to wondering, recently, what was the first time Wikipedia was mentioned by a media source? The project began in January 2001, but I’m sure I wasn’t aware of it until sometime in 2003 at the earliest. I have no memory of first learning about it — only a recollection that sometime in the middle of the last decade, I was spending hours and hours, and entire days on some weekends, reading Wikipedia. I wasn’t too curious about where it came from then, but over the last few years, I clearly have been.

So I did what anyone with access to an online news database would do: I looked it up. And the winner appears to be a July 1, 2001 article in the Australian edition of PC World, by one Aldis Ozols. Here it is, in its entirety:

Roll-your-own fount of knowledge: www.wikipedia.com.; editor’s choice.

“A wiki is a collection of interlinked Web pages which can be visited and edited by anyone” goes the definition by Wikipedia. Rising to the challenge, I edited the page on which this statement was made, and behold, my contribution (all two words of it) became part of the Wikipedia.

This is a collaborative project intended to produce a usable encyclopedia through the efforts of many volunteers who surf in from the Net. While this makes it superficially similar to Everything2 (see June issue), there are differences. For instance, Everything2 seeks to be a live, interactive community as well as a reference, whereas Wildpedia [sic] has a more modest goal: to create a freely distributable 100,000 page encyclopedia online. In addition, where Everything2 has a complex system of user ranking and moderation which attempts to grade contributions and their authors, Wikipedia is wide open. Anyone can rock up and modify existing entries, or create new ones as I did.

Astonishingly, the result is not a pile of chaotic nonsense, as one might expect. Perhaps that’s because the project is still small, with only 6000 pages of text and a few dozen contributors, but something more seems to be at work here. Evidently, articles that start off with a one-sided viewpoint are edited and re-edited until they settle into a kind of consensus with which most people are satisfied. In any case, this is an interesting experiment containing some surprisingly accurate articles.

Surprisingly prescient, if you ask me. Or perhaps just lucky — many a website that garners positive reviews in its early going nonetheless still folds, or descends into chaos. In any case, I’m surprised to find this article is not online — if I’d been first to report on Wikipedia, I’d want to take credit for the fact.

Looking a little further, it seems that most of the Anglosphere reported on Wikipedia before anyone in the U.S. had anything to say about it: England (London Free Press), Canada (Edmonton Sun), Wales (Wales on Sunday) and Northern Ireland (Irish News) all got there first.

Stateside, the first press mention of Wikipedia was in the Gray Lady herself, the New York Times, by someone named Peter Meyers. This story is online, so I will simply quote the lede (sorry, non-journos) and call it good:

Fact-Driven? Collegial? This Site Wants You

FOR all the human traffic that the Web attracts, most sites remain fairly solitary destinations. People shop by themselves, retrieve information alone and post messages that they hope others will eventually notice. But some sites are looking for ways to enable visitors not only to interact but even to collaborate to change the sites themselves.

Wikipedia (www.wikipedia.com) is one such site, a place where 100 or so volunteers have been working since January to compile a free encyclopedia. Using a relatively unknown and simple software tool called Wiki, they are involved in a kind of virtual barn-raising.

Their work, which so far consists of some 10,000 entries ranging from Abba to zygote, in some ways resembles the ad hoc effort that went into building the Linux operating system. What they have accomplished suggests that the Web can be a fertile environment in which people work side by side and get along with one another. And getting along, in the end, may ultimately be more remarkable than developing a full-fledged encyclopedia.

For the curious, here is what the ABBA entry looked like on the day the story ran, and here it is today. And here is something close to what the zygote article looked like then, and what it looks like now. One wonders what it will look like in another ten years.

Update: In the comments, Graham87 locates the exact zygote entry, from the so-called Nostalgia Wikipedia (a topic worthy of its own post, at some point).

They Send You a Cease and Desist Letter, You Send One of Theirs to the Morgue

on August 4, 2010 at 6:46 am

Apparently the Federal Bureau of Investigation, the nation’s top cops, the G-Men, the public enemies of all public enemies, have found a new target: Wikipedia! The New York Times ran a short article yesterday about a funny-if-it-wasn’t-serious situation whereby the FBI recently sent a letter to the San Francisco offices of the Wikimedia Foundation

demanding that it take down an image of the F.B.I. seal accompanying an article on the bureau, and threatened litigation: “Failure to comply may result in further legal action. We appreciate your timely attention to this matter.”

But the Foundation won’t budge:

The problem, those at Wikipedia say, is that the law cited in the F.B.I.’s letter is largely about keeping people from flashing fake badges or profiting from the use of the seal, and not about posting images on noncommercial Web sites. Many sites, including the online version of the Encyclopedia Britannica, display the seal.

Other organizations might simply back down. But Wikipedia sent back a politely feisty response, stating that the bureau’s lawyers had misquoted the law. “While we appreciate your desire to revise the statute to reflect your expansive vision of it, the fact is that we must work with the actual language of the statute, not the aspirational version” that the F.B.I. had provided.

The relevant statute, helpfully linked by the New York Times, states:

§ 701. Official badges, identification cards, other insignia

Whoever manufactures, sells, or possesses any badge, identification card, or other insignia, of the design prescribed by the head of any department or agency of the United States for use by any officer or employee thereof, or any colorable imitation thereof, or photographs, prints, or in any other manner makes or executes any engraving, photograph, print, or impression in the likeness of any such badge, identification card, or other insignia, or any colorable imitation thereof, except as authorized under regulations made pursuant to law, shall be fined under this title or imprisoned not more than six months, or both.

I do find it ironic, considering that Wikipedia and other projects administered by its parent organization are among the most scrupulous on the whole of the Internet about respecting copyright law.

In most circumstances, Wikipedia requires that images used on the site be in the public domain or released under a free license explicitly permitting such use. Only where there is no hope a suitable alternative may be available does the site allow copyrighted images, and even then only under very limited conditions. If you want to use the Nike swoosh on your user page or the article about Michael Jordan, no such luck, but you will certainly find it on the company’s corporate profile.

The FBI seal, as a work of the United States government, falls under the first category — it is considered public domain — but its use is nevertheless limited to pages about certain FBI-specific subjects. And the photo’s page on the Wikipedia server even includes this helpful advisory:

[Image: licensing advisory shown on the FBI seal’s image page]

With no sources inside The House J. Edgar Hoover Built, I’m puzzled as to why they would do this. Perhaps they got the site confused with WikiLeaks?

WikiLeaks: No Wiki, Just Leaks

on July 31, 2010 at 8:55 pm

The website called WikiLeaks makes waves every few months, but never more than now that it has released 90,000+ classified U.S. military documents from the war in Afghanistan, which the site has called the Afghan War Diary. It’s become one of the biggest news stories of the summer, or at least one of the biggest legitimate news stories (cough, ahem). Aside from what the documents reveal (or maybe don’t) and their implications for U.S. policy, the release itself is an interesting subject, especially as compared to its nearest historical precedent.

When the classified documents that came to be known as the Pentagon Papers were first revealed in June 1971, the first stories about them ran in the New York Times, and only following an internal debate about the legal propriety of doing so. When the U.S. government predictably sued, the Washington Post started its own series based on the documents, and quickly faced the same injunction. By the end of the month — and we think things happen quickly these days — the Supreme Court ruled 6-3 that the injunctions were unconstitutional, and the rest is history.

What the Afghan War Diary lacks in public drama it more than makes up for in zeitgeist, with its decentralized, asymmetric, non-state method of publication. Rather than going to the press, the leaker gave them to WikiLeaks, a website based in Sweden, supported by anonymous donors and run (or at least repped) by a somewhat unusual fellow named Julian Assange.

But I’m compelled to point out, as the title of my post indicates, that despite running on the same software as Wikipedia and using the word “wiki” in its name, WikiLeaks is not a wiki. This screen cap below, featuring just a portion of the website’s front page, illustrates my point:

[Image: screen capture of a portion of the WikiLeaks front page]

If you’re familiar with Wikipedia (and I suspect you are) then you’ll notice the website is based on the same MediaWiki software as Wikipedia. Unlike Wikipedia, it does not acknowledge the fact. Although it’s free software, the terms of its Creative Commons license are such that one needs to give credit where due. At least WikiLeaks is consistently mysterious, not to mention contraband.

More to the point, look at the tabbed links to pages above the site banner. On Wikipedia, this is where you would see the following links: Page (article content), Discussion (where to talk about the article), Edit (what it sounds like), History (a list of all edits to the article) and a few others, including a link to log in or create an account. WikiLeaks is a bit different: there are only three such links. Most strikingly, there are no options to contribute or create an account. The discussion page is there, but you aren’t invited to participate. (Note that on Wikipedia, in most cases, one need not even register to contribute.) And for what it’s worth, there isn’t even a history page available, so there is no way to see what changes may have been made to the page since it was first posted. That’s a wiki? Yes, there is a link to submit documents for review, but that’s the Internet Movie Database (IMDb) model. I suppose ILDb just doesn’t have the same ring to it.

Just in case you’re rusty on the concept, or are the sort of person who wouldn’t know Ward Cunningham from Larry Sanger, here are a few handy definitions:

  • Dictionary.com: A collaborative Web site set up to allow user editing and adding of content
  • Simple English Wikipedia: A wiki is a type of website that lets anyone create and edit its pages.
  • Wiktionary: A collaborative website which can be directly edited using only a web browser, often by anyone with access to it.

The WikiLeaks FAQ makes it very clear that no open editing is to be found on this site:

Who writes WikiLeaks leaked document summaries?
WikiLeaks staff, sometimes in collaboration with the submitter. Historically, most summaries were written by Julian Assange.

And another:

Can random people edit WikiLeaks documents?
No. Source documents are kept pristine.

Of course this makes perfect sense, given the website’s stated mission. But it also makes it, you know, not a wiki.

Not only is the name misleading, but it’s my (purely speculative) opinion that the site was so named to borrow from the credibility enjoyed (and earned) by Wikipedia. Being a website created with the purpose of disclosing material previously regarded as secret, frequently concerning the security interests of nation states, WikiLeaks self-consciously associated itself with the only non-profit to be found among the top 10 global websites. The name recalls Wikinews, Wikibooks, Wikisource and other projects of the Wikimedia Foundation. Let’s be clear: WikiLeaks is most certainly not one of them. I’d think Wikipedia might even have a legal case to make against WikiLeaks, although it would surely be the least of the website’s legal problems.

If there is a silver lining in all this, perhaps it lies in the implication that the word “wiki” has come to denote something like “openness” and “fairness” and “democracy” to a worldwide audience of Internet users. ILDb really wouldn’t be the same. To have your name become shorthand for such an inchoate but positive concept is obviously a good thing in itself, and quite an accomplishment. But it also means, as WikiLeaks shows, that someone out there is going to bite your style.

Update: In the comments below, a reader suggests that WikiLeaks did, for a time, allow outside contributors as a traditional wiki would. That seems to indicate my speculation above is off-base, although it’s probably still true that WikiLeaks took inspiration from Wikipedia.