William Beutler on Wikipedia

Posts Tagged ‘Sarah Palin’

Michele Bachmann, Sarah Palin and the Boring Truth About Wikipedia Vandalism

Posted on June 29, 2011 at 10:02 am

The Wikipedian was traveling for most of this past month, and so I’ve missed out on a few interesting Wikipedia-related stories of late. None was more frustrating (and entertaining) than the case of Sarah Palin’s supporters’ edits to and arguments about Paul Revere’s famous ride. In case you missed it (or, as it is so often abbreviated in campaign e-mail blasts, “ICYMI”) Palin stated in early June that Revere had warned the British—not the American revolutionaries—and a few of her supporters attempted to change the Paul Revere article to more closely reflect her version of events.

Yes, I missed that one, but maybe I’m not too late: according to nearly back-to-back posts by left-wing bloggers at ThinkProgress and Raw Story, the same thing is happening to various Wikipedia articles following erroneous statements by newly-declared Republican presidential candidate Michele Bachmann. At issue:

  • During her campaign announcement speech, Bachmann referred to the late film actor John Wayne’s hometown as Waterloo, Iowa, when in fact it was Winterset, Iowa. As an aside, I seriously doubt, as widely asserted, that she was thinking of John Wayne Gacy (who is most closely associated with Chicago) and, for what it’s worth, Bachmann later pointed out that Wayne’s parents met in Waterloo.
  • Later, interviewed by ABC News, Bachmann referred to John Quincy Adams as a “founding father” although the U.S. president was only a child during the American revolution (he was, of course, the son of founding father John Adams). The last I heard, she was sticking to her guns on this one, as little sense as that makes.

As reported by ThinkProgress and Raw Story, the Wikipedia articles about John Wayne and John Quincy Adams were undoubtedly changed, more than once, to reflect Bachmann’s erroneous statements. I’ll tell you what, though: upon closer inspection, I think this hardly rises to the same level as the Palin-Revere controversy, and really says more about the partisan / ideological online media than it does about Michele Bachmann or her political supporters—let alone Wikipedia.

To wit: On Monday, an IP editor (meaning one who has not registered for an account and so is represented by their IP address) from Pennsylvania changed John Wayne’s birthplace to “Waterloo” from “Winterset”. It was changed back pretty quickly. On the discussion page, there was little actual debate of the issue—and it started anyway with a sarcastic post by someone clearly not a Bachmann fan.

The next day, on the John Quincy Adams page, an IP editor (using an IP address associated with UC-Irvine) added “a founding father” as a subordinate clause in the very first sentence. This too was removed, and a brief, detached conversation occurred on that discussion page as well.

I decided to look at the edit history of the IP editors responsible for the above edits. It turned out the editor responsible for the Wayne edit had made no prior edits and has made none since. The editor responsible for the JQA edit has possibly edited a few times before (IP addresses can be shared, so identity is difficult to establish). On the discussion page associated with the IP address, an established editor politely suggested that the individual create an account, whereupon the IP editor replied:

Are you joking? It was obviously vandalism, so why try to act like I was acting in good faith?

Yeah, that’s about right. You won’t hear it from ThinkProgress or Raw Story, but the Palin-Revere controversy was a much bigger deal, kicking up a much more heated debate, lasting more than a week and encompassing several related discussion threads. And whereas actual Sarah Palin fans seem to have become involved there, there is no reason to think that actual Bachmann supporters are involved here. The best take on it comes from an editor, BusterD, who wrote on the JQA discussion page:

Up to this point, what is reported is not actually happening. A few ip editors have been injecting the phrase “founding father”, sometimes as a clear jest and sometimes modifying the father who is considered one of the founders, but most of what’s going on is normal ip vandalism which occurs when an historical figure gets mentioned in the media. Semi-protection is now in force; nobody has been editing the page in any but the most minor ways. Sure would be a good time to get cites on everything and tighten the page up some.

That’s exactly right. Activity on Wikipedia articles, whether helpful or unhelpful, is often driven by what’s in the news, and this case seems to be no different. General mischief on Wikipedia is an everyday fact of life, and the idle hands motivated to cause such trouble frequently draw inspiration from the headlines. Wikipedia’s Recent changes patrol (and a few automated scripts) keep the most obvious vandalism at bay; most of it is caught within minutes. Politically motivated edits are usually much more subtle and focused on specific politicians rather than general topics momentarily associated with them. It seems clear that the Bachmann-related edits were not done to make a point but simply for the lulz.

Whether these incidents say anything about the respective supporters of Michele Bachmann vs. those of Sarah Palin, I pass no judgment. As to the blog-first-ask-questions-later nature of the political mediasphere, well, I think this post speaks for itself.

Charted Territory: When Good Infographics Go Bad

Posted on August 12, 2010 at 8:32 pm

I will be blunt: the new infographic from David McCandless (Information is Beautiful), called “Articles of War: Wikipedia’s lamest edit wars”, is so lazy as to be misleading, so glib as to be condescending, and so generally unhelpful that I’m inclined to say it sets back the public understanding of how Wikipedia works all by itself.

Up front: I respect McCandless and like what he does, which includes some interesting and thoughtful work, especially his print of Left vs. Right (U.S. and Rest of the World editions) that is better than most professional political analysts could produce. Separately, I am collaborating with friends on a Wikipedia visualization project of our own, so call me an interested observer, but note also that I’ve been thinking about this kind of thing lately.

I have reproduced only the top section of “Articles of War” below, for the purposes of commentary (click through to see the full thing on McCandless’ site):

[Image: Articles of War (excerpt)]

The first thing to know about “Articles of War” is that it was based on an essay to be found in the recesses of Wikipedia called “Lamest edit wars” that is specifically kept in the site’s intra-wiki space because, as it states at the top: “This page contains material that is kept because it is considered humorous.” McCandless & Co. do give credit where it is due, but that Wikipedia page surely does not and never did intend to be definitive — it’s just a series of cheekily-written paragraphs about various arguments occurring over time, so there is nothing like meaningful numbers to be gleaned from it.

Instead, McCandless and his researchers decided to generate data to visualize these edit wars by counting the total number of edits over each article’s lifetime, counting not just the edits specifically related to that particular dispute (a difficult and time-consuming thing to research, it goes without saying) but every single edit, ever, thereby giving a grossly distorted view of each article’s history. I’ll give them the fact that if one looks to the legend in the top lefthand corner, it indicates that the number listed (and I presume the size of each box) relates to the “Total no. of edits” but even if readers do notice that, it is at best confusing.

Likewise, each article’s position on the chart corresponds to its creation date, not to when the described dispute took place. If you think 2,000+ edits were expended on a photograph in the Cow-tipping article in the middle of 2001, that’s too bad, but you were reasonably misled. Nor would you know that the article did not include a photograph until several years later.
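
To make the distinction concrete, here is a minimal Python sketch of the two ways of counting. Everything in it is an assumption for illustration: the revision dates are invented, the dispute window is arbitrary, and `count_edits` is a hypothetical helper, not anything used by McCandless's researchers or provided by Wikipedia.

```python
from datetime import date


def count_edits(revision_dates, start=None, end=None):
    """Count revisions, optionally restricted to the window [start, end]."""
    return sum(
        1
        for d in revision_dates
        if (start is None or d >= start) and (end is None or d <= end)
    )


# Hypothetical revision dates for a single article (invented data).
revisions = [
    date(2004, 3, 1),
    date(2005, 7, 15),
    date(2006, 5, 2),
    date(2006, 5, 3),
    date(2006, 5, 3),
    date(2008, 11, 20),
]

# Lifetime total: the kind of figure the infographic reports.
lifetime_total = count_edits(revisions)

# Edits inside an assumed dispute window: closer to what "edit war" implies.
dispute_total = count_edits(
    revisions, start=date(2006, 5, 1), end=date(2006, 5, 31)
)

print(lifetime_total, dispute_total)
```

Even the windowed count is only an upper bound, since not every edit made while a dispute is running is part of that dispute. Sorting those out still takes the manual reading of the history page that the lifetime total avoids.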

What you are left with is a decent visualization of how frequently edited some randomly selected articles — some popular, some timely, some but not all controversial — happen to be. Why not simply show that? Focusing on this alone we can see that the following articles have attracted tens of thousands of edits over the years:

  • The Beatles
  • Jesus
  • Wikipedia
  • Christianity
  • Ann Coulter
  • Star Wars
  • Wii

That’s not linkbait enough for you? Then please do the research.

Meanwhile, the infographic is also a little too snarky for its own good, especially toward its chosen subject. Color-coding is used to categorize certain types of edit wars; one is labeled “American Cultural Superiority” and exists mainly to identify debates between U.S. and British spellings. Which I find a little… superior itself, but hey, I suppose it’s a misdemeanor violation. Worse is that edit wars involving Wikipedia and site co-founder Jimmy Wales are coded as “Religion.” Too cute. Or maybe just an oversight?

Another oversight concerns an on-wiki debate about whether the most famous Palin was, at the time of its occurrence, Monty Python’s Michael or Alaska’s former governor Sarah. (Since then, I believe the one with decades of contributions to comedy has been definitively usurped by the mavericky one’s more recent, er, contributions.) According to “Articles of War” this happened in 2003. But if you think about it, this makes no sense at all — of course this happened in 2008, when John McCain chose Sarah Palin as his running mate. And the Lamest edit wars essay itself mentions that this happened in 2008. Pure oversight to be sure, but I have to wonder what other mistakes the research team made.

To their partial credit, they have opened their Google Spreadsheets for public inspection, so it’s clear they at least intended to impart real information. And there you can see that they are indeed using the total number of edits over time and that their “Palin” error was made early on. That seems to put the responsibility on the researchers, rather than McCandless himself, but of course it’s a total package.

I hold McCandless to a standard that I don’t the jokers at Cracked* or Something Awful because their job is to make you laugh, while McCandless’ job, according to his website’s own tagline, is to take “issues, ideas, knowledge, data” — and make it easier to understand by visualizing it. There are certainly issues and ideas to be found in “Articles of War” — but knowledge and data, not so much. And though I am getting a little more rant-y than usual about this, I do aim to be constructive, so I would very much like to see this infographic re-done with some extra research. This blog post may serve as a guide if they so choose. I hope they do.

P.S. The Gizmodo thread — where I found it — on this is hilarious, with many people re-fighting the same disputes that once arose on Wikipedia. However, only one that I saw came anywhere near noticing the fact that the methodology was suspect.

P.P.S. Am I being nitpicky to add that “Articles of War” appears to convey that Wikipedia’s articles about The Beatles and Jesus were created prior to 2001? That is to say before Wikipedia itself began? I don’t actually think so.

*Actually, about Cracked — a.k.a. Digg’s favorite website — as I have seen a prominent Wikipedian point out elsewhere, it often does a pretty good job using information from Wikipedia responsibly. Among their articles about Wikipedia, the title of “5 Terrifying Bastardizations of the Wikipedia Model” alone gives away that it’s implicitly pro-Wikipedia, as does “5 Celebrity Wikipedia Entries they Clearly Wrote Themselves”. Even “8 Most Needlessly Detailed Wikipedia Entries” knows what’s good about Wikipedia, even when it isn’t. Cracked writers clearly know their way down through a history page — like say, Corey Feldman’s — but it doesn’t appear that McCandless and his researchers looked as closely.

The King of Wikipedia Traffic

Posted on June 27, 2009 at 4:42 pm

Michael Jackson’s sudden and shocking death just about blew up the Internet this past week, and Wikipedia was no exception, even getting briefly knocked offline. And as the New York Times’ tech reporter Noam Cohen reported, the stunning news produced another milestone for Wikipedia:

The Michael Jackson entry in Wikipedia Thursday evening appeared to have set the record as having the highest traffic in the eight-year history of the online encyclopedia.

In the 7 p.m. hour alone Thursday, shortly after Mr. Jackson’s death was confirmed, there were nearly one million visitors to that article. (In fact, for that hour more than 250,000 visitors went to the misspelled entry “Micheal Jackson.” Even his brother Randy Jackson had 25,000 visits that hour.)

“We suspect this is most in a one-hour period of any article in Wikipedia history,” said Jay Walsh, a spokesman for the Wikimedia Foundation in San Francisco.

The article goes on to note that this represented about 1 percent of Wikipedia’s total traffic on the day — this may not sound like much, until you recall the English Wikipedia has more than 2.9 million articles. Writing midday Friday, Cohen predicted that the article could surpass 5 million visits on Friday. As it happens, Cohen set his target a little too low:


1.4 million visits is pretty remarkable, but 5.9 million visits is unprecedented. However, there is one discrepancy: yesterday’s estimates from User:Henrik’s Wikipedia article traffic statistics tool (and Cohen’s article) put the figure at 1.8 million visits, which means the numbers were somehow reconciled downward in the interim. I’ll be looking to find out why. And while Cohen names as a point of comparison President Barack Obama’s Wikipedia article, which received 2.3 million visits on Election Day, I know of a page that received more traffic still and offers a better comparison:


That spike you are looking at occurred on the day that Senator John McCain announced Governor Sarah Palin as his running mate in the final days of August, 2008 (as previously discussed on Blog P.I.). Between Jackson and Palin we have one well-known but mysterious and one little-known but suddenly very public figure, thrust into the middle of a breaking news story. By comparison, Obama was a highly visible public figure and Election Day was known far in advance. Perhaps that actually makes the 2.3 million that day even more impressive. But it’s hard to read much more into bar graphs such as this beyond acknowledging they represent a sudden and externally-driven interest in the subject.

Meanwhile, it’s interesting to note that the article containing the information people presumably want most, Death of Michael Jackson, has not recorded anything like the traffic of the primary MJ article:


Why is this the case? Part of the answer is the power of Google, which is the overwhelming driver of traffic to Wikipedia. On that note, I don’t know about you, but in the past 24 hours, Michael Jackson’s official site and his Wikipedia article have traded places on Google, with Wikipedia now ranked first overall. Second, the link to this article is found deep in the primary one, albeit of course at the top of the section concerning his death. Still, 527 is a rounding error compared to 5.9 million. Perhaps the Michael Jackson article itself satisfied readers’ curiosity before they clicked over to iTunes and downloaded a copy of Thriller.

And one last, somewhat morbid note: it is strange indeed that the King of Pop is no longer covered by Wikipedia’s Biography of living persons guideline.

Update: In the comments, one of the more knowledgeable Wikipedia editors, Tvoz, suggests I’m wrong on the last point:

One thing: actually Michael Jackson’s article *is* still covered by the “biographies of living people” guidelines. Those guidelines protect the integrity of Wikipedia’s articles and intend to thwart defamation, and it is expected that editors will continue to follow the policy and remove poorly sourced defamatory material immediately, even after the death. His family members are alive, and causes of action as a result of such defamatory material could still be brought.

An interesting point, and I think a fair clarification. My inclination is to say this means that Jackson’s family members are still covered by BLP, and this means that any material on the Michael Jackson page must conform to the policy in order to protect them, rather than MJ himself. And of course, spurious information shouldn’t be added at any time — and Jackson’s continued celebrity probably means that this page will be scrutinized more than most.