William Beutler on Wikipedia

Posts Tagged ‘Christianity’

Charted Territory: When Good Infographics Go Bad

Posted on August 12, 2010 at 8:32 pm

I will be blunt: the new infographic from David McCandless (Information is Beautiful), called “Articles of War: Wikipedia’s lamest edit wars”, is so lazy as to be misleading, so glib as to be condescending, and so generally unhelpful that I’m inclined to say it sets back the public understanding of how Wikipedia works all by itself.

Up front: I respect McCandless and like what he does, which includes some interesting and thoughtful work, especially his print of Left vs. Right (U.S. and Rest of the World editions), which is better than what most professional political analysts could produce. Separately, I am collaborating with friends on a Wikipedia visualization project of our own, so call me an interested observer, but note also that I’ve been thinking about this kind of thing lately.

I have reproduced only the top section of “Articles of War” below, for the purposes of commentary (click through to see the full thing on McCandless’ site):

Articles of War (excerpt)

The first thing to know about “Articles of War” is that it was based on an essay to be found in the recesses of Wikipedia called “Lamest edit wars” that is specifically kept in the site’s intra-wiki space because, as it states at the top: “This page contains material that is kept because it is considered humorous.” McCandless & Co. do give credit where it is due, but that Wikipedia page surely does not and never did intend to be definitive — it’s just a series of cheekily-written paragraphs about various arguments occurring over time, so there is nothing like meaningful numbers to be gleaned from it.

Instead, McCandless and his researchers decided to generate data to visualize these edit wars by counting the total number of edits over each article’s lifetime, counting not just the edits specifically related to that particular dispute (a difficult and time-consuming thing to research, it goes without saying) but every single edit, ever, thereby giving a grossly distorted view of each article’s history. I’ll give them the fact that if one looks to the legend in the top lefthand corner, it indicates that the number listed (and I presume the size of each box) relates to the “Total no. of edits” but even if readers do notice that, it is at best confusing.

Likewise, the articles’ relative position on the chart corresponds to when they were created, not when the described dispute took place. If you think 2,000+ edits were expended on a photograph in the Cow-tipping article in the middle of 2001, that’s too bad, but you were reasonably misled. Nor would you know that the article did not include a photograph until several years later.

What you are left with is a decent visualization of how frequently edited some randomly selected articles — some popular, some timely, some but not all controversial — happen to be. Why not simply show that? Focusing on this alone we can see that the following articles have attracted tens of thousands of edits over the years:

  • The Beatles
  • Jesus
  • Wikipedia
  • Christianity
  • Ann Coulter
  • Star Wars
  • Wii

That’s not linkbait enough for you? Then please do the research.

Meanwhile, the infographic is also a little too snarky for its own good, especially toward its chosen subject. Color-coding is used to categorize certain types of edit wars; one is labeled “American Cultural Superiority” and exists mainly to identify debates between U.S. and British spellings. Which I find a little… superior itself, but hey, I suppose it’s a misdemeanor violation. Worse is that edit wars involving Wikipedia and site co-founder Jimmy Wales are coded as “Religion.” Too cute. Or maybe just an oversight?

Another oversight concerns an on-wiki debate about whether the most famous Palin was, at the time of its occurrence, Monty Python’s Michael or Alaska’s former governor Sarah. (Since then, I believe the one with decades of contributions to comedy has been definitively usurped by the mavericky one’s more recent, er, contributions.) According to “Articles of War” this happened in 2003. But if you think about it, this makes no sense at all — of course this happened in 2008, when John McCain chose Sarah Palin as his running mate. And the Lamest edit wars essay itself mentions that this happened in 2008. Pure oversight to be sure, but I have to wonder what other mistakes the research team made.

To their partial credit, they have opened their Google Spreadsheets for public inspection, so it’s clear they at least intended to impart real information. And there you can see that they are indeed using the total number of edits over time and that their “Palin” error was made early on. That seems to put the responsibility on the researchers, rather than McCandless himself, but of course it’s a total package.

I hold McCandless to a standard that I don’t hold the jokers at Cracked* or Something Awful to, because their job is to make you laugh, while McCandless’ job, according to his website’s own tagline, is to take “issues, ideas, knowledge, data” and make them easier to understand by visualizing them. There are certainly issues and ideas to be found in “Articles of War”, but knowledge and data, not so much. And though I am getting a little more rant-y than usual about this, I do aim to be constructive, so I would very much like to see this infographic re-done with some extra research. This blog post may serve as a guide if they so choose. I hope they do.

P.S. The Gizmodo thread on this (where I found it) is hilarious, with many people re-fighting the same disputes that once arose on Wikipedia. However, only one commenter that I saw came anywhere near noticing that the methodology was suspect.

P.P.S. Am I being nitpicky to add that “Articles of War” appears to convey that Wikipedia’s articles about The Beatles and Jesus were created prior to 2001? That is to say before Wikipedia itself began? I don’t actually think so.

*Actually, about Cracked (a.k.a. Digg’s favorite website): as I have seen a prominent Wikipedian point out elsewhere, it often does a pretty good job using information from Wikipedia responsibly. Among their articles about Wikipedia, the title of “5 Terrifying Bastardizations of the Wikipedia Model” alone gives away that it’s implicitly pro-Wikipedia, as does “5 Celebrity Wikipedia Entries they Clearly Wrote Themselves”. Even “8 Most Needlessly Detailed Wikipedia Entries” knows what’s good about Wikipedia, even when it isn’t. Cracked writers clearly know their way down through a history page (like, say, Corey Feldman’s), but it doesn’t appear that McCandless and his researchers looked as closely.

The Archangel, the Renaissance Master and the Ninja Turtle

Posted on November 22, 2009 at 3:37 pm

(Images: Raphael the archangel, Raphael the painter, Raphael the ninja turtle)

Back in March I considered the subject of “wikigroaning”—a joke / criticism about Wikipedia popularized on the Something Awful Internet forum in 2007. The idea is this: Sometimes, Wikipedia articles on weighty subjects are shorter and less well-developed than articles about similar, less-weighty subjects.

What I found was that this critique no longer applied to a comparison of “Lightsaber combat” vs. “Modern warfare”: the former entry no longer strictly exists, as the page now redirects to the larger topic of “Lightsaber,” while the latter is essentially a hub for accessing articles on various sub-topics (asymmetric warfare, biological warfare, etc.).

Today, let’s look at another one suggested by Something Awful members: Raphael (archangel) vs. Raphael (ninja turtle). How do the two compare?

Superficially, the joke is on Wikipedia: the main text of the article about the comic book character is approximately 3,000 words long, whereas the one about the Judeo-Christian figure is about 1,350. But here’s the thing—the TMNT-related article is basically devoid of any citations, and was clearly written by fans of the various comic books, TV shows and movies in which he appears. One might assume that the details should be relatively accurate, as it doesn’t seem to be a contentious subject, but who is to say? One citation is provided for the entire article, and indeed the article has been tagged as needing citations since December 2007:

(Screenshot of the article’s citation warning)

That’s almost two years in which fans have been stopping by to work on the article, but no one has yet bothered to clean up problems identified by a non-fan editor, nor have they bothered to provide citations to verify any of it. From this we can infer that most editors on this particular article are focused on this particular topic and are not involved with Wikipedia otherwise.

Meanwhile there is another problem with this article. While much of it summarizes discrete events that occur in the TMNT series, other sections read as commentary on / interpretations of the character. For example:

He has an extremely loyal side and is the first to react when another of his brothers is in trouble. This happens on numerous occasions, like when he stops a blow from hitting Donatello using only his sais or kicks the Shredder away from Leonardo when the latter is about to attack.

So one could certainly verify the existence of a particular scene by citing directly from the comics. Yet the interpretation of Raphael’s actions is left to the reader, and adding this information directly to Wikipedia is a clear-cut case of original research, expressly forbidden by Wikipedia guidelines.

What is one to do if there is no published commentary on this aspect of the character’s personality? Is it then to be left out of Wikipedia entirely? In theory, yes. In practice, no. I could remove this section immediately, and much more of the article, if it so pleased me. But you know what? I won’t do it. The article isn’t hurting anyone, so in that way its relative frivolity helps. Moreover, it’s entirely possible that many or most of these interpretations could be found in published reviews, and without having checked, I’m disinclined to delete someone’s sincere work, however inexpert. This is a known issue, and Wikipedia has an informal term for this type of material: fancruft. Fancruft is often deleted, but this much is so far not offensive enough to merit outright deletion.

And how about the archangel? For an article about a Biblical figure I am surprised that it is not better. Only seven citations have been provided, and sections including “Raphael in Islam” and “Raphael in Paradise Lost” have none whatsoever. The quality of the writing is likewise uneven. Clearly, different sections within the article are substantially the work of different editors, and I would probably base my trust in each section according to the quality of the prose. Unsurprisingly, the better-written sections are also the ones with more sources.

But let’s now finally address the obvious: Something Awful seems to have made a mistake, because Raphael the turtle is not named for Raphael the archangel. He is named for the Renaissance artist, just like his ninja turtle brothers Leonardo, Michaelangelo and Donatello.

Before we come to a final conclusion, let us consider the article about the real person, which is simply titled Raphael. And guess what? It’s the best of the bunch, and it’s not even close. The article is more than 6,000 words, well-written, well-sourced (84 in-line citations, nearly all from serious biographies) and well-illustrated (easy to do when the subject’s work is all public domain). There is not even a mention of the TMNT character, although it has been suggested before and appropriately rejected.

Did Something Awful purposefully avoid making the comparison? Hard to say. In 2007 the article about the Renaissance master was much shorter and completely unsourced, though carefully-written. At that time, the article about the ninja turtle was certainly longer but also less sophisticated.

According to the original Something Awful post, the criterion was simply an assessment of which article is “longer.” But this is too simplistic: it should be obvious that not all words are equal. Just as Something Awful seeks to highlight the mistake of determining a subject’s importance by the space allotted on Wikipedia, it’s also a mistake to assume that the quality of an article is directly correlated with the number of words contained within.

Both are important to keep in mind when reading Wikipedia. How many readers approach the site with these considerations in mind? That’s what I’d like to know.

Images via Wikipedia.