William Beutler on Wikipedia

Search for “wikigroaning”

Charted Territory: When Good Infographics Go Bad

on August 12, 2010 at 8:32 pm

I will be blunt: the new infographic from David McCandless (Information is Beautiful), called “Articles of War: Wikipedia’s lamest edit wars”, is so lazy as to be misleading, so glib as to be condescending, and so generally unhelpful that I’m inclined to say that it sets back the public understanding of how Wikipedia works all by itself.

Up front: I respect McCandless and like what he does, which includes some interesting and thoughtful work, especially his print of Left vs. Right (U.S. and Rest of the World editions) that is better than most professional political analysts could produce. Separately, I am collaborating with friends on a Wikipedia visualization project of our own, so call me an interested observer, but note also that I’ve been thinking about this kind of thing lately.

I have reproduced only the top section of “Articles of War” below, for the purposes of commentary (click through to see the full thing on McCandless’ site):

Articles of War (excerpt)

The first thing to know about “Articles of War” is that it was based on an essay to be found in the recesses of Wikipedia called “Lamest edit wars” that is specifically kept in the site’s intra-wiki space because, as it states at the top: “This page contains material that is kept because it is considered humorous.” McCandless & Co. do give credit where it is due, but that Wikipedia page surely does not and never did intend to be definitive — it’s just a series of cheekily-written paragraphs about various arguments occurring over time, so there is nothing like meaningful numbers to be gleaned from it.

Instead, McCandless and his researchers decided to generate data to visualize these edit wars by counting the total number of edits over each article’s lifetime, counting not just the edits specifically related to that particular dispute (a difficult and time-consuming thing to research, it goes without saying) but every single edit, ever, thereby giving a grossly distorted view of each article’s history. I’ll give them the fact that if one looks to the legend in the top lefthand corner, it indicates that the number listed (and I presume the size of each box) relates to the “Total no. of edits” but even if readers do notice that, it is at best confusing.

Likewise, the articles’ relative position on the chart corresponds to when they were created, not when the described dispute took place. If you think 2,000+ edits were expended on a photograph in the Cow-tipping article in the middle of 2001, that’s too bad, but you were reasonably misled. Nor would you know that the article did not include a photograph until several years later.
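To make the distinction concrete, here is a minimal sketch of the two ways of counting. The revision dates and the dispute window are invented for illustration (they are not taken from any real article history or from McCandless’ spreadsheet), and `count_edits` is a hypothetical helper, not anyone’s actual method.

```python
from datetime import date

# Hypothetical revision dates for a single article (illustrative only).
revisions = [
    date(2004, 3, 1), date(2005, 7, 9), date(2006, 1, 15),
    date(2006, 1, 16), date(2006, 1, 16), date(2006, 1, 18),
    date(2008, 11, 2), date(2009, 5, 30),
]

def count_edits(revs, start=None, end=None):
    """Count revisions, optionally restricted to the inclusive window [start, end]."""
    return sum(
        1 for d in revs
        if (start is None or d >= start) and (end is None or d <= end)
    )

# What the infographic appears to plot: every edit over the article's lifetime.
lifetime_total = count_edits(revisions)

# What an edit-war chart would need: only edits made while the dispute ran.
# The window below is an assumption chosen for this example.
dispute_total = count_edits(revisions, date(2006, 1, 15), date(2006, 1, 18))

print(f"lifetime edits: {lifetime_total}, edits during dispute: {dispute_total}")
```

Even a date window overstates the dispute, since unrelated edits can land on the same days, but it at least ties the number to the period the chart claims to describe.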

What you are left with is a decent visualization of how frequently edited some randomly selected articles — some popular, some timely, some but not all controversial — happen to be. Why not simply show that? Focusing on this alone we can see that the following articles have attracted tens of thousands of edits over years:

  • The Beatles
  • Jesus
  • Wikipedia
  • Christianity
  • Ann Coulter
  • Star Wars
  • Wii

That’s not linkbait enough for you? Then please do the research.

Meanwhile, the infographic is also a little too snarky for its own good, especially toward its chosen subject. Color-coding is used to categorize certain types of edit wars; one is labeled “American Cultural Superiority” and exists mainly to identify debates between U.S. and British spellings. Which I find a little… superior itself, but hey, I suppose it’s a misdemeanor violation. Worse is that edit wars involving Wikipedia and site co-founder Jimmy Wales are coded as “Religion.” Too cute. Or maybe just an oversight?

Another oversight concerns an on-wiki debate about whether the most famous Palin was, at the time of its occurrence, Monty Python’s Michael or Alaska’s former governor Sarah. (Since then, I believe the one with decades of contributions to comedy has been definitively usurped by the mavericky one’s more recent, er, contributions.) According to “Articles of War” this happened in 2003. But if you think about it, this makes no sense at all — of course this happened in 2008, when John McCain chose Sarah Palin as his running mate. And the Lamest edit wars essay itself mentions that this happened in 2008. Pure oversight to be sure, but I have to wonder what other mistakes the research team made.

To their partial credit, they have opened their Google Spreadsheets for public inspection, so it’s clear they at least intended to impart real information. And there you can see that they are indeed using the total number of edits over time and that their “Palin” error was made early on. That seems to put the responsibility on the researchers, rather than McCandless himself, but of course it’s a total package.

I hold McCandless to a standard that I don’t hold the jokers at Cracked* or Something Awful to, because their job is to make you laugh, while McCandless’ job, according to his website’s own tagline, is to take “issues, ideas, knowledge, data” and make them easier to understand by visualizing them. There are certainly issues and ideas to be found in “Articles of War,” but knowledge and data, not so much. And though I am getting a little more rant-y than usual about this, I do aim to be constructive, so I would very much like to see this infographic re-done with some extra research. This blog post may serve as a guide if they so choose. I hope they do.

P.S. The Gizmodo thread — where I found it — on this is hilarious, with many people re-fighting the same disputes that once arose on Wikipedia. However, only one that I saw came anywhere near noticing the fact that the methodology was suspect.

P.P.S. Am I being nitpicky to add that “Articles of War” appears to convey that Wikipedia’s articles about The Beatles and Jesus were created prior to 2001? That is to say before Wikipedia itself began? I don’t actually think so.

*Actually, about Cracked — a.k.a. Digg’s favorite website — as I have seen a prominent Wikipedian point out elsewhere, it often does a pretty good job using information from Wikipedia responsibly. Among their articles about Wikipedia, the title of “5 Terrifying Bastardizations of the Wikipedia Model” alone gives away that it’s implicitly pro-Wikipedia, as does “5 Celebrity Wikipedia Entries they Clearly Wrote Themselves“. Even “8 Most Needlessly Detailed Wikipedia Entries” knows what’s good about Wikipedia, even when it isn’t. Cracked writers clearly know their way down through a history page — like say, Corey Feldman’s — but it doesn’t appear that McCandless and his researchers looked as closely.


David Petraeus’ Big Month

on June 28, 2010 at 8:34 am

In the views of many, Wikipedia tends toward frivolity. After all, the concept of Wikigroaning assumes that articles on pop culture subjects will be given more attention than articles on weighty subjects. While Wikipedia does include plenty of material that Britannica could and would never address, I’ve pointed out before that this isn’t always the case.

Here’s another reason to retain your faith in humanity, and this time not just Wikipedia’s contributors but also its visitors: this month’s traffic to the Wikipedia article about Gen. David Petraeus. He was in the news twice this month, and for very different reasons. First, on June 15, Petraeus fainted while testifying before the Senate Armed Services Committee. It’s just the kind of TMZ DC-ready story that gets attention, including video, which always helps. Indeed, the story caused traffic on his Wikipedia article to spike.

But as the chart below indicates, that was only about a tenth of the traffic to his page once President Obama nominated him to replace Gen. Stanley McChrystal as the U.S. commander in Afghanistan, following the latter general’s unsolicitous remarks about the Obama administration in Rolling Stone magazine. Perhaps this does not reveal too much, as this is undoubtedly the bigger news story, but it is also a much more complicated one, and at least indicates that no matter how many articles about Pokemon characters Wikipedia may hold, people can still find what’s important.

As for the fact that the top day for traffic to McChrystal’s Wikipedia article this month nearly doubled the traffic on Petraeus’ top day, well, I’ll let you judge that for yourself.

Snapshot of traffic to Wikipedia article about David Petraeus, June 2010.

Traffic statistics courtesy User:Henrik.


“Treme” vs. “Treme (TV series)”

on April 18, 2010 at 10:04 am

In more than one post on this blog I’ve written skeptically about the concept of “Wikigroaning”: the notion that important subjects sometimes have shorter articles than arguably less-important subjects that appeal to geek sensibilities. In the case of Raphael (archangel, artist or ninja turtle) and lightsaber vs. modern warfare, the complaint did not quite hold up. But I don’t mean to indicate the charge is always without basis.

With the second episode of HBO’s latest dramatic series, “Treme,” set to air this evening, I decided to compare two related Wikipedia articles — one about the New Orleans neighborhood, and the new TV drama from David Simon. What did I find?

In the first place, Treme is currently 10 KB long while Treme (TV series) is closer to 17 KB. On the face of it, the article about the series is substantially longer at present. And this is the case even though the former article has existed since April 2004 whereas the latter was created in March 2009.

It’s fair to say that both articles are in decent shape. The article about the neighborhood has a quality infobox featuring geographic and demographic information, and a concise History section is informative, if perhaps too concise. I compare this to the article about my neighborhood of Adams Morgan in Washington, DC, which has much more information (though fewer references to support them) and no comparable infobox of data. Each article could stand to learn something from the other.

But there is no use arguing that Treme (TV series) is not the better article. It is simply more carefully and completely written, with a more sophisticated article structure utilizing subsections for more in-depth coverage of certain aspects of the show. Plus, it has already spawned a secondary page, List of Treme episodes.

Is there a silver lining here? I think there may be. If the show becomes popular — at least popular enough to inspire a following similar to Simon’s earlier work — then it may well inspire someone or a few someones to become more interested in the neighborhood itself. To be sure, the series itself has already caused a spike of interest in the subject. And all it takes is one person to make it a personal project. If “Treme (TV series)” can do that, “Treme” will be the better for it.

Images via Wikipedia. Neighborhood photograph licensed under Creative Commons by Wikipedia contributor Infrogmation.


The Archangel, the Renaissance Master and the Ninja Turtle

on November 22, 2009 at 3:37 pm

Back in March I considered the subject of “wikigroaning”—a joke / criticism about Wikipedia popularized on the Something Awful Internet forum in 2007. The idea is this: Sometimes, Wikipedia articles on weighty subjects are shorter and less well-developed than articles about similar, less-weighty subjects.

What I found was that this critique no longer applied to a comparison of “Lightsaber combat” vs. “Modern warfare”; the former entry no longer strictly exists, as the page now redirects to the larger topic of “Lightsaber,” while the latter is essentially a hub for accessing articles on various sub-topics (asymmetric warfare, biological warfare, etc.).

Today, let’s look at another one suggested by Something Awful members: Raphael (archangel) vs. Raphael (ninja turtle). How do the two compare?

Superficially, the joke is on Wikipedia: the main text of the article about the comic book character is approximately 3,000 words long, whereas the one about the Judeo-Christian figure is about 1,350. But here’s the thing—the TMNT-related article is basically devoid of any citations, and was clearly written by fans of the various comic books, TV shows and movies in which he appears. One might assume that the details should be relatively accurate, as it doesn’t seem to be a contentious subject, but who is to say? One citation is provided for the entire article, and indeed the article has been tagged as needing citations since December 2007:

Screenshot of the citation warning on the Raphael (ninja turtle) article.

That’s almost two years in which fans have been stopping by to work on the article, but no one has yet bothered to clean up problems identified by a non-fan editor, nor have they bothered to provide citations to verify any of it. From this we can infer that most editors on this particular article are focused on this particular topic and are not involved with Wikipedia otherwise.

Meanwhile there is another problem with this article. While much of it summarizes discrete events that occur in the TMNT series, other sections read as commentary on / interpretations of the character. For example:

He has an extremely loyal side and is the first to react when another of his brothers is in trouble. This happens on numerous occasions, like when he stops a blow from hitting Donatello using only his sais or kicks the Shredder away from Leonardo when the latter is about to attack.

So one could certainly verify the existence of a particular scene by citing directly from the comics. Yet the interpretation of Raphael’s actions is left to the reader, and adding this information directly to Wikipedia is a clear-cut case of original research, which is expressly forbidden by Wikipedia guidelines.

What is one to do if there is no published commentary on this aspect of the character’s personality? Is it then to be left out of Wikipedia entirely? In theory, yes. In practice, no. I could remove this section immediately and much more of the article if it so pleased me. But you know what? I won’t do it. The article isn’t hurting anyone, so in that way its relative frivolity helps. Moreover, it’s entirely possible that many or most of these interpretations could be found in published reviews, and without having done this I’m disinclined to delete someone’s sincere work, however inexpert. As a known issue, Wikipedia has an informal term for this type of material: fancruft. Fancruft is often deleted, but this much is so far not offensive enough to merit outright deletion.

And how about the archangel? For an article about a Biblical figure I am surprised that it is not better. Only seven citations have been provided, and sections including “Raphael in Islam” and “Raphael in Paradise Lost” have none whatsoever. The quality of the writing is likewise uneven. Clearly, different sections within the article are substantially the work of different editors, and I would probably base my trust in each section according to the quality of the prose. Unsurprisingly, the better-written sections are also the ones with more sources.

But let’s now finally address the obvious: Something Awful seems to have made a mistake, because Raphael the turtle is not named for Raphael the archangel. He is named for the Renaissance artist, just like his ninja turtle brothers Leonardo, Michelangelo and Donatello.

Before we come to a final conclusion, let us consider the article about the real person, which is simply titled Raphael. And guess what? It’s the best of the bunch, and it’s not even close. The article is more than 6,000 words, well-written, well-sourced (84 in-line citations, nearly all from serious biographies) and well-illustrated (easy to do when the subject’s work is all public domain). There is not even a mention of the TMNT character, although it has been suggested before and appropriately rejected.

Did Something Awful purposefully avoid making the comparison? Hard to say. In 2007 the article about the Renaissance master was much shorter and completely unsourced, though carefully-written. At that time, the article about the ninja turtle was certainly longer but also less sophisticated.

According to the original Something Awful post, the criterion was simply an assessment of which article is “longer.” But this is too simplistic: it should be obvious that not all words are equal. Just as Something Awful seeks to highlight the mistake of determining a subject’s importance by the space allotted on Wikipedia, it’s also a mistake to assume that the quality of an article is directly correlated with the number of words contained within.

Both are important to keep in mind when reading Wikipedia. How many readers approach the site with these considerations in mind? That’s what I’d like to know.
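For what it’s worth, the rough numbers quoted in this post already show how a length ranking and a sourcing ranking can disagree. The sketch below is only a back-of-the-envelope illustration: the word counts are the approximations given above (the Raphael article is “more than 6,000 words,” so its figure is an upper bound on density), and “citations per 1,000 words” is a metric I am assuming for the sake of the example, not a Wikipedia standard or a real measure of quality.

```python
# (approximate word count, in-line citations), as reported earlier in this post
articles = {
    "Raphael (ninja turtle)": (3000, 1),
    "Raphael (archangel)": (1350, 7),
    "Raphael": (6000, 84),
}

def citations_per_1000_words(words, citations):
    """Crude sourcing density; an assumed metric used only for illustration."""
    return citations / words * 1000

density = {
    name: round(citations_per_1000_words(words, cites), 1)
    for name, (words, cites) in articles.items()
}

# Rank the same three articles two ways: by raw length and by sourcing density.
by_length = sorted(articles, key=lambda name: articles[name][0], reverse=True)
by_density = sorted(density, key=density.get, reverse=True)

print("longest first:     ", by_length)
print("best-sourced first:", by_density)
```

The artist’s article comes first either way, but the ninja turtle and the archangel trade places depending on which yardstick is used, which is the whole problem with “longer” as a criterion.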

Images via Wikipedia.


Wikipedia On Dead Tree Redux

on June 20, 2009 at 3:31 pm

More than a week ago I posted a photo that’s been making the rounds lately — and even wound up as the basis for a joke on Conan O’Brien this past week — about a student artist who had created a physical book of Wikipedia’s Featured articles, one taking up approximately 5,000 pages. I noted at the time that the explanatory text

Reproducing Wikipedia in a dysfunctional physical form helps to question its use as an internet resource.

wasn’t terribly satisfying to me, and I asked at the time

Would printing all of Google’s search results also question its use as an Internet resource? Would printing an image of a sundial question its use as a physical timekeeping device?

and I resolved to find out more if I could. In fact I did hear back from the book’s creator, Rob Matthews, not long after. When posed with the question above, he responded at first:

I’m comparing the Internet Wikipedia to a traditional encyclopedia, by putting it in the same format, therefore suggesting that Wikipedia is dysfunctional compared to a normal encyclopedia. This is suggested by how I’ve conveyed Wikipedia physically.

I still wasn’t satisfied with this, but after a bit of back and forth, Matthews confirmed that his intention was to point out that, compared to a traditional paper-based encyclopedia, Wikipedia is less reliable because of its radical openness, and that it is harder to find what’s important among the incomplete and unbalanced articles that exist on the site. Those are my words, but he agreed with this much.

I actually do not agree with this view. Not that I don’t agree there is some truth to the point, because there is, but because I do not actually see how anyone is impeded from finding what they want because of Wikipedia. Moreover, “what’s important” is always in flux, and Wikipedia is a reflection of that.

It’s also nothing new. Those who lament the fact that Wikipedia gives disproportionate coverage to trivial matters (a criticism voiced by none other than Stephen Colbert, who sarcastically riffed on the subject, “any site that’s got a longer entry on ‘truthiness’ than on Lutherans has its priorities straight”) should also recognize that these imbalances are often corrected.

I’ve never been one to take my social commentary from visual art such as painting or sculpture, in significant part because it is rare that an image or an object can convey a subtle point while also succeeding as art. For such a purpose — in this case offering commentary on a subject which is overwhelmingly composed of words — I think nonverbal art is inferior to something like the novel, the essay or even the sitcom.

Even if I thought Matthews had a strong argument about Wikipedia to make, I think this fails as standalone commentary. But if Matthews does actually sell copies of this book, consider me interested (price dependent). Mr. Matthews doesn’t have answers for his questions, but his artwork would make for an excellent conversation piece.


Wikigroaning: Less Random than a Blaster

on March 9, 2009 at 3:24 pm

On July 31, 2006, Stephen Colbert said of Wikipedia:

Any site that has a longer entry on truthiness than on Lutherans has its priorities straight.

This probably wasn’t the first time someone has noticed the tendency of Wikipedia to feature more information about arguably trivial subjects than arguably significant ones, but it certainly was not the last. Less than a year later, a contributor to Something Awful created (or popularized) a game called “Wikigroaning”:

The premise is quite simple. First, find a useful Wikipedia article that normal people might read. For example, the article called “Knight.” Then, find a somehow similar article that is longer, but at the same time, useless to a very large fraction of the population. In this case, we’ll go with “Jedi Knight.” Open both of the links and compare the lengths of the two articles. Compare not only that, but how well concepts are explored, and the greater professionalism with which the longer article was likely created. Are you looking yet? Get a good, long look. Yeah. Yeeaaah, we know, but that is just the tip of the iceberg.

The article included a list of amusingly juxtaposed concepts, such as Modern warfare vs. Video Game Crash of 1983, and while the concept is funny, Wikipedia recognizes this as a systemic bias it must deal with.

This seems to me like an opportunity a) to ask whether they have and in doing so b) launch an occasionally recurring feature, wherein we compare the Wikipedia of today (beginning in early 2009) to the Wikipedia of 2007. So let’s see how those specific articles from 2007 compare then, and now.

First, let’s benchmark the articles at June 1, 2007, just a few days prior to the article’s publication and about the time author Johnny “DocEvil” Titanium was doing his research. Naturally, the piece implies that the “Lightsaber combat” article was much longer than the one about “Modern warfare.” Unfortunately (sort of) the former article no longer exists: if you click the link you are now redirected to the article Lightsaber, and as I am not an administrator, I cannot see the old pages. No matter. If we substitute Lightsaber on June 1, 2007, that article was 9,500+ words long. Modern warfare on June 1, 2007 was just shy of 2,000.

Now, here are the two side-by-side as of today:

Side-by-side screenshots of the Modern warfare and Lightsaber articles.

As you can see, the Modern warfare article is now somewhat longer than the Lightsaber article. Of course, length is not everything. For one thing, the Lightsaber article is now well-sourced (in 2007 it had just one in-line citation) whereas Modern warfare in fact has none. But there are mitigating circumstances here, as well. One thing that “Wikigroaning” doesn’t take into account is the amount of material on other pages, and here Modern warfare is nearly a list, serving primarily as a jumping-off point to other articles describing different aspects of modern warfare in greater detail. Some of these are well-sourced, whereas others are not. Another consideration is edit frequency: Lightsaber has been edited many, many more times than has Modern warfare, which speaks partially to the number of “experts” in the former and partially to the stability of the latter.

A more apt comparison might be to the AK-47 article, which I think is a better article still, and much better than the one about the Blaster.

This being the first post in a series I have yet to fully develop, I may develop a rating system and return to this post at another date to include it. Additionally, what I write is guaranteed valid for March 9, 2009 only and may warrant revisiting at another time. But let’s see where this takes us in the meantime.

Oh, and if you really want to know all about Lightsaber combat, Wookieepedia has an article of that name which runs more than 3,300 words but has no in-line citations.
