William Beutler on Wikipedia

Archive for June 2009

The King of Wikipedia Traffic

Posted on June 27, 2009 at 4:42 pm

Michael Jackson’s sudden and shocking death just about blew up the Internet this past week, and Wikipedia was no exception, even getting briefly knocked offline. And as the New York Times’ tech reporter Noam Cohen reported, the stunning news produced another milestone for Wikipedia:

The Michael Jackson entry in Wikipedia Thursday evening appeared to have set the record as having the highest traffic in the eight-year history of the online encyclopedia.

In the 7 p.m. hour alone Thursday, shortly after Mr. Jackson’s death was confirmed, there were nearly one million visitors to that article. (In fact, for that hour more than 250,000 visitors went to the misspelled entry “Micheal Jackson.” Even his brother Randy Jackson had 25,000 visits that hour.)

“We suspect this is most in a one-hour period of any article in Wikipedia history,” said Jay Walsh, a spokesman for the Wikimedia Foundation in San Francisco.

The article goes on to note that this represented about 1 percent of Wikipedia’s total traffic on the day — this may not sound like much, until you recall the English Wikipedia has more than 2.9 million articles. Writing midday Friday, Cohen predicted that the article could surpass 5 million visits on Friday. As it happens, Cohen set his target a little too low:

[Image: traffic spike for the Michael Jackson Wikipedia article]

1.4 million visits is pretty remarkable, but 5.9 million visits is unprecedented. However, there is one discrepancy: yesterday’s estimates from User:Henrik’s Wikipedia article traffic statistics tool (and Cohen’s article) put the figure at 1.8 million visits, which means the numbers were somehow reconciled downward in the interim. I’ll be looking to find out why. And while Cohen names as a point of comparison President Barack Obama’s Wikipedia article, which received 2.3 million visits on Election Day, I know of a page that received more traffic still and offers a better comparison:

[Image: traffic spike for the Sarah Palin Wikipedia article]

That spike you are looking at occurred on the day that Senator John McCain announced Governor Sarah Palin as his running mate in the final days of August, 2008 (as previously discussed on Blog P.I.). Between Jackson and Palin we have one well-known but mysterious and one little-known but suddenly very public figure, thrust into the middle of a breaking news story. By comparison, Obama was a highly visible public figure and Election Day was known far in advance. Perhaps that actually makes the 2.3 million that day even more impressive. But it’s hard to read much more into bar graphs such as this beyond acknowledging they represent a sudden and externally-driven interest in the subject.

Meanwhile, it’s interesting to note that the article containing the information people presumably want most, Death of Michael Jackson, has not recorded anything like the traffic of the primary MJ article:

[Image: traffic for the Death of Michael Jackson Wikipedia article]

Why is this the case? Part of the answer is the power of Google, which is the overwhelming driver of traffic to Wikipedia. On that note, I don’t know about you, but in the past 24 hours, Michael Jackson’s official site and his Wikipedia article have traded places on Google, with Wikipedia now ranked first overall. Second, the link to this article is found deep in the primary one, albeit of course at the top of the section concerning his death. Still, 527 is a rounding error compared to 5.9 million. Perhaps the Michael Jackson article itself satisfied readers’ curiosity before they clicked over to iTunes and downloaded a copy of Thriller.

And one last, somewhat morbid note: it is strange indeed that the King of Pop is no longer covered by Wikipedia’s Biographies of living persons guideline.

Update: In the comments, one of the more knowledgeable Wikipedia editors, Tvoz, suggests I’m wrong on the last point:

One thing: actually Michael Jackson’s article *is* still covered by the “biographies of living people” guidelines. Those guidelines protect the integrity of Wikipedia’s articles and intend to thwart defamation, and it is expected that editors will continue to follow the policy and remove poorly sourced defamatory material immediately, even after the death. His family members are alive, and causes of action as a result of such defamatory material could still be brought.

An interesting point, and I think a fair clarification. My inclination is to say this means that Jackson’s family members are still covered by BLP, and this means that any material on the Michael Jackson page must conform to the policy in order to protect them, rather than MJ himself. And of course, spurious information shouldn’t be added at any time — and Jackson’s continued celebrity probably means that this page will be scrutinized more than most.

Watch Out, Laszlo Panaflex!

Posted on June 22, 2009 at 10:42 pm

[Image: Laszlo Panaflex]

In a 1996 episode of The Simpsons, washed-up movie star Troy McClure (you may remember him from such self-help videos as “Smoke Yourself Thin!” and “Get Confident, Stupid!”) enters a sham marriage with Aunt Selma to squash rumors about his sordid personal life and regain his former screen glory. As he is “romancing” Selma along a Simpsonized version of the Hollywood Walk of Fame, McClure declares:

One day, my lady Selma’s gonna have a star right next to mine, so watch out [camera pans right] Laszlo Panaflex!

Like most throwaway Simpsons lines, it has faded from mainstream recognition — the episode’s imagined musical version of “Planet of the Apes” is surely better known — but lives on in offhand references made by those of us who have been watching long enough to remember the controversy over Bart Simpson and those “Underachiever and Proud Of It” T-shirts.

I thought of it again while watching Ghostbusters on TV last night, noticing that the cinematographer was László Kovács. Was Kovács’ the name Simpsons writers were riffing on? Following a well-established routine, I plugged his name — Panaflex’s of course — into Google, hoping for but not really expecting a Wikipedia article to pop up.

It turns out Wikipedia did show up first — but it wasn’t an article. Instead, it was a user page for someone using the fictional lenser’s moniker as a handle. It reads in full:

Nice. But this also got me wondering: is this a loophole in Wikipedia policy? Isn’t this a way to get an encyclopedic page on the site even if it would be otherwise deleted by Wikipedia’s relentless arbiters of significance? After all, articles appearing on what Wikipedians call the “mainspace” of Wikipedia are expected to satisfy a handful of core guidelines lest they be removed or radically altered.

First there is the general notability guideline requiring the subject to meet a certain threshold of importance (often determined by news coverage). Articles failing the requirement are deleted, and relevant content is sometimes relocated to existing articles about the same topic. Laszlo Panaflex, as one joke in one episode, would never pass Wikipedia’s notability requirement because it would obviously belong on the page about the episode (and as of this writing, it is not even there). An example of a Simpsons reference that does meet this requirement is Homer Simpson’s ubiquitous “D’oh!”

Other guidelines a user page could evade, and does in this case: Verifiability and Reliable sources. Sure, the page helps to confirm my suspicion that Laszlo Panaflex is inspired by the real cinematographer, the one with the accented name that discourages me from Ctrl-C/V-ing it again. It certainly wouldn’t surprise me if the character was named for him, but the page doesn’t offer a citation for the claim. I need more proof, and articles in the Wikipedia mainspace do, too.* User pages have no such requirement.

On the other hand, I think it passes NPOV with flying colors.

But is it a loophole to treat a user page like an article? After all, Laszlo Panaflex ranked right at the top of Google; other articles on semi-obscure subjects could as well. I don’t believe there is a policy, guideline or essay that specifically addresses this, though I fully acknowledge I may be wrong. In the case that I am not, the possibility exists for unworthy (or even “unworthy”) articles to be given a second home on user pages.

I can say for certain — alas, without being able to summon a link (I’ll look) — that there are a number of editors whose user pages are written to resemble a Wikipedia article. Is that wrong? I don’t think so. However, I do think it could make the Wikipedia community uncomfortable if it became a widespread practice, and was seen as a gray hat SEO technique.

In that unlikely event, the first suggestion that comes to me would be requiring a banner on user pages that specifies that it is not an “article”. It would be phrased like the banner I keep atop my own page, included as a disclaimer in case the page is swiped by an unscrupulous mirror site. After all, this non-accusatory template puts even a flawed but useful article about one Laszlo Panaflex in the proper context:

This is a Wikipedia user page.

This is not an encyclopedia article. If you find this page on any site other than Wikipedia, you are viewing a mirror site. Be aware that the page may be outdated and that the user this page belongs to may have no personal affiliation with any site other than Wikipedia itself. The original page is located at http://en.wikipedia.org/wiki/User:WWB.

Wikimedia Foundation

*It may be out there. Many other Simpsons-related Wikipedia articles, including “A Fish Called Selma”, are buttressed by citations to the commentary tracks on the official DVD releases. If anybody knows for sure, I’d be happy to help add the citation.

Wikipedia On Dead Tree Redux

Posted on June 20, 2009 at 3:31 pm

More than a week ago I posted a photo that’s been making the rounds lately — and even wound up as the basis for a joke on Conan O’Brien this past week — about a student artist who had created a physical book of Wikipedia’s Featured articles, one taking up approximately 5,000 pages. I noted at the time that the explanatory text

Reproducing Wikipedia in a dysfunctional physical form helps to question its use as an internet resource.

wasn’t terribly satisfying to me, and I asked at the time

Would printing all of Google’s search results also question its use as an Internet resource? Would printing an image of a sundial question its use as a physical timekeeping device?

and I resolved to find out more if I could. In fact I did hear back from the book’s creator, Rob Matthews, not long after. When posed with the question above, he responded at first:

I’m comparing the Internet Wikipedia to a traditional encyclopedia, by putting it in the same format, therefore suggesting that Wikipedia is dysfunctional compared to a normal encyclopedia. This is suggested by how I’ve conveyed Wikipedia physically.

I still wasn’t satisfied with this, but after a bit of back and forth, Matthews confirmed that his intention was to point out that, compared to a traditional paper-based encyclopedia, Wikipedia is less reliable because of its radical openness, or that it is hard to find what’s important among the incomplete and unbalanced articles that exist on the site. Those are my words, but he agreed with this much.

I actually do not agree with this view. It’s not that there is no truth to the point, because there is; it’s that I do not see how anyone is impeded from finding what they want because of Wikipedia. Moreover, “what’s important” is always in flux, and Wikipedia is a reflection of that.

[Image: Rob Matthews’s printed book of Wikipedia’s Featured articles]

It’s also nothing new. Those who lament the fact that Wikipedia gives disproportionate coverage to trivial matters (a criticism voiced by none other than Stephen Colbert, who sarcastically riffed on the subject, “any site that’s got a longer entry on ‘truthiness’ than on Lutherans has its priorities straight”) should also recognize that these imbalances are often corrected.

I’ve never been one to take my social commentary from visual art such as painting or sculpture, in significant part because it is rare that an image or an object can convey a subtle point while also succeeding as art. For such a purpose — in this case offering commentary on a subject which is overwhelmingly composed of words — I think nonverbal art is inferior to something like the novel, the essay or even the sitcom.

Even if I thought Matthews had a strong argument about Wikipedia to make, I think this fails as standalone commentary. But if Matthews does actually sell copies of this book, consider me interested (price dependent). Mr. Matthews doesn’t have answers for his questions, but his artwork would make for an excellent conversation piece.

Words and Deeds: Wikipedia and the Virginia Governor’s Race

Posted on June 14, 2009 at 8:03 am

The Democratic Party of Virginia settled on a nominee for governor this past week, choosing state senator Creigh Deeds over two better-known rivals, including former DNC chairman Terry McAuliffe. (On the Republican side, Bob McDonnell was unopposed for the nomination.) Following the race, Virginia blogger and Wikipedia contributor Waldo Jaquith posted about “Wikipedia’s role in Sen. Deeds’ nomination”, featuring quotes from a live discussion at WashingtonPost.com. Wrote one voter:

I voted for Deeds. The WaPo endorsement really helped. I started doing the research this weekend and was disappointed that the WaPo did not have a quick guide the issues. I searched for a half an hour and did not find a quick rundown of the candidates and the issues.

Also, Deeds had a wikipedia page about his past stances. That really helped. The other two did not have similar pages.

Interestingly, the specific page quoted — “Political positions of Creigh Deeds” — has been merged back into the main Deeds article, but the content appears intact. Jaquith writes:

I’ve said it before, and I’ll say it again: Wikipedia is going to play a large role in year’s Virginia elections. The campaigns that a) understand that, b) harness that and c) do so in a fair, unbiased way will reap the benefits. The campaigns that ignore Wikipedia or attempt to manipulate its information in a way that is anything less than fully truthful will be penalized accordingly.

In fact, that seems to have already occurred in the primary. As noted in an overexcited but basically correct diary at Daily Kos last week, “‘You can’t handle the truth!’ TMac’s dogs scrub Wikipedia of facts,” supporters of McAuliffe did remove sourced information, none of which has been restored as of this writing.

In the first instance, material about a land deal and disgraced Democratic fundraiser John Huang was removed because it “lacked NPOV” (i.e. it was not written from a neutral point of view), and in the second, material about business deals involving Telergy and inPhonic was removed “for being unsourced.” Well. First, lacking a neutral tone is cause to rewrite a section, but not a reason to delete, and certainly not as a first resort. Second, the inPhonic material was properly sourced, and finding a citation for the Telergy section would have been better than deleting it. On the other hand, this goes both ways: the material was almost certainly added to cast doubt upon McAuliffe’s fitness for office, and according to the discussion page about McAuliffe’s article, much of this criticism popped up just days before the Tuesday primary vote. And so it goes.

So now the Commonwealth turns to the general election where, if Jaquith’s prediction is correct, the articles about Deeds and McDonnell will be both important resources and the locus of battles to establish narratives about each candidate. Indeed, both articles are the top non-official sites listed in Google searches for each candidate’s name. (Another important article will be Virginia gubernatorial election, 2009.)

As yet, Deeds’ article is the better one, in part because of the aforementioned section outlining Deeds’ political positions. His article is also somewhat more active, probably due to the active primary, and more experienced editors working on the page. Recent contributors to Deeds’ page include Virginia resident John Broughton, who literally wrote the book on editing Wikipedia, whereas most recent work on McDonnell’s page has been done from unregistered accounts represented only by the user’s IP address. Jaquith, for his part, has recently edited both.

It’s a good bet that, after the summer, editing on both articles will ramp up as November draws closer. It will be interesting to see how they develop.

Wikipedia On Dead Tree

Posted on June 10, 2009 at 6:35 pm

OK, now this is something else — artist Rob Matthews printed all of Wikipedia’s Featured articles as a 5,000 page book. It’s a great image:

[Image: Rob Matthews’s printed book of Wikipedia’s Featured articles]

Which raises the question — what would a book containing every article from Wikipedia look like?

Meanwhile, Matthews doesn’t offer much explanation for the art or what it is supposed to mean, although he does offer this much:

Reproducing Wikipedia in a dysfunctional physical form helps to question its use as an internet resource.

Hmm… it does? Would printing all of Google’s search results also question its use as an Internet resource? Would printing an image of a sundial question its use as a physical timekeeping device? I love the book as an art piece, but I’m not entirely sold on this point. (No matter what, though, it’s still more constructive than the other Wikipedia art.)

I will drop Mr. Matthews an e-mail and ask both questions — and I’ll update if I find anything out.

The Wikipedia Haters Club

Posted on June 9, 2009 at 8:42 am

Count as one member Examiner.com personal finance columnist Steve Juetten, who writes in a review comparing Microsoft’s newly launched search engine, Bing, with old standby Google:

Before I started the search, I set two rules. First, I was looking for information from reliable sources. As a result, if a search placed information from Wikipedia high on the list, the search engine sank in my review. As with information from any source (human, web or book), trust but verify and Wikipedia is not trustworthy when it comes to your money.

Anyone who spends much time around Wikipedia is pretty familiar with complaints such as these, and to this end the Wikipedia community maintains a page called Replies to common objections. Juetten isn’t quite specific enough for me to highlight a particular section, but I’m pretty sure he will find some answers in the replies to “Wikipedia can never be high quality”.

Meanwhile, a few objections to his objection do occur to me. For one thing, who is to say that other sources will be more trustworthy? Juetten undoubtedly singles out Wikipedia for its high profile, but it’s difficult to see why it should be placed at a disadvantage to About.com, Answers.com or NNDB, all of which can rank well for certain terms.*

Are these other information resources likely to be more reliable? I know of no reason why they should be. And if About.com or NNDB does happen to be wrong, there’s not a thing you can do about it.

Lastly, I agree with Juetten that “trust but verify” is a good personal rule and a sound approach to research, but I don’t understand why he doesn’t extend it to Wikipedia when this is an area in which Wikipedia often shines. One of the site’s core content policies is in fact Verifiability, which requires that articles cite references. But Juetten’s objection becomes even more ironic when you consider that said references are required to meet another core policy: Reliable sources.

Juetten’s worldly cynicism is understandable but, in this case, selectively applied and ultimately misplaced. It is true that Wikipedia is not completely reliable, but it shouldn’t be penalized for being one of the few reference websites that actually admits the fact.

_____
*For example, try searching for Alan Greenspan on Google and Alan Greenspan on Bing. As of this morning, the top three results for each are: Wikipedia, Answers.com and NNDB.

Thoughts on Wikipedia and Scientology

Posted on June 8, 2009 at 9:21 am

[Image: Scientology symbol logo]

It’s unfortunate that The Wikipedian has been in suspended animation for the past week or so, because it has been a big week for Wikipedia in the news. On May 28, Wikipedia’s Arbitration Committee (the court of last resort for Wikipedia disputes) banned all IP addresses associated with the Church of Scientology from editing Wikipedia for repeated disruptive editing and the use of “sock puppet” accounts to tilt Wikipedia consensus on Scientology-related articles.

The decision has been all over the tech and mainstream press since, from The Register’s first report on May 29 to the New York Times finally covering the story this morning. In between, Google News shows hundreds of results about the subject.

I have not looked closely at the decision, but my own general take on it is about what you might expect: Wikipedia reserves the right to regulate its own community and, upon fair consideration, expel those who are determined to prevent Wikipedia from operating. This was certainly the case here, as the deliberations ran for more than six months, reportedly the longest in Wikipedia history. It is not the case, as one Huffington Post contributor erroneously imagined, that “all members” of Scientology were banned from editing. Instead, Wikipedia merely banned IP addresses known to be controlled by Scientology. Any Scientologist can still log on from home and, one expects, have their individual account banned if they too persist in deleting good information that the Church does not like.

This is not the first time Wikipedia has taken such an action, and I’d say it’s easily less controversial than the ban on an entire Utah neighborhood in 2007 for incredibly involved reasons that you can read about here.

If I had an objection it would be that an indefinite block, which is what the ArbCom imposed here, should be less desirable than a period of one year or perhaps even two. However, it is probably the case that a year or two from now Scientology would be just as interested in deleting critical information about their organization from Wikipedia as they are now. And to some extent it is likely to continue in any case.

After all, the flagship Scientology article has a long history as one of the most contentious on Wikipedia. Visit the discussion page, and you’ll find 27 archive pages of discussion stretching back to 2001. (Few articles are so active as to need their discussions archived, and the Roman Catholic Church, with many millions more adherents, has just 26 archived pages of discussion associated with its article.) The very first, undated, comment on the Scientology Talk page was this one:

As in entries on like organizations such as The Local Church of Witness Lee and the Jehovah’s Witnesses, no fair discussion can take place on this topic. If anyone dare edit this article, it will be swiftly and aggressively reverted to reflect only the official point of view of Scientology. Try it.

And in the week since, more than 6,200 words have been expended on the Scientology Talk page, as veteran and newbie editors alike (some of them undoubtedly Scientologists) continue to argue over what the article should say.

Scientology logo via Wikipedia, reused here with the same non-free use rationale.