William Beutler on Wikipedia

Archive for the ‘First Versions’ Category

History in the Making: The Tumblr That Explains Where Wikipedia Articles Come From

Posted on December 10, 2014 at 8:15 am

Journalism is the first rough draft of history, as the shopworn phrase goes, and it’s a clever one, but it’s never seemed quite right to me. Daily journalism is the reportage of events which may or may not be deemed worthy of reflection and remembrance; it’s in the subsequent commentaries and essays, and even supposedly neutral online encyclopedias, where “History” begins to come together.

So I’m with John Overholt, a curator at Harvard’s Houghton Library, who earlier this fall launched a “concept Tumblr”[1] as a personal project, devoted to the first versions of Wikipedia entries: First Drafts of History. The idea is dead simple and all but infinitely replicable: for every subject Wikipedia covers, there was once a first version of its entry, and it’s just three clicks away from any Wikipedia article, so long as you know which three.[2]
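For the programmatically inclined, the same lookup can probably be done without any clicking at all. What follows is only a minimal sketch, not anything Overholt uses: it assumes the public MediaWiki query API on English Wikipedia, and the function name `first_revision_query` is my own invention. It only builds the request URL; fetching it and reading the JSON is left to you, and the result can only ever be the oldest revision the database still holds.

```python
from urllib.parse import urlencode

# Assumption: the standard MediaWiki API endpoint for English Wikipedia.
API_ENDPOINT = "https://en.wikipedia.org/w/api.php"


def first_revision_query(title: str) -> str:
    """Build an API URL asking for the oldest stored revision of an article.

    Hypothetical helper for illustration; it performs no network access.
    """
    params = {
        "action": "query",
        "prop": "revisions",
        "titles": title,
        "rvlimit": 1,        # only one revision is wanted
        "rvdir": "newer",    # list revisions oldest first
        "rvprop": "ids|timestamp|user|comment",
        "format": "json",
    }
    return f"{API_ENDPOINT}?{urlencode(params)}"


if __name__ == "__main__":
    print(first_revision_query("Metal umlaut"))
```

Asking for a single revision with the direction set to oldest-first is the API’s rough equivalent of “View history” > “Oldest” > first time-stamped entry.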

That’s where Overholt began, as he told me last month: “I was suddenly struck by how interesting and unusual it is that Wikipedia’s entire (or mostly so) history is easily available and that you can peel back the layers of each article to its genesis. As someone with a keen interest in history, that’s very appealing to me, and I was curious to know what the articles were like in those early stages.”

[Image: Radiohead on Wikipedia]

Funny enough, this is close to an idea that I once started to explore, in a post on this very site. Way back in May of 2009 I copied the text over from the first version of the entry about the rock band Radiohead and used it to muse about how Wikipedia’s standards have changed. I announced it as the first in a series, but I never did it again. Ideas are cheap, execution is what matters, and Overholt is executing it like crazy. Every single day he posts screenshots with links to the article and its first version, often matching entries to the calendar (Black Friday (shopping) on November 28) or focusing on pop culture goofery (Metal umlaut).

And looking back at the origins of entries reveals something about where Wikipedia came from. The second paragraph of the first Merlot article describes the varietal in three succinct sentences before concluding: “Merlot is also the name of an XML Editor….:-).”

Very, very early articles, such as the first draft about Venezuela, are just one sentence. Others are written in shorthand, omitting direct references to the subject in a long-abandoned style, e.g. Putin: “Born October 7, 1952… KGB officer from 1975 to 1992…” and so forth.

[Image: iPhone on Wikipedia]

It also offers glimpses into recent-but-forever-ago history, when Facebook was Thefacebook.com, and the iPhone was just a nickname for an Apple partnership with Motorola (later redirected to Motorola ROKR, at least for a time), then rendered “IPhone” due to limitations of the software. This first version concludes: “Note of author : please rewritting my article in a correct english. thank you”

I asked Overholt what his take on all of this was, and I’ll do no better than by quoting him at length:

Obviously it’s funny when articles have a really eccentric start, or a tone that’s very different from the standard style of Wikipedia today, but the thing I’m really struck by is how ambitious and difficult a task it is to think about, in essence, organizing all knowledge. It’s a problem that historians and philosophers have grappled with for centuries. I was tickled by the Pastrami article I posted the other day, which had the edit summary “What can one say about pastrami?” What indeed! But the important thing is thinking to say anything about pastrami at all. The genius of Wikipedia is that it didn’t really stop to solve the overarching problem of how to organize all knowledge first (because it’s all but unsolvable) but rather decided, “Well, we’ll just start with something, and hopefully make that something better little by little.” So even if the first draft of an article is terrible, it’s already done the very hardest thing just by existing.

[Image: Pastrami on Wikipedia]

What else I think is important about it is that it might help to demystify the Wikipedia process, even if just a bit. Many readers have no idea how articles are written, and few probably ever think about what they once looked like, or what the best version may be.

An example I’ve considered with friends: do you prefer the version of the Wikipedia article Dog from 2014 or the article Dog from 2004? I’ll still take today’s entry for a number of reasons, but a decade ago it was arguably more accessible, and about one-quarter the size.

It makes you wonder: what should a Wikipedia article be? What’s the ideal Wikipedia article? The answer to that has changed over time, and probably will keep changing so long as it’s an active project. Reminding readers that Wikipedia once was very different is a good way to remind them that it can still be better.

All images ultimately via Wikipedia.org; first and third courtesy of Overholt.

Notes

1. I’m coining that, by the way
2. “View history” > “Oldest” > First time-stamped entry

The Earliest Known Record of Wikipedia Journalism

Posted on October 12, 2010 at 6:56 am

I’d gotten to wondering, recently, what was the first time Wikipedia was mentioned by a media source? The project began in January 2001, but I’m sure I wasn’t aware of it until sometime in 2003 at the earliest. I have no memory of first learning about it — only a recollection that sometime in the middle of the last decade, I was spending hours and hours, and entire days on some weekends, reading Wikipedia. I wasn’t too curious about where it came from then, but over the last few years, I clearly have been.

So I did what anyone with access to an online news database would do: I looked it up. And the winner appears to be a July 1, 2001 article in the Australian edition of PC World, by one Aldis Ozols. Here it is, in its entirety:

Roll-your-own fount of knowledge: www.wikipedia.com.; editor’s choice.

“A wiki is a collection of interlinked Web pages which can be visited and edited by anyone” goes the definition by Wikipedia. Rising to the challenge, I edited the page on which this statement was made, and behold, my contribution (all two words of it) became part of the Wikipedia.

This is a collaborative project intended to produce a usable encyclopedia through the efforts of many volunteers who surf in from the Net. While this makes it superficially similar to Everything2 (see June issue), there are differences. For instance, Everything2 seeks to be a live, interactive community as well as a reference, whereas Wildpedia [sic] has a more modest goal: to create a freely distributable 100,000 page encyclopedia online. In addition, where Everything2 has a complex system of user ranking and moderation which attempts to grade contributions and their authors, Wikipedia is wide open. Anyone can rock up and modify existing entries, or create new ones as I did.

Astonishingly, the result is not a pile of chaotic nonsense, as one might expect. Perhaps that’s because the project is still small, with only 6000 pages of text and a few dozen contributors, but something more seems to be at work here. Evidently, articles that start off with a one-sided viewpoint are edited and re-edited until they settle into a kind of consensus with which most people are satisfied. In anycase, this is an interesting experiment containing some surprisingly accurate articles.

Surprisingly prescient, if you ask me. Or perhaps just lucky — many a website that garners positive reviews in its early going nonetheless still folds, or descends into chaos. In any case, I’m surprised to find this article is not online — if I’d been first to report on Wikipedia, I’d want to take credit for the fact.

Looking a little further, it seems that most of the Anglosphere reported on Wikipedia before anyone in the U.S. had anything to say about it: England (London Free Press), Canada (Edmonton Sun), Wales (Wales on Sunday) and Northern Ireland (Irish News) all got there first.

Stateside, the first press mention of Wikipedia was in the Gray Lady herself, the New York Times, by someone named Peter Meyers. This story is online, so I will simply quote the lede (sorry, non-journos) and call it good:

Fact-Driven? Collegial? This Site Wants You

FOR all the human traffic that the Web attracts, most sites remain fairly solitary destinations. People shop by themselves, retrieve information alone and post messages that they hope others will eventually notice. But some sites are looking for ways to enable visitors not only to interact but even to collaborate to change the sites themselves.

Wikipedia (www.wikipedia.com) is one such site, a place where 100 or so volunteers have been working since January to compile a free encyclopedia. Using a relatively unknown and simple software tool called Wiki, they are involved in a kind of virtual barn-raising.

Their work, which so far consists of some 10,000 entries ranging from Abba to zygote, in some ways resembles the ad hoc effort that went into building the Linux operating system. What they have accomplished suggests that the Web can be a fertile environment in which people work side by side and get along with one another. And getting along, in the end, may ultimately be more remarkable than developing a full-fledged encyclopedia.

For the curious, here is what the ABBA entry looked like on the day the story ran, and here it is today. And here is something close to what the zygote article looked like then, and what it looks like now. One wonders what it will look like in another ten years.

Update: In the comments, Graham87 locates the exact zygote entry, from the so-called Nostalgia Wikipedia (a topic worthy of its own post, at some point).

Bill Clinton’s Excellent Adventure

Posted on August 5, 2009 at 11:27 am

Update: Hmm, so it looks like I may have gotten out ahead of the details on this one. See the comments, where fellow Wikipedian Graham87 points out that the current Wikipedia database does not in fact include edits from the early months of Wikipedia. As he points out, here is an earlier version of the Bill Clinton article. And what does that mean for this particular series? Well… at least I will have to select articles from approximately 2002 on.

The 42nd president is enjoying a pretty good week, having returned this morning from North Korea with American journalists Laura Ling and Euna Lee free upon his successful negotiations with Kim Jong-Il. This seems as good a moment as any for the second installment in a series on the first versions of major Wikipedia articles.

Bill Clinton left office just five days after Wikipedia was founded in January 2001. Although one might think this would make him a strong candidate for being one of the first articles created, it so happens that no such article was created until November 17 that year. And even then another editor would not contribute again for nearly another month — coincidentally the same day a Wikipedia article was created for his successor.

The first version of the Bill Clinton article was fairly substantial: 979 words excluding the Table of Contents. This is less than a tenth of the 9,900-some words of the Bill Clinton article today (to say nothing of the many peripheral articles such as Electoral history of Bill Clinton), but it’s still pretty good.

Here is the first paragraph (of a much longer intro) today:

William Jefferson “Bill” Clinton (born William Jefferson Blythe III, August 19, 1946)[1] served as the 42nd President of the United States from 1993 to 2001. He was the third-youngest president; only Theodore Roosevelt and John F. Kennedy were younger when entering office. He became president at the end of the Cold War, and as he was born in the period after World War II, he is known as the first Baby Boomer president.[2] His wife, Hillary Rodham Clinton, is currently the United States Secretary of State. She was previously a United States Senator from New York, and also candidate for the Democratic presidential nomination in 2008. Both are graduates of Yale Law School.

Here is the first paragraph (of a much longer intro) then:

William Jefferson Clinton (Democrat) was the 42nd President of the United States, from 1993-2001. He was born August 19, 1946 in Hope, Arkansas. He was named after his father, William Jefferson Blythe II, who had been killed in a car accident just three months before his son was born.

In the original version, the Lewinsky scandal is handled in two short paragraphs in the intro section; by now Lewinsky and the subsequent impeachment trial have two short sections which link away to very comprehensive articles of their own.

While Wikipedia today strives to be non-partisan and avoid self-references, these concepts were less-developed early on, and this can be seen in how the original version closed. The last proper article sentence concluded:

There’s a great deal more to be said about him — let’s try to keep it non-partisan and encyclopedic.

And a deprecated link to the Talk page, at the time included in the text of the article itself, said:

/Talk (go ahead and be partisan there)

Not to worry — eight years later, they still are.

Jigsaw Falling Into Place

Posted on May 3, 2009 at 11:52 am

I’m starting a new occasional series of posts here today, showing what the very first versions of different Wikipedia articles looked like, one or two or a few at a time. After all, even the best had to begin somewhere, and it’s highly unlikely that any of them was delivered to Wikipedia as a fully formed article. This is partly because standards have improved over the years, but also just because of the nature of the wiki: most editors add just a little at a time, but over time those little bits and pieces turn into a complete article.

The first example is about Radiohead, the favorite rock band of yours truly since The Bends in 1995. The article today is ranked among Wikipedia’s best, and earlier this year was a Featured article, meaning featured on Wikipedia’s main page. But it wasn’t always so. Without further ado, here is the very first version of the Radiohead Wikipedia article from February 7, 2002:

Radiohead, British rock band.

Shot to critical acclaim with their third album, OK Computer, one of the best albums of the late nineties.

Others include:

* Pablo Honey
* The Bends
* OK Computer
* Kid A
* Amnesiac
* I Might Be Wrong (Live recordings)

Other decent artists include PJ Harvey, U2, Nirvana, and more recently, Ryan Adams.

Seriously, Ryan Adams? (Note: The original title for this post was I Might Be Wrong.) The notion that Nirvana, U2 or Radiohead may only be “decent” artists is amusing, too.

You may have also noticed that this version of the article would absolutely violate Wikipedia’s NPOV guideline as it stands today, which prohibits editors from injecting their own opinions into Wikipedia articles. But it would also have run afoul of the much simpler guideline as it existed then, under the principal authorship of Wikipedia co-founder Larry Sanger and The Cunctator, one of Wikipedia’s most veteran editors.

I wholeheartedly agree that OK Computer is one of the best albums of the late 1990s; today that view appears in the article attributed to the music critics who expressed it, and in the section header, which currently reads:

OK Computer, fame and critical acclaim (1996–1998)

That works for me.