William Beutler on Wikipedia

Archive for 2012

Disambiguate This!

Posted on April 17, 2012 at 1:05 pm

If the Wikipedia article titled “Wikipedia in culture” is to be believed, the free, online encyclopedia’s primary contribution to popular culture is as a humorous reference, particularly in U.S. cable television programming.

Topic-wise, sometimes the joke relates to Wikipedia’s uneasy relationship to education, including T-shirts featuring leaping graduates thanking Wikipedia. More often than not, Wikipedia’s uneven reliability is the joke, such as The Onion’s classic 2006 article: “Wikipedia Celebrates 750 Years Of American Independence”.

If it has had any noticeable linguistic impact (aside from debate over the meaning of “Santorum”) it is probably in the phrase “Citation needed”. But the word that I wish Wikipedia could popularize is:

Disambiguation

It’s a perfectly cromulent word, and can be found in the dictionary (or at least on Dictionary.com), apparently dating to the 1960s, and unsurprisingly means:

to remove the ambiguity from; make unambiguous

And yet it’s not a word that I can recall having seen prior to Wikipedia, even though I have a degree in English and very nearly earned one in journalism. In a world of ambiguity, what more could we want than disambiguation to help us understand what’s real, and what matters? Well, maybe therein lies the problem: there are no easy disambiguations in the real world. But are they so easy, even on Wikipedia?

If you don’t know what disambiguation is, it’s pretty simple. Wikipedia has articles about many people named John Smith, most real and even some fictional. So many, I’m not even going to bother counting. Because no John Smith is considered vastly more famous than the others, none of them gets this URL:

Nope, that’s the disambiguation page, where one can find, among many others:

And, for fans of The A-Team, there is also:

In many cases, a word will have one primary meaning, and then multiple secondary uses. This is when the parenthetical expression “(disambiguation)” comes in. One example:

Typically, articles requiring some form of disambiguation require a “disambig” note at the top of the page (called a “hatnote”). Frequently, the phrasing is “Not to be confused with…” and here is one example, which I enjoy more than most:

McGraw-Hill disambiguation

As you may expect, there is a lengthy guideline detailing how disambiguation pages are to be governed. But on a website where not everyone knows the rules, nor does everyone agree about the relative importance of similarly named subjects, there can be some glitches. This is especially true when one is being implored by unknown advisers “not to be confused by” a deceptively unrelated topic.

One errant disambiguation comes to mind immediately, because I’m the one who undid it.

First, Bob Dole should be well known to any American over the age of 25, if not for being the Republican presidential nominee in 1996, then perhaps for that one Pepsi ad with Britney Spears. Meanwhile, Robert Dold is a U.S. congressman from Illinois, whom I had never heard of until very recently, although I live in DC and have worked in and around U.S. politics for a decade. (Dold has only been in Washington since 2010, so there’s that.)

Then what explains the admonition not to confuse this:

With this:

Yeah, I didn’t get it either. So I removed the unnecessary disambiguation from Dole’s page, and I seriously doubt anyone has been wondering, “What about Bob (Dold)?”

There are other interesting imbalances, though they are often more justified. As I recently tweeted:

Joe Plummer vs. Joe the Plumber on Wikipedia

Indeed, compare this:

With this:

But I’m sure that’s right. Joe the Plumber is far better known, following his stint as the semi-official mascot of John McCain’s 2008 presidential campaign, than is Joe Plummer, who is probably a swell guy and earns bonus points from me for being from Portland. And with Mr. the Plumber now the Republican nominee to challenge Rep. Marcy Kaptur this fall, it’s looking even dimmer. Sorry, Joe (the Plummer).

But in the world of interesting disambiguations, undoubtedly this one is my favorite:

At least it doesn’t tell you not to be confused.


The Agony and Ecstasy of Wikidata

Posted on April 12, 2012 at 8:31 am

Although Wikipedia is by far the best-known of the Wikimedia collaborative projects, it is just one of many. Just this last week, Wikimedia Deutschland announced its latest contribution: Wikidata (also @Wikidata, and see this interview in the Wikipedia Signpost). Still under development, its temporary homepage announces:

Wikidata aims to create a free knowledge base about the world that can be read and edited by humans and machines alike. It will provide data in all the languages of the Wikimedia projects, and allow for the central access to data in a similar vein as Wikimedia Commons does for multimedia files. Wikidata is proposed as a new Wikimedia hosted and maintained project.

One of a few Wikidata logos under consideration.

Upon its announcement, I tweeted my initial impression, that it sounded like Wikipedia’s answer to Wolfram Alpha, the commercial “answer engine” created by Stephen Wolfram in 2009. It seems to partly be that but also more, and its apparent ambition—not to mention the speculation surrounding it—is causing a stir.

Already touted by TechCrunch as “Wikipedia’s next big thing” (incorrectly identifying Wikipedia as its primary driver, I pedantically note), Wikidata will create a central database for the countless numbers, statistics and figures currently found in Wikipedia’s articles. The centralized collection of data will allow for quick updates and uniformity of statistical information across Wikipedia.

Currently, when new information replaces old, as happens when census surveys, election results and quarterly reports are published, Wikipedians must manually update the old data in every article in which it appears, across every language. Wikidata would create the possibility for a quick, computer-led update to replace all out-of-date information. Additionally, it is expected that Wikidata will allow visitors to search and access information in a less labor-intensive way. As TechCrunch suggests:

Wikidata will also enable users to ask different types of questions, like which of the world’s ten largest cities have a female mayor?, for example. Queries like this are today answered by user-created Wikipedia Lists – that is, manually created structured answers. Wikidata, on the other hand, will be able to create these lists automatically.

Though this project (which is funded by the Allen Institute for Artificial Intelligence, the Gordon and Betty Moore Foundation, and Google) is expected to take about a year to develop, the blogosphere is already buzzing.

It’s probably fair to say that the overall response has been very positive. In a long post summarizing Wikidata’s aims, Yahoo! Labs researcher Nicolas Torzec identifies himself as one who excitedly awaits the changes Wikidata promises:

By providing and integrating Wikipedia with one common source of structured data that anyone can edit and use, Wikidata should enable higher consistency and quality within Wikipedia articles, increase the availability of information in and across Wikipedias, and decrease the maintenance effort for the editors working on Wikipedia. At the same time, it will also enable new types of Wikipedia pages and applications, including dynamically-generated timelines, maps, and charts; automatically-generated lists and aggregates; semantic search; light question & answering; etc. And because all these data will be available as Open Data in a machine-readable form, they will also benefit thrid-party [sic] knowledge-based projects at large Web companies such as Google, Bing, Facebook and Yahoo!, as well as at smaller Web startups…

Asked for comment by CNet, Andrew Lih, author of The Wikipedia Revolution, called it a “logical progression” for Wikipedia, even as he worries that Wikidata will drive away Wikipedians who are less tech-savvy, as it complicates the way in which information is recorded.

Also cautious is SEO blogger Pat Marcello, who warns that human error is still a very real possibility. She writes:

Wikidata is going to be just like Wikipedia in that it will be UGC (user-generated content) in many instances. So, how reliable will it be? I mean, when I write something — anything from a blog post to a book, I want the data I use in that work to be 100% accurate. I fear that just as with Wikipedia, the information you get may not be 100%, and with the volume of data they plan to include, there’s no way to vette [sic] all of the information.

Fair enough, but of course the upside is that corrections can be easily made. If one already uses Wikipedia, this tradeoff is very familiar.

The most critical voice so far is Mark Graham, an English geographer (and a fellow participant in the January 2010 WikiWars conference) who published “The Problem with Wikidata” on The Atlantic’s website this week:

This is a highly significant and hugely important change to the ways that Wikipedia works. Until now, the Wikipedia community has never attempted any sort of consistency across all languages. …

It is important that different communities are able to create and reproduce different truths and worldviews. And while certain truths are universal (Tokyo is described as a capital city in every language version that includes an article about Japan), others are more messy and unclear (e.g. should the population of Israel include occupied and contested territories?).

The reason that Wikidata marks such a significant moment in Wikipedia’s history is the fact that it eliminates some of the scope for culturally contingent representations of places, processes, people, and events. However, even more concerning is that fact that this sort of congealed and structured knowledge is unlikely to reflect the opinions and beliefs of traditionally marginalized groups.

The comments on the article are interesting, with some voices sharing Graham’s concerns, while others argue his concerns are overstated:

While there are exceptions, most of the information (and bias) in Wikipedia articles is contained within the prose and will be unaffected by Wikidata. … It’s quite possible that Wikidata will initially provide a lopsided database with a heavy emphasis on the developed world. But Wikipedia’s increasing focus on globalization and the tremendous potential of the open editing model make it one of the best candidates for mitigating that factor within the Semantic Web.

Wikimedia and Wikipedia’s slant toward the North, the West, and English speakers are well-covered in Wikipedia’s own list of its systemic biases, and Wikidata can’t help but face the same challenges. Meanwhile, another commenter argued:

The sky is falling! Or not, take your pick. Other commenters have made more informed posts than this, but does Wikidata’s existence force Wikipedia to use it? Probably not. … But if Wikidata has a graph of the Israel boundary–even multiple graphs–I suppose that the various Wikipedia authors could use one, or several, or none and make their own…which might get edited by someone else.

Under the canny (partial) title of “Who Will Be Mostly Right … ?” on the blog Data Liberate, Richard Wallis writes:

I share some of [Graham's] concerns, but also draw comfort from some of the things Denny said in Berlin – “WikiData will not define the truth, it will collect the references to the data…. WikiData created articles on a topic will point to the relevant Wikipedia articles in all languages.” They obviously intend to capture facts described in different languages, the question is will they also preserve the local differences in assertion. In a world where we still can not totally agree on the height of our tallest mountain, we must be able to take account of and report differences of opinion.

Evidence that those behind Wikidata have anticipated a response similar to Graham’s can be found on the blog Too Big to Know, where technologist David Weinberger shared a snippet of an IRC chat he had with a Wikimedian:

[11:29] hi. I’m very interested in wikidata and am trying to write a brief blog post, and have a n00b question.
[11:29] go ahead!
[11:30] When there’s disagreement about a fact, will there be a discussion page where the differences can be worked through in public?
[11:30] two-fold answer
[11:30] 1. there will be a discussion page, yes
[11:31] 2. every fact can always have references accompanying it. so it is not about “does berlin really have 3.5 mio people” but about “does source X say that berlin has 3.5 mio people”
[11:31] wikidata is not about truth
[11:31] but about referenceable facts

The compiled phrase “Wikidata is not about truth, but about referenceable facts” is an intentional echo of Wikipedia’s oft-debated but longstanding allegiance to “verifiability, not truth”. Unsurprisingly, this familiar debate is playing itself out around Wikidata already.

Thanks for research assistance to Morgan Wehling.


Public Lives: Jim Hawkins and Wikipedia’s Privacy Dilemma

Posted on April 6, 2012 at 9:15 am

Editor’s note: The author of this blog post is Rhiannon Ruff (User:Grisette), a friend and colleague, in what I hope is a continuing series. The Wikipedian published a previous guest blog post in December 2011.

Introduction to Jim Hawkins Wikipedia article.

As an occasional Wikipedian, I like to check out Jimmy Wales’ user Talk page every now and again; while user Talk pages are generally where editors leave messages for each other, notes of support, or even warnings, Jimbo Wales’ page is a hot-bed of intrigue, gossip and debate. It’s Wikipedia’s water cooler. And it’s the perfect place to go if you’re looking to find an example of the confusion that can result from the occasional collision of hot-headed editors, complex guidelines and individuals who are themselves the subjects of articles. Just today I came across a discussion that mentioned Jim Hawkins, a radio-presenter in the UK who has been struggling to deal with Wikipedia editors, and Jimmy himself, over privacy issues raised by his biographical article.

Contrary to what many people believe, the Wikipedia community and Wikimedia Foundation are very keen to protect individuals’ privacy. There’s a common misunderstanding that if you edit Wikipedia, anyone can find out who you are—an idea proliferated by media coverage of incidents where editors’ IP addresses were traced and companies outed for editing their own articles (or, worse, those of competitors). But there’s actually a simple solution: creating an account on the site hides your IP address when you edit. And as long as you only edit while logged into that account, there’s no way for anyone to find out who or where you are through your IP. There are also very strong rules against “outing” the real life identities of editors by posting their personal information on the site.

But what if you’re the subject of a Wikipedia article? Getting back to Jim Hawkins, here’s the real dilemma that people in the public eye are faced with: anyone can create an article about them, but how do they go about preventing their personal details from being included in it? Hawkins certainly wasn’t happy about the creation of an article about him, and he was even less impressed that it included details such as the county where he lives and his exact birthdate. He’s been trying to get the article deleted for five years now. Over time, his frustration in dealing with the Wikipedia community has led to increasing antagonism on both sides.

After a recent “edit war” where his birthdate was repeatedly added and removed, the date was removed once and for all after an official request was made on behalf of Hawkins. The edit was made in line with a privacy policy that allows subjects of biographical articles to request the removal of their date of birth from the site. But, the county remained and Hawkins continued to rail against the system on the article’s Talk page:

Why should the people who’ve been stalking, bullying and harassing me – and have been doing so again today! – have any say in what happens to the article?
Hooray for policies. Does common human decency come into this anywhere? Or am I going to get the same response I’ve had for five years, the borderline-fundamentalist ‘that’s not how Wikipedia works’?

In a lively discussion on Jimmy Wales’ User Talk page beginning on April 1, editors were divided over two issues:

  1. Should an individual who is on the cusp of notability (i.e. just about eligible for a Wikipedia article, according to guidelines) be allowed to choose whether or not they have an article?
  2. If personal information about a subject has been published in public sources, does it contravene Wikipedia’s privacy rules to include it in the article?

There’s no simple answer to either of these. The first one in particular is really rather tricky. It’s true that if an article about someone hasn’t been created, there’s nothing that says that it has to exist. If an article has been created, though, it isn’t clear whether there should be the option to delete if the subject isn’t very strongly notable. Wikipedians seem to fall into roughly two camps on the issue: those with sympathy towards article subjects and those who are concerned with ensuring that information is available on Wikipedia, if sources exist to support it.

The main question that Hawkins raised was why there had to be an article about him, if he felt that it was unnecessary, inaccurate and infringed upon his privacy. At one point in discussion he asks:

Can I point out that the whole damn thing is an invasion of privacy?

And an experienced editor replies, summarising the crux of the issue here:

An invasion of privacy is, by definition, the release of private information. This information, however, is not private, but is stated by the subject in the very show he hosts.

So, the issue is: if information exists in the public sphere, why should it not be included in a Wikipedia article? The details are already out there, some editors argue, so adding them to a Wikipedia article can’t be infringing on the subject’s privacy, as the information wasn’t private to begin with. The bright line that exists on Wikipedia is its governing principle of verifiability: information included in articles must always be verifiable, that is, it must be supported by reliable sources. So, if personal information about a subject isn’t supported by a reliable source, even if it’s true, it can’t be included. Unfortunately, as Hawkins has discovered, if the information does appear in a reliable source (in this case, in a local magazine and on the BBC website), whether it is included or not comes down largely to editors’ discretion.

In short, the lesson Jim Hawkins has learned the hard way is: if you don’t want something included in your Wikipedia article, make sure it isn’t published in the first place.


Death of a Wikipedian

Posted on March 23, 2012 at 3:10 pm

Public memorials are a phenomenon found in every society and subset: from war memorials to police memorials and semi-permanent ghost bikes to impromptu, impermanent flower displays, mourning and remembrance are universal. Wikipedia is no exception.

Since early 2006, Wikipedia has maintained a public memorial page called Deceased Wikipedians. While public in the sense that it is accessible by anyone, it is perhaps useful to think of it as semi-public in that it’s not part of the actual encyclopedia. You won’t pass by it on your way to work, or to reading about (let’s say) the Syrian uprising. To date, 39 late Wikipedians have been added to the English version of this page. Fourteen other language editions have their own versions, including the German, French and even Esperanto editions.

The first added to the English-language Wikipedian memorial was Caroline Thompson, an Australian physics enthusiast who worked on articles about quantum mechanics. Afterward, other names were filled in. The earliest currently listed was a French editor using the handle Treanna, who died in late summer 2005. Considering Wikipedia began in early 2001, surely some others passed before him, but we may never know who they were.

On a website where anonymity is granted to anyone who desires it, determining that an absent editor is deceased and not just one who has drifted away is a matter of luck, and sometimes detective work. The inclusion of an editor named Xulin depended on the synthesis of available information on external websites. He contributed primarily to the French-language Wikipedia, and a candlelight vigil of sorts remains in his userspace there.

The criteria for inclusion aren’t crystal clear, but the top of the page does give this advice:

People in this list are remembered as part of the Wikipedia community: they have made at least several hundred edits or are otherwise known for substantial contributions to Wikipedia.

The names included do not appear to have been controversial to this point, although one stands out as different from the others: John Patrick Bedell, known less for his contributions as JPatrickBedell and more for his disturbing role in the 2010 Pentagon shooting (which I wrote about at the time: “John Patrick Bedell: Pentagon Shooter, Wikipedian”).

Two other deceased editors are the subjects of Wikipedia articles based on contributions to their fields outside of Wikipedia: Tron Øgrim, a Norwegian journalist and activist, and Steven Rubenstein, an American anthropologist.

The most recent addition is a young man named Ben Yates, better known around the site as Tlogmer, who passed away earlier this month. An active contributor from October 2003 to October 2008, he was known for several remarkable contributions to the community. These included the original design for the logo of Wikipedia’s annual gathering, Wikimania, still in use to this day. He was also a co-author of the book How Wikipedia Works: And How You Can Be a Part of It, published in 2008 (free web version here). On a humorous note, he was the originator of the Wikipedia article “Metrosexual”. He also created some hilarious (to a Wikipedian) bumper stickers, which seem to be still available.

Of particular interest to me, he was also at one point the author of a blog about Wikipedia, simply called Wikipedia Blog. Yates’ self-selected favorite posts were three: “The Future of Open Source”, about Wikipedia and Linux; “Wikipedia helps show the economic value of social interaction”, about just what it sounds like; and “Wikipedia and COMMUNISM!”, ruminating on Wikipedia’s comparison to various “isms”. In the last one, he wrote:

Wikipedia will never fade away … its memories will not die with its members. As an open source project, it can always be forked, tweaked, sifted through various filters, read and written anew.

Very well said, and correct he was. So it goes.


Regarding the Uncertain Future of Encyclopædia Britannica

Posted on March 14, 2012 at 5:01 pm

Yesterday, Encyclopædia Britannica made the startling announcement that it would discontinue its print edition after 244 years. Once the current edition has sold out, the sets will become collector’s items, which is essentially what they are now, if it’s not too uncharitable to point out. Britannica is not finished as an operation, however: it will continue to publish on the web. It’s a startling announcement, sure, but it makes more sense than if it went on as if nothing had changed. Britannica’s editors acknowledged as much in a post on their blog:

A momentous event? In some ways, yes; the set is, after all, nearly a quarter of a millennium old. But in a larger sense this is just another historical data point in the evolution of human knowledge.

But Britannica’s grip on the evolution of human knowledge isn’t what it used to be—you can see where I’m going, right? As a well-known quote from Jimbo Wales goes:

Imagine a world in which every single person on the planet is given free access to the sum of all human knowledge. That’s what we’re doing.

Since its launch in 2001, and especially since a (much-debated) 2005 Nature article comparing the two, Wikipedia has been a thorn in Britannica’s side. And its influence has long since surpassed that of its much older rival. A Quantcast comparison suggests that Wikipedia’s traffic is 30x that of Britannica. And as I tweeted last night, news organizations have been quick to note the competition.

Under the title “Death By Wikipedia: Encyclopedia Britannica Stops Printing”, ReadWriteWeb observes:

The usefulness of such reference materials has been on the decline for years, especially since the advent of Wikipedia. Whatever flaws its open, crowd-sourced editorial model may invite, Wikipedia is generally regarded as a comprehensive and mostly-accurate source of information, which can be accessed for free.

And in a Venture Beat article titled “Encyclopaedia Britannica wiped out by Wikipedia, selling final print edition” we find:

The extremely thorough Wikipedia article on Encyclopaedia Britannica … serves as the perfect example of why Wikipedia is coming out on top.

It’s true—Wikipedia’s article about Encyclopædia Britannica is very thorough. Britannica’s article about Wikipedia is not bad, but it is far more limited than Wikipedia’s article about itself, and Britannica has those annoying pop-up advertisements that do nothing for readers.

Yet Britannica president Jorge Cauz tells The Washington Post:

This has nothing to do with Wikipedia or Google. … This has to do with the fact that now Britannica sells its digital products to a large number of people.

This is a little bit like Microsoft saying Windows 8 has nothing to do with the iPad, merely the shift in consumer purchasing habits toward the tablet and mobile markets. That’s not to say the statement is necessarily untrue, just that it’s incomplete. I don’t know a great deal about Britannica’s current business model, but it’s safe to say that non-print revenues have become far more important as Britannica’s print sales have fallen. Whether they will succeed is another question; PC World doesn’t think so, pointing out the closure of (speak of the devil) Microsoft’s online encyclopedia Encarta in 2009 (which I wrote about at the time):

Microsoft shuttered its digital multimedia encyclopedia, Encarta, in 2009, and the last trace of it, the online dictionary, closed last year. Encarta, though a digital product, was also made obsolete by Wikipedia’s free availability, constantly updated content and thousands of editors, contributors and volunteers from around the world.

At The Atlantic, expert on evolution and Bloggingheads impresario Robert Wright offers this (small) consolation:

Maybe, long after even the electronic edition of Britannica is gone, the idea of Britannica can remain for us what it once was for me–a kind of Platonic ideal that we aspire to evolve toward even if we can never reach it, something that has a kind of reality even if we can never touch it.

As someone who devoured Britannica in my school library when growing up, not to mention someone who relied on Britannica as a college student in the late 1990s (before Britannica added a pay wall), much the same way as students today (notoriously) rely on Wikipedia, I’m sorry to see it go. But we no longer live in a world where a 30,000-page, 32-volume encyclopedia can be printed on an annual basis for profit. In fact, even Britannica sees itself as a collector’s item now; as Cauz tells the News Observer:

This is going to be as rare as the first edition, because the last print run of our last copyright was one of the smallest print runs.

I’d love to own one myself, but at $1,395.00 for the “Final Print Edition”, I’m afraid I’ll have to pass. And perhaps Cauz is wrong; maybe the death of Britannica will be more like the Death of Superman.


Verifiability and Truth: What John Siracusa Doesn’t Get About Wikipedia

Posted on February 2, 2012 at 6:50 pm

One of my favorite podcasts is Hypercritical, co-hosted by and principally featuring the thoughtful criticisms of John Siracusa, a sometime columnist for Ars Technica and Internet-famous Apple pundit. The show’s tagline calls it: “A weekly talk show ruminating on exactly what is wrong in the world of Apple and related technologies and businesses. Nothing is so perfect that it can’t be complained about.” Last week’s edition—“Marked for Deletion”—was about something far from perfect, but of great interest to this blog: Wikipedia.

If you want to listen for yourself, jump to about 1:11:55 (yes, more than an hour into the show) where Siracusa and co-host Dan Benjamin turn the discussion to Wikipedia. And a warning: this is going to be long. Consider it homage.

♦     ♦     ♦

Promisingly, Siracusa begins by asking his co-host to answer, if he can, “what Wikipedia is”. The answer is pretty good for an outsider: it’s a place for sharing information and collaboratively building a resource for (hopefully) accurate information on almost any topic. In general, this will do. But it’s not quite right, as Siracusa explains by recounting his personal experience of trying, in vain, to defend an article from deletion. With five years to reflect on it, Siracusa describes his efforts as a “prototypical example of someone who does not understand what Wikipedia is, proving that he does not understand what Wikipedia is.”

All of this is a way of getting to Siracusa’s fascination—one might say morbid fascination—with Wikipedia’s policy of “Verifiability”. The first paragraph of the policy says:

Verifiability on Wikipedia is the ability to cite reliable sources that directly support the information in an article. All information in Wikipedia must be verifiable, but because other policies and guidelines also influence content, verifiability does not guarantee inclusion. The threshold for inclusion in Wikipedia is verifiability, not truth—whether readers can check that material in Wikipedia has already been published by a reliable source, not whether editors think unsourced material is true.

Or as Siracusa summarizes it: “Something can be as true as you want it to be, if it is not verifiable, it doesn’t go in.” Well said.

He also discusses the related policy of “No original research”. This includes a good explication of the different types of sources that may or may not be used on Wikipedia: primary sources (original documents and first-hand accounts), secondary sources (news articles interpreting primary sources) and tertiary sources (encyclopedias and academic articles summarizing the former). This is advanced stuff, and for a longtime Wikipedian, it’s no small thrill to hear a smart outsider explain why secondary sources are preferred, and work through the fundamental policies of Wikipedia. Siracusa correctly observes: “Wikipedia is not a place where you write down stuff that you know. … Wikipedia writes about other people writing about things.”

Except here’s the thing: Siracusa understands Wikipedia’s core content policies. He just doesn’t like them.

In his particular example, a former standalone article called FTFF (here’s what it used to look like) didn’t survive the process not because it wasn’t true, but (he says) because it contained material that wasn’t verifiable, and constituted original research. This is partly true, but it owes more to a guideline that got only passing mention on the show (and, frankly, in the deletion debate): “Notability”, and specifically the “General notability guideline”. It’s closely tied in with WP:VERIFY and WP:ORIGINAL, and basically says that a topic must have sufficient coverage in secondary sources to be given its own standalone page. FTFF did not, and the result of the debate was to merge the topic to Finder_(software)#Criticism.

Anyway, this pedantry about WP:NOTE and WP:GNG doesn’t affect Siracusa’s main point: If something is true but unverifiable, he would like to see it included in Wikipedia anyway. Nor does it affect his corollary argument, that Wikipedia’s complex rules discourage many would-be participants.

He’s undoubtedly right about the second point: many people try to get involved with Wikipedia who have no idea what it’s really about, and they tend to have a really bad experience. Wikipedia struggles to explain itself to outsiders, and it probably always will.

As to the former, the problem is that he fails to grapple with the implications of the Wikipedia he describes, and this is disappointing. By privileging “truth” above “verifiability”, one gets the impression he’s describing a Rashomon-like Wikipedia where all possible viewpoints are explored, and somehow eventually Wikipedia just makes the right call. This assumes a lot, not least that contentious topics wouldn’t simply devolve into edit wars of unchecked aggression. In a world where Wikipedia aims for truth but eschews verifiability, there are no footholds upon which to steady an argument. There is no way to know what should be considered credible or otherwise.

At times it actually sounds like he’s advocating something that already exists: reliance on “Consensus” for determining how Wikipedia will address the topics it covers. Wikipedia policies and guidelines don’t cover everything, and this is where consensus steps in, however imperfectly. If you’ve ever wondered why there is sometimes an observable discrepancy in the depth or quality of coverage between topics, consensus is the big reason why, and more so the self-selection that shapes consensus. The current, real-world Wikipedia refers to outside authorities as well as consensus among editors; Siracusa’s Bizarro World Wikipedia would jettison the former and rely solely on the latter.

Meanwhile, Siracusa ascribes Wikipedia’s Byzantine rule structure to Wikipedians’ desire for approval from educators and academics, which he thinks is holding back Wikipedia from what it could become. He repeatedly says “Wikipedia should be something different” and refers to “what’s different about online” but he never gets prescriptive and never actually says why the old methods are outmoded. He does say his Wikipedia would seek to “arrive at truth using every tool necessary” and would, for example, allow original research… but what then is the mechanism for (dare I say) verifying it?

At one point, Siracusa compares the popular, widely-viewed Ars Technica forums to a hypothetical low-circulation print magazine, and complains that the widely-read former site is an invalid source while the unpopular latter publication is acceptable. It’s true that Wikipedia does not necessarily take a populist approach to evaluating sources, but he’s far off the mark in his attempt to explain this: “They’re not cool with the old librarians, because they’re not paper.”

I hope that he was just being lazy and doesn’t actually think that Wikipedia editors prefer paper (if anything they actually prefer online sources, which are easier to check) but he completely misses a key dynamic that ties back to verifiability: the paper magazine with poor circulation at least will have editors who are presumed to care about fact-checking and accuracy. A web forum, however popular it may be, may have moderators, but that’s not the same thing as having an editor. A discussion group is not an editorial operation, period. The forum is a primary source, and so should only be used to support reliable sources.

There are, however, reliable web sources. One of them is the editorial side of Ars Technica; no less an authority than John Siracusa has been cited in approximately 150 different Wikipedia articles about the Macintosh and other technology subjects.

♦     ♦     ♦

I’m sorry to say this, but in the show’s last fifteen minutes, Siracusa pretty much descends into total incoherence. Here’s his summary statement, close to verbatim:

[There are] many flaws in verifiability and reliability of sources. It’s built on a foundation of sand. Notability, what’s a reliable source, those things become so key to making Wikipedia crappy or good, and those sands are constantly always shifting, you know? And so if Wikipedia was centered on truth and that was its final goal, yeah, it would have to include citations and verifiability and stuff like that, but there would never be any argument when the two are in conflict. You know, if you could prove that a series of events happened here, then you could say, well, it’s verifiable, it appeared in a reliable source, but it’s not the truth. And so therefore we should expunge that. Because the final goal of Wikipedia is truth. But the final goal of Wikipedia is not truth, it’s verifiability.

There would “never be any argument” about what is the truth? In the parlance of Wikipedia: [citation needed].

Look, this is an epistemological issue, one much larger than just Wikipedia. The reason Wikipedia’s goal is verifiability, not truth, is that verifiability is an achievable goal. In fact, verifiability is a necessary step toward establishing truth, as Siracusa at this point seems to acknowledge in his imagined alternate, truth-seeking Wikipedia.

It’s not that Wikipedia is actively hostile to the truth: it’s just agnostic as to what it might be. Wikipedia articles are like road signs; truth itself may be unknowable, and we may never arrive at our destination, but Wikipedia can point in the right direction. Wikipedia’s policies and guidelines are designed to make sure that its content does that, although it’s fair to acknowledge that it’s not guaranteed. But what is? And what is truth?

Anyway, there’s a user essay on Wikipedia called “Verifiability, not truth” that says this better than I am going to. Here’s the key point:

That we have rules for the inclusion of material does not mean Wikipedians have no respect for truth and accuracy, just as a court’s reliance on rules of evidence does not mean the court does not respect truth. Wikipedia values accuracy, but it requires verifiability. Unlike some encyclopedias, Wikipedia does not try to impose “the truth” on its readers, and does not ask that they trust something just because they read it in Wikipedia. We empower our readers. We don’t ask for their blind trust.

If you want to upset the old system and do something new, you actually do need to think through what should replace it. Siracusa never does.

If he thinks Wikipedia’s adherence to “old world” rules is driving away contributors, he should consider what the free-for-all alternative would look like. It isn’t a Wikipedia I would spend any time with, it’s not one that Google would be eager to rank so highly, and it wouldn’t be the most important reference site on the Internet.


Wikipedia Gets on its SOPA Box

on January 17, 2012 at 9:46 am

Wikipedia SOPA blackout announcement
The Wikimedia Foundation announced on Monday that the English-language Wikipedia will go offline for 24 hours, starting at midnight tonight on the East Coast, in protest of the Stop Online Piracy Act (SOPA) and a related bill, the PROTECT IP Act (PIPA). The move follows a similar protest by the Italian-language Wikipedia last year, protesting proposed anti-privacy laws in Italy.

Over the past week, volunteer Wikipedia editors debated the proposition and ultimately decided to go forward. The decision was accepted by the Foundation, which will implement it late tonight. An official public explanation includes the following:

Over the course of the past 72 hours, over 1800 Wikipedians have joined together to discuss proposed actions that the community might wish to take against SOPA and PIPA. This is by far the largest level of participation in a community discussion ever seen on Wikipedia, which illustrates the level of concern that Wikipedians feel about this proposed legislation. The overwhelming majority of participants support community action to encourage greater public action in response to these two bills. Of the proposals considered by Wikipedians, those that would result in a “blackout” of the English Wikipedia, in concert with similar blackouts on other websites opposed to SOPA and PIPA, received the strongest support.

The decision is not one that all are happy about. After all, Wikipedia’s core content guidelines emphasize a Neutral point of view in its approach to encyclopedia topics, so isn’t this a questionable decision?

Just this morning, a participant on a Wikipedia-related discussion group wrote:

Now that we have taken the necessary first step to regard the English Wikipedia and other Wikimedia projects as high-profile platforms for political statements, we ought to consider what other critical humanitarian problems we could use our considerable visibility and reputation to address. We could draw attention to the crises in Sudan or Nigeria, drone attacks against civilians in Afghanistan, the permanent occupation of the Palestinian territories, the Iranian effort to develop nuclear capabilities, police misconduct in virtually any country, the treatment of women and women’s rights in Saudi Arabia and elsewhere, and the list could go on and on.

Well, considering that it was a matter of debate, it surely is questionable and does not reflect the views of all Wikipedians. But I think it’s also fair to say that it reflects the majority of participants.

Wikipedia has its philosophical roots in the free software movement, which is the very antithesis of what SOPA and PIPA are about, so this particular viewpoint should surprise no one. Meanwhile, Wikipedia is well aware that it has its own systemic biases and has organized a project to answer them. In this case, however, Wikipedia’s bias shows through and most participants find this to be a good thing.

I’ll have to put myself more in the skeptic’s camp—not because I support SOPA, which I’m pretty sure I don’t—but because I would prefer that Wikipedia not become a platform for political activism. That said, I don’t think it will lead to similar efforts in the near future and, considering it’s already received significant news coverage, I think there is no question it will be effective in raising awareness about the issue.

For Wikipedians who are uncomfortable with the effort, there’s not much else to do. The band they’re in is playing a different tune, and we’ll see you on the dark side of the Wikipedia blackout.
