William Beutler on Wikipedia


How Wikipedia is Covering the Coronavirus Pandemic

Posted on April 15, 2020 at 12:12 pm

Words fail to capture the significance of the ongoing global coronavirus pandemic: the suffering of the disease’s victims, the pain of their loved ones, and the frustrations of those otherwise affected are, together, greater than any crisis our generation has experienced. In the first few days, comparisons to 9/11 and the Great Recession were commonplace. They have failed to capture the mood, so references to World War II, the Great Depression, and of course the 1918 influenza pandemic have seeped into news and commentary. 

It’s impossible to know how the present catastrophe will reshape the world in the future, but Wikipedia is already documenting, essentially in real time, how COVID-19 is changing the world day by day. To understand Wikipedia’s coronavirus coverage, you have to start with WikiProject COVID-19.

The WikiProject

For those not already familiar, WikiProjects are collaborative efforts organized by editors who want to work on similar topics. Wikipedia has almost 900 active WikiProjects, from A Cappella to Zanzibar City. Among them, there already existed WikiProjects whose subject matter is closely related to the coronavirus—specifically Disaster management, Medicine, and Viruses—but WikiProject COVID-19 is barely a month old as of this writing. 

In that time, nearly 1,900 articles have been created or adopted by the project, out of more than 6,000 articles mentioning COVID-19 on Wikipedia.[1] Launched by a single user on March 15, the project today has more than 130 official contributors. And this is not to say there are only 130 editors working on pandemic articles, only that 130 have taken the time out from editing to sign their names up and, in some cases, help to coordinate efforts. Someone even came up with a logo (at right). A separate project from WMF Labs has sought to identify all pandemic-related editing, which at last check counted 527,000 edits by nearly 40,000 separate editors.

WikiProject COVID-19 also maintains a list of more than 600 articles it considers especially important and whose quality its members are working hardest to improve. Some of these articles are exceptional in a conventional way, such as 2020 coronavirus pandemic in Germany,[2] while others are more unusual: here is a rare article, about the Chinese doctor Ai Fen, where the encyclopedia entry is in English but nearly all of the sources are in Chinese.

Wikipedia editors’ contributions to our understanding of the coronavirus are important work[3] and unmatched on the internet. Nothing like Wikipedia existed for most of the world events described in the first paragraph, and even in 2008 Wikipedia wasn’t quite what it is now. This post is no comprehensive survey of these editors’ work, only a report back from a few days of reading and clicking to learn about aspects of the pandemic I knew nothing, little, or not enough about.[4]

How the Information is Organized

Most readers arrive at Wikipedia via web search, but those visiting the Main page over the past month have found a coronavirus-specific information box toward the top-right corner of the page.[5] This box is obviously a good place to start exploring Wikipedia’s coverage of the pandemic and related topics. By definition it is the highest-level summary of how Wikipedians think about organizing this information. But as we shall see, it doesn’t even begin to hint at the true scope of the topic.

There are nine total links here, which is arguably a lot, but the box is well organized. The bigger typeface on “Coronavirus pandemic” draws the eye to what is not just the box’s name but also a link to the primary article about the phenomenon, 2019–20 coronavirus pandemic. The next two links, “Disease” and “Virus”, go to Coronavirus disease 2019 and Severe acute respiratory syndrome coronavirus 2, respectively, wisely sparing the reader from guessing about how these relate to each other. The rest of the links describe the pandemic from different angles, and we’ll examine them more later. But first, I’m interested in capturing some numbers about each of the three main articles.

The Pandemic, The Disease, and the Virus

Here is a rudimentary side-by-side comparison of the three articles about (in order, left to right) the Pandemic, the Disease, and the Virus. The following is information pulled from WikiWatch, a tool that I built, and from Wikipedia itself:[6]

Even without charts, the pattern is clear: all the figures, from word count to images to edits to pageviews, are very large for the primary article about the pandemic, and commensurately less for each supporting article on more specialized subjects. Considering how Wikipedia’s content guidelines advise editors to consider giving subjects their “due weight”, the editors involved are doing pretty well on this account, whether by design or accident. It’s rather elegant, actually. 

As for the content, I won’t pretend to have read all 22,000 words, but in sampling a few sections of each, I feel confident saying they represent some of Wikipedia’s best work. The pandemic is clearly a topic of grave concern, with copious available sources to draw upon, and there is sufficient interest from editors and readers alike to ensure the articles are constantly updated as information changes. This is the kind of thing Wikipedia does exceptionally well during extreme weather events, such as hurricanes,[7] only this time the whole world is in one.

There is some repetition in photos, but if you have a good photo depicting a nasopharyngeal swab, you don’t really need another. And while we might think everyone has seen a “flatten the curve” illustration or animation by now, it doesn’t hurt to use one in more than one article, just in case. Interestingly, a number of these images are drawn from the CDC which, along with the WHO, has released all of its coronavirus-related content as public domain.[8]

Speaking of flatter curves, there is another trend to be found in the traffic. First, here’s a chart from the WMF Labs pageviews analysis tool covering the last 30 days, also covering the Pandemic, the Disease, and Virus, in that order:

This gives you a good comparison of traffic on these articles over the last month, but the pandemic article receives so much more attention than the others that we can’t really see what’s happening with them. Via the WikiWatch dashboard, here are the same three articles, each scaled to its own y-axis:

This isn’t altogether surprising: the internet-surfing public’s greatest interest in these topics occurred in the first two weeks, when the stay-at-home orders were as novel as the coronavirus. Now, at least half the public’s demonstrated curiosity has been sated. I also wonder if it might not suggest something about the urgency with which the public is responding, which is to say, less over time. If you feel like social distancing practices at your local supermarket are already diminishing, these charts might help explain why.
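For the curious, giving each article its own vertical scale amounts to dividing every day's pageviews by that article's busiest day. Here is a minimal Python sketch of the idea. The function name and the daily counts are hypothetical and made up for illustration; they are not real Wikipedia figures, and this is not necessarily how WikiWatch or the WMF Labs tool does it.

```python
def scale_to_own_peak(daily_views):
    """Rescale one article's daily pageviews so its busiest day equals 1.0."""
    peak = max(daily_views)
    if peak == 0:
        # An article with no traffic at all stays flat at zero.
        return [0.0 for _ in daily_views]
    return [views / peak for views in daily_views]


# Hypothetical, made-up daily counts for illustration only (not real data).
example_views = {
    "Pandemic": [1_000_000, 800_000, 500_000],
    "Disease": [200_000, 150_000, 100_000],
    "Virus": [50_000, 45_000, 25_000],
}

# After scaling, every series peaks at 1.0, so the shapes can be compared
# even though the raw traffic differs by more than an order of magnitude.
scaled = {title: scale_to_own_peak(views) for title, views in example_views.items()}

for title, series in scaled.items():
    print(title, [round(value, 2) for value in series])
```

Plotted on a shared axis, the raw numbers for the two smaller articles would hug the bottom of the chart; scaled this way, each decline is visible on its own terms.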

The Timeline and the Territory

Now let’s have a look at some of these other pages: the “Timeline” link goes to Timeline of the 2019–20 coronavirus pandemic, which is surprisingly short. But this is only because it is a repository of links to the timeline by month, which explains the “April” link next to it, and which naturally takes one to Timeline of the 2019–20 coronavirus pandemic in April 2020. This article is enormous—and we’re only halfway through the month! It’s a mind-bogglingly extensive list of events from all over the world, for each day this month. It’s already about 16,000 words, or 22,000 words if you count the references at the end.

And here one also starts to confront the limitations of what Wikipedia can offer the reader. Often as not, Wikipedia does not make for a riveting reading experience. Its content is constrained by requirements of sourcing, content, and tone appropriate to an encyclopedia. This is overwhelmingly a good thing: it is this quality control that gives Wikipedia its uniquely authoritative fixedness, although it comes at a price: the context you wish could be found between the facts. But this is not Wikipedia’s job. When the newspaper features and book-length investigations are finally published, months and years from now, then Wikipedia will have the sources it needs to tell a more compelling story.

Next there is “By location”, which goes to 2019–20 coronavirus pandemic by country and territory. Less an article than a list of lists, it organizes the globe first by continent, with links to dedicated pages for each. It even discusses territories with no identified cases, such as 2020 coronavirus pandemic in Antarctica, which won’t take you very long. And some of these regions are surely lying about it; see 2020 coronavirus pandemic in North Korea. Name a country, dependency or principality, and Wikipedia will tell you how it has been affected by the coronavirus.

Naturally, there is an article for each of the 50 U.S. states, five territories, and one district, not to mention the parent article 2020 coronavirus pandemic in the United States. A summary of just these U.S.-centric articles would be a fascinating blog post, which I will not attempt here, except to observe that they vary widely in quality. This is not just because some are short. After all, there isn’t nearly as much to say about the 2020 coronavirus pandemic in Wyoming as compared to the 2020 coronavirus pandemic in Florida. But the Florida article is likely too short, whereas the 2020 coronavirus pandemic in California is so long as to be unreadable at times. Skim the subsection called “March 18–19” and tell me it’s worth anyone’s time to read or write. In an archive, of course. In an encyclopedia article, not so much.

Some of this tedium would be better replaced by charts. And indeed, many country and state articles include variations on a really excellent chart, meaning visually appealing and easy to interpret, that you can see below depicting cases in Sweden.[9]

Then there is another table which is far too tall to show in full, but starts like this:

This is the big picture of what you really want to know: how many cumulative cases, deaths, and recoveries by country. In fact, if you search Google for coronavirus cases right now, this Wikipedia page—not a government or professional organization—is where Google’s knowledge panel is pulling data from. Look for the easy-to-miss “Wikipedia” link at the bottom of this screen grab:

That link goes directly to Template:2019–20 coronavirus pandemic data. Not an article, but a template—the raw back end of Wikipedia that most readers never see. Because of the link from Google, this template is currently receiving nearly 200,000 pageviews a day, putting it in the top 1,000 pages across all of Wikipedia. A template!

The Rest of the Story

Finally, there are links for “Impact”, “Deaths”, and “Portal”. We’ll take these in reverse order: Portal:Coronavirus disease 2019 is like the front page of Wikipedia but focused entirely on the coronavirus (less the pandemic, for some reason). It’s a perfectly good starting point if you’d like some help in finding your way around; it presents partial lead sections of the “Disease” and “Virus” articles, and links to some other important pages, such as COVID-19 vaccine[10] and COVID-19 drug development.[11] Again, a great place to start, especially if you like curation, but its purpose is diminished because it is not actually the starting point. Compared to the millions received by the first three links, this page gets only a little over 2,500 pageviews daily.

List of deaths due to coronavirus disease 2019, by contrast, is getting around 35,000 views daily. This article is self-explanatory, and is also a specialized version of Wikipedia’s perennially popular “Recent deaths” article.[12] The coronavirus deaths article lists more than 200 individuals, each the subject of a Wikipedia article created before or, in some cases, after they died. Previously, there was a separate list article about prominent individuals who had been infected with, and then recovered from, the coronavirus. It was deleted in late March, largely for being potentially impossibly long, and also problematic for privacy reasons.

“Impact” takes one to Socio-economic impact of the 2019–20 coronavirus pandemic, another very long article with numerous links to articles organized principally by industry and then by region. There are dozens of them, and they too could be the topic of substantial study. Alas, considering the length of this article already, I will leave this for you to explore on your own. Know this: if it exists in the world, you can bet the coronavirus has had an impact upon it, and Wikipedia editors have organized the available news coverage and government statistics to explain it.

While you’re stuck at home over the next few weeks or months, you could do a lot worse than spending your time reading it all. And then, when you’re done, you might as well start again at the beginning, because WikiProject COVID-19 will have revised each article dozens or hundreds of times to keep up with the evolving situation. 

Odds & Ends

I can’t resist leaving you with a couple of unusual or unexpected things I found out that didn’t fit into the post above:

  • Speaking of the 1918 flu pandemic, the current article is called Spanish flu. Nowadays we know that it did not begin in Spain but likely in the U.S., and especially after the “Chinese Virus” controversy, many of us are more sensitive to these kinds of historical injustices. In March, there was a fierce debate about whether the article should be renamed. Ultimately the move to rename it failed, following Wikipedia’s sometimes controversial policy about using commonly recognizable names.

  • What was the first news article to mention Wikipedia and the coronavirus? It appears to be “On Wikipedia, a fight is raging over coronavirus disinformation” by Omer Benjakob in Wired on February 9.

  • According to Wikipedia’s official statistics, pageviews are up 7% over the past month, and editing activity is up 9%. But if you look at past months, there’s nothing statistically significant about these upward ticks. For some reason, various months in 2019 and even earlier were on par or higher than these figures. Then again, we are talking matters of millions and billions, and one has to assume the law of large numbers applies.

  • This being Wikipedia, where anyone can edit as they wish until enough other editors become fed up with you, Wikipedia already had a list of “generally sanctioned” editors and pages. Not too many editors, fortunately, but if you’re looking for a list of coronavirus-related articles that have been more controversial than others, here you go.

  • Finally, WikiProject COVID-19 also maintains a list of its most popular pages, sorted by traffic. A couple of entries near the top of the almost 800 caught my attention:

    So there you have it, definitive proof of how much American life has changed in the coronavirus pandemic: Dr. Anthony Fauci is more popular than Tom Hanks.


1 and growing by about 1,000 articles per week, according to my unscientific spot checks
2 the Germans are just good at Wikipedia in general
3 not like doctors and nurses, sure, but crucial nonetheless
4 Lately, it feels almost like my early days of discovering Wikipedia: opening tab after tab after tab in my browser, losing hours to it, engaging in a very mid-2000s activity once nicely captured in a memorable XKCD comic.
5 A development this blog advocated for just before it became reality, ICYMI.
6 Accurate as of April 13, when this data was collected
7 hat tip: WikiProject Tropical cyclones
8 To learn more about the coronavirus illustration oft-used in Wikipedia’s pandemic coverage, see this New York Times article.
9 and accessible on a Wikipedia template page here
10 hypothetical, just to be clear
11 not just the hoped-for vaccine, but treatments as well
12 see: Deaths in 2020