Three Million Served

This week marks a milestone for the English-language Wikipedia that is both major and somewhat arbitrary: the creation of its 3 millionth article. If you visit the front page of Wikipedia now, you will see this message:


That article, about Norwegian actress Beate Eriksen, is currently locked to prevent vandalism, something that happens with nearly every article that gets widespread attention. Usually, of course, the attention comes because the subject is in the news, not because the article itself is.

As the chart below indicates (taken from here), Wikipedia passed 2 million articles in the third quarter of 2007. Will it take another 2 years for Wikipedia to reach 4 million?


Actually, it may take a bit longer: Wikipedia’s article growth has been slowing down. The slowdown has been a topic of discussion on the Wikipedia Weekly podcast for at least a year, and it is inevitable. Given Wikipedia’s success and its strict rules on what qualifies for an article, there will come a point where most articles have already been created. We may have reached that point.

Or, as I think more likely, we have created most of the articles that can be assembled from web sources and in-print books. That’s why I think the next phase of Wikipedia’s growth will have to depend on archival materials about historical subjects, which are exactly the type of article Wikipedia does least well. This wouldn’t stop Wikipedia’s growth from slowing, but it would keep its growth meaningful.

Update: From the comments, here are two thoughts from Wikipedians who are very smart and much more experienced than yours truly. First, David Gerard:

Actually, I think we’ve barely scratched the surface of books, in-print or not. What’s been done so far isn’t even the low-hanging fruit, it’s the fruit that’s actually sitting on the ground waiting to be picked up.

The growth curve so far looks like a logistic curve with a linear increase on top.

One interesting thing is that the growth curves for the other large Wikipedias look similar. And the smaller Wikipedias are typically in early linear growth or the exponential upcurve of the logistic curve.
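
(An aside from me, not from the comment: to make "a logistic curve with a linear increase on top" concrete, here is a minimal Python sketch of that shape. The function name and every parameter value are hypothetical, chosen only for illustration and not fitted to Wikipedia's actual article counts.)

```python
import math


def article_count(t, capacity, rate, midpoint, slope):
    """Hypothetical growth model: a logistic curve plus a linear term.

    t        -- time, in whatever unit you like (years here)
    capacity -- ceiling of the logistic part (millions of articles)
    rate     -- how sharply the logistic part rises
    midpoint -- time at which the logistic part reaches half its ceiling
    slope    -- steady linear growth added on top (millions per year)
    """
    logistic = capacity / (1.0 + math.exp(-rate * (t - midpoint)))
    linear = slope * t
    return logistic + linear


if __name__ == "__main__":
    # Made-up parameters, purely to show the shape of the curve.
    for year in range(0, 11):
        total = article_count(year, capacity=2.0, rate=1.0, midpoint=5, slope=0.1)
        print(year, round(total, 3))
```

Early on, the logistic term dominates and growth looks exponential; later that term flattens out and only the linear slope remains, so growth slows without stopping.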

And from Sage Ross, a Wikipedia Weekly contributor:

“we have created most of the articles that can be assembled from web sources and in-print books”

That’s not nearly the case, especially if you count digitized scholarly journals as available sources too. Wikipedia could easily have another 3 million articles (probably more like 30 million) based on published sources. It’s just that the deeper you go into specialized areas where the untapped sources are rich, the fewer people there are who are interested in and/or capable of writing about those areas.