William Beutler on Wikipedia

Posts Tagged ‘Wikipedia Weekly’

The Top 10 Wikipedia Stories of 2020

Posted on December 31, 2020 at 1:46 pm

It’s no overstatement to say that 2020 was a year when everything changed. Since March, ubiquitous semi-ironic references to the “Before Times” have served to euphemize the unfathomable. To date, COVID-19 has killed nearly two million people worldwide, reshaped the global economy, and galvanized worldwide protests, and it will affect politics, business and culture for years to come, including in ways we can’t yet see. 2020 gets all the hate now, but can we be so certain that the coming year will be meaningfully different?

2020 was also a time of change for Wikipedia, though these shifts occurred almost entirely below the surface: unless you’re an active participant in the Wikimedia movement, much of this list will come as news to you. This was a year where ambitious new projects were announced, small-scale tweaks took on larger significance, the relationship between human editors and the software supporting them became more fraught, differences in vision between the community and professional corners of Wikipedia emerged or were reinforced, and the future of the movement simultaneously became both clearer and more contentious.

Every year since 2010, The Wikipedian has offered its summary of the top ten Wikipedia stories—events, themes, and trends—of the previous year. In this installment we’ll do the same again, but with a little something extra. On Wednesday, December 30, I joined a recording of the Wikipedia Weekly YouTube livestream to discuss the big issues of the year that was. This list is informed by the “top ten” discussed on this show, although it is not identical. I hope you’ll read through my list, and then watch or listen to the discussion, which complements the topics covered below.

♦     ♦     ♦

10. Wikipedia approaches its 20th anniversary

Countless retrospective pieces will surely be published in the coming weeks to commemorate the 20th anniversary of Wikipedia, which I am certain you do not need to look up to know was founded on January 15, 2001. That milestone has loomed large over the past year, lending additional significance to milestones and benchmarks recently passed.

Wikipedia’s 6 millionth article, maybe?

In January, Wikipedia hit 6 million articles in the English language, its largest and most widely-read edition. No one knows precisely which article was the true number 6,000,000, but the nod was given to Rosie Stephenson-Goodknight, co-founder of the Women in Red project, for her article about a Canadian schoolteacher and temperance movement leader. 

In February, Wired published a story calling Wikipedia “the last best place on the internet”, using the site as a counterpoint to the neverending dumpster fire of today’s World Wide Web—the last refuge of the promise of the “open web” which has long since given way to the mundanity of knowledge workers never being offline, every day facing another onslaught of disinformation and unpleasantry. By the end of the year, BuzzFeed offered a different way of saying pretty much the same thing: “The Top 40 Most Read Wikipedia Pages Of 2020 Perfectly Capture The Hellscape That Was 2020”.

Meanwhile, Wikipedia’s impressive stature was affirmed yet again when Twitter announced it was considering using Wikipedia as a benchmark for which user accounts would be bestowed with the simultaneously coveted and scorned “blue checkmark”. It was likewise affirmed in a more serious way when the World Health Organization announced it would be licensing its information for use on Wikipedia.

All in all, not a bad way to mark two decades, right? Well, you should see what else happened.

9. Should Wikipedia fear a Section 230 repeal?

If the phrase “Section 230” doesn’t mean much to you, then you probably don’t spend much time following the United States Congress… or on Twitter. Section 230 is the portion of the 1996 Communications Decency Act that protects providers of internet platforms, such as Google, Facebook, Twitter and, of course, Wikipedia, from being sued for content posted by users. Section 230 specifically allows these websites to moderate content, or not, as they see fit. The internet as we know it today could not exist without it.

But in the last few years, 230 has come under increasing scrutiny, especially for websites alleged to permit sex trafficking (Craigslist), or terroristic threats (8chan), or disinformation (too many to count, but Facebook especially). What’s more, right-wing politicians and conspiracy theorists in the U.S. have viewed it as shielding the tech giants which they believe (or at least claim to believe) are censoring them. Meanwhile, “the internet as we know it today” is no longer seen as the frontier of possibility it was as recently as 2015. In the last week of December 2020, Senate Majority Leader Mitch McConnell tied a vote on the latest covid stimulus package to 230 repeal, a poison pill designed to derail modifications sought by Democrats (and of course Republicans’ own outgoing president). 

Although I hesitate to make any predictions about the world we live in now, full repeal seems exceedingly unlikely. But maybe I’m only saying that because the internet after 230 is impossible to imagine—it would spell headaches at best and doom at worst for the entire Web 2.0 ecosystem (including Wikipedia) and the tech giants who rely upon it. So while it’s probably not going to happen, it’s still worth worrying about.

8. Creating Theresa Greenfield’s Wikipedia article

November already feels like it was years ago, but barely two months ago a news story involving Wikipedia captured the attention of American political media for about 24 hours: why Theresa Greenfield, the Democratic nominee opposing Iowa senator Joni Ernst, did not have a Wikipedia article. It goes without saying that Wikipedia is a widely read source of information for voters, so it seemed notable that Iowans (and the reporters covering one of the country’s most hotly contested races) couldn’t even look her up on Wikipedia.

The reason owes to a perfect storm of three applicable circumstances: 1) Greenfield was not a well-known figure prior to capturing the Senate nomination, 2) Wikipedia doesn’t have a rule granting “Notability” to major party nominees, but 3) it does have a rule against creating articles about individuals known for just one event—in this case, the Senate race. This surprised me, because for years I had been under the impression that there was a rule automatically guaranteeing an entry for major party nominees, the same way there is for professional athletes.

As tends to happen in such cases, debate ensued and Greenfield was eventually granted a Wikipedia entry. Given how much news the race had generated, the article quickly grew to a level of detail that made the earlier obstinacy seem ridiculous. And then on November 3, she lost.

7. Scots Wikipedia and the trouble with small Wikipedias

Perhaps the actual biggest story involving Wikipedia this year, at least in terms of headlines generated, was the “fun” and “lighthearted” discovery that the Scots Wikipedia was basically a complete sham. For those whose only experience with Scots is thumbing through an Irvine Welsh novel sometime after seeing Trainspotting in the mid-1990s, Scots is either a language of its own or a heavy dialect of English spoken by the Scottish peoples. This blog last mentioned it in 2014 when Scotland voted on a referendum to leave the United Kingdom (lolsob emoji goes here) and it is one of the smaller language editions of Wikipedia.

If it’s not Scottish, it’s crap!

Well… in August a Reddit user realized that roughly a third of its 60,000-odd articles had been written by a single user, who turned out to be an American teenager with scant knowledge of proper Scots grammar or terminology. In other words, by a kid using a bad Scottish accent. The story was too good to pass up for almost any outlet that considers itself remotely “online”, and they all had a good laugh.

A month after the Scots Wikipedia controversy, it emerged that a significant majority of the articles on the Wikipedia edition written in Malagasy—the national language of Madagascar—had been written by a bot translating articles from other editions. And most of them rather badly. And the Malagasy Wikipedia is far from the only Wikipedia edition to be mostly written by bots—a Vice report in February pointed out that the Cebuano edition was largely written without human editors, albeit apparently with more success.

But bots are not the only challenge. In a different example, the Portuguese Wikipedia—containing more than one million entries with just shy of 1400 active editors—decided to ban IP accounts from making edits, because the vast majority of vandalism on the site came from these unregistered editors. According to the Wikipedia Signpost, vandalism went down, and new account creation increased. This is unlikely to be adopted on the largest editions, but it’s worth watching to see if other small language communities decide to follow suit.

6. Anticipation and apprehensions about Abstract Wikipedia

Wikipedia is as human-created a project as exists in the world, but its future increasingly looks to be dominated by computers, programs, and algorithms. Look no further than the newly announced project called Abstract Wikipedia, and its sister project WikiFunctions, which plans to do much the same as the bots on small Wikipedias, but at a much larger scale and with greater ingenuity. 

First announced in a Signpost editorial in April, and approved unanimously by the WMF board just three months later, Abstract Wikipedia aims to create Wikipedia articles independent of any one language, combining structured data and “functions” related to information within them, to make it feasible for machine translation to effectively translate articles from one language to another. It sounds so ambitious as to be reckless, but its pedigree couldn’t be better—creator Denny Vrandečić is a former WMF board member, former Googler, and the creator of another pie-in-the-sky project that has become wildly successful: Wikidata.

Father of Wikidata, and now Abstract Wikipedia

As Vrandečić pointed out, of all topics that exist across Wikipedia, only a third of them have articles in English. Further: “only about half of articles in the German Wikipedia have a counterpart on the English Wikipedia … There are huge amounts of knowledge out there that are not accessible to readers who can read only one or two languages.”

If Abstract Wikipedia succeeds, it points toward a future where Wikipedia is controlled less by those who can merely write articles, and more by those who can write code. Exciting as the project may be, anxieties exist, too. Will Abstract Wikipedia dictate the content of articles, or merely inform them? Local control matters a lot to Wikipedians and, as we’ll see in the next few sections, WMF bigfooting is of increasing concern to some community members.

But it’s also easy to see why it appeals to many Wikimedians: much like Wikidata and very much unlike Wikipedia, it’s greenfield, unencumbered by the old habits of the arguably hidebound, conservative editorial base that both keeps Wikipedia running and prevents it from growing beyond its original vision. The building of Abstract Wikipedia is set to begin in 2022, and it’s expected to start integrating with Wikipedia itself in 2023.

5. WMF Board makes some suspicious moves

In the spring, as the far-reaching implications of the coronavirus pandemic became clearer, the Wikimedia Board of Trustees announced that it would postpone its triennial board elections, and the three trustees whose terms were set to expire would stay on for another year. At the time, it was seen as a regrettable if understandable concession to the dire circumstances, even for an organization that can operate exclusively online in many other ways.

But then in October, the Board unveiled a considerable overhaul of its bylaws, with eyebrow-raising changes to the terms of, well, board elections. Certain board seats were no longer described as “community-selected” but “community-sourced”, and the words “majority” and “voting” were removed. A number of community members raised concerns that it could spell the end of community-elected board members, thereby increasing the stratification between the “professional” and “community” parts of Wikipedia. WMF general counsel Amanda Keton conceded that the community had “found a bug” in the proposal, and promised to address it in a revision that is yet to come.

Compounding matters, the timeline set for the change was considered too short, while Board members expressed different opinions about how far along in the process the proposals really were. Furthermore, apt questions were raised about the wisdom of sweeping changes when the board had three members who, in normal times, wouldn’t even be there. Perhaps it was merely an oversight, but it certainly exacerbated tensions that already existed.

4. Wikimedia debates Jimmy Wales’ permanent board seat

But that wasn’t the only discordant note involving Board governance this year. Shortly after the new bylaws were proposed, prominent Wikimedian Liam Wyatt suggested another change: discontinuing Wikipedia co-founder Jimmy Wales’ permanent “Community Founder Trustee Position”—in short, eliminating his board seat after nearly 20 years. As Wyatt put it, “Now that the WMF is a mature organisation, I do not believe it is appropriate any longer for a single individual to have an infinitely-renewable and non-transferrable position on the board.”

Jimmy Wales, man of the people—really!

Wales himself replied in short order, saying he was not immovably opposed to the idea at some point, but arguing that it should not happen now precisely because of those same ongoing tensions. As Wales put it, it is actually he who represents the community among the professional set. And in fact, Wales’ positions on the board have been largely pro-community, including expressed opposition to curtailing community voter supervision of the board.

And while it seemed a “modest proposal” in its initial offering, the idea was soon hotly debated, with community members taking it very seriously and arguing the pros and cons. Mike Godwin, former WMF general counsel, even took to the Wikipedia Weekly Facebook group to argue for Wales as the connective tissue back to Wikipedia’s original purpose, concluding: “in my view, he shouldn’t be kicked out of the traditional position before he’s ready to go.”

The debate never really focused on Wales’ leadership, but rather on the wisdom of having such a position in the first place, and it doesn’t seem likely to be taken much further for now. In a year when many statues around the world fell, it seems like the Wikimedia community decided it should at least consider whether to topple one of its own.

3. Covering COVID-19 and the George Floyd protests

It feels sort of wrong to put COVID-19 and the George Floyd protests into just one list item, but they are very much of a piece, and together they highlight what Wikipedia’s community is better at than any other editorial body: documenting far-reaching global happenings. The old saying about journalism being the “first draft of history” made sense when it was first expressed, but now that role clearly belongs to Wikipedia.

This blog covered both efforts when they first arose, in the early part and middle of the year, respectively, with posts more thoroughly researched than imaginatively titled: “How Wikipedia is Covering the Coronavirus Pandemic” and “How Wikipedia Has Responded to the George Floyd Protests”. Both subjects gave rise to dozens, if not hundreds, of new articles apiece, and several were among the most-read Wikipedia pages all year long. Quartz recently assembled a calendar depicting the most-read articles for each day of the year, and the month of June is dominated by relevant topics, including Killing of George Floyd, Juneteenth, and Edward Colston.

George Floyd protest in Brooklyn

The George Floyd protests also created opportunities for organizing around social justice issues, which have been close to the hearts of many Wikimedia affiliate groups for a long time. A virtual Juneteenth edit-a-thon was well-attended, WikiProject Black Lives Matter took shape, and the AfroCrowd initiative built a following.

To this day, the main page of the English Wikipedia retains an information box in its top right corner directing readers to critical information about the pandemic.

Activism on Wikipedia is a tricky thing: as the Neutral point of view policy spells out clearly, articles should not advocate for a particular perspective on the topics covered. But which articles Wikipedians choose to edit shows a lot about what they think is most important.

2. Effects of the global pandemic on the Wikimedia movement

How much could Wikipedia be affected by a global pandemic, anyway? Everything it does is about putting information on the internet, while the lockdowns and restrictions most affected those who couldn’t simply move online, such as restaurants and the travel industry.

In the first place, its professional class realized how much it actually depends on travel. Although all the editing necessarily happens online, in every other year dozens of regional and global meetings take place. The Wikimedia Summit, formerly known as the Wikimedia Conference and scheduled for April, was the first to be canceled. It didn’t take long for the main annual event, Wikimania, to be “postponed” from its August date in Bangkok, Thailand as well. Rumor has it that Wikimania 2021 will not happen either.

Some events, with more time to prepare, moved online: Wikiconference North America went ahead with a scaled-down virtual program in mid-December. And Wikipedia’s community has long made use of online tools from the esoteric like IRC and Etherpad to the commonplace like Zoom and Google Hangouts. A new wikiproject even sprang up to catalog the various online-only events, and to offer advice to those wanting to host their own. But virtual conferences are a split proposition: the lack of obligation to appear in-person made it easier for some to participate remotely, while removing a lot of the reason to show up in the first place for others.

I’ll add one more possible effect of the pandemic, and I suggest this very delicately: COVID-19 might have actually been a good thing for Wikipedia. As The Signpost noted this summer, editing activity on Wikipedia surged to levels not previously seen in a decade. As they explained: “Recent years seem to have stabilised at a million edits every six to six and a half days, so the lockdown period with its editing levels of a million edits every five days is a significant increase.” 

Some people learned to make sourdough. Others, presumably, learned to edit Wikipedia.

1. The Wikipedia Foundation?

Chances are, you have never heard of the biggest controversy to envelop Wikipedia in 2020. The dispute, which began in January, boiled over in June, and remains as yet unresolved, centered on the obvious desire of the Wikimedia Foundation (WMF) to change its name to the “Wikipedia Foundation” despite the clear majority of active Wikimedians who oppose the idea. 

The case in favor of doing so is simple: everyone and their grandmother knows what Wikipedia is, but almost no one outside of the movement knows what Wikimedia means. Wikipedia’s ubiquity has overshadowed other important projects funded by the WMF. By rechristening the entire endeavor “Wikipedia” and doing away with the confusing split branding of “Wikimedia”, it would unify the whole project behind the one word everyone knows.

I still remember when the WMF logo was in color

But the arguments against were simple, too, and passionate: rather than drawing attention to other projects, it would obscure their independent status and achievements. Further, the proposed change was initiated without sufficient feedback or consideration for the branding of the movement’s many organized chapters and user groups. Procedurally, it was inexplicably separated from the rest of the long-gestating Wikimedia 2030 Movement Strategy that it clearly belonged to, and rushed to the proposal stage at a time when the conferences and meetings where this would normally be debated had been called off due to the pandemic. What’s more, the proposal drew the harshest rebuke from those very groups who work most closely with the WMF—a rare intra-wiki dispute not between Wikipedia’s professionals and volunteers, but within the professional class itself.

The sequence of events was damning, too: In June, the WMF opened up a survey asking the community to weigh in on what Wikipedia should call itself. The survey was heavily weighted toward the conclusion that “Wikipedia Foundation” was the way to go, even though a Request for Comment earlier in the year ran 9 to 1 against it. Yet the WMF decided that its “informed oppose” was less than 1%, based on an invented number of “~9,000” community members whom they claimed had a chance to fill out the survey, though far fewer actually submitted responses. Soon after, an open letter organized by the affiliate groups drew nearly 1,000 signatories calling on the WMF to “pause renaming activities … due to process shortcomings”.

And so it was shelved, but only until March 2021. Whether the WMF will go ahead and become the WPF (I guess) remains to be seen, but this blog for one finds it unlikely. Interestingly enough, it also shows the limits of even these change-oriented groups’ interest in changing how they think of themselves and the movement they’ve dedicated their lives and careers to. The WMF would do well to put this aside and accept this as just one of the many contradictions that Wikipedia has managed to succeed in spite of over nearly two decades. As the old joke among longtime editors goes: “Wikipedia doesn’t work in theory, only in practice.” That’s as true here as it is anywhere.

For threatening the goodwill of its closest allies, for creating a headache where none need exist, and for being an own goal of massive proportions, the controversy around the renaming of the Wikimedia Foundation is easily the #1 Wikipedia story of 2020. 

♦     ♦     ♦

And now, if you still can’t get enough Wikipedia year-in-review content, I present to you the Wikipedia Weekly episode featuring Richard Knipel, Vera de Kok, Netha Hussain, Jan Ainali, Andrew Lih, and yours truly. Enjoy, and see you in 2021!

Image credits, top to bottom: Public domain, Sodacan, Victor Grigas, Zachary McCune, Rhododendrites, Wikimedia Foundation

Why Aren’t There More Wikipedia Editors?

Posted on July 16, 2018 at 11:15 am

Why do some people contribute to Wikipedia? Conversely, why don’t others? Ever since Wikipedia became a self-aware community, this question has vexed those who participate in it, and would like to see more people pitch in and help build the encyclopedia. After all, Wikipedia was created by a community of individuals with diverse interests and motivations. Some stay for a short while, and others stay much longer, but no one can stay forever. For this reason, the community must analyze itself and attempt to address the problems which hold it back. But this is a very, very difficult topic to grapple with.

In mid-June, an editor named Ziko van Dijk, who happens to be one of the longest-running active contributors, posed a version of this question on a Facebook group for Wikipedia editors called Wikipedia Weekly. In the post, van Dijk noted the difficulty of finding new contributors, and speculated that a big reason is “simply that most people don’t like the hobby that is Wikipedia”; it’s a rather abstruse pursuit. Few people enjoy writing, and those who do prefer to express themselves, rather than impersonally collate facts. Meanwhile, other “occupations” on Wikipedia, such as clerical work involving categorizing pages, are similarly unappealing. Therefore, in his view, existing Wikipedians must be clearer about what being a Wikipedian really means.

A discussion ensued, and weeks later, the thread had grown to more than 100 comments, with numerous current and former editors, including Wikimedia Foundation personnel, weighing in. I was a participant near the beginning, and in returning to the thread last week, I found the discussion as a whole a fascinating and perhaps useful compilation of views about Wikipedia’s problems recruiting new editors and retaining existing ones. This blog post is an attempt to summarize some of the more interesting arguments; the following are presented without judgment as to their correctness, but simply to describe the views in circulation:

Why aren’t there more people joining Wikipedia in the first place?

  • Many people simply do not know that they can edit Wikipedia. This seems difficult to believe, when Wikipedia is one of the most-visited sites in the world and has been for more than a decade, but the fact remains: we can’t assume that everyone who reads Wikipedia understands how its articles come to be written in the first place.
  • As van Dijk suggests, most people are not writers. Despite the rise of social media, few people write very much or at length—Instagram is bigger than Twitter, and most people who use Twitter simply read, rather than tweet. Moreover, the kind of writing necessary to produce Wikipedia articles is slow, laborious, and exhausting. However energizing a Wikipedian might find the work involved, it’s not hard to see why others might find it enervating.
  • Those who do write tend toward personal expression, sharing opinions and experiences. Wikipedia is the opposite of this: it’s not a place to write what you know, but a place to record what others have written about what they know. Similarly, most who write like to have their name attached to it—even if it’s not their real name. But Wikipedia is not a place for brand-building; it’s a matter of policy that Wikipedia articles are unattributed to their authors, only to the sources the authors used to compile them.
  • Those who try may be surprised that Wikipedia places unexpected restrictions on what they can write. You can’t just copy material from another source into Wikipedia wholesale, for example. And the range of acceptable sources is fairly limited. Wikipedia’s content rules are complex, and many of them are non-intuitive for those not steeped in Wikipedia’s community.
  • Some who try writing or editing an article may have just one topic they really care about, and are uninterested in going beyond that to work on many articles. Once they’ve said their piece, or tried and failed, their interest in the project has been exhausted.
  • A lot of what’s involved in contributing to Wikipedia amounts to clerical work. For many people, this sounds like, well, work. People who work in information jobs, especially, may find that Wikipedia is not a break from the kind of tasks they have to do in their real jobs, so Wikipedia feels too much like more of the same.
  • Potential contributors may associate Wikipedia merely with writing, and not with the myriad other tasks necessary to build the encyclopedia. These include contributing photographs and illustrations, coding templates and writing software, curating information, reviewing content, or patrolling new changes to keep articles free from vandalism or nonsense. You can be a Wikipedian even if you never write an article! But this isn’t readily apparent.
  • Wikipedia is simply too difficult to understand, and finding your way around can be head-spinning. As one participant put it: “Wikipedia is a maze without walls.”

Even if they want to join, the barriers to contributing are quite high

  • Wikipedia now has more than 5.6 million articles: all of the “low-hanging fruit” has been picked and there are fewer opportunities to create new articles. Meanwhile, expanding or revising existing articles may be less enticing to new contributors than the possibility of creating new ones. This is not at all to say that Wikipedia has created all or even most of the articles that it should eventually include, but it does mean these remaining opportunities are likely to be on more esoteric topics.
  • Wikipedia’s rules are very difficult to discover and master. There is no comprehensive list, nor a clear order in which they should be read. Should you begin with Policies and guidelines, Key policies and guidelines, or List of policies and guidelines? Who knows? And once you’ve found them, they can take a while to read, not to mention internalize.
  • Another potential problem is a lack of clear goals for the Wikipedia community: back when Wikipedia was much smaller, it was easier to say that the goal was to get to 50,000 articles, 100,000 articles, or 1 million articles. Growing the encyclopedia is no longer the focus—that seems to happen almost on its own these days—but what goal replaces it? Reach? Quality? It’s not clear.
  • The “confidence factor” may play a role in a few ways. One is simply that by getting started editing, one exposes oneself to evaluation, judgment, and criticism for one’s work. That’s not inherently a lot of fun. Additionally, with so much already written, new contributors may be reluctant to “interfere” with the work of those who have come before. After all, Wikipedia seems to have done quite well without their input, so why start now?

Harassment is a problem, but how much of a problem?

  • A recurring theme in the discussion was the degree to which harassment, especially of women, on Wikipedia is really a problem. Many editors have experienced it or seen it, but disagreement exists about whether it is a truly pervasive problem that is turning off potential contributors, or if the worst examples are rare but memorable.
  • Prevalence of harassment is difficult to measure for the same reason that crimes of violence often are: victims may be unlikely to report it, because doing so is daunting, and more so when the default assumption of Wikipedia discussions is that they occur in public. Were ANI to feature a private reporting feature, perhaps this would be mitigated.
  • A related question: don’t you have to contribute to Wikipedia first in order to experience harassment? The thinking being, it doesn’t really make sense to discuss in terms of new editors. Still, it’s possible would-be contributors have heard horror stories. And regardless of the reality on the ground (or the page) you can be certain this is a topic that will come up when these questions are raised.
  • Lastly, was Wikipedia ever a friendlier place than it is now? One suggestion was: no, it only seemed that way because there were more wide open spaces between content and there were fewer opportunities for contention and confrontation. Also, because Wikipedia had not yet become a global brand, there was less vandalism, and fewer COI problems. It doesn’t change anything now, but it’s interesting to consider.

What might some potential solutions look like?

There are as many potential solutions as there are problems. Maybe more? Here is a short list of ideas floated in the discussion thread, relating to the explanations listed above. Like before, they are presented without judgment, but in some cases with a little bit of supplementary commentary mixed in.

  • Wikipedia’s information pages must explain better what participation means before new users sign up. Wikipedia:Introduction is intended to be the starting point, but it doesn’t really offer any context for what to do. Not only is a better community portal for first-time editors a possible solution, but perhaps “better” isn’t the same for everyone, and there should be more than one point of entry based on one’s background or intentions.
  • Spotlight other things people can do than simply edit articles: patrol changes, review articles for GA or FA status, contribute photos, produce cartography, create templates, write bots, or fix grammar and spelling. A “101 ways to contribute” video or similar presentation could help spread awareness.
  • Better integration of tools from the community; VisualEditor is the WYSIWYG editing interface new contributors are encouraged to try, and Wikipedia Teahouse is the place for new editors to ask questions of veterans, but you can’t use the VisualEditor at the Teahouse.
  • For those who want recognition for their contributions to Wikipedia, perhaps Wikipedia’s articles could be re-designed slightly to include randomized lists of contributors to the article. Every once in a while, you would get to see your name in lights. (Un-discussed: what if you don’t want your name in lights?)
  • “Stop over-policing contributions and under-policing behavior”. This is a fascinating insight, but also one that appears to run counter to the long-observed community advice to “focus on the edit, not on the editor”.
  • Stop pretending that everyone should be an editor, and find ways to support those who do edit. Additionally, find out why current contributors do so, and find ways for Wikipedia’s support teams and infrastructure to better nurture these motivations. Showcase stories of editors explaining why they are personally motivated to contribute.
  • More outreach projects to specific communities who are actually likely to edit Wikipedia: in science, literature, and especially at libraries.
  • Find ways to surface specific tasks to be done within different modes of contribution. Twitter, Facebook, and Reddit all have feeds with new content to consume, but Wikipedia has no such centralized resource, whether communal or individualized. A new editor-focused dashboard was a popular suggestion in the 2016 Community Wishlist Survey, but not much has happened with it recently.

Ultimately, to borrow a phrase from academic work, mentioned in the thread: “further research in this area is needed”. Hopefully, in the meantime, discussions like this can help shape more rigorous explorations of this subject matter, and point toward solutions that benefit Wikipedia and its contributors, present and future.

Photograph of 2012 Wikimania participants via Helpameout licensed under Creative Commons.

Search and Destroy: The Knowledge Engine and the Undoing of Lila Tretikov

on February 19, 2016 at 11:00 am

The Wikimedia Foundation is in open revolt. While the day-to-day volunteer efforts of editing Wikipedia pages continue as ever, the non-profit Foundation, or WMF, is in the midst of a crisis it’s never seen before. In recent weeks, WMF staff departures have accelerated. And within just the past 48 hours, employees have begun speaking openly on the web about their lack of confidence in the leadership of its executive director, Lila Tretikov.


All in all, it’s been a terrible, horrible, no good, very bad start to 2016. Controversy in the first weeks of the year focused on the unexplained dismissal from the WMF Board of Trustees of James Heilman, a popular representative of Wikipedia’s volunteer base, before shifting to the unpopular appointment to the WMF Board of Arnnon Geshuri, whose involvement in an anti-competitive scheme as a Google executive led him to resign the position amidst outcry from the staff and community.[1]

But other issues remained unresolved: WMF employee dissatisfaction with Tretikov was becoming better known beyond the walls of its San Francisco headquarters, while questions mounted about the origin, status and intent of a little-known initiative officially called Discovery, but previously (and more notoriously) known as the “Knowledge Engine”. What was it all about? How do all these things tie together? What on Earth is going on here?

Deep breath.

The strange thing about the Knowledge Engine is that, until very recently, basically nobody knew anything about it—including the vast majority of WMF staff. Not until Heilman identified it as a central issue surrounding his departure from the Board had anyone outside the WMF staff ever heard of it—though in May 2015, a well-placed volunteer visiting HQ[2] observed that a team called “Search and Discovery” was “extraordinarily well-staffed with a disproportionate number of engineers at the same time as other areas seem to be wanting for them”. This despite the fact that, as we know now, the WMF had sought funding from the Knight Foundation of many millions of dollars, receiving just $250,000 in a grant not disclosed until months later. As recently as this month, a well-considered but still in-the-dark Wikipedia Signpost article asked: “So, what’s a knowledge engine anyway?”

♦     ♦     ♦

After several months of not knowing anything was amiss, followed by weeks of painful acrimony, we think we have the answer: as of February 2016 the mysterious project is in fact a WMF staff-run project to improve Wikipedia’s on-site search with some modest outside funding. That sounds like a good idea, sure; Wikipedia’s on-site search engine isn’t maybe the best. But we also know that at some point it was an ambitious project to create a brand new search engine as an alternative to Google. Sometime in 2015 the WMF submitted a proposal to the Knight Foundation asking for a substantial amount of money to fund this project. It is described in still-emerging documents from this grant request as a “search engine”, and several early mock-ups seemed to suggest this was in fact the idea (click through for higher resolution):

Knowledge Engine mid

Why would Wikipedia consider building a search engine, anyway? The most likely answer is fear of being too dependent on Google, which sends Wikipedia at least a third of its total traffic. In recent years, Google has started providing answers to queries directly on its search engine results pages (SERPs), often powered by Wikipedia, thereby short-circuiting visits to Wikipedia itself. Tretikov herself, in a rambly January 29 comment on her Meta-Wiki[3] account page, identified “readership decline” as Wikipedia’s most recent challenge.[4]

It’s an understandable position: if you are the leader of an organization whose success has been largely described in terms of its overall traffic,[5] any decline in traffic may be equated with a decline in Wikimedia’s ability to fulfill its mission. I submit this is short-sighted: that Wikipedia has an educational mission whose impact cannot be measured solely in terms of traffic. That Google borrows information from Wikipedia—though they are not alone in this—in such a way that it answers people’s questions before they have to actually click through to en.wikipedia.org is still a win for Wikipedia, even if it reduces the (already low) probability that a reader will become a Wikipedia contributor.[6]

The logic is twisted, but you can follow it: most readers find Wikipedia through a search engine, so if the search engine that helped make Wikipedia the success it is today changes its mind and starts pointing elsewhere, better to get ahead of things and create a new alternative that people will use. I guess? If we accept this reasoning, we still have to confront questions like: Is this actually something the WMF can accomplish? Is this within the WMF’s scope? Is this something that will help Wikipedia accomplish its mission? These are much harder questions for WMF to answer—in part because the answers are “no”, “no”, and “no”—and would absolutely have to be shared with the Wikimedia Board of Trustees ahead of time and, for political reasons, socialized within the Wikipedia community itself. The incident surrounding Heilman’s departure suggests the former was an issue, and the ongoing furor is because the latter obviously did not occur.

Meanwhile, the extreme unwillingness of Lila Tretikov and even Jimmy Wales to talk about it is, in fact, tearing the Wikimedia Foundation apart. Tretikov has lost all remaining credibility with Wikimedia staff and close community observers, not that she had much to begin with. As this week comes to an end, more staffers are quitting, remaining ones are complaining in public, and it seems impossible to imagine Lila Tretikov remaining in charge much longer.

♦     ♦     ♦

If you’ve come to expect a detailed timeline of events from The Wikipedian, I am pleased to say you’ll find just what you’re looking for below, although I’m afraid this whole thing is too large and multifaceted to do proper justice within the space of this already very long post. A full accounting may go back[7] to the mid-2000s, when Jimmy Wales harbored ambitions of building his own search engine—Wikiasari in 2006 and Wikia Search in 2008. It certainly would include a full accounting of the many high-profile WMF staffers to leave since late 2014, and the role Tretikov played in each. It would include a careful examination of what the WMF can and should do in Wikipedia’s name, and an evaluation of how the evolving app-focused Internet raises questions about Wikipedia’s own future.

I think that’s more than I can accomplish in this post.

Instead I want to focus on what’s happening this week. But first we have to fill in some of the blanks. To do so, you’ll want to wind back the clock a few weeks:

  • Let’s start on January 25, when Jimmy Wales called Heilman’s claims that transparency issues were at the core of his dismissal “utter fucking bullshit”. Jimmy Wales is known for occasionally lashing out at pestering editors on his Talk page, and this certainly seems to be one of those times.
  • Jimmy Wales, 2013

  • On January 29, Tretikov made her first public, community-facing statement about the Knight Foundation grant, which was welcomed for showing some self-reflection[8] but also raised more questions than it answered.
  • On February 1 WMF developer Frances Hocutt stated[9] that employees were being “censured for speaking in ways that I have found sharply critical but still fundamentally honest and civil”.
  • Don’t skip the aforementioned “So, what’s a knowledge engine anyway?” investigation by Andreas Kolbe for The Signpost, published February 8, still the most comprehensive evaluation of this multifaceted controversy.
  • We then jump ahead to February 11, when Wales was still doing his “Baghdad Bob” routine, publicly insisting to Wikipedia editors that any suggestion WMF had ever considered building a search engine was “a total lie”.[10]
  • Just hours later, WMF comms uploaded the Knight Foundation grant agreement itself to the WMF’s own wiki, confirming for the first time, in public, that WMF was describing the project as “the Internet’s first transparent search engine”. The Signpost has the most detailed breakdown not only of the grant agreement, but also three supplemental documents which were leaked to the Signpost but have not been made public at this time.
  • Also read this powerfully-argued blog post by Wikipedia veteran Liam Wyatt about the poor strategic decision-making that led to the current controversy.[11]
  • You might then have a look at The Register, always snarky, but with a decent summary of where things stood last week, just before it became newsworthy. I definitely recommend this February 15 story by Vice’s Motherboard about the fiasco (and this follow-up)[12] but skip this Newsweek story except to see how the media was, for a brief moment, cluelessly reporting that Wikipedia was taking on Google.[13]
  • However incomplete, I think this upshot from The Verge is a good enough summary, at least for public purposes:
    • Whether Wikimedia’s plans just naturally evolved [away from the search engine project] or whether it was responding to the community’s response is difficult to say, but the organization is now, at least, claiming it does not want to square up to Google, but just improve its own product.

  • As all this was unfolding, the exodus of key WMF staff was accelerating. On February 8, Tretikov announced on Wikimedia-l that Luis Villa, head of the Community Engagement department and previously a member of the WMF’s legal team, would be leaving.
  • At least Tretikov seemed to be in control of that one. Because the next day Anna Koval, a manager of the education program, announced her own departure on the mailing list.
  • And then on Friday, February 12, a very big resignation letter dropped on the Wikimedia-l: that of Siko Bouterse, another veteran leader who had long provided a crucial link between the Wikipedia volunteer community and the professional WMF staff. Careful with her words, Bouterse wrote:
    • Transparency, integrity, community and free knowledge remain deeply important to me, and I believe I will be better placed to represent those values in a volunteer capacity at this time.

  • Messing up my timeline a bit, but still worth noting: Hocutt, the developer who had made public internal fears about silencing dissent, announced her own (albeit temporary) departure in yet another Wikimedia-l post on February 17, noting her leave was “due in part to stress caused by the recent uncertainty and organizational departures.”

♦     ♦     ♦

Finally, on February 16, Lila Tretikov published an open letter[14] on the Wikimedia blog titled “Clarity on the future of Wikimedia search”. Alas, it wasn’t terribly clarifying: it seemed aimed at the clueless mainstream journalists like the one at Newsweek, and not at the Wikipedia community who knew which information gaps actually needed to be filled in. It began:

Over the past few weeks, the Wikimedia community has engaged in a discussion of the Wikimedia Foundation’s plans for search and discovery on the Wikimedia projects.

Well, that is certainly one way to put it! Put another way, you have been backed into a corner defending the untenable proposition that Wikipedia has never considered building a search engine, and now that the mainstream press is reporting, based on your own documents, that you are building a search engine, one certainly has to say something about it.

After much boilerplate about the growth of Wikipedia and its many achievements, Tretikov and Moran finally get around to the point:

What are we not doing? We’re not building a global crawler search engine. We’re not building another, separate Wikimedia project. … Despite headlines, we are not trying to compete with other platforms, including Google.

This seems to be true, insofar as there is no search project currently. However, Wales had previously locked himself into the position that there was never a search project originating from WMF, and by now we know that is obviously false. Without any acknowledgement in this letter, it is useless. But it’s worse than that:

Community feedback was planned as part of the Knowledge Engine grant, and is essential to identifying the opportunities for improvement in our existing search capacity.

We are 10 months past the initial plans for this far-reaching, mission statement-busting project, six months past the award of a grant to pursue this quixotic effort, and not two months removed from the violent ejection of a Board trustee over the matter… and all you can say is “feedback was planned”?

Finally, the closest thing to acknowledging the Knowledge Engine was, at some point, actually a search engine:

It is true that our path to this point has not always been smooth, especially through the ideation phase.

And nothing more.

The first comment on the post was brutal, bordering on uncivil, from a retired editor. It concluded:

You are either:
a. Flat out lying, and hoping we don’t actually read the grant,
b. Have misled the Knight Foundation as to your intentions for their grant money, or
c. Seriously incompetent and should never be put in charge of writing a grant application
None of these options look good for the WMF.

A few hours later, a member of WMF’s Discovery team gamely stepped forward and tried to offer a plausible explanation for how the grant request did not necessarily imply a Google-competitive search engine project—damage control, essentially—but still had to concede the wording of the grant did not make Tretikov or WMF look good: “It is ambiguous. I can’t speak to the intent of the authors and while there are current WMF staff listed, they are not the sole authors of the document.”

Finally, a day later, a true hero emerged in Max Semenik, another Discovery team engineer, mostly unknown to the community, and who was willing to take off his PR hat to say what everyone pretty much knew:

Yes, there were plans of making an internet search engine. I don’t understand why we’re still trying to avoid giving a direct answer about it. …

The whole project didn’t live long and was ditched soon after the Search team was created, after FY15/16 budget was finalized, and it did not have the money allocated for such work … However, ideas and wording from that search engine plan made their way to numerous discovery team documents and were never fully expelled. …

In the hindsight, I think our continued use of Knowledge Engine name is misleading and should have ended when internet search engine plans were ditched. No, we’re really not working on internet search engine.

Now that sounds like a real answer! What’s more, it also provides the outlines of a believable story as to why the Knight Foundation grant included language about the search engine, even if it wasn’t then the plan. This is transparency of a sort! But it’s transparency of the last-ditch kind. That it had to come from a low-level engineer indicates there is a major problem, and speaks to the fact that the WMF simply cannot go on this way.

At a time when Wikipedia has already-existing problems, the WMF was asking for money to basically create a whole new set of problems. That is the mark of an organization, if not a movement, adrift. Clearly, they pitched a search engine to Knight, and they asked for millions—I have heard the number placed at $100 million over 5 years—later reduced to $12 million, of which Knight provided $250K to build a plan—essentially a pat on the head: ‘since we like you, here’s a few bucks to come up with a better idea’.

Mysteries remain: where did the idea come from, who championed it, when did it die—or when did it recede and what happened afterward? One answer is supplied in another comment on this public thread (!) from yet another WMF team member (!) pointing a finger at former VP of Engineering Damon Sicore as having “secretly shopped around grandiose ideas about a free knowledge search engine, which eventually evolved into the reorg creating the Discovery team.” Sicore left in July 2015. A big remaining question, for which there is no answer at this time: when the actual grant was submitted to the Knight Foundation.

An argument I have heard in recent days is that it’s common in grant-making to try for everything you can and see what actually sticks. This may be true, but if so, it doesn’t seem to have been worth it. That WMF leadership felt they had to hide the fact later on also underlines the mistake they knew they were making.

Another big question: how does this affect Wikipedia’s public reputation, particularly among donors, most especially among foundations? You have to think the answer is a lot. The WMF looks like the Keystone Kops. Why would you give it money? And right now, the Knight Foundation specifically must be asking what it’s got itself into.

♦     ♦     ♦

Within the last 24 hours, the trickle of public criticism about Tretikov has become a widening stream. Some of it is taking place in the above comment thread, plenty is still happening at Wikimedia-l, but a lot of it has moved to a semi-private Facebook group called Wikipedia Weekly, where staffers previously not known for voicing internal dissent have been speaking quite frankly about how bad things are at 149 New Montgomery Street.[15]

Yesterday afternoon on the mailing list, a developer named Ori Livneh replied to a plea for calm by community Board trustee Dariusz Jemielniak by explaining why they could not remain silent:

My peers in the Technology department work incredibly hard to provide value for readers and editors, and we have very good results to show for it. Less than two years ago it took an average of six seconds to save an edit to an article; it is about one second now. (MediaWiki deployments are currently halted over a 200-300ms regression!). Page load times improved by 30-40% in the past year, which earned us plaudits in the press and in professional circles. …

This is happening in spite of — not thanks to — dysfunction at the top. If you don’t believe me, all you have to do is wait: an exodus of people from Engineering won’t be long now. Our initial astonishment at the Board’s unwillingness to acknowledge and address this dysfunction is wearing off. The slips and failures are not generalized and diffuse. They are local and specific, and their location has been indicated to you repeatedly.

Shortly thereafter Asaf Bartov—one of WMF’s more outspoken staffers, even prior to the last 48 hours—voiced his agreement and turned his comments back to Jemielniak:

Thank you, Ori. +1 to everything you said. We have been laboring under significant dysfunction for more than a year now, and are now in crisis. We are losing precious colleagues, time, money, *even more* community trust than we had previously squandered, and health (literally; the board HR committee has been sent some details). Please act. If for some reason the board cannot act, please state that reason. Signal to us, community and staff, by concrete words if not by deeds, that you understand the magnitude of the problem.

And then, about 10 minutes later, Lila Tretikov posted to this very conversation thread, and this is all she had to say:

For a few 2015 accomplishments by the product/technical teams you can see them listed here:

https://meta.wikimedia.org/wiki/2015_Wikimedia_Foundation_Product_and_Technology_Highlights

That is the complete text of her emailed post. That is really all she had to say, in a public thread specifically criticizing her leadership and all but explicitly calling for her removal. One gets the feeling, at this point, even Lila Tretikov just wants it to be over.

♦     ♦     ♦

In the early morning hours of February 19, a WMF software engineer named Kunal Mehta wrote an impassioned, rather forlorn post on his personal blog, titled: “Why am I still here?”:

Honestly, I don’t understand why the current leadership hasn’t left yet. Why would you want to work at a place where 93% of your employees don’t believe you’re doing a good job, and others have called you a liar (with proof to back it up) to your face, in front of the entire staff? I don’t know everything that’s going on right now, but we’re sick right now and desperately need to move on. …

I love, and will always love Wikimedia, but I can’t say the same about the current state of the Wikimedia Foundation. I’ve been around for nearly nine years now (nearly half my life), and it feels like that world is slowly crumbling away and I’m powerless to stop it.

And that’s why there is really just no way Lila Tretikov can continue to lead the WMF. A week ago, the thinking was: the Board of Trustees chose her over James Heilman, so they’re really sticking with her. At the time it also seemed like the Knowledge Engine was a going concern, and their support for her owed to their insistence on moving ahead with the project above community and staff objections. Knowing what we do now, it’s inexplicable. The thinking now is: she obviously has to go, and the only reason the Board might have for not acting on it would be legal considerations.

For the sake of Wikipedia’s future, the Wikimedia Foundation needs new leadership. Lila Tretikov must resign, or she must be replaced. This is the most challenging blog post I’ve ever had to write at The Wikipedian. The next one, I hope, will be about the start of the turnaround.

Notes
1 The denouement of Geshuri’s time at WMF might have been a great post of its own, but I didn’t get to it, and, as usual, Signpost has you covered.
2 specifically, User:Risker, a widely respected former member of Wikipedia’s Arbitration Committee
3 a wiki devoted to, well, meta-topics regarding Wikimedia projects
4 “Our aim was to begin exploring new initiatives that could help address the challenges that Wikipedia is facing, especially as other sources and methods arise for people to acquire knowledge. If you haven’t yet, please have a look at the recent data and metrics which illustrate the downward trajectory our movement faces with readership decline (since 2013), editor decline (since 2007, which we stabilized for English Wikipedia in 2015), and our long standing struggle with conversion from reading to editing. These risks rank very high on my list of priorities, because they threaten the very core of our mission.”
5 #6 in the U.S., #7 worldwide
6 See this comment from WMF’s Dario Taraborelli, who argues: “[T]raffic per se is not the goal, the question should be about how to drive back human attention to the source”.
7 as James Heilman does in his own timeline of events
8 “It was my mistake to not initiate this ideation on-wiki. Quite honestly, I really wish I could start this discussion over in a more collaborative way, knowing what I know today.”
9 on Tretikov’s discussion page, no less
10 Full quote: “To make this very clear: no one in top positions has proposed or is proposing that WMF should get into the general “searching” or to try to “be google”. It’s an interesting hypothetical which has not been part of any serious strategy proposal, nor even discussed at the board level, nor proposed to the board by staff, nor a part of any grant, etc. It’s a total lie.”
11 “It seems to me extremely damaging that Lila has approached an external organisation for funding a new search engine (however you want to define it), without first having a strategic plan in place. Either the Board knew about this and didn’t see a problem, or they were incorrectly informed about the grant’s purpose. Either is very bad.”
12 Both of which quote yours truly, so take that into consideration.
13 This story has since been corrected, albeit on an insignificant, unrelated point.
14 Co-authored by Vice President of Product Wes Moran
15 Example: “Dozens of staff formally warned the Board and Leadership months ago that this would happen. Sadly, we were right. But it was entirely predictable, and preventable.”

Three Million Served

on August 18, 2009 at 6:32 am

This week marks a milestone for the English-language Wikipedia that is both major and somewhat arbitrary: the creation of its 3 millionth article. If you visit the front page of Wikipedia now, you will see this message:

wiki-3-million

That article, about Norwegian actress Beate Eriksen, is currently locked down to prevent vandals from messing it up, something that happens with nearly every article that gets widespread attention. Of course, usually it is because the subject was in the news, rather than the article itself.

As the chart below indicates (taken from here), Wikipedia passed 2 million articles in the third quarter of 2007. Will it take another 2 years for Wikipedia to reach 4 million?

wiki-article-growth

Actually, it may take a bit longer: Wikipedia’s article growth has been slowing down. This has been a topic for discussion on the Wikipedia Weekly podcast at least as far back as a year ago, and is inevitable. Given Wikipedia’s success and its strict rules on what qualifies for an article, there will come a point where most articles have already been created. We may have reached that point.

Or, as I think more likely, we have created most of the articles that can be assembled from web sources and in-print books. That’s why I think the next phase of Wikipedia’s growth will have to depend on archived materials about historical subjects, which are exactly the type of article Wikipedia does least well. This wouldn’t stop Wikipedia’s growth from slowing, but it would keep its growth meaningful.

Update: From the comments, here are two thoughts from very smart and much more experienced Wikipedians than yours truly. First, David Gerard:

Actually, I think we’ve barely scratched the surface of books, in-print or not. What’s been done so far isn’t even the low-hanging fruit, it’s the fruit that’s actually sitting on the ground waiting to be picked up.

The growth curve so far looks like a logistic curve with a linear increase on top.

One interesting thing is that is the growth curves for the other large Wikipedias look similar. And the smaller Wikipedias are typically in early linear growth or the exponential upcurve of the logistic curve.

And from Sage Ross, a Wikipedia Weekly contributor:

“we have created most of the articles that can be assembled from web sources and in-print books”

That’s not nearly the case, especially if you count digitized scholarly journals as available sources too. Wikipedia could easily have another 3 million articles (probably more like 30 million) based on published sources. It’s just that the deeper you go into specialized areas where the untapped sources are rich, the fewer people there are who are interested in and/or capable of writing about those areas.