William Beutler on Wikipedia

Posts Tagged ‘Brett Kavanaugh’

What Happened to CongressEdits?
The Thrilling Life and Untold Death of Twitter’s Most Important Wikipedia Bot

Posted on January 17, 2019 at 11:59 am

Wikipedia and Twitter are very different internet platforms, but parallels can be found if you look closely enough (as The Wikipedian did a few years back). One important commonality is bots written by developers that automate certain tasks. On Wikipedia, bots can be found fixing typos, reverting vandalism, and performing repetitive administrative procedures. On Twitter, bots automate tweeting, retweeting, and related behaviors.

One of the most newsworthy bots of the past five years ties both platforms together. We’re talking, of course, about @CongressEdits, created to track edits made to Wikipedia from U.S. Capitol computers.

Launched in summer 2014, the account quickly became a cause célèbre, and if you wonder why, I’d like to invite you to familiarize yourself with the unavoidably self-referential Wikipedia article titled “United States Congressional staff edits to Wikipedia”. (Congressional staffers edit Wikipedia a lot, often embarrassingly, sometimes scandalously.) Over the next few years, CongressEdits would prove to be the source of news stories both serious and just for fun, revealing efforts to hide unflattering information and announcing the availability of Choco Tacos in congressional vending machines. But then, in late 2018, @CongressEdits disappeared. If you visit the account today, you’ll see a standard message: “This account has been suspended.”

CongressEdits in 2014

What happened? Let’s start at the beginning: the account was set up (and the code behind it written) by Ed Summers, a software developer then working at the Library of Congress. He had previously established other Twitter bots, and also created WikiStream, a visualization of recent changes to Wikipedia displayed in real time. CongressEdits was hardly a planned project; in fact, it was “largely just an experiment,” as Summers explained to The Wikipedian in an interview. His inspiration for the new account came from the sudden appearance of @ParliamentEdits, which then and now tracks Wikipedia edits made from the UK Parliament. The creator of that account, Tom Scott, had piped the two known IP addresses for Parliament through the IFTTT automation service, which then published its findings to Twitter. ParliamentEdits was simple and clever. But a similar tool focused on the U.S. Congress would not be so simple: computers at the Capitol Building and its half-dozen office buildings are known to use multiple IP ranges[1] and, within them, thousands of possible addresses. After asking around, Summers received a list of known congressional IP addresses from GovTrack.us, an organization focused on government transparency. Summers put in a few hours of coding, and on July 8, 2014, CongressEdits was born.
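To make the difference between two fixed addresses and whole ranges concrete, here is a minimal sketch of the kind of check such a bot needs: does the address behind an anonymous edit fall inside any watched range? This is not Summers’ actual code. The function name `is_watched` and the list `WATCHED_RANGES` are hypothetical, and the ranges are placeholders taken from blocks reserved for documentation, not real congressional addresses.

```python
import ipaddress

# Placeholder ranges from blocks reserved for documentation.
# These are assumptions for illustration, NOT real congressional ranges.
WATCHED_RANGES = [
    ipaddress.ip_network("192.0.2.0/24"),
    ipaddress.ip_network("198.51.100.0/24"),
    ipaddress.ip_network("2001:db8::/32"),
]


def is_watched(editor, ranges=WATCHED_RANGES):
    """Return True if an editor's address falls inside any watched range."""
    try:
        address = ipaddress.ip_address(editor)
    except ValueError:
        # Logged-in edits are attributed to a username, not an IP address.
        return False
    # Membership is False when the address and network versions differ.
    return any(address in network for network in ranges)
```

A bot would run each incoming edit through a check like this and pass along only the matches, which is why a list of ranges, rather than a pair of addresses, was the missing ingredient.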

Almost from the start, CongressEdits was the subject of supportive coverage, first from tech and politics sites like Ars Technica and TechPresident, and soon enough from The New York Times as well. Before long, the bot had its very own article on Wikipedia. What’s more, all the attention on CongressEdits (and to a lesser extent, ParliamentEdits) inspired developers in other countries to borrow the idea (and in some cases, Summers’ open source code) to create similar Twitter bots focused on the legislatures of Australia, Canada, Germany, and other countries. Summers is happy to have played the role he did, but also thinks it would have happened without him. “If I didn’t do it, somebody else would have done it soon after me,” he said. “It was just in the air at the time.” Still, CongressEdits proved to be the most famous among the bots, eventually gaining more than 60,000 followers.[2]

Twitter-addicted journalists were soon mining CongressEdits for story opportunities, whether frivolous (The Daily Beast interviewed a 20-year-old congressional intern who admitted to vandalizing Wikipedia for funsies) or frightening (Mashable discovered edits watering down Wikipedia’s description of a Senate report on CIA torture). On multiple occasions, Wikipedia went so far as to temporarily block IP addresses from editing Wikipedia, for periods of up to three months, before restoring access in the name of openness. Some wondered if CongressEdits actually encouraged bad behavior. These included Wikipedia’s own Jimmy Wales, who speculated in an interview with the BBC “that it only provoke[s] someone—some prankster there in the office—to have an audience now for the pranks”. Others saw worse scenarios—as one developer said: “I just wonder when the first smear campaign leverages the watch bots.”

Which brings us to the autumn of 2018. The Supreme Court nomination of Brett Kavanaugh started off as a routine exercise in shared powers among the U.S. government branches, but ended as one of the most bitterly partisan nomination battles in history. On Thursday, September 27, in a dramatic sequence of events before the Senate Judiciary Committee, Christine Blasey Ford, a former high school classmate of the judge, accused him of sexual assault, a charge Kavanaugh angrily rebutted. It was the next big moment in the #metoo movement, and the fight turned personal: in the committee hearing room, around dinner tables, and especially on social media.

That same day, a journalist alerted Summers that a CongressEdits tweet was going viral. This was nothing new, and Summers didn’t investigate until the next morning, when he found that Twitter had suspended the account. The story was already in the Washington Post: an anonymous person using a congressional IP address had “doxxed” several Republican members of the Senate Judiciary Committee, posting their phone numbers and even home addresses (“personally identifiable information”, or PII in legal terminology). This was no regular chicanery, Summers said: “In the past, people have noticed that the bot had a lot of followers and then if they edited Wikipedia from within the Capitol building, they could basically send messages. … But leaking PII through the edits themselves was something new.”

Redacted CongressEdits doxxing tweet

On Wikipedia, the edits were swiftly reverted and suppressed from public view. On Twitter, with the offending tweets deleted, Summers appealed to Twitter for reinstatement. The request was approved, and soon CongressEdits was operating as normal. But this didn’t last either: the senators’ personal information was again added from a Capitol IP address, and Twitter suspended the account once more. Within days a former Democratic staffer was identified and arrested, but for CongressEdits it was too late.

Aghast, Summers was ready to simply give up. His mind changed after speaking to Daniel Schuman, policy director for the government transparency organization Demand Progress,[3] who persuaded him that additional code could be introduced which would scan edits for patterns common to such information: seven-digit strings for phone numbers, @ symbols for email addresses, and the like. Offending tweets could be withheld, or put in a queue for review, and Summers was willing to do it. He appealed again to Twitter, explaining that he would introduce a filter were the account to be restored. Instead he simply received a form letter stating: “Your account was permanently suspended due to multiple or repeat violations of Twitter Rules … This account will not be restored.”
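For the curious, a filter along the lines Schuman described might look something like the sketch below. It is an illustration under assumptions, not the revised code Summers wrote: the two patterns and the names `looks_like_pii` and `route_edit` are hypothetical, and a production filter would need more patterns (street addresses, longer phone formats) and careful tuning.

```python
import re

# Hypothetical patterns based only on the description above.
# Seven digits, optionally split 3-4 by a dash, dot, or space.
PHONE_PATTERN = re.compile(r"(?<!\d)\d{3}[-. ]?\d{4}(?!\d)")
# Any run of non-space characters around an @ symbol.
EMAIL_PATTERN = re.compile(r"\S+@\S+")


def looks_like_pii(text):
    """Return True if the text contains a phone-like or email-like string."""
    return bool(PHONE_PATTERN.search(text) or EMAIL_PATTERN.search(text))


def route_edit(text, review_queue):
    """Return the text if it seems safe to publish; otherwise queue it."""
    if looks_like_pii(text):
        # Withhold the tweet and leave it for a human to review.
        review_queue.append(text)
        return None
    return text
```

A filter this blunt will also hold back harmless edits that merely contain a seven-digit number, which is the reason to route matches to a review queue rather than discard them.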

As of January 2019, the account remains suspended.[4] Reached for comment, a Twitter spokesperson told The Wikipedian, “We don’t comment on individual accounts for privacy and security reasons.” Summers eventually did post the revised code to the Twitter alternative site Mastodon, and so CongressEdits lives on, but in a place where almost no one will think to look for it. It now has just 382 followers.

CongressEdits thus became another casualty of the weaponization of social media, an increasingly common phenomenon. Twitter knows it has a harassment problem, has made repeated pledges to address the issue, and has taken serious steps to crack down on abusive bots and individuals. So far, those steps have yielded mixed results. It’s not uncommon to find stories of accounts suspended for reasons mild and mysterious. Sometimes Twitter’s rules enforcement has arguably contributed to the problem: for instance, the decision to ban @ImposterBuster, a bot which confronted users making racist comments.

To be sure, bots deserve more scrutiny than individual users. Last summer, Twitter introduced stringent new guidelines requiring botmakers to resubmit applications to continue their operations, no matter their content. These rules have apparently imperiled another internet-famous bot, @RealPressSecBot, which reformats tweets from @RealDonaldTrump to look like an official White House press release. In a December tweet thread, its creator, Russel Neiss, expressed his frustration and refusal to comply, and promptly began selling sponsored posts to monetize his protest until such time as this bot, too, is shut down.

Should CongressEdits return to Twitter, it will return to a much different internet than the one that gave birth to it. Using open technology for spirited problem-solving has given way to recently realized threats and increased security measures. If the account is unblocked, Summers says he would consider bringing it back to Twitter, but wouldn’t do so absent a clear message of support from the Wikipedia community: “I would take it up at that point,” he said, “but I didn’t feel like I was going to unilaterally do that.”

With CongressEdits unable to speak up for itself, its legacy is undefined. Given the events of last fall, its critics might seem to be vindicated. Schuman, who had sought to help Summers restore the account, believes its real value was invisible: “I actually viewed it as something that inhibited people who have conflicts of interest from editing their own pages,” he told The Wikipedian. Summers is more cautious, but stands by his creation: “If I had to say if it’s a net positive or net negative, I would definitely say it’s a net positive, right? … I think it’s useful to think critically about our information sources that way.”

But now, Schuman laments, “This valuable tool just doesn’t exist anymore.”


1 Approximately 30 are known to be in use. —Last updated 1/22/19
2 Encouraged by the success of his side project, Summers created another bot, called @CongressEditors, which tracks edits made to the Wikipedia biographies of congressional members themselves. Later, he returned to CongressEdits and added screenshots of each edit, making it easier still for followers to scrutinize congressional IP edits.
3 co-founded by the late activist, programmer, and Wikipedia contributor Aaron Swartz
4 There is, however, an archive on GitHub.

The Top Ten Wikipedia Stories of 2018

Posted on December 28, 2018 at 4:17 pm

Were you exhausted by 2018? If not, then The Wikipedian doesn’t know what year you just lived in. The continued crises in Western democracies, ongoing wars in the Middle East, embrace of authoritarianism around the world, and the inexorable, seemingly unstoppable transition to a world where data comes before people—all served up for consumption on your internet device of choice as quickly as you can pull to refresh—have changed what “normal” means. Where 2016 was once half-jokingly called the “worst year ever” only for 2017 to replicate the experience, by 2018 it’s become apparent that we may never end up reverting to the previous mean. Indeed, this is just how things are now. Mean.

But is Wikipedia different? Whether because it’s a decentralized, international effort or simply not one dependent upon advertising or unstable business models, the wide world of wiki has often this year felt disconnected from the madness it ostensibly documents. Yet, if we look closely, we can see where the real world has seeped in. In this blog post, for the ninth year in a row, The Wikipedian will present a summary of ten events, trends, phenomena, and people that marked the year in Wikimedia.

Shall we?

10. Is that all she wrote for WikiTribune?

It was a questionable decision on The Wikipedian’s part to make last year’s number one story the rocky start for WikiTribune, the collaborative internet news site from Wikipedia founder Jimmy Wales. It isn’t an official Wikimedia project, it has no financial relationship with the Wikimedia Foundation (WMF), and Wales’ involvement with Wikipedia is arguably at an all-time low. But he had announced the concept in a Wikimania speech five years ago, and it certainly got a lot of attention when it launched. Well, it also got some attention when it laid off its entire staff this fall, having burned through its funding without otherwise making a dent in the broader media ecosystem. This was entirely foreseeable, as the idea always involved a leap of faith (but so did Wikipedia!) and Wales’ post-Wikipedia projects have mostly failed to thrive. Will we see WikiTribune mentioned again next year? It’s already fallen nine positions, so I wouldn’t count on it—or even that it’s still around by then.

9. Testing new models of collaboration

It is no minor understatement to say that Wikipedia has gone very far with its laissez-faire model of knowledge production: like Douglas Adams’ eponymous Hitchhiker’s Guide, the content is written by those who have happened across it, spotted something they could fix, and miraculously actually done so. Yet Wikipedia’s content gaps and systemic biases are well observed, and it should take nothing away from the prior accomplishment to believe that more concerted efforts may be necessary for Wikipedia to take another step forward. For several years now the Wiki Education Foundation has been trying out different models, and this year they may have had a breakthrough with their Wikipedia Fellows pilot program, inviting academics from associations in multiple disciplines to try improving Wikipedia. The project has had some early success, though the participants were few and the achievements relatively limited. Bringing more subject matter expertise to neglected areas of Wikipedia is still a daunting task that may not scale, but these experiments show promise and warrant further study.

8. Getting serious about systemic biases

Wikipedia and its associated nonprofits have been tackling similar problems in other ways: this year was the first occurrence of the Decolonizing the Internet conference, held concurrently with this year’s Wikimania in Cape Town, South Africa. Spearheaded by another independent group called Whose Knowledge?, the event brought together multiple strands of discussion and voices typically underrepresented on Wikipedia. Whereas Wikipedia has historically been the province of white males from North America and Western Europe, the conference’s participation was more than two-thirds non-male, from the Global South, and more than three quarters non-white. Actual outcome? Lots of discussion, a published report outlining agreement on issues to address (not always easy in sometimes fractured, identitarian spaces) and the creation of working groups to tackle specific issues. Whether this effort will have any measurable impact on a recognizable time frame is still an unknown, as the report acknowledges, but formalizing such efforts outside the WMF is nevertheless a major development.

7. “Free” Wikipedia goes offline

OK, one more in this vein: the Wikimedia Foundation’s efforts to bring Wikipedia (and yes, the other projects as well) to the far corners of the world without always-on wifi have unsurprisingly faced many challenges. Since 2012, the leading effort has been Wikipedia Zero, a program seeking telecom firms in developing regions to “zero-rate” Wikipedia, which means accessing it using their services would be exempt from the normal fee. It’s controversial in some quarters as it is often perceived to conflict in spirit, if not in law, with the principle of net neutrality. (Such programs are also controversial in parts of the Global South: for example, in 2016 India rejected Facebook’s similar Free Basics program.) Although the WMF estimates it has reached more than 800 million people in more than 70 countries, the criticism never subsided and there was no corner to be turned, so in 2018 the program was shuttered.

So how will would-be Wikipedians in Ghana, Sri Lanka, Kosovo and elsewhere reach Wikipedia now? One would-be contender is the independent Internet-in-a-Box initiative, which seeks to put a copy of Wikipedia (and other digital libraries) on a low-cost computer (currently a Raspberry Pi) and distribute it the old-fashioned way. While it doesn’t come with any of the scary global data questions of Wikipedia Zero, because now we are again talking about atoms as well as bits, the old problems of distribution and scalability threaten to keep it a niche project. The tradeoffs are stark, and a sign of the times.

6. Attrition of administrators

It’s been a couple of years since we last worried openly about the decline in the total number of Wikipedia editors, largely because the erosion has been arrested. (These days Wikipedians worry about different charts going not down, but going up too much.) But topline figures only tell part of the story, and when it’s the power users who have the most impact on Wikipedia’s day-to-day governance, it’s troubling to note that Wikipedia contributors this year approved just ten new administrators (trusted editors who step in to lock pages and block accounts when needed) on eighteen nominations, the lowest number in either category in Wikipedia’s history. Yes, there’s even a down-and-to-the-right chart to describe it, and while it’s clear this trend has been developing for a while (The Atlantic covered it in 2012!), in 2018 all of the relevant figures approached, or breached, single digits for the first time (speaking of “Wikipedia zero”…). While Wikipedia still has more than 500 active administrators, there was a net loss for the year and no sign that will turn around. As attrition advances, will Wikipedia decide to lighten up, loosen requirements, or learn to live with fewer admins?

5. Save the links!

There are two widely held and mutually exclusive ways to think about the durability of content on the internet: nothing is forgotten, and everything is ephemeral. On Wikipedia, both are true: Wikipedia exists to record knowledge for posterity and every edit to every page is saved for all time, yet once something disappears from Wikipedia’s pages it rarely resurfaces—although it can! And this year, in one sense, it did.

The concept of “link rot” is central to this dilemma: because the internet is made up of links between files (and the World Wide Web specifically between web pages) if one file should disappear, the connection is broken, and so is information. The Internet Archive was established in the mid-1990s—practically the dawn of time, as the internet goes—to combat this problem by actually crawling the web, page by page, and storing all kinds of content long after its original publishers decide they no longer care to. This year a three-year effort in collaboration with Wikipedia delivered on rescuing millions of links to references once used in Wikipedia articles that later disappeared. It’s hard to overstate how important this is: Wikipedia is only as good as its sources, and finally its external sources are as stable as they ever have been—and perhaps can be.

4. I promise we’ll only mention him this once

The Wikimedia movement may be a global one, but considering its flagship Wikipedia edition is in English and its nonprofit foundation based in the United States, in 2018 hardly a week could go by without some intersection between the metastasizing national shitstorm that is the U.S. federal government and the leading source of putatively non-partisan, non-sectarian, non-biased information the world has agreed upon, Wikipedia. Most of the time, this involved harmful edits that require, ahem, administrators to combat effectively. Examples ran from early in the year, when Google amplified an instance of vandalism calling Republicans “Nazis”, to efforts to whitewash articles related to the Mueller investigation, to seemingly constant attacks on the Donald Trump Wikipedia page (often juvenile in nature, which alas is entirely fitting), and finally to multiple issues revolving around the Brett Kavanaugh Supreme Court confirmation hearings. The eyebrow-raising edits to the Devil’s Triangle page were almost quaint; more troubling was the “doxing” of elected officials on Wikipedia, which was then transmitted by CongressEdits (a Twitter account reporting Wikipedia edits from congressional IP addresses), which in turn was shut down by Twitter for being an unwitting conduit. The account, much celebrated since its 2014 launch, has not returned. Like much else these days, it makes for a tidy symbol of the nice things we can no longer have.

3. Building our own HAL 9000

The Wikipedian is not a very successful computer person and therefore pretty anxious about getting this one wrong, so let’s try to keep this really high-level and see if I don’t royally screw this up: besides Wikipedia, there are related projects like Wikidata (an open source knowledge database) and Wikimedia Commons (a repository of media files, especially images) that provide content for Wikipedia articles and serve as resources for researchers. Both have come a long way in recent years, and they are growing together. This year, structured data came to Wikimedia Commons, meaning the metadata about the files will now be better organized and machine-readable, and therefore more searchable, editable, and useful in ways we haven’t yet defined. Also lexemes came to Wikidata, which you’ll just have to trust me is important, too. Meanwhile, the WMF’s ORES project, which uses machine learning to evaluate the quality of entire articles and individual edits, got more useful, but it’s still most useful to decently successful computer people who know how to do things like install JavaScript files, and so it’s not quite ready for prime time. Maybe in 2019 some of this will become more comprehensible.

2. Donna Strickland and Jess Wade

Speaking of very successful computer people, in October the Canadian physicist Donna Strickland was awarded a Nobel Prize for her work in chirped pulse amplification. At the time, Wikipedia had no biographical article for her, and very quickly, this became an international incident in itself. Wikipedia’s oversight was covered by The Washington Post, The Guardian, The Independent, Business Insider, Vox, Nature, The National Interest, The Daily Beast, and many more. In fact, it turned out an article about Strickland had been proposed in the months prior, only to be declined by a reviewing editor.

The Wikimedia Foundation, which absorbs every column inch of bad press that Wikipedia gets, was knocked back on its heels, eventually publishing multiple explanatory blog posts about the matter, first by a mere staffer, and later by its executive director, Katherine Maher. What happened is perfectly understandable to anyone familiar with Wikipedia: there was not enough published information about her from independent sources prior to the Nobel committee’s announcement to satisfy Wikipedia’s stringent requirements. This is not unusual, as academics nearly always toil in obscurity. But of course, it’s almost certainly also related to institutional sexism, and while the processes in this instance were followed correctly, the outcome was nevertheless regrettable after the fact. Understandable, yes, but defensible? Perhaps not. And so the line out of the WMF is that yes, Wikipedia has to do better, but so must we all.

Meanwhile, there is another female physicist whose Wikipedia article was successfully created in early 2018: Jess Wade, who happens to be a Wikipedia editor herself. (Hmmm.) And not just any editor, but one who is the creator of hundreds of articles about other female scientists and who has received considerable media attention because of the fact. (It’s not even the first time this has been a story: cf. Emily Temple-Wood, an American medical student and prolific Wikipedian recognized in 2016’s list.) Wade’s star began to rise this summer, and while it owed nothing to the Strickland issue (her first big round of U.S. coverage arrived more than two months earlier), it does feel like it may not be remembered that way.

1. YouTube’s bewildering fact-checking announcement

Wikipedia’s relationship to the global tech giants like Google and Facebook it is sometimes compared to is uncomfortable for many reasons: all enjoy audiences and impact of truly staggering scale (not to mention Bay Area headquarters) but Wikipedia’s mission and governance are completely the opposite of its supposed peers. If Wikipedia were a for-profit corporation, it would undoubtedly be a “unicorn”, except it’s a nonprofit, and if it ever tried to monetize the value of its reach, its community would rebel and the project might collapse entirely. (Which could still happen to some unicorns, actually.)

All of which is backdrop for probably the most jaw-dropping, perplexing, and as-yet-unsettled Wikipedia-related news of the year: an announcement from YouTube CEO Susan Wojcicki, speaking on stage at SXSW in March, that they would combat “fake news” by including links to Wikipedia articles on certain user-generated videos that ventured into conspiracy theory territory. How would this be done? What videos would be flagged? What articles would be linked? Among those asking: the Wikimedia Foundation, which quickly put out a statement saying that Wojcicki had not shared this information with them. And yet, some publications went so far as to call it a “partnership” even though no such relationship existed. But it’s not hard to imagine why they leapt to this conclusion. Following the announcement, you could be forgiven for thinking they just dropped the whole thing. In fact, YouTube did start including Wikipedia-sourced advisories with some videos, at least in some instances. It’s not clear how it has worked in practice because neither YouTube nor Wikipedia ever mentioned it again. Has the internet already forgotten?

Clearly, this was an unforced error on YouTube’s part. But was it also one by the Wikimedia Foundation? After all, it was little more than two years ago that the WMF published a blog post declaring Wikipedia a bulwark against the “post-fact world”. While the real shame lies with YouTube and its tendency, however unintended, to radicalize its audience by algorithmic recommendation, it’s another reminder that there remains a significant gap between what Wikipedia says it is, what people believe Wikipedia is, and what Wikipedia really is.

Will that gap narrow in the coming year? We’ll see, but I doubt this trend will fall all the way to number 10 in next year’s list. See you in 2019!

Image credits, in order: WikiTribune via Nieman Lab, Tinaral, Doc James, Hazmat2, RandomUserGuy1738, Gaia Octavia Agrippa, Sikander, Andrew Lih.