A machine learning system trained on scholarly journals could correct Wikipedia's gendered under-representation problem

Quicksilver is a machine-learning tool from AI startup Primer: it used 30,000 Wikipedia entries to create a model that allowed it to identify the characteristics that make a scientist noteworthy enough for encyclopedic inclusion; then it mined the academic search engine Semantic Scholar to identify 200,000 scholars in a variety of fields; now it is systematically composing draft Wikipedia entries for scholars on its list who are missing from the encyclopedia.


In addition to correcting omissions in Wikipedia, Quicksilver (which is named for the Neal Stephenson novel) is particularly useful in improving the representation of women in the project. Only 18% of Wikipedia's biographical entries are about women, and the vast majority of Wikipedians are men.

In addition to creating new Wikipedia entries, Quicksilver can suggest new material for existing entries.

Quicksilver doesn't directly edit Wikipedia; rather, it drafts entries and revisions for humans to refer to in improving the encyclopedia.


The first step was to collect 30,000 Wikipedia articles about scientists to train algorithms to detect the signals in news articles that correlate with a researcher having an entry on the site. Quicksilver uses that knowledge to find notable missing names by cross-referencing existing Wikipedia entries against a list of 200,000 scientific authors drawn from an academic search engine called Semantic Scholar. The software sources the facts needed to write missing entries from a collection of 500 million news articles and feeds them into a system trained to generate biographical entries from past examples.
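The cross-referencing step described above can be pictured as a simple set lookup. The following is a minimal sketch, not Primer's actual code: the function names, the name-normalization rule, and the sample names are all hypothetical, and a real system would also need to handle name variants and distinguish different people who share a name.

```python
def normalize(name: str) -> str:
    """Lowercase and collapse whitespace so trivially different spellings match."""
    return " ".join(name.lower().split())


def find_missing(scholars, wikipedia_titles):
    """Return scholars (in input order) with no matching Wikipedia title.

    Both arguments are plain lists of strings; this is an illustrative
    assumption, not the data format Quicksilver uses.
    """
    existing = {normalize(title) for title in wikipedia_titles}
    return [s for s in scholars if normalize(s) not in existing]


# Made-up placeholder names, not real data.
scholars = ["Ada Example", "Grace  Sample", "Lin Placeholder"]
wikipedia_titles = ["ada example", "Someone Else"]

missing = find_missing(scholars, wikipedia_titles)
print(missing)
```

Here the scholars without a matching title are the candidates for a drafted entry; per the description above, any draft would still go to a human editor rather than straight into the encyclopedia.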

Using Artificial Intelligence to Fix Wikipedia's Gender Problem [Tom Simonite/Wired]

(Image: Amit6, CC-BY)