Algorithms can write fake reviews that humans rate as "helpful"

Cory Doctorow

10:29 am Tue, Sep 5, 2017

One of the reasons that online review sites still have some utility is that “crowdturfing” attacks (in which reviewers are paid to write convincing fake reviews to artificially raise or lower a business or product’s ranking) are expensive to do well, and cheap attacks are pretty easy to spot and nuke.

But in a new paper, a group of University of Chicago computer scientists show that they were able to train a Recurrent Neural Network (RNN) to write fake reviews that test subjects could not distinguish from real reviews — and moreover, subjects were likely to rate these as “helpful” reviews.

This is an ominous sign, since fully automated attacks on review sites could spell the end of reviews as an even moderately useful way to sort out otherwise impossibly long lists of potential candidates for your money and/or attention.

The good news is that the researchers were able to develop a countermeasure in the form of another neural network that could reliably identify fake RNN-authored reviews — and even better, it’s cheaper to detect fakes than it is to improve them.

For now.

Future Work. In terms of potential future work, one direction is to
consider the role that user and content metadata can play in both the
attack and defense perspectives. Metadata can be crucial in terms of
deceiving users (e.g., by increasing the number of friends/contacts
on the site) and in assisting defenses [10, 19, 20, 31, 47, 71, 75] (e.g.,
by analyzing the patterns in timestamps of user activities). Orchestrating
the general behavior of user accounts using deep learning
to bypass metadata-based defenses could be an interesting research
challenge. Second, while we limit ourselves to the domain of online
review systems and fake review attacks, deep learning-based
generative text models can be applied to launch attacks in other
scenarios as well. We highlight two of these possible application
scenarios.

Strengthening Sybil Attacks. Attackers can use our techniques to
generate realistic looking text-based user behavior patterns [4], e.g.,
posting, commenting and messaging. This can help attackers make
Sybil (fake) accounts indistinguishable from legitimate accounts
based on textual content. A special case of this involves launching
an impersonation attack in online social networks [11].

Fake News Generation. Identifying fake news, i.e. “a made-up story
with an intention to deceive” [61], currently remains an open challenge
[9]. The research community has started to explore the possibility
of automating the detection process by building an AI-assisted
fact-checking pipeline [41, 72, 76]. We believe that AI can not only
assist fake news detection but also generate fake news. Given the
availability of large-scale news datasets [68], an attacker can potentially
generate realistic looking news articles using a deep-learning
approach (RNN). And due to its low economic cost, the attacker
can pollute social media newsfeeds with a large number of fake
articles.

We hope our results will bring more attention to the problem
of malicious attacks based on deep learning language models, particularly
in the context of fake content on online services, and
encourage the exploration and development of new defenses.

Automated Crowdturfing Attacks and Defenses in Online Review Systems [Yuanshun Yao, Bimal Viswanath, Jenna Cryan, Haitao Zheng and Ben Y. Zhao/arXiv Cryptography and Security]