The infodemic, a scientist’s opinion
Interview with Luca Nicotra, senior campaigner at Avaaz. He is a data analysis and statistical modelling researcher, as well as an information and internet rights advocate, working on issues such as privacy, open innovation, freedom of expression, transparency, open data and intellectual property. At Avaaz he has extensively researched online disinformation. His academic background is in computer science (BS & MS from the University of Pisa), and in machine learning and data mining (PhD from the Max Planck Institute for Biological Cybernetics).
You recently published a report on Covid-19 disinformation disseminated on Facebook that found a substantial delay in the implementation of the tech company’s own anti-disinformation policies. More specifically, you found that it can take 22 days for the platform to take down or issue warning labels for content. Do you continue to monitor the situation? If so, have you seen any improvements?
In terms of the disinformation sample analysed in our report, the average time across the entire database between the publication of a fact-check and Facebook issuing a warning label was seven days. We continue to monitor disinformation on this topic. However, tracking the delay in labelling is very time-consuming, as it means re-checking on a daily basis. That is why we haven’t been able to do it since the publication of the report, but we may go back to it in the future to check for any improvements.
You called Facebook “an epicenter of coronavirus misinformation”. Can you elaborate on why you believe that to be the case?
What makes Facebook the epicenter is its worldwide reach. It is the number one social media platform worldwide, with 2.4 billion active users (over half of all internet users globally). Additionally, Facebook owns two other social media giants, WhatsApp and Instagram, with 2 billion and 1 billion users respectively. That said, all the measures Avaaz is advocating for can and should be applied to other social media platforms such as YouTube and Twitter.
Your research covered non-English language content, and more specifically, Italian, Spanish, French, Arabic, and Portuguese content. Did you spot any common threads across different geographic regions?
Over the course of the recent investigation, combined with previous ones, Avaaz found that the same piece of misinformation can be translated and tweaked to better fit national realities or different platforms. One effective misinformation story can mutate, infect, and then lodge in and remain across hundreds of Facebook groups and pages without detection. For example, a harmful misinformation post on Facebook claiming that one way to rid the body of the virus was to drink a lot of water and gargle with water, salt or vinegar was shared over 31,000 times before eventually being taken down after Avaaz flagged it. But the other 2,611 clones of that post remain on the platform with over 92,246 interactions. Most of these cloned posts have no warning labels from Facebook.
There are also differences: depending on the language, moderation and labelling are weaker, as Avaaz also observed in a previous report, titled Megaphone for Hate. Each country also has its own social, economic and cultural specificities, and the misinformation content is often created and adapted to better fit those realities.
What is the most prevalent coronavirus-related disinformation that you discovered? Health remedies, origin stories, or scapegoating specific groups?
The largest category of disinformation identified in our report was conspiracy theories, mostly about the origin of the virus. The next largest group related to false medical information, such as bogus cures and methods of self-diagnosis or prevention, which we consider to have the potential to contribute to immediate physical harm. Within this category, a common trend was posts containing false medical advice purporting to come from health authorities, which were translated into different languages. Our sample also contained four stories attacking minorities, refugees and migrants. Incidentally, all of these originated in Italy, but none were fact-checked by Facebook’s third-party fact-checkers and, at the time of the report’s publication, none carried disinformation warning labels.
Facebook was quick to act on one of your recommendations by pledging to alert people who were exposed to disinformation to that fact. Do you think it’s a measure other platforms should implement?
It is important to note that the change announced by Facebook on 16 April 2020 came after three years of intense campaigning by Avaaz and following our latest report. Although the measure announced is an important first step, it is not a full adoption of our “Correct the Record” recommendation. Facebook’s reluctance to retroactively notify and provide corrections to every user exposed to harmful coronavirus misinformation is threatening efforts to flatten the curve across the world and could potentially put lives at risk. We do recommend that Facebook and all the other platforms implement Correct the Record in its full format, as well as other best practices against mis- and disinformation.
More specifically, in our view correcting the record should be a five-step process:
1) Define: The obligation to correct the record would be triggered when independent fact-checkers verify that content is false or misleading and a significant number of people – e.g. 10,000 – viewed the content.
2) Detect: Platforms must proactively use technology such as AI to detect potential disinformation with significant reach that could be flagged for fact-checkers; deploy an accessible and prominent mechanism for users to report disinformation; provide independent fact-checkers with access to content that has reached e.g. 10,000 or more people.
3) Verify: Platforms must work with independent, third-party verified fact-checkers to determine whether reported content is disinformation, as defined by the EU.
4) Alert: Each user exposed to verified disinformation should be notified using the platform’s most visible and effective notification standard.
5) Correct: Each user exposed to disinformation should receive a correction that is of at least equal prominence to the original content, and that follows best practices.
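As a rough illustration of how the Define, Alert and Correct steps above fit together, here is a minimal Python sketch. It is not Avaaz's or any platform's implementation: the `Post` record, the function names and the way exposed users are stored are all hypothetical, and the 10,000 threshold is only the example figure mentioned in step 1.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple

# Example figure from step 1; the real threshold would be a policy decision.
REACH_THRESHOLD = 10_000


@dataclass
class Post:
    """Hypothetical record of a piece of content and who has seen it."""
    post_id: str
    views: int
    # Verdict from independent fact-checkers: "false", "misleading",
    # or None if the post has not been fact-checked.
    verdict: Optional[str] = None
    # Every user who saw the post, including before it was fact-checked.
    exposed_users: List[str] = field(default_factory=list)


def must_correct(post: Post, threshold: int = REACH_THRESHOLD) -> bool:
    """Define: the obligation is triggered by a false or misleading verdict plus significant reach."""
    return post.verdict in {"false", "misleading"} and post.views >= threshold


def corrections_for(post: Post, correction: str,
                    threshold: int = REACH_THRESHOLD) -> List[Tuple[str, str]]:
    """Alert and Correct: pair every exposed user with the correction to be shown."""
    if not must_correct(post, threshold):
        return []
    return [(user, correction) for user in post.exposed_users]
```

The point of the sketch is that the recipient list is built from everyone recorded as exposed, not only from people who encounter the post after the fact-check.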
Facebook has begun to implement a type of correction for a limited number of users – mainly in the US but increasingly in Europe. Importantly, however, these corrections are only shown to people who see or share the content after fact-checkers determine it is disinformation. None of the platforms are taking any steps to show corrections to people who saw the debunked posts before they were fact-checked.
An additional useful measure would be the extraction of all content identified as misinformation from recommendation engines. It is time social media platforms take full responsibility for their recommendation algorithms. Content creators should have freedom of speech, not freedom of reach. Another measure we are proposing is adopting a three-strike rule for misinformation. This policy would mean that if a user, page, group, or channel is detected to have spread misinformation more than three times, all the provider’s content should be extracted from the platform’s recommendation algorithm. Over time, this policy would help ensure that high quality content is more prominently promoted by the algorithms, while misinformation actors are marginalised.
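The two recommendation measures described here can be sketched as a simple filter. This is only an illustrative sketch under stated assumptions: the function name, the `(post_id, source)` pairs and the strike counts are hypothetical, and the cut-off follows the wording "more than three times".

```python
from typing import Iterable, List, Mapping, Set, Tuple

# From the text: a source is excluded once it has spread
# misinformation more than three times.
MAX_STRIKES = 3


def eligible_for_recommendation(
    candidates: Iterable[Tuple[str, str]],  # (post_id, source) pairs
    flagged_post_ids: Set[str],             # posts identified as misinformation
    strikes: Mapping[str, int],             # misinformation count per source
) -> List[Tuple[str, str]]:
    """Keep only posts that may still be promoted by the recommendation engine."""
    return [
        (post_id, source)
        for post_id, source in candidates
        # Measure 1: flagged posts are never recommended.
        if post_id not in flagged_post_ids
        # Measure 2: sources over the strike limit lose recommendation for all content.
        and strikes.get(source, 0) <= MAX_STRIKES
    ]
```

Note that the filter only affects what is recommended. The posts themselves stay on the platform, which is the distinction drawn between freedom of speech and freedom of reach.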
What further transparency measures would you like to see implemented by tech companies? Are ad libraries enough for example?
Platforms must provide comprehensive periodic reports listing – aggregated by country and/or language – the disinformation found on their services, the number of bots and inauthentic accounts that were detected, what actions were taken against those accounts and bots, and how many times users reported disinformation. The reports must also detail platforms’ efforts to deal with disinformation in order to make the nature and scale of the threat public. In order to protect citizens from disinformation warfare, standards of transparency should apply to all paid content – not just political advertising.
Another key transparency issue where platforms have many shortcomings is researchers’ access to data. Avaaz has worked with the Institute for Strategic Dialogue (ISD) and the Mozilla Foundation, and concluded that there are clear and continuing failures in data provision for those seeking to detect, call out, and respond to disinformation. During the European Parliament elections last May, these shortcomings included a lack of access to appropriate public data for research, as well as the absence of swift and timely communication from platforms to researchers. Platforms’ collaboration with external research organisations was sporadic and inconsistent.
Apart from Twitter, major tech companies that signed up to the Code of Practice have also rolled back access to public data or metadata. This not only hampers detection efforts, but also the evaluation of the impact of their current responses to disinformation. In our communication with key experts and fact-checkers in this field, we have been informed that although platforms promised more cooperation when the Code was introduced, very little has in fact changed.
Overall, while Twitter continues to provide relatively thorough access to public user and content data for researchers through its API, Facebook and YouTube continue to fall short in their provision of accurate and timely data for researchers. Facebook also blocked Netvizz, which was a tool that allowed the download of all posts and comments from certain pages. As far as we know, YouTube allows continued use of the Netvizz data tools.
A tool currently available for research is CrowdTangle, a Facebook-owned platform to track social media activity on Facebook, Twitter, Reddit, and Instagram. CrowdTangle is the only tool that allows researchers to analyze large quantities of data on pages/groups’ posts on Facebook. While it can be incredibly useful, it also has its own shortcomings. For instance, it is not possible to search and view all posts by time period for a page or multiple pages.
In terms of ad libraries, Facebook and Instagram’s data is not provided in a format that allows scalable scrutiny. For example, there is no ability to download data in a machine-readable format, and no ability to export the full data. Developers can access large amounts of Twitter data legally via the platform’s API. Twitter also has an ad transparency center although that data is not nearly as comprehensive as the data that Facebook provides. Google’s advertising API was better designed for research, but with far less scope.