Name:
Misinformation and truth: from fake news to retractions to preprints
Description:
Misinformation and truth: from fake news to retractions to preprints
Thumbnail URL:
https://cadmoremediastorage.blob.core.windows.net/2a60b6a4-3050-4e41-993c-27b65c1acf48/videoscrubberimages/Scrubber_1.jpg?sv=2019-02-02&sr=c&sig=1sI%2BH042DaEg6kGkOMxthSFOJg%2F59f9lPehgsw0l%2BZs%3D&st=2025-01-15T00%3A47%3A27Z&se=2025-01-15T04%3A52%3A27Z&sp=r
Duration:
T00H50M26S
Embed URL:
https://stream.cadmore.media/player/2a60b6a4-3050-4e41-993c-27b65c1acf48
Content URL:
https://cadmoreoriginalmedia.blob.core.windows.net/2a60b6a4-3050-4e41-993c-27b65c1acf48/40 - Misinformation and truth- from fake news to retractions.mov?sv=2019-02-02&sr=c&sig=ZbyFjgY15WwXaDYMWVwsxU7Dr2kdztnvUeDWtrhMVoU%3D&st=2025-01-15T00%3A47%3A29Z&se=2025-01-15T02%3A52%3A29Z&sp=r
Upload Date:
2021-08-23T00:00:00.0000000
Transcript:
Language: EN.
Segment:0 .
[MUSIC PLAYING]
SPEAKER: Hello, everyone, and welcome to this Misinformation and the Truth, from Fake News to Retractions to Preprints session. We'll be having a lot of people speaking today, so first I'm going to introduce everyone, and then I'll follow up and cue people as it's their turn to speak. So first we have Sylvain Massip from Opscidia, then Jodi Schneider from the University of Illinois, Randy Townsend from the American Geophysical Union, Caitlin Bakker from the University of Minnesota, Hannah Heckner from Silverchair, and last but not least, Michele Avissar-Whiting from Research Square.
SPEAKER: First up is Sylvain, representing Opscidia. OK.
SYLVAIN MASSIP: OK. Hello, everyone. My name is Sylvain Massip, and I'm the co-founder of Opscidia, a French startup. And I'm going to describe my work on the use of open access articles to help fight fake news. OK, first, very quickly, I'm going to introduce Opscidia. So Opscidia is a startup created a year and a half ago with the ambition to help promote the reusability of research results in society as a whole, so outside of academia.
SYLVAIN MASSIP: The basic idea for Opscidia was that scientific articles are a huge source of knowledge but that they are actually not very well used outside of academia. And the reason for this limited use is basically threefold. We have identified three blocks. The first block is obviously access.
SYLVAIN MASSIP: People outside academia generally do not have access to everything that's behind a paywall. So open access is obviously very important. The second block is reproducibility. If you have only one article stating something, it's actually difficult, especially if you are not an expert in the field, to know whether the conclusion of the article is actually trustworthy or not. And the third block is discoverability.
SYLVAIN MASSIP: With around 2 million articles published every year, it's really difficult to find the right information, the information that you need. So really, the purpose of Opscidia is to try and help remove all these three blocks in order to ease the reusability of scientific results in society as a whole. So more concretely, what we do, we are doing several things. The first thing we do is that we have an open access publishing platform, which is Diamond Open Access, so APC free.
SYLVAIN MASSIP: So it's totally free. And we host academic journals, usually run by people from a university. This publishing project is totally free, and we fund the full enterprise with other services, namely scientific text analysis tools and services. We have a platform for access to scientific information.
SYLVAIN MASSIP: And we are also running projects to address specific needs from our clients that are, most of the time, research and development companies and sometimes governments and policy offices. Now let's move to the main part of the talk, so our work on scientific fact checking.
SYLVAIN MASSIP: Basically, we have built a prototype. It's a project that we have run for about a year and which is funded by the Vietsch Foundation. And we have built a prototype of the following form. The user inputs a statement, such as "does Agent X cure, cause, or prevent Disease Y?" And based on data that we collect from Europe PubMed, we try to assess whether the scientific literature backs or contradicts this claim.
SYLVAIN MASSIP: The main objective of this prototype is really first to demonstrate that, indeed, we can use open access scientific articles to help people make sense of different claims that they may have. The second objective is to educate people that basically there are a lot of scientific studies out there, and there is usually no one study that can kill the debate.
SYLVAIN MASSIP: And really you need to have things that are cross-checked between different studies in order to make sure that, indeed, you have a fair view of what experts are thinking on a topic. And obviously, the third objective is to help users identify scientific consensus or the lack thereof. Our approach is, then, to allow the user to enter a scientific claim, such as "does coffee cause cancer," for example.
SYLVAIN MASSIP: Then we select the right corpus from Europe PubMed. And from this corpus, we compute three indicators, which I'm going to describe very briefly afterward: the timeline of the sources; a semantic analysis indicator, where we try to say whether a specific article backs or contradicts the claim; and the numerical conclusions.
SYLVAIN MASSIP: We try to extract the numerical conclusions of the articles, again, to see whether the articles back or contradict the original claim. Just to give you an idea of the integrated pipeline that we aim for when the project ends: we collect the data from PubMed, then we have an index in Elasticsearch.
SYLVAIN MASSIP: And with different Python APIs, we communicate this data to the different pipelines in order to extract the results. And then we have a visual interface to give back the results. And this is how it looks at the moment. We are working with a designer, and hopefully it will be much better when it goes live.
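To make the corpus-selection step concrete, here is a minimal sketch in Python, assuming an Elasticsearch index of harvested open access abstracts; the index name, field names, and claim terms are illustrative, not Opscidia's actual setup.

```python
from elasticsearch import Elasticsearch  # assumes the elasticsearch-py 8.x client

es = Elasticsearch("http://localhost:9200")  # hypothetical local index of harvested abstracts

def select_corpus(agent: str, disease: str, size: int = 200) -> list[dict]:
    """Retrieve candidate articles whose abstracts mention both the agent and the disease."""
    query = {
        "bool": {
            "must": [
                {"match": {"abstract": agent}},
                {"match": {"abstract": disease}},
            ]
        }
    }
    response = es.search(index="openaccess-articles", query=query, size=size)
    return [hit["_source"] for hit in response["hits"]["hits"]]

# Example claim: "does coffee cause cancer?"
corpus = select_corpus("coffee", "cancer")
print(f"{len(corpus)} candidate articles retrieved")
```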
SYLVAIN MASSIP: OK, now let's describe quickly the three indicators. So indicator one is the analysis of the sources of the information. Actually, it is a very simple indicator, very quick to develop. But it's actually very interesting. And it's quite standard in fake news fighting outside of the scientific world because it's usually very difficult to know whether a statement is true.
SYLVAIN MASSIP: It's actually much easier to find where it comes from. And if it comes from something which looks dubious, well, that gives you an indication of whether the claim is true or not. And actually, sometimes it gives nearly the whole story. There are a few famous examples of that on this topic. In the scientific field also, it can be very interesting.
SYLVAIN MASSIP: And in particular, there is one example that we have found where we can illustrate the use of this indicator. If you look at the number of articles on the link between vaccines and autism, you can see that basically there was nothing before '98. And after '98, the number actually became very significant. So what happened in '98? '98 is the year when the article by Andrew Wakefield was published in The Lancet.
SYLVAIN MASSIP: And this article was quite famous. Then it was retracted. So we can see that at the very beginning of the study of this correlation, there is an article which was retracted. That's quite an indication of where the story comes from. Outside of this very, very strong example, we think this indicator is always useful because it helps people really understand how the studies on a particular topic are organized.
SYLVAIN MASSIP: It helps people know whether the subject is still discussed. It helps people know when it was first studied. And it helps people know whether there are a lot of articles or not on this topic. Now let's move on to our second indicator, which is much more complicated in terms of computer science, because here the task is to classify the research articles with respect to the input statement.
SYLVAIN MASSIP: So really, the goal is to answer the question: does this research article support the statement that was input by the user, is it neutral, or does it contradict it? To do that, we have tried different natural language processing pipelines based on the general family of question-answering technologies. And I'm not going to go into details about that.
SYLVAIN MASSIP: But we have evaluated the results that we obtained on different known data sets. And we can see that all these three techniques, Boolean question answering, abstractive question answering, and extractive question answering, give quite interesting results. We can see that the accuracy is quite good, but still not good enough to go without a disclaimer stating that, well, there might be a mistake in the way we evaluate articles.
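As a purely illustrative sketch of the support/neutral/contradict framing, the snippet below uses an off-the-shelf NLI-based zero-shot classifier from Hugging Face transformers; this is a stand-in for, not a reproduction of, the Boolean, abstractive, and extractive question-answering pipelines evaluated in the talk, and the example claim and abstract are invented.

```python
from transformers import pipeline  # Hugging Face transformers

# Zero-shot (NLI-based) classification as a simple stand-in for the QA pipelines described.
classifier = pipeline("zero-shot-classification", model="facebook/bart-large-mnli")

claim = "Coffee causes cancer."
abstract = (
    "In this hypothetical cohort of 50,000 adults, coffee consumption was not "
    "associated with an increased incidence of cancer over ten years of follow-up."
)

result = classifier(
    abstract,
    candidate_labels=[
        f"supports the claim: {claim}",
        f"contradicts the claim: {claim}",
        f"is neutral about the claim: {claim}",
    ],
)

# The first label is the model's best guess for this article's stance toward the claim.
print(result["labels"][0], round(result["scores"][0], 3))
```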
SYLVAIN MASSIP: So what we still have to do here is to finish this assessment and maybe combine these technologies and then integrate them into the website. Now the third indicator is, as I said, the retrieval of the numerical conclusions of the articles. To do that, we developed the following pipeline. We start from the corpus of relevant articles, always the same one that we selected at the beginning.
SYLVAIN MASSIP: Then we identify the right sentences, the right parts of the articles where the answer to the question might be. From these sentences, from these small parts, we extract the numbers. And then we have built a visualization interface to, again, help people explore the results and discover by themselves what can be said from the scientific literature about the claim they are asking about.
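A toy illustration of that numeric extraction step follows; the sentence, the effect-size phrasings, and the decision rule are all made up for the example and are far simpler than a production pipeline would need.

```python
import re

# Hypothetical sentence already identified as relevant to the claim "does sport prevent cancer?"
sentence = (
    "Regular physical activity was associated with a lower risk of colon cancer "
    "(relative risk 0.76, 95% CI 0.66-0.88)."
)

# Look for common effect-size phrasings followed by a number.
pattern = re.compile(
    r"\b(relative risk|odds ratio|hazard ratio|RR|OR|HR)\b[\s=:]*(\d+(?:\.\d+)?)",
    re.IGNORECASE,
)

for measure, value in pattern.findall(sentence):
    ratio = float(value)
    # For a protective claim, a ratio below 1 backs it, above 1 contradicts it.
    if ratio < 1:
        verdict = "backs"
    elif ratio > 1:
        verdict = "contradicts"
    else:
        verdict = "is neutral on"
    print(f"{measure} = {ratio} -> this article {verdict} the protective claim")
```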
SYLVAIN MASSIP: Here is a closer look at the visual interface that we have developed. As you can see, we collect the data, and one article is one point. And you can see here that we have a few red dots, a lot of green dots, and a lot of orange dots, which show that on that particular statement, does sport prevent cancer, we found a lot of studies saying yes, some studies saying that they couldn't see any correlation, and some studies, which actually might be computer mistakes, that said that sport is causing cancer.
SYLVAIN MASSIP: And still you can see here that it gives an interface to really explore the literature on a specific topic. From this point, we link back to the articles so that people can see for themselves whether the algorithm was right on that specific topic. So, to conclude, we have built a pipeline based on three indicators to try to detect scientific consensus and to help the public understand what a research article is, what a scientific consensus is and how it works, and to really help people see and discover by themselves that you need to explore the whole consensus and not just be happy with one study.
SYLVAIN MASSIP: It's usually not enough. We think we have shown also that, indeed, open access can be useful to fight fake news. We think it's very important to show that open access actually has applications outside of academia. And we really think that's how open access will develop. When it's useful, it will be developed. And to say a few words about the next step, we have to integrate online all these three pipelines together, and then our website will be live, probably in a few weeks.
SYLVAIN MASSIP: And then we are really looking forward to continuing this project. So we are very happy to find new collaborations, and we are also looking for funding to go further with this project. OK, to finish, I would like to acknowledge a few people, first the people on my team, Charles, Loic, and Timothée, who actually developed all the work that I presented today. And of course the Vietsch Foundation for funding and you for listening.
SYLVAIN MASSIP: Thank you very much.
SPEAKER: Thank you, Sylvain, very much for your presentation. And up next, it's our second block of presentations, with Jodi Schneider, Randy Townsend, Caitlin Bakker, and Hannah Heckner. And Jodi will go first.
JODI SCHNEIDER: Thank you so much. I'm going to talk about a project that is working to reduce the inadvertent spread of retracted science. The Wakefield article that Sylvain mentioned is one of the most infamous cases of retraction, and we know that it's had a lot of uptake despite the retraction. So the RISRS2020 project is funded by the Alfred P. Sloan Foundation.
JODI SCHNEIDER: And this is a stakeholder consultation and environmental scan. We have done various things, like a literature review of empirical research about retraction and a citation analysis, leading up to a stakeholder consultation, where we had a series of three online workshops in October and November, which invited people throughout the scientific publishing ecosystem to interact in real time about the problems that retractions pose.
JODI SCHNEIDER: And this was built in part on a series of interviews with about 50 people from all across different parts of the ecosystem. We're now moving towards disseminating the draft recommendations. And the start of this is to talk about what sorts of problems we see, what sort of questions we started with, and then we'll have some further discussion of that across some different parts of the ecosystem.
JODI SCHNEIDER: So the questions that we started with, there were four of them. What is the actual harm associated with retracted research? What are the intervention points for stopping the spread of retraction? Who can intervene? What are the gatekeepers? And how do we disseminate retraction status through them? What classes of retracted papers are there?
JODI SCHNEIDER: And are there some that are citable, and in what context? What are the impediments to open access dissemination of retraction statuses and retraction notices? And so these questions will form the basis for the next presentations, which will be from a nonprofit publisher, an academic librarian, and a publishing platform provider. So that's first, then we'll hear from Randy Townsend. I'll turn it over to you.
RANDY TOWNSEND: All right, so thank you all for joining us today. And it's really great to be a part of this conversation on retractions and retracted research. One of the responsibilities that I have at AGU is safeguarding the integrity of the content. This includes evaluating allegations of misconduct in all the various forms. A part of this responsibility means that I'm often engaging with individuals that are reaching out to me because something just isn't right.
RANDY TOWNSEND: They feel that the systems of trust and professional respect have been compromised, at times they may have been wronged, or that the science itself has been jeopardized. Publishing has a number of safeguards in place to protect the integrity of the content, and nothing has been more taboo, in my opinion, than retractions. Retractions were long believed to be career ending, and retracted authors often felt shamed.
RANDY TOWNSEND: I think that retraction has really helped reset the publishing landscape and really protect the integrity of the science. And in one of the most drastic reports highlighting the angst surrounding retractions, in 2014 Yoshiki Sasai-- and please forgive me if I butchered the name too badly-- committed suicide after a paper that he co-authored in Nature was retracted.
RANDY TOWNSEND: In this presentation, my colleagues and I will discuss the impacts and reach of retractions. So for my part, I'll demonstrate what this looks like on the front lines. So AGU is the largest Earth and space science organization in the world. We have more than 60,000 members worldwide. This is just a quick snapshot of what AGU's volunteer landscape and participation look like.
RANDY TOWNSEND: We have 22 peer-reviewed journals. That includes nearly 38,000 authors, nearly 35,000 peer reviews, and nearly 16,000 peer reviewers. So Jodi mentioned in one of the questions, what's the harm with retracted research? There's reputation damage. There's scientific dissonance, professional disgrace, and of course the feelings of failure.
RANDY TOWNSEND: In today's world, we are quick to build connections and establish relationships, whether we're talking about a conversation on Twitter or linking data to an article, to an author, to their ORCIDs. New researchers build on previous research to further our understanding. But when one link in that chain is compromised, whether it's because an analysis was incorrect or an instrument was not properly calibrated, or maybe somebody just fudged the numbers to produce results to sell a product, we move further and further away from the truth.
RANDY TOWNSEND: At AGU, we're paying more attention to boundaries. By this, I mean that territories that are the cause of disputes between two or more countries are being captured in our maps in a way that could lend support to one of the countries' claims to that region. We have clear guidance that we follow the UN guidelines for the naming of territories, but we continue to see inappropriate names in our submissions.
RANDY TOWNSEND: So this figure comes from a paper containing China's unsubstantiated nine-dash line claim of territory over the South China Sea. And I'm not sure if you can see it, but in the red section there is this gray dashed line that kind of moves along the South China Sea. According to this map, all the islands that are in dispute between China and other countries in this region, such as Brunei, Indonesia, Malaysia, Singapore, Thailand, and Vietnam, belong to China.
RANDY TOWNSEND: The text in the middle, between the two images I have here, is from a co-author of this paper of Vietnamese descent. The nine-dash line was not in the original submission but was added sometime during peer review, after the co-authors had already reviewed the submission. When it was brought to our attention, the authors were offered the option of correcting the article by replacing the image or retracting the entire article.
RANDY TOWNSEND: In my discussions with the authors, there was clearly pressure from the Chinese scientific community to retain that nine-dash line. If the authors did not come to an agreement, AGU would have retracted the article so that we could remain unbiased in the political consideration. Ultimately, in this case, the paper was corrected, but there was clearly an impasse among the authors. This article currently has an Altmetric score of 145.
RANDY TOWNSEND: It was mentioned in several blogs and has been cited five times since it was published. And even to this day, it's still being tweeted. Other retracted manuscripts that we have at AGU continue to be cited after they've been retracted. They receive attention, and some of that attention my colleagues will go into in more depth. But I really want to give you a picture of what the harm of retracted research is to an author who feels passionately about their loyalties, about who they represent as an author and what they bring to the table, and, if something were retracted, what that could mean for their professional standing and potentially their career.
RANDY TOWNSEND: So I will thank you now. I'm looking forward to the conversation, and I'll turn this over to my colleagues. Thank you.
CAITLIN BAKKER: Hello, everyone. Thanks very much for joining us here. My name is Caitlin Bakker, and I'm the research services librarian at the University of Minnesota Health Sciences Libraries. And so I'm approaching this question really from an information-seeking perspective. And from my perspective, the primary harm associated with retracted research is the potential for use of that research without knowledge of its retracted status or the reason for the retraction.
CAITLIN BAKKER: One question that I've personally been asked about retractions is, aren't retractions enough? A paper presumably has some sort of flaw or underlying issue, and therefore it gets retracted. And shouldn't that subsequently correct the scholarly record and ensure that researchers and students and practitioners are able to use and are using the best available evidence? And I might not be surprising anyone here when I say that unfortunately, no.
CAITLIN BAKKER: Our current processes, practices, and systems do not fully correct that scholarly record. And one way we know that the scholarly record isn't being fully corrected is that retracted publications continue to be used. Now, just as some brief background, one of the cornerstones of evidence-based medicine is the idea that one would use the best available evidence when making health care decisions.
CAITLIN BAKKER: And generally speaking, the best available evidence is considered to be a systematic review, which is a research method that identifies, critically appraises, and synthesizes all available evidence on a particular topic with the goal of definitively answering a question like, what is the best treatment for this particular condition? So these are incredibly useful and powerful in medicine. To put it in perspective, the average primary care doctor in the United States sees about 20 patients a day and, generally speaking, generates about one question per patient where they need to find some sort of research or information to address that question.
CAITLIN BAKKER: Now, overall, doctors generally spend less than three minutes seeking out that information. And it's not possible to find all of the original research on a particular question in three minutes. But it is possible to find a systematic review. And so these are really many providers' go-to resources, and they are taught and encouraged to use these resources in the vast majority of medical schools and programs worldwide.
CAITLIN BAKKER: And I'm currently working on a project with my colleagues Sarah Jane Brown and Nicole Theis-Mahon where we're looking at retracted publications in systematic reviews, particularly in the pharmaceutical literature. And we found that in a sample of about 1,400 retracted papers, 283 of those retracted papers were cited over a thousand times in systematic reviews. And over a third of those citations were occurring after the paper had been officially retracted and after the retraction notice had been published.
CAITLIN BAKKER: So we're teaching health care providers to rely on this form of evidence when making decisions, but we're also struggling to account for retracted materials within that methodology. Now, more important than the fact that materials continue to be used, I would venture to say that the major problem is that they continue to be used in inappropriate ways. When we ask what classes of retracted publications should be considered citable, I would say all of them.
CAITLIN BAKKER: Because to cite a paper isn't to agree with it or to endorse its findings. In order to refute something, you cite the paper. So when I say inappropriate ways of use, what I'm referring to is the context or the nature of the citation. Are the individuals citing the paper doing so because they're using that work as a basis for their own arguments or their methods? Are they supporting that work?
CAITLIN BAKKER: Or are they pointing out the flaws or the inaccuracies within it? So it's really the sentiment of the citation that's key. And we have some evidence that citations to retracted publications may be more supportive or positive than not. Looking at the sentiment of citations to retracted publications in the field of dentistry, my colleagues and I found that in a sample of 685 publications that cited retracted articles, the vast majority of those citations are positive, with only 5.4% of the citations acknowledging the retracted status of the paper or refuting its findings.
CAITLIN BAKKER: And so to me, the previous two studies really highlight the potential harm of retracted research in that people are using retracted work presumably without knowledge of its retracted status. And they're making decisions about research and practice based upon that information without the necessary context. So we have a sense that these retracted materials are being used, and we might question why it's happening and what can be done.
CAITLIN BAKKER: Now, while I can't make conclusive statements as to why this is happening, I do want to draw attention to the inconsistency with which retracted publications are being displayed across different platforms and journals. In a study looking at the representation of retracted publications in mental health, my colleague Amy Riegelman and I looked at 144 retracted articles across seven different platforms.
CAITLIN BAKKER: And we found that 40% of the articles, or the records, rather, for the articles that we reviewed across these different platforms did not indicate that the paper had been retracted. And out of the 144 articles, there were actually only 10 that were noted as being retracted across all platforms. About 83% of the platforms did include a retraction notice, but the majority of those retraction notices were not actually attached to the original article, or they weren't indexed with subject headings.
CAITLIN BAKKER: So the likelihood that one would uncover them was pretty minimal. And then within platforms, the retracted status of a paper could vary significantly. Depending on where you were looking, you might see that something had been retracted anywhere from 4.5% of the time to about 91% of the time. And that inconsistency was present even when the records and the different databases were seemingly linked.
CAITLIN BAKKER: So, for example, a Scopus record included a PMID and was linked to PubMed, but one record would note that the article was retracted and the other would not. And this would indicate, at least to me, that there are some data transfer and data display issues. So as an end user, my mode of access determines the context and the robustness of information and in turn, then, how I might choose to incorporate that information into my knowledge base and into my decision making.
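As one concrete illustration of how a workflow might programmatically cross-check retraction status against a single source, here is a minimal Python sketch using NCBI's E-utilities summary endpoint; the exact JSON layout should be verified against NCBI's documentation, and because this queries PubMed only, it would not surface the cross-platform inconsistencies described above.

```python
import requests

ESUMMARY = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esummary.fcgi"

def is_flagged_retracted_in_pubmed(pmid: str) -> bool:
    """Check whether PubMed lists 'Retracted Publication' among the record's publication types."""
    resp = requests.get(
        ESUMMARY,
        params={"db": "pubmed", "id": pmid, "retmode": "json"},
        timeout=30,
    )
    resp.raise_for_status()
    record = resp.json()["result"][pmid]
    return "Retracted Publication" in record.get("pubtype", [])

# Example: the retracted 1998 Wakefield article in The Lancet (PMID 9500320).
print(is_flagged_retracted_in_pubmed("9500320"))
```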
CAITLIN BAKKER: So what interventions are available to help address these issues? From a librarian's perspective, I see roles in the production of information. For example, if you're a librarian involved in systematic reviews, how can you adjust your workflows to identify when articles have been retracted?
CAITLIN BAKKER: But also in advocacy and education. I spend a lot of time focusing on teaching people how to effectively find and use information. And there's a role there in communicating the user experience to vendors and to publishers, to provide perspective on if and how retractions are being accounted for in the information seeking and discovery experience. We also need to be teaching students, researchers, and health care providers how they can account for these challenges in their information seeking.
CAITLIN BAKKER: But beyond interventions that can allow retracted status to be clearly and consistently displayed, which, of course, is very important, and teaching users how to identify when articles have been retracted, there's also the bigger question of how to equip end users with the skills to incorporate this knowledge into their critical appraisal processes. So how do they move beyond asking, how do I know if it's retracted, or is it retracted, to instead asking, if all or part of the best available evidence is retracted, what do I do?
CAITLIN BAKKER: So on that note, I'm just going to thank you very much for your time and attention, and I am going to turn things over to my colleague, Hannah.
HANNAH HECKNER: Thanks so much, Caitlin. That was like a perfect tee-up. So hi, everyone. I am Hannah Heckner from Silverchair. I'll share my screen here. And I am representing the platform provider part of this cross-industry discussion. So in approaching the workshops that Jodi and her colleagues put on, I thought a lot about where platforms fit into this. My previous experience has been on the publishing end, so this was a really great experience for me to really distill down what the job of the platform is.
HANNAH HECKNER: And at its base, the platform is a container that should be built to suit its contents. The platform should be open when necessary and closed when required. You want to disseminate information with clarity and thought to the user. And when I think of users, I think of not only the reader, but also those downstream delivery locations where the content on the platform is going to go.
HANNAH HECKNER: Aside from the research on our platforms, we also want to allow publishers to communicate their brand, and allow them to cultivate loyalty and trust with their users. So in thinking about that and how the platform functions, it was really helpful to use that as a grounding space to think about the questions that were brought forth as part of this workshop. My colleagues preceding me did a really great job talking about the harm of retracted research and offering some recommendations for the intervention points.
HANNAH HECKNER: So I'm just going to go to these three questions that I really wanted to speak to today, and I look forward to speaking to you all live later on. So, on the intervention points for stopping the spread of retracted research: as Caitlin pointed out, there is a lot of inconsistency when you look at various publisher sites as to how they communicate the retracted status of an article. There are a lot of opportunities to provide watermarking on the actual front end, be it on the article PDF in addition to the actual article page.
HANNAH HECKNER: I think that this is something that is adopted by many, but certainly there's some room for improvement there. Past that front end, though, I think that there is a lot of opportunity in increasing the metadata vocabulary around retracted research. When you think of downstream delivery, it isn't just the sending of an article to an abstracting and indexing service, but instead the opening up of your platform to be crawled by abstracting and indexing services like a Clarivate, like a PubMed, like a Google Scholar. Perhaps there are opportunities there to add new tags to articles to communicate those retractions, maybe even a sub-vocabulary where you can describe the types of retraction.
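Purely as a sketch of what such a machine-readable sub-vocabulary might look like on the platform side, the Python below defines an illustrative set of retraction statuses attached to an article record; the status names, fields, and DOIs are invented for the example and are not an existing standard.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional

class RetractionStatus(Enum):
    """Illustrative sub-vocabulary; a real taxonomy would need community agreement."""
    RETRACTED_HONEST_ERROR = "retracted-honest-error"
    RETRACTED_MISCONDUCT = "retracted-misconduct"
    RETRACTED_AUTHOR_REQUEST = "retracted-author-request"
    EXPRESSION_OF_CONCERN = "expression-of-concern"
    CORRECTED = "corrected"

@dataclass
class ArticleRecord:
    doi: str
    title: str
    retraction_status: Optional[RetractionStatus] = None
    retraction_notice_doi: Optional[str] = None  # links the notice back to the original article

# A record tagged so that downstream indexers could pick up both the status and the notice.
record = ArticleRecord(
    doi="10.1000/example.123",              # hypothetical DOI
    title="An example article",
    retraction_status=RetractionStatus.RETRACTED_HONEST_ERROR,
    retraction_notice_doi="10.1000/example.456",
)
print(record.retraction_status.value)
```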
HANNAH HECKNER: I think a richer retraction vocabulary like that would be a really interesting thing to think about. I also think that there are a lot of opportunities, when we think of how research is communicated on platforms, to make the various article artifacts more open on platforms. I think that overall, a move towards more posting of open data, posting of article versions, even posting more information about the life cycle of an article, could be helpful just to increase the transparency around research.
HANNAH HECKNER: Maybe that would shine a light on retractions before they became a larger problem. Or maybe it would just create more trust with the reader and show that life cycle of the research: this is what the article looked like when it was submitted, and, following along with Randy's example where you could see those figures that were used, this is how it changed throughout. I also think, when we think about impediments to dissemination, we need to make sure that we are collaborating: that platforms are working with abstracting and indexing services, with libraries, and with publishers to make sure that, yes, we are creating an environment that maintains safety around content and keeps up those paywalls when they're desired, but that also allows for the reading and dissemination of research in all of its life cycle forms, to really communicate all of those different stages and shine a light on all of those life cycles.
HANNAH HECKNER: Really showing the entire ecosystem of scholarly communication will, I think, help platforms be an important part of this conversation as we think about retracted research. So thank you all for your time today. I look forward to speaking more in the discussion. Feel free to reach out with any questions. And at this point, I will turn it back over to Jodi to talk about the draft recommendations that have come from our workshops.
JODI SCHNEIDER: Thanks, Hannah. So from RISRS2020, we're currently putting together a white paper and circulating it, and we're very happy to hear from you if you're interested in providing feedback. Right now, we have these five top-level recommendations. First, make retraction information easy to find and use. Second, recommend retraction metadata and a taxonomy of retraction statuses that can be adopted by stakeholders.
JODI SCHNEIDER: Third, to develop best practices for coordinating the retraction process. Fourth, to educate and socialize researchers and the public about retraction and post-publication stewardship of the scientific record. And finally, to develop standard software and databases to support sustainable data quality. I'd be very happy to hear from you. jodi@illinois.edu is a good way to get a hold of me.
JODI SCHNEIDER: And you can also search for RISRS2020, and you'll find more about our project. Thanks.
HANNAH HECKNER: Back to you, Carolina.
SPEAKER: Thank you, everyone. And now for our final bit, we have Michele Avissar-Whiting.
MICHELE AVISSAR-WHITING: Thank you, Carolina. So I'm Michele Avissar-Whiting. I'm the editor in chief of the Research Square preprint platform. I'd like to open my short talk with a short quote from the author Marchette Chute, who said, "Nothing can so quickly blur and distort the facts as desire, the wish to use the facts for some purpose of your own. And nothing can so surely destroy the truth." So I've been reminded of this more often than I ever thought I would in my day-to-day life at work in the past 10 months or so.
MICHELE AVISSAR-WHITING: And it's been fascinating to see how people use information and to think about the challenge that these behaviors pose to researchers, to journalists, and to the platforms like ours that have been hosting these early outputs, the ones that have been so often the source of the information, which is preprints. So most preprint servers or platforms are not anything goes platforms. They do filter out submissions that are clearly pseudoscientific, ethically dubious, potentially dangerous.
MICHELE AVISSAR-WHITING: But they don't routinely block the posting of papers based on methodological flaws, poor or opaque reporting, or specious conclusions. So preprint servers are already not totally passive hosts for research. And the last few months have taught us that we may be able to play a more active role in ensuring that people, at minimum, don't come away with totally misguided ideas about what a study means.
MICHELE AVISSAR-WHITING: So I'm going to frame this exploration by using real-world examples from our own platform. So here's a tale of three preprints. The first one I'm going to call the misunderstood. It came to us in June, at a time when other preprint servers had also posted similar work about T cell immunity to SARS-CoV-2, which is derived from our past exposures to coronaviruses that cause the common cold. Some people came away from this complex immunology story with the impression that this must mean that most of us are already immune to the virus, proof that this is all just a hoax.
MICHELE AVISSAR-WHITING: And of course this false narrative was repeated again and again and again on Twitter using this preprint and others in its defense. This was the main driver behind the usage metrics that it accumulated. So the paper was ultimately published in Nature Immunology, where it, of course, continued to garner attention from conspiracy theorists. The next one I've called the overinterpreted.
MICHELE AVISSAR-WHITING: It's a preprint that reports the results of a targeted questionnaire about the effects of wearing masks on kids. So we know that's a contentious topic, but the question is still a valid one. The limitations-- and there were many-- were stated. A causal relationship was not asserted, and the conclusions were really not alarming at all.
MICHELE AVISSAR-WHITING: But none of this stopped people from using this preprint to push an anti-mask narrative. This one is more recent, and it hasn't been published yet, but the survey is ongoing. And the last one I'll call the convenient truth, which is a story about the negative correlation between vitamin D and COVID severity. This is an association now that's been found by a number of groups.
MICHELE AVISSAR-WHITING: And ultimately, this was a sound study, which was quickly published in a journal. And both the preprint and the journal article are used to support some very strange conspiracy theories about how the government is hiding this information and telling us to stay indoors to further hinder our vitamin D production. It was also perversely used to suggest that the huge disparities that we're seeing in COVID infection and outcomes between people of different races could be fully attributed to their differential levels of vitamin D, which is a totally unsubstantiated and highly tenuous claim, but one that conveniently, for some narratives, also may have some basis in reality.
MICHELE AVISSAR-WHITING: So, OK, what's the point of these stories? Well, there's a few take homes. One is just that the spotlight is shining on preprints right now because they've been the first ones up on the stage during this pandemic. So this has been a sort of trial by fire for those of us whose job it is to think about the role of preprints, how they're being received, and establishing policies around them, et cetera.
MICHELE AVISSAR-WHITING: So as a preprint server, it's felt increasingly like it's incumbent on us to not only screen out the really bad stuff and include disclaimers for everything else, but also to take other actions within our means to provide clues to the rigor of the study and also to help people make sense of it. And these are features that have the potential to add value above and beyond what a standard editorial or peer review process can offer.
MICHELE AVISSAR-WHITING: So I've used this Pentateuch framework here for scientific rigor, which was introduced by Casadevall and Fang. I don't have time to get into all the nuances of this, but the idea is that these five things are the components that should determine to what extent we trust a given study. And we can start to evaluate them at the preprint stage, before or alongside a standard peer review process.
MICHELE AVISSAR-WHITING: So these are low-lift things, like incorporation of smart citation metrics. I show the scite badge here as the best example that I know of out there right now presenting context-specific citations that speak to the reproducibility of the study. And it's also things like automated assessments of methodology and open data reporting, such as those performed by SciScore and Ripeta, respectively.
MICHELE AVISSAR-WHITING: And here's what that looks like on our platform, on Research Square. And then higher-lift things, like thorough human-driven reporting assessments that result in public-facing badges like these, and editorial or lay summaries such as those that we've now written for all three of the papers that I talked about earlier and many others that people were misusing or confused about.
MICHELE AVISSAR-WHITING: So this is just an effort to explain the study in an accessible way and bring its limitations to the fore. So obviously there's so much more to say about this topic, but I'll leave you with this. Everyone here will probably remember the so-called "uncanny similarity" study that was posted and then quickly roasted by the scientific community and withdrawn from bioRxiv very early in the pandemic.
MICHELE AVISSAR-WHITING: Nobody is walking around now with the impression that SARS-CoV-2 was engineered from HIV. But-- and this is appropriate because it's been mentioned a couple of times now in this panel-- we are still living with the terrible consequences of the Wakefield MMR autism study, which was published in The Lancet over 20 years ago and retracted just over 10 years ago. And so, from the public's perspective, at least, 'tis better to have preprinted and been quickly discredited than to have been published in a well-respected journal and sowed decades of fear and concern around an important public health effort.
MICHELE AVISSAR-WHITING: Preprints are not without their problems, but those problems are not intractable, and we have a great creative, resourceful community that will find solutions. So thank you for coming to my NISO talk.
SPEAKER: Thank you, Michele. And now thank you to everyone who stuck around and watched our presentations. We'll see you soon in the discussion room. [MUSIC PLAYING]