Name:
Quality and reliability of preprints
Description:
Quality and reliability of preprints
Thumbnail URL:
https://cadmoremediastorage.blob.core.windows.net/e02b2357-c71f-49b4-8206-3cec1ca35669/videoscrubberimages/Scrubber_1.jpg?sv=2019-02-02&sr=c&sig=3cvtOuzNxPZd2GXAI98PhBNZgj4m57NJZygh5P%2FkP9o%3D&st=2024-11-22T05%3A06%3A02Z&se=2024-11-22T09%3A11%3A02Z&sp=r
Duration:
T00H57M50S
Embed URL:
https://stream.cadmore.media/player/e02b2357-c71f-49b4-8206-3cec1ca35669
Content URL:
https://cadmoreoriginalmedia.blob.core.windows.net/e02b2357-c71f-49b4-8206-3cec1ca35669/4 - Quality and reliability of preprints-HD 1080p.mov?sv=2019-02-02&sr=c&sig=B9s%2BLKWvnJkcdJ5hT1OEBJtF8gHU09IoHXnjNi6JsNo%3D&st=2024-11-22T05%3A06%3A02Z&se=2024-11-22T07%3A11%3A02Z&sp=r
Upload Date:
2021-08-23T00:00:00.0000000
Transcript:
Language: EN.
Segment:0 .
JABIN WHITE: Hello, everyone, and welcome to our session on quality and reliability of preprints. My name is Jabin White, and I am Vice President of Content Management for ITHAKA, as well as Treasurer of the NISO board. It's in that latter capacity that I am especially grateful to be a part of this conference and to be moderating this panel. I hope you'll agree over the next several minutes that we have three wonderful speakers lined up, coming at this issue from a variety of different perspectives.
JABIN WHITE: And it's just been terrific getting to know them and hearing their presentations. So the last few years have seen a pretty big increase in interest among folks over preprints-- mostly in good ways, but also with some concerns. And the COVID-19 pandemic has only increased that interest. And as a global community, we are wrestling with issues around access and visibility versus quality and reliability, and our three speakers today come at these issues from, like I said, very diverse geographical and institutional perspectives.
JABIN WHITE: And they'll guide us through some of these issues, and then hopefully, we can have a lively Q&A session afterwards. So write down your questions, store them up, and we've got about a half hour afterwards for a Q&A session. So I'm going to go ahead and introduce all three speakers at the top. So first up is Kathryn Funk, who is a technical information specialist at the National Library of Medicine.
JABIN WHITE: Katie is, like I said, looking at this issue from an institutional lens. She'll talk about-- the NLM has had a preprint pilot going that started in June of last year, and she and her colleagues have learned a lot and she'll talk about what they've learned and what they plan to do next. After Katie is Joy Owango from AfricaArXiv and also from the Training Center in Communications.
JABIN WHITE: She's going to talk about her work in preprint at AfricaArXiv. And she cites an amazing statistic that says that most people assume that the research output of Africa is pretty small. She makes a pretty compelling argument that it's not that it's a small contribution. It's that it's not discoverable for a variety of reasons, which she will discuss.
JABIN WHITE: And then our final speaker is Abel Packer, co-founder and director of SciELO-- the Scientific Electronic Library Online from Brazil. Abel will talk about, as I mentioned, the explosive growth of preprints in 2020 and how SciELO has handled that while maintaining quality control, and some other issues around that. So three excellent speakers, diverse viewpoints, experience and expertise off the charts in this subject.
JABIN WHITE: So I really hope you enjoy these presentations. Save up your questions, and we'll see you afterwards for the Q&A session. With that, let me turn it over to Katie, who will start us off.
KATHRYN FUNK: OK, thank you for the invitation to participate today. I, as Jabin said, am Katie Funk from the National Library of Medicine, and I'll be representing an institutional perspective on the panel today around preprints, their potential value, our current activities around them, and how we're starting to probe questions of transparency and quality through the lens of one of our current activities, which is the NIH preprint pilot.
KATHRYN FUNK: So preprints have been an area of interest at NIH for a few years now. This excerpt you see here is from an editorial published in The Lancet in 2015 and co-authored by Dr. Lauer, who is the director of the Office of Extramural Research. And it really gets to this idea of advocating for building a preprint culture in clinical research and health research, saying that other scientific fields have come into the internet era with timely global conversations and rapidly evolving scientific discoveries being stimulated.
KATHRYN FUNK: And how can we get there within the realm of NIH? This editorial was followed by efforts in 2016 to learn more about what the community thought about preprints and other interim research products. A request for information went out that was seeking input on these various questions, and we received over 300 responses.
KATHRYN FUNK: Almost all of them were supportive of this sort of sharing and early sharing of research results. And they saw the benefits really as the obvious ones that we often talk about around preprints-- speeding dissemination of the research, increasing the rigor and collaboration through more openness and public commenting, and avoiding publication bias. The following year, 2017, we built on that response and released guidance that encouraged NIH investigators to use preprints in order to speed the dissemination and enhance the rigor of their work.
KATHRYN FUNK: It clarified some reporting instructions as well, so we put out there that investigators can put preprints on their progress reports as products of awards. Preprints can be included in grant applications, and NIH just gave some basic guidance, not only to make the preprint publicly accessible, but to make it available under a very open license-- CC BY or CC0, ideally.
KATHRYN FUNK: They want to make sure authors are following best practices with preprints, so acknowledging funding, declaring competing interests. And they also included some guidance on selecting repositories that has served as a framework for some of the work we're doing now. In encouraging this open sharing and liberal license for use terms, as well as best practices, NIH is really looking to maximize the impact of the research that it funds to reach the most number of people and to generate new science.
KATHRYN FUNK: When we started discussing the role that NLM might play in all of this, especially around improving discovery of preprints-- we see that if you're going to maximize the impact, it's not enough for them to be open. But they must also be discoverable. So we had these conversations internally beginning in 2019 framing what we thought we wanted to do with preprints.
KATHRYN FUNK: We knew that we wanted to focus our efforts on those that clearly acknowledge NIH support. We also wanted to use our role as a library to encourage best practices among preprint servers, consistent with that guidance from NIH. We knew that in order to get a sense for how we can enable discovery, that preprints would need to be available in both PubMed with the metadata and citation, searchable, as well as in PMC with the full text searchable whenever possible.
KATHRYN FUNK: And we knew that we didn't want to start with this long-term commitment forever. We wanted to do something time-bound. This is a very rapidly evolving area, particularly in the last year or so. And so we really wanted to get a sense for things before we made any long-term commitments. What we ended up launching on June 9th of last year was the phase one of the NIH preprint pilot, and that phase is focused on increasing the discoverability of COVID-19 related preprints that have NIH support.
KATHRYN FUNK: And we have been using a tool created by our colleagues in the Office of Portfolio Analysis called the iCite COVID-19 Portfolio at NIH that is working with a number of preprint servers and we take what they hold and we further curate it down to identify those papers that are NIH-funded. At this point, the preprint servers that are included there, and hence in the pilot, are medRxiv, bioRxiv, arXiv, Research Square, SSRN, and ChemRxiv.
KATHRYN FUNK: So with this sort of revised, narrowly-scoped phase one, we felt there was an opportunity to take a measured approach to engaging with preprints, one that responded not only to NIH goals but also to the information needs that we were hearing from researchers and the librarians in our community, starting at the beginning of last year in response to the public health emergency. Before I get into the implementation of the pilot and where things stand now that we're seven or eight months in, I wanted to cover briefly what the pilot is not.
KATHRYN FUNK: I find that it helps clarify things often. So what we're doing here is not creating an NIH-hosted preprint server. We are working with existing preprint servers. We're also not seeking to create a comprehensive preprint discovery resource. We are just looking for those NIH-funded preprints at this time. And this is also not a new requirement for NIH investigators.
KATHRYN FUNK: It's something NIH encourages, but you are not required if you take NIH funding to post a preprint. So in all of this, what is NLM trying to learn by engaging with preprints? The first is just to test the viability of making this type of content available in PMC. We also want to understand the impact that it has on discoverability and dissemination of NIH research results to do so.
KATHRYN FUNK: But in all of that, we want to make sure that how we implement the pilot ensures the ongoing trust of those who use our resources. So what's at the heart of every decision that we've been making around the pilot is, how can we engage with preprints and retain community trust? We set up a number of workflows to make this all work. We wanted those workflows to have no impact on the author.
KATHRYN FUNK: We hear a lot about author burden and the expectations that come with taking an NIH grant, so it was very critical to us that this not require any effort by the authors. So what we've been doing is using that COVID-19 portfolio tool I mentioned earlier to identify records. We text mine them. We manually review the results. We ingest what we identify as NIH-funded, so that's the curation piece.
KATHRYN FUNK: And as soon as we've identified them, we do load them to PMC and PubMed to make sure that even if it's just a stub metadata title, some authors, maybe an abstract-- what you get for preprints is variable-- that at least it's there for discovery as soon as we can, especially during this public health emergency. If a paper is available under a Creative Commons license, we also send it out for XML conversion so that we can make that full text discoverable and enriched as well.
KATHRYN FUNK: We will also continue to check for updates. These are automated checks that run multiple times a week. And when I say "updates," that means both the new versions of the preprints as well as any journal article version of that preprint that may come along later. And we've been continuing to monitor these processes to see if we could scale them up beyond COVID research, but that's a question for another day.
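The workflow described above (text mining for NIH funding, manual review, loading to PubMed and PMC, and XML conversion only for Creative Commons-licensed papers) can be sketched roughly as follows. This is a minimal illustration, not NLM's actual code: the `Preprint` class, the `NIH_PATTERN` expression, the `route` function, and the step labels are all hypothetical, and real funding statements are far more varied than this pattern assumes, which is presumably why a manual review step exists.

```python
import re
from dataclasses import dataclass
from typing import List

# Hypothetical pattern for spotting an NIH funding acknowledgment.
NIH_PATTERN = re.compile(r"\b(NIH|National Institutes of Health)\b")


@dataclass
class Preprint:
    title: str
    funding_text: str
    license: str  # e.g. "CC BY", "CC0", "CC BY-NC-ND", or "" if none stated


def mentions_nih(preprint: Preprint) -> bool:
    """Text-mining step: flag candidates that mention NIH support."""
    return bool(NIH_PATTERN.search(preprint.funding_text))


def route(preprint: Preprint, reviewer_confirms: bool) -> List[str]:
    """Return the (hypothetical) processing steps for one candidate record."""
    # Both the automated flag and the manual review must agree.
    if not (mentions_nih(preprint) and reviewer_confirms):
        return []
    # Every confirmed record is loaded for discovery, even if only as a stub.
    steps = ["load citation to PubMed", "load record to PMC"]
    # Full-text XML conversion only happens under a Creative Commons license.
    if preprint.license.upper().startswith("CC"):
        steps.append("send for XML conversion")
    return steps
```

For example, `route(Preprint("Example", "Supported by the NIH.", "CC BY"), True)` returns all three steps, while the same record with an empty license string would be loaded for discovery but not sent for conversion.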
KATHRYN FUNK: We have felt that transparency is vital in how we implement this pilot, going back to that idea of trust. We make sure that we have clear linking to information about the pilot and also labeling that this is a preprint, that this has not been peer reviewed. We even run a watermark on the side that labels it as a preprint.
KATHRYN FUNK: So we're trying to make that clear, and for those who aren't aware of what kind of information a preprint may contain, we're trying to get them some background information. We also will post links to earlier versions that we hold. We do put everything under the same PMC ID and the same PubMed ID, though. So if there are seven versions of a preprint, you'll only find one record in our databases.
KATHRYN FUNK: But you can look at the earlier versions. And then as you'll see in the yellow box here, we will link to the journal article as soon as we've identified it. We run about four different types of checks of different systems to make sure that we are doing as much due diligence as we can to get people to the most current version and to the peer-reviewed version, if available.
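The "one record, many versions" model described above can be pictured as a small data structure. The sketch below is only an illustration under assumed names (`PreprintRecord`, `add_version`, `current`, `earlier_versions`); it does not reflect how PMC actually stores records.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class PreprintRecord:
    """One database record that holds every version of a preprint."""
    record_id: str                          # placeholder for the shared ID
    versions: List[str] = field(default_factory=list)
    journal_article: Optional[str] = None   # set once a published match is found

    def add_version(self, label: str) -> None:
        # New versions are appended to the same record, never split off.
        self.versions.append(label)

    def earlier_versions(self) -> List[str]:
        # Everything except the latest preprint version stays linked.
        return self.versions[:-1]

    def current(self) -> Optional[str]:
        # Point readers to the peer-reviewed article when one is known,
        # otherwise to the most recent preprint version.
        if self.journal_article is not None:
            return self.journal_article
        return self.versions[-1] if self.versions else None
```

With seven versions added, there is still a single `PreprintRecord`; `earlier_versions()` lists the first six, and `current()` switches to the journal article as soon as one is linked.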
KATHRYN FUNK: So with that, where are we now? These numbers are from mid-January I would say. At that point, we had about 1700 preprints that we had added to PMC. The vast majority of those, the blue and green that you see here, are bioRxiv and medRxiv. So bioRxiv is the green, medRxiv is the blue. And I think that's somewhat consistent with what you see overall in posting.
KATHRYN FUNK: The smaller numbers for Research Square and ChemRxiv and arXiv aren't necessarily a reflection of the number of COVID preprints they are posting. It's just that there's a lower number of NIH-funded preprints that we're finding in those databases. We've also been keeping an eye on how many of these articles are being made available under Creative Commons licenses. How many are being made available under the NIH-recommended CC BY or CC0 license?
KATHRYN FUNK: At this point, it looks like about 2/3 will have some sort of Creative Commons license and full text, which is great because we're seeing a definite impact of the availability of full text on discovery. But only 18% have an NIH-recommended license, so that's an area of education that I think we'll need to pursue.
KATHRYN FUNK: We've seen through the end of January about one million page views-- just over one million page views-- of preprint records in PMC. It's about 2/3 of what we're seeing in PubMed as well. And really, the interest for us is, A, as we've discussed, enabling the discoverability but also in understanding the comparison between engagement with preprints and engagement with the journal literature.
KATHRYN FUNK: We are seeing COVID journal literature viewed slightly more frequently. I'd say it's about 16% more views than the COVID NIH preprints. But it's an interesting comparison, and it's one that we'll keep an eye on to see if it would hold true if we were to expand beyond COVID. Around 35% of the preprints that we're making available have been matched to a journal article.
KATHRYN FUNK: By the end of the second quarter of the pilot, we found that just over half of the preprints in PMC that were posted publicly in the first half of the year-- so around 54%-- had been published. And by that I mean from January to June, those preprints, over half of them are now in a journal. As the year goes on, unsurprisingly, we see those numbers drop off.
KATHRYN FUNK: They're lower than we anticipated prior to launching the pilot. BioRxiv had mentioned to us that they generally see about 75% of their preprints published within six to seven months. But I do think this is slightly higher than a lot of the COVID preprint publication rates. So again, controlling for the COVID factor and understanding what we're seeing is still an ongoing effort on our part.
KATHRYN FUNK: In trying to administer this pilot in a way that really retains trust, we've also been doing some high-level monitoring of what I'm calling the "open science indicators" that appear in the articles. And these are just things around open data sharing and transparency that we're looking for, things that, I'd say, at least can be useful in reproducibility. They may not necessarily be full-on indicators of quality, but we think they give readers a better opportunity to assess the quality themselves.
KATHRYN FUNK: To date, 75% of the preprints that we've received do have supplementary materials of some sort. 20% have a data availability statement. 23% have GitHub links, which is astronomical compared to the published literature we're seeing. So that's a very strange and interesting trend that we're curious about. 57% have COI statements, which is great.
KATHRYN FUNK: We'd always love to see COIs higher, but that's been great. And then 53% have author contributions, which I think has to do with preprint server practices themselves. We're also trying to do what we can to be an influencer in the field, for lack of a better term, but we really want to make sure that by engaging with preprints, we're also just helping to establish best practices.
KATHRYN FUNK: That helps us in return as we determine our engagement with these in the future. And so we're very dedicated to working with folks on transparency. To be eligible for the preprint pilot, a server has to have a peer review status on their site for preprints. They need to have a very clear screening process. They need to link to the version of-- or have a clear version record, "record of versions" as I've heard people call it.
KATHRYN FUNK: And to have ethics policies-- what do you do if there's withdrawals? What do you do if there's removals-- that sort of thing. We also want to make sure they have an NIH-recommended licensing option, that their metadata is machine readable and reusable and easy to grab, and that they have a preservation strategy. At this point, I think for the duration of the pilot, we will only be working with preprint servers that have a high volume of NIH-funded preprints involved, and that's really just because of the scope of what we're trying to do.
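The eligibility conditions just listed read naturally as a checklist. The sketch below encodes them that way; the key names and the `unmet_criteria` and `is_eligible` helpers are invented for illustration and are not an NLM schema. The separate scoping condition mentioned in the talk (a high volume of NIH-funded preprints) is deliberately left out because no threshold is given.

```python
from typing import Dict, List

# Hypothetical labels for the criteria named in the talk.
REQUIRED_CRITERIA = [
    "peer_review_status_displayed",
    "clear_screening_process",
    "version_record",
    "ethics_policies",            # e.g. handling withdrawals and removals
    "nih_recommended_license",    # CC BY or CC0 offered as an option
    "machine_readable_metadata",
    "preservation_strategy",
]


def unmet_criteria(server: Dict[str, bool]) -> List[str]:
    """List every criterion the server does not (or is not known to) meet."""
    return [name for name in REQUIRED_CRITERIA if not server.get(name, False)]


def is_eligible(server: Dict[str, bool]) -> bool:
    """A server qualifies only when no criterion is left unmet."""
    return not unmet_criteria(server)
```

Returning the list of unmet criteria, rather than a bare yes or no, makes it easier to tell a server what it would need to change.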
KATHRYN FUNK: Finally, we've been trying to engage with different stakeholder groups. We see this as another opportunity to not only engage with the quality and transparency of the preprints up front but how they're interpreted downstream. And so we have a lot of general communications that we've put out. We've reached out to the investigators to remind them of NIH expectations and try to simplify reporting preprints via their bibliographies.
KATHRYN FUNK: But we've also engaged with those information intermediaries, like the journalists and the librarians who are working in different communities to convey information. And we very much want to make sure they have the tools they need to communicate around preprints and what sort of research this is and how they talk about it. So that's sort of where we've been spending some of our energy while we operate this pilot, just to make sure we're doing our due diligence across the board in a very new area for us.
KATHRYN FUNK: And since this is a very new area, I don't really feel like I have a lot of lessons learned. I have a lot of things we are still learning, but we welcome feedback at any point, either today in the discussion or via the avenues you see here. We have a preprints email alias set up that just goes to a number of us. You can also always contact me directly.
KATHRYN FUNK: So thank you again for including me today, and I look forward to chatting.
JOY OWANGO: Hello, everybody. My name is Joy Owango. I'm the Executive Director of the Training Center in Communication, and I'm a board member of AfricaArXiv. And we are the founding project partner of AfricaArXiv, which is Africa's first and only preprint repository. Now, what I'm going to take you through is how persistent identifiers in preprints are facilitating ownership of African scholarly content.
JOY OWANGO: Now, there is a claim that 0.1% of scholarly content is said to be from Africa. And yet, this continent contributes to 13.5% of the global population. In reality, this is not true. Instead, it's because our work is not discoverable.
JOY OWANGO: Our work is not visible. So there are so many factors that have led to our content-- African research content-- not being visible. So one of the issues we've noted that has led to this low visibility is regional bias in Western journals, especially by editorial teams-- they are more inclined to accept research or publication outputs from anglophone countries.
JOY OWANGO: So countries that do not particularly use English as their national language tend to fall through the cracks. So if you're looking at this continent, which has about 1.3 billion people with about 2000 languages-- in some countries where the national language is the indigenous language, like in the case of Ethiopia with Amharic, with Nigeria being Igbo or Rwanda being Kinyarwanda, you find that research that is written in those languages falls through the cracks.
JOY OWANGO: So African scholars are more inclined to list Western partner institutions instead of their African home institute, which is a problem we've noted. Because even when they are collaborating with researchers, the first ones they will actually give acknowledgment to are their Western partners, and rarely do they acknowledge the African institutes where they're actually housed. So one of the things we are doing as a center-- the Training Center in Communication-- is building that capacity for researchers, to make them understand that they need to list both institutions so that they can also help in increasing the visibility of their output.
JOY OWANGO: A large proportion of present and historical African scholarly output is still in print, which is something we cannot deny. And with the digitization of African output, chances are we are going to see an increase in the output that has come out of this continent. So the overreliance on looking at only online output makes the picture skewed, in the sense that we have outputs going back to the 1950s or to the 1960s, but unfortunately, they're in print format.
JOY OWANGO: So there are bottlenecks in infrastructure, network challenges, and internet connectivity. So to be honest, when you look at the reality of the internet infrastructural systems in the continent, of course this is going to impede the support systems that can help in increasing our visibility. So when you're looking at repositories or even making those repositories interoperable, you'll come to find out that not all the institutions have repositories that meet the FAIR principles, and not all institutions even have repositories or even know how to use the repositories.
JOY OWANGO: Even though there's a preprint server in the continent through AfricaArXiv, not everyone is aware of it. And they do not quite understand how some of these preprint repositories work. So a lot of capacity building needs to be done when you're looking at how we can increase the visibility of African research output, particularly when you're looking at the infrastructural networking capabilities that need to be worked on, such that African research output has substantial supporting systems to increase their visibility through good infrastructural systems that would support their visibility as well.
JOY OWANGO: So what is AfricaArXiv? This is a community-led platform for African scientists of any discipline to present their research findings and connect with other researchers. In essence, it is a platform for not only African researchers, but anyone who is doing research in Africa and doing research about Africa as well. So this has given us a fighting chance as African researchers to have a place where we can not only own data and have sovereignty over our data, but also increase the visibility of our output.
JOY OWANGO: And what we've done is, instead of reinventing the wheel, we've decided to partner with six established scholarly repositories. And these repositories further help in increasing the visibility of the output that is indexed within AfricaArXiv. So these repositories include PubPub, which is quite unique because it's through PubPub that we are able to accept audiovisual material, which you cannot get in most repositories.
JOY OWANGO: So you're able to index audiovisual material, including webinars. And especially for those who are doing social sciences, who do a lot of video and audio recording-- this can be indexed and put within AfricaArXiv in partnership with PubPub. Then there's ScienceOpen, the Open Science Framework, Figshare, which is our latest partner, Zenodo, and Qeios. Now, the one thing that excites me about ScienceOpen is that through our partnership with them we were able to start collating research output coming out of the continent on COVID-19, which I'll talk about later.
JOY OWANGO: Because we are not seeing a lot of that. You're seeing a lot of research on COVID-19 from all over the world, but you're not seeing the research coming out of Africa as well. So the partnership with ScienceOpen was able to give us this opportunity to showcase the research output that is coming out of Africa in regards to COVID-19.
JOY OWANGO: So these are the interfaces of all the partners that we are working with. Now, when you're looking at visibility of research output, the one thing that researchers need to take advantage of is persistent identifiers. And AfricaArXiv is in partnership with, and also in the process of partnering with, some of the leading organizations that are producing persistent identifiers for researchers but also research institutions.
JOY OWANGO: So we are in partnership with ORCID to help researchers obtain individual persistent identifiers. We are also working with ROR, which helps in creating persistent identifiers for institutions, and also with DOI for scholarly output as well. So when you're looking at how we are working with the institutions that provide persistent identifiers, we look at it in three ways.
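To make the idea of an identifier for an individual researcher concrete, here is a small sketch that checks whether a string is a well-formed ORCID iD. It relies on the check-digit scheme ORCID documents for its identifiers (ISO 7064 MOD 11-2); the function names are my own, and nothing here is described in the talk itself. A passing check only shows the iD is well formed, not that it is registered to anyone.

```python
import re


def orcid_check_digit(base_digits: str) -> str:
    """Compute the final character of an ORCID iD from its first 15 digits."""
    total = 0
    for ch in base_digits:
        total = (total + int(ch)) * 2
    result = (12 - total % 11) % 11
    # A result of 10 is written as the letter X.
    return "X" if result == 10 else str(result)


def is_valid_orcid(orcid: str) -> bool:
    """Return True if the string has the ORCID shape and a correct check digit."""
    compact = orcid.replace("-", "").upper()
    # 15 digits followed by a digit or X.
    if not re.fullmatch(r"[0-9]{15}[0-9X]", compact):
        return False
    return orcid_check_digit(compact[:15]) == compact[15]
```

For instance, `is_valid_orcid("0000-0002-1825-0097")`, the sample iD used in ORCID's own documentation, returns `True`, while changing its last digit makes the check fail.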
JOY OWANGO: Number one, individual identification, but then you're also looking at institutional identification and also scholarly output as well. Now, as I'd stated, one of the biggest challenges to the visibility of African research is language. Now, because English is still seen as the language of science, you don't see quite a bit of the research output in indigenous languages coming out of Africa.
JOY OWANGO: It's not even seen as a contributor to global knowledge. As I'd said earlier on, we are looking at a continent of 1.3 billion people with a diversity of over 2000 languages. And some of these languages are used in writing academic research. So what we're trying to do as AfricaArXiv is promote language diversity in science.
JOY OWANGO: Not only English, but also the local languages that are coming out of the continent that are being used in conducting research. And in a bid to increase language diversity, we also need to look at a common language to connect.
JOY OWANGO: So if you look at it from an African perspective, the African Union languages are a perfect example of local languages that you can use to connect. So these are the African Union languages, bearing in mind you're looking at about 2000 languages in the continent. This continent is unified by the following integration languages-- Arabic, which covers the northern region of the continent.
JOY OWANGO: English, which covers the bulk of the continent-- so anglophone Africa. And then there's French, which covers francophone Africa. Then there's Portuguese, which covers lusophone Africa; we do have Portuguese-speaking countries in this continent, for example Mozambique. And there's also Swahili, which covers the eastern African region.
JOY OWANGO: So these common languages help connect the continent, and there are papers that are actually written in these common languages. So what we're trying to do is also increase the importance of some of these languages-- I call them integration languages-- that are used in research, and thus increase the visibility of the output coming out of the continent. And even apart from these integration languages, there are countries in this continent that have singular languages that are not only the indigenous language but also the national language and the business language.
JOY OWANGO: So for example, Ethiopia with Amharic, Nigeria with Igbo, Rwanda with Kinyarwanda, South Africa even with Afrikaans. So you realize that there are outputs that are coming out in these languages and they are still not seen. So what we are proposing is a balanced approach: using not only English as the language of science but also including diversity by adding in local languages as much as possible, with the use of technological support to help translate some of these languages into English so that people can also have access to them.
JOY OWANGO: Now, how are we doing this? Because we are trying to promote local languages, we do accept submissions in African languages. And as I mentioned earlier, we are working on Swahili, Yoruba, Igbo, Afrikaans, Wolof, which is from Senegal, and Amharic, which is from Ethiopia. And we are doing it on two fronts. So when we accept these papers or research outputs in local languages, we have partnered with Masakhane, which uses technology in translating these languages.
JOY OWANGO: And we also are in partnership with the African Literacy Science Network, where they manually help in translating some of these languages-- these local languages-- into English. Thus, you're going to have access to these publications, not only in the local languages but also in English as well. And at the same time, you're able to increase the output coming out of the continent.
JOY OWANGO: So that bias of looking at English as the language of science is really flawed. A perfect example is the Nobel Prize winner, I believe in medicine, from China. She won the prize, if I'm not wrong, in 2015. And the fact that people did not know about her work was purely based on the fact that her paper was written in Mandarin. So this is how flawed the system is when you're looking at English being the language of science.
JOY OWANGO: We need to diversify that by also adding in local languages or working with technological partners who can help in translating some of these local languages into English so that you can also increase the visibility of the output that is coming out of the respective regions, like in our case, Africa. Earlier on, I talked about ScienceOpen-- our amazing partnership with ScienceOpen.
JOY OWANGO: Now, what I like about ScienceOpen is that we went a step ahead. Not only were we using their platform to increase the visibility of our output-- from our submissions, rather-- but what we did is that we worked together to come up and create a list of the research output that is being done on COVID-19 by African researchers. And as of yesterday, there are over 520 publications produced by African researchers on COVID-19.
JOY OWANGO: So this is updated in real time, and you're able to see African contribution to research, especially when it comes to this pandemic. So this is the power of this platform. It has actually given us a chance to sit at the table and tell the world, this is what we're doing. We are also contributing to research on COVID-19, and this is the output that you can get in one platform.
JOY OWANGO: Now, we are not complaining about the situation we are in. We are cognizant of the challenges that we are facing, especially when it comes to the infrastructural systems and also language barriers. But we are not looking at those as-- we are not complaining about them. We are looking at how we can turn those challenges into opportunities.
JOY OWANGO: And these are the opportunities that have arisen. We have also seen what other archives are doing in the world, and we are trying to learn from their mistakes and improve AfricaArXiv. So number one, one of the opportunities that have come up is increased digital discoverability of African content. And now, we are seeing a lot of output on African research becoming visible.
JOY OWANGO: African researchers are now cognizant of data ownership, research ownership, and research output sovereignty as well. So they are cognizant that they need to invest in systems that can help protect their research but also make them own their research. And they are also cognizant of the fact that data management is a very important aspect if you want to increase your visibility as an African researcher.
JOY OWANGO: Investing in good data management systems and also even studying it-- so this is why now we are looking at it from an opportunistic perspective. So there's increased discoverability of African content because now we are cognizant of what we need to do to increase that visibility. We need to build our own scientometrics based on measurable outputs.
JOY OWANGO: The scientometrics we've been relying on-- and I'm grateful to the pioneers of this, Leiden University, University of Sussex, who are doing quite some amazing work on scientometrics. It is still Global North-skewed. So we need to build our own scientometrics and measure our own outputs. So far, Africans have been using the UNESCO indicators to measure our research output, and as of now, we have several partners coming up to create indicators that would be suitable to measure research output in the African context.
JOY OWANGO: When I say the African context, this is not from a cultural perspective. It's from a political economic perspective. So if you are able to get indicators that are able to measure research output, factoring in the political economic perspectives of the continent, you'll end up getting a wholesome report on how Africa is performing. And when I talk about Africa, let's not merge it into South Africa.
JOY OWANGO: I'm looking at 53 countries minus South Africa. Because I always keep on telling our peers that South Africa is a perfect example of how higher education should be. It's the normative example of how we should be conducting our research and our education systems. But beyond South Africa, the other 53 countries have more or less the same challenges when it comes to increasing our research output and visibility or even managing our higher education systems.
JOY OWANGO: So we need to work with established infrastructure providers for long-term digital storage. So pretty much, let's not reinvent the wheel. Let's not reinvent the wheel. There are already existing digital infrastructure providers. Taking advantage of SDG [INAUDIBLE], we need to work with them so that we can have long-term storage so that our data can still be accessible if anything was to happen to the platform.
JOY OWANGO: So the beauty about this is that at the end of the day, we are able to pull out data that is happening in the continent. We are able to pull out research output that is coming out of the continent. And the fact that it's from Africa and it's about African research-- anything to do with African research-- we as Africans end up having more data for evidence-based policy for monitoring and evaluation.
JOY OWANGO: And it is contextualized to our needs. It is not data given to us. It is data from us, data that is cleaned by us, data that is owned by us. And we end up having sovereignty of our output as well. And we're able to use that for evidence-based policy in monitoring and evaluation, which is really important, especially when governments need to make decisions on the investments they need to make on research and higher education in the continent.
JOY OWANGO: Now, one of the opportunities we decided to take advantage of is capacity building. Capacity building is really important. We cannot ignore the need for capacity building. Because what we noted is that continuous capacity building makes it easier for researchers to understand the importance-- the importance of why they need to work with preprint servers.
JOY OWANGO: Why they need to have personal identifiers as a basis to increase their research visibility and also to manage their research outputs. So in partnership with Training Center in Communication-- TCC Africa-- AfricaArXiv has been working with us now-- our center-- in supporting researchers and training them-- not only researchers, but we're also looking at governments and also institutes-- on the importance of using preprint repositories and using personal identifiers in supporting them in improving their research output and also increasing their visibility.
JOY OWANGO: And they are seeing the importance of this because at the end of the day, as African researchers, it's all about sovereignty-- sovereignty and ownership of data and research output, which is always a big challenge when it comes to this continent. With that, ladies and gentlemen, thank you so much for listening to my talk.
PACKER: Hello, everybody. I am very pleased to participate in this session of NISO Plus 2021 and particularly to share this panel with Joy, Kathy, and Jabin. I will talk about the evolution and the lessons learned from the implementation of SciELO Preprints, which has been operating for nine months.
PACKER: SciELO Preprints is a component and implementation of the SciELO Program. It is an evolution that we understand as a remarkable achievement for SciELO and PKP, who developed the platform. Quality control is a cornerstone of the SciELO Preprints development. It is very critical because, to face the resistance to preprints, we stressed the quality control.
PACKER: Of course, there are an incredible number of challenges and barriers. I will highlight some of them. And to end, I will talk about the final comments and the lessons learned. So SciELO is a program which, at the national level, is a public policy to improve journals published by national institutions.
PACKER: At the international level, it is a program of collaboration. It has two components. One is what we call the publishing model, which is the mechanism, the tools and methodologies to publish a collection of journals and to operate through a decentralized network, which involves today 17 countries and a public health thematic collection. We have also a data repository.
PACKER: We just have our Dataverse. And the SciELO Preprints server, which I will talk about. This shows the yearly evolution of the number of journals and the number of articles. We divide the evolution of the SciELO program into three major periods. The first one was the transition to online publication and the formation of the core collection.
PACKER: Then, we put the centrality on digital publication. So all the journals need to have online manuscript processing, all the full text under [INAUDIBLE], and also to adopt continuous publication. From 2013 until now, 2021, we started the implementation of the transition to open science, which we hope to finish by 2023.
PACKER: It involves preprints, research data, and other content, and the opening up of peer review. So preprints are part of this evolution of the SciELO program. The preprint evolution is very interesting because in the first two months, we had very enthusiastic submission of preprints, mainly on COVID-19.
PACKER: And in the last three months, the average number of documents per month is 80, which means about two to three per day. And the percentage of acceptance is 60%. This is a very small percentage of the official operation because in nine months, SciELO journals in the network process more than 150 submissions [INAUDIBLE] thousand submissions, published 40,000 documents and about 30%-50%.
PACKER: So the market for preprints is greater than what we have seen. In the evolution by thematic areas, we can see that life science predominates, particularly health science, and we see in health science and the human sciences preprints on COVID. SciELO is part of an international, let's say, effort to advance communication, particularly on COVID-19, as is shown in this graphic here.
PACKER: In terms of language, Portuguese predominates, with about 50% today, followed by English and Spanish. And 70% of submissions come from Brazil, then we have 20% from Latin America, and 10% from other countries. So let's talk now about the central topic of this panel, which is quality control.
PACKER: So a submission, and also a revision, goes through basic requirements, which are checked by automatic procedures and a manual procedure, let's say. If your preprint manuscript is submitted by a SciELO journal, it is automatically accepted. If it comes from an author, it goes through premoderation, which follows basic criteria.
PACKER: Of course, if it meets the basic criteria, it is accepted. Otherwise, it goes to moderation, which results in accepted or not accepted. As we saw before, 30%-35% is not accepted. The first quality control is done by the SciELO editorial team. And the second moderation is done by SciELO editors, journal editors.
PACKER: So let's see the basic requirements, the preconditions for a document to be loaded. You need to have title, abstract, and keywords in the original language and in English. ORCID is required for all authors. All authors, when we have more than one, need to specify their contribution. They should also show if there is a conflict of interest. The corresponding author will inform if the manuscript is published by a journal.
PACKER: So the link from the preprint to the version of record is established. All articles receive a DOI and will not be excluded from the server except in the case of a retraction. Declarations will be added to the preprint PDF. Let's take one article here for example. One the student's paper blah blah from Professor John Winieski.
PACKER: We have "this preprint was submitted under the following conditions." So each preprint receives these declarations from the authors in Portuguese, Spanish, or English, depending on the interface. And the authors declare they are aware that they are solely responsible. They follow good ethical practices. They declare that the necessary terms of free and informed consent of participants or patients in research were obtained.
PACKER: And they are described in the document. Authors declare that the preparation of the manuscript followed the ethical norms of scientific communication. And there are all these declarations to which the authors need to say OK. For example, the author declares that all authors' contributions are included in the document. It's 11 declarations.
PACKER: Then, let's say, the basic criteria. Is it a research article? If it is not, it goes to moderation. So reviews, essays, all these go to moderation by a SciELO editor. If it is a research article and at least one of the authors already has three indexed articles with DOI, it is accepted.
PACKER: Otherwise, it goes to moderation by a SciELO editor. And moderation by a SciELO editor follows two basic rules. It is accepted if it brings new knowledge and if it is believed to be acceptable for peer review by a typical SciELO journal. So the idea is that the preprint is the initial, let's say, initial point, initial stage within a flow of research communication.
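The acceptance rules described in the talk can be read as a small decision procedure. The following is a minimal sketch under stated assumptions, not SciELO's actual implementation: the names `Submission`, `premoderate`, `moderate`, and `MIN_INDEXED_ARTICLES` are hypothetical, "already three indexed articles" is read as "at least three", and the two editor judgments are passed in as plain booleans because they are human decisions.

```python
from dataclasses import dataclass
from typing import List

# Assumption: "already has three indexed articles with DOI" means "at least three".
MIN_INDEXED_ARTICLES = 3


@dataclass
class Submission:
    """Hypothetical record of the facts the talk says premoderation looks at."""
    from_scielo_journal: bool               # submitted by a SciELO journal, not an author
    is_research_article: bool               # False for reviews and other document types
    indexed_articles_per_author: List[int]  # indexed articles with DOI, one count per author


def premoderate(sub: Submission) -> str:
    """Return 'accepted' or 'moderation' following the rules stated in the talk."""
    # Manuscripts submitted by a SciELO journal are accepted automatically.
    if sub.from_scielo_journal:
        return "accepted"
    # A research article with at least one author who has a track record is accepted.
    has_track_record = any(
        count >= MIN_INDEXED_ARTICLES for count in sub.indexed_articles_per_author
    )
    if sub.is_research_article and has_track_record:
        return "accepted"
    # Everything else goes to a SciELO editor.
    return "moderation"


def moderate(brings_new_knowledge: bool, fit_for_journal_peer_review: bool) -> str:
    """Editor moderation: both judgments must be positive for acceptance."""
    if brings_new_knowledge and fit_for_journal_peer_review:
        return "accepted"
    return "not accepted"


if __name__ == "__main__":
    review = Submission(False, False, [5, 2])
    print(premoderate(review))    # a review always goes to an editor
    print(moderate(True, False))  # editor rejects if it would not fit a journal
```

Keeping the editor's judgment outside `premoderate` mirrors the split described in the talk between the first check by the editorial team and the second moderation by journal editors.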
PACKER: So that is the flow chart of the quality control. Let's talk very briefly about the challenges and the barriers. The first one is that the classical structure of research communication is deeply established. So it is very hard to introduce a modification. So journals feel that the preprint can be a risk to their positioning.
PACKER: They argue against the fragmentation of the articles, particularly because of how citations will be counted. Authors are afraid of lessening the manuscript's opportunity to be accepted by a journal. Senior authors, most of them, are committed to classical research communication.
PACKER: So they communicate that to the newer researchers. And the research agencies and the research evaluation systems have no proactive position regarding preprints. Open science, which gives authors more control over their research communication, is not that well, let's say, generally accepted or practiced by researchers.
PACKER: Renovation or changing of the journal role, rearrangement [INAUDIBLE]. Also, peer review of a published and non-published manuscript. The use of non-blind operation. Most of [INAUDIBLE] operate under double-blind peer review. And the research data and the other content bring more, let's say, complexity to the peer review.
PACKER: And again, research agencies lack a proactive promotion of open science, particularly preprints. I am talking about Brazil and Latin America. Lessons learned: as a final comment, I would just say, and we have direct experience in 23 years of SciELO, that innovations in the structure of scholarly communication happen slowly, very slowly.
PACKER: The power of preprints resides in the web's disintermediation property. So when you feel the need to publish, you have the new opportunity because the web and its disintermediation property allow that. Preprints are not yet promoted, perceived, or believed to support a researcher's career progress.
PACKER: Preprints, yes, they are adopted when a sense of urgency in communicating our research is present. We need also to agree that preprints add complexity to the management of the quality of the research. As for SciELO Preprints, we understand that it has enriched our research communication infrastructure.
PACKER: So we can now have, in addition to journals, a preprint repository and a research data repository. Thank you very much. This is my presentation on the evolution of SciELO Preprints. Thank you very much. [MUSIC PLAYING]