Name:
Diversification and Decentralization of Peer Review
Description:
Diversification and Decentralization of Peer Review
Thumbnail URL:
https://cadmoremediastorage.blob.core.windows.net/b34d14fa-724b-4060-b096-9ec3ae696d14/videoscrubberimages/Scrubber_1.jpg
Duration:
T01H03M45S
Embed URL:
https://stream.cadmore.media/player/b34d14fa-724b-4060-b096-9ec3ae696d14
Content URL:
https://cadmoreoriginalmedia.blob.core.windows.net/b34d14fa-724b-4060-b096-9ec3ae696d14/SSP2025 5-30 1330 - Session 5C.mp4?sv=2019-02-02&sr=c&sig=J0URnJdSl1v2ch2CE5H5AeB9OJXYMsL91udnlSjEteY%3D&st=2025-12-05T20%3A57%3A29Z&se=2025-12-05T23%3A02%3A29Z&sp=r
Upload Date:
2025-08-15T00:00:00.0000000
Transcript:
Language: EN.
Segment:0 .
Hello, everybody. I'm going to go ahead and start on time. So thank you for joining us, and thank you to my speakers: John Inglis from Cold Spring Harbor Laboratory, Fiona Hutton of eLife, and Emily Easton of Knowledge Futures. This session, Diversification and Decentralization of Peer Review, is grounded in two companion articles that explore how peer review is diversifying and decentralizing, both in theory and in practice.
Today's speakers will expand on these ideas with real-world tools, case studies, and forward-looking perspectives. So if you want to, maybe take a picture of that QR code and you can go and read those articles. I'm going to move on now, so go ahead and get that done. Excellent. OK, I see the phones going down. So: peer review as we know it faces growing challenges.
There's reviewer fatigue. There's lack of transparency, biases in selection. The push toward diversification and decentralization is not just about innovation. It's about equity, trust and scholarly integrity. Our first presenter explores how decoupling peer review from traditional publishing via preprint servers enables faster dissemination, greater transparency and innovation in research evaluation, highlighting how preprints accelerate discovery and foster community engagement and support flexible, modular peer review processes.
Our next speaker advocates for a more diverse and decentralized peer review system that empowers historically excluded voices, particularly early career researchers and scholars from underrepresented regions, through community-led preprint review. And then decentralization: it doesn't mean disorganization. In fact, it requires robust infrastructure, and our final presenter will focus on the technologies that can support peer review models like the ones we're discussing.
So with that, I'm going to turn it over to John. And he's... so you move them yourself with this? Yeah, OK. Doing that right? Yeah, so where do I point it? Over there. Are you bringing the other slides up?
There we go. Oh, he's got it. There you go. OK, good. Well, good afternoon, everyone. Thank you for coming and choosing us over the many competing attractions, including sunshine outside for once.
We're here, as Tony said, to talk about the variation and evolution (let's call it evolution) of peer review. And I speak as a co-founder of two preprint servers, bioRxiv and medRxiv, and I should say right from the outset that, despite what you may have read in various venues, including the blog associated with this excellent society, people who run preprint servers are not against peer review.
Far from it. What we are interested in is whether preprints can provide a platform for the evolution of peer review in a variety of ways. OK, so you're all very familiar with the pattern of traditional publication, which of course does involve peer review, then the certification of that content by the journal and then the distribution of it.
This is a model that has worked very well for a very long time. But its one big problem is that it is very slow. And, as Tony alluded to, it is not always as inclusive and diverse as we would like it to be. What we're interested in is whether preprints offer us a different path forward. And a preprint, of course, is simply an author's manuscript that is being distributed without peer review.
But with screening. So the difference is that with a journal article, once it's gone through all the various iterations and steps that you're all very familiar with, it may take months or even years for that piece of work to reach the community, whereas a preprint takes hours or days. Preprints were pioneered over 30 years ago in the physics and mathematics community by arXiv.
arXiv was based on a pre-existing email-based communication system amongst certain kinds of physicists, and then it evolved into a web server. It has been a long-running not-for-profit funded by Cornell University, and it really is a central element in the communication strategies of physicists and mathematicians, with enormous numbers.
I think 120,000 preprints are submitted each year, and they have a corpus now of well over 2 million. In 2013, Cold Spring Harbor Laboratory, where I am based, was gracious enough to give me and my co-founder, Richard Sever, the opportunity to try to bring preprints to biology. This is an effort that had been tried before without success. We had the sense that the timing was right, and that coming from an academic institution that had a long history of sharing science in a variety of different ways, we thought we might have a chance of success.
And I will tell you something about our progress. We are still going, I am glad to say, and we are doing so because we have had generous funding from a variety of sources, including the laboratory, but also the Chan Zuckerberg Initiative, the Sergey Brin Family Foundation, and also contributions from academic institutions. And again, I would emphasize that what we do at the preprint servers is not peer review, but screening.
Since bioRxiv began, there's been a proliferation of efforts to distribute un-peer-reviewed manuscripts. They're all collectively called preprint servers, but they work in different ways. Those variations I have listed at the bottom of the screen: variations by discipline, article type, language, region and submission requirements, for-profit and not-for-profit purpose, and different degrees of independence from publishers.
And that's one thing I should emphasize: even though bioRxiv and medRxiv were started by two guys from a not-for-profit publishing house, we kept the two operations separate. In 2019, we followed up with medRxiv, after thinking about it for quite a long time because of the anxieties around medically related information. MedRxiv focuses on the health sciences. Again, these are free services, free to read, free to submit to.
We did medRxiv in partnership with colleagues from the Yale Medical School and from BMJ, and we started it six months before the pandemic arrived. Immediately, medRxiv became a very important distribution channel for the latest information about the virus and its various effects.
That collection of articles continues to grow and is now over 22,000. And just last week, there was the first report from China of two new variants of SARS-CoV-2, which are going to be reaching our shores very soon. So now medRxiv has moved beyond its roots in infectious disease, and in the last few months both medRxiv and bioRxiv have achieved the highest monthly totals they have ever had.
So, screening. I'm not going to have time to go into too much of this, but we could talk about it afterwards if anyone is interested. It is screening for scope: in other words, within the remit of the different servers, we only accept research articles, not reviews or hypotheses. We screen for risk of harm, things like dual-use research of concern, and we also have ways of flagging fraud.
The screening is conducted with a combination of an in-house team, affiliated scientists, and tools which we are developing. There are various requirements on authors to declare aspects of their submission, which I've listed here. Again, I don't have time to go through all of these things. Since 2013, the servers have continued to grow, and now we're at the stage of posting over 5,000 manuscripts a month.
If it's a straightforward submission that doesn't require any interaction with the authors, manuscripts are posted between two and four days after we receive them. There is a great deal of usage of these sites; 10 million views a month is not unusual. And the point, going back to the question of peer review, is that eventually somewhere between 70 and 75% of these manuscripts end up in a journal after peer review. Another thing that's changed in the 12 years since bioRxiv began is that funders have become increasingly interested in the potential of preprints, and a number of those funders have either encouraged or even mandated their grantees to post preprints of work that those funders supported.
So just a few months ago, we took the step of spinning out these two servers from Cold Spring Harbor Laboratory into their own 501(c)(3) not-for-profit, an organization that will now manage the growth and further development of the servers. The goals of openRxiv are basically twofold. One is to accelerate discovery.
And here is a major scientist who's done some advanced math to try to persuade us that if every paper yielded two new ideas, then as bioRxiv scaled over 10 years, there would be a five-fold acceleration in the rate of scientific discovery. I leave it to you to assess the validity of that calculation. The other goal is to enable other elements in the scholarly publishing ecosystem to innovate in a variety of ways, and that, of course, covers many aspects of that ecosystem, from reproducibility to discovery and discussion more generally.
And on to the question of certification. I'm really only going to talk about assessment and certification at this point. We are delighted that bioRxiv, in particular, has become a platform for various projects and initiatives that are attempting to do peer review in different ways. And again, for lack of time, I cannot do justice to all these different projects.
But we have Fiona here, who's going to talk about eLife in particular, which is a very imaginative use of bioRxiv as a platform that supports peer review. There are a number of initiatives, several of which I'm sure you're familiar with because they've been represented at this meeting and at many other publishing gatherings before. What we have tried to do, recognizing the existence of these initiatives, is to provide a way of helping readers discover the outputs of these assessment initiatives.
So, on every preprint we have a launch bar, down on the bottom right there, and clicking on any part of that opens up a dashboard which reveals the different kinds of assessments that have been provided for that particular preprint. One of those is what we call Transparent Review in Preprints, and a number of organizations contribute their reviews of preprints to our site; we post them in this form so that readers can see them.
Prominent amongst those organizations are eLife and also Review Commons. They account for most of these peer reviews, and at the moment we have over 10,000 preprints with opted-in peer reviews that you can read on the site. I should also say, about the previous slide: among those initiatives there are different ways of approaching assessment, so some of them require the author to opt in and some of them do not.
So that's another place where peer review is evolving: the question of the engagement of the author in the process. We are seeing a variety of experiments and ideas about how to change peer review when it's done as part of the process of distribution. And obviously, in the case of preprints, it's done after distribution rather than before.
How it's done: all these different initiatives have different takes on that. Who does it, who is engaged, where in the community they come from, what stage in their careers they are at, and what is being reviewed. Because, as you well know, every research paper has many different elements.
It's entirely possible to break down a paper into those different elements and have review focus on each individual one: perhaps the methods, perhaps the statistical analysis. So people are interested in exploring those things. We are also working with a number of organizations that are interested in verification of content in different ways. We've recently implemented a relationship with SciScore, a company that provides data and validation of the ingredients, the materials, that go into experiments.
We're also working with Dryad, and we'll be implementing that soon: an author can deposit a dataset at Dryad, and there will be a link from the preprint to that dataset. And everybody now knows about DataSeer because of this morning's prize-giving. We've been working with Tim and his team for a couple of years now on giving authors the opportunity to offer up their datasets for advice from DataSeer about which repository is appropriate for that data.
So we regard these kinds of add-ons or extensions of the preprint as potentially increasing levels of trust in the material that's presented in the preprint. As I say, I've covered a number of ways in which we are doing that. What you're seeing here, at the bottom on the right-hand side, is the manifestation of the SciScore connection with a particular preprint.
So I'm setting the stage here for what my colleagues on the panel are going to talk about in more specific detail. But basically, I think we are looking at a situation where scientific assessment itself is potentially evolving into a process where you have, first of all, posting of a preprint after screening; then the accumulation of different kinds of trust signals, which may come from a whole variety of sources, including comments on the site and formal peer reviews; and then eventually certification and publication by a journal or a journal-like entity.
So with that, just a quick acknowledgment of my amazing colleagues, who work so hard to keep this whole show on the road. And I also want to thank the many generous funders that make these free services possible. And now over to Fiona. Thanks a lot. Thank you. Can you bring your slides up?
There we go. OK, great. Thank you very much, and thank you, Tony, for inviting me to talk in this session. My name is Fiona Hutton. I'm the head of publishing at eLife, and really this is a sort of education session.
So what I want to talk about is why we actually need diversification and decentralization of peer review and what the benefits are, and I'll touch a little towards the end on the eLife model and why we've done it. As Tony and John said, we speak to a lot of scientists, and scientists are actually quite unhappy with the current system. There's a certain percentage of scientists who actually blame the publishing community for the situation that we've ended up in.
We don't want that to be the case. I've worked in publishing for 20 years; we want to be part of the community, helping scientists do science better. So when they are complaining about a system and asking us, as a community, to work with them to innovate, I think it's our responsibility to do that. So what's wrong with the current system?
Obviously, as we've heard, it's overburdened, it's inefficient, it's slow, it's opaque, and it can be biased. It's clear that we need to think about different ways of changing that. So let's go over here. Firstly, thinking about diversity in peer review: we know it's a mostly closed and often anonymous system, and it's linked to biases based on gender, nationality, and the institutional prestige of authors.
So this was a study that was done at eLife a number of years ago, so hopefully things have improved slightly since then. But it illustrates the point: who are the experts being asked by publishers to do peer review? In this particular study, it became clear that over 75% of gatekeepers, gatekeepers being the reviewers or the editors, were men.
The majority of them were coming from North America and then Europe, and a very small number from elsewhere. Hopefully this is changing, and I know a lot of publishers are making really active efforts to change it, but it does illustrate the point that the people who are doing what some call validating, or peer reviewing, the science are coming from a particular pocket of the population, and that's not representative of science. You've also got to think about who's acknowledged as an expert.
So we're really interested in early career researchers and how they contribute to the whole system. We did a survey of early career researchers, a few years ago now, and over half of those respondents had never done an independent peer review. The interesting thing is that they had co-reviewing experience; they had a lot of experience with that.
But what had happened was that their PI would get the paper to peer review, the postdoc would peer review it and then get no credit. Their name wasn't known to the editorial office that asked for the peer review. So this is seen by that community as a kind of ghostwriting: they're not getting the credit, and the journal doesn't know.
Actually, there are experts within those labs who are contributing to science but just not getting acknowledged. So it's really clear that we have to make efforts to make this pool of peer reviewers much more diverse. So what do we mean by decentralization, and why do we want to do it? We want to address those shortcomings of traditional peer review.
We want to open it up to a broader range of people. We want to make it more transparent and give authors more control; this is really key to what a lot of scientists tell us. There needs to be real accountability in the process: accountability of the editor, of the reviewer, of the author. Transparency increases scrutiny of what's happening during the peer review process.
So I don't know how many of you might be familiar with PRC, but PRC is a publication model called publish, review, curate. It's essentially based on the idea that preprints are out there in the community: that research is already published; what it isn't is reviewed. It's not assessed, it's not curated. So the PRC model starts with publish, which means being on a preprint server or an open access repository. Then comes review:
the review can be a standalone contribution; it can be associated with the article through metadata. And at the end of that process there needs to be a curation step, and that can take many forms: it could be an evaluation, a collection of those articles, or acceptance into a journal. Many people are thinking about that in different ways. It's also clear that, a year or so ago, cOAlition S, the group behind Plan S, developed a proposal they call Towards Responsible Publishing.
And within that they talk about PRC models; the way forward is a community-led, scholar-led model of publication. I'm not going to dwell on this because John gave a great overview of preprints, but this is just to illustrate the increase in preprints over time; this is just the last five years. You can see that this is increasingly what authors do, and it's not going to go back into the bottle. Essentially, when you think about what preprints are: scientists go to meetings and talk about unpublished research all the time. This is another way of doing that. They've got unpublished research, they talk about it with their communities in different ways, and the community helps them build on that and helps progress the scientific conversation.
So it's the start of a scientific discourse that we feel is really important, and the process of building that trust and support is being driven by the scientific community. This is just an example: there's a lot of diversity in this independent review of preprints, and a number of examples here that John also alluded to. Review Commons is run by EMBO.
It's journal-independent peer review tailored to the life sciences, and there are 28 journals in the system. The articles get peer reviewed, and after that peer review the articles get submitted to the journals in that system; the journals either add more peer reviews or accept the reviews the article came with. I think the majority of those peer reviews are accepted.
I think it's over 90% without additional peer review needed. So this is a way to understand which journal is the best fit, given the peer review reports. Then there are PREreview and ASAPbio. They're both focused on training peer reviewers, providing them with tools, and diversifying the system; ASAPbio is experimenting with crowd review, where they have 14 different individual contributors conducting a peer review, and they get really engaged researchers.
A lot of the people they work with are early career researchers, so they're wanting to do something different. There are other examples: PCI, Peer Community In, which is organized around certain scientific subjects, and there is also preLights, which highlights curated content in the life sciences. There are a number of other examples of these independent reviews.
We've also developed a platform called Sciety. Sciety brings different peer review groups together; they look at different sets of peer reviews organized in different communities, and more than one group can evaluate a single preprint. This just shows you the diversity that's happening within these groups.
This is an example of the different groups that are there. There's lots of real activity and excitement amongst these different groups. Any journal or organization can go and have a look and set up their own group. This is a really great way for young researchers interested in journal clubs to actively contribute, add their thoughts to preprints, and organize preprints in the way they want.
Some are associated with journals as well, but you can just go in and play around with preprints. There are a number of publish-review-curate models that are more like journals, and some of them are what we would call born-PRC journals: they've launched with the idea that they start with a preprint, then review, then some kind of curation.
There's a lot of activity going on in these communities. Quite recently MetaROR launched, which is a metascience journal; that community is very enthusiastic about this model and sees it as the future. But there are other journals, such as eLife and GigaByte, that have actually switched from the traditional model of publishing. There are also overlay journals in other fields, such as mathematics.
Discrete Analysis, for example, is an overlay journal. Each has its own way of looking at this model and experimenting. But what's really cool is that all these different organizations are coming together and saying: let's get together, let's experiment, let's try to do something different to really figure out how best to do peer review of articles in certain fields.
So I just wanted to tell you a little bit about the eLife model here. Why did we launch this model? Fundamentally, we feel that with preprints the science is already published, so what value does a publisher need to add at that point? The value that we feel we're best placed to add is to review and curate articles. So what we're actually doing now is publicly reviewing and curating articles rather than seeing ourselves as a journal.
So authors submit their articles. The article has to be preprinted; if it's not, we ask them to submit it as a preprint. There's a decision to review, generally made by around three editors. It's sent out to review, and then there's a peer review and consultation period, and an eLife assessment is also produced.
The eLife assessment discusses the significance of the work and the strength of the evidence in the article. The authors at this point are able to respond to those reviews, and the whole thing is published together as a kind of integrated document, which we call a reviewed preprint. So here we are reviewing preprints, and that is all public.
The assessment says very clearly what the editors and the reviewers felt about the article. As the author, if they don't like the reviews, they can't say at any point, "I don't want this to be published"; they make that decision when they send it to us to be reviewed. What we are doing is reviewing it publicly, being open and transparent about the problems or the good things about that particular article.
The interesting thing about a reviewed preprint is that a number of funders are accepting them for research assessment, so they don't require their students or their researchers to then publish a version of record. They're happy that those reviewed preprints are enough for research assessment within their own organizations. But we let the authors choose what to do next. They might decide to keep their article as a reviewed preprint, and it sits there.
A number of them are quite happy with that outcome. Others, the majority of authors, revise their manuscript, and we go through the process again, with an updated eLife assessment. Or the author can decide to send their article to another journal; a number of them do that. The article is still a preprint, with its metadata. Some authors who don't like our assessments, because our assessments are fair assessments of the quality of the work, send their article to other journals and get it indexed there.
Obviously that reviewed preprint is still there, published on the eLife website and also indexed in Google Scholar. So if they didn't like the comments, it might have been better for them to actually address them, but that's entirely the author's choice. Or the author can decide: I'm happy with this version, I'm happy with the comments.
I'm going to publish it as a version of record for indexing, indexing in PubMed I should say. So what benefits does this have? It gives authors control. This is really important for a lot of the authors we've spoken to. It's clear that a lot of authors are very upset with the current system, and this gives them the ability to decide what to do with their article.
It increases scrutiny. Obviously it's in combination with open science practices, and that's really important for the scientific discourse. And it stops that regurgitation and re-review of articles at different journals, with the same problems, that just end up getting accepted somewhere else, where nobody knows why it was accepted, or why it was rejected by the first journal in the first place.
It's stopping that regurgitation process that people are getting so fed up with. And the most important thing is that it's looking at the science. There's an evaluation, the eLife assessment, that looks at the research in that article. Authors are not saying "I got published in Super Journal X" so that readers make a particular assumption about the article; all articles in journals are different. These assessments are based on the actual research and the comments within the article.
So it's a much more nuanced process, because that's what science is: nuanced, self-correcting, evolving. That's what we're trying to show with this model. So what do our authors think? We asked them to pick the key benefits of the model: greater transparency, fewer wasted rounds of revision, and accelerated publication, because the first version of the reviewed preprint is taking about three months to publish,
whereas in our legacy model it was taking about nine months to get a reviewed version of the manuscript published. So what's the benefit of this system altogether? It's faster. It's good for science: that nuanced discourse of science is actually what we want to expose. It's good for scientists because their work gets higher scrutiny. A lot of our reviews are looked at for certain papers, papers that are maybe controversial.
A lot of people are looking at those reviews. And so that's really good for the scientific process. The whole thing is transparent. There's that accountability throughout the process, and it gives the authors control, which is something that seems to be a real key thing to satisfy authors. Thanks very much.
Thank you all for joining us, and thank you to John and Fiona for starting this off. My name is Emily Austin. I'm the customer success manager at Knowledge Futures, where most of my day job is explaining to people how they can do the things they want to do with technology. I'll be giving an overview of our work on mapping preprint review metadata transfer workflows, a collaborative initiative to improve interoperability in scholarly communication.
So, thinking about the work that Fiona and John have already discussed: these new paradigms for publication, discovery, and conversation around peer review also require tools for facilitating that work across the ecosystem. If we're going to diversify and decentralize our processes, methods, and opportunities for publication, then we need to think about how the technology we're using is also supporting that work. This project and the resulting preprint are part of a working group organized by Europe PMC and ASAPbio to support interoperability of peer review.
It was one of several working groups, and this preprint specifically sought to document the technical elements of metadata transfer pathways, making actionable recommendations for new preprint review groups and encouraging community participation to evolve and improve current standards. I've used the word ecosystem a few times now, and it's important to define who and what make up the broad and complex set of partners in that ecosystem.
We were looking at preprint platforms, review providers, commenting tools, metadata agencies, and indexers, all connected by various APIs and scholarly communication standards. Interoperability is ensuring that all of these providers and partners can communicate effectively across this diverse network. How these various actors communicate comes down to protocols, schemas, and standards, things our working group spent time documenting.
These govern how data is communicated across providers, and by aligning on common protocols and frameworks we can foster more consistent and discoverable peer review metadata practices. We grouped the ones we identified into three categories: communication protocols, data exchange frameworks, and metadata schemas and standards. Communication protocols specify rules governing information transmission.
Implementing a communication protocol ensures that information about scholarly content updates is efficiently shared across different platforms. Within communication protocols you'll find terms like notification format, transmission protocol, authentication options, and notification processing routes. For example, COAR Notify is used for sending notifications between systems when a research object is updated or undergoes a status change.
You can see in this diagram a preprint server or platform notifying a preprint review platform or journal publisher via a communication protocol. In the context of preprint peer review, the COAR Notify protocol is used by preprint servers such as HAL to notify participating preprint review groups, like Peer Community In, about review requests coming from preprint authors.
So we see how HAL creates a preprint manuscript and a preprint metadata record, makes the preprint available for review to Peer Community In, and also notifies Peer Community In about that request using COAR Notify.
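To make the notification step more concrete, here is a minimal sketch in Python of what a COAR Notify "request review" payload and its delivery to a Linked Data Notifications inbox might look like. The inbox URLs, DOIs, and service identifiers are placeholders, and the exact vocabulary should be checked against the COAR Notify specification; this is an illustrative assumption, not a copy of the HAL or Peer Community In integration.

```python
import json
import urllib.request
import uuid

# A minimal, illustrative COAR Notify "request review" payload.
# The inbox URLs, DOI, and service identifiers below are placeholders,
# not real endpoints; consult the COAR Notify specification for the
# authoritative vocabulary and required fields.
notification = {
    "@context": [
        "https://www.w3.org/ns/activitystreams",
        "https://purl.org/coar/notify",
    ],
    "id": f"urn:uuid:{uuid.uuid4()}",
    "type": ["Offer", "coar-notify:ReviewAction"],
    "origin": {                      # the preprint server sending the request
        "id": "https://preprint-server.example.org",
        "inbox": "https://preprint-server.example.org/inbox",
        "type": "Service",
    },
    "target": {                      # the review community receiving it
        "id": "https://review-community.example.org",
        "inbox": "https://review-community.example.org/inbox",
        "type": "Service",
    },
    "object": {                      # the preprint the author wants reviewed
        "id": "https://doi.org/10.0000/example.preprint.1",
        "type": "sorg:AboutPage",
    },
}

def send_notification(payload: dict, inbox_url: str) -> int:
    """POST the notification to the target's Linked Data Notifications inbox."""
    request = urllib.request.Request(
        inbox_url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/ld+json"},
        method="POST",
    )
    with urllib.request.urlopen(request) as response:
        return response.status

# Example call (only meaningful against a real inbox):
# send_notification(notification, notification["target"]["inbox"])
```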
Secondly, we have data exchange frameworks. These establish standards for the systematic transfer of information between different systems, and they ensure that metadata about preprints and their reviews is shared accurately and consistently across various platforms. Here you'll see terms like metadata formats, data harvesting protocols, validation processes, and interoperability standards. In this example, a preprint server or platform creates a manuscript and metadata record and then shares that information, via a metadata standard, with a review platform or journal publisher. A key example is MECA, which stands for Manuscript Exchange Common Approach.
Here, medRxiv makes the preprint manuscript and its associated metadata record available on its server, and if the preprint is selected for review by eLife, the manuscript and metadata record are shared with eLife, through the BMJ process, using the MECA standard.
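As a rough illustration of what travels in a MECA transfer, the sketch below assembles a minimal MECA-style zip package in Python. The file names and manifest fields are simplified assumptions about the general package shape; a real MECA package follows the published schema and carries more files (transfer metadata, the manuscript itself, any reviews).

```python
import zipfile

# Illustrative contents of a MECA-style package: a manifest listing the items
# in the package, plus the article metadata itself. Real MECA packages follow
# a published XML schema (manifest.xml, transfer.xml, full article XML, PDFs);
# the structure below is a simplified stand-in for demonstration only.
MANIFEST_XML = """<?xml version="1.0" encoding="UTF-8"?>
<manifest>
  <item type="article-metadata">
    <instance href="article.xml" media-type="application/xml"/>
  </item>
  <item type="manuscript">
    <instance href="manuscript.pdf" media-type="application/pdf"/>
  </item>
</manifest>
"""

ARTICLE_XML = """<?xml version="1.0" encoding="UTF-8"?>
<article>
  <front>
    <article-meta>
      <article-id pub-id-type="doi">10.0000/example.preprint.1</article-id>
      <title-group><article-title>Example preprint</article-title></title-group>
    </article-meta>
  </front>
</article>
"""

def build_package(path: str) -> None:
    """Write a minimal MECA-like zip containing a manifest and article metadata."""
    with zipfile.ZipFile(path, "w", compression=zipfile.ZIP_DEFLATED) as package:
        package.writestr("manifest.xml", MANIFEST_XML)
        package.writestr("article.xml", ARTICLE_XML)
        # A real transfer would also include the manuscript PDF, transfer.xml,
        # review files, and any supplementary items listed in the manifest.

build_package("example-meca-package.zip")
```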
And then finally we have metadata schemas and standards. These are structured frameworks for representing and organizing metadata about a research object. By implementing a metadata schema, platforms can standardize information related to the peer review, or about the preprint in general, and make it easily understood across various platforms and systems. Here you'll see elements like data fields, relationships, and controlled vocabularies, ensuring comprehensive and interoperable metadata representation. In this diagram, the preprint review platform or journal publisher creates, via some tool,
the preprint review and its metadata record; the review goes into a repository and is then deposited, via a metadata schema, with a metadata registration agency. First we have the example of Crossref. Information about ScienceOpen preprint reviews is displayed in Europe PMC, a repository for journal articles and preprints in the life sciences. Europe PMC retrieves metadata for those peer reviews using the Crossref API and associates them with the preprint via the is-review-of relationship in the Crossref peer review metadata, which indicates the DOI of the reviewed preprint.
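Here is a hedged sketch of that lookup in Python against the public Crossref REST API: fetch the peer review record and read its is-review-of relation to find the DOI of the reviewed preprint. The DOI below is a placeholder, and the precise location of the relation data should be verified against Crossref's current JSON output.

```python
import json
import urllib.request

CROSSREF_API = "https://api.crossref.org/works/"

def reviewed_preprint_dois(review_doi: str) -> list[str]:
    """Fetch a peer review record from Crossref and return the DOIs it reviews.

    Assumes the review record carries an 'is-review-of' relation, as Crossref
    peer review metadata typically does; returns an empty list otherwise.
    """
    with urllib.request.urlopen(CROSSREF_API + review_doi) as response:
        record = json.load(response)["message"]
    relations = record.get("relation", {}).get("is-review-of", [])
    return [rel["id"] for rel in relations if rel.get("id-type") == "doi"]

# Example (replace the placeholder with a real review DOI before running):
# print(reviewed_preprint_dois("10.0000/example.review.1"))
```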
Similarly with DataCite: in this case F1000Research creates the preprint review metadata record and deposits it to DataCite using DataCite's metadata schema, which DataCite can then share out via its API for others to ingest.
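For a sense of what such a deposit might carry, here is a minimal sketch of a DataCite-style metadata record for a peer review, written as a Python dictionary. The field names reflect my reading of the DataCite metadata schema (a PeerReview resource type and a relatedIdentifier with a Reviews relation), and the identifiers are placeholders, so treat this as an assumption to check against the current schema documentation rather than a copy of F1000Research's actual deposit.

```python
# Illustrative DataCite-style metadata for a peer review record.
# Identifiers and creator names are placeholders; field names approximate
# the DataCite metadata schema and should be checked against its docs.
peer_review_metadata = {
    "doi": "10.0000/example.review.1",
    "types": {
        "resourceTypeGeneral": "PeerReview",
        "resourceType": "Peer review report",
    },
    "titles": [{"title": "Peer review of: Example preprint"}],
    "creators": [{"name": "Reviewer, Anonymous"}],
    "publisher": "Example Open Research Platform",
    "publicationYear": 2025,
    "relatedIdentifiers": [
        {
            # Links the review to the preprint it evaluates.
            "relationType": "Reviews",
            "relatedIdentifier": "10.0000/example.preprint.1",
            "relatedIdentifierType": "DOI",
        }
    ],
}
```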
And then finally DocMaps, which is a project I worked on. DocMaps is a metadata standard, or framework, for representing the creation process of research outputs in a machine-readable, extensible, interoperable format, focused on the events that happened. In this case, CDI creates a DocMap describing a preprint review metadata record and shares it via its API, with all of the different events that have occurred in relation to that preprint review.
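To show roughly what such an event-centred record looks like, here is a pared-down DocMap-like structure in Python. The step, action, and output fields follow the general shape of published DocMaps, but the identifiers, dates, and field selection are simplified assumptions rather than a schema-complete example.

```python
# A pared-down, illustrative DocMap-like structure describing one review event.
# DOIs, dates, and the publisher entry are placeholders; consult the DocMaps
# specification for the full context document and required fields.
docmap = {
    "@context": "https://w3id.org/docmaps/context.jsonld",
    "type": "docmap",
    "publisher": {"name": "Example Review Community",
                  "url": "https://review-community.example.org"},
    "created": "2025-05-30T00:00:00Z",
    "first-step": "_:step-1",
    "steps": {
        "_:step-1": {
            # What the step asserts about the preprint...
            "assertions": [
                {"item": {"doi": "10.0000/example.preprint.1"},
                 "status": "peer-reviewed"}
            ],
            # ...which input it acted on...
            "inputs": [{"doi": "10.0000/example.preprint.1",
                        "type": "preprint"}],
            # ...and the review it produced, with its participants.
            "actions": [
                {
                    "participants": [
                        {"actor": {"type": "person", "name": "Anonymous reviewer"},
                         "role": "peer-reviewer"}
                    ],
                    "outputs": [
                        {"type": "review-article",
                         "doi": "10.0000/example.review.1",
                         "published": "2025-05-15"}
                    ],
                }
            ],
        }
    },
}
```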
In the project, we noted six different workflows that we hope cover the majority of use cases in the ecosystem. Each workflow comes with a detailed graph like the ones you've seen, following a specific preprint, its reviews, and their metadata on their journey through the ecosystem; each maps the metadata workflow against the ecosystem roles noted above, along with the benefits and challenges of that particular workflow. The workflows we came up with are: a preprint server with an integrated review option, like ScienceOpen Preprints; a preprint review group using a commenting platform to handle its reviews, like Arcadia Science; a preprint review platform registering reviews itself, like Peer Community In, preLights, or Rapid Reviews; a preprint review platform using a repository, like PREreview; an open research platform, like F1000;
and the publish-review-curate platforms that Fiona just mentioned, such as eLife, the Peer Community Journal, and Review Commons. The last thing we wanted to note in this overview is that these can become really complicated graphs, so in this case we take all of the names out and just show which kinds of standards or frameworks are present.
In reviewing all of these workflows, as I mentioned earlier, the goal of the report was not to be determinative. We didn't want to tell people specifically which standards or frameworks they should be using, because each of the metadata workflows we found has benefits and challenges. It's important to consider the technical needs and capacity of a team in order to implement these kinds of systems,
the long-term sustainability, and the ability to ensure discoverability. A key goal is to link preprints to their associated reviews, no matter where or how they were created. So if you're interested in learning more or getting involved, I encourage you to read our full preprint. It's important that we work together to build a more connected and transparent scholarly communication landscape.
Thank you. So we have some time for questions. There's a microphone in the center. We could hand it around if we need to. But just to give an overview summary, what John talked about was the preprint server and how the paper comes in from the researcher, and how then there are organizations that are interested in taking those preprints and doing something with them.
It might be a peer review; it might be working with junior researchers and early career researchers to teach them how to do peer review, or other sorts of things they could possibly do with those preprints. Then Fiona talked about some very specific use cases and specific organizations that are actually doing that kind of work: organizations that are taking preprints out of the wild, doing peer review on them, and making those peer reviews accessible to others, so that those peer reviews can potentially move from the preprint server on to actual journals to help the peer review process.
And then Emily talked about some of the infrastructure that already exists today and is facilitating all of this movement behind the scenes, whether it be protocols that help move the material around, or protocols that describe what has been written up and all of the processes that have been applied to that piece of research, sending that description throughout the ecosystem so that everybody can see what has happened with that piece of research as it has moved through.
That, I hope, has given you somewhat of a summary to help you pull this all together. So does anybody have any questions? Are you sure you don't have any questions? We have somebody coming up. Is this on? It should be on. Great. I have a couple, and they're going to sound really skeptical, but I truly do believe this is a great system; I just fear
we're not ready yet in this industry as a whole. So I'm scared of it too. And my first question is: are there any safeguards, especially with all this talk about AI that we've heard in the last two days and over however many years? What safeguards are in place to make sure that information that's put up freely and openly isn't being taken and used without any sort of proof burden, anything to show it's actually trustable?
That's not a word. But yeah, just to maintain that proof burden, but also keeping researchers safe, so that their information isn't being taken and stolen in a way that we don't know how to control yet. Yeah, I think that's fine. So we have the same stringent systems in place as any other journal and any other organization.
We're using all the same or similar tools that everybody else has in terms of research integrity. What we also think, though, is that in order to do open science, in terms of thinking about the provenance of that research, it has to be open. It has to be open science. You have to be able to see the comments that are made by reviewers, by editors, by authors.
You have to link to the data. At eLife anyway, we have a really stringent editorial board and process, actually more stringent than at any publisher I've ever worked for before. So it's not the fact that things are open that should worry us; it should be the fact that things are not open, because when it's not open, it's opaque.
You don't know why decisions are made. You don't know what the provenance of those bits of research is. And that's the danger, and that's why we're championing the open science route in order to actually address all of those problems. Forgive me, I'm not sure I heard all of your question, but I think you were asking about whether, say, the preprint servers are using AI-based tools in the screening process, or, on the opposite side, whether AI tools are scraping the preprint sites, if that makes sense.
So if it's an open platform that people are publishing to, bioRxiv or medRxiv, et cetera, and it's not going through peer review and all those copyright things and all that legal jargon that goes with that, an OpenAI model is able to take that information off of bioRxiv or medRxiv or wherever. I see what you mean. Protecting researchers in general.
I think all of us who are publishers in the room realize that particular horse has left the stable. The question now is: what do we do going forward? And I know there are a lot of publishers engaged in pretty intense conversations with AI companies, trying to enter into more formal and more rewarding relationships, rather than just having that content be scarfed up and repurposed.
I'm part of a group of publishers who are having those kinds of conversations, and there are some interesting possibilities emerging. Let's put it at that. I think that the more responsible AI companies realize that scholarly content has great value, and they want to realize that value. And we obviously, as scholarly publishers want that to happen, too.
But we also want attribution for the source and for the authors. And I think many of the major AI companies realize that they made a misstep by not having those kinds of conversations up front. Now there's a certain amount of trying to catch up, and I hope that will be beneficial for all concerned. I don't think there's any way around it. I know that initially there were thoughts that, well, we must exclude these crawlers from particular companies.
I think that's probably fruitless. What we need to do is try to act as responsible stewards of the content that scientists have been good enough to give to us and work on their behalf, as well as on the part of our own organizations. Thank you. I'm sorry. I hope that answers the question you asked. Yeah thank you.
Yeah, from an infrastructural perspective: in the case of COAR Notify, there are protections in terms of whom systems accept information from and whom they send information back out to. It's not an open inbox that accepts all of these notifications; they're going from one known space to another. In the case of DocMaps, less so in terms of the metadata itself, but more in terms of the content it's relating to.
I was thinking about how Fiona mentioned the value-add of publishers. In this case it is the ability to do review, but it's also that publishers come with a specific level of trust and respect within a community. And so in a DocMap, you can see: this is where the review happened, this is who did it, and also this is the space or platform on which it occurred.
So there is a sort of check: OK, if all of this is being done by AI, we know in which space it is happening and where it can occur. And so I think, infrastructurally, the flood of information that is out there can be wrangled in such ways by these different standards that we have. And on that note, is there any use of ORCID or other verification to verify authors or publications on these sites and in these communications, to make sure that they aren't coming from a paper mill or something along those lines?
Yeah, certainly. Infrastructurally, one of the things the various metadata working groups that ASAPbio and Europe PMC put together were thinking about was what information we need to collect and be able to share with other people. Some of these modes of verification, of authorship or of publisher, were identified as key pieces of metadata that people want to be able to see alongside any kind of content they're looking at.
I don't know if you were in the session on author identity verification this morning; I thought that was particularly interesting, and the STM framework, I think, is going to be helpful for everyone. The preprint servers are also looking at all of those potential methodologies for verifying the identity of authors who do not have institutional email addresses.
There's a range; if you were there, you heard there was a range of possibilities, and I think we will almost certainly have to go in that direction too. One interesting thing someone asked me afterwards was: does bioRxiv have a huge problem with unverifiable and basically fraudulent content from mythical authors? I hope I'm not sounding complacent, but so far we've had a very small problem, and we've very often detected it by very manual methods.
But I think part of the reason we've escaped the problems that many other organizations have had is that, as far as the paper mill guys are concerned, a preprint is worth nothing. So there's basically no incentive to try to fool us, except in the limited case of citation hacking, which we have come across. But the solution for that is different from the ones that were talked about this morning.
I just want to reiterate some of the things that have already been said. We obviously run very clear checks on all the papers that come in and verify, through metadata, everything that we look at. So there is a very stringent process in place, and I think that will evolve as everybody works with this going forward.
It's almost like there's a race going on, and every organization just has to keep up with using technology to help them with that. But sometimes there are other processes, manual processes, in place that facilitate that as well. And I'll also add: I've been working with the STM Integrity Hub, managing one of the working groups, for over four years.
It is looking at all of the different integrity checks that you can do. Author or researcher identity is the key to a lot of this, and I know they're doing a lot of work around figuring out what some of those options are. There are a lot of signals you can pick up that might indicate that an author is not verifiable, but it's a matter of looking at the preponderance of the evidence; it's not one thing.
So that's a very difficult thing to figure out. So, last question, I swear, I promise. Sorry, you're going to ask all the questions. Sorry, it kind of goes to all of it at the same time. Where does... OK, let me try to frame it the right way. In the last slide of the openRxiv presentation, you had that new framework of how peer review could look in the future of the new system.
Is that relying on publishers to come looking for these publications now, or does it still lie with the author to send their preprint and then go to a journal or publication or somewhere else afterwards? And it kind of goes to all three of you, in a way: that was the first part. But then eLife is going back and forth with the preprint as well, and then the communities of other people, or not people, but I guess...
But, like, are there communities that are grabbing those preprints and saying, this would be a good submission to this journal, or noticing those pieces? Or where does that burden lie for the final publication, I guess? No, I think that's a very pertinent question. At the moment, the DataSeer integration I mentioned is entirely voluntary, and our position as the platform is that the responsibility for the content of any preprint is entirely the author's.
And if they choose not to do any of the things that we might give them the opportunity to do, well, that's their prerogative. I think that we're still at an early stage in accumulating these signals. And one of the things that I hope will happen is that authors will realize that the more of these signals they attract and volunteer for, the better it will be for them and the work that they've reported.
Because sooner or later that work is most likely to be published in a journal, and having an accumulation of these signals may well facilitate that publication process, as it kind of does with eLife right now. So, I mean, at some point we might progress to "you have to do these things", but I think not at this point. And if we cross over there, I think we've actually crossed a boundary that we've set for ourselves.
And I would be unhappy about doing that. From an eLife point of view, obviously our editors are looking at the preprints that come in, but the more signals they have that a preprint has passed particular checks and isn't ringing alarm bells about anything in particular, the more likely our editors are to say, hey, this is a great article, let's send it to review.
So over time, I'm sure that will just become more and more part of the workflow. Because whilst at the moment a lot of these systems are new and experimental, and people are seeing what value they bring, I'm sure that at certain points they're going to become mandatory. You're going to have to have a badge that says all of these different things before an article even gets sent to peer review.
So I think that evolution actually just helps streamline the whole process, because then we're not worried later on, after review has been done, that certain things have not happened, and authors will start behaving in that way. And there are a lot of organizations looking at open science indicators and how best authors can achieve particular markers. I think that's all good.
I don't think I have too much to add, but in terms of the infrastructural aspects, the tools we have can help do those things. And thinking about your question from a different perspective: it's not on the authors to say, great, I've put out this preprint, and now I have to pull together all of these things that have happened in all these different places to show the movement towards publication or these indicators.
What we're trying to do is make it easier, streamline that process, and also highlight these smaller providers and review platforms that are doing this work to help it happen and make the entire ecosystem stronger. Yeah, and I'll also add that Fiona showed a bunch of different organizations that all approach this in a different way. You'll have some that are actively going in, looking at the preprints, and pulling things out.
They have a specific topic in mind and they're harvesting with that idea. Or you have overlay journals, which are also looking across and saying: OK, these are papers that we want to group and display together. The overlay journal may or may not bring those authors in, but the idea is: here's a group of preprints that match this topic, and they then put them through some sort of peer review.
So I've touched on just a few models; there's a lot of experimentation going on in this area, and we are out of time. That's OK, I mean, we've run over time. So thank you, everybody, for coming, and thank you for all the questions.