Name:
New Directions Keynote Presentation: Emerging Traits in Scholarly Publishing
Description:
New Directions Keynote Presentation: Emerging Traits in Scholarly Publishing
Thumbnail URL:
https://cadmoremediastorage.blob.core.windows.net/f0e4ea83-0ab6-4bb8-a8e5-b46f0c3678f8/thumbnails/f0e4ea83-0ab6-4bb8-a8e5-b46f0c3678f8.png
Duration:
T01H16M44S
Embed URL:
https://stream.cadmore.media/player/f0e4ea83-0ab6-4bb8-a8e5-b46f0c3678f8
Content URL:
https://cadmoreoriginalmedia.blob.core.windows.net/f0e4ea83-0ab6-4bb8-a8e5-b46f0c3678f8/GMT20241001-134352_Recording.cutfile.20241001164126942_galle.mp4?sv=2019-02-02&sr=c&sig=cNw9s1I%2BLlhefh5Vp6s7VLLTN%2FeYzuhlOolrXfEob0c%3D&st=2025-04-17T13%3A39%3A17Z&se=2025-04-17T15%3A44%3A17Z&sp=r
Upload Date:
2025-04-17T13:44:17.2441731Z
Transcript:
Language: EN.
Segment:0 .
All right. Top of the hour, everyone. Hello, hello. Good morning, good afternoon for those joining from afternoon time zones. Thank you so much for joining SSP's 2024 New Directions in Scholarly Publishing seminar. Our theme this year is From Trends to Tactics: Shaping Scholarly Publishing's Near Future.
My name is Lettie Conrad. I have had the pleasure of being the New Directions seminar co-lead for the last year, and I am so excited to introduce you to my new co-lead. We'll do that shortly. I am an independent researcher and consultant, spending most of my time these days doing product development and product architecture with LibLynx, which has been a great joy.
Am I over? So, Susan, shall I advance slides, or... yeah, OK, there we go. Let me start with our land acknowledgment. SSP acknowledges that this event is hosted on the traditional territory of the Anacostan and Nacotchtank... Piscataway... oh.
This is... I should have tried this before. The Piscataway and Pamunkey people, and apologies for my pronunciation. Our thanks to the native peoples on whose ancestral homelands we gather, as well as the diverse and vibrant native communities who make their home here today. May we honor their past and care for this land. Let me move on to thanking our volunteers.
There we go. OK, I want to thank the volunteer working group for all their hard work developing the program: Jenny Herbert, who is now our co-lead of the New Directions seminar; Jenny Ann Carvalho de Amirault; Matt Cannon; Neil Blair Christensen; Jamie Devereaux; Betsy Donohue; Ann Goering; John Gerstle; Amy Licciardi; Laura Martin; Jordan Schilling; our two Education Committee co-chairs, Sophie Rice and David Myers; and, of course, our SSP president, Heather Ruland Staines.
As we look ahead to next year's seminar, I want to invite anyone listening to reach out. If you are interested in participating in next year's working group, we are looking to fill a few more slots, so if you're interested in planning the 2024-25 season, please let us know. And we want to thank our event sponsors; of course, you make this possible: Cadmore Media; DCL, the Data Conversion Laboratory; Dimensions, a Digital Science solution; Origin; and Silverchair.
Thank you to all of those organizations. This advance is a little delayed... OK, there we go. So a couple of housekeeping notes before we get started. The Wi-Fi password is welcome1, on the Net Zero network, and that has to do with AGU's dedication to carbon-neutral sustainability. The meeting hashtag is #SSPN2024.
Please be sure to silence all those lovely cell phones and devices that make noises, as a courtesy to our speakers and to other attendees. And we will have closed captions enabled, as you can see, for all of our sessions. Speaking of accessibility, we work hard to really establish an interactive hybrid event for both remote and live attendees. So, to those of you in the room: when you're speaking during a session, or when you want to ask a question or contribute a thought to a discussion, please use the attendee microphone.
When you're asking a question, be sure to identify yourself with your name and affiliation first as well. We are using the Zoom Events platform, and you can join using this QR code. Again, our goal is to host a really engaging hybrid event, and that's true for those of us in the room as well as those of us joining online. So please give us a shout in the chat and use the Q&A feature.
And that's true for everyone here in the room as well as anyone joining us online. Anyone here can join using the Zoom Events platform and engage with folks who are speaking or attending virtually. We do have volunteers watching, both here in the room and remotely, to ensure that everything is working as it should. But if for some reason you can't hear or see something, please indicate that as soon as you can in the meeting chat during the session, and we will be sure to get to you as quickly as we can.
For those of us in the room, you can, as I said, engage with virtual attendees and speakers in Zoom Events. If you haven't logged into Zoom Events, scan the QR code and participate in the chats, polls, et cetera. The advice from our technical wizards is to let Zoom launch, and then, as long as your status is set to in person, which it should be if you're in the room, the audio will not stream, so there shouldn't be a conflict with the audio.
We are recording all of our presentations, and those will be available on the Zoom Events platform within 24 hours. Once they are posted, they will be available through the end of December 2024. If you need any technical assistance at any time, of course use Zoom Events; if you can't get into Zoom Events, then email us at spac.org. And now a shout-out for our SSP Generations Fund.
Thank you all for contributing and making the New Directions Fund sustainable. Your contributions, both your cash donations as well as buying the awesome stickers and sweatshirts and chef's aprons (thank you, Marianne), make a difference. There's great stuff out there.
It all goes to the New Directions Fund, the SSP fellowship, mentoring, and the diversity, equity, inclusion, and accessibility programs as well. And again, you can make a cash donation any time, as well as buy those lovely gifts there. OK, a quick note about SSP's code of conduct and today's meeting. We are committed to diversity and equity and to providing an inclusive meeting environment that fosters open dialogue,
free of discrimination, harassment, and hostile conduct. We ask all participants, whether speaking or in chat, virtually or in person, however you're interacting with one another, to consider and debate relevant viewpoints in an orderly, respectful, and fair manner. There is more information about the SSP code of conduct at the link in Zoom Events.
There is a link in the FAQs in the information-for-attendees section. And without further ado, let me introduce you all to our keynote speaker. I'm happy to introduce you to Michelle Avissar-Whiting. Michelle is the director of Open Science Strategy at HHMI, overseeing the open access policy and a new initiative focused on research communication.
Prior to joining HHMI as a program officer in 2022, she was rocking various roles at Research Square and serving as editor-in-chief of their preprint platform between 2020 and 2022. Our accomplished keynote speaker also earned a PhD in medical science from Brown University. Today, Michelle's address will reflect on how the scholarly publishing community is facing many pressures for change.
As researchers and institutions embrace innovation, we are witnessing the emergence of new traits in our industry: faster science, but also greater transparency; more autonomy, but also more accountability. So Michelle is going to discuss these traits and challenge us to consider what sort of future we desire and which stakeholders must come together to make a new paradigm of scholarly publishing possible.
So please join me in welcoming our keynote speaker, Michelle Avissar-Whiting. Where did she go? There she is. You ready? Now I have to figure out how to share this again. How did it...
All right. All right. Thank you so much to Lettie, Jenny, and the working group, and definitely thank you to Neil for inviting me to do this today: Neil Christensen, who couldn't be here today. I love an opportunity to step back from the trees and look at the forest. It's a lot of work, at least for me, to, you know, understand how we got to where we are and what it means in the big picture.
When Neil reached out to me, he mentioned a LinkedIn post that I had just made announcing my new role at HHMI, which basically just said that my focus in my new role is to... sorry, is somebody... I'm hearing my voice coming back. OK, got it. All right. The focus of my new role is to think about how to help scientists communicate their research in the modern world.
And this talk is going to focus on a couple of those, what I'm calling emerging traits, to kind of riff off the language of evolution. And this image here, by the way, is a collaboration between me and Google Gemini. What I really wanted it to do was depict a piece of paper taking flight, and it just couldn't get it right. So I ended up taking part of what it did and putting it into Photoshop to get it the way I wanted.
I had to actually imbue it with my own creativity, which is rare these days. OK, so before I begin, I'm going to reiterate a lot of what was just said and give you my tiny potted biography, so that you can understand where I come from on this topic. Since 2022, I've been at the Howard Hughes Medical Institute, first as a program officer and then, more recently, as director of Open Science Strategy.
And in these roles, you know, I've had various responsibilities. One is to support our scientists in their efforts to comply with the open access policy. And I'm helping to drive a number of initiatives related to open science: things around preprints, that sort of thing. And right now, I'm leading a project that is focused on improving and modernizing how we do research communication.
So that explains why I'm coming to you with this information today. But before I was at HHMI, I was the editor-in-chief at Research Square, the preprint platform, which is now owned by Springer Nature. In that role, I really tried to help create new standards; it was kind of a new venue, so I was creating standards for how we decide what to post.
And this was, of course, during the beginnings of the COVID-19 pandemic. So I guess this is where I was sort of radicalized, because, you know, I got to see firsthand what this new form of communication actually meant for the speed of research and for research communication. I was actually at Research Square in some capacity or another for 11 years.
Before that, I got my PhD and did a postdoc at Brown University, where I was studying the epigenetics of cancer, head and neck cancer specifically. And these days, I spend most of my time with these two crazy little girls, and yes, this is a perfect encapsulation of their two personalities, so I wanted to show this picture. And lastly, I love music and I'm always up for karaoke.
So if you're ever in the Raleigh area, you know, hit me up. All right, here's how I'm going to start. This is some really clever art from a graphic designer named Lily Djibouti. You know, good art makes you feel something, and this art makes me feel nostalgic, I guess. Within the lifespan of probably everybody in this room, every mode of human communication, productivity, and entertainment has been massively and irreversibly revolutionized by the internet.
Many of us fondly remember these old modalities, right? But I'm willing to bet that none of you actually misses adjusting the rabbit ears on top of your TV. Some of you are going to be too young to know what I'm talking about there; it's OK. And it's kind of a truism now, right, that the scholarly article, the way that we communicate research, has hardly changed between the 1600s and now.
But we made it a PDF. We added a column. I know, I'm exaggerating a little. But what's more surprising is that the system of dissemination, the way that we distribute research, also just fundamentally hasn't changed either, certainly not since the digital revolution anyway. So these are the two separate aspects of scholarly publishing that I believe are showing emergent traits,
and those are what I'm going to talk about today: the system of dissemination and the unit of dissemination. Both of these are going to be in flux in the coming years. Let's start with the system; the story is going to be familiar to a lot of you. All right. So it's true that the formality and the style of peer review have gone through several transformations, but basically, we still have a predominant system in which private drafts are sent confidentially to journals and reviewed confidentially.
And at some point in here, you know, over the course of months, sometimes years, and sometimes after multiple revisions and repeating the process again and again, a publication decision is made, in a way that's sometimes mysterious to the author and often completely inscrutable to the ultimate readers. But in the end, there is a publication released by a publisher with a stamp of approval from an ostensibly trusted authority.
This is a review-and-curate-then-publish model, and it made a ton of sense when we were constrained by the number of pages we could bind together and the number of magazines we could ship on a horse-drawn buggy. By the way, this is a real book, a real book of poetry. It's over 10,000 pages: 100 volumes bound together into a single volume that's two feet wide.
And I imagine that this is what a monthly issue of, like, PLOS ONE would look like right now if we were printing it. And on the other side, this is actually the first bookmobile; any librarians will love this photo. It's from the Washington County Library. Really cool picture. OK, so I want to emphasize that it's not that the system of dissemination hasn't evolved at all. In fact, in my view, there have been three major transformations since the dawn of the internet.
First, we've developed quite a beautiful infrastructure, you know, to index and organize articles. We now have a rich and growing set of metadata schemas that, at least in principle, allow us to organize, reference, and track events on articles, and increasingly on non-article outputs too. So this has opened up worlds of possibilities for us.
It's not perfect, but it's getting better all the time. And the other two big developments won't surprise you: open access and preprints. Open access has been a major success story in many ways. It hasn't been a complete transformation; it hasn't moved as quickly as some of the open access advocates would have wanted. But there has been a huge proliferation of fully open access journals.
And since about 2017, more than half of all papers published have had some form of open access. I had a really hard time finding more recent stats than this, so if anyone knows of good graphics that go closer to 2023, let me know, because I had a really hard time finding that information. And it's very difficult to argue that it's not a good thing that more research than ever is accessible to anyone on Earth who has an internet connection.
That's citizen scientists. It's grad students in poorly resourced places in the world. And it's patients looking to educate themselves about their own medical conditions by reading research that their tax money funded. And I think we've gotten used to this; we've stopped thinking about how seismic this shift was, going from authors always handing over the copyright for their work to, more often than not, retaining that copyright and being able to apply licenses that give them control over how their work is used.
But the shift to open access has also created some unintended consequences, to put it lightly. Paradoxically, I guess, publishing digitally is more expensive, and now, I would say, much more expensive. And that expense is being borne by authors instead of libraries, so we're feeling it more on an individual basis now.
In other words, the access-to-read issue that we used to have is now an access-to-publish issue. And moving to an author-pays model has turned articles into a kind of commodity. And of course, some journals have exploited this by charging authors while not actually performing the services they say they perform, the peer review and other editorial functions; basically, they're just charging them to host their PDFs.
And of course, more journals and more papers have put a huge strain on the peer review system. We hear about an ongoing crisis of peer review, and we still have a dominant incentive structure that, depending on what environment you're in, either rewards publication in high-impact journals or rewards huge numbers of publications. So that's given us these twin issues of a progressively worsening paper-mill problem and a record number of integrity issues that are leading to retractions.
Oh, yeah. And because AI has been added to the mix, we also have this guy. So thanks for that. OK, so that was a long tangent just to foreground us, helping us understand why there is pressure, and where the pressure is coming from, for the system of dissemination to change. And one way forward, proposed by Bodo Stern and Erin O'Shea, my bosses (this was in 2019), is to flip the order of operations in publishing:
from review, curate, then publish, to publish, review, curate. Can you raise your hand if you've heard of publish-review-curate? Cool, all right. That was not like that just a few years ago. So this means the author is always the person who's in control of when the paper is published.
Peer review reports are publicly associated with that article. And then various forms of curation can happen afterward. And of course, the most common objections to this paradigm revolve around peer review, which has come to represent a guardian against creatures in the night. But what has our decades long experiment with peer review actually taught us?
And here I borrow some ideas from Adam Mastroianni's famous, or infamous, blog post, "The Rise and Fall of Peer Review," which is a really great article, and I recommend you read it. Really funny, too. Here are some of the learnings. It's stochastic, right? The Reviewer 3 memes and jokes we've all heard?
They're funny because they're true. Review is random. It doesn't keep out, quote unquote, bad science; we saw evidence of that a few slides back. It can't actually predict what ends up being right; nothing can, except for time. But to the extent that it tries to do all of these things, it's slow and it's plagued with bias. And, you know, all of these things are so because peer review is a human activity, and humans are flawed, and humans are discriminating and political and busy and prone to mood swings.
So does that mean that peer review is worthless? Well, most scientists agree that a good review almost always improves their paper. So people hear about PRC, publish-review-curate, and they say to me: So what, you're going to dump peer review? You just want to publish everything? And the answer to the second one is: kind of, yes.
But it's important to emphasize that the PRC model doesn't discount the importance of peer review. It just shifts its role away from deciding whether something is publishable or not, and we have to concede that that hasn't worked. It also frames peer review, in my opinion, in a much more realistic light, acknowledging that it's sometimes wrong. And either way, there is almost always a valuable discourse, a valuable conversation, that emerges from it, one that we can and should capture.
So actually, one of the things I love most about the PRC model is that it really elevates peer review, because the reviews are public. It demands a different level of accountability from reviewers. And it turns reviews into a scholarly product, a scholarly work in their own right. OK, so decoupling review from the dissemination of science is an important emerging trait in scholarly publishing.
I'm declaring that here. But you might say: sure, Michelle, that sounds great, and I agree with you, and doing peer review this way is a lot more logical, a lot more efficient. But good luck getting people to air their dirty laundry in public.
We've always done it this way because that's what people are comfortable with. It's just human nature. And to that, I would just say history is filled with examples of tectonic shifts that people didn't believe in, like this one. And this one.
And this one. Ouch OK, let's go to the second emerging trait which concerns the unit of dissemination. And this part I want to preface by, you know, first acknowledging that the doing of science, the way that science is actually carried out in the lab.
And I'm mostly focusing here on biomedical science, because it's what I'm most familiar with and what I have experience with. This part has definitely evolved significantly. I don't even understand what happens in labs anymore, and it wasn't that long ago, just a little over a decade, that I was in a wet lab, and I can hardly make sense of what's going on in wet labs right now.
Many labs are now using robotics and sophisticated machines to handle samples, because it's faster, more precise, and more consistent than we could have ever imagined previously. Why would you hand-pipette into a 96-well plate when you can have something do it for you? Some labs are using smart consumables to handle this huge production of samples: things like vials with barcodes, which track a sample through its life cycle in the lab.
And we're also seeing smart instrumentation in the lab, equipment that basically monitors its own usage and its own health and reports back. Uptake has been a little slow, but ELNs, electronic lab notebooks, are starting to be used in labs. And of course, cloud computing is having a massive explosion in labs; this method of doing work is becoming critical because the data we're producing are getting bigger and more complex.
You cannot do this stuff manually anymore. And as a result, labs are also using LIMS, laboratory information management systems, which ubiquitously track all of this various information and manage samples across these various systems. And the labs that are employing many of these technologies will increasingly make use of the Internet of Things.
And this concept is just to make sure that all of these various systems don't become information silos unto themselves and that they're networked together. And then, of course, AI is going to compound the benefits of this system by adding, you know, intelligent analytics to make sure that the needs of every specific lab are met in their specific ways.
So if you haven't heard of the Internet of Things: this is the same tech that allows your Apple Watch to call the police if you've been in a crash. It's the thing that holds systems together, that makes them talk to each other. It's also what allows Walgreens to predict that you might want a cool drink when you walk into the store.
It's a real thing. And for better or worse, probably for better and worse, right? This is the world that we live in now. Anyway, because of these advances, we are able to process manyfold more samples and more information than we've ever been able to before. And of course, this is complemented by the increased sensitivity of all the assays that we use and the increased resolution of microscopes.
You know, we're producing substantially more data than we ever have. But what about the research paper? As I mentioned before, it hasn't kept pace with technology. And the reason I wanted to mention those developments in the lab is to really put this in italics for you, right? Science has moved on. And I know we all agree that we can't possibly get the most out of it,
that we can't maximally capitalize on it, if we can't communicate it well, if the output fails to be a reasonable representation of the work. So what's next for research communication? It's not just about doing away with the PDF; it's discovering how we can incorporate the process of science, in all its emergent richness, into how we communicate science.
So society can only benefit from science if it can be built upon. It can only be built upon if it is shared, and if it's shared, then it must be shared in a way that is accessible, actionable, and complete to the best of our ability. How do we leverage modern technology to get us here? And what does a research communication modality look like in the 2020s?
So here are some attributes that come to mind, and there are probably others that I'm not going to list here. This is an obvious one: machine readable. Well connected. It shows rather than tells. It's versionable (I made up that word). It's close to the data.
And it does not put supplementary info in a corner. That's for the '80s babies. OK, so let me pull on some of these threads. These are all going to be screen captures of articles created on the Curvenote platform, and this is going to be helpful.
This is not the only group doing this kind of work, but it's the one that we're working with, so I have a lot of access to their cool screen captures, and they will help me demonstrate what I'm trying to say here. So when I say well connected, I mean well connected both within and outside of the work itself: simple things like allowing quick jumps to the referenced literature instead of jumping you down to the bottom of the page, and making the figures visible wherever they are referenced, so that you don't have to keep scrolling back to them.
We know how to do that now. And really leveraging this rich world of metadata that I was just talking about; it exists to improve the accessibility of an article. So, for example, doing simple things like this, automatically pulling in abbreviations, is all well within our capacity right now. And what I'm talking about so far here is not anything really fancy.
It's just that there's still a long way to go to simply make the experience of reading a paper less laborious. For example, how often do you think readers actually track down the supplementary material? Do readers, or reviewers, ever look at it? Probably not often, right? It's basically hidden.
And what message are we sending? What message does that convey about the importance of that material? There's no reason for this anymore; it's just an artificial constraint. And what do I mean by shows rather than tells? By this, and now we're getting a little bit more fancy,
I mean allowing for a meaningful exploration of the data itself. We now have the ability to bring software and data together to tell a story. You've probably seen this in the New York Times' immersive articles. Especially during the pandemic, there were some really amazing immersive articles from the New York Times, built on top of the data being collected by all these science groups, giving us up-to-date information about the progress of the virus throughout the world, about vaccines, all of it coming to us in a very rich way.
So this is a demonstration of a plot where a large amount of data can be interrogated directly by the reader, intuitively, just using selection tools. They can zoom in and get more or less granularity depending on what specific thing they want to see. And I mentioned that labs are increasingly computational, right? They're working in computational spaces in the cloud, with large amounts of data that traditionally would have been presented using some representative static image,
or maybe a movie if you're lucky. And then the data, you know, are hopefully shared somewhere else, should someone want to reproduce the work. But we have the ability to create visual layers that draw directly from the data and the code. This kind of renders a mini notebook environment inside the article itself, and you're kind of running the notebook in situ.
And you can see how, in this way, reproducibility is built in; it's not an afterthought. All right, I'm showing you one more example, this time from the work of an HHMI investigator, Loren Frank. He's developed a Jupyter-based tool that allows collaborators (this is in neuroscience) and colleagues of his in the field to more easily access his lab's data and their code.
And now, in partnership with Curvenote, we've been able to bring some of that data to life with an interactive visual. So the data were collected, you know, on the spatial positioning of a rat in a maze; that funny shape is a rat maze. And simultaneously, the neural responses of the rat's brain were tracked; that's the little circle at the bottom right.
So this is overlaying a visual onto data and code pulled directly from the Frank lab's Jupyter notebook. And the interactivity allows users to change parameters and move through a timeline of this data set, looking at different criteria at the same time. This is obviously a very different experience from looking at tabular data, reading the results as text, or looking at a static image.
So I see these things as co-evolving to some extent. OK, and I believe they have the potential to reinforce each other in a couple of ways. Stick with me. So these preprints that we're talking about are being released by the authors, right? If they're enriched in this way,
written as an extension of the data, on top of the data, well, those are the best author-released outputs that we can hope for, aren't they? You know, especially from the standpoint of integrity. It's not that it's completely impossible to totally fabricate a whole data set or a whole lab notebook, but it's much more laborious than putting together a static representation, just an article, of that work.
So, in a way, outputs like these inherently have trust signals built into them. So if what we're concerned about with preprints is adding a lot of noise to the system, kind of flooding the zone (I've heard people express concerns about that), well, that's much less likely to happen if this sort of communication becomes customary. I actually think it's unlikely to happen anyway with preprints, but that's a separate conversation.
I also think that presenting work where the narrative is closely connected to, closely integrated with, the data is bound to deepen the peer review as well, right? We don't expect reviewers to review data, and they rarely do it. Oops, sorry about that. But here they would be doing it by virtue of just reading the paper.
And in a world where reviews are public, that review now has a better chance of contributing to the development of the work and also improving the readability, the understandability, of the paper for readers. We seem to have lost my mouse. All right. So we have the unit of dissemination and the system of dissemination.
And these things are both already changing; there's already stuff happening here. So on the unit-of-dissemination side, we have seen experiments with this from eLife, actually going back a few years now, with the executable research article. The Microscopy Society of America just this summer launched a journal using Curvenote, and that allows people to, you know, explore microscopy data and reproduce analyses in place.
AGU is piloting an initiative called Notebooks Now, which essentially publishes a curated version of a notebook, an electronic lab notebook, which is meant to kind of replace the standard static PDF with a dynamic computational document. And some other notable similar efforts that classify themselves as preprint servers, I think, are NeuroLibre and DeSci Labs.
They both come at this from a sort of different angle, but it's basically the same idea, where the narrative, again, is connected to all the data and to all of the outputs. And on the system of dissemination side, of course, the most prominent example right now is eLife Reviewed Preprints. I didn't put eLife at the top here for both of these on purpose; that's just the order they came to me. And this was eLife's experiment that started in 2023 with publish, review, curate: releasing author outputs, doing the review in a public way, doing away with the accept and reject decision, and replacing it instead with a more nuanced editorial summary that accompanies the article.
We have others. So GigaByte joined the PRC family, I think, toward the end of last year, and MetaROR is a new platform for metascience research that's built atop metascience preprint platforms, also using the PRC model. And that's coming soon. And then there are various other experiments in this area that, again, all come at this in a slightly different way, integrated with journals.
Disconnected from journals. Review Commons, Peer Community In, and PREreview, just to name the most prominent. So I'm backing up again here. For centuries, publishers have been doing the heavy lifting, right, of trying to facilitate the communication of science between scientists.
And for a very long time, this was mainly an issue of managing a peer review process, some sort of evaluation process, and producing and distributing magazines to subscribers. Science was simpler then. Papers had one or two figures. There wasn't the capacity for this high-throughput data generation. In other words, everything could fit on the flat page.
But 21st-century data is bigger and more complicated than it's ever been, and it's just going to get more complicated still. And our ways of communicating it have become woefully inadequate. Around that, though, we also have an increasingly complex situation in publishing, which features these perverse incentives for journals, for example, to publish flashy science to improve their own profile and attract citations so they can have a better impact factor, and, in parallel, also an incentive to publish as many papers as possible to maximize the bottom line.
That's increasingly dependent on APCs. But we still don't publish negative results. We've built a reward system that's based entirely on the yellow half circle, and the blue half is completely dependent on it. Meanwhile, journals are trying to take on more and more of the role of trying to prevent the issues caused by those incentives, further driving up their costs.
So none of this is good for science, and all of it serves to widen the chasm between these two halves. This, to me, results in a situation where, unfortunately, you know, the publication is becoming more of an advertisement for the research conclusions rather than being a useful representation of the research. So this is really the story about how these halves move closer together.
OK, how doing and sharing science becomes one and the same system, one and the same process. And it is a scientist-driven process. That doesn't mean that we can dispense with organization, that we can dispense with standards and infrastructure. We'll need those more than ever in this world. In fact, I see a new service industry emerging from all of this.
I can't predict exactly how it's funded or exactly what it looks like. But I know that a lot of the people with the expertise to create it and operate it are probably in this room. All right. So right now you might be asking yourself, isn't she going to talk about AI? I feel like she's barely mentioned AI.
Does she even know what AI is? So there are, of course, lots of AI threads that run through this story. Generative models are inching ever closer to being able to just conjure an entire article from whole cloth. ChatGPT has been a force multiplier for just about every industry, and the paper mill industry is not going to be an exception to this.
So it's as good a time as ever to stop rewarding people for simply pumping out papers. AI is already being leveraged for screening papers, for one, for assurances of integrity. It's only going to get more sophisticated in this area, and it will be useful, I think, as a first pass to check that a paper meets certain standards.
I think it's going to be really important that we use efficiencies like these to process papers, these papers that I'm talking about in this future where they're all author-released, of course, into the public domain. One thing I think is really a concern for preprints, and I have firsthand experience with this, having managed a preprint platform, is breaches of privacy.
Patient privacy, for example, or publishing something really dangerous, like instructions to make a bioweapon, or instructions to rectally administer oxygen if you're hypoxic. This was a real preprint that we got at Research Square that had instructions for how to do that on YouTube. These are things we could recruit help from AI to detect, and that we're going to need to do if we're doing this at scale.
I also think in the near term we will use AI to augment peer review, not do peer review, but augment it. You know, help us find common problems like lack of appropriate controls. And I could see this being a researcher-assistant tool as well: help you to write a paper, make sure that you're including the right references as you write, making sure all the metadata is there, et cetera. And of course, eventually it will be able to perform peer review.
And that peer review will be uncanny. And I really think we need to be careful about that, because we run the risk of doubling down on some of the most horrific aspects of our characters, of our humanity: filled with bias, tendencies toward politicization, over-reliance on our priors. And I talked about AI writing papers in a scary, negative light just above.
But again, as a force multiplier, this is also a potential strength for papers that are based on genuine data. Right? With the speed of development of generative models that we've been seeing in just the last year, it isn't crazy to dream about a time when it's just up to us to do the experiments and produce the data, and then the article writes itself on that basis.
And where I really think we're going to see some of the most interesting applications of this in the near term is in synthesizing literature for us: literature reviews, living systematic reviews. That's already a thing, and it's going to be more common and automated. With the amount of research being published, we desperately need this capability of reading in and dynamically viewing data at the kind of metascientific level.
Oops, sorry. And we're frozen. OK, all right. And I also want you to think about the most important point of the intersection of scholarly publishing with AI, which is, of course, that the literature that we produce right now is already being fed into generative models.
Right, so now it's more important than ever that we get a handle on the incentives that are creating the too-high levels of falsification and misconduct, the noise that we currently have, and make sure that open peer review, because that is an important contextual signal to overlay on the literature, can be added into the system and help us to make sense of it.
And just a last point about this, on the more nefarious aspects of AI, like AI-generated images, AI-generated data, all of the different ways that you can imagine that being applied: I sincerely believe that this will be less of a concern in a world where we are rewarding things like transparency and completeness, including negative results.
Sorry, there's a big lag or something, right? OK, so evolution takes time and pressure. It's also what geology is about, according to Red in The Shawshank Redemption. Is this anybody else's favorite movie? But when I said time and pressure, all I can think of is this scene. OK, but that's not important right now.
We do see movement on these things. So where is that selective pressure coming from? For one, it's funders, right? So look at the moves of CZI, Gates, the Michael J. Fox Foundation, and, dare I say, HHMI in recent years. Even among the government agencies, watch that space; things are changing. Researchers: you don't have to look very far to find scientists who are just totally fed up and disenchanted and longing for change.
Institutions are a little bit slower and more conservative, I think, in this way. But we do see signs of pressure mounting here too; check out HELIOS as an example. And finally, society: people are more distrustful now of science and of its institutions, I think, possibly than ever before. I can't make a causal connection between these things, but I find it unlikely that there is no relationship at all.
All right. I'm going to close on this quote. This is from one of my favorite journalists, Ed Yong, who wrote this in one of his many brilliant editorials during the pandemic: "This is how science actually works. It's less the parade of decisive blockbuster discoveries that the press often portrays and more a slow, erratic stumble toward ever less uncertainty."
This is the sort of humility that we need to see reflected in the system, in the unit of dissemination, and in the way that we evaluate research: moving toward more transparency and curiosity, and away from gatekeeping, secrecy, and prestige signaling. And I'm going to end there. That's the end.
I have to look at my face. Oh, OK. What a start. Hi Simon Holt from Elsevier. So listening to what you had to say, I was really struck, actually, that you spoke a lot about what people are reading.
But I think we also need to think about something else you touched on, which is about how they're reading. Right? So as publishers, our job really, as you mentioned, is to curate, enrich, and disseminate content. And actually my role is all about a related but different type of accessibility: it's about publishing books and journals in formats where people with disabilities can read them. But actually, putting disability to one side, what we're talking about here is a world where people learn in different ways, and we're talking here about a future, hopefully, where people have greater choice not just about what they read, but about how they interact with that content as well.
Right? Whether that's visual or audio, or being able to change the format and structure of the content that they're reading. And my question to you is: how do you feel that greater choice in terms of how people ingest content, whether neurodivergent people or just different types of learners, will really enrich the pool of people that we have reading, authoring, reviewing, and interacting with content more generally?
What impact do you think that will have on our scientific and learned communities in the next five, 10, 15 years? Yes, thank you. I didn't really touch on this aspect. It's part of accessibility, of course; it's a major part of it. But I didn't get into that, and it's a great question. I think this only stands to make that better, because of some of these interactive experiences, at least in my experience.
And again, I take it back to the New York Times immersive visuals. It's tactile, right? You can kind of touch it and move it around and turn it around in your hand. And for me, that conveys information better. I actually need to read things quite a few times before it gets in; I'm just one of those people that has to read the same sentence again and again.
But if you gave me that sentence as a visual that just allowed me to click around and do it at my own pace, that completely changes the experience of absorbing information for me. And I can only speak for myself. But what I'm talking about here is not removing options, but creating more of them. With all of these, by the way, these Curvenote documents, all of them have the option of producing a PDF at a click.
The PDF then becomes more static and you can't do as much with it. But if that's the way that you're comfortable reading information, that's still an option. So I think the goal is to create more options for accessibility without taking away the things that people are comfortable with. Hi, I'm Bill Kasdorf. I was struck by how much of your talk was still structured around different forms in the evolution of an article, because the preprint is still an article.
Yeah. And I will have to say, just to put you on the spot, that I've done a lot of work with somebody you know well, which is Kristin Ratan, and she's really opened my eyes to the importance of constant, ongoing communication during the research process. In other words, it's not communicating "here's what we found out," but it's really communicating "here's what we're doing," right? And it's with, typically, your peers, et cetera.
So do you have any comments on that? Yes. So, you probably noticed I kind of intentionally stayed away from that, because I was trying to focus on more near-future developments, and we're working up to that. But I find that discourse freaks people out, including scientists. I mean, there are scientists I've shared this idea with who are like, what?
Data is so messy as I'm creating it. I would need to hire people to just basically curate the data as I'm producing it, a full-time person, to make sure that what people actually see is something I want them to see and not this messy, useless thing. Right? So Kristin dreams big, and that's why I work with her, because she's pushing us to think in these terms.
I don't think we're there yet. And I think that this is a step. My goal here, or at least at HHMI, is to do things that are palatable right now, or that feel achievable right now, and work toward that. Because from where we've just been, those articles that we've been looking at today, you can see the evolution, so to speak, toward this real-time communication of data as you produce it.
If we're building articles on top of notebooks, it's only a matter of time before that just becomes, you know, what I think Kristin actually calls visiting the data, not looking at the data. Right? You're going into somebody's notebook and seeing, oh, where are they at right now? And that, of course, has massive repercussions for how we even think about the article. Is an article even relevant anymore at that point?
Or are we just doing these kinds of share-outs periodically to have some kind of stake in the sand as to where we are, or where we were at a given point, milestones? But yeah, thanks for bringing that up. I debated whether to talk about how really we need to think beyond the article and break this whole thing apart, but hopefully we get there.
Sarah has removed her shoes. See, Sarah? OK, we're going to go online first. Right, OK. We have a couple questions from online. The first one is from Neil, and he says (they're coming in fast, give me one second): traditionally, there's been a focus on publisher incentives, APC dynamics, et cetera. But how about promotion, tenure, and funding incentives by institutions and funders for generating and setting the requirements for publications? How do you think of the balance of agency in that regard? Who holds the keys to change? Thanks, Neil. Is that Neil? Yeah. Yeah, of course it's us. And, you know, I just think we maybe have the most important role in all of this.
Everything, all the standards that we set, are going to be exploited. And, you know, that's fair, right? If we've said that it's the article, and it's this particular kind of article with this particular brand on it, then that's what people will shoot for. And until we get away from that, I don't expect there to be any meaningful change.
So that's a big part of, not so much my role, but certainly my team's role at HHMI: to rethink that. But it's a little bit of a chicken-and-egg problem, where if you don't know what the alternative is... So what's the alternative? That's what we're tasked with trying to figure out, because we need to give people something to shoot for. And it can't be this pie-in-the-sky thing.
So I think we're getting closer to converging on that, and finding what that alternative is, and being able to start shifting what we reward. We can, at least in the interim, start to remove the signals that have been damaging. And so one thing HHMI has done is just to stop putting journal names on bibliographies, or allowing journal names to be put on bibliographies, for scientists.
So that doesn't factor into our decisions. But that's a small example. It's just the beginning. Do you mind taking a few more online questions? Perfect, OK. The next question is from Jacqueline: how do you feel about using AI to explore theoretical areas? Is it a reliable way to mimic a science process? That's a great question.
That's not my area of expertise. I would venture a guess that AI is still too noisy and hallucinatory to do that reliably, but I think that's where we will see it go. You know, I use AI already to try and give myself ideas or start to brainstorm about things, and it does help. But the things that I do in my daily job are a lot more prosaic than these bigger questions, which are very domain-specific.
But to the extent that we start to train AI models on domain-specific topics, I think that we can get closer to that. That's definitely in the future. OK, Michelle. Thank you. Sarah from AIP Publishing. Hi, everyone. Two questions. So my first one is, you know, looking at the stakeholders that care about this, I think what's most perplexing to me is that researchers have built the ecosystem of how they're incentivized, and what they seem to care most about is: is it novel?
You know, the things that we're publishing are in some ways a function of the things they say they want. So how much of this future is possible when very many researchers seem less inclined to see this much more complex, connected world as valuable? So that's my first question. And then my second question is: what about what shouldn't be shared?
Because it doesn't add value. It's junk. It's not meant to be reproduced. It's just, you know, marginalia. There's so much data we know that's being shared now that's not being used, not ever touched. How are you thinking about this stuff that doesn't need to be shared?
So on the first question: that's hard. It's also generational. I mean, I think we're seeing a big split between people who have been doing research for decades and are used to doing it this way, and who, frankly, just don't have exposure to these new ways. And I really feel like until you show people, not just tell them, "Oh, it's built on the data," what does that even mean?
That's why I wanted these GIFs, because I think it gives you a sense of what I'm actually talking about here. Or just build it for them: take their preprint and do it, and then show them and ask, is this better? Maybe it's not, right? I don't claim that we necessarily know this yet. I'm seeing some early signs that it is, and that people like it. But I do think that there is a generational split here, and the reason that we're seeing a very slow movement toward it is because people are retiring, and there are new people on the scene who are used to consuming information this way.
In the main, right. And then the second question was what not to share. Right, so we do have to trust researchers to some extent about this. We don't actually have a very strict or well-defined data policy right now at HHMI, for a very good reason: we don't really know, one, how to do it right and enforce it, because there is data you just produce that's like, why would I take up space with this?
And I think researchers do have to be trusted to know, like this is not the thing that will be used. This thing would be used. This thing I can see being useful and allow them to kind of lead the course on that. Yeah, I had something else I wanted to say about that, but now I'm blanking on it. But that's basically it.
It's like: one, trust researchers. Two: OK, so some stuff is just noise that people think might be useful, and then it's out there. And I think that that's OK. This is the point I was making about the preprints. I remember Anurag at Google Scholar saying, I can't believe we're complaining about having too many papers.
Like, how privileged are we? This is amazing, that we have too much information. Now we have to figure out how to sort it, curate it, and figure out what actually to pay attention to. There's going to be a lot that we can't pay attention to, and most of that we're not going to know up front. So it's difficult, because you don't want to not share it, assuming that it's not going to be useful.
But there's some stuff that scientists are, like, 99% sure is just going to take up bits and needs to be just destroyed, right? So anyway, that's all. Hi, I'm Joel Silver. I'm at Delta Think. So I'm going to riff off a bit of Neil Christensen's question, which I'm sure you'll appreciate.
But leaving that to the side: I mean, you talked about biomedical sciences, and there is obviously a wider world of social sciences and the sciences, and research varies quite a bit across the board. But when you take it back to where this all comes from, it's academia, right? I mean, academia is the one that created all of this. This is where publishing developed from, and all of the other pieces.
So if you're thinking about how this evolves in the future, in seeing these roles come together, take academia and look at how quickly it's changed. Take one example, right: how has the dissertation process changed from the 1800s? Right, it hasn't. And so I want to get your thoughts: with the glacial speed of academia, do you actually imagine any of this ever changing, given that none of it has changed since the 1800s?
Well, I'll let you think about that. We can talk about that at lunch. I know, it's going to take me that long. I've been trying to understand why. Why glacial? Why is this the way that it is? And the only thing that I can come up with is conservatism. And there is good reason for that, right?
Like, we consider ourselves, as scientists and stewards of science, to be the stewards of information that is the most important in the world. And so anything that you do that is a huge disruption to the way that we've been doing things seems to present a threat or a risk to that underlying information space that we're supposed to be protecting.
Guarding. But my challenge to that is that we're just seeing, again and again, that it's been corrupted. Something has gone very wrong here. So we're trying to protect it, and it's not working. As scientists, we should see that and say: this isn't working.
Maybe try something else. But it's just too big; it's a huge ship to turn. So it's conservatism that's just built in, and then having this conglomerate where you need to kind of move everybody at the same time. That's why I think organizations like HELIOS, the Higher Education Leadership Initiative for Open Scholarship (it's way too long of an acronym), matter.
You know, I think that's almost 100 institutions now that are interested in this project of reform. I don't know that it extends to dissertations and the like, tenure, all of these things that we should probably be starting to question. But the fact is, people are signing on to initiatives like that. Thanks. There we go.
A couple more questions from online. This is from a group watching at Wiley: as far as an author goes, how concerned, if at all, would you be that AI could present the data gathered by researchers in misleading ways, or about a bad-faith researcher presenting incorrect data with confidence? And then they reference an infamous example of Google's AI recommending adding glue to pizza cheese to get a better melt: asking AI a clearly incorrect question, and then it runs with it.
I am concerned about that. I think it's a real problem. You know, we're already, as he points out, seeing evidence that people are using generative models to write their papers, or parts of their papers, and misusing it. You probably have to, like, also read the thing that it outputs. I don't understand how people just copy-paste.
But again, it's like the paper mill thing, right? If you can get this peer reviewed and published, then you've done the thing that your institution wants you to do and you've ticked that box. And so that's kind of what I was saying: with the advent of these technologies that are making it easier to just produce noise, this is the time to pull back on our promise to reward just anything that's published.
I mean, and again, the reason I said that I was skeptical of preprints flooding the zone is because the one wonderful thing that preprints have done, I think, for us, is to take the magic out of publishing. This is not a magic thing anymore. You can just publish anything you want. And so it's not the act of publishing that should be special, that should be rewarded.
And in fact, you know, watch out, because people are now looking at what you publish, and you're making a profile for yourself based on this. So I think that will end up turning around and biting people. If they use AI in that way, it's going to be caught. Now we have AI detection algorithms; it's just going to be this arms race to see which one can outcompete the other.
But I do think that's a problem. And it will sort itself out, because you're just going to destroy your reputation if you do things like that. All right. Hold your questions, please, for breaks and lunch. In fact, we have... Thank you, Michelle, so much. Thank you everyone, so much, for that conversation.
We need just a few minutes to set up the next session. So don't go far. But you have time for a quick leg stretch or fresh cup of coffee, and we will start again in just a few minutes. Thank you again, Michelle.