Name:
The Opportunities and Threats of AI for Scholarly Publishing!
Description:
The Opportunities and Threats of AI for Scholarly Publishing!
Thumbnail URL:
https://cadmoremediastorage.blob.core.windows.net/4f9aeba6-5900-4ce6-b748-864f74d93c16/thumbnails/4f9aeba6-5900-4ce6-b748-864f74d93c16.png
Duration:
T01H04M37S
Embed URL:
https://stream.cadmore.media/player/4f9aeba6-5900-4ce6-b748-864f74d93c16
Content URL:
https://cadmoreoriginalmedia.blob.core.windows.net/4f9aeba6-5900-4ce6-b748-864f74d93c16/GMT20250326-150012_Recording_gallery_1920x1080.mp4?sv=2019-02-02&sr=c&sig=cEvExwuYNgGT%2BcHSXVYQyBcLUKsFukAordUoUKe97%2F8%3D&st=2025-10-09T14%3A25%3A00Z&se=2025-10-09T16%3A30%3A00Z&sp=r
Upload Date:
2025-03-26T00:00:00.0000000
Transcript:
Language: EN.
Segment:0 .
Hi, everyone. Thank you for joining us. We're just letting folks gather and we'll be getting underway momentarily.
Hello, everyone. Good morning, good afternoon, good evening, depending upon where you are. We're just letting folks gather, and we'll get underway shortly. OK, we'll give it about 20 more seconds, and then we'll get started.
OK. Thank you, and welcome to today's SSP webinar, The Opportunities and Threats of AI for Scholarly Publishing: Understanding External Factors, Business Opportunities, and Use Cases. Before we start, I want to thank our 2025 education sponsors, Access Innovations and Silverchair. We are grateful for your support, as always. My name is Lori Carlin. I am the Chief Commercial Officer at Delta Think and the SSP Education Committee's webinar lead.
Before we get started, I have just a few housekeeping items to review. Attendee microphones have been muted automatically. Please use the Q&A feature in Zoom to ensure questions for the moderator and panelists can be viewed. You can also use the chat feature to communicate directly with other participants and organizers. To send chat messages to everyone in the session, select Everyone from the To dropdown instead of Panelists.
I think we've all gotten kind of used to this, but every once in a while I do the panelists instead of everyone. Closed captions have been enabled. If you don't see the CC icon on your toolbar, you can view captions by selecting the More option on your screen and choosing Show Captions in the dropdown menu. This one-hour session will be recorded and available to registrants following today's event. Registered attendees will be sent an email when the recording is available. A quick note on SSP's code of conduct and today's meeting: we are committed to diversity, equity, and providing an inclusive meeting environment that fosters open dialogue and the free expression of ideas, free of harassment, discrimination, and hostile conduct. We ask all participants, whether speaking or in chat, to consider and debate relevant viewpoints in an orderly, respectful, and fair manner.
Now, I'd like to briefly introduce today's moderator, Dave Myers, CEO of Data Licensing Alliance. Dave is a serial entrepreneur and a recurring revenue, licensing, and B2B information expert with over 30 years' experience specializing in strategy, sales, legal, licensing, and business development. He has drafted, negotiated, and closed over 500 domestic and international licensing agreements with partners, customers, and distributors.
He's also negotiated and closed countless business alliances, strategic partnering, and revenue generation deals, and prior to starting his consulting practice, he was Executive Director, Global Licensing and Business Development, with Wolters Kluwer Health for seven years. Dave has been an active participant in SSP for many years, so we're happy to have you hosting today.
Take it away, Dave. Thank you, Lori. I really appreciate it, and very kind words. So today, as Lori mentioned, the webinar is The Opportunities and Threats of AI for Scholarly Publishing. AI in scholarly publishing presents both challenges and opportunities, and while concerns about copyright and ethical usage are valid, the potential for AI to enhance efficiency, personalize the research experience, and support research is undeniable.
And so today, you'll hear about the current state of AI and its possible impact on our scholarly publishing community with respect to external influences, such as copyright and future legislation, licensing and business opportunities, and use cases with end users and researchers. So AI is having a defining moment. Over the last few weeks, there have been countless reactions to US Vice President JD Vance's speech at the Paris AI Action Summit, many of them understandably focused on Vance's forceful framing of the US as the global leader in AI and the Trump administration's vision for fostering AI.
But underneath the bluster was something more interesting: positivity. Vance described AI as one of the most promising technologies in generations, highlighting how it can augment, rather than replace, workers. Regardless of your political beliefs and feelings about the current US administration, the speech was correct on at least one significant point: we can't just fear AI or hope it magically works out.
We need to think clearly about how we want to use it. In order to start preparing the next generation for the AI era, we have to actively engage with this rapidly developing technology and create incentives to encourage students, researchers, and of course the publishing community to achieve something meaningful. There's been a recent blitz of new product releases related to AI.
Google expanded Gemini 2.0. OpenAI shared its roadmap for future products, aiming for a future where AI "just works," and, interestingly, just yesterday incorporated DALL-E, its image-generation AI, into its core offering. And of course, of relevance to the SSP community, just to name a few: Hum launched Alchemist Review, an AI-powered tool designed to enhance the efficiency, consistency, and quality of the peer review process.
Silverchair and OUP jointly announced the launch of the Oxford Academic AI Discovery Assistant, which supports researchers in developing highly relevant search results, and EBSCO launched AI Insights, which generates a short list of key insights for full-text articles and improves overall search accuracy. McKinsey, the consulting company, published a report recently in which they said that the value of AI comes from rewiring how companies are run, and that includes organizations in the scholarly publishing community.
Their latest survey showed that out of 25 attributes tested for organizations of all sizes, not unexpectedly, the redesign of workflows and the exploration of AI to create new products and services will have the biggest effect on an organization's ability to see impact from the use of generative AI. Many organizations are ramping up their efforts to mitigate generative AI-related risks. Respondents were more likely than in early 2024 to say their organizations are actively managing risk related to inaccuracy, cybersecurity, and intellectual property infringement,
three of the gen AI-related risks most commonly having negative consequences for their organizations. The practical result of all this is that the individual will continue to become more capable than ever before. You can use AI to generate fully coded apps, bring projects to life that once required entire teams, or even push forward on cutting-edge research. The question is which problems we aim all this energy at, and the vision that guides us. As a consultant for many large and small organizations that both license and are licensees of data for AI, I'm in a position that affords me a bird's-eye view into how AI is being deployed in solutions and how people use AI in the course of their work. When I work with teams of senior executives to create AI policies for their organizations, we recognize that the most effective route aligns with the fundamental values on which the organization is built. So what do we want to happen next? That is the core question you all should think about, and I'll end with a little quote: AI might let us do more than ever, but it won't decide for us what's worth doing.
So now I'm going to turn to our esteemed panelists to give you a glimpse, from their perspectives, of the opportunities and threats they see related to AI. On our panel, we have Keith Kupferschmid from the Copyright Alliance, Kate Worlock from Outsell, Richard Bennett from Hum, and Avi Staiman from SciWriter. So first up, Keith Kupferschmid from the Copyright Alliance. All right.
Thank you, David. Hopefully everyone can see my slides here, and I'm very happy to be able to present to you all. I think my job here is to set the table a little bit in terms of what we're going to talk about, to provide some legal and policy background. And before I get to that, let me just mention who the Copyright Alliance is, so you have some perspective here.
We are really the unified voice of the copyright community, and our job, my job, is to promote and preserve the value of copyright and to protect the rights of creators and copyright owners. And we do that through several different means. We work with policymakers, whether it's members of Congress, members of the, you know, Trump administration, the US Copyright Office, or other executive branch agencies.
We file briefs in federal court to try to educate the courts. And a lot of what we do is education, which I'll talk about at the very end of my few minutes here. You see a smattering of our members up on the slide here. We represent the copyright interests of over 15,000 different organizations across a spectrum of copyright disciplines. So it's the usual that you might think of when you think of copyright, right?
Movies and record labels and book publishers and video game publishers, and scholarly publishers, I should mention, as well. But it's also groups that you don't think about, like sports leagues, the National Association of Realtors with their MLS database that's protected, or the National Fire Protection Association, which has electrical codes that are protected by copyright.
And then we also represent about 2 million individual creators. These are the photographers and the performers and the songwriters and the software coders and the artists and the authors, and this new generation of creators that are out there, and many others who create and make a living through their creativity. So when it comes to AI, you know, we talk to all of them. And our general view, their general view, is that we support responsible, respectful, and ethical development and use of AI technologies.
And that means, you know, respecting copyright, obviously. So what I'm going to talk about today, going through fairly quickly, is the current state of play. I'm going to talk about government AI activities, both legislation in the United States and abroad, the Copyright Office and what they're doing, the Biden and Trump administrations and what they're doing, and the courts.
Boy, oh boy, are they busy. And then I'll sort of wrap it up with licensing and technology, and do all that in just a couple of minutes. So I'm going to kind of fly through here a little bit. First off, government AI activities. In terms of congressional activities, frankly, and maybe this shouldn't come as too big a surprise, there hasn't been too much, right? Congress lately just doesn't seem to do very much in any particular area.
But they are obviously concerned and looking into AI. There have been many, many congressional hearings on the issue of copyright in AI, and obviously AI more generally. And then late last year, the AI Task Force in the House put out its report of views, and on copyright, in essence, said: we're looking to the courts. The courts are going to solve these problems for the most part. The one area where copyright legislation was considered in the United States is transparency, and transparency bills were introduced last year, so they would have to be reintroduced this year since it's a new session of Congress.
In essence, what these bills would do is require AI companies to disclose the copyrighted works that they ingest for their training purposes. And that's all they do. They don't require weights or parameters or algorithms or any of that stuff to be disclosed, just the works that are actually ingested. And there are two different approaches to that. One was in the Schiff bill, which at the time was introduced in the House.
Of course, Representative Schiff is now Senator Schiff, but he introduced a bill that would require all AI companies to deposit that information, all the copyrighted works, into one depository. And then there was the Welch bill; Senator Welch from Vermont had introduced a bill that would basically require AI companies to disclose what copyrighted works they're using if they were asked by the copyright owner.
And so we do anticipate that at least the Welch bill will be reintroduced; I'm not sure about the Schiff bill. And then there's legislation outside the United States. We have the EU AI Act out there, which would require transparency and opting out. But even though it's in force, it's still very much a work in progress; it did leave a lot of the details to be figured out.
And we constantly see updates there. There's a new consultation going on in the UK about a proposed exception that would allow copyrighted works to be used without compensation or authorization, and the creative community has really, really pushed back on that. And then there are other countries, Singapore, Japan, Hong Kong, and others, that have considered what to do here in terms of AI and copyright, and in some cases actually have exceptions.
In some cases, the idea of those exceptions has been rejected. In terms of presidential action, President Biden put out an executive order on AI. I will not talk about that because it doesn't exist anymore: when President Trump came in, one of the first things he did was terminate that executive order. It would have required the Copyright Office and the Patent Office to work together to come up with some recommendations on copyright in AI.
The Trump administration itself has put out a request, via OSTP, for information on various AI issues, including copyright. Those comments were due back, I think it was on the 25th of March, and so they are now combing through those over 8,000 responses to try to figure out what their AI Action Plan should say. And so we'll be waiting for that.
I don't know when that will come out. The Copyright Office has been very active. They've put together three studies, two of which have already been released. One is on digital replicas, the idea of someone protecting their image, voice, and likeness. Another was on copyrightability, and that was published in January of this year. And then we expect one more, which is on ingestion, fair use, and allocation of liability, which will be coming soon.
So I would certainly keep your eyes out for that, and I'm happy to talk about any of these in more detail if people are interested. But the real activity, the real big activity, is in the courts. There are about 40 different pending AI copyright infringement cases in the courts, and about half of those are class actions. Every day, it seems like there's new activity in these cases. As a matter of fact, this morning there was a bunch of activity in some of these cases.
The majority of these cases involve authors and publishers, right, the vast majority. There are a couple of cases involving music, photographs, images, computer code, and what have you, but almost all of them have to do with literary works, with authors and publishers filing cases. And most of the cases are brought or pending in California and New York.
And the primary issue in all these cases is whether the unauthorized copying and/or ingestion of copyrighted works for training purposes is fair use. Because if it's fair use, the AI companies can do it, and they don't have to compensate anyone, and they don't have to get authorization to do it. And fair use is decided on a case-by-case basis. There's no one rule.
You have to look at each case, look at the facts of each case, and decide. AI companies will make the point that they think their use is transformative, but in each case, you have to look at that and ask: OK, what is the ultimate purpose and justification? If you've got a generative AI program that's ingesting visual works or music, let's say, to produce other music, that's not terribly transformative.
That might be substitutional in nature, and that's a problem. What is the impact of scraping pirate websites? Should that be protected, given the fact that it's pirated works they're scraping? What harm is there to the licensing markets? Frankly, that's the big issue, because there are a lot of companies out there that are licensing their works to AI companies to ingest.
And if all of a sudden it's fair use, those licensing markets disappear in the blink of an eye. That is a big, big issue. Only one of these cases has been decided: Thomson Reuters versus Ross. That was decided just a few weeks ago, actually, in February 2025. And in that case, the court held that it was not fair use, that what Ross was doing was an infringement of Thomson Reuters' copyrights in their headnotes.
But that's just one case, and like I said, all these cases are decided on a case-by-case basis. So let me wrap up here in terms of the future of AI and copyright and what to watch for next. Like I said, there are about 40 cases pending, and new cases, it seems, are filed every week or every month.
So there will be new cases to watch out for, and there will be new decisions coming down, so that will be something to track. The Copyright Office report will be coming out, presumably in the next month or two. And there's this UK consultation and what the UK decides to do on these very important copyright issues. They don't have the fair use doctrine in the UK, or, for that matter, anyplace else.
So whether they decide to create an exception or not will be interesting. In our mind, the solution here is licensing, licensing, licensing, and also technology, of course. Right? And if we just put everyone into a room, I think we'd be able to come up with solutions, and not have to rely on Congress or the administration or anything like that. So just real quick, in closing... I think I just lost...
There we go. If you want more information on any of this, I encourage you to go to our website. We've got this page, which you see up there, about education. We've got FAQs. We've got a special page on artificial intelligence where you can find a whole bunch of information. And then also, under our Get Involved page, which is all the way to the right, the orange part, we actually have different newsletters.
And one of our newsletters, even though it's not shown up here right now, is an AI newsletter. So if you're interested in just keeping up with what's going on in AI, I encourage you to check that out. So with that, I will throw things back to David. Keith, thank you very much for those words. Up next on our panel is Kate Worlock from Outsell. Hi, Dave. Hi, everybody.
Thanks for the intro. I don't have any slides, so you're going to have to just watch my face, or, you know, don't watch my face and go and do something else. But listen, there are interesting things coming up. What I want to talk about today, as part of this webinar, is what we've seen scholarly information providers doing with AI, some of the low-hanging fruit that we've seen being picked, and where we think this might be going next. Outsell provides market research and strategic analysis to information providers across the information industry, as well as to financial services and technology providers.
We've been looking at gen AI extremely closely, obviously, to the extent that some clients at one of our recent conferences were like, is there not another topic that we could be talking about as well? And of course there is, but this has been really dominant. I think what's important to remember, though, is that AI isn't new to providers of scholarly information resources, so we're seeing the next generation of offers as generative AI has come into play.
What's important, though, is that AI does enable us to do things which have never been possible before in terms of solving some researcher and end-user pain points, so we can help to improve working practices, both within scholarly providers and for researchers themselves. We can help researchers to easily scour vast reams of data.
And we all know about the number of publications being created and the amount of content that's out there to sift through. We can help them to identify relevance in a really efficient manner, and that's particularly important when they're looking at unfamiliar research areas. We can help with the summarization of huge volumes of data. We can help to identify trends in really vast data sets using AI, particularly through visualization.
And we can even help users to understand and access content which may not be in their native language. So there are lots of things that AI is able to speak to that we haven't been able to look at before. I've narrowed down a few use cases that we've seen publishers speaking to when they've launched some of their new gen AI offerings. One thing that's important to note is that there are a number of ways in which these are being charged for as well.
So I want to come on to talk about some of the business models a little bit later on too, but let's talk about the actual capabilities and use cases first. I think that first piece of low-hanging fruit was really around search, and how we can improve search using semantic tools and AI. Early AI tools were in that space for years. UNSILO, for example, from Cactus, was used by Project MUSE to drive content recommendations.
Once OpenAI launched ChatGPT at the end of 2022, we saw new types of solutions emerging pretty quickly. By September 2023, Elsevier had started piloting Scopus AI, Digital Science had a beta test of its Dimensions AI Assistant, and Scite had launched Scite Assistant. You'll see the word "assistant" coming up a lot in generative AI tools from scholarly information providers, and I think that's really a key way for providers to emphasize that they're looking to support and facilitate the work that researchers are doing, not to replace those activities in any way.
I think what we also saw right at that beginning point, when some of these new search tools were being introduced, was a note of caution. There was worry about the validity of these generative AI search results, worries about things like hallucinations. And to be honest, that's a problem which still hasn't gone away. So different technological approaches were taken, using more traditional search methodologies, for example, to pull out a small number of articles over which the generative AI technology can then be run to produce results.
That's certainly what happened when solutions started, and actually it's an approach that continues to this day. So the new ScienceDirect AI product that came out a couple of weeks ago uses a sort of RAG (retrieval-augmented generation) method of searching. They call it a chunking method; it must be called something more technological than that.
But I don't know what it is, and it pulls out relevant passages from full-text books and journals, and then they search over that more limited data set using the generative AI tools. There are lots of startups in this space as well. I've mentioned some of the bigger players, but we've seen Elicit, we've seen Consensus, we've seen Zeta Alpha. There are a number of providers out there who are doing some really interesting things.
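To make the chunking-and-retrieval pattern described above concrete, here is a minimal sketch in Python. It is an illustration under stated assumptions, not how ScienceDirect AI or any of these products actually work: the bag-of-words scoring is a toy stand-in for a real embedding model, and the final LLM call is left as a commented-out hypothetical.

```python
# Illustrative sketch only: split documents into passages ("chunks"),
# score each passage against the query, and hand just the top matches
# to a generative model.
from collections import Counter
import math

def chunk(text: str, size: int = 40) -> list[str]:
    # Overlapping word windows, stepping by half the window size.
    words = text.split()
    step = max(1, size // 2)
    return [" ".join(words[i:i + size]) for i in range(0, len(words), step)]

def embed(text: str) -> Counter:
    # Toy "embedding": lowercase bag-of-words counts.
    return Counter(text.lower().split())

def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[t] * b[t] for t in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * \
           math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

def retrieve(query: str, docs: list[str], k: int = 3) -> list[str]:
    passages = [p for d in docs for p in chunk(d)]
    q = embed(query)
    return sorted(passages, key=lambda p: cosine(q, embed(p)), reverse=True)[:k]

# The generative step then runs only over the retrieved passages, e.g.:
#   answer = llm(f"Answer using only:\n{retrieve(question, corpus)}\nQ: {question}")
```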
And they've added some additional capabilities to those initial search offerings. Clarivate's Web of Science Research Assistant (there's the word "assistant" again, you see) offers natural language search capabilities in several languages, so that speaks to the pain point of not being able to access content outside of your native language.
And the new Elsevier ScienceDirect AI tool has a reading assistant, which essentially enables researchers to chat, to have a conversation with specific journal articles and query an article through a chat functionality. So it's still search, but it's a different way of looking at the search problem. Search was the first big use case I wanted to talk about. I also wanted to talk about the way in which these tools help to supercharge the research workflow.
Search and discovery is really just the start of a much longer process of research activities, things like automating literature reviews, generating hypotheses, drafting, undertaking editing, even some peer review. And I'm sure Richard will come on to talk about the kind of peer review activities that Hum has been working on.
Again, lots of startups in this space: Elicit again, and ResearchRabbit. I think what we started to see after a few months was that researchers were starting to understand that there were really interesting capabilities out there from generative AI tools, but they were kind of using them in the same way they were using the old types of tools. So search is search. It may be a more effective search, and you may start to evolve it through a kind of chat mentality.
But it's just search, and they were finding it difficult to think about other use cases and other ways in which they might use these tools. So we started to see providers building in guidelines, as it were. Clarivate started with its Web of Science Research Assistant, which had context-specific prompts. So rather than just having a search box when the user arrived at the page, it's got a box saying something like, are you trying to find a journal in which you might want to publish something?
Or are you trying to understand a topic better? Or are you undertaking a literature review? It helps to guide users through those kinds of questions and prompts to get really effective usage. And we've seen that evolving as well. With Elsevier's new ScienceDirect AI product, one of the things they noticed was that one of the most time-consuming tasks for researchers was identifying the methods and protocols that had been used,
and comparing those methods across multiple journal articles. So they've produced a tool that does a generative AI summarization over a certain set of articles, looking specifically at methods, and then summarizes that in a grid. It's a bit like when you go to Amazon to buy a new vacuum cleaner, and it has a grid with the different capabilities and the prices, and you're able to compare very simply.
So it uses that same kind of approach. They describe that as really being a wow factor; people haven't seen it elsewhere, and they saw it as a massive time saver, which was really important for researchers. There are a number of other use cases. I don't want to take up too much of my time because we've got some other fantastic speakers here, but I think we've seen the use of data visualizations to improve research capabilities.
We've looked at the way in which generative AI can help with confirming transparency and reproducibility. Ripeta, which is a division of Digital Science, uses natural language processing to look at the content of papers and look for hallmarks of responsible science. And then, in a more applied setting, we've seen the use of generative AI to help facilitate clinical decision support and accelerate R&D in pharma contexts.
I actually did a piece of research towards the end of last year that looked at what researchers might be expecting in terms of future use cases, and I thought that that would be useful to highlight here. So about half of them were hoping to use generative AI to generate creative ideas or designs. So in other words, to use it right at the start of the process, which is really interesting.
Some were looking to actually undertake research, and we've seen a lot of that search activity already. 43% were looking to generative AI to help them with drafting of documents or reports. 37% thought they'd use it to automate admin tasks; I don't know why that isn't higher, it's definitely a very good usage. And 30% expected to use AI to analyze data. So there are lots of different opportunities still out there.
I said I'd quickly talk about business models, so I just want to spend half a minute on that. We do see tools sometimes available free of charge. That's increasingly unusual, I would say, but at the very beginning, when people were looking to set a competitive position, it was relatively common, with providers looking to make up their investment by increasing revenues through raised prices or improved renewal rates.
Some startups offer paid-for standalone solutions, but for some of the bigger providers, we tend to see a premium-level solution offered at an additional cost to the existing subscriber base. As Dave mentioned, there are some challenges in this market as well as opportunities. I feel like we've focused on the opportunities, so I just want to close with some of the challenges. I think for many publishers, there's a commercial imperative to invest in AI-driven innovations to help them keep that competitive edge.
The larger players tend to be at an advantage here. There are obviously concerns, as Keith has outlined, around protecting IP, and there are also concerns around research integrity. We're using generative AI now to try to identify content created by bad actors using generative AI. So the battle is certainly on there. I've certainly heard a lot of concerns from librarians around hallucinations and information quality.
And they're very keen to test solutions before they get to their patrons. In some cases, they may be more concerned than the patrons themselves. I think it's natural for researchers and health professionals to very much sanity-check the research that they find and to validate it by comparing it with other resources. So hopefully they won't be a bottleneck, because I think there are really interesting opportunities and value here for a lot of scholarly researchers and for the scholarly publishing community overall.
It's quite a high-level overview, but those are my initial comments. Happy to answer any questions when we get to the Q&A session. Thank you, Kate. Our next panelist is Richard Bennett from Hum. Thank you, Dave. So, for those of you who aren't aware of Hum: Hum is a technology company based out of Charlottesville, Virginia.
We've been active in the scholarly publishing space for around about four years now. I'm going to share with you a couple of maybe more practical examples of the utilization of AI in scholarly publishing. You've heard quite a few from the landscape side, but these are going to look at two different aspects where it really does have a benefit for us to be able to utilize it.
One is around audience intelligence, and the next, and thankfully it's already been mentioned, is the more recent application of AI in editorial-slash-peer-review processing. So just a little bit of our backstory, which gives you an idea of where we developed and why we developed into the AI space. Hum started life as a customer data platform, aggregating disparate pieces of data across a publisher's ecosystem to create a single record.
One thing that became very obvious quite early on was that there was a huge gap in publishers' understanding, or at least in the data, around the interactions a publisher has on their content site. And the content site, obviously, provides one of the key points of interaction with the research community as they're reading. But most of these users are anonymous profiles, and those anonymous profiles were engaging deeply with the content.
So one of the things that Hum really needed to do was find a way of understanding these profiles and create an intelligence around them that would allow publishers both to understand more deeply and to activate on these interactions. So Hum developed Alchemist. Alchemist is a language model that was based on a model called Lodestone.
And Lodestone was essentially developed to cope with longer queries and longer strings of text. Obviously, with scholarly publishing and research articles, you have a very text-dense situation, so you needed a model that could cope with that and understand it at a significant and deep level. So the way the AI aspects come together is really around the content intelligence aspect.
So it's taking in a publisher's corpus and then using a couple of different kinds of AI to create a taxonomy. It's building a custom taxonomy by using interpretive AI to extract things like topics from the content, and then generative AI to create structure and organization, multi-level taxonomies, from those topics. Once that taxonomy is created and those topics are attached to pieces of content on the content site, you can start to see how different profiles interact with that content.
Those topics will then essentially migrate over to the profile, giving you a profile of topical interest, but also of the level of engagement that you have. So, on a secondary basis, you can start looking at that audience at a much deeper level. You can have this AI-native understanding: we can look at profiles and see what they've interacted with and how much they've interacted with that content.
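As a rough illustration of that idea, here is a hedged sketch of topics migrating from content onto an anonymous profile, weighted by engagement and decayed by recency. The data structures, field names, and numbers are hypothetical simplifications, not Hum's actual implementation.

```python
# Hypothetical simplification: taxonomy topics attached to content
# "migrate" onto the anonymous profile that read it, weighted by
# engagement and decayed so recent interest counts more.
from dataclasses import dataclass, field
import time

@dataclass
class Profile:
    topic_affinity: dict[str, float] = field(default_factory=dict)

    def record_read(self, content_topics: list[str], engagement: float,
                    ts: float, half_life_days: float = 30.0) -> None:
        # engagement could be scroll depth or dwell time, normalized 0..1.
        age_days = (time.time() - ts) / 86400
        decay = 0.5 ** (age_days / half_life_days)
        for topic in content_topics:
            self.topic_affinity[topic] = (
                self.topic_affinity.get(topic, 0.0) + engagement * decay)

    def top_interests(self, n: int = 5) -> list[tuple[str, float]]:
        return sorted(self.topic_affinity.items(), key=lambda kv: -kv[1])[:n]

# An anonymous visitor reads an article tagged by the taxonomy:
p = Profile()
p.record_read(["oncology", "immunotherapy"], engagement=0.8, ts=time.time())
print(p.top_interests())  # both topics gain ~0.8 affinity
```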
And that gives you a basis where you may not have any demographic details of that user, but you will know them deeply. You will know what they're interested in, how much they're interested, and how recently they've been interested, which gives you a very rich set of data that you can utilize for various different use cases. The other side is the predictive intelligence.
By having a mass of data, you can start to understand not just what people have done, but also what people might do in the future. For example, if you have a profile and you want to understand where it might want to go next, what piece of content it might be interested in, you can look at the collective understanding of all the profiles who took a similar route through the data, and then see what the most engaging piece of content was that they went and found after that point.
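A minimal sketch of that predictive step might look like the following, framed as a simple vote over what similar reading paths did next; a production system would use a trained model rather than this in-memory simplification, and all identifiers here are invented.

```python
# Hypothetical in-memory simplification: find profiles whose reading
# paths resemble this one, and vote on what they engaged with next.
from collections import Counter

def recommend_next(history: list[str],
                   all_paths: list[list[str]]) -> str | None:
    votes: Counter[str] = Counter()
    last = history[-1]
    needed = max(1, len(history) // 2)  # minimum overlap to count as "similar"
    for path in all_paths:
        if last in path and len(set(path) & set(history)) >= needed:
            i = path.index(last)
            if i + 1 < len(path):
                votes[path[i + 1]] += 1
    return votes.most_common(1)[0][0] if votes else None

paths = [["a1", "a2", "a3"], ["a1", "a2", "a4"], ["a9", "a2", "a4"]]
print(recommend_next(["a1", "a2"], paths))  # -> "a4"
```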
So you have a whole host of actionable uses that you can build off the back of this. Take content recommendations. Content recommendations up until this point have really been contextual: they've been associated with the piece of content on the page. But what this allows you to conceive is that you can make a content recommendation based on the profile and what you understand about the profile, and surface content recommendations that are tailored to that individual person.
You can do interesting things: you can take the scope of a special issue, put it into a model, have the topics extracted from that scope, search your entire database to find the exact matches that have the same level of deep engagement with those topics, and then invite them for authorship opportunities.
So you can very specifically target authorship opportunities to individuals. And the same goes for advertising. Similar to content recommendations, it has tended to be contextual in its purpose, but by moving to behavior-based targeting, you can open up a far greater level of inventory across your site. You're not limited to having content that can be advertised on a single page.
You can advertise to whoever might be interested in your target group, wherever they are in your content site, so you have a far wider range of opportunity to interact with them. So this is the kind of shift that we made. For the next shift, we had a very deep understanding of content, and in many ways we started to look at the opportunities out there, or the challenges that scholarly publishers were facing, and where we felt we could bring that content intelligence to bear in solving some of the key problems. And one of the key problems, obviously, is that submissions are growing at an exponential rate. You have a limited pool of editorial staff and peer reviewers who are trying to process those manuscripts, and there is a greater level of scrutiny on each of the manuscripts as we move forward. So one of the aspects we decided to look at was: can we start extracting from research publications interesting pieces of information that will help editors in the first stage, and potentially peer reviewers too, do their job in a more efficient, more consistent way?
Alchemist Review, obviously, is a beta launch; it's currently under development with a few different publishers at the moment. But what it does is a number of different things, essentially three major actions on a manuscript. The first is content extraction. So this is, you know, what you would normally have: extracting the key interesting parts.
And you've heard a little bit from Kate on some of the bigger models, the bigger ScienceDirects of this world, who are starting to do this sort of work. But it's things like creating a really entity-dense expert summary: can you distill the meaning and understanding of a piece of research into an eight-sentence summary? It's extraction of the author claims:
can we extract and present the claims the authors are making about why their research is valuable and valid and should be reviewed? Extraction of key concepts and things like the research methods. So again, it's these key aspects of the research paper, being able to extract them. I think one of the really interesting aspects at the moment is that there are now these deep-thinking models.
So these are not just immediate-response models, but deep-thinking models which allow us to do much deeper analysis on pieces of research. That allows us to look at statistical applications: are the sample sizes congruent with the statistical methodology within the paper? You can start making those sorts of assessments.
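As one concrete example of such a check, a screening step could compare the reported sample size against the standard power approximation for the test a paper claims to use. The sketch below assumes the effect size and per-group n have already been extracted from the manuscript upstream (not shown); the function names are illustrative, not part of any product.

```python
# Illustrative check: does the reported per-group n support the claimed
# two-sample t-test? Uses the standard normal-approximation power formula.
import math
from statistics import NormalDist

def required_n_per_group(effect_size_d: float, alpha: float = 0.05,
                         power: float = 0.80) -> int:
    z = NormalDist()
    z_alpha = z.inv_cdf(1 - alpha / 2)   # two-sided test
    z_beta = z.inv_cdf(power)
    return math.ceil(2 * ((z_alpha + z_beta) / effect_size_d) ** 2)

def flag_sample_size(reported_n: int, effect_size_d: float) -> str:
    needed = required_n_per_group(effect_size_d)
    if reported_n < needed:
        return f"FLAG: n={reported_n}/group; ~{needed} needed for 80% power"
    return f"OK: n={reported_n}/group (~{needed} needed)"

print(flag_sample_size(20, effect_size_d=0.5))  # medium effect -> flagged
```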
You can start looking at novelty, whether from a discipline perspective or from a methodological application within a specific discipline; writing quality; image analysis from a contextual perspective, so is the image congruent and relevant for the context in which it's presented in the paper; and methodological issues that may exist within the actual paper itself. Finally, you can start doing things like structural addition.
So, applying taxonomic terms at submission, you can classify the paper automatically, and you can start looking at retrieval of related content. There's a whole host of different applications. Essentially, what it's doing, hopefully, is creating an environment where a lot of the key significant work on the manuscript itself, not the thinking work, and maybe not the contextual analysis in the field, is done for the editor or the peer reviewer, and then you can have them focusing on the research itself.
I thought I'd put this in because I don't think we're really going to be taken over by robots. And you heard this all the way through Kate's presentation as well: the future really is about how we can apply AI to create greater efficiency and support the human-based process.
And I don't think, certainly for the near term, we're really looking at a future where that's going to change significantly. What we can change is the process: we can make it massively more efficient, massively quicker, and probably more consistent in its nature. So that's it, and I'm happy to take questions at the end when we're in the panel. Fantastic. Thank you, Richard.
Our last speaker is Avi Staiman from Academic Language Experts. Brilliant, thanks so much, Dave, and a pleasure to be here. So I'm going to share some observations from having traveled around European universities for the last year and a half teaching them about AI. I teach both researchers and research offices, and I get bombarded with questions about AI.
So I want to share with you, the publishers, some of the questions that they ask me about publishing and publisher policies, and some of the lackluster answers that I have to give them, because I think there's some work we have to do as an industry to make sure that we're not over-legislating proper and good use of AI, and potentially under-legislating bigger issues that could come about. First of all, let me share some of the biggest themes that come up when I do my teaching, when I'm actually in the classroom at the universities talking to them. I would characterize researchers' approach to artificial intelligence as cautious enthusiasm.
There's a lot of excitement, as was discussed by some of my colleagues. Researchers are so inundated, between their teaching and advising and grant submitting and proposal writing and article submissions, that they will look for ways and means to become more efficient. Time is their most valuable asset. So if they can rely on the AI, they will do so.
Whether or not, by the way, we say that they can. That's just something that's important to note. They're cautious because they're scientists, and I think that most of them, with good intentions, are really looking to do the right thing and don't want to be producing research that is lacking, that is substandard. So, you know, I would say they actually self-regulate to a great extent.
The second thing I see are big, big issues around privacy. There are two kinds of privacy questions. The first question that's almost always asked is: if I share my information with the AI tool, not so much is the AI tool going to be using it, but is it going to show up somewhere else online? And that's a point of caution for these researchers: what's going to happen with this research?
Who's taking it, who's using it, who's seeing it? And it's not a one-size-fits-all answer, in that different tools have different privacy levels; sometimes within a tool there are higher levels and lower levels. So I try to show and demonstrate how to be private online, but I think it's important to know that that's there. I also saw a question about that in the chat, and I'll say that it's not an impossible challenge to overcome.
It's not an inherent technological limitation. It's quite simple, actually, to set things up via an API or on a private server to run some of these large language models so that everything's entirely private; it's just a matter of how things are set up.
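To give a sense of how simple that setup can be, here is a minimal sketch that sends a chat request to a locally hosted, OpenAI-compatible endpoint; the URL shown is the default served by a tool like Ollama, and the model name is illustrative. Nothing in this pattern leaves the machine.

```python
# Hedged sketch: the same chat-style call, but against a model running
# on your own hardware, so the manuscript text stays local.
import json
import urllib.request

def ask_local_llm(prompt: str,
                  url: str = "http://localhost:11434/v1/chat/completions",
                  model: str = "llama3") -> str:
    payload = {"model": model,
               "messages": [{"role": "user", "content": prompt}]}
    req = urllib.request.Request(
        url, data=json.dumps(payload).encode(),
        headers={"Content-Type": "application/json"})
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)["choices"][0]["message"]["content"]

# print(ask_local_llm("Summarize this abstract: ..."))
```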
And then the last thing I'll mention is that they have a strong preference for tools that are built around trusted sources. So, for example, if you use the deep research tool in ChatGPT, you may get a mix of academic and non-academic sources, whereas tools that are dedicated to research and built on Semantic Scholar and PubMed (Kate mentioned a bunch of them earlier: Scite, SciSpace, you know, and others) are built on top of the academic literature. So that's an important differentiation to make.
The big issue is that Semantic Scholar essentially covers the open access literature and only abstracts from the non-open-access literature. So there's still this big gap where, basically, imagine you're coming and asking a question, and the answer is only based on the open access literature, which in certain cases is more than enough, right?
If you're asking a pretty simple and straightforward question, fine. But if you're asking a very niche, high-profile question, the fact that you're missing 50% of the research is actually quite critical. So I think, in the next generation, what we're going to see hopefully, and what I'm hoping for, is licensing deals struck between some of these search tools and the publishers, so that when a researcher goes and asks a question, they're actually getting a complete, verifiable, and also comprehensive answer.
Yeah, so currently there are a few frustrations that researchers have with publishers, and I think it's important for publishers to be aware of them, because these are the kinds of questions that come up in my trainings. And these are things that I've been pushing in different fora; I've written in The Scholarly Kitchen and the Digital Science blog about these topics and how we as a publishing community can do a better job supporting researchers with AI.
First of all, it's really unclear what we mean when we talk about declaration. Almost every publisher has a policy that says, well, you need to declare and you need to be transparent, but how exactly you do that, when you do that, and where you do that is definitely not consistent, even across a single publisher's portfolio. So even within Wiley or within Elsevier, different journals may have different rules.
And just exactly how am I supposed to do that? On a very basic level, if I'm a researcher, I may think, well, maybe it's enough to have one line that just says, hey, I used AI for parts of this project. Whereas, to give one example, The Lancet came out with what I would call a very strict policy, whereby they basically stipulate that the researcher needs to document and share any time they have a chat with ChatGPT
that has something to do with the research. You know, I don't personally think that's a brilliant idea, but it's important. Researchers can get confused, because declaration doesn't mean the same thing in different journals. So what do we do? The second thing is that they fear being penalized. They fear us publishers.
Maybe they think more of us than we know about ourselves: that we have magic tools that can basically go into their computers and identify any time they've used AI. So sometimes they'd love to use it, but they fear they're going to get in trouble, or that publishers have banned it, and therefore they don't use it. To me, that's a shame.
You know, if you've got a researcher who's not a native English speaker and has trouble writing and communicating their article, and they're not going to use AI because they're worried about punitive measures, I think that's a shame. Then there's the lack of guidance. They're looking to publishers to tell them: how should or shouldn't I be using this?
Sometimes they're looking to their university, but it's kind of this circle where everyone's like, oh, you know, just tell us how you used it. But before that, researchers are asking: well, how should we be using it? What is responsible use? Are all use cases the same? Spoiler: they're not. There's a big difference between proofreading and data analysis.
And we need to really get into the weeds in order to do this properly. And then finally, again, this transparency. I'm a big fan of transparency, and I'm all for it. But when we look through a transparent window, there's got to be something on the other side. We need to know what the value system is, and I think it's just reinforcing the current scientific value system.
I don't think we need to be doing anything different or new. It's important to know that AI adoption really is very much dependent on different factors. Of course I'm generalizing here, but based on the research that I've seen, age plays a big factor. It's much more likely that your doctoral and postdoctoral students are going to be jumping all in on AI, whereas the PIs are going to let them do it
while sticking to traditional methods themselves. So there's an interesting tension there. It's also interesting to note different geographies and their adoption rates of AI. For example, in India and China, and, where else have I seen, Brazil, they have no qualms at all about using large language models. They're happy to feed in,
and again, I'm generalizing, but happy to feed in their information; they don't see the issue with it. Whereas, especially in Europe, there's a lot more concern around GDPR and privacy. So, you know, there's that kind of adoption gap to understand. And the last thing I would say is differences in subject areas. In general, humanities scholars tend to be more skeptical or hesitant to adopt these tools than STEM researchers.
Obviously, computer scientists lead the way in terms of their use of AI, and the social sciences are somewhere in the middle. So, to wrap up, some key business opportunities that I see from my perspective. First of all is peer review. I don't think we should be afraid of experimenting with AI in peer review. That doesn't mean we should hand the keys over to AI tomorrow, but it does mean that I think we need to be asking ourselves: what are the different components of peer review?
Break it down to its basic, foundational atoms, as we might say, and ask ourselves which of these humans do better and which of these AI does better. And then maybe we can make a more sophisticated, augmented approach where we're synthesizing between the two. Content discovery has already been mentioned here. Incredible tools like Undermind and Perplexity are doing really deep research and are able to write out search plans in a very similar way to how you would as a research lab, maybe even better.
But instantaneously, which is really quite incredible. And I think, in the end, it's all about trust, right? It's about validating AI usage. Researchers are looking to us as the publishing community to say: yes, here's how it works, here's how it doesn't work, here's what you should be doing. I want to just call out Wiley for good here; I don't have any conflict of interest, particularly.
They just released what I think are the first comprehensive guidelines for AI in your writing for researchers. I've been very critical of publishers being very thin on this: yeah, just be transparent and all's good, and we'll sweep everything else under the rug. I'm not going to go through the whole thing now, but it is dozens of pages, not only guidelines but also best practices, and it's more educational, which I think is a good way to go about it, because it's great to make a rule, but people need to understand how to apply that rule and what makes sense.
So I highly encourage you, and I can share this, but again, this is really hot off the press; I think it was published a few days ago. If you're looking to ask yourself, well, how do we even approach this topic, I think it's a good starting point, so I recommend checking it out. And just to conclude: when I do these trainings, I ask people, give me your word.
Give me your one word that describes your feeling around AI and research. In 2024, last year, it was a lot about fear, distrust, and nervousness. This year it's different: it's curiosity, learning, excitement, potential, possibility. So I think that really reflects it: a quick, informal sentiment analysis of the research community shows that we are changing.
And the question that I'm asking myself is: is 2026 going to be the year when people actually take the plunge? And not only individuals, but institutions: taking the plunge, starting to invest in tools and tool infrastructure, understanding the transformative nature of artificial intelligence, and maybe getting to a point of actually transforming the research process as a whole.
So thanks, Dave, for the invite. Great to join you today. If anyone wants to continue the dialogue and conversation, I try to write about AI and research on my LinkedIn page at least once a week; you're welcome to reach out and connect there. Fantastic. Thank you, Avi, and thank you to all the panelists for your participation. It really warms my heart.
I heard the word licensing in a number of talks, and, you know, that's where I focus. And I really think that that is one of the largest opportunities for the publishers on this call, because once the Googles and the rest absorb all the open access content in the world, what's left is the paywalled content. And as mentioned, comprehensiveness is one of the aspects of great AI.
And they're going to need your content. So that's the lowest-hanging opportunity I see, and hopefully we can talk about it if we have some time left. Unfortunately, we only have about four minutes. We have four questions in the Q&A, and I encourage anybody on this webinar to add your questions; hopefully we can get to all of them.
The first one is: how does AI influence the selection of references and citations? Anybody want to take that one? I don't know if we know. I mean, I'd be happy if someone has more insight on this, but I haven't seen any research that really digs deep into whether there's a certain bias or trend toward quoting certain researchers over other researchers. There's definitely, like I mentioned before, a bias toward open access, simply because that's where the tools can search the full text and be able to spit it back out.
So I think you'll probably see an increase: assuming that researchers are using these tools quite heavily, you'll see increased use of open access texts, which were already quoted more often anyway. But I don't know if there are any more specific biases baked in that have been proven. Yeah, I tend to agree with you. I think the challenge is that it's all really around context, and so we encourage both authors and publishers to have the most enhanced metadata that they can.
Because when the AI tools manage the relationships between the content and the context that they're looking for, that may influence the selection of the references and citations. That's the only thing I can really think of as well. Yeah, I mean, I was just going to give a perspective from maybe the opposite side, insofar as you've got tools like Grounded AI that are looking at contextual citations.
And so I think there's going to be a greater level of clarity and focus on validating the contextual use of citations within a research paper. As for how it influences the selection, normally these are, you know, a chicken-and-egg type of scenario. The tools are definitely there, and Scite is another good example.
The tools are there that are making the selection of references a part of the assessment of the paper. So I think it will have an influence; I'm just not quite sure how it will influence just yet. Yeah, thank you. This next question is for Kate. Kate, can you talk more about the use of generative AI in clinical decision support?
Yes, briefly, because we don't have long. There's a startup called, I think, Aiforia; I don't know how you pronounce that, but they do image analysis for pathologists and diagnostic tasks, so they can improve accuracy, reduce bias, and standardize analysis. So there are certainly examples there.
We're also seeing quite a lot of incorporation of generative AI into electronic health records. Epic did a deal with Microsoft, and there are a couple of other examples there, really to try to enhance clinical decision support: real-time insights and recommendations, personalizing of treatment plans, speech recognition and natural language processing to reduce admin burden and streamline documentation.
So there's a whole range of case studies; if anyone's interested, just reach out and I can provide a bit more detail. Thank you, Kate. This next one is for Richard. How does using AI in peer review fit with best practices around not feeding confidential, privileged communications into AI tools? That's a very good question. The first aspect, just for clarification: in all of these different models, and for everybody who's utilizing AI tools around peer review, the actual manuscripts themselves aren't being fed in for training purposes into any of the actual models that are out there.
For us, we're creating private clouds and working in a manner consistent with the data protection requirements of all the publishers we're dealing with. But essentially, you are hitting the model: it's giving you an answer, not absorbing the manuscript into the body of the model itself. And then you're utilizing the output of that.
And, as was also referenced earlier, you can run models entirely privately outside of this so that there's no slippage, no movement of materials into the models themselves. So for us, it's working with publishers and making sure that we are absolutely consistent with all the data policies, which is what we're doing at the moment, publisher by publisher.
Great. I know we're slightly over time, so I'm just going to fire this last question to each one of you, if you have any comments: what AI tools are you seeing scholarly publishers trust and adopt in-house to enhance workflow and productivity, and what are some of the ways these tools are being used? So really, just what AI tools are people using, one way or the other? Just quickly around the virtual table.
Well, I'll kick off. I think what we're seeing is AI tools not coming in under the radar, but becoming a standard part of a lot of workflow offers. So when we look at any of the big providers of editorial workflow solutions, AI is becoming an increasingly important part of that, for editing purposes and for a whole range of internal use cases.
Thank you. Anyone else? Yeah, I mean, I think we are at the advent of tools in editorial and peer review; that's still nascent. We are seeing those, and people are experimenting. The areas where I've seen the most adoption have been around things like research integrity and screening.
So where there are essentially technical tasks, tasks that can be taken over by AI tools, that's probably where I've seen the majority of the tools being applied at the moment, and structurally, they're probably the easiest to apply in that area. Avi, anything? Yeah, I was just going to give a quick plug for SSP here.
I sat in Dave's seat a year and a half ago, and we did an event titled AI Efficiencies to Optimize Workflow from Submission to Publication: Three Case Studies. That was actually three different publishers talking about how they're using AI internally. I'm assuming we're all big SSP fans here, so go ahead and log into your account and you should be able to watch that.
Sounds great. Keith, anything for this last one? If not... Yeah, no, I'll leave it to the experts on that one. I'm following the legal and policy issues; you all are following these other issues. So it's really, really interesting to hear it all. So thank you. Sounds great.
Well, with that, Lori has got the hook in me to end this thing, so I'll turn it back over to her. Thank you, and thank you all. I just wanted to thank the panelists, Keith, Kate, Richard, and Avi, for your participation. And back to you, Lori. Yes, on behalf of SSP, thank you all, and thanks to our attendees as well.
Great, great session. Really wonderful information. I encourage the attendees to provide their feedback; there's an evaluation, and you can scan this code as well. We'd like your input. We're also planning for our in-person New Directions event in the fall, and there's an AI component in there.
So if there are things you would like to see expanded on or added, please put that into your evaluation under suggested topics. Don't forget, the SSP Annual Meeting is May 28th through 30th, and early bird registration ends April 18th, so get your registrations in as soon as possible. Thanks again to our wonderful sponsors, Silverchair and Access Innovations. And again, today's webinar was recorded, and all registrants will receive a link when it's posted on the SSP website.
And that concludes our session today. Thank you, everyone. Bye now.