Name:
AI A to Z: A Primer on AI Applications in Scholarly Publishing
Description:
AI A to Z: A Primer on AI Applications in Scholarly Publishing
Thumbnail URL:
https://cadmoremediastorage.blob.core.windows.net/c87a5896-c7bc-41cc-bf83-f9a3bc897313/thumbnails/c87a5896-c7bc-41cc-bf83-f9a3bc897313.png
Duration:
T00H58M54S
Embed URL:
https://stream.cadmore.media/player/c87a5896-c7bc-41cc-bf83-f9a3bc897313
Content URL:
https://cadmoreoriginalmedia.blob.core.windows.net/c87a5896-c7bc-41cc-bf83-f9a3bc897313/April 2024 AI Lab Report recording.mov?sv=2019-02-02&sr=c&sig=%2BkvSlMFjMugOhFmsf8vr44RPdbtLvgQC%2BUeLAPoILxY%3D&st=2025-01-29T00%3A03%3A05Z&se=2025-01-29T02%3A08%3A05Z&sp=r
Upload Date:
2024-04-22T00:00:00.0000000
Transcript:
Language: EN.
Segment:0 .
CRAIG GRIFFIN: Craig Griffin, Vice President, Solutions Engineering at Silverchair. I've been at Silverchair for 13 years. And this is a very exciting time for us. AI technologies are revolutionizing every step of the publishing process, from automating the peer review process to enhancing content discoverability. AI is making scholarly publishing more efficient, accessible, and impactful. For authors, AI tools are assisting in literature review, data analysis, and, for better or worse-- and I'm not advocating this-- even in drafting manuscripts.
CRAIG GRIFFIN: For publishers, AI is streamlining workflows, improving content recommendation systems, and enabling more effective engagement with readers. And for researchers, AI is unlocking new insights through advanced data mining and analysis techniques, transforming how we understand complex data sets. As we navigate this AI-driven landscape, we encounter a plethora of acronyms that can be bewildering.
CRAIG GRIFFIN: We use the initials AI as shorthand for a range of technologies. Terms like LLM, NLP, BERT, and RAG are becoming increasingly common in our discussions about AI and scholarly publishing. But what do they specifically mean? In this panel we will go through the different areas of activity in scholarly publishing and review the relevant AI-related technologies and why they are or are not suitable for a particular application.
CRAIG GRIFFIN: And with that introduction, I just want to introduce our panelists. And we're going to start off with identifying each individual, where they're from, and then answering the question, what has your engagement with AI looked like so far? And I'm going to have Chhavi go first.
CHHAVI CHAUHAN: OK. Thank you so much, Craig. I'm Chhavi Chauhan. I'm a former researcher who transitioned into scholarly publishing over a decade ago as a scientific editor for the journals managed by the American Society for Investigative Pathology, where I currently serve as director of scientific outreach. I'm also co-facilitating an AI community of interest for the Society for Scholarly Publishing, with over 200 members.
CHHAVI CHAUHAN: And we have three dedicated working groups focusing on understanding AI in the scholarly publishing domain, building tools that leverage AI for expediting processes in the scholarly publishing domain, and putting ethical practices and policies and governance in place for stakeholders in the scholarly publishing domain. I was also invited to be on FASEB's generative AI task force.
CHHAVI CHAUHAN: So FASEB is a federation of 23 scientific societies. And our mandate is to come up with deliverables to improve efficiencies in editorial office workflows to bring societies closer to their mission, as well as to delineate policies with the federal agencies for the use of generative AI in scholarly publishing. And, lastly, I'm a co-author of a newsletter that focuses on augmenting scholarly publishing using AI tools and trends.
CRAIG GRIFFIN: Great. Thank you, Chhavi. Jeremy, you want to go next?
JEREMY LITTLE: Sure, Craig. So I am Jeremy Little. I'm a tech lead at Silverchair. And recently I've joined the AI team at Silverchair. So I've been at Silverchair for about six years now and primarily as a software engineer, actually. So I come with an engineering background to AI. But within the last year the Silverchair platform has focused on investing in AI and experimenting with different ways of applying it to our platform.
JEREMY LITTLE: And I've been a part of that process with the other team members. So while I come from more of a technical side, I definitely have been engaged in many conversations with our clients here and in the industry more broadly. So I look forward to bringing a technical angle to a lot of this discussion today.
CRAIG GRIFFIN: Great. Thank you, Jeremy. Phoebe.
PHOEBE MCMELLON: Yeah. So at GeoScienceWorld, we're an aggregator of 40-plus societies in the geosciences. And, really, our involvement with AI comes from multiple stakeholders across the value chain, which we'll talk about a little bit more in the session. But those stakeholders are largely publishers-- at GeoScienceWorld, not only are we an aggregator, but we also publish one of our own journals. We come at AI asking how we help the end users--
PHOEBE MCMELLON: the research scientists-- actually do their science more efficiently, more effectively, and gain greater insights. And then, lastly, how do we help our society partners leverage it internally so that they can focus on their missions and less on the aggregation and on how to reach the end users of their content. And I think, lastly, having spent 13 years at a large publisher, there are lots of ways you can use AI on the back end to just improve the efficiency of operations, automate things so that you reduce costs, and at the same time find alternative ways of generating revenue.
CRAIG GRIFFIN: Great. Thank you, Phoebe. And then finally, Rich.
RICH DOMINELLI: My name is Rich Dominelli. I'm an Assistant Architect at Data Conversion Laboratory. So Data Conversion Laboratory does a large amount of business in unstructured-to-structured data conversion, data aggregation, and web crawling in the legal, academic press, and even medical spaces. So we've been using artificial intelligence in our tool chain for a while now-- or at least a broadly defined version of artificial intelligence-- for things like entity extraction and OCR and other processes.
RICH DOMINELLI: So it's not a new thing for us but, obviously, it's now exploded and it's constantly changing and we're always looking for a better mousetrap to simplify our procedures and to do more things in an automated fashion.
CRAIG GRIFFIN: Super. Thank you, all. OK, great. We're going to move right into a poll. And I'd like everyone to respond to the poll, please. We have all probably heard anecdotal evidence of different organizations using AI at various times, but I think this is a fairly large turnout for this particular webinar. And I would really like to see what we can find in aggregate-- any patterns or trends in our industry that reflect where we are.
CRAIG GRIFFIN: In my conversations it's ranged from organizations that are just monitoring and keeping abreast of changes all the way up to full skunkworks-type programs that are working on fully funded feature development, that kind of thing. So I'd be very curious to see what this crowd has for feedback on where their organization is on this particular front.
CRAIG GRIFFIN: All right, there's a few of you who still haven't voted yet. OK, that looks like about it. So this is interesting. The lowest level of engagement, mostly conceptual, is in third place, it looks like. And I've seen that quite a bit, where certain organizations can't or won't lead the charge on something like this, but are certainly keeping abreast of changes.
CRAIG GRIFFIN: Beginner, that sort of second level, is the number one result. And that's not surprising either. With some experimentation it's easy to start working on some ideas, some concepts, that kind of thing, as well. And then, of course, the third level, the amateur, if you will, has started working AI into workflows. That's still fairly easy to do in a number of ways, particularly in certain roles it works well with.
CRAIG GRIFFIN: And the expert level-- there's only 1%. So very few have really gotten into that deep dive, if you will, into AI. And I think that's very interesting and will probably develop over time. Does anyone on the panel have any comments on that real quick? Any more insight? OK, great.
CRAIG GRIFFIN: OK, so now we're going to get into the meat of the presentation. And I believe Rich is going to walk us through the foundational concepts here.
RICH DOMINELLI: OK. So let's talk about what artificial intelligence is. I mean, when we're talking about AI, what we're trying to do is make a computer mimic the way human intelligence works. So a lot of this is based on Hebbian learning, the theory of neuroplasticity, which is a fancy way of saying that neural networks are how organic creatures learn new concepts. And what happens in a biological neural network is that the neural pathways slowly get trained with new stimuli over time.
RICH DOMINELLI: As those pathways are exercised at different levels of learning, they get deeper and deeper. So an artificial neural network is based on this concept. And it's actually a surprisingly old concept-- much older than you would expect. Alan Turing back in the '40s wrote a couple of papers about artificial networks. Marvin Minsky in the '60s wrote about the concept of the perceptron, and researchers working on OCR in the early '90s
RICH DOMINELLI: did a lot of work on how human intelligence functions and how to model it properly using computer systems. So machine learning is a subset of artificial intelligence where we have statistical models of how humans understand speech or vision, and different layers within the network get trained over the course of training data and then generate results based on that-- to the point where-- and I think we're going to talk about explainable AI later on--
RICH DOMINELLI: but it's becoming increasingly hard to understand how a given neural network model or a large language model generates the information and concepts and decisions that it does. And then I think we have Jeremy, who's going to talk about some of the more detailed and direct use cases of AI.
JEREMY LITTLE: Yeah, that's a great transition. So like Rich just said, AI has been around for quite a while. And we've been hearing about it more and more nowadays. And I want to talk sort of about the terms you've probably been hearing and more of the vocab that's being thrown around nowadays. So I'm going to go through a list of terms that you've probably heard recently. Some of them are pretty simple and well known and some of them are a little more complicated.
JEREMY LITTLE: But I do think it's important to get a foundational understanding of what these different terms mean before we continue with this panel. So, like we've been saying, AI has been around forever. So why are we hearing about it all of a sudden? I would argue a big reason is the big leaps and bounds in the generative AI space. So generative AI is a particular kind of AI.
JEREMY LITTLE: And it just means AI that focuses on creating new content. So this is built off of training data, but it's focused on generating new things rather than just reacting to data or reciting things that it has stored somewhere. So generative AI can come in many forms-- it can be text, image, or video based, and the list goes on-- but it's just about creating new things.
JEREMY LITTLE: So the most widespread use of generative AI is through large language models, or LLMs. These are your ChatGPTs or Google Geminis. So LLMs are huge machine learning models that are trained with lots and lots of text. They focus on understanding and generating natural language. And there are many implementations of them. So the most well known is GPT, or the generative pre-trained transformer.
JEREMY LITTLE: And I want to make a distinction here. GPT and LLM aren't the same thing. So LLM is a broad term, but GPT is actually OpenAI's implementation of an LLM. So while all LLMs are about language, GPT is specifically OpenAI's version of an LLM. Google's example of an LLM is Gemini, for instance. But when I talk about models, models are also distinct from these things.
JEREMY LITTLE: So models are a broad AI term. And they fundamentally mean a pre-trained system that's just designed to perform tasks. It's built up with data to perform certain tasks. So in the context of large language models, a model is trained up with lots and lots of text and then tuned to make it sound more like a human would respond to things. And it's about understanding the language.
JEREMY LITTLE: But there are other kinds of models that are more fundamental to machine learning that can be built up with other training sets. So think of a stock market predictor. There are lots of ways of applying models generally, but we've been hearing about large language models because that brings generative AI into consumers' hands more easily. So I think these three terms are often conflated and they're thrown around in place of each other, but I do think that they are distinct concepts that we need to keep in mind when we talk about these kinds of things.
JEREMY LITTLE: All right, so we can go to the next slide. So with this context, how are we seeing LLMs and other models being put into scholarly publishing itself? How are these terms actually relevant to us? So I think the most common use case of this is chat bots. This is what we've been seeing with many of these tools now. This is the familiar interface you know. You enter in text and you get a human-like response back from the chat bot.
JEREMY LITTLE: So traditionally these weren't built with LLMs, but with all the modern investment there, a lot of the time these are just wrappers around LLMs themselves. So when you enter something into a chat bot, it will build up a prompt and send that prompt over to the LLM itself. The prompts can be augmented in a number of ways. They can include extra data or extra information that the LLM will take into account when it gives your response.
JEREMY LITTLE: So the way that you craft this prompt actually makes a really big difference in how the LLM responds to you. And that's what the concept of prompt engineering is about. So, for example, in a research context, you could add to any question something like, I'm a researcher and I'm an expert in my field, and it will really change how the LLM responds. And there are more advanced ways of changing prompts that chat bots can do.
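A minimal sketch of that idea in plain Python; the build_prompt helper and the persona string are hypothetical, but they illustrate how a chat bot can wrap a user's question with steering context before sending it to an LLM:

```python
# Hypothetical sketch of prompt engineering: the persona string is extra
# context a chat bot prepends so the LLM answers in a different register.

def build_prompt(user_question: str, persona: str = "") -> str:
    """Combine optional steering context with the user's raw question."""
    if persona:
        return persona + "\n\n" + user_question
    return user_question

question = "Summarize recent work on CRISPR off-target effects."
plain = build_prompt(question)
steered = build_prompt(
    question,
    persona="I'm a researcher and I'm an expert in my field; be precise.",
)
# Both prompts go to the same LLM, but the added context changes the response.
print(plain)
print(steered)
```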
JEREMY LITTLE: So RAG is probably something you've heard recently. And this refers to retrieval augmented generation. Now, RAG is a way of augmenting chat bots to make them even more advanced. And basically the way it works is when you ask a question to the chat bot, the system will go and find a bunch of relevant data. When it finds that data, it will package that data with your original prompt and send the combination to the LLM.
JEREMY LITTLE: So what this looks like in practice is, say you ask a chat bot, how deep is the ocean? The search algorithms will go and find potential research papers or data around how deep the ocean is. It will include those sources with your question and send that off to GPT or Gemini or whatever LLM you're using. The applications of this really cut down on hallucinations, because now LLMs can have primary sources for where they're getting their data, instead of just giving the model's direct responses.
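A sketch of that retrieval augmented generation flow, with hypothetical stand-ins (search_corpus, call_llm) in place of a real search index and a real LLM API:

```python
# Hypothetical RAG sketch: retrieve relevant sources first, then package
# them with the user's question so the LLM answers from primary documents
# instead of from its training memory alone.

def search_corpus(query: str) -> list[str]:
    # Stand-in for a real search index (keyword or vector search).
    return ["Survey data: the Challenger Deep reaches roughly 10,900 m."]

def call_llm(prompt: str) -> str:
    # Stand-in for an API call to GPT, Gemini, or another LLM.
    return "(LLM response grounded in the supplied sources)"

def answer_with_rag(question: str) -> str:
    sources = search_corpus(question)
    context = "\n".join(f"[{i + 1}] {s}" for i, s in enumerate(sources))
    prompt = (
        "Answer the question using ONLY the sources below, citing them.\n"
        f"Sources:\n{context}\n"
        f"Question: {question}"
    )
    return call_llm(prompt)

print(answer_with_rag("How deep is the ocean?"))
```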
JEREMY LITTLE: So we've seen that with these tools on the right; there are lots of industry examples of RAG chat bots. Scite, Consensus, and some tools from Clarivate are all examples of this, where you can ask a question and it will provide you an answer as well as the citations it got that answer from. So that's on the consumption side. This is how people are changing how they interact with content.
JEREMY LITTLE: But there are also new ways of creating content and curating content as well. So LLMs are great with language, as we know. So there are lots of LLM writing assistants that have come out. Grammarly is a very mainstream, common example, where you can do spell check or grammar check, but there are also more advanced use cases, like citation lookups and automatic insertion, or auto-formatting of data into tables. So the potential here is really huge, and we see more and more tools like HyperWrite and Jenni come out.
JEREMY LITTLE: It seems daily now. But I also want to acknowledge that these have all been LLM conversations. And a lot of what we're seeing is based on LLMs, because that's where a lot of the new activity has been. But there are also traditional ways that AI can still impact scholarly publishing. And two great examples of that are NLP, or natural language processing, and text and data mining.
JEREMY LITTLE: So NLP is also a broad concept, but it specifically means being able to build a system, a model, that can process real natural language. It's right there in the name, actually. So this can mean a lot of different things, but a common example is that you can give an NLP model a sentence or a paragraph and it can extract things like its sentiment. Is it happy or sad?
JEREMY LITTLE: Or it can do things like tell you the subject matter of the text. It's just about processing language and synthesizing it. Text and data mining is a technique where you take systems like an NLP system and run lots and lots of data through it. You run lots of text through it, or lots of data points, and you can gain broader insights about that data set after mining through it all.
JEREMY LITTLE: Now, this has really been made possible by AI, because machines can obviously process this much faster. So the example applications for these two techniques together are things like industry insights or just large-scale data analysis. Maybe you want to invest more in a particular field or a subfield, and NLP and text and data mining can show you where the investment is coming in or where it's lacking.
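A small sketch of that pattern, assuming the Hugging Face transformers library is installed; the three-sentence corpus is a toy stand-in for the thousands of documents a real text and data mining run would process:

```python
# Sketch of text and data mining with an NLP model, assuming the Hugging
# Face transformers library is installed (pip install transformers).
from collections import Counter
from transformers import pipeline

classify = pipeline("sentiment-analysis")  # pretrained sentiment model

# Toy stand-in corpus; a real mining run would stream thousands of
# abstracts or full-text documents through the same model.
corpus = [
    "The results strongly support the proposed mechanism.",
    "The study fails to replicate earlier findings.",
    "Interest in this subfield has grown for three consecutive years.",
]

# Run every document through the model, then aggregate for broader insight.
labels = Counter(result["label"] for result in classify(corpus))
print(labels)  # e.g. Counter({'POSITIVE': 2, 'NEGATIVE': 1})
```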
JEREMY LITTLE: Also, on the creation side, peer review is starting to be looked at as automatable, because you can basically have an NLP system that can do things like detect fraud or flag potentially poor research. So there are lots of applications for traditional models as well as LLMs, especially on the earlier side of the publishing chain. So speaking of the publishing chain, we can go to the next slide, too.
JEREMY LITTLE: So these are the terms that I've thrown out. There's obviously a lot of them, and we're going to hear them throughout the discussion, but that was a primer on exactly what we're going to be talking about and how these terms apply. But I want to get more grounded now-- or, I want to pass it off to Chhavi, actually, to apply these to different parts of the publishing chain, generally speaking.
CRAIG GRIFFIN: Great, thanks, Jeremy.
CHHAVI CHAUHAN: Thank you, Jeremy. Rich, that was an excellent foundation that you laid out for us. Thank you for that. And, Jeremy, I always struggle with these terms so thank you for so eloquently laying them out and explaining them so well. So my intention here today is to provide a little bit more depth to the use of AI in the scholarly publishing value chain, which essentially begins with the researcher identifying a question that has not been answered in whatever field they may be researching in.
CHHAVI CHAUHAN: And it culminates with the dissemination of the knowledge that the researcher gathers over the course of their experiments. So, essentially, to be able to address this question, the researcher needs to secure some resources to be able to conduct a series of experiments, which essentially involves generating a bunch of data that they analyze, and then going to the next step of collating this data in a digestible format for the scientific community to build upon.
CHHAVI CHAUHAN: And for the lay public as well, for them to understand how the science is advancing. And that's where the journals come in for the authors, for the researchers to be able to reach out to the scientific community and to the broader public at large. So the intent of most authors is to submit their manuscripts, which could be in many different forms, to a reputable journal. And what makes a journal reputable?
CHHAVI CHAUHAN: It's their screening process, their high standards of peer review where other experts from the domain vet the information that you have provided, check it for its scientific accuracy, its rigor and reproducibility. And that then it undergoes an editorial decision making process. And if accepted, it gets published and disseminated to the public.
CHHAVI CHAUHAN: Now, there are many different ways in which we can leverage AI. I'm a researcher myself, and as I was doing research I formulated questions. And by the time I was ready for the initial phase of proposing my question to my committee, some studies had come out that had delved into that question partially. So now researchers have the unique advantage of leveraging AI to scan through the literature, which an early researcher probably cannot do as efficiently and as speedily as AI can.
CHHAVI CHAUHAN: And they can find out whether or not their question is unique, if there are some partial or full overlaps to the kind of questions they're asking. And if so, they can fine-tune the question to have a greater impact on the scientific community. As a researcher, I struggled with statistical analysis. And no matter how intellectually sound you may be, you cannot compete with AI when it comes to data analysis and showing you trends.
CHHAVI CHAUHAN: And in my PhD, at least, I focused heavily on genetics, which involved doing a bunch of experiments and looking at numbers just to see what my next question should be. So if AI can expedite the data analysis, it really helps the researcher move on to the next step of thinking about what the next big question would be and how to approach it. Again, as a researcher, when I was writing my pieces, I absolutely hated it.
CHHAVI CHAUHAN: Writing the primary research manuscripts was such a struggle for me. It changed during my post-doc, but up until then I hadn't built the muscle. So-- and like Craig mentioned, I'm not advocating for the use of generative AI to synthesize your entire content or compromise its scientific integrity-- but I would highly encourage everyone, especially researchers, to think about using it to define the outline of their articles, building that skeleton.
CHHAVI CHAUHAN: There's going to be an introduction, materials and methods-- what's going to go there-- results, discussion, so you can overcome that initial barrier of starting to write your manuscript. Now I live in the journal world, and I can't tell you enough how we can leverage AI to improve efficiencies in editorial office workflows, where we are not waiting on someone in the editorial office to actually hit a button to push the article forward.
CHHAVI CHAUHAN: Of course, human oversight is needed, but there is so much mundane business that goes on in a journal office that we can improve efficiencies, simplify things, expedite the process with human supervision by leveraging AI. And most importantly, when it comes to the dissemination of scientific content, I can't tell you enough how exciting it is for a researcher to share their findings with the scientific community to build upon and with the public to let them know what they're working on.
CHHAVI CHAUHAN: But leveraging AI, we can make this dissemination smoother. We can make it a lot more accessible, by providing simplified summaries of the content for people to be able to digest, as well as by translating content into regional languages so it becomes easily accessible for researchers in other parts of the world. And Jeremy has shared several examples of leveraging NLP, or natural language processing, and other examples of the tools that can be leveraged across the publishing value chain.
CHHAVI CHAUHAN: I think I'm ready to pass on the baton to Phoebe at this point.
CRAIG GRIFFIN: Thank you, Chhavi.
PHOEBE MCMELLON: Thank you. And I apologize in advance if there's any noise. I am at a conference, actually, that's very much talking about AI in the geoscience world. So let's start there. I wanted to cover, as I mentioned, where GeoScienceWorld sits-- and where I personally sit, also as a geoscientist back in the day-- starting with the researcher, looking at the stakeholders, and really asking what key questions can be asked and how AI can really help these various stakeholders.
PHOEBE MCMELLON: This is not an exhaustive representation of the stakeholders, but clearly the key ones that exist. And as we've discussed earlier, and Jeremy has sort of laid out very, very nicely, all the different ways that AI can be leveraged, I'm just going to include that bucket as AI in general in this. So for researchers and authors-- and I think Chhavi presented a very detailed understanding of how researchers and how AI can actually help across their workflow.
PHOEBE MCMELLON: But, really, securing funding, analyzing the data-- certainly even at the conference that I'm at today, there are several vendors and startups, both in the academic space and in the corporate space, who are playing around with different ways to extract insights from the sheer breadth of content that exists. And as we see each year, the number of publications being published in the digital domain is just staggering.
PHOEBE MCMELLON: I think it's something like a 56% increase per annum over a 10-year period. And so that's pretty hard for a researcher to stay up to date on what content is out there, let alone try to understand those very important research questions that either have not been answered, or have been answered but perhaps with gaps in that knowledge. So I see that data analysis piece as being a really, really significant way to just accelerate and make that research process more efficient.
PHOEBE MCMELLON: Once you get to the output side of your research-- I've been here talking to some people who are non-native speakers of English. And their dream is to be able to write their article, a manuscript, in their own language, then upload it and have it all converted to English so that it can be reviewed. And I think that AI really has the capability here to make things more equitable for non-native speakers of English who are trying to participate in the research process and, certainly, in scholarly publishing.
PHOEBE MCMELLON: So I'll move on to talk about editors and peer reviewers and how they can use AI in their world. I think the key priorities are streamlining the review process, as has already been mentioned; identifying integrity issues; and automating some of that process, perhaps taking out some of the more mundane operations around handling a manuscript. We already see it today with plagiarism checkers and checks for paper mills and things like that, but also ensuring compliance in a variety of aspects across the editorial and peer review process.
PHOEBE MCMELLON: Moving on to researchers and readers: how can AI help them in their workflows? Really, it's around discovering relevant content, analyzing trends, gaining insights. I think you'll start to see a trend that it's about getting the right, relevant information more efficiently and more effectively than we do today. The last line makes a shift, really, to particular stakeholders that I have personally engaged with as the aggregator.
PHOEBE MCMELLON: And I think on the librarian side, too-- given the number of stakeholders that they have, both within the administration and among the actual researchers, the faculty members, the lecturers-- there's a variety of ways that a librarian can use AI. But, really, all of them are about helping librarians operate more efficiently and effectively: to understand how, one, to support their patrons better by understanding their research needs.
PHOEBE MCMELLON: But also how to improve the experience of the researcher and basically make it easier for researchers to find the relevant information that they need in the library. I read an article recently about ways librarians could actually leverage AI for accessibility-- helping people who have disabilities navigate the library to find and read and consume the content that's out there.
PHOEBE MCMELLON: Some other ways are perhaps a little bit more 2001, where you actually have robots that do some of the work of restocking the books and helping keep the library organized. On the publisher front it's really, again, about gaining efficiencies across the value chain and ensuring that the content we are publishing is not compromised in quality, and that we can do it in a cost-effective way.
PHOEBE MCMELLON: And, lastly, I put in society leaders. And I think for both publishers and society leaders these questions can be mutually shared: how do I attract members, support the research community, diversify my revenue, and reduce costs? The same questions could be asked of AI for publishers. And on the society leaders' side, it's understanding how you can deploy better targeting and better marketing tools so that you're engaging with the community that you really care about and that cares about your mission and what you do.
PHOEBE MCMELLON: So I'm going to stop there. And I think we're about to move on to the Q&A.
CRAIG GRIFFIN: Yes, we have an intermission before that. We have another poll. And the question is for everyone. Where in the value chain do you see the most opportunity for AI applications? So if everyone could just take a moment and chime in on the poll. We have had a couple questions come in during the presentation so we're going to go into Q&A right after this poll is done. So don't worry, we'll get to questions and we'll have quite a bit of time for discussion after we're done with the poll here.
CRAIG GRIFFIN: And unlike the first poll question, which was about what you are currently doing, this is a question more about where you see it going-- where the most value is coming from-- which is a little bit different.
CRAIG GRIFFIN: The first question that came in for our panelists-- which we can get into once the poll finishes-- is, do you know why the current LLMs produce so many hallucinations when it comes to searching for academic sources? So just think about that for a second. It looks like the poll has ended, OK. And data analysis is a clear winner on that particular front.
CRAIG GRIFFIN: And that makes a lot of sense. Research as well, peer review. So many of the things that we've talked about, but definitely the use of data analysis seems to be where value can be realized the most. It's interesting. OK. Now let's get into questions. So, again, the first question was, do you know why the current LLMs produce so many hallucinations when it comes to searching for academic sources?
CRAIG GRIFFIN: That sounds like a Rich or a Jeremy question to me.
RICH DOMINELLI: I'll take a swing at it. So one of the important things to remember about ChatGPT or any of the LLMs is what they are designed to do. They are designed to generate text based on their training. They are actually text prediction engines. So they are building a statistical model of what the most likely next word is based on the prompt that you've created. Which is why you start getting these increasingly creative answers when you ask what you think is a simple question.
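A toy illustration of that text prediction idea: a bigram model that always proposes the statistically most likely next word from its training text. LLMs operate over learned representations at vastly larger scale, but the generative principle is the same:

```python
# Toy "text prediction engine": count which word follows which in the
# training text, then always propose the most frequent follower.
from collections import Counter, defaultdict

training_text = (
    "the ocean is deep the ocean is vast the ocean covers the planet"
).split()

followers = defaultdict(Counter)
for word, nxt in zip(training_text, training_text[1:]):
    followers[word][nxt] += 1

def predict_next(word: str) -> str:
    return followers[word].most_common(1)[0][0]

# Generate by repeatedly predicting the most likely next word.
words = ["the"]
for _ in range(5):
    words.append(predict_next(words[-1]))
print(" ".join(words))  # -> "the ocean is deep the ocean"
```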
RICH DOMINELLI: One of the early projects we had was trying to analyze some financial documents. We would take the financial documents and feed them to it. And if the answer wasn't there, or if it couldn't access the data we were looking for the answer in, it would come back with a seemingly accurate answer which was completely made up. So we've hit the same kind of issue. One of the nice things going forward is you're starting to see more and more integration with RAG toolkits, where the engineering of the prompt that gets sent to the AI includes your answer space, which should constrain some of the creativity.
RICH DOMINELLI: There are a couple of parameters you can set when you're interacting with ChatGPT's API to say, be less creative. There are prompt engineering things you can do to say, answer specifically from the data I'm about to send you-- that's part of the RAG solution. And it's getting better. And there are some things you can do after the fact to try to vet the solution.
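A sketch of those two levers-- a creativity parameter and a constraining prompt-- assuming the OpenAI Python client; the model name, temperature value, and prompts are illustrative, not recommended settings:

```python
# Sketch assuming the OpenAI Python client (pip install openai) and an API
# key in the OPENAI_API_KEY environment variable.
from openai import OpenAI

client = OpenAI()
response = client.chat.completions.create(
    model="gpt-4o",   # hypothetical model choice
    temperature=0.1,  # lower temperature = "be less creative"
    messages=[
        {"role": "system",
         "content": "Answer specifically from the data the user provides. "
                    "If the answer is not in the data, say so."},
        {"role": "user",
         "content": "Data: <financial filing excerpt>\n"
                    "Question: What was the reported net revenue?"},
    ],
)
print(response.choices[0].message.content)
```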
RICH DOMINELLI: We had another project where we were extracting authors and their affiliations from papers. And it would work probably 80% of the time, and 20% of the time it would make up random people that it had no source for. So it's important that anything you're doing with ChatGPT or any LLM has some kind of QA mechanism behind it to make sure that it's working.
RICH DOMINELLI: But, ultimately, they're there as a text prediction engine, not as a better version of Siri, as it were. And I'm sure Jeremy probably has something he wants to say in the same regard.
JEREMY LITTLE: Yeah, you covered it very well. I think just stressing the way that these are designed and built up is sort of what causes the hallucinations. It really is just text prediction. So without any kind of grounded sources or citations, the model itself isn't even aware of where the words it's generating came from. It's much more about word association than about reading some data and giving that back to you. So I think, Rich, you answered that very well.
CRAIG GRIFFIN: Fantastic. OK, more questions. So here's one: generative AI is notoriously abstractive, even going so far-- this is semi-related-- even going so far as to make things up and produce fictitious citations in order to answer a given prompt. How can scholars and publishers guard against this sort of unreliability if AI is going to be so heavily relied upon across all stages of the publishing workflow?
JEREMY LITTLE: So I'll jump back in here. I do think that, Rich, you mentioned RAG systems, and I think that's probably the most common way of guarding against this. And so RAG is really designed to basically constrain LLMs and their text prediction to only using primary documents. As in before an LLM even answers you, you would essentially instruct it to read a specific set of data and only answer from that data. So that really can constrain the hallucinations and can make the AI much more grounded in its responses.
JEREMY LITTLE: Remember that LLMs can read and understand data as well as generate text. So if you give it lots of data, it can read all of that and then generate from what it's just read in that same context. So I think that's probably the most common way of guarding against it, at least in the context of LLMs.
CRAIG GRIFFIN: Great. Thank you. Let's see. Another question. My understanding, which is clearly out of date, was that AI tools generally aren't searching the entire internet for data, but instead have a subset of data available to them. What are the current limits in terms of data available to publicly available tools?
RICH DOMINELLI: So GPT-4, until relatively recently, was trained on data up until September of 2021, and occasionally it would admit that. Usually it would try to fake an answer with its generative tools, but the point is that training takes time, and a model is trained on a subset of the data available up to that time. Now, since then, OpenAI and Claude and Grok-- you know, there's a whole list of them-- have started to integrate limited web search capabilities, so you are starting to see some interactivity.
RICH DOMINELLI: And that's actually a specialized version of RAG. I mean, what's essentially happening is it's going out, it's doing a web search. It's coming back with what that web search returned as far as its query, and then it's generating its answer based on that.
JEREMY LITTLE: Yeah, that's a great answer. I'll add one more thing to that. So when LLMs-- when you're interacting with ChatGPT, for example-- are built up, they actually become static models. So when you're interacting with ChatGPT, it's not actually adjusting on the fly. It's a static thing that, once it's built, stays in place. So like Rich said, once it scrapes the internet, it builds relationships between words and builds relationships within language, but from that point it doesn't change anymore.
JEREMY LITTLE: When you see more releases-- like when GPT-4 or 5 comes out-- those are future iterations that have been trained on more data sets. So that's an important distinction there.
CHHAVI CHAUHAN: Craig, can I just add a comment and possibly a question for Jeremy and Rich on this-- along the same lines?
CRAIG GRIFFIN: Yeah, go ahead.
CHHAVI CHAUHAN: Yeah. So one of the things that I keep struggling with is the scholarly publishing industry's content. You know, whatever we publish with the journals is so often behind paywalls that none of this content is accessible to the current large language models for their training purposes. But at the same time, there is so much engagement from readers and authors on social media platforms, which may be discussing some recent developments, that can be made accessible to these training models.
CHHAVI CHAUHAN: So I was wondering about your take on that, keeping in mind that it's not the peer-reviewed, vetted content in its original form that gets to train these models, but people's interpretations of that data. So how might that affect the quality of the content these models create, or maybe does that impact hallucination at all?
RICH DOMINELLI: I mean, there's the old term of garbage in, garbage out. One of the interesting things recently is that the website Reddit went public, and a principal component of its valuation was its use for AI training. ChatGPT and Claude and Llama 2-- which is the Facebook or Meta offering-- were all trained on Reddit information and Twitter information and what they contain.
RICH DOMINELLI: And I don't know about your Twitter feed, but my Twitter feed certainly has a lot of random, awful, often incorrect information going by. So, yeah, that is a factor in causing hallucinations, absolutely. You do see some offerings now from academic indexes like Crossref, which works with a company called Turnitin that is doing a lot of plagiarism work, and they do have access to a lot of those journals.
RICH DOMINELLI: I know IEEE has done some work with them on identifying plagiarism and ethical violations in academic papers. And they have the ability to train on some of these papers in their raw form, because they are getting the papers as part of their indexing process.
PHOEBE MCMELLON: Yeah, if I can just add to that, I think this is why it's really important. And we see it already in the news-- some of the perils of training content on the open web and social media. And certainly at GSW we've seen an uptick in companies reaching out trying to license the scholarly literature, because it is peer reviewed, it is high quality, and it has the potential to give better answers that are trusted.
PHOEBE MCMELLON: And I think in our community that's really important. I don't know any scientist doing serious research-- perhaps students might-- who would take an answer from ChatGPT or one of these generative models and use it without knowing the source. And I think that's where the RAG models become so important, because you are able to trace an answer back to the source and then decide critically whether you trust it, or use prompts to dig deeper into it.
PHOEBE MCMELLON: I've been playing around with it on topics that I know pretty well to see how good these models are. And there's certainly an art to asking the questions and crafting the prompts that you feed it. So I think we have to be wary in this time of making sure that we're not putting garbage in and getting garbage out.
CRAIG GRIFFIN: And in your experience, how good are the answers?
PHOEBE MCMELLON: So, being a geoscientist, I looked up the causes of the Permian extinction, which was one of the largest extinctions that we know of in the Earth's history. And it was pretty good, but it kept throwing in some other ideas that are no longer really the leading ones-- even though I kept saying, well, what is the most accepted reason for the Permian extinction now?
PHOEBE MCMELLON: It still kept wanting to throw in some ideas that are pretty much discounted or not the leading ideas. So, OK, but not to the level that I would write a paper on it.
CRAIG GRIFFIN: It's interesting.
JEREMY LITTLE: Let me actually address that from one other angle. You bring up an interesting point, Phoebe, where a lot of the outdated information sort of comes back. And I think that's a common thing we've seen. And I think it's useful to keep in mind how these models are built. It's about taking lots and lots of text and running it against a model until it's been built up. So the more text that points in one direction, the more statistical weight the model gives it.
JEREMY LITTLE: So this leads a lot of LLMs to cling to older concepts or older research that is maybe publicly available or free to use at this point. So when it comes to specific modern research, the LLMs can just be wildly off and wildly out of date without a grounding in the new take, the new research around a topic.
RICH DOMINELLI: I almost wish that there was a way of specifying meta information about your query, kind of like you have on Google Search-- please constrain this to only peer-reviewed sources published within the last 12 months, that kind of thing-- when you're sending these prompts. You can ask it to do that, but it kind of ignores you if you try.
PHOEBE MCMELLON: And the last thing I want to add about that is the biases that are being introduced. And what I mean by that-- and this goes back to Chhavi's point about adding in just what's available-- I don't actually know, perhaps Rich or Jeremy you know: the generative models today, are they using information from all different languages, or is it just English? Are we being very selective in choosing the scholarly literature or what's on the web?
PHOEBE MCMELLON: I just don't know how much from other cultures and other parts of the world these models are consuming.
CHHAVI CHAUHAN: Yeah, that's a very interesting point, Phoebe. And if you think about scientific literature, a term or a series of terms may mean something very specific in one scientific discipline, but it can mean something very different in another discipline. So I think that's another concern, because for a particular scientific discipline you may want to enrich the training data sets with content from that discipline so it does not spew out content
CHHAVI CHAUHAN: that makes sense in terms of stringing the words together but is actually not the right context for that discipline.
CRAIG GRIFFIN: Yeah, that's one of the tricks or challenges, I guess, with the output of many of these models: it sounds plausible, especially if you don't know the details. It seems like it's kind of right-- it's close or seems relevant-- but at the end of the day, to an expert it's way off base, to Phoebe's point. OK, let's shift gears here real quick.
CRAIG GRIFFIN: We got a new question. From the perspective of using AI to improve processes, a lot of the gains are from the perspective of decision making, e.g. peer review or initial screening. How close are we to having an AI that can do acceptable peer review? Seems like a Phoebe or Chhavi question.
PHOEBE MCMELLON: I actually don't really know the answer. I would imagine not that close yet. I think there are still experiments that have to happen. There are certainly talks of how you can perhaps streamline or make part of that more efficient. For example, giving a reviewer some key takeaways and a lay summary of the manuscript before they read it-- I think that's probably quite possible to do.
PHOEBE MCMELLON: And certainly I've seen some examples of AI tools, RAG models in particular, generating AI summaries that could help point the peer reviewer to the right place. But I don't think I've seen-- perhaps somebody else on the panel has seen-- a tool that is being actively tested on being able to peer review like a human.
PHOEBE MCMELLON: I think the future will be that there will be parts of that peer review process that editors and peer reviewers can certainly use AI for, to help make it more efficient-- to identify gaps in integrity or gaps in citations in a faster way. To do their job, essentially, just more efficiently and faster.
CHHAVI CHAUHAN: Yeah, Phoebe, you're absolutely right. We've been using iThenticate for the longest time to look for plagiarism, for example. So that's a quality check process. It has also expedited the peer review process, by letting our reviewers know the similarity check results we came up with; we're going to give authors an opportunity to respond and paraphrase their language. There is a new tool that we are about to start using in a week. And I've been in discussion with some other editors who are designing their own in-house tools, and that is to scope the article.
CHHAVI CHAUHAN: So right now it's the editor in chief at both our journals who looks at the scope of an article. So, essentially, the AI will be able to at least make recommendations on whether or not an article is within scope-- taking it closer to the point of outright rejection, or to the next step of recommending which journal the scope aligns better with. So we're actually also serving the needs of the authors and expediting the process of finding their science a home that would be a better match for it.
CHHAVI CHAUHAN: So this new tool that we are about to start using is from Elsevier, called Manuscript Evaluate. And I know some other people are building in-house tools. And going back to your initial suggestion, using AI in small doses for different aspects of peer review is probably the best approach. And I would highly advocate for a human in the loop and human supervision for any of these functionalities as we start improving efficiencies in the peer review workflow.
CRAIG GRIFFIN: Great. OK, I think we're--
PHOEBE MCMELLON: Sorry. Maybe we should call it augmented intelligence versus artificial.
CRAIG GRIFFIN: Yeah, there needs to be a driver in the self-driving car still. Here's another question. How reliable is AI language translation at this point, especially in regard to scientific research articles? I remember anecdotally hearing, when ChatGPT was released, that they only trained it on English-language text, but that it had picked up and inferred basically all of the languages of the world over time.
CRAIG GRIFFIN: I don't know if that's still true or not, but does anyone have any experience with the actual translation using one of these tools?
JEREMY LITTLE: I think within the context of LLMs it's probably a little weaker. LLMs are really just about the volume and quality of the data that goes in, but there are different kinds of models built specifically for translation that are probably more suited for it. I still think this is an area where a lack of data and a lack of investment is probably holding things back. And we'll probably see more and more of this as investment in AI grows.
CHHAVI CHAUHAN: And I think it may remain a little bit more challenging for the medical specialties, just given the nature of the language and its implications for human life. But as Jeremy said, I think for other purposes there's improvement in other areas.
CRAIG GRIFFIN: OK, great. We have room for one final question. As an editor, I'm concerned that anything I feed into an LLM could then be part of its pool of data that is used to respond to future prompts. Are there ethical concerns in feeding in an author's text to help refine, check, or shorten it, et cetera? In other words, are we feeding unpublished work into future models of the tool?
JEREMY LITTLE: So I think there's a distinction to be made here. To answer the question broadly, I think you should be very cautious what you put into these systems, especially when you're using the websites directly. So if you're on ChatGPT, especially the free tier, they'll be keeping track of that full conversation and they'll definitely be using that in future trainings. Now, that being said, there are third party tools.
JEREMY LITTLE: For example, Silverchair has a tool that we've been developing that sort of goes around this process and actually opts out of the data tracking. And that's been really crucial for our internal use and for some of our publishers to experiment with, because by going around the public versions of these things, you can silo the data off. So I think where you enter the data is very important, and it's important to read the terms of the exact tool you're using before entering any kind of unpublished research or proprietary information.
RICH DOMINELLI: Samsung, actually, famously discovered that a lot of their internal research had made it into ChatGPT. And what they did in response was immediately spin up a local LLM to silo the information away from the general public.
CRAIG GRIFFIN: OK, great. We are at time. Great discussion, everyone. There will be a supporting materials communication coming out, in addition to the recording. We've also put together some additional deeper information on everything we've talked about today. Thank you, all, for coming. And there are two more events coming up, one in May, one in June, around our AI lab.
CRAIG GRIFFIN: And thank you, all, once again for coming.