Name:
Breaking the Silos: Using AI to Fuse Copyright Law, Academic Knowledge, and Commercial Innovation
Description:
Breaking the Silos: Using AI to Fuse Copyright Law, Academic Knowledge, and Commercial Innovation
Thumbnail URL:
https://cadmoremediastorage.blob.core.windows.net/5d8b1fee-53f5-4829-adf9-1d991a56104d/videoscrubberimages/Scrubber_1.jpg
Duration:
T00H59M24S
Embed URL:
https://stream.cadmore.media/player/5d8b1fee-53f5-4829-adf9-1d991a56104d
Content URL:
https://cadmoreoriginalmedia.blob.core.windows.net/5d8b1fee-53f5-4829-adf9-1d991a56104d/SSP2025 5-30 1100 - Session 4B.mp4?sv=2019-02-02&sr=c&sig=qEfFBwP7jUkc5dmiYuy2duje%2BlhO1umLOlvd4ObT6gc%3D&st=2025-12-05T20%3A56%3A03Z&se=2025-12-05T23%3A01%3A03Z&sp=r
Upload Date:
2025-08-15T00:00:00.0000000
Transcript:
Language: EN.
Segment:0 .
Hello, everyone. Thank you. Thank you for joining us today. I'm Roni Levi. I'm a licensing and legal advisor at the Copyright Clearance Center, and I'm really excited to be moderating today's session on an exciting question facing scholarly publishing today.
So how can we break the silos between the traditionally separated worlds of copyright law, academic publishing, and commercial innovation to build something smarter and more integrated with AI? We've seen headlines about AI this entire SSP meeting. I think half of the presentations this morning, if not more, were AI related. So by the time you come to this room, you've likely heard about AI from multiple angles.
And so what we're going to try to do today is put it together into a more practical approach to AI and the challenges and opportunities we face. I'm joined by two extraordinary panelists who bring distinct yet complementary lenses to this conversation: Simone Taylor, chief publishing officer at the American Psychiatric Association, and Diane Harnish, senior consultant at Delta Think.
So together we're going to explore what implementation really looks like and what kind of infrastructure, policy, and collaboration it will take to get it right. Let's begin by setting the stage with insights from a recent study conducted by the Copyright Clearance Center, CCC, in partnership with Research Consulting. We heard from over 60 society publishers worldwide, and the findings were illuminating, yet not that surprising either.
The key headline is that AI is a paradox. It was ranked number one as the biggest challenge faced by scholarly publishers, and also number one as the biggest opportunity facing society publishers. Nearly half are already licensing or plan to license their content for AI use. However, many feel ill-equipped to make decisions about copyright, content, permissions, or metadata management in this new context.
Only 35% of societies have updated their author policies in response to AI. And this tells us something about the governance gap that we're seeing. So content is being ingested, mined and transformed through AI. And our author policies are still lagging behind. Meanwhile, open access policies are also under new scrutiny.
Some societies are reconsidering Creative Commons licenses. Some are considering moving from CC BY to CC BY-NC, the non-commercial variant. And they're not doing this out of principle, but really out of a need to slow down uncontrolled and unattributed use. We are hearing that CC BY licenses are limiting the ability to license content for commercial use cases.
So we understand that some publishers are looking more closely at this, and they're including author education around the implications of different Creative Commons licenses to help authors make more informed decisions over time. As we move deeper into the legal and policy implications of AI, I want to unpack how copyright frameworks are being tested and transformed by generative AI. I'm not going to go into a lot of detail, because there was a session yesterday that covered that, and there's another one coming up later today on fair use as well.
But what is the reality that we're living with today? AI systems are being trained on massive corpora that include scholarly publications, often without permission, without attribution, without licensing, and sometimes in ways that can directly compete with the original work. From a copyright perspective, this raises fundamental questions. What rights need to be cleared when content is used for training or prompting?
What obligations exist when outputs resemble or derive from source material? And how do these obligations differ across jurisdictions? The truth is that there is no uniform answer. We are looking at a patchwork of regulatory approaches and a great deal of uncertainty. There are over 40 active lawsuits in the US alone addressing different facets of how generative AI systems interact with copyrighted works.
Even once the courts weigh in, we're unlikely to get sweeping clarity. The legal outcomes will remain highly fact-dependent, and the global picture will continue to be quite fragmented. This complexity is one of the key drivers behind the surge of licensing activity that we're now seeing. Publishers and AI companies are increasingly striking direct deals. Many of them, if not most, are confidential, and these deals attempt to reduce legal ambiguity, manage risk, and support innovation.
From CCC's vantage point, we believe the infrastructure for collective licensing is already in place and has evolved to accommodate new uses like AI training, fine-tuning, prompting, and retrieval-augmented generation. Our white paper, based on the survey I mentioned, also reveals another important insight: many society publishers feel left behind, uncertain how to protect their content, participate in licensing, or even assess whether their works are being used.
And this is where collective action matters. Industry coordination can ensure that smaller publishers aren't left out of licensing conversations. It can also raise the bar on AI ethics and transparency across the board. Throughout the research, we found a strong desire for societies to collaborate and work together to innovate and respond to emerging challenges. As community- and research-led organizations, greater collaboration appears to be a prerequisite if societies are to successfully navigate the commercial and technological challenges they face, and to help preserve the integrity of the scholarly record for the communities they serve and for society as a whole.
If we want AI systems that reflect the integrity, rigor, and quality of the scholarly record, we need to start by building licensing frameworks that respect the creators of the content, globally, practically, and responsibly. And this brings us to the heart of today's session. How do we make this real? How do publishers manage AI inside their content processing and publishing workflows?
How do product teams handle the messy innovation journey? And how can we implement policies and guardrails that support integrity, innovation, value, and values? So I'll turn first to Simone. Simone, help us see and better understand what's happening inside publishing houses. Thank you. Hello, everyone, and thanks for joining us on what I thought would be a really quiet Friday morning with everyone gone.
I mean, it's really nice to see this crowd. As Roni mentioned, there's been a lot of talk on AI at this conference, and if you heard me speak on Wednesday, forgive me if you've already heard some of the things I'm going to say. Artificial intelligence is really nothing new; it's something we keep talking about. A couple of decades ago, the talk was about neural networks.
It was about pattern recognition and machine learning, and these are things that have now evolved into more specific applications, such as facial and voice recognition. It was interesting that last year the Nobel Prize in Physics went to John Hopfield and Geoffrey Hinton for foundational discoveries that enable machine learning with artificial neural networks. And in publishing,
we have historically used these technologies extensively in data analytics to improve our publishing processes and develop new products. They continue to be used in comprehensive content classification systems, for instance, that allow us to index content for search and retrieval further down the line. Other examples include editorial and production processes, where we have automated reference checks, cleanup, and formatting, and the increasing use of customer data platforms that help us understand behavior and offer insight into new product development.
Now, generative AI and the launch of large language models that create new content have changed our vocabulary dramatically, and we now use the term AI to cover everything from machine learning through to creating plain-language abstracts and generating visual representations of conversations that never really happened. And this is what threatens to transform the way we live, work, and learn.
But what opportunities does this provide? I mean, clearly, language models are only as good as the data that are put into them. And from a publishing perspective, for small, community-based, community-focused publishers in particular, who invest a lot of time and effort in generating accurate content and doing statistical analysis, this is an important opportunity, as Roni mentioned, for licensing that kind of work.
But there are also other considerations, because we have an obligation to the authors and creators who provide that work to make sure that any licensing is done fairly, and we have an obligation as curators of medical content to protect the privacy and confidentiality of the people whose data is mentioned. So while there is an opportunity, I think the guardrails do need to be in place that will allow us to do this effectively.
Thank you. Does that answer your question? It does. Now, Diane, you've been working with publishers large and small, helping them take on some of these challenges and opportunities. What are you seeing? We're seeing a real range. Some of our clients and publishers are really standing on the sidelines,
starting to figure out how to dip their toe into the water, especially from a commercial perspective. We see those clients taking good steps toward experimentation in workflows. I think we've all been hearing about that, especially in this morning's sessions, about the great tools that the vendors and service providers within our industry are bringing forward, and really experimenting with those as well.
We're seeing, though, in recent months what I consider to be a very exciting component of this effort. It's moving faster than ever; it's not an evolution, it's a revolution. There's a topic we'll talk about in a little while that is a clear example of that. But what we're seeing with our clients is an exciting opportunity: investigating exactly what's happening at the user level, going beyond the aspects of workflows and internal efficiencies, and really turning back to the people that we all serve first and foremost, our researchers and our practitioners.
I'm a specialist in education, and I look to those students and those young researchers. Digital-native learners will never know a day in their lives when they aren't interacting in some way with artificial intelligence. I've studied the platforms in the K-12 area, and AI agents have been there for a long time, so we need to think about those future users.
So we're excited that that's some of the work we're able to do at Delta Think with our clients. It's similar to work we've done for a long time: really talking with the market about their workflows, what they do, what they need, and the challenges they have, and working with our clients to bring forward enhancements to their existing platforms, and certainly the excitement of a possible new product launch. It's not necessarily something that everyone in this room is thinking about today, but we're excited to say that we feel the tide is starting to turn. It's no longer, as it was 12 or even 18 months ago, a question of "should we be doing this?"
We have more clients that are starting to say, we know we need to do this; let's figure out how we integrate AI capabilities at the user level. And that's really important. I was reminded this morning, when I was reading some headlines in the newsletters that I follow, that the future is already here; it's just not evenly distributed. And I'm just going to read you the headline: "Zochi achieves main conference acceptance at ACL 2025."
Now, what's hidden in that headline is that Zochi, if you're looking it up, is an AI agent that autonomously carries out the entire scientific process, from generating the area of study or research, to conducting the research, to validating the work, to producing the paper.
And so, from end to end, it was done autonomously by the AI agent; I'm quoting from the announcement, "without human involvement except during manuscript preparation, typically limited to figure creation, citation formatting and minor fixes." And so that's a really interesting example where, at the user level, the entire process is AI generated all the way to the publication.
Of course, it's not everywhere. This was a beta; Zochi is not yet available, though it's going to be made available to researchers shortly. So we're not seeing this everywhere. It's a first of its kind, but it's perhaps the beginning of things to come. And as you mentioned, the acceleration of change, the adoption, is incredibly fast.
What does this mean for scholarly publishers? Strap in, we're moving. That's what it means. I'm a big believer in this: I'm a marketer by education and for most of my career, as well as a business strategist.
So it really means watching how the market is unfolding in front of us, and it really is unfolding. The commercial innovations that are happening are not just incremental; they are revolutionary, and they are happening much faster. Many of us in the room were part of the industry during our digital transformation. It took us about 10 years to figure out new workflows.
It took us some time to figure out new business models. We need to be prepared to work as an industry a bit faster than we're accustomed to. That's an uncomfortable space for a lot of us, especially because, as Simone was saying, we have such a responsibility to protect the copyrights that we are representing and working with. So how do we strike the right balance between the pace this industry is moving at and the innovation, while staying close to our values?
And one of the things we will also talk about is how we do this across our organizations. Because it's not just a consideration for our editorial teams or our production teams; it's really across the whole organization, and it's part of the organizational culture and strategic mission that everyone needs to really start to embrace, I believe. Thanks, Diane.
I think there's no question that artificial intelligence will be transformative to the way we work, in much the same way as the printing press was in its day, and the computer. The important thing, though, is that I think it will take us some time to get to the stage where it's fully trusted. And I also think that maybe we don't have to get to that stage, in the sense that our tasks will change and the role of the human becomes focused on catching the things that the artificial intelligence gets wrong.
So we have that sort of complementary way of working through it. I think there is a lot of excitement about what AI can do, but I think we have to approach that with a great deal of caution, too, because AI can get things wrong, and it can get things spectacularly wrong. So we need to make sure that we maintain that balance. I think you raise a really important paradox. Well, both of you do, from either end.
So on the one hand, caution is absolutely key. There is the integrity of the research, the trust in the research, and the accountability that is heavily felt by scholarly publishers. So caution, which means going slowly and carefully, looking at the technologies and adopting them, is really important. And then on the other hand, there is this incredible speed of innovation that requires people to keep up, because the users, the researchers, are using the technology faster than scholarly publishers are likely to take it on, because publishers are being way more cautious, rightfully so.
But this creates the paradox. So, Diane, what's your advice on how to navigate both this really important caution and this incredible speed of adoption by the researchers themselves? As a starting point, it's not easy; it's a challenge, there's no question. And I think it really starts organizationally with your core mission: really understanding what your mission is to your members, your industry, your constituents, and making some very serious considerations about where AI fits into that. Are we embracing it as an organization?
Is it happening to us, or are we going to drive it? That's a big part of it; it's not going to go away. So how are we looking at it from a strategic driver perspective? I think that's really one of the key places it starts, because that becomes your North Star for how this is going to fit into the work that you do, not just as an efficiency or workflow enabler, but in how we are going to think about this.
And it's interesting because, again, at Delta Think we have clients across all of STEM. In certain areas we've been doing work in health, like Simone, and our clients are integrating and investigating AI capabilities in live products out in the market. We're helping them with that testing, right at the user level.
And the core of it is that you don't let all of your good practices in product development and user testing stop. We need to continue to have the same rigor and the same oversight that we've always had as guardians of that content. So it's really about ensuring that, while knowing that we need to move at a little bit more pace than we might be accustomed to.
Again, we are where we are today, and we've been working on some of these products for quite some time. The landscape of where we started with our clients 8, 12, and 18 months ago has dramatically changed, and one key part of that is that even the competitive landscape has changed so dramatically. So I think it's about making sure you have some serious conversations across the silos within your organization to really understand where this sits as a strategic driver for you.
Yeah, agreed. One thing I would say, though, is that caution does not necessarily mean a slow pace. It just means taking care while you're moving at a fast pace. Yeah, I love that. I think that's a really important clarification. So there are efficiency gains, and then there's the headline this morning.
That was real disruption. But let's start with the efficiency gains, which is where you start rolling up your sleeves, getting comfortable with the technology, learning about it, and seeing how it could improve things. What would be the first three things? If someone has been sitting on the sidelines observing and getting ready to dip their toes in, what are the first three things they would need to do on policies or guardrails to get their organization ready to start going in, and maybe on the use cases and technology as well?
Maybe Simone, you want to start? Yes, I can try. What very many publishers did at the outset was to define policies around content, and that was something we did, along with very many other publishers, on the use of AI within material being submitted for publication. So an artificial intelligence system cannot be an author.
And if you do use artificial intelligence in your work, then you're obliged to let the publisher know. For us, we don't accept AI-generated material, and we don't accept images that have been produced with AI. But I think the first thing to do would be to draft your policies around what's workable and what isn't, and also to understand that artificial intelligence is only one part of the technology that can help you innovate.
It is a very influential part, but there are also other pieces of technology that help you solve the problems you need to solve to get your business moving forward. Being comfortable with any new technology is about setting the right frameworks and the right guardrails around which people can coalesce.
I think getting comfortable is also really important for moving forward. One of the first steps is really just knowledge and education. I mean, this came at all of us, in our consumer lives and our professional lives, like a freight train a couple of years, or 18 months, ago. So education: sitting in these types of discussions, but also staying close to what feel like by-the-minute news releases about the latest developments and impacts, and experimenting, putting that toe in the water, taking lessons learned.
We've heard about numerous vendors and partners that have created sandboxes for experimentation. Even setting up think tanks within your own organization, or some time to think about this innovation in a very targeted way, gives you time to really absorb, learn, experiment, and get comfortable and confident, and then to work within those guardrails to see what's best for you, from the view of your workflows and your user base, and how you move forward in an impactful way.
Thank you. So we've seen a lot of presentations this morning around efficiency gains, a lot of them using AI, but not exclusively AI for sure. How about new revenue opportunities? One of the easiest ones, quote unquote, to tackle is licensing of content for AI uses.
We've seen some announcements from the larger publishers. How are small and medium sized publishers to embrace that opportunity, if they so wish? What should they be looking at? In addition to announcements from commercial publishers, there was one from ASCO, the American Society of Clinical Oncology, a couple of weeks ago, where they've partnered with Google,
I think it is, to produce a highly effective search engine. And forgive me if I'm getting this wrong, but they're doing it across their practice guidelines. And this is a good example of how you might use artificial intelligence, limited to a closed loop of content, to mine that data and answer questions for the people who could make use of it.
And that was a really good example, and I'm sure it will spur other similar applications in that space. I couldn't agree more. It was a very timely announcement, with the ASCO conference happening, I think, this weekend. I'll go back to talk a little bit about how your commercial innovations and your revenue-generating innovations really start from the core of your customers' needs, wants, and motivations.
We do a great deal of work on this with our clients, and that's the work that all of you have always done to really understand how to bring new products to market. Good user definition is really critical. I will say that, for me, diligence in a couple of different ways is important. One is really understanding what your vendor partners are doing.
As we've been saying, at the user level, when you think about the platforms, not just for journals but also those other platforms, if you have content that's book based, if you have education platforms, understand what those vendors are doing to enhance the user environment and their commercial opportunities. Because the more we please our users, the more satisfaction they have and the better the retention you have.
There are a lot of products out there that are sold outside of academic institutions to individual learners and for professional development, and those education platforms are introducing very interesting user-based AI capabilities. And this stems from what I was saying earlier about what's happening even in the K-12 market: those platforms in the K-12 market have had AI agents for a very long time.
So these students are growing up through their early education wanting these types of tools. So talk to your vendors about what capabilities they're bringing to your platforms. The more satisfaction you have at the user level, the higher retention you will always have within your products and your user base. And I should add as well, on collective licensing: if you're not already working with the Copyright Clearance Center on our Annual Copyright License or our other offerings, and have not yet signed the grant of rights for the Annual Copyright License or the AI Systems Training License,
this is an easy way to start offering some of your content for internal reuse under the ACL, or externally for training under the AI Systems Training License. So don't hesitate to reach out to us. And I would just add to that: having had some different conversations with folks, we like to be able to guide our clients in the directions where they have questions and whatnot. And as for the agreement, it's especially relevant if your markets bridge into the corporate sector.
We have certain clients where that's absolutely the case. It's not just academia, core academic research; it's also the impact that you have in the corporate sector. A client shared with me not too long ago some feedback from their board meeting, a board that happens to be a mix of researchers and corporate personnel.
And the corporate members of the board said that in every corporation they work in, this is happening. So those corporate use cases are real, and they're unfolding in front of us. So this gives you a great way to actually get your toe in the water on licensing with the collective license. I'm just going to ask one more question of the panelists, and then I'm going to open it up to the floor.
That's where the more interesting questions come from. So one of the things that the headline I mentioned this morning highlighted for me is that, again, the future is already here. When we think about the future of AI and its impact on scholarly publishing, we don't have human readers interacting with the output of scholarly publishing anymore; we have machines interacting with the output of scholarly publishing.
And so in this case, thousands of articles were ingested by the AI agent in order to carry out the whole flow of scientific discovery and research output. In that scenario, the audience for the scientific journals was an AI machine. And if that becomes the dominant, quote unquote, reader in the future,
What does that mean for the role of scientific publication. What is the mean as well for the nature of the content. Are we producing articles in the same way. Is the output the same now that the dominant reader is another machine. Early thoughts I know this is we won't hold you to it. Yeah, this all sounds very futuristic, although it isn't.
I mean, as you say, it's here. But you also raise an interesting point: the content put into this system was papers that had already been written. So eventually there's going to come a point where these systems need some input from somewhere, and I'm not sure where that's going to come from. The other interesting point is that the endpoint is a machine.
But what is that machine going to do with all this information? Is it going to talk to other machines, or is it eventually going to talk to people? So I think, exciting and transformative as this is, there's still a lot to work through and to see how it would eventually become a useful piece of technology. I agree. I mean, it was very timely that this appeared this morning.
But I think, as I was saying earlier, we're learning as we go and assessing. Some of these releases are going to feel very futuristic, like this one, but they're real. So thanks very much. What's the springboard to next?
So I think it's again having us think a little bit beyond our comfort zone, potentially, and beyond the world and the pace that we know, to really ask: how big and how fast is this going to unfold? It's happening quite fast. So a foot in today, where it's about efficiency, and a foot in tomorrow, where it's really disruptive, and looking at both horizons at the same time.
That's the challenge. Questions from the floor? Is this on? Yeah, there we go. Hi, Roy Kaufman from CCC. We know from reading papers that AIs are being trained, including AIs we might even be using, or being trained by,
from preprint servers, from PubMed Central, from Sci-Hub. And it sort of leads me to this conclusion. So I'm going to make a comment and ask you to respond. If publishers are sitting on the sidelines, their content isn't, right? The content is out there; their content is being used because it's in Sci-Hub, it's in the Books3 dataset, it's in all these things that we know everyone is using for training.
And so how do we close that gap? I mean, licensing is one thing, obviously, that we focus on. How do we close that gap between publishers? Because when you're sitting on the sidelines, all you're doing is ceding control; you're not doing anything else. How do we as an industry, SSP and others, start closing that gap? Yeah, that is a good question, Roy.
And I do think it is important that publishers close that gap. The challenge, or the trouble, is that at the moment there isn't an infrastructure that allows this to happen at a normal level. And I know CCC is trying to do this and create that infrastructure, but I think it would be welcome to have something where publishers, as curators of rigorously peer-reviewed, highly curated content, are able to say: this is high quality data that you can put into your LLM, because the chances are that you'd get better results from using it.
So yes, I agree. Does that answer your question, Roy? Did you want to say something? So infrastructure: there's infrastructure missing to close the gap. More questions? Hi, I'm David Sampson from NEJM Group, the New England Journal of Medicine.
And responding to Roy's question, if you're a smaller publisher, you need to really figure out how to be in the game, in the deal flow. The big AI tech companies are going to argue that it's very difficult to license content from publishers, and they have a point, because they can't go out to thousands of publishers to try to strike licensing deals. I think the infrastructure is being built now, whether it's collective licensing or perhaps working through a commercial publisher if you're published by one.
But Roy is quite right: you can't sit on the sidelines, because your content is being used as we speak by these companies, and there is a growing long tail of other opportunities for licensing your content, and it's not necessarily your entire corpus of content. So that's something that you need to think about. And as you prepare to license, make sure you catalog your content, whether it's text based or multimedia based, because that is a question a potential licensee will ask you: what content do you have, in what formats, and what can you tell me about the content? In these deals, you don't want to
say, "I'll get back to you in four weeks with that information." David, that's a great point. I mean, readying your content is something we should always be thinking about. But what does that look like in today's world? Does that have a nuance that we haven't thought about yet? And so I think you're right that small publishers, large publishers, and everyone else have to get in the game, and part of getting in the game is, again, what you might consider doing yourselves: is the opportunity compelling enough to create a small language model in a particular niche area? Even when you think about the ASCO situation, they took their clinical guidelines.
I'm just hypothesizing here, but they took their clinical guidelines, put them in a walled garden, and really started to train on that in a manner that will be meaningful for clinical decision support. That same type of paradigm can really be applied across the board. It's happening in education platforms again, where students can create personalized assessments on the fly, tailored to their learning objectives and their weaknesses, based on a select corpus of content.
So that type of model is proven; you can do it. Licensing is one way to get in; getting moving on some of these other considerations is another. And it may not be all your content. It may be a question of where you can start. Is there a compelling opportunity? I think, again, the ASCO example is really interesting to me: ASCO has a ton of content, but they looked at one use case, one need in the market.
And even for them, it was probably an element of, let's go and do something and learn, and see how we might then be able to apply that and scale that over time. Thank you. David, you raised several really important points. Yes, they'll argue that licensing is difficult, that it's not available. But one of the important risks there is that if you don't license it, you will lose it.
Now, you lose it in reality, as Roy pointed out, because it's being used anyway, whether you're licensing it or not. But from a legal perspective, as I've mentioned, there are over 40 court cases in the US, and the core issue there is whether or not it's fair use. There's going to be an Oxford-style debate later today on whether training is fair use. And one of the key elements in the assessment of fairness is how easy or difficult it is to license the work.
So if it's difficult to license, there is a higher likelihood that the use will be considered fair. And that's also one of the reasons why CCC was out there early to put together collective licenses around these kinds of uses, so that they cannot claim that it's not possible to license it and that the use should therefore be considered fair. On readiness of your content: absolutely. At times we work just on an ad hoc basis brokering deals, and if you're not able to return an answer to the person who's looking for the content within days, sometimes less than days, the opportunity disappears really, really quickly.
So you can't wait for the opportunities to come to you to get your house in order with respect to using or allowing the use of your content in AI. You need to be ready for the opportunities when they come, and the long tail is absolutely key. A lot of the focus right now is on large language models, the big platforms: OpenAI, Meta, Google, et cetera. But the largest opportunity that is emerging is all of the smaller players, corporations that are trying to build their own.
Maybe it's a small language model, maybe they're fine-tuning a large language model for a very specific use case, and they need good, reliable data. That's where the opportunity is, just around the corner. There's someone in the back. No? OK, thanks. Good morning.
Thanks for the informative discussion today. Sharon Mattern from Research Solutions. I have a question about the collective licensing. Approximately how many publishers do you have signed up for it to date? And if publishers don't sign up, what's the main pushback? That's a good question, and I have some colleagues in the room who are probably better equipped to answer it.
Mary Aaron, why don't you take it? See if you can answer that question. Thanks, Sharon. Thanks for putting me on the spot. At the moment, we have several hundred rightsholders signed into the AI extension to the Annual Copyright License. And I think the second part of your question was, what's the main downside if you don't do it as a publisher?
And I would say, going back to some of the things that we've talked about already today and going back to David's question, it's sort of a use-it-or-lose-it situation. They're definitely using your content already, whether you know that or not. And so from our perspective, the two main negative consequences of a rightsholder not signing into the AI rights through CCC, or also, I will say, through some of our other friends around the world, are that, one, there are unremunerated uses of your content as a publisher, and two, the likelihood of severe executive or governmental intervention, whether it's in the United States or elsewhere throughout the world, is significantly higher the fewer rightsholders we have.
So it's kind of an all-boats-rise-with-the-tide situation. And I'm hoping I'm answering your question, Sharon. Which is it: fear, apprehension, not wanting to give away those licensing rights? Well, I spend most of my day answering that question, and I think it is fear and apprehension. The other thing is that many publishers are really interested in doing direct licensing, and I think one of the key tenets of our license, but also of other collective licenses and many licenses, is that they're non-exclusive.
There absolutely is much opportunity out there to do a ton of really incredible direct licensing, whether it's with small language models, large language models, individual organizations, or universities, and also to do collective licensing. Collective licensing is the dump truck that comes behind all the fun stuff; it's like, we'll just catch all the cool stuff that falls out of the bucket.
So I think you really want to put those LEGOs in a plug-and-play model. You want to stack things on top of other things. But it's basically like, don't eat my lunch. We're not eating your lunch, guys; we're collecting the pebbles at the end. Does that answer your question? Now, I've spent way too long answering. If I could add just one other element:
I think a lot of the reason why some publishers are not signing on yet goes back to that AI readiness. They haven't actually sat down; everybody's super, super busy, and this has just been layered on top of everybody's to-do list. They haven't had the time to sit down and digest what this all means for them in order to start making decisions around licensing.
And so, I think a lot of the time, they just haven't taken the time to educate themselves and to get ready to make those licensing decisions. But as David mentioned, you either license it or you lose it. The court cases are going on right now, as we speak. These decisions, and the policy implications of or reactions to AI companies not having the ability to license works, are happening live.
That's happening right now. And I'll just follow on. The reality that there is a collective license is evidence that the need is there in the market. So to me it really says: get the revenue that you can for your valuable content. The guardrails are in place.
And at the same time, go speak to your users about potential proprietary, one-off extensions, whatever it may be, to existing products, so that you can really solve directly for a need. Because when that collective license was launched and we saw all the press about it, I said the need is there, and not just the need, the use is there.
So what else is there in the market that we should be really thinking about and contemplating as well? Question over there. Thanks, I'm Jessica Myles, founder of the Informed Frontier, and we're a consultancy serving emerging and established enterprises in science, technology, and innovation.
Thank you all for having this forum. It's been super informative. One thing that I wanted to make sure we touched on was the recent preliminary report from the US Copyright Office on generative AI and copyright. Certainly, I don't want to veer too far into fair use because, as we've said, that's coming this afternoon. But the report did touch on collective licensing, both voluntary and compulsory, and it also spoke a little about potential antitrust implications and potential changes in antitrust policy and law around that.
How much do those sorts of considerations, even though the law is sort of unsettled, impact how you develop your collective licensing agreements and how you're working with clients in the space? That's a great question. Thank you. So, one thing to highlight about that report, which just came out maybe two weeks ago: it's from the US Copyright Office.
It's the third report focused on AI, and it's about training with copyright-protected content. There were no huge surprises there. It is what we expected it to be, which is: well, it depends. That's essentially the takeaway. There are some extreme cases on the one hand where the use is likely to be fair.
So the training is for research purposes, the output does not compete with the market for the work, and a very transformative use is being made of the work. These are the things that will lead, on the one hand, to a higher likelihood of fairness. On the other hand, where it's unlikely to be fair, the use is not that transformative.
It's a commercial use, so the training is for commercial purposes; the output competes with the market for the work; or there are licenses readily available for making a training use of a work. And so there's a whole section in the report on licenses, specifically because it is such an important part of the analysis of fairness.
And in that analysis of licenses, they go to great pains to say it could be one-on-one licenses, it could be collective licenses. There's a recognition in the report that the licensing market is not yet completely where it needs to be. So there is more work to be done by rightsholders to ensure that we have a well-established, well-functioning licensing market, so that at the end of the day, as often as possible, we're able to license, instead of things being declared fair to use without payment or authorization.
So that, as I mentioned, is one of the key reasons why we went out there early, knowing that an important component of protecting and enforcing your copyright has always been making sure that you have a licensing market. And I've worked closely with our rightsholders, and continue to do so, in order to really strengthen that licensing market out there.
Any other questions? I think we might have time for one more. One more question? Oh, come on. I'm just going to keep rotating until somebody takes me up on this. Thanks, David. Don't make me embarrass myself.
David to the rescue. Aaron, this is maybe a sassy comment rather than a sassy question, but as a publisher, and especially if you're a smaller or medium sized publisher, if you think you're going to be successful trying to do direct licensing deals with these AI companies, especially the big ones, dismiss that thought, because we have tried, and some of these big companies will dismiss you outright, saying, no, we're not interested.
Come back to us six months later. It is a slog. And then if you do get to the deal stage, that too is a slog. So you have to weigh the pros and cons of trying to do direct licensing deals versus working through a collective licensing arrangement or an alternative. But there's a long tail of opportunities out there, and you do have to get in the game; but make that decision:
Do you think you can do it on your own, or do you need to work with a partner? I promise you that we did not plant him in the room. I was going to say, your check's in the mail, David, your check's in the mail. And what I want to highlight as well is that you can do both. You can do it on your own and work with CCC, and not just CCC, with other aggregators as well. The key message I would like you all to walk away with is: you've got to license your content.
However you choose to do it, you've got to license your content, and you've got to do it fast, before you lose it. And go talk to your users about how you might explore the opportunity to do some type of value-based solution directly for them. At the core of our values is the mission that we have in supporting that user base. So continue to bring out great content and great products targeted to that particular user base.
Some of these licensing deals serve markets well beyond the core audience of your mission; but in talking to your users and your members about how we can continue to be of value to them as an organization, we need to go talk to them and see what those solutions might look like. So with that, I would like to thank Diane and Simone, and thank all of you for joining us today.