Name:
Data and analytics, outcomes and budgets
Description:
Data and analytics, outcomes and budgets
Thumbnail URL:
https://cadmoremediastorage.blob.core.windows.net/c8a9ceeb-c0a9-4544-aafb-f4d0a7e45f28/videoscrubberimages/Scrubber_1.jpg?sv=2019-02-02&sr=c&sig=XbQHkQ6hRXUOy%2FFUIhM1Nru%2B0biry63pJYJ0%2BBguEoQ%3D&st=2023-09-21T09%3A19%3A56Z&se=2023-09-21T13%3A24%3A56Z&sp=r
Duration:
T00H34M23S
Embed URL:
https://stream.cadmore.media/player/c8a9ceeb-c0a9-4544-aafb-f4d0a7e45f28
Content URL:
https://asa1cadmoremedia.blob.core.windows.net/asset-aa5bc349-c84b-48ac-af88-4efdbc3bb45b/44 - Data and analytics%2c outcomes and budgets-HD 1080p.mov
Upload Date:
2021-06-03T00:00:00.0000000
Transcript:
Language: EN.
Segment: 0.
[MUSIC PLAYING]
MICHAEL HABIB: Hello. This session is on data and analytics, outcomes and budgets. I'm your moderator Michael Habib. We have three speakers today representing different stakeholders in the scholarly communications ecosystem. First, representing the perspective of academic institutions, we have Curtis from Iowa State University, who will give a look at how they're using data to support OA agreements.
MICHAEL HABIB: Next, we have Shelley from Emerald for the publisher's perspective, discussing what is needed for the successful transition to open and the changing needs of understanding impact in a more open environment. Lastly, we will hear from Steve with Altum, who will look at how funding agencies are tracking research outcomes through their grants management systems. Following the talks, we will have separate breakout sessions on the two primary topics of the session-- impact transparency and research outcomes-- as well as breakouts on using data to support OA agreements.
MICHAEL HABIB: With that, I will hand it off to Curtis.
CURTIS BRUNDY: Thank you, Michael. Are you seeing my presentation? OK, well, welcome, everyone. My name is Curtis Brundy. And in my role as AUL for Scholarly Communications and Collections at Iowa State, I oversee both our collections program and our scholarly communication efforts as they apply to negotiating and implementing open access agreements.
CURTIS BRUNDY: So what I'm going to talk about today is getting at the data-- how it informs, and how we're using it to support, the open access agreements that we're doing. And I'm going to talk about it in three levels, or three use cases. I'm going to talk about the data that we're using to inform our strategy. When we're doing particular negotiations with an individual publisher, what kind of data are we looking at there?
CURTIS BRUNDY: And then finally, I just want to throw out a couple of pieces in regards to workflows. Because you come up with a strategy. You enter a negotiation. If you're successful, you actually have an agreement at hand. And then you have to implement it. And that workflow piece is actually really, really important.
CURTIS BRUNDY: So starting with strategy, I have my big, scary publisher profile pie chart here. And I do not want people to try and squint and look at who these individual publishers are. You're welcome to look at it and zoom in on your own. But more generally, what I wanted to get across in showing this is that one of the first things you need to do from an institutional perspective-- and I feel it's really, really important as you start negotiating these types of agreements-- is to switch from the mindset of being a subscription negotiator into the mindset of being an open access agreement negotiator.
CURTIS BRUNDY: And for that, you really need to understand the publishing profile of your institution. And to get at that, you have to do some data work. And underlying this pie chart here is work that we've done using the data from Web of Science. And I'll talk a little bit about why that's our data source for this. But when you think about strategy, the reason why this pie chart is important is because it just gives us a lot of insights into what's happening on campus, where the articles are coming from, the relative importance of the different publishers.
CURTIS BRUNDY: And a few things just jump out from this. And this is consistent with other institutions that have done this type of analysis. Typically, there are five publishers that will make up approximately 50% of the corresponding authored output of an institution. So that's an important point you can take from this. Who are your five most important publishers, and what percentage do they make up on your campus?
CURTIS BRUNDY: And there's 20 publishers here, and these 20 publishers represent 85% of Iowa State's corresponding authored output. So doing this analysis, you get to actually put names and numbers to each of these publishers. And so when you're thinking about a strategy on who should you enter into a negotiation with for an open access agreement, this gives you some idea of the impact, the number of articles involved. And then I would just say that the companion piece to this pie chart is one that includes the subscription numbers, and where the spend for each of these publishers actually is.
CURTIS BRUNDY: And then the last thing I would mention about this is, in that Other category, that's about 15% of our article output. That is where the long tail is. And in our case, there's about 100 publishers that are in that 15%. These are publishers who, in some years we won't even have an article with that publisher, and some years, we might have one.
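To make the shape of that analysis concrete, here is a minimal Python sketch of the publisher-profile calculation behind a chart like this. The publisher names and article counts are invented for illustration-- they are not Iowa State's actual figures-- but the arithmetic mirrors the pattern described above: five publishers at roughly 50%, twenty at roughly 85%, and a long tail.

```python
# Hypothetical corresponding-authored article counts per publisher.
article_counts = {
    "Elsevier": 350, "Wiley": 220, "Springer Nature": 180,
    "ACS": 140, "IEEE": 110,                    # five biggest publishers
    "Publishers 6-20 (combined)": 700,
    "Other (~100 long-tail publishers)": 300,
}
total = sum(article_counts.values())            # 2,000 articles

for name, count in article_counts.items():
    print(f"{name:35s} {count / total:6.1%}")

top_five = ["Elsevier", "Wiley", "Springer Nature", "ACS", "IEEE"]
top_five_share = sum(article_counts[p] for p in top_five) / total
print(f"Top 5 publishers: {top_five_share:.0%} of output")
# Top 5 publishers: 50% of output (top 20 reach ~85%, as described above)
```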
CURTIS BRUNDY: So thinking about strategy, the type of work that goes into this, it's actually not easy to create this pie chart. So this is not a canned report that you run out of Web of Science. And I'm going to talk a little bit more about some of the issues and challenges that you have in trying to do something like this on your campus.
CURTIS BRUNDY: But it's enough to say right now that, on my campus, it wasn't until about the last six months that we had a data analyst on staff who was capable of actually doing this for us. So before I get to some of the challenges, I want to move to the second level. First, I talked about the data that informs our strategy; now I'm going to talk about the data that informs our actual negotiations.
CURTIS BRUNDY: And again, here, the numbers aren't the point of this slide. I just wanted to use this as an illustration of the type of analysis that we do when we're entering into a negotiation. So what you're seeing on the left of this slide in that table, that's actual publishing numbers that are coming from the publisher. So when you start these negotiations, the first thing that you want from the publisher is the corresponding authored numbers, usually going back at least three years.
CURTIS BRUNDY: And in this case, these are the publishing numbers from Elsevier for Iowa State. And what we do on our end, then, is verify-- because it's not enough to take those numbers at face value. I think Shelley may speak to some of this: publishers are not exact in how they report these numbers at this point. And so there's some due diligence at the institutional level, when you're having these negotiations, to try the best that we can to verify the numbers that we're seeing from publishers.
CURTIS BRUNDY: And it's not easy to do. And it's not exact, either. But at least you get a couple of other points of comparison. And in this case, what we do is we also look at Web of Science. So how many corresponding authored articles are we seeing for that publisher in Web of Science? And then the other tool we use is Dimensions. We like Dimensions a lot. It's easy to use.
CURTIS BRUNDY: Web of Science costs 2 and 1/2 times what Dimensions does for us. There's a lot of value for us with Dimensions. But at this point, Dimensions does not index corresponding authors. So what we end up having to do is use a deflator on those numbers to try and get at what that corresponding authored output from Dimensions might be. So we are doing this now for every negotiation we enter into.
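As a minimal sketch of that deflator step: Dimensions reports articles with any institutional author, so to estimate the corresponding-authored subset, you scale the counts down. The deflator value and the counts below are hypothetical, not real Dimensions output.

```python
# Hypothetical all-author article counts per publisher (e.g. from a
# Dimensions export).
all_author_counts = {"Elsevier": 412, "Wiley": 198, "Springer Nature": 240}

# Assumed share of articles where the institution's author is the
# corresponding author (often estimated from a source that does index
# corresponding authors, such as Web of Science).
CORRESPONDING_AUTHOR_DEFLATOR = 0.55

estimated_corresponding = {
    publisher: round(count * CORRESPONDING_AUTHOR_DEFLATOR)
    for publisher, count in all_author_counts.items()
}
print(estimated_corresponding)
# {'Elsevier': 227, 'Wiley': 109, 'Springer Nature': 132}
```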
CURTIS BRUNDY: We get the numbers from the publisher, then we run a verification on our end to try and make sure that those numbers are where they need to be. And talking about some of the challenges in doing those two things-- trying to develop that pie chart to look at the strategy, and trying to verify the publishing numbers that we're seeing from publishers-- there are a lot of challenges.
CURTIS BRUNDY: Again, this is not out-of-the-box. And this is work that my library, at a mid-sized research university in the United States, did not have anybody on staff who could do for us up until about six months ago. So again, the issue with Dimensions in doing some of this is that you just can't get at the corresponding authors at this point. And I know that they're aware of this.
CURTIS BRUNDY: I think that they're working on it. And hopefully, that will change. And then with Web of Science, our data analyst, his name is Eric Schares. And I asked Eric in preparing for this presentation, I said, well, tell me the pain points for you in trying to do this work using Web of Science. And he put it in three different categories. One is getting the data out of Web of Science.
CURTIS BRUNDY: There's some work involved with that. There's different ways that you can get your hands on it. You can use the API. For whatever reason, he's pulling these things out 500 records at a time, because it does something with the metadata that makes it easier for him to process it. But it's time-consuming. And then we have the cleanup that's involved. So we have publisher variance.
CURTIS BRUNDY: So if you want to get down to your number for Wiley or Elsevier, you've got to make sure all the different names that those companies might be listed under in Web of Science are all collated together. So some more time and effort there. And then with Web of Science, it's an incomplete index. It doesn't index everything. So we end up having to do scaling in order to inflate, and try and get to that accurate number.
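The three pain points just described-- paginated retrieval, collating publisher-name variants, and scaling for incomplete coverage-- can be sketched roughly as below. This is a hedged illustration: the endpoint, parameters, alias table, and coverage rate are all invented, not the actual Web of Science API.

```python
import requests

API_URL = "https://api.example.org/wos/records"   # hypothetical endpoint
PAGE_SIZE = 500                                   # 500 records at a time

def fetch_all_records(query, api_key):
    """Page through results until the API returns an empty batch."""
    records, first = [], 1
    while True:
        resp = requests.get(
            API_URL,
            params={"q": query, "first": first, "count": PAGE_SIZE},
            headers={"X-ApiKey": api_key},
        )
        resp.raise_for_status()
        batch = resp.json().get("records", [])
        if not batch:
            return records
        records.extend(batch)
        first += PAGE_SIZE

# Collate the different names a publisher may be listed under.
PUBLISHER_ALIASES = {
    "JOHN WILEY & SONS": "Wiley",
    "WILEY-BLACKWELL": "Wiley",
    "ELSEVIER SCIENCE BV": "Elsevier",
    "ELSEVIER SCI LTD": "Elsevier",
}

def canonical_publisher(raw_name):
    return PUBLISHER_ALIASES.get(raw_name.strip().upper(), raw_name)

# Scale indexed counts upward to correct for incomplete coverage.
ASSUMED_COVERAGE = 0.85   # assumed share of campus output the index captures

def scaled_count(indexed_count):
    return round(indexed_count / ASSUMED_COVERAGE)
```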
CURTIS BRUNDY: So all of that is staff time, staff expertise, and not necessarily that easy to do for a library of our size. Now, I think there are libraries that are way ahead of where we are. Certainly, some of the European consortia have folks who have been doing this for some time. But when you're thinking about individual libraries trying to inform themselves, and being able to develop a strategy and come to the table as an informed negotiator to do an open access agreement, it's not easy to get to this point right now.
CURTIS BRUNDY: So I think that's a challenge that I would throw out to the Dimensions and the Web of Sciences of the world-- make this easier for us. And the last thing I'm going to mention just very quickly is, you came up with your strategy. You did your negotiation. Now you have an agreement. Now you need to implement it. And this is where the workflows come in.
CURTIS BRUNDY: And this is also messy. And so one of the most exciting things, I think, going on in 2021 right now is the launch of the OA Switchboard. And if you're not aware of the Switchboard, this has the promise to be that in-between communication channel between funders and publishers and institutions, where we can exchange the messages about the invoicing piece and about the status of the publication.
CURTIS BRUNDY: And that's so important because it allows for things like Oable from Knowledge Unlatched, which is a new platform that Knowledge Unlatched has developed. And if you think about a library like mine, we've only been trying to pursue these agreements for about two years. We have in the ballpark of 10. We have to go to 10 separate publisher platforms in order to do verification, in order to do reporting.
CURTIS BRUNDY: I mean, it's really, really difficult. And so the promise of the Switchboard is that we would be able to use that API, the data would flow into the Switchboard. And then we could use an aggregator platform like Oable to manage the workflows, where we have one place to do verification, one place to do the reporting. And that would just be a huge improvement over where we are right now.
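As a concrete illustration, the kind of machine-readable message that would flow through such a channel-- and that an aggregator like Oable could pull into one verification and reporting workflow-- might look something like this. This is a hedged sketch: all field names below are invented, not the actual OA Switchboard schema.

```python
# Illustrative shape of a publication-status message exchanged between a
# publisher and an institution via an intermediary. Hypothetical fields.
publication_message = {
    "header": {
        "type": "publication-notification",
        "from": {"name": "Example Publisher"},
        "to": {"name": "Iowa State University",
               "ror": "https://ror.org/..."},    # institutional PID
    },
    "data": {
        "doi": "https://doi.org/10.xxxx/example",
        "article_title": "An example article",
        "corresponding_author": {
            "orcid": "https://orcid.org/0000-0000-0000-0000",
        },
        "agreement_id": "read-and-publish-2021",
        "status": "accepted",    # drives the library's verification step
    },
}
```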
CURTIS BRUNDY: So three topics there-- the data to inform your strategy, the data to inform your negotiation, and then the work that we need to do as a community to make these workflows a bit easier. And I will stop there. Thank you.
SHELLEY ALLEN: Good, so I'll just share my screen. Can you see my slides now? You're going to see me [INAUDIBLE]. And then moving on without me. OK, yes. Thanks, Curtis.
SHELLEY ALLEN: I'm Shelley. I'm from Emerald publishing, and I'll just start by giving you a quick overview of Emerald in case you're not aware of us as a publisher. We're a medium-sized social science-based publisher. So up until about two years ago, there wasn't much call for transformative agreements from our customers. So we had to learn quickly, upskill quickly.
SHELLEY ALLEN: We currently have manual workflows and manual data gathering. So there's a lot of work that's going on behind the scenes. And I can absolutely echo a lot of what Curtis was just saying about some of the challenges around data. We are signatories of DORA. We firmly believe in the need for culture change in regards to how research is assessed and how impact is measured.
SHELLEY ALLEN: And it's normally in applied fields that we're working-- we're most famous in management, but we publish across the social sciences generally. And we have just over 300 journals. They are mostly hybrid. We have two gold open access journals and an open research platform. This is a fairly obvious slide. But really, this is just to say that, in terms of Emerald's position, obviously, the ecosystem is increasingly open.
SHELLEY ALLEN: But we do have grave concerns about equity in the new system. Most of our researchers-- 90% of them-- have no funding. So their only route to publishing is through the support of their institution. We're very keen to ensure that we remain open to all, and that we don't end up swapping the paywall for a pay-to-publish wall as we bring that paywall down. We're very keen to ensure people can still publish.
SHELLEY ALLEN: We also publish a reasonable proportion of authors in practice. So they don't have the support of any institution, and we would like to ensure that that dialogue between practice and academia remains in our journals. So these are some of the unique challenges that we face in our particular mix. We are also very keen to ensure the applied research that we publish has meaningful impact in the world, and that it not be based purely on citations, and that we're able to track the impact of research from the university into society.
SHELLEY ALLEN: So that is also a key tenet of our mission within our disciplines, and is something that we would be keen to ensure remains part of the discussion that we have with our customers as well, and how we can support researchers in this. This is important. And this map is showing us what we call our opportunity map. And where there is green, this map is really showing Emerald's perspective of where our customers have expressed interest in open so far.
SHELLEY ALLEN: So if it's green, they've expressed some interest. But this also, for ourselves, maps against where we fall as a mid-sized publisher-- where we fall in the negotiating queue. So as a smaller publisher, I don't think there are many universities where we'd be in that top 50% of [INAUDIBLE] top four or five publishers. So we have to wait our turn before we can get into that negotiation, and we don't have a seat at the table for all consortia or all libraries.
SHELLEY ALLEN: And where this can become a problem is quite interesting in my next slide. I might spend some time on this slide, because this is a Venn diagram that we spent a lot of time on, and we're constantly referring to this particular diagram. And this is really about the issue that, within subscriptions, the cost is borne by a number of different customers, whereas within open-- particularly APC-based open-- that cost falls to a smaller number.
SHELLEY ALLEN: Like Curtis, when we go into negotiations, we go in with similar data. What is the current spend on subscriptions? What is the current output? But there is a lot of complexity within this diagram. So we already have the fact that 37% of our customers don't actually write for us at all at the moment. There's a lot of different reasons for that. But obviously, that is revenue that will fall away.
SHELLEY ALLEN: And as a business, how do you manage that in time? And we also have 27% of our authors who don't have an existing commercial relationship with their institution. So how do we support those authors? As I said, some of them are in practice. Some of them are teaching institutions. And how do we ensure that they are supported, as we go forward, where there isn't that existing link?
SHELLEY ALLEN: Particularly because a lot of those authors aren't aware of how they can access funding for APCs, for example. So what models might help instead? And even in that read and write, there's an awful lot of complexity. So Emerald is not a big deal publisher. We do have a package that includes everything, and customers can choose to take that.
SHELLEY ALLEN: But only 27% of our customers actually take everything. We have a lot of flexibility in what we offer. So when we're transforming, in a transformative agreement, subscription spend into publishing spend, that can actually be very challenging. Firstly, within that section, we have a similar long tail to the one Curtis showed. We have a number of institutions where we may only publish one or two articles a year, or even, some years, none.
SHELLEY ALLEN: So what's the discussion there? When you are asking for a transformative agreement, how do we ensure that we do support that? But also, I think one of the biggest blockers, or one of the areas that definitely takes the most negotiation, and is possibly the biggest area of concern, is that we also have a lot of customers whose current spend is less than their output. And most of those customers, particularly at a time like now, don't have extra money.
SHELLEY ALLEN: So how do we make this sustainable as we transition? Because it's not sustainable for us if they're not paying for that publishing. Particularly, as you see the read spend fall away, how do we support the write spend that isn't there, without that burden falling on that small proportion of high-output customers? And then also within that is the fact that, as a social science publisher, we don't have the same volumes of authorship as the large publishers, or the STEM publishers.
SHELLEY ALLEN: So authors are not as prolific; they tend to write less frequently. The journals tend to be more niche, so some journals may publish just 12 articles a year, and not those high volumes that help lower the cost of production. So if you're supporting one megajournal with thousands of articles, that's a cheaper operation than 300 journals with far fewer articles.
SHELLEY ALLEN: So that's something for us to manage, obviously. But it does make the picture more complex. As you're investing in workflows, something else to consider is the existing cost base to support those niche communities that, quite frankly, at the moment don't want to lose their journals yet. So that's definitely a tension that we're managing in our current approach.
SHELLEY ALLEN: So the other side of this as well is, when we were in our negotiations, at the moment, they do tend to be fairly transactional. It is about how do we flip subscription spend in our hybrid journals to publishing spend in our hybrid journals? And that is, for very understandable reasons, the level of discussion at this moment. But I think the most active in this ecosystem are quite keen to see the way that research is-- the way impact is measured change, so that we can break some of these issues that we're seeing in the ecosystem.
SHELLEY ALLEN: I think that's in a lot of our interests to help support that. And in the research we've done, we've found that there is appetite for this among researchers whose voice often isn't in this discussion. But they're not seeing the support of that from their institution, and also really from the publishers. So this is an area I think that our conversations could evolve, and we could look at different data points in how we're assessing the success of these deals.
SHELLEY ALLEN: So at the moment, it's all about output and usage. But how can we evolve those metrics that we're looking at? And that's something, I think, a lot of us would be interested in getting involved in. And from Emerald's perspective, although we're quite new to transformative agreements, we're very quickly coming to the realization that there needs to be a lot more collaboration and co-creation amongst us and our consortia and library customers for how we're really supporting the researchers to be more open, to ensure that we're not raising barriers to publish as we seek to lower the paywall.
SHELLEY ALLEN: I'm personally very interested in APC-less models. I think the experimentation in subscribe-to-open is really interesting. I really love the ACM tiered model. I think that would be really interesting to discuss with people in the breakout sessions. And how we really drive cultural change through our arrangements-- I feel, in some regards, with transformative agreements, we could potentially be rushing to recreate big deals, and missing an opportunity to actually drive culture change.
SHELLEY ALLEN: And I think that would be also an interesting discussion point. We have a shared purpose in supporting faculty in their research endeavors, and to get that research out into the wider world for societal good. And I think that is something that could be much more part of our discussions than we've experienced to date. And we hold our hands up for our part in that as well.
SHELLEY ALLEN: So that's what I have to say about our experience in transformative deals, building on what Curtis has said. Really, everything Curtis said was exactly the same in Emerald's experience, plus the complexities of managing at a program level in unfunded disciplines as well. So I hope that was helpful for people. Thank you.
STEVE PINCHOTTI: Thank you, Shelley, very much. Thank you, Curtis. So my name is Steve Pinchotti. I'm the CEO at Altum. And if you're not familiar with Altum, we're a provider of research grants management solutions. We run the largest independent grant-making platform for research funders out there. So I'm going to talk about our data challenges as well, but from the funder perspective and the research institution perspective-- around impact transparency, and how to create more visibility into research outcomes.
STEVE PINCHOTTI: So whether it's a funder or a research institution, they have similar challenges. They have similar challenges when it comes to impact transparency and research outcomes. They have to identify and track the return on investment and outcomes of the research, and communicate the impact to their constituents. In many cases, with nonprofit funders, their donors, what did they do with their money, and show impact there.
STEVE PINCHOTTI: And also other funders, especially with researchers trying to get grants and other things, and then showing that impact and trying to get published in these different journals. So there's a lot of overlap here in terms of challenges, and trying to figure out what the transparency and the outcomes are. At a minimum, they need to answer two questions. What was the overall impact of funding on the research and the researchers' careers?
STEVE PINCHOTTI: And how do I link the research I funded or received to publications, patents, clinical trials, and other outputs of this research? And these are two very simple questions to ask, but they are incredibly complex and challenging questions to answer. And we might say, well, why is that? Well, similar to Curtis and Shelley's presentations, in terms of just wrapping our arms around data, there's an incredible amount of data out there.
STEVE PINCHOTTI: There were statistics published last year showing that, on average, every second, we are generating about two megabytes of data per person. And collectively, we're creating about 2.5 quintillion bytes of data every day. And a quintillion is a number that has 18 zeros. So there is just an incredible amount of data out there that we're all trying to wrap our arms around. We're inundated with it every day.
STEVE PINCHOTTI: And the other interesting statistic is that 90% of the world's information was created just in the past two years. So these are numbers that are increasing exponentially every year, and this is with nearly half of the world's population still not on the internet. So it's an incredible challenge just to figure out who is doing what, where, and when. And so what do we do about that?
STEVE PINCHOTTI: So organizations like NISO, and the other organizations I have on this slide, are doing an incredible job of coming up with standards and structures and interfaces so we can all share data in an open way through persistent identifiers. And persistent identifiers are long-lasting references to digital objects, whether those are datasets, websites, documents, manuscripts, or publications.
STEVE PINCHOTTI: And we can talk more in the discussion group about what these organizations do-- ORCID and Crossref and DataCite. There was just a conference last month called PIDapalooza, all about persistent identifiers. And everyone is doing some amazing work in this space to help track what's happening in the industry and who's doing what-- helping us spend more time finding and analyzing the information, and less time tracking down who's doing what, where, and when.
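As a small, runnable illustration of what persistent identifiers make possible, here is a sketch that resolves a DOI through the public Crossref REST API and reads off the funder names and ORCID iDs deposited with the record. Note that those fields appear only when the publisher has deposited them, and the exact response fields should be verified against Crossref's documentation.

```python
import requests

def lookup_doi(doi):
    """Return (title, funder names, author ORCID iDs) for a DOI."""
    resp = requests.get(f"https://api.crossref.org/works/{doi}")
    resp.raise_for_status()
    work = resp.json()["message"]
    title = (work.get("title") or [""])[0]
    funders = [f.get("name") for f in work.get("funder", [])]
    orcids = [a["ORCID"] for a in work.get("author", []) if "ORCID" in a]
    return title, funders, orcids

# Example usage (any real DOI works):
# title, funders, orcids = lookup_doi("10.1126/science.169.3946.635")
```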
STEVE PINCHOTTI: So with that baseline of work, and more of these persistent identifiers being adopted, where is this all headed? And really, the future is around artificial intelligence and machine learning. And our company did a thought leadership webinar last month about this. And what we're going to see is all organizations in this ecosystem will spend less and less time finding information, and more and more time analyzing it.
STEVE PINCHOTTI: So you will, through these persistent identifiers, be able to make better connections, accelerate the process of your grants management, your research evaluation, teaming people together in faster ways, finding experts in areas. These algorithms and things that are out there are already transforming our lives, and they will continue to do that in really, really big ways.
STEVE PINCHOTTI: So I would say go back to your organizations. Ask everyone there what your AI and machine learning strategies are, because these are really important, and they will help us all get information faster, and then spend more of our time analyzing, rather than piecing it together. So what I want to do is just show a few examples on the next couple of slides of what this means, and what this would look like.
STEVE PINCHOTTI: And the first example I have is around a researcher's career dashboard. So this is an example from our system that shows a researcher and their career. And the reason this is important is that, whether it's a research institution or a funder, the view is partial: a researcher may only work at a university for a few years, or they may move between different organizations. And a funder may only fund a researcher for a short period of their career.
STEVE PINCHOTTI: But it's important. They want to know, well, what happened to this person over the course of their career? And you can imagine, like an early career researcher, that maybe someone was mentored or was funded by a particular organization. But 20, 30 years later, you find out they won the Nobel Prize. Our customers certainly want to know, and research institutions want to know that that happened.
STEVE PINCHOTTI: And this is an incredible challenge, to track people down and what they were doing. But now, the more that ORCID is adopted, and the more the profiles are kept up to date, you really can pull this information down into your systems in amazing ways, and start showing some really rich dashboards like this, where you can see that, where this red arrow is, this is where our funder gave this person their first grant award.
STEVE PINCHOTTI: And then you can see, over the course of time, that they received other grants and published, and their proliferation of publications over time. So this is just one example of what can be done when persistent identifiers are used in a common way and really adopted. We can get some really fascinating information that makes it easy for funders and institutions to pull this information together.
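A sketch of how a career dashboard like this might pull a researcher's publication history from ORCID's public API and count works per year is below. The field paths follow ORCID's v3.0 public API as commonly documented; treat the exact response structure as an assumption to verify.

```python
from collections import Counter
import requests

def publications_per_year(orcid_id):
    """Count a researcher's ORCID works by publication year."""
    resp = requests.get(
        f"https://pub.orcid.org/v3.0/{orcid_id}/works",
        headers={"Accept": "application/json"},
    )
    resp.raise_for_status()
    years = Counter()
    for group in resp.json().get("group", []):
        summary = group["work-summary"][0]          # preferred version
        date = summary.get("publication-date") or {}
        year = (date.get("year") or {}).get("value")
        if year:
            years[year] += 1
    return dict(sorted(years.items()))

# publications_per_year("0000-0002-1825-0097")  # ORCID's public test record
```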
STEVE PINCHOTTI: And this product we have called Insights is revolutionary. It's something really different-- a way to visualize the data and make connections between the data. And this all started, actually, out of a project with Dr. Fauci's portfolio analysis team at NIH, where they were looking for ways to link the research that they funded to outputs and other products-- things like FluMist.
STEVE PINCHOTTI: They knew it was created. They knew it hit the market. And they were looking for ways to link the research they funded to those outputs. They identified 90 products, and it took four people four years to manually curate and link the data to create these stories, and to tell the story of what happened. So we created this platform.
STEVE PINCHOTTI: And what this enables organizations to do is search on vast amounts of data very quickly, and link it in a visual way. So you can see across the timeline how the research and the knowledge has evolved. And you can see if a project was funded whether a clinical trial came out of that or a patent or a publication. And if it was a publication, did other things reference that publication?
STEVE PINCHOTTI: And you can see all this very quickly. And what this does is it lets you see how knowledge has expanded and evolved over time. And what the NIH would call this is almost like a Big Bang scenario-- was there an area of research that happened that they weren't even expecting? And they can see all of this now, visually, through this interface. So really, really cool things can happen through Crossref data that's made available.
STEVE PINCHOTTI: All of these are publicly available sources that you can pull down and start to experiment with. And then lastly, what I'll share, combining what we talked about with the career dashboard earlier, is that organizations can take data and layer it on top of each other. So in this case, one of our funders took the dashboards of multiple principal investigators and lined them all up so that the first time they gave each one a grant is year 0-- here, where this red arrow is.
STEVE PINCHOTTI: And they saw that $1 million of investment in these four principal investigators led, over time, to an incremental gain of $18 million in future funding and 116 post-award publications. And so what they can see is that they really can dive deeper into that return on investment. What happened with that initial funding? What happened with the careers of these researchers? And they can do this with groups of researchers or groups of programs.
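The "year 0" alignment just described can be sketched in a few lines: shift each principal investigator's funding history so their first award falls at year 0, then total the follow-on funding across the cohort. The dollar figures below are invented for illustration, not the funder's actual data.

```python
def align_to_year_zero(awards_by_year):
    """awards_by_year maps calendar year -> award amount for one PI."""
    first = min(awards_by_year)
    return {year - first: amount for year, amount in awards_by_year.items()}

# Hypothetical cohort: each dict is one PI's award history.
cohort = [
    {2008: 250_000, 2011: 1_200_000, 2015: 3_000_000},
    {2010: 250_000, 2013: 2_500_000},
]
aligned = [align_to_year_zero(pi) for pi in cohort]

seed = sum(pi[0] for pi in aligned)                 # year-0 awards
follow_on = sum(amount for pi in aligned
                for offset, amount in pi.items() if offset > 0)
print(f"${seed:,} in year-0 awards -> ${follow_on:,} in later funding")
# $500,000 in year-0 awards -> $6,700,000 in later funding
```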
STEVE PINCHOTTI: And this is just some of the information that people can get out of these ecosystems when information standards are created through NISO and many other organizations, and the great work that's being done by this whole community. And it's a really exciting time. And I think we're moving into areas that will be unprecedented, where there will just be less data aggregation, and just more information at our fingertips.
STEVE PINCHOTTI: And we can all make better decisions. So really exciting times. That's all I had for today. So once again, my name is Steve Pinchotti. You can find me on Twitter @StevePinchotti. And we are now going to break out into discussion rooms. Thank you all very much for attending. [MUSIC PLAYING]