Name:
Designing a Metadata Fitness Program
Description:
Designing a Metadata Fitness Program
Thumbnail URL:
https://cadmoremediastorage.blob.core.windows.net/cea03f86-7c6b-4492-8ff7-da9ab3efed44/videoscrubberimages/Scrubber_1.jpg
Duration:
T00H18M50S
Embed URL:
https://stream.cadmore.media/player/cea03f86-7c6b-4492-8ff7-da9ab3efed44
Content URL:
https://cadmoreoriginalmedia.blob.core.windows.net/cea03f86-7c6b-4492-8ff7-da9ab3efed44/DesigningAMetadataFitnessProgram.mp4?sv=2019-02-02&sr=c&sig=DUv5DdJUwvBCWJ2vANjFB1QIkT5TypgCmLUEgVUrCw0%3D&st=2024-12-21T14%3A16%3A51Z&se=2024-12-21T16%3A21%3A51Z&sp=r
Upload Date:
2024-03-06T00:00:00.0000000
Transcript:
Language: EN.
Segment:0 .
Well, hello, everyone. Chris Kenneally here with CCC. I see my panelists and plenty of audience members for the NISO Plus conference. Welcome, welcome to the live portion of the program. We'll give everybody a minute or so to join in. The recording went very well, with great presentations from all of our panelists: from Wayland Butler with AIP Publishing, Ginny Hendricks with Crossref, Arjan Schalken with UKB SIS, and my colleague at CCC, Laura Cox.
And actually, why don't we start off with Christine. I know you've been watching all of this, and you've got a great background in this issue. And Christine, you had a question. You're going to be joining, I believe, the NISO board shortly, if you're not already on it. So you have a question.
That's the Crossref board, excuse me. I'm sorry, it's Crossref, very different. Yes, indeed. Yes, indeed. But you do have a question that is not just for our NISO audience, but for the community at large.
Tell us about that. Yeah, I mean, it's fascinating to hear all that. And I'm in a position where we aggregate data from many different publishers, aggregators, and content providers, and it's really hard to massage it in a way that it goes together and then gets discovered together. So I have several questions, actually. There are lots of questions I could probably ask, but let's stick with the author for a moment.
So we have the ORCID, we have other IDs, we have ResearcherID, we have ISNI, and I'm sure there are more. And then we have Ringgold, et cetera. Somehow it feels like there needs to be an entity where this is all coming together, because at the end of the day, I get the author record, and every publisher is giving me the name slightly differently.
If I get an ID, I'm lucky. And if I get one ID, I'm not getting the other ID, but another publisher may give me the other ID. So, any comments from any of the presenters? And maybe let's also look at this: should there be a central entity providing that kind of author record, so to speak, or is this totally unrealistic?
How do you see this? Well, it's a great question. And Christine, we saw some of that in the chat during the presentations. People were asking about just the way that authors don't sit still, do they? No, they don't. And, you know, ORCID is a good example, because we are expecting the authors, the researchers, to actually enter their information into ORCID and keep it up to date.
This is a big dependency on all these individuals, who really have other things to do. All right. And Wayland Butler, I saw you engaged with Sophie online during the presentation. You had some thoughts on that. Tell us about that. Yeah, so I think there is a case, and I'm kind of inspired now by Arjan's presentation, excuse me, for thinking about kind of a mesh, a way to collect the best data that you could collect.
Right? It might be right if we start from ORCID, but maybe it's not up to date. Maybe it's an old affiliation. Maybe, and I'm just focusing on institutional affiliation arbitrarily, we can collect that information and allow the author a chance to update it. There's a default value, but it can be overridden. To Christine's point about a centralized kind of authority, or a place to store all of that data:
Maybe. I mean, I'd love to have an authoritative source, but then that authoritative source will have the same problem that Arjan described, which is the data coming from different places, from different publishers. It's going to be hit and miss. I might have Ringgold but not ROR, someone else has ROR but not Ringgold, someone else has nothing at all.
So how does that get stitched together? You know, the layering of those data sets so that you can pick the best, so that by the time you get to the bottom of the full list, you've got something more complete than what you might have started with otherwise. Right, I think that, yeah, I'm sorry, I was going to ask Arjan Schalken, who's there with us from the Netherlands, to follow up on that.
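The layering Wayland describes, stacking partial author records from several sources and letting each one fill gaps the others left, can be sketched in a few lines. This is a minimal illustration, not any panelist's actual system; the source names, field names, and sample values are all hypothetical.

```python
# Sketch of "layering": merge partial author records in priority order,
# so higher-priority sources win and lower ones only fill remaining gaps.
def layer_records(records):
    """Merge records; first record in the list is the most trusted source."""
    merged = {}
    for record in records:
        for field, value in record.items():
            if value and field not in merged:
                merged[field] = value  # only fill fields still missing
    return merged

# Hypothetical sources: the submission system has a current affiliation,
# an ORCID record has a stale one, the institution supplies a ROR ID.
submission = {"name": "A. Author", "affiliation": "Univ. of Example", "ror": None}
orcid_rec = {"name": "Alice Author", "affiliation": "Old University", "ror": None}
institution = {"name": None, "affiliation": None, "ror": "https://ror.org/00example"}

best = layer_records([submission, orcid_rec, institution])
# best combines the current affiliation with the institution's ROR ID
```

The point of the design is the one made above: no single source is complete, but a fixed priority order yields a record better than any one input.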
Arjan, do you have some thoughts with regard to Christine's question about keeping track of all these moving authors? Yeah, well, my first idea is that many organizations, especially in the Netherlands, are also managing their own systems, so that is also a good place to start. But how can you then exchange the information in those systems with other external systems?
I think everything should be loosely connected. So not one centralized organization; that would be too complex, and maybe there's not one that is per se right or wrong, but several. Yeah, so I think it's OK to have several institutional IDs. But somehow, if we start connecting those, then for several use cases or services you can pick the right ID, provided the Ringgold ID is linked to the ROR ID and vice versa.
And also to prevent competition within the metadata landscape. All right. I'm sorry, Christine, did you want to follow up? Yeah, just to add: we're partially just talking about people who are publishing today, but we also have this whole history of wonderful publications that are completely without IDs. An author who died in 2000 never entered an ID. So we talk about research today, but there's also this whole body of historic publications, which could be equally important, and it's not covered here. Ginny Hendricks with Crossref, do you have thoughts on tracking authors? You did address, in your presentation and in our discussion, the way the community is helping to keep things up to date.
And so forth. That's a real challenge. But talk about that from the perspective of Christine's question. Do you, Ginny? I don't see you. Are you there? I'm here. OK, great.
Ginny, do you have some thoughts? Yeah, so I totally agree: let's not leave it all to the researcher. We run sort of systems, and we have an interest in the entire underpinning system being better connected, so that researchers don't have to be bothered with the awkward identifiers. That was really the initial point: that it would be a sort of open record, but that it wouldn't just be left to the authors.
It's important that they're responsible for approving information and deciding what's private and what's not. But lots of different types of organizations and systems are pushing, if you like, assertions into the authors' records. So publishers are doing that through Crossref metadata, and that gets automatically pushed through to authors' records. We're just about to hit 3 million authors that have granted that permission; I'm going to steal the thunder of that announcement.
But that's quite a significant number of researchers that have agreed to let the publishers of their articles inform their record for them. So that's a really good example of the systems that underpin this process working together to try to save the individuals some time. And institutional repositories do that as well; they might do it through DataCite and other sources. Even Scopus and others, lots of systems push data into the ORCID records.
So I think the identifier is supposed to be the minimal piece of metadata that allows different records to connect. And for whatever business case or use case or question you have, you're going to need a different set of metadata. So there's always going to be a different set of metadata around an author if you're concerned with tenure tracking.
And there's going to be a different set of metadata around the author if you're concerned with finding reviewers or something like that. So the identifier can help with connecting, but it isn't the be-all and end-all, and it doesn't solve everything. It's necessary, but it's just not sufficient. And then my last thought, on a sort of central entity: ideally ORCID is trying to be that, of course. And if it is open, and if lots of parties can connect to it, they can challenge the data that's in there.
They can update and make different assertions about the data and potentially correct it. I think that's what we're trying to get to. And if it's open, that builds trust, and you can see sort of the provenance of the information, and how and who has updated that ORCID record over time. I'm not from ORCID, but that would be my goal if I was.
Well, I'm sorry, but there's a question actually in the chat. If you do have a question, please use the chat there in the Zoom function. It's from Ted, asking about crosswalks, and it sounds like, Ginny Hendricks, that's a bit of what you were chatting about there. I don't know if you see that question, but I'll read it here for everybody. He says that some PID providers, like ROR, focus a lot of energy on managing the crosswalks between different organization IDs.
This is very helpful. So, can ORCID users provide other PIDs? So, crosswalks there, what are they? I'm not sure in that context. And Ted has been really active in assessing and looking at the different identifier systems and metadata out there. So thank you, Ted. Yeah, certainly ROR maps with ISNI and Wikidata IDs and other identifiers as well.
And ORCID: you just have to look at your own ORCID record. You can add your Scopus researcher ID if you want to. You can add your Wikidata ID if you want to. I don't know how programmatically those are populated, but it's certainly possible for ORCID users to provide other identifiers. Lots of them use their Google Scholar IDs, of course, or their ResearchGate IDs.
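The crosswalks Ted asks about can be pictured as a small lookup table built from one registry's record. The record below is only shaped roughly like a ROR organization record (ROR does publish external IDs such as ISNI and Wikidata per organization), but the field names and values here are invented for illustration, not taken from the real API.

```python
# Illustrative organization record, loosely modeled on a PID registry entry.
# All identifiers below are made up for the example.
org_record = {
    "id": "https://ror.org/00example",
    "name": "Example University",
    "external_ids": {
        "ISNI": {"all": ["0000 0001 2345 6789"]},
        "Wikidata": {"all": ["Q123456"]},
    },
}

def crosswalk(record):
    """Flatten one organization's external IDs into a scheme -> ID table."""
    table = {"ROR": record["id"]}
    for scheme, ids in record.get("external_ids", {}).items():
        if ids.get("all"):
            table[scheme] = ids["all"][0]  # keep the first listed ID
    return table

ids = crosswalk(org_record)
# ids now maps each scheme (ROR, ISNI, Wikidata) to one identifier
```

A table like this is what lets a downstream system that only holds, say, a Wikidata ID still join its data against records keyed by ROR or ISNI.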
All right. Do we have... I'm sorry, Wayland, please go on on that. I think something that is maybe a challenge here is: what's in it for the author to do that work? Right? If we're asking the author to maintain that ORCID ID, is there something they will get out of it? Will the next submission be a little easier to get through?
Is it easier for them to find funding because they've populated this? If it is busywork that I as a publisher get the benefit from, but there's nothing in it for the author, what's the incentive? You know, I think we also need to be thoughtful about incentives, about building structures around this to make it worthwhile for that author to do the work, or for someone else to do the enrichment.
Right. And do you ever experience that the institution, for example, is taking responsibility on behalf of the author? I have no visibility of that, if, like, the institution is maintaining an ORCID for them. No, no. Mostly, at least what we see at AIP Publishing is that the author is doing a lot of it on their own.
They're doing their own submission and publication process by themselves. They might be getting support, but they're still doing the work and updating the data, which is maybe why there needs to be an incentive. And this is why UKB has had to do all of this work to pull all of that together. Initially it's maybe not transparent to the author why they would do it.
And publishers are maybe not so great at collecting all the data that the institution would ultimately pay for, to kind of close that loop. All right. Well, thank you. Christine, did you have something else to add to the discussion? Well, you know, I'm the NISO representative, so I would like to also see, if possible, some discussion around how NISO can help in this whole realm of quality metadata and a metadata fitness program.
Do you see any role for NISO here, for example, in your recommendation, or maybe even in extending an existing recommendation? Yeah, I certainly have a thought. I'm going to start with a question for Arjan. Arjan, I'm sorry: all of those publisher reports that you're receiving, was there any success in normalizing those at all?
Or are you getting a unique report from each publisher for those published articles? Yes, so in our contracts we mention several metadata fields we need, but the interpretation and the way they're represented differ per publisher, and can even differ over time within one publisher. So to have some kind of standard, yeah, that would be very useful.
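The normalization problem described here, publisher reports that carry the same fields under different names and formats, is often handled with an alias map onto a canonical schema. This is a generic sketch, not UKB's actual process; the alias names and sample rows are hypothetical, and a real program would be driven by the field list in the contract.

```python
# Map each publisher's column names onto one canonical schema.
# The aliases below are invented examples of the kind of drift described.
FIELD_ALIASES = {
    "doi": "doi", "article_doi": "doi", "DOI": "doi",
    "orcid": "orcid", "author_orcid": "orcid", "ORCID iD": "orcid",
    "title": "title", "article_title": "title",
}

def normalize_row(row):
    """Rename one report row's fields; drop columns we don't recognize."""
    out = {}
    for key, value in row.items():
        canonical = FIELD_ALIASES.get(key)
        if canonical:
            out[canonical] = value
    return out

# Two publishers reporting the same kind of record under different names:
row_a = {"article_doi": "10.1234/x", "ORCID iD": "0000-0002-1825-0097"}
row_b = {"DOI": "10.1234/y", "article_title": "On Metadata"}
normalized = [normalize_row(r) for r in (row_a, row_b)]
```

The fragility Arjan's point exposes is that the alias map has to be maintained per publisher and revisited whenever a feed changes, which is exactly why a shared reporting standard, as with COUNTER for usage data, would remove the work.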
Like you have COUNTER, for example, for usage data. And that can also prevent it from becoming some kind of negotiation aspect in an already very complex discussion with publishers about the services and the costs. I mean, you shouldn't need to negotiate about metadata. Yeah, that's where I was thinking specifically of COUNTER, right?
For a subscription-based arrangement, we don't talk unless COUNTER reports are guaranteed. If you were a gambler, that's table stakes: it has to be there before we can even start having a negotiation or discussion. Right? It's not even an option. We maybe need to get to that same place of having a standard by which publishers are sharing out:
these are the articles we published, specifically for the institutions that are paying for them, or potentially for the funders that are paying for them. I think there's a role for NISO in advocacy, in many ways, and in looking at how we can improve metadata quality; it's quite clearly vital. We're talking about interoperability really strongly in these presentations and in this discussion, and making sure that that is part of the program.
I think also there are lots of stakeholders involved here, huge numbers of people, and there are going to be different opinions and different requirements. It's about trying to find something that really covers all the use cases, covers the workflows. And we've got to have interoperability at heart, so that we can really follow things through workflows upstream and downstream, from end to end.
But I think the other thing is that we as a community, more broadly, each organization, can be doing some of the things that everybody's talked about: ensuring data quality, completing data, making sure it's transmittable, adding persistent IDs where we can. As Ginny said, that's not the be-all and end-all; there's a lot more to it than that. The identifier is the thing that links things together.
It's the thing that keeps things persistent, but it's not necessarily enough without all of the data you really need to feed into all the different things we're trying to do, which includes informing ourselves, as was said, about what's actually going on in this space. And I look at Arjan's heat map and say, well, for the sort of ratios that he's talking about, data quality is really imperative.
And having that sort of accurate, complete, consistent metadata means that when people are actually submitting articles and looking to get their own funding, and there's a deal in place for their organization, we have the accuracy to make sure they're not denied by mistake. So by having better metadata, we can boost the uptake of OA.
And so no one's left out. And for publishers, it saves a lot of time and effort as well, because it means avoiding refunds and confusion and unhappy researchers, which none of us want, because we're all here to serve them. And we've got to stop pushing all the work onto the researchers and try to do things more collaboratively, I think, across the board. Well, I think that's probably a good place to end, not only because it's a good thought about not pushing all the work onto the authors, but because I'm being told it's time to wrap things up.
So I appreciate that, Laura Cox. And I want to thank everybody on the panel: Wayland Butler, director of data analytics at AIP Publishing; Ginny Hendricks, director of member and community outreach with Crossref; Arjan Schalken, program manager for UKB scholarly information services at UKB, the network of Dutch university libraries; and Laura Cox, my colleague at CCC, who's senior director of publishing industry data.
Thanks as well to Christine Stohn, the session's host from NISO; Jason Griffey, director of strategic initiatives at NISO, who produced this session; and my colleague Caitlin Sund, who managed the logistics again this year. CCC is very pleased to sponsor the NISO Plus conference. I'm Christopher Kenneally with CCC, where I host the Velocity of Content podcast series. To everyone who participated in this session on designing a metadata fitness program, part of the NISO Plus 2023 conference:
thanks for joining us. That's all for now. Goodbye. Thank you, everyone. Goodbye.