Name:
Telling a story with metadata or Always drink upstream from the herd
Description:
Telling a story with metadata or Always drink upstream from the herd
Thumbnail URL:
https://cadmoremediastorage.blob.core.windows.net/aea2fb04-4edf-4afd-9d24-95dff8bae4ac/videoscrubberimages/Scrubber_1.jpg
Duration:
T00H24M11S
Embed URL:
https://stream.cadmore.media/player/aea2fb04-4edf-4afd-9d24-95dff8bae4ac
Content URL:
https://cadmoreoriginalmedia.blob.core.windows.net/aea2fb04-4edf-4afd-9d24-95dff8bae4ac/Telling a story with metadata or Always drink upstream from .mp4?sv=2019-02-02&sr=c&sig=TLM4DsHVZNbobSnMsSzQoYMDgxY7v9MfMNWngahy3jU%3D&st=2024-12-26T11%3A50%3A58Z&se=2024-12-26T13%3A55%3A58Z&sp=r
Upload Date:
2024-03-06T00:00:00.0000000
Transcript:
Language: EN.
Segment:0 .
OK, well, now that we've moved over into the Zoom portion, I'm wearing the same shirt today for continuity. This is kind of a new format for me, to go from a recorded presentation to a Zoom meeting. But I see that Julie is with us, and I know that Jenny is on the call as well. So I just want to thank both Julie and Jenny for being here with us today.
Both of the presentations were very different in scope, but both really showed how metadata is so important from the very beginning of the process in order to capture all of the outputs and make everything possible in the whole landscape of scholarly communication. So in our live session right now, I'll be monitoring the chat. If anyone does have questions, you can put them into the Q&A or simply raise your hand in order to speak.
We really want to make it a conversational event. And on that note, I do have questions for both of our speakers, but I will start with Julie. Your presentation really got me thinking, as a publisher at SPIE Digital Library, about where I see the leaks most frequently in the process. A couple of examples: as a publisher we do rely on FTP connections to send out article-level metadata to our indexers, both the discovery layers, such as Ex Libris and EBSCO Discovery, and the indexes that are really important to our authors, like Web of Science and Scopus.
It's urgent for our authors that they do appear there, but one problem we run up against is unreliable FTP connections. Sometimes the data partner's FTP site will be temporarily unavailable, or perhaps they have changed their username and password for us, and so there's a gap in what we're sending them until we notice and get that fixed. Another leak that I see as a publisher is really in working with Google Scholar, and Julie and I have talked about this.
The relationship with Google Scholar is interesting because, unlike with our other indexers, I don't really have a particular contact I can work with there. When there is an indexing or a linking problem on Google Scholar, it's very difficult to get that fixed or get a response. But of course, that's one of the most important referring sources to our platform for our authors, for our users and so forth.
So for me as a publisher, those are the two big areas I see where leaks can occur. But I wonder, Julie, what your perspective is at IEEE?
For the FTP, actually, we noticed this problem a long time ago. I think in 2015 we started a project to monitor the FTP process because it happened so often: we claimed we had put the content on the FTP and the indexers were supposed to come and index it, but then they claimed they didn't find it.
You know, it just goes back and forth. So we developed systems to monitor it, and now we receive weekly reports. We have dozens of indexers, and we have an account for each indexer. I can see at what time each indexer comes to pick up the files and what they downloaded. And if an account is inactive for some time, we'll follow up.
And also, if any content is missing, we can trace back. So we developed this system, and I don't think we have heard any big complaints since then. It took some effort, but it was definitely worthwhile.
That speaks to the importance of the publisher devoting resources and staff to tracking the metadata and tracking the indexing.
Yes, it is not easy.
You have to work with the internal teams, and everybody is busy, right? And so you have to make the case, and you have to show them the gaps, and you have to show the level of complaints. And finally, they decided that it was worthwhile to do it. So, yeah, it requires a lot of internal communication and also external communication, but it's definitely a worthwhile project.
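The kind of weekly pickup report described above can be sketched roughly as follows. This is only an illustration under stated assumptions, not the system the speaker's team built: the record layout (account, timestamp, filename), the account names, the function name `weekly_pickup_report`, and the 14-day inactivity threshold are all hypothetical.

```python
from collections import defaultdict
from datetime import datetime, timedelta


def weekly_pickup_report(events, accounts, delivered, now,
                         inactive_after=timedelta(days=14)):
    """Summarize indexer pickups per account.

    events:    iterable of (account, timestamp, filename) download records
               (hypothetical layout, e.g. parsed from FTP server logs)
    accounts:  every indexer account being monitored
    delivered: dict mapping account -> set of filenames placed for pickup
    now:       the reporting time
    """
    downloads = defaultdict(set)
    last_seen = {}
    for account, ts, filename in events:
        downloads[account].add(filename)
        if account not in last_seen or ts > last_seen[account]:
            last_seen[account] = ts

    report = {}
    for account in accounts:
        seen = last_seen.get(account)
        report[account] = {
            # number of distinct files this account has downloaded
            "downloaded": len(downloads[account]),
            "last_pickup": seen,
            # never seen, or not seen within the threshold: follow up
            "inactive": seen is None or now - seen > inactive_after,
            # delivered but never downloaded: content to trace back
            "missing": sorted(delivered.get(account, set()) - downloads[account]),
        }
    return report


# Hypothetical example data
now = datetime(2023, 2, 13)
events = [
    ("indexer_a", datetime(2023, 2, 10, 3, 0), "vol1.xml"),
    ("indexer_a", datetime(2023, 2, 11, 3, 0), "vol2.xml"),
    ("indexer_b", datetime(2023, 1, 2, 9, 30), "vol1.xml"),
]
delivered = {
    "indexer_a": {"vol1.xml", "vol2.xml"},
    "indexer_b": {"vol1.xml", "vol2.xml"},
    "indexer_c": {"vol1.xml"},
}
report = weekly_pickup_report(
    events, ["indexer_a", "indexer_b", "indexer_c"], delivered, now
)
for account, row in report.items():
    print(account, row)
```

In this sketch an account that has never picked anything up is treated the same as one that has gone quiet, since both would call for a follow-up with the indexer.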
As for Google Scholar, we are a little bit fortunate. We have maybe more connections with Google Scholar, and whenever we send them inquiries they respond fairly quickly. I established a Google Sheet with them, so whenever there is an inquiry from an author about something that we cannot easily resolve, I put it in the Google Sheet so that we can trace it, go back, and monitor it.
Again, they are a small team and they cannot possibly deal with everybody. So it's not easy, and that's why we put some best practices on the website. We talked with them about it, because we receive so many complaints from authors about articles not being found, and the problems may come from the authors themselves. So we wanted to make sure we put these practices on the author website, so that it's a sort of communication and education process.
It does strike me that Google Scholar is such an important resource, but they do have a pretty small team working on it. And one of the things our authors pointed out recently is that for some of our proceedings in the Google Scholar results, when you click through, it goes to the link resolver for a different journal, a school psychology journal, I believe, which has nothing to do with us.
So resolving that problem has been a little tricky. We're looking for the root cause, but it can be difficult, as you know, to work between the publisher, the referring source and the link resolver vendor to see exactly where that leak is occurring.
Yeah, Google Scholar handles journal articles better than proceedings. That came up when they did the subscriber links project, because we have a huge amount of conference proceedings. We're the largest publisher of conference proceedings; three quarters of our content is conference proceedings.
So we really have to work with them. They did some special projects for us too, because our eyes cells are not optimized. There are gaps here and there, and this is not something that we can easily fix. I did try to push the conference team, but it's not something that can be easily done. So we had to work with Google, and we had to do some workarounds.
And now I think our conference proceedings papers are more visible in Google Scholar. But again, they do not handle books that well yet. So I think it's a learning process for both sides.
Well, thank you, Julie. And you may have also noticed a question came up in the chat as to whether or not your slides will be available. I'll find out about that, because I agree that it's really useful to show the whole picture of all of the steps and all of the places where...
I've already uploaded the slides in the session.
Yeah, OK, I see it. Wonderful, thank you. And so, Jenny, I wanted to ask you a couple of questions too. As I mentioned in the chat, your talk, when we recorded it, was really the first time I had heard of practice research. And your talk does a good job of showing everything that goes into that.
I did a little bit of googling on my own to find out what is this and why have I not heard of this? And I think being a little bit more on the scientific side of things, I'm a little less in the loop on the arts and humanities. So I just wondered if maybe other people are in my situation here and are new to that concept. I know it's not easy to define, but if you could quickly review the definition and perhaps give a concrete example of an output of practice research.
OK, yes. Let me have a think about that one, I guess, because there's just so many of them. I think the kind of example that springs to mind is this idea of performing arts practice research. And actually one of the participants in our research, a musician, gave us some examples. So when a piece, it could be a dance piece or a musical composition, perhaps, when that's recorded, the actual event is happening in real time.
And so the research community sees that event that happens in real time as different to what we would see and capture in a repository, which they'd see as a remediation or a version of that event. And what's different about this is that sometimes the research questions and conversations are very obvious in that recording, but sometimes it needs a narrative to point out and explain what the research questions are.
So you might often find that a practice researcher does publish a journal article that describes some of the methodologies and the practice and the process. But often that kind of tangible information is not always obvious in that recording, for example. So it's a very different kind of, I guess, interpretation compared to scientific publishing, where it's spelled out in a journal article, so to speak.
So that's what practice research is and how it happens. Another example: actually, there's a project called the re tax project going on in South Africa where they've got archivists sitting in and recording and capturing performance. I'll find a link in a minute and share it. That's a really good example of where the research questions and process are being captured in the recording of that performance.
So it's just a very different, I guess, illustration of the research process. Also, going back to the idea of scientific research, that practice may well still happen, but there's a very clear, agreed standard for how you publish and talk about that methodology and those practices in a journal article. Whereas, for example, and this is a really good example, though I'm probably going off on a tangent here, Indigenous communities who create knowledge often create it in a verbal or an oral format.
So there's this question around whether imposing written formats on certain communities is actually just proliferating this inequity, because you're expecting a community to communicate in a way that's a very Western, written, often English format. Does that help? I'm not sure.
Oh, it does. And I'm glad that you brought up the example of Indigenous communities, because I was feeling that practice research is a somewhat more inclusive methodology, because it's not necessarily relying upon our traditional models of peer review and reproducibility.
So I wonder if you do encounter resistance from more traditional arenas that may critique the fact that the knowledge gleaned from practice research isn't necessarily systematic or repeatable.
Yes, and I think what we've learned is that there are often elements that are reproducible. It might be not necessarily the content of the practice, but the actual format and the way the practice is carried out, that can very much be reproducible.
And we're certainly working with the UK Reproducibility Network to try and bring together some of these conversations, because it's not that it's not reproducible, it's just reproducible in a different way. So going back to that idea of participants, it's about understanding what you have permission to reproduce. And actually what's really important to this community is maintaining the integrity of what was captured and what those participants have given you the right to capture, because it's not necessarily a case of simply reusing a piece of original research.
It needs to be reused while maintaining that context. And that's really important.
OK, thank you. Those are the two questions that occurred to me, top of mind, for both Julie and Jenny, but I would like to invite anyone else on the call, if you have questions for our presenters, to pop those in the Q&A, or feel free to speak or add additional comments in the chat.
And while we're waiting for that, I do want to echo what Adam has said here, that Google Scholar is not an official product and only exists at the whim of Alphabet, and it's staggering how much reliance there is on it. That's absolutely true. I know at SPIE Digital Library we work so hard to optimize our results in the big discovery layers that libraries rely upon. But nonetheless, probably 80% of our traffic comes to us from Google Scholar. So it's this area that we can't directly optimize and yet is used so broadly. And authors particularly really, really care about their results there.
Yes, which is why I think that's what really struck me about looking at these non-text, non-traditional practice research outputs: how invisible they are. I mean, there's certainly work to be done in the next 5 to 10 years to move the landscape to a place where this research is recognized. Because at the moment, if you don't publish that single PDF journal article, well, as Julie was spelling out, even just conference proceedings are different enough that they're not easily discoverable, let alone going beyond that.
Yeah, several years ago, most of the inquiries I received were from libraries, from librarians, and now I get more inquiries from authors directly. So I think we are shifting toward a new, more author-centric landscape. And so I want to devote more time to figuring out how to make authors' papers more discoverable, with a more author-centric focus.
And so I think the landscape is changing somewhat. Like we said, we already did a lot of work with the discovery service layers, and there's only so much we can do. But most of the traffic comes from Google Scholar and maybe Bing. Actually, we are seeing a lot of visits coming from Bing now; in the past year or so we have seen a spike in traffic from Bing.
And now, if Bing is going to incorporate ChatGPT, I think there will be even more. So I think it's an exciting time to watch how the search engine landscape is changing.
I want to piggyback on your idea of the conference proceedings. I know at SPIE more and more of our conference proceedings have a video-only component. We do encourage our authors to submit that PDF manuscript, but it doesn't always happen.
And with that video content we are seeing the need for new standards. I know ISO has just finalized the metadata standard for video, which we'll be looking at quite closely. But we do find that some of the more traditional indexes, such as Web of Science, will not index that video content. And in fact, if one of our conference proceedings volumes contains less than 70% full text, I believe, they will reject that volume as well.
So I think there is a real need in the community to continue pushing for the viability of non-text outputs as serious scholarly communication artifacts.
For sure. Yeah, I totally agree. And so the release of this new standard on audio and video is really timely, and it will definitely give things a push, especially for conference proceedings because of the pandemic.
So many conferences are online only now, so I think we will have to deal with it. And I think it's just a matter of time before the indexers, like Web of Science or the other parties, have to invest more in that direction. So yeah, I think this is the direction to go.
I agree. I think that we will eventually see a decrease in the dominance of the PDF.
But then again, I've been thinking that about records for 20 years and we are still working with records, so time will tell.
Yeah, people have been talking about the disappearance of the PDF forever, right? But it's still the number one choice of format for most scholars. It's easy to save and it's easy to put somewhere, and it's not likely to go away anytime soon.
But video and audio will definitely be added, and new formats will be most welcome. So yeah, times are changing and all of us have to follow that trend.
Yes. Yeah. I think that idea of equity is really, really important. I know we've been speaking internally about accessibility, digital accessibility of our repository, and PDFs are not great for accessibility.
So I think that's certainly something that aligns with our university values, but something we also really care about as a project is this idea of recognizing equity, and the whole existing landscape just kind of continues the existing challenges. So hopefully conversations like these will point us in a direction where we can recognize that. Something else that came out of the work was that our community wanted all of this content to be accessible.
And that's really tricky. That's really challenging. It's not straightforward.
Yeah, many publishers now have in-house accessibility specialists, right? So we have to pay attention to the keywords, where to put the captions, et cetera. And we do have a content manager who watches out for these things.
So for instance, we did a project to add captions to the images and the videos, because we just want to do it systematically. And again, all these projects take resources and take time. But as we receive more and more requests from libraries, we have to do it little by little. And accessibility is really high on our agenda.
OK.
Well, I would like to invite any of our attendees to add questions to the Q&A for Jenny and Julie, or any other comments. Our presentation was a little bit shorter than most will be, since we only had two speakers for our session rather than three today. But a lot of interesting ground has been covered here. I'd love to hear from more of the folks on the call.
Everyone stunned into silence? It would be lovely to get some other questions.
I've noticed that this conference has so many sessions on metadata, a huge amount, talking about different aspects of metadata. So we're talking about more concrete examples for discoverability, and there will also be many sessions on the infrastructure, the PIDs, et cetera.
I think it's interesting. It's just a huge thing, and we only cover a small portion of it. So I think we can go to other sessions and think about it from different angles.
And I believe that Jenny and Julie, both of you are speaking in other sessions as well. Is that right?
Yes, I'm also speaking in a session on SeamlessAccess.
We implemented SeamlessAccess last year and it was a success, and we wanted to showcase how we worked with SeamlessAccess and with different teams to launch it, and also to show the impact and, I think, things to do in the future.
And Jenny, you also have another session.
I have. And, apologies, I think it's called describing the practical. One of my collaborators is on this call as well. And that's happening, I think, Wednesday afternoon at around about the same time. So we'll be talking in a bit more detail about some of the metadata and persistent identifier findings that came out of the project. If people are interested in this, I also just, and I think I've got the right link here, know of some work that's been going on in the US around next generation library publishing that may be of interest to people.
I think that's one of the areas that has been pointed out to us. And actually Australia is doing quite a lot of work, at the Australian Research Data Commons, around humanities and social sciences infrastructure and publishing as well. So there are things going on there to look out for.
I'll be sure to attend both of those sessions, and I'm very grateful that during this Zoom session we didn't have any of the technical difficulties we saw with our keynote this morning.
So things seem to be operating pretty smoothly now. So if there are no additional questions or comments, we can bring our session to a close. I would again like to thank Julie and Jenny for their expertise and time today and encourage everyone else to check out the notes for our session, which I'll drop into the chat here. And encourage everyone to enjoy the rest of the conference.
Thank you all for attending this session.
And thank you, Laura, for organizing this session. It's very informative.
Thank you both so much.
Thank you.
I just wanted to also mention that the notes are ongoing. So please, even after the session is complete, that document is active. And if you have thoughts that occur to you, please feel free to add.
Thanks very much.