Name:
OA Usage Reporting - Understanding stakeholder needs and advancing trust through shared infrastructure
Description:
OA Usage Reporting - Understanding stakeholder needs and advancing trust through shared infrastructure
Thumbnail URL:
https://cadmoremediastorage.blob.core.windows.net/303f4e50-e879-472a-8a3c-d4d897428d40/videoscrubberimages/Scrubber_1.jpg
Duration:
T00H31M59S
Embed URL:
https://stream.cadmore.media/player/303f4e50-e879-472a-8a3c-d4d897428d40
Content URL:
https://cadmoreoriginalmedia.blob.core.windows.net/303f4e50-e879-472a-8a3c-d4d897428d40/OA Usage Reporting - Understanding stakeholder needs and adv.mp4?sv=2019-02-02&sr=c&sig=CP5CarJrv0Vtq9wBYdn%2BIZezxXfYp94TTbs%2Bv1juGmk%3D&st=2024-12-30T17%3A12%3A51Z&se=2024-12-30T19%3A17%3A51Z&sp=r
Upload Date:
2024-03-06T00:00:00.0000000
Transcript:
Language: EN.
Segment:0 .
I think we've got a couple of questions in the notes document. Do we want to just jump in? Well, Jennifer? Yes, thank you. Thank you all for joining us. And thank you to the panelists for that very interesting and thought-provoking discussion.
One thing that I've learned, it's a benefit of being on a Thursday afternoon rather than Tuesday morning, is that I got to watch lots of other moderators to find out how they handled these Zoom discussions. And I noticed that there is usually a lag with people coming into the Zoom and then getting their questions into the chat or the question and answer module.
So I decided to just start by posing a question to all four of our panelists, just to get the conversation going. I'd like you to spend a couple of minutes addressing a question that I imagine most of us have been pondering, which is: why is it proving so challenging to develop best practices around usage data reporting?
I don't want to go into professor mode and call on someone. So who would like to go first? I can start. Just this is just my own opinion. But I think it's so hard because all of us have our own motivations and we all value the access and use of scholarly information differently.
So we have to balance what our own needs are with the value for the needs of the collective group. So for us at Annual Reviews, as a non-profit publisher, we have embraced our mission of experimenting and trying to make something work not just for those who are paying for it, but for those we feel should equitably have access. And there are institutions, and indeed cultures, who radically disagree that that should be a high value.
So I think that's why it's hard. It's not just the value of each stakeholder, but it's also institutional values. It's also cultural values that are impacting what we're creating. Thank you. I'm happy to chime in. So from my perspective, there are just a ton of different stakeholders here.
And if we think about how long it has taken COUNTER standards to be developed, I can't remember how long ago the first COUNTER Code of Practice came out, but I think it was ten, 15 years ago. It's a long time. And even there we had a fairly standard set of stakeholders. It's publishers reporting to librarians, typically, and that has taken a very long time to get to where we are
with the COUNTER 5 Code of Practice, and we're still improving it. And in open access usage reporting, we've got so many stakeholders: stakeholders outside the library, stakeholders outside of traditional publishers. We've also got all these investments in legacy infrastructure. And, you know, we're not all funded equally. And it's really tough to make the case for a brand new investment when you've got an existing one.
And some of those investments are tied to Google Analytics, some of those investments are tied to COUNTER reporting. Some of those investments are tied to institutional repositories, and some are home-brewed in-house stats that only someone down in IT knows how to produce, and God forbid we should ever change them. And then we have different goals. And this is one of the things that Tricia, you were talking about earlier.
What are we trying to get out of this? Is our goal to calculate our cost per download? Is our goal to show to authors that, look, we generated usage for you? Is our goal to show that there are particular stakeholders in the community that got value from this? So when you throw all that in together, it's a total Wild West, and I don't think it will coalesce until we start to pick out of this sort of LEGO box of bits:
what are the one, two, three key use cases that go across our community that we want to really focus on simplifying and standardizing? Christina, would you like to add anything? Yeah, I was just going to add, I think one of the challenges is finding a way for us to share that cross-platform data in a more efficient way that keeps the door open for different communities and different groups to run with their use
case development. There is so much innovation happening in the various niches right now, especially around data visualization, I'll put it that way. And we know there are so many unique use cases for this information. I think the challenge is figuring out what is our shared need and how can we work on that together while also leaving the door open for all of that innovation on the edges?
Because there's going to be a lot, and I feel like we're just at the forefront of what that's going to look like. I think from my point of view, the hard part is, given how many different stakeholders are involved, for each of those groups to see that they're part of this overall supply chain and what that role is and how it connects to all of the other parts.
I think that's just a very difficult thing to do in any sort of landscape, particularly this one that is this kind of dense with a variety of stakeholders. OK, thank you. I'm not sure if I can see the questions in the question and answer module or not. So Kendra, if you have access to that and you see anything pop up, please let me know. And I encourage the audience to include their questions in the chat or just, you know, raise your hand and speak up.
That would work as well. But I have a couple of more questions that I'm. There is one question in the doc, actually, Jennifer, the one that Kendra interlinked to the sort of shared doc that everyone can contribute to. OK thank you. OK it's for. It's for Trisha Miller.
Could you share more about how Annual Reviews balances the transparency of usage data with the privacy for users and their institutions? Yes. At the moment, privacy is what we're most focused on. So a lot of the open access usage reporting that we're getting, we're keeping to ourselves and just trying to understand what our impact is. If there is an institution looking for what the open access usage is, as we discussed in the presentation, some is attributed and some is not.
So we can't necessarily say who the actual users are, but we can give general information if we know that there is an attributed use based on IPs that we have registered or that we know who the institution is. So at a basic level, we're transparent to specific users, but private beyond that. And so this is why this framework and standards are really important, because there really isn't a rule.
There isn't anything to say, this is what you should or shouldn't share. So at the moment we're just toeing a really fine line: if we know who it is, we can share only with that institution. If not, we keep it to ourselves. And that's all. Thank you. And there's also a question in the doc.
It's essentially to Christina, even though it doesn't specify that. And you answered it a little bit, but I thought you might want to speak about it more. The question starts with the observation that it would have been nice to have library publishers talk about usage data reporting, and then it asks if the University of North Texas libraries have their own library publishing program for journals, and if they do have a journals program, how do they handle reporting?
And believe it or not, within the libraries, that's actually handled by some of my colleagues in our library scholarly publishing services. So I did put the link in there, and whoever asked that question, if you want to reach out and chat, I'm happy to put you two in contact. We do have journal hosting where we do that free hosting and support for journals through UNT, and we also publish online editions for open texts or OER resources and open books.
And so we do have a lot of those mechanisms there. And I'm happy to connect whoever that is with our staff that manage those pieces. OK, thank you. I actually have a question for Tim and Jennifer. I know from our discussions when we were putting this panel together that trust is extremely important as it relates to usage data reporting.
And we heard Tricia and Christina talk about it explicitly, but I was wondering if you two would like to share your thoughts on it, especially in terms of what stakeholders in different parts of the scholarly communications ecosystem can do to promote trust. And Jennifer. Yeah, it's a great question and it's a big question.
I mean, I think it's always important. And again, I'll go back to when there are lots of stakeholders involved, because so much of this requires discussion and collaboration. And I don't think that ever gets very far without a lot of trust. So I think it's the need for it is definitely there and it's sort of implicit even if we don't always make it explicit.
And one of the things that Christina and her work and others are trying to do is to put a spotlight on that, to make it a little bit more explicit, to have some kind of ground rules basically around how some of that collaboration will work. I think without that, it's going to make all of this work much more difficult. And what are your thoughts, Tim? Um, I think this comes back to transparency,
standards and best practices. I feel like this is a path we've carved already on COUNTER, and we learned a lot doing that. So one part of trust is ensuring that we're comparing apples to apples. One of the things I really value in COUNTER is not the reports so much, but the metrics. The metrics are really good building blocks for usage reporting.
I'm talking here about searches, investigations, requests (denials, obviously, are not relevant in an open access environment), but you can then build reports out of those. And these are things that we as a community understand and we can audit. We can be confident about what they mean. Best practices as well, because there are plenty of areas where we don't have standards yet. So for example, if you attribute usage to a geographical location, is that based on the user's location, which is very fraught and can be highly inaccurate?
Is it based on the location of an organization or affiliation? How do we do that? Even if you don't have a standard in that area, we could have some best practices. And then transparency. So where you are doing custom processing logic to do something, be transparent about it; say, this is how we're getting these numbers. And I think if, for the numbers flowing through the system right from collection to delivery, everyone's clear about what they mean and how they were calculated,
that does a huge amount to build trust in the overall process. Yeah, I imagine that it must be particularly difficult to figure out how to attribute usage geographically when we're living in a world where so many people have to choose to use VPNs to access content, maybe because it's their only way to access content where they live.
And that's a shame, because I think we would all like to know a little bit more about how people living in some more oppressive places are using open access content. OK, there's just one more point I'd say on that, and that is that it's also hard to know what it means. So in a world where people are no longer working from offices and they're working from home and they may be working from different countries,
what does it mean to say you've got a whole bunch of users in Poland? Is that because there are organizations in Poland using it, or because people just happen to be working from Poland? So the notion of what that geolocation tells you is also really tough. That, plus the fact that someone's geolocation might just be a data center.
It's not where they are. And so it can be challenging to interpret these. There's just one shout-out to one interesting idea here, and I'm just going to throw a link into the chat. It's the idea, and this relates back to the question about privacy, that you can invite people to provide qualitative feedback. Let me put this link in.
So Harvard's got a repository called the digital access to humanities. I think it's called Digital Access to Scholarship at Harvard. And if you click on that link, you'll see that they've got this little slideshow of feedback from people. And this is a really nice way to get qualitative feedback, where when people access content from their open repository, they're invited to leave a note about why they get value from it.
They don't have to leave their name. They can just leave an affiliation or nothing. They can just leave a comment. But as you see there, that's a nice way to invite the community to share how they got value from it without breaching any privacy. Christina? I just wanted to kind of build on that. I think the qualitative is so, so important as we have these conversations around the impact.
But going back to that issue of trust, I want to flag, and there's actually some great work by the Open Data Institute around this, you know, how do you maintain trust when you're either bringing together data or sharing data with others? And I feel like this is an important place to bring up two contexts and concepts. One, the role of neutrality when it comes to trust. So it's easy to trust things with people where we're on the same team or we're in the same consortia, and we have a shared affiliation that's holding us together.
But when we think of global infrastructure, what does that mean? And we think of the global ecosystem, where each of us have our own national regulations we have to abide by, and perhaps local cultural context. I was on a call yesterday where we were thinking about the CARE Principles and how that interacts with all of this.
And so as we think about trust, I think it's important to always keep in mind that at some point, trust depends on not only the data itself (can we trust the data quality?), but there is an aspect of trusting how that data is used, to make sure that there isn't a use that is out of line with what either the subjects behind the data expect or those who are sharing the data expect.
And so there are all these potential unintended consequences that could evolve. We've seen this with all kinds of AI uses of things that are in the open and public domain. And so when we start thinking about what trust means, I think that's where we have to take this next step of who do we trust as a community to play that neutral role as a broker?
But also, how do we protect against those unintended consequences, so people trust the ecosystem at large with this data? Thank you for that. That's a good point. Jennifer, do you have something you'd like to add? Yeah, I just want to pick up on the idea of best practices. I'm a big fan of them. I think it's worth noting that they're just so often aspirational, and so
being transparent about that, I think, is really important, because we can agree on best practices. But, you know, for many of us, in many situations, we won't be there just yet. And that's not a problem necessarily. That's not a reason not to try for it or to participate. It just requires, I think, being transparent about that fact and why.
Good point. I mean, best practices are what keep me up at night, because they're the standard that you'd love to meet, but you're always aware of, hopefully small, ways in which you're not quite meeting them. But it's nice to have something to really strive for.
So, let's see. You've touched on this topic in your recordings and in some of your answers to previously asked questions. But I'd like to flesh it out and deal with it a little bit more explicitly. How do you balance the
need for privacy, perhaps even the legal need for privacy, with the need or desire for granular usage metrics? Where do you draw that line? Would anybody like to go first? I don't mind starting. It's a problem that we face,
frankly, in a paywalled world as well. It's this issue: there's a trade-off. When people access information anonymously, it protects their privacy, but it also makes it harder for the person that's publishing that to make a good case for the value of them publishing it. And we see a similar conversation, therefore, in libraries, where as we switch from IP authentication to forms of single sign-on where information on the user can be transmitted, you know, publishers and other service providers are keen to get more granular detail on users to understand them and to make the case for value.
But then that is also a privacy concern. And the same problem is playing out here. And it's a difficult one, because if you know nothing about your audience, you're in a much weaker position to make the case for the impact of open access compared to someone who knows more about their audience. And so I think what we need to do as a community is do some more, frankly, experimentation here.
Think about what else we can do to identify impact that doesn't compromise privacy, or where we're very transparent about the value of it and there's a value exchange. So in the paywalled world, what publishers might do is say, hey, log in and we'll give you some extra tools, some extra value, or personalize your experience.
Now, maybe there are similar ways to encourage users and consumers of open access content to get value from it, to provide that sort of feedback that helps build the business case in a privacy-preserving way. It's a difficult balance. Christina? I would just add kind of a third scenario to the two Tim outlined, which is what we often talk about in terms of as open as possible or as controlled as necessary.
And I often find myself thinking of the Census Bureau and all of the data we give to the census, which is very personal. And we need that for, of course, public good, public services, public planning. And I think there's an analogy here, in that with very personal data, they are able to do a lot of privacy masking and algorithmic changes to how that data is shared so they can still share it out and get the public good.
And I feel like what we're trying to figure out right now in our space is what is appropriate, how much can we share, how can we get the value we need as an ecosystem and its different stakeholder groups while still protecting that privacy and doing it in an open way that doesn't harm anyone. So I feel like there are other areas around us where this is being done. I don't think it's an unsolvable problem, but we still don't know exactly what we need, which levers to push.
Thank you. Tricia, do you want to add something? Let's see, you're muted. Thank you. I just wanted to say that we can do a lot with open access data without sharing it. And in my role, I'm concerned with sort of creating these new personas of who our users are.
And then with that, we can evaluate what are we creating for them. You know, just because there's access to information doesn't mean that everyone has the same level of understanding. So should we also have a role in creating new ways to communicate science or new pathways that sort of thing? So I think when we're talking about this transparency of data, it's not just about sharing it, but also about what opportunities is it creating for publishers.
We can do so much more if we know who our users are and can do more for them. So I think there's actually a real opportunity beyond just the data sharing. It's also about creating something more. But to turn that. I'm sorry, Jennifer, just go ahead. To make it a discussion.
I was going to say, as we talk about users, I presume you don't need to know the specific reader, like, Christina Drummond opened this, or Christina opened this chapter to read this paragraph at this time, on this day. There's a level. And one of the best case studies I had heard when we did our research two years ago was that some folks wanted to look at how regional open usage was occurring
so that they could inform their print strategy and their translation strategy. And it's like, hey, we can actually get books that are more often used in that market if we know there's demand. But how do we signal demand? And to do that, I would say you could probably cut off some of those IP addresses and still get a regional enough perspective of what usage is without knowing
Christina read it at this time. It's true, and I think that's a lot of what open access data is showing. You know, as more content becomes open, less is attributed to an IP or an institution. So we have more information, but different information that will inform us of new strategies that we can take. So in terms of privacy:
is privacy violated when you collect the information or when you share the information? Is there an agreement on that? Technically, it's when you collect it. I mean, the GDPR and related ones are pretty clear. You should not be collecting personal data on people without their permission. But the definition of personal data is quite specific.
So if you collect something which you are unable to relate back to a human, you know, to a specific individual, that doesn't count as personal data. So personal data, the definition of personal data, can build up over time. It's where you have many bits of information on people and you can start to relate that to each other and start to identify individuals.
So being able to identify IP addresses down to the individual level is much more privacy threatening than being able to say, I had a user who is from this organizational IP range; I have no idea who they are. Well, this isn't the case in the US, but in the EU, for example, under GDPR, IP addresses are personal data. And so it's very country specific right now in terms of what that is.
But it's not just that. In some places, like in the US, you know, notice and consent is one thing, but in other places you have to have express consent for each one of those instances of sharing. And because we have a global ecosystem, I think we're rapidly approaching a place where each of our organizations needs to navigate that complexity, which is going to be challenging if we don't figure out how to do it together.
Well, we're getting close to the end of our time. So this actually is a good lead-up to the one final question I've wanted to ask, which is: in a global world where everything's connected and we're talking about content that's available to anybody, anywhere, how do you negotiate the different laws and standards in different countries? All of you work with usage data.
Do you have any tips or suggestions or best practices for that conundrum? I think this is the million-dollar question. And it's not just about different laws or standards. It's also about cultural value. You know, in some parts of the world, open access isn't solving a problem. They don't see that content needs to be open.
Just because some of us do doesn't mean that we can change institutional or geographic values, wherever you are. We can't change other people's minds because this is what we think. So we have to be, I think, really respectful and collaborative and recognize that we're all in different places, either wanting or not wanting open access, being on a pathway or not wanting to be on a pathway to open access, and being able to have these standards and frameworks that can
successfully allow everyone to participate. So whether you just want the cost per download, you only want to know about your IPs' use, or you want to know how you've helped every individual person in the 50 miles around your institution, you know, we have to be able to create something that works for all of us. But that's the biggest challenge: the standards and the framework and that compliance are going to be different
for every person you ask. Another way of saying it: you should be cautious. Just be cautious. If you don't have a clear reason to collect something, don't collect it. If you do have a clear reason, ask for permission. Christina? I think this also really comes back to the interoperable component of FAIR.
Like, we have to think about how. I often joke it's not Tolkien, right? We're not building the one ring to rule them all. But can we interoperate, or can we operate with a distributed network? Is this a federated network where we're going to have to have national infrastructures that interconnect, much like we have with the internet today? We don't have a single one; we have all kinds of NRENs in each country that are coordinating and working in collaboration.
I wonder if we're just at the early stages where we're trying to figure out, is it really one global effort, one global organization, or is it this kind of federated network that's emerging? So thinking about data sovereignty, which is really taking off in many places right now, is going to be a key piece of that. And so just lots to watch for over the coming years. Interesting. And Jennifer, is there anything related to this or any of the other topics we've brought up today that you would like to add?
Yeah, I think it's interesting because we're talking about such granular data at the moment, and I think that that's fair and makes perfect sense, and that's where a lot of these things come into play. For me personally, as you know, having a librarian background and being involved in infrastructure, I tend to think of it at the very macro level, where you might not need a lot of data. I want to know, at a very high level, what open access usage is doing, just in books across the board.
And it would be nice to have country-by-country information sort of sliced and diced and all that. But I think for the group of stakeholders, just to have that kind of aggregate information, which has its own set of challenges, of course, would be really useful to have as well. And I hope we're in or get to a place where we can do that. OK, well, thank you. I think we're out of time, so we'll end it there.
I think you've brought us once again full circle, because in the end, what we were talking about in this panel was the different needs of different stakeholders and how to define them and meet them. So thank you very much. I appreciate your effort in putting together this great panel. Thank you.
Thank you, everyone. Thanks, everyone. Bye bye.