Name:
Aligning national priorities when it comes to open science metadata requirements Recording
Description:
Aligning national priorities when it comes to open science metadata requirements Recording
Thumbnail URL:
https://cadmoremediastorage.blob.core.windows.net/5efad249-b403-4a0f-88ca-a966dc00fd65/videoscrubberimages/Scrubber_3.jpg
Duration:
T00H34M43S
Embed URL:
https://stream.cadmore.media/player/5efad249-b403-4a0f-88ca-a966dc00fd65
Content URL:
https://cadmoreoriginalmedia.blob.core.windows.net/5efad249-b403-4a0f-88ca-a966dc00fd65/Aligning national priorities when it comes to open science m.mp4?sv=2019-02-02&sr=c&sig=pmjfYdpnrsKttY4%2BeGqMlozv2uhqvvsydVf6EkGECrw%3D&st=2024-10-16T00%3A19%3A52Z&se=2024-10-16T02%3A24%3A52Z&sp=r
Upload Date:
2024-03-06T00:00:00.0000000
Transcript:
Language: EN.
Segment:0 .
Hello and welcome to this next session of the NISO Plus 2023 conference. I hope that you've been enjoying the meeting so far.
This segment of the conference is entitled "Aligning national priorities when it comes to open science metadata requirements." My name is Todd Carpenter, executive director of NISO, the National Information Standards Organization, and I will be the moderator for this conversation. Joining me today are Dan Valen, Director of Strategic Partnerships at figshare, a part of Digital Science.
Dan joined figshare in early 2014 as the first US based employee. In his current role, he focuses on the development of figshare in the academic, publisher, researcher and biopharmaceutical community through building strategic partnerships, community engagement and educational outreach. Prior to joining figshare, Dan spent six years at one of the largest open access STEM publishers.
Dr. Nokuthula Mchunu is Deputy Director of the African Open Science Platform at the National Research Foundation in South Africa. She earned a doctoral degree in fungal genomics and was previously a senior scholar at the Department of Biotechnology of Durban University of Technology. She has served as a scientist in a number of international institutions, including the University of Cincinnati in the US, Lund University in Sweden, Tianjin University in China and the Centre for Chemical Biology in Malaysia.
She's the first recipient of the young scientist program between China and South Africa. Dr. Jo Havemann is a CEO, consultant, entrepreneur, mentor, researcher and trainer with a background in evolution and developmental biology. Dr. Havemann is a trainer and consultant in open science communication and digital science project management. Her work experience covers NGOs, a science startup and international institutions, including the UN Environment Programme. With a focus on digital tools for science and her company, Access 2 Perspectives, she aims to strengthen global science communication in general, and with a regional focus on Africa, through open science.
To begin, a little framework. From the US perspective, the United States has been a hotbed of innovation focused on open science and research for several decades now. Significant advances in data sharing, repository development, open source tools and open access publishing have taken root in many, if not most, US institutions. Unlike in other national communities, the diverse and uncoordinated nature of the US research ecosystem has made consistent alignment on national priorities somewhat more difficult here than in other countries around the globe.
Communities of practice have developed independently, with domains developing their own specific metadata and identifier requirements and norms. At a higher level, however, there needs to be consistent application of a core set of required data elements to ensure dependability of discovery and the connection between research outputs. This core would include elements like ORCID identifiers, DOIs for research outputs, institutional identifiers, funding identification, protocols, terminologies and similar resources that exist for practically every research output.
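As a rough illustration of what checking such a core set could look like, here is a minimal Python sketch. The field names (`doi`, `creator_orcid`, `institution_id`, `funder_id`) and the function `core_metadata_problems` are hypothetical and not drawn from any standard or from the talk. The ORCID check uses the ISO 7064 MOD 11-2 check character that ORCID iDs carry; the DOI pattern is only a loose shape check.

```python
import re

# Hypothetical field names for a minimal core record (an assumption for
# illustration, not taken from any standard or from the session).
CORE_FIELDS = ("doi", "creator_orcid", "institution_id", "funder_id")

# Loose shape checks only; they do not confirm that an identifier resolves.
DOI_PATTERN = re.compile(r"^10\.\d{4,9}/\S+$")
ORCID_PATTERN = re.compile(r"^\d{4}-\d{4}-\d{4}-\d{3}[\dX]$")


def is_valid_orcid(orcid: str) -> bool:
    """Check the format and the ISO 7064 MOD 11-2 check character of an ORCID iD."""
    if not ORCID_PATTERN.match(orcid):
        return False
    chars = orcid.replace("-", "")
    total = 0
    for digit in chars[:15]:
        total = (total + int(digit)) * 2
    result = (12 - total % 11) % 11
    expected = "X" if result == 10 else str(result)
    return chars[15] == expected


def core_metadata_problems(record: dict) -> list[str]:
    """Return a list of human-readable problems with the core fields of a record."""
    problems = [f"missing {name}" for name in CORE_FIELDS if not record.get(name)]
    doi = record.get("doi")
    if doi and not DOI_PATTERN.match(doi):
        problems.append("malformed doi")
    orcid = record.get("creator_orcid")
    if orcid and not is_valid_orcid(orcid):
        problems.append("invalid creator_orcid")
    return problems
```

A repository could run a check like this at deposit time and report the returned list back to the depositor; which fields count as required is exactly the kind of decision a consensus process would have to settle.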
Because of their size and centralization of research and administration, some nations have been faster and more able to adapt, adopt and promote this consistent infrastructure than others. The US, unfortunately, hasn't been a leader in consistent application, for many reasons. However, it is about time that the United States stepped forward. In the fall of 2022, the US White House Office of Science and Technology Policy put forward a plan to mandate many open science practices for all federally funded research,
and thereby increase the adoption and prevalence of these open resources. This plan aims to be fully implemented by the end of 2024. That memo also outlined the broad scope of many of the technical requirements for general metadata and identifier application related to these open science objects. The US federal government agencies are in the process of drafting their own specific implementation plans in alignment with the general guidance issued by the OSTP.
While significant and important, these plans will fall short of a national plan or strategy for open science in several key respects. First, and possibly most important, these plans do not constitute a national digital strategy, but rather the adoption plans of various US agencies. The US federal government may be the largest single funder, but it is by no means the only funder or generator of research outputs.
In the US ecosystem, individual research institutions and a vast network of both corporate and philanthropic institutions contribute the majority of research outputs. In the US research ecosystem, academic institutions, libraries, publishers and software companies all contribute to the distribution and preservation of these outputs. Which brings us to the second challenge presented by any federally led national strategy.
Each of these communities has to agree on and adopt a consistent strategy for how these research outputs should be handled in their own systems, in their own domains. A consistent approach by all players, at a very high and very general level, needs to be consensus-led to achieve buy-in to that strategy and consistent application across different systems. It is through a voluntary consensus process like those used by NISO that we can come to agreement on how identifiers should be applied across organizations,
and on how metadata to describe those research outputs, again at a general level, can be consistently applied. This consistent approach to metadata is key to some of the most fundamental elements of the FAIR research outputs framework, particularly when it comes to the final two elements regarding interoperability and reusability. Hopefully the following presentations will give you some quality framing for the subsequent conversation about how we can drive toward more consistent and more effective research infrastructure through the use of metadata and identifiers.
The next speaker will be Dan Valen from figshare, followed by Dr. Mchunu and then Dr. Havemann. We will then continue the session through an interactive discussion about these systems and the opportunities across countries and continents for a truly global research infrastructure.
Hi, everyone. My name is Dan Valen. I'm the head of Strategic development at figshare.
And today I'm going to talk with you about the state of open data in the time of new funder policies. So real quick, we are figshare. We build repositories to help universities, academic publishers, funders themselves and any other stakeholders in the scholarly communication community that I may have left off better share data and other digital research content. And for those unfamiliar, figshare is a cloud-based platform that has a free component as well as an enterprise offering, where we build custom repositories for the aforementioned stakeholders.
And so each repository is constantly iterated on with rolling updates and improvements, additions of integrations with researcher tools, and the like. And this helps us keep up with best practices from funders on a global scale, remaining compliant with publisher policies as well as those funder policies. So now that we have a quick background of figshare out of the way,
I wanted to jump into the topic of today's talk, that being the State of Open Data. This is a survey we've been running for the last seven years that looks at trends in the open data space. And so over the next few minutes, I wanted to touch on how this ongoing annual survey of the global research community looks at open data and how it aligns with some of the open science actions the US government and its federal funding agencies have been taking.
So these are some of the common themes we saw in the State of Open Data over the last few years. And a lot of these trends aren't too surprising. These are from State of Open Data 2018, 19, 20 and 21. And one of the things I do want to note is that the increases or jumps here seem to be around concerns about the misuse of data, concerns about the cost of sharing data, and researchers' fear of not getting credit for sharing their data.
And funnily enough, these are all issues that the NIH and the larger US government are trying to tackle with their new policy. And one of the things that I want you to zero in on in this slide is those growing concerns about researchers not getting credit for their data. So what is slightly surprising is that in last year's State of Open Data, this being 2021, so maybe it was two years ago
at this point, we did see 70% of respondents say that funders should put in requirements for sharing data. And if the grantees don't comply, then funders should withhold funding or penalize the researchers for not complying with those open data policies. So this sets the stage nicely for the idea that, from the State of Open Data's point of view, researchers are really looking for nudging on requiring data sharing and getting that credit for sharing data.
And this puts us at a bit of an impasse. We've got respondents who say there are real concerns about not receiving credit for their data, that it costs a lot, that they're worried about misuse. But a large majority of those polled also think that it would be helpful if sharing data were mandated or enforced. So it's that idea of sticks and carrots in action.
Can we incentivize you with a credit reward system for better behavior, or do we need to mandate change through policies and requirements? And from the carrot side, we start here with PLOS, the Public Library of Science, who in 2014 were one of the first academic publishers to put forth a data policy at all. And six years later, this study came out looking at more than 530,000 articles coming from PLOS and BioMed Central and their corresponding data availability statements.
And what did the authors of this study find? One is fairly obvious at the top here. Following mandated publisher policies, data availability statements became very common in these articles. And the authors also found that articles that include data availability statements linking to data in a repository are associated with up to 25% higher citation impact on average.
And so sharing data linked to a paper in a repository works, and funders took notice. And the researchers could also take some solace from this as well. And in February 2022, the NIH issued a seismic mandate, at least according to Nature. Some are debating the seismic nature of it, but it seems to be really moving the needle regarding how researchers need to think about sharing data.
The new policy requires all NIH-supported research to have a data management and sharing plan outlining how scientific data and the accompanying metadata will be managed and shared. These are often referred to as DMSPs, and the researchers themselves also need to be compliant with the NIH-approved plan. Data management and sharing costs may be requested as part of the grant process, which is an exciting new development.
And this includes costs associated with curating that data, preserving that data and sharing that data through established data repositories. And the NIH is really looking to foster a culture of data stewardship by promoting effective and responsible data management processes and practices from researchers. Then, just a few months later, in August of 2022, Dr. Alondra Nelson of the Office of Science and Technology Policy (I refer to it as the OSTP in some places) released a memo that recommends that all federal agencies develop new plans, or update existing ones, indicating how they will provide public access to the outcomes of research that's federally funded, that being manuscripts and data.
And so some of the key points from the Nelson OSTP memo are: provide free and immediate access, without embargo, to research that is federally funded; this applies to all federal agencies; and this applies to both peer-reviewed publications and the underlying data. And it's that last part that is really exciting to us, because it meshes a lot with what we're seeing is needed to move the space forward:
thinking about data as a first-class research object along with the manuscript. And as of last week, the NIH policy is live. They've backed it up with funding in these different spaces. And this is one of their initiatives, called the GREI initiative. Essentially, the NIH is funding repositories to work together to create consistent metrics and help researchers comply with the NIH's own policy.
And what's really cool about the GREI initiative is the cooperation-and-competition aspect of it. That being that the NIH is working with a number of generalist repositories, listed here on the slide, to standardize the functionality of their repositories. And this does two things. It makes sure that everybody has a place to make their data available, and it makes sure that metadata is standardized and interoperable, allowing the wider community to ask better questions of the data and glean more from across generalist repositories, as opposed to having that information siloed.
And so if we go back to one of those early concerns that we saw a few slides back, one of the things that researchers indicated was we need to solve how we measure or give credit. And so one of the first problems that we have to solve is, how do we measure credit? And so the funders can't say we're going to measure downloads of your research when you have, say, videos as part of your research because it's 2023, no one is going to download your videos because no one downloads videos off the internet anymore.
You all stream them. So implementing open metrics, so that we have a level playing field across repositories and there's consistency in how we measure the impact, should lead to a way in which funders themselves give credit to researchers. And it won't just be a sticks approach, it will be the carrots as well. And having some kind of reward mechanism should, in theory, move this space forward.
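To make the idea of a level playing field concrete, here is a small Python sketch of counting usage the same way regardless of how an item is consumed. The event shape (`doi`, `type`) and the set of usage types are assumptions made for illustration; they do not describe how any particular repository or the metrics work mentioned in the talk actually records usage.

```python
from collections import Counter

# Hypothetical usage types; a stream of a video counts alongside a download
# of a dataset so that neither kind of output is undercounted.
USAGE_TYPES = {"view", "download", "stream"}


def usage_by_item(events):
    """Tally usage events per DOI, ignoring event types outside USAGE_TYPES."""
    totals = {}
    for event in events:
        if event["type"] not in USAGE_TYPES:
            continue
        totals.setdefault(event["doi"], Counter())[event["type"]] += 1
    return totals


def total_usage(counts):
    """Collapse the per-type counts for one item into a single comparable number."""
    return sum(counts.values())
```

If every repository reported the same per-type tallies, a funder could compare a streamed video and a downloaded dataset on the same terms, which is the consistency the speaker is asking for.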
And so that's the backdrop. We see that, looking at funders in the Sherpa Juliet database, 52 funders listed that they required data archiving in some capacity. And now we're starting to get into 2022, the most recent State of Open Data report, that being that two-thirds of respondents for this most recent report are supportive of a national mandate to make open research available or make data openly available.
And that's down from the 70% that we saw a few slides back in 2021. And we think that may be because when you ask a researcher, or anybody in the space, do you think open science is a good thing? Do you think open access is a good thing? Do you think open data is a good thing? When pressed, the majority of people will say yes, but when the rubber hits the road, the question is, are they actually doing it?
So while we at figshare and I know other folks in the community really try and keep the barriers to entry low here. There is some additional work required there. Researchers have a lot of administrative work to consider. In addition to complying with funder policies, publisher policies, institutional policies, applying to grants, writing publications and all of this on top of doing their own research.
And so this slight decline is something to flag. You have to be very realistic about what researchers are facing with the addition of these mandates and policies as well. And so as publishers and libraries and institutions themselves are also subjects of the aforementioned top down initiatives and mandates, they have an essential responsibility and role to play in the progression of open data.
And in this past year's survey, 2022 I should say, 72% of the respondents indicated that they would rely on an internal resource, that being either colleagues, libraries or the research office, if they required help with managing or making their data openly available. When asked who they would be willing to receive support from, the most popular answer was publishers, closely followed by those within their own institution.
And I think this is mainly the result of researchers having to publish a paper where they're met with new data availability policies that they may not recognize. But thanks to publishers now including these as part of the manuscript submission process, researchers look to them for guidance at that point of submission. And so here's a quick snapshot of some of the other key findings of the 2022 State of Open Data, as well as the link to the actual report and the accompanying data, which we've also made available.
So anyone can dig into the results themselves. And I recommend that everyone take a look at this report, because it is one of the longer ongoing open data surveys of its kind, and it does ask some really interesting questions, and you can kind of see trends in the space and how people view open data. And finally, figshare turned 10 in 2022. We actually just had our 11th birthday this past January, but for the sake of this presentation, we're still celebrating 10.
And so we've been beating this drum for 10 years. In fact, our initial slogan was "credit for all of your research." And the open data space has moved a lot in the last 10 years. I've been at figshare for, it'll be, 9 years in May. And so when I first started giving talks about figshare to stakeholders in the community, I had to start with a "why open data?"
Why are we even talking about this? Why are we thinking about this? And now here we are in 2023, looking harder at how to make open data a reality. And that's really exciting to me. So taken as a whole, the State of Open Data survey points to a need to plug holes around training in open data and to remove yet more administrative burden from researchers and really push the space forward.
And I think we can really do that if the next 10 years can progress at the same rate as the last decade. When it comes to open research, we really need to think about supporting researchers with better metadata, more metadata, FAIR-er metadata to help folks glean more from the research that's being made publicly available. And all of these points need to align stakeholders, that being librarians and researchers and publishers and funders, and ensure we can align both those carrots and sticks to make ubiquitous research data sharing in academia a reality.
So thank you so much for listening to my presentation. And I look forward to the discussion.
Today, I'm going to talk a little bit about African open science and how we are trying to align our strategies with the countries and the region around us. So I'll tell you a little bit about the platform and how it is trying to realize an open science vision. Basically, what the African Open Science Platform is aiming to do is to make sure that the African scientist or researcher is able to use the cutting-edge, data-intensive science and resources that are available in our modern society to understand and tackle global challenges.
And we think that in order to do this, we have to make sure that we have a federated network of computing facilities or infrastructure, make sure that we develop capacity in data science and in AI, and we are envisaging a physical or a virtual institute. We are also making sure that the collaboration between the scientists and the researchers on different thematic areas or research areas is enhanced, so that knowledge sharing and advancement are quickly pushed forward.
Also, there is the network of education and skills, as I said before, especially in data and information. And lastly, one of the most important things we're finding critical in the continent is making sure that the network for the dialogue on open science and open access is enhanced. So of course, as a relatively new platform, we have set up a governance structure where we are envisaging having nodes across Africa, and also making sure that we create the networks that will work on the different strands I've mentioned.
So in one of these areas, we're trying to make sure that we are working with the region in implementing their priorities. In the SADC region, space science has emerged as one of the priorities. This is mainly due to South Africa hosting one of the biggest telescopes there is, the Square Kilometre Array. So the SADC region, which is an economic zone of the southern African countries, has established a strategy, and we as an open science platform have now come into this dialogue to see how everybody who's involved in this project can come and share.
So this is the Square Kilometre Array that sits in Cape Town, sorry, in the Karoo, and collects vast amounts of data on many things. And this data can be shared through different thematic areas using public awareness, functional programs in the growth innovation. There are a lot of opportunities through using this space. The computing facilities, or the infrastructure attached to it, also had to grow in order to be able to accept or utilize this information.
So again, South Africa has one of the largest clusters of computing facilities. However, these facilities are connected throughout the SADC region to make sure that everybody can have access to this information and utilize it. In the same way, these computing facilities are also being used in the SADC countries to understand weather and climate activities.
So there are a lot of activities and collaborations with and in South Africa to make sure all those countries are connected and that they share knowledge and ideas based on this open infrastructure, and of course based on priority areas and implementation in each country. Also, not to lose sight of health, which is important, especially these days: South Africa has been awarded an NIH grant project which is aimed at collecting open data for science and creating a platform on which this data could actually be hosted.
So this is built, again, on a network that has been around for more than a decade in the continent, H3ABioNet, which was originally based on genomics and has now moved to looking not just at genomics but at all health- and disease-related data, and is trying to build an open data platform with everybody in the country. H3ABioNet already has 27 institutes in 17 countries, and the aim is that after five years of this project, most of the African Union member states will be contributing to this project.
For now, I will stop here, and then Dr. Jo Havemann will take over and tell you a little bit about AfricArXiv and what the idea or the goals are within the continent. And I will take questions at the end. And thank you again for listening.
Thank you. So I am telling you now about the work we do with AfricArXiv and defining Africa-specific metadata schemas for continent-wide and inclusive discoverability of African research output.
The challenges we are addressing are language barriers, low visibility of African scholarly output internationally, restricted access to research funding, and the underrepresentation of African scholars in international research networks. The solutions that we're providing are twofold and all based on the adoption of persistent identifiers and open licenses, which, we believe, will allow us to remove the challenges altogether.
So the twofold approach is this. Since the launch of AfricArXiv in 2018 as a preprint repository, we have been exploring opportunities and possibilities to establish a federated and regionally hosted repository system. For that, we are in conversations with the various stakeholders across the continent. And in the meantime, to enable immediate discoverability of African research output, we are utilizing existing digital scholarly infrastructure to incentivize and encourage African researchers to share their research.
The systems and organizations we are working with are Zenodo, ScienceOpen, Qeios, figshare, PubPub and the Open Science Framework. We are starting this year to partner with the UbuntuNet Alliance, which provides us with server space in Malawi and Uganda so that we can unpack and deploy open source software for a locally hosted version of AfricArXiv. Together with TCC Africa,
we have also been engaged in capacity building, training and webinars to inform about modern, state-of-the-art publishing workflows using PIDs and open licenses. And we very much want to work with the national research and education networks, as well as libraries and university research institutions across the continent, to assess where they are at:
what is the status quo of their digital setup for institutional repositories, and how can we have them leveraged for the highest possible discoverability and also data ownership? With AfricArXiv and the remotely hosted cloud instances, we are in the process of building a curation system on PubPub, where we invite featured collections by academic and non-academic institutions, like foundations, as you see here with the Dian Fossey Gorilla Fund in Rwanda,
and by networking institutions like AfricArXiv, Open Science Hardware Africa, and SciCom in Nigeria. So, community-based organizations, some of which, or all of them really, also provide scholarly or scholarly relevant content. The table that you see in the slides compares the different services and what types and formats they adopt. And we are in the process of making AfricArXiv searchable by language.
And with language, we mean not only English, Arabic, French and Portuguese, but also traditional African languages. We are working with country profiles now, and also mapping the research items to SDGs and research disciplines. So, looking at the African language profiles: each language of an incoming submission gets a language profile page, which can itself also be assigned metadata.
So you see Zulu here as an example. We work with Lanfrica, which is an organization that is collecting African language resources across the internet, so you can also tap into that database while you're on the page. And you see a list of the research items submitted to AfricArXiv in Zulu, and we can expand that list further by again pointing to the Lanfrica database as an additional harvesting initiative for language-specific content.
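As a sketch of how language profile pages like the Zulu example might be assembled from submission metadata, consider the following Python snippet. The record shape (`title`, `languages`) and the function `build_language_profiles` are hypothetical and are not a description of how AfricArXiv is implemented; the two-letter codes are ISO 639-1 language codes.

```python
from collections import defaultdict

# A small, illustrative subset of ISO 639-1 codes; a real deployment would
# use a complete language registry.
LANGUAGE_NAMES = {"zu": "Zulu", "sw": "Swahili", "en": "English", "fr": "French"}


def build_language_profiles(submissions):
    """Group submission titles under each language code they declare."""
    titles_by_code = defaultdict(list)
    for item in submissions:
        for code in item.get("languages", []):
            titles_by_code[code].append(item["title"])
    return {
        code: {"name": LANGUAGE_NAMES.get(code, code), "items": sorted(titles)}
        for code, titles in titles_by_code.items()
    }
```

The same grouping could be applied to a country code or an SDG number to build the country and SDG collections described later in the talk, assuming those values are captured at submission.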
And yeah, for the country profiles that we're setting up, we're looking at 54 African countries. Each is getting its own country profile as a collection, and that is then to display research from and about that particular country. You see here, in the magnified window, the national flag for reference.
And then there is an interactive map with the institutional repositories in a particular country and the research institutions, based on their ROR identifiers. We identified, for the repositories that we've listed, which systems they utilize. Usually it's DSpace, but there are also others being deployed. And we add the African languages spoken in those countries, and The Lens.
So we're also working with The Lens, giving a shortcut to the research items indexed in The Lens from and about a particular country. And then you can also go to their analysis features and metrics, and so look into the research output of particular institutions in a country. For the SDGs it is a similar setup. So we're looking at 17 SDGs, and here you have a focus on SDG 3, Good Health and Well-being.
And we also ask the authors, as they submit, which SDGs their work contributes to. And that could then, in another iteration, be further diversified to the actual targets and activities within an SDG, but that's for another project funding phase. So, summarizing: we are now also grantees of the first round of the ORCID Global Participation Fund, and thereby propose to establish market-facilitated multi-stakeholder consortia, which are bilateral in nature, for efficient scholarly knowledge exchange from and about Africa.
So on one side, we have user consortia: we will organize webinars together with TCC Africa, the UbuntuNet Alliance and other stakeholders who want to get involved, to bring scholarly stakeholders together in consortia, and then match these with provider consortia offering particular packages of their respective services for publishing and discoverability, including additional features and services and facilitation for quality assurance, as in community-driven peer review and the like, to get a full scope of state-of-the-art, affordable and efficient open research infrastructure and publishing features.
That's all for now. We're now happily awaiting your questions. Thanks for your attention. And let's discuss.