Name:
Relationship between infrastructure, protocols, standards
Description:
Relationship between infrastructure, protocols, standards
Thumbnail URL:
https://cadmoremediastorage.blob.core.windows.net/56d2887f-1b27-4dc9-b1a5-7e0413ef954a/videoscrubberimages/Scrubber_1.jpg?sv=2019-02-02&sr=c&sig=ziKsyf7s0kh2XyYz8ouBQVciCLnScifI%2F%2BRr3xdu2WE%3D&st=2025-01-15T06%3A54%3A07Z&se=2025-01-15T10%3A59%3A07Z&sp=r
Duration:
T00H51M33S
Embed URL:
https://stream.cadmore.media/player/56d2887f-1b27-4dc9-b1a5-7e0413ef954a
Content URL:
https://cadmoreoriginalmedia.blob.core.windows.net/56d2887f-1b27-4dc9-b1a5-7e0413ef954a/25 - Relationship between infrastructure%2c protocols%2c standar.mov?sv=2019-02-02&sr=c&sig=n%2BDhrd3tLUDC4Z2l1FMZ5u%2BevdIf%2FbkDE8o6l3aoNcc%3D&st=2025-01-15T06%3A54%3A09Z&se=2025-01-15T08%3A59%3A09Z&sp=r
Upload Date:
2021-08-23T00:00:00.0000000
Transcript:
Language: EN.
Segment: 0.
[MUSIC PLAYING]
KEN RAWSON: Hello, and welcome to the NISO Plus 2021 Conference session on The Relationship Between Infrastructure, Scientific Protocols, and Standards. I'm Ken Rawson, Director of Program Management and New Initiatives at the IEEE, and I'll be moderating the session today. I also serve as the co-chair of the NISO Information Creation and Curation Topic Committee, as well as serving on several of the NISO standing committees.
KEN RAWSON: We are very pleased to have three experts in scientific protocols with us today to speak about this area and some of its challenges. Each speaker will talk for approximately 10 minutes or so in prerecorded presentations. We will then open the floor to questions, and the panelists will respond in a live session for approximately 40 minutes. Ideally, this format will enable you to view these presentations ahead of time and spark your curiosity, generate questions you would like to ask, and raise points for further discussion.
KEN RAWSON: At any time during the actual conference session, please submit your questions via Zoom Chat to the group of panelists. The real point of this session is to have an interactive dialogue between the participants and the panelists and try to come up with several actionable items for further investigation. Our first speaker is David Crotty. He will give us an overview of capturing research methodologies as part of scholarly workflows.
KEN RAWSON: David is the Editorial Director, Journals Policy for Oxford University Press. He oversees journal policy and contributes to strategy across Oxford University Press's Journals Program, drives technological innovation, and serves as an information officer. David serves on the boards of the Society for Scholarly Publishing, the STM Association, and CHOR Inc.-- C-H-O-R. As the Executive Editor of The Scholarly Kitchen blog, David regularly writes about current issues in publishing.
KEN RAWSON: David, please, go ahead.
DAVID CROTTY: Hey, thanks for inviting me to speak today. And what I really want to do is to make the case for why we so desperately need to improve the way we communicate research methodologies, and I'm going to talk a little bit about the different levels where standards could make that process a little bit easier. So put up my screen. OK, so there's that.
DAVID CROTTY: 25 years or so after journals first went online, we're just on the cusp of realizing what that really means in terms of reporting research results. Our first efforts, really our first decades, were spent recreating the analog print experience for journal readers-- monthly issues filled with PDFs, those laid-out, sort of space-limited articles. But there's a lot that happens over the course of a research project.
DAVID CROTTY: And while the resulting paper provides a really useful summary of that project, a lot gets left behind and it never sees the light of day. So what we were realizing as a community is that we're leaving an enormous amount of value on the table. And if we want to do a better job of-- if we can do a better job of capturing, preserving, and making available more of the research workflow, we'll drive better transparency, better reliability of research conclusions, and we'll improve efficiency and return on the investment that we make in research funding.
DAVID CROTTY: Now, I'm not a believer in doing away with the research paper itself, and it's not because I'm a Luddite intent on preserving the past, but because I'm a student of evolution. And to me, the journal article is a highly-evolved form that very effectively serves an important niche in the ecosystem. The paper marks a point in a research project where conclusions can be reliably drawn, and it tells the story of that research project and those conclusions in a concise and readily digestible manner.
DAVID CROTTY: Now, this allows for the literature to serve as a high-level conversation among experts. Really the ratio of people for whom a short summary of the research project will suffice to those who really want to take a deep, deep, deep dive into the very large volume of details behind the project, is typically hundreds, if not thousands, to one. So to me, the journal article sits at the center of scholarly communications because it serves its purpose well, which is not to say that it couldn't be improved.
DAVID CROTTY: But I tend to think in terms of "yes, and" rather than "instead of," in terms of taking advantages of the open horizons of the digital realm of scholarly communications. And the idea is to add to the value that's already in place, rather than discarding that value and then hoping maybe to make it back up somewhere. My preferred metaphor at the moment is a solar system where the paper is sort of orbited by all the different research outputs that are available throughout the life of a project.
DAVID CROTTY: And there are many, and this slide is pretty far from comprehensive. Each of these offers a different benefit to a different audience, and each comes with its own costs in terms of production, preservation, and in a possible positive or negative impact on one's career. Each output here needs to be both independent and connected and, hopefully, in particular, discoverable through the paper, which is where the majority of people are likely to start their information journey.
DAVID CROTTY: So the key questions then are about drawing out that additional value. How do we capture the value that we're leaving on the table by digging deeper into the enormous amount of work that's done on any research project, and making that available to the community so we can wring every single drop of knowledge out of it? The second question is how do we expand that percentage of people who can draw value from more than just a written summary?
DAVID CROTTY: Remember that research funders want return on investment-- ROI. Research funding is a scarce commodity. And any project that puts out more resources providing more value to the community offers a better investment for a funder. And then the third question comes into play as well. More transparency means more trust. Now, on the surface, it seems like an obvious idea-- creating a detailed public record of everything that happens every day throughout a research project-- but in the real world, this runs into practical limitations.
DAVID CROTTY: Now, aside from things like storage space, discovery, and infrastructure issues, it's a huge ask and a huge time sink for the researcher. In nearly every talk I've given over the last 15 years or so, I've used some variant of the phrase, time is a researcher's most precious commodity. And this panel turns out won't be any exception to that. Time that is spent in record-keeping, in documentation, and publication of those records, is time not spent doing experiments.
DAVID CROTTY: Any changes made in reporting requirements are going to take the researcher away from the bench or away from the bedside. Doing things right needs a combination of strategies to really reduce that burden on the researcher as much as possible and to offer a significant enough reward to justify the reporting burden that remains. Further, there's a lot of stuff a researcher does that doesn't go anywhere, that would have little public value in contributing to research reliability and reuse.
DAVID CROTTY: How do we separate out the wheat from the chaff? So rather than an immediate and probably impossible sweeping cultural change to radical transparency, I think it's better to start with parts of the research flow that can offer the most obvious value and cause the least burden for the researcher. The end goal is, of course, completely open research, but it's likely to be a long road to get there. So we need to choose steps that will quickly provide value in order to build momentum going forward.
DAVID CROTTY: Open data is an obvious first step, and we're increasingly along the way to making that a standard part of any research project. What's really helpful about the open data movement is that it's created a model for how we can open up other parts of research workflows, sort of offering standards and best practices that can be applied and adapted elsewhere.
DAVID CROTTY: But data alone is not enough, and the enormous hole, I think, in the open science movement has been a lack of attention paid to the reporting of research methodologies. Being able to review the data behind the study does, indeed, allow one to see if that researcher's analysis and conclusions drawn are accurate for that data set, but it does little to validate the quality and accuracy of the data set itself.
DAVID CROTTY: If I don't know how you got that data, I have no idea if it's any good, and I certainly don't stand any chance of replicating it. Now, a big problem here is that the scant information that's offered by most journals' materials and methods sections is insufficient for someone to have any chance of repeating what the original authors did. Often when describing a technique, an author will cite a previous paper where they used that technique, and you go look at that paper, and it cites a previous paper, and then a previous paper, and then a previous paper.
DAVID CROTTY: And it just becomes this wild goose chase, and you can never find the actual original technique. I think the lack of detailed methodology reporting is an anachronism that's really driven by decades of a print-dominant publication model where everyone wanted to try and reduce the number of pages in a journal issue-- you know, make it lighter and cheaper to mail and cheaper to produce. So one of the things they started cutting out more and more or cutting down on was the method sections.
DAVID CROTTY: There's also really been a lack of incentives to improve methods reporting throughout the community, so I would say that now is the time to set this right. To put it simply, transparency around research methodologies is essential for driving public trust and for accurate reproducible research results, but reproducibility isn't the only goal here. So just as important is really increasing efficiency, and open science offers great potential for that.
DAVID CROTTY: So one of the big drivers for the open data movement is the potential for reuse of that data. And that's also the case for open methods and maybe even more so. There's an enormous amount of scientific data that is really generated in a really specific manner. I'm going to look at one particular cell type, or one particular geographic region, or one behavior, under really, really specific conditions.
DAVID CROTTY: And it's not obvious or easy to repurpose those kinds of data, but methodologies are much more adaptable for new research projects, even ones that aren't really directly related. And if you've ever done research, you know that a huge amount of any research project is spent trying to figure out how to do what you want to do and then learning and perfecting those techniques that you want to use.
DAVID CROTTY: So having a vetted and successful methodology available to you right away can be just an enormous head start. Lest you doubt the power of the development of new techniques, I would suggest going back and looking at the last 10 to 15 years or so of Nobel Prizes in chemistry, and medicine, and physics. A significant proportion of them have been given to those who created the approaches that others are now using to apply to research questions.
DAVID CROTTY: And if you ask any journal editor, they will tell you that methods articles are nearly always among the most cited articles. And I think that speaks to their value in driving future research, as well as their often broad applicability to different types of projects. So a personal anecdote here. In putting this talk together, I went back and looked at the 15 or so scholarly papers that I have published.
DAVID CROTTY: And the most cited one is not any of the actual research I did-- which is perfectly fine research, if somewhat incremental-- but it's the how-to article for a method that I spent a couple of years working out how to do. Now, this method came out in 2002, and yet it was still getting cited by new papers even last year. Interestingly, it's also been cited in multiple patent applications.
DAVID CROTTY: And again, if you're a funder, if you're a government funder or looking to drive economic development through research funding, patents are a great indicator of success there. So if we assume that better, more open methods documentation is both tremendously important and the next obvious step for open science, what can we do to help drive its adoption through reducing the researcher burden necessary to provide this type of reporting and improving its discoverability and reuse?
DAVID CROTTY: So I probably don't need to make the case to this audience in particular that standards and defined best practices are absolutely necessary to make this happen. So we have a road map available to us provided by the open data movement, starting with the FAIR principles. I'm not going to go into detail here-- I think one of the other panelists was going to talk more about how they apply to methodologies.
DAVID CROTTY: And I don't have a detailed framework I can provide for a standards road map, but I do want to start the conversation by talking a little bit about the different levels where we can approach standards for research methodologies. So as an editor at heart, I tend to first think in terms of the content itself-- should the actual content be left free-form and up to the individual researcher, or are there common elements across fields that can help organize things?
DAVID CROTTY: Is any of this universal or will each field or each subfield need its own template? Is there-- basically, is there a value in creating a standard template or standard templates? Open data has been easier, I think, to advance in areas where there are common data types for a field, and that's really been helped quite a bit by having standards for how those data types are reported and made available.
DAVID CROTTY: So as an example, DNA sequences in particular file formats in GenBank, or protein structures in the PDB, for example. To me, there is great value in consistency. It's why most journal articles follow a roughly similar format-- abstract, introduction, materials and methods, results, discussion. Having a set shape to information makes it easier to move between different areas of information.
DAVID CROTTY: So for example, a chemist might really not know that much about astrophysics, but in looking at an astrophysics paper, they know what to expect from an abstract or where to find the research conclusions in that paper. And I think the same thing really applies to methods. I need to learn how to do something new. It's a lot easier to see it in a form that I'm comfortable with. And back when I was one of the people creating laboratory manuals and then adapting them into an online resource called Cold Spring Harbor Protocols, we settled on a standard article template that covered the basics really for the wide variety of biomedical methods we were publishing-- not all that different from what you'd see as a recipe in a cookbook.
DAVID CROTTY: And this isn't a unique shape. Other sources like Wiley's Current Protocols, Nature Protocols, many other sources like this are very similar. So the way we set it up was an introduction-- basically, this is what this method does and why you would use it. Related information-- what are other methods that are connected to this one or necessary for you to do this one?
DAVID CROTTY: Materials-- we divided that up into reagents and equipment. The method itself-- basically a step-by-step explanation of the workflow. Troubleshooting-- useful tips about how to make things work correctly. And then a discussion section with any other further useful information or maybe an example of the technique in action. And some of our protocols had all of these elements. Others omitted some.
DAVID CROTTY: And it worked pretty well across a broad spectrum of biology methods. Obviously, the further you go out from biology, you're probably going to find more and more missing elements here-- we may need a functionally different template for computational research or maybe the social sciences-- but conceptually, I think it's a lot easier to learn a new recipe if everything in the cookbook is in the same basic shape.
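For illustration, the cookbook-style template David describes could be expressed as a simple structured record. This is a minimal sketch with hypothetical field names based on his description; it is not Cold Spring Harbor Protocols' or any publisher's actual schema.

```python
from dataclasses import dataclass, field
from typing import List

@dataclass
class ProtocolArticle:
    """Sketch of the cookbook-style protocol template described above.

    Section names mirror the elements David lists (introduction, related
    information, materials split into reagents and equipment, the step-by-step
    method, troubleshooting, discussion); they are illustrative only.
    """
    title: str
    introduction: str                                            # what the method does and why you would use it
    related_protocols: List[str] = field(default_factory=list)   # connected or prerequisite methods
    reagents: List[str] = field(default_factory=list)            # materials: reagents
    equipment: List[str] = field(default_factory=list)           # materials: equipment
    steps: List[str] = field(default_factory=list)               # step-by-step workflow
    troubleshooting: List[str] = field(default_factory=list)     # tips for making things work correctly
    discussion: str = ""                                         # further notes or an example of the technique in action

# Not every protocol needs every section; as noted above, some omit elements,
# so the optional fields can simply be left empty.
example = ProtocolArticle(
    title="Example amplification protocol (illustrative)",
    introduction="Amplify a target DNA fragment for downstream cloning.",
    reagents=["template DNA", "primers", "polymerase", "dNTPs"],
    equipment=["thermocycler"],
    steps=["Assemble the reaction mix on ice.", "Run 30 cycles of denature/anneal/extend."],
)
```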
DAVID CROTTY: Standardizing formats makes some sense, but at the same time, you want to be careful. You don't want to stifle innovation. So you need to leave some flexibility for finding new ways of doing things. And one thing that comes into play here is that protocols tend to evolve over time, and different applications of the same protocol often have different requirements or different steps are needed.
DAVID CROTTY: So you need to have standard approaches to versioning and, maybe more appropriately here, branching-- sort of the way something like GitHub deals with source code, where you branch off a new variant of it rather than creating a completely new entry for it. Some other levels where I think standards would be important-- persistent identifiers.
DAVID CROTTY: Each protocol should obviously have a DOI, but each reagent and each piece of equipment should also be identifiable as well. I'm a big fan of SciCrunch and their research resource IDs-- RRIDs, they call them-- as a model here. Now, if we're publishing these, then beyond the DOI, we need to think about tagging the various elements and making them machine-readable, and also the citation practices.
DAVID CROTTY: The citation practice is another place where open data has really shown us the way to do this. On a different scale, what are the standards and requirements needed for methods repositories, and how do we ensure perpetual access to those, and how do we ensure good discovery through those repositories? I could go on-- I probably will during the discussion-- but, hopefully, I've made the case to you, at least, for why open methods is such an important next step and why clear and consistent systemic standards for methods reporting are really a vital component of open science.
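As a companion sketch, the identifier layer David mentions (a DOI per protocol, RRID-style identifiers on reagents and equipment, explicit versioning and branching) might be serialized as machine-readable metadata along these lines. All identifiers below are placeholders, and the field names are assumptions, not any repository's real schema.

```python
# Hypothetical machine-readable metadata for one published protocol version.
# The DOI and RRID values are placeholders, not registered identifiers.
protocol_metadata = {
    "doi": "10.1234/example-protocol",                        # placeholder DOI for this version
    "version": "2.1",
    "previous_version_doi": "10.1234/example-protocol.v2",    # placeholder: version chain
    "branched_from": None,                                    # or the DOI of a parent protocol, GitHub-style
    "materials": [
        {"name": "anti-GFP antibody", "identifier": "RRID:AB_0000000"},     # placeholder RRID
        {"name": "confocal microscope", "identifier": "RRID:SCR_0000000"},  # placeholder RRID
    ],
    "license": "CC-BY-4.0",
    "repository": "example methods repository",               # where perpetual access is provided
}
```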
DAVID CROTTY: So thanks, and I will stop there.
KEN RAWSON: Thank you, David. Our second speaker is Emma Ganley. She is going to build on David's overview and talk about protocols.io and the work that they're doing with the coronavirus research community. Emma is the Director of Strategic Initiatives at protocols.io, which is a secure platform for developing and sharing reproducible methods with its headquarters in California. Prior to joining protocols.io, Emma was the Chief Editor of PLOS Biology and worked in scientific publishing for 15 years.
KEN RAWSON: During this time, she gained an enthusiasm for open data, preprints, and improving science communication. Emma is passionate about all things related to open research, data, code, and method availability alongside of the articles, research reproducibility, and integrity. As of January 2021, Emma is also a member of the Board of Directors of Force11.
KEN RAWSON: Emma, please, go ahead.
EMMA GANLEY: Thanks so much to the organizers for the invitation. It's really great to be taking part in a session that's focused specifically on infrastructure, protocols, and standards. In theory, I hope my talk will be complementary and follow on from some of the things David's just talked about. I'm hoping overall to show you how protocols.io is one available tool and platform that already works well as a location for method sharing within the existing infrastructure, but also how FAIR principles can be applied to method sharing.
EMMA GANLEY: And I will highlight some of the areas that I think still need more consideration and community efforts. At risk of repeating some of what David's already covered, a common problem for researchers is whether work that's already been performed and published can be reproduced and replicated. As a new person joins a research group, they're often directed to notebooks from previous researchers.
EMMA GANLEY: These might be damaged, incomplete, or missing altogether. Or a researcher may look for details of methods within a research article, and often that may lead to a citation chain pointing to an earlier publication which points to another earlier publication that may well not end in the method being available.
EMMA GANLEY: Basically, researchers often can't find, access, or replicate already-performed and published work, and this can be a source of much frustration, as shown by this biologist's tweet. This isn't a problem just for biologists though. This tweet here is from a physicist who ended up on the same wild-goose-chase citation trail looking to understand how some devices were fabricated, only to find at the end of that trail that devices were fabricated with conventional methods, which is not very helpful.
EMMA GANLEY: This is where protocols.io comes in. We have one available tool and platform that works for sharing methods information. Our mission is very simply to make it easy to share method details before, during, and after publication. If you are not familiar with protocols.io and want to have a browse, a good place to start is the URL that I have at the bottom here-- protocols.io/welcome. In brief though, it is an open access repository for method sharing.
EMMA GANLEY: It's very versatile. So although this was initially created with biology and life sciences in mind, it's now used by researchers across all research disciplines. It's also used by many as a tool within the research process throughout the research process to collaboratively develop and optimize methods privately before they're ready to be shared.
EMMA GANLEY: Looking for silver linings in the ongoing pandemic that we're all still struggling our way through, one outcome has been a spotlight on the need for open science, and open research, and making sure that methods are shared is a really important part of this. At protocols.io in January and February last year, we started to see a lot of coronavirus-relevant methods being published on the platform.
EMMA GANLEY: So we set up this Coronavirus Method Development Community where all of the methods could live and be shared, more easily discovered together, and there they could be reused and built upon by others. This community grew organically faster than any other community we've ever had on the platform, and it's continuing to grow, and publications are continually being added to it. The availability of the methods and the degree of interaction between the researchers in this group space has just been amazing to watch.
EMMA GANLEY: And discussions have varied widely between researchers, but they've also included really important things like people seeking advice on how best to perform testing in some less well-equipped developing nations. And helpful responses have been forthcoming, and important connections have been made as a result of this. And so we start to get more of an insight into the value of methods as an important standalone research output.
EMMA GANLEY: In recent times, some funders have started asking researchers to include details in their research data management plans of how their methods will be shared. And we can start to understand the need for comprehensive methods details being provided as crucial for replication and also the need for these to be available and interlinked with other research outputs-- start to think about how the FAIR principles can be applied to methods sharing.
EMMA GANLEY: And protocols.io already does a reasonable job of applying those FAIR principles. I'll talk over some of how we achieve this next. So if we start thinking about being findable, all published protocols have a persistent DOI as the identifier. This can be cited alongside any related manuscript citations. In the same way as publishers do, we've taken persistence and preservation very seriously.
EMMA GANLEY: And our content is archived in CLOCKSS, but it's also mirrored in several locations. Content can be exported if a user wants to do so in a variety of different formats. But one place I'd like to flag for potential improvement is how protocols are indexed. We have been working on this quite a bit, but support from the community would really help. Currently, Europe PubMed Central is actively indexing coronavirus preprints, and to this end, they plan to index our coronavirus relevant methods as preprints.
EMMA GANLEY: So this is clearly a case of shoehorning a round peg into a square hole, but it's definitely a start. We've also been sending more of our metadata to Crossref to ensure greater discoverability. Originally, when we sent information to Crossref-- and actually still now-- it's been classified as data sets. We plan on switching to preprint soon so that more relevant metadata can be included.
EMMA GANLEY: But ultimately, we've been discussing with them the possibility of a new schema for protocols that would be better suited to containing the relevant information that we would like to capture for protocols and methods. Many other stakeholders would need to be involved in those discussions, but we're really hopeful that we might be able to move this forward in the future.
EMMA GANLEY: And just to highlight the findable in FAIR in a different way, these tweets illustrate this really nicely. The top-left tweet was from a researcher in Chile who was looking for an approach to perform RNA extraction working in neuron cultures-- so brain cells. After a few tweets, there is a reply from a postdoc at UCSF who flagged a protocol that was published on protocols.io. On looking more closely, this protocol came from a GigaScience paper that was focused on stickleback parasites-- so a fish parasite.
EMMA GANLEY: If that method had not been published separately as an independent research output in a protocol on protocols.io, it seems highly unlikely that these researchers working on brain cells would have happened upon an experimental approach that was presented in a paper on fish parasites. So sharing as an individual, distinct research output can massively increase the discoverability of the research.
EMMA GANLEY: Moving on to accessible, protocols.io is fully open access. It's free to sign up, free to use and publish, and importantly, it's free to access published content, and no account is needed to view published content. Everything that's published is available under CC-BY License, and, of course, it's all fully searchable on our platform.
EMMA GANLEY: I did already mention we really hope for better indexing across the rest of the research publishing and communications infrastructure, and that this will come in the future, which will aid with the accessibility and findability elements. Also relevant to accessibility is how easy it is to place a protocol's DOI into a research article.
EMMA GANLEY: I've got here an example from PLOS Biology where the authors state within their materials and methods that, "The methods and protocols for the experiments in this paper are available as a collection in protocols.io." And they have the DOI link there through to the protocol collection. If you click this link, you're taken to the collection in protocols.io. But what's really great is that you're immediately informed that a newer version of the collection is available, and you can choose to see the version from when the paper was published or the most recent version.
EMMA GANLEY: What this means is that the authors have been able to add information and detail, or correct something in the protocols, and they've been able to do this without having to correct the paper at the journal. The protocol, essentially, is able to exist with new versions as a living document. Any subsequent version that's created and published gets a new DOI, but they're linked as versions on the platform.
EMMA GANLEY: It's really easy as well to view and compare changes so that you can see what it was that's been updated in the newer version since that version that corresponded to the publication. So I like to call this dynamic permanence of the research output of the method. Moving on to interoperable, we have a fully open API for protocols.io, and we welcome people who would like to integrate more with us.
EMMA GANLEY: And our developers are generally happy to try and help in any way that we can. We are integrated with ORCID, so that if a researcher has put their ORCID ID into protocols.io, any protocols that they publish will automatically appear as works on their ORCID profile. So this is starting to allow for some recognition of methods as a valued research output on a person's CV, if you will. Finally, a lot of journals, publishers, and funders also endorse, recommend, and in some instances are partnering much more closely with us to join the research output dots and help support a more modular approach to publishing research content.
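Emma mentions that protocols.io exposes a fully open API. As a rough illustration of what an integration could look like, here is a sketch of a search request; the endpoint path, query parameters, and response fields are assumptions made for this example and should be checked against the current protocols.io API documentation before use.

```python
import requests

API_BASE = "https://www.protocols.io/api/v3"   # assumed base URL; verify against the API docs
TOKEN = "YOUR_ACCESS_TOKEN"                    # placeholder access token

# Assumed endpoint and parameters for searching public protocols.
resp = requests.get(
    f"{API_BASE}/protocols",
    headers={"Authorization": f"Bearer {TOKEN}"},
    params={"filter": "public", "key": "RNA extraction"},
    timeout=30,
)
resp.raise_for_status()

# Assumed response structure: a list of protocol records with title and DOI.
for item in resp.json().get("items", []):
    print(item.get("title"), "-", item.get("doi"))
```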
EMMA GANLEY: Over 500 journals recommend that their authors place the methods on protocols.io in the way that I just showed you. And we've recently launched a new partnership with PLOS ONE by way of a new article type that they will peer review, in which the protocol or methods will be hosted on protocols.io but wrapped by the other elements of an article in PLOS ONE. And there are funders, as I noted, that either encourage, recommend, or enforce method sharing using protocols.io or some other similar platform.
EMMA GANLEY: And then moving on to being reusable, there are a few ways that protocols on protocols.io are reusable. So first of all, obviously, the DOI means protocols are citable. So if you use somebody's already published protocol, you can cite that within your paper. If you search Google Scholar, you'll find protocols within papers and citations.
EMMA GANLEY: Some researchers add them to their reference lists as I'm showing here, but this doesn't seem to be standard at this point in time. So this is an area I think we could try as a community to come up with a standardized approach much in the same way as has happened for data citation. Another way to reuse protocols here is directly on the platform. So if you do have an account-- which is free to make-- and you want to run through a protocol you can do that dynamically, directly by clicking to run.
EMMA GANLEY: And then you get, essentially, a tick-box list for the protocol and the timers are in there and things like this. So you can run through the protocol ticking off each step as you work your way through. You can add notes and record any changes that you make as the run takes place. And finally, it's really easy on our platform for another user to make a new copy or fork-- that's terminology people will understand if they know GitHub well-- to fork a new copy of a protocol on protocols.io and then modify it for their own purposes.
EMMA GANLEY: And the great thing about how this works is that these forks can then be visualized on the system. So you can track the evolution of a protocol. And I'm showing here an example of one much-accessed method for which the second version has had multiple new copies forked. And you can expand out the view on the right-hand side and see that each of those has several versions of its own. This reuse is just really easy to visualize almost like a phylogenetic tree on the platform.
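The fork-and-version lineage Emma describes is essentially a tree, much like the phylogenetic view she mentions. The short sketch below models that idea with hypothetical DOIs and labels; it is not how protocols.io stores or renders this internally.

```python
from dataclasses import dataclass, field
from typing import List

@dataclass
class ProtocolNode:
    """One published protocol version; later versions and forks hang off it."""
    doi: str      # placeholder DOIs below, not real identifiers
    label: str
    children: List["ProtocolNode"] = field(default_factory=list)

def print_lineage(node: ProtocolNode, depth: int = 0) -> None:
    """Print the version/fork tree with indentation, phylogenetic-tree style."""
    print("  " * depth + f"{node.label} ({node.doi})")
    for child in node.children:
        print_lineage(child, depth + 1)

root = ProtocolNode("10.1234/proto.v1", "Original protocol, v1", [
    ProtocolNode("10.1234/proto.v2", "v2, revised by the original authors", [
        ProtocolNode("10.1234/fork-a.v1", "Fork A: adapted for a different tissue"),
        ProtocolNode("10.1234/fork-b.v1", "Fork B: low-input variant", [
            ProtocolNode("10.1234/fork-b.v2", "Fork B, v2"),
        ]),
    ]),
])
print_lineage(root)
```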
EMMA GANLEY: Overall then, FAIR method sharing is possible, but we do definitely have a few places for improvement. I've hopefully shown some of the benefits and how research can be accelerated by sharing methods. It can increase discoverability and facilitate research connections, like the example of researchers working on neuron cultures finding an unlikely pairing with other researchers working on fish parasites.
EMMA GANLEY: It also improves reproducibility and facilitates reuse. The versioning-- as I showed-- allows for a living document that has dynamic permanence, but overall, ensuring full details are shared is also a really important means of ensuring stewardship of methods as a valued research output. Taken altogether, this just massively enhances the value of the research. I've, hopefully, shown that there are already ways to do this that are free, open access, interoperable, and persistent.
EMMA GANLEY: But as mentioned, there are ways and places where we would like to see improvement, and I will quickly run through a few of these. Obviously, we'd be thrilled to help with these efforts. A few places where we could focus efforts are improving the indexing and recognition of protocols and methods as a separate research output, creating better mechanisms to integrate, and building facilities to manage and merge research outputs in a modular way.
EMMA GANLEY: This also ties in with the need for improved and increased interoperability. I think that we could much better show these things ideally together alongside one another. And finally, some recognized standards would be really helpful, but the complexity here probably shouldn't be underestimated given that methods can differ radically across different research disciplines.
EMMA GANLEY: And with one final slide, I would just acknowledge the protocols.io team-- it's not a huge team-- and some support that we've had along the way. And thanks, everyone, for listening.
KEN RAWSON: Thank you, Emma. Our final speaker is Adrian Burton. He will talk about research protocols from an infrastructure perspective. His organization has experience with making links between different types of scholarly output, such as Scholix.org-- S-C-H-O-L-I-X-- which works to collect and exchange links between research data and literature. Adrian is the Director of Services, Policy, and Collections with the Australian Research Data Commons and has many years of experience in building and supporting national data policy, infrastructure, and services.
KEN RAWSON: Please, go ahead, Adrian.
ADRIAN BURTON: Hello, everyone. My name is Adrian Burton. I work for the Australian Research Data Commons. So far in this session, we've been looking at the importance of protocols and some of the standards in infrastructure that would help to release any unrealized potential value of protocols for science and broader society. I'll now pause just to look at some of the lessons we can learn from a similar journey that research data has been on for the past few decades to try and release the unrealized potential value from research data.
ADRIAN BURTON: And let's see what we can learn from that potentially for the protocols community. We might just look at two specific examples as part of a larger trend that will draw our attention to some key directions that the protocols community might like to take. There's Scholix and FAIR.
ADRIAN BURTON: Scholix is an initiative that comes from the research data community. It's, in fact, a series of RDA working groups. The initiative set out to facilitate the linking of research data with publications, following the assumption that both data and publications can be greatly improved by being linked. For example, the integrity of publications is increased by knowing what data underpinned the publication's conclusions.
ADRIAN BURTON: And the value of a data set can also be indicated by the publications that the data has previously supported. You could consider a similar model for protocols obviously. The Scholix initiative identified an exchange format for describing the link between data and publications that would allow potentially a global system to emerge. The Scholix initiative took a holistic approach and also analyzed the supply chain for data publications linkage and worked with infrastructure service providers within the repositories and publishers to make it feasible to create, collect, and aggregate information about data and publications linkage.
ADRIAN BURTON: So this social architecture or social infrastructure is as important to the success of an initiative as any technology or standard. Thus, the Scholix initiative continues with efforts to link data and publications, supported by the data repository and publisher communities and by the infrastructure service providers in these respective communities, such as Crossref, DataCite, and OpenAIRE, as well as publisher peak bodies like STM.
ADRIAN BURTON: The protocols community can, and indeed already has, piggybacked off the Scholix initiative to link protocols to data and publications, and this will increase the value of all three. And it's quite a strategic area. More importantly, the Scholix approach just teaches us that embedding these improvements within communities and their services is a very good lesson for any community.
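To make the linking idea concrete, here is a sketch of what a Scholix-style link record might look like when a protocol is connected to the article that uses it. The field names only approximate the Scholix information model (source, target, relationship type, link provider), and the identifiers are placeholders.

```python
# Approximate Scholix-style link record; field names and identifiers are illustrative.
scholix_link = {
    "RelationshipType": {"Name": "IsReferencedBy"},
    "Source": {
        "Identifier": [{"ID": "10.1234/example-protocol", "IDScheme": "doi"}],  # the protocol
        "Type": "other",   # protocols sit outside the literature/dataset types; an open question
        "Title": "Example RNA extraction protocol (placeholder)",
    },
    "Target": {
        "Identifier": [{"ID": "10.1234/example-article", "IDScheme": "doi"}],   # the article
        "Type": "literature",
    },
    "LinkProvider": [{"Name": "Example link provider (placeholder)"}],
    "LinkPublicationDate": "2021-02-22",
}
```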
ADRIAN BURTON: So enough on Scholix. If we now look at FAIR, FAIR is an international movement. And I looked at their Wikipedia page, and it says that FAIR data are data which meet principles of findability, accessibility, interoperability, and reusability. A March 2016 publication by a consortium of scientists and organizations specified the FAIR guiding principles for scientific data management and stewardship.
ADRIAN BURTON: So using this FAIR acronym and making these concepts easier to discuss is the key point of this FAIR initiative. So each of the four elements of FAIR has its own subset of components which acts like a checklist to explain and prioritize what would be involved in making things FAIR.
ADRIAN BURTON: So specifically, the text is like this. And if we look right at the bottom of that table, it gives you the specific requirements, I suppose, or guidelines for making something reusable-- making data reusable. So if we zoom in on that, we can then see that the key ingredients of reusable data according to the FAIR guidelines are that it should be well-described, clearly licensed, have recorded provenance, and adhere to community standards.
ADRIAN BURTON: So FAIR has become a very successful clarion call through its combination of this simple acronym, the fundamental high-level concepts of findability, accessibility, et cetera, and then levels of technical detail underneath that supporting it. The momentum of the FAIR movement has spawned a myriad of other FAIR training models, FAIR certification, FAIR maturity models, FAIR assessment tools, machine-processable FAIR assessment, then all sorts of community-specific FAIR implementation profiles, as well as, for example, national and regional FAIR data policies.
ADRIAN BURTON: So what does this mean for protocols? The FAIR guidelines were intended to improve the findability, accessibility, interoperability, and reuse of data, but the proposition applies at least conceptually to all digital assets such as software, models, protocols, et cetera. And when you think of it, the added value that protocols can deliver if they are findable, accessible, interoperable, and reusable is a compelling proposition.
ADRIAN BURTON: And the potential value of piggybacking on FAIR means that a lot of that initial icebreaking has been done as far as the socialization of the value of the concepts. So an interesting topic of discussion for this workshop and for future work with NISO and other organizations would be to develop a profile of FAIR for protocols and then, potentially, further more granular profiles that would be appropriate for protocols in specific research domains.
ADRIAN BURTON: So if we look at that example that I had before about what reusable means for data: where the FAIR guidelines say that to be reusable something should be described with a plurality of accurate and relevant attributes, we could then ask, well, what attributes are relevant for the reuse of protocols?
ADRIAN BURTON: And if domain-relevant community standards are also important for reusability, then what community standards apply to protocols? And regarding provenance-- oh, sorry, just fixed that.
ADRIAN BURTON: Doing live recordings-- and with regard to provenance, then what do reusers need to know about the development of a protocol that would allow them to reuse it? So what do they need to know about the history and the development of this protocol? So there's just some examples of just using that FAIR framework and going through. I think, again, that would give us a good checklist, and a plan, and a prioritization for work in adding new value to protocols.
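One way to picture the checklist Adrian describes: a FAIR-for-protocols reusability profile could be reduced to a handful of concrete checks. The checks below are illustrative suggestions only, not an agreed community profile.

```python
# Illustrative reusability checks for a protocol record, loosely following the
# FAIR "reusable" ingredients mentioned above: well-described, clearly licensed,
# recorded provenance, and conformance to a community standard.
def assess_reusability(protocol: dict) -> dict:
    return {
        "richly_described": bool(protocol.get("steps")) and bool(protocol.get("materials")),
        "clear_license": protocol.get("license") in {"CC-BY-4.0", "CC0-1.0"},
        "provenance_recorded": bool(protocol.get("version")) and bool(protocol.get("authors")),
        "community_standard": protocol.get("template") == "cookbook-style",  # hypothetical check
    }

example = {
    "steps": ["..."], "materials": ["..."], "license": "CC-BY-4.0",
    "version": "1.2", "authors": ["A. Researcher"], "template": "cookbook-style",
}
print(assess_reusability(example))
```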
ADRIAN BURTON: So some food for thought. I'll just leave you with this. What could protocols community take from the experience of the FAIR data initiative? So for example, you'd want to start with just making sure that we understand what we're trying to achieve by doing this-- you know, what is the actual value in making protocols FAIR?
ADRIAN BURTON: Because if you have that clear, then you can say, well, therefore, it's this kind of findability that we're looking for. So you'd start with that, just making sure that it's valuable once you've understood what the value is. Then we could take each of the elements of FAIR and say, well, what does that mean for protocols? You could then also ask, well, how could FAIR for protocols be profiled for specific disciplines, domains, or communities?
ADRIAN BURTON: And taking each of those questions and actually giving guidance: as you saw, the guidelines for FAIR are fairly generic-- when we say FAIR for protocols, it should conform to community practice. Then we take a community and say, well, this is the practice that we mean, and that can give much more relevance to these broad guidelines.
ADRIAN BURTON: And then FAIR has a particular focus-- if you read up on FAIR-- it has a particular focus on machine-readable data. So what would that mean for protocols and the very close cousin, I suppose, of protocol, which is a workflow-- an actionable workflow-- which is very similar to the idea of actionable data over in the data side of things. And so they've already done good work on that, so I think that could help us again with, again, unlocking value in workflows and protocols.
ADRIAN BURTON: Well, it's 3:00 AM in Australia now, so I, unfortunately, won't be able to join you for the panel session. But I do look forward to hearing about the discussion. Thanks and good night.
KEN RAWSON: Many thanks to David, Emma, and Adrian for taking the time to prepare these stimulating presentations and get us thinking about these issues. We will now go into live session to begin the discussion. Your participation is the key to making this session work and, hopefully, produce several actionable items that the community is interested in discussing further and perhaps in developing some guidelines or solutions. Due to the great time difference between Australia and US Eastern Time, I don't know if Adrian will be able to join us for the interactive Q&A session or not, but I certainly hope so.
[MUSIC PLAYING]