Name:
Two years and counting Nelson Memo
Description:
Two years and counting Nelson Memo
Thumbnail URL:
https://cadmoremediastorage.blob.core.windows.net/4f22cf66-d8d5-4cdb-ac13-3e9fb09d97cc/videoscrubberimages/Scrubber_1.jpg
Duration:
T00H56M44S
Embed URL:
https://stream.cadmore.media/player/4f22cf66-d8d5-4cdb-ac13-3e9fb09d97cc
Content URL:
https://cadmoreoriginalmedia.blob.core.windows.net/4f22cf66-d8d5-4cdb-ac13-3e9fb09d97cc/Two years and counting Nelson Memo.mp4?sv=2019-02-02&sr=c&sig=Ef0sE6F8o30kD8lTiJkmg79Ud5OmfDMr0INXvVE3WcE%3D&st=2024-05-17T02%3A04%3A40Z&se=2024-05-17T04%3A09%3A40Z&sp=r
Upload Date:
2024-03-06T00:00:00.0000000
Transcript:
Language: EN.
Segment:0 .
I wanted to see first if anyone had an immediate burning question that they wanted to start with, or I'm happy to kick us off with a question to give folks the opportunity to pull their thoughts together. OK, so I'll go first while folks think of some things. We were talking about the fact that the Nelson memo is obviously different from the Holdren memo.
And so I wanted to start us off on a possibly positive note, to think about what opportunities this different memo brings, while fully recognizing some of the challenges. So what are some of the new opportunities brought about by the memo here? Joanna, I'll kick us off, and I'm interested in Howard's perspective and others'.
You know, one of the things we really see here is a broader emphasis on data sets and on metadata, clearly. A couple of us in the presentations talked about areas of concern or skepticism, unevenness of funding, for example. But I think there are some real opportunities here, and that is an emphasis that is really welcome, because in that space of identifying, connecting, and utilizing data sets, we have a long way to go.
So laying that groundwork strikes me as being vitally important and quite connected to the interests of the NISO community. So I'll say something here, following on what Alex just said. In my opinion, metadata is key. This is the first time the US government has actually talked about metadata in this gross way, and that's gross in a good way.
They are going to be pushing for grant IDs, which is something I think we've never seen before from the government. I think we'll also see a much heavier push on things like ORCID iDs, even though they're not announcing ORCID per se. That will allow for a lot of connectivity and a lot of discovery between publishers, between all the various different places that research can be found.
I'll just add one quick thing to that. The difference in "public access" versus "open access" language was interesting to me, and I wondered what it signified. I think we've tried to ask if that was an intentional choice. To me, I hope what it means is moving away from monolithic policies and towards exploring what public access might actually mean.
So I'm interested, actually, in the language of the memo. And I would agree about the metadata, too, of course. Well, Karen, actually, they've never used "open access," even in 2013. They always use "public access," just to stay away from the radicalization of the words "open access." Well, there you go.
Thank you, Howard. I'd like to see some conversation in the chat, so just give me one second, and I'll read out the comment for everyone else here in the room: "I believe Alex mentioned that gold is likely better than green, given their respective shortcomings, in responding to the Nelson memo. Do you predict that APCs will go up drastically in response to this demand?"
No, I really don't think so. Actually, I'm hoping it could be quite the reverse. What we've been spending a lot of time on recently are the challenges for nonprofits and smaller publishers to essentially get up to economies of scale, right, which requires cooperation and re-engineering of systems. I think the hurdle, speaking for my part of the community, is that there's a lot of re-engineering of systems that has to take place.
So I was just yesterday, you know, kind of noodling on this, though we talk about it all the time: the systems that we use for peer review, manuscript tracking, and editorial work, broadly speaking, were really designed once upon a time for print products and have a lot to do with the ivory-tower gatekeeping mechanism. The publishing process evolved around what I'm calling a prior business model, although it's still very much in play. When you shift to open access, the energy all moves towards the author as the customer.
So rather than a gatekeeping mechanism, you're really trying to communicate with and foster submissions, shepherd documents through the process, and still apply the entire process of curation and peer review, but there's a really different orientation. And frankly, our data and systems (these are born-digital products at this point) have evolved a lot.
So one of the things that, in my view, has to happen is that we reinvent the workflow. I noticed that there was someone on here from arXiv. I'm also personally really keenly interested in the role that preprints play in all of this. So I see the workflow continuum changing pretty dramatically, but I'd like to see existing publishers and nonprofit publishers have a valued role in this. That is one of the areas where we invest.
We get to scale, we bring prices down. So I see that happening, and I don't see APCs going up, but it does speak to the fact that we have an investment need to support what's really a radical re-engineering of the way things have been done. So go ahead. It is Charles who has his hand up. There you go.
Since I was mentioned. Yeah if you want to talk about processing at scale, come talk to us. We do 15 to 18,000 submissions a month and also about 15,000 revisions. So one of the things we cannot do is manually process articles. We do manually process exceptions, things that are broken or we have lightweight moderation for things that don't meet our policies.
And somehow, even without the lightweight moderation, without serious peer review, somebody out there published an article in one field saying our index is better than a lot of journals. Right, although that's not the plan. Our attitude is, if you want review or rankings of articles on arXiv, have overlay journals do it.
Point to our content. I will claim we're the most efficient custodians of permanently storing the content, at least of papers. And frankly, we welcome most fields, and others, just using us as the repository and then layering the value-added on top of it. Great, we will definitely be in touch. Yes, Whalen, did you have a question as well or an add-on?
Yeah, I'm curious. A question for the panelists: how might we accelerate this data interoperability? Right, to what Charles was speaking on: if arXiv is able to handle this volume of submissions, and as publishers we have these legacy systems, how might we make this happen faster, before we have to be kind of prodded by things like the Nelson memo?
Any thoughts on how we can make this go faster? And this is maybe thinking about where NISO standards could come in and give us kind of a roadmap. Can I answer part of that? So I'm actually arXiv's technical director, a.k.a. CTO, but we're at a university, so that's the title I'm going to give out. I'm quite interested in making the interchange between the publishers and what we have in our repositories, in both directions, more efficient, right? We have open public metadata on our articles, but our metadata schema is going to be revised soon because it doesn't have everything everyone wants. I noticed there was mention of the grant or funder ID. The European Research Council actually came to us a couple of years ago to ask for that, because it's essentially a requirement there. So we're going to be moving along on things like this, limited only by the funding we get from our benefactors.
But we have some who are stepping up to increase our capacity to handle these things. And I'd love to have a conversation; we're already starting a conversation, actually, with some professional societies about how to make the process more efficient. It comes down to simple things like the format publishers use, JATS. Most of our articles are submitted in a different format, and I'm happy to explore bridging the two, and other such work we can do to make the process smoother for everybody.
Howard: yeah, so Charles, I'm really glad to hear that. CHORUS has actually been doing some work with preprint servers, and we tried arXiv first, actually. But metadata: getting it up, getting it right up front, realizing that arXiv's metadata is probably some of the oldest metadata, so a revision is certainly welcome. But when it's right, such as in the work that we did together, the connections flow really quickly and easily.
So I highly recommend that, if you can, you make deals with either us or any of these professional societies to ramp up more rapidly; the funders will react, I think, pretty well to it. Even though preprints are not directly called for in these memos, the agencies are definitely looking at that stuff. We would definitely like there to be a situation where depositing your article on arXiv is considered to meet the requirements.
Because we're fast, it's publicly accessible, and we promise to be custodians forever. I hope so, yeah. Welcome back and talk to us again. You may have talked to us at a time when we were kind of shorthanded and didn't have the capacity to respond; we're back with the capacity to respond now. So let's have a conversation.
Sounds great. Great, I'm suggesting in the chat that we put that in the doc, perhaps a CHORUS and arXiv conversation, and Whalen as well. I'm curious, Karen, because you really talked about the distinctions for important topics in the humanities. These preprint servers, you know, exist increasingly in the scientific and medical fields.
Are there equivalent partnerships or metadata concerns, or different ones, for the areas that you are studying, and who is supporting those? You know, thanks, Alex. I was just pondering how this conversation would look if this were a primarily humanities publishing and research focused group. And it would be really different, because preprints don't really fit.
It's like so much else: it doesn't quite fit. What you might call a preprint in a humanities context is the way that scholars will publish in peer-reviewed blogs, for example, kind of early or smaller. And that's very public access, very public-interest focused. But the problem is that when the product of your research is the articulation of an argument through careful language, a preprint doesn't really work the same way.
We do exchange material a lot in workshops and seminars, but we exchange things so differently than in the scientific context that, like many other things, the preprint doesn't quite fit. So I would say, sorry. Yeah, I'm not disagreeing with what you're saying about the different nature of the humanities, but I want to point out that a lot of the national archives, like the one in Japan and the one in Russia, and I'm not sure about the one in China, and I think Africa, include the social sciences.
We don't, and there are probably two reasons for that. One is a matter of capacity: we want to be very careful what we take on, because we want to make sure that what we take on is sustainable. And the other may be exactly what we spoke about: the process we have in place right now for scientific articles might not be the best for social science articles, although we can discuss that, because scientific articles do go through revision cycles and we have means of supporting them. I was just looking at some analysis of the difference between the revision process for scientific articles and the revision process for humanities articles, and it's a world of difference. And I think kind of quantitatively driven econ publication, for example, is really different from language-based historical argument.
It's very different. Where there is some conjunction is in data sets. So some of the big data, what we consider kind of data sets, and I see Andrew made a point about digital humanities. Yes, and so I would say big open linked data projects for us; 30,000 items seems big for us. Something like Enslaved.org, the slave trade database, which I talked about at the last conference I was at, actually intersects here a little bit.
I'm going to pop in a question, or a conversation, that was happening here in the chat. Someone wrote: as a librarian, I'm concerned about the ramifications for discovery and full-text linking. We've already seen large-scale user confusion for journals that provide very selective public access to specific individual articles. Are publishers committed to sharing accurate article-level access information with all library discovery tools and aggregators?
And then kind of a continuing conversation was: I think we need to emphasize the ongoing problem of legacy systems in this environment. Technology is constantly upgrading, and there are too many stakeholders with systems that are limping along. So I think those two points are getting at, again, that kind of discovery and metadata layer.
So anyone on this call, by the way, can correct me if I have this wrong. But as far as I can tell, fully open access publishers specifically have not spent one nickel on access-control systems; they've actually spent most of their efforts making sure that their metadata and dissemination systems get out to as many venues as possible. So discovery: I think they're all about discovery, as far as I can tell.
But maybe I have it wrong. I'll add on to that, Howard; that fits my understanding as well. And that's where, as a kind of legacy publisher with a deep subscription history, we might be at a disadvantage, because we still have all of that infrastructure, and we have agreements, and we're providing gated access to content, where the publishers or journals that are born in an environment of purely open access don't have that overhead.
And so that idea of which articles are going to be selectively available doesn't apply, because they're all available, right? I wish we had that a little bit. I think Andrew put an interesting question in the chat as well, which is: Karen, you and I seem to be pushing back a bit.
What does that mean? What form does that take? You know, Andrew, I, you know, I really think we haven't landed the model. So I'm coming at this from this notion that actually our interests, they may not always have been aligned, but I think they're quite aligned now. I think we really all accept this and we welcome a shift to public access.
And the question is, how are we going to get there? As I said in my talk, without, you know, without losing some of the things that are working in the system or the things that we want to work better. So without those unintended consequences, how much do we want to risk and break along the way? So my challenge is if the model was landed, we wouldn't need so much lobbying, right?
It's easier to get traction if we're all in agreement and we all see the path forward and it's equitable and supportive. So I tend to look at this idea of what's right on the business side, because the financial aspects are real and do matter for everybody, and for sustainability. But what's the model that works? There are lots out there, but it's not the APC, I don't think.
I think it's probably a multi-payer solution. You see a lot more kind of big deals perpetuated, but I think it's likely to be a multi-payer, kind of collaborative effort, so that the onus, the burden, isn't landing squarely on one party in the ecosystem versus others. So I think there's still a real opportunity to get around tables and hash this out, because the question is, you know, how quickly do we want to see it all accelerate?
And if we really do want that, I think we should all be putting some economic force behind it. Jill has her hand raised. Let me just say one quick thing, but I definitely want to hear from Jill. I would say I think we just need to start with the premise that all research can and should be of public service. Let's start with that. I mean, I think the problems we're trying to solve here are primarily the problems of very expensive access to an exploding scientific publication world.
And what's being left behind is that all research can be of public service and, in fact, as I was making the argument, is of critical public service. Just think, for two examples, of the histories of the Holocaust and the histories of slavery in the United States: two critical historical contexts which have expanded our understanding of self-government and democracy.
This is not stuff that we would brush aside, right? So we need to support that research environment, and we need to support, obviously, its public utility. So let's start with the premise that all research matters, and then ask how we can push that out to public audiences and into the public sphere, rather than the back-end question of how to make your journal article free to read online for somebody with a connection.
Jill, I see your hand up. Yes, I'm sorry. My concern, as we have this conversation, is that every stakeholder community is currently operating at something of a disadvantage when it comes to legacy systems. And this in part is because technology has accelerated to such an extent over the past 30 years that it's hard to keep up.
That said, re-engineering to the degree that Alex was talking about is a resource-intensive thing. Different groups have different sets of resources available to them, and higher education is not necessarily in a position, in the current environment, to pour lots of extra dollars into it either. So what I think we should acknowledge and focus on is that the dollars have to come from somewhere to make this happen.
Who do we see in the community as having the necessary dollars? Is this something where we turn to the federal government and say, this is a really big problem, the government has to do this? Is this where we turn to the commercial sector, the private sector, and say, come on, guys, you can do a bit more than you're doing?
And what do we say to those smaller educational institutions, not the R1s but the R2s, maybe even the R3s? What do we say to them about: here's what you have to do, here's what you have to focus on? I think the big question right now, when we look at the Nelson memo, is: all right, who's going to pay for this in a way that allows everyone to be part of the collaborative effort that is really required?
I see your hand up. Yeah, thank you. I wanted to go back to what Alex was saying about the workflow and landing the model, because I think that is pretty important. As we land a model, my concern is that a workflow that preserves quality, and a workflow that preserves things for nonprofit publishers,
might simultaneously preserve the workflows and the profit margins, up to 40% profit margins, of the big, big publishers. So we want to make sure: some people are after a model that does take some of that money out of the system, rather than just pushing it like a bubble through a hose that it can never escape. On the flip side, also acknowledging that we want a workflow and financial model that might also challenge the current promotion and tenure model, because I think the current promotion and tenure model also works to preserve the current workflows and the current financial models of some of those big publishers.
So I am sort of interested in, when we land the model, what is it going to look like that's going to be different? Because I'm not sure the transformative model is particularly transformative. Anyway, that's my comment. Thanks. So I'm going to just, I would put it in the chat, but I can only either talk or chat. So what I'm interested in, Drew, is that you have put forth a couple of other objectives, right?
It's not just public access to research information. It's also ensuring that, along the way, there is less profit-taking in the system, and also the disruption of the current tenure model. Is that fairly stated? I mean, I think, in a way, if you want to start the collaborative conversation, we need to be clear about what the drivers are. And as I said in the chat, it's sort of, you know, how are we prioritizing these things?
You know, to some degree, if we're biting off everything at once, we just create chaos. What is the linear order? And not everything has to be linear, but what is the order of priorities and precedents, and how are we going to rationally make progress against that? The one thing I'd say about the profit-taking is that that might be what financial interests do.
This is what they do all day long, right? There will always be entrepreneurial and business interests that try to find profit in any system. But given the level of re-engineering that we're talking about, I'm not sure that other models will generate the level of profitability or surplus that is found in the subscription model. And not just because it's the subscription model, but because the model for journals has been very consistent, very templated, very reliable.
It has not changed a great deal. It is standardized. Those are the kinds of places where you can actually create surplus or profitability. When you come out of this process, you may have a very different business model, but by and large, the subscription model is very regular, and the more diversified your models and the less standardized they are, the less surplus will be available for anyone.
I'm going to pop in here and perhaps consolidate some of the conversation that's been happening in chat. There was a very early mention that the memo is a recommendation, with a question mark, and also the fact that the different agency plans are currently still being written. Some are coming out now, but they're going to be different. So, given that these are recommendations, what sort of teeth, if any, are there in enforcing these plans?
And how might the differences between those plans benefit, or not benefit, different communities? Howard, I see your hand up. Yeah, so, the fact that it says recommendation: don't read into that. This has been debunked already a couple of times, even by Chris Marcum of OSTP. "Recommendation" is just the parlance for the way that OSTP interacts with agencies; OSTP doesn't have direct influence, that is, they can't tell an agency to do something.
However, their bigger brother, the Office of Management and Budget (OMB), absolutely has much more influence over these agencies. And we saw this even back with the 2013 memo, which also was a strong suggestion to the agencies: they do it, but they do it in a way that works for them, so that they can comply. And they take a very long time to do this. Some of the agencies have a lot of technical know-how, so they will do it themselves.
Others will collaborate with other agencies; many of them collaborate with NIH. And some of them just say: if we can find it on the publisher's site, we're very happy, because we know we can access it and we are meeting what we can do in order to meet the memo's needs. So it'll be very different.
I'm predicting, even though I haven't seen any of these agency plans, that we'll see something like 2013, when they were very broad and very wide; I think we'll see a very similar thing here. Scott, I see your hand. Uh, just to jump in, also: there was a great conversation yesterday on a related topic.
For those of you who weren't able to participate in that, the recording will be up soon. There was a lot of conversation; we had someone from DOE talking about their policy and how they were working with other agencies, and how the agencies were working together to work through this. To build on what Howard said in terms of the Office of Management and Budget:
part of the reason the administration was trying to get this implemented quickly, get the plans back, is so that they can implement this before the end of the current congressional term, and also get some of this incorporated into legislation that fosters the case for it. So it's more than just agency policy that might get changed here; it could actually get written into the legislation that organizes these various agencies, to really make it permanent.
And thanks, Nicholas, for posting the link to that session in the chat. You've stumped the panel, I think. All right. I'm happy to head the panel in another interesting direction.
And this is something that Howard and I, and a few others in the community, have been talking about over the last several months. We were unable to get it going as quickly as possible because of some staffing changes within NISO, which kind of drew our attention away from this particular project. But I think there's still value here: this community is far broader than the federally funded agencies.
While they're the single largest source of funding in our space, they're not the only source of funding; they're not the only players in this space by any means. And there's a simultaneous conversation taking place in a group called the Open Research Funders Group, or ORFG, about how to build some momentum around consistent practice here, by the publishers, by the repositories, by the funders both federal and non-federal, to get behind things like funder identifiers and consistent use of things like ORCID, beyond just talking about it.
I wrote a Scholarly Kitchen piece about this a while ago. I'd be interested to hear your perspective, and the perspective of the people in the group, about the need for kind of a national policy that is not just driven by the federal government but driven by a consensus of the community: about metadata needs, about expectations for identifiers in funding systems, publisher systems, and repositories.
I'd be interested, just throwing that out, as a potential project idea that has been discussed, and some conversations are kind of picking up speed around it. Anyone interested in that? So I'll jump in here. Having listened to a lot of yesterday's NISO presentations, there's been quite a bit of movement in this direction, I think, in Europe (not the EU itself specifically, but in European nations).
And I think we can learn a lot from that. They came out very strongly, and I again advise you to look at some of the sessions from yesterday morning, where they were talking about: what are the good PIDs? What are the ones that we should be involved with? There are a lot of different PIDs out there; are there three that we can get behind? And by the way, they were talking about ORCID, DOI, and ROR.
So for me, being a very big consumer of PIDs, I would love to see more and more PIDs being used, and used well. And I'm going to stress the word used, because the more we use these bits of metadata, the better they become fixed, and therefore the better they are for discoverability, for linking, and for everyone to be able to reuse this content as we go forward.
Yeah, I commented in the chat; I think that's right. We can shift the conversation from how we make it all possible, how we finance it, how we sustain important players in this. But I think at the end of the day it is, you know, how do we envision it all working? And that is why I mentioned there's a real important groundwork layer in here, because it doesn't really happen if we haven't laid the foundation.
So those are parallel tracks. And I suppose, you know, what I heard from you, Todd, is this idea that we are also driving consensus as a community. And in that, I would kind of go back to creating a vision of what needs to be laid first, right? And how do we order that? And are we all equally informed about the state of play currently?
Right, so again, it sort of connects to the thing I also put in the chat about taking centralized views. I think to the degree that we're a popcorn machine, or decentralized, we can only get so much traction. So organizations like NISO or CHORUS or anybody else, you know, arXiv, who are trying to put this together and have the broader conversations, are really vital right now, I think. And the same goes for the humanities as well: what are the things which can really support those systems, so that scholars are getting the benefit of connected open access to archives, to documentation, so that the support is there and the research can continue, by people, not just by machines?
So at this point, I'm going to ask probably a really obvious question, but sometimes those are important. Who all do we need to have in the room to have this conversation? Not necessarily individuals, but what groups, what communities do we need, to make sure that the fullest conversation is happening, that all the voices are being heard, so we don't accidentally forget something or create a system that's not interoperable? As large a room as we need it to be.
Who all needs to be in this room? Can I just say, I appreciate your question so much, because I think it's a really inclusive one. But the other thing I would really appreciate is the inverse of that, which is to just be clear when, in fact, the room is limited. And I think it mostly is. You know, I make this argument, I don't know how successfully, that research and research value is really broad and complex, but that almost all of the drivers here are around very particular fields: high-output, high-dollar fields.
And I would love it if we just said, OK, you know, it's really important to get open access, however we define that, for biomedical research. Let's put that there, and let's just acknowledge that we're delimiting certain things, rather than saying it's about everyone. So I agree with you: it's great to have more people in the room and more perspectives in the room.
And I would say, obviously, it should be lots and lots of historians and humanists, but I also think it's quite important to recognize and be responsive to the fact when it is really, in fact, discrete. So I wonder there. I mean, I just put a WAG in the chat, which is: could you have communities by domain having similar focused conversations with real outputs, where representatives could then come and convene? Because I think it's the kind of thing where you have to contemplate the expertise and the goals of those communities.
You can't overlook them, and then come together and say, well, where do we have points of connection? And how does that help us get clarity about what can be advanced, and how, and when? And to your point, Todd, I think that would also give clarity on the metadata requirements. Sorry, I'm trying to read these comments in the chat and respond to them at the same time.
Some of the questions that you're asking, I'm thinking about what we're actually doing with digitizing rare materials in libraries. And those are very library questions; library systems don't interoperate with museum object systems, for example. So researchers that want to explore both aspects of 18th-century materials have to work differently.
So I'm going to answer your question here in the chat, Howard, about PIDs, and also give a quick response to Andrew. Great, thanks. Thanks, Karen. You can just talk your answer. OK, I could.
I'm just responding to Andrew's point that we need an alternative to the notion that everybody else's data should be free. And I think that's true, because we literally don't understand what other people's data is, or are, depending on whether you're a singular-data or plural-data person. It is a blocker to interoperability. And it again goes to my response to Kiana: we should acknowledge when we're having too-limited conversations, because it would help us to be more discrete and to tackle the kinds of problems that we're discussing here.
Take the kind of data I talked about at my last appearance. We talked about the sensitive issues around, for example, aggregated data on the transatlantic slave trade, and how that's different from the kind of sensitive data we deal with in biomedical research protocols. Both are sensitive data, but sensitive in entirely different ways, and the ways they can be used are really different. So everybody's data isn't the same at all.
And Howard, when we're talking about which PIDs are useful: even when I was doing more active ORCID work, just persuading humanists that an ORCID ID was really helpful, not just to them but to everybody in the community, was a lift. It's still a lift, actually. And just talking about how, for a humanist, the kinds of outputs that we have needed to be recognized.
I see Krishnan is here. But recognized in ORCID-type data structures, so that we're not jamming humanities outputs into awkward categories. These things are important. Anyway, I'll answer about specific PIDs here in the chat. Alex, I liked your devil's advocate question, so I'm going to introduce that into the room.
Should all data be open? Are the products of analysis proprietary or not? I'll open that up for anyone in the room who wants to take a stab at it, just to be a little provocative. Oh, go ahead.
Do you have something? So, if you look at a lot of the agency plans, or even the OSTP memo, they're going to say that, yes, it should be open unless it shouldn't be. Right? So, for instance, some things would breach HIPAA; that would be a bad thing. Or things that would damage a person's right to get a job, or that would let a person be targeted.
And it's not even just people data, though that is key. I also heard from the USGS, the US Geological Survey: they can't release certain maps because they show toxic zones or nuclear waste sites. There are certain things that just can't be open. But the onus is on you, as the people that own the data, to say why it can't be open and give a rational reason why it shouldn't be.
Well, where I was going to be controversial is this: there are certainly a lot of interests saying this sort of content should be open and machine-readable because it will advance progress in these areas. It will. But a lot of that, I think, actually relies on commercial interests wanting to get their hands on the materials themselves.
Right, not having these big gated collections. And we know a lot; from working in geoscience, I can speak to oil and gas, or pharma, or what have you. What really happens is that they all want to ingest all of the open information into their very expensive commercial systems, where, because of their deep pockets, they can put a lot of experts on it and do a lot of the analysis. But they're not releasing that back out for the public good.
So there's this sort of bind where everything should be free, but then there are Google-scale advantages, or big corporate advantages, to who can analyze this and who can keep pace with that. So I think it's part of our interest to ensure that in the public sphere we're equally resourced and equally expert, and really have sets of goals so that we can keep pace.
And one of my concerns is that we tend to be more belabored, right? We're not as agile. And so we don't, in fact, keep pace with what happens commercially. I think that's a very tricky question. Again, I'm looking at the stream of research coming into the archive, and it's certainly lots of publicly funded research around the world, or research at universities, et cetera, that is, by and large, very open.
And interestingly enough, we just had a startup come to our archive, to integrate in our labs section, that's trying to be a GitHub for data, which seems to be a very promising approach. We have not to date offered to host the data sets for the papers that we host, because it is in some ways an unbounded problem. But there are parties trying to do this.
Right? And there's also the conversation of, well, if we can't host it, maybe the university that did the research will offer to host it, hopefully indefinitely, and we'll just get a link back to where the data set is, preferably with a PID, et cetera. I guess what I'm saying is we shouldn't say, oh, it's useless to try to make the data available because so much of it is proprietary.
I'm looking at it the other way: so much of it is non-proprietary; it's just not been made available. We should be thinking of ways to make it available. And I'd say that we would certainly all agree that having consistent metadata for these resources is the first step, which is a difficult problem to solve because you want actionable metadata.
And every research project has its own data set and its own way of processing it. So in some ways you want to combine the code with the data, with instructions, et cetera. It's not an easy problem. Right? And I think that's where Howard and I had talked about this; this is where my comment came from that we were decades away in the metadata space.
But it's really interesting, because I think politically some of these initiatives are racing ahead without the foundation being laid, and with the foundation being decades away. So the question is, what happens in the absence of metadata? Does that gap grow? No, I think that's too pessimistic. I think you're looking for one grand solution.
And that might be decades away. I'm not sure there will ever be a grand solution, but that doesn't mean we shouldn't do what we can do now, which is basically what the scientists need. I agree, Charles. In fact, we're seeing lots of different initiatives happen in different disciplines. In the Earth and space science area, for example, they've made a lot of progress on data.
Now, are they perfect, Alex? Absolutely not. But are they making progress? Are we starting to see data citations show up in the AGU articles? Yes. But has it taken years? Yes. And do they all seem to have different syntaxes that they use? Yes. But they're talking, and even in that small space, they are making progress.
So I think it's not a matter of waiting for the perfect solution, because we all know if you wait, nothing will ever get done. You have to make incremental steps. You have to learn from each other. You have to use groups like NISO to keep people talking to other groups like CHORUS, making sure that all of this information is transparent and that we can build upon each other, because we don't all have the money individually.
Right, but maybe collectively we do. The standard for the sciences should be what it's always been: when you publish a paper, you provide enough information for someone else to reproduce your research. And increasingly that will include providing the data set. It's really as simple as that.
Now, it doesn't mean that the people consuming the data set might not have to do some work to make use of it or interpret it. But again, that's what science is about, right? You want them to potentially approach your data set in a manner different than you may have approached it originally. I'm going to pipe in here because I'm looking at the time, and we have about six minutes left.
And I also have not left us enough time, but I wanted to make sure that we had the opportunity to think about possible deliverables after our session: what projects could potentially come out of this? What are the conversations that need to be had? I know someone put one in the Google Doc, so thank you for doing that. Are there additional areas of work that we see need to start happening? Recognizing that we're at the start of this conversation, what are some things that need to be started in order to fully implement the memo?
But could you post the Doc link again? I was looking for it. Sorry, sure. There you go. Thanks. So, Howard, I'd suggest CHORUS, I think, as maybe the best respondent for this, given your role in the space.
There was another conversation that came up yesterday; I can't remember which session it was. Another topic that has come up is: how do we ensure compliance, and how can we build a better data exchange system for whether this thing was done, you know, for whom, by whom? How do we improve the data exchange there?
And someone had suggested the idea of a kind of COUNTER-type standard that describes: these are the data that we need in order to assess compliance, and here is an API structure that moves that data, à la SUSHI, that would simplify the gathering and reporting of those things. It sounded a lot like what CHORUS does, but I'd be interested to hear your perspective on that idea.
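[As a rough illustration of the COUNTER/SUSHI-style idea raised here: a minimal sketch, in Python, of what a machine-readable compliance report payload might look like. No such standard exists yet; every field name and the report envelope below are illustrative assumptions, loosely modeled on SUSHI's Report_Header/Report_Items pattern for COUNTER usage data.]

```python
# Hypothetical sketch of a SUSHI-style public-access compliance report.
# All field names are illustrative assumptions, not a published standard.
import json
from dataclasses import dataclass, asdict
from typing import List

@dataclass
class ComplianceItem:
    doi: str              # PID of the published article
    award_id: str         # funder grant identifier
    repository: str       # where the accepted manuscript was deposited
    deposited: bool       # was a copy actually deposited?
    embargo_months: int   # remaining embargo; 0 means open now

def build_report(funder: str, items: List[ComplianceItem]) -> str:
    """Serialize a compliance report into a JSON envelope, loosely
    modeled on SUSHI's Report_Header / Report_Items structure."""
    payload = {
        "Report_Header": {
            "Report_Name": "Public Access Compliance Report",
            "Funder": funder,
            "Release": "0.1-draft",
        },
        "Report_Items": [asdict(i) for i in items],
    }
    return json.dumps(payload, indent=2)

# Example: one article deposited against a (made-up) award.
report = build_report("Example Agency", [
    ComplianceItem("10.1000/xyz123", "AWD-0042", "agency-repo", True, 0),
])
```

[The point of the sketch is the design choice being discussed: a shared, standardized payload that any funder or publisher system could emit and harvest, rather than each agency's bespoke deposit workflow.]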
I wasn't part of that conversation, so it might have been during the evening meeting. So, some of the agencies will say that the only way to comply is to follow exactly what they say, meaning use their system and insert an accepted manuscript into their public access repository. Now, whether or not the agencies would want to
try to streamline that in some way by working with stakeholders, some of the agencies do, right? Some of the agencies work with CHORUS, and they accept the harvesting of information directly from publishers into the repositories as being part of their compliance. I'd be interested to know more about precisely what they were talking about, because I think, again, it's going to vary by the size of the agency and their technical know-how, whether or not they'll be able to contribute to that conversation.
Well, again, I think part of this idea was not simply federally funded research, because there are also philanthropic organizations that are contributing and demanding open access compliance with their Sloan grants, or their Arnold Foundation grants, or Gates, or Wellcome, et cetera. So again, this is broader than just the federal Nelson memo issue.
But this issue of compliance, and who's doing what, where, and with whom, is an interesting question. I think it is an interesting question. And I think you'll quickly get into the realm of how each funder worldwide treats its grantees and the reviewing of their grants.
But recognizing that, I'll ask a big question at the end of our time together. And that's the beautiful thing about the Google Doc: as we go through our day and have more coffee and mull things over, that document will be there for us to continue to add different ideas and opportunities. So I very much encourage us all to do that. And I want to thank our panelists today for a really engaging conversation.
Thank you all for participating so well. I know I missed many things in the chat, so I appreciate you putting them in there, and it's going to get saved. So again, thank you so much for such a really great conversation, and I hope to see you in the rest of the conference. Thanks, everyone.