Identifiers, metadata, and connections
https://asa1cadmoremedia.blob.core.windows.net/asset-c5878f1e-b482-4a86-8a4f-fe8fbc6d6c22/36 - Identifiers%2c metadata%2c and connections-HD 1080p.mov
SPEAKER: Hi, everyone. Welcome to NISO Plus 2021 and this session on identifiers, metadata, and connections. This session is going to have a couple of individual presentations. The first is going to be "Hocus Pocus, Mixing Open Identifiers" by a whole selection of well known identifier groups. This will have representatives from DataCite, Crossref, ORCID, and ROR all talking about how identifiers work within the ecosystem.
SPEAKER: And then we're going to have a follow up presentation on sort of what you can do with those identifiers, at least one of the things you can do with those identifiers. One on data visualization. That's going to be primarily from Cactus Communications, Deb Wyatt and Donald Samulack. And then secondarily, we're going to have a follow up by Dario Rodighiero, who is going to show you how he did his visualization of this very conference.
SPEAKER: And so this is a really fun session. Lots of interesting stuff. So I hope you enjoy it. Afterwards, we will have a discussion with all of the speakers, as well as with the moderator, Jonathan Clark. Thank you very much, and enjoy the session.
RACHAEL LAMMEY: I would like to thank NISO for the invitation to speak today. And I'm going to talk about how mixing open identifiers into metadata makes connections between research work. But I'm not going to do that all by myself, because I'm talking about connections. So I'm going to join with colleagues from DataCite, ROR, and ORCID to talk you through our vision as to how and why this should work.
RACHAEL LAMMEY: So a lot of you will be familiar with persistent identifiers and their roles in creating sort of persistent and individual links for people, places, and other things in the research community. But sometimes all of that stuff can sit separately and not be connected to each other. And I think the community as a whole is starting to realize the value in that. So you can see that identifiers allow us to follow a thread from the funding for a project through the project team to the outputs and then filter the attribution back up the chain so that the funders and the researchers can be rewarded for the work that they've done.
RACHAEL LAMMEY: But I think I'd also like to argue that in the real world, it's often not quite as linear as this. And research is really a bit of a graph. So connecting authors, publications, funding, et cetera is a lot more multifaceted than just that up and down motion. So connecting everything is one of the real benefits of using persistent identifiers. And making those and the metadata associated with them openly available helps effectively map the research ecosystem and helps us make connections and links between different research and researchers.
RACHAEL LAMMEY: So open identifiers and metadata are key. And that's what the group that I have today are going to talk about. So from my perspective, I'm Rachael. I'm head of special programs at Crossref. And a lot of these statistics will look familiar. Yes, we work with lots of different scholarly content items, and we make the metadata associated with those openly available.
RACHAEL LAMMEY: And towards the end of last year, we committed via our blog and via our board to the principles of open scholarly infrastructure, which try to talk to the openness of our data, the fact that we want to be community governed, and that we want to be sustainable in the long term. And part of that is working with our partner organizations and with the community to make sure that we make those effective connections.
RACHAEL LAMMEY: And our members send us lots of different metadata. And I've highlighted in bold some of the aspects of that that are related identifiers that then help us to make these connections between other things in the research ecosystem. John Donne has written a book called No Man Is An Island. And it basically talks about why humans don't do well in isolation and the benefits that can be conferred from basically the fact that we can thrive whenever we're connected to other people and other communities.
RACHAEL LAMMEY: So I can let my little PDF hang out here on a desert island, but I think that the way to set that to life is to ensure that the connections are made from articles via metadata to other related things to really put the research in context. So this is a deliberately busy slide, and I'll explain why, in that just providing identifiers for other aspects of the paper really help put the research in context.
RACHAEL LAMMEY: So first of all, it can tell me if the document's current. And I can look at the ORCID ID of the contributors to the paper, and I can also find information via their ORCID IDs of other achievements in their space, other papers, funding they've received, peer reviews that they've done, et cetera. So being able to find all of those things via the related identifiers helps amplify the discoverability of a paper.
RACHAEL LAMMEY: In terms of funding, I'll be able to see other papers and other outputs that are related to that funding and help me take a look and ascertain if there might be any conflicts of interest. I can see the value that's being added in terms of the review process. So I can see the preprint, I can see where the paper has gone on to be published, and I can see the review process that happened along the way.
RACHAEL LAMMEY: So that gives transparency to what's happened with the paper, any concerns that have been raised, and the developments of that paper. And it also gives the reviewers credit for the otherwise hidden work that they would be doing. And I've also got links to data and code. So again, there's a lot of work on this within our communities. But being able to easily find and link to the data and code that underlie the paper helps the research to be reproduced time and time again.
RACHAEL LAMMEY: And it also, again, helps those individuals who developed and shared that information and gives them credit for other research outputs that aren't just the article that appears at the end of a very long process. The fact that those are persistent means that they can be linked to in the long term. So in terms of linking and making connections, I've got my string.
RACHAEL LAMMEY: It's an identifier string. I think the best way to find out is to pass that over to Gabby at ORCID, who's going to pick up the thread of this narrative. Thank you.
GABRIELA MEJIAS: Hi. I'm Gabby Mejias, Engagement Manager for Europe, Middle East, and Africa at ORCID. And let me start with a short introduction of who we are and what we do. ORCID is a non-profit community driven organization, and we aim to solve [INAUDIBLE] issue in scholarly communications by enabling transparent and trustworthy connections between researchers and their activities.
GABRIELA MEJIAS: We provide an open and non-proprietary registry of persistent identifiers and metadata associated to researchers and all those who contribute to research. Organizations can use ORCID as a hub to synchronize metadata, and our open APIs make it possible to connect researchers to identifiers for their contributions, affiliations, funding, and more and also to read or import such information.
GABRIELA MEJIAS: We have a large and engaged community of users and integrators including researchers, our institutional members, our regional consortia, and all those adopting ORCID to streamline research workflows. All together, we're working to improve transparency and trust in research. Our vision of a robust research infrastructure is powered and connected by persistent identifiers, and we rely on the community to share and reuse those connections.
GABRIELA MEJIAS: We work closely with other peer organizations like Crossref and DataCite that have enabled workflows to automagically update ORCID records with DOI metadata. Universities, libraries, research institutions, publishers, and increasingly funders are integrating ORCID to correctly identify and recognize their researchers. Individuals can use their ORCID IDs in different stages of the research lifecycle.
GABRIELA MEJIAS: And as you can see, in real life everything is connected. And an open research infrastructure should make those connections visible and transparent, and that can be done with persistent identifiers. So let's see how that looks like. An organization can publicly assert affiliation information. Seen on this example include [INAUDIBLE] identifier that also links to a ROR and other IDs for the institution.
GABRIELA MEJIAS: Data provenance is very important for us, and that's why we clearly display when the item was added and last modified and also the source of information, which in this case is the Tel Aviv University. Funding is also part of our metadata schema. And applications like Dimensions can link grant information via identifiers for the funding body from the Crossref funder registry and identifiers for the grant itself.
GABRIELA MEJIAS: Works can also be connected to ORCID records. We currently support 43 different types of contributions like this preprint identified with a Crossref DOI and added by Europe PubMed Central. We also enable recognition for peer review activities, an important element of the scholarly record. In this case, [INAUDIBLE] connected an open post publication peer review, linking the DOI for the review report and the ISSN associated to the journal.
GABRIELA MEJIAS: Last, individuals can be connected with information about resources used for the research as done by EMSL connecting this information with the organization ID for the resource provider and the project ID for the resource. Privacy and consent are core principles for us. And researchers control who can access the metadata on the records.
GABRIELA MEJIAS: As part of our commitment to openness, all public data on the registry is shared annually via our public data file. And our public API makes it possible to obtain IDs and also to read public information. The whole community can benefit from our open tools by using them to build other tools and services or to do research. Dario Rodighiero, also present in this session, has used our public data dump to analyze the research community in terms of relationships and individual trajectories.
GABRIELA MEJIAS: But the road to open research is never complete. If we want to avoid islands, as Rachael said, or silos in the research ecosystem, we need to work together as a community to improve access to information, opportunities for collaboration, and to reduce the administrative burden and ultimately to improve trust and transparency in research. At ORCID, we are committed to doing so. And as part of our upcoming work, we are adding support to link CRediT contributor roles to works.
GABRIELA MEJIAS: We are also enabling connections between contributions and their allocated funding by adding a new relationship type called funded by. And as part of that work, we're also going to add support to DOIs for grants. And last but not least, we are adding support for ROR so that organizations and individuals can connect affiliation information directly with a ROR ID.
GABRIELA MEJIAS: And at this point, I will pass the string to Maria.
MARIA GOULD: [INAUDIBLE] string. Hello, everyone. I am Maria Gould from California Digital Library where I am also the project lead for the Research Organization Registry, better known as ROR. And my part of the string that I am talking about today is identifiers for organizations. ROR is an open registry of identifiers and metadata for research organizations. If you'll excuse the metaphor mixing for a moment, I want to ask all of you to think about scholarly infrastructure as a set of puzzle pieces we're trying to fit together to show a complete picture of research activities.
MARIA GOULD: And there are three really key pieces in this puzzle. The people doing the research, the research outputs they are producing, and the research institutions that they are associated with. Of course, there are many more elements, but these three are really foundational. And we rely on identifiers to put these pieces together. And within this picture, ORCID identifiers for researchers and DOIs for research outputs have become very well established in the infrastructure that we use to publish and discover research.
MARIA GOULD: But we've been missing the puzzle piece that could complete this picture by linking organizations to those DOIs and ORCIDs. So for instance, finding and tracking all research associated with a specific institution is something many of us have to do on a regular basis, whether we're research administrators at a university or librarians collecting publications or funders managing the research they have supported or publishers trying to determine open access publishing workflows for specific institutions or quantifying publishing output by institution to factor into transformative agreements.
MARIA GOULD: And it seems like finding all of this research should be pretty easy, but it has actually become quite challenging and cumbersome. And one reason for that is the lack of an open standard identifier and high quality metadata to support efficient and comprehensive discovery and linking. So what we've ended up with is a bunch of free text affiliation strings floating around in the wild, as you can see here.
MARIA GOULD: So if you're searching, for instance, for all of the articles by UCSF researchers, your results are not going to be complete, because there are so many different ways that this affiliation can be expressed. You might also discover that there are identifiers associated with these affiliations, but the identifiers for them and the metadata associated with them are in a proprietary or commercial database that not everyone can access, use, and reuse.
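The matching problem described here can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the records, the `10.5555` DOIs, and the ROR IDs below are made-up placeholders, and the function names `find_by_string` and `find_by_ror` are hypothetical, not part of any ROR or Crossref tooling.

```python
# Hypothetical records. The DOIs and ROR IDs are placeholders, not real
# registry data; the affiliation strings imitate free-text variants.
UCSF_ROR = "https://ror.org/EXAMPLE-UCSF"

records = [
    {"doi": "10.5555/a", "affiliation": "UCSF", "ror": UCSF_ROR},
    {"doi": "10.5555/b",
     "affiliation": "University of California, San Francisco",
     "ror": UCSF_ROR},
    {"doi": "10.5555/c",
     "affiliation": "Univ. of California San Francisco",
     "ror": UCSF_ROR},
    {"doi": "10.5555/d", "affiliation": "Some Other University",
     "ror": "https://ror.org/EXAMPLE-OTHER"},
]


def find_by_string(records, text):
    """Match on the exact free-text affiliation string."""
    return [r["doi"] for r in records if r["affiliation"] == text]


def find_by_ror(records, ror_id):
    """Match on the organization identifier instead of the string."""
    return [r["doi"] for r in records if r["ror"] == ror_id]


by_string = find_by_string(records, "UCSF")
by_ror = find_by_ror(records, UCSF_ROR)
print(by_string)  # ['10.5555/a']
print(by_ror)     # ['10.5555/a', '10.5555/b', '10.5555/c']
```

The string search misses two of the three records because the affiliation is spelled differently, while the identifier search finds all of them regardless of spelling.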
MARIA GOULD: So coming back to this puzzle metaphor, what we've been needing is an affiliation piece that can be connected to the other puzzle pieces without any restrictions. Now, ROR is not the first identifier for organizations that has ever existed, but it is the first one to be developed for a very specific and unique purpose, which is to provide a truly open and community driven solution to the problem of how we can connect research outputs to institutions and how we can do that within the context of existing open research infrastructure.
MARIA GOULD: And so the good news is that we can start putting the puzzle together right now. ROR launched its registry two years ago. We currently have 99,000 ROR IDs for research organizations around the world. This data is all openly available in a front end search, which you can see on the left hand side of the screen, as well as an open API and a public data dump. So all of the data is CC0.
MARIA GOULD: All of the code is open. Everything is openly and freely available and readable by humans and machines. ROR is beginning to be integrated into many systems across the scholarly communication landscape, and ROR IDs are already supported by DataCite and soon will be supported in Crossref and in ORCID, which will help us be able to complete this picture and unlock more knowledge and insights about research activities.
MARIA GOULD: And so because ROR is open and because it is being supported in infrastructure like Crossref and DataCite and ORCID, it will become easier over time to find all of the research associated with a specific institution, like UCSF in our example, when the DOI metadata for research outputs includes ROR IDs for affiliations. So from the metadata perspective, here's an example of where we see this going and how it can look.
MARIA GOULD: So this is metadata from Dryad's data repository that has been deposited in DataCite where you can see the ROR ID being used to identify the UCSF affiliation. You can also see the ORCID identifier for the researcher has been included as well, all wrapped up into the DOI metadata. So this is what we mean when we talk about having rich metadata and open identifiers mixed together to make magic.
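As a rough sketch of what such a record might look like to a program, the Python below models one creator entry and pulls out the ORCID and ROR identifiers. The field names follow the JSON form of the DataCite metadata schema as best understood here and should be treated as an assumption; the DOI, ORCID iD, and ROR ID are placeholders, and `extract_links` is a hypothetical helper.

```python
# A DataCite-style record, simplified. All identifier values below are
# placeholders for illustration, not real DOIs, ORCID iDs, or ROR IDs.
record = {
    "doi": "10.5555/example-dataset",
    "creators": [
        {
            "name": "Researcher, Example",
            "nameIdentifiers": [
                {
                    "nameIdentifier": "https://orcid.org/0000-0000-0000-0000",
                    "nameIdentifierScheme": "ORCID",
                }
            ],
            "affiliation": [
                {
                    "name": "University of California, San Francisco",
                    "affiliationIdentifier": "https://ror.org/EXAMPLE-UCSF",
                    "affiliationIdentifierScheme": "ROR",
                }
            ],
        }
    ],
}


def extract_links(record):
    """Return, per creator, the ORCID and ROR identifiers in the record."""
    links = []
    for creator in record.get("creators", []):
        orcids = [
            n["nameIdentifier"]
            for n in creator.get("nameIdentifiers", [])
            if n.get("nameIdentifierScheme") == "ORCID"
        ]
        rors = [
            a["affiliationIdentifier"]
            for a in creator.get("affiliation", [])
            if a.get("affiliationIdentifierScheme") == "ROR"
        ]
        links.append({"name": creator.get("name"),
                      "orcids": orcids, "rors": rors})
    return links


links = extract_links(record)
print(links[0]["orcids"])  # ['https://orcid.org/0000-0000-0000-0000']
print(links[0]["rors"])    # ['https://ror.org/EXAMPLE-UCSF']
```

With the person and the organization both expressed as identifiers inside the DOI metadata, one record links output, researcher, and institution without any string matching.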
MARIA GOULD: And on that note, I will pass the string over to Helena Cousijn at DataCite.
HELENA COUSIJN: Thank you, Maria. Hi. I am Helena. I work for DataCite as DataCite's director of community engagement. And at DataCite, our vision is to connect research in order to identify knowledge. So how do we do that? DataCite is, like Crossref, a DOI registration agency. And so we provide DOIs for a range of research outputs. What you see on my screen is an institution, and DataCite is a membership organization.
HELENA COUSIJN: Research institutions join us to be able to use our services. Historically, research outputs such as data sets would often just be stored on a researcher's computer. And so then when the researcher left, the data set or the other research outputs would just disappear as well. But by assigning a DOI, you can ensure that the research outputs become part of the research ecosystem, and they can be connected to other research outputs and other entities.
HELENA COUSIJN: They become discoverable in a range of search engines, and they become easy to track. It's easier for an institution to know which research outputs there are and also whether these are being reused. And then an institution can use this information to report back on that. Now, as I think was clear from some of the other talks, metadata registration is a really crucial part of DOI registration and the use of any persistent identifier.
HELENA COUSIJN: So when a member registers a DOI with us, they also register accompanying metadata following our metadata schema. And you can see the properties of the metadata schema here. And as Maria mentioned in her talk, it's very important to include the affiliated institution through a ROR identifier. We ask for a creator with an ORCID. And under recommended, you can also see that we have a related identifier property where any other identifiers that are related to the DOI can be included.
HELENA COUSIJN: So for example, related Crossref DOIs indicating that data were cited in a particular article. Now, through our metadata, we already knew about the connection between A and B. But within the European funded FREYA project, we wanted to take things a step further. And as Rachael said in her talk, research in the real world is a graph; it is connected already.
HELENA COUSIJN: So our idea was to link PIDs for different entities together via relations in the metadata. And basically, what that means is that we don't just know about the connection between A and B, but we can attach more to that so that then we also know about the connection between A and C. And so all this information then forms a graph that we can use to answer new questions.
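The step from pairwise links to a graph can be illustrated with a minimal Python sketch. It is not the FREYA or DataCite implementation, just a breadth-first traversal over made-up nodes A to E; `build_graph` and `connected` are hypothetical names chosen for this example.

```python
from collections import defaultdict, deque

# Pairwise relations as they might be recorded in metadata.
# A, B, C, D, E stand in for PIDs (DOIs, ORCID iDs, ROR IDs).
edges = [("A", "B"), ("B", "C"), ("D", "E")]


def build_graph(edges):
    """Build an undirected adjacency map from pairwise relations."""
    graph = defaultdict(set)
    for left, right in edges:
        graph[left].add(right)
        graph[right].add(left)
    return graph


def connected(graph, start):
    """Return every node reachable from start, directly or indirectly."""
    seen = {start}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for neighbour in graph[node]:
            if neighbour not in seen:
                seen.add(neighbour)
                queue.append(neighbour)
    seen.discard(start)
    return seen


graph = build_graph(edges)
print(sorted(connected(graph, "A")))  # ['B', 'C']
```

No record states that A is related to C, but because A links to B and B links to C, the traversal surfaces that indirect connection, while D and E stay in their own separate cluster.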
HELENA COUSIJN: And what you see here is the PID graph last year and all the connections that were in it between data set, software, but also the researchers that generated this, the organizations that these researchers were affiliated with, and the funders that funded the research. And because we realized that not everyone really wants to dive into a graph, we built an interface on top of it called DataCite Commons, which you can find at commons.datacite.org if you want to take a look.
HELENA COUSIJN: And here you can see all that information that's contained within the graph. And so for example, you can see citations of data sets. Here's a Dryad data set where you can see it was cited once by the paper that was based on the data set. And it was also viewed 99 times and downloaded 16 times. And when it comes to supporting recognition for each research output, you can see who the creators were that contributed to the data set or output.
HELENA COUSIJN: And also done using the ORCID ID, you can look at a specific researcher and you can see all their outputs and reuse of their outputs. And also at the level of a funder, here the European Commission, you can see all the different outputs that are associated with that funder, the reuse, but also the type of work. Was it a data set?
HELENA COUSIJN: Was it software? Was it a publication? And what license was assigned to it? So I hope that all these talks have shown you that connecting open metadata is very important. It helps you with the discovery of different kinds of outputs and entities. It can improve your workflows. It can help you with global collaborations, because you have a better idea of contributors to different outputs.
HELENA COUSIJN: It helps you with tracking and analytics so you can do better reporting. And so overall, that means it helps you with identifying knowledge. Now, if you're wondering what you can do following our session, there are a couple of steps you can take. And these are also outlined in a paper in Patterns that we published recently.
HELENA COUSIJN: The first step is maybe more obvious. It's very important. You use PIDs for all entities and all the different stakeholders can play a role in this. But then it's also important that you track and record connections between PIDs in the metadata. And so for example, as a researcher, when you publish a second paper based on a data set, then please go back to the data set and indicate in the metadata that there's now another paper associated with the data set.
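Going back to a data set and recording a new paper could look roughly like the Python sketch below. The `relatedIdentifiers` structure and the `IsSupplementTo` relation type follow the DataCite metadata schema as understood here; the DOIs are placeholders, and `add_related_paper` is a hypothetical helper rather than a function of any DataCite client library.

```python
def add_related_paper(dataset_metadata, paper_doi,
                      relation_type="IsSupplementTo"):
    """Record that a paper is related to the data set, without duplicates."""
    related = dataset_metadata.setdefault("relatedIdentifiers", [])
    entry = {
        "relatedIdentifier": paper_doi,
        "relatedIdentifierType": "DOI",
        "relationType": relation_type,
    }
    if entry not in related:
        related.append(entry)
    return dataset_metadata


# Placeholder DOIs for illustration only.
dataset = {"doi": "10.5555/example-dataset"}
add_related_paper(dataset, "10.5555/first-paper")
add_related_paper(dataset, "10.5555/second-paper")
add_related_paper(dataset, "10.5555/second-paper")  # repeated call, no effect
print(len(dataset["relatedIdentifiers"]))  # 2
```

In practice the updated metadata would then be sent back to the registration agency through whatever repository or API the data set was registered with; that step is left out here.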
HELENA COUSIJN: And then the last thing, and this is where open infrastructure comes in again, it's very important that infrastructure providers such as the organizations that we work for aggregate the information and make it openly available so that everyone can benefit from it. So thank you very much. We're very happy to answer any questions you may have now or after this talk.
HELENA COUSIJN: So please feel free to get in touch with us, and please make sure you help us establish these connections. Thank you. [MUSIC PLAYING]
DEBORAH WYATT: Hi. I'm Deborah Wyatt. I'm the vice president for society and academic relations at Cactus Communications. I'm joining you from the traditional lands of the Wurundjeri people, which you might know as Melbourne in Australia. I work with Cactus' research communications and global marketing agency Impact Science, and we bring research to life in new ways and try and bring communities together around knowledge and ideas.
DEBORAH WYATT: Me, I've worked in academic publishing and editorial roles for over 20 years now. I want to start by thanking the NISO Plus 2021 organizers for inviting us to join this virtual conference and participate in the panel. You're also going to hear a little later on today from my colleague Don Samulack. He's going to be joining in the live Q&A session, and he's going to pop in to say hello to you right now.
DONALD SAMULACK: Hi, everyone. I'm so happy to be with you here today virtually. My name's Don Samulack. I'm coming to you from my home just outside the Philadelphia area. I've been with Cactus for about 13 years now and in the research publishing community for what seems like forever, first as a researcher, then I worked in the pharmaceuticals for a while, then I was director of the scientific editing department at St. Jude Children's Research Hospital for several years.
DONALD SAMULACK: But now I'm working with the Impact Science team, and I'm head of global stakeholder engagement. Back to you, Deb.
DEBORAH WYATT: The 21st century is characterized by rapid technological development. And with these changes have come evolution in the way that we perceive, understand, and share information with each other. As the technological tools that are available to us have improved, we've unlocked newer, more efficient, and more impactful ways to communicate information. The internet has supercharged our ability to connect and to communicate.
DEBORAH WYATT: And of course, it's increased our capacity to gather information, conduct research, and produce content that we can share with others. Having ready access to a broad and deep corpus of data and information has enabled connections between ideas at a scale that just wasn't possible before. Public health challenges that our communities are facing during this pandemic are a great example of our existential need to bring global multidisciplinary teams together around ideas and research.
DEBORAH WYATT: The scale of this shift in connection calls for a shift in the way that we approach research communication. In a changing digital landscape, we have a shared responsibility to work together and to ensure that trusted research reaches the people that really need it. The speed at which things were being published in 2020 just showed us how fast an open research publication can actually be.
DEBORAH WYATT: Already in the global digital landscape, there's so much information out there vying for mind space. As producers of content, the first question that authors, editors, and publishers often ask is how do we make our research accessible and how do I get it noticed where it needs to be seen? The second question is, how do we share content in a way that's reliable and accurate and is in line with community standards of rigor, ethics, and integrity.
DEBORAH WYATT: To maximize reach and impact, it's not enough for a paper to just get published. In today's digital communication tools, there's potential to do so much more. A video or an infographic provides a simple at a glance summary of a study and provides that key data accurately but in a way that allows it to be understood within minutes. This is a huge benefit for busy professionals and clinicians who are struggling to keep up with reliable, relevant, and clinically accurate information.
DEBORAH WYATT: Feature articles and podcasts, they unlock the potential for broader discussion of research and its implications for different communities. This is tremendously powerful at a time when organizations and individuals are really seeking ways to address inequality and to encourage greater diversity and inclusion. The benefits of these content formats are clear. They're widely used, they're in high demand, and they're here to stay.
DEBORAH WYATT: It's time that we start thinking about these content formats as core components of research publishing, just as important for science and scholarship as published technical papers are. The challenge for us now is to ensure that we apply the same standards of rigor and FAIR (findable, accessible, interoperable, and reusable) principles to this content. We must make sure that the standards of research, ethics, and integrity are protected.
DEBORAH WYATT: Of course, a key topic of discussion in our evolving research publishing landscape is transparency and open scholarship. To achieve true accessibility, we need to rethink what research communication means, and we need to use newer, more efficient content formats in addition to traditional manuscript publication. We also need to create infrastructure, build the technology, and create guidelines and systems that will allow these content formats to be in keeping with global publication standards of ethics and integrity.
DEBORAH WYATT: The Open Scholarship Initiative is a huge step in that direction. There's a number of global stakeholders working with UNESCO to develop solutions to open scholarship and transparency. Impact Science, which has come to lead research communication in the global South and East, has now become part of the Open Scholarship Initiative. Our head of Impact Science, Harini Calamur, is going to fill us in on the role that Impact Science is playing in the Open Scholarship Initiative.
HARINI CALAMUR: As far as the Open Scholarship Initiative is concerned, Impact Science hopes to contribute to formulating global standards around research communication that go beyond the technical paper. We are hoping that we can bring in the DARTS standards to this, essentially ensuring that the communication is discoverable, accessible, reusable, and of course, transparent. And also in today's day and age, we have to ensure that this entire thing is sustainable.
HARINI CALAMUR: I think the starting point of this is not just to produce high quality research communication, but also to ensure that you involve the researcher every step of the way. Ultimately, you need to understand that it is the researcher's research. It is something that they have dreamt of. It's something that she has lived with. And therefore, their input, their [INAUDIBLE] used at every step of the way becomes a very important part of the process.
HARINI CALAMUR: How do we do this? We do this by tailoring the research communication to the specific needs of the researcher. For example, if it's adaption of an opinion leads to a certain format, that process that we use, the workflows that we use will be very different from if you were creating summaries of the research meant for distribution to peers to understand that research.
HARINI CALAMUR: Similarly, if you look at, let's say, communication to policymakers, that communication will be very different from, let's say, communication to the media. So depending on the needs of the researcher, we tailor our process. I think one of the most important things to keep in mind when you're doing research communication is understanding the fact that the researcher is at the center of this entire communication process.
HARINI CALAMUR: Sometimes [INAUDIBLE] that come from the culture. Sometimes there are triggers that come from just the university or the situation they are in. In other cases, it could be a personal example that leads them to the research. Sometimes it's just the shared culture of the place. It's a completely different culture. They've come up with a perspective. We did this piece around researchers in Sudan talking about gender rights.
HARINI CALAMUR: And here the culture becomes a very important part of what they have to say. In the whole process, we also ensure that the work is attributed to all the right people, including the authors, funders, institutions, and creators of a piece of content. In addition, we connect the content of the original technical papers and researchers' profiles via DOIs [INAUDIBLE].
HARINI CALAMUR: Transparency and attribution is as important to us as making each person's content stand out from the crowd. It's only when we do all this that can we say that research will have the impact that it's meant to have.
DEBORAH WYATT: Of course, it's critically important for scientists and scholars to be able to share insights with their peers and to progress scientific discovery. But what about impact beyond the academy and the lab? What are the implications of research for evidence based policy, for practice, and for the public? We need to help researchers talk about their research in ways that are understandable to the non-academic audience. This unlocks the potential for new kinds of impact, like industry partnerships, patient outcomes, and public awareness.
DEBORAH WYATT: While doing so, we need to ensure that content is accurate and that the facts aren't distorted and diluted. We found that this kind of communication when it's done right can really drive up Altmetrics and citations. There's tangible data to show the impact that this kind of communication can have. We have developed at Impact Science tools and expertise to amplify traditional publication with visual, multimedia, and journalistic content, all the time while maintaining that trust and credibility.
DEBORAH WYATT: As we develop the tools to create shareable science that's easier to understand, we also want to know how is that information being shared. Which audience is it reaching and how are they engaging with that information? At Impact Science, we're looking really closely at tracking these new formats so that we can better understand and analyze the kind of global attention and engagement that they're receiving in the global community.
DEBORAH WYATT: We've actually acquired an artificial intelligence research startup to try and drive forward the development of that thinking and to create the infrastructure that we need to really understand the diverse paths that these kinds of formats take. Last year our parent organization, Cactus, also developed a platform called researcher.life, where AI matches relevant research to the right audiences.
DEBORAH WYATT: And attribution is key. We work hard to make sure that every piece of research that we've published online is correctly attributed with its Digital Object Identifier so that we can track the work that we produce via services like Altmetrics and PlumX. We want to make sure that wherever that associated work is being discussed, it's being talked about in the right context, it's being linked to the original published research, and it's being attributed to the correct authors.
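[Editor's illustration, not part of the talk: linking derived content back to the original research depends on every copy of a DOI being written the same way. The sketch below is a minimal example under stated assumptions. The function names and the sample DOI 10.1234/example.5678 are hypothetical, and it is not a description of Impact Science's tooling. The only external facts it relies on are that DOIs are case-insensitive and that https://doi.org/ resolves them.]

```python
from urllib.parse import quote

# Common ways a DOI is written before the actual "10.xxxx/..." string.
_PREFIXES = (
    "https://doi.org/",
    "http://doi.org/",
    "https://dx.doi.org/",
    "http://dx.doi.org/",
    "doi:",
)


def normalize_doi(raw: str) -> str:
    """Return a bare, lowercase DOI (hypothetical helper for illustration)."""
    doi = raw.strip()
    for prefix in _PREFIXES:
        if doi.lower().startswith(prefix):
            doi = doi[len(prefix):]
            break
    # DOIs are case-insensitive, so lowercase gives one canonical form.
    return doi.lower()


def doi_url(raw: str) -> str:
    """Build a resolvable link to the original publication."""
    return "https://doi.org/" + quote(normalize_doi(raw), safe="/")


# Hypothetical DOI written three different ways by three different sources.
variants = [
    "DOI:10.1234/Example.5678",
    "https://doi.org/10.1234/example.5678",
    " 10.1234/EXAMPLE.5678 ",
]
links = {doi_url(v) for v in variants}
print(links)
```

[All three spellings collapse to a single link, which is what lets mentions of a summary, infographic, or video be counted against the same original article.]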
DEBORAH WYATT: Research can change lives, and it can solve global challenges. As stakeholders in the broader research ecosystem, we're really committed to making sure that the work that we produce is fair and open. So let's work together to ensure that as new research communication formats evolve and grow, we're also building the right infrastructure and adopting the right standards to protect that rigor of trust and integrity.
DEBORAH WYATT: We're so delighted to have had the chance to meet and speak with you today, and we're really looking forward to listening and learning from you in the Q&A. Thank you. We're looking forward to your questions.
DARIO RODIGHIERO: My name is Dario Rodighiero, and I'm a researcher at Harvard University. This is a very short presentation about the mapping of conferences. And in particular, I wanted to introduce you to the mapping of NISO Plus 2021. I am a member of the Berkman Klein Center for Internet and Society but also a member of the Metalab, which is a knowledge design laboratory at the intersection of arts and humanities.
DARIO RODIGHIERO: My work is situated in the mapping of science. And you see a few examples of it. My way of doing research is visual. And you see some pictures of my maps of science. And I like to work at different scales and at different dimensions. Among them, I like the idea of collaboration, the idea of academic mobility, and also the idea that there exists a lexical distance between scholars.
DARIO RODIGHIERO: This presentation is about a specific kind of mapping for conferences. And I think that especially in this moment in which events went online, attendees need new instruments to orient themselves in these kinds of events. And this is work that combines network visualization and natural language processing on the basis of the documents available for the conference.
DARIO RODIGHIERO: So for example, this is the Digital Humanities conference in 2020, and the keywords you see are based on text analysis of scientific articles. And I create a kind of topographical terrain around which individuals are situated. And the position of speakers is determined by their lexical similarity. It means that if scholars, speakers, or individuals have a vocabulary in common, they are close in the map.
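[Editor's illustration, not part of the talk: the idea that speakers with a shared vocabulary sit close together can be sketched with a simple word-count comparison. This is a minimal example under stated assumptions. Cosine similarity over raw word counts is a stand-in chosen here, not necessarily the measure used for the conference maps, and the speaker names and texts are invented.]

```python
import math
import re
from collections import Counter


def term_counts(text: str) -> Counter:
    """Lowercase the text and count its word tokens."""
    return Counter(re.findall(r"[a-z]+", text.lower()))


def cosine_similarity(a: Counter, b: Counter) -> float:
    """Cosine of the angle between two word-count vectors (0.0 to 1.0)."""
    dot = sum(a[term] * b[term] for term in set(a) & set(b))
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


# Invented abstracts standing in for the documents of three speakers.
speakers = {
    "speaker_a": "open metadata and persistent identifiers for research",
    "speaker_b": "persistent identifiers connect research metadata",
    "speaker_c": "visual storytelling for public engagement",
}
vectors = {name: term_counts(text) for name, text in speakers.items()}

sim_ab = cosine_similarity(vectors["speaker_a"], vectors["speaker_b"])
sim_ac = cosine_similarity(vectors["speaker_a"], vectors["speaker_c"])

# A higher similarity means the two speakers would be placed closer together.
print(f"a-b: {sim_ab:.2f}  a-c: {sim_ac:.2f}")
```

[Speakers a and b share most of their words and so score higher than a and c. A layout algorithm would then turn these pairwise scores into positions on the map.]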
DARIO RODIGHIERO: And this is the map of the NISO Plus 2021 conference. What you can see at this point is that the conference is composed of many keywords. And the keywords are about library science, about information science, about research, about publishing, all subjects, all topics that are discussed within the conference. But you can even see peaks that correspond to particularly lively areas of the conference.
DARIO RODIGHIERO: But these are entry points to zoom in. And when you zoom in, you see people, you see speakers. And so in this case, I present myself and my context. So the context when you zoom into the map is created by keywords, again, and people. So around me, I present my neighbors, who are Judy, Carlo, Jenny, Joy, Bill, and Yvonne.
DARIO RODIGHIERO: And with them I share a few keywords that situate ourselves inside the conference. For example, you can read research, digital, collaboration, education, communication, practice. And if you move to another part of the map, you will see different keywords. So this is the reason I wanted to define this conference map as an instrument to explore.
DARIO RODIGHIERO: And I have three goals that are very simple and that differ according to the user. And so I can imagine that this map can be used by speakers like myself to explore the context, the cultural context around them. But it can also be used by attendees to find a specific area of interest by searching the map through keywords.
DARIO RODIGHIERO: But in the future, I think that this could even be useful as an instrument to organize panels automatically according to the lexical distance of the speakers. As for why I presented this work at this conference, I have a few reasons. The first is that with open data, I can facilitate the visual mapping. So my work will be easier.
DARIO RODIGHIERO: Another one is the increase in the precision of information. Because the more information I have at my disposal, the more precise my studies can be. And then there is even a reason to create a scientific awareness among scholars. Because if you share the data, you are the people in charge of sharing your own data. And then it's interesting, because with a complete set of metadata, you can play, you can study at different scales.
DARIO RODIGHIERO: So for example, I'm doing work on academic mobility, and academic mobility can be seen at different levels, like individuals, like institutions, but even collaboration between countries. And so the presentation is finished. I leave you with my website, which you can explore. It's http://dariorodighiero.com. And here in the last slide, you can see my book, which will be published this year by [? Metis ?] Press.
DARIO RODIGHIERO: It will be published in French but also in English. And the good news is that the English version will be open and accessible on the internet with the title Mapping Affinities. Thank you for your attention. I'll see you in a while to talk together. [MUSIC PLAYING]