Name:
Wikidata and Knowledge Graphs in Practice: Using semantic SEO to create discoverable, accessible, machine-readable definitions of the people, places, and services in global information community institutions and organizations
Description:
Wikidata and Knowledge Graphs in Practice: Using semantic SEO to create discoverable, accessible, machine-readable definitions of the people, places, and services in global information community institutions and organizations
Thumbnail URL:
https://cadmoremediastorage.blob.core.windows.net/99802897-ed72-4b7f-94f5-3dc29496dcd2/thumbnails/99802897-ed72-4b7f-94f5-3dc29496dcd2.png?sv=2019-02-02&sr=c&sig=1Gty0Qihv%2BXKzj%2F7zzbI3GPuQnwki%2BEK4PmmeIVVKho%3D&st=2024-12-22T06%3A26%3A13Z&se=2024-12-22T10%3A31%3A13Z&sp=r
Duration:
T00H49M02S
Embed URL:
https://stream.cadmore.media/player/99802897-ed72-4b7f-94f5-3dc29496dcd2
Content URL:
https://cadmoreoriginalmedia.blob.core.windows.net/99802897-ed72-4b7f-94f5-3dc29496dcd2/Wikidata and Knowledge Graphs in Practice -NISO Plus.mp4?sv=2019-02-02&sr=c&sig=4MDex5iJV51RcH3YKSLZYxh%2Fs80HGGt1QU%2FH95JsNkY%3D&st=2024-12-22T06%3A26%3A14Z&se=2024-12-22T08%3A31%3A14Z&sp=r
Upload Date:
2022-08-26T00:00:00.0000000
Transcript:
Language: EN.
Segment:0 .
[MUSIC PLAYING]
NATASHA SIMONS: A warm welcome to NISO Plus 2022. I hope you're enjoying the conference so far. My name is Natasha Simons and I'm the Associate Director for Data and Services at the Australian Research Data Commons and I'm based in Brisbane, Australia. I'm also a member of the NISO Plus Advisory Committee and I will be your moderator for today's session. Thank you for joining us. This session has the title Wikidata and knowledge graphs in practice, using semantic SEO to create discoverable, accessible, machine readable definitions of the people, places, and services in global information community institutions and organizations.
NATASHA SIMONS: Well, that's rather a long title. What does it all mean? Well, I'm sure you're all familiar with Wikidata and knowledge graphs, but just to recap for those who aren't. Wikidata is a free, collaborative, multilingual secondary database collecting structured data to provide support for Wikipedia, Wikimedia Commons, and other wikis of the Wikimedia movement and to anyone in the world.
NATASHA SIMONS: And a knowledge graph puts this kind of data in context via linking and semantic metadata. And this provides a framework for data integration, unification, analytics, and sharing. In this session, our speakers will be talking about Wikidata and knowledge graphs in practice from the perspective of libraries. I'm sure you are aware that libraries provide content and education that expands the access and visibility of data and research.
NATASHA SIMONS: But did you know the dynamic nature of libraries? So the people who make the library happen, the services that they provide, and the resources that they procure are not very well understood by search engines and indexing software agents, and this can lead to misinformation and missing information. Speakers in this session will demonstrate how Wikidata can be used as a tool to push out data beyond organizational silos, the technical details of knowledge graph markup, and semantic search engine optimization.
NATASHA SIMONS: They will work through questions about how metadata can represent an institution or an organization equitably, and explain how this work improves the accessibility and reach of global information communities. We have four speakers today. Our speakers are Doralyn Rossmann who's professor at Montana State University, where she researches and teaches about social media practices and optimization, scholarly communication, search engine optimization, communication, and public budgeting.
NATASHA SIMONS: Jason A. Clark will also be speaking today. He's a librarian also at Montana State University library. He's a professor, a hacker developer, a metadata geek, and author working under the Big Sky and focused on semantic web development, digital library development, metadata and data modeling, web services and APIs, search engine optimization, and interface design. Helen Williams is Metadata Manager at the London School of Economics library, where her team have responsibility for print, electronic, and institutional repository metadata, with a strategic focus on exploring and developing new ways in which metadata can support research, learning, and teaching.
NATASHA SIMONS: And our final speaker for today is Neil Stewart who is Digital Library Manager at Digital Scholarship and Innovation Group. And Neil's team works with digital collections to make them openly available on the web and to experiment with providing them in new formats. Neil is interested in collection, digitization, digital libraries, and web technologies for libraries. So please join me in welcoming our speakers after which we will have time for discussion and questions, and please join the conversation about this session in NISO discourse.
NATASHA SIMONS: The link to the discourse session is on the session page.
DORALYN ROSSMANN: Hello. I'm Doralyn Rossmann. I work at Montana State University. And my colleague Jason Clark from MSU is also here along with Helen K. Williams from the London School of Economics and her research partner is Neil Stewart. We are going to be talking to you today about our experiences with Wikidata and knowledge graphs in practice.
DORALYN ROSSMANN: So for our talk today we are going to cover five main areas. Initially we'll talk to you about our research motivation for these works that we're doing and the perspective of outside-in and inside-out solutions for libraries. Then Jason will be talking to you about knowledge graphs and how those can be applied to libraries. Then we will go over the case studies for Montana State University and the London School of Economics.
DORALYN ROSSMANN: And then we will explore research implications of our findings and how this might apply to your local situation. So research motivations. As my colleague Jason observed, if you look in Wikipedia for the entry of library, it says a library is a collection of materials, books, or media that are easily accessible for use. This is a very building-centric perspective of libraries and I think we all know that libraries are much broader than just a building.
DORALYN ROSSMANN: They are people, services, expertise, online collections. And so our motivation for our work was to try to update how computers and search engines understand the concept of a library in today's world. For the folks at the London School of Economics they took a closer look at Wikidata. So their motivation for doing some work here surrounded this idea that Wikidata is being used as a means of documenting and surfacing researchers publications and research data in a number of ways.
DORALYN ROSSMANN: It provides an opportunity for sharing faculty scholarship on an open and accessible platform. So by potentially updating Wikidata entries with information about library provided resources, there's an opportunity to reach beyond the library into the open web for computers to know that the library has additional resources. So this work we did complements each other because we are looking at sources like Wikipedia and Wikidata to help machines understand libraries and what they provide better.
DORALYN ROSSMANN: So now Jason is going to talk to you about the concept of outside-in and inside-out.
JASON A. CLARK: Thanks, Doralyn. And in this moment I'm building again off of some of the research motivation. You'll hear Helen talk about inside-out as a concept. And so it's a conceptual frame, we just wanted to put this-- give the audience an understanding part of what we're talking about. And this is a concept brought forward by Lorcan Dempsey, 2010. I think originally he was talking about the way that we think of our actual resources that the library holds or procures.
JASON A. CLARK: And he was making a distinction between the stuff, the things that are in the library and that could include group rooms and the way we manage our local data like our catalog or OPAC or our finding systems, discovery systems and the things that we do have that kind of travel outside the library, the resources that travel outside the library. Next slide. Mark Dahl took that idea of, what would it look like if you conceived of an inside-out library?
JASON A. CLARK: And a couple of things he highlighted was this idea that your, the expertise of the library moves into different kinds of environments that might impact newer kinds of learning areas or even broader impacts in the community. And so that concept of inside-out is where both of the case studies really come into focus. The local work we did in our case study to find ways to redefine what a library is for machine sources, that's taking that inside-out knowledge and turning it into something-- trying to impact a different kind of understanding or definition of libraries.
JASON A. CLARK: In the same way, Helen and Neil's Wikidata work is a way of using an existing knowledge graph database and pushing all content in our materials into it so that it can be used by a very broad global audience. This is really what I was talking about there. And then the two things that I really want that group to take away in this case, the metadata expertise that we have, that's something that Helen's moving into a broad global learning environment.
JASON A. CLARK: The same as Doralyn was crafting how we might think about these pages, how we define these pages as both of us were working through that. This is an example of both of us, both institutions looking at inside-out models. So as I noted, this is the moment where libraries are trying to, at least in both of these cases, contributing to broader research, teaching, and learning at a global scale.
JASON A. CLARK: The other concept I really want because it's kind of at the center of this, I want to try to unpack a bit more about knowledge graphs themselves. So next slide. This is the definition. It's not the-- it's, it's a heady concept, so it's not-- like none of the definitions tend to really capture it.
JASON A. CLARK: The main takeaway I want people to think through is this idea of having a thing that might have an action that does other things. This sort of idea of how you organize the actual data is about noun, subject-- noun, verb, subject, which is a different kind of model than you usually use in a relational database. Next slide.
JASON A. CLARK: So the idea is this, you have a thing, what they call an entity which could be in our case, it will be our use case or case study will be the library itself as an entity. And then we're going to map a number of things that the library does, that it represents in a graph data model. So this connected, you'll see a center like the library itself and then how do we describe it in a graph model.
JASON A. CLARK: And so if you go to the next slide you can kind of see that. We did this work on a research center on campus where we connected a person, a researcher, to a number of researchers based on the types of publications they are putting forward and the topics of their research. There are various other ways to relate them, but this kind of gets at the graph model: you have a person or a thing and they're connected to others by those strings, those gray lines between them.
JASON A. CLARK: Those would be verbs of like Darla is connected to in this case a number of different researchers like researcher number 1, I don't have a name there right now, but let's pretend that's Phil Stewart who is the head of that research center. So the graph model lays out information this way, whether it's Wikidata, which is where Helen and Neil are or the local work that we did to redefine and do new markup for our library website and pages.
JASON A. CLARK: It's a similar kind of model that we're going to follow. This gives you a generic sense of things. So that main entity, that blue. I don't know if people can see color in this instance, but the center is the main entity. So that's something that you would define, in our case you'll see it's a library. And then you could say that it has a verb between as it connects to the other corners or the nodes in that graph.
JASON A. CLARK: So the library offers a service or the library offers-- library has expertise in the form of this person. That's the kind of model that we used to redefine the library website and then again starting to seed machine definitions of library around using this model. So if you go to the next slide, you see a bigger picture. The purple, again in the center, I don't know if people can see color.
JASON A. CLARK: The library in the center is purple. The main point to think through here is like you have this main-- this main idea, main entity called a library. And then what we're doing is connecting it to these other subnodes in the graph. And each of those nodes says a particular thing. So like in the top left corner, there's library people. Library is connected to library people.
JASON A. CLARK: In the top right corner it has, the library has library resources. Those could be our books, our catalog, our procured databases. And then the other ones-- the other components of that are things like services and space, but the idea is you have a central entity that is linked in a set of nodes and that's the graph data model that sits at the center of Wikidata and our project.
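As a rough illustration of the graph model described above, the central entity and its connected nodes can be written as (subject, verb, object) triples. This is only a sketch: the node names and verbs below are illustrative assumptions loosely based on the slide description, not the actual data model used at Montana State University.

```python
from collections import defaultdict

# Each edge is a (subject, verb, object) triple.
# All node names and verbs are illustrative assumptions.
TRIPLES = [
    ("Library", "has", "Library People"),
    ("Library", "has", "Library Resources"),
    ("Library", "offers", "Library Services"),
    ("Library", "has", "Library Spaces"),
    ("Library People", "provide", "Library Services"),
]


def build_graph(triples):
    """Index triples by subject so each node lists its outgoing edges."""
    graph = defaultdict(list)
    for subject, verb, obj in triples:
        graph[subject].append((verb, obj))
    return dict(graph)


def describe(graph, entity):
    """Render the edges leaving one entity as short sentences."""
    return [f"{entity} {verb} {obj}" for verb, obj in graph.get(entity, [])]


if __name__ == "__main__":
    for sentence in describe(build_graph(TRIPLES), "Library"):
        print(sentence)
```

Storing the verb on each edge is what separates this from a plain list of related pages: the same two nodes can be connected by different relationships.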
JASON A. CLARK: So the first case study is our case study. And I'll talk at the beginning here, but Doralyn is going to talk through a number of the details. Next slide. Just in framing this activity, we had done a number of work-- some research and some work around search engine optimization for our people and our resources. And resources there are like our procured databases, purchased databases.
JASON A. CLARK: But we recognized that we could do-- there was even more work because as we started to define those entities, like who are the people in the library and what are the resources that the library has, we noted that at a higher level there was kind of a misunderstanding of what a library is. So we turn toward this practice of semantic search engine optimization of, and Doralyn will walk through these details of how do you map-- if you looked at your website and decided on pages as nodes that you could create new definitions of library within.
JASON A. CLARK: There's a form of optimization where you're adding that coding to those web pages, and we thought we would see some results. It's preliminary, but I think we are starting to. Next slide. So the idea here was to think of, if you remember back to where I was talking about the knowledge graph idea, if you think of the library website as a graph, this is where the activity starts to overlap with the concepts of a knowledge graph.
JASON A. CLARK: So the landing page of the website, the home page of the library, is the start. And then you can start to say, OK, these particular web pages are connected in this way. And when you land on those web pages, you have a set of definitions that you can pitch to machine agents to help them understand what they're looking at and how to index that information.
DORALYN ROSSMANN: Thanks, Jason. So when we were trying to decide how we would approach doing our markup we took a look at our main page for our library website. So this is a screenshot of that page. And this was a good guiding post for us to think about, what do we consider to be our main categories, or the way that we cluster our descriptions of ourselves? And so if you look across the top you'll see there are links to find, request, services, spaces, and people along with ask the library.
DORALYN ROSSMANN: So using this as a guiding post we thought let's talk about these things to search engines so that they have a better sense of us beyond a building. So from there we decided a knowledge graph would be based on that index page, the about page, which is linked further down the website, and the find, people, request, resources, services, and spaces pages. And right here you can really get a sense that this has some elements of a building with spaces, but you also have people, you have the services and resources we provide, and that's a lot more of a robust picture than just we provide databases and we have a building.
DORALYN ROSSMANN: And so we added some markup to each of these pages with more information. So we, as I said started defining people, places, and things, and the things that we do around those three things. We looked at schema.org and identified some JSON-LD markup that we wanted to add. So we looked at different options there for how we could describe ourselves and made some decisions about that markup, which I'll show you in the next slide.
DORALYN ROSSMANN: We also added markup for social media optimization, which is giving social media networks more information about how to interpret your data. So that's another computer network. Once we were done adding both the JSON-LD and the social media optimization markup, we submitted these pages to Bing Webmaster Tools and Google Search Console for re-indexing.
DORALYN ROSSMANN: And we could see that we had one instance where it said you need to add some more information, and so we did, and then everything else went through fine and got re-indexed. It takes a little while for everything to actually make it in there, but you can go in there and get updates, and you also get notifications for when things have been indexed and, again, if there are any suggestions from those tools for improving the tagging you've added.
DORALYN ROSSMANN: So this is a screenshot of part of the JSON-LD markup. This is not all of it, but this is just to give you a sense of where we went. So if you look at about the fifth line down, you can see it says main entity. So we're explaining that we're a library, that we have things that we offer, and so here's a catalog of our offerings, and this is the markup for our people page, or our staff directory.
DORALYN ROSSMANN: So you see the type is we're offering something and what are we offering? We're offering a service. And what does that service look like? It looks like things like teaching, metadata, archiving, publishing. And then we have links to Wikidata entries. So we're saying when we say teaching this is what it means in Wikidata.
DORALYN ROSSMANN: This is what it means when we say data management. So we looked up what we thought were terms and descriptions, and connected those with terms that we use on our website. And then we have a similar markup on our pages that include some Wikipedia entries. The Wikipedia entries are again, similar definition, when we say this, this is what we mean. So that means sometimes we have to be creative in figuring out what's the most appropriate definition and that's where our information expertise comes in.
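A minimal sketch of how markup along these lines could be generated, assuming the schema.org types `WebPage`, `Library`, `OfferCatalog`, `Offer`, and `Service` and the properties `mainEntity`, `hasOfferCatalog`, `itemListElement`, `itemOffered`, and `sameAs`. The slide's actual markup is not reproduced in the transcript, so the structure, the helper names, the example.org URL, and the Wikidata placeholder identifier are all assumptions for illustration.

```python
import json


def service_offer(name, same_as_url):
    """One catalog entry: an offered service tied to an external definition."""
    return {
        "@type": "Offer",
        "itemOffered": {
            "@type": "Service",
            "name": name,
            # sameAs points at the entry that defines what we mean by the term.
            "sameAs": same_as_url,
        },
    }


def library_jsonld(library_name, page_url, services):
    """Describe a page whose main entity is a library with a service catalog."""
    return {
        "@context": "https://schema.org",
        "@type": "WebPage",
        "url": page_url,
        "mainEntity": {
            "@type": "Library",
            "name": library_name,
            "hasOfferCatalog": {
                "@type": "OfferCatalog",
                "name": "Library services",
                "itemListElement": [service_offer(n, u) for n, u in services],
            },
        },
    }


def to_script_tag(document):
    """Wrap the JSON-LD in the script element that goes in the page head."""
    body = json.dumps(document, indent=2)
    return f'<script type="application/ld+json">\n{body}\n</script>'


if __name__ == "__main__":
    # Placeholder values: substitute the real page URL and Wikidata entity.
    doc = library_jsonld(
        "Example University Library",
        "https://example.org/library/people",
        [("Teaching", "https://www.wikidata.org/entity/Q-PLACEHOLDER")],
    )
    print(to_script_tag(doc))
```

Choosing which external entry each term should point at is the editorial judgment the speaker describes; the code only carries that decision into the page.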
DORALYN ROSSMANN: I mentioned the social media optimization. You'll see here, these are two tweets. The one on the left is a tweet with a link to our digital collections that does not have social media markup. You'll see it's a pretty boring looking tweet if you are visually oriented. It's just what we typed in, a link, and a hashtag. And then on the right, we have added a lot more information that gives the tag of MSU library.
DORALYN ROSSMANN: The top provides the actual image that we were linking to. It provides a brief description and it's a lot more visually engaging. And another bonus of this is there are analytics on Twitter that we can use to see how much of our traffic was driven to us by social media. And we can actually see, what this is called a Twitter card, we can actually see how much traffic was driven specifically from this particular tweet because of the card that's attached to it.
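The card described above is driven by meta tags in the head of the linked page. The following is a sketch of how those tags could be generated, using the `twitter:*` and Open Graph `og:*` tag names; the function name, the example values, and the choice of the `summary_large_image` card type are assumptions, since the speakers' actual tags are not reproduced in the transcript.

```python
from html import escape


def social_meta_tags(title, description, image_url, page_url, twitter_handle):
    """Return Twitter card and Open Graph meta tags for one page."""
    twitter = {
        "twitter:card": "summary_large_image",
        "twitter:site": twitter_handle,
        "twitter:title": title,
        "twitter:description": description,
        "twitter:image": image_url,
    }
    open_graph = {
        "og:type": "website",
        "og:url": page_url,
        "og:title": title,
        "og:description": description,
        "og:image": image_url,
    }
    # Twitter tags use the name attribute; Open Graph tags use property.
    tags = [
        f'<meta name="{key}" content="{escape(value)}">'
        for key, value in twitter.items()
    ]
    tags += [
        f'<meta property="{key}" content="{escape(value)}">'
        for key, value in open_graph.items()
    ]
    return tags


if __name__ == "__main__":
    # All values below are placeholders.
    for tag in social_meta_tags(
        "Historic photograph",
        "An item from the digital collections.",
        "https://example.org/images/item.jpg",
        "https://example.org/collections/item",
        "@examplelibrary",
    ):
        print(tag)
```

Escaping the attribute values matters because titles and descriptions from a collection often contain quotation marks or ampersands.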
DORALYN ROSSMANN: So here's what the tagging for that would look like. It's just more tagging you add. You can add both Twitter and OG tags; the OG refers to Open Graph, which is Facebook's Open Graph. And again, you're giving it more information. If you don't, sometimes a social media network will pick up that there's an image there, and they might display it. But by putting this tagging on there, you're telling the social media network specifically what you want to be displayed. So now Jason is going to talk to you about search engine results pages.
JASON A. CLARK: Thanks, Doralyn. We're in the early phases with a lot of the analytics and benchmarking, but in keeping with the spirit of the conference, which again is, I think, hoping to be more generative, what I want you to do as I talk through these, because we're early, is just think about how you would monitor this kind of service if you were going to look at doing this work.
JASON A. CLARK: Here's how you would start to do that analysis. We're about 30 days in. I think right at New Year's, I think December 31, is when we put a pin in and started this conversion and added the new information to the web index that we wanted to. I'll show kind of where that goes, but I just want the group to kind of think through this.
JASON A. CLARK: If you want to think about how to do this in practice, this is kind of where I'm going to go with the next couple of slides. So the first thing that we do is we have a series of scrapers and crawlers that just look at the raw results. There are tools that I'll show you, but we also wanted a way to just scrape results and create our own CSV to watch what was happening: when we ran a certain query, what kinds of results were we seeing?
JASON A. CLARK: And so there are links on this slide to sample output of data, of what the first two pages of results look like. So we can kind of monitor what it looked like as we started this work, and whether we're seeing some new results or new ways that our library resources, people, places, and things are appearing in search results.
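One way to keep that kind of record is to write each check of a query to a CSV file, one row per result with its rank. The sketch below covers only the logging step and assumes the result URLs have already been collected by some other means; the column names, the function names, and the sample values are hypothetical and are not the speakers' actual scraper output.

```python
import csv
import io
from datetime import date

# Hypothetical column layout for the monitoring file.
FIELDS = ["checked_on", "engine", "query", "rank", "url"]


def results_to_rows(engine, query, urls, checked_on):
    """Turn an ordered list of result URLs into one row per ranked result."""
    return [
        {
            "checked_on": checked_on.isoformat(),
            "engine": engine,
            "query": query,
            "rank": rank,
            "url": url,
        }
        for rank, url in enumerate(urls, start=1)
    ]


def write_csv(rows, handle):
    """Write the rows, with a header line, to any open text handle."""
    writer = csv.DictWriter(handle, fieldnames=FIELDS)
    writer.writeheader()
    writer.writerows(rows)


if __name__ == "__main__":
    # Placeholder results; a real run would pass in the scraped URLs.
    rows = results_to_rows(
        "bing",
        "library services",
        ["https://example.org/services", "https://example.org/about"],
        date(2022, 1, 31),
    )
    buffer = io.StringIO()
    write_csv(rows, buffer)
    print(buffer.getvalue())
```

Recording the date and engine on every row makes it possible to compare the same query across checks and see whether new pages start to appear.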
JASON A. CLARK: Next slide. So the tools that we use: Doralyn had mentioned earlier Google Search Console, which is a way to kind of understand what your index is in Google. Bing Webmaster Tools is the similar tool for the Bing index. And we really prioritize those two search engines because they're the biggest; they represent the market share between the two of them. And the things we are looking for, again trying to be generative and not just informative, are the impressions, which is how many times does our stuff appear in a search result, and the coverage, how wide is our index.
JASON A. CLARK: We know that the library has x amount of pages, so we would expect to see those numbers increase and/or hold. And then the actual activity: how are people moving from search result pages to our resources? That's the click-through rate, which is next. So this is just a snapshot to give you a sense. And again, we're in the normal cycle of this sort of benchmarking and analytics, where we go with like 90 days, and we're about 30 days in. So I just took a snapshot today to give a sense of what this console looks like, and I'm noting that this is early. But we're starting to see CTR is changing a bit, which is a little curious. Impressions are basically holding, clicks are holding. Position is not as important to us, but it's interesting to see that there seems to be more activity on our pages at the moment.
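For reference, click-through rate is clicks divided by impressions. A small sketch of comparing two snapshots follows; the function names and the figures in the usage example are made up for illustration and are not the numbers shown on the slide.

```python
def click_through_rate(clicks, impressions):
    """Share of impressions that led to a click (0.0 when nothing was shown)."""
    if impressions == 0:
        return 0.0
    return clicks / impressions


def ctr_change(before, after):
    """Difference in CTR between two (clicks, impressions) snapshots."""
    return click_through_rate(*after) - click_through_rate(*before)


if __name__ == "__main__":
    # Invented figures: impressions hold steady while clicks rise slightly.
    baseline = (500, 10_000)
    current = (560, 10_000)
    print(f"baseline CTR: {click_through_rate(*baseline):.1%}")
    print(f"current CTR:  {click_through_rate(*current):.1%}")
    print(f"change:       {ctr_change(baseline, current):+.1%}")
```

Because CTR is a ratio, it can move even when impressions and clicks each look roughly flat, which is why it is worth tracking alongside the raw counts.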
JASON A. CLARK: Next slide. The other component of this analytics and benchmarking is the tool Google Analytics. In this case, you would want to look for what your audience is doing: in particular, as your users move into your environment, where are they going? Are you seeing an uptick in the number of users? And then traffic, which in Google Analytics terms is acquisitions. What we're looking for there is source and medium, and as Doralyn noted, a source could be social media.
JASON A. CLARK: And so that would be something we'd watch for. The other big one in acquisitions, or traffic, would be organic search. Next slide. So this is a quick overview; this is our users. This is an audience overview in Analytics, and it does seem to be growing in this first month, which is interesting. Again, I'm not ready to say we're there, but I think it's instructive for everybody to see how you'd start to monitor this if you wanted to do this activity.
JASON A. CLARK: Next slide. This is the traffic source, the acquisitions in the Analytics language. What you can see there is a slight uptick in Google organic search. So, initially promising parts for this case study. More to come: Doralyn and I will probably be publishing on this and continuing work, because we really want to see where these definitions go with this particular case study and if we can have a broader impact. We suspect we will. A little bit more to come.
JASON A. CLARK: So now I'm going to actually move to give Helen some space to talk about her particular case study. And this is really exciting to see her actually use the public knowledge graph of Wikidata to connect and build out discovery for library content. So with that, I'll hand it over to Helen.
HELEN WILLIAMS: Hello everyone. I'm Helen Williams and I'm the Metadata Manager at the London School of Economics. It will be 2:00 AM in the UK when you're watching this, so I'm delighted to be with you today and also really grateful that the session could be pre-recorded. I've really enjoyed working with Jason and Doralyn in recent weeks.
HELEN WILLIAMS: So my thanks to them for inviting us to share this case study with you, and thanks to them too for fielding questions due to the time difference. But if you do have anything you'd like to follow up on, then please get in touch with me by email or Twitter. To give you some context for our work: my metadata team is part of the library's Digital Scholarship and Innovation Group, and as part of the digital shift I've sought to broaden the focus of our work from the management of scholarly content to the exploration and development of new ways in which our metadata can support teaching, learning, and research. Like MSU, we've had an inside-out approach.
HELEN WILLIAMS: But our particular focus has been on scholarly content and the way in which metadata can extend its access and visibility. I wanted to investigate pushing our data beyond organizational silos to increase its discoverability, and I began to reflect on the potential of Wikidata as a tool to take that collaborative creation and management of metadata beyond the library to a global landscape. We probably all have different levels of familiarity with Wikidata.
HELEN WILLIAMS: So I thought I'd just start by summarizing briefly to make sure that we're all on the same page. Wikidata is a structured database operating as the central data store for all Wikimedia projects. It's a free and open knowledge base, multilingual, and it can be read and edited by humans and machines. Google Knowledge Graph, digital assistants, and Wikipedia infoboxes are all populated in part with information that's harvested from Wikidata, so its content has a real impact on discovery. I'm particularly interested in its power to mint unique identifiers for content and the entities within it, pulling it into the Linked Open Data ecosystem and creating links and relationships between entities that create bridges between currently siloed domains and then impact search engine results. If we can make our content more widely accessible and enable new connections and discoveries, then that has huge potential benefits to our organizations and, beyond that, to global research. Our journey at LSE very much began from scratch. Where some institutions are able to call on the expertise of a Wikimedian in residence, that wasn't our situation.
HELEN WILLIAMS: And so I just began by reading articles and watching presentations. And there were several barriers to overcome, including the technical skills that would be required for bulk uploading content and getting to grips with Wikipedia's policies and procedures, and then also learning how to articulate the value of Wikidata to justify the staff time and resources that would need to be spent overcoming the first two barriers. In practical terms, I started with Dan Scott's blog post about creating and editing libraries in Wikidata, and I edited the existing item for our library. Then I created some new items for component parts of our library, like the digital library, and looked at how to link everything together using reciprocal "part of" and "has part" relationships.
HELEN WILLIAMS: So that I could give a representation of the library as a whole on Wikidata. I also mapped some of our different content types to Wikidata, created some data models for the metadata, and then manually created some Wikidata items for each type. So that gave me ideas for potential avenues of work where our focus could be organizational or community, research and theses, or open access, digital or archival, and the possibilities for extending our web presence far outweighed the staff time and resources there. In discussion with colleagues, we decided to have an initial focus on content in LSE Theses Online, which I'll refer to from now on as LSETO. This is our online archive of PhD theses. It's a partial collection, as not all our historic theses have been digitized, and in total it has just over 4,000 items in it.
HELEN WILLIAMS: And it's already indexed by Google, but I highlighted the value of bringing it into the Linked Open Data ecosystem so that we could contextualize it by linking to related data and content that's not created or managed by LSE, to enable those potential new connections and discoveries. I also wanted to use the data as a test case for other library content, and it was a boundaried data set of about the right size for an experimental project.
HELEN WILLIAMS: And it would also have real-world benefits to the institution by offering value to early career researchers and alumni whose work would be promoted by what we were doing. So we modeled our data and used OpenRefine for name reconciliation with Wikidata to start creating relationships between entities by making connections with data outside LSE. I then experimented with QuickStatements and OpenRefine for bulk uploading content to Wikidata, before doing some more reconciliation work, this time to match the theses with identifiers in external data sets, making further connections between our content and the Linked Open Data cloud.
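To give a sense of what a bulk upload can look like, the sketch below builds QuickStatements commands in the tab-separated V1 format for one new thesis item. The properties used (P31 "instance of", P1476 "title", P50 "author") are Wikidata properties, but which statements LSE actually created is not stated in the talk, so the selection here is an assumption, and the item identifiers passed in are placeholders rather than real Wikidata items.

```python
def quickstatements_for_thesis(title, thesis_class_qid, author_qid, language="en"):
    """Build QuickStatements V1 commands (tab-separated) for one new item."""
    # Double quotes delimit strings in this format, so swap any in the title.
    safe_title = title.replace('"', "'")
    lines = [
        "CREATE",
        f'LAST\tL{language}\t"{safe_title}"',            # item label
        f"LAST\tP31\t{thesis_class_qid}",                # instance of
        f'LAST\tP1476\t{language}:"{safe_title}"',       # title (monolingual text)
        f"LAST\tP50\t{author_qid}",                      # author
    ]
    return "\n".join(lines)


def quickstatements_batch(records, thesis_class_qid):
    """Join the commands for many (title, author_qid) records into one batch."""
    return "\n".join(
        quickstatements_for_thesis(title, thesis_class_qid, author_qid)
        for title, author_qid in records
    )


if __name__ == "__main__":
    # Placeholder identifiers: replace with reconciled Wikidata items.
    print(
        quickstatements_batch(
            [("An example thesis title", "Q_AUTHOR_PLACEHOLDER")],
            "Q_THESIS_CLASS_PLACEHOLDER",
        )
    )
```

The author identifier is where reconciliation comes in: the name has to be matched to an existing Wikidata item, or a new item created, before the statement can link to it.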
HELEN WILLIAMS: And finally there's a stage within Wikidata where we add statements to create links between authors and supervisors and to show their relationships to LSE. I've trained a couple of my team on this whole process, and they're now coming within sight of adding the final current content. Both I and the team have found it very rewarding to learn these new skills. But we obviously don't want to be doing the work just for the sake of it.
HELEN WILLIAMS: So I wanted to see whether bringing the content into the Linked Open Data ecosystem was extending its reach and engagement. So I did an initial interim analysis when we'd added just over 1,000 theses, so about a quarter of the content. On the screen, you can see a SPARQL query which ranks institutions according to the number of theses that they have in Wikidata.
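The query on the slide is not reproduced in the transcript. As a rough sketch of what such a ranking involves, the block below holds one possible SPARQL query as a string and a local function that performs the same count-and-rank step on sample data. The identifiers in the query (P31 "instance of", Q187685 "doctoral thesis", P4101 "dissertation submitted to") are assumptions about how the query might be written and should be checked against Wikidata before use; the sample pairs are invented.

```python
from collections import Counter

# One possible form of the ranking query (identifiers are assumptions).
RANKING_QUERY = """
SELECT ?institution (COUNT(?thesis) AS ?theses) WHERE {
  ?thesis wdt:P31 wd:Q187685 ;
          wdt:P4101 ?institution .
}
GROUP BY ?institution
ORDER BY DESC(?theses)
"""


def count_theses(pairs):
    """Count theses per institution from (thesis, institution) pairs."""
    return Counter(institution for _thesis, institution in pairs)


def rank_range(counts, institution):
    """Return (best, worst) position for an institution, allowing for ties."""
    if institution not in counts:
        raise KeyError(institution)
    own = counts[institution]
    higher = sum(1 for c in counts.values() if c > own)
    tied = sum(1 for c in counts.values() if c == own)
    return higher + 1, higher + tied


if __name__ == "__main__":
    # Invented sample data: three institutions tie on a single thesis.
    sample = [
        ("t1", "A"), ("t2", "A"), ("t3", "A"),
        ("t4", "B"), ("t5", "C"), ("t6", "D"),
    ]
    print(rank_range(count_theses(sample), "B"))
```

Returning a range rather than a single position reflects the situation where many institutions hold exactly one thesis and share the same block of places in the list.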
HELEN WILLIAMS: And we started somewhere between 287th and 467th place, with all the other institutions who have just one single thesis in Wikidata. But by the time I did our interim analysis we were ninth in this list, and as we've added more content we're currently fourth in the list. Then I looked at downloads from LSETO over a four-month period after I'd started adding the theses to Wikidata. Total downloads for that time were 14% higher than the same time period in 2020, when for comparison the same time period over the previous three years had seen an increase in downloads of 6.9% in 2019 and decreases of 5% and 12% in the previous two years.
HELEN WILLIAMS: So we were seeing a notable difference in the downloads, but of course that was downloads across the whole of LSETO, and I've not yet added all the theses to Wikidata. So I focused in on analyzing 80 titles, looking at their downloads in the six months before and after addition to Wikidata, and found that on average downloads in the six months after addition to Wikidata were 47% higher than in the preceding six months, which is a really encouraging uplift. I also had a look at Google Analytics for LSETO. I wouldn't expect to see referrals to LSETO directly from Wikidata, because putting the data into Wikidata is about other sources using that data to drive traffic.
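The before-and-after comparison described here comes down to a percentage change per title, averaged across titles. The sketch below shows that arithmetic under stated assumptions: the function names and the download counts are made up for illustration, and the talk does not say whether the 47% figure is a mean of per-title changes or a change in pooled totals.

```python
# Sketch: percentage uplift in downloads after a title is added to Wikidata.

def percent_change(before, after):
    """Percentage change from `before` to `after` (positive means growth)."""
    if before == 0:
        raise ValueError("percentage change is undefined when 'before' is 0")
    return (after - before) * 100 / before


def mean_percent_change(pairs):
    """Average the per-title percentage changes for (before, after) pairs."""
    changes = [percent_change(before, after) for before, after in pairs]
    return sum(changes) / len(changes)


if __name__ == "__main__":
    # Hypothetical download counts for three titles: (six months before, after).
    sample = [(100, 147), (40, 60), (10, 15)]
    print(round(mean_percent_change(sample), 1))
```

Averaging per-title changes gives every thesis equal weight, whereas comparing pooled totals lets heavily downloaded titles dominate, so the two approaches can give different headline figures from the same data.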
HELEN WILLIAMS: But as part of the project, where a thesis author has a Wikipedia page, we add their thesis title with a citation to LSETO, so I was expecting to see an increase in referrals from Wikipedia. You'll need a bit of context for our figures. Our primary referral source is Google Scholar, and during the period I was looking at, that accounted for about 40% of the referrals to LSETO. The second referral source was Twitter, at about 10%. Then there are another 10 sources referring between 1% and 6% of traffic each, and after that a long tail of about 300 sources each referring a fraction of a percent of traffic. In the six months before the Wikidata work began, LSETO received an average of 3.82% of its traffic from Wikipedia, but in the six months afterwards that increased to 9.31%, with the most recent week before my analysis being 13.61% of traffic. This moved Wikipedia from the fifth referral source in the six months before the Wikidata work began to the third referral source in the six months afterwards, still following Google Scholar and Twitter. I had a closer look at Twitter as well: between February and May 2020 there were 38 mentions of LSETO on Twitter, and that increased to 74 during the same time period for 2021. So these are all really encouraging as interim figures, and we're planning to look at all of this again when the whole of the data set is loaded and enough time has elapsed for us to get some meaningful data. I've also been keen to see if we can visualize the data in new ways as a result of putting things into Wikidata, and the graph on the screen shows the relationships between authors and supervisors within our data.
HELEN WILLIAMS: So, building on the success of the work, I've proposed options for expanding our Wikidata work to extend the reach of content and data unique to LSE. I've already modeled the data to get content from our university press onto Wikidata and begun steps to automate that, and we're looking at how that's visualized in Scholia. The graph on the screen shows author collaborations from one of our open access journals. We're also considering further options, including a special collections focus, where search engine discoverability for underexposed or underrepresented content could be enhanced by inclusion in Wikidata; a digital library focus, where we could increase the discoverability of our digitized content via Wikidata or investigate creating a collections map of LSE Digital Library content on Wikidata; or a researcher focus, where we could utilize the potential of Wikidata as an identifier hub to support the management of names related to LSE and enhance that data for use by search engines. By contributing to Wikidata as a global and collaborative metadata source, we've been able to extend the reach and engagement of a specific set of content. The work's been shared with our PhD Academy, furthering the role of the library in research dissemination and demonstrating the value of metadata in expanding the access and visibility of libraries and their resources, to ensure that they're understood in the semantic web environment. We're excited about monitoring the impact of our work going forward.
HELEN WILLIAMS: And using it as the basis to establish further work developing the role of metadata in supporting research, teaching, and learning, and in supporting inclusive and equitable access to that research and data, improving the accessibility and reach of global information communities. Thank you very much.
DORALYN ROSSMANN: OK, so we're back to talk about the research findings and implications. So Helen just talked to you a little bit about the work at the London School of Economics, and we want to talk about MSU's work and the relationship of both to the research findings and implications.
DORALYN ROSSMANN: So as Helen covered, some of the initial findings they observed with their Wikidata work included seeing increased downloads of the LSE theses and also increased referrals to their website from Wikipedia. They've seen increased reach and engagement through their analytics, and they feel that this work that they're doing expands access and visibility, so that's the inside-out concept.
DORALYN ROSSMANN: They're going beyond just their immediate library community to the broader research community by populating that data in Wikidata. And then there are new opportunities to visualize the data that they have. So there are different metrics now that they have, and ways that they can measure use beyond the tools that they had been using. At MSU we are looking, as Jason mentioned, at the Google Search Console for impressions and coverage, so we're seeing how much these things are getting in front of people's faces in search results. According to Google Search Console, we're seeing greater coverage as we resubmit our pages for indexing: the search engines are picking those up and getting the data that we're providing from JSON-LD and the search and social media optimization. We're also seeing more organic search results through Google Analytics.
DORALYN ROSSMANN: So people are getting to us not because they were specifically looking for us necessarily, but because they are searching and coming across us in search results. So we're seeing more traffic coming to us through those discovering us in the moment, at the point of need, rather than necessarily having to track us down. And then again we're looking at traffic sources in Google Analytics, and then also looking at things like Twitter analytics to see how much traffic is coming to us through social media networks and our Twitter and Facebook markup.
DORALYN ROSSMANN: So that's the social media optimization piece. So what does this mean for the future? Well, as Jason noted, for MSU we are at the beginning of this particular round of efforts. So we're using this concept of knowledge graphs and applying that to search experiences. Likewise, LSE is doing the same thing, where they are providing a more complete picture to search engines about the data that's available. There is additional metadata and content being added to Wikidata.
DORALYN ROSSMANN: So again, going back to that inside-out concept: going beyond and adding expertise from the library to these semantic web spaces, for greater understanding by computers that ultimately leads to better discovery by human beings. And we're also trying to increase the understanding of libraries for the semantic web, so we're trying to provide good terminology from JSON-LD in our markup that talks about the people, places, and things, and the services around that, to enhance the definitions of a library.
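As a rough illustration of the kind of markup being described, the sketch below builds a schema.org `Library` description as JSON-LD and wraps it in the script tag that search engines read. It is not MSU's actual markup: the function names, the example URL, the services listed, and the Wikidata identifier are hypothetical placeholders.

```python
# Sketch: describe a library and its services as schema.org JSON-LD.
import json


def library_jsonld(name, url, wikidata_uri, services):
    """Return a schema.org Library description as a Python dict."""
    return {
        "@context": "https://schema.org",
        "@type": "Library",
        "name": name,
        "url": url,
        # sameAs ties the page to the matching knowledge graph entity.
        "sameAs": [wikidata_uri],
        # Each service is expressed as an Offer whose item is a Service.
        "makesOffer": [
            {
                "@type": "Offer",
                "itemOffered": {"@type": "Service", "name": service},
            }
            for service in services
        ],
    }


def to_script_tag(data):
    """Serialize the description as an embeddable JSON-LD script element."""
    body = json.dumps(data, indent=2)
    return f'<script type="application/ld+json">\n{body}\n</script>'


if __name__ == "__main__":
    markup = library_jsonld(
        name="Example University Library",                    # placeholder
        url="https://library.example.edu/",                   # placeholder
        wikidata_uri="https://www.wikidata.org/entity/Q00000000",  # placeholder
        services=["Research consultations", "Interlibrary loan"],
    )
    print(to_script_tag(markup))
```

The `sameAs` link is the piece that connects the two halves of the session: the page-level markup points at the Wikidata entity, so a search engine can merge what the institution says about itself with what the knowledge graph already holds.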
DORALYN ROSSMANN: So I think that this has really great implications for all of our own little pockets out there. So if there are things that we're providing, this suggests that for museums, libraries, places that provide things, we're not just waiting for people to figure this out on their own; we're giving computers an opportunity to better understand what we provide. So in closing, I just wanted to, first of all, thank Helen and Neil. They are at the London School of Economics, and we are trying to monitor each other's work.
DORALYN ROSSMANN: So we can learn from each other about best practices we might want to apply. I think a lot of you out there are probably working for publishers as well, and there are a lot of implications for you to mark up your own data and potentially help with the definition of libraries and the information they're providing, through the resources that you all make. And I think there are really exciting opportunities to update the concepts of libraries as access providers and not just space providers, and the expertise that we provide to our users both locally and more broadly.
DORALYN ROSSMANN: So hopefully in the conversation we have after this presentation we can explore more of how you could apply these concepts to your own spaces, and other ideas you might have for how this could be applied more broadly. We also have talked to you about data models and sources that have an impact for libraries, so hopefully creating new understandings for humans and machines. Again, we're providing that expertise, the inside-out expertise, and we're enhancing discovery and usage. So those are all very much in keeping with library values, and we're just trying to take that a step further with this markup work. We've mentioned a lot of resources.
DORALYN ROSSMANN: And so in our slide deck we have links to various things that we've mentioned, so hopefully these will be helpful to you after the session. And then there are some really great examples that we provided here of things having to do with knowledge graphs and structured data, and how to connect researchers with each other through these different semantic networks. And Jason mentioned a few different resources.
DORALYN ROSSMANN: And so there are the links to those, and now we look forward to having a discussion with you. Just as a heads-up, it is currently 2:00 AM in London, and so we are going to field any questions you might have for Helen and Neil. Beyond that, we'll provide you with an opportunity to connect with them directly, and we'll be glad to do that if we have questions that are beyond the scope of what we're able to field today. So thank you very much for listening, and let's have a great conversation.
DORALYN ROSSMANN: Thanks to all our speakers today. Please join us now in Zoom to continue the conversation.