Name:
Non Print Formats - future opportunities-NISO Plus
Description:
Non Print Formats - future opportunities-NISO Plus
Thumbnail URL:
https://cadmoremediastorage.blob.core.windows.net/e22a6e18-3794-4bb3-865d-f1aa05a9d79b/thumbnails/e22a6e18-3794-4bb3-865d-f1aa05a9d79b.png?sv=2019-02-02&sr=c&sig=JlyY27ijtCc16gaf2GJXVjn7hF0xm2wJjiAygzpoa3I%3D&st=2025-01-11T00%3A54%3A49Z&se=2025-01-11T04%3A59%3A49Z&sp=r
Duration:
T00H55M42S
Embed URL:
https://stream.cadmore.media/player/e22a6e18-3794-4bb3-865d-f1aa05a9d79b
Content URL:
https://cadmoreoriginalmedia.blob.core.windows.net/e22a6e18-3794-4bb3-865d-f1aa05a9d79b/Non-print formats future opportunities-NISO Plus.mp4?sv=2019-02-02&sr=c&sig=jaq4edPsZ9jQMPHh2mzBzWS4CNJPwtAfw%2BHIMEfVerY%3D&st=2025-01-11T00%3A54%3A51Z&se=2025-01-11T02%3A59%3A51Z&sp=r
Upload Date:
2022-08-26T00:00:00.0000000
Transcript:
Language: EN.
Segment:0 .
JASON GRIFFEY: Hi, everyone. Welcome to NISO Plus 2022. I'm Jason Griffey, Director of Strategic Initiatives at NISO and Chair of the NISO Plus Conference. I would like to welcome you, specifically, to the session on non-print formats, future opportunities. This session has a series of four wonderful speakers, who will walk you through the opportunities in front of us with non-print format media objects in the scholarly communication space.
JASON GRIFFEY: And with no further ado, I will let them speak for themselves. And look forward to the conversation at the end, which will be moderated by Lola Estelle. Thank you. We'll see you there.
VIOLAINE IGLESIAS: Hello. It's [INAUDIBLE]. Thank you very much for the invitation to take part in this panel on non-print formats for NISO Plus. My name is Violaine Iglesias. I am the CEO and co-founder of Cadmore Media, which I'm going to talk about for just a minute. So today, I'm going to speak about video. That's really our sole focus.
VIOLAINE IGLESIAS: So first of all, I wanted to just say a few words about our company. We are a video hosting and streaming organization, a vendor that serves only scholarly and professional organizations. So really our background is scholarly publishing. We were founded in 2018, really, with the goal of making video more of a scholarly object, to help societies and publishers treat video the same way they would normally treat journal articles.
VIOLAINE IGLESIAS: So how do we do that? We provide technology and services. One part of our technology is player embeds, which are integrated with third-party platforms. That could be a journal platform. It could be something as simple as a WordPress site or a SharePoint site or an LMS, such as Moodle, for example.
VIOLAINE IGLESIAS: So in that case, we are just a video host for a third party platform. We can also host entire video libraries. Those tend to be-- just think of it as a journal publishing platform, but for video. So these are video centric sites that are typically used either for education or for research. But really, they are a collection of videos that we host on websites that are optimized for viewing.
VIOLAINE IGLESIAS: If you're a society, for example, if you want to have your own Netflix, that's the type of thing that we can build for you. And then finally, we have an event platform that came out of the COVID world: with the new world of virtual events and our relationships with existing societies, we created our own event platform, which is being used for NISO.
VIOLAINE IGLESIAS: We're not just technology, though. The thing I guess I'm the proudest about in our company are the people. We were just founded in 2018. And there are 16 of us right now across the UK and the US. And we really all are scholarly publishing experts and video experts. So we actually provide a high level of service, more than I probably anticipated, to help societies and publishers do more with video.
VIOLAINE IGLESIAS: A lot of that is the expertise that we can bring with video metadata and metadata workflows. So enough about us. I wanted to talk about video a little bit. The premise of this panel is really to talk about non-print formats. So our non-print format, video, is being used in scholarly communication in various ways, which I think we can divide roughly into three buckets.
VIOLAINE IGLESIAS: So there's the education bucket, which is typically video collections that are used either for higher education or for professional development. Here, we've got the example of PsycTherapy, which is a video collection product that is produced by the American Psychological Association and sold as a video subscription product to libraries. You also have a lot of content, of course, that's being put out by medical societies for continuing medical education.
VIOLAINE IGLESIAS: There's really been a lot of educational products that have come out over the years. Probably, starting with the Alexander Street Press, who I think could be called the pioneers of academic video for education. Then you've got research. So research, it's really interesting for us to work on research products because we have to borrow infrastructure from the journal world.
VIOLAINE IGLESIAS: So you can either have video journals, where the videos are the articles. That's going to be the case of JoVE, of course, which was the founder, again, of video journals. But you also have other video journals in renal surgery, for example. So in those spaces, where it really makes sense to have a lot of visual content, you have a lot of video journals coming out.
VIOLAINE IGLESIAS: And you can also have video as ancillary material, where the video comes as supplementary material for an article. In both cases, you typically borrow the journal infrastructure to be able to publish those journals. And then finally, you've got virtual events, which, of course, have taken off since 2020. Really, the move to virtual events has completely changed the relationship that event organizers, especially, have with video.
VIOLAINE IGLESIAS: NISO is a perfect example. NISO Plus was held in person once, in 2020. And now we've moved into this virtual environment, where video is recorded, but the audience really has been widened. Since the first edition, it's way more global, international, et cetera. That's a good example of a virtual event done well. And of course, video is at the center of virtual events.
VIOLAINE IGLESIAS: These were the traditional use cases that we've been working on over the years. Since 2020, we're finding a few things. What we're finding is that there is more of everything. There is more expertise going around. That is true of staff, the staff of publishers and of societies, especially. They're just more familiar with video because they've been doing more of it, but it's also true of researchers.
VIOLAINE IGLESIAS: So on the author side, what you end up having is just a lot more authors who are able to produce content. They're able to record themselves. They are able to participate in recordings. They're getting familiar with it. So there are more opportunities to create content than there used to be, especially for self-created content rather than produced content, which, historically, has been a lot more expensive.
VIOLAINE IGLESIAS: There's more focus. We're a small vendor. When we started on this journey, we started knocking on doors and telling people, hey, you should be thinking about video, doing more video, et cetera. And we had a bit of trouble catching people's attention. And now we're having no trouble at all catching anybody's attention.
VIOLAINE IGLESIAS: It's really great. There's just more focus on video because it has become more important in this world that is more virtual than it has been. So really in the end, there's just more content. Coming out of virtual events, there is tons and tons and tons of video content that is either sitting on hard drives or being reused into products or published on platforms.
VIOLAINE IGLESIAS: There's really a lot more content than there used to be, which is a great opportunity to finally get video going in scholarly communication. There is also a lot more experimentation than there used to be. The big experiment of the day is hybrid events. Hybrid events are really interesting because we went from in-person events, to virtual events in 2020, to now a hybrid model, and hybrid comes in many forms.
VIOLAINE IGLESIAS: There's lots and lots of different shades of hybrid events. Video is at the center of almost all of them, but in various ways. There are pre-recordings or live events. There's just a lot going on where societies are having to experiment with vendors, with new technologies, and new ways of working with their authors and their members.
VIOLAINE IGLESIAS: So there's a lot that's actually coming out of that. You don't just have the virtual and hybrid events, but also things like, OK, perhaps we're not going to have one big conference; we're going to have more continuous events. Or perhaps, you know this live component that we've been working on, these webinars? Now we're going to integrate that onto our publishing platform.
VIOLAINE IGLESIAS: We've got a great project going on with the American Medical Association, who's doing exactly that. Or, you discover, hey, I held this conference, this virtual event, and it went great. But then I realized that universities actually used the recordings of the conference for weeks after the event. I didn't know they were going to do that. So let me put together a product where they can officially go and that is going to have the trappings of an educational product.
VIOLAINE IGLESIAS: So really, what I just talked about, those three areas-- research, education, and events-- those lines are getting increasingly blurred, which is taking us into more interesting places. That is not to say that there aren't many challenges associated with publishing video in scholarly communication. One of them is that scholarly video is not Netflix.
VIOLAINE IGLESIAS: Scholarly video is not entertainment. It's not something you're going to sit back with popcorn and just watch from the beginning to the end. It really is, oftentimes, content that is less produced, that's more boring, if we're honest. But that doesn't mean it doesn't have value and that there isn't good information in it. So what you need to do is work on discoverability and findability, so that anybody looking for scholarly video can find the videos that they're interested in, but also the content that they're interested in within each video.
VIOLAINE IGLESIAS: Sustainability is something to work on because it's great to have all this video content, but now we need reproducible ways to create this content. Now that you have hybrid events, how do you get video for all the content? For example, if you've started creating a product out of all this video content coming out of conferences, how do you make sure that you still capture that content but also keep the costs down, because AV can be really expensive? So can you find alternative workflows?
VIOLAINE IGLESIAS: For example, you're going to ask your authors to pre-record content even if they're going to present in person. Lots of sustainability issues. Accessibility is, of course, a challenge across the board. For video, it's not that hard, as long as you've got captions and a good video player. You can probably get there, but it's still one more challenge to focus on.
VIOLAINE IGLESIAS: Finally, workflows. Workflows for video are really important. They can be a little bit daunting. So the story of how I created Cadmore was that I worked for a big publisher, who started creating big video collections. And to do that, they hired lots of staff. They invested a lot of money. They invested in systems.
VIOLAINE IGLESIAS: And it was successful. But it was a large investment that's just not an option for many small publishers and societies. So I decided to create a company that was going to help societies and publishers publish more content more easily. We're doing this by supporting workflows and metadata. However, we cannot do this alone.
VIOLAINE IGLESIAS: I have DataCite on there. I just added them to the PowerPoint. So I am co-chair of a NISO working group that either has published, by the time of NISO Plus, or is on the verge of publishing guidelines for using video and audio metadata in scholarly communications. Those guidelines are supposed to be a starting point to help other organizations do more with video.
VIOLAINE IGLESIAS: So we made a list of properties that really should be at the core of all the use cases. But we really hope that from here, we can work with a number of other organizations to make sure that these guidelines are actually applied across the entire ecosystem. It could be KBART, for example. We're talking to KBART at the moment. They were interested in using the NISO guidelines to update their own guidelines.
VIOLAINE IGLESIAS: Crossref and DataCite for DOI enrichment and metadata. FAIR, because video can be data in the geosciences, for example. ORCID and CRediT, very important for video to be recognized as something that counts towards an academic work portfolio. So in order to do this, you need to make sure that you work with the right organizations. And that's not even counting vendors like journal publishing platforms, submission systems, preservation systems, et cetera.
VIOLAINE IGLESIAS: All of these vendors, we are working with or are hoping to work with to really facilitate the publication of scholarly video. And that's it for me. Here's our contact information. And I guess I'm going to hand it over to Liz.
LIZ KRZNARICH: All right. So [INAUDIBLE] last slide was actually a great lead-in to this next presentation. I am from DataCite. My name is Liz Krznarich. And thanks to NISO for inviting me to speak on this panel. So stepping back from video specifically, I am going to talk a bit about surfacing non-traditional scholarly outputs and resources of all kinds through the power of persistent identifiers, including, of course, digital object identifiers, DOIs.
LIZ KRZNARICH: So a little bit about DataCite, where I'm based. We are a DOI registration agency. You're probably quite familiar with DOIs at this point. Most of the ones that you'll see associated with things like journal articles and books and conference publications tend to be Crossref DOIs-- DOIs that are intended mainly for formal, published works. What DataCite does is DOIs for scholarly outputs and resources that are everything else aside from formally published works.
LIZ KRZNARICH: I'll say that "data" in DataCite is all-inclusive. Lots of people, when they think about data, think about numbers streaming out of an instrument, like a particle accelerator. But in the DataCite context, we think of all sorts of scholarly resources as data. It might be spreadsheets with numbers in them, but it could be words, images, audio and video files, databases full of all kinds of various content, collections of objects like annotated or translated or analyzed text-- those are all data in the DataCite context.
LIZ KRZNARICH: So in this presentation when I say data, I mean all of those different kinds of scholarly outputs. And we're expanding even further into some of the inputs like identifying instruments and code and all sorts of other resources. Basically, we want to identify, persistently, everything that's involved in the scholarly research and communication process. To that end, we currently support 27 different resource types.
LIZ KRZNARICH: You can assign DOIs to all of these different kinds of things. And of course, we also have a catchall other resource type. So currently, you can assign a DOI to just about everything that you might have, which OK, great. But why would you register a DOI for all of that stuff? So DOIs offer some basic technical capabilities, just based on how the DOI system works.
LIZ KRZNARICH: Of course, it's handy to be able to find a resource at the same URL, even if its location on the web changes. That means we can go through system migrations and everything stays accessible at the original DOI URL. We can disambiguate or cite one particular version or variation of a resource, which is really handy for things that change over time, like code and certain kinds of data sets.
LIZ KRZNARICH: When we have a persistent URL, that means that we can cite that thing with confidence that anybody else who reads our publication or interacts with another resource that cites that DOI can find it, reliably, in the future. So yeah. Some very handy basic capabilities of DOIs. But there's much more that you can do with DOIs, and particularly, the underlying metadata.
LIZ KRZNARICH: So in the case of those non-traditional resources and outputs, one of the values that they provide is context for print and text publications-- journal articles and books. DOIs help to enable discovery of and access to those underlying data objects, whether they be numbers or images or other archival resources, whatever you may have. So being able to cite those things in a formal way supports attribution and credit for all of the huge amount of work that's involved in creating that underlying scholarly material.
LIZ KRZNARICH: Last but really, really not least, using DOIs and the metadata that underlies DOIs allows creating machine readable connections between resources. So between, for example, a publication and its underlying data or software, but also between resources and people through ORCID IDs or resources and organizations like funders through Crossref Funder IDs or research institutions through ROR IDs and other types of organization IDs.
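To make that concrete, here is a minimal sketch of the kind of relationship metadata involved. The field names follow the general shape of the DataCite Metadata Schema, but every identifier and value below is a hypothetical example, not from the talk.

```python
# A minimal sketch of DOI metadata that links a dataset to a publication,
# a creator (ORCID iD), a funder (Crossref Funder ID), and an institution
# (ROR ID). Field names follow the general shape of the DataCite Metadata
# Schema; all identifiers and values below are hypothetical examples.
dataset_metadata = {
    "doi": "10.1234/example-dataset",              # hypothetical DOI
    "types": {"resourceTypeGeneral": "Audiovisual"},
    "titles": [{"title": "Field photographs and sketches, Antarctic stations"}],
    "creators": [
        {
            "name": "Researcher, Example",
            "nameIdentifiers": [
                {"nameIdentifier": "https://orcid.org/0000-0002-1825-0097",
                 "nameIdentifierScheme": "ORCID"}
            ],
            "affiliation": [
                {"name": "Example University",
                 "affiliationIdentifier": "https://ror.org/00example0",
                 "affiliationIdentifierScheme": "ROR"}
            ],
        }
    ],
    "fundingReferences": [
        {"funderName": "Example Research Council",
         "funderIdentifier": "https://doi.org/10.13039/501100000000",
         "funderIdentifierType": "Crossref Funder ID"}
    ],
    "relatedIdentifiers": [
        {"relatedIdentifier": "10.1234/example-book",   # the publication it supports
         "relatedIdentifierType": "DOI",
         "relationType": "IsSupplementTo"}
    ],
}
```

Because these relationships are structured rather than free text, a machine can follow them in either direction.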
LIZ KRZNARICH: What these connections allow is building automated workflows, building automated discovery and access, and tools that allow automated tracking and attribution. So hopefully, this all sounds pretty neat. But let's see how it actually works out in action in the real world. So I have, of course, randomly selected a book that has an ISBN.
LIZ KRZNARICH: It is a book about built cultural heritage in Antarctica. It's basically about the history of Swedish polar expeditions and the history of some research stations in Antarctica. And a team of researchers not only went through archival resources to write this book, but also made a trip to some of these historic research sites, collected photographs, and made sketches.
LIZ KRZNARICH: So they amassed a huge amount of underlying data that resulted in the publication of this book. There are a few bits and pieces of that entire data set contained in the book. But the entire collections also live in a data repository. They have DOIs, so I, as a reader of this book, can not only see a couple of photographs and sketches within the book, but I can then go to the related data collection and view everything that was part of what the researchers created-- photos, sketches, scans.
LIZ KRZNARICH: And that gives me a really rich context for this book. I can also, potentially, reuse these things in my own work and I could cite them based on the DOIs. So the repository tells me exactly how to cite that. And I can also see all the researchers that carried out this work and the organization that funded it and all of their affiliations in case I want to get in touch with some of these people or understand, more thoroughly, what the context of this publication was.
LIZ KRZNARICH: So this is obviously great for human users to get a much more full picture of the context of a publication. But it's also great for machines as well. And that's where the magic of DOI and other persistent identifier metadata comes in. So from the screenshots that we looked at, we could see connections to organizations and people, visually.
LIZ KRZNARICH: But those connections also live under the hood in the DOI metadata for these data objects, which means that we can start to create sort of a graph of the entire research ecosystem and how different objects are connected to each other, as well as how they're connected to people and organizations out in the world. So with that machine readable kind of graph of the whole scholarly communications ecosystem, we can build tools on top of it.
LIZ KRZNARICH: So here's an example of something that DataCite has been working on for a few years called DataCite Commons. It is built on top of another machine-readable API called the PID Graph, which attempts to pull together all of these resources and the connections between them using persistent identifiers-- DOIs, ORCID iDs, ROR IDs, Crossref Funder IDs-- so that you can do things like look up an organization and see all of the resources, both data and publications, associated with that organization, or with a particular person or a funder or even another resource.
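As a rough illustration of what that kind of lookup can look like programmatically, here is a small sketch against the public DataCite REST API. The endpoint and the general JSON:API response shape are real; the search string, the example ORCID iD, and the fields picked out are illustrative only.

```python
# A minimal sketch (not an official DataCite client): querying the public
# DataCite REST API for DOIs whose metadata mentions a given identifier.
# The endpoint and generic "query" parameter exist; the specific query and
# the fields printed below are illustrative.
import requests

def find_works_mentioning(identifier: str, rows: int = 5):
    resp = requests.get(
        "https://api.datacite.org/dois",
        params={"query": identifier, "page[size]": rows},
        timeout=30,
    )
    resp.raise_for_status()
    for record in resp.json().get("data", []):
        attrs = record.get("attributes", {})
        titles = attrs.get("titles") or [{}]
        print(record.get("id"), "-", titles[0].get("title", "(untitled)"))

# e.g. find_works_mentioning("0000-0002-1825-0097")  # hypothetical ORCID iD
```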
LIZ KRZNARICH: So you can look up a Crossref DOI maybe for a journal article and find all of the other stuff that was associated with that, including citations of data sets and other resources. So pretty cool what you can do with persistent identifiers and the metadata that underlies these identifiers. A fair question, though is, is all of this really possible right now?
LIZ KRZNARICH: The answer is kind of. The plumbing-- the technical infrastructure-- is all in place. When DOIs are registered and when they contain really detailed metadata about PID connections, yes, this does work. However, the book that I just showed is actually a fairly rare case at the moment, in which a publication is fully connected to all of its supporting resources and organizations and people through persistent identifiers.
LIZ KRZNARICH: So what we're working toward right now is much wider adoption and implementation of DOIs and other persistent identifiers throughout the whole scholarly workflow and communication process, as well as consistent and high quality DOI and other persistent identifier metadata. So to that end, DataCite just released its new 2022 to 2025 strategic plan. And you can see in the infographic, two of our pillars in that plan are connecting resources through metadata and ensuring that all scholarly resources are identified and connected.
LIZ KRZNARICH: Which means that we'll be working further, both internally and in collaboration with other PID organizations on building and improving our tools and services, as well as really focusing on adoption and implementation, guidance, and community engagement around best practices and using the tools and services over the next couple of years. We really want to make this process easy, so not just to make it possible as it is now, but easy to identify and connect all scholarly resources.
LIZ KRZNARICH: And of course, it's not just up to DataCite. It does take work from the whole scholarly community to enable these interconnected workflows that are identified and kind of fully PIDified. So I'll leave you with a couple of key points for those in the community-- these are the things that enable these interconnected workflows. Of course, it all starts with actually depositing your non-print and non-text resources into a repository that can store them and make them accessible for the long term.
LIZ KRZNARICH: And then the next step is to register DOIs for these resources. Of course, many repositories do this as part of their standard features. Where there's really kind of a gap right now is including detailed metadata with those deposits, and particularly the relationships to other objects and to people and places through ORCID IDs and Crossref Funder IDs and ROR IDs.
LIZ KRZNARICH: Finally, to bring the whole thing full circle, it's really important to cite DOIs for these supporting, non-traditional resources in publications, just as you would any other publication. So put them in reference lists and treat them as first-rate research objects, so that machines can easily track these objects. And of course, that results in attribution for those who created them.
LIZ KRZNARICH: So that's it for me. And I will hand it back to--
JOSH HADRO: Perfect. I assume that is all seeable and good. So thanks Jason. Thanks to the folks at NISO Plus for having me. It's really nice, yeah, to be with Liz and Violaine. I think this fits really nicely. So what I'm going to do is talk about IIIF and some of the connections to discovery. I'll start by saying that my name is Josh Hadro. I'm the Managing Director of the IIIF Consortium, which is a set of institutions that supports this open standard and set of specifications that we'll talk about.
JOSH HADRO: But overall, what I'm going to try to cover in just a little while is a little bit, at a high level, about what IIIF is, if you're not familiar with it. And then get into some of the discovery elements-- so, how people are working within the community to make collections and materials, all that, much more accessible. So let's start really at the very basics. What is IIIF?
JOSH HADRO: Well, what it is, is actually a stand-in for this phrase here-- the International Image Interoperability Framework. And because that's a lot of syllables, you'll hear myself and just about everyone else say IIIF as shorthand for that concept. But maybe more usefully, this is what it is: it is a model, or a set of specifications, for presenting and annotating digitized objects. And so it started with images, but that's actually now grown to include audio and video files as well.
JOSH HADRO: And because it's an open standard and it's implemented by so many institutions, there's all sorts of other benefits that are now accruing because of that standardization. And it's also really important when we talk about IIIF, we're always talking about the community of folks who make this thrive and exist and work. So that's the folks who are developing the software using the open APIs, but also the folks who are curating collections and developing exhibitions and presentations using IIIF tools and resources.
JOSH HADRO: And so just before I get too deep into this, I want to kind of throw the flag and just really make clear that IIIF is not a metadata standard. It's often in the same conversations, and it, very explicitly, was designed to work and dovetail with just about any kind of descriptive metadata standard. But in and of itself, it doesn't prescribe any real particular way of doing things. It's very flexible, and it really is just concerned with the presentation of materials and assets and annotations.
JOSH HADRO: So in that sense, it works together with basically any other kind of metadata system that you might be using. And this is really what people are talking about when we talk about IIIF. There are a number of specifications in the ecosystem, but the main ones are these-- the image API and the presentation API. And the image API, as the name suggests, is really basic.
JOSH HADRO: It's how you deliver pixels. But you do it in a very smart, regularized way, where, using a URL, you can deliver an entire image at enormous resolution or just a section of it or anywhere in between. And that works with the presentation API, which dovetails with the image API to present just enough structure and metadata to drive a really compelling viewing experience.
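To give a sense of how regularized those URLs are, here is a small sketch of the Image API path pattern ({identifier}/{region}/{size}/{rotation}/{quality}.{format}); the server and identifier used are hypothetical.

```python
# Sketch of building IIIF Image API request URLs. The path pattern
# {identifier}/{region}/{size}/{rotation}/{quality}.{format} is the one the
# spec defines; the base URL and identifier here are hypothetical.
def iiif_image_url(base, identifier, region="full", size="max",
                   rotation="0", quality="default", fmt="jpg"):
    return f"{base}/{identifier}/{region}/{size}/{rotation}/{quality}.{fmt}"

BASE = "https://images.example.org/iiif"   # hypothetical image server

# The whole image, scaled to fit within 800 x 600 pixels:
print(iiif_image_url(BASE, "sketchbook-page-1", size="!800,600"))

# Just a 500 x 400 pixel detail starting at (2000, 1500), at full resolution:
print(iiif_image_url(BASE, "sketchbook-page-1", region="2000,1500,500,400"))
```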
JOSH HADRO: I think the best thing to do is just give you an example. So this is a IIIF viewer, one of many. This one is called Mirador. And in that blue box there, in the center of the screen, is the image data. So that's a deep zoomable image. So that's the viewing port where the user interacts with it just like you would a Google map. But the other elements surrounding it, there on the left and down at the bottom, that's the presentation API.
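For readers who want to see what actually drives a viewer like this, here is a minimal sketch of a Presentation API 3.0 manifest with one canvas and one painted image. Real manifests carry more labels, metadata, and structure, and every URL below is hypothetical.

```python
# A minimal sketch of an IIIF Presentation API 3.0 manifest: one Canvas with a
# single "painting" annotation pointing at an Image API service. Every URL is
# hypothetical.
manifest = {
    "@context": "http://iiif.io/api/presentation/3/context.json",
    "id": "https://example.org/iiif/sketchbook/manifest.json",
    "type": "Manifest",
    "label": {"en": ["Example sketchbook"]},
    "items": [
        {
            "id": "https://example.org/iiif/sketchbook/canvas/1",
            "type": "Canvas",
            "width": 3000,
            "height": 4000,
            "items": [
                {
                    "id": "https://example.org/iiif/sketchbook/page/1",
                    "type": "AnnotationPage",
                    "items": [
                        {
                            "id": "https://example.org/iiif/sketchbook/anno/1",
                            "type": "Annotation",
                            "motivation": "painting",
                            "target": "https://example.org/iiif/sketchbook/canvas/1",
                            "body": {
                                "id": "https://images.example.org/iiif/sketchbook-page-1/full/max/0/default.jpg",
                                "type": "Image",
                                "format": "image/jpeg",
                                "service": [
                                    {"id": "https://images.example.org/iiif/sketchbook-page-1",
                                     "type": "ImageService3",
                                     "profile": "level1"}
                                ],
                            },
                        }
                    ],
                }
            ],
        }
    ],
}
```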
JOSH HADRO: So the order of the pages that you see in the sketchbook, that's all in there, in the structure. The label of the resource, the links to other resources, the manifest, things like rights information, that would all be in the presentation API. So I hope that gives you a bit of a sense. And then just to give you some really common use cases. These are all live examples. This is the most common way people use IIIF viewers.
JOSH HADRO: This is from Stanford. This is an enormous Japanese tax map. You can see the photographer there for scale. This is a map that's 12 feet by 18 feet. And so you don't have to give the user a multi gigabyte file. Instead, the IIIF APIs give you a way to get into that real level of zoom and depth without having to deliver all those gigabytes at the same time.
JOSH HADRO: But there's a lot more utility on top of that. So there's a lot more complexity and sets of interactions that you can enable. So one of the beautiful ones is this idea of reunification. So this is an illuminated manuscript from the 15th century. And it had the illuminations, literally, cut out of it. So by the 20th century, the illustrations were in one institution and the pages were at another.
JOSH HADRO: But in recent years, both institutions were serving the materials using IIIF. So these developers were able to relatively easily create this interface that reunited these images across institutional boundaries and created this viewing experience, where you could interact with both pieces at the same time at the original resolution, exactly as the pages were meant to be seen, but actually with even more utility because you can really get in there and get to that highest level of zoom.
JOSH HADRO: And because these are all generally open viewers based on open standards, there's a lot of portability involved. So this example comes from the Indigenous Digital Archive, which does a wonderful job of taking materials from the US National Archives around what were called the Indian boarding schools, presents them with good context and collection organization. But also, with each individual item, as you can see here, it presents a basic view with some basic metadata, but it also gives you the option, as down there at the bottom, to open it in other IIIF viewers, depending on the functionality the user might want to use.
JOSH HADRO: So for example, the second one mentioned is that Mirador viewer that you saw before, which is often used for side by side comparison. So you can open multiple resources, IIIF resources, in that viewer and compare them, say, side by side. This gives you just a sense of how and where IIIF is being used. It's a relatively new set of standards, initially developed in North America, UK, and Europe, but we're seeing growing adoption around the world.
JOSH HADRO: This just covers independent installations and implementations that we know about. It doesn't even cover, say, content management system vendors, who also use IIIF technologies in their products and interfaces. And this gives you a different view into who's making use of these things. So a huge number of state and national libraries, large-scale research institutions, but also, increasingly, we're seeing museum and gallery uptake.
JOSH HADRO: And particularly exciting is the work that aggregators are doing. So CONTENTdm from OCLC, Europeana. The Internet Archive is a massive collection of IIIF materials. Places like that are sort of leading the way in the development of new functionality. And these are the contexts. So it started, as I said, in the libraries and museums, but we're also seeing really cool use in teaching and learning.
JOSH HADRO: So even in secondary education, teachers creating exhibitions and talking to their students using IIIF exhibition tools. And some of the most exciting kind of new work is in STEM. So bioinformatics, bioimaging, using annotations to annotate slides of organisms and identifying cancer cells, things like that. Some really exciting use cases that are just kind of coming out in recent months and years.
JOSH HADRO: So I think if you've worked in or near cultural heritage at all, you probably have an intuitive sense of why we digitize things. But why do we need IIIF? Excuse my cat. So digitized materials are just such useful carriers of information. They're such convenient ways of documenting our own cultures and learning about the cultures of others.
JOSH HADRO: And so that's sort of second nature at this point. But we've created a problem for ourselves as we've done this over the last generation and a half or so of digitizing. We've created all these separate silos and instances of digitized collections in basically the same technology stacks, but with individual interfaces that users have to relearn every time they come to them. And so IIIF is meant to solve that problem and a number of others.
JOSH HADRO: So even in the context of just one institution, IIIF reduces costs and simplifies a lot of things because you can use off the shelf image servers or image viewers. But in the context of multiple institutions, this is where the ecosystem really starts to thrive and come together. And you can see all of these different capabilities and capacities start to work across institutional boundaries.
JOSH HADRO: So you can reference images from another institution and bring those together with annotations that are from yet another data store, and examples like that. And so that's kind of a really high-level view into the set of technologies. And now, in the last few minutes here, I just want to talk about some of the work the community is doing. So we have community groups. But what I'll talk about is what we call a technical specification group, which is a set of folks who have literally been chartered to do work, to update the specifications and to bring in new functionality and ways of working with materials.
JOSH HADRO: And they work with the editorial group and the technical review committee to work through the process of making that all come together. There are a number of these. The one I'll talk about is our discovery technical specification group. And this group was chartered a number of years ago to look at exactly this question about how we have all this amazing capability, but IIIF materials aren't useful if they can't be found.
JOSH HADRO: So how do we encourage that kind of discovery and adoption? They basically broke it out into a couple of different areas of investigation-- crawling and harvesting, importing into viewers as well as porting between viewers, and the idea of notification. So when you update a resource, how and in what way should we be able to notify the viewers, the software applications, of that change?
JOSH HADRO: So the first piece that's actually complete now is this piece around crawling and harvesting. So last year, the TSG published what's called the Change Discovery API, which is a relatively straightforward specification based on W3C Activity Streams, which provides a regularized way of presenting, at a collection level or multiple collection levels, information about the creation or updating or removal of IIIF items from that collection.
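As a rough sketch of what consuming such a stream can look like, a harvester pages through an ActivityStreams OrderedCollection and reacts to each activity. The collection URL below is hypothetical, and the field names (first, next, orderedItems, endTime) follow the ActivityStreams conventions the specification builds on; this is a simplified forward walk, not a complete implementation.

```python
# Sketch of harvesting a IIIF Change Discovery stream: walk the ActivityStreams
# pages and note which Manifests were created, updated, or deleted. The
# collection URL is hypothetical; field names follow ActivityStreams usage.
import requests

def harvest(collection_url):
    collection = requests.get(collection_url, timeout=30).json()
    page_url = collection["first"]["id"]          # first page of activities
    while page_url:
        page = requests.get(page_url, timeout=30).json()
        for activity in page.get("orderedItems", []):
            obj = activity.get("object", {})
            print(activity.get("endTime"), activity.get("type"), obj.get("id"))
        page_url = page.get("next", {}).get("id")  # stop when there is no next page

# e.g. harvest("https://example.org/iiif/activity/all-changes")
```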
JOSH HADRO: And so it's very useful to aggregators, so that they know, for example, how far back they may need to go to update their records, or that they only add the new records that appear in the stream, things like that. And so we've already seen some good uptake of this part of the specification. OCLC Research has done a proof of concept for its entire CONTENTdm customer base.
JOSH HADRO: We've seen it from Intranda's Goobi, based in Germany, as well as the Bodleian Library at Oxford. And we are looking, at the IIIF Consortium level, at hosting a centralized registry of these change discovery endpoints so that we might make those available to aggregators and be of help that way. And the other piece I want to mention is about to go to the 1.0 phase. This is called the Content State API.
JOSH HADRO: And so what this does is it allows for more nuance in terms of porting or changing viewing states from one viewer to another. So you can already link to IIIF assets, obviously, but this allows you to send a link that shows a particular zoom level, and then potentially even can also include things like annotations and other comments connected to that all through, basically, just a regular URL link.
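A minimal sketch of how such a link might be constructed follows, assuming the iiif-content query parameter and base64url encoding described in the Content State specification; all of the URLs are hypothetical.

```python
# Sketch of building a IIIF Content State link: encode a small annotation that
# targets a region of a canvas, then pass it to a viewer as a query parameter.
# Assumes the "iiif-content" parameter and base64url encoding described in the
# Content State specification; every URL here is hypothetical.
import base64
import json

content_state = {
    "@context": "http://iiif.io/api/presentation/3/context.json",
    "id": "https://example.org/iiif/content-state/1",
    "type": "Annotation",
    "motivation": ["contentState"],
    "target": {
        "id": "https://example.org/iiif/sketchbook/canvas/1#xywh=2000,1500,500,400",
        "type": "Canvas",
        "partOf": [{"id": "https://example.org/iiif/sketchbook/manifest.json",
                    "type": "Manifest"}],
    },
}

encoded = base64.urlsafe_b64encode(
    json.dumps(content_state).encode("utf-8")
).decode("ascii").rstrip("=")          # padding stripped for a cleaner URL

print(f"https://viewer.example.org/?iiif-content={encoded}")
```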
JOSH HADRO: So this specification provides the means by which to do that, to port from one IIIF viewer to another and provide that really specific state of interaction. And the very last thing I'll mention here, before handing it off, is one other group that we have. And this is a non-technical group. This is called our Discovery for Humans group. But this group kind of grew up around the work that the Discovery TSG was doing.
JOSH HADRO: And it really is taking a look at more of the user experience and the other interaction elements that are around this notion of discovery. Because as much as the technology needs to be there and in place, we also need a lot of research and a body of folks working on best practices in terms of presenting materials to viewers and how to implement those APIs and specifications that I mentioned from the other group.
JOSH HADRO: So I really encourage folks to get in touch if either of those groups is of interest. We'd love to have you join. D4H, in particular, is at that exciting phase. So I'd encourage you, if you're looking at this for your own institution, to join us in this community group, which meets once a month. And I'd be happy to put you in touch with the group chairs. But I will end there.
JOSH HADRO: Thanks.
CHEN: Thank you. Hi, everyone. It's my great pleasure to share this with you at the NISO conference. The topic I share today is Exploration and Thinking on the Intelligent Application of Cultural Heritage Images. First of all, allow me to introduce my school and myself. I'm Tao Chen, and I'm an associate professor at Sun Yat-Sen University. The School of Information Management at Sun Yat-Sen University, founded in 1980, stands at the leading edge of the information age and is full of vigor and vitality, with an excellent tradition and a profound cultural heritage.
CHEN: The school is also part of the iSchools movement and a voting member of IFLA and the ICA, and so has a strong international academic influence. We also welcome everyone to come here to communicate and learn. Well, today I will present in three parts. To begin with, I will briefly introduce what the MISS platform is.
CHEN: Then I will explain the core technologies, LIBRA, within the MISS platform. Finally, I will walk through the platform's functions with some specific examples to illustrate our design considerations and innovations. So firstly, what MISS means. The full name of MISS is Multi-dimensional Image Smart System. MISS is an image application platform that integrates functions of image uploading, management, publishing, organization, annotation, sharing, and review.
CHEN: At present, the platform website is mainly in the Chinese language, and we are considering providing multiple language options to meet the needs of international users in the next stage. On the MISS platform, the collections are classified, such as Chinese painting and Chinese calligraphy. For small images, users can upload them to the system themselves and create a manifest online.
CHEN: For larger images, we recommend offline conversion of the images for MISS. So where can resources for the MISS platform come from? Data in the MISS platform can be created by users themselves or imported from other IIIF resources. Here is the list of institutions [INAUDIBLE] and resources that can be imported and reused in the MISS platform.
CHEN: Now let's move on to the core technologies in the MISS platform, which we call LIBRA. LIBRA here is not the blockchain project or any single technology. Rather, LIBRA is an abbreviation for a series of technologies. Here, L stands for "linked data." Linked data is an implementation of the semantic web that has been widely used in digital humanities.
CHEN: And I stands for IIIF, the International Image Interoperability Framework. The entire MISS platform is implemented based on the IIIF framework, using the Image API, Presentation API, Search API, and Discovery API. B means "big data." For the data storage of the MISS platform, we use a triple store.
CHEN: R stands for RDF, the Resource Description Framework. RDF provides the underlying structure and standard for describing different digital resources. Finally, A means "artificial intelligence." AI technology can be said to have swept the world and all walks of life. In MISS, we adopted AI technology for OCR recognition of images.
CHEN: Next, let me explain our design intentions and ideas through the platform's functions, using some specific examples. There are five primary functions, as the slide shows. Currently, more and more cultural heritage institutions are presenting their collections using the IIIF framework. But when accessing these collections, we wondered if there could be a platform that would allow users to add the resources they are interested in and reorganize them as needed.
CHEN: Let's look at the Yongle Dadian as an example. According to statistics, the Yongle Dadian has about 770 million words, and it's known as the World Encyclopedia. However, due to various reasons, only a portion of the Yongle Dadian remains. Currently, only a little over 400 of the 11,095 volumes remain. And they are scattered in the hands of both public and private collectors in eight countries and regions.
CHEN: Many institutions around the world have presented the Yongle Dadian in their collections through the IIIF framework. The MISS platform now connects more than 100 volumes of Yongle Dadian from different institutions online, and provides review and browsing.
CHEN: The second design consideration and innovation point is about image reuse and curation. The purpose of the IIIF framework is to ensure the interoperability and accessibility of images globally-- not only to display collection resources, but also to enable interaction with and reuse of image resources. The figure shows how images are reused from five different manifests to compose a new research topic.
CHEN: The organization of images from different resources can be done with a few simple steps. The first step is to preview the manifests that are in the system. The second step is to select the pages you want to use.
CHEN: These two steps can be repeated so that the relevant images can be selected from different manifests. The third step is to create a new manifest. The third part of the design consideration is about image annotation. Image annotation is a direct reflection of the image content.
CHEN: We adopt the Web Annotation Data Model proposed by the W3C as the overall annotation model. Through image annotation, the story in the image can be better told, and it is easier for people to understand the meaning expressed by the image. Semantic annotation generally means using linking technology to associate image information with links to open [INAUDIBLE].
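As a minimal sketch of what such an annotation looks like, the structure below (body, target, FragmentSelector) follows the W3C Web Annotation Data Model; the canvas URLs and the DBpedia resource used as a semantic body are illustrative examples, not taken from the platform.

```python
# Sketch of a W3C Web Annotation on a region of a IIIF canvas. The structure
# follows the Web Annotation Data Model; the URLs and the DBpedia link used
# as a semantic body are illustrative.
annotation = {
    "@context": "http://www.w3.org/ns/anno.jsonld",
    "id": "https://example.org/annotations/1",
    "type": "Annotation",
    "motivation": "tagging",
    "body": [
        {"type": "TextualBody",
         "value": "Signature of the calligrapher",
         "format": "text/plain"},
        # A linked-data body pointing at an external resource (semantic annotation):
        {"id": "http://dbpedia.org/resource/Wang_Xizhi"},
    ],
    "target": {
        "source": "https://example.org/iiif/scroll/canvas/3",
        "selector": {"type": "FragmentSelector",
                     "conformsTo": "http://www.w3.org/TR/media-frags/",
                     "value": "xywh=1200,300,250,400"},
    },
}
```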
CHEN: With semantic annotation, you can view more information about the resources through the knowledge graph, or you can view it directly on the original page. It should be noted that each annotated object can have multiple semantic annotations. Dynamic OCR recognition is another important innovation of the MISS platform.
CHEN: I think we all know that cultural heritage images are usually large in size, and it is not practical to perform OCR on an entire image. So we wondered whether it is possible to allow users to select certain areas for OCR according to their needs. When we want to perform OCR recognition, we only need to select the area to be recognized.
CHEN: Image [INAUDIBLE] and obtain the OCR results through the API. When browsing the image after OCR recognition, users can determine through settings whether to display the OCR results superimposed on the image. The font size and offset of the display can also be set by parameters.
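The MISS platform's own OCR API is not shown in the talk, but as a generic sketch of the idea, one can fetch just the selected region through the IIIF Image API and pass that crop to an OCR engine. The image server URL below is hypothetical, and pytesseract/Tesseract stands in for whatever recognition service the platform actually calls.

```python
# Generic sketch of region-based OCR: fetch only the user-selected area via a
# IIIF Image API region request, then OCR that crop. This is not the MISS
# platform's actual API; the URL is hypothetical, and pytesseract stands in
# for whatever recognizer is really used.
import io
import requests
from PIL import Image
import pytesseract

def ocr_region(image_base, x, y, w, h, lang="chi_sim"):
    url = f"{image_base}/{x},{y},{w},{h}/max/0/default.jpg"   # Image API region request
    resp = requests.get(url, timeout=30)
    resp.raise_for_status()
    crop = Image.open(io.BytesIO(resp.content))
    return pytesseract.image_to_string(crop, lang=lang)

# e.g. ocr_region("https://images.example.org/iiif/yongle-dadian-vol-2272-p5",
#                 x=1800, y=950, w=600, h=420)
```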
CHEN: The last crucial function among our essential design considerations is cultural discovery. The resources of a single data set are limited, but linked resources are infinite. Linked data aims to build a global database, which gives us a great solution.
CHEN: When a collection has a specific author, we can query related resources in DBpedia in real time through SPARQL. At present, the MISS platform can be used in some Chinese universities and public libraries at [INAUDIBLE].
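A small sketch of what such a real-time lookup can look like against the public DBpedia SPARQL endpoint follows; the example resource (the calligrapher Wang Xizhi) and the property requested are illustrative, not the platform's actual query.

```python
# Sketch of a real-time author lookup against the public DBpedia SPARQL
# endpoint. The endpoint and result format are standard; the example resource
# and property requested are illustrative.
import requests

QUERY = """
PREFIX dbo: <http://dbpedia.org/ontology/>
SELECT ?abstract WHERE {
  <http://dbpedia.org/resource/Wang_Xizhi> dbo:abstract ?abstract .
  FILTER (lang(?abstract) = "en")
}
LIMIT 1
"""

resp = requests.get(
    "https://dbpedia.org/sparql",
    params={"query": QUERY, "format": "application/sparql-results+json"},
    timeout=30,
)
resp.raise_for_status()
for row in resp.json()["results"]["bindings"]:
    print(row["abstract"]["value"][:200], "...")
```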
CHEN: Next, we will improve in the following three aspects. First of all, we will conduct more AI exploration, such as automatic image classification, object detection, and automatic image description. What's more, when the team was formulating LIBRA, there was a controversy about whether B stands for blockchain or big data. In the end, we chose big data.
CHEN: But that doesn't mean we won't use blockchain in the future. Finally, the MISS platform currently only connects authors with the DBpedia data set, and in the future, we will use more technologies to discover similar connections across different institutions. Everyone is welcome to scan the QR code and fill in the form to try our MISS platform. We want to hear diverse voices about our system, and we hope for a more international audience.
CHEN: This is all my presentation today. Thank you all for listening. Thanks.
JASON GRIFFEY: Thanks so much to all the speakers. And now we will move over into conversation about their topics and anything else you'd like to talk about as far as non-print formats go. So we'll see you there. [MUSIC PLAYING]