Name:
Value proposition of information standards - especially around APAC countries Recording
Description:
Value proposition of information standards - especially around APAC countries Recording
Thumbnail URL:
https://cadmoremediastorage.blob.core.windows.net/f1bf66c2-b90c-4466-9400-9f973004bd7b/videoscrubberimages/Scrubber_3.jpg
Duration:
T00H39M34S
Embed URL:
https://stream.cadmore.media/player/f1bf66c2-b90c-4466-9400-9f973004bd7b
Content URL:
https://cadmoreoriginalmedia.blob.core.windows.net/f1bf66c2-b90c-4466-9400-9f973004bd7b/Value proposition of information standards-NISO Plus.mp4?sv=2019-02-02&sr=c&sig=V5%2BVZ6bboaU9Xt4X3Zd3doSn2x%2B6nzYcxynmSs7RckM%3D&st=2024-11-23T19%3A58%3A07Z&se=2024-11-23T22%3A03%3A07Z&sp=r
Upload Date:
2024-03-06T00:00:00.0000000
Transcript:
Language: EN.
Segment:0 .
Hello my name is Chris Chan.
I'm the University librarian at HKBU in Hong Kong. It's a pleasure to serve as the moderator for this session entitled value proposition of information standards, especially around APAC countries. I'm thrilled that we have two speakers from the region to speak on this topic. The first is Andrew Davies, Digital content specialist at Standards Australia. In this specialist role, his focus is on how standards can be better found by potential end users, published and disseminated in a world still dominated by PDFs.
He's going to share with us how Standards Australia have been using STS, the Standards Tag Suite, to make their work more efficient. So over to you, please, Andrew.
Hello, everybody. Just share my screen. Thank you to the organizers of this conference for allowing me to present to you some of Standards Australia's journey in using one particular information standard, the STS standard, and how we are reaping incredible benefits from using that standard.
Before I go further, I'd like to acknowledge the traditional owners of the country from which I'm presenting today, the Gadigal people of the Eora nation. And I recognize their continued connection to land, waters and culture. We at Standards Australia pay our respects to their elders past, present, and emerging.
In today's presentation, I'd like to cover three essential areas.
So, first, what is the key information standard that is helping us to manage our content captured in XML in the world of standards? Most of the talk will be on how we have reaped benefits from using this standard. And then there are some costs and benefits, but also challenges, when you're using content from other organizations and you have a flexible standard,
and there are things you need to consider in your business. So the standard that we are using is STS, the Standards Tag Suite. It was developed by NISO, and it is faithful to the promise of the published abstract on the screen there. Standards Australia is one of the national standards bodies, and we have reaped incredible benefits from initially adopting the precursor to this standard, which was developed by ISO, the International Organization for Standardization.
And it's grown up into what I call the second generation of standards markup, which allows us to capture not only the metadata about standards but also, as it says on the screen here, the normative and non-normative content of our publications, and all sorts of subtypes of publications beyond just standards. If you don't know what XML is:
XML is a markup language that is user defined. So you can tailor XML to the needs of your organization and the type of information that you publish. And that's really quite important to the world of standards, because we have some particularities in the way that we convey information. But standards XML has a whole heap in common with other professional and information publishing.
And so the STS standard that we work on has great antecedents in the JATS standard, which is designed for academic journal publishing. But XML is not an end user format. So whenever you're working with XML, you do need to transform it so that your customers, your readers, can actually read this stuff. And I'll just give you a brief example of what I mean by that.
So here is a fragment of one of our standards. It's just the cover page. And really what the XML can do is help you capture the metadata of your standard efficiently. So instead of information displayed in blobs on a page, you actually capture each of the metadata points with specific tagging. And of course you can use that sensibly in other databases and for other reasons.
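As a rough illustration of capturing each metadata point with its own tag, here is a minimal Python sketch. The element names are modelled loosely on STS-style markup and simplified, and the title, originator code and numbers are invented for illustration; this is not Standards Australia's actual markup or tooling.

```python
import xml.etree.ElementTree as ET

# Hypothetical cover-page metadata. Element names are simplified,
# STS-like assumptions; every value here is invented.
SAMPLE = """
<standard>
  <front>
    <std-meta>
      <title-wrap><main>Example plumbing standard</main></title-wrap>
      <std-ident>
        <originator>EX</originator>
        <doc-number>0000</doc-number>
        <year>2024</year>
      </std-ident>
      <content-language>en</content-language>
    </std-meta>
  </front>
</standard>
"""


def read_metadata(xml_text: str) -> dict:
    """Pull individually tagged metadata points into a flat record."""
    root = ET.fromstring(xml_text.strip())
    meta = root.find("./front/std-meta")
    originator = meta.findtext("std-ident/originator")
    number = meta.findtext("std-ident/doc-number")
    year = meta.findtext("std-ident/year")
    return {
        "title": meta.findtext("title-wrap/main"),
        "designation": f"{originator} {number}:{year}",
        "language": meta.findtext("content-language"),
    }


record = read_metadata(SAMPLE)
print(record)
```

Because each value sits in its own element, the same record could be loaded into a database or search index without anyone re-reading the cover page.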
But capturing metadata is fairly basic. It's something the databases do all the time. But in the world of XML, what is interesting for publishers is capturing the text in XML as well. And so here we have a fragment of XML from one of our standards, just a plain piece of text. It's got a list in it. It's got some cross-references.
And those are all specifically captured in the STS markup. And we can do smarter things with that markup, including cross-references and links and all sorts of other useful things in the world of information publishing. So what are the benefits for us so far? Well, first of all, from a single XML source document, we are able to create content for our end users in a reliable and repeatable way.
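To make the idea of transforming a single XML source into a reader-facing format concrete, here is a small Python sketch that turns a text fragment with a list and a cross-reference into HTML. The fragment, the tag mapping and the IDs are assumptions made for illustration; a production pipeline would more likely use XSLT or a dedicated publishing tool.

```python
import xml.etree.ElementTree as ET
from html import escape

# Hypothetical fragment: a clause with a cross-reference and a list.
FRAGMENT = """
<sec id="sec_4">
  <title>Installation</title>
  <p>Pipes shall be supported as described in <xref rid="sec_5">Clause 5</xref>.</p>
  <list>
    <list-item><p>Check the fittings.</p></list-item>
    <list-item><p>Test under pressure.</p></list-item>
  </list>
</sec>
"""

# Assumed mapping from source elements to HTML elements.
TAG_MAP = {"sec": "section", "title": "h2", "p": "p", "list": "ul", "list-item": "li"}


def to_html(node: ET.Element) -> str:
    """Recursively convert an element and its children to an HTML string."""
    if node.tag == "xref":
        # A cross-reference becomes a link to the target element's ID.
        open_tag = f'<a href="#{escape(node.get("rid", ""))}">'
        close_tag = "</a>"
    else:
        name = TAG_MAP.get(node.tag, "div")
        id_attr = f' id="{escape(node.get("id"))}"' if node.get("id") else ""
        open_tag, close_tag = f"<{name}{id_attr}>", f"</{name}>"
    parts = [open_tag, escape(node.text or "")]
    for child in node:
        parts.append(to_html(child))
        parts.append(escape(child.tail or ""))
    parts.append(close_tag)
    return "".join(parts)


html_output = to_html(ET.fromstring(FRAGMENT.strip()))
print(html_output)
```

The same source could be fed to a different function to produce another output, which is the point of keeping one structured master copy.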
We actually currently draft most of our documents in Word, and we'll talk about that in a moment. But we have an increasing number of outputs which we need to support, and we need tools to help us generate this content in a really effective way. On screen, you can see we still produce PDFs, and a large number of our customers take PDFs. We have our Standards Australia website, and that offers the content in HTML.
We have a range of other platforms that I'll show you in a moment that also have features and functions that are underpinned by having our data in XML. And we also work with third parties who are developing apps and other content using our content in HTML. We actually have one standard in the DAISY format. That was an early experiment in trying to make our content more accessible.
We're actually looking at using HTML as a better way of making our content accessible and making sure that our platforms are accessible and use the accessibility features that are more easily available in HTML and not limited to specific types of readers. We also take our XML and provide it back into Word, in today's world, for our drafters. Drafting is primarily done by our authoring community in Microsoft Word.
We don't use EPUB at the moment. That's just a question of the Australian market's appetite for how to receive information. ISO have developed scripts to convert their content into EPUB format, and we could use those quite readily. And we have an open future as to other formats that we can produce from the single source of XML, which is really the key. And that's the value we're driving.
We actually use a tool called eXtyles to transform our data from Word into XML. And because we're using a shared information standard, we've actually been able to benefit incredibly from the foundational work that ISO did in developing this tool. We've repurposed it, but really we've only made minor modifications to support additional variations, some additional richness in our own documents, and some quirks in our documents.
But we are building off a shared toolset, and we actually get to share usage tips, training materials and other benefits with other standards organizations who have used this tool as well. This is a first generation tool. More work is being done internationally with more advanced tools and online authoring tools, and that's in our future, all underpinned by the same XML: having a common standard that we're working to.
What we have here is one of our websites. Essential to the production of standards is actually making sure that standards are available for public commenting, so that we get as broad an input to the standard as possible before it's actually finalized. It is part of the world of standards to ensure that you've got a public commenting phase. With our XML, we transform it to HTML and publish to our public commenting platform.
This has been designed with a whole range of features that we couldn't do before. Before we used to receive comments from the public in all sorts of formats and some project manager would have to cut and paste, compile all their feedback from customers and nobody knew each other's comments. So now we have a way of presenting the content. You can present the content as a straight read of the document, but you can also present it clause by clause and get feedback on each clause.
That helps. When the feedback is captured against each clause, it's much easier to manage the assessing of that feedback for our technical committees, who are responsible for authoring the document. And really a fantastic advantage here is that users can see others' comments, they can endorse them, they can critique the comments, or they can offer their own alternatives as we go along.
It's a strange standard to actually pop up on the screen, but it's just a little indication that, while we're in a world of information standards today, in the world of standards we really, truly have standards on just about anything you can think of. You can see the title of the document there: one of our plumbing standards. We've also just rolled out a redlining function. Redlining is there to tell users what's changed.
We can turn it on and off. We can direct people's attention straight away to the clauses where there's a difference. Some standards are heavily updated between editions, and other standards really have a pretty light touch when they get updated and go out for public comment. So with this tool, we're able to very quickly, systematically and reliably produce the equivalent of a track changes version on screen.
Previous tools that we've explored really were patchy in the way that they produced a result. In fact, there's at least three comparison engines working under the covers here to actually help us solve some of the problems, such as detecting changes specifically in tables. And of course in technical standards we have an awful lot of tabulated material, and I remember in the early days of comparison with tables, you'd just get disastrous, inaccurate results.
So this is a really good benefit for us. Now, on our online store: once we've published a document, we also present our content in HTML. On the left-hand side there, you've got some metadata that you can see, and that reproduces information that is typically on the cover pages, inside front cover pages, or imprint pages in the world of books.
But on the right-hand side, not only can you read the text, all laid out in a very familiar manner, but now we're supporting user engagement with the content. Users can add bookmarks to the content, they can add annotations to the content, and that's stored for each user. And we're working towards storing that for each organization's users when we've got a subscription platform, which we're about to launch, so that those personalizations can also be shared within a user community.
The other thing with this sort of personalization is managing change. So when we amend a document, we will republish the same standard. And what happens to the user's annotations and bookmarks? Well, because we have the structured data underneath, we have IDs in that structured data underneath, we can actually ensure that users' bookmarks and annotations aren't lost when we update the text.
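The role of stable IDs in keeping annotations across a republished edition can be sketched as follows. This is an illustrative Python example under assumed names (`Annotation`, `carry_over`) and invented clause IDs and text; the talk does not describe how the actual platform implements this.

```python
from dataclasses import dataclass


@dataclass
class Annotation:
    user: str
    clause_id: str  # stable ID of the clause the note is attached to
    note: str


def carry_over(annotations, new_edition):
    """Split annotations into those whose clause ID survives and orphans."""
    kept, orphaned = [], []
    for annotation in annotations:
        if annotation.clause_id in new_edition:
            kept.append(annotation)
        else:
            orphaned.append(annotation)
    return kept, orphaned


# Hypothetical amended edition: the wording of 4.2 changed, 4.9 was removed.
new_edition = {
    "sec_4.2": "Pipes shall be supported every 1.5 m.",
    "sec_4.3": "Fittings shall be pressure tested.",
}
annotations = [
    Annotation("user-1", "sec_4.2", "Check spacing on site."),
    Annotation("user-1", "sec_4.9", "Ask the committee about this."),
]

kept, orphaned = carry_over(annotations, new_edition)
print(len(kept), "kept;", len(orphaned), "need review")
```

Because the note is attached to an ID rather than to a page position or a text string, it stays with clause 4.2 even though that clause's wording changed; only notes on removed clauses need human attention.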
So that saves them time and saves us the grief of having to explain where the user's information has gone. So it really does encourage people to use these online tools. It is harking back to, you know, scribbling on a piece of paper or annotating a book. But there are all sorts of interesting problems to solve to be able to do this online and do it well.
Here we have another feature of our online platform. One of the challenges in the world of standards is that there are thousands of standards available. And on any particular topic, you really need to know: is this standard right for me? Am I a designer? I might need it. Or is the standard designed for somebody maintaining a product or service? Is it for them?
So we actually offer a certain amount of information for free on our website, and that information is predictable because of the structure of the documents. And then the rest of the document remains behind a paywall. So that's another tool. Very briefly, we've got a whole range of other tools that we've built relying on our XML data.
One, which is not an uncommon thing, is having a data set which is a vocabulary tool. We have many Australian standards that are referenced in our national building code. We are able to pull all the terms and definitions of the standards referenced in the building code. The building code itself is published by the Australian Building Codes Board, and that's also in XML.
So we can pull their definitions. And we've released the construction dictionary for Australia online. We are now also working with third party innovators to provide chunks of our content, which can go into, for example, compliance tools. Firemate is the first example and the first organization we've started working with. We've started working with a range of others, and the door's opening because we can now provide information in a flexible manner in XML, which they can easily transform into HTML to embed into tools
and services. So you've got the right bits of information, contextualized for an activity such as compliance activities around fire safety. In the world of publishing standards, we actually generally produce a consolidated list of the references in the document. That's really straightforward.
You can even extract them from the documents and have that list compiled automatically. That's pretty straightforward. But what we've done recently is tilt that the other way: what documents refer to me? And that's something we've never been able to offer to our authors before. On the Power BI screen here, which is not the world's most sophisticated user interface, we can search for a particular document
and then see all the other documents. And this bottom part of the screen is the interesting bit. So ISO 9001, which we've adopted in Australia, has been referenced by all these other documents. We know the currency of those documents, and we also know the technical committee that owns those documents. And if we're updating our own publications or an international publication, we can get in touch with the committees and say this document's being updated.
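The "what documents refer to me?" lookup amounts to inverting the list of references each document carries. Below is a minimal Python sketch of that inversion. The document IDs, committee codes and status values are invented for illustration, and the real report is built in Power BI rather than with this code.

```python
from collections import defaultdict

# Hypothetical extract: each document lists the documents it references,
# plus its owning committee and currency status. All values are invented.
DOCUMENTS = {
    "DOC-A": {"refs": ["ISO 9001", "DOC-C"], "committee": "QR-001", "status": "current"},
    "DOC-B": {"refs": ["ISO 9001", "DOC-A"], "committee": "QR-002", "status": "current"},
    "DOC-C": {"refs": ["ISO 9001"], "committee": "QR-003", "status": "superseded"},
}


def cited_by(documents):
    """Invert 'document -> references' into 'reference -> citing documents'."""
    index = defaultdict(list)
    for doc_id, info in documents.items():
        for target in info["refs"]:
            index[target].append(doc_id)
    return {target: sorted(citing) for target, citing in index.items()}


def committees_to_notify(target, documents):
    """Committees owning current documents that reference the target."""
    citing = cited_by(documents).get(target, [])
    return sorted(
        {documents[d]["committee"] for d in citing if documents[d]["status"] == "current"}
    )


reverse_index = cited_by(DOCUMENTS)
print(reverse_index["ISO 9001"])
print(committees_to_notify("ISO 9001", DOCUMENTS))
```

The forward list falls out of each document directly; the reverse list only becomes cheap once every document's references are captured as structured data across the whole corpus.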
Do you have some things to contribute to the updating of that document? And we can keep the community of information much better informed. And that's a really fantastic advance for us. But there are some catches. We absolutely benefit from having a common markup from ISO and IEC.
Roughly 30% of our corpus is our adoptions of international standards, because our technical committees are very much endorsing them for the Australian market, or the Australian and New Zealand market for many of our documents. So with adoptions we can just grab the text from ISO and IEC, build what we need to around it, and produce our own national standards, which are adoptions of international standards.
So that's extremely efficient, but only up to a point, we've discovered. We've followed ISO's path from the beginning, and we have a lot of similar tools and services, which I talked about before, that take the benefit of using this common XML language. But then there are differences, and I guess this is one of the catches and one of the things that we need to be aware of in the world of working with XML.
The IEC came along. They've used the NISO STS standard, but in slightly different ways. And it's really important in developing information standards that you're flexible. You need to cater for often quite a wide variety of structures and information points in your standards and other document types, particularly when you're capturing historical information.
It's really important to be able to do that with your contemporary, slightly more controlled XML environment. And NISO has just released version 1.2 of the NISO STS standard. So how do we manage upgrades? Upgrades are an essential part of keeping any standard up to date, addressing new problems, and solving things in slightly better ways.
But what do you have to do when things change, or when you're using somebody else's data and they're using the standard in a slightly different manner? The salient lesson that I wanted to finish with today is that the flexibility is great. Reusing other people's data is terrific, particularly in the sort of world that we operate in with standards, where nationally we have an obligation, as Australia is a signatory to the World Trade Organization, and that asks us to consider putting international standards into regulations rather than going down our own path with national standards.
It promotes trade and promotes efficiency worldwide. So we do: 30% of our corpus is. But yes, we have to deal with change and spot variations in all these scripts that we use. And our content also gets loaded into databases. So there are great advantages, but there are some catches, and I guess that's today's final message. We definitely want to remain with a flexible XML model.
But what is next for us? How do we manage changes? There's a really lively debate now about backwards compatibility of our current NISO STS standard with previous versions of that standard and, in fact, the originating ISO STS standard. If you can't go backwards, you have to consider everything on the right here: every transformation script and every connected reuse of the information.
If you can't go backwards and you don't give people time to adjust their systems, you are in trouble. You've got a really complex, program-level change to manage, so that either everybody switches over concurrently, or you have tools and processes to allow them to do their own fixes in their own time, if you can manage that. And backwards compatibility is really essential. If you can't go backwards, you lose value.
If you create a new world that has a richer markup, and you go backwards and can't support it, you lose that. So do you want to do that? I guess that's one of the catches, and that's part of the debate that's going on with advances for NISO STS, which is a really healthy one to have. So that's my presentation for today.
I look forward to the chat session after Jesse presents for us. Thank you.
Oh, great. Thanks so much, Andrew, for that. I enjoyed the plumbing standard that you showed. That was a really nice example. And the comment that was on there too. But thanks for providing that really great overview of both the benefits, the importance of flexibility and that important question of what comes next.
So maybe we'll get into that during the discussion. Let's go ahead, without further ado, and introduce our next speaker, who is Jesse Xiao, Head of Scholarly Communication and Research Services and Medical Librarian at the University of Hong Kong. He's going to be sharing some of his experiences using information standards in the context of research support at a really major international university.
So over to you, please, Jesse.
Yes, thanks, Chris. So I'll share my screen. Can you see my screen now? It's all good. Thank you.
Hi, everyone. Thanks to the organizers for giving me this opportunity to share our experience of using information standards in academic libraries, scholarly communication and research support services. I'm Jesse Xiao from HKU Libraries. First of all, I would like to introduce our university,
the University of Hong Kong. We were founded in 1911, and we now have 10 faculties, more than 33,000 students and more than 8,000 staff. Focusing on our libraries: we advance the teaching, learning, research and knowledge exchange pursuits of the University through our resources, services, and innovative and collaborative approach.
For our libraries' research services, we manage the HKU Scholars Hub as the institutional repository and the research information management system (RIMS). We also provide research data services in the University, including managing the institutional data repository and the data management plan platform. We are also promoting the understanding of the open science movement in our University,
including providing open access services, such as signing transformative agreements with different publishers, and we also have an open access fund to support OA publishing. We also subscribe to different tools, platforms and databases to support open research. Starting from last year, we also provide bibliometric and research impact services, based on research intelligence tools
and databases: we extract data from those tools and platforms and generate reports to support the University's senior management.
So, for information standards in the academic library: as we all know, in our cataloging and data management we use different kinds of standards, such as cataloging content standards like the International Standard Bibliographic Description (ISBD) and RDA. For cataloging encoding standards,
we're using BIBFRAME and MARC. For classification, we're using the Library of Congress and Dewey Decimal classifications. We also use linked data, with Semantic Web technologies such as RDF or the Web Ontology Language, et cetera. For metadata standards, we're using different kinds of data structure standards, such as the Dublin Core metadata elements and MARC, and data value standards, such as the Library of Congress Subject Headings and AAT, etc.
We also use different kinds of data format or data exchange standards, such as MARC, MARCXML, tracing format and OAI-PMH, etc. Our research and scholarly communication services also rely on the information and the metadata from different platforms and data resources. That's why we need metadata with well-defined terminologies and fields, and also very clear structures.
The information standards should also support the FAIR principles, which means an information standard needs to provide metadata that is findable, accessible, interoperable and reusable. Our library uses data from different platforms, databases and publishers. We use APIs, and we also download data sets from the publishers.
We use different ways to collect data from different resources. For example, we collect research data from DataCite; bibliographic data from Crossref, Scopus and Web of Science; and also research data from SciVal, InCites or QS, et cetera. For author profiles, we get author profile information from Scopus, Web of Science, ORCID and different kinds of resources.
We also get some information from the publishers. Last year, we did an open access exercise, so we got a lot of OA-related information from different publishers. After we collect that data and information, we integrate it into our different platforms, such as our Scholars Hub, our data hub and our Scholars Intelligence Hub, which provides research assessment and research impact reports in the platform.
And then we also provide some customized services based on that data. Open science is a trend and a movement to make scientific research, including publications, data, lab protocols, research data, software, et cetera, and its dissemination accessible to all levels of society. Open science is transparent and accessible knowledge that is shared and developed through collaborative networks.
So information standards play an important role. When we build a collaborative network to share and exchange information in the open science movement, we need information standards. But we are also facing some issues. This is one issue we identified: in different citation tracking databases or service providers, such as Crossref and the publishers,
the metadata are a little bit different. For example, for publication years, they have different kinds of dates and years related to an article. Web of Science has an early access date and an indexed date, while Scopus only provides the publication year. But if you look at some records, you will see that even the publication year
is different between Scopus and Web of Science. It is the same for funding information. When we try to get funding information from different sources, we find that different resources and platforms provide different funding information. For this article, for example, Web of Science and Crossref provide more funding information than Scopus.
But if you look at the grant number format, you will see that even for the same publication and the same funding, the grant number formats are different. That causes some problems, because when we scrape that data and import it into our data platform, we want every field to have the same format. Then we can easily do the mapping to get the related grant information in our internal system.
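The grant number mismatch can be illustrated with a small Python sketch that reduces differently formatted grant numbers to one canonical key before mapping them to internal records. The grant numbers, source names and project title below are invented, and the normalization rule (uppercase, letters and digits only) is an assumption; real funder formats may need funder-specific rules.

```python
import re


def normalize_grant(raw: str) -> str:
    """Uppercase the value and drop everything except letters and digits."""
    return re.sub(r"[^A-Z0-9]", "", raw.upper())


# Hypothetical: the same grant as it might appear in three sources.
from_sources = {
    "source_a": "ABC 12345/21",
    "source_b": "abc-12345-21",
    "source_c": "ABC12345/21",
}

# Hypothetical internal grant register, keyed by the normalized form.
internal_grants = {"ABC1234521": "Example project on water quality"}


def match_grants(raw_values, register):
    """Map each source's raw grant number to an internal record, if any."""
    return {
        source: register.get(normalize_grant(raw))
        for source, raw in raw_values.items()
    }


matches = match_grants(from_sources, internal_grants)
print(matches)
```

A rule this aggressive can also merge grant numbers that are genuinely different, so in practice it would probably be paired with a check on the funder name.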
Some publishers even put the funding information only in the acknowledgment, without separate records. Another issue is about open access definitions. I think a lot of us are at least aware that the OA definition is unclear across different databases. For example, Scopus has gold, hybrid gold, bronze and green, and on another platform, even when we search for the same university,
we see that the publication counts are different; even for gold OA, the publication counts differ. We are also a little bit confused about which version is meant, for example for green OA: is it the submitted version, the accepted version or the published version? And when we did our OA exercise, we found that some bronze OA is for the open archive, but we cannot find the general open archive policy on the journal web page; we cannot easily find that information.
The embargo period information also cannot easily be found on the publisher and journal web pages. And we are not sure whether the journal or the database uses the best OA version approach. That means that if an article has both gold OA and green OA, it will be counted as gold OA. Another challenge is the open science indicators.
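The "best OA version" counting rule mentioned above can be sketched in a few lines of Python. The priority order below is an assumption chosen for illustration (the talk only states that gold wins over green), and the article IDs are invented; different databases may rank or define the categories differently, which is exactly the inconsistency being described.

```python
from collections import Counter

# Assumed priority order, highest first. Only "gold beats green" comes
# from the talk; the positions of hybrid and bronze are illustrative.
OA_PRIORITY = ["gold", "hybrid", "bronze", "green"]


def best_oa_status(statuses) -> str:
    """Return the single highest-priority OA status, or 'closed' if none."""
    for status in OA_PRIORITY:
        if status in statuses:
            return status
    return "closed"


def count_by_best_status(articles) -> Counter:
    """Count each article once, under its best OA status."""
    return Counter(best_oa_status(statuses) for statuses in articles.values())


# Hypothetical articles and the OA routes through which each is available.
articles = {
    "article-1": {"gold", "green"},
    "article-2": {"green"},
    "article-3": set(),
}

counts = count_by_best_status(articles)
print(counts)
```

Under a different rule, for example counting an article under every route it is available through, the same three articles would give two green articles instead of one, so the counting approach needs to be stated alongside any OA statistics.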
As you may be aware, PLOS recently announced their Open Science Indicators, covering data sharing, code sharing and preprint posting. Open science indicators are very important for all academic institutions and the publishing field. So we want to see more open science indicators, such as data availability, code availability, etc. But from the library perspective,
we do not want that information to be shown only in a text field or in the PDF; we want to find that information in the metadata, so that we can easily scrape it and feed it into our system. Some platforms already have this kind of information, such as the DataCite metadata. When you mint the dataset DOI, you have the DOI first and then you can add related documents: you can say the document "is cited by", and then you can choose the document type, such as dataset.
That means this dataset is cited by another dataset. And also for Unpaywall: we can get the preprint and postprint versions from the Unpaywall API, which is very helpful for us. For our libraries, our data hub is our data repository, and we also use that information there, such as for this dataset published in our data hub.
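As a sketch of how preprint and postprint locations might be pulled from an Unpaywall record, the Python example below works on a hand-written sample record rather than a live API call. The field names (`oa_locations`, `version`, `host_type`, `url`) follow Unpaywall's published data format as far as I am aware, but the DOI and URLs are invented, and the mapping of `submittedVersion` to "preprint" and `acceptedVersion` to "postprint" is an assumption about how a library might label them.

```python
# Assumed labels for Unpaywall's version values.
VERSION_LABELS = {
    "submittedVersion": "preprint",
    "acceptedVersion": "postprint",
    "publishedVersion": "published",
}

# Hand-written sample shaped like an Unpaywall record; values are invented.
SAMPLE_RECORD = {
    "doi": "10.1234/example",
    "is_oa": True,
    "oa_locations": [
        {"host_type": "publisher", "version": "publishedVersion",
         "url": "https://publisher.example/article"},
        {"host_type": "repository", "version": "acceptedVersion",
         "url": "https://repository.example/postprint"},
        {"host_type": "repository", "version": "submittedVersion",
         "url": "https://preprints.example/preprint"},
    ],
}


def versions_by_label(record: dict) -> dict:
    """Group OA location URLs by preprint / postprint / published."""
    grouped = {}
    for location in record.get("oa_locations") or []:
        label = VERSION_LABELS.get(location.get("version"))
        if label:
            grouped.setdefault(label, []).append(location.get("url"))
    return grouped


versions = versions_by_label(SAMPLE_RECORD)
print(versions)
```

Having the version recorded as a metadata field, rather than described in free text, is what makes this kind of automated harvesting into a repository feasible.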
We also link it with the publications, so when users are downloading the data, they can also see the publications here. This example is a preprint record published in our data hub. We also link it with the formally published version in the record page. And this example shows the dataset also linked with the funding information, the funding record, so the user can find the funding details and the project
details; just click to find out more. The last example is about our Scholars Hub. We also put some postprints in our Scholars Hub, and the user can find the postprint version there. This is the example: the article is published in IEEE, and we also publish the postprint in our institutional repository.
The last topic is ORCID. I am also on the board of directors of ORCID. ORCID provides support for author profile management. It is widely used in different systems, such as publisher systems like ScholarOne and EM, discovery and profile systems such as Web of Science and Scopus, and also funder systems.
In Hong Kong we do not use ORCID in our funding system at the moment, but I think in future the funder will start to use ORCID. For our institutional systems, that is, our Scholars Hub and data hub, the research information management system, and our Scholars Intelligence Hub, the research impact management system, all our systems use
ORCID. We use ORCID to get information from the publishers, funders, and discovery and profile systems, which is quite convenient for us. So, at last, it's my last slide. Open science aims to create greater transparency in research and remove the barriers to sharing open source methods or tools at any stage of the
research process. More specifically, open science has come to refer to specific practices such as open access publishing, open research data, etc., and these are changes in the scholarly communication cycle. This needs the researchers, libraries, publishers and funders to work together to build a collaborative network and reliable information standards in different fields, such as terminology, classifications and information governance.
So we build a foundation to achieve the goals. I think I've finished my presentation. Any questions, we can discuss in the Q&A session.
Perfect, thank you so much, Jesse. I thought you brought up some really interesting points around standards, open access and open science, and some of the challenges. I liked that example that even Web of Science and Scopus can't agree on when a particular article was published, which year it was published.
So yeah, great. I'm sure the audience has a lot of questions for both you and Andrew as well. So let's go ahead and end the recorded segment of the session here and we'll see you all in the live discussion shortly. Thank you.