Miles Conrad Lecture 2022 - Dr_ Brennan-NISO Plus
https://asa1cadmoremedia.blob.core.windows.net/asset-a9d53caa-4c0b-420d-b3d9-58275d672381/Miles Conrad Lecture 2022 - Dr_ Brennan-NISO Plus.mp4
TODD CARPENTER: So one of the highlights of the NISO Plus conference is the presentation of the Miles Conrad lecture. For those of you not familiar with the background of this award, this has been presented by the National Federation of Abstracting and Indexing Services since the 1960s. And since NISO and NFAIS merged in 2019, NISO has been continuing this tradition as a core element of the NISO Plus conference. Those of you who are not familiar, G. Miles Conrad was director of the Biological Science Abstracts, now BIOSIS Previews, distributed by Clarivate's Web of Science.
TODD CARPENTER: In 1957, Conrad organized a meeting of the 14 abstracting and indexing services to discuss the implications of government investments in science that resulted from the launch of Sputnik by the Soviet Union. This led to the formation of NFAIS in 1958. And, unfortunately, Mr. Conrad passed away in 1964 at the young age of 53. Following Conrad's death, the NFAIS board of directors established an annual lecture series named in his honor, which would be a central feature of its annual conference and would recognize thought leaders in the community.
TODD CARPENTER: You'll see some of the distinguished names of people, who've previously received the Miles Conrad Lecture Award. Most recently, James Neal of Columbia University and Heather Joseph from SPARC in 2020 and 2021. Before we present the award, I want to recognize EBSCO for its kilobyte-level sponsorship of this award. So thank you to EBSCO. This year's award is selected by the NISO board of directors.
TODD CARPENTER: And we've, hopefully, delivered to you, Patty, this award, which reads, "The NISO Board of Directors presents the 2022 Miles Conrad Award to Dr. Patricia F. Brennan for her lifetime achievements in the information community. Dr. Brennan is currently the Director of the National Library of Medicine, a part of the National Institutes of Health in Bethesda, Maryland.
TODD CARPENTER: She's an esteemed and accomplished researcher and practitioner. She holds a bachelor's degree in nursing from the University of Delaware, a master's in nursing from the University of Pennsylvania, and then entered clinical practice for a number of years before studying the connections between nursing and information services systems, receiving her PhD in industrial engineering from the University of Wisconsin-Madison.
TODD CARPENTER: She then went on to a prolific academic career. And she's held academic positions at Marquette and Case Western Reserve Universities, before joining the faculty at the University of Wisconsin-Madison, where she was the Lillian L. Moehlman Bascom Professor at the School of Nursing, as well as the College of Engineering. While she was there, she led the Learning Environments Lab at the Wisconsin Institutes for Discovery.
TODD CARPENTER: And in 2016, she became the first woman, the first nurse, and the first industrial engineer to ever serve as the Director of the National Library of Medicine. Using her skills in engineering, systems design, and clinical practice, Dr. Brennan has been at the forefront of improving patient outcomes and experiences. Her research is focused on the use of technology in the support of patient care.
TODD CARPENTER: Her work has helped lay the groundwork for what is commonly known today as mobile health or telemedicine. And while at the University of Wisconsin, Dr. Brennan was the principal investigator on the Project HealthDesign initiative, which helped develop a number of tools that supported engagement with patients. It focused on the time between encounters, or, basically, how clinicians can support patient care in the majority of time when they're not sitting in an office or space with a nurse or a doctor.
TODD CARPENTER: Dr. Brennan has received many awards and has done a lot in service and recognition in the community. She's past President of the American Medical Informatics Association. She's been elected to the Institute of Medicine of the National Academy of Sciences, now the National Academy of Medicine.
TODD CARPENTER: She's a fellow at the American Academy of Nursing, the American College of Medical Informatics, and the New York Academy of Medicine. She was recently inducted into the American Institute for Medical and Biological Engineering. One of the most striking things about Dr. Brennan is her practicality in the application of data and information resources.
TODD CARPENTER: It's not simply about collecting items for the sake of preservation or building a collection. Her approach is really one centered around the user, the patron, the public, who needs the information, to address a pressing health concern. Just last month, NISO hosted a virtual conference on putting the user at the center of design decisions. And this has been something that's been core to Dr. Brennan's work for decades.
TODD CARPENTER: She's continued that at the National Library of Medicine to amazing effect. And the past two years have brought that importance of putting patients' lives and connecting them to information into stark relief. And this is a great quote that I found when researching her background for this award. Is that, information alone doesn't make people's lives better.
TODD CARPENTER: The pathways to information, research about how best to use it, how to integrate it into policy, and tools to make it more visible to people would make it work. And in talking about her role at the National Library of Medicine, she was talking about how she could bring her understanding to the framing of a library that's not just part of people's lives and regardless of whether they're a scientist, a patient, or a mom.
TODD CARPENTER: I think it really goes to the core of the sort of skills and importance and values that we want to recognize here at NISO. So on behalf of the NISO Board of Directors, I'm very pleased to recognize Dr. Brennan for her service, her leadership, and her vision, and welcome her to the virtual stage. So Dr. Brennan, over to you.
DR. PATRICIA BRENNAN: Thank you very much, Todd. What a nice story about me. I appreciate it very much, and I'm really delighted to be here. I'm quite honored to be receiving this Miles Conrad Award today and providing this lecture for you. I want to take you on a journey around one question: what is the role of a library in the world of unstructured data? Now I'm sorry to tell you, I'm not going to have the answer to this question for you.
DR. PATRICIA BRENNAN: I'm going to work with you to evolve the answer over time. But I'm starting from two points. First, libraries will persist, but the units of scientific communication, the digital objects that we have to connect will continue to change constantly. So keeping up with them, keeping up with the standards that allow us to locate them and share them, these will continue to be challenges. We'll all have jobs for a long time.
DR. PATRICIA BRENNAN: And, secondly, I want you to remember that we know, as a National Library of Medicine, we cannot do this alone. Even within the biomedical sciences, we must partner with publishers, with authors, with distributors, with technology companies, and most importantly, with our stakeholders, who will include, increasingly, a larger and much more diverse group than we've had all along.
DR. PATRICIA BRENNAN: We will have in our stakeholder group patients, mothers, children, high school students, as well as scientists, clinicians, and policymakers. Now I did receive the award this week. And I was delighted to have it come to my home. Thank you so much for this beautiful award, but more importantly for the recognition that a life spent on connecting people to information is a critical and important value for our society.
DR. PATRICIA BRENNAN: Now, I'm not a librarian and I'm not an expert in standards, so I thank the society, I thank NISO, and I particularly thank Todd for the work and the committee for the honor of being able to speak to you, because you give me a platform to tell more of our stories. I've spent my career, as Todd explained, in the process of building computer technologies and connecting them to people's lives, and standards have been a critically important part of that.
DR. PATRICIA BRENNAN: I'm not a standards expert, I'm not even a librarian, but I will tell you that I've known over time that standards bring order to complex information. They allow us to efficiently and automatically make representations of this complex information, fostering communication and ensuring shared meaning. This is critical to the lives of individuals. The National Library of Medicine focuses on not only acquiring and collecting and preserving and disseminating the scientific communication, but also the tools, including standards, to make sure that scientific communication is available.
DR. PATRICIA BRENNAN: I'm going to talk to you today about three things-- the mission, our history of the National Library of Medicine, where we are today, and the challenges for the future that I want you to partner with me as we do this. Now we will be able to have questions. I understand if you put them into the Q&A, Todd will be providing me questions. It's my hope that I will be through my remarks by about quarter of so that we can hear and have dialogue around these particular issues.
DR. PATRICIA BRENNAN: I am very, very, very honored to be receiving an award that was actually also awarded to my predecessor, Donald A.B. Lindberg, but I want to remind you, and particularly those of you who are less familiar with the National Library of Medicine, that we've been serving science and society since 1836. We are a research enterprise for biomedical informatics, and we are the world's largest biomedical library. You can see three critical epochs in our history.
DR. PATRICIA BRENNAN: From 1836 to 1968, we were a collection of books and journals. We began as a shelf in a field surgeon's tent and then moved on, through congressional legislation, to the Public Health Service, and then moved to the NIH campus in the 1960s. And we became part of the Department of Health and Human Services in the mid to late '60s, around 1966. From '68 to 2000, we developed the foundations of a modern library.
DR. PATRICIA BRENNAN: We were still, and we still are, in that physical building you see in the picture there on the left. But, importantly, what was happening in the '60s was digitization and the expansion of the impact of the information age, much as Miles Conrad saw 10 years before. Through congressional act, we had the beginnings of the Lister Hill National Center for Biomedical Communications, and in the '80s, the National Center for Biotechnology Information.
DR. PATRICIA BRENNAN: Dr. Lindberg joined us in 1984. And in that period, he brought us into a huge focus on automation as it would address both the burgeoning genomic knowledge that was coming forward and the needs of patients and laypeople to have access to this information. In 2000, we began the 21st-century library. We now focus on leading innovative data science research to accelerate the mission of the National Library of Medicine to reach scientists and society with trustable health information.
DR. PATRICIA BRENNAN: But let me remind you, for those of you who were not here in 2007 when Dr. Lindberg received this award, what was happening then? [VIDEO PLAYBACK] [MUSIC PLAYING] - Today on Capitol Hill, the National Library of Medicine's Medline internet database is made free to the public.
DR. PATRICIA BRENNAN: - The superhighway on medical information will now become a freeway, - This development by itself may do more to reform and improve the quality of health care in the United States than anything else we've done in a long time. - We actually, at NLM alone, answer a million inquiries a day. - A day?
DR. PATRICIA BRENNAN: - A day. It's a staggering amount. It's going on 400 million a year. - We have tens of thousands of genomes, not just one. And all of that has to be assembled in a place where people can compute on it and learn from it. And that's where NLM is just a central hub of information that we all depend on.
DR. PATRICIA BRENNAN: - Researchers are downloading data equivalent in size to the contents of the entire Library of Congress every week. That's amazing. - The real heart of PubChem is the links from each chemical to other sources of information about its biological properties. - When we were formulating the Framingham SHARe program, in which we wanted to genotype the 10,000 participants in the Framingham Heart Study, we wanted to put the genotypes and the phenotypes together in a common database.
DR. PATRICIA BRENNAN: And before we knew it, there was dbGaP, the database of genotypes and phenotypes. - Oftentimes, it's the National Library of Medicine that I turn to, and not just as a reporter but also as a doctor. There are 23 million citations on PubMed. ClinicalTrials.gov is a place that I often send my patients. - I push MedlinePlus. I eat, breathe, MedlinePlus.
DR. PATRICIA BRENNAN: It's just amazing the amount of information that they have on the website. - You know today there is virtually no lag between new knowledge coming out of any kind of research activity and its clinical application. The basic reason for that is what the National Library has done taking advantage of the new technology for the transmission of information.
DR. PATRICIA BRENNAN: [END PLAYBACK]
DR. PATRICIA BRENNAN: I stand on the shoulders of a giant. And let me tell you what the NLM has been up to the last 15 years. Obviously, in the last two years, the COVID-19 pandemic has changed the nature of research at the NIH, both on our Bethesda campus, which you see in front of you, and nationally and internationally. For those of you who've not visited us, by the way, the National Library of Medicine is over there on the left-hand side.
DR. PATRICIA BRENNAN: We now have two buildings and parts of two other buildings and two buildings off-campus, so we're growing and growing with our personnel. But during the COVID pandemic, we became a really important engine at the NIH to foster the changing nature of research across the NIH. Innovations and research design and new kinds of data were burgeoning at every single institute, and we were there to support this.
DR. PATRICIA BRENNAN: We accelerated the application of informatics to clinical research, getting the NIH to publicly endorse health data standards in the US Core Data for Interoperability and also, in messaging standards, the Fast Healthcare Interoperability Resources, or FHIR, standard. We helped to inspire advances in analytics for generalizable solutions. And we helped to support engagements in communities around the country in research, design, and implementation.
DR. PATRICIA BRENNAN: You may not realize that the National Library of Medicine has 8,000 points of presence around the country in our Network of the National Library of Medicine, providing us with community access and a community doorway. This was particularly important during the COVID pandemic. We've learned a lot in these past two years. We've learned that medical information must be complemented by an understanding of the person in context.
DR. PATRICIA BRENNAN: But there's a lot more focus on the standards for the medical information, the privileged information, than on the person in context. And we're working to change that because community norms and privileges intersect with research principles and federal requirements. So to devise research quickly for a pandemic, such as the COVID pandemic, requires that we balance both science and society.
DR. PATRICIA BRENNAN: And, finally, we learned that the research at the speed of the pandemic goes best when it leverages existing community investments as well as established research assets, including standards. We've also learned through the pandemic that we needed to improve or attend to the research and reproducibility of research to accurately characterize the experience of all people in the pandemic.
DR. PATRICIA BRENNAN: Understanding and developing a better understanding of how clinical and human experience come together by leveraging information. And, finally, that technology and messaging standards were critical to be able to acquire testing information in the community or push content information back out to individuals as they needed it. We also developed new approaches to design, and this changes the nature of scientific communication.
DR. PATRICIA BRENNAN: We've had stronger community-scientist collaborations and greatly accelerated timelines, releasing funds faster than ever before at the NIH and getting federal dollars into the hands of communities and scientists very quickly. We were able to leverage existing data, like electronic health records, particularly when they relied heavily on terminology standards. We started to build early standards, like common data elements, into research so that the concept of having interoperable data began during the design phase of research.
DR. PATRICIA BRENNAN: And finally, we saw new forms of scientific communication come out. Preprints, news briefs, and sometimes the evening news, all of which led the National Library of Medicine to spearhead new forms of scientific communication. We led the NIH by providing a preprint pilot that made the results of research funded by the National Institutes of Health, specifically related to the COVID pandemic, discoverable and available through PubMed Central and available with a citation index in PubMed itself.
DR. PATRICIA BRENNAN: We also collaborated with publishers around the country, more than 50 publishers, to develop the Public Health Emergency COVID-19 Initiative. This made coronavirus-related articles accessible without charge in PubMed Central. We used a licensing term that facilitated not only human readability but also text mining and secondary analysis. And so as not to let those news briefs be lost, we developed the Global Health Events archive.
DR. PATRICIA BRENNAN: We now have over 12,000 pieces of born-digital resources from the COVID-19 pandemic that will be available for future generations to study. The National Library of Medicine is guided right now by a strategic plan that we developed about 5 and 1/2 years ago as I joined the NLM. We are focused on accelerating discovery and data-powered health. We have three pillars of our strategic plan.
DR. PATRICIA BRENNAN: First, to accelerate discovery and advance health through data-driven research. Second, to reach more people in more ways to enhance dissemination and engagement. And third, to build a workforce for data-driven research and health. Each of these relies heavily on our ability to have common ways of labeling challenges, which we call standards.
DR. PATRICIA BRENNAN: Now let me begin by talking to you a little bit about the research at the National Library of Medicine. We are, like other institutes at the NIH, a research engine. We have an internal research program housed on campus in the building on the left that you see here and also an extramural national research program where we fund studies and projects all around the country.
DR. PATRICIA BRENNAN: Our intramural research program has two different groups, the computational biology branch and a computational health research branch. Each of these, again, relies heavily on the ability to make common labeling of clinical data and biomedical phenomena known and shareable. Within our research program, we spend a lot of effort focusing on text mining, as you would expect. We're a library.
DR. PATRICIA BRENNAN: Dr. Zhiyong Lu's team applied AI and machine learning to the literature and developed a resource called LitCovid. The search engine and results reporting structure that was developed in LitCovid is now the core that runs our PubMed searches. After a search is launched by an individual's query, the series of matching citations that we hold is made available.
DR. PATRICIA BRENNAN: There is a learning-to-rank algorithm, driven by an AI mechanism, that maps the citations that were extracted to the citations that will be presented, so that what is presented is a best-match ordering of the citations. Why was this important? Well, we were finding that a typical search through our PubMed resources would generate 20, 30, 40 pages of results, most of which after the first two pages were never viewed.
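[Editor's illustration] The two-stage idea described here, retrieving every matching citation and then reordering the hits so the best matches come first, can be sketched roughly as follows. This is a minimal sketch under stated assumptions, not the actual PubMed implementation: the `Citation` fields, the two features, the hand-set `WEIGHTS`, and `REFERENCE_YEAR` are all hypothetical stand-ins for what a trained learning-to-rank model would supply.

```python
from dataclasses import dataclass


@dataclass
class Citation:
    pmid: str   # hypothetical identifier, not a real PubMed ID
    title: str
    year: int


REFERENCE_YEAR = 2022   # assumption: "now" for the recency feature
WEIGHTS = [0.8, 0.2]    # assumption: hand-set; a real system would learn these


def retrieve(query, citations):
    """Stage 1: keep every citation whose title shares a term with the query."""
    terms = set(query.lower().split())
    return [c for c in citations if terms & set(c.title.lower().split())]


def features(query, citation):
    """Two toy relevance features: query-term overlap and recency."""
    terms = set(query.lower().split())
    if not terms:
        return [0.0, 0.0]
    title_terms = set(citation.title.lower().split())
    overlap = len(terms & title_terms) / len(terms)
    recency = max(0.0, 1.0 - (REFERENCE_YEAR - citation.year) / 50.0)
    return [overlap, recency]


def rerank(query, hits):
    """Stage 2: order the retrieved hits by a weighted feature score, best first."""
    def score(citation):
        return sum(w * f for w, f in zip(WEIGHTS, features(query, citation)))
    return sorted(hits, key=score, reverse=True)


def search(query, citations):
    """Retrieve matching citations, then present them in best-match order."""
    return rerank(query, retrieve(query, citations))
```

The point of the second stage is the one made in the talk: when a query returns many pages of hits, the ordering decides what a reader actually sees, so the scoring step matters more than the retrieval step for the first page of results.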
DR. PATRICIA BRENNAN: So we wanted to bring the best results upfront. Some of our research actually leaves the library and goes into the community. Researchers in our intramural program partnered with researchers in the community to improve our ability to do wastewater-based surveillance. What's a library doing in wastewater, you may wonder. But the analytical tools that we have are useful for estimating the amount of circulating SARS-CoV-2 variants found in the genetic fingerprints in wastewater.
DR. PATRICIA BRENNAN: This is a partnership, as I said, between our intramural research program and an extramural investigator. We do support a lot of research in the extramural community. On the screen in front of you are two examples of this. The Green Button project on the left-hand side describes the work of Nigam Shah. He wanted to automate the very commonly asked question in clinical care, what happened to your patient who looked like mine?
DR. PATRICIA BRENNAN: He brings together information both from experience and from libraries, from randomized controlled trials, guidelines, and algorithms, to provide that in-the-moment answer to clinical practice. And this is now operational at Stanford. In a project run by Dr. Nguyen at the University of Maryland, Google Street Maps are being leveraged to better understand the community so we can characterize the built environment and look at the relationship between the built environment, the lateral environment, and health outcomes.
DR. PATRICIA BRENNAN: This work is bringing a whole new type of data into health. The National Library of Medicine is also critical in supporting projects across the entire NIH. And I want to draw your attention to three of these because you'll see how increasingly our work relies on effective use of terminology and messaging standards. In the Bridge2AI program, the goal is to generate new flagship biomedical and behavioral data sets that are ethically sourced, trustworthy, well-defined, and accessible.
DR. PATRICIA BRENNAN: What we might call a well-behaved data set. This program has put about $125 million into research around the country to develop software and standards to unify data attributes across multiple data sources and across data types. The teams are creating automated tools to accelerate the creation of FAIR (that is, findable, accessible, interoperable, and reusable) data sets, as well as ethically sourced data sets.
DR. PATRICIA BRENNAN: When operating AI on data that is voluntarily contributed by patients, by individuals, it is critical not to exploit the individual's rights. They are providing resources to disseminate the data as well as the principles and the best practices. But importantly, they are also investing in training materials for workforce development that bridges AI, biomedical, and behavioral research. A project just released this summer is referred to as the AIM-AHEAD project.
DR. PATRICIA BRENNAN: In AIM-AHEAD, our investigators are building partnerships around the country to leverage data science, clinical research, and community engagement specifically to address the challenges of health disparities. The aim is to look at whether we can detect inequities and improve health equity by providing more appropriate recommender systems and more appropriate scientific discovery. The AIM-AHEAD program will create the infrastructure that supports interoperability at scale, which in turn makes sure that we are able to apply standards in a uniform fashion to research and clinical data across the country.
DR. PATRICIA BRENNAN: Accelerating the use of well-behaved data sets will improve our ability to understand and address health disparities. And, again, critical to this program is to develop data science training, for we know that the research workforce for data science and information science at present is largely not as well diversified as our countries can possibly afford.
DR. PATRICIA BRENNAN: Finally, we're going international. With the DS-I Africa program, the goal is to harness data science for health discovery and innovation in Africa. The goal here is to leverage these technologies and prior investments in the sub-Saharan African region to develop solutions to the region's most pressing medical and public health problems by creating an ecosystem that includes partners from the academic, government, and private sectors.
DR. PATRICIA BRENNAN: This program alone is a $75 million program. 19 awards have been made. And there is a data coordinating center and open data science platform at the University of Cape Town. The intention here is to foster in-country data science for in-country discoveries that can be rapidly returned to the community. In this, additionally, we are learning the languages that are used across this diverse continent and how to build structures to make these data FAIR and accessible and preserve their original meanings.
DR. PATRICIA BRENNAN: There are multisectoral and multidisciplinary hubs addressing critical problems that you see across the world but, specifically, in sub-Saharan Africa, including climate change, COVID, of course, mental health, and antimicrobial resistance. Now the National Library of Medicine is really better known as a trusted provider of health information, of trustworthy health information. So I'm going to turn to talk a little bit about our resources and health information right now.
DR. PATRICIA BRENNAN: The National Library of Medicine is the nexus of data and information at NIH. We're a building, an electronic pathway, and a workforce for the future. What we have, what we house, what we curate are high-value genomic, bibliographic, and research literature repositories. We also maintain the reference sources for many health data terminologies and standards.
DR. PATRICIA BRENNAN: And, finally, value sets, that is, collections of clinically meaningful indicators that allow hospitals around the country to do quality management and quality improvement. We serve millions of people now, 24 hours a day, 7 days a week. We are one of the few federal workforces that cannot rest even if we have a sequestration, even if we have a drop in funding. We must be present because the world relies on us.
DR. PATRICIA BRENNAN: We are fostering sequence-based public health now, surveillance that is driven not only by understanding who has gotten sick but also by what organisms they are being exposed to. We're using standards to make data FAIR and promoting controlled access to data so we protect the privacy and the integrity of research but still foster reuse of data. And, finally, we're conducting research to develop and refine new ways to interrogate data.
DR. PATRICIA BRENNAN: Our products are probably familiar to many of you. ClinicalTrials.gov, our repository of registered clinical trials and results, now has over 400,000, more than the 200,000 that Senator Obie said in the video. We have results available on 50,000 studies. Many of these actually never make it into publication, so we're able to assure the public's trust by providing research results directly.
DR. PATRICIA BRENNAN: PubMed and PMC are our literature repositories, PubMed being citations and PMC being full-text access. There are 30 million citations in PubMed now and over seven million full-text articles in PubMed Central. Our genomic resources you see on the bottom half of the screen. dbGaP is the database of genotypes and phenotypes, bringing together specialized studies that allow us to look at these important predictions of what a gene says will happen.
DR. PATRICIA BRENNAN: GenBank supports fully computed fully annotated gene sequences. And SRA takes the original raw sequences making them available for studies. We had the first SARS-CoV-2 sequence available in our repository within a month of the first case being determined in China. Now our data sources are just going wild. People are using them extensively. The sizes are growing.
DR. PATRICIA BRENNAN: We have over 30 petabytes of data in our Sequence Read Archive, with a growth of over 250,000%. Our dbGaP holdings are growing. PubMed Central has grown over 300% within a 9-year period. But our users are also growing, both the kinds of users and the number of users that are reaching different resources. So our pathogen detection system has grown over 300,300%. That is the ability to use our Sequence Read Archive to determine what kind of infectious disease is seen.
DR. PATRICIA BRENNAN: The PubMed central resources are growing millions of users a day. And our sequence read archive continues to grow. We are working to be more responsive to these users, make things more discoverable, and make them more accessible. We're creating comparative genomic resources to advance scientific discoveries and improve public health. In the past, a specific genomic resource was created for every single different model organism, from rats to zebrafish.
DR. PATRICIA BRENNAN: And yet we're finding that crossing over the genetic structures of these different organisms is equally important. So we're looking to build pathways between organisms as well as within organisms. And we are developing ways to make genomic data accessible. Not only are we creating robust interconnected systems, making use of modern commercial cloud technologies and open data, but we're also breaking down the silos to accelerate the generation, analysis, and sharing of the data.
DR. PATRICIA BRENNAN: Fundamentally, we are applying standards to make this genomic data FAIR and accessible. We have a large investment in terminology standards and health data standards to promote common approaches to support the NIH, science overall, and health care delivery. And I want to turn to those for just a few minutes. Our focus on health data standards brings the NLM into a specific role in the country.
DR. PATRICIA BRENNAN: We do build some standards. We build RxNorm, which is a terminology standard for drugs and pharmaceuticals. But we often are the curators and disseminators of critical standards. And we certainly promote their use in the research community and nationally. Our terminology standards address both the literature and clinical phenomena.
DR. PATRICIA BRENNAN: The Medical Subject Headings, or MeSH, is our key terminology for understanding, classifying, and organizing the literature. It takes advantage of a large ontology that allows hierarchical, inherited relationships, and it improves our ability to locate and retrieve objects because of these relationships. Within clinical terms, we are focused on curating and disseminating three separate terminologies that are useful for what's referred to in the United States as meaningful use.
DR. PATRICIA BRENNAN: How do we make sure our hospitals' clinical information systems are achieving good health care? That's why we are providing access to the Systematized Nomenclature of Medicine, or SNOMED, terminology; LOINC, the Logical Observation Identifiers Names and Codes; and RxNorm, which, as I mentioned a few minutes ago, is our internally built normalized naming system for generic and branded drugs. In addition to basic terminology standards, we also develop composite standards.
DR. PATRICIA BRENNAN: The value sets in the Value Set Authority Center are clusters or lists of codes and terms that allow hospitals and health care providers to either make appropriate billing for a cluster of care services or determine the quality of service by bringing together certain features around the diagnosis. We also effectively develop and disseminate the formatting standards of FHIR, the Fast Healthcare Interoperability Resources.
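[Editor's illustration] The value set idea mentioned above, a named list of codes that a hospital system can test a patient record against, can be sketched as a simple set-membership check. This is only a sketch under stated assumptions: the code systems and codes below are invented placeholders, not real SNOMED, LOINC, or RxNorm entries, and the function names are hypothetical rather than part of any NLM interface.

```python
from typing import FrozenSet, Iterable, List, Tuple

# A code is identified by the terminology it comes from plus the code itself.
Code = Tuple[str, str]  # (code system, code)

# Hypothetical value set: every entry here is a made-up placeholder.
EXAMPLE_VALUE_SET: FrozenSet[Code] = frozenset({
    ("example-dx", "DX-001"),
    ("example-dx", "DX-002"),
    ("example-lab", "LAB-100"),
})


def in_value_set(code: Code, value_set: FrozenSet[Code]) -> bool:
    """Return True when the coded entry belongs to the value set."""
    return code in value_set


def matching_codes(record_codes: Iterable[Code],
                   value_set: FrozenSet[Code]) -> List[Code]:
    """Return the record's codes that fall inside the value set, in record order."""
    return [code for code in record_codes if code in value_set]
```

Pairing each code with its code system is the design choice worth noting: the same code string can mean different things in different terminologies, so membership is checked on the pair rather than on the code alone.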
DR. PATRICIA BRENNAN: Now, this is a broad national partnership that we participate in to ensure that the Fast Healthcare Interoperability Resources, or FHIR, standard is available. It wasn't discovered at the NIH or developed only by the NLM, but our partnership in making sure of its use both for clinical care and research is one of our key activities. Fundamentally, as the head of a library, I view standards as a way to provide purposeful expressions of a common world view of specific phenomena.
DR. PATRICIA BRENNAN: And we are now used to using standards to bring order to complex phenomena. Think about the complex interaction between a care provider and a patient, or what a radiology image looks like and how it becomes comparable to others. Or how do we organize the millions of books or thousands of clinical records? The standards that we invest in and support bring order to these complex phenomena and support the exchange of meaning.
DR. PATRICIA BRENNAN: I want to take a couple of minutes to say what we're doing immediately because we are now maturing in our standards efforts, specifically our health data standards efforts. I want to let you know about our efforts on the Medline 2022 Program. Medline 2022 is an initiative of the National Library of Medicine that is designed to make our processes more efficient and more responsive.
DR. PATRICIA BRENNAN: We've invested in an automated approach to collecting and curating that will provide a 24-hour response time for a MeSH-indexed citation to appear in PubMed. This is almost unheard of, to be able to produce this so rapidly. We are expanding the gene and chemical curation by subject matter experts. So we are preserving our scarce human resource to make sure that the application of standard terminologies for specialized areas does have human attention.
DR. PATRICIA BRENNAN: And we are working with our analysts and our researchers to continuously improve the automated indexing algorithm. At the other end of the research process, the NLM is supporting the use of common data elements in research projects. Common data elements are a strategy to promote research rigor through consistent naming conventions. Given any phenomenon which is manifest in a particular concept, what is the appropriate measure?
DR. PATRICIA BRENNAN: What is the item or scale? And to whom is it applied? This process of using common data elements is a way of collecting research data purposefully. It allows us to get better leverage and to better connect across research activities. These, like any other change in the standard way we label things, do require human engagement to develop buy-in, to help researchers see the value of a different way of addressing things, and to improve the ability to connect and value research, not only for the resolution of a single hypothesis, but for the development of research data that can be used in the future.
DR. PATRICIA BRENNAN: So what is facing our future? I'm going to bring to you the most important challenge that I see coming up to us as a library today. And that is the presence of unstructured data. Unstructured data is moving towards us so quickly. And our past history of being able to build terminologies around a structure of an article, a structure of a digital record, a structure of a sequence, is now being taxed by the fact that increasingly we are facing unstructured novel data types coming into the NLM.
DR. PATRICIA BRENNAN: Let me give you a little bit of a video view of this. [VIDEO PLAYBACK] [MUSIC PLAYING] [END PLAYBACK] We will need to think about standards in a new way in the future.
DR. PATRICIA BRENNAN: If standards are the tools that help us make sense of phenomena and share that meaning with others, then we need to think about how we develop standards on increasingly atomic data. How do we formalize the atomic data so the structuring of the information, the ontologies, the sense-making tools, can be brought together? The National Library of Medicine believes its primary purpose in the world is to enable knowledge building based on core data, whether that's literature, clinical observations, or sequenced biological data.
DR. PATRICIA BRENNAN: And in order to do that, we have to be putting in the building blocks that allow the interconnection of information into a resource that can be used, that can be leveraged by society. Our responsibilities as the National Library of Medicine are and will continue to be the acquisition, the organizing, the preservation, and dissemination of the information important to health and biomedicine. But this kind of information is coming at us in more and more granular ways.
DR. PATRICIA BRENNAN: And, more importantly, more people want to be able to use it to make their own story. In the traditional use of terminologies and standards, as I see them from my perspective, a world view was expressed through specific terminologies. The ontology of organizing specific data units to connect to each other brought with itself a story. But now those stories are changing; they're expanding. So an individual's understanding of their own health needs and the professional's understanding of the illness that person has overlap but are not identical.
DR. PATRICIA BRENNAN: So our challenge as a library now is to bring forward the tools that help people make their stories. The partnerships and engagement that are necessary to accelerate discovery and to improve our ability to make our resources available cannot be achieved by the National Library of Medicine alone. But partners such as BIBFRAME or the ICMJE or RDF bring to us a partnership that we can leverage and, in turn, contribute to.
DR. PATRICIA BRENNAN: JATS, the standard that structures our citations, allows us to advance our automated indexing through Medline 2022 to be able to bring a 24-hour response. So these partnerships are core to our future. Sharing our worldviews, not necessarily having the same worldview, and aligning them will be critical as we go through developing new ways of doing business.
DR. PATRICIA BRENNAN: The library is challenged to help people find meaning from data in a way that's driven by users' needs. What does the individual user want to bring together? What do they want to, if you will, mash up in the way they understand a problem? So an individual looking at cellular transformation because of a specific lung disease that is treated by a certain medication, in someone who lived in a certain area with a certain level of traffic and maybe a particular air quality, has to be able to bring all of that together.
DR. PATRICIA BRENNAN: No longer are we able to exclude health and non-health data. We must bring this all together. It requires us thinking in new ways. Multi-scaling from cells to society. We also have to recognize there are overlapping ontologies. We need to partner with our users, our patients, as well as our scientists, our policymakers, and our clinicians to be able to support their mental models for reconstruction.
DR. PATRICIA BRENNAN: For bringing the information of the world that they need to match their worldview to understand what is going on. This is going to challenge us in new ways in defining what is truth, what is accuracy? Who gets to be privileged to say what one specific definition or one particular problem should be labeled as? I bring another dimension to this when I ask you to think with me for a few minutes about ephemera.
DR. PATRICIA BRENNAN: How do we create standards around ephemera? The thousands and thousands of things that happen to a patient in any given day could in and of themselves be important. Sometimes they're very transient: the gaze on a mother's face or the oxygen content in expired air. We have to understand how we capture and label those to make them meaningful.
DR. PATRICIA BRENNAN: And yet because of the size of data sets that are coming at us, we cannot use the same structure that we've used in the past. Not every data point will have the same level of relevance. And we have to take new models, new probabilistic models, to determine what should we be investing our data standards in. What should we be using to describe phenomena? Finally, we must engage with communities and promote access not only to individual literature elements and scientific data resources but also to standards to make sure that we help people make data meaningful in their lives.
DR. PATRICIA BRENNAN: One of our strategies to address the second pillar of the National Library of Medicine strategic plan, to make our resources accessible to more people in more ways, requires that we rethink how we apply terms to data element descriptions and to the literature so that we are able to increase the interaction, to increase the alignment between the vernacular that a questioner might bring to us and the formalized way that we've defined or described a particular resource that's in our holdings.
DR. PATRICIA BRENNAN: Fundamentally, as I identified earlier, the National Library of Medicine is a source of trusted health information. But as we move into an era of unstructured data and create a strategy where our standards move to a more atomic level and analytical tools help us bring together data in ways that provide insights or treatment direction, the question of what is trust becomes critical and must be thought through carefully.
DR. PATRICIA BRENNAN: We recognize that the answer to a particular question is one hallmark of a trusted resource; the repeated coalescence around the same answer is another hallmark. But now, as we're moving into a more robust yet more flexible way of understanding health phenomena, the desires of patients for certain types of care, and the response to treatment, we need a different kind of pathway towards trust.
DR. PATRICIA BRENNAN: Our chain of trust has to rely not only on the correctness of an answer according to a given perspective, but also on the strategies used to acquire the answer. We must display, we must evaluate the impact of algorithms such as machine learning algorithms on search to be sure we are remaining true and trustable, exposing the literature as it needs to be exposed, not privileging one view over another. And yet this is challenging our understanding of trust, requiring the user to engage with us more to determine how one creates and evaluates believable knowledge in a community.
DR. PATRICIA BRENNAN: Our responsibility to create a library that can drive towards a future of public engagement, rapid response, highly personalized treatment, and the development of novel therapies requires that we bring the concepts and the values that standards have given us, to bring order to knowledge and to structure and retrieve it in efficient ways, to a whole new type of data. I hope you'll join me on this journey.
DR. PATRICIA BRENNAN: I hope you have guidance and advice for me because our expectation is that serving society will happen through partnerships like the one we are experiencing today. I thank you for this honor. I thank you very much for your time listening to me and letting me explore with you my ideas of how standards might drive us to get value out of unstructured data. I invite you to stay in conversation with me.
DR. PATRICIA BRENNAN: I write a blog each Wednesday morning, "Musings from the Mezzanine." My email address is on the screen and also my Twitter handles. Thank you so much for your time. And I believe I've left us some time for questions and comments.
TODD CARPENTER: Thank you so much, Patty. Fantastic talk. Thank you. And I will stress to all of those listening, "Musings from the Mezzanine" is a brilliant, brilliant thing to follow. Really interesting things posted there regularly. I really enjoy reading it.
DR. PATRICIA BRENNAN: Thank you.
TODD CARPENTER: We do have a number of questions that have come in. If you have any questions for Dr. Brennan, please put them into the Q&A. We do have some time, maybe 10, 15 minutes. Thank you for giving us the opportunity to pepper you with questions. I'm actually interested in what you were saying about the new streams of data coming in.
TODD CARPENTER: And I'm not sure this is necessarily unstructured, but it's certainly heterogeneous. I'm wondering about the ways in which the medical community, or the NLM maybe, in particular, is consuming the-- you were talking about how people want to make their own health stories and are interested in their own health stories. And a lot of people are tracking their health vitals, you know, with Fitbits and watches and whatever other devices to track and monitor their own health situations.
TODD CARPENTER: I'm wondering what sort of challenges you see in managing that and getting clinicians access to it? A lot of this is often-- who knows if you can get information from Apple or if it's managed in other ways. I would think this is really helpful in telling those stories, but can clinicians get access to it and get access to it at scale?
DR. PATRICIA BRENNAN: Thank you, first of all, for your kind remarks about "Musings from the Mezzanine." I have to just pause for a minute and tell you that it results from a lot of people's contributions. I work with a tremendous communications staff who translate my language into English, which is really very helpful. I also am very delighted to have contributing writers. And anyone, by the way, from the NISO community who would be interested in this platform, please do reach out to me because we do often have guest bloggers who bring their specific perspectives, enriching our conversation.
DR. PATRICIA BRENNAN: So let's talk a little bit about Fitbits and what we consider patient-generated data. What is the library's role with this? So the National Library of Medicine houses mostly bibliographic citations and genome sequences because of our special relationship around making data public. And genomic data, of course, is one that we need to be sharing because so many discoveries are happening so quickly.
DR. PATRICIA BRENNAN: It was important to make sure that all researchers could have access to information without restriction. We have a very special responsibility, though, to make sure that patient-specific information is protected. And so access to dbGaP, our Database of Genotypes and Phenotypes, for example, is highly restricted and requires permission. A researcher must apply for access.
DR. PATRICIA BRENNAN: Must ensure they have human subjects protection and must meet our security requirements. Within dbGaP, though, we have records from people's Fitbits where we can try to align their walking cadence and their genomic structure. In some cases, one of our studies has records of daily eating habits. And so we have the ability to begin to explore the relationship between genomes and then the phenotype of body structure, body size.
DR. PATRICIA BRENNAN: And we also have many, many, many articles that describe the different ways one can use these various patient-generated data as a way of understanding particular health phenomena, whether it's anxiety or exercise tolerance, or what have you. Now I'm going to take you back a little bit in my own career to the mid-2000s when we were working with the Robert Wood Johnson Foundation, specifically, on how do you build tools to help people self-monitor their health.
DR. PATRICIA BRENNAN: And one of the things we learned is that people make very interesting and sometimes idiosyncratic interpretations of the data. So we take, for example, the idea of how do I know I'm healthy today? For some person, that's walking six blocks. For another person, that's picking up their grandchild. For a third person, it's being able to sleep all. So the indicators of what constitutes health to me can vary.
DR. PATRICIA BRENNAN: In addition, sometimes those indicators explain to the person something that's meaningful, to activate them towards health. They may not have any significance to the clinician. And this is where my concept of overlapping ontologies comes into play. Data that is meaningful to a person, that drives them to take health action, is as valuable as the data that might be an indicator of their current health state.
DR. PATRICIA BRENNAN: They are different perspectives on the same phenomenon. I don't believe the NLM will ever provide a way to put standard expressions and interpretations around patient-generated data because it is an idiosyncratic process. But we will be, and we hope to be, part of the teams that are fostering the ability to capture that information and make it accessible. The FHIR data exchange format is actually being used by Apple Health to extract information from clinical records and put it on smartphones and vice versa, to be able to provide your information from your phone to the clinician or a researcher.
DR. PATRICIA BRENNAN: What I will challenge you to think about here with us is that the policies are not yet keeping up with the technologies. We've got to make sure the policies around unfettered access, lack of intrusive observation of individuals, and individual rights to data are well respected.
TODD CARPENTER: Of the things that you have developed, what resources or programs that are available for public use are you currently most proud of?
DR. PATRICIA BRENNAN: I should have been prepared for this question. I have to say, I'm most proud of the resource of our human workforce that has been able to pivot and be responsive to both emerging needs for technology and the emerging ability to make effective use of technology. So our Medline 2022 program is really essential because it's going to make sure we can manage this ever-growing number of citations.
DR. PATRICIA BRENNAN: Or we were able to pivot our use of the Sequence Read Archive, which was largely built as a research repository-- pivot that for public health use. So the Sequence Read Archive-- SRA as we refer to it-- is now a component of a program called Trace, which is looking for new targets for vaccines and new targets for therapeutics around the coronavirus infections. In addition to that, we work very closely with the CDC to monitor the presence and the variations in the virus itself.
DR. PATRICIA BRENNAN: So I'm proud of our ability to be able to be responsive to a public health emergency while remaining true to our basic principles of acquisition, preservation, organizing, and sharing.
TODD CARPENTER: There was a, earlier today-- and I have no expectation that you were part of this conversation. But since you were talking a little bit earlier about AI and the use of AI and the applications of AI amongst some teams that you're working with, there was a conversation earlier about interests related to standards as they focus on data access and sharing. And since this is an important element of where NIH is headed, I'm wondering if you had any suggestions about, from your perspective, from the NIH's perspective, that is related to how publishers and others who provide information can provide access to machine computation in support of AI learning and AI services?
DR. PATRICIA BRENNAN: So the move to append data sets or to point to data sets from articles has been really critical as we move into this new era. We have a grammar for an article. We know what the title and the abstract are, and we know how these things fit together. We don't have a grammar for models yet. So one of the things that could be very helpful in accelerating the effective and fair use of AI is developing the grammar, the standard structure for how one records and reports not only how an algorithm was built, but how it was applied to a specific data set.
DR. PATRICIA BRENNAN: A third area that I was speaking of this morning in a meeting that most of you were not involved with, is to help us come up with the metadata to describe data sets. How do we know this data set is robust, appropriate, comprehensive, representative, and good enough for the purpose for which it's being used? We see these challenges come frequently when AI models are built on a certain data set. And what we get is maybe a description of where that data set was located and not how big is it, how sparse are the data elements, what's the variability that one would expect.
DR. PATRICIA BRENNAN: And while we used to believe-- and I say "we" as meaning the AI thinkers and aficionados like myself, "used to" meaning about 10 years ago-- that AI was robust enough to perturbations in the data and we didn't have to really worry about this, we're finding that good AI requires well-behaved data sets. That doesn't mean they have to be collected under the conditions of precision of a research activity because there are ways to make modifications, if you will-- error estimates within the AI algorithms.
DR. PATRICIA BRENNAN: But the data sets themselves still have to be traceable. They have to be knowable. The data has to be reproducible in a way that makes it possible for us to know how much this can be trusted. So I think, in terms of standards, if we can develop the grammar for model communication and if we can develop the metadata for data set integrity, we would go a long way. And I would be very appreciative to see that in our operation.
TODD CARPENTER: And we'll take a note of that as we're trying to gather ideas and things that NISO could possibly move forward with. One last question for you, coming from one of the participants from Uganda actually. And since you mentioned some work that you've been doing in helping to build a data culture and information infrastructure in Africa, what kind of measures can a library undertake to collect core data related to their community? You have a special place in a special institution where people are probably feeding you more data than you could possibly consume.
TODD CARPENTER: But in an institution that doesn't have those resources, how might they collect information that is core to their communities?
DR. PATRICIA BRENNAN: So this is a really important question. And it actually challenges a little bit of what we think of as what a library should be collecting. We have partnered with publishers for years as a way of assuring that the content of the scientific communication unit has been assured by experts in that field, and we collect based on their recommendations. When we move into community-level data-- whether it's air quality, potable water, the number of streets in a town-- we're moving into a data model where that intermediary-- the imprimatur of accuracy-- is not there.
DR. PATRICIA BRENNAN: So the library takes on a new role in both communicating to its stakeholders, its patrons, about the nature of how much review has happened with this data and, also, the strategies to connect to it. Now, we connect to data resources that are produced by the federal government. Some that are produced by professional societies. And we do endeavor to connect with some permanence.
DR. PATRICIA BRENNAN: And we check our linkages, and we make sure we're able to continue to find things. In lower-resourced environments, network participation, network collaboration becomes really critical. And I'm curious to watch over the next 10 years to see what we will see happening in terms of a shift from every library creating its own collection, to the creation of shareable collections. Now, this is not uncommon across shared catalogs in the US.
DR. PATRICIA BRENNAN: We see this already coming to the fore. But there will be, and there will always be more data than we can possibly manage at any level. And, so, finding efficient ways to build and leverage connections by partners is, to me, one strategy that I think is absolutely critical. The last piece I'm going to bring in is from my engineering background. And that is, we need to not overlook the value of sampling.
DR. PATRICIA BRENNAN: So collecting large amounts of data and retaining them is only one way to understand the community. Periodic sampling into the community-- surveys, walkthroughs-- is also a way to generate data about a community in a moment. And I think we will need to be looking into the future, making a careful distinction between data that must be preserved and knowledge or awareness that may be preserved without necessarily preserving the data at every point in time.
TODD CARPENTER: Well, fantastic. That's a great way to end. Every institution is part of a community. And regardless of the size and the reputation and even the budget or the impact your institution could have, we all participate in a great networked world. So, Dr. Brennan, this has been a fantastic talk. It's such a great pleasure to speak with you.
TODD CARPENTER: And thank you for sharing your ideas with the NISO community.
DR. PATRICIA BRENNAN: Thank you for stimulating these questions. I mean, I appreciate it very much.
TODD CARPENTER: Great. And really appreciate all of the work that NLM does and all of the things that they do in contributing to NISO's work as well. So thank you.
DR. PATRICIA BRENNAN: Thank you so much. Bye-bye now. Good luck with your meeting.
TODD CARPENTER: Thank you. [MUSIC PLAYING]