Name:
NISO Two-Part Webinar, Discovery and Online Search, Part One: Drivers of Change in Online Search
Description:
NISO Two-Part Webinar, Discovery and Online Search, Part One: Drivers of Change in Online Search
Thumbnail URL:
https://cadmoremediastorage.blob.core.windows.net/c3806add-a818-49f2-8ed9-9eaca157007b/thumbnails/c3806add-a818-49f2-8ed9-9eaca157007b.png
Duration:
T01H37M24S
Embed URL:
https://stream.cadmore.media/player/c3806add-a818-49f2-8ed9-9eaca157007b
Content URL:
https://cadmoreoriginalmedia.blob.core.windows.net/c3806add-a818-49f2-8ed9-9eaca157007b/NISO Two-Part Webinar%2c Discovery and Online Search%2c Part One.mp4?sv=2019-02-02&sr=c&sig=XdICxeQ2lNomzx0LFVb80Fz9hSdrmVu6UhPl9ksOERE%3D&st=2024-10-16T01%3A10%3A27Z&se=2024-10-16T03%3A15%3A27Z&sp=r
Upload Date:
2023-08-08T00:00:00.0000000
Transcript:
Language: EN.
Segment:0 .
All right, good afternoon and welcome to the NISO monthly webinar for June 12th, 2019.
My name is Todd Carpenter. I'll be the moderator for today's program. The session today is part one of a two-part program this month focused on discovery and online search. Today's session is on drivers of change in online search. Next week, on June 19th, we will hold part two of this program that will focus on the issues of personalized content and personalized data.
So there are two components to today's program. There is the online presentation through the Zoom interface — you should now see a welcome screen on your screen — and the audio is delivered through voice over IP. Alternatively, you can dial in using the information provided with the login instructions. If you need technical assistance, you can get that via live chat at https://support.zoom.us. Look for the live chat section, and you'll need today's webinar ID, which is 266685381.
We'll post that into the chat section so that you have it. Some frequently asked questions about our programs: we will be making the slides for this event available and will post them to the website for this event so that you can get access to them. We are also recording this program so that you can listen in at a later time.
You'll see an email from our staff with links and information on how to get access to that; it usually takes us about 12 to 24 hours after the event to get that posted out to you. As I mentioned, part two of this session will take place on June 19th, focused again on personalized content and personal data. We will also be participating in the American Library Association meeting, June 21st through 24th, in Washington, DC.
On Friday afternoon will be the 12th annual NISO/BISG forum. That's a partnership program that we have with the Book Industry Study Group on the changing standards landscape. It's a free event Friday afternoon from 12:00 until 4:00 PM. We also have a couple of other programs: the NISO annual members meeting will take place on Saturday, and on Sunday
we will be doing a program on the RA21 initiative and the Coalition for SeamlessAccess. In July we take a bit of a holiday, and we will be back in August with our fall lineup of programs. For more information about all of the NISO events, you can look under the Events tab on the website.
I also want to highlight a new project that NISO has launched. This is a new recommended practice to ensure the findability of video and audio outputs that are being produced by scholars and researchers and are being collected by publishers as well as libraries. We issued a press release just a couple of weeks ago announcing the launch of that project.
We're looking at metadata surrounding audio and video outputs, particularly for the scholarly marketplace. We're also looking for volunteers to participate in that working group, so if you're interested, please take a look — I wanted to draw your attention to that project. So, there are two core functions of a library.
The first is to gather content for the community that they serve, and the second is to provide tools to identify and retrieve the relevant items within the collection. As such, discovery tools of various sorts have long been core features of libraries: indices, union catalogs, card catalogs, and now online discovery tools are all examples of services that libraries have provided over the years,
which patrons have used to navigate the collection that the library has curated. Many modern search and discovery tools have their roots in library systems — Google, for example, was an outgrowth of a library bibliometrics and citation-tracking approach to identifying relevant content for users. What has happened over the years is that these principles and these services have been applied well beyond discovery of text-based content and extended to other sorts, such as images, music, video, even physical items.
But the way in which we interact with these systems is also changing, expanding from a simple text box or Boolean search to voice interaction, even image, sound, or pattern matching. Yet many of these advances are taking place outside of the context of library services. Most users have more common interaction with advanced search and discovery tools in their daily lives.
What does this mean for the future of search and discovery in libraries? Which technologies are most important and most relevant to build into library services? And how do we match user expectations that have grown out of their day-to-day use of these discovery services, when libraries might not be able to deploy them cost-effectively, or even technologically?
And how are the attitudes and expectations changing? And how do younger generations of users have different expectations when it comes to search and discovery? Now, what can libraries also expect from vendors and suppliers about interoperability and coverage of these services, and how might they work in the future? These are some of the topics that we'll be covering in our session today and next week.
We'll dig deeper into issues related to personalization, customization and privacy. And kicking us off with these topics will be Jim Hahn, who is Associate Professor and Orientation Services and Environments Librarian at the Undergraduate Library at the University of Illinois. Jim will be talking about student perspectives on some of these discovery services.
So Jim, I'm going to stop sharing my screen so you can pull up your slides and get ready to go. OK, thanks so much. So I'm going to start out by talking about a project where I interviewed a group of students who had been logging into their library account using our mobile app.
And one thing that I wanted to learn from them is not how great and wonderful the mobile app was, but really what they think about personalized recommender systems that respond to items they might be interested in. And the reason why — sort of the motivation — was one really big gaping hole that I saw: even though we've seen these systems for some time now, we really didn't have a firm grounding in the students' perspective.
And I think that because it's understudied, it was a good place to start, especially since academic environments are starting to be interested in implementing these. And certainly this isn't a new area for student services; there's a lot of personalization going on currently. But what I needed to do before I could dig into the interviews was develop a rubric, or explore what other researchers have thought might be a great rubric.
So in that regard, I wanted to understand the features and functionality. This rubric came from a literature survey, and it basically covers different perspectives of the user while they're interacting with these recommenders. As a big overview, the three areas that I'll report on today start with asking students: how should a library generate recommendations?
There's a range of ways we can do this, and so I had some prompts on what we were doing and what we're not doing. And then I also asked them how they would like to see these displayed to them — is it something that should be prominently featured when they log into their account, or is it something they should click on to generate? And then there's also a third step in the literature about critiquing your recommendations.
People will be familiar with, like, a star rating system. That system has kind of fallen out of favor in recommender systems more recently, mostly because those can be gamed or rigged — basically sending false signals. So the star rating has gone out of favor. But what's interesting is some of the student perspectives, which I am going to dig into here in a moment. I also had this belief, in implementing these personalized recommenders, that when we do this as a library project, the academic library controls the service.
And we can implement it consistent with user expectations for personal data reuse and for recommender system transparency. What was interesting is we did have some graduate students actually bring up the topic of transparency and these types of systems, so their thoughts fed into some of the findings as well. Very briefly, in terms of methodology — I won't spend a lot of time on this — I did get approval to interview students, and these students got emails from me asking them to participate in a brief interview.
We did give them a whopping $10 gift card to our campus bookstore. These interviews took place over basically the past year, and so we have several hours of qualitative data on student perspectives now. So this is more of a starting place rather than a finished project, but we're starting to have some grounding in what the student perspectives may be.
And I think we can expand this out into a larger survey now. So I will say, broadly, what I started to see in some of the interview results was a surprisingly conservative perspective on the way that these systems and libraries might generate their results. What I mean by that is students didn't necessarily want the library adopting some of the internet-based, surveillance-capitalist modes of
logging clicks or storing searches. So they had some suspicion of these practices, and I'll dig into this more in our findings. But they did not want all of their clicks to be data mined, and they had some good reasons for that. Most of them did not want everything to be recorded; they understood that that could happen online and that it does happen online.
One of the reasons a student gave was, well, the things that I click on are for a course — the student didn't necessarily think that was the right signal to send, because that's not how they thought the recommender should work. Some students viewed recommenders as something more akin to the public library's role in readers' advisory, and certainly academic libraries with fiction and literature collections do
readers' advisory as well. But if a student is doing a research paper for an assignment and they're clicking around related to, say, a history paper, they don't want those clicks necessarily to feed into a readers' advisory. So I think they had good reasons for that. Some other suggestions came up, too — it wasn't that they didn't feel the library had a place in making recommendations.
They had these alternative ideas. One idea was a startup page where a student can just select topics — kind of curating some starting points of what they're interested in. Another interesting thing I heard from a graduate student was that areas from different fields might be useful in her research, and she may want to use a recommender system to help support a perspective outside of her disciplinary focus, so that she can see how overlapping concerns in different research fields might help her project.
One thing that this recommender may be able to do is introduce her to that vocabulary, introduce her to those authors or those seminal texts that might allow her to get a grounding in approaching the problem from a different discipline. So that's one area where a recommendation system may certainly help inform a graduate student's research. Now, still talking about the generation of recommendations, one concern was that students really wanted to have an idea of how items were recommended.
So one suggestion was a simple text box that said this title was recommended because you searched x, y and z. Another student — I believe this was a graduate student — said transparency is important, so folks should know what this is based on; the Amazon system isn't necessarily as transparent, and that could make this distinctive. In terms of displaying the recommendations, there was general agreement that they wanted some way to revise them after they saw them.
So there's definitely a need, when displaying these, to be able to somehow interact with them. And with respect to the display prompt we gave them, some themes: for mobile displays, students said quick scanning of information — mostly they just want to skim through items. So it should be easy to obtain recommendations, and it should be quick. And kind of following up on rating,
students thought perhaps rating fiction would be acceptable, but they kind of distrusted actual reviews. I think this underscored that undercurrent of distrust and savviness on the part of these graduate students, who approach online recommendations very skeptically. They understand that there's politics and personal opinion, and they didn't necessarily want that seeping into how they're performing their research.
And we had several students bring up some caveats in terms of retaining searches. This gets at one of the key concerns, which was that students didn't really want automatic retention of anything. Overall, the theme of data stewardship came up in support of how to use the system: students said that not having to rediscover something could be helpful. They may be working on a paper
and want to go back and revisit things that they've overlooked. We've seen this comment before in terms of maybe storing their history on their phone, where the library doesn't necessarily have access to that. So it's really something that, when it's brought up, never really leaves the phone, and the library doesn't necessarily offer stewardship of their browsing history.
Now, in interpreting the results as findings, I call some of these themes emergent in the sense that we still want to do a broader survey after having sat down for these interviews. But overall, the students did believe there is a place for recommendation systems in academic settings, and this finding about increased transparency was, I think, important, along with attending to student privacy as a system design issue —
designing in privacy and foregrounding it ahead of any system capabilities — and then the interdisciplinary research support, which could greatly help students. Students seemed pretty savvy about the fact that mining of clicking and tapping items just does happen in commercial systems, and I'll show you some quotes on that in a moment. As I mentioned, graduate students in particular thought that this was helpful in just understanding the depth of various disciplines.
And these recommenders aren't simply like search engines; they're also designed for novelty and an increased browse capability in the library collection. So, moving on to the scarier part of what students surfaced: one undergrad student said that on the internet, you might be interested in finding information about something but not want to buy it.
And sort of this started to surface ideas about YouTube, which was that it gives you recommendations that are based on maybe too small a data point, maybe one video you saw, and then you start to see so many things just like what you had just viewed. And, you know, to the student's credit, they said, this is not really a good way for them to learn. So they understood that, you know, seeing the same information wouldn't necessarily advance your research.
And then we did have students say that they thought the algorithm is listening to them — that something is following them around the web; as they search for something on the phone and move to desktop, they see the same ads — and students thought, in terms of information search, that when they see a prompt to buy, this felt predatory. And this led me, in terms of interpreting some of what students were saying,
to Shoshana Zuboff's new book, The Age of Surveillance Capitalism, which talks about how behavioral data has become a de facto part of the way internet corporations operate. And I have two illustrations to underscore this point. Initially, as Zuboff writes in The Age of Surveillance Capitalism, users' behavioral data served to improve the internet platform.
And Zuboff says that Google realized they had so much data exhaust that could be monetized. In the second rendering here, we can see what Zuboff terms the discovery of behavioral surplus, which essentially means that data that doesn't necessarily lead to service improvements is bought and sold by third-party data aggregators or data markets.
There are prediction products — recommenders are certainly a part of prediction products — there are markets of future behaviors, and also other surveillance revenues, which really is about squeezing as much profit as possible. And I'm not against capitalism. Not to go too far afield, but what Zuboff says here is really that the traditional capitalist mindset is about an exchange.
Typically there had been an exchange of some sort of goods, and the individual is rewarded with something — like, we have a better Gmail now — and in that exchange there's usually consent. So this is related to privacy, but it's perhaps a broader and more disturbing issue than privacy. With regard to personalization and surveillance capitalism, my view is that these recommenders are a quintessential attribute of surveillance capitalism.
And when we look to the library world and how we design systems, I think that there's something about what students want to see in a system that really foregrounds their ethical preferences. There was this article by Kraemer and her colleagues on an ethics of algorithms, which argues that we should really allow the user to choose the circumstances in which that user is situating herself.
We need to leave it to the user to specify those ethical parameters, and if that's not possible, the ethical assumptions of the algorithm should at least be transparent and easy to identify. Library-based recommenders have started to work their way into vended products, and I think as libraries assert ownership over these, aspects of ethical design and preferences can be a distinctive part of a student's library experience.
There's certainly been a lot of work in learning analytics that could help inform and guide ethical considerations, especially those associated with data reuse. Learning analytics is another data-intensive trend in academic librarianship, and there's some really good emerging scholarship on the ethical need for not just privacy but also approaches to user consent and reuse — making these principles a keystone of how libraries can advance discovery while at the same time referencing our professional ethics.
So those are some of my findings, and I'd be happy to take questions at this time. Great, thanks, Jim. I want to encourage the participants and the attendees: you can use the Q&A functionality. If you're in full screen mode, you need to hover over the window to find it — it's a little folder icon.
You can type in any questions that you have for any of our speakers. We log those questions and we'll moderate them for you. Jim, one question for you. A couple of questions for you. The first is your students showed a pretty decent amount of savvy when it comes to understanding and appreciating these discovery systems and understanding that there might be maybe financial considerations that are exposing content to them, particularly when it comes to things like YouTube.
Did they mention or discuss or consider the transferability of that to library services and the content that's being provided through library services? Did they have any questions about those, and did they question the business implications of what's going on in library discovery versus, say, YouTube discovery? Is that something you explored? Well, I would say that students viewed the library as separate from that.
When students brought up library discovery, they talked about VuFind; they talked about our discovery interface. And, you know, I think it was contrasted as perhaps having some limitations, in the sense that it didn't necessarily, quote unquote, know your searches. And that was surprising to them, because they thought, well, I've searched a bunch in this discovery interface and it still doesn't, quote unquote, know me.
So I think they didn't hold up YouTube as something that the library should necessarily model. But there were aspects of how data seems to follow you around — students certainly did understand that there could be benefits to certain parts of that, but as it applied to their learning, they didn't necessarily want it to follow a YouTube model per se.
And related to that, when it comes to training some of these systems — training some of the AI, the machine learning that goes into voice recognition or discovery services or, say, pattern matching — it sounds like they expressed some concern about the use of that information, maybe for secondary purposes.
Yeah, what I talked about in our system was how we could take a look at the labeled data that we have, which is our subject data. Libraries are in a really good place to jump into machine learning because there's a lot of labeled data in our catalogs. We have items with subjects, and what I talked about with students was taking aggregate data — so not individual, but doing data mining on aggregate topics that are checked out together.
So we can't look at individuals that way; we look at topic clusters, and I can tell them the algorithm that we use. The algorithm is easy to find on Wikipedia and you can read about it: it's a frequent-pattern algorithm, FP-Growth. That algorithm is older, but at the same time it's something we can point to and know exactly what's happening with.
And the data are really an analysis of how transactions in the library are occurring. So from a certain perspective, it's how the library is using its transactional data to improve services for users. I think it fits into that first slide about using data to improve services, whereas the data is not resold anywhere. And I think students seemed fine with this.
I didn't hear any concerns; they weren't voiced to me. But I think when we do a broader survey, that would be something to get more information about — about our approach. Yeah, I guess I'm less focused on the work that the library is doing per se than the work that some of the vendors creating those systems are doing,
you know, since in many cases a lot of these discovery services are third-party or vendor systems. Yeah, each one has to be evaluated on its own merits. And I think that's something that needs to happen as sort of a best practice for adopting these services — like, where does the data go after students have interacted with that service?
So we should ask those questions before those services are adopted, not after; that should really be foregrounded. Individualized user data has such an afterlife that things that go into a personal voice assistant can be bought and sold many times after they've been uttered by the user. Yeah. All right, great.
So I want to remind people again: you can type in questions at any time. We log the questions on the back end, so if you have any questions, either for Jim or for the other speakers as we move forward throughout the program, please feel free to type that question in. Actually, before we move along —
one quick question. Did you implement — and I'm not sure, this may be a mistype, but it says FP growth — with circulation data or click-through or online journal usage? It was — yeah, FP-Growth was used with — we used data mining on circulation transactions.
But those transactions had to be enriched with topics, and once the topics were established, then we did data mining on the topics themselves. So that's book circulation data. I think the next interesting area for some of this would be applying it to other areas, like journals, but the journals haven't been done; it was with circulation data. OK, great.
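To make the aggregate topic-mining approach Jim describes a bit more concrete, here is a minimal sketch using the open-source mlxtend implementation of FP-Growth. The sample "transactions" (sets of subject headings checked out together), the topic labels, and the thresholds are illustrative assumptions, not data or code from the talk.

```python
# Minimal sketch of aggregate topic mining with FP-Growth (not the speaker's
# actual code). Each "transaction" is a set of subject headings that circulated
# together; no patron identifiers appear anywhere in the data.
import pandas as pd
from mlxtend.preprocessing import TransactionEncoder
from mlxtend.frequent_patterns import fpgrowth, association_rules

# Hypothetical, de-identified checkout sessions enriched with subject topics.
transactions = [
    ["United States -- History", "Slavery", "Civil War"],
    ["United States -- History", "Civil War", "Reconstruction"],
    ["Machine learning", "Python (Computer program language)"],
    ["Machine learning", "Statistics", "Python (Computer program language)"],
    ["Slavery", "Reconstruction", "United States -- History"],
]

# One-hot encode the transactions into a boolean DataFrame.
encoder = TransactionEncoder()
onehot = pd.DataFrame(encoder.fit(transactions).transform(transactions),
                      columns=encoder.columns_)

# Mine frequent topic itemsets; min_support is an illustrative threshold.
frequent_topics = fpgrowth(onehot, min_support=0.4, use_colnames=True)

# Turn frequent itemsets into "topic A tends to circulate with topic B" rules,
# which can then back a recommendation display.
rules = association_rules(frequent_topics, metric="confidence", min_threshold=0.6)
print(frequent_topics)
print(rules[["antecedents", "consequents", "support", "confidence"]])
```

Because only co-occurring topics are mined, the input never needs patron-level records, which is consistent with the aggregate, non-individual approach described above.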
So thank you so much, Jim. I really appreciated your talk. We'll move on to our next speaker. Kelly, you can pull up your slides here while I do a short introduction for you. Kelly Dagen is Research, Instruction and User Experience Librarian at Amherst College. Kelly will be discussing the role of voice interaction in discovery
and some of the implications for libraries and the academy. So, Kelly, I can see your slides. You're all set to go. All right, great. Thanks so much for that introduction. So welcome, everyone. In this section, we'll be diving into the user experience of voice interfaces specifically and thinking about some implications for academia.
So as Research, Instruction and User Experience Librarian at Amherst, I get to wear multiple hats across teaching, research, supporting faculty and students, and leading library UX and assessment projects. If any folks out there aren't familiar with UX, or user experience, the shortest possible definition might be: it's a field that investigates how to make things that are useful, usable, valuable and accessible for the people you're designing for.
So a brief overview today, we'll be talking about types of voice interfaces, some broad themes affecting voice search and academic discovery, user hopes and potentially expectations for such systems and some associated requirements for a good user experience. But first, there are a lot of terms flying around when it comes to voice related technology. So let's just get some definitions started. Voice devices come in many flavors.
Basically, voice user interfaces are designs in which voice is a primary mode of input or output. They can be voice only, or they can be multimodal with other interfaces like a screen. Voice assistants are a form of intelligent personal assistant, or digital assistant, that uses voice as an interface — so, all those names that we keep hearing about. For the purposes of this talk, I'll be focusing on voice assistants and voice search.
So nothing about smart fridges. And you may have noticed a lot of voice-related developments happening recently. That brings me to my first driving theme, with apologies to Marc Andreessen. By all accounts, popularity and use of voice interfaces is growing rapidly and seems to be approaching a saturation point. You may have read the news stories about hotel chains putting smart speakers in every room, or the new Kids Edition of the Echo Dot.
So voice technologies are cropping up everywhere, but it can be tough to get a handle on exactly how common or accepted voice technology is. But there are a few things that we can look at as indicators. A 2017 Pew survey found that 46% of Americans use digital voice assistants. There's evidence that younger generations acclimated to voice very quickly. Even back in 2014, a Google study reported 55% of teenagers using voice search more than once a day.
And 31% just under a third, reported using it for help with homework. Today, there's some evidence that voice interfaces might be becoming sort of a background technology for earlier education and childhood, with a quarter of households with voice assistants and children engaging with them for help with homework. Now, if you do even a basic open web search on voice technology, you'll find all sorts of grand predictions and statistics.
A favorite quote is: by 2020, 50% of searches will be voice searches. It is 2019 right now. So it's a little difficult to separate the actual state of affairs from a lot of the enthusiasm going on right now. As a side note, as far as I can tell, that 50% by 2020 quote was actually sort of a misquote from the chief scientist of the Chinese search engine Baidu.
And it was specifically about the Chinese search market, and it referred to both voice and image searching. So reference librarian fun there. But still we can see voice assistants and voice search making their way into people's lives in two main ways through smart speakers and smart phones. Nielsen's latest report in 2018 estimated that nearly a quarter of US households now own at least one smart speaker, and many of them have several.
Back in 2016, Google reported that 20% of all searches through their smartphone app were done via voice. And this is a global trend: a 2018 report by GlobalWebIndex found that 27% of the global online population is using mobile voice search, with the highest use in the Asia-Pacific region. And as of a Comscore survey of smartphone users in 2017, 1 in 2 users reported using voice technology on their devices.
So, things like voice commands and voice search. With voice interactions becoming common practice for many of us, this is likely shaping user expectations. But what's so great about voice? Well, for one, it's an incredibly efficient mode for inputting information. As Purna Virji explains, you can type 38 to 40 words per minute on a mobile device, but you can speak at least 150 words per minute.
So voice can be a form of natural user interface, which means that if the interaction is designed well, it feels intuitive and doesn't require much cognitive effort — compare speaking with, say, typing on a tiny screen with your thumbs. Second, voice is pretty well matched for situations where we're multitasking. A 2018 Google survey found that being able to, quote, more easily multitask was the number one reason people cited for using their smart speakers.
So it's great for situations where our hands and eyes are busy or maybe when we're on the go. Third, voice interactions also appeal to our desires for instant gratification and speed. The number two and number three reasons cited for using smart speakers were to, quote, do things faster and, quote, instantly get answers and information. Smart speakers and voice assistants represent an always on ubiquitous presence that is ready to provide help.
Finally, interactions with voice feel more like social interactions, even when they're not with a human. Speech is so fundamental to our biology that we often don't have conscious control over our reactions, even when we know we're talking to a machine. Our brain often doesn't distinguish a difference, and people tend to apply the same social rules. This led researchers Clifford Nass and Scott Brave, in the book Wired for Speech, to conclude that, quote, voice interfaces are intrinsically social interfaces. Reflecting this, 41% of users in that Google survey said that using their smart speaker, quote, feels like talking to a friend or another person.
So all of these factors might help explain the appeal of voice technologies. And I'd add one more that may be difficult to quantify: it feels like magic when it works. Like, you just say something and the world changes in some way. It's pretty close to wizardry, and we all want to be wizards. We need to be wizards because we have a lot to manage these days. This brings me to my second theme, which is our mobile, on-demand approach to information.
This is linked to the rise of smartphones in academia, which are considered academic success tools in their own right. A 2018 study of undergraduate students and information technology found that over half of students rated their smartphone as very or extremely important for their academic success, and over 80% of them used it for at least one class. Also, smartphones were seen as significantly more important by students who are non-white, first generation, from families with lower incomes, and students with disabilities.
At the University of Central Florida back in 2016, 69% of students reported using mobile apps for learning at least once a week with regular use of Google search and dictionary. So students are engaging with learning resources in a mobile context and using mobile devices. In this time scarce world, we're all seeking greater efficiency and voice search is pretty well established now for answering simple reference questions in practice in multiple voice assistant user surveys, people say that they often use voice assistants for answering a general question or seeking information or searching for a quick fact.
That was the top use of digital assistants in 2019, according to Microsoft. Google is, of course well aware of this trend and they've been taking steps to optimize their search results for voice interactions. The ultimate goal is to provide one single accurate answer to factual queries because of course, in voice search, having a single answer is a requirement. No one's going to listen to 30,000 pages of search results.
When you remember that convenience is a critical factor for undergraduate information needs, the efficiencies of voice search become even more important. Convenience and efficiency needs aren't unique to students, however. Research on faculty practices has found that faculty also struggle to manage time demands and employ similar skimming and scanning practices as undergraduates when searching for information.
Finally, in an always-on, multitasking world, our perceptions toward the getting of information often shift. Searching for material becomes functional, not reflective, and efficiency and productivity become key. Users grab things intending to read or evaluate them fully later. The act of getting isn't necessarily connected to the act of thinking through the topic or question.
In a 2010 study of millennial students' mental models of search, only about a third of participants actually read articles or sites in depth when deciding whether to use them. Usually the decision was made in a matter of seconds. This third trend is pretty broad across academia, and it has to do with personalizing the educational experience. Campuses are looking for ways to stand out, and personalization is one of them.
Voice assistants are being used as one avenue for this work. Three high-profile examples are Arizona State University, Northeastern University and Saint Louis University. All three campuses ran programs with Amazon-based voice assistant devices. Saint Louis went the furthest so far, putting Echo Dots in every student residence room on campus.
They developed skills or programmed voice interactions designed to provide answers to common questions related to campus events, student life, academic support facilities and more. Campuses with these programs have cited benefits, including reducing cognitive load, reducing student anxiety, meeting students' expectations for seamless, efficient experiences, and helping students feel more connected to the University.
Thinking again about voice as a social experience. As I said earlier, common practices often shape user expectations. So with that in mind, let's look at some potential user hopes and maybe expectations for voice-related technologies in academia. Users are already pretty primed to expect voice interactions for straightforward questions and simple interactions.
Even navigating a website to locate answers is fairly inefficient compared to voice. Instead of a website FAQ, voice interactions could provide quick answers to common questions. For example, those campus voice assistants were often programmed with answers to variations on the question of when is the library open? With voice commands and voice assistants,
people are becoming more accustomed to telling their digital assistants to do things like book places, make appointments and set reminders. Integration with library systems would make these interactions more voice friendly and would meet their expectations here. Just a note that it seems like vendors are jumping on board for these needs fairly quickly: OverDrive is developing voice command support for their Libby app, including finding and playing audiobooks.
EBSCO is using its API to allow users to access content via Alexa and Google Home, and Demco has developed an Alexa skill for Discover Local, allowing patrons to check branch hours and services, reserve rooms, place holds and renew items. On another level, there's also a user appetite for making the search process itself more efficient and conversational. So, thinking about our on-the-go context, having the ability to ask an interface to gather, let's say, overviews on a topic or key sources in a field to review later can make a lot of sense.
At the 2018 Electronic Resources and Libraries conference, EBSCO's director of field engineering, Eric Frierson, demoed exactly these abilities using an Amazon Echo to bring up overviews, information on specific topics and related works. Now, this set of use cases in particular requires considerable work, but these are all potentially appealing to our time-sensitive users. As we've already explored, voice assistant users are getting pretty acclimated to factual results through voice search; asking a digital assistant to find overviews in order to get started on a topic could represent a next step in that process.
In a faculty context, voice may offer greater efficiencies in keeping up to date on their fields. Some enterprising scholars are already creating Alexa skills to find and read the most recent abstracts in their fields, providing an on the go briefing that allows them to quickly judge whether an article is worth reading more in depth. A recent focus group at the Pratt institute, specifically investigated user requirements for intelligent personal assistants in an academic context.
Participants described an interactive tool that helped them customize their searches and find more specific results — the quote was, like having a research assistant in my phone. This would require some conversational capability in voice assistants to support a sustained back and forth with refinement, and at the moment, searching would probably perform best in well-structured, bounded conditions with very specific criteria.
But in terms of ultimate dreams, a focus group study back in 2001 at the University of Idaho hinted at what undergraduate students might dream of in a voice interface for academia. When asked to describe a dream information machine, all of the students described a device using voice recognition and natural language to perform comprehensive searches for them. It would also be portable, of course, and accessible 24/7.
Obviously, we don't have the capabilities for this dream machine, but it certainly exists as a user desire. All voice based interactions, however, come with certain user requirements for the experience. Without addressing these, we'll end up with a lot of problems and a detrimental experience overall. And many of these actually echo what Jim described in his study just now.
The first one: privacy. Privacy is a big concern, and it's become relatively widespread among users with recent news. This has not only to do with how recordings are made and managed, but also security risks in these voice systems. It's already been proven that these devices can be hacked and turned into wiretaps.
Poor differentiation between user voices makes it possible for fraudulent charges or access of sensitive personal data. Users are becoming more aware of and less comfortable with how major platforms are tracking their behavior and using their data. If we don't honor and address these concerns, we're not building a good UX. Accessibility is another key requirement.
While voice interfaces may be more accessible for certain populations, including those with visual and motor disabilities, they can be inaccessible for those with auditory speech or cognitive issues. Transparency and customization are also important. Users can feel disempowered or lack trust when it isn't clear to them how these devices work. They can be frustrated by limited or no ability to customize the operations of these devices.
In fact, trust in this situation can cut both ways. You need users trust in the results and operation. But there's also the potential for users to be too trusting of results. For example, current product marketing practices in voice assistants often involve forms of sponsored information, so recipes, stain removal techniques, allergy management tips. Researchers are starting to indicate that when the information source isn't referenced by these devices, users don't question the quality or the intent of the information provided.
There's a lot of interest, in fact, in how voice assistants and other AI enabled devices affect our perceptions of and trust in technology. One study by the MIT Media Lab in 2017 showed that trust starts early among children aged 3 to 10 years old, interacting with Alexa and Google Home. They found that most of the children had complete trust in the devices and believed that they were being truthful.
This has major implications for how voice based systems would contextualize and present information. If you thought relevancy ranking was important, now it's even more so here. Meeting user expectations is another requirement. Voice based interactions tend to build expectations for a true back and forth conversation, and people generally have poor understandings of the real limitations of these devices.
So you get to see a lot of hashtags with device fails. A mismatch between expectations and capabilities will lead to user frustration, disappointment and disuse. This has been sort of a brief and whirlwind tour of some of the factors and possibilities surrounding voice-based interaction in an academic context. Voice certainly isn't the best mode for every need or context. For example, public spaces can be really poorly suited for voice commands, and research has shown that people generally aren't comfortable engaging with voice interfaces in public areas — and even if they were, it would make things get pretty noisy really quickly. Speaking of comfort, some users just aren't comfortable talking to a computer, period, even in private.
And finally, personal and sensitive content can be really fraught when it comes to voice, for pretty obvious reasons. Besides matching the right need and context, there are major requirements that we'd need to address for a good user experience. If we're thinking about voice technologies, there's a lot of thought and design that would need to go into how the interaction works and ensuring that we are honoring user desires for privacy, accessibility and control, as well as managing trust and expectations.
But it's an exciting time to think about the possibilities. Thanks so much for listening. And I'm open to questions. All right. Thank you, Kelly. A couple of questions come to mind on your talk. Thank you.
You mentioned a couple of vendors — I think you mentioned OverDrive, EBSCO, maybe Demco also — who are developing or deploying some of these voice-automated discovery systems, and they're using Alexa as backend technology. I'm wondering if those vendors are clear that the services are piping data through Amazon,
or whether they disclose that, if that's the case — I don't know how transparent they're being in communicating these services. I mean, that is a huge concern that people have brought up, that when you're using these platforms, they are essentially designed to track people. And so that's something that we would definitely need to talk about as librarians and as a profession. Unfortunately, I didn't get to attend any of the presentations, so I'm not sure how they're framing it, but it would be a really important point not to miss.
Yeah, because a patron might be, for whatever reason, reticent to share their data with Amazon, but not think twice about sending it through OverDrive or EBSCO because, oh well, it's a library service, they're trusted in the library community — not understanding that ultimately the data is getting passed on through Amazon. And that issue of transparency, which you raised, I think is a really critical one.
And simply putting one company's brand on top of another company's back-end services is kind of interesting. Oh, yeah, absolutely. I have a lot of personal anxieties around this topic. But yeah, managing trust by being transparent and making sure that we are not handing over data in ways that users aren't aware of — that's a key point.
Yeah, another point that you raised in your presentation has to do with the comfort level of some communities and some people with using these voice-driven assistants. I'm interested to get your perspective on whether or not that has value for disadvantaged or maybe first-generation students who — if they're anything like my kids — are particularly tied to their phones, talking to their phones.
Do you think maybe that might provide a way to reduce anxiety or engagement with the library and library services? I think that's one of the avenues, like with the campus programs that many people are thinking about is this idea of voice being able to provide a certain level of reassurance and provide a sense of connection and allow students to be able to ask questions and express things that maybe they wouldn't be comfortable expressing to a member of the administration or even a super friendly librarian.
I do think that we would absolutely need to do more research and investigation to make sure that we are honoring and balancing the requirements of trust and privacy there, especially with marginalized populations, because of just the history of surveillance and practices that have gone on in the past. So I think it could be a tool for enhancing connection, but again, it can cut both ways.
OK, all right. Well, fantastic. Both Kelly and Jim raised the issue of privacy and of the privacy implications, and we're just kind of touching on this a little bit in some of these talks. We're going to go full bore into some of those issues next week, when we circle around and talk about personalization and the involvement of user data in these systems.
So that's why we're kind of glancing by some of these topics and not digging terribly deep into them now, but we will be addressing them in a lot more detail next week. So, Kelly, thank you so much for your talk. Thank you. And our third and final speaker today is Chad Mairn, who is Information Services Librarian at St. Petersburg College.
And Chad is going to be giving us a short introduction to the process of building a personal voice assistant, which is a good place to launch into understanding some of the technology involved in these systems and some of the technological issues around them. So, Chad, we can see your slides. You're all set to go.
Although we can't hear you yet, if you're talking. Chad, I'm wondering if your mic is set up properly.
Seems like Chad is having a little bit of a technical issue on his end, I think maybe with his setup. One second while we dig into this from a technological perspective. It was working just a few minutes ago.
We're contacting the speaker directly to see if we can learn what the challenge is at this point. Please bear with us. I am here. OK yeah, we can hear you now. I don't get this. Are you able to hear me still? Yeah, we can. We can hear you now.
OK, I have a little bit of a delay now for some reason, but hopefully it doesn't do too much, and hopefully the video will play — maybe that's part of the problem. Actually, we're not seeing anything on your screen. OK, there we go. You got it. We can now see your slides, and you can hear me and I can hear you.
Excellent, all right. All right, I'll keep my fingers crossed that this works the whole time. OK, all right. So real quick: artificial intelligence. Intelligence in machines is known as artificial intelligence, and according to the Encyclopedia Britannica, it's a branch of computer science where machines have the capability to perform tasks commonly associated with humans —
the ability to reason, discover meaning, generalize, and in some cases learn from experience. So machine learning, neural networks, deep learning, computer vision, natural language processing, and speech recognition — which I'm going to talk mostly about today — all fall under the artificial intelligence umbrella. For me, I'm mostly confident that the main goal in creating an intelligent machine is to alleviate some of the mundane work we do so that we can focus more on being human: to be creative, maybe to think more and be more innovative, to basically have more time to experience what truly matters in the world to you.
So hopefully that's why we're building these intelligent machines. I don't know, maybe they're going to take over the world — who knows? So real quick, a few types of artificial intelligence. Reactive AI doesn't really have a concept of the world — think of IBM's Deep Blue, which beat the champion at chess. Limited memory is where the digital assistants are in this typology of AI,
and what they can do is consider past pieces of information and add them to a pre-programmed representation of their world — so self-driving cars are in there as well. Theory-of-mind machines have the capacity to understand thoughts and emotions; they don't exist yet, but I'm sure they're coming — think C-3PO and R2-D2. And then self-aware machines can form representations about themselves, by themselves.
So this doesn't exist yet either, but I'm sure it's not too far away. All right, so jumping right into this. I manage an Innovation Lab here, a little makerspace, and we wanted to start exploring the basics of AI to have a better understanding of what it is and what it is capable of doing. So our first attempt with AI was playing with a fairly sophisticated Python program where you would play a game of tic-tac-toe against the AI.
It was pretty fun, but it was a bit abstract, and we wanted to start playing more with hardware. I guess we could have created a robotic arm that could have written down the X's and O's, but we didn't really get into that. So what we ended up doing was a simpler route: we purchased two AIY kits — a Vision kit and a Voice kit — and I'm going to discuss how we set up the Voice kit today.
AIY, by the way, stands for artificial intelligence yourself, kind of a play on DIY, do it yourself. OK, all right. And by the way, you don't need to buy a kit to do this; you can go out and buy the various parts and just set it up yourself.
These are all fairly inexpensive pieces of equipment. One of the main parts is the Voice HAT — that stands for "hardware attached on top." I ended up getting the version one kit, and that's where the Voice HAT comes in; version two has what's called a Voice Bonnet component, which is just a smaller version of the Voice HAT. This connects to the Raspberry Pi, which is the brains of all of it, and that creates a natural language processor that responds to your voice, connected via Google Assistant.
You can see it's kind of a small picture, but the Voice HAT includes quite a bit of stuff. They consider this the hackable component, where you can control motors, LEDs, inputs from different sensors, and GPIO connections for other devices. So this can do a lot, as can the Raspberry Pi, but essentially it facilitates audio recording and playback: it connects to the microphone board and speaker, and it has an optional connector for a dedicated power supply.
You can see a bunch of other little connections there — I'm not going to go into all those, but it's got a lot of things you can control. This HAT doesn't process or recognize voice, and it doesn't generate speech; it's an interface, basically. It takes input from the microphone, the microphone does some filtering to eliminate noise, and it sends the audio to Google Cloud servers.
Then you can call functions — blocks of code that perform an action — through Google APIs, and we're going to look at a few of those today. The Raspberry Pi interfaces with those APIs; the Google APIs process all of this and send it back to the Raspberry Pi, and then the HAT amplifies the audio out to the speaker.
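As a rough illustration of that round trip — microphone in, Google Cloud processing, speaker out — here is a hedged sketch of the kind of loop the kit's demo scripts run. This is not the actual AIY code; the three helper functions are hypothetical placeholders standing in for the kit library and the Google Assistant API client.

```python
# Hedged sketch of the Voice Kit's request/response loop, not the actual AIY demo.
# The three helpers below are hypothetical placeholders: in the real kit they are
# provided by the AIY library and the Google Assistant / Cloud Speech APIs.

def record_from_microphone() -> bytes:
    """Placeholder: capture audio from the Voice HAT's microphone board."""
    raise NotImplementedError

def query_google_assistant(audio: bytes) -> bytes:
    """Placeholder: send audio to Google's servers and return the reply audio."""
    raise NotImplementedError

def play_through_speaker(audio: bytes) -> None:
    """Placeholder: the Voice HAT amplifies the reply out to the speaker."""
    raise NotImplementedError

def main() -> None:
    # The Pi loops forever: listen, send to the cloud, play back the answer.
    while True:
        request_audio = record_from_microphone()             # mic + noise filtering
        reply_audio = query_google_assistant(request_audio)  # processed in Google Cloud
        play_through_speaker(reply_audio)                    # HAT -> speaker

if __name__ == "__main__":
    main()
```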
OK, so what we ended up doing is we pieced all this stuff together. We didn't use the little cardboard box that comes with it; we actually went ahead and 3D printed our box. It's more sturdy and it just looks nicer than having a piece of cardboard there — but the cardboard works too. So you've got to go through and get all the pieces together; it's fairly easy to do. It took us about an hour and a half, I think, to get it set up properly.
And then of course you want to verify that the hardware works. I will say that you want to make sure your USB power supply is — I want to say — at least 2 amps, so that it will actually power everything. Once you've got it all installed, or all configured, then you need to download the operating system, and there's the link for it.
Then I used a program called Etcher, which flashes the image to a microSD card. You run it, and the operating system is all included there. So there are two things you can do: you can get the project app, which didn't work well for me because I'm using version one, and which allows you to connect everything wirelessly through the app,
or you can use option two, which was to use a monitor, mouse and keyboard. Then, before you can really get it to do much, you have to go into the Google Cloud console. Once you're in there, you can set up a trial account for free, and actually, right now, Google is offering a $300 credit that can be used for a year or so with your projects, which is plenty.
I won't use that much, because a lot of the Google APIs do cost money per transaction. So what you need to do is go in and create a project, and then you have to enable the Google Assistant API — this gives applications access to all those features. Google Assistant is a virtual assistant that communicates in two-way conversations.
And what's nice is it does work on iOS devices now. Later we'll look briefly at the Actions on Google platform, which is what you use to create your bot. So you enable the Google Assistant API, and then you need to go in and set up credentials — there's a credentials link there. Once you select that, you'll pick the Google Assistant API, which you've already enabled.
And then it's going to ask you where you're going to be calling the API from. And I just selected and other UI. So I could use Windows or the command line or whatever it is. So I just downloaded this, the command line tool, and I'm struggling with that, by the way. And then it's going to ask you what kind of data you're going to access and you have to tell it user data. So before you go to this page, you have to go in and then create an authorization client, which you hit Continue.
What that does is generate the credentials, which are housed at Google somewhere, and that prepares all the APIs for use. You then download the credential, which is a JSON file, and rename it; I think I named mine assistant.json. Then you make sure you put it in the correct directory on your Raspberry Pi.
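Roughly, that authorization step looks like this in code (a sketch of what the command-line tool does behind the scenes, not the exact steps from the talk; the client-secrets filename, scope and output path are examples):

```python
# A rough sketch of the authorization step: take the client-secrets JSON downloaded
# from the credentials page, run the OAuth consent flow (this is where you paste the
# authorization code), and save the result as the assistant.json file the demos expect.
# Assumes the google-auth-oauthlib package; the scope shown is the Assistant SDK's
# prototype scope, and the filenames are just examples.
from google_auth_oauthlib.flow import InstalledAppFlow

CLIENT_SECRETS = "client_secret.json"                 # file downloaded from the Google console
SCOPES = ["https://www.googleapis.com/auth/assistant-sdk-prototype"]
OUTPUT = "/home/pi/assistant.json"                    # where the demo scripts will look for it

flow = InstalledAppFlow.from_client_secrets_file(CLIENT_SECRETS, scopes=SCOPES)
credentials = flow.run_console()                      # prints a URL, then prompts for the auth code

with open(OUTPUT, "w") as f:
    f.write(credentials.to_json())
print("Saved credentials to", OUTPUT)
```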
And then when you first run one of the demo programs, it's going to prompt you to add your authorization code. Once you do that, you're good to go. So I'm going to run a video in here. Let's hope this works. I hope you can hear it. Turn on the light.
Blink the light. Turn off the light. Repeat after me. This mission is too important for me to allow you to jeopardize it. This mission is too important for me to allow you to jeopardize it.
So now it's actually working. I hope you guys were able to hear that. That's the machine learning side of it: when you speak back, it's actually hearing what you're saying and prints all that information out in the terminal. All right, so once you see that, you know it's actually working.
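To make the command handling concrete, here's a hypothetical sketch of how recognized phrases like "turn on the light" or "blink the light" could be mapped to a GPIO pin on the Pi (this is not the kit's demo script, and the pin number is a placeholder):

```python
# Hypothetical local command handling: once the cloud sends back the transcript,
# match a few phrases and drive an LED with RPi.GPIO. Pin 25 is a placeholder;
# check your HAT's pinout before wiring anything up.
import time

import RPi.GPIO as GPIO

LED_PIN = 25  # placeholder pin number

GPIO.setmode(GPIO.BCM)
GPIO.setup(LED_PIN, GPIO.OUT)


def handle_command(text):
    """Map a recognized phrase to a local action on the Pi."""
    text = text.lower()
    if "turn on the light" in text:
        GPIO.output(LED_PIN, GPIO.HIGH)
    elif "turn off the light" in text:
        GPIO.output(LED_PIN, GPIO.LOW)
    elif "blink the light" in text:
        for _ in range(5):
            GPIO.output(LED_PIN, GPIO.HIGH)
            time.sleep(0.3)
            GPIO.output(LED_PIN, GPIO.LOW)
            time.sleep(0.3)


# Example: call this from the assistant event loop whenever a transcript arrives.
handle_command("turn on the light")
```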
Let me move that. Next you have to enable the Cloud Speech-to-Text API. What this does is your speech-to-text conversion, and this is also powered by machine learning. What's interesting is this API recognizes 120 different languages. You can do voice command and control, it transcribes audio, and it can process real-time streaming or pre-recorded audio.
It punctuates transcriptions: commas, question marks, periods, that kind of thing. It can even identify what language is spoken. I think it was mentioned earlier: diarization is something they're working on now in beta. It basically knows, or can predict, who said what and when.
There's multi-channel recognition, so it can kind of tell where people are speaking and record them into separate channels. I kind of thought, you know, this is perfect for spying, which is something you guys have touched on with privacy and transparency. It can return text as it's recognized, from streaming audio or as the user is speaking. OK, so let's see.
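As a small illustration of those features, here is a sketch of a recognition request with punctuation and speaker diarization turned on, using the google-cloud-speech Python client (field names have shifted a bit between client versions, and the audio file is just an example):

```python
# A small sketch of a Cloud Speech-to-Text request with automatic punctuation and
# speaker diarization enabled. Treat this as illustrative; exact field names vary
# somewhat across versions of the google-cloud-speech client.
from google.cloud import speech

client = speech.SpeechClient()

with open("recording.wav", "rb") as f:        # example 16 kHz LINEAR16 recording
    audio = speech.RecognitionAudio(content=f.read())

config = speech.RecognitionConfig(
    encoding=speech.RecognitionConfig.AudioEncoding.LINEAR16,
    sample_rate_hertz=16000,
    language_code="en-US",
    enable_automatic_punctuation=True,        # commas, periods, question marks
    diarization_config=speech.SpeakerDiarizationConfig(
        enable_speaker_diarization=True,      # "who said what" (beta at the time of the talk)
        min_speaker_count=2,
        max_speaker_count=4,
    ),
)

response = client.recognize(config=config, audio=audio)
for result in response.results:
    print(result.alternatives[0].transcript)
```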
So, to enable the Cloud Speech-to-Text API: once you're in there, actually, before I go there, you do have to turn on billing, because, as I mentioned earlier, they charge you for the audio you process. They do say that if you use the API for less than 60 minutes per month, it's free, which is nice.
Beyond that it's very minimal. It costs, I forget, something like $0.006 for every 15 seconds, and they do give you that $300 credit I mentioned earlier. So it's a similar process: you enable the Cloud Speech-to-Text API, you download the JSON credential file, and once you do that, you're good to go. So, there's the demo Python script that was there.
I went ahead and modified a few things. I accidentally played that video already for you, but you make some changes to improve the voice recognition accuracy. I'm not going to go into all this code right now; I just wanted to highlight some of the features. And then you add the import statement, which is up at where it says import on line 8.
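The talk doesn't walk through the exact changes, but one common way to improve recognition accuracy for a small command set is to pass phrase hints. A sketch, assuming the google-cloud-speech client, and not necessarily the modification made here:

```python
# Phrase hints (speech contexts) bias the recognizer toward a fixed command set,
# which tends to improve accuracy for short spoken commands. This is a sketch,
# not the script that was actually edited in the talk.
from google.cloud import speech  # the import added near the top of the demo script

COMMAND_HINTS = speech.SpeechContext(
    phrases=["turn on the light", "turn off the light", "blink the light", "repeat after me"]
)

config = speech.RecognitionConfig(
    encoding=speech.RecognitionConfig.AudioEncoding.LINEAR16,
    sample_rate_hertz=16000,
    language_code="en-US",
    speech_contexts=[COMMAND_HINTS],  # nudge recognition toward these phrases
)
```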
And then once you've got this to recognize your voice, you can then tell it commands. Let me play this one again; I think this is the right one. Bear with me here. "Turn on the light." Yeah, this is the same one from before, but "blink the light." So now I'm saying the commands and it's going out. "Turn off the light."
And it's recognizing those commands and sending them back. "Repeat after me. This mission is too important for me to allow you to jeopardize it." "This mission is too important for me to allow you to jeopardize it." There are ways to go in and look at the latency, and it's impressive to me that you can just speak your command through a little homemade box that we built ourselves and that it goes out to the Google Cloud servers somewhere.
It then does the processing through all the APIs and sends it all back. It's pretty amazing how all this works. And that's what I just talked about, the latency; there's not a lot of latency. Another thing you can do is control lights, your temperature at home, all this stuff, using just your voice.
There's some programming involved, and I'm just starting to tackle all this stuff, but if you are interested in how it works down the road, I will provide my contact info at the end. So I just purchased this RedBear Duo. It's tiny, about the size of your thumb, I think. I haven't received it yet, but it's an Internet of Things board that has Wi-Fi cloud capabilities and Bluetooth Low Energy capability.
It's for short-range communication, so there are a lot of Internet of Things projects that we get to kind of play with. I wanted to briefly mention IFTTT, If This Then That, because it's a really easy way to add functionality to your digital assistant. I'm focusing mostly on Google Home and Google Assistant today, but you can use it with anything; if you have an Amazon Echo or whatever it is, you can use this to kind of spruce up your services.
So, for example, if somebody comes into my lab and sees that I'm not there, they can press that blue button on the device and it'll automatically place a phone call to me and say, hey, you're not here, and then talk through that. That's something I'm working on right now. Through IFTTT it's pretty easy; if I had to program that myself, it'd be fairly difficult to do.
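A sketch of what that button-to-notification idea could look like under the hood (not the actual setup from the talk; the pin number, event name and key are placeholders, and it assumes IFTTT's Webhooks service routes the event on to a call, text or email applet):

```python
# A sketch of "press the blue button, notify me": a GPIO button press fires an IFTTT
# Webhooks event, and an IFTTT applet routes it to a call, text or email. The pin,
# event name and key are placeholders, and this is not the speaker's actual code.
import requests
import RPi.GPIO as GPIO

BUTTON_PIN = 23                      # placeholder; check your HAT's button pin
IFTTT_KEY = "YOUR_WEBHOOKS_KEY"      # from the IFTTT Webhooks service settings
IFTTT_EVENT = "lab_button_pressed"   # whatever event name your applet listens for

GPIO.setmode(GPIO.BCM)
GPIO.setup(BUTTON_PIN, GPIO.IN, pull_up_down=GPIO.PUD_UP)

while True:
    GPIO.wait_for_edge(BUTTON_PIN, GPIO.FALLING)   # block until the button is pressed
    requests.post(
        f"https://maker.ifttt.com/trigger/{IFTTT_EVENT}/with/key/{IFTTT_KEY}",
        json={"value1": "Someone is at the Innovation Lab and you're not here."},
        timeout=10,
    )
    print("IFTTT event sent")
```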
So it's one push of a button and then it goes to an email or wherever you want it to go, through IFTTT. And I don't know if this was mentioned in one of the earlier presentations, but I find it fascinating that Cisco's Internet Business Solutions Group predicts that 50 billion connected devices are going to be around by 2020. That's a lot more devices than people. Just on a side note, the Internet of Things concept started back in the late 90s at MIT with RFID sensing chips.
When I was reading this Cisco report, there's a quote in here that I love, and this is the direct quote: "By combining the ability of the next evolution of the internet, the IoT concept, to sense, collect, transmit, analyze and distribute data on a massive scale with the way people process information, humanity will have the knowledge and wisdom it needs not only to survive, but to thrive in the coming months, years, decades and centuries."
Pretty uplifting, and I hope all of this is true down the road. It's just a really good report if you're interested in reading about all this. And by 2020, what does it say here, 7.6 billion is the world population, and 50 billion connected devices. That's just unfathomable to me. OK, so getting back into this here, you could spend hours just on Actions on Google.
So as of now, we've got the Google Assistant and Cloud Speech-to-Text APIs enabled, all the authorization client IDs have been downloaded, and we've got that working, because our device is communicating with the cloud. You do have to configure activity controls through Google: you go to myaccount.google.com, Activity Controls, and you have to enable your device information, voice and audio activity, and that sort of thing.
So what is an action? Actions on Google extends the functionality of Google Assistant. Actions let users become more productive through a conversational interface. As I mentioned before, you can turn on lights and move motors and all that stuff, but you can also have longer conversations. So with this, you have to go in and enable Actions on Google.
Once you've done that, an action will define support for a user request, and I'm still trying to wrap my head around all this stuff too. It represents that as an intent, and it then includes what's called a fulfillment that will carry out the request. So essentially, Actions on Google will capture your voice intent and then translate it into text.
There's quite a bit that you have to do for it to start working, though. Before I get to that, I do want to show you the Google console. Hopefully this page is loading in front of you now. If you go into your library, this should show you all the APIs that exist.
There are a lot of them. If I go back to my dashboard, this will show me some of the APIs that I have set up now, and Dialogflow is another one I just did; we'll talk about that briefly in a second. But this is where you go in and keep all your APIs set up, and this is where you create your projects and that kind of thing.
OK, I'm going to close that. So with Actions on Google, this is what that looks like, and it can be fairly tricky. If you want to start a new agent, you can, but I created one, a voice assistant, through my lab. So this is what it looks like.
You've got your invocation, which basically says, instead of somebody saying, "OK Google, do this," they say, "Hey, Innovation Lab" or "Speak to the Innovation Lab." So then it's our own little interface. You've got to create that first, and then you go in and build your action.
And this is where I'm starting to work on a search interface, which is not as easy as I thought. So anyway, here's what you've got: some of the actions that you're working on, different phrases that the AI is going to understand. And I do want to mention Dialogflow, too. When you select Dialogflow, this is Google's bot platform; it used to be called API.AI, and this is where you actually build natural conversation experiences powered by artificial intelligence.
So again, I've just started scratching the surface here; there's a lot you can do. But what's really nice about this is you can connect with your users through your website, through a mobile app, through Google Assistant, Amazon Alexa, Facebook Messenger, whatever it is; you can do all of that through here. And this is where you create intents.
I briefly talked about this. An intent defines an action that the user wants to perform. An entity defines something that the user references when they talk to you, and that's where it gets your response from. Dialogflow is going to select, by itself, an intent based on your speech, and then it's going to return a response text that you set up. Now, and this is something I'm still trying to figure out, you can also create a command.
It's called a webhook, basically an API call, where you can then say, I'm going to send you to a website instead. Amazon Alexa does that; they all do it now. So if somebody asks a question or needs to do something that's too sophisticated or too difficult to do via voice, you can then send them a link, and it'll go to their phone through that webhook I mentioned.
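As a sketch of what a fulfillment webhook can look like (this follows the Dialogflow v2 webhook request and response format as I understand it; the intent name and URL are made up, and it is not the agent built in the talk):

```python
# A sketch of a fulfillment webhook in Flask: Dialogflow POSTs the matched intent here,
# and for anything too involved for voice we reply with a text response containing a
# link. The intent name and URL are hypothetical; the JSON fields follow the
# Dialogflow v2 webhook format.
from flask import Flask, jsonify, request

app = Flask(__name__)


@app.route("/dialogflow-webhook", methods=["POST"])
def webhook():
    body = request.get_json(force=True)
    intent = body["queryResult"]["intent"]["displayName"]

    if intent == "library.search":  # hypothetical intent name
        # Too complicated to walk through by voice, so hand back a link instead.
        reply = "That's easier on a screen. Try our search page: https://example.org/search"
    else:
        reply = "Sorry, I don't have an answer for that yet."

    return jsonify({"fulfillmentText": reply})


if __name__ == "__main__":
    app.run(port=5000)
```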
In Dialogflow there are pre-built agents already, so you don't need to rewrite everything. I'm not going to go into all of these either, but there are all these different ones: currency converters, event search, devices, dining out, flights, language settings, hotel booking. Small Talk is another thing you can go in and customize. You know, if I enable this,
then if I go in and ask, "Well, how are you, Innovation Lab?" there are pre-written responses in there that say, "I'm doing fine, thank you," all that kind of thing. OK, let's close that one out. So that's Dialogflow, briefly; it uses machine learning to understand what users are saying.
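For what it's worth, here's a short sketch of querying a Dialogflow agent from code with the google-cloud-dialogflow (v2) Python client; the project and session IDs are placeholders:

```python
# Send a sentence to a Dialogflow agent and read back the matched intent and the
# response text you set up in the console. Project and session IDs are placeholders.
from google.cloud import dialogflow

PROJECT_ID = "innovation-lab-agent"   # placeholder Dialogflow project ID
SESSION_ID = "demo-session-1"         # any unique string per conversation

sessions = dialogflow.SessionsClient()
session = sessions.session_path(PROJECT_ID, SESSION_ID)

text_input = dialogflow.TextInput(text="How are you, Innovation Lab?", language_code="en-US")
query_input = dialogflow.QueryInput(text=text_input)

response = sessions.detect_intent(request={"session": session, "query_input": query_input})
result = response.query_result

print("Matched intent:", result.intent.display_name)   # e.g. the Small Talk agent
print("Response text:", result.fulfillment_text)       # one of the pre-written replies
```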
TensorFlow is another thing that's built into all this. It makes the process of acquiring data, training all your voice models, prediction, and refining your results a little bit easier. It can even do handwriting detection and image recognition, which is pretty cool, and we're using that a lot with the vision intelligence stuff.
I briefly wanted to mention this; I know I'm running short on time. There is a service out there called Pandorabots that some libraries are using, and it's a way for you to create your own virtual assistant that can answer simple inquiries about topics like library hours, locations, materials, upcoming library events, and that kind of thing. I've read where people have done information literacy bots that can help personalize the learning experience, to help students figure out what they want to do their research on.
Now, it's not going to replace working with a human, in my opinion, but if it's late and you want some very basic information shared with that user, you can do this. I also wanted to briefly mention that Scratch 3.0 uses what's called the Web Speech API, so you can do a lot in Scratch, too. One thing I noticed that I think is kind of neat is that you can actually change the color of your background through speech.
So this is all built in there. So if I say "aqua"... "blue"... well, great, it's not working; an error occurred in recognition. Anyway, once you allow it to go through, you can then say the name of the color and it'll change the background for you.
"Red." All right, not working. Anyway, worth a try. I wanted to mention this too; this is a thing I did the other day. So let's open this. This is Google Duplex. That's it.
But I'm going to help you. Hello? Hello. Hi, how are you? I'm Google's automated booking service calling for our client, so the call may be recorded. Unfortunately, they needed to cancel their reservation. OK, so do you need the name, please? The first name is Chad. Chad... Ed?
No, the first name is Chad. Chad, OK. Thank you for calling. OK, great, thanks. You're welcome. So the lady was speaking with an AI-based, human-sounding voice, which is kind of freaky. I just spoke to Google Assistant and said, would you book me, or reserve me, a table at this restaurant?
And then she went out and called and did everything. I was hoping to get a transcript of that to hear how it actually worked. I didn't. So I decided to go to lunch there and ask the lady if it was possible for me to cancel my reservation and just kind of see how it worked. And it actually worked really well. It was a little creepy, actually, how the assistant was able to recognize her accent and communicate with her decently.
It wasn't perfect, but it was terrifying and pretty mind-blowing at the same time. OK, and I just wanted to briefly mention this: a friend of mine in Colorado is doing an artificial intelligence history project, and he wanted to bring history to life. What his students did is they used Dialogflow and they put a 3D-printed head of Kaiser Wilhelm II over top of a Google Home assistant.
They did all the programming behind the scenes. The students researched the causes of World War I, and you can now ask the Kaiser questions and he'll answer you. What's interesting, though, is this software can handle temperament as well. So if you start asking him questions about his mom, for example, he gets pretty upset over it, and you can program all that in.
Dialogflow actually has all these different responses to kind of randomize its reply back to you. It's pretty cool how they did this; they're bringing artificial intelligence and coding into history. And we already kind of mentioned this, but privacy is obviously a big deal. I think it's really important for us to do some research to learn more about what companies like Amazon, Facebook and Google are doing with all of our information.
Do those privacy settings that we mess with actually do what we think they're doing? Let's hope so. And really, when you're using a personal digital assistant, you all know this, but you're being recorded, unless you have it unplugged or disabled. So your interactions with people inside your building could be heard and used by organizations from all over the world.
So there's my contact info. I'm going to continue working through this and continue to work with this voice recognition system, creating a personal digital assistant. If you're interested in the history and what happens down the road, let me know; I'll be happy to keep the conversation going. All right, great. Thank you, Chad.
Thank you. One quick question for you. A lot of the things that you were doing are running through the Google APIs and the Google back end. Are you aware of any community-based or open-source initiatives to do the processing on some of this, so that you could run it through a community-held service versus running it through Google?
Are you aware of anything like that? I'm not, but that's a great idea, actually. Yeah, it's something I'd like to look into; it's a really great question. I don't know. I know they have, like, OpenAI, but I don't know if that would do it. And they also have certain libraries, but that's more for the visual intelligence side, computer vision, like huge libraries of images that the computer already kind of knows what they are.
Right, but yeah, that's a great question; I don't know. I mean, I'm just guessing that the IP behind running some of this is pretty complicated and probably pretty closely held. You had mentioned that there's a cost for running this in excess of 60 minutes a month, right? Roughly, I mean, I'm sure it's on a sliding scale, but what are we talking about in terms of
time and cost? Well, for me, it hasn't cost anything yet, because it's very early; it's in its infancy here. I mean, we're just now getting it to kind of communicate back with us, and we plan on just having the one box, kind of as an experiment, so I don't think we'd get that much traffic through that one box.
And I think we just plan on keeping it to that one box. OK, that makes sense. Yeah, no, absolutely. I was just wondering if there was some sense of scale if people wanted to do something more robust with it. Yeah, I mean, I don't really know; I'd have to look at it to see. But you can usually do a fairly quick conversation and not use too much time.
Sure. Yeah, so it's only the five seconds or so that it takes to ask a question that's processed. Right, and it does it fairly quickly. When you look at the video, you see that as I'm speaking, the Google Assistant is transcribing it as I go; it's almost real time. There's very little latency on that. It's pretty impressive.
Well, I was chatting with Jill in the background, and it was like, yeah, I'm sure there's a whole lot of processing power that's going into getting that data, churning it out, turning it over and spitting it back, you know? That's not a small computer. No, it's not, and it's crazy. I mean, I was thinking the other day when I did the first test, I'm like, turn on the light.
Well, it's going out to Google Cloud server space somewhere, and it's taking that voice command, sending the result back, and doing it, and it's fairly quick. And even the "repeat after me" thing, it's recording what I say and it speaks it back fairly quickly. It's insane how fast it does it. OK, all right.
Well, Chad, I want to thank you very much for your talk and good luck with your projects. And please let us know how they're progressing. I will. Thank you. All right. So with that, I know we're a couple minutes over, so apologies for that, but thank you all for hanging in. We'll draw today's session to a close.
And when you log out, you'll be presented with a short survey. Please let us know how we did today; it's always good to get feedback on our programs, especially as we're putting together plans for 2020. It's not that far away. So with that, I want to draw today's session to a close. Thank you all for joining us. Thanks to Chad, Jim and Kelly for their talks and for joining us today.
And I look forward to seeing, hopefully, a few of you at the ALA conference in a couple of weeks, as well as at next week's session, part two of today's program. So with that, thank you and have a good afternoon. Bye bye. Thank you to all of our speakers and to our attendees.