Name:
Seamless Access Presents Entity Categories and Attribute Bundles
Description:
Seamless Access Presents Entity Categories and Attribute Bundles
Thumbnail URL:
https://cadmoremediastorage.blob.core.windows.net/8e24ad7f-791f-44f1-8956-29616d81b632/thumbnails/8e24ad7f-791f-44f1-8956-29616d81b632.jpg
Duration:
T00H59M41S
Embed URL:
https://stream.cadmore.media/player/8e24ad7f-791f-44f1-8956-29616d81b632
Content URL:
https://cadmoreoriginalmedia.blob.core.windows.net/8e24ad7f-791f-44f1-8956-29616d81b632/GMT20200810-132954_Seamless-A_1920x1178.mp4?sv=2019-02-02&sr=c&sig=fHuy2VgHwil7eWAmNXrHDQUYuMlw2V0ejgzsSIvI39Y%3D&st=2024-11-26T03%3A00%3A12Z&se=2024-11-26T05%3A05%3A12Z&sp=r
Upload Date:
2022-04-14T00:00:00.0000000
Transcript:
Language: EN.
Segment:0 .
JASON GRIFFEY: Hello, everyone. Welcome, and thank you for spending part of your morning here on this Monday morning with us at NISO and Seamless Access. This presentation will be Seamless Access Presents Entity Categories and Attribute Bundles. There are two things going on. One is obviously the online presentation. If you can hear me, then your VOIP is working. If there are any problems while the presentation is going on, you can contact support.zoom.us, and here on this initial slide is the webinar ID that you will need to let them know.
JASON GRIFFEY: So if you have any trouble at all with connectivity, please contact support.zoom.us and use that webinar ID. A couple of pieces of information that we always like to get to you up front. One is will slides be made available? Yes, of course they will. And not only will slides be made available, but a recording will be made available as well.
JASON GRIFFEY: You will be receiving an email that has links to the slides and the recording. The recording will also be put on the NISO video portal and be available for public consumption, in addition to being available to people who pre-registered for this webinar. So you will be able to not only see it, but you'll be able to share it if it is of interest to other people in your community.
JASON GRIFFEY: We have lots of educational events at NISO, and we like to remind people of a few of them when we are starting one. We have a couple of different things happening in August. We have a two-part webinar August 12 and August 19. It's two consecutive Wednesdays at 1:00 PM. The first is By Faculty and For Students: Supporting Open Educational Resources, and the second is Open Access Monographs: What You Need to Know.
JASON GRIFFEY: We also have, later in the month on August 26, a NISO virtual conference on Transforming Search: What the Information Community Can and Should Build. Both of those events are available for registration at the niso.org website under our Events tab. In addition to the events that we run, we try very hard to keep abreast of all of the news and happenings around the information ecosystem.
JASON GRIFFEY: And so we provide a publication called NISO IO, and you can find that at niso.org/niso-io. You can also sign up for a mailing list and receive this in your mailbox every time we publish it. This is updated very regularly and always has sort of up to the minute, at least up to the day, news and happenings around the information ecosystem. We're very proud of the work that goes into this, and we'd love to have you read it.
JASON GRIFFEY: Today we are going to spend about an hour talking through the Seamless Access project's Entity Categories and Attribute Bundles, which is the result of a working group that has been doing very detailed work on this topic for the last few months. I am your moderator, Jason Griffey. I'm the Director of Strategic Initiatives here at NISO. I'm also a member of the governance committee and a number of other committees inside of the Seamless Access project.
JASON GRIFFEY: And our speaker today is going to be Heather Flanagan, the Program Director of seamlessaccess.org. In addition to being the Program Director of Seamless Access, she is also the principal at Spherical Cow Consulting, and she comes from a position that the internet is led by people, powered by words, and inspired by technology. She's been involved in leadership roles with some of the most technical volunteer-driven organizations on the internet, including IDPro as principal editor, the IETF, the IAB, the IRTF as RFC Series Editor, ICANN as technical writer, and REFEDS as coordinator, just to name a few.
JASON GRIFFEY: I would like to offer a prize if anyone could name, off the top of their head, all of those acronyms, but unfortunately I don't think I'm allowed to do that. If there's work going on to develop new internet standards or discussions around the future of digital identity, she is interested in engaging in that work. And she is definitely the most knowledgeable person I know about all of this very complicated technical area of internet authentication.
JASON GRIFFEY: And I'm very proud to have her here and happy that she gets to walk us through this project. So Heather, I'm going to throw this over to you. I'm going to stop sharing my screen and let you start sharing yours.
HEATHER FLANAGAN: Fantastic. All right.
JASON GRIFFEY: Excellent.
HEATHER FLANAGAN: Let's get this party started. So welcome, everybody. Definitely glad to see you bright and early on a Monday morning. I'm actually on the US west coast, so it's about 7:00 AM. I have my coffee, so absolutely ready to get started. What we're going to talk about today is something called entity categories, and why they're important, what we're doing with them. Not everyone, actually, will likely understand all the gory, gory details, and frankly, you shouldn't have to.
HEATHER FLANAGAN: I am going to be making a couple of assumptions here, one of which is that you have some rough idea of how federated authentication works, and that these acronyms, especially once I remind you what they are, are part and parcel of the conversation that we're going to have. If you do need a refresher on things like how federated identity works, about privacy, about attributes, Seamless Access does have a YouTube channel that explains these things in sort of five-, 10-minute soundbites, and I encourage you to go look those over.
HEATHER FLANAGAN: OK. So entity categories. What does that even mean? So entity categories are basically in federation metadata, which is not exactly the same as the information metadata that you might be used to from a library perspective. This is metadata from a federation perspective.
HEATHER FLANAGAN: And entity categories describe a set of characteristics or capabilities that an entity such as a service provider or an identity provider can have so that they can be sort of categorized into buckets. The goal of an entity category is to basically offer a shorthand way of saying-- rather than having to list out every individual institution, you can say no, I'm interested in institutions that fall in this category.
HEATHER FLANAGAN: There are several already in existence today, and what Seamless Access is proposing to do is add a few more. So what we're planning to do, if you've heard of entity categories in the past, one you've probably heard of is known as R&S, which is Research and Scholarship. The Research and Scholarship entity category bundles up those entities, those IDPs and SPs, SPs in particular, that are to be used only for research and scholarship purposes.
HEATHER FLANAGAN: These are explicitly not for things like journal subscriptions or things like that. They're much more for scientific collaborations, as an example, that they're just trying to do their research. They're trying to offer material through their Wikis to their scientists. The purpose of having Research and Scholarship entity category is to tell IDPs in particular, look, this is not a vendor.
HEATHER FLANAGAN: This isn't someone that's trying to sell attributes here. This is a group that has been vetted as just wanting to do research. And they will behave in a certain manner. So if you would, please release more information to them so that they can actually do the science that they are supposed to be doing, and that we all, frankly, want them to do. This has been really useful to the scientific community as they start to depend more and more on federated identity.
HEATHER FLANAGAN: And it's just so helpful, also, to the identity providers, rather than having to have these individual agreements, to say, OK, well, the metadata says I need to release this information, email address, name, identifiers to this service provider. But this other service provider wants something slightly different, and another service provider might want something slightly different again.
HEATHER FLANAGAN: In having these categories, it's now suddenly very easy for an identity provider to configure against a bundle. So when we put together a set of entity categories, that was where we were coming from. We were trying to make it easier for an identity provider to know how to configure their information to release attributes to service providers, or not.
HEATHER FLANAGAN: And we'll get to that "or not" in just a minute. So a little moment about who was involved in this. It was definitely a broad effort that involved some really excellent people from Stanford University, from Duke University, publishers like Springer Nature, service providers like MyUnidays, people who were just interested in the work like Peter Murray. We were all involved trying to make this the best specification possible that really focused on protecting a user's privacy and making it easy for an identity provider, and for the librarians on their campus, to know what exactly we are talking about releasing or not.
HEATHER FLANAGAN: So in all of the entity categories we put together, and we drafted three, we had some general principles. First and foremost is OK, so why are we doing this? Because we want to make sure that user privacy and usability are key drivers in making federated identity work for all parties involved, all stakeholders. We wanted to emphasize that service providers must abide by the principles of something called the GEANT Data Protection Code of Conduct.
HEATHER FLANAGAN: This one has been a little bit interesting in terms of feedback. The Data Protection Code of Conduct is a construct that came out of the data privacy directive before it was the GDPR. And we can't say that all service providers must sign off on the GEANT Data Protection Code of Conduct because it is an EU construct. And if you're not in the EU, then you can't actually sign up to it.
HEATHER FLANAGAN: There's legalities involved. But the principles of it, which are really motherhood and apple pie of privacy, are things that we want people to agree to, that you will use the data only for its intended purpose, that you will not collect more data than you absolutely, positively require to do your work, things like that. These entity categories are not the end all and be all of what a service provider and an identity provider can agree to.
HEATHER FLANAGAN: There may very well be additional bilateral agreements between an IDP and an SP, and having these entity categories does not preclude that. Another point of note is that these entity categories-- so the R&S entity category, which I mentioned before-- that's something at the federation level, that group that's actually holding all of this metadata and creating the trust agreements between service providers and identity providers.
HEATHER FLANAGAN: For R&S, the Federation actually has to sign off to say, yes, this is a Research and Scholarship entity, and therefore we can tag them as such. For the entity categories Seamless Access is proposing, there is no approval process on the part of the Federation. Hopefully it will be obvious as to why once I get through them in a little bit more detail. Now that said, Federation operators do have one thing they have to do, which is to allow these entity categories in their metadata.
HEATHER FLANAGAN: People can create all sorts of fun and dandy things and say, I want this in my metadata. But you want to normalize what data you have, and you want to be very clear of what's actually going into this big bucket of material because when you share it out, you don't want to break anybody else's business, and you do want to have a certain level of security.
HEATHER FLANAGAN: So the Federation operators will need to actually agree to support this because otherwise, if you put it in there, they will pull it right back out. And last but not least, these entity categories are about what institutions can share with other institutions, be they vendors or whatnot. This isn't about what the user might provide directly to the service provider.
HEATHER FLANAGAN: That's out of scope. It's not something that we can really manage through this particular mechanism. And here we have a quote that comes from each of the specifications that says, this only relates to the personal data between the IDP and the SP, not the personal data requested directly from the end user or the browser, potentially via a consent flow.
HEATHER FLANAGAN: So what are these entity categories we're talking about, anyway? As I mentioned, there are three, and you can think of them as being on a continuum. At the very early start of the continuum, the first one is the authentication only entity category. This one says service providers will grant access solely on proof of a successful authentication.
HEATHER FLANAGAN: No attributes are actually shared at all. It's just sort of a binary, was the authentication successful, yes or no. This came about at the request of, actually, some of the publisher stakeholders, who occasionally work with institutions that release more data than the publishers want to have. It's a little bit difficult to scrub stuff out of log files.
HEATHER FLANAGAN: It can be done, but it's hard to get all of that material cleaned up as needed. And service providers often don't want the data because once they have it, they're liable for it. And that can be scary, especially in the age of GDPR. The second entity category is anonymous authorization. And this is where service providers will grant access based on proof of authentication and will actually make that authorization decision based on the user's affiliation and/or the user's entitlement.
HEATHER FLANAGAN: In an ideal world, this is where I think almost everybody should land, if at all possible. For the pseudonymous authorization, this is OK, the authentication was successful. Authorization decisions are made based on affiliation and/or entitlement data. And the IDP will send along a pseudonymous user identifier, basically what's called a pairwise identifier such that for that session, or for that user, that service provider gets this identifier.
HEATHER FLANAGAN: It's not actually used for any other service provider with that user. And that way you don't get to track the user across multiple entities. It's worth noting that a service provider can't say, I'm going to tag myself with all three of these entity categories so that I can just get whatever I can get.
HEATHER FLANAGAN: No, you're not allowed to do that. They have to pick one. And it's also worth noting that you can't combine this with R&S. That's almost like a given because R&S is for those use cases of research and is explicitly not used for subscriptions, journal subscriptions, things like that. If you can do R&S, you should.
HEATHER FLANAGAN: And if you can't, then these are options. So you just should never combine the two, and it wouldn't make any sense. OK. So let's talk a little bit about the authentication only entity category. So if an organization is using this, if an institution is using this, then what's happening is that the service provider indicates that they want complete anonymous privacy-preserving service for the user, and they don't want the IDP to release any user attributes whatsoever of any kind.
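The tagging rules just described (pick exactly one of the three categories, and never combine one with R&S) can be sketched as a small check. This is only an illustration under stated assumptions: the short labels below are hypothetical stand-ins, since the specifications identify categories by their own identifiers, and `check_category_tags` is not part of any federation tooling.

```python
# Hypothetical short labels for the three proposed categories and for R&S.
# Real federation metadata would carry the identifiers defined in the specs.
ACCESS_CATEGORIES = {
    "authentication-only",
    "anonymous-authorization",
    "pseudonymous-authorization",
}
RESEARCH_AND_SCHOLARSHIP = "research-and-scholarship"


def check_category_tags(tags):
    """Return a list of problems found in a service's entity category tags."""
    tags = set(tags)
    chosen = tags & ACCESS_CATEGORIES
    problems = []
    # A service may not claim several categories "to get whatever it can get".
    if len(chosen) > 1:
        problems.append("more than one access category: " + ", ".join(sorted(chosen)))
    # The three categories are alternatives to R&S, never additions to it.
    if chosen and RESEARCH_AND_SCHOLARSHIP in tags:
        problems.append("access category combined with research-and-scholarship")
    return problems
```

A federation operator or a service provider could run a check of this kind before metadata is published; an empty list means the tags are consistent with the rules as described in the talk.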
HEATHER FLANAGAN: If the IDP does send something, which, again, they shouldn't, then the SP must just drop it unless there are other arrangements in place. Now I'm going to make an argument in a moment that this is an anti-pattern. This is something that if people do this, you're saying authentication equals authorization. And any time you talk to anybody who's done identity and access management, as soon as you conflate those two things, they'll start twitching, and we'll have to dive in and say no, authentication and authorization are not the same thing.
HEATHER FLANAGAN: They should be logically and functionally two discrete activities. So from a personal perspective, I appreciate the idea of no, don't send any user data whatsoever. But I don't actually think it serves the purpose that we all want to have, which is make sure that the privacy is protected for the user, and make sure that we're meeting all the contracts and requirements in place to say who's authorized to use what.
HEATHER FLANAGAN: To give you a sort of example so you can take it out of specification language: The Journal of Examples could use the authentication only entity category. A patron logs in using federated access, and then the university shares that that worked. OK, that authentication, that worked. That's all they get.
HEATHER FLANAGAN: Hey, Jason, are there any questions to this point?
JASON GRIFFEY: There are. There's one that calls back just a little bit, if you don't mind rewinding to your three categories. So someone wanted to know, so a service provider can't pick the category that's appropriate for specific products or services. Once a service provider chooses a category, they have to use that across all of their services. I think that was a misunderstanding, so if you want to clarify that.
HEATHER FLANAGAN: No, that's not quite right. They can actually specify on a more granular level what they want, and have that in the metadata.
JASON GRIFFEY: Yes, at a service level, not necessarily at the provider level, right?
HEATHER FLANAGAN: Right. I guess one other point that's really important to make here is the service provider can ask for whatever-- anything. They can ask for anything from here to next Tuesday, and it doesn't matter. It doesn't matter what the service provider actually asks for because the entity that controls what the service provider gets is the identity provider.
HEATHER FLANAGAN: One of the big reasons-- I mean, for me-- to have entity categories is one, you want to make it easy for the identity provider to configure appropriately what they're going to release to what service provider. And also, in having these very clear categories, the conversation that happens within the campus should also become a lot more straightforward. We know that it's not typical for the department that runs the identity provider to work out of, for example, a library.
HEATHER FLANAGAN: Sometimes it happens, but mostly it doesn't, which means the library and the identity provider, they need to have a conversation. And we're trying to give you the vocabulary to have that conversation because a lot of the feedback we've received to date is that vocabulary isn't there, and those conversations are strained, if they happen at all.
JASON GRIFFEY: And here is a somewhat more specific take on the authentication only anonymous authorization distinction. So Adam Snook asked, by default, most IDPs send targeted ID and scoped affiliation. So by default, most IDPs wouldn't fit into the authentication only model without work. Most likely they would fit into the anonymous authorization.
JASON GRIFFEY: Has he understood that correctly?
HEATHER FLANAGAN: Yes, that is correct. That is correct.
JASON GRIFFEY: All right. And then there's one more, but I'm going to save it until we get through the rest of the categories so you can be a little more broad.
HEATHER FLANAGAN: OK. Let's see, where am I? [INAUDIBLE] and since we just talked about the anonymous authorization category, this is where the attribute bundle-- what the IDP is configured to release if the service provider has indicated that this is the category they're after-- is organization and entitlement data. That's it. Optionally, an affiliation type and a metrics code.
HEATHER FLANAGAN: I want to say a little bit more about that metrics code for a moment. The librarians on the call are probably familiar with the COUNTER standard. Basically, that allows publishers to generate reports to send back to the library saying, here's how you've asked us to group stuff for billing purposes.
HEATHER FLANAGAN: We don't know what these codes mean. You sent them to us. We're just packaging them and sending them back to you so that you can do your internal billing however it is you're going to do it. We are trying to figure out how to do something very, very similar in a federated identity workflow. And that's where this concept of a metrics code for reporting purposes comes from.
HEATHER FLANAGAN: There isn't such a thing right now in any standard attribute schema. We're talking about it. But we don't have it yet, which is where affiliation type may very well come in, that is, an affiliation in the eduPerson attribute schema, which is the attribute schema used in higher education around the world.
HEATHER FLANAGAN: The schema has eduPersonAffiliation, which is a controlled vocabulary attribute. It can only have, I think, seven values-- student, faculty, library walk-in, a handful of others. I could talk about this one all day, and it would probably make people very sad and hang up. Suffice to say affiliation type, whether they're faculty or not, that's about the relationship the user has with their institution.
HEATHER FLANAGAN: Entitlement data is not the same thing. Entitlement data can be much more granular than that, and that sends the information of is this user entitled to the service because they are at the Chemistry department, something like that. A lot-- a lot-- so many organizations merge those two things, where they just say, well, all faculty should access this, and therefore they will use the affiliation to determine the entitlement.
HEATHER FLANAGAN: This has caused no end of confusion in the world. And we really wish people wouldn't do that, but it is what it is. Ideally, the affiliation would be used more for reporting so that a service provider could say, OK, x number of people you've said were faculty have used this, or y number of students, or z library walk-ins.
HEATHER FLANAGAN: Whether those users are actually entitled to the material, that's a decision that's made on the campus with the IDP, and that's a different set of data. So putting that into sort of a clearer prose than what you might find in a spec, when a patron logs in using their federated access, then what's going to be shared is the name of the institution, the entity ID, any entitlement details specific for that user to that publisher, and perhaps also affiliation and/or a reporting code.
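The separation drawn here, entitlement for the access decision and affiliation only for reporting, can be sketched roughly as follows. The dictionary layout and the entitlement value `urn:example:journal-of-examples:full-access` are hypothetical and chosen only for illustration; they are not taken from the specifications.

```python
from collections import Counter


def is_authorized(assertion, required_entitlement):
    """Decide access from entitlement values only, never from affiliation."""
    return required_entitlement in assertion.get("entitlement", [])


def usage_by_affiliation(assertions):
    """Count sessions per affiliation value, for reporting back to the library.

    Affiliation is multi-valued, so a user with two affiliations is counted
    once under each value.
    """
    counts = Counter()
    for assertion in assertions:
        counts.update(assertion.get("affiliation", []))
    return counts


# Hypothetical sessions as a service provider might see them.
FULL_ACCESS = "urn:example:journal-of-examples:full-access"
sessions = [
    {"affiliation": ["faculty"], "entitlement": [FULL_ACCESS]},
    {"affiliation": ["student"], "entitlement": []},
    {"affiliation": ["faculty"], "entitlement": []},
]
```

In this sketch the third session belongs to a faculty member who is not entitled: the user still shows up in the faculty usage count, but `is_authorized` returns `False`, which is the distinction the talk asks campuses to keep.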
HEATHER FLANAGAN: OK. Third on the list is the pseudonymous authorization entity category. Here we get a little bit more detail. You get the organizational identifier, which we talked about, that thing that says this is Harvard. The entitlement data, the thing that the campus has said yes, this user can access this material because.
HEATHER FLANAGAN: And a pseudonymous pairwise user identifier. This is something that would basically allow a service provider to say, OK, I don't know who this is, but I know it's the same person that was here last time. And from that, they can actually start building personalized services for that particular user. That information is not common from service provider to service provider to service provider. The user is going to get a unique pairwise user identifier for every service provider they're working with.
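One common way to produce an identifier with these properties, stable for a given user at a given service provider but different at every other service provider, is a keyed hash over the user and the service provider's entity ID. The sketch below assumes that approach; the talk does not say how any particular IDP generates its pairwise identifiers, and the function name, secret, and entity IDs are hypothetical.

```python
import hashlib
import hmac


def pairwise_id(secret: bytes, user_id: str, sp_entity_id: str) -> str:
    """Derive a per-user, per-service-provider pseudonymous identifier.

    The secret stays with the identity provider, so a service provider
    cannot recompute or reverse the identifier.
    """
    message = f"{user_id}|{sp_entity_id}".encode("utf-8")
    return hmac.new(secret, message, hashlib.sha256).hexdigest()


# The same user gets unrelated identifiers at two different services.
idp_secret = b"hypothetical-secret-held-by-the-idp"
id_at_a = pairwise_id(idp_secret, "user-123", "https://sp-a.example.org")
id_at_b = pairwise_id(idp_secret, "user-123", "https://sp-b.example.org")
```

Because the two values cannot be linked without the secret, two service providers comparing notes could not tell that they are looking at the same person, which is the tracking protection described above.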
HEATHER FLANAGAN: And then it will be up to a conversation between the service provider and the user to further personalize information based on that. Now that they've been somewhat more tightly categorized, we can start personalizing what they see, again, optionally, sending affiliation type and metrics code for reporting purposes. And hopefully we'll be able to do that a bit more clearly when we have more information on an attribute for metrics.
HEATHER FLANAGAN: And of course, putting this into prose, as you'll see, each one gets a little bit longer as a little bit more information is shared. Now you may have noticed we're looking at a continuum: no information about the user at all; just enough information to make an authorization decision; and information to make an authorization decision and to allow for personalization, with the user offering a bit more information.
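The continuum can be read as three attribute bundles that an identity provider configures once and then applies to any service tagged with the matching category. The sketch below is a rough model of that idea, not the configuration format of any real IDP software; the category labels and attribute names are hypothetical shorthand for what the talk describes.

```python
# Hypothetical bundle definitions following the three categories in the talk.
BUNDLES = {
    "authentication-only": set(),
    "anonymous-authorization": {
        "organization", "entitlement", "affiliation", "metrics_code",
    },
    "pseudonymous-authorization": {
        "organization", "entitlement", "affiliation", "metrics_code", "pairwise_id",
    },
}


def release(category, user_record):
    """Return only the attributes the category's bundle allows.

    An unknown category releases nothing in this sketch; anything beyond the
    bundles would need an explicit agreement outside this mechanism.
    """
    allowed = BUNDLES.get(category, set())
    return {name: value for name, value in user_record.items() if name in allowed}


# A hypothetical user record held by the identity provider.
record = {
    "name": "Pat Example",
    "email": "pat@example.edu",
    "organization": "https://idp.example.edu",
    "entitlement": ["urn:example:journal-of-examples:full-access"],
    "affiliation": ["student"],
    "pairwise_id": "a1b2c3",
}
```

Name and email never leave the identity provider under any of the three bundles, which reflects the decision described next in the talk not to define a fourth, personal-data category.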
HEATHER FLANAGAN: What you don't see here is what would be the fourth thing on the list? The fourth thing would be, OK, share personal data. We talked about that. Do we want to create an entity category that says, share email, share name, share phone number, what have you. And we decided that no, actually, we don't want to make that easier. If that level of information is being shared, then especially in the use cases where we're not talking about Research and Scholarship entities, where we're talking about vendor relationships, things like that, that needs to be an explicit conversation held between the service provider and the institution.
HEATHER FLANAGAN: Not saying you shouldn't do it. There may very well be use cases where that makes total sense. It depends on the service. It depends upon the institution. But it's a conversation, and we don't want it to be easy. We want just enough friction that you have to stop and think about what you're doing.
JASON GRIFFEY: All right. Had a couple of questions. Now that we've gotten through all three, there were a couple of things that I was hoping to get you to comment on. One, an anonymous attendee asked, so for the pseudonymous entity category, is the SP, is the service provider prohibited from tying user disclosed info with that pseudonymous ID?
JASON GRIFFEY: So is the service provider allowed to tie other pieces of information to that ID in the background or something else?
HEATHER FLANAGAN: With a user's consent, certainly. On their own, can they start building a profile of that user?
JASON GRIFFEY: I think that's the implication of the question, yes.
HEATHER FLANAGAN: So it kind of depends on why they're doing it and what they're going to do with the data. And here we get back to following the principles of the GEANT Data Protection Code of Conduct. If they're doing this, and it is part of the service that the user has consented to, then they can do that. But then they cannot take that and say, by the way, I think you would really like this completely other service.
HEATHER FLANAGAN: And for a small fee, we'll give you access to that. That is no. That's a hard no. But to help personalize the service that they've agreed to use, they can create that profile. They can tell the user, here's an opportunity. Do you want to create a profile so that we can actually say, hi, Jason, as opposed to hi, User 123.
JASON GRIFFEY: Yeah. I think the other instance that I can think of where that might be looked upon by users or by even the IDPs as a questionable behavior would be if the pseudonymous ID that was associated was then used in combination with something like browser fingerprinting to build profiles of individual users because that is a fairly trivial way of identifying individuals.
JASON GRIFFEY: So that sort of thing also would be not good.
HEATHER FLANAGAN: And there'd be one other thing that I would want to mention. Here now your institution is going to be subject to certain laws. I don't know what those laws are. It depends on where you are. But there are some institutions-- one of the participants in the entity category working group was from Duke University, Tim McGeary. And he pointed out that in North Carolina, the service provider cannot ask the user for data if the user is accessing that material because of a contract with an institution.
HEATHER FLANAGAN: The institution owns the relationship between the service provider and the campus. And so any requests for additional information directly from the user needs to go through the campus. And they've got an excellent consent attribute release program to enable that. But his perspective is that you should never, ever, ever-- the service provider should never directly ask the user for more data because that's not who owns the relationship.
HEATHER FLANAGAN: The campus owns the relationship. I have also heard the flip side of that from the publisher side, saying, no, we need to have a record of consent on our side because we're going to be held liable for what data we collect. And we don't know what the institution has asked the user to agree to or not. So this is a space that does have some tension. And I don't know that other states have the same rules that North Carolina does.
JASON GRIFFEY: Yeah. That actually leads into another question that was along the same lines from John Mark Ockerbloom, who wanted to know what support exists for informed user consent for the pseudonymous authorization category for more detailed information passing.
HEATHER FLANAGAN: Sure. So that's out of scope for the specifications themselves. But there is some work that I would love to draw your attention to called the Consent-informed Attribute Release project. If you go to Internet2 and InCommon, you can find out more there. And Duke University sort of is at the forefront of being the reference architecture for how that can work.
HEATHER FLANAGAN: That could use its own webinar. But I'll give you a teaser and say go look at what they're doing for that project, Consent-informed Attribute Release, known as CAR.
JASON GRIFFEY: Another question. Mike [INAUDIBLE] wanted to know, would the entitlement data vary per service provider? I think you've sort of answered this already, but if you want to be explicit.
HEATHER FLANAGAN: Yeah.
JASON GRIFFEY: I mean, yes.
HEATHER FLANAGAN: It would have to, I would think, because you're going to have a different agreement from one service provider to the next in terms of who gets to see what users are actually authorized to do what things.
JASON GRIFFEY: Excellent. Let's see. There's still a handful of questions. Do you want to take a few more before you move on to the next steps?
HEATHER FLANAGAN: Yeah.
JASON GRIFFEY: OK. So Elizabeth in the chat wanted to know, we have vendors that use federated access that insist their resources cannot be set up without the user filling in a registration providing their name and email address. Could this be or would this be challenged using these categories?
HEATHER FLANAGAN: Oh, I have opinions. I have never run a service provider. I've been part of identity provider groups at multiple campuses, and I've been dealing with federation space for about 10 years. And any time a service provider says, I must have personal information for the service to run, I'm going to take a good hard look as to why that is.
HEATHER FLANAGAN: Sometimes it's perfectly valid, coming back to the Research and Scholarship use case, where there's specific agreements in place because you need a specific scientist to be able to access material, or a specific department, or your grant information says that only this type of user can access this material. Department of Defense grants that say only American citizens can access stuff.
HEATHER FLANAGAN: Things like that, sure, I can see requiring a much higher level of granularity in what data is there. I get that. But do you need it for personalization? Well, why do you need personalization? It's a very nice to have feature, but it is a nice to have feature.
HEATHER FLANAGAN: It's not actually required. I don't know that particular situation, what's really going on there. But I would ask very hard questions.
JASON GRIFFEY: All right. One more. Alistair Morrison says and asks, I think my library has been lazy about entitlement. It's a single yes no field that's an attribute of our user. It sounds like we should be calculating entitlement for each user and service combination, question mark.
HEATHER FLANAGAN: That is a decision that you can make on your campus. I suspect many, many, many, many groups are that kind of binary. But I struggle with thinking of it that way because it will depend. Not all users are allowed to access all the things. And therefore, you do need to get in a little bit more granular detail, saying this user is entitled to this list of things.
HEATHER FLANAGAN: And you can do things to automatically provision this. You don't have to go through one user by one user. You can say, OK, I know everybody that's in the Chemistry department can access this. Therefore there are scripts that can be run that immediately add that. It's a multi-valued attribute, so they can have more than one value, one that says they are in the Chemistry department and the Physics department and the Math department with a minor in History.
HEATHER FLANAGAN: I actually know people that had Biology History majors, so it's a thing. And you could just automatically do that, and then that's your entitlement information.
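[Editor's illustration] The rule-based provisioning described above could be sketched roughly as follows. This is only a minimal sketch: the department names, the `urn:example:` entitlement values, and the function name are hypothetical placeholders, not values from any real campus or specification.

```python
from typing import Dict, Iterable, List

# Hypothetical mapping from department to entitlement values.
# Real values would be agreed between the campus and each service provider.
DEPARTMENT_ENTITLEMENTS: Dict[str, List[str]] = {
    "chemistry": ["urn:example:entitlement:chem-abstracts"],
    "physics": ["urn:example:entitlement:physics-journals"],
    "history": ["urn:example:entitlement:history-archive"],
}


def entitlements_for(departments: Iterable[str]) -> List[str]:
    """Return the sorted, de-duplicated entitlement values for a user's departments."""
    values = set()
    for dept in departments:
        # Departments without a rule simply contribute nothing.
        values.update(DEPARTMENT_ENTITLEMENTS.get(dept.lower(), []))
    return sorted(values)


if __name__ == "__main__":
    # A user in two departments ends up with a multi-valued entitlement attribute.
    print(entitlements_for(["Chemistry", "Physics"]))
```

Because the attribute is multi-valued, a user in several departments simply accumulates several values; nobody has to be provisioned by hand.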
JASON GRIFFEY: All right. I'll mark that one done. And then the last outstanding question from the audience that we have-- and this was one I wanted to hear you answer-- and that is, why is it that you think most should go with the anonymous, that that should be the sort of default? And is Seamless Access going to make a recommendation specifically along that line?
HEATHER FLANAGAN: Sure. So the reason I think the anonymous authorization is likely, and frankly should be, the default, on the one hand, I think personalization is something that is abused and needs to be thought through a little bit more clearly. It's certainly a viable request. Certainly there are cases where it's absolutely reasonable.
HEATHER FLANAGAN: So sure. But I would like a little bit more thought put into it for the authentication only. Just because a user has an account on your institution doesn't mean that you want them to count towards your user base. Just because a user has an account doesn't mean, actually, hardly anything at all other than the user has an account.
HEATHER FLANAGAN: Why does the user have an account? Maybe they're a temp that's there for two weeks. Maybe they're a pre-college student. Maybe your institution is one of those that as soon as someone applies, they are given an account. That doesn't mean anything regarding what they should be authorized to access. That is a separate decision. Once you merge those two things to say, if they can authenticate, they are authorized to whatever any other user is authorized to do, why are you even bothering with your authentication at that point?
HEATHER FLANAGAN: It's just a wide open field, and it's not what I think of as particularly good security practice at all. The consultation period for this project has been open for about a month, as it says on the screen. It's going to run through the end of this month. And we've already been receiving feedback from some of the security experts in the world that says, yeah, the authentication only is a really, really bad idea.
HEATHER FLANAGAN: It is an anti-pattern. Don't do it. So I'm not entirely sure that that one's even going to be approved. That's part of the consultation process. This isn't a these are all guaranteed things we're going to do, we're just working on tweaking language. This is a are these a good idea?
HEATHER FLANAGAN: And it doesn't have to be all three yes, all three no. We may very well just see two out of the three, one out of three. I don't know. It depends on the feedback going through and actually becoming a thing.
JASON GRIFFEY: We have a comment, not a question. But it's from an expert source, and so I will read it. Ralph [? Youngen, ?] who has been a longtime member of the Seamless Access group and obviously is very knowledgeable about the stuff, he says, Heather is missing another critical purpose for pseudonymous identifiers. Publishers often detect potentially compromised credentials by noticing odd usage patterns on their websites.
JASON GRIFFEY: Pseudonymous identifiers are needed to assist the campus to determine the user with compromised credentials. So that was one of his pieces of input on the pseudonymous category.
HEATHER FLANAGAN: Back when I worked at Stanford University, we did have-- and I think every campus-- if you haven't gone through this yet, it's just a matter of time for a publisher to cut you off at the knees because one of your accounts has been compromised, and they're now downloading the entire chemical abstract to some other country entirely.
HEATHER FLANAGAN: And they can't figure out who's doing it, so they just turn off everybody. It'd be really nice to be able to stop that kind of behavior, both the security of having the accounts compromised, and the, OK now we're going to shut off everybody because you've got one bad actor.
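[Editor's illustration] One way a pseudonymous identifier can support this kind of investigation is sketched below. The webinar does not say how identifiers are generated; the keyed-hash approach, the function names, and the example entity IDs here are assumptions chosen for illustration only.

```python
import hashlib
import hmac
from typing import Iterable, Optional


def pairwise_id(secret: bytes, user_id: str, sp_entity_id: str) -> str:
    """Derive a per-service pseudonym from a campus-held secret (assumed scheme)."""
    msg = f"{user_id}|{sp_entity_id}".encode("utf-8")
    return hmac.new(secret, msg, hashlib.sha256).hexdigest()


def find_user(secret: bytes, reported_id: str, sp_entity_id: str,
              user_ids: Iterable[str]) -> Optional[str]:
    """Map a pseudonym reported by a publisher back to a local account."""
    for uid in user_ids:
        candidate = pairwise_id(secret, uid, sp_entity_id)
        if hmac.compare_digest(candidate, reported_id):
            return uid
    return None


if __name__ == "__main__":
    secret = b"campus-only-secret"  # placeholder; never shared with the publisher
    sp = "https://publisher.example.org/sp"
    reported = pairwise_id(secret, "alice", sp)
    print(find_user(secret, reported, sp, ["bob", "alice"]))
```

In this sketch the publisher sees only an opaque value, and a different one per service, but the campus can still trace a reported value to a single account instead of having everyone cut off.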
JASON GRIFFEY: Couple more questions. These are rolling in now. So Lisa Hinchcliffe wanted to know, who ultimately decides which of the three are approved? REFEDS, Seamless Access? Who has sort of final approval on these?
HEATHER FLANAGAN: REFEDS, in this case. So the Seamless Access working group put together the specifications. But because this becomes something that actually applies at the federation level, we went to the group that actually manages the majority of federation-related things around the world, REFEDS, and said, would you incorporate these, would you run these, would you become custodians of these specifications, which they agreed to do.
HEATHER FLANAGAN: And that's who's actually managing the consultation and who will be looking through all of the feedback received and trying to decide what to do next. If you want more details on that, that is on the website. We actually have a blog post that describes some of this work and how to actually follow up and offer your feedback. Runs through August 31.
HEATHER FLANAGAN: Anybody in the world can offer comments, and I do encourage you to do so.
JASON GRIFFEY: One more question from John Mark Ockerbloom, who says, I thought that entitlement was limited to eduPerson Affiliation, but the recent answer suggests it might have more details, like major. Is that something constrained by the spec or controllable by the SP or IDP? Major minor combos, for example, can sometimes uniquely identify a user. Can you clarify? And as a user who as an undergrad was a Biology Philosophy major, and probably the only one in my school, I feel this.
HEATHER FLANAGAN: So eduPerson Affiliation is a controlled vocabulary attribute. You only have seven choices of what can be in there. And it's been a very interesting attribute because, of course-- and this is a known problem. It's been a known problem for a long time-- how some of those terms are actually defined at the local level varies. What is a faculty member at University of Chicago is not 100% the same as what is a faculty member at the University of Florida or the University of Alaska.
HEATHER FLANAGAN: When you take it global, it falls apart a little bit more. I mean, there's enough of a broad categorization. When you say faculty, people are like, oh, yeah, I know what faculty is. You only think you do. But it's a common enough term that at least it gets us started. Entitlement is not a controlled vocabulary. You can put in there pretty much anything that you want.
HEATHER FLANAGAN: And it's a multivalued attribute, so you can put as much of anything in there as you want. Can you manipulate that to potentially uniquely identify an individual? Probably. Probably. It would take quite a bit of work.
HEATHER FLANAGAN: And I personally wouldn't know how to do it, but I can imagine that it could be done. If you do start doing that level of granular logging and data mapping, then you're actually going against what the specs say you're allowed to do anyway.
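[Editor's illustration] On the campus side, a quick check for the risk raised in the question (a combination of values that only one person holds) might look like the sketch below. The user IDs, entitlement values, and function name are made up for the example.

```python
from collections import Counter
from typing import Dict, Iterable, List


def users_with_unique_entitlement_sets(users: Dict[str, Iterable[str]]) -> List[str]:
    """Return user IDs whose exact combination of entitlement values is held by nobody else."""
    # Order and duplicates do not matter, so compare combinations as sets.
    counts = Counter(frozenset(values) for values in users.values())
    return sorted(uid for uid, values in users.items()
                  if counts[frozenset(values)] == 1)


if __name__ == "__main__":
    sample = {
        "u1": ["biology", "philosophy"],  # an unusual major combination
        "u2": ["biology"],
        "u3": ["biology"],
    }
    print(users_with_unique_entitlement_sets(sample))
```

A campus could run something like this before releasing fine-grained values, and coarsen any combination that singles out one person.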
JASON GRIFFEY: Yeah, that much correlation of data would be-- the GEANT principles would speak against that as well. So two more. Going back a question, and the recent question, if asking about the entitlement attribute, for example, eduPerson entitlement, that needs to be agreed upon between the SP and the IDP, correct? Both parties need to agree on those values.
JASON GRIFFEY: You can go as granular as you like as long as both parties agree to support it.
HEATHER FLANAGAN: Yes. And you may need to have a conversation with the service provider so that they know what to do with the entitlement data. They should. But, I mean, being perfectly honest, your mileage will vary with regards to how well people actually understand the applications that they're running for federated identity.
JASON GRIFFEY: Couple more. This one is a bit of a softball, I think, but I'm going to ask anyway. I may end up answering more than you. But will libraries and vendors have to include contract language in service agreements for the category that should govern access for specific products?
HEATHER FLANAGAN: Well, funny you should ask that question because once we had the entity categories firmed up enough to start a public consultation, we kicked off a contract language working group such that we would actually have some kind of template available where people could use these entity categories as a way to describe what they can and cannot do in the contracts. Jason, being chair of that group, would you like to say more?
JASON GRIFFEY: We are currently in meetings on that. We have a couple of subgroups that are doing research on existing contract language, or will be doing research on existing contract language, and then best practices on how to understand these entity categories and work them in. We will be creating a sort of tool box, a tool kit for libraries to use as far as language between service providers and IDPs.
JASON GRIFFEY: So we are working on this particular issue. It just started. We actually had our inaugural meeting last week. And so we will be doing that over the course of the next few months.
HEATHER FLANAGAN: I think the work that comes out of that group is critical, and it's also a little challenging because I do think there needs to be some kind of template such that the librarians know exactly what they're asking for, that the service providers know what is and isn't OK, and do that at a level that it's as global as it can be. That's really important. Flipside. As global as it can be.
HEATHER FLANAGAN: Oh, my goodness, the amount of legislation that we have-- this isn't legal language. This is something that you can use in your contracts based on your laws to make things smoother for all parties. But we're not lawyers.
JASON GRIFFEY: Yeah. I think that's actually the crux of the biggest challenge for the contract language stuff, is just there are so many jurisdictions and so many variances in the way these things are understood that it will be challenging to come up with a one size fits all. But that's, hopefully, the purpose of that sort of tool kit approach, where we can break it up and people can choose the things that are applicable to their particular area.
JASON GRIFFEY: Even more. So Lisa Hinchcliffe asks, what sort of enforcement is there of the GEANT principles, since it sounds like some of the things people are concerned about aren't prohibited by the spec, but rather against the principles?
HEATHER FLANAGAN: So these are self-asserted situations. In terms of enforcement, if someone is being a particularly bad actor and has tagged themselves with an entity category, and yet are doing things with the data that they shouldn't do, then that becomes a bigger conversation. I think the identity provider needs to talk to the federation and say hey, this service provider is not behaving well because if they're breaking that, if they're going outside the best practice as defined by the code of conduct-- which again, we aren't getting people to agree to that explicitly because it's out of their jurisdiction.
HEATHER FLANAGAN: We just really like the way they describe their principles. If someone is not behaving well based on those guidelines, then they've got a bigger problem. They're probably not doing the right thing according to their federation agreement, either, and the trust fabric that they're participating in more broadly.
JASON GRIFFEY: And then we've got time for maybe just a couple more questions. I'll take one from the Q&A that's still there. So for institutions that are part of InCommon and already release attributes to all of InCommon, how would entity categories be involved in that process?
HEATHER FLANAGAN: Release attributes. So how on earth do they do that? I mean, are they basically saying if you're a member of InCommon, we'll share email address, name, targeted identifiers, other identifiers? I mean if they're doing all of that, I don't suggest that's a good idea. I think actually using these entity categories will definitely improve your security footprint overall.
HEATHER FLANAGAN: I think you should support R&S, absolutely. That should be a given, where if a service provider has been vetted to fall in the R&S category, yeah, I think that should be an automatic yes, you support R&S, and you'll release those attributes to those types of entities. Full stop. For entities that are not tagged with R&S, then no, you shouldn't be releasing all the things.
HEATHER FLANAGAN: And I would hope that having these entity categories will make it very easy for the people that actually operate your IDP to configure things a little bit more tightly.
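[Editor's illustration] The release rule described in this answer, where the R&S attributes go to entities tagged R&S and are not released wholesale to anyone else, can be sketched as below. The bundle list is a simplified reading of the REFEDS Research and Scholarship attribute bundle, and the function name and empty default bundle are assumptions; a real identity provider would express this in its own attribute-release configuration rather than in Python.

```python
from typing import Dict, Iterable, Sequence

# Entity category URI for REFEDS Research and Scholarship.
R_AND_S = "http://refeds.org/category/research-and-scholarship"

# Simplified reading of the R&S attribute bundle (illustrative, not normative).
R_AND_S_BUNDLE = (
    "eduPersonPrincipalName",
    "mail",
    "displayName",
    "givenName",
    "sn",
    "eduPersonScopedAffiliation",
)


def attributes_to_release(sp_entity_categories: Iterable[str],
                          user_attributes: Dict[str, str],
                          default_bundle: Sequence[str] = ()) -> Dict[str, str]:
    """Pick the attributes to release based on the categories an SP is tagged with."""
    categories = set(sp_entity_categories)
    # Untagged SPs get only the default bundle (assumed here to be empty).
    bundle = R_AND_S_BUNDLE if R_AND_S in categories else default_bundle
    return {name: user_attributes[name] for name in bundle if name in user_attributes}


if __name__ == "__main__":
    user = {
        "eduPersonPrincipalName": "jdoe@example.edu",
        "mail": "jdoe@example.edu",
        "displayName": "J. Doe",
        "homePhone": "555-0100",
    }
    print(attributes_to_release([R_AND_S], user))
    print(attributes_to_release([], user))
```

Keying release on the entity category, rather than on federation membership as a whole, is what lets the people operating the IDP tighten things without writing a rule for every service provider.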
JASON GRIFFEY: All right. And we had one question from a call-in. Can we unmute Mark and let him ask his question really quickly before we wrap up?
SPEAKER 1: Mark has been-- whoops, wait a minute. Well, I thought he was unmuted.
JASON GRIFFEY: If not we'll maybe follow up privately with him.
SPEAKER 1: Yeah, I'm afraid we'll have to follow up.
JASON GRIFFEY: All right. Mark, I'll follow up with you afterwards if you'd like and we can make sure that your question gets answered. For now, we are very quickly coming to the top of the hour. That was a lot of information. Thank you so much, Heather. Any last words from you?
HEATHER FLANAGAN: I want to thank everybody for coming. I know this is something of a dry subject. I do hope that we are able to make the conversations that need to happen within the campus a little bit easier by having some of our own controlled vocabulary about what information we want to see released and what we don't, and how to do it. If you do have more questions, or if you're interested in commenting on the specifications, I look forward to hearing from you.
JASON GRIFFEY: All right. If you could stop your sharing, I've got a couple more quick slides here. And then we will-- Once more, everyone, thank you so much for being a part of this conversation today. Slides will be posted to the website just as quickly as we can get them up right after we end today. And the video of the session will also be publicly available on the NISO video portal as quickly as we can get it up, certainly before the end of the week, if you'd like to share it with anyone on your end in your organizations.
JASON GRIFFEY: Thank you so much for being here today. We really appreciate you spending the time. And if anyone has any follow-up questions or anything, feel free to reach out to me or to Heather directly. We will absolutely make sure that we get to whatever we can to help you understand this. So thank you very much. Thanks again, Heather, and to everyone for being here.
HEATHER FLANAGAN: Thanks, everybody. Ciao.
JASON GRIFFEY: Thank you.