Doug and Kristi discuss the impact of profiling in the data sets used to train algorithms and its extended impact on decision making.
This is a topic of particular interest to both of us due to our respective passions for data analytics. One of the most prescient points that comes out of the discussion is the true degree of difficulty of creating an objective data set for the purpose of training predictive algorithms.
Doug's business specializes in partnering with companies and non-profits to create value and capture cost savings without layoffs to fund growth and strengthen financial results.
You can find out more at www.TerminalValue.biz
You can find the audio podcast feed at www.TerminalValuePodcast.com
You can find the video podcast feed at www.youtube.com/channel/UCV5a4QbT-dXhpgb-8HJHdGg
Schedule time with Doug to talk about your business at www.MeetDoug.Biz
<<Transcript>>
[Music]
[Introduction]
Welcome to the Terminal Value Podcast, where each episode provides in-depth insight about the long-term value of companies and ideas in our current world. Your host for this podcast is Doug Utberg, the founder and principal consultant for Business of Life, LLC.
Doug: Okay, welcome to the Terminal Value Podcast. I have Kristi Yuthas with us today. Kristi and I actually worked together a couple of years ago, teaching a finance information systems class at Portland State University, and Kristi is very generously even willing to talk to me again after that experience, which I thank her greatly for. What we would like to talk about today is analytics, particularly the advent of profiling, or racism, in analytics and what we can do about it. Kristi, welcome.
Kristi: Thank you, Doug. And let me just say, I miss you in the classroom. That was really fun.
Doug: It was very illustrative. It was my first time teaching a class, and I came in with these great thoughts of students who would be yearning for knowledge. What I found was that, not all, but many really just wanted an instruction sheet for getting an A so they could get out of class and move on.
Kristi: You know, but that was the most dynamic class, the night class. These kids work all day, they go to class at night.
Doug: Yeah.
Kristi: And you just kept me and everybody else engaged.
Doug: Yes, I remember I did tell a lot of stories.
Kristi: Great.
Doug: Yeah, that class was a lot of fun. We will definitely have to find some time to teach together again in the near future. But one of the things that Kristi has been doing quite a bit of work with is accounting analytics, because data science, of course, is really pervading everywhere. But I think data science is becoming especially important in the accounting profession because it's impactful in different ways, whether that's forecasting results, testing for potential control gaps, or testing for fraud. But that's actually not what we're going to talk about today. What we're going to talk about today is the place where data and analytics can actually get us into trouble, because there have been times when analytic algorithms have actually resulted in profiling that is really not fair to the individual. Kristi, would you take it away from there, after I served you up a nice juicy softball over the plate?
Kristi: Oh my goodness. There's so much to talk about here, but just in terms of even basic analytics, we get wrapped up a lot in the tools and in the coding or in the statistical analysis, and we really are likely to lose the whole context. You know, we just forget these are real people, these are real situations, and we get so embedded in the data that we forget. And I think part of that is the way we teach these things. I mean, we use textbooks where the data sets match up perfectly with the problem, and the factors all line up perfectly, or your regression comes out smooth with a nice, you know, R-squared. And so we just aren't trained to really think about the messy world. What are the reasons the data look like they do in the first place? And then what are the consequences of the decisions that we make using that data? So that's a big problem when people are making the decisions, and it becomes a bigger problem when algorithms are making those decisions.
Doug: Well, I think that's a really prescient point, because at least what I've found is that there are kind of two extremes, right? Extreme A is where you have people who don't believe in numbers and just want to make every decision based on their gut. And extreme B is kind of where you have Skynet or WOPR making your decisions based on amoral algorithms. And there doesn't really seem to be a lot in between. Generally speaking, management structures have a really hard time staying away from one or the other. What are your observations? I'll be happy to tell you mine, but I don't want to be the only one talking.
Kristi: Well, I've tried to train students to be in between, you know, and to slow down the algorithmic analysis until you really understand the data and why you've got the data.
Doug: Yeah.
Kristi: Because once those tools get into place, they're sort of self-reinforcing, I mean.
Doug: Correct.
Kristi: They learn, and sometimes, you know, the PhDs who are creating these algorithms have no idea what the algorithm is doing anymore. So it becomes kind of a black box, and if you feed it the wrong stuff to begin with, it's just going to cycle in on itself and create these outcomes that you never anticipated.
Doug: Well and.
Kristi: So there are lots of examples of that, but...
Doug: Sure, okay. Go ahead and give us a couple. And again, I'd be more than happy to put in my subjective feedback, but I'm interviewing you, so...
Kristi: Oh yeah, no, I'd love to hear your stories too. So, this is kind of beside the point, but I just want to illustrate this in a visual way. Take, for example, a soap dispenser or a water faucet in a public restroom.
Doug: Yep.
Kristi: Those things have a little camera. You stick your hand down there and the soap comes out; you stick your hand under the faucet and the water comes out. Well, that works great when you have white skin, but if those machines were trained on white skin and you've got dark skin and you stick your hand under there, you might not get any soap. You might not get water. It's very frustrating, you know, just because nobody thought about that. The people writing the code were predominantly white, and the people they tested those machines on were predominantly white. There was nobody from an underrepresented group in the room at any of those steps.
Doug: Yeah.
Kristi: And the problem is that nobody even knows this until these things are out in every airport, and then all of a sudden we realize, oh, we made a big mistake.
Doug: Yeah.
Kristi: So that's the kind of thing we're trying to avoid at the outset, and you really have to take a step back. You cannot just start with the data set. You have to figure out where that data came from.
Doug: Well, I think that's actually a lot more impactful than the average person understands. Any time you reverse engineer algorithms from a specific data set, those algorithms are going to be tuned to that data set. And so that means unless it's a very, very broad data set, you'll have a natural bias in those algorithms. There are some cases where that can be fairly innocuous, and other cases where it can actually be very harmful. An example we were talking about off camera is that if you let your AI run amok, it may, for example, look at crime statistics, find that areas with higher densities of African-American demographics have higher rates of crime on average, and conclude that African-Americans are therefore more likely to be criminals. It's like, no, that's not okay. That's a line that you can't cross, and an algorithm won't know that unless you tell it. But the problem is that even now, as AI and RPA and all this stuff is really coming into its own, it's still a very young profession. In a lot of cases, like you said, the algorithm doesn't know what to do or where to stop unless you tell it what to do or where to stop. And the profession is still young enough that people haven't really thought of all the places to tell the algos where to stop, or how to tell them what to do in a comprehensive way. Because, like you said, you have photo scanners that have been trained on light-colored skin, not thinking that maybe there are some people with darker-colored skin who would like to wash their hands too.
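To make the point about algorithms being "tuned to the data set" a little more concrete, here is a minimal, purely illustrative Python sketch. Nothing in it comes from the episode; the sensor readings, the two groups, and the 99/1 training split are invented numbers, and the "dispenser" is simply the hand-washing example above reduced to a single calibrated threshold.

```python
# A made-up sketch of how a rule "tuned to its training data" can fail the
# people that data under-represents. All numbers here are invented.
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical sensor readings for two groups of users. In this toy setup,
# group A (over-represented in training) produces stronger readings.
group_a = rng.normal(loc=0.80, scale=0.05, size=10_000)
group_b = rng.normal(loc=0.55, scale=0.05, size=10_000)

# The "training" sample is 99% group A and 1% group B: a narrow data set.
train = np.concatenate([group_a[:9_900], group_b[:100]])

# Calibrate the dispenser: pick a trigger threshold that roughly 99% of the
# training sample clears. On the training data this looks nearly perfect.
threshold = np.percentile(train, 1)

print(f"detection rate, group A: {np.mean(group_a > threshold):.1%}")
print(f"detection rate, group B: {np.mean(group_b > threshold):.1%}")
# Typical output: group A is detected ~99% of the time, group B far less
# often. The bias comes from the training sample, not from the code logic.
```

The point of the sketch is only that a rule which scores almost perfectly on the sample it was calibrated against can still fail badly for a group that sample barely contains.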
Kristi: Yeah, exactly. I love that racial profiling example, because if there was any bias in terms of who got arrested in the first place, let's just say.
Doug: Yeah.
Kristi: Yeah, you know, Black people got arrested at a higher rate for doing...
Doug: Yes.
Kristi: ...the same activity. Well, once your predictive algorithm tells you that a neighborhood is a high-crime neighborhood, you send more police into that neighborhood. So they start seeing more crime and they start arresting more people.
Doug: And it's self-reinforcing.
Kristi: Exactly. And so it's a no-win situation. And if you don't understand all the things that happened before you even got to the data sets...
Doug: Yeah.
Kristi: ...to begin with, you're going to create algorithms that do that exact thing.
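As a purely illustrative sketch of the feedback loop described above, the short Python snippet below assumes two neighborhoods with identical underlying crime rates but a historically skewed arrest record, plus a rule that allocates patrols wherever past arrests were recorded. The numbers and the simple "arrests scale with patrols" rule are assumptions made up for the example, not anything stated in the episode.

```python
# A made-up illustration of the self-reinforcing loop: patrols follow
# recorded arrests, and recorded arrests follow patrols. Both neighborhoods
# have the same true crime rate; only the starting record is biased.
true_crime_rate = [0.05, 0.05]    # identical underlying behavior
recorded_arrests = [60.0, 40.0]   # historical bias: neighborhood A over-policed
TOTAL_PATROLS = 100

for year in range(5):
    total = sum(recorded_arrests)
    # Allocate patrols in proportion to the arrests already on record.
    patrols = [TOTAL_PATROLS * r / total for r in recorded_arrests]
    # New arrests scale with how many officers are present to observe crime,
    # not with any difference in the neighborhoods themselves.
    new_arrests = [p * rate * 20 for p, rate in zip(patrols, true_crime_rate)]
    recorded_arrests = [r + n for r, n in zip(recorded_arrests, new_arrests)]
    print(f"year {year}: patrol split = {patrols}")
# Output: the patrol split stays locked at 60/40, because the only data the
# rule ever sees are the arrests its own patrol allocation produced.
```

Even though the two neighborhoods behave identically, the recorded data keep "confirming" the original skew, which is exactly the no-win cycle described in the conversation.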
Doug: Yeah, and I think that's a very prescient point, and I don't know for sure that I can articulate the answer, because I think it's very complicated. But I think it's something that needs to be answered, because AI isn't going away, algorithms aren't going away, and data-based decision engines aren't going away. So we need to figure out some way to make sure that they're programmed ethically, so that flagrant problems like this don't persist.
Kristi: Yeah. So one of the major ways to address that is just exactly what you're doing, just discussing it.
Doug: Yeah.
Kristi: Bringing these points to the front, because people aren't aware of this stuff until they hear it. Once you hear it, you're like, oh, of course that might happen. But if you're not aware of it...
Doug: Yeah.
Kristi: You're just a coder, and we've been trained for so long into thinking that data are objective, that they reflect reality. Technology is objective, it's neutral, it doesn't have any opinion on anything. So we can just start with the data and the technology, and then we'll get a result that's reliable. And we've been trained in the scientific method, so we think there's no politics or bias, you know.
Doug: Sorry, I'm suppressing laughter here.
Kristi: Because we don't get a chance to really take a step back.
Doug: Yep.
Kristi: But why did they come up with that theory in the first place? Who came up with it, and what is the context they came from? Why would they think that? And who gathered that data, and why? You know, what were the circumstances under which that data arose in the first place? We really have to go backwards a long way.
Doug: Well, I mean, the way that I describe technology is that I think of technology as sociopathic. In other words, it's not good, it's not bad. It does exactly what it thinks it needs to in the most optimal way, with no regard at all for morality, emotions, impact, anything.
Kristi: So.
Doug: That's the way that I think about tech CEOs too, because that's the behavior I've observed.
Kristi: We do have technology ethics classes. I think every computer science major in the country now has to take at least one, you know.
Doug: That's probably a very good thing, because that's probably one of the more disturbing trends I've noticed. With technology in and of itself, of course, there are exponential gains, but algorithms are fundamentally amoral, right? They're not moral unless they're designed to be moral, and it's hard enough to code them to work properly in the first place, much less try to impart some form of a holistic moral belief system. And...
Kristi: And that slows everything down.
Doug: Right exactly.
Kristi: We have huge backlogs of projects that we have to get out the door. And so sitting around and discussing the ethics, when that's not the emphasis in the first place...
Doug: Yeah.
Kristi: That's just not in the budget.
Doug: Yeah, exactly. It's a very significant drag on throughput. But of course, as you've seen, you need to either be willing to put something out, take it back, and re-engineer it, or you need to be willing to take a very long time to figure out some of these bigger problems before you release something. Otherwise you can run into a really tough situation.
Kristi: Right. And you can do a little of both.
Doug: Yeah, exactly.
Kristi: You know, because if you have a good audience, you can ask, oh, what are we forgetting? Or ask the questions, like, why is this coming out this way and not that way? Then you can go back, so you don't have to figure out everything up front.
Doug: Well, yeah. I was being intentionally pedantic, but thank you for calling me out on the carpet.
Kristi: But to me, one of the most obvious and most important solutions to that is to get a diverse group of technology people in when you're designing and architecting...
Doug: Yeah.
Kristi: And coding these systems, that's not easy to do.
Doug: No it’s not.
Kristi: There was one article I read, maybe five years ago, that said all the Black people working at both Google and Apple together could fit on one jumbo jet, or something like that. I may have the statistics all wrong.
Doug: Yeah.
Kristi: You know, but the percentage of tech people from underrepresented minority groups is low.
Doug: Yeah.
Kristi: It's hard to get those people on your team, so we have to figure out a way to branch out. Maybe we don't get the MIT-trained data scientist, but we get somebody who sees the bigger contextual picture, so they can bring something up. So, I mean, it's really important. And then of course we need to invest in getting kids into the system from a variety of backgrounds.
Doug: Correct. Well, what you're talking about right there is actually a rather complex situation. And the reason I say it's complex is because the traditional politician or administrator way to solve it would be to say, hey, we don't have enough people of a certain demographic cohort at a certain company, so that's easy, just force companies to hire more people from that demographic cohort. Well, that may or may not address your core problem. One way it can go wrong is that if those people haven't been trained adequately to produce a quality product, that could reduce your enterprise value. Another way you could run into a problem is that if they've been trained to think identically to people who have a different demographic profile, you could have diversity but still have groupthink. And that's actually one of the things that I keep thinking of: just because people have a different gender or different skin color doesn't necessarily mean that they think differently.
Kristi: That is so interesting. Yeah, because we just always hire people who are like us. We were all trained in these high-tech programs, so we think only people who were also trained in very similar programs can do the job effectively.
Doug: And yeah, I think the tension you have is that there's obviously a certain degree of technical, business, and industry competence that you need in order to do your job effectively, but then you also need to bring people in who have different ways of analyzing and assessing problems. And I think spanning that gap has actually been kind of hard, because there's a lot of concentration in the way that people are taught through school or whatever. And so even in the case where you have a lot of people who look different, in a lot of cases, the way that they've been...