Episode 26: The Importance of Explainable AI with Jay Budzik CTO at ZestFinance
In this episode of AI at Work, Rob May interviewed Jay Budzik, CTO at ZestFinance. ZestFinance's software uses machine learning algorithms and big data technology to help financial services companies make more effective credit decisions, safely and transparently. Tune in for Jay's take on the importance of explainable AI.
Rob May, CEO and Co-Founder,
Jay Budzik, CTO,
Rob May: I'm Rob May, Co-Founder and CEO of Talla. I'm your host. Joining me is Byron Galbraith, my Co-Founder and our chief data scientist. Our guest today is Jay Budzik, who's the CTO at ZestFinance. Jay, welcome to the podcast. Give us a little bit about your background and what ZestFinance does.
Jay Budzik: Hey. Thanks for having me. I did a PhD in AI at Northwestern. We were just talking about Chicago for a minute. And founded a series of companies that were all about bringing AI to the enterprise, which is what brought me to ZestFinance, which was really founded with the mission to make fair and transparent credit available to everyone.
Our founder, Douglas Merrill, who was the CIO and VP of engineering at Google, wanted to equalize the playing field for some participants that had been neglected by the financial system by using better math to make more inclusive, fair, and profitable credit decisions. We think the wave of ML now is coming to financial services. And we're certainly seeing that in our customers.
What enables that is the ability to explain how ML models work, because if you're going to use ML to run a billion dollar lending business, you probably want to know that it's doing the right thing. We focused a lot on making AI transparent and explainable, so that people can get comfortable trusting it.
RM: As part of that-- I mean, it's interesting that you also mentioned in your description of Zest a little bit about the fairness aspect as well. Do you think this makes the decisions-- because we've definitely seen an explainability crisis in AI for a lot of fields, but we've also seen a bias issue that's concerned a lot of people. Are you guys able to address that as well?
JB: I mean, there's sort of an existential question there, which is that if you make all sorts of lending decisions based on sort of biased practices and then try to use those data sets to create models, right? Those data sets are biased. You're stuck with an imperfect measurement of what actually would happen if you were more fair.
We've been able to address those using advanced machine learning techniques. So what we can do is we can give the business options, because generally, when you're talking about making fair decisions, you might give up some accuracy by being more fair, or some profit. And so there's a trade-off there. What's needed is the ability to see what happens in that trade-off and whether or not you want to live with the consequences.
It's been really difficult to sort of try out lots of different alternatives and see what's going to happen. But our tools make that easy.
RM: Tell us a little bit about-- you wrote a blog post a while back about explainability snake oil. I find it very interesting because AI is one of those fields that is fascinating, there's actually a lot of really great stuff going on, and it is also covered in lots of snake oil. Tell us a little bit about where the snake oil is in explainability.
JB: It's provocative, right? It's meant to be that. These guys are trying to do the right thing. It's just that, if you take approaches that were meant to be used on things like linear models that you'd get out of a logistic regression type framework and use them on machine learning models, you can unwittingly make errors. We're seeing a lot of that out in the industry.
Our models, when you use them in real-world financial services businesses, generate these huge profits compared to what they're doing today. That creates this incentive to go really fast to try to get the profits. In that rush to apply ML, you might be tempted to take shortcuts and use the techniques that are out there without really asking the question, are those techniques giving me the right answer?
Can I trust them? Are they accurate? Is that really what's happening with the model? What we did was a set of studies, just on a really simple two variable model, to show that if you use the wrong explainability method, or if you use an explainability method that may work perfectly well in a logistic regression context on an ML model, you actually get the wrong answer. That's problematic for all sorts of reasons in the financial services business.
That happens even on the most simple model. If you can't do it in a two variable model, how are you going to do it in a 2000 variable model? That's what that study was about. It was trying to shine a light on the fact that not all explainability methods actually give you accurate answers about what the model is doing. So you have to be careful about how you choose them.
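A minimal sketch of the kind of failure described above, using a made-up setup rather than the one from the ZestFinance study. The hypothetical `model` is a two-variable function with a pure interaction, evaluated on the four points where each input is -1 or 1. A linear surrogate fitted by least squares is read the way you would read regression coefficients, and it reports that neither variable matters, even though flipping either input flips the model's output.

```python
from itertools import product

def model(x1, x2):
    # Hypothetical two-variable model: the output depends only on the interaction.
    return x1 * x2

# All four combinations of x1, x2 in {-1, 1}.
points = list(product([-1, 1], repeat=2))
y = [model(a, b) for a, b in points]

def ols_coef(column):
    # On this balanced grid the columns [1, x1, x2] are mutually orthogonal,
    # so each least-squares coefficient is dot(column, y) / dot(column, column).
    return sum(c * v for c, v in zip(column, y)) / sum(c * c for c in column)

intercept = ols_coef([1] * len(points))
coef_x1 = ols_coef([p[0] for p in points])
coef_x2 = ols_coef([p[1] for p in points])

# What the model actually does when x1 is flipped at the point (1, 1).
flip_effect_x1 = model(1, 1) - model(-1, 1)

print("linear surrogate coefficients:", coef_x1, coef_x2)
print("actual change from flipping x1:", flip_effect_x1)
```

The surrogate's coefficients are both zero, so a coefficient-style explanation would say the inputs are irrelevant, while the direct check shows the output moving from 1 to -1.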
BG: I reviewed that article and it was interesting and I kind of followed up on some of the other references that we had on this. I saw that the method that you guys are using, as opposed to sort of like LIME, which is a popular one that people have tried to use for this, is something called Shapley Additive Explanations (SHAP), which has sort of a game theoretic application to it. Can you talk a bit more about that?
JB: There's this guy, Shapley, who was a British mathematician and game theorist, who contributed this really great formula that allows you to assess the value of a player in a cooperative game. You have a team of cricket players, which is what he was trying to model, and what's key there is that the players don't work alone.
It's not like you have team members that are working in isolation, they work on a team. The effects of those players playing off of each other is kind of important to model as well. And so he developed this sort of theoretical framework for doing that in game theory. It requires massive amounts of computation. You have to generate all model combinations, including all players, and compute the score differences across this vast space of different model alternatives.
You can't exactly compute that directly in any practical sense when you've got a model with more than, say, a couple variables. The Shapley baseline though is a really important thing that you can use to ground your empirical evaluation of explainability methods. So like you need a baseline to compare against, that's the baseline. Now, what folks have done in the recent years is they've developed techniques that allow you to compute those values quickly for different types of models.
The Shapley explanation work is out of the University of Washington. Scott Lundberg is the guy. He's really an awesome guy. The contribution is great. He was able to do that for a certain class of models called gradient boosted trees. They're now expanding that to other types of models and using different techniques, like DeepLIFT and the rest. But what we've done here is we've sort of extended that work and combined it with some other work to allow us to explain heterogeneous ensembles of models.
These are models that sort of take an answer from a gradient boosted tree, take an answer from a neural network, take an answer from a logistic regression and combine them to get a better answer, because you're using multiple perspectives on the data and different mathematical techniques. What we've done is extended that work to allow us to use a broader palette of model types, which then allow us to be more accurate, allow the models to be more stable over time, but still provide accurate explanations, so that you can trust them.
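A minimal sketch of the Shapley computation discussed above, written as an illustration and not as ZestFinance's or the SHAP library's implementation. The function `shapley_values`, the toy models, and the `background` rows are all hypothetical. The value of a feature subset is assumed to be the model's average output when those features are fixed to the input and the rest are taken from the background rows. The loop enumerates every subset, which is why the exact calculation becomes impractical as the number of variables grows. The last lines check a known property of Shapley values: for an ensemble that is a weighted average of models, the attributions are the same weighted average of each model's attributions.

```python
from itertools import combinations
from math import factorial

def shapley_values(model, x, background):
    """Exact Shapley attributions for input x by enumerating all feature subsets."""
    n = len(x)

    def value(subset):
        # Assumed value function: average output with features in `subset`
        # fixed to x and the remaining features taken from each background row.
        total = 0.0
        for row in background:
            mixed = [x[i] if i in subset else row[i] for i in range(n)]
            total += model(mixed)
        return total / len(background)

    attributions = []
    for i in range(n):
        others = [j for j in range(n) if j != i]
        phi = 0.0
        for size in range(n):
            # Shapley weight for coalitions of this size that exclude feature i.
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                s = set(subset)
                phi += weight * (value(s | {i}) - value(s))
        attributions.append(phi)
    return attributions

def tree_like(v):
    # Hypothetical stand-in for a tree model: a single step-shaped rule.
    return 1.0 if v[0] > 0 and v[1] > 0 else 0.0

def linear(v):
    # Hypothetical stand-in for a regression-style model.
    return 0.5 * v[0] - 0.25 * v[1]

def ensemble(v):
    # Heterogeneous ensemble: equal-weight average of the two models.
    return 0.5 * tree_like(v) + 0.5 * linear(v)

background = [[-1, -1], [-1, 1], [1, -1], [1, 1]]
x = [1, 1]

phi_tree = shapley_values(tree_like, x, background)
phi_linear = shapley_values(linear, x, background)
phi_ensemble = shapley_values(ensemble, x, background)

print("tree-like:", phi_tree)
print("linear:   ", phi_linear)
print("ensemble: ", phi_ensemble)
```

With two features there are only four subsets to evaluate; each added feature doubles that count, which motivates the faster model-specific approximations mentioned in the conversation.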
RM: One of the things we like to talk about when people come on the program is, there are so many companies out there that are not deep in AI yet, they don't have data science teams, they're starting to hire, they're starting to think about that. You guys have been doing some of this for a while. What have you learned-- lessons learned for sort of those people about how data science teams and engineering teams work together?
Given that the workflows are kind of different, given that the skill sets are kind of different. What advice would you give to somebody who's sort of bringing on their first data science team and wants to make them productive?
JB: There's a couple of things there. The first is, whether it's data science or engineering, hire the smartest people. You want to look for folks that really are never going to give up and they're going to constantly question the results and constantly drive themselves and the organization to a better place because they're oriented that way. That's the first thing. Then the second thing is to recognize that you want to create an environment where the data scientists are going to be productive and are going to be able to leverage their skills.
If you hire a data scientist into an environment where they don't have any data, for example, that's going to be frustrating. As an initial matter you need to look at, what is this person going to do? What tools are we going to give them to be able to be successful? A lot of places just aren't in a place where they have the data in an infrastructure that's going to help the data scientists succeed.
You have to look at what are the enabling infrastructures and steps that you need to take before you go hire someone. Once you have them, I think it's really being clear about the objectives and how they're going to be measured, because as soon as you establish the right set of metrics and you have the right data set, that's a field day for a data scientist. They're going to optimize that outcome as quickly as possible. Setting up those initial conditions, making sure that you're hiring the smartest people and that they have the tools to be successful I think are the most important things.
RM: When you think about where ZestFinance is going and some of the different AI machine learning techniques and some of the things that are inputs to those techniques, when you look at the AI landscape, what's the problem that you guys aren't working on that you wish somebody would solve that would really help you and make your business better?
JB: That's a great question. One of the things that we've done here is we've deliberately sort of left predictive accuracy to the academics and the large R&D shops. So the next wave of modeling accuracy is probably going to come not from here. So I'm looking for the next TensorFlow, the next XGBoost, the next sort of modeling technique that's going to take our model accuracy and stability to the next level.
Then what we'll do is we'll figure out how to explain it and make it practical to use. I think kind of-- as a data science company, we've decided to focus on a particular area and get really good at that. Machine learning in general is such a large field that requires so much depth and diversity of thought that we're sort of leaving core chunks of it to folks that are really focused on that, the places like Google and the University of Washington and Facebook and Microsoft and the other large R&D labs.
BG: To jump on that a little bit, because, I mean, you guys are sort of building a product to sell to financial institutions. We're in the business of trying to build sort of AI for practical real world use cases. A lot of what's coming out of academia or some of these big R&D shops you could argue in some cases isn't real world or it's a very, very specific or niche world.
We're already doing the things where we've got like-- I mean, I guess, and you guys are in the world of large, large data, which helps, but some of us we have small data, or we have methods that just don't-- that these companies, or these academics, they drive after certain benchmarks, or they drive after certain things. So, any thoughts around where-- do you think that those companies are going to produce those kind of models? Or do we need that to come from people, maybe now more in the practitioner world, that are trying to solve more real world societal challenges?
JB: I think that's fair. Look, I mean, it's really cool the type of stuff that you could do on image data and the stuff that you can do with voice data. The consumer applications there are just astounding. You have these self-driving cars, and you have Siri that's able to do speech recognition now, Microsoft's Cortana. The use cases that those technologies enable, those applications of AI enable, are pretty cool.
Our data isn't image data. It's not sort of sound waves. It typically comes in to financial services companies in tabular form. We're looking at predicting a different set of stuff. And so it's sort of unclear that convolutional neural networks, which were designed to work on images, are going to really apply. On the other hand, there are techniques like LSTMs that were developed to understand sort of natural language and to model time series data, like speech, that do turn out to be pretty applicable.
One of our key functions here is the translational sort of work where we take the academic results and evaluate them for practical applications on our data set in our business problems. And oftentimes we find that those techniques just fail because they were designed to solve a different problem. In other cases, we find that they're very successful.
I think that's absolutely right. The folks who are trying to use this stuff are really making a map of what works for what problems. It's impossible for the guys who invent the core algorithms to consider all the use cases and to do that work. So there's a really important function that businesses provide when they go to try to actually apply this stuff to a real world problem.
RM: You guys did a partnership last year with Microsoft. Tell us a little bit about how did that come about. Did you guys approach them? Did they approach you? One of the hard things about a startup sometimes is striking these deals with big companies. How does it work? What did you see that made you pursue it?
JB: Well, it helped that a number of our early customers were running our machine learning models and explainability tools on Microsoft's infrastructure. A few of these guys were .NET shops that had Azure Cloud and were able to make use of it. And what that taught us was that there was really this peanut butter and jelly type of effect. In machine learning, the workloads are very lumpy.
When you go to train a model you need a lot of compute. And when you're just doing exploratory data stuff you don't need that much compute. So having an elastic cloud based infrastructure, like the one offered by Azure, is really important. And then enterprises are really concerned about security and integration with their policies and their Active Directory and sort of the user management and all of the data security things. And Microsoft has that down pat.
It just makes sense from a technical perspective that we would have interoperability there and from a business perspective because of what Microsoft does that we don't want to do. We don't want to be in the cloud business. We don't want to be in the security business. We want to be in the ML business.
We took that to them and said, look, guys, this is really working well together. There was an executive there that cared deeply about regulated industries and fairness and financial inclusion and sort of cracking that open for Microsoft. She took us under her wing and got the partnership done.
BG: Awesome. Microsoft, they've got researchers there who are leaders active in this sort of fairness, accountability, and transparency area, like Kate Crawford and Hanna Wallach. I'm curious, like do you guys get to interact with them at all as part of this partnership? Do you get to leverage that?
JB: It's pretty cool. It's pretty cool. There's a whole sort of committee on AI and fairness. The head of MSR North America and Europe, Eric Horvitz, happens to be the guy-- one of the guys who funded my PhD. I knew him from before. And we met with some of the folks who are working on generative-- additive models that are more inherently explainable to understand how the different techniques complement each other.
Microsoft continues to pursue its independent R&D agenda around explainability. There's a separate product roadmap for their ML tools. We're able to deliver this value to financial institutions now. And both organizations are really excited about it.
RM: Well, Jay, we used to wrap up on a question about Elon Musk versus Mark Zuckerberg views on AI, but that's gotten old and there are many more pseudo famous AI debates now. So the one we're going to wrap up on this year is, what I'm calling the Gary Marcus versus everyone else debate. Even though it's not rigged, there are some people on Gary Marcus' side, but there's a lot of people who aren't.
I'm interested in where you stand on sort of the limits of deep learning. And obviously, it sounds like you guys use a wide variety of techniques. And so where do you fall on that? How much should we be looking at other techniques? How much are we going to need those other techniques to move AI forward? Or is deep learning going to get us a lot further than we think?
JB: I mean, I think it's a phenomenal technology, so I would never say-- but I grew up in an era where Roger Schank and Noam Chomsky were arguing about whether it was syntax or semantics, or whether you're going to get there through symbolics or statistics, with Minsky and McCarthy. So these debates fuel lots of academic careers and get a lot of papers written. Look, we've seen a practical example in mortgage lending where an XGBoost model, a tree-based model, does better on two of the seven segments, or three of the seven segments, but the deep neural network does better on the other four or five segments.
You're always going to get better by having that diversity of perspective. And you don't want to rely on one because that's kind of risky from a portfolio management perspective.
I wouldn't say that there are any silver bullets here. Certainly deep learning has gotten us a lot further than we've been able to get in certain domains and it's very useful. We find it very useful in financial services as well.
RM: Good, Jay. Thanks for coming on the program. For those of you listening, thanks for listening again this week. If you have guests you'd like to see or questions you'd like us to ask, please email us at firstname.lastname@example.org. And we will see you all next week.