Episode 23: 2019 Predictions from Talla's Data Scientists

In this episode of AI at Work, Rob May sits down with some of Talla's data scientists to get their take on questions like: Were there any key breakthroughs in natural language processing in 2018? Did we make any progress in explainable AI in 2018? What are your predictions for 2019? And many more...

 

Subscribe to AI at Work on iTunes or Google Play


Rob May, CEO and Co-Founder, Talla


Byron Galbraith, Co-Founder and Chief Data Scientist, Talla 


Ladi Ositelu, Data Scientist, Talla


Dhairya Dalal, Data Scientist, Talla




Episode Transcription

Rob May: Hello, everybody, and welcome to the latest edition of AI At Work. I'm Rob May, the co-founder and CEO at Talla. What we decided to do for a show to wrap up the year is, I have part of the data science team here from Talla, three of the members, and we're going to talk a little bit about some of the work that we did at Talla this year, some of the things that we noticed in AI and machine learning in general, and some of the things for next year that we're thinking about and find interesting, and hopefully we can give you some good ideas of what to watch out for next year. I will let these guys all briefly introduce themselves, and we'll start with my co-founder Byron Galbraith, who's also our chief data scientist.

Byron Galbraith: Rob just introduced me. Hi, I'm Byron.

RM: What's your background Byron? What were you doing before this?

BG: Before Talla, I was finishing up my PhD in cognitive and neural systems at Boston University. I worked on brain-machine interfaces and neuromorphic control systems for adaptive robotics. While I was doing that, I was also working on an edtech company trying to teach STEM topics to kids in middle school through game-based learning.

Dhairya Dalal: My name is Dhairya, also one of the data scientists on the team. I've got a bit of a diverse background. I spent about three years at Harvard working with the institutional research group, providing qualitative and quantitative analysis for senior leaders at Harvard to help them make strategic decisions using data science. Afterwards I spent some time at the Allen Institute for Artificial Intelligence working on their dialogue research, and then the Allen Institute for Brain Science working on generating scientific knowledge graphs for neuroscience research. In my free time I spend a lot of time working with the non-profit sector here in Boston.

Ladi Ositelu: My name is Ladi, also a data scientist here. Before Talla I was working on a PhD in energy engineering. I was in this space between energy economics and policy making. My research was to, sort of, understand the impacts of large-scale power outages on the stock market value of power companies.

RM: Ladi, you didn't have an NLP background, but more of a general data science background. Not everything we do here is natural language processing, but a good core chunk of it is. How have you found the transition, for people out there who are data scientists thinking about NLP? What's been interesting about it, or different about it, that you've noticed?

LO: I think the challenging thing was just, sort of, getting used to a lot of the terminology and some of the different techniques. Behind it, the mathematics aren't very different. You see that you're working with some type of math that you've seen at some point in time, so it's not that much of a stretch to pick it up.

Most of the stuff I've learned has been from learning on the job, having to do things like other sorts of natural language processing, language models, and so on. Most of it I just, sort of, picked up on the job, plus Coursera classes, which you can also look at to pick up things that aren't immediately obvious.

RM: If you were going to summarize natural language processing for 2018, were there any key breakthroughs? What was interesting? What was surprising?

BG: We were kind of chuckling here because 2018 was the year of the new era of natural language processing and… Sesame Street puns that went on too long. The Allen Institute, the AI2 that Dhairya worked at, released a new word embedding approach called ELMo, which I think was actually a legit acronym that just had a nice name. It gave a nice boost to a lot of different natural language processing tasks, most of which, in production environments, or even just applied environments, tend to rely on some sort of word embedding to convert the discrete words into, sort of, vectors so that these models can actually work on them.

Then Google turned around and came out with BERT, which they announced with great fanfare. It turns out it does actually do better on these tasks if you use the pre-trained models. But if you actually wanted to train your own, it's pretty much outside the realm of any sort of normal non-Google person or team. The resources required to train BERT are actually quite, quite extensive.
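(A minimal sketch of what the "just use the pre-trained model" path looks like: pulling contextual embeddings from a released BERT checkpoint with the Hugging Face transformers library. This assumes a recent release of that library and is an illustration, not Talla's production code.)

```python
# Pull contextual embeddings from a pre-trained BERT checkpoint.
import torch
from transformers import BertModel, BertTokenizer

tokenizer = BertTokenizer.from_pretrained("bert-base-uncased")
model = BertModel.from_pretrained("bert-base-uncased")
model.eval()

inputs = tokenizer("Were there any key NLP breakthroughs in 2018?",
                   return_tensors="pt")
with torch.no_grad():
    outputs = model(**inputs)

# One contextual vector per word piece: unlike static word2vec/GloVe vectors,
# the same token gets a different vector in a different sentence.
print(outputs.last_hidden_state.shape)  # (1, num_tokens, 768)
```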

DD: I'd agree. I think language models were definitely one of the key defining things of NLP in 2018, essentially because you take a set of embeddings, trained in an unsupervised way, and get across-the-board improvements for many NLP tasks: entity extraction, machine comprehension, coreference resolution. It's just amazing to see the value across the board. There were a lot of really interesting, kind of, benchmarks that were shattered with the introduction of this. The cool thing was also that, because it was, you know, Allen AI and Google producing these technologies, they made the code very much available for people to use. So that was really cool, too, where you had a code base available for you to basically take, iterate on, and do more research with. I think that also spurred a lot of really interesting papers as a result.

BG: The other big, sort of, intriguing work-- a couple of works that came out this year related to embeddings were not about embedding methods themselves, but what you could do with them. There was a really fascinating paper that looked at taking-- I'll step back and say a big, sort of, practical application of NLP in production is machine translation, going from one language to another. We could see this with the improvements in Google's translation service. They've used a lot of, sort of, deep learning techniques to really improve this across a large number of languages.

But the way these typically are trained is that you need parallel corpora. You need all the words in English, and all the exact same sentences in German, and they're lined up, and you kind of go from one sequence to the next. But some people found that you don't actually need to have these parallel corpora anymore. You don't have to have the exact translations. You can do this in embedding space, where you just embed one language, you embed another language, and then you sort of figure out how to combine those two embedding spaces so that they overlap. And then suddenly you essentially have two methods, or two approaches, that get you translation performance without even needing these parallel corpora.

Then this was extended further by saying, well, we can go text to text, what about speech to text? And so there was work that was done, and this was highlighted at NeurIPS this year, that said, let's take speech in, say, English and embed that, and let's take text in English and embed that, and let's combine those two things in the same manifold space. And so now we have essentially unsupervised text-to-speech and speech-to-text models that don't need the parallel corpora anymore. And this allows training to happen at a much bigger scale because you don't have to invest in the parallel data.
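(The core of that alignment step is small enough to sketch. Below is a toy Procrustes alignment in NumPy; the fully unsupervised systems, such as Facebook's MUSE work, bootstrap the word pairing adversarially and then refine with essentially this step. The vectors here are random stand-ins, not real word embeddings.)

```python
# A toy sketch of aligning two embedding spaces with an orthogonal (Procrustes)
# mapping, given a small seed dictionary of paired vectors.
import numpy as np

rng = np.random.default_rng(0)
dim, n_pairs = 300, 5000

X = rng.standard_normal((n_pairs, dim))            # e.g. English word vectors
true_rotation = np.linalg.qr(rng.standard_normal((dim, dim)))[0]
Y = X @ true_rotation                               # e.g. the same words' German vectors

# Procrustes: W = argmin ||XW - Y||_F with W orthogonal, solved via SVD.
U, _, Vt = np.linalg.svd(X.T @ Y)
W = U @ Vt

# After alignment, nearest neighbours across the two spaces act as translations.
print(np.allclose(X @ W, Y, atol=1e-6))             # True for this noiseless toy case
```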

RM: Let's talk a little bit about some of the stuff that we've done here this year, right? So for those of you who aren't familiar with Talla, we do a lot of work in the, sort of, automation space around customer support: automating things directly for customers, to find answers to common questions that they have or perform certain tasks. We do a similar thing for reps. We have a rep-facing product, and it's supported by a knowledge base. A lot of our NLP technology is deployed in the knowledge base to, sort of, make sense of the content that you're creating. And so that's sort of our background with respect to how we deploy these things in products. But, you know, what are some of the, sort of, tougher problems that we had to work on this year, and some of the things that we had to solve to get these things to work in production systems?

BG: One thing we did, and I think this is important, touches on-- we talked about BERT and we talked about ELMo-- one of the things we work on here, and Dhairya's worked on this a lot and can talk a bit more about it, is machine reading comprehension: how do we ask a question of, sort of, an unstructured paragraph of text and pull the answer out of that text? And so we've worked a lot on, sort of, operationalizing some of these models. And there have been a couple of challenges in dealing with this that I think are lost in the hype when you hear things like, oh, BERT makes machine comprehension models the best they can be, or so much better than the previous ones.

We have to make these work in reality, in a production environment, and we have to deal with user input that is maybe not the cleanest or the safest for the task. So for instance, in the machine comprehension approach, one thing we had to deal with is that a lot of these models don't actually know how to say "I don't know." So a big important factor for the stuff we're actually working on is that we need a way to understand whether the model can actually answer your question, in an environment where we don't control what's being asked of it. And we need it to, sort of, fail gracefully if it doesn't work out. Most of the academic work on this topic assumes all the questions are fair, or that they're testing a certain thing.

DD: It was actually really interesting when we started this work, right? At the time, SQuAD 1.0 was the primary benchmark for machine comprehension, and, you know, we spent a lot of time taking a look at some of the models on the leaderboard and trying to see if we could implement them and productionize them. And that was a huge limitation of those models at that time, which was essentially that they assume there is always an answer in the passage text that you're trying to, you know, ask a question of.

Then around July-ish, SQuAD 2.0 came out, which was a direct answer to, kind of, the shortcomings of that data set. And so the idea was, can we, you know, make these questions much harder, so that the data set essentially includes examples of questions that can't be answered from the corresponding passage. And so we kind of got to see the evolution of, kind of, the models in that space. In terms of what we were working on, I think we spent a lot of time thinking about a couple of things.

One is, how do we have the model tell us when it doesn't know? And then two, we came up against a large number of challenges around, basically, how do we calibrate and codify, you know, what the model's uncertainty is? And so this is an issue in general with neural networks: you will get an uncertainty probability back, but it's within the context of the examples that you had and within the context of, you know, that particular example. So it's hard to calibrate a model against all examples and against all questions you could pose to it. And so as a result, you have a hard time of, kind of, you know, measuring across the board whether the model is actually confident or not.
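(The simplest version of the "say I don't know" machinery can be sketched in a few lines: softmax the best span's score against an explicit no-answer score and abstain below a tuned threshold. The scores below are made-up placeholders, and real calibration is considerably harder, which is exactly the point being made here.)

```python
# A sketch of abstaining when a reading-comprehension model isn't confident.
# In practice the threshold is tuned on held-out questions that include
# unanswerable ones (SQuAD 2.0 style), and raw softmax probabilities are often
# poorly calibrated, which is the harder problem discussed above.
from typing import Dict, Tuple

import numpy as np


def answer_or_abstain(span_scores: Dict[str, float],
                      no_answer_score: float,
                      threshold: float = 0.5) -> Tuple[str, float]:
    """Return (answer text, confidence), abstaining if confidence is too low."""
    best_span, best_score = max(span_scores.items(), key=lambda kv: kv[1])
    scores = np.array([best_score, no_answer_score])
    probs = np.exp(scores - scores.max())
    probs /= probs.sum()
    confidence = float(probs[0])
    if confidence < threshold:
        return "I don't know.", confidence
    return best_span, confidence


# Hypothetical model outputs for one user question.
print(answer_or_abstain({"in the knowledge base": 4.1, "in 2018": 1.2},
                        no_answer_score=5.0))
```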

We've done a lot of work, kind of, thinking about how to solve that within the constraints of the model. And then, you know, in the past couple of months there have actually been a lot of really cool things that have come out on the SQuAD 2.0 data set that we've been kind of excited to take a look at, to see if they answer some of the questions that we've tried to hard-wire an answer to.

The other interesting thing was that we found that BERT and ELMo are really fantastic at improving the performance of the model overall, but they don't scale well in the production sense. And so what ends up happening is that these are huge models with very long inference times, and so when it takes like 30 or 40 seconds to answer a question, we can't deploy that in a production setting and be like, "Hey, I'm sorry, you're going to have to wait another minute before I give you back an answer." And so we had some really fun challenges around, kind of, optimizing the performance of these models so that we could actually make them production ready as well.
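(A throwaway sketch of the kind of latency check that decides whether a model like that can ship: measure the tail latency of the end-to-end call and compare it against a product budget. The `model_answer` function and the budget are stand-ins, not Talla's actual harness.)

```python
# Time an end-to-end question-answering call and compare its tail latency
# against a product latency budget.
import statistics
import time


def model_answer(question: str) -> str:
    time.sleep(0.05)  # pretend inference; a large BERT model could take seconds
    return "stub answer"


def p95_latency_ms(fn, question: str, runs: int = 50) -> float:
    samples = []
    for _ in range(runs):
        start = time.perf_counter()
        fn(question)
        samples.append((time.perf_counter() - start) * 1000)
    return statistics.quantiles(samples, n=20)[18]  # ~95th percentile


budget_ms = 1500  # e.g. the product needs an answer well under two seconds
latency = p95_latency_ms(model_answer, "How do I reset my password?")
print(f"p95 latency: {latency:.0f} ms, within budget: {latency <= budget_ms}")
```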

BG: To build on that, some of these things are just trade-offs. While you hear about some of these models getting benchmark improvements, a big question we have to ask in all of our evaluation of, sort of, the academic literature is: is that improvement over the benchmark worth it? It may be an improvement, but what is the operational cost, both in terms of the resources required to run it and the time it takes to run the model? All those things are equally important to us. And what is the relative trade-off for that gain?

If your model is, like, super complex and gets you, you know, one point of F1 on the leaderboard, yeah, you can get a paper published on that, but we can't deploy that, because it's going to throw up a bunch of problems across our operational infrastructure. And so, you know, a big thing we have to do is, sort of, balance some of the cool, exciting research that comes out against what it takes for a company like us to actually deploy it, when we're not a Facebook or a Google that can just throw all the resources at it.

RM: Let's talk about a couple of things that happened this year, and give me your 15 to 30 second opinion on each of them, right? Were they good, bad, or misunderstood by the public? Let's start with Gary Marcus. Very much in the news this year, very much a deep learning naysayer. Probably not as much of a naysayer as he's portrayed to be, but, you know, he's been evangelizing about needing more than deep learning. Is this a good thing or a bad thing for AI? What do you guys think?

BG: I'll say I actually like the role that Gary Marcus plays. I think it's good to have his perspective. I do think he and, for instance, Yann LeCun get into some pretty protracted Twitter arguments, which honestly get a little tiresome if you're following them. I think it's important to have someone constantly, sort of, hammering on, like, we are not at AGI yet, we're not going there. I think, though, that they are arguing in two different directions. Like, I feel like there's a lot of cross debate going on here. He is arguing for general AI, general, sort of, artificial intelligence techniques, and a lot of the current, sort of, deep-learning-focused people tend to be about solving very pointed problems and, sort of, moving bottom up, whereas I think he's looking at this more holistically. And they often clash because they're talking about, sort of, two different things.

RM: Apple. Are they catching up in AI, or are they falling further behind?

DD: It's interesting, right? I think, like, Siri was what kind of kicked off this entire dialogue, digital assistant kind of revolution, and they've lagged significantly since. And it's been fascinating. I mean, to be fair, they are a laptop manufacturer and a device manufacturer, so it's not something that's in their wheelhouse. But I think they've quickly found that if they want to compete with the directions that Google and Facebook and all these big companies are going in, they have to be more involved in the AI space. I think they've been catching up, been a little slow to it, but, you know, they're on their way. But they're definitely behind, for sure, from just what Google Home can do, for example, or what Amazon's Alexa can do.

LO: Even if you just look at iTunes, for instance, the search is so bad. You would expect that they would have a better query understanding framework.

DD: I mean, that's also true about Alexa's search, right? I mean, there's a fun flame war we had about this when I was over at Allen AI as well.

RM: That's so interesting to me to hear, because, I mean, you make a good point. A lot of people are concerned about start-ups and their ability to compete in AI, but again, you look at Apple and the lead they had with Siri. Like, they should have built Alexa, right? They should have built Google Home. But it's hard to understand these opportunities and see them coming. And, you know, if you really look at the history of Alexa, I mean, people thought it was stupid when it came out, right? People didn't think it would improve as fast as it has. GDPR. Good or bad for AI?

DD: I have a controversial opinion about this: I think it's good. It's going to force us to have conversations about data transparency and also just about what's happening at the level of these models. And I think as more and more people understand, kind of, how the sausage is made, there are going to be hard questions asked about that process, too. And so, yes, it'll probably stunt innovation in some places, but I think the overall conversation we'll have about the impact of data transparency and privacy and all these things on social policy is very much a positive thing.

BG: I would say I agree. I think that as ML and AI technologies are getting more and more pervasive, as we are starting to realize just how much personal data lives out there that we actually have no control over, having at least some accountability baked in, I think, is important. The danger of these, sort of, regulatory exercises, if you will, is that they're not always done with, you know, everyone's voice clearly represented. If it's a heavy-handed regulation, in some regards it's going to solve a lot of problems, but it's going to introduce other problems. Such as, what, you know, constitutes personal data? If I train a model and I have a bunch of weights in a neural network, is your data in there? I'd say no, it's derivative, right? It's not your data anymore. But, you know, is that true? Like, what do you do if I want to handle...

DD: Well, what if it's part of a generative model, in which case your data arguably is there? Then it's pretty reasonable, and you're like, uh, this is kind of awkward.

RM: Well, that leads into the next one, which is explainable AI. It's been a big topic, you know, that these things are too much of a black box. Did we make any progress in 2018? Do you think we're going to make progress in 2019? Where does that stand?

BG: It's still really an open area of debate, and I think people haven't quite codified what it even means to be explainable. I've even seen, just recently, attempts to-- you know, in true academic fashion, let's add more words and more definitions-- distinguish what is explainable versus interpretable versus transparent, and treat those as three separate ideas. So, does it need to be explainable in the sense that an end user who has just seen the output of an AI algorithm can say, OK, I need you to show me exactly how you came up with that result? Or do you just need to know how it got the result, in the sense that we can say, well, here's all the training data that we used, here's everything about how we made the model? That doesn't mean I can tell you how the model actually got your answer, but I can tell you how I trained the model, which is different than having the model tell you exactly why it made a certain choice.

LO: I think as more decisions are being made that affect people's lives, how do you get away with not being able to explain what's going on, right? So for instance, say you get rejected for a loan because some AI algorithm decided that you're not creditworthy. You'd probably be peeved if you have no idea why, right?

BG: It's a big issue. One of the panel discussions, or one of the discussions I went to at NeurIPS, was a presentation from a woman who works in New York City in one of the offices of the mayor. And it was around, like, how do they deal with these algorithms? Because they have a lot of, sort of, rules around who you cannot discriminate against, right? And so that's important. How do you maintain those rules when you have algorithms? How do you know that those algorithms aren't discriminating unfairly? So it's a real thing, and policymakers are, sort of, now aware of this and are paying attention more and more, and we'll see more of this next year. And I know Ladi has some thoughts about this, too, about what it means, right, for policymakers to, sort of, get involved with this process.

RM: Good, so GANs in 2018, 2019. Are they a thing? Are they becoming a real thing? Are they still abstract and academic?

BG: They are very real. They are getting better and better. There was just recently a release from Nvidia on face generation, and the faces are high resolution and there's a large variety of them. I saw someone do a comparison of GAN faces from two or three years ago and this result today, and it is striking. It's gotten to the point where, if you kind of know what you're looking for and you're aware of how these things work, you can still find things that show it was a GAN-generated image. Before, there were really clear, bad image artifacts in the early work, things that looked like really bad JPEG artifacts, but now the resolution has gotten quite good. These are just getting better and higher resolution.

BigGAN was another thing that came out, which is actually a really fascinating high resolution GAN tool that people are using more in a creative, artistic sense. People are just spending their time exploring the latent space and getting really weird things. There's a tool someone made called Ganbreeder where you can, sort of, take two images and it will figure out some combination of the two. Like, you want to see, you know, what is a shark and a bus? You can get shark bus. I don't know. It'll try to find shark bus for you, and it's quite fascinating now that the resolution of these things is getting really high.
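(The latent space play behind a tool like Ganbreeder boils down to interpolating between two latent vectors and decoding each point. Here's a toy PyTorch sketch with a tiny untrained generator as a stand-in; a real setup would load a pre-trained model such as BigGAN instead.)

```python
# Walk a line between two latent vectors and decode each point with a generator.
import torch
import torch.nn as nn

latent_dim = 128

generator = nn.Sequential(          # untrained stand-in for a real GAN generator
    nn.Linear(latent_dim, 256),
    nn.ReLU(),
    nn.Linear(256, 3 * 64 * 64),
    nn.Tanh(),
)

z_shark = torch.randn(latent_dim)   # latent code for "shark"
z_bus = torch.randn(latent_dim)     # latent code for "bus"

with torch.no_grad():
    for alpha in [i / 4 for i in range(5)]:
        z = (1 - alpha) * z_shark + alpha * z_bus   # blend the two latents
        image = generator(z).reshape(3, 64, 64)      # "shark bus" around alpha=0.5
        print(f"alpha={alpha:.2f}, image tensor shape={tuple(image.shape)}")
```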

It's also quite frightening, in a way, and concerning, because one of the things I'm actually worried about and thinking about for next year is just the potential for GANs and for the generative art, generative image, and generative audio stuff to make things like fake news much, much more difficult to deal with. Like, we might even see, you know, as early as next year or 2020, cases where people legitimately use the, you know, "it wasn't me, it was AI" excuse.

RM: The web is going to come full circle back to like sources of authority like they did in like 1985.

BG: That's right. It's going to be a case where, you know, some people were even saying we're going to be in a world where media is no longer useful as a method of information transmission. It's only going to be good for entertainment, because no one's going to be able to understand what is real and what isn't.

RM: Let's talk a little bit about next year, 2019, and some of the predictions you guys have. I know, Byron, a lot of your expertise is in reinforcement learning. Where does that stand? Is it still a pipe dream, or are we starting to see it, you know, have real applications?

BG: RL was a really kind of hot topic last year-- this year. It was picking up in 2017, and it really, I think, captured a lot of people's attention in 2018. A lot of work at NeurIPS was on reinforcement learning, where just more and more people are getting into the field of, sort of, how do we train agents, and the various many ways you can do this, some of which are what I would call RL, some not quite RL. But essentially, how do we do this sort of reinforcement, or agent-based, learning for decision-making.

There was a discussion I had at one of the, sort of, lunch groups there, and there was some assertion that it's still not really a thing that's going to solve most real world challenges. It's just not possible yet, because a lot of these methods require a lot of very specific assumptions to be true to be able to train them and to get them to succeed at what they do.

We've made tremendous progress at playing certain kinds of very difficult strategy games, like Go. You know, this year, just recently, they solved Montezuma's Revenge, the game, not the malady. The game is one of the, sort of, hardest Atari games in the, sort of, Atari benchmark for RL, because it is very exploration heavy. It gives almost no reward to the agent, and it requires you to do lots of exploration. Most RL algorithms failed utterly at this. Then the team at Uber research figured out some way to, sort of, induce this exploration. We're seeing a lot more work on exploration based methods.
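(A compact sketch of one flavour of exploration-driven RL: tabular Q-learning on a toy "chain" environment whose only reward sits at the far end, with a count-based bonus that pays the agent for visiting rarely seen states. The actual Montezuma's Revenge work, such as Uber's Go-Explore, is far more elaborate; this only illustrates the "reward novelty" idea.)

```python
# Tabular Q-learning with a count-based exploration bonus on a sparse-reward chain.
import numpy as np

n_states, n_actions = 20, 2          # actions: 0 = step left, 1 = step right
q = np.zeros((n_states, n_actions))
visit_counts = np.zeros(n_states)
alpha, gamma, bonus_scale = 0.1, 0.99, 0.5
rng = np.random.default_rng(0)

for episode in range(500):
    state = 0
    for _ in range(50):
        # Mostly greedy, with a little random exploration on top of the bonus.
        action = int(np.argmax(q[state])) if rng.random() > 0.1 else int(rng.integers(n_actions))
        next_state = min(state + 1, n_states - 1) if action == 1 else max(state - 1, 0)
        visit_counts[next_state] += 1
        extrinsic = 1.0 if next_state == n_states - 1 else 0.0          # sparse reward at the end
        intrinsic = bonus_scale / np.sqrt(visit_counts[next_state])     # novelty bonus
        reward = extrinsic + intrinsic
        q[state, action] += alpha * (reward + gamma * q[next_state].max() - q[state, action])
        state = next_state

print("Goal state visits:", int(visit_counts[-1]))
print("Greedy action at the start state (1 = step right):", int(np.argmax(q[0])))
```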

I still think it's going to continue; it's not stopping. It's just getting bigger and bigger. But it's still very much academic, still a lot of toy problems. There are a few real world cases coming out from some of the self-driving companies. You know, we're looking at things like traffic pattern routing and things like that. But I would say most of it is still academic.

DD: I was going to say, I mean, one of the interesting things is like, you know, the agent that plays Montezuma's Revenge can't be used to play like more of a side scroller game that isn't exploration based, right?

BG: It only solves that game.

DD: That agent would be, like, super suicidal in the other game. And so I think that's one of the limitations of RL: agents are very much domain specific and very confined to specific use cases. On the language side, though, it's really cool that Facebook launched their deep learning for-- what was it-- reinforcement learning for NLP framework. I think that's going to be kind of fun, to see what kind of applications and research emerge out of that.

BG: I think there's actually a good opportunity there for smaller companies, or a more diverse set of companies, in things like RL for dialogue, for chatbot systems and dialogue systems. Most of the academic work here seems to be focused around a handful of crusty old dialogue data sets for, like, booking a hotel or a flight or ordering from a restaurant. You know, very, very narrow, defined, simple data sets. But that's pretty useless in most real world settings. And so, how do we break out of these, sort of, kind of, interesting but not very practical benchmarks? I think we're going to see more of that coming up next year.

RM: That's a good topic since, you know, all three of you guys are sort of over-educated. Talk a little bit about the, you know-- there's a big gap between all the news headlines we read about the cool stuff being done academically and what's ready to be applied in production level systems, for a whole host of reasons, right? Performance reasons, the messiness of real data, you know, stuff like that. Is that gap getting better? What are the things that might move into production systems next year, or what are you most excited about from that?

DD: Well, I think it starts with the toolkits. This year we've had a lot of really interesting toolkits released, right? So on the NLP side you had AllenNLP, and Facebook recently released PyText. There is also fast.ai, which is, you know, Jeremy Howard's deep learning class that he's now made into, basically, this very performant, you know, deep learning framework, which also has great pedagogical value. And so I think it's really cool that you basically have these fantastic tools that have been released that are making it more accessible for both experimentation and, also, just industrial use. These sit on top of PyTorch and, you know, TensorFlow. I think that, in this sense, is great. Obviously there's still a large gap between research code and production ready code. But I think you're seeing, as we move more and more into interesting research applications with industry value, that, you know, people are releasing their code bases, they're releasing their findings and results along with their code, and there's definitely a germination and, kind of, a spread of these tools and technologies.

BG: I kind of mentioned this earlier, and I agree, absolutely, with Dhairya about the openness of this community, the contribution of the papers that are accessible, the toolkits that are accessible. Everything is made available via, you know, either one of the big tech companies', sort of, R&D and AI research groups, or there's a bajillion GitHub repos of people implementing all kinds of stuff. So it's all out there. There is, though, the challenge of figuring out whether your problem fits, you know-- I'd still say a lot of these things aren't really geared toward small data problems. They're definitely big data problems. They're big resource problems to train, you know.

If you're not in that camp, so if you're not at Facebook or Google or another big sort of company with a lot of resources, like a hedge fund or something, you know, it can be difficult to really make these things work or to find a good fit, because you may just not have enough data. And again, as we mentioned with the SQuAD data set, a lot of these things don't deal with, you know-- I don't know what you want to call it-- user-generated data. Not clean, curated data, but stuff where it's unclear what it is; maybe there's no answer, maybe there's not a good fit there.

There are still definitely plenty of jobs out there for all the data scientists. You still need to clean data, move data around, and approximate and understand how we can do this stuff, because, whatever the big tech companies' PR firms or the big research institutions would like you to believe, we have a long way to go.

DD: That being said, I think there is an interesting divergence between the problems that can be solved, made available to the public, and provide value, versus what's more complicated. For example, image classification. It's gotten to the point where it's almost a commodity, right? You have farmers using it, basically creating their own little TensorFlow models to predict crop yield, to take a look at pictures of their farms and find out what areas are damaged or what areas are overwatered, or something like that.

You've got the medical space, where people are looking for, you know, novel things, like anomaly detection, or cancer detection, and things like that. And obviously there are challenges of making sure, depending on the type of problem you're tackling and its implications, how confident you are in those models. But the barrier to entry to, like, you know, do image classification has almost evaporated in the past year, right?
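(To make the "almost a commodity" point concrete, here's a hedged sketch of off-the-shelf image classification with a pre-trained ResNet from torchvision. It assumes a recent torchvision release, and the random tensor below stands in for a real, preprocessed photo, which is where the actual resizing and normalization work would happen.)

```python
# Classify an image with an ImageNet-pre-trained ResNet in a handful of lines.
import torch
from torchvision.models import ResNet18_Weights, resnet18

model = resnet18(weights=ResNet18_Weights.DEFAULT)  # downloads pre-trained weights
model.eval()

fake_image_batch = torch.rand(1, 3, 224, 224)  # stand-in for a preprocessed photo

with torch.no_grad():
    probs = torch.softmax(model(fake_image_batch), dim=1)

top_prob, top_class = probs.max(dim=1)
print(f"Predicted ImageNet class index {top_class.item()} "
      f"with probability {top_prob.item():.3f}")
```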

On the flip side, you have more complex problems in the NLP space, or, you know, more complex architectures, which are definitely getting harder and harder to do. And so when you have complex problems that you're trying to shoehorn into a deep learning model, or into some sort of representation, that's where I think the more interesting and creative work is going to start emerging. And I think there's definitely a gap between the research there and the value in production.

RM: Looking forward to 2019, give me one thing that you're really excited about, where you think, oh, this is going to be a breakthrough year for this, or something where you feel like, hey, nobody's working on this and somebody out there should be working on this, right? An actual gap in where things are going.

LO: Just sort of following the news and seeing how Congress is grilling Facebook and Google about things like Russian bots, I think that in 2019 that's going to force more people to pay attention to what's going on and to try to hold policymakers and politicians more accountable for appropriate regulation. Whether they'll have the answers to that next year is very unlikely, but I think we'll start seeing a more aggressive push.

BG: One thing I'm actually pretty excited about in general, and I mentioned edtech, is that I actually think the education space here is moving a little bit faster, trying to keep up, than maybe it did in the past during previous revolutions in, sort of, technology and society. You know, you didn't have a Bachelor's of Web Development at Carnegie Mellon in, like, the early 2000s. We didn't have a school of web development, or whatever, coming out of MIT, but we have that for AI. Even though the web was incredibly transformational for society, it was still just seen as, like, oh, this is just an applied technology thing, and it kind of just sat there, and then this whole world of software development, web development I should say, kind of grew up around it.

It looks like AI is being taken a little bit more seriously by, sort of, people in the educational space. We have MIT, sort of, investing in a whole new college in this area. We had Carnegie Mellon create a bachelor's degree in AI. And people would argue, why-- this is just applied computer science, why does this really need its own thing? I don't think that's true. I think it's incredibly interdisciplinary. I think it needs a lot of people. It's not just computer scientists and people who can write programs; there are a number of things that have to be brought to bear here. You need people who understand policy, you need people who understand human interaction, you need people who understand, sort of, physiology, sociology, all kinds of fields that can interact with and benefit from AI methods, and they need to be aware of and trained in these things.

We're seeing this, you know, at the undergraduate level, and it's already being pushed and explored at the K to 12 level. People are trying to think about how we teach kids, as early as kindergarten up through high school, about AI, right? We need to get them aware as early as we can of what these methods are, what they do, what they mean, and how they work at a broad level. We don't have to teach anybody the chain rule and backpropagation. We don't need to talk about log likelihoods. What we do need to do is tell them, OK, what is a neural network? What is a machine learning model? What is a statistical model? What are they doing at a high level? So they have awareness of what these things are.

Like I mentioned, beyond the graduate and undergraduate level, there's this AI for K-12 group through the AAAI and ACM organizations looking at how we get these things to all levels of society so that people are better educated.

DD: For me, it's the debate between Gary Marcus and Yann LeCun, which is, you know, kind of the old symbolic, 80s, traditional style AI methods versus kind of where deep learning is going today. And really, I think what's interesting is that there's actually a lot of work at the intersection of both, right? They're not mutually exclusive. They're not either-or things.

For example, this past year the Allen Institute for Artificial Intelligence got a $125 million grant to look at common sense reasoning. One of the things they're really interested in is, how do we combine knowledge representation and large graphs with deep learning to do some really interesting things? With the prevalence of these language models, where you have BERT and ELMo capturing general characteristics and conceptions of what language looks like, an understanding of language, and putting that into models, you could do some really cool things to augment a lot of these deep learning approaches. I think for 2019, what's going to be really interesting is, can we start moving past the limitations of deep learning by augmenting it with common sense graphs, or, you know, common sense knowledge bases, or other information that the models otherwise need to relearn every time they do a task? If models can start from that baseline set of information, you're going to start seeing a lot more performance benefits, as well as progress on that overarching gap between where we are with AI today and moving toward general AI in the future.

RM: Very good feedback and answers, so thank you guys for being on. We'll probably do this once a quarter: just check in with the Talla data science team on the stuff we're working on and what we're thinking about, because, you know, we are kind of out there on the limits trying to do some new and interesting stuff. Thank you guys for listening. Hope you have happy holidays and a good new year. And if you have guests you'd like to see on the podcast or questions you'd like us to ask, send those emails to podcast@talla.com, and we'll see you next week.

Subscribe to AI at Work on iTunes or Google Play and share with your network! If you have feedback or questions, we'd love to hear from you at podcast@talla.com or tweet at us @tallainc.