Episode 38: Machine Learning at HubSpot with Vedant Misra
Host Rob May had a conversation with Vedant Misra, HubSpot's Machine Learning Technical Lead. Before HubSpot, he founded Kemvi, with a vision to build technology that understands what's happening in the world in order to help B2B companies achieve faster topline growth. They raised $1MM+ from VCs and angel investors and were acquired by HubSpot. Tune in for a conversation on all things machine learning.
Rob May, CEO and Co-Founder,
Vedant Misra, Machine Learning Lead,
Rob May: Hi, everyone, welcome to the latest episode of AI at Work. I'm Rob May, the co-founder and CEO at Talla. I'm your host today. Our guest is Vedant Misra. He's the technical lead in machine learning at HubSpot. Vedant, welcome to the show today.
Vedant Misra: Thank you for having me.
RM: You and I met because you were running an AI company. I became an investor. It ultimately got subsumed into HubSpot, which makes a lot of sense.
Why don't you give everybody the overview? First of all, I think most people know HubSpot. If they don't, why don't you explain what HubSpot does? Then talk a little bit about what Kemvi did and why HubSpot was interested.
VM: HubSpot is a roughly half-a-billion-dollar-a-year company that makes software for small and medium businesses that, kind of, combines the functions of WordPress and Google Analytics and Salesforce and a bunch of other tools that businesses need to sell. Originally the product started out focused on SEO, back when getting people's attention on the internet through blogging was a very, sort of, popular phenomenon and people were using search to get information. Now it's evolved to include a variety of products, including CRM, which has a free version, and the Marketing Hub, and also Service Hub, as well, at multiple price points.
The context for all of this is that-- the way that I think about HubSpot's data-- is that there's this rich interaction history between buyers and consumers. At the end you have this fantastic label for whether or not each person you were interacting with decided to buy or not buy your offering. That's really the question, who's going to buy what from whom. That is often implemented in companies as a lead scoring model, which is about figuring out which accounts to target right now. That was one of the things that we were working on at Kemvi.
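To make the lead scoring idea concrete, here is a minimal sketch, not Kemvi's or HubSpot's actual model. It assumes each lead is described by two made-up interaction features (emails opened, pages viewed) and a bought / did-not-buy label, and fits a plain logistic regression with stochastic gradient descent. The function names, features, and data are all hypothetical.

```python
import math


def sigmoid(z):
    """Squash a raw score into a probability between 0 and 1."""
    return 1.0 / (1.0 + math.exp(-z))


def score(weights, bias, x):
    """Estimated probability that a lead with features x will buy."""
    return sigmoid(bias + sum(w * xi for w, xi in zip(weights, x)))


def train_lead_scorer(rows, labels, lr=0.1, epochs=500):
    """Fit logistic regression weights with per-example gradient steps."""
    weights = [0.0] * len(rows[0])
    bias = 0.0
    for _ in range(epochs):
        for x, y in zip(rows, labels):
            err = score(weights, bias, x) - y  # gradient of log loss w.r.t. z
            bias -= lr * err
            weights = [w - lr * err * xi for w, xi in zip(weights, x)]
    return weights, bias


# Hypothetical interaction history: (emails opened, pages viewed) per lead.
history = [(0, 1), (1, 0), (1, 2), (4, 3), (5, 5), (6, 4)]
bought = [0, 0, 0, 1, 1, 1]  # the "label" at the end of each interaction

weights, bias = train_lead_scorer(history, bought)
hot = score(weights, bias, (6, 4))
cold = score(weights, bias, (0, 1))
print(f"engaged lead: {hot:.2f}, unengaged lead: {cold:.2f}")
```

Sorting accounts by this score is one simple way to decide which ones to target first; a production system would use far richer features and proper evaluation.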
I studied physics and math, and then I spent some time working at a think tank where we were applying methods from physics to social problems. We were processing a lot of data, web-scale texts, to try to figure out if we could learn events from things that are being published in news articles. I got excited about trying to help companies make better decisions with text and with information in text.
The product that we were working on when we got acquired was-- there were multiple components to it, including integrations to Gmail and Outlook and the SaaS app. The core of it, the most exciting part of it, was that if you gave me a CSV of people's first names, last names, and email addresses, domain reconciled, we could produce strings customized to those people of four to six words that you could then use in your marketing material.
If you were sending out an email campaign, you'd be able to customize the template tags to each person in that email campaign, both in the subject line and in the body, and then that would propagate down through your funnel in terms of higher response rates, higher open rates, et cetera. That was what we were thinking about at the time when we started working on this deep learning technology. Now at HubSpot I run the NLP team, which is focused on applying research in deep learning, not just NLP, but across deep learning to the product. We focus on trying to find novel applications of ML to prototype.
RM: Give us an example of, maybe, something that you've pushed to production that people are using now that came out of your team.
VM: We launched Sales Hub not too long ago, and it's experienced quite a lot of adoption. And there are lots of sales managers and reps using it to make calls. We've been recording those calls, but we haven't been doing much with them. What we're doing now is that we're transcribing them, and we're extracting information from them to figure out a variety of things. What would be exciting to managers, as you can imagine, is transparency into what reps are talking about, whether calls are succeeding.
Even to figure out whether a call connected is an important thing to learn, being able to figure that out automatically from the conversation that's being had on the call, if it connected, what parts of the call had to do with what sorts of topics, when were they talking about the product versus other things. There's a whole space of features that we're exploring around, trying to figure out what people are doing on calls, whether calls are working, that whole domain. I mean, it's a pretty rich stream of information because the truth of every deal is in what the reps are saying.
There's really rich information not just in terms of what's being said, but also in terms of how things are being said. There are temporal features that we can use in terms of when people are talking, what kinds of things they're saying, whether there's intonation there that we can use, all of that.
RM: Talk a little bit about NLP in general for use cases like this, because a lot of people will say that, from a machine learning perspective, or any AI perspective, NLP is one of the most challenging things to do. It hasn't made the progress that machine vision has made, for example. You guys are coming up with use cases that actually have some real applications. So how would you characterize the state of applied NLP today?
VM: That's a great question. And it's apt that you compare NLP to progress in the world of perception with deep learning, specifically images and video, because I think this-- I mean, I'm just repeating the words of more thoughtful commentators-- but people have said that this is the ImageNet year for NLP. What they mean when they say that is that when deep neural networks started to make dramatic progress on the ImageNet Challenge, it was clear that the state of the art had advanced in a way that was going to make qualitatively new things possible.
I'd say the same thing happened just a couple of months ago with OpenAI GPT-2, which follows on from work they did last year, which follows on from work from Google. This is work that's definitely worth looking up because the applications are incredible. The idea is, basically, that you can take a model, train it on a huge amount of text without giving it any specific goal besides to predict the next most likely word that's coming up given a set of words in a context window, and you just feed in a ton of words off the internet.
Then, if you feed in the start of some text, let's say, something that looks like a Reuters news article or a blog post, it will roll out the rest of that content in a way that is grammatical, coherent, and maintains consistency throughout the body of the article about what it's talking about. It's really remarkable. These samples that are on OpenAI's website demonstrate that models can now achieve close to human-level performance, at least on specific research tasks. We never thought this was possible. Just a few years ago we were writing manual parsers, doing all kinds of hand-tuned stuff. And now you just feed it a bunch of words, and this model can produce human-quality text. I think a lot's going to be possible in the next year or so.
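The train-then-roll-out loop described here can be illustrated with a toy stand-in. The sketch below is not GPT-2: it assumes a context window of a single word and uses simple follow-counts instead of a neural network, but the objective (predict the next word) and the generation procedure (sample a word, append it, repeat) have the same shape. All names and the tiny corpus are made up for illustration.

```python
import random
from collections import Counter, defaultdict


def train_bigram(text):
    """Count, for each word, which words follow it in the training text."""
    counts = defaultdict(Counter)
    words = text.split()
    for prev, nxt in zip(words, words[1:]):
        counts[prev][nxt] += 1
    return counts


def next_word_probs(counts, prev):
    """Turn raw follow-counts into a probability distribution."""
    total = sum(counts[prev].values())
    return {w: c / total for w, c in counts[prev].items()}


def roll_out(counts, start, length, seed=0):
    """Generate text by repeatedly sampling the next word."""
    rng = random.Random(seed)
    out = [start]
    for _ in range(length):
        followers = counts.get(out[-1])
        if not followers:
            break  # no known continuation for this word
        words, weights = zip(*followers.items())
        out.append(rng.choices(words, weights=weights)[0])
    return " ".join(out)


corpus = "the cat sat on the mat and the cat ran"
model = train_bigram(corpus)
print(next_word_probs(model, "the"))  # "cat" is twice as likely as "mat"
print(roll_out(model, "the", 5))
```

A one-word context produces text that drifts quickly; the large models discussed here condition on a much longer window, which is what lets them stay coherent across a whole article.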
RM: Do you think that's going to work well primarily just for, sort of, colloquial usage? If I have a specific business vernacular, is that still going to be able to work for me? Or am I going to have to train additional models for that, or what am I going to do?
VM: The process that has worked very well historically is to take a large pre-trained model that was trained on a large data set that isn't proprietary, that's public, and then fine-tune it on a private, or even proprietary, corpus. When you do that, the model picks up the flavor of your corpus.
It learns the gestalt of your corpus. It produces transition probabilities between words that are reflective of the way people talk in that fine-tuning corpus. What happened with GPT-2 is that they didn't release the pre-trained checkpoint. It's the first time this has happened that a major research group has announced they've made huge progress on a task like this and deliberately decided not to release the weights. In this case, it won't be possible to just take that large pre-trained model and customize it to your own data. But there are already many models out there, trained on large corpora, that you can fine-tune on your own data set.
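As a rough illustration of how fine-tuning shifts transition probabilities, here is a count-based analogy. It is an assumption-laden sketch, not how neural fine-tuning actually works (that updates weights by gradient descent): a "pre-trained" table of word-to-word counts is built from public text, then extra, more heavily weighted counts from a small private corpus are added on top. The corpora, the weight of 3.0, and the function names are hypothetical.

```python
import copy
from collections import Counter, defaultdict


def count_bigrams(text, counts=None, weight=1.0):
    """Add weighted word-to-next-word counts from text into counts."""
    if counts is None:
        counts = defaultdict(Counter)
    words = text.lower().split()
    for prev, nxt in zip(words, words[1:]):
        counts[prev][nxt] += weight
    return counts


def fine_tune(pretrained, private_text, weight=3.0):
    """Copy the pre-trained counts, then layer the private corpus on top."""
    tuned = copy.deepcopy(pretrained)
    return count_bigrams(private_text, counts=tuned, weight=weight)


def most_likely_next(counts, prev):
    """Most probable next word after prev, or None if prev was never seen."""
    followers = counts[prev]
    return max(followers, key=followers.get) if followers else None


public_text = (
    "our team will call you . our team will email you . "
    "our team will call back"
)
private_text = "our team will demo the product . our team will demo the roadmap"

pretrained = count_bigrams(public_text)
finetuned = fine_tune(pretrained, private_text)

before = most_likely_next(pretrained, "will")
after = most_likely_next(finetuned, "will")
print(before, "->", after)
```

After the private corpus is layered on, the most likely word following "will" moves from the public corpus's habit to the private one's, which is the toy version of a model picking up the flavor of your corpus.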
That exists; and they can learn your own vernacular, your jargon. These models can learn to communicate like specific groups of people or to produce specific kinds of copy. They're quite versatile.
RM: This is very research-driven at the moment. Is this one year? Is this five years? How far are we from being able to apply stuff like this? Can even your average software company afford to run this model and scale it in production for their customers yet? Or is that still a ways off?
VM: That's a good question. I think for most companies it's still a ways off, both in terms of having the volition to spend that much capital on compute, collect the data, and actually make it happen, and, also, in terms of access to people who can productize this kind of stuff in a useful way, which is something that's worth getting into, I think, because a model that just writes blog posts for people is not something that anyone wants, I would say.
The optimal product here, I think, more carefully considers the human-computer interaction challenges around the fact that, for example, people want to be creative. They want to produce thoughtful, unique, original, meaningful content that their readers will get value out of, rather than just pushing a button and producing drivel, which is, you know, an end-state that we could converge to if we're not careful. So yeah, there's definitely a bunch of product challenges to think about there around these things.
To zoom back out and answer your question, I think there are already features like this in products that we use and love every day, like Gmail. Gmail's Smart Compose is mind-blowing, and it kind of just snuck up on us. It's actually mind-blowing because it, to a remarkable degree, knows what I'm going to say. And sometimes it makes me more polite. I find myself accepting its suggestions quite often. This is in production now. Hundreds of millions, if not billions, of people are using it. It's gotten better very fast.
RM: That's true. Like, since the day it came out, which hasn't been that long ago, it is already way better.
VM: There's some lead time between when technology becomes available to Google and when it becomes accessible to other companies. That process happens through both lots of research and, also, APIs and ecosystems changing and making it possible for startups to get access to this kind of tech through integrations.
I think all that's just going to accelerate. I think products like these are going to become far more common just because of the huge time savings. But, we're encountering new kinds of issues now, in that these models, they might be able to roll out a bunch of text, but there's no guarantee that the things that they're saying are true. And obviously, text that isn't true is a major issue in global geopolitical discourse and will be an issue if you aren't checking these concerns, also, in business communication. World modeling now becomes a concern. How do we ground these models in what's happening in the world so that they can actually roll out text in a way that's consistent with reality instead of just looking at transition probabilities for words?
RM: Really interesting stuff. Just a side note-- do you think that's going to happen using deep learning? Or do you think that people are going to have to move-- there's still some companies working on these, sort of, cognitive architecture models where you have different pieces, some that have pre-built assumptions into them. Like, do you have any particular point of view on where that might come from?
VM: It's a great question. I think there's a symbiosis coming in the near future between the, kind of, historic two worlds of artificial intelligence-- the connectionist world, which is all of deep neural networks, and the symbolic world, which is trying to do things in a more top-down controlled way. We're seeing themes from both being applied to cutting edge RL research all the time. And I imagine that we'll increasingly see human knowledge being put into systems in explicit procedural ways and beyond, and not just in the form of training data. Where you choose to do one or the other is an important question, of course, because there are many problems that are amenable to learning as soon as you can get a few labels. The number that you need is becoming smaller and smaller, thanks to big pre-trained models and the fact that you can get access to tons of data pretty easily.
RM: When you think about the human-in-the-loop stuff, I think a lot of people don't realize that if you have a human workflow with a machine-in-the-loop, that's also, many times, a human-in-the-loop, where the human activity and interaction with that model is training that model, providing new labels, providing new data sets, expanding the data set.
It's really interesting where all that's going. Let's shift gears a little bit and talk about-- so you've done some traditional engineering work and some machine learning data science type of work. You know, when you look inside HubSpot, how do those teams work together? Do you have, like, one machine learning team, or are you embedded in product groups? How do you guys work with the engineering teams there?
VM: The product org at HubSpot believes in having lots of small teams that each have autonomy in their mission. And so the ML group consists of multiple teams that are responsible for their own components. There's the infrastructure team, there's a data team, there's a models team, there's a research team, the NLP team. And they're all sort of different domains of the problem that they own.
The idea behind infrastructure is to build the tooling that makes it easy for modelers to quickly deploy models and put them into production. There's a team that focuses on featurizing the data inside HubSpot systems and making it useful, and then also serving that data up in a fresh way at inference time-- these sorts of concerns. Then there are teams that are focused on actually training the V1 models, improving on the entire workflow, and those sorts of things. And then, so this entire group, the way that we've interacted with the product org has evolved over time. And it's a work in progress, of course.
There are two modes of operation. One is where we collaborate tightly with our product team that owns some part of the product and kind of just embed a modeler there. They own the customer problem, and they figure out what the right metrics are to move. They kind of just do the whole thing end to end. The other approach that works at different times is to sort of take this SWAT team approach where we go in, go into someone else's codebase, wire up our stuff, and leave. And so it really depends on the kind of implementation and the product concerns that come up with each specific model.
RM: As you hire for a machine learning team at HubSpot, are you guys looking, still, primarily for generalists? For specific skill sets? Do you expect people to have a strong machine learning background? Do you take people that come from engineering and move over? What's that process like?
VM: We have quite a lot of experience on the team now-- people who were at Google, people who've built very large production infrastructure, a lot of people with deep modeling expertise. And so we're now looking for people who have that background already.
I mean, again, there's two kinds of roles that we're hiring for-- one is people who have that background already and are taking more senior positions in terms of figuring out where the product could go from an ML perspective, but also more junior people who we can train, sort of bring up to speed on the way ML works at HubSpot. It's really both. We're growing pretty quickly.
RM: Let's step out of HubSpot now and talk a little bit about the ecosystem more generally for AI in terms of business applications, things that are coming down the pipeline. What do you see that really, really excites you, maybe even something that HubSpot isn't working on?
And then, a related question, or substitute question if you don't have an answer for that, is what do you see that excites you? And where do you see, maybe, opportunities that people aren't going after that they should be.
You were an entrepreneur; now you're at HubSpot. If you were going to leave and go start another company today, what domains would you be looking at?
VM: When I think about the question through that lens, what I want to say is that it's smart to start a company if you have an unfair advantage. You can build an unfair advantage by collecting differentiated data because, as we know, models are generally open source, and there's not too much IP in being able to iterate on model architecture.
If you have differentiated data and you trust it and it's clean and it's from some source that, historically, people haven't been looking at, that becomes an extremely valuable asset.
If I were to think about how-- if entrepreneurs were listening to this, they should be thinking about whether they have access to some stream of information that's hard to get to or if they can make use of it in a way that other people can't, whether by getting differentiated labels or through some other method.
RM: Is there anything that nobody's done yet? Is there a problem that you wish somebody would solve that would help you at HubSpot? Like, it's not a problem you're working on, but you say, man, if we had this as an input, it would be a really good thing for us.
VM: I know that we've built a lot of stuff that takes quite a long time to build that I think other companies could benefit from, in that it's very easy for us to put models into production. At larger companies, including those that have been doing machine learning for much longer than us, many of whom have sort of, like, invented much of this stuff, the approach to deploying models is often very monolithic. It takes an entire team that will do the whole thing from end to end, including the product integration.
What we do instead is we have this infrastructure, this very, sort of, heavy-hitting infrastructure, into which we can deploy models in a very lightweight, agile way. Being able to iterate quickly, get feedback from customers, both face to face and in terms of their interactions with the product, all of this is only made possible by having the tooling to accelerate all of this stuff.
I think a lot of companies don't have it. This infrastructure was built by the infra team and people who were here before us, informed by best practices across the industry, including Google and others. We've got all this technology that I think other companies could benefit from. It's got baked into it thinking that I think other companies should be applying to machine learning problems around how to quickly iterate, get labels, build in a human-in-the-loop component, all of these things.
If they're done in a platform-oriented way, it makes it easy for modelers to quickly iterate and get stuff out. So I think that's certainly valuable. I don't think there is a third-party solution that's really nailed that without forcing you to buy into their platform.
If there were some open-source, platform-independent, useful, well-documented thing for doing this, that would be very compelling, where you could have-- I can keep listing off things that we're having to build. There are point solutions for all of these things, but no one's really built something that anyone can use that solves all these problems: monitoring metrics for features that are coming in to models at inference time, looking at the distributions of those and whether there's change, monitoring various SLAs on the actual models in terms of latency or memory usage or whatever. There are millions of things that come up when you're working with this stuff in production, and we've had to wire it all up together ourselves. So I think a third-party solution that did this very well would be pretty compelling.
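One of the checks mentioned here, watching whether the distribution of a feature at inference time has moved away from what the model saw in training, can be sketched in a few lines. This is a minimal illustration, not HubSpot's tooling: it uses the population stability index (PSI) as the drift measure, and the bin edges, sample values, and the 0.25 alert threshold (a commonly cited rule of thumb) are all assumptions.

```python
import bisect
import math


def bin_fractions(values, edges):
    """Fraction of values falling in each bin; out-of-range values are clamped."""
    n_bins = len(edges) - 1
    counts = [0] * n_bins
    for v in values:
        idx = bisect.bisect_right(edges, v) - 1
        idx = min(max(idx, 0), n_bins - 1)
        counts[idx] += 1
    return [c / len(values) for c in counts]


def psi(expected, actual, eps=1e-6):
    """Population stability index between two binned distributions."""
    total = 0.0
    for e, a in zip(expected, actual):
        e, a = max(e, eps), max(a, eps)  # avoid log(0) for empty bins
        total += (a - e) * math.log(a / e)
    return total


# Hypothetical feature, e.g. "days since last contact".
edges = [0, 10, 20, 30, 40]
training_values = [5, 12, 15, 18, 22, 25, 28, 35]
live_values = [25, 28, 31, 33, 35, 36, 38, 39]

baseline = bin_fractions(training_values, edges)
stable_score = psi(baseline, bin_fractions(training_values, edges))
drift_score = psi(baseline, bin_fractions(live_values, edges))

ALERT_THRESHOLD = 0.25  # assumed rule-of-thumb cutoff, tune per feature
print(f"stable: {stable_score:.3f}, drifted: {drift_score:.3f}")
print("alert" if drift_score > ALERT_THRESHOLD else "ok")
```

The same pattern, a baseline captured at training time compared against a rolling window of live traffic, extends to latency or memory SLAs by swapping the metric being binned.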
RM: Cool. Yeah, I have heard of some people working on similar things, but nobody that I've seen raise any significant funding or really start to roll out at scale yet. But there are some people thinking about it. I know there's at least one company out of MIT. Good. Last question: towards the end of last year, there was this sort of, like, debate that I call the Gary Marcus versus everyone else debate, which is, does deep learning get us closer to really understanding intelligence or not?
Do you need a fundamentally different approach? And I'm always interested in, for guests, do you have an opinion on that? How far will deep learning take us?
VM: I think the fact that deep learning has so effectively solved perception, or at least pulling features out of real world data, implies to me that we have, at least to first order, solved the problems in silico that the brain solves, at least in the visual cortex or in the auditory cortex. And again, this is very handwavy, but to first order.
I think techniques like this, the application of linear algebra to lots of data, are going to be the main thing that we use to solve intelligence. And I think deep learning as we see it now may not be feedforward neural networks. It could be complex probabilistic models whose shape we haven't thought about yet. But, I think linear algebra with lots of data isn't going anywhere, and it's going to be the main workhorse of this revolution. I think when people have strong opinions to the contrary, it's often in part because they're confusing intelligence with consciousness. And those are two completely different things.
I do not believe that linear algebra is going to solve consciousness. I think that's a much different conversation. But I think building systems that can replicate the things that we used to think only brains could do-- we've seen that deep learning nails that. That's what intelligence is. So yes, it seems like humans can do qualitatively different things from the systems of today. But I think it's because we're the pilots, and so we feel privileged as the, sort of, passenger of this computer. And we think that consciousness is really the core driver of this, but it's not. It's really just data processing.
RM: That's it. Very interesting answer. We haven't had anybody go into that much depth yet or differentiate between consciousness and intelligence, so that was good. But yeah, that definitely is a philosophical debate. Vedant Misra, thanks for coming on today. And thank you, guys, for listening. If you have questions you would like us to ask or guests you'd like us to have on the podcast, please send those to firstname.lastname@example.org. And we'll see you next week.