Interview with Chris Albon (Part 2 of 3)
Published 4/28/2017
In today's episode, I interview Chris Albon, co-host of Partially Derivative, a fantastic casual discussion podcast about all things data science. Chris is joined by Vidya Spandana and Jonathon Morgan on the show. We discuss the exciting prospects of machine learning and data science in this three-part interview!
Today's episode is sponsored by Fuse! Build native iOS and Android apps with less code and better collaboration. Head over to spec.fm/fuse to learn more today!
Transcript (Generated by OpenAI Whisper)
How do you make decisions as a human? How do you consider your options and then choose whatever final option that you want? This is a question I want you to ponder as you listen to today's episode of Developer Tea. This is the second part of the interview with Chris Albon. My name is Jonathan Cutrell, and my job here is to help coach you through some of the hardest parts of your career and help you level up as a developer. I want to help you cultivate the mindset of a great developer. The mindset of a great developer is not fixated on tools; it's not fixated on languages or really hyper-specific things. The great developer mindset plays the long game. That's what I want you to grab a hold of as a listener of the show. I want you to treat your development career like you would treat a long-term investment account. If you're moving too quickly, or if you're changing your mind all the time, if you're buying stock and selling it right away, then that account is ultimately going to be worth very little money in the long run. If you invest predictably, if you ask bigger questions instead of get-rich-quick questions, if you ask questions like, how can I become a great developer and continue to build my career over the course of the next 20 or 30 or 40 years? How can I build my career not just based off of the one language that I know, but instead off of the way that I think? That's what I want you to be thinking here as you listen to today's episode, and really every episode of Developer Tea. Thank you so much for listening to this show. Thank you again to Chris for coming on the show. If you missed out on the first part, make sure you go back and listen to that. I'm going to get out of the way, and we're going to get into the second part of the interview with Chris Albon, the co-host of Partially Derivative. I'll share my first experience of actually implementing something that used machine learning. I didn't actually do the algorithms myself. I just used a service and basically evaluated pictures for the objects in the picture. Then we created a live search against those pictures, which a lot of you have probably experienced by now in Google Photos or in your Apple Photos or whatever it is that you use. You don't have to index your own photos anymore. There's something that will recognize that there's a cat in that photo, and you don't have to put cat as a tag. The first time I did this, I did it against a bunch of screenshots of my home movies and basically just started typing in words that would show me the home movies that would have that thing in them. It was like a rediscovery. It was a very deeply emotional experience for me, which is probably unusual for the average person. You're probably going to see a demo and not connect with it directly. It feels kind of business-y for a lot of people, and to be sure, business will be driving this forward. But I think the wrong perception that the hype drives is that there isn't a human or good side to this. That somehow it's going to kill every job, that it's out to make our world more systemized, that there's no optimistic version of it being portrayed. At least, there's less optimism in the world about AI and about machine learning than there should be; these cold, calculated business narratives are what gets portrayed. There's so much more to it. There's so much more power, and I guess power is kind of a bad word to use. There are so many more possibilities, which is the better way to think about it.
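To make the home-movie search Jonathan describes above a bit more concrete, here's a minimal sketch of the pattern: run each image through some object-recognition service, then index the images by the labels it returns so you can search them by keyword. The `recognize_objects` function here is hypothetical; it stands in for whatever tagging service or model you plug in, and none of this code is from the episode.

```python
# Minimal sketch: tag images with an object-recognition service, then
# index them by label so searching "cat" returns the matching images.
# recognize_objects is a hypothetical placeholder for a real service/model.

from collections import defaultdict

def recognize_objects(image_path):
    """Hypothetical call to an image-recognition service.
    Should return a list of labels, e.g. ['cat', 'sofa']."""
    raise NotImplementedError("plug in your tagging service or model here")

def build_search_index(image_paths):
    """Tag every image once, then index images by the labels found in them."""
    index = defaultdict(set)
    for path in image_paths:
        for label in recognize_objects(path):
            index[label.lower()].add(path)
    return index

def search(index, query):
    """Return the images whose tags contain the query word."""
    return sorted(index.get(query.lower(), set()))

# Usage, once the tagging service is wired up:
#   index = build_search_index(["frame_001.jpg", "frame_002.jpg"])
#   search(index, "cat")   # -> paths of frames that contain a cat
```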
There are so many things that you can do very simply with it. Like Chris was saying, you don't really have to understand every single thing that's going on. I guess the hope of this episode of this podcast is to kind of give people the vision for what they could do, some of the things they could do with machine learning. I think, again, that door is opened up by the first or second step of understanding what it does; exactly what you said, what it is and what it isn't. There are just so many examples of it adding more tools to people's toolkits so that it allows you to do better things. Just fly a drone up and then shoot a camera down. There was this one where, I think, they flew a drone over the ocean and flew it back, and they trained it, not the drone, but the image recognition machine learning algorithm, to spot whales. So they would fly it out and fly it back, and then they would count whether there was a whale there, or something like that. That just opens up this huge range of things that people didn't think we could do before. To me, it's one of those things where everything in our life is just going to get smart. I know there's branding around smart items and smart internet and smart whatever, but every single product around us is just going to be a little bit better. My huge hope is that it takes some things off our table that we don't need to think about. I hate notifications. I think notifications are crazy; every single app wants to notify me, every single website wants to add itself to my notifications, stuff like that. I want the anti-notification. I just want things to happen behind the scenes. The example I really love is Amazon's grocery store, which they're trying to make, where they use a neural network and a camera, I guess a bunch of cameras, so that when you pick an item off the shelf, it recognizes you, because of the app on your phone, I guess, and it recognizes the item that you're picking off the shelf, because they've trained it to recognize the items that are in the store. Your experience is that you walk into the store, open the Amazon app, grab an item, and walk out of the store. That's it. I want that experience, because I don't want to stand in line, and I don't want to think about whether I should choose Register 4 or Register 3 or anything like that. I want to be distracted by what matters in my life and just absent-mindedly deal with these mundane parts. That, to me, is the killer-app part of a lot of these things. I kind of hope it gets to the point down the road where the jobs that people do but don't like are the ones we eliminate through automation, because it turns out that those jobs are mind-numbingly dull. There's a problem with that transition between what we have now, where we have a lot of manual jobs, and where we have very few manual jobs, and I think that transition is probably going to have a lot of pain to it, which sucks. But I hope that we get to the point afterwards, a generation through or something like that, where it's like, no, no, no, there are still jobs and there are still people working, but they're different jobs, because AI and automation open up new areas and new jobs that we didn't think existed before, and now of course they exist. It's just a shift, and I'm of course worried that that shift is going to be painful.
And there are lots of things around universal basic income and that kind of stuff to alleviate that, and it's something that I wish the government thought about more. But at the same time, I really do want to get to this part where we as human beings get to focus on spending time with our families and focus on, say, cooking if we like to cook, whatever we want, and all the little mundane things in our life are just sort of handled by something else. And it's not one great AI, right? It's like, the Amazon store has a neural network, so it means you don't have to stand in line at the register, and then your fridge has a camera in it, and whatever you set, you always want to have milk, and so whenever it sees that the milk is low, it just orders more milk and more milk arrives. That kind of stuff, where I don't need to think so much about all those parts of it, would be beneficial to me, because, you know, me and my wife split all the chores evenly and that kind of stuff, and I spend a huge amount of time, and so does she, doing all these mundane parts of our lives just to sort of keep going. Yeah. I'd like to get past that. And a lot of these things around AI are really appealing to me, like self-driving cars. I live in a really rural area, and man, I would love a car that drove itself, because I have to drive like two and a half hours to the airport. It would be awesome if my car would just drive it, or even better, if I didn't even have a car, and a car drives up, and I get in, and then I fall asleep, and then I wake up at the airport. Stuff like that would be a better life experience for me, and AI is a huge part of moving that forward. And it's working. I mean, a car can drive itself. Not perfectly, apparently, but we're getting there; give it five years. And that's crazy, that a car drives itself, and relatively reliably. If every car on the road was a self-driving car, it would happen now; the technology is certainly available to make that happen. It's really interesting. One of the things I heard in my master's program has stuck with me ever since. I did this certificate, a management of technology certificate, at Georgia Tech, and basically it's kind of the intersection between big business management and technology. And one of the things that my professor said was that the definition of technology is knowledge embedded into something. And that has stuck so well with me, because I see it over and over and over again. So going back to that movie, Hidden Figures, certainly a fantastic movie in my opinion, I think it's really good. I don't know how accurate it is, but some of the messages it portrayed, particularly about technology, by the way, I thought were really good. But there's this idea that you're going to be replaced because the knowledge that you have, or the capability that you have, the ability to manipulate something in time or space or whatever, can be embedded into another thing. So you think about the Industrial Revolution. This was technology taking on the job of the people who were effectively making the bolts. Now we have machines that can make the bolts, and so the people who made the bolts have to go and do something else.
And this is true for so many individual things along the way, where the knowledge is no longer needed inside a human; we can put it into something else, whether it's a machine or it's bits or whatever it is. We can offload some of that information, some of that knowledge, into something else, and then the thing can have more predictable output. The thing can be more predictably cost-efficient, and we don't have to rely on what is relatively fallible. People are relatively fallible. So if you think about taking this information out of people's heads and instead putting it into something else, now we have a much more reliable kind of base level to work from. Yeah, I mean, I really like the idea of technology as putting knowledge into something, because that's really close to what machine learning feels like, in that we talk about training models and optimizing things and stuff like that. But where the paradigm is different is that traditionally, in knowledge-based systems, what we would say is, say we had a bunch of photos of fruit and we wanted to predict something, whether each one was an apple or not an apple. We'd have to come up with some deep definition of what an apple is: it's green; well, some of them aren't green. Okay, it's certain sizes; well, some oranges are that size. Okay, it has smooth skin and it's green, et cetera, et cetera, et cetera. The paradigm of machine learning is totally different, where we as humans specialize in the learning algorithms, the algorithms that teach computers stuff. And then what we do is we say, here's a bunch of photos of fruit, some are apples and some are not, and we'll tell you which ones are apples and which ones are not. Then we apply these learning algorithms that we've come up with, we look at the output, and the computer actually builds a model for what it thinks the definition of an apple is and how to predict apples and stuff like that. But we don't do it. We don't have a definition of an apple, and all these algorithms don't have a definition of what a, you know, middle-aged man looks like; they've just learned how to predict that kind of stuff. And that's such a cool paradigm to be in, where we're the teachers of these kinds of things, where we give them the strategies for learning, and then we kind of release them to go figure that stuff out. Yeah. Of course, the big problem now is that we don't have a lot of data. There's a lot of data in the world, but the data that we can use for things, there's just not enough of it. There are not enough photos of people's fridges to train a really good model to predict what's in your fridge. Right. We just need more labeled photos of fridges.
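As a concrete illustration of the paradigm Chris describes above, here's a minimal sketch in Python using scikit-learn: rather than hand-writing a definition of an apple, we hand a learning algorithm labeled examples and let it build the model. The features, numbers, and labels are made up for illustration and are not from the episode.

```python
# Minimal sketch: no hand-written "definition of an apple", just labeled
# examples and a learning algorithm that builds the model for us.
from sklearn.linear_model import LogisticRegression

# Each row: [redness 0-1, diameter in cm, skin smoothness 0-1] (made-up data)
X = [
    [0.9, 8.0, 0.90],  # red apple
    [0.3, 7.5, 0.90],  # green apple
    [0.8, 7.8, 0.20],  # orange (rough skin)
    [0.2, 3.0, 0.80],  # grape
    [0.7, 8.2, 0.85],  # apple
    [0.9, 2.5, 0.70],  # cherry
]
y = [1, 1, 0, 0, 1, 0]  # 1 = apple, 0 = not an apple

model = LogisticRegression().fit(X, y)

# We never told it what an apple is; it learned a decision rule from examples.
print(model.predict([[0.4, 7.9, 0.88]]))  # likely classified as an apple
print(model.coef_)                        # the weights it learned per feature
```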
I had this conversation actually very recently. I was talking with a couple of business associates, basically. I can't name names at this point, but we were talking about this idea of, you know, what is the value of all this information that Twitter is amassing, right? For someone who doesn't know a lot about machine learning, who isn't really acquainted with that concept, it's hard to see. Me looking at it, knowing something about machine learning, I see that this is just a ton of data that is connected to humans. And when humans create something, when humans create some kind of informational artifact, something they've just been spewing out, well, that can be used, and we can predict things from it. I don't know what exactly, I'm not sure exactly what, but something behavioral can be determined from Twitter, right? Yeah. Something valuable is embedded in this moment of somebody interacting and writing out some words. We can actually use that information and predict stuff. Yeah. No, and I think, coming from the social science background, I'll just say it: it is much harder to predict humans than it is to do other things, even probably self-driving cars. Humans are really crafty, and they do stuff randomly, but more like strategically randomly, where they try to be random at moments when you wouldn't expect it. It's a nuanced relationship, and people have good days and bad days, so there's internal stuff and there's external stuff. Humans are a very complicated thing. But there's just so much stuff out there. And the one example I love is Siri. People constantly mock Siri for not being that smart, right? For not being able to understand what you mean, and that kind of stuff. And I don't know for certain, I don't know anybody working on it at Apple or anything like that, but I could almost guarantee that the reason they released Siri before it was perfect is because they need your data, the data of you trying to do stuff, to make it better. So every single time it gets it wrong and you get frustrated, they're trying to detect that and they're trying to retrain it. Every single time you're satisfied with what it gives you, that data goes back to Apple and they retrain their universal Siri model based on that. And it's the same thing with self-driving cars: the big problem is not that neural networks can't drive cars, it's that we just don't have a lot of self-driving cars out there compared to what one would need to cover every single situation possible. One of the examples was in Pittsburgh, I think, where Uber has self-driving cars, or maybe it was a Tesla or something like that. They put salt on the road because it was snowing, and that was kind of the first time the computer had ever dealt with salt on the road; the salt leaves these lines, these white lines, on the road. And of course the neural network has never seen that before, because they don't have training photos of that; they don't have enough. But give it 10 years of self-driving cars and they'll start to have covered a lot of stuff. Yeah, everything. We're getting there, but it is about getting that data. And of course, humans are like the final frontier, where we have lots of this messy data around humans, but humans are insanely hard to predict. So probably the final frontier is predicting more than the rudimentary stuff that a human does. But maybe we'll get there. What does it take to be a real developer? Well, if you're like me, a real developer is determined by what they can build, not how they build it. Of course, there are some interesting things about code. We don't want to do away with code altogether, and code can be a lot of fun.
But ultimately, sometimes code is simply cost. It's a lot of tedium. It's a lot of handholding in your application. And today's sponsor helps you write less code and have better collaboration as you're building iOS and Android apps. Today's sponsor is Fuse. Fuse is an all-in-one cross-platform solution that works on both macOS and Windows. The cross-platform side of this means that you can build Android and iOS native applications. The Fuse installer includes everything you need to get started. They also have some really good, fully explained, full-source-code examples over at FuseTools.com. If you've ever used a game development system like Unity, Fuse is essentially Unity for app development. Go and check it out: spec.fm/fuse. And of course, you can find the great examples over at FuseTools.com. Thank you again to Fuse for sponsoring today's episode of Developer Tea. I think it's kind of weird whenever I read about, for example, and I've talked about these things on the show so many times, things like the Enneagram and other personality tests, the categorization of different personality types, and using the psychological side of things to inform other pieces of what we do. There are a couple of APIs out there that can take at least a piece of text and run it through these predictive models that basically say, most of the time, people who use this kind of language fit this personality type. Of course, I could be copying and pasting from another thing, and it's not me at all. But I do think it's interesting that for certain things, like, for example, purchasing decisions, we have a relatively good ability to predict how many people are going to purchase this thing based on how many people purchased it yesterday and the day before and the day before that. So as a whole, we have certain predictive abilities. But then when you're trying to say exactly what one person is going to do, that's incredibly difficult. It's very hard to understand what the trajectory of a single person would be. It's the very problem that a company called Stitch Fix is tackling. They do machine learning, but for boxes of clothes that they send you, for men and women, I guess they do both now. An algorithm learns about you, partly through a conversation with a human, and then they send you four or five items of clothing that an algorithm has decided you want. And it turns out that that simple thing, figuring out what sweatshirt you want, is insanely, insanely difficult. I think they have something like 75 to 100 data scientists working on it. It's really, really, really hard, because people have preferences, and those preferences change, and they don't even know their preferences. They can't even list out their preferences for clothing; it's literally in the neural network of their brain. So it's a hard problem. But it's one of those things where, the more we get it right, it would be really nice if clothes showed up at my door and they were just clothes that I like to wear. It reminds me of probably the most cited AI or machine learning movie of all time, Minority Report. Right. So when he walks into the Gap, and I'm not going to talk about the murder prediction or anything like that, but when he walks into the Gap, they basically scan his eye and they're like, hey, how did you like the jeans that you bought last time you were in here? That's almost happening now in my inbox. You know, it's pretty amazing.
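An aside on the purchase-prediction point Jonathan makes above: forecasting aggregate demand from the last few days of sales is, in its simplest form, a lag-based regression. Here's a minimal sketch with made-up numbers; it isn't from the episode.

```python
# Minimal sketch: predict the next day's purchases from the previous few days.
from sklearn.linear_model import LinearRegression

daily_sales = [120, 130, 128, 140, 150, 149, 160, 172, 170, 181]  # made up

# Turn the series into (previous 3 days -> next day) training pairs.
window = 3
X = [daily_sales[i:i + window] for i in range(len(daily_sales) - window)]
y = [daily_sales[i + window] for i in range(len(daily_sales) - window)]

model = LinearRegression().fit(X, y)

# Forecast the next day from the three most recent days of sales.
print(model.predict([daily_sales[-window:]]))
```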
It's like around the corner. When I walk in now, you know, you have things that are going to at least recognize your phone, and your phone is going to recognize where it's at, and it'll pop up something that says, hey, what did you think about those jeans that you bought last time? It will feel quite similar, maybe not quite as eerie as something scanning your eyeball, but certainly the same kind of information going around. Yeah. I mean, I have not seen this yet, but I could almost guarantee that in, you know, a year or two max, you'll see in New York and San Francisco ads, sort of like the billboards on bus stops and stuff like that, that are directed not specifically to you, but to the collective you. Right. Like, oh, you're a man, here's clothes; you're a woman, here's dresses. They'll just see who's around and then try to direct it toward that. I could almost guarantee that the technology to do that is definitely commercially out there. And so, you know, you could go down the crazy road of thinking that it's going to mine everything, and it's mining your Twitter, and it's decided something about you; that's probably not going to happen. Really? Yeah. But a basic thing of saying, hey, well, we use this API and it predicts your gender and your age, and then we show you products for people of your gender and age, something like that? Okay, cool. So when I walk around, it's camping gear. Right. Yeah. Exactly. You seem like you might like that. Yeah. And we already see that; that's been going on with AdSense for a long time. A long time. Yeah. And the goal, of course, was that the experience for the person is better, which was always the argument for AdSense: it's customized ads, so you only get ads for things that at least you might want to buy. Yeah. But it's one of those things that, hopefully, can be used to make our lives more enjoyable and better and can make products more enjoyable and better, which hopefully improves our lives. And then, yeah, there are lots of really interesting conversations going on now about how to do it ethically. Right. And that's a really important discussion that's just getting started. I feel like it's been the Wild West until now and will probably be the Wild West for another few years, and then we'll finally start to get a handle on it. But there's discussion of how to do it right. Even if you have a bunch of information about people, how do you do it in a way that's responsible, where, you know, you don't want someone walking down the street being shown things that are private, that they don't want shown. Yeah. Even if you do, in fact, buy those things, it's not okay to show that on a billboard while you're walking down the street. Exactly. And, you know, it's one of those things where this is a new set of tools that we're picking up; it's sort of a new discovery, a new planet of opportunity, and we're just barely at the beginning of it, and we don't know how to do everything with it yet. Right. There's no machine learning undergrad degree yet.
And I bet you in a few years there will be. And there are barely data science master's degrees. Yeah, it's because, when all of us got started, of course, there wasn't anything like that. It was just, did you go to a certain meetup and get intrigued by someone's talk, and then look online and see other things about it? That's where this whole thing came from. Yeah. And a lot of this stuff has a background that's rooted in math and statistics and, you know, business analysis, that kind of stuff. Linear regression has been around since, I think, the 1950s, I believe; you can probably speak more to that than me. But a lot of this, you know, the fundamental ideas here are all driven by math and similar types of things. So, for example, an n-gram: you have a collection of words, and every clump of three words would be, I guess it's called a three-gram, a trigram. I don't know. Anyway, yeah, a three-gram. Evaluating language and evaluating clumps of data together, that kind of stuff has been around for a while, but it hasn't been given to a machine to, you know, generate a million iterations of a given thing to test your assumptions, right? That is where the new and powerful capabilities are coming in. So in my management of technology certificate, we did some linear regression to try to predict demand, that kind of stuff, right? That's been around as long as spreadsheets have been around. But this new frontier of, like you said, automatically mining data and doing some of the things that AdSense has been doing since the beginning, but making that available to startups and people who are brand new in the space, that stuff is new. And we're looking at, I think you said on the most recent episode of Partially Derivative, a 20-year kind of revolution in this space. I think that's spot on. I think we're looking at a huge shift; pretty much every area, every discipline you can imagine is going to be affected by this in a pretty major way. But it is exciting.
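A quick illustration of the three-gram (trigram) idea mentioned above: slide a window of three words across the text and count how often each clump appears. A minimal sketch in plain Python, not from the episode:

```python
# Minimal sketch: extract and count trigrams ("three-grams") from text.
from collections import Counter

def trigrams(text):
    words = text.lower().split()
    return [tuple(words[i:i + 3]) for i in range(len(words) - 2)]

sample = "the cat sat on the mat and the cat slept"
counts = Counter(trigrams(sample))
print(counts.most_common(3))
# e.g. [(('the', 'cat', 'sat'), 1), (('cat', 'sat', 'on'), 1), ...]
```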
And I think the ethics question is probably one of the most important things to start thinking about early, right? Because it's kind of like thinking back to the early days of the internet. It's easy to look at things from the perspective of, that's kind of like a toy, right? That's not something to really take seriously yet. And I think we're just now, like you're saying, getting to the point where we're actually seeing data science as powerful. It's no longer just an academic thing. It's no longer theoretical. It's actually real. And with the power that you have with it, you can really infringe on people's rights; you can infringe on people's privacy, at the very least. And there's a lot that goes with that. And so, in my opinion, privacy is kind of the equal frontier for the next 20 years as well. Yeah, I mean, you know, the big thing that's coming up now, I think it's called algorithmic responsibility, is that there's a bunch of different algorithms that we use, you know, like deep learning versus, say, regression, which would be considered a machine learning algorithm in a way, and decision trees and random forests and all that kind of stuff. And for some of these algorithms, once it's trained, I could tell you why it made a decision. So I could literally take your data in and say, oh, well, you were denied a loan because you're below a threshold of age and you defaulted on a credit card, or something like that. Sure. But then there are all of these artificial neural networks and deep learning. We put the data in, we train the model, we understand how it works and how it learns, but the way that neural network is structured, I couldn't tell you the one thing that got you denied for a loan. You would just be denied for a loan. You'd be like, why? And I'm like, I don't know, the algorithm decided you shouldn't get a loan. So it almost takes on a personality, right? But the issue is that it becomes an issue of trust: a large amount of trust has to be put in the algorithm itself, that it's working well and there aren't negative side effects, which is a massive ask of people. There's no reason you should trust that your loan company's algorithm isn't being racist. Why should you believe that? Hopefully they have tried really hard, and there are a bunch of really good techniques you can actually use, like, for example, the simple one of not telling the computer about race, so it can't factor that in directly. Yeah. But, of course, you don't know what they're doing behind the scenes; of course, they could be including that kind of stuff. And then, you know, there's a discussion of, well, these neural networks are really powerful, but do we want that? Maybe we should use something like a random forest or a regression or a logistic regression, where I could tell you exactly why it made a decision or didn't make a decision. And it's this real back and forth, because neural networks are incredibly powerful, and all of the really cool stuff that's getting a lot of pop culture attention, like self-driving cars and Google's translation, which is now a neural network, and that kind of stuff, those are all things where it's really hard to figure out why the computer made that decision. And we're just barely getting to the point where we're starting to address this. For example, there's a research project, a paper that just came out, where they trained a neural network to explain what another neural network was doing; the second neural network was trained to convert how the first neural network was working into human words. That kind of stuff, so we can actually have some kind of responsibility, because algorithms have been used to inflict some sort of structural violence against people for a long time. Totally. You know, even before computers were really in there, people used formulas that weren't explicitly about race, but clearly they were designed to be racist; they just targeted things by using variables other than race in order to do that. And that can totally continue under this scenario. And that's why we need to have this conversation around these things, because, you know, the black box, when people talk about neural networks, they talk about it as a black box, because we don't know why it made a decision. We just know how accurate those decisions tend to be. You know, is that okay? And they're really powerful. We all want a self-driving car, but we also want to know why it turns left and why it turns right, and if it gets into an accident, why did it do what it did next? Why did it decide to swerve into the pole, or not, or whatever? Yeah. And with that kind of stuff, it's hard to describe how recent that conversation is and how much it's still taking its infant baby steps; we're talking months ago this kind of stuff got started. We are just barely getting there, and it's a 20-year conversation that we're just in month three of, or whatever, maybe a year.
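To make the interpretability contrast Chris draws above a little more concrete, here's a minimal sketch of the "I could tell you exactly why it made a decision" side: a small decision tree trained on made-up loan data, with no protected attributes as inputs, whose entire decision process can be printed out branch by branch. The features, numbers, and labels are invented for illustration; this is not how any real lender's model works, and it isn't from the episode.

```python
# Minimal sketch: an interpretable model whose decisions can be read off
# directly, in contrast to a neural-network black box.
from sklearn.tree import DecisionTreeClassifier, export_text

# Features: [age, annual income in $1000s, number of past defaults] (made up)
X = [
    [22, 25, 1],
    [45, 80, 0],
    [31, 40, 0],
    [52, 95, 2],
    [27, 30, 0],
    [60, 60, 0],
]
y = [0, 1, 1, 0, 0, 1]  # 1 = loan approved, 0 = denied (toy labels)

tree = DecisionTreeClassifier(max_depth=2, random_state=0).fit(X, y)

# The whole decision process is inspectable, branch by branch.
print(export_text(tree, feature_names=["age", "income_k", "past_defaults"]))
print(tree.predict([[29, 35, 0]]))  # and any single decision can be traced
```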
Thank you for listening to the second part of my interview with Chris Albon, the co-host of Partially Derivative. And of course, Dr. Chris Albon, I guess I should call him by that name, but he's such an easy person to talk to, such an enjoyable person to talk to, and so passionate and excited about the topics that he's discussing, so we just naturally kind of hit it off. I really feel like I can fully endorse you guys, as Developer Tea listeners, going and subscribing to his show. So thank you again to Chris for coming on the show. Thank you to Fuse for sponsoring today's episode of Developer Tea. Write a little bit less code, collaborate a little bit more, and see things instantly updated on your devices or an on-screen simulator. Go and check it out: spec.fm/fuse. Thank you again for listening to today's episode of Developer Tea. There is one more part to this interview, and if you don't want to miss out on that, make sure you subscribe in whatever podcasting app you use. Thank you again for listening, and until next time, enjoy your tea.