
How Averages Can Trick You and Obscure the Truth

Published 9/27/2021

Averages can trick you into accepting a generalized idea about a complex set of data. This kind of compression happens not only with averages but with any other process that summarizes information.

Ask yourself: What am I missing in this story?

๐Ÿ™ Today's Episode is Brought To you by: Retool

Build internal tools, remarkably fast. Connect to anything w/ an API. Drag-and-drop React components. Write JS anywhere.

Check them out over at https://retool.com/startups/devtea

📮 Ask a Question

If you enjoyed this episode and would like me to discuss a question that you have on the show, drop it over at: developertea.com.

📮 Join the Discord

If you want to be a part of a supportive community of engineers (non-engineers welcome!) working to improve their lives and careers, join us on the Developer Tea Discord community by visiting https://developertea.com/discord today!

🧡 Leave a Review

If you're enjoying the show and want to support the content, head over to iTunes and leave a review! It helps other developers discover the show and keeps us focused on what matters to you.

Transcript (Generated by OpenAI Whisper)

We use averages all the time as engineers, as people. We use averages to describe groups of other figures, groups of other numbers. In today's episode, we're going to talk about how averaging can trick you. It can make you believe that something is true that isn't quite true. My name is Jonathan Cutrell. You're listening to Developer Tea. My goal on this show is to help driven developers like you find clarity, perspective, and purpose in your careers.

I have a driving underlying reason that we're talking about averages today. But before we get to that, I want to talk about the mechanism here. How exactly do averages sometimes miss the whole picture? I'll give you a simple example to understand, then we'll talk a little bit about the math, and then we're going to talk about how this applies to our roles as engineers, as managers, as product developers, as thinkers, and as human beings.

Most of us are first introduced to averages in grade school, specifically with our grades. We talk to the teacher about what makes up our grades. In a very simple grading system where there's no weighting of one grade versus another, your grades average together. So on one exam you might get a 100, and on the next exam you might get an 80, but you walk away with a 90 average. Now, of course, teachers have also introduced weighting systems, so that the pop quiz you weren't prepared for on a random Friday isn't weighted the same as, let's say, your final exam. You might have a final exam that counts five times as much; you can imagine multiplying that final exam five times, and you're back to working with averages. So the basic idea, if you're somehow unfamiliar with averaging, is that you add all the numbers and then divide by the number of numbers. If all the numbers are the same, then the average is whatever that number is. But let's say you add 70 plus 80 plus 90 plus 100.
That's 340, and dividing by four gives you 85. Now, before we go on a rant about how this can trick you into believing things that aren't necessarily true, let's first say that averages are incredibly valuable. In fact, the vast majority of machine learning algorithms work off of various usages of averages. In particular, when you're looking at things like loss functions, you are probably using some kind of average, for example the mean squared error. That is an example of an average that we use. It's incredibly valuable.

But averages often don't tell the whole story. Our example of 70 plus 80 plus 90 plus 100 gives us an average of 85. But two 70s and two 100s give us the same average. Now, this is not really that surprising when you have the boundaries of 0 to 100. You can kind of intuitively guess that with those boundaries in mind, there are only so many combinations, and the grading system probably works out to be relatively good, relatively fair. But even within this system, sometimes the results are a little bit strange, or they seem strange if you try to put a story to them.

For example, I want you to imagine the kind of student that the following grades likely came from: a 95, a 95, a 0, and then another 95. Similarly, I want you to imagine the kind of student that the following grades came from as well: a 75, an 80, a 70, and a 60. I'm sure you see this coming, but these sets of grades both average to the same grade. According to the final grade that you would get with these scores, assuming that those grades are not weighted, these students are identical. Now, this is problematic, because as you begin to put a story to these two different pictures, you can understand some of the possible failings of using averages as a definition of a full set of numbers. In the first student's situation, very clearly a pattern of success is established.
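The two grade sets can be checked with a quick sketch using Python's standard library (the numbers come from the example above; the script itself is just an illustration, not something from the episode). Both sets compress to the same average, but a dispersion measure like the standard deviation immediately reveals how different they are.

```python
from statistics import mean, stdev

student_a = [95, 95, 0, 95]   # strong student who missed one assignment
student_b = [75, 80, 70, 60]  # consistently mediocre student

# Both sets compress to the same number...
print(mean(student_a))  # 71.25
print(mean(student_b))  # 71.25

# ...but the spread reveals two very different stories.
print(stdev(student_a))            # 47.5 (one catastrophic outlier)
print(round(stdev(student_b), 2))  # 8.54 (steady performance)
```

The average alone treats these students as identical; the second number is what carries the story.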
They have three 95s in that set. And very often, zeros are handed out to students who forget to do an assignment. It's possible that this particular student had a really difficult week; maybe something happened to them personally that made them forget their work, and they got a zero as a result. The other student, however, by traditional measures at least, is a less successful student. They have a consistent pattern of significantly lower grades than the first student. And yet, when you average them out, they both come out to the same number. So what the system is implying is that consistency, or showing up, is more important than excellence.

There are a lot of things you can infer when you look at these individual cases. You could infer that the system is heavily biased against doing nothing, in other words, against forgetfulness or skipping class. And that intuition would be backed up by the basic arithmetic of the system: the top end of failure in most grading systems is a 60, and the bottom end of failure is a zero. By not showing up, this student lost a 60-point margin. But the point of this is not to criticize average-based grading. There are plenty of other options for grading: many teachers grade on a curve, many teachers have that weighted system we were talking about, and many offer options to redo something that you missed or forgot. The point is instead to highlight the fact that averages are a shortcut for describing a large set of numbers in a very reductive way. And sometimes that reduction is actually a compression of meaningful and important information that shouldn't necessarily be compressed.

I'll give another specific example of this, and you can watch out for this and spot it in a lot of different cases, in press releases for example. In this particular case, it was in health news, basically a summary of a study that was done.
And the headline, which was obviously crafted for clicks, essentially said that you lose something like 24 minutes off of your life if you eat a hot dog. The basic message you might take from this is that for every hot dog you eat, you very directly lose some amount of time off your life, and that you're trading 24 minutes for a hot dog. That's the typical response. But the way that this study was actually conducted is quite different. This turns out to be an average.

Imagine that you have 100 people, and all things being equal, those 100 people have a life expectancy of, let's say, 77. Now, let's say that you were tasked with finding out how much it impacts a person to eat hot dogs for their entire life, and so you run a study. Somehow you get perfect data; this is all completely infeasible to do. But you get perfect data showing that 10 of those 100 people died about 20 years before everyone else. In other words, they died at the age of 57, while the other 90 lived all the way to 77. There are a number of ways to report this information. For example, you could say that one out of 10 people who eat hot dogs for their whole lives will die 20 years earlier. You could replace that one out of 10 with 10%. You could also say that 90% of people who eat hot dogs their whole lives will live a full and happy life all the way to the age of 77. What you could also do is take the number of years that is lost, in this case about 200 life-years, and average it across the whole set of people. 200 years divided by 100 people is two years. So on average, if you were to take this whole group of people who all eat hot dogs for their whole lives, they are living on average two years less than people who don't eat hot dogs.

Now, this is all made up; there's no real scientific information in this discussion. It's all statistics. But the important factor here is that no one in this particular made-up scenario actually lived only two years less. There are two basic cases.
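The arithmetic in that made-up scenario can be sketched directly (these are the hypothetical numbers from the example, not figures from a real study):

```python
population = 100
early_deaths = 10        # people who die 20 years early
normal_lifespan = 77
early_lifespan = 57      # 77 minus 20

# Total life-years lost across the whole group
years_lost = early_deaths * (normal_lifespan - early_lifespan)
print(years_lost)  # 200

# Averaged across everyone, that is 2 years per person...
average_loss = years_lost / population
print(average_loss)  # 2.0

# ...even though nobody in the group actually loses 2 years:
# 90 people lose 0 years, and 10 people lose 20 years each.
```

The "2 years" figure is real arithmetic, but it describes an outcome that no individual in the group experiences.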
One case is to live all the way to the age of 77, and the other case is to die young at the age of 57. The averaging changes the way you think about the information. Now, if you're used to reading these kinds of studies, then you know how to read into this information. And by the way, if you wanted to convert that to how much a single hot dog takes off of your life, then you can just divide by the number of hot dogs. That's another type of averaging, just in a different way: the average per hot dog, where you're dividing rather than multiplying.

But the underlying message here is that as you compress information, and averaging does this, you begin to lose details of what that information was about. This shouldn't be a surprise to us as developers. When you see things like averages, or another example of this, burn rate, there is more to the story. When you see these metrics, you have to ask: what else is there? What is that compressing? What am I missing? When you take this and turn it into a single dimension, what were the previous multiple dimensions? What story is here that isn't getting conveyed in this compressed version?

We're going to take a quick sponsor break, and then we're going to come back and talk a little bit about how this applies to your life as a software engineer.

Developer Tea is supported by Retool. After working with thousands of startups, Retool noticed that technical founders spend a ton of time building internal tools, which means less time on their core product. So they built Retool for Startups, a program that gives early-stage founders free access to a lot of the software that you need for great internal tooling. The goal is to make it 10 times faster to build the admin panels, the CRUD apps, and the dashboards that most early-stage teams need.
We've bundled together a year of free access to Retool with over $160,000 in discounts to save you money while building with software commonly connected to internal tools, like AWS, MongoDB, Brex, and Segment. You can use your Retool credits to build tools that join product and billing data into a single customer view, or convert manual workflows into fully featured apps for your team. You can even build tools that help non-technical teammates read and write to your database. And there's so much more. Retool will give you a head start with pre-built UI, integrations, and other advanced features that make building from scratch much faster. To learn more, check out the site, apply, join webinars, and much more at retool.com. Thanks again to Retool for their support of Developer Tea.

Imagine your team is trying to estimate the effort necessary to complete a particular project. Of course, this is a very hard thing to do. We've talked a ton about estimation on the show, and I don't want to get too far into the weeds of estimation methodology. Instead, I want to talk about how we can fool ourselves when thinking about averages as a way of modifying our behavior. So let's imagine that you estimated that something was going to take you, let's say, three weeks, and it ends up taking six weeks. You've gone over by a hundred percent; however you want to look at it, you took two times longer than you expected. Now, let's imagine that on the next round, you modify that. You take a lesson from that experience, and instead of underestimating, you overestimate. It takes you half as long as you expected it to take.
And let's imagine for a moment that you expected it to take you six weeks, and this time it takes you three weeks. So, on average, you have been correct across both scenarios. This is the counterintuitive part about averaging errors together. Because as you look at the metrics, if you were to compress both of those together and look at your error, your error in this case was a vast underestimation and then a vast overestimation. You would imagine that a team that does this kind of over and under, vacillating between these forever, is on average correct. But the truth is that they're never correct. Now, you could say that this is semantics, but there are some ramifications to this, right? So let's talk about this a little bit more in depth.

If you were to look at the error, the error in both of these cases is three weeks. Three weeks over or three weeks under; in both scenarios, the error is cumulative. In other words, your total error here is actually six weeks, even though, if you were to add the estimates all together, it would have come out to nine weeks, which is exactly how long the work actually took. If we're all tracking our time right here, we had one project that we expected to take three weeks and it took six, and we expected the next one to take six weeks and it actually took three. That's a total of nine. But what has happened here is that, number one, you haven't gotten better at estimating, right? The estimation process has not been refined. And additionally, in each of these iterations, as you are estimating with error, there is some loss.
Now, this could be loss at an administrative level that gets covered up. It could be loss of morale; maybe it's stress-inducing. Or, in the case where you overestimated, there's extra slack, perhaps more than what the company can handle. In other words, they allocated time for you to do something, the six weeks, and you ended up being done for three of those six weeks with nothing to do. That can be a huge cost for the company as well. And it's not just company costs, as we've already outlined; these errors can cause frustration and stress, and ultimately it's harder to plan with this kind of variability.

Now, again, my goal is not to push you into revising your estimation strategies. You might end up doing that; you might not. My goal instead is to encourage you to think more about what an average is telling you. What is the composition of the underlying numbers? There are methodical ways to ask this question. For example, you might ask about the distribution of those underlying numbers. You might ask about outliers; there are statistical measures to describe all of this. Or you may just want to actually look at those numbers and see whether they track, even intuitively, with what you understand from the average. It's tempting to use an average as a blanket descriptor. This is especially tempting for managers who are running reports, for example hiring managers who are looking at salaries, and very often those individual cases are so unique that averages become much less meaningful.
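The estimation trap from earlier (three weeks estimated versus six actual, then six estimated versus three actual) is easy to state in code: the signed errors cancel to zero on average, while the absolute errors never do. A minimal sketch, using the made-up numbers from the example:

```python
from statistics import mean

# (estimated_weeks, actual_weeks) for the two projects in the example
projects = [(3, 6), (6, 3)]

signed_errors = [actual - estimate for estimate, actual in projects]
absolute_errors = [abs(e) for e in signed_errors]

print(mean(signed_errors))    # averages to zero: "on average, correct"
print(mean(absolute_errors))  # three weeks off every single time
print(sum(absolute_errors))   # six weeks of cumulative error
```

This is why mean absolute error (or a look at the full distribution of errors) tells you far more about an estimation process than the average of the signed misses.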
So all of this is to say: not only with averages, though averages are certainly a good example, but with every way that information gets compressed, you should at least be thinking, what is missing? Summarization, for example, is another way that information gets compressed. Sometimes what's missing is okay. Sometimes that missing piece only serves to distract, or the differences aren't meaningful; all of the numbers are very close and we're just looking for something that describes how close they are. But when you detect that something is missing, you should be asking: what exactly is happening? What am I missing? What is not here?

Thanks again to Retool. To let them know that you're coming from Developer Tea, go to retool.com/startups/devtea. That link is in the show notes as well.

Thanks so much for listening. If you want to continue talking about things like what kinds of information averaging compresses, or if you have more thoughts on this particular subject or other similar subjects, or any episode from the back catalog, or if you have any kind of comment, please join the Discord community. Head over to developertea.com/discord. Joining and participating in that Discord is always going to be free; we're never going to charge for any of that. So please come and join at developertea.com/discord. Thanks so much for listening, and until next time, enjoy your tea.