Developer Tea :: The Dangers of Measuring Side Effects

The Dangers of Measuring Side Effects

Published 1/4/2021

You want to measure what matters. But your measurements might change what matters.

Understanding that anything that is measured might modify behaviors, it's important to then understand what should be targeted. How can we incentivize the right actions that produce the side effects we want, rather than incentivizing direct manipulation of those side effects?

Transcript (Generated by OpenAI Whisper)

How are the things that you're measuring affecting your outcomes? This may seem like a strange question to ask because it seems like measurement should be an external observer, but so often that's not the case. My name is Jonathan Cutrell, you're listening to Developer Tea, and my goal on the show is to help driven developers like you find clarity, perspective, and purpose in their careers. So this question is a simple one, but the answer is pretty complicated. When we measure things, that observation that we do can have an effect. It can have an effect on the way that we think about whatever it is that we're measuring. And so let's take a very simple example. When you're defining success, let's say you are running a department at a company and you're trying to determine success. Maybe you run the sales department. So you're trying to decide, okay, what is a way that we can decide whether or not we've been successful this quarter? Now, maybe you have different categories of success. So you have kind of a minimum viable level. We have to meet this, otherwise we've got to shut down our department. And then you have something between that and perfection, and somewhere in the middle is, this is our target, or this is a reasonable goal to set. And then at the very top is, if you get here, then we're all getting bonuses and we can probably retire if we go anywhere beyond that, right? And so there's this spectrum of what you could expect. And obviously it's good to be at one end of that spectrum and it's not so good to be at the other end of that spectrum. Okay, so this sets up the same kind of binary thinking that we've had in previous episodes. We talked about this binary thinking about washing the entire year of 2020, for example, as a negative year. And this being the first episode of 2021, I assume that you are probably thinking about, okay, how can I be successful in this upcoming year?
I'm really excited, by the way, about 2021. We're going to do a lot with Developer Tea, hopefully, this year. We're going to do some new media formats, maybe video. And so as I move into this year, I've been doing a lot of this planning myself, trying to decide how do I measure what is my success? We've been around for six years as of tomorrow, when this show airs, right? This episode, I'm recording it the day before it airs, so it's technically two days from today, but in any case, six years, this show has been around. And just the longevity of the show is something that I count as a success. But certainly, perhaps the most important thing to me is the messages that I receive from the listeners. I'll receive messages about how the show has impacted their careers. Okay, so if I were to optimize on any one thing, let's say longevity of the show, right? If I were to optimize on that, then I might sacrifice some of the quality of this show, right? I might sacrifice some of those things that I actually care about equally or even more, like getting messages from listeners saying that this had a good impact on their lives. If I were to be running a sales department and I optimize only on revenue, like we were talking about before, there is some metric where we say at the bottom end, you know, we have to meet this particular point and then up here at the top end, you know, we're all going to Hawaii, right? If we go that far, then we're doing great. We don't even have to worry about anything. Okay, well, if we do this one kind of metric measurement, then what we end up doing is gamifying or providing incentives that can become twisted, right? And this happens whether we want it to or not. So think about it this way. If I were to optimize on longevity of the show, then I think, okay, this is a good thing. I'm not going to just do the bare minimum to keep the show alive, but if that's the thing I'm measuring, how long does the show stay alive?
Well, I'm not putting boundaries on other important metrics. More importantly, if we were to optimize, say, on revenue as the sales organization, then we might even get to the point where we're bending ethics. Hopefully, we all have the same ethical concerns in mind. Hopefully we all, at our company and our department or whatever, have enough of an overlap of values and enough of a social contract between us and maybe our customers that we're not going to, you know, try to extort them. We're not going to try to cheat them out of money. We're not going to go and actually embezzle anything. We're not going to do insider trading, right? We're not going to step over those lines. But what we might do is sacrifice the quality of our product, right? We might cut corners. We might only look for short-term revenue because that's the measurement window that we care about, right? We're not incentivized to do things that are not being measured. And because that measurement tends to change our behaviors, because we tend to wrap our behaviors around the measurement that we care about, then the otherwise important things may fall by the wayside. Okay, so why am I talking about this? At the beginning of every year in our house, we take time to reset our pantry. All right? This is a very important time in our family, right? Because over the year, our pantry tends to become more and more messy. And it's hard to only clean out one part of the pantry because really what you need is a pantry system. If you haven't done this yet, I highly recommend it. What we do is we take out all of the food, all of the food that we've amassed over the year in our pantry. Sometimes we buy things at the grocery store that, optimistically, we want to eat, but we don't ever end up eating them, and then they expire and then we have to throw that food out. And so that's kind of the first round. We look for things that have expired, we clear those out, throw them out.
Then we move on to, okay, what things should we get rid of because we don't really want to eat it? And this is typically a health question, right? Is this something that's healthy? And so we have these different layers and filters that we choose to remove things out of our, you know, stash food that's out on the counters. So at this point, our pantry is totally empty and then we use this psychologically sound system to refill our pantry. This is a lot of fun, by the way, to kind of look at the psychology of food and the psychology of pantry. Believe it or not, there's quite a field of study around this. So one of the things we do, for example, is we put the bad food in hard to reach places and then we put the good food, the things that we want to eat on a regular basis, we put it at our eye level, all right? And then there's some practical concerns like the food that the kids can reach versus the food that they shouldn't be able to reach. Okay, so at the end of this process, what we have is a cleaned out pantry with food that's organized in a way that is most advantageous to our health. That is the ultimate goal, right? That's the goal that we want to reach when we start out. That's hard to measure, right? It's hard to measure unless you were to say, okay, I'm going to take a picture at the end and decide whether or not that's true or I'm going to take a snapshot in a few months to see if I'm actually a little bit healthier. Maybe we get some blood tests or something like that, right? That would be the real measurement to determine if this was a successful endeavor. But a lot of the time, what people set out to do when they start to clean out their pantry is not principled. It's not based on some theory because they're not nerds about psychology like I am. Instead, they want to throw out as much as they can. They want to clear the clutter out. And they might be incentivized to fill up as many trash bags as they can. 
And what this may lead them to do, by the way, is put as much of their food in the trash cans or in the trash bags as possible so that when they get to the other side, they say, wow, look at all of the food that we've thrown away. But they may end up with a pantry that is not what they really want. They may end up with a pantry that is lacking a lot of the important foods that they cared about because they threw out as much food as they could because that was the goal. That was what they were measuring. They were looking to see how much clutter they could remove. And in the process, they removed more than clutter. So this is obviously not about food, right? This is about measuring things and how our measurements can affect our behaviors. Because in this particular scenario, the food that I threw out, the number of trash bags that I filled was a side effect. This was kind of the exhaust, right, of the engine of trying to make the pantry more sane and healthy. It wasn't the goal. It wasn't the intention. The intention was not to see how many trash bags we could fill. That is something that you could look at after the fact. It's a retrospective measurement rather than a prospective measurement. It's not a goal. It's not something that you're trying to optimize for. Now, this is difficult, right? Because as you begin to fill up trash bags, you realize, hey, this trash bag is getting full. This is something that's meaningful and good because the more trash we throw away, that means the more clutter that we've cleared. We start to apply these labels and we start to apply meaning to these side effect measurements. We have to be highly aware of the side effect measurements. We have to be very careful, very careful about what we measure and what we pay attention to, especially when it's a side effect. Another example of a side effect measurement might be how many points are you finishing in your sprint, right?
Maybe you run agile points like a lot of software engineering companies do and you're looking at, okay, how many points did we finish? Okay, this is a side effect measurement. This is not a goal. This is a side effect measurement because when you set your team up for success, when you have well-defined cards, when you have focus hours, when you have people positioned in the right place, when you have stakeholders available, all of these things that are considered best practices, when you have all that stuff in place, the side effect is productivity. It's a good side effect. It's one of the, it's kind of the reason why we care about it. So I'm not saying to not care about the side effect, right? Our goal is actually to have productive sprints, so what we're saying is when you try to make your goal something that you're measuring and optimizing for, sometimes that goal gets subverted. Another way to think about this. So let's imagine you're starting to build a garden. In the spring you might plant a garden, right? A lot of the work that you're going to do is preparing the ground, preparing the soil, making sure you understand the nutrient needs of plants, making sure that you have the right fences built up so that different types of animals or insects don't get in. You might have some kind of insect repellent that you put on the perimeter. All of these things that you're doing to prepare the area for the growth. And then when you actually plant the seeds, you plant them at the right depth, make sure they get the right sunlight, etc. None of this work is directly affecting the growth of the plant. In other words, you can't pull the plant up. You can't, you know, if you're measuring the height of the plant, there's not really much you can do to directly affect that. As it turns out, that's a side effect of all of the other work that you've done. You've established the right scenario to produce the side effect. Think about this for a second.
When I'm cleaning out my pantry, I've established the right scenario where I've established the right patterns, the right, you know, systems to produce the trash bags full of things that we don't want anymore. When you've established the right scenario for customers to buy your product, like focusing on quality and focusing on good marketing strategy, these are establishing that garden. And the side effect turns out to be really good sales, high revenue. You're not going to be able to directly affect revenue. And if you did try to directly affect revenue, it's very much like trying to stretch a plant out of the ground. You're going to pull the plant up and it's going to become uprooted. You're going to do the same thing with almost any side effect measurements. Side effects cannot be directly affected. When you try to directly affect them, you have kind of thwarted the system. You've created a bad incentive and you're doing something that isn't going to result in long-term positive effect for that side effect. Long-term positive trajectory. So what does this mean? Well, it means we need to pay attention to what we're measuring, number one. Number two, recognize when what we're measuring is retrospective or is observatory versus prospective. In other words, we're trying to change that measurement. It's very difficult to do this, very difficult to measure something and not have it affect the way we think about it. So how do we manage this? There's a couple of strategies. None of them are bulletproof, unfortunately. One of them is measure a lot of other things as well. So if you have 20 things that you're measuring, you can't really optimize for any one of them. Choosing which ones matter is an exercise left up to the reader. This is something that if you're listening and you decide that a few things matter, key health metrics or something, well, make sure that you have a balance of those things. That one balances the other. So don't choose a single metric.
Revenue cannot cannibalize another metric. We have to have some minimums on multiple metrics so that we're not using a single metric that can create a gamified system that we don't want, perversions and so on. So that's one strategy, measure multiple things. The second strategy would be to try to keep the analytics of that, the analytics team or whatever the analysis people completely separate from the execution people. So there's some kind of moderation between these two. This is extremely difficult, very unlikely to be successful and probably isn't healthy. This is probably not a good option, but it's possible to do this in certain types of systems like, for example, if you're working in a machining or if you're doing something like assembly line work, you can have some external measurements that the people on the assembly line are not privy to and then that external measurement goes through some kind of process that determines what are those side effects and what are the behaviors that result in those side effects. And then those behaviors are what are presented as the goals to the people on the assembly line. Because the difficulty here is that it's very difficult to actually keep that stuff separate. It's difficult to keep those metrics unknown to the people who are actually working on the assembly line. Okay. So there's not really a lot of ways to do this other than to say, so the third option, this is kind of the potentially the good option. The third option is, don't measure it. Don't measure your side effects. Now, how does this help? Well, you can retrospectively look back at side effects. You can say, okay, we did great this quarter. Check it out. We've got some really good revenue coming in. That's awesome. Keep doing what you're doing, right? Because the behaviors that you've engaged in have resulted in this. Or you can say, ah, we've got to fix something. Our side effects are showing poor performance. 
But we're not going to talk about what the, what good performance would look like. We're not going to set goals on our side effects. This is a difficult problem because people are going to crave the certainty of understanding what the goal should be. Well, where do you want the revenue to go, right? How do we have some certainty in our, in our job security? How do we know when we've hit the right, the right stride, when we're actually doing the right thing? So set the goals about the behaviors you care about, not about the side effects. Set the goals on the behaviors that you care about, right? If you're running the agile team, don't set the goal on the agile points that you finish in a sprint. This is a terrible, terrible way to, to try to set goals on a team like that. Instead, you want to set goals that are based on the behaviors that you care about, like, for example, having good back and forth between the developers and the stakeholders. So hopefully this is a thoughtful representation of why side effects can be dangerous to measure, why it's so difficult to measure them effectively and then create systems around them. But then also an argument for stopping measuring those things altogether. Instead, measuring and putting targets on the behaviors that you care about, that produce the side effects that you do care about. Thank you so much for listening to this episode of Developer Tea. Thank you again for joining me for this, you know, the kickoff, really tomorrow is the kickoff of the seventh year of the show. You know, seven years is a long time for a podcast, for anything to exist as far as media goes, but for a podcast to exist, this is a huge milestone for us. And we're so thankful that we have listeners who have stuck with us. Some of you, since the very beginning of the show, some of you are actually doing back catalog listening. Our plan this year is to expand Developer Tea a little bit. Hopefully we're going to produce a video.
Hopefully we're going to produce podcasts, potentially bonus episodes. We're going to do a guest, just like we did last year. We're going to have all kinds of offerings that are changing and adjusting, not just the two times a week podcast, hopefully more than that. But we're going to stay flexible and there's no promises yet because we have to continuously change the way that we do media as the media landscape changes. So we're going to respond to that and listen to the feedback of you, the listeners. So if you do have feedback, please send it to me. You can reach me at developertea@gmail.com. I'm also on Twitter at @developertea as well as my personal Twitter at @jcutrell. This episode can be found on any podcast platform that you use. We're going to be relaunching developertea.com. In the meantime, you can find this episode and every other episode of this show on spec.fm. Thanks so much for listening and until next time, enjoy your tea.