Listener Question: Marc Asks About Testing Private Methods
Published 7/8/2016
In today's episode, I answer a few questions from listener Marc about testing.
- "Should I test private methods or only public ones?" (Stackoverflow)
- God Object (Wikipedia)
- Sandi Metz's Rules for Developers
- "All the little things" - talk from Sandi Metz about small classes
Today's episode is sponsored by Linode! Head over to Linode.com/developertea or use the code DeveloperTea20 at checkout for a $20 credit towards your cloud hosting account! Thanks again to Linode for your support of Developer Tea.
And lastly...
Please take a moment and subscribe and review the show! Click here to review Developer Tea in iTunes.
Transcript (Generated by OpenAI Whisper)
Hey everyone and welcome to Developer Tea. My name is Jonathan Cutrell and in today's episode I'm answering listener Mark's question about testing. Today's episode is brought to you of course by spec.fm where you can find tons of resources for designers and developers looking to level up. Today's episode is also brought to you by Linode. With Linode you can instantly deploy and manage an SSD server in the Linode cloud. You can get it deployed in just a few minutes. It doesn't take very long at all. We will talk more about what Linode has to offer to Developer Tea listeners later on in today's episode. But I want to jump into Mark's question about testing. Mark sent in an email to developertea@gmail.com. He says, Dear Jonathan, thanks for keeping up with the great show. I appreciate every wise word on my commute. I listen to every single episode so far. Thanks for listening, Mark. I appreciate it. I am a PHP developer with a Java background who is struggling with testing. Ever since I started coding nearly nine years ago I wanted to do proper automated tests, but whenever I think of how to actually do it I end up hitting roadblocks. Usually most methods that are actually complex enough for testing are private, and most public-facing methods are doing too much for reasonable tests or doing mostly side effects. They have mostly side effects. For example, database changes or file creations in the great scheme with no testable return value. Yet most resources state that private code should not be tested outside of test-driven development. And as everyone preaches to keep the public API slim, opening everything seems wrong as well. The alternative of writing tests only for a couple of public methods even in big classes feels no better. Do you have any insights on how I could write proper tests for my existing code bases or how to structure my future code to make testing easier, and how would you go about testing data-heavy side effects? Kind regards, Mark.
Mark, thank you so much for writing in about this subject. This is a difficult subject to cover all in one episode. But hopefully I'm going to give you some good advice that other listeners will find valuable as well. Because I think a lot of people are facing kind of this difficult mystery factor when it comes to tests. It's easy to look at other people's tests and see why they make sense and see how they fit into the overall scheme. But then when it comes to testing your own code, it's kind of difficult to know where you need to start. So let's tackle this, you know, with a couple of the specific questions you've asked. So let's break down the situation that Mark is facing. Mark has a lot of code that is obscured away from the public API. In other words, he has classes that have methods that can't be called outside of the class. They can only be called by methods that are inside of the class. This is pretty much a feature of every object-oriented language at this point. And you can get the same kind of behavior out of JavaScript using techniques like closures or private variables. So Mark, this is a pretty common issue to face because a lot of the complex code, a lot of the algorithmic thinking ends up being shoved into private methods. So we're going to attack this by thinking about the design of the software first rather than thinking about ways to design our tests. Because typically, if software is difficult to test, it's not a problem with the test suite. It's more likely a problem with the design of the software. So that's where we're going to start. And Mark, my guess is, and you can correct me if I'm wrong, send me an email. Let me know how accurate this is. But my guess is your classes have a lot of stuff going on in them. In other words, you have maybe one class that has a ton of methods. It's doing a lot of different things. This is known as a God object. In other words, you have some kind of class that has methods that are really far reaching.
They do a lot of different things. And the reason that I think this is the case for Mark is because he's describing a lot of the symptoms that go along with very massive classes that have a lot of internal methods, particularly private methods, specifically Mark is talking about data heavy side effects, for example. So let's talk about that for a second. If you have these massive classes, if you have one class that does all of the work, then you're breaking a lot of the design principles of object-oriented design. Make your classes smaller. Let's start there. Make your classes smaller. They need a single responsibility. The class in general should have one thing that it is responsible for doing, for taking care of. And that one thing can't be user management, because user management is typically a collection of tasks. For example, user sign up. Well, even that has more than one responsibility. So you can break down these large classes into many smaller classes. Now, there's a lot of benefits that you gain from this. One of them is that you're going to absolutely have to open up some of those methods to the public space. And the reason for that is because you're going to be composing your classes together. So in other words, the process of a user sign up, you might be using five or six different objects to compose together into a user sign up process, maybe even a null object in there. If the user's sign up doesn't work or if you want something like an anonymous user. So there's a lot of different patterns that are being created here that you can open up to the public sphere. And that does not mean that you have to suggest that people using the library or other developers use those public methods. There's a big difference between publicly accessible methods and documented API methods. In other words, people don't necessarily have to use these methods if they're public. You can create some documentation around the suggested methods to use. 
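A minimal sketch of the kind of composition described here, in Python. Every name in it (`EmailValidator`, `PasswordPolicy`, `SignUp`, `ANONYMOUS_USER`) is hypothetical and made up for illustration; it is one possible way to split a sign-up process into single-responsibility classes, with a null object standing in for a failed sign-up, and not a prescribed design.

```python
from dataclasses import dataclass


class EmailValidator:
    """One responsibility: decide whether an email address looks usable."""

    def is_valid(self, email: str) -> bool:
        local, sep, domain = email.partition("@")
        return bool(local) and sep == "@" and "." in domain


class PasswordPolicy:
    """One responsibility: decide whether a password is acceptable."""

    def __init__(self, min_length: int = 8) -> None:
        self.min_length = min_length

    def is_acceptable(self, password: str) -> bool:
        return len(password) >= self.min_length


@dataclass(frozen=True)
class User:
    email: str
    is_anonymous: bool = False


# Null object: stands in for "no user" so callers need no None checks.
ANONYMOUS_USER = User(email="", is_anonymous=True)


class SignUp:
    """Composes the small collaborators into the sign-up process."""

    def __init__(self, validator: EmailValidator, policy: PasswordPolicy) -> None:
        self.validator = validator
        self.policy = policy

    def register(self, email: str, password: str) -> User:
        if not self.validator.is_valid(email):
            return ANONYMOUS_USER
        if not self.policy.is_acceptable(password):
            return ANONYMOUS_USER
        return User(email=email)
```

Because `EmailValidator` and `PasswordPolicy` are their own classes, their methods are public and can be tested directly, while `SignUp.register` can still be the only method that gets documented for outside callers.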
And really right now we're talking about other developers. So even if you in your documentation, you say, okay, these are the methods that we're publishing documentation for. But there are other classes in here that we need to make the methods public so that our other classes can see them so that our other classes can use them. When you do that, you open yourself up to being able to test those methods. So that's my first piece of advice for you, Mark. Take a step back, make more classes, like a lot more classes. Your class-oriented design, your object-oriented design, absolutely relies on you having smaller single-responsibility classes that you compose together. Okay? So take a step back, think about how you can break different pieces of functionality out of your larger classes into smaller classes. That will necessarily take some of these methods that are likely hidden away as private methods of the class and make you move them out to publicly accessible methods. Now we're going to talk a little bit more about this private versus public discussion. And hopefully I'm going to dispel a myth that a lot of you may think is true right after we take a quick sponsor break to talk about Linode. Every developer needs a place to put their projects. Linode is a fantastic location for the small projects that you have, but it also will grow with you when you take those projects and scale them. Linode has eight data centers and their plans start at just $10 a month. You can get a server running in under a minute. There's hourly billing with a monthly cap on all the plans and the add-on services, including backups, NodeBalancers, and Longview. And you can do anything with these servers that you want to do. You have root access. You can create virtual machines for full control. You have Docker containers that you can run on there. You can run your own private Git server. You can run a testing server if you wanted to.
Whatever you want to do with your Linode machine, you can do it. Linode also has native SSD storage and a 40 gigabit internal network running on Intel E5 processors so speed is not a problem. Now, if you're concerned about making a commitment, Linode has you covered there. They offer a seven day money back guarantee. Linode also now offers two gigabytes in that $10 a month plan. That is a fantastic deal. Very economic decision for you, especially if you are just getting off the ground with a project. Linode is a great option for you, especially with the two gigabytes of RAM for only $10 a month. On top of all of this, Linode is offering Developer Tea listeners $20 worth of credit just for using the code DeveloperTea20. That's a promo code. And you can get that code applied by going to Linode.com/developertea. This is a fantastic deal. Breaking this down, it looks like $100 a year, which is a little over a quarter a day. And I can tell you that a server is worth a whole lot more than a quarter a day. So go and check it out. Linode.com/developertea. Make sure you get signed up today because they have this promo running the $20 worth of credit. Linode.com/developertea, of course, that code and that link can be found in the show notes at spec.fm. Thank you so much again to Linode for sponsoring Developer Tea. So let's get back into this discussion on Mark's question. You know, Mark is talking. He's a nine year long developer. He's been doing this for nine years. And so he's learned a lot of things about developing projects. He's learned about class design. Mark has presumably done quite a few projects at this point. But he, like a lot of us, he stumbles with testing. This is a very common problem. And one of the reasons it's a common problem is because we get a lot of information that gets passed down in these kind of bite-sized chunks. Right. So for example, a bite-sized chunk might be "don't test your private methods."
Now it's important that we learn from the smart people who have been doing this longer than us. Right. It's important that we learn from each other. It's important that we learn from our experiences. But I want to dispel the myth that your private methods are not being tested. Your private methods absolutely should be tested. In fact, in my opinion, every single piece of your code should be tested. How it is tested is a totally different story. So this piece of advice that we hear all the time don't test your private methods. Really what we're saying is you can't test your private methods. It's impossible to do that because you can't call those methods from outside of the class unless you do something like create a private method caller like meta method or something, right. You don't want to do that. That's a little bit ugly. So really what you're talking about doing when you're saying don't test your private methods, you're telling people, hey, you need to obscure this code away because it can change often. And it's very different to say this code is going to change often. So you shouldn't rely on the effects of this particular method. It's very different to say that than it is to say don't test it at all because here's the reality. The only reason you have private methods is to use them in a public method. Let me say that again, a class is useless without public methods. So the only reason you have private methods is to use them inside of public methods. And if the ratio of public methods to private methods is significantly small. In other words, if you have very few public methods and a ton of private methods, then once again, there's probably a lot more going on in your classes than should be. In fact, you should most likely be moving large subsections of these private methods out to their own classes. And those methods often will become public. Now, I want to address one other piece that Mark has talked about. All right. 
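To make the point about private methods being tested through public ones concrete, here is a small Python sketch. The `Invoice` class and its method names are hypothetical, and Python only marks methods private by the leading-underscore convention, so treat this as an illustration of the idea and not a rule for any particular language. Amounts are kept in integer cents to avoid floating-point rounding.

```python
class Invoice:
    """Hypothetical class with one public method and two private helpers."""

    def __init__(self, line_totals_cents):
        self._line_totals_cents = list(line_totals_cents)

    def total_due(self, tax_percent):
        """Public method: the only entry point callers (and tests) use."""
        subtotal = self._subtotal()
        return subtotal + self._tax(subtotal, tax_percent)

    def _subtotal(self):
        # Private by convention: free to change without breaking callers.
        return sum(self._line_totals_cents)

    def _tax(self, subtotal, tax_percent):
        # Private by convention: integer arithmetic, rounds down.
        return subtotal * tax_percent // 100


def test_total_due_includes_tax():
    # Both private helpers run here, even though the test never names them.
    assert Invoice([1000, 2000]).total_due(10) == 3300


def test_total_due_of_empty_invoice_is_zero():
    assert Invoice([]).total_due(10) == 0
```

The tests only call `total_due`, yet `_subtotal` and `_tax` are both exercised. If the helpers were rewritten or merged, these tests would keep passing as long as the public behavior stayed the same.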
So the first two pieces of advice, number one was take a step back and make your classes smaller, make your classes smaller and make a whole lot of classes. You shouldn't have five or ten classes in a sufficiently large program. You're going to have something like 50 or 100 classes. And that may seem complicated to you, but try it out and you'll see that a ton of classes is a whole lot easier to manage than one very large class. Okay. So start there. Make your classes smaller and make a lot more of them. The second piece of advice is to look at this information that you're getting from the programming community for example, the simple sound bite, don't test your private methods. Take a step back and ask what exactly do they mean and why? What is the reasoning behind not testing your private methods? In other words, don't take advice blindly. I've said this many times on the show, but investigate why people are saying what they're saying. Of course, they have a lot of experience. And it could be that just like I told you a few minutes ago, make your classes smaller and make a lot more of them and just trust me on that. It could be that they have experience and you should try what they're saying simply because they've been where you are now. Right. But don't just simply trust someone blindly without asking why. Lastly, Mark, I want to talk about the side effects that you're discussing. A lot of methods that we write, especially in general purpose languages like PHP or Ruby or Python, a lot of these things, they have a lot of built-in functionality to do input output. For example, file IO, they may also have the ability to natively, essentially natively connect to a database system and write to that database. So it's easy to create what we call a class, right. We collect it into the name of a class, but in reality, we're just putting a lot of scripts into our methods. 
We have some kind of database cleanup script or we have a generate report script and these scripts that we're creating and we're calling them methods in our classes, but they're individual scripts. These scripts that we're creating, they're very difficult to test because the outcome of them is state oriented. We can't say given this input, expect this output. There's a lot of state change. There's a lot of things that those methods depend on. So generating a report, you're going to go out and read some data that isn't passed into that method at all. It's somewhere else, the state of your database. If your database is empty, then generating a report is going to do basically nothing. If your database is full, then generating a report might fail because of a database timeout. Or in the middle case, it might succeed, but the only way to know that it succeeded is to check your report against the database. And basically, you're going to have to expect a specific database state and you're going to have to know what that report should look like given that database state. Now, if this sounds complicated, that's because it is. If it sounds like a bad idea, that's because it is. So Mark, here's what I want you to do. I want you to go and identify the scripts that are hiding behind the mask of a class. Go identify the things that you've written that you're calling a class method, but in fact, it's just a bunch of utility procedural things that you have to do periodically. It's like jobs, right? Little jobs that you've written, little cron jobs that are hiding inside of a method inside of a class. And what I want you to do with those is break them down into functions that take input and have a predictable output. Break them down into functions that take input and have predictable output. A simple example is that generate report function that we were talking about. We can generate a single row for the report. We can pass in particular information for that row to be generated.
Then we can output some kind of data format, but we can do it in memory. In other words, we can create a function that takes some kind of input and generates output. Now, suddenly, we have a testable method, because what we've done is, instead of going out and getting information and doing everything in one go and the result being a file, we've taken one of the steps out of that equation where we know what the input is and we can determine the output and we can test that now. Then we can compose our longer script type function methods. Some of those things are unavoidable. Sometimes you do need to generate a report. But to make that testable, you can create these individual steps. Now, the interesting irony here, Mark, is that you're probably doing something similar to this in your God objects, in your classes. You're saying that the individual steps are those private methods. So I don't want you to make these methods private. I want you to break them out into classes that make sense. A report generation class is important. And here's why. Imagine a scenario where suddenly the client that you're working for or the product that you're working for, they want you to generate a report in a totally different file format. Well, are you going to take your entire script method that you've written to generate a report and create a copy and pasted version of the same method? That doesn't seem like a good idea because we're repeating ourselves, but just to change the file format. Or is there a better way? Right? Maybe we can generate the report and then pass that report to a file output class. Something that will change the report from a data type that our language supports. Let's say you're working with JavaScript. Your report will be in JSON and then you can generate from JSON to a CSV or you can generate from JSON to an Excel file.
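The report idea could be sketched roughly like this in Python. The order structure, the column names, and the function names (`build_row`, `build_report`, `to_json`, `to_csv`) are all assumptions made up for the example; only the `csv`, `io`, and `json` modules are real standard-library pieces. The point is the shape: rows are built by a function of its input, the report lives in memory, and file formats are a separate step.

```python
import csv
import io
import json


def build_row(order):
    """One input record in, one report row out. No database, no files."""
    return {
        "customer": order["customer"],
        "items": len(order["items"]),
        "total": sum(item["price"] for item in order["items"]),
    }


def build_report(orders):
    """Compose rows into an in-memory report (a plain list of dicts)."""
    return [build_row(order) for order in orders]


def to_json(report):
    """Output step: render the in-memory report as JSON text."""
    return json.dumps(report)


def to_csv(report):
    """Output step: render the same report as CSV text."""
    buffer = io.StringIO()
    writer = csv.DictWriter(
        buffer,
        fieldnames=["customer", "items", "total"],
        lineterminator="\n",
    )
    writer.writeheader()
    writer.writerows(report)
    return buffer.getvalue()


orders = [{"customer": "Marc", "items": [{"price": 5}, {"price": 10}]}]
report = build_report(orders)
print(to_csv(report))
```

Reading the orders from a database and writing the text to disk would still be side effects, but they shrink to thin wrappers around these functions. Supporting another file format means adding one more output function instead of copying the whole report method.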
So now instead of generating a report in one big long method that has a ton of side effects that's really difficult to test, you can generate a report and then you can generate a file from that report. Now that seems much more sane, doesn't it? So what you've created for yourself is a much more testable scenario. So Mark, my third piece of advice for you today to make your code more testable: find the spots where you're hiding scripts inside of classes. In other words, find those side effect methods and break them down. Find the methods inside of them that could be tested. If you take all this information together, I'm essentially telling you to write smaller classes with more public methods. This will make those classes composable. It'll make it much easier to manage in the long run and ultimately you'll be able to test them without breaking the rules of test-driven development. You'll be able to test them in pretty much any kind of test you'd like to test them in. A unit test or an integration test or any of those other popular names for tests, you can basically use all of them when you have a bunch of small classes. It's much easier to manage that way. A bunch of small classes with public methods and you can pass those classes around. It's much easier to design software and test software this way. Mark, thank you so much for writing in and thank you for asking such a fantastic question and being committed to making better code and being committed to testing your code. These are incredibly important qualities in a great developer and people like Mark, people who are asking questions, you are the ones who are going to become great developers. The day you stop caring about the quality of the code that you are writing, the day you stop caring about whether or not your code is testable is the day your career starts going down into the drain. You should start looking for a different career.
You can't continue to be a developer if you stop caring about the quality of your code. Thank you again, Mark. Of course, anyone else with a question for me, you can email me at developertea@gmail.com. You can also connect with me on the spec Slack community. spec.fm/slack. That is totally open and free to everyone who listens to this show. Thank you again to Linode for sponsoring today's episode. If you are looking for a Linux cloud hosting solution, which pretty much all of you are going to need at some point in your career, head over to Linode.com/developertea. That will give you a $20 credit at checkout and for just over a quarter a day you can get root access to a Linode server with two gigabytes of RAM. That's just $10 a month and with that $20 credit you get two months for free. Go and check it out. It's a fantastic deal. Linode.com/developertea. Thank you so much for listening. And until next time, enjoy your tea.