Every company has its own unique challenges. In today's episode, we talk with Bryan Helmig, CTO and co-founder of Zapier, about the challenges he faced when starting Zapier and the challenges he's facing today. Stay tuned for Part 2, coming out on Wednesday.
If you have questions about today's episode, want to start a conversation about today's topic, or just want to let us know if you found this episode valuable, I encourage you to join the conversation or start your own on our community platform: Spectrum.chat/specfm/developer-tea
If you're enjoying the show and want to support the content, head over to iTunes and leave a review! It helps other developers discover the show and keeps us focused on what matters to you.
This is a daily challenge designed to help you become more self-aware and be a better developer so you can have a positive impact on the people around you. Check it out and give it a try at https://www.teabreakchallenge.com/.
Transcript (Generated by OpenAI Whisper)
Every company has its own unique challenges. From special testing requirements based on whatever your product is, to the type of workers that you have, or maybe the type of distribution (remote, local, or mixed office environment), these are just a few of the topics that we discuss with today's guest, Bryan Helmig. Bryan is the CTO and co-founder of Zapier. If you haven't heard of Zapier, I encourage you to go and check out what Zapier does. I'm very excited to have Bryan on the show because Bryan faces a lot of really interesting technical challenges with his team, and we're going to talk about a lot of that on today's episode. I'm Jonathan Cutrell, you're listening to Developer Tea. This show exists to help driven developers find clarity, perspective and purpose in their careers. Before we kick things off today, we do not have a sponsor for the episode, so in lieu of visiting our sponsors to support the show, I encourage you to go to teabreakchallenge.com and sign up. This is a daily soft skills exercise. Sometimes it's literally just telling you to think about something throughout your day. Sometimes it's a five-minute, literal on-paper kind of exercise. Go and check it out at teabreakchallenge.com. These challenges are going to make you a better developer and they're 100% free. That's teabreakchallenge.com. Okay, let's jump straight into the episode with Bryan Helmig.
Bryan, welcome to the show.
Thank you. I'm excited to be here.
I'm excited to have you. For people who have been living under a rock and don't know what Zapier is (and make sure that I'm saying this right as well, Zapier), I'd love for you to give kind of a basic definition of that and of your role as CTO at Zapier.
Yeah, Zapier is the correct way to pronounce it. We always say Zapier makes you happier. In a nutshell, Zapier is a way for usually small and medium businesses to automate their online world. It helps you connect different apps that may work together or may not work together.
A good example is you might have a spreadsheet and you want to trigger off of a row in that spreadsheet and maybe send it to Slack and then have someone Slack-emoji it for some sort of next step. Then it goes back and drops into maybe another sheet or makes a change in your CRM, and you can kind of string together all these really different actions and steps and filters to build your own sort of integration. It can scale from really simple use cases, like I'm a realtor and I want to collect my leads into this CRM or send a text message to these customers, all the way up to people who've built 40-, 50-, 60-step Zaps that are really crazy and incredible. It helps you just automate your online SaaS software, and it's something you can sign up for, use the GUI workflow, and build your own; you don't have to write code. Though you can: one of my favorite things is you can actually write a little Python in there or JavaScript in there, which lets you do lots of cool little things as well. It really ranges from non-technical to technical, from business owners to individuals, all of that. So, automate all my stuff.
So I have a fun story about Zapier, a very short story. When I was working in agency world, I actually used it to help a company run an entire online course. There are so many use cases for this platform, and I'm sure that many of the developers who are listening to this have used Zapier for something or another, so you know that, first of all, the technical capabilities of the platform are pretty extensive, and it's essentially limited only by what you do with it. But additionally, there are certainly a lot of technical challenges that are associated with this kind of extensive platform.
Which means that you have to have a strong team, and you're facing a wide variety of challenges, from UX to performance to, I'm sure, so many that I'm not even going to be able to think of them. So I'm really excited to talk to you, because that level of challenge feels kind of like a developer playground in a lot of ways. But I'm also imagining that this can be pretty stressful at times.
Yeah, it's funny. I do a lot of interviews because we're doing a lot of hiring, and one of the things I always come back to when people ask what's something interesting or cool, or also on the stressful side, about what Zapier does, is that because we talk to so many different APIs (we have 1500 official partners, and that's not counting all the private APIs people have hooked up on our developer platform and stuff), we joke that we're like an edge-case-catching machine for the internet. So literally anything that can go wrong with an API, we find out about it right away and we have to deal with it. And that usually means some pretty smart logic around holding off tasks and not trying to run automations for users while the API is returning 503s or something, and then trickling those out, resuming them or retrying them as the service becomes healthy again. So, as you can imagine, there's lots of behind-the-scenes monitoring and logging and a little machine learning and all that good stuff that goes into just knowing when a service is healthy and when it's not, and reacting appropriately. So that's a piece of it, but the stressful thing is that in the early days we definitely didn't have that sort of sophistication.
So it was very much reactive. Right now we've at least put in some of the systems that help us stay ahead of it and be proactive, and even let users know: hey, heads up, the CRM that you're using is currently having an issue, but don't worry, we're holding all the stuff and we'll release it as soon as we see it be healthy again. So it's made things a lot less stressful over the years. We've gotten better at that.
Yeah, I can only imagine. And you've got this long list of integrations, you said 1500 different partners, and then beyond that there are things where people have built custom integrations, and of course you have things like webhooks. I can only imagine that, at least in the beginning, it's probably easy to view all of these services as reliable: they all have similar interfaces and they're all going to follow the rules. And I'm sure that that has been exactly the opposite case. With 1500 different partners, you probably have 1500 times some N number of exceptions that you have to make for all of those different services.
Yeah, and it's not just the 1500 services but all the unique ways that people will combine them together, right? It's very combinatorially explosive, the way that the services can interact and do all these interesting things. So yeah, it's definitely not a trivial sort of undertaking, and one that we didn't think about as much as we probably should have in the early days. But sometimes that's kind of the special part about naivety in the early times, right? When you're starting these things, if you knew how hard it was going to be, you probably wouldn't start on it. But that's the great thing: you just get a little bit better, a little bit better, and it's taken us years to get to the point where we feel really comfortable with the system we've set up, in particular with reliability around APIs. But yeah, it didn't start that way, not at all.
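The hold-and-release behavior described in this part of the conversation (stop running tasks while an API returns 503s, then trickle them out once it looks healthy) could be sketched roughly as follows. This is a minimal illustration, not Zapier's actual code: the `HoldingQueue` class, its method names, and the batch size are all assumptions made for the example, and a task is modeled as a plain callable that returns an HTTP status code.

```python
from collections import defaultdict, deque
from typing import Callable, Deque, Dict, Set

UNAVAILABLE = 503  # HTTP "Service Unavailable"

Task = Callable[[], int]  # hypothetical: a task returns an HTTP status code


class HoldingQueue:
    """Holds tasks for a service while it looks unhealthy, then trickles them out."""

    def __init__(self) -> None:
        self.held: Dict[str, Deque[Task]] = defaultdict(deque)
        self.unhealthy: Set[str] = set()

    def submit(self, service: str, task: Task) -> str:
        """Run the task now, or hold it if the service is known to be unhealthy."""
        if service in self.unhealthy:
            self.held[service].append(task)
            return "held"
        if task() == UNAVAILABLE:
            self.unhealthy.add(service)
            self.held[service].append(task)
            return "held"
        return "done"

    def release(self, service: str, batch_size: int = 2) -> int:
        """Retry up to batch_size held tasks, oldest first; stop at the first 503."""
        completed = 0
        queue = self.held[service]
        for _ in range(min(batch_size, len(queue))):
            if queue[0]() == UNAVAILABLE:
                return completed  # still unhealthy, keep holding everything
            queue.popleft()
            completed += 1
        if not queue:
            # Only mark the service healthy once the backlog is fully drained,
            # so new tasks cannot jump ahead of held ones.
            self.unhealthy.discard(service)
        return completed
```

A scheduler would call `release` periodically for each unhealthy service; a real system would also add delays between retries and persist the held tasks, which this sketch leaves out.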
Yeah, yeah. I can imagine some of the black box things that developers like myself run into. For example, when I need to write tests for an external service, that's always the thing that I'm procrastinating on, right? Because I know I have to go and make some kind of fixture, or I have to mock it out, or I have to do whatever. And that's like your entire service set, right? Or a large portion of it. So I can only imagine that you have a lot of interesting problems around testing, and I'd love to know how the team interacted in the early days around making decisions for the long run. How do we test, for example, these external services? How did you organize that early team?
Yeah, I mean, the unique thing about Zapier is we're entirely remote. So we've always been really drawn to people who love working independently and tackling problems and digging into stuff. Because one of the unique things about remote is you don't have that person right next to you that you can tap on the shoulder. You can get them on Slack, of course. But being able to just dive in and be curious about stuff and poke at things and learn a little bit about it is important. And you mentioned testing, and testing in particular is really challenging for us, especially with so many services we integrate with. Mocking is something that gets very challenging to do because, A, it's just a ton of work to mock out all those endpoints. But also it's not representative if, in this case, the partner changes something in their API and we still have the old mock. You're not supposed to change functionality within a version number, but the entire world doesn't operate according to your hopes, right? So in a lot of our cases in particular, we're just running our tests directly against sandbox accounts. We don't run them all the time. They'd be too fragile.
But if you're in there and you're working on it and you run them, they're hitting a sandbox account, and they have to be designed in such a way that if we're testing a CRM and adding a lead or trying to delete one, it doesn't break because there is an extra lead or because someone went in and played with it in the sandbox account through the browser or something, you know what I mean? So it does require a certain type of approach that's unique to our kind of product, our kind of world view. That's definitely a little bit different than how most people have experienced test-driven development.
Yeah. You know, a lot of the work that I've done in any kind of integration is kind of cutting our responsibilities at, hey, we're going to rely on this service. The moment this service becomes unreliable, then our service becomes unreliable. There's kind of this acceptance that there is a point at which we can't predict all those changes. But of course, when your product relies so heavily on working with those services, that really is a core tenet of what Zapier does.
Yeah, definitely. And it's a big part of, I think, the value we deliver to users too: you can feel safe that you don't have to be the one to keep track of that API. So if you're a dev and you're thinking, I need to build a bunch of integrations, it's like, okay, I can build 10 or 12 integrations and then I have to maintain them every time something breaks. That's a big value: you can just say, well, we'll integrate with Zapier, and then we'll just make sure it works with Zapier and it automatically works with all the other stuff. We don't have to worry about those 1500, or I guess 1499, different apps.
It's almost like a buffering thing, right?
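The test design described above, where a sandbox test must not break because of leftover leads or someone poking at the account, might look something like the sketch below. Everything here is an assumption for illustration: `FakeSandboxCRM` is an in-memory stand-in for a partner's sandbox account (a real test would call the partner's API instead), and its `create_lead`, `list_leads`, and `delete_lead` methods are hypothetical. The point is that the check only looks at the record it created, identified by a unique marker, and never asserts on totals.

```python
import uuid
from typing import Dict, List


class FakeSandboxCRM:
    """In-memory stand-in for a partner's sandbox account (hypothetical API)."""

    def __init__(self) -> None:
        self.leads: Dict[str, str] = {}

    def create_lead(self, email: str) -> str:
        lead_id = uuid.uuid4().hex
        self.leads[lead_id] = email
        return lead_id

    def list_leads(self) -> List[Dict[str, str]]:
        return [{"id": i, "email": e} for i, e in self.leads.items()]

    def delete_lead(self, lead_id: str) -> None:
        self.leads.pop(lead_id, None)


def check_lead_roundtrip(crm: FakeSandboxCRM) -> bool:
    """Create, find, and delete one lead without assuming anything about other data."""
    marker = f"test-{uuid.uuid4().hex}@example.com"  # unique per run
    lead_id = crm.create_lead(marker)
    try:
        # Look only for our own record; extra leads in the account are ignored.
        found = [lead for lead in crm.list_leads() if lead["email"] == marker]
        created_ok = len(found) == 1 and found[0]["id"] == lead_id
    finally:
        crm.delete_lead(lead_id)  # clean up even if the lookup raised
    deleted_ok = all(lead["id"] != lead_id for lead in crm.list_leads())
    return created_ok and deleted_ok
```

Asserting on something like `len(crm.list_leads()) == 1` would be the fragile version: it passes on a clean account and fails as soon as anyone else has touched the sandbox.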
Even if I was only integrating with a single API, I could use Zapier kind of as a safety net and a buffer to that API.
Yeah, super common. We have people whose first integration is the one they build on Zapier, and that'll cover all their big use cases right out of the gate. If you're going to spend the time on it, it's much better to get the payout of 1500 different services than just one, right? If you just build your integration straight to MailChimp, that's an integration to get, sure, but it's just that one integration. You do have to maintain that, and if you want a couple of them, that work stacks up quick. Which comes back to the value that we can provide. We can get into the trenches and dig in the dirt and get those APIs working again. And then all the other partners just don't have to worry about it. You know, Zapier's got it.
And without dropping those Zap runs, for example, right?
Yep, exactly. So we've got a ton of logic, and I personally have been on the receiving end of it not working the way we want it to and trying to get it figured out, but yeah, it's a lot of that retry logic and backoff and safety mechanisms. They're all wrapped up, and we can handle them on a per-API basis or in aggregate. Like whenever there's an AWS outage, we see a lot of partners go down, right? It doesn't happen often, but every now and then we wonder, oh, is it something in our system that's acting funny, or is it AWS? And we'll just do a quick search of all of our API calls. How many people are getting a 503? And if we see it's pretty much static, it's like, hmm, okay, maybe it's something we're doing. But if we see a big jump, we can say, hmm, looks like maybe an availability zone in Amazon is going a little upside down.
Yeah. So you can see that.
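The "quick search" heuristic described above (is the 503 rate pretty much static, or did it jump across many partners at once?) can be sketched as a small comparison between a baseline window and the current window of API calls. This is only an illustration under stated assumptions: the function names, the `jump_factor` and `min_partners` thresholds, and the `(partner, status)` record shape are all invented for the example and are not from the interview.

```python
from typing import List, Tuple

Call = Tuple[str, int]  # hypothetical record: (partner name, HTTP status)


def rate_503(calls: List[Call]) -> float:
    """Fraction of calls in the window that returned 503."""
    if not calls:
        return 0.0
    return sum(1 for _, status in calls if status == 503) / len(calls)


def diagnose(
    baseline: List[Call],
    current: List[Call],
    jump_factor: float = 3.0,  # assumed threshold, tune for real traffic
    min_partners: int = 3,     # assumed: require several partners affected
) -> str:
    """Guess whether a problem is a shared upstream outage or something local."""
    base = rate_503(baseline)
    now = rate_503(current)
    affected = {partner for partner, status in current if status == 503}
    if now > base * jump_factor and len(affected) >= min_partners:
        return "big jump across partners: shared outage likely"
    return "503 rate roughly static: check our own system"
```

Requiring several distinct partners to be affected is what separates "one partner's API is down" from "something many partners depend on is down"; a real system would likely also bucket by time and hosting region.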
It's kind of an interesting opportunity that you have there with that data. You could almost use some kind of simple learning algorithm to say, oh, yeah, it looks like this particular zone of Amazon is down, maybe even before they can. It's kind of interesting.
Yeah. It's not uncommon. We've definitely alerted partners to issues with their APIs.
And they didn't know about that.
They didn't know about it. And that's not unreasonable. If you're a company that does a ton of volume on your API and there's one edge of it that you didn't really spend as much time monitoring or adding triggers on for exception rates or whatever, it's easy to miss it. So often we can kind of be that second guard of just, hey, heads up, something goofy's happening over here. So we do that a lot.
Yeah. I think there's kind of an abstract lesson to be learned for developers who are listening to this. They don't necessarily work at Zapier and they're like, well, how can I apply this to my work? I think an abstract lesson here is that there isn't one specific way to, for example, test, right? You have to imagine that your testing is supporting the product and the use case and not just following some arbitrary best practice, which would say, for example, don't run your tests against the external service. Well, Zapier needs to do that, right? So the idea being that there are going to be exceptions. There are going to be changes. There's going to be fluidity in the way that your software is written and in whatever practices you choose to use. A lot of the time we can start from whatever the established best practice is and then venture away from it as we see the need. But we shouldn't arbitrarily or blindly judge our practices based on whether they are following some external rule, right?
We should instead design our practices around what we're trying to accomplish.
Yeah. And I think that's a really good point, because it's so easy to get best practices mixed up with dogma. I think your point of just start there but feel free to change it is a really good, healthy way to think about not just the technical side of things but, you know, we're a remote team, and some of the stuff that you do in a co-located company translates, especially on the management side, while some of it gets a little bit harder and you actually have to work harder at it. So sticking to whatever that best practice is as if it is dogma will always put you into some sort of local maxima, which is never going to really push you further and further. So it's a great place to start. Especially for management, you should not try some crazy new wave thing. You should just use what works and then start from there and tweak it for whatever might make sense for you. If you have a good reason or a good narrative, then it probably makes sense.
Yeah. I'd love to hover over this discussion on remote for a few minutes. I'd love for you to share maybe some of your experiences around remote teams. First of all, how can they go wrong? And I realize that's a very big question, so feel free to compress it down to a few highlights of things that you see that could go horribly wrong. And then also, for people like me, I'm a remote employee and I'd love to know, what have you seen in terms of different types of behaviors that remote employees employ or don't employ that can correlate with their success at Zapier?
Yeah, that's great. So the first part, what can go wrong in remote? I think that there's actually a really healthy overlap with what can go wrong with any company, even a co-located one.
So it's not a completely different beast; there are different things that are accented, for sure. I think having a mixed environment can be more challenging. It doesn't mean that it won't work, it just means it's more challenging. For example, if you have a manager who is co-located, and three or four engineers are also co-located with that manager and then three or four are remote, and you default to running those meetings in a conference room and expect the remote folks to dial in, that's an uneven feeling, because you have some people who are together and then some people who are dialing in. Some of the companies that have been successful doing that put pretty strict rules around how that interaction works. For example, they might say everyone has to dial into the meeting no matter what, unless everybody that needs to be on the meeting is there, and maybe even you have to dial in regardless, even if everyone's in the same room. You don't want to create an in-group or an out-group; that can be a big anti-pattern. I remember in the early days with Wade, Mike and myself, we would be sitting next to each other and just typing, I think it was on HipChat back then, and we would just be chatting with each other about what we were trying to work on rather than talking to each other. So even in the early days, when I could have just talked to Wade or talked to Mike, we were still using HipChat and doing that in the digital manner. And that just kind of set the roots of our culture, that everything should happen online, and that's really important. Of course, we do calls as well, but in those cases we often record them, and then anybody who may have missed it or was curious can check out the meeting notes or watch the video.
And it's not uncommon to see people watching these videos at like 2x speed, right? And everybody sounds like a chipmunk talking real fast. It's a really efficient way to gather that context. And that does lean into a little bit of the things that matter if you want to be really successful inside of Zapier as a remote environment: being great at communicating is really important. I'll find myself revising a comment on GitHub, like I'm making a comment on a PR and I'll rewrite it four times, you know what I mean? Just to try to make sure that I'm emphasizing the right thing. Not necessarily that I'm being super kind or empathetic, though that's a part of it certainly, but that I'm being clear about what my intention is here. Like there's the right emphasis on the things that are important, or I call out the things that I'm not interested in that you might be misled into thinking I really care about. It's just that clarity in communication. If you find yourself rewriting a comment two or three times before you post it, just to make sure it says the right thing, those are good sorts of behaviors to have in a remote environment. You don't get body language, you don't get tone over Slack or over GitHub or Jira or wherever the content is going, right? It's just text, so it's easy to misinterpret that. So that, and then the classic stuff, stuff that's probably applicable in any job: being super inquisitive, being helpful. We often encourage people, especially new folks: imagine you'll get your chance to pay it forward in the future, and ask all the questions you need to be successful. Everyone in Zapier is super kind and empathetic and will help you out in a pinch, and then you'll get your chance to help the next person that's brought on. So be super, super inquisitive, ask lots of questions, and then pay it forward when the next person comes on.
Thank you so much for listening to today's episode of Developer Tea. Thank you again to Bryan for joining me. If you don't want to miss out on the second part of this interview, go ahead and subscribe in whatever podcasting app you're using to listen to this right now. Thank you so much for listening and until next time, enjoy your tea.