I messed up my codebase (and the lessons I learned)

I wrote this back in February, and then forgot it in my drafts folder. So here it is, I hope you enjoy reading it.

I wasted an entire week — Monday morning to Friday afternoon — painstakingly searching for a bug that turned out to be fairly simple.

🤦🏼‍♂️

My plan was to release the update for Reusable Components today.

Unfortunately, I need to delay the release until next week. I don't like missing deadlines (especially ones I set myself!), but I'd rather do that than release something that's subpar.

Let me share with you what happened, and some of the lessons I've learned from it.

A high-risk refactor

When I first launched the Clean Components Toolkit, I also built my own custom platform.

But since I just had the one product, I decided to hold off on supporting multiple products at once.

Now, with the update and re-launch of Reusable Components, I needed to add that in. This update takes the original videos from Reusable Components and adds them to my platform, which allows me to add in step-by-step refactorings, quizzes, more written content, and any new features I add down the road.

But this is the problem:

This is a very high-risk refactor.

It affects basically every part of the codebase:

  • Database
  • Entire checkout and license activation flow
  • Authentication
  • Most UI pieces (from hardcoded to dynamic URLs)

I've done lots of complex work and deployments like this in the past, but it turns out I was a little rusty.

All of these lessons I've learned before, but I've now had to relearn them.

1. Unmerged code is a liability

Getting things to work was actually less challenging than I had originally expected.

But instead of deploying my changes like I should have, I left them in a feature branch.

A few weeks went by where I made some other changes here and there, tinkering with my branch without really thinking about it.

I upgraded my auth dependencies, I upgraded Nuxt so I could use getCachedData, and a few other "small" changes here and there.

But those changes added up.

When I went to deploy my branch that was working (at one point), it no longer did.

Yeah, in hindsight I feel like an idiot for doing this, but that's sort of my point here. I'm working entirely by myself, and clearly I've let my coding discipline slip.

This problem is even worse when working on large teams, since your main branch keeps changing while your feature branch just sits there, slowly rotting away...

From now on I will make sure to wrap up and deploy my work ASAP.

2. Small PRs reduce your risk

This is related to my first lesson, but it is distinct.

My entire PR was over 1000 lines long. What's more, it involved a bunch of high-risk and potentially catastrophic changes.

Likely, I should have done this in 15+ different PRs instead of a single giant one.

When I did my first deploy and everything broke, I didn't quite know what the cause was. It all worked on my local machine, but it seemed like the authentication was breaking in production.

It turned out that there were actually a few problems (one of which was the authentication being mis-configured), but it was impossible to tease them apart because so much had changed all at once.

Next time, I will deploy small pieces at a time. If something breaks, I'll know exactly what broke it.

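This is actually how I ended up finding all of the issues. I went back to where it was working, then slowly added pieces back, making sure nothing broke at each step.

Never forget about Gall's Law: a complex system that works has invariably evolved from a simple system that worked.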

3. Trust your instincts

I discovered that my deploy was bad at the exact same time that Vadym bought the Clean Components Toolkit and was trying to log in.

(Sorry for the awful user experience Vadym!)

At first I thought it was a trailing slash issue in my logic (it was, partly). But fixing that didn't resolve everything, because there were a few other things going on.

Within seconds I knew in my gut that it had to do with trailing slashes, yet it took me an entire week to actually confirm that. In fact, for most of the week I wasn't sure it was the trailing slashes.

I'm not sure how I should feel about this.

If I had followed my instincts more closely, could I have fixed the issue sooner?

Really, it's impossible to know.

But learn to trust your instincts. They can be surprisingly accurate!

4. Staging environments are your friend

Although I wouldn't call myself this, I'm basically an "indie dev", building this product solo.

There's a balance between process and discipline on one side, and just getting things done and shipping on the other. I just haven't found it yet.

It's a relatively small codebase that I've written entirely myself, and there's no way for me to do code reviews or more formal pull requests.

When it comes to testing, how much is helping me vs. slowing me down?

During this whole debugging process I ended up setting up a full staging environment. Without it I would have had to debug in production, which is just ridiculous.

But before this, I didn't really need a staging environment.

Every bug or problem could be perfectly reproduced locally, and I wasn't changing my auth or anything that needed a staging environment to test in.

This multi-product support, though, touched so many pieces (running database migrations on build, authentication, configuring environment variables, etc.) that I couldn't properly test it locally.

I'm using Netlify, so I set this up for branch previews. Different environment variables, different OAuth apps for authentication, using test mode in Lemon Squeezy, and some conditional logic here and there to tie it together.
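The "conditional logic" part can be sketched roughly like this. It's a minimal illustration, not the actual code from the platform. It assumes the build reads Netlify's `CONTEXT` variable (which Netlify sets to values such as `production`, `deploy-preview`, or `branch-deploy` during builds). The names `resolveDeployEnv`, `buildConfig`, `OAUTH_CLIENT_ID`, and `OAUTH_CLIENT_ID_STAGING` are hypothetical.

```typescript
type DeployEnv = "production" | "staging" | "local";

interface AppConfig {
  env: DeployEnv;
  // Whether the payment provider should run in test mode.
  paymentsTestMode: boolean;
  // OAuth client ID for the app registered for this environment.
  oauthClientId: string | undefined;
}

type EnvVars = Record<string, string | undefined>;

// Map Netlify's CONTEXT value onto the environments we care about.
// Anything that isn't a recognized Netlify context is treated as local.
function resolveDeployEnv(vars: EnvVars): DeployEnv {
  const context = vars.CONTEXT;
  if (context === "production") return "production";
  if (context === "deploy-preview" || context === "branch-deploy") {
    return "staging";
  }
  return "local";
}

// Build the config for the current environment.
// The variable names here are assumptions for illustration only.
function buildConfig(vars: EnvVars): AppConfig {
  const env = resolveDeployEnv(vars);
  return {
    env,
    // Only production talks to the live payment provider.
    paymentsTestMode: env !== "production",
    // Production uses its own OAuth app; everything else shares the staging one.
    oauthClientId:
      env === "production"
        ? vars.OAUTH_CLIENT_ID
        : vars.OAUTH_CLIENT_ID_STAGING,
  };
}
```

Keeping the decision in one function means the rest of the code asks "which environment am I in?" in exactly one place, instead of scattering checks around.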

It was probably a day or so of work, but I'm glad I have that set up now.

5. Working alone has its downsides

I'm not going to say that I would have avoided these issues if I worked on a team, but my weaknesses and blind spots would have been better covered.

It's harder to let your discipline slip when you're accountable to others.

You have others to push back, and to help you better judge the risk of different changes, and to tell you that you should probably split up the PR because it's way too long to review and who wants to review a super long PR when there are so many other things to do?

I'm not complaining here, I love the freedom of working on my own.

However, over time I'm learning the downsides of working all by myself (and the upsides of working with a great team).

Conclusion

I hope you (or your team) already do these things.

Hopefully, none of this is news to you.

But maybe, like me, you needed a reminder that software engineering "best practices" are there for a reason.

I almost forgot to mention what the exact issues were:

  • Not handling trailing slashes in my logic
  • Something that caused Nuxt 3.10.3 and @sidebase/nuxt-auth to break that I didn't have time to investigate further
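For the first issue, here is a minimal sketch of the kind of helper that guards against it. It is not the actual fix from the codebase. It assumes the logic compares plain URL paths (no query strings or hashes), and the names `normalizePath` and `isSamePath` are hypothetical.

```typescript
// Strip trailing slashes so "/login/" and "/login" are treated as the same path.
// Assumes `path` is a plain pathname without a query string or hash.
function normalizePath(path: string): string {
  const trimmed = path.replace(/\/+$/, "");
  // The root path "/" trims down to "", so restore it.
  return trimmed === "" ? "/" : trimmed;
}

// Compare two paths while ignoring trailing slashes.
function isSamePath(a: string, b: string): boolean {
  return normalizePath(a) === normalizePath(b);
}

// Example: both forms of a login route compare as equal.
console.log(isSamePath("/login/", "/login"));
```

Normalizing at the point of comparison, instead of relying on every caller to pass the "right" form, means a stray slash from a redirect or a hardcoded link can't silently change the outcome.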

— Michael