Jack Marchant

Principal Software Engineer @ Deputy

Building Software with Broken Windows

Ever get the feeling that adding this "one little hack", a couple of lines of code, won't have much of an impact on the rest of the codebase? You think nothing of it and add it, convincing your team members it was the correct decision to get this new feature over the line. In theory, and generally speaking, I would kind of agree with doing it, but every hack is different so it's hard to paint them all with the same brush. If you've been doing software development for long enough you can see this kind of code coming from a mile away. It's the kind of code that can haunt your dreams if you're not careful.

Back to the point: the sub-par code you added has made it possible for a second hack to be added without the same reservations or questioning from team members that you might have faced before. A similar decision was made last time, so we can let this one slide. You may even go so far as to add a comment detailing the hack and the reasoning, patting yourself on the back before merging it in.

This kind of attitude can add up quickly, without you even realising. I would classify this as the go-to example of technical debt - the debt being the block of code you anticipate will need rewriting for one reason or another. Over time, you introduce code like this that isn't as performant as it should be or wasn't written in a way that is extensible. Tech debt should be used like a bandage, a temporary fix to stop the bleeding, but left on for too long, it starts to bleed through.

At some point you will have to repay this debt and figure out a way to remove the code you or your team added to get a "quick win". Some hacks are easier and more straightforward to remove than others. When you're adding code, a good rule of thumb could be: "how easily can this be removed?". It is the removal of code that is taken for granted. We assume the code we write will live on for a long time, but in reality things change often, so we need to be able to move code around, delete it, or completely rewrite it. Easy deletion should be the mark of code that was created well and isn't tangled up in many other files or functions.
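As a rough illustration of the "how easily can this be removed?" rule of thumb, here is a minimal Python sketch. Everything in it is hypothetical and not taken from a real codebase: the `Order` type, the customer ID, the surcharge amount and the function names are all assumptions. The idea is that the hack lives in one function with a single call site, so repaying the debt means deleting that function and one line.

```python
from dataclasses import dataclass


@dataclass
class Order:
    customer_id: int
    total_cents: int


# Hypothetical hack: one legacy customer is billed without the surcharge.
# It lives in a single function with a single call site, so removing it
# later means deleting this function and one line in `invoice_total`.
LEGACY_CUSTOMER_ID = 42  # assumed identifier, for illustration only


def apply_legacy_customer_hack(order: Order, total_cents: int) -> int:
    """Return the pre-surcharge total for the legacy customer, else pass through."""
    if order.customer_id == LEGACY_CUSTOMER_ID:
        return order.total_cents
    return total_cents


def invoice_total(order: Order, surcharge_cents: int = 250) -> int:
    """Compute the invoice total; the hack is applied in exactly one place."""
    total = order.total_cents + surcharge_cents
    total = apply_legacy_customer_hack(order, total)  # the only call site
    return total
```

Had the same customer check been scattered across several functions, removing it would mean hunting down every copy, which is the tangled situation described above.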

Should hacks add up over time and you find yourself buried in tech debt, the attitude towards the codebase can change, to its detriment. This effect is known in software development as Broken Windows: seeing something that's already broken or poorly formed lowers your opinion of it, so you either leave it broken or make matters worse by breaking more windows.

In this metaphor your codebase is a house, and you and your team live in this house. When you add a hack, it's like breaking a window. The first one you might patch up to stop the cold air getting in. Not patching it, however, will open the floodgates for more broken windows. Soon, you'll have three or four in your house. When a door handle inevitably breaks under the pressure of heavy usage, after seeing all of the broken windows, you'll probably just leave the door open and not close it anymore, rather than fixing it or buying a new handle.

How did we get here?

When your codebase is an unmaintainable mess, it's bad for business, it's bad for you (you have to keep fixing it) and it can make others in your team quit if it doesn't get better.

It might seem a bit dramatic to go from a simple hack to all those bad side effects, but it wasn't the hack itself that caused them; it was the attitude that ensued as a result. Unchecked, these decisions can pile up over time without you realising.

How do we fix it?

The first step, as with any other problem, is knowing you have one. That means identifying the problem areas in your codebase, the places where nobody dares go until they are forced to add a new feature. Sounds familiar, right? Refactoring parts of the codebase for the sake of it is a hard sell, since it's difficult to convince anyone it's worth doing now rather than later. Instead, I would recommend waiting until you have a feature that needs to be implemented in that area, or that could benefit from its refactoring. This ammunition can help you prioritise the refactoring ahead of the feature work itself, if it will make building the feature easier. Think of it like an investment.

Planning the redesign of the software alongside the feature itself means that when it comes time to add the feature, it should be a piece of cake - assuming the planning and execution have gone well.

Whether it's a random hack or a poorly architected part of the software, you can treat the problems in the same way. If you're thinking it's too late to refactor and you need to completely rewrite, I would urge you to think again. In my experience a complete rewrite is almost always harder, unless there are other factors in play beyond the code simply being bad. It can be tempting to start again and commit to a new set of guidelines for how you build your software, but in the long run your team will need to be disciplined enough to see a problem and fix it, rather than needing to start again because it got so bad.

In Elixir, I would argue refactoring is at its easiest when you think about modules and functions, as opposed to hierarchical structures that you might find in Object-Oriented Programming languages. Of course, you can still get yourself into a mess in Elixir with the over-use of OTP features and apparent indirection that can come from Meta-programming with Macros. In general I have found it easier than most other languages that I have used.

A simple mindset change might be all you need to progress from an unmaintainable codebase to one that is easy to add new features to. A popular one in programming is the Boy Scout Rule: "Always leave the campground cleaner than you found it", which in relation to programming means you fix something that's broken when you see it - while you're touching that code.

It can also be helpful to take the codebase in the state that it's in now, discuss improvements with your team (or yourself if you're riding solo), and plan for the state you'd like it to be in. When you can agree on how the codebase should look, it's easier to make steps towards that goal each time you write code. With the correct attitude, this will pay off over time.

Tech debt is a mystical beast that can break companies, teams and software alike. Through understanding of how problems like this arise in software development, it's possible to limit the effect it has.

Note for the reader: Planned tech debt is not an excuse for writing bad code, nor should it happen consecutively across features - you may be in more trouble than you think!

. . .

how does a relational database index really work

A common question in software engineering interviews is how can you speed up a slow query? In this post I want to explain one answer to this question, which is: to add an index to the table the query is performed on.

refactoring for performance

I spend most of my time thinking about performance improvements. Refactoring is tricky work, even more so when you’re unfamiliar with the feature or part of the codebase.

exploring async php

Asynchronous programming is a foundational building block for scaling web applications due to the increasing need to do more in each web request. A typical example of this is sending an email as part of a request.

maintaining feature flags in a product engineering team

I have mixed feelings about feature flags. They are part of the product development workflow and you would be hard pressed to find a product engineering team that doesn’t use them. Gone are the days of either shipping and hoping the code will work first time or testing the life out of a feature so much that it delays the project.

technical interviewing

When I first started interviewing candidates for engineering roles, I was very nervous. The process can be quite daunting as both an interviewer and interviewee. The goal for the interviewer is to assess the candidate for their technical capabilities and make a judgement on whether you think they should move to the next round (there’s always a next round). Making a judgement on someone after an hour, sometimes a bit longer, is hard and error prone.

using a dependency injection container to decouple code

Dependency Injection is the method of passing objects to another (usually during instantiation) to invert the dependency created when you use an object. A Container is often used as a collection of the objects used in your system, to achieve separation between usage and instantiation.

3 tips to help with working from home

Working from home has been thrust upon those lucky enough to still have a job. Many aren’t sure how to cope, some are trying to find ways to help them through the day. Make no mistake, this is not a normal remote working environment we find ourselves in, but nonetheless we should find ways to embrace it.

making software a three step process

One of the most useful tips that has guided much of my decision making over the years has been this simple principle: three steps, executed in sequential order;

help me help you code review

Code Reviews are one of the easiest ways to help your team-mates. There are a number of benefits for both the reviewer and pull request author:

a practical guide to test driven development

It’s been a while since I last wrote about why testing is important, but in this post I thought I would expand on that and talk about why not only unit testing is important, but how a full spectrum of automated tests can improve productivity, increase confidence pushing code and help keep users happy.

facade pattern

Design Patterns allow you to create abstractions that decouple sections of a codebase with the purpose of making a change to the code later a much easier process.

the problem with elixir umbrella apps

Umbrella apps are big projects that contain multiple mix projects. Using umbrella apps feels more like getting poked in the eye from an actual umbrella.

lonestar elixir 2019

Last week was Lonestar ElixirConf 2019 held in Austin, Texas. The conference ran over 2 days and was the first Elixir conference I had been to.

genserver async concurrent tasks

In most cases I have found inter-process communication to be an unnecessary overhead for the work I have been doing. Although Elixir is known for this (along with Erlang), it really depends on what you’re trying to achieve and processes shouldn’t be spawned just for the fun of it. I have recently come across a scenario where I thought having a separate process be responsible for performing concurrent and asynchronous jobs would be the best way to approach the problem. In this article I will explain the problem and the solution.

best practices third party integrations

When we think about what an application does, it's typical to think of how it behaves in the context of its dependencies. For example, we could say a fictitious application syncs data with a third-party CRM.

you might not need a genserver

When you're browsing your way through Elixir documentation or reading blog posts (like this one), there's no doubt you'll come across a GenServer. It is perhaps one of the most overused modules in the Elixir standard library, simply because it's a good teaching tool for abstractions around processes. It can be confusing though, to know when to reach for your friendly, neighbourhood GenServer.

offset cursor pagination

Typically in an application with a database, you might have more records than you can fit on a page or in a single result set from a query. When you or your users want to retrieve the next page of results, two common options for paginating data include:

protocols

Protocols are a way to implement polymorphism in Elixir. We can use it to apply a function to multiple object types or structured data types, which are specific to the object itself. There are two steps; defining a protocol in the form of function(s), and one or many implementations for that protocol.

exdocker

Recently, I've been writing a tonne of Elixir code, some Phoenix websites and a few other small Elixir applications. One thing that was bugging me every time I would create a new project is that I would want to add Docker to it either straight away because I knew there would be a dependency on Redis or Postgres etc, or halfway through a project and it would really slow down the speed at which I could hack something together.

working with tasks

While writing Understanding Concurrency in Elixir I started to grasp processes more than I have before. Working with them more closely has strengthened the concepts in my own mind.

understanding concurrency

Concurrency in Elixir is a big selling point for the language, but what does it really mean for the code that we write in Elixir? It all comes down to Processes. Thanks to the Erlang Virtual Machine, upon which Elixir is built, we can create process threads that aren't actual processes on your machine, but in the Erlang VM. This means that in an Elixir application we can create thousands of Erlang processes without the application skipping a beat.

composing ecto queries

Ecto is an Elixir library that allows you to define schemas that map to database tables. It's a super lightweight ORM (Object-Relational Mapper) that allows you to define structs to represent data.

streaming datasets

We often think about Streaming as being the way we watch multimedia content such as video/audio. We press play and the content is buffered and starts sending data over the wire. The client receiving the data will handle those packets and show the content, while at the same time requesting more data. Streaming has allowed us to consume large media content types such as tv shows or movies over the internet.

elixir queues

A Queue is a collection data structure, which uses the FIFO (First In, First Out) method. This means that when you add items to a queue, often called enqueuing, the item takes its place at the end of the queue. When you dequeue an item, we remove the item from the front of the queue.

composing plugs

Elixir is a functional language, so it’s no surprise that one of the main building blocks of the request-response cycle is the humble Plug. A Plug will take a connection struct (see Plug.Conn) and return a new struct of the same type. It is this concept that allows you to join multiple plugs together, each with their own transformation on a Conn struct.

elixir supervision trees

A Supervision Tree in Elixir has quite a number of parallels to how developers using React think about a component tree. In this article I will attempt to describe parallel concepts between the two - and if you've used React and are interested in functional programming, it might prompt you to take a look at Elixir.

surviving tech debt

Technical debt is a potentially crippling disease that can take over your codebase without much warning. One day, you’re building features, the next, you struggle to untangle the mess you (or maybe your team) has created.

pattern matching elixir

Before being introduced to Elixir, a functional programming language built on top of Erlang, I had no idea what pattern matching was. Hopefully, by the end of this article you will have at least a rudimentary understanding of how awesome it is.

first impressions elixir

Elixir is a functional programming language based on Erlang. I’m told it’s very similar to Ruby, with a few tweaks and improvements to the developer experience and language syntax.

write unit tests

Unit testing can sometimes be a tricky subject no matter what language you’re writing in. There’s a few reasons for this: