Why the Local Dev-Env Needs to [Finally] Disappear

Gahl Saraf
January 5, 2023
11 min read

At Raftt we care deeply about developer experience, and a huge part of that is the environment in which developers spend most of their day - writing code, testing, debugging, iterating, and sharing. This environment has two main parts:

- A runtime, where the company’s product runs during development. This could be the local machine itself, the docker runtime, a remote cluster, or a remote VM.
- Various frontends, allowing convenient interaction with the code, runtime, and data during development. Most prominent among these is the IDE, of course, but the browser, DB client, terminal and many other tools all play a role.

By “local environment” we mean that both the runtime and the frontends are run on the local machine - most often, the developer’s laptop.

The developer environment matters. It impacts how effective each and every developer in the organization can be, and even affects their (our!) happiness. We, and most everyone we talk to, know this intuitively, but it is always great to find external validation. As an example, a [study by McKinsey](https://www.mckinsey.com/industries/technology-media-and-telecommunications/our-insights/developer-velocity-how-software-excellence-fuels-business-performance) showed a clear relationship between developer velocity, investment in tooling, and business performance.

## So - why do local dev environments need to disappear?

Before discussing why they are still in use [almost] everywhere, let’s dig into the pains developers experience when working with them. One common misconception is that they generally work fine for smaller teams, and problems only start to crop up as the team grows. To counter that, let’s start with the challenges that even a single developer faces while working locally.

### A single dev’s struggle with their local dev-env

Mika is a developer starting to work on a brand new product for the company she works at. Her first step is getting an initial setup up, upon which she can start building the product. For this she installs various packages, runtimes (Python, Docker, …), and services (MySQL, Redis, …) locally and after some time has something that works. This took a while, so she really hopes she won’t need to change computers for any reason. Also, this setup isn’t anything like what is going to be running in production (where things are deployed as containers, have replicas, are behind load balancers, …), so occasionally things that work locally end up breaking prod, but that is a problem for future Mika.

While she works on the new product most of the time, she occasionally forays into adjacent projects. Unfortunately, every time she switches between the products, she has to manually change some of the installed packages, reconfigure her database, and fiddle with shell environment variables before she is able to actually get any work done.

Worse - sometimes the dev environment will break on its own, due to some kind of external change - the OS was updated, or one of the packages released a new version, or for any of many other reasons. When this happens Mika has to spend time looking for the problem, figuring out what changed and finally fixing it.

**To recap, Mika faced challenges:**

- Around the initial setup and configuration of a developer environment
- Working on multiple projects
- Coming back to work on a project after some time has passed
- Understanding how the code she writes will behave in production

### A team’s struggle with their local dev-env

Mika’s project went really well, and over several months a team has grown around her, with her as the team’s tech lead. They’ve onboarded to the project with Mika’s help, and after some time spent on setup and configuration are able to be productive. Very quickly, they discover a new problem - not all the team members use the same OS - they have a diverse laptop population with macOS, Windows and Linux. Several versions of each OS are in use, Linux-based devs are split between Ubuntu, Fedora and Arch, and some of the Windows-based devs prefer developing directly on Windows, while others use WSL. There are even hardware-level differences - x86 vs. ARM (including Apple Silicon), different CPU core counts and RAM amounts. All of these cause changes in the runtime behavior. There are even tools that are not available for all configurations, so workarounds have to be found.

After some work on in-house infrastructure consisting of a massive wiki page and accompanying scripts, the team is able to get everything to run. Something still isn’t right though - they don’t have the data in their local databases. So they copy over the dump from Mika’s laptop.

As part of growing the team, they decided jointly to invest in infrastructure in the form of a CI pipeline and a staging environment, enabling them to have high confidence when deploying new versions to production. Unfortunately, they discover that there are significant differences between how the code runs locally and how it runs in staging, so sometimes things that work fine locally break the CI pipeline. These are especially hard to debug.

To get good feedback from each other while developing the product, they try to use video conferencing and meetings, but aren’t able to get rich feedback through actual product interaction during development, and end up mostly depending on staging for this.

As time goes on, more and more differences accumulate between the developers’ laptops, and “but it works on my machine” becomes an often-heard excuse. Sometimes this is due to hardware differences, other times due to differing software setup, and occasionally due to drift in the database data. This is tiresome and frustrating.

Finally, Mika realizes that she has been spending more and more time supporting the rest of the team with their dev environment problems. As the senior engineer, she is the one people reach out to when they encounter a problem they aren’t able to solve.

**To recap, Mika’s team faced:**

- All of the challenges Mika faced initially, plus…
- Problems related to differing runtime hardware (CPU cores, OS and OS version, …)
- Hard to maintain wiki pages/scripts
- Inconsistencies in data seeding, and an ever increasing delta in stateful service state
- Difficulties with collaboration between developers
- Significant differences from CI / staging
- Inherent differences in the local environment depending on when it was installed
- Growing drift over time, as people make changes on their computer
- Senior team members spending their time supporting the dev environment for other developers
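
The drift in software setup mentioned in this recap can at least be made visible with a small script. The following is a minimal sketch, assuming each developer can export a snapshot of their installed package versions. The `diff_environments` helper, the package names, and the version numbers are all hypothetical and only serve as an illustration.

```python
from __future__ import annotations


def diff_environments(
    a: dict[str, str], b: dict[str, str]
) -> dict[str, tuple[str | None, str | None]]:
    """Return every package whose version differs between two snapshots.

    A value of None means the package is missing from that snapshot.
    """
    drift = {}
    for name in sorted(a.keys() | b.keys()):
        version_a, version_b = a.get(name), b.get(name)
        if version_a != version_b:
            drift[name] = (version_a, version_b)
    return drift


# Hypothetical snapshots, e.g. parsed from the package list on two laptops.
mika = {"redis": "4.3.4", "requests": "2.28.1", "sqlalchemy": "1.4.41"}
teammate = {"redis": "4.3.4", "requests": "2.27.0", "pymysql": "1.0.2"}

for package, (left, right) in diff_environments(mika, teammate).items():
    print(f"{package}: {left} != {right}")
```

A check like this only covers one dimension of drift (installed packages). Differences in OS, hardware, and database contents would each need their own comparison, which is part of why the wiki-and-scripts approach becomes hard to maintain.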

### An organization’s struggle with local dev-envs

Mika’s team has been crushing their goals, and has grown from a single team to a group of over 50 developers, with additional product, QA, design, and analysts supporting the development. As they grew and no longer fit in a single meeting room or daily, they started having a hard time communicating and staying in sync.

They set up several staging and test environments for people to deploy their work to for feedback, but found that these take a lot of maintenance. To avoid interfering with each other’s work they use a dedicated Slack channel for synchronizing usage of the environments. Often they still end up stepping on each other’s toes. On top of that, deploying work to these environments takes a while, since it has to go through the CI process.

As time went on, the product became more and more complex, and the number of services grew from five, to ten, and then to fifteen, with additional databases, caches, brokers, discovery agents, and more. It became a hassle to run them on their local machines, because it caused the CPU to churn excessively and slowed everything else down. Some of the engineers began to depend more and more on the staging and test environments, at the cost of a much longer feedback cycle, and losing their ability to debug.

**To recap, Mika’s R&D organization faced:**

- All the challenges they faced as a team, plus…
- Difficulties with collaboration between teams and with external stakeholders
- Pains around synchronization of usage and maintenance of remote environments
- Scaling of the dev environment and subsequent inability to run everything locally
- Lost development abilities due to using staging and test environments as primary dev environments

All of these pains cost first Mika, then her team and finally the group a huge amount of time that could have been spent improving the company’s product.

## With so many downsides, why are local dev-envs still in use?

The most significant benefit local dev environments have is how easy it is to make changes. The extra step to use a VM, container, cloud instance, etc. can be cumbersome without the right tooling. This often leads developers to try things out locally, especially when starting something new, when all the surrounding infrastructure (dockerfiles, VM images, clusters, CI pipelines, …) doesn’t yet exist.

They are also extremely customizable - each dev can use whatever OS they choose, with whatever setup. There are many dimensions of changes developers can make to tailor their experience.

The biggest reason by far, however, is simply that they are the default option. Without any investment in development infrastructure, they are the only option. And for many organizations, this investment is out of reach, or not highly prioritized.

## Are there real alternatives?

There are several alternatives, with unique benefits and pains associated with each.

### VM-Based development environments

Until relatively recently, the only alternative to pure local development was using a VM image. There were several ways it could be defined, which affected the way the images were shared and maintained. Examples include a local VM with an image shared between developers, a remote EC2 instance with a remote shared image, a VM defined by Vagrant, etc.

There are several significant problems with this approach, which have led it to generally fall out of favor over time:

- The VMs, while technically ephemeral, are often used over long periods of time, accumulating differences
- Maintaining VM images is painful - it takes a long time, they are large and slow to distribute, and end up being inconsistent between people and out of date
- It is inconvenient to work on a VM - there is runtime overhead, and if working remotely - input latency

### Docker-compose

Once containers became popular, developers began looking for simple ways to orchestrate containers for development. Docker-compose was created as a solution to that problem. It is a concise description of the set of services that need to run, including all the information needed to build, expose, and develop on them. It is very widely used for this purpose, and solves several of the pains of pure-local development.
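
To make the idea of a “concise description” concrete, here is a minimal sketch in Python of the kind of information such a description captures. It is an illustration only, not the actual Compose file format or its API: the service names, images, ports, and the `start_order` helper are all hypothetical.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+

# Hypothetical services, loosely mirroring the fields a compose-style
# description holds: how to build or fetch each service, which ports it
# exposes, and which other services it depends on.
services = {
    "web": {"build": "./web", "ports": ["8000:8000"], "depends_on": ["db", "cache"]},
    "worker": {"build": "./worker", "depends_on": ["db", "cache"]},
    "db": {"image": "mysql:8", "ports": ["3306:3306"], "depends_on": []},
    "cache": {"image": "redis:7", "depends_on": []},
}


def start_order(services):
    """Order services so each one comes after the services it depends on."""
    graph = {name: set(spec.get("depends_on", [])) for name, spec in services.items()}
    return list(TopologicalSorter(graph).static_order())


print(start_order(services))
```

The point of the sketch is that the whole environment is declared in one place, so a tool can derive everything else (such as a valid start order) from it instead of each developer doing it by hand.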

Unfortunately, the convenience comes at a cost - downloading container images locally can take a long time. Because of this, developers keep their environment up for a long time, leading to potential mismatches between image versions, and differences from staging/prod environments.

In addition, direct development over containers is a hassle. To start, the feedback cycle is quite long, as the image must be built and redeployed, making the feedback cycle of a one line change take minutes instead of seconds. Even if that is overcome, the tooling developers use in their day to day is limited - it becomes harder to debug.

Finally, docker-compose still runs everything locally, so once we have many services running we end up with laptops that are uncomfortable to use on … laps. Since it is run locally, we also have no good way to collaborate during the development process.

### Cloud development

The final option, and the newest kid on the block, is cloud-based development. Without going into too much detail (as there is enough depth here for a post of its own), there are two primary modes of cloud-based development.

**Machine Abstraction**

In the first, the abstraction is that of a VM. We get a solution very similar to the “remote EC2 instance” mentioned earlier, but with much better UX - connection to an IDE, hooks and scripts for setup, a built-in way to share the base image, etc. Since it runs in the cloud, it no longer matters that some CPU is always being drained.

**Environment Abstraction**

The second mode instead understands the structure of the product, and recreates it in a way that is convenient for development. So if the product has several containers, some stateful services, some dependence on external cloud resources, the dev infra will be able to recreate it all for each developer. Not only does this offer a consistent experience, cloud runtime, and everything else in the previous mode, but it also potentially allows:

- The development environment to be similar to the staging/prod environments
- Full collaboration at the environment level - sharing environments with developers, product, QA, and others…
- Efficient resource usage, with multiple envs sharing the same underlying compute
- Significantly reduced mental overhead for developers, who no longer need to think about the runtime or the other services in it, and can focus just on the one they are developing

## Investment in dev infra is worth it

With adequate investment, any of the above solutions can be made to work well for your team. This investment can be hard to prioritize, but its effects on the entire R&D team’s efficiency and well-being are enormous. Quite often, even very simple things delay developers by an enormous amount of time, and as mentioned in Mika’s case, it is often senior devs who suffer the most from the loss of productivity.

It can be difficult to know how much time to invest in this area, what solutions to pursue, and often the bottleneck may not be engineering capacity but leadership attention. The solution that we have seen work in high-performing companies over and over again is to create a unit (depending on organization size this can be a person, team or more), with the express goal of improving the development velocity. They would have the throughput to do in-depth research on possible solutions, and follow through to implementation.

## Where can Raftt help?

At Raftt we have built a product that gives the optimal developer experience for complex containerized environments. Fast iterations for changes, full visibility into the environment, out of the box debugging, in-team and cross-functional (XFN) collaboration, and everything else you could hope would be present in your dev environment. We can be deployed as a stand-alone product, or integrate into your existing environment creation solution, supplying the full developer experience layer on top.

Since we can use existing IaC definitions, onboarding is very fast, and developers experience value immediately. And because we don’t touch the frontend of the developer environment - the IDE, browser, and other local tools - developers don’t have to change their workflows or get used to new tools.
