Chapter 11
Environments
Before reaching production, a feature goes through development and testing stages.
Each of these stages takes place in a dedicated environment.
An "environment" refers to a replica of the infrastructure that runs the software and its required tools.
This includes not only running the code but also managing the database, cache, and other dependencies.
Environments are not an exact science. Each company adapts its development cycle and environments to fit its needs.
Let’s take a look at the most common ones.
Production
The production environment, or "prod" for short, is the final environment where the software is deployed for users.
It’s the ultimate goal of the development cycle, where everything must work reliably, securely, and efficiently.
Monitoring tools are connected to ensure constant oversight of performance.
For example, if sales drop to zero for five minutes, an SMS is immediately sent to the on-call team.
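To make the alert rule concrete, here is a minimal sketch in Python. The names `should_alert`, `check_sales`, and `send_sms` are hypothetical and chosen only for illustration; a real monitoring tool would typically express this rule in its own configuration rather than in application code.

```python
from datetime import datetime, timedelta

# Assumption for this sketch: the five-minute threshold from the example above.
ALERT_WINDOW = timedelta(minutes=5)


def should_alert(last_sale_at: datetime, now: datetime,
                 window: timedelta = ALERT_WINDOW) -> bool:
    """Return True when no sale has been recorded for at least `window`."""
    return now - last_sale_at >= window


def check_sales(last_sale_at: datetime, now: datetime, send_sms) -> bool:
    """Call `send_sms` (a hypothetical callback) when the rule fires."""
    if should_alert(last_sale_at, now):
        send_sms("No sales recorded in the last 5 minutes")
        return True
    return False
```

Passing `send_sms` as a parameter keeps the rule itself independent of any particular SMS provider.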
The act of deploying code or dependencies to this environment is known as "going live" or "release to production".
To facilitate their workflow, developers, especially when working in teams, prefer to break everything down into small tasks.
Whenever possible, this breakdown is functional: they iterate on a feature to deliver more value to the customer with each step.
For certain topics, a purely technical breakdown is preferred: they deliver the puzzle pieces one by one and assemble everything at the end.
This happens particularly with projects that are hard to iterate on. For example, you can’t offer customers half of a payment method: before launch,
you need to handle refunds, financial reports, and customer service tools.
Ideally, these small pieces are deployed to production as quickly as possible. Incremental releases make production deployment less risky:
- Fewer things to test and monitor.
- Issues are identified and localized more quickly.
- Rollback is easier.
These small pieces are the "commits" mentioned in the previous chapter. Each commit doesn’t necessarily deliver a direct improvement for the customer,
which often leads to misunderstandings between developers and others about the term "deploy to production."
When a developer says their code is in production, it simply means the code is physically present in the production environment’s infrastructure.
It may not yet be accessible to customers.
In their habit of breaking tasks down for large projects, developers often separate the deployment to production from the final activation of the feature.
To do this, two common techniques are used:
- 99% of the code is in production but unused. The goal is to test that there is no regression. The final deployment activates the entry point to the feature.
  This approach is mainly used for purely technical breakdowns, when there’s less confidence, and progress is made cautiously.
  Despite all the testing, developers still cross their fingers during the deployment of impactful code. Unexpected user behavior, underestimated load, subtle differences in the production environment… As we’ll see later, developers do everything they can to avoid these risks!
- Using a feature toggle or feature switch.
  100% of the code is deployed, but an internal toggle controls whether the feature is active. The toggle configuration is often external to the code itself, allowing developers to enable or disable a feature without a production release.
  Feature switches are also useful for disabling non-essential, resource-heavy features in case of system overload. For instance, when I worked at Deezer, we would disable all non-music-related features on New Year’s Eve to ensure the site remained functional during peak usage.
Advanced feature toggle systems offer additional options:
- Restricting access to internal users only, such as via company IP addresses or email domains.
  This allows for "hidden production," where new features are live in production but only accessible to internal teams while clients continue to use the old version.
- Gradual rollout to a small percentage of users. If everything works for them, the rollout continues; otherwise, it’s rolled back.
  This is called a "canary" deployment, referencing canaries used in mines to detect gas levels. The canaries go in first to identify risks before miners proceed.
- And more...
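For the curious, a feature toggle with these options can be sketched in a few lines of Python. Everything here is an assumption made for illustration: the `Toggle` fields, the feature names, the `example.com` domain, and the hashing scheme. Real toggle systems differ, and their configuration usually lives outside the code.

```python
import hashlib
from dataclasses import dataclass


@dataclass(frozen=True)
class Toggle:
    enabled: bool = False          # master switch for the feature
    internal_domains: tuple = ()   # email domains that always see the feature
    rollout_percent: int = 0       # share of other users, from 0 to 100


# Hypothetical configuration, hard-coded here to keep the sketch self-contained.
TOGGLES = {
    "new_checkout": Toggle(enabled=True,
                           internal_domains=("example.com",),
                           rollout_percent=10),
    "heavy_recommendations": Toggle(enabled=False),
}


def bucket(feature: str, user_id: str) -> int:
    """Map a user to a stable bucket in 0..99, so the answer never flips between requests."""
    digest = hashlib.sha256(f"{feature}:{user_id}".encode()).hexdigest()
    return int(digest, 16) % 100


def is_enabled(feature: str, user_id: str, email: str = "") -> bool:
    toggle = TOGGLES.get(feature)
    if toggle is None or not toggle.enabled:
        return False  # unknown or switched-off features stay hidden
    domain = email.rpartition("@")[2].lower()
    if domain and domain in toggle.internal_domains:
        return True   # "hidden production": internal users always get it
    # Canary: only users whose bucket falls under the rollout percentage.
    return bucket(feature, user_id) < toggle.rollout_percent
```

Raising `rollout_percent` step by step widens the canary group, and setting `enabled` to `False` turns the feature off for everyone without a new deployment.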
Preproduction
The preproduction environment is where the final tests take place before deployment to production.
In theory, it’s an exact clone of the production environment: the architecture, technologies, and configurations are identical.
This allows for realistic testing to ensure all components fit together seamlessly.
Every line of code goes through this stage before being deployed. When a new feature is added to preproduction, it means preproduction is ahead of production. This gap should be reduced as soon as possible to free up the preproduction environment for urgent bug fixes (hotfixes).
Development
The "dev" environment is used by developers to write their code. It’s usually a replica of the production environment but adapted to run on developers’ local machines.
While coding, developers have access to a local database, cache system, etc. This is why developers need powerful machines with plenty of memory.
The development environment doesn’t run exactly the same code as production or preproduction:
- Configurations are different, such as cache durations or explicit error displays.
- Debugging tools, testing utilities, and other development aids are integrated.
- The infrastructure isn’t a perfect replica. For instance, if production runs in the cloud, development may use a simplified imitation.
- It is connected to partners' test environments, also called sandboxes, which may behave differently from production.
These differences partly explain why developers sometimes struggle with bugs: the infamous "it works on my machine."
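As an illustration of the configuration differences listed above, here is a minimal Python sketch. The setting names, the `APP_ENV` variable, and the partner URLs are all hypothetical; each project organizes its configuration in its own way.

```python
import os
from dataclasses import dataclass


@dataclass(frozen=True)
class Settings:
    cache_ttl_seconds: int     # how long cached data is kept
    show_error_details: bool   # explicit error pages, useful only to developers
    payment_api_url: str       # partner sandbox in dev, real API in prod


# Hypothetical values for two environments.
SETTINGS = {
    "dev": Settings(cache_ttl_seconds=0,
                    show_error_details=True,
                    payment_api_url="https://sandbox.partner.example/api"),
    "prod": Settings(cache_ttl_seconds=300,
                     show_error_details=False,
                     payment_api_url="https://partner.example/api"),
}


def load_settings(env_name=None) -> Settings:
    """Pick the settings for the given environment, or for APP_ENV (default: dev)."""
    name = env_name or os.environ.get("APP_ENV", "dev")
    return SETTINGS[name]
```

The same code runs everywhere; only the selected settings change, which is exactly where "it works on my machine" surprises can creep in.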
Local Test
The "test" environment is a variation of the development environment used to run automated tests.
Distinguishing the two allows for different tools and the use of test doubles for parts of the infrastructure.
It’s like in movies, where stand-ins replace actors for certain scenes. This enables reliable and fast testing.
For example, while development uses a real database, the test environment might use a fake one. This makes it easier to simulate scenarios for testing.
Another example: during automated testing, which happens dozens of times daily, you don’t want to send real emails. Instead, a fake email service is used.
Without diving into details, test doubles come in many forms, such as mocks, stubs, fakes, spies, and dummies.
These tools return fake data, simulate complete systems, or analyze interactions with underlying systems.
Loosely speaking, most developers use "mock" as a generic term for all test doubles.
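The fake email service mentioned above could look like the following Python sketch. The names `EmailSender`, `FakeEmailSender`, and `register_user` are hypothetical, and the example assumes the business code receives its email sender as a parameter instead of creating it itself.

```python
from typing import Protocol


class EmailSender(Protocol):
    """What the business code expects from any email service, real or fake."""

    def send(self, to: str, subject: str, body: str) -> None: ...


class FakeEmailSender:
    """Test double: records emails in memory instead of sending them."""

    def __init__(self) -> None:
        self.sent = []  # list of (to, subject, body) tuples

    def send(self, to: str, subject: str, body: str) -> None:
        self.sent.append((to, subject, body))


def register_user(email: str, sender: EmailSender) -> None:
    """Hypothetical business code: the sender is passed in, so tests can swap it."""
    if "@" not in email:
        raise ValueError("invalid email address")
    sender.send(email, "Welcome!", "Thanks for signing up.")
```

In a test, `register_user("alice@example.com", FakeEmailSender())` runs the real business logic, and the test then inspects the `sent` list to check what would have been emailed, without anything leaving the machine.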
Test doubles are critical in development.
When you hear complaints about "legacy code," unmaintainable and untestable code that doesn’t "follow best practices", it’s often because the code wasn’t designed to support test doubles.
They are the main reason why we implement robust architectures. When a developer talks about hexagonal architecture, dependency injection, etc., it is not just for the beauty of the code. These practices make code testable and functional in the test environment. Without them, manual testing becomes the only safeguard against bugs or regressions. Deployments become stressful, weird issues appear months later, and the project becomes frustrating to work on.
In most companies, the test environment is found in two places:
- On each developer’s machine. Depending on their workflow, developers might run tests dozens of times daily, especially if practicing Test-Driven Development (TDD).
- During continuous integration, as discussed in the previous chapter. Code must pass automated tests before being merged or deployed to the next environment.
UAT / Staging
Some companies create additional environments, primarily for manual testing. Others typically use preproduction for this purpose.
While the "test" environment is for automated testing, "User Acceptance Testing" (UAT) or staging environments are used for manual testing and consolidating work across teams.
For example, imagine developing a large project involving integration with a partner.
As we've seen, the code might be deployed to production gradually. However, before opening to all users, stakeholders (e.g., product owners) want to test the user experience, developers want to ensure all components work together, and the partner wants to verify proper request and response handling.
Since this testing and debugging process can take weeks, reserving preproduction for that long isn’t a valid option. Instead, a dedicated UAT environment is created for the project and opened to the partner.
To address feedback more quickly, UAT deployment is often scheduled around code reviews, either before or after, depending on whether you value reviewers’ or stakeholders’ time more 😇. In any case, this happens before merging into the master branch, which must remain a perfect replica of preproduction (see the previous chapter).
Versioning
All these environments share a common goal: to enable smooth feature deployment to production.
As we saw with progressive rollouts, even with all our efforts, we sometimes want early user feedback before a full launch.
Versioning makes this possible too.
In a progressive rollout, the final feature is released to real users. While they didn’t explicitly opt in, they experience the feature (and any bugs).
In contrast, creating a "version" means creating an entirely separate variant. Users can choose to adopt it or stick with their current version.
Of course, maintaining multiple versions can become time-consuming, and support will eventually end, forcing users to upgrade. However, users typically have a few years of stability.
Version numbers conventionally follow the Semantic Versioning (SemVer) format: major.minor.patch.
For example, 1.23.4 indicates major version 1, with 23 backward-compatible feature releases since 1.0.0, and 4 bug-fix patches since 1.23.0.
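Here is a minimal Python sketch of how such a version number can be parsed, compared, and incremented. The `Version` class is hypothetical and deliberately ignores maturity labels and other SemVer extras.

```python
from typing import NamedTuple


class Version(NamedTuple):
    major: int
    minor: int
    patch: int

    @classmethod
    def parse(cls, text: str) -> "Version":
        """Turn "1.23.4" into Version(1, 23, 4)."""
        major, minor, patch = (int(part) for part in text.split("."))
        return cls(major, minor, patch)

    def bump(self, part: str) -> "Version":
        """Return the next version; lower-level numbers reset to zero."""
        if part == "major":
            return Version(self.major + 1, 0, 0)
        if part == "minor":
            return Version(self.major, self.minor + 1, 0)
        if part == "patch":
            return Version(self.major, self.minor, self.patch + 1)
        raise ValueError(f"unknown part: {part}")

    def __str__(self) -> str:
        return f"{self.major}.{self.minor}.{self.patch}"
```

Because the three parts are stored as numbers, versions compare the way humans expect: `Version.parse("1.9.0") < Version.parse("1.10.0")` is true, whereas comparing the raw strings would give the opposite answer.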
A version can also have a maturity label, representing a feature’s development stage:
- Dev / Nightly build: the latest ongoing developments.
- Alpha: the first functional but unstable version, intended for internal testing.
- Beta: more stable but possibly with bugs or optimization issues, released to selected volunteers.
- Release Candidate: almost ready. Unless a major issue is found, this will become the public release.
- Stable: the version is available to everyone. Some stable versions are designated as Long Term Support (LTS), providing years of reliable use.
Users can voluntarily use unstable versions to access the latest features, in exchange for reporting any issues.
This system is commonly used for mobile apps, APIs, libraries, and more.