Continuous Integration
Building software is complex, and an apparently simple, self-contained change to a single file can have unintended side effects on the overall system. When a large number of developers work on related systems, coordinating code updates is a hard problem, and changes from different developers can be incompatible.
Continuous Integration (CI) is a development practice that requires developers to integrate code into a shared repository several times a day. Each check-in is then verified by an automated build, allowing teams to detect problems early.
By integrating regularly, we can detect errors quickly, and locate them more easily.
Remember that if something takes a lot of time and energy, you should do it more often, forcing you to make it less painful. By creating rapid feedback loops and ensuring that developers work in small batches, CI enables teams to produce high-quality software, reduce the cost of ongoing software development and maintenance, and increase the productivity of the teams.
How to Implement Continuous Integration
The first requirement for CI is a version control system in which developers integrate all their work into the main version of the codebase (known as the trunk, main, or mainline) on a regular basis. Teams where developers merge their work into the trunk daily perform better. A set of automated tests should run both before and after the merge in order to validate that the changes don't introduce regression bugs. If these automated tests fail, the team stops what they are doing to fix the problem immediately.
CI ensures that the software is always in a working state, and that developer branches don't diverge significantly from the trunk. The benefits of CI are significant: it leads to higher deployment frequency, more stable systems, and higher quality software.
The key elements in successfully implementing continuous integration are:
Each commit should trigger a build of the software.
Each commit should trigger a series of automated tests that provide feedback in a few minutes.
To implement these elements, you need the following:
An automated build process. The first step in CI is having an automated script that creates packages that can be deployed to any environment. The packages created by the CI build should be authoritative and used by all downstream processes. These builds should be numbered and repeatable. You should run your build process successfully at least once a day.
A suite of automated tests. If you don't have any, start by writing a handful of unit and acceptance tests that cover the high-value functionality of your system. Make sure that the tests are reliable. That way, when they fail, you know there's a real problem, and when they pass, you're confident there are no serious problems with the system. Then ensure that all new functionality is covered by tests. Those tests should run quickly, to give developers feedback as soon as possible. Your tests should run successfully at least once a day. Ultimately, if you have performance and acceptance tests, the developers should get feedback from them daily.
A CI system that runs the build and automated tests on every check-in. The system should also make the status visible to the team. You can have some fun with this: for example, you can use klaxons or traffic lights to indicate when the build is broken. Don't rely on email notifications; many people ignore them or create a filter that hides them. Notifications in a chat system are a better and more popular way of achieving this.
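The three ingredients above can be pictured with a small sketch. It is not a real CI server: the names `run_pipeline`, `BuildResult`, and `status_message` are hypothetical, and each step is stood in for by a plain function that returns True on success. It only illustrates the flow of a numbered build that runs its steps in order, stops at the first failure, and produces a status line the whole team can see.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple

# A step is a name plus a function that returns True on success.
# In a real setup these would invoke the build script and the test suite.
Step = Tuple[str, Callable[[], bool]]


@dataclass
class BuildResult:
    commit: str
    build_number: int
    passed: bool
    failed_step: str = ""


def run_pipeline(commit: str, build_number: int, steps: List[Step]) -> BuildResult:
    """Run the steps in order and stop at the first one that fails."""
    for name, step in steps:
        if not step():
            return BuildResult(commit, build_number, False, name)
    return BuildResult(commit, build_number, True)


def status_message(result: BuildResult) -> str:
    """Format a status line for a team-visible channel such as chat."""
    if result.passed:
        return f"build #{result.build_number} ({result.commit}): PASSED"
    return (
        f"build #{result.build_number} ({result.commit}): "
        f"BROKEN at '{result.failed_step}'"
    )


# Example: the build succeeds but the unit tests fail.
steps = [("build", lambda: True), ("unit tests", lambda: False)]
print(status_message(run_pipeline("abc123", 42, steps)))
# prints: build #42 (abc123): BROKEN at 'unit tests'
```

Stopping at the first failing step keeps feedback fast, and carrying the build number in the result reflects the requirement that builds be numbered and repeatable.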
Continuous integration, as defined by Kent Beck and the Extreme Programming community where the term originated, includes two further practices, both of which are also predictive of higher software delivery performance:
The practice of trunk-based development in which developers work off trunk/mainline in small batches. They merge their work into a shared trunk/mainline at least daily, rather than working on long-lived feature branches.
An agreement that when the build breaks, fixing it should take priority over any other work.
CI requires automated unit tests. These tests should be comprehensive enough to give you confidence that the software works as expected. The tests must also run in a few minutes or less. If the automated unit tests take longer to run, developers won't want to run them frequently. If the tests are run infrequently, then a test failure can originate from many different changes, making it hard to debug. Tests that are run infrequently are hard to maintain.
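The debugging cost of infrequent test runs can be made concrete with a small sketch. The function `suspect_commits` and the commit names are hypothetical; the point is only that every commit made since the last passing run is a possible cause of a failure.

```python
from typing import List


def suspect_commits(commits: List[str], last_green: str, first_red: str) -> List[str]:
    """Return the commits that could have caused a failure: everything
    after the last passing run, up to and including the first failing one."""
    start = commits.index(last_green) + 1
    end = commits.index(first_red) + 1
    return commits[start:end]


history = ["c1", "c2", "c3", "c4", "c5", "c6"]

# Tests run on every commit: only one change needs to be examined.
print(suspect_commits(history, "c5", "c6"))  # ['c6']

# Tests last ran at c1: every commit since then is a suspect.
print(suspect_commits(history, "c1", "c6"))  # ['c2', 'c3', 'c4', 'c5', 'c6']
```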
Creating maintainable suites of automated unit tests is complex. A good way to solve this problem is to practice test-driven development (TDD), in which developers write automated tests that initially fail, before they implement the code that makes the tests pass. TDD has several benefits, one of which is that it ensures developers write code that's modular and easy to test, which reduces the maintenance cost of the resulting automated test suites. Many organizations don't have maintainable suites of automated unit tests and, despite that, still don't practice TDD.
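As a minimal illustration of the TDD cycle, the sketch below uses Python's standard `unittest` module. The function `slugify` and its expected behavior are invented for the example. The tests are written first and would fail while `slugify` does not exist; the implementation is then the simplest code that makes them pass.

```python
import unittest


# Step 1 (red): write the tests first. Before slugify exists,
# running them fails with a NameError.
class SlugifyTest(unittest.TestCase):
    def test_lowercases_and_joins_words_with_hyphens(self):
        self.assertEqual(slugify("Continuous Integration"), "continuous-integration")

    def test_collapses_extra_whitespace(self):
        self.assertEqual(slugify("  fast   feedback "), "fast-feedback")


# Step 2 (green): write the simplest implementation that passes.
def slugify(title: str) -> str:
    return "-".join(title.lower().split())


# Step 3: run the suite; refactor only while it stays green.
suite = unittest.defaultTestLoader.loadTestsFromTestCase(SlugifyTest)
result = unittest.TextTestRunner(verbosity=0).run(suite)
```

Because the function was written to satisfy tests, it is small, takes plain input, and returns plain output, which is the kind of modular, easily tested code the paragraph above describes.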
How CI helps us
Because we’re integrating so frequently, there is significantly less back-tracking to discover where things went wrong, so we can spend more time building features. Continuous Integration is cheap. Not integrating continuously is expensive. If we don’t follow a continuous approach, we’ll have longer periods between integrations. The longer those periods, the more changes accumulate, and the harder it becomes to find and fix problems. Such integration problems can easily knock a project off-schedule, or cause it to fail altogether.
Continuous Integration brings multiple benefits:
Say goodbye to long and tense integrations
Increase visibility enabling greater communication
Catch issues early and nip them in the bud
Spend less time debugging and more time adding features
Build a solid foundation
Stop waiting to find out if your code’s going to work
Reduce integration problems allowing you to deliver software more rapidly
Common pitfalls in CI
Some common pitfalls that prevent wide adoption of CI include the following:
Not putting everything into the code repository. Everything that's needed to build and configure the application and the system should be in your repository. This might seem outside of the scope of CI, but it's an important foundation.
Not automating the build process. Manual steps create opportunities for mistakes and leave steps undocumented.
Not triggering quick tests on every change. Full end-to-end tests are necessary, but quick tests (usually unit tests) are also important in order to enable fast feedback.
Not fixing broken builds right away. A key goal of CI is having a stable build from which everyone can develop. If the build can't be fixed in a few minutes, the change that caused the build to break should be identified and reverted.
Having tests that take too long to run. The tests should not take more than a few minutes to run, with an upper limit of about 10 minutes according to DORA's research. If your build takes longer than this, you should improve the efficiency of your tests, add more compute resources so you can run them in parallel, or split out longer-running tests into a separate build using the deployment pipeline pattern.
Not merging into trunk often enough. Many organizations have automated tests and builds, but don't enforce a daily merge into trunk. This leads to long-lived branches that are much harder to integrate, and to long feedback loops for the developers.
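One of the remedies listed above, splitting longer-running tests into a separate build, can be sketched as follows. The function `split_stages`, the suite names, and the durations are all hypothetical. The 10-minute budget comes from the limit mentioned above, and the shortest-first greedy rule is just one simple way to fill the fast stage.

```python
from typing import Dict, List, Tuple

# Upper limit of about 10 minutes for the fast feedback stage.
FAST_STAGE_BUDGET_SECONDS = 10 * 60


def split_stages(
    durations: Dict[str, float],
    budget: float = FAST_STAGE_BUDGET_SECONDS,
) -> Tuple[List[str], List[str]]:
    """Fill the fast stage with the shortest suites until the budget is
    used up; everything else moves to a later pipeline stage."""
    fast: List[str] = []
    later: List[str] = []
    used = 0.0
    for name, seconds in sorted(durations.items(), key=lambda item: item[1]):
        if used + seconds <= budget:
            fast.append(name)
            used += seconds
        else:
            later.append(name)
    return fast, later


# Measured suite durations in seconds (illustrative values).
durations = {
    "unit": 90.0,
    "api_contract": 240.0,
    "end_to_end": 1500.0,
    "performance": 2400.0,
}
fast, later = split_stages(durations)
print(fast)   # ['unit', 'api_contract']
print(later)  # ['end_to_end', 'performance']
```

The fast stage runs on every change; the later stage still runs, but it no longer delays the first feedback a developer receives.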
Ways to measure CI
The CI concepts discussed earlier outline ways to measure the effectiveness of CI in your systems and development environment, as shown in the following table. Gathering these metrics allows you to optimize your processes and tooling for them. This leads to better CI practices and to shorter feedback loops for your developers.
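As a rough illustration of gathering such metrics, the sketch below computes three figures from a list of build records. The record fields and metric names are assumptions chosen to mirror the practices described earlier (automatic builds on every commit, passing builds, and fast feedback); they are not an official list.

```python
from dataclasses import dataclass
from statistics import median
from typing import Dict, List


@dataclass
class BuildRecord:
    commit: str
    triggered_automatically: bool  # built without manual intervention
    passed: bool                   # build and tests succeeded
    feedback_minutes: float        # time from check-in to test result


def ci_metrics(records: List[BuildRecord]) -> Dict[str, float]:
    """Summarize a set of build records into a few CI health figures."""
    if not records:
        raise ValueError("need at least one build record")
    total = len(records)
    return {
        "auto_build_rate": sum(r.triggered_automatically for r in records) / total,
        "pass_rate": sum(r.passed for r in records) / total,
        "median_feedback_minutes": median(r.feedback_minutes for r in records),
    }


records = [
    BuildRecord("c1", True, True, 4.0),
    BuildRecord("c2", True, False, 6.0),
    BuildRecord("c3", True, True, 5.0),
    BuildRecord("c4", False, True, 12.0),
]
print(ci_metrics(records))
# {'auto_build_rate': 0.75, 'pass_rate': 0.75, 'median_feedback_minutes': 5.5}
```

Tracking the median rather than the mean keeps a single unusually slow build from hiding the feedback time developers typically experience.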