What is CI/CD? Benefits and Risks

It’s easy to find the definition of the concept on Google, but I wanted to understand how it works in practice.

I know that Amazon, for example, pushes code to production every few seconds. Can I also use CI/CD in a personal project? Roughly speaking, do the benefits outweigh the work it takes to implement?

  • From a quick look, this article may be useful: https://lhlima.wordpress.com/2015/08/04/differca-entre-ci-integracao-continua-e-cd-distribuicao-continua/

2 answers

Although the question makes clear there isn't much interest in the formal concepts of CI and CD, since those are easily found on Google, I thought it worthwhile to give a basic introduction to both to make the answer more complete. If you already know them, you can skip to the section "In practice".

CI

Continuous Integration proposes the following: the project has a central repository, and all developers push their code to it as often as possible, which in practice means at most every few hours. That is, you make a small change to your code and immediately push it to the repository. There must also be an automatic process triggered on every push to the central repository; this process checks the integrated code for errors, verifying that your changes work correctly together with everyone else's code. That way, problems in an implementation are found automatically and quickly: it only takes the time for you to push your code and for the checking process to run and report the result.
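As a rough sketch of that automatic checking process, the script below stands in for what a CI server runs on every push; `run_tests`, `ci_build` and the revision number are placeholders for illustration, not any real tool:

```shell
#!/usr/bin/env bash
# Minimal sketch of a per-push CI check; `run_tests` is a stand-in
# for a real suite (PHPUnit, Selenium, ...), not an actual tool.
set -euo pipefail

run_tests() {
  # A real server would check out the pushed revision and run the
  # project's full test suite here.
  true
}

ci_build() {
  local revision="$1"
  if run_tests; then
    echo "r$revision: integration OK"
  else
    echo "r$revision: integration FAILED"
    return 1
  fi
}

ci_build 1234
```

The point is only the shape of the loop: the push itself triggers the check, and the result comes back within minutes instead of at the end of the week.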

CD

Continuous Delivery means establishing a development model that allows rapid, constant deployments whenever needed. Instead of producing, say, one production-ready version of the software per week, you produce several of them (the interval is relative and varies with the context), which, as with CI, speeds up the detection of problems and consequently improves the reliability of the product. This model usually comes anchored by an automated test system managed by a continuous integration system (CD will hardly exist without CI).

In practice

When I started working as a dev a few years ago, one of the first things I implemented at the company I was at was a continuous integration system using Jenkins. As of this post I am still there, and we have used the system very successfully. The scheme works like this: all the company's projects live in separate Subversion repositories; most are in PHP, with unit test suites in PHPUnit and acceptance tests using Selenium. All devs push code to the repository at a rate of roughly one commit every 15 minutes (did something, commit). On every commit, Jenkins fires a process (a "build") that runs all unit and acceptance tests across the system that now contains the new code. When Jenkins finishes the build, it internally publishes several reports in various formats saying what happened, what's OK and what isn't. Through an integration with the company's chat, the result and links to those reports are posted for everyone to see, e.g.:

[Image: links to the reports of a Jenkins build]

Each of the blue links is the report of a tool that was run on the repository code. The "Codeception" link, for example, is a tool that aggregates the PHPUnit and Selenium tests; in this example build the result was the following:

[Image: report from the test tools]

As the screenshot shows, there were two test failures in this run, i.e., something pushed to the repository is broken and needs review.

With this continuous integration system there is much greater confidence in the current quality of the software. If the reports indicate a failure, there is probably something wrong (whether all-green reports mean the software has no problems depends a lot on the tools you use and on the quality of your tests, but that is another story). And since errors are spotted quickly, the time spent fixing them is much lower.
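To make the flow concrete, here is a stubbed sketch of such a per-commit build step; the tool functions, report file names, and the echoed summary line (standing in for the chat notification) are all illustrative, not the company's actual setup:

```shell
#!/usr/bin/env bash
# Stubbed per-commit build: each "tool" writes a report file, and the
# final echo stands in for the chat message with the report links.
set -euo pipefail

run_phpunit()  { echo "phpunit: 120 tests, 0 failures"; }     # stand-in for PHPUnit
run_selenium() { echo "selenium: 35 scenarios, 0 failures"; } # stand-in for Selenium

jenkins_build() {
  local commit="$1" status="OK"
  run_phpunit  > "report-phpunit-$commit.txt"  || status="FAILED"
  run_selenium > "report-selenium-$commit.txt" || status="FAILED"
  echo "build $commit: $status"
}

jenkins_build abc123
```

The real value is in the reports each tool publishes; the summary only tells everyone whether it is worth clicking through.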

This confidence in the current quality of the software generated by the CI system is also what makes Continuous Delivery possible at the company. A mixture of bash and Ansible scripts was set up so that, very briefly, putting a new version of one of the systems into production is simply a matter of running a command like bash deploy-script.bash [project name], which makes everything quite easy (except when one of those scripts breaks, heh) and allows several deploys per day when needed.
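The answer only names the command, so the wrapper below is a hedged guess at its shape; the packaging step and the commented-out Ansible call are assumptions, not the real script:

```shell
#!/usr/bin/env bash
# Hypothetical shape of `deploy-script.bash <project>`; the real script is
# not shown in the answer, so every step here is an assumption.
set -euo pipefail

deploy() {
  local project="${1:?usage: deploy-script.bash <project>}"
  echo "packaging $project"
  # ansible-playbook "deploy-$project.yml"   # the actual rollout would go here
  echo "$project is live"
}

deploy "${1:-example-project}"
```

The design point is that the whole rollout hides behind one command, so any dev can deploy without knowing the Ansible details.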

Worthwhile?

The benefits, I hope, are already clear: constant feedback on the state of the software, speed in detecting bugs, increased product reliability, faster deployment of new versions to production, etc. Obviously this does not come for free: CI and CD require a certain amount of work, both learning and implementation, plus the time spent on maintenance. The "risk" of adopting these practices is the time that will be spent building, configuring and maintaining the necessary tools and code.

Whether it is worth it, and whether it will repay the time spent assembling the architecture, is something that can only be answered with a good understanding of what is being done and by whom. That is, it depends ;(. But as a general rule I would say that if you can implement a CI and CD system, do it; it will probably save you a good deal of time and headaches in the medium/long term.

  • Hello, @Brunorb! After more than a year, do you have something to add to or improve in your answer? I am very interested in hearing more from you on this topic. Thank you!

  • @Laflame I have some new experiences to share. Weekdays are tricky, so I'll probably update the answer by Saturday.


I can tell you a bit about my experience at my workplace. Summary up front: we are working to establish reliable CI first, so that we can then have CD and deliver value to the customer as soon as possible.

History of our development cycle

In 2013, where I currently work, we used a single second-generation VCS (SVN itself; for the nomenclature, see Eric Sink) to store Java sources. We had two products that, in theory, should perform the same set of functions: one project for mobile devices (TotalCross) and another for the management portal (GWT). Combined with the habits of that VCS generation and the company's culture, this meant that code changes for new work (feature additions, experiments, etc.) and maintenance (bug fixes and performance improvements) were done in the same branch. At the end of the day, that meant that:

  1. the developer worked on the code at v1;
  2. when there was a conflict, whoever delivered their work last won, overwriting the code of anyone who had changed it concurrently;
  3. in the end, what was actually committed was the code at v2 (after resolving the conflicts);
  4. a build of v3 was made for testing;
  5. the tester found a flaw in v3, but the developer who picked up that flaw was already on v5 (!!!!);
  6. the fix was made and delivered in v6;
  7. a test build of v7 was made;
  8. the tester approved v7, so a build of v8 was made for production;
  9. v8 showed obvious problems in production that v7 did not;
  10. sleepless nights were spent on the code at v12 to produce a final v13;
  11. v13 fixed all of v8's problems, but then a customer detected a regression relative to what he had been using;
  12. someone grabbed v17 and tried a quick fix, producing v18;
  13. the tester picked up the test build of v21 and pointed out other regression problems;
  14. you took the code at v30 and fixed those problems, generating trial version v31;
  15. the tester approved the build of v32;
  16. customers approved the build of v37;
  17. the build of v42 went to production (!!!).

Yes, that was the size of the trauma of releasing versions... In the end, a release cycle for new features took 6 to 8 months, after which a mysterious box arrived that the customer tried to use.

This began to change in September 2014, when the following changes took place:

  1. development of projects;
  2. switching to a third-generation VCS (git);
  3. adoption of a variation of Gitflow;
  4. all code only enters master or develop after review (via merge request);
  5. visual management through a web portal (GitLab CE).

This avoided maintenance chaos in production, now allowing developers to actually sleep at night. What the tester approves now is the same build that the client approves and uses in production. Also, thanks to the staging environment, errors were caught in a much safer sandbox. This paradigm shift (from a single codebase to stable code on master and unstable code on develop) improved our code quality and the stability of new versions, but did not help release speed.

Even with constant code reviews, it still happened that the build broke after one merge request or another was accepted; the causes were really random, but anyway... So the first step toward CI was taken: every creation of a merge request triggers an automatic build of the merge result. This caught many problems early, saving the time of the reviewer and of the poor soul who would have had to fix it.

At the end of 2014, we noticed that most of the problems found in the portal code were identical to those in the mobile code; so, in May 2015, we unified the code bases, creating the core of the system, which needed two substrates: the UI and database access. The least-effort path to this unification was to inject the database dependencies into the core, while the UI called the core to do the right thing. After a stabilization period due to this architecture change (sometime between July and October 2015), we were able to release versions every 3 months. Much of the portal's technical debt (features not implemented and bugs already fixed on mobile) was resolved with this, so the focus then became new features and performance.

Even so, we were below our speed goal: one final, closed and tested version per week (not yet CD, but almost there). The reason for our slowness? Regression. What did we need to prevent regression? Tests. However, hiring testers to do silly, repetitive things for every released version is a huge waste of money. The workable solution? Automated testing.

Automated testing began in August 2015, but creating these tests only gained traction in July 2016. In October 2016, we established that a merge request would only be accepted if it carried automated tests. This served several purposes:

  • if the merge request was a hotfix, the test was proof that the production code was wrong;
  • it ensures that a new feature behaves well in some expected/common scenarios;
  • it prevents regressions from even existing.
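One way such a gate might be enforced in the CI job is sketched below; the `tests/` path convention and the function name are my assumptions, not the team's actual policy check:

```shell
#!/usr/bin/env bash
# Hypothetical merge-request gate: reject the MR if its diff touches no
# test file. The tests/ directory layout is an assumption.
set -euo pipefail

require_tests() {
  local changed_files="$1"   # newline-separated, like `git diff --name-only` output
  if printf '%s\n' "$changed_files" | grep -q '^tests/'; then
    echo "MR accepted: automated tests present"
  else
    echo "MR rejected: no automated tests"
    return 1
  fi
}

require_tests $'src/Login.php\ntests/LoginTest.php'
```

In practice a check like this runs as one more step of the merge-request build, so a missing test fails the build just like a failing test would.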

From the moment it gained traction (July 2016) until the end of March this year (2017), the automated tests did not speed up releases, but they gave the builds more credibility. Since April 10, though, we have been at a pace of one version every two weeks, with increasing stability in each release.

Conclusion

Well, we are making the development of our software (mobile, portal and core) increasingly smooth. When development becomes smooth enough, we will have achieved continuous integration. The goals of continuous integration, in our context, are two:

  1. deliver value in the software (from the customer's point of view) more quickly;
  2. ensure that what worked before continues to work.

With continuous integration in place, we can move on to the next stage: continuously delivering value to the customer.

We are not racing to adopt CD simply because CD is beautiful or fashionable. Imagine having continuous delivery with the release flow described at the beginning! Continuous delivery should target the customer and should be done responsibly. We don't want to lose customers because each bi-weekly release breaks everything that came before; I admit we sometimes annoy customers when we change the look or make navigation more fluid/intuitive, because they were used to the old scheme, but at least we haven't lost them =)

The greatest risk CD can pose is failing at the second objective described above: ensuring that what worked before continues to work. CD done irresponsibly does not guarantee this stability.

The risk posed by CI is far lower than that posed by CD, but it is still significant compared with slower integration. Basically, irresponsible CI can undermine the first objective described above, delivering value (from the customer's point of view) more quickly, because the unstable develop branch becomes more unstable than desired, increasing the lead time between starting development of a new feature and actually delivering it.
