Strategies to prevent software regression


27

What programming strategies/best practices can be adopted to minimize the risk of software regression?

Questions that can help guide:

  • which pitfalls/bad practices can fool the programmer and let a bug or performance regression go unnoticed?
  • how should the workflow be managed to prevent this?
  • what programming strategies and what kinds of tests can facilitate this process?

The answers do not need to be detailed; they can be a kind of "checklist" of best practices, a guide that serves as orientation so that interested readers can go deeper into the subject from there. But, of course, the more complete, the better!

Context: I ask because I am taking part in the development of some more complex packages in R, and what before did not seem complicated to me is proving difficult to manage! PS: there is a similar question on the English-language SO, specifically about R, with a very interesting discussion. I first thought of asking the question specifically about R. However, since it seems to me that this can be treated as a question about programming in general, and the R community on SOPT is still small, for now I am leaving it like this.

  • 3

    Test and test again. Having unit tests and a well-documented public API is key. It is also worth using tools to check whether the tests cover the whole code (coverage tools). Repeating these tests in various environments is a good idea too (something like Travis-CI can help).

  • 2

    That is the point, @Guilhermebernal: good practices are interesting, but nothing, absolutely nothing, guarantees the absence of problems. No programmer, however good, can hold the whole structure of a program in mind.

  • 2

    @utluiz Indeed. The bigger the project, the more regressions you will find later, no matter how careful you are. Hence it is useful to have a method for updating the system post-release. Another thing that helps is peer review: having other people working on and reviewing your code.

  • Okay, nothing will guarantee the absence of problems. But surely there is a minimum core of practices/knowledge, factually grounded, that guides the process so as to minimize the problems, no?

  • @Guilhermebernal maybe I phrased the question badly, but I believe it is not too broad; you have already started to touch on the right points. Can't we edit it to make it better!?

  • @carloscinelli I did not find it that broad (but maybe you should narrow the scope a little more and state whether or not you are talking about R). Give an example of a real case and ask what could have been done to avoid it, for instance. (Not to extend the discussion here; look for me in chat if you want.)

  • 1

    Reducing the coupling among the various components also helps a lot: in general, new requirements interfere little with old ones, so if the components are decoupled, those that have not been directly changed should not change behavior. In addition, maintain unit tests, as already mentioned, and if possible create regression tests where applicable (it is important that, before any release, a full system test is carried out, even on the parts that have not been modified).

  • 2

    P.S. I did not find the question too broad either. It is clear to me that the mention of R refers to the context, not to the question itself, and although a complete answer would indeed be very extensive, nothing prevents a proposed "checklist" to guide the task of preventing such defects (I am quite interested in such an answer too; despite my several years developing, I feel I still know little about regression). I will vote to reopen the question.

  • @mgibsonbr thanks for resurrecting the question; I edited it to see if it goes through now!

  • @Guilhermebernal will you help reopen it?

  • I added more information about coordination/management.


3 answers

14

The term "software regression" comes from English "regression software", means "return of bug in the software" or "re-emergence of failure". In day-to-day we also use the term "side effect" (side-Effect): we applied good doses of correction, but the sick software still suffers, precisely from some harm caused by the correction.

If we adopt the notion that software "advances" with bug fixes, then we can say that software "regresses" with the introduction of side-effect bugs, which gives us a more precise sense of "regression".

Complex software is not so different from the human body, and programmers are no different from doctors: even with years of training and experience, they will reduce side effects but never fully escape them. It is a "systemic effect"; it is in the nature of complex systems... Like a Rubik's Cube, situations arise where we try to fix one side but compromise another.

Are these like hidden demons that will never give the perfectionist programmer peace of mind? The first tip for the unsuspecting reader is this: only worry about the subject in complex systems, that is, in situations where you have lost the detailed view of the whole.

By the way, this tip points to an approach that comes before the problem... Maybe the problem need not even exist (!), if we are able to modularize the program.

Modularization

Before suspecting systemic problems, best practice suggests organizing the system into modules with as much decoupling as possible. It is the "divide and conquer" heuristic. A system of well-isolated, decoupled modules will not be complex if none of those modules is complex.

Answer

Despite being "of the nature" of the complex system, there are two fundamental approaches for the unconverted to deal with the fact (investing in one is usually sufficient):

  1. Isolate the homologation version and use it in production (see the "Concepts" section below). Always distrust the trusted: the "stable version" requires quarantine; users should run the new version in production in a separate environment ("homologation"), or at least be on notice that it is the "new version" (with the old version standing by, ready to be restored). Homologation then means not just "the customer approves it", but "the customer uses it for a while and only then says whether they approve it or not"... All this because, in most work environments, software testing is not taken very seriously; the software is only "really put to the test" when it is in production.
    Example: in a web application, offer a group of more experienced users a separate address for the homologation environment (already running against the production database).

  2. Simulate the environment of use, where asserts can be performed massively and automatically: any "stable software" can be monitored and have its inputs and outputs recorded as a "memorial of its good behavior".
    In theory, at least: the more sophisticated the user interface, the harder it is to monitor. The POST and GET log of a webservice, for example, can be stored. Only the items of that log which can be considered "good examples of how it should work" should be checked and kept. I have done this by building lots of XML files and then using them to simulate the use of the webservice, as in the sketch below. It is a lot of work, but it is an almost perfect solution (!).
    PS: there is little point in talking about coverage tools or non-regression testing before having such a log. Statistics over asserts are the methodological basis of any simulation approach.
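A minimal sketch of this record-and-replay idea in R: each recorded, approved input is re-run against the new version and the result is compared with the stored output. The log layout and the `handle_request()` function are assumptions, standing in for whatever actually serves the requests.

```r
# Replay recorded, approved input/output pairs against a new version.
# Each .rds file holds list(input = ..., output = ...) captured from the
# stable version; 'handle_request' is the (hypothetical) function under test.
replay_log <- function(log_dir, handle_request) {
  cases <- list.files(log_dir, pattern = "\\.rds$", full.names = TRUE)
  failures <- character(0)
  for (case_file in cases) {
    case <- readRDS(case_file)
    new_output <- handle_request(case$input)
    if (!isTRUE(all.equal(new_output, case$output))) {
      failures <- c(failures, case_file)   # behavior changed: a regression
    }
  }
  if (length(failures) > 0)
    stop("Regressions detected in: ", paste(failures, collapse = ", "))
  invisible(TRUE)
}
```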


Notes

Of course, and equally important, since these practices require investment, be sure to also invest in good documentation, demonstrations and team support. As @Guilhermebernal reminded us, there is also the practice of peer review, which likewise raises the reliability of critical algorithms.
PS: in the particular case of the R language (which seems to be @Carlos's challenge), which is strongly oriented toward mathematics and allows use of the functional programming paradigm, it is worth investing in the "mathematical proof" of each critical algorithm. Proving algorithms correct requires systematic testing... In high-reliability contexts (military, aeronautical and banking applications) proof is more important than testing.
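Short of a full formal proof, one systematic step in that direction is to check an algorithm's mathematical invariants over many random inputs. A minimal sketch in base R, with matrix inversion standing in for the critical algorithm:

```r
# Check a mathematical invariant over many random inputs: for an
# invertible matrix A, solve(A) %*% A must be numerically close to identity.
set.seed(42)                                     # reproducible test cases
for (i in 1:500) {
  n <- sample(2:10, 1)
  A <- matrix(rnorm(n * n), n, n) + n * diag(n)  # diagonally dominant, so invertible
  stopifnot(isTRUE(all.equal(solve(A) %*% A, diag(n))))
}
```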


(Human and managerial side)

Ah, and although it is obvious, remember, especially when the client or the boss is pressing on deadlines: if you need homologation, never depend on something that has not yet been homologated. It is important to press those responsible for the homologation (or for building the test logs) so that it actually gets done. It is important to share (if possible in the contract!) your responsibility with those who test. A stable building is built on stable foundations. "Regressions" are common when the programmer has no voice in the team, or when the tests are a "mere formality".

User side: some "side effects" arise from failing to warn users that "something has changed" and that the change requires them to change the way they work as well. In this case it is not a programmer failure, but a failure to update the software's operation manual, or to communicate with users.

Psychological side: it is common to neglect cases of "acceptable side effect", when the side effect is rare or its "harm" is no worse than the software before the correction. We must not get complacent: document it and put it on the bug list; Murphy's Law says it will resurface (a regression of the regression!) and cause worse damage if it is not fixed.


Concepts

(Included after noticing the terminological disparity in the discussion.) A brief dictionary of the terminology used, and a personal view of the context.

Types of failure (of interest to this scope): software failures and requirements failures. The formulation, analysis and documentation of requirements are part of the development process of any software, and result in what we generically and vaguely call "requirements". If the requirements are flawed, they will give rise to flawed software. If the requirements are reasonable, we can then speak of software failures, also called bugs.

Bug tracking and new requirements: tools like Bugzilla, or community interfaces such as GitHub's issue tracking, make it possible to accurately document and evaluate bugs and new requirements (requests for new features).

Bug fix: we sometimes use the term "correction" in a loose way, also covering the notion of "adding new functionality" (satisfying a new requirement). For practicality, I will indulge in this bad habit in this text.

Basis of reliability: I will assume (ignoring other theories for practicality) that the only two ways to make software more reliable are testing it once it is ready, and "demonstrating" (mathematically proving), against some black-box condition or some high-level formal description, that each step of the algorithm meets the requirements. That is to say, testing and proof are the only ways.

Version control: I will use the term "version" to designate "the software after a bug fix" only; let us ignore "fork versions". Version control is exercised through source code management software such as git.

Trial versions: let us call "alpha version" the one the development team is testing, and "beta version" the one tested by a select group of users (who just test it; they do not yet use it in production)...

Homologation vs. production: "production" is when the software has been accepted, is stable, and is in use by everyone. "Homologation" is a term commonly confused with others. In the jargon used here, what Debian calls testing releases I would call "homologation releases"; likewise in Debian, what has already been homologated and is "in production" is called the stable release. In many development environments there is no distinction between testing and homologation. One of the proposals of this answer is to make that distinction.

Regression testing

The purpose of the so-called "regression test" is to certify that a change in the software (a bug fix) does not introduce new failures or side effects. Strictly speaking, it is the same as performing several asserts (item 2 of the Answer section), but in practice it is carried out either by an appropriate application (simulating the user) or by the test team (real users). It has very much the profile of a black-box test.

Another important thing in this type of test is the mapping between modules and the functionalities exercised by the end user: test first, or more insistently, the modules most strongly coupled to the modified module.

When dealing with new features (not corrections), the asserts can become more complex, as there is no set of previously approved outputs to compare against. In this case, diff tools can help compare the "new" and "old" outputs.
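A minimal sketch of that old-versus-new comparison in R, assuming the approved output of the previous version was saved to an .rds file; `compute_report()` and `input_data` are hypothetical stand-ins for the real pipeline:

```r
set.seed(1)                                    # keep the example reproducible
compute_report <- function(d) summary(d$value) # hypothetical function under test
input_data     <- data.frame(value = rnorm(100))

# Approved output of the previous version, recorded once with saveRDS().
baseline_file <- "approved/report_v1.rds"
if (!file.exists(baseline_file)) {
  dir.create("approved", showWarnings = FALSE)
  saveRDS(compute_report(input_data), baseline_file)
}

old_output <- readRDS(baseline_file)
new_output <- compute_report(input_data)       # output of the new version

cmp <- all.equal(old_output, new_output)
if (!isTRUE(cmp)) {
  cat("Outputs differ between versions:\n")
  print(cmp)                                   # all.equal() describes each difference
}
```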

10

As already mentioned in the excellent existing answers, regression (understood simply as something that worked as expected ceasing to work) is a natural phenomenon in complex systems, to the point of being argued to be inevitable. In any case, experience indicates that there are some ways to decrease the risk of this undesirable effect.

1 - Minimize dependencies between software components

Software dependencies are usage relationships between "pieces" of a computational system, intended to reuse and/or organize concepts, solutions, or just code. These dependencies can take many forms: global variables, jumps (with the infamous goto), functions, classes, or any other structure available in the programming language used.

Such dependencies are necessary in the construction of any system, as they are what make the solution possible. But they are also the main cause of the complexity of changing existing software. For example, a function reused in numerous places has a high degree of dependency, so that any change made to it will assuredly affect its users (either by providing an improvement, if the change is made correctly, or by causing a regression otherwise).

Minimizing dependencies simply means decreasing the number of "parts" of the software that depend on others. This decrease is not arbitrary, however, but carried out following a rational approach. For example, it is often stated as "good practice" to avoid writing very long functions. The reason is that a very large function potentially does more than it should, and ends up being reused by many parts of the software. By performing more than one "function" (see how the name makes sense), it may need to be changed because of feature A and end up generating a regression in feature B, which depends on that same function.

The correct planning of these dependencies, both during the analysis phase and during the programming phase, can avoid the unnecessary "coupling" of functions that would make much more sense kept separate. This applies mainly to object-oriented languages, especially when inheritance is involved. A maxim that represents this ideal in the C++ language (a recollection of old projects, hehehe) is: "if, when making a modification, I did not need to touch this CPP file, then surely there will be no regression in it".

Incidentally, this practice also helps to more easily locate the likely problem sites when a regression does occur, and it is related to the other good practice that I mention in the following item. The sketch below illustrates the splitting idea.
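To make the function-splitting idea concrete in R (the function names are hypothetical): a single routine that both validates and formats is reused by two features, so a change to the formatting rules could regress a caller that only needs validation. Splitting it removes that risk.

```r
# Before: one function doing two jobs, reused by two unrelated features.
validate_and_format <- function(x) {
  stopifnot(is.numeric(x), !anyNA(x))   # validation (needed by feature A)
  format(round(x, 2), big.mark = ",")   # formatting (needed by feature B)
}

# After: two single-responsibility functions. A change in formatting can
# no longer cause a regression in callers that only validate.
validate_input <- function(x) {
  stopifnot(is.numeric(x), !anyNA(x))
  invisible(x)
}
format_output <- function(x) {
  format(round(x, 2), big.mark = ",")
}
```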

2 - Use effective development management

Organization in software development is not limited to organizing functions into code blocks. The development process itself takes time and requires care with things as small as backup copies of the source files. A developer working alone could keep these copies in zip files, but for a long time now there have been more appropriate tools.

Version control tools do much more than backup. They keep a per-file change history, allowing changes to be compared and, above all, previous versions to be restored when necessary. Beyond that, they allow development to be managed at higher levels, with the definition of version labels or baselines.

In a mature development process, one starts from a well-established version X.XX of the software (that is, functional and with well-known problems) and plans the changes for the next version. These changes include not only problem fixes but also improvements, both chosen according to project management criteria. Developers change the code to fix problems or implement improvements, and the expectation is that eventually a new version X.XY (or Y.YY) will be produced, including the fixes and improvements and no regressions.

The use of version control tools (more specifically, configuration management tools) that include a workflow allows managers to establish a baseline, individually marking the version of each source file as belonging to the system version being generated and delivered. Through these version labels, such tools can easily identify the files changed from one version to the next, enormously facilitating the identification of the components that necessarily need to be tested to confirm the fixes and improvements, and also of the dependent components that need to be tested to check whether a regression has occurred.

It should be easy to see how this item complements the previous one. Unnecessary dependencies not only open room for potential regressions, but also increase compilation time in large projects and hamper the management described above. Another good practice in languages like C++ and Java is to create a separate file for each class. This practice is linked to the same principle of separation, and in a way gives some assurance that a file that has not been changed will not regress from one version to another. The sketch below shows one way to list the changed files.
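As a minimal sketch of putting version labels to work, here is how one might list, from R, the files changed between two release tags in a git repository; the tag names are hypothetical:

```r
# List the source files changed between two release tags; these are the
# components whose features (and their dependents) most need retesting.
changed_files <- system2(
  "git",
  c("diff", "--name-only", "v1.0.0", "v1.1.0"),  # hypothetical tag names
  stdout = TRUE
)
print(changed_files)
```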

3 - Script the tests - preferably in an automated way

Although not changing a code file ensures the absence of regression within its local scope, that observation is still insufficient. The features provided by complex systems depend on a large number of components, and on an even larger number of code files. It is therefore necessary to check for regression at the functional level.

There are mathematical formulations that can prove (by contradiction or induction, for example) that a piece of code does what it claims to do, but applying that principle to vast and complex systems is simply unworkable. That is why the best way to evaluate systems is testing. Other answers mention using the system in an isolated production-like environment, but I see this as just another form of testing. After all, if the environment is not production, the system is not actually being used.

Ideally one should test the system thoroughly before delivering a new version, but this is not always possible (for lack of time or resources). The change indications provided by the version control tools described above can be very helpful here. But it is an increasingly common practice to automate tests whenever possible. This automation makes particular sense when tied to the functional requirements, so as to ensure that each requirement has been properly verified (a relationship that should exist regardless of automation, actually, but anyway).

In the last company where I worked as an employee, new versions were first tested manually (by an employee other than the developer), and only on the components flagged as modified by the version control tool (Rational ClearCase was used there). If problems were found (among them any regression), an issue was opened in the tool itself for some developer to check, and the version was frozen until the problem was solved. After this manual testing phase, an automation tool (AutoHotkey was used there) was run automatically against the generated version, installed on a standalone machine (without any compilers, libraries, etc.). The automation included graphical interface interactions along the same lines as an end user, and the scripts for this test were created and maintained by yet other employees, independent of the testers and developers and closer to the requirements analysts, in order to test the system completely in terms of its functionality. Regressions were rare (I believe due to good test management and also the maturity of the development team), but when they occurred they were mostly identified in this phase.

Concluding

Regressions are problems that can only be dealt with by the entire development team, as this depends on good practices in programming, testing and management. In fact, testing seems to be the most effective way of identifying regressions. Today, comparing the approaches of the different companies where I have worked, I would say this type of problem is what most damages the company's image before the client, because it exposes mismanagement of the project very easily.

Test automation is a useful and fairly easy practice nowadays (there are several tools that can help with it), but it relies mainly on a good test script (checklist) that checks, in a first pass, only the changed components and, ultimately, all functionality.

Regression is a somewhat inevitable problem, especially in large projects involving many developers. What all these practices actually do is prevent these problems from following the generated versions and reaching the system in production, harming the customer and, consequently, the project.

9


Well, your question is related to a broader context: Software Test Management.

A good basic guide is the syllabus used to study for the ISTQB (International Software Testing Qualifications Board) certification, which can be downloaded for free from their page.

That said, you will need coverage on two fronts. I will comment here based on my experience in managing Quality Assurance.

On the Development Side

Programmers must write unit tests. But not only that: programmers' tendency is always to test only the most positive case possible. It is necessary to also test several negative conditions: cases in which the system is expected to be ready to react to a negative situation. For example: on an unauthorized access attempt (a negative situation), the system must respond politely that access is not allowed, rather than spilling exceptions onto the user.

In addition to positive and negative unit tests, over time this list should come to include the bugs themselves. Each new bug that appears and is fixed needs a test added to the unit test list, so that the same bug cannot return as a regression; see the sketch below.
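A minimal sketch of this practice with the testthat R package; the `safe_divide()` function and the issue number are hypothetical:

```r
library(testthat)

# Hypothetical function under test.
safe_divide <- function(a, b) {
  if (b == 0) stop("division by zero is not allowed")
  a / b
}

test_that("positive case: normal division works", {
  expect_equal(safe_divide(10, 2), 5)
})

test_that("negative case: division by zero fails politely", {
  expect_error(safe_divide(1, 0), "division by zero")
})

# Regression test: added when (hypothetical) bug #123 was fixed, so the
# same bug cannot silently return in a future version.
test_that("regression #123: negative denominators are handled", {
  expect_equal(safe_divide(10, -2), -5)
})
```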

How to Manage on the Development Side

There is no need for someone in a role especially dedicated to this, but it is necessary to have a technical leader with experience in software quality. And if, on the one hand, we do not need a new job title, we do need a software infrastructure that supports the practice. The ideal in this case is a version control environment integrated with Continuous Integration.

I will cite some specific software as examples, but of course what is described here can be adapted to the software you already use.

For version control: use Git + GitHub. The first offers you the tool, while the second offers a repository with a navigable web interface, wiki, issue management, and many other features. It is also free for public repositories.

Still on version control, use the flow called Feature Branch: no one should ever work on the main copy (called master in git, trunk in SVN).

Each new feature or bug will be worked on by the developers in a separate new branch, which should be tested locally by the developer himself.

After finishing the code in the new branch, the programmer is then responsible for writing and running the related unit tests. A parenthesis here: although I have described writing the tests afterward, there is much discussion about this. Some recommend writing all the tests beforehand, others say that making this an absolute requirement is a little too radical (including the author of TDD himself), and some write the tests only afterward.

That is where Jenkins (or another Continuous Integration tool) comes in: it will monitor the repository (GitHub) and automatically trigger the execution of the tests every time someone updates a branch.

If everything passes, OK, the new code is accepted. If it does not pass, it is the obligation of the developer who sent the broken code to fix it until the tests pass (or to update the tests, if that is the case).
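For an R package, the check that a CI tool like Jenkins runs on every push can be a one-liner. A minimal sketch, assuming the rcmdcheck package is available on the build machine:

```r
# CI step for an R package: run R CMD check programmatically and fail
# the build on any error or warning.
rcmdcheck::rcmdcheck(
  path = ".",            # package root, as checked out by the CI job
  args = "--no-manual",
  error_on = "warning"   # any warning is enough to break the build
)
```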

Once Jenkins has reported that all tests have passed, the programmer opens a pull request, which is a request to join (merge) the current branch into master. This keeps working code from being broken, and gives programmers a better view of the source code history.

It is the technical leader who should approve or reject the pull request.

Ideally, Jenkins results and code submissions should be visible to the entire team, for example by email or in some collaboration tool (such as Teams or Slack).

In fact, it should be mandatory for the entire team, developers and testers alike, to be online on one of these tools. Besides facilitating communication itself (and remote work is quite common today, my own case included), these tools can receive notifications from other tools (such as Jenkins or GitHub) and send emails automatically if the recipient of a message is away or offline.

On the Testing Team Side

Yes, you are going to need a test team. Programmers are always on a tight schedule, they know too much about the program (while the end user does not), they tend to always test the most positive flow possible, and, if inexperienced, they will think the first code that comes out is a masterpiece, so close to perfection that any reported bug feels like pointing out a defect in their child. An experienced programmer, on the other hand, knows that finding bugs is a good thing; after all, programming is a complex activity and always produces bugs.

For testing, the obvious: testers trained in Quality Assurance. After all, do you ask the driver on your team to take care of the car's mechanics, or someone trained in mechanics?

That being said, the testing team will first need to know in depth the requirements, use cases, cards, or whatever form the requirements are stored in.

Based on them, and on black-box techniques, they will need to write a good list of MANUAL tests.

Contrary to popular belief, manual tests are much faster to specify and perform. Automating tests, especially those involving the UI, is a VERY complex, time-consuming and expensive activity. Moreover, at first it is not clear which features are stable and which will keep changing. We therefore cannot spend time and resources automating tests that will soon need to be changed in the face of changing requirements.

After detailing the manual tests and storing them in some testware tool, the tests should be prioritized, as it will often not be possible to run them all. The priority will come from the time available, from the importance of each part under test, and from the fact that there is always some part of the software (usually the most complex) that has more bugs than all the others (the defect clustering principle).

Then, on the planned date, the selected tests are run, the results are stored in that same testware tool, and bugs are opened, fixed by the developers, re-tested, and closed or re-opened.

With these tests in hand, it is possible to select some of them for a regression test group. This group will be run in every new cycle, to ensure that existing functionality has not been broken.

After a few cycles, it will become clear which of the manual tests are always performed. Those tests will be strong candidates for automation (with the help of some automation tool, obviously).

How to Manage on the Testing Side

Someone from the test team with some experience in project management should take charge of the management activities. After all, the activities are the same as in any project: managing human resources, setting priorities (after all, one of the principles of testing is that exhaustive testing is impossible), estimating and answering for deadlines, reporting metrics, etc.

The Bugs

Bugs must have the following characteristics:

  1. A clear, short title;
  2. State which version of the system it affects;
  3. State which release it is expected to be fixed in (this can be added later by the development team);
  4. State which operating system, version and browser (if applicable) the bug was found on;
  5. A detailed description, including all the steps necessary to reproduce the bug;
  6. Expected result;
  7. Actual result;
  8. Preferably screen captures or, better still, a recorded video;
  9. An assigned owner, even if initially it is just whoever is responsible for triaging bugs.

Some considerations

It is necessary to create the mentality, across the whole team (development and testing), that the goal of both is the same: to deliver software as bug-free as possible.

The testing team does not audit the development team. Bugs will always exist, and both work toward the same goal.

The development team cannot be sad when a bug is found; after all, finding a bug (that already existed) is a good thing.

The test team should be extremely versatile in communication (both writing well and being kind when communicating), and inquisitive by nature. They should want to understand the software, question the requirements when they are ambiguous, and be persistent when they believe a bug is important.

And software quality is everyone's responsibility: whoever writes the requirements, whoever reads them and notices something missing, the programmers, the testing team, everyone.

  • And when I say visualize the history, I mean literally visualize it, as various tools and IDEs can render branch history graphs and project tags.

  • I also did not include the use of tags in the text, to keep it from getting longer than it already is. But remember this: create a tag for each released version. That way you will have an exact copy of the code of each version that customers/users are running, which will help a lot when tracking down problems.
