What are the differences between Git, SVN and CVS?

Viewed 25,811 times

151

What are the advantages, limitations and main differences between these 3 versioning systems, Git, SVN and CVS?

  • 5

    I think the answer to that question could fill a whole book. But it’s a good question! =)

  • 1

    Excellent answers here below; this is one of those questions that becomes classic, many medals and points for all involved ;)

  • The problem with this kind of question is that it’s almost impossible to choose the right answer.

  • 9

    It would be interesting to compare Mercurial as well.

3 answers

133


CVS

It was one of the first version control systems to offer alternative development streams and to let several people on the same team freely edit plain-text files at the same time. A repository that follows the CVS conventions is laid out as a tree comprising a main stream of development (the trunk), alternative development streams (branches), in which changes are implemented separately from the main stream, and tags (labels marking revisions of the other two streams that can no longer be changed, ideal for indicating stable versions).

It has the following commands:

  • Checkout: the first download of an entire module from the CVS repository.
  • Commit: uploads the user’s modifications to the CVS repository.
  • Export: downloads an entire module from a CVS repository without the CVS administrative files. Exported modules are not under CVS control.
  • Import: creates an entire module within a CVS repository by uploading a directory structure.
  • Update: updates the local working copy by downloading the modifications other users have made to the repository.
  • Merge: combines modifications made by different users to the same file in the local copy. Whenever someone else has changed the code, you need to update before committing so that the changes are merged.

It uses a client-server architecture in which all code is centralized. It suits linear development, such as projects that are under maintenance or receive only minor improvements.

SVN

It is the evolution of CVS and solves several of its limitations: it introduces the Rename and Move commands, which not only rename or move a file but also keep its change history; it makes Commit (sending files) truly atomic, with rollback in case of failure; and it can version kinds of files that CVS does not support, such as symbolic links.

It has all the commands of CVS and a few more:

  • Rename
  • Move

It also has the ability to store metadata of files and directories (ignored extensions, merge history, etc.).

Git

It is a version control system quite different from CVS and SVN, because its versioning model is decentralized (there is not exactly a central stream, and when one exists it should not be changed directly, receiving only merges from other development streams) and files are sent in two phases:

  1. Commit: the phase in which changes are stored only locally;
  2. Push: the phase in which changes are sent to a server that gathers all sets of changes (called changesets), which can be freely recombined with one another.
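The two phases above can be illustrated with a toy model. This is only a sketch: the Repository and LocalClone classes and their methods are hypothetical names invented for the illustration, not Git's actual internals, and a commit is reduced to its message.

```python
class Repository:
    """A toy repository: just an ordered list of commit messages."""

    def __init__(self):
        self.commits = []


class LocalClone(Repository):
    """A developer's clone, linked to a hypothetical shared remote."""

    def __init__(self, remote):
        super().__init__()
        self.remote = remote

    def commit(self, message):
        # Phase 1: the change is recorded in the local repository only.
        self.commits.append(message)

    def push(self):
        # Phase 2: send the commits the remote does not have yet.
        missing = [c for c in self.commits if c not in self.remote.commits]
        self.remote.commits.extend(missing)
        return len(missing)


remote = Repository()
clone = LocalClone(remote)
clone.commit("add parser")
clone.commit("fix typo")
commits_on_remote_before_push = len(remote.commits)  # still nothing shared
pushed = clone.push()
```

Until push is called, the remote knows nothing about the two local commits; afterwards it holds both.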

Because Git is not a linear client-server architecture, it is ideal for projects in early development, where conflicts are common and features are developed separately. Its merge process is the most complete and tolerant of the three.

90

First we can classify the three tools into two major categories:

Centralized Versioning Systems

CVS and SVN have a central repository where users check out and commit versioned artifacts.

The advantage of this approach is that you can have central control over projects and enforce access security more easily. It is also possible to lock files.

However, there are many disadvantages. The main one is that this type of system does not scale very well: many teams and projects in the same repository tend to slow it down. Another important disadvantage is that users cannot do much offline; they must always be connected to the central server to perform operations such as creating tags and branches, merging, etc.

In addition, there are significant differences between CVS and SVN:

  • SVN can track renamed files.
  • If centralized versioning is slow, CVS can be even slower.
  • A CVS commit is per file. SVN groups all the changes of a commit together, so it is possible, for example, to go back to a previous revision as a whole. This makes it much easier to find which commit broke the code.
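The difference between a per-file commit and a grouped (atomic) commit can be sketched as follows. This is a toy model under stated assumptions: the repository is a plain dict, the function names are invented for the illustration, and the fail_on parameter simulates a failure partway through (for example a dropped connection).

```python
def commit_per_file(repo, changes, fail_on=None):
    """CVS-style sketch: each file is committed on its own."""
    for name, content in changes.items():
        if name == fail_on:
            raise IOError(f"commit failed at {name}")
        repo[name] = content  # files committed earlier stay committed


def commit_atomic(repo, changes, fail_on=None):
    """SVN-style sketch: either every file is committed, or none."""
    staged = dict(repo)  # work on a copy first
    for name, content in changes.items():
        if name == fail_on:
            raise IOError(f"commit failed at {name}")
        staged[name] = content
    repo.clear()
    repo.update(staged)  # reached only if every file succeeded


changes = {"a.c": "v2", "b.c": "v2"}

cvs_like = {"a.c": "v1", "b.c": "v1"}
try:
    commit_per_file(cvs_like, changes, fail_on="b.c")
except IOError:
    pass  # a.c was already updated; the repository is half-changed

svn_like = {"a.c": "v1", "b.c": "v1"}
try:
    commit_atomic(svn_like, changes, fail_on="b.c")
except IOError:
    pass  # nothing was applied
```

After the simulated failure, the per-file repository holds a mix of old and new files, while the atomic one is unchanged.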

Distributed Versioning Systems

In Git, as in Mercurial and Bazaar, there is no central repository. Of course you can designate one as such, but each repository, even the one on a developer’s machine, is a full and functional copy.

A drawback of this model is that the initial clone of the repository can take a long time, since not only the current copy of each artifact is transferred but also its history, tags and branches. This can be mitigated by selectively retrieving parts of the repository, such as specific branches or tags, or by limiting by date. But I don’t know in detail how far this is implemented in each of the systems.

Another drawback is the difficulty of centralized management and effective access control, as repositories are distributed across multiple environments.

Also, in distributed versioning systems, commit and checkout are done against the local repository of each environment. Upon completing the work, with changes committed, tags created and branches merged, the developer needs to synchronize the local repository with the remote one. This is done with the commands push (sends updates from your local repository to the remote) and pull (retrieves updates from the remote repository into the local one).

It’s a little more complicated to work with distributed versioning systems, but the advantages are many:

  • Except for the initial pull, they are much faster than centralized systems like CVS and SVN.
  • Many operations do not require network access, so the developer can work offline, synchronizing with the remote repository only when needed.
  • The developer can work in private, creating tags, branches and versions that can simply be discarded.
  • Except when there are conflicts, the merge is automatic.
  • Each copy of the repository acts as a backup of the "main repository".

Note that there are differences between distributed versioning systems. I know almost nothing about Bazaar, but I can mention some interesting cases regarding Git and Mercurial:

  • The git pull command also updates the project’s working files, so Git’s pull differs from the traditional concept of pull: it is equivalent to pull + update. A pure pull is actually git fetch.
  • In Git, you need to manually add new and changed files to the staging area with git add before they can be committed. Mercurial picks up changed tracked files automatically by default.
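The pull versus fetch distinction described above can be sketched with a toy model. The Clone class and its attributes are hypothetical names for this illustration only; the real git pull runs a fetch followed by a merge (or rebase), which is simplified here to jumping to the newest known revision.

```python
class Clone:
    """A toy local clone: stored history plus the state of the working files."""

    def __init__(self):
        self.known_revisions = []     # history held in the local repository
        self.working_revision = None  # revision the working files reflect

    def fetch(self, remote_revisions):
        # Download history only; the working files are left untouched.
        for rev in remote_revisions:
            if rev not in self.known_revisions:
                self.known_revisions.append(rev)

    def update(self):
        # Bring the working files to the newest known revision.
        if self.known_revisions:
            self.working_revision = self.known_revisions[-1]

    def pull(self):
        raise NotImplementedError("use git_style_pull in this sketch")

    def git_style_pull(self, remote_revisions):
        # Git-style pull in this simplified model: fetch, then update.
        self.fetch(remote_revisions)
        self.update()


remote_revisions = ["r1", "r2"]

fetched_only = Clone()
fetched_only.fetch(remote_revisions)

pulled = Clone()
pulled.git_style_pull(remote_revisions)
```

The clone that only fetched knows about both revisions but its working files have not moved; the one that pulled is on the latest revision.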
  • +1 for mentioning other distributed versioning systems.

  • +1 Excellent response!

  • 1

    An interesting point is that DVCSs meet the needs of projects that are distributed in the physical sense too: external contributors in different locations exchanging patches by email (or forking and sending pull requests to upstream projects), a complex chain of repositories and a hierarchy of reviewers and committers, as well as many branches and merges (anyone who has had to merge two SVN branches knows how much work it takes).

  • 2

    Both Git (initiated by Linus Torvalds) and Mercurial (Matt Mackall) emerged in reaction to the license change of BitKeeper, one of the first tools that could handle the problems mentioned above.

  • Excellent response

  • 1

    An interesting feature you have in git is the stash.

51

A bit of history

Beginnings

Version control systems are very old. Some of the first known were CA Software Change Manager, Panvalet and SCCS (1972). Only a decade later did RCS emerge, forming the basis of this kind of software as we know it today. A few years later an evolution emerged that made it practical to use, though not without problems.

Evolved products

CVS was useful for a long time, but it lacks more sophisticated functionality and, above all, reliability. It is quite true that in some cases it works well, but I can’t see any reason to choose it these days to manage a code base of your own. At most, use a client to fetch code from a legacy project.

Throughout the 80s and 90s there were several commercial options, the best known being ClearCase, Visual Source[un]Safe and Perforce.

These systems are client-server, so they need a central repository to receive all updates. Each one, with its own characteristics, virtues and defects, managed to greatly improve the workflow of software development teams.

Something that works well

CVS was very successful because it was open source, but it had many defects. Not that some commercial products (see the little joke I made above) did not have their success too. Out of this came Subversion, or simply SVN: an evolution in functionality and reliability compared with its predecessor, CVS.

CVS

So the only thing I’m going to say about CVS is: don’t use it, don’t waste time on it. It does not have a single advantage over its competitors.

The maximum evolution

Still, some projects were not well served by this software. Out of that came a new generation of products that work in a distributed way and can operate without a central server (although in practice there usually ends up being one).

This not only made disconnected work and more complex interactions within the team possible, and allowed teams of unlimited size, but also added conveniences that make the workflow more organized and flexible: repository hierarchies and, above all, easier merging, which had become a necessity for heterogeneous teams.

Some examples of such software that emerged in this century are: Sun WorkShop TeamWare, Code Co-op, BitKeeper, GNU Arch (the first open-source one), Darcs, DVCS, Monotone, Bazaar, Git, Mercurial, Fossil (a little different from its competitors and worth a look), Veracity and Plastic SCM. The last of these brings the two modes, distributed and centralized, closer together.

Git and Mercurial began to be developed to replace BitKeeper after its license changed. They ended up becoming the best-known distributed version control software. More recently, with the growing popularity of GitHub (a web-based central repository host for Git), Git became dominant, to the point that even Microsoft and Google preferred it over their own repositories. Microsoft eventually bought GitHub.

Popularity

Setting aside proprietary software such as Team Foundation Server and a few less significant products, little software has kept its popularity. There is still a good fight between the two ways of controlling versions. SVN has become almost the sole representative of centralized control, and Git has shot ahead as the preferred choice for distributed control, followed by Mercurial, with fewer and fewer supporters.

Ultimately, comparing these two most-used tools is in a way a comparison between the two modes of control, because that is what differentiates them the most.

I personally prefer simpler systems (centralized control), although I admit that some features of the more complex (distributed) systems are highly desirable. That’s why I like solutions that try to bring both together, so I’ve always gotten along well with SVN and Mercurial.

But I also admit it gets harder and harder to ignore Git. Having such a popular repository host and ever better integrations with other tools makes it worth considering, even if you don’t need a truck to deliver pizza.

Popularity is also a feature.

SVN

It works great for individual developers and small teams. Its mechanisms are more practical for simple workflows, which don’t usually involve many conflicts. It shines when more control is required, especially in corporate environments.

Popularity aside, SVN should be the first choice for developers until they run into a specific need that only Git can meet (and such needs are common).

Advantages

  • It is easier to learn and use and better matches programmers’ intuition.
  • Behaves more like general-purpose file version control.
  • A single canonical repository suits corporate philosophy better, allowing greater control and easier administration.
  • Allows working with parts of a repository. Allows building changesets.
  • Backup is very simple.
  • Allows locking files to prevent updates.
  • Uses sequential version numbers, which simplifies browsing versions.
  • More sophisticated access privilege control.
  • Easier to work with binary files, especially large ones.
  • Works better with renamed files.
  • Preserves file timestamps (at least in some situations).
  • Gives more freedom in how you work and organize the project.
  • Branches are just a part of the repository (for better or for worse).
  • Since it only works with the current files, the initial copy of the repository is relatively fast.
  • Discourages complex workflows (yes, this is an advantage).
  • It is more mature, especially on Windows and when used with a GUI (this was an advantage, but no longer is; Git has improved a great deal here).

See more about some concepts that are strongly rooted in the SVN.

SVK

It is possible to adopt a more decentralized workflow with SVN through SVK. At least some of the disadvantages of SVN can be mitigated this way without losing most of its advantages. Too bad it is not actively developed. Regardless, some people build similar workflows.

Git

A decentralized system shines when development is decentralized (I still don’t see the point of individual use, but I’m trying to understand it), especially in large teams with a large volume of updates and complicated flows. Of course it can be used in other environments as well.

There’s a steep learning curve. Its philosophy is closer to that of UNIX-based systems: it is full of scattered tools that are not so well documented and were not designed to make the work, or the understanding of what is being done, easier; it is not very intuitive. (Beware: people who have used it for a long time will say that SVN is the one that is not intuitive. Yes, once you get used to a way of working, it becomes intuitive for you.)

An important difference is that distributed systems distinguish between commit and push. The commit creates a local snapshot, and only the push actually sends pending revisions to another (possibly remote) repository. On centralized systems the commit does it all. This is simpler but creates difficulties in environments with many updates by different users.

Advantages

  • Controls content more generally.
  • Runs faster in almost everything, in some cases absurdly faster. It is optimized to run over the internet.
  • The repository takes up less space, and it is easier to repair.
  • It’s much easier to manage multiple sources of updates.
  • It’s easy to work with local copies for experiments and parallel development. Branches are cheap and simple, and are even encouraged.
  • Encourages frequent commits.
  • Makes merging much easier.
  • It lets you work comfortably, without losing any functionality or information, while disconnected from the central server (which can be, and often is, used). It has more local metadata.
  • It has more audit information and more conveniences for administration overall.
  • Handles end-of-line conversion more easily.
  • Revisions are digitally signed.
  • Project history can be modified.
  • Has a staging area, which lets you select the parts you want to send to a repository.
  • Allows a wider range of workflows.

Some of these advantages are available today also in SVN.

On the claim that centralized repositories do not scale well

That’s not quite true; as I always say, it depends on who does it. Google made a single repository work well for its entire code base, one of the largest in the world, absurdly larger than Linux, for which it is commonly held that a centralized system would not work (of course, the goals are different).

I also always say that there is a difference between good programmers and good engineers. Those who know how to do engineering always find better solutions. Those who are not engineers, and there is no harm in that, do what everyone else is doing, sometimes thinking a little, sometimes not thinking at all. And don’t get me wrong: these people are good at programming and understand reasonably well what they’re doing, but they’re looking for the ready-made solution rather than the best one. That has its merit, but it is not engineering.

Git is great, but I only use it because it’s easier to use what everyone else is using. If I were to choose a decentralized version control system it would be Mercurial, but a centralized one usually suits me better. Each person should look for the one that suits them best.

Conclusion

This analysis may be outdated, because the two tools keep trying to get closer to each other and to fix their deficiencies (in fact, this has been happening since I wrote the original version). There is no miracle: the only way to know for sure whether a piece of software is good for you is to install it and start using it.

Given its popularity, Git eventually dominated the industry. There are still reasons to use other things, but unless those reasons are strong, Git turns out to be the easiest choice.
