What are modification sets in Git
"Set of modifications" is a translation of changeset, a concept that is not directly related to how Git stores changes.
Changeset is a core concept of Git and is also present in several other source code version control systems.
The basic idea of a changeset is to commit a set of changes atomically, that is: either all the changes in the set are committed successfully, or none are. We can make an analogy with a database transaction, which either persists multiple records at once or rolls everything back if persisting one of the records fails.
Changesets in Git go beyond changesets in other version control systems. In Git you can, for example, change a changeset that is already in the repository! That is, in Git you can modify the history of the changes that have taken place. Of course, this only applies in certain scenarios and there are restrictions, but that is another story.
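The database analogy can be sketched in a few lines of Python. This is only an illustration of all-or-nothing behavior using the standard sqlite3 module, not how Git implements commits; the `files` table and the `commit_changeset` helper are names made up for this example.

```python
import sqlite3


def commit_changeset(conn, changes):
    """Persist every (path, content) pair in one transaction: all or nothing."""
    try:
        # The connection context manager commits if the block succeeds
        # and rolls back if an exception escapes it.
        with conn:
            for path, content in changes:
                conn.execute(
                    "INSERT INTO files (path, content) VALUES (?, ?)",
                    (path, content),
                )
        return True
    except sqlite3.IntegrityError:
        return False


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE files (path TEXT PRIMARY KEY, content TEXT NOT NULL)")

# First changeset: both records are valid, so both are persisted.
ok = commit_changeset(conn, [("a.txt", "one"), ("b.txt", "two")])

# Second changeset: the last record violates NOT NULL, so the whole
# changeset is rolled back, including the valid "c.txt" record.
failed = commit_changeset(conn, [("c.txt", "three"), ("d.txt", None)])

stored_paths = [row[0] for row in conn.execute("SELECT path FROM files ORDER BY path")]
print(ok, failed, stored_paths)
```

After the failed changeset, only the files from the first one remain, which is the behavior a changeset is meant to guarantee.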
How does Git store changes?
In short, it stores them in a way similar to many other version control systems: when a changeset arrives, only the files in the changeset are saved. Files that haven’t changed remain as they were. At every commit, Git records a snapshot, which is the state of the repository as it was right after that commit.
As with other versioning systems, you can request the state of the repository at any point in the past, i.e. you ask for a particular snapshot. What Git gives you then are the files committed when that snapshot was recorded, plus all the files that were already there and weren’t modified by the commit that gave rise to the snapshot.
No, Git does not save only the changes made to each file in a commit. At commit time, Git stores the entire contents of the file, not just the modifications made to it.
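The snapshot idea described above can be modeled with a toy in-memory store. This is a simplified sketch, not Git's actual data structures: `ToyRepo` and its methods are hypothetical names, and each snapshot is just a mapping from path to a content id.

```python
import hashlib


class ToyRepo:
    """Toy model: every commit records a full path -> content-id mapping."""

    def __init__(self):
        self.blobs = {}      # content id -> full file content
        self.snapshots = []  # one {path: content id} mapping per commit

    def commit(self, changed_files):
        # Start from the previous snapshot so unchanged files are carried
        # over by reference (same content id), not copied again.
        snapshot = dict(self.snapshots[-1]) if self.snapshots else {}
        for path, content in changed_files.items():
            blob_id = hashlib.sha1(content.encode()).hexdigest()
            self.blobs[blob_id] = content  # only changed files are stored
            snapshot[path] = blob_id
        self.snapshots.append(snapshot)
        return len(self.snapshots) - 1

    def checkout(self, commit_number):
        # Rebuild the whole tree as it was right after that commit.
        snapshot = self.snapshots[commit_number]
        return {path: self.blobs[blob_id] for path, blob_id in snapshot.items()}


repo = ToyRepo()
first = repo.commit({"README": "hello\n", "main.py": "print(1)\n"})
second = repo.commit({"main.py": "print(2)\n"})  # README is untouched

print(repo.checkout(first))
print(repo.checkout(second))
```

The second snapshot still lists README, but it points to the same stored content as the first one, so nothing is duplicated for files that did not change.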
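To make the "entire contents" point concrete, the sketch below computes the id Git would assign to a file's contents and builds the bytes of a loose object. It assumes Git's default SHA-1 object format, where a blob is the header `blob <size>\0` followed by the full content, compressed with zlib; the function names are made up for this example.

```python
import hashlib
import zlib


def git_blob_id(content: bytes) -> str:
    """Object id of a blob, assuming the default SHA-1 object format."""
    header = b"blob " + str(len(content)).encode() + b"\0"
    return hashlib.sha1(header + content).hexdigest()


def loose_object_bytes(content: bytes) -> bytes:
    """Bytes of a loose object: zlib-compressed header plus full content."""
    header = b"blob " + str(len(content)).encode() + b"\0"
    return zlib.compress(header + content)


before = b"line 1\nline 2\n"
after = before + b"line 3\n"  # a one-line change

# A tiny change still yields a brand-new object holding the whole file.
print(git_blob_id(before))
print(git_blob_id(after))
print(zlib.decompress(loose_object_bytes(after)))
```

Adding a single line produces a different id and a new object that contains all three lines, not just the added one.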
Is it correct to say that Git is able to store only the differences between commits of the same file, instead of keeping the entire file even if only a single line was modified?
Yes, it’s correct. When it finds it appropriate, Git runs a kind of Garbage Collection and, among other things, deletes some historical files, replacing them with a record of only the modifications made to those files between one commit and another (Delta Encoding). You can also force this process whenever you wish.
Note that during Garbage Collection, Git does not replace the files in the newer commits with their delta encoding, but the reverse: it records the changes going backwards from the file’s most recent state, so that the latest version of the file (which is probably the one you will want most of the time) can be delivered faster.
Conclusion
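The "reverse" direction can be illustrated with a small line-based sketch: the newest version is kept in full and the older version is stored only as copy/insert instructions relative to it. This is a toy model built on Python's difflib; Git's real pack format uses its own binary delta encoding, and `make_delta` and `apply_delta` are hypothetical helpers.

```python
import difflib


def make_delta(base_lines, target_lines):
    """Describe target_lines as copy/insert instructions relative to base_lines."""
    delta = []
    matcher = difflib.SequenceMatcher(a=base_lines, b=target_lines, autojunk=False)
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag == "equal":
            delta.append(("copy", i1, i2))                 # reuse lines from base
        elif tag in ("replace", "insert"):
            delta.append(("insert", target_lines[j1:j2]))  # lines only in target
        # "delete": lines only in base are simply not copied
    return delta


def apply_delta(base_lines, delta):
    """Rebuild the target from the base plus the recorded instructions."""
    rebuilt = []
    for instruction in delta:
        if instruction[0] == "copy":
            rebuilt.extend(base_lines[instruction[1]:instruction[2]])
        else:
            rebuilt.extend(instruction[1])
    return rebuilt


old = ["a\n", "b\n", "c\n"]
new = ["a\n", "b changed\n", "c\n", "d\n"]

stored_full = new                    # latest version kept whole (fast to read)
stored_delta = make_delta(new, old)  # older version kept only as a delta

rebuilt_old = apply_delta(stored_full, stored_delta)
print(stored_delta)
print(rebuilt_old)
```

Reading the latest version needs no reconstruction at all, while the older version costs one delta application, which matches the trade-off described above.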
Set of modifications or Changeset is a concept that deals with commit atomicity and is not directly related to the way Git stores files. Git is one of many versioning systems that use this changeset concept.
During the commit, Git stores the entire contents of each changed file, even if the file was modified only slightly (just one new line, for example).
Git does not need to save a copy of the whole repository at every commit to ensure that the repository can be restored to some past state. Instead, during the commit it records a snapshot that points to the newly committed files and also to the current versions of the other files that were already there.
When appropriate, Git reorganizes its storage to save space (Garbage Collection). During this reorganization, past versions of a file can be replaced by records of only the changes the file has undergone (delta encoding). So, when an older version is requested, Git rebuilds the file starting from its current version and applying the change records until it arrives at the older version.
It's just what you're thinking: it saves relative to the last commit.
– bfavaretto
Because the second option (save both the new full version of the file and the modifications in this file) would be a redundant record.
– Caffé
http://git-scm.com/book/en/v2/Getting-Started-About-Version-Control
– Skywalker
This one is in Portuguese: http://git-scm.com/book/pt-br/v1
– Skywalker
Thanks @bfavaretto, there's just one thing I didn’t get. The link that Juarez posted says that Git actually saves snapshots, while other VCSs save modifications to the files. If Git saves snapshots, then doesn't each commit really hold the exact state of each file and not just the modifications? I guess I don’t quite understand how this idea of snapshots relates to what I saw in the course and mentioned in the question.
– SomeDeveloper
No, it only has the modifications, but with them it is able to reproduce the entire tree of your project at a given commit. I don’t know where you read this, or what you were taught as "snapshot" in the course, but bear in mind that this is not a precise technical term. It is a metaphor conveying that it is possible to recover a "portrait" of your project at various stages.
– bfavaretto
Actually, the course never mentioned snapshots, only changesets: sets of modifications that are saved at each commit. It was at this link http://git-scm.com/book/en/v2/Getting-Started-About-Version-Control that I have now seen snapshots mentioned. So the idea is that the repository holds the modification sets for each commit and it is possible to get that "picture" by applying the modifications? Thanks again for the help!
– SomeDeveloper
@Leonardo Hmm, I'm beginning to fear for this course. It’s nothing serious, but I don’t like it when something formal that is trying to teach uses the wrong terms. Git does not work with changesets, it does not have that ability. I don't know much about the inner workings of this software, but I know that changesets are the modifications made to the files, grouped together. A snapshot is a state at a given time. In theory it would be an exact copy of the file at that time. But it is possible to work with delta encoding, which ends up reproducing the mechanism of differences, but by a totally different method.
– Maniero
No one had answered, so I ventured an answer.
– Maniero
@Mustache I also got a little suspicious of the course now. In addition to the official Git website I also found some videos on MVA (http://www.microsoftvirtualacademy.com/training-courses/using-git-with-visual-studio-2013-jump-start) about Git that confirm the use of snapshots and not deltas.
– SomeDeveloper
But as far as I know snapshots are obtained by deltas.
– Maniero
@Leonardo, I wrote an answer because things were getting a little fuzzy around here. I hope I’ve made clear what a changeset is, what a snapshot is, how Git stores files at the time of the commit, and how the contents of a file are later replaced by records of the changes made to it. If anything is still unclear, please let us know.
– Caffé