A Brief Overview of Git

In this post I want to cover some Git misconceptions (or rather, a lack of conception?).

add/commit/push

Commits are the smallest unit of reasonable change. In between two commits should lie the smallest possible meaningful change that doesn’t break anything. Notably, it should not be reasonably possible to break a commit into two smaller commits.

(“Commit” as a noun means “a specific version of the repository history”, while “commit” as a verb means “create a new version with the smallest reasonable difference from the current version.”)

git add is the “scratchwork” you do while making the edits you need for your next commit. git commit, well, actually creates the version. git push is irrevocable¹ and tells the rest of the world about your changes.
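To make the three steps concrete, here is a toy in-memory model in Python. It is only a sketch: the ToyRepo class and its methods are hypothetical names invented for illustration, and real Git stores and transfers data quite differently. It only shows which step changes which piece of state.

```python
class ToyRepo:
    """Hypothetical toy model of add/commit/push (not how Git works internally)."""

    def __init__(self):
        self.staged = {}    # scratchwork collected by add(), not yet a version
        self.commits = []   # local history: list of (message, snapshot)
        self.remote = []    # what the rest of the world can see

    def add(self, path, content):
        # Stage an edit for the next commit.
        self.staged[path] = content

    def commit(self, message):
        # Create a new version: the previous snapshot plus the staged edits.
        previous = self.commits[-1][1] if self.commits else {}
        self.commits.append((message, {**previous, **self.staged}))
        self.staged = {}

    def push(self):
        # Publish the local history.
        self.remote = list(self.commits)


repo = ToyRepo()
repo.add("A.txt", "hello")
after_add = (len(repo.commits), len(repo.remote))     # (0, 0): nothing recorded yet
repo.commit("Create A.txt")
after_commit = (len(repo.commits), len(repo.remote))  # (1, 0): version exists locally
repo.push()
after_push = (len(repo.commits), len(repo.remote))    # (1, 1): now everyone sees it
print(after_add, after_commit, after_push)
```

Only the last step touches the shared state, which is why it is the one to be careful with.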

Merge conflicts

Merging/rebasing is a way to reconcile the history of two divergent branches.

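There are two types of divergent branches: those whose changes give the same result no matter which order they are applied in, and those where the order matters.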

As a concrete example, let us say I create file A and you create file B. Then, no matter what order we do that in, the end result is the same: files A and B have been created.

Git uses diffs (the difference between two versions of a file) to calculate the changes made. So, for instance, if we start with a file that says

b

and I add a character a above the b, it will look like

+a
b

while if you add a character c below the b, it will look like

b
+c

and Git is smart enough to realize that the change isn’t, “Dennis overwrote the file b to ab, while you overwrote the file b to bc” — rather, Git recognizes that I added a before b and you added c after b. Therefore, even if we edit the same file, there is a possibility that git merge spits out

+a
b
+c

because under a diff-based system, the order in which we apply our changes does not matter. You can write c below b before I write a above b and we will still end up with the same result. (By the way, diffs are performed on a line-by-line basis: this will not work if I change the line b to ab while you change the same line to bc.)
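The order-independence can be checked with a small Python sketch. It assumes a simplified model in which each change is a single line inserted at a position relative to the original file; apply_edits is a hypothetical helper written for this post, not something Git provides.

```python
def apply_edits(base, edits):
    """Apply insertions expressed relative to `base`.

    Each edit is (position, line): insert `line` before base[position].
    position == len(base) means "append after the last line".
    """
    result = []
    for i in range(len(base) + 1):
        # Lines anchored at position i, in the order the edits were listed.
        result.extend(line for pos, line in edits if pos == i)
        if i < len(base):
            result.append(base[i])
    return result


base = ["b"]
mine = (0, "a")    # add "a" above "b"
yours = (1, "c")   # add "c" below "b"

mine_first = apply_edits(base, [mine, yours])
yours_first = apply_edits(base, [yours, mine])
print(mine_first)   # ['a', 'b', 'c']
print(yours_first)  # ['a', 'b', 'c']
```

Because the two edits are anchored at different positions, listing them in either order produces the same file.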

However, if I do

+a
b

but this time, you do

+c
b

the order in which we apply the changes matters. If we apply my change first, we get

+a
+c
b

while if we apply your changes first, we get

+c
+a
b

Git can automatically merge for you if the order does not matter, that is, if any reasonable interpretation of the changes (my changes first or your changes first) gives the same result. However, when the order of the changes matters (i.e. the changes alone do not determine the order of the resulting lines), you get a merge conflict and must resolve it manually.
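The rule can be sketched as a toy merge in Python. This is an illustration under simplifying assumptions, not Git's actual merge algorithm: each side is described only by the lines it inserts at positions in the original file, merge_insertions is a hypothetical function, and the conflict markers merely imitate the style Git uses.

```python
def merge_insertions(base, mine, yours):
    """Toy merge. Each side is a dict {position: [lines inserted before base[position]]}.

    Returns (merged_lines, conflict_positions).
    """
    merged, conflicts = [], []
    for i in range(len(base) + 1):
        a, b = mine.get(i, []), yours.get(i, [])
        if a and b and a != b:
            # Both sides inserted different lines at the same spot: order matters.
            conflicts.append(i)
            merged += ["<<<<<<< mine", *a, "=======", *b, ">>>>>>> yours"]
        else:
            # At most one side changed this spot (or both agree): no ambiguity.
            merged += a or b
        if i < len(base):
            merged.append(base[i])
    return merged, conflicts


# I add "a" above "b", you add "c" below "b": merges cleanly.
clean, clean_conflicts = merge_insertions(["b"], {0: ["a"]}, {1: ["c"]})
print(clean, clean_conflicts)      # ['a', 'b', 'c'] []

# We both add a different line above "b": conflict at position 0.
clash, clash_conflicts = merge_insertions(["b"], {0: ["a"]}, {0: ["c"]})
print(clash_conflicts)             # [0]
print("\n".join(clash))
```

In the second case the sketch refuses to pick an order and leaves both candidates in the file between markers, which is roughly the situation you face when resolving a conflict by hand.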

A more abstract example with mathematical functions

To make an analogy, consider the functions $f(x) = x + 1$ and $g(x) = x + 2$. Then $f(g(x)) = g(f(x)) = x + 3$. In other words, it doesn’t matter what order you apply the changes (i.e. $f$ and $g$) in: the end result is still the same.

However, if $f(x) = x + 1$ and $g(x) = 2x$, then $f(g(x)) = 2x + 1$ while $g(f(x)) = 2x + 2$. Here, the order in which you apply the changes matters.
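Both cases are easy to check numerically. In this short Python sketch, compose is a hypothetical helper, and the doubling function is named h (rather than reusing g) only so that both examples can live in the same snippet.

```python
def compose(outer, inner):
    """Return the function x -> outer(inner(x))."""
    return lambda x: outer(inner(x))


f = lambda x: x + 1
g = lambda x: x + 2
h = lambda x: 2 * x   # the second example's g(x) = 2x

# Pairs of (f after g, g after f): always equal, so the order does not matter.
commuting = [(compose(f, g)(x), compose(g, f)(x)) for x in range(3)]
# Pairs of (f after h, h after f): they differ, so the order matters.
non_commuting = [(compose(f, h)(x), compose(h, f)(x)) for x in range(3)]

print(commuting)      # [(3, 3), (4, 4), (5, 5)]
print(non_commuting)  # [(1, 2), (3, 4), (5, 6)]
```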