Today I read a brilliant article about effective use of git bisect, but I disagreed with a small nuance of one of its conclusions (and, by internet law, was honor bound to write a blog post about it):

“Had this happened in a code base with a ‘nice history’ (as the squash proponents like to present it), that small commit would have been bundled with various other commits. The problem wouldn’t have jumped at me if buried in dozens of other changes.”

It’s true that `git merge --squash` obscures history; whether or not this makes a nice history is entirely dependent on the situation.

Lossless merges

First we need to agree on background and introduce terms. Let’s say you have a git history that looks like this:

          C - D feature/magic
         /
    A - B main

A standard `git merge feature/magic` issued on the `main` branch results in this history:

          C - D feature/magic
         /
    A - B - C - D main

This is a fast-forward merge. Since the `main` ref is at `B` and `B` is the parent of `C`, when we merge `feature/magic` into `main`, `main`’s ref is updated to point at the commit at `D`.

There is no loss of fidelity from the point of view of development. Every development commit is kept and the relationships between commits are maintained.

Lossy merges

Using `--squash` instead of the default merge strategy is lossy: the fidelity of git history is lost. Squash, in our example, results in a new commit being added to `main`’s history that is an amalgam of the commits on the `feature/magic` branch:

          C - D feature/magic
         /
    A - B - - - CD' main

You can no longer see that `C` and `D` were two separate commits.

Helpful Loss

There are reasons to choose a lossy merge over a lossless merge. There are blogs that advocate heavily for a squash workflow. Which strategy to choose depends on the content of the commits you are merging. The strategy chosen should maintain the principle that a commit in a mainline branch’s history should make sense on its own.

In the above example, the content of the feature branch isn’t shown. A new example might be a `feature/lossless` branch that contains a refactor and a new feature that depends on that refactor:

    | * (feature/lossless) feature: method is dynamic
    | * refactor: method instead of global
    |/
    * (main) Initial Commit

This is an example where the desired outcome is lossless: both of these commits are meaningful on their own and can be vectors for bugs. After the merge, in an ideal case, the main branch should look like:

    * (main, feature/lossless) feature: method is dynamic
    * refactor: method instead of global
    * Initial Commit

Now, imagine a different branch: a bugfix with only a single commit that is up for review.

    | * (bugfix/lossy) bugfix: validate user input
    |/
    * (main) Initial Commit

During review, there’s a typo in a comment that needs fixing. Now the branch graph looks like:

    | * (bugfix/lossy) fix comment typo
    | * bugfix: validate user input
    |/
    * (main) Initial Commit

I would argue that the typo commit doesn’t make sense on its own. There’s no need to persist that commit into the main branch: it’s noise. It’s not a vector for meaningful error, it’s a development detail that shouldn’t leak back into the main branch. In short, a more functional history for a merge would be to use a lossy strategy:

    * (main, bugfix/lossy) bugfix: validate user input
    * Initial Commit

This is all an oversimplification

Of course, the example above completely ignores merge commits, repository merge strategies, and any shared agreements about the state of feature or mainline branches, code review, testing strategies, deployment pipelines, and so so (so!) much more!

Many of the blog posts on git I read make broad generalizations about the Right™ way to use some particularly controversial features of git (pull, merge, rebase, branching, commit messages …wait. 🤔 Is every feature controversial?), but the reality is that there is a lot of nuance in the world and the only right answer depends on your situation.
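
If you want to see the difference between the lossless and lossy merges for yourself, here is a rough sketch that replays the `feature/magic` example in a throwaway repository. The branch names `ff-main` and `squash-main`, the `commit` helper, and the `A.txt`-style file names are my own inventions for illustration (they let both merges start from the same `B` in one repository); nothing here is a recommended workflow.

```shell
#!/bin/sh
set -eu

# Work in a scratch directory so no real repository is touched.
repo=$(mktemp -d)
cd "$repo"
git init -q
# Name the initial branch explicitly; older git versions default to "master".
git symbolic-ref HEAD refs/heads/main
# Local settings so commits work without any global git configuration.
git config user.name "Example"
git config user.email "example@example.com"
git config commit.gpgsign false

# Hypothetical helper: add one small file and commit it with the given message.
commit() {
  echo "$1" > "$1.txt"
  git add "$1.txt"
  git commit -q -m "$1"
}

# main: A - B
commit A
commit B

# feature/magic: C - D on top of B
git checkout -q -b feature/magic
commit C
commit D

# Lossless: a copy of main fast-forwards to D, so C and D stay separate.
git checkout -q -b ff-main main
git merge -q --ff-only feature/magic
ff_count=$(git rev-list --count ff-main)

# Lossy: --squash only stages the combined change; we commit it as CD'.
git checkout -q -b squash-main main
git merge -q --squash feature/magic
git commit -q -m "CD'"
squash_count=$(git rev-list --count squash-main)

echo "fast-forward history: $ff_count commits"
echo "squash history: $squash_count commits"
```

The fast-forwarded branch reports four commits (A, B, C, D) and the squashed branch reports three (A, B, CD'). Running `git log --oneline --graph --all` in the scratch directory afterwards shows both shapes side by side.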