TL;DR
--first-parent
If you squash, Pull Request reviewers will just have to read one single commit. Easier.
You don’t have to squash: the merge commit already contains the whole branch, squashed.
When the PR is not squashed, you can review both the final result and each single step. You can comment on, amend, or exclude each single commit. You can still see the PR as one single, unified change by looking at the merge commit.
By contrast, if the PR is squashed, you only have the final result, and all the single steps are lost forever.
It is hard to expect painstaking precision when the details have been melted together.
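To make the claim about the merge commit concrete, here is a minimal sketch in a throwaway repository. All branch, file and commit names are made up for illustration. It merges the same branch twice, once with a regular merge commit and once with a squash, and compares the resulting trees.

```shell
# Throwaway repository; every name below is hypothetical.
demo=$(mktemp -d)
cd "$demo"
git init -q
git config user.email "dev@example.com"
git config user.name "Dev"
echo "base" > file.txt
git add file.txt
git commit -q -m "Initial commit"
main=$(git symbolic-ref --short HEAD)   # default branch name varies by setup

# A feature branch made of three small commits.
git checkout -q -b feature
for step in 1 2 3; do
  echo "step $step" >> file.txt
  git commit -q -a -m "Feature step $step"
done

# Option A: a regular merge commit (--no-ff forces one).
git checkout -q -b merged "$main"
git merge -q --no-ff -m "Merge feature" feature

# Option B: a squash merge of the same branch.
git checkout -q -b squashed "$main"
git merge -q --squash feature
git commit -q -m "Feature (squashed)"

# Both end up with exactly the same content.
merged_tree=$(git rev-parse "merged^{tree}")
squashed_tree=$(git rev-parse "squashed^{tree}")

# The unified change of the whole branch, read off the merge commit.
git diff --stat "merged^1" merged
```

The diff of the merge commit against its first parent is the same change the squashed commit carries, while the three individual steps stay available on the second parent.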
A history like:
is cleaner than one displaying all the single commits of each pull request:
You don’t need to squash to hide details. Just use --first-parent.
This works from the command line with git log --first-parent, and with some Git GUI clients such as SmartGit and Magit. Not all Git frontends support --first-parent, though.
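On the command line, this could look like the following sketch. It builds a throwaway repository (all names are hypothetical), merges a three-commit branch without squashing, and then counts the commits with and without --first-parent.

```shell
# Throwaway repository; every name below is hypothetical.
demo=$(mktemp -d)
cd "$demo"
git init -q
git config user.email "dev@example.com"
git config user.name "Dev"
echo "base" > file.txt
git add file.txt
git commit -q -m "Initial commit"
main=$(git symbolic-ref --short HEAD)   # default branch name varies by setup

# A pull request made of three small commits.
git checkout -q -b feature
for step in 1 2 3; do
  echo "step $step" >> file.txt
  git commit -q -a -m "Feature step $step"
done

# Merge it without squashing; --no-ff forces a merge commit.
git checkout -q "$main"
git merge -q --no-ff -m "Merge pull request: feature" feature

# Count the commits in the full history and in the first-parent history.
full=$(git rev-list --count HEAD)
condensed=$(git rev-list --count --first-parent HEAD)
echo "all commits: $full, first-parent only: $condensed"

# The condensed view: only the commits made directly on the main line.
git log --oneline --first-parent
```

The single steps are still stored and reachable with a plain git log. The --first-parent view just leaves them out.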
“[T]he problem isn’t the extra information: it’s that the information isn’t displayed in a way that shows them what they’re interested in” (David Chudzicki). In other words, making the history clean is mostly a matter of data display, not data collection. You can store all the details and still be able to only show the merge commits.
That’s why Git provides the option --first-parent in the first place.
Ditto. Who cares?
You will probably regret having squashed the history the next time you troubleshoot.
git bisect is your best friend when searching for the commit that introduced a bug: if the devs stick to the good habit of committing early, often and small, git bisect has every chance of pointing you to a tiny commit, and therefore to the very few lines of code containing the issue.
With squashed and large commits, you are left alone troubleshooting by hand.
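As an illustration, here is a minimal sketch of a bisect session on a throwaway repository with eight small commits. The file name, the BUG marker and the grep check are assumptions made up for the demo. In a real project the check would be your test suite.

```shell
# Throwaway repository; every name below is hypothetical.
demo=$(mktemp -d)
cd "$demo"
git init -q
git config user.email "dev@example.com"
git config user.name "Dev"

# Eight small commits; the fifth one introduces the bug.
for n in 1 2 3 4 5 6 7 8; do
  if [ "$n" -eq 5 ]; then
    echo "BUG" >> app.txt
  else
    echo "line $n" >> app.txt
  fi
  git add app.txt
  git commit -q -m "Commit $n"
done

# Mark the endpoints: HEAD is bad, the root commit is good.
good=$(git rev-list --max-parents=0 HEAD)
git bisect start HEAD "$good" > /dev/null

# Let Git drive the search. The check exits 0 for a good commit
# and 1 for a bad one.
git bisect run sh -c '! grep -q BUG app.txt' > /dev/null

# While the session is open, refs/bisect/bad points at the first bad commit.
culprit=$(git log -1 --format=%s refs/bisect/bad)
echo "First bad commit: $culprit"

git bisect reset > /dev/null 2>&1
```

With small commits, the culprit is a one-line change. Had the eight commits been squashed into one, bisect could only have pointed at the whole lump.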
The reviewer cares about the net effect of the PR, not about the half-implemented commits, the broken ones that do not even compile, the fixed typos, the amendments and the like.
Don’t commit broken code in the first place.
Conscientious developers do review their work before submitting a pull request, and each and every one of their commits builds, has green tests and is potentially deployable.
Git offers scrupulous developers all the tools for tidying up their commits:
commit --amend and fixup for amending commits
rebase --interactive for deleting, reordering, squashing commits
There is really no excuse for pushing a pull request with non-compiling commits.
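As a sketch of how these tools fit together, the following throwaway session fixes a typo in an older commit with commit --fixup and folds the fix into place with rebase --interactive --autosquash. File names and messages are hypothetical. Setting GIT_SEQUENCE_EDITOR=true is only a trick to accept the generated todo list without opening an editor, so the example runs unattended.

```shell
# Throwaway repository; every name below is hypothetical.
demo=$(mktemp -d)
cd "$demo"
git init -q
git config user.email "dev@example.com"
git config user.name "Dev"
echo "demo" > README.txt
git add README.txt
git commit -q -m "Initial commit"
base=$(git rev-parse HEAD)

# A commit containing a typo.
echo "helo" > greeting.txt
git add greeting.txt
git commit -q -m "Add greeting"
target=$(git rev-parse HEAD)

# Unrelated work on top of it.
echo "v1" > notes.txt
git add notes.txt
git commit -q -m "Add notes"

# Fix the typo and record the fix as a fixup of the commit that caused it.
echo "hello" > greeting.txt
git commit -q -a --fixup="$target"

# Fold the fixup into its target; the todo list is accepted as generated.
GIT_SEQUENCE_EDITOR=true git rebase -q -i --autosquash "$base"

commits=$(git rev-list --count HEAD)
fixed=$(git show HEAD~1:greeting.txt)
git log --oneline
```

After the rebase the history holds three commits again, and "Add greeting" contains the corrected text as if the typo had never been committed.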
If the policy can be read as:
Don't worry, no matter the mess, all your commits will be squashed into one
you can be sure that no one will break their back to avoid the mess.
I have seen this happen: mandatory squashing rules eventually translate into tolerated sloppy habits.
If you don’t squash, all those commits will knock Git down!
Reducing the Scala repository (38,098 commits) to one (1) single commit saves just 47% of space:
Try it yourself:
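# Clone the full repository and pack it as tightly as possible
repo=scala
squashed=${repo}-squashed
rm -fr "${repo}" "${squashed}"
git clone "https://github.com/scala/${repo}.git"
(cd "${repo}" && git gc --prune --aggressive)

# Build a second repository holding only the latest tree, as one single commit
mkdir "${squashed}"
cd "${squashed}"
git init
git fetch --depth=1 -n "../${repo}"
git reset --hard "$(git commit-tree 'FETCH_HEAD^{tree}' -m "initial commit")"
git gc --prune --aggressive
cd ..

# Compare the size of the two repositories on disk
du -sh "${repo}" "${squashed}"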
All valid arguments. But the reality speaks for itself. This is the Scala repository, which does not use squashing:
Look instead at how TypeScript went from not squashing (on the left) to squashing (on the right):
In all honesty, if the alternative to squashing is having horrible Git histories like those, I’m all for squashing.
But there’s a reason why they are so convoluted: in those repositories PRs are merged without rebase. When PRs are rebased before merging, the result is like the Haskell Cabal’s repository:
With a sane and disciplined workflow, it’s not hard to have both all the details and a clean history.
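As a rough sketch of what rebasing before merging can look like (one possible workflow, assumed here for illustration, with hypothetical branch and file names), the session below lets the main branch move ahead, rebases the feature branch onto it, and only then merges with a merge commit.

```shell
# Throwaway repository; every name below is hypothetical.
demo=$(mktemp -d)
cd "$demo"
git init -q
git config user.email "dev@example.com"
git config user.name "Dev"
echo "demo" > README.txt
git add README.txt
git commit -q -m "Initial commit"
main=$(git symbolic-ref --short HEAD)   # default branch name varies by setup

# A pull request with two commits.
git checkout -q -b feature
for step in 1 2; do
  echo "step $step" >> feature.txt
  git add feature.txt
  git commit -q -m "Feature step $step"
done

# Meanwhile, the main branch moves ahead.
git checkout -q "$main"
echo "hotfix" > hotfix.txt
git add hotfix.txt
git commit -q -m "Hotfix on main"

# Rebase the PR onto the tip of main, then merge with a merge commit.
git checkout -q feature
git rebase -q "$main"
git checkout -q "$main"
git merge -q --no-ff -m "Merge pull request: feature" feature

# The branch now forks exactly from the merge commit's first parent.
fork_point=$(git merge-base HEAD^1 HEAD^2)
first_parent=$(git rev-parse HEAD^1)
git log --graph --oneline
```

Because each branch forks from the commit it is merged onto, the graph shows one self-contained bump per pull request instead of long crossing lines.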
But this deserves a separate article.