Bisecting LLVM code
Introduction
git bisect is a useful tool for finding which revision caused a bug.
This document describes how to use git bisect. In particular, while LLVM
has a mostly linear history, it has a few merge commits that added projects --
and these merged the linear history of those projects. As a consequence, the
LLVM repository has multiple roots: One "normal" root, and then one for each
toplevel project that was developed out-of-tree and then merged later.
As of early 2020, the only such merged project is MLIR, but flang will likely
be merged in a similar way soon.
Basic operation
See https://git-scm.com/docs/git-bisect for a good overview. In summary:
    git bisect start
    git bisect bad master
    git bisect good f00ba
git will check out a revision in between. Try to reproduce your problem at
that revision, and run git bisect good or git bisect bad.
If you can't repro at the current commit (maybe the build is broken), run
git bisect skip and git will pick a nearby alternate commit.
(To abort a bisect, run git bisect reset. If git complains about not
being able to reset, do the usual git checkout -f master; git reset --hard
origin/master dance and try again.)
git bisect run
A single bisect step often requires first building clang, and then compiling
a large code base with just-built clang. This can take a long time, so it's
good if it can happen completely automatically. git bisect run can do
this for you if you write a run script that reproduces the problem
automatically. Writing the script can take 10-20 minutes, but it's almost
always worth it -- you can do something else while the bisect runs (such
as writing this document).
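To get a feel for the mechanics before committing to hours of clang builds, the following sketch bisects a throwaway repository. Everything in it is invented for illustration: the temporary directory, the file name value.txt, the ten commits, and the choice of commit 7 as the one that introduces the "bug". Only git itself is assumed to be installed.

```shell
#!/bin/sh
set -e

# Scratch repository (hypothetical; unrelated to llvm-project).
repo=$(mktemp -d)
cd "$repo"
git init -q .
git config user.email "bisect-demo@example.com"
git config user.name "Bisect Demo"

# Ten commits; from commit 7 onward value.txt says "bad" instead of "good".
for i in 1 2 3 4 5 6 7 8 9 10; do
    if [ "$i" -ge 7 ]; then state=bad; else state=good; fi
    echo "$i $state" > value.txt
    git add value.txt
    git commit -q -m "commit $i"
done
expected=$(git rev-parse HEAD~3)   # commit 7, three steps behind commit 10

# HEAD is known bad, the root commit is known good.
root=$(git rev-list --max-parents=0 HEAD)
git bisect start HEAD "$root" >/dev/null

# grep exits 0 while the file still says "good" and non-zero otherwise,
# which is exactly the good/bad signal git bisect run expects.
git bisect run grep -q good value.txt >/dev/null

# refs/bisect/bad points at the first bad commit until the bisect is reset.
first_bad=$(git rev-parse refs/bisect/bad)
git bisect reset >/dev/null 2>&1

git log -1 --format='first bad: %s' "$first_bad"
```

Here the "run script" is a single grep; for real regressions it is replaced by a script that builds and tests the tree.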
Here's an example run script. It assumes that you're in llvm-project and
that you have a sibling llvm-build-project build directory where you
configured CMake to use Ninja. You have a file repro.c in the current
directory that makes clang crash at trunk, but it worked fine at revision
f00ba.
    # Build clang. If the build fails, `exit 125` causes this
    # revision to be skipped
    ninja -C ../llvm-build-project clang || exit 125

    ../llvm-build-project/bin/clang repro.c
To make sure your run script works, it's a good idea to run ./run.sh by
hand and tweak the script until it works. Then run git bisect good or
git bisect bad manually once, based on the result of the script
(check echo $? after your script has run), and only then run
git bisect run ./run.sh. Don't forget to mark your run script as
executable: git bisect run doesn't check for that, it just assumes the
run script failed each time.
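When checking echo $? by hand, it helps to know how git bisect run reads the exit status: 0 means good, 125 means skip, any other value from 1 to 127 means bad, and anything else aborts the bisect. The helper below prints that interpretation for a given status. The name bisect_verdict is made up for this sketch and is not a git command.

```shell
#!/bin/sh
# bisect_verdict STATUS: print how `git bisect run` would treat an exit status.
# Hypothetical helper; the mapping follows the git-bisect manual.
bisect_verdict() {
    status=$1
    if [ "$status" -eq 0 ]; then
        echo good
    elif [ "$status" -eq 125 ]; then
        echo skip
    elif [ "$status" -ge 1 ] && [ "$status" -le 127 ]; then
        echo bad
    else
        # 128 and above, e.g. the status a shell reports for a
        # process that was killed by a signal.
        echo abort
    fi
}
```

A typical manual check would be ./run.sh; bisect_verdict $? at a revision whose state you already know.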
Once your run script works, run git bisect run ./run.sh and a few hours
later you'll know which commit caused the regression.
(This is a very simple run script. Often, you want to use just-built clang to build a different project and then run a built executable of that project in the run script.)
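A sketch of such a multi-stage script follows. Apart from ../llvm-build-project, every name in it is an assumption made for illustration: the ../other-project checkout, its Makefile that honors CC, and the tests/regression executable. Build failures are reported as 125 so the revision is skipped, and any failure of the final check is folded into 1, so that a status above 127 (for example from a crashing test binary) marks the revision bad instead of aborting the bisect.

```shell
#!/bin/sh
# Hypothetical locations; adjust to your setup.
LLVM_BUILD=${LLVM_BUILD:-../llvm-build-project}
PROJECT=${PROJECT:-../other-project}

# Run a setup command; report 125 ("skip this revision") if it fails.
step_or_skip() {
    "$@" || return 125
}

# Run the actual check; fold every failure into 1 ("bad").
check() {
    "$@" || return 1
}

bisect_step() {
    # Stage 1: build clang at the revision under test.
    step_or_skip ninja -C "$LLVM_BUILD" clang || return
    # Stage 2: build the other project with the just-built clang.
    step_or_skip make -C "$PROJECT" CC="$PWD/$LLVM_BUILD/bin/clang" || return
    # Stage 3: run an executable of that project that shows the problem.
    check "$PROJECT/tests/regression"
}

# In a real run script, end the file with:
#   bisect_step
# so that the script's exit status is the status of bisect_step.
```

Keeping the stages in functions makes it easy to try each one by hand before handing the script to git bisect run.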
Bisecting across multiple roots
Here's how LLVM's history currently looks:
    A-o-o-......-o-D-o-o-HEAD
                  /
      B-o-...-o-C-
A is the first commit in LLVM ever, 97724f18c79c.
B is the first commit in MLIR, aed0d21a62db.
D is the merge commit that merged MLIR into the main LLVM repository, 0f0d0ed1c78f.
C is the last commit in MLIR before it got merged, 0f0d0ed1c78f^2. (The ^n modifier selects the n'th parent of a merge commit.)
git bisect goes through all parent revisions. Due to the way MLIR was
merged, at every revision at C or earlier, only the mlir/ directory
exists, and nothing else does.
As of early 2020, there is no flag to git bisect to tell it to not
descend into all reachable commits. Ideally, we'd want to tell it to only
follow the first parent of D.
The best workaround is to pass a list of directories to git bisect:
If you know the bug is due to a change in llvm, clang, or compiler-rt, use
    git bisect start -- clang llvm compiler-rt
That way, the commits in mlir are never evaluated.
Alternatively, git bisect skip aed0d21a6 aed0d21a6..0f0d0ed1c78f explicitly
skips all commits on that branch. It takes 1.5 minutes to run on a fast
machine, and makes git bisect log output unreadable. (aed0d21a6 is
listed twice because git ranges exclude the revision listed on the left,
so it needs to be ignored explicitly.)
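The shape of this history, and the effect of limiting the walk to certain directories, can be reproduced in a throwaway repository. The sketch below is illustrative only: the commits, the file names, and the branch name side are invented, and only the directory names llvm and mlir echo the real layout. It builds a two-root history like the diagram above, then compares what git rev-list visits by default, along first parents only, and when limited to llvm.

```shell
#!/bin/sh
set -e

# Scratch repository (hypothetical; unrelated to llvm-project).
repo=$(mktemp -d)
cd "$repo"
git init -q .
git config user.email "bisect-demo@example.com"
git config user.name "Bisect Demo"

# Main line: root commit A plus one more commit, both touching llvm/.
mkdir llvm
echo 1 > llvm/file; git add llvm; git commit -q -m "A"
echo 2 > llvm/file; git commit -q -am "main 2"
main_branch=$(git symbolic-ref --short HEAD)

# Side project with its own root B and a last commit C, touching only mlir/.
git checkout -q --orphan side
git rm -rfq .
mkdir mlir
echo 1 > mlir/file; git add mlir; git commit -q -m "B"
echo 2 > mlir/file; git commit -q -am "C"
side_tip=$(git rev-parse HEAD)

# Merge commit D joins the two unrelated histories.
git checkout -q "$main_branch"
git merge -q --allow-unrelated-histories -m "D" side

roots=$(git rev-list --count --max-parents=0 HEAD)    # A and B
all=$(git rev-list --count HEAD)                      # A, main 2, B, C, D
first_parent=$(git rev-list --count --first-parent HEAD)  # D, main 2, A
llvm_only=$(git rev-list --count HEAD -- llvm)        # A, main 2
second_parent=$(git rev-parse HEAD^2)                 # C, the tip of side

echo "roots=$roots all=$all first_parent=$first_parent llvm_only=$llvm_only"
```

git bisect start -- llvm relies on the same path limiting, which is why the side project's commits are never offered as candidates. Git 2.29 and later reportedly also accept git bisect start --first-parent, which would follow only the first parent of D; check the git bisect documentation for your version before relying on it.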