===================
Bisecting LLVM code
===================

Introduction
============

``git bisect`` is a useful tool for finding which revision caused a bug.

This document describes how to use ``git bisect``. In particular, while LLVM
has a mostly linear history, it has a few merge commits that added projects --
and these merged the linear history of those projects. As a consequence, the
LLVM repository has multiple roots: one "normal" root, and then one for each
top-level project that was developed out-of-tree and then merged later.
As of early 2020, the only such merged project is MLIR, but flang will likely
be merged in a similar way soon.

Basic operation
===============

See https://git-scm.com/docs/git-bisect for a good overview. In summary:

  .. code-block:: bash

    git bisect start
    git bisect bad main
    git bisect good f00ba

Git will check out a revision in between. Try to reproduce your problem at
that revision, and run ``git bisect good`` or ``git bisect bad``.

If you can't test the current commit (maybe the build is broken), run
``git bisect skip`` and Git will pick a nearby alternate commit.

(To abort a bisect, run ``git bisect reset``, and if Git complains about not
being able to reset, do the usual ``git checkout -f main; git reset --hard
origin/main`` dance and try again.)

``git bisect run``
==================

A single bisect step often requires first building clang, and then compiling
a large code base with just-built clang. This can take a long time, so it's
good if it can happen completely automatically. ``git bisect run`` can do
this for you if you write a run script that reproduces the problem
automatically. Writing the script can take 10-20 minutes, but it's almost
always worth it -- you can do something else while the bisect runs (such
as writing this document).

Here's an example run script.
It assumes that you're in ``llvm-project`` and
that you have a sibling ``llvm-build-project`` build directory where you
configured CMake to use Ninja. You have a file ``repro.c`` in the current
directory that makes clang crash at trunk, but it worked fine at revision
``f00ba``.

  .. code-block:: bash

    # Build clang. If the build fails, `exit 125` causes this
    # revision to be skipped
    ninja -C ../llvm-build-project clang || exit 125

    ../llvm-build-project/bin/clang repro.c

To make sure your run script works, it's a good idea to run ``./run.sh`` by
hand and tweak the script until it works, then run ``git bisect good`` or
``git bisect bad`` manually once based on the result of the script
(check ``echo $?`` after your script has run), and only then run ``git bisect
run ./run.sh``. Don't forget to mark your run script as executable -- ``git
bisect run`` doesn't check for that; it just assumes the run script failed
each time.

Once your run script works, run ``git bisect run ./run.sh`` and a few hours
later you'll know which commit caused the regression.

(This is a very simple run script. Often, you want to use just-built clang
to build a different project and then run a built executable of that project
in the run script.)

Bisecting across multiple roots
===============================

Here's how LLVM's history currently looks:

  .. code-block:: none

    A-o-o-......-o-D-o-o-HEAD
                  /
      B-o-...-o-C-

``A`` is the first commit in LLVM ever, ``97724f18c79c``.

``B`` is the first commit in MLIR, ``aed0d21a62db``.

``D`` is the merge commit that merged MLIR into the main LLVM repository,
``0f0d0ed1c78f``.

``C`` is the last commit in MLIR before it got merged, ``0f0d0ed1c78f^2``. (The
``^n`` modifier selects the n-th parent of a merge commit.)

``git bisect`` goes through all parent revisions.
Due to the way MLIR was
merged, at every revision at ``C`` or earlier, *only* the ``mlir/`` directory
exists, and nothing else does.

As of early 2020, there is no flag to ``git bisect`` to tell it not to
descend into all reachable commits. Ideally, we'd want to tell it to only
follow the first parent of ``D``.

The best workaround is to pass a list of directories to ``git bisect``:
if you know the bug is due to a change in llvm, clang, or compiler-rt, use

  .. code-block:: bash

    git bisect start -- clang llvm compiler-rt

That way, the commits in ``mlir`` are never evaluated.

Alternatively, ``git bisect skip aed0d21a6 aed0d21a6..0f0d0ed1c78f`` explicitly
skips all commits on that branch. It takes 1.5 minutes to run on a fast
machine, and makes ``git bisect log`` output unreadable. (``aed0d21a6`` is
listed twice because git ranges exclude the revision listed on the left,
so it needs to be ignored explicitly.)

More Resources
==============

https://git-scm.com/book/en/v2/Git-Tools-Revision-Selection
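
Sketch: normalizing run script exit codes
=========================================

As a complement to the run script in the ``git bisect run`` section, here is a
minimal sketch, under stated assumptions, of a script that maps every outcome
to an exit code ``git bisect run`` understands: 0 means good, 125 means skip,
any other code from 1 to 127 means bad, and anything else aborts the bisect.
Collapsing failures to 1 keeps an unexpected status (for example from a
process killed by a signal) from aborting the run. The helper names
``to_bisect_status`` and ``run_bisect_step`` and the ``BUILD_DIR`` and
``REPRO`` variables are hypothetical; the default paths are the ones assumed
in that section.

  .. code-block:: bash

    #!/bin/sh

    # Hypothetical helper: run a command and collapse its exit status
    # to 0 (good) or 1 (bad).
    to_bisect_status() {
      if "$@"; then
        return 0
      else
        return 1
      fi
    }

    # Assumed layout: a sibling build directory and a repro file in the
    # current directory. Both can be overridden from the environment.
    BUILD_DIR=${BUILD_DIR:-../llvm-build-project}
    REPRO=${REPRO:-repro.c}

    # Hypothetical helper: one bisect step.
    run_bisect_step() {
      # Status 125 asks git bisect to skip a revision that does not build.
      ninja -C "$BUILD_DIR" clang || return 125
      to_bisect_status "$BUILD_DIR/bin/clang" "$REPRO"
    }

To use this as ``run.sh``, call ``run_bisect_step`` on the last line of the
script; the script then exits with that function's status.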