1===================
2Variable Names Plan
3===================
4
5.. contents::
6   :local:
7
8This plan is *provisional*. It is not agreed upon. It is written with the
9intention of capturing the desires and concerns of the LLVM community, and
10forming them into a plan that can be agreed upon.
11The original author is somewhat naïve in the ways of LLVM so there will
12inevitably be some details that are flawed. You can help - you can edit this
13page (preferably with a Phabricator review for larger changes) or reply to the
14`Request For Comments thread
15<http://lists.llvm.org/pipermail/llvm-dev/2019-February/130083.html>`_.
16
17Too Long; Didn't Read
18=====================
19
20Improve the readability of LLVM code.
21
22Introduction
23============
24
25The current `variable naming rule
26<../CodingStandards.html#name-types-functions-variables-and-enumerators-properly>`_
27states:
28
29  Variable names should be nouns (as they represent state). The name should be
30  camel case, and start with an upper case letter (e.g. Leader or Boats).
31
32This rule is the same as that for type names. This is a problem because the
33type name cannot be reused for a variable name [*]_. LLVM developers tend to
34work around this by either prepending ``The`` to the type name::
35
36  Triple TheTriple;
37
38... or more commonly use an acronym, despite the coding standard stating "Avoid
39abbreviations unless they are well known"::
40
41  Triple T;
42
43The proliferation of acronyms leads to hard-to-read code such as `this
44<https://github.com/llvm/llvm-project/blob/0a8bc14ad7f3209fe702d18e250194cd90188596/llvm/lib/Transforms/Vectorize/LoopVectorize.cpp#L7445>`_::
45
46  InnerLoopVectorizer LB(L, PSE, LI, DT, TLI, TTI, AC, ORE, VF.Width, IC,
47                         &LVL, &CM);
48
49Many other coding guidelines [LLDB]_ [Google]_ [WebKit]_ [Qt]_ [Rust]_ [Swift]_
50[Python]_ require that variable names begin with a lower case letter in contrast
51to class names which begin with a capital letter. This convention means that the
52most readable variable name also requires the least thought::
53
54  Triple triple;
55
56There is some agreement that the current rule is broken [LattnerAgree]_
57[ArsenaultAgree]_ [RobinsonAgree]_ and that acronyms are an obstacle to reading
58new code [MalyutinDistinguish]_ [CarruthAcronym]_ [PicusAcronym]_. There are
59some opposing views [ParzyszekAcronym2]_ [RicciAcronyms]_.
60
61This work-in-progress proposal is to change the coding standard for variable
62names to require that they start with a lower case letter.
63
64.. [*] In `some cases
65   <https://github.com/llvm/llvm-project/blob/8b72080d4d7b13072f371712eed333f987b7a18e/llvm/lib/CodeGen/SelectionDAG/SelectionDAG.cpp#L2727>`_
66   the type name *is* reused as a variable name, but this shadows the type name
67   and confuses many debuggers [DenisovCamelBack]_.
68
69Variable Names Coding Standard Options
70======================================
71
72There are two main options for variable names that begin with a lower case
73letter: ``camelBack`` and ``lower_case``. (These are also known by other names
74but here we use the terminology from clang-tidy).
75
76``camelBack`` is consistent with [WebKit]_, [Qt]_ and [Swift]_ while
77``lower_case`` is consistent with [LLDB]_, [Google]_, [Rust]_ and [Python]_.
78
79``camelBack`` is already used for function names, which may be considered an
80advantage [LattnerFunction]_ or a disadvantage [CarruthFunction]_.
81
82Approval for ``camelBack`` was expressed by [DenisovCamelBack]_
83[LattnerFunction]_ [IvanovicDistinguish]_.
84Opposition to ``camelBack`` was expressed by [CarruthCamelBack]_
85[TurnerCamelBack]_.
86Approval for ``lower_case`` was expressed by [CarruthLower]_
87[CarruthCamelBack]_ [TurnerLLDB]_.
88Opposition to ``lower_case`` was expressed by [LattnerLower]_.
89
90Differentiating variable kinds
91------------------------------
92
93An additional requested change is to distinguish between different kinds of
94variables [RobinsonDistinguish]_ [RobinsonDistinguish2]_ [JonesDistinguish]_
95[IvanovicDistinguish]_ [CarruthDistinguish]_ [MalyutinDistinguish]_.
96
97Others oppose this idea [HähnleDistinguish]_ [GreeneDistinguish]_
98[HendersonPrefix]_.
99
100A possibility is for member variables to be prefixed with ``m_`` and for global
101variables to be prefixed with ``g_`` to distinguish them from local variables.
102This is consistent with [LLDB]_. The ``m_`` prefix is consistent with [WebKit]_.
103
104A variation is for member variables to be prefixed with ``m``
105[IvanovicDistinguish]_ [BeylsDistinguish]_. This is consistent with [Mozilla]_.
106
107Another option is for member variables to be suffixed with ``_`` which is
108consistent with [Google]_ and similar to [Python]_. Opposed by
109[ParzyszekDistinguish]_.
110
111Reducing the number of acronyms
112===============================
113
114While switching coding standard will make it easier to use non-acronym names for
115new code, it doesn't improve the existing large body of code that uses acronyms
116extensively to the detriment of its readability. Further, it is natural and
117generally encouraged that new code be written in the style of the surrounding
118code. Therefore it is likely that much newly written code will also use
119acronyms despite what the coding standard says, much as it is today.
120
121As well as changing the case of variable names, they could also be expanded to
122their non-acronym form e.g. ``Triple T`` → ``Triple triple``.
123
124There is support for expanding many acronyms [CarruthAcronym]_ [PicusAcronym]_
125but there is a preference that expanding acronyms be deferred
126[ParzyszekAcronym]_ [CarruthAcronym]_.
127
128The consensus within the community seems to be that at least some acronyms are
129valuable [ParzyszekAcronym]_ [LattnerAcronym]_. The most commonly cited acronym
130is ``TLI`` however that is used to refer to both ``TargetLowering`` and
131``TargetLibraryInfo`` [GreeneDistinguish]_.
132
133The following is a list of acronyms considered sufficiently useful that the
134benefit of using them outweighs the cost of learning them. Acronyms that are
135either not on the list or are used to refer to a different type should be
136expanded.
137
138============================ =============
139Class name                   Variable name
140============================ =============
141DeterministicFiniteAutomaton dfa
142DominatorTree                dt
143LoopInfo                     li
144MachineFunction              mf
145MachineInstr                 mi
146MachineRegisterInfo          mri
147ScalarEvolution              se
148TargetInstrInfo              tii
149TargetLibraryInfo            tli
150TargetRegisterInfo           tri
151============================ =============
152
153In some cases renaming acronyms to the full type name will result in overly
154verbose code. Unlike most classes, a variable's scope is limited and therefore
155some of its purpose can implied from that scope, meaning that fewer words are
156necessary to give it a clear name. For example, in an optimization pass the reader
157can assume that a variable's purpose relates to optimization and therefore an
158``OptimizationRemarkEmitter`` variable could be given the name ``remarkEmitter``
159or even ``remarker``.
160
161The following is a list of longer class names and the associated shorter
162variable name.
163
164========================= =============
165Class name                Variable name
166========================= =============
167BasicBlock                block
168ConstantExpr              expr
169ExecutionEngine           engine
170MachineOperand            operand
171OptimizationRemarkEmitter remarker
172PreservedAnalyses         analyses
173PreservedAnalysesChecker  checker
174TargetLowering            lowering
175TargetMachine             machine
176========================= =============
177
178Transition Options
179==================
180
181There are three main options for transitioning:
182
1831. Keep the current coding standard
1842. Laissez faire
1853. Big bang
186
187Keep the current coding standard
188--------------------------------
189
190Proponents of keeping the current coding standard (i.e. not transitioning at
191all) question whether the cost of transition outweighs the benefit
192[EmersonConcern]_ [ReamesConcern]_ [BradburyConcern]_.
193The costs are that ``git blame`` will become less usable; and that merging the
194changes will be costly for downstream maintainers. See `Big bang`_ for potential
195mitigations.
196
197Laissez faire
198-------------
199
200The coding standard could allow both ``CamelCase`` and ``camelBack`` styles for
201variable names [LattnerTransition]_.
202
203A code review to implement this is at https://reviews.llvm.org/D57896.
204
205Advantages
206**********
207
208 * Very easy to implement initially.
209
210Disadvantages
211*************
212
213 * Leads to inconsistency [BradburyConcern]_ [AminiInconsistent]_.
214 * Inconsistency means it will be hard to know at a guess what name a variable
215   will have [DasInconsistent]_ [CarruthInconsistent]_.
216 * Some large-scale renaming may happen anyway, leading to its disadvantages
217   without any mitigations.
218
219Big bang
220--------
221
222With this approach, variables will be renamed by an automated script in a series
223of large commits.
224
225The principle advantage of this approach is that it minimises the cost of
226inconsistency [BradburyTransition]_ [RobinsonTransition]_.
227
228It goes against a policy of avoiding large-scale reformatting of existing code
229[GreeneDistinguish]_.
230
231It has been suggested that LLD would be a good starter project for the renaming
232[Ueyama]_.
233
234Keeping git blame usable
235************************
236
237``git blame`` (or ``git annotate``) permits quickly identifying the commit that
238changed a given line in a file. After renaming variables, many lines will show
239as being changed by that one commit, requiring a further invocation of ``git
240blame`` to identify prior, more interesting commits [GreeneGitBlame]_
241[RicciAcronyms]_.
242
243**Mitigation**: `git-hyper-blame
244<https://commondatastorage.googleapis.com/chrome-infra-docs/flat/depot_tools/docs/html/git-hyper-blame.html>`_
245can ignore or "look through" a given set of commits.
246A ``.git-blame-ignore-revs`` file identifying the variable renaming commits
247could be added to the LLVM git repository root directory.
248It is being `investigated
249<https://public-inbox.org/git/20190324235020.49706-1-michael@platin.gs/>`_
250whether similar functionality could be added to ``git blame`` itself.
251
252Minimising cost of downstream merges
253************************************
254
255There are many forks of LLVM with downstream changes. Merging a large-scale
256renaming change could be difficult for the fork maintainers.
257
258**Mitigation**: A large-scale renaming would be automated. A fork maintainer can
259merge from the commit immediately before the renaming, then apply the renaming
260script to their own branch. They can then merge again from the renaming commit,
261resolving all conflicts by choosing their own version. This could be tested on
262the [SVE]_ fork.
263
264Provisional Plan
265================
266
267This is a provisional plan for the `Big bang`_ approach. It has not been agreed.
268
269#. Investigate improving ``git blame``. The extent to which it can be made to
270   "look through" commits may impact how big a change can be made.
271
272#. Write a script to expand acronyms.
273
274#. Experiment and perform dry runs of the various refactoring options.
275   Results can be published in forks of the LLVM Git repository.
276
277#. Consider the evidence and agree on the new policy.
278
279#. Agree & announce a date for the renaming of the starter project (LLD).
280
281#. Update the `policy page <../CodingStandards.html>`_. This will explain the
282   old and new rules and which projects each applies to.
283
284#. Refactor the starter project in two commits:
285
286   1. Add or change the project's .clang-tidy to reflect the agreed rules.
287      (This is in a separate commit to enable the merging process described in
288      `Minimising cost of downstream merges`_).
289      Also update the project list on the policy page.
290   2. Apply ``clang-tidy`` to the project's files, with only the
291      ``readability-identifier-naming`` rules enabled. ``clang-tidy`` will also
292      reformat the affected lines according to the rules in ``.clang-format``.
293      It is anticipated that this will be a good dog-fooding opportunity for
294      clang-tidy, and bugs should be fixed in the process, likely including:
295
296        * `readability-identifier-naming incorrectly fixes lambda capture
297          <https://bugs.llvm.org/show_bug.cgi?id=41119>`_.
298        * `readability-identifier-naming incorrectly fixes variables which
299          become keywords <https://bugs.llvm.org/show_bug.cgi?id=41120>`_.
300        * `readability-identifier-naming misses fixing member variables in
301          destructor <https://bugs.llvm.org/show_bug.cgi?id=41122>`_.
302
303#. Gather feedback and refine the process as appropriate.
304
305#. Apply the process to the following projects, with a suitable delay between
306   each (at least 4 weeks after the first change, at least 2 weeks subsequently)
307   to allow gathering further feedback.
308   This list should exclude projects that must adhere to an externally defined
309   standard e.g. libcxx.
310   The list is roughly in chronological order of renaming.
311   Some items may not make sense to rename individually - it is expected that
312   this list will change following experimentation:
313
314   * TableGen
315   * llvm/tools
316   * clang-tools-extra
317   * clang
318   * ARM backend
319   * AArch64 backend
320   * AMDGPU backend
321   * ARC backend
322   * AVR backend
323   * BPF backend
324   * Hexagon backend
325   * Lanai backend
326   * MIPS backend
327   * NVPTX backend
328   * PowerPC backend
329   * RISC-V backend
330   * Sparc backend
331   * SystemZ backend
332   * WebAssembly backend
333   * X86 backend
334   * XCore backend
335   * libLTO
336   * Debug Information
337   * Remainder of llvm
338   * compiler-rt
339   * libunwind
340   * openmp
341   * parallel-libs
342   * polly
343   * lldb
344
345#. Remove the old variable name rule from the policy page.
346
347#. Repeat many of the steps in the sequence, using a script to expand acronyms.
348
349References
350==========
351
352.. [LLDB] LLDB Coding Conventions https://llvm.org/svn/llvm-project/lldb/branches/release_39/www/lldb-coding-conventions.html
353.. [Google] Google C++ Style Guide https://google.github.io/styleguide/cppguide.html#Variable_Names
354.. [WebKit] WebKit Code Style Guidelines https://webkit.org/code-style-guidelines/#names
355.. [Qt] Qt Coding Style https://wiki.qt.io/Qt_Coding_Style#Declaring_variables
356.. [Rust] Rust naming conventions https://doc.rust-lang.org/1.0.0/style/style/naming/README.html
357.. [Swift] Swift API Design Guidelines https://swift.org/documentation/api-design-guidelines/#general-conventions
358.. [Python] Style Guide for Python Code https://www.python.org/dev/peps/pep-0008/#function-and-variable-names
359.. [Mozilla] Mozilla Coding style: Prefixes https://firefox-source-docs.mozilla.org/tools/lint/coding-style/coding_style_cpp.html#prefixes
360.. [SVE] LLVM with support for SVE https://github.com/ARM-software/LLVM-SVE
361.. [AminiInconsistent] Mehdi Amini, http://lists.llvm.org/pipermail/llvm-dev/2019-February/130329.html
362.. [ArsenaultAgree] Matt Arsenault, http://lists.llvm.org/pipermail/llvm-dev/2019-February/129934.html
363.. [BeylsDistinguish] Kristof Beyls, http://lists.llvm.org/pipermail/llvm-dev/2019-February/130292.html
364.. [BradburyConcern] Alex Bradbury, http://lists.llvm.org/pipermail/llvm-dev/2019-February/130266.html
365.. [BradburyTransition] Alex Bradbury, http://lists.llvm.org/pipermail/llvm-dev/2019-February/130388.html
366.. [CarruthAcronym] Chandler Carruth, http://lists.llvm.org/pipermail/llvm-dev/2019-February/130313.html
367.. [CarruthCamelBack] Chandler Carruth, http://lists.llvm.org/pipermail/llvm-dev/2019-February/130214.html
368.. [CarruthDistinguish] Chandler Carruth, http://lists.llvm.org/pipermail/llvm-dev/2019-February/130310.html
369.. [CarruthFunction] Chandler Carruth, http://lists.llvm.org/pipermail/llvm-dev/2019-February/130309.html
370.. [CarruthInconsistent] Chandler Carruth, http://lists.llvm.org/pipermail/llvm-dev/2019-February/130312.html
371.. [CarruthLower] Chandler Carruth, http://lists.llvm.org/pipermail/llvm-dev/2019-February/130430.html
372.. [DasInconsistent] Sanjoy Das, http://lists.llvm.org/pipermail/llvm-dev/2019-February/130304.html
373.. [DenisovCamelBack] Alex Denisov, http://lists.llvm.org/pipermail/llvm-dev/2019-February/130179.html
374.. [EmersonConcern] Amara Emerson, http://lists.llvm.org/pipermail/llvm-dev/2019-February/129894.html
375.. [GreeneDistinguish] David Greene, http://lists.llvm.org/pipermail/llvm-dev/2019-February/130425.html
376.. [GreeneGitBlame] David Greene, http://lists.llvm.org/pipermail/llvm-dev/2019-February/130228.html
377.. [HendersonPrefix] James Henderson, http://lists.llvm.org/pipermail/llvm-dev/2019-February/130465.html
378.. [HähnleDistinguish] Nicolai Hähnle, http://lists.llvm.org/pipermail/llvm-dev/2019-February/129923.html
379.. [IvanovicDistinguish] Nemanja Ivanovic, http://lists.llvm.org/pipermail/llvm-dev/2019-February/130249.html
380.. [JonesDistinguish] JD Jones, http://lists.llvm.org/pipermail/llvm-dev/2019-February/129926.html
381.. [LattnerAcronym] Chris Lattner, http://lists.llvm.org/pipermail/llvm-dev/2019-February/130353.html
382.. [LattnerAgree] Chris Latter, http://lists.llvm.org/pipermail/llvm-dev/2019-February/129907.html
383.. [LattnerFunction] Chris Lattner, http://lists.llvm.org/pipermail/llvm-dev/2019-February/130630.html
384.. [LattnerLower] Chris Lattner, http://lists.llvm.org/pipermail/llvm-dev/2019-February/130629.html
385.. [LattnerTransition] Chris Lattner, http://lists.llvm.org/pipermail/llvm-dev/2019-February/130355.html
386.. [MalyutinDistinguish] Danila Malyutin, http://lists.llvm.org/pipermail/llvm-dev/2019-February/130320.html
387.. [ParzyszekAcronym] Krzysztof Parzyszek, http://lists.llvm.org/pipermail/llvm-dev/2019-February/130306.html
388.. [ParzyszekAcronym2] Krzysztof Parzyszek, http://lists.llvm.org/pipermail/llvm-dev/2019-February/130323.html
389.. [ParzyszekDistinguish] Krzysztof Parzyszek, http://lists.llvm.org/pipermail/llvm-dev/2019-February/129941.html
390.. [PicusAcronym] Diana Picus, http://lists.llvm.org/pipermail/llvm-dev/2019-February/130318.html
391.. [ReamesConcern] Philip Reames, http://lists.llvm.org/pipermail/llvm-dev/2019-February/130181.html
392.. [RicciAcronyms] Bruno Ricci, http://lists.llvm.org/pipermail/llvm-dev/2019-February/130328.html
393.. [RobinsonAgree] Paul Robinson, http://lists.llvm.org/pipermail/llvm-dev/2019-February/130111.html
394.. [RobinsonDistinguish] Paul Robinson, http://lists.llvm.org/pipermail/llvm-dev/2019-February/129920.html
395.. [RobinsonDistinguish2] Paul Robinson, http://lists.llvm.org/pipermail/llvm-dev/2019-February/130229.html
396.. [RobinsonTransition] Paul Robinson, http://lists.llvm.org/pipermail/llvm-dev/2019-February/130415.html
397.. [TurnerCamelBack] Zachary Turner, https://reviews.llvm.org/D57896#1402264
398.. [TurnerLLDB] Zachary Turner, http://lists.llvm.org/pipermail/llvm-dev/2019-February/130213.html
399.. [Ueyama] Rui Ueyama, http://lists.llvm.org/pipermail/llvm-dev/2019-February/130435.html
400