1Adding a New Linter to the Tree
2===============================
3
4Linter Requirements
5-------------------
6
7For a linter to be integrated into the mozilla-central tree, it needs to have:
8
9* Any required dependencies should be installed as part of ``./mach bootstrap``
10* A ``./mach lint`` interface
11* Running ``./mach lint`` command must pass (note, linters can be disabled for individual directories)
12* Taskcluster/Treeherder integration
13* In tree documentation (under ``docs/code-quality/lint``) to give a basic summary, links and any other useful information
14* Unit tests (under ``tools/lint/test``) to make sure that the linter works as expected and we don't regress.
15
16The review group in Phabricator is ``#linter-reviewers``.
17
18Linter Basics
19-------------
20
21A linter is a yaml file with a ``.yml`` extension. Depending on how the type of linter, there may
22be python code alongside the definition, pointed to by the 'payload' attribute.
23
24Here's a trivial example:
25
26no-eval.yml
27
28.. code-block:: yaml
29
30    EvalLinter:
31        description: Ensures the string eval doesn't show up.
32        extensions: ['js']
33        type: string
34        payload: eval
35
36Now ``no-eval.yml`` gets passed into :func:`LintRoller.read`.
37
38
39Linter Types
40------------
41
42There are four types of linters, though more may be added in the future.
43
441. string - fails if substring is found
452. regex - fails if regex matches
463. external - fails if a python function returns a non-empty result list
474. structured_log - fails if a mozlog logger emits any lint_error or lint_warning log messages
48
49As seen from the example above, string and regex linters are very easy to create, but they
50should be avoided if possible. It is much better to use a context aware linter for the language you
51are trying to lint. For example, use eslint to lint JavaScript files, use flake8 to lint python
52files, etc.
53
54Which brings us to the third and most interesting type of linter,
55external.  External linters call an arbitrary python function which is
56responsible for not only running the linter, but ensuring the results
57are structured properly. For example, an external type could shell out
58to a 3rd party linter, collect the output and format it into a list of
59:class:`Issue` objects. The signature for this python
60function is ``lint(files, config, **kwargs)``, where ``files`` is a list of
61files to lint and ``config`` is the linter definition defined in the ``.yml``
62file.
63
64Structured log linters are much like external linters, but suitable
65for cases where the linter code is using mozlog and emits
66``lint_error`` or ``lint_warning`` logging messages when the lint
67fails. This is recommended for writing novel gecko-specific lints. In
68this case the signature for lint functions is ``lint(files, config, logger,
69**kwargs)``.
70
71
72Linter Definition
73-----------------
74
75Each ``.yml`` file must have at least one linter defined in it. Here are the supported keys:
76
77* description - A brief description of the linter's purpose (required)
78* type - One of 'string', 'regex' or 'external' (required)
79* payload - The actual linting logic, depends on the type (required)
80* include - A list of file paths that will be considered (optional)
81* exclude - A list of file paths or glob patterns that must not be matched (optional)
82* extensions - A list of file extensions to be considered (optional)
83* setup - A function that sets up external dependencies (optional)
84* support-files - A list of glob patterns matching configuration files (optional)
85* find-dotfiles - If set to ``true``, run on dot files (.*) (optional)
86
87In addition to the above, some ``.yml`` files correspond to a single lint rule. For these, the
88following additional keys may be specified:
89
90* message - A string to print on infraction (optional)
91* hint - A string with a clue on how to fix the infraction (optional)
92* rule - An id string for the lint rule (optional)
93* level - The severity of the infraction, either 'error' or 'warning' (optional)
94
95For structured_log lints the following additional keys apply:
96
97* logger - A StructuredLog object to use for logging. If not supplied
98  one will be created (optional)
99
100
101Example
102-------
103
104Here is an example of an external linter that shells out to the python flake8 linter,
105let's call the file ``flake8_lint.py`` (`in-tree version <https://searchfox.org/mozilla-central/source/tools/lint/python/flake8.py>`_):
106
107.. code-block:: python
108
109    import json
110    import os
111    import subprocess
112    from collections import defaultdict
113
114    from mozlint import result
115
116    try:
117        from shutil import which
118    except ImportError:
119        from shutil_which import which
120
121
122    FLAKE8_NOT_FOUND = """
123    Could not find flake8! Install flake8 and try again.
124    """.strip()
125
126
127    def lint(files, config, **lintargs):
128        binary = os.environ.get('FLAKE8')
129        if not binary:
130            binary = which('flake8')
131            if not binary:
132                print(FLAKE8_NOT_FOUND)
133                return 1
134
135        # Flake8 allows passing in a custom format string. We use
136        # this to help mold the default flake8 format into what
137        # mozlint's Issue object expects.
138        cmdargs = [
139            binary,
140            '--format',
141            '{"path":"%(path)s","lineno":%(row)s,"column":%(col)s,"rule":"%(code)s","message":"%(text)s"}',
142        ] + files
143
144        proc = subprocess.Popen(cmdargs, stdout=subprocess.PIPE, env=os.environ)
145        output = proc.communicate()[0]
146
147        # all passed
148        if not output:
149            return []
150
151        results = []
152        for line in output.splitlines():
153            # res is a dict of the form specified by --format above
154            res = json.loads(line)
155
156            # parse level out of the id string
157            if 'code' in res and res['code'].startswith('W'):
158                res['level'] = 'warning'
159
160            # result.from_linter is a convenience method that
161            # creates a Issue using a LINTER definition
162            # to populate some defaults.
163            results.append(result.from_config(config, **res))
164
165        return results
166
167Now here is the linter definition that would call it:
168
169.. code-block:: yaml
170
171    flake8:
172        description: Python linter
173        include: ['.']
174        extensions: ['py']
175        type: external
176        payload: py.flake8:lint
177        support-files:
178            - '**/.flake8'
179
180Notice the payload has two parts, delimited by ':'. The first is the module
181path, which ``mozlint`` will attempt to import. The second is the object path
182within that module (e.g, the name of a function to call). It is up to consumers
183of ``mozlint`` to ensure the module is in ``sys.path``. Structured log linters
184use the same import mechanism.
185
186The ``support-files`` key is used to list configuration files or files related
187to the running of the linter itself. If using ``--outgoing`` or ``--workdir``
188and one of these files was modified, the entire tree will be linted instead of
189just the modified files.
190
191Result definition
192-----------------
193
194When generating the list of results, the following values are available.
195
196.. csv-table::
197   :header: "Name", "Description", "Optional"
198   :widths: 20, 40, 10
199
200    "linter", "Name of the linter that flagged this error", ""
201    "path", "Path to the file containing the error", ""
202    "message", "Text describing the error", ""
203    "lineno", "Line number that contains the error", ""
204    "column", "Column containing the error", ""
205    "level", "Severity of the error, either 'warning' or 'error' (default 'error')", "Yes"
206    "hint", "Suggestion for fixing the error", "Yes"
207    "source", "Source code context of the error", "Yes"
208    "rule", "Name of the rule that was violated", "Yes"
209    "lineoffset", "Denotes an error spans multiple lines, of the form (<lineno offset>, <num lines>)", "Yes"
210    "diff", "A diff describing the changes that need to be made to the code", "Yes"
211
212
213Automated testing
214-----------------
215
216Every new checker must have tests associated.
217
218They should be pretty easy to write as most of the work is managed by the Mozlint
219framework. The key declaration is the ``LINTER`` variable which must match
220the linker declaration.
221
222As an example, the `Flake8 test <https://searchfox.org/mozilla-central/source/tools/lint/test/test_flake8.py>`_ looks like the following snippet:
223
224.. code-block:: python
225
226    import mozunit
227    LINTER = 'flake8'
228
229    def test_lint_single_file(lint, paths):
230        results = lint(paths('bad.py'))
231        assert len(results) == 2
232        assert results[0].rule == 'F401'
233        assert results[1].rule == 'E501'
234        assert results[1].lineno == 5
235
236    if __name__ == '__main__':
237        mozunit.main()
238
239As always with tests, please make sure that enough positive and negative cases are covered.
240
241To run the tests:
242
243.. code-block:: shell
244
245    $ ./mach python-test --python 3 --subsuite mozlint
246
247
248More tests can be `found in-tree <https://searchfox.org/mozilla-central/source/tools/lint/test>`_.
249
250
251
252Bootstrapping Dependencies
253--------------------------
254
255Many linters, especially 3rd party ones, will require a set of dependencies. It
256could be as simple as installing a binary from a package manager, or as
257complicated as pulling a whole graph of tools, plugins and their dependencies.
258
259Either way, to reduce the burden on users, linters should strive to provide
260automated bootstrapping of all their dependencies. To help with this,
261``mozlint`` allows linters to define a ``setup`` config, which has the same
262path object format as an external payload. For example (`in-tree version <https://searchfox.org/mozilla-central/source/tools/lint/flake8.yml>`_):
263
264.. code-block:: yaml
265
266    flake8:
267        description: Python linter
268        include: ['.']
269        extensions: ['py']
270        type: external
271        payload: py.flake8:lint
272        setup: py.flake8:setup
273
274The setup function takes a single argument, the root of the repository being
275linted. In the case of ``flake8``, it might look like:
276
277.. code-block:: python
278
279    import subprocess
280    from distutils.spawn import find_executable
281
282    def setup(root, **lintargs):
283        # This is a simple example. Please look at the actual source for better examples.
284        if not find_executable('flake8'):
285            subprocess.call(['pip', 'install', 'flake8'])
286
287The setup function will be called implicitly before running the linter. This
288means it should return fast and not produce any output if there is no setup to
289be performed.
290
291The setup functions can also be called explicitly by running ``mach lint
292--setup``. This will only perform setup and not perform any linting. It is
293mainly useful for other tools like ``mach bootstrap`` to call into.
294
295
296Adding the linter to the CI
297---------------------------
298
299First, the job will have to be declared in Taskcluster.
300
301This should be done in the `mozlint Taskcluster configuration <https://searchfox.org/mozilla-central/source/taskcluster/ci/source-test/mozlint.yml>`_.
302You will need to define a symbol, how it is executed and on what kind of change.
303
304For example, for flake8, the configuration is the following:
305
306.. code-block:: yaml
307
308    py-flake8:
309        description: flake8 run over the gecko codebase
310        treeherder:
311            symbol: py(f8)
312        run:
313            mach: lint -l flake8 -f treeherder -f json:/builds/worker/mozlint.json
314        when:
315            files-changed:
316                - '**/*.py'
317                - '**/.flake8'
318                # moz.configure files are also Python files.
319                - '**/*.configure'
320
321If the linter requires an external program, you will have to install it in the `setup script <https://searchfox.org/mozilla-central/source/taskcluster/docker/lint/system-setup.sh>`_
322and maybe install the necessary files in the `Docker configuration <https://searchfox.org/mozilla-central/source/taskcluster/docker/lint/Dockerfile>`_.
323
324.. note::
325
326    If the defect found by the linter is minor, make sure that it is run as `tier 2 <https://wiki.mozilla.org/Sheriffing/Job_Visibility_Policy#Overview_of_the_Job_Visibility_Tiers>`_.
327    This prevents the tree from closing because of a tiny issue.
328    For example, the typo detection is run as tier-2.
329