# Web Tests (formerly known as "Layout Tests" or "LayoutTests")

Web tests are used by Blink to test many components, including but not
limited to layout and rendering. In general, web tests involve loading pages
in a test renderer (`content_shell`) and comparing the rendered output or
JavaScript output against an expected output file.

This document covers running and debugging existing web tests. See the
[Writing Web Tests documentation](./writing_web_tests.md) if you find
yourself writing web tests.

Note that the term "layout tests" was renamed to "web tests"; the two terms
refer to the same thing. They are also sometimes called "WebKit tests" or
"WebKit layout tests".

["Web platform tests"](./web_platform_tests.md) (WPT) are the preferred form of
web tests and are located at
[web_tests/external/wpt](/third_party/blink/web_tests/external/wpt).
Tests that should work across browsers go there. Other directories are for
Chrome-specific tests only.

[TOC]

## Running Web Tests

### Initial Setup

Before you can run the web tests, you need to build the `blink_tests` target
to get `content_shell` and all of the other needed binaries.

```bash
autoninja -C out/Default blink_tests
```

On **Android** (web test support
[currently limited to KitKat and earlier](https://crbug.com/567947)) you need to
build and install `content_shell_apk` instead. See also:
[Android Build Instructions](../android_build_instructions.md).

```bash
autoninja -C out/Default content_shell_apk
adb install -r out/Default/apks/ContentShell.apk
```
On **Mac**, you probably want to strip the `content_shell` binary before starting
the tests. If you don't, you'll have 5-10 `content_shell` instances running
concurrently, all stuck being examined by the OS crash reporter. This may cause
other failures, like timeouts, where they normally don't occur.

```bash
strip ./xcodebuild/{Debug,Release}/content_shell.app/Contents/MacOS/content_shell
```

### Running the Tests

TODO: mention `testing/xvfb.py`

The test runner script is in
`third_party/blink/tools/run_web_tests.py`.

To specify which build directory to use (e.g. out/Default, out/Release,
out/Debug) you should pass the `-t` or `--target` parameter. For example, to
use the build in `out/Default`, use:

```bash
python third_party/blink/tools/run_web_tests.py -t Default
```

For Android (if your build directory is `out/android`):

```bash
python third_party/blink/tools/run_web_tests.py -t android --android
```
Tests marked as `[ Skip ]` in
[TestExpectations](../../third_party/blink/web_tests/TestExpectations)
won't be run by default, generally because they cause some intractable tool error.
To force one of them to run, either rename that file, specify the skipped test
on the command line (see below), or list it in a file specified with
`--test-list` (note that `--skip=always` makes tests marked as `[ Skip ]`
always be skipped).
Read the [Web Test Expectations documentation](./web_test_expectations.md) to
learn more about TestExpectations and related files.
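
For example, to force a skipped test to run, name it on the command line
directly (the path below is illustrative; whether it is actually marked
`[ Skip ]` depends on the current TestExpectations):

```bash
python third_party/blink/tools/run_web_tests.py -t Default fast/forms/001.html
```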

*** promo
Currently only the tests listed in
[SmokeTests](../../third_party/blink/web_tests/SmokeTests)
are run on the Android bots, since running all web tests takes too long on
Android (and may still have some infrastructure issues). Most developers focus
their Blink testing on Linux. We rely on the fact that the Linux and Android
behavior is nearly identical for scenarios outside those covered by the smoke
tests.
***

To run only some of the tests, specify their directories or filenames as
arguments to `run_web_tests.py` relative to the web test directory
(`src/third_party/blink/web_tests`). For example, to run the fast form tests,
use:

```bash
python third_party/blink/tools/run_web_tests.py fast/forms
```

Or you could use the following shorthand:

```bash
python third_party/blink/tools/run_web_tests.py fast/fo\*
```

*** promo
Example: To run the web tests with a debug build of `content_shell`, but only
test the SVG tests and run pixel tests, you would run:

```bash
python third_party/blink/tools/run_web_tests.py -t Debug svg
```
***

As a final quick-but-less-robust alternative, you can also just use the
`content_shell` executable to run specific tests directly (example on Windows):

```bash
out/Default/content_shell.exe --run-web-tests <url>|<full_test_source_path>|<relative_test_path>
```

as in:

```bash
out/Default/content_shell.exe --run-web-tests \
    c:/chrome/src/third_party/blink/web_tests/fast/forms/001.html
```

or

```bash
out/Default/content_shell.exe --run-web-tests fast/forms/001.html
```

but this requires a manual diff against expected results, because the shell
doesn't do the diff for you. It also dumps only the text result (the pixel and
audio output is binary and not human-readable).
See [Running Web Tests Using the Content Shell](./web_tests_in_content_shell.md)
for more details of running `content_shell`.

To see a complete list of arguments supported, run:

```bash
python third_party/blink/tools/run_web_tests.py --help
```

*** note
**Linux Note:** We try to match the Windows render tree output exactly by
matching font metrics and widget metrics. If there's a difference in the render
tree output, we should see if we can avoid rebaselining by improving our font
metrics. For additional information on Linux web tests, please see
[docs/web_tests_linux.md](./web_tests_linux.md).
***

*** note
**Mac Note:** While the tests are running, a number of Appearance settings are
overridden for you so the right type of scroll bars, colors, etc. are used.
Your main display's "Color Profile" is also changed to make sure color
correction by ColorSync matches what is expected in the pixel tests. The change
is noticeable; how much depends on the normal level of correction for your
display. The tests do their best to restore your settings when done, but if
you're left in the wrong state, you can manually reset it by going to
System Preferences → Displays → Color and selecting the "right" value.
***

### Test Harness Options

This script has a lot of command line flags. You can pass `--help` to the script
to see a full list of options. A few of the most useful options are below:

| Option                      | Meaning |
|:----------------------------|:--------------------------------------------------|
| `--debug`                   | Run the debug build of the test shell (default is release). Equivalent to `-t Debug` |
| `--nocheck-sys-deps`        | Don't check system dependencies; this allows faster iteration. |
| `--verbose`                 | Produce more verbose output, including a list of tests that pass. |
| `--reset-results`           | Overwrite the current baselines (`-expected.{png|txt|wav}` files) with actual results, or create new baselines if there are no existing baselines. |
| `--renderer-startup-dialog` | Bring up a modal dialog before running the test, useful for attaching a debugger. |
| `--fully-parallel`          | Run tests in parallel using as many child processes as the system has cores. |
| `--driver-logging`          | Print C++ logs (LOG(WARNING), etc). |
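
These options can be combined. For example, a sketch that runs the fast form
tests against a debug build with verbose output and full parallelism:

```bash
python third_party/blink/tools/run_web_tests.py -t Debug \
    --verbose --fully-parallel fast/forms
```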

## Success and Failure

A test succeeds when its output matches the pre-defined expected results. If any
tests fail, the test script will place the actual generated results, along with
a diff of the actual and expected results, into
`src/out/Default/layout-test-results/`, and by default launch a browser with a
summary and link to the results/diffs.
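
If you close that browser window, you can reopen the summary later. A minimal
sketch, assuming the default output location and that the summary page is named
`results.html`:

```bash
xdg-open out/Default/layout-test-results/results.html  # Linux
# open out/Default/layout-test-results/results.html    # macOS
```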

The expected results for tests are in the
`src/third_party/blink/web_tests/platform` directory or alongside their
respective tests.

*** note
Tests which use [testharness.js](https://github.com/w3c/testharness.js/)
do not have expected result files if all test cases pass.
***

A test that runs but produces the wrong output is marked as "failed", one that
causes the test shell to crash is marked as "crashed", and one that takes longer
than a certain amount of time to complete is aborted and marked as "timed out".
A row of dots in the script's output indicates one or more tests that passed.

## Test expectations

The
[TestExpectations](../../third_party/blink/web_tests/TestExpectations) file (and related
files) contains the list of all known web test failures. See the
[Web Test Expectations documentation](./web_test_expectations.md) for more
on this.

## Testing Runtime Flags

There are two ways to run web tests with additional command-line arguments:

* Using `--additional-driver-flag`:

  ```bash
  python run_web_tests.py --additional-driver-flag=--blocking-repaint
  ```

  This tells the test harness to pass `--blocking-repaint` to the
  content_shell binary.

  It will also look for flag-specific expectations in
  `web_tests/FlagExpectations/blocking-repaint`, if this file exists. The
  suppressions in this file override the main TestExpectations file.

  It will also look for baselines in `web_tests/flag-specific/blocking-repaint`.
  The baselines in this directory override the fallback baselines.

  By default, the name of the expectation file under
  `web_tests/FlagExpectations` and the name of the baseline directory under
  `web_tests/flag-specific` are derived from the first `--additional-driver-flag`,
  with the leading dashes stripped.
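
  For example, a sketch of how the default naming works, using the hypothetical
  `--blocking-repaint` flag from above: this invocation reads expectations from
  `web_tests/FlagExpectations/blocking-repaint` and baselines from
  `web_tests/flag-specific/blocking-repaint`, if they exist.

  ```bash
  python third_party/blink/tools/run_web_tests.py \
      --additional-driver-flag=--blocking-repaint fast/repaint
  ```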

  You can also customize the name in `web_tests/FlagSpecificConfig` when
  the name is too long or when we need to match multiple additional args:

  ```json
  {
    "name": "short-name",
    "args": ["--blocking-repaint", "--another-flag"]
  }
  ```

  When at least `--additional-driver-flag=--blocking-repaint` and
  `--additional-driver-flag=--another-flag` are specified, `short-name` will
  be used as the name of the flag-specific expectation file and the baseline
  directory.

  With the config, you can also use `--flag-specific=short-name` as a shortcut
  for `--additional-driver-flag=--blocking-repaint --additional-driver-flag=--another-flag`.
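
  For example, using the hypothetical `short-name` config above:

  ```bash
  python third_party/blink/tools/run_web_tests.py --flag-specific=short-name
  ```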

* Using a *virtual test suite* defined in
  [web_tests/VirtualTestSuites](../../third_party/blink/web_tests/VirtualTestSuites).
  A virtual test suite runs a subset of web tests with additional flags, with
  `virtual/<prefix>/...` in their paths. The tests can be virtual tests that
  map to real base tests (directories or files) whose paths match any of the
  specified bases, or any real tests under the `web_tests/virtual/<prefix>/`
  directory. For example, you could test a (hypothetical) new mode for
  repainting using the following virtual test suite:

  ```json
  {
    "prefix": "blocking_repaint",
    "bases": ["compositing", "fast/repaint"],
    "args": ["--blocking-repaint"]
  }
  ```

  This will create new "virtual" tests of the form
  `virtual/blocking_repaint/compositing/...` and
  `virtual/blocking_repaint/fast/repaint/...` which correspond to the files
  under `web_tests/compositing` and `web_tests/fast/repaint`, respectively,
  and pass `--blocking-repaint` to `content_shell` when they are run.

  These virtual tests exist in addition to the original `compositing/...` and
  `fast/repaint/...` tests. They can have their own expectations in
  `web_tests/TestExpectations`, and their own baselines. The test harness will
  use the non-virtual baselines as a fallback. However, the non-virtual
  expectations are not inherited: if `fast/repaint/foo.html` is marked
  `[ Fail ]`, the test harness still expects
  `virtual/blocking_repaint/fast/repaint/foo.html` to pass. If you expect the
  virtual test to also fail, it needs its own suppression.

  This will also let any real tests under the
  `web_tests/virtual/blocking_repaint` directory run with the
  `--blocking-repaint` flag.

  The "prefix" value should be unique. Multiple directories with the same flags
  should be listed in the same "bases" list. The "bases" list can be empty,
  in case we only want to run the real tests under `virtual/<prefix>` with the
  flags, without creating any virtual tests.

For flags whose implementation is still in progress, virtual test suites and
flag-specific expectations represent two alternative strategies for testing.
Consider the following when choosing between them:

* The
  [waterfall builders](https://dev.chromium.org/developers/testing/chromium-build-infrastructure/tour-of-the-chromium-buildbot)
  and [try bots](https://dev.chromium.org/developers/testing/try-server-usage)
  will run all virtual test suites in addition to the non-virtual tests.
  Conversely, a flag-specific expectations file won't automatically cause the
  bots to test your flag - if you want bot coverage without virtual test suites,
  you will need to set up a dedicated bot for your flag.

* Due to the above, virtual test suites incur a performance penalty for the
  commit queue and the continuous build infrastructure. This is exacerbated by
  the need to restart `content_shell` whenever flags change, which limits
  parallelism. Therefore, you should avoid adding large numbers of virtual test
  suites. They are well suited to running a subset of tests that are directly
  related to the feature, but they don't scale to flags that make deep
  architectural changes that potentially impact all of the tests.

* Note that using wildcards in virtual test path names (e.g.
  `virtual/blocking_repaint/fast/repaint/*`) is not supported, but you can
  still use `virtual/blocking_repaint` to run all real and virtual tests
  in the suite or `virtual/blocking_repaint/fast/repaint/dir` to run real
  or virtual tests in the suite under a specific directory, as shown in the
  example commands after this list.
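
For example, continuing with the hypothetical `blocking_repaint` suite:

```bash
# Run every real and virtual test in the suite:
python third_party/blink/tools/run_web_tests.py virtual/blocking_repaint

# Run only the tests in the suite under a specific base directory:
python third_party/blink/tools/run_web_tests.py virtual/blocking_repaint/fast/repaint
```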

## Tracking Test Failures

All bugs associated with web test failures must have the
[Test-Layout](https://crbug.com/?q=label:Test-Layout) label. Depending on how
much you know about the bug, assign the status accordingly:

* **Unconfirmed** -- You aren't sure if this is a simple rebaseline, a possible
  duplicate of an existing bug, or a real failure.
* **Untriaged** -- Confirmed but unsure of priority or root cause.
* **Available** -- You know the root cause of the issue.
* **Assigned** or **Started** -- You will fix this issue.

When creating a new web test bug, please set the following properties:

* Components: a sub-component of Blink
* OS: **All** (or whichever OS the failure is on)
* Priority: 2 (1 if it's a crash)
* Type: **Bug**
* Labels: **Test-Layout**

You can also use the _Layout Test Failure_ template, which pre-sets these
labels for you.

## Debugging Web Tests

After the web tests run, you should get a summary of tests that pass or
fail. If something fails unexpectedly (a new regression), you will get a
`content_shell` window with a summary of the unexpected failures. Or you might
have a failing test in mind to investigate. In any case, here are some steps and
tips for finding the problem.

* Take a look at the result. Sometimes tests just need to be rebaselined (see
  below) to account for changes introduced in your patch.
    * Load the test into a trunk Chrome or content_shell build and look at its
      result. (For tests in the http/ directory, start the http server first;
      see the Debugging HTTP Tests section below. Then navigate to
      `http://localhost:8000/` and proceed from there.)
      The best tests describe what they're looking for, but not all do, and
      sometimes things they're not explicitly testing are still broken. Compare
      it to Safari, Firefox, and IE if necessary to see if it's correct. If
      you're still not sure, find the person who knows the most about it and
      ask.
    * Some tests only work properly in content_shell, not Chrome, because they
      rely on extra APIs exposed there.
    * Some tests only work properly when they're run in the web-test
      framework, not when they're loaded into content_shell directly. The test
      should mention that in its visible text, but not all do. So try that too.
      See "Running the Tests", above.
* If you think the test is correct, confirm your suspicion by looking at the
  diffs between the expected result and the actual one.
    * Make sure that the diffs reported aren't important. Small differences in
      spacing or box sizes are often unimportant, especially around fonts and
      form controls. Differences in wording of JS error messages are also
      usually acceptable.
    * `python run_web_tests.py path/to/your/test.html` produces a page listing
      all test results. Those which fail their expectations will include links
      to the expected result, actual result, and diff. These results are saved
      to `$root_build_dir/layout-test-results`.
        * Alternatively, the `--results-directory=path/for/output/` option allows
          you to specify an alternative directory for the output to be saved to.
    * If you're still sure it's correct, rebaseline the test (see below).
      Otherwise...
* If you're lucky, your test is one that runs properly when you navigate to it
  in content_shell normally. In that case, build the Debug content_shell
  project, fire it up in your favorite debugger, and load the test file from a
  `file:` URL.
    * You'll probably be starting and stopping the content_shell a lot. In VS,
      to save navigating to the test every time, you can set the URL to your
      test (`file:` or `http:`) as the command argument in the Debugging section of
      the content_shell project Properties.
    * If your test contains a JS call, DOM manipulation, or other distinctive
      piece of code that you think is failing, search for that in the Chrome
      solution. That's a good place to put a starting breakpoint to start
      tracking down the issue.
    * Otherwise, you're running in a standard message loop just like in Chrome.
      If you have no other information, set a breakpoint on page load.
* If your test only works in full web-test mode, or if you find it simpler to
  debug without all the overhead of an interactive session, start the
  content_shell with the command-line flag `--run-web-tests`, followed by the
  URL (`file:` or `http:`) to your test (see the example command after this
  list). More information about running web tests in content_shell can be
  found [here](./web_tests_in_content_shell.md).
    * In VS, you can do this in the Debugging section of the content_shell
      project Properties.
    * Now you're running with exactly the same API, theme, and other setup that
      the web tests use.
    * Again, if your test contains a JS call, DOM manipulation, or other
      distinctive piece of code that you think is failing, search for that in
      the Chrome solution. That's a good place to put a starting breakpoint to
      start tracking down the issue.
    * If you can't find any better place to set a breakpoint, start at the
      `TestShell::RunFileTest()` call in `content_shell_main.cc`, or at
      `shell->LoadURL()` within `RunFileTest()` in `content_shell_win.cc`.
* Debug as usual. Once you've gotten this far, the failing web test is just a
  (hopefully) reduced test case that exposes a problem.
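
For example, a minimal sketch of the last approach on Linux, assuming a debug
build in `out/Debug` (use `lldb` instead of `gdb` on macOS):

```bash
gdb --args out/Debug/content_shell --run-web-tests \
    third_party/blink/web_tests/fast/forms/001.html
```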

### Debugging HTTP Tests

To run the server manually to reproduce/debug a failure:

```bash
cd src/third_party/blink/tools
python run_blink_httpd.py
```

The web tests are served from `http://127.0.0.1:8000/`. For example, to
run the test
`web_tests/http/tests/serviceworker/chromium/service-worker-allowed.html`,
navigate to
`http://127.0.0.1:8000/serviceworker/chromium/service-worker-allowed.html`. Some
tests behave differently if you go to `127.0.0.1` vs. `localhost`, so use
`127.0.0.1`.

To kill the server, hit any key on the terminal where `run_blink_httpd.py` is
running, use `taskkill` or the Task Manager on Windows, or `killall` or
Activity Monitor on macOS.

The test server sets up an alias to the `web_tests/resources` directory. For
example, in HTTP tests, you can access the testing framework using
`src="/js-test-resources/js-test.js"`.
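
For example, a quick sanity check from another terminal that the server is up
and the alias works (assuming `js-test.js` exists at that alias, as referenced
above):

```bash
curl -I http://127.0.0.1:8000/js-test-resources/js-test.js
```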

### Tips

Check https://test-results.appspot.com/ to see how a test did in the most recent
~100 builds on each builder (as long as the page is being updated regularly).

A timeout will often also be a text mismatch, since the wrapper script kills the
content_shell before it has a chance to finish. The exception is if the test
finishes loading properly, but somehow hangs before it outputs the bit of text
that tells the wrapper it's done.

Why might a test fail (or crash, or time out) on the buildbot, but pass on your
local machine?
* If the test finishes locally but is slow (more than 10 seconds or so), that
  is likely why it's reported as a timeout on the bot.
* Otherwise, try running it as part of a set of tests; it's possible that a test
  one or two (or ten) before this one is corrupting something that makes this
  one fail.
* If it consistently works locally, make sure your environment looks like the
  one on the bot (look at the top of the stdio for the webkit_tests step to see
  all the environment variables and so on).
* If none of that helps, and you have access to the bot itself, you may have to
  log in there and see if you can reproduce the problem manually.
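
For example, one way to look for cross-test interference locally is to run the
whole directory that contains the failing test serially (a sketch; the path is
illustrative, and `--child-processes=1` is assumed to force serial execution):

```bash
python third_party/blink/tools/run_web_tests.py --child-processes=1 fast/forms
```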

### Debugging DevTools Tests

* Add `debug_devtools=true` to `args.gn` and compile: `autoninja -C out/Default devtools_frontend_resources`
  > Debug DevTools lets you avoid having to recompile after every change to the DevTools front-end.
* Do one of the following:
    * Option A) Run from the `chromium/src` folder:
      `third_party/blink/tools/run_web_tests.sh
      --additional-driver-flag='--debug-devtools'
      --additional-driver-flag='--remote-debugging-port=9222'
      --time-out-ms=6000000`
    * Option B) If you need to debug an http/tests/inspector test, start httpd
      as described above. Then, run content_shell:
      `out/Default/content_shell --debug-devtools --remote-debugging-port=9222 --run-web-tests
      http://127.0.0.1:8000/path/to/test.html`
* Open `http://localhost:9222` in a stable/beta/canary Chrome, then click the
  single link to open the devtools with the test loaded.
* In the loaded devtools, set any required breakpoints and execute `test()` in
  the console to actually start the test.

NOTE: If the test is an html file, it is a legacy test, and you need to do the following:
* Add `window.debugTest = true;` to your test code as follows:

  ```javascript
  window.debugTest = true;
  function test() {
    /* TEST CODE */
  }
  ```

## Bisecting Regressions

You can use [`git bisect`](https://git-scm.com/docs/git-bisect) to find which
commit broke (or fixed!) a web test in a fully automated way. Unlike
[bisect-builds.py](http://dev.chromium.org/developers/bisect-builds-py), which
downloads pre-built Chromium binaries, `git bisect` operates on your local
checkout, so it can run tests with `content_shell`.

Bisecting can take several hours, but since it is fully automated you can leave
it running overnight and view the results the next day.

To set up an automated bisect of a web test regression, create a script like
this:

```bash
#!/bin/bash

# Exit code 125 tells git bisect to skip the revision.
gclient sync || exit 125
autoninja -C out/Debug -j100 blink_tests || exit 125

third_party/blink/tools/run_web_tests.py -t Debug \
  --no-show-results --no-retry-failures \
  path/to/web/test.html
```

Modify the `out` directory, ninja args, and test name as appropriate, and save
the script in `~/checkrev.sh`. Then run:

```bash
chmod u+x ~/checkrev.sh  # mark script as executable
git bisect start <badrev> <goodrev>
git bisect run ~/checkrev.sh
git bisect reset  # quit the bisect session
```

## Rebaselining Web Tests

*** promo
To automatically re-baseline tests across all Chromium platforms, using the
buildbot results, see [How to rebaseline](./web_test_expectations.md#How-to-rebaseline).
Alternatively, to manually run a test and rebaseline it on your workstation,
read on.
***

```bash
cd src/third_party/blink
python tools/run_web_tests.py --reset-results foo/bar/test.html
```

If there are current expectation files for `web_tests/foo/bar/test.html`,
the above command will overwrite the current baselines at their original
locations with the actual results. The current baseline means the `-expected.*`
file used to compare the actual result when the test is run locally, i.e. the
first file found in the [baseline search path](https://cs.chromium.org/search/?q=port/base.py+baseline_search_path).

If there are no current baselines, the above command will create new baselines
in the platform-independent directory, e.g.
`web_tests/foo/bar/test-expected.{txt,png}`.
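
After rebaselining, you can review what changed before committing, for example
(using the illustrative paths above):

```bash
git status third_party/blink/web_tests/foo/bar/
git diff third_party/blink/web_tests/foo/bar/
```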

When you rebaseline a test, make sure your commit description explains why the
test is being re-baselined.

### Rebaselining flag-specific expectations

Though we prefer the Rebaseline Tool to local rebaselining, the Rebaseline Tool
doesn't support rebaselining flag-specific expectations.

```bash
cd src/third_party/blink
python tools/run_web_tests.py --additional-driver-flag=--enable-flag --reset-results foo/bar/test.html
```

New baselines will be created in the flag-specific baselines directory, e.g.
`web_tests/flag-specific/enable-flag/foo/bar/test-expected.{txt,png}`.

Then you can commit the new baselines and upload the patch for review.
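
For example (a sketch using the hypothetical `enable-flag` paths above):

```bash
git add third_party/blink/web_tests/flag-specific/enable-flag
git commit -m "Rebaseline foo/bar/test.html for --enable-flag"
git cl upload
```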

However, it's difficult for reviewers to review a patch that contains only new
files. You can follow the steps below for easier review.

1. Copy existing baselines to the flag-specific baselines directory for the
   tests to be rebaselined:
   ```bash
   third_party/blink/tools/run_web_tests.py --additional-driver-flag=--enable-flag --copy-baselines foo/bar/test.html
   ```
   Then add the newly created baseline files, commit and upload the patch.
   Note that the above command won't copy baselines for passing tests.

2. Rebaseline the test locally:
   ```bash
   third_party/blink/tools/run_web_tests.py --additional-driver-flag=--enable-flag --reset-results foo/bar/test.html
   ```
   Commit the changes and upload the patch.

3. Request review of the CL and tell the reviewer to compare the patch sets that
   were uploaded in step 1 and step 2 to see the differences in the rebaselines.

## Known Issues

See
[bugs with the component Blink>Infra](https://bugs.chromium.org/p/chromium/issues/list?can=2&q=component%3ABlink%3EInfra)
for issues related to Blink tools, including the web test runner.

* If QuickTime is not installed, the plugin tests
  `fast/dom/object-embed-plugin-scripting.html` and
  `plugins/embed-attributes-setting.html` are expected to fail.