# Web Tests (formerly known as "Layout Tests" or "LayoutTests")

Web tests are used by Blink to test many components, including but not
limited to layout and rendering. In general, web tests involve loading pages
in a test renderer (`content_shell`) and comparing the rendered output or
JavaScript output against an expected output file.

This document covers running and debugging existing web tests. See the
[Writing Web Tests documentation](./writing_web_tests.md) if you find
yourself writing web tests.

Note that the term "layout tests" has been renamed to "web tests"; treat the
two terms as synonyms. The same tests are also sometimes called "WebKit tests"
or "WebKit layout tests".

["Web platform tests"](./web_platform_tests.md) (WPT) are the preferred form of
web tests and are located at
[web_tests/external/wpt](/third_party/blink/web_tests/external/wpt).
Tests that should work across browsers go there. Other directories are for
Chrome-specific tests only.

[TOC]

## Running Web Tests

### Initial Setup

Before you can run the web tests, you need to build the `blink_tests` target
to get `content_shell` and all of the other needed binaries.

```bash
autoninja -C out/Default blink_tests
```

On **Android** (web test support is
[currently limited to KitKat and earlier](https://crbug.com/567947)), you need
to build and install `content_shell_apk` instead. See also:
[Android Build Instructions](../android_build_instructions.md).

```bash
autoninja -C out/Default content_shell_apk
adb install -r out/Default/apks/ContentShell.apk
```

On **Mac**, you probably want to strip the `content_shell` binary before
starting the tests. If you don't, you'll have 5-10 copies running concurrently,
all stuck being examined by the OS crash reporter. This may cause other
failures, such as timeouts, where they normally don't occur.

```bash
strip ./xcodebuild/{Debug,Release}/content_shell.app/Contents/MacOS/content_shell
```

### Running the Tests

TODO: mention `testing/xvfb.py`

The test runner script is in
`third_party/blink/tools/run_web_tests.py`.

To specify which build directory to use (e.g. out/Default, out/Release,
out/Debug), pass the `-t` or `--target` parameter. For example, to use the
build in `out/Default`:

```bash
python third_party/blink/tools/run_web_tests.py -t Default
```

For Android (if your build directory is `out/android`):

```bash
python third_party/blink/tools/run_web_tests.py -t android --android
```

Tests marked as `[ Skip ]` in
[TestExpectations](../../third_party/blink/web_tests/TestExpectations)
won't be run by default, generally because they cause some intractable tool
error. To force one of them to run, either rename that file, specify the
skipped test on the command line (see below), or list it in a file passed with
`--test-list`. (However, with `--skip=always`, tests marked `[ Skip ]` remain
skipped.) Read the
[Web Test Expectations documentation](./web_test_expectations.md) to learn
more about TestExpectations and related files.
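For example, a sketch that forces a single skipped test to run by naming it on
the command line (the test path here is hypothetical):

```bash
# fast/forms/001.html stands in for a test marked [ Skip ] in TestExpectations.
python third_party/blink/tools/run_web_tests.py -t Default fast/forms/001.html
```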
*** promo
Currently only the tests listed in
[SmokeTests](../../third_party/blink/web_tests/SmokeTests)
are run on the Android bots, since running all web tests takes too long on
Android (and may still have some infrastructure issues). Most developers focus
their Blink testing on Linux. We rely on the fact that the Linux and Android
behavior is nearly identical for scenarios outside those covered by the smoke
tests.
***

To run only some of the tests, specify their directories or filenames as
arguments to `run_web_tests.py` relative to the web test directory
(`src/third_party/blink/web_tests`). For example, to run the fast form tests,
use:

```bash
python third_party/blink/tools/run_web_tests.py fast/forms
```

Or you could use the following shorthand:

```bash
python third_party/blink/tools/run_web_tests.py fast/fo\*
```

*** promo
Example: To run the web tests with a debug build of `content_shell`, but only
test the SVG tests and run pixel tests, you would run:

```bash
python third_party/blink/tools/run_web_tests.py -t Debug svg
```
***

As a final quick-but-less-robust alternative, you can also just use the
`content_shell` executable to run specific tests, using (example on Windows):

```bash
out/Default/content_shell.exe --run-web-tests <url>|<full_test_source_path>|<relative_test_path>
```

as in:

```bash
out/Default/content_shell.exe --run-web-tests \
    c:/chrome/src/third_party/blink/web_tests/fast/forms/001.html
```

or

```bash
out/Default/content_shell.exe --run-web-tests fast/forms/001.html
```

but this requires a manual diff against expected results, because the shell
doesn't do it for you. It also dumps only the text result (the pixel and audio
output are binary and not human-readable).
See [Running Web Tests Using the Content Shell](./web_tests_in_content_shell.md)
for more details on running `content_shell`.

To see a complete list of arguments supported, run:

```bash
python third_party/blink/tools/run_web_tests.py --help
```

*** note
**Linux Note:** We try to match the Windows render tree output exactly by
matching font metrics and widget metrics. If there's a difference in the render
tree output, we should see if we can avoid rebaselining by improving our font
metrics. For additional information on Linux web tests, please see
[docs/web_tests_linux.md](./web_tests_linux.md).
***

*** note
**Mac Note:** While the tests are running, a number of Appearance settings are
overridden for you so the right type of scroll bars, colors, etc. are used.
Your main display's "Color Profile" is also changed to make sure color
correction by ColorSync matches what is expected in the pixel tests. The change
is noticeable; how much depends on the normal level of correction for your
display. The tests do their best to restore your settings when done, but if
you're left in the wrong state, you can manually reset it by going to
System Preferences → Displays → Color and selecting the "right" value.
***

### Test Harness Options

This script has a lot of command line flags. You can pass `--help` to the
script to see a full list of options. A few of the most useful options are
below:

| Option | Meaning |
|:----------------------------|:--------------------------------------------------|
| `--debug` | Run the debug build of the test shell (default is release). Equivalent to `-t Debug`. |
| `--nocheck-sys-deps` | Don't check system dependencies; this allows faster iteration. |
| `--verbose` | Produce more verbose output, including a list of tests that pass. |
| `--reset-results` | Overwrite the current baselines (`-expected.{png|txt|wav}` files) with actual results, or create new baselines if there are no existing baselines. |
| `--renderer-startup-dialog` | Bring up a modal dialog before running the test, useful for attaching a debugger. |
| `--fully-parallel` | Run tests in parallel using as many child processes as the system has cores. |
| `--driver-logging` | Print C++ logs (LOG(WARNING), etc). |
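For example, a hypothetical invocation that combines several of these options:

```bash
# Run the fast/forms tests with maximum parallelism and verbose output.
python third_party/blink/tools/run_web_tests.py -t Default \
    --fully-parallel --verbose fast/forms
```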
| 178| `--reset-results` | Overwrite the current baselines (`-expected.{png|txt|wav}` files) with actual results, or create new baselines if there are no existing baselines. | 179| `--renderer-startup-dialog` | Bring up a modal dialog before running the test, useful for attaching a debugger. | 180| `--fully-parallel` | Run tests in parallel using as many child processes as the system has cores. | 181| `--driver-logging` | Print C++ logs (LOG(WARNING), etc). | 182 183## Success and Failure 184 185A test succeeds when its output matches the pre-defined expected results. If any 186tests fail, the test script will place the actual generated results, along with 187a diff of the actual and expected results, into 188`src/out/Default/layout_test_results/`, and by default launch a browser with a 189summary and link to the results/diffs. 190 191The expected results for tests are in the 192`src/third_party/blink/web_tests/platform` or alongside their respective 193tests. 194 195*** note 196Tests which use [testharness.js](https://github.com/w3c/testharness.js/) 197do not have expected result files if all test cases pass. 198*** 199 200A test that runs but produces the wrong output is marked as "failed", one that 201causes the test shell to crash is marked as "crashed", and one that takes longer 202than a certain amount of time to complete is aborted and marked as "timed out". 203A row of dots in the script's output indicates one or more tests that passed. 204 205## Test expectations 206 207The 208[TestExpectations](../../third_party/blink/web_tests/TestExpectations) file (and related 209files) contains the list of all known web test failures. See the 210[Web Test Expectations documentation](./web_test_expectations.md) for more 211on this. 212 213## Testing Runtime Flags 214 215There are two ways to run web tests with additional command-line arguments: 216 217* Using `--additional-driver-flag`: 218 219 ```bash 220 python run_web_tests.py --additional-driver-flag=--blocking-repaint 221 ``` 222 223 This tells the test harness to pass `--blocking-repaint` to the 224 content_shell binary. 225 226 It will also look for flag-specific expectations in 227 `web_tests/FlagExpectations/blocking-repaint`, if this file exists. The 228 suppressions in this file override the main TestExpectations file. 229 230 It will also look for baselines in `web_tests/flag-specific/blocking-repaint`. 231 The baselines in this directory override the fallback baselines. 232 233 By default, name of the expectation file name under 234 `web_tests/FlagExpectations` and name of the baseline directory under 235 `web_tests/flag-specific` uses the first flag of --additional-driver-flag 236 with leading '-'s stripped. 237 238 You can also customize the name in `web_tests/FlagSpecificConfig` when 239 the name is too long or when we need to match multiple additional args: 240 241 ```json 242 { 243 "name": "short-name", 244 "args": ["--blocking-repaint", "--another-flag"] 245 } 246 ``` 247 248 When at least `--additional-driver-flag=--blocking-repaint` and 249 `--additional-driver-flag=--another-flag` are specified, `short-name` will 250 be used as name of the flag specific expectation file and the baseline directory. 251 252 With the config, you can also use `--flag-specific=short-name` as a shortcut 253 of `--additional-driver-flag=--blocking-repaint --additional-driver-flag=--another-flag`. 254 255* Using a *virtual test suite* defined in 256 [web_tests/VirtualTestSuites](../../third_party/blink/web_tests/VirtualTestSuites). 
* Using a *virtual test suite* defined in
  [web_tests/VirtualTestSuites](../../third_party/blink/web_tests/VirtualTestSuites).
  A virtual test suite runs a subset of web tests with additional flags, with
  `virtual/<prefix>/...` in their paths. The tests can be virtual tests that
  map to real base tests (directories or files) whose paths match any of the
  specified bases, or real tests under the `web_tests/virtual/<prefix>/`
  directory. For example, you could test a (hypothetical) new mode for
  repainting using the following virtual test suite:

  ```json
  {
    "prefix": "blocking_repaint",
    "bases": ["compositing", "fast/repaint"],
    "args": ["--blocking-repaint"]
  }
  ```

  This will create new "virtual" tests of the form
  `virtual/blocking_repaint/compositing/...` and
  `virtual/blocking_repaint/fast/repaint/...` which correspond to the files
  under `web_tests/compositing` and `web_tests/fast/repaint`, respectively,
  and pass `--blocking-repaint` to `content_shell` when they are run.

  These virtual tests exist in addition to the original `compositing/...` and
  `fast/repaint/...` tests. They can have their own expectations in
  `web_tests/TestExpectations`, and their own baselines. The test harness will
  use the non-virtual baselines as a fallback. However, the non-virtual
  expectations are not inherited: if `fast/repaint/foo.html` is marked
  `[ Fail ]`, the test harness still expects
  `virtual/blocking_repaint/fast/repaint/foo.html` to pass. If you expect the
  virtual test to also fail, it needs its own suppression.

  This will also let any real tests under the
  `web_tests/virtual/blocking_repaint` directory run with the
  `--blocking-repaint` flag.

  The "prefix" value should be unique. Multiple directories with the same flags
  should be listed in the same "bases" list. The "bases" list can be empty, for
  the case where we only want to run the real tests under `virtual/<prefix>`
  with the flags, without creating any virtual tests.

For flags whose implementation is still in progress, virtual test suites and
flag-specific expectations represent two alternative strategies for testing.
Consider the following when choosing between them:

* The
  [waterfall builders](https://dev.chromium.org/developers/testing/chromium-build-infrastructure/tour-of-the-chromium-buildbot)
  and [try bots](https://dev.chromium.org/developers/testing/try-server-usage)
  will run all virtual test suites in addition to the non-virtual tests.
  Conversely, a flag-specific expectations file won't automatically cause the
  bots to test your flag; if you want bot coverage without virtual test suites,
  you will need to set up a dedicated bot for your flag.

* Due to the above, virtual test suites incur a performance penalty for the
  commit queue and the continuous build infrastructure. This is exacerbated by
  the need to restart `content_shell` whenever flags change, which limits
  parallelism. Therefore, you should avoid adding large numbers of virtual test
  suites. They are well suited to running a subset of tests that are directly
  related to the feature, but they don't scale to flags that make deep
  architectural changes that potentially impact all of the tests.

* Note that using wildcards in virtual test path names (e.g.
  `virtual/blocking_repaint/fast/repaint/*`) is not supported, but you can
  still use `virtual/blocking_repaint` to run all real and virtual tests in
  the suite, or `virtual/blocking_repaint/fast/repaint/dir` to run real or
  virtual tests in the suite under a specific directory, as in the sketch
  below.
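For example, sketches of running the hypothetical `blocking_repaint` suite
defined above, in whole or in part:

```bash
# Run all real and virtual tests in the suite:
python third_party/blink/tools/run_web_tests.py virtual/blocking_repaint

# Run only the virtual tests that correspond to one base directory:
python third_party/blink/tools/run_web_tests.py virtual/blocking_repaint/fast/repaint
```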
## Tracking Test Failures

All bugs associated with web test failures must have the
[Test-Layout](https://crbug.com/?q=label:Test-Layout) label. Depending on how
much you know about the bug, assign the status accordingly:

* **Unconfirmed** -- You aren't sure if this is a simple rebaseline, a possible
  duplicate of an existing bug, or a real failure.
* **Untriaged** -- Confirmed, but unsure of priority or root cause.
* **Available** -- You know the root cause of the issue.
* **Assigned** or **Started** -- You will fix this issue.

When creating a new web test bug, please set the following properties:

* Components: a sub-component of Blink
* OS: **All** (or whichever OS the failure is on)
* Priority: 2 (1 if it's a crash)
* Type: **Bug**
* Labels: **Test-Layout**

You can also use the _Layout Test Failure_ template, which pre-sets these
labels for you.

## Debugging Web Tests

After the web tests run, you should get a summary of tests that pass or fail.
If something fails unexpectedly (a new regression), you will get a
`content_shell` window with a summary of the unexpected failures. Or you might
have a failing test in mind to investigate. In any case, here are some steps
and tips for finding the problem.

* Take a look at the result. Sometimes tests just need to be rebaselined (see
  below) to account for changes introduced in your patch.
  * Load the test into a trunk Chrome or content_shell build and look at its
    result. (For tests in the http/ directory, start the http server first;
    see "Debugging HTTP Tests" below. Navigate to `http://localhost:8000/`
    and proceed from there.) The best tests describe what they're looking
    for, but not all do, and sometimes things they're not explicitly testing
    are still broken. Compare it to Safari, Firefox, and IE if necessary to
    see if it's correct. If you're still not sure, find the person who knows
    the most about it and ask.
  * Some tests only work properly in content_shell, not Chrome, because they
    rely on extra APIs exposed there.
  * Some tests only work properly when they're run in the web-test
    framework, not when they're loaded into content_shell directly. The test
    should mention that in its visible text, but not all do. So try that too.
    See "Running the Tests", above.
* If you think the test is correct, confirm your suspicion by looking at the
  diffs between the expected result and the actual one.
  * Make sure that the diffs reported aren't important. Small differences in
    spacing or box sizes are often unimportant, especially around fonts and
    form controls. Differences in the wording of JS error messages are also
    usually acceptable.
  * `python run_web_tests.py path/to/your/test.html` produces a page listing
    all test results. Those which fail their expectations will include links
    to the expected result, actual result, and diff. These results are saved
    to `$root_build_dir/layout-test-results`.
  * Alternatively, the `--results-directory=path/for/output/` option lets you
    specify a different directory for the output, as in the sketch below.
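    For example (a sketch; the output path and test name are arbitrary):

    ```bash
    python third_party/blink/tools/run_web_tests.py \
        --results-directory=/tmp/web-test-results fast/forms/001.html
    ```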
  * If you're still sure it's correct, rebaseline the test (see below).
    Otherwise...
* If you're lucky, your test is one that runs properly when you navigate to it
  in content_shell normally. In that case, build the Debug content_shell
  project, fire it up in your favorite debugger, and load the test file from a
  `file:` URL.
  * You'll probably be starting and stopping content_shell a lot. In VS, to
    save navigating to the test every time, you can set the URL to your test
    (`file:` or `http:`) as the command argument in the Debugging section of
    the content_shell project Properties.
  * If your test contains a JS call, DOM manipulation, or other distinctive
    piece of code that you think is failing, search for that in the Chrome
    solution. That's a good place to put a starting breakpoint to start
    tracking down the issue.
  * Otherwise, you're running in a standard message loop just like in Chrome.
    If you have no other information, set a breakpoint on page load.
* If your test only works in full web-test mode, or if you find it simpler to
  debug without all the overhead of an interactive session, start
  content_shell with the command-line flag `--run-web-tests`, followed by the
  URL (`file:` or `http:`) of your test. More information about running web
  tests in content_shell can be found
  [here](./web_tests_in_content_shell.md).
  * In VS, you can do this in the Debugging section of the content_shell
    project Properties.
  * Now you're running with exactly the same API, theme, and other setup that
    the web tests use.
  * Again, if your test contains a JS call, DOM manipulation, or other
    distinctive piece of code that you think is failing, search for that in
    the Chrome solution. That's a good place to put a starting breakpoint to
    start tracking down the issue.
  * If you can't find any better place to set a breakpoint, start at the
    `TestShell::RunFileTest()` call in `content_shell_main.cc`, or at
    `shell->LoadURL()` within `RunFileTest()` in `content_shell_win.cc`.
* Debug as usual. Once you've gotten this far, the failing web test is just a
  (hopefully) reduced test case that exposes a problem.

### Debugging HTTP Tests

To run the server manually to reproduce/debug a failure:

```bash
cd src/third_party/blink/tools
python run_blink_httpd.py
```

The web tests are served from `http://127.0.0.1:8000/`. For example, to run
the test
`web_tests/http/tests/serviceworker/chromium/service-worker-allowed.html`,
navigate to
`http://127.0.0.1:8000/serviceworker/chromium/service-worker-allowed.html`.
Some tests behave differently if you go to `127.0.0.1` vs. `localhost`, so use
`127.0.0.1`.

To kill the server, press any key in the terminal where `run_blink_httpd.py`
is running, use `taskkill` or the Task Manager on Windows, or `killall` or
Activity Monitor on macOS.

The test server sets up an alias to the `web_tests/resources` directory. For
example, in HTTP tests, you can access the testing framework using
`src="/js-test-resources/js-test.js"`.
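With the server running in another terminal, you can also load an HTTP test
directly in `content_shell` (a sketch, reusing the service worker test above):

```bash
out/Default/content_shell --run-web-tests \
    http://127.0.0.1:8000/serviceworker/chromium/service-worker-allowed.html
```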
### Tips

Check https://test-results.appspot.com/ to see how a test did in the most
recent ~100 builds on each builder (as long as the page is being updated
regularly).

A timeout will often also be a text mismatch, since the wrapper script kills
content_shell before it has a chance to finish. The exception is if the test
finishes loading properly, but somehow hangs before it outputs the bit of text
that tells the wrapper it's done.

Why might a test fail (or crash, or time out) on the buildbot, but pass on
your local machine?

* If the test passes locally but is slow (more than 10 seconds or so), that is
  likely why the bot reports it as a timeout.
* Otherwise, try running it as part of a set of tests; it's possible that a
  test one or two (or ten) before this one is corrupting something that makes
  this one fail (see the sketch after this list).
* If it consistently works locally, make sure your environment looks like the
  one on the bot (look at the top of the stdio for the webkit_tests step to
  see all the environment variables and so on).
* If none of that helps, and you have access to the bot itself, you may have
  to log in there and see if you can reproduce the problem manually.
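For the cross-test-contamination case, one sketch is to rerun the suspect test
together with its neighbors on a single worker, so the tests execute in a
deterministic order (this assumes the `--child-processes` option is available
in your checkout; check `run_web_tests.py --help`):

```bash
# Run the whole directory serially, so that state leaked by earlier tests is
# reproduced when the suspect test runs.
python third_party/blink/tools/run_web_tests.py --child-processes=1 fast/forms
```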
### Debugging DevTools Tests

* Add `debug_devtools=true` to `args.gn` and compile:
  `autoninja -C out/Default devtools_frontend_resources`
  > Debug DevTools lets you avoid having to recompile after every change to
  > the DevTools front-end.
* Do one of the following:
  * Option A) Run from the `chromium/src` folder:
    `third_party/blink/tools/run_web_tests.sh
    --additional-driver-flag='--debug-devtools'
    --additional-driver-flag='--remote-debugging-port=9222'
    --time-out-ms=6000000`
  * Option B) If you need to debug an http/tests/inspector test, start httpd
    as described above. Then, run content_shell:
    `out/Default/content_shell --debug-devtools --remote-debugging-port=9222
    --run-web-tests http://127.0.0.1:8000/path/to/test.html`
* Open `http://localhost:9222` in a stable/beta/canary Chrome and click the
  single link to open the devtools with the test loaded.
* In the loaded devtools, set any required breakpoints and execute `test()` in
  the console to actually start the test.

NOTE: If the test is an HTML file, it is a legacy test, and you need to add
`window.debugTest = true;` to your test code as follows:

```javascript
window.debugTest = true;
function test() {
  /* TEST CODE */
}
```

## Bisecting Regressions

You can use [`git bisect`](https://git-scm.com/docs/git-bisect) to find which
commit broke (or fixed!) a web test in a fully automated way. Unlike
[bisect-builds.py](http://dev.chromium.org/developers/bisect-builds-py), which
downloads pre-built Chromium binaries, `git bisect` operates on your local
checkout, so it can run tests with `content_shell`.

Bisecting can take several hours, but since it is fully automated you can
leave it running overnight and view the results the next day.

To set up an automated bisect of a web test regression, create a script like
this:

```bash
#!/bin/bash

# Exit code 125 tells git bisect to skip the revision.
gclient sync || exit 125
autoninja -C out/Debug -j100 blink_tests || exit 125

third_party/blink/tools/run_web_tests.py -t Debug \
  --no-show-results --no-retry-failures \
  path/to/web/test.html
```

Modify the `out` directory, ninja args, and test name as appropriate, and save
the script in `~/checkrev.sh`. Then run:

```bash
chmod u+x ~/checkrev.sh  # Mark the script as executable.
git bisect start <badrev> <goodrev>
git bisect run ~/checkrev.sh
git bisect reset  # Quit the bisect session.
```

## Rebaselining Web Tests

*** promo
To automatically rebaseline tests across all Chromium platforms, using the
buildbot results, see
[How to rebaseline](./web_test_expectations.md#How-to-rebaseline).
Alternatively, to run a test and rebaseline it manually on your workstation,
read on.
***

```bash
cd src/third_party/blink
python tools/run_web_tests.py --reset-results foo/bar/test.html
```

If there are current expectation files for `web_tests/foo/bar/test.html`,
the above command will overwrite the current baselines at their original
locations with the actual results. The current baseline means the
`-expected.*` file used to compare against the actual result when the test is
run locally, i.e. the first file found in the
[baseline search path](https://cs.chromium.org/search/?q=port/base.py+baseline_search_path).

If there are no current baselines, the above command will create new baselines
in the platform-independent directory, e.g.
`web_tests/foo/bar/test-expected.{txt,png}`.

When you rebaseline a test, make sure your commit description explains why the
test is being rebaselined.

### Rebaselining flag-specific expectations

Though we prefer the Rebaseline Tool to local rebaselining, the Rebaseline
Tool doesn't support rebaselining flag-specific expectations.

```bash
cd src/third_party/blink
python tools/run_web_tests.py --additional-driver-flag=--enable-flag --reset-results foo/bar/test.html
```

New baselines will be created in the flag-specific baselines directory, e.g.
`web_tests/flag-specific/enable-flag/foo/bar/test-expected.{txt,png}`.

Then you can commit the new baselines and upload the patch for review.

However, it's difficult for reviewers to review a patch that contains only new
files. You can follow the steps below to make review easier.

1. Copy the existing baselines into the flag-specific baselines directory for
   the tests to be rebaselined:

   ```bash
   third_party/blink/tools/run_web_tests.py --additional-driver-flag=--enable-flag --copy-baselines foo/bar/test.html
   ```

   Then add the newly created baseline files, commit, and upload the patch.
   Note that the above command won't copy baselines for passing tests.

2. Rebaseline the test locally:

   ```bash
   third_party/blink/tools/run_web_tests.py --additional-driver-flag=--enable-flag --reset-results foo/bar/test.html
   ```

   Commit the changes and upload the patch.

3. Request review of the CL and tell the reviewer to compare the patch sets
   uploaded in step 1 and step 2 to see how the baselines changed.

## Known Issues

See
[bugs with the component Blink>Infra](https://bugs.chromium.org/p/chromium/issues/list?can=2&q=component%3ABlink%3EInfra)
for issues related to Blink tools, including the web test runner.

* If QuickTime is not installed, the plugin tests
  `fast/dom/object-embed-plugin-scripting.html` and
  `plugins/embed-attributes-setting.html` are expected to fail.