============
SpiderMonkey
============

*SpiderMonkey* is the *JavaScript* and *WebAssembly* implementation library of
the *Mozilla Firefox* web browser. The implementation behaviour is defined by
the `ECMAScript <https://tc39.es/ecma262/>`_ and `WebAssembly
<https://webassembly.org/>`_ specifications.

Much of the internal technical documentation of the engine can be found
throughout the source files themselves by looking for comments labelled with
`[SMDOC]`_. Information about the team, our processes, and about embedding
*SpiderMonkey* in your own projects can be found at https://spidermonkey.dev.

Specific documentation on a few topics is available at:

.. toctree::
   :maxdepth: 1

   build
   test
   Debugger/index
   SavedFrame/index

Components of SpiderMonkey
##########################

Garbage Collector
*********************

.. toctree::
   :maxdepth: 2
   :hidden:

   Overview <gc>
   Rooting Hazard Analysis <HazardAnalysis/index>
   Running the Analysis <HazardAnalysis/running>

*JavaScript* is a garbage collected language and at the core of *SpiderMonkey*
we manage a garbage-collected memory heap. Elements of this heap have a base
C++ type of `gc::Cell`_. Each round of garbage collection will free up any
*Cell* that is not referenced by a *root* or another live *Cell* in turn.

See :doc:`GC overview<gc>` for more details.


JS::Value and JSObject
**************************

*JavaScript* values are divided into either objects or primitives
(*Undefined*, *Null*, *Boolean*, *Number*, *BigInt*, *String*, or *Symbol*).
Values are represented with the `JS::Value`_ type which may in turn point to
an object that extends from the `JSObject`_ type. Objects include both plain
*JavaScript* objects and exotic objects representing various things from
functions to *ArrayBuffers* to *HTML Elements* and more.

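
The split between primitives and objects can be pictured with a small toy
model. The sketch below is illustrative only: ``ToyValue``, ``ToyObject`` and
``describe`` are hypothetical names invented for this example, and it does not
mirror the real layout or API of ``JS::Value`` and ``JSObject``. It only shows
a value that either carries a primitive inline or points to an object.

.. code-block:: cpp

   #include <cassert>
   #include <string>
   #include <variant>

   // Hypothetical stand-in for a heap object; the real JSObject is far richer.
   struct ToyObject {
     std::string className;
   };

   // Empty tag types for the two primitives that carry no payload.
   struct ToyUndefined {};
   struct ToyNull {};

   // Toy value: a primitive stored inline, or a pointer to an object.
   // BigInt, String and Symbol are left out to keep the sketch short.
   class ToyValue {
     using Storage =
         std::variant<ToyUndefined, ToyNull, bool, double, ToyObject*>;

    public:
     static ToyValue undefined() {
       return ToyValue(Storage{std::in_place_type<ToyUndefined>});
     }
     static ToyValue null() {
       return ToyValue(Storage{std::in_place_type<ToyNull>});
     }
     static ToyValue boolean(bool b) {
       return ToyValue(Storage{std::in_place_type<bool>, b});
     }
     static ToyValue number(double d) {
       return ToyValue(Storage{std::in_place_type<double>, d});
     }
     static ToyValue object(ToyObject* obj) {
       return ToyValue(Storage{std::in_place_type<ToyObject*>, obj});
     }

     bool isObject() const { return std::holds_alternative<ToyObject*>(data_); }
     bool isPrimitive() const { return !isObject(); }

     // Only valid when isObject() is true.
     ToyObject* toObject() const {
       assert(isObject());
       return std::get<ToyObject*>(data_);
     }

    private:
     explicit ToyValue(Storage data) : data_(data) {}
     Storage data_;
   };

   // Reports whether a value is a primitive or which kind of object it names.
   std::string describe(const ToyValue& v) {
     if (v.isObject()) {
       return "object: " + v.toObject()->className;
     }
     return "primitive";
   }

For instance, ``describe(ToyValue::number(1.5))`` returns ``"primitive"``,
while a value built with ``ToyValue::object`` reports the class name of the
object it points to.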

Most objects extend ``NativeObject`` (which is a subtype of ``JSObject``)
which provides a way to store properties as key-value pairs similar to a hash
table. These objects hold their *values* and point to a *Shape* that
represents the set of *keys*. Similar objects point to the same *Shape* which
saves memory and allows the JITs to quickly work with objects similar to ones
they have seen before. See the `[SMDOC] Shapes`_ comment for more details.

C++ (and Rust) code may create and manipulate these objects using the
collection of interfaces we traditionally call the **JSAPI**.


JavaScript Parser
*********************

In order to evaluate script text, we parse it using the *Parser* into a
temporary `Abstract Syntax Tree`_ (AST) and then run the *BytecodeEmitter*
(BCE) to generate `Bytecode`_ and associated metadata. We refer to this
resulting format as `Stencil`_ and it has the helpful characteristic that it
does not utilize the Garbage Collector. The *Stencil* can then be
instantiated into a series of GC *Cells* that can be mutated and understood
by the execution engines described below.

Each function as well as the top-level itself generates a distinct script.
This is the unit of execution granularity since functions may be set as
callbacks that the host runs at a later time. There are both
``ScriptStencil`` and ``js::BaseScript`` forms of scripts.

By default, the parser runs in a mode called *syntax* or *lazy* parsing where
we avoid generating full bytecode for functions within the source that we are
parsing. This lazy parsing is still required to check for all *early errors*
that the specification describes. When such a lazily compiled inner function
is first executed, we recompile just that function in a process called
*delazification*. Lazy parsing avoids allocating the AST and bytecode which
saves both CPU time and memory.
In practice, many functions are never
executed during a given load of a webpage so this delayed parsing can be
quite beneficial.


⚙️ JavaScript Interpreter
**************************

The *bytecode* generated by the parser may be executed by an interpreter
written in C++ that manipulates objects in the GC heap and invokes native
code of the host (e.g. the web browser). See `[SMDOC] Bytecode Definitions`_
for descriptions of each bytecode opcode and ``js/src/vm/Interpreter.cpp`` for
their implementation.


⚡ JavaScript JITs
*******************

.. toctree::
   :maxdepth: 1
   :hidden:

   MIR-optimizations/index

In order to speed up execution of *bytecode*, we use a series of Just-In-Time
(JIT) compilers to generate specialized machine code (e.g. x86, ARM)
tailored to the *JavaScript* that is run and the data that is processed.

As an individual script runs more times (or has a loop that runs many times)
we describe it as getting *hotter* and at certain thresholds we *tier-up* by
JIT-compiling it. Each subsequent JIT tier spends more time compiling but
aims for better execution performance.

Baseline Interpreter
--------------------

The *Baseline Interpreter* is a hybrid interpreter/JIT that interprets the
*bytecode* one opcode at a time, but attaches small fragments of code called
*Inline Caches* (ICs) that speed up executing the same opcode the next
time (if the data is similar enough). See the `[SMDOC] JIT Inline Caches`_
comment for more details.

Baseline Compiler
-----------------

The *Baseline Compiler* uses the same *Inline Caches* mechanism as the
*Baseline Interpreter* but additionally translates the entire bytecode to
native machine code. This removes dispatch overhead and does minor local
optimizations. This machine code still calls back into C++ for complex
operations.
The translation is very fast but the ``BaselineScript`` uses
memory and requires ``mprotect`` and flushing CPU caches.

WarpMonkey
----------

The *WarpMonkey* JIT replaces the former *IonMonkey* engine and is the
highest level of optimization for the most frequently run scripts. It is able
to inline other scripts and specialize code based on the data and arguments
being processed.

We translate the *bytecode* and *Inline Cache* data into a Mid-level
`Intermediate Representation`_ (Ion MIR). This graph is
transformed and optimized before being *lowered* to a Low-level Intermediate
Representation (Ion LIR). Register allocation is performed on this *LIR*,
which is then used to generate native machine code in a process called
*Code Generation*.

See `MIR Optimizations`_ for an overview of MIR optimizations.

The optimizations here assume that a script continues to see data similar to
what has been seen before. The *Baseline* JITs are essential to success here
because they generate *ICs* that match observed data. If, after a script is
compiled with *Warp*, it encounters data that it is not prepared to handle, it
performs a *bailout*. The *bailout* mechanism reconstructs the native machine
stack frame to match the layout used by the *Baseline Interpreter* and then
branches to that interpreter as though we were running it all along. Building
this stack frame may use a special side-table saved by *Warp* to reconstruct
values that are not otherwise available.


WebAssembly
***************

In addition to *JavaScript*, the engine is also able to execute *WebAssembly*
(WASM) sources.

WASM-Baseline (RabaldrMonkey)
-----------------------------

This engine performs fast translation to machine code in order to minimize
latency to first execution.


WASM-Ion (BaldrMonkey)
----------------------

This engine translates the WASM input into the same *MIR* form that
*WarpMonkey* uses and uses the *IonBackend* to optimize. These optimizations
(and in particular, the register allocation) generate very fast native machine
code.

Cranelift
---------

This experimental alternative to *BaldrMonkey* is an optimizing WASM compiler
written in Rust. It is currently used on ARM64-based platforms (which do
not support *BaldrMonkey*).


.. _gc::Cell: https://searchfox.org/mozilla-central/search?q=[SMDOC]+GC+Cell
.. _JSObject: https://searchfox.org/mozilla-central/search?q=[SMDOC]+JSObject+layout
.. _JS::Value: https://searchfox.org/mozilla-central/search?q=[SMDOC]+JS%3A%3AValue+type&path=js%2F
.. _[SMDOC]: https://searchfox.org/mozilla-central/search?q=[SMDOC]&path=js%2F
.. _[SMDOC] Shapes: https://searchfox.org/mozilla-central/search?q=[SMDOC]+Shapes
.. _[SMDOC] Bytecode Definitions: https://searchfox.org/mozilla-central/search?q=[SMDOC]+Bytecode+Definitions&path=js%2F
.. _[SMDOC] JIT Inline Caches: https://searchfox.org/mozilla-central/search?q=[SMDOC]+JIT+Inline+Caches
.. _Stencil: https://searchfox.org/mozilla-central/search?q=[SMDOC]+Script+Stencil
.. _Bytecode: https://en.wikipedia.org/wiki/Bytecode
.. _Abstract Syntax Tree: https://en.wikipedia.org/wiki/Abstract_syntax_tree
.. _Intermediate Representation: https://en.wikipedia.org/wiki/Intermediate_representation
.. _MIR Optimizations: ./MIR-optimizations/index.html