============
SpiderMonkey
============

*SpiderMonkey* is the *JavaScript* and *WebAssembly* implementation library of
the *Mozilla Firefox* web browser. The implementation behaviour is defined by
the `ECMAScript <https://tc39.es/ecma262/>`_ and `WebAssembly
<https://webassembly.org/>`_ specifications.

Much of the internal technical documentation of the engine can be found
throughout the source files themselves by looking for comments labelled with
`[SMDOC]`_. Information about the team, our processes, and about embedding
*SpiderMonkey* in your own projects can be found at https://spidermonkey.dev.

Specific documentation on a few topics is available at:

.. toctree::
   :maxdepth: 1

   build
   test
   Debugger/index
   SavedFrame/index

Components of SpiderMonkey
##########################

Garbage Collector
*********************

.. toctree::
   :maxdepth: 2
   :hidden:

   Overview <gc>
   Rooting Hazard Analysis <HazardAnalysis/index>
   Running the Analysis <HazardAnalysis/running>

*JavaScript* is a garbage-collected language, and at the core of
*SpiderMonkey* we manage a garbage-collected memory heap. Elements of this
heap have a base C++ type of `gc::Cell`_. Each round of garbage collection
frees any *Cell* that is not reachable from a *root*, whether directly or
through other live *Cells*.

See :doc:`GC overview<gc>` for more details.
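
The reachability rule above can be illustrated with a minimal mark-and-sweep
sketch. This is a toy model only: the ``Cell`` class, ``mark`` and ``sweep``
are hypothetical names invented for this example and do not reflect how
``gc::Cell`` or the real collector are implemented.

.. code-block:: python

   class Cell:
       """Hypothetical stand-in for a heap cell; not the real gc::Cell."""

       def __init__(self, name):
           self.name = name
           self.edges = []  # outgoing references to other cells


   def mark(roots):
       """Return the set of cells reachable from the given roots."""
       live = set()
       stack = list(roots)
       while stack:
           cell = stack.pop()
           if cell in live:
               continue
           live.add(cell)
           stack.extend(cell.edges)
       return live


   def sweep(heap, live):
       """Keep only the live cells; everything else is considered freed."""
       return [cell for cell in heap if cell in live]


   a, b, c, d = (Cell(n) for n in "abcd")
   a.edges.append(b)  # root -> a -> b
   c.edges.append(d)  # c and d reference each other,
   d.edges.append(c)  # but no root reaches either of them
   heap = [a, b, c, d]
   survivors = sweep(heap, mark([a]))

Here ``c`` and ``d`` are collected even though they reference each other,
because neither is reachable from a root.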


JS::Value and JSObject
**************************

*JavaScript* values are either objects or primitives
(*Undefined*, *Null*, *Boolean*, *Number*, *BigInt*, *String*, or *Symbol*).
Values are represented with the `JS::Value`_ type which may in turn point to
an object that extends from the `JSObject`_ type. Objects include both plain
*JavaScript* objects and exotic objects representing various things from
functions to *ArrayBuffers* to *HTML Elements* and more.

Most objects extend ``NativeObject`` (which is a subtype of ``JSObject``),
which provides a way to store properties as key-value pairs similar to a hash
table. These objects hold their *values* and point to a *Shape* that
represents the set of *keys*. Similar objects point to the same *Shape*, which
saves memory and allows the JITs to quickly work with objects similar to ones
they have seen before. See the `[SMDOC] Shapes`_ comment for more details.

C++ (and Rust) code may create and manipulate these objects using the
collection of interfaces we traditionally call the **JSAPI**.
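
Returning to the *Shape* sharing described above, the following sketch shows
the idea of objects that store only their values while pointing at a shared
description of their keys. ``Shape`` and ``ToyObject`` here are hypothetical
classes made up for illustration; the real data structures are considerably
more involved.

.. code-block:: python

   class Shape:
       """Hypothetical model: an ordered tuple of property keys."""

       _table = {}  # keys tuple -> Shape, so equal layouts share one Shape

       def __init__(self, keys):
           self.keys = keys

       @classmethod
       def lookup(cls, keys):
           if keys not in cls._table:
               cls._table[keys] = Shape(keys)
           return cls._table[keys]

       def with_key(self, key):
           """Return the shared Shape that extends this one by one key."""
           return Shape.lookup(self.keys + (key,))


   class ToyObject:
       def __init__(self):
           self.shape = Shape.lookup(())
           self.slots = []  # values only; the keys live in the Shape

       def set(self, key, value):
           if key in self.shape.keys:
               self.slots[self.shape.keys.index(key)] = value
           else:
               self.shape = self.shape.with_key(key)
               self.slots.append(value)

       def get(self, key):
           return self.slots[self.shape.keys.index(key)]


   p = ToyObject()
   p.set("x", 1)
   p.set("y", 2)
   q = ToyObject()
   q.set("x", 10)
   q.set("y", 20)

In this model ``p`` and ``q`` end up pointing at the same ``Shape`` object
because they added the same keys in the same order.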


JavaScript Parser
*********************

In order to evaluate script text, we use the *Parser* to build a temporary
`Abstract Syntax Tree`_ (AST) and then run the *BytecodeEmitter* (BCE) to
generate `Bytecode`_ and associated metadata. We refer to this resulting
format as `Stencil`_, and it has the helpful characteristic that it does not
utilize the Garbage Collector. The *Stencil* can then be instantiated into a
series of GC *Cells* that can be mutated and understood by the execution
engines described below.

Each function as well as the top-level itself generates a distinct script.
This is the unit of execution granularity since functions may be set as
callbacks that the host runs at a later time. There are both
``ScriptStencil`` and ``js::BaseScript`` forms of scripts.

By default, the parser runs in a mode called *syntax* or *lazy* parsing where
we avoid generating full bytecode for functions within the source that we are
parsing. This lazy parsing is still required to check for all *early errors*
that the specification describes. When such a lazily compiled inner function
is first executed, we recompile just that function in a process called
*delazification*. Lazy parsing avoids allocating the AST and bytecode, which
saves both CPU time and memory. In practice, many functions are never
executed during a given load of a webpage, so this delayed parsing can be
quite beneficial.
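
As a loose analogy for lazy parsing and *delazification*, the sketch below
checks the syntax of a function up front but only compiles it on the first
call. ``LazyFunction`` is a hypothetical helper written for this example, and
it uses Python's own ``ast.parse`` and ``compile`` rather than anything from
*SpiderMonkey*. Unlike the real syntax parser, ``ast.parse`` does build an AST
(which is discarded here), so only the deferred compilation is modelled.

.. code-block:: python

   import ast


   class LazyFunction:
       """Hypothetical model of a lazily compiled function."""

       def __init__(self, source, name):
           ast.parse(source)  # syntax-only pass: errors surface immediately
           self.source = source
           self.name = name
           self.code = None  # nothing compiled yet
           self.delazified = 0

       def __call__(self, *args):
           if self.code is None:  # first call: compile just this function
               namespace = {}
               exec(compile(self.source, "<toy>", "exec"), namespace)
               self.code = namespace[self.name]
               self.delazified += 1
           return self.code(*args)


   square = LazyFunction("def square(n):\n    return n * n\n", "square")
   never_called = LazyFunction("def unused():\n    return 0\n", "unused")
   first = square(4)
   second = square(5)

A function that is never called, such as ``never_called``, never pays the
compilation cost, while malformed source is still rejected at construction
time.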


⚙️ JavaScript Interpreter
**************************

The *bytecode* generated by the parser may be executed by an interpreter
written in C++ that manipulates objects in the GC heap and invokes native
code of the host (e.g. web browser). See `[SMDOC] Bytecode Definitions`_ for
descriptions of each bytecode opcode and ``js/src/vm/Interpreter.cpp`` for
their implementation.
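
To give a feel for what interpreting bytecode one opcode at a time means, here
is a minimal stack-machine dispatch loop. The opcode names (``PUSH_CONST``,
``ADD``, ``MUL``, ``RETURN``) and the ``interpret`` function are invented for
this sketch and are not *SpiderMonkey* opcodes.

.. code-block:: python

   def interpret(bytecode, constants):
       """Run a list of (opcode, argument) pairs on a toy operand stack."""
       stack = []
       pc = 0
       while True:
           op, arg = bytecode[pc]
           pc += 1
           if op == "PUSH_CONST":
               stack.append(constants[arg])
           elif op == "ADD":
               rhs = stack.pop()
               lhs = stack.pop()
               stack.append(lhs + rhs)
           elif op == "MUL":
               rhs = stack.pop()
               lhs = stack.pop()
               stack.append(lhs * rhs)
           elif op == "RETURN":
               return stack.pop()
           else:
               raise ValueError(f"unknown opcode: {op}")


   # Hypothetical program computing (2 + 3) * 4.
   program = [
       ("PUSH_CONST", 0),
       ("PUSH_CONST", 1),
       ("ADD", None),
       ("PUSH_CONST", 2),
       ("MUL", None),
       ("RETURN", None),
   ]
   result = interpret(program, [2, 3, 4])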


⚡ JavaScript JITs
*******************

.. toctree::
   :maxdepth: 1
   :hidden:

   MIR-optimizations/index

In order to speed up execution of *bytecode*, we use a series of Just-In-Time
(JIT) compilers to generate specialized machine code (e.g. x86 or ARM)
tailored to the *JavaScript* that is run and the data that is processed.

As an individual script runs more times (or has a loop that runs many times)
we describe it as getting *hotter*, and at certain thresholds we *tier-up* by
JIT-compiling it. Each subsequent JIT tier spends more time compiling but
aims for better execution performance.
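
The tier-up idea can be sketched as a warm-up counter compared against a list
of thresholds. The threshold values and the ``tier_for`` function below are
assumptions made purely for illustration; they are not the values or the
mechanism the engine actually uses.

.. code-block:: python

   # Hypothetical thresholds, chosen only for illustration.
   TIERS = [
       (0, "interpreter"),
       (10, "baseline-interpreter"),
       (100, "baseline-compiler"),
       (1000, "warp"),
   ]


   def tier_for(warmup_count):
       """Return the highest tier whose threshold has been reached."""
       current = TIERS[0][1]
       for threshold, name in TIERS:
           if warmup_count >= threshold:
               current = name
       return current


   cold = tier_for(3)
   hot = tier_for(5000)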

Baseline Interpreter
--------------------

The *Baseline Interpreter* is a hybrid interpreter/JIT that interprets the
*bytecode* one opcode at a time, but attaches small fragments of code called
*Inline Caches* (ICs) that speed up executing the same opcode the next time
(if the data is similar enough). See the `[SMDOC] JIT Inline Caches`_
comment for more details.
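
A heavily simplified model of an inline cache is a per-site record of the last
object layout seen and where the wanted property lived in it. The
``PropertySite`` class below is hypothetical and holds a single entry; it only
illustrates the "fast when the data is similar" behaviour and is not how ICs
are actually built.

.. code-block:: python

   class PropertySite:
       """One property-read site (such as ``obj.y``) with a one-entry cache."""

       def __init__(self, key):
           self.key = key
           self.cached_layout = None
           self.cached_index = None
           self.hits = 0
           self.misses = 0

       def read(self, layout, slots):
           """layout is a tuple of keys; slots holds values in the same order."""
           if layout == self.cached_layout:
               self.hits += 1  # fast path: reuse the remembered slot index
               return slots[self.cached_index]
           self.misses += 1  # slow path: search, then remember the result
           self.cached_layout = layout
           self.cached_index = layout.index(self.key)
           return slots[self.cached_index]


   site = PropertySite("y")
   v1 = site.read(("x", "y"), [1, 2])    # miss: layout not seen before
   v2 = site.read(("x", "y"), [10, 20])  # hit: same layout as last time
   v3 = site.read(("y",), [5])           # miss: different layout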

Baseline Compiler
-----------------

The *Baseline Compiler* uses the same *Inline Caches* mechanism from the
*Baseline Interpreter* but additionally translates the entire bytecode to
native machine code. This removes dispatch overhead and does minor local
optimizations. This machine code still calls back into C++ for complex
operations. The translation is very fast, but the ``BaselineScript`` uses
memory and requires ``mprotect`` and flushing CPU caches.

WarpMonkey
----------

The *WarpMonkey* JIT replaces the former *IonMonkey* engine and is the
highest level of optimization for the most frequently run scripts. It is able
to inline other scripts and specialize code based on the data and arguments
being processed.

We translate the *bytecode* and *Inline Cache* data into a Mid-level
`Intermediate Representation`_ (Ion MIR). This graph is transformed and
optimized before being *lowered* to a Low-level Intermediate Representation
(Ion LIR). Register allocation is performed on this *LIR*, and native machine
code is then generated in a process called *Code Generation*.

See `MIR Optimizations`_ for an overview of MIR optimizations.

The optimizations here assume that a script continues to see data similar to
what has been seen before. The *Baseline* JITs are essential to success here
because they generate *ICs* that match observed data. If, after a script is
compiled with *Warp*, it encounters data that it is not prepared to handle, it
performs a *bailout*. The *bailout* mechanism reconstructs the native machine
stack frame to match the layout used by the *Baseline Interpreter* and then
branches to that interpreter as though we were running it all along. Building
this stack frame may use a special side-table saved by *Warp* to reconstruct
values that are not otherwise available.
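
The control flow of a bailout can be sketched as a specialized fast path that
is guarded by its assumptions and falls back to a generic path when a guard
fails. Everything in this example (``Bailout``, ``specialized_add``,
``generic_add``, ``Script``) is hypothetical. In particular, the real
mechanism reconstructs a stack frame and resumes in the middle of the script,
whereas this toy simply re-runs the operation on the slow path.

.. code-block:: python

   class Bailout(Exception):
       """Raised when specialized code meets data it was not compiled for."""


   def generic_add(a, b):
       """Slow path standing in for the Baseline Interpreter."""
       if isinstance(a, str) or isinstance(b, str):
           return str(a) + str(b)  # string concatenation, as JavaScript's + does
       return a + b


   def specialized_add(a, b):
       """Fast path compiled under the assumption that both inputs are ints."""
       if type(a) is not int or type(b) is not int:
           raise Bailout()  # guard failed
       return a + b


   class Script:
       def __init__(self):
           self.bailouts = 0

       def run(self, a, b):
           try:
               return specialized_add(a, b)
           except Bailout:
               self.bailouts += 1
               return generic_add(a, b)


   script = Script()
   fast = script.run(1, 2)    # guard passes, specialized path is used
   slow = script.run("1", 2)  # guard fails, falls back to the generic path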


WebAssembly
***************

In addition to *JavaScript*, the engine is also able to execute *WebAssembly*
(WASM) sources.

WASM-Baseline (RabaldrMonkey)
-----------------------------

This engine performs fast translation to machine code in order to minimize
latency to first execution.

WASM-Ion (BaldrMonkey)
----------------------

This engine translates the WASM input into the same *MIR* form that
*WarpMonkey* uses and uses the *IonBackend* to optimize it. These
optimizations (and in particular, the register allocation) generate very fast
native machine code.

Cranelift
---------

This experimental alternative to *BaldrMonkey* is an optimizing WASM compiler
written in Rust. It is currently used on ARM64-based platforms (which do
not support *BaldrMonkey*).


.. _gc::Cell: https://searchfox.org/mozilla-central/search?q=[SMDOC]+GC+Cell
.. _JSObject: https://searchfox.org/mozilla-central/search?q=[SMDOC]+JSObject+layout
.. _JS::Value: https://searchfox.org/mozilla-central/search?q=[SMDOC]+JS%3A%3AValue+type&path=js%2F
.. _[SMDOC]: https://searchfox.org/mozilla-central/search?q=[SMDOC]&path=js%2F
.. _[SMDOC] Shapes: https://searchfox.org/mozilla-central/search?q=[SMDOC]+Shapes
.. _[SMDOC] Bytecode Definitions: https://searchfox.org/mozilla-central/search?q=[SMDOC]+Bytecode+Definitions&path=js%2F
.. _[SMDOC] JIT Inline Caches: https://searchfox.org/mozilla-central/search?q=[SMDOC]+JIT+Inline+Caches
.. _Stencil: https://searchfox.org/mozilla-central/search?q=[SMDOC]+Script+Stencil
.. _Bytecode: https://en.wikipedia.org/wiki/Bytecode
.. _Abstract Syntax Tree: https://en.wikipedia.org/wiki/Abstract_syntax_tree
.. _Intermediate Representation: https://en.wikipedia.org/wiki/Intermediate_representation
.. _MIR Optimizations: ./MIR-optimizations/index.html