/* Copyright 2018 Mozilla Foundation
 *
 * Licensed under the Apache License, Version 2.0 (the "License");
 * you may not use this file except in compliance with the License.
 * You may obtain a copy of the License at
 *
 *     http://www.apache.org/licenses/LICENSE-2.0
 *
 * Unless required by applicable law or agreed to in writing, software
 * distributed under the License is distributed on an "AS IS" BASIS,
 * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
 * See the License for the specific language governing permissions and
 * limitations under the License.
 */

//! This code bridges Spidermonkey to Cranelift.
//!
//! This documentation explains the role of each high-level function, each notable submodule, and
//! the Spidermonkey idiosyncrasies that are visible here and leak into Cranelift. This is not a
//! technical presentation of how Cranelift works or what it intends to achieve, a task much more
//! suited to the Wasmtime documentation itself:
//!
//! https://github.com/bytecodealliance/wasmtime/blob/master/cranelift/docs/index.md
//!
//! At the time of writing (April 14th, 2020), this code is only used for WebAssembly (wasm)
//! compilation, so this documentation focuses on the wasm integration. As a matter of fact, this
//! glue crate between Baldrmonkey and Cranelift is called Baldrdash, thanks to the usual punsters.
//!
//! ## Relationships to other files
//!
//! * WasmCraneliftCompile.cpp contains all the C++ code that calls into this crate.
//! * clifapi.h describes the C-style bindings to this crate's public functions, used by the C++
//!   code to call into Rust. They're maintained by hand, and thus manual review must ensure the
//!   signatures match those of the functions exposed in this lib.rs file.
//! * baldrapi.h describes the C-style functions exposed through `bindgen` so they can be called
//!   from Rust. Bindings are automatically generated, such that they're safe to use in general.
//!   WasmConstants.h is also exposed through this file, which makes sharing some code easier.
//!
//! ## High-level functions
//!
//! * `cranelift_initialize` performs per-process initialization.
//! * `cranelift_compiler_create` will return a `BatchCompiler`, the high-level data structure
//!   controlling the compilation of a group (batch) of wasm functions. The created compiler should
//!   later be deallocated with `cranelift_compiler_destroy`, once it's not needed anymore.
//! * `cranelift_compile_function` takes care of translating a single wasm function into Cranelift
//!   IR, and compiles it down to machine code. Input data is passed through a const pointer to a
//!   `FuncCompileInput` data structure (defined in bindings), and the return values are stored in
//!   an in-out parameter of type `CompiledFunc` (also defined in bindings).
//!
//! ## Submodules
//!
//! The submodules are deliberately listed in a specific order, so as to make them easier to
//! discover and read.
//!
//! * The `isa` module configures Cranelift, applying some target-independent settings, as well as
//!   target-specific settings. These settings are used both during translation of wasm to
//!   Cranelift IR and compilation to machine code.
//! * The `wasm2clif` module contains the code doing the translation of the wasm code section to
//!   Cranelift IR, implementing all the Spidermonkey-specific behaviors.
//! * The `compile` module takes care of optimizing the Cranelift IR and compiles it down to
//!   machine code, noting down relocations in the process.
//!
//! A few other helper modules are also defined:
//!
//! * The `bindings` module contains C++ bindings automatically generated by `bindgen` in the Cargo
//!   build script (`build.rs`), as well as thin wrappers over these data structures to make these
//!   more ergonomic to use in Rust.
//! * No code base would be feature complete without a bunch of random helpers and functions that
//!   don't really belong anywhere else: the `utils` module contains error handling helpers, to
//!   unify all the Cranelift Error types into one that can be passed around in Baldrdash.
//!
//! ## Spidermonkey idiosyncrasies
//!
//! Most of the Spidermonkey-specific behavior is reflected during conversion of the wasm code to
//! Cranelift IR (in the `wasm2clif` module), but there are some other aspects worth mentioning
//! here.
//!
//! ### Code generation, prologues/epilogues, ABI
//!
//! Cranelift may call into and be called from other functions using the Spidermonkey wasm ABI:
//! that is, code generated by the wasm baseline compiler during tiering, any other wasm stub, even
//! Ion (through the JIT entries and exits).
//!
//! As a matter of fact, it must push the same C++ `wasm::Frame` on the stack before a call, and
//! unwind it properly on exit. To keep this detail orthogonal to Cranelift, the function's
//! prologue and epilogue are **not** generated by Cranelift itself; the C++ code generates them
//! for us. Here, Cranelift only generates the code section and appropriate relocations.
//! The C++ code writes the prologue, copies the machine code section, writes the epilogue, and
//! translates the Cranelift relocations into Spidermonkey relocations.
//!
//! * To avoid generating the prologue and epilogue, Cranelift uses a special calling convention
//!   called Baldrdash in its code. This is set upon creation of the `TargetISA`.
//! * Cranelift must know the offset to the stack argument's base, that is, the size of the
//!   `wasm::Frame`. The `baldrdash_prologue_words` setting is used to propagate this information
//!   to Cranelift.
//! * Since Cranelift-generated functions interact with Ion-ABI functions (Ionmonkey, other wasm
//!   functions) and native (host) functions, they have to respect both calling conventions. In
//!   particular, across function calls they must preserve callee-saved and caller-saved registers
//!   in a way compatible with both ABIs. In practice, this means Cranelift must consider Ion's
//!   callee-saved as its callee-saved, and native's caller-saved as its caller-saved (since it
//!   deals with both ABIs, it has to union the sets).
//!
//! ### Maintaining HeapReg
//!
//! On some targets, Spidermonkey pins one register to keep the heap base accessible at all times,
//! making memory accesses cheaper. This register is excluded from Ion's register allocation, and
//! is manually maintained by Spidermonkey before and after calls.
//!
//! Cranelift has two settings to mimic the same behavior:
//! - `enable_pinned_reg` makes it possible to pin a register and gives access to two Cranelift
//!   instructions for reading it and writing to it.
//! - `use_pinned_reg_as_heap_base` makes the code generator use the pinned register as the heap
//!   base for all Cranelift IR memory accesses.
//!
//! Using both settings makes it possible to reproduce Spidermonkey's behavior. One caveat is that
//! the pinned register used in Cranelift must match the HeapReg register in Spidermonkey for this
//! to work properly.
//!
//! Not using the pinned register as the heap base, when there's a heap register on the platform,
//! means that we have to explicitly maintain it in the prologue and epilogue (because of tiering),
//! which would be another source of slowness.
//!
//! ### Non-streaming validation
//!
//! Ionmonkey is able to iterate over the wasm code section's body, validating and emitting
//! Ionmonkey's internal IR at the same time.
//!
//! Cranelift uses `wasmparser` to parse the wasm binary section, which isn't able to add
//! per-opcode hooks. Instead, Cranelift validates (off the main thread) each function's body
//! before compiling it, one function at a time.

mod bindings;
mod compile;
mod isa;
mod utils;
mod wasm2clif;

use log::{self, error};
use std::ffi::CString;
use std::fmt::Display;
use std::os::raw::c_char;
use std::ptr;

use crate::bindings::{CompiledFunc, FuncCompileInput, ModuleEnvironment, StaticEnvironment};
use crate::compile::BatchCompiler;
use cranelift_codegen::CodegenError;

/// Initializes all the process-wide Cranelift state. It must be called at least once, before any
/// other use of this crate. Calling it more than once is harmless; subsequent calls have no
/// further effect.
#[no_mangle]
pub extern "C" fn cranelift_initialize() {
    // Gecko might set a logger before we do, which is fine; try to initialize ours, and if that
    // fails, restore the max level that `env_logger::try_init` might have changed.
    let filter = log::max_level();
    match env_logger::try_init() {
        Ok(_) => {}
        Err(_) => {
            log::set_max_level(filter);
        }
    }
}

/// Allocate a compiler for a module environment and return an opaque handle.
///
/// It is the caller's responsibility to deallocate the returned `BatchCompiler` later, passing
/// back the opaque handle to a call to `cranelift_compiler_destroy`.
///
/// This is declared in `clifapi.h`.
#[no_mangle]
pub unsafe extern "C" fn cranelift_compiler_create<'a, 'b>(
    static_env: *const StaticEnvironment,
    env: *const bindings::LowLevelModuleEnvironment,
) -> *mut BatchCompiler<'a, 'b> {
    let env = env.as_ref().unwrap();
    let static_env = static_env.as_ref().unwrap();
    match BatchCompiler::new(static_env, ModuleEnvironment::new(env)) {
        Ok(compiler) => Box::into_raw(Box::new(compiler)),
        Err(err) => {
            error!("When constructing the batch compiler: {}", err);
            ptr::null_mut()
        }
    }
}

/// Deallocate a `BatchCompiler` created by `cranelift_compiler_create`.
///
/// Passing any other kind of pointer to this function is technically undefined behavior, thus
/// making the function unsafe to use.
///
/// This is declared in `clifapi.h`.
#[no_mangle]
pub unsafe extern "C" fn cranelift_compiler_destroy(compiler: *mut BatchCompiler) {
    assert!(
        !compiler.is_null(),
        "NULL pointer passed to cranelift_compiler_destroy"
    );
    // Convert the pointer back into the box it came from. Then drop it.
    let _box = Box::from_raw(compiler);
}
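/*
The create/destroy pair above hands a heap allocation across the FFI boundary as an opaque
pointer. The sketch below isolates that ownership pattern under stated assumptions: `Counter` is
a hypothetical stand-in for `BatchCompiler`, and `counter_create`, `counter_add`,
`counter_destroy` and `handle_round_trip` are illustrative names that do not exist in this crate.

```rust
// Hypothetical stand-in for `BatchCompiler`; only the ownership pattern matters here.
pub struct Counter {
    total: u32,
}

// A real export would also carry `#[no_mangle]`, as the functions in this file do.

/// Moves a new `Counter` to the heap and gives the caller an owning raw pointer.
pub extern "C" fn counter_create(start: u32) -> *mut Counter {
    Box::into_raw(Box::new(Counter { total: start }))
}

/// Adds `n` to the counter behind `handle` and returns the new total.
///
/// # Safety
/// `handle` must be a live pointer obtained from `counter_create`.
pub unsafe extern "C" fn counter_add(handle: *mut Counter, n: u32) -> u32 {
    // `as_mut` yields `None` for a null pointer, so misuse fails loudly instead of
    // dereferencing null.
    let counter = unsafe { handle.as_mut() }.expect("null handle");
    counter.total += n;
    counter.total
}

/// Takes ownership back and frees the allocation.
///
/// # Safety
/// `handle` must come from `counter_create` and must not be used afterwards.
pub unsafe extern "C" fn counter_destroy(handle: *mut Counter) {
    assert!(!handle.is_null(), "null handle passed to counter_destroy");
    // Rebuilding the `Box` and dropping it releases the memory exactly once.
    drop(unsafe { Box::from_raw(handle) });
}

/// Plays the role of the C++ caller: create, use, destroy.
pub fn handle_round_trip() -> u32 {
    let handle = counter_create(40);
    let total = unsafe { counter_add(handle, 2) };
    unsafe { counter_destroy(handle) };
    total
}
```

`handle_round_trip` stands in for the C++ side. Every pointer returned by the create function
is passed to the destroy function exactly once, and never used after that.
*/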

/// Converts a displayable error into a heap-allocated C string. Ownership of the returned pointer
/// passes to the caller, who must release it with `cranelift_compiler_free_error`.
fn error_to_cstring<D: Display>(err: D) -> *mut c_char {
    use std::fmt::Write;
    let mut s = String::new();
    let _ = write!(&mut s, "{}", err);
    let cstr = CString::new(s).unwrap();
    cstr.into_raw()
}

/// Compile a single function.
///
/// This is declared in `clifapi.h`.
///
/// If this function returns `false`, an error message has been stored in `*error`, and it *must*
/// later be freed with `cranelift_compiler_free_error()`.
#[no_mangle]
pub unsafe extern "C" fn cranelift_compile_function(
    compiler: *mut BatchCompiler,
    data: *const FuncCompileInput,
    result: *mut CompiledFunc,
    error: *mut *mut c_char,
) -> bool {
    let compiler = compiler.as_mut().unwrap();
    let data = data.as_ref().unwrap();

    // Reset the compiler to a clean state.
    compiler.clear();

    if let Err(e) = compiler.translate_wasm(data) {
        let cstr = error_to_cstring(e);
        *error = cstr;
        return false;
    };

    if let Err(e) = compiler.compile(data.stackmaps()) {
        // Make sure to panic on verifier errors, so that fuzzers see those. Other errors are about
        // unsupported features or implementation limits, so just report them as a user-facing
        // error.
        match e {
            CodegenError::Verifier(verifier_error) => {
                panic!("Cranelift verifier error: {}", verifier_error);
            }
            CodegenError::ImplLimitExceeded
            | CodegenError::CodeTooLarge
            | CodegenError::Unsupported(_) => {
                let cstr = error_to_cstring(e);
                *error = cstr;
                return false;
            }
        }
    };

    // TODO(bbouvier) if destroy is called while one of these objects is alive, you're going to
    // have a bad time. Would be nice to be able to enforce lifetimes across languages, somehow.
    let result = result.as_mut().unwrap();
    result.reset(&compiler.current_func);

    true
}

/// Frees an error string that `cranelift_compile_function` stored in its `error` out-parameter.
#[no_mangle]
pub unsafe extern "C" fn cranelift_compiler_free_error(s: *mut c_char) {
    // Convert back into a `CString` and then let it drop.
    let _cstr = CString::from_raw(s);
}
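/*
Error messages leave this crate through an out-parameter and come back to it to be freed. The
sketch below shows that round trip in isolation, assuming a hypothetical `parse_positive`
operation in place of function compilation; `parse_positive`, `free_error` and `report` are
illustrative names that do not exist in this crate.

```rust
use std::ffi::{CStr, CString};
use std::os::raw::c_char;
use std::ptr;

/// Hypothetical fallible operation; returns `false` and fills `*error` on failure.
///
/// # Safety
/// `error` must point to writable storage for one `*mut c_char`.
pub unsafe extern "C" fn parse_positive(input: i32, error: *mut *mut c_char) -> bool {
    if input > 0 {
        return true;
    }
    // `CString::new` only fails on interior NUL bytes, which this message cannot contain.
    let message = CString::new(format!("{} is not positive", input)).unwrap();
    // `into_raw` gives up ownership on purpose: the caller now owns the buffer.
    unsafe { *error = message.into_raw() };
    false
}

/// Reclaims a message produced by `parse_positive`.
///
/// # Safety
/// `s` must be null or come from `CString::into_raw`, and must not be freed twice.
pub unsafe extern "C" fn free_error(s: *mut c_char) {
    if !s.is_null() {
        // Rebuilding the `CString` lets Rust's allocator free the buffer it allocated.
        drop(unsafe { CString::from_raw(s) });
    }
}

/// Plays the role of the C++ caller and returns the reported message, if any.
pub fn report(input: i32) -> Option<String> {
    let mut error: *mut c_char = ptr::null_mut();
    if unsafe { parse_positive(input, &mut error) } {
        return None;
    }
    // Copy the text out before handing the buffer back to Rust for deallocation.
    let text = unsafe { CStr::from_ptr(error) }.to_string_lossy().into_owned();
    unsafe { free_error(error) };
    Some(text)
}
```

The buffer is released by the same side that allocated it, which is why the string has to be
returned to Rust rather than released with the C allocator's `free`.
*/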

/// Returns whether the current platform (target ISA) is supported.
#[no_mangle]
pub unsafe extern "C" fn cranelift_supports_platform() -> bool {
    isa::platform::IS_SUPPORTED
}