/* Copyright 2018 Mozilla Foundation
 *
 * Licensed under the Apache License, Version 2.0 (the "License");
 * you may not use this file except in compliance with the License.
 * You may obtain a copy of the License at
 *
 *     http://www.apache.org/licenses/LICENSE-2.0
 *
 * Unless required by applicable law or agreed to in writing, software
 * distributed under the License is distributed on an "AS IS" BASIS,
 * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
 * See the License for the specific language governing permissions and
 * limitations under the License.
 */

//! This code bridges Spidermonkey to Cranelift.
//!
//! This documentation explains the role of each high-level function, each notable submodule, and
//! the Spidermonkey idiosyncrasies that are visible here and leak into Cranelift. This is not a
//! technical presentation of how Cranelift works or what it intends to achieve, a task much more
//! suited to the Wasmtime documentation itself:
//!
//! https://github.com/bytecodealliance/wasmtime/blob/master/cranelift/docs/index.md
//!
//! At the time of writing (April 14th, 2020), this code is only used for WebAssembly (wasm)
//! compilation, so this documentation focuses on the wasm integration. As a matter of fact, this
//! glue crate between Baldrmonkey and Cranelift is called Baldrdash, thanks to the usual punsters.
//!
//! ## Relationships to other files
//!
//! * WasmCraneliftCompile.cpp contains all the C++ code that calls into this crate.
//! * clifapi.h describes the C-style bindings to this crate's public functions, used by the C++
//! code to call into Rust. They're maintained by hand, so manual review must ensure the
//! signatures match those of the functions exposed in this lib.rs file.
//! * baldrapi.h describes the C-style functions exposed through `bindgen` so they can be called
//! from Rust. Bindings are automatically generated, so they're generally safe to use.
//! WasmConstants.h is also exposed through this file, which makes sharing some code easier.
//!
//! ## High-level functions
//!
//! * `cranelift_initialize` performs per-process initialization.
//! * `cranelift_compiler_create` returns a `BatchCompiler`, the high-level data structure
//! controlling the compilation of a group (batch) of wasm functions. The created compiler must
//! later be deallocated with `cranelift_compiler_destroy`, once it's no longer needed.
//! * `cranelift_compile_function` translates a single wasm function into Cranelift IR and
//! compiles it down to machine code. Input data is passed through a const pointer to a
//! `FuncCompileInput` data structure (defined in bindings), and the results are stored in an
//! in-out parameter of type `CompiledFunc` (also defined in bindings).
//!
//! ## Submodules
//!
//! The submodules are deliberately listed in a specific order, to make them easier to discover
//! and read.
//!
//! * The `isa` module configures Cranelift, applying some target-independent settings, as well as
//! target-specific settings. These settings are used both during translation of wasm to Cranelift
//! IR and compilation to machine code.
//! * The `wasm2clif` module contains the code doing the translation of the wasm code section to
//! Cranelift IR, implementing all the Spidermonkey-specific behaviors.
//! * The `compile` module takes care of optimizing the Cranelift IR and compiling it down to
//! machine code, noting down relocations in the process.
//!
//! A few other helper modules are also defined:
//!
//! * The `bindings` module contains C++ bindings automatically generated by `bindgen` in the Cargo
//! build script (`build.rs`), as well as thin wrappers over these data structures to make them
//! more ergonomic to use in Rust.
//! * No code base would be feature complete without a bunch of random helpers and functions that
//! don't really belong anywhere else: the `utils` module contains error handling helpers, to unify
//! all the Cranelift Error types into one that can be passed around in Baldrdash.
//!
//! ## Spidermonkey idiosyncrasies
//!
//! Most of the Spidermonkey-specific behavior is reflected during conversion of the wasm code to
//! Cranelift IR (in the `wasm2clif` module), but there are some other aspects worth mentioning
//! here.
//!
//! ### Code generation, prologues/epilogues, ABI
//!
//! Cranelift-generated code may call into and be called from other functions using the
//! Spidermonkey wasm ABI: that is, code generated by the wasm baseline compiler during tiering,
//! any other wasm stub, even Ion (through the JIT entries and exits).
//!
//! Consequently, it must push the same C++ `wasm::Frame` on the stack before a call, and
//! unwind it properly on exit. To keep this detail orthogonal to Cranelift, the function's
//! prologue and epilogue are **not** generated by Cranelift itself; the C++ code generates them
//! for us. Here, Cranelift only generates the code section and appropriate relocations.
//! The C++ code writes the prologue, copies the machine code section, writes the epilogue, and
//! translates the Cranelift relocations into Spidermonkey relocations.
//!
//! * To avoid generating the prologue and epilogue, Cranelift uses a special calling convention
//! called Baldrdash. This is set upon creation of the `TargetIsa`.
//! * Cranelift must know the offset to the base of the stack arguments, that is, the size of the
//! `wasm::Frame`. The `baldrdash_prologue_words` setting is used to propagate this information to
//! Cranelift.
//! * Since Cranelift-generated functions interact with Ion-ABI functions (Ionmonkey, other wasm
//! functions) and native (host) functions, they have to respect both calling conventions. In
//! particular, across function calls, Cranelift must preserve callee-saved and caller-saved
//! registers in a way compatible with both ABIs. In practice, this means Cranelift must consider
//! Ion's callee-saved registers as its callee-saved, and native's caller-saved registers as its
//! caller-saved (since it deals with both ABIs, it has to union the sets).
//!
//! ### Maintaining HeapReg
//!
//! On some targets, Spidermonkey pins one register to keep the heap base accessible at all times,
//! making memory accesses cheaper. This register is excluded from Ion's register allocation, and
//! is manually maintained by Spidermonkey before and after calls.
//!
//! Cranelift has two settings to mimic the same behavior:
//! - `enable_pinned_reg` makes it possible to pin a register and gives access to two Cranelift
//! instructions for reading it and writing to it.
//! - `use_pinned_reg_as_heap_base` makes the code generator use the pinned register as the heap
//! base for all Cranelift IR memory accesses.
//!
//! Using both settings makes it possible to reproduce Spidermonkey's behavior. One caveat is that
//! the pinned register used in Cranelift must match the HeapReg register in Spidermonkey for this
//! to work properly.
//!
//! Not using the pinned register as the heap base, when there's a heap register on the platform,
//! means that we have to explicitly maintain it in the prologue and epilogue (because of tiering),
//! which would be another source of slowness.
//!
//! ### Non-streaming validation
//!
//! Ionmonkey is able to iterate over the wasm code section's body, validating it and emitting
//! Ionmonkey's internal IR at the same time.
//!
//! Cranelift uses `wasmparser` to parse the wasm binary section, which does not provide
//! per-opcode hooks. Instead, Cranelift validates (off the main thread) each function's body
//! before compiling it, one function at a time.

mod bindings;
mod compile;
mod isa;
mod utils;
mod wasm2clif;

use log::{self, error};
use std::ffi::CString;
use std::fmt::Display;
use std::os::raw::c_char;
use std::ptr;

use crate::bindings::{CompiledFunc, FuncCompileInput, ModuleEnvironment, StaticEnvironment};
use crate::compile::BatchCompiler;
use cranelift_codegen::CodegenError;

/// Initializes all the process-wide Cranelift state. It must be called at least once, before any
/// other use of this crate. Calling it more than once is harmless; subsequent calls have no
/// effect.
#[no_mangle]
pub extern "C" fn cranelift_initialize() {
    // Gecko might set a logger before we do, which is fine. Try to initialize ours; if that
    // fails, restore the max log level to its previous value, in case `env_logger::try_init`
    // changed it.
    let filter = log::max_level();
    match env_logger::try_init() {
        Ok(_) => {}
        Err(_) => {
            log::set_max_level(filter);
        }
    }
}

/// Allocate a compiler for a module environment and return an opaque handle.
///
/// It is the caller's responsibility to deallocate the returned BatchCompiler later, passing back
/// the opaque handle to a call to `cranelift_compiler_destroy`.
///
/// This is declared in `clifapi.h`.
#[no_mangle]
pub unsafe extern "C" fn cranelift_compiler_create<'a, 'b>(
    static_env: *const StaticEnvironment,
    env: *const bindings::LowLevelModuleEnvironment,
) -> *mut BatchCompiler<'a, 'b> {
    let env = env.as_ref().unwrap();
    let static_env = static_env.as_ref().unwrap();
    match BatchCompiler::new(static_env, ModuleEnvironment::new(env)) {
        Ok(compiler) => Box::into_raw(Box::new(compiler)),
        Err(err) => {
            error!("When constructing the batch compiler: {}", err);
            ptr::null_mut()
        }
    }
}

/// Deallocate a BatchCompiler created by `cranelift_compiler_create`.
///
/// Passing any other kind of pointer to this function is technically undefined behavior, thus
/// making the function unsafe to use.
///
/// This is declared in `clifapi.h`.
#[no_mangle]
pub unsafe extern "C" fn cranelift_compiler_destroy(compiler: *mut BatchCompiler) {
    assert!(
        !compiler.is_null(),
        "NULL pointer passed to cranelift_compiler_destroy"
    );
    // Convert the pointer back into the box it came from. Then drop it.
    let _box = Box::from_raw(compiler);
}

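/// Converts an error into a heap-allocated, NUL-terminated C string and transfers ownership of
/// it to the caller, who must release it with `cranelift_compiler_free_error`.
///
/// Panics if the formatted message contains an interior NUL byte.
fn error_to_cstring<D: Display>(err: D) -> *mut c_char {
    use std::fmt::Write;
    let mut s = String::new();
    let _ = write!(&mut s, "{}", err);
    let cstr = CString::new(s).unwrap();
    cstr.into_raw()
}
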
/// Compile a single function.
///
/// This is declared in `clifapi.h`.
///
/// Returns `true` on success. On failure, returns `false` and stores an error message in
/// `*error`, which *must* later be freed by `cranelift_compiler_free_error()`.
#[no_mangle]
pub unsafe extern "C" fn cranelift_compile_function(
    compiler: *mut BatchCompiler,
    data: *const FuncCompileInput,
    result: *mut CompiledFunc,
    error: *mut *mut c_char,
) -> bool {
    let compiler = compiler.as_mut().unwrap();
    let data = data.as_ref().unwrap();

    // Reset the compiler to a clean state.
    compiler.clear();

    if let Err(e) = compiler.translate_wasm(data) {
        let cstr = error_to_cstring(e);
        *error = cstr;
        return false;
    };

    if let Err(e) = compiler.compile(data.stackmaps()) {
        // Make sure to panic on verifier errors, so that fuzzers see those. Other errors are about
        // unsupported features or implementation limits, so just report them as a user-facing
        // error.
        match e {
            CodegenError::Verifier(verifier_error) => {
                panic!("Cranelift verifier error: {}", verifier_error);
            }
            CodegenError::ImplLimitExceeded
            | CodegenError::CodeTooLarge
            | CodegenError::Unsupported(_) => {
                let cstr = error_to_cstring(e);
                *error = cstr;
                return false;
            }
        }
    };

    // TODO(bbouvier) if destroy is called while one of these objects is alive, you're going to
    // have a bad time. Would be nice to be able to enforce lifetimes across languages, somehow.
    let result = result.as_mut().unwrap();
    result.reset(&compiler.current_func);

    true
}

/// Free an error string returned through the `error` out parameter of
/// `cranelift_compile_function`.
///
/// Passing any other kind of pointer to this function is undefined behavior.
#[no_mangle]
pub unsafe extern "C" fn cranelift_compiler_free_error(s: *mut c_char) {
    // Convert back into a `CString` and then let it drop.
    let _cstr = CString::from_raw(s);
}

/// Returns whether the current platform (target ISA) is supported.
#[no_mangle]
pub unsafe extern "C" fn cranelift_supports_platform() -> bool {
    isa::platform::IS_SUPPORTED
}