1 //! Implementation of a vanilla ABI, shared between several machines. The
2 //! implementation here assumes that arguments will be passed in registers
3 //! first, then additional args on the stack; that the stack grows downward,
4 //! contains a standard frame (return address and frame pointer), and the
5 //! compiler is otherwise free to allocate space below that with its choice of
6 //! layout; and that the machine has some notion of caller- and callee-save
7 //! registers. Most modern machines, e.g. x86-64 and AArch64, should fit this
8 //! mold and thus both of these backends use this shared implementation.
9 //!
10 //! See the documentation in specific machine backends for the "instantiation"
11 //! of this generic ABI, i.e., which registers are caller/callee-save, arguments
12 //! and return values, and any other special requirements.
13 //!
14 //! For now the implementation here assumes a 64-bit machine, but we intend to
15 //! make this 32/64-bit-generic shortly.
16 //!
17 //! # Vanilla ABI
18 //!
19 //! First, arguments and return values are passed in registers up to a certain
20 //! fixed count, after which they overflow onto the stack. Multiple return
21 //! values either fit in registers, or are returned in a separate return-value
22 //! area on the stack, given by a hidden extra parameter.
23 //!
24 //! Note that the exact stack layout is up to us. We settled on the
25 //! below design based on several requirements. In particular, we need
26 //! to be able to generate instructions (or instruction sequences) to
27 //! access arguments, stack slots, and spill slots before we know how
28 //! many spill slots or clobber-saves there will be, because of our
29 //! pass structure. We also prefer positive offsets to negative
30 //! offsets because of an asymmetry in some machines' addressing modes
31 //! (e.g., on AArch64, positive offsets have a larger possible range
32 //! without a long-form sequence to synthesize an arbitrary
33 //! offset). We also need clobber-save registers to be "near" the
34 //! frame pointer: Windows unwind information requires them to be within
35 //! 240 bytes of RBP. Finally, we are not allowed to access memory
36 //! below the current SP value.
37 //!
38 //! We assume that a prologue first pushes the frame pointer (and
39 //! return address above that, if the machine does not do that in
40 //! hardware). We set FP to point to this two-word frame record. We
41 //! store all other frame slots below this two-word frame record, with
42 //! the stack pointer remaining at or below this fixed frame storage
43 //! for the rest of the function. We can then access frame storage
44 //! slots using positive offsets from SP. In order to allow codegen
45 //! for the latter before knowing how SP might be adjusted around
46 //! callsites, we implement a "nominal SP" tracking feature by which a
47 //! fixup (distance between actual SP and a "nominal" SP) is known at
48 //! each instruction.
49 //!
50 //! Note that if we ever support dynamic stack-space allocation (for
51 //! `alloca`), we will need a way to reference spill slots and stack
52 //! slots without "nominal SP", because we will no longer be able to
53 //! know a static offset from SP to the slots at any particular
54 //! program point. Probably the best solution at that point will be to
55 //! revert to using the frame pointer as the reference for all slots,
56 //! and creating a "nominal FP" synthetic addressing mode (analogous
57 //! to "nominal SP" today) to allow generating spill/reload and
58 //! stackslot accesses before we know how large the clobber-saves will
59 //! be.
60 //!
61 //! # Stack Layout
62 //!
63 //! The stack looks like:
64 //!
65 //! ```plain
66 //!   (high address)
67 //!
68 //!                              +---------------------------+
69 //!                              |          ...              |
70 //!                              | stack args                |
71 //!                              | (accessed via FP)         |
72 //!                              +---------------------------+
73 //! SP at function entry ----->  | return address            |
74 //!                              +---------------------------+
75 //! FP after prologue -------->  | FP (pushed by prologue)   |
76 //!                              +---------------------------+
77 //!                              |          ...              |
78 //!                              | clobbered callee-saves    |
79 //! unwind-frame base     ---->  | (pushed by prologue)      |
80 //!                              +---------------------------+
81 //!                              |          ...              |
82 //!                              | spill slots               |
83 //!                              | (accessed via nominal SP) |
84 //!                              |          ...              |
85 //!                              | stack slots               |
86 //!                              | (accessed via nominal SP) |
87 //! nominal SP --------------->  | (alloc'd by prologue)     |
88 //! (SP at end of prologue)      +---------------------------+
89 //!                              | [alignment as needed]     |
90 //!                              |          ...              |
91 //!                              | args for call             |
92 //! SP before making a call -->  | (pushed at callsite)      |
93 //!                              +---------------------------+
94 //!
95 //!   (low address)
96 //! ```
97 //!
98 //! # Multi-value Returns
99 //!
100 //! Note that we support multi-value returns in two ways. First, we allow for
101 //! multiple return-value registers. Second, if the appropriate flag is set, we
102 //! support the SpiderMonkey Wasm ABI. For details of the multi-value return
103 //! ABI, see:
104 //!
105 //! <https://searchfox.org/mozilla-central/rev/bc3600def806859c31b2c7ac06e3d69271052a89/js/src/wasm/WasmStubs.h#134>
106 //!
107 //! In brief:
108 //! - Return values are processed in *reverse* order.
109 //! - The first return value in this order (so the last return) goes into the
110 //!   ordinary return register.
111 //! - Any further returns go in a struct-return area, allocated upwards (in
112 //!   address order) during the reverse traversal.
113 //! - This struct-return area is provided by the caller, and a pointer to its
114 //!   start is passed as an invisible last (extra) argument. Normally the caller
115 //!   will allocate this area on the stack. When we generate calls, we place it
116 //!   just above the on-stack argument area.
117 //! - So, for example, a function returning 4 i64's (v0, v1, v2, v3), with no
118 //!   formal arguments, would:
119 //!   - Accept a pointer `P` to the struct return area as a hidden argument in the
120 //!     first argument register on entry.
121 //!   - Return v3 in the one and only return-value register.
122 //!   - Return v2 in memory at `[P]`.
123 //!   - Return v1 in memory at `[P+8]`.
124 //!   - Return v0 in memory at `[P+16]`.
125
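// A minimal, self-contained sketch of the SpiderMonkey-style multi-value
// return placement summarized in the module docs above. It is illustrative
// only and is not used by the ABI code in this file. `SketchRetLoc` and
// `sketch_multi_value_ret_locs` are hypothetical names, and the sketch
// assumes exactly one return-value register and 8-byte (i64) return values.
#[allow(dead_code)]
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
enum SketchRetLoc {
    /// The single ordinary return-value register.
    Reg,
    /// Byte offset from the struct-return pointer `P`.
    Mem(u64),
}

/// Assign a location to each of `n` i64 return values. The result is indexed
/// in signature order (v0, v1, ...).
#[allow(dead_code)]
fn sketch_multi_value_ret_locs(n: usize) -> Vec<SketchRetLoc> {
    let mut locs = vec![SketchRetLoc::Reg; n];
    let mut next_offset = 0u64;
    // Walk the return values in reverse: the last one takes the register,
    // and the rest fill the struct-return area upwards in address order.
    for (step, idx) in (0..n).rev().enumerate() {
        if step == 0 {
            locs[idx] = SketchRetLoc::Reg;
        } else {
            locs[idx] = SketchRetLoc::Mem(next_offset);
            next_offset += 8;
        }
    }
    locs
}

// For four return values this yields [Mem(16), Mem(8), Mem(0), Reg], which
// matches the v0..v3 example in the module docs.
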
126 use super::abi::*;
127 use crate::binemit::StackMap;
128 use crate::ir::types::*;
129 use crate::ir::{ArgumentExtension, ArgumentPurpose, StackSlot};
130 use crate::machinst::*;
131 use crate::settings;
132 use crate::CodegenResult;
133 use crate::{ir, isa};
134 use alloc::vec::Vec;
135 use log::{debug, trace};
136 use regalloc::{RealReg, Reg, RegClass, Set, SpillSlot, Writable};
137 use smallvec::{smallvec, SmallVec};
138 use std::convert::TryFrom;
139 use std::marker::PhantomData;
140 use std::mem;
141
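// A small sketch of the "nominal SP" bookkeeping described in the module
// docs: slots are addressed relative to a fixed nominal SP, and a per-
// instruction fixup converts that into an offset from the real SP. This is
// an illustration under assumptions, not the mechanism used below:
// `SketchSpTracker` is a hypothetical type, and its sign convention
// (a negative `amount` means SP moved downward) is chosen for this sketch.
#[allow(dead_code)]
#[derive(Debug, Default)]
struct SketchSpTracker {
    /// Number of bytes by which the real SP currently sits below nominal SP.
    fixup: i64,
}

#[allow(dead_code)]
impl SketchSpTracker {
    /// Record that the real SP moved by `amount` bytes, e.g. a negative
    /// amount when outgoing call arguments are pushed at a callsite.
    fn adjust_sp(&mut self, amount: i64) {
        self.fixup -= amount;
    }

    /// Translate an offset from nominal SP into an offset from the real SP.
    fn real_sp_offset(&self, nominal_offset: i64) -> i64 {
        nominal_offset + self.fixup
    }
}

// With SP moved down by 32 bytes around a call, a slot at nominal SP + 16 is
// found at real SP + 48; once SP is restored it is at real SP + 16 again.
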
142 /// A location for (part of) an argument or return value. These "storage slots"
143 /// are specified for each register-sized part of an argument.
144 #[derive(Clone, Copy, Debug, PartialEq, Eq)]
145 pub enum ABIArgSlot {
146 /// In a real register.
147 Reg {
148 /// Register that holds this arg.
149 reg: RealReg,
150 /// Value type of this arg.
151 ty: ir::Type,
152 /// Should this arg be zero- or sign-extended?
153 extension: ir::ArgumentExtension,
154 },
155 /// Arguments only: on stack, at given offset from SP at entry.
156 Stack {
157 /// Offset of this arg relative to the base of stack args.
158 offset: i64,
159 /// Value type of this arg.
160 ty: ir::Type,
161 /// Should this arg be zero- or sign-extended?
162 extension: ir::ArgumentExtension,
163 },
164 }
165
166 /// An ABIArg is composed of one or more parts. This allows for a CLIF-level
167 /// Value to be passed with its parts in more than one location at the ABI
168 /// level. For example, a 128-bit integer may be passed in two 64-bit registers,
169 /// or even a 64-bit register and a 64-bit stack slot, on a 64-bit machine. The
170 /// number of "parts" should correspond to the number of registers used to store
171 /// this type according to the machine backend.
172 ///
173 /// As an invariant, the `purpose` for every part must match. As a further
174 /// invariant, a `StructArg` part cannot appear with any other part.
175 #[derive(Clone, Debug)]
176 pub enum ABIArg {
177 /// Storage slots (registers or stack locations) for each part of the
178 /// argument value. The number of slots must equal the number of register
179 /// parts used to store a value of this type.
180 Slots {
181 /// Slots, one per register part.
182 slots: Vec<ABIArgSlot>,
183 /// Purpose of this arg.
184 purpose: ir::ArgumentPurpose,
185 },
186 /// Structure argument. We reserve stack space for it, but the CLIF-level
187 /// semantics are a little weird: the value passed to the call instruction,
188 /// and received in the corresponding block param, is a *pointer*. On the
189 /// caller side, we memcpy the data from the passed-in pointer to the stack
190 /// area; on the callee side, we compute a pointer to this stack area and
191 /// provide that as the argument's value.
192 StructArg {
193 /// Offset of this arg relative to base of stack args.
194 offset: i64,
195 /// Size of this arg on the stack.
196 size: u64,
197 /// Purpose of this arg.
198 purpose: ir::ArgumentPurpose,
199 },
200 }
201
202 impl ABIArg {
203 /// Get the purpose of this arg.
204 fn get_purpose(&self) -> ir::ArgumentPurpose {
205 match self {
206 &ABIArg::Slots { purpose, .. } => purpose,
207 &ABIArg::StructArg { purpose, .. } => purpose,
208 }
209 }
210
211 /// Is this a StructArg?
212 fn is_struct_arg(&self) -> bool {
213 match self {
214 &ABIArg::StructArg { .. } => true,
215 _ => false,
216 }
217 }
218
219 /// Create an ABIArg from one register.
220 pub fn reg(
221 reg: RealReg,
222 ty: ir::Type,
223 extension: ir::ArgumentExtension,
224 purpose: ir::ArgumentPurpose,
225 ) -> ABIArg {
226 ABIArg::Slots {
227 slots: vec![ABIArgSlot::Reg { reg, ty, extension }],
228 purpose,
229 }
230 }
231
232 /// Create an ABIArg from one stack slot.
233 pub fn stack(
234 offset: i64,
235 ty: ir::Type,
236 extension: ir::ArgumentExtension,
237 purpose: ir::ArgumentPurpose,
238 ) -> ABIArg {
239 ABIArg::Slots {
240 slots: vec![ABIArgSlot::Stack {
241 offset,
242 ty,
243 extension,
244 }],
245 purpose,
246 }
247 }
248 }
249
250 /// Are we computing information about arguments or return values? Much of the
251 /// handling is factored out into common routines; this enum allows us to
252 /// distinguish which case we're handling.
253 #[derive(Clone, Copy, Debug, PartialEq, Eq)]
254 pub enum ArgsOrRets {
255 /// Arguments.
256 Args,
257 /// Return values.
258 Rets,
259 }
260
261 /// Is an instruction returned by an ABI machine-specific backend a safepoint,
262 /// or not?
263 #[derive(Clone, Copy, Debug, PartialEq, Eq)]
264 pub enum InstIsSafepoint {
265 /// The instruction is a safepoint.
266 Yes,
267 /// The instruction is not a safepoint.
268 No,
269 }
270
271 /// Abstract location for a machine-specific ABI impl to translate into the
272 /// appropriate addressing mode.
273 #[derive(Clone, Copy, Debug)]
274 pub enum StackAMode {
275 /// Offset from the frame pointer, possibly making use of a specific type
276 /// for a scaled indexing operation.
277 FPOffset(i64, ir::Type),
278 /// Offset from the nominal stack pointer, possibly making use of a specific
279 /// type for a scaled indexing operation.
280 NominalSPOffset(i64, ir::Type),
281 /// Offset from the real stack pointer, possibly making use of a specific
282 /// type for a scaled indexing operation.
283 SPOffset(i64, ir::Type),
284 }
285
286 impl StackAMode {
287 /// Offset by an addend.
288 pub fn offset(self, addend: i64) -> Self {
289 match self {
290 StackAMode::FPOffset(off, ty) => StackAMode::FPOffset(off + addend, ty),
291 StackAMode::NominalSPOffset(off, ty) => StackAMode::NominalSPOffset(off + addend, ty),
292 StackAMode::SPOffset(off, ty) => StackAMode::SPOffset(off + addend, ty),
293 }
294 }
295 }
296
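// A hedged usage sketch for an addend-based `offset` helper like the one
// above: stepping through the word-sized parts of a multi-word value that
// starts at a base stack address. `SketchAMode` is a hypothetical, simplified
// stand-in (it drops the type payload and the real-SP variant) so that the
// example is self-contained; `sketch_part_addrs` is likewise hypothetical.
#[allow(dead_code)]
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
enum SketchAMode {
    /// Offset from the frame pointer.
    FpOffset(i64),
    /// Offset from the nominal stack pointer.
    NominalSpOffset(i64),
}

#[allow(dead_code)]
impl SketchAMode {
    /// Return the same kind of address, moved by `addend` bytes.
    fn offset(self, addend: i64) -> Self {
        match self {
            SketchAMode::FpOffset(off) => SketchAMode::FpOffset(off + addend),
            SketchAMode::NominalSpOffset(off) => SketchAMode::NominalSpOffset(off + addend),
        }
    }
}

/// Compute the address of each of `parts` consecutive words starting at `base`.
#[allow(dead_code)]
fn sketch_part_addrs(base: SketchAMode, parts: usize, word_bytes: i64) -> Vec<SketchAMode> {
    (0..parts)
        .map(|i| base.offset(i as i64 * word_bytes))
        .collect()
}
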
297 /// Trait implemented by machine-specific backend to provide information about
298 /// register assignments and to allow generating the specific instructions for
299 /// stack loads/saves, prologues/epilogues, etc.
300 pub trait ABIMachineSpec {
301 /// The instruction type.
302 type I: VCodeInst;
303
304 /// Returns the number of bits in a word, that is 32/64 for 32/64-bit architecture.
305 fn word_bits() -> u32;
306
307 /// Returns the number of bytes in a word.
308 fn word_bytes() -> u32 {
309 Self::word_bits() / 8
310 }
311
312 /// Returns word-size integer type.
313 fn word_type() -> Type {
314 match Self::word_bits() {
315 32 => I32,
316 64 => I64,
317 _ => unreachable!(),
318 }
319 }
320
321 /// Returns word register class.
322 fn word_reg_class() -> RegClass {
323 match Self::word_bits() {
324 32 => RegClass::I32,
325 64 => RegClass::I64,
326 _ => unreachable!(),
327 }
328 }
329
330 /// Returns required stack alignment in bytes.
331 fn stack_align(call_conv: isa::CallConv) -> u32;
332
333 /// Process a list of parameters or return values and allocate them to registers
334 /// and stack slots.
335 ///
336 /// Returns the list of argument locations, the stack space used (rounded up
337 /// as alignment requires), and, if `add_ret_area_ptr` was passed, the
338 /// index of the extra synthetic arg that was added.
339 fn compute_arg_locs(
340 call_conv: isa::CallConv,
341 flags: &settings::Flags,
342 params: &[ir::AbiParam],
343 args_or_rets: ArgsOrRets,
344 add_ret_area_ptr: bool,
345 ) -> CodegenResult<(Vec<ABIArg>, i64, Option<usize>)>;
346
347 /// Returns the offset from FP to the argument area, i.e., jumping over the saved FP, return
348 /// address, and maybe other standard elements depending on ABI (e.g. Wasm TLS reg).
349 fn fp_to_arg_offset(call_conv: isa::CallConv, flags: &settings::Flags) -> i64;
350
351 /// Generate a load from the stack.
352 fn gen_load_stack(mem: StackAMode, into_reg: Writable<Reg>, ty: Type) -> Self::I;
353
354 /// Generate a store to the stack.
355 fn gen_store_stack(mem: StackAMode, from_reg: Reg, ty: Type) -> Self::I;
356
357 /// Generate a move.
358 fn gen_move(to_reg: Writable<Reg>, from_reg: Reg, ty: Type) -> Self::I;
359
360 /// Generate an integer-extend operation.
361 fn gen_extend(
362 to_reg: Writable<Reg>,
363 from_reg: Reg,
364 is_signed: bool,
365 from_bits: u8,
366 to_bits: u8,
367 ) -> Self::I;
368
369 /// Generate a return instruction.
370 fn gen_ret() -> Self::I;
371
372 /// Generate an "epilogue placeholder" instruction, recognized by lowering
373 /// when using the Baldrdash ABI.
374 fn gen_epilogue_placeholder() -> Self::I;
375
376 /// Generate an add-with-immediate. Note that even if this uses a scratch
377 /// register, it must satisfy two requirements:
378 ///
379 /// - The add-imm sequence must only clobber caller-save registers, because
380 /// it will be placed in the prologue before the clobbered callee-save
381 /// registers are saved.
382 ///
383 /// - The add-imm sequence must work correctly when `from_reg` and/or
384 /// `into_reg` are the register returned by `get_stacklimit_reg()`.
385 fn gen_add_imm(into_reg: Writable<Reg>, from_reg: Reg, imm: u32) -> SmallInstVec<Self::I>;
386
387 /// Generate a sequence that traps with a `TrapCode::StackOverflow` code if
388 /// the stack pointer is less than the given limit register (assuming the
389 /// stack grows downward).
390 fn gen_stack_lower_bound_trap(limit_reg: Reg) -> SmallInstVec<Self::I>;
391
392 /// Generate an instruction to compute an address of a stack slot (FP- or
393 /// SP-based offset).
394 fn gen_get_stack_addr(mem: StackAMode, into_reg: Writable<Reg>, ty: Type) -> Self::I;
395
396 /// Get a fixed register to use to compute a stack limit. This is needed for
397 /// certain sequences generated after the register allocator has already
398 /// run. This must satisfy two requirements:
399 ///
400 /// - It must be a caller-save register, because it will be clobbered in the
401 /// prologue before the clobbered callee-save registers are saved.
402 ///
403 /// - It must be safe to pass as an argument and/or destination to
404 /// `gen_add_imm()`. This is relevant when an addition with a large
405 /// immediate needs its own temporary; it cannot use the same fixed
406 /// temporary as this one.
407 fn get_stacklimit_reg() -> Reg;
408
409 /// Generate a load from the given [base+offset] address.
410 fn gen_load_base_offset(into_reg: Writable<Reg>, base: Reg, offset: i32, ty: Type) -> Self::I;
411
412 /// Generate a store to the given [base+offset] address.
413 fn gen_store_base_offset(base: Reg, offset: i32, from_reg: Reg, ty: Type) -> Self::I;
414
415 /// Adjust the stack pointer up or down.
416 fn gen_sp_reg_adjust(amount: i32) -> SmallInstVec<Self::I>;
417
418 /// Generate a meta-instruction that adjusts the nominal SP offset.
419 fn gen_nominal_sp_adj(amount: i32) -> Self::I;
420
421 /// Generate the usual frame-setup sequence for this architecture: e.g.,
422 /// `push rbp / mov rbp, rsp` on x86-64, or `stp fp, lr, [sp, #-16]!` on
423 /// AArch64.
424 fn gen_prologue_frame_setup(flags: &settings::Flags) -> SmallInstVec<Self::I>;
425
426 /// Generate the usual frame-restore sequence for this architecture.
427 fn gen_epilogue_frame_restore(flags: &settings::Flags) -> SmallInstVec<Self::I>;
428
429 /// Generate a probestack call.
430 fn gen_probestack(_frame_size: u32) -> SmallInstVec<Self::I>;
431
432 /// Generate a clobber-save sequence. This takes the list of *all* registers
433 /// written/modified by the function body. The implementation here is
434 /// responsible for determining which of these are callee-saved according to
435 /// the ABI. It should return a sequence of instructions that "push" or
436 /// otherwise save these values to the stack. The sequence of instructions
437 /// should adjust the stack pointer downward, and should align as necessary
438 /// according to ABI requirements.
439 ///
440 /// Returns stack bytes used as well as instructions. Does not adjust
441 /// nominal SP offset; caller will do that.
442 fn gen_clobber_save(
443 call_conv: isa::CallConv,
444 flags: &settings::Flags,
445 clobbers: &Set<Writable<RealReg>>,
446 fixed_frame_storage_size: u32,
447 outgoing_args_size: u32,
448 ) -> (u64, SmallVec<[Self::I; 16]>);
449
450 /// Generate a clobber-restore sequence. This sequence should perform the
451 /// opposite of the clobber-save sequence generated above, assuming that SP
452 /// going into the sequence is at the same point that it was left when the
453 /// clobber-save sequence finished.
454 fn gen_clobber_restore(
455 call_conv: isa::CallConv,
456 flags: &settings::Flags,
457 clobbers: &Set<Writable<RealReg>>,
458 fixed_frame_storage_size: u32,
459 outgoing_args_size: u32,
460 ) -> SmallVec<[Self::I; 16]>;
461
462 /// Generate a call instruction/sequence. This method is provided one
463 /// temporary register to use to synthesize the called address, if needed.
464 fn gen_call(
465 dest: &CallDest,
466 uses: Vec<Reg>,
467 defs: Vec<Writable<Reg>>,
468 opcode: ir::Opcode,
469 tmp: Writable<Reg>,
470 callee_conv: isa::CallConv,
471 caller_conv: isa::CallConv,
472 ) -> SmallVec<[(InstIsSafepoint, Self::I); 2]>;
473
474 /// Generate a memcpy invocation. Used to set up struct args. May clobber
475 /// caller-save registers; we only memcpy before we start to set up args for
476 /// a call.
477 fn gen_memcpy(
478 call_conv: isa::CallConv,
479 dst: Reg,
480 src: Reg,
481 size: usize,
482 ) -> SmallVec<[Self::I; 8]>;
483
484 /// Get the number of spillslots required for the given register-class and
485 /// type.
486 fn get_number_of_spillslots_for_value(rc: RegClass, ty: Type) -> u32;
487
488 /// Get the current virtual-SP offset from an instruction-emission state.
489 fn get_virtual_sp_offset_from_state(s: &<Self::I as MachInstEmit>::State) -> i64;
490
491 /// Get the "nominal SP to FP" offset from an instruction-emission state.
492 fn get_nominal_sp_to_fp(s: &<Self::I as MachInstEmit>::State) -> i64;
493
494 /// Get all caller-save registers, that is, registers that we expect
495 /// not to be saved across a call to a callee with the given ABI.
496 fn get_regs_clobbered_by_call(call_conv_of_callee: isa::CallConv) -> Vec<Writable<Reg>>;
497
498 /// Get the needed extension mode, given the mode attached to the argument
499 /// in the signature and the calling convention. The input (the attribute in
500 /// the signature) specifies what extension type should be done *if* the ABI
501 /// requires extension to the full register; this method's return value
502 /// indicates whether the extension actually *will* be done.
503 fn get_ext_mode(
504 call_conv: isa::CallConv,
505 specified: ir::ArgumentExtension,
506 ) -> ir::ArgumentExtension;
507 }
508
509 /// ABI information shared between body (callee) and caller.
510 struct ABISig {
511 /// Argument locations (regs or stack slots). Stack offsets are relative to
512 /// SP on entry to function.
513 args: Vec<ABIArg>,
514 /// Return-value locations. Stack offsets are relative to the return-area
515 /// pointer.
516 rets: Vec<ABIArg>,
517 /// Space on stack used to store arguments.
518 stack_arg_space: i64,
519 /// Space on stack used to store return values.
520 stack_ret_space: i64,
521 /// Index in `args` of the stack-return-value-area argument.
522 stack_ret_arg: Option<usize>,
523 /// Calling convention used.
524 call_conv: isa::CallConv,
525 }
526
527 impl ABISig {
528 fn from_func_sig<M: ABIMachineSpec>(
529 sig: &ir::Signature,
530 flags: &settings::Flags,
531 ) -> CodegenResult<ABISig> {
532 // Compute args and retvals from signature. Handle retvals first,
533 // because we may need to add a return-area arg to the args.
534 let (rets, stack_ret_space, _) = M::compute_arg_locs(
535 sig.call_conv,
536 flags,
537 &sig.returns,
538 ArgsOrRets::Rets,
539 /* extra ret-area ptr = */ false,
540 )?;
541 let need_stack_return_area = stack_ret_space > 0;
542 let (args, stack_arg_space, stack_ret_arg) = M::compute_arg_locs(
543 sig.call_conv,
544 flags,
545 &sig.params,
546 ArgsOrRets::Args,
547 need_stack_return_area,
548 )?;
549
550 trace!(
551 "ABISig: sig {:?} => args = {:?} rets = {:?} arg stack = {} ret stack = {} stack_ret_arg = {:?}",
552 sig,
553 args,
554 rets,
555 stack_arg_space,
556 stack_ret_space,
557 stack_ret_arg
558 );
559
560 Ok(ABISig {
561 args,
562 rets,
563 stack_arg_space,
564 stack_ret_space,
565 stack_ret_arg,
566 call_conv: sig.call_conv,
567 })
568 }
569 }
570
571 /// ABI object for a function body.
572 pub struct ABICalleeImpl<M: ABIMachineSpec> {
573 /// CLIF-level signature, possibly normalized.
574 ir_sig: ir::Signature,
575 /// Signature: arg and retval regs.
576 sig: ABISig,
577 /// Offsets to each stackslot.
578 stackslots: PrimaryMap<StackSlot, u32>,
579 /// Total stack size of all stackslots.
580 stackslots_size: u32,
581 /// Stack size to be reserved for outgoing arguments.
582 outgoing_args_size: u32,
583 /// Clobbered registers, from regalloc.
584 clobbered: Set<Writable<RealReg>>,
585 /// Total number of spillslots, from regalloc.
586 spillslots: Option<usize>,
587 /// Storage allocated for the fixed part of the stack frame. This is
588 /// usually the same as the total frame size below, except in the case
589 /// of the baldrdash calling convention.
590 fixed_frame_storage_size: u32,
591 /// "Total frame size", as defined by "distance between FP and nominal SP".
592 /// Some items are pushed below nominal SP, so the function may actually use
593 /// more stack than this would otherwise imply. It is simply the initial
594 /// frame/allocation size needed for stackslots and spillslots.
595 total_frame_size: Option<u32>,
596 /// The register holding the return-area pointer, if needed.
597 ret_area_ptr: Option<Writable<Reg>>,
598 /// Calling convention this function expects.
599 call_conv: isa::CallConv,
600 /// The settings controlling this function's compilation.
601 flags: settings::Flags,
602 /// Whether or not this function is a "leaf", meaning it calls no other
603 /// functions.
604 is_leaf: bool,
605 /// If this function has a stack limit specified, then `Reg` is where the
606 /// stack limit will be located after the instructions specified have been
607 /// executed.
608 ///
609 /// Note that this is intended for insertion into the prologue, if
610 /// present. Also note that because the instructions here execute in the
611 /// prologue this happens after legalization/register allocation/etc so we
612 /// need to be extremely careful with each instruction. The instructions are
613 /// manually register-allocated and carefully only use caller-saved
614 /// registers and keep nothing live after this sequence of instructions.
615 stack_limit: Option<(Reg, SmallInstVec<M::I>)>,
616 /// Are we to invoke the probestack function in the prologue? If so,
617 /// what is the minimum size at which we must invoke it?
618 probestack_min_frame: Option<u32>,
619
620 _mach: PhantomData<M>,
621 }
622
623 fn get_special_purpose_param_register(
624 f: &ir::Function,
625 abi: &ABISig,
626 purpose: ir::ArgumentPurpose,
627 ) -> Option<Reg> {
628 let idx = f.signature.special_param_index(purpose)?;
629 match &abi.args[idx] {
630 &ABIArg::Slots { ref slots, .. } => match &slots[0] {
631 &ABIArgSlot::Reg { reg, .. } => Some(reg.to_reg()),
632 _ => None,
633 },
634 _ => None,
635 }
636 }
637
638 impl<M: ABIMachineSpec> ABICalleeImpl<M> {
639 /// Create a new body ABI instance.
640 pub fn new(f: &ir::Function, flags: settings::Flags) -> CodegenResult<Self> {
641 debug!("ABI: func signature {:?}", f.signature);
642
643 let ir_sig = ensure_struct_return_ptr_is_returned(&f.signature);
644 let sig = ABISig::from_func_sig::<M>(&ir_sig, &flags)?;
645
646 let call_conv = f.signature.call_conv;
647 // Only these calling conventions are supported.
648 debug_assert!(
649 call_conv == isa::CallConv::SystemV
650 || call_conv == isa::CallConv::Fast
651 || call_conv == isa::CallConv::Cold
652 || call_conv.extends_baldrdash()
653 || call_conv.extends_windows_fastcall()
654 || call_conv == isa::CallConv::AppleAarch64
655 || call_conv == isa::CallConv::WasmtimeSystemV,
656 "Unsupported calling convention: {:?}",
657 call_conv
658 );
659
660 // Compute stackslot locations and total stackslot size.
661 let mut stack_offset: u32 = 0;
662 let mut stackslots = PrimaryMap::new();
663 for (stackslot, data) in f.stack_slots.iter() {
664 let off = stack_offset;
665 stack_offset += data.size;
666 let mask = M::word_bytes() - 1;
667 stack_offset = (stack_offset + mask) & !mask;
668 debug_assert_eq!(stackslot.as_u32() as usize, stackslots.len());
669 stackslots.push(off);
670 }
671
672 // Figure out what instructions, if any, will be needed to check the
673 // stack limit. This can either be specified as a special-purpose
674 // argument or as a global value which often calculates the stack limit
675 // from the arguments.
676 let stack_limit =
677 get_special_purpose_param_register(f, &sig, ir::ArgumentPurpose::StackLimit)
678 .map(|reg| (reg, smallvec![]))
679 .or_else(|| f.stack_limit.map(|gv| gen_stack_limit::<M>(f, &sig, gv)));
680
681 // Determine whether a probestack call is required for large enough
682 // frames (and the minimum frame size if so).
683 let probestack_min_frame = if flags.enable_probestack() {
684 assert!(
685 !flags.probestack_func_adjusts_sp(),
686 "SP-adjusting probestack not supported in new backends"
687 );
688 Some(1 << flags.probestack_size_log2())
689 } else {
690 None
691 };
692
693 Ok(Self {
694 ir_sig,
695 sig,
696 stackslots,
697 stackslots_size: stack_offset,
698 outgoing_args_size: 0,
699 clobbered: Set::empty(),
700 spillslots: None,
701 fixed_frame_storage_size: 0,
702 total_frame_size: None,
703 ret_area_ptr: None,
704 call_conv,
705 flags,
706 is_leaf: f.is_leaf(),
707 stack_limit,
708 probestack_min_frame,
709 _mach: PhantomData,
710 })
711 }
712
713 /// Inserts instructions necessary for checking the stack limit into the
714 /// prologue.
715 ///
716 /// This function will generate instructions necessary to perform a stack
717 /// check at the header of a function. The stack check is intended to trap
718 /// if the stack pointer goes below a particular threshold, preventing stack
719 /// overflow in wasm or other code. The `stack_limit` argument here is the
720 /// register which holds the threshold below which we're supposed to trap.
721 /// This function is known to allocate `stack_size` bytes and we'll push
722 /// instructions onto `insts`.
723 ///
724 /// Note that the instructions generated here are special because this is
725 /// happening so late in the pipeline (e.g. after register allocation). This
726 /// means that we need to do manual register allocation here and also be
727 /// careful to not clobber any callee-saved or argument registers. For now
728 /// this routine makes do with the `spilltmp_reg` as one temporary
729 /// register, and a second register of `tmp2` which is caller-saved. This
730 /// should be fine for us since no spills should happen in this sequence of
731 /// instructions, so our register won't get accidentally clobbered.
732 ///
733 /// No values can be live after the prologue, but in this case that's ok
734 /// because we just need to perform a stack check before progressing with
735 /// the rest of the function.
736 fn insert_stack_check(
737 &self,
738 stack_limit: Reg,
739 stack_size: u32,
740 insts: &mut SmallInstVec<M::I>,
741 ) {
742 // With no explicit stack allocated we can just emit the simple check of
743 // the stack pointer against the stack limit register, and trap if
744 // it's out of bounds.
745 if stack_size == 0 {
746 insts.extend(M::gen_stack_lower_bound_trap(stack_limit));
747 return;
748 }
749
750 // Note that the 32k stack size here is pretty special. See the
751 // documentation in x86/abi.rs for why this is here. The general idea is
752 // that we're protecting against overflow in the addition that happens
753 // below.
754 if stack_size >= 32 * 1024 {
755 insts.extend(M::gen_stack_lower_bound_trap(stack_limit));
756 }
757
758 // Add the `stack_size` to `stack_limit`, placing the result in
759 // `scratch`.
760 //
761 // Note though that `stack_limit`'s register may be the same as
762 // `scratch`. If our stack size doesn't fit into an immediate this
763 // means we need a second scratch register for loading the stack size
764 // into a register.
765 let scratch = Writable::from_reg(M::get_stacklimit_reg());
766 insts.extend(M::gen_add_imm(scratch, stack_limit, stack_size).into_iter());
767 insts.extend(M::gen_stack_lower_bound_trap(scratch.to_reg()));
768 }
769 }
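/*
A minimal sketch of the arithmetic behind `insert_stack_check`, assuming the
lower-bound trap fires when SP is strictly below the compared register (the
exact comparison is machine-specific and not shown in this file). The function
name `stack_check_would_trap` and its plain-integer parameters are hypothetical
and only model the emitted instruction sequence; they are not part of this
module.

```rust
/// Hypothetical model: returns `true` if the emitted checks would trap.
fn stack_check_would_trap(sp: u64, stack_limit: u64, stack_size: u32) -> bool {
    // No explicit frame: compare SP directly against the limit.
    if stack_size == 0 {
        return sp < stack_limit;
    }
    // Large frames get an extra check against the raw limit first, so that
    // the addition below cannot hide an already-exhausted stack.
    if stack_size >= 32 * 1024 && sp < stack_limit {
        return true;
    }
    // limit + frame size; `wrapping_add` models a machine add that can wrap.
    let threshold = stack_limit.wrapping_add(u64::from(stack_size));
    sp < threshold
}
```

For example, with `sp = 0x1000` and `stack_limit = 0x800`, a frame of `0x100`
bytes passes the check, while a frame of `0x900` bytes would trap.
*/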
770
771 /// Generates the instructions necessary for the `gv` to be materialized into a
772 /// register.
773 ///
774 /// This function will return a register that will contain the result of
775 /// evaluating `gv`. It will also return any instructions necessary to calculate
776 /// the value of the register.
777 ///
778 /// Note that global values are typically lowered to instructions via the
779 /// standard legalization pass. Unfortunately though prologue generation happens
780 /// so late in the pipeline that we can't use these legalization passes to
781 /// generate the instructions for `gv`. As a result we duplicate some lowering
782 /// of `gv` here and support only some global values. This is similar to what
783 /// the x86 backend does for now, and hopefully this can be somewhat cleaned up
784 /// in the future too!
785 ///
786 /// Also note that this function will make use of `writable_spilltmp_reg()` as a
787 /// temporary register to store values in if necessary. Currently, after we write
788 /// to this register, it is guaranteed that no spills occur before the register
789 /// is used, because we're not participating in register allocation anyway!
790 fn gen_stack_limit<M: ABIMachineSpec>(
791 f: &ir::Function,
792 abi: &ABISig,
793 gv: ir::GlobalValue,
794 ) -> (Reg, SmallInstVec<M::I>) {
795 let mut insts = smallvec![];
796 let reg = generate_gv::<M>(f, abi, gv, &mut insts);
797 return (reg, insts);
798 }
799
800 fn generate_gv<M: ABIMachineSpec>(
801 f: &ir::Function,
802 abi: &ABISig,
803 gv: ir::GlobalValue,
804 insts: &mut SmallInstVec<M::I>,
805 ) -> Reg {
806 match f.global_values[gv] {
807 // Return the direct register the vmcontext is in
808 ir::GlobalValueData::VMContext => {
809 get_special_purpose_param_register(f, abi, ir::ArgumentPurpose::VMContext)
810 .expect("no vmcontext parameter found")
811 }
812 // Load our base value into a register, then load from that register
813 // into a temporary register.
814 ir::GlobalValueData::Load {
815 base,
816 offset,
817 global_type: _,
818 readonly: _,
819 } => {
820 let base = generate_gv::<M>(f, abi, base, insts);
821 let into_reg = Writable::from_reg(M::get_stacklimit_reg());
822 insts.push(M::gen_load_base_offset(
823 into_reg,
824 base,
825 offset.into(),
826 M::word_type(),
827 ));
828 return into_reg.to_reg();
829 }
830 ref other => panic!("global value for stack limit not supported: {}", other),
831 }
832 }
833
834 /// Return a type either from an optional type hint, or if not, from the default
835 /// type associated with the given register's class. This is used to generate
836 /// loads/spills appropriately given the type of value loaded/stored (which may
837 /// be narrower than the spillslot). We usually have the type because the
838 /// regalloc usually provides the vreg being spilled/reloaded, and we know every
839 /// vreg's type. However, the regalloc *can* request a spill/reload without an
840 /// associated vreg when needed to satisfy a safepoint (which requires all
841 /// ref-typed values, even those in real registers in the original vcode, to be
842 /// in spillslots).
843 fn ty_from_ty_hint_or_reg_class<M: ABIMachineSpec>(r: Reg, ty: Option<Type>) -> Type {
844 match (ty, r.get_class()) {
845 // If the type is provided
846 (Some(t), _) => t,
847 // If no type is provided, this should be a register spill for a
848 // safepoint, so we only expect I32/I64 (integer) registers.
849 (None, rc) if rc == M::word_reg_class() => M::word_type(),
850 _ => panic!("Unexpected register class!"),
851 }
852 }
853
854 fn gen_load_stack_multi<M: ABIMachineSpec>(
855 from: StackAMode,
856 dst: ValueRegs<Writable<Reg>>,
857 ty: Type,
858 ) -> SmallInstVec<M::I> {
859 let mut ret = smallvec![];
860 let (_, tys) = M::I::rc_for_type(ty).unwrap();
861 let mut offset = 0;
862 // N.B.: registers are given in the `ValueRegs` in target endian order.
863 for (&dst, &ty) in dst.regs().iter().zip(tys.iter()) {
864 ret.push(M::gen_load_stack(from.offset(offset), dst, ty));
865 offset += ty.bytes() as i64;
866 }
867 ret
868 }
869
870 fn gen_store_stack_multi<M: ABIMachineSpec>(
871 from: StackAMode,
872 src: ValueRegs<Reg>,
873 ty: Type,
874 ) -> SmallInstVec<M::I> {
875 let mut ret = smallvec![];
876 let (_, tys) = M::I::rc_for_type(ty).unwrap();
877 let mut offset = 0;
878 // N.B.: registers are given in the `ValueRegs` in target endian order.
879 for (&src, &ty) in src.regs().iter().zip(tys.iter()) {
880 ret.push(M::gen_store_stack(from.offset(offset), src, ty));
881 offset += ty.bytes() as i64;
882 }
883 ret
884 }
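/*
The offset bookkeeping shared by `gen_load_stack_multi` and
`gen_store_stack_multi` can be sketched on plain integers. This is only an
illustration: `part_offsets` is a hypothetical helper, and the byte sizes
stand in for the per-register types that `rc_for_type` would return.

```rust
/// Hypothetical helper: the stack offset used for each register of a
/// multi-register value, given the size in bytes of each part.
fn part_offsets(base: i64, part_bytes: &[u32]) -> Vec<i64> {
    let mut offset = 0i64;
    let mut out = Vec::with_capacity(part_bytes.len());
    for &bytes in part_bytes {
        // Each part is accessed at the running offset from the base...
        out.push(base + offset);
        // ...and the next part follows immediately after it.
        offset += i64::from(bytes);
    }
    out
}
```

Under these assumptions, a value split across two 8-byte registers at base
offset 16 is accessed at offsets 16 and 24.
*/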
885
886 fn ensure_struct_return_ptr_is_returned(sig: &ir::Signature) -> ir::Signature {
887 let params_structret = sig
888 .params
889 .iter()
890 .find(|p| p.purpose == ArgumentPurpose::StructReturn);
891 let rets_have_structret = sig.returns.len() > 0
892 && sig
893 .returns
894 .iter()
895 .any(|arg| arg.purpose == ArgumentPurpose::StructReturn);
896 let mut sig = sig.clone();
897 if params_structret.is_some() && !rets_have_structret {
898 sig.returns.insert(0, params_structret.unwrap().clone());
899 }
900 sig
901 }
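/*
A rough sketch of the normalization performed by
`ensure_struct_return_ptr_is_returned`, using simplified stand-in types. The
`Purpose` and `Sig` types and the `ensure_sret_returned` function are
hypothetical; the real code operates on `ir::Signature` and clones the whole
parameter, not just its purpose.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Purpose {
    Normal,
    StructReturn,
}

#[derive(Debug, Clone, PartialEq, Eq)]
struct Sig {
    params: Vec<Purpose>,
    returns: Vec<Purpose>,
}

/// If the signature takes a struct-return pointer but does not return it,
/// prepend it to the return values; otherwise return the signature unchanged.
fn ensure_sret_returned(sig: &Sig) -> Sig {
    let mut out = sig.clone();
    let has_param = sig.params.iter().any(|p| *p == Purpose::StructReturn);
    let has_ret = sig.returns.iter().any(|r| *r == Purpose::StructReturn);
    if has_param && !has_ret {
        out.returns.insert(0, Purpose::StructReturn);
    }
    out
}
```

Applying the function twice gives the same result as applying it once, since
the second call already finds the struct-return value among the returns.
*/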
902
903 impl<M: ABIMachineSpec> ABICallee for ABICalleeImpl<M> {
904 type I = M::I;
905
906 fn signature(&self) -> &ir::Signature {
907 &self.ir_sig
908 }
909
910 fn temp_needed(&self) -> Option<Type> {
911 if self.sig.stack_ret_arg.is_some() {
912 Some(M::word_type())
913 } else {
914 None
915 }
916 }
917
918 fn init(&mut self, maybe_tmp: Option<Writable<Reg>>) {
919 if self.sig.stack_ret_arg.is_some() {
920 assert!(maybe_tmp.is_some());
921 self.ret_area_ptr = maybe_tmp;
922 }
923 }
924
925 fn accumulate_outgoing_args_size(&mut self, size: u32) {
926 if size > self.outgoing_args_size {
927 self.outgoing_args_size = size;
928 }
929 }
930
931 fn flags(&self) -> &settings::Flags {
932 &self.flags
933 }
934
935 fn call_conv(&self) -> isa::CallConv {
936 self.sig.call_conv
937 }
938
939 fn liveins(&self) -> Set<RealReg> {
940 let mut set: Set<RealReg> = Set::empty();
941 for arg in &self.sig.args {
942 if let &ABIArg::Slots { ref slots, .. } = arg {
943 for slot in slots {
944 if let ABIArgSlot::Reg { reg, .. } = slot {
945 set.insert(*reg);
946 }
947 }
948 }
949 }
950 set
951 }
952
953 fn liveouts(&self) -> Set<RealReg> {
954 let mut set: Set<RealReg> = Set::empty();
955 for ret in &self.sig.rets {
956 if let &ABIArg::Slots { ref slots, .. } = ret {
957 for slot in slots {
958 if let ABIArgSlot::Reg { reg, .. } = slot {
959 set.insert(*reg);
960 }
961 }
962 }
963 }
964 set
965 }
966
967 fn num_args(&self) -> usize {
968 self.sig.args.len()
969 }
970
971 fn num_retvals(&self) -> usize {
972 self.sig.rets.len()
973 }
974
975 fn num_stackslots(&self) -> usize {
976 self.stackslots.len()
977 }
978
979 fn stackslot_offsets(&self) -> &PrimaryMap<StackSlot, u32> {
980 &self.stackslots
981 }
982
983 fn gen_copy_arg_to_regs(
984 &self,
985 idx: usize,
986 into_regs: ValueRegs<Writable<Reg>>,
987 ) -> SmallInstVec<Self::I> {
988 let mut insts = smallvec![];
989 match &self.sig.args[idx] {
990 &ABIArg::Slots { ref slots, .. } => {
991 assert_eq!(into_regs.len(), slots.len());
992 for (slot, into_reg) in slots.iter().zip(into_regs.regs().iter()) {
993 match slot {
994 // Extension mode doesn't matter (we're copying out, not in; we
995 // ignore high bits by convention).
996 &ABIArgSlot::Reg { reg, ty, .. } => {
997 insts.push(M::gen_move(*into_reg, reg.to_reg(), ty));
998 }
999 &ABIArgSlot::Stack { offset, ty, .. } => {
1000 insts.push(M::gen_load_stack(
1001 StackAMode::FPOffset(
1002 M::fp_to_arg_offset(self.call_conv, &self.flags) + offset,
1003 ty,
1004 ),
1005 *into_reg,
1006 ty,
1007 ));
1008 }
1009 }
1010 }
1011 }
1012 &ABIArg::StructArg { offset, .. } => {
1013 let into_reg = into_regs.only_reg().unwrap();
1014 insts.push(M::gen_get_stack_addr(
1015 StackAMode::FPOffset(
1016 M::fp_to_arg_offset(self.call_conv, &self.flags) + offset,
1017 I8,
1018 ),
1019 into_reg,
1020 I8,
1021 ));
1022 }
1023 }
1024 insts
1025 }
1026
1027 fn arg_is_needed_in_body(&self, idx: usize) -> bool {
1028 match self.sig.args[idx].get_purpose() {
1029 // Special Baldrdash-specific pseudo-args that are present only to
1030 // fill stack slots. Won't ever be used as ordinary values in the
1031 // body.
1032 ir::ArgumentPurpose::CalleeTLS | ir::ArgumentPurpose::CallerTLS => false,
1033 _ => true,
1034 }
1035 }
1036
1037 fn gen_copy_regs_to_retval(
1038 &self,
1039 idx: usize,
1040 from_regs: ValueRegs<Writable<Reg>>,
1041 ) -> SmallInstVec<Self::I> {
1042 let mut ret = smallvec![];
1043 let word_bits = M::word_bits() as u8;
1044 match &self.sig.rets[idx] {
1045 &ABIArg::Slots { ref slots, .. } => {
1046 assert_eq!(from_regs.len(), slots.len());
1047 for (slot, from_reg) in slots.iter().zip(from_regs.regs().iter()) {
1048 match slot {
1049 &ABIArgSlot::Reg {
1050 reg, ty, extension, ..
1051 } => {
1052 let from_bits = ty_bits(ty) as u8;
1053 let ext = M::get_ext_mode(self.sig.call_conv, extension);
1054 match (ext, from_bits) {
1055 (ArgumentExtension::Uext, n) | (ArgumentExtension::Sext, n)
1056 if n < word_bits =>
1057 {
1058 let signed = ext == ArgumentExtension::Sext;
1059 ret.push(M::gen_extend(
1060 Writable::from_reg(reg.to_reg()),
1061 from_reg.to_reg(),
1062 signed,
1063 from_bits,
1064 /* to_bits = */ word_bits,
1065 ));
1066 }
1067 _ => {
1068 ret.push(M::gen_move(
1069 Writable::from_reg(reg.to_reg()),
1070 from_reg.to_reg(),
1071 ty,
1072 ));
1073 }
1074 };
1075 }
1076 &ABIArgSlot::Stack {
1077 offset,
1078 ty,
1079 extension,
1080 ..
1081 } => {
1082 let mut ty = ty;
1083 let from_bits = ty_bits(ty) as u8;
1084 // A machine ABI implementation should ensure that stack frames
1085 // have "reasonable" size. All current ABIs for machinst
1086 // backends (aarch64 and x64) enforce a 128MB limit.
1087 let off = i32::try_from(offset).expect(
1088 "Argument stack offset greater than 2GB; should hit impl limit first",
1089 );
1090 let ext = M::get_ext_mode(self.sig.call_conv, extension);
1091 // Trash the from_reg; it should be its last use.
1092 match (ext, from_bits) {
1093 (ArgumentExtension::Uext, n) | (ArgumentExtension::Sext, n)
1094 if n < word_bits =>
1095 {
1096 assert_eq!(M::word_reg_class(), from_reg.to_reg().get_class());
1097 let signed = ext == ArgumentExtension::Sext;
1098 ret.push(M::gen_extend(
1099 Writable::from_reg(from_reg.to_reg()),
1100 from_reg.to_reg(),
1101 signed,
1102 from_bits,
1103 /* to_bits = */ word_bits,
1104 ));
1105 // Store the extended version.
1106 ty = M::word_type();
1107 }
1108 _ => {}
1109 };
1110 ret.push(M::gen_store_base_offset(
1111 self.ret_area_ptr.unwrap().to_reg(),
1112 off,
1113 from_reg.to_reg(),
1114 ty,
1115 ));
1116 }
1117 }
1118 }
1119 }
1120 &ABIArg::StructArg { .. } => {
1121 panic!("StructArg in return position is unsupported");
1122 }
1123 }
1124 ret
1125 }
1126
1127 fn gen_retval_area_setup(&self) -> Option<Self::I> {
1128 if let Some(i) = self.sig.stack_ret_arg {
1129 let insts = self.gen_copy_arg_to_regs(i, ValueRegs::one(self.ret_area_ptr.unwrap()));
1130 let inst = insts.into_iter().next().unwrap();
1131 trace!(
1132 "gen_retval_area_setup: inst {:?}; ptr reg is {:?}",
1133 inst,
1134 self.ret_area_ptr.unwrap().to_reg()
1135 );
1136 Some(inst)
1137 } else {
1138 trace!("gen_retval_area_setup: not needed");
1139 None
1140 }
1141 }
1142
1143 fn gen_ret(&self) -> Self::I {
1144 M::gen_ret()
1145 }
1146
1147 fn gen_epilogue_placeholder(&self) -> Self::I {
1148 M::gen_epilogue_placeholder()
1149 }
1150
1151 fn set_num_spillslots(&mut self, slots: usize) {
1152 self.spillslots = Some(slots);
1153 }
1154
1155 fn set_clobbered(&mut self, clobbered: Set<Writable<RealReg>>) {
1156 self.clobbered = clobbered;
1157 }
1158
1159 /// Load from a stackslot.
1160 fn load_stackslot(
1161 &self,
1162 slot: StackSlot,
1163 offset: u32,
1164 ty: Type,
1165 into_regs: ValueRegs<Writable<Reg>>,
1166 ) -> SmallInstVec<Self::I> {
1167 // Offset from beginning of stackslot area, which is at nominal SP (see
1168 // [MemArg::NominalSPOffset] for more details on nominal SP tracking).
1169 let stack_off = self.stackslots[slot] as i64;
1170 let sp_off: i64 = stack_off + (offset as i64);
1171 trace!("load_stackslot: slot {} -> sp_off {}", slot, sp_off);
1172 gen_load_stack_multi::<M>(StackAMode::NominalSPOffset(sp_off, ty), into_regs, ty)
1173 }
1174
1175 /// Store to a stackslot.
1176 fn store_stackslot(
1177 &self,
1178 slot: StackSlot,
1179 offset: u32,
1180 ty: Type,
1181 from_regs: ValueRegs<Reg>,
1182 ) -> SmallInstVec<Self::I> {
1183 // Offset from beginning of stackslot area, which is at nominal SP (see
1184 // [MemArg::NominalSPOffset] for more details on nominal SP tracking).
1185 let stack_off = self.stackslots[slot] as i64;
1186 let sp_off: i64 = stack_off + (offset as i64);
1187 trace!("store_stackslot: slot {} -> sp_off {}", slot, sp_off);
1188 gen_store_stack_multi::<M>(StackAMode::NominalSPOffset(sp_off, ty), from_regs, ty)
1189 }
1190
1191 /// Produce an instruction that computes a stackslot address.
1192 fn stackslot_addr(&self, slot: StackSlot, offset: u32, into_reg: Writable<Reg>) -> Self::I {
1193 // Offset from beginning of stackslot area, which is at nominal SP (see
1194 // [MemArg::NominalSPOffset] for more details on nominal SP tracking).
1195 let stack_off = self.stackslots[slot] as i64;
1196 let sp_off: i64 = stack_off + (offset as i64);
1197 M::gen_get_stack_addr(StackAMode::NominalSPOffset(sp_off, I8), into_reg, I8)
1198 }
1199
1200 /// Load from a spillslot.
1201 fn load_spillslot(
1202 &self,
1203 slot: SpillSlot,
1204 ty: Type,
1205 into_regs: ValueRegs<Writable<Reg>>,
1206 ) -> SmallInstVec<Self::I> {
1207 // Offset from beginning of spillslot area, which is at nominal SP + stackslots_size.
1208 let islot = slot.get() as i64;
1209 let spill_off = islot * M::word_bytes() as i64;
1210 let sp_off = self.stackslots_size as i64 + spill_off;
1211 trace!("load_spillslot: slot {:?} -> sp_off {}", slot, sp_off);
1212
1213 // Integer types smaller than word size are spilled as words (see
1214 // `store_spillslot`), and therefore must be reloaded as words as well.
1215 let ty = if ty.is_int() && ty.bytes() < M::word_bytes() {
1216 M::word_type()
1217 } else {
1218 ty
1219 };
1220
1221 gen_load_stack_multi::<M>(StackAMode::NominalSPOffset(sp_off, ty), into_regs, ty)
1222 }
1223
1224 /// Store to a spillslot.
1225 fn store_spillslot(
1226 &self,
1227 slot: SpillSlot,
1228 ty: Type,
1229 from_regs: ValueRegs<Reg>,
1230 ) -> SmallInstVec<Self::I> {
1231 // Offset from beginning of spillslot area, which is at nominal SP + stackslots_size.
1232 let islot = slot.get() as i64;
1233 let spill_off = islot * M::word_bytes() as i64;
1234 let sp_off = self.stackslots_size as i64 + spill_off;
1235 trace!("store_spillslot: slot {:?} -> sp_off {}", slot, sp_off);
1236
1237 // When reloading from a spill slot, we might have lost information about real integer
1238 // types. For instance, on the x64 backend, a zero-extension can become spurious and be
1239 // optimized into a move, causing vregs of types I32 and I64 to share the same coalescing
1240 // equivalence class. As a result, such a value can be spilled as an I32 and later
1241 // reloaded as an I64; to make sure the high bits are always defined, always do a
1242 // word-sized store for integer types narrower than a word.
1243 let ty = if ty.is_int() && ty.bytes() < M::word_bytes() {
1244 M::word_type()
1245 } else {
1246 ty
1247 };
1248
1249 gen_store_stack_multi::<M>(StackAMode::NominalSPOffset(sp_off, ty), from_regs, ty)
1250 }
1251
1252 fn spillslots_to_stack_map(
1253 &self,
1254 slots: &[SpillSlot],
1255 state: &<Self::I as MachInstEmit>::State,
1256 ) -> StackMap {
1257 let virtual_sp_offset = M::get_virtual_sp_offset_from_state(state);
1258 let nominal_sp_to_fp = M::get_nominal_sp_to_fp(state);
1259 assert!(virtual_sp_offset >= 0);
1260 trace!(
1261 "spillslots_to_stackmap: slots = {:?}, state = {:?}",
1262 slots,
1263 state
1264 );
1265 let map_size = (virtual_sp_offset + nominal_sp_to_fp) as u32;
1266 let bytes = M::word_bytes();
1267 let map_words = (map_size + bytes - 1) / bytes;
1268 let mut bits = std::iter::repeat(false)
1269 .take(map_words as usize)
1270 .collect::<Vec<bool>>();
1271
1272 let first_spillslot_word =
1273 ((self.stackslots_size + virtual_sp_offset as u32) / bytes) as usize;
1274 for &slot in slots {
1275 let slot = slot.get() as usize;
1276 bits[first_spillslot_word + slot] = true;
1277 }
1278
1279 StackMap::from_slice(&bits[..])
1280 }
1281
1282 fn gen_prologue(&mut self) -> SmallInstVec<Self::I> {
1283 let mut insts = smallvec![];
1284 if !self.call_conv.extends_baldrdash() {
1285 // set up frame
1286 insts.extend(M::gen_prologue_frame_setup(&self.flags).into_iter());
1287 }
1288
1289 let bytes = M::word_bytes();
1290 let mut total_stacksize = self.stackslots_size + bytes * self.spillslots.unwrap() as u32;
1291 if self.call_conv.extends_baldrdash() {
1292 debug_assert!(
1293 !self.flags.enable_probestack(),
1294 "baldrdash does not expect cranelift to emit stack probes"
1295 );
1296 total_stacksize += self.flags.baldrdash_prologue_words() as u32 * bytes;
1297 }
1298 let mask = M::stack_align(self.call_conv) - 1;
1299 let total_stacksize = (total_stacksize + mask) & !mask; // Round up to the ABI stack alignment.
1300
1301 if !self.call_conv.extends_baldrdash() {
1302 // Leaf functions with zero stack don't need a stack check if one's
1303 // specified, otherwise always insert the stack check.
1304 if total_stacksize > 0 || !self.is_leaf {
1305 if let Some((reg, stack_limit_load)) = &self.stack_limit {
1306 insts.extend(stack_limit_load.clone());
1307 self.insert_stack_check(*reg, total_stacksize, &mut insts);
1308 }
1309 if let Some(min_frame) = &self.probestack_min_frame {
1310 if total_stacksize >= *min_frame {
1311 insts.extend(M::gen_probestack(total_stacksize));
1312 }
1313 }
1314 }
1315 if total_stacksize > 0 {
1316 self.fixed_frame_storage_size += total_stacksize;
1317 }
1318 }
1319
1320 // Save clobbered registers.
1321 let (clobber_size, clobber_insts) = M::gen_clobber_save(
1322 self.call_conv,
1323 &self.flags,
1324 &self.clobbered,
1325 self.fixed_frame_storage_size,
1326 self.outgoing_args_size,
1327 );
1328 insts.extend(clobber_insts);
1329
1330 // N.B.: "nominal SP", which we use to refer to stackslots and
1331 // spillslots, is defined to be equal to the stack pointer at this point
1332 // in the prologue.
1333 //
1334 // If we push any further data onto the stack in the function
1335 // body, we emit a virtual-SP adjustment meta-instruction so
1336 // that the nominal SP references behave as if SP were still
1337 // at this point. See the documentation for this module
1338 // (`crate::machinst::abi_impl`) for more details
1339 // on stackframe layout and nominal SP maintenance.
1340
1341 self.total_frame_size = Some(total_stacksize + clobber_size as u32);
1342 insts
1343 }
1344
1345 fn gen_epilogue(&self) -> SmallInstVec<M::I> {
1346 let mut insts = smallvec![];
1347
1348 // Restore clobbered registers.
1349 insts.extend(M::gen_clobber_restore(
1350 self.call_conv,
1351 &self.flags,
1352 &self.clobbered,
1353 self.fixed_frame_storage_size,
1354 self.outgoing_args_size,
1355 ));
1356
1357 // N.B.: we do *not* emit a nominal SP adjustment here, because (i) there will be no
1358 // references to nominal SP offsets before the return below, and (ii) the instruction
1359 // emission tracks running SP offset linearly (in straight-line order), not according to
1360 // the CFG, so early returns in the middle of function bodies would cause an incorrect
1361 // offset for the rest of the body.
1362
1363 if !self.call_conv.extends_baldrdash() {
1364 insts.extend(M::gen_epilogue_frame_restore(&self.flags));
1365 insts.push(M::gen_ret());
1366 }
1367
1368 debug!("Epilogue: {:?}", insts);
1369 insts
1370 }
1371
1372 fn frame_size(&self) -> u32 {
1373 self.total_frame_size
1374 .expect("frame size not computed before prologue generation")
1375 }
1376
1377 fn stack_args_size(&self) -> u32 {
1378 self.sig.stack_arg_space as u32
1379 }
1380
1381 fn get_spillslot_size(&self, rc: RegClass, ty: Type) -> u32 {
1382 M::get_number_of_spillslots_for_value(rc, ty)
1383 }
1384
1385 fn gen_spill(&self, to_slot: SpillSlot, from_reg: RealReg, ty: Option<Type>) -> Self::I {
1386 let ty = ty_from_ty_hint_or_reg_class::<M>(from_reg.to_reg(), ty);
1387 self.store_spillslot(to_slot, ty, ValueRegs::one(from_reg.to_reg()))
1388 .into_iter()
1389 .next()
1390 .unwrap()
1391 }
1392
1393 fn gen_reload(
1394 &self,
1395 to_reg: Writable<RealReg>,
1396 from_slot: SpillSlot,
1397 ty: Option<Type>,
1398 ) -> Self::I {
1399 let ty = ty_from_ty_hint_or_reg_class::<M>(to_reg.to_reg().to_reg(), ty);
1400 self.load_spillslot(
1401 from_slot,
1402 ty,
1403 writable_value_regs(ValueRegs::one(to_reg.to_reg().to_reg())),
1404 )
1405 .into_iter()
1406 .next()
1407 .unwrap()
1408 }
1409 }
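/*
The frame-size and spillslot arithmetic used by `gen_prologue`,
`load_spillslot`, and `store_spillslot` can be sketched on plain integers.
This is a simplified model: the helper names below are hypothetical, the
alignment is assumed to be a power of two (as the mask trick requires), and
the Baldrdash prologue words and clobber-save area are left out.

```rust
/// Round `size` up to a multiple of `align` (a power of two).
fn align_up(size: u32, align: u32) -> u32 {
    debug_assert!(align.is_power_of_two());
    let mask = align - 1;
    (size + mask) & !mask
}

/// Stackslot area plus one word per spillslot, rounded up to the alignment.
fn total_stack_size(
    stackslots_size: u32,
    num_spillslots: u32,
    word_bytes: u32,
    align: u32,
) -> u32 {
    align_up(stackslots_size + word_bytes * num_spillslots, align)
}

/// Offset of a spillslot from nominal SP: spillslots start right after the
/// stackslot area and each occupies one machine word.
fn spillslot_sp_offset(stackslots_size: u32, slot: u32, word_bytes: u32) -> i64 {
    i64::from(stackslots_size) + i64::from(slot) * i64::from(word_bytes)
}
```

With 20 bytes of stackslots, three 8-byte spillslots, and 16-byte alignment,
the unaligned size is 44 bytes and the aligned size is 48 bytes.
*/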
1410
1411 fn abisig_to_uses_and_defs<M: ABIMachineSpec>(sig: &ABISig) -> (Vec<Reg>, Vec<Writable<Reg>>) {
1412 // Compute uses: all arg regs.
1413 let mut uses = Vec::new();
1414 for arg in &sig.args {
1415 if let &ABIArg::Slots { ref slots, .. } = arg {
1416 for slot in slots {
1417 match slot {
1418 &ABIArgSlot::Reg { reg, .. } => {
1419 uses.push(reg.to_reg());
1420 }
1421 _ => {}
1422 }
1423 }
1424 }
1425 }
1426
1427 // Compute defs: all retval regs, and all caller-save (clobbered) regs.
1428 let mut defs = M::get_regs_clobbered_by_call(sig.call_conv);
1429 for ret in &sig.rets {
1430 if let &ABIArg::Slots { ref slots, .. } = ret {
1431 for slot in slots {
1432 match slot {
1433 &ABIArgSlot::Reg { reg, .. } => {
1434 defs.push(Writable::from_reg(reg.to_reg()));
1435 }
1436 _ => {}
1437 }
1438 }
1439 }
1440 }
1441
1442 (uses, defs)
1443 }
1444
1445 /// ABI object for a callsite.
1446 pub struct ABICallerImpl<M: ABIMachineSpec> {
1447 /// CLIF-level signature, possibly normalized.
1448 ir_sig: ir::Signature,
1449 /// The called function's signature.
1450 sig: ABISig,
1451 /// All uses for the callsite, i.e., function args.
1452 uses: Vec<Reg>,
1453 /// All defs for the callsite, i.e., return values and caller-saves.
1454 defs: Vec<Writable<Reg>>,
1455 /// Call destination.
1456 dest: CallDest,
1457 /// Actual call opcode; used to distinguish various types of calls.
1458 opcode: ir::Opcode,
1459 /// Caller's calling convention.
1460 caller_conv: isa::CallConv,
1461 /// The settings controlling this compilation.
1462 flags: settings::Flags,
1463
1464 _mach: PhantomData<M>,
1465 }
1466
1467 /// Destination for a call.
1468 #[derive(Debug, Clone)]
1469 pub enum CallDest {
1470 /// Call to an ExtName (named function symbol).
1471 ExtName(ir::ExternalName, RelocDistance),
1472 /// Indirect call to a function pointer in a register.
1473 Reg(Reg),
1474 }
1475
1476 impl<M: ABIMachineSpec> ABICallerImpl<M> {
1477 /// Create a callsite ABI object for a call directly to the specified function.
1478 pub fn from_func(
1479 sig: &ir::Signature,
1480 extname: &ir::ExternalName,
1481 dist: RelocDistance,
1482 caller_conv: isa::CallConv,
1483 flags: &settings::Flags,
1484 ) -> CodegenResult<ABICallerImpl<M>> {
1485 let ir_sig = ensure_struct_return_ptr_is_returned(sig);
1486 let sig = ABISig::from_func_sig::<M>(&ir_sig, flags)?;
1487 let (uses, defs) = abisig_to_uses_and_defs::<M>(&sig);
1488 Ok(ABICallerImpl {
1489 ir_sig,
1490 sig,
1491 uses,
1492 defs,
1493 dest: CallDest::ExtName(extname.clone(), dist),
1494 opcode: ir::Opcode::Call,
1495 caller_conv,
1496 flags: flags.clone(),
1497 _mach: PhantomData,
1498 })
1499 }
1500
1501 /// Create a callsite ABI object for a call to a function pointer with the
1502 /// given signature.
1503 pub fn from_ptr(
1504 sig: &ir::Signature,
1505 ptr: Reg,
1506 opcode: ir::Opcode,
1507 caller_conv: isa::CallConv,
1508 flags: &settings::Flags,
1509 ) -> CodegenResult<ABICallerImpl<M>> {
1510 let ir_sig = ensure_struct_return_ptr_is_returned(sig);
1511 let sig = ABISig::from_func_sig::<M>(&ir_sig, flags)?;
1512 let (uses, defs) = abisig_to_uses_and_defs::<M>(&sig);
1513 Ok(ABICallerImpl {
1514 ir_sig,
1515 sig,
1516 uses,
1517 defs,
1518 dest: CallDest::Reg(ptr),
1519 opcode,
1520 caller_conv,
1521 flags: flags.clone(),
1522 _mach: PhantomData,
1523 })
1524 }
1525 }
1526
1527 fn adjust_stack_and_nominal_sp<M: ABIMachineSpec, C: LowerCtx<I = M::I>>(
1528 ctx: &mut C,
1529 off: i32,
1530 is_sub: bool,
1531 ) {
1532 if off == 0 {
1533 return;
1534 }
1535 let amt = if is_sub { -off } else { off };
1536 for inst in M::gen_sp_reg_adjust(amt) {
1537 ctx.emit(inst);
1538 }
1539 ctx.emit(M::gen_nominal_sp_adj(-amt));
1540 }
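/*
The sign conventions in `adjust_stack_and_nominal_sp` can be modeled with two
integers. This sketch assumes that the nominal-SP adjustment meta-instruction
adds its operand to a running virtual-SP offset and that nominal SP is
resolved as real SP plus that offset; neither detail is shown in this file.
`SpModel` and its methods are hypothetical names.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
struct SpModel {
    /// The real stack pointer.
    sp: i64,
    /// Running offset from the real SP back to nominal SP (assumed model).
    virtual_sp_offset: i64,
}

impl SpModel {
    fn nominal_sp(&self) -> i64 {
        self.sp + self.virtual_sp_offset
    }

    /// Mirrors the function above: move SP by `amt`, then compensate the
    /// virtual offset by `-amt` so nominal SP stays where it was.
    fn adjust(&mut self, off: i32, is_sub: bool) {
        if off == 0 {
            return;
        }
        let amt = if is_sub { -i64::from(off) } else { i64::from(off) };
        self.sp += amt;
        self.virtual_sp_offset -= amt;
    }
}
```

In this model, a pre-call adjustment (`is_sub = true`) followed by the
matching post-call adjustment (`is_sub = false`) restores the original state,
and nominal SP never moves in between.
*/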
1541
1542 impl<M: ABIMachineSpec> ABICaller for ABICallerImpl<M> {
1543 type I = M::I;
1544
1545 fn signature(&self) -> &ir::Signature {
1546 &self.ir_sig
1547 }
1548
1549 fn num_args(&self) -> usize {
1550 if self.sig.stack_ret_arg.is_some() {
1551 self.sig.args.len() - 1
1552 } else {
1553 self.sig.args.len()
1554 }
1555 }
1556
1557 fn accumulate_outgoing_args_size<C: LowerCtx<I = Self::I>>(&self, ctx: &mut C) {
1558 let off = self.sig.stack_arg_space + self.sig.stack_ret_space;
1559 ctx.abi().accumulate_outgoing_args_size(off as u32);
1560 }
1561
1562 fn emit_stack_pre_adjust<C: LowerCtx<I = Self::I>>(&self, ctx: &mut C) {
1563 let off = self.sig.stack_arg_space + self.sig.stack_ret_space;
1564 adjust_stack_and_nominal_sp::<M, C>(ctx, off as i32, /* is_sub = */ true)
1565 }
1566
1567 fn emit_stack_post_adjust<C: LowerCtx<I = Self::I>>(&self, ctx: &mut C) {
1568 let off = self.sig.stack_arg_space + self.sig.stack_ret_space;
1569 adjust_stack_and_nominal_sp::<M, C>(ctx, off as i32, /* is_sub = */ false)
1570 }
1571
1572 fn emit_copy_regs_to_arg<C: LowerCtx<I = Self::I>>(
1573 &self,
1574 ctx: &mut C,
1575 idx: usize,
1576 from_regs: ValueRegs<Reg>,
1577 ) {
1578 let word_rc = M::word_reg_class();
1579 let word_bits = M::word_bits() as usize;
1580 match &self.sig.args[idx] {
1581 &ABIArg::Slots { ref slots, .. } => {
1582 assert_eq!(from_regs.len(), slots.len());
1583 for (slot, from_reg) in slots.iter().zip(from_regs.regs().iter()) {
1584 match slot {
1585 &ABIArgSlot::Reg {
1586 reg, ty, extension, ..
1587 } => {
1588 let ext = M::get_ext_mode(self.sig.call_conv, extension);
1589 if ext != ir::ArgumentExtension::None && ty_bits(ty) < word_bits {
1590 assert_eq!(word_rc, reg.get_class());
1591 let signed = match ext {
1592 ir::ArgumentExtension::Uext => false,
1593 ir::ArgumentExtension::Sext => true,
1594 _ => unreachable!(),
1595 };
1596 ctx.emit(M::gen_extend(
1597 Writable::from_reg(reg.to_reg()),
1598 *from_reg,
1599 signed,
1600 ty_bits(ty) as u8,
1601 word_bits as u8,
1602 ));
1603 } else {
1604 ctx.emit(M::gen_move(
1605 Writable::from_reg(reg.to_reg()),
1606 *from_reg,
1607 ty,
1608 ));
1609 }
1610 }
1611 &ABIArgSlot::Stack {
1612 offset,
1613 ty,
1614 extension,
1615 ..
1616 } => {
1617 let mut ty = ty;
1618 let ext = M::get_ext_mode(self.sig.call_conv, extension);
1619 if ext != ir::ArgumentExtension::None && ty_bits(ty) < word_bits {
1620 assert_eq!(word_rc, from_reg.get_class());
1621 let signed = match ext {
1622 ir::ArgumentExtension::Uext => false,
1623 ir::ArgumentExtension::Sext => true,
1624 _ => unreachable!(),
1625 };
1626 // Extend in place in the source register. Our convention is to
1627 // treat high bits as undefined for values in registers, so this
1628 // is safe, even for an argument that is nominally read-only.
1629 ctx.emit(M::gen_extend(
1630 Writable::from_reg(*from_reg),
1631 *from_reg,
1632 signed,
1633 ty_bits(ty) as u8,
1634 word_bits as u8,
1635 ));
1636 // Store the extended version.
1637 ty = M::word_type();
1638 }
1639 ctx.emit(M::gen_store_stack(
1640 StackAMode::SPOffset(offset, ty),
1641 *from_reg,
1642 ty,
1643 ));
1644 }
1645 }
1646 }
1647 }
1648 &ABIArg::StructArg { offset, size, .. } => {
1649 let src_ptr = from_regs.only_reg().unwrap();
1650 let dst_ptr = ctx.alloc_tmp(M::word_type()).only_reg().unwrap();
1651 ctx.emit(M::gen_get_stack_addr(
1652 StackAMode::SPOffset(offset, I8),
1653 dst_ptr,
1654 I8,
1655 ));
1656 // Emit a memcpy from `src_ptr` to `dst_ptr` of `size` bytes.
1657 // N.B.: because we process StructArg params *first*, this is
1658 // safe w.r.t. clobbers: we have not yet filled in any other
1659 // arg regs.
1660 let memcpy_call_conv = isa::CallConv::for_libcall(&self.flags, self.sig.call_conv);
1661 for insn in
1662 M::gen_memcpy(memcpy_call_conv, dst_ptr.to_reg(), src_ptr, size as usize)
1663 .into_iter()
1664 {
1665 ctx.emit(insn);
1666 }
1667 }
1668 }
1669 }
1670
    fn get_copy_to_arg_order(&self) -> SmallVec<[usize; 8]> {
        let mut ret = SmallVec::new();
        for (i, arg) in self.sig.args.iter().enumerate() {
            // Struct args.
            if arg.is_struct_arg() {
                ret.push(i);
            }
        }
        for (i, arg) in self.sig.args.iter().enumerate() {
            // Non-struct args. Skip an appended return-area arg for multivalue
            // returns, if any.
            if !arg.is_struct_arg() && i < self.ir_sig.params.len() {
                ret.push(i);
            }
        }
        ret
    }

    fn emit_copy_retval_to_regs<C: LowerCtx<I = Self::I>>(
        &self,
        ctx: &mut C,
        idx: usize,
        into_regs: ValueRegs<Writable<Reg>>,
    ) {
        match &self.sig.rets[idx] {
            &ABIArg::Slots { ref slots, .. } => {
                assert_eq!(into_regs.len(), slots.len());
                for (slot, into_reg) in slots.iter().zip(into_regs.regs().iter()) {
                    match slot {
                        // Extension mode doesn't matter because we're copying out, not in,
                        // and we ignore high bits in our own registers by convention.
                        &ABIArgSlot::Reg { reg, ty, .. } => {
                            ctx.emit(M::gen_move(*into_reg, reg.to_reg(), ty));
                        }
                        &ABIArgSlot::Stack { offset, ty, .. } => {
                            let ret_area_base = self.sig.stack_arg_space;
                            ctx.emit(M::gen_load_stack(
                                StackAMode::SPOffset(offset + ret_area_base, ty),
                                *into_reg,
                                ty,
                            ));
                        }
                    }
                }
            }
            &ABIArg::StructArg { .. } => {
                panic!("StructArg not supported in return position");
            }
        }
    }

    fn emit_call<C: LowerCtx<I = Self::I>>(&mut self, ctx: &mut C) {
        let (uses, defs) = (
            mem::replace(&mut self.uses, Default::default()),
            mem::replace(&mut self.defs, Default::default()),
        );
        let word_type = M::word_type();
        if let Some(i) = self.sig.stack_ret_arg {
            let rd = ctx.alloc_tmp(word_type).only_reg().unwrap();
            let ret_area_base = self.sig.stack_arg_space;
            ctx.emit(M::gen_get_stack_addr(
                StackAMode::SPOffset(ret_area_base, I8),
                rd,
                I8,
            ));
            self.emit_copy_regs_to_arg(ctx, i, ValueRegs::one(rd.to_reg()));
        }
        let tmp = ctx.alloc_tmp(word_type).only_reg().unwrap();
        for (is_safepoint, inst) in M::gen_call(
            &self.dest,
            uses,
            defs,
            self.opcode,
            tmp,
            self.sig.call_conv,
            self.caller_conv,
        )
        .into_iter()
        {
            match is_safepoint {
                InstIsSafepoint::Yes => ctx.emit_safepoint(inst),
                InstIsSafepoint::No => ctx.emit(inst),
            }
        }
    }
}
