//! Implementation of a vanilla ABI, shared between several machines. The
//! implementation here assumes that arguments will be passed in registers
//! first, then additional args on the stack; that the stack grows downward,
//! contains a standard frame (return address and frame pointer), and the
//! compiler is otherwise free to allocate space below that with its choice of
//! layout; and that the machine has some notion of caller- and callee-save
//! registers. Most modern machines, e.g. x86-64 and AArch64, should fit this
//! mold and thus both of these backends use this shared implementation.
//!
//! See the documentation in specific machine backends for the "instantiation"
//! of this generic ABI, i.e., which registers are caller/callee-save, arguments
//! and return values, and any other special requirements.
//!
//! For now the implementation here assumes a 64-bit machine, but we intend to
//! make this 32/64-bit-generic shortly.
//!
//! # Vanilla ABI
//!
//! First, arguments and return values are passed in registers up to a certain
//! fixed count, after which they overflow onto the stack. Multiple return
//! values either fit in registers, or are returned in a separate return-value
//! area on the stack, given by a hidden extra parameter.
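//!
//! For example (an illustrative sketch only; actual register counts and
//! assignments are defined by each machine backend), on a hypothetical
//! machine with two integer argument registers, a four-argument call could
//! be assigned as:
//!
//! ```plain
//! f(a, b, c, d)  ==>  a in r0, b in r1, c at [stack args + 0], d at [stack args + 8]
//! ```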
//!
//! Note that the exact stack layout is up to us. We settled on the
//! below design based on several requirements. In particular, we need
//! to be able to generate instructions (or instruction sequences) to
//! access arguments, stack slots, and spill slots before we know how
//! many spill slots or clobber-saves there will be, because of our
//! pass structure. We also prefer positive offsets to negative
//! offsets because of an asymmetry in some machines' addressing modes
//! (e.g., on AArch64, positive offsets have a larger possible range
//! without a long-form sequence to synthesize an arbitrary
//! offset). We also need clobber-save registers to be "near" the
//! frame pointer: Windows unwind information requires them to be within
//! 240 bytes of RBP. Finally, it is not allowed to access memory
//! below the current SP value.
//!
//! We assume that a prologue first pushes the frame pointer (and
//! return address above that, if the machine does not do that in
//! hardware). We set FP to point to this two-word frame record. We
//! store all other frame slots below this two-word frame record, with
//! the stack pointer remaining at or below this fixed frame storage
//! for the rest of the function. We can then access frame storage
//! slots using positive offsets from SP. In order to allow codegen
//! for the latter before knowing how SP might be adjusted around
//! callsites, we implement a "nominal SP" tracking feature by which a
//! fixup (distance between actual SP and a "nominal" SP) is known at
//! each instruction.
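//!
//! For example (an illustrative sketch): if an access targets "nominal SP +
//! 16" at a point where the real SP has been moved 32 bytes below nominal SP
//! (say, for outgoing call arguments), the known fixup resolves it to a
//! real-SP-relative access:
//!
//! ```plain
//! nominal SP + 16  ==>  real SP + (16 + 32)  ==  real SP + 48
//! ```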
//!
//! Note that if we ever support dynamic stack-space allocation (for
//! `alloca`), we will need a way to reference spill slots and stack
//! slots without "nominal SP", because we will no longer be able to
//! know a static offset from SP to the slots at any particular
//! program point. Probably the best solution at that point will be to
//! revert to using the frame pointer as the reference for all slots,
//! and creating a "nominal FP" synthetic addressing mode (analogous
//! to "nominal SP" today) to allow generating spill/reload and
//! stackslot accesses before we know how large the clobber-saves will
//! be.
//!
//! # Stack Layout
//!
//! The stack looks like:
//!
//! ```plain
//!   (high address)
//!
//!                              +---------------------------+
//!                              |          ...              |
//!                              | stack args                |
//!                              | (accessed via FP)         |
//!                              +---------------------------+
//! SP at function entry ----->  | return address            |
//!                              +---------------------------+
//! FP after prologue -------->  | FP (pushed by prologue)   |
//!                              +---------------------------+
//!                              |          ...              |
//!                              | clobbered callee-saves    |
//! unwind-frame base     ---->  | (pushed by prologue)      |
//!                              +---------------------------+
//!                              |          ...              |
//!                              | spill slots               |
//!                              | (accessed via nominal SP) |
//!                              |          ...              |
//!                              | stack slots               |
//!                              | (accessed via nominal SP) |
//! nominal SP --------------->  | (alloc'd by prologue)     |
//! (SP at end of prologue)      +---------------------------+
//!                              | [alignment as needed]     |
//!                              |          ...              |
//!                              | args for call             |
//! SP before making a call -->  | (pushed at callsite)      |
//!                              +---------------------------+
//!
//!   (low address)
//! ```
//!
//! # Multi-value Returns
//!
//! Note that we support multi-value returns in two ways. First, we allow for
//! multiple return-value registers. Second, if the appropriate flag is set, we
//! support the SpiderMonkey Wasm ABI. For details of the multi-value return
//! ABI, see:
//!
//! <https://searchfox.org/mozilla-central/rev/bc3600def806859c31b2c7ac06e3d69271052a89/js/src/wasm/WasmStubs.h#134>
//!
//! In brief:
//! - Return values are processed in *reverse* order.
//! - The first return value in this order (so the last return) goes into the
//!   ordinary return register.
//! - Any further returns go in a struct-return area, allocated upwards (in
//!   address order) during the reverse traversal.
//! - This struct-return area is provided by the caller, and a pointer to its
//!   start is passed as an invisible last (extra) argument. Normally the caller
//!   will allocate this area on the stack. When we generate calls, we place it
//!   just above the on-stack argument area.
//! - So, for example, a function returning 4 i64's (v0, v1, v2, v3), with no
//!   formal arguments, would:
//!   - Accept a pointer `P` to the struct return area as a hidden argument in the
//!     first argument register on entry.
//!   - Return v3 in the one and only return-value register.
//!   - Return v2 in memory at `[P]`.
//!   - Return v1 in memory at `[P+8]`.
//!   - Return v0 in memory at `[P+16]`.
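//!
//! The caller-allocated area for that example would then be laid out as
//! follows (a sketch; each slot is 8 bytes, since the values are i64's):
//!
//! ```plain
//!   return register : v3
//!   [P]             : v2
//!   [P+8]           : v1
//!   [P+16]          : v0
//! ```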

use super::abi::*;
use crate::binemit::StackMap;
use crate::ir::types::*;
use crate::ir::{ArgumentExtension, ArgumentPurpose, StackSlot};
use crate::machinst::*;
use crate::settings;
use crate::CodegenResult;
use crate::{ir, isa};
use alloc::vec::Vec;
use log::{debug, trace};
use regalloc::{RealReg, Reg, RegClass, Set, SpillSlot, Writable};
use smallvec::{smallvec, SmallVec};
use std::convert::TryFrom;
use std::marker::PhantomData;
use std::mem;

/// A location for (part of) an argument or return value. These "storage slots"
/// are specified for each register-sized part of an argument.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum ABIArgSlot {
    /// In a real register.
    Reg {
        /// Register that holds this arg.
        reg: RealReg,
        /// Value type of this arg.
        ty: ir::Type,
        /// Should this arg be zero- or sign-extended?
        extension: ir::ArgumentExtension,
    },
    /// Arguments only: on stack, at given offset from SP at entry.
    Stack {
        /// Offset of this arg relative to the base of stack args.
        offset: i64,
        /// Value type of this arg.
        ty: ir::Type,
        /// Should this arg be zero- or sign-extended?
        extension: ir::ArgumentExtension,
    },
}

/// An ABIArg is composed of one or more parts. This allows for a CLIF-level
/// Value to be passed with its parts in more than one location at the ABI
/// level. For example, a 128-bit integer may be passed in two 64-bit registers,
/// or even a 64-bit register and a 64-bit stack slot, on a 64-bit machine. The
/// number of "parts" should correspond to the number of registers used to store
/// this type according to the machine backend.
///
/// As an invariant, the `purpose` for every part must match. As a further
/// invariant, a `StructArg` part cannot appear with any other part.
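///
/// For example (a sketch; the actual slot assignment is computed by the
/// backend's `compute_arg_locs`), a 128-bit integer passed in two 64-bit
/// registers `r0` and `r1` on a hypothetical machine would be described as:
///
/// ```plain
/// ABIArg::Slots {
///     slots: vec![
///         ABIArgSlot::Reg { reg: r0, ty: I64, extension: None },
///         ABIArgSlot::Reg { reg: r1, ty: I64, extension: None },
///     ],
///     purpose: ArgumentPurpose::Normal,
/// }
/// ```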
#[derive(Clone, Debug)]
pub enum ABIArg {
    /// Storage slots (registers or stack locations) for each part of the
    /// argument value. The number of slots must equal the number of register
    /// parts used to store a value of this type.
    Slots {
        /// Slots, one per register part.
        slots: Vec<ABIArgSlot>,
        /// Purpose of this arg.
        purpose: ir::ArgumentPurpose,
    },
    /// Structure argument. We reserve stack space for it, but the CLIF-level
    /// semantics are a little weird: the value passed to the call instruction,
    /// and received in the corresponding block param, is a *pointer*. On the
    /// caller side, we memcpy the data from the passed-in pointer to the stack
    /// area; on the callee side, we compute a pointer to this stack area and
    /// provide that as the argument's value.
    StructArg {
        /// Offset of this arg relative to base of stack args.
        offset: i64,
        /// Size of this arg on the stack.
        size: u64,
        /// Purpose of this arg.
        purpose: ir::ArgumentPurpose,
    },
}

impl ABIArg {
    /// Get the purpose of this arg.
    fn get_purpose(&self) -> ir::ArgumentPurpose {
        match self {
            &ABIArg::Slots { purpose, .. } => purpose,
            &ABIArg::StructArg { purpose, .. } => purpose,
        }
    }

    /// Is this a StructArg?
    fn is_struct_arg(&self) -> bool {
        match self {
            &ABIArg::StructArg { .. } => true,
            _ => false,
        }
    }

    /// Create an ABIArg from one register.
    pub fn reg(
        reg: RealReg,
        ty: ir::Type,
        extension: ir::ArgumentExtension,
        purpose: ir::ArgumentPurpose,
    ) -> ABIArg {
        ABIArg::Slots {
            slots: vec![ABIArgSlot::Reg { reg, ty, extension }],
            purpose,
        }
    }

    /// Create an ABIArg from one stack slot.
    pub fn stack(
        offset: i64,
        ty: ir::Type,
        extension: ir::ArgumentExtension,
        purpose: ir::ArgumentPurpose,
    ) -> ABIArg {
        ABIArg::Slots {
            slots: vec![ABIArgSlot::Stack {
                offset,
                ty,
                extension,
            }],
            purpose,
        }
    }
}

/// Are we computing information about arguments or return values? Much of the
/// handling is factored out into common routines; this enum allows us to
/// distinguish which case we're handling.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum ArgsOrRets {
    /// Arguments.
    Args,
    /// Return values.
    Rets,
}

/// Is an instruction returned by an ABI machine-specific backend a safepoint,
/// or not?
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum InstIsSafepoint {
    /// The instruction is a safepoint.
    Yes,
    /// The instruction is not a safepoint.
    No,
}

/// Abstract location for a machine-specific ABI impl to translate into the
/// appropriate addressing mode.
#[derive(Clone, Copy, Debug)]
pub enum StackAMode {
    /// Offset from the frame pointer, possibly making use of a specific type
    /// for a scaled indexing operation.
    FPOffset(i64, ir::Type),
    /// Offset from the nominal stack pointer, possibly making use of a specific
    /// type for a scaled indexing operation.
    NominalSPOffset(i64, ir::Type),
    /// Offset from the real stack pointer, possibly making use of a specific
    /// type for a scaled indexing operation.
    SPOffset(i64, ir::Type),
}

impl StackAMode {
    /// Offset by an addend.
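    ///
    /// A small usage sketch (illustrative): shifting a base location by the
    /// byte offset of one register-sized part of a multi-register value.
    ///
    /// ```plain
    /// StackAMode::FPOffset(16, I64).offset(8)  ==  StackAMode::FPOffset(24, I64)
    /// ```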
    pub fn offset(self, addend: i64) -> Self {
        match self {
            StackAMode::FPOffset(off, ty) => StackAMode::FPOffset(off + addend, ty),
            StackAMode::NominalSPOffset(off, ty) => StackAMode::NominalSPOffset(off + addend, ty),
            StackAMode::SPOffset(off, ty) => StackAMode::SPOffset(off + addend, ty),
        }
    }
}

/// Trait implemented by machine-specific backend to provide information about
/// register assignments and to allow generating the specific instructions for
/// stack loads/saves, prologues/epilogues, etc.
pub trait ABIMachineSpec {
    /// The instruction type.
    type I: VCodeInst;

    /// Returns the number of bits in a word, i.e., 32 or 64 for a 32- or
    /// 64-bit architecture.
    fn word_bits() -> u32;

    /// Returns the number of bytes in a word.
    fn word_bytes() -> u32 {
        Self::word_bits() / 8
    }

    /// Returns word-size integer type.
    fn word_type() -> Type {
        match Self::word_bits() {
            32 => I32,
            64 => I64,
            _ => unreachable!(),
        }
    }

    /// Returns word register class.
    fn word_reg_class() -> RegClass {
        match Self::word_bits() {
            32 => RegClass::I32,
            64 => RegClass::I64,
            _ => unreachable!(),
        }
    }

    /// Returns required stack alignment in bytes.
    fn stack_align(call_conv: isa::CallConv) -> u32;

    /// Process a list of parameters or return values and allocate them to registers
    /// and stack slots.
    ///
    /// Returns the list of argument locations, the stack-space used (rounded up
    /// as alignment requires), and, if `add_ret_area_ptr` was passed, the
    /// index of the extra synthetic arg that was added.
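    ///
    /// For example (an illustrative sketch; actual assignments are
    /// backend-specific), a signature `(i64, i64, i64)` on a hypothetical
    /// machine with two integer argument registers `r0` and `r1` might yield:
    ///
    /// ```plain
    /// locs         = [Reg(r0, I64), Reg(r1, I64), Stack(offset 0, I64)]
    /// stack space  = 8 bytes (one overflow arg)
    /// ret-area idx = None (add_ret_area_ptr was false)
    /// ```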
    fn compute_arg_locs(
        call_conv: isa::CallConv,
        flags: &settings::Flags,
        params: &[ir::AbiParam],
        args_or_rets: ArgsOrRets,
        add_ret_area_ptr: bool,
    ) -> CodegenResult<(Vec<ABIArg>, i64, Option<usize>)>;

    /// Returns the offset from FP to the argument area, i.e., jumping over the saved FP, return
    /// address, and maybe other standard elements depending on ABI (e.g. Wasm TLS reg).
    fn fp_to_arg_offset(call_conv: isa::CallConv, flags: &settings::Flags) -> i64;

    /// Generate a load from the stack.
    fn gen_load_stack(mem: StackAMode, into_reg: Writable<Reg>, ty: Type) -> Self::I;

    /// Generate a store to the stack.
    fn gen_store_stack(mem: StackAMode, from_reg: Reg, ty: Type) -> Self::I;

    /// Generate a move.
    fn gen_move(to_reg: Writable<Reg>, from_reg: Reg, ty: Type) -> Self::I;

    /// Generate an integer-extend operation.
    fn gen_extend(
        to_reg: Writable<Reg>,
        from_reg: Reg,
        is_signed: bool,
        from_bits: u8,
        to_bits: u8,
    ) -> Self::I;

    /// Generate a return instruction.
    fn gen_ret() -> Self::I;

    /// Generate an "epilogue placeholder" instruction, recognized by lowering
    /// when using the Baldrdash ABI.
    fn gen_epilogue_placeholder() -> Self::I;

    /// Generate an add-with-immediate. Note that even if this uses a scratch
    /// register, it must satisfy two requirements:
    ///
    /// - The add-imm sequence must only clobber caller-save registers, because
    ///   it will be placed in the prologue before the clobbered callee-save
    ///   registers are saved.
    ///
    /// - The add-imm sequence must work correctly when `from_reg` and/or
    ///   `into_reg` are the register returned by `get_stacklimit_reg()`.
    fn gen_add_imm(into_reg: Writable<Reg>, from_reg: Reg, imm: u32) -> SmallInstVec<Self::I>;

    /// Generate a sequence that traps with a `TrapCode::StackOverflow` code if
    /// the stack pointer is less than the given limit register (assuming the
    /// stack grows downward).
    fn gen_stack_lower_bound_trap(limit_reg: Reg) -> SmallInstVec<Self::I>;

    /// Generate an instruction to compute an address of a stack slot (FP- or
    /// SP-based offset).
    fn gen_get_stack_addr(mem: StackAMode, into_reg: Writable<Reg>, ty: Type) -> Self::I;

    /// Get a fixed register to use to compute a stack limit. This is needed for
    /// certain sequences generated after the register allocator has already
    /// run. This must satisfy two requirements:
    ///
    /// - It must be a caller-save register, because it will be clobbered in the
    ///   prologue before the clobbered callee-save registers are saved.
    ///
    /// - It must be safe to pass as an argument and/or destination to
    ///   `gen_add_imm()`. This is relevant when an addition with a large
    ///   immediate needs its own temporary; it cannot use the same fixed
    ///   temporary as this one.
    fn get_stacklimit_reg() -> Reg;

    /// Generate a load from the given [base+offset] address.
    fn gen_load_base_offset(into_reg: Writable<Reg>, base: Reg, offset: i32, ty: Type) -> Self::I;

    /// Generate a store to the given [base+offset] address.
    fn gen_store_base_offset(base: Reg, offset: i32, from_reg: Reg, ty: Type) -> Self::I;

    /// Adjust the stack pointer up or down.
    fn gen_sp_reg_adjust(amount: i32) -> SmallInstVec<Self::I>;

    /// Generate a meta-instruction that adjusts the nominal SP offset.
    fn gen_nominal_sp_adj(amount: i32) -> Self::I;

    /// Generate the usual frame-setup sequence for this architecture: e.g.,
    /// `push rbp / mov rbp, rsp` on x86-64, or `stp fp, lr, [sp, #-16]!` on
    /// AArch64.
    fn gen_prologue_frame_setup(flags: &settings::Flags) -> SmallInstVec<Self::I>;

    /// Generate the usual frame-restore sequence for this architecture.
    fn gen_epilogue_frame_restore(flags: &settings::Flags) -> SmallInstVec<Self::I>;

    /// Generate a probestack call.
    fn gen_probestack(_frame_size: u32) -> SmallInstVec<Self::I>;

    /// Generate a clobber-save sequence. This takes the list of *all* registers
    /// written/modified by the function body. The implementation here is
    /// responsible for determining which of these are callee-saved according to
    /// the ABI. It should return a sequence of instructions that "push" or
    /// otherwise save these values to the stack. The sequence of instructions
    /// should adjust the stack pointer downward, and should align as necessary
    /// according to ABI requirements.
    ///
    /// Returns stack bytes used as well as instructions. Does not adjust
    /// nominal SP offset; caller will do that.
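    ///
    /// A sketch of the kind of sequence this might produce (illustrative
    /// only; the exact instructions, ordering, and alignment are
    /// backend-specific):
    ///
    /// ```plain
    /// sub   sp, sp, #<clobber-area size, aligned per ABI>
    /// store callee-save reg0 -> [sp + 0]
    /// store callee-save reg1 -> [sp + 8]
    /// ...
    /// sub   sp, sp, #<fixed_frame_storage_size + outgoing_args_size>
    /// ```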
    fn gen_clobber_save(
        call_conv: isa::CallConv,
        flags: &settings::Flags,
        clobbers: &Set<Writable<RealReg>>,
        fixed_frame_storage_size: u32,
        outgoing_args_size: u32,
    ) -> (u64, SmallVec<[Self::I; 16]>);

    /// Generate a clobber-restore sequence. This sequence should perform the
    /// opposite of the clobber-save sequence generated above, assuming that SP
    /// going into the sequence is at the same point that it was left when the
    /// clobber-save sequence finished.
    fn gen_clobber_restore(
        call_conv: isa::CallConv,
        flags: &settings::Flags,
        clobbers: &Set<Writable<RealReg>>,
        fixed_frame_storage_size: u32,
        outgoing_args_size: u32,
    ) -> SmallVec<[Self::I; 16]>;

    /// Generate a call instruction/sequence. This method is provided one
    /// temporary register to use to synthesize the called address, if needed.
    fn gen_call(
        dest: &CallDest,
        uses: Vec<Reg>,
        defs: Vec<Writable<Reg>>,
        opcode: ir::Opcode,
        tmp: Writable<Reg>,
        callee_conv: isa::CallConv,
        caller_conv: isa::CallConv,
    ) -> SmallVec<[(InstIsSafepoint, Self::I); 2]>;

    /// Generate a memcpy invocation. Used to set up struct args. May clobber
    /// caller-save registers; we only memcpy before we start to set up args for
    /// a call.
    fn gen_memcpy(
        call_conv: isa::CallConv,
        dst: Reg,
        src: Reg,
        size: usize,
    ) -> SmallVec<[Self::I; 8]>;

    /// Get the number of spillslots required for the given register-class and
    /// type.
    fn get_number_of_spillslots_for_value(rc: RegClass, ty: Type) -> u32;

    /// Get the current virtual-SP offset from an instruction-emission state.
    fn get_virtual_sp_offset_from_state(s: &<Self::I as MachInstEmit>::State) -> i64;

    /// Get the "nominal SP to FP" offset from an instruction-emission state.
    fn get_nominal_sp_to_fp(s: &<Self::I as MachInstEmit>::State) -> i64;

    /// Get all caller-save registers, that is, registers that we expect
    /// not to be saved across a call to a callee with the given ABI.
    fn get_regs_clobbered_by_call(call_conv_of_callee: isa::CallConv) -> Vec<Writable<Reg>>;

    /// Get the needed extension mode, given the mode attached to the argument
    /// in the signature and the calling convention. The input (the attribute in
    /// the signature) specifies what extension type should be done *if* the ABI
    /// requires extension to the full register; this method's return value
    /// indicates whether the extension actually *will* be done.
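    ///
    /// A sketch of one possible policy (illustrative only; each backend
    /// defines its own): honor the signature's attribute only under calling
    /// conventions that require narrow integer arguments to be extended.
    ///
    /// ```plain
    /// fn get_ext_mode(call_conv, specified) -> ArgumentExtension {
    ///     if conv_requires_extension(call_conv) { specified } else { ArgumentExtension::None }
    /// }
    /// ```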
    fn get_ext_mode(
        call_conv: isa::CallConv,
        specified: ir::ArgumentExtension,
    ) -> ir::ArgumentExtension;
}

/// ABI information shared between body (callee) and caller.
struct ABISig {
    /// Argument locations (regs or stack slots). Stack offsets are relative to
    /// SP on entry to function.
    args: Vec<ABIArg>,
    /// Return-value locations. Stack offsets are relative to the return-area
    /// pointer.
    rets: Vec<ABIArg>,
    /// Space on stack used to store arguments.
    stack_arg_space: i64,
    /// Space on stack used to store return values.
    stack_ret_space: i64,
    /// Index in `args` of the stack-return-value-area argument.
    stack_ret_arg: Option<usize>,
    /// Calling convention used.
    call_conv: isa::CallConv,
}

impl ABISig {
    fn from_func_sig<M: ABIMachineSpec>(
        sig: &ir::Signature,
        flags: &settings::Flags,
    ) -> CodegenResult<ABISig> {
        // Compute args and retvals from signature. Handle retvals first,
        // because we may need to add a return-area arg to the args.
        let (rets, stack_ret_space, _) = M::compute_arg_locs(
            sig.call_conv,
            flags,
            &sig.returns,
            ArgsOrRets::Rets,
            /* extra ret-area ptr = */ false,
        )?;
        let need_stack_return_area = stack_ret_space > 0;
        let (args, stack_arg_space, stack_ret_arg) = M::compute_arg_locs(
            sig.call_conv,
            flags,
            &sig.params,
            ArgsOrRets::Args,
            need_stack_return_area,
        )?;

        trace!(
            "ABISig: sig {:?} => args = {:?} rets = {:?} arg stack = {} ret stack = {} stack_ret_arg = {:?}",
            sig,
            args,
            rets,
            stack_arg_space,
            stack_ret_space,
            stack_ret_arg
        );

        Ok(ABISig {
            args,
            rets,
            stack_arg_space,
            stack_ret_space,
            stack_ret_arg,
            call_conv: sig.call_conv,
        })
    }
}

/// ABI object for a function body.
pub struct ABICalleeImpl<M: ABIMachineSpec> {
    /// CLIF-level signature, possibly normalized.
    ir_sig: ir::Signature,
    /// Signature: arg and retval regs.
    sig: ABISig,
    /// Offsets to each stackslot.
    stackslots: PrimaryMap<StackSlot, u32>,
    /// Total stack size of all stackslots.
    stackslots_size: u32,
    /// Stack size to be reserved for outgoing arguments.
    outgoing_args_size: u32,
    /// Clobbered registers, from regalloc.
    clobbered: Set<Writable<RealReg>>,
    /// Total number of spillslots, from regalloc.
    spillslots: Option<usize>,
    /// Storage allocated for the fixed part of the stack frame. This is
    /// usually the same as the total frame size below, except in the case
    /// of the baldrdash calling convention.
    fixed_frame_storage_size: u32,
    /// "Total frame size", as defined by "distance between FP and nominal SP".
    /// Some items are pushed below nominal SP, so the function may actually use
    /// more stack than this would otherwise imply. It is simply the initial
    /// frame/allocation size needed for stackslots and spillslots.
    total_frame_size: Option<u32>,
    /// The register holding the return-area pointer, if needed.
    ret_area_ptr: Option<Writable<Reg>>,
    /// Calling convention this function expects.
    call_conv: isa::CallConv,
    /// The settings controlling this function's compilation.
    flags: settings::Flags,
    /// Whether or not this function is a "leaf", meaning it calls no other
    /// functions.
    is_leaf: bool,
    /// If this function has a stack limit specified, then `Reg` is where the
    /// stack limit will be located after the instructions specified have been
    /// executed.
    ///
    /// Note that this is intended for insertion into the prologue, if
    /// present. Also note that because the instructions here execute in the
    /// prologue this happens after legalization/register allocation/etc so we
    /// need to be extremely careful with each instruction. The instructions are
    /// manually register-allocated and carefully only use caller-saved
    /// registers and keep nothing live after this sequence of instructions.
    stack_limit: Option<(Reg, SmallInstVec<M::I>)>,
    /// Are we to invoke the probestack function in the prologue? If so,
    /// what is the minimum size at which we must invoke it?
    probestack_min_frame: Option<u32>,

    _mach: PhantomData<M>,
}

fn get_special_purpose_param_register(
    f: &ir::Function,
    abi: &ABISig,
    purpose: ir::ArgumentPurpose,
) -> Option<Reg> {
    let idx = f.signature.special_param_index(purpose)?;
    match &abi.args[idx] {
        &ABIArg::Slots { ref slots, .. } => match &slots[0] {
            &ABIArgSlot::Reg { reg, .. } => Some(reg.to_reg()),
            _ => None,
        },
        _ => None,
    }
}

impl<M: ABIMachineSpec> ABICalleeImpl<M> {
    /// Create a new body ABI instance.
    pub fn new(f: &ir::Function, flags: settings::Flags) -> CodegenResult<Self> {
        debug!("ABI: func signature {:?}", f.signature);

        let ir_sig = ensure_struct_return_ptr_is_returned(&f.signature);
        let sig = ABISig::from_func_sig::<M>(&ir_sig, &flags)?;

        let call_conv = f.signature.call_conv;
        // Only these calling conventions are supported.
        debug_assert!(
            call_conv == isa::CallConv::SystemV
                || call_conv == isa::CallConv::Fast
                || call_conv == isa::CallConv::Cold
                || call_conv.extends_baldrdash()
                || call_conv.extends_windows_fastcall()
                || call_conv == isa::CallConv::AppleAarch64
                || call_conv == isa::CallConv::WasmtimeSystemV,
            "Unsupported calling convention: {:?}",
            call_conv
        );

        // Compute stackslot locations and total stackslot size.
        let mut stack_offset: u32 = 0;
        let mut stackslots = PrimaryMap::new();
        for (stackslot, data) in f.stack_slots.iter() {
            let off = stack_offset;
            stack_offset += data.size;
            let mask = M::word_bytes() - 1;
            stack_offset = (stack_offset + mask) & !mask;
            debug_assert_eq!(stackslot.as_u32() as usize, stackslots.len());
            stackslots.push(off);
        }

        // Figure out what instructions, if any, will be needed to check the
        // stack limit. This can either be specified as a special-purpose
        // argument or as a global value which often calculates the stack limit
        // from the arguments.
        let stack_limit =
            get_special_purpose_param_register(f, &sig, ir::ArgumentPurpose::StackLimit)
                .map(|reg| (reg, smallvec![]))
                .or_else(|| f.stack_limit.map(|gv| gen_stack_limit::<M>(f, &sig, gv)));

        // Determine whether a probestack call is required for large enough
        // frames (and the minimum frame size if so).
        let probestack_min_frame = if flags.enable_probestack() {
            assert!(
                !flags.probestack_func_adjusts_sp(),
                "SP-adjusting probestack not supported in new backends"
            );
            Some(1 << flags.probestack_size_log2())
        } else {
            None
        };

        Ok(Self {
            ir_sig,
            sig,
            stackslots,
            stackslots_size: stack_offset,
            outgoing_args_size: 0,
            clobbered: Set::empty(),
            spillslots: None,
            fixed_frame_storage_size: 0,
            total_frame_size: None,
            ret_area_ptr: None,
            call_conv,
            flags,
            is_leaf: f.is_leaf(),
            stack_limit,
            probestack_min_frame,
            _mach: PhantomData,
        })
    }

    /// Inserts instructions necessary for checking the stack limit into the
    /// prologue.
    ///
    /// This function will generate instructions necessary to perform a stack
    /// check at the header of a function. The stack check is intended to trap
    /// if the stack pointer goes below a particular threshold, preventing stack
    /// overflow in wasm or other code. The `stack_limit` argument here is the
    /// register which holds the threshold below which we're supposed to trap.
    /// This function is known to allocate `stack_size` bytes and we'll push
    /// instructions onto `insts`.
    ///
    /// Note that the instructions generated here are special because this is
    /// happening so late in the pipeline (e.g. after register allocation). This
    /// means that we need to do manual register allocation here and also be
    /// careful to not clobber any callee-saved or argument registers. For now
    /// this routine makes do with the `spilltmp_reg` as one temporary
    /// register, and a second register of `tmp2` which is caller-saved. This
    /// should be fine for us since no spills should happen in this sequence of
    /// instructions, so our register won't get accidentally clobbered.
    ///
    /// No values can be live after the prologue, but in this case that's ok
    /// because we just need to perform a stack check before progressing with
    /// the rest of the function.
    fn insert_stack_check(
        &self,
        stack_limit: Reg,
        stack_size: u32,
        insts: &mut SmallInstVec<M::I>,
    ) {
        // With no explicit stack allocated we can just emit the simple check of
        // the stack registers against the stack limit register, and trap if
        // it's out of bounds.
        if stack_size == 0 {
            insts.extend(M::gen_stack_lower_bound_trap(stack_limit));
            return;
        }

        // Note that the 32k stack size here is pretty special. See the
        // documentation in x86/abi.rs for why this is here. The general idea is
        // that we're protecting against overflow in the addition that happens
        // below.
        if stack_size >= 32 * 1024 {
            insts.extend(M::gen_stack_lower_bound_trap(stack_limit));
        }

        // Add the `stack_size` to `stack_limit`, placing the result in
        // `scratch`.
        //
        // Note though that `stack_limit`'s register may be the same as
        // `scratch`. If our stack size doesn't fit into an immediate this
        // means we need a second scratch register for loading the stack size
        // into a register.
        let scratch = Writable::from_reg(M::get_stacklimit_reg());
        insts.extend(M::gen_add_imm(scratch, stack_limit, stack_size).into_iter());
        insts.extend(M::gen_stack_lower_bound_trap(scratch.to_reg()));
    }
}

/// Generates the instructions necessary for the `gv` to be materialized into a
/// register.
///
/// This function will return a register that will contain the result of
/// evaluating `gv`. It will also return any instructions necessary to calculate
/// the value of the register.
///
/// Note that global values are typically lowered to instructions via the
/// standard legalization pass. Unfortunately though prologue generation happens
/// so late in the pipeline that we can't use these legalization passes to
/// generate the instructions for `gv`. As a result we duplicate some lowering
/// of `gv` here and support only some global values. This is similar to what
/// the x86 backend does for now, and hopefully this can be somewhat cleaned up
/// in the future too!
///
/// Also note that this function will make use of `writable_spilltmp_reg()` as a
/// temporary register to store values in if necessary. Currently after we write
/// to this register there's guaranteed to be no spilled values between where
/// it's written and where it's used, because we're not participating in register
/// allocation anyway!
fn gen_stack_limit<M: ABIMachineSpec>(
    f: &ir::Function,
    abi: &ABISig,
    gv: ir::GlobalValue,
) -> (Reg, SmallInstVec<M::I>) {
    let mut insts = smallvec![];
    let reg = generate_gv::<M>(f, abi, gv, &mut insts);
    (reg, insts)
}

fn generate_gv<M: ABIMachineSpec>(
    f: &ir::Function,
    abi: &ABISig,
    gv: ir::GlobalValue,
    insts: &mut SmallInstVec<M::I>,
) -> Reg {
    match f.global_values[gv] {
        // Return the direct register the vmcontext is in
        ir::GlobalValueData::VMContext => {
            get_special_purpose_param_register(f, abi, ir::ArgumentPurpose::VMContext)
                .expect("no vmcontext parameter found")
        }
        // Load our base value into a register, then load from that register
        // in to a temporary register.
        ir::GlobalValueData::Load {
            base,
            offset,
            global_type: _,
            readonly: _,
        } => {
            let base = generate_gv::<M>(f, abi, base, insts);
            let into_reg = Writable::from_reg(M::get_stacklimit_reg());
            insts.push(M::gen_load_base_offset(
                into_reg,
                base,
                offset.into(),
                M::word_type(),
            ));
            into_reg.to_reg()
        }
        ref other => panic!("global value for stack limit not supported: {}", other),
    }
}

/// Return a type either from an optional type hint, or if not, from the default
/// type associated with the given register's class. This is used to generate
/// loads/spills appropriately given the type of value loaded/stored (which may
/// be narrower than the spillslot). We usually have the type because the
/// regalloc usually provides the vreg being spilled/reloaded, and we know every
/// vreg's type. However, the regalloc *can* request a spill/reload without an
/// associated vreg when needed to satisfy a safepoint (which requires all
/// ref-typed values, even those in real registers in the original vcode, to be
/// in spillslots).
fn ty_from_ty_hint_or_reg_class<M: ABIMachineSpec>(r: Reg, ty: Option<Type>) -> Type {
    match (ty, r.get_class()) {
        // If the type is provided, use it.
        (Some(t), _) => t,
        // If no type is provided, this should be a register spill for a
        // safepoint, so we only expect I32/I64 (integer) registers.
        (None, rc) if rc == M::word_reg_class() => M::word_type(),
        _ => panic!("Unexpected register class!"),
    }
}

fn gen_load_stack_multi<M: ABIMachineSpec>(
    from: StackAMode,
    dst: ValueRegs<Writable<Reg>>,
    ty: Type,
) -> SmallInstVec<M::I> {
    let mut ret = smallvec![];
    let (_, tys) = M::I::rc_for_type(ty).unwrap();
    let mut offset = 0;
    // N.B.: registers are given in the `ValueRegs` in target endian order.
    for (&dst, &ty) in dst.regs().iter().zip(tys.iter()) {
        ret.push(M::gen_load_stack(from.offset(offset), dst, ty));
        offset += ty.bytes() as i64;
    }
    ret
}

fn gen_store_stack_multi<M: ABIMachineSpec>(
    from: StackAMode,
    src: ValueRegs<Reg>,
    ty: Type,
) -> SmallInstVec<M::I> {
    let mut ret = smallvec![];
    let (_, tys) = M::I::rc_for_type(ty).unwrap();
    let mut offset = 0;
    // N.B.: registers are given in the `ValueRegs` in target endian order.
    for (&src, &ty) in src.regs().iter().zip(tys.iter()) {
        ret.push(M::gen_store_stack(from.offset(offset), src, ty));
        offset += ty.bytes() as i64;
    }
    ret
}

fn ensure_struct_return_ptr_is_returned(sig: &ir::Signature) -> ir::Signature {
    let params_structret = sig
        .params
        .iter()
        .find(|p| p.purpose == ArgumentPurpose::StructReturn);
    let rets_have_structret = sig
        .returns
        .iter()
        .any(|arg| arg.purpose == ArgumentPurpose::StructReturn);
    let mut sig = sig.clone();
    if let Some(structret) = params_structret {
        if !rets_have_structret {
            sig.returns.insert(0, structret.clone());
        }
    }
    sig
}

impl<M: ABIMachineSpec> ABICallee for ABICalleeImpl<M> {
    type I = M::I;

    fn signature(&self) -> &ir::Signature {
        &self.ir_sig
    }

    fn temp_needed(&self) -> Option<Type> {
        if self.sig.stack_ret_arg.is_some() {
            Some(M::word_type())
        } else {
            None
        }
    }

    fn init(&mut self, maybe_tmp: Option<Writable<Reg>>) {
        if self.sig.stack_ret_arg.is_some() {
            assert!(maybe_tmp.is_some());
            self.ret_area_ptr = maybe_tmp;
        }
    }

    fn accumulate_outgoing_args_size(&mut self, size: u32) {
        if size > self.outgoing_args_size {
            self.outgoing_args_size = size;
        }
    }

    fn flags(&self) -> &settings::Flags {
        &self.flags
    }

    fn call_conv(&self) -> isa::CallConv {
        self.sig.call_conv
    }

    fn liveins(&self) -> Set<RealReg> {
        let mut set: Set<RealReg> = Set::empty();
        for arg in &self.sig.args {
            if let &ABIArg::Slots { ref slots, .. } = arg {
                for slot in slots {
                    if let ABIArgSlot::Reg { reg, .. } = slot {
                        set.insert(*reg);
                    }
                }
            }
        }
        set
    }

    fn liveouts(&self) -> Set<RealReg> {
        let mut set: Set<RealReg> = Set::empty();
        for ret in &self.sig.rets {
            if let &ABIArg::Slots { ref slots, .. } = ret {
                for slot in slots {
                    if let ABIArgSlot::Reg { reg, .. } = slot {
                        set.insert(*reg);
                    }
                }
            }
        }
        set
    }

    fn num_args(&self) -> usize {
        self.sig.args.len()
    }

    fn num_retvals(&self) -> usize {
        self.sig.rets.len()
    }

    fn num_stackslots(&self) -> usize {
        self.stackslots.len()
    }

    fn stackslot_offsets(&self) -> &PrimaryMap<StackSlot, u32> {
        &self.stackslots
    }

    fn gen_copy_arg_to_regs(
        &self,
        idx: usize,
        into_regs: ValueRegs<Writable<Reg>>,
    ) -> SmallInstVec<Self::I> {
        let mut insts = smallvec![];
        match &self.sig.args[idx] {
            &ABIArg::Slots { ref slots, .. } => {
                assert_eq!(into_regs.len(), slots.len());
                for (slot, into_reg) in slots.iter().zip(into_regs.regs().iter()) {
                    match slot {
                        // Extension mode doesn't matter (we're copying out, not in; we
                        // ignore high bits by convention).
                        &ABIArgSlot::Reg { reg, ty, .. } => {
                            insts.push(M::gen_move(*into_reg, reg.to_reg(), ty));
                        }
                        &ABIArgSlot::Stack { offset, ty, .. } => {
                            insts.push(M::gen_load_stack(
                                StackAMode::FPOffset(
                                    M::fp_to_arg_offset(self.call_conv, &self.flags) + offset,
                                    ty,
                                ),
                                *into_reg,
                                ty,
                            ));
                        }
                    }
                }
            }
            &ABIArg::StructArg { offset, .. } => {
                let into_reg = into_regs.only_reg().unwrap();
                insts.push(M::gen_get_stack_addr(
                    StackAMode::FPOffset(
                        M::fp_to_arg_offset(self.call_conv, &self.flags) + offset,
                        I8,
                    ),
                    into_reg,
                    I8,
                ));
            }
        }
        insts
    }

    fn arg_is_needed_in_body(&self, idx: usize) -> bool {
        match self.sig.args[idx].get_purpose() {
            // Special Baldrdash-specific pseudo-args that are present only to
            // fill stack slots.  Won't ever be used as ordinary values in the
            // body.
            ir::ArgumentPurpose::CalleeTLS | ir::ArgumentPurpose::CallerTLS => false,
            _ => true,
        }
    }

    fn gen_copy_regs_to_retval(
        &self,
        idx: usize,
        from_regs: ValueRegs<Writable<Reg>>,
    ) -> SmallInstVec<Self::I> {
        let mut ret = smallvec![];
        let word_bits = M::word_bits() as u8;
        match &self.sig.rets[idx] {
            &ABIArg::Slots { ref slots, .. } => {
                assert_eq!(from_regs.len(), slots.len());
                for (slot, from_reg) in slots.iter().zip(from_regs.regs().iter()) {
                    match slot {
                        &ABIArgSlot::Reg {
                            reg, ty, extension, ..
                        } => {
                            let from_bits = ty_bits(ty) as u8;
                            let ext = M::get_ext_mode(self.sig.call_conv, extension);
                            match (ext, from_bits) {
                                (ArgumentExtension::Uext, n) | (ArgumentExtension::Sext, n)
                                    if n < word_bits =>
                                {
                                    let signed = ext == ArgumentExtension::Sext;
                                    ret.push(M::gen_extend(
                                        Writable::from_reg(reg.to_reg()),
                                        from_reg.to_reg(),
                                        signed,
                                        from_bits,
                                        /* to_bits = */ word_bits,
                                    ));
                                }
                                _ => {
                                    ret.push(M::gen_move(
                                        Writable::from_reg(reg.to_reg()),
                                        from_reg.to_reg(),
                                        ty,
                                    ));
                                }
                            };
                        }
                        &ABIArgSlot::Stack {
                            offset,
                            ty,
                            extension,
                            ..
                        } => {
                            let mut ty = ty;
                            let from_bits = ty_bits(ty) as u8;
                            // A machine ABI implementation should ensure that stack frames
                            // have "reasonable" size. All current ABIs for machinst
                            // backends (aarch64 and x64) enforce a 128MB limit.
                            let off = i32::try_from(offset).expect(
                                "Argument stack offset greater than 2GB; should hit impl limit first",
                            );
                            let ext = M::get_ext_mode(self.sig.call_conv, extension);
                            // Trash the from_reg; it should be its last use.
                            match (ext, from_bits) {
                                (ArgumentExtension::Uext, n) | (ArgumentExtension::Sext, n)
                                    if n < word_bits =>
                                {
                                    assert_eq!(M::word_reg_class(), from_reg.to_reg().get_class());
                                    let signed = ext == ArgumentExtension::Sext;
                                    ret.push(M::gen_extend(
                                        Writable::from_reg(from_reg.to_reg()),
                                        from_reg.to_reg(),
                                        signed,
                                        from_bits,
                                        /* to_bits = */ word_bits,
                                    ));
                                    // Store the extended version.
                                    ty = M::word_type();
                                }
                                _ => {}
                            };
                            ret.push(M::gen_store_base_offset(
                                self.ret_area_ptr.unwrap().to_reg(),
                                off,
                                from_reg.to_reg(),
                                ty,
                            ));
                        }
                    }
                }
            }
            &ABIArg::StructArg { .. } => {
                panic!("StructArg in return position is unsupported");
            }
        }
        ret
    }

    fn gen_retval_area_setup(&self) -> Option<Self::I> {
        if let Some(i) = self.sig.stack_ret_arg {
            let insts = self.gen_copy_arg_to_regs(i, ValueRegs::one(self.ret_area_ptr.unwrap()));
            let inst = insts.into_iter().next().unwrap();
            trace!(
                "gen_retval_area_setup: inst {:?}; ptr reg is {:?}",
                inst,
                self.ret_area_ptr.unwrap().to_reg()
            );
            Some(inst)
        } else {
            trace!("gen_retval_area_setup: not needed");
            None
        }
    }

    fn gen_ret(&self) -> Self::I {
        M::gen_ret()
    }

    fn gen_epilogue_placeholder(&self) -> Self::I {
        M::gen_epilogue_placeholder()
    }

    fn set_num_spillslots(&mut self, slots: usize) {
        self.spillslots = Some(slots);
    }

    fn set_clobbered(&mut self, clobbered: Set<Writable<RealReg>>) {
        self.clobbered = clobbered;
    }

    /// Load from a stackslot.
    fn load_stackslot(
        &self,
        slot: StackSlot,
        offset: u32,
        ty: Type,
        into_regs: ValueRegs<Writable<Reg>>,
    ) -> SmallInstVec<Self::I> {
        // Offset from beginning of stackslot area, which is at nominal SP (see
        // [StackAMode::NominalSPOffset] for more details on nominal SP tracking).
        let stack_off = self.stackslots[slot] as i64;
        let sp_off: i64 = stack_off + (offset as i64);
        trace!("load_stackslot: slot {} -> sp_off {}", slot, sp_off);
        gen_load_stack_multi::<M>(StackAMode::NominalSPOffset(sp_off, ty), into_regs, ty)
    }

    /// Store to a stackslot.
    fn store_stackslot(
        &self,
        slot: StackSlot,
        offset: u32,
        ty: Type,
        from_regs: ValueRegs<Reg>,
    ) -> SmallInstVec<Self::I> {
        // Offset from beginning of stackslot area, which is at nominal SP (see
        // [StackAMode::NominalSPOffset] for more details on nominal SP tracking).
        let stack_off = self.stackslots[slot] as i64;
        let sp_off: i64 = stack_off + (offset as i64);
        trace!("store_stackslot: slot {} -> sp_off {}", slot, sp_off);
        gen_store_stack_multi::<M>(StackAMode::NominalSPOffset(sp_off, ty), from_regs, ty)
    }

    /// Produce an instruction that computes a stackslot address.
    fn stackslot_addr(&self, slot: StackSlot, offset: u32, into_reg: Writable<Reg>) -> Self::I {
        // Offset from beginning of stackslot area, which is at nominal SP (see
        // [StackAMode::NominalSPOffset] for more details on nominal SP tracking).
        let stack_off = self.stackslots[slot] as i64;
        let sp_off: i64 = stack_off + (offset as i64);
        M::gen_get_stack_addr(StackAMode::NominalSPOffset(sp_off, I8), into_reg, I8)
    }

    /// Load from a spillslot.
    fn load_spillslot(
        &self,
        slot: SpillSlot,
        ty: Type,
        into_regs: ValueRegs<Writable<Reg>>,
    ) -> SmallInstVec<Self::I> {
        // Offset from beginning of spillslot area, which is at nominal SP + stackslots_size.
        let islot = slot.get() as i64;
        let spill_off = islot * M::word_bytes() as i64;
        let sp_off = self.stackslots_size as i64 + spill_off;
        trace!("load_spillslot: slot {:?} -> sp_off {}", slot, sp_off);

        // Integer types smaller than the word size are spilled as whole words
        // (see `store_spillslot` below), and therefore must be reloaded at the
        // same width.
        let ty = if ty.is_int() && ty.bytes() < M::word_bytes() {
            M::word_type()
        } else {
            ty
        };

        gen_load_stack_multi::<M>(StackAMode::NominalSPOffset(sp_off, ty), into_regs, ty)
    }
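
    // Spill-slot addressing sketch (hypothetical numbers): with
    // stackslots_size = 32 and word_bytes = 8, spill slot 3 lives at
    // sp_off = 32 + 3 * 8 = 56, i.e. nominal-SP + 56. Widening sub-word
    // integer spills to a full word keeps the slot's high bits defined even
    // if the value is later reloaded at a wider type.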

    /// Store to a spillslot.
    fn store_spillslot(
        &self,
        slot: SpillSlot,
        ty: Type,
        from_regs: ValueRegs<Reg>,
    ) -> SmallInstVec<Self::I> {
        // Offset from beginning of spillslot area, which is at nominal SP + stackslots_size.
        let islot = slot.get() as i64;
        let spill_off = islot * M::word_bytes() as i64;
        let sp_off = self.stackslots_size as i64 + spill_off;
        trace!("store_spillslot: slot {:?} -> sp_off {}", slot, sp_off);

        // By the time we reload from a spill slot, we may have lost information
        // about the value's real integer type. For instance, on the x64 backend,
        // a zero-extension can become spurious and be optimized into a move,
        // causing vregs of types I32 and I64 to share the same coalescing
        // equivalence class. Such a value can then be spilled as an I32 and
        // reloaded as an I64; to ensure the high bits are always defined, always
        // do a word-sized store in this case.
        let ty = if ty.is_int() && ty.bytes() < M::word_bytes() {
            M::word_type()
        } else {
            ty
        };

        gen_store_stack_multi::<M>(StackAMode::NominalSPOffset(sp_off, ty), from_regs, ty)
    }

    fn spillslots_to_stack_map(
        &self,
        slots: &[SpillSlot],
        state: &<Self::I as MachInstEmit>::State,
    ) -> StackMap {
        let virtual_sp_offset = M::get_virtual_sp_offset_from_state(state);
        let nominal_sp_to_fp = M::get_nominal_sp_to_fp(state);
        assert!(virtual_sp_offset >= 0);
        trace!(
            "spillslots_to_stackmap: slots = {:?}, state = {:?}",
            slots,
            state
        );
        let map_size = (virtual_sp_offset + nominal_sp_to_fp) as u32;
        let bytes = M::word_bytes();
        let map_words = (map_size + bytes - 1) / bytes;
        let mut bits = std::iter::repeat(false)
            .take(map_words as usize)
            .collect::<Vec<bool>>();

        let first_spillslot_word =
            ((self.stackslots_size + virtual_sp_offset as u32) / bytes) as usize;
        for &slot in slots {
            let slot = slot.get() as usize;
            bits[first_spillslot_word + slot] = true;
        }

        StackMap::from_slice(&bits[..])
    }
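
    // Stack-map sketch (hypothetical numbers): with word_bytes = 8,
    // virtual_sp_offset = 16 (space currently allocated below nominal SP),
    // nominal_sp_to_fp = 64, and stackslots_size = 16, the map covers
    // (16 + 64) / 8 = 10 words starting at the real SP. Spill slot 0 then
    // maps to word (16 + 16) / 8 = 4 of the map.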

    fn gen_prologue(&mut self) -> SmallInstVec<Self::I> {
        let mut insts = smallvec![];
        if !self.call_conv.extends_baldrdash() {
            // Set up the frame.
            insts.extend(M::gen_prologue_frame_setup(&self.flags).into_iter());
        }

        let bytes = M::word_bytes();
        let mut total_stacksize = self.stackslots_size + bytes * self.spillslots.unwrap() as u32;
        if self.call_conv.extends_baldrdash() {
            debug_assert!(
                !self.flags.enable_probestack(),
                "baldrdash does not expect cranelift to emit stack probes"
            );
            total_stacksize += self.flags.baldrdash_prologue_words() as u32 * bytes;
        }
        let mask = M::stack_align(self.call_conv) - 1;
        let total_stacksize = (total_stacksize + mask) & !mask; // Round up to the ABI-required stack alignment (typically 16 bytes).
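        // For example (illustrative only): with a 16-byte alignment,
        // mask = 15, so a raw size of 40 bytes rounds up to
        // (40 + 15) & !15 = 48.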

        if !self.call_conv.extends_baldrdash() {
            // Leaf functions that use no stack can skip the stack check even
            // when one is specified; all other functions get the check
            // whenever one is specified.
            if total_stacksize > 0 || !self.is_leaf {
                if let Some((reg, stack_limit_load)) = &self.stack_limit {
                    insts.extend(stack_limit_load.clone());
                    self.insert_stack_check(*reg, total_stacksize, &mut insts);
                }
                if let Some(min_frame) = &self.probestack_min_frame {
                    if total_stacksize >= *min_frame {
                        insts.extend(M::gen_probestack(total_stacksize));
                    }
                }
            }
            if total_stacksize > 0 {
                self.fixed_frame_storage_size += total_stacksize;
            }
        }

        // Save clobbered registers.
        let (clobber_size, clobber_insts) = M::gen_clobber_save(
            self.call_conv,
            &self.flags,
            &self.clobbered,
            self.fixed_frame_storage_size,
            self.outgoing_args_size,
        );
        insts.extend(clobber_insts);

        // N.B.: "nominal SP", which we use to refer to stackslots and
        // spillslots, is defined to be equal to the stack pointer at this point
        // in the prologue.
        //
        // If we push any further data onto the stack in the function
        // body, we emit a virtual-SP adjustment meta-instruction so
        // that the nominal SP references behave as if SP were still
        // at this point. See the documentation for
        // [crate::machinst::abi_impl] (this module) for more details
        // on stackframe layout and nominal SP maintenance.

        self.total_frame_size = Some(total_stacksize + clobber_size as u32);
        insts
    }
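
    // Rough frame sketch after the prologue (a sketch only; the exact
    // placement of the clobber-save area and any outgoing-args space is up to
    // the machine backend's `gen_clobber_save`):
    //
    //   high addresses
    //     | return address        |
    //     | saved FP              | <- FP points at this two-word record
    //     | clobber-save area     |
    //     | spill slots           |   (start at nominal SP + stackslots_size)
    //     | stack slots           |   (start at nominal SP)
    //     |-----------------------| <- nominal SP
    //     | outgoing args, etc.   |
    //   low addresses               <- real SP during the body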

    fn gen_epilogue(&self) -> SmallInstVec<M::I> {
        let mut insts = smallvec![];

        // Restore clobbered registers.
        insts.extend(M::gen_clobber_restore(
            self.call_conv,
            &self.flags,
            &self.clobbered,
            self.fixed_frame_storage_size,
            self.outgoing_args_size,
        ));

        // N.B.: we do *not* emit a nominal SP adjustment here, because (i) there will be no
        // references to nominal SP offsets before the return below, and (ii) the instruction
        // emission tracks running SP offset linearly (in straight-line order), not according to
        // the CFG, so early returns in the middle of function bodies would cause an incorrect
        // offset for the rest of the body.

        if !self.call_conv.extends_baldrdash() {
            insts.extend(M::gen_epilogue_frame_restore(&self.flags));
            insts.push(M::gen_ret());
        }

        debug!("Epilogue: {:?}", insts);
        insts
    }

    fn frame_size(&self) -> u32 {
        self.total_frame_size
            .expect("frame size not computed before prologue generation")
    }

    fn stack_args_size(&self) -> u32 {
        self.sig.stack_arg_space as u32
    }

    fn get_spillslot_size(&self, rc: RegClass, ty: Type) -> u32 {
        M::get_number_of_spillslots_for_value(rc, ty)
    }

    fn gen_spill(&self, to_slot: SpillSlot, from_reg: RealReg, ty: Option<Type>) -> Self::I {
        let ty = ty_from_ty_hint_or_reg_class::<M>(from_reg.to_reg(), ty);
        self.store_spillslot(to_slot, ty, ValueRegs::one(from_reg.to_reg()))
            .into_iter()
            .next()
            .unwrap()
    }

    fn gen_reload(
        &self,
        to_reg: Writable<RealReg>,
        from_slot: SpillSlot,
        ty: Option<Type>,
    ) -> Self::I {
        let ty = ty_from_ty_hint_or_reg_class::<M>(to_reg.to_reg().to_reg(), ty);
        self.load_spillslot(
            from_slot,
            ty,
            writable_value_regs(ValueRegs::one(to_reg.to_reg().to_reg())),
        )
        .into_iter()
        .next()
        .unwrap()
    }
}

fn abisig_to_uses_and_defs<M: ABIMachineSpec>(sig: &ABISig) -> (Vec<Reg>, Vec<Writable<Reg>>) {
    // Compute uses: all arg regs.
    let mut uses = Vec::new();
    for arg in &sig.args {
        if let &ABIArg::Slots { ref slots, .. } = arg {
            for slot in slots {
                match slot {
                    &ABIArgSlot::Reg { reg, .. } => {
                        uses.push(reg.to_reg());
                    }
                    _ => {}
                }
            }
        }
    }

    // Compute defs: all retval regs, and all caller-save (clobbered) regs.
    let mut defs = M::get_regs_clobbered_by_call(sig.call_conv);
    for ret in &sig.rets {
        if let &ABIArg::Slots { ref slots, .. } = ret {
            for slot in slots {
                match slot {
                    &ABIArgSlot::Reg { reg, .. } => {
                        defs.push(Writable::from_reg(reg.to_reg()));
                    }
                    _ => {}
                }
            }
        }
    }

    (uses, defs)
}
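
// Illustrative result (hypothetical, for a SysV-style x86-64 convention): for
// a signature `(i64, i64) -> i64`, `uses` would be the two argument registers
// (e.g. rdi, rsi) and `defs` would be the caller-saved set returned by
// `get_regs_clobbered_by_call` plus the return register (e.g. rax).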

/// ABI object for a callsite.
pub struct ABICallerImpl<M: ABIMachineSpec> {
    /// CLIF-level signature, possibly normalized.
    ir_sig: ir::Signature,
    /// The called function's signature.
    sig: ABISig,
    /// All uses for the callsite, i.e., function args.
    uses: Vec<Reg>,
    /// All defs for the callsite, i.e., return values and caller-saves.
    defs: Vec<Writable<Reg>>,
    /// Call destination.
    dest: CallDest,
    /// Actual call opcode; used to distinguish various types of calls.
    opcode: ir::Opcode,
    /// Caller's calling convention.
    caller_conv: isa::CallConv,
    /// The settings controlling this compilation.
    flags: settings::Flags,

    _mach: PhantomData<M>,
}

/// Destination for a call.
#[derive(Debug, Clone)]
pub enum CallDest {
    /// Call to an ExtName (named function symbol).
    ExtName(ir::ExternalName, RelocDistance),
    /// Indirect call to a function pointer in a register.
    Reg(Reg),
}

impl<M: ABIMachineSpec> ABICallerImpl<M> {
    /// Create a callsite ABI object for a call directly to the specified function.
    pub fn from_func(
        sig: &ir::Signature,
        extname: &ir::ExternalName,
        dist: RelocDistance,
        caller_conv: isa::CallConv,
        flags: &settings::Flags,
    ) -> CodegenResult<ABICallerImpl<M>> {
        let ir_sig = ensure_struct_return_ptr_is_returned(sig);
        let sig = ABISig::from_func_sig::<M>(&ir_sig, flags)?;
        let (uses, defs) = abisig_to_uses_and_defs::<M>(&sig);
        Ok(ABICallerImpl {
            ir_sig,
            sig,
            uses,
            defs,
            dest: CallDest::ExtName(extname.clone(), dist),
            opcode: ir::Opcode::Call,
            caller_conv,
            flags: flags.clone(),
            _mach: PhantomData,
        })
    }

    /// Create a callsite ABI object for a call to a function pointer with the
    /// given signature.
    pub fn from_ptr(
        sig: &ir::Signature,
        ptr: Reg,
        opcode: ir::Opcode,
        caller_conv: isa::CallConv,
        flags: &settings::Flags,
    ) -> CodegenResult<ABICallerImpl<M>> {
        let ir_sig = ensure_struct_return_ptr_is_returned(sig);
        let sig = ABISig::from_func_sig::<M>(&ir_sig, flags)?;
        let (uses, defs) = abisig_to_uses_and_defs::<M>(&sig);
        Ok(ABICallerImpl {
            ir_sig,
            sig,
            uses,
            defs,
            dest: CallDest::Reg(ptr),
            opcode,
            caller_conv,
            flags: flags.clone(),
            _mach: PhantomData,
        })
    }
}

fn adjust_stack_and_nominal_sp<M: ABIMachineSpec, C: LowerCtx<I = M::I>>(
    ctx: &mut C,
    off: i32,
    is_sub: bool,
) {
    if off == 0 {
        return;
    }
    let amt = if is_sub { -off } else { off };
    for inst in M::gen_sp_reg_adjust(amt) {
        ctx.emit(inst);
    }
    ctx.emit(M::gen_nominal_sp_adj(-amt));
}
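
// For example (illustrative only): `adjust_stack_and_nominal_sp(ctx, 32,
// /* is_sub = */ true)` emits an SP adjustment of -32 (e.g. `sub sp, sp, #32`
// on AArch64) followed by a nominal-SP meta-adjustment of +32, so that
// nominal-SP-relative offsets keep resolving to the same addresses while the
// real SP sits 32 bytes lower.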

impl<M: ABIMachineSpec> ABICaller for ABICallerImpl<M> {
    type I = M::I;

    fn signature(&self) -> &ir::Signature {
        &self.ir_sig
    }

    fn num_args(&self) -> usize {
        if self.sig.stack_ret_arg.is_some() {
            self.sig.args.len() - 1
        } else {
            self.sig.args.len()
        }
    }

    fn accumulate_outgoing_args_size<C: LowerCtx<I = Self::I>>(&self, ctx: &mut C) {
        let off = self.sig.stack_arg_space + self.sig.stack_ret_space;
        ctx.abi().accumulate_outgoing_args_size(off as u32);
    }

    fn emit_stack_pre_adjust<C: LowerCtx<I = Self::I>>(&self, ctx: &mut C) {
        let off = self.sig.stack_arg_space + self.sig.stack_ret_space;
        adjust_stack_and_nominal_sp::<M, C>(ctx, off as i32, /* is_sub = */ true)
    }

    fn emit_stack_post_adjust<C: LowerCtx<I = Self::I>>(&self, ctx: &mut C) {
        let off = self.sig.stack_arg_space + self.sig.stack_ret_space;
        adjust_stack_and_nominal_sp::<M, C>(ctx, off as i32, /* is_sub = */ false)
    }

    fn emit_copy_regs_to_arg<C: LowerCtx<I = Self::I>>(
        &self,
        ctx: &mut C,
        idx: usize,
        from_regs: ValueRegs<Reg>,
    ) {
        let word_rc = M::word_reg_class();
        let word_bits = M::word_bits() as usize;
        match &self.sig.args[idx] {
            &ABIArg::Slots { ref slots, .. } => {
                assert_eq!(from_regs.len(), slots.len());
                for (slot, from_reg) in slots.iter().zip(from_regs.regs().iter()) {
                    match slot {
                        &ABIArgSlot::Reg {
                            reg, ty, extension, ..
                        } => {
                            let ext = M::get_ext_mode(self.sig.call_conv, extension);
                            if ext != ir::ArgumentExtension::None && ty_bits(ty) < word_bits {
                                assert_eq!(word_rc, reg.get_class());
                                let signed = match ext {
                                    ir::ArgumentExtension::Uext => false,
                                    ir::ArgumentExtension::Sext => true,
                                    _ => unreachable!(),
                                };
                                ctx.emit(M::gen_extend(
                                    Writable::from_reg(reg.to_reg()),
                                    *from_reg,
                                    signed,
                                    ty_bits(ty) as u8,
                                    word_bits as u8,
                                ));
                            } else {
                                ctx.emit(M::gen_move(
                                    Writable::from_reg(reg.to_reg()),
                                    *from_reg,
                                    ty,
                                ));
                            }
                        }
                        &ABIArgSlot::Stack {
                            offset,
                            ty,
                            extension,
                            ..
                        } => {
                            let mut ty = ty;
                            let ext = M::get_ext_mode(self.sig.call_conv, extension);
                            if ext != ir::ArgumentExtension::None && ty_bits(ty) < word_bits {
                                assert_eq!(word_rc, from_reg.get_class());
                                let signed = match ext {
                                    ir::ArgumentExtension::Uext => false,
                                    ir::ArgumentExtension::Sext => true,
                                    _ => unreachable!(),
                                };
                                // Extend in place in the source register. Our convention is to
                                // treat high bits as undefined for values in registers, so this
                                // is safe, even for an argument that is nominally read-only.
                                ctx.emit(M::gen_extend(
                                    Writable::from_reg(*from_reg),
                                    *from_reg,
                                    signed,
                                    ty_bits(ty) as u8,
                                    word_bits as u8,
                                ));
                                // Store the extended version.
                                ty = M::word_type();
                            }
                            ctx.emit(M::gen_store_stack(
                                StackAMode::SPOffset(offset, ty),
                                *from_reg,
                                ty,
                            ));
                        }
                    }
                }
            }
            &ABIArg::StructArg { offset, size, .. } => {
                let src_ptr = from_regs.only_reg().unwrap();
                let dst_ptr = ctx.alloc_tmp(M::word_type()).only_reg().unwrap();
                ctx.emit(M::gen_get_stack_addr(
                    StackAMode::SPOffset(offset, I8),
                    dst_ptr,
                    I8,
                ));
                // Emit a memcpy from `src_ptr` to `dst_ptr` of `size` bytes.
                // N.B.: because we process StructArg params *first*, this is
                // safe w.r.t. clobbers: we have not yet filled in any other
                // arg regs.
                let memcpy_call_conv = isa::CallConv::for_libcall(&self.flags, self.sig.call_conv);
                for insn in
                    M::gen_memcpy(memcpy_call_conv, dst_ptr.to_reg(), src_ptr, size as usize)
                        .into_iter()
                {
                    ctx.emit(insn);
                }
            }
        }
    }

    fn get_copy_to_arg_order(&self) -> SmallVec<[usize; 8]> {
        let mut ret = SmallVec::new();
        for (i, arg) in self.sig.args.iter().enumerate() {
            // Struct args.
            if arg.is_struct_arg() {
                ret.push(i);
            }
        }
        for (i, arg) in self.sig.args.iter().enumerate() {
            // Non-struct args. Skip an appended return-area arg for multivalue
            // returns, if any.
            if !arg.is_struct_arg() && i < self.ir_sig.params.len() {
                ret.push(i);
            }
        }
        ret
    }
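
    // Ordering example (hypothetical signature): for args [a: Slots,
    // b: StructArg, c: Slots] plus an appended return-area arg at index 3,
    // `get_copy_to_arg_order` yields [1, 0, 2]: the StructArg first (so its
    // memcpy cannot clobber already-placed arg regs), then the remaining
    // args, with the hidden return-area arg left for `emit_call` to fill in.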

    fn emit_copy_retval_to_regs<C: LowerCtx<I = Self::I>>(
        &self,
        ctx: &mut C,
        idx: usize,
        into_regs: ValueRegs<Writable<Reg>>,
    ) {
        match &self.sig.rets[idx] {
            &ABIArg::Slots { ref slots, .. } => {
                assert_eq!(into_regs.len(), slots.len());
                for (slot, into_reg) in slots.iter().zip(into_regs.regs().iter()) {
                    match slot {
                        // Extension mode doesn't matter because we're copying out, not in,
                        // and we ignore high bits in our own registers by convention.
                        &ABIArgSlot::Reg { reg, ty, .. } => {
                            ctx.emit(M::gen_move(*into_reg, reg.to_reg(), ty));
                        }
                        &ABIArgSlot::Stack { offset, ty, .. } => {
                            let ret_area_base = self.sig.stack_arg_space;
                            ctx.emit(M::gen_load_stack(
                                StackAMode::SPOffset(offset + ret_area_base, ty),
                                *into_reg,
                                ty,
                            ));
                        }
                    }
                }
            }
            &ABIArg::StructArg { .. } => {
                panic!("StructArg not supported in return position");
            }
        }
    }

    fn emit_call<C: LowerCtx<I = Self::I>>(&mut self, ctx: &mut C) {
        let (uses, defs) = (
            mem::replace(&mut self.uses, Default::default()),
            mem::replace(&mut self.defs, Default::default()),
        );
        let word_type = M::word_type();
        if let Some(i) = self.sig.stack_ret_arg {
            let rd = ctx.alloc_tmp(word_type).only_reg().unwrap();
            let ret_area_base = self.sig.stack_arg_space;
            ctx.emit(M::gen_get_stack_addr(
                StackAMode::SPOffset(ret_area_base, I8),
                rd,
                I8,
            ));
            self.emit_copy_regs_to_arg(ctx, i, ValueRegs::one(rd.to_reg()));
        }
        let tmp = ctx.alloc_tmp(word_type).only_reg().unwrap();
        for (is_safepoint, inst) in M::gen_call(
            &self.dest,
            uses,
            defs,
            self.opcode,
            tmp,
            self.sig.call_conv,
            self.caller_conv,
        )
        .into_iter()
        {
            match is_safepoint {
                InstIsSafepoint::Yes => ctx.emit_safepoint(inst),
                InstIsSafepoint::No => ctx.emit(inst),
            }
        }
    }
}
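
// Expected callsite emission order, as driven by the lowering code (a sketch
// only; the actual driver lives outside this module):
//
//   caller.emit_stack_pre_adjust(ctx);            // reserve arg/ret space
//   for i in caller.get_copy_to_arg_order() {     // StructArgs first
//       caller.emit_copy_regs_to_arg(ctx, i, ..);
//   }
//   caller.emit_call(ctx);                        // also fills the hidden
//                                                 // return-area pointer arg
//   for i in 0..num_rets {
//       caller.emit_copy_retval_to_regs(ctx, i, ..);
//   }
//   caller.emit_stack_post_adjust(ctx);           // release arg/ret space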