=============
Current State
=============

The following describes the current state of the NetWinder's floating point
emulator.

The following nomenclature is used to describe the floating point
instructions.  It follows the conventions in the ARM manual.

::

  <S|D|E> = <single|double|extended>, no default
  {P|M|Z} = {round to +infinity,round to -infinity,round to zero},
            default = round to nearest

Note: items enclosed in {} are optional.
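
The nomenclature above can be sketched as a small suffix parser.  This is
only an illustration: the type names and ``parse_fp_suffix()`` are
hypothetical and are not part of the emulator.  It takes the letters that
follow the mnemonic and condition code (for example ``DP``), requires a
precision letter, and falls back to round to nearest when no rounding letter
is given.

```c
#include <stddef.h>

/* Precision selected by the mandatory <S|D|E> suffix. */
enum fp_precision {
    PREC_INVALID = -1,
    PREC_SINGLE,
    PREC_DOUBLE,
    PREC_EXTENDED
};

/* Rounding mode selected by the optional {P|M|Z} suffix. */
enum fp_rounding {
    ROUND_INVALID = -1,
    ROUND_NEAREST,
    ROUND_PLUS_INF,
    ROUND_MINUS_INF,
    ROUND_ZERO
};

struct fp_suffix {
    enum fp_precision precision;
    enum fp_rounding rounding;
};

/*
 * Hypothetical helper: parse the suffix letters that follow the mnemonic
 * and condition code.  Both fields stay *_INVALID when the precision
 * letter is missing; rounding stays ROUND_INVALID on a bad rounding letter.
 */
static struct fp_suffix parse_fp_suffix(const char *s)
{
    struct fp_suffix r = { PREC_INVALID, ROUND_INVALID };

    if (s == NULL)
        return r;

    switch (s[0]) {
    case 'S': r.precision = PREC_SINGLE;   break;
    case 'D': r.precision = PREC_DOUBLE;   break;
    case 'E': r.precision = PREC_EXTENDED; break;
    default:  return r;              /* precision has no default */
    }

    switch (s[1]) {
    case '\0': r.rounding = ROUND_NEAREST;   break;  /* the default */
    case 'P':  r.rounding = ROUND_PLUS_INF;  break;
    case 'M':  r.rounding = ROUND_MINUS_INF; break;
    case 'Z':  r.rounding = ROUND_ZERO;      break;
    default:   return r;
    }

    /* Reject anything after the rounding letter. */
    if (s[1] != '\0' && s[2] != '\0')
        r.rounding = ROUND_INVALID;

    return r;
}
```

With this sketch, ``parse_fp_suffix("D")`` selects double precision with
round to nearest, while ``parse_fp_suffix("SZ")`` selects single precision
with round to zero.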

Floating Point Coprocessor Data Transfer Instructions (CPDT)
------------------------------------------------------------

LDF/STF - load and store floating::

  <LDF|STF>{cond}<S|D|E> Fd, [Rn]
  <LDF|STF>{cond}<S|D|E> Fd, [Rn, #<expression>]{!}
  <LDF|STF>{cond}<S|D|E> Fd, [Rn], #<expression>

These instructions are fully implemented.

LFM/SFM - load and store multiple floating

Form 1 syntax::

  <LFM|SFM>{cond}<S|D|E> Fd, <count>, [Rn]
  <LFM|SFM>{cond}<S|D|E> Fd, <count>, [Rn, #<expression>]{!}
  <LFM|SFM>{cond}<S|D|E> Fd, <count>, [Rn], #<expression>

Form 2 syntax::

  <LFM|SFM>{cond}<FD|EA> Fd, <count>, [Rn]{!}

These instructions are fully implemented.  They transfer three words for
each floating point register to or from the memory location given in the
instruction.  The format in memory is unlikely to be compatible with
other implementations, in particular the actual hardware.  The ARM manuals
make specific mention of this.
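
As a worked example of the three-words-per-register layout, the sketch below
computes how many bytes a single LFM/SFM touches.  The helper name is
hypothetical, and the 1 to 4 range for ``<count>`` and the 4-byte word size
are assumptions taken from the ARM instruction set rather than from this
document.

```c
#include <stddef.h>

enum {
    WORDS_PER_FP_REG = 3,   /* three words per register, as stated above */
    BYTES_PER_WORD   = 4    /* assumption: 32-bit ARM word */
};

/*
 * Hypothetical helper: bytes transferred by one LFM/SFM.
 * count is the number of registers; returns 0 for an out-of-range count
 * (assumed valid range: 1 to 4).
 */
static size_t lfm_sfm_transfer_bytes(unsigned int count)
{
    if (count < 1 || count > 4)
        return 0;

    return (size_t)count * WORDS_PER_FP_REG * BYTES_PER_WORD;
}
```

Under these assumptions one register occupies 12 bytes and a full
four-register transfer occupies 48 bytes.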

Floating Point Coprocessor Register Transfer Instructions (CPRT)
----------------------------------------------------------------

Conversions, read/write status/control register instructions::

  FLT{cond}<S|D|E>{P|M|Z} Fn, Rd          Convert integer to floating point
  FIX{cond}{P|M|Z} Rd, Fn                 Convert floating point to integer
  WFS{cond} Rd                            Write floating point status register
  RFS{cond} Rd                            Read floating point status register
  WFC{cond} Rd                            Write floating point control register
  RFC{cond} Rd                            Read floating point control register

FLT/FIX are fully implemented.

RFS/WFS are fully implemented.

RFC/WFC are fully implemented.  RFC/WFC are supervisor-only instructions;
the emulator presently checks the CPU mode and raises an invalid instruction
trap if they are not called from supervisor mode.
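
The mode check for RFC can be sketched as follows.  This is a minimal
illustration, not the emulator's code: the function names, the result enum
and the idea of passing the status register as a plain integer are all
hypothetical, and the mode encodings assume a 32-bit ARM CPSR.

```c
#include <stdbool.h>
#include <stdint.h>

#define PSR_MODE_MASK 0x1fu   /* assumption: low five CPSR bits hold the mode */
#define PSR_MODE_USR  0x10u   /* user mode */
#define PSR_MODE_SVC  0x13u   /* supervisor mode */

enum cprt_result {
    CPRT_OK,
    CPRT_INVALID_INSTRUCTION  /* caller should raise the trap */
};

static bool in_supervisor_mode(uint32_t cpsr)
{
    return (cpsr & PSR_MODE_MASK) == PSR_MODE_SVC;
}

/*
 * Hypothetical RFC handler: copy the control register value into *rd,
 * but only when the instruction was issued from supervisor mode.
 * On failure *rd is left untouched.
 */
static enum cprt_result emulate_rfc(uint32_t cpsr, uint32_t fpcr, uint32_t *rd)
{
    if (!in_supervisor_mode(cpsr))
        return CPRT_INVALID_INSTRUCTION;

    *rd = fpcr;
    return CPRT_OK;
}
```

A WFC handler would follow the same pattern, with the transfer going in the
opposite direction.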

Compare instructions::

  CMF{cond} Fn, Fm        Compare floating
  CMFE{cond} Fn, Fm       Compare floating with exception
  CNF{cond} Fn, Fm        Compare negated floating
  CNFE{cond} Fn, Fm       Compare negated floating with exception

These are fully implemented.
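
The difference between the plain and negated compares can be sketched as
below.  Everything here is an assumption for illustration: the function and
flag names are hypothetical, ``double`` stands in for the emulator's
internal formats, and the flag mapping (N for less than, Z and C for equal,
C for greater than, V for unordered) follows the usual description in the
ARM manual rather than anything stated in this document.  The exception
raised by CMFE/CNFE on unordered operands is not modelled.

```c
/* Condition flags, placed in a 4-bit NZCV nibble for this sketch. */
#define CC_N 0x8u   /* less than */
#define CC_Z 0x4u   /* equal */
#define CC_C 0x2u   /* greater than or equal */
#define CC_V 0x1u   /* unordered */

/*
 * Hypothetical compare: CMF passes negate_fm = 0, CNF passes
 * negate_fm = 1 so that Fn is compared against -Fm.
 */
static unsigned int emulate_compare(double fn, double fm, int negate_fm)
{
    if (negate_fm)
        fm = -fm;

    /* A NaN never compares equal to itself. */
    if (fn != fn || fm != fm)
        return CC_V;

    if (fn < fm)
        return CC_N;
    if (fn == fm)
        return CC_Z | CC_C;

    return CC_C;
}
```

For example, comparing 1.0 against -1.0 with ``negate_fm`` set reports
equality, because the second operand is negated before the comparison.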

Floating Point Coprocessor Data Instructions (CPDO)
---------------------------------------------------

Dyadic operations::

  ADF{cond}<S|D|E>{P|M|Z} Fd, Fn, <Fm|#value> - add
  SUF{cond}<S|D|E>{P|M|Z} Fd, Fn, <Fm|#value> - subtract
  RSF{cond}<S|D|E>{P|M|Z} Fd, Fn, <Fm|#value> - reverse subtract
  MUF{cond}<S|D|E>{P|M|Z} Fd, Fn, <Fm|#value> - multiply
  DVF{cond}<S|D|E>{P|M|Z} Fd, Fn, <Fm|#value> - divide
  RDV{cond}<S|D|E>{P|M|Z} Fd, Fn, <Fm|#value> - reverse divide

These are fully implemented.

::

  FML{cond}<S|D|E>{P|M|Z} Fd, Fn, <Fm|#value> - fast multiply
  FDV{cond}<S|D|E>{P|M|Z} Fd, Fn, <Fm|#value> - fast divide
  FRD{cond}<S|D|E>{P|M|Z} Fd, Fn, <Fm|#value> - fast reverse divide

These are fully implemented as well.  They use the same algorithm as the
non-fast versions.  Hence, in this implementation their performance is
equivalent to the MUF/DVF/RDV instructions.  This is acceptable according
to the ARM manual.  The manual notes that these are defined only for single
operands; on the actual FPA11 hardware they do not work for double or
extended precision operands.  The emulator currently does not check this
precision restriction, and performs the requested operation.
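
The relationship between the normal, reverse and fast forms can be sketched
as a single dispatch routine.  This is an illustration under stated
assumptions: the enum and function names are hypothetical, and ``double``
arithmetic stands in for the emulator's own single, double and extended
routines.  The point is only that each fast opcode falls through to the
same code as its non-fast counterpart, and that the reverse forms swap the
operands.

```c
/* Hypothetical opcode tags for the dyadic operations listed above. */
enum dyadic_op {
    OP_ADF, OP_SUF, OP_RSF,
    OP_MUF, OP_DVF, OP_RDV,
    OP_FML, OP_FDV, OP_FRD
};

/* Hypothetical dispatcher: Fd = Fn <op> Fm. */
static double emulate_dyadic(enum dyadic_op op, double fn, double fm)
{
    switch (op) {
    case OP_ADF: return fn + fm;
    case OP_SUF: return fn - fm;
    case OP_RSF: return fm - fn;        /* reverse: operands swapped */
    case OP_MUF:
    case OP_FML: return fn * fm;        /* fast multiply shares this path */
    case OP_DVF:
    case OP_FDV: return fn / fm;        /* fast divide shares this path */
    case OP_RDV:
    case OP_FRD: return fm / fn;        /* reverse: operands swapped */
    }

    return 0.0;                         /* unreachable for valid opcodes */
}
```

No precision check appears in the sketch, which mirrors the behaviour
described above: the fast forms are carried out whatever precision was
requested.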

::

  RMF{cond}<S|D|E>{P|M|Z} Fd, Fn, <Fm|#value> - IEEE remainder

This is fully implemented.

Monadic operations::

  MVF{cond}<S|D|E>{P|M|Z} Fd, <Fm|#value> - move
  MNF{cond}<S|D|E>{P|M|Z} Fd, <Fm|#value> - move negated

These are fully implemented.

::

  ABS{cond}<S|D|E>{P|M|Z} Fd, <Fm|#value> - absolute value
  SQT{cond}<S|D|E>{P|M|Z} Fd, <Fm|#value> - square root
  RND{cond}<S|D|E>{P|M|Z} Fd, <Fm|#value> - round

These are fully implemented.

::

  URD{cond}<S|D|E>{P|M|Z} Fd, <Fm|#value> - unnormalized round
  NRM{cond}<S|D|E>{P|M|Z} Fd, <Fm|#value> - normalize

These are implemented.  URD is implemented using the same code as the RND
instruction.  Since URD cannot return an unnormalized number, NRM becomes
a NOP.
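
The URD/NRM arrangement can be sketched as follows.  The function names are
hypothetical, and the rounding helper is only a stand-in: it assumes RND
rounds to an integral value, uses ``double`` instead of the emulator's
formats, and supports round to nearest only.  What the sketch is meant to
show is that URD simply calls the RND path, so its result is already
normalized and NRM has nothing left to do.

```c
/*
 * Stand-in for RND: round to the nearest integral value, ties to even.
 * Relies on the default round-to-nearest mode; values with magnitude
 * 2^52 or more (and NaN or infinity) are already integral or are passed
 * through unchanged.
 */
static double emulate_rnd(double x)
{
    const double two52 = 4503599627370496.0;    /* 2^52 */
    volatile double t;

    if (!(x < two52 && x > -two52))
        return x;

    if (x >= 0.0) {
        t = x + two52;      /* fraction bits are rounded away here */
        return t - two52;
    }

    t = x - two52;
    return t + two52;
}

/* URD shares the RND code path, so it never yields an unnormalized value. */
static double emulate_urd(double x)
{
    return emulate_rnd(x);
}

/* NRM is therefore a no-op: the operand is returned unchanged. */
static double emulate_nrm(double x)
{
    return x;
}
```

In this sketch ``emulate_nrm(emulate_urd(x))`` always equals
``emulate_rnd(x)``.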

Library calls::

  POW{cond}<S|D|E>{P|M|Z} Fd, Fn, <Fm|#value> - power
  RPW{cond}<S|D|E>{P|M|Z} Fd, Fn, <Fm|#value> - reverse power
  POL{cond}<S|D|E>{P|M|Z} Fd, Fn, <Fm|#value> - polar angle (arctan2)

  LOG{cond}<S|D|E>{P|M|Z} Fd, <Fm|#value> - logarithm to base 10
  LGN{cond}<S|D|E>{P|M|Z} Fd, <Fm|#value> - logarithm to base e
  EXP{cond}<S|D|E>{P|M|Z} Fd, <Fm|#value> - exponent
  SIN{cond}<S|D|E>{P|M|Z} Fd, <Fm|#value> - sine
  COS{cond}<S|D|E>{P|M|Z} Fd, <Fm|#value> - cosine
  TAN{cond}<S|D|E>{P|M|Z} Fd, <Fm|#value> - tangent
  ASN{cond}<S|D|E>{P|M|Z} Fd, <Fm|#value> - arcsine
  ACS{cond}<S|D|E>{P|M|Z} Fd, <Fm|#value> - arccosine
  ATN{cond}<S|D|E>{P|M|Z} Fd, <Fm|#value> - arctangent

These are not implemented.  The compiler does not currently issue them;
the corresponding operations are handled by routines in libc.  The FPA11
hardware does not implement them either; there they are handled by the
floating point support code.  They should be implemented in future versions.

Signalling:

Signals are implemented.  However, current ELF kernels produced by Rebel.com
have a bug that prevents the module from generating a SIGFPE.  This
is caused by a failure to alias fp_current to the kernel variable
current_set[0] correctly.

The kernel provided with this distribution (vmlinux-nwfpe-0.93) contains
a fix for this problem and also incorporates the current version of the
emulator directly.  With this kernel it is possible to run with no floating
point module loaded.  It is provided as a demonstration of the
technology and for those who want to do floating point work that depends
on signals.  It is not strictly necessary to use the module.

A module (either the one provided by Russell King, or the one in this
distribution) can be loaded to replace the functionality of the emulator
built into the kernel.