1/* 2 * Copyright (c) 1985 Regents of the University of California. 3 * All rights reserved. 4 * 5 * Redistribution and use in source and binary forms are permitted 6 * provided that the above copyright notice and this paragraph are 7 * duplicated in all such forms and that any documentation, 8 * advertising materials, and other materials related to such 9 * distribution and use acknowledge that the software was developed 10 * by the University of California, Berkeley. The name of the 11 * University may not be used to endorse or promote products derived 12 * from this software without specific prior written permission. 13 * THIS SOFTWARE IS PROVIDED ``AS IS'' AND WITHOUT ANY EXPRESS OR 14 * IMPLIED WARRANTIES, INCLUDING, WITHOUT LIMITATION, THE IMPLIED 15 * WARRANTIES OF MERCHANTIBILITY AND FITNESS FOR A PARTICULAR PURPOSE. 16 * 17 * All recipients should regard themselves as participants in an ongoing 18 * research project and hence should feel obligated to report their 19 * experiences (good or bad) with these elementary function codes, using 20 * the sendbug(8) program, to the authors. 21 * 22 * K.C. Ng, with Z-S. Alex Liu, S. McDonald, P. Tang, W. Kahan. 23 * Revised on 5/10/85, 5/13/85, 6/14/85, 8/20/85, 8/27/85, 9/11/85. 24 * 25 * @(#)README 5.3 (Berkeley) 06/30/88 26 */ 27 28****************************************************************************** 29* This is a description of the upgraded elementary functions (listed in 1). * 30* Bessel functions (j0, j1, jn, y0, y1, yn), floor, and fabs passed over * 31* from 4.2BSD without change except perhaps for the way floating point * 32* exception is signaled on a VAX. Three lines that contain "errno" in erf.c* 33* (error functions erf, erfc) have been deleted to prevent overriding the * 34* system "errno". * 35****************************************************************************** 36 370. Total number of files: 40 38 39 IEEE/Makefile VAX/Makefile VAX/support.s erf.c lgamma.c 40 IEEE/atan2.c VAX/argred.s VAX/tan.s exp.c log.c 41 IEEE/cabs.c VAX/atan2.s acosh.c exp__E.c log10.c 42 IEEE/cbrt.c VAX/cabs.s asincos.c expm1.c log1p.c 43 IEEE/support.c VAX/cbrt.s asinh.c floor.c log__L.c 44 IEEE/trig.c VAX/infnan.s atan.c j0.c pow.c 45 Makefile VAX/sincos.s atanh.c j1.c sinh.c 46 README VAX/sqrt.s cosh.c jn.c tanh.c 47 481. Functions implemented : 49 (A). Standard elementary functions (total 22) : 50 acos(x) ...in file asincos.c 51 asin(x) ...in file asincos.c 52 atan(x) ...in file atan.c 53 atan2(x,y) ...in files IEEE/atan2.c, VAX/atan2.s 54 sin(x) ...in files IEEE/trig.c, VAX/sincos.s 55 cos(x) ...in files IEEE/trig.c, VAX/sincos.s 56 tan(x) ...in files IEEE/trig.c, VAX/tan.s 57 cabs(x,y) ...in files IEEE/cabs.c, VAX/cabs.s 58 hypot(x,y) ...in files IEEE/cabs.c, VAX/cabs.s 59 cbrt(x) ...in files IEEE/cbrt.c, VAX/cbrt.s 60 exp(x) ...in file exp.c 61 expm1(x):=exp(x)-1 ...in file expm1.c 62 log(x) ...in file log.c 63 log10(x) ...in file log10.c 64 log1p(x):=log(1+x) ...in file log1p.c 65 pow(x,y) ...in file pow.c 66 sinh(x) ...in file sinh.c 67 cosh(x) ...in file cosh.c 68 tanh(x) ...in file tanh.c 69 asinh(x) ...in file asinh.c 70 acosh(x) ...in file acosh.c 71 atanh(x) ...in file atanh.c 72 73 (B). Kernel functions : 74 exp__E(x,c) ...in file exp__E.c, used by expm1/exp/pow/cosh 75 log__L(s) ...in file log__L.c, used by log1p/log/pow 76 libm$argred ...in file VAX/argred.s, used by VAX version of sin/cos/tan 77 78 (C). System supported functions : 79 sqrt() ...in files IEEE/support.c, VAX/sqrt.s 80 drem() ...in files IEEE/support.c, VAX/support.s 81 finite() ...in files IEEE/support.c, VAX/support.s 82 logb() ...in files IEEE/support.c, VAX/support.s 83 scalb() ...in files IEEE/support.c, VAX/support.s 84 copysign() ...in files IEEE/support.c, VAX/support.s 85 rint() ...in file floor.c 86 87 88 Notes: 89 i. The codes in files ending with ".s" are written in VAX assembly 90 language. They are intended for VAX computers. 91 92 Files that end with ".c" are written in C. They are intended 93 for either a VAX or a machine that conforms to the IEEE 94 standard 754 for double precision floating-point arithmetic. 95 96 ii. On other than VAX or IEEE machines, run the original math 97 library, formerly "/usr/lib/libm.a", now "/usr/lib/libom.a", if 98 nothing better is available. 99 100 iii. The trigonometric functions sin/cos/tan/atan2 in files "VAX/sincos.s", 101 "VAX/tan.s" and "VAX/atan2.s" are different from those in 102 "IEEE/trig.c" and "IEEE/atan2.c". The VAX assembler code uses the 103 true value of pi to perform argument reduction, while the C code uses 104 a machine value of PI (see "IEEE/trig.c"). 105 106 1072. A computer system that conforms to IEEE standard 754 should provide 108 sqrt(x), 109 drem(x,p), (double precision remainder function) 110 copysign(x,y), 111 finite(x), 112 scalb(x,N), 113 logb(x) and 114 rint(x). 115 These functions are either required or recommended by the standard. 116 For convenience, a (slow) C implementation of these functions is 117 provided in the file "IEEE/support.c". 118 119 Warning: The functions in IEEE/support.c are somewhat machine dependent. 120 Some modifications may be necessary to run them on a different machine. 121 Currently, if compiled with a suitable flag, "IEEE/support.c" will work 122 on a National 32000, a Zilog 8000, a VAX, and a SUN (cf. the "Makefile" 123 in this directory). Invoke the C compiler thus: 124 125 cc -c -DVAX IEEE/support.c ... on a VAX, D-format 126 cc -c -DNATIONAL IEEE/support.c ... on a National 32000 127 cc -c IEEE/support.c ... on other IEEE machines, 128 we hope. 129 130 Notes: 131 1. Faster versions of "drem" and "sqrt" for IEEE double precision 132 (coded in C but intended for assembly language) are given at the 133 end of "IEEE/support.c" but commented out since they require certain 134 machine-dependent functions. 135 136 2. A fast VAX assembler version of the system supported functions 137 copysign(), logb(), scalb(), finite(), and drem() appears in file 138 "VAX/support.s". A fast VAX assembler version of sqrt() is in 139 file "VAX/sqrt.s". 140 1413. Two formats are supported by all the standard elementary functions: 142 the VAX D-format (56-bit precision), and the IEEE double format 143 (53-bit precision). The cbrt() in "IEEE/cbrt.c" is for IEEE machines 144 only. The functions in files that end with ".s" are for VAX computers 145 only. The functions in files that end with ".c" (except "IEEE/cbrt.c") 146 are for VAX and IEEE machines. To use the VAX D-format, compile the code 147 with -DVAX; to use IEEE double format on various IEEE machines, see 148 "Makefile" in this directory). 149 150 Example: 151 cc -c -DVAX sin.c ... for VAX D-format 152 153 Warning: The values of floating-point constants used in the code are 154 given in both hexadecimal and decimal. The hexadecimal values 155 are the intended ones. The decimal values may be used provided 156 that the compiler converts from decimal to binary accurately 157 enough to produce the hexadecimal values shown. If the 158 conversion is inaccurate, then one must know the exact machine 159 representation of the constants and alter the assembly 160 language output from the compiler, or play tricks like 161 the following in a C program. 162 163 Example: to store the floating-point constant 164 165 p1= 2^-6 * .F83ABE67E1066A (Hexadecimal) 166 167 on a VAX in C, we use two longwords to store its 168 machine value and define p1 to be the double constant 169 at the location of these two longwords: 170 171 static long p1x[] = { 0x3abe3d78, 0x066a67e1}; 172 #define p1 (*(double*)p1x) 173 174 Note: On a VAX, some functions have two codes. For example, cabs() has 175 one implementation in "IEEE/cabs.c", and another in "VAX/cabs.s". 176 In this case, the assembly language version is preferred. 177 178 1794. Accuracy. 180 181 The errors in expm1(), log1p(), exp(), log(), cabs(), hypot() 182 and cbrt() are below 1 ULP (Unit in the Last Place). 183 184 The error in pow(x,y) grows with the size of y. Nevertheless, 185 for integers x and y, pow(x,y) returns the correct integer value 186 on all tested machines (VAX, SUN, NATIONAL, ZILOG), provided that 187 x to the power of y is representable exactly. 188 189 cosh, sinh, acosh, asinh, tanh, atanh and log10 have errors below 190 about 3 ULPs. 191 192 For trigonometric and inverse trigonometric functions: 193 194 Let [trig(x)] denote the value actually computed for trig(x), 195 196 1) Those codes using the machine's value PI (true pi rounded): 197 (source codes: IEEE/{trig.c,atan2.c}, asincos.c and atan.c) 198 199 The errors in [sin(x)], [cos(x)], and [atan(x)] are below 200 1 ULP compared with sin(x*pi/PI), cos(x*pi/PI), and 201 atan(x)*PI/pi respectively, where PI is the machine's 202 value of pi rounded. [tan(x)] returns tan(x*pi/PI) within 203 about 2 ULPs; [acos(x)], [asin(x)], and [atan2(y,x)] 204 return acos(x)*PI/pi, asin(x)*PI/pi, and atan2(y,x)*PI/pi 205 respectively to similar accuracy. 206 207 208 2) Those using true pi (for VAX D-format only): 209 (source codes: VAX/{sincos.s,tan.s,atan2.s}, asincos.c and 210 atan.c) 211 212 The errors in [sin(x)], [cos(x)], and [atan(x)] are below 213 1 ULP. [tan(x)], [atan2(y,x)], [acos(x)], and [asin(x)] 214 have errors below about 2 ULPs. 215 216 217 Here are the results of some test runs to find worst errors on 218 the VAX : 219 220 tan : 2.09 ULPs ...1,024,000 random arguments (machine PI) 221 sin : .861 ULPs ...1,024,000 random arguments (machine PI) 222 cos : .857 ULPs ...1,024,000 random arguments (machine PI) 223 (compared with tan, sin, cos of (x*pi/PI)) 224 225 acos : 2.07 ULPs .....200,000 random arguments (machine PI) 226 asin : 2.06 ULPs .....200,000 random arguments (machine PI) 227 atan2 : 1.41 ULPs .....356,000 random arguments (machine PI) 228 atan : 0.86 ULPs ...1,536,000 random arguments (machine PI) 229 (compared with (PI/pi)*(atan, asin, acos, atan2 of x)) 230 231 tan : 2.15 ULPs ...1,024,000 random arguments (true pi) 232 sin : .814 ULPs ...1,024,000 random arguments (true pi) 233 cos : .792 ULPs ...1,024,000 random arguments (true pi) 234 acos : 2.15 ULPs ...1,024,000 random arguments (true pi) 235 asin : 1.99 ULPs ...1,024,000 random arguments (true pi) 236 atan2 : 1.48 ULPs ...1,024,000 random arguments (true pi) 237 atan : .850 ULPs ...1,024,000 random arguments (true pi) 238 239 acosh : 3.30 ULPs .....512,000 random arguments 240 asinh : 1.58 ULPs .....512,000 random arguments 241 atanh : 1.71 ULPs .....512,000 random arguments 242 cosh : 1.23 ULPs .....768,000 random arguments 243 sinh : 1.93 ULPs ...1,024,000 random arguments 244 tanh : 2.22 ULPs ...1,024,000 random arguments 245 log10 : 1.74 ULPs ...1,536,000 random arguments 246 pow : 1.79 ULPs .....100,000 random arguments, 0 < x, y < 20. 247 248 exp : .768 ULPs ...1,156,000 random arguments 249 expm1 : .844 ULPs ...1,166,000 random arguments 250 log1p : .846 ULPs ...1,536,000 random arguments 251 log : .826 ULPs ...1,536,000 random arguments 252 cabs : .959 ULPs .....500,000 random arguments 253 cbrt : .666 ULPs ...5,120,000 random arguments 254 255 2565. Speed. 257 258 Some functions coded in VAX assembly language (cabs(), hypot() and 259 sqrt()) are significantly faster than the corresponding ones in 4.2BSD. 260 In general, to improve performance, all functions in "IEEE/support.c" 261 should be written in assembly language and, whenever possible, should 262 be called via short subroutine calls. 263 264 2656. j0, j1, jn. 266 267 The modifications to these routines were only in how an invalid 268 floating point operations is signaled. 269