1""" 2Implementation of the inverse of the logarithm of the CDF of the standard 3normal distribution. 4 5Copyright: Albert Steppi 6 7Distributed under the same license as SciPy 8 9 10Implementation Overview 11 12The inverse of the CDF of the standard normal distribution is available 13in scipy through the Cephes Math Library where it is called ndtri. 14We call our implementation of the inverse of the log CDF ndtri_exp. 15For -2 <= y <= log(1 - exp(-2)), ndtri_exp is computed as ndtri(exp(y)). 16 17For 0 < p < exp(-2), the cephes implementation of ndtri uses an approximation 18for ndtri(p) which is a function of z = sqrt(-2.0 * log(p)). Letting 19y = log(p), for y < -2, ndtri_exp uses this approximation in log(p) directly. 20This allows the implementation to achieve high precision for very small y, 21whereas ndtri(exp(y)) evaluates to infinity. This is because exp(y) underflows 22for y < ~ -745.1. 23 24When p > 1 - exp(-2), the Cephes implementation of ndtri uses the symmetry 25of the normal distribution and calculates ndtri(p) as -ndtri(1 - p) allowing 26for the use of the same approximation. When y > log(1 - exp(-2)) this 27implementation calculates ndtri_exp as -ndtri(-exp1m(y)). 28 29 30Accuracy 31 32Cephes provides the following relative error estimates for ndtri 33 Relative error: 34arithmetic domain # trials peak rms 35 IEEE 0.125, 1 20000 7.2e-16 1.3e-16 36 IEEE 3e-308, 0.135 50000 4.6e-16 9.8e-17 37 38When y < -2, ndtri_exp must have relative error at least as small as the 39Cephes implementation of ndtri for p < exp(-2). It relies on the same 40approximation but does not have to lose precision by passing from p to log(p) 41before applying the approximation. 42 43Relative error of ndtri for values of the argument p near 1 can be much higher 44than claimed by the above chart. For p near 1, symmetry is exploited to 45replace the calculation of ndtri(p) with -ndtri(1 - p). The inverse of the 46normal CDF increases so rapidly near the endpoints of [0, 1] that the loss 47of precision incurred by the subtraction 1 - p due to limitations in binary 48approximation can make a significant difference in the results. Using 49version 9.3.0 targeting x86_64-linux-gnu we've observed the following 50 51 Estimated Relative Error 52 ndtri(1e-8) = -5.612001244174789 '' 53-ndtri(1 - 1e-8) = -5.612001243305505 1.55e-10 54 ndtri(1e-16) = -8.222082216130435 '' 55-ndtri(1 - 1e-16) = -8.209536151601387 0.0015 56 57If expm1 is correctly rounded for y in [log(1 - exp(-2), 0), then ndtri_exp(y) 58should have the same relative error as ndtri(p) for p > 1 - exp(-2). As seen 59above, this error may be higher than desired. IEEE-754 provides no guarantee on 60the accuracy of expm1 however, therefore accuracy of ndtri_exp in this range 61is platform dependent. 62 63The case 64 65 -2 <= y <= log(1 - exp(-2)) ~ -0.1454 66 67corresponds to 68 69 ~ 0.135 <= p <= ~ 0.865 70 71The derivative of ndtri is sqrt(2 * pi) * exp(ndtri(x)**2 / 2). 72It is ~4.597 at x ~ 0.135, decreases monotonically to sqrt(2 * pi) ~ 2.507 73at x = 0 and increases monotonically again to ~4.597 at x ~ 0.865. 74 75It can be checked that all higher derivatives follow a similar pattern. 76Their absolute value takes on a maximum (for this interval) at x ~ 0.135, 77decrease to a minimum at x = 0 and increases to the same maximum at x ~ 0.865. 78Derivatives of all orders are positive at x=log(1 - exp(-2)). Thus the worst 79possible loss of precision of ndtri(exp(x)) in the interval 80[0, log(1 - exp(-2))] due to error in calculating exp(x) must occur near 81x=log(1 - exp(-2)). By symmetry, the worst possible loss of precision in 82[-2, log(1 - exp(-2)] must occur near the endpoints. We may observe 83empirically that error at the endpoints due to exp is not substantial. 84Assuming that exp is accurate within +-ULP (unit of least precision), 85we observed a value of at most ~6.0474e-16 for 86 87 abs(ndtri(x + epsilon) - ndtri(x)) 88 89if x is near exp(-2) or 1 - exp(-2) and epsilon is equal to the unit of least 90precision of x. 91 92(IEEE-754 provides no guarantee on the accuracy of exp, but for most 93compilers on most architectures an assumption of +-ULP should be 94reasonable.) 95 96The error here is on the order of the error in the Cephes implementation of 97ndtri itself, leading to an error profile that is still favorable. 98""" 99 100 101import cython 102from libc.float cimport DBL_MAX 103from libc.math cimport exp, expm1, log, log1p, sqrt, M_SQRT2 104 105cdef extern from "cephes/polevl.h": 106 double polevl(double x, const double coef[], int N) nogil 107 double p1evl(double x, const double coef[], int N) nogil 108 109cdef extern from "numpy/npy_math.h": 110 double NPY_INFINITY 111 112from ._cephes cimport ndtri 113 114@cython.cdivision(True) 115cdef inline double _ndtri_exp_small_y(double y) nogil: 116 """Return inverse of log CDF of normal distribution for very small y 117 118 For p sufficiently small, the inverse of the CDF of the normal 119 distribution can be approximated to high precision as a rational function 120 in sqrt(-2.0 * log(p)). 121 """ 122 cdef: 123 double *P1 = [ 124 4.05544892305962419923, 3.15251094599893866154e1, 125 5.71628192246421288162e1, 4.40805073893200834700e1, 126 1.46849561928858024014e1, 2.18663306850790267539, 127 -1.40256079171354495875e-1, -3.50424626827848203418e-2, 128 -8.57456785154685413611e-4 129 ] 130 double *Q1 = [ 131 1.57799883256466749731e1, 4.53907635128879210584e1, 132 4.13172038254672030440e1, 1.50425385692907503408e1, 133 2.50464946208309415979, -1.42182922854787788574e-1, 134 -3.80806407691578277194e-2, -9.33259480895457427372e-4 135 ] 136 double *P2 = [ 137 3.23774891776946035970, 6.91522889068984211695, 138 3.93881025292474443415, 1.33303460815807542389, 139 2.01485389549179081538e-1, 1.23716634817820021358e-2, 140 3.01581553508235416007e-4, 2.65806974686737550832e-6, 141 6.23974539184983293730e-9 142 ] 143 double *Q2 = [ 144 6.02427039364742014255, 3.67983563856160859403, 145 1.37702099489081330271, 2.16236993594496635890e-1, 146 1.34204006088543189037e-2, 3.28014464682127739104e-4, 147 2.89247864745380683936e-6, 6.79019408009981274425e-9 148 ] 149 double x, x0, x1, z 150 151 # sqrt(-2 * y) is faster and has more precision but overflows when 152 # y < -DBL_MAX * 0.5 153 if y >= -DBL_MAX * 0.5: 154 x = sqrt(-2 * y) 155 else: 156 x = M_SQRT2 * sqrt(-y) 157 x0 = x - log(x) / x 158 z = 1 / x 159 if x < 8.0: 160 x1 = z * polevl(z, P1, 8) / p1evl(z, Q1, 8) 161 else: 162 x1 = z * polevl(z, P2, 8) / p1evl(z, Q2, 8) 163 return x1 - x0 164 165 166cdef inline double ndtri_exp(double y) nogil: 167 """Return inverse of logarithm of Normal CDF evaluated at y.""" 168 if y < -DBL_MAX: 169 return -NPY_INFINITY 170 elif y < - 2.0: 171 return _ndtri_exp_small_y(y) 172 elif y > -0.14541345786885906: # log1p(-exp(-2)) 173 return -ndtri(-expm1(y)) 174 else: 175 return ndtri(exp(y)) 176