1*> \brief <b> DSGESV computes the solution to system of linear equations A * X = B for GE matrices</b> (mixed precision with iterative refinement) 2* 3* =========== DOCUMENTATION =========== 4* 5* Online html documentation available at 6* http://www.netlib.org/lapack/explore-html/ 7* 8*> \htmlonly 9*> Download DSGESV + dependencies 10*> <a href="http://www.netlib.org/cgi-bin/netlibfiles.tgz?format=tgz&filename=/lapack/lapack_routine/dsgesv.f"> 11*> [TGZ]</a> 12*> <a href="http://www.netlib.org/cgi-bin/netlibfiles.zip?format=zip&filename=/lapack/lapack_routine/dsgesv.f"> 13*> [ZIP]</a> 14*> <a href="http://www.netlib.org/cgi-bin/netlibfiles.txt?format=txt&filename=/lapack/lapack_routine/dsgesv.f"> 15*> [TXT]</a> 16*> \endhtmlonly 17* 18* Definition: 19* =========== 20* 21* SUBROUTINE DSGESV( N, NRHS, A, LDA, IPIV, B, LDB, X, LDX, WORK, 22* SWORK, ITER, INFO ) 23* 24* .. Scalar Arguments .. 25* INTEGER INFO, ITER, LDA, LDB, LDX, N, NRHS 26* .. 27* .. Array Arguments .. 28* INTEGER IPIV( * ) 29* REAL SWORK( * ) 30* DOUBLE PRECISION A( LDA, * ), B( LDB, * ), WORK( N, * ), 31* $ X( LDX, * ) 32* .. 33* 34* 35*> \par Purpose: 36* ============= 37*> 38*> \verbatim 39*> 40*> DSGESV computes the solution to a real system of linear equations 41*> A * X = B, 42*> where A is an N-by-N matrix and X and B are N-by-NRHS matrices. 43*> 44*> DSGESV first attempts to factorize the matrix in SINGLE PRECISION 45*> and use this factorization within an iterative refinement procedure 46*> to produce a solution with DOUBLE PRECISION normwise backward error 47*> quality (see below). If the approach fails the method switches to a 48*> DOUBLE PRECISION factorization and solve. 49*> 50*> The iterative refinement is not going to be a winning strategy if 51*> the ratio SINGLE PRECISION performance over DOUBLE PRECISION 52*> performance is too small. A reasonable strategy should take the 53*> number of right-hand sides and the size of the matrix into account. 54*> This might be done with a call to ILAENV in the future. Up to now, we 55*> always try iterative refinement. 56*> 57*> The iterative refinement process is stopped if 58*> ITER > ITERMAX 59*> or for all the RHS we have: 60*> RNRM < SQRT(N)*XNRM*ANRM*EPS*BWDMAX 61*> where 62*> o ITER is the number of the current iteration in the iterative 63*> refinement process 64*> o RNRM is the infinity-norm of the residual 65*> o XNRM is the infinity-norm of the solution 66*> o ANRM is the infinity-operator-norm of the matrix A 67*> o EPS is the machine epsilon returned by DLAMCH('Epsilon') 68*> The value ITERMAX and BWDMAX are fixed to 30 and 1.0D+00 69*> respectively. 70*> \endverbatim 71* 72* Arguments: 73* ========== 74* 75*> \param[in] N 76*> \verbatim 77*> N is INTEGER 78*> The number of linear equations, i.e., the order of the 79*> matrix A. N >= 0. 80*> \endverbatim 81*> 82*> \param[in] NRHS 83*> \verbatim 84*> NRHS is INTEGER 85*> The number of right hand sides, i.e., the number of columns 86*> of the matrix B. NRHS >= 0. 87*> \endverbatim 88*> 89*> \param[in,out] A 90*> \verbatim 91*> A is DOUBLE PRECISION array, 92*> dimension (LDA,N) 93*> On entry, the N-by-N coefficient matrix A. 94*> On exit, if iterative refinement has been successfully used 95*> (INFO = 0 and ITER >= 0, see description below), then A is 96*> unchanged, if double precision factorization has been used 97*> (INFO = 0 and ITER < 0, see description below), then the 98*> array A contains the factors L and U from the factorization 99*> A = P*L*U; the unit diagonal elements of L are not stored. 100*> \endverbatim 101*> 102*> \param[in] LDA 103*> \verbatim 104*> LDA is INTEGER 105*> The leading dimension of the array A. LDA >= max(1,N). 106*> \endverbatim 107*> 108*> \param[out] IPIV 109*> \verbatim 110*> IPIV is INTEGER array, dimension (N) 111*> The pivot indices that define the permutation matrix P; 112*> row i of the matrix was interchanged with row IPIV(i). 113*> Corresponds either to the single precision factorization 114*> (if INFO = 0 and ITER >= 0) or the double precision 115*> factorization (if INFO = 0 and ITER < 0). 116*> \endverbatim 117*> 118*> \param[in] B 119*> \verbatim 120*> B is DOUBLE PRECISION array, dimension (LDB,NRHS) 121*> The N-by-NRHS right hand side matrix B. 122*> \endverbatim 123*> 124*> \param[in] LDB 125*> \verbatim 126*> LDB is INTEGER 127*> The leading dimension of the array B. LDB >= max(1,N). 128*> \endverbatim 129*> 130*> \param[out] X 131*> \verbatim 132*> X is DOUBLE PRECISION array, dimension (LDX,NRHS) 133*> If INFO = 0, the N-by-NRHS solution matrix X. 134*> \endverbatim 135*> 136*> \param[in] LDX 137*> \verbatim 138*> LDX is INTEGER 139*> The leading dimension of the array X. LDX >= max(1,N). 140*> \endverbatim 141*> 142*> \param[out] WORK 143*> \verbatim 144*> WORK is DOUBLE PRECISION array, dimension (N,NRHS) 145*> This array is used to hold the residual vectors. 146*> \endverbatim 147*> 148*> \param[out] SWORK 149*> \verbatim 150*> SWORK is REAL array, dimension (N*(N+NRHS)) 151*> This array is used to use the single precision matrix and the 152*> right-hand sides or solutions in single precision. 153*> \endverbatim 154*> 155*> \param[out] ITER 156*> \verbatim 157*> ITER is INTEGER 158*> < 0: iterative refinement has failed, double precision 159*> factorization has been performed 160*> -1 : the routine fell back to full precision for 161*> implementation- or machine-specific reasons 162*> -2 : narrowing the precision induced an overflow, 163*> the routine fell back to full precision 164*> -3 : failure of SGETRF 165*> -31: stop the iterative refinement after the 30th 166*> iterations 167*> > 0: iterative refinement has been successfully used. 168*> Returns the number of iterations 169*> \endverbatim 170*> 171*> \param[out] INFO 172*> \verbatim 173*> INFO is INTEGER 174*> = 0: successful exit 175*> < 0: if INFO = -i, the i-th argument had an illegal value 176*> > 0: if INFO = i, U(i,i) computed in DOUBLE PRECISION is 177*> exactly zero. The factorization has been completed, 178*> but the factor U is exactly singular, so the solution 179*> could not be computed. 180*> \endverbatim 181* 182* Authors: 183* ======== 184* 185*> \author Univ. of Tennessee 186*> \author Univ. of California Berkeley 187*> \author Univ. of Colorado Denver 188*> \author NAG Ltd. 189* 190*> \date June 2016 191* 192*> \ingroup doubleGEsolve 193* 194* ===================================================================== 195 SUBROUTINE DSGESV( N, NRHS, A, LDA, IPIV, B, LDB, X, LDX, WORK, 196 $ SWORK, ITER, INFO ) 197* 198* -- LAPACK driver routine (version 3.8.0) -- 199* -- LAPACK is a software package provided by Univ. of Tennessee, -- 200* -- Univ. of California Berkeley, Univ. of Colorado Denver and NAG Ltd..-- 201* June 2016 202* 203* .. Scalar Arguments .. 204 INTEGER INFO, ITER, LDA, LDB, LDX, N, NRHS 205* .. 206* .. Array Arguments .. 207 INTEGER IPIV( * ) 208 REAL SWORK( * ) 209 DOUBLE PRECISION A( LDA, * ), B( LDB, * ), WORK( N, * ), 210 $ X( LDX, * ) 211* .. 212* 213* ===================================================================== 214* 215* .. Parameters .. 216 LOGICAL DOITREF 217 PARAMETER ( DOITREF = .TRUE. ) 218* 219 INTEGER ITERMAX 220 PARAMETER ( ITERMAX = 30 ) 221* 222 DOUBLE PRECISION BWDMAX 223 PARAMETER ( BWDMAX = 1.0E+00 ) 224* 225 DOUBLE PRECISION NEGONE, ONE 226 PARAMETER ( NEGONE = -1.0D+0, ONE = 1.0D+0 ) 227* 228* .. Local Scalars .. 229 INTEGER I, IITER, PTSA, PTSX 230 DOUBLE PRECISION ANRM, CTE, EPS, RNRM, XNRM 231* 232* .. External Subroutines .. 233 EXTERNAL DAXPY, DGEMM, DLACPY, DLAG2S, DGETRF, DGETRS, 234 $ SGETRF, SGETRS, SLAG2D, XERBLA 235* .. 236* .. External Functions .. 237 INTEGER IDAMAX 238 DOUBLE PRECISION DLAMCH, DLANGE 239 EXTERNAL IDAMAX, DLAMCH, DLANGE 240* .. 241* .. Intrinsic Functions .. 242 INTRINSIC ABS, DBLE, MAX, SQRT 243* .. 244* .. Executable Statements .. 245* 246 INFO = 0 247 ITER = 0 248* 249* Test the input parameters. 250* 251 IF( N.LT.0 ) THEN 252 INFO = -1 253 ELSE IF( NRHS.LT.0 ) THEN 254 INFO = -2 255 ELSE IF( LDA.LT.MAX( 1, N ) ) THEN 256 INFO = -4 257 ELSE IF( LDB.LT.MAX( 1, N ) ) THEN 258 INFO = -7 259 ELSE IF( LDX.LT.MAX( 1, N ) ) THEN 260 INFO = -9 261 END IF 262 IF( INFO.NE.0 ) THEN 263 CALL XERBLA( 'DSGESV', -INFO ) 264 RETURN 265 END IF 266* 267* Quick return if (N.EQ.0). 268* 269 IF( N.EQ.0 ) 270 $ RETURN 271* 272* Skip single precision iterative refinement if a priori slower 273* than double precision factorization. 274* 275 IF( .NOT.DOITREF ) THEN 276 ITER = -1 277 GO TO 40 278 END IF 279* 280* Compute some constants. 281* 282 ANRM = DLANGE( 'I', N, N, A, LDA, WORK ) 283 EPS = DLAMCH( 'Epsilon' ) 284 CTE = ANRM*EPS*SQRT( DBLE( N ) )*BWDMAX 285* 286* Set the indices PTSA, PTSX for referencing SA and SX in SWORK. 287* 288 PTSA = 1 289 PTSX = PTSA + N*N 290* 291* Convert B from double precision to single precision and store the 292* result in SX. 293* 294 CALL DLAG2S( N, NRHS, B, LDB, SWORK( PTSX ), N, INFO ) 295* 296 IF( INFO.NE.0 ) THEN 297 ITER = -2 298 GO TO 40 299 END IF 300* 301* Convert A from double precision to single precision and store the 302* result in SA. 303* 304 CALL DLAG2S( N, N, A, LDA, SWORK( PTSA ), N, INFO ) 305* 306 IF( INFO.NE.0 ) THEN 307 ITER = -2 308 GO TO 40 309 END IF 310* 311* Compute the LU factorization of SA. 312* 313 CALL SGETRF( N, N, SWORK( PTSA ), N, IPIV, INFO ) 314* 315 IF( INFO.NE.0 ) THEN 316 ITER = -3 317 GO TO 40 318 END IF 319* 320* Solve the system SA*SX = SB. 321* 322 CALL SGETRS( 'No transpose', N, NRHS, SWORK( PTSA ), N, IPIV, 323 $ SWORK( PTSX ), N, INFO ) 324* 325* Convert SX back to double precision 326* 327 CALL SLAG2D( N, NRHS, SWORK( PTSX ), N, X, LDX, INFO ) 328* 329* Compute R = B - AX (R is WORK). 330* 331 CALL DLACPY( 'All', N, NRHS, B, LDB, WORK, N ) 332* 333 CALL DGEMM( 'No Transpose', 'No Transpose', N, NRHS, N, NEGONE, A, 334 $ LDA, X, LDX, ONE, WORK, N ) 335* 336* Check whether the NRHS normwise backward errors satisfy the 337* stopping criterion. If yes, set ITER=0 and return. 338* 339 DO I = 1, NRHS 340 XNRM = ABS( X( IDAMAX( N, X( 1, I ), 1 ), I ) ) 341 RNRM = ABS( WORK( IDAMAX( N, WORK( 1, I ), 1 ), I ) ) 342 IF( RNRM.GT.XNRM*CTE ) 343 $ GO TO 10 344 END DO 345* 346* If we are here, the NRHS normwise backward errors satisfy the 347* stopping criterion. We are good to exit. 348* 349 ITER = 0 350 RETURN 351* 352 10 CONTINUE 353* 354 DO 30 IITER = 1, ITERMAX 355* 356* Convert R (in WORK) from double precision to single precision 357* and store the result in SX. 358* 359 CALL DLAG2S( N, NRHS, WORK, N, SWORK( PTSX ), N, INFO ) 360* 361 IF( INFO.NE.0 ) THEN 362 ITER = -2 363 GO TO 40 364 END IF 365* 366* Solve the system SA*SX = SR. 367* 368 CALL SGETRS( 'No transpose', N, NRHS, SWORK( PTSA ), N, IPIV, 369 $ SWORK( PTSX ), N, INFO ) 370* 371* Convert SX back to double precision and update the current 372* iterate. 373* 374 CALL SLAG2D( N, NRHS, SWORK( PTSX ), N, WORK, N, INFO ) 375* 376 DO I = 1, NRHS 377 CALL DAXPY( N, ONE, WORK( 1, I ), 1, X( 1, I ), 1 ) 378 END DO 379* 380* Compute R = B - AX (R is WORK). 381* 382 CALL DLACPY( 'All', N, NRHS, B, LDB, WORK, N ) 383* 384 CALL DGEMM( 'No Transpose', 'No Transpose', N, NRHS, N, NEGONE, 385 $ A, LDA, X, LDX, ONE, WORK, N ) 386* 387* Check whether the NRHS normwise backward errors satisfy the 388* stopping criterion. If yes, set ITER=IITER>0 and return. 389* 390 DO I = 1, NRHS 391 XNRM = ABS( X( IDAMAX( N, X( 1, I ), 1 ), I ) ) 392 RNRM = ABS( WORK( IDAMAX( N, WORK( 1, I ), 1 ), I ) ) 393 IF( RNRM.GT.XNRM*CTE ) 394 $ GO TO 20 395 END DO 396* 397* If we are here, the NRHS normwise backward errors satisfy the 398* stopping criterion, we are good to exit. 399* 400 ITER = IITER 401* 402 RETURN 403* 404 20 CONTINUE 405* 406 30 CONTINUE 407* 408* If we are at this place of the code, this is because we have 409* performed ITER=ITERMAX iterations and never satisfied the 410* stopping criterion, set up the ITER flag accordingly and follow up 411* on double precision routine. 412* 413 ITER = -ITERMAX - 1 414* 415 40 CONTINUE 416* 417* Single-precision iterative refinement failed to converge to a 418* satisfactory solution, so we resort to double precision. 419* 420 CALL DGETRF( N, N, A, LDA, IPIV, INFO ) 421* 422 IF( INFO.NE.0 ) 423 $ RETURN 424* 425 CALL DLACPY( 'All', N, NRHS, B, LDB, X, LDX ) 426 CALL DGETRS( 'No transpose', N, NRHS, A, LDA, IPIV, X, LDX, 427 $ INFO ) 428* 429 RETURN 430* 431* End of DSGESV. 432* 433 END 434