1 SUBROUTINE PCTRRFS( UPLO, TRANS, DIAG, N, NRHS, A, IA, JA, DESCA, 2 $ B, IB, JB, DESCB, X, IX, JX, DESCX, FERR, 3 $ BERR, WORK, LWORK, RWORK, LRWORK, INFO ) 4* 5* -- ScaLAPACK routine (version 1.7) -- 6* University of Tennessee, Knoxville, Oak Ridge National Laboratory, 7* and University of California, Berkeley. 8* May 1, 1997 9* 10* .. Scalar Arguments .. 11 CHARACTER DIAG, TRANS, UPLO 12 INTEGER INFO, IA, IB, IX, JA, JB, JX, LRWORK, LWORK, 13 $ N, NRHS 14* .. 15* .. Array Arguments .. 16 INTEGER DESCA( * ), DESCB( * ), DESCX( * ) 17 REAL BERR( * ), FERR( * ), RWORK( * ) 18 COMPLEX A( * ), B( * ), WORK( * ), X( * ) 19* .. 20* 21* Purpose 22* ======= 23* 24* PCTRRFS provides error bounds and backward error estimates for the 25* solution to a system of linear equations with a triangular 26* coefficient matrix. 27* 28* The solution matrix X must be computed by PCTRTRS or some other 29* means before entering this routine. PCTRRFS does not do iterative 30* refinement because doing so cannot improve the backward error. 31* 32* Notes 33* ===== 34* 35* Each global data object is described by an associated description 36* vector. This vector stores the information required to establish 37* the mapping between an object element and its corresponding process 38* and memory location. 39* 40* Let A be a generic term for any 2D block cyclicly distributed array. 41* Such a global array has an associated description vector DESCA. 42* In the following comments, the character _ should be read as 43* "of the global array". 44* 45* NOTATION STORED IN EXPLANATION 46* --------------- -------------- -------------------------------------- 47* DTYPE_A(global) DESCA( DTYPE_ )The descriptor type. In this case, 48* DTYPE_A = 1. 49* CTXT_A (global) DESCA( CTXT_ ) The BLACS context handle, indicating 50* the BLACS process grid A is distribu- 51* ted over. The context itself is glo- 52* bal, but the handle (the integer 53* value) may vary. 54* M_A (global) DESCA( M_ ) The number of rows in the global 55* array A. 56* N_A (global) DESCA( N_ ) The number of columns in the global 57* array A. 58* MB_A (global) DESCA( MB_ ) The blocking factor used to distribute 59* the rows of the array. 60* NB_A (global) DESCA( NB_ ) The blocking factor used to distribute 61* the columns of the array. 62* RSRC_A (global) DESCA( RSRC_ ) The process row over which the first 63* row of the array A is distributed. 64* CSRC_A (global) DESCA( CSRC_ ) The process column over which the 65* first column of the array A is 66* distributed. 67* LLD_A (local) DESCA( LLD_ ) The leading dimension of the local 68* array. LLD_A >= MAX(1,LOCr(M_A)). 69* 70* Let K be the number of rows or columns of a distributed matrix, 71* and assume that its process grid has dimension p x q. 72* LOCr( K ) denotes the number of elements of K that a process 73* would receive if K were distributed over the p processes of its 74* process column. 75* Similarly, LOCc( K ) denotes the number of elements of K that a 76* process would receive if K were distributed over the q processes of 77* its process row. 78* The values of LOCr() and LOCc() may be determined via a call to the 79* ScaLAPACK tool function, NUMROC: 80* LOCr( M ) = NUMROC( M, MB_A, MYROW, RSRC_A, NPROW ), 81* LOCc( N ) = NUMROC( N, NB_A, MYCOL, CSRC_A, NPCOL ). 82* An upper bound for these quantities may be computed by: 83* LOCr( M ) <= ceil( ceil(M/MB_A)/NPROW )*MB_A 84* LOCc( N ) <= ceil( ceil(N/NB_A)/NPCOL )*NB_A 85* 86* In the following comments, sub( A ), sub( X ) and sub( B ) denote 87* respectively A(IA:IA+N-1,JA:JA+N-1), X(IX:IX+N-1,JX:JX+NRHS-1) and 88* B(IB:IB+N-1,JB:JB+NRHS-1). 89* 90* Arguments 91* ========= 92* 93* UPLO (global input) CHARACTER*1 94* = 'U': sub( A ) is upper triangular; 95* = 'L': sub( A ) is lower triangular. 96* 97* TRANS (global input) CHARACTER*1 98* Specifies the form of the system of equations. 99* = 'N': sub( A ) * sub( X ) = sub( B ) (No transpose) 100* = 'T': sub( A )**T * sub( X ) = sub( B ) (Transpose) 101* = 'C': sub( A )**H * sub( X ) = sub( B ) 102* (Conjugate transpose) 103* 104* DIAG (global input) CHARACTER*1 105* = 'N': sub( A ) is non-unit triangular; 106* = 'U': sub( A ) is unit triangular. 107* 108* N (global input) INTEGER 109* The order of the matrix sub( A ). N >= 0. 110* 111* NRHS (global input) INTEGER 112* The number of right hand sides, i.e., the number of columns 113* of the matrices sub( B ) and sub( X ). NRHS >= 0. 114* 115* A (local input) COMPLEX pointer into the local memory 116* to an array of local dimension (LLD_A,LOCc(JA+N-1) ). This 117* array contains the local pieces of the original triangular 118* distributed matrix sub( A ). 119* If UPLO = 'U', the leading N-by-N upper triangular part of 120* sub( A ) contains the upper triangular part of the matrix, 121* and its strictly lower triangular part is not referenced. 122* If UPLO = 'L', the leading N-by-N lower triangular part of 123* sub( A ) contains the lower triangular part of the distribu- 124* ted matrix, and its strictly upper triangular part is not 125* referenced. 126* If DIAG = 'U', the diagonal elements of sub( A ) are also 127* not referenced and are assumed to be 1. 128* 129* IA (global input) INTEGER 130* The row index in the global array A indicating the first 131* row of sub( A ). 132* 133* JA (global input) INTEGER 134* The column index in the global array A indicating the 135* first column of sub( A ). 136* 137* DESCA (global and local input) INTEGER array of dimension DLEN_. 138* The array descriptor for the distributed matrix A. 139* 140* B (local input) COMPLEX pointer into the local memory 141* to an array of local dimension (LLD_B, LOCc(JB+NRHS-1) ). 142* On entry, this array contains the the local pieces of the 143* right hand sides sub( B ). 144* 145* IB (global input) INTEGER 146* The row index in the global array B indicating the first 147* row of sub( B ). 148* 149* JB (global input) INTEGER 150* The column index in the global array B indicating the 151* first column of sub( B ). 152* 153* DESCB (global and local input) INTEGER array of dimension DLEN_. 154* The array descriptor for the distributed matrix B. 155* 156* X (local input) COMPLEX pointer into the local memory 157* to an array of local dimension (LLD_X, LOCc(JX+NRHS-1) ). 158* On entry, this array contains the the local pieces of the 159* solution vectors sub( X ). 160* 161* IX (global input) INTEGER 162* The row index in the global array X indicating the first 163* row of sub( X ). 164* 165* JX (global input) INTEGER 166* The column index in the global array X indicating the 167* first column of sub( X ). 168* 169* DESCX (global and local input) INTEGER array of dimension DLEN_. 170* The array descriptor for the distributed matrix X. 171* 172* FERR (local output) REAL array of local dimension 173* LOCc(JB+NRHS-1). The estimated forward error bounds for 174* each solution vector of sub( X ). If XTRUE is the true 175* solution, FERR bounds the magnitude of the largest entry 176* in (sub( X ) - XTRUE) divided by the magnitude of the 177* largest entry in sub( X ). The estimate is as reliable as 178* the estimate for RCOND, and is almost always a slight 179* overestimate of the true error. 180* This array is tied to the distributed matrix X. 181* 182* BERR (local output) REAL array of local dimension 183* LOCc(JB+NRHS-1). The componentwise relative backward 184* error of each solution vector (i.e., the smallest re- 185* lative change in any entry of sub( A ) or sub( B ) 186* that makes sub( X ) an exact solution). 187* This array is tied to the distributed matrix X. 188* 189* WORK (local workspace/local output) COMPLEX array, 190* dimension (LWORK) 191* On exit, WORK(1) returns the minimal and optimal LWORK. 192* 193* LWORK (local or global input) INTEGER 194* The dimension of the array WORK. 195* LWORK is local input and must be at least 196* LWORK >= 2*LOCr( N + MOD( IA-1, MB_A ) ). 197* 198* If LWORK = -1, then LWORK is global input and a workspace 199* query is assumed; the routine only calculates the minimum 200* and optimal size for all work arrays. Each of these 201* values is returned in the first entry of the corresponding 202* work array, and no error message is issued by PXERBLA. 203* 204* RWORK (local workspace/local output) REAL array, 205* dimension (LRWORK) 206* On exit, RWORK(1) returns the minimal and optimal LRWORK. 207* 208* LRWORK (local or global input) INTEGER 209* The dimension of the array RWORK. 210* LRWORK is local input and must be at least 211* LRWORK >= LOCr( N + MOD( IB-1, MB_B ) ). 212* 213* If LRWORK = -1, then LRWORK is global input and a workspace 214* query is assumed; the routine only calculates the minimum 215* and optimal size for all work arrays. Each of these 216* values is returned in the first entry of the corresponding 217* work array, and no error message is issued by PXERBLA. 218* 219* 220* INFO (global output) INTEGER 221* = 0: successful exit 222* < 0: If the i-th argument is an array and the j-entry had 223* an illegal value, then INFO = -(i*100+j), if the i-th 224* argument is a scalar and had an illegal value, then 225* INFO = -i. 226* 227* Notes 228* ===== 229* 230* This routine temporarily returns when N <= 1. 231* 232* The distributed submatrices sub( X ) and sub( B ) should be 233* distributed the same way on the same processes. These conditions 234* ensure that sub( X ) and sub( B ) are "perfectly" aligned. 235* 236* Moreover, this routine requires the distributed submatrices sub( A ), 237* sub( X ), and sub( B ) to be aligned on a block boundary, 238* i.e., if f(x,y) = MOD( x-1, y ): 239* f( IA, DESCA( MB_ ) ) = f( JA, DESCA( NB_ ) ) = 0, 240* f( IB, DESCB( MB_ ) ) = f( JB, DESCB( NB_ ) ) = 0, and 241* f( IX, DESCX( MB_ ) ) = f( JX, DESCX( NB_ ) ) = 0. 242* 243* ===================================================================== 244* 245* .. Parameters .. 246 INTEGER BLOCK_CYCLIC_2D, CSRC_, CTXT_, DLEN_, DTYPE_, 247 $ LLD_, MB_, M_, NB_, N_, RSRC_ 248 PARAMETER ( BLOCK_CYCLIC_2D = 1, DLEN_ = 9, DTYPE_ = 1, 249 $ CTXT_ = 2, M_ = 3, N_ = 4, MB_ = 5, NB_ = 6, 250 $ RSRC_ = 7, CSRC_ = 8, LLD_ = 9 ) 251 REAL ZERO, RONE 252 PARAMETER ( ZERO = 0.0E+0, RONE = 1.0E+0 ) 253 COMPLEX ONE 254 PARAMETER ( ONE = ( 1.0E+0, 0.0E+0 ) ) 255* .. 256* .. Local Scalars .. 257 LOGICAL LQUERY, NOTRAN, NOUNIT, UPPER 258 CHARACTER TRANSN, TRANST 259 INTEGER IAROW, IXBCOL, IXBROW, IXCOL, IXROW, ICOFFA, 260 $ ICOFFB, ICOFFX, ICTXT, ICURCOL, IDUM, II, IIXB, 261 $ IIW, IOFFXB, IPB, IPR, IPV, IROFFA, IROFFB, 262 $ IROFFX, IW, J, JBRHS, JJ, JJFBE, JJXB, JN, JW, 263 $ K, KASE, LDXB, LRWMIN, LWMIN, MYCOL, MYRHS, 264 $ MYROW, NP, NP0, NPCOL, NPMOD, NPROW, NZ 265 REAL EPS, EST, LSTRES, S, SAFE1, SAFE2, SAFMIN 266 COMPLEX ZDUM 267* .. 268* .. Local Arrays .. 269 INTEGER DESCW( DLEN_ ), IDUM1( 5 ), IDUM2( 5 ) 270* .. 271* .. External Functions .. 272 LOGICAL LSAME 273 INTEGER ICEIL, INDXG2P, NUMROC 274 REAL PSLAMCH 275 EXTERNAL ICEIL, INDXG2P, LSAME, NUMROC, PSLAMCH 276* .. 277* .. External Subroutines .. 278 EXTERNAL BLACS_GRIDINFO, CGEBR2D, CGEBS2D, CHK1MAT, 279 $ DESCSET, INFOG2L, PCATRMV, PCAXPY, PCHK1MAT, 280 $ PCHK2MAT, PCCOPY, PCLACON, PCTRMV, 281 $ PCTRSV, PXERBLA, SGAMX2D 282* .. 283* .. Intrinsic Functions .. 284 INTRINSIC ABS, AIMAG, CMPLX, ICHAR, MAX, MIN, MOD, REAL 285* .. 286* .. Statement Functions .. 287 REAL CABS1 288* .. 289* .. Statement Function definitions .. 290 CABS1( ZDUM ) = ABS( REAL( ZDUM ) ) + ABS( AIMAG( ZDUM ) ) 291* .. 292* .. Executable Statements .. 293* 294* Get grid parameters 295* 296 ICTXT = DESCA( CTXT_ ) 297 CALL BLACS_GRIDINFO( ICTXT, NPROW, NPCOL, MYROW, MYCOL ) 298* 299* Test the input parameters. 300* 301 INFO = 0 302 IF( NPROW.EQ.-1 ) THEN 303 INFO = -( 900+CTXT_ ) 304 ELSE 305 CALL CHK1MAT( N, 4, N, 4, IA, JA, DESCA, 9, INFO ) 306 CALL CHK1MAT( N, 4, NRHS, 5, IB, JB, DESCB, 13, INFO ) 307 CALL CHK1MAT( N, 4, NRHS, 5, IX, JX, DESCX, 17, INFO ) 308 IF( INFO.EQ.0 ) THEN 309 UPPER = LSAME( UPLO, 'U' ) 310 NOTRAN = LSAME( TRANS, 'N' ) 311 NOUNIT = LSAME( DIAG, 'N' ) 312 IROFFA = MOD( IA-1, DESCA( MB_ ) ) 313 ICOFFA = MOD( JA-1, DESCA( NB_ ) ) 314 IROFFB = MOD( IB-1, DESCB( MB_ ) ) 315 ICOFFB = MOD( JB-1, DESCB( NB_ ) ) 316 IROFFX = MOD( IX-1, DESCX( MB_ ) ) 317 ICOFFX = MOD( JX-1, DESCX( NB_ ) ) 318 IAROW = INDXG2P( IA, DESCA( MB_ ), MYROW, DESCA( RSRC_ ), 319 $ NPROW ) 320 CALL INFOG2L( IB, JB, DESCB, NPROW, NPCOL, MYROW, MYCOL, 321 $ IIXB, JJXB, IXBROW, IXBCOL ) 322 IXROW = INDXG2P( IX, DESCX( MB_ ), MYROW, DESCX( RSRC_ ), 323 $ NPROW ) 324 IXCOL = INDXG2P( JX, DESCX( NB_ ), MYCOL, DESCX( CSRC_ ), 325 $ NPCOL ) 326 NPMOD = NUMROC( N+IROFFA, DESCA( MB_ ), MYROW, IAROW, 327 $ NPROW ) 328 LWMIN = 2*NPMOD 329 WORK( 1 ) = REAL( LWMIN ) 330 LRWMIN = NPMOD 331 RWORK( 1 ) = REAL( LRWMIN ) 332 LQUERY = ( LWORK.EQ.-1 .OR. LRWORK.EQ.-1 ) 333* 334 IF( .NOT.UPPER .AND. .NOT.LSAME( UPLO, 'L' ) ) THEN 335 INFO = -1 336 ELSE IF( .NOT.NOTRAN .AND. .NOT.LSAME( TRANS, 'T' ) .AND. 337 $ .NOT.LSAME( TRANS, 'C' ) ) THEN 338 INFO = -2 339 ELSE IF( .NOT.NOUNIT .AND. .NOT.LSAME( DIAG, 'U' ) ) THEN 340 INFO = -3 341 ELSE IF( N.LT.0 ) THEN 342 INFO = -4 343 ELSE IF( NRHS.LT.0 ) THEN 344 INFO = -5 345 ELSE IF( IROFFA.NE.0 ) THEN 346 INFO = -7 347 ELSE IF( ICOFFA.NE.0 ) THEN 348 INFO = -8 349 ELSE IF( DESCA( MB_ ).NE.DESCA( NB_ ) ) THEN 350 INFO = -( 900+NB_ ) 351 ELSE IF( IROFFA.NE.IROFFB .OR. IAROW.NE.IXBROW ) THEN 352 INFO = -11 353 ELSE IF( DESCA( MB_ ).NE.DESCB( MB_ ) ) THEN 354 INFO = -( 1300+MB_ ) 355 ELSE IF( ICTXT.NE.DESCB( CTXT_ ) ) THEN 356 INFO = -( 1300+CTXT_ ) 357 ELSE IF( IROFFX.NE.0 .OR. IXBROW.NE.IXROW ) THEN 358 INFO = -15 359 ELSE IF( ICOFFB.NE.ICOFFX .OR. IXBCOL.NE.IXCOL ) THEN 360 INFO = -16 361 ELSE IF( DESCB( MB_ ).NE.DESCX( MB_ ) ) THEN 362 INFO = -( 1700+MB_ ) 363 ELSE IF( DESCB( NB_ ).NE.DESCX( NB_ ) ) THEN 364 INFO = -( 1700+NB_ ) 365 ELSE IF( ICTXT.NE.DESCX( CTXT_ ) ) THEN 366 INFO = -( 1700+CTXT_ ) 367 ELSE IF( LWORK.LT.LWMIN .AND. .NOT.LQUERY ) THEN 368 INFO = -21 369 ELSE IF( LRWORK.LT.LRWMIN .AND. .NOT.LQUERY ) THEN 370 INFO = -23 371 END IF 372 END IF 373* 374 IF( UPPER ) THEN 375 IDUM1( 1 ) = ICHAR( 'U' ) 376 ELSE 377 IDUM1( 1 ) = ICHAR( 'L' ) 378 END IF 379 IDUM2( 1 ) = 1 380 IF( NOTRAN ) THEN 381 IDUM1( 2 ) = ICHAR( 'N' ) 382 ELSE IF( LSAME( TRANS, 'T' ) ) THEN 383 IDUM1( 2 ) = ICHAR( 'T' ) 384 ELSE 385 IDUM1( 2 ) = ICHAR( 'C' ) 386 END IF 387 IDUM2( 2 ) = 2 388 IF( NOUNIT ) THEN 389 IDUM1( 3 ) = ICHAR( 'N' ) 390 ELSE 391 IDUM1( 3 ) = ICHAR( 'U' ) 392 END IF 393 IDUM2( 3 ) = 3 394 IF( LWORK.EQ.-1 ) THEN 395 IDUM1( 4 ) = -1 396 ELSE 397 IDUM1( 4 ) = 1 398 END IF 399 IDUM2( 4 ) = 21 400 IF( LRWORK.EQ.-1 ) THEN 401 IDUM1( 5 ) = -1 402 ELSE 403 IDUM1( 5 ) = 1 404 END IF 405 IDUM2( 5 ) = 23 406 CALL PCHK1MAT( N, 4, N, 4, IA, JA, DESCA, 9, 0, IDUM1, IDUM2, 407 $ INFO ) 408 CALL PCHK2MAT( N, 4, NRHS, 5, IB, JB, DESCB, 13, N, 4, NRHS, 5, 409 $ IX, JX, DESCX, 17, 5, IDUM1, IDUM2, INFO ) 410 END IF 411 IF( INFO.NE.0 ) THEN 412 CALL PXERBLA( ICTXT, 'PCTRRFS', -INFO ) 413 RETURN 414 ELSE IF( LQUERY ) THEN 415 RETURN 416 END IF 417* 418 JJFBE = JJXB 419 MYRHS = NUMROC( JB+NRHS-1, DESCB( NB_ ), MYCOL, DESCB( CSRC_ ), 420 $ NPCOL ) 421* 422* Quick return if possible 423* 424 IF( N.LE.1 .OR. NRHS.EQ.0 ) THEN 425 DO 10 JJ = JJFBE, MYRHS 426 FERR( JJ ) = ZERO 427 BERR( JJ ) = ZERO 428 10 CONTINUE 429 RETURN 430 END IF 431* 432 IF( NOTRAN ) THEN 433 TRANSN = 'N' 434 TRANST = 'C' 435 ELSE 436 TRANSN = 'C' 437 TRANST = 'N' 438 END IF 439* 440 NP0 = NUMROC( N+IROFFB, DESCB( MB_ ), MYROW, IXBROW, NPROW ) 441 CALL DESCSET( DESCW, N+IROFFB, 1, DESCA( MB_ ), 1, IXBROW, IXBCOL, 442 $ ICTXT, MAX( 1, NP0 ) ) 443 IPB = 1 444 IPR = 1 445 IPV = IPR + NP0 446 IF( MYROW.EQ.IXBROW ) THEN 447 IIW = 1 + IROFFB 448 NP = NP0 - IROFFB 449 ELSE 450 IIW = 1 451 NP = NP0 452 END IF 453 IW = 1 + IROFFB 454 JW = 1 455 LDXB = DESCB( LLD_ ) 456 IOFFXB = ( JJXB-1 )*LDXB 457* 458* NZ = maximum number of nonzero entries in each row of A, plus 1 459* 460 NZ = N + 1 461 EPS = PSLAMCH( ICTXT, 'Epsilon' ) 462 SAFMIN = PSLAMCH( ICTXT, 'Safe minimum' ) 463 SAFE1 = NZ*SAFMIN 464 SAFE2 = SAFE1 / EPS 465 JN = MIN( ICEIL( JB, DESCB( NB_ ) )*DESCB( NB_ ), JB+NRHS-1 ) 466* 467* Handle first block separately 468* 469 JBRHS = JN - JB + 1 470 DO 90 K = 0, JBRHS - 1 471* 472* Compute residual R = B - op(A) * X, 473* where op(A) = A or A', depending on TRANS. 474* 475 CALL PCCOPY( N, X, IX, JX+K, DESCX, 1, WORK( IPR ), IW, JW, 476 $ DESCW, 1 ) 477 CALL PCTRMV( UPLO, TRANS, DIAG, N, A, IA, JA, DESCA, 478 $ WORK( IPR ), IW, JW, DESCW, 1 ) 479 CALL PCAXPY( N, -ONE, B, IB, JB+K, DESCB, 1, WORK( IPR ), IW, 480 $ JW, DESCW, 1 ) 481* 482* Compute componentwise relative backward error from formula 483* 484* max(i) ( abs(R(i)) / ( abs(op(A))*abs(X) + abs(B) )(i) ) 485* 486* where abs(Z) is the componentwise absolute value of the matrix 487* or vector Z. If the i-th component of the denominator is less 488* than SAFE2, then SAFE1 is added to the i-th components of the 489* numerator and denominator before dividing. 490* 491 IF( MYCOL.EQ.IXBCOL ) THEN 492 IF( NP.GT.0 ) THEN 493 DO 20 II = IIXB, IIXB + NP - 1 494 RWORK( IIW+II-IIXB ) = CABS1( B( II+IOFFXB ) ) 495 20 CONTINUE 496 END IF 497 END IF 498* 499 CALL PCATRMV( UPLO, TRANS, DIAG, N, RONE, A, IA, JA, DESCA, X, 500 $ IX, JX+K, DESCX, 1, RONE, RWORK( IPB ), IW, JW, 501 $ DESCW, 1 ) 502* 503 S = ZERO 504 IF( MYCOL.EQ.IXBCOL ) THEN 505 IF( NP.GT.0 ) THEN 506 DO 30 II = IIW - 1, IIW + NP - 2 507 IF( RWORK( IPB+II ).GT.SAFE2 ) THEN 508 S = MAX( S, CABS1( WORK( IPR+II ) ) / 509 $ RWORK( IPB+II ) ) 510 ELSE 511 S = MAX( S, ( CABS1( WORK( IPR+II ) )+SAFE1 ) / 512 $ ( RWORK( IPB+II )+SAFE1 ) ) 513 END IF 514 30 CONTINUE 515 END IF 516 END IF 517* 518 CALL SGAMX2D( ICTXT, 'All', ' ', 1, 1, S, 1, IDUM, IDUM, 1, 519 $ -1, MYCOL ) 520 IF( MYCOL.EQ.IXBCOL ) 521 $ BERR( JJFBE ) = S 522* 523* Bound error from formula 524* 525* norm(X - XTRUE) / norm(X) .le. FERR = 526* norm( abs(inv(op(A)))* 527* ( abs(R) + NZ*EPS*( abs(op(A))*abs(X)+abs(B) ))) / norm(X) 528* 529* where 530* norm(Z) is the magnitude of the largest component of Z 531* inv(op(A)) is the inverse of op(A) 532* abs(Z) is the componentwise absolute value of the matrix or 533* vector Z 534* NZ is the maximum number of nonzeros in any row of A, plus 1 535* EPS is machine epsilon 536* 537* The i-th component of abs(R)+NZ*EPS*(abs(op(A))*abs(X)+abs(B)) 538* is incremented by SAFE1 if the i-th component of 539* abs(op(A))*abs(X) + abs(B) is less than SAFE2. 540* 541* Use PCLACON to estimate the infinity-norm of the matrix 542* inv(op(A)) * diag(W), 543* where W = abs(R) + NZ*EPS*( abs(op(A))*abs(X)+abs(B) ))) 544* 545 IF( MYCOL.EQ.IXBCOL ) THEN 546 IF( NP.GT.0 ) THEN 547 DO 40 II = IIW - 1, IIW + NP - 2 548 IF( RWORK( IPB+II ).GT.SAFE2 ) THEN 549 RWORK( IPB+II ) = CABS1( WORK( IPR+II ) ) + 550 $ NZ*EPS*RWORK( IPB+II ) 551 ELSE 552 RWORK( IPB+II ) = CABS1( WORK( IPR+II ) ) + 553 $ NZ*EPS*RWORK( IPB+II ) + SAFE1 554 END IF 555 40 CONTINUE 556 END IF 557 END IF 558* 559 KASE = 0 560 50 CONTINUE 561 IF( MYCOL.EQ.IXBCOL ) THEN 562 CALL CGEBS2D( ICTXT, 'Rowwise', ' ', NP, 1, WORK( IPR ), 563 $ DESCW( LLD_ ) ) 564 ELSE 565 CALL CGEBR2D( ICTXT, 'Rowwise', ' ', NP, 1, WORK( IPR ), 566 $ DESCW( LLD_ ), MYROW, IXBCOL ) 567 END IF 568 DESCW( CSRC_ ) = MYCOL 569 CALL PCLACON( N, WORK( IPV ), IW, JW, DESCW, WORK( IPR ), 570 $ IW, JW, DESCW, EST, KASE ) 571 DESCW( CSRC_ ) = IXBCOL 572* 573 IF( KASE.NE.0 ) THEN 574 IF( KASE.EQ.1 ) THEN 575* 576* Multiply by diag(W)*inv(op(A)'). 577* 578 CALL PCTRSV( UPLO, TRANST, DIAG, N, A, IA, JA, DESCA, 579 $ WORK( IPR ), IW, JW, DESCW, 1 ) 580 IF( MYCOL.EQ.IXBCOL ) THEN 581 IF( NP.GT.0 ) THEN 582 DO 60 II = IIW - 1, IIW + NP - 2 583 WORK( IPR+II ) = RWORK( IPB+II )*WORK( IPR+II ) 584 60 CONTINUE 585 END IF 586 END IF 587 ELSE 588* 589* Multiply by inv(op(A))*diag(W). 590* 591 IF( MYCOL.EQ.IXBCOL ) THEN 592 IF( NP.GT.0 ) THEN 593 DO 70 II = IIW - 1, IIW + NP - 2 594 WORK( IPR+II ) = RWORK( IPB+II )*WORK( IPR+II ) 595 70 CONTINUE 596 END IF 597 END IF 598 CALL PCTRSV( UPLO, TRANSN, DIAG, N, A, IA, JA, DESCA, 599 $ WORK( IPR ), IW, JW, DESCW, 1 ) 600 END IF 601 GO TO 50 602 END IF 603* 604* Normalize error. 605* 606 LSTRES = ZERO 607 IF( MYCOL.EQ.IXBCOL ) THEN 608 IF( NP.GT.0 ) THEN 609 DO 80 II = IIXB, IIXB + NP - 1 610 LSTRES = MAX( LSTRES, CABS1( X( IOFFXB+II ) ) ) 611 80 CONTINUE 612 END IF 613 CALL SGAMX2D( ICTXT, 'Column', ' ', 1, 1, LSTRES, 1, IDUM, 614 $ IDUM, 1, -1, MYCOL ) 615 IF( LSTRES.NE.ZERO ) 616 $ FERR( JJFBE ) = EST / LSTRES 617* 618 JJXB = JJXB + 1 619 JJFBE = JJFBE + 1 620 IOFFXB = IOFFXB + LDXB 621* 622 END IF 623* 624 90 CONTINUE 625* 626 ICURCOL = MOD( IXBCOL+1, NPCOL ) 627* 628* Do for each right hand side 629* 630 DO 180 J = JN + 1, JB + NRHS - 1, DESCB( NB_ ) 631 JBRHS = MIN( JB+NRHS-J, DESCB( NB_ ) ) 632 DESCW( CSRC_ ) = ICURCOL 633* 634 DO 170 K = 0, JBRHS - 1 635* 636* Compute residual R = B - op(A) * X, 637* where op(A) = A or A', depending on TRANS. 638* 639 CALL PCCOPY( N, X, IX, J+K, DESCX, 1, WORK( IPR ), IW, JW, 640 $ DESCW, 1 ) 641 CALL PCTRMV( UPLO, TRANS, DIAG, N, A, IA, JA, DESCA, 642 $ WORK( IPR ), IW, JW, DESCW, 1 ) 643 CALL PCAXPY( N, -ONE, B, IB, J+K, DESCB, 1, WORK( IPR ), 644 $ IW, JW, DESCW, 1 ) 645* 646* Compute componentwise relative backward error from formula 647* 648* max(i) ( abs(R(i)) / ( abs(op(A))*abs(X) + abs(B) )(i) ) 649* 650* where abs(Z) is the componentwise absolute value of the 651* matrix or vector Z. If the i-th component of the 652* denominator is less than SAFE2, then SAFE1 is added to the 653* i-th components of the numerator and denominator before 654* dividing. 655* 656 IF( MYCOL.EQ.IXBCOL ) THEN 657 IF( NP.GT.0 ) THEN 658 DO 100 II = IIXB, IIXB + NP - 1 659 RWORK( IIW+II-IIXB ) = CABS1( B( II+IOFFXB ) ) 660 100 CONTINUE 661 END IF 662 END IF 663* 664 CALL PCATRMV( UPLO, TRANS, DIAG, N, RONE, A, IA, JA, DESCA, 665 $ X, IX, J+K, DESCX, 1, RONE, RWORK( IPB ), IW, 666 $ JW, DESCW, 1 ) 667* 668 S = ZERO 669 IF( MYCOL.EQ.IXBCOL ) THEN 670 IF( NP.GT.0 ) THEN 671 DO 110 II = IIW - 1, IIW + NP - 2 672 IF( RWORK( IPB+II ).GT.SAFE2 ) THEN 673 S = MAX( S, CABS1( WORK( IPR+II ) ) / 674 $ RWORK( IPB+II ) ) 675 ELSE 676 S = MAX( S, ( CABS1( WORK( IPR+II ) )+SAFE1 ) / 677 $ ( RWORK( IPB+II )+SAFE1 ) ) 678 END IF 679 110 CONTINUE 680 END IF 681 END IF 682* 683 CALL SGAMX2D( ICTXT, 'All', ' ', 1, 1, S, 1, IDUM, IDUM, 1, 684 $ -1, MYCOL ) 685 IF( MYCOL.EQ.IXBCOL ) 686 $ BERR( JJFBE ) = S 687* 688* Bound error from formula 689* 690* norm(X - XTRUE) / norm(X) .le. FERR = 691* norm( abs(inv(op(A)))* 692* ( abs(R) + NZ*EPS*( abs(op(A))*abs(X)+abs(B) )))/norm(X) 693* 694* where 695* norm(Z) is the magnitude of the largest component of Z 696* inv(op(A)) is the inverse of op(A) 697* abs(Z) is the componentwise absolute value of the matrix 698* or vector Z 699* NZ is the maximum number of nonzeros in any row of A, 700* plus 1 701* EPS is machine epsilon 702* 703* The i-th component of 704* abs(R)+NZ*EPS*(abs(op(A))*abs(X)+abs(B)) 705* is incremented by SAFE1 if the i-th component of 706* abs(op(A))*abs(X) + abs(B) is less than SAFE2. 707* 708* Use PCLACON to estimate the infinity-norm of the matrix 709* inv(op(A)) * diag(W), 710* where W = abs(R) + NZ*EPS*( abs(op(A))*abs(X)+abs(B) ))) 711* 712 IF( MYCOL.EQ.IXBCOL ) THEN 713 IF( NP.GT.0 ) THEN 714 DO 120 II = IIW - 1, IIW + NP - 2 715 IF( RWORK( IPB+II ).GT.SAFE2 ) THEN 716 RWORK( IPB+II ) = CABS1( WORK( IPR+II ) ) + 717 $ NZ*EPS*RWORK( IPB+II ) 718 ELSE 719 RWORK( IPB+II ) = CABS1( WORK( IPR+II ) ) + 720 $ NZ*EPS*RWORK( IPB+II ) + SAFE1 721 END IF 722 120 CONTINUE 723 END IF 724 END IF 725* 726 KASE = 0 727 130 CONTINUE 728 IF( MYCOL.EQ.IXBCOL ) THEN 729 CALL CGEBS2D( ICTXT, 'Rowwise', ' ', NP, 1, WORK( IPR ), 730 $ DESCW( LLD_ ) ) 731 ELSE 732 CALL CGEBR2D( ICTXT, 'Rowwise', ' ', NP, 1, WORK( IPR ), 733 $ DESCW( LLD_ ), MYROW, IXBCOL ) 734 END IF 735 DESCW( CSRC_ ) = MYCOL 736 CALL PCLACON( N, WORK( IPV ), IW, JW, DESCW, WORK( IPR ), 737 $ IW, JW, DESCW, EST, KASE ) 738 DESCW( CSRC_ ) = IXBCOL 739* 740 IF( KASE.NE.0 ) THEN 741 IF( KASE.EQ.1 ) THEN 742* 743* Multiply by diag(W)*inv(op(A)'). 744* 745 CALL PCTRSV( UPLO, TRANST, DIAG, N, A, IA, JA, DESCA, 746 $ WORK( IPR ), IW, JW, DESCW, 1 ) 747 IF( MYCOL.EQ.IXBCOL ) THEN 748 IF( NP.GT.0 ) THEN 749 DO 140 II = IIW - 1, IIW + NP - 2 750 WORK( IPR+II ) = RWORK( IPB+II )* 751 $ WORK( IPR+II ) 752 140 CONTINUE 753 END IF 754 END IF 755 ELSE 756* 757* Multiply by inv(op(A))*diag(W). 758* 759 IF( MYCOL.EQ.IXBCOL ) THEN 760 IF( NP.GT.0 ) THEN 761 DO 150 II = IIW - 1, IIW + NP - 2 762 WORK( IPR+II ) = RWORK( IPB+II )* 763 $ WORK( IPR+II ) 764 150 CONTINUE 765 END IF 766 END IF 767 CALL PCTRSV( UPLO, TRANSN, DIAG, N, A, IA, JA, DESCA, 768 $ WORK( IPR ), IW, JW, DESCW, 1 ) 769 END IF 770 GO TO 130 771 END IF 772* 773* Normalize error. 774* 775 LSTRES = ZERO 776 IF( MYCOL.EQ.IXBCOL ) THEN 777 IF( NP.GT.0 ) THEN 778 DO 160 II = IIXB, IIXB + NP - 1 779 LSTRES = MAX( LSTRES, CABS1( X( IOFFXB+II ) ) ) 780 160 CONTINUE 781 END IF 782 CALL SGAMX2D( ICTXT, 'Column', ' ', 1, 1, LSTRES, 1, 783 $ IDUM, IDUM, 1, -1, MYCOL ) 784 IF( LSTRES.NE.ZERO ) 785 $ FERR( JJFBE ) = EST / LSTRES 786* 787 JJXB = JJXB + 1 788 JJFBE = JJFBE + 1 789 IOFFXB = IOFFXB + LDXB 790* 791 END IF 792* 793 170 CONTINUE 794* 795 ICURCOL = MOD( ICURCOL+1, NPCOL ) 796* 797 180 CONTINUE 798* 799 WORK( 1 ) = CMPLX( REAL( LWMIN ) ) 800 RWORK( 1 ) = REAL( LRWMIN ) 801* 802 RETURN 803* 804* End of PCTRRFS 805* 806 END 807