1 SUBROUTINE PDINVCHK( MATTYP, N, A, IA, JA, DESCA, IASEED, ANORM, 2 $ FRESID, RCOND, WORK ) 3* 4* -- ScaLAPACK routine (version 1.7) -- 5* University of Tennessee, Knoxville, Oak Ridge National Laboratory, 6* and University of California, Berkeley. 7* May 28, 2001 8* 9* .. Scalar Arguments .. 10 INTEGER IA, IASEED, JA, N 11 DOUBLE PRECISION ANORM, FRESID, RCOND 12* .. 13* .. Array Arguments .. 14 CHARACTER*3 MATTYP 15 INTEGER DESCA( * ) 16 DOUBLE PRECISION A( * ), WORK( * ) 17* .. 18* 19* Purpose 20* ======= 21* 22* PDINVCHK computes the scaled residual 23* 24* || sub( A ) * inv( sub( A ) ) - I || / ( || sub( A ) || * N * eps ), 25* 26* where sub( A ) denotes A(IA:IA+N-1,JA:JA+N-1). to check the result 27* returned by the matrix inversion routines. 28* 29* Notes 30* ===== 31* 32* Each global data object is described by an associated description 33* vector. This vector stores the information required to establish 34* the mapping between an object element and its corresponding process 35* and memory location. 36* 37* Let A be a generic term for any 2D block cyclicly distributed array. 38* Such a global array has an associated description vector DESCA. 39* In the following comments, the character _ should be read as 40* "of the global array". 41* 42* NOTATION STORED IN EXPLANATION 43* --------------- -------------- -------------------------------------- 44* DTYPE_A(global) DESCA( DTYPE_ )The descriptor type. In this case, 45* DTYPE_A = 1. 46* CTXT_A (global) DESCA( CTXT_ ) The BLACS context handle, indicating 47* the BLACS process grid A is distribu- 48* ted over. The context itself is glo- 49* bal, but the handle (the integer 50* value) may vary. 51* M_A (global) DESCA( M_ ) The number of rows in the global 52* array A. 53* N_A (global) DESCA( N_ ) The number of columns in the global 54* array A. 55* MB_A (global) DESCA( MB_ ) The blocking factor used to distribute 56* the rows of the array. 57* NB_A (global) DESCA( NB_ ) The blocking factor used to distribute 58* the columns of the array. 59* RSRC_A (global) DESCA( RSRC_ ) The process row over which the first 60* row of the array A is distributed. 61* CSRC_A (global) DESCA( CSRC_ ) The process column over which the 62* first column of the array A is 63* distributed. 64* LLD_A (local) DESCA( LLD_ ) The leading dimension of the local 65* array. LLD_A >= MAX(1,LOCr(M_A)). 66* 67* Let K be the number of rows or columns of a distributed matrix, 68* and assume that its process grid has dimension p x q. 69* LOCr( K ) denotes the number of elements of K that a process 70* would receive if K were distributed over the p processes of its 71* process column. 72* Similarly, LOCc( K ) denotes the number of elements of K that a 73* process would receive if K were distributed over the q processes of 74* its process row. 75* The values of LOCr() and LOCc() may be determined via a call to the 76* ScaLAPACK tool function, NUMROC: 77* LOCr( M ) = NUMROC( M, MB_A, MYROW, RSRC_A, NPROW ), 78* LOCc( N ) = NUMROC( N, NB_A, MYCOL, CSRC_A, NPCOL ). 79* An upper bound for these quantities may be computed by: 80* LOCr( M ) <= ceil( ceil(M/MB_A)/NPROW )*MB_A 81* LOCc( N ) <= ceil( ceil(N/NB_A)/NPCOL )*NB_A 82* 83* Arguments 84* ========= 85* 86* MATTYP (global input) CHARACTER*3 87* The type of the distributed matrix to be generated: 88* if MATTYP = 'GEN' then GENeral matrix, 89* if MATTYP = 'UTR' then Upper TRiangular matrix, 90* if MATTYP = 'LTR' then Lower TRiangular matrix, 91* if MATTYP = 'UPD' then (Upper) symmetric Positive Definite, 92* if MATTYP = 'LPD' then (Lower) symmetric Positive Definite, 93* 94* N (global input) INTEGER 95* The number of rows and columns to be operated on, i.e. the 96* order of the distributed submatrix sub( A ). N >= 0. 97* 98* A (local input) DOUBLE PRECISION pointer into the local memory 99* to an array of local dimension (LLD_A, LOCc(JA+N-1)). On 100* entry, sub( A ) contains the distributed matrix inverse 101* computed by PDGETRI, PDPOTRI or PDTRTRI. 102* 103* IA (global input) INTEGER 104* The row index in the global array A indicating the first 105* row of sub( A ). 106* 107* JA (global input) INTEGER 108* The column index in the global array A indicating the 109* first column of sub( A ). 110* 111* DESCA (global and local input) INTEGER array of dimension DLEN_. 112* The array descriptor for the distributed matrix A. 113* 114* IASEED (global input) INTEGER 115* Seed for the random generation of sub( A ). 116* 117* ANORM (global input) DOUBLE PRECISION 118* The 1-norm of the original matrix sub( A ). 119* 120* FRESID (global output) DOUBLE PRECISION 121* The inversion residual. 122* 123* RCOND (global output) DOUBLE PRECISION 124* The condition number of the original distributed matrix. 125* RCOND = || sub( A ) ||.|| sub( A )^{-1} || where ||A|| 126* denotes the 1-norm of A. 127* 128* WORK (local workspace) DOUBLE PRECISION array, dimension 129* MAX(2*LOCr(N_A+MOD(IA-1,MB_A))*MB_A, LDW) 130* where LDW is the workspace requirement for the norm computa- 131* tions, see PDLANGE, PDLANSY and PDLANTR. 132* 133* ===================================================================== 134* 135* .. Parameters .. 136 INTEGER BLOCK_CYCLIC_2D, CSRC_, CTXT_, DLEN_, DTYPE_, 137 $ LLD_, MB_, M_, NB_, N_, RSRC_ 138 PARAMETER ( BLOCK_CYCLIC_2D = 1, DLEN_ = 9, DTYPE_ = 1, 139 $ CTXT_ = 2, M_ = 3, N_ = 4, MB_ = 5, NB_ = 6, 140 $ RSRC_ = 7, CSRC_ = 8, LLD_ = 9 ) 141 DOUBLE PRECISION ZERO, ONE 142 PARAMETER ( ZERO = 0.0D+0, ONE = 1.0D+0 ) 143* .. 144* .. Local Scalars .. 145 CHARACTER AFORM, DIAG, UPLO 146 INTEGER ICTXT, ICURCOL, ICURROW, II, IIA, IPW, IROFF, 147 $ IW, J, JB, JJA, JN, KK, MYCOL, MYROW, NP, 148 $ NPCOL, NPROW 149 DOUBLE PRECISION AUXNORM, EPS, NRMINVAXA, TEMP 150* .. 151* .. Local Arrays .. 152 INTEGER DESCW( DLEN_ ) 153* .. 154* .. External Subroutines .. 155 EXTERNAL BLACS_GRIDINFO, DESCSET, INFOG2L, PDGEMM, 156 $ PDLASET, PDMATGEN, PDSYMM, PDTRMM 157* .. 158* .. External Functions .. 159 LOGICAL LSAMEN 160 INTEGER ICEIL, NUMROC 161 DOUBLE PRECISION PDLAMCH, PDLANGE, PDLANSY, PDLANTR 162 EXTERNAL ICEIL, LSAMEN, NUMROC, PDLAMCH, PDLANGE, 163 $ PDLANSY, PDLANTR 164* .. 165* .. Intrinsic Functions .. 166 INTRINSIC MAX, MIN, MOD 167* .. 168* .. Executable Statements .. 169* 170 EPS = PDLAMCH( DESCA( CTXT_ ), 'eps' ) 171* 172* Get grid parameters 173* 174 ICTXT = DESCA( CTXT_ ) 175 CALL BLACS_GRIDINFO( ICTXT, NPROW, NPCOL, MYROW, MYCOL ) 176* 177* Compute the condition number 178* 179 IF( LSAMEN( 1, MATTYP( 1:1 ), 'U' ) ) THEN 180 UPLO = 'U' 181 ELSE 182 UPLO = 'L' 183 END IF 184* 185 IF( LSAMEN( 3, MATTYP, 'GEN' ) ) THEN 186* 187 AFORM = 'N' 188 DIAG = 'D' 189 AUXNORM = PDLANGE( '1', N, N, A, IA, JA, DESCA, WORK ) 190* 191 ELSE IF( LSAMEN( 2, MATTYP( 2:3 ), 'TR' ) ) THEN 192* 193 AFORM = 'N' 194 DIAG = 'D' 195 AUXNORM = PDLANTR( '1', UPLO, 'Non unit', N, N, A, IA, JA, 196 $ DESCA, WORK ) 197 ELSE IF( LSAMEN( 2, MATTYP( 2:3 ), 'PD' ) ) THEN 198* 199 AFORM = 'S' 200 DIAG = 'D' 201 AUXNORM = PDLANSY( '1', UPLO, N, A, IA, JA, DESCA, WORK ) 202* 203 END IF 204 RCOND = ANORM*AUXNORM 205* 206* Compute inv(A)*A 207* 208 CALL INFOG2L( IA, JA, DESCA, NPROW, NPCOL, MYROW, MYCOL, IIA, JJA, 209 $ ICURROW, ICURCOL ) 210* 211* Define array descriptor for working array WORK 212* 213 IROFF = MOD( IA-1, DESCA( MB_ ) ) 214 NP = NUMROC( N+IROFF, DESCA( MB_ ), MYROW, ICURROW, NPROW ) 215 CALL DESCSET( DESCW, N+IROFF, DESCA( NB_ ), DESCA( MB_ ), 216 $ DESCA( NB_ ), ICURROW, ICURCOL, DESCA( CTXT_ ), 217 $ MAX( 1, NP ) ) 218 IPW = DESCW( LLD_ ) * DESCW( NB_ ) + 1 219* 220 IF( MYROW.EQ.ICURROW ) THEN 221 II = IROFF + 1 222 NP = NP - IROFF 223 ELSE 224 II = 1 225 END IF 226 JN = MIN( ICEIL( JA, DESCA( NB_ ) ) * DESCA( NB_ ), JA+N-1 ) 227 JB = JN - JA + 1 228* 229* Handle first block separately, regenerate a block of columns of A 230* 231 IW = IROFF + 1 232 IF( MYCOL.EQ.ICURCOL ) THEN 233 IF( LSAMEN( 2, MATTYP( 2:3 ), 'TR' ) ) THEN 234 CALL PDMATGEN( ICTXT, AFORM, DIAG, DESCA( M_ ), DESCA( N_ ), 235 $ DESCW( MB_ ), DESCW( NB_ ), WORK, 236 $ DESCW( LLD_ ), DESCA( RSRC_ ), 237 $ DESCA( CSRC_ ), IASEED, IIA-1, NP, 238 $ JJA-1, JB, MYROW, MYCOL, NPROW, NPCOL ) 239 IF( LSAMEN( 3, MATTYP, 'UTR' ) ) THEN 240 CALL PDLASET( 'Lower', N-1, JB, ZERO, ZERO, WORK, IW+1, 241 $ 1, DESCW ) 242 ELSE 243 CALL PDLASET( 'Upper', JB-1, JB-1, ZERO, ZERO, WORK, IW, 244 $ 2, DESCW ) 245 END IF 246 ELSE 247 CALL PDMATGEN( ICTXT, AFORM, DIAG, DESCA( M_ ), DESCA( N_ ), 248 $ DESCW( MB_ ), DESCW( NB_ ), WORK( IPW ), 249 $ DESCW( LLD_ ), DESCA( RSRC_ ), 250 $ DESCA( CSRC_ ), IASEED, 251 $ IIA-1, NP, JJA-1, JB, MYROW, MYCOL, NPROW, 252 $ NPCOL ) 253 END IF 254 END IF 255* 256* Multiply A^{-1}*A 257* 258 IF( LSAMEN( 3, MATTYP, 'GEN' ) ) THEN 259* 260 CALL PDGEMM( 'No tranpose', 'No transpose', N, JB, N, ONE, A, 261 $ IA, JA, DESCA, WORK( IPW ), IW, 1, DESCW, ZERO, 262 $ WORK, IW, 1, DESCW ) 263* 264 ELSE IF( LSAMEN( 2, MATTYP( 2:3 ), 'TR' ) ) THEN 265* 266 CALL PDTRMM( 'Left', UPLO, 'No tranpose', 'Non unit', N, JB, 267 $ ONE, A, IA, JA, DESCA, WORK, IW, 1, DESCW ) 268* 269 ELSE IF( LSAMEN( 2, MATTYP( 2:3 ), 'PD' ) ) THEN 270* 271 CALL PDSYMM( 'Left', UPLO, N, JB, ONE, A, IA, JA, DESCA, 272 $ WORK( IPW ), IW, 1, DESCW, ZERO, WORK, IW, 1, 273 $ DESCW ) 274* 275 END IF 276* 277* subtract the identity matrix to the diagonal block of these cols 278* 279 IF( MYROW.EQ.ICURROW .AND. MYCOL.EQ.ICURCOL ) THEN 280 DO 10 KK = 0, JB-1 281 WORK( II+KK*(DESCW(LLD_)+1) ) = 282 $ WORK( II+KK*(DESCW( LLD_ )+1) )-ONE 283 10 CONTINUE 284 END IF 285* 286 NRMINVAXA = PDLANGE( '1', N, JB, WORK, IW, 1, DESCW, WORK( IPW ) ) 287* 288 IF( MYROW.EQ.ICURROW ) 289 $ II = II + JB 290 IF( MYCOL.EQ.ICURCOL ) 291 $ JJA = JJA + JB 292 ICURROW = MOD( ICURROW+1, NPROW ) 293 ICURCOL = MOD( ICURCOL+1, NPCOL ) 294 DESCW( CSRC_ ) = ICURCOL 295* 296 DO 30 J = JN+1, JA+N-1, DESCA( NB_ ) 297* 298 JB = MIN( N-J+JA, DESCA( NB_ ) ) 299* 300* regenerate a block of columns of A 301* 302 IF( MYCOL.EQ.ICURCOL ) THEN 303 IF( LSAMEN( 2, MATTYP( 2:3 ), 'TR' ) ) THEN 304 CALL PDMATGEN( ICTXT, AFORM, DIAG, DESCA( M_ ), 305 $ DESCA( N_ ), DESCW( MB_ ), DESCW( NB_ ), 306 $ WORK, DESCW( LLD_ ), DESCA( RSRC_ ), 307 $ DESCA( CSRC_ ), 308 $ IASEED, IIA-1, NP, JJA-1, JB, MYROW, 309 $ MYCOL, NPROW, NPCOL ) 310 IF( LSAMEN( 3, MATTYP, 'UTR' ) ) THEN 311 CALL PDLASET( 'Lower', JA+N-J-1, JB, ZERO, ZERO, 312 $ WORK, IW+J-JA+1, 1, DESCW ) 313 ELSE 314 CALL PDLASET( 'All', J-JA, JB, ZERO, ZERO, WORK, IW, 315 $ 1, DESCW ) 316 CALL PDLASET( 'Upper', JB-1, JB-1, ZERO, ZERO, 317 $ WORK, IW+J-JA, 2, DESCW ) 318 END IF 319 ELSE 320 CALL PDMATGEN( ICTXT, AFORM, DIAG, DESCA( M_ ), 321 $ DESCA( N_ ), DESCW( MB_ ), DESCW( NB_ ), 322 $ WORK( IPW ), DESCW( LLD_ ), 323 $ DESCA( RSRC_ ), DESCA( CSRC_ ), IASEED, 324 $ IIA-1, NP, 325 $ JJA-1, JB, MYROW, MYCOL, NPROW, NPCOL ) 326 END IF 327 END IF 328* 329* Multiply A^{-1}*A 330* 331 IF( LSAMEN( 3, MATTYP, 'GEN' ) ) THEN 332* 333 CALL PDGEMM( 'No tranpose', 'No transpose', N, JB, N, ONE, 334 $ A, IA, JA, DESCA, WORK( IPW ), IW, 1, DESCW, 335 $ ZERO, WORK, IW, 1, DESCW ) 336* 337 ELSE IF( LSAMEN( 2, MATTYP(2:3), 'TR' ) ) THEN 338* 339 CALL PDTRMM( 'Left', UPLO, 'No tranpose', 'Non unit', N, JB, 340 $ ONE, A, IA, JA, DESCA, WORK, IW, 1, DESCW ) 341* 342 ELSE IF( LSAMEN( 2, MATTYP( 2:3 ), 'PD' ) ) THEN 343* 344 CALL PDSYMM( 'Left', UPLO, N, JB, ONE, A, IA, JA, DESCA, 345 $ WORK(IPW), IW, 1, DESCW, ZERO, WORK, IW, 1, 346 $ DESCW ) 347* 348 END IF 349* 350* subtract the identity matrix to the diagonal block of these 351* cols 352* 353 IF( MYROW.EQ.ICURROW .AND. MYCOL.EQ.ICURCOL ) THEN 354 DO 20 KK = 0, JB-1 355 WORK( II+KK*(DESCW( LLD_ )+1) ) = 356 $ WORK( II+KK*(DESCW( LLD_ )+1) ) - ONE 357 20 CONTINUE 358 END IF 359* 360* Compute the 1-norm of these JB cols 361* 362 TEMP = PDLANGE( '1', N, JB, WORK, IW, 1, DESCW, WORK( IPW ) ) 363 NRMINVAXA = MAX( TEMP, NRMINVAXA ) 364* 365 IF( MYROW.EQ.ICURROW ) 366 $ II = II + JB 367 IF( MYCOL.EQ.ICURCOL ) 368 $ JJA = JJA + JB 369 ICURROW = MOD( ICURROW+1, NPROW ) 370 ICURCOL = MOD( ICURCOL+1, NPCOL ) 371 DESCW( CSRC_ ) = ICURCOL 372* 373 30 CONTINUE 374* 375* Compute the scaled residual 376* 377 FRESID = NRMINVAXA / ( N * EPS * ANORM ) 378* 379 RETURN 380* 381* End of PDINVCHK 382* 383 END 384