1 PROGRAM PSLLTDRIVER 2* 3* -- ScaLAPACK testing driver (version 1.7) -- 4* University of Tennessee, Knoxville, Oak Ridge National Laboratory, 5* and University of California, Berkeley. 6* May 1, 1997 7* 8* Purpose 9* ======= 10* 11* PSLLTDRIVER is the main test program for the REAL 12* ScaLAPACK Cholesky routines. This test driver performs an 13* A = L*L**T or A = U**T*U factorization and solve, and optionally 14* performs condition estimation and iterative refinement. 15* 16* The program must be driven by a short data file. An annotated 17* example of a data file can be obtained by deleting the first 3 18* characters from the following 18 lines: 19* 'ScaLAPACK LLt factorization input file' 20* 'Intel iPSC/860 hypercube, gamma model.' 21* 'LLT.out' output file name (if any) 22* 6 device out 23* 'U' define Lower or Upper 24* 1 number of problems sizes 25* 31 100 200 values of N 26* 1 number of NB's 27* 2 10 24 values of NB 28* 1 number of NRHS's 29* 1 values of NRHS 30* 1 Number of NBRHS's 31* 1 values of NBRHS 32* 1 number of process grids (ordered pairs of P & Q) 33* 2 values of P 34* 2 values of Q 35* 1.0 threshold 36* T (T or F) Test Cond. Est. and Iter. Ref. Routines 37* 38* 39* Internal Parameters 40* =================== 41* 42* TOTMEM INTEGER, default = 2000000 43* TOTMEM is a machine-specific parameter indicating the 44* maximum amount of available memory in bytes. 45* The user should customize TOTMEM to his platform. Remember 46* to leave room in memory for the operating system, the BLACS 47* buffer, etc. For example, on a system with 8 MB of memory 48* per process (e.g., one processor on an Intel iPSC/860), the 49* parameters we use are TOTMEM=6200000 (leaving 1.8 MB for OS, 50* code, BLACS buffer, etc). However, for PVM, we usually set 51* TOTMEM = 2000000. Some experimenting with the maximum value 52* of TOTMEM may be required. 53* 54* INTGSZ INTEGER, default = 4 bytes. 55* REALSZ INTEGER, default = 4 bytes. 56* INTGSZ and REALSZ indicate the length in bytes on the 57* given platform for an integer and a single precision real. 58* MEM REAL array, dimension ( TOTMEM / REALSZ ) 59* 60* All arrays used by SCALAPACK routines are allocated from 61* this array and referenced by pointers. The integer IPA, 62* for example, is a pointer to the starting element of MEM for 63* the matrix A. 64* 65* ===================================================================== 66* 67* .. Parameters .. 68 INTEGER BLOCK_CYCLIC_2D, CSRC_, CTXT_, DLEN_, DTYPE_, 69 $ LLD_, MB_, M_, NB_, N_, RSRC_ 70 PARAMETER ( BLOCK_CYCLIC_2D = 1, DLEN_ = 9, DTYPE_ = 1, 71 $ CTXT_ = 2, M_ = 3, N_ = 4, MB_ = 5, NB_ = 6, 72 $ RSRC_ = 7, CSRC_ = 8, LLD_ = 9 ) 73 INTEGER INTGSZ, MEMSIZ, NTESTS, REALSZ, TOTMEM 74 REAL PADVAL, ZERO 75 PARAMETER ( INTGSZ = 4, REALSZ = 4, TOTMEM = 2000000, 76 $ MEMSIZ = TOTMEM / REALSZ, NTESTS = 20, 77 $ PADVAL = -9923.0E+0, ZERO = 0.0E+0 ) 78* .. 79* .. Local Scalars .. 80 LOGICAL CHECK, EST 81 CHARACTER UPLO 82 CHARACTER*6 PASSED 83 CHARACTER*80 OUTFILE 84 INTEGER HH, I, IAM, IASEED, IBSEED, ICTXT, IMIDPAD, 85 $ INFO, IPA, IPA0, IPB, IPB0, IPBERR, IPFERR, 86 $ IPREPAD, IPOSTPAD, IPW, IPW2, ITEMP, J, K, 87 $ KFAIL, KK, KPASS, KSKIP, KTESTS, LCM, LCMQ, 88 $ LIWORK, LWORK, LW2, MYCOL, MYRHS, MYROW, N, NB, 89 $ NBRHS, NGRIDS, NMAT, NNB, NNBR, NNR, NOUT, NP, 90 $ NPCOL, NPROCS, NPROW, NQ, NRHS, WORKSIZ 91 REAL ANORM, ANORM1, FRESID, RCOND, SRESID, SRESID2, 92 $ THRESH 93 DOUBLE PRECISION NOPS, TMFLOPS 94* .. 95* .. Local Arrays .. 96 INTEGER DESCA( DLEN_ ), DESCB( DLEN_ ), IERR( 1 ), 97 $ NBRVAL( NTESTS ), NBVAL( NTESTS ), 98 $ NRVAL( NTESTS ), NVAL( NTESTS ), 99 $ PVAL( NTESTS ), QVAL( NTESTS ) 100 REAL MEM( MEMSIZ ) 101 DOUBLE PRECISION CTIME( 2 ), WTIME( 2 ) 102* .. 103* .. External Subroutines .. 104 EXTERNAL BLACS_BARRIER, BLACS_EXIT, BLACS_GRIDEXIT, 105 $ BLACS_GRIDINFO, BLACS_GRIDINIT, DESCINIT, 106 $ IGSUM2D, BLACS_PINFO, PSCHEKPAD, PSFILLPAD, 107 $ PSLAFCHK, PSLASCHK, PSLLTINFO, 108 $ PSMATGEN, PSPOCON, PSPORFS, 109 $ PSPOTRF, PSPOTRRV, PSPOTRS, SLBOOT, 110 $ SLCOMBINE, SLTIMER 111* .. 112* .. External Functions .. 113 LOGICAL LSAME 114 INTEGER ICEIL, ILCM, NUMROC 115 REAL PSLANSY 116 EXTERNAL ICEIL, ILCM, LSAME, NUMROC, PSLANSY 117* .. 118* .. Intrinsic Functions .. 119 INTRINSIC DBLE, MAX, MIN 120* .. 121* .. Data Statements .. 122 DATA KFAIL, KPASS, KSKIP, KTESTS / 4*0 / 123* .. 124* .. Executable Statements .. 125* 126* Get starting information 127* 128 CALL BLACS_PINFO( IAM, NPROCS ) 129 IASEED = 100 130 IBSEED = 200 131 CALL PSLLTINFO( OUTFILE, NOUT, UPLO, NMAT, NVAL, NTESTS, NNB, 132 $ NBVAL, NTESTS, NNR, NRVAL, NTESTS, NNBR, NBRVAL, 133 $ NTESTS, NGRIDS, PVAL, NTESTS, QVAL, NTESTS, 134 $ THRESH, EST, MEM, IAM, NPROCS ) 135 CHECK = ( THRESH.GE.0.0E+0 ) 136* 137* Print headings 138* 139 IF( IAM.EQ.0 ) THEN 140 WRITE( NOUT, FMT = * ) 141 WRITE( NOUT, FMT = 9995 ) 142 WRITE( NOUT, FMT = 9994 ) 143 WRITE( NOUT, FMT = * ) 144 END IF 145* 146* Loop over different process grids 147* 148 DO 50 I = 1, NGRIDS 149* 150 NPROW = PVAL( I ) 151 NPCOL = QVAL( I ) 152* 153* Make sure grid information is correct 154* 155 IERR( 1 ) = 0 156 IF( NPROW.LT.1 ) THEN 157 IF( IAM.EQ.0 ) 158 $ WRITE( NOUT, FMT = 9999 ) 'GRID', 'nprow', NPROW 159 IERR( 1 ) = 1 160 ELSE IF( NPCOL.LT.1 ) THEN 161 IF( IAM.EQ.0 ) 162 $ WRITE( NOUT, FMT = 9999 ) 'GRID', 'npcol', NPCOL 163 IERR( 1 ) = 1 164 ELSE IF( NPROW*NPCOL.GT.NPROCS ) THEN 165 IF( IAM.EQ.0 ) 166 $ WRITE( NOUT, FMT = 9998 ) NPROW*NPCOL, NPROCS 167 IERR( 1 ) = 1 168 END IF 169* 170 IF( IERR( 1 ).GT.0 ) THEN 171 IF( IAM.EQ.0 ) 172 $ WRITE( NOUT, FMT = 9997 ) 'grid' 173 KSKIP = KSKIP + 1 174 GO TO 50 175 END IF 176* 177* Define process grid 178* 179 CALL BLACS_GET( -1, 0, ICTXT ) 180 CALL BLACS_GRIDINIT( ICTXT, 'Row-major', NPROW, NPCOL ) 181 CALL BLACS_GRIDINFO( ICTXT, NPROW, NPCOL, MYROW, MYCOL ) 182* 183* Go to bottom of process grid loop if this case doesn't use my 184* process 185* 186 IF( MYROW.GE.NPROW .OR. MYCOL.GE.NPCOL ) 187 $ GO TO 50 188* 189 DO 40 J = 1, NMAT 190* 191 N = NVAL( J ) 192* 193* Make sure matrix information is correct 194* 195 IERR( 1 ) = 0 196 IF( N.LT.1 ) THEN 197 IF( IAM.EQ.0 ) 198 $ WRITE( NOUT, FMT = 9999 ) 'MATRIX', 'N', N 199 IERR( 1 ) = 1 200 ELSE IF( N.LT.1 ) THEN 201 IF( IAM.EQ.0 ) 202 $ WRITE( NOUT, FMT = 9999 ) 'MATRIX', 'N', N 203 IERR( 1 ) = 1 204 END IF 205* 206* Check all processes for an error 207* 208 CALL IGSUM2D( ICTXT, 'All', ' ', 1, 1, IERR, 1, -1, 0 ) 209* 210 IF( IERR( 1 ).GT.0 ) THEN 211 IF( IAM.EQ.0 ) 212 $ WRITE( NOUT, FMT = 9997 ) 'matrix' 213 KSKIP = KSKIP + 1 214 GO TO 40 215 END IF 216* 217 DO 30 K = 1, NNB 218* 219 NB = NBVAL( K ) 220* 221* Make sure nb is legal 222* 223 IERR( 1 ) = 0 224 IF( NB.LT.1 ) THEN 225 IERR( 1 ) = 1 226 IF( IAM.EQ.0 ) 227 $ WRITE( NOUT, FMT = 9999 ) 'NB', 'NB', NB 228 END IF 229* 230* Check all processes for an error 231* 232 CALL IGSUM2D( ICTXT, 'All', ' ', 1, 1, IERR, 1, -1, 0 ) 233* 234 IF( IERR( 1 ).GT.0 ) THEN 235 IF( IAM.EQ.0 ) 236 $ WRITE( NOUT, FMT = 9997 ) 'NB' 237 KSKIP = KSKIP + 1 238 GO TO 30 239 END IF 240* 241* Padding constants 242* 243 NP = NUMROC( N, NB, MYROW, 0, NPROW ) 244 NQ = NUMROC( N, NB, MYCOL, 0, NPCOL ) 245 IF( CHECK ) THEN 246 IPREPAD = MAX( NB, NP ) 247 IMIDPAD = NB 248 IPOSTPAD = MAX( NB, NQ ) 249 ELSE 250 IPREPAD = 0 251 IMIDPAD = 0 252 IPOSTPAD = 0 253 END IF 254* 255* Initialize the array descriptor for the matrix A 256* 257 CALL DESCINIT( DESCA, N, N, NB, NB, 0, 0, ICTXT, 258 $ MAX( 1, NP )+IMIDPAD, IERR( 1 ) ) 259* 260* Check all processes for an error 261* 262 CALL IGSUM2D( ICTXT, 'All', ' ', 1, 1, IERR, 1, -1, 0 ) 263* 264 IF( IERR( 1 ).LT.0 ) THEN 265 IF( IAM.EQ.0 ) 266 $ WRITE( NOUT, FMT = 9997 ) 'descriptor' 267 KSKIP = KSKIP + 1 268 GO TO 30 269 END IF 270* 271* Assign pointers into MEM for SCALAPACK arrays, A is 272* allocated starting at position MEM( IPREPAD+1 ) 273* 274 IPA = IPREPAD+1 275 IF( EST ) THEN 276 IPA0 = IPA + DESCA( LLD_ )*NQ + IPOSTPAD + IPREPAD 277 IPW = IPA0 + DESCA( LLD_ )*NQ + IPOSTPAD + IPREPAD 278 ELSE 279 IPW = IPA + DESCA( LLD_ )*NQ + IPOSTPAD + IPREPAD 280 END IF 281* 282* 283 IF( CHECK ) THEN 284* 285* Calculate the amount of workspace required by 286* the checking routines PSLAFCHK, PSPOTRRV, and 287* PSLANSY 288* 289 WORKSIZ = NP * DESCA( NB_ ) 290* 291 WORKSIZ = MAX( WORKSIZ, DESCA( MB_ ) * DESCA( NB_ ) ) 292* 293 LCM = ILCM( NPROW, NPCOL ) 294 ITEMP = MAX( 2, 2 * NQ ) + NP 295 IF( NPROW.NE.NPCOL ) THEN 296 ITEMP = ITEMP + 297 $ NB * ICEIL( ICEIL( NP, NB ), LCM / NPROW ) 298 END IF 299 WORKSIZ = MAX( WORKSIZ, ITEMP ) 300 WORKSIZ = WORKSIZ + IPOSTPAD 301* 302 ELSE 303* 304 WORKSIZ = IPOSTPAD 305* 306 END IF 307* 308* Check for adequate memory for problem size 309* 310 IERR( 1 ) = 0 311 IF( IPW+WORKSIZ.GT.MEMSIZ ) THEN 312 IF( IAM.EQ.0 ) 313 $ WRITE( NOUT, FMT = 9996 ) 'factorization', 314 $ ( IPW+WORKSIZ )*REALSZ 315 IERR( 1 ) = 1 316 END IF 317* 318* Check all processes for an error 319* 320 CALL IGSUM2D( ICTXT, 'All', ' ', 1, 1, IERR, 1, -1, 0 ) 321* 322 IF( IERR( 1 ).GT.0 ) THEN 323 IF( IAM.EQ.0 ) 324 $ WRITE( NOUT, FMT = 9997 ) 'MEMORY' 325 KSKIP = KSKIP + 1 326 GO TO 30 327 END IF 328* 329* Generate a symmetric positive definite matrix A 330* 331 CALL PSMATGEN( ICTXT, 'Symm', 'Diag', DESCA( M_ ), 332 $ DESCA( N_ ), DESCA( MB_ ), DESCA( NB_ ), 333 $ MEM( IPA ), DESCA( LLD_ ), DESCA( RSRC_ ), 334 $ DESCA( CSRC_ ), IASEED, 0, NP, 0, NQ, 335 $ MYROW, MYCOL, NPROW, NPCOL ) 336* 337* Calculate inf-norm of A for residual error-checking 338* 339 IF( CHECK ) THEN 340 CALL PSFILLPAD( ICTXT, NP, NQ, MEM( IPA-IPREPAD ), 341 $ DESCA( LLD_ ), IPREPAD, IPOSTPAD, 342 $ PADVAL ) 343 CALL PSFILLPAD( ICTXT, WORKSIZ-IPOSTPAD, 1, 344 $ MEM( IPW-IPREPAD ), WORKSIZ-IPOSTPAD, 345 $ IPREPAD, IPOSTPAD, PADVAL ) 346 ANORM = PSLANSY( 'I', UPLO, N, MEM( IPA ), 1, 1, 347 $ DESCA, MEM( IPW ) ) 348 ANORM1 = PSLANSY( '1', UPLO, N, MEM( IPA ), 1, 1, 349 $ DESCA, MEM( IPW ) ) 350 CALL PSCHEKPAD( ICTXT, 'PSLANSY', NP, NQ, 351 $ MEM( IPA-IPREPAD ), DESCA( LLD_ ), 352 $ IPREPAD, IPOSTPAD, PADVAL ) 353 CALL PSCHEKPAD( ICTXT, 'PSLANSY', 354 $ WORKSIZ-IPOSTPAD, 1, 355 $ MEM( IPW-IPREPAD ), WORKSIZ-IPOSTPAD, 356 $ IPREPAD, IPOSTPAD, PADVAL ) 357 END IF 358* 359 IF( EST ) THEN 360 CALL PSMATGEN( ICTXT, 'Symm', 'Diag', DESCA( M_ ), 361 $ DESCA( N_ ), DESCA( MB_ ), 362 $ DESCA( NB_ ), MEM( IPA0 ), 363 $ DESCA( LLD_ ), DESCA( RSRC_ ), 364 $ DESCA( CSRC_ ), IASEED, 0, NP, 0, NQ, 365 $ MYROW, MYCOL, NPROW, NPCOL ) 366 IF( CHECK ) 367 $ CALL PSFILLPAD( ICTXT, NP, NQ, 368 $ MEM( IPA0-IPREPAD ), DESCA( LLD_ ), 369 $ IPREPAD, IPOSTPAD, PADVAL ) 370 END IF 371* 372 CALL SLBOOT() 373 CALL BLACS_BARRIER( ICTXT, 'All' ) 374* 375* Perform LLt factorization 376* 377 CALL SLTIMER( 1 ) 378* 379 CALL PSPOTRF( UPLO, N, MEM( IPA ), 1, 1, DESCA, INFO ) 380* 381 CALL SLTIMER( 1 ) 382* 383 IF( INFO.NE.0 ) THEN 384 IF( IAM.EQ.0 ) 385 $ WRITE( NOUT, FMT = * ) 'PSPOTRF INFO=', INFO 386 KFAIL = KFAIL + 1 387 RCOND = ZERO 388 GO TO 60 389 END IF 390* 391 IF( CHECK ) THEN 392* 393* Check for memory overwrite in LLt factorization 394* 395 CALL PSCHEKPAD( ICTXT, 'PSPOTRF', NP, NQ, 396 $ MEM( IPA-IPREPAD ), DESCA( LLD_ ), 397 $ IPREPAD, IPOSTPAD, PADVAL ) 398 END IF 399* 400 IF( EST ) THEN 401* 402* Calculate workspace required for PSPOCON 403* 404 LWORK = MAX( 1, 2*NP ) + MAX( 1, 2*NQ ) + 405 $ MAX( 2, DESCA( NB_ )* 406 $ MAX( 1, ICEIL( NPROW-1, NPCOL ) ), 407 $ NQ + DESCA( NB_ )* 408 $ MAX( 1, ICEIL( NPCOL-1, NPROW ) ) ) 409 IPW2 = IPW + LWORK + IPOSTPAD + IPREPAD 410 LIWORK = MAX( 1, NP ) 411 LW2 = ICEIL( LIWORK*INTGSZ, REALSZ ) + IPOSTPAD 412* 413 IERR( 1 ) = 0 414 IF( IPW2+LW2.GT.MEMSIZ ) THEN 415 IF( IAM.EQ.0 ) 416 $ WRITE( NOUT, FMT = 9996 )'cond est', 417 $ ( IPW2+LW2 )*REALSZ 418 IERR( 1 ) = 1 419 END IF 420* 421* Check all processes for an error 422* 423 CALL IGSUM2D( ICTXT, 'All', ' ', 1, 1, IERR, 1, 424 $ -1, 0 ) 425* 426 IF( IERR( 1 ).GT.0 ) THEN 427 IF( IAM.EQ.0 ) 428 $ WRITE( NOUT, FMT = 9997 ) 'MEMORY' 429 KSKIP = KSKIP + 1 430 GO TO 60 431 END IF 432* 433 IF( CHECK ) THEN 434 CALL PSFILLPAD( ICTXT, LWORK, 1, 435 $ MEM( IPW-IPREPAD ), LWORK, 436 $ IPREPAD, IPOSTPAD, PADVAL ) 437 CALL PSFILLPAD( ICTXT, LW2-IPOSTPAD, 1, 438 $ MEM( IPW2-IPREPAD ), 439 $ LW2-IPOSTPAD, IPREPAD, 440 $ IPOSTPAD, PADVAL ) 441 END IF 442* 443* Compute condition number of the matrix 444* 445 CALL PSPOCON( UPLO, N, MEM( IPA ), 1, 1, DESCA, 446 $ ANORM1, RCOND, MEM( IPW ), LWORK, 447 $ MEM( IPW2 ), LIWORK, INFO ) 448* 449 IF( CHECK ) THEN 450 CALL PSCHEKPAD( ICTXT, 'PSPOCON', NP, NQ, 451 $ MEM( IPA-IPREPAD ), DESCA( LLD_ ), 452 $ IPREPAD, IPOSTPAD, PADVAL ) 453 CALL PSCHEKPAD( ICTXT, 'PSPOCON', 454 $ LWORK, 1, MEM( IPW-IPREPAD ), 455 $ LWORK, IPREPAD, IPOSTPAD, 456 $ PADVAL ) 457 CALL PSCHEKPAD( ICTXT, 'PSPOCON', 458 $ LW2-IPOSTPAD, 1, 459 $ MEM( IPW2-IPREPAD ), LW2-IPOSTPAD, 460 $ IPREPAD, IPOSTPAD, PADVAL ) 461 END IF 462 END IF 463* 464* Loop over the different values for NRHS 465* 466 DO 20 HH = 1, NNR 467* 468 NRHS = NRVAL( HH ) 469* 470 DO 10 KK = 1, NNBR 471* 472 NBRHS = NBRVAL( KK ) 473* 474* Initialize Array Descriptor for rhs 475* 476 CALL DESCINIT( DESCB, N, NRHS, NB, NBRHS, 0, 0, 477 $ ICTXT, MAX( 1, NP )+IMIDPAD, 478 $ IERR( 1 ) ) 479* 480* move IPW to allow room for RHS 481* 482 MYRHS = NUMROC( DESCB( N_ ), DESCB( NB_ ), MYCOL, 483 $ DESCB( CSRC_ ), NPCOL ) 484 IPB = IPW 485* 486 IF( EST ) THEN 487 IPB0 = IPB + DESCB( LLD_ )*MYRHS + IPOSTPAD + 488 $ IPREPAD 489 IPFERR = IPB0 + DESCB( LLD_ )*MYRHS + IPOSTPAD 490 $ + IPREPAD 491 IPBERR = MYRHS + IPFERR + IPOSTPAD + IPREPAD 492 IPW = MYRHS + IPBERR + IPOSTPAD + IPREPAD 493 ELSE 494 IPW = IPB + DESCB( LLD_ )*MYRHS + IPOSTPAD + 495 $ IPREPAD 496 END IF 497* 498 IF( CHECK ) THEN 499* 500* Calculate the amount of workspace required by 501* the checking routines PSLASCHK 502* 503 LCMQ = LCM / NPCOL 504 WORKSIZ = MAX( WORKSIZ-IPOSTPAD, 505 $ NQ * NBRHS + NP * NBRHS + 506 $ MAX( MAX( NQ*NB, 2*NBRHS ), 507 $ NBRHS * NUMROC( NUMROC(N,NB,0,0,NPCOL),NB, 508 $ 0,0,LCMQ ) ) ) 509 WORKSIZ = IPOSTPAD + WORKSIZ 510 ELSE 511 WORKSIZ = IPOSTPAD 512 END IF 513* 514 IERR( 1 ) = 0 515 IF( IPW+WORKSIZ.GT.MEMSIZ ) THEN 516 IF( IAM.EQ.0 ) 517 $ WRITE( NOUT, FMT = 9996 )'solve', 518 $ ( IPW+WORKSIZ )*REALSZ 519 IERR( 1 ) = 1 520 END IF 521* 522* Check all processes for an error 523* 524 CALL IGSUM2D( ICTXT, 'All', ' ', 1, 1, IERR, 1, 525 $ -1, 0 ) 526* 527 IF( IERR( 1 ).GT.0 ) THEN 528 IF( IAM.EQ.0 ) 529 $ WRITE( NOUT, FMT = 9997 ) 'MEMORY' 530 KSKIP = KSKIP + 1 531 GO TO 10 532 END IF 533* 534* Generate RHS 535* 536 CALL PSMATGEN( ICTXT, 'No', 'No', DESCB( M_ ), 537 $ DESCB( N_ ), DESCB( MB_ ), 538 $ DESCB( NB_ ), MEM( IPB ), 539 $ DESCB( LLD_ ), DESCB( RSRC_ ), 540 $ DESCB( CSRC_ ), IBSEED, 0, NP, 0, 541 $ MYRHS, MYROW, MYCOL, NPROW, NPCOL ) 542* 543 IF( CHECK ) 544 $ CALL PSFILLPAD( ICTXT, NP, MYRHS, 545 $ MEM( IPB-IPREPAD ), 546 $ DESCB( LLD_ ), 547 $ IPREPAD, IPOSTPAD, PADVAL ) 548* 549 IF( EST ) THEN 550 CALL PSMATGEN( ICTXT, 'No', 'No', DESCB( M_ ), 551 $ DESCB( N_ ), DESCB( MB_ ), 552 $ DESCB( NB_ ), MEM( IPB0 ), 553 $ DESCB( LLD_ ), DESCB( RSRC_ ), 554 $ DESCB( CSRC_ ), IBSEED, 0, NP, 0, 555 $ MYRHS, MYROW, MYCOL, NPROW, 556 $ NPCOL ) 557* 558 IF( CHECK ) THEN 559 CALL PSFILLPAD( ICTXT, NP, MYRHS, 560 $ MEM( IPB0-IPREPAD ), 561 $ DESCB( LLD_ ), IPREPAD, 562 $ IPOSTPAD, PADVAL ) 563 CALL PSFILLPAD( ICTXT, 1, MYRHS, 564 $ MEM( IPFERR-IPREPAD ), 1, 565 $ IPREPAD, IPOSTPAD, 566 $ PADVAL ) 567 CALL PSFILLPAD( ICTXT, 1, MYRHS, 568 $ MEM( IPBERR-IPREPAD ), 1, 569 $ IPREPAD, IPOSTPAD, 570 $ PADVAL ) 571 END IF 572 END IF 573* 574 CALL BLACS_BARRIER( ICTXT, 'All' ) 575 CALL SLTIMER( 2 ) 576* 577* Solve linear system via Cholesky factorization 578* 579 CALL PSPOTRS( UPLO, N, NRHS, MEM( IPA ), 1, 1, 580 $ DESCA, MEM( IPB ), 1, 1, DESCB, 581 $ INFO ) 582* 583 CALL SLTIMER( 2 ) 584* 585 IF( CHECK ) THEN 586* 587* check for memory overwrite 588* 589 CALL PSCHEKPAD( ICTXT, 'PSPOTRS', NP, NQ, 590 $ MEM( IPA-IPREPAD ), 591 $ DESCA( LLD_ ), 592 $ IPREPAD, IPOSTPAD, PADVAL ) 593 CALL PSCHEKPAD( ICTXT, 'PSPOTRS', NP, 594 $ MYRHS, MEM( IPB-IPREPAD ), 595 $ DESCB( LLD_ ), IPREPAD, 596 $ IPOSTPAD, PADVAL ) 597* 598 CALL PSFILLPAD( ICTXT, WORKSIZ-IPOSTPAD, 1, 599 $ MEM( IPW-IPREPAD ), 600 $ WORKSIZ-IPOSTPAD, IPREPAD, 601 $ IPOSTPAD, PADVAL ) 602* 603* check the solution to rhs 604* 605 CALL PSLASCHK( 'Symm', 'Diag', N, NRHS, 606 $ MEM( IPB ), 1, 1, DESCB, 607 $ IASEED, 1, 1, DESCA, IBSEED, 608 $ ANORM, SRESID, MEM( IPW ) ) 609* 610 IF( IAM.EQ.0 .AND. SRESID.GT.THRESH ) 611 $ WRITE( NOUT, FMT = 9985 ) SRESID 612* 613* check for memory overwrite 614* 615 CALL PSCHEKPAD( ICTXT, 'PSLASCHK', NP, 616 $ MYRHS, MEM( IPB-IPREPAD ), 617 $ DESCB( LLD_ ), IPREPAD, 618 $ IPOSTPAD, PADVAL ) 619 CALL PSCHEKPAD( ICTXT, 'PSLASCHK', 620 $ WORKSIZ-IPOSTPAD, 1, 621 $ MEM( IPW-IPREPAD ), 622 $ WORKSIZ-IPOSTPAD, IPREPAD, 623 $ IPOSTPAD, PADVAL ) 624* 625* The second test is a NaN trap 626* 627 IF( ( SRESID.LE.THRESH ).AND. 628 $ ( (SRESID-SRESID).EQ.0.0E+0 ) ) THEN 629 KPASS = KPASS + 1 630 PASSED = 'PASSED' 631 ELSE 632 KFAIL = KFAIL + 1 633 PASSED = 'FAILED' 634 END IF 635 ELSE 636 KPASS = KPASS + 1 637 SRESID = SRESID - SRESID 638 PASSED = 'BYPASS' 639 END IF 640* 641 IF( EST ) THEN 642* 643* Calculate workspace required for PSPORFS 644* 645 LWORK = MAX( 1, 3*NP ) 646 IPW2 = IPW + LWORK + IPOSTPAD + IPREPAD 647 LIWORK = MAX( 1, NP ) 648 LW2 = ICEIL( LIWORK*INTGSZ, REALSZ ) + 649 $ IPOSTPAD 650* 651 IERR( 1 ) = 0 652 IF( IPW2+LW2.GT.MEMSIZ ) THEN 653 IF( IAM.EQ.0 ) 654 $ WRITE( NOUT, FMT = 9996 ) 655 $ 'iter ref', ( IPW2+LW2 )*REALSZ 656 IERR( 1 ) = 1 657 END IF 658* 659* Check all processes for an error 660* 661 CALL IGSUM2D( ICTXT, 'All', ' ', 1, 1, IERR, 662 $ 1, -1, 0 ) 663* 664 IF( IERR( 1 ).GT.0 ) THEN 665 IF( IAM.EQ.0 ) 666 $ WRITE( NOUT, FMT = 9997 ) 667 $ 'MEMORY' 668 KSKIP = KSKIP + 1 669 GO TO 10 670 END IF 671* 672 IF( CHECK ) THEN 673 CALL PSFILLPAD( ICTXT, LWORK, 1, 674 $ MEM( IPW-IPREPAD ), 675 $ LWORK, IPREPAD, IPOSTPAD, 676 $ PADVAL ) 677 CALL PSFILLPAD( ICTXT, LW2-IPOSTPAD, 678 $ 1, MEM( IPW2-IPREPAD ), 679 $ LW2-IPOSTPAD, 680 $ IPREPAD, IPOSTPAD, 681 $ PADVAL ) 682 END IF 683* 684* Use iterative refinement to improve the 685* computed solution 686* 687 CALL PSPORFS( UPLO, N, NRHS, MEM( IPA0 ), 688 $ 1, 1, DESCA, MEM( IPA ), 1, 1, 689 $ DESCA, MEM( IPB0 ), 1, 1, 690 $ DESCB, MEM( IPB ), 1, 1, DESCB, 691 $ MEM( IPFERR ), MEM( IPBERR ), 692 $ MEM( IPW ), LWORK, MEM( IPW2 ), 693 $ LIWORK, INFO ) 694* 695* check for memory overwrite 696* 697 IF( CHECK ) THEN 698 CALL PSCHEKPAD( ICTXT, 'PSPORFS', NP, 699 $ NQ, MEM( IPA0-IPREPAD ), 700 $ DESCA( LLD_ ), IPREPAD, 701 $ IPOSTPAD, PADVAL ) 702 CALL PSCHEKPAD( ICTXT, 'PSPORFS', NP, 703 $ NQ, MEM( IPA-IPREPAD ), 704 $ DESCA( LLD_ ), IPREPAD, 705 $ IPOSTPAD, PADVAL ) 706 CALL PSCHEKPAD( ICTXT, 'PSPORFS', NP, 707 $ MYRHS, MEM( IPB-IPREPAD ), 708 $ DESCB( LLD_ ), IPREPAD, 709 $ IPOSTPAD, PADVAL ) 710 CALL PSCHEKPAD( ICTXT, 'PSPORFS', NP, 711 $ MYRHS, 712 $ MEM( IPB0-IPREPAD ), 713 $ DESCB( LLD_ ), IPREPAD, 714 $ IPOSTPAD, PADVAL ) 715 CALL PSCHEKPAD( ICTXT, 'PSPORFS', 1, 716 $ MYRHS, 717 $ MEM( IPFERR-IPREPAD ), 1, 718 $ IPREPAD, IPOSTPAD, 719 $ PADVAL ) 720 CALL PSCHEKPAD( ICTXT, 'PSPORFS', 1, 721 $ MYRHS, 722 $ MEM( IPBERR-IPREPAD ), 1, 723 $ IPREPAD, IPOSTPAD, 724 $ PADVAL ) 725 CALL PSCHEKPAD( ICTXT, 'PSPORFS', LWORK, 726 $ 1, MEM( IPW-IPREPAD ), 727 $ LWORK, IPREPAD, IPOSTPAD, 728 $ PADVAL ) 729 CALL PSCHEKPAD( ICTXT, 'PSPORFS', 730 $ LW2-IPOSTPAD, 1, 731 $ MEM( IPW2-IPREPAD ), 732 $ LW2-IPOSTPAD, 733 $ IPREPAD, IPOSTPAD, 734 $ PADVAL ) 735* 736 CALL PSFILLPAD( ICTXT, WORKSIZ-IPOSTPAD, 737 $ 1, MEM( IPW-IPREPAD ), 738 $ WORKSIZ-IPOSTPAD, IPREPAD, 739 $ IPOSTPAD, PADVAL ) 740* 741* check the solution to rhs 742* 743 CALL PSLASCHK( 'Symm', 'Diag', N, NRHS, 744 $ MEM( IPB ), 1, 1, DESCB, 745 $ IASEED, 1, 1, DESCA, 746 $ IBSEED, ANORM, SRESID2, 747 $ MEM( IPW ) ) 748* 749 IF( IAM.EQ.0 .AND. SRESID2.GT.THRESH ) 750 $ WRITE( NOUT, FMT = 9985 ) SRESID2 751* 752* check for memory overwrite 753* 754 CALL PSCHEKPAD( ICTXT, 'PSLASCHK', NP, 755 $ MYRHS, MEM( IPB-IPREPAD ), 756 $ DESCB( LLD_ ), IPREPAD, 757 $ IPOSTPAD, PADVAL ) 758 CALL PSCHEKPAD( ICTXT, 'PSLASCHK', 759 $ WORKSIZ-IPOSTPAD, 1, 760 $ MEM( IPW-IPREPAD ), 761 $ WORKSIZ-IPOSTPAD, 762 $ IPREPAD, IPOSTPAD, 763 $ PADVAL ) 764 END IF 765 END IF 766* 767* Gather maximum of all CPU and WALL clock timings 768* 769 CALL SLCOMBINE( ICTXT, 'All', '>', 'W', 2, 1, 770 $ WTIME ) 771 CALL SLCOMBINE( ICTXT, 'All', '>', 'C', 2, 1, 772 $ CTIME ) 773* 774* Print results 775* 776 IF( MYROW.EQ.0 .AND. MYCOL.EQ.0 ) THEN 777* 778* 1/3 N^3 + 1/2 N^2 flops for LLt factorization 779* 780 NOPS = (DBLE(N)**3)/3.0D+0 + 781 $ (DBLE(N)**2)/2.0D+0 782* 783* nrhs * 2 N^2 flops for LLt solve. 784* 785 NOPS = NOPS + 2.0D+0*(DBLE(N)**2)*DBLE(NRHS) 786* 787* Calculate total megaflops -- factorization and 788* solve -- for WALL and CPU time, and print output 789* 790* Print WALL time if machine supports it 791* 792 IF( WTIME( 1 ) + WTIME( 2 ) .GT. 0.0D+0 ) THEN 793 TMFLOPS = NOPS / 794 $ ( ( WTIME( 1 )+WTIME( 2 ) ) * 1.0D+6 ) 795 ELSE 796 TMFLOPS = 0.0D+0 797 END IF 798* 799 IF( WTIME( 2 ).GE.0.0D+0 ) 800 $ WRITE( NOUT, FMT = 9993 ) 'WALL', UPLO, N, 801 $ NB, NRHS, NBRHS, NPROW, NPCOL, 802 $ WTIME( 1 ), WTIME( 2 ), TMFLOPS, 803 $ PASSED 804* 805* Print CPU time if machine supports it 806* 807 IF( CTIME( 1 )+CTIME( 2 ).GT.0.0D+0 ) THEN 808 TMFLOPS = NOPS / 809 $ ( ( CTIME( 1 )+CTIME( 2 ) ) * 1.0D+6 ) 810 ELSE 811 TMFLOPS = 0.0D+0 812 END IF 813* 814 IF( CTIME( 2 ).GE.0.0D+0 ) 815 $ WRITE( NOUT, FMT = 9993 ) 'CPU ', UPLO, N, 816 $ NB, NRHS, NBRHS, NPROW, NPCOL, 817 $ CTIME( 1 ), CTIME( 2 ), TMFLOPS, 818 $ PASSED 819* 820 END IF 821 10 CONTINUE 822 20 CONTINUE 823* 824 IF( CHECK .AND. SRESID.GT.THRESH ) THEN 825* 826* Compute FRESID = ||A - LL'|| / (||A|| * N * eps) 827* 828 CALL PSPOTRRV( UPLO, N, MEM( IPA ), 1, 1, DESCA, 829 $ MEM( IPW ) ) 830 CALL PSLAFCHK( 'Symm', 'Diag', N, N, MEM( IPA ), 1, 1, 831 $ DESCA, IASEED, ANORM, FRESID, 832 $ MEM( IPW ) ) 833* 834* Check for memory overwrite 835* 836 CALL PSCHEKPAD( ICTXT, 'PSPOTRRV', NP, NQ, 837 $ MEM( IPA-IPREPAD ), DESCA( LLD_ ), 838 $ IPREPAD, IPOSTPAD, PADVAL ) 839 CALL PSCHEKPAD( ICTXT, 'PSGETRRV', 840 $ WORKSIZ-IPOSTPAD, 1, 841 $ MEM( IPW-IPREPAD ), WORKSIZ-IPOSTPAD, 842 $ IPREPAD, IPOSTPAD, PADVAL ) 843* 844 IF( IAM.EQ.0 ) THEN 845 IF( LSAME( UPLO, 'L' ) ) THEN 846 WRITE( NOUT, FMT = 9986 ) 'L*L''', FRESID 847 ELSE 848 WRITE( NOUT, FMT = 9986 ) 'U''*U', FRESID 849 END IF 850 END IF 851 END IF 852* 853 30 CONTINUE 854 40 CONTINUE 855 CALL BLACS_GRIDEXIT( ICTXT ) 856 50 CONTINUE 857* 858* Print ending messages and close output file 859* 860 60 CONTINUE 861 IF( IAM.EQ.0 ) THEN 862 KTESTS = KPASS + KFAIL + KSKIP 863 WRITE( NOUT, FMT = * ) 864 WRITE( NOUT, FMT = 9992 ) KTESTS 865 IF( CHECK ) THEN 866 WRITE( NOUT, FMT = 9991 ) KPASS 867 WRITE( NOUT, FMT = 9989 ) KFAIL 868 ELSE 869 WRITE( NOUT, FMT = 9990 ) KPASS 870 END IF 871 WRITE( NOUT, FMT = 9988 ) KSKIP 872 WRITE( NOUT, FMT = * ) 873 WRITE( NOUT, FMT = * ) 874 WRITE( NOUT, FMT = 9987 ) 875 IF( NOUT.NE.6 .AND. NOUT.NE.0 ) 876 $ CLOSE ( NOUT ) 877 END IF 878* 879 CALL BLACS_EXIT( 0 ) 880* 881 9999 FORMAT( 'ILLEGAL ', A6, ': ', A5, ' = ', I3, 882 $ '; It should be at least 1' ) 883 9998 FORMAT( 'ILLEGAL GRID: nprow*npcol = ', I4, '. It can be at most', 884 $ I4 ) 885 9997 FORMAT( 'Bad ', A6, ' parameters: going on to next test case.' ) 886 9996 FORMAT( 'Unable to perform ', A, ': need TOTMEM of at least', 887 $ I11 ) 888 9995 FORMAT( 'TIME UPLO N NB NRHS NBRHS P Q LLt Time ', 889 $ 'Slv Time MFLOPS CHECK' ) 890 9994 FORMAT( '---- ---- ----- --- ---- ----- ---- ---- -------- ', 891 $ '-------- -------- ------' ) 892 9993 FORMAT( A4, 4X, A1, 1X, I5, 1X, I3, 1X, I4, 1X, I5, 1X, I4, 1X, 893 $ I4, 1X, F8.2, 1X, F8.2, 1X, F8.2, 1X, A6 ) 894 9992 FORMAT( 'Finished ', I6, ' tests, with the following results:' ) 895 9991 FORMAT( I5, ' tests completed and passed residual checks.' ) 896 9990 FORMAT( I5, ' tests completed without checking.' ) 897 9989 FORMAT( I5, ' tests completed and failed residual checks.' ) 898 9988 FORMAT( I5, ' tests skipped because of illegal input values.' ) 899 9987 FORMAT( 'END OF TESTS.' ) 900 9986 FORMAT( '||A - ', A4, '|| / (||A|| * N * eps) = ', G25.7 ) 901 9985 FORMAT( '||Ax-b||/(||x||*||A||*eps*N) ', F25.7 ) 902* 903 STOP 904* 905* End of PSLLTDRIVER 906* 907 END 908