1 PROGRAM PDLSDRIVER 2* 3* -- ScaLAPACK routine (version 1.7) -- 4* University of Tennessee, Knoxville, Oak Ridge National Laboratory, 5* and University of California, Berkeley. 6* May 28, 2001 7* 8* Purpose 9* ======= 10* 11* PDLSDRIVER is the main test program for the DOUBLE PRECISION 12* SCALAPACK (full rank) Least Squares routines. This test driver solves 13* full-rank least square problems. 14* 15* The program must be driven by a short data file. An annotated 16* example of a data file can be obtained by deleting the first 3 17* characters from the following 17 lines: 18* 'ScaLapack LS solve input file' 19* 'Intel iPSC/860 hypercube, gamma model.' 20* 'LS.out' output file name (if any) 21* 6 device out 22* 4 number of problems sizes 23* 55 17 31 201 values of M 24* 5 71 31 201 values of N 25* 3 number of NB's 26* 2 3 5 values of NB 27* 3 number of NRHS's 28* 2 3 5 values of NRHS 29* 2 number of NBRHS's 30* 1 2 values of NBRHS 31* 7 number of process grids (ordered P & Q) 32* 1 2 1 4 2 3 8 values of P 33* 7 2 4 1 3 2 1 values of Q 34* 3.0 threshold 35* 36* Internal Parameters 37* =================== 38* 39* TOTMEM INTEGER, default = 6200000. 40* TOTMEM is a machine-specific parameter indicating the 41* maximum amount of available memory in bytes. 42* The user should customize TOTMEM to his platform. Remember 43* to leave room in memory for the operating system, the BLACS 44* buffer, etc. For example, on a system with 8 MB of memory 45* per process (e.g., one processor on an Intel iPSC/860), the 46* parameters we use are TOTMEM=6200000 (leaving 1.8 MB for OS, 47* code, BLACS buffer, etc). However, for PVM, we usually set 48* TOTMEM = 2000000. Some experimenting with the maximum value 49* of TOTMEM may be required. 50* INTGSZ INTEGER, default = 4 bytes. 51* DBLESZ INTEGER, default = 8 bytes. 52* INTGSZ and DBLESZ indicate the length in bytes on the 53* given platform for an integer and a double precision real. 54* MEM DOUBLE PRECISION array, dimension ( TOTMEM / DBLESZ ) 55* All arrays used by SCALAPACK routines are allocated from 56* this array and referenced by pointers. The integer IPA, 57* for example, is a pointer to the starting element of MEM for 58* the matrix A. 59* 60* ===================================================================== 61* 62* .. Parameters .. 63 INTEGER BLOCK_CYCLIC_2D, CSRC_, CTXT_, DLEN_, DTYPE_, 64 $ LLD_, MB_, M_, NB_, N_, RSRC_ 65 PARAMETER ( BLOCK_CYCLIC_2D = 1, DLEN_ = 9, DTYPE_ = 1, 66 $ CTXT_ = 2, M_ = 3, N_ = 4, MB_ = 5, NB_ = 6, 67 $ RSRC_ = 7, CSRC_ = 8, LLD_ = 9 ) 68 INTEGER DBLESZ, MEMSIZ, NTESTS, TOTMEM 69 DOUBLE PRECISION PADVAL 70 DOUBLE PRECISION ONE, ZERO 71 PARAMETER ( DBLESZ = 8, TOTMEM = 2000000, 72 $ MEMSIZ = TOTMEM / DBLESZ, NTESTS = 20, 73 $ PADVAL = -9923.0D+0 ) 74 PARAMETER ( ONE = 1.0D+0, ZERO = 0.0D+0 ) 75* .. 76* .. Local Scalars .. 77 LOGICAL CHECK, TPSD 78 CHARACTER TRANS 79 CHARACTER*6 PASSED 80 CHARACTER*80 OUTFILE 81 INTEGER HH, I, IAM, IASEED, IBSEED, ICTXT, II, IMIDPAD, 82 $ INFO, IPA, IPB, IPOSTPAD, IPREPAD, IPW, IPW2, 83 $ IPX, ISCALE, ITRAN, ITYPE, J, JJ, K, KFAIL, KK, 84 $ KPASS, KSKIP, KTESTS, LCM, LCMP, LTAU, LWF, 85 $ LWORK, LWS, M, MNP, MNRHSP, MP, MQ, MYCOL, 86 $ MYROW, N, NB, NBRHS, NCOLS, NGRIDS, NMAT, NNB, 87 $ NNBR, NNR, NNRHSQ, NOUT, NP, NPCOL, NPROCS, 88 $ NPROW, NROWS, NQ, NRHS, NRHSP, NRHSQ, WORKSIZ 89 REAL THRESH 90 DOUBLE PRECISION ADDFAC, ADDS, ANORM, BNORM, MULFAC, MULTS, 91 $ NOPS, SRESID, TMFLOPS 92* .. 93* .. Local Arrays .. 94 INTEGER DESCA( DLEN_ ), DESCB( DLEN_ ), DESCW( LLD_ ), 95 $ DESCX( DLEN_ ), IERR( 2 ), MVAL( NTESTS ), 96 $ NBRVAL( NTESTS ), NBVAL( NTESTS ), 97 $ NRVAL( NTESTS ), NVAL( NTESTS ), 98 $ PVAL( NTESTS ), QVAL( NTESTS ) 99 DOUBLE PRECISION CTIME( 1 ), MEM( MEMSIZ ), RESULT( 2 ), 100 $ WTIME( 1 ) 101* .. 102* .. External Subroutines .. 103 EXTERNAL BLACS_BARRIER, BLACS_EXIT, BLACS_GET, 104 $ BLACS_GRIDEXIT, BLACS_GRIDINFO, BLACS_GRIDINIT, 105 $ BLACS_PINFO, DESCINIT, IGSUM2D, PDCHEKPAD, 106 $ PDFILLPAD, PDGELS, PDGEMM, PDLACPY, 107 $ PDLSINFO, PDMATGEN, PDNRM2, PDSCAL, 108 $ PDQRT13, PDQRT16, SLBOOT, SLCOMBINE, SLTIMER 109* .. 110* .. External Functions .. 111 LOGICAL LSAME 112 INTEGER ICEIL, ILCM, NUMROC 113 DOUBLE PRECISION PDLANGE, PDQRT14, PDQRT17 114 EXTERNAL ICEIL, ILCM, LSAME, NUMROC, PDLANGE, 115 $ PDQRT14, PDQRT17 116* .. 117* .. Intrinsic Functions .. 118 INTRINSIC MAX, MIN 119* .. 120* .. Data Statements .. 121 DATA KTESTS, KPASS, KFAIL, KSKIP / 4*0 / 122* .. 123* .. Executable Statements .. 124* 125* Get starting information 126* 127 CALL BLACS_PINFO( IAM, NPROCS ) 128* 129 IASEED = 100 130 IBSEED = 200 131 CALL PDLSINFO( OUTFILE, NOUT, NMAT, MVAL, NTESTS, NVAL, 132 $ NTESTS, NNB, NBVAL, NTESTS, NNR, NRVAL, NTESTS, 133 $ NNBR, NBRVAL, NTESTS, NGRIDS, PVAL, NTESTS, QVAL, 134 $ NTESTS, THRESH, MEM, IAM, NPROCS ) 135 CHECK = ( THRESH.GE.0.0E+0 ) 136* 137* Print headings 138* 139 IF( IAM.EQ.0 ) THEN 140 WRITE( NOUT, FMT = * ) 141 WRITE( NOUT, FMT = 9995 ) 142 WRITE( NOUT, FMT = 9994 ) 143 WRITE( NOUT, FMT = * ) 144 END IF 145* 146* Loop over different process grids 147* 148 DO 90 I = 1, NGRIDS 149* 150 NPROW = PVAL( I ) 151 NPCOL = QVAL( I ) 152* 153* Make sure grid information is correct 154* 155 IERR( 1 ) = 0 156 IF( NPROW.LT.1 ) THEN 157 IF( IAM.EQ.0 ) 158 $ WRITE( NOUT, FMT = 9999 ) 'GRID', 'nprow', NPROW 159 IERR( 1 ) = 1 160 ELSE IF( NPCOL.LT.1 ) THEN 161 IF( IAM.EQ.0 ) 162 $ WRITE( NOUT, FMT = 9999 ) 'GRID', 'npcol', NPCOL 163 IERR( 1 ) = 1 164 ELSE IF( NPROW*NPCOL.GT.NPROCS ) THEN 165 IF( IAM.EQ.0 ) 166 $ WRITE( NOUT, FMT = 9998 ) NPROW*NPCOL, NPROCS 167 IERR( 1 ) = 1 168 END IF 169* 170 IF( IERR( 1 ).GT.0 ) THEN 171 IF( IAM.EQ.0 ) 172 $ WRITE( NOUT, FMT = 9997 ) 'grid' 173 KSKIP = KSKIP + 1 174 GO TO 90 175 END IF 176* 177* Define process grid 178* 179 CALL BLACS_GET( -1, 0, ICTXT ) 180 CALL BLACS_GRIDINIT( ICTXT, 'Row-major', NPROW, NPCOL ) 181 CALL BLACS_GRIDINFO( ICTXT, NPROW, NPCOL, MYROW, MYCOL ) 182* 183* Go to bottom of loop if this case doesn't use my process 184* 185 IF( ( MYROW.GE.NPROW ).OR.( MYCOL.GE.NPCOL ) ) 186 $ GO TO 90 187* 188 DO 80 J = 1, NMAT 189* 190 M = MVAL( J ) 191 N = NVAL( J ) 192* 193* Make sure matrix information is correct 194* 195 IERR( 1 ) = 0 196 IF( M.LT.1 ) THEN 197 IF( IAM.EQ.0 ) 198 $ WRITE( NOUT, FMT = 9999 ) 'MATRIX', 'M', M 199 IERR( 1 ) = 1 200 ELSE IF( N.LT.1 ) THEN 201 IF( IAM.EQ.0 ) 202 $ WRITE( NOUT, FMT = 9999 ) 'MATRIX', 'N', N 203 IERR( 1 ) = 1 204 END IF 205* 206* Make sure no one had error 207* 208 CALL IGSUM2D( ICTXT, 'All', ' ', 1, 1, IERR, 1, -1, 0 ) 209* 210 IF( IERR( 1 ).GT.0 ) THEN 211 IF( IAM.EQ.0 ) 212 $ WRITE( NOUT, FMT = 9997 ) 'matrix' 213 KSKIP = KSKIP + 1 214 GO TO 80 215 END IF 216* 217* Loop over different blocking sizes 218* 219 DO 70 K = 1, NNB 220* 221 NB = NBVAL( K ) 222* 223* Make sure nb is legal 224* 225 IERR( 1 ) = 0 226 IF( NB.LT.1 ) THEN 227 IERR( 1 ) = 1 228 IF( IAM.EQ.0 ) 229 $ WRITE( NOUT, FMT = 9999 ) 'NB', 'NB', NB 230 END IF 231* 232* Check all processes for an error 233* 234 CALL IGSUM2D( ICTXT, 'All', ' ', 1, 1, IERR, 1, -1, 0 ) 235* 236 IF( IERR( 1 ).GT.0 ) THEN 237 IF( IAM.EQ.0 ) 238 $ WRITE( NOUT, FMT = 9997 ) 'NB' 239 KSKIP = KSKIP + 1 240 GO TO 70 241 END IF 242* 243* Padding constants 244* 245 MP = NUMROC( M, NB, MYROW, 0, NPROW ) 246 MQ = NUMROC( M, NB, MYCOL, 0, NPCOL ) 247 NP = NUMROC( N, NB, MYROW, 0, NPROW ) 248 MNP = MAX( MP, NP ) 249 NQ = NUMROC( N, NB, MYCOL, 0, NPCOL ) 250* 251 IF( CHECK ) THEN 252 IPREPAD = MAX( NB, MP ) 253 IMIDPAD = NB 254 IPOSTPAD = MAX( NB, NQ ) 255 ELSE 256 IPREPAD = 0 257 IMIDPAD = 0 258 IPOSTPAD = 0 259 END IF 260* 261* Initialize the array descriptor for the matrix A 262* 263 CALL DESCINIT( DESCA, M, N, NB, NB, 0, 0, ICTXT, 264 $ MAX( 1, MP ) + IMIDPAD, IERR( 1 ) ) 265* 266* Check all processes for an error 267* 268 CALL IGSUM2D( ICTXT, 'All', ' ', 1, 1, IERR, 1, -1, 0 ) 269* 270 IF( IERR( 1 ).LT.0 ) THEN 271 IF( IAM.EQ.0 ) 272 $ WRITE( NOUT, FMT = 9997 ) 'descriptor' 273 KSKIP = KSKIP + 1 274 GO TO 70 275 END IF 276* 277 DO 60 ISCALE = 1, 3 278* 279 ITYPE = ISCALE 280* 281* Assign pointers into MEM for SCALAPACK arrays, A is 282* allocated starting at position MEM( IPREPAD+1 ) 283* 284 IPA = IPREPAD + 1 285 IPX = IPA + DESCA( LLD_ )*NQ + IPOSTPAD + IPREPAD 286 IPW = IPX 287* 288 WORKSIZ = NQ + IPOSTPAD 289* 290* Check for adequate memory for problem size 291* 292 IERR( 1 ) = 0 293 IF( ( IPW+WORKSIZ ).GT.MEMSIZ ) THEN 294 IF( IAM.EQ.0 ) 295 $ WRITE( NOUT, FMT = 9996 ) 'MEMORY', 296 $ ( IPX+WORKSIZ )*DBLESZ 297 IERR( 1 ) = 1 298 END IF 299* 300* Check all processes for an error 301* 302 CALL IGSUM2D( ICTXT, 'All', ' ', 1, 1, IERR, 1, -1, 303 $ 0 ) 304* 305 IF( IERR( 1 ).GT.0 ) THEN 306 IF( IAM.EQ.0 ) 307 $ WRITE( NOUT, FMT = 9997 ) 'MEMORY' 308 KSKIP = KSKIP + 1 309 GO TO 70 310 END IF 311* 312 IF( CHECK ) THEN 313 CALL PDFILLPAD( ICTXT, MP, NQ, MEM( IPA-IPREPAD ), 314 $ DESCA( LLD_ ), IPREPAD, IPOSTPAD, 315 $ PADVAL ) 316 CALL PDFILLPAD( ICTXT, WORKSIZ-IPOSTPAD, 1, 317 $ MEM( IPW-IPREPAD ), 318 $ WORKSIZ-IPOSTPAD, IPREPAD, 319 $ IPOSTPAD, PADVAL ) 320 END IF 321* 322* Generate the matrix A and calculate its 1-norm 323* 324 CALL PDQRT13( ISCALE, M, N, MEM( IPA ), 1, 1, 325 $ DESCA, ANORM, IASEED, MEM( IPW ) ) 326* 327 IF( CHECK ) THEN 328 CALL PDCHEKPAD( ICTXT, 'PDQRT13', MP, NQ, 329 $ MEM( IPA-IPREPAD ), DESCA( LLD_ ), 330 $ IPREPAD, IPOSTPAD, PADVAL ) 331 CALL PDCHEKPAD( ICTXT, 'PDQRT13', 332 $ WORKSIZ-IPOSTPAD, 1, 333 $ MEM( IPW-IPREPAD ), 334 $ WORKSIZ-IPOSTPAD, IPREPAD, 335 $ IPOSTPAD, PADVAL ) 336 END IF 337* 338 DO 50 ITRAN = 1, 2 339* 340 IF( ITRAN.EQ.1 ) THEN 341 NROWS = M 342 NCOLS = N 343 TRANS = 'N' 344 TPSD = .FALSE. 345 ELSE 346 NROWS = N 347 NCOLS = M 348 TRANS = 'T' 349 TPSD = .TRUE. 350 END IF 351* 352* Loop over the different values for NRHS 353* 354 DO 40 HH = 1, NNR 355* 356 NRHS = NRVAL( HH ) 357* 358 DO 30 KK = 1, NNBR 359* 360 NBRHS = NBRVAL( KK ) 361* 362 NRHSP = NUMROC( NRHS, NBRHS, MYROW, 0, 363 $ NPROW ) 364 NRHSQ = NUMROC( NRHS, NBRHS, MYCOL, 0, 365 $ NPCOL ) 366* 367* Define Array descriptor for rhs MAX(M,N)xNRHS 368* 369 CALL DESCINIT( DESCX, MAX( M, N ), NRHS, NB, 370 $ NBRHS, 0, 0, ICTXT, 371 $ MAX( 1, MNP ) + IMIDPAD, 372 $ IERR( 1 ) ) 373 IF( TPSD ) THEN 374 CALL DESCINIT( DESCW, M, NRHS, NB, NBRHS, 375 $ 0, 0, ICTXT, MAX( 1, MP ) + 376 $ IMIDPAD, IERR( 2 ) ) 377 ELSE 378 CALL DESCINIT( DESCW, N, NRHS, NB, NBRHS, 379 $ 0, 0, ICTXT, MAX( 1, NP ) + 380 $ IMIDPAD, IERR( 2 ) ) 381 END IF 382* 383* Check all processes for an error 384* 385 CALL IGSUM2D( ICTXT, 'All', ' ', 2, 1, IERR, 386 $ 2, -1, 0 ) 387* 388 IF( IERR( 1 ).LT.0 .OR. IERR( 2 ).LT.0 ) THEN 389 IF( IAM.EQ.0 ) 390 $ WRITE( NOUT, FMT = 9997 ) 'descriptor' 391 KSKIP = KSKIP + 1 392 GO TO 30 393 END IF 394* 395* Check for enough memory 396* 397 IPX = IPA + DESCA( LLD_ )*NQ + IPOSTPAD + 398 $ IPREPAD 399 IPW = IPX + DESCX( LLD_ )*NRHSQ + IPOSTPAD + 400 $ IPREPAD 401 WORKSIZ = DESCW( LLD_ )*NRHSQ + IPOSTPAD 402* 403 IERR( 1 ) = 0 404 IF( IPW+WORKSIZ.GT.MEMSIZ ) THEN 405 IF( IAM.EQ.0 ) 406 $ WRITE( NOUT, FMT = 9996 ) 'Generation', 407 $ ( IPW+WORKSIZ )*DBLESZ 408 IERR( 1 ) = 1 409 END IF 410* 411* Check all processes for an error 412* 413 CALL IGSUM2D( ICTXT, 'All', ' ', 1, 1, IERR, 414 $ 1, -1, 0 ) 415* 416 IF( IERR( 1 ).GT.0 ) THEN 417 IF( IAM.EQ.0 ) 418 $ WRITE( NOUT, FMT = 9997 ) 'MEMORY' 419 KSKIP = KSKIP + 1 420 GO TO 30 421 END IF 422* 423* Generate RHS 424* 425 IF( TPSD ) THEN 426 CALL PDMATGEN( ICTXT, 'No', 'No', 427 $ DESCW( M_ ), DESCW( N_ ), 428 $ DESCW( MB_ ), DESCW( NB_ ), 429 $ MEM( IPW ), DESCW( LLD_ ), 430 $ DESCW( RSRC_ ), 431 $ DESCW( CSRC_ ), IBSEED, 0, 432 $ MP, 0, NRHSQ, MYROW, MYCOL, 433 $ NPROW, NPCOL ) 434 ELSE 435 CALL PDMATGEN( ICTXT, 'No', 'No', 436 $ DESCW( M_ ), DESCW( N_ ), 437 $ DESCW( MB_ ), DESCW( NB_ ), 438 $ MEM( IPW ), DESCW( LLD_ ), 439 $ DESCW( RSRC_ ), 440 $ DESCW( CSRC_ ), IBSEED, 0, 441 $ NP, 0, NRHSQ, MYROW, MYCOL, 442 $ NPROW, NPCOL ) 443 END IF 444* 445 IF( CHECK ) THEN 446 CALL PDFILLPAD( ICTXT, MNP, NRHSQ, 447 $ MEM( IPX-IPREPAD ), 448 $ DESCX( LLD_ ), IPREPAD, 449 $ IPOSTPAD, PADVAL ) 450 IF( TPSD ) THEN 451 CALL PDFILLPAD( ICTXT, MP, NRHSQ, 452 $ MEM( IPW-IPREPAD ), 453 $ DESCW( LLD_ ), IPREPAD, 454 $ IPOSTPAD, PADVAL ) 455 ELSE 456 CALL PDFILLPAD( ICTXT, NP, NRHSQ, 457 $ MEM( IPW-IPREPAD ), 458 $ DESCW( LLD_ ), IPREPAD, 459 $ IPOSTPAD, PADVAL ) 460 END IF 461 END IF 462* 463 DO 10 JJ = 1, NRHS 464 CALL PDNRM2( NCOLS, BNORM, MEM( IPW ), 1, 465 $ JJ, DESCW, 1 ) 466 IF( BNORM.GT.ZERO ) 467 $ CALL PDSCAL( NCOLS, ONE / BNORM, 468 $ MEM( IPW ), 1, JJ, DESCW, 469 $ 1 ) 470 10 CONTINUE 471* 472 CALL PDGEMM( TRANS, 'N', NROWS, NRHS, NCOLS, 473 $ ONE, MEM( IPA ), 1, 1, DESCA, 474 $ MEM( IPW ), 1, 1, DESCW, ZERO, 475 $ MEM( IPX ), 1, 1, DESCX ) 476* 477 IF( CHECK ) THEN 478* 479* check for memory overwrite 480* 481 CALL PDCHEKPAD( ICTXT, 'Generation', MP, 482 $ NQ, MEM( IPA-IPREPAD ), 483 $ DESCA( LLD_ ), IPREPAD, 484 $ IPOSTPAD, PADVAL ) 485 CALL PDCHEKPAD( ICTXT, 'Generation', MNP, 486 $ NRHSQ, MEM( IPX-IPREPAD ), 487 $ DESCX( LLD_ ), IPREPAD, 488 $ IPOSTPAD, PADVAL ) 489 IF( TPSD ) THEN 490 CALL PDCHEKPAD( ICTXT, 'Generation', 491 $ MP, NRHSQ, 492 $ MEM( IPW-IPREPAD ), 493 $ DESCW( LLD_ ), IPREPAD, 494 $ IPOSTPAD, PADVAL ) 495 ELSE 496 CALL PDCHEKPAD( ICTXT, 'Generation', 497 $ NP, NRHSQ, 498 $ MEM( IPW-IPREPAD ), 499 $ DESCW( LLD_ ), IPREPAD, 500 $ IPOSTPAD, PADVAL ) 501 END IF 502* 503* Allocate space for copy of rhs 504* 505 IPB = IPW 506* 507 IF( TPSD ) THEN 508 CALL DESCINIT( DESCB, N, NRHS, NB, 509 $ NBRHS, 0, 0, ICTXT, 510 $ MAX( 1, NP ) + IMIDPAD, 511 $ IERR( 1 ) ) 512 ELSE 513 CALL DESCINIT( DESCB, M, NRHS, NB, 514 $ NBRHS, 0, 0, ICTXT, 515 $ MAX( 1, MP ) + IMIDPAD, 516 $ IERR( 1 ) ) 517 END IF 518* 519* Check all processes for an error 520* 521 CALL IGSUM2D( ICTXT, 'All', ' ', 1, 1, 522 $ IERR, 1, -1, 0 ) 523* 524 IF( IERR( 1 ).LT.0 ) THEN 525 IF( IAM.EQ.0 ) 526 $ WRITE( NOUT, FMT = 9997 ) 527 $ 'descriptor' 528 KSKIP = KSKIP + 1 529 GO TO 30 530 END IF 531* 532 IPW = IPB + DESCB( LLD_ )*NRHSQ + 533 $ IPOSTPAD + IPREPAD 534* 535 END IF 536* 537* Calculate the amount of workspace for PDGELS 538* 539 IF( M.GE.N ) THEN 540 LTAU = NUMROC( MIN(M,N), NB, MYCOL, 0, 541 $ NPCOL ) 542 LWF = NB * ( MP + NQ + NB ) 543 LWS = MAX( ( NB*( NB - 1 ) ) / 2, 544 $ ( MP + NRHSQ ) * NB ) + NB*NB 545 ELSE 546 LCM = ILCM( NPROW, NPCOL ) 547 LCMP = LCM / NPROW 548 LTAU = NUMROC( MIN(M,N), NB, MYROW, 0, 549 $ NPROW ) 550 LWF = NB * ( MP + NQ + NB ) 551 LWS = MAX( ( NB*( NB - 1 ) ) / 2, ( NP + 552 $ MAX( NQ + NUMROC( NUMROC( N, NB, 0, 553 $ 0, NPROW ), NB, 0, 0, LCMP ), 554 $ NRHSQ ) ) * NB ) + NB*NB 555 END IF 556* 557 LWORK = LTAU + MAX( LWF, LWS ) 558 WORKSIZ = LWORK + IPOSTPAD 559* 560* Check for adequate memory for problem size 561* 562 IERR( 1 ) = 0 563 IF( IPW+WORKSIZ.GT.MEMSIZ ) THEN 564 IF( IAM.EQ.0 ) 565 $ WRITE( NOUT, FMT = 9996 ) 'solve', 566 $ ( IPW+WORKSIZ )*DBLESZ 567 IERR( 1 ) = 1 568 END IF 569* 570* Check all processes for an error 571* 572 CALL IGSUM2D( ICTXT, 'All', ' ', 1, 1, IERR, 573 $ 1, -1, 0 ) 574* 575 IF( IERR( 1 ).GT.0 ) THEN 576 IF( IAM.EQ.0 ) 577 $ WRITE( NOUT, FMT = 9997 ) 'MEMORY' 578 KSKIP = KSKIP + 1 579 GO TO 30 580 END IF 581* 582 IF( CHECK ) THEN 583* 584* Make the copy of the right hand side 585* 586 CALL PDLACPY( 'All', NROWS, NRHS, 587 $ MEM( IPX ), 1, 1, DESCX, 588 $ MEM( IPB ), 1, 1, DESCB ) 589* 590 IF( TPSD ) THEN 591 CALL PDFILLPAD( ICTXT, NP, NRHSQ, 592 $ MEM( IPB-IPREPAD ), 593 $ DESCB( LLD_ ), IPREPAD, 594 $ IPOSTPAD, PADVAL ) 595 ELSE 596 CALL PDFILLPAD( ICTXT, MP, NRHSQ, 597 $ MEM( IPB-IPREPAD ), 598 $ DESCB( LLD_ ), IPREPAD, 599 $ IPOSTPAD, PADVAL ) 600 END IF 601 CALL PDFILLPAD( ICTXT, LWORK, 1, 602 $ MEM( IPW-IPREPAD ), 603 $ LWORK, IPREPAD, 604 $ IPOSTPAD, PADVAL ) 605 END IF 606* 607 CALL SLBOOT( ) 608 CALL BLACS_BARRIER( ICTXT, 'All' ) 609 CALL SLTIMER( 1 ) 610* 611* Solve the LS or overdetermined system 612* 613 CALL PDGELS( TRANS, M, N, NRHS, MEM( IPA ), 614 $ 1, 1, DESCA, MEM( IPX ), 1, 1, 615 $ DESCX, MEM( IPW ), LWORK, INFO ) 616* 617 CALL SLTIMER( 1 ) 618* 619 IF( CHECK ) THEN 620* 621* check for memory overwrite 622* 623 CALL PDCHEKPAD( ICTXT, 'PDGELS', MP, 624 $ NQ, MEM( IPA-IPREPAD ), 625 $ DESCA( LLD_ ), IPREPAD, 626 $ IPOSTPAD, PADVAL ) 627 CALL PDCHEKPAD( ICTXT, 'PDGELS', MNP, 628 $ NRHSQ, MEM( IPX-IPREPAD ), 629 $ DESCX( LLD_ ), IPREPAD, 630 $ IPOSTPAD, PADVAL ) 631 CALL PDCHEKPAD( ICTXT, 'PDGELS', LWORK, 632 $ 1, MEM( IPW-IPREPAD ), 633 $ LWORK, IPREPAD, 634 $ IPOSTPAD, PADVAL ) 635 END IF 636* 637* Regenerate A in place for testing and next 638* iteration 639* 640 CALL PDQRT13( ISCALE, M, N, MEM( IPA ), 1, 1, 641 $ DESCA, ANORM, IASEED, 642 $ MEM( IPW ) ) 643* 644* check the solution to rhs 645* 646 IF( CHECK ) THEN 647* 648* Am I going to call PDQRT17 ? 649* 650 IF( ( M.GE.N .AND. ( .NOT.TPSD ) ) .OR. 651 $ ( M.LT.N .AND. TPSD ) ) THEN 652* 653* Call PDQRT17 first, A, X, and B remain 654* unchanged. Solving LS system 655* 656* Check amount of memory for PDQRT17 657* 658 IF( TPSD ) THEN 659 WORKSIZ = NP*NRHSQ + NRHSP*MQ 660 IPW2 = IPW + WORKSIZ 661 WORKSIZ = WORKSIZ + 662 $ MAX( NQ, MAX( MQ, NRHSQ ) ) + 663 $ IPOSTPAD 664 ELSE 665 WORKSIZ = MP*NRHSQ + NRHSP*NQ 666 IPW2 = IPW + WORKSIZ 667 WORKSIZ = WORKSIZ + 668 $ MAX( NQ, NRHSQ ) + 669 $ IPOSTPAD 670 END IF 671* 672* Check for adequate memory for problem 673* size 674* 675 IERR( 1 ) = 0 676 IF( ( IPW+WORKSIZ ).GT.MEMSIZ ) THEN 677 IF( IAM.EQ.0 ) 678 $ WRITE( NOUT, FMT = 9996 ) 679 $ 'MEMORY', ( IPW+WORKSIZ )*DBLESZ 680 IERR( 1 ) = 1 681 END IF 682* 683* Check all processes for an error 684* 685 CALL IGSUM2D( ICTXT, 'All', ' ', 1, 1, 686 $ IERR, 1, -1, 0 ) 687* 688 IF( IERR( 1 ).GT.0 ) THEN 689 IF( IAM.EQ.0 ) 690 $ WRITE( NOUT, FMT = 9997 ) 691 $ 'MEMORY' 692 KSKIP = KSKIP + 1 693 GO TO 30 694 END IF 695* 696 CALL PDFILLPAD( ICTXT, 697 $ WORKSIZ-IPOSTPAD, 1, 698 $ MEM( IPW-IPREPAD ), 699 $ WORKSIZ-IPOSTPAD, 700 $ IPREPAD, IPOSTPAD, 701 $ PADVAL ) 702* 703 RESULT( 2 ) = PDQRT17( TRANS, 1, M, N, 704 $ NRHS, 705 $ MEM( IPA ), 706 $ 1, 1, DESCA, 707 $ MEM( IPX ), 1, 708 $ 1, DESCX, 709 $ MEM( IPB ), 710 $ 1, 1, DESCB, 711 $ MEM( IPW ), 712 $ MEM( IPW2 ) ) 713 SRESID = RESULT( 2 ) 714* 715 CALL PDCHEKPAD( ICTXT, 'PDQRT17', 716 $ MP, NQ, 717 $ MEM( IPA-IPREPAD ), 718 $ DESCA( LLD_ ), 719 $ IPREPAD, IPOSTPAD, 720 $ PADVAL ) 721 CALL PDCHEKPAD( ICTXT, 'PDQRT17', 722 $ MNP, NRHSQ, 723 $ MEM( IPX-IPREPAD ), 724 $ DESCX( LLD_ ), IPREPAD, 725 $ IPOSTPAD, PADVAL ) 726 IF( TPSD ) THEN 727 CALL PDCHEKPAD( ICTXT, 'PDQRT17', 728 $ NP, NRHSQ, 729 $ MEM( IPB-IPREPAD ), 730 $ DESCB( LLD_ ), 731 $ IPREPAD, IPOSTPAD, 732 $ PADVAL ) 733 ELSE 734 CALL PDCHEKPAD( ICTXT, 'PDQRT17', 735 $ MP, NRHSQ, 736 $ MEM( IPB-IPREPAD ), 737 $ DESCB( LLD_ ), 738 $ IPREPAD, IPOSTPAD, 739 $ PADVAL ) 740 END IF 741 CALL PDCHEKPAD( ICTXT, 'PDQRT17', 742 $ WORKSIZ-IPOSTPAD, 1, 743 $ MEM( IPW-IPREPAD ), 744 $ WORKSIZ-IPOSTPAD, 745 $ IPREPAD, IPOSTPAD, 746 $ PADVAL ) 747 END IF 748* 749* Call PDQRT16, B will be destroyed. 750* 751 IF( TPSD ) THEN 752 WORKSIZ = MP + IPOSTPAD 753 ELSE 754 WORKSIZ = NQ + IPOSTPAD 755 END IF 756* 757* Check for adequate memory for problem size 758* 759 IERR( 1 ) = 0 760 IF( ( IPW+WORKSIZ ).GT.MEMSIZ ) THEN 761 IF( IAM.EQ.0 ) 762 $ WRITE( NOUT, FMT = 9996 ) 'MEMORY', 763 $ ( IPW+WORKSIZ )*DBLESZ 764 IERR( 1 ) = 1 765 END IF 766* 767* Check all processes for an error 768* 769 CALL IGSUM2D( ICTXT, 'All', ' ', 1, 1, 770 $ IERR, 1, -1, 0 ) 771* 772 IF( IERR( 1 ).GT.0 ) THEN 773 IF( IAM.EQ.0 ) 774 $ WRITE( NOUT, FMT = 9997 ) 'MEMORY' 775 KSKIP = KSKIP + 1 776 GO TO 30 777 END IF 778* 779 CALL PDFILLPAD( ICTXT, 780 $ WORKSIZ-IPOSTPAD, 1, 781 $ MEM( IPW-IPREPAD ), 782 $ WORKSIZ-IPOSTPAD, 783 $ IPREPAD, IPOSTPAD, 784 $ PADVAL ) 785* 786 CALL PDQRT16( TRANS, M, N, NRHS, 787 $ MEM( IPA ), 1, 1, DESCA, 788 $ MEM( IPX ), 1, 1, DESCX, 789 $ MEM( IPB ), 1, 1, DESCB, 790 $ MEM( IPW ), RESULT( 1 ) ) 791* 792 CALL PDCHEKPAD( ICTXT, 'PDQRT16', 793 $ MP, NQ, 794 $ MEM( IPA-IPREPAD ), 795 $ DESCA( LLD_ ), 796 $ IPREPAD, IPOSTPAD, 797 $ PADVAL ) 798 CALL PDCHEKPAD( ICTXT, 'PDQRT16', 799 $ MNP, NRHSQ, 800 $ MEM( IPX-IPREPAD ), 801 $ DESCX( LLD_ ), IPREPAD, 802 $ IPOSTPAD, PADVAL ) 803 IF( TPSD ) THEN 804 CALL PDCHEKPAD( ICTXT, 'PDQRT16', 805 $ NP, NRHSQ, 806 $ MEM( IPB-IPREPAD ), 807 $ DESCB( LLD_ ), 808 $ IPREPAD, IPOSTPAD, 809 $ PADVAL ) 810 ELSE 811 CALL PDCHEKPAD( ICTXT, 'PDQRT16', 812 $ MP, NRHSQ, 813 $ MEM( IPB-IPREPAD ), 814 $ DESCB( LLD_ ), 815 $ IPREPAD, IPOSTPAD, 816 $ PADVAL ) 817 END IF 818 CALL PDCHEKPAD( ICTXT, 'PDQRT16', 819 $ WORKSIZ-IPOSTPAD, 1, 820 $ MEM( IPW-IPREPAD ), 821 $ WORKSIZ-IPOSTPAD, 822 $ IPREPAD, IPOSTPAD, 823 $ PADVAL ) 824* 825* Call PDQRT14 826* 827 IF( ( M.GE.N .AND. TPSD ) .OR. 828 $ ( M.LT.N .AND. ( .NOT.TPSD ) ) ) THEN 829* 830 IPW = IPB 831* 832 IF( TPSD ) THEN 833* 834 NNRHSQ = NUMROC( N+NRHS, NB, MYCOL, 835 $ 0, NPCOL ) 836 LTAU = NUMROC( MIN( M, N+NRHS ), NB, 837 $ MYCOL, 0, NPCOL ) 838 LWF = NB * ( NB + MP + NNRHSQ ) 839 WORKSIZ = MP * NNRHSQ + LTAU + LWF + 840 $ IPOSTPAD 841* 842 ELSE 843* 844 MNRHSP = NUMROC( M+NRHS, NB, MYROW, 845 $ 0, NPROW ) 846 LTAU = NUMROC( MIN( M+NRHS, N ), NB, 847 $ MYROW, 0, NPROW ) 848 LWF = NB * ( NB + MNRHSP + NQ ) 849 WORKSIZ = MNRHSP * NQ + LTAU + LWF + 850 $ IPOSTPAD 851* 852 END IF 853* 854* Check for adequate memory for problem 855* size 856* 857 IERR( 1 ) = 0 858 IF( ( IPW+WORKSIZ ).GT.MEMSIZ ) THEN 859 IF( IAM.EQ.0 ) 860 $ WRITE( NOUT, FMT = 9996 ) 861 $ 'MEMORY', ( IPW+WORKSIZ )*DBLESZ 862 IERR( 1 ) = 1 863 END IF 864* 865* Check all processes for an error 866* 867 CALL IGSUM2D( ICTXT, 'All', ' ', 1, 1, 868 $ IERR, 1, -1, 0 ) 869* 870 IF( IERR( 1 ).GT.0 ) THEN 871 IF( IAM.EQ.0 ) 872 $ WRITE( NOUT, FMT = 9997 ) 873 $ 'MEMORY' 874 KSKIP = KSKIP + 1 875 GO TO 30 876 END IF 877* 878 CALL PDFILLPAD( ICTXT, 879 $ WORKSIZ-IPOSTPAD, 1, 880 $ MEM( IPW-IPREPAD ), 881 $ WORKSIZ-IPOSTPAD, 882 $ IPREPAD, IPOSTPAD, 883 $ PADVAL ) 884* 885* Solve underdetermined system 886* 887 RESULT( 2 ) = PDQRT14( TRANS, M, N, 888 $ NRHS, 889 $ MEM( IPA ), 1, 890 $ 1, DESCA, 891 $ MEM( IPX ), 892 $ 1, 1, DESCX, 893 $ MEM( IPW ) ) 894 SRESID = RESULT( 2 ) 895* 896 CALL PDCHEKPAD( ICTXT, 'PDQRT14', 897 $ MP, NQ, 898 $ MEM( IPA-IPREPAD ), 899 $ DESCA( LLD_ ), 900 $ IPREPAD, IPOSTPAD, 901 $ PADVAL ) 902 CALL PDCHEKPAD( ICTXT, 'PDQRT14', 903 $ MNP, NRHSQ, 904 $ MEM( IPX-IPREPAD ), 905 $ DESCX( LLD_ ), IPREPAD, 906 $ IPOSTPAD, PADVAL ) 907 CALL PDCHEKPAD( ICTXT, 'PDQRT14', 908 $ WORKSIZ-IPOSTPAD, 1, 909 $ MEM( IPW-IPREPAD ), 910 $ WORKSIZ-IPOSTPAD, 911 $ IPREPAD, IPOSTPAD, 912 $ PADVAL ) 913 END IF 914* 915* Print information about the tests that 916* did not pass the threshold. 917* 918 PASSED = 'PASSED' 919 DO 20 II = 1, 2 920 IF( ( RESULT( II ).GE.THRESH ) .AND. 921 $ ( RESULT( II )-RESULT( II ).EQ.0.0E+0 922 $ ) ) THEN 923 IF( IAM.EQ.0 ) 924 $ WRITE( NOUT, FMT = 9986 )TRANS, 925 $ M, N, NRHS, NB, ITYPE, II, 926 $ RESULT( II ) 927 KFAIL = KFAIL + 1 928 PASSED = 'FAILED' 929 ELSE 930 KPASS = KPASS + 1 931 END IF 932 20 CONTINUE 933* 934 ELSE 935* 936* By-pass the solve check 937* 938 KPASS = KPASS + 1 939 SRESID = SRESID - SRESID 940 PASSED = 'BYPASS' 941* 942 END IF 943* 944* Gather maximum of all CPU and WALL clock 945* timings 946* 947 CALL SLCOMBINE( ICTXT, 'All', '>', 'W', 1, 1, 948 $ WTIME ) 949 CALL SLCOMBINE( ICTXT, 'All', '>', 'C', 1, 1, 950 $ CTIME ) 951* 952* Print results 953* 954 IF( MYROW.EQ.0 .AND. MYCOL.EQ.0 ) THEN 955 ADDFAC = 1 956 MULFAC = 1 957 IF( M.GE.N ) THEN 958* 959* NOPS = DOPLA( 'DGEQRF', M, N, 0, 0, 960* NB ) + DOPLA( 'DORMQR', M, NRHS, N, 961* 0, NB ) 962* 963 MULTS = N*( ( ( 23.D0 / 6.D0 )+M+N / 964 $ 2.D0 )+ N*( M-N / 3.D0 ) ) + 965 $ N*NRHS*( 2.D0*M+2.D0-N ) 966 ADDS = N*( ( 5.D0 / 6.D0 )+N* 967 $ ( 1.D0 / 2.D0+( M-N / 3.D0 ) ) ) 968 $ + N*NRHS*( 2.D0*M+1.D0-N ) 969 ELSE 970* 971* NOPS = DOPLA( 'DGELQF', M, N, 0, 0, 972* NB ) + DOPLA( 'DORMLQ', M, 973* NRHS, N, 0, NB ) 974* 975 MULTS = M*( ( ( 29.D0 / 6.D0 )+2.D0*N-M 976 $ / 2.D0 )+M*( N-M / 3.D0 ) ) 977 $ + N*NRHS*( 2.D0*M+2.D0-N ) 978 ADDS = M*( ( 5.D0 / 6.D0 )+M / 2.D0+M* 979 $ ( N-M / 3.D0 ) ) 980 $ + N*NRHS*( 2.D0*M+1.D0-N ) 981 END IF 982 NOPS = ADDFAC*ADDS + MULFAC*MULTS 983* 984* Calculate total megaflops, for WALL and 985* CPU time, and print output 986* 987* Print WALL time if machine supports it 988* 989 IF( WTIME( 1 ).GT.0.0D+0 ) THEN 990 TMFLOPS = NOPS / ( WTIME( 1 )*1.0D+6 ) 991 ELSE 992 TMFLOPS = 0.0D+0 993 END IF 994* 995 IF( WTIME( 1 ).GE.0.0D+0 ) 996 $ WRITE( NOUT, FMT = 9993 ) 997 $ 'WALL', TRANS, M, N, NB, NRHS, 998 $ NBRHS, NPROW, NPCOL, WTIME( 1 ), 999 $ TMFLOPS, PASSED 1000* 1001* Print CPU time if machine supports it 1002* 1003 IF( CTIME( 1 ).GT.0.0D+0 ) THEN 1004 TMFLOPS = NOPS / ( CTIME( 1 )*1.0D+6 ) 1005 ELSE 1006 TMFLOPS = 0.0D+0 1007 END IF 1008* 1009 IF( CTIME( 1 ).GE.0.0D+0 ) 1010 $ WRITE( NOUT, FMT = 9993 ) 1011 $ 'CPU ', TRANS, M, N, NB, NRHS, 1012 $ NBRHS, NPROW, NPCOL, CTIME( 1 ), 1013 $ TMFLOPS, PASSED 1014 END IF 1015 30 CONTINUE 1016 40 CONTINUE 1017 50 CONTINUE 1018 60 CONTINUE 1019 70 CONTINUE 1020 80 CONTINUE 1021 CALL BLACS_GRIDEXIT( ICTXT ) 1022 90 CONTINUE 1023* 1024* Print out ending messages and close output file 1025* 1026 IF( IAM.EQ.0 ) THEN 1027 KTESTS = KPASS + KFAIL + KSKIP 1028 WRITE( NOUT, FMT = * ) 1029 WRITE( NOUT, FMT = 9992 ) KTESTS 1030 IF( CHECK ) THEN 1031 WRITE( NOUT, FMT = 9991 ) KPASS 1032 WRITE( NOUT, FMT = 9989 ) KFAIL 1033 ELSE 1034 WRITE( NOUT, FMT = 9990 ) KPASS 1035 END IF 1036 WRITE( NOUT, FMT = 9988 ) KSKIP 1037 WRITE( NOUT, FMT = * ) 1038 WRITE( NOUT, FMT = * ) 1039 WRITE( NOUT, FMT = 9987 ) 1040 IF( NOUT.NE.6 .AND. NOUT.NE.0 ) 1041 $ CLOSE ( NOUT ) 1042 END IF 1043* 1044 CALL BLACS_EXIT( 0 ) 1045* 1046 9999 FORMAT( 'ILLEGAL ', A6, ': ', A5, ' = ', I3, 1047 $ '; It should be at least 1' ) 1048 9998 FORMAT( 'ILLEGAL GRID: nprow*npcol = ', I4, '. It can be at most', 1049 $ I4 ) 1050 9997 FORMAT( 'Bad ', A6, ' parameters: going on to next test case.' ) 1051 9996 FORMAT( 'Unable to perform ', A, ': need TOTMEM of at least', 1052 $ I11 ) 1053 9995 FORMAT( 'Time TRANS M N NB NRHS NBRHS P Q ', 1054 $ 'LS Time MFLOPS CHECK' ) 1055 9994 FORMAT( '---- ----- ------ ------ --- ----- ----- ----- ----- ', 1056 $ '--------- -------- ------' ) 1057 9993 FORMAT( A4, 3X, A1, 3X, I6, 1X, I6, 1X, I3, 1X, I5, 1X, I5, 1X, 1058 $ I5, 1X, I5, 1X, F9.2, 1X, F8.2, 1X, A6 ) 1059 9992 FORMAT( 'Finished', I6, ' tests, with the following results:' ) 1060 9991 FORMAT( I5, ' tests completed and passed residual checks.' ) 1061 9990 FORMAT( I5, ' tests completed without checking.' ) 1062 9989 FORMAT( I5, ' tests completed and failed residual checks.' ) 1063 9988 FORMAT( I5, ' tests skipped because of illegal input values.' ) 1064 9987 FORMAT( 'END OF TESTS.' ) 1065 9986 FORMAT( ' TRANS=''', A1, ''', M=', I5, ', N=', I5, ', NRHS=', I4, 1066 $ ', NB=', I4, ', type', I2, ', test(', I2, ')=', G12.5 ) 1067* 1068 STOP 1069* 1070* End of PDLSDRIVER 1071* 1072 END 1073