1 PROGRAM PCLSDRIVER 2* 3* -- ScaLAPACK routine (version 1.7) -- 4* University of Tennessee, Knoxville, Oak Ridge National Laboratory, 5* and University of California, Berkeley. 6* August 14, 2001 7* 8* Purpose 9* ======= 10* 11* PCLSDRIVER is the main test program for the COMPLEX 12* SCALAPACK (full rank) Least Squares routines. This test driver solves 13* full-rank least square problems. 14* 15* The program must be driven by a short data file. An annotated 16* example of a data file can be obtained by deleting the first 3 17* characters from the following 17 lines: 18* 'ScaLapack LS solve input file' 19* 'Intel iPSC/860 hypercube, gamma model.' 20* 'LS.out' output file name (if any) 21* 6 device out 22* 4 number of problems sizes 23* 55 17 31 201 values of M 24* 5 71 31 201 values of N 25* 3 number of NB's 26* 2 3 5 values of NB 27* 3 number of NRHS's 28* 2 3 5 values of NRHS 29* 2 number of NBRHS's 30* 1 2 values of NBRHS 31* 7 number of process grids (ordered P & Q) 32* 1 2 1 4 2 3 8 values of P 33* 7 2 4 1 3 2 1 values of Q 34* 3.0 threshold 35* 36* Internal Parameters 37* =================== 38* 39* TOTMEM INTEGER, default = 6200000. 40* TOTMEM is a machine-specific parameter indicating the 41* maximum amount of available memory in bytes. 42* The user should customize TOTMEM to his platform. Remember 43* to leave room in memory for the operating system, the BLACS 44* buffer, etc. For example, on a system with 8 MB of memory 45* per process (e.g., one processor on an Intel iPSC/860), the 46* parameters we use are TOTMEM=6200000 (leaving 1.8 MB for OS, 47* code, BLACS buffer, etc). However, for PVM, we usually set 48* TOTMEM = 2000000. Some experimenting with the maximum value 49* of TOTMEM may be required. 50* INTGSZ INTEGER, default = 4 bytes. 51* CPLXSZ INTEGER, default = 8 bytes. 52* INTGSZ and CPLXSZ indicate the length in bytes on the 53* given platform for an integer and a single precision 54* complex. 55* MEM COMPLEX array, dimension ( TOTMEM / CPLXSZ ) 56* All arrays used by SCALAPACK routines are allocated from 57* this array and referenced by pointers. The integer IPA, 58* for example, is a pointer to the starting element of MEM for 59* the matrix A. 60* 61* ===================================================================== 62* 63* .. Parameters .. 64 INTEGER BLOCK_CYCLIC_2D, CSRC_, CTXT_, DLEN_, DTYPE_, 65 $ LLD_, MB_, M_, NB_, N_, RSRC_ 66 PARAMETER ( BLOCK_CYCLIC_2D = 1, DLEN_ = 9, DTYPE_ = 1, 67 $ CTXT_ = 2, M_ = 3, N_ = 4, MB_ = 5, NB_ = 6, 68 $ RSRC_ = 7, CSRC_ = 8, LLD_ = 9 ) 69 INTEGER CPLXSZ, MEMSIZ, NTESTS, REALSZ, TOTMEM 70 REAL RZERO, RONE 71 COMPLEX ONE, PADVAL, ZERO 72 PARAMETER ( CPLXSZ = 8, REALSZ = 8, TOTMEM = 2000000, 73 $ MEMSIZ = TOTMEM / CPLXSZ, NTESTS = 20, 74 $ PADVAL = ( -9923.0E+0, -9923.0E+0 ) ) 75 PARAMETER ( ONE = ( 1.0E+0, 0.0E+0 ), RZERO = 0.0E+0, 76 $ RONE = 1.0E+0, ZERO = ( 0.0E+0, 0.0E+0 ) ) 77* .. 78* .. Local Scalars .. 79 LOGICAL CHECK, TPSD 80 CHARACTER TRANS 81 CHARACTER*6 PASSED 82 CHARACTER*80 OUTFILE 83 INTEGER HH, I, IAM, IASEED, IBSEED, ICTXT, II, IMIDPAD, 84 $ INFO, IPA, IPB, IPOSTPAD, IPREPAD, IPW, IPW2, 85 $ IPX, ISCALE, ITRAN, ITYPE, J, JJ, K, KFAIL, KK, 86 $ KPASS, KSKIP, KTESTS, LCM, LCMP, LTAU, LWF, 87 $ LWORK, LWS, M, MNP, MNRHSP, MP, MQ, MYCOL, 88 $ MYROW, N, NB, NBRHS, NCOLS, NGRIDS, NMAT, NNB, 89 $ NNBR, NNR, NNRHSQ, NOUT, NP, NPCOL, NPROCS, 90 $ NPROW, NROWS, NQ, NRHS, NRHSP, NRHSQ, WORKSIZ 91 REAL ANORM, BNORM, SRESID, THRESH 92 DOUBLE PRECISION ADDFAC, ADDS, MULFAC, MULTS, NOPS, TMFLOPS 93* .. 94* .. Local Arrays .. 95 INTEGER DESCA( DLEN_ ), DESCB( DLEN_ ), DESCW( LLD_ ), 96 $ DESCX( DLEN_ ), IERR( 2 ), MVAL( NTESTS ), 97 $ NBRVAL( NTESTS ), NBVAL( NTESTS ), 98 $ NRVAL( NTESTS ), NVAL( NTESTS ), 99 $ PVAL( NTESTS ), QVAL( NTESTS ) 100 REAL RESULT( 2 ) 101 DOUBLE PRECISION CTIME( 1 ), WTIME( 1 ) 102 COMPLEX MEM( MEMSIZ ) 103* .. 104* .. External Subroutines .. 105 EXTERNAL BLACS_BARRIER, BLACS_EXIT, BLACS_GET, 106 $ BLACS_GRIDEXIT, BLACS_GRIDINFO, BLACS_GRIDINIT, 107 $ BLACS_PINFO, DESCINIT, IGSUM2D, PCCHEKPAD, 108 $ PCFILLPAD, PCGELS, PCGEMM, PCLACPY, 109 $ PCLSINFO, PCMATGEN, PSCNRM2, 110 $ PCSSCAL, PCQRT13, PCQRT16, SLBOOT, 111 $ SLCOMBINE, SLTIMER 112* .. 113* .. External Functions .. 114 LOGICAL LSAME 115 INTEGER ICEIL, ILCM, NUMROC 116 REAL PCLANGE, PCQRT14, PCQRT17 117 EXTERNAL ICEIL, ILCM, LSAME, NUMROC, PCLANGE, 118 $ PCQRT14, PCQRT17 119* .. 120* .. Intrinsic Functions .. 121 INTRINSIC MAX, MIN 122* .. 123* .. Data Statements .. 124 DATA KTESTS, KPASS, KFAIL, KSKIP / 4*0 / 125* .. 126* .. Executable Statements .. 127* 128* Get starting information 129* 130 CALL BLACS_PINFO( IAM, NPROCS ) 131* 132 IASEED = 100 133 IBSEED = 200 134 CALL PCLSINFO( OUTFILE, NOUT, NMAT, MVAL, NTESTS, NVAL, 135 $ NTESTS, NNB, NBVAL, NTESTS, NNR, NRVAL, NTESTS, 136 $ NNBR, NBRVAL, NTESTS, NGRIDS, PVAL, NTESTS, QVAL, 137 $ NTESTS, THRESH, MEM, IAM, NPROCS ) 138 CHECK = ( THRESH.GE.0.0E+0 ) 139* 140* Print headings 141* 142 IF( IAM.EQ.0 ) THEN 143 WRITE( NOUT, FMT = * ) 144 WRITE( NOUT, FMT = 9995 ) 145 WRITE( NOUT, FMT = 9994 ) 146 WRITE( NOUT, FMT = * ) 147 END IF 148* 149* Loop over different process grids 150* 151 DO 90 I = 1, NGRIDS 152* 153 NPROW = PVAL( I ) 154 NPCOL = QVAL( I ) 155* 156* Make sure grid information is correct 157* 158 IERR( 1 ) = 0 159 IF( NPROW.LT.1 ) THEN 160 IF( IAM.EQ.0 ) 161 $ WRITE( NOUT, FMT = 9999 ) 'GRID', 'nprow', NPROW 162 IERR( 1 ) = 1 163 ELSE IF( NPCOL.LT.1 ) THEN 164 IF( IAM.EQ.0 ) 165 $ WRITE( NOUT, FMT = 9999 ) 'GRID', 'npcol', NPCOL 166 IERR( 1 ) = 1 167 ELSE IF( NPROW*NPCOL.GT.NPROCS ) THEN 168 IF( IAM.EQ.0 ) 169 $ WRITE( NOUT, FMT = 9998 ) NPROW*NPCOL, NPROCS 170 IERR( 1 ) = 1 171 END IF 172* 173 IF( IERR( 1 ).GT.0 ) THEN 174 IF( IAM.EQ.0 ) 175 $ WRITE( NOUT, FMT = 9997 ) 'grid' 176 KSKIP = KSKIP + 1 177 GO TO 90 178 END IF 179* 180* Define process grid 181* 182 CALL BLACS_GET( -1, 0, ICTXT ) 183 CALL BLACS_GRIDINIT( ICTXT, 'Row-major', NPROW, NPCOL ) 184 CALL BLACS_GRIDINFO( ICTXT, NPROW, NPCOL, MYROW, MYCOL ) 185* 186* Go to bottom of loop if this case doesn't use my process 187* 188 IF( ( MYROW.GE.NPROW ).OR.( MYCOL.GE.NPCOL ) ) 189 $ GO TO 90 190* 191 DO 80 J = 1, NMAT 192* 193 M = MVAL( J ) 194 N = NVAL( J ) 195* 196* Make sure matrix information is correct 197* 198 IERR( 1 ) = 0 199 IF( M.LT.1 ) THEN 200 IF( IAM.EQ.0 ) 201 $ WRITE( NOUT, FMT = 9999 ) 'MATRIX', 'M', M 202 IERR( 1 ) = 1 203 ELSE IF( N.LT.1 ) THEN 204 IF( IAM.EQ.0 ) 205 $ WRITE( NOUT, FMT = 9999 ) 'MATRIX', 'N', N 206 IERR( 1 ) = 1 207 END IF 208* 209* Make sure no one had error 210* 211 CALL IGSUM2D( ICTXT, 'All', ' ', 1, 1, IERR, 1, -1, 0 ) 212* 213 IF( IERR( 1 ).GT.0 ) THEN 214 IF( IAM.EQ.0 ) 215 $ WRITE( NOUT, FMT = 9997 ) 'matrix' 216 KSKIP = KSKIP + 1 217 GO TO 80 218 END IF 219* 220* Loop over different blocking sizes 221* 222 DO 70 K = 1, NNB 223* 224 NB = NBVAL( K ) 225* 226* Make sure nb is legal 227* 228 IERR( 1 ) = 0 229 IF( NB.LT.1 ) THEN 230 IERR( 1 ) = 1 231 IF( IAM.EQ.0 ) 232 $ WRITE( NOUT, FMT = 9999 ) 'NB', 'NB', NB 233 END IF 234* 235* Check all processes for an error 236* 237 CALL IGSUM2D( ICTXT, 'All', ' ', 1, 1, IERR, 1, -1, 0 ) 238* 239 IF( IERR( 1 ).GT.0 ) THEN 240 IF( IAM.EQ.0 ) 241 $ WRITE( NOUT, FMT = 9997 ) 'NB' 242 KSKIP = KSKIP + 1 243 GO TO 70 244 END IF 245* 246* Padding constants 247* 248 MP = NUMROC( M, NB, MYROW, 0, NPROW ) 249 MQ = NUMROC( M, NB, MYCOL, 0, NPCOL ) 250 NP = NUMROC( N, NB, MYROW, 0, NPROW ) 251 MNP = MAX( MP, NP ) 252 NQ = NUMROC( N, NB, MYCOL, 0, NPCOL ) 253* 254 IF( CHECK ) THEN 255 IPREPAD = MAX( NB, MP ) 256 IMIDPAD = NB 257 IPOSTPAD = MAX( NB, NQ ) 258 ELSE 259 IPREPAD = 0 260 IMIDPAD = 0 261 IPOSTPAD = 0 262 END IF 263* 264* Initialize the array descriptor for the matrix A 265* 266 CALL DESCINIT( DESCA, M, N, NB, NB, 0, 0, ICTXT, 267 $ MAX( 1, MP ) + IMIDPAD, IERR( 1 ) ) 268* 269* Check all processes for an error 270* 271 CALL IGSUM2D( ICTXT, 'All', ' ', 1, 1, IERR, 1, -1, 0 ) 272* 273 IF( IERR( 1 ).LT.0 ) THEN 274 IF( IAM.EQ.0 ) 275 $ WRITE( NOUT, FMT = 9997 ) 'descriptor' 276 KSKIP = KSKIP + 1 277 GO TO 70 278 END IF 279* 280 DO 60 ISCALE = 1, 3 281* 282 ITYPE = ISCALE 283* 284* Assign pointers into MEM for SCALAPACK arrays, A is 285* allocated starting at position MEM( IPREPAD+1 ) 286* 287 IPA = IPREPAD + 1 288 IPX = IPA + DESCA( LLD_ )*NQ + IPOSTPAD + IPREPAD 289 IPW = IPX 290* 291 WORKSIZ = NQ + IPOSTPAD 292* 293* Check for adequate memory for problem size 294* 295 IERR( 1 ) = 0 296 IF( ( IPW+WORKSIZ ).GT.MEMSIZ ) THEN 297 IF( IAM.EQ.0 ) 298 $ WRITE( NOUT, FMT = 9996 ) 'MEMORY', 299 $ ( IPX+WORKSIZ )*CPLXSZ 300 IERR( 1 ) = 1 301 END IF 302* 303* Check all processes for an error 304* 305 CALL IGSUM2D( ICTXT, 'All', ' ', 1, 1, IERR, 1, -1, 306 $ 0 ) 307* 308 IF( IERR( 1 ).GT.0 ) THEN 309 IF( IAM.EQ.0 ) 310 $ WRITE( NOUT, FMT = 9997 ) 'MEMORY' 311 KSKIP = KSKIP + 1 312 GO TO 70 313 END IF 314* 315 IF( CHECK ) THEN 316 CALL PCFILLPAD( ICTXT, MP, NQ, MEM( IPA-IPREPAD ), 317 $ DESCA( LLD_ ), IPREPAD, IPOSTPAD, 318 $ PADVAL ) 319 CALL PCFILLPAD( ICTXT, WORKSIZ-IPOSTPAD, 1, 320 $ MEM( IPW-IPREPAD ), 321 $ WORKSIZ-IPOSTPAD, IPREPAD, 322 $ IPOSTPAD, PADVAL ) 323 END IF 324* 325* Generate the matrix A and calculate its 1-norm 326* 327 CALL PCQRT13( ISCALE, M, N, MEM( IPA ), 1, 1, 328 $ DESCA, ANORM, IASEED, MEM( IPW ) ) 329* 330 IF( CHECK ) THEN 331 CALL PCCHEKPAD( ICTXT, 'PCQRT13', MP, NQ, 332 $ MEM( IPA-IPREPAD ), DESCA( LLD_ ), 333 $ IPREPAD, IPOSTPAD, PADVAL ) 334 CALL PCCHEKPAD( ICTXT, 'PCQRT13', 335 $ WORKSIZ-IPOSTPAD, 1, 336 $ MEM( IPW-IPREPAD ), 337 $ WORKSIZ-IPOSTPAD, IPREPAD, 338 $ IPOSTPAD, PADVAL ) 339 END IF 340* 341 DO 50 ITRAN = 1, 2 342* 343 IF( ITRAN.EQ.1 ) THEN 344 NROWS = M 345 NCOLS = N 346 TRANS = 'N' 347 TPSD = .FALSE. 348 ELSE 349 NROWS = N 350 NCOLS = M 351 TRANS = 'C' 352 TPSD = .TRUE. 353 END IF 354* 355* Loop over the different values for NRHS 356* 357 DO 40 HH = 1, NNR 358* 359 NRHS = NRVAL( HH ) 360* 361 DO 30 KK = 1, NNBR 362* 363 NBRHS = NBRVAL( KK ) 364* 365 NRHSP = NUMROC( NRHS, NBRHS, MYROW, 0, 366 $ NPROW ) 367 NRHSQ = NUMROC( NRHS, NBRHS, MYCOL, 0, 368 $ NPCOL ) 369* 370* Define Array descriptor for rhs MAX(M,N)xNRHS 371* 372 CALL DESCINIT( DESCX, MAX( M, N ), NRHS, NB, 373 $ NBRHS, 0, 0, ICTXT, 374 $ MAX( 1, MNP ) + IMIDPAD, 375 $ IERR( 1 ) ) 376 IF( TPSD ) THEN 377 CALL DESCINIT( DESCW, M, NRHS, NB, NBRHS, 378 $ 0, 0, ICTXT, MAX( 1, MP ) + 379 $ IMIDPAD, IERR( 2 ) ) 380 ELSE 381 CALL DESCINIT( DESCW, N, NRHS, NB, NBRHS, 382 $ 0, 0, ICTXT, MAX( 1, NP ) + 383 $ IMIDPAD, IERR( 2 ) ) 384 END IF 385* 386* Check all processes for an error 387* 388 CALL IGSUM2D( ICTXT, 'All', ' ', 2, 1, IERR, 389 $ 2, -1, 0 ) 390* 391 IF( IERR( 1 ).LT.0 .OR. IERR( 2 ).LT.0 ) THEN 392 IF( IAM.EQ.0 ) 393 $ WRITE( NOUT, FMT = 9997 ) 'descriptor' 394 KSKIP = KSKIP + 1 395 GO TO 30 396 END IF 397* 398* Check for enough memory 399* 400 IPX = IPA + DESCA( LLD_ )*NQ + IPOSTPAD + 401 $ IPREPAD 402 IPW = IPX + DESCX( LLD_ )*NRHSQ + IPOSTPAD + 403 $ IPREPAD 404 WORKSIZ = DESCW( LLD_ )*NRHSQ + IPOSTPAD 405* 406 IERR( 1 ) = 0 407 IF( IPW+WORKSIZ.GT.MEMSIZ ) THEN 408 IF( IAM.EQ.0 ) 409 $ WRITE( NOUT, FMT = 9996 ) 'Generation', 410 $ ( IPW+WORKSIZ )*CPLXSZ 411 IERR( 1 ) = 1 412 END IF 413* 414* Check all processes for an error 415* 416 CALL IGSUM2D( ICTXT, 'All', ' ', 1, 1, IERR, 417 $ 1, -1, 0 ) 418* 419 IF( IERR( 1 ).GT.0 ) THEN 420 IF( IAM.EQ.0 ) 421 $ WRITE( NOUT, FMT = 9997 ) 'MEMORY' 422 KSKIP = KSKIP + 1 423 GO TO 30 424 END IF 425* 426* Generate RHS 427* 428 IF( TPSD ) THEN 429 CALL PCMATGEN( ICTXT, 'No', 'No', 430 $ DESCW( M_ ), DESCW( N_ ), 431 $ DESCW( MB_ ), DESCW( NB_ ), 432 $ MEM( IPW ), DESCW( LLD_ ), 433 $ DESCW( RSRC_ ), 434 $ DESCW( CSRC_ ), IBSEED, 0, 435 $ MP, 0, NRHSQ, MYROW, MYCOL, 436 $ NPROW, NPCOL ) 437 ELSE 438 CALL PCMATGEN( ICTXT, 'No', 'No', 439 $ DESCW( M_ ), DESCW( N_ ), 440 $ DESCW( MB_ ), DESCW( NB_ ), 441 $ MEM( IPW ), DESCW( LLD_ ), 442 $ DESCW( RSRC_ ), 443 $ DESCW( CSRC_ ), IBSEED, 0, 444 $ NP, 0, NRHSQ, MYROW, MYCOL, 445 $ NPROW, NPCOL ) 446 END IF 447* 448 IF( CHECK ) THEN 449 CALL PCFILLPAD( ICTXT, MNP, NRHSQ, 450 $ MEM( IPX-IPREPAD ), 451 $ DESCX( LLD_ ), IPREPAD, 452 $ IPOSTPAD, PADVAL ) 453 IF( TPSD ) THEN 454 CALL PCFILLPAD( ICTXT, MP, NRHSQ, 455 $ MEM( IPW-IPREPAD ), 456 $ DESCW( LLD_ ), IPREPAD, 457 $ IPOSTPAD, PADVAL ) 458 ELSE 459 CALL PCFILLPAD( ICTXT, NP, NRHSQ, 460 $ MEM( IPW-IPREPAD ), 461 $ DESCW( LLD_ ), IPREPAD, 462 $ IPOSTPAD, PADVAL ) 463 END IF 464 END IF 465* 466 DO 10 JJ = 1, NRHS 467 CALL PSCNRM2( NCOLS, BNORM, MEM( IPW ), 468 $ 1, JJ, DESCW, 1 ) 469 IF( BNORM.GT.RZERO ) 470 $ CALL PCSSCAL( NCOLS, RONE / BNORM, 471 $ MEM( IPW ), 1, JJ, DESCW, 472 $ 1 ) 473 10 CONTINUE 474* 475 CALL PCGEMM( TRANS, 'N', NROWS, NRHS, NCOLS, 476 $ ONE, MEM( IPA ), 1, 1, DESCA, 477 $ MEM( IPW ), 1, 1, DESCW, ZERO, 478 $ MEM( IPX ), 1, 1, DESCX ) 479* 480 IF( CHECK ) THEN 481* 482* check for memory overwrite 483* 484 CALL PCCHEKPAD( ICTXT, 'Generation', MP, 485 $ NQ, MEM( IPA-IPREPAD ), 486 $ DESCA( LLD_ ), IPREPAD, 487 $ IPOSTPAD, PADVAL ) 488 CALL PCCHEKPAD( ICTXT, 'Generation', MNP, 489 $ NRHSQ, MEM( IPX-IPREPAD ), 490 $ DESCX( LLD_ ), IPREPAD, 491 $ IPOSTPAD, PADVAL ) 492 IF( TPSD ) THEN 493 CALL PCCHEKPAD( ICTXT, 'Generation', 494 $ MP, NRHSQ, 495 $ MEM( IPW-IPREPAD ), 496 $ DESCW( LLD_ ), IPREPAD, 497 $ IPOSTPAD, PADVAL ) 498 ELSE 499 CALL PCCHEKPAD( ICTXT, 'Generation', 500 $ NP, NRHSQ, 501 $ MEM( IPW-IPREPAD ), 502 $ DESCW( LLD_ ), IPREPAD, 503 $ IPOSTPAD, PADVAL ) 504 END IF 505* 506* Allocate space for copy of rhs 507* 508 IPB = IPW 509* 510 IF( TPSD ) THEN 511 CALL DESCINIT( DESCB, N, NRHS, NB, 512 $ NBRHS, 0, 0, ICTXT, 513 $ MAX( 1, NP ) + IMIDPAD, 514 $ IERR( 1 ) ) 515 ELSE 516 CALL DESCINIT( DESCB, M, NRHS, NB, 517 $ NBRHS, 0, 0, ICTXT, 518 $ MAX( 1, MP ) + IMIDPAD, 519 $ IERR( 1 ) ) 520 END IF 521* 522* Check all processes for an error 523* 524 CALL IGSUM2D( ICTXT, 'All', ' ', 1, 1, 525 $ IERR, 1, -1, 0 ) 526* 527 IF( IERR( 1 ).LT.0 ) THEN 528 IF( IAM.EQ.0 ) 529 $ WRITE( NOUT, FMT = 9997 ) 530 $ 'descriptor' 531 KSKIP = KSKIP + 1 532 GO TO 30 533 END IF 534* 535 IPW = IPB + DESCB( LLD_ )*NRHSQ + 536 $ IPOSTPAD + IPREPAD 537* 538 END IF 539* 540* Calculate the amount of workspace for PCGELS 541* 542 IF( M.GE.N ) THEN 543 LTAU = NUMROC( MIN(M,N), NB, MYCOL, 0, 544 $ NPCOL ) 545 LWF = NB * ( MP + NQ + NB ) 546 LWS = MAX( ( NB*( NB - 1 ) ) / 2, 547 $ ( MP + NRHSQ ) * NB ) + NB*NB 548 ELSE 549 LCM = ILCM( NPROW, NPCOL ) 550 LCMP = LCM / NPROW 551 LTAU = NUMROC( MIN(M,N), NB, MYROW, 0, 552 $ NPROW ) 553 LWF = NB * ( MP + NQ + NB ) 554 LWS = MAX( ( NB*( NB - 1 ) ) / 2, ( NP + 555 $ MAX( NQ + NUMROC( NUMROC( N, NB, 0, 556 $ 0, NPROW ), NB, 0, 0, LCMP ), 557 $ NRHSQ ) ) * NB ) + NB*NB 558 END IF 559* 560 LWORK = LTAU + MAX( LWF, LWS ) 561 WORKSIZ = LWORK + IPOSTPAD 562* 563* Check for adequate memory for problem size 564* 565 IERR( 1 ) = 0 566 IF( IPW+WORKSIZ.GT.MEMSIZ ) THEN 567 IF( IAM.EQ.0 ) 568 $ WRITE( NOUT, FMT = 9996 ) 'solve', 569 $ ( IPW+WORKSIZ )*CPLXSZ 570 IERR( 1 ) = 1 571 END IF 572* 573* Check all processes for an error 574* 575 CALL IGSUM2D( ICTXT, 'All', ' ', 1, 1, IERR, 576 $ 1, -1, 0 ) 577* 578 IF( IERR( 1 ).GT.0 ) THEN 579 IF( IAM.EQ.0 ) 580 $ WRITE( NOUT, FMT = 9997 ) 'MEMORY' 581 KSKIP = KSKIP + 1 582 GO TO 30 583 END IF 584* 585 IF( CHECK ) THEN 586* 587* Make the copy of the right hand side 588* 589 CALL PCLACPY( 'All', NROWS, NRHS, 590 $ MEM( IPX ), 1, 1, DESCX, 591 $ MEM( IPB ), 1, 1, DESCB ) 592* 593 IF( TPSD ) THEN 594 CALL PCFILLPAD( ICTXT, NP, NRHSQ, 595 $ MEM( IPB-IPREPAD ), 596 $ DESCB( LLD_ ), IPREPAD, 597 $ IPOSTPAD, PADVAL ) 598 ELSE 599 CALL PCFILLPAD( ICTXT, MP, NRHSQ, 600 $ MEM( IPB-IPREPAD ), 601 $ DESCB( LLD_ ), IPREPAD, 602 $ IPOSTPAD, PADVAL ) 603 END IF 604 CALL PCFILLPAD( ICTXT, LWORK, 1, 605 $ MEM( IPW-IPREPAD ), 606 $ LWORK, IPREPAD, 607 $ IPOSTPAD, PADVAL ) 608 END IF 609* 610 CALL SLBOOT( ) 611 CALL BLACS_BARRIER( ICTXT, 'All' ) 612 CALL SLTIMER( 1 ) 613* 614* Solve the LS or overdetermined system 615* 616 CALL PCGELS( TRANS, M, N, NRHS, MEM( IPA ), 617 $ 1, 1, DESCA, MEM( IPX ), 1, 1, 618 $ DESCX, MEM( IPW ), LWORK, INFO ) 619* 620 CALL SLTIMER( 1 ) 621* 622 IF( CHECK ) THEN 623* 624* check for memory overwrite 625* 626 CALL PCCHEKPAD( ICTXT, 'PCGELS', MP, 627 $ NQ, MEM( IPA-IPREPAD ), 628 $ DESCA( LLD_ ), IPREPAD, 629 $ IPOSTPAD, PADVAL ) 630 CALL PCCHEKPAD( ICTXT, 'PCGELS', MNP, 631 $ NRHSQ, MEM( IPX-IPREPAD ), 632 $ DESCX( LLD_ ), IPREPAD, 633 $ IPOSTPAD, PADVAL ) 634 CALL PCCHEKPAD( ICTXT, 'PCGELS', LWORK, 635 $ 1, MEM( IPW-IPREPAD ), 636 $ LWORK, IPREPAD, 637 $ IPOSTPAD, PADVAL ) 638 END IF 639* 640* Regenerate A in place for testing and next 641* iteration 642* 643 CALL PCQRT13( ISCALE, M, N, MEM( IPA ), 1, 1, 644 $ DESCA, ANORM, IASEED, 645 $ MEM( IPW ) ) 646* 647* check the solution to rhs 648* 649 IF( CHECK ) THEN 650* 651* Am I going to call PCQRT17 ? 652* 653 IF( ( M.GE.N .AND. ( .NOT.TPSD ) ) .OR. 654 $ ( M.LT.N .AND. TPSD ) ) THEN 655* 656* Call PCQRT17 first, A, X, and B remain 657* unchanged. Solving LS system 658* 659* Check amount of memory for PCQRT17 660* 661 IF( TPSD ) THEN 662 WORKSIZ = NP*NRHSQ + NRHSP*MQ 663 IPW2 = IPW + WORKSIZ 664 WORKSIZ = WORKSIZ + 665 $ ICEIL( REALSZ*MAX( NQ, MAX( 666 $ MQ, NRHSQ ) ), CPLXSZ ) + 667 $ IPOSTPAD 668 ELSE 669 WORKSIZ = MP*NRHSQ + NRHSP*NQ 670 IPW2 = IPW + WORKSIZ 671 WORKSIZ = WORKSIZ + 672 $ ICEIL( REALSZ*MAX( NQ, 673 $ NRHSQ ), CPLXSZ ) + 674 $ IPOSTPAD 675 END IF 676* 677* Check for adequate memory for problem 678* size 679* 680 IERR( 1 ) = 0 681 IF( ( IPW+WORKSIZ ).GT.MEMSIZ ) THEN 682 IF( IAM.EQ.0 ) 683 $ WRITE( NOUT, FMT = 9996 ) 684 $ 'MEMORY', ( IPW+WORKSIZ )*CPLXSZ 685 IERR( 1 ) = 1 686 END IF 687* 688* Check all processes for an error 689* 690 CALL IGSUM2D( ICTXT, 'All', ' ', 1, 1, 691 $ IERR, 1, -1, 0 ) 692* 693 IF( IERR( 1 ).GT.0 ) THEN 694 IF( IAM.EQ.0 ) 695 $ WRITE( NOUT, FMT = 9997 ) 696 $ 'MEMORY' 697 KSKIP = KSKIP + 1 698 GO TO 30 699 END IF 700* 701 CALL PCFILLPAD( ICTXT, 702 $ WORKSIZ-IPOSTPAD, 1, 703 $ MEM( IPW-IPREPAD ), 704 $ WORKSIZ-IPOSTPAD, 705 $ IPREPAD, IPOSTPAD, 706 $ PADVAL ) 707* 708 RESULT( 2 ) = PCQRT17( TRANS, 1, M, N, 709 $ NRHS, 710 $ MEM( IPA ), 711 $ 1, 1, DESCA, 712 $ MEM( IPX ), 1, 713 $ 1, DESCX, 714 $ MEM( IPB ), 715 $ 1, 1, DESCB, 716 $ MEM( IPW ), 717 $ MEM( IPW2 ) ) 718 SRESID = RESULT( 2 ) 719* 720 CALL PCCHEKPAD( ICTXT, 'PCQRT17', 721 $ MP, NQ, 722 $ MEM( IPA-IPREPAD ), 723 $ DESCA( LLD_ ), 724 $ IPREPAD, IPOSTPAD, 725 $ PADVAL ) 726 CALL PCCHEKPAD( ICTXT, 'PCQRT17', 727 $ MNP, NRHSQ, 728 $ MEM( IPX-IPREPAD ), 729 $ DESCX( LLD_ ), IPREPAD, 730 $ IPOSTPAD, PADVAL ) 731 IF( TPSD ) THEN 732 CALL PCCHEKPAD( ICTXT, 'PCQRT17', 733 $ NP, NRHSQ, 734 $ MEM( IPB-IPREPAD ), 735 $ DESCB( LLD_ ), 736 $ IPREPAD, IPOSTPAD, 737 $ PADVAL ) 738 ELSE 739 CALL PCCHEKPAD( ICTXT, 'PCQRT17', 740 $ MP, NRHSQ, 741 $ MEM( IPB-IPREPAD ), 742 $ DESCB( LLD_ ), 743 $ IPREPAD, IPOSTPAD, 744 $ PADVAL ) 745 END IF 746 CALL PCCHEKPAD( ICTXT, 'PCQRT17', 747 $ WORKSIZ-IPOSTPAD, 1, 748 $ MEM( IPW-IPREPAD ), 749 $ WORKSIZ-IPOSTPAD, 750 $ IPREPAD, IPOSTPAD, 751 $ PADVAL ) 752 END IF 753* 754* Call PCQRT16, B will be destroyed. 755* 756 IF( TPSD ) THEN 757 WORKSIZ = MP + IPOSTPAD 758 ELSE 759 WORKSIZ = NQ + IPOSTPAD 760 END IF 761* 762* Check for adequate memory for problem size 763* 764 IERR( 1 ) = 0 765 IF( ( IPW+WORKSIZ ).GT.MEMSIZ ) THEN 766 IF( IAM.EQ.0 ) 767 $ WRITE( NOUT, FMT = 9996 ) 'MEMORY', 768 $ ( IPW+WORKSIZ )*CPLXSZ 769 IERR( 1 ) = 1 770 END IF 771* 772* Check all processes for an error 773* 774 CALL IGSUM2D( ICTXT, 'All', ' ', 1, 1, 775 $ IERR, 1, -1, 0 ) 776* 777 IF( IERR( 1 ).GT.0 ) THEN 778 IF( IAM.EQ.0 ) 779 $ WRITE( NOUT, FMT = 9997 ) 'MEMORY' 780 KSKIP = KSKIP + 1 781 GO TO 30 782 END IF 783* 784 CALL PCFILLPAD( ICTXT, 785 $ WORKSIZ-IPOSTPAD, 1, 786 $ MEM( IPW-IPREPAD ), 787 $ WORKSIZ-IPOSTPAD, 788 $ IPREPAD, IPOSTPAD, 789 $ PADVAL ) 790* 791 CALL PCQRT16( TRANS, M, N, NRHS, 792 $ MEM( IPA ), 1, 1, DESCA, 793 $ MEM( IPX ), 1, 1, DESCX, 794 $ MEM( IPB ), 1, 1, DESCB, 795 $ MEM( IPW ), RESULT( 1 ) ) 796* 797 CALL PCCHEKPAD( ICTXT, 'PCQRT16', 798 $ MP, NQ, 799 $ MEM( IPA-IPREPAD ), 800 $ DESCA( LLD_ ), 801 $ IPREPAD, IPOSTPAD, 802 $ PADVAL ) 803 CALL PCCHEKPAD( ICTXT, 'PCQRT16', 804 $ MNP, NRHSQ, 805 $ MEM( IPX-IPREPAD ), 806 $ DESCX( LLD_ ), IPREPAD, 807 $ IPOSTPAD, PADVAL ) 808 IF( TPSD ) THEN 809 CALL PCCHEKPAD( ICTXT, 'PCQRT16', 810 $ NP, NRHSQ, 811 $ MEM( IPB-IPREPAD ), 812 $ DESCB( LLD_ ), 813 $ IPREPAD, IPOSTPAD, 814 $ PADVAL ) 815 ELSE 816 CALL PCCHEKPAD( ICTXT, 'PCQRT16', 817 $ MP, NRHSQ, 818 $ MEM( IPB-IPREPAD ), 819 $ DESCB( LLD_ ), 820 $ IPREPAD, IPOSTPAD, 821 $ PADVAL ) 822 END IF 823 CALL PCCHEKPAD( ICTXT, 'PCQRT16', 824 $ WORKSIZ-IPOSTPAD, 1, 825 $ MEM( IPW-IPREPAD ), 826 $ WORKSIZ-IPOSTPAD, 827 $ IPREPAD, IPOSTPAD, 828 $ PADVAL ) 829* 830* Call PCQRT14 831* 832 IF( ( M.GE.N .AND. TPSD ) .OR. 833 $ ( M.LT.N .AND. ( .NOT.TPSD ) ) ) THEN 834* 835 IPW = IPB 836* 837 IF( TPSD ) THEN 838* 839 NNRHSQ = NUMROC( N+NRHS, NB, MYCOL, 840 $ 0, NPCOL ) 841 LTAU = NUMROC( MIN( M, N+NRHS ), NB, 842 $ MYCOL, 0, NPCOL ) 843 LWF = NB * ( NB + MP + NNRHSQ ) 844 WORKSIZ = MP * NNRHSQ + LTAU + LWF + 845 $ IPOSTPAD 846* 847 ELSE 848* 849 MNRHSP = NUMROC( M+NRHS, NB, MYROW, 850 $ 0, NPROW ) 851 LTAU = NUMROC( MIN( M+NRHS, N ), NB, 852 $ MYROW, 0, NPROW ) 853 LWF = NB * ( NB + MNRHSP + NQ ) 854 WORKSIZ = MNRHSP * NQ + LTAU + LWF + 855 $ IPOSTPAD 856* 857 END IF 858* 859* Check for adequate memory for problem 860* size 861* 862 IERR( 1 ) = 0 863 IF( ( IPW+WORKSIZ ).GT.MEMSIZ ) THEN 864 IF( IAM.EQ.0 ) 865 $ WRITE( NOUT, FMT = 9996 ) 866 $ 'MEMORY', ( IPW+WORKSIZ )*CPLXSZ 867 IERR( 1 ) = 1 868 END IF 869* 870* Check all processes for an error 871* 872 CALL IGSUM2D( ICTXT, 'All', ' ', 1, 1, 873 $ IERR, 1, -1, 0 ) 874* 875 IF( IERR( 1 ).GT.0 ) THEN 876 IF( IAM.EQ.0 ) 877 $ WRITE( NOUT, FMT = 9997 ) 878 $ 'MEMORY' 879 KSKIP = KSKIP + 1 880 GO TO 30 881 END IF 882* 883 CALL PCFILLPAD( ICTXT, 884 $ WORKSIZ-IPOSTPAD, 1, 885 $ MEM( IPW-IPREPAD ), 886 $ WORKSIZ-IPOSTPAD, 887 $ IPREPAD, IPOSTPAD, 888 $ PADVAL ) 889* 890* Solve underdetermined system 891* 892 RESULT( 2 ) = PCQRT14( TRANS, M, N, 893 $ NRHS, 894 $ MEM( IPA ), 1, 895 $ 1, DESCA, 896 $ MEM( IPX ), 897 $ 1, 1, DESCX, 898 $ MEM( IPW ) ) 899 SRESID = RESULT( 2 ) 900* 901 CALL PCCHEKPAD( ICTXT, 'PCQRT14', 902 $ MP, NQ, 903 $ MEM( IPA-IPREPAD ), 904 $ DESCA( LLD_ ), 905 $ IPREPAD, IPOSTPAD, 906 $ PADVAL ) 907 CALL PCCHEKPAD( ICTXT, 'PCQRT14', 908 $ MNP, NRHSQ, 909 $ MEM( IPX-IPREPAD ), 910 $ DESCX( LLD_ ), IPREPAD, 911 $ IPOSTPAD, PADVAL ) 912 CALL PCCHEKPAD( ICTXT, 'PCQRT14', 913 $ WORKSIZ-IPOSTPAD, 1, 914 $ MEM( IPW-IPREPAD ), 915 $ WORKSIZ-IPOSTPAD, 916 $ IPREPAD, IPOSTPAD, 917 $ PADVAL ) 918 END IF 919* 920* Print information about the tests that 921* did not pass the threshold. 922* 923 PASSED = 'PASSED' 924 DO 20 II = 1, 2 925 IF( ( RESULT( II ).GE.THRESH ) .AND. 926 $ ( RESULT( II )-RESULT( II ).EQ.0.0E+0 927 $ ) ) THEN 928 IF( IAM.EQ.0 ) 929 $ WRITE( NOUT, FMT = 9986 )TRANS, 930 $ M, N, NRHS, NB, ITYPE, II, 931 $ RESULT( II ) 932 KFAIL = KFAIL + 1 933 PASSED = 'FAILED' 934 ELSE 935 KPASS = KPASS + 1 936 END IF 937 20 CONTINUE 938* 939 ELSE 940* 941* By-pass the solve check 942* 943 KPASS = KPASS + 1 944 SRESID = SRESID - SRESID 945 PASSED = 'BYPASS' 946* 947 END IF 948* 949* Gather maximum of all CPU and WALL clock 950* timings 951* 952 CALL SLCOMBINE( ICTXT, 'All', '>', 'W', 1, 1, 953 $ WTIME ) 954 CALL SLCOMBINE( ICTXT, 'All', '>', 'C', 1, 1, 955 $ CTIME ) 956* 957* Print results 958* 959 IF( MYROW.EQ.0 .AND. MYCOL.EQ.0 ) THEN 960 ADDFAC = 2 961 MULFAC = 6 962 IF( M.GE.N ) THEN 963* 964* NOPS = SOPLA( 'CGEQRF', M, N, 0, 0, 965* NB ) + SOPLA( 'CUNMQR', M, NRHS, N, 966* 0, NB ) 967* 968 MULTS = N*( ( ( 23.D0 / 6.D0 )+M+N / 969 $ 2.D0 )+ N*( M-N / 3.D0 ) ) + 970 $ N*NRHS*( 2.D0*M+2.D0-N ) 971 ADDS = N*( ( 5.D0 / 6.D0 )+N* 972 $ ( 1.D0 / 2.D0+( M-N / 3.D0 ) ) ) 973 $ + N*NRHS*( 2.D0*M+1.D0-N ) 974 ELSE 975* 976* NOPS = SOPLA( 'CGELQF', M, N, 0, 0, 977* NB ) + SOPLA( 'CUNMLQ', M, 978* NRHS, N, 0, NB ) 979* 980 MULTS = M*( ( ( 29.D0 / 6.D0 )+2.D0*N-M 981 $ / 2.D0 )+M*( N-M / 3.D0 ) ) 982 $ + N*NRHS*( 2.D0*M+2.D0-N ) 983 ADDS = M*( ( 5.D0 / 6.D0 )+M / 2.D0+M* 984 $ ( N-M / 3.D0 ) ) 985 $ + N*NRHS*( 2.D0*M+1.D0-N ) 986 END IF 987 NOPS = ADDFAC*ADDS + MULFAC*MULTS 988* 989* Calculate total megaflops, for WALL and 990* CPU time, and print output 991* 992* Print WALL time if machine supports it 993* 994 IF( WTIME( 1 ).GT.0.0D+0 ) THEN 995 TMFLOPS = NOPS / ( WTIME( 1 )*1.0D+6 ) 996 ELSE 997 TMFLOPS = 0.0D+0 998 END IF 999* 1000 IF( WTIME( 1 ).GE.0.0D+0 ) 1001 $ WRITE( NOUT, FMT = 9993 ) 1002 $ 'WALL', TRANS, M, N, NB, NRHS, 1003 $ NBRHS, NPROW, NPCOL, WTIME( 1 ), 1004 $ TMFLOPS, PASSED 1005* 1006* Print CPU time if machine supports it 1007* 1008 IF( CTIME( 1 ).GT.0.0D+0 ) THEN 1009 TMFLOPS = NOPS / ( CTIME( 1 )*1.0D+6 ) 1010 ELSE 1011 TMFLOPS = 0.0D+0 1012 END IF 1013* 1014 IF( CTIME( 1 ).GE.0.0D+0 ) 1015 $ WRITE( NOUT, FMT = 9993 ) 1016 $ 'CPU ', TRANS, M, N, NB, NRHS, 1017 $ NBRHS, NPROW, NPCOL, CTIME( 1 ), 1018 $ TMFLOPS, PASSED 1019 END IF 1020 30 CONTINUE 1021 40 CONTINUE 1022 50 CONTINUE 1023 60 CONTINUE 1024 70 CONTINUE 1025 80 CONTINUE 1026 CALL BLACS_GRIDEXIT( ICTXT ) 1027 90 CONTINUE 1028* 1029* Print out ending messages and close output file 1030* 1031 IF( IAM.EQ.0 ) THEN 1032 KTESTS = KPASS + KFAIL + KSKIP 1033 WRITE( NOUT, FMT = * ) 1034 WRITE( NOUT, FMT = 9992 ) KTESTS 1035 IF( CHECK ) THEN 1036 WRITE( NOUT, FMT = 9991 ) KPASS 1037 WRITE( NOUT, FMT = 9989 ) KFAIL 1038 ELSE 1039 WRITE( NOUT, FMT = 9990 ) KPASS 1040 END IF 1041 WRITE( NOUT, FMT = 9988 ) KSKIP 1042 WRITE( NOUT, FMT = * ) 1043 WRITE( NOUT, FMT = * ) 1044 WRITE( NOUT, FMT = 9987 ) 1045 IF( NOUT.NE.6 .AND. NOUT.NE.0 ) 1046 $ CLOSE ( NOUT ) 1047 END IF 1048* 1049 CALL BLACS_EXIT( 0 ) 1050* 1051 9999 FORMAT( 'ILLEGAL ', A6, ': ', A5, ' = ', I3, 1052 $ '; It should be at least 1' ) 1053 9998 FORMAT( 'ILLEGAL GRID: nprow*npcol = ', I4, '. It can be at most', 1054 $ I4 ) 1055 9997 FORMAT( 'Bad ', A6, ' parameters: going on to next test case.' ) 1056 9996 FORMAT( 'Unable to perform ', A, ': need TOTMEM of at least', 1057 $ I11 ) 1058 9995 FORMAT( 'Time TRANS M N NB NRHS NBRHS P Q ', 1059 $ 'LS Time MFLOPS CHECK' ) 1060 9994 FORMAT( '---- ----- ------ ------ --- ----- ----- ----- ----- ', 1061 $ '--------- -------- ------' ) 1062 9993 FORMAT( A4, 3X, A1, 3X, I6, 1X, I6, 1X, I3, 1X, I5, 1X, I5, 1X, 1063 $ I5, 1X, I5, 1X, F9.2, 1X, F8.2, 1X, A6 ) 1064 9992 FORMAT( 'Finished', I6, ' tests, with the following results:' ) 1065 9991 FORMAT( I5, ' tests completed and passed residual checks.' ) 1066 9990 FORMAT( I5, ' tests completed without checking.' ) 1067 9989 FORMAT( I5, ' tests completed and failed residual checks.' ) 1068 9988 FORMAT( I5, ' tests skipped because of illegal input values.' ) 1069 9987 FORMAT( 'END OF TESTS.' ) 1070 9986 FORMAT( ' TRANS=''', A1, ''', M=', I5, ', N=', I5, ', NRHS=', I4, 1071 $ ', NB=', I4, ', type', I2, ', test(', I2, ')=', G12.5 ) 1072* 1073 STOP 1074* 1075* End of PCLSDRIVER 1076* 1077 END 1078