1 SUBROUTINE PSPBINFO( SUMMRY, NOUT, UPLO, NMAT, NVAL, LDNVAL, NBW, 2 $ BWVAL, LDBWVAL, NNB, NBVAL, LDNBVAL, NNR, 3 $ NRVAL, LDNRVAL, NNBR, NBRVAL, LDNBRVAL, 4 $ NGRIDS, PVAL, LDPVAL, QVAL, LDQVAL, THRESH, 5 $ WORK, IAM, NPROCS ) 6* 7* 8* 9* -- ScaLAPACK routine (version 1.7) -- 10* University of Tennessee, Knoxville, Oak Ridge National Laboratory, 11* and University of California, Berkeley. 12* November 15, 1997 13* 14* .. Scalar Arguments .. 15 CHARACTER UPLO 16 CHARACTER*(*) SUMMRY 17 INTEGER IAM, 18 $ LDBWVAL, LDNBRVAL, LDNBVAL, LDNRVAL, LDNVAL, 19 $ LDPVAL, LDQVAL, NGRIDS, NMAT, NNB, NNBR, NBW, 20 $ NPROCS, NNR, NOUT 21 REAL THRESH 22* .. 23* .. Array Arguments .. 24 INTEGER NBRVAL( LDNBRVAL ), NBVAL( LDNBVAL ), 25 $ NRVAL( LDNRVAL ), NVAL( LDNVAL ), 26 $ BWVAL( LDBWVAL), 27 $ PVAL( LDPVAL ), QVAL(LDQVAL), WORK( * ) 28* .. 29* 30* Purpose 31* ======= 32* 33* PSPBINFO get needed startup information for band factorization 34* and transmits it to all processes. 35* 36* Arguments 37* ========= 38* 39* SUMMRY (global output) CHARACTER*(*) 40* Name of output (summary) file (if any). Only defined for 41* process 0. 42* 43* NOUT (global output) INTEGER 44* The unit number for output file. NOUT = 6, ouput to screen, 45* NOUT = 0, output to stderr. Only defined for process 0. 46* 47* UPLO (global output) CHARACTER 48* Specifies whether the upper or lower triangular part of the 49* symmetric matrix A is stored. 50* = 'U': Upper triangular 51* = 'L': Lower triangular 52* 53* 54* NMAT (global output) INTEGER 55* The number of different values that can be used for N. 56* 57* NVAL (global output) INTEGER array, dimension (LDNVAL) 58* The values of N (number of columns in matrix) to run the 59* code with. 60* 61* NBW (global output) INTEGER 62* The number of different values that can be used for @bw@. 63* BWVAL (global output) INTEGER array, dimension (LDNVAL) 64* The values of BW (number of subdiagonals in matrix) to run 65* the code with. 66* 67* LDNVAL (global input) INTEGER 68* The maximum number of different values that can be used for 69* N, LDNVAL > = NMAT. 70* 71* NNB (global output) INTEGER 72* The number of different values that can be used for NB. 73* 74* NBVAL (global output) INTEGER array, dimension (LDNBVAL) 75* The values of NB (blocksize) to run the code with. 76* 77* LDNBVAL (global input) INTEGER 78* The maximum number of different values that can be used for 79* NB, LDNBVAL >= NNB. 80* 81* NNR (global output) INTEGER 82* The number of different values that can be used for NRHS. 83* 84* NRVAL (global output) INTEGER array, dimension(LDNRVAL) 85* The values of NRHS (# of Right Hand Sides) to run the code 86* with. 87* 88* LDNRVAL (global input) INTEGER 89* The maximum number of different values that can be used for 90* NRHS, LDNRVAL >= NNR. 91* 92* NNBR (global output) INTEGER 93* The number of different values that can be used for NBRHS. 94* 95* NBRVAL (global output) INTEGER array, dimension (LDNBRVAL) 96* The values of NBRHS (RHS blocksize) to run the code with. 97* 98* LDNBRVAL (global input) INTEGER 99* The maximum number of different values that can be used for 100* NBRHS, LDNBRVAL >= NBRVAL. 101* 102* NGRIDS (global output) INTEGER 103* The number of different values that can be used for P & Q. 104* 105* PVAL (global output) INTEGER array, dimension (LDPVAL) 106* Not used (will be returned as all 1s) since proc grid is 1D 107* 108* LDPVAL (global input) INTEGER 109* The maximum number of different values that can be used for 110* P, LDPVAL >= NGRIDS. 111* 112* QVAL (global output) INTEGER array, dimension (LDQVAL) 113* The values of Q (number of process columns) to run the code 114* with. 115* 116* LDQVAL (global input) INTEGER 117* The maximum number of different values that can be used for 118* Q, LDQVAL >= NGRIDS. 119* 120* THRESH (global output) REAL 121* Indicates what error checks shall be run and printed out: 122* = 0 : Perform no error checking 123* > 0 : report all residuals greater than THRESH, perform 124* factor check only if solve check fails 125* 126* WORK (local workspace) INTEGER array of dimension >= 127* MAX( 8, LDNVAL+2*LDNBVAL+LDNRVAL+LDNBRVAL+LDPVAL+LDQVAL 128* $ +3*LDNVAL) 129* Used to pack input arrays in order to send info in one 130* message. 131* 132* IAM (local input) INTEGER 133* My process number. 134* 135* NPROCS (global input) INTEGER 136* The total number of processes. 137* 138* ====================================================================== 139* 140* Note: For packing the information we assumed that the length in bytes 141* ===== of an integer is equal to the length in bytes of a real single 142* precision. 143* 144* ===================================================================== 145* 146* Code Developer: Andrew J. Cleary, University of Tennessee. 147* Current address: Lawrence Livermore National Labs. 148* This version released: August, 2001. 149* 150* ====================================================================== 151* 152* .. Parameters .. 153 INTEGER NIN 154 PARAMETER ( NIN = 11 ) 155* .. 156* .. Local Scalars .. 157 INTEGER I, ICTXT 158 CHARACTER*79 USRINFO 159 REAL EPS 160* .. 161* .. External Subroutines .. 162 EXTERNAL BLACS_ABORT, BLACS_GET, BLACS_GRIDEXIT, 163 $ BLACS_GRIDINIT, BLACS_SETUP, ICOPY, IGEBR2D, 164 $ IGEBS2D, SGEBR2D, SGEBS2D 165* .. 166* .. External Functions .. 167 LOGICAL LSAME 168 REAL PSLAMCH 169 EXTERNAL LSAME, PSLAMCH 170* .. 171* .. Intrinsic Functions .. 172 INTRINSIC MAX, MIN 173* .. 174* .. Executable Statements .. 175* 176* Process 0 reads the input data, broadcasts to other processes and 177* writes needed information to NOUT 178* 179 IF( IAM.EQ.0 ) THEN 180* 181* Open file and skip data file header 182* 183 OPEN( NIN, FILE = 'BLLT.dat', STATUS = 'OLD' ) 184 READ( NIN, FMT = * ) SUMMRY 185 SUMMRY = ' ' 186* 187* Read in user-supplied info about machine type, compiler, etc. 188* 189 READ( NIN, FMT = 9999 ) USRINFO 190* 191* Read name and unit number for summary output file 192* 193 READ( NIN, FMT = * ) SUMMRY 194 READ( NIN, FMT = * ) NOUT 195 IF( NOUT.NE.0 .AND. NOUT.NE.6 ) 196 $ OPEN( NOUT, FILE = SUMMRY, STATUS = 'UNKNOWN' ) 197* 198* Read and check the parameter values for the tests. 199* 200* Get UPLO 201* 202 READ( NIN, FMT = * ) UPLO 203* 204* 205* Get number of matrices and their dimensions 206* 207 READ( NIN, FMT = * ) NMAT 208 IF( NMAT.LT.1 .OR. NMAT.GT.LDNVAL ) THEN 209 WRITE( NOUT, FMT = 9994 ) 'N', LDNVAL 210 GO TO 20 211 END IF 212 READ( NIN, FMT = * ) ( NVAL( I ), I = 1, NMAT ) 213* 214* Get bandwidths 215* 216 READ( NIN, FMT = * ) NBW 217 IF( NBW.LT.1 .OR. NBW.GT.LDBWVAL ) THEN 218 WRITE( NOUT, FMT = 9994 ) 'BW', LDBWVAL 219 GO TO 20 220 END IF 221 READ( NIN, FMT = * ) ( BWVAL( I ), I = 1, NBW ) 222* 223* Get values of NB 224* 225 READ( NIN, FMT = * ) NNB 226 IF( NNB.LT.1 .OR. NNB.GT.LDNBVAL ) THEN 227 WRITE( NOUT, FMT = 9994 ) 'NB', LDNBVAL 228 GO TO 20 229 END IF 230 READ( NIN, FMT = * ) ( NBVAL( I ), I = 1, NNB ) 231* 232* Get values of NRHS 233* 234 READ( NIN, FMT = * ) NNR 235 IF( NNR.LT.1 .OR. NNR.GT.LDNRVAL ) THEN 236 WRITE( NOUT, FMT = 9994 ) 'NRHS', LDNRVAL 237 GO TO 20 238 END IF 239 READ( NIN, FMT = * ) ( NRVAL( I ), I = 1, NNR ) 240* 241* Get values of NBRHS 242* 243 READ( NIN, FMT = * ) NNBR 244 IF( NNBR.LT.1 .OR. NNBR.GT.LDNBRVAL ) THEN 245 WRITE( NOUT, FMT = 9994 ) 'NBRHS', LDNBRVAL 246 GO TO 20 247 END IF 248 READ( NIN, FMT = * ) ( NBRVAL( I ), I = 1, NNBR ) 249* 250* Get number of grids 251* 252 READ( NIN, FMT = * ) NGRIDS 253 IF( NGRIDS.LT.1 .OR. NGRIDS.GT.LDPVAL ) THEN 254 WRITE( NOUT, FMT = 9994 ) 'Grids', LDPVAL 255 GO TO 20 256 ELSE IF( NGRIDS.GT.LDQVAL ) THEN 257 WRITE( NOUT, FMT = 9994 ) 'Grids', LDQVAL 258 GO TO 20 259 END IF 260* 261* Processor grid must be 1D so set PVAL to 1 262 DO 8738 I = 1, NGRIDS 263 PVAL( I ) = 1 264 8738 CONTINUE 265* 266* Get values of Q 267* 268 READ( NIN, FMT = * ) ( QVAL( I ), I = 1, NGRIDS ) 269* 270* Get level of checking 271* 272 READ( NIN, FMT = * ) THRESH 273* 274* Close input file 275* 276 CLOSE( NIN ) 277* 278* For pvm only: if virtual machine not set up, allocate it and 279* spawn the correct number of processes. 280* 281 IF( NPROCS.LT.1 ) THEN 282 NPROCS = 0 283 DO 10 I = 1, NGRIDS 284 NPROCS = MAX( NPROCS, PVAL( I )*QVAL( I ) ) 285 10 CONTINUE 286 CALL BLACS_SETUP( IAM, NPROCS ) 287 END IF 288* 289* Temporarily define blacs grid to include all processes so 290* information can be broadcast to all processes. 291* 292 CALL BLACS_GET( -1, 0, ICTXT ) 293 CALL BLACS_GRIDINIT( ICTXT, 'Row-major', 1, NPROCS ) 294* 295* Compute machine epsilon 296* 297 EPS = PSLAMCH( ICTXT, 'eps' ) 298* 299* Pack information arrays and broadcast 300* 301 CALL SGEBS2D( ICTXT, 'All', ' ', 1, 1, THRESH, 1 ) 302 I = 1 303 WORK( I ) = NMAT 304 I = I+1 305 WORK( I ) = NBW 306 I = I+1 307 WORK( I ) = NNB 308 I = I+1 309 WORK( I ) = NNR 310 I = I+1 311 WORK( I ) = NNBR 312 I = I+1 313 WORK( I ) = NGRIDS 314 I = I+1 315 IF( LSAME( UPLO, 'L' ) ) THEN 316 WORK( I ) = 1 317 ELSE 318 WORK( I ) = 2 319 END IF 320 I = I+1 321* Send number of elements to be sent 322 CALL IGEBS2D( ICTXT, 'All', ' ', 1, 1, I-1, 1 ) 323* Send elements 324 CALL IGEBS2D( ICTXT, 'All', ' ', I-1, 1, WORK, I-1 ) 325* 326 I = 1 327 CALL ICOPY( NMAT, NVAL, 1, WORK( I ), 1 ) 328 I = I + NMAT 329 CALL ICOPY( NBW, BWVAL, 1, WORK( I ), 1 ) 330 I = I + NBW 331 CALL ICOPY( NNB, NBVAL, 1, WORK( I ), 1 ) 332 I = I + NNB 333 CALL ICOPY( NNR, NRVAL, 1, WORK( I ), 1 ) 334 I = I + NNR 335 CALL ICOPY( NNBR, NBRVAL, 1, WORK( I ), 1 ) 336 I = I + NNBR 337 CALL ICOPY( NGRIDS, PVAL, 1, WORK( I ), 1 ) 338 I = I + NGRIDS 339 CALL ICOPY( NGRIDS, QVAL, 1, WORK( I ), 1 ) 340 I = I + NGRIDS 341 CALL IGEBS2D( ICTXT, 'All', ' ', I-1, 1, WORK, I-1 ) 342* 343* regurgitate input 344* 345 WRITE( NOUT, FMT = 9999 ) 346 $ 'SCALAPACK banded linear systems.' 347 WRITE( NOUT, FMT = 9999 ) USRINFO 348 WRITE( NOUT, FMT = * ) 349 WRITE( NOUT, FMT = 9999 ) 350 $ 'Tests of the parallel '// 351 $ 'real single precision band matrix solve ' 352 WRITE( NOUT, FMT = 9999 ) 353 $ 'The following scaled residual '// 354 $ 'checks will be computed:' 355 WRITE( NOUT, FMT = 9999 ) 356 $ ' Solve residual = ||Ax - b|| / '// 357 $ '(||x|| * ||A|| * eps * N)' 358 IF( LSAME( UPLO, 'L' ) ) THEN 359 WRITE( NOUT, FMT = 9999 ) 360 $ ' Factorization residual = ||A - LL''|| /'// 361 $ ' (||A|| * eps * N)' 362 ELSE 363 WRITE( NOUT, FMT = 9999 ) 364 $ ' Factorization residual = ||A - U''U|| /'// 365 $ ' (||A|| * eps * N)' 366 END IF 367 WRITE( NOUT, FMT = 9999 ) 368 $ 'The matrix A is randomly '// 369 $ 'generated for each test.' 370 WRITE( NOUT, FMT = * ) 371 WRITE( NOUT, FMT = 9999 ) 372 $ 'An explanation of the input/output '// 373 $ 'parameters follows:' 374 WRITE( NOUT, FMT = 9999 ) 375 $ 'TIME : Indicates whether WALL or '// 376 $ 'CPU time was used.' 377* 378 WRITE( NOUT, FMT = 9999 ) 379 $ 'UPLO : Whether data represents ''Upper'// 380 $ ''' or ''Lower'' triangular portion of array A.' 381 WRITE( NOUT, FMT = 9999 ) 382 $ 'TRANS : Whether solve is to be done with'// 383 $ ' ''Transpose'' of matrix A (T,C) or not (N).' 384 WRITE( NOUT, FMT = 9999 ) 385 $ 'N : The number of rows and columns '// 386 $ 'in the matrix A.' 387 WRITE( NOUT, FMT = 9999 ) 388 $ 'bw : The number of diagonals '// 389 $ 'in the matrix A.' 390 WRITE( NOUT, FMT = 9999 ) 391 $ 'NB : The size of the column panels the'// 392 $ ' matrix A is split into. [-1 for default]' 393 WRITE( NOUT, FMT = 9999 ) 394 $ 'NRHS : The total number of RHS to solve'// 395 $ ' for.' 396 WRITE( NOUT, FMT = 9999 ) 397 $ 'NBRHS : The number of RHS to be put on '// 398 $ 'a column of processes before going' 399 WRITE( NOUT, FMT = 9999 ) 400 $ ' on to the next column of processes.' 401 WRITE( NOUT, FMT = 9999 ) 402 $ 'P : The number of process rows.' 403 WRITE( NOUT, FMT = 9999 ) 404 $ 'Q : The number of process columns.' 405 WRITE( NOUT, FMT = 9999 ) 406 $ 'THRESH : If a residual value is less than'// 407 $ ' THRESH, CHECK is flagged as PASSED' 408 WRITE( NOUT, FMT = 9999 ) 409 $ 'Fact time: Time in seconds to factor the'// 410 $ ' matrix' 411 WRITE( NOUT, FMT = 9999 ) 412 $ 'Sol Time: Time in seconds to solve the'// 413 $ ' system.' 414 WRITE( NOUT, FMT = 9999 ) 415 $ 'MFLOPS : Rate of execution for factor '// 416 $ 'and solve using sequential operation count.' 417 WRITE( NOUT, FMT = 9999 ) 418 $ 'MFLOP2 : Rough estimate of speed '// 419 $ 'using actual op count (accurate big P,N).' 420 WRITE( NOUT, FMT = * ) 421 WRITE( NOUT, FMT = 9999 ) 422 $ 'The following parameter values will be used:' 423 WRITE( NOUT, FMT = 9999 ) 424 $ ' UPLO : '//UPLO 425 WRITE( NOUT, FMT = 9996 ) 426 $ 'N ', ( NVAL(I), I = 1, MIN(NMAT, 10) ) 427 IF( NMAT.GT.10 ) 428 $ WRITE( NOUT, FMT = 9997 ) ( NVAL(I), I = 11, NMAT ) 429 WRITE( NOUT, FMT = 9996 ) 430 $ 'bw ', ( BWVAL(I), I = 1, MIN(NBW, 10) ) 431 IF( NBW.GT.10 ) 432 $ WRITE( NOUT, FMT = 9997 ) ( BWVAL(I), I = 11, NBW ) 433 WRITE( NOUT, FMT = 9996 ) 434 $ 'NB ', ( NBVAL(I), I = 1, MIN(NNB, 10) ) 435 IF( NNB.GT.10 ) 436 $ WRITE( NOUT, FMT = 9997 ) ( NBVAL(I), I = 11, NNB ) 437 WRITE( NOUT, FMT = 9996 ) 438 $ 'NRHS ', ( NRVAL(I), I = 1, MIN(NNR, 10) ) 439 IF( NNR.GT.10 ) 440 $ WRITE( NOUT, FMT = 9997 ) ( NRVAL(I), I = 11, NNR ) 441 WRITE( NOUT, FMT = 9996 ) 442 $ 'NBRHS', ( NBRVAL(I), I = 1, MIN(NNBR, 10) ) 443 IF( NNBR.GT.10 ) 444 $ WRITE( NOUT, FMT = 9997 ) ( NBRVAL(I), I = 11, NNBR ) 445 WRITE( NOUT, FMT = 9996 ) 446 $ 'P ', ( PVAL(I), I = 1, MIN(NGRIDS, 10) ) 447 IF( NGRIDS.GT.10 ) 448 $ WRITE( NOUT, FMT = 9997) ( PVAL(I), I = 11, NGRIDS ) 449 WRITE( NOUT, FMT = 9996 ) 450 $ 'Q ', ( QVAL(I), I = 1, MIN(NGRIDS, 10) ) 451 IF( NGRIDS.GT.10 ) 452 $ WRITE( NOUT, FMT = 9997 ) ( QVAL(I), I = 11, NGRIDS ) 453 WRITE( NOUT, FMT = * ) 454 WRITE( NOUT, FMT = 9995 ) EPS 455 WRITE( NOUT, FMT = 9998 ) THRESH 456* 457 ELSE 458* 459* If in pvm, must participate setting up virtual machine 460* 461 IF( NPROCS.LT.1 ) 462 $ CALL BLACS_SETUP( IAM, NPROCS ) 463* 464* Temporarily define blacs grid to include all processes so 465* all processes have needed startup information 466* 467 CALL BLACS_GET( -1, 0, ICTXT ) 468 CALL BLACS_GRIDINIT( ICTXT, 'Row-major', 1, NPROCS ) 469* 470* Compute machine epsilon 471* 472 EPS = PSLAMCH( ICTXT, 'eps' ) 473* 474 CALL SGEBR2D( ICTXT, 'All', ' ', 1, 1, THRESH, 1, 0, 0 ) 475 CALL IGEBR2D( ICTXT, 'All', ' ', 1, 1, I, 1, 0, 0 ) 476 CALL IGEBR2D( ICTXT, 'All', ' ', I, 1, WORK, I, 0, 0 ) 477 I = 1 478 NMAT = WORK( I ) 479 I = I+1 480 NBW = WORK( I ) 481 I = I+1 482 NNB = WORK( I ) 483 I = I+1 484 NNR = WORK( I ) 485 I = I+1 486 NNBR = WORK( I ) 487 I = I+1 488 NGRIDS = WORK( I ) 489 I = I+1 490 IF( WORK( I ) .EQ. 1 ) THEN 491 UPLO = 'L' 492 ELSE 493 UPLO = 'U' 494 END IF 495 I = I+1 496* 497 I = NMAT + NBW + NNB + NNR + NNBR + 2*NGRIDS 498* 499 CALL IGEBR2D( ICTXT, 'All', ' ', 1, I, WORK, 1, 0, 0 ) 500 I = 1 501 CALL ICOPY( NMAT, WORK( I ), 1, NVAL, 1 ) 502 I = I + NMAT 503 CALL ICOPY( NBW, WORK( I ), 1, BWVAL, 1 ) 504 I = I + NBW 505 CALL ICOPY( NNB, WORK( I ), 1, NBVAL, 1 ) 506 I = I + NNB 507 CALL ICOPY( NNR, WORK( I ), 1, NRVAL, 1 ) 508 I = I + NNR 509 CALL ICOPY( NNBR, WORK( I ), 1, NBRVAL, 1 ) 510 I = I + NNBR 511 CALL ICOPY( NGRIDS, WORK( I ), 1, PVAL, 1 ) 512 I = I + NGRIDS 513 CALL ICOPY( NGRIDS, WORK( I ), 1, QVAL, 1 ) 514* 515 END IF 516* 517 CALL BLACS_GRIDEXIT( ICTXT ) 518* 519 RETURN 520* 521 20 WRITE( NOUT, FMT = 9993 ) 522 CLOSE( NIN ) 523 IF( NOUT.NE.6 .AND. NOUT.NE.0 ) 524 $ CLOSE( NOUT ) 525* 526 CALL BLACS_ABORT( ICTXT, 1 ) 527 STOP 528* 529 9999 FORMAT( A ) 530 9998 FORMAT( 'Routines pass computational tests if scaled residual ', 531 $ 'is less than ', G12.5 ) 532 9997 FORMAT( ' ', 10I6 ) 533 9996 FORMAT( 2X, A5, ': ', 10I6 ) 534 9995 FORMAT( 'Relative machine precision (eps) is taken to be ', 535 $ E18.6 ) 536 9994 FORMAT( ' Number of values of ',5A, ' is less than 1 or greater ', 537 $ 'than ', I2 ) 538 9993 FORMAT( ' Illegal input in file ',40A,'. Aborting run.' ) 539* 540* End of PSPBINFO 541* 542 END 543