1 SUBROUTINE PDSTEBZ( ICTXT, RANGE, ORDER, N, VL, VU, IL, IU, 2 $ ABSTOL, D, E, M, NSPLIT, W, IBLOCK, ISPLIT, 3 $ WORK, LWORK, IWORK, LIWORK, INFO ) 4* 5* -- ScaLAPACK routine (version 1.7) -- 6* University of Tennessee, Knoxville, Oak Ridge National Laboratory, 7* and University of California, Berkeley. 8* November 15, 1997 9* 10* .. Scalar Arguments .. 11 CHARACTER ORDER, RANGE 12 INTEGER ICTXT, IL, INFO, IU, LIWORK, LWORK, M, N, 13 $ NSPLIT 14 DOUBLE PRECISION ABSTOL, VL, VU 15* .. 16* .. Array Arguments .. 17 INTEGER IBLOCK( * ), ISPLIT( * ), IWORK( * ) 18 DOUBLE PRECISION D( * ), E( * ), W( * ), WORK( * ) 19* .. 20* 21* Purpose 22* ======= 23* 24* PDSTEBZ computes the eigenvalues of a symmetric tridiagonal matrix in 25* parallel. The user may ask for all eigenvalues, all eigenvalues in 26* the interval [VL, VU], or the eigenvalues indexed IL through IU. A 27* static partitioning of work is done at the beginning of PDSTEBZ which 28* results in all processes finding an (almost) equal number of 29* eigenvalues. 30* 31* NOTE : It is assumed that the user is on an IEEE machine. If the user 32* is not on an IEEE mchine, set the compile time flag NO_IEEE 33* to 1 (in SLmake.inc). The features of IEEE arithmetic that 34* are needed for the "fast" Sturm Count are : (a) infinity 35* arithmetic (b) the sign bit of a single precision floating 36* point number is assumed be in the 32nd bit position 37* (c) the sign of negative zero. 38* 39* See W. Kahan "Accurate Eigenvalues of a Symmetric Tridiagonal 40* Matrix", Report CS41, Computer Science Dept., Stanford 41* University, July 21, 1966. 42* 43* Arguments 44* ========= 45* 46* ICTXT (global input) INTEGER 47* The BLACS context handle. 48* 49* RANGE (global input) CHARACTER 50* Specifies which eigenvalues are to be found. 51* = 'A': ("All") all eigenvalues will be found. 52* = 'V': ("Value") all eigenvalues in the interval 53* [VL, VU] will be found. 54* = 'I': ("Index") the IL-th through IU-th eigenvalues (of the 55* entire matrix) will be found. 56* 57* ORDER (global input) CHARACTER 58* Specifies the order in which the eigenvalues and their block 59* numbers are stored in W and IBLOCK. 60* = 'B': ("By Block") the eigenvalues will be grouped by 61* split-off block (see IBLOCK, ISPLIT) and 62* ordered from smallest to largest within 63* the block. 64* = 'E': ("Entire matrix") 65* the eigenvalues for the entire matrix 66* will be ordered from smallest to largest. 67* 68* N (global input) INTEGER 69* The order of the tridiagonal matrix T. N >= 0. 70* 71* VL (global input) DOUBLE PRECISION 72* If RANGE='V', the lower bound of the interval to be searched 73* for eigenvalues. Eigenvalues less than VL will not be 74* returned. Not referenced if RANGE='A' or 'I'. 75* 76* VU (global input) DOUBLE PRECISION 77* If RANGE='V', the upper bound of the interval to be searched 78* for eigenvalues. Eigenvalues greater than VU will not be 79* returned. VU must be greater than VL. Not referenced if 80* RANGE='A' or 'I'. 81* 82* IL (global input) INTEGER 83* If RANGE='I', the index (from smallest to largest) of the 84* smallest eigenvalue to be returned. IL must be at least 1. 85* Not referenced if RANGE='A' or 'V'. 86* 87* IU (global input) INTEGER 88* If RANGE='I', the index (from smallest to largest) of the 89* largest eigenvalue to be returned. IU must be at least IL 90* and no greater than N. Not referenced if RANGE='A' or 'V'. 91* 92* ABSTOL (global input) DOUBLE PRECISION 93* The absolute tolerance for the eigenvalues. An eigenvalue 94* (or cluster) is considered to be located if it has been 95* determined to lie in an interval whose width is ABSTOL or 96* less. If ABSTOL is less than or equal to zero, then ULP*|T| 97* will be used, where |T| means the 1-norm of T. 98* Eigenvalues will be computed most accurately when ABSTOL is 99* set to the underflow threshold DLAMCH('U'), not zero. 100* Note : If eigenvectors are desired later by inverse iteration 101* ( PDSTEIN ), ABSTOL should be set to 2*PDLAMCH('S'). 102* 103* D (global input) DOUBLE PRECISION array, dimension (N) 104* The n diagonal elements of the tridiagonal matrix T. To 105* avoid overflow, the matrix must be scaled so that its largest 106* entry is no greater than overflow**(1/2) * underflow**(1/4) 107* in absolute value, and for greatest accuracy, it should not 108* be much smaller than that. 109* 110* E (global input) DOUBLE PRECISION array, dimension (N-1) 111* The (n-1) off-diagonal elements of the tridiagonal matrix T. 112* To avoid overflow, the matrix must be scaled so that its 113* largest entry is no greater than overflow**(1/2) * 114* underflow**(1/4) in absolute value, and for greatest 115* accuracy, it should not be much smaller than that. 116* 117* M (global output) INTEGER 118* The actual number of eigenvalues found. 0 <= M <= N. 119* (See also the description of INFO=2) 120* 121* NSPLIT (global output) INTEGER 122* The number of diagonal blocks in the matrix T. 123* 1 <= NSPLIT <= N. 124* 125* W (global output) DOUBLE PRECISION array, dimension (N) 126* On exit, the first M elements of W contain the eigenvalues 127* on all processes. 128* 129* IBLOCK (global output) INTEGER array, dimension (N) 130* At each row/column j where E(j) is zero or small, the 131* matrix T is considered to split into a block diagonal 132* matrix. On exit IBLOCK(i) specifies which block (from 1 133* to the number of blocks) the eigenvalue W(i) belongs to. 134* NOTE: in the (theoretically impossible) event that bisection 135* does not converge for some or all eigenvalues, INFO is set 136* to 1 and the ones for which it did not are identified by a 137* negative block number. 138* 139* ISPLIT (global output) INTEGER array, dimension (N) 140* The splitting points, at which T breaks up into submatrices. 141* The first submatrix consists of rows/columns 1 to ISPLIT(1), 142* the second of rows/columns ISPLIT(1)+1 through ISPLIT(2), 143* etc., and the NSPLIT-th consists of rows/columns 144* ISPLIT(NSPLIT-1)+1 through ISPLIT(NSPLIT)=N. 145* (Only the first NSPLIT elements will actually be used, but 146* since the user cannot know a priori what value NSPLIT will 147* have, N words must be reserved for ISPLIT.) 148* 149* WORK (local workspace) DOUBLE PRECISION array, 150* dimension ( MAX( 5*N, 7 ) ) 151* 152* LWORK (local input) INTEGER 153* size of array WORK must be >= MAX( 5*N, 7 ) 154* If LWORK = -1, then LWORK is global input and a workspace 155* query is assumed; the routine only calculates the minimum 156* and optimal size for all work arrays. Each of these 157* values is returned in the first entry of the corresponding 158* work array, and no error message is issued by PXERBLA. 159* 160* IWORK (local workspace) INTEGER array, dimension ( MAX( 4*N, 14 ) ) 161* 162* LIWORK (local input) INTEGER 163* size of array IWORK must be >= MAX( 4*N, 14, NPROCS ) 164* If LIWORK = -1, then LIWORK is global input and a workspace 165* query is assumed; the routine only calculates the minimum 166* and optimal size for all work arrays. Each of these 167* values is returned in the first entry of the corresponding 168* work array, and no error message is issued by PXERBLA. 169* 170* INFO (global output) INTEGER 171* = 0 : successful exit 172* < 0 : if INFO = -i, the i-th argument had an illegal value 173* > 0 : some or all of the eigenvalues failed to converge or 174* were not computed: 175* = 1 : Bisection failed to converge for some eigenvalues; 176* these eigenvalues are flagged by a negative block 177* number. The effect is that the eigenvalues may not 178* be as accurate as the absolute and relative 179* tolerances. This is generally caused by arithmetic 180* which is less accurate than PDLAMCH says. 181* = 2 : There is a mismatch between the number of 182* eigenvalues output and the number desired. 183* = 3 : RANGE='i', and the Gershgorin interval initially 184* used was incorrect. No eigenvalues were computed. 185* Probable cause: your machine has sloppy floating 186* point arithmetic. 187* Cure: Increase the PARAMETER "FUDGE", recompile, 188* and try again. 189* 190* Internal Parameters 191* =================== 192* 193* RELFAC DOUBLE PRECISION, default = 2.0 194* The relative tolerance. An interval [a,b] lies within 195* "relative tolerance" if b-a < RELFAC*ulp*max(|a|,|b|), 196* where "ulp" is the machine precision (distance from 1 to 197* the next larger floating point number.) 198* 199* FUDGE DOUBLE PRECISION, default = 2.0 200* A "fudge factor" to widen the Gershgorin intervals. Ideally, 201* a value of 1 should work, but on machines with sloppy 202* arithmetic, this needs to be larger. The default for 203* publicly released versions should be large enough to handle 204* the worst machine around. Note that this has no effect 205* on the accuracy of the solution. 206* 207* ===================================================================== 208* 209* .. Intrinsic Functions .. 210 INTRINSIC ABS, DBLE, ICHAR, MAX, MIN, MOD 211* .. 212* .. External Functions .. 213 LOGICAL LSAME 214 INTEGER BLACS_PNUM 215 DOUBLE PRECISION PDLAMCH 216 EXTERNAL LSAME, BLACS_PNUM, PDLAMCH 217* .. 218* .. External Subroutines .. 219 EXTERNAL BLACS_FREEBUFF, BLACS_GET, BLACS_GRIDEXIT, 220 $ BLACS_GRIDINFO, BLACS_GRIDMAP, DGEBR2D, 221 $ DGEBS2D, DGERV2D, DGESD2D, DLASRT2, GLOBCHK, 222 $ IGEBR2D, IGEBS2D, IGERV2D, IGESD2D, IGSUM2D, 223 $ PDLAEBZ, PDLAIECTB, PDLAIECTL, PDLAPDCT, 224 $ PDLASNBT, PXERBLA 225* .. 226* .. Parameters .. 227 INTEGER BLOCK_CYCLIC_2D, DLEN_, DTYPE_, CTXT_, M_, N_, 228 $ MB_, NB_, RSRC_, CSRC_, LLD_ 229 PARAMETER ( BLOCK_CYCLIC_2D = 1, DLEN_ = 9, DTYPE_ = 1, 230 $ CTXT_ = 2, M_ = 3, N_ = 4, MB_ = 5, NB_ = 6, 231 $ RSRC_ = 7, CSRC_ = 8, LLD_ = 9 ) 232 INTEGER BIGNUM, DESCMULT 233 PARAMETER ( BIGNUM = 10000, DESCMULT = 100 ) 234 DOUBLE PRECISION ZERO, ONE, TWO, FIVE, HALF 235 PARAMETER ( ZERO = 0.0D+0, ONE = 1.0D+0, TWO = 2.0D+0, 236 $ FIVE = 5.0D+0, HALF = 1.0D+0 / TWO ) 237 DOUBLE PRECISION FUDGE, RELFAC 238 PARAMETER ( FUDGE = 2.0D+0, RELFAC = 2.0D+0 ) 239* .. 240* .. Local Scalars .. 241 LOGICAL LQUERY 242 INTEGER BLKNO, FOUND, I, IBEGIN, IEFLAG, IEND, IFRST, 243 $ IINFO, ILAST, ILOAD, IM, IMYLOAD, IN, INDRIW1, 244 $ INDRIW2, INDRW1, INDRW2, INXTLOAD, IOFF, 245 $ IORDER, IOUT, IRANGE, IRECV, IREM, ITMP1, 246 $ ITMP2, J, JB, K, LAST, LEXTRA, LREQ, MYCOL, 247 $ MYROW, NALPHA, NBETA, NCMP, NEIGINT, NEXT, NGL, 248 $ NGLOB, NGU, NINT, NPCOL, NPROW, OFFSET, 249 $ ONEDCONTEXT, P, PREV, REXTRA, RREQ, SELF, 250 $ TORECV 251 DOUBLE PRECISION ALPHA, ATOLI, BETA, BNORM, DRECV, DSEND, GL, 252 $ GU, INITVL, INITVU, LSAVE, MID, PIVMIN, RELTOL, 253 $ SAFEMN, TMP1, TMP2, TNORM, ULP 254* .. 255* .. Local Arrays .. 256 INTEGER IDUM( 5, 2 ) 257* .. 258* .. Executable Statements .. 259* This is just to keep ftnchek happy 260 IF( BLOCK_CYCLIC_2D*CSRC_*CTXT_*DLEN_*DTYPE_*LLD_*MB_*M_*NB_*N_* 261 $ RSRC_.LT.0 )RETURN 262* 263* Set up process grid 264* 265 CALL BLACS_GRIDINFO( ICTXT, NPROW, NPCOL, MYROW, MYCOL ) 266* 267 INFO = 0 268 M = 0 269* 270* Decode RANGE 271* 272 IF( LSAME( RANGE, 'A' ) ) THEN 273 IRANGE = 1 274 ELSE IF( LSAME( RANGE, 'V' ) ) THEN 275 IRANGE = 2 276 ELSE IF( LSAME( RANGE, 'I' ) ) THEN 277 IRANGE = 3 278 ELSE 279 IRANGE = 0 280 END IF 281* 282* Decode ORDER 283* 284 IF( LSAME( ORDER, 'B' ) ) THEN 285 IORDER = 2 286 ELSE IF( LSAME( ORDER, 'E' ) .OR. LSAME( ORDER, 'A' ) ) THEN 287 IORDER = 1 288 ELSE 289 IORDER = 0 290 END IF 291* 292* Check for Errors 293* 294 IF( NPROW.EQ.-1 ) THEN 295 INFO = -1 296 ELSE 297* 298* Get machine constants 299* 300 SAFEMN = PDLAMCH( ICTXT, 'S' ) 301 ULP = PDLAMCH( ICTXT, 'P' ) 302 RELTOL = ULP*RELFAC 303 IDUM( 1, 1 ) = ICHAR( RANGE ) 304 IDUM( 1, 2 ) = 2 305 IDUM( 2, 1 ) = ICHAR( ORDER ) 306 IDUM( 2, 2 ) = 3 307 IDUM( 3, 1 ) = N 308 IDUM( 3, 2 ) = 4 309 NGLOB = 5 310 IF( IRANGE.EQ.3 ) THEN 311 IDUM( 4, 1 ) = IL 312 IDUM( 4, 2 ) = 7 313 IDUM( 5, 1 ) = IU 314 IDUM( 5, 2 ) = 8 315 ELSE 316 IDUM( 4, 1 ) = 0 317 IDUM( 4, 2 ) = 0 318 IDUM( 5, 1 ) = 0 319 IDUM( 5, 2 ) = 0 320 END IF 321 IF( MYROW.EQ.0 .AND. MYCOL.EQ.0 ) THEN 322 WORK( 1 ) = ABSTOL 323 IF( IRANGE.EQ.2 ) THEN 324 WORK( 2 ) = VL 325 WORK( 3 ) = VU 326 ELSE 327 WORK( 2 ) = ZERO 328 WORK( 3 ) = ZERO 329 END IF 330 CALL DGEBS2D( ICTXT, 'ALL', ' ', 3, 1, WORK, 3 ) 331 ELSE 332 CALL DGEBR2D( ICTXT, 'ALL', ' ', 3, 1, WORK, 3, 0, 0 ) 333 END IF 334 LQUERY = ( LWORK.EQ.-1 .OR. LIWORK.EQ.-1 ) 335 IF( INFO.EQ.0 ) THEN 336 IF( IRANGE.EQ.0 ) THEN 337 INFO = -2 338 ELSE IF( IORDER.EQ.0 ) THEN 339 INFO = -3 340 ELSE IF( IRANGE.EQ.2 .AND. VL.GE.VU ) THEN 341 INFO = -5 342 ELSE IF( IRANGE.EQ.3 .AND. ( IL.LT.1 .OR. IL.GT.MAX( 1, 343 $ N ) ) ) THEN 344 INFO = -6 345 ELSE IF( IRANGE.EQ.3 .AND. ( IU.LT.MIN( N, 346 $ IL ) .OR. IU.GT.N ) ) THEN 347 INFO = -7 348 ELSE IF( LWORK.LT.MAX( 5*N, 7 ) .AND. .NOT.LQUERY ) THEN 349 INFO = -18 350 ELSE IF( LIWORK.LT.MAX( 4*N, 14, NPROW*NPCOL ) .AND. .NOT. 351 $ LQUERY ) THEN 352 INFO = -20 353 ELSE IF( IRANGE.EQ.2 .AND. ( ABS( WORK( 2 )-VL ).GT.FIVE* 354 $ ULP*ABS( VL ) ) ) THEN 355 INFO = -5 356 ELSE IF( IRANGE.EQ.2 .AND. ( ABS( WORK( 3 )-VU ).GT.FIVE* 357 $ ULP*ABS( VU ) ) ) THEN 358 INFO = -6 359 ELSE IF( ABS( WORK( 1 )-ABSTOL ).GT.FIVE*ULP*ABS( ABSTOL ) ) 360 $ THEN 361 INFO = -9 362 END IF 363 END IF 364 IF( INFO.EQ.0 ) 365 $ INFO = BIGNUM 366 CALL GLOBCHK( ICTXT, NGLOB, IDUM, 5, IWORK, INFO ) 367 IF( INFO.EQ.BIGNUM ) THEN 368 INFO = 0 369 ELSE IF( MOD( INFO, DESCMULT ).EQ.0 ) THEN 370 INFO = -INFO / DESCMULT 371 ELSE 372 INFO = -INFO 373 END IF 374 END IF 375 WORK( 1 ) = DBLE( MAX( 5*N, 7 ) ) 376 IWORK( 1 ) = MAX( 4*N, 14, NPROW*NPCOL ) 377* 378 IF( INFO.NE.0 ) THEN 379 CALL PXERBLA( ICTXT, 'PDSTEBZ', -INFO ) 380 RETURN 381 ELSE IF( LWORK.EQ.-1 .AND. LIWORK.EQ.-1 ) THEN 382 RETURN 383 END IF 384* 385* 386* Quick return if possible 387* 388 IF( N.EQ.0 ) 389 $ RETURN 390* 391 K = 1 392 DO 20 I = 0, NPROW - 1 393 DO 10 J = 0, NPCOL - 1 394 IWORK( K ) = BLACS_PNUM( ICTXT, I, J ) 395 K = K + 1 396 10 CONTINUE 397 20 CONTINUE 398* 399 P = NPROW*NPCOL 400 NPROW = 1 401 NPCOL = P 402* 403 CALL BLACS_GET( ICTXT, 10, ONEDCONTEXT ) 404 CALL BLACS_GRIDMAP( ONEDCONTEXT, IWORK, NPROW, NPROW, NPCOL ) 405 CALL BLACS_GRIDINFO( ONEDCONTEXT, I, J, K, SELF ) 406* 407* Simplifications: 408* 409 IF( IRANGE.EQ.3 .AND. IL.EQ.1 .AND. IU.EQ.N ) 410 $ IRANGE = 1 411* 412 NEXT = MOD( SELF+1, P ) 413 PREV = MOD( P+SELF-1, P ) 414* 415* Compute squares of off-diagonals, splitting points and pivmin. 416* Interleave diagonals and off-diagonals. 417* 418 INDRW1 = MAX( 2*N, 4 ) 419 INDRW2 = INDRW1 + 2*N 420 INDRIW1 = MAX( 2*N, 8 ) 421 NSPLIT = 1 422 WORK( INDRW1+2*N ) = ZERO 423 PIVMIN = ONE 424* 425 DO 30 I = 1, N - 1 426 TMP1 = E( I )**2 427 J = 2*I 428 WORK( INDRW1+J-1 ) = D( I ) 429 IF( ABS( D( I+1 )*D( I ) )*ULP**2+SAFEMN.GT.TMP1 ) THEN 430 ISPLIT( NSPLIT ) = I 431 NSPLIT = NSPLIT + 1 432 WORK( INDRW1+J ) = ZERO 433 ELSE 434 WORK( INDRW1+J ) = TMP1 435 PIVMIN = MAX( PIVMIN, TMP1 ) 436 END IF 437 30 CONTINUE 438 WORK( INDRW1+2*N-1 ) = D( N ) 439 ISPLIT( NSPLIT ) = N 440 PIVMIN = PIVMIN*SAFEMN 441* 442* Compute Gershgorin interval [gl,gu] for entire matrix 443* 444 GU = D( 1 ) 445 GL = D( 1 ) 446 TMP1 = ZERO 447* 448 DO 40 I = 1, N - 1 449 TMP2 = ABS( E( I ) ) 450 GU = MAX( GU, D( I )+TMP1+TMP2 ) 451 GL = MIN( GL, D( I )-TMP1-TMP2 ) 452 TMP1 = TMP2 453 40 CONTINUE 454 GU = MAX( GU, D( N )+TMP1 ) 455 GL = MIN( GL, D( N )-TMP1 ) 456 TNORM = MAX( ABS( GL ), ABS( GU ) ) 457 GL = GL - FUDGE*TNORM*ULP*N - FUDGE*TWO*PIVMIN 458 GU = GU + FUDGE*TNORM*ULP*N + FUDGE*PIVMIN 459* 460 IF( ABSTOL.LE.ZERO ) THEN 461 ATOLI = ULP*TNORM 462 ELSE 463 ATOLI = ABSTOL 464 END IF 465* 466* Find out if on an IEEE machine, the sign bit is the 467* 32nd bit (Big Endian) or the 64th bit (Little Endian) 468* 469 IF( IRANGE.EQ.1 .OR. NSPLIT.EQ.1 ) THEN 470 CALL PDLASNBT( IEFLAG ) 471 ELSE 472 IEFLAG = 0 473 END IF 474 LEXTRA = 0 475 REXTRA = 0 476* 477* Form Initial Interval containing desired eigenvalues 478* 479 IF( IRANGE.EQ.1 ) THEN 480 INITVL = GL 481 INITVU = GU 482 WORK( 1 ) = GL 483 WORK( 2 ) = GU 484 IWORK( 1 ) = 0 485 IWORK( 2 ) = N 486 IFRST = 1 487 ILAST = N 488 ELSE IF( IRANGE.EQ.2 ) THEN 489 IF( VL.GT.GL ) THEN 490 IF( IEFLAG.EQ.0 ) THEN 491 CALL PDLAPDCT( VL, N, WORK( INDRW1+1 ), PIVMIN, IFRST ) 492 ELSE IF( IEFLAG.EQ.1 ) THEN 493 CALL PDLAIECTB( VL, N, WORK( INDRW1+1 ), IFRST ) 494 ELSE 495 CALL PDLAIECTL( VL, N, WORK( INDRW1+1 ), IFRST ) 496 END IF 497 IFRST = IFRST + 1 498 INITVL = VL 499 ELSE 500 INITVL = GL 501 IFRST = 1 502 END IF 503 IF( VU.LT.GU ) THEN 504 IF( IEFLAG.EQ.0 ) THEN 505 CALL PDLAPDCT( VU, N, WORK( INDRW1+1 ), PIVMIN, ILAST ) 506 ELSE IF( IEFLAG.EQ.1 ) THEN 507 CALL PDLAIECTB( VU, N, WORK( INDRW1+1 ), ILAST ) 508 ELSE 509 CALL PDLAIECTL( VU, N, WORK( INDRW1+1 ), ILAST ) 510 END IF 511 INITVU = VU 512 ELSE 513 INITVU = GU 514 ILAST = N 515 END IF 516 WORK( 1 ) = INITVL 517 WORK( 2 ) = INITVU 518 IWORK( 1 ) = IFRST - 1 519 IWORK( 2 ) = ILAST 520 ELSE IF( IRANGE.EQ.3 ) THEN 521 WORK( 1 ) = GL 522 WORK( 2 ) = GU 523 IWORK( 1 ) = 0 524 IWORK( 2 ) = N 525 IWORK( 5 ) = IL - 1 526 IWORK( 6 ) = IU 527 CALL PDLAEBZ( 0, N, 2, 1, ATOLI, RELTOL, PIVMIN, 528 $ WORK( INDRW1+1 ), IWORK( 5 ), WORK, IWORK, NINT, 529 $ LSAVE, IEFLAG, IINFO ) 530 IF( IINFO.NE.0 ) THEN 531 INFO = 3 532 GO TO 230 533 END IF 534 IF( NINT.GT.1 ) THEN 535 IF( IWORK( 5 ).EQ.IL-1 ) THEN 536 WORK( 2 ) = WORK( 4 ) 537 IWORK( 2 ) = IWORK( 4 ) 538 ELSE 539 WORK( 1 ) = WORK( 3 ) 540 IWORK( 1 ) = IWORK( 3 ) 541 END IF 542 IF( IWORK( 1 ).LT.0 .OR. IWORK( 1 ).GT.IL-1 .OR. 543 $ IWORK( 2 ).LE.MIN( IU-1, IWORK( 1 ) ) .OR. 544 $ IWORK( 2 ).GT.N ) THEN 545 INFO = 3 546 GO TO 230 547 END IF 548 END IF 549 LEXTRA = IL - 1 - IWORK( 1 ) 550 REXTRA = IWORK( 2 ) - IU 551 INITVL = WORK( 1 ) 552 INITVU = WORK( 2 ) 553 IFRST = IL 554 ILAST = IU 555 END IF 556* NVL = IFRST - 1 557* NVU = ILAST 558 GL = INITVL 559 GU = INITVU 560 NGL = IWORK( 1 ) 561 NGU = IWORK( 2 ) 562 IM = 0 563 FOUND = 0 564 INDRIW2 = INDRIW1 + NGU - NGL 565 IEND = 0 566 IF( IFRST.GT.ILAST ) 567 $ GO TO 100 568 IF( IFRST.EQ.1 .AND. ILAST.EQ.N ) 569 $ IRANGE = 1 570* 571* Find Eigenvalues -- Loop Over Blocks 572* 573 DO 90 JB = 1, NSPLIT 574 IOFF = IEND 575 IBEGIN = IOFF + 1 576 IEND = ISPLIT( JB ) 577 IN = IEND - IOFF 578 IF( JB.NE.1 ) THEN 579 IF( IRANGE.NE.1 ) THEN 580 FOUND = IM 581* 582* Find total number of eigenvalues found thus far 583* 584 CALL IGSUM2D( ONEDCONTEXT, 'All', ' ', 1, 1, FOUND, 1, 585 $ -1, -1 ) 586 ELSE 587 FOUND = IOFF 588 END IF 589 END IF 590* IF( SELF.GE.P ) 591* $ GO TO 30 592 IF( IN.NE.N ) THEN 593* 594* Compute Gershgorin interval [gl,gu] for split matrix 595* 596 GU = D( IBEGIN ) 597 GL = D( IBEGIN ) 598 TMP1 = ZERO 599* 600 DO 50 J = IBEGIN, IEND - 1 601 TMP2 = ABS( E( J ) ) 602 GU = MAX( GU, D( J )+TMP1+TMP2 ) 603 GL = MIN( GL, D( J )-TMP1-TMP2 ) 604 TMP1 = TMP2 605 50 CONTINUE 606* 607 GU = MAX( GU, D( IEND )+TMP1 ) 608 GL = MIN( GL, D( IEND )-TMP1 ) 609 BNORM = MAX( ABS( GL ), ABS( GU ) ) 610 GL = GL - FUDGE*BNORM*ULP*IN - FUDGE*PIVMIN 611 GU = GU + FUDGE*BNORM*ULP*IN + FUDGE*PIVMIN 612* 613* Compute ATOLI for the current submatrix 614* 615 IF( ABSTOL.LE.ZERO ) THEN 616 ATOLI = ULP*BNORM 617 ELSE 618 ATOLI = ABSTOL 619 END IF 620* 621 IF( GL.LT.INITVL ) THEN 622 GL = INITVL 623 IF( IEFLAG.EQ.0 ) THEN 624 CALL PDLAPDCT( GL, IN, WORK( INDRW1+2*IOFF+1 ), 625 $ PIVMIN, NGL ) 626 ELSE IF( IEFLAG.EQ.1 ) THEN 627 CALL PDLAIECTB( GL, IN, WORK( INDRW1+2*IOFF+1 ), NGL ) 628 ELSE 629 CALL PDLAIECTL( GL, IN, WORK( INDRW1+2*IOFF+1 ), NGL ) 630 END IF 631 ELSE 632 NGL = 0 633 END IF 634 IF( GU.GT.INITVU ) THEN 635 GU = INITVU 636 IF( IEFLAG.EQ.0 ) THEN 637 CALL PDLAPDCT( GU, IN, WORK( INDRW1+2*IOFF+1 ), 638 $ PIVMIN, NGU ) 639 ELSE IF( IEFLAG.EQ.1 ) THEN 640 CALL PDLAIECTB( GU, IN, WORK( INDRW1+2*IOFF+1 ), NGU ) 641 ELSE 642 CALL PDLAIECTL( GU, IN, WORK( INDRW1+2*IOFF+1 ), NGU ) 643 END IF 644 ELSE 645 NGU = IN 646 END IF 647 IF( NGL.GE.NGU ) 648 $ GO TO 90 649 WORK( 1 ) = GL 650 WORK( 2 ) = GU 651 IWORK( 1 ) = NGL 652 IWORK( 2 ) = NGU 653 END IF 654 OFFSET = FOUND - NGL 655 BLKNO = JB 656* 657* Do a static partitioning of work so that each process 658* has to find an (almost) equal number of eigenvalues 659* 660 NCMP = NGU - NGL 661 ILOAD = NCMP / P 662 IREM = NCMP - ILOAD*P 663 ITMP1 = MOD( SELF-FOUND, P ) 664 IF( ITMP1.LT.0 ) 665 $ ITMP1 = ITMP1 + P 666 IF( ITMP1.LT.IREM ) THEN 667 IMYLOAD = ILOAD + 1 668 ELSE 669 IMYLOAD = ILOAD 670 END IF 671 IF( IMYLOAD.EQ.0 ) THEN 672 GO TO 90 673 ELSE IF( IN.EQ.1 ) THEN 674 WORK( INDRW2+IM+1 ) = WORK( INDRW1+2*IOFF+1 ) 675 IWORK( INDRIW1+IM+1 ) = BLKNO 676 IWORK( INDRIW2+IM+1 ) = OFFSET + 1 677 IM = IM + 1 678 GO TO 90 679 ELSE 680 INXTLOAD = ILOAD 681 ITMP2 = MOD( SELF+1-FOUND, P ) 682 IF( ITMP2.LT.0 ) 683 $ ITMP2 = ITMP2 + P 684 IF( ITMP2.LT.IREM ) 685 $ INXTLOAD = INXTLOAD + 1 686 LREQ = NGL + ITMP1*ILOAD + MIN( IREM, ITMP1 ) 687 RREQ = LREQ + IMYLOAD 688 IWORK( 5 ) = LREQ 689 IWORK( 6 ) = RREQ 690 TMP1 = WORK( 1 ) 691 ITMP1 = IWORK( 1 ) 692 CALL PDLAEBZ( 1, IN, 1, 1, ATOLI, RELTOL, PIVMIN, 693 $ WORK( INDRW1+2*IOFF+1 ), IWORK( 5 ), WORK, 694 $ IWORK, NINT, LSAVE, IEFLAG, IINFO ) 695 ALPHA = WORK( 1 ) 696 BETA = WORK( 2 ) 697 NALPHA = IWORK( 1 ) 698 NBETA = IWORK( 2 ) 699 DSEND = BETA 700 IF( NBETA.GT.RREQ+INXTLOAD ) THEN 701 NBETA = RREQ 702 DSEND = ALPHA 703 END IF 704 LAST = MOD( FOUND+MIN( NGU-NGL, P )-1, P ) 705 IF( LAST.LT.0 ) 706 $ LAST = LAST + P 707 IF( SELF.NE.LAST ) THEN 708 CALL DGESD2D( ONEDCONTEXT, 1, 1, DSEND, 1, 0, NEXT ) 709 CALL IGESD2D( ONEDCONTEXT, 1, 1, NBETA, 1, 0, NEXT ) 710 END IF 711 IF( SELF.NE.MOD( FOUND, P ) ) THEN 712 CALL DGERV2D( ONEDCONTEXT, 1, 1, DRECV, 1, 0, PREV ) 713 CALL IGERV2D( ONEDCONTEXT, 1, 1, IRECV, 1, 0, PREV ) 714 ELSE 715 DRECV = TMP1 716 IRECV = ITMP1 717 END IF 718 WORK( 1 ) = MAX( LSAVE, DRECV ) 719 IWORK( 1 ) = IRECV 720 ALPHA = MAX( ALPHA, WORK( 1 ) ) 721 NALPHA = MAX( NALPHA, IRECV ) 722 IF( BETA-ALPHA.LE.MAX( ATOLI, RELTOL*MAX( ABS( ALPHA ), 723 $ ABS( BETA ) ) ) ) THEN 724 MID = HALF*( ALPHA+BETA ) 725 DO 60 J = OFFSET + NALPHA + 1, OFFSET + NBETA 726 WORK( INDRW2+IM+1 ) = MID 727 IWORK( INDRIW1+IM+1 ) = BLKNO 728 IWORK( INDRIW2+IM+1 ) = J 729 IM = IM + 1 730 60 CONTINUE 731 WORK( 2 ) = ALPHA 732 IWORK( 2 ) = NALPHA 733 END IF 734 END IF 735 NEIGINT = IWORK( 2 ) - IWORK( 1 ) 736 IF( NEIGINT.LE.0 ) 737 $ GO TO 90 738* 739* Call the main computational routine 740* 741 CALL PDLAEBZ( 2, IN, NEIGINT, 1, ATOLI, RELTOL, PIVMIN, 742 $ WORK( INDRW1+2*IOFF+1 ), IWORK, WORK, IWORK, 743 $ IOUT, LSAVE, IEFLAG, IINFO ) 744 IF( IINFO.NE.0 ) THEN 745 INFO = 1 746 END IF 747 DO 80 I = 1, IOUT 748 MID = HALF*( WORK( 2*I-1 )+WORK( 2*I ) ) 749 IF( I.GT.IOUT-IINFO ) 750 $ BLKNO = -BLKNO 751 DO 70 J = OFFSET + IWORK( 2*I-1 ) + 1, 752 $ OFFSET + IWORK( 2*I ) 753 WORK( INDRW2+IM+1 ) = MID 754 IWORK( INDRIW1+IM+1 ) = BLKNO 755 IWORK( INDRIW2+IM+1 ) = J 756 IM = IM + 1 757 70 CONTINUE 758 80 CONTINUE 759 90 CONTINUE 760* 761* Find out total number of eigenvalues computed 762* 763 100 CONTINUE 764 M = IM 765 CALL IGSUM2D( ONEDCONTEXT, 'ALL', ' ', 1, 1, M, 1, -1, -1 ) 766* 767* Move the eigenvalues found to their final destinations 768* 769 DO 130 I = 1, P 770 IF( SELF.EQ.I-1 ) THEN 771 CALL IGEBS2D( ONEDCONTEXT, 'ALL', ' ', 1, 1, IM, 1 ) 772 IF( IM.NE.0 ) THEN 773 CALL IGEBS2D( ONEDCONTEXT, 'ALL', ' ', IM, 1, 774 $ IWORK( INDRIW2+1 ), IM ) 775 CALL DGEBS2D( ONEDCONTEXT, 'ALL', ' ', IM, 1, 776 $ WORK( INDRW2+1 ), IM ) 777 CALL IGEBS2D( ONEDCONTEXT, 'ALL', ' ', IM, 1, 778 $ IWORK( INDRIW1+1 ), IM ) 779 DO 110 J = 1, IM 780 W( IWORK( INDRIW2+J ) ) = WORK( INDRW2+J ) 781 IBLOCK( IWORK( INDRIW2+J ) ) = IWORK( INDRIW1+J ) 782 110 CONTINUE 783 END IF 784 ELSE 785 CALL IGEBR2D( ONEDCONTEXT, 'ALL', ' ', 1, 1, TORECV, 1, 0, 786 $ I-1 ) 787 IF( TORECV.NE.0 ) THEN 788 CALL IGEBR2D( ONEDCONTEXT, 'ALL', ' ', TORECV, 1, IWORK, 789 $ TORECV, 0, I-1 ) 790 CALL DGEBR2D( ONEDCONTEXT, 'ALL', ' ', TORECV, 1, WORK, 791 $ TORECV, 0, I-1 ) 792 CALL IGEBR2D( ONEDCONTEXT, 'ALL', ' ', TORECV, 1, 793 $ IWORK( N+1 ), TORECV, 0, I-1 ) 794 DO 120 J = 1, TORECV 795 W( IWORK( J ) ) = WORK( J ) 796 IBLOCK( IWORK( J ) ) = IWORK( N+J ) 797 120 CONTINUE 798 END IF 799 END IF 800 130 CONTINUE 801 IF( NSPLIT.GT.1 .AND. IORDER.EQ.1 ) THEN 802* 803* Sort the eigenvalues 804* 805* 806 DO 140 I = 1, M 807 IWORK( M+I ) = I 808 140 CONTINUE 809 CALL DLASRT2( 'I', M, W, IWORK( M+1 ), IINFO ) 810 DO 150 I = 1, M 811 IWORK( I ) = IBLOCK( I ) 812 150 CONTINUE 813 DO 160 I = 1, M 814 IBLOCK( I ) = IWORK( IWORK( M+I ) ) 815 160 CONTINUE 816 END IF 817 IF( IRANGE.EQ.3 .AND. ( LEXTRA.GT.0 .OR. REXTRA.GT.0 ) ) THEN 818* 819* Discard unwanted eigenvalues (occurs only when RANGE = 'I', 820* and eigenvalues IL, and/or IU are in a cluster) 821* 822 DO 170 I = 1, M 823 WORK( I ) = W( I ) 824 IWORK( I ) = I 825 IWORK( M+I ) = I 826 170 CONTINUE 827 DO 190 I = 1, LEXTRA 828 ITMP1 = I 829 DO 180 J = I + 1, M 830 IF( WORK( J ).LT.WORK( ITMP1 ) ) THEN 831 ITMP1 = J 832 END IF 833 180 CONTINUE 834 TMP1 = WORK( I ) 835 WORK( I ) = WORK( ITMP1 ) 836 WORK( ITMP1 ) = TMP1 837 IWORK( IWORK( M+ITMP1 ) ) = I 838 IWORK( IWORK( M+I ) ) = ITMP1 839 ITMP2 = IWORK( M+I ) 840 IWORK( M+I ) = IWORK( M+ITMP1 ) 841 IWORK( M+ITMP1 ) = ITMP2 842 190 CONTINUE 843 DO 210 I = 1, REXTRA 844 ITMP1 = M - I + 1 845 DO 200 J = M - I, LEXTRA + 1, -1 846 IF( WORK( J ).GT.WORK( ITMP1 ) ) THEN 847 ITMP1 = J 848 END IF 849 200 CONTINUE 850 TMP1 = WORK( M-I+1 ) 851 WORK( M-I+1 ) = WORK( ITMP1 ) 852 WORK( ITMP1 ) = TMP1 853 IWORK( IWORK( M+ITMP1 ) ) = M - I + 1 854 IWORK( IWORK( 2*M-I+1 ) ) = ITMP1 855 ITMP2 = IWORK( 2*M-I+1 ) 856 IWORK( 2*M-I+1 ) = IWORK( M+ITMP1 ) 857 IWORK( M+ITMP1 ) = ITMP2 858* IWORK( ITMP1 ) = 1 859 210 CONTINUE 860 J = 0 861 DO 220 I = 1, M 862 IF( IWORK( I ).GT.LEXTRA .AND. IWORK( I ).LE.M-REXTRA ) THEN 863 J = J + 1 864 W( J ) = WORK( IWORK( I ) ) 865 IBLOCK( J ) = IBLOCK( I ) 866 END IF 867 220 CONTINUE 868 M = M - LEXTRA - REXTRA 869 END IF 870 IF( M.NE.ILAST-IFRST+1 ) THEN 871 INFO = 2 872 END IF 873* 874 230 CONTINUE 875 CALL BLACS_FREEBUFF( ONEDCONTEXT, 1 ) 876 CALL BLACS_GRIDEXIT( ONEDCONTEXT ) 877 RETURN 878* 879* End of PDSTEBZ 880* 881 END 882* 883 SUBROUTINE PDLAEBZ( IJOB, N, MMAX, MINP, ABSTOL, RELTOL, PIVMIN, 884 $ D, NVAL, INTVL, INTVLCT, MOUT, LSAVE, IEFLAG, 885 $ INFO ) 886* 887* -- ScaLAPACK routine (version 1.7) -- 888* University of Tennessee, Knoxville, Oak Ridge National Laboratory, 889* and University of California, Berkeley. 890* November 15, 1997 891* 892* 893* .. Scalar Arguments .. 894 INTEGER IEFLAG, IJOB, INFO, MINP, MMAX, MOUT, N 895 DOUBLE PRECISION ABSTOL, LSAVE, PIVMIN, RELTOL 896* .. 897* .. Array Arguments .. 898 INTEGER INTVLCT( * ), NVAL( * ) 899 DOUBLE PRECISION D( * ), INTVL( * ) 900* .. 901* 902* Purpose 903* ======= 904* 905* PDLAEBZ contains the iteration loop which computes the eigenvalues 906* contained in the input intervals [ INTVL(2*j-1), INTVL(2*j) ] where 907* j = 1,...,MINP. It uses and computes the function N(w), which is 908* the count of eigenvalues of a symmetric tridiagonal matrix less than 909* or equal to its argument w. 910* 911* This is a ScaLAPACK internal subroutine and arguments are not 912* checked for unreasonable values. 913* 914* Arguments 915* ========= 916* 917* IJOB (input) INTEGER 918* Specifies the computation done by PDLAEBZ 919* = 0 : Find an interval with desired values of N(w) at the 920* endpoints of the interval. 921* = 1 : Find a floating point number contained in the initial 922* interval with a desired value of N(w). 923* = 2 : Perform bisection iteration to find eigenvalues of T. 924* 925* N (input) INTEGER 926* The order of the tridiagonal matrix T. N >= 1. 927* 928* MMAX (input) INTEGER 929* The maximum number of intervals that may be generated. If 930* more than MMAX intervals are generated, then PDLAEBZ will 931* quit with INFO = MMAX+1. 932* 933* MINP (input) INTEGER 934* The initial number of intervals. MINP <= MMAX. 935* 936* ABSTOL (input) DOUBLE PRECISION 937* The minimum (absolute) width of an interval. When an interval 938* is narrower than ABSTOL, or than RELTOL times the larger (in 939* magnitude) endpoint, then it is considered to be sufficiently 940* small, i.e., converged. 941* This must be at least zero. 942* 943* RELTOL (input) DOUBLE PRECISION 944* The minimum relative width of an interval. When an interval 945* is narrower than ABSTOL, or than RELTOL times the larger (in 946* magnitude) endpoint, then it is considered to be sufficiently 947* small, i.e., converged. 948* Note : This should be at least radix*machine epsilon. 949* 950* PIVMIN (input) DOUBLE PRECISION 951* The minimum absolute of a "pivot" in the "paranoid" 952* implementation of the Sturm sequence loop. This must be at 953* least max_j |e(j)^2| *safe_min, and at least safe_min, where 954* safe_min is at least the smallest number that can divide 1.0 955* without overflow. 956* See PDLAPDCT for the "paranoid" implementation of the Sturm 957* sequence loop. 958* 959* D (input) DOUBLE PRECISION array, dimension (2*N - 1) 960* Contains the diagonals and the squares of the off-diagonal 961* elements of the tridiagonal matrix T. These elements are 962* assumed to be interleaved in memory for better cache 963* performance. The diagonal entries of T are in the entries 964* D(1),D(3),...,D(2*N-1), while the squares of the off-diagonal 965* entries are D(2),D(4),...,D(2*N-2). To avoid overflow, the 966* matrix must be scaled so that its largest entry is no greater 967* than overflow**(1/2) * underflow**(1/4) in absolute value, 968* and for greatest accuracy, it should not be much smaller 969* than that. 970* 971* NVAL (input/output) INTEGER array, dimension (4) 972* If IJOB = 0, the desired values of N(w) are in NVAL(1) and 973* NVAL(2). 974* If IJOB = 1, NVAL(2) is the desired value of N(w). 975* If IJOB = 2, not referenced. 976* This array will, in general, be reordered on output. 977* 978* INTVL (input/output) DOUBLE PRECISION array, dimension (2*MMAX) 979* The endpoints of the intervals. INTVL(2*j-1) is the left 980* endpoint of the j-th interval, and INTVL(2*j) is the right 981* endpoint of the j-th interval. The input intervals will, 982* in general, be modified, split and reordered by the 983* calculation. 984* On input, INTVL contains the MINP input intervals. 985* On output, INTVL contains the converged intervals. 986* 987* INTVLCT (input/output) INTEGER array, dimension (2*MMAX) 988* The counts at the endpoints of the intervals. INTVLCT(2*j-1) 989* is the count at the left endpoint of the j-th interval, i.e., 990* the function value N(INTVL(2*j-1)), and INTVLCT(2*j) is the 991* count at the right endpoint of the j-th interval. 992* On input, INTVLCT contains the counts at the endpoints of 993* the MINP input intervals. 994* On output, INTVLCT contains the counts at the endpoints of 995* the converged intervals. 996* 997* MOUT (output) INTEGER 998* The number of intervals output. 999* 1000* LSAVE (output) DOUBLE PRECISION 1001* If IJOB = 0 or 2, not referenced. 1002* If IJOB = 1, this is the largest floating point number 1003* encountered which has count N(w) = NVAL(1). 1004* 1005* IEFLAG (input) INTEGER 1006* A flag which indicates whether N(w) should be speeded up by 1007* exploiting IEEE Arithmetic. 1008* 1009* INFO (output) INTEGER 1010* = 0 : All intervals converged. 1011* = 1 - MMAX : The last INFO intervals did not converge. 1012* = MMAX + 1 : More than MMAX intervals were generated. 1013* 1014* ===================================================================== 1015* 1016* .. Intrinsic Functions .. 1017 INTRINSIC ABS, INT, LOG, MAX, MIN 1018* .. 1019* .. External Subroutines .. 1020 EXTERNAL PDLAECV, PDLAIECTB, PDLAIECTL, PDLAPDCT 1021* .. 1022* .. Parameters .. 1023 DOUBLE PRECISION ZERO, TWO, HALF 1024 PARAMETER ( ZERO = 0.0D+0, TWO = 2.0D+0, 1025 $ HALF = 1.0D+0 / TWO ) 1026* .. 1027* .. Local Scalars .. 1028 INTEGER I, ITMAX, J, K, KF, KL, KLNEW, L, LCNT, LREQ, 1029 $ NALPHA, NBETA, NMID, RCNT, RREQ 1030 DOUBLE PRECISION ALPHA, BETA, MID 1031* .. 1032* .. Executable Statements .. 1033* 1034 KF = 1 1035 KL = MINP + 1 1036 INFO = 0 1037 IF( INTVL( 2 )-INTVL( 1 ).LE.ZERO ) THEN 1038 INFO = MINP 1039 MOUT = KF 1040 RETURN 1041 END IF 1042 IF( IJOB.EQ.0 ) THEN 1043* 1044* Check if some input intervals have "converged" 1045* 1046 CALL PDLAECV( 0, KF, KL, INTVL, INTVLCT, NVAL, 1047 $ MAX( ABSTOL, PIVMIN ), RELTOL ) 1048 IF( KF.GE.KL ) 1049 $ GO TO 60 1050* 1051* Compute upper bound on number of iterations needed 1052* 1053 ITMAX = INT( ( LOG( INTVL( 2 )-INTVL( 1 )+PIVMIN )- 1054 $ LOG( PIVMIN ) ) / LOG( TWO ) ) + 2 1055* 1056* Iteration Loop 1057* 1058 DO 20 I = 1, ITMAX 1059 KLNEW = KL 1060 DO 10 J = KF, KL - 1 1061 K = 2*J 1062* 1063* Bisect the interval and find the count at that point 1064* 1065 MID = HALF*( INTVL( K-1 )+INTVL( K ) ) 1066 IF( IEFLAG.EQ.0 ) THEN 1067 CALL PDLAPDCT( MID, N, D, PIVMIN, NMID ) 1068 ELSE IF( IEFLAG.EQ.1 ) THEN 1069 CALL PDLAIECTB( MID, N, D, NMID ) 1070 ELSE 1071 CALL PDLAIECTL( MID, N, D, NMID ) 1072 END IF 1073 LREQ = NVAL( K-1 ) 1074 RREQ = NVAL( K ) 1075 IF( KL.EQ.1 ) 1076 $ NMID = MIN( INTVLCT( K ), 1077 $ MAX( INTVLCT( K-1 ), NMID ) ) 1078 IF( NMID.LE.NVAL( K-1 ) ) THEN 1079 INTVL( K-1 ) = MID 1080 INTVLCT( K-1 ) = NMID 1081 END IF 1082 IF( NMID.GE.NVAL( K ) ) THEN 1083 INTVL( K ) = MID 1084 INTVLCT( K ) = NMID 1085 END IF 1086 IF( NMID.GT.LREQ .AND. NMID.LT.RREQ ) THEN 1087 L = 2*KLNEW 1088 INTVL( L-1 ) = MID 1089 INTVL( L ) = INTVL( K ) 1090 INTVLCT( L-1 ) = NVAL( K ) 1091 INTVLCT( L ) = INTVLCT( K ) 1092 INTVL( K ) = MID 1093 INTVLCT( K ) = NVAL( K-1 ) 1094 NVAL( L-1 ) = NVAL( K ) 1095 NVAL( L ) = NVAL( L-1 ) 1096 NVAL( K ) = NVAL( K-1 ) 1097 KLNEW = KLNEW + 1 1098 END IF 1099 10 CONTINUE 1100 KL = KLNEW 1101 CALL PDLAECV( 0, KF, KL, INTVL, INTVLCT, NVAL, 1102 $ MAX( ABSTOL, PIVMIN ), RELTOL ) 1103 IF( KF.GE.KL ) 1104 $ GO TO 60 1105 20 CONTINUE 1106 ELSE IF( IJOB.EQ.1 ) THEN 1107 ALPHA = INTVL( 1 ) 1108 BETA = INTVL( 2 ) 1109 NALPHA = INTVLCT( 1 ) 1110 NBETA = INTVLCT( 2 ) 1111 LSAVE = ALPHA 1112 LREQ = NVAL( 1 ) 1113 RREQ = NVAL( 2 ) 1114 30 CONTINUE 1115 IF( NBETA.NE.RREQ .AND. BETA-ALPHA.GT. 1116 $ MAX( ABSTOL, RELTOL*MAX( ABS( ALPHA ), ABS( BETA ) ) ) ) 1117 $ THEN 1118* 1119* Bisect the interval and find the count at that point 1120* 1121 MID = HALF*( ALPHA+BETA ) 1122 IF( IEFLAG.EQ.0 ) THEN 1123 CALL PDLAPDCT( MID, N, D, PIVMIN, NMID ) 1124 ELSE IF( IEFLAG.EQ.1 ) THEN 1125 CALL PDLAIECTB( MID, N, D, NMID ) 1126 ELSE 1127 CALL PDLAIECTL( MID, N, D, NMID ) 1128 END IF 1129 NMID = MIN( NBETA, MAX( NALPHA, NMID ) ) 1130 IF( NMID.GE.RREQ ) THEN 1131 BETA = MID 1132 NBETA = NMID 1133 ELSE 1134 ALPHA = MID 1135 NALPHA = NMID 1136 IF( NMID.EQ.LREQ ) 1137 $ LSAVE = ALPHA 1138 END IF 1139 GO TO 30 1140 END IF 1141 KL = KF 1142 INTVL( 1 ) = ALPHA 1143 INTVL( 2 ) = BETA 1144 INTVLCT( 1 ) = NALPHA 1145 INTVLCT( 2 ) = NBETA 1146 ELSE IF( IJOB.EQ.2 ) THEN 1147* 1148* Check if some input intervals have "converged" 1149* 1150 CALL PDLAECV( 1, KF, KL, INTVL, INTVLCT, NVAL, 1151 $ MAX( ABSTOL, PIVMIN ), RELTOL ) 1152 IF( KF.GE.KL ) 1153 $ GO TO 60 1154* 1155* Compute upper bound on number of iterations needed 1156* 1157 ITMAX = INT( ( LOG( INTVL( 2 )-INTVL( 1 )+PIVMIN )- 1158 $ LOG( PIVMIN ) ) / LOG( TWO ) ) + 2 1159* 1160* Iteration Loop 1161* 1162 DO 50 I = 1, ITMAX 1163 KLNEW = KL 1164 DO 40 J = KF, KL - 1 1165 K = 2*J 1166 MID = HALF*( INTVL( K-1 )+INTVL( K ) ) 1167 IF( IEFLAG.EQ.0 ) THEN 1168 CALL PDLAPDCT( MID, N, D, PIVMIN, NMID ) 1169 ELSE IF( IEFLAG.EQ.1 ) THEN 1170 CALL PDLAIECTB( MID, N, D, NMID ) 1171 ELSE 1172 CALL PDLAIECTL( MID, N, D, NMID ) 1173 END IF 1174 LCNT = INTVLCT( K-1 ) 1175 RCNT = INTVLCT( K ) 1176 NMID = MIN( RCNT, MAX( LCNT, NMID ) ) 1177* 1178* Form New Interval(s) 1179* 1180 IF( NMID.EQ.LCNT ) THEN 1181 INTVL( K-1 ) = MID 1182 ELSE IF( NMID.EQ.RCNT ) THEN 1183 INTVL( K ) = MID 1184 ELSE IF( KLNEW.LT.MMAX+1 ) THEN 1185 L = 2*KLNEW 1186 INTVL( L-1 ) = MID 1187 INTVL( L ) = INTVL( K ) 1188 INTVLCT( L-1 ) = NMID 1189 INTVLCT( L ) = INTVLCT( K ) 1190 INTVL( K ) = MID 1191 INTVLCT( K ) = NMID 1192 KLNEW = KLNEW + 1 1193 ELSE 1194 INFO = MMAX + 1 1195 RETURN 1196 END IF 1197 40 CONTINUE 1198 KL = KLNEW 1199 CALL PDLAECV( 1, KF, KL, INTVL, INTVLCT, NVAL, 1200 $ MAX( ABSTOL, PIVMIN ), RELTOL ) 1201 IF( KF.GE.KL ) 1202 $ GO TO 60 1203 50 CONTINUE 1204 END IF 1205 60 CONTINUE 1206 INFO = MAX( KL-KF, 0 ) 1207 MOUT = KL - 1 1208 RETURN 1209* 1210* End of PDLAEBZ 1211* 1212 END 1213* 1214* 1215 SUBROUTINE PDLAECV( IJOB, KF, KL, INTVL, INTVLCT, NVAL, ABSTOL, 1216 $ RELTOL ) 1217* 1218* -- ScaLAPACK routine (version 1.7) -- 1219* University of Tennessee, Knoxville, Oak Ridge National Laboratory, 1220* and University of California, Berkeley. 1221* November 15, 1997 1222* 1223* 1224* .. Scalar Arguments .. 1225 INTEGER IJOB, KF, KL 1226 DOUBLE PRECISION ABSTOL, RELTOL 1227* .. 1228* .. Array Arguments .. 1229 INTEGER INTVLCT( * ), NVAL( * ) 1230 DOUBLE PRECISION INTVL( * ) 1231* .. 1232* 1233* Purpose 1234* ======= 1235* 1236* PDLAECV checks if the input intervals [ INTVL(2*i-1), INTVL(2*i) ], 1237* i = KF, ... , KL-1, have "converged". 1238* PDLAECV modifies KF to be the index of the last converged interval, 1239* i.e., on output, all intervals [ INTVL(2*i-1), INTVL(2*i) ], i < KF, 1240* have converged. Note that the input intervals may be reordered by 1241* PDLAECV. 1242* 1243* This is a SCALAPACK internal procedure and arguments are not checked 1244* for unreasonable values. 1245* 1246* Arguments 1247* ========= 1248* 1249* IJOB (input) INTEGER 1250* Specifies the criterion for "convergence" of an interval. 1251* = 0 : When an interval is narrower than ABSTOL, or than 1252* RELTOL times the larger (in magnitude) endpoint, then 1253* it is considered to have "converged". 1254* = 1 : When an interval is narrower than ABSTOL, or than 1255* RELTOL times the larger (in magnitude) endpoint, or if 1256* the counts at the endpoints are identical to the counts 1257* specified by NVAL ( see NVAL ) then the interval is 1258* considered to have "converged". 1259* 1260* KF (input/output) INTEGER 1261* On input, the index of the first input interval is 2*KF-1. 1262* On output, the index of the last converged interval 1263* is 2*KF-3. 1264* 1265* KL (input) INTEGER 1266* The index of the last input interval is 2*KL-3. 1267* 1268* INTVL (input/output) DOUBLE PRECISION array, dimension (2*(KL-KF)) 1269* The endpoints of the intervals. INTVL(2*j-1) is the left 1270* oendpoint f the j-th interval, and INTVL(2*j) is the right 1271* endpoint of the j-th interval. The input intervals will, 1272* in general, be reordered on output. 1273* On input, INTVL contains the KL-KF input intervals. 1274* On output, INTVL contains the converged intervals, 1 thru' 1275* KF-1, and the unconverged intervals, KF thru' KL-1. 1276* 1277* INTVLCT (input/output) INTEGER array, dimension (2*(KL-KF)) 1278* The counts at the endpoints of the intervals. INTVLCT(2*j-1) 1279* is the count at the left endpoint of the j-th interval, i.e., 1280* the function value N(INTVL(2*j-1)), and INTVLCT(2*j) is the 1281* count at the right endpoint of the j-th interval. This array 1282* will, in general, be reordered on output. 1283* See the comments in PDLAEBZ for more on the function N(w). 1284* 1285* NVAL (input/output) INTEGER array, dimension (2*(KL-KF)) 1286* The desired counts, N(w), at the endpoints of the 1287* corresponding intervals. This array will, in general, 1288* be reordered on output. 1289* 1290* ABSTOL (input) DOUBLE PRECISION 1291* The minimum (absolute) width of an interval. When an interval 1292* is narrower than ABSTOL, or than RELTOL times the larger (in 1293* magnitude) endpoint, then it is considered to be sufficiently 1294* small, i.e., converged. 1295* Note : This must be at least zero. 1296* 1297* RELTOL (input) DOUBLE PRECISION 1298* The minimum relative width of an interval. When an interval 1299* is narrower than ABSTOL, or than RELTOL times the larger (in 1300* magnitude) endpoint, then it is considered to be sufficiently 1301* small, i.e., converged. 1302* Note : This should be at least radix*machine epsilon. 1303* 1304* ===================================================================== 1305* 1306* .. Intrinsic Functions .. 1307 INTRINSIC ABS, MAX 1308* .. 1309* .. Local Scalars .. 1310 LOGICAL CONDN 1311 INTEGER I, ITMP1, ITMP2, J, K, KFNEW 1312 DOUBLE PRECISION TMP1, TMP2, TMP3, TMP4 1313* .. 1314* .. Executable Statements .. 1315* 1316 KFNEW = KF 1317 DO 10 I = KF, KL - 1 1318 K = 2*I 1319 TMP3 = INTVL( K-1 ) 1320 TMP4 = INTVL( K ) 1321 TMP1 = ABS( TMP4-TMP3 ) 1322 TMP2 = MAX( ABS( TMP3 ), ABS( TMP4 ) ) 1323 CONDN = TMP1.LT.MAX( ABSTOL, RELTOL*TMP2 ) 1324 IF( IJOB.EQ.0 ) 1325 $ CONDN = CONDN .OR. ( ( INTVLCT( K-1 ).EQ.NVAL( K-1 ) ) .AND. 1326 $ INTVLCT( K ).EQ.NVAL( K ) ) 1327 IF( CONDN ) THEN 1328 IF( I.GT.KFNEW ) THEN 1329* 1330* Reorder Intervals 1331* 1332 J = 2*KFNEW 1333 TMP1 = INTVL( K-1 ) 1334 TMP2 = INTVL( K ) 1335 ITMP1 = INTVLCT( K-1 ) 1336 ITMP2 = INTVLCT( K ) 1337 INTVL( K-1 ) = INTVL( J-1 ) 1338 INTVL( K ) = INTVL( J ) 1339 INTVLCT( K-1 ) = INTVLCT( J-1 ) 1340 INTVLCT( K ) = INTVLCT( J ) 1341 INTVL( J-1 ) = TMP1 1342 INTVL( J ) = TMP2 1343 INTVLCT( J-1 ) = ITMP1 1344 INTVLCT( J ) = ITMP2 1345 IF( IJOB.EQ.0 ) THEN 1346 ITMP1 = NVAL( K-1 ) 1347 NVAL( K-1 ) = NVAL( J-1 ) 1348 NVAL( J-1 ) = ITMP1 1349 ITMP1 = NVAL( K ) 1350 NVAL( K ) = NVAL( J ) 1351 NVAL( J ) = ITMP1 1352 END IF 1353 END IF 1354 KFNEW = KFNEW + 1 1355 END IF 1356 10 CONTINUE 1357 KF = KFNEW 1358 RETURN 1359* 1360* End of PDLAECV 1361* 1362 END 1363* 1364 SUBROUTINE PDLAPDCT( SIGMA, N, D, PIVMIN, COUNT ) 1365* 1366* -- ScaLAPACK routine (version 1.7) -- 1367* University of Tennessee, Knoxville, Oak Ridge National Laboratory, 1368* and University of California, Berkeley. 1369* November 15, 1997 1370* 1371* 1372* .. Scalar Arguments .. 1373 INTEGER COUNT, N 1374 DOUBLE PRECISION PIVMIN, SIGMA 1375* .. 1376* .. Array Arguments .. 1377 DOUBLE PRECISION D( * ) 1378* .. 1379* 1380* Purpose 1381* ======= 1382* 1383* PDLAPDCT counts the number of negative eigenvalues of (T - SIGMA I). 1384* This implementation of the Sturm Sequence loop has conditionals in 1385* the innermost loop to avoid overflow and determine the sign of a 1386* floating point number. PDLAPDCT will be referred to as the "paranoid" 1387* implementation of the Sturm Sequence loop. 1388* 1389* This is a SCALAPACK internal procedure and arguments are not checked 1390* for unreasonable values. 1391* 1392* Arguments 1393* ========= 1394* 1395* SIGMA (input) DOUBLE PRECISION 1396* The shift. PDLAPDCT finds the number of eigenvalues of T less 1397* than or equal to SIGMA. 1398* 1399* N (input) INTEGER 1400* The order of the tridiagonal matrix T. N >= 1. 1401* 1402* D (input) DOUBLE PRECISION array, dimension (2*N - 1) 1403* Contains the diagonals and the squares of the off-diagonal 1404* elements of the tridiagonal matrix T. These elements are 1405* assumed to be interleaved in memory for better cache 1406* performance. The diagonal entries of T are in the entries 1407* D(1),D(3),...,D(2*N-1), while the squares of the off-diagonal 1408* entries are D(2),D(4),...,D(2*N-2). To avoid overflow, the 1409* matrix must be scaled so that its largest entry is no greater 1410* than overflow**(1/2) * underflow**(1/4) in absolute value, 1411* and for greatest accuracy, it should not be much smaller 1412* than that. 1413* 1414* PIVMIN (input) DOUBLE PRECISION 1415* The minimum absolute of a "pivot" in this "paranoid" 1416* implementation of the Sturm sequence loop. This must be at 1417* least max_j |e(j)^2| *safe_min, and at least safe_min, where 1418* safe_min is at least the smallest number that can divide 1.0 1419* without overflow. 1420* 1421* COUNT (output) INTEGER 1422* The count of the number of eigenvalues of T less than or 1423* equal to SIGMA. 1424* 1425* ===================================================================== 1426* 1427* .. Intrinsic Functions .. 1428 INTRINSIC ABS 1429* .. 1430* .. Parameters .. 1431 DOUBLE PRECISION ZERO 1432 PARAMETER ( ZERO = 0.0D+0 ) 1433* .. 1434* .. Local Scalars .. 1435 INTEGER I 1436 DOUBLE PRECISION TMP 1437* .. 1438* .. Executable Statements .. 1439* 1440 TMP = D( 1 ) - SIGMA 1441 IF( ABS( TMP ).LE.PIVMIN ) 1442 $ TMP = -PIVMIN 1443 COUNT = 0 1444 IF( TMP.LE.ZERO ) 1445 $ COUNT = 1 1446 DO 10 I = 3, 2*N - 1, 2 1447 TMP = D( I ) - D( I-1 ) / TMP - SIGMA 1448 IF( ABS( TMP ).LE.PIVMIN ) 1449 $ TMP = -PIVMIN 1450 IF( TMP.LE.ZERO ) 1451 $ COUNT = COUNT + 1 1452 10 CONTINUE 1453* 1454 RETURN 1455* 1456* End of PDLAPDCT 1457* 1458 END 1459