1 SUBROUTINE PDLAHQR( WANTT, WANTZ, N, ILO, IHI, A, DESCA, WR, WI, 2 $ ILOZ, IHIZ, Z, DESCZ, WORK, LWORK, IWORK, 3 $ ILWORK, INFO ) 4* 5* -- ScaLAPACK routine (version 2.0.2) -- 6* Univ. of Tennessee, Univ. of California Berkeley, Univ. of Colorado Denver 7* May 1 2012 8* 9* .. Scalar Arguments .. 10 LOGICAL WANTT, WANTZ 11 INTEGER IHI, IHIZ, ILO, ILOZ, ILWORK, INFO, LWORK, N 12* .. 13* .. Array Arguments .. 14 INTEGER DESCA( * ), DESCZ( * ), IWORK( * ) 15 DOUBLE PRECISION A( * ), WI( * ), WORK( * ), WR( * ), Z( * ) 16* .. 17* 18* Purpose 19* ======= 20* 21* PDLAHQR is an auxiliary routine used to find the Schur decomposition 22* and or eigenvalues of a matrix already in Hessenberg form from 23* cols ILO to IHI. 24* 25* Notes 26* ===== 27* 28* Each global data object is described by an associated description 29* vector. This vector stores the information required to establish 30* the mapping between an object element and its corresponding process 31* and memory location. 32* 33* Let A be a generic term for any 2D block cyclicly distributed array. 34* Such a global array has an associated description vector DESCA. 35* In the following comments, the character _ should be read as 36* "of the global array". 37* 38* NOTATION STORED IN EXPLANATION 39* --------------- -------------- -------------------------------------- 40* DTYPE_A(global) DESCA( DTYPE_ )The descriptor type. In this case, 41* DTYPE_A = 1. 42* CTXT_A (global) DESCA( CTXT_ ) The BLACS context handle, indicating 43* the BLACS process grid A is distribu- 44* ted over. The context itself is glo- 45* bal, but the handle (the integer 46* value) may vary. 47* M_A (global) DESCA( M_ ) The number of rows in the global 48* array A. 49* N_A (global) DESCA( N_ ) The number of columns in the global 50* array A. 51* MB_A (global) DESCA( MB_ ) The blocking factor used to distribute 52* the rows of the array. 53* NB_A (global) DESCA( NB_ ) The blocking factor used to distribute 54* the columns of the array. 55* RSRC_A (global) DESCA( RSRC_ ) The process row over which the first 56* row of the array A is distributed. 57* CSRC_A (global) DESCA( CSRC_ ) The process column over which the 58* first column of the array A is 59* distributed. 60* LLD_A (local) DESCA( LLD_ ) The leading dimension of the local 61* array. LLD_A >= MAX(1,LOCr(M_A)). 62* 63* Let K be the number of rows or columns of a distributed matrix, 64* and assume that its process grid has dimension p x q. 65* LOCr( K ) denotes the number of elements of K that a process 66* would receive if K were distributed over the p processes of its 67* process column. 68* Similarly, LOCc( K ) denotes the number of elements of K that a 69* process would receive if K were distributed over the q processes of 70* its process row. 71* The values of LOCr() and LOCc() may be determined via a call to the 72* ScaLAPACK tool function, NUMROC: 73* LOCr( M ) = NUMROC( M, MB_A, MYROW, RSRC_A, NPROW ), 74* LOCc( N ) = NUMROC( N, NB_A, MYCOL, CSRC_A, NPCOL ). 75* An upper bound for these quantities may be computed by: 76* LOCr( M ) <= ceil( ceil(M/MB_A)/NPROW )*MB_A 77* LOCc( N ) <= ceil( ceil(N/NB_A)/NPCOL )*NB_A 78* 79* Arguments 80* ========= 81* 82* WANTT (global input) LOGICAL 83* = .TRUE. : the full Schur form T is required; 84* = .FALSE.: only eigenvalues are required. 85* 86* WANTZ (global input) LOGICAL 87* = .TRUE. : the matrix of Schur vectors Z is required; 88* = .FALSE.: Schur vectors are not required. 89* 90* N (global input) INTEGER 91* The order of the Hessenberg matrix A (and Z if WANTZ). 92* N >= 0. 93* 94* ILO (global input) INTEGER 95* IHI (global input) INTEGER 96* It is assumed that A is already upper quasi-triangular in 97* rows and columns IHI+1:N, and that A(ILO,ILO-1) = 0 (unless 98* ILO = 1). PDLAHQR works primarily with the Hessenberg 99* submatrix in rows and columns ILO to IHI, but applies 100* transformations to all of H if WANTT is .TRUE.. 101* 1 <= ILO <= max(1,IHI); IHI <= N. 102* 103* A (global input/output) DOUBLE PRECISION array, dimension 104* (DESCA(LLD_),*) 105* On entry, the upper Hessenberg matrix A. 106* On exit, if WANTT is .TRUE., A is upper quasi-triangular in 107* rows and columns ILO:IHI, with any 2-by-2 or larger diagonal 108* blocks not yet in standard form. If WANTT is .FALSE., the 109* contents of A are unspecified on exit. 110* 111* DESCA (global and local input) INTEGER array of dimension DLEN_. 112* The array descriptor for the distributed matrix A. 113* 114* WR (global replicated output) DOUBLE PRECISION array, 115* dimension (N) 116* WI (global replicated output) DOUBLE PRECISION array, 117* dimension (N) 118* The real and imaginary parts, respectively, of the computed 119* eigenvalues ILO to IHI are stored in the corresponding 120* elements of WR and WI. If two eigenvalues are computed as a 121* complex conjugate pair, they are stored in consecutive 122* elements of WR and WI, say the i-th and (i+1)th, with 123* WI(i) > 0 and WI(i+1) < 0. If WANTT is .TRUE., the 124* eigenvalues are stored in the same order as on the diagonal 125* of the Schur form returned in A. A may be returned with 126* larger diagonal blocks until the next release. 127* 128* ILOZ (global input) INTEGER 129* IHIZ (global input) INTEGER 130* Specify the rows of Z to which transformations must be 131* applied if WANTZ is .TRUE.. 132* 1 <= ILOZ <= ILO; IHI <= IHIZ <= N. 133* 134* Z (global input/output) DOUBLE PRECISION array. 135* If WANTZ is .TRUE., on entry Z must contain the current 136* matrix Z of transformations accumulated by PDHSEQR, and on 137* exit Z has been updated; transformations are applied only to 138* the submatrix Z(ILOZ:IHIZ,ILO:IHI). 139* If WANTZ is .FALSE., Z is not referenced. 140* 141* DESCZ (global and local input) INTEGER array of dimension DLEN_. 142* The array descriptor for the distributed matrix Z. 143* 144* WORK (local output) DOUBLE PRECISION array of size LWORK 145* 146* LWORK (local input) INTEGER 147* WORK(LWORK) is a local array and LWORK is assumed big enough 148* so that LWORK >= 3*N + 149* MAX( 2*MAX(DESCZ(LLD_),DESCA(LLD_)) + 2*LOCc(N), 150* 7*Ceil(N/HBL)/LCM(NPROW,NPCOL)) ) 151* 152* IWORK (global and local input) INTEGER array of size ILWORK 153* 154* ILWORK (local input) INTEGER 155* This holds the some of the IBLK integer arrays. This is held 156* as a place holder for the next release. 157* 158* INFO (global output) INTEGER 159* < 0: parameter number -INFO incorrect or inconsistent 160* = 0: successful exit 161* > 0: PDLAHQR failed to compute all the eigenvalues ILO to IHI 162* in a total of 30*(IHI-ILO+1) iterations; if INFO = i, 163* elements i+1:ihi of WR and WI contain those eigenvalues 164* which have been successfully computed. 165* 166* Logic: 167* This algorithm is very similar to _LAHQR. Unlike _LAHQR, 168* instead of sending one double shift through the largest 169* unreduced submatrix, this algorithm sends multiple double shifts 170* and spaces them apart so that there can be parallelism across 171* several processor row/columns. Another critical difference is 172* that this algorithm aggregrates multiple transforms together in 173* order to apply them in a block fashion. 174* 175* Important Local Variables: 176* IBLK = The maximum number of bulges that can be computed. 177* Currently fixed. Future releases this won't be fixed. 178* HBL = The square block size (HBL=DESCA(MB_)=DESCA(NB_)) 179* ROTN = The number of transforms to block together 180* NBULGE = The number of bulges that will be attempted on the 181* current submatrix. 182* IBULGE = The current number of bulges started. 183* K1(*),K2(*) = The current bulge loops from K1(*) to K2(*). 184* 185* Subroutines: 186* This routine calls: 187* PDLACONSB -> To determine where to start each iteration 188* PDLAWIL -> Given the shift, get the transformation 189* DLASORTE -> Pair up eigenvalues so that reals are paired. 190* PDLACP3 -> Parallel array to local replicated array copy & 191* back. 192* DLAREF -> Row/column reflector applier. Core routine 193* here. 194* PDLASMSUB -> Finds negligible subdiagonal elements. 195* 196* Current Notes and/or Restrictions: 197* 1.) This code requires the distributed block size to be square 198* and at least six (6); unlike simpler codes like LU, this 199* algorithm is extremely sensitive to block size. Unwise 200* choices of too small a block size can lead to bad 201* performance. 202* 2.) This code requires A and Z to be distributed identically 203* and have identical contxts. 204* 3.) This release currently does not have a routine for 205* resolving the Schur blocks into regular 2x2 form after 206* this code is completed. Because of this, a significant 207* performance impact is required while the deflation is done 208* by sometimes a single column of processors. 209* 4.) This code does not currently block the initial transforms 210* so that none of the rows or columns for any bulge are 211* completed until all are started. To offset pipeline 212* start-up it is recommended that at least 2*LCM(NPROW,NPCOL) 213* bulges are used (if possible) 214* 5.) The maximum number of bulges currently supported is fixed at 215* 32. In future versions this will be limited only by the 216* incoming WORK array. 217* 6.) The matrix A must be in upper Hessenberg form. If elements 218* below the subdiagonal are nonzero, the resulting transforms 219* may be nonsimilar. This is also true with the LAPACK 220* routine. 221* 7.) For this release, it is assumed RSRC_=CSRC_=0 222* 8.) Currently, all the eigenvalues are distributed to all the 223* nodes. Future releases will probably distribute the 224* eigenvalues by the column partitioning. 225* 9.) The internals of this routine are subject to change. 226* 227* Implemented by: G. Henry, November 17, 1996 228* 229* ===================================================================== 230* 231* .. Parameters .. 232 INTEGER BLOCK_CYCLIC_2D, CSRC_, CTXT_, DLEN_, DTYPE_, 233 $ LLD_, MB_, M_, NB_, N_, RSRC_ 234 PARAMETER ( BLOCK_CYCLIC_2D = 1, DLEN_ = 9, DTYPE_ = 1, 235 $ CTXT_ = 2, M_ = 3, N_ = 4, MB_ = 5, NB_ = 6, 236 $ RSRC_ = 7, CSRC_ = 8, LLD_ = 9 ) 237 DOUBLE PRECISION ZERO, ONE, HALF 238 PARAMETER ( ZERO = 0.0D+0, ONE = 1.0D+0, HALF = 0.5D+0 ) 239 DOUBLE PRECISION CONST 240 PARAMETER ( CONST = 1.50D+0 ) 241 INTEGER IBLK 242 PARAMETER ( IBLK = 32 ) 243* .. 244* .. Local Scalars .. 245 INTEGER CONTXT, DOWN, HBL, I, I1, I2, IAFIRST, IBULGE, 246 $ ICBUF, ICOL, ICOL1, ICOL2, IDIA, IERR, II, 247 $ IRBUF, IROW, IROW1, IROW2, ISPEC, ISTART, 248 $ ISTARTCOL, ISTARTROW, ISTOP, ISUB, ISUP, 249 $ ITERMAX, ITMP1, ITMP2, ITN, ITS, J, JAFIRST, 250 $ JBLK, JJ, K, KI, L, LCMRC, LDA, LDZ, LEFT, 251 $ LIHIH, LIHIZ, LILOH, LILOZ, LOCALI1, LOCALI2, 252 $ LOCALK, LOCALM, M, MODKM1, MYCOL, MYROW, 253 $ NBULGE, NH, NODE, NPCOL, NPROW, NR, NUM, NZ, 254 $ RIGHT, ROTN, UP, VECSIDX 255 DOUBLE PRECISION AVE, DISC, H00, H10, H11, H12, H21, H22, H33, 256 $ H43H34, H44, OVFL, S, SMLNUM, SUM, T1, T1COPY, 257 $ T2, T3, ULP, UNFL, V1SAVE, V2, V2SAVE, V3, 258 $ V3SAVE, CS, SN 259* .. 260* .. Local Arrays .. 261 INTEGER ICURCOL( IBLK ), ICURROW( IBLK ), K1( IBLK ), 262 $ K2( IBLK ), KCOL( IBLK ), KP2COL( IBLK ), 263 $ KP2ROW( IBLK ), KROW( IBLK ), LOCALK2( IBLK ) 264 DOUBLE PRECISION S1( 2*IBLK, 2*IBLK ), SMALLA( 6, 6, IBLK ), 265 $ VCOPY( 3 ) 266* .. 267* .. External Functions .. 268 INTEGER ILCM, NUMROC 269 DOUBLE PRECISION PDLAMCH 270 EXTERNAL ILCM, NUMROC, PDLAMCH 271* .. 272* .. External Subroutines .. 273 EXTERNAL BLACS_GRIDINFO, DCOPY, DGEBR2D, DGEBS2D, 274 $ DGERV2D, DGESD2D, DGSUM2D, DLAHQR, DLAREF, 275 $ DLARFG, DLASORTE, IGAMN2D, INFOG1L, INFOG2L, 276 $ PDLABAD, PDLACONSB, PDLACP3, PDLASMSUB, 277 $ PDLAWIL, PXERBLA, DLANV2 278* .. 279* .. Intrinsic Functions .. 280 INTRINSIC ABS, MAX, MIN, MOD, SIGN, SQRT 281* .. 282* .. Executable Statements .. 283* 284 INFO = 0 285* 286 ITERMAX = 30*( IHI-ILO+1 ) 287* ITERMAX = 0 288 IF( N.EQ.0 ) 289 $ RETURN 290* 291* NODE (IAFIRST,JAFIRST) OWNS A(1,1) 292* 293 HBL = DESCA( MB_ ) 294 CONTXT = DESCA( CTXT_ ) 295 LDA = DESCA( LLD_ ) 296 IAFIRST = DESCA( RSRC_ ) 297 JAFIRST = DESCA( CSRC_ ) 298 LDZ = DESCZ( LLD_ ) 299 CALL BLACS_GRIDINFO( CONTXT, NPROW, NPCOL, MYROW, MYCOL ) 300 NODE = MYROW*NPCOL + MYCOL 301 NUM = NPROW*NPCOL 302 LEFT = MOD( MYCOL+NPCOL-1, NPCOL ) 303 RIGHT = MOD( MYCOL+1, NPCOL ) 304 UP = MOD( MYROW+NPROW-1, NPROW ) 305 DOWN = MOD( MYROW+1, NPROW ) 306 LCMRC = ILCM( NPROW, NPCOL ) 307* 308* Determine the number of columns we have so we can check workspace 309* 310 LOCALK = NUMROC( N, HBL, MYCOL, JAFIRST, NPCOL ) 311 JJ = N / HBL 312 IF( JJ*HBL.LT.N ) 313 $ JJ = JJ + 1 314 JJ = 7*JJ / LCMRC 315 IF( LWORK.LT.3*N+MAX( 2*MAX( LDA, LDZ )+2*LOCALK, JJ ) ) THEN 316 INFO = -15 317 END IF 318 IF( DESCZ( CTXT_ ).NE.DESCA( CTXT_ ) ) THEN 319 INFO = -( 1300+CTXT_ ) 320 END IF 321 IF( DESCA( MB_ ).NE.DESCA( NB_ ) ) THEN 322 INFO = -( 700+NB_ ) 323 END IF 324 IF( DESCZ( MB_ ).NE.DESCZ( NB_ ) ) THEN 325 INFO = -( 1300+NB_ ) 326 END IF 327 IF( DESCA( MB_ ).NE.DESCZ( MB_ ) ) THEN 328 INFO = -( 1300+MB_ ) 329 END IF 330 IF( ( DESCA( RSRC_ ).NE.0 ) .OR. ( DESCA( CSRC_ ).NE.0 ) ) THEN 331 INFO = -( 700+RSRC_ ) 332 END IF 333 IF( ( DESCZ( RSRC_ ).NE.0 ) .OR. ( DESCZ( CSRC_ ).NE.0 ) ) THEN 334 INFO = -( 1300+RSRC_ ) 335 END IF 336 IF( ( ILO.GT.N ) .OR. ( ILO.LT.1 ) ) THEN 337 INFO = -4 338 END IF 339 IF( ( IHI.GT.N ) .OR. ( IHI.LT.1 ) ) THEN 340 INFO = -5 341 END IF 342 IF( HBL.LT.5 ) THEN 343 INFO = -( 700+MB_ ) 344 END IF 345 CALL IGAMN2D( CONTXT, 'ALL', ' ', 1, 1, INFO, 1, ITMP1, ITMP2, -1, 346 $ -1, -1 ) 347 IF( INFO.LT.0 ) THEN 348 CALL PXERBLA( CONTXT, 'PDLAHQR', -INFO ) 349 RETURN 350 END IF 351* 352* Set work array indices 353* 354 VECSIDX = 0 355 IDIA = 3*N 356 ISUB = 3*N 357 ISUP = 3*N 358 IRBUF = 3*N 359 ICBUF = 3*N 360* 361* Find a value for ROTN 362* 363 ROTN = HBL / 3 364 ROTN = MAX( ROTN, HBL-2 ) 365 ROTN = MIN( ROTN, 1 ) 366* 367 IF( ILO.EQ.IHI ) THEN 368 CALL INFOG2L( ILO, ILO, DESCA, NPROW, NPCOL, MYROW, MYCOL, 369 $ IROW, ICOL, II, JJ ) 370 IF( ( MYROW.EQ.II ) .AND. ( MYCOL.EQ.JJ ) ) THEN 371 WR( ILO ) = A( ( ICOL-1 )*LDA+IROW ) 372 ELSE 373 WR( ILO ) = ZERO 374 END IF 375 WI( ILO ) = ZERO 376 RETURN 377 END IF 378* 379 NH = IHI - ILO + 1 380 NZ = IHIZ - ILOZ + 1 381* 382 CALL INFOG1L( ILOZ, HBL, NPROW, MYROW, 0, LILOZ, LIHIZ ) 383 LIHIZ = NUMROC( IHIZ, HBL, MYROW, 0, NPROW ) 384* 385* Set machine-dependent constants for the stopping criterion. 386* If NORM(H) <= SQRT(OVFL), overflow should not occur. 387* 388 UNFL = PDLAMCH( CONTXT, 'SAFE MINIMUM' ) 389 OVFL = ONE / UNFL 390 CALL PDLABAD( CONTXT, UNFL, OVFL ) 391 ULP = PDLAMCH( CONTXT, 'PRECISION' ) 392 SMLNUM = UNFL*( NH / ULP ) 393* 394* I1 and I2 are the indices of the first row and last column of H 395* to which transformations must be applied. If eigenvalues only are 396* being computed, I1 and I2 are set inside the main loop. 397* 398 IF( WANTT ) THEN 399 I1 = 1 400 I2 = N 401 END IF 402* 403* ITN is the total number of QR iterations allowed. 404* 405 ITN = ITERMAX 406* 407* The main loop begins here. I is the loop index and decreases from 408* IHI to ILO in steps of our schur block size (<=2*IBLK). Each 409* iteration of the loop works with the active submatrix in rows 410* and columns L to I. Eigenvalues I+1 to IHI have already 411* converged. Either L = ILO or the global A(L,L-1) is negligible 412* so that the matrix splits. 413* 414 I = IHI 415 10 CONTINUE 416 L = ILO 417 IF( I.LT.ILO ) 418 $ GO TO 450 419* 420* Perform QR iterations on rows and columns ILO to I until a 421* submatrix of order 1 or 2 splits off at the bottom because a 422* subdiagonal element has become negligible. 423* 424 DO 420 ITS = 0, ITN 425* 426* Look for a single small subdiagonal element. 427* 428 CALL PDLASMSUB( A, DESCA, I, L, K, SMLNUM, WORK( IRBUF+1 ), 429 $ LWORK-IRBUF ) 430 L = K 431* 432 IF( L.GT.ILO ) THEN 433* 434* H(L,L-1) is negligible 435* 436 CALL INFOG2L( L, L-1, DESCA, NPROW, NPCOL, MYROW, MYCOL, 437 $ IROW, ICOL, ITMP1, ITMP2 ) 438 IF( ( MYROW.EQ.ITMP1 ) .AND. ( MYCOL.EQ.ITMP2 ) ) THEN 439 A( ( ICOL-1 )*LDA+IROW ) = ZERO 440 END IF 441 WORK( ISUB+L-1 ) = ZERO 442 END IF 443* 444* Exit from loop if a submatrix of order 1 or 2 has split off. 445* 446 M = L - 10 447* IF ( L .GE. I - (2*IBLK-1) ) 448* IF ( L .GE. I - MAX(2*IBLK-1,HBL) ) 449 IF( L.GE.I-1 ) 450 $ GO TO 430 451* 452* Now the active submatrix is in rows and columns L to I. If 453* eigenvalues only are being computed, only the active submatrix 454* need be transformed. 455* 456 IF( .NOT.WANTT ) THEN 457 I1 = L 458 I2 = I 459 END IF 460* 461* Copy submatrix of size 2*JBLK and prepare to do generalized 462* Wilkinson shift or an exceptional shift 463* 464 JBLK = MIN( IBLK, ( ( I-L+1 ) / 2 )-1 ) 465 IF( JBLK.GT.LCMRC ) THEN 466* 467* Make sure it's divisible by LCM (we want even workloads!) 468* 469 JBLK = JBLK - MOD( JBLK, LCMRC ) 470 END IF 471 JBLK = MIN( JBLK, 2*LCMRC ) 472 JBLK = MAX( JBLK, 1 ) 473* 474 CALL PDLACP3( 2*JBLK, I-2*JBLK+1, A, DESCA, S1, 2*IBLK, -1, -1, 475 $ 0 ) 476 IF( ITS.EQ.20 .OR. ITS.EQ.40 ) THEN 477* 478* Exceptional shift. 479* 480 DO 20 II = 2*JBLK, 2, -1 481 S1( II, II ) = CONST*( ABS( S1( II, II ) )+ 482 $ ABS( S1( II, II-1 ) ) ) 483 S1( II, II-1 ) = ZERO 484 S1( II-1, II ) = ZERO 485 20 CONTINUE 486 S1( 1, 1 ) = CONST*ABS( S1( 1, 1 ) ) 487 ELSE 488 CALL DLAHQR( .FALSE., .FALSE., 2*JBLK, 1, 2*JBLK, S1, 489 $ 2*IBLK, WORK( IRBUF+1 ), WORK( ICBUF+1 ), 1, 490 $ 2*JBLK, Z, LDZ, IERR ) 491* 492* Prepare to use Wilkinson's double shift 493* 494 H44 = S1( 2*JBLK, 2*JBLK ) 495 H33 = S1( 2*JBLK-1, 2*JBLK-1 ) 496 H43H34 = S1( 2*JBLK-1, 2*JBLK )*S1( 2*JBLK, 2*JBLK-1 ) 497 IF( ( JBLK.GT.1 ) .AND. ( ITS.GT.30 ) ) THEN 498 S = S1( 2*JBLK-1, 2*JBLK-2 ) 499 DISC = ( H33-H44 )*HALF 500 DISC = DISC*DISC + H43H34 501 IF( DISC.GT.ZERO ) THEN 502* 503* Real roots: Use Wilkinson's shift twice 504* 505 DISC = SQRT( DISC ) 506 AVE = HALF*( H33+H44 ) 507 IF( ABS( H33 )-ABS( H44 ).GT.ZERO ) THEN 508 H33 = H33*H44 - H43H34 509 H44 = H33 / ( SIGN( DISC, AVE )+AVE ) 510 ELSE 511 H44 = SIGN( DISC, AVE ) + AVE 512 END IF 513 H33 = H44 514 H43H34 = ZERO 515 END IF 516 END IF 517 END IF 518* 519* Look for two consecutive small subdiagonal elements: 520* PDLACONSB is the routine that does this. 521* 522c CALL PDLACONSB( A, DESCA, I, L, M, H44, H33, H43H34, 523c $ WORK( IRBUF+1 ), LWORK-IRBUF ) 524* 525* Skip small submatrices 526* 527* IF ( M .GE. I - 5 ) 528* $ GO TO 80 529* 530* In principle PDLACONSB needs to check all shifts to decide 531* whether two consecutive small subdiagonal entries are suitable 532* as the starting position of the bulge chasing phase. It can be 533* dangerous to check the first pair of shifts only. Moreover it 534* is quite rare to obtain an M which is much larger than L. This 535* process is a bit expensive compared with the benefit. 536* Therefore it is sensible to abandon this routine. Total amount 537* of communications is saved in average. 538* 539 M = L 540* Double-shift QR step 541* 542* NBULGE is the number of bulges that will be attempted 543* 544 ISTOP = MIN( M+ROTN-MOD( M, ROTN ), I-2 ) 545 ISTOP = MIN( ISTOP, M+HBL-3-MOD( M-1, HBL ) ) 546 ISTOP = MIN( ISTOP, I2-2 ) 547 ISTOP = MAX( ISTOP, M ) 548 NBULGE = ( I-1-ISTOP ) / HBL 549* 550* Do not exceed maximum determined. 551* 552 NBULGE = MIN( NBULGE, JBLK ) 553 IF( NBULGE.GT.LCMRC ) THEN 554* 555* Make sure it's divisible by LCM (we want even workloads!) 556* 557 NBULGE = NBULGE - MOD( NBULGE, LCMRC ) 558 END IF 559 NBULGE = MAX( NBULGE, 1 ) 560* 561 IF( ( ITS.NE.20 ) .AND. ( ITS.NE.40 ) .AND. ( NBULGE.GT.1 ) ) 562 $ THEN 563* 564* sort the eigenpairs so that they are in twos for double 565* shifts. only call if several need sorting 566* 567 CALL DLASORTE( S1( 2*( JBLK-NBULGE )+1, 568 $ 2*( JBLK-NBULGE )+1 ), 2*IBLK, 2*NBULGE, 569 $ WORK( IRBUF+1 ), IERR ) 570 END IF 571* 572* IBULGE is the number of bulges going so far 573* 574 IBULGE = 1 575* 576* "A" row defs : main row transforms from LOCALK to LOCALI2 577* 578 CALL INFOG1L( M, HBL, NPCOL, MYCOL, 0, ITMP1, LOCALK ) 579 LOCALK = NUMROC( N, HBL, MYCOL, 0, NPCOL ) 580 CALL INFOG1L( 1, HBL, NPCOL, MYCOL, 0, ICOL1, LOCALI2 ) 581 LOCALI2 = NUMROC( I2, HBL, MYCOL, 0, NPCOL ) 582* 583* "A" col defs : main col transforms from LOCALI1 to LOCALM 584* 585 CALL INFOG1L( I1, HBL, NPROW, MYROW, 0, LOCALI1, ICOL1 ) 586 ICOL1 = NUMROC( N, HBL, MYROW, 0, NPROW ) 587 CALL INFOG1L( 1, HBL, NPROW, MYROW, 0, LOCALM, ICOL1 ) 588 ICOL1 = NUMROC( MIN( M+3, I ), HBL, MYROW, 0, NPROW ) 589* 590* Which row & column will start the bulges 591* 592 ISTARTROW = MOD( ( M+1 ) / HBL, NPROW ) + IAFIRST 593 ISTARTCOL = MOD( ( M+1 ) / HBL, NPCOL ) + JAFIRST 594* 595 CALL INFOG1L( M, HBL, NPROW, MYROW, 0, II, ITMP2 ) 596 ITMP2 = NUMROC( N, HBL, MYROW, 0, NPROW ) 597 CALL INFOG1L( M, HBL, NPCOL, MYCOL, 0, JJ, ITMP2 ) 598 ITMP2 = NUMROC( N, HBL, MYCOL, 0, NPCOL ) 599 CALL INFOG1L( 1, HBL, NPROW, MYROW, 0, ISTOP, KP2ROW( 1 ) ) 600 KP2ROW( 1 ) = NUMROC( M+2, HBL, MYROW, 0, NPROW ) 601 CALL INFOG1L( 1, HBL, NPCOL, MYCOL, 0, ISTOP, KP2COL( 1 ) ) 602 KP2COL( 1 ) = NUMROC( M+2, HBL, MYCOL, 0, NPCOL ) 603* 604* Set all values for bulges. All bulges are stored in 605* intermediate steps as loops over KI. Their current "task" 606* over the global M to I-1 values is always K1(KI) to K2(KI). 607* However, because there are many bulges, K1(KI) & K2(KI) might 608* go past that range while later bulges (KI+1,KI+2,etc..) are 609* finishing up. 610* 611* Rules: 612* If MOD(K1(KI)-1,HBL) < HBL-2 then MOD(K2(KI)-1,HBL)<HBL-2 613* If MOD(K1(KI)-1,HBL) = HBL-2 then MOD(K2(KI)-1,HBL)=HBL-2 614* If MOD(K1(KI)-1,HBL) = HBL-1 then MOD(K2(KI)-1,HBL)=HBL-1 615* K2(KI)-K1(KI) <= ROTN 616* 617* We first hit a border when MOD(K1(KI)-1,HBL)=HBL-2 and we hit 618* it again when MOD(K1(KI)-1,HBL)=HBL-1. 619* 620 DO 30 KI = 1, NBULGE 621 K1( KI ) = M 622 ISTOP = MIN( M+ROTN-MOD( M, ROTN ), I-2 ) 623 ISTOP = MIN( ISTOP, M+HBL-3-MOD( M-1, HBL ) ) 624 ISTOP = MIN( ISTOP, I2-2 ) 625 ISTOP = MAX( ISTOP, M ) 626 K2( KI ) = ISTOP 627 ICURROW( KI ) = ISTARTROW 628 ICURCOL( KI ) = ISTARTCOL 629 LOCALK2( KI ) = ITMP1 630 KROW( KI ) = II 631 KCOL( KI ) = JJ 632 IF( KI.GT.1 ) 633 $ KP2ROW( KI ) = KP2ROW( 1 ) 634 IF( KI.GT.1 ) 635 $ KP2COL( KI ) = KP2COL( 1 ) 636 30 CONTINUE 637* 638* Get first transform on node who owns M+2,M+2 639* 640 DO 31 ITMP1 = 1, 3 641 VCOPY(ITMP1) = ZERO 642 31 CONTINUE 643 ITMP1 = ISTARTROW 644 ITMP2 = ISTARTCOL 645 CALL PDLAWIL( ITMP1, ITMP2, M, A, DESCA, H44, H33, H43H34, 646 $ VCOPY ) 647 V1SAVE = VCOPY( 1 ) 648 V2SAVE = VCOPY( 2 ) 649 V3SAVE = VCOPY( 3 ) 650 IF( K2( IBULGE ).LE.I-1 ) THEN 651 40 CONTINUE 652 IF( ( K1( IBULGE ).GE.M+5 ) .AND. ( IBULGE.LT.NBULGE ) ) 653 $ THEN 654 IF( ( MOD( K2( IBULGE )+2, HBL ).EQ.MOD( K2( IBULGE+1 )+ 655 $ 2, HBL ) ) .AND. ( K1( 1 ).LE.I-1 ) ) THEN 656 H44 = S1( 2*JBLK-2*IBULGE, 2*JBLK-2*IBULGE ) 657 H33 = S1( 2*JBLK-2*IBULGE-1, 2*JBLK-2*IBULGE-1 ) 658 H43H34 = S1( 2*JBLK-2*IBULGE-1, 2*JBLK-2*IBULGE )* 659 $ S1( 2*JBLK-2*IBULGE, 2*JBLK-2*IBULGE-1 ) 660 ITMP1 = ISTARTROW 661 ITMP2 = ISTARTCOL 662 CALL PDLAWIL( ITMP1, ITMP2, M, A, DESCA, H44, H33, 663 $ H43H34, VCOPY ) 664 V1SAVE = VCOPY( 1 ) 665 V2SAVE = VCOPY( 2 ) 666 V3SAVE = VCOPY( 3 ) 667 IBULGE = IBULGE + 1 668 END IF 669 END IF 670* 671* When we hit a border, there are row and column transforms that 672* overlap over several processors and the code gets very 673* "congested." As a remedy, when we first hit a border, a 6x6 674* *local* matrix is generated on one node (called SMALLA) and 675* work is done on that. At the end of the border, the data is 676* passed back and everything stays a lot simpler. 677* 678 DO 80 KI = 1, IBULGE 679* 680 ISTART = MAX( K1( KI ), M ) 681 ISTOP = MIN( K2( KI ), I-1 ) 682 K = ISTART 683 MODKM1 = MOD( K-1, HBL ) 684 IF( ( MODKM1.GE.HBL-2 ) .AND. ( K.LE.I-1 ) ) THEN 685 DO 81 ITMP1 = 1, 6 686 DO 82 ITMP2 = 1, 6 687 SMALLA(ITMP1, ITMP2, KI) = ZERO 688 82 CONTINUE 689 81 CONTINUE 690 IF( ( MODKM1.EQ.HBL-2 ) .AND. ( K.LT.I-1 ) ) THEN 691* 692* Copy 6 elements from global A(K-1:K+4,K-1:K+4) 693* 694 CALL INFOG2L( K+2, K+2, DESCA, NPROW, NPCOL, MYROW, 695 $ MYCOL, IROW1, ICOL1, ITMP1, ITMP2 ) 696 CALL PDLACP3( MIN( 6, N-K+2 ), K-1, A, DESCA, 697 $ SMALLA( 1, 1, KI ), 6, ITMP1, ITMP2, 698 $ 0 ) 699 END IF 700 IF( MODKM1.EQ.HBL-1 ) THEN 701* 702* Copy 6 elements from global A(K-2:K+3,K-2:K+3) 703* 704 CALL INFOG2L( K+1, K+1, DESCA, NPROW, NPCOL, MYROW, 705 $ MYCOL, IROW1, ICOL1, ITMP1, ITMP2 ) 706 CALL PDLACP3( MIN( 6, N-K+3 ), K-2, A, DESCA, 707 $ SMALLA( 1, 1, KI ), 6, ITMP1, ITMP2, 708 $ 0 ) 709 END IF 710 END IF 711* 712* DLAHQR used to have a single row application and a single 713* column application to H. Here we do something a little 714* more clever. We break each transformation down into 3 715* parts: 716* 1.) The minimum amount of work it takes to determine 717* a group of ROTN transformations (this is on 718* the critical path.) (Loops 130-180) 719* 2.) The small work it takes so that each of the rows 720* and columns is at the same place. For example, 721* all ROTN row transforms are all complete 722* through some column TMP. (Loops within 190) 723* 3.) The majority of the row and column transforms 724* are then applied in a block fashion. 725* (Loops 290 on.) 726* 727* Each of these three parts are further subdivided into 3 728* parts: 729* A.) Work at the start of a border when 730* MOD(ISTART-1,HBL) = HBL-2 731* B.) Work at the end of a border when 732* MOD(ISTART-1,HBL) = HBL-1 733* C.) Work in the middle of the block when 734* MOD(ISTART-1,HBL) < HBL-2 735* 736 IF( ( MYROW.EQ.ICURROW( KI ) ) .AND. 737 $ ( MYCOL.EQ.ICURCOL( KI ) ) .AND. 738 $ ( MODKM1.EQ.HBL-2 ) .AND. 739 $ ( ISTART.LT.MIN( I-1, ISTOP+1 ) ) ) THEN 740 K = ISTART 741 NR = MIN( 3, I-K+1 ) 742 IF( K.GT.M ) THEN 743 CALL DCOPY( NR, SMALLA( 2, 1, KI ), 1, VCOPY, 1 ) 744 ELSE 745 VCOPY( 1 ) = V1SAVE 746 VCOPY( 2 ) = V2SAVE 747 VCOPY( 3 ) = V3SAVE 748 END IF 749 CALL DLARFG( NR, VCOPY( 1 ), VCOPY( 2 ), 1, T1COPY ) 750 IF( K.GT.M ) THEN 751 SMALLA( 2, 1, KI ) = VCOPY( 1 ) 752 SMALLA( 3, 1, KI ) = ZERO 753 IF( K.LT.I-1 ) 754 $ SMALLA( 4, 1, KI ) = ZERO 755 ELSE IF( M.GT.L ) THEN 756 SMALLA( 2, 1, KI ) = -SMALLA( 2, 1, KI ) 757 END IF 758 V2 = VCOPY( 2 ) 759 T2 = T1COPY*V2 760 WORK( VECSIDX+( K-1 )*3+1 ) = VCOPY( 2 ) 761 WORK( VECSIDX+( K-1 )*3+2 ) = VCOPY( 3 ) 762 WORK( VECSIDX+( K-1 )*3+3 ) = T1COPY 763 END IF 764* 765 IF( ( MOD( ISTOP-1, HBL ).EQ.HBL-1 ) .AND. 766 $ ( MYROW.EQ.ICURROW( KI ) ) .AND. 767 $ ( MYCOL.EQ.ICURCOL( KI ) ) .AND. 768 $ ( ISTART.LE.MIN( I, ISTOP ) ) ) THEN 769 K = ISTART 770 NR = MIN( 3, I-K+1 ) 771 IF( K.GT.M ) THEN 772 CALL DCOPY( NR, SMALLA( 3, 2, KI ), 1, VCOPY, 1 ) 773 ELSE 774 VCOPY( 1 ) = V1SAVE 775 VCOPY( 2 ) = V2SAVE 776 VCOPY( 3 ) = V3SAVE 777 END IF 778 CALL DLARFG( NR, VCOPY( 1 ), VCOPY( 2 ), 1, T1COPY ) 779 IF( K.GT.M ) THEN 780 SMALLA( 3, 2, KI ) = VCOPY( 1 ) 781 SMALLA( 4, 2, KI ) = ZERO 782 IF( K.LT.I-1 ) 783 $ SMALLA( 5, 2, KI ) = ZERO 784* 785* Set a subdiagonal to zero now if it's possible 786* 787* H11 = SMALLA(1,1,KI) 788* H10 = SMALLA(2,1,KI) 789* H22 = SMALLA(2,2,KI) 790* IF ( ABS(H10) .LE. MAX(ULP*(ABS(H11)+ABS(H22)), 791* $ SMLNUM) ) THEN 792* SMALLA(2,1,KI) = ZERO 793* WORK(ISUB+K-2) = ZERO 794* END IF 795 ELSE IF( M.GT.L ) THEN 796 SMALLA( 3, 2, KI ) = -SMALLA( 3, 2, KI ) 797 END IF 798 V2 = VCOPY( 2 ) 799 T2 = T1COPY*V2 800 WORK( VECSIDX+( K-1 )*3+1 ) = VCOPY( 2 ) 801 WORK( VECSIDX+( K-1 )*3+2 ) = VCOPY( 3 ) 802 WORK( VECSIDX+( K-1 )*3+3 ) = T1COPY 803 END IF 804* 805 IF( ( MODKM1.EQ.0 ) .AND. ( ISTART.LE.I-1 ) .AND. 806 $ ( MYROW.EQ.ICURROW( KI ) ) .AND. 807 $ ( RIGHT.EQ.ICURCOL( KI ) ) ) THEN 808* 809* (IROW1,ICOL1) is (I,J)-coordinates of H(ISTART,ISTART) 810* 811 IROW1 = KROW( KI ) 812 ICOL1 = LOCALK2( KI ) 813 IF( ISTART.GT.M ) THEN 814 VCOPY( 1 ) = SMALLA( 4, 3, KI ) 815 VCOPY( 2 ) = SMALLA( 5, 3, KI ) 816 VCOPY( 3 ) = SMALLA( 6, 3, KI ) 817 NR = MIN( 3, I-ISTART+1 ) 818 CALL DLARFG( NR, VCOPY( 1 ), VCOPY( 2 ), 1, 819 $ T1COPY ) 820 A( ( ICOL1-2 )*LDA+IROW1 ) = VCOPY( 1 ) 821 A( ( ICOL1-2 )*LDA+IROW1+1 ) = ZERO 822 IF( ISTART.LT.I-1 ) THEN 823 A( ( ICOL1-2 )*LDA+IROW1+2 ) = ZERO 824 END IF 825 ELSE 826 IF( M.GT.L ) THEN 827 A( ( ICOL1-2 )*LDA+IROW1 ) = -A( ( ICOL1-2 )* 828 $ LDA+IROW1 ) 829 END IF 830 END IF 831 END IF 832* 833 IF( ( MYROW.EQ.ICURROW( KI ) ) .AND. 834 $ ( MYCOL.EQ.ICURCOL( KI ) ) .AND. 835 $ ( ( ( MODKM1.EQ.HBL-2 ) .AND. ( ISTART.EQ.I- 836 $ 1 ) ) .OR. ( ( MODKM1.LT.HBL-2 ) .AND. ( ISTART.LE.I- 837 $ 1 ) ) ) ) THEN 838* 839* (IROW1,ICOL1) is (I,J)-coordinates of H(ISTART,ISTART) 840* 841 IROW1 = KROW( KI ) 842 ICOL1 = LOCALK2( KI ) 843 DO 70 K = ISTART, ISTOP 844* 845* Create and do these transforms 846* 847 NR = MIN( 3, I-K+1 ) 848 IF( K.GT.M ) THEN 849 IF( MOD( K-1, HBL ).EQ.0 ) THEN 850 VCOPY( 1 ) = SMALLA( 4, 3, KI ) 851 VCOPY( 2 ) = SMALLA( 5, 3, KI ) 852 VCOPY( 3 ) = SMALLA( 6, 3, KI ) 853 ELSE 854 VCOPY( 1 ) = A( ( ICOL1-2 )*LDA+IROW1 ) 855 VCOPY( 2 ) = A( ( ICOL1-2 )*LDA+IROW1+1 ) 856 IF( NR.EQ.3 ) THEN 857 VCOPY( 3 ) = A( ( ICOL1-2 )*LDA+IROW1+2 ) 858 END IF 859 END IF 860 ELSE 861 VCOPY( 1 ) = V1SAVE 862 VCOPY( 2 ) = V2SAVE 863 VCOPY( 3 ) = V3SAVE 864 END IF 865 CALL DLARFG( NR, VCOPY( 1 ), VCOPY( 2 ), 1, 866 $ T1COPY ) 867 IF( K.GT.M ) THEN 868 IF( MOD( K-1, HBL ).GT.0 ) THEN 869 A( ( ICOL1-2 )*LDA+IROW1 ) = VCOPY( 1 ) 870 A( ( ICOL1-2 )*LDA+IROW1+1 ) = ZERO 871 IF( K.LT.I-1 ) THEN 872 A( ( ICOL1-2 )*LDA+IROW1+2 ) = ZERO 873 END IF 874* 875* Set a subdiagonal to zero now if it's possible 876* 877* IF ( (IROW1.GT.2) .AND. (ICOL1.GT.2) .AND. 878* $ (MOD(K-1,HBL) .GT. 1) ) THEN 879* H11 = A((ICOL1-3)*LDA+IROW1-2) 880* H10 = A((ICOL1-3)*LDA+IROW1-1) 881* H22 = A((ICOL1-2)*LDA+IROW1-1) 882* IF ( ABS(H10).LE.MAX(ULP*(ABS(H11)+ABS(H22)), 883* $ SMLNUM) ) THEN 884* A((ICOL1-3)*LDA+IROW1-1) = ZERO 885* END IF 886* END IF 887 END IF 888 ELSE IF( M.GT.L ) THEN 889 IF( MOD( K-1, HBL ).GT.0 ) THEN 890 A( ( ICOL1-2 )*LDA+IROW1 ) = -A( ( ICOL1-2 )* 891 $ LDA+IROW1 ) 892 END IF 893 END IF 894 V2 = VCOPY( 2 ) 895 T2 = T1COPY*V2 896 WORK( VECSIDX+( K-1 )*3+1 ) = VCOPY( 2 ) 897 WORK( VECSIDX+( K-1 )*3+2 ) = VCOPY( 3 ) 898 WORK( VECSIDX+( K-1 )*3+3 ) = T1COPY 899 T1 = T1COPY 900 IF( K.LT.ISTOP ) THEN 901* 902* Do some work so next step is ready... 903* 904 V3 = VCOPY( 3 ) 905 T3 = T1*V3 906 DO 50 J = ICOL1, MIN( K2( KI )+1, I-1 ) + 907 $ ICOL1 - K 908 SUM = A( ( J-1 )*LDA+IROW1 ) + 909 $ V2*A( ( J-1 )*LDA+IROW1+1 ) + 910 $ V3*A( ( J-1 )*LDA+IROW1+2 ) 911 A( ( J-1 )*LDA+IROW1 ) = A( ( J-1 )*LDA+ 912 $ IROW1 ) - SUM*T1 913 A( ( J-1 )*LDA+IROW1+1 ) = A( ( J-1 )*LDA+ 914 $ IROW1+1 ) - SUM*T2 915 A( ( J-1 )*LDA+IROW1+2 ) = A( ( J-1 )*LDA+ 916 $ IROW1+2 ) - SUM*T3 917 50 CONTINUE 918 ITMP1 = LOCALK2( KI ) 919 DO 60 J = IROW1 + 1, IROW1 + 3 920 SUM = A( ( ICOL1-1 )*LDA+J ) + 921 $ V2*A( ICOL1*LDA+J ) + 922 $ V3*A( ( ICOL1+1 )*LDA+J ) 923 A( ( ICOL1-1 )*LDA+J ) = A( ( ICOL1-1 )*LDA+ 924 $ J ) - SUM*T1 925 A( ICOL1*LDA+J ) = A( ICOL1*LDA+J ) - SUM*T2 926 A( ( ICOL1+1 )*LDA+J ) = A( ( ICOL1+1 )*LDA+ 927 $ J ) - SUM*T3 928 60 CONTINUE 929 END IF 930 IROW1 = IROW1 + 1 931 ICOL1 = ICOL1 + 1 932 70 CONTINUE 933 END IF 934* 935 IF( MODKM1.EQ.HBL-2 ) THEN 936 IF( ( DOWN.EQ.ICURROW( KI ) ) .AND. 937 $ ( RIGHT.EQ.ICURCOL( KI ) ) .AND. ( NUM.GT.1 ) ) 938 $ THEN 939 CALL DGERV2D( CONTXT, 3, 1, 940 $ WORK( VECSIDX+( ISTART-1 )*3+1 ), 3, 941 $ DOWN, RIGHT ) 942 END IF 943 IF( ( MYROW.EQ.ICURROW( KI ) ) .AND. 944 $ ( MYCOL.EQ.ICURCOL( KI ) ) .AND. ( NUM.GT.1 ) ) 945 $ THEN 946 CALL DGESD2D( CONTXT, 3, 1, 947 $ WORK( VECSIDX+( ISTART-1 )*3+1 ), 3, 948 $ UP, LEFT ) 949 END IF 950 IF( ( DOWN.EQ.ICURROW( KI ) ) .AND. 951 $ ( NPCOL.GT.1 ) .AND. ( ISTART.LE.ISTOP ) ) THEN 952 JJ = MOD( ICURCOL( KI )+NPCOL-1, NPCOL ) 953 IF( MYCOL.NE.JJ ) THEN 954 CALL DGEBR2D( CONTXT, 'ROW', ' ', 955 $ 3*( ISTOP-ISTART+1 ), 1, 956 $ WORK( VECSIDX+( ISTART-1 )*3+1 ), 957 $ 3*( ISTOP-ISTART+1 ), MYROW, JJ ) 958 ELSE 959 CALL DGEBS2D( CONTXT, 'ROW', ' ', 960 $ 3*( ISTOP-ISTART+1 ), 1, 961 $ WORK( VECSIDX+( ISTART-1 )*3+1 ), 962 $ 3*( ISTOP-ISTART+1 ) ) 963 END IF 964 END IF 965 END IF 966* 967* Broadcast Householder information from the block 968* 969 IF( ( MYROW.EQ.ICURROW( KI ) ) .AND. ( NPCOL.GT.1 ) .AND. 970 $ ( ISTART.LE.ISTOP ) ) THEN 971 IF( MYCOL.NE.ICURCOL( KI ) ) THEN 972 CALL DGEBR2D( CONTXT, 'ROW', ' ', 973 $ 3*( ISTOP-ISTART+1 ), 1, 974 $ WORK( VECSIDX+( ISTART-1 )*3+1 ), 975 $ 3*( ISTOP-ISTART+1 ), MYROW, 976 $ ICURCOL( KI ) ) 977 ELSE 978 CALL DGEBS2D( CONTXT, 'ROW', ' ', 979 $ 3*( ISTOP-ISTART+1 ), 1, 980 $ WORK( VECSIDX+( ISTART-1 )*3+1 ), 981 $ 3*( ISTOP-ISTART+1 ) ) 982 END IF 983 END IF 984 80 CONTINUE 985* 986* Now do column transforms and finish work 987* 988 DO 90 KI = 1, IBULGE 989* 990 ISTART = MAX( K1( KI ), M ) 991 ISTOP = MIN( K2( KI ), I-1 ) 992* 993 IF( MOD( ISTART-1, HBL ).EQ.HBL-2 ) THEN 994 IF( ( RIGHT.EQ.ICURCOL( KI ) ) .AND. 995 $ ( NPROW.GT.1 ) .AND. ( ISTART.LE.ISTOP ) ) THEN 996 JJ = MOD( ICURROW( KI )+NPROW-1, NPROW ) 997 IF( MYROW.NE.JJ ) THEN 998 CALL DGEBR2D( CONTXT, 'COL', ' ', 999 $ 3*( ISTOP-ISTART+1 ), 1, 1000 $ WORK( VECSIDX+( ISTART-1 )*3+1 ), 1001 $ 3*( ISTOP-ISTART+1 ), JJ, MYCOL ) 1002 ELSE 1003 CALL DGEBS2D( CONTXT, 'COL', ' ', 1004 $ 3*( ISTOP-ISTART+1 ), 1, 1005 $ WORK( VECSIDX+( ISTART-1 )*3+1 ), 1006 $ 3*( ISTOP-ISTART+1 ) ) 1007 END IF 1008 END IF 1009 END IF 1010* 1011 IF( ( MYCOL.EQ.ICURCOL( KI ) ) .AND. ( NPROW.GT.1 ) .AND. 1012 $ ( ISTART.LE.ISTOP ) ) THEN 1013 IF( MYROW.NE.ICURROW( KI ) ) THEN 1014 CALL DGEBR2D( CONTXT, 'COL', ' ', 1015 $ 3*( ISTOP-ISTART+1 ), 1, 1016 $ WORK( VECSIDX+( ISTART-1 )*3+1 ), 1017 $ 3*( ISTOP-ISTART+1 ), ICURROW( KI ), 1018 $ MYCOL ) 1019 ELSE 1020 CALL DGEBS2D( CONTXT, 'COL', ' ', 1021 $ 3*( ISTOP-ISTART+1 ), 1, 1022 $ WORK( VECSIDX+( ISTART-1 )*3+1 ), 1023 $ 3*( ISTOP-ISTART+1 ) ) 1024 END IF 1025 END IF 1026 90 CONTINUE 1027* 1028* Now do make up work to have things in block fashion 1029* 1030 DO 150 KI = 1, IBULGE 1031 ISTART = MAX( K1( KI ), M ) 1032 ISTOP = MIN( K2( KI ), I-1 ) 1033* 1034 MODKM1 = MOD( ISTART-1, HBL ) 1035 IF( ( MYROW.EQ.ICURROW( KI ) ) .AND. 1036 $ ( MYCOL.EQ.ICURCOL( KI ) ) .AND. 1037 $ ( MODKM1.EQ.HBL-2 ) .AND. ( ISTART.LT.I-1 ) ) THEN 1038 K = ISTART 1039* 1040* Catch up on column & border work 1041* 1042 NR = MIN( 3, I-K+1 ) 1043 V2 = WORK( VECSIDX+( K-1 )*3+1 ) 1044 V3 = WORK( VECSIDX+( K-1 )*3+2 ) 1045 T1 = WORK( VECSIDX+( K-1 )*3+3 ) 1046 IF( NR.EQ.3 ) THEN 1047* 1048* Do some work so next step is ready... 1049* 1050* V3 = VCOPY( 3 ) 1051 T2 = T1*V2 1052 T3 = T1*V3 1053 ITMP1 = MIN( 6, I2+2-K ) 1054 ITMP2 = MAX( I1-K+2, 1 ) 1055 DO 100 J = 2, ITMP1 1056 SUM = SMALLA( 2, J, KI ) + 1057 $ V2*SMALLA( 3, J, KI ) + 1058 $ V3*SMALLA( 4, J, KI ) 1059 SMALLA( 2, J, KI ) = SMALLA( 2, J, KI ) - SUM*T1 1060 SMALLA( 3, J, KI ) = SMALLA( 3, J, KI ) - SUM*T2 1061 SMALLA( 4, J, KI ) = SMALLA( 4, J, KI ) - SUM*T3 1062 100 CONTINUE 1063 DO 110 J = ITMP2, 5 1064 SUM = SMALLA( J, 2, KI ) + 1065 $ V2*SMALLA( J, 3, KI ) + 1066 $ V3*SMALLA( J, 4, KI ) 1067 SMALLA( J, 2, KI ) = SMALLA( J, 2, KI ) - SUM*T1 1068 SMALLA( J, 3, KI ) = SMALLA( J, 3, KI ) - SUM*T2 1069 SMALLA( J, 4, KI ) = SMALLA( J, 4, KI ) - SUM*T3 1070 110 CONTINUE 1071 END IF 1072 END IF 1073* 1074 IF( ( MOD( ISTART-1, HBL ).EQ.HBL-1 ) .AND. 1075 $ ( ISTART.LE.ISTOP ) .AND. 1076 $ ( MYROW.EQ.ICURROW( KI ) ) .AND. 1077 $ ( MYCOL.EQ.ICURCOL( KI ) ) ) THEN 1078 K = ISTOP 1079* 1080* Catch up on column & border work 1081* 1082 NR = MIN( 3, I-K+1 ) 1083 V2 = WORK( VECSIDX+( K-1 )*3+1 ) 1084 V3 = WORK( VECSIDX+( K-1 )*3+2 ) 1085 T1 = WORK( VECSIDX+( K-1 )*3+3 ) 1086 IF( NR.EQ.3 ) THEN 1087* 1088* Do some work so next step is ready... 1089* 1090* V3 = VCOPY( 3 ) 1091 T2 = T1*V2 1092 T3 = T1*V3 1093 ITMP1 = MIN( 6, I2-K+3 ) 1094 ITMP2 = MAX( I1-K+3, 1 ) 1095 DO 120 J = 3, ITMP1 1096 SUM = SMALLA( 3, J, KI ) + 1097 $ V2*SMALLA( 4, J, KI ) + 1098 $ V3*SMALLA( 5, J, KI ) 1099 SMALLA( 3, J, KI ) = SMALLA( 3, J, KI ) - SUM*T1 1100 SMALLA( 4, J, KI ) = SMALLA( 4, J, KI ) - SUM*T2 1101 SMALLA( 5, J, KI ) = SMALLA( 5, J, KI ) - SUM*T3 1102 120 CONTINUE 1103 DO 130 J = ITMP2, 6 1104 SUM = SMALLA( J, 3, KI ) + 1105 $ V2*SMALLA( J, 4, KI ) + 1106 $ V3*SMALLA( J, 5, KI ) 1107 SMALLA( J, 3, KI ) = SMALLA( J, 3, KI ) - SUM*T1 1108 SMALLA( J, 4, KI ) = SMALLA( J, 4, KI ) - SUM*T2 1109 SMALLA( J, 5, KI ) = SMALLA( J, 5, KI ) - SUM*T3 1110 130 CONTINUE 1111 END IF 1112 END IF 1113* 1114 MODKM1 = MOD( ISTART-1, HBL ) 1115 IF( ( MYROW.EQ.ICURROW( KI ) ) .AND. 1116 $ ( MYCOL.EQ.ICURCOL( KI ) ) .AND. 1117 $ ( ( ( MODKM1.EQ.HBL-2 ) .AND. ( ISTART.EQ.I- 1118 $ 1 ) ) .OR. ( ( MODKM1.LT.HBL-2 ) .AND. ( ISTART.LE.I- 1119 $ 1 ) ) ) ) THEN 1120* 1121* (IROW1,ICOL1) is (I,J)-coordinates of H(ISTART,ISTART) 1122* 1123 IROW1 = KROW( KI ) 1124 ICOL1 = LOCALK2( KI ) 1125 DO 140 K = ISTART, ISTOP 1126* 1127* Catch up on column & border work 1128* 1129 NR = MIN( 3, I-K+1 ) 1130 V2 = WORK( VECSIDX+( K-1 )*3+1 ) 1131 V3 = WORK( VECSIDX+( K-1 )*3+2 ) 1132 T1 = WORK( VECSIDX+( K-1 )*3+3 ) 1133 IF( K.LT.ISTOP ) THEN 1134* 1135* Do some work so next step is ready... 1136* 1137 T2 = T1*V2 1138 T3 = T1*V3 1139 CALL DLAREF( 'Col', A, LDA, .FALSE., Z, LDZ, 1140 $ .FALSE., ICOL1, ICOL1, ISTART, 1141 $ ISTOP, MIN( ISTART+1, I )-K+IROW1, 1142 $ IROW1, LILOZ, LIHIZ, 1143 $ WORK( VECSIDX+1 ), V2, V3, T1, T2, 1144 $ T3 ) 1145 IROW1 = IROW1 + 1 1146 ICOL1 = ICOL1 + 1 1147 ELSE 1148 IF( ( NR.EQ.3 ) .AND. ( MOD( K-1, 1149 $ HBL ).LT.HBL-2 ) ) THEN 1150 T2 = T1*V2 1151 T3 = T1*V3 1152 CALL DLAREF( 'Row', A, LDA, .FALSE., Z, LDZ, 1153 $ .FALSE., IROW1, IROW1, ISTART, 1154 $ ISTOP, ICOL1, MIN( MIN( K2( KI ) 1155 $ +1, I-1 ), I2 )-K+ICOL1, LILOZ, 1156 $ LIHIZ, WORK( VECSIDX+1 ), V2, 1157 $ V3, T1, T2, T3 ) 1158 END IF 1159 END IF 1160 140 CONTINUE 1161 END IF 1162* 1163* Send SMALLA back again. 1164* 1165 K = ISTART 1166 MODKM1 = MOD( K-1, HBL ) 1167 IF( ( MODKM1.GE.HBL-2 ) .AND. ( K.LE.I-1 ) ) THEN 1168 IF( ( MODKM1.EQ.HBL-2 ) .AND. ( K.LT.I-1 ) ) THEN 1169* 1170* Copy 6 elements from global A(K-1:K+4,K-1:K+4) 1171* 1172 CALL INFOG2L( K+2, K+2, DESCA, NPROW, NPCOL, MYROW, 1173 $ MYCOL, IROW1, ICOL1, ITMP1, ITMP2 ) 1174 CALL PDLACP3( MIN( 6, N-K+2 ), K-1, A, DESCA, 1175 $ SMALLA( 1, 1, KI ), 6, ITMP1, ITMP2, 1176 $ 1 ) 1177* 1178 END IF 1179 IF( MODKM1.EQ.HBL-1 ) THEN 1180* 1181* Copy 6 elements from global A(K-2:K+3,K-2:K+3) 1182* 1183 CALL INFOG2L( K+1, K+1, DESCA, NPROW, NPCOL, MYROW, 1184 $ MYCOL, IROW1, ICOL1, ITMP1, ITMP2 ) 1185 CALL PDLACP3( MIN( 6, N-K+3 ), K-2, A, DESCA, 1186 $ SMALLA( 1, 1, KI ), 6, ITMP1, ITMP2, 1187 $ 1 ) 1188 END IF 1189 END IF 1190* 1191 150 CONTINUE 1192* 1193* Now start major set of block ROW reflections 1194* 1195 DO 160 KI = 1, IBULGE 1196 IF( ( MYROW.NE.ICURROW( KI ) ) .AND. 1197 $ ( DOWN.NE.ICURROW( KI ) ) )GO TO 160 1198 ISTART = MAX( K1( KI ), M ) 1199 ISTOP = MIN( K2( KI ), I-1 ) 1200* 1201 IF( ( ISTOP.GT.ISTART ) .AND. 1202 $ ( MOD( ISTART-1, HBL ).LT.HBL-2 ) .AND. 1203 $ ( ICURROW( KI ).EQ.MYROW ) ) THEN 1204 IROW1 = MIN( K2( KI )+1, I-1 ) + 1 1205 CALL INFOG1L( IROW1, HBL, NPCOL, MYCOL, 0, ITMP1, 1206 $ ITMP2 ) 1207 ITMP2 = NUMROC( I2, HBL, MYCOL, 0, NPCOL ) 1208 II = KROW( KI ) 1209 CALL DLAREF( 'Row', A, LDA, WANTZ, Z, LDZ, .TRUE., II, 1210 $ II, ISTART, ISTOP, ITMP1, ITMP2, LILOZ, 1211 $ LIHIZ, WORK( VECSIDX+1 ), V2, V3, T1, T2, 1212 $ T3 ) 1213 END IF 1214 160 CONTINUE 1215* 1216 DO 180 KI = 1, IBULGE 1217 IF( KROW( KI ).GT.KP2ROW( KI ) ) 1218 $ GO TO 180 1219 IF( ( MYROW.NE.ICURROW( KI ) ) .AND. 1220 $ ( DOWN.NE.ICURROW( KI ) ) )GO TO 180 1221 ISTART = MAX( K1( KI ), M ) 1222 ISTOP = MIN( K2( KI ), I-1 ) 1223 IF( ( ISTART.EQ.ISTOP ) .OR. 1224 $ ( MOD( ISTART-1, HBL ).GE.HBL-2 ) .OR. 1225 $ ( ICURROW( KI ).NE.MYROW ) ) THEN 1226 DO 170 K = ISTART, ISTOP 1227 V2 = WORK( VECSIDX+( K-1 )*3+1 ) 1228 V3 = WORK( VECSIDX+( K-1 )*3+2 ) 1229 T1 = WORK( VECSIDX+( K-1 )*3+3 ) 1230 NR = MIN( 3, I-K+1 ) 1231 IF( ( NR.EQ.3 ) .AND. ( KROW( KI ).LE. 1232 $ KP2ROW( KI ) ) ) THEN 1233 IF( ( K.LT.ISTOP ) .AND. 1234 $ ( MOD( K-1, HBL ).LT.HBL-2 ) ) THEN 1235 ITMP1 = MIN( K2( KI )+1, I-1 ) + 1 1236 ELSE 1237 IF( MOD( K-1, HBL ).LT.HBL-2 ) THEN 1238 ITMP1 = MIN( K2( KI )+1, I-1 ) + 1 1239 END IF 1240 IF( MOD( K-1, HBL ).EQ.HBL-2 ) THEN 1241 ITMP1 = MIN( K+4, I2 ) + 1 1242 END IF 1243 IF( MOD( K-1, HBL ).EQ.HBL-1 ) THEN 1244 ITMP1 = MIN( K+3, I2 ) + 1 1245 END IF 1246 END IF 1247* 1248* Find local coor of rows K through K+2 1249* 1250 IROW1 = KROW( KI ) 1251 IROW2 = KP2ROW( KI ) 1252 CALL INFOG1L( ITMP1, HBL, NPCOL, MYCOL, 0, 1253 $ ICOL1, ICOL2 ) 1254 ICOL2 = NUMROC( I2, HBL, MYCOL, 0, NPCOL ) 1255 IF( ( MOD( K-1, HBL ).LT.HBL-2 ) .OR. 1256 $ ( NPROW.EQ.1 ) ) THEN 1257 T2 = T1*V2 1258 T3 = T1*V3 1259 CALL DLAREF( 'Row', A, LDA, WANTZ, Z, LDZ, 1260 $ .FALSE., IROW1, IROW1, ISTART, 1261 $ ISTOP, ICOL1, ICOL2, LILOZ, 1262 $ LIHIZ, WORK( VECSIDX+1 ), V2, 1263 $ V3, T1, T2, T3 ) 1264 END IF 1265 IF( ( MOD( K-1, HBL ).EQ.HBL-2 ) .AND. 1266 $ ( NPROW.GT.1 ) ) THEN 1267 IF( IROW1.EQ.IROW2 ) THEN 1268 CALL DGESD2D( CONTXT, 1, ICOL2-ICOL1+1, 1269 $ A( ( ICOL1-1 )*LDA+IROW2 ), 1270 $ LDA, UP, MYCOL ) 1271 END IF 1272 END IF 1273 IF( ( MOD( K-1, HBL ).EQ.HBL-1 ) .AND. 1274 $ ( NPROW.GT.1 ) ) THEN 1275 IF( IROW1.EQ.IROW2 ) THEN 1276 CALL DGESD2D( CONTXT, 1, ICOL2-ICOL1+1, 1277 $ A( ( ICOL1-1 )*LDA+IROW1 ), 1278 $ LDA, DOWN, MYCOL ) 1279 END IF 1280 END IF 1281 END IF 1282 170 CONTINUE 1283 END IF 1284 180 CONTINUE 1285* 1286 DO 220 KI = 1, IBULGE 1287 IF( KROW( KI ).GT.KP2ROW( KI ) ) 1288 $ GO TO 220 1289 IF( ( MYROW.NE.ICURROW( KI ) ) .AND. 1290 $ ( DOWN.NE.ICURROW( KI ) ) )GO TO 220 1291 ISTART = MAX( K1( KI ), M ) 1292 ISTOP = MIN( K2( KI ), I-1 ) 1293 IF( ( ISTART.EQ.ISTOP ) .OR. 1294 $ ( MOD( ISTART-1, HBL ).GE.HBL-2 ) .OR. 1295 $ ( ICURROW( KI ).NE.MYROW ) ) THEN 1296 DO 210 K = ISTART, ISTOP 1297 V2 = WORK( VECSIDX+( K-1 )*3+1 ) 1298 V3 = WORK( VECSIDX+( K-1 )*3+2 ) 1299 T1 = WORK( VECSIDX+( K-1 )*3+3 ) 1300 NR = MIN( 3, I-K+1 ) 1301 IF( ( NR.EQ.3 ) .AND. ( KROW( KI ).LE. 1302 $ KP2ROW( KI ) ) ) THEN 1303 IF( ( K.LT.ISTOP ) .AND. 1304 $ ( MOD( K-1, HBL ).LT.HBL-2 ) ) THEN 1305 ITMP1 = MIN( K2( KI )+1, I-1 ) + 1 1306 ELSE 1307 IF( MOD( K-1, HBL ).LT.HBL-2 ) THEN 1308 ITMP1 = MIN( K2( KI )+1, I-1 ) + 1 1309 END IF 1310 IF( MOD( K-1, HBL ).EQ.HBL-2 ) THEN 1311 ITMP1 = MIN( K+4, I2 ) + 1 1312 END IF 1313 IF( MOD( K-1, HBL ).EQ.HBL-1 ) THEN 1314 ITMP1 = MIN( K+3, I2 ) + 1 1315 END IF 1316 END IF 1317* 1318 IROW1 = KROW( KI ) + K - ISTART 1319 IROW2 = KP2ROW( KI ) + K - ISTART 1320 CALL INFOG1L( ITMP1, HBL, NPCOL, MYCOL, 0, 1321 $ ICOL1, ICOL2 ) 1322 ICOL2 = NUMROC( I2, HBL, MYCOL, 0, NPCOL ) 1323 IF( ( MOD( K-1, HBL ).EQ.HBL-2 ) .AND. 1324 $ ( NPROW.GT.1 ) ) THEN 1325 IF( IROW1.NE.IROW2 ) THEN 1326 CALL DGERV2D( CONTXT, 1, ICOL2-ICOL1+1, 1327 $ WORK( IRBUF+1 ), 1, DOWN, 1328 $ MYCOL ) 1329 T2 = T1*V2 1330 T3 = T1*V3 1331 DO 190 J = ICOL1, ICOL2 1332 SUM = A( ( J-1 )*LDA+IROW1 ) + 1333 $ V2*A( ( J-1 )*LDA+IROW1+1 ) + 1334 $ V3*WORK( IRBUF+J-ICOL1+1 ) 1335 A( ( J-1 )*LDA+IROW1 ) = A( ( J-1 )* 1336 $ LDA+IROW1 ) - SUM*T1 1337 A( ( J-1 )*LDA+IROW1+1 ) = A( ( J-1 )* 1338 $ LDA+IROW1+1 ) - SUM*T2 1339 WORK( IRBUF+J-ICOL1+1 ) = WORK( IRBUF+ 1340 $ J-ICOL1+1 ) - SUM*T3 1341 190 CONTINUE 1342 CALL DGESD2D( CONTXT, 1, ICOL2-ICOL1+1, 1343 $ WORK( IRBUF+1 ), 1, DOWN, 1344 $ MYCOL ) 1345 END IF 1346 END IF 1347 IF( ( MOD( K-1, HBL ).EQ.HBL-1 ) .AND. 1348 $ ( NPROW.GT.1 ) ) THEN 1349 IF( IROW1.NE.IROW2 ) THEN 1350 CALL DGERV2D( CONTXT, 1, ICOL2-ICOL1+1, 1351 $ WORK( IRBUF+1 ), 1, UP, 1352 $ MYCOL ) 1353 T2 = T1*V2 1354 T3 = T1*V3 1355 DO 200 J = ICOL1, ICOL2 1356 SUM = WORK( IRBUF+J-ICOL1+1 ) + 1357 $ V2*A( ( J-1 )*LDA+IROW1 ) + 1358 $ V3*A( ( J-1 )*LDA+IROW1+1 ) 1359 WORK( IRBUF+J-ICOL1+1 ) = WORK( IRBUF+ 1360 $ J-ICOL1+1 ) - SUM*T1 1361 A( ( J-1 )*LDA+IROW1 ) = A( ( J-1 )* 1362 $ LDA+IROW1 ) - SUM*T2 1363 A( ( J-1 )*LDA+IROW1+1 ) = A( ( J-1 )* 1364 $ LDA+IROW1+1 ) - SUM*T3 1365 200 CONTINUE 1366 CALL DGESD2D( CONTXT, 1, ICOL2-ICOL1+1, 1367 $ WORK( IRBUF+1 ), 1, UP, 1368 $ MYCOL ) 1369 END IF 1370 END IF 1371 END IF 1372 210 CONTINUE 1373 END IF 1374 220 CONTINUE 1375* 1376 DO 240 KI = 1, IBULGE 1377 IF( KROW( KI ).GT.KP2ROW( KI ) ) 1378 $ GO TO 240 1379 IF( ( MYROW.NE.ICURROW( KI ) ) .AND. 1380 $ ( DOWN.NE.ICURROW( KI ) ) )GO TO 240 1381 ISTART = MAX( K1( KI ), M ) 1382 ISTOP = MIN( K2( KI ), I-1 ) 1383 IF( ( ISTART.EQ.ISTOP ) .OR. 1384 $ ( MOD( ISTART-1, HBL ).GE.HBL-2 ) .OR. 1385 $ ( ICURROW( KI ).NE.MYROW ) ) THEN 1386 DO 230 K = ISTART, ISTOP 1387 V2 = WORK( VECSIDX+( K-1 )*3+1 ) 1388 V3 = WORK( VECSIDX+( K-1 )*3+2 ) 1389 T1 = WORK( VECSIDX+( K-1 )*3+3 ) 1390 NR = MIN( 3, I-K+1 ) 1391 IF( ( NR.EQ.3 ) .AND. ( KROW( KI ).LE. 1392 $ KP2ROW( KI ) ) ) THEN 1393 IF( ( K.LT.ISTOP ) .AND. 1394 $ ( MOD( K-1, HBL ).LT.HBL-2 ) ) THEN 1395 ITMP1 = MIN( K2( KI )+1, I-1 ) + 1 1396 ELSE 1397 IF( MOD( K-1, HBL ).LT.HBL-2 ) THEN 1398 ITMP1 = MIN( K2( KI )+1, I-1 ) + 1 1399 END IF 1400 IF( MOD( K-1, HBL ).EQ.HBL-2 ) THEN 1401 ITMP1 = MIN( K+4, I2 ) + 1 1402 END IF 1403 IF( MOD( K-1, HBL ).EQ.HBL-1 ) THEN 1404 ITMP1 = MIN( K+3, I2 ) + 1 1405 END IF 1406 END IF 1407* 1408 IROW1 = KROW( KI ) + K - ISTART 1409 IROW2 = KP2ROW( KI ) + K - ISTART 1410 CALL INFOG1L( ITMP1, HBL, NPCOL, MYCOL, 0, 1411 $ ICOL1, ICOL2 ) 1412 ICOL2 = NUMROC( I2, HBL, MYCOL, 0, NPCOL ) 1413 IF( ( MOD( K-1, HBL ).EQ.HBL-2 ) .AND. 1414 $ ( NPROW.GT.1 ) ) THEN 1415 IF( IROW1.EQ.IROW2 ) THEN 1416 CALL DGERV2D( CONTXT, 1, ICOL2-ICOL1+1, 1417 $ A( ( ICOL1-1 )*LDA+IROW2 ), 1418 $ LDA, UP, MYCOL ) 1419 END IF 1420 END IF 1421 IF( ( MOD( K-1, HBL ).EQ.HBL-1 ) .AND. 1422 $ ( NPROW.GT.1 ) ) THEN 1423 IF( IROW1.EQ.IROW2 ) THEN 1424 CALL DGERV2D( CONTXT, 1, ICOL2-ICOL1+1, 1425 $ A( ( ICOL1-1 )*LDA+IROW1 ), 1426 $ LDA, DOWN, MYCOL ) 1427 END IF 1428 END IF 1429 END IF 1430 230 CONTINUE 1431 END IF 1432 240 CONTINUE 1433 250 CONTINUE 1434* 1435* Now start major set of block COL reflections 1436* 1437 DO 260 KI = 1, IBULGE 1438 IF( ( MYCOL.NE.ICURCOL( KI ) ) .AND. 1439 $ ( RIGHT.NE.ICURCOL( KI ) ) )GO TO 260 1440 ISTART = MAX( K1( KI ), M ) 1441 ISTOP = MIN( K2( KI ), I-1 ) 1442* 1443 IF( ( ( MOD( ISTART-1, HBL ).LT.HBL-2 ) .OR. ( NPCOL.EQ. 1444 $ 1 ) ) .AND. ( ICURCOL( KI ).EQ.MYCOL ) .AND. 1445 $ ( I-ISTOP+1.GE.3 ) ) THEN 1446 K = ISTART 1447 IF( ( K.LT.ISTOP ) .AND. ( MOD( K-1, 1448 $ HBL ).LT.HBL-2 ) ) THEN 1449 ITMP1 = MIN( ISTART+1, I ) - 1 1450 ELSE 1451 IF( MOD( K-1, HBL ).LT.HBL-2 ) THEN 1452 ITMP1 = MIN( K+3, I ) 1453 END IF 1454 IF( MOD( K-1, HBL ).EQ.HBL-2 ) THEN 1455 ITMP1 = MAX( I1, K-1 ) - 1 1456 END IF 1457 IF( MOD( K-1, HBL ).EQ.HBL-1 ) THEN 1458 ITMP1 = MAX( I1, K-2 ) - 1 1459 END IF 1460 END IF 1461* 1462 ICOL1 = KCOL( KI ) 1463 CALL INFOG1L( I1, HBL, NPROW, MYROW, 0, IROW1, IROW2 ) 1464 IROW2 = NUMROC( ITMP1, HBL, MYROW, 0, NPROW ) 1465 IF( IROW1.LE.IROW2 ) THEN 1466 ITMP2 = IROW2 1467 ELSE 1468 ITMP2 = -1 1469 END IF 1470 CALL DLAREF( 'Col', A, LDA, WANTZ, Z, LDZ, .TRUE., 1471 $ ICOL1, ICOL1, ISTART, ISTOP, IROW1, 1472 $ IROW2, LILOZ, LIHIZ, WORK( VECSIDX+1 ), 1473 $ V2, V3, T1, T2, T3 ) 1474 K = ISTOP 1475 IF( MOD( K-1, HBL ).LT.HBL-2 ) THEN 1476* 1477* Do from ITMP1+1 to MIN(K+3,I) 1478* 1479 IF( MOD( K-1, HBL ).LT.HBL-3 ) THEN 1480 IROW1 = ITMP2 + 1 1481 IF( MOD( ( ITMP1 / HBL ), NPROW ).EQ.MYROW ) 1482 $ THEN 1483 IF( ITMP2.GT.0 ) THEN 1484 IROW2 = ITMP2 + MIN( K+3, I ) - ITMP1 1485 ELSE 1486 IROW2 = IROW1 - 1 1487 END IF 1488 ELSE 1489 IROW2 = IROW1 - 1 1490 END IF 1491 ELSE 1492 CALL INFOG1L( ITMP1+1, HBL, NPROW, MYROW, 0, 1493 $ IROW1, IROW2 ) 1494 IROW2 = NUMROC( MIN( K+3, I ), HBL, MYROW, 0, 1495 $ NPROW ) 1496 END IF 1497 V2 = WORK( VECSIDX+( K-1 )*3+1 ) 1498 V3 = WORK( VECSIDX+( K-1 )*3+2 ) 1499 T1 = WORK( VECSIDX+( K-1 )*3+3 ) 1500 T2 = T1*V2 1501 T3 = T1*V3 1502 ICOL1 = KCOL( KI ) + ISTOP - ISTART 1503 CALL DLAREF( 'Col', A, LDA, .FALSE., Z, LDZ, 1504 $ .FALSE., ICOL1, ICOL1, ISTART, ISTOP, 1505 $ IROW1, IROW2, LILOZ, LIHIZ, 1506 $ WORK( VECSIDX+1 ), V2, V3, T1, T2, 1507 $ T3 ) 1508 END IF 1509 END IF 1510 260 CONTINUE 1511* 1512 DO 320 KI = 1, IBULGE 1513 IF( KCOL( KI ).GT.KP2COL( KI ) ) 1514 $ GO TO 320 1515 IF( ( MYCOL.NE.ICURCOL( KI ) ) .AND. 1516 $ ( RIGHT.NE.ICURCOL( KI ) ) )GO TO 320 1517 ISTART = MAX( K1( KI ), M ) 1518 ISTOP = MIN( K2( KI ), I-1 ) 1519 IF( MOD( ISTART-1, HBL ).GE.HBL-2 ) THEN 1520* 1521* INFO is found in a buffer 1522* 1523 ISPEC = 1 1524 ELSE 1525* 1526* All INFO is local 1527* 1528 ISPEC = 0 1529 END IF 1530* 1531 DO 310 K = ISTART, ISTOP 1532* 1533 V2 = WORK( VECSIDX+( K-1 )*3+1 ) 1534 V3 = WORK( VECSIDX+( K-1 )*3+2 ) 1535 T1 = WORK( VECSIDX+( K-1 )*3+3 ) 1536 NR = MIN( 3, I-K+1 ) 1537 IF( ( NR.EQ.3 ) .AND. ( KCOL( KI ).LE.KP2COL( KI ) ) ) 1538 $ THEN 1539* 1540 IF( ( K.LT.ISTOP ) .AND. 1541 $ ( MOD( K-1, HBL ).LT.HBL-2 ) ) THEN 1542 ITMP1 = MIN( ISTART+1, I ) - 1 1543 ELSE 1544 IF( MOD( K-1, HBL ).LT.HBL-2 ) THEN 1545 ITMP1 = MIN( K+3, I ) 1546 END IF 1547 IF( MOD( K-1, HBL ).EQ.HBL-2 ) THEN 1548 ITMP1 = MAX( I1, K-1 ) - 1 1549 END IF 1550 IF( MOD( K-1, HBL ).EQ.HBL-1 ) THEN 1551 ITMP1 = MAX( I1, K-2 ) - 1 1552 END IF 1553 END IF 1554 ICOL1 = KCOL( KI ) + K - ISTART 1555 ICOL2 = KP2COL( KI ) + K - ISTART 1556 CALL INFOG1L( I1, HBL, NPROW, MYROW, 0, IROW1, 1557 $ IROW2 ) 1558 IROW2 = NUMROC( ITMP1, HBL, MYROW, 0, NPROW ) 1559 IF( ( MOD( K-1, HBL ).EQ.HBL-2 ) .AND. 1560 $ ( NPCOL.GT.1 ) ) THEN 1561 IF( ICOL1.EQ.ICOL2 ) THEN 1562 CALL DGESD2D( CONTXT, IROW2-IROW1+1, 1, 1563 $ A( ( ICOL1-1 )*LDA+IROW1 ), 1564 $ LDA, MYROW, LEFT ) 1565 CALL DGERV2D( CONTXT, IROW2-IROW1+1, 1, 1566 $ A( ( ICOL1-1 )*LDA+IROW1 ), 1567 $ LDA, MYROW, LEFT ) 1568 ELSE 1569 CALL DGERV2D( CONTXT, IROW2-IROW1+1, 1, 1570 $ WORK( ICBUF+1 ), IROW2-IROW1+1, 1571 $ MYROW, RIGHT ) 1572 T2 = T1*V2 1573 T3 = T1*V3 1574 DO 270 J = IROW1, IROW2 1575 SUM = A( ( ICOL1-1 )*LDA+J ) + 1576 $ V2*A( ICOL1*LDA+J ) + 1577 $ V3*WORK( ICBUF+J-IROW1+1 ) 1578 A( ( ICOL1-1 )*LDA+J ) = A( ( ICOL1-1 )* 1579 $ LDA+J ) - SUM*T1 1580 A( ICOL1*LDA+J ) = A( ICOL1*LDA+J ) - 1581 $ SUM*T2 1582 WORK( ICBUF+J-IROW1+1 ) = WORK( ICBUF+J- 1583 $ IROW1+1 ) - SUM*T3 1584 270 CONTINUE 1585 CALL DGESD2D( CONTXT, IROW2-IROW1+1, 1, 1586 $ WORK( ICBUF+1 ), IROW2-IROW1+1, 1587 $ MYROW, RIGHT ) 1588 END IF 1589 END IF 1590 IF( ( MOD( K-1, HBL ).EQ.HBL-1 ) .AND. 1591 $ ( NPCOL.GT.1 ) ) THEN 1592 IF( ICOL1.EQ.ICOL2 ) THEN 1593 CALL DGESD2D( CONTXT, IROW2-IROW1+1, 1, 1594 $ A( ( ICOL1-1 )*LDA+IROW1 ), 1595 $ LDA, MYROW, RIGHT ) 1596 CALL DGERV2D( CONTXT, IROW2-IROW1+1, 1, 1597 $ A( ( ICOL1-1 )*LDA+IROW1 ), 1598 $ LDA, MYROW, RIGHT ) 1599 ELSE 1600 CALL DGERV2D( CONTXT, IROW2-IROW1+1, 1, 1601 $ WORK( ICBUF+1 ), IROW2-IROW1+1, 1602 $ MYROW, LEFT ) 1603 T2 = T1*V2 1604 T3 = T1*V3 1605 DO 280 J = IROW1, IROW2 1606 SUM = WORK( ICBUF+J-IROW1+1 ) + 1607 $ V2*A( ( ICOL1-1 )*LDA+J ) + 1608 $ V3*A( ICOL1*LDA+J ) 1609 WORK( ICBUF+J-IROW1+1 ) = WORK( ICBUF+J- 1610 $ IROW1+1 ) - SUM*T1 1611 A( ( ICOL1-1 )*LDA+J ) = A( ( ICOL1-1 )* 1612 $ LDA+J ) - SUM*T2 1613 A( ICOL1*LDA+J ) = A( ICOL1*LDA+J ) - 1614 $ SUM*T3 1615 280 CONTINUE 1616 CALL DGESD2D( CONTXT, IROW2-IROW1+1, 1, 1617 $ WORK( ICBUF+1 ), IROW2-IROW1+1, 1618 $ MYROW, LEFT ) 1619 END IF 1620 END IF 1621* 1622* If we want Z and we haven't already done any Z 1623 IF( ( WANTZ ) .AND. ( MOD( K-1, 1624 $ HBL ).GE.HBL-2 ) .AND. ( NPCOL.GT.1 ) ) THEN 1625* 1626* Accumulate transformations in the matrix Z 1627* 1628 IROW1 = LILOZ 1629 IROW2 = LIHIZ 1630 IF( MOD( K-1, HBL ).EQ.HBL-2 ) THEN 1631 IF( ICOL1.EQ.ICOL2 ) THEN 1632 CALL DGESD2D( CONTXT, IROW2-IROW1+1, 1, 1633 $ Z( ( ICOL1-1 )*LDZ+IROW1 ), 1634 $ LDZ, MYROW, LEFT ) 1635 CALL DGERV2D( CONTXT, IROW2-IROW1+1, 1, 1636 $ Z( ( ICOL1-1 )*LDZ+IROW1 ), 1637 $ LDZ, MYROW, LEFT ) 1638 ELSE 1639 CALL DGERV2D( CONTXT, IROW2-IROW1+1, 1, 1640 $ WORK( ICBUF+1 ), 1641 $ IROW2-IROW1+1, MYROW, 1642 $ RIGHT ) 1643 T2 = T1*V2 1644 T3 = T1*V3 1645 ICOL1 = ( ICOL1-1 )*LDZ 1646 DO 290 J = IROW1, IROW2 1647 SUM = Z( ICOL1+J ) + 1648 $ V2*Z( ICOL1+J+LDZ ) + 1649 $ V3*WORK( ICBUF+J-IROW1+1 ) 1650 Z( J+ICOL1 ) = Z( J+ICOL1 ) - SUM*T1 1651 Z( J+ICOL1+LDZ ) = Z( J+ICOL1+LDZ ) - 1652 $ SUM*T2 1653 WORK( ICBUF+J-IROW1+1 ) = WORK( ICBUF+ 1654 $ J-IROW1+1 ) - SUM*T3 1655 290 CONTINUE 1656 CALL DGESD2D( CONTXT, IROW2-IROW1+1, 1, 1657 $ WORK( ICBUF+1 ), 1658 $ IROW2-IROW1+1, MYROW, 1659 $ RIGHT ) 1660 END IF 1661 END IF 1662 IF( MOD( K-1, HBL ).EQ.HBL-1 ) THEN 1663 IF( ICOL1.EQ.ICOL2 ) THEN 1664 CALL DGESD2D( CONTXT, IROW2-IROW1+1, 1, 1665 $ Z( ( ICOL1-1 )*LDZ+IROW1 ), 1666 $ LDZ, MYROW, RIGHT ) 1667 CALL DGERV2D( CONTXT, IROW2-IROW1+1, 1, 1668 $ Z( ( ICOL1-1 )*LDZ+IROW1 ), 1669 $ LDZ, MYROW, RIGHT ) 1670 ELSE 1671 CALL DGERV2D( CONTXT, IROW2-IROW1+1, 1, 1672 $ WORK( ICBUF+1 ), 1673 $ IROW2-IROW1+1, MYROW, LEFT ) 1674 T2 = T1*V2 1675 T3 = T1*V3 1676 ICOL1 = ( ICOL1-1 )*LDZ 1677 DO 300 J = IROW1, IROW2 1678 SUM = WORK( ICBUF+J-IROW1+1 ) + 1679 $ V2*Z( J+ICOL1 ) + 1680 $ V3*Z( J+ICOL1+LDZ ) 1681 WORK( ICBUF+J-IROW1+1 ) = WORK( ICBUF+ 1682 $ J-IROW1+1 ) - SUM*T1 1683 Z( J+ICOL1 ) = Z( J+ICOL1 ) - SUM*T2 1684 Z( J+ICOL1+LDZ ) = Z( J+ICOL1+LDZ ) - 1685 $ SUM*T3 1686 300 CONTINUE 1687 CALL DGESD2D( CONTXT, IROW2-IROW1+1, 1, 1688 $ WORK( ICBUF+1 ), 1689 $ IROW2-IROW1+1, MYROW, LEFT ) 1690 END IF 1691 END IF 1692 END IF 1693 IF( ICURCOL( KI ).EQ.MYCOL ) THEN 1694 IF( ( ISPEC.EQ.0 ) .OR. ( NPCOL.EQ.1 ) ) THEN 1695 LOCALK2( KI ) = LOCALK2( KI ) + 1 1696 END IF 1697 ELSE 1698 IF( ( MOD( K-1, HBL ).EQ.HBL-1 ) .AND. 1699 $ ( ICURCOL( KI ).EQ.RIGHT ) ) THEN 1700 IF( K.GT.M ) THEN 1701 LOCALK2( KI ) = LOCALK2( KI ) + 2 1702 ELSE 1703 LOCALK2( KI ) = LOCALK2( KI ) + 1 1704 END IF 1705 END IF 1706 IF( ( MOD( K-1, HBL ).EQ.HBL-2 ) .AND. 1707 $ ( I-K.EQ.2 ) .AND. ( ICURCOL( KI ).EQ. 1708 $ RIGHT ) ) THEN 1709 LOCALK2( KI ) = LOCALK2( KI ) + 2 1710 END IF 1711 END IF 1712 END IF 1713 310 CONTINUE 1714 320 CONTINUE 1715* 1716* Column work done 1717* 1718 330 CONTINUE 1719* 1720* Now do NR=2 work 1721* 1722 DO 410 KI = 1, IBULGE 1723 ISTART = MAX( K1( KI ), M ) 1724 ISTOP = MIN( K2( KI ), I-1 ) 1725 IF( MOD( ISTART-1, HBL ).GE.HBL-2 ) THEN 1726* 1727* INFO is found in a buffer 1728* 1729 ISPEC = 1 1730 ELSE 1731* 1732* All INFO is local 1733* 1734 ISPEC = 0 1735 END IF 1736* 1737 DO 400 K = ISTART, ISTOP 1738* 1739 V2 = WORK( VECSIDX+( K-1 )*3+1 ) 1740 V3 = WORK( VECSIDX+( K-1 )*3+2 ) 1741 T1 = WORK( VECSIDX+( K-1 )*3+3 ) 1742 NR = MIN( 3, I-K+1 ) 1743 IF( NR.EQ.2 ) THEN 1744 IF ( ICURROW( KI ).EQ.MYROW ) THEN 1745 T2 = T1*V2 1746 END IF 1747 IF ( ICURCOL( KI ).EQ.MYCOL ) THEN 1748 T2 = T1*V2 1749 END IF 1750* 1751* Apply G from the left to transform the rows of the matrix 1752* in columns K to I2. 1753* 1754 CALL INFOG1L( K, HBL, NPCOL, MYCOL, 0, LILOH, 1755 $ LIHIH ) 1756 LIHIH = NUMROC( I2, HBL, MYCOL, 0, NPCOL ) 1757 CALL INFOG1L( 1, HBL, NPROW, MYROW, 0, ITMP2, 1758 $ ITMP1 ) 1759 ITMP1 = NUMROC( K+1, HBL, MYROW, 0, NPROW ) 1760 IF( ICURROW( KI ).EQ.MYROW ) THEN 1761 IF( ( ISPEC.EQ.0 ) .OR. ( NPROW.EQ.1 ) .OR. 1762 $ ( MOD( K-1, HBL ).EQ.HBL-2 ) ) THEN 1763 ITMP1 = ITMP1 - 1 1764 DO 340 J = ( LILOH-1 )*LDA, 1765 $ ( LIHIH-1 )*LDA, LDA 1766 SUM = A( ITMP1+J ) + V2*A( ITMP1+1+J ) 1767 A( ITMP1+J ) = A( ITMP1+J ) - SUM*T1 1768 A( ITMP1+1+J ) = A( ITMP1+1+J ) - SUM*T2 1769 340 CONTINUE 1770 ELSE 1771 IF( MOD( K-1, HBL ).EQ.HBL-1 ) THEN 1772 CALL DGERV2D( CONTXT, 1, LIHIH-LILOH+1, 1773 $ WORK( IRBUF+1 ), 1, UP, 1774 $ MYCOL ) 1775 DO 350 J = LILOH, LIHIH 1776 SUM = WORK( IRBUF+J-LILOH+1 ) + 1777 $ V2*A( ( J-1 )*LDA+ITMP1 ) 1778 WORK( IRBUF+J-LILOH+1 ) = WORK( IRBUF+ 1779 $ J-LILOH+1 ) - SUM*T1 1780 A( ( J-1 )*LDA+ITMP1 ) = A( ( J-1 )* 1781 $ LDA+ITMP1 ) - SUM*T2 1782 350 CONTINUE 1783 CALL DGESD2D( CONTXT, 1, LIHIH-LILOH+1, 1784 $ WORK( IRBUF+1 ), 1, UP, 1785 $ MYCOL ) 1786 END IF 1787 END IF 1788 ELSE 1789 IF( ( MOD( K-1, HBL ).EQ.HBL-1 ) .AND. 1790 $ ( ICURROW( KI ).EQ.DOWN ) ) THEN 1791 CALL DGESD2D( CONTXT, 1, LIHIH-LILOH+1, 1792 $ A( ( LILOH-1 )*LDA+ITMP1 ), 1793 $ LDA, DOWN, MYCOL ) 1794 CALL DGERV2D( CONTXT, 1, LIHIH-LILOH+1, 1795 $ A( ( LILOH-1 )*LDA+ITMP1 ), 1796 $ LDA, DOWN, MYCOL ) 1797 END IF 1798 END IF 1799* 1800* Apply G from the right to transform the columns of the 1801* matrix in rows I1 to MIN(K+3,I). 1802* 1803 CALL INFOG1L( I1, HBL, NPROW, MYROW, 0, LILOH, 1804 $ LIHIH ) 1805 LIHIH = NUMROC( I, HBL, MYROW, 0, NPROW ) 1806* 1807 IF( ICURCOL( KI ).EQ.MYCOL ) THEN 1808* LOCAL A(LILOZ:LIHIZ,LOCALK2:LOCALK2+2) 1809 IF( ( ISPEC.EQ.0 ) .OR. ( NPCOL.EQ.1 ) .OR. 1810 $ ( MOD( K-1, HBL ).EQ.HBL-2 ) ) THEN 1811 CALL INFOG1L( K, HBL, NPCOL, MYCOL, 0, ITMP1, 1812 $ ITMP2 ) 1813 ITMP2 = NUMROC( K+1, HBL, MYCOL, 0, NPCOL ) 1814 DO 360 J = LILOH, LIHIH 1815 SUM = A( ( ITMP1-1 )*LDA+J ) + 1816 $ V2*A( ITMP1*LDA+J ) 1817 A( ( ITMP1-1 )*LDA+J ) = A( ( ITMP1-1 )* 1818 $ LDA+J ) - SUM*T1 1819 A( ITMP1*LDA+J ) = A( ITMP1*LDA+J ) - 1820 $ SUM*T2 1821 360 CONTINUE 1822 ELSE 1823 ITMP1 = LOCALK2( KI ) 1824 IF( MOD( K-1, HBL ).EQ.HBL-1 ) THEN 1825 CALL DGERV2D( CONTXT, LIHIH-LILOH+1, 1, 1826 $ WORK( ICBUF+1 ), 1827 $ LIHIH-LILOH+1, MYROW, LEFT ) 1828 DO 370 J = LILOH, LIHIH 1829 SUM = WORK( ICBUF+J ) + 1830 $ V2*A( ( ITMP1-1 )*LDA+J ) 1831 WORK( ICBUF+J ) = WORK( ICBUF+J ) - 1832 $ SUM*T1 1833 A( ( ITMP1-1 )*LDA+J ) 1834 $ = A( ( ITMP1-1 )*LDA+J ) - SUM*T2 1835 370 CONTINUE 1836 CALL DGESD2D( CONTXT, LIHIH-LILOH+1, 1, 1837 $ WORK( ICBUF+1 ), 1838 $ LIHIH-LILOH+1, MYROW, LEFT ) 1839 END IF 1840 END IF 1841 ELSE 1842 IF( ( MOD( K-1, HBL ).EQ.HBL-1 ) .AND. 1843 $ ( ICURCOL( KI ).EQ.RIGHT ) ) THEN 1844 ITMP1 = KCOL( KI ) 1845 CALL DGESD2D( CONTXT, LIHIH-LILOH+1, 1, 1846 $ A( ( ITMP1-1 )*LDA+LILOH ), 1847 $ LDA, MYROW, RIGHT ) 1848 CALL INFOG1L( K, HBL, NPCOL, MYCOL, 0, ITMP1, 1849 $ ITMP2 ) 1850 ITMP2 = NUMROC( K+1, HBL, MYCOL, 0, NPCOL ) 1851 CALL DGERV2D( CONTXT, LIHIH-LILOH+1, 1, 1852 $ A( ( ITMP1-1 )*LDA+LILOH ), 1853 $ LDA, MYROW, RIGHT ) 1854 END IF 1855 END IF 1856* 1857 IF( WANTZ ) THEN 1858* 1859* Accumulate transformations in the matrix Z 1860* 1861 IF( ICURCOL( KI ).EQ.MYCOL ) THEN 1862* LOCAL Z(LILOZ:LIHIZ,LOCALK2:LOCALK2+2) 1863 IF( ( ISPEC.EQ.0 ) .OR. ( NPCOL.EQ.1 ) .OR. 1864 $ ( MOD( K-1, HBL ).EQ.HBL-2 ) ) THEN 1865 ITMP1 = KCOL( KI ) + K - ISTART 1866 ITMP1 = ( ITMP1-1 )*LDZ 1867 DO 380 J = LILOZ, LIHIZ 1868 SUM = Z( J+ITMP1 ) + 1869 $ V2*Z( J+ITMP1+LDZ ) 1870 Z( J+ITMP1 ) = Z( J+ITMP1 ) - SUM*T1 1871 Z( J+ITMP1+LDZ ) = Z( J+ITMP1+LDZ ) - 1872 $ SUM*T2 1873 380 CONTINUE 1874 LOCALK2( KI ) = LOCALK2( KI ) + 1 1875 ELSE 1876 ITMP1 = LOCALK2( KI ) 1877* IF WE ACTUALLY OWN COLUMN K 1878 IF( MOD( K-1, HBL ).EQ.HBL-1 ) THEN 1879 CALL DGERV2D( CONTXT, LIHIZ-LILOZ+1, 1, 1880 $ WORK( ICBUF+1 ), LDZ, 1881 $ MYROW, LEFT ) 1882 ITMP1 = ( ITMP1-1 )*LDZ 1883 DO 390 J = LILOZ, LIHIZ 1884 SUM = WORK( ICBUF+J ) + 1885 $ V2*Z( J+ITMP1 ) 1886 WORK( ICBUF+J ) = WORK( ICBUF+J ) - 1887 $ SUM*T1 1888 Z( J+ITMP1 ) = Z( J+ITMP1 ) - SUM*T2 1889 390 CONTINUE 1890 CALL DGESD2D( CONTXT, LIHIZ-LILOZ+1, 1, 1891 $ WORK( ICBUF+1 ), LDZ, 1892 $ MYROW, LEFT ) 1893 LOCALK2( KI ) = LOCALK2( KI ) + 1 1894 END IF 1895 END IF 1896 ELSE 1897* 1898* NO WORK BUT NEED TO UPDATE ANYWAY???? 1899* 1900 IF( ( MOD( K-1, HBL ).EQ.HBL-1 ) .AND. 1901 $ ( ICURCOL( KI ).EQ.RIGHT ) ) THEN 1902 ITMP1 = KCOL( KI ) 1903 ITMP1 = ( ITMP1-1 )*LDZ 1904 CALL DGESD2D( CONTXT, LIHIZ-LILOZ+1, 1, 1905 $ Z( LILOZ+ITMP1 ), LDZ, 1906 $ MYROW, RIGHT ) 1907 CALL DGERV2D( CONTXT, LIHIZ-LILOZ+1, 1, 1908 $ Z( LILOZ+ITMP1 ), LDZ, 1909 $ MYROW, RIGHT ) 1910 LOCALK2( KI ) = LOCALK2( KI ) + 1 1911 END IF 1912 END IF 1913 END IF 1914 END IF 1915 400 CONTINUE 1916* 1917* Adjust local information for this bulge 1918* 1919 IF( NPROW.EQ.1 ) THEN 1920 KROW( KI ) = KROW( KI ) + K2( KI ) - K1( KI ) + 1 1921 KP2ROW( KI ) = KP2ROW( KI ) + K2( KI ) - K1( KI ) + 1 1922 END IF 1923 IF( ( MOD( K1( KI )-1, HBL ).LT.HBL-2 ) .AND. 1924 $ ( ICURROW( KI ).EQ.MYROW ) .AND. ( NPROW.GT.1 ) ) 1925 $ THEN 1926 KROW( KI ) = KROW( KI ) + K2( KI ) - K1( KI ) + 1 1927 END IF 1928 IF( ( MOD( K2( KI ), HBL ).LT.HBL-2 ) .AND. 1929 $ ( ICURROW( KI ).EQ.MYROW ) .AND. ( NPROW.GT.1 ) ) 1930 $ THEN 1931 KP2ROW( KI ) = KP2ROW( KI ) + K2( KI ) - K1( KI ) + 1 1932 END IF 1933 IF( ( MOD( K1( KI )-1, HBL ).GE.HBL-2 ) .AND. 1934 $ ( ( MYROW.EQ.ICURROW( KI ) ) .OR. ( DOWN.EQ. 1935 $ ICURROW( KI ) ) ) .AND. ( NPROW.GT.1 ) ) THEN 1936 CALL INFOG1L( K2( KI )+1, HBL, NPROW, MYROW, 0, 1937 $ KROW( KI ), ITMP2 ) 1938 ITMP2 = NUMROC( N, HBL, MYROW, 0, NPROW ) 1939 END IF 1940 IF( ( MOD( K2( KI ), HBL ).GE.HBL-2 ) .AND. 1941 $ ( ( MYROW.EQ.ICURROW( KI ) ) .OR. ( UP.EQ. 1942 $ ICURROW( KI ) ) ) .AND. ( NPROW.GT.1 ) ) THEN 1943 CALL INFOG1L( 1, HBL, NPROW, MYROW, 0, ITMP2, 1944 $ KP2ROW( KI ) ) 1945 KP2ROW( KI ) = NUMROC( K2( KI )+3, HBL, MYROW, 0, 1946 $ NPROW ) 1947 END IF 1948 IF( NPCOL.EQ.1 ) THEN 1949 KCOL( KI ) = KCOL( KI ) + K2( KI ) - K1( KI ) + 1 1950 KP2COL( KI ) = KP2COL( KI ) + K2( KI ) - K1( KI ) + 1 1951 END IF 1952 IF( ( MOD( K1( KI )-1, HBL ).LT.HBL-2 ) .AND. 1953 $ ( ICURCOL( KI ).EQ.MYCOL ) .AND. ( NPCOL.GT.1 ) ) 1954 $ THEN 1955 KCOL( KI ) = KCOL( KI ) + K2( KI ) - K1( KI ) + 1 1956 END IF 1957 IF( ( MOD( K2( KI ), HBL ).LT.HBL-2 ) .AND. 1958 $ ( ICURCOL( KI ).EQ.MYCOL ) .AND. ( NPCOL.GT.1 ) ) 1959 $ THEN 1960 KP2COL( KI ) = KP2COL( KI ) + K2( KI ) - K1( KI ) + 1 1961 END IF 1962 IF( ( MOD( K1( KI )-1, HBL ).GE.HBL-2 ) .AND. 1963 $ ( ( MYCOL.EQ.ICURCOL( KI ) ) .OR. ( RIGHT.EQ. 1964 $ ICURCOL( KI ) ) ) .AND. ( NPCOL.GT.1 ) ) THEN 1965 CALL INFOG1L( K2( KI )+1, HBL, NPCOL, MYCOL, 0, 1966 $ KCOL( KI ), ITMP2 ) 1967 ITMP2 = NUMROC( N, HBL, MYCOL, 0, NPCOL ) 1968 END IF 1969 IF( ( MOD( K2( KI ), HBL ).GE.HBL-2 ) .AND. 1970 $ ( ( MYCOL.EQ.ICURCOL( KI ) ) .OR. ( LEFT.EQ. 1971 $ ICURCOL( KI ) ) ) .AND. ( NPCOL.GT.1 ) ) THEN 1972 CALL INFOG1L( 1, HBL, NPCOL, MYCOL, 0, ITMP2, 1973 $ KP2COL( KI ) ) 1974 KP2COL( KI ) = NUMROC( K2( KI )+3, HBL, MYCOL, 0, 1975 $ NPCOL ) 1976 END IF 1977 K1( KI ) = K2( KI ) + 1 1978 ISTOP = MIN( K1( KI )+ROTN-MOD( K1( KI ), ROTN ), I-2 ) 1979 ISTOP = MIN( ISTOP, K1( KI )+HBL-3- 1980 $ MOD( K1( KI )-1, HBL ) ) 1981 ISTOP = MIN( ISTOP, I2-2 ) 1982 ISTOP = MAX( ISTOP, K1( KI ) ) 1983* ISTOP = MIN( ISTOP , I-1 ) 1984 K2( KI ) = ISTOP 1985 IF( K1( KI ).EQ.ISTOP ) THEN 1986 IF( ( MOD( ISTOP-1, HBL ).EQ.HBL-2 ) .AND. 1987 $ ( I-ISTOP.GT.1 ) ) THEN 1988* 1989* Next step switches rows & cols 1990* 1991 ICURROW( KI ) = MOD( ICURROW( KI )+1, NPROW ) 1992 ICURCOL( KI ) = MOD( ICURCOL( KI )+1, NPCOL ) 1993 END IF 1994 END IF 1995 410 CONTINUE 1996 IF( K2( IBULGE ).LE.I-1 ) 1997 $ GO TO 40 1998 END IF 1999* 2000 420 CONTINUE 2001* 2002* Failure to converge in remaining number of iterations 2003* 2004 INFO = I 2005 RETURN 2006* 2007 430 CONTINUE 2008* 2009 IF( L.EQ.I ) THEN 2010* 2011* H(I,I-1) is negligible: one eigenvalue has converged. 2012* 2013 CALL INFOG2L( I, I, DESCA, NPROW, NPCOL, MYROW, MYCOL, IROW, 2014 $ ICOL, ITMP1, ITMP2 ) 2015 IF( ( MYROW.EQ.ITMP1 ) .AND. ( MYCOL.EQ.ITMP2 ) ) THEN 2016 WR( I ) = A( ( ICOL-1 )*LDA+IROW ) 2017 ELSE 2018 WR( I ) = ZERO 2019 END IF 2020 WI( I ) = ZERO 2021 ELSE IF( L.EQ.I-1 ) THEN 2022* 2023* H(I-1,I-2) is negligible: a pair of eigenvalues have converged. 2024* 2025 CALL PDELGET( 'All', ' ', H11, A, L, L, DESCA ) 2026 CALL PDELGET( 'All', ' ', H21, A, I, L, DESCA ) 2027 CALL PDELGET( 'All', ' ', H12, A, L, I, DESCA ) 2028 CALL PDELGET( 'All', ' ', H22, A, I, I, DESCA ) 2029 CALL DLANV2( H11, H12, H21, H22, WR( L ), WI( L ), WR( I ), 2030 $ WI( I ), CS, SN ) 2031 IF( NODE .NE. 0 ) THEN 2032 WR( L ) = ZERO 2033 WR( I ) = ZERO 2034 WI( L ) = ZERO 2035 WI( I ) = ZERO 2036 ENDIF 2037 ELSE 2038* 2039* Find the eigenvalues in H(L:I,L:I), L < I-1 2040* 2041 JBLK = I - L + 1 2042 IF( JBLK.LE.2*IBLK ) THEN 2043 CALL PDLACP3( I-L+1, L, A, DESCA, S1, 2*IBLK, 0, 0, 0 ) 2044 CALL DLAHQR( .FALSE., .FALSE., JBLK, 1, JBLK, S1, 2*IBLK, 2045 $ WR( L ), WI( L ), 1, JBLK, Z, LDZ, IERR ) 2046 IF( NODE.NE.0 ) THEN 2047* 2048* Erase the eigenvalues 2049* 2050 DO 440 K = L, I 2051 WR( K ) = ZERO 2052 WI( K ) = ZERO 2053 440 CONTINUE 2054 END IF 2055 END IF 2056 END IF 2057* 2058* Decrement number of remaining iterations, and return to start of 2059* the main loop with new value of I. 2060* 2061 ITN = ITN - ITS 2062 IF( M.EQ.L-10 ) THEN 2063 I = L - 1 2064 ELSE 2065 I = M 2066 END IF 2067* I = L - 1 2068 GO TO 10 2069* 2070 450 CONTINUE 2071 CALL DGSUM2D( CONTXT, 'All', ' ', N, 1, WR, N, -1, -1 ) 2072 CALL DGSUM2D( CONTXT, 'All', ' ', N, 1, WI, N, -1, -1 ) 2073 RETURN 2074* 2075* END OF PDLAHQR 2076* 2077 END 2078