1 SUBROUTINE PZLAHQR( WANTT, WANTZ, N, ILO, IHI, A, DESCA, W, ILOZ, 2 $ IHIZ, Z, DESCZ, WORK, LWORK, IWORK, ILWORK, 3 $ INFO ) 4* 5* -- ScaLAPACK routine (version 1.7.3) -- 6* University of Tennessee, Knoxville, Oak Ridge National Laboratory, 7* and University of California, Berkeley. 8* 1.7.3: March 22, 2006 9* modification suggested by Mark Fahey and Greg Henry 10* 1.7.0: July 31, 2001 11* 12* .. Scalar Arguments .. 13 LOGICAL WANTT, WANTZ 14 INTEGER IHI, IHIZ, ILO, ILOZ, ILWORK, INFO, LWORK, N 15* .. 16* .. Array Arguments .. 17 INTEGER DESCA( * ), DESCZ( * ), IWORK( * ) 18 COMPLEX*16 A( * ), W( * ), WORK( * ), Z( * ) 19* .. 20* 21* Purpose 22* ======= 23* 24* PZLAHQR is an auxiliary routine used to find the Schur decomposition 25* and or eigenvalues of a matrix already in Hessenberg form from 26* cols ILO to IHI. 27* If Z = I, and WANTT=WANTZ=.TRUE., H gets replaced with Z'HZ, 28* with Z'Z=I, and H in Schur form. 29* 30* Notes 31* ===== 32* 33* Each global data object is described by an associated description 34* vector. This vector stores the information required to establish 35* the mapping between an object element and its corresponding process 36* and memory location. 37* 38* Let A be a generic term for any 2D block cyclicly distributed array. 39* Such a global array has an associated description vector DESCA. 40* In the following comments, the character _ should be read as 41* "of the global array". 42* 43* NOTATION STORED IN EXPLANATION 44* --------------- -------------- -------------------------------------- 45* DTYPE_A(global) DESCA( DTYPE_ )The descriptor type. In this case, 46* DTYPE_A = 1. 47* CTXT_A (global) DESCA( CTXT_ ) The BLACS context handle, indicating 48* the BLACS process grid A is distribu- 49* ted over. The context itself is glo- 50* bal, but the handle (the integer 51* value) may vary. 52* M_A (global) DESCA( M_ ) The number of rows in the global 53* array A. 54* N_A (global) DESCA( N_ ) The number of columns in the global 55* array A. 56* MB_A (global) DESCA( MB_ ) The blocking factor used to distribute 57* the rows of the array. 58* NB_A (global) DESCA( NB_ ) The blocking factor used to distribute 59* the columns of the array. 60* RSRC_A (global) DESCA( RSRC_ ) The process row over which the first 61* row of the array A is distributed. 62* CSRC_A (global) DESCA( CSRC_ ) The process column over which the 63* first column of the array A is 64* distributed. 65* LLD_A (local) DESCA( LLD_ ) The leading dimension of the local 66* array. LLD_A >= MAX(1,LOCp(M_A)). 67* 68* Let K be the number of rows or columns of a distributed matrix, 69* and assume that its process grid has dimension p x q. 70* LOCp( K ) denotes the number of elements of K that a process 71* would receive if K were distributed over the p processes of its 72* process column. 73* Similarly, LOCq( K ) denotes the number of elements of K that a 74* process would receive if K were distributed over the q processes of 75* its process row. 76* The values of LOCp() and LOCq() may be determined via a call to the 77* ScaLAPACK tool function, NUMROC: 78* LOCp( M ) = NUMROC( M, MB_A, MYROW, RSRC_A, NPROW ), 79* LOCq( N ) = NUMROC( N, NB_A, MYCOL, CSRC_A, NPCOL ). 80* An upper bound for these quantities may be computed by: 81* LOCp( M ) <= ceil( ceil(M/MB_A)/NPROW )*MB_A 82* LOCq( N ) <= ceil( ceil(N/NB_A)/NPCOL )*NB_A 83* 84* Arguments 85* ========= 86* 87* WANTT (global input) LOGICAL 88* = .TRUE. : the full Schur form T is required; 89* = .FALSE.: only eigenvalues are required. 90* 91* WANTZ (global input) LOGICAL 92* = .TRUE. : the matrix of Schur vectors Z is required; 93* = .FALSE.: Schur vectors are not required. 94* 95* N (global input) INTEGER 96* The order of the Hessenberg matrix A (and Z if WANTZ). 97* N >= 0. 98* 99* ILO (global input) INTEGER 100* IHI (global input) INTEGER 101* It is assumed that A is already upper quasi-triangular in 102* rows and columns IHI+1:N, and that A(ILO,ILO-1) = 0 (unless 103* ILO = 1). PZLAHQR works primarily with the Hessenberg 104* submatrix in rows and columns ILO to IHI, but applies 105* transformations to all of H if WANTT is .TRUE.. 106* 1 <= ILO <= max(1,IHI); IHI <= N. 107* 108* A (global input/output) COMPLEX*16 array, dimension 109* (DESCA(LLD_),*) 110* On entry, the upper Hessenberg matrix A. 111* On exit, if WANTT is .TRUE., A is upper triangular in rows 112* and columns ILO:IHI. If WANTT is .FALSE., the contents of 113* A are unspecified on exit. 114* 115* DESCA (global and local input) INTEGER array of dimension DLEN_. 116* The array descriptor for the distributed matrix A. 117* 118* W (global replicated output) COMPLEX*16 array, dimension (N) 119* The computed eigenvalues ILO to IHI are stored in the 120* corresponding elements of W. If WANTT is .TRUE., the 121* eigenvalues are stored in the same order as on the diagonal 122* of the Schur form returned in A. A may be returned with 123* larger diagonal blocks until the next release. 124* 125* ILOZ (global input) INTEGER 126* IHIZ (global input) INTEGER 127* Specify the rows of Z to which transformations must be 128* applied if WANTZ is .TRUE.. 129* 1 <= ILOZ <= ILO; IHI <= IHIZ <= N. 130* 131* Z (global input/output) COMPLEX*16 array. 132* If WANTZ is .TRUE., on entry Z must contain the current 133* matrix Z of transformations accumulated by PZHSEQR, and on 134* exit Z has been updated; transformations are applied only to 135* the submatrix Z(ILOZ:IHIZ,ILO:IHI). 136* If WANTZ is .FALSE., Z is not referenced. 137* 138* DESCZ (global and local input) INTEGER array of dimension DLEN_. 139* The array descriptor for the distributed matrix Z. 140* 141* WORK (local output) COMPLEX*16 array of size LWORK 142* (Unless LWORK=-1, in which case WORK must be at least size 1) 143* 144* LWORK (local input) INTEGER 145* WORK(LWORK) is a local array and LWORK is assumed big enough 146* so that LWORK >= 3*N + 147* MAX( 2*MAX(DESCZ(LLD_),DESCA(LLD_)) + 2*LOCq(N), 148* 7*Ceil(N/HBL)/LCM(NPROW,NPCOL)) + 149* MAX( 2*N, (8*LCM(NPROW,NPCOL)+2)**2 ) 150* If LWORK=-1, then WORK(1) gets set to the above number and 151* the code returns immediately. 152* 153* IWORK (global and local input) INTEGER array of size ILWORK 154* This will hold some of the IBLK integer arrays. 155* This is held as a place holder for a future release. 156* Currently unreferenced. 157* 158* ILWORK (local input) INTEGER 159* This will hold the size of the IWORK array. 160* This is held as a place holder for a future release. 161* Currently unreferenced. 162* 163* INFO (global output) INTEGER 164* < 0: parameter number -INFO incorrect or inconsistent 165* = 0: successful exit 166* > 0: PZLAHQR failed to compute all the eigenvalues ILO to IHI 167* in a total of 30*(IHI-ILO+1) iterations; if INFO = i, 168* elements i+1:ihi of W contains those eigenvalues 169* which have been successfully computed. 170* 171* Logic: 172* This algorithm is very similar to DLAHQR. Unlike DLAHQR, 173* instead of sending one double shift through the largest 174* unreduced submatrix, this algorithm sends multiple double shifts 175* and spaces them apart so that there can be parallelism across 176* several processor row/columns. Another critical difference is 177* that this algorithm aggregrates multiple transforms together in 178* order to apply them in a block fashion. 179* 180* Important Local Variables: 181* IBLK = The maximum number of bulges that can be computed. 182* Currently fixed. Future releases this won't be fixed. 183* HBL = The square block size (HBL=DESCA(MB_)=DESCA(NB_)) 184* ROTN = The number of transforms to block together 185* NBULGE = The number of bulges that will be attempted on the 186* current submatrix. 187* IBULGE = The current number of bulges started. 188* K1(*),K2(*) = The current bulge loops from K1(*) to K2(*). 189* 190* Subroutines: 191* From LAPACK, this routine calls: 192* ZLAHQR -> Serial QR used to determine shifts and 193* eigenvalues 194* ZLARFG -> Determine the Householder transforms 195* 196* This ScaLAPACK, this routine calls: 197* PZLACONSB -> To determine where to start each iteration 198* ZLAMSH -> Sends multiple shifts through a small 199* submatrix to see how the consecutive 200* subdiagonals change (if PZLACONSB indicates 201* we can start a run in the middle) 202* PZLAWIL -> Given the shift, get the transformation 203* PZLACP3 -> Parallel array to local replicated array copy 204* & back. 205* ZLAREF -> Row/column reflector applier. Core routine 206* here. 207* PZLASMSUB -> Finds negligible subdiagonal elements. 208* 209* Current Notes and/or Restrictions: 210* 1.) This code requires the distributed block size to be square 211* and at least six (6); unlike simpler codes like LU, this 212* algorithm is extremely sensitive to block size. Unwise 213* choices of too small a block size can lead to bad 214* performance. 215* 2.) This code requires A and Z to be distributed identically 216* and have identical contxts. A future version may allow Z to 217* have a different contxt to 1D row map it to all nodes (so no 218* communication on Z is necessary.) 219* 3.) This code does not currently block the initial transforms 220* so that none of the rows or columns for any bulge are 221* completed until all are started. To offset pipeline 222* start-up it is recommended that at least 2*LCM(NPROW,NPCOL) 223* bulges are used (if possible) 224* 4.) The maximum number of bulges currently supported is fixed at 225* 32. In future versions this will be limited only by the 226* incoming WORK and IWORK array. 227* 5.) The matrix A must be in upper Hessenberg form. If elements 228* below the subdiagonal are nonzero, the resulting transforms 229* may be nonsimilar. This is also true with the LAPACK 230* routine ZLAHQR. 231* 6.) For this release, this code has only been tested for 232* RSRC_=CSRC_=0, but it has been written for the general case. 233* 7.) Currently, all the eigenvalues are distributed to all the 234* nodes. Future releases will probably distribute the 235* eigenvalues by the column partitioning. 236* 8.) The internals of this routine are subject to change. 237* 9.) To optimize this for your architecture, try tuning ZLAREF. 238* 10.) This code has only been tested for WANTZ = .TRUE. and may 239* behave unpredictably for WANTZ set to .FALSE. 240* 241* Further Details 242* =============== 243* 244* Contributed by Mark Fahey, June, 2000. 245* 246* ===================================================================== 247* 248* .. Parameters .. 249 INTEGER BLOCK_CYCLIC_2D, CSRC_, CTXT_, DLEN_, DT_, 250 $ LLD_, MB_, M_, NB_, N_, RSRC_ 251 PARAMETER ( BLOCK_CYCLIC_2D = 1, DLEN_ = 9, DT_ = 1, 252 $ CTXT_ = 2, M_ = 3, N_ = 4, MB_ = 5, NB_ = 6, 253 $ RSRC_ = 7, CSRC_ = 8, LLD_ = 9 ) 254 DOUBLE PRECISION RONE 255 PARAMETER ( RONE = 1.0D+0 ) 256 COMPLEX*16 ZERO, ONE 257 PARAMETER ( ZERO = ( 0.0D+0, 0.0D+0 ), 258 $ ONE = ( 1.0D+0, 0.0D+0 ) ) 259 DOUBLE PRECISION CONST 260 PARAMETER ( CONST = 1.50D+0 ) 261 INTEGER IBLK 262 PARAMETER ( IBLK = 32 ) 263* .. 264* .. Local Scalars .. 265 LOGICAL SKIP 266 INTEGER CONTXT, DOWN, HBL, I, I1, I2, IAFIRST, IBULGE, 267 $ ICBUF, ICOL, ICOL1, ICOL2, IDIA, IERR, II, 268 $ IRBUF, IROW, IROW1, IROW2, ISPEC, ISTART, 269 $ ISTARTCOL, ISTARTROW, ISTOP, ISUB, ISUP, 270 $ ITERMAX, ITMP1, ITMP2, ITN, ITS, IZBUF, J, 271 $ JAFIRST, JBLK, JJ, K, KI, L, LCMRC, LDA, LDZ, 272 $ LEFT, LIHIH, LIHIZ, LILOH, LILOZ, LOCALI1, 273 $ LOCALI2, LOCALK, LOCALM, M, MODKM1, MYCOL, 274 $ MYROW, NBULGE, NH, NODE, NPCOL, NPROW, NQ, NR, 275 $ NUM, NZ, RIGHT, ROTN, UP, VECSIDX 276 DOUBLE PRECISION CS, OVFL, S, SMLNUM, ULP, UNFL 277 COMPLEX*16 CDUM, H10, H11, H22, H33, H43H34, H44, SN, SUM, 278 $ T1, T1COPY, T2, T3, V1SAVE, V2, V2SAVE, V3, 279 $ V3SAVE 280* .. 281* .. Local Arrays .. 282 INTEGER ICURCOL( IBLK ), ICURROW( IBLK ), K1( IBLK ), 283 $ K2( IBLK ), KCOL( IBLK ), KP2COL( IBLK ), 284 $ KP2ROW( IBLK ), KROW( IBLK ) 285 COMPLEX*16 S1( 2*IBLK, 2*IBLK ), SMALLA( 6, 6, IBLK ), 286 $ VCOPY( 3 ) 287* .. 288* .. External Functions .. 289 INTEGER ILCM, NUMROC 290 DOUBLE PRECISION PDLAMCH 291 EXTERNAL ILCM, NUMROC, PDLAMCH 292* .. 293* .. External Subroutines .. 294 EXTERNAL BLACS_GRIDINFO, IGAMN2D, IGEBR2D, IGEBS2D, 295 $ INFOG1L, INFOG2L, PDLABAD, PXERBLA, PZLACONSB, 296 $ PZLACP3, PZLASMSUB, PZLAWIL, PZROT, ZCOPY, 297 $ ZGEBR2D, ZGEBS2D, ZGERV2D, ZGESD2D, ZGSUM2D, 298 $ ZLAHQR2, ZLAMSH, ZLANV2, ZLAREF, ZLARFG 299* .. 300* .. Intrinsic Functions .. 301* 302 INTRINSIC ABS, DBLE, DCONJG, DIMAG, MAX, MIN, MOD 303* .. 304* .. Statement Functions .. 305 DOUBLE PRECISION CABS1 306* .. 307* .. Statement Function definitions .. 308 CABS1( CDUM ) = ABS( DBLE( CDUM ) ) + ABS( DIMAG( CDUM ) ) 309* .. 310* .. Executable Statements .. 311* 312 INFO = 0 313* 314 ITERMAX = 30*( IHI-ILO+1 ) 315 IF( N.EQ.0 ) 316 $ RETURN 317* 318* NODE (IAFIRST,JAFIRST) OWNS A(1,1) 319* 320 HBL = DESCA( MB_ ) 321 CONTXT = DESCA( CTXT_ ) 322 LDA = DESCA( LLD_ ) 323 IAFIRST = DESCA( RSRC_ ) 324 JAFIRST = DESCA( CSRC_ ) 325 LDZ = DESCZ( LLD_ ) 326 CALL BLACS_GRIDINFO( CONTXT, NPROW, NPCOL, MYROW, MYCOL ) 327 NODE = MYROW*NPCOL + MYCOL 328 NUM = NPROW*NPCOL 329 LEFT = MOD( MYCOL+NPCOL-1, NPCOL ) 330 RIGHT = MOD( MYCOL+1, NPCOL ) 331 UP = MOD( MYROW+NPROW-1, NPROW ) 332 DOWN = MOD( MYROW+1, NPROW ) 333 LCMRC = ILCM( NPROW, NPCOL ) 334 IF( ( NPROW.LE.3 ) .OR. ( NPCOL.LE.3 ) ) THEN 335 SKIP = .TRUE. 336 ELSE 337 SKIP = .FALSE. 338 END IF 339* 340* Determine the number of columns we have so we can check workspace 341* 342 NQ = NUMROC( N, HBL, MYCOL, JAFIRST, NPCOL ) 343 JJ = N / HBL 344 IF( JJ*HBL.LT.N ) 345 $ JJ = JJ + 1 346 JJ = 7*JJ / LCMRC 347 JJ = 3*N + MAX( 2*MAX( LDA, LDZ )+2*NQ, JJ ) 348 JJ = JJ + MAX( 2*N, ( 8*LCMRC+2 )**2 ) 349 IF( LWORK.EQ.-1 ) THEN 350 WORK( 1 ) = JJ 351 RETURN 352 END IF 353 IF( LWORK.LT.JJ ) THEN 354 INFO = -14 355 END IF 356 IF( DESCZ( CTXT_ ).NE.DESCA( CTXT_ ) ) THEN 357 INFO = -( 1300+CTXT_ ) 358 END IF 359 IF( DESCA( MB_ ).NE.DESCA( NB_ ) ) THEN 360 INFO = -( 700+NB_ ) 361 END IF 362 IF( DESCZ( MB_ ).NE.DESCZ( NB_ ) ) THEN 363 INFO = -( 1300+NB_ ) 364 END IF 365 IF( DESCA( MB_ ).NE.DESCZ( MB_ ) ) THEN 366 INFO = -( 1300+MB_ ) 367 END IF 368 IF( ( DESCA( RSRC_ ).NE.0 ) .OR. ( DESCA( CSRC_ ).NE.0 ) ) THEN 369 INFO = -( 700+RSRC_ ) 370 END IF 371 IF( ( DESCZ( RSRC_ ).NE.0 ) .OR. ( DESCZ( CSRC_ ).NE.0 ) ) THEN 372 INFO = -( 1300+RSRC_ ) 373 END IF 374 IF( ( ILO.GT.N ) .OR. ( ILO.LT.1 ) ) THEN 375 INFO = -4 376 END IF 377 IF( ( IHI.GT.N ) .OR. ( IHI.LT.1 ) ) THEN 378 INFO = -5 379 END IF 380 IF( HBL.LT.5 ) THEN 381 INFO = -( 700+MB_ ) 382 END IF 383 CALL IGAMN2D( CONTXT, 'ALL', ' ', 1, 1, INFO, 1, ITMP1, ITMP2, -1, 384 $ -1, -1 ) 385 IF( INFO.LT.0 ) THEN 386 CALL PXERBLA( CONTXT, 'PZLAHQR', -INFO ) 387 RETURN 388 END IF 389* 390* Set work array indices 391* 392 VECSIDX = 0 393 IDIA = 3*N 394 ISUB = 3*N 395 ISUP = 3*N 396 IRBUF = 3*N 397 ICBUF = 3*N 398 IZBUF = 5*N 399* 400* Find a value for ROTN 401* 402 ROTN = HBL / 3 403 ROTN = MIN( ROTN, HBL-2 ) 404 ROTN = MAX( ROTN, 1 ) 405* 406 IF( ILO.EQ.IHI ) THEN 407 CALL INFOG2L( ILO, ILO, DESCA, NPROW, NPCOL, MYROW, MYCOL, 408 $ IROW, ICOL, II, JJ ) 409 IF( ( MYROW.EQ.II ) .AND. ( MYCOL.EQ.JJ ) ) THEN 410 W( ILO ) = A( ( ICOL-1 )*LDA+IROW ) 411 ELSE 412 W( ILO ) = ZERO 413 END IF 414 RETURN 415 END IF 416* 417 NH = IHI - ILO + 1 418 NZ = IHIZ - ILOZ + 1 419* 420 CALL INFOG1L( ILOZ, HBL, NPROW, MYROW, IAFIRST, LILOZ, LIHIZ ) 421 LIHIZ = NUMROC( IHIZ, HBL, MYROW, IAFIRST, NPROW ) 422* 423* Set machine-dependent constants for the stopping criterion. 424* If NORM(H) <= SQRT(OVFL), overflow should not occur. 425* 426 UNFL = PDLAMCH( CONTXT, 'SAFE MINIMUM' ) 427 OVFL = RONE / UNFL 428 CALL PDLABAD( CONTXT, UNFL, OVFL ) 429 ULP = PDLAMCH( CONTXT, 'PRECISION' ) 430 SMLNUM = UNFL*( NH / ULP ) 431* 432* I1 and I2 are the indices of the first row and last column of H 433* to which transformations must be applied. If eigenvalues only are 434* being computed, I1 and I2 are set inside the main loop. 435* 436 IF( WANTT ) THEN 437 I1 = 1 438 I2 = N 439 END IF 440* 441* ITN is the total number of QR iterations allowed. 442* 443 ITN = ITERMAX 444* 445* The main loop begins here. I is the loop index and decreases from 446* IHI to ILO in steps of our schur block size (<=2*IBLK). Each 447* iteration of the loop works with the active submatrix in rows 448* and columns L to I. Eigenvalues I+1 to IHI have already 449* converged. Either L = ILO or the global A(L,L-1) is negligible 450* so that the matrix splits. 451* 452 I = IHI 453 10 CONTINUE 454 L = ILO 455 IF( I.LT.ILO ) 456 $ GO TO 570 457* 458* Perform QR iterations on rows and columns ILO to I until a 459* submatrix of order 1 or 2 splits off at the bottom because a 460* subdiagonal element has become negligible. 461* 462 DO 540 ITS = 0, ITN 463* 464* Look for a single small subdiagonal element. 465* 466 CALL PZLASMSUB( A, DESCA, I, L, K, SMLNUM, WORK( IRBUF+1 ), 467 $ LWORK-IRBUF ) 468 L = K 469* 470 IF( L.GT.ILO ) THEN 471* 472* H(L,L-1) is negligible 473* 474 CALL INFOG2L( L, L-1, DESCA, NPROW, NPCOL, MYROW, MYCOL, 475 $ IROW, ICOL, ITMP1, ITMP2 ) 476 IF( ( MYROW.EQ.ITMP1 ) .AND. ( MYCOL.EQ.ITMP2 ) ) THEN 477 A( ( ICOL-1 )*LDA+IROW ) = ZERO 478 END IF 479 WORK( ISUB+L-1 ) = ZERO 480 END IF 481* 482* Exit from loop if a submatrix of order 1 or 2 has split off. 483* 484 IF( WANTT ) THEN 485* For Schur form, use 2x2 blocks 486 IF( L.GE.I-1 ) THEN 487 GO TO 550 488 END IF 489 ELSE 490* If we don't want the Schur form, use bigger blocks. 491 IF( L.GE.I-( 2*IBLK-1 ) ) THEN 492 GO TO 550 493 END IF 494 END IF 495* 496* Now the active submatrix is in rows and columns L to I. If 497* eigenvalues only are being computed, only the active submatrix 498* need be transformed. 499* 500 IF( .NOT.WANTT ) THEN 501 I1 = L 502 I2 = I 503 END IF 504* 505* Copy submatrix of size 2*JBLK and prepare to do generalized 506* Wilkinson shift or an exceptional shift 507* 508 JBLK = MIN( IBLK, ( ( I-L+1 ) / 2 )-1 ) 509 IF( JBLK.GT.LCMRC ) THEN 510* 511* Make sure it's divisible by LCM (we want even workloads!) 512* 513 JBLK = JBLK - MOD( JBLK, LCMRC ) 514 END IF 515 JBLK = MIN( JBLK, 2*LCMRC ) 516 JBLK = MAX( JBLK, 1 ) 517* 518 CALL PZLACP3( 2*JBLK, I-2*JBLK+1, A, DESCA, S1, 2*IBLK, -1, -1, 519 $ 0 ) 520 IF( ( ITS.EQ.20 .OR. ITS.EQ.40 ) .AND. ( JBLK.GT.1 ) ) THEN 521* 522* Exceptional shift. 523* 524 DO 20 II = 2*JBLK, 2, -1 525 S1( II, II ) = CONST*( CABS1( S1( II, II ) )+ 526 $ CABS1( S1( II, II-1 ) ) ) 527 S1( II, II-1 ) = ZERO 528 S1( II-1, II ) = ZERO 529 20 CONTINUE 530 S1( 1, 1 ) = CONST*CABS1( S1( 1, 1 ) ) 531 ELSE 532 CALL ZLAHQR2( .FALSE., .FALSE., 2*JBLK, 1, 2*JBLK, S1, 533 $ 2*IBLK, WORK( IRBUF+1 ), 1, 2*JBLK, Z, LDZ, 534 $ IERR ) 535* 536* Prepare to use Wilkinson's double shift 537* 538 H44 = S1( 2*JBLK, 2*JBLK ) 539 H33 = S1( 2*JBLK-1, 2*JBLK-1 ) 540 H43H34 = S1( 2*JBLK-1, 2*JBLK )*S1( 2*JBLK, 2*JBLK-1 ) 541* 542 END IF 543* 544* Look for two consecutive small subdiagonal elements: 545* PZLACONSB is the routine that does this. 546* 547 CALL PZLACONSB( A, DESCA, I, L, M, H44, H33, H43H34, 548 $ WORK( IRBUF+1 ), LWORK-IRBUF ) 549* 550* Double-shift QR step 551* 552* NBULGE is the number of bulges that will be attempted 553* 554 ISTOP = MIN( M+ROTN-1-MOD( M-( M / HBL )*HBL-1, ROTN ), I-2 ) 555 ISTOP = MIN( ISTOP, M+HBL-3-MOD( M-1, HBL ) ) 556 ISTOP = MIN( ISTOP, I2-2 ) 557 ISTOP = MAX( ISTOP, M ) 558 NBULGE = ( I-1-ISTOP ) / HBL 559* 560* Do not exceed maximum determined. 561* 562 NBULGE = MIN( NBULGE, JBLK ) 563 IF( NBULGE.GT.LCMRC ) THEN 564* 565* Make sure it's divisible by LCM (we want even workloads!) 566* 567 NBULGE = NBULGE - MOD( NBULGE, LCMRC ) 568 END IF 569 NBULGE = MAX( NBULGE, 1 ) 570* 571* If we are starting in the middle because of consecutive small 572* subdiagonal elements, we need to see how many bulges we 573* can send through without breaking the consecutive small 574* subdiagonal property. 575* 576 IF( ( NBULGE.GT.1 ) .AND. ( M.GT.L ) ) THEN 577* 578* Copy a chunk of elements from global A(M-1:,M-1:) 579* 580 CALL INFOG2L( M+2, M+2, DESCA, NPROW, NPCOL, MYROW, MYCOL, 581 $ IROW1, ICOL1, ITMP1, ITMP2 ) 582 II = MIN( 4*NBULGE+2, N-M+2 ) 583 CALL PZLACP3( II, M-1, A, DESCA, WORK( IRBUF+1 ), II, ITMP1, 584 $ ITMP2, 0 ) 585 IF( ( MYROW.EQ.ITMP1 ) .AND. ( MYCOL.EQ.ITMP2 ) ) THEN 586* 587* Find a new NBULGE based on the bulges we have. 588* 589 CALL ZLAMSH( S1, 2*IBLK, NBULGE, JBLK, WORK( IRBUF+1 ), 590 $ II, II, ULP ) 591 IF( NUM.GT.1 ) THEN 592 CALL IGEBS2D( CONTXT, 'ALL', ' ', 1, 1, NBULGE, 1 ) 593 END IF 594 ELSE 595* 596* Everyone needs to receive the new NBULGE 597* 598 CALL IGEBR2D( CONTXT, 'ALL', ' ', 1, 1, NBULGE, 1, ITMP1, 599 $ ITMP2 ) 600 END IF 601 END IF 602* 603* IBULGE is the number of bulges going so far 604* 605 IBULGE = 1 606* 607* "A" row defs : main row transforms from LOCALK to LOCALI2 608* 609 CALL INFOG1L( M, HBL, NPCOL, MYCOL, JAFIRST, ITMP1, LOCALK ) 610 LOCALK = NQ 611 CALL INFOG1L( 1, HBL, NPCOL, MYCOL, JAFIRST, ICOL1, LOCALI2 ) 612 LOCALI2 = NUMROC( I2, HBL, MYCOL, JAFIRST, NPCOL ) 613* 614* "A" col defs : main col transforms from LOCALI1 to LOCALM 615* 616 CALL INFOG1L( I1, HBL, NPROW, MYROW, IAFIRST, LOCALI1, ICOL1 ) 617 CALL INFOG1L( 1, HBL, NPROW, MYROW, IAFIRST, LOCALM, ICOL1 ) 618 ICOL1 = NUMROC( MIN( M+3, I ), HBL, MYROW, IAFIRST, NPROW ) 619* 620* Which row & column will start the bulges 621* 622 ISTARTROW = MOD( ( M+1 ) / HBL, NPROW ) + IAFIRST 623 ISTARTCOL = MOD( ( M+1 ) / HBL, NPCOL ) + JAFIRST 624* 625 CALL INFOG1L( M, HBL, NPROW, MYROW, IAFIRST, II, ITMP2 ) 626 CALL INFOG1L( M, HBL, NPCOL, MYCOL, JAFIRST, JJ, ITMP2 ) 627 CALL INFOG1L( 1, HBL, NPROW, MYROW, IAFIRST, ISTOP, 628 $ KP2ROW( 1 ) ) 629 KP2ROW( 1 ) = NUMROC( M+2, HBL, MYROW, IAFIRST, NPROW ) 630 CALL INFOG1L( 1, HBL, NPCOL, MYCOL, JAFIRST, ISTOP, 631 $ KP2COL( 1 ) ) 632 KP2COL( 1 ) = NUMROC( M+2, HBL, MYCOL, JAFIRST, NPCOL ) 633* 634* Set all values for bulges. All bulges are stored in 635* intermediate steps as loops over KI. Their current "task" 636* over the global M to I-1 values is always K1(KI) to K2(KI). 637* However, because there are many bulges, K1(KI) & K2(KI) might 638* go past that range while later bulges (KI+1,KI+2,etc..) are 639* finishing up. Even if ROTN=1, in order to minimize border 640* communication sometimes K1(KI)=HBL-2 & K2(KI)=HBL-1 so both 641* border messages can be handled at once. 642* 643* Rules: 644* If MOD(K1(KI)-1,HBL) < HBL-2 then MOD(K2(KI)-1,HBL)<HBL-2 645* If MOD(K1(KI)-1,HBL) = HBL-1 then MOD(K2(KI)-1,HBL)=HBL-1 646* K2(KI)-K1(KI) <= ROTN 647* 648* We first hit a border when MOD(K1(KI)-1,HBL)=HBL-2 and we hit 649* it again when MOD(K1(KI)-1,HBL)=HBL-1. 650* 651 DO 30 KI = 1, NBULGE 652 K1( KI ) = M 653 ISTOP = MIN( M+ROTN-1-MOD( M-( M / HBL )*HBL-1, ROTN ), 654 $ I-2 ) 655 ISTOP = MIN( ISTOP, M+HBL-3-MOD( M-1, HBL ) ) 656 ISTOP = MIN( ISTOP, I2-2 ) 657 ISTOP = MAX( ISTOP, M ) 658 IF( ( MOD( M-1, HBL ).EQ.HBL-2 ) .AND. 659 $ ( ISTOP.LT.MIN( I-2, I2-2 ) ) ) THEN 660 ISTOP = ISTOP + 1 661 END IF 662 K2( KI ) = ISTOP 663 ICURROW( KI ) = ISTARTROW 664 ICURCOL( KI ) = ISTARTCOL 665 KROW( KI ) = II 666 KCOL( KI ) = JJ 667 IF( KI.GT.1 ) 668 $ KP2ROW( KI ) = KP2ROW( 1 ) 669 IF( KI.GT.1 ) 670 $ KP2COL( KI ) = KP2COL( 1 ) 671 30 CONTINUE 672* 673* Get first transform on node who owns M+2,M+2 674* 675 DO 31 ITMP1 = 1, 3 676 VCOPY(ITMP1) = ZERO 677 31 CONTINUE 678 ITMP1 = ISTARTROW 679 ITMP2 = ISTARTCOL 680 CALL PZLAWIL( ITMP1, ITMP2, M, A, DESCA, H44, H33, H43H34, 681 $ VCOPY ) 682 V1SAVE = VCOPY( 1 ) 683 V2SAVE = VCOPY( 2 ) 684 V3SAVE = VCOPY( 3 ) 685* 686* The main implicit shift Francis loops over the bulges starts 687* here! 688* 689 IF( K2( IBULGE ).LE.I-1 ) THEN 690 40 CONTINUE 691 IF( ( K1( IBULGE ).GE.M+5 ) .AND. ( IBULGE.LT.NBULGE ) ) 692 $ THEN 693 IF( ( MOD( K2( IBULGE )+2, HBL ).EQ.MOD( K2( IBULGE+1 )+ 694 $ 2, HBL ) ) .AND. ( K1( 1 ).LE.I-1 ) ) THEN 695 H44 = S1( 2*JBLK-2*IBULGE, 2*JBLK-2*IBULGE ) 696 H33 = S1( 2*JBLK-2*IBULGE-1, 2*JBLK-2*IBULGE-1 ) 697 H43H34 = S1( 2*JBLK-2*IBULGE-1, 2*JBLK-2*IBULGE )* 698 $ S1( 2*JBLK-2*IBULGE, 2*JBLK-2*IBULGE-1 ) 699 ITMP1 = ISTARTROW 700 ITMP2 = ISTARTCOL 701 CALL PZLAWIL( ITMP1, ITMP2, M, A, DESCA, H44, H33, 702 $ H43H34, VCOPY ) 703 V1SAVE = VCOPY( 1 ) 704 V2SAVE = VCOPY( 2 ) 705 V3SAVE = VCOPY( 3 ) 706 IBULGE = IBULGE + 1 707 END IF 708 END IF 709* 710* When we hit a border, there are row and column transforms that 711* overlap over several processors and the code gets very 712* "congested." As a remedy, when we first hit a border, a 6x6 713* *local* matrix is generated on one node (called SMALLA) and 714* work is done on that. At the end of the border, the data is 715* passed back and everything stays a lot simpler. 716* 717 DO 120 KI = 1, IBULGE 718* 719 ISTART = MAX( K1( KI ), M ) 720 ISTOP = MIN( K2( KI ), I-1 ) 721 K = ISTART 722 MODKM1 = MOD( K-1, HBL ) 723 IF( ( MODKM1.GE.HBL-2 ) .AND. ( K.LE.I-1 ) ) THEN 724 DO 81 ITMP1 = 1, 6 725 DO 82 ITMP2 = 1, 6 726 SMALLA(ITMP1, ITMP2, KI) = ZERO 727 82 CONTINUE 728 81 CONTINUE 729 IF( ( MODKM1.EQ.HBL-2 ) .AND. ( K.LT.I-1 ) ) THEN 730* 731* Copy 6 elements from global A(K-1:K+4,K-1:K+4) 732* 733 ITMP1 = ICURROW( KI ) 734 ITMP2 = ICURCOL( KI ) 735 CALL PZLACP3( MIN( 6, N-K+2 ), K-1, A, DESCA, 736 $ SMALLA( 1, 1, KI ), 6, ITMP1, ITMP2, 737 $ 0 ) 738 END IF 739 IF( MODKM1.EQ.HBL-1 ) THEN 740* 741* Copy 6 elements from global A(K-2:K+3,K-2:K+3) 742* 743 CALL INFOG2L( K+1, K+1, DESCA, NPROW, NPCOL, MYROW, 744 $ MYCOL, IROW1, ICOL1, ITMP1, ITMP2 ) 745 CALL PZLACP3( MIN( 6, N-K+3 ), K-2, A, DESCA, 746 $ SMALLA( 1, 1, KI ), 6, ITMP1, ITMP2, 747 $ 0 ) 748 END IF 749 END IF 750* 751* 752* ZLAHQR used to have a single row application and a single 753* column application to H. Here we do something a little 754* more clever. We break each transformation down into 3 755* parts: 756* 1.) The minimum amount of work it takes to determine 757* a group of ROTN transformations (this is on 758* the critical path.) (Loops 50-120) 759* (the data is broadcast now: loops 180-240) 760* 2.) The small work it takes so that each of the rows 761* and columns is at the same place. For example, 762* all ROTN row transforms are all complete 763* through some column TMP. (Loops 250-260) 764* 3.) The majority of the row and column transforms 765* are then applied in a block fashion. 766* (row transforms are in loops 280-380) 767* (col transforms are in loops 400-540) 768* 769* Each of these three parts are further subdivided into 3 770* parts: 771* A.) Work at the start of a border when 772* MOD(ISTART-1,HBL) = HBL-2 773* B.) Work at the end of a border when 774* MOD(ISTART-1,HBL) = HBL-1 775* C.) Work in the middle of the block when 776* MOD(ISTART-1,HBL) < HBL-2 777* 778* Further optimization is met with the boolean SKIP. A border 779* communication can be broken into several parts for 780* efficient parallelism: 781* Loop over all the bulges, just sending the data out 782* Loop over all the bulges, just doing the work 783* Loop over all the bulges, just sending the data back. 784* 785* 786 IF( ( MYROW.EQ.ICURROW( KI ) ) .AND. 787 $ ( MYCOL.EQ.ICURCOL( KI ) ) .AND. 788 $ ( MODKM1.EQ.HBL-2 ) .AND. 789 $ ( ISTART.LT.MIN( I-1, ISTOP+1 ) ) ) THEN 790 K = ISTART 791 NR = MIN( 3, I-K+1 ) 792 IF( K.GT.M ) THEN 793 CALL ZCOPY( NR, SMALLA( 2, 1, KI ), 1, VCOPY, 1 ) 794 ELSE 795 VCOPY( 1 ) = V1SAVE 796 VCOPY( 2 ) = V2SAVE 797 VCOPY( 3 ) = V3SAVE 798 END IF 799 CALL ZLARFG( NR, VCOPY( 1 ), VCOPY( 2 ), 1, T1COPY ) 800 IF( K.GT.M ) THEN 801 SMALLA( 2, 1, KI ) = VCOPY( 1 ) 802 SMALLA( 3, 1, KI ) = ZERO 803 IF( K.LT.I-1 ) 804 $ SMALLA( 4, 1, KI ) = ZERO 805 ELSE IF( M.GT.L ) THEN 806* 807* Following differs in comparison to pdlahqr. 808* 809 SMALLA( 2, 1, KI ) = SMALLA( 2, 1, KI ) - 810 $ DCONJG( T1COPY )* 811 $ SMALLA( 2, 1, KI ) 812 END IF 813 V2 = VCOPY( 2 ) 814 T2 = T1COPY*V2 815 WORK( VECSIDX+( K-1 )*3+1 ) = VCOPY( 2 ) 816 WORK( VECSIDX+( K-1 )*3+2 ) = VCOPY( 3 ) 817 WORK( VECSIDX+( K-1 )*3+3 ) = T1COPY 818 IF( NR.EQ.3 ) THEN 819* 820* Do some work so next step is ready... 821* 822 T1 = T1COPY 823 V3 = VCOPY( 3 ) 824 T3 = T1*V3 825 ITMP1 = MIN( 6, I2+2-K ) 826 ITMP2 = MAX( I1-K+2, 1 ) 827 DO 50 J = 2, ITMP1 828 SUM = DCONJG( T1 )*SMALLA( 2, J, KI ) + 829 $ DCONJG( T2 )*SMALLA( 3, J, KI ) + 830 $ DCONJG( T3 )*SMALLA( 4, J, KI ) 831 SMALLA( 2, J, KI ) = SMALLA( 2, J, KI ) - SUM 832 SMALLA( 3, J, KI ) = SMALLA( 3, J, KI ) - SUM*V2 833 SMALLA( 4, J, KI ) = SMALLA( 4, J, KI ) - SUM*V3 834 50 CONTINUE 835 DO 60 J = ITMP2, 5 836 SUM = T1*SMALLA( J, 2, KI ) + 837 $ T2*SMALLA( J, 3, KI ) + 838 $ T3*SMALLA( J, 4, KI ) 839 SMALLA( J, 2, KI ) = SMALLA( J, 2, KI ) - SUM 840 SMALLA( J, 3, KI ) = SMALLA( J, 3, KI ) - 841 $ SUM*DCONJG( V2 ) 842 SMALLA( J, 4, KI ) = SMALLA( J, 4, KI ) - 843 $ SUM*DCONJG( V3 ) 844 60 CONTINUE 845 END IF 846 END IF 847* 848 IF( ( MOD( ISTOP-1, HBL ).EQ.HBL-1 ) .AND. 849 $ ( MYROW.EQ.ICURROW( KI ) ) .AND. 850 $ ( MYCOL.EQ.ICURCOL( KI ) ) .AND. 851 $ ( ISTART.LE.MIN( I, ISTOP ) ) ) THEN 852 K = ISTOP 853 NR = MIN( 3, I-K+1 ) 854 IF( K.GT.M ) THEN 855 CALL ZCOPY( NR, SMALLA( 3, 2, KI ), 1, VCOPY, 1 ) 856 ELSE 857 VCOPY( 1 ) = V1SAVE 858 VCOPY( 2 ) = V2SAVE 859 VCOPY( 3 ) = V3SAVE 860 END IF 861 CALL ZLARFG( NR, VCOPY( 1 ), VCOPY( 2 ), 1, T1COPY ) 862 IF( K.GT.M ) THEN 863 SMALLA( 3, 2, KI ) = VCOPY( 1 ) 864 SMALLA( 4, 2, KI ) = ZERO 865 IF( K.LT.I-1 ) 866 $ SMALLA( 5, 2, KI ) = ZERO 867* 868* Set a subdiagonal to zero now if it's possible 869* 870 IF( ( K-2.GT.M ) .AND. ( MOD( K-1, HBL ).GT.1 ) ) 871 $ THEN 872 H11 = SMALLA( 1, 1, KI ) 873 H10 = SMALLA( 2, 1, KI ) 874 H22 = SMALLA( 2, 2, KI ) 875 S = CABS1( H11 ) + CABS1( H22 ) 876 IF( CABS1( H10 ).LE.MAX( ULP*S, SMLNUM ) ) THEN 877 SMALLA( 2, 1, KI ) = ZERO 878 END IF 879 END IF 880 ELSE IF( M.GT.L ) THEN 881* 882* Following differs in comparison to pdlahqr. 883* 884 SMALLA( 3, 2, KI ) = SMALLA( 3, 2, KI ) - 885 $ DCONJG( T1COPY )* 886 $ SMALLA( 3, 2, KI ) 887 END IF 888 V2 = VCOPY( 2 ) 889 T2 = T1COPY*V2 890 WORK( VECSIDX+( K-1 )*3+1 ) = VCOPY( 2 ) 891 WORK( VECSIDX+( K-1 )*3+2 ) = VCOPY( 3 ) 892 WORK( VECSIDX+( K-1 )*3+3 ) = T1COPY 893 IF( NR.EQ.3 ) THEN 894* 895* Do some work so next step is ready... 896* 897 T1 = T1COPY 898 V3 = VCOPY( 3 ) 899 T3 = T1*V3 900 ITMP1 = MIN( 6, I2-K+3 ) 901 ITMP2 = MAX( I1-K+3, 1 ) 902 DO 70 J = 3, ITMP1 903 SUM = DCONJG( T1 )*SMALLA( 3, J, KI ) + 904 $ DCONJG( T2 )*SMALLA( 4, J, KI ) + 905 $ DCONJG( T3 )*SMALLA( 5, J, KI ) 906 SMALLA( 3, J, KI ) = SMALLA( 3, J, KI ) - SUM 907 SMALLA( 4, J, KI ) = SMALLA( 4, J, KI ) - SUM*V2 908 SMALLA( 5, J, KI ) = SMALLA( 5, J, KI ) - SUM*V3 909 70 CONTINUE 910 DO 80 J = ITMP2, 6 911 SUM = T1*SMALLA( J, 3, KI ) + 912 $ T2*SMALLA( J, 4, KI ) + 913 $ T3*SMALLA( J, 5, KI ) 914 SMALLA( J, 3, KI ) = SMALLA( J, 3, KI ) - SUM 915 SMALLA( J, 4, KI ) = SMALLA( J, 4, KI ) - 916 $ SUM*DCONJG( V2 ) 917 SMALLA( J, 5, KI ) = SMALLA( J, 5, KI ) - 918 $ SUM*DCONJG( V3 ) 919 80 CONTINUE 920 END IF 921 END IF 922* 923 IF( ( MODKM1.EQ.0 ) .AND. ( ISTART.LE.I-1 ) .AND. 924 $ ( MYROW.EQ.ICURROW( KI ) ) .AND. 925 $ ( RIGHT.EQ.ICURCOL( KI ) ) ) THEN 926* 927* (IROW1,ICOL1) is (I,J)-coordinates of H(ISTART,ISTART) 928* 929 IROW1 = KROW( KI ) 930 ICOL1 = KCOL( KI ) 931* 932* The ELSE part of this IF needs updated VCOPY, this 933* was not necessary in PDLAHQR. 934* 935 IF( ISTART.GT.M ) THEN 936 VCOPY( 1 ) = SMALLA( 4, 3, KI ) 937 VCOPY( 2 ) = SMALLA( 5, 3, KI ) 938 VCOPY( 3 ) = SMALLA( 6, 3, KI ) 939 NR = MIN( 3, I-ISTART+1 ) 940 CALL ZLARFG( NR, VCOPY( 1 ), VCOPY( 2 ), 1, 941 $ T1COPY ) 942 A( ( ICOL1-2 )*LDA+IROW1 ) = VCOPY( 1 ) 943 A( ( ICOL1-2 )*LDA+IROW1+1 ) = ZERO 944 IF( ISTART.LT.I-1 ) THEN 945 A( ( ICOL1-2 )*LDA+IROW1+2 ) = ZERO 946 END IF 947 ELSE 948* 949* If NPCOL.NE.1 THEN we need updated VCOPY. 950* 951 NR = MIN( 3, I-ISTART+1 ) 952 IF( NPCOL.EQ.1 ) THEN 953 VCOPY( 1 ) = V1SAVE 954 VCOPY( 2 ) = V2SAVE 955 VCOPY( 3 ) = V3SAVE 956 ELSE 957* 958* Get updated VCOPY from RIGHT 959* 960 CALL ZGERV2D( CONTXT, 3, 1, VCOPY, 3, MYROW, 961 $ RIGHT ) 962 END IF 963 CALL ZLARFG( NR, VCOPY( 1 ), VCOPY( 2 ), 1, 964 $ T1COPY ) 965 IF( M.GT.L ) THEN 966* 967* Following differs in comparison to pdlahqr. 968* 969 A( ( ICOL1-2 )*LDA+IROW1 ) = A( ( ICOL1-2 )*LDA+ 970 $ IROW1 )*DCONJG( ONE-T1COPY ) 971 END IF 972 END IF 973 END IF 974* 975 IF( ( MYROW.EQ.ICURROW( KI ) ) .AND. 976 $ ( MYCOL.EQ.ICURCOL( KI ) ) .AND. 977 $ ( ( ( MODKM1.EQ.HBL-2 ) .AND. ( ISTART.EQ.I- 978 $ 1 ) ) .OR. ( ( MODKM1.LT.HBL-2 ) .AND. ( ISTART.LE.I- 979 $ 1 ) ) ) ) THEN 980* 981* (IROW1,ICOL1) is (I,J)-coordinates of H(ISTART,ISTART) 982* 983 IROW1 = KROW( KI ) 984 ICOL1 = KCOL( KI ) 985 DO 110 K = ISTART, ISTOP 986* 987* Create and do these transforms 988* 989 NR = MIN( 3, I-K+1 ) 990 IF( K.GT.M ) THEN 991 IF( MOD( K-1, HBL ).EQ.0 ) THEN 992 VCOPY( 1 ) = SMALLA( 4, 3, KI ) 993 VCOPY( 2 ) = SMALLA( 5, 3, KI ) 994 VCOPY( 3 ) = SMALLA( 6, 3, KI ) 995 ELSE 996 VCOPY( 1 ) = A( ( ICOL1-2 )*LDA+IROW1 ) 997 VCOPY( 2 ) = A( ( ICOL1-2 )*LDA+IROW1+1 ) 998 IF( NR.EQ.3 ) THEN 999 VCOPY( 3 ) = A( ( ICOL1-2 )*LDA+IROW1+2 ) 1000 END IF 1001 END IF 1002 ELSE 1003 VCOPY( 1 ) = V1SAVE 1004 VCOPY( 2 ) = V2SAVE 1005 VCOPY( 3 ) = V3SAVE 1006 END IF 1007* 1008* Must send uptodate copy of VCOPY to left. 1009* 1010 IF( NPCOL.GT.1 .AND. ISTART.LE.M .AND. 1011 $ MOD( K-1, HBL ).EQ.0 ) THEN 1012 CALL ZGESD2D( CONTXT, 3, 1, VCOPY, 3, MYROW, 1013 $ LEFT ) 1014 END IF 1015 CALL ZLARFG( NR, VCOPY( 1 ), VCOPY( 2 ), 1, 1016 $ T1COPY ) 1017 IF( K.GT.M ) THEN 1018 IF( MOD( K-1, HBL ).GT.0 ) THEN 1019 A( ( ICOL1-2 )*LDA+IROW1 ) = VCOPY( 1 ) 1020 A( ( ICOL1-2 )*LDA+IROW1+1 ) = ZERO 1021 IF( K.LT.I-1 ) THEN 1022 A( ( ICOL1-2 )*LDA+IROW1+2 ) = ZERO 1023 END IF 1024* 1025* Set a subdiagonal to zero now if it's possible 1026* 1027 IF( ( IROW1.GT.2 ) .AND. ( ICOL1.GT.2 ) .AND. 1028 $ ( K-2.GT.M ) .AND. ( MOD( K-1, 1029 $ HBL ).GT.1 ) ) THEN 1030 H11 = A( ( ICOL1-3 )*LDA+IROW1-2 ) 1031 H10 = A( ( ICOL1-3 )*LDA+IROW1-1 ) 1032 H22 = A( ( ICOL1-2 )*LDA+IROW1-1 ) 1033 S = CABS1( H11 ) + CABS1( H22 ) 1034 IF( CABS1( H10 ).LE.MAX( ULP*S, SMLNUM ) ) 1035 $ THEN 1036 A( ( ICOL1-3 )*LDA+IROW1-1 ) = ZERO 1037 END IF 1038 END IF 1039 END IF 1040 ELSE IF( M.GT.L ) THEN 1041 IF( MOD( K-1, HBL ).GT.0 ) THEN 1042* 1043* Following differs in comparison to pdlahqr. 1044* 1045 A( ( ICOL1-2 )*LDA+IROW1 ) = A( ( ICOL1-2 )* 1046 $ LDA+IROW1 )*DCONJG( ONE-T1COPY ) 1047 END IF 1048 END IF 1049 V2 = VCOPY( 2 ) 1050 T2 = T1COPY*V2 1051 WORK( VECSIDX+( K-1 )*3+1 ) = VCOPY( 2 ) 1052 WORK( VECSIDX+( K-1 )*3+2 ) = VCOPY( 3 ) 1053 WORK( VECSIDX+( K-1 )*3+3 ) = T1COPY 1054 T1 = T1COPY 1055 IF( K.LT.ISTOP ) THEN 1056* 1057* Do some work so next step is ready... 1058* 1059 V3 = VCOPY( 3 ) 1060 T3 = T1*V3 1061 DO 90 J = ( ICOL1-1 )*LDA + IROW1, 1062 $ ( MIN( K2( KI )+1, I-1 )+ICOL1-K-1 )* 1063 $ LDA + IROW1, LDA 1064 SUM = DCONJG( T1 )*A( J ) + 1065 $ DCONJG( T2 )*A( J+1 ) + 1066 $ DCONJG( T3 )*A( J+2 ) 1067 A( J ) = A( J ) - SUM 1068 A( J+1 ) = A( J+1 ) - SUM*V2 1069 A( J+2 ) = A( J+2 ) - SUM*V3 1070 90 CONTINUE 1071 DO 100 J = IROW1 + 1, IROW1 + 3 1072 SUM = T1*A( ( ICOL1-1 )*LDA+J ) + 1073 $ T2*A( ICOL1*LDA+J ) + 1074 $ T3*A( ( ICOL1+1 )*LDA+J ) 1075 A( ( ICOL1-1 )*LDA+J ) = A( ( ICOL1-1 )*LDA+ 1076 $ J ) - SUM 1077 A( ICOL1*LDA+J ) = A( ICOL1*LDA+J ) - 1078 $ SUM*DCONJG( V2 ) 1079 A( ( ICOL1+1 )*LDA+J ) = A( ( ICOL1+1 )*LDA+ 1080 $ J ) - SUM*DCONJG( V3 ) 1081 100 CONTINUE 1082 END IF 1083 IROW1 = IROW1 + 1 1084 ICOL1 = ICOL1 + 1 1085 110 CONTINUE 1086 END IF 1087 120 CONTINUE 1088* 1089* First part of applying the transforms is complete. 1090* Broadcasts of the Householder data is done here. 1091* 1092 DO 130 KI = 1, IBULGE 1093* 1094 ISTART = MAX( K1( KI ), M ) 1095 ISTOP = MIN( K2( KI ), I-1 ) 1096* 1097* Broadcast Householder information from the block 1098* 1099 IF( ( MYROW.EQ.ICURROW( KI ) ) .AND. ( NPCOL.GT.1 ) .AND. 1100 $ ( ISTART.LE.ISTOP ) ) THEN 1101 IF( MYCOL.NE.ICURCOL( KI ) ) THEN 1102 CALL ZGEBR2D( CONTXT, 'ROW', ' ', 1103 $ 3*( ISTOP-ISTART+1 ), 1, 1104 $ WORK( VECSIDX+( ISTART-1 )*3+1 ), 1105 $ 3*( ISTOP-ISTART+1 ), MYROW, 1106 $ ICURCOL( KI ) ) 1107 ELSE 1108 CALL ZGEBS2D( CONTXT, 'ROW', ' ', 1109 $ 3*( ISTOP-ISTART+1 ), 1, 1110 $ WORK( VECSIDX+( ISTART-1 )*3+1 ), 1111 $ 3*( ISTOP-ISTART+1 ) ) 1112 END IF 1113 END IF 1114 130 CONTINUE 1115* 1116* Now do column transforms and finish work 1117* 1118 DO 140 KI = 1, IBULGE 1119* 1120 ISTART = MAX( K1( KI ), M ) 1121 ISTOP = MIN( K2( KI ), I-1 ) 1122* 1123 IF( ( MYCOL.EQ.ICURCOL( KI ) ) .AND. ( NPROW.GT.1 ) .AND. 1124 $ ( ISTART.LE.ISTOP ) ) THEN 1125 IF( MYROW.NE.ICURROW( KI ) ) THEN 1126 CALL ZGEBR2D( CONTXT, 'COL', ' ', 1127 $ 3*( ISTOP-ISTART+1 ), 1, 1128 $ WORK( VECSIDX+( ISTART-1 )*3+1 ), 1129 $ 3*( ISTOP-ISTART+1 ), ICURROW( KI ), 1130 $ MYCOL ) 1131 ELSE 1132 CALL ZGEBS2D( CONTXT, 'COL', ' ', 1133 $ 3*( ISTOP-ISTART+1 ), 1, 1134 $ WORK( VECSIDX+( ISTART-1 )*3+1 ), 1135 $ 3*( ISTOP-ISTART+1 ) ) 1136 END IF 1137 END IF 1138 140 CONTINUE 1139* 1140* 1141* Now do make up work to have things in block fashion 1142* 1143 DO 160 KI = 1, IBULGE 1144 ISTART = MAX( K1( KI ), M ) 1145 ISTOP = MIN( K2( KI ), I-1 ) 1146* 1147 MODKM1 = MOD( ISTART-1, HBL ) 1148 IF( ( MYROW.EQ.ICURROW( KI ) ) .AND. 1149 $ ( MYCOL.EQ.ICURCOL( KI ) ) .AND. 1150 $ ( ( ( MODKM1.EQ.HBL-2 ) .AND. ( ISTART.EQ.I- 1151 $ 1 ) ) .OR. ( ( MODKM1.LT.HBL-2 ) .AND. ( ISTART.LE.I- 1152 $ 1 ) ) ) ) THEN 1153* 1154* (IROW1,ICOL1) is (I,J)-coordinates of H(ISTART,ISTART) 1155* 1156 IROW1 = KROW( KI ) 1157 ICOL1 = KCOL( KI ) 1158 DO 150 K = ISTART, ISTOP 1159* 1160* Catch up on column & border work 1161* 1162 NR = MIN( 3, I-K+1 ) 1163 V2 = WORK( VECSIDX+( K-1 )*3+1 ) 1164 V3 = WORK( VECSIDX+( K-1 )*3+2 ) 1165 T1 = WORK( VECSIDX+( K-1 )*3+3 ) 1166 T2 = T1*V2 1167 IF( K.LT.ISTOP ) THEN 1168* 1169* Do some work so next step is ready... 1170* 1171 T3 = T1*V3 1172 CALL ZLAREF( 'Col', A, LDA, .FALSE., Z, LDZ, 1173 $ .FALSE., ICOL1, ICOL1, ISTART, 1174 $ ISTOP, MIN( ISTART+1, I )-K+IROW1, 1175 $ IROW1, LILOZ, LIHIZ, 1176 $ WORK( VECSIDX+1 ), V2, V3, T1, T2, 1177 $ T3 ) 1178 IROW1 = IROW1 + 1 1179 ICOL1 = ICOL1 + 1 1180 ELSE 1181 IF( ( NR.EQ.3 ) .AND. ( MOD( K-1, 1182 $ HBL ).LT.HBL-2 ) ) THEN 1183 T3 = T1*V3 1184 CALL ZLAREF( 'Row', A, LDA, .FALSE., Z, LDZ, 1185 $ .FALSE., IROW1, IROW1, ISTART, 1186 $ ISTOP, ICOL1, MIN( MIN( K2( KI ) 1187 $ +1, I-1 ), I2 )-K+ICOL1, LILOZ, 1188 $ LIHIZ, WORK( VECSIDX+1 ), V2, 1189 $ V3, T1, T2, T3 ) 1190 END IF 1191 END IF 1192 150 CONTINUE 1193 END IF 1194* 1195* Send SMALLA back again. 1196* 1197 K = ISTART 1198 MODKM1 = MOD( K-1, HBL ) 1199 IF( ( MODKM1.GE.HBL-2 ) .AND. ( K.LE.I-1 ) ) THEN 1200 IF( ( MODKM1.EQ.HBL-2 ) .AND. ( K.LT.I-1 ) ) THEN 1201* 1202* Copy 6 elements from global A(K-1:K+4,K-1:K+4) 1203* 1204 ITMP1 = ICURROW( KI ) 1205 ITMP2 = ICURCOL( KI ) 1206 CALL PZLACP3( MIN( 6, N-K+2 ), K-1, A, DESCA, 1207 $ SMALLA( 1, 1, KI ), 6, ITMP1, ITMP2, 1208 $ 1 ) 1209* 1210 END IF 1211 IF( MODKM1.EQ.HBL-1 ) THEN 1212* 1213* Copy 6 elements from global A(K-2:K+3,K-2:K+3) 1214* 1215 ITMP1 = ICURROW( KI ) 1216 ITMP2 = ICURCOL( KI ) 1217 CALL PZLACP3( MIN( 6, N-K+3 ), K-2, A, DESCA, 1218 $ SMALLA( 1, 1, KI ), 6, ITMP1, ITMP2, 1219 $ 1 ) 1220 END IF 1221 END IF 1222* 1223 160 CONTINUE 1224* 1225 170 CONTINUE 1226* 1227* Now start major set of block ROW reflections 1228* 1229 DO 180 KI = 1, IBULGE 1230 IF( ( MYROW.NE.ICURROW( KI ) ) .AND. 1231 $ ( DOWN.NE.ICURROW( KI ) ) )GO TO 180 1232 ISTART = MAX( K1( KI ), M ) 1233 ISTOP = MIN( K2( KI ), I-1 ) 1234* 1235 IF( ( ISTOP.GT.ISTART ) .AND. 1236 $ ( MOD( ISTART-1, HBL ).LT.HBL-2 ) .AND. 1237 $ ( ICURROW( KI ).EQ.MYROW ) ) THEN 1238 IROW1 = MIN( K2( KI )+1, I-1 ) + 1 1239 CALL INFOG1L( IROW1, HBL, NPCOL, MYCOL, JAFIRST, 1240 $ ITMP1, ITMP2 ) 1241 ITMP2 = LOCALI2 1242 II = KROW( KI ) 1243 CALL ZLAREF( 'Row', A, LDA, WANTZ, Z, LDZ, .TRUE., II, 1244 $ II, ISTART, ISTOP, ITMP1, ITMP2, LILOZ, 1245 $ LIHIZ, WORK( VECSIDX+1 ), V2, V3, T1, T2, 1246 $ T3 ) 1247 END IF 1248 180 CONTINUE 1249* 1250 DO 220 KI = 1, IBULGE 1251 IF( KROW( KI ).GT.KP2ROW( KI ) ) 1252 $ GO TO 220 1253 IF( ( MYROW.NE.ICURROW( KI ) ) .AND. 1254 $ ( DOWN.NE.ICURROW( KI ) ) )GO TO 220 1255 ISTART = MAX( K1( KI ), M ) 1256 ISTOP = MIN( K2( KI ), I-1 ) 1257 IF( ( ISTART.EQ.ISTOP ) .OR. 1258 $ ( MOD( ISTART-1, HBL ).GE.HBL-2 ) .OR. 1259 $ ( ICURROW( KI ).NE.MYROW ) ) THEN 1260 DO 210 K = ISTART, ISTOP 1261 V2 = WORK( VECSIDX+( K-1 )*3+1 ) 1262 V3 = WORK( VECSIDX+( K-1 )*3+2 ) 1263 T1 = WORK( VECSIDX+( K-1 )*3+3 ) 1264 NR = MIN( 3, I-K+1 ) 1265 IF( ( NR.EQ.3 ) .AND. ( KROW( KI ).LE. 1266 $ KP2ROW( KI ) ) ) THEN 1267 IF( ( K.LT.ISTOP ) .AND. 1268 $ ( MOD( K-1, HBL ).LT.HBL-2 ) ) THEN 1269 ITMP1 = MIN( K2( KI )+1, I-1 ) + 1 1270 ELSE 1271 IF( MOD( K-1, HBL ).LT.HBL-2 ) THEN 1272 ITMP1 = MIN( K2( KI )+1, I-1 ) + 1 1273 END IF 1274 IF( MOD( K-1, HBL ).EQ.HBL-2 ) THEN 1275 ITMP1 = MIN( K+4, I2 ) + 1 1276 END IF 1277 IF( MOD( K-1, HBL ).EQ.HBL-1 ) THEN 1278 ITMP1 = MIN( K+3, I2 ) + 1 1279 END IF 1280 END IF 1281* 1282* Find local coor of rows K through K+2 1283* 1284 IROW1 = KROW( KI ) 1285 IROW2 = KP2ROW( KI ) 1286 IF( ( K.GT.ISTART ) .AND. 1287 $ ( MOD( K-1, HBL ).GE.HBL-2 ) ) THEN 1288 IF( DOWN.EQ.ICURROW( KI ) ) THEN 1289 IROW1 = IROW1 + 1 1290 END IF 1291 IF( MYROW.EQ.ICURROW( KI ) ) THEN 1292 IROW2 = IROW2 + 1 1293 END IF 1294 END IF 1295 CALL INFOG1L( ITMP1, HBL, NPCOL, MYCOL, JAFIRST, 1296 $ ICOL1, ICOL2 ) 1297 ICOL2 = LOCALI2 1298 IF( ( MOD( K-1, HBL ).LT.HBL-2 ) .OR. 1299 $ ( NPROW.EQ.1 ) ) THEN 1300 T2 = T1*V2 1301 T3 = T1*V3 1302 CALL ZLAREF( 'Row', A, LDA, WANTZ, Z, LDZ, 1303 $ .FALSE., IROW1, IROW1, ISTART, 1304 $ ISTOP, ICOL1, ICOL2, LILOZ, 1305 $ LIHIZ, WORK( VECSIDX+1 ), V2, 1306 $ V3, T1, T2, T3 ) 1307 END IF 1308 IF( ( MOD( K-1, HBL ).EQ.HBL-2 ) .AND. 1309 $ ( NPROW.GT.1 ) ) THEN 1310 IF( IROW1.NE.IROW2 ) THEN 1311 CALL ZGESD2D( CONTXT, 2, ICOL2-ICOL1+1, 1312 $ A( ( ICOL1-1 )*LDA+IROW1 ), 1313 $ LDA, DOWN, MYCOL ) 1314 IF( SKIP .AND. ( ISTART.EQ.ISTOP ) ) THEN 1315 CALL ZGERV2D( CONTXT, 2, ICOL2-ICOL1+1, 1316 $ A( ( ICOL1-1 )*LDA+ 1317 $ IROW1 ), LDA, DOWN, 1318 $ MYCOL ) 1319 END IF 1320 ELSE IF( SKIP ) THEN 1321 CALL ZGERV2D( CONTXT, 2, ICOL2-ICOL1+1, 1322 $ WORK( IRBUF+1 ), 2, UP, 1323 $ MYCOL ) 1324 T2 = T1*V2 1325 T3 = T1*V3 1326 DO 190 J = ICOL1, ICOL2 1327 SUM = DCONJG( T1 )* 1328 $ WORK( IRBUF+2*( J-ICOL1 )+1 ) + 1329 $ DCONJG( T2 )*WORK( IRBUF+2* 1330 $ ( J-ICOL1 )+2 ) + 1331 $ DCONJG( T3 )*A( ( J-1 )*LDA+ 1332 $ IROW1 ) 1333 WORK( IRBUF+2*( J-ICOL1 )+1 ) 1334 $ = WORK( IRBUF+2*( J-ICOL1 )+1 ) - 1335 $ SUM 1336 WORK( IRBUF+2*( J-ICOL1 )+2 ) 1337 $ = WORK( IRBUF+2*( J-ICOL1 )+2 ) - 1338 $ SUM*V2 1339 A( ( J-1 )*LDA+IROW1 ) = A( ( J-1 )* 1340 $ LDA+IROW1 ) - SUM*V3 1341 190 CONTINUE 1342 IF( ISTART.EQ.ISTOP ) THEN 1343 CALL ZGESD2D( CONTXT, 2, ICOL2-ICOL1+1, 1344 $ WORK( IRBUF+1 ), 2, UP, 1345 $ MYCOL ) 1346 END IF 1347 END IF 1348 END IF 1349 IF( ( MOD( K-1, HBL ).EQ.HBL-1 ) .AND. 1350 $ ( NPROW.GT.1 ) ) THEN 1351 IF( IROW1.EQ.IROW2 ) THEN 1352 IF( ISTART.EQ.ISTOP ) THEN 1353 CALL ZGESD2D( CONTXT, 2, ICOL2-ICOL1+1, 1354 $ A( ( ICOL1-1 )*LDA+IROW1- 1355 $ 1 ), LDA, DOWN, MYCOL ) 1356 END IF 1357 IF( SKIP ) THEN 1358 CALL ZGERV2D( CONTXT, 2, ICOL2-ICOL1+1, 1359 $ A( ( ICOL1-1 )*LDA+IROW1- 1360 $ 1 ), LDA, DOWN, MYCOL ) 1361 END IF 1362 ELSE IF( SKIP ) THEN 1363 IF( ISTART.EQ.ISTOP ) THEN 1364 CALL ZGERV2D( CONTXT, 2, ICOL2-ICOL1+1, 1365 $ WORK( IRBUF+1 ), 2, UP, 1366 $ MYCOL ) 1367 END IF 1368 T2 = T1*V2 1369 T3 = T1*V3 1370 DO 200 J = ICOL1, ICOL2 1371 SUM = DCONJG( T1 )* 1372 $ WORK( IRBUF+2*( J-ICOL1 )+2 ) + 1373 $ DCONJG( T2 )*A( ( J-1 )*LDA+ 1374 $ IROW1 ) + DCONJG( T3 )* 1375 $ A( ( J-1 )*LDA+IROW1+1 ) 1376 WORK( IRBUF+2*( J-ICOL1 )+2 ) 1377 $ = WORK( IRBUF+2*( J-ICOL1 )+2 ) - 1378 $ SUM 1379 A( ( J-1 )*LDA+IROW1 ) = A( ( J-1 )* 1380 $ LDA+IROW1 ) - SUM*V2 1381 A( ( J-1 )*LDA+IROW1+1 ) = A( ( J-1 )* 1382 $ LDA+IROW1+1 ) - SUM*V3 1383 200 CONTINUE 1384 CALL ZGESD2D( CONTXT, 2, ICOL2-ICOL1+1, 1385 $ WORK( IRBUF+1 ), 2, UP, 1386 $ MYCOL ) 1387* 1388 END IF 1389 END IF 1390 END IF 1391 210 CONTINUE 1392 END IF 1393 220 CONTINUE 1394* 1395 IF( SKIP ) 1396 $ GO TO 290 1397* 1398 DO 260 KI = 1, IBULGE 1399 IF( KROW( KI ).GT.KP2ROW( KI ) ) 1400 $ GO TO 260 1401 IF( ( MYROW.NE.ICURROW( KI ) ) .AND. 1402 $ ( DOWN.NE.ICURROW( KI ) ) )GO TO 260 1403 ISTART = MAX( K1( KI ), M ) 1404 ISTOP = MIN( K2( KI ), I-1 ) 1405 IF( ( ISTART.EQ.ISTOP ) .OR. 1406 $ ( MOD( ISTART-1, HBL ).GE.HBL-2 ) .OR. 1407 $ ( ICURROW( KI ).NE.MYROW ) ) THEN 1408 DO 250 K = ISTART, ISTOP 1409 V2 = WORK( VECSIDX+( K-1 )*3+1 ) 1410 V3 = WORK( VECSIDX+( K-1 )*3+2 ) 1411 T1 = WORK( VECSIDX+( K-1 )*3+3 ) 1412 NR = MIN( 3, I-K+1 ) 1413 IF( ( NR.EQ.3 ) .AND. ( KROW( KI ).LE. 1414 $ KP2ROW( KI ) ) ) THEN 1415 IF( ( K.LT.ISTOP ) .AND. 1416 $ ( MOD( K-1, HBL ).LT.HBL-2 ) ) THEN 1417 ITMP1 = MIN( K2( KI )+1, I-1 ) + 1 1418 ELSE 1419 IF( MOD( K-1, HBL ).LT.HBL-2 ) THEN 1420 ITMP1 = MIN( K2( KI )+1, I-1 ) + 1 1421 END IF 1422 IF( MOD( K-1, HBL ).EQ.HBL-2 ) THEN 1423 ITMP1 = MIN( K+4, I2 ) + 1 1424 END IF 1425 IF( MOD( K-1, HBL ).EQ.HBL-1 ) THEN 1426 ITMP1 = MIN( K+3, I2 ) + 1 1427 END IF 1428 END IF 1429* 1430* Find local coor of rows K through K+2 1431* 1432 IROW1 = KROW( KI ) 1433 IROW2 = KP2ROW( KI ) 1434 IF( ( K.GT.ISTART ) .AND. 1435 $ ( MOD( K-1, HBL ).GE.HBL-2 ) ) THEN 1436 IF( DOWN.EQ.ICURROW( KI ) ) THEN 1437 IROW1 = IROW1 + 1 1438 END IF 1439 IF( MYROW.EQ.ICURROW( KI ) ) THEN 1440 IROW2 = IROW2 + 1 1441 END IF 1442 END IF 1443 CALL INFOG1L( ITMP1, HBL, NPCOL, MYCOL, JAFIRST, 1444 $ ICOL1, ICOL2 ) 1445 ICOL2 = LOCALI2 1446 IF( ( MOD( K-1, HBL ).EQ.HBL-2 ) .AND. 1447 $ ( NPROW.GT.1 ) ) THEN 1448 IF( IROW1.EQ.IROW2 ) THEN 1449 CALL ZGERV2D( CONTXT, 2, ICOL2-ICOL1+1, 1450 $ WORK( IRBUF+1 ), 2, UP, 1451 $ MYCOL ) 1452 T2 = T1*V2 1453 T3 = T1*V3 1454 DO 230 J = ICOL1, ICOL2 1455 SUM = DCONJG( T1 )* 1456 $ WORK( IRBUF+2*( J-ICOL1 )+1 ) + 1457 $ DCONJG( T2 )*WORK( IRBUF+2* 1458 $ ( J-ICOL1 )+2 ) + 1459 $ DCONJG( T3 )*A( ( J-1 )*LDA+ 1460 $ IROW1 ) 1461 WORK( IRBUF+2*( J-ICOL1 )+1 ) 1462 $ = WORK( IRBUF+2*( J-ICOL1 )+1 ) - 1463 $ SUM 1464 WORK( IRBUF+2*( J-ICOL1 )+2 ) 1465 $ = WORK( IRBUF+2*( J-ICOL1 )+2 ) - 1466 $ SUM*V2 1467 A( ( J-1 )*LDA+IROW1 ) = A( ( J-1 )* 1468 $ LDA+IROW1 ) - SUM*V3 1469 230 CONTINUE 1470 IF( ISTART.EQ.ISTOP ) THEN 1471 CALL ZGESD2D( CONTXT, 2, ICOL2-ICOL1+1, 1472 $ WORK( IRBUF+1 ), 2, UP, 1473 $ MYCOL ) 1474 END IF 1475 END IF 1476 END IF 1477 IF( ( MOD( K-1, HBL ).EQ.HBL-1 ) .AND. 1478 $ ( NPROW.GT.1 ) ) THEN 1479 IF( IROW1.NE.IROW2 ) THEN 1480 IF( ISTART.EQ.ISTOP ) THEN 1481 CALL ZGERV2D( CONTXT, 2, ICOL2-ICOL1+1, 1482 $ WORK( IRBUF+1 ), 2, UP, 1483 $ MYCOL ) 1484 END IF 1485 T2 = T1*V2 1486 T3 = T1*V3 1487 DO 240 J = ICOL1, ICOL2 1488 SUM = DCONJG( T1 )* 1489 $ WORK( IRBUF+2*( J-ICOL1 )+2 ) + 1490 $ DCONJG( T2 )*A( ( J-1 )*LDA+ 1491 $ IROW1 ) + DCONJG( T3 )* 1492 $ A( ( J-1 )*LDA+IROW1+1 ) 1493 WORK( IRBUF+2*( J-ICOL1 )+2 ) 1494 $ = WORK( IRBUF+2*( J-ICOL1 )+2 ) - 1495 $ SUM 1496 A( ( J-1 )*LDA+IROW1 ) = A( ( J-1 )* 1497 $ LDA+IROW1 ) - SUM*V2 1498 A( ( J-1 )*LDA+IROW1+1 ) = A( ( J-1 )* 1499 $ LDA+IROW1+1 ) - SUM*V3 1500 240 CONTINUE 1501 CALL ZGESD2D( CONTXT, 2, ICOL2-ICOL1+1, 1502 $ WORK( IRBUF+1 ), 2, UP, 1503 $ MYCOL ) 1504 END IF 1505 END IF 1506 END IF 1507 250 CONTINUE 1508 END IF 1509 260 CONTINUE 1510* 1511 DO 280 KI = 1, IBULGE 1512 IF( KROW( KI ).GT.KP2ROW( KI ) ) 1513 $ GO TO 280 1514 IF( ( MYROW.NE.ICURROW( KI ) ) .AND. 1515 $ ( DOWN.NE.ICURROW( KI ) ) )GO TO 280 1516 ISTART = MAX( K1( KI ), M ) 1517 ISTOP = MIN( K2( KI ), I-1 ) 1518 IF( ( ISTART.EQ.ISTOP ) .OR. 1519 $ ( MOD( ISTART-1, HBL ).GE.HBL-2 ) .OR. 1520 $ ( ICURROW( KI ).NE.MYROW ) ) THEN 1521 DO 270 K = ISTART, ISTOP 1522 V2 = WORK( VECSIDX+( K-1 )*3+1 ) 1523 V3 = WORK( VECSIDX+( K-1 )*3+2 ) 1524 T1 = WORK( VECSIDX+( K-1 )*3+3 ) 1525 NR = MIN( 3, I-K+1 ) 1526 IF( ( NR.EQ.3 ) .AND. ( KROW( KI ).LE. 1527 $ KP2ROW( KI ) ) ) THEN 1528 IF( ( K.LT.ISTOP ) .AND. 1529 $ ( MOD( K-1, HBL ).LT.HBL-2 ) ) THEN 1530 ITMP1 = MIN( K2( KI )+1, I-1 ) + 1 1531 ELSE 1532 IF( MOD( K-1, HBL ).LT.HBL-2 ) THEN 1533 ITMP1 = MIN( K2( KI )+1, I-1 ) + 1 1534 END IF 1535 IF( MOD( K-1, HBL ).EQ.HBL-2 ) THEN 1536 ITMP1 = MIN( K+4, I2 ) + 1 1537 END IF 1538 IF( MOD( K-1, HBL ).EQ.HBL-1 ) THEN 1539 ITMP1 = MIN( K+3, I2 ) + 1 1540 END IF 1541 END IF 1542* 1543* Find local coor of rows K through K+2 1544* 1545 IROW1 = KROW( KI ) 1546 IROW2 = KP2ROW( KI ) 1547 IF( ( K.GT.ISTART ) .AND. 1548 $ ( MOD( K-1, HBL ).GE.HBL-2 ) ) THEN 1549 IF( DOWN.EQ.ICURROW( KI ) ) THEN 1550 IROW1 = IROW1 + 1 1551 END IF 1552 IF( MYROW.EQ.ICURROW( KI ) ) THEN 1553 IROW2 = IROW2 + 1 1554 END IF 1555 END IF 1556 CALL INFOG1L( ITMP1, HBL, NPCOL, MYCOL, JAFIRST, 1557 $ ICOL1, ICOL2 ) 1558 ICOL2 = LOCALI2 1559 IF( ( MOD( K-1, HBL ).EQ.HBL-2 ) .AND. 1560 $ ( NPROW.GT.1 ) ) THEN 1561 IF( IROW1.NE.IROW2 ) THEN 1562 IF( ISTART.EQ.ISTOP ) THEN 1563 CALL ZGERV2D( CONTXT, 2, ICOL2-ICOL1+1, 1564 $ A( ( ICOL1-1 )*LDA+ 1565 $ IROW1 ), LDA, DOWN, 1566 $ MYCOL ) 1567 END IF 1568 END IF 1569 END IF 1570 IF( ( MOD( K-1, HBL ).EQ.HBL-1 ) .AND. 1571 $ ( NPROW.GT.1 ) ) THEN 1572 IF( IROW1.EQ.IROW2 ) THEN 1573 CALL ZGERV2D( CONTXT, 2, ICOL2-ICOL1+1, 1574 $ A( ( ICOL1-1 )*LDA+IROW1- 1575 $ 1 ), LDA, DOWN, MYCOL ) 1576 END IF 1577 END IF 1578 END IF 1579 270 CONTINUE 1580 END IF 1581 280 CONTINUE 1582* 1583 290 CONTINUE 1584* 1585* Now start major set of block COL reflections 1586* 1587 DO 300 KI = 1, IBULGE 1588 IF( ( MYCOL.NE.ICURCOL( KI ) ) .AND. 1589 $ ( RIGHT.NE.ICURCOL( KI ) ) )GO TO 300 1590 ISTART = MAX( K1( KI ), M ) 1591 ISTOP = MIN( K2( KI ), I-1 ) 1592* 1593 IF( ( ( MOD( ISTART-1, HBL ).LT.HBL-2 ) .OR. ( NPCOL.EQ. 1594 $ 1 ) ) .AND. ( ICURCOL( KI ).EQ.MYCOL ) .AND. 1595 $ ( I-ISTOP+1.GE.3 ) ) THEN 1596 K = ISTART 1597 IF( ( K.LT.ISTOP ) .AND. ( MOD( K-1, 1598 $ HBL ).LT.HBL-2 ) ) THEN 1599 ITMP1 = MIN( ISTART+1, I ) - 1 1600 ELSE 1601 IF( MOD( K-1, HBL ).LT.HBL-2 ) THEN 1602 ITMP1 = MIN( K+3, I ) 1603 END IF 1604 IF( MOD( K-1, HBL ).EQ.HBL-2 ) THEN 1605 ITMP1 = MAX( I1, K-1 ) - 1 1606 END IF 1607 IF( MOD( K-1, HBL ).EQ.HBL-1 ) THEN 1608 ITMP1 = MAX( I1, K-2 ) - 1 1609 END IF 1610 END IF 1611* 1612 ICOL1 = KCOL( KI ) 1613 CALL INFOG1L( I1, HBL, NPROW, MYROW, IAFIRST, IROW1, 1614 $ IROW2 ) 1615 IROW2 = NUMROC( ITMP1, HBL, MYROW, IAFIRST, NPROW ) 1616 IF( IROW1.LE.IROW2 ) THEN 1617 ITMP2 = IROW2 1618 ELSE 1619 ITMP2 = -1 1620 END IF 1621 CALL ZLAREF( 'Col', A, LDA, WANTZ, Z, LDZ, .TRUE., 1622 $ ICOL1, ICOL1, ISTART, ISTOP, IROW1, 1623 $ IROW2, LILOZ, LIHIZ, WORK( VECSIDX+1 ), 1624 $ V2, V3, T1, T2, T3 ) 1625 K = ISTOP 1626 IF( MOD( K-1, HBL ).LT.HBL-2 ) THEN 1627* 1628* Do from ITMP1+1 to MIN(K+3,I) 1629* 1630 IF( MOD( K-1, HBL ).LT.HBL-3 ) THEN 1631 IROW1 = ITMP2 + 1 1632 IF( MOD( ( ITMP1 / HBL ), NPROW ).EQ.MYROW ) 1633 $ THEN 1634 IF( ITMP2.GT.0 ) THEN 1635 IROW2 = ITMP2 + MIN( K+3, I ) - ITMP1 1636 ELSE 1637 IROW2 = IROW1 - 1 1638 END IF 1639 ELSE 1640 IROW2 = IROW1 - 1 1641 END IF 1642 ELSE 1643 CALL INFOG1L( ITMP1+1, HBL, NPROW, MYROW, 1644 $ IAFIRST, IROW1, IROW2 ) 1645 IROW2 = NUMROC( MIN( K+3, I ), HBL, MYROW, 1646 $ IAFIRST, NPROW ) 1647 END IF 1648 V2 = WORK( VECSIDX+( K-1 )*3+1 ) 1649 V3 = WORK( VECSIDX+( K-1 )*3+2 ) 1650 T1 = WORK( VECSIDX+( K-1 )*3+3 ) 1651 T2 = T1*V2 1652 T3 = T1*V3 1653 ICOL1 = KCOL( KI ) + ISTOP - ISTART 1654 CALL ZLAREF( 'Col', A, LDA, .FALSE., Z, LDZ, 1655 $ .FALSE., ICOL1, ICOL1, ISTART, ISTOP, 1656 $ IROW1, IROW2, LILOZ, LIHIZ, 1657 $ WORK( VECSIDX+1 ), V2, V3, T1, T2, 1658 $ T3 ) 1659 END IF 1660 END IF 1661 300 CONTINUE 1662* 1663 DO 360 KI = 1, IBULGE 1664 IF( KCOL( KI ).GT.KP2COL( KI ) ) 1665 $ GO TO 360 1666 IF( ( MYCOL.NE.ICURCOL( KI ) ) .AND. 1667 $ ( RIGHT.NE.ICURCOL( KI ) ) )GO TO 360 1668 ISTART = MAX( K1( KI ), M ) 1669 ISTOP = MIN( K2( KI ), I-1 ) 1670 IF( MOD( ISTART-1, HBL ).GE.HBL-2 ) THEN 1671* 1672* INFO is found in a buffer 1673* 1674 ISPEC = 1 1675 ELSE 1676* 1677* All INFO is local 1678* 1679 ISPEC = 0 1680 END IF 1681 DO 350 K = ISTART, ISTOP 1682* 1683 V2 = WORK( VECSIDX+( K-1 )*3+1 ) 1684 V3 = WORK( VECSIDX+( K-1 )*3+2 ) 1685 T1 = WORK( VECSIDX+( K-1 )*3+3 ) 1686 NR = MIN( 3, I-K+1 ) 1687 IF( ( NR.EQ.3 ) .AND. ( KCOL( KI ).LE.KP2COL( KI ) ) ) 1688 $ THEN 1689* 1690 IF( ( K.LT.ISTOP ) .AND. 1691 $ ( MOD( K-1, HBL ).LT.HBL-2 ) ) THEN 1692 ITMP1 = MIN( ISTART+1, I ) - 1 1693 ELSE 1694 IF( MOD( K-1, HBL ).LT.HBL-2 ) THEN 1695 ITMP1 = MIN( K+3, I ) 1696 END IF 1697 IF( MOD( K-1, HBL ).EQ.HBL-2 ) THEN 1698 ITMP1 = MAX( I1, K-1 ) - 1 1699 END IF 1700 IF( MOD( K-1, HBL ).EQ.HBL-1 ) THEN 1701 ITMP1 = MAX( I1, K-2 ) - 1 1702 END IF 1703 END IF 1704 IF( MOD( K-1, HBL ).LT.HBL-2 ) THEN 1705 ICOL1 = KCOL( KI ) + K - ISTART 1706 ICOL2 = KP2COL( KI ) + K - ISTART 1707 ELSE 1708 ICOL1 = KCOL( KI ) 1709 ICOL2 = KP2COL( KI ) 1710 IF( K.GT.ISTART ) THEN 1711 IF( RIGHT.EQ.ICURCOL( KI ) ) THEN 1712 ICOL1 = ICOL1 + 1 1713 END IF 1714 IF( MYCOL.EQ.ICURCOL( KI ) ) THEN 1715 ICOL2 = ICOL2 + 1 1716 END IF 1717 END IF 1718 END IF 1719 CALL INFOG1L( I1, HBL, NPROW, MYROW, IAFIRST, 1720 $ IROW1, IROW2 ) 1721 IROW2 = NUMROC( ITMP1, HBL, MYROW, IAFIRST, NPROW ) 1722 IF( ( MOD( K-1, HBL ).EQ.HBL-2 ) .AND. 1723 $ ( NPCOL.GT.1 ) ) THEN 1724 IF( ICOL1.NE.ICOL2 ) THEN 1725 CALL ZGESD2D( CONTXT, IROW2-IROW1+1, 2, 1726 $ A( ( ICOL1-1 )*LDA+IROW1 ), 1727 $ LDA, MYROW, RIGHT ) 1728 IF( ( ISTART.EQ.ISTOP ) .AND. SKIP ) THEN 1729 CALL ZGERV2D( CONTXT, IROW2-IROW1+1, 2, 1730 $ A( ( ICOL1-1 )*LDA+IROW1 ), 1731 $ LDA, MYROW, RIGHT ) 1732 END IF 1733 ELSE IF( SKIP ) THEN 1734 T2 = T1*V2 1735 T3 = T1*V3 1736 CALL ZGERV2D( CONTXT, IROW2-IROW1+1, 2, 1737 $ WORK( ICBUF+1 ), IROW2-IROW1+1, 1738 $ MYROW, LEFT ) 1739 II = ICBUF - IROW1 + 1 1740 JJ = ICBUF + IROW2 - 2*IROW1 + 2 1741 DO 310 J = IROW1, IROW2 1742 SUM = T1*WORK( II+J ) + T2*WORK( JJ+J ) + 1743 $ T3*A( ( ICOL1-1 )*LDA+J ) 1744 WORK( II+J ) = WORK( II+J ) - SUM 1745 WORK( JJ+J ) = WORK( JJ+J ) - 1746 $ SUM*DCONJG( V2 ) 1747 A( ( ICOL1-1 )*LDA+J ) = A( ( ICOL1-1 )* 1748 $ LDA+J ) - SUM*DCONJG( V3 ) 1749 310 CONTINUE 1750 IF( ISTART.EQ.ISTOP ) THEN 1751 CALL ZGESD2D( CONTXT, IROW2-IROW1+1, 2, 1752 $ WORK( ICBUF+1 ), 1753 $ IROW2-IROW1+1, MYROW, LEFT ) 1754 END IF 1755 END IF 1756 END IF 1757 IF( ( MOD( K-1, HBL ).EQ.HBL-1 ) .AND. 1758 $ ( NPCOL.GT.1 ) ) THEN 1759 IF( ICOL1.EQ.ICOL2 ) THEN 1760 IF( ISTART.EQ.ISTOP ) THEN 1761 CALL ZGESD2D( CONTXT, IROW2-IROW1+1, 2, 1762 $ A( ( ICOL1-2 )*LDA+IROW1 ), 1763 $ LDA, MYROW, RIGHT ) 1764 END IF 1765 IF( SKIP ) THEN 1766 CALL ZGERV2D( CONTXT, IROW2-IROW1+1, 2, 1767 $ A( ( ICOL1-2 )*LDA+IROW1 ), 1768 $ LDA, MYROW, RIGHT ) 1769 END IF 1770 ELSE IF( SKIP ) THEN 1771 IF( ISTART.EQ.ISTOP ) THEN 1772 CALL ZGERV2D( CONTXT, IROW2-IROW1+1, 2, 1773 $ WORK( ICBUF+1 ), 1774 $ IROW2-IROW1+1, MYROW, LEFT ) 1775 END IF 1776 T2 = T1*V2 1777 T3 = T1*V3 1778 II = ICBUF + IROW2 - 2*IROW1 + 2 1779 DO 320 J = IROW1, IROW2 1780 SUM = T1*WORK( J+II ) + 1781 $ T2*A( ( ICOL1-1 )*LDA+J ) + 1782 $ T3*A( ICOL1*LDA+J ) 1783 WORK( J+II ) = WORK( J+II ) - SUM 1784 A( ( ICOL1-1 )*LDA+J ) = A( ( ICOL1-1 )* 1785 $ LDA+J ) - SUM*DCONJG( V2 ) 1786 A( ICOL1*LDA+J ) = A( ICOL1*LDA+J ) - 1787 $ SUM*DCONJG( V3 ) 1788 320 CONTINUE 1789 CALL ZGESD2D( CONTXT, IROW2-IROW1+1, 2, 1790 $ WORK( ICBUF+1 ), IROW2-IROW1+1, 1791 $ MYROW, LEFT ) 1792 END IF 1793 END IF 1794* 1795* If we want Z and we haven't already done any Z 1796* 1797 IF( ( WANTZ ) .AND. ( MOD( K-1, 1798 $ HBL ).GE.HBL-2 ) .AND. ( NPCOL.GT.1 ) ) THEN 1799* 1800* Accumulate transformations in the matrix Z 1801* 1802 IROW1 = LILOZ 1803 IROW2 = LIHIZ 1804 IF( MOD( K-1, HBL ).EQ.HBL-2 ) THEN 1805 IF( ICOL1.NE.ICOL2 ) THEN 1806 CALL ZGESD2D( CONTXT, IROW2-IROW1+1, 2, 1807 $ Z( ( ICOL1-1 )*LDZ+IROW1 ), 1808 $ LDZ, MYROW, RIGHT ) 1809 IF( ( ISTART.EQ.ISTOP ) .AND. SKIP ) THEN 1810 CALL ZGERV2D( CONTXT, IROW2-IROW1+1, 2, 1811 $ Z( ( ICOL1-1 )*LDZ+ 1812 $ IROW1 ), LDZ, MYROW, 1813 $ RIGHT ) 1814 END IF 1815 ELSE IF( SKIP ) THEN 1816 CALL ZGERV2D( CONTXT, IROW2-IROW1+1, 2, 1817 $ WORK( IZBUF+1 ), 1818 $ IROW2-IROW1+1, MYROW, LEFT ) 1819 T2 = T1*V2 1820 T3 = T1*V3 1821 ICOL1 = ( ICOL1-1 )*LDZ 1822 II = IZBUF - IROW1 + 1 1823 JJ = IZBUF + IROW2 - 2*IROW1 + 2 1824 DO 330 J = IROW1, IROW2 1825 SUM = T1*WORK( II+J ) + 1826 $ T2*WORK( JJ+J ) + T3*Z( ICOL1+J ) 1827 WORK( II+J ) = WORK( II+J ) - SUM 1828 WORK( JJ+J ) = WORK( JJ+J ) - 1829 $ SUM*DCONJG( V2 ) 1830 Z( ICOL1+J ) = Z( ICOL1+J ) - 1831 $ SUM*DCONJG( V3 ) 1832 330 CONTINUE 1833 IF( ISTART.EQ.ISTOP ) THEN 1834 CALL ZGESD2D( CONTXT, IROW2-IROW1+1, 2, 1835 $ WORK( IZBUF+1 ), 1836 $ IROW2-IROW1+1, MYROW, 1837 $ LEFT ) 1838 END IF 1839 END IF 1840 END IF 1841 IF( MOD( K-1, HBL ).EQ.HBL-1 ) THEN 1842 IF( ICOL1.EQ.ICOL2 ) THEN 1843 IF( ISTART.EQ.ISTOP ) THEN 1844 CALL ZGESD2D( CONTXT, IROW2-IROW1+1, 2, 1845 $ Z( ( ICOL1-2 )*LDZ+ 1846 $ IROW1 ), LDZ, MYROW, 1847 $ RIGHT ) 1848 END IF 1849 IF( SKIP ) THEN 1850 CALL ZGERV2D( CONTXT, IROW2-IROW1+1, 2, 1851 $ Z( ( ICOL1-2 )*LDZ+ 1852 $ IROW1 ), LDZ, MYROW, 1853 $ RIGHT ) 1854 END IF 1855 ELSE IF( SKIP ) THEN 1856 IF( ISTART.EQ.ISTOP ) THEN 1857 CALL ZGERV2D( CONTXT, IROW2-IROW1+1, 2, 1858 $ WORK( IZBUF+1 ), 1859 $ IROW2-IROW1+1, MYROW, 1860 $ LEFT ) 1861 END IF 1862 T2 = T1*V2 1863 T3 = T1*V3 1864 ICOL1 = ( ICOL1-1 )*LDZ 1865 II = IZBUF + IROW2 - 2*IROW1 + 2 1866 DO 340 J = IROW1, IROW2 1867 SUM = T1*WORK( II+J ) + 1868 $ T2*Z( J+ICOL1 ) + 1869 $ T3*Z( J+ICOL1+LDZ ) 1870 WORK( II+J ) = WORK( II+J ) - SUM 1871 Z( J+ICOL1 ) = Z( J+ICOL1 ) - 1872 $ SUM*DCONJG( V2 ) 1873 Z( J+ICOL1+LDZ ) = Z( J+ICOL1+LDZ ) - 1874 $ SUM*DCONJG( V3 ) 1875 340 CONTINUE 1876 CALL ZGESD2D( CONTXT, IROW2-IROW1+1, 2, 1877 $ WORK( IZBUF+1 ), 1878 $ IROW2-IROW1+1, MYROW, LEFT ) 1879 END IF 1880 END IF 1881 END IF 1882 END IF 1883 350 CONTINUE 1884 360 CONTINUE 1885* 1886 IF( SKIP ) 1887 $ GO TO 450 1888* 1889 DO 420 KI = 1, IBULGE 1890 IF( KCOL( KI ).GT.KP2COL( KI ) ) 1891 $ GO TO 420 1892 IF( ( MYCOL.NE.ICURCOL( KI ) ) .AND. 1893 $ ( RIGHT.NE.ICURCOL( KI ) ) )GO TO 420 1894 ISTART = MAX( K1( KI ), M ) 1895 ISTOP = MIN( K2( KI ), I-1 ) 1896 IF( MOD( ISTART-1, HBL ).GE.HBL-2 ) THEN 1897* 1898* INFO is found in a buffer 1899* 1900 ISPEC = 1 1901 ELSE 1902* 1903* All INFO is local 1904* 1905 ISPEC = 0 1906 END IF 1907 DO 410 K = ISTART, ISTOP 1908* 1909 V2 = WORK( VECSIDX+( K-1 )*3+1 ) 1910 V3 = WORK( VECSIDX+( K-1 )*3+2 ) 1911 T1 = WORK( VECSIDX+( K-1 )*3+3 ) 1912 NR = MIN( 3, I-K+1 ) 1913 IF( ( NR.EQ.3 ) .AND. ( KCOL( KI ).LE.KP2COL( KI ) ) ) 1914 $ THEN 1915* 1916 IF( ( K.LT.ISTOP ) .AND. 1917 $ ( MOD( K-1, HBL ).LT.HBL-2 ) ) THEN 1918 ITMP1 = MIN( ISTART+1, I ) - 1 1919 ELSE 1920 IF( MOD( K-1, HBL ).LT.HBL-2 ) THEN 1921 ITMP1 = MIN( K+3, I ) 1922 END IF 1923 IF( MOD( K-1, HBL ).EQ.HBL-2 ) THEN 1924 ITMP1 = MAX( I1, K-1 ) - 1 1925 END IF 1926 IF( MOD( K-1, HBL ).EQ.HBL-1 ) THEN 1927 ITMP1 = MAX( I1, K-2 ) - 1 1928 END IF 1929 END IF 1930 IF( MOD( K-1, HBL ).LT.HBL-2 ) THEN 1931 ICOL1 = KCOL( KI ) + K - ISTART 1932 ICOL2 = KP2COL( KI ) + K - ISTART 1933 ELSE 1934 ICOL1 = KCOL( KI ) 1935 ICOL2 = KP2COL( KI ) 1936 IF( K.GT.ISTART ) THEN 1937 IF( RIGHT.EQ.ICURCOL( KI ) ) THEN 1938 ICOL1 = ICOL1 + 1 1939 END IF 1940 IF( MYCOL.EQ.ICURCOL( KI ) ) THEN 1941 ICOL2 = ICOL2 + 1 1942 END IF 1943 END IF 1944 END IF 1945 CALL INFOG1L( I1, HBL, NPROW, MYROW, IAFIRST, 1946 $ IROW1, IROW2 ) 1947 IROW2 = NUMROC( ITMP1, HBL, MYROW, IAFIRST, NPROW ) 1948 IF( ( MOD( K-1, HBL ).EQ.HBL-2 ) .AND. 1949 $ ( NPCOL.GT.1 ) ) THEN 1950 IF( ICOL1.EQ.ICOL2 ) THEN 1951 CALL ZGERV2D( CONTXT, IROW2-IROW1+1, 2, 1952 $ WORK( ICBUF+1 ), IROW2-IROW1+1, 1953 $ MYROW, LEFT ) 1954 T2 = T1*V2 1955 T3 = T1*V3 1956 II = ICBUF - IROW1 + 1 1957 JJ = ICBUF + IROW2 - 2*IROW1 + 2 1958 DO 370 J = IROW1, IROW2 1959 SUM = T1*WORK( II+J ) + T2*WORK( JJ+J ) + 1960 $ T3*A( ( ICOL1-1 )*LDA+J ) 1961 WORK( II+J ) = WORK( II+J ) - SUM 1962 WORK( JJ+J ) = WORK( JJ+J ) - 1963 $ SUM*DCONJG( V2 ) 1964 A( ( ICOL1-1 )*LDA+J ) = A( ( ICOL1-1 )* 1965 $ LDA+J ) - SUM*DCONJG( V3 ) 1966 370 CONTINUE 1967 IF( ISTART.EQ.ISTOP ) THEN 1968 CALL ZGESD2D( CONTXT, IROW2-IROW1+1, 2, 1969 $ WORK( ICBUF+1 ), 1970 $ IROW2-IROW1+1, MYROW, LEFT ) 1971 END IF 1972 END IF 1973 END IF 1974 IF( ( MOD( K-1, HBL ).EQ.HBL-1 ) .AND. 1975 $ ( NPCOL.GT.1 ) ) THEN 1976 IF( ICOL1.NE.ICOL2 ) THEN 1977 IF( ISTART.EQ.ISTOP ) THEN 1978 CALL ZGERV2D( CONTXT, IROW2-IROW1+1, 2, 1979 $ WORK( ICBUF+1 ), 1980 $ IROW2-IROW1+1, MYROW, LEFT ) 1981 END IF 1982 T2 = T1*V2 1983 T3 = T1*V3 1984 II = ICBUF + IROW2 - 2*IROW1 + 2 1985 DO 380 J = IROW1, IROW2 1986 SUM = T1*WORK( J+II ) + 1987 $ T2*A( ( ICOL1-1 )*LDA+J ) + 1988 $ T3*A( ICOL1*LDA+J ) 1989 WORK( J+II ) = WORK( J+II ) - SUM 1990 A( ( ICOL1-1 )*LDA+J ) = A( ( ICOL1-1 )* 1991 $ LDA+J ) - SUM*DCONJG( V2 ) 1992 A( ICOL1*LDA+J ) = A( ICOL1*LDA+J ) - 1993 $ SUM*DCONJG( V3 ) 1994 380 CONTINUE 1995 CALL ZGESD2D( CONTXT, IROW2-IROW1+1, 2, 1996 $ WORK( ICBUF+1 ), IROW2-IROW1+1, 1997 $ MYROW, LEFT ) 1998 END IF 1999 END IF 2000* 2001* 2002* If we want Z and we haven't already done any Z 2003 IF( ( WANTZ ) .AND. ( MOD( K-1, 2004 $ HBL ).GE.HBL-2 ) .AND. ( NPCOL.GT.1 ) ) THEN 2005* 2006* Accumulate transformations in the matrix Z 2007* 2008 IROW1 = LILOZ 2009 IROW2 = LIHIZ 2010 IF( MOD( K-1, HBL ).EQ.HBL-2 ) THEN 2011 IF( ICOL1.EQ.ICOL2 ) THEN 2012 CALL ZGERV2D( CONTXT, IROW2-IROW1+1, 2, 2013 $ WORK( IZBUF+1 ), 2014 $ IROW2-IROW1+1, MYROW, LEFT ) 2015 T2 = T1*V2 2016 T3 = T1*V3 2017 ICOL1 = ( ICOL1-1 )*LDZ 2018 II = IZBUF - IROW1 + 1 2019 JJ = IZBUF + IROW2 - 2*IROW1 + 2 2020 DO 390 J = IROW1, IROW2 2021 SUM = T1*WORK( II+J ) + 2022 $ T2*WORK( JJ+J ) + T3*Z( ICOL1+J ) 2023 WORK( II+J ) = WORK( II+J ) - SUM 2024 WORK( JJ+J ) = WORK( JJ+J ) - 2025 $ SUM*DCONJG( V2 ) 2026 Z( ICOL1+J ) = Z( ICOL1+J ) - 2027 $ SUM*DCONJG( V3 ) 2028 390 CONTINUE 2029 IF( ISTART.EQ.ISTOP ) THEN 2030 CALL ZGESD2D( CONTXT, IROW2-IROW1+1, 2, 2031 $ WORK( IZBUF+1 ), 2032 $ IROW2-IROW1+1, MYROW, 2033 $ LEFT ) 2034 END IF 2035 END IF 2036 END IF 2037 IF( MOD( K-1, HBL ).EQ.HBL-1 ) THEN 2038 IF( ICOL1.NE.ICOL2 ) THEN 2039 IF( ISTART.EQ.ISTOP ) THEN 2040 CALL ZGERV2D( CONTXT, IROW2-IROW1+1, 2, 2041 $ WORK( IZBUF+1 ), 2042 $ IROW2-IROW1+1, MYROW, 2043 $ LEFT ) 2044 END IF 2045 T2 = T1*V2 2046 T3 = T1*V3 2047 ICOL1 = ( ICOL1-1 )*LDZ 2048 II = IZBUF + IROW2 - 2*IROW1 + 2 2049 DO 400 J = IROW1, IROW2 2050 SUM = T1*WORK( II+J ) + 2051 $ T2*Z( J+ICOL1 ) + 2052 $ T3*Z( J+ICOL1+LDZ ) 2053 WORK( II+J ) = WORK( II+J ) - SUM 2054 Z( J+ICOL1 ) = Z( J+ICOL1 ) - 2055 $ SUM*DCONJG( V2 ) 2056 Z( J+ICOL1+LDZ ) = Z( J+ICOL1+LDZ ) - 2057 $ SUM*DCONJG( V3 ) 2058 400 CONTINUE 2059 CALL ZGESD2D( CONTXT, IROW2-IROW1+1, 2, 2060 $ WORK( IZBUF+1 ), 2061 $ IROW2-IROW1+1, MYROW, LEFT ) 2062 END IF 2063 END IF 2064 END IF 2065 END IF 2066 410 CONTINUE 2067 420 CONTINUE 2068* 2069 DO 440 KI = 1, IBULGE 2070 IF( KCOL( KI ).GT.KP2COL( KI ) ) 2071 $ GO TO 440 2072 IF( ( MYCOL.NE.ICURCOL( KI ) ) .AND. 2073 $ ( RIGHT.NE.ICURCOL( KI ) ) )GO TO 440 2074 ISTART = MAX( K1( KI ), M ) 2075 ISTOP = MIN( K2( KI ), I-1 ) 2076 IF( MOD( ISTART-1, HBL ).GE.HBL-2 ) THEN 2077* 2078* INFO is found in a buffer 2079* 2080 ISPEC = 1 2081 ELSE 2082* 2083* All INFO is local 2084* 2085 ISPEC = 0 2086 END IF 2087 DO 430 K = ISTART, ISTOP 2088* 2089 V2 = WORK( VECSIDX+( K-1 )*3+1 ) 2090 V3 = WORK( VECSIDX+( K-1 )*3+2 ) 2091 T1 = WORK( VECSIDX+( K-1 )*3+3 ) 2092 NR = MIN( 3, I-K+1 ) 2093 IF( ( NR.EQ.3 ) .AND. ( KCOL( KI ).LE.KP2COL( KI ) ) ) 2094 $ THEN 2095* 2096 IF( ( K.LT.ISTOP ) .AND. 2097 $ ( MOD( K-1, HBL ).LT.HBL-2 ) ) THEN 2098 ITMP1 = MIN( ISTART+1, I ) - 1 2099 ELSE 2100 IF( MOD( K-1, HBL ).LT.HBL-2 ) THEN 2101 ITMP1 = MIN( K+3, I ) 2102 END IF 2103 IF( MOD( K-1, HBL ).EQ.HBL-2 ) THEN 2104 ITMP1 = MAX( I1, K-1 ) - 1 2105 END IF 2106 IF( MOD( K-1, HBL ).EQ.HBL-1 ) THEN 2107 ITMP1 = MAX( I1, K-2 ) - 1 2108 END IF 2109 END IF 2110 IF( MOD( K-1, HBL ).LT.HBL-2 ) THEN 2111 ICOL1 = KCOL( KI ) + K - ISTART 2112 ICOL2 = KP2COL( KI ) + K - ISTART 2113 ELSE 2114 ICOL1 = KCOL( KI ) 2115 ICOL2 = KP2COL( KI ) 2116 IF( K.GT.ISTART ) THEN 2117 IF( RIGHT.EQ.ICURCOL( KI ) ) THEN 2118 ICOL1 = ICOL1 + 1 2119 END IF 2120 IF( MYCOL.EQ.ICURCOL( KI ) ) THEN 2121 ICOL2 = ICOL2 + 1 2122 END IF 2123 END IF 2124 END IF 2125 CALL INFOG1L( I1, HBL, NPROW, MYROW, IAFIRST, 2126 $ IROW1, IROW2 ) 2127 IROW2 = NUMROC( ITMP1, HBL, MYROW, IAFIRST, NPROW ) 2128 IF( ( MOD( K-1, HBL ).EQ.HBL-2 ) .AND. 2129 $ ( NPCOL.GT.1 ) ) THEN 2130 IF( ICOL1.NE.ICOL2 ) THEN 2131 IF( ISTART.EQ.ISTOP ) THEN 2132 CALL ZGERV2D( CONTXT, IROW2-IROW1+1, 2, 2133 $ A( ( ICOL1-1 )*LDA+IROW1 ), 2134 $ LDA, MYROW, RIGHT ) 2135 END IF 2136 END IF 2137 END IF 2138 IF( ( MOD( K-1, HBL ).EQ.HBL-1 ) .AND. 2139 $ ( NPCOL.GT.1 ) ) THEN 2140 IF( ICOL1.EQ.ICOL2 ) THEN 2141 CALL ZGERV2D( CONTXT, IROW2-IROW1+1, 2, 2142 $ A( ( ICOL1-2 )*LDA+IROW1 ), 2143 $ LDA, MYROW, RIGHT ) 2144 END IF 2145 END IF 2146* 2147* If we want Z and we haven't already done any Z 2148* 2149 IF( ( WANTZ ) .AND. ( MOD( K-1, 2150 $ HBL ).GE.HBL-2 ) .AND. ( NPCOL.GT.1 ) ) THEN 2151* 2152* Accumulate transformations in the matrix Z 2153* 2154 IROW1 = LILOZ 2155 IROW2 = LIHIZ 2156 IF( MOD( K-1, HBL ).EQ.HBL-2 ) THEN 2157 IF( ICOL1.NE.ICOL2 ) THEN 2158 IF( ISTART.EQ.ISTOP ) THEN 2159 CALL ZGERV2D( CONTXT, IROW2-IROW1+1, 2, 2160 $ Z( ( ICOL1-1 )*LDZ+ 2161 $ IROW1 ), LDZ, MYROW, 2162 $ RIGHT ) 2163 END IF 2164 END IF 2165 END IF 2166 IF( MOD( K-1, HBL ).EQ.HBL-1 ) THEN 2167 IF( ICOL1.EQ.ICOL2 ) THEN 2168 CALL ZGERV2D( CONTXT, IROW2-IROW1+1, 2, 2169 $ Z( ( ICOL1-2 )*LDZ+IROW1 ), 2170 $ LDZ, MYROW, RIGHT ) 2171 END IF 2172 END IF 2173 END IF 2174 END IF 2175 430 CONTINUE 2176 440 CONTINUE 2177* 2178* Column work done 2179* 2180 450 CONTINUE 2181* 2182* Now do NR=2 work 2183* 2184 DO 530 KI = 1, IBULGE 2185 ISTART = MAX( K1( KI ), M ) 2186 ISTOP = MIN( K2( KI ), I-1 ) 2187 IF( MOD( ISTART-1, HBL ).GE.HBL-2 ) THEN 2188* 2189* INFO is found in a buffer 2190* 2191 ISPEC = 1 2192 ELSE 2193* 2194* All INFO is local 2195* 2196 ISPEC = 0 2197 END IF 2198* 2199 DO 520 K = ISTART, ISTOP 2200* 2201 V2 = WORK( VECSIDX+( K-1 )*3+1 ) 2202 V3 = WORK( VECSIDX+( K-1 )*3+2 ) 2203 T1 = WORK( VECSIDX+( K-1 )*3+3 ) 2204 NR = MIN( 3, I-K+1 ) 2205 IF( NR.EQ.2 ) THEN 2206 IF ( ICURROW( KI ).EQ.MYROW ) THEN 2207 T2 = T1*V2 2208 END IF 2209 IF ( ICURCOL( KI ).EQ.MYCOL ) THEN 2210 T2 = T1*V2 2211 END IF 2212* 2213* Apply G from the left to transform the rows of the matrix 2214* in columns K to I2. 2215* 2216 CALL INFOG1L( K, HBL, NPCOL, MYCOL, JAFIRST, LILOH, 2217 $ LIHIH ) 2218 LIHIH = LOCALI2 2219 CALL INFOG1L( 1, HBL, NPROW, MYROW, IAFIRST, ITMP2, 2220 $ ITMP1 ) 2221 ITMP1 = NUMROC( K+1, HBL, MYROW, IAFIRST, NPROW ) 2222 IF( ICURROW( KI ).EQ.MYROW ) THEN 2223 IF( ( ISPEC.EQ.0 ) .OR. ( NPROW.EQ.1 ) .OR. 2224 $ ( MOD( K-1, HBL ).EQ.HBL-2 ) ) THEN 2225 ITMP1 = ITMP1 - 1 2226 DO 460 J = ( LILOH-1 )*LDA, 2227 $ ( LIHIH-1 )*LDA, LDA 2228 SUM = DCONJG( T1 )*A( ITMP1+J ) + 2229 $ DCONJG( T2 )*A( ITMP1+1+J ) 2230 A( ITMP1+J ) = A( ITMP1+J ) - SUM 2231 A( ITMP1+1+J ) = A( ITMP1+1+J ) - SUM*V2 2232 460 CONTINUE 2233 ELSE 2234 IF( MOD( K-1, HBL ).EQ.HBL-1 ) THEN 2235 CALL ZGERV2D( CONTXT, 1, LIHIH-LILOH+1, 2236 $ WORK( IRBUF+1 ), 1, UP, 2237 $ MYCOL ) 2238 DO 470 J = LILOH, LIHIH 2239 SUM = DCONJG( T1 )* 2240 $ WORK( IRBUF+J-LILOH+1 ) + 2241 $ DCONJG( T2 )*A( ( J-1 )*LDA+ 2242 $ ITMP1 ) 2243 WORK( IRBUF+J-LILOH+1 ) = WORK( IRBUF+ 2244 $ J-LILOH+1 ) - SUM 2245 A( ( J-1 )*LDA+ITMP1 ) = A( ( J-1 )* 2246 $ LDA+ITMP1 ) - SUM*V2 2247 470 CONTINUE 2248 CALL ZGESD2D( CONTXT, 1, LIHIH-LILOH+1, 2249 $ WORK( IRBUF+1 ), 1, UP, 2250 $ MYCOL ) 2251 END IF 2252 END IF 2253 ELSE 2254 IF( ( MOD( K-1, HBL ).EQ.HBL-1 ) .AND. 2255 $ ( ICURROW( KI ).EQ.DOWN ) ) THEN 2256 CALL ZGESD2D( CONTXT, 1, LIHIH-LILOH+1, 2257 $ A( ( LILOH-1 )*LDA+ITMP1 ), 2258 $ LDA, DOWN, MYCOL ) 2259 CALL ZGERV2D( CONTXT, 1, LIHIH-LILOH+1, 2260 $ A( ( LILOH-1 )*LDA+ITMP1 ), 2261 $ LDA, DOWN, MYCOL ) 2262 END IF 2263 END IF 2264* 2265* Apply G from the right to transform the columns of the 2266* matrix in rows I1 to MIN(K+3,I). 2267* 2268 CALL INFOG1L( I1, HBL, NPROW, MYROW, IAFIRST, 2269 $ LILOH, LIHIH ) 2270 LIHIH = NUMROC( I, HBL, MYROW, IAFIRST, NPROW ) 2271* 2272 IF( ICURCOL( KI ).EQ.MYCOL ) THEN 2273* LOCAL A(LILOZ:LIHIZ,KCOL:KCOL+2) 2274 IF( ( ISPEC.EQ.0 ) .OR. ( NPCOL.EQ.1 ) .OR. 2275 $ ( MOD( K-1, HBL ).EQ.HBL-2 ) ) THEN 2276 CALL INFOG1L( K, HBL, NPCOL, MYCOL, JAFIRST, 2277 $ ITMP1, ITMP2 ) 2278 ITMP2 = NUMROC( K+1, HBL, MYCOL, JAFIRST, 2279 $ NPCOL ) 2280 DO 480 J = LILOH, LIHIH 2281 SUM = T1*A( ( ITMP1-1 )*LDA+J ) + 2282 $ T2*A( ITMP1*LDA+J ) 2283 A( ( ITMP1-1 )*LDA+J ) = A( ( ITMP1-1 )* 2284 $ LDA+J ) - SUM 2285 A( ITMP1*LDA+J ) = A( ITMP1*LDA+J ) - 2286 $ SUM*DCONJG( V2 ) 2287 480 CONTINUE 2288 ELSE 2289 ITMP1 = KCOL( KI ) 2290 IF( MOD( K-1, HBL ).EQ.HBL-1 ) THEN 2291 CALL ZGERV2D( CONTXT, LIHIH-LILOH+1, 1, 2292 $ WORK( ICBUF+1 ), 2293 $ LIHIH-LILOH+1, MYROW, LEFT ) 2294 DO 490 J = LILOH, LIHIH 2295 SUM = T1*WORK( ICBUF+J ) + 2296 $ T2*A( ( ITMP1-1 )*LDA+J ) 2297 WORK( ICBUF+J ) = WORK( ICBUF+J ) - SUM 2298 A( ( ITMP1-1 )*LDA+J ) 2299 $ = A( ( ITMP1-1 )*LDA+J ) - 2300 $ SUM*DCONJG( V2 ) 2301 490 CONTINUE 2302 CALL ZGESD2D( CONTXT, LIHIH-LILOH+1, 1, 2303 $ WORK( ICBUF+1 ), 2304 $ LIHIH-LILOH+1, MYROW, LEFT ) 2305 END IF 2306 END IF 2307 ELSE 2308 IF( ( MOD( K-1, HBL ).EQ.HBL-1 ) .AND. 2309 $ ( ICURCOL( KI ).EQ.RIGHT ) ) THEN 2310 ITMP1 = KCOL( KI ) 2311 CALL ZGESD2D( CONTXT, LIHIH-LILOH+1, 1, 2312 $ A( ( ITMP1-1 )*LDA+LILOH ), 2313 $ LDA, MYROW, RIGHT ) 2314 CALL INFOG1L( K, HBL, NPCOL, MYCOL, JAFIRST, 2315 $ ITMP1, ITMP2 ) 2316 ITMP2 = NUMROC( K+1, HBL, MYCOL, JAFIRST, 2317 $ NPCOL ) 2318 CALL ZGERV2D( CONTXT, LIHIH-LILOH+1, 1, 2319 $ A( ( ITMP1-1 )*LDA+LILOH ), 2320 $ LDA, MYROW, RIGHT ) 2321 END IF 2322 END IF 2323* 2324 IF( WANTZ ) THEN 2325* 2326* Accumulate transformations in the matrix Z 2327* 2328 IF( ICURCOL( KI ).EQ.MYCOL ) THEN 2329* LOCAL Z(LILOZ:LIHIZ,KCOL:KCOL+2) 2330 IF( ( ISPEC.EQ.0 ) .OR. ( NPCOL.EQ.1 ) .OR. 2331 $ ( MOD( K-1, HBL ).EQ.HBL-2 ) ) THEN 2332 ITMP1 = KCOL( KI ) + K - ISTART 2333 ITMP1 = ( ITMP1-1 )*LDZ 2334 DO 500 J = LILOZ, LIHIZ 2335 SUM = T1*Z( J+ITMP1 ) + 2336 $ T2*Z( J+ITMP1+LDZ ) 2337 Z( J+ITMP1 ) = Z( J+ITMP1 ) - SUM 2338 Z( J+ITMP1+LDZ ) = Z( J+ITMP1+LDZ ) - 2339 $ SUM*DCONJG( V2 ) 2340 500 CONTINUE 2341 ELSE 2342 ITMP1 = KCOL( KI ) 2343* IF WE ACTUALLY OWN COLUMN K 2344 IF( MOD( K-1, HBL ).EQ.HBL-1 ) THEN 2345 CALL ZGERV2D( CONTXT, LIHIZ-LILOZ+1, 1, 2346 $ WORK( IZBUF+1 ), LDZ, 2347 $ MYROW, LEFT ) 2348 ITMP1 = ( ITMP1-1 )*LDZ 2349 DO 510 J = LILOZ, LIHIZ 2350 SUM = T1*WORK( IZBUF+J ) + 2351 $ T2*Z( J+ITMP1 ) 2352 WORK( IZBUF+J ) = WORK( IZBUF+J ) - 2353 $ SUM 2354 Z( J+ITMP1 ) = Z( J+ITMP1 ) - 2355 $ SUM*DCONJG( V2 ) 2356 510 CONTINUE 2357 CALL ZGESD2D( CONTXT, LIHIZ-LILOZ+1, 1, 2358 $ WORK( IZBUF+1 ), LDZ, 2359 $ MYROW, LEFT ) 2360 END IF 2361 END IF 2362 ELSE 2363* 2364* NO WORK BUT NEED TO UPDATE ANYWAY???? 2365* 2366 IF( ( MOD( K-1, HBL ).EQ.HBL-1 ) .AND. 2367 $ ( ICURCOL( KI ).EQ.RIGHT ) ) THEN 2368 ITMP1 = KCOL( KI ) 2369 ITMP1 = ( ITMP1-1 )*LDZ 2370 CALL ZGESD2D( CONTXT, LIHIZ-LILOZ+1, 1, 2371 $ Z( LILOZ+ITMP1 ), LDZ, 2372 $ MYROW, RIGHT ) 2373 CALL ZGERV2D( CONTXT, LIHIZ-LILOZ+1, 1, 2374 $ Z( LILOZ+ITMP1 ), LDZ, 2375 $ MYROW, RIGHT ) 2376 END IF 2377 END IF 2378 END IF 2379 END IF 2380 520 CONTINUE 2381* 2382* Adjust local information for this bulge 2383* 2384 IF( NPROW.EQ.1 ) THEN 2385 KROW( KI ) = KROW( KI ) + K2( KI ) - K1( KI ) + 1 2386 KP2ROW( KI ) = KP2ROW( KI ) + K2( KI ) - K1( KI ) + 1 2387 END IF 2388 IF( ( MOD( K1( KI )-1, HBL ).LT.HBL-2 ) .AND. 2389 $ ( ICURROW( KI ).EQ.MYROW ) .AND. ( NPROW.GT.1 ) ) 2390 $ THEN 2391 KROW( KI ) = KROW( KI ) + K2( KI ) - K1( KI ) + 1 2392 END IF 2393 IF( ( MOD( K2( KI ), HBL ).LT.HBL-2 ) .AND. 2394 $ ( ICURROW( KI ).EQ.MYROW ) .AND. ( NPROW.GT.1 ) ) 2395 $ THEN 2396 KP2ROW( KI ) = KP2ROW( KI ) + K2( KI ) - K1( KI ) + 1 2397 END IF 2398 IF( ( MOD( K1( KI )-1, HBL ).GE.HBL-2 ) .AND. 2399 $ ( ( MYROW.EQ.ICURROW( KI ) ) .OR. ( DOWN.EQ. 2400 $ ICURROW( KI ) ) ) .AND. ( NPROW.GT.1 ) ) THEN 2401 CALL INFOG1L( K2( KI )+1, HBL, NPROW, MYROW, IAFIRST, 2402 $ KROW( KI ), ITMP2 ) 2403 END IF 2404 IF( ( MOD( K2( KI ), HBL ).GE.HBL-2 ) .AND. 2405 $ ( ( MYROW.EQ.ICURROW( KI ) ) .OR. ( UP.EQ. 2406 $ ICURROW( KI ) ) ) .AND. ( NPROW.GT.1 ) ) THEN 2407 KP2ROW( KI ) = NUMROC( K2( KI )+3, HBL, MYROW, 2408 $ IAFIRST, NPROW ) 2409 END IF 2410 IF( NPCOL.EQ.1 ) THEN 2411 KCOL( KI ) = KCOL( KI ) + K2( KI ) - K1( KI ) + 1 2412 KP2COL( KI ) = KP2COL( KI ) + K2( KI ) - K1( KI ) + 1 2413 END IF 2414 IF( ( MOD( K1( KI )-1, HBL ).LT.HBL-2 ) .AND. 2415 $ ( ICURCOL( KI ).EQ.MYCOL ) .AND. ( NPCOL.GT.1 ) ) 2416 $ THEN 2417 KCOL( KI ) = KCOL( KI ) + K2( KI ) - K1( KI ) + 1 2418 END IF 2419 IF( ( MOD( K2( KI ), HBL ).LT.HBL-2 ) .AND. 2420 $ ( ICURCOL( KI ).EQ.MYCOL ) .AND. ( NPCOL.GT.1 ) ) 2421 $ THEN 2422 KP2COL( KI ) = KP2COL( KI ) + K2( KI ) - K1( KI ) + 1 2423 END IF 2424 IF( ( MOD( K1( KI )-1, HBL ).GE.HBL-2 ) .AND. 2425 $ ( ( MYCOL.EQ.ICURCOL( KI ) ) .OR. ( RIGHT.EQ. 2426 $ ICURCOL( KI ) ) ) .AND. ( NPCOL.GT.1 ) ) THEN 2427 CALL INFOG1L( K2( KI )+1, HBL, NPCOL, MYCOL, JAFIRST, 2428 $ KCOL( KI ), ITMP2 ) 2429 END IF 2430 IF( ( MOD( K2( KI ), HBL ).GE.HBL-2 ) .AND. 2431 $ ( ( MYCOL.EQ.ICURCOL( KI ) ) .OR. ( LEFT.EQ. 2432 $ ICURCOL( KI ) ) ) .AND. ( NPCOL.GT.1 ) ) THEN 2433 KP2COL( KI ) = NUMROC( K2( KI )+3, HBL, MYCOL, 2434 $ JAFIRST, NPCOL ) 2435 END IF 2436 K1( KI ) = K2( KI ) + 1 2437 ISTOP = MIN( K1( KI )+ROTN-MOD( K1( KI ), ROTN ), I-2 ) 2438 ISTOP = MIN( ISTOP, K1( KI )+HBL-3- 2439 $ MOD( K1( KI )-1, HBL ) ) 2440 ISTOP = MIN( ISTOP, I2-2 ) 2441 ISTOP = MAX( ISTOP, K1( KI ) ) 2442 IF( ( MOD( K1( KI )-1, HBL ).EQ.HBL-2 ) .AND. 2443 $ ( ISTOP.LT.MIN( I-2, I2-2 ) ) ) THEN 2444 ISTOP = ISTOP + 1 2445 END IF 2446 K2( KI ) = ISTOP 2447 IF( K1( KI ).LE.ISTOP ) THEN 2448 IF( ( MOD( K1( KI )-1, HBL ).EQ.HBL-2 ) .AND. 2449 $ ( I-K1( KI ).GT.1 ) ) THEN 2450* 2451* Next step switches rows & cols 2452* 2453 ICURROW( KI ) = MOD( ICURROW( KI )+1, NPROW ) 2454 ICURCOL( KI ) = MOD( ICURCOL( KI )+1, NPCOL ) 2455 END IF 2456 END IF 2457 530 CONTINUE 2458* 2459 IF( K2( IBULGE ).LE.I-1 ) 2460 $ GO TO 40 2461 END IF 2462* 2463 540 CONTINUE 2464* 2465* Failure to converge in remaining number of iterations 2466* 2467 INFO = I 2468 RETURN 2469* 2470 550 CONTINUE 2471* 2472 IF( L.EQ.I ) THEN 2473* 2474* H(I,I-1) is negligible: one eigenvalue has converged. 2475* 2476 CALL INFOG2L( I, I, DESCA, NPROW, NPCOL, MYROW, MYCOL, IROW, 2477 $ ICOL, ITMP1, ITMP2 ) 2478 IF( ( MYROW.EQ.ITMP1 ) .AND. ( MYCOL.EQ.ITMP2 ) ) THEN 2479 W( I ) = A( ( ICOL-1 )*LDA+IROW ) 2480 ELSE 2481 W( I ) = ZERO 2482 END IF 2483 ELSE IF( L.EQ.I-1 ) THEN 2484* 2485* H(I-1,I-2) is negligible: a pair of eigenvalues have converged. 2486* 2487 CALL PZLACP3( 2, I-1, A, DESCA, S1, 2*IBLK, -1, -1, 0 ) 2488 CALL ZLANV2( S1( 1, 1 ), S1( 1, 2 ), S1( 2, 1 ), S1( 2, 2 ), 2489 $ W( I-1 ), W( I ), CS, SN ) 2490 CALL PZLACP3( 2, I-1, A, DESCA, S1, 2*IBLK, 0, 0, 1 ) 2491* 2492 IF( NODE.NE.0 ) THEN 2493* Erase the eigenvalues other eigenvalues 2494 W( I-1 ) = ZERO 2495 W( I ) = ZERO 2496 END IF 2497* 2498 IF( WANTT ) THEN 2499* 2500* Apply the transformation to A. 2501* 2502 IF( I2.GT.I ) THEN 2503 CALL PZROT( I2-I, A, I-1, I+1, DESCA, N, A, I, I+1, 2504 $ DESCA, N, CS, SN ) 2505 END IF 2506 CALL PZROT( I-I1-1, A, I1, I-1, DESCA, 1, A, I1, I, DESCA, 2507 $ 1, CS, DCONJG( SN ) ) 2508 END IF 2509 IF( WANTZ ) THEN 2510* 2511* Apply the transformation to Z. 2512* 2513 CALL PZROT( NZ, Z, ILOZ, I-1, DESCZ, 1, Z, ILOZ, I, DESCZ, 2514 $ 1, CS, DCONJG( SN ) ) 2515 END IF 2516* 2517 ELSE 2518* 2519* Find the eigenvalues in H(L:I,L:I), L < I-1 2520* 2521 JBLK = I - L + 1 2522 IF( JBLK.LE.2*IBLK ) THEN 2523 CALL PZLACP3( I-L+1, L, A, DESCA, S1, 2*IBLK, 0, 0, 0 ) 2524 CALL ZLAHQR2( .FALSE., .FALSE., JBLK, 1, JBLK, S1, 2*IBLK, 2525 $ W( L ), 1, JBLK, Z, LDZ, IERR ) 2526 IF( NODE.NE.0 ) THEN 2527* 2528* Erase the eigenvalues 2529* 2530 DO 560 K = L, I 2531 W( K ) = ZERO 2532 560 CONTINUE 2533 END IF 2534 END IF 2535 END IF 2536* 2537* Decrement number of remaining iterations, and return to start of 2538* the main loop with new value of I. 2539* 2540 ITN = ITN - ITS 2541 I = L - 1 2542 GO TO 10 2543* 2544 570 CONTINUE 2545 CALL ZGSUM2D( CONTXT, 'All', ' ', N, 1, W, N, -1, -1 ) 2546 RETURN 2547* 2548* END OF PZLAHQR 2549* 2550 END 2551