1 SUBROUTINE PZHEEVR( JOBZ, RANGE, UPLO, N, A, IA, JA, 2 $ DESCA, VL, VU, IL, IU, M, NZ, W, Z, IZ, 3 $ JZ, DESCZ, 4 $ WORK, LWORK, RWORK, LRWORK, IWORK, LIWORK, 5 $ INFO ) 6 7 IMPLICIT NONE 8* 9* -- ScaLAPACK routine (version 2.0.2) -- 10* Univ. of Tennessee, Univ. of California Berkeley, Univ. of Colorado Denver 11* May 1 2012 12* 13* .. Scalar Arguments .. 14 CHARACTER JOBZ, RANGE, UPLO 15 16 INTEGER IA, IL, INFO, IU, IZ, JA, JZ, LIWORK, LRWORK, 17 $ LWORK, M, N, NZ 18 DOUBLE PRECISION VL, VU 19* .. 20* .. Array Arguments .. 21 INTEGER DESCA( * ), DESCZ( * ), IWORK( * ) 22 DOUBLE PRECISION W( * ), RWORK( * ) 23 COMPLEX*16 A( * ), WORK( * ), Z( * ) 24* .. 25* 26* Purpose 27* ======= 28* 29* PZHEEVR computes selected eigenvalues and, optionally, eigenvectors 30* of a complex Hermitian matrix A distributed in 2D blockcyclic format 31* by calling the recommended sequence of ScaLAPACK routines. 32* 33* First, the matrix A is reduced to real symmetric tridiagonal form. 34* Then, the eigenproblem is solved using the parallel MRRR algorithm. 35* Last, if eigenvectors have been computed, a backtransformation is done. 36* 37* Upon successful completion, each processor stores a copy of all computed 38* eigenvalues in W. The eigenvector matrix Z is stored in 39* 2D blockcyclic format distributed over all processors. 40* 41* For constructive feedback and comments, please contact cvoemel@lbl.gov 42* C. Voemel 43* 44* 45* Arguments 46* ========= 47* 48* JOBZ (global input) CHARACTER*1 49* Specifies whether or not to compute the eigenvectors: 50* = 'N': Compute eigenvalues only. 51* = 'V': Compute eigenvalues and eigenvectors. 52* 53* RANGE (global input) CHARACTER*1 54* = 'A': all eigenvalues will be found. 55* = 'V': all eigenvalues in the interval [VL,VU] will be found. 56* = 'I': the IL-th through IU-th eigenvalues will be found. 57* 58* UPLO (global input) CHARACTER*1 59* Specifies whether the upper or lower triangular part of the 60* symmetric matrix A is stored: 61* = 'U': Upper triangular 62* = 'L': Lower triangular 63* 64* N (global input) INTEGER 65* The number of rows and columns of the matrix A. N >= 0 66* 67* A (local input/workspace) 2D block cyclic COMPLEX*16 array, 68* global dimension (N, N), 69* local dimension ( LLD_A, LOCc(JA+N-1) ) 70* (see Notes below for more detailed explanation of 2d arrays) 71* 72* On entry, the symmetric matrix A. If UPLO = 'U', only the 73* upper triangular part of A is used to define the elements of 74* the symmetric matrix. If UPLO = 'L', only the lower 75* triangular part of A is used to define the elements of the 76* symmetric matrix. 77* 78* On exit, the lower triangle (if UPLO='L') or the upper 79* triangle (if UPLO='U') of A, including the diagonal, is 80* destroyed. 81* 82* IA (global input) INTEGER 83* A's global row index, which points to the beginning of the 84* submatrix which is to be operated on. 85* It should be set to 1 when operating on a full matrix. 86* 87* JA (global input) INTEGER 88* A's global column index, which points to the beginning of 89* the submatrix which is to be operated on. 90* It should be set to 1 when operating on a full matrix. 91* 92* DESCA (global and local input) INTEGER array of dimension DLEN_. 93* (The ScaLAPACK descriptor length is DLEN_ = 9.) 94* The array descriptor for the distributed matrix A. 95* The descriptor stores details about the 2D block-cyclic 96* storage, see the notes below. 97* If DESCA is incorrect, PZHEEVR cannot work correctly. 98* Also note the array alignment requirements specified below 99* 100* VL (global input) DOUBLE PRECISION 101* If RANGE='V', the lower bound of the interval to be searched 102* for eigenvalues. Not referenced if RANGE = 'A' or 'I'. 103* 104* VU (global input) DOUBLE PRECISION 105* If RANGE='V', the upper bound of the interval to be searched 106* for eigenvalues. Not referenced if RANGE = 'A' or 'I'. 107* 108* IL (global input) INTEGER 109* If RANGE='I', the index (from smallest to largest) of the 110* smallest eigenvalue to be returned. IL >= 1. 111* Not referenced if RANGE = 'A'. 112* 113* IU (global input) INTEGER 114* If RANGE='I', the index (from smallest to largest) of the 115* largest eigenvalue to be returned. min(IL,N) <= IU <= N. 116* Not referenced if RANGE = 'A'. 117* 118* M (global output) INTEGER 119* Total number of eigenvalues found. 0 <= M <= N. 120* 121* NZ (global output) INTEGER 122* Total number of eigenvectors computed. 0 <= NZ <= M. 123* The number of columns of Z that are filled. 124* If JOBZ .NE. 'V', NZ is not referenced. 125* If JOBZ .EQ. 'V', NZ = M 126* 127* W (global output) DOUBLE PRECISION array, dimension (N) 128* On normal exit, the first M entries contain the selected 129* eigenvalues in ascending order. 130* 131* Z (local output) COMPLEX*16 array, 132* global dimension (N, N), 133* local dimension ( LLD_Z, LOCc(JZ+N-1) ) 134* If JOBZ = 'V', then on normal exit the first M columns of Z 135* contain the orthonormal eigenvectors of the matrix 136* corresponding to the selected eigenvalues. 137* If JOBZ = 'N', then Z is not referenced. 138* 139* IZ (global input) INTEGER 140* Z's global row index, which points to the beginning of the 141* submatrix which is to be operated on. 142* It should be set to 1 when operating on a full matrix. 143* 144* JZ (global input) INTEGER 145* Z's global column index, which points to the beginning of 146* the submatrix which is to be operated on. 147* It should be set to 1 when operating on a full matrix. 148* 149* DESCZ (global and local input) INTEGER array of dimension DLEN_. 150* The array descriptor for the distributed matrix Z. 151* DESCZ( CTXT_ ) must equal DESCA( CTXT_ ) 152* 153* WORK (local workspace/output) COMPLEX*16 array, 154* dimension (LWORK) 155* WORK(1) returns workspace adequate workspace to allow 156* optimal performance. 157* 158* LWORK (local input) INTEGER 159* Size of WORK array, must be at least 3. 160* If only eigenvalues are requested: 161* LWORK >= N + MAX( NB * ( NP00 + 1 ), NB * 3 ) 162* If eigenvectors are requested: 163* LWORK >= N + ( NP00 + MQ00 + NB ) * NB 164* For definitions of NP00 & MQ00, see LRWORK. 165* 166* For optimal performance, greater workspace is needed, i.e. 167* LWORK >= MAX( LWORK, NHETRD_LWORK ) 168* Where LWORK is as defined above, and 169* NHETRD_LWORK = N + 2*( ANB+1 )*( 4*NPS+2 ) + 170* ( NPS + 1 ) * NPS 171* 172* ICTXT = DESCA( CTXT_ ) 173* ANB = PJLAENV( ICTXT, 3, 'PZHETTRD', 'L', 0, 0, 0, 0 ) 174* SQNPC = SQRT( DBLE( NPROW * NPCOL ) ) 175* NPS = MAX( NUMROC( N, 1, 0, 0, SQNPC ), 2*ANB ) 176* 177* If LWORK = -1, then LWORK is global input and a workspace 178* query is assumed; the routine only calculates the 179* optimal size for all work arrays. Each of these 180* values is returned in the first entry of the corresponding 181* work array, and no error message is issued by PXERBLA. 182* NOTE THAT FOR OPTIMAL PERFORMANCE, LWOPT IS RETURNED 183* (THE OPTIMUM WORKSPACE) RATHER THAN THE MINIMUM NECESSARY 184* WORKSPACE LWMIN WHEN A WORKSPACE QUERY IS ISSUED. 185* FOR VERY SMALL MATRICES, LWOPT >> LWMIN. 186* 187* RWORK (local workspace/output) DOUBLE PRECISION array, 188* dimension (LRWORK) 189* On return, RWORK(1) contains the optimal amount of 190* workspace required for efficient execution. 191* if JOBZ='N' RWORK(1) = optimal amount of workspace 192* required to compute the eigenvalues. 193* if JOBZ='V' RWORK(1) = optimal amount of workspace 194* required to compute eigenvalues and eigenvectors. 195* 196* LRWORK (local input) INTEGER 197* Size of RWORK, must be at least 3. 198* See below for definitions of variables used to define LRWORK. 199* If no eigenvectors are requested (JOBZ = 'N') then 200* LRWORK >= 2 + 5 * N + MAX( 12 * N, NB * ( NP00 + 1 ) ) 201* If eigenvectors are requested (JOBZ = 'V' ) then 202* the amount of workspace required is: 203* LRWORK >= 2 + 5 * N + MAX( 18*N, NP00 * MQ00 + 2 * NB * NB ) + 204* (2 + ICEIL( NEIG, NPROW*NPCOL))*N 205* 206* Variable definitions: 207* NEIG = number of eigenvectors requested 208* NB = DESCA( MB_ ) = DESCA( NB_ ) = 209* DESCZ( MB_ ) = DESCZ( NB_ ) 210* NN = MAX( N, NB, 2 ) 211* DESCA( RSRC_ ) = DESCA( NB_ ) = DESCZ( RSRC_ ) = 212* DESCZ( CSRC_ ) = 0 213* NP00 = NUMROC( NN, NB, 0, 0, NPROW ) 214* MQ00 = NUMROC( MAX( NEIG, NB, 2 ), NB, 0, 0, NPCOL ) 215* ICEIL( X, Y ) is a ScaLAPACK function returning 216* ceiling(X/Y) 217* 218* If LRWORK = -1, then LRWORK is global input and a workspace 219* query is assumed; the routine only calculates the size 220* required for optimal performance for all work arrays. Each of 221* these values is returned in the first entry of the 222* corresponding work arrays, and no error message is issued by 223* PXERBLA. 224* 225* IWORK (local workspace) INTEGER array 226* On return, IWORK(1) contains the amount of integer workspace 227* required. 228* 229* LIWORK (local input) INTEGER 230* size of IWORK 231* 232* Let NNP = MAX( N, NPROW*NPCOL + 1, 4 ). Then: 233* LIWORK >= 12*NNP + 2*N when the eigenvectors are desired 234* LIWORK >= 10*NNP + 2*N when only the eigenvalues have to be computed 235* 236* If LIWORK = -1, then LIWORK is global input and a workspace 237* query is assumed; the routine only calculates the minimum 238* and optimal size for all work arrays. Each of these 239* values is returned in the first entry of the corresponding 240* work array, and no error message is issued by PXERBLA. 241* 242* INFO (global output) INTEGER 243* = 0: successful exit 244* < 0: If the i-th argument is an array and the j-entry had 245* an illegal value, then INFO = -(i*100+j), if the i-th 246* argument is a scalar and had an illegal value, then 247* INFO = -i. 248* 249* Notes 250* ===== 251* 252* Each global data object is described by an associated description 253* vector. This vector stores the information required to establish 254* the mapping between an object element and its corresponding process 255* and memory location. 256* 257* Let A be a generic term for any 2D block cyclicly distributed array. 258* Such a global array has an associated description vector DESCA. 259* In the following comments, the character _ should be read as 260* "of the global array". 261* 262* NOTATION STORED IN EXPLANATION 263* --------------- -------------- -------------------------------------- 264* DTYPE_A(global) DESCA( DTYPE_ )The descriptor type. In this case, 265* DTYPE_A = 1. 266* CTXT_A (global) DESCA( CTXT_ ) The BLACS context handle, indicating 267* the BLACS process grid A is distribu- 268* ted over. The context itself is glo- 269* bal, but the handle (the integer 270* value) may vary. 271* M_A (global) DESCA( M_ ) The number of rows in the global 272* array A. 273* N_A (global) DESCA( N_ ) The number of columns in the global 274* array A. 275* MB_A (global) DESCA( MB_ ) The blocking factor used to distribute 276* the rows of the array. 277* NB_A (global) DESCA( NB_ ) The blocking factor used to distribute 278* the columns of the array. 279* RSRC_A (global) DESCA( RSRC_ ) The process row over which the first 280* row of the array A is distributed. 281* CSRC_A (global) DESCA( CSRC_ ) The process column over which the 282* first column of the array A is 283* distributed. 284* LLD_A (local) DESCA( LLD_ ) The leading dimension of the local 285* array. LLD_A >= MAX(1,LOCr(M_A)). 286* 287* Let K be the number of rows or columns of a distributed matrix, 288* and assume that its process grid has dimension p x q. 289* LOCr( K ) denotes the number of elements of K that a process 290* would receive if K were distributed over the p processes of its 291* process column. 292* Similarly, LOCc( K ) denotes the number of elements of K that a 293* process would receive if K were distributed over the q processes of 294* its process row. 295* The values of LOCr() and LOCc() may be determined via a call to the 296* ScaLAPACK tool function, NUMROC: 297* LOCr( M ) = NUMROC( M, MB_A, MYROW, RSRC_A, NPROW ), 298* LOCc( N ) = NUMROC( N, NB_A, MYCOL, CSRC_A, NPCOL ). 299* An upper bound for these quantities may be computed by: 300* LOCr( M ) <= ceil( ceil(M/MB_A)/NPROW )*MB_A 301* LOCc( N ) <= ceil( ceil(N/NB_A)/NPCOL )*NB_A 302* 303* PZHEEVR assumes IEEE 754 standard compliant arithmetic. 304* 305* Alignment requirements 306* ====================== 307* 308* The distributed submatrices A(IA:*, JA:*) and Z(IZ:IZ+M-1,JZ:JZ+N-1) 309* must satisfy the following alignment properties: 310* 311* 1.Identical (quadratic) dimension: 312* DESCA(M_) = DESCZ(M_) = DESCA(N_) = DESCZ(N_) 313* 2.Quadratic conformal blocking: 314* DESCA(MB_) = DESCA(NB_) = DESCZ(MB_) = DESCZ(NB_) 315* DESCA(RSRC_) = DESCZ(RSRC_) 316* 3.MOD( IA-1, MB_A ) = MOD( IZ-1, MB_Z ) = 0 317* 4.IAROW = IZROW 318* 319* 320* .. Parameters .. 321 INTEGER CTXT_, M_, N_, 322 $ MB_, NB_, RSRC_, CSRC_ 323 PARAMETER ( CTXT_ = 2, M_ = 3, N_ = 4, MB_ = 5, NB_ = 6, 324 $ RSRC_ = 7, CSRC_ = 8 ) 325 DOUBLE PRECISION ZERO 326 PARAMETER ( ZERO = 0.0D0 ) 327* .. 328* .. Local Scalars .. 329 LOGICAL ALLEIG, COLBRT, DOBCST, FINISH, FIRST, INDEIG, 330 $ LOWER, LQUERY, VALEIG, VSTART, WANTZ 331 INTEGER ANB, DOL, DOU, DSTCOL, DSTROW, EIGCNT, FRSTCL, 332 $ I, IAROW, ICTXT, IIL, IINDERR, IINDWLC, IINFO, 333 $ IIU, IM, INDD, INDD2, INDE, INDE2, INDERR, 334 $ INDILU, INDRTAU, INDRW, INDRWORK, INDTAU, 335 $ INDWLC, INDWORK, IPIL, IPIU, IPROC, IZROW, 336 $ LASTCL, LENGTHI, LENGTHI2, LIWMIN, LLRWORK, 337 $ LLWORK, LRWMIN, LRWOPT, LWMIN, LWOPT, MAXCLS, 338 $ MQ00, MYCOL, MYIL, MYIU, MYPROC, MYROW, MZ, NB, 339 $ NDEPTH, NEEDIL, NEEDIU, NHETRD_LWOPT, NNP, 340 $ NP00, NPCOL, NPROCS, NPROW, NPS, NSPLIT, 341 $ OFFSET, PARITY, RLENGTHI, RLENGTHI2, RSTARTI, 342 $ SIZE1, SIZE2, SQNPC, SRCCOL, SRCROW, STARTI, 343 $ ZOFFSET 344 345 DOUBLE PRECISION PIVMIN, SAFMIN, SCALE, VLL, VUU, WL, 346 $ WU 347* 348* .. Local Arrays .. 349 INTEGER IDUM1( 4 ), IDUM2( 4 ) 350* .. 351* .. External Functions .. 352 LOGICAL LSAME 353 INTEGER ICEIL, INDXG2P, NUMROC, PJLAENV 354 DOUBLE PRECISION PDLAMCH 355 EXTERNAL ICEIL, INDXG2P, LSAME, NUMROC, PDLAMCH, 356 $ PJLAENV 357* .. 358* .. External Subroutines .. 359 EXTERNAL BLACS_GRIDINFO, CHK1MAT, DCOPY, DGEBR2D, 360 $ DGEBS2D, DGERV2D, DGESD2D, DLARRC, DLASRT2, 361 $ DSTEGR2A, DSTEGR2B, DSTEGR2, IGEBR2D, 362 $ IGEBS2D, IGERV2D, IGESD2D, IGSUM2D, PCHK1MAT, 363 $ PCHK2MAT, PDLARED1D, PXERBLA, PZELGET, 364 $ PZHENTRD, PZLAEVSWP, PZUNMTR 365* .. 366* .. Intrinsic Functions .. 367 INTRINSIC ABS, DBLE, DCMPLX, ICHAR, INT, MAX, MIN, MOD, 368 $ SQRT 369* .. 370* .. Executable Statements .. 371* 372 373 INFO = 0 374 375*********************************************************************** 376* 377* Decode character arguments to find out what the code should do 378* 379*********************************************************************** 380 WANTZ = LSAME( JOBZ, 'V' ) 381 LOWER = LSAME( UPLO, 'L' ) 382 ALLEIG = LSAME( RANGE, 'A' ) 383 VALEIG = LSAME( RANGE, 'V' ) 384 INDEIG = LSAME( RANGE, 'I' ) 385 LQUERY = ( LWORK.EQ.-1 .OR. LRWORK.EQ.-1 .OR. LIWORK.EQ.-1 ) 386 387*********************************************************************** 388* 389* GET MACHINE PARAMETERS 390* 391*********************************************************************** 392 ICTXT = DESCA( CTXT_ ) 393 SAFMIN = PDLAMCH( ICTXT, 'Safe minimum' ) 394 395*********************************************************************** 396* 397* Set up pointers into the (complex) WORK array 398* 399*********************************************************************** 400 INDTAU = 1 401 INDWORK = INDTAU + N 402 LLWORK = LWORK - INDWORK + 1 403 404*********************************************************************** 405* 406* Set up pointers into the RWORK array 407* 408*********************************************************************** 409 INDRTAU = 1 410 INDD = INDRTAU + N 411 INDE = INDD + N + 1 412 INDD2 = INDE + N + 1 413 INDE2 = INDD2 + N 414 INDRWORK = INDE2 + N 415 LLRWORK = LRWORK - INDRWORK + 1 416 417*********************************************************************** 418* 419* BLACS PROCESSOR GRID SETUP 420* 421*********************************************************************** 422 CALL BLACS_GRIDINFO( ICTXT, NPROW, NPCOL, MYROW, MYCOL ) 423 424 425 NPROCS = NPROW * NPCOL 426 MYPROC = MYROW * NPCOL + MYCOL 427 IF( NPROW.EQ.-1 ) THEN 428 INFO = -( 800+CTXT_ ) 429 ELSE IF( WANTZ ) THEN 430 IF( ICTXT.NE.DESCZ( CTXT_ ) ) THEN 431 INFO = -( 2100+CTXT_ ) 432 END IF 433 END IF 434 435*********************************************************************** 436* 437* COMPUTE REAL WORKSPACE 438* 439*********************************************************************** 440 IF ( ALLEIG ) THEN 441 MZ = N 442 ELSE IF ( INDEIG ) THEN 443 MZ = IU - IL + 1 444 ELSE 445* Take upper bound for VALEIG case 446 MZ = N 447 END IF 448* 449 NB = DESCA( NB_ ) 450 NP00 = NUMROC( N, NB, 0, 0, NPROW ) 451 MQ00 = NUMROC( MZ, NB, 0, 0, NPCOL ) 452 IF ( WANTZ ) THEN 453 INDRW = INDRWORK + MAX(18*N, NP00*MQ00 + 2*NB*NB) 454 LRWMIN = INDRW - 1 + (ICEIL(MZ, NPROCS) + 2)*N 455 LWMIN = N + MAX((NP00 + MQ00 + NB) * NB, 3 * NB) 456 ELSE 457 INDRW = INDRWORK + 12*N 458 LRWMIN = INDRW - 1 459 LWMIN = N + MAX( NB*( NP00 + 1 ), 3 * NB ) 460 END IF 461 462* The code that validates the input requires 3 workspace entries 463 LRWMIN = MAX(3, LRWMIN) 464 LRWOPT = LRWMIN 465 LWMIN = MAX(3, LWMIN) 466 LWOPT = LWMIN 467* 468 ANB = PJLAENV( ICTXT, 3, 'PZHETTRD', 'L', 0, 0, 0, 0 ) 469 SQNPC = INT( SQRT( DBLE( NPROCS ) ) ) 470 NPS = MAX( NUMROC( N, 1, 0, 0, SQNPC ), 2*ANB ) 471 NHETRD_LWOPT = 2*( ANB+1 )*( 4*NPS+2 ) + ( NPS+4 )*NPS 472 LWOPT = MAX( LWOPT, N+NHETRD_LWOPT ) 473* 474 SIZE1 = INDRW - INDRWORK 475 476*********************************************************************** 477* 478* COMPUTE INTEGER WORKSPACE 479* 480*********************************************************************** 481 NNP = MAX( N, NPROCS+1, 4 ) 482 IF ( WANTZ ) THEN 483 LIWMIN = 12*NNP + 2*N 484 ELSE 485 LIWMIN = 10*NNP + 2*N 486 END IF 487 488*********************************************************************** 489* 490* Set up pointers into the IWORK array 491* 492*********************************************************************** 493* Pointer to eigenpair distribution over processors 494 INDILU = LIWMIN - 2*NPROCS + 1 495 SIZE2 = INDILU - 2*N 496 497 498*********************************************************************** 499* 500* Test the input arguments. 501* 502*********************************************************************** 503 IF( INFO.EQ.0 ) THEN 504 CALL CHK1MAT( N, 4, N, 4, IA, JA, DESCA, 8, INFO ) 505 IF( WANTZ ) 506 $ CALL CHK1MAT( N, 4, N, 4, IZ, JZ, DESCZ, 21, INFO ) 507* 508 IF( INFO.EQ.0 ) THEN 509 IF( .NOT.( WANTZ .OR. LSAME( JOBZ, 'N' ) ) ) THEN 510 INFO = -1 511 ELSE IF( .NOT.( ALLEIG .OR. VALEIG .OR. INDEIG ) ) THEN 512 INFO = -2 513 ELSE IF( .NOT.( LOWER .OR. LSAME( UPLO, 'U' ) ) ) THEN 514 INFO = -3 515 ELSE IF( MOD( IA-1, DESCA( MB_ ) ).NE.0 ) THEN 516 INFO = -6 517 ELSE IF( VALEIG .AND. N.GT.0 .AND. VU.LE.VL ) THEN 518 INFO = -10 519 ELSE IF( INDEIG .AND. ( IL.LT.1 .OR. IL.GT.MAX( 1, N ) ) ) 520 $ THEN 521 INFO = -11 522 ELSE IF( INDEIG .AND. ( IU.LT.MIN( N, IL ) .OR. IU.GT.N )) 523 $ THEN 524 INFO = -12 525 ELSE IF( LWORK.LT.LWMIN .AND. .NOT.LQUERY ) THEN 526 INFO = -21 527 ELSE IF( LRWORK.LT.LRWMIN .AND. .NOT.LQUERY ) THEN 528 INFO = -23 529 ELSE IF( LIWORK.LT.LIWMIN .AND. .NOT.LQUERY ) THEN 530 INFO = -25 531 ELSE IF( DESCA( MB_ ).NE.DESCA( NB_ ) ) THEN 532 INFO = -( 800+NB_ ) 533 END IF 534 IF( WANTZ ) THEN 535 IAROW = INDXG2P( 1, DESCA( NB_ ), MYROW, 536 $ DESCA( RSRC_ ), NPROW ) 537 IZROW = INDXG2P( 1, DESCA( NB_ ), MYROW, 538 $ DESCZ( RSRC_ ), NPROW ) 539 IF( IAROW.NE.IZROW ) THEN 540 INFO = -19 541 ELSE IF( MOD( IA-1, DESCA( MB_ ) ).NE. 542 $ MOD( IZ-1, DESCZ( MB_ ) ) ) THEN 543 INFO = -19 544 ELSE IF( DESCA( M_ ).NE.DESCZ( M_ ) ) THEN 545 INFO = -( 2100+M_ ) 546 ELSE IF( DESCA( N_ ).NE.DESCZ( N_ ) ) THEN 547 INFO = -( 2100+N_ ) 548 ELSE IF( DESCA( MB_ ).NE.DESCZ( MB_ ) ) THEN 549 INFO = -( 2100+MB_ ) 550 ELSE IF( DESCA( NB_ ).NE.DESCZ( NB_ ) ) THEN 551 INFO = -( 2100+NB_ ) 552 ELSE IF( DESCA( RSRC_ ).NE.DESCZ( RSRC_ ) ) THEN 553 INFO = -( 2100+RSRC_ ) 554 ELSE IF( DESCA( CSRC_ ).NE.DESCZ( CSRC_ ) ) THEN 555 INFO = -( 2100+CSRC_ ) 556 ELSE IF( ICTXT.NE.DESCZ( CTXT_ ) ) THEN 557 INFO = -( 2100+CTXT_ ) 558 END IF 559 END IF 560 END IF 561 IDUM2( 1 ) = 1 562 IF( LOWER ) THEN 563 IDUM1( 2 ) = ICHAR( 'L' ) 564 ELSE 565 IDUM1( 2 ) = ICHAR( 'U' ) 566 END IF 567 IDUM2( 2 ) = 2 568 IF( ALLEIG ) THEN 569 IDUM1( 3 ) = ICHAR( 'A' ) 570 ELSE IF( INDEIG ) THEN 571 IDUM1( 3 ) = ICHAR( 'I' ) 572 ELSE 573 IDUM1( 3 ) = ICHAR( 'V' ) 574 END IF 575 IDUM2( 3 ) = 3 576 IF( LQUERY ) THEN 577 IDUM1( 4 ) = -1 578 ELSE 579 IDUM1( 4 ) = 1 580 END IF 581 IDUM2( 4 ) = 4 582 IF( WANTZ ) THEN 583 IDUM1( 1 ) = ICHAR( 'V' ) 584 CALL PCHK2MAT( N, 4, N, 4, IA, JA, DESCA, 8, N, 4, N, 4,IZ, 585 $ JZ, DESCZ, 21, 4, IDUM1, IDUM2, INFO ) 586 ELSE 587 IDUM1( 1 ) = ICHAR( 'N' ) 588 CALL PCHK1MAT( N, 4, N, 4, IA, JA, DESCA, 8, 4, IDUM1, 589 $ IDUM2, INFO ) 590 END IF 591 WORK( 1 ) = DCMPLX( LWOPT ) 592 RWORK( 1 ) = DBLE( LRWOPT ) 593 IWORK( 1 ) = LIWMIN 594 END IF 595* 596 IF( INFO.NE.0 ) THEN 597 CALL PXERBLA( ICTXT, 'PZHEEVR', -INFO ) 598 RETURN 599 ELSE IF( LQUERY ) THEN 600 RETURN 601 END IF 602 603*********************************************************************** 604* 605* Quick return if possible 606* 607*********************************************************************** 608 IF( N.EQ.0 ) THEN 609 IF( WANTZ ) THEN 610 NZ = 0 611 END IF 612 M = 0 613 WORK( 1 ) = DCMPLX( LWOPT ) 614 RWORK( 1 ) = DBLE( LRWOPT ) 615 IWORK( 1 ) = LIWMIN 616 RETURN 617 END IF 618 619 IF( VALEIG ) THEN 620 VLL = VL 621 VUU = VU 622 ELSE 623 VLL = ZERO 624 VUU = ZERO 625 END IF 626* 627* No scaling done here, leave this to MRRR kernel. 628* Scale tridiagonal rather than full matrix. 629* 630*********************************************************************** 631* 632* REDUCE MATRIX TO REAL SYMMETRIC TRIDIAGONAL FORM. 633* 634*********************************************************************** 635 636 637 CALL PZHENTRD( UPLO, N, A, IA, JA, DESCA, RWORK( INDD ), 638 $ RWORK( INDE ), WORK( INDTAU ), WORK( INDWORK ), 639 $ LLWORK, RWORK( INDRWORK ), LLRWORK,IINFO ) 640 641 642 IF (IINFO .NE. 0) THEN 643 CALL PXERBLA( ICTXT, 'PZHENTRD', -IINFO ) 644 RETURN 645 END IF 646 647*********************************************************************** 648* 649* DISTRIBUTE TRIDIAGONAL TO ALL PROCESSORS 650* 651*********************************************************************** 652 OFFSET = 0 653 IF( IA.EQ.1 .AND. JA.EQ.1 .AND. 654 $ DESCA( RSRC_ ).EQ.0 .AND. DESCA( CSRC_ ).EQ.0 ) 655 $ THEN 656 CALL PDLARED1D( N, IA, JA, DESCA, RWORK( INDD ), 657 $ RWORK( INDD2 ), RWORK( INDRWORK ), LLRWORK ) 658* 659 CALL PDLARED1D( N, IA, JA, DESCA, RWORK( INDE ), 660 $ RWORK( INDE2 ), RWORK( INDRWORK ), LLRWORK ) 661 IF( .NOT.LOWER ) 662 $ OFFSET = 1 663 ELSE 664 DO 10 I = 1, N 665 CALL PZELGET( 'A', ' ', WORK( INDWORK ), A, 666 $ I+IA-1, I+JA-1, DESCA ) 667 RWORK( INDD2+I-1 ) = DBLE( WORK( INDWORK ) ) 668 10 CONTINUE 669 IF( LSAME( UPLO, 'U' ) ) THEN 670 DO 20 I = 1, N - 1 671 CALL PZELGET( 'A', ' ', WORK( INDWORK ), A, 672 $ I+IA-1, I+JA, DESCA ) 673 RWORK( INDE2+I-1 ) = DBLE( WORK( INDWORK ) ) 674 20 CONTINUE 675 ELSE 676 DO 30 I = 1, N - 1 677 CALL PZELGET( 'A', ' ', WORK( INDWORK ), A, 678 $ I+IA, I+JA-1, DESCA ) 679 RWORK( INDE2+I-1 ) = DBLE( WORK( INDWORK ) ) 680 30 CONTINUE 681 END IF 682 END IF 683 684 685 686 687*********************************************************************** 688* 689* SET IIL, IIU 690* 691*********************************************************************** 692 IF ( ALLEIG ) THEN 693 IIL = 1 694 IIU = N 695 ELSE IF ( INDEIG ) THEN 696 IIL = IL 697 IIU = IU 698 ELSE IF ( VALEIG ) THEN 699 CALL DLARRC('T', N, VLL, VUU, RWORK( INDD2 ), 700 $ RWORK( INDE2 + OFFSET ), SAFMIN, EIGCNT, IIL, IIU, INFO) 701* Refine upper bound N that was taken 702 MZ = EIGCNT 703 IIL = IIL + 1 704 ENDIF 705 706 IF(MZ.EQ.0) THEN 707 M = 0 708 IF( WANTZ ) THEN 709 NZ = 0 710 END IF 711 WORK( 1 ) = DBLE( LWOPT ) 712 IWORK( 1 ) = LIWMIN 713 RETURN 714 END IF 715 716 MYIL = 0 717 MYIU = 0 718 M = 0 719 IM = 0 720 721*********************************************************************** 722* 723* COMPUTE WORK ASSIGNMENTS 724* 725*********************************************************************** 726 727* 728* Each processor computes the work assignments for all processors 729* 730 CALL PMPIM2( IIL, IIU, NPROCS, 731 $ IWORK(INDILU), IWORK(INDILU+NPROCS) ) 732* 733* Find local work assignment 734* 735 MYIL = IWORK(INDILU+MYPROC) 736 MYIU = IWORK(INDILU+NPROCS+MYPROC) 737 738 739 ZOFFSET = MAX(0, MYIL - IIL - 1) 740 FIRST = ( MYIL .EQ. IIL ) 741 742 743*********************************************************************** 744* 745* CALLS TO MRRR KERNEL 746* 747*********************************************************************** 748 IF(.NOT.WANTZ) THEN 749* 750* Compute eigenvalues only. 751* 752 IINFO = 0 753 IF ( MYIL.GT.0 ) THEN 754 DOL = 1 755 DOU = MYIU - MYIL + 1 756 CALL DSTEGR2( JOBZ, 'I', N, RWORK( INDD2 ), 757 $ RWORK( INDE2+OFFSET ), VLL, VUU, MYIL, MYIU, 758 $ IM, W( 1 ), RWORK( INDRW ), N, 759 $ MYIU - MYIL + 1, 760 $ IWORK( 1 ), RWORK( INDRWORK ), SIZE1, 761 $ IWORK( 2*N+1 ), SIZE2, 762 $ DOL, DOU, ZOFFSET, IINFO ) 763* DSTEGR2 zeroes out the entire W array, so we can't just give 764* it the part of W we need. So here we copy the W entries into 765* their correct location 766 DO 49 I = 1, IM 767 W( MYIL-IIL+I ) = W( I ) 768 49 CONTINUE 769* W( MYIL ) is at W( MYIL - IIL + 1 ) 770* W( X ) is at W(X - IIL + 1 ) 771 END IF 772 IF (IINFO .NE. 0) THEN 773 CALL PXERBLA( ICTXT, 'DSTEGR2', -IINFO ) 774 RETURN 775 END IF 776 ELSEIF ( WANTZ .AND. NPROCS.EQ.1 ) THEN 777* 778* Compute eigenvalues and -vectors, but only on one processor 779* 780 IINFO = 0 781 IF ( MYIL.GT.0 ) THEN 782 DOL = MYIL - IIL + 1 783 DOU = MYIU - IIL + 1 784 CALL DSTEGR2( JOBZ, 'I', N, RWORK( INDD2 ), 785 $ RWORK( INDE2+OFFSET ), VLL, VUU, IIL, IIU, 786 $ IM, W( 1 ), RWORK( INDRW ), N, 787 $ N, 788 $ IWORK( 1 ), RWORK( INDRWORK ), SIZE1, 789 $ IWORK( 2*N+1 ), SIZE2, DOL, DOU, 790 $ ZOFFSET, IINFO ) 791 ENDIF 792 IF (IINFO .NE. 0) THEN 793 CALL PXERBLA( ICTXT, 'DSTEGR2', -IINFO ) 794 RETURN 795 END IF 796 ELSEIF ( WANTZ ) THEN 797* Compute representations in parallel. 798* Share eigenvalue computation for root between all processors 799* Then compute the eigenvectors. 800 IINFO = 0 801* Part 1. compute root representations and root eigenvalues 802 IF ( MYIL.GT.0 ) THEN 803 DOL = MYIL - IIL + 1 804 DOU = MYIU - IIL + 1 805 CALL DSTEGR2A( JOBZ, 'I', N, RWORK( INDD2 ), 806 $ RWORK( INDE2+OFFSET ), VLL, VUU, IIL, IIU, 807 $ IM, W( 1 ), RWORK( INDRW ), N, 808 $ N, RWORK( INDRWORK ), SIZE1, 809 $ IWORK( 2*N+1 ), SIZE2, DOL, 810 $ DOU, NEEDIL, NEEDIU, 811 $ INDERR, NSPLIT, PIVMIN, SCALE, WL, WU, 812 $ IINFO ) 813 ENDIF 814 IF (IINFO .NE. 0) THEN 815 CALL PXERBLA( ICTXT, 'DSTEGR2A', -IINFO ) 816 RETURN 817 END IF 818* 819* The second part of parallel MRRR, the representation tree 820* construction begins. Upon successful completion, the 821* eigenvectors have been computed. This is indicated by 822* the flag FINISH. 823* 824 VSTART = .TRUE. 825 FINISH = (MYIL.LE.0) 826C Part 2. Share eigenvalues and uncertainties between all processors 827 IINDERR = INDRWORK + INDERR - 1 828 829* 830 831 832* 833* There are currently two ways to communicate eigenvalue information 834* using the BLACS. 835* 1.) BROADCAST 836* 2.) POINT2POINT between collaborators (those processors working 837* jointly on a cluster. 838* For efficiency, BROADCAST has been disabled. 839* At a later stage, other more efficient communication algorithms 840* might be implemented, e. g. group or tree-based communication. 841 842 DOBCST = .FALSE. 843 IF(DOBCST) THEN 844* First gather everything on the first processor. 845* Then use BROADCAST-based communication 846 DO 45 I = 2, NPROCS 847 IF (MYPROC .EQ. (I - 1)) THEN 848 DSTROW = 0 849 DSTCOL = 0 850 STARTI = DOL 851 IWORK(1) = STARTI 852 IF(MYIL.GT.0) THEN 853 LENGTHI = MYIU - MYIL + 1 854 ELSE 855 LENGTHI = 0 856 ENDIF 857 IWORK(2) = LENGTHI 858 CALL IGESD2D( ICTXT, 2, 1, IWORK, 2, 859 $ DSTROW, DSTCOL ) 860 IF (( STARTI.GE.1 ) .AND. ( LENGTHI.GE.1 )) THEN 861 LENGTHI2 = 2*LENGTHI 862* Copy eigenvalues into communication buffer 863 CALL DCOPY(LENGTHI,W( STARTI ),1, 864 $ RWORK( INDD ), 1) 865* Copy uncertainties into communication buffer 866 CALL DCOPY(LENGTHI,RWORK(IINDERR+STARTI-1),1, 867 $ RWORK( INDD+LENGTHI ), 1) 868* send buffer 869 CALL DGESD2D( ICTXT, LENGTHI2, 870 $ 1, RWORK( INDD ), LENGTHI2, 871 $ DSTROW, DSTCOL ) 872 END IF 873 ELSE IF (MYPROC .EQ. 0) THEN 874 SRCROW = (I-1) / NPCOL 875 SRCCOL = MOD(I-1, NPCOL) 876 CALL IGERV2D( ICTXT, 2, 1, IWORK, 2, 877 $ SRCROW, SRCCOL ) 878 STARTI = IWORK(1) 879 LENGTHI = IWORK(2) 880 IF (( STARTI.GE.1 ) .AND. ( LENGTHI.GE.1 )) THEN 881 LENGTHI2 = 2*LENGTHI 882* receive buffer 883 CALL DGERV2D( ICTXT, LENGTHI2, 1, 884 $ RWORK(INDD), LENGTHI2, SRCROW, SRCCOL ) 885* copy eigenvalues from communication buffer 886 CALL DCOPY( LENGTHI, RWORK(INDD), 1, 887 $ W( STARTI ), 1) 888* copy uncertainties (errors) from communication buffer 889 CALL DCOPY(LENGTHI,RWORK(INDD+LENGTHI),1, 890 $ RWORK( IINDERR+STARTI-1 ), 1) 891 END IF 892 END IF 893 45 CONTINUE 894 LENGTHI = IIU - IIL + 1 895 LENGTHI2 = LENGTHI * 2 896 IF (MYPROC .EQ. 0) THEN 897* Broadcast eigenvalues and errors to all processors 898 CALL DCOPY(LENGTHI,W ,1, RWORK( INDD ), 1) 899 CALL DCOPY(LENGTHI,RWORK( IINDERR ),1, 900 $ RWORK( INDD+LENGTHI ), 1) 901 CALL DGEBS2D( ICTXT, 'A', ' ', LENGTHI2, 1, 902 $ RWORK(INDD), LENGTHI2 ) 903 ELSE 904 SRCROW = 0 905 SRCCOL = 0 906 CALL DGEBR2D( ICTXT, 'A', ' ', LENGTHI2, 1, 907 $ RWORK(INDD), LENGTHI2, SRCROW, SRCCOL ) 908 CALL DCOPY( LENGTHI, RWORK(INDD), 1, W, 1) 909 CALL DCOPY(LENGTHI,RWORK(INDD+LENGTHI),1, 910 $ RWORK( IINDERR ), 1) 911 END IF 912 ELSE 913* Enable point2point communication between collaborators 914 915* Find collaborators of MYPROC 916 IF( (NPROCS.GT.1).AND.(MYIL.GT.0) ) THEN 917 CALL PMPCOL( MYPROC, NPROCS, IIL, NEEDIL, NEEDIU, 918 $ IWORK(INDILU), IWORK(INDILU+NPROCS), 919 $ COLBRT, FRSTCL, LASTCL ) 920 ELSE 921 COLBRT = .FALSE. 922 ENDIF 923 924 IF(COLBRT) THEN 925* If the processor collaborates with others, 926* communicate information. 927 DO 47 IPROC = FRSTCL, LASTCL 928 IF (MYPROC .EQ. IPROC) THEN 929 STARTI = DOL 930 IWORK(1) = STARTI 931 LENGTHI = MYIU - MYIL + 1 932 IWORK(2) = LENGTHI 933 934 IF ((STARTI.GE.1) .AND. (LENGTHI.GE.1)) THEN 935* Copy eigenvalues into communication buffer 936 CALL DCOPY(LENGTHI,W( STARTI ),1, 937 $ RWORK(INDD), 1) 938* Copy uncertainties into communication buffer 939 CALL DCOPY(LENGTHI, 940 $ RWORK( IINDERR+STARTI-1 ),1, 941 $ RWORK(INDD+LENGTHI), 1) 942 ENDIF 943 944 DO 46 I = FRSTCL, LASTCL 945 IF(I.EQ.MYPROC) GOTO 46 946 DSTROW = I/ NPCOL 947 DSTCOL = MOD(I, NPCOL) 948 CALL IGESD2D( ICTXT, 2, 1, IWORK, 2, 949 $ DSTROW, DSTCOL ) 950 IF ((STARTI.GE.1) .AND. (LENGTHI.GE.1)) THEN 951 LENGTHI2 = 2*LENGTHI 952* send buffer 953 CALL DGESD2D( ICTXT, LENGTHI2, 954 $ 1, RWORK(INDD), LENGTHI2, 955 $ DSTROW, DSTCOL ) 956 END IF 957 46 CONTINUE 958 ELSE 959 SRCROW = IPROC / NPCOL 960 SRCCOL = MOD(IPROC, NPCOL) 961 CALL IGERV2D( ICTXT, 2, 1, IWORK, 2, 962 $ SRCROW, SRCCOL ) 963 RSTARTI = IWORK(1) 964 RLENGTHI = IWORK(2) 965 IF ((RSTARTI.GE.1 ) .AND. (RLENGTHI.GE.1 )) THEN 966 RLENGTHI2 = 2*RLENGTHI 967 CALL DGERV2D( ICTXT, RLENGTHI2, 1, 968 $ RWORK(INDE), RLENGTHI2, 969 $ SRCROW, SRCCOL ) 970* copy eigenvalues from communication buffer 971 CALL DCOPY( RLENGTHI,RWORK(INDE), 1, 972 $ W( RSTARTI ), 1) 973* copy uncertainties (errors) from communication buffer 974 CALL DCOPY(RLENGTHI,RWORK(INDE+RLENGTHI),1, 975 $ RWORK( IINDERR+RSTARTI-1 ), 1) 976 END IF 977 END IF 978 47 CONTINUE 979 ENDIF 980 ENDIF 981 982* Part 3. Compute representation tree and eigenvectors. 983* What follows is a loop in which the tree 984* is constructed in parallel from top to bottom, 985* on level at a time, until all eigenvectors 986* have been computed. 987* 988 100 CONTINUE 989 IF ( MYIL.GT.0 ) THEN 990 CALL DSTEGR2B( JOBZ, N, RWORK( INDD2 ), 991 $ RWORK( INDE2+OFFSET ), 992 $ IM, W( 1 ), RWORK( INDRW ), N, N, 993 $ IWORK( 1 ), RWORK( INDRWORK ), SIZE1, 994 $ IWORK( 2*N+1 ), SIZE2, DOL, 995 $ DOU, NEEDIL, NEEDIU, INDWLC, 996 $ PIVMIN, SCALE, WL, WU, 997 $ VSTART, FINISH, 998 $ MAXCLS, NDEPTH, PARITY, ZOFFSET, IINFO ) 999 IINDWLC = INDRWORK + INDWLC - 1 1000 IF(.NOT.FINISH) THEN 1001 IF((NEEDIL.LT.DOL).OR.(NEEDIU.GT.DOU)) THEN 1002 CALL PMPCOL( MYPROC, NPROCS, IIL, NEEDIL, NEEDIU, 1003 $ IWORK(INDILU), IWORK(INDILU+NPROCS), 1004 $ COLBRT, FRSTCL, LASTCL ) 1005 ELSE 1006 COLBRT = .FALSE. 1007 FRSTCL = MYPROC 1008 LASTCL = MYPROC 1009 ENDIF 1010* 1011* Check if this processor collaborates, i.e. 1012* communication is needed. 1013* 1014 IF(COLBRT) THEN 1015 DO 147 IPROC = FRSTCL, LASTCL 1016 IF (MYPROC .EQ. IPROC) THEN 1017 STARTI = DOL 1018 IWORK(1) = STARTI 1019 IF(MYIL.GT.0) THEN 1020 LENGTHI = MYIU - MYIL + 1 1021 ELSE 1022 LENGTHI = 0 1023 ENDIF 1024 IWORK(2) = LENGTHI 1025 IF ((STARTI.GE.1).AND.(LENGTHI.GE.1)) THEN 1026* Copy eigenvalues into communication buffer 1027 CALL DCOPY(LENGTHI, 1028 $ RWORK( IINDWLC+STARTI-1 ),1, 1029 $ RWORK(INDD), 1) 1030* Copy uncertainties into communication buffer 1031 CALL DCOPY(LENGTHI, 1032 $ RWORK( IINDERR+STARTI-1 ),1, 1033 $ RWORK(INDD+LENGTHI), 1) 1034 ENDIF 1035 1036 DO 146 I = FRSTCL, LASTCL 1037 IF(I.EQ.MYPROC) GOTO 146 1038 DSTROW = I/ NPCOL 1039 DSTCOL = MOD(I, NPCOL) 1040 CALL IGESD2D( ICTXT, 2, 1, IWORK, 2, 1041 $ DSTROW, DSTCOL ) 1042 IF ((STARTI.GE.1).AND.(LENGTHI.GE.1)) THEN 1043 LENGTHI2 = 2*LENGTHI 1044* send buffer 1045 CALL DGESD2D( ICTXT, LENGTHI2, 1046 $ 1, RWORK(INDD), LENGTHI2, 1047 $ DSTROW, DSTCOL ) 1048 END IF 1049 146 CONTINUE 1050 ELSE 1051 SRCROW = IPROC / NPCOL 1052 SRCCOL = MOD(IPROC, NPCOL) 1053 CALL IGERV2D( ICTXT, 2, 1, IWORK, 2, 1054 $ SRCROW, SRCCOL ) 1055 RSTARTI = IWORK(1) 1056 RLENGTHI = IWORK(2) 1057 IF ((RSTARTI.GE.1).AND.(RLENGTHI.GE.1)) THEN 1058 RLENGTHI2 = 2*RLENGTHI 1059 CALL DGERV2D( ICTXT,RLENGTHI2, 1, 1060 $ RWORK(INDE),RLENGTHI2, 1061 $ SRCROW, SRCCOL ) 1062* copy eigenvalues from communication buffer 1063 CALL DCOPY(RLENGTHI,RWORK(INDE), 1, 1064 $ RWORK( IINDWLC+RSTARTI-1 ), 1) 1065* copy uncertainties (errors) from communication buffer 1066 CALL DCOPY(RLENGTHI,RWORK(INDE+RLENGTHI), 1067 $ 1,RWORK( IINDERR+RSTARTI-1 ), 1) 1068 END IF 1069 END IF 1070 147 CONTINUE 1071 ENDIF 1072 GOTO 100 1073 ENDIF 1074 ENDIF 1075 IF (IINFO .NE. 0) THEN 1076 CALL PXERBLA( ICTXT, 'DSTEGR2B', -IINFO ) 1077 RETURN 1078 END IF 1079* 1080 ENDIF 1081 1082* 1083*********************************************************************** 1084* 1085* MAIN PART ENDS HERE 1086* 1087*********************************************************************** 1088* 1089 1090*********************************************************************** 1091* 1092* ALLGATHER: EACH PROCESSOR SENDS ITS EIGENVALUES TO THE FIRST ONE, 1093* THEN THE FIRST PROCESSOR BROADCASTS ALL EIGENVALUES 1094* 1095*********************************************************************** 1096 1097 DO 50 I = 2, NPROCS 1098 IF (MYPROC .EQ. (I - 1)) THEN 1099 DSTROW = 0 1100 DSTCOL = 0 1101 STARTI = MYIL - IIL + 1 1102 IWORK(1) = STARTI 1103 IF(MYIL.GT.0) THEN 1104 LENGTHI = MYIU - MYIL + 1 1105 ELSE 1106 LENGTHI = 0 1107 ENDIF 1108 IWORK(2) = LENGTHI 1109 CALL IGESD2D( ICTXT, 2, 1, IWORK, 2, 1110 $ DSTROW, DSTCOL ) 1111 IF ((STARTI.GE.1).AND.(LENGTHI.GE.1)) THEN 1112 CALL DGESD2D( ICTXT, LENGTHI, 1113 $ 1, W( STARTI ), LENGTHI, 1114 $ DSTROW, DSTCOL ) 1115 ENDIF 1116 ELSE IF (MYPROC .EQ. 0) THEN 1117 SRCROW = (I-1) / NPCOL 1118 SRCCOL = MOD(I-1, NPCOL) 1119 CALL IGERV2D( ICTXT, 2, 1, IWORK, 2, 1120 $ SRCROW, SRCCOL ) 1121 STARTI = IWORK(1) 1122 LENGTHI = IWORK(2) 1123 IF ((STARTI.GE.1).AND.(LENGTHI.GE.1)) THEN 1124 CALL DGERV2D( ICTXT, LENGTHI, 1, 1125 $ W( STARTI ), LENGTHI, SRCROW, SRCCOL ) 1126 ENDIF 1127 ENDIF 1128 50 CONTINUE 1129 1130* Accumulate M from all processors 1131 M = IM 1132 CALL IGSUM2D( ICTXT, 'A', ' ', 1, 1, M, 1, -1, -1 ) 1133 1134* Broadcast eigenvalues to all processors 1135 IF (MYPROC .EQ. 0) THEN 1136* Send eigenvalues 1137 CALL DGEBS2D( ICTXT, 'A', ' ', M, 1, W, M ) 1138 ELSE 1139 SRCROW = 0 1140 SRCCOL = 0 1141 CALL DGEBR2D( ICTXT, 'A', ' ', M, 1, 1142 $ W, M, SRCROW, SRCCOL ) 1143 END IF 1144* 1145* Sort the eigenvalues and keep permutation in IWORK to 1146* sort the eigenvectors accordingly 1147* 1148 DO 160 I = 1, M 1149 IWORK( NPROCS+1+I ) = I 1150 160 CONTINUE 1151 CALL DLASRT2( 'I', M, W, IWORK( NPROCS+2 ), IINFO ) 1152 IF (IINFO.NE.0) THEN 1153 CALL PXERBLA( ICTXT, 'DLASRT2', -IINFO ) 1154 RETURN 1155 END IF 1156 1157*********************************************************************** 1158* 1159* TRANSFORM Z FROM 1D WORKSPACE INTO 2D BLOCKCYCLIC STORAGE 1160* 1161*********************************************************************** 1162 IF ( WANTZ ) THEN 1163 DO 170 I = 1, M 1164 IWORK( M+NPROCS+1+IWORK( NPROCS+1+I ) ) = I 1165 170 CONTINUE 1166* Store NVS in IWORK(1:NPROCS+1) for PZLAEVSWP 1167 IWORK( 1 ) = 0 1168 DO 180 I = 1, NPROCS 1169* Find IL and IU for processor i-1 1170* Has already been computed by PMPIM2 and stored 1171 IPIL = IWORK(INDILU+I-1) 1172 IPIU = IWORK(INDILU+NPROCS+I-1) 1173 IF (IPIL .EQ. 0) THEN 1174 IWORK( I + 1 ) = IWORK( I ) 1175 ELSE 1176 IWORK( I + 1 ) = IWORK( I ) + IPIU - IPIL + 1 1177 ENDIF 1178 180 CONTINUE 1179 1180 IF ( FIRST ) THEN 1181 CALL PZLAEVSWP(N, RWORK( INDRW ), N, Z, IZ, JZ, 1182 $ DESCZ, IWORK( 1 ), IWORK( NPROCS+M+2 ), RWORK( INDRWORK ), 1183 $ SIZE1 ) 1184 ELSE 1185 CALL PZLAEVSWP(N, RWORK( INDRW + N ), N, Z, IZ, JZ, 1186 $ DESCZ, IWORK( 1 ), IWORK( NPROCS+M+2 ), RWORK( INDRWORK ), 1187 $ SIZE1 ) 1188 END IF 1189* 1190 NZ = M 1191* 1192 1193*********************************************************************** 1194* 1195* Compute eigenvectors of A from eigenvectors of T 1196* 1197*********************************************************************** 1198 IF( NZ.GT.0 ) THEN 1199 CALL PZUNMTR( 'L', UPLO, 'N', N, NZ, A, IA, JA, DESCA, 1200 $ WORK( INDTAU ), Z, IZ, JZ, DESCZ, 1201 $ WORK( INDWORK ), LLWORK, IINFO ) 1202 END IF 1203 IF (IINFO.NE.0) THEN 1204 CALL PXERBLA( ICTXT, 'PZUNMTR', -IINFO ) 1205 RETURN 1206 END IF 1207* 1208 1209 END IF 1210* 1211 WORK( 1 ) = DCMPLX( LWOPT ) 1212 RWORK( 1 ) = DBLE( LRWOPT ) 1213 IWORK( 1 ) = LIWMIN 1214 1215 RETURN 1216* 1217* End of PZHEEVR 1218* 1219 END 1220