1 SUBROUTINE PDTRSEN( JOB, COMPQ, SELECT, PARA, N, T, IT, JT, 2 $ DESCT, Q, IQ, JQ, DESCQ, WR, WI, M, S, SEP, WORK, LWORK, 3 $ IWORK, LIWORK, INFO ) 4* 5* Contribution from the Department of Computing Science and HPC2N, 6* Umea University, Sweden 7* 8* -- ScaLAPACK computational routine (version 2.0.1) -- 9* University of Tennessee, Knoxville, Oak Ridge National Laboratory, 10* Univ. of Colorado Denver and University of California, Berkeley. 11* January, 2012 12* 13 IMPLICIT NONE 14* 15* .. Scalar Arguments .. 16 CHARACTER COMPQ, JOB 17 INTEGER INFO, LIWORK, LWORK, M, N, 18 $ IT, JT, IQ, JQ 19 DOUBLE PRECISION S, SEP 20* .. 21* .. Array Arguments .. 22 LOGICAL SELECT( N ) 23 INTEGER PARA( 6 ), DESCT( * ), DESCQ( * ), IWORK( * ) 24 DOUBLE PRECISION Q( * ), T( * ), WI( * ), WORK( * ), WR( * ) 25* .. 26* 27* Purpose 28* ======= 29* 30* PDTRSEN reorders the real Schur factorization of a real matrix 31* A = Q*T*Q**T, so that a selected cluster of eigenvalues appears 32* in the leading diagonal blocks of the upper quasi-triangular matrix 33* T, and the leading columns of Q form an orthonormal basis of the 34* corresponding right invariant subspace. The reordering is performed 35* by PDTRORD. 36* 37* Optionally the routine computes the reciprocal condition numbers of 38* the cluster of eigenvalues and/or the invariant subspace. SCASY 39* library is needed for condition estimation. 40* 41* T must be in Schur form (as returned by PDLAHQR), that is, block 42* upper triangular with 1-by-1 and 2-by-2 diagonal blocks. 43* 44* Notes 45* ===== 46* 47* Each global data object is described by an associated description 48* vector. This vector stores the information required to establish 49* the mapping between an object element and its corresponding process 50* and memory location. 51* 52* Let A be a generic term for any 2D block cyclicly distributed array. 53* Such a global array has an associated description vector DESCA. 54* In the following comments, the character _ should be read as 55* "of the global array". 56* 57* NOTATION STORED IN EXPLANATION 58* --------------- -------------- -------------------------------------- 59* DTYPE_A(global) DESCA( DTYPE_ )The descriptor type. In this case, 60* DTYPE_A = 1. 61* CTXT_A (global) DESCA( CTXT_ ) The BLACS context handle, indicating 62* the BLACS process grid A is distribu- 63* ted over. The context itself is glo- 64* bal, but the handle (the integer 65* value) may vary. 66* M_A (global) DESCA( M_ ) The number of rows in the global 67* array A. 68* N_A (global) DESCA( N_ ) The number of columns in the global 69* array A. 70* MB_A (global) DESCA( MB_ ) The blocking factor used to distribute 71* the rows of the array. 72* NB_A (global) DESCA( NB_ ) The blocking factor used to distribute 73* the columns of the array. 74* RSRC_A (global) DESCA( RSRC_ ) The process row over which the first 75* row of the array A is distributed. 76* CSRC_A (global) DESCA( CSRC_ ) The process column over which the 77* first column of the array A is 78* distributed. 79* LLD_A (local) DESCA( LLD_ ) The leading dimension of the local 80* array. LLD_A >= MAX(1,LOCr(M_A)). 81* 82* Let K be the number of rows or columns of a distributed matrix, 83* and assume that its process grid has dimension p x q. 84* LOCr( K ) denotes the number of elements of K that a process 85* would receive if K were distributed over the p processes of its 86* process column. 87* Similarly, LOCc( K ) denotes the number of elements of K that a 88* process would receive if K were distributed over the q processes of 89* its process row. 90* The values of LOCr() and LOCc() may be determined via a call to the 91* ScaLAPACK tool function, NUMROC: 92* LOCr( M ) = NUMROC( M, MB_A, MYROW, RSRC_A, NPROW ), 93* LOCc( N ) = NUMROC( N, NB_A, MYCOL, CSRC_A, NPCOL ). 94* An upper bound for these quantities may be computed by: 95* LOCr( M ) <= ceil( ceil(M/MB_A)/NPROW )*MB_A 96* LOCc( N ) <= ceil( ceil(N/NB_A)/NPCOL )*NB_A 97* 98* Arguments 99* ========= 100* 101* JOB (global input) CHARACTER*1 102* Specifies whether condition numbers are required for the 103* cluster of eigenvalues (S) or the invariant subspace (SEP): 104* = 'N': none; 105* = 'E': for eigenvalues only (S); 106* = 'V': for invariant subspace only (SEP); 107* = 'B': for both eigenvalues and invariant subspace (S and 108* SEP). 109* 110* COMPQ (global input) CHARACTER*1 111* = 'V': update the matrix Q of Schur vectors; 112* = 'N': do not update Q. 113* 114* SELECT (global input) LOGICAL array, dimension (N) 115* SELECT specifies the eigenvalues in the selected cluster. To 116* select a real eigenvalue w(j), SELECT(j) must be set to 117* .TRUE.. To select a complex conjugate pair of eigenvalues 118* w(j) and w(j+1), corresponding to a 2-by-2 diagonal block, 119* either SELECT(j) or SELECT(j+1) or both must be set to 120* .TRUE.; a complex conjugate pair of eigenvalues must be 121* either both included in the cluster or both excluded. 122* 123* PARA (global input) INTEGER*6 124* Block parameters (some should be replaced by calls to 125* PILAENV and others by meaningful default values): 126* PARA(1) = maximum number of concurrent computational windows 127* allowed in the algorithm; 128* 0 < PARA(1) <= min(NPROW,NPCOL) must hold; 129* PARA(2) = number of eigenvalues in each window; 130* 0 < PARA(2) < PARA(3) must hold; 131* PARA(3) = window size; PARA(2) < PARA(3) < DESCT(MB_) 132* must hold; 133* PARA(4) = minimal percentage of flops required for 134* performing matrix-matrix multiplications instead 135* of pipelined orthogonal transformations; 136* 0 <= PARA(4) <= 100 must hold; 137* PARA(5) = width of block column slabs for row-wise 138* application of pipelined orthogonal 139* transformations in their factorized form; 140* 0 < PARA(5) <= DESCT(MB_) must hold. 141* PARA(6) = the maximum number of eigenvalues moved together 142* over a process border; in practice, this will be 143* approximately half of the cross border window size 144* 0 < PARA(6) <= PARA(2) must hold; 145* 146* N (global input) INTEGER 147* The order of the globally distributed matrix T. N >= 0. 148* 149* T (local input/output) DOUBLE PRECISION array, 150* dimension (LLD_T,LOCc(N)). 151* On entry, the local pieces of the global distributed 152* upper quasi-triangular matrix T, in Schur form. On exit, T is 153* overwritten by the local pieces of the reordered matrix T, 154* again in Schur form, with the selected eigenvalues in the 155* globally leading diagonal blocks. 156* 157* IT (global input) INTEGER 158* JT (global input) INTEGER 159* The row and column index in the global array T indicating the 160* first column of sub( T ). IT = JT = 1 must hold. 161* 162* DESCT (global and local input) INTEGER array of dimension DLEN_. 163* The array descriptor for the global distributed matrix T. 164* 165* Q (local input/output) DOUBLE PRECISION array, 166* dimension (LLD_Q,LOCc(N)). 167* On entry, if COMPQ = 'V', the local pieces of the global 168* distributed matrix Q of Schur vectors. 169* On exit, if COMPQ = 'V', Q has been postmultiplied by the 170* global orthogonal transformation matrix which reorders T; the 171* leading M columns of Q form an orthonormal basis for the 172* specified invariant subspace. 173* If COMPQ = 'N', Q is not referenced. 174* 175* IQ (global input) INTEGER 176* JQ (global input) INTEGER 177* The column index in the global array Q indicating the 178* first column of sub( Q ). IQ = JQ = 1 must hold. 179* 180* DESCQ (global and local input) INTEGER array of dimension DLEN_. 181* The array descriptor for the global distributed matrix Q. 182* 183* WR (global output) DOUBLE PRECISION array, dimension (N) 184* WI (global output) DOUBLE PRECISION array, dimension (N) 185* The real and imaginary parts, respectively, of the reordered 186* eigenvalues of T. The eigenvalues are in principle stored in 187* the same order as on the diagonal of T, with WR(i) = T(i,i) 188* and, if T(i:i+1,i:i+1) is a 2-by-2 diagonal block, WI(i) > 0 189* and WI(i+1) = -WI(i). 190* Note also that if a complex eigenvalue is sufficiently 191* ill-conditioned, then its value may differ significantly 192* from its value before reordering. 193* 194* M (global output) INTEGER 195* The dimension of the specified invariant subspace. 196* 0 <= M <= N. 197* 198* S (global output) DOUBLE PRECISION 199* If JOB = 'E' or 'B', S is a lower bound on the reciprocal 200* condition number for the selected cluster of eigenvalues. 201* S cannot underestimate the true reciprocal condition number 202* by more than a factor of sqrt(N). If M = 0 or N, S = 1. 203* If JOB = 'N' or 'V', S is not referenced. 204* 205* SEP (global output) DOUBLE PRECISION 206* If JOB = 'V' or 'B', SEP is the estimated reciprocal 207* condition number of the specified invariant subspace. If 208* M = 0 or N, SEP = norm(T). 209* If JOB = 'N' or 'E', SEP is not referenced. 210* 211* WORK (local workspace/output) DOUBLE PRECISION array, 212* dimension (LWORK) 213* On exit, if INFO = 0, WORK(1) returns the optimal LWORK. 214* 215* LWORK (local input) INTEGER 216* The dimension of the array WORK. 217* 218* If LWORK = -1, then a workspace query is assumed; the routine 219* only calculates the optimal size of the WORK array, returns 220* this value as the first entry of the WORK array, and no error 221* message related to LWORK is issued by PXERBLA. 222* 223* IWORK (local workspace/output) INTEGER array, dimension (LIWORK) 224* 225* LIWORK (local input) INTEGER 226* The dimension of the array IWORK. 227* 228* If LIWORK = -1, then a workspace query is assumed; the 229* routine only calculates the optimal size of the IWORK array, 230* returns this value as the first entry of the IWORK array, and 231* no error message related to LIWORK is issued by PXERBLA. 232* 233* INFO (global output) INTEGER 234* = 0: successful exit 235* < 0: if INFO = -i, the i-th argument had an illegal value. 236* If the i-th argument is an array and the j-entry had 237* an illegal value, then INFO = -(i*1000+j), if the i-th 238* argument is a scalar and had an illegal value, then INFO = -i. 239* > 0: here we have several possibilites 240* *) Reordering of T failed because some eigenvalues are too 241* close to separate (the problem is very ill-conditioned); 242* T may have been partially reordered, and WR and WI 243* contain the eigenvalues in the same order as in T. 244* On exit, INFO = {the index of T where the swap failed}. 245* *) A 2-by-2 block to be reordered split into two 1-by-1 246* blocks and the second block failed to swap with an 247* adjacent block. 248* On exit, INFO = {the index of T where the swap failed}. 249* *) If INFO = N+1, there is no valid BLACS context (see the 250* BLACS documentation for details). 251* *) If INFO = N+2, the routines used in the calculation of 252* the condition numbers raised a positive warning flag 253* (see the documentation for PGESYCTD and PSYCTCON of the 254* SCASY library). 255* *) If INFO = N+3, PGESYCTD raised an input error flag; 256* please report this bug to the authors (see below). 257* If INFO = N+4, PSYCTCON raised an input error flag; 258* please report this bug to the authors (see below). 259* In a future release this subroutine may distinguish between 260* the case 1 and 2 above. 261* 262* Method 263* ====== 264* 265* This routine performs parallel eigenvalue reordering in real Schur 266* form by parallelizing the approach proposed in [3]. The condition 267* number estimation part is performed by using techniques and code 268* from SCASY, see http://www.cs.umu.se/research/parallel/scasy. 269* 270* Additional requirements 271* ======================= 272* 273* The following alignment requirements must hold: 274* (a) DESCT( MB_ ) = DESCT( NB_ ) = DESCQ( MB_ ) = DESCQ( NB_ ) 275* (b) DESCT( RSRC_ ) = DESCQ( RSRC_ ) 276* (c) DESCT( CSRC_ ) = DESCQ( CSRC_ ) 277* 278* All matrices must be blocked by a block factor larger than or 279* equal to two (3). This to simplify reordering across processor 280* borders in the presence of 2-by-2 blocks. 281* 282* Limitations 283* =========== 284* 285* This algorithm cannot work on submatrices of T and Q, i.e., 286* IT = JT = IQ = JQ = 1 must hold. This is however no limitation 287* since PDLAHQR does not compute Schur forms of submatrices anyway. 288* 289* References 290* ========== 291* 292* [1] Z. Bai and J. W. Demmel; On swapping diagonal blocks in real 293* Schur form, Linear Algebra Appl., 186:73--95, 1993. Also as 294* LAPACK Working Note 54. 295* 296* [2] Z. Bai, J. W. Demmel, and A. McKenney; On computing condition 297* numbers for the nonsymmetric eigenvalue problem, ACM Trans. 298* Math. Software, 19(2):202--223, 1993. Also as LAPACK Working 299* Note 13. 300* 301* [3] D. Kressner; Block algorithms for reordering standard and 302* generalized Schur forms, ACM TOMS, 32(4):521-532, 2006. 303* Also LAPACK Working Note 171. 304* 305* [4] R. Granat, B. Kagstrom, and D. Kressner; Parallel eigenvalue 306* reordering in real Schur form, Concurrency and Computations: 307* Practice and Experience, 21(9):1225-1250, 2009. Also as 308* LAPACK Working Note 192. 309* 310* Parallel execution recommendations 311* ================================== 312* 313* Use a square grid, if possible, for maximum performance. The block 314* parameters in PARA should be kept well below the data distribution 315* block size. In particular, see [3,4] for recommended settings for 316* these parameters. 317* 318* In general, the parallel algorithm strives to perform as much work 319* as possible without crossing the block borders on the main block 320* diagonal. 321* 322* Contributors 323* ============ 324* 325* Implemented by Robert Granat, Dept. of Computing Science and HPC2N, 326* Umea University, Sweden, March 2007, 327* in collaboration with Bo Kagstrom and Daniel Kressner. 328* Modified by Meiyue Shao, October 2011. 329* 330* Revisions 331* ========= 332* 333* Please send bug-reports to granat@cs.umu.se 334* 335* Keywords 336* ======== 337* 338* Real Schur form, eigenvalue reordering, Sylvester matrix equation 339* 340* ===================================================================== 341* .. 342* .. Parameters .. 343 CHARACTER TOP 344 INTEGER BLOCK_CYCLIC_2D, CSRC_, CTXT_, DLEN_, DTYPE_, 345 $ LLD_, MB_, M_, NB_, N_, RSRC_ 346 DOUBLE PRECISION ZERO, ONE 347 PARAMETER ( TOP = '1-Tree', 348 $ BLOCK_CYCLIC_2D = 1, DLEN_ = 9, DTYPE_ = 1, 349 $ CTXT_ = 2, M_ = 3, N_ = 4, MB_ = 5, NB_ = 6, 350 $ RSRC_ = 7, CSRC_ = 8, LLD_ = 9, 351 $ ZERO = 0.0D+0, ONE = 1.0D+0 ) 352* .. 353* .. Local Scalars .. 354 LOGICAL LQUERY, WANTBH, WANTQ, WANTS, WANTSP 355 INTEGER ICOFFT12, ICTXT, IDUM1, IDUM2, IERR, ILOC1, 356 $ IPW1, ITER, ITT, JLOC1, JTT, K, LIWMIN, LLDT, 357 $ LLDQ, LWMIN, MMAX, MMIN, MYROW, MYCOL, N1, N2, 358 $ NB, NOEXSY, NPCOL, NPROCS, NPROW, SPACE, 359 $ T12ROWS, T12COLS, TCOLS, TCSRC, TROWS, TRSRC, 360 $ WRK1, IWRK1, WRK2, IWRK2, WRK3, IWRK3 361 DOUBLE PRECISION DPDUM1, ELEM, EST, SCALE, RNORM 362* .. Local Arrays .. 363 INTEGER DESCT12( DLEN_ ), MBNB2( 2 ) 364* .. 365* .. External Functions .. 366 LOGICAL LSAME 367 INTEGER NUMROC 368 DOUBLE PRECISION PDLANGE 369 EXTERNAL LSAME, NUMROC, PDLANGE 370* .. 371* .. External Subroutines .. 372 EXTERNAL BLACS_GRIDINFO, CHK1MAT, DESCINIT, 373 $ IGAMX2D, INFOG2L, PDLACPY, PDTRORD, PXERBLA, 374 $ PCHK1MAT, PCHK2MAT 375* $ , PGESYCTD, PSYCTCON 376* .. 377* .. Intrinsic Functions .. 378 INTRINSIC MAX, MIN, SQRT 379* .. 380* .. Executable Statements .. 381* 382* Get grid parameters 383* 384 ICTXT = DESCT( CTXT_ ) 385 CALL BLACS_GRIDINFO( ICTXT, NPROW, NPCOL, MYROW, MYCOL ) 386 NPROCS = NPROW*NPCOL 387* 388* Test if grid is O.K., i.e., the context is valid 389* 390 INFO = 0 391 IF( NPROW.EQ.-1 ) THEN 392 INFO = N+1 393 END IF 394* 395* Check if workspace 396* 397 LQUERY = LWORK.EQ.-1 .OR. LIWORK.EQ.-1 398* 399* Test dimensions for local sanity 400* 401 IF( INFO.EQ.0 ) THEN 402 CALL CHK1MAT( N, 5, N, 5, IT, JT, DESCT, 9, INFO ) 403 END IF 404 IF( INFO.EQ.0 ) THEN 405 CALL CHK1MAT( N, 5, N, 5, IQ, JQ, DESCQ, 13, INFO ) 406 END IF 407* 408* Check the blocking sizes for alignment requirements 409* 410 IF( INFO.EQ.0 ) THEN 411 IF( DESCT( MB_ ).NE.DESCT( NB_ ) ) INFO = -(1000*9 + MB_) 412 END IF 413 IF( INFO.EQ.0 ) THEN 414 IF( DESCQ( MB_ ).NE.DESCQ( NB_ ) ) INFO = -(1000*13 + MB_) 415 END IF 416 IF( INFO.EQ.0 ) THEN 417 IF( DESCT( MB_ ).NE.DESCQ( MB_ ) ) INFO = -(1000*9 + MB_) 418 END IF 419* 420* Check the blocking sizes for minimum sizes 421* 422 IF( INFO.EQ.0 ) THEN 423 IF( N.NE.DESCT( MB_ ) .AND. DESCT( MB_ ).LT.3 ) 424 $ INFO = -(1000*9 + MB_) 425 IF( N.NE.DESCQ( MB_ ) .AND. DESCQ( MB_ ).LT.3 ) 426 $ INFO = -(1000*13 + MB_) 427 END IF 428* 429* Check parameters in PARA 430* 431 NB = DESCT( MB_ ) 432 IF( INFO.EQ.0 ) THEN 433 IF( PARA(1).LT.1 .OR. PARA(1).GT.MIN(NPROW,NPCOL) ) 434 $ INFO = -(1000 * 4 + 1) 435 IF( PARA(2).LT.1 .OR. PARA(2).GE.PARA(3) ) 436 $ INFO = -(1000 * 4 + 2) 437 IF( PARA(3).LT.1 .OR. PARA(3).GT.NB ) 438 $ INFO = -(1000 * 4 + 3) 439 IF( PARA(4).LT.0 .OR. PARA(4).GT.100 ) 440 $ INFO = -(1000 * 4 + 4) 441 IF( PARA(5).LT.1 .OR. PARA(5).GT.NB ) 442 $ INFO = -(1000 * 4 + 5) 443 IF( PARA(6).LT.1 .OR. PARA(6).GT.PARA(2) ) 444 $ INFO = -(1000 * 4 + 6) 445 END IF 446* 447* Check requirements on IT, JT, IQ and JQ 448* 449 IF( INFO.EQ.0 ) THEN 450 IF( IT.NE.1 ) INFO = -7 451 IF( JT.NE.IT ) INFO = -8 452 IF( IQ.NE.1 ) INFO = -11 453 IF( JQ.NE.IQ ) INFO = -12 454 END IF 455* 456* Test input parameters for global sanity 457* 458 IF( INFO.EQ.0 ) THEN 459 CALL PCHK1MAT( N, 5, N, 5, IT, JT, DESCT, 9, 0, IDUM1, 460 $ IDUM2, INFO ) 461 END IF 462 IF( INFO.EQ.0 ) THEN 463 CALL PCHK1MAT( N, 5, N, 5, IQ, JQ, DESCQ, 13, 0, IDUM1, 464 $ IDUM2, INFO ) 465 END IF 466 IF( INFO.EQ.0 ) THEN 467 CALL PCHK2MAT( N, 5, N, 5, IT, JT, DESCT, 9, N, 5, N, 5, 468 $ IQ, JQ, DESCQ, 13, 0, IDUM1, IDUM2, INFO ) 469 END IF 470* 471* Decode and test the input parameters 472* 473 IF( INFO.EQ.0 .OR. LQUERY ) THEN 474 WANTBH = LSAME( JOB, 'B' ) 475 WANTS = LSAME( JOB, 'E' ) .OR. WANTBH 476 WANTSP = LSAME( JOB, 'V' ) .OR. WANTBH 477 WANTQ = LSAME( COMPQ, 'V' ) 478* 479 IF( .NOT.LSAME( JOB, 'N' ) .AND. .NOT.WANTS .AND. .NOT.WANTSP ) 480 $ THEN 481 INFO = -1 482 ELSEIF( .NOT.LSAME( COMPQ, 'N' ) .AND. .NOT.WANTQ ) THEN 483 INFO = -2 484 ELSEIF( N.LT.0 ) THEN 485 INFO = -4 486 ELSE 487* 488* Extract local leading dimension 489* 490 LLDT = DESCT( LLD_ ) 491 LLDQ = DESCQ( LLD_ ) 492* 493* Check the SELECT vector for consistency and set M to the 494* dimension of the specified invariant subspace. 495* 496 M = 0 497 DO 10 K = 1, N 498* 499* IWORK(1:N) is an integer copy of SELECT. 500* 501 IF( SELECT(K) ) THEN 502 IWORK(K) = 1 503 ELSE 504 IWORK(K) = 0 505 END IF 506 IF( K.LT.N ) THEN 507 CALL INFOG2L( K+1, K, DESCT, NPROW, NPCOL, 508 $ MYROW, MYCOL, ITT, JTT, TRSRC, TCSRC ) 509 IF( MYROW.EQ.TRSRC .AND. MYCOL.EQ.TCSRC ) THEN 510 ELEM = T( (JTT-1)*LLDT + ITT ) 511 IF( ELEM.NE.ZERO ) THEN 512 IF( SELECT(K) .AND. .NOT.SELECT(K+1) ) THEN 513* INFO = -3 514 IWORK(K+1) = 1 515 ELSEIF( .NOT.SELECT(K) .AND. SELECT(K+1) ) THEN 516* INFO = -3 517 IWORK(K) = 1 518 END IF 519 END IF 520 END IF 521 END IF 522 IF( SELECT(K) ) M = M + 1 523 10 CONTINUE 524 MMAX = M 525 MMIN = M 526 IF( NPROCS.GT.1 ) 527 $ CALL IGAMX2D( ICTXT, 'All', TOP, 1, 1, MMAX, 1, -1, 528 $ -1, -1, -1, -1 ) 529 IF( NPROCS.GT.1 ) 530 $ CALL IGAMN2D( ICTXT, 'All', TOP, 1, 1, MMIN, 1, -1, 531 $ -1, -1, -1, -1 ) 532 IF( MMAX.GT.MMIN ) THEN 533 M = MMAX 534 IF( NPROCS.GT.1 ) 535 $ CALL IGAMX2D( ICTXT, 'All', TOP, N, 1, IWORK, N, 536 $ -1, -1, -1, -1, -1 ) 537 END IF 538* 539* Set parameters for deep pipelining in parallel 540* Sylvester solver. 541* 542 MBNB2( 1 ) = MIN( MAX( PARA( 3 ), PARA( 2 )*2 ), NB ) 543 MBNB2( 2 ) = MBNB2( 1 ) 544* 545* Compute needed workspace 546* 547 N1 = M 548 N2 = N - M 549 IF( WANTS ) THEN 550c CALL PGESYCTD( 'Solve', 'Schur', 'Schur', 'Notranspose', 551c $ 'Notranspose', -1, 'Demand', N1, N2, T, 1, 1, DESCT, 552c $ T, N1+1, N1+1, DESCT, T, 1, N1+1, DESCT, MBNB2, 553c $ WORK, -1, IWORK(N+1), -1, NOEXSY, SCALE, IERR ) 554 WRK1 = INT(WORK(1)) 555 IWRK1 = IWORK(N+1) 556 WRK1 = 0 557 IWRK1 = 0 558 ELSE 559 WRK1 = 0 560 IWRK1 = 0 561 END IF 562* 563 IF( WANTSP ) THEN 564c CALL PSYCTCON( 'Notranspose', 'Notranspose', -1, 565c $ 'Demand', N1, N2, T, 1, 1, DESCT, T, N1+1, N1+1, 566c $ DESCT, MBNB2, WORK, -1, IWORK(N+1), -1, EST, ITER, 567c $ IERR ) 568 WRK2 = INT(WORK(1)) 569 IWRK2 = IWORK(N+1) 570 WRK2 = 0 571 IWRK2 = 0 572 ELSE 573 WRK2 = 0 574 IWRK2 = 0 575 END IF 576* 577 TROWS = NUMROC( N, NB, MYROW, DESCT(RSRC_), NPROW ) 578 TCOLS = NUMROC( N, NB, MYCOL, DESCT(CSRC_), NPCOL ) 579 WRK3 = N + 7*NB**2 + 2*TROWS*PARA( 3 ) + TCOLS*PARA( 3 ) + 580 $ MAX( TROWS*PARA( 3 ), TCOLS*PARA( 3 ) ) 581 IWRK3 = 5*PARA( 1 ) + PARA(2)*PARA(3) - 582 $ PARA(2) * (PARA(2) + 1 ) / 2 583* 584 IF( WANTSP ) THEN 585 LWMIN = MAX( 1, MAX( WRK2, WRK3) ) 586 LIWMIN = MAX( 1, MAX( IWRK2, IWRK3 ) )+N 587 ELSE IF( LSAME( JOB, 'N' ) ) THEN 588 LWMIN = MAX( 1, WRK3 ) 589 LIWMIN = IWRK3+N 590 ELSE IF( LSAME( JOB, 'E' ) ) THEN 591 LWMIN = MAX( 1, MAX( WRK1, WRK3) ) 592 LIWMIN = MAX( 1, MAX( IWRK1, IWRK3 ) )+N 593 END IF 594* 595 IF( LWORK.LT.LWMIN .AND. .NOT.LQUERY ) THEN 596 INFO = -20 597 ELSE IF( LIWORK.LT.LIWMIN .AND. .NOT.LQUERY ) THEN 598 INFO = -22 599 END IF 600 END IF 601 END IF 602* 603* Global maximum on info 604* 605 IF( NPROCS.GT.1 ) 606 $ CALL IGAMX2D( ICTXT, 'All', TOP, 1, 1, INFO, 1, -1, -1, -1, 607 $ -1, -1 ) 608* 609* Return if some argument is incorrect 610* 611 IF( INFO.NE.0 .AND. .NOT.LQUERY ) THEN 612 M = 0 613 S = ONE 614 SEP = ZERO 615 CALL PXERBLA( ICTXT, 'PDTRSEN', -INFO ) 616 RETURN 617 ELSEIF( LQUERY ) THEN 618 WORK( 1 ) = DBLE(LWMIN) 619 IWORK( 1 ) = LIWMIN 620 RETURN 621 END IF 622* 623* Quick return if possible. 624* 625 IF( M.EQ.N .OR. M.EQ.0 ) THEN 626 IF( WANTS ) 627 $ S = ONE 628 IF( WANTSP ) 629 $ SEP = PDLANGE( '1', N, N, T, IT, JT, DESCT, WORK ) 630 GO TO 50 631 END IF 632* 633* Reorder the eigenvalues. 634* 635 CALL PDTRORD( COMPQ, IWORK, PARA, N, T, IT, JT, 636 $ DESCT, Q, IQ, JQ, DESCQ, WR, WI, M, WORK, LWORK, 637 $ IWORK(N+1), LIWORK-N, INFO ) 638* 639 IF( WANTS ) THEN 640* 641* Solve Sylvester equation T11*R - R*T2 = scale*T12 for R in 642* parallel. 643* 644* Copy T12 to workspace. 645* 646 CALL INFOG2L( 1, N1+1, DESCT, NPROW, NPCOL, MYROW, 647 $ MYCOL, ILOC1, JLOC1, TRSRC, TCSRC ) 648 ICOFFT12 = MOD( N1, NB ) 649 T12ROWS = NUMROC( N1, NB, MYROW, TRSRC, NPROW ) 650 T12COLS = NUMROC( N2+ICOFFT12, NB, MYCOL, TCSRC, NPCOL ) 651 CALL DESCINIT( DESCT12, N1, N2+ICOFFT12, NB, NB, TRSRC, 652 $ TCSRC, ICTXT, MAX(1,T12ROWS), IERR ) 653 CALL PDLACPY( 'All', N1, N2, T, 1, N1+1, DESCT, WORK, 654 $ 1, 1+ICOFFT12, DESCT12 ) 655* 656* Solve the equation to get the solution in workspace. 657* 658 SPACE = DESCT12( LLD_ ) * T12COLS 659 IPW1 = 1 + SPACE 660c CALL PGESYCTD( 'Solve', 'Schur', 'Schur', 'Notranspose', 661c $ 'Notranspose', -1, 'Demand', N1, N2, T, 1, 1, DESCT, T, 662c $ N1+1, N1+1, DESCT, WORK, 1, 1+ICOFFT12, DESCT12, MBNB2, 663c $ WORK(IPW1), LWORK-SPACE+1, IWORK(N+1), LIWORK-N, NOEXSY, 664c $ SCALE, IERR ) 665 IF( IERR.LT.0 ) THEN 666 INFO = N+3 667 ELSE 668 INFO = N+2 669 END IF 670* 671* Estimate the reciprocal of the condition number of the cluster 672* of eigenvalues. 673* 674 RNORM = PDLANGE( 'Frobenius', N1, N2, WORK, 1, 1+ICOFFT12, 675 $ DESCT12, DPDUM1 ) 676 IF( RNORM.EQ.ZERO ) THEN 677 S = ONE 678 ELSE 679 S = SCALE / ( SQRT( SCALE*SCALE / RNORM+RNORM )* 680 $ SQRT( RNORM ) ) 681 END IF 682 END IF 683* 684 IF( WANTSP ) THEN 685* 686* Estimate sep(T11,T21) in parallel. 687* 688c CALL PSYCTCON( 'Notranspose', 'Notranspose', -1, 'Demand', N1, 689c $ N2, T, 1, 1, DESCT, T, N1+1, N1+1, DESCT, MBNB2, WORK, 690c $ LWORK, IWORK(N+1), LIWORK-N, EST, ITER, IERR ) 691 EST = EST * SQRT(DBLE(N1*N2)) 692 SEP = ONE / EST 693 IF( IERR.LT.0 ) THEN 694 INFO = N+4 695 ELSE 696 INFO = N+2 697 END IF 698 END IF 699* 700* Return to calling program. 701* 702 50 CONTINUE 703* 704 RETURN 705* 706* End of PDTRSEN 707* 708 END 709* 710