1 SUBROUTINE PZLATRZ( M, N, L, A, IA, JA, DESCA, TAU, WORK ) 2* 3* -- ScaLAPACK routine (version 1.7) -- 4* University of Tennessee, Knoxville, Oak Ridge National Laboratory, 5* and University of California, Berkeley. 6* December 31, 1998 7* 8* .. Scalar Arguments .. 9 INTEGER IA, JA, L, M, N 10* .. 11* .. Array Arguments .. 12 INTEGER DESCA( * ) 13 COMPLEX*16 A( * ), TAU( * ), WORK( * ) 14* .. 15* 16* Purpose 17* ======= 18* 19* PZLATRZ reduces the M-by-N ( M<=N ) complex upper trapezoidal 20* matrix sub( A ) = [A(IA:IA+M-1,JA:JA+M-1) A(IA:IA+M-1,JA+N-L:JA+N-1)] 21* to upper triangular form by means of unitary transformations. 22* 23* The upper trapezoidal matrix sub( A ) is factored as 24* 25* sub( A ) = ( R 0 ) * Z, 26* 27* where Z is an N-by-N unitary matrix and R is an M-by-M upper 28* triangular matrix. 29* 30* Notes 31* ===== 32* 33* Each global data object is described by an associated description 34* vector. This vector stores the information required to establish 35* the mapping between an object element and its corresponding process 36* and memory location. 37* 38* Let A be a generic term for any 2D block cyclicly distributed array. 39* Such a global array has an associated description vector DESCA. 40* In the following comments, the character _ should be read as 41* "of the global array". 42* 43* NOTATION STORED IN EXPLANATION 44* --------------- -------------- -------------------------------------- 45* DTYPE_A(global) DESCA( DTYPE_ )The descriptor type. In this case, 46* DTYPE_A = 1. 47* CTXT_A (global) DESCA( CTXT_ ) The BLACS context handle, indicating 48* the BLACS process grid A is distribu- 49* ted over. The context itself is glo- 50* bal, but the handle (the integer 51* value) may vary. 52* M_A (global) DESCA( M_ ) The number of rows in the global 53* array A. 54* N_A (global) DESCA( N_ ) The number of columns in the global 55* array A. 56* MB_A (global) DESCA( MB_ ) The blocking factor used to distribute 57* the rows of the array. 58* NB_A (global) DESCA( NB_ ) The blocking factor used to distribute 59* the columns of the array. 60* RSRC_A (global) DESCA( RSRC_ ) The process row over which the first 61* row of the array A is distributed. 62* CSRC_A (global) DESCA( CSRC_ ) The process column over which the 63* first column of the array A is 64* distributed. 65* LLD_A (local) DESCA( LLD_ ) The leading dimension of the local 66* array. LLD_A >= MAX(1,LOCr(M_A)). 67* 68* Let K be the number of rows or columns of a distributed matrix, 69* and assume that its process grid has dimension p x q. 70* LOCr( K ) denotes the number of elements of K that a process 71* would receive if K were distributed over the p processes of its 72* process column. 73* Similarly, LOCc( K ) denotes the number of elements of K that a 74* process would receive if K were distributed over the q processes of 75* its process row. 76* The values of LOCr() and LOCc() may be determined via a call to the 77* ScaLAPACK tool function, NUMROC: 78* LOCr( M ) = NUMROC( M, MB_A, MYROW, RSRC_A, NPROW ), 79* LOCc( N ) = NUMROC( N, NB_A, MYCOL, CSRC_A, NPCOL ). 80* An upper bound for these quantities may be computed by: 81* LOCr( M ) <= ceil( ceil(M/MB_A)/NPROW )*MB_A 82* LOCc( N ) <= ceil( ceil(N/NB_A)/NPCOL )*NB_A 83* 84* Arguments 85* ========= 86* 87* M (global input) INTEGER 88* The number of rows to be operated on, i.e. the number of rows 89* of the distributed submatrix sub( A ). M >= 0. 90* 91* N (global input) INTEGER 92* The number of columns to be operated on, i.e. the number of 93* columns of the distributed submatrix sub( A ). N >= 0. 94* 95* L (global input) INTEGER 96* The columns of the distributed submatrix sub( A ) containing 97* the meaningful part of the Householder reflectors. L > 0. 98* 99* A (local input/local output) COMPLEX*16 pointer into the 100* local memory to an array of dimension (LLD_A, LOCc(JA+N-1)). 101* On entry, the local pieces of the M-by-N distributed matrix 102* sub( A ) which is to be factored. On exit, the leading M-by-M 103* upper triangular part of sub( A ) contains the upper trian- 104* gular matrix R, and elements N-L+1 to N of the first M rows 105* of sub( A ), with the array TAU, represent the unitary matrix 106* Z as a product of M elementary reflectors. 107* 108* IA (global input) INTEGER 109* The row index in the global array A indicating the first 110* row of sub( A ). 111* 112* JA (global input) INTEGER 113* The column index in the global array A indicating the 114* first column of sub( A ). 115* 116* DESCA (global and local input) INTEGER array of dimension DLEN_. 117* The array descriptor for the distributed matrix A. 118* 119* TAU (local output) COMPLEX*16, array, dimension LOCr(IA+M-1) 120* This array contains the scalar factors of the elementary 121* reflectors. TAU is tied to the distributed matrix A. 122* 123* WORK (local workspace) COMPLEX*16 array, dimension (LWORK) 124* LWORK >= Nq0 + MAX( 1, Mp0 ), where 125* 126* IROFF = MOD( IA-1, MB_A ), ICOFF = MOD( JA-1, NB_A ), 127* IAROW = INDXG2P( IA, MB_A, MYROW, RSRC_A, NPROW ), 128* IACOL = INDXG2P( JA, NB_A, MYCOL, CSRC_A, NPCOL ), 129* Mp0 = NUMROC( M+IROFF, MB_A, MYROW, IAROW, NPROW ), 130* Nq0 = NUMROC( N+ICOFF, NB_A, MYCOL, IACOL, NPCOL ), 131* 132* and NUMROC, INDXG2P are ScaLAPACK tool functions; 133* MYROW, MYCOL, NPROW and NPCOL can be determined by calling 134* the subroutine BLACS_GRIDINFO. 135* 136* Further Details 137* =============== 138* 139* The factorization is obtained by Householder's method. The kth 140* transformation matrix, Z( k ), whose conjugate transpose is used to 141* introduce zeros into the (m - k + 1)th row of sub( A ), is given in 142* the form 143* 144* Z( k ) = ( I 0 ), 145* ( 0 T( k ) ) 146* 147* where 148* 149* T( k ) = I - tau*u( k )*u( k )', u( k ) = ( 1 ), 150* ( 0 ) 151* ( z( k ) ) 152* 153* tau is a scalar and z( k ) is an ( n - m ) element vector. 154* tau and z( k ) are chosen to annihilate the elements of the kth row 155* of sub( A ). 156* 157* The scalar tau is returned in the kth element of TAU and the vector 158* u( k ) in the kth row of sub( A ), such that the elements of z( k ) 159* are in a( k, m + 1 ), ..., a( k, n ). The elements of R are returned 160* in the upper triangular part of sub( A ). 161* 162* Z is given by 163* 164* Z = Z( 1 ) * Z( 2 ) * ... * Z( m ). 165* 166* ===================================================================== 167* 168* .. Parameters .. 169 INTEGER BLOCK_CYCLIC_2D, CSRC_, CTXT_, DLEN_, DTYPE_, 170 $ LLD_, MB_, M_, NB_, N_, RSRC_ 171 PARAMETER ( BLOCK_CYCLIC_2D = 1, DLEN_ = 9, DTYPE_ = 1, 172 $ CTXT_ = 2, M_ = 3, N_ = 4, MB_ = 5, NB_ = 6, 173 $ RSRC_ = 7, CSRC_ = 8, LLD_ = 9 ) 174 COMPLEX*16 ONE, ZERO 175 PARAMETER ( ONE = ( 1.0D+0, 0.0D+0 ), 176 $ ZERO = ( 0.0D+0, 0.0D+0 ) ) 177* .. 178* .. Local Scalars .. 179 INTEGER I, IAROW, ICTXT, II, J, J1, MP, MYCOL, MYROW, 180 $ NPCOL, NPROW 181 COMPLEX*16 AII 182* .. 183* .. Local Arrays .. 184 INTEGER DESCTAU( DLEN_ ) 185* .. 186* .. External Subroutines .. 187 EXTERNAL DESCSET, INFOG1L, PZELSET, PZLACGV, 188 $ PZLARFG, PZLARZ 189* .. 190* .. External Functions .. 191 INTEGER NUMROC 192 EXTERNAL NUMROC 193* .. 194* .. Intrinsic Functions .. 195 INTRINSIC DCONJG, MAX 196* .. 197* .. Executable Statements .. 198* 199* Quick return if possible 200* 201 IF( M.EQ.0 .OR. N.EQ.0 ) 202 $ RETURN 203* 204* Get grid parameters 205* 206 ICTXT = DESCA( CTXT_ ) 207 CALL BLACS_GRIDINFO( ICTXT, NPROW, NPCOL, MYROW, MYCOL ) 208* 209 MP = NUMROC( IA+M-1, DESCA( MB_ ), MYROW, DESCA( RSRC_ ), 210 $ NPROW ) 211* 212 CALL DESCSET( DESCTAU, DESCA( M_ ), 1, DESCA( MB_ ), 1, 213 $ DESCA( RSRC_ ), MYCOL, ICTXT, MAX( 1, MP ) ) 214* 215 IF( M.EQ.N ) THEN 216* 217 CALL INFOG1L( IA, DESCA( MB_ ), NPROW, MYROW, DESCA( RSRC_ ), 218 $ II, IAROW ) 219 DO 10 I = II, MP 220 TAU( I ) = ZERO 221 10 CONTINUE 222* 223 ELSE 224* 225 AII = ZERO 226* 227 J1 = JA + N - L 228 DO 20 I = IA+M-1, IA, -1 229 J = JA + I - IA 230* 231* Generate elementary reflector H(i) to annihilate 232* [ A(i, j) A(i,j1:ja+n-1) ] 233* 234 CALL PZLACGV( 1, A, I, J, DESCA, DESCA( M_ ) ) 235 CALL PZLACGV( L, A, I, J1, DESCA, DESCA( M_ ) ) 236 CALL PZLARFG( L+1, AII, I, J, A, I, J1, DESCA, DESCA( M_ ), 237 $ TAU ) 238* 239* Apply H(i) to A(ia:i-1,j:ja+n-1) from the right 240* 241 CALL PZLARZ( 'Right', I-IA, JA+N-J, L, A, I, J1, DESCA, 242 $ DESCA( M_ ), TAU, A, IA, J, DESCA, WORK ) 243 CALL PZELSET( A, I, J, DESCA, DCONJG( AII ) ) 244* 245 20 CONTINUE 246* 247 CALL PZLACGV( M, TAU, IA, 1, DESCTAU, 1 ) 248* 249 END IF 250* 251 RETURN 252* 253* End of PZLATRZ 254* 255 END 256