1 SUBROUTINE PDLARFG( N, ALPHA, IAX, JAX, X, IX, JX, DESCX, INCX, 2 $ TAU ) 3* 4* -- ScaLAPACK auxiliary routine (version 1.7) -- 5* University of Tennessee, Knoxville, Oak Ridge National Laboratory, 6* and University of California, Berkeley. 7* May 1, 1997 8* 9* .. Scalar Arguments .. 10 INTEGER IAX, INCX, IX, JAX, JX, N 11 DOUBLE PRECISION ALPHA 12* .. 13* .. Array Arguments .. 14 INTEGER DESCX( * ) 15 DOUBLE PRECISION TAU( * ), X( * ) 16* .. 17* 18* Purpose 19* ======= 20* 21* PDLARFG generates a real elementary reflector H of order n, such 22* that 23* 24* H * sub( X ) = H * ( x(iax,jax) ) = ( alpha ), H' * H = I. 25* ( x ) ( 0 ) 26* 27* where alpha is a scalar, and sub( X ) is an (N-1)-element real 28* distributed vector X(IX:IX+N-2,JX) if INCX = 1 and X(IX,JX:JX+N-2) if 29* INCX = DESCX(M_). H is represented in the form 30* 31* H = I - tau * ( 1 ) * ( 1 v' ) , 32* ( v ) 33* 34* where tau is a real scalar and v is a real (N-1)-element 35* vector. 36* 37* If the elements of sub( X ) are all zero, then tau = 0 and H is 38* taken to be the unit matrix. 39* 40* Otherwise 1 <= tau <= 2. 41* 42* Notes 43* ===== 44* 45* Each global data object is described by an associated description 46* vector. This vector stores the information required to establish 47* the mapping between an object element and its corresponding process 48* and memory location. 49* 50* Let A be a generic term for any 2D block cyclicly distributed array. 51* Such a global array has an associated description vector DESCA. 52* In the following comments, the character _ should be read as 53* "of the global array". 54* 55* NOTATION STORED IN EXPLANATION 56* --------------- -------------- -------------------------------------- 57* DTYPE_A(global) DESCA( DTYPE_ )The descriptor type. In this case, 58* DTYPE_A = 1. 59* CTXT_A (global) DESCA( CTXT_ ) The BLACS context handle, indicating 60* the BLACS process grid A is distribu- 61* ted over. The context itself is glo- 62* bal, but the handle (the integer 63* value) may vary. 64* M_A (global) DESCA( M_ ) The number of rows in the global 65* array A. 66* N_A (global) DESCA( N_ ) The number of columns in the global 67* array A. 68* MB_A (global) DESCA( MB_ ) The blocking factor used to distribute 69* the rows of the array. 70* NB_A (global) DESCA( NB_ ) The blocking factor used to distribute 71* the columns of the array. 72* RSRC_A (global) DESCA( RSRC_ ) The process row over which the first 73* row of the array A is distributed. 74* CSRC_A (global) DESCA( CSRC_ ) The process column over which the 75* first column of the array A is 76* distributed. 77* LLD_A (local) DESCA( LLD_ ) The leading dimension of the local 78* array. LLD_A >= MAX(1,LOCr(M_A)). 79* 80* Let K be the number of rows or columns of a distributed matrix, 81* and assume that its process grid has dimension p x q. 82* LOCr( K ) denotes the number of elements of K that a process 83* would receive if K were distributed over the p processes of its 84* process column. 85* Similarly, LOCc( K ) denotes the number of elements of K that a 86* process would receive if K were distributed over the q processes of 87* its process row. 88* The values of LOCr() and LOCc() may be determined via a call to the 89* ScaLAPACK tool function, NUMROC: 90* LOCr( M ) = NUMROC( M, MB_A, MYROW, RSRC_A, NPROW ), 91* LOCc( N ) = NUMROC( N, NB_A, MYCOL, CSRC_A, NPCOL ). 92* An upper bound for these quantities may be computed by: 93* LOCr( M ) <= ceil( ceil(M/MB_A)/NPROW )*MB_A 94* LOCc( N ) <= ceil( ceil(N/NB_A)/NPCOL )*NB_A 95* 96* Because vectors may be viewed as a subclass of matrices, a 97* distributed vector is considered to be a distributed matrix. 98* 99* Arguments 100* ========= 101* 102* N (global input) INTEGER 103* The global order of the elementary reflector. N >= 0. 104* 105* ALPHA (local output) DOUBLE PRECISION 106* On exit, alpha is computed in the process scope having the 107* vector sub( X ). 108* 109* IAX (global input) INTEGER 110* The global row index in X of X(IAX,JAX). 111* 112* JAX (global input) INTEGER 113* The global column index in X of X(IAX,JAX). 114* 115* X (local input/local output) DOUBLE PRECISION, pointer into the 116* local memory to an array of dimension (LLD_X,*). This array 117* contains the local pieces of the distributed vector sub( X ). 118* Before entry, the incremented array sub( X ) must contain 119* the vector x. On exit, it is overwritten with the vector v. 120* 121* IX (global input) INTEGER 122* The row index in the global array X indicating the first 123* row of sub( X ). 124* 125* JX (global input) INTEGER 126* The column index in the global array X indicating the 127* first column of sub( X ). 128* 129* DESCX (global and local input) INTEGER array of dimension DLEN_. 130* The array descriptor for the distributed matrix X. 131* 132* INCX (global input) INTEGER 133* The global increment for the elements of X. Only two values 134* of INCX are supported in this version, namely 1 and M_X. 135* INCX must not be zero. 136* 137* TAU (local output) DOUBLE PRECISION array, dimension LOCc(JX) 138* if INCX = 1, and LOCr(IX) otherwise. This array contains the 139* Householder scalars related to the Householder vectors. 140* TAU is tied to the distributed matrix X. 141* 142* ===================================================================== 143* 144* .. Parameters .. 145 INTEGER BLOCK_CYCLIC_2D, CSRC_, CTXT_, DLEN_, DTYPE_, 146 $ LLD_, MB_, M_, NB_, N_, RSRC_ 147 PARAMETER ( BLOCK_CYCLIC_2D = 1, DLEN_ = 9, DTYPE_ = 1, 148 $ CTXT_ = 2, M_ = 3, N_ = 4, MB_ = 5, NB_ = 6, 149 $ RSRC_ = 7, CSRC_ = 8, LLD_ = 9 ) 150 DOUBLE PRECISION ONE, ZERO 151 PARAMETER ( ONE = 1.0D+0, ZERO = 0.0D+0 ) 152* .. 153* .. Local Scalars .. 154 INTEGER ICTXT, IIAX, INDXTAU, IXCOL, IXROW, J, JJAX, 155 $ KNT, MYCOL, MYROW, NPCOL, NPROW 156 DOUBLE PRECISION BETA, RSAFMN, SAFMIN, XNORM 157* .. 158* .. External Subroutines .. 159 EXTERNAL BLACS_GRIDINFO, DGEBR2D, DGEBS2D, PDSCAL, 160 $ INFOG2L, PDNRM2 161* .. 162* .. External Functions .. 163 DOUBLE PRECISION DLAMCH, DLAPY2 164 EXTERNAL DLAMCH, DLAPY2 165* .. 166* .. Intrinsic Functions .. 167 INTRINSIC ABS, SIGN 168* .. 169* .. Executable Statements .. 170* 171* Get grid parameters. 172* 173 ICTXT = DESCX( CTXT_ ) 174 CALL BLACS_GRIDINFO( ICTXT, NPROW, NPCOL, MYROW, MYCOL ) 175* 176 IF( INCX.EQ.DESCX( M_ ) ) THEN 177* 178* sub( X ) is distributed across a process row. 179* 180 CALL INFOG2L( IX, JAX, DESCX, NPROW, NPCOL, MYROW, MYCOL, 181 $ IIAX, JJAX, IXROW, IXCOL ) 182* 183 IF( MYROW.NE.IXROW ) 184 $ RETURN 185* 186* Broadcast X(IAX,JAX) across the process row. 187* 188 IF( MYCOL.EQ.IXCOL ) THEN 189 J = IIAX+(JJAX-1)*DESCX( LLD_ ) 190 CALL DGEBS2D( ICTXT, 'Rowwise', ' ', 1, 1, X( J ), 1 ) 191 ALPHA = X( J ) 192 ELSE 193 CALL DGEBR2D( ICTXT, 'Rowwise', ' ', 1, 1, ALPHA, 1, 194 $ MYROW, IXCOL ) 195 END IF 196* 197 INDXTAU = IIAX 198* 199 ELSE 200* 201* sub( X ) is distributed across a process column. 202* 203 CALL INFOG2L( IAX, JX, DESCX, NPROW, NPCOL, MYROW, MYCOL, 204 $ IIAX, JJAX, IXROW, IXCOL ) 205* 206 IF( MYCOL.NE.IXCOL ) 207 $ RETURN 208* 209* Broadcast X(IAX,JAX) across the process column. 210* 211 IF( MYROW.EQ.IXROW ) THEN 212 J = IIAX+(JJAX-1)*DESCX( LLD_ ) 213 CALL DGEBS2D( ICTXT, 'Columnwise', ' ', 1, 1, X( J ), 1 ) 214 ALPHA = X( J ) 215 ELSE 216 CALL DGEBR2D( ICTXT, 'Columnwise', ' ', 1, 1, ALPHA, 1, 217 $ IXROW, MYCOL ) 218 END IF 219* 220 INDXTAU = JJAX 221* 222 END IF 223* 224 IF( N.LE.0 ) THEN 225 TAU( INDXTAU ) = ZERO 226 RETURN 227 END IF 228* 229 CALL PDNRM2( N-1, XNORM, X, IX, JX, DESCX, INCX ) 230* 231 IF( XNORM.EQ.ZERO ) THEN 232* 233* H = I 234* 235 TAU( INDXTAU ) = ZERO 236* 237 ELSE 238* 239* General case 240* 241 BETA = -SIGN( DLAPY2( ALPHA, XNORM ), ALPHA ) 242 SAFMIN = DLAMCH( 'S' ) 243 RSAFMN = ONE / SAFMIN 244 IF( ABS( BETA ).LT.SAFMIN ) THEN 245* 246* XNORM, BETA may be inaccurate; scale X and recompute them 247* 248 KNT = 0 249 10 CONTINUE 250 KNT = KNT + 1 251 CALL PDSCAL( N-1, RSAFMN, X, IX, JX, DESCX, INCX ) 252 BETA = BETA*RSAFMN 253 ALPHA = ALPHA*RSAFMN 254 IF( ABS( BETA ).LT.SAFMIN ) 255 $ GO TO 10 256* 257* New BETA is at most 1, at least SAFMIN 258* 259 CALL PDNRM2( N-1, XNORM, X, IX, JX, DESCX, INCX ) 260 BETA = -SIGN( DLAPY2( ALPHA, XNORM ), ALPHA ) 261 TAU( INDXTAU ) = ( BETA-ALPHA ) / BETA 262 CALL PDSCAL( N-1, ONE/(ALPHA-BETA), X, IX, JX, DESCX, INCX ) 263* 264* If ALPHA is subnormal, it may lose relative accuracy 265* 266 ALPHA = BETA 267 DO 20 J = 1, KNT 268 ALPHA = ALPHA*SAFMIN 269 20 CONTINUE 270 ELSE 271 TAU( INDXTAU ) = ( BETA-ALPHA ) / BETA 272 CALL PDSCAL( N-1, ONE/(ALPHA-BETA), X, IX, JX, DESCX, INCX ) 273 ALPHA = BETA 274 END IF 275 END IF 276* 277 RETURN 278* 279* End of PDLARFG 280* 281 END 282