1 SUBROUTINE PDLACON( N, V, IV, JV, DESCV, X, IX, JX, DESCX, ISGN, 2 $ EST, KASE ) 3* 4* -- ScaLAPACK auxiliary routine (version 1.7) -- 5* University of Tennessee, Knoxville, Oak Ridge National Laboratory, 6* and University of California, Berkeley. 7* May 1, 1997 8* 9* .. Scalar Arguments .. 10 INTEGER IV, IX, JV, JX, KASE, N 11 DOUBLE PRECISION EST 12* .. 13* .. Array Arguments .. 14 INTEGER DESCV( * ), DESCX( * ), ISGN( * ) 15 DOUBLE PRECISION V( * ), X( * ) 16* .. 17* 18* Purpose 19* ======= 20* 21* PDLACON estimates the 1-norm of a square, real distributed matrix A. 22* Reverse communication is used for evaluating matrix-vector products. 23* X and V are aligned with the distributed matrix A, this information 24* is implicitly contained within IV, IX, DESCV, and DESCX. 25* 26* Notes 27* ===== 28* 29* Each global data object is described by an associated description 30* vector. This vector stores the information required to establish 31* the mapping between an object element and its corresponding process 32* and memory location. 33* 34* Let A be a generic term for any 2D block cyclicly distributed array. 35* Such a global array has an associated description vector DESCA. 36* In the following comments, the character _ should be read as 37* "of the global array". 38* 39* NOTATION STORED IN EXPLANATION 40* --------------- -------------- -------------------------------------- 41* DTYPE_A(global) DESCA( DTYPE_ )The descriptor type. In this case, 42* DTYPE_A = 1. 43* CTXT_A (global) DESCA( CTXT_ ) The BLACS context handle, indicating 44* the BLACS process grid A is distribu- 45* ted over. The context itself is glo- 46* bal, but the handle (the integer 47* value) may vary. 48* M_A (global) DESCA( M_ ) The number of rows in the global 49* array A. 50* N_A (global) DESCA( N_ ) The number of columns in the global 51* array A. 52* MB_A (global) DESCA( MB_ ) The blocking factor used to distribute 53* the rows of the array. 54* NB_A (global) DESCA( NB_ ) The blocking factor used to distribute 55* the columns of the array. 56* RSRC_A (global) DESCA( RSRC_ ) The process row over which the first 57* row of the array A is distributed. 58* CSRC_A (global) DESCA( CSRC_ ) The process column over which the 59* first column of the array A is 60* distributed. 61* LLD_A (local) DESCA( LLD_ ) The leading dimension of the local 62* array. LLD_A >= MAX(1,LOCr(M_A)). 63* 64* Let K be the number of rows or columns of a distributed matrix, 65* and assume that its process grid has dimension p x q. 66* LOCr( K ) denotes the number of elements of K that a process 67* would receive if K were distributed over the p processes of its 68* process column. 69* Similarly, LOCc( K ) denotes the number of elements of K that a 70* process would receive if K were distributed over the q processes of 71* its process row. 72* The values of LOCr() and LOCc() may be determined via a call to the 73* ScaLAPACK tool function, NUMROC: 74* LOCr( M ) = NUMROC( M, MB_A, MYROW, RSRC_A, NPROW ), 75* LOCc( N ) = NUMROC( N, NB_A, MYCOL, CSRC_A, NPCOL ). 76* An upper bound for these quantities may be computed by: 77* LOCr( M ) <= ceil( ceil(M/MB_A)/NPROW )*MB_A 78* LOCc( N ) <= ceil( ceil(N/NB_A)/NPCOL )*NB_A 79* 80* Arguments 81* ========= 82* 83* N (global input) INTEGER 84* The length of the distributed vectors V and X. N >= 0. 85* 86* V (local workspace) DOUBLE PRECISION pointer into the local 87* memory to an array of dimension LOCr(N+MOD(IV-1,MB_V)). On 88* the final return, V = A*W, where EST = norm(V)/norm(W) 89* (W is not returned). 90* 91* IV (global input) INTEGER 92* The row index in the global array V indicating the first 93* row of sub( V ). 94* 95* JV (global input) INTEGER 96* The column index in the global array V indicating the 97* first column of sub( V ). 98* 99* DESCV (global and local input) INTEGER array of dimension DLEN_. 100* The array descriptor for the distributed matrix V. 101* 102* X (local input/local output) DOUBLE PRECISION pointer into the 103* local memory to an array of dimension 104* LOCr(N+MOD(IX-1,MB_X)). On an intermediate return, X 105* should be overwritten by 106* A * X, if KASE=1, 107* A' * X, if KASE=2, 108* PDLACON must be re-called with all the other parameters 109* unchanged. 110* 111* IX (global input) INTEGER 112* The row index in the global array X indicating the first 113* row of sub( X ). 114* 115* JX (global input) INTEGER 116* The column index in the global array X indicating the 117* first column of sub( X ). 118* 119* DESCX (global and local input) INTEGER array of dimension DLEN_. 120* The array descriptor for the distributed matrix X. 121* 122* ISGN (local workspace) INTEGER array, dimension 123* LOCr(N+MOD(IX-1,MB_X)). ISGN is aligned with X and V. 124* 125* 126* EST (global output) DOUBLE PRECISION 127* An estimate (a lower bound) for norm(A). 128* 129* KASE (local input/local output) INTEGER 130* On the initial call to PDLACON, KASE should be 0. 131* On an intermediate return, KASE will be 1 or 2, indicating 132* whether X should be overwritten by A * X or A' * X. 133* On the final return from PDLACON, KASE will again be 0. 134* 135* Further Details 136* =============== 137* 138* The serial version DLACON has been contributed by Nick Higham, 139* University of Manchester. It was originally named SONEST, dated 140* March 16, 1988. 141* 142* Reference: N.J. Higham, "FORTRAN codes for estimating the one-norm of 143* a real or complex matrix, with applications to condition estimation", 144* ACM Trans. Math. Soft., vol. 14, no. 4, pp. 381-396, December 1988. 145* 146* ===================================================================== 147* 148* .. Parameters .. 149 INTEGER BLOCK_CYCLIC_2D, CSRC_, CTXT_, DLEN_, DTYPE_, 150 $ LLD_, MB_, M_, NB_, N_, RSRC_ 151 PARAMETER ( BLOCK_CYCLIC_2D = 1, DLEN_ = 9, DTYPE_ = 1, 152 $ CTXT_ = 2, M_ = 3, N_ = 4, MB_ = 5, NB_ = 6, 153 $ RSRC_ = 7, CSRC_ = 8, LLD_ = 9 ) 154 INTEGER ITMAX 155 PARAMETER ( ITMAX = 5 ) 156 DOUBLE PRECISION ZERO, ONE, TWO 157 PARAMETER ( ZERO = 0.0D+0, ONE = 1.0D+0, TWO = 2.0D+0 ) 158* .. 159* .. Local Scalars .. 160 INTEGER I, ICTXT, IFLAG, IIVX, IMAXROW, IOFFVX, IROFF, 161 $ ITER, IVXCOL, IVXROW, J, JLAST, JJVX, JUMP, 162 $ K, MYCOL, MYROW, NP, NPCOL, NPROW 163 DOUBLE PRECISION ALTSGN, ESTOLD, JLMAX, TEMP, XMAX 164* .. 165* .. Local Arrays .. 166 DOUBLE PRECISION WORK( 2 ) 167* .. 168* .. External Subroutines .. 169 EXTERNAL BLACS_GRIDINFO, DCOPY, DGEBR2D, DGEBS2D, 170 $ IGSUM2D, INFOG2L, PDAMAX, PDASUM, 171 $ PDELGET 172* .. 173* .. External Functions .. 174 INTEGER INDXG2L, INDXG2P, INDXL2G, NUMROC 175 EXTERNAL INDXG2L, INDXG2P, INDXL2G, NUMROC 176* .. 177* .. Intrinsic Functions .. 178 INTRINSIC ABS, DBLE, MOD, NINT, SIGN 179* .. 180* .. Save statement .. 181 SAVE 182* .. 183* .. Executable Statements .. 184* 185* Get grid parameters. 186* 187 ICTXT = DESCX( CTXT_ ) 188 CALL BLACS_GRIDINFO( ICTXT, NPROW, NPCOL, MYROW, MYCOL ) 189* 190 CALL INFOG2L( IX, JX, DESCX, NPROW, NPCOL, MYROW, MYCOL, 191 $ IIVX, JJVX, IVXROW, IVXCOL ) 192 IF( MYCOL.NE.IVXCOL ) 193 $ RETURN 194 IROFF = MOD( IX-1, DESCX( MB_ ) ) 195 NP = NUMROC( N+IROFF, DESCX( MB_ ), MYROW, IVXROW, NPROW ) 196 IF( MYROW.EQ.IVXROW ) 197 $ NP = NP - IROFF 198 IOFFVX = IIVX + (JJVX-1)*DESCX( LLD_ ) 199* 200 IF( KASE.EQ.0 ) THEN 201 DO 10 I = IOFFVX, IOFFVX+NP-1 202 X( I ) = ONE / DBLE( N ) 203 10 CONTINUE 204 KASE = 1 205 JUMP = 1 206 RETURN 207 END IF 208* 209 GO TO ( 20, 40, 70, 110, 140 )JUMP 210* 211* ................ ENTRY (JUMP = 1) 212* FIRST ITERATION. X HAS BEEN OVERWRITTEN BY A*X 213* 214 20 CONTINUE 215 IF( N.EQ.1 ) THEN 216 IF( MYROW.EQ.IVXROW ) THEN 217 V( IOFFVX ) = X( IOFFVX ) 218 EST = ABS( V( IOFFVX ) ) 219 CALL DGEBS2D( ICTXT, 'Columnwise', ' ', 1, 1, EST, 1 ) 220 ELSE 221 CALL DGEBR2D( ICTXT, 'Columnwise', ' ', 1, 1, EST, 1, 222 $ IVXROW, MYCOL ) 223 END IF 224* ... QUIT 225 GO TO 150 226 END IF 227 CALL PDASUM( N, EST, X, IX, JX, DESCX, 1 ) 228 IF( DESCX( M_ ).EQ.1 .AND. N.EQ.1 ) THEN 229 IF( MYROW.EQ.IVXROW ) THEN 230 CALL DGEBS2D( ICTXT, 'Columnwise', ' ', 1, 1, EST, 1 ) 231 ELSE 232 CALL DGEBR2D( ICTXT, 'Columnwise', ' ', 1, 1, EST, 1, 233 $ IVXROW, MYCOL ) 234 END IF 235 END IF 236* 237 DO 30 I = IOFFVX, IOFFVX+NP-1 238 X( I ) = SIGN( ONE, X( I ) ) 239 ISGN( I ) = NINT( X( I ) ) 240 30 CONTINUE 241 KASE = 2 242 JUMP = 2 243 RETURN 244* 245* ................ ENTRY (JUMP = 2) 246* FIRST ITERATION. X HAS BEEN OVERWRITTEN BY TRANSPOSE(A)*X 247* 248 40 CONTINUE 249 CALL PDAMAX( N, XMAX, J, X, IX, JX, DESCX, 1 ) 250 IF( DESCX( M_ ).EQ.1 .AND. N.EQ.1 ) THEN 251 IF( MYROW.EQ.IVXROW ) THEN 252 WORK( 1 ) = XMAX 253 WORK( 2 ) = DBLE( J ) 254 CALL DGEBS2D( ICTXT, 'Columnwise', ' ', 2, 1, WORK, 2 ) 255 ELSE 256 CALL DGEBR2D( ICTXT, 'Columnwise', ' ', 2, 1, WORK, 2, 257 $ IVXROW, MYCOL ) 258 XMAX = WORK( 1 ) 259 J = NINT( WORK( 2 ) ) 260 END IF 261 END IF 262 ITER = 2 263* 264* MAIN LOOP - ITERATIONS 2, 3,...,ITMAX 265* 266 50 CONTINUE 267 DO 60 I = IOFFVX, IOFFVX+NP-1 268 X( I ) = ZERO 269 60 CONTINUE 270 IMAXROW = INDXG2P( J, DESCX( MB_ ), MYROW, DESCX( RSRC_ ), NPROW ) 271 IF( MYROW.EQ.IMAXROW ) THEN 272 I = INDXG2L( J, DESCX( MB_ ), MYROW, DESCX( RSRC_ ), NPROW ) 273 X( I ) = ONE 274 END IF 275 KASE = 1 276 JUMP = 3 277 RETURN 278* 279* ................ ENTRY (JUMP = 3) 280* X HAS BEEN OVERWRITTEN BY A*X 281* 282 70 CONTINUE 283 CALL DCOPY( NP, X( IOFFVX ), 1, V( IOFFVX ), 1 ) 284 ESTOLD = EST 285 CALL PDASUM( N, EST, V, IV, JV, DESCV, 1 ) 286 IF( DESCV( M_ ).EQ.1 .AND. N.EQ.1 ) THEN 287 IF( MYROW.EQ.IVXROW ) THEN 288 CALL DGEBS2D( ICTXT, 'Columnwise', ' ', 1, 1, EST, 1 ) 289 ELSE 290 CALL DGEBR2D( ICTXT, 'Columnwise', ' ', 1, 1, EST, 1, 291 $ IVXROW, MYCOL ) 292 END IF 293 END IF 294 IFLAG = 0 295 DO 80 I = IOFFVX, IOFFVX+NP-1 296 IF( NINT( SIGN( ONE, X( I ) ) ).NE.ISGN( I ) ) THEN 297 IFLAG = 1 298 GO TO 90 299 END IF 300 80 CONTINUE 301* 302 90 CONTINUE 303 CALL IGSUM2D( ICTXT, 'C', ' ', 1, 1, IFLAG, 1, -1, MYCOL ) 304* 305* REPEATED SIGN VECTOR DETECTED, HENCE ALGORITHM HAS CONVERGED. 306* ALONG WITH IT, TEST FOR CYCLING. 307* 308 IF( IFLAG.EQ.0 .OR. EST.LE.ESTOLD ) 309 $ GO TO 120 310* 311 DO 100 I = IOFFVX, IOFFVX+NP-1 312 X( I ) = SIGN( ONE, X( I ) ) 313 ISGN( I ) = NINT( X( I ) ) 314 100 CONTINUE 315 KASE = 2 316 JUMP = 4 317 RETURN 318* 319* ................ ENTRY (JUMP = 4) 320* X HAS BEEN OVERWRITTEN BY TRANSPOSE(A)*X 321* 322 110 CONTINUE 323 JLAST = J 324 CALL PDAMAX( N, XMAX, J, X, IX, JX, DESCX, 1 ) 325 IF( DESCX( M_ ).EQ.1 .AND. N.EQ.1 ) THEN 326 IF( MYROW.EQ.IVXROW ) THEN 327 WORK( 1 ) = XMAX 328 WORK( 2 ) = DBLE( J ) 329 CALL DGEBS2D( ICTXT, 'Columnwise', ' ', 2, 1, WORK, 2 ) 330 ELSE 331 CALL DGEBR2D( ICTXT, 'Columnwise', ' ', 2, 1, WORK, 2, 332 $ IVXROW, MYCOL ) 333 XMAX = WORK( 1 ) 334 J = NINT( WORK( 2 ) ) 335 END IF 336 END IF 337 CALL PDELGET( 'Columnwise', ' ', JLMAX, X, JLAST, JX, DESCX ) 338 IF( ( JLMAX.NE.ABS( XMAX ) ).AND.( ITER.LT.ITMAX ) ) THEN 339 ITER = ITER + 1 340 GO TO 50 341 END IF 342* 343* ITERATION COMPLETE. FINAL STAGE. 344* 345 120 CONTINUE 346 DO 130 I = IOFFVX, IOFFVX+NP-1 347 K = INDXL2G( I-IOFFVX+IIVX, DESCX( MB_ ), MYROW, 348 $ DESCX( RSRC_ ), NPROW )-IX+1 349 IF( MOD( K, 2 ).EQ.0 ) THEN 350 ALTSGN = -ONE 351 ELSE 352 ALTSGN = ONE 353 END IF 354 X( I ) = ALTSGN*( ONE+DBLE( K-1 ) / DBLE( N-1 ) ) 355 130 CONTINUE 356 KASE = 1 357 JUMP = 5 358 RETURN 359* 360* ................ ENTRY (JUMP = 5) 361* X HAS BEEN OVERWRITTEN BY A*X 362* 363 140 CONTINUE 364 CALL PDASUM( N, TEMP, X, IX, JX, DESCX, 1 ) 365 IF( DESCX( M_ ).EQ.1 .AND. N.EQ.1 ) THEN 366 IF( MYROW.EQ.IVXROW ) THEN 367 CALL DGEBS2D( ICTXT, 'Columnwise', ' ', 1, 1, TEMP, 1 ) 368 ELSE 369 CALL DGEBR2D( ICTXT, 'Columnwise', ' ', 1, 1, TEMP, 1, 370 $ IVXROW, MYCOL ) 371 END IF 372 END IF 373 TEMP = TWO*( TEMP / DBLE( 3*N ) ) 374 IF( TEMP.GT.EST ) THEN 375 CALL DCOPY( NP, X( IOFFVX ), 1, V( IOFFVX ), 1 ) 376 EST = TEMP 377 END IF 378* 379 150 CONTINUE 380 KASE = 0 381* 382 RETURN 383* 384* End of PDLACON 385* 386 END 387