1 SUBROUTINE PZMAX1( N, AMAX, INDX, X, IX, JX, DESCX, INCX ) 2* 3* -- ScaLAPACK auxiliary routine (version 1.7) -- 4* University of Tennessee, Knoxville, Oak Ridge National Laboratory, 5* and University of California, Berkeley. 6* May 1, 1997 7* 8* .. Scalar Arguments .. 9 INTEGER INDX, INCX, IX, JX, N 10 COMPLEX*16 AMAX 11* .. 12* .. Array Arguments .. 13 INTEGER DESCX( * ) 14 COMPLEX*16 X( * ) 15* .. 16* 17* Purpose 18* ======= 19* 20* PZMAX1 computes the global index of the maximum element in absolute 21* value of a distributed vector sub( X ). The global index is returned 22* in INDX and the value is returned in AMAX, 23* 24* where sub( X ) denotes X(IX:IX+N-1,JX) if INCX = 1, 25* X(IX,JX:JX+N-1) if INCX = M_X. 26* 27* Notes 28* ===== 29* 30* Each global data object is described by an associated description 31* vector. This vector stores the information required to establish 32* the mapping between an object element and its corresponding process 33* and memory location. 34* 35* Let A be a generic term for any 2D block cyclicly distributed array. 36* Such a global array has an associated description vector DESCA. 37* In the following comments, the character _ should be read as 38* "of the global array". 39* 40* NOTATION STORED IN EXPLANATION 41* --------------- -------------- -------------------------------------- 42* DTYPE_A(global) DESCA( DTYPE_ )The descriptor type. In this case, 43* DTYPE_A = 1. 44* CTXT_A (global) DESCA( CTXT_ ) The BLACS context handle, indicating 45* the BLACS process grid A is distribu- 46* ted over. The context itself is glo- 47* bal, but the handle (the integer 48* value) may vary. 49* M_A (global) DESCA( M_ ) The number of rows in the global 50* array A. 51* N_A (global) DESCA( N_ ) The number of columns in the global 52* array A. 53* MB_A (global) DESCA( MB_ ) The blocking factor used to distribute 54* the rows of the array. 55* NB_A (global) DESCA( NB_ ) The blocking factor used to distribute 56* the columns of the array. 57* RSRC_A (global) DESCA( RSRC_ ) The process row over which the first 58* row of the array A is distributed. 59* CSRC_A (global) DESCA( CSRC_ ) The process column over which the 60* first column of the array A is 61* distributed. 62* LLD_A (local) DESCA( LLD_ ) The leading dimension of the local 63* array. LLD_A >= MAX(1,LOCr(M_A)). 64* 65* Let K be the number of rows or columns of a distributed matrix, 66* and assume that its process grid has dimension p x q. 67* LOCr( K ) denotes the number of elements of K that a process 68* would receive if K were distributed over the p processes of its 69* process column. 70* Similarly, LOCc( K ) denotes the number of elements of K that a 71* process would receive if K were distributed over the q processes of 72* its process row. 73* The values of LOCr() and LOCc() may be determined via a call to the 74* ScaLAPACK tool function, NUMROC: 75* LOCr( M ) = NUMROC( M, MB_A, MYROW, RSRC_A, NPROW ), 76* LOCc( N ) = NUMROC( N, NB_A, MYCOL, CSRC_A, NPCOL ). 77* An upper bound for these quantities may be computed by: 78* LOCr( M ) <= ceil( ceil(M/MB_A)/NPROW )*MB_A 79* LOCc( N ) <= ceil( ceil(N/NB_A)/NPCOL )*NB_A 80* 81* Because vectors may be viewed as a subclass of matrices, a 82* distributed vector is considered to be a distributed matrix. 83* 84* When the result of a vector-oriented PBLAS call is a scalar, it will 85* be made available only within the scope which owns the vector(s) 86* being operated on. Let X be a generic term for the input vector(s). 87* Then, the processes which receive the answer will be (note that if 88* an operation involves more than one vector, the processes which re- 89* ceive the result will be the union of the following calculation for 90* each vector): 91* 92* If N = 1, M_X = 1 and INCX = 1, then one can't determine if a process 93* row or process column owns the vector operand, therefore only the 94* process of coordinate {RSRC_X, CSRC_X} receives the result; 95* 96* If INCX = M_X, then sub( X ) is a vector distributed over a process 97* row. Each process part of this row receives the result; 98* 99* If INCX = 1, then sub( X ) is a vector distributed over a process 100* column. Each process part of this column receives the result; 101* 102* Based on PZAMAX from Level 1 PBLAS. The change is to use the 103* 'genuine' absolute value. 104* 105* The serial version was contributed to LAPACK by Nick Higham for use 106* with ZLACON. 107* 108* Arguments 109* ========= 110* 111* N (global input) pointer to INTEGER 112* The number of components of the distributed vector sub( X ). 113* N >= 0. 114* 115* AMAX (global output) pointer to DOUBLE PRECISION 116* The absolute value of the largest entry of the distributed 117* vector sub( X ) only in the scope of sub( X ). 118* 119* INDX (global output) pointer to INTEGER 120* The global index of the element of the distributed vector 121* sub( X ) whose real part has maximum absolute value. 122* 123* X (local input) COMPLEX*16 array containing the local 124* pieces of a distributed matrix of dimension of at least 125* ( (JX-1)*M_X + IX + ( N - 1 )*abs( INCX ) ) 126* This array contains the entries of the distributed vector 127* sub( X ). 128* 129* IX (global input) INTEGER 130* The row index in the global array X indicating the first 131* row of sub( X ). 132* 133* JX (global input) INTEGER 134* The column index in the global array X indicating the 135* first column of sub( X ). 136* 137* DESCX (global and local input) INTEGER array of dimension DLEN_. 138* The array descriptor for the distributed matrix X. 139* 140* INCX (global input) INTEGER 141* The global increment for the elements of X. Only two values 142* of INCX are supported in this version, namely 1 and M_X. 143* INCX must not be zero. 144* 145* ===================================================================== 146* 147* .. Parameters .. 148 INTEGER BLOCK_CYCLIC_2D, CSRC_, CTXT_, DLEN_, DTYPE_, 149 $ LLD_, MB_, M_, NB_, N_, RSRC_ 150 PARAMETER ( BLOCK_CYCLIC_2D = 1, DLEN_ = 9, DTYPE_ = 1, 151 $ CTXT_ = 2, M_ = 3, N_ = 4, MB_ = 5, NB_ = 6, 152 $ RSRC_ = 7, CSRC_ = 8, LLD_ = 9 ) 153 COMPLEX*16 ZERO 154 PARAMETER ( ZERO = ( 0.0D+0, 0.0D+0 ) ) 155* .. 156* .. Local Scalars .. 157 CHARACTER CBTOP, CCTOP, RBTOP, RCTOP 158 INTEGER ICOFF, ICTXT, IDUMM, IIX, IROFF, IXCOL, IXROW, 159 $ JJX, LCINDX, LDX, MAXPOS, MYCOL, MYROW, NP, 160 $ NPCOL, NPROW, NQ 161* .. 162* .. Local Arrays .. 163 COMPLEX*16 WORK( 2 ) 164* .. 165* .. External Subroutines .. 166 EXTERNAL BLACS_GRIDINFO, IGEBR2D, IGEBS2D, INFOG2L, 167 $ PB_TOPGET, PZTREECOMB, ZCOMBAMAX1, ZGAMX2D 168* .. 169* .. External Functions .. 170 LOGICAL LSAME 171 INTEGER IZMAX1, INDXL2G, NUMROC 172 EXTERNAL IZMAX1, INDXL2G, NUMROC 173* .. 174* .. Intrinsic Functions .. 175 INTRINSIC ABS, DBLE, DCMPLX, MOD, NINT 176* .. 177* .. Executable Statements .. 178* 179* Get grid parameters 180* 181 ICTXT = DESCX( CTXT_ ) 182 CALL BLACS_GRIDINFO( ICTXT, NPROW, NPCOL, MYROW, MYCOL ) 183* 184* Quick return if possible. 185* 186 INDX = 0 187 AMAX = ZERO 188 IF( N.LE.0 ) 189 $ RETURN 190* 191* Retrieve local information for vector X. 192* 193 LDX = DESCX( LLD_ ) 194 CALL INFOG2L( IX, JX, DESCX, NPROW, NPCOL, MYROW, MYCOL, IIX, JJX, 195 $ IXROW, IXCOL ) 196* 197 IF( INCX.EQ.1 .AND. DESCX( M_ ).EQ.1 .AND. N.EQ.1 ) THEN 198 INDX = JX 199 AMAX = X( IIX+(JJX-1)*LDX ) 200 RETURN 201 END IF 202* 203* Find the maximum value and its index 204* 205 IF( INCX.EQ.DESCX( M_ ) ) THEN 206* 207 IF( MYROW.EQ.IXROW ) THEN 208* 209 ICOFF = MOD( JX-1, DESCX( NB_ ) ) 210 NQ = NUMROC( N+ICOFF, DESCX( NB_ ), MYCOL, IXCOL, NPCOL ) 211 IF( MYCOL.EQ.IXCOL ) 212 $ NQ = NQ-ICOFF 213* 214 CALL PB_TOPGET( ICTXT, 'Broadcast', 'Rowwise', RBTOP ) 215* 216 IF( LSAME( RBTOP, ' ' ) ) THEN 217* 218 IF( NQ.GT.0 ) THEN 219 LCINDX = JJX-1+IZMAX1( NQ, X( IIX+(JJX-1)*LDX ), LDX ) 220 WORK( 1 ) = X( IIX+(LCINDX-1)*LDX ) 221 WORK( 2 ) = DCMPLX( DBLE( INDXL2G( LCINDX, 222 $ DESCX( NB_ ), MYCOL, DESCX( CSRC_ ), NPCOL ) ) ) 223 ELSE 224 WORK( 1 ) = ZERO 225 WORK( 2 ) = ZERO 226 END IF 227* 228 CALL PZTREECOMB( ICTXT, 'Row', 2, WORK, -1, MYCOL, 229 $ ZCOMBAMAX1 ) 230* 231 AMAX = WORK( 1 ) 232 IF( AMAX.EQ.ZERO ) THEN 233 INDX = JX 234 ELSE 235 INDX = NINT( DBLE( WORK( 2 ) ) ) 236 END IF 237* 238 ELSE 239* 240 CALL PB_TOPGET( ICTXT, 'Combine', 'Rowwise', RCTOP ) 241* 242 IF( NQ.GT.0 ) THEN 243 LCINDX = JJX-1+IZMAX1( NQ, X( IIX+(JJX-1)*LDX ), LDX ) 244 AMAX = X( IIX + (LCINDX-1)*LDX ) 245 ELSE 246 AMAX = ZERO 247 END IF 248* 249* Find the maximum value 250* 251 CALL ZGAMX2D( ICTXT, 'Rowwise', RCTOP, 1, 1, AMAX, 1, 252 $ IDUMM, MAXPOS, 1, -1, MYROW ) 253* 254 IF( AMAX.NE.ZERO ) THEN 255* 256* Broadcast corresponding global index 257* 258 IF( MYCOL.EQ.MAXPOS ) THEN 259 INDX = INDXL2G( LCINDX, DESCX( NB_ ), MYCOL, 260 $ DESCX( CSRC_ ), NPCOL ) 261 CALL IGEBS2D( ICTXT, 'Rowwise', RBTOP, 1, 1, INDX, 262 $ 1 ) 263 ELSE 264 CALL IGEBR2D( ICTXT, 'Rowwise', RBTOP, 1, 1, INDX, 265 $ 1, MYROW, MAXPOS ) 266 END IF 267* 268 ELSE 269* 270 INDX = JX 271* 272 END IF 273* 274 END IF 275* 276 END IF 277* 278 ELSE 279* 280 IF( MYCOL.EQ.IXCOL ) THEN 281* 282 IROFF = MOD( IX-1, DESCX( MB_ ) ) 283 NP = NUMROC( N+IROFF, DESCX( MB_ ), MYROW, IXROW, NPROW ) 284 IF( MYROW.EQ.IXROW ) 285 $ NP = NP-IROFF 286* 287 CALL PB_TOPGET( ICTXT, 'Broadcast', 'Columnwise', CBTOP ) 288* 289 IF( LSAME( CBTOP, ' ' ) ) THEN 290* 291 IF( NP.GT.0 ) THEN 292 LCINDX = IIX-1+IZMAX1( NP, X( IIX+(JJX-1)*LDX ), 1 ) 293 WORK( 1 ) = X( LCINDX + (JJX-1)*LDX ) 294 WORK( 2 ) = DCMPLX( DBLE( INDXL2G( LCINDX, 295 $ DESCX( MB_ ), MYROW, DESCX( RSRC_ ), NPROW ) ) ) 296 ELSE 297 WORK( 1 ) = ZERO 298 WORK( 2 ) = ZERO 299 END IF 300* 301 CALL PZTREECOMB( ICTXT, 'Column', 2, WORK, -1, MYCOL, 302 $ ZCOMBAMAX1 ) 303* 304 AMAX = WORK( 1 ) 305 IF( AMAX.EQ.ZERO ) THEN 306 INDX = IX 307 ELSE 308 INDX = NINT( DBLE( WORK( 2 ) ) ) 309 END IF 310* 311 ELSE 312* 313 CALL PB_TOPGET( ICTXT, 'Combine', 'Columnwise', CCTOP ) 314* 315 IF( NP.GT.0 ) THEN 316 LCINDX = IIX-1+IZMAX1( NP, X( IIX+(JJX-1)*LDX ), 1 ) 317 AMAX = X( LCINDX + (JJX-1)*LDX ) 318 ELSE 319 AMAX = ZERO 320 END IF 321* 322* Find the maximum value 323* 324 CALL ZGAMX2D( ICTXT, 'Columnwise', CCTOP, 1, 1, AMAX, 1, 325 $ MAXPOS, IDUMM, 1, -1, MYCOL ) 326* 327 IF( AMAX.NE.ZERO ) THEN 328* 329* Broadcast corresponding global index 330* 331 IF( MYROW.EQ.MAXPOS ) THEN 332 INDX = INDXL2G( LCINDX, DESCX( MB_ ), MYROW, 333 $ DESCX( RSRC_ ), NPROW ) 334 CALL IGEBS2D( ICTXT, 'Columnwise', CBTOP, 1, 1, 335 $ INDX, 1 ) 336 ELSE 337 CALL IGEBR2D( ICTXT, 'Columnwise', CBTOP, 1, 1, 338 $ INDX, 1, MAXPOS, MYCOL ) 339 END IF 340* 341 ELSE 342* 343 INDX = IX 344* 345 END IF 346* 347 END IF 348* 349 END IF 350* 351 END IF 352* 353 RETURN 354* 355* End of PZMAX1 356* 357 END 358* 359 SUBROUTINE ZCOMBAMAX1 ( V1, V2 ) 360* 361* -- ScaLAPACK auxiliary routine (version 1.7) -- 362* University of Tennessee, Knoxville, Oak Ridge National Laboratory, 363* and University of California, Berkeley. 364* May 1, 1997 365* 366* .. Array Arguments .. 367 COMPLEX*16 V1( 2 ), V2( 2 ) 368* .. 369* 370* Purpose 371* ======= 372* 373* ZCOMBAMAX1 finds the element having maximum real part absolute 374* value as well as its corresponding globl index. 375* 376* Arguments 377* ========= 378* 379* V1 (local input/local output) COMPLEX*16 array of 380* dimension 2. The first maximum absolute value element and 381* its global index. V1(1) = AMAX, V1(2) = INDX. 382* 383* V2 (local input) COMPLEX*16 array of dimension 2. 384* The second maximum absolute value element and its global 385* index. V2(1) = AMAX, V2(2) = INDX. 386* 387* ===================================================================== 388* 389* .. Intrinsic Functions .. 390 INTRINSIC ABS, DBLE 391* .. 392* .. Executable Statements .. 393* 394 IF( ABS( DBLE( V1( 1 ) ) ).LT.ABS( DBLE( V2( 1 ) ) ) ) THEN 395 V1( 1 ) = V2( 1 ) 396 V1( 2 ) = V2( 2 ) 397 END IF 398* 399 RETURN 400* 401* End of ZCOMBAMAX1 402* 403 END 404