1 SUBROUTINE PCMAX1( N, AMAX, INDX, X, IX, JX, DESCX, INCX ) 2* 3* -- ScaLAPACK auxiliary routine (version 1.7) -- 4* University of Tennessee, Knoxville, Oak Ridge National Laboratory, 5* and University of California, Berkeley. 6* May 25, 2001 7* 8* .. Scalar Arguments .. 9 INTEGER INDX, INCX, IX, JX, N 10 COMPLEX AMAX 11* .. 12* .. Array Arguments .. 13 INTEGER DESCX( * ) 14 COMPLEX X( * ) 15* .. 16* 17* Purpose 18* ======= 19* 20* PCMAX1 computes the global index of the maximum element in absolute 21* value of a distributed vector sub( X ). The global index is returned 22* in INDX and the value is returned in AMAX, 23* 24* where sub( X ) denotes X(IX:IX+N-1,JX) if INCX = 1, 25* X(IX,JX:JX+N-1) if INCX = M_X. 26* 27* Notes 28* ===== 29* 30* Each global data object is described by an associated description 31* vector. This vector stores the information required to establish 32* the mapping between an object element and its corresponding process 33* and memory location. 34* 35* Let A be a generic term for any 2D block cyclicly distributed array. 36* Such a global array has an associated description vector DESCA. 37* In the following comments, the character _ should be read as 38* "of the global array". 39* 40* NOTATION STORED IN EXPLANATION 41* --------------- -------------- -------------------------------------- 42* DTYPE_A(global) DESCA( DTYPE_ )The descriptor type. In this case, 43* DTYPE_A = 1. 44* CTXT_A (global) DESCA( CTXT_ ) The BLACS context handle, indicating 45* the BLACS process grid A is distribu- 46* ted over. The context itself is glo- 47* bal, but the handle (the integer 48* value) may vary. 49* M_A (global) DESCA( M_ ) The number of rows in the global 50* array A. 51* N_A (global) DESCA( N_ ) The number of columns in the global 52* array A. 53* MB_A (global) DESCA( MB_ ) The blocking factor used to distribute 54* the rows of the array. 55* NB_A (global) DESCA( NB_ ) The blocking factor used to distribute 56* the columns of the array. 57* RSRC_A (global) DESCA( RSRC_ ) The process row over which the first 58* row of the array A is distributed. 59* CSRC_A (global) DESCA( CSRC_ ) The process column over which the 60* first column of the array A is 61* distributed. 62* LLD_A (local) DESCA( LLD_ ) The leading dimension of the local 63* array. LLD_A >= MAX(1,LOCr(M_A)). 64* 65* Let K be the number of rows or columns of a distributed matrix, 66* and assume that its process grid has dimension p x q. 67* LOCr( K ) denotes the number of elements of K that a process 68* would receive if K were distributed over the p processes of its 69* process column. 70* Similarly, LOCc( K ) denotes the number of elements of K that a 71* process would receive if K were distributed over the q processes of 72* its process row. 73* The values of LOCr() and LOCc() may be determined via a call to the 74* ScaLAPACK tool function, NUMROC: 75* LOCr( M ) = NUMROC( M, MB_A, MYROW, RSRC_A, NPROW ), 76* LOCc( N ) = NUMROC( N, NB_A, MYCOL, CSRC_A, NPCOL ). 77* An upper bound for these quantities may be computed by: 78* LOCr( M ) <= ceil( ceil(M/MB_A)/NPROW )*MB_A 79* LOCc( N ) <= ceil( ceil(N/NB_A)/NPCOL )*NB_A 80* 81* Because vectors may be viewed as a subclass of matrices, a 82* distributed vector is considered to be a distributed matrix. 83* 84* When the result of a vector-oriented PBLAS call is a scalar, it will 85* be made available only within the scope which owns the vector(s) 86* being operated on. Let X be a generic term for the input vector(s). 87* Then, the processes which receive the answer will be (note that if 88* an operation involves more than one vector, the processes which re- 89* ceive the result will be the union of the following calculation for 90* each vector): 91* 92* If N = 1, M_X = 1 and INCX = 1, then one can't determine if a process 93* row or process column owns the vector operand, therefore only the 94* process of coordinate {RSRC_X, CSRC_X} receives the result; 95* 96* If INCX = M_X, then sub( X ) is a vector distributed over a process 97* row. Each process part of this row receives the result; 98* 99* If INCX = 1, then sub( X ) is a vector distributed over a process 100* column. Each process part of this column receives the result; 101* 102* Based on PCAMAX from Level 1 PBLAS. The change is to use the 103* 'genuine' absolute value. 104* 105* The serial version was contributed to LAPACK by Nick Higham for use 106* with CLACON. 107* 108* Arguments 109* ========= 110* 111* N (global input) pointer to INTEGER 112* The number of components of the distributed vector sub( X ). 113* N >= 0. 114* 115* AMAX (global output) pointer to REAL 116* The absolute value of the largest entry of the distributed 117* vector sub( X ) only in the scope of sub( X ). 118* 119* INDX (global output) pointer to INTEGER 120* The global index of the element of the distributed vector 121* sub( X ) whose real part has maximum absolute value. 122* 123* X (local input) COMPLEX array containing the local 124* pieces of a distributed matrix of dimension of at least 125* ( (JX-1)*M_X + IX + ( N - 1 )*abs( INCX ) ) 126* This array contains the entries of the distributed vector 127* sub( X ). 128* 129* IX (global input) INTEGER 130* The row index in the global array X indicating the first 131* row of sub( X ). 132* 133* JX (global input) INTEGER 134* The column index in the global array X indicating the 135* first column of sub( X ). 136* 137* DESCX (global and local input) INTEGER array of dimension DLEN_. 138* The array descriptor for the distributed matrix X. 139* 140* INCX (global input) INTEGER 141* The global increment for the elements of X. Only two values 142* of INCX are supported in this version, namely 1 and M_X. 143* INCX must not be zero. 144* 145* ===================================================================== 146* 147* .. Parameters .. 148 INTEGER BLOCK_CYCLIC_2D, CSRC_, CTXT_, DLEN_, DTYPE_, 149 $ LLD_, MB_, M_, NB_, N_, RSRC_ 150 PARAMETER ( BLOCK_CYCLIC_2D = 1, DLEN_ = 9, DTYPE_ = 1, 151 $ CTXT_ = 2, M_ = 3, N_ = 4, MB_ = 5, NB_ = 6, 152 $ RSRC_ = 7, CSRC_ = 8, LLD_ = 9 ) 153 COMPLEX ZERO 154 PARAMETER ( ZERO = ( 0.0E+0, 0.0E+0 ) ) 155* .. 156* .. Local Scalars .. 157 CHARACTER CBTOP, CCTOP, RBTOP, RCTOP 158 INTEGER ICOFF, ICTXT, IDUMM, IIX, IROFF, IXCOL, IXROW, 159 $ JJX, LCINDX, LDX, MAXPOS, MYCOL, MYROW, NP, 160 $ NPCOL, NPROW, NQ 161* .. 162* .. Local Arrays .. 163 COMPLEX WORK( 2 ) 164* .. 165* .. External Subroutines .. 166 EXTERNAL BLACS_GRIDINFO, CCOMBAMAX1, CGAMX2D, 167 $ IGEBR2D, IGEBS2D, INFOG2L, PCTREECOMB, 168 $ PB_TOPGET 169* .. 170* .. External Functions .. 171 LOGICAL LSAME 172 INTEGER ICMAX1, INDXL2G, NUMROC 173 EXTERNAL ICMAX1, INDXL2G, NUMROC 174* .. 175* .. Intrinsic Functions .. 176 INTRINSIC ABS, CMPLX, MOD, NINT, REAL 177* .. 178* .. Executable Statements .. 179* 180* Get grid parameters 181* 182 ICTXT = DESCX( CTXT_ ) 183 CALL BLACS_GRIDINFO( ICTXT, NPROW, NPCOL, MYROW, MYCOL ) 184* 185* Quick return if possible. 186* 187 INDX = 0 188 AMAX = ZERO 189 IF( N.LE.0 ) 190 $ RETURN 191* 192* Retrieve local information for vector X. 193* 194 LDX = DESCX( LLD_ ) 195 CALL INFOG2L( IX, JX, DESCX, NPROW, NPCOL, MYROW, MYCOL, IIX, JJX, 196 $ IXROW, IXCOL ) 197* 198 IF( INCX.EQ.1 .AND. DESCX( M_ ).EQ.1 .AND. N.EQ.1 ) THEN 199 INDX = JX 200 AMAX = X( IIX+(JJX-1)*LDX ) 201 RETURN 202 END IF 203* 204* Find the maximum value and its index 205* 206 IF( INCX.EQ.DESCX( M_ ) ) THEN 207* 208 IF( MYROW.EQ.IXROW ) THEN 209* 210 ICOFF = MOD( JX-1, DESCX( NB_ ) ) 211 NQ = NUMROC( N+ICOFF, DESCX( NB_ ), MYCOL, IXCOL, NPCOL ) 212 IF( MYCOL.EQ.IXCOL ) 213 $ NQ = NQ-ICOFF 214* 215 CALL PB_TOPGET( ICTXT, 'Broadcast', 'Rowwise', RBTOP ) 216* 217 IF( LSAME( RBTOP, ' ' ) ) THEN 218* 219 IF( NQ.GT.0 ) THEN 220 LCINDX = JJX-1+ICMAX1( NQ, X( IIX+(JJX-1)*LDX ), LDX ) 221 WORK( 1 ) = X( IIX+(LCINDX-1)*LDX ) 222 WORK( 2 ) = CMPLX( REAL( INDXL2G( LCINDX, 223 $ DESCX( NB_ ), MYCOL, DESCX( CSRC_ ), NPCOL ) ) ) 224 ELSE 225 WORK( 1 ) = ZERO 226 WORK( 2 ) = ZERO 227 END IF 228* 229 CALL PCTREECOMB( ICTXT, 'Row', 2, WORK, -1, MYCOL, 230 $ CCOMBAMAX1 ) 231* 232 AMAX = WORK( 1 ) 233 IF( AMAX.EQ.ZERO ) THEN 234 INDX = JX 235 ELSE 236 INDX = NINT( REAL( WORK( 2 ) ) ) 237 END IF 238* 239 ELSE 240* 241 CALL PB_TOPGET( ICTXT, 'Combine', 'Rowwise', RCTOP ) 242* 243 IF( NQ.GT.0 ) THEN 244 LCINDX = JJX-1+ICMAX1( NQ, X( IIX+(JJX-1)*LDX ), LDX ) 245 AMAX = X( IIX + (LCINDX-1)*LDX ) 246 ELSE 247 AMAX = ZERO 248 END IF 249* 250* Find the maximum value 251* 252 CALL CGAMX2D( ICTXT, 'Rowwise', RCTOP, 1, 1, AMAX, 1, 253 $ IDUMM, MAXPOS, 1, -1, MYROW ) 254* 255 IF( AMAX.NE.ZERO ) THEN 256* 257* Broadcast corresponding global index 258* 259 IF( MYCOL.EQ.MAXPOS ) THEN 260 INDX = INDXL2G( LCINDX, DESCX( NB_ ), MYCOL, 261 $ DESCX( CSRC_ ), NPCOL ) 262 CALL IGEBS2D( ICTXT, 'Rowwise', RBTOP, 1, 1, INDX, 263 $ 1 ) 264 ELSE 265 CALL IGEBR2D( ICTXT, 'Rowwise', RBTOP, 1, 1, INDX, 266 $ 1, MYROW, MAXPOS ) 267 END IF 268* 269 ELSE 270* 271 INDX = JX 272* 273 END IF 274* 275 END IF 276* 277 END IF 278* 279 ELSE 280* 281 IF( MYCOL.EQ.IXCOL ) THEN 282* 283 IROFF = MOD( IX-1, DESCX( MB_ ) ) 284 NP = NUMROC( N+IROFF, DESCX( MB_ ), MYROW, IXROW, NPROW ) 285 IF( MYROW.EQ.IXROW ) 286 $ NP = NP-IROFF 287* 288 CALL PB_TOPGET( ICTXT, 'Broadcast', 'Columnwise', CBTOP ) 289* 290 IF( LSAME( CBTOP, ' ' ) ) THEN 291* 292 IF( NP.GT.0 ) THEN 293 LCINDX = IIX-1+ICMAX1( NP, X( IIX+(JJX-1)*LDX ), 1 ) 294 WORK( 1 ) = X( LCINDX + (JJX-1)*LDX ) 295 WORK( 2 ) = CMPLX( REAL( INDXL2G( LCINDX, 296 $ DESCX( MB_ ), MYROW, DESCX( RSRC_ ), NPROW ) ) ) 297 ELSE 298 WORK( 1 ) = ZERO 299 WORK( 2 ) = ZERO 300 END IF 301* 302 CALL PCTREECOMB( ICTXT, 'Column', 2, WORK, -1, MYCOL, 303 $ CCOMBAMAX1 ) 304* 305 AMAX = WORK( 1 ) 306 IF( AMAX.EQ.ZERO ) THEN 307 INDX = IX 308 ELSE 309 INDX = NINT( REAL( WORK( 2 ) ) ) 310 END IF 311* 312 ELSE 313* 314 CALL PB_TOPGET( ICTXT, 'Combine', 'Columnwise', CCTOP ) 315* 316 IF( NP.GT.0 ) THEN 317 LCINDX = IIX-1+ICMAX1( NP, X( IIX+(JJX-1)*LDX ), 1 ) 318 AMAX = X( LCINDX + (JJX-1)*LDX ) 319 ELSE 320 AMAX = ZERO 321 END IF 322* 323* Find the maximum value 324* 325 CALL CGAMX2D( ICTXT, 'Columnwise', CCTOP, 1, 1, AMAX, 1, 326 $ MAXPOS, IDUMM, 1, -1, MYCOL ) 327* 328 IF( AMAX.NE.ZERO ) THEN 329* 330* Broadcast corresponding global index 331* 332 IF( MYROW.EQ.MAXPOS ) THEN 333 INDX = INDXL2G( LCINDX, DESCX( MB_ ), MYROW, 334 $ DESCX( RSRC_ ), NPROW ) 335 CALL IGEBS2D( ICTXT, 'Columnwise', CBTOP, 1, 1, 336 $ INDX, 1 ) 337 ELSE 338 CALL IGEBR2D( ICTXT, 'Columnwise', CBTOP, 1, 1, 339 $ INDX, 1, MAXPOS, MYCOL ) 340 END IF 341* 342 ELSE 343* 344 INDX = IX 345* 346 END IF 347* 348 END IF 349* 350 END IF 351* 352 END IF 353* 354 RETURN 355* 356* End of PCMAX1 357* 358 END 359* 360 SUBROUTINE CCOMBAMAX1 ( V1, V2 ) 361* 362* -- ScaLAPACK auxiliary routine (version 1.7) -- 363* University of Tennessee, Knoxville, Oak Ridge National Laboratory, 364* and University of California, Berkeley. 365* May 1, 1997 366* 367* .. Array Arguments .. 368 COMPLEX V1( 2 ), V2( 2 ) 369* .. 370* 371* Purpose 372* ======= 373* 374* CCOMBAMAX1 finds the element having maximum real part absolute 375* value as well as its corresponding globl index. 376* 377* Arguments 378* ========= 379* 380* V1 (local input/local output) COMPLEX array of 381* dimension 2. The first maximum absolute value element and 382* its global index. V1(1) = AMAX, V1(2) = INDX. 383* 384* V2 (local input) COMPLEX array of dimension 2. 385* The second maximum absolute value element and its global 386* index. V2(1) = AMAX, V2(2) = INDX. 387* 388* ===================================================================== 389* 390* .. Intrinsic Functions .. 391 INTRINSIC ABS, REAL 392* .. 393* .. Executable Statements .. 394* 395 IF( ABS( REAL( V1( 1 ) ) ).LT.ABS( REAL( V2( 1 ) ) ) ) THEN 396 V1( 1 ) = V2( 1 ) 397 V1( 2 ) = V2( 2 ) 398 END IF 399* 400 RETURN 401* 402* End of CCOMBAMAX1 403* 404 END 405