1 SUBROUTINE PZLACP3( M, I, A, DESCA, B, LDB, II, JJ, REV ) 2* 3* -- ScaLAPACK routine (version 1.7) -- 4* University of Tennessee, Knoxville, Oak Ridge National Laboratory, 5* and University of California, Berkeley. 6* July 31, 2001 7* 8* .. Scalar Arguments .. 9 INTEGER I, II, JJ, LDB, M, REV 10* .. 11* .. Array Arguments .. 12 INTEGER DESCA( * ) 13 COMPLEX*16 A( * ), B( LDB, * ) 14* .. 15* 16* Purpose 17* ======= 18* 19* PZLACP3 is an auxiliary routine that copies from a global parallel 20* array into a local replicated array or vise versa. Notice that 21* the entire submatrix that is copied gets placed on one node or 22* more. The receiving node can be specified precisely, or all nodes 23* can receive, or just one row or column of nodes. 24* 25* Notes 26* ===== 27* 28* Each global data object is described by an associated description 29* vector. This vector stores the information required to establish 30* the mapping between an object element and its corresponding process 31* and memory location. 32* 33* Let A be a generic term for any 2D block cyclicly distributed array. 34* Such a global array has an associated description vector DESCA. 35* In the following comments, the character _ should be read as 36* "of the global array". 37* 38* NOTATION STORED IN EXPLANATION 39* --------------- -------------- -------------------------------------- 40* DTYPE_A(global) DESCA( DTYPE_ )The descriptor type. In this case, 41* DTYPE_A = 1. 42* CTXT_A (global) DESCA( CTXT_ ) The BLACS context handle, indicating 43* the BLACS process grid A is distribu- 44* ted over. The context itself is glo- 45* bal, but the handle (the integer 46* value) may vary. 47* M_A (global) DESCA( M_ ) The number of rows in the global 48* array A. 49* N_A (global) DESCA( N_ ) The number of columns in the global 50* array A. 51* MB_A (global) DESCA( MB_ ) The blocking factor used to distribute 52* the rows of the array. 53* NB_A (global) DESCA( NB_ ) The blocking factor used to distribute 54* the columns of the array. 55* RSRC_A (global) DESCA( RSRC_ ) The process row over which the first 56* row of the array A is distributed. 57* CSRC_A (global) DESCA( CSRC_ ) The process column over which the 58* first column of the array A is 59* distributed. 60* LLD_A (local) DESCA( LLD_ ) The leading dimension of the local 61* array. LLD_A >= MAX(1,LOCr(M_A)). 62* 63* Let K be the number of rows or columns of a distributed matrix, 64* and assume that its process grid has dimension p x q. 65* LOCr( K ) denotes the number of elements of K that a process 66* would receive if K were distributed over the p processes of its 67* process column. 68* Similarly, LOCc( K ) denotes the number of elements of K that a 69* process would receive if K were distributed over the q processes of 70* its process row. 71* The values of LOCr() and LOCc() may be determined via a call to the 72* ScaLAPACK tool function, NUMROC: 73* LOCr( M ) = NUMROC( M, MB_A, MYROW, RSRC_A, NPROW ), 74* LOCc( N ) = NUMROC( N, NB_A, MYCOL, CSRC_A, NPCOL ). 75* An upper bound for these quantities may be computed by: 76* LOCr( M ) <= ceil( ceil(M/MB_A)/NPROW )*MB_A 77* LOCc( N ) <= ceil( ceil(N/NB_A)/NPCOL )*NB_A 78* 79* Arguments 80* ========= 81* 82* M (global input) INTEGER 83* M is the order of the square submatrix that is copied. 84* M >= 0. 85* Unchanged on exit 86* 87* I (global input) INTEGER 88* A(I,I) is the global location that the copying starts from. 89* Unchanged on exit. 90* 91* A (global input/output) COMPLEX*16 array, dimension 92* (DESCA(LLD_),*) 93* On entry, the parallel matrix to be copied into or from. 94* On exit, if REV=1, the copied data. 95* Unchanged on exit if REV=0. 96* 97* DESCA (global and local input) INTEGER array of dimension DLEN_. 98* The array descriptor for the distributed matrix A. 99* 100* B (local input/output) COMPLEX*16 array of size (LDB,M) 101* If REV=0, this is the global portion of the array 102* A(I:I+M-1,I:I+M-1). 103* If REV=1, this is the unchanged on exit. 104* 105* LDB (local input) INTEGER 106* The leading dimension of B. 107* 108* II (global input) INTEGER 109* By using REV 0 & 1, data can be sent out and returned again. 110* If REV=0, then II is destination row index for the node(s) 111* receiving the replicated B. 112* If II>=0,JJ>=0, then node (II,JJ) receives the data 113* If II=-1,JJ>=0, then all rows in column JJ receive the 114* data 115* If II>=0,JJ=-1, then all cols in row II receive the data 116* If II=-1,JJ=-1, then all nodes receive the data 117* If REV<>0, then II is the source row index for the node(s) 118* sending the replicated B. 119* 120* JJ (global input) INTEGER 121* Similar description as II above 122* 123* REV (global input) INTEGER 124* Use REV = 0 to send global A into locally replicated B 125* (on node (II,JJ)). 126* Use REV <> 0 to send locally replicated B from node (II,JJ) 127* to its owner (which changes depending on its location in 128* A) into the global A. 129* 130* Further Details 131* =============== 132* 133* Implemented by: M. Fahey, May 28, 1999 134* 135* ===================================================================== 136* 137* .. Parameters .. 138 INTEGER BLOCK_CYCLIC_2D, CSRC_, CTXT_, DLEN_, DTYPE_, 139 $ LLD_, MB_, M_, NB_, N_, RSRC_ 140 PARAMETER ( BLOCK_CYCLIC_2D = 1, DLEN_ = 9, DTYPE_ = 1, 141 $ CTXT_ = 2, M_ = 3, N_ = 4, MB_ = 5, NB_ = 6, 142 $ RSRC_ = 7, CSRC_ = 8, LLD_ = 9 ) 143 COMPLEX*16 ZERO 144 PARAMETER ( ZERO = ( 0.0D+0, 0.0D+0 ) ) 145* .. 146* .. Local Scalars .. 147 INTEGER COL, CONTXT, HBL, ICOL1, ICOL2, IDI, IDJ, IFIN, 148 $ III, IROW1, IROW2, ISTOP, ISTOPI, ISTOPJ, ITMP, 149 $ JJJ, LDA, MYCOL, MYROW, NPCOL, NPROW, ROW 150* .. 151* .. External Functions .. 152 INTEGER NUMROC 153 EXTERNAL NUMROC 154* .. 155* .. External Subroutines .. 156 EXTERNAL BLACS_GRIDINFO, INFOG1L, ZGEBR2D, ZGEBS2D, 157 $ ZGERV2D, ZGESD2D 158* .. 159* .. Intrinsic Functions .. 160 INTRINSIC MIN, MOD 161* .. 162* .. Executable Statements .. 163* 164 IF( M.LE.0 ) 165 $ RETURN 166* 167 HBL = DESCA( MB_ ) 168 CONTXT = DESCA( CTXT_ ) 169 LDA = DESCA( LLD_ ) 170* 171 CALL BLACS_GRIDINFO( CONTXT, NPROW, NPCOL, MYROW, MYCOL ) 172* 173 IF( REV.EQ.0 ) THEN 174 DO 20 IDI = 1, M 175 DO 10 IDJ = 1, M 176 B( IDI, IDJ ) = ZERO 177 10 CONTINUE 178 20 CONTINUE 179 END IF 180* 181 IFIN = I + M - 1 182* 183 IF( MOD( I+HBL, HBL ).NE.0 ) THEN 184 ISTOP = MIN( I+HBL-MOD( I+HBL, HBL ), IFIN ) 185 ELSE 186 ISTOP = I 187 END IF 188 IDJ = I 189 ISTOPJ = ISTOP 190 IF( IDJ.LE.IFIN ) THEN 191 30 CONTINUE 192 IDI = I 193 ISTOPI = ISTOP 194 IF( IDI.LE.IFIN ) THEN 195 40 CONTINUE 196 ROW = MOD( ( IDI-1 ) / HBL, NPROW ) 197 COL = MOD( ( IDJ-1 ) / HBL, NPCOL ) 198 CALL INFOG1L( IDI, HBL, NPROW, ROW, 0, IROW1, ITMP ) 199 IROW2 = NUMROC( ISTOPI, HBL, ROW, 0, NPROW ) 200 CALL INFOG1L( IDJ, HBL, NPCOL, COL, 0, ICOL1, ITMP ) 201 ICOL2 = NUMROC( ISTOPJ, HBL, COL, 0, NPCOL ) 202 IF( ( MYROW.EQ.ROW ) .AND. ( MYCOL.EQ.COL ) ) THEN 203 IF( ( II.EQ.-1 ) .AND. ( JJ.EQ.-1 ) ) THEN 204* 205* Send the message to everyone 206* 207 IF( REV.EQ.0 ) THEN 208 CALL ZGEBS2D( CONTXT, 'All', ' ', IROW2-IROW1+1, 209 $ ICOL2-ICOL1+1, A( ( ICOL1-1 )*LDA+ 210 $ IROW1 ), LDA ) 211 END IF 212 END IF 213 IF( ( II.EQ.-1 ) .AND. ( JJ.NE.-1 ) ) THEN 214* 215* Send the message to Column MYCOL which better be JJ 216* 217 IF( REV.EQ.0 ) THEN 218 CALL ZGEBS2D( CONTXT, 'Col', ' ', IROW2-IROW1+1, 219 $ ICOL2-ICOL1+1, A( ( ICOL1-1 )*LDA+ 220 $ IROW1 ), LDA ) 221 END IF 222 END IF 223 IF( ( II.NE.-1 ) .AND. ( JJ.EQ.-1 ) ) THEN 224* 225* Send the message to Row MYROW which better be II 226* 227 IF( REV.EQ.0 ) THEN 228 CALL ZGEBS2D( CONTXT, 'Row', ' ', IROW2-IROW1+1, 229 $ ICOL2-ICOL1+1, A( ( ICOL1-1 )*LDA+ 230 $ IROW1 ), LDA ) 231 END IF 232 END IF 233 IF( ( II.NE.-1 ) .AND. ( JJ.NE.-1 ) .AND. 234 $ ( ( MYROW.NE.II ) .OR. ( MYCOL.NE.JJ ) ) ) THEN 235* 236* Recv/Send the message to (II,JJ) 237* 238 IF( REV.EQ.0 ) THEN 239 CALL ZGESD2D( CONTXT, IROW2-IROW1+1, ICOL2-ICOL1+1, 240 $ A( ( ICOL1-1 )*LDA+IROW1 ), LDA, II, 241 $ JJ ) 242 ELSE 243 CALL ZGERV2D( CONTXT, IROW2-IROW1+1, ICOL2-ICOL1+1, 244 $ B( IDI-I+1, IDJ-I+1 ), LDB, II, JJ ) 245 END IF 246 END IF 247 IF( REV.EQ.0 ) THEN 248 DO 60 JJJ = ICOL1, ICOL2 249 DO 50 III = IROW1, IROW2 250 B( IDI+III-IROW1+1-I, IDJ+JJJ-ICOL1+1-I ) 251 $ = A( ( JJJ-1 )*LDA+III ) 252 50 CONTINUE 253 60 CONTINUE 254 ELSE 255 DO 80 JJJ = ICOL1, ICOL2 256 DO 70 III = IROW1, IROW2 257 A( ( JJJ-1 )*LDA+III ) = B( IDI+III-IROW1+1-I, 258 $ IDJ+JJJ-ICOL1+1-I ) 259 70 CONTINUE 260 80 CONTINUE 261 END IF 262 ELSE 263 IF( ( II.EQ.-1 ) .AND. ( JJ.EQ.-1 ) ) THEN 264 IF( REV.EQ.0 ) THEN 265 CALL ZGEBR2D( CONTXT, 'All', ' ', IROW2-IROW1+1, 266 $ ICOL2-ICOL1+1, B( IDI-I+1, IDJ-I+1 ), 267 $ LDB, ROW, COL ) 268 END IF 269 END IF 270 IF( ( II.EQ.-1 ) .AND. ( JJ.EQ.MYCOL ) ) THEN 271 IF( REV.EQ.0 ) THEN 272 CALL ZGEBR2D( CONTXT, 'Col', ' ', IROW2-IROW1+1, 273 $ ICOL2-ICOL1+1, B( IDI-I+1, IDJ-I+1 ), 274 $ LDB, ROW, COL ) 275 END IF 276 END IF 277 IF( ( II.EQ.MYROW ) .AND. ( JJ.EQ.-1 ) ) THEN 278 IF( REV.EQ.0 ) THEN 279 CALL ZGEBR2D( CONTXT, 'Row', ' ', IROW2-IROW1+1, 280 $ ICOL2-ICOL1+1, B( IDI-I+1, IDJ-I+1 ), 281 $ LDB, ROW, COL ) 282 END IF 283 END IF 284 IF( ( II.EQ.MYROW ) .AND. ( JJ.EQ.MYCOL ) ) THEN 285 IF( REV.EQ.0 ) THEN 286 CALL ZGERV2D( CONTXT, IROW2-IROW1+1, ICOL2-ICOL1+1, 287 $ B( IDI-I+1, IDJ-I+1 ), LDB, ROW, 288 $ COL ) 289 ELSE 290 CALL ZGESD2D( CONTXT, IROW2-IROW1+1, ICOL2-ICOL1+1, 291 $ B( IDI-I+1, IDJ-I+1 ), LDB, ROW, 292 $ COL ) 293* CALL ZGESD2D(CONTXT, IROW2-IROW1+1, ICOL2-ICOL1+1, 294* $ A((ICOL1-1)*LDA+IROW1),LDA, ROW, COL) 295 END IF 296 END IF 297 END IF 298 IDI = ISTOPI + 1 299 ISTOPI = MIN( ISTOPI+HBL, IFIN ) 300 IF( IDI.LE.IFIN ) 301 $ GO TO 40 302 END IF 303 IDJ = ISTOPJ + 1 304 ISTOPJ = MIN( ISTOPJ+HBL, IFIN ) 305 IF( IDJ.LE.IFIN ) 306 $ GO TO 30 307 END IF 308 RETURN 309* 310* End of PZLACP3 311* 312 END 313