1 SUBROUTINE PDPOTRF( UPLO, N, A, IA, JA, DESCA, INFO ) 2* 3* -- ScaLAPACK routine (version 1.7) -- 4* University of Tennessee, Knoxville, Oak Ridge National Laboratory, 5* and University of California, Berkeley. 6* May 25, 2001 7* 8* .. Scalar Arguments .. 9 CHARACTER UPLO 10 INTEGER IA, INFO, JA, N 11* .. 12* .. Array Arguments .. 13 INTEGER DESCA( * ) 14 DOUBLE PRECISION A( * ) 15* .. 16* 17* Purpose 18* ======= 19* 20* PDPOTRF computes the Cholesky factorization of an N-by-N real 21* symmetric positive definite distributed matrix sub( A ) denoting 22* A(IA:IA+N-1, JA:JA+N-1). 23* 24* The factorization has the form 25* 26* sub( A ) = U' * U , if UPLO = 'U', or 27* 28* sub( A ) = L * L', if UPLO = 'L', 29* 30* where U is an upper triangular matrix and L is lower triangular. 31* 32* Notes 33* ===== 34* 35* Each global data object is described by an associated description 36* vector. This vector stores the information required to establish 37* the mapping between an object element and its corresponding process 38* and memory location. 39* 40* Let A be a generic term for any 2D block cyclicly distributed array. 41* Such a global array has an associated description vector DESCA. 42* In the following comments, the character _ should be read as 43* "of the global array". 44* 45* NOTATION STORED IN EXPLANATION 46* --------------- -------------- -------------------------------------- 47* DTYPE_A(global) DESCA( DTYPE_ )The descriptor type. In this case, 48* DTYPE_A = 1. 49* CTXT_A (global) DESCA( CTXT_ ) The BLACS context handle, indicating 50* the BLACS process grid A is distribu- 51* ted over. The context itself is glo- 52* bal, but the handle (the integer 53* value) may vary. 54* M_A (global) DESCA( M_ ) The number of rows in the global 55* array A. 56* N_A (global) DESCA( N_ ) The number of columns in the global 57* array A. 58* MB_A (global) DESCA( MB_ ) The blocking factor used to distribute 59* the rows of the array. 60* NB_A (global) DESCA( NB_ ) The blocking factor used to distribute 61* the columns of the array. 62* RSRC_A (global) DESCA( RSRC_ ) The process row over which the first 63* row of the array A is distributed. 64* CSRC_A (global) DESCA( CSRC_ ) The process column over which the 65* first column of the array A is 66* distributed. 67* LLD_A (local) DESCA( LLD_ ) The leading dimension of the local 68* array. LLD_A >= MAX(1,LOCr(M_A)). 69* 70* Let K be the number of rows or columns of a distributed matrix, 71* and assume that its process grid has dimension p x q. 72* LOCr( K ) denotes the number of elements of K that a process 73* would receive if K were distributed over the p processes of its 74* process column. 75* Similarly, LOCc( K ) denotes the number of elements of K that a 76* process would receive if K were distributed over the q processes of 77* its process row. 78* The values of LOCr() and LOCc() may be determined via a call to the 79* ScaLAPACK tool function, NUMROC: 80* LOCr( M ) = NUMROC( M, MB_A, MYROW, RSRC_A, NPROW ), 81* LOCc( N ) = NUMROC( N, NB_A, MYCOL, CSRC_A, NPCOL ). 82* An upper bound for these quantities may be computed by: 83* LOCr( M ) <= ceil( ceil(M/MB_A)/NPROW )*MB_A 84* LOCc( N ) <= ceil( ceil(N/NB_A)/NPCOL )*NB_A 85* 86* This routine requires square block decomposition ( MB_A = NB_A ). 87* 88* Arguments 89* ========= 90* 91* UPLO (global input) CHARACTER 92* = 'U': Upper triangle of sub( A ) is stored; 93* = 'L': Lower triangle of sub( A ) is stored. 94* 95* N (global input) INTEGER 96* The number of rows and columns to be operated on, i.e. the 97* order of the distributed submatrix sub( A ). N >= 0. 98* 99* A (local input/local output) DOUBLE PRECISION pointer into the 100* local memory to an array of dimension (LLD_A, LOCc(JA+N-1)). 101* On entry, this array contains the local pieces of the 102* N-by-N symmetric distributed matrix sub( A ) to be factored. 103* If UPLO = 'U', the leading N-by-N upper triangular part of 104* sub( A ) contains the upper triangular part of the matrix, 105* and its strictly lower triangular part is not referenced. 106* If UPLO = 'L', the leading N-by-N lower triangular part of 107* sub( A ) contains the lower triangular part of the distribu- 108* ted matrix, and its strictly upper triangular part is not 109* referenced. On exit, if UPLO = 'U', the upper triangular 110* part of the distributed matrix contains the Cholesky factor 111* U, if UPLO = 'L', the lower triangular part of the distribu- 112* ted matrix contains the Cholesky factor L. 113* 114* IA (global input) INTEGER 115* The row index in the global array A indicating the first 116* row of sub( A ). 117* 118* JA (global input) INTEGER 119* The column index in the global array A indicating the 120* first column of sub( A ). 121* 122* DESCA (global and local input) INTEGER array of dimension DLEN_. 123* The array descriptor for the distributed matrix A. 124* 125* INFO (global output) INTEGER 126* = 0: successful exit 127* < 0: If the i-th argument is an array and the j-entry had 128* an illegal value, then INFO = -(i*100+j), if the i-th 129* argument is a scalar and had an illegal value, then 130* INFO = -i. 131* > 0: If INFO = K, the leading minor of order K, 132* A(IA:IA+K-1,JA:JA+K-1) is not positive definite, and 133* the factorization could not be completed. 134* 135* ===================================================================== 136* 137* .. Parameters .. 138 INTEGER BLOCK_CYCLIC_2D, CSRC_, CTXT_, DLEN_, DTYPE_, 139 $ LLD_, MB_, M_, NB_, N_, RSRC_ 140 PARAMETER ( BLOCK_CYCLIC_2D = 1, DLEN_ = 9, DTYPE_ = 1, 141 $ CTXT_ = 2, M_ = 3, N_ = 4, MB_ = 5, NB_ = 6, 142 $ RSRC_ = 7, CSRC_ = 8, LLD_ = 9 ) 143 DOUBLE PRECISION ONE 144 PARAMETER ( ONE = 1.0D+0 ) 145* .. 146* .. Local Scalars .. 147 LOGICAL UPPER 148 CHARACTER COLBTOP, ROWBTOP 149 INTEGER I, ICOFF, ICTXT, IROFF, J, JB, JN, MYCOL, 150 $ MYROW, NPCOL, NPROW 151* .. 152* .. Local Arrays .. 153 INTEGER IDUM1( 1 ), IDUM2( 1 ) 154* .. 155* .. External Subroutines .. 156 EXTERNAL BLACS_GRIDINFO, CHK1MAT, PCHK1MAT, PB_TOPGET, 157 $ PB_TOPSET, PDPOTF2, PDSYRK, PDTRSM, 158 $ PXERBLA 159* .. 160* .. External Functions .. 161 LOGICAL LSAME 162 INTEGER ICEIL 163 EXTERNAL ICEIL, LSAME 164* .. 165* .. Intrinsic Functions .. 166 INTRINSIC ICHAR, MIN, MOD 167* .. 168* .. Executable Statements .. 169* 170* Get grid parameters 171* 172 ICTXT = DESCA( CTXT_ ) 173 CALL BLACS_GRIDINFO( ICTXT, NPROW, NPCOL, MYROW, MYCOL ) 174* 175* Test the input parameters 176* 177 INFO = 0 178 IF( NPROW.EQ.-1 ) THEN 179 INFO = -(600+CTXT_) 180 ELSE 181 CALL CHK1MAT( N, 2, N, 2, IA, JA, DESCA, 6, INFO ) 182 UPPER = LSAME( UPLO, 'U' ) 183 IF( INFO.EQ.0 ) THEN 184 IROFF = MOD( IA-1, DESCA( MB_ ) ) 185 ICOFF = MOD( JA-1, DESCA( NB_ ) ) 186 IF ( .NOT.UPPER .AND. .NOT.LSAME( UPLO, 'L' ) ) THEN 187 INFO = -1 188 ELSE IF( IROFF.NE.0 ) THEN 189 INFO = -4 190 ELSE IF( ICOFF.NE.0 ) THEN 191 INFO = -5 192 ELSE IF( DESCA( MB_ ).NE.DESCA( NB_ ) ) THEN 193 INFO = -(600+NB_) 194 END IF 195 END IF 196 IF( UPPER ) THEN 197 IDUM1( 1 ) = ICHAR( 'U' ) 198 ELSE 199 IDUM1( 1 ) = ICHAR( 'L' ) 200 END IF 201 IDUM2( 1 ) = 1 202 CALL PCHK1MAT( N, 2, N, 2, IA, JA, DESCA, 6, 1, IDUM1, IDUM2, 203 $ INFO ) 204 END IF 205* 206 IF( INFO.NE.0 ) THEN 207 CALL PXERBLA( ICTXT, 'PDPOTRF', -INFO ) 208 RETURN 209 END IF 210* 211* Quick return if possible 212* 213 IF( N.EQ.0 ) 214 $ RETURN 215* 216 CALL PB_TOPGET( ICTXT, 'Broadcast', 'Rowwise', ROWBTOP ) 217 CALL PB_TOPGET( ICTXT, 'Broadcast', 'Columnwise', COLBTOP ) 218* 219 IF( UPPER ) THEN 220* 221* Split-ring topology for the communication along process 222* columns, 1-tree topology along process rows. 223* 224 CALL PB_TOPSET( ICTXT, 'Broadcast', 'Rowwise', ' ' ) 225 CALL PB_TOPSET( ICTXT, 'Broadcast', 'Columnwise', 'S-ring' ) 226* 227* A is upper triangular, compute Cholesky factorization A = U'*U. 228* 229* Handle the first block of columns separately 230* 231 JN = MIN( ICEIL( JA, DESCA( NB_ ) )*DESCA(NB_), JA+N-1 ) 232 JB = JN - JA + 1 233* 234* Perform unblocked Cholesky factorization on JB block 235* 236 CALL PDPOTF2( UPLO, JB, A, IA, JA, DESCA, INFO ) 237 IF( INFO.NE.0 ) 238 $ GO TO 30 239* 240 IF( JB+1.LE.N ) THEN 241* 242* Form the row panel of U using the triangular solver 243* 244 CALL PDTRSM( 'Left', UPLO, 'Transpose', 'Non-Unit', 245 $ JB, N-JB, ONE, A, IA, JA, DESCA, A, IA, JA+JB, 246 $ DESCA ) 247* 248* Update the trailing matrix, A = A - U'*U 249* 250 CALL PDSYRK( UPLO, 'Transpose', N-JB, JB, -ONE, A, IA, 251 $ JA+JB, DESCA, ONE, A, IA+JB, JA+JB, DESCA ) 252 END IF 253* 254* Loop over remaining block of columns 255* 256 DO 10 J = JN+1, JA+N-1, DESCA( NB_ ) 257 JB = MIN( N-J+JA, DESCA( NB_ ) ) 258 I = IA + J - JA 259* 260* Perform unblocked Cholesky factorization on JB block 261* 262 CALL PDPOTF2( UPLO, JB, A, I, J, DESCA, INFO ) 263 IF( INFO.NE.0 ) THEN 264 INFO = INFO + J - JA 265 GO TO 30 266 END IF 267* 268 IF( J-JA+JB+1.LE.N ) THEN 269* 270* Form the row panel of U using the triangular solver 271* 272 CALL PDTRSM( 'Left', UPLO, 'Transpose', 'Non-Unit', 273 $ JB, N-J-JB+JA, ONE, A, I, J, DESCA, A, 274 $ I, J+JB, DESCA ) 275* 276* Update the trailing matrix, A = A - U'*U 277* 278 CALL PDSYRK( UPLO, 'Transpose', N-J-JB+JA, JB, 279 $ -ONE, A, I, J+JB, DESCA, ONE, A, I+JB, 280 $ J+JB, DESCA ) 281 END IF 282 10 CONTINUE 283* 284 ELSE 285* 286* 1-tree topology for the communication along process columns, 287* Split-ring topology along process rows. 288* 289 CALL PB_TOPSET( ICTXT, 'Broadcast', 'Rowwise', 'S-ring' ) 290 CALL PB_TOPSET( ICTXT, 'Broadcast', 'Columnwise', ' ' ) 291* 292* A is lower triangular, compute Cholesky factorization A = L*L' 293* (right-looking) 294* 295* Handle the first block of columns separately 296* 297 JN = MIN( ICEIL( JA, DESCA( NB_ ) )*DESCA( NB_ ), JA+N-1 ) 298 JB = JN - JA + 1 299* 300* Perform unblocked Cholesky factorization on JB block 301* 302 CALL PDPOTF2( UPLO, JB, A, IA, JA, DESCA, INFO ) 303 IF( INFO.NE.0 ) 304 $ GO TO 30 305* 306 IF( JB+1.LE.N ) THEN 307* 308* Form the column panel of L using the triangular solver 309* 310 CALL PDTRSM( 'Right', UPLO, 'Transpose', 'Non-Unit', 311 $ N-JB, JB, ONE, A, IA, JA, DESCA, A, IA+JB, JA, 312 $ DESCA ) 313* 314* Update the trailing matrix, A = A - L*L' 315* 316 CALL PDSYRK( UPLO, 'No Transpose', N-JB, JB, -ONE, A, IA+JB, 317 $ JA, DESCA, ONE, A, IA+JB, JA+JB, DESCA ) 318* 319 END IF 320* 321 DO 20 J = JN+1, JA+N-1, DESCA( NB_ ) 322 JB = MIN( N-J+JA, DESCA( NB_ ) ) 323 I = IA + J - JA 324* 325* Perform unblocked Cholesky factorization on JB block 326* 327 CALL PDPOTF2( UPLO, JB, A, I, J, DESCA, INFO ) 328 IF( INFO.NE.0 ) THEN 329 INFO = INFO + J - JA 330 GO TO 30 331 END IF 332* 333 IF( J-JA+JB+1.LE.N ) THEN 334* 335* Form the column panel of L using the triangular solver 336* 337 CALL PDTRSM( 'Right', UPLO, 'Transpose', 'Non-Unit', 338 $ N-J-JB+JA, JB, ONE, A, I, J, DESCA, A, I+JB, 339 $ J, DESCA ) 340* 341* Update the trailing matrix, A = A - L*L' 342* 343 CALL PDSYRK( UPLO, 'No Transpose', N-J-JB+JA, JB, -ONE, 344 $ A, I+JB, J, DESCA, ONE, A, I+JB, J+JB, 345 $ DESCA ) 346* 347 END IF 348 20 CONTINUE 349* 350 END IF 351* 352 30 CONTINUE 353* 354 CALL PB_TOPSET( ICTXT, 'Broadcast', 'Rowwise', ROWBTOP ) 355 CALL PB_TOPSET( ICTXT, 'Broadcast', 'Columnwise', COLBTOP ) 356* 357 RETURN 358* 359* End of PDPOTRF 360* 361 END 362