1 SUBROUTINE IB01MD( METH, ALG, BATCH, CONCT, NOBR, M, L, NSMP, U, 2 $ LDU, Y, LDY, R, LDR, IWORK, DWORK, LDWORK, 3 $ IWARN, INFO ) 4C 5C RELEASE 4.0, WGS COPYRIGHT 2000. 6C 7C PURPOSE 8C 9C To construct an upper triangular factor R of the concatenated 10C block Hankel matrices using input-output data. The input-output 11C data can, optionally, be processed sequentially. 12C 13C ARGUMENTS 14C 15C Mode Parameters 16C 17C METH CHARACTER*1 18C Specifies the subspace identification method to be used, 19C as follows: 20C = 'M': MOESP algorithm with past inputs and outputs; 21C = 'N': N4SID algorithm. 22C 23C ALG CHARACTER*1 24C Specifies the algorithm for computing the triangular 25C factor R, as follows: 26C = 'C': Cholesky algorithm applied to the correlation 27C matrix of the input-output data; 28C = 'F': Fast QR algorithm; 29C = 'Q': QR algorithm applied to the concatenated block 30C Hankel matrices. 31C 32C BATCH CHARACTER*1 33C Specifies whether or not sequential data processing is to 34C be used, and, for sequential processing, whether or not 35C the current data block is the first block, an intermediate 36C block, or the last block, as follows: 37C = 'F': the first block in sequential data processing; 38C = 'I': an intermediate block in sequential data 39C processing; 40C = 'L': the last block in sequential data processing; 41C = 'O': one block only (non-sequential data processing). 42C NOTE that when 100 cycles of sequential data processing 43C are completed for BATCH = 'I', a warning is 44C issued, to prevent for an infinite loop. 45C 46C CONCT CHARACTER*1 47C Specifies whether or not the successive data blocks in 48C sequential data processing belong to a single experiment, 49C as follows: 50C = 'C': the current data block is a continuation of the 51C previous data block and/or it will be continued 52C by the next data block; 53C = 'N': there is no connection between the current data 54C block and the previous and/or the next ones. 55C This parameter is not used if BATCH = 'O'. 56C 57C Input/Output Parameters 58C 59C NOBR (input) INTEGER 60C The number of block rows, s, in the input and output 61C block Hankel matrices to be processed. NOBR > 0. 62C (In the MOESP theory, NOBR should be larger than n, 63C the estimated dimension of state vector.) 64C 65C M (input) INTEGER 66C The number of system inputs. M >= 0. 67C When M = 0, no system inputs are processed. 68C 69C L (input) INTEGER 70C The number of system outputs. L > 0. 71C 72C NSMP (input) INTEGER 73C The number of rows of matrices U and Y (number of 74C samples, t). (When sequential data processing is used, 75C NSMP is the number of samples of the current data 76C block.) 77C NSMP >= 2*(M+L+1)*NOBR - 1, for non-sequential 78C processing; 79C NSMP >= 2*NOBR, for sequential processing. 80C The total number of samples when calling the routine with 81C BATCH = 'L' should be at least 2*(M+L+1)*NOBR - 1. 82C The NSMP argument may vary from a cycle to another in 83C sequential data processing, but NOBR, M, and L should 84C be kept constant. For efficiency, it is advisable to use 85C NSMP as large as possible. 86C 87C U (input) DOUBLE PRECISION array, dimension (LDU,M) 88C The leading NSMP-by-M part of this array must contain the 89C t-by-m input-data sequence matrix U, 90C U = [u_1 u_2 ... u_m]. Column j of U contains the 91C NSMP values of the j-th input component for consecutive 92C time increments. 93C If M = 0, this array is not referenced. 94C 95C LDU INTEGER 96C The leading dimension of the array U. 97C LDU >= NSMP, if M > 0; 98C LDU >= 1, if M = 0. 99C 100C Y (input) DOUBLE PRECISION array, dimension (LDY,L) 101C The leading NSMP-by-L part of this array must contain the 102C t-by-l output-data sequence matrix Y, 103C Y = [y_1 y_2 ... y_l]. Column j of Y contains the 104C NSMP values of the j-th output component for consecutive 105C time increments. 106C 107C LDY INTEGER 108C The leading dimension of the array Y. LDY >= NSMP. 109C 110C R (output or input/output) DOUBLE PRECISION array, dimension 111C ( LDR,2*(M+L)*NOBR ) 112C On exit, if INFO = 0 and ALG = 'Q', or (ALG = 'C' or 'F', 113C and BATCH = 'L' or 'O'), the leading 114C 2*(M+L)*NOBR-by-2*(M+L)*NOBR upper triangular part of 115C this array contains the (current) upper triangular factor 116C R from the QR factorization of the concatenated block 117C Hankel matrices. The diagonal elements of R are positive 118C when the Cholesky algorithm was successfully used. 119C On exit, if ALG = 'C' and BATCH = 'F' or 'I', the leading 120C 2*(M+L)*NOBR-by-2*(M+L)*NOBR upper triangular part of this 121C array contains the current upper triangular part of the 122C correlation matrix in sequential data processing. 123C If ALG = 'F' and BATCH = 'F' or 'I', the array R is not 124C referenced. 125C On entry, if ALG = 'C', or ALG = 'Q', and BATCH = 'I' or 126C 'L', the leading 2*(M+L)*NOBR-by-2*(M+L)*NOBR upper 127C triangular part of this array must contain the upper 128C triangular matrix R computed at the previous call of this 129C routine in sequential data processing. The array R need 130C not be set on entry if ALG = 'F' or if BATCH = 'F' or 'O'. 131C 132C LDR INTEGER 133C The leading dimension of the array R. 134C LDR >= 2*(M+L)*NOBR. 135C 136C Workspace 137C 138C IWORK INTEGER array, dimension (LIWORK) 139C LIWORK >= M+L, if ALG = 'F'; 140C LIWORK >= 0, if ALG = 'C' or 'Q'. 141C 142C DWORK DOUBLE PRECISION array, dimension (LDWORK) 143C On exit, if INFO = 0, DWORK(1) returns the optimal 144C value of LDWORK. 145C On exit, if INFO = -17, DWORK(1) returns the minimum 146C value of LDWORK. 147C Let 148C k = 0, if CONCT = 'N' and ALG = 'C' or 'Q'; 149C k = 2*NOBR-1, if CONCT = 'C' and ALG = 'C' or 'Q'; 150C k = 2*NOBR*(M+L+1), if CONCT = 'N' and ALG = 'F'; 151C k = 2*NOBR*(M+L+2), if CONCT = 'C' and ALG = 'F'. 152C The first (M+L)*k elements of DWORK should be preserved 153C during successive calls of the routine with BATCH = 'F' 154C or 'I', till the final call with BATCH = 'L'. 155C 156C LDWORK INTEGER 157C The length of the array DWORK. 158C LDWORK >= (4*NOBR-2)*(M+L), if ALG = 'C', BATCH <> 'O' and 159C CONCT = 'C'; 160C LDWORK >= 1, if ALG = 'C', BATCH = 'O' or 161C CONCT = 'N'; 162C LDWORK >= (M+L)*2*NOBR*(M+L+3), if ALG = 'F', 163C BATCH <> 'O' and CONCT = 'C'; 164C LDWORK >= (M+L)*2*NOBR*(M+L+1), if ALG = 'F', 165C BATCH = 'F', 'I' and CONCT = 'N'; 166C LDWORK >= (M+L)*4*NOBR*(M+L+1)+(M+L)*2*NOBR, if ALG = 'F', 167C BATCH = 'L' and CONCT = 'N', or 168C BATCH = 'O'; 169C LDWORK >= 4*(M+L)*NOBR, if ALG = 'Q', BATCH = 'F' or 'O', 170C and LDR >= NS = NSMP - 2*NOBR + 1; 171C LDWORK >= 6*(M+L)*NOBR, if ALG = 'Q', BATCH = 'F' or 'O', 172C and LDR < NS, or BATCH = 'I' or 173C 'L' and CONCT = 'N'; 174C LDWORK >= 4*(NOBR+1)*(M+L)*NOBR, if ALG = 'Q', BATCH = 'I' 175C or 'L' and CONCT = 'C'. 176C The workspace used for ALG = 'Q' is 177C LDRWRK*2*(M+L)*NOBR + 4*(M+L)*NOBR, 178C where LDRWRK = LDWORK/(2*(M+L)*NOBR) - 2; recommended 179C value LDRWRK = NS, assuming a large enough cache size. 180C For good performance, LDWORK should be larger. 181C 182C Warning Indicator 183C 184C IWARN INTEGER 185C = 0: no warning; 186C = 1: the number of 100 cycles in sequential data 187C processing has been exhausted without signaling 188C that the last block of data was get; the cycle 189C counter was reinitialized; 190C = 2: a fast algorithm was requested (ALG = 'C' or 'F'), 191C but it failed, and the QR algorithm was then used 192C (non-sequential data processing). 193C 194C Error Indicator 195C 196C INFO INTEGER 197C = 0: successful exit; 198C < 0: if INFO = -i, the i-th argument had an illegal 199C value; 200C = 1: a fast algorithm was requested (ALG = 'C', or 'F') 201C in sequential data processing, but it failed. The 202C routine can be repeatedly called again using the 203C standard QR algorithm. 204C 205C METHOD 206C 207C 1) For non-sequential data processing using QR algorithm, a 208C t x 2(m+l)s matrix H is constructed, where 209C 210C H = [ Uf' Up' Y' ], for METH = 'M', 211C s+1,2s,t 1,s,t 1,2s,t 212C 213C H = [ U' Y' ], for METH = 'N', 214C 1,2s,t 1,2s,t 215C 216C and Up , Uf , U , and Y are block Hankel 217C 1,s,t s+1,2s,t 1,2s,t 1,2s,t 218C matrices defined in terms of the input and output data [3]. 219C A QR factorization is used to compress the data. 220C The fast QR algorithm uses a QR factorization which exploits 221C the block-Hankel structure. Actually, the Cholesky factor of H'*H 222C is computed. 223C 224C 2) For sequential data processing using QR algorithm, the QR 225C decomposition is done sequentially, by updating the upper 226C triangular factor R. This is also performed internally if the 227C workspace is not large enough to accommodate an entire batch. 228C 229C 3) For non-sequential or sequential data processing using 230C Cholesky algorithm, the correlation matrix of input-output data is 231C computed (sequentially, if requested), taking advantage of the 232C block Hankel structure [7]. Then, the Cholesky factor of the 233C correlation matrix is found, if possible. 234C 235C REFERENCES 236C 237C [1] Verhaegen M., and Dewilde, P. 238C Subspace Model Identification. Part 1: The output-error 239C state-space model identification class of algorithms. 240C Int. J. Control, 56, pp. 1187-1210, 1992. 241C 242C [2] Verhaegen M. 243C Subspace Model Identification. Part 3: Analysis of the 244C ordinary output-error state-space model identification 245C algorithm. 246C Int. J. Control, 58, pp. 555-586, 1993. 247C 248C [3] Verhaegen M. 249C Identification of the deterministic part of MIMO state space 250C models given in innovations form from input-output data. 251C Automatica, Vol.30, No.1, pp.61-74, 1994. 252C 253C [4] Van Overschee, P., and De Moor, B. 254C N4SID: Subspace Algorithms for the Identification of 255C Combined Deterministic-Stochastic Systems. 256C Automatica, Vol.30, No.1, pp. 75-93, 1994. 257C 258C [5] Peternell, K., Scherrer, W. and Deistler, M. 259C Statistical Analysis of Novel Subspace Identification Methods. 260C Signal Processing, 52, pp. 161-177, 1996. 261C 262C [6] Sima, V. 263C Subspace-based Algorithms for Multivariable System 264C Identification. 265C Studies in Informatics and Control, 5, pp. 335-344, 1996. 266C 267C [7] Sima, V. 268C Cholesky or QR Factorization for Data Compression in 269C Subspace-based Identification ? 270C Proceedings of the Second NICONET Workshop on ``Numerical 271C Control Software: SLICOT, a Useful Tool in Industry'', 272C December 3, 1999, INRIA Rocquencourt, France, pp. 75-80, 1999. 273C 274C NUMERICAL ASPECTS 275C 276C The implemented method is numerically stable (when QR algorithm is 277C used), reliable and efficient. The fast Cholesky or QR algorithms 278C are more efficient, but the accuracy could diminish by forming the 279C correlation matrix. 280C 2 281C The QR algorithm needs 0(t(2(m+l)s) ) floating point operations. 282C 2 3 283C The Cholesky algorithm needs 0(2t(m+l) s)+0((2(m+l)s) ) floating 284C point operations. 285C 2 3 2 286C The fast QR algorithm needs 0(2t(m+l) s)+0(4(m+l) s ) floating 287C point operations. 288C 289C FURTHER COMMENTS 290C 291C For ALG = 'Q', BATCH = 'O' and LDR < NS, or BATCH <> 'O', the 292C calculations could be rather inefficient if only minimal workspace 293C (see argument LDWORK) is provided. It is advisable to provide as 294C much workspace as possible. Almost optimal efficiency can be 295C obtained for LDWORK = (NS+2)*(2*(M+L)*NOBR), assuming that the 296C cache size is large enough to accommodate R, U, Y, and DWORK. 297C 298C CONTRIBUTOR 299C 300C V. Sima, Research Institute for Informatics, Bucharest, Aug. 1999. 301C 302C REVISIONS 303C 304C Feb. 2000, Aug. 2000. 305C 306C KEYWORDS 307C 308C Cholesky decomposition, Hankel matrix, identification methods, 309C multivariable systems, QR decomposition. 310C 311C ****************************************************************** 312C 313C .. Parameters .. 314 DOUBLE PRECISION ZERO, ONE 315 PARAMETER ( ZERO = 0.0D0, ONE = 1.0D0 ) 316 INTEGER MAXCYC 317 PARAMETER ( MAXCYC = 100 ) 318C .. Scalar Arguments .. 319 INTEGER INFO, IWARN, L, LDR, LDU, LDWORK, LDY, M, NOBR, 320 $ NSMP 321 CHARACTER ALG, BATCH, CONCT, METH 322C .. Array Arguments .. 323 INTEGER IWORK(*) 324 DOUBLE PRECISION DWORK(*), R(LDR, *), U(LDU, *), Y(LDY, *) 325C .. Local Scalars .. 326 DOUBLE PRECISION UPD, TEMP 327 INTEGER I, ICOL, ICYCLE, ID, IERR, II, INICYC, INIT, 328 $ INITI, INU, INY, IREV, ISHFT2, ISHFTU, ISHFTY, 329 $ ITAU, J, JD, JWORK, LDRWMX, LDRWRK, LLDRW, 330 $ LMNOBR, LNOBR, MAXWRK, MINWRK, MLDRW, MMNOBR, 331 $ MNOBR, NCYCLE, NICYCL, NOBR2, NOBR21, NOBRM1, 332 $ NR, NS, NSF, NSL, NSLAST, NSMPSM 333 LOGICAL CHALG, CONNEC, FIRST, FQRALG, INTERM, LAST, 334 $ LINR, MOESP, N4SID, ONEBCH, QRALG 335C .. Local Arrays .. 336 DOUBLE PRECISION DUM( 1 ) 337C .. External Functions .. 338 LOGICAL LSAME 339 INTEGER ILAENV 340 EXTERNAL ILAENV, LSAME 341C .. External Subroutines .. 342 EXTERNAL DAXPY, DCOPY, DGEMM, DGEQRF, DGER, DLACPY, 343 $ DLASET, DPOTRF, DSWAP, DSYRK, IB01MY, MB04OD, 344 $ XERBLA 345C .. Intrinsic Functions .. 346 INTRINSIC INT, MAX, MIN 347C .. Save Statement .. 348C ICYCLE is used to count the cycles for BATCH = 'I'. It is 349C reinitialized at each MAXCYC cycles. 350C MAXWRK is used to store the optimal workspace. 351C NSMPSM is used to sum up the NSMP values for BATCH <> 'O'. 352 SAVE ICYCLE, MAXWRK, NSMPSM 353C .. 354C .. Executable Statements .. 355C 356C Decode the scalar input parameters. 357C 358 MOESP = LSAME( METH, 'M' ) 359 N4SID = LSAME( METH, 'N' ) 360 FQRALG = LSAME( ALG, 'F' ) 361 QRALG = LSAME( ALG, 'Q' ) 362 CHALG = LSAME( ALG, 'C' ) 363 ONEBCH = LSAME( BATCH, 'O' ) 364 FIRST = LSAME( BATCH, 'F' ) .OR. ONEBCH 365 INTERM = LSAME( BATCH, 'I' ) 366 LAST = LSAME( BATCH, 'L' ) .OR. ONEBCH 367 IF( .NOT.ONEBCH ) THEN 368 CONNEC = LSAME( CONCT, 'C' ) 369 ELSE 370 CONNEC = .FALSE. 371 END IF 372C 373 MNOBR = M*NOBR 374 LNOBR = L*NOBR 375 LMNOBR = LNOBR + MNOBR 376 MMNOBR = MNOBR + MNOBR 377 NOBRM1 = NOBR - 1 378 NOBR21 = NOBR + NOBRM1 379 NOBR2 = NOBR21 + 1 380 IWARN = 0 381 INFO = 0 382 IERR = 0 383 IF( FIRST ) THEN 384 ICYCLE = 1 385 MAXWRK = 1 386 NSMPSM = 0 387 END IF 388 NSMPSM = NSMPSM + NSMP 389 NR = LMNOBR + LMNOBR 390C 391C Check the scalar input parameters. 392C 393 IF( .NOT.( MOESP .OR. N4SID ) ) THEN 394 INFO = -1 395 ELSE IF( .NOT.( FQRALG .OR. QRALG .OR. CHALG ) ) THEN 396 INFO = -2 397 ELSE IF( .NOT.( FIRST .OR. INTERM .OR. LAST ) ) THEN 398 INFO = -3 399 ELSE IF( .NOT. ONEBCH ) THEN 400 IF( .NOT.( CONNEC .OR. LSAME( CONCT, 'N' ) ) ) 401 $ INFO = -4 402 END IF 403 IF( INFO.EQ.0 ) THEN 404 IF( NOBR.LE.0 ) THEN 405 INFO = -5 406 ELSE IF( M.LT.0 ) THEN 407 INFO = -6 408 ELSE IF( L.LE.0 ) THEN 409 INFO = -7 410 ELSE IF( NSMP.LT.NOBR2 .OR. 411 $ ( LAST .AND. NSMPSM.LT.NR+NOBR21 ) ) THEN 412 INFO = -8 413 ELSE IF( LDU.LT.1 .OR. ( M.GT.0 .AND. LDU.LT.NSMP ) ) THEN 414 INFO = -10 415 ELSE IF( LDY.LT.NSMP ) THEN 416 INFO = -12 417 ELSE IF( LDR.LT.NR ) THEN 418 INFO = -14 419 ELSE 420C 421C Compute workspace. 422C (Note: Comments in the code beginning "Workspace:" describe 423C the minimal amount of workspace needed at that point in the 424C code, as well as the preferred amount for good performance. 425C NB refers to the optimal block size for the immediately 426C following subroutine, as returned by ILAENV.) 427C 428 NS = NSMP - NOBR21 429 IF ( CHALG ) THEN 430 IF ( .NOT.ONEBCH .AND. CONNEC ) THEN 431 MINWRK = 2*( NR - M - L ) 432 ELSE 433 MINWRK = 1 434 END IF 435 ELSE IF ( FQRALG ) THEN 436 IF ( .NOT.ONEBCH .AND. CONNEC ) THEN 437 MINWRK = NR*( M + L + 3 ) 438 ELSE IF ( FIRST .OR. INTERM ) THEN 439 MINWRK = NR*( M + L + 1 ) 440 ELSE 441 MINWRK = 2*NR*( M + L + 1 ) + NR 442 END IF 443 ELSE 444 MINWRK = 2*NR 445 MAXWRK = NR + NR*ILAENV( 1, 'DGEQRF', ' ', NS, NR, -1, 446 $ -1 ) 447 IF ( FIRST ) THEN 448 IF ( LDR.LT.NS ) THEN 449 MINWRK = MINWRK + NR 450 MAXWRK = NS*NR + MAXWRK 451 END IF 452 ELSE 453 IF ( CONNEC ) THEN 454 MINWRK = MINWRK*( NOBR + 1 ) 455 ELSE 456 MINWRK = MINWRK + NR 457 END IF 458 MAXWRK = NS*NR + MAXWRK 459 END IF 460 END IF 461 MAXWRK = MAX( MINWRK, MAXWRK ) 462C 463 IF( LDWORK.LT.MINWRK ) THEN 464 INFO = -17 465 DWORK( 1 ) = MINWRK 466 END IF 467 END IF 468 END IF 469C 470C Return if there are illegal arguments. 471C 472 IF( INFO.NE.0 ) THEN 473 CALL XERBLA( 'IB01MD', -INFO ) 474 RETURN 475 END IF 476C 477 IF ( CHALG ) THEN 478C 479C Compute the R factor from a Cholesky factorization of the 480C input-output data correlation matrix. 481C 482C Set the parameters for constructing the correlations of the 483C current block. 484C 485 LDRWRK = 2*NOBR2 - 2 486 IF( FIRST ) THEN 487 UPD = ZERO 488 ELSE 489 UPD = ONE 490 END IF 491C 492 IF( .NOT.FIRST .AND. CONNEC ) THEN 493C 494C Restore the saved (M+L)*(2*NOBR-1) "connection" elements of 495C U and Y into their appropriate position in sequential 496C processing. The process is performed column-wise, in 497C reverse order, first for Y and then for U. 498C Workspace: need (4*NOBR-2)*(M+L). 499C 500 IREV = NR - M - L - NOBR21 + 1 501 ICOL = 2*( NR - M - L ) - LDRWRK + 1 502C 503 DO 10 I = 1, M + L 504 CALL DCOPY( NOBR21, DWORK(IREV), -1, DWORK(ICOL), -1 ) 505 IREV = IREV - NOBR21 506 ICOL = ICOL - LDRWRK 507 10 CONTINUE 508C 509 IF ( M.GT.0 ) 510 $ CALL DLACPY( 'Full', NOBR21, M, U, LDU, DWORK(NOBR2), 511 $ LDRWRK ) 512 CALL DLACPY( 'Full', NOBR21, L, Y, LDY, 513 $ DWORK(LDRWRK*M+NOBR2), LDRWRK ) 514 END IF 515C 516 IF ( M.GT.0 ) THEN 517C 518C Let Guu(i,j) = Guu0(i,j) + u_i*u_j' + u_(i+1)*u_(j+1)' + 519C ... + u_(i+NS-1)*u_(j+NS-1)', 520C where u_i' is the i-th row of U, j = 1 : 2s, i = 1 : j, 521C NS = NSMP - 2s + 1, and Guu0(i,j) is a zero matrix for 522C BATCH = 'O' or 'F', and it is the matrix Guu(i,j) computed 523C till the current block for BATCH = 'I' or 'L'. The matrix 524C Guu(i,j) is m-by-m, and Guu(j,j) is symmetric. The 525C upper triangle of the U-U correlations, Guu, is computed 526C (or updated) column-wise in the array R, that is, in the 527C order Guu(1,1), Guu(1,2), Guu(2,2), ..., Guu(2s,2s). 528C Only the submatrices of the first block-row are fully 529C computed (or updated). The remaining ones are determined 530C exploiting the block-Hankel structure, using the updating 531C formula 532C 533C Guu(i+1,j+1) = Guu0(i+1,j+1) - Guu0(i,j) + Guu(i,j) + 534C u_(i+NS)*u_(j+NS)' - u_i*u_j'. 535C 536 IF( .NOT.FIRST ) THEN 537C 538C Subtract the contribution of the previous block of data 539C in sequential processing. The columns must be processed 540C in backward order. 541C 542 DO 20 I = NOBR21*M, 1, -1 543 CALL DAXPY( I, -ONE, R(1,I), 1, R(M+1,M+I), 1 ) 544 20 CONTINUE 545C 546 END IF 547C 548C Compute/update Guu(1,1). 549C 550 IF( .NOT.FIRST .AND. CONNEC ) 551 $ CALL DSYRK( 'Upper', 'Transpose', M, NOBR21, ONE, DWORK, 552 $ LDRWRK, UPD, R, LDR ) 553 CALL DSYRK( 'Upper', 'Transpose', M, NS, ONE, U, LDU, UPD, 554 $ R, LDR ) 555C 556 JD = 1 557C 558 IF( FIRST .OR. .NOT.CONNEC ) THEN 559C 560 DO 70 J = 2, NOBR2 561 JD = JD + M 562 ID = M + 1 563C 564C Compute/update Guu(1,j). 565C 566 CALL DGEMM( 'Transpose', 'NoTranspose', M, M, NS, ONE, 567 $ U, LDU, U(J,1), LDU, UPD, R(1,JD), LDR ) 568C 569C Compute/update Guu(2:j,j), exploiting the 570C block-Hankel structure. 571C 572 IF( FIRST ) THEN 573C 574 DO 30 I = JD - M, JD - 1 575 CALL DCOPY( I, R(1,I), 1, R(M+1,M+I), 1 ) 576 30 CONTINUE 577C 578 ELSE 579C 580 DO 40 I = JD - M, JD - 1 581 CALL DAXPY( I, ONE, R(1,I), 1, R(M+1,M+I), 1 ) 582 40 CONTINUE 583C 584 END IF 585C 586 DO 50 I = 2, J - 1 587 CALL DGER( M, M, ONE, U(NS+I-1,1), LDU, 588 $ U(NS+J-1,1), LDU, R(ID,JD), LDR ) 589 CALL DGER( M, M, -ONE, U(I-1,1), LDU, U(J-1,1), 590 $ LDU, R(ID,JD), LDR ) 591 ID = ID + M 592 50 CONTINUE 593C 594 DO 60 I = 1, M 595 CALL DAXPY( I, U(NS+J-1,I), U(NS+J-1,1), LDU, 596 $ R(JD,JD+I-1), 1 ) 597 CALL DAXPY( I, -U(J-1,I), U(J-1,1), LDU, 598 $ R(JD,JD+I-1), 1 ) 599 60 CONTINUE 600C 601 70 CONTINUE 602C 603 ELSE 604C 605 DO 120 J = 2, NOBR2 606 JD = JD + M 607 ID = M + 1 608C 609C Compute/update Guu(1,j) for sequential processing 610C with connected blocks. 611C 612 CALL DGEMM( 'Transpose', 'NoTranspose', M, M, NOBR21, 613 $ ONE, DWORK, LDRWRK, DWORK(J), LDRWRK, UPD, 614 $ R(1,JD), LDR ) 615 CALL DGEMM( 'Transpose', 'NoTranspose', M, M, NS, ONE, 616 $ U, LDU, U(J,1), LDU, ONE, R(1,JD), LDR ) 617C 618C Compute/update Guu(2:j,j) for sequential processing 619C with connected blocks, exploiting the block-Hankel 620C structure. 621C 622 IF( FIRST ) THEN 623C 624 DO 80 I = JD - M, JD - 1 625 CALL DCOPY( I, R(1,I), 1, R(M+1,M+I), 1 ) 626 80 CONTINUE 627C 628 ELSE 629C 630 DO 90 I = JD - M, JD - 1 631 CALL DAXPY( I, ONE, R(1,I), 1, R(M+1,M+I), 1 ) 632 90 CONTINUE 633C 634 END IF 635C 636 DO 100 I = 2, J - 1 637 CALL DGER( M, M, ONE, U(NS+I-1,1), LDU, 638 $ U(NS+J-1,1), LDU, R(ID,JD), LDR ) 639 CALL DGER( M, M, -ONE, DWORK(I-1), LDRWRK, 640 $ DWORK(J-1), LDRWRK, R(ID,JD), LDR ) 641 ID = ID + M 642 100 CONTINUE 643C 644 DO 110 I = 1, M 645 CALL DAXPY( I, U(NS+J-1,I), U(NS+J-1,1), LDU, 646 $ R(JD,JD+I-1), 1 ) 647 CALL DAXPY( I, -DWORK((I-1)*LDRWRK+J-1), 648 $ DWORK(J-1), LDRWRK, R(JD,JD+I-1), 1 ) 649 110 CONTINUE 650C 651 120 CONTINUE 652C 653 END IF 654C 655 IF ( LAST .AND. MOESP ) THEN 656C 657C Interchange past and future parts for MOESP algorithm. 658C (Only the upper triangular parts are interchanged, and 659C the (1,2) part is transposed in-situ.) 660C 661 TEMP = R(1,1) 662 R(1,1) = R(MNOBR+1,MNOBR+1) 663 R(MNOBR+1,MNOBR+1) = TEMP 664C 665 DO 130 J = 2, MNOBR 666 CALL DSWAP( J, R(1,J), 1, R(MNOBR+1,MNOBR+J), 1 ) 667 CALL DSWAP( J-1, R(1,MNOBR+J), 1, R(J,MNOBR+1), LDR ) 668 130 CONTINUE 669C 670 END IF 671C 672C Let Guy(i,j) = Guy0(i,j) + u_i*y_j' + u_(i+1)*y_(j+1)' + 673C ... + u_(i+NS-1)*y_(j+NS-1)', 674C where u_i' is the i-th row of U, y_j' is the j-th row 675C of Y, j = 1 : 2s, i = 1 : 2s, NS = NSMP - 2s + 1, and 676C Guy0(i,j) is a zero matrix for BATCH = 'O' or 'F', and it 677C is the matrix Guy(i,j) computed till the current block for 678C BATCH = 'I' or 'L'. Guy(i,j) is m-by-L. The U-Y 679C correlations, Guy, are computed (or updated) column-wise 680C in the array R. Only the submatrices of the first block- 681C column and block-row are fully computed (or updated). The 682C remaining ones are determined exploiting the block-Hankel 683C structure, using the updating formula 684C 685C Guy(i+1,j+1) = Guy0(i+1,j+1) - Guy0(i,j) + Guy(i,j) + 686C u_(i+NS)*y(j+NS)' - u_i*y_j'. 687C 688 II = MMNOBR - M 689 IF( .NOT.FIRST ) THEN 690C 691C Subtract the contribution of the previous block of data 692C in sequential processing. The columns must be processed 693C in backward order. 694C 695 DO 140 I = NR - L, MMNOBR + 1, -1 696 CALL DAXPY( II, -ONE, R(1,I), 1, R(M+1,L+I), 1 ) 697 140 CONTINUE 698C 699 END IF 700C 701C Compute/update the first block-column of Guy, Guy(i,1). 702C 703 IF( FIRST .OR. .NOT.CONNEC ) THEN 704C 705 DO 150 I = 1, NOBR2 706 CALL DGEMM( 'Transpose', 'NoTranspose', M, L, NS, ONE, 707 $ U(I,1), LDU, Y, LDY, UPD, 708 $ R((I-1)*M+1,MMNOBR+1), LDR ) 709 150 CONTINUE 710C 711 ELSE 712C 713 DO 160 I = 1, NOBR2 714 CALL DGEMM( 'Transpose', 'NoTranspose', M, L, NOBR21, 715 $ ONE, DWORK(I), LDRWRK, DWORK(LDRWRK*M+1), 716 $ LDRWRK, UPD, R((I-1)*M+1,MMNOBR+1), LDR ) 717 CALL DGEMM( 'Transpose', 'NoTranspose', M, L, NS, ONE, 718 $ U(I,1), LDU, Y, LDY, ONE, 719 $ R((I-1)*M+1,MMNOBR+1), LDR ) 720 160 CONTINUE 721C 722 END IF 723C 724 JD = MMNOBR + 1 725C 726 IF( FIRST .OR. .NOT.CONNEC ) THEN 727C 728 DO 200 J = 2, NOBR2 729 JD = JD + L 730 ID = M + 1 731C 732C Compute/update Guy(1,j). 733C 734 CALL DGEMM( 'Transpose', 'NoTranspose', M, L, NS, ONE, 735 $ U, LDU, Y(J,1), LDY, UPD, R(1,JD), LDR ) 736C 737C Compute/update Guy(2:2*s,j), exploiting the 738C block-Hankel structure. 739C 740 IF( FIRST ) THEN 741C 742 DO 170 I = JD - L, JD - 1 743 CALL DCOPY( II, R(1,I), 1, R(M+1,L+I), 1 ) 744 170 CONTINUE 745C 746 ELSE 747C 748 DO 180 I = JD - L, JD - 1 749 CALL DAXPY( II, ONE, R(1,I), 1, R(M+1,L+I), 1 ) 750 180 CONTINUE 751C 752 END IF 753C 754 DO 190 I = 2, NOBR2 755 CALL DGER( M, L, ONE, U(NS+I-1,1), LDU, 756 $ Y(NS+J-1,1), LDY, R(ID,JD), LDR ) 757 CALL DGER( M, L, -ONE, U(I-1,1), LDU, Y(J-1,1), 758 $ LDY, R(ID,JD), LDR ) 759 ID = ID + M 760 190 CONTINUE 761C 762 200 CONTINUE 763C 764 ELSE 765C 766 DO 240 J = 2, NOBR2 767 JD = JD + L 768 ID = M + 1 769C 770C Compute/update Guy(1,j) for sequential processing 771C with connected blocks. 772C 773 CALL DGEMM( 'Transpose', 'NoTranspose', M, L, NOBR21, 774 $ ONE, DWORK, LDRWRK, DWORK(LDRWRK*M+J), 775 $ LDRWRK, UPD, R(1,JD), LDR ) 776 CALL DGEMM( 'Transpose', 'NoTranspose', M, L, NS, ONE, 777 $ U, LDU, Y(J,1), LDY, ONE, R(1,JD), LDR ) 778C 779C Compute/update Guy(2:2*s,j) for sequential 780C processing with connected blocks, exploiting the 781C block-Hankel structure. 782C 783 IF( FIRST ) THEN 784C 785 DO 210 I = JD - L, JD - 1 786 CALL DCOPY( II, R(1,I), 1, R(M+1,L+I), 1 ) 787 210 CONTINUE 788C 789 ELSE 790C 791 DO 220 I = JD - L, JD - 1 792 CALL DAXPY( II, ONE, R(1,I), 1, R(M+1,L+I), 1 ) 793 220 CONTINUE 794C 795 END IF 796C 797 DO 230 I = 2, NOBR2 798 CALL DGER( M, L, ONE, U(NS+I-1,1), LDU, 799 $ Y(NS+J-1,1), LDY, R(ID,JD), LDR ) 800 CALL DGER( M, L, -ONE, DWORK(I-1), LDRWRK, 801 $ DWORK(LDRWRK*M+J-1), LDRWRK, R(ID,JD), 802 $ LDR ) 803 ID = ID + M 804 230 CONTINUE 805C 806 240 CONTINUE 807C 808 END IF 809C 810 IF ( LAST .AND. MOESP ) THEN 811C 812C Interchange past and future parts of U-Y correlations 813C for MOESP algorithm. 814C 815 DO 250 J = MMNOBR + 1, NR 816 CALL DSWAP( MNOBR, R(1,J), 1, R(MNOBR+1,J), 1 ) 817 250 CONTINUE 818C 819 END IF 820 END IF 821C 822C Let Gyy(i,j) = Gyy0(i,j) + y_i*y_i' + y_(i+1)*y_(i+1)' + ... + 823C y_(i+NS-1)*y_(i+NS-1)', 824C where y_i' is the i-th row of Y, j = 1 : 2s, i = 1 : j, 825C NS = NSMP - 2s + 1, and Gyy0(i,j) is a zero matrix for 826C BATCH = 'O' or 'F', and it is the matrix Gyy(i,j) computed till 827C the current block for BATCH = 'I' or 'L'. Gyy(i,j) is L-by-L, 828C and Gyy(j,j) is symmetric. The upper triangle of the Y-Y 829C correlations, Gyy, is computed (or updated) column-wise in 830C the corresponding part of the array R, that is, in the order 831C Gyy(1,1), Gyy(1,2), Gyy(2,2), ..., Gyy(2s,2s). Only the 832C submatrices of the first block-row are fully computed (or 833C updated). The remaining ones are determined exploiting the 834C block-Hankel structure, using the updating formula 835C 836C Gyy(i+1,j+1) = Gyy0(i+1,j+1) - Gyy0(i,j) + Gyy(i,j) + 837C y_(i+NS)*y_(j+NS)' - y_i*y_j'. 838C 839 JD = MMNOBR + 1 840C 841 IF( .NOT.FIRST ) THEN 842C 843C Subtract the contribution of the previous block of data 844C in sequential processing. The columns must be processed in 845C backward order. 846C 847 DO 260 I = NR - L, MMNOBR + 1, -1 848 CALL DAXPY( I-MMNOBR, -ONE, R(JD,I), 1, R(JD+L,L+I), 1 ) 849 260 CONTINUE 850C 851 END IF 852C 853C Compute/update Gyy(1,1). 854C 855 IF( .NOT.FIRST .AND. CONNEC ) 856 $ CALL DSYRK( 'Upper', 'Transpose', L, NOBR21, ONE, 857 $ DWORK(LDRWRK*M+1), LDRWRK, UPD, R(JD,JD), LDR ) 858 CALL DSYRK( 'Upper', 'Transpose', L, NS, ONE, Y, LDY, UPD, 859 $ R(JD,JD), LDR ) 860C 861 IF( FIRST .OR. .NOT.CONNEC ) THEN 862C 863 DO 310 J = 2, NOBR2 864 JD = JD + L 865 ID = MMNOBR + L + 1 866C 867C Compute/update Gyy(1,j). 868C 869 CALL DGEMM( 'Transpose', 'NoTranspose', L, L, NS, ONE, Y, 870 $ LDY, Y(J,1), LDY, UPD, R(MMNOBR+1,JD), LDR ) 871C 872C Compute/update Gyy(2:j,j), exploiting the block-Hankel 873C structure. 874C 875 IF( FIRST ) THEN 876C 877 DO 270 I = JD - L, JD - 1 878 CALL DCOPY( I-MMNOBR, R(MMNOBR+1,I), 1, 879 $ R(MMNOBR+L+1,L+I), 1 ) 880 270 CONTINUE 881C 882 ELSE 883C 884 DO 280 I = JD - L, JD - 1 885 CALL DAXPY( I-MMNOBR, ONE, R(MMNOBR+1,I), 1, 886 $ R(MMNOBR+L+1,L+I), 1 ) 887 280 CONTINUE 888C 889 END IF 890C 891 DO 290 I = 2, J - 1 892 CALL DGER( L, L, ONE, Y(NS+I-1,1), LDY, Y(NS+J-1,1), 893 $ LDY, R(ID,JD), LDR ) 894 CALL DGER( L, L, -ONE, Y(I-1,1), LDY, Y(J-1,1), LDY, 895 $ R(ID,JD), LDR ) 896 ID = ID + L 897 290 CONTINUE 898C 899 DO 300 I = 1, L 900 CALL DAXPY( I, Y(NS+J-1,I), Y(NS+J-1,1), LDY, 901 $ R(JD,JD+I-1), 1 ) 902 CALL DAXPY( I, -Y(J-1,I), Y(J-1,1), LDY, R(JD,JD+I-1), 903 $ 1 ) 904 300 CONTINUE 905C 906 310 CONTINUE 907C 908 ELSE 909C 910 DO 360 J = 2, NOBR2 911 JD = JD + L 912 ID = MMNOBR + L + 1 913C 914C Compute/update Gyy(1,j) for sequential processing with 915C connected blocks. 916C 917 CALL DGEMM( 'Transpose', 'NoTranspose', L, L, NOBR21, 918 $ ONE, DWORK(LDRWRK*M+1), LDRWRK, 919 $ DWORK(LDRWRK*M+J), LDRWRK, UPD, 920 $ R(MMNOBR+1,JD), LDR ) 921 CALL DGEMM( 'Transpose', 'NoTranspose', L, L, NS, ONE, Y, 922 $ LDY, Y(J,1), LDY, ONE, R(MMNOBR+1,JD), LDR ) 923C 924C Compute/update Gyy(2:j,j) for sequential processing 925C with connected blocks, exploiting the block-Hankel 926C structure. 927C 928 IF( FIRST ) THEN 929C 930 DO 320 I = JD - L, JD - 1 931 CALL DCOPY( I-MMNOBR, R(MMNOBR+1,I), 1, 932 $ R(MMNOBR+L+1,L+I), 1 ) 933 320 CONTINUE 934C 935 ELSE 936C 937 DO 330 I = JD - L, JD - 1 938 CALL DAXPY( I-MMNOBR, ONE, R(MMNOBR+1,I), 1, 939 $ R(MMNOBR+L+1,L+I), 1 ) 940 330 CONTINUE 941C 942 END IF 943C 944 DO 340 I = 2, J - 1 945 CALL DGER( L, L, ONE, Y(NS+I-1,1), LDY, Y(NS+J-1,1), 946 $ LDY, R(ID,JD), LDR ) 947 CALL DGER( L, L, -ONE, DWORK(LDRWRK*M+I-1), LDRWRK, 948 $ DWORK(LDRWRK*M+J-1), LDRWRK, R(ID,JD), 949 $ LDR ) 950 ID = ID + L 951 340 CONTINUE 952C 953 DO 350 I = 1, L 954 CALL DAXPY( I, Y(NS+J-1,I), Y(NS+J-1,1), LDY, 955 $ R(JD,JD+I-1), 1 ) 956 CALL DAXPY( I, -DWORK(LDRWRK*(M+I-1)+J-1), 957 $ DWORK(LDRWRK*M+J-1), LDRWRK, R(JD,JD+I-1), 958 $ 1 ) 959 350 CONTINUE 960C 961 360 CONTINUE 962C 963 END IF 964C 965 IF ( .NOT.LAST ) THEN 966 IF ( CONNEC ) THEN 967C 968C For sequential processing with connected data blocks, 969C save the remaining ("connection") elements of U and Y 970C in the first (M+L)*(2*NOBR-1) locations of DWORK. 971C 972 IF ( M.GT.0 ) 973 $ CALL DLACPY( 'Full', NOBR21, M, U(NS+1,1), LDU, DWORK, 974 $ NOBR21 ) 975 CALL DLACPY( 'Full', NOBR21, L, Y(NS+1,1), LDY, 976 $ DWORK(MMNOBR-M+1), NOBR21 ) 977 END IF 978C 979C Return to get new data. 980C 981 ICYCLE = ICYCLE + 1 982 IF ( ICYCLE.GT.MAXCYC ) 983 $ IWARN = 1 984 RETURN 985C 986 ELSE 987C 988C Try to compute the Cholesky factor of the correlation 989C matrix. 990C 991 CALL DPOTRF( 'Upper', NR, R, LDR, IERR ) 992 GO TO 370 993 END IF 994 ELSE IF ( FQRALG ) THEN 995C 996C Compute the R factor from a fast QR factorization of the 997C input-output data correlation matrix. 998C 999 CALL IB01MY( METH, BATCH, CONCT, NOBR, M, L, NSMP, U, LDU, 1000 $ Y, LDY, R, LDR, IWORK, DWORK, LDWORK, IWARN, 1001 $ IERR ) 1002 IF( .NOT.LAST ) 1003 $ RETURN 1004 MAXWRK = MAX( MAXWRK, INT( DWORK(1) ) ) 1005 END IF 1006C 1007 370 CONTINUE 1008C 1009 IF( IERR.NE.0 ) THEN 1010C 1011C Error return from a fast factorization algorithm of the 1012C input-output data correlation matrix. 1013C 1014 IF( ONEBCH ) THEN 1015 QRALG = .TRUE. 1016 IWARN = 2 1017 MINWRK = 2*NR 1018 MAXWRK = NR + NR*ILAENV( 1, 'DGEQRF', ' ', NS, NR, -1, 1019 $ -1 ) 1020 IF ( LDR.LT.NS ) THEN 1021 MINWRK = MINWRK + NR 1022 MAXWRK = NS*NR + MAXWRK 1023 END IF 1024 MAXWRK = MAX( MINWRK, MAXWRK ) 1025C 1026 IF( LDWORK.LT.MINWRK ) THEN 1027 INFO = -17 1028C 1029C Return: Not enough workspace. 1030C 1031 DWORK( 1 ) = MINWRK 1032 CALL XERBLA( 'IB01MD', -INFO ) 1033 RETURN 1034 END IF 1035 ELSE 1036 INFO = 1 1037 RETURN 1038 END IF 1039 END IF 1040C 1041 IF ( QRALG ) THEN 1042C 1043C Compute the R factor from a QR factorization of the matrix H 1044C of concatenated block Hankel matrices. 1045C 1046C Construct the matrix H. 1047C 1048C Set the parameters for constructing the current segment of the 1049C Hankel matrix, taking the available memory space into account. 1050C INITI+1 points to the beginning rows of U and Y from which 1051C data are taken when NCYCLE > 1 inner cycles are needed, 1052C or for sequential processing with connected blocks. 1053C LDRWMX is the number of rows that can fit in the working space. 1054C LDRWRK is the actual number of rows processed in this space. 1055C NSLAST is the number of samples to be processed at the last 1056C inner cycle. 1057C 1058 INITI = 0 1059 LDRWMX = LDWORK / NR - 2 1060 NCYCLE = 1 1061 NSLAST = NSMP 1062 LINR = .FALSE. 1063 IF ( FIRST ) THEN 1064 LINR = LDR.GE.NS 1065 LDRWRK = NS 1066 ELSE IF ( CONNEC ) THEN 1067 LDRWRK = NSMP 1068 ELSE 1069 LDRWRK = NS 1070 END IF 1071 INICYC = 1 1072C 1073 IF ( .NOT.LINR ) THEN 1074 IF ( LDRWMX.LT.LDRWRK ) THEN 1075C 1076C Not enough working space for doing a single inner cycle. 1077C NCYCLE inner cycles are to be performed for the current 1078C data block using the working space. 1079C 1080 NCYCLE = LDRWRK / LDRWMX 1081 NSLAST = MOD( LDRWRK, LDRWMX ) 1082 IF ( NSLAST.NE.0 ) THEN 1083 NCYCLE = NCYCLE + 1 1084 ELSE 1085 NSLAST = LDRWMX 1086 END IF 1087 LDRWRK = LDRWMX 1088 NS = LDRWRK 1089 IF ( FIRST ) INICYC = 2 1090 END IF 1091 MLDRW = M*LDRWRK 1092 LLDRW = L*LDRWRK 1093 INU = MLDRW*NOBR + 1 1094 INY = MLDRW*NOBR2 + 1 1095 END IF 1096C 1097C Process the data given at the current call. 1098C 1099 IF ( .NOT.FIRST .AND. CONNEC ) THEN 1100C 1101C Restore the saved (M+L)*(2*NOBR-1) "connection" elements of 1102C U and Y into their appropriate position in sequential 1103C processing. The process is performed column-wise, in 1104C reverse order, first for Y and then for U. 1105C 1106 IREV = NR - M - L - NOBR21 + 1 1107 ICOL = INY + LLDRW - LDRWRK 1108C 1109 DO 380 I = 1, L 1110 CALL DCOPY( NOBR21, DWORK(IREV), -1, DWORK(ICOL), -1 ) 1111 IREV = IREV - NOBR21 1112 ICOL = ICOL - LDRWRK 1113 380 CONTINUE 1114C 1115 IF( MOESP ) THEN 1116 ICOL = INU + MLDRW - LDRWRK 1117 ELSE 1118 ICOL = MLDRW - LDRWRK + 1 1119 END IF 1120C 1121 DO 390 I = 1, M 1122 CALL DCOPY( NOBR21, DWORK(IREV), -1, DWORK(ICOL), -1 ) 1123 IREV = IREV - NOBR21 1124 ICOL = ICOL - LDRWRK 1125 390 CONTINUE 1126C 1127 IF( MOESP ) 1128 $ CALL DLACPY( 'Full', NOBRM1, M, DWORK(INU+NOBR), LDRWRK, 1129 $ DWORK, LDRWRK ) 1130 END IF 1131C 1132C Data compression using QR factorization. 1133C 1134 IF ( FIRST ) THEN 1135C 1136C Non-sequential data processing or first block in 1137C sequential data processing: 1138C Use the general QR factorization algorithm. 1139C 1140 IF ( LINR ) THEN 1141C 1142C Put the input-output data in the array R. 1143C 1144 IF( M.GT.0 ) THEN 1145 IF( MOESP ) THEN 1146C 1147 DO 400 I = 1, NOBR 1148 CALL DLACPY( 'Full', NS, M, U(NOBR+I,1), LDU, 1149 $ R(1,M*(I-1)+1), LDR ) 1150 400 CONTINUE 1151C 1152 DO 410 I = 1, NOBR 1153 CALL DLACPY( 'Full', NS, M, U(I,1), LDU, 1154 $ R(1,MNOBR+M*(I-1)+1), LDR ) 1155 410 CONTINUE 1156C 1157 ELSE 1158C 1159 DO 420 I = 1, NOBR2 1160 CALL DLACPY( 'Full', NS, M, U(I,1), LDU, 1161 $ R(1,M*(I-1)+1), LDR ) 1162 420 CONTINUE 1163C 1164 END IF 1165 END IF 1166C 1167 DO 430 I = 1, NOBR2 1168 CALL DLACPY( 'Full', NS, L, Y(I,1), LDY, 1169 $ R(1,MMNOBR+L*(I-1)+1), LDR ) 1170 430 CONTINUE 1171C 1172C Workspace: need 4*(M+L)*NOBR, 1173C prefer 2*(M+L)*NOBR+2*(M+L)*NOBR*NB. 1174C 1175 ITAU = 1 1176 JWORK = ITAU + NR 1177 CALL DGEQRF( NS, NR, R, LDR, DWORK(ITAU), DWORK(JWORK), 1178 $ LDWORK-JWORK+1, IERR ) 1179 ELSE 1180C 1181C Put the input-output data in the array DWORK. 1182C 1183 IF( M.GT.0 ) THEN 1184 ISHFTU = 1 1185 IF( MOESP ) THEN 1186 ISHFT2 = INU 1187C 1188 DO 440 I = 1, NOBR 1189 CALL DLACPY( 'Full', NS, M, U(NOBR+I,1), LDU, 1190 $ DWORK(ISHFTU), LDRWRK ) 1191 ISHFTU = ISHFTU + MLDRW 1192 440 CONTINUE 1193C 1194 DO 450 I = 1, NOBR 1195 CALL DLACPY( 'Full', NS, M, U(I,1), LDU, 1196 $ DWORK(ISHFT2), LDRWRK ) 1197 ISHFT2 = ISHFT2 + MLDRW 1198 450 CONTINUE 1199C 1200 ELSE 1201C 1202 DO 460 I = 1, NOBR2 1203 CALL DLACPY( 'Full', NS, M, U(I,1), LDU, 1204 $ DWORK(ISHFTU), LDRWRK ) 1205 ISHFTU = ISHFTU + MLDRW 1206 460 CONTINUE 1207C 1208 END IF 1209 END IF 1210C 1211 ISHFTY = INY 1212C 1213 DO 470 I = 1, NOBR2 1214 CALL DLACPY( 'Full', NS, L, Y(I,1), LDY, 1215 $ DWORK(ISHFTY), LDRWRK ) 1216 ISHFTY = ISHFTY + LLDRW 1217 470 CONTINUE 1218C 1219C Workspace: need 2*(M+L)*NOBR + 4*(M+L)*NOBR, 1220C prefer NS*2*(M+L)*NOBR + 2*(M+L)*NOBR 1221C + 2*(M+L)*NOBR*NB, 1222C used LDRWRK*2*(M+L)*NOBR + 4*(M+L)*NOBR, 1223C where NS = NSMP - 2*NOBR + 1, 1224C LDRWRK = min(NS, LDWORK/(2*(M+L)*NOBR)-2). 1225C 1226 ITAU = LDRWRK*NR + 1 1227 JWORK = ITAU + NR 1228 CALL DGEQRF( NS, NR, DWORK, LDRWRK, DWORK(ITAU), 1229 $ DWORK(JWORK), LDWORK-JWORK+1, IERR ) 1230 CALL DLACPY( 'Upper ', MIN(NS,NR), NR, DWORK, LDRWRK, R, 1231 $ LDR ) 1232 END IF 1233C 1234 IF ( NS.LT.NR ) 1235 $ CALL DLASET( 'Upper ', NR - NS, NR - NS, ZERO, ZERO, 1236 $ R(NS+1,NS+1), LDR ) 1237 INITI = INITI + NS 1238 END IF 1239C 1240 IF ( NCYCLE.GT.1 .OR. .NOT.FIRST ) THEN 1241C 1242C Remaining segments of the first data block or 1243C remaining segments/blocks in sequential data processing: 1244C Use a structure-exploiting QR factorization algorithm. 1245C 1246 NSL = LDRWRK 1247 IF ( .NOT.CONNEC ) NSL = NS 1248 ITAU = LDRWRK*NR + 1 1249 JWORK = ITAU + NR 1250C 1251 DO 560 NICYCL = INICYC, NCYCLE 1252C 1253C INIT denotes the beginning row where new data are put. 1254C 1255 IF ( CONNEC .AND. NICYCL.EQ.1 ) THEN 1256 INIT = NOBR2 1257 ELSE 1258 INIT = 1 1259 END IF 1260 IF ( NCYCLE.GT.1 .AND. NICYCL.EQ.NCYCLE ) THEN 1261C 1262C Last samples in the last data segment of a block. 1263C 1264 NS = NSLAST 1265 NSL = NSLAST 1266 END IF 1267C 1268C Put the input-output data in the array DWORK. 1269C 1270 NSF = NS 1271 IF ( INIT.GT.1 .AND. NCYCLE.GT.1 ) NSF = NSF - NOBR21 1272 IF ( M.GT.0 ) THEN 1273 ISHFTU = INIT 1274C 1275 IF( MOESP ) THEN 1276 ISHFT2 = INIT + INU - 1 1277C 1278 DO 480 I = 1, NOBR 1279 CALL DLACPY( 'Full', NSF, M, U(INITI+NOBR+I,1), 1280 $ LDU, DWORK(ISHFTU), LDRWRK ) 1281 ISHFTU = ISHFTU + MLDRW 1282 480 CONTINUE 1283C 1284 DO 490 I = 1, NOBR 1285 CALL DLACPY( 'Full', NSF, M, U(INITI+I,1), LDU, 1286 $ DWORK(ISHFT2), LDRWRK ) 1287 ISHFT2 = ISHFT2 + MLDRW 1288 490 CONTINUE 1289C 1290 ELSE 1291C 1292 DO 500 I = 1, NOBR2 1293 CALL DLACPY( 'Full', NSF, M, U(INITI+I,1), LDU, 1294 $ DWORK(ISHFTU), LDRWRK ) 1295 ISHFTU = ISHFTU + MLDRW 1296 500 CONTINUE 1297C 1298 END IF 1299 END IF 1300C 1301 ISHFTY = INIT + INY - 1 1302C 1303 DO 510 I = 1, NOBR2 1304 CALL DLACPY( 'Full', NSF, L, Y(INITI+I,1), LDY, 1305 $ DWORK(ISHFTY), LDRWRK ) 1306 ISHFTY = ISHFTY + LLDRW 1307 510 CONTINUE 1308C 1309 IF ( INIT.GT.1 ) THEN 1310C 1311C Prepare the connection to the previous block of data 1312C in sequential processing. 1313C 1314 IF( MOESP .AND. M.GT.0 ) 1315 $ CALL DLACPY( 'Full', NOBR, M, U, LDU, DWORK(NOBR), 1316 $ LDRWRK ) 1317C 1318C Shift the elements from the connection to the previous 1319C block of data in sequential processing. 1320C 1321 IF ( M.GT.0 ) THEN 1322 ISHFTU = MLDRW + 1 1323C 1324 IF( MOESP ) THEN 1325 ISHFT2 = MLDRW + INU 1326C 1327 DO 520 I = 1, NOBRM1 1328 CALL DLACPY( 'Full', NOBR21, M, 1329 $ DWORK(ISHFTU-MLDRW+1), LDRWRK, 1330 $ DWORK(ISHFTU), LDRWRK ) 1331 ISHFTU = ISHFTU + MLDRW 1332 520 CONTINUE 1333C 1334 DO 530 I = 1, NOBRM1 1335 CALL DLACPY( 'Full', NOBR21, M, 1336 $ DWORK(ISHFT2-MLDRW+1), LDRWRK, 1337 $ DWORK(ISHFT2), LDRWRK ) 1338 ISHFT2 = ISHFT2 + MLDRW 1339 530 CONTINUE 1340C 1341 ELSE 1342C 1343 DO 540 I = 1, NOBR21 1344 CALL DLACPY( 'Full', NOBR21, M, 1345 $ DWORK(ISHFTU-MLDRW+1), LDRWRK, 1346 $ DWORK(ISHFTU), LDRWRK ) 1347 ISHFTU = ISHFTU + MLDRW 1348 540 CONTINUE 1349C 1350 END IF 1351 END IF 1352C 1353 ISHFTY = LLDRW + INY 1354C 1355 DO 550 I = 1, NOBR21 1356 CALL DLACPY( 'Full', NOBR21, L, 1357 $ DWORK(ISHFTY-LLDRW+1), LDRWRK, 1358 $ DWORK(ISHFTY), LDRWRK ) 1359 ISHFTY = ISHFTY + LLDRW 1360 550 CONTINUE 1361C 1362 END IF 1363C 1364C Workspace: need LDRWRK*2*(M+L)*NOBR + 4*(M+L)*NOBR. 1365C 1366 CALL MB04OD( 'Full', NR, 0, NSL, R, LDR, DWORK, LDRWRK, 1367 $ DUM, NR, DUM, NR, DWORK(ITAU), DWORK(JWORK) 1368 $ ) 1369 INITI = INITI + NSF 1370 560 CONTINUE 1371C 1372 END IF 1373C 1374 IF ( .NOT.LAST ) THEN 1375 IF ( CONNEC ) THEN 1376C 1377C For sequential processing with connected data blocks, 1378C save the remaining ("connection") elements of U and Y 1379C in the first (M+L)*(2*NOBR-1) locations of DWORK. 1380C 1381 IF ( M.GT.0 ) 1382 $ CALL DLACPY( 'Full', NOBR21, M, U(INITI+1,1), LDU, 1383 $ DWORK, NOBR21 ) 1384 CALL DLACPY( 'Full', NOBR21, L, Y(INITI+1,1), LDY, 1385 $ DWORK(MMNOBR-M+1), NOBR21 ) 1386 END IF 1387C 1388C Return to get new data. 1389C 1390 ICYCLE = ICYCLE + 1 1391 IF ( ICYCLE.LE.MAXCYC ) 1392 $ RETURN 1393 IWARN = 1 1394 ICYCLE = 1 1395C 1396 END IF 1397C 1398 END IF 1399C 1400C Return optimal workspace in DWORK(1). 1401C 1402 DWORK( 1 ) = MAXWRK 1403 IF ( LAST ) THEN 1404 ICYCLE = 1 1405 MAXWRK = 1 1406 NSMPSM = 0 1407 END IF 1408 RETURN 1409C 1410C *** Last line of IB01MD *** 1411 END 1412