1C NOTE THAT THERE ARE MANY APPROACHES AND ALGORITHMS FOR 2C PERFORMING CLUSTER ANALYSIS. THIS FILE CONTAINS ROUTINES 3C FROM THE FOLLOWING SOURCES: 4C 5C 1) JOHN HARTIGAN (1979), "ALGORITHM AS 136", APPLIED 6C STATISTICS, VOL. 28, NO. 1. 7C 8C 2) JOHN HARTIGAN (1975), "CLUSTERING ALGORITHMS", 9C WILEY. 10C 11C WE ACTUALLY USE THE ROUTINES AS EXTRACTED FROM THE 12C CMLIB LIBRARY. THESE ARE SLIGHTLY DIFFERENT THAN THE 13C ROUTINES AS GIVEN IN THE BOOK. 14C 15C WE DO INCLUDE ALL THE ROUTINES. ALSO, WE USE THE 16C K-MEANS ALGORITHM FROM APPLIED STATISTICS RATHER THAN 17C THE VERSION FROM THE BOOK (THE APPLIED STATISTICS 18C VERSION IS SOMEWHAT SIMPLER THAN THE BOOK VERSION). 19C 20C 3) KAUFMAN AND ROUSSEEUW (1990), "FINDING GROUPS IN 21C DATA", WILEY. 22C 23C THESE ROUTINES ARE DESIGNED TO BE MORE ROBUST THAN 24C SOME OF THE STANDARD CLUSTERING ALGORITHMS. 25C 26C SPECIFICALLY, WE SUPPORT THE K-MEDOIDS METHODS FROM 27C THE "PAM" AND "CLARA" ALGORITHMS. 28C 29C KAUFMAN AND ROUSSEEUW SUPPORT THE FOLLOWING PROGRAMS: 30C 31C 1. DAISY - FOR CREATING DISSIMILARITY MATRICES 32C (DATAPLOT HAS COMMANDS TO DO WHAT DAISY 33C CAN DO, SO NOTHING EXPLICITLY IMPLEMENTED 34C FROM DAISY) 35C 2. PAM - PARTITIONING AROUND MEDOIDS BASED ON EITHER 36C MEASUREMENT DATA OR A DISSIMILARITY MATRIX. 37C CURRENTLY LIMITED TO A MAXIMUM OF 100 OBJECTS. 38C 3. CLARA - PARTITIONING AROUND MEDOIDS FOR THE CASE OF MORE 39C THAN 100 OBJECTS. ONLY APPLIED TO MEASUREMENT 40C DATA. 41C 4. FANNY - FUZZY CLUSTERING. NOT CURRENTLY IMPLEMENTED. 42C 5. AGNES - HIERARCHIAL CLUSTERING (OR AGGLOMERATIVE NESTING). 43C 6. MONA - HIERARCHIAL CLUSTERING FOR BINARY DATA. NOT 44C CURRENTLY SUPPORTED. 45C 46C ALSO INCLUDE ROUTINES FROM APPLIED STATISTICS 136 47C SPECIFICALLY FOR K-MEANS CLUSTERING. 48C 49C LIST OF ROUTINES: 50C 51C HARTIGAN'S K-MEANS FROM APPLIED STATISTICS 136: 52C 53C 1. KMNS - THE ROUTINE THAT IS CALLED FROM DATAPLOT 54C 2. OPTRA - OPTIMAL TRANSFER STAGE 55C 3. QTRAN - QUICK TRANSFER STAGE 56C 57C ADDITIONAL HARTIGAN CLUSTERING CODES. CURRENTLY WE USE 58C "MIX" FOR NORMAL MIXTURE MODELS AND SLINK FOR SINGLE 59C LINKAGE (NEAREST NEIGHBOR) CLUSTERING. NOT ALL OF THESE 60C ROUTINES ARE CURRENTLY ACTIVELY USED. 61C 62C 1. MIX - PERFORMS NORMAL MIXTURE CLUSTERING 63C 2. COVOUT - USED BY MIX TO PRINT OUTPUT 64C 3. INVERT - USED BY MIX TO INVERT A MATRIX 65C 4. CLUMOM - USED BY MIX TO COMPUTE WEIGHTED MEANS AND 66C STANDARD DEVIATIONS 67C 68C 5. MIXIND - NORMAL MIXTURE WITH SPECIFIC COVARIANCE 69C MODEL (NOT CURRENTLY USED) 70C 6. MIXOUT - PRINT OUTPUT OF MIXIND 71C 72C FOLLOWING ARE NOT ACTIVELY CALLED 73C 74C 7. SLINK - PERFORMS SINGLE LINKAGE CLUSTERING 75C 8. BUILD - K-MEANS FROM CMLIB 76C 9. KMEANS - (CALLED BY BUILD) 77C 10. SINGLE - CALLED BY BUILD/KMEANS 78C 11. KOUT - CALLED BY BUILD/KMEANS 79C 80C 12. SPLIT1 - SPLITTING ALGORITHM FOR CLUSTERING 81C (NOT CURRENTLY USED) 82C 13. SPLIT2 - SPLITTING ALGORITHM FOR CLUSTERING 83C (NOT CURRENTLY USED) 84C 14. CSPLIT - USED BY SPLIT1/SPLIT2 85C 15. RSPLIT - USED BY SPLIT1/SPLIT2 86 87C 16. QUICK - PERFORMS A "QUICK" CLUSTERING (NOT 88C CURRENTLY USED) 89C 90C ROUTINES FROM KAUFFMAN AND ROUSSEEUW 91C 92C 1. BSWAP (FOR CLARA AND PAM) 93C 2. DYSTA (FOR CLARA) 94C 3. DYSTAP (FOR PAM) 95C 4. DYSTAF (FOR FANNY) 96C 5. MEET (FOR CLARA AND PAM) 97C 6. RESUL (FOR CLARA) 98C 7. SELEC (FOR CLARA) 99C 8. CSTAT (PAM) 100C 9. SUPCL (AGNES) 101C 10. AVERL (AGNES) 102C 11. BANAG (AGNES) 103C 12. SPLYT (DIANA) 104C 13. BANDY (DIANA) 105C 14. CADDY (FANNY) 106C 15. FUZZY (FANNY) 107C 108 SUBROUTINE KMNS(A, M, N, C, K, IC1, IC2, NC, AN1, AN2, NCP, D, 109 * ITRAN, LIVE, ITER, WSS, IFAULT) 110C 111C ALGORITHM AS 136 APPL. STATIST. (1979) VOL.28, NO.1 112C 113C Divide M points in N-dimensional space into K clusters so that 114C the within cluster sum of squares is minimized. 115C 116 INTEGER IC1(M), IC2(M), NC(K), NCP(K), ITRAN(K), LIVE(K) 117 REAL A(M,N), D(M), C(K,N), AN1(K), AN2(K), WSS(K), DT(2) 118 REAL ZERO, ONE 119C 120C Define BIG to be a very large positive number 121C 122 DATA BIG /1.E30/, ZERO /0.0/, ONE /1.0/ 123C 124 IFAULT = 3 125 IF (K .LE. 1 .OR. K .GE. M) RETURN 126 IFAULT = 0 127C 128C For each point I, find its two closest centres, IC1(I) and 129C IC2(I). Assign it to IC1(I). 130C 131 DO 50 I = 1, M 132 IC1(I) = 1 133 IC2(I) = 2 134 DO 10 IL = 1, 2 135 DT(IL) = ZERO 136 DO 11 J = 1, N 137 DA = A(I,J) - C(IL,J) 138 DT(IL) = DT(IL) + DA*DA 139 11 CONTINUE 140 10 CONTINUE 141 IF (DT(1) .GT. DT(2)) THEN 142 IC1(I) = 2 143 IC2(I) = 1 144 TEMP = DT(1) 145 DT(1) = DT(2) 146 DT(2) = TEMP 147 END IF 148 DO 55 L = 3, K 149 DB = ZERO 150 DO 30 J = 1, N 151 DC = A(I,J) - C(L,J) 152 DB = DB + DC*DC 153 IF (DB .GE. DT(2)) GO TO 50 154 30 CONTINUE 155 IF (DB .LT. DT(1)) GO TO 40 156 DT(2) = DB 157 IC2(I) = L 158 GO TO 50 159 40 DT(2) = DT(1) 160 IC2(I) = IC1(I) 161 DT(1) = DB 162 IC1(I) = L 163 55 CONTINUE 164 50 CONTINUE 165C 166C Update cluster centres to be the average of points contained 167C within them. 168C 169 DO 70 L = 1, K 170 NC(L) = 0 171 DO 60 J = 1, N 172 C(L,J) = ZERO 173 60 CONTINUE 174 70 CONTINUE 175 DO 90 I = 1, M 176 L = IC1(I) 177 NC(L) = NC(L) + 1 178 DO 80 J = 1, N 179 C(L,J) = C(L,J) + A(I,J) 180 80 CONTINUE 181 90 CONTINUE 182C 183C Check to see if there is any empty cluster at this stage 184C 185 DO 120 L = 1, K 186 IF (NC(L) .EQ. 0) THEN 187 IFAULT = 1 188 RETURN 189 END IF 190 IFAULT = 0 191 AA = NC(L) 192 DO 110 J = 1, N 193 C(L,J) = C(L,J) / AA 194 110 CONTINUE 195C 196C Initialize AN1, AN2, ITRAN & NCP 197C AN1(L) = NC(L) / (NC(L) - 1) 198C AN2(L) = NC(L) / (NC(L) + 1) 199C ITRAN(L) = 1 if cluster L is updated in the quick-transfer 200C stage, 201C = 0 otherwise 202C In the optimal-transfer stage, NCP(L) stores the step at which 203C cluster L is last updated. 204C In the quick-transfer stage, NCP(L) stores the step at which 205C cluster L is last updated plus M. 206C 207 AN2(L) = AA / (AA + ONE) 208 AN1(L) = BIG 209 IF (AA .GT. ONE) AN1(L) = AA / (AA - ONE) 210 ITRAN(L) = 1 211 NCP(L) = -1 212 120 CONTINUE 213 INDX = 0 214 DO 140 IJ = 1, ITER 215C 216C In this stage, there is only one pass through the data. Each 217C point is re-allocated, if necessary, to the cluster that will 218C induce the maximum reduction in within-cluster sum of squares. 219C 220 CALL OPTRA(A, M, N, C, K, IC1, IC2, NC, AN1, AN2, NCP, D, 221 * ITRAN, LIVE, INDX) 222C 223C Stop if no transfer took place in the last M optimal transfer 224C steps. 225C 226 IF (INDX .EQ. M) GO TO 150 227C 228C Each point is tested in turn to see if it should be re-allocated 229C to the cluster to which it is most likely to be transferred, 230C IC2(I), from its present cluster, IC1(I). Loop through the 231C data until no further change is to take place. 232C 233 CALL QTRAN(A, M, N, C, K, IC1, IC2, NC, AN1, AN2, NCP, D, 234 * ITRAN, INDX) 235C 236C If there are only two clusters, there is no need to re-enter the 237C optimal transfer stage. 238C 239 IF (K .EQ. 2) GO TO 150 240C 241C NCP has to be set to 0 before entering OPTRA. 242C 243 DO 130 L = 1, K 244 NCP(L) = 0 245 130 CONTINUE 246 140 CONTINUE 247C 248C Since the specified number of iterations has been exceeded, set 249C IFAULT = 2. This may indicate unforeseen looping. 250C 251 IFAULT = 2 252C 253C Compute within-cluster sum of squares for each cluster. 254C 255 150 CONTINUE 256 DO 160 L = 1, K 257 WSS(L) = ZERO 258 DO 165 J = 1, N 259 C(L,J) = ZERO 260 165 CONTINUE 261 160 CONTINUE 262 DO 170 I = 1, M 263 II = IC1(I) 264 DO 175 J = 1, N 265 C(II,J) = C(II,J) + A(I,J) 266 175 CONTINUE 267 170 CONTINUE 268 DO 190 J = 1, N 269 DO 180 L = 1, K 270 C(L,J) = C(L,J) / FLOAT(NC(L)) 271 180 CONTINUE 272 DO 195 I = 1, M 273 II = IC1(I) 274 DA = A(I,J) - C(II,J) 275 WSS(II) = WSS(II) + DA*DA 276 195 CONTINUE 277 190 CONTINUE 278C 279 RETURN 280 END 281C 282C 283 SUBROUTINE OPTRA(A, M, N, C, K, IC1, IC2, NC, AN1, AN2, NCP, D, 284 * ITRAN, LIVE, INDX) 285C 286C ALGORITHM AS 136.1 APPL. STATIST. (1979) VOL.28, NO.1 287C 288C This is the optimal transfer stage. 289C 290C Each point is re-allocated, if necessary, to the cluster that 291C will induce a maximum reduction in the within-cluster sum of 292C squares. 293C 294 INTEGER IC1(M), IC2(M), NC(K), NCP(K), ITRAN(K), LIVE(K) 295 REAL A(M,N), D(M), C(K,N), AN1(K), AN2(K), ZERO, ONE 296C 297C Define BIG to be a very large positive number. 298C 299 DATA BIG /1.0E30/, ZERO /0.0/, ONE/1.0/ 300C 301C If cluster L is updated in the last quick-transfer stage, it 302C belongs to the live set throughout this stage. Otherwise, at 303C each step, it is not in the live set if it has not been updated 304C in the last M optimal transfer steps. 305C 306 DO 10 L = 1, K 307 IF (ITRAN(L) .EQ. 1) LIVE(L) = M + 1 308 10 CONTINUE 309 DO 100 I = 1, M 310 INDX = INDX + 1 311 L1 = IC1(I) 312 L2 = IC2(I) 313 LL = L2 314C 315C If point I is the only member of cluster L1, no transfer. 316C 317 IF (NC(L1) .EQ. 1) GO TO 90 318C 319C If L1 has not yet been updated in this stage, no need to 320C re-compute D(I). 321C 322 IF (NCP(L1) .EQ. 0) GO TO 30 323 DE = ZERO 324 DO 20 J = 1, N 325 DF = A(I,J) - C(L1,J) 326 DE = DE + DF*DF 327 20 CONTINUE 328 D(I) = DE * AN1(L1) 329C 330C Find the cluster with minimum R2. 331C 332 30 DA = ZERO 333 DO 40 J = 1, N 334 DB = A(I,J) - C(L2,J) 335 DA = DA + DB*DB 336 40 CONTINUE 337 R2 = DA * AN2(L2) 338 DO 60 L = 1, K 339C 340C If I >= LIVE(L1), then L1 is not in the live set. If this is 341C true, we only need to consider clusters that are in the live set 342C for possible transfer of point I. Otherwise, we need to consider 343C all possible clusters. 344C 345 IF (I .GE. LIVE(L1) .AND. I .GE. LIVE(L) .OR. L .EQ. L1 .OR. 346 * L .EQ. LL) GO TO 60 347 RR = R2 / AN2(L) 348 DC = ZERO 349 DO 50 J = 1, N 350 DD = A(I,J) - C(L,J) 351 DC = DC + DD*DD 352 IF (DC .GE. RR) GO TO 60 353 50 CONTINUE 354 R2 = DC * AN2(L) 355 L2 = L 356 60 CONTINUE 357 IF (R2 .LT. D(I)) GO TO 70 358C 359C If no transfer is necessary, L2 is the new IC2(I). 360C 361 IC2(I) = L2 362 GO TO 90 363C 364C Update cluster centres, LIVE, NCP, AN1 & AN2 for clusters L1 and 365C L2, and update IC1(I) & IC2(I). 366C 367 70 INDX = 0 368 LIVE(L1) = M + I 369 LIVE(L2) = M + I 370 NCP(L1) = I 371 NCP(L2) = I 372 AL1 = NC(L1) 373 ALW = AL1 - ONE 374 AL2 = NC(L2) 375 ALT = AL2 + ONE 376 DO 80 J = 1, N 377 C(L1,J) = (C(L1,J) * AL1 - A(I,J)) / ALW 378 C(L2,J) = (C(L2,J) * AL2 + A(I,J)) / ALT 379 80 CONTINUE 380 NC(L1) = NC(L1) - 1 381 NC(L2) = NC(L2) + 1 382 AN2(L1) = ALW / AL1 383 AN1(L1) = BIG 384 IF (ALW .GT. ONE) AN1(L1) = ALW / (ALW - ONE) 385 AN1(L2) = ALT / AL2 386 AN2(L2) = ALT / (ALT + ONE) 387 IC1(I) = L2 388 IC2(I) = L1 389 90 CONTINUE 390 IF (INDX .EQ. M) RETURN 391 100 CONTINUE 392 DO 110 L = 1, K 393C 394C ITRAN(L) = 0 before entering QTRAN. Also, LIVE(L) has to be 395C decreased by M before re-entering OPTRA. 396C 397 ITRAN(L) = 0 398 LIVE(L) = LIVE(L) - M 399 110 CONTINUE 400C 401 RETURN 402 END 403C 404C 405 SUBROUTINE QTRAN(A, M, N, C, K, IC1, IC2, NC, AN1, AN2, NCP, D, 406 * ITRAN, INDX) 407C 408C ALGORITHM AS 136.2 APPL. STATIST. (1979) VOL.28, NO.1 409C 410C This is the quick transfer stage. 411C IC1(I) is the cluster which point I belongs to. 412C IC2(I) is the cluster which point I is most likely to be 413C transferred to. 414C For each point I, IC1(I) & IC2(I) are switched, if necessary, to 415C reduce within-cluster sum of squares. The cluster centres are 416C updated after each step. 417C 418 INTEGER IC1(M), IC2(M), NC(K), NCP(K), ITRAN(K) 419 REAL A(M,N), D(M), C(K,N), AN1(K), AN2(K), ZERO, ONE 420C 421C Define BIG to be a very large positive number 422C 423 DATA BIG /1.0E30/, ZERO /0.0/, ONE /1.0/ 424C 425C In the optimal transfer stage, NCP(L) indicates the step at which 426C cluster L is last updated. In the quick transfer stage, NCP(L) 427C is equal to the step at which cluster L is last updated plus M. 428C 429 ICOUN = 0 430 ISTEP = 0 431 10 DO 70 I = 1, M 432 ICOUN = ICOUN + 1 433 ISTEP = ISTEP + 1 434 L1 = IC1(I) 435 L2 = IC2(I) 436C 437C If point I is the only member of cluster L1, no transfer. 438C 439 IF (NC(L1) .EQ. 1) GO TO 60 440C 441C If ISTEP > NCP(L1), no need to re-compute distance from point I to 442C cluster L1. Note that if cluster L1 is last updated exactly M 443C steps ago, we still need to compute the distance from point I to 444C cluster L1. 445C 446 IF (ISTEP .GT. NCP(L1)) GO TO 30 447 DA = ZERO 448 DO 20 J = 1, N 449 DB = A(I,J) - C(L1,J) 450 DA = DA + DB*DB 451 20 CONTINUE 452 D(I) = DA * AN1(L1) 453C 454C If ISTEP >= both NCP(L1) & NCP(L2) there will be no transfer of 455C point I at this step. 456C 457 30 IF (ISTEP .GE. NCP(L1) .AND. ISTEP .GE. NCP(L2)) GO TO 60 458 R2 = D(I) / AN2(L2) 459 DD = ZERO 460 DO 40 J = 1, N 461 DE = A(I,J) - C(L2,J) 462 DD = DD + DE*DE 463 IF (DD .GE. R2) GO TO 60 464 40 CONTINUE 465C 466C Update cluster centres, NCP, NC, ITRAN, AN1 & AN2 for clusters 467C L1 & L2. Also update IC1(I) & IC2(I). Note that if any 468C updating occurs in this stage, INDX is set back to 0. 469C 470 ICOUN = 0 471 INDX = 0 472 ITRAN(L1) = 1 473 ITRAN(L2) = 1 474 NCP(L1) = ISTEP + M 475 NCP(L2) = ISTEP + M 476 AL1 = NC(L1) 477 ALW = AL1 - ONE 478 AL2 = NC(L2) 479 ALT = AL2 + ONE 480 DO 50 J = 1, N 481 C(L1,J) = (C(L1,J) * AL1 - A(I,J)) / ALW 482 C(L2,J) = (C(L2,J) * AL2 + A(I,J)) / ALT 483 50 CONTINUE 484 NC(L1) = NC(L1) - 1 485 NC(L2) = NC(L2) + 1 486 AN2(L1) = ALW / AL1 487 AN1(L1) = BIG 488 IF (ALW .GT. ONE) AN1(L1) = ALW / (ALW - ONE) 489 AN1(L2) = ALT / AL2 490 AN2(L2) = ALT / (ALT + ONE) 491 IC1(I) = L2 492 IC2(I) = L1 493C 494C If no re-allocation took place in the last M steps, return. 495C 496 60 IF (ICOUN .EQ. M) RETURN 497 70 CONTINUE 498 GO TO 10 499 END 500 SUBROUTINE BLOCK(MM, M, N, D, CLAB, RLAB, TITLE, KC, DMNB, NB, 501 * IERR, OUNIT) 502C 503C<><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><> 504C 505C PURPOSE 506C ------- 507C 508C PRINTS OUTLINES OF BLOCKS OVER A DISTANCE MATRIX 509C 510C DESCRIPTION 511C ----------- 512C 513C 1. THERE EXISTS AN ORDERING OF THE ROWS OF THE BLOCK SUCH THAT 514C EVERY BLOCK CONSISTS OF A SET OF OBJECTS CONTIGUOUS IN THAT 515C ORDER. THE ALGORITHM IS GIVEN ON PAGE 156 OF THE FIRST 516C REFERENCE. THE ROW OBJECTS ARE STORED IN THE VECTOR RLAB IN 517C SUCH AN ORDER. SIMILARLY, THE COLUMNS CAN BE ORDERED WHICH IS 518C STORED IN THE CLAB ARRAY. 519C 520C 2. THIS ORDERING OF THE OBJECTS ALLOWS THE BLOCKS TO BE NAMED BY 521C GIVING THE LOCATION OF THE FIRST AND LAST ROW AND COLUMN IN THE 522C ARRAY FOR EACH BLOCK. THE FIRST TWO COLUMNS OF THE NB ARRAY 523C STORE THE FIRST AND LAST ROWS IN EACH BLOCK AND THE THIRD AND 524C FOURTH COLUMNS STORE THE FIRST AND LAST COLUMNS IN EACH BLOCK 525C 526C 3. THE FINAL BLOCK DIAGRAM PRINTS THE ROW LABELS AND THE COLUMN 527C LABELS AND THE DISTANCE MATRIX WHERE EACH VALUE IS MULTIPLIED 528C BY 10. THE HORIZONTAL BOUNDARIES OF THE BLOCKS ARE REPRESENTED 529C BY DASHES AND THE VERTICAL BOUNDARIES BY QUOTE MARKS. COMMAS 530C REPRESENT THE CORNERS OF THE BLOCKS. 531C 532C INPUT PARAMETERS 533C ---------------- 534C 535C MM INTEGER SCALAR (UNCHANGED ON OUTPUT). 536C THE LEADING DIMENSION OF MATRIX D. 537C 538C M INTEGER SCALAR (UNCHANGED ON OUTPUT). 539C THE NUMBER OF OBJECTS. 540C 541C N INTEGER SCALAR (UNCHANGED ON OUTPUT). 542C THE NUMBER OF VARIABLES. 543C 544C D REAL MATRIX WHOSE FIRST DIMENSION MUST BE MM AND SECOND 545C DIMENSION MUST BE AT LEAST M (UNCHANGED ON OUTPUT). 546C THE MATRIX OF DISTANCES. 547C 548C D(I,J) = DISTANCE FROM CASE I TO CASE J 549C 550C CLAB VECTOR OF 4-CHARACTER VARIABLES DIMENSIONED AT LEAST N 551C (UNCHANGED ON OUTPUT). 552C ORDERED LABELS OF THE COLUMNS. 553C 554C RLAB VECTOR OF 4-CHARACTER VARIABLES DIMENSIONED AT LEAST M 555C (UNCHANGED ON OUTPUT). 556C ORDERED LABELS OF THE ROWS. 557C 558C TITLE 10-CHARACTER VARIABLE (UNCHANGED ON OUTPUT). 559C TITLE OF THE DATA SET. 560C 561C KC INTEGER SCALAR (UNCHANGED ON OUTPUT). 562C THE NUMBER OF BLOCKS. 563C 564C DMNB INTEGER SCALAR (UNCHANGED ON OUTPUT). 565C THE LEADING DIMENSION OF MATRIX NB. MUST BE AT LEAST 4. 566C 567C NB REAL MATRIX WHOSE FIRST DIMENSION MUST BE DMNB AND SECOND 568C DIMENSION MUST BE AT LEAST KC (UNCHANGED ON OUTPUT). 569C THE MATRIX DEFINING THE BOUNDARIES OF THE BLOCKS. 570C 571C NB(1,I) IS 1 + THE FIRST ROW IN BLOCK I 572C NB(2,I) IS 1 + THE LAST ROW IN BLOCK I 573C NB(3,I) IS 1 + THE FIRST COLUMN IN BLOCK I 574C NB(4,I) IS 1 + THE LAST COLUMN IN BLOCK I 575C 576C OUNIT INTEGER SCALAR (UNCHANGED ON OUTPUT). 577C UNIT NUMBER FOR OUTPUT. 578C 579C OUTPUT PARAMETER 580C ---------------- 581C 582C IERR INTEGER SCALAR. 583C ERROR FLAG. 584C 585C IERR = 0, NO ERRORS WERE DETECTED DURING EXECUTION 586C 587C IERR = 2, EITHER THE FIRST AND LAST CASES OR THE CLUSTER 588C DIAMETER FOR A CLUSTER IS OUT OF BOUNDS. THE 589C CLUSTER AND ITS BOUNDARIES ARE PRINTED ON UNIT 590C OUNIT. EXECUTION WILL CONTINUE WITH QUESTIONABLE 591C RESULTS FOR THAT CLUSTER. 592C 593C REFERENCES 594C ---------- 595C 596C HARTIGAN, J. A. (1975). CLUSTERING ALGORITHMS, JOHN WILEY & 597C SONS, INC., NEW YORK. PAGE 168. 598C 599C HARTIGAN, J. A. (1975) PRINTER GRAPHICS FOR CLUSTERING. JOURNAL OF 600C STATISTICAL COMPUTATION AND SIMULATION. VOLUME 4,PAGES 187-213. 601C 602C<><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><> 603C 604 INTEGER DMNB, OUNIT 605 DIMENSION D(MM,*), NB(DMNB,*), IA(26) 606 CHARACTER*4 CLAB(*), RLAB(*), DD, AE(26) 607 CHARACTER*10 TITLE 608CCCCC CHARACTER*1 DASH,DITTO,COMMA,BLANK,STAR,DOT,AA(26) 609 CHARACTER*1 DASH,DITTO,COMMA,BLANK,AA(26) 610C 611 INCLUDE 'DPCOMC.INC' 612 INCLUDE 'DPCOP2.INC' 613C 614 DATA DD/'----'/ 615CCCCC DATA DASH,DITTO,COMMA,BLANK,STAR,DOT/'-','''',',',' ','*','.'/ 616 DATA DASH,DITTO,COMMA,BLANK/'-','''',',',' '/ 617C 618C CHECK BOUNDARY ARRAY NB 619C 620 IF (OUNIT .LE. 0) RETURN 621 DO 10 K=1,KC 622 IF(NB(1,K).LT.2.OR.NB(1,K).GT.NB(2,K).OR.NB(2,K).GT.M .OR. 623 * NB(3,K).LT.2.OR.NB(3,K).GT.NB(4,K).OR.NB(4,K).GT.N) THEN 624 WRITE(ICOUT,1) K 625 CALL DPWRST('XXX','WRIT') 626 WRITE(ICOUT,6) (NB(I,K)-1,I=1,4) 627 CALL DPWRST('XXX','WRIT') 628 IERR = 2 629 ENDIF 630 10 CONTINUE 631 1 FORMAT(' BAD BOUNDARY IN BLOCK ',I3) 632 6 FORMAT(' BOUNDARIES ARE ', 4I5) 633C 634 JPP=(N-2)/25+1 635 DO 80 JP=1,JPP 636 JLP=25*(JP-1)+1 637 JUP=25*JP+1 638 IF(JUP.GT.N-1) JUP=N-1 639 JR=JUP-JLP+1 640C 641C WRITE TITLES 642C 643 WRITE(ICOUT,999) 644 999 FORMAT(1X) 645 CALL DPWRST('XXX','WRIT') 646 WRITE(ICOUT,999) 647 CALL DPWRST('XXX','WRIT') 648 WRITE(ICOUT,2) TITLE 649 2 FORMAT(' BLOCKED ARRAY ',A10) 650 CALL DPWRST('XXX','WRIT') 651C 652C WRITE OUT ARRAY ONE LINE AT A TIME 653C 654 WRITE(ICOUT,3)(CLAB(J),J=JLP,JUP) 655 3 FORMAT(10X,25(1X,A4)) 656 CALL DPWRST('XXX','WRIT') 657 DO 85 I=1,M 658 I1=I-1 659 DO 20 L=1,26 660 AE(L)=BLANK 661 AA(L)=BLANK 662 20 CONTINUE 663 IF (I .NE. 1) THEN 664C 665C FILL IN DISTANCES 666C 667 DO 30 J=JLP,JUP 668 IA(J-JLP+1)=INT(D(I1,J)*10.) 669 30 CONTINUE 670C 671C FILL IN VERTICAL BOUNDARIES 672C 673 DO 40 K=1,KC 674 IF(NB(2,K).GE.I.AND.NB(1,K).LE.I) THEN 675 JL=NB(3,K)-1 676 JU=NB(4,K) 677 IF(JL.GE.JLP.AND.JL.LE.JUP) AA(JL-JLP+1)=DITTO 678 IF(JU.GE.JLP.AND.JU.LE.JUP) AA(JU-JLP+1)=DITTO 679 IF(JU.EQ.JLP+JR) AA(JR+1)=DITTO 680 ENDIF 681 40 CONTINUE 682 WRITE(ICOUT,4) RLAB(I1),(AA(J),IA(J),J=1,JR),AA(JR+1) 683 4 FORMAT(1X,A4,5X,25(A1,I4),A1) 684 CALL DPWRST('XXX','WRIT') 685C 686C FILL IN HORIZONTAL BOUNDARIES 687C 688 ENDIF 689 DO 60 K=1,KC 690 IF(NB(1,K).EQ.I+1.OR.NB(2,K).EQ.I) THEN 691 JL=NB(3,K)-1 692 JU=NB(4,K) 693 J1=JL-JLP+1 694 J2=JU-JLP+1 695 IF(J1.LE.0) J1=1 696 IF(J2.GT.26) J2=26 697 IF(J1.LE.26.AND.J2.GT.0) THEN 698 DO 50 J=J1,J2 699 IF(J.NE.J2) AE(J)=DD 700 IF(AA(J).EQ.BLANK) AA(J)=DASH 701 50 CONTINUE 702 IF(NB(1,K).EQ.I+1) THEN 703 AA(J1)=COMMA 704 AA(J2)=COMMA 705 ENDIF 706 ENDIF 707 ENDIF 708 60 CONTINUE 709 WRITE(ICOUT,5)(AA(J),AE(J),J=1,JR),AA(JR+1) 710 5 FORMAT(10X,25(A1,A4),A1) 711 CALL DPWRST('XXX','WRIT') 712 85 CONTINUE 713 80 CONTINUE 714 RETURN 715 END 716 SUBROUTINE BUILD(MM, M, N, A, CLAB, RLAB, K, ITER, XMISS, 717 1 DMSUM1, DMSUM2, SUM, IWORK, WORK, CWORK) 718CCCCC SUBROUTINE BUILD(MM, M, N, A, CLAB, RLAB, TITLE, K, ITER, XMISS, 719CCCCC1 DMSUM1, DMSUM2, SUM, IWORK, WORK, CWORK, OUNIT) 720C 721C<><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><> 722C 723C PURPOSE 724C ------- 725C 726C BUILDS CLUSTERS BY THE K-MEANS ALGORITHM, PRINTING THE RESULTS 727C FOR ALL INTERMEDIATE ITERATIONS 728C 729C DESCRIPTION 730C ----------- 731C 732C 1. THE VARIABLES SHOULD BE SCALED SIMILARLY (CLUSTER SUBROUTINE 733C STAND CAN BE USED TO STANDARDIZE THE VARIABLES). 734C 735C 2. THE ROUTINE ITERATES FROM 1 TO THE DESIRED NUMBER OF CLUSTERS. 736C THE FIRST ITERATION STARTS WITH THE CLUSTER OF ALL CASES AND 737C COMPUTES THE SUMMARY STATISTICS FOR EACH VARIABLE AND THE 738C DISTANCES FROM EACH CASE TO THE CLUSTER CENTER WITH ALL THE 739C CALCULATIONS BEING PRINTED. THE SECOND ITERATION DIVIDES THE 740C CLUSTER INTO TWO CLUSTERS, MOVING CASES FROM ONE TO THE OTHER 741C UNTIL EITHER NO FURTHER MOVEMENTS DECREASE THE DISTANCES 742C BETWEEN EACH CASE AND THE CENTER OF ITS ASSIGNED CLUSTER OR THE 743C MAXIMUM NUMBER OF MOVEMENTS PER ITERATION HAS BEEN REACHED. 744C FOR THE THIRD AND SUBSEQUENT ITERATIONS, THE CLUSTER WITH THE 745C LARGEST VARIANCE IS SPLIT AND ITS CASES ARE ASSIGNED TO THE 746C CLUSTER WHOSE MEAN IS THE SMALLEST DISTANCE FROM THE CASE. THE 747C MEANS ARE THEN UPDATED AND THE PROCESS OF REASSIGNING CASES TO 748C CLUSTERS CONTINUES UNTIL NO REASSIGNMENTS ARE MADE FOR AN 749C ITERATION. 750C 751C 3. THE CLUSTERS AND THEIR STATISTICS WILL BE PRINTED OUT AFTER EACH 752C ITERATION ON FORTRAN UNIT OUNIT. 753C 754C INPUT PARAMETERS 755C ---------------- 756C 757C MM INTEGER SCALAR (UNCHANGED ON OUTPUT). 758C THE FIRST DIMENSION OF THE MATRIX A. MUST BE AT LEAST M. 759C 760C M INTEGER SCALAR (UNCHANGED ON OUTPUT). 761C THE NUMBER OF CASES. 762C 763C N INTEGER SCALAR (UNCHANGED ON OUTPUT). 764C THE NUMBER OF VARIABLES. 765C 766C A REAL MATRIX WHOSE FIRST DIMENSION MUST BE MM AND WHOSE SECOND 767C DIMENSION MUST BE AT LEAST N (UNCHANGED ON OUTPUT). 768C THE MATRIX OF DATA VALUES. 769C 770C A(I,J) IS THE VALUE FOR THE J-TH VARIABLE FOR THE I-TH CASE. 771C 772C CLAB VECTOR OF 4-CHARACTER VARIABLES DIMENSIONED AT LEAST N. 773C (UNCHANGED ON OUTPUT). 774C THE LABELS OF THE VARIABLES. 775C 776C RLAB VECTOR OF 4-CHARACTER VARIABLES DIMENSIONED AT LEAST M. 777C (UNCHANGED ON OUTPUT). 778C THE LABELS OF THE CASES. 779C 780C TITLE 10-CHARACTER VARIABLE (UNCHANGED ON OUTPUT). 781C TITLE OF THE DATA SET. 782C 783C K INTEGER SCALAR (UNCHANGED ON OUTPUT). 784C THE NUMBER OF CLUSTERS DESIRED. 785C 786C ITER INTEGER SCALAR (UNCHANGED ON OUTPUT). 787C MAXIMUM NUMBER OF MOVEMENTS ALLOWED PER ITERATION. 788C 789C XMISS REAL SCALAR (UNCHANGED ON OUTPUT). 790C MISSING VALUE CODE. IF A(I,J) = XMISS, THEN THE VALUE FOR THE 791C J-TH VARIABLE FOR THE I-TH CASE IS ASSUMED TO BE MISSING. 792C 793C DMSUM1 INTEGER SCALAR (UNCHANGED ON OUTPUT). 794C THE FIRST DIMENSION OF THE MATRIX SUM. MUST BE AT LEAST 7. 795C 796C DMSUM2 INTEGER SCALAR (UNCHANGED ON OUTPUT). 797C THE SECOND DIMENSION OF THE MATRIX SUM. MUST BE AT LEAST N. 798C 799C SUM REAL MATRIX WHOSE FIRST DIMENSION MUST BE DMSUM1, WHOSE SECOND 800C DIMENSION MUST BE DMSUM2 AND WHOSE THIRD DIMENSION MUST 801C BE AT LEAST K+1. 802C WORK MATRIX. 803C 804C IWORK INTEGER VECTOR DIMENSIONED AT LEAST M. 805C WORK VECTOR. 806C 807C WORK REAL VECTOR DIMENSIONED AT LEAST 2*N+2*M. 808C WORK VECTOR. 809C 810C CWORK VECTOR OF 4-CHARACTER VARIABLES DIMENSIONED AT LEAST N. 811C WORK VECTOR. 812C 813C OUNIT INTEGER SCALAR (UNCHANGED ON OUTPUT). 814C UNIT NUMBER FOR OUTPUT. 815C 816C REFERENCE 817C --------- 818C 819C HARTIGAN, J. A. (1975). CLUSTERING ALGORITHMS, JOHN WILEY & 820C SONS, INC., NEW YORK. PAGES 84-108. 821C 822C<><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><> 823C 824CCCCC INTEGER DMSUM1, DMSUM2, DCLUS, OUNIT 825 INTEGER DMSUM1, DMSUM2, DCLUS 826 DIMENSION SUM(DMSUM1,DMSUM2,*), A(MM,*), WORK(*), IWORK(*) 827 CHARACTER*4 CLAB(*), RLAB(*), CWORK(*) 828CCCCC CHARACTER*10 TITLE 829C 830C SUM(1,J,I) IS THE VALUE OF THE J-TH VARIABLE AT THE CENTER OF 831C CLUSTER I 832C SUM(2,J,I) IS THE NUMBER OF NON-MISSING OBSERVATIONS FOR THE J-TH 833C VARIABLE IN CLUSTER I 834C SUM(3,J,I) IS THE MEAN OF THE J-TH VARIABLE IN CLUSTER I 835C SUM(4,J,I) IS THE STANDARD DEVIATION OF THE J-TH VARIABLE IN 836C CLUSTER I 837C SUM(5,J,I) IS THE MINIMUM OF THE J-TH VARIABLE IN CLUSTER I 838C SUM(6,J,I) IS THE MAXIMUM OF THE J-TH VARIABLE IN CLUSTER I 839C SUM(7,J,I) IS THE SUM OF SQUARED DEVIATIONS FOR THE J-TH VARIABLE 840C FROM THE MEAN OF CLUSTER I 841C 842C THE K+1-ST ROW OF SUM STORES THE SAME CALCULATIONS AS ABOVE EXCEPT 843C FOR THE ENTIRE DATA SET RATHER THAN FOR AN INDIVIDUAL CLUSTER 844C 845 KKK=0 846 KK=0 847 KM=0 848C 849 DCLUS = 2*N + M 850 DO 13 I=1,7 851 DO 12 J=1,N 852 DO 11 KK=1,K+1 853 SUM(I,J,KK)=0. 854 11 CONTINUE 855 12 CONTINUE 856 13 CONTINUE 857C 858C LOOP ONCE FOR EACH DESIRED CLUSTER 859C 860 DO 130 KK=1,K 861 DO 60 NC=1,ITER 862 ERR=0. 863 DO 20 KKK=1,KK 864 DO 25 J=1,N 865 IF(NC.EQ.1.OR.SUM(1,J,KKK).NE.SUM(3,J,KKK)) ERR=1. 866 25 CONTINUE 867 20 CONTINUE 868C 869C IF NO CHANGES HAVE BEEN MADE, OUTPUT THE CLUSTER 870C 871 IF(ERR.EQ.0.) GO TO 70 872 DO 30 KKK=1,KK 873 DO 35 J=1,N 874 SUM(2,J,KKK)=0. 875 SUM(1,J,KKK)=SUM(3,J,KKK) 876 35 CONTINUE 877 30 CONTINUE 878 DO 50 I=1,M 879 DO 40 J=1,N 880 WORK(J)=A(I,J) 881 40 CONTINUE 882 IWORK(I)=NC 883C 884C FIND BEST CLUSTER FOR CASE I 885C 886 CALL KMEANS(N, WORK, KK, XMISS, DMSUM1, DMSUM2, SUM, 887 * IWORK(I), WORK(DCLUS+I)) 888 50 CONTINUE 889 60 CONTINUE 890CCC70 IF (OUNIT .GT. 0) CALL KOUT(MM, M, N, A, CLAB, RLAB, TITLE, KK, 891CCCCC* DMSUM1, DMSUM2, SUM, IWORK, WORK(DCLUS+1), 892CCCCC* WORK(N+1), WORK(M+N+1), CWORK, OUNIT) 893 70 CALL KOUT(M, N, CLAB, RLAB, KK, 894 * DMSUM1, DMSUM2, SUM, IWORK, WORK(DCLUS+1), 895 * WORK(N+1), WORK(M+N+1), CWORK) 896 897C 898C CREATE A NEW CLUSTER BY SPLITTING VARIABLE WITH LARGEST WITHIN- 899C CLUSTER VARIANCE AT THAT VALUE OF THAT VARIABLE AT THE CENTER 900C OF THE CLUSTER 901C 902 SM=0. 903 DO 80 J=1,N 904 DO 85 KKK=1,KK 905 IF(SUM(4,J,KKK).GE.SM) THEN 906 SM=SUM(4,J,KKK) 907 KM=KKK 908 ENDIF 909 85 CONTINUE 910 80 CONTINUE 911 KN=KK+1 912 DO 90 JJ=1,N 913 SUM(2,JJ,KM)=0. 914 SUM(3,JJ,KM)=0. 915 SUM(2,JJ,KN)=0. 916 SUM(3,JJ,KN)=0. 917 90 CONTINUE 918 DO 110 I=1,M 919 IF(IWORK(I).EQ.KM) THEN 920 DO 100 JJ=1,N 921 IF(A(I,JJ).NE.XMISS) THEN 922 IF(A(I,JJ).GE.SUM(1,JJ,KM)) THEN 923 SUM(2,JJ,KN)=SUM(2,JJ,KN)+1 924 SUM(3,JJ,KN)=SUM(3,JJ,KN)+A(I,JJ) 925 ELSE 926 SUM(2,JJ,KM)=SUM(2,JJ,KM)+1 927 SUM(3,JJ,KM)=SUM(3,JJ,KM)+A(I,JJ) 928 ENDIF 929 ENDIF 930 100 CONTINUE 931 ENDIF 932 110 CONTINUE 933 DO 120 JJ=1,N 934 IF(SUM(2,JJ,KN).NE.0.)SUM(3,JJ,KN)=SUM(3,JJ,KN)/SUM(2,JJ,KN) 935 IF(SUM(2,JJ,KM).NE.0.)SUM(3,JJ,KM)=SUM(3,JJ,KM)/SUM(2,JJ,KM) 936 120 CONTINUE 937 130 CONTINUE 938 RETURN 939 END 940 SUBROUTINE CLUMOM(MM, M, N, A, ICLUS, W, U, DMC1, DMC2, C) 941C 942C NOTE: RENAMED FOR DATAPLOT TO AVOID NAME CONFLICT. 943C 944C<><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><> 945C 946C PURPOSE 947C ------- 948C 949C COMPUTES WEIGHTED MEANS AND COVARIANCES 950C 951C INPUT PARAMETERS 952C ---------------- 953C 954C MM INTEGER SCALAR (UNCHANGED ON OUTPUT). 955C THE FIRST DIMENSION OF THE MATRIX A. MUST BE AT LEAST M. 956C 957C M INTEGER SCALAR (UNCHANGED ON OUTPUT). 958C THE NUMBER OF CASES. 959C 960C N INTEGER SCALAR (UNCHANGED ON OUTPUT). 961C THE NUMBER OF VARIABLES. 962C 963C A REAL MATRIX WHOSE FIRST DIMENSION MUST BE MM AND WHOSE SECOND 964C DIMENSION MUST BE AT LEAST N (UNCHANGED ON OUTPUT). 965C THE MATRIX OF DATA VALUES. 966C 967C ICLUS INTEGER SCALAR (UNCHANGED ON OUTPUT). 968C THE CLUSTER NUMBER. 969C 970C W REAL VECTOR DIMENSIONED AT LEAST M (UNCHANGED ON OUTPUT). 971C VECTOR OF WEIGHTS FOR THE OBJECTS. 972C 973C DMC1 INTEGER SCALAR (UNCHANGED ON OUTPUT). 974C THE FIRST DIMENSION OF THE MATRIX C. MUST BE AT LEAST N. 975C 976C DMC2 INTEGER SCALAR (UNCHANGED ON OUTPUT). 977C THE SECOND DIMENSION OF THE MATRIX C. MUST BE AT LEAST N. 978C 979C OUTPUT PARAMETERS 980C ----------------- 981C 982C U REAL VECTOR DIMENSIONED AT LEAST N. 983C VECTOR OF WEIGHTED CLUSTER MEANS FOR EACH VARIABLE. 984C 985C C REAL MATRIX WHOSE FIRST DIMENSION MUST BE DMC1, WHOSE SECOND 986C DIMENSION MUST BE DMC2, AND WHOSE THIRD DIMENSION MUST BE 987C AT LEAST K. 988C C(I,J,K) IS THE IJ-TH ELEMENT OF THE COVARIANCE MATRIX FOR THE 989C K-TH CLUSTER. 990C 991C REFERENCE 992C --------- 993C 994C HARTIGAN, J. A. (1975). CLUSTERING ALGORITHMS, JOHN WILEY & 995C SONS, INC., NEW YORK. PAGE 73. 996C 997C<><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><> 998C 999 INTEGER DMC1, DMC2 1000 DIMENSION C(DMC1,DMC2,*), W(*), U(*), A(MM,*) 1001C 1002 INCLUDE 'DPCOMC.INC' 1003C 1004 SP=0. 1005 DO 10 I=1,M 1006 SP=SP+W(I) 1007 10 CONTINUE 1008 IF(SP.EQ.0.) SP=R1MACH(4) 1009C 1010C COMPUTED WEIGHTED MEANS 1011C 1012 DO 30 J=1,N 1013 SS=0. 1014 DO 20 I=1,M 1015 SS=SS+A(I,J)*W(I) 1016 20 CONTINUE 1017 U(J)=SS/SP 1018 30 CONTINUE 1019C 1020C COMPUTED WEIGHTED COVARIANCES 1021C 1022 DO 50 J=1,N 1023 DO 55 K=1,J 1024 SS=0. 1025 DO 40 I=1,M 1026 SS=SS+(A(I,J)-U(J))*(A(I,K)-U(K))*W(I) 1027 40 CONTINUE 1028 C(J,K,ICLUS)=SS/SP 1029 C(K,J,ICLUS)=C(J,K,ICLUS) 1030 55 CONTINUE 1031 50 CONTINUE 1032 RETURN 1033 END 1034 SUBROUTINE COVOUT(M, N, CLAB1, CLAB2, RLAB, TITLE, K, 1035CCCCC SUBROUTINE COVOUT(MM, M, N, A, CLAB1, CLAB2, RLAB, TITLE, K, 1036 * DMWORK, WORK1, DMC1, DMC2, C, WORK, 1037 * ICAPTY,ICAPSW,IFORSW) 1038C 1039C<><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><> 1040C 1041C PURPOSE 1042C ------- 1043C 1044C PRINTS RESULTS OF MIX 1045C 1046C DESCRIPTION 1047C ----------- 1048C 1049C 1. SEE DESCRIPTION OF MIX FOR DESCRIPTION OF OUTPUT. 1050C 1051C INPUT PARAMETERS 1052C ---------------- 1053C 1054C SEE SUBROUTINE MIX FOR PARAMETERS 1055C 1056C REFERENCE 1057C --------- 1058C 1059C HARTIGAN, J. A. (1975). CLUSTERING ALGORITHMS, JOHN WILEY & 1060C SONS, INC., NEW YORK. PAGE 127. 1061C 1062C<><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><> 1063C 1064CCCCC INTEGER DMWORK, P, U, PMIX, DMC1, DMC2, OUNIT 1065CCCCC DIMENSION A(MM,*), WORK1(DMWORK,*), C(DMC1,DMC2,*), WORK(*) 1066 INTEGER DMWORK, P, U, PMIX, DMC1, DMC2 1067 DIMENSION WORK1(DMWORK,*), C(DMC1,DMC2,*), WORK(*) 1068 CHARACTER*4 CLAB1(*), CLAB2(*) 1069 CHARACTER*8 RLAB(*) 1070 CHARACTER*10 TITLE 1071 CHARACTER*4 ICAPTY 1072 CHARACTER*4 ICAPSW 1073 CHARACTER*4 IFORSW 1074 CHARACTER*4 ISUBRO 1075 CHARACTER*4 IBUGA3 1076 CHARACTER*4 IERROR 1077C 1078 INCLUDE 'DPCOMC.INC' 1079 INCLUDE 'DPCOST.INC' 1080C 1081 PARAMETER(NUMCLI=9) 1082 PARAMETER(MAXLIN=3) 1083 PARAMETER(MAXROW=40) 1084 CHARACTER*65 ITITLE 1085 CHARACTER*60 ITITL9 1086 CHARACTER*4 ALIGN(NUMCLI) 1087 CHARACTER*4 VALIGN(NUMCLI) 1088 INTEGER IDIGI2(MAXROW,NUMCLI) 1089 INTEGER NTOT(MAXROW) 1090 INTEGER ROWSEP(MAXROW) 1091 CHARACTER*20 ITITL2(MAXLIN,NUMCLI) 1092 CHARACTER*20 IVALUE(MAXROW,NUMCLI) 1093 CHARACTER*4 ITYPCO(NUMCLI) 1094 INTEGER NCTIT2(MAXLIN,NUMCLI) 1095 INTEGER NCVALU(MAXROW,NUMCLI) 1096 INTEGER NCOLSP(MAXLIN,NUMCLI) 1097 INTEGER IWHTML(NUMCLI) 1098 INTEGER IWRTF(NUMCLI) 1099 REAL AMAT(MAXROW,NUMCLI) 1100 LOGICAL IFRST 1101 LOGICAL ILAST 1102 LOGICAL IFLAGS 1103 LOGICAL IFLAGE 1104C 1105 INCLUDE 'DPCOP2.INC' 1106C 1107 ISUBRO='XXXX' 1108 IBUGA3='OFF' 1109 IERROR='OFF' 1110C 1111 P = 0 1112 U = P + M 1113 PMIX = U + N + 1 1114C 1115 NUMDIG=7 1116 IF(IFORSW.EQ.'1')NUMDIG=1 1117 IF(IFORSW.EQ.'2')NUMDIG=2 1118 IF(IFORSW.EQ.'3')NUMDIG=3 1119 IF(IFORSW.EQ.'4')NUMDIG=4 1120 IF(IFORSW.EQ.'5')NUMDIG=5 1121 IF(IFORSW.EQ.'6')NUMDIG=6 1122 IF(IFORSW.EQ.'7')NUMDIG=7 1123 IF(IFORSW.EQ.'8')NUMDIG=8 1124 IF(IFORSW.EQ.'9')NUMDIG=9 1125 IF(IFORSW.EQ.'0')NUMDIG=0 1126 IF(IFORSW.EQ.'E')NUMDIG=-7 1127 IF(IFORSW.EQ.'-2')NUMDIG=-2 1128 IF(IFORSW.EQ.'-3')NUMDIG=-3 1129 IF(IFORSW.EQ.'-4')NUMDIG=-4 1130 IF(IFORSW.EQ.'-5')NUMDIG=-5 1131 IF(IFORSW.EQ.'-6')NUMDIG=-6 1132 IF(IFORSW.EQ.'-7')NUMDIG=-7 1133 IF(IFORSW.EQ.'-8')NUMDIG=-8 1134 IF(IFORSW.EQ.'-9')NUMDIG=-9 1135C 1136 IF(IPRINT.EQ.'ON')THEN 1137 WRITE(ICOUT,999) 1138 999 FORMAT(1X) 1139 CALL DPWRST('XXX','WRIT') 1140 WRITE(ICOUT,1) TITLE,K 1141 1 FORMAT(' MIXTURE MODEL FOR ',A10,' WITH',I5,' CLUSTERS') 1142 CALL DPWRST('XXX','WRIT') 1143 WRITE(ICOUT,999) 1144 CALL DPWRST('XXX','WRIT') 1145 WRITE(ICOUT,2)(KK,KK=1,K) 1146 2 FORMAT(' CLUSTER',3X,9(6X,I4,3X)) 1147 CALL DPWRST('XXX','WRIT') 1148 ENDIF 1149C 1150C PRINT CLUSTER PROBABILITIES 1151C 1152 IF(IPRINT.EQ.'ON')THEN 1153 WRITE(ICOUT,999) 1154 CALL DPWRST('XXX','WRIT') 1155 WRITE(ICOUT,3) 1156 3 FORMAT(' MIXTURE PROBABILITIES') 1157 CALL DPWRST('XXX','WRIT') 1158 WRITE(ICOUT,333)(WORK1(PMIX,KK),KK=1,K) 1159 333 FORMAT((12X,10F12.6)) 1160 CALL DPWRST('XXX','WRIT') 1161 ENDIF 1162C 1163C PRINT MEANS 1164C 1165 IF(IPRINT.EQ.'ON')THEN 1166 WRITE(ICOUT,999) 1167 CALL DPWRST('XXX','WRIT') 1168C 1169CCCCC WRITE(ICOUT,4) 1170CCCC4 FORMAT(' CLUSTER MEANS') 1171CCCCC CALL DPWRST('XXX','WRIT') 1172C 1173CCCCC DO 10 J=1,N 1174CCCCC WRITE(ICOUT,5)CLAB1(J),CLAB2(J),(WORK1(U+J,KK),KK=1,K) 1175CCCC5 FORMAT(1X,2A4,5X,10F12.4) 1176CCCCC CALL DPWRST('XXX','WRIT') 1177CCC10 CONTINUE 1178C 1179 ITITLE=' ' 1180 NCTITL=0 1181 ITITL9='Cluster Means' 1182 NCTIT9=13 1183C 1184 IF(K.LE.6)THEN 1185 NLOOP=1 1186 ELSE 1187 NLOOP=K/6 1188 NTEMP=MOD(K,6) 1189 IF(NTEMP.GT.0)NLOOP=NLOOP+1 1190 ENDIF 1191C 1192 IWHTML(1)=100 1193 IWHTML(2)=150 1194 IWHTML(3)=150 1195 IWHTML(4)=150 1196 IWHTML(5)=150 1197 IWHTML(6)=150 1198 IWHTML(7)=150 1199 IWHTML(8)=150 1200 IINC2=1200 1201 IINC1=1500 1202 IWRTF(1)=IINC2 1203 IWRTF(2)=IWRTF(1)+IINC1 1204 IWRTF(3)=IWRTF(2)+IINC1 1205 IWRTF(4)=IWRTF(3)+IINC1 1206 IWRTF(5)=IWRTF(4)+IINC1 1207 IWRTF(6)=IWRTF(5)+IINC1 1208 IWRTF(7)=IWRTF(6)+IINC1 1209 IWRTF(8)=IWRTF(7)+IINC1 1210C 1211 DO1010II=1,NLOOP 1212 IF(II.EQ.NLOOP)THEN 1213 NUMCOL=MOD(K,6) 1214 IF(NUMCOL.EQ.0)NUMCOL=6 1215 ELSE 1216 NUMCOL=6 1217 ENDIF 1218 NUMCOL=NUMCOL+1 1219 NUMLIN=1 1220C 1221 DO1020J=1,NUMCLI 1222 DO1030I=1,MAXLIN 1223 ITITL2(I,J)=' ' 1224 NCTIT2(I,J)=0 1225 NCOLSP(I,J)=1 1226 1030 CONTINUE 1227 1020 CONTINUE 1228C 1229 ISTRT=(II-1)*7+1 1230 IEND=ISTRT+6 1231 IF(IEND.GT.N)IEND=N 1232C 1233 ITITL2(1,1)='Variable' 1234 NCTIT2(1,1)=8 1235C 1236 DO1040L=ISTRT,IEND 1237 ITITL2(1,L+1)='Cluster ' 1238 WRITE(ITITL2(1,L+1)(9:11),'(I3)')L 1239 NCTIT2(1,L+1)=11 1240 1040 CONTINUE 1241C 1242 NMAX=0 1243 ICNT=0 1244 ICNT2=0 1245 DO1050I=1,NUMCOL 1246 VALIGN(I)='b' 1247 ALIGN(I)='r' 1248 NTOT(I)=15 1249 ITYPCO(I)='NUME' 1250 IF(I.EQ.1)THEN 1251 NTOT(I)=12 1252 ALIGN(I)='l' 1253 ITYPCO(I)='ALPH' 1254 ENDIF 1255 NMAX=NMAX+NTOT(I) 1256 1050 CONTINUE 1257C 1258 ICNT=0 1259 DO1060J=1,N 1260 ICNT=ICNT+1 1261 IDIGI2(ICNT,1)=0 1262 IVALUE(ICNT,1)(1:4)=CLAB1(J)(1:4) 1263 IVALUE(ICNT,1)(5:8)=CLAB2(J)(1:4) 1264 NCVALU(ICNT,1)=8 1265 AMAT(ICNT,1)=0.0 1266C 1267 ICNT2=1 1268 DO1065KK=ISTRT,IEND 1269C 1270 ICNT2=ICNT2+1 1271 IDIGI2(ICNT,ICNT2)=NUMDIG 1272 IVALUE(ICNT,ICNT2)=' ' 1273 NCVALU(ICNT,ICNT2)=0 1274 AMAT(ICNT,ICNT2)=WORK1(U+J,KK) 1275 ROWSEP(ICNT)=0 1276C 1277 1065 CONTINUE 1278 1060 CONTINUE 1279C 1280 ROWSEP(ICNT)=1 1281 IFRST=.TRUE. 1282 ILAST=.TRUE. 1283 IFLAGS=.TRUE. 1284 IFLAGE=.TRUE. 1285 CALL DPDT5B(ITITLE,NCTITL, 1286 1 ITITL9,NCTIT9,ITITL2,NCTIT2, 1287 1 MAXLIN,NUMLIN,NUMCLI,NUMCOL, 1288 1 IVALUE,NCVALU,AMAT,ITYPCO,MAXROW,ICNT, 1289 1 IDIGI2,NTOT,IWHTML,IWRTF,VALIGN,ALIGN,NMAX, 1290 1 NCOLSP,ROWSEP, 1291 1 ICAPSW,ICAPTY,IFRST,ILAST, 1292 1 IFLAGS,IFLAGE, 1293 1 ISUBRO,IBUGA3,IERROR) 1294 1010 CONTINUE 1295C 1296 WRITE(ICOUT,999) 1297 CALL DPWRST('XXX','WRIT') 1298 WRITE(ICOUT,6) 1299 6 FORMAT(' DETERMINANTS') 1300 CALL DPWRST('XXX','WRIT') 1301 WRITE(ICOUT,666)(C(1,N+1,J),J=1,K) 1302 666 FORMAT((12X,10E12.4)) 1303 CALL DPWRST('XXX','WRIT') 1304 WRITE(ICOUT,999) 1305 CALL DPWRST('XXX','WRIT') 1306CCCCC WRITE(ICOUT,7) 1307CCCC7 FORMAT(' WITHIN CLUSTER VARIANCES AND CORRELATIONS') 1308CCCCC CALL DPWRST('XXX','WRIT') 1309C 1310CCCCC DO 30 I=1,N 1311CCCCC DO 30 J=I,N 1312CCCCC DO 20 KK=1,K 1313CCCCC Z=C(I,I,KK)*C(J,J,KK) 1314CCCCC WORK(KK)=C(I,J,KK) 1315CCCCC IF(I.EQ.J) Z=0. 1316CCC20 IF(Z.NE.0.) WORK(KK)=C(I,J,KK)*Z**(-0.5) 1317CCCCC IF(I.EQ.J) THEN 1318CCCCC WRITE(ICOUT,999) 1319CCCCC CALL DPWRST('XXX','WRIT') 1320CCCCC ENDIF 1321CCCCC WRITE(ICOUT,9) CLAB1(I),CLAB2(I),CLAB1(J),CLAB2(J), 1322CCCCC1 (WORK(KK),KK=1,K) 1323CCCC9 FORMAT(1X,2A4,2X,2A4,10F12.4) 1324CCCCC CALL DPWRST('XXX','WRIT') 1325CCC30 CONTINUE 1326C 1327 ITITLE=' ' 1328 NCTITL=0 1329 ITITL9='Within Cluster Variances and Correlations' 1330 NCTIT9=41 1331C 1332 IWHTML(1)=100 1333 IWHTML(2)=100 1334 IWHTML(3)=150 1335 IWHTML(4)=150 1336 IWHTML(5)=150 1337 IWHTML(6)=150 1338 IWHTML(7)=150 1339 IWHTML(8)=150 1340 IWHTML(9)=150 1341 IINC2=1200 1342 IINC1=1500 1343 IWRTF(1)=IINC2 1344 IWRTF(2)=IWRTF(1)+IINC2 1345 IWRTF(3)=IWRTF(2)+IINC1 1346 IWRTF(4)=IWRTF(3)+IINC1 1347 IWRTF(5)=IWRTF(4)+IINC1 1348 IWRTF(6)=IWRTF(5)+IINC1 1349 IWRTF(7)=IWRTF(6)+IINC1 1350 IWRTF(8)=IWRTF(7)+IINC1 1351 IWRTF(9)=IWRTF(8)+IINC1 1352C 1353 DO1110II=1,NLOOP 1354 IF(II.EQ.NLOOP)THEN 1355 NUMCOL=MOD(K,6) 1356 IF(NUMCOL.EQ.0)NUMCOL=6 1357 ELSE 1358 NUMCOL=6 1359 ENDIF 1360 NUMCOL=NUMCOL+2 1361 NUMLIN=1 1362C 1363 DO1120J=1,NUMCLI 1364 DO1130I=1,MAXLIN 1365 ITITL2(I,J)=' ' 1366 NCTIT2(I,J)=0 1367 NCOLSP(I,J)=1 1368 1130 CONTINUE 1369 1120 CONTINUE 1370C 1371 ISTRT=(II-1)*7+1 1372 IEND=ISTRT+6 1373 IF(IEND.GT.N)IEND=N 1374C 1375 ITITL2(1,1)='I' 1376 NCTIT2(1,1)=1 1377 ITITL2(1,2)='J' 1378 NCTIT2(1,2)=1 1379C 1380 DO1140L=ISTRT,IEND 1381 ITITL2(1,L+2)='Cluster ' 1382 WRITE(ITITL2(1,L+2)(9:11),'(I3)')L 1383 NCTIT2(1,L+2)=11 1384 1140 CONTINUE 1385C 1386 NMAX=0 1387 ICNT=0 1388 ICNT2=0 1389 DO1150I=1,NUMCOL 1390 VALIGN(I)='b' 1391 ALIGN(I)='r' 1392 NTOT(I)=15 1393 ITYPCO(I)='NUME' 1394 IF(I.LE.2)THEN 1395 NTOT(I)=12 1396 ALIGN(I)='l' 1397 ITYPCO(I)='ALPH' 1398 ENDIF 1399 NMAX=NMAX+NTOT(I) 1400 1150 CONTINUE 1401C 1402 ICNT=0 1403 DO1160I=1,N 1404 DO1165J=I,N 1405 ICNT=ICNT+1 1406 IDIGI2(ICNT,1)=0 1407 IDIGI2(ICNT,2)=0 1408 IVALUE(ICNT,1)(1:4)=CLAB1(I)(1:4) 1409 IVALUE(ICNT,1)(5:8)=CLAB2(I)(1:4) 1410 NCVALU(ICNT,1)=8 1411 AMAT(ICNT,1)=0.0 1412 IVALUE(ICNT,2)(1:4)=CLAB1(J)(1:4) 1413 IVALUE(ICNT,2)(5:8)=CLAB2(J)(1:4) 1414 NCVALU(ICNT,2)=8 1415 AMAT(ICNT,2)=0.0 1416C 1417 ICNT2=2 1418 DO1168KK=ISTRT,IEND 1419C 1420 Z=C(I,I,KK)*C(J,J,KK) 1421 WORK(KK)=C(I,J,KK) 1422 IF(I.EQ.J)Z=0. 1423 IF(Z.NE.0.)WORK(KK)=C(I,J,KK)*Z**(-0.5) 1424 ICNT2=ICNT2+1 1425 IDIGI2(ICNT,ICNT2)=NUMDIG 1426 IVALUE(ICNT,ICNT2)=' ' 1427 NCVALU(ICNT,ICNT2)=0 1428 AMAT(ICNT,ICNT2)=WORK(KK) 1429 ROWSEP(ICNT)=0 1430 1168 CONTINUE 1431C 1432 1165 CONTINUE 1433 1160 CONTINUE 1434C 1435 ROWSEP(ICNT)=1 1436 IFRST=.TRUE. 1437 ILAST=.TRUE. 1438 IFLAGS=.TRUE. 1439 IFLAGE=.TRUE. 1440 CALL DPDT5B(ITITLE,NCTITL, 1441 1 ITITL9,NCTIT9,ITITL2,NCTIT2, 1442 1 MAXLIN,NUMLIN,NUMCLI,NUMCOL, 1443 1 IVALUE,NCVALU,AMAT,ITYPCO,MAXROW,ICNT, 1444 1 IDIGI2,NTOT,IWHTML,IWRTF,VALIGN,ALIGN,NMAX, 1445 1 NCOLSP,ROWSEP, 1446 1 ICAPSW,ICAPTY,IFRST,ILAST, 1447 1 IFLAGS,IFLAGE, 1448 1 ISUBRO,IBUGA3,IERROR) 1449C 1450 1110 CONTINUE 1451 ENDIF 1452C 1453C PRINT PROBABILITIES 1454C 1455 IF(IPRINT.EQ.'ON')THEN 1456 WRITE(ICOUT,999) 1457 CALL DPWRST('XXX','WRIT') 1458CCCCC WRITE(ICOUT,11) 1459CCC11 FORMAT(' BELONGING PROBABILITIES') 1460CCCCC CALL DPWRST('XXX','WRIT') 1461CCCCC DO 40 I=1,M 1462CCCCC WRITE(ICOUT,12) RLAB(I),(WORK1(P+I,KK),KK=1,K) 1463CCC12 FORMAT(1X,A8,2X,10F12.6) 1464CCCCC CALL DPWRST('XXX','WRIT') 1465CCC40 CONTINUE 1466C 1467 ITITLE=' ' 1468 NCTITL=0 1469 ITITL9='Belonging Probabilities' 1470 NCTIT9=23 1471C 1472C DO 2 LOOPS: 1473C 1474C 1) FIRST LOOP IS FOR THE NUMBER OF ROWS (OBSERVATIONS) 1475C 2) SECOND LOOP IS FOR THE NUMBER OF COLUMNS (CLUSTERS) 1476C 1477 IF(M.LE.MAXROW)THEN 1478 NLOOP2=1 1479 ELSE 1480 NLOOP2=M/MAXROW 1481 IF(MOD(M,MAXROW).GT.0)NLOOP2=NLOOP2+1 1482 ENDIF 1483C 1484 IF(K.LE.6)THEN 1485 NLOOP=1 1486 ELSE 1487 NLOOP=K/6 1488 IF(MOD(K,6).GT.0)NLOOP=NLOOP+1 1489 ENDIF 1490C 1491 IWHTML(1)=120 1492 IWHTML(2)=150 1493 IWHTML(3)=150 1494 IWHTML(4)=150 1495 IWHTML(5)=150 1496 IWHTML(6)=150 1497 IWHTML(7)=150 1498 IWHTML(8)=150 1499 IINC2=1200 1500 IINC1=1500 1501 IWRTF(1)=IINC2 1502 IWRTF(2)=IWRTF(1)+IINC1 1503 IWRTF(3)=IWRTF(2)+IINC1 1504 IWRTF(4)=IWRTF(3)+IINC1 1505 IWRTF(5)=IWRTF(4)+IINC1 1506 IWRTF(6)=IWRTF(5)+IINC1 1507 IWRTF(7)=IWRTF(6)+IINC1 1508 IWRTF(8)=IWRTF(7)+IINC1 1509C 1510 DO1201JJ=1,NLOOP2 1511 IROW1=(JJ-1)*MAXROW+1 1512 IROW2=JJ*MAXROW 1513 IF(IROW2.GT.M)IROW2=M 1514 DO1210II=1,NLOOP 1515 IF(II.EQ.NLOOP)THEN 1516 NUMCOL=MOD(K,6) 1517 IF(NUMCOL.EQ.0)NUMCOL=6 1518 ELSE 1519 NUMCOL=6 1520 ENDIF 1521 NUMCOL=NUMCOL+1 1522 NUMLIN=1 1523C 1524 DO1220J=1,NUMCLI 1525 DO1230I=1,MAXLIN 1526 ITITL2(I,J)=' ' 1527 NCTIT2(I,J)=0 1528 NCOLSP(I,J)=1 1529 1230 CONTINUE 1530 1220 CONTINUE 1531C 1532 ISTRT=(II-1)*7+1 1533 IEND=ISTRT+6 1534 IF(IEND.GT.N)IEND=N 1535C 1536 ITITL2(1,1)='Observation' 1537 NCTIT2(1,1)=11 1538C 1539 DO1240L=ISTRT,IEND 1540 ITITL2(1,L+1)='Cluster ' 1541 WRITE(ITITL2(1,L+1)(9:11),'(I3)')L 1542 NCTIT2(1,L+1)=11 1543 1240 CONTINUE 1544C 1545 NMAX=0 1546 ICNT=0 1547 ICNT2=0 1548 DO1250I=1,NUMCOL 1549 VALIGN(I)='b' 1550 ALIGN(I)='r' 1551 NTOT(I)=15 1552 ITYPCO(I)='NUME' 1553 IF(I.EQ.1)THEN 1554 NTOT(I)=12 1555 ALIGN(I)='l' 1556 ITYPCO(I)='ALPH' 1557 ENDIF 1558 NMAX=NMAX+NTOT(I) 1559 1250 CONTINUE 1560C 1561 ICNT=0 1562 DO1260J=IROW1,IROW2 1563 ICNT=ICNT+1 1564 IDIGI2(ICNT,1)=0 1565 IVALUE(ICNT,1)(1:8)=RLAB(J)(1:8) 1566 NCVALU(ICNT,1)=8 1567 AMAT(ICNT,1)=0.0 1568C 1569 ICNT2=1 1570 DO1265KK=ISTRT,IEND 1571C 1572 ICNT2=ICNT2+1 1573 IDIGI2(ICNT,ICNT2)=NUMDIG 1574 IVALUE(ICNT,ICNT2)=' ' 1575 NCVALU(ICNT,ICNT2)=0 1576 AMAT(ICNT,ICNT2)=WORK1(P+J,KK) 1577 ROWSEP(ICNT)=0 1578C 1579 1265 CONTINUE 1580 1260 CONTINUE 1581C 1582 ROWSEP(ICNT)=1 1583 IFRST=.TRUE. 1584 ILAST=.TRUE. 1585 IFLAGS=.TRUE. 1586 IFLAGE=.TRUE. 1587 CALL DPDT5B(ITITLE,NCTITL, 1588 1 ITITL9,NCTIT9,ITITL2,NCTIT2, 1589 1 MAXLIN,NUMLIN,NUMCLI,NUMCOL, 1590 1 IVALUE,NCVALU,AMAT,ITYPCO,MAXROW,ICNT, 1591 1 IDIGI2,NTOT,IWHTML,IWRTF,VALIGN,ALIGN,NMAX, 1592 1 NCOLSP,ROWSEP, 1593 1 ICAPSW,ICAPTY,IFRST,ILAST, 1594 1 IFLAGS,IFLAGE, 1595 1 ISUBRO,IBUGA3,IERROR) 1596 1210 CONTINUE 1597 1201 CONTINUE 1598C 1599 ENDIF 1600 RETURN 1601 END 1602 SUBROUTINE CSPLIT(MM, M, A, CLAB, IR, KA, TH, IORD, DMIWRK, 1603CCCCC SUBROUTINE CSPLIT(MM, M, N, A, CLAB, IR, KA, TH, IORD, DMIWRK, 1604 * IWORK, DMWORK, WORK) 1605C 1606C<><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><> 1607C 1608C PURPOSE 1609C ------- 1610C 1611C FINDS OPTIMAL SPLIT OF VARIABLES 1612C 1613C DESCRIPTION 1614C ----------- 1615C 1616C 1. INITIALLY, THE FIRST CLUSTER CONSISTS OF ALL VARIABLES WITHIN 1617C THE BLOCK IR AND THE SECOND CLUSTER IS EMPTY. THE REDUCTION IN 1618C THE WITHIN-CLUSTER SUM OF SQUARES FOR MOVING EACH VARIABLE 1619C FROM THE FIRST CLUSTER TO THE SECOND IS CALCULATED. THE 1620C VARIABLE THAT REDUCES THE SUM OF SQUARES THE MOST IS MOVED AND 1621C THIS CONTINUES UNTIL ALL VARIABLES ARE MOVED WITH EACH 1622C REDUCTION STORED. THEN THE SPLIT THAT HAD THE SMALLEST 1623C REDUCTION OF ALL IS RETURNED AS THE OPTIMUM SPLIT. 1624C 1625C INPUT PARAMETERS 1626C ---------------- 1627C 1628C MM, M, N, A, CLAB, TH, IORD, DMIWRK, DMWORK -- SEE SUBROUTINE SPLIT2 1629C 1630C IR INTEGER SCALAR (UNCHANGED ON OUTPUT). 1631C NUMBER OF BLOCK TO BE SPLIT. 1632C 1633C KA INTEGER SCALAR (UNCHANGED ON OUTPUT). 1634C NUMBER OF BLOCKS. 1635C 1636C IWORK INTEGER MATRIX WHOSE FIRST DIMENSION MUST BE DMIWRK AND SECOND 1637C DIMENSION MUST BE AT LEAST KA. 1638C THE MATRIX DEFINING THE BOUNDARIES OF THE BLOCKS. 1639C 1640C IWORK(1,I) IS 1 + THE FIRST ROW IN BLOCK I 1641C IWORK(2,I) IS 1 + THE LAST ROW IN BLOCK I 1642C IWORK(3,I) IS 1 + THE FIRST COLUMN IN BLOCK I 1643C IWORK(4,I) IS 1 + THE LAST COLUMN IN BLOCK I 1644C 1645C WORK REAL MATRIX WHOSE FIRST DIMENSION MUST BE DMWORK AND SECOND 1646C DIMENSION MUST BE AT LEAST MAX(M,N). 1647C 1648C WORK(1,I) = FIRST CASE IN CASE CLUSTER I 1649C WORK(2,I) = LAST CASE IN CASE CLUSTER I 1650C WORK(3,I) = REDUCTION IN SSQ DUE TO SPLITTING 1651C WORK(4,I) = LAST CASE IN FIRST CLUSTER OF SPLIT OF I 1652C WORK(5,I) = 1 IF CASE IS INCLUDED IN PRESENT VARIABLE SPLIT 1653C WORK(6,I) = NUMBER OF VARIABLES IN I-TH ROW OF PRESENT 1654C VARIABLE SPLIT 1655C WORK(7,I) = MEAN OF I-TH CASE, FIRST VARIABLE CLUSTER 1656C WORK(8,I) = NUMBER OF VARIABLES SECOND CLUSTER 1657C WORK(9,I) = MEAN OF I-TH CASE, SECOND CLUSTER 1658C 1659C WORK(10-18,I) ARE SIMILAR WITH VARIABLES AND CASES REVERSED. 1660C 1661C REFERENCE 1662C --------- 1663C 1664C HARTIGAN, J. A. (1975). CLUSTERING ALGORITHMS, JOHN WILEY & 1665C SONS, INC., NEW YORK. PAGE 276. 1666C 1667C<><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><> 1668C 1669 INTEGER DMWORK, DMIWRK 1670 DIMENSION A(MM,*), IWORK(DMIWRK,*), WORK(DMWORK,*) 1671 CHARACTER*4 CLAB(*), C 1672C 1673 INCLUDE 'DPCOMC.INC' 1674 INCLUDE 'DPCOBE.INC' 1675 INCLUDE 'DPCOP2.INC' 1676C 1677 XM=99999. 1678 DO 10 I=1,M 1679 WORK(5,I)=0. 1680 10 CONTINUE 1681C 1682C LOOK FOR BLOCKS WITHIN THRESHOLD 1683C 1684 JL=INT(WORK(10,IR)) 1685 JU=INT(WORK(11,IR)) 1686 DO 40 K=1,KA 1687 IF(IWORK(3,K).EQ.JL+1.AND.IWORK(4,K).EQ.JU+1) THEN 1688 IL=IWORK(1,K) 1689 IF(IL.LT.0) IL=-IL 1690 IU=IWORK(2,K) 1691C 1692C COMPUTE VARIANCES 1693C 1694 NC=0 1695 DO 30 I=IL-1,IU-1 1696 S1=0. 1697 S2=0. 1698 S3=0. 1699 DO 20 J=JL,JU 1700 IF(A(I,J).NE.XM) THEN 1701 S1=S1+1 1702 S2=S2+A(I,J) 1703 S3=S3+A(I,J)**2 1704 ENDIF 1705 20 CONTINUE 1706 WORK(6,I)=S1 1707 IF(S1.NE.0.) THEN 1708 WORK(7,I)=S2/S1 1709 S3=S3/S1-(S2/S1)**2 1710 ENDIF 1711 IF(S3.GT.TH) THEN 1712 WORK(5,I)=1. 1713 NC=1 1714 ENDIF 1715 30 CONTINUE 1716 IF(NC.EQ.0) IWORK(3,K)=-IWORK(3,K) 1717 ENDIF 1718 40 CONTINUE 1719C 1720C FIND BEST VARIABLE SPLIT 1721C 1722 DO 50 I=1,M 1723 WORK(8,I)=0. 1724 WORK(9,I)=0. 1725 50 CONTINUE 1726 DM=0. 1727 WORK(12,IR)=0. 1728 WORK(13,IR)=JL 1729 DO 100 J=JL,JU-1 1730 JJ=JU-J+JL 1731 JD=JJ 1732 DD=-R1MACH(2) 1733 DO 70 L=JL,JJ 1734 IF(IORD.LT.2.OR.L.EQ.JJ) THEN 1735 DL=0. 1736 DO 60 I=1,M 1737 IF(WORK(5,I).NE.0.AND.A(I,L).NE.XM) THEN 1738 DL=DL+(A(I,L)-WORK(7,I))**2*(WORK(6,I)+1.)/WORK(6,I) 1739 DL=DL-(A(I,L)-WORK(9,I))**2*WORK(8,I)/(WORK(8,I)+1.) 1740 ENDIF 1741 60 CONTINUE 1742 IF(DL.GT.DD) THEN 1743 DD=DL 1744 JD=L 1745 ENDIF 1746 ENDIF 1747 70 CONTINUE 1748C 1749C INTERCHANGE JD AND JJ 1750C 1751 DO 80 I=1,M 1752 CC=A(I,JJ) 1753 A(I,JJ)=A(I,JD) 1754 A(I,JD)=CC 1755 80 CONTINUE 1756 C = CLAB(JJ) 1757 CLAB(JJ) = CLAB(JD) 1758 CLAB(JD) = C 1759C 1760C UPDATE MEANS 1761C 1762 DO 90 I=1,M 1763 IF(WORK(5,I).NE.0..AND.A(I,JJ).NE.XM) THEN 1764 WORK(6,I)=WORK(6,I)-1. 1765 IF(WORK(6,I).NE.0.)WORK(7,I)=WORK(7,I)+(WORK(7,I)- 1766 * A(I,JJ))/WORK(6,I) 1767 WORK(8,I)=WORK(8,I)+1. 1768 WORK(9,I)=WORK(9,I)-(WORK(9,I)-A(I,JJ))/WORK(8,I) 1769 ENDIF 1770 90 CONTINUE 1771 DM=DM+DD 1772 IF(DM.GE.WORK(12,IR)) THEN 1773 WORK(12,IR)=DM 1774 WORK(13,IR)=JJ-1 1775 ENDIF 1776 100 CONTINUE 1777 RETURN 1778 END 1779 SUBROUTINE INVERT(MM, M, A, DET, WORK, IWORK, IERR) 1780CCCCC SUBROUTINE INVERT(MM, M, A, DET, WORK, IWORK, IERR, OUNIT) 1781C 1782C<><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><> 1783C 1784C PURPOSE 1785C ------- 1786C 1787C COMPUTES THE INVERSE AND DETERMINANT OF THE SYMMETRIC MATRIX 1788C (E.G., A COVARIANCE MATRIX) 1789C 1790C DESCRIPTION 1791C ----------- 1792C 1793C 1. THE LINPACK SUBROUTINE SSIFA IS CALLED TO FACTOR THE MATRIX AND 1794C THEN THE LINPACK SUBROUTINE SSIDI IS CALLED TO USE THE 1795C FACTORIZATION TO FIND THE INVERSE AND DETERMINANT. THE INPUT 1796C MATRIX MUST BE SYMMETRIC AND IS OVERWRITTEN WITH ITS INVERSE ON 1797C OUTPUT. 1798C 1799C INPUT PARAMETERS 1800C ---------------- 1801C 1802C MM INTEGER SCALAR (UNCHANGED ON OUTPUT). 1803C THE FIRST DIMENSION OF THE MATRIX A. MUST BE AT LEAST M. 1804C 1805C M INTEGER SCALAR (UNCHANGED ON OUTPUT). 1806C THE NUMBER OF ROWS AND COLUMNS IN THE MATRIX A. 1807C 1808C A REAL SYMMETRIC MATRIX WHOSE FIRST DIMENSION MUST BE MM AND 1809C WHOSE SECOND DIMENSION MUST BE AT LEAST M (CHANGED ON 1810C OUTPUT). 1811C THE MATRIX OF DATA VALUES. 1812C 1813C A(I,J) IS THE VALUE FOR THE J-TH VARIABLE FOR THE I-TH CASE. 1814C 1815C WORK REAL VECTOR DIMENSIONED AT LEAST N. 1816C WORK VECTOR. 1817C 1818C IWORK INTEGER VECTOR DIMENSIONED AT LEAST N. 1819C WORK VECTOR. 1820C 1821C OUNIT INTEGER SCALAR (UNCHANGED ON OUTPUT). 1822C UNIT NUMBER FOR ERROR MESSAGES. 1823C 1824C OUTPUT PARAMETERS 1825C ----------------- 1826C 1827C A REAL MATRIX WHOSE FIRST DIMENSION MUST BE MM AND SECOND 1828C DIMENSION MUST BE AT LEAST N. 1829C THE INVERSE OF THE INPUT MATRIX. 1830C 1831C DET REAL VECTOR DIMENSIONED AT LEAST 2. 1832C THE DETERMINANT OF THE MATRIX. 1833C 1834C THE DETERMINANT IS DET(1) ** DET(2). 1835C 1836C IERR INTEGER SCALAR. 1837C ERROR FLAG. 1838C 1839C IF IERR = 0, NO ERROR CONDITION WAS DETECTED. 1840C 1841C IF IERR = K, THE K-TH PIVOT BLOCK IS SINGULAR. THE INVERSE IS 1842C NOT COMPUTED. ERROR CONDITION SET IN CMLIB 1843C ROUTINE SSIFA. 1844C 1845C REFERENCES 1846C ---------- 1847C 1848C HARTIGAN, J. A. (1975). CLUSTERING ALGORITHMS, JOHN WILEY & 1849C SONS, INC., NEW YORK. PAGE 69. 1850C 1851C NBS CORE MATH LIBRARY, VOLS. 1-4 (GAITHERSBURG: QA297.C69 IN NBS 1852C LIBRARY, ADMIN E-120). 1853C 1854C<><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><> 1855C 1856CCCCC INTEGER OUNIT 1857CCCCC DIMENSION A(MM,*), IWORK(*), WORK(*), DET(*), INERT(3) 1858 DIMENSION A(MM,*), IWORK(*), WORK(*), DET(*) 1859C 1860 INCLUDE 'DPCOMC.INC' 1861 INCLUDE 'DPCOP2.INC' 1862C 1863C NOTE: FOR DATAPLOT, REPLACE OLDER LINPAC ROUTINES WITH 1864C VERSIONS THAT ARE USED IN DATAPLOT. 1865C 1866 IERR=0 1867CCCCC CALL SSIFA(A,MM,M,IWORK,IERR) 1868CCCCC IF (IERR .NE. 0) THEN 1869CCCCC IF (OUNIT .GT. 0) THEN 1870CCCCC WRITE(ICOUT,1) 1871CCC1 FORMAT('MATRIX TO BE INVERTED MAY BE SINGULAR') 1872CCCCC CALL DPWRST('XXX','WRIT') 1873CCCCC GOTO9000 1874CCCCC ENDIF 1875CCCCC JOB = 111 1876CCCCC CALL SSIDI(A,MM,M,IWORK,DET,INERT,WORK,JOB) 1877 CALL SGECO(A,MM,M,IWORK,RCOND,WORK) 1878C 1879 EPS=1.0E-20 1880 IF(RCOND.LE.EPS)THEN 1881 WRITE(ICOUT,999) 1882 CALL DPWRST('XXX','BUG ') 1883 WRITE(ICOUT,2571) 1884 CALL DPWRST('XXX','ERRO ') 1885 WRITE(ICOUT,2572) 1886 CALL DPWRST('XXX','ERRO ') 1887 GOTO9000 1888 ELSE 1889 IJOB=1 1890 CALL SGEDI(A,MM,M,IWORK,DET,WORK,IJOB) 1891 END IF 1892 999 FORMAT(1X) 1893 2571 FORMAT('****** ERROR IN INVERT ********') 1894 2572 FORMAT(' THE INPUT MATRIX IS SINGULAR') 1895CCCCC END CHANGE 1896C 1897 DO 10 I = 1 , M 1898 DO 20 J = I , M 1899 A(J,I) = A(I,J) 1900 20 CONTINUE 1901 10 CONTINUE 1902C 1903 9000 CONTINUE 1904 RETURN 1905 END 1906 SUBROUTINE KMEANS(N, X, K, XMISS, DMSUM1, DMSUM2, SUM, JMIN, DMIN) 1907C 1908C<><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><> 1909C 1910C PURPOSE 1911C ------- 1912C 1913C ASSIGNS A VARIABLE TO ITS CLOSEST CLUSTER AND UPDATES THE SUMMARY 1914C STATISTICS 1915C 1916C DESCRIPTION 1917C ----------- 1918C 1919C 1. THE DISTANCE BETWEEN THE CASE X AND THE CENTER OF EACH CLUSTER 1920C IS COMPUTED AND X IS ASSIGNED TO THE CLUSTER WITH THE SMALLEST 1921C DISTANCE. THE SUMMARY STATISTICS FOR THE ASSIGNED CLUSTER ARE 1922C THEN UPDATED. 1923C 1924C INPUT PARAMETERS 1925C ---------------- 1926C 1927C N INTEGER SCALAR (UNCHANGED ON OUTPUT). 1928C THE NUMBER OF VARIABLES. 1929C 1930C X REAL VECTOR DIMENSIONED AT LEAST N (UNCHANGED ON OUTPUT). 1931C THE MATRIX OF DATA VALUES. 1932C 1933C K INTEGER SCALAR (UNCHANGED ON OUTPUT). 1934C THE NUMBER OF CLUSTERS. 1935C 1936C XMISS REAL SCALAR (UNCHANGED ON OUTPUT). 1937C VALUE THAT A DATA VALUE IS SET TO IF CONSIDERED MISSING. 1938C 1939C DMSUM1 INTEGER SCALAR (UNCHANGED ON OUTPUT). 1940C THE FIRST DIMENSION OF THE MATRIX SUM. MUST BE AT LEAST 7. 1941C 1942C DMSUM2 INTEGER SCALAR (UNCHANGED ON OUTPUT). 1943C THE SECOND DIMENSION OF THE MATRIX SUM. MUST BE AT LEAST N. 1944C 1945C OUTPUT PARAMETERS 1946C ------------------ 1947C 1948C SUM REAL MATRIX WHOSE FIRST DIMENSION MUST BE DMSUM1, WHOSE SECOND 1949C DIMENSION MUST BE DMSUM2, AND WHOSE THIRD DIMENSION MUST 1950C BE AT LEAST K+1. 1951C THE PARAMETERS FOR EACH CLUSTER. 1952C 1953C JMIN INTEGER SCALAR. 1954C CLUSTER WHOSE CENTER X IS CLOSEST TO. 1955C 1956C DMIN REAL SCALAR. 1957C DISTANCE BETWEEN X AND CENTER OF JMIN CLUSTER. 1958C 1959C REFERENCE 1960C --------- 1961C 1962C HARTIGAN, J. A. (1975). CLUSTERING ALGORITHMS, JOHN WILEY & 1963C SONS, INC., NEW YORK. PAGES 84-105. 1964C 1965C<><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><> 1966C 1967 INTEGER DMSUM1, DMSUM2 1968 DIMENSION SUM(DMSUM1,DMSUM2,*), X(*) 1969C 1970 INCLUDE 'DPCOMC.INC' 1971C 1972 JMIN=1 1973 DMIN=R1MACH(2) 1974C 1975C CALCULATE DISTANCE TO EACH CLUSTER CENTER 1976C 1977 DO 20 J=1,K 1978 XP=R1MACH(4) 1979 DD=0. 1980 DO 10 I=1,N 1981 IF (X(I).NE.XMISS) THEN 1982 DD=DD+(X(I)-SUM(1,I,J))**2 1983 XP=XP+1. 1984 ENDIF 1985 10 CONTINUE 1986 DD=(DD/XP)**0.5 1987 IF(DD.LE.DMIN) THEN 1988 DMIN=DD 1989 JMIN=J 1990 ENDIF 1991 20 CONTINUE 1992C 1993C UPDATE SUMMARY STATISTICS FOR CHOSEN CLUSTER 1994C 1995 DO 30 I=1,N 1996 IF(X(I).NE.XMISS) CALL SINGLE(X(I),SUM(2,I,JMIN),SUM(3,I,JMIN), 1997 * SUM(4,I,JMIN),SUM(5,I,JMIN),SUM(6,I,JMIN),SUM(7,I,JMIN)) 1998 30 CONTINUE 1999 RETURN 2000 END 2001 SUBROUTINE KOUT(M, N, CLAB, RLAB, KK, DMSUM1, 2002 * DMSUM2, SUM, NCLUS, DCLUS, DD, R, CWORK) 2003CCCCC SUBROUTINE KOUT(MM, M, N, A, CLAB, RLAB, TITLE, KK, DMSUM1, 2004CCCCC* DMSUM2, SUM, NCLUS, DCLUS, DD, R, CWORK, OUNIT) 2005C 2006C<><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><> 2007C 2008C PURPOSE 2009C ------- 2010C 2011C PRINTS OUTPUT FOR K-MEANS ALGORITHM 2012C 2013C DESCRIPTION 2014C ----------- 2015C 2016C 1. THE OUTPUT CONSISTS OF THE OVERALL STATISTICS FOR THE CURRENT 2017C PARTITION, FOLLOWED BY THE STATISTICS FOR EACH CLUSTER. THE 2018C ANALYSIS OF VARIANCE IS COMPUTED FOR EACH VARIABLE IN THE 2019C PARTITION. 2020C 2021C INPUT PARAMETERS 2022C ---------------- 2023C 2024C MM INTEGER SCALAR (UNCHANGED ON OUTPUT). 2025C THE FIRST DIMENSION OF THE MATRIX A. MUST BE AT LEAST M. 2026C 2027C M INTEGER SCALAR (UNCHANGED ON OUTPUT). 2028C THE NUMBER OF CASES. 2029C 2030C N INTEGER SCALAR (UNCHANGED ON OUTPUT). 2031C THE NUMBER OF VARIABLES. 2032C 2033C A REAL MATRIX WHOSE FIRST DIMENSION MUST BE MM AND WHOSE SECOND 2034C DIMENSION MUST BE AT LEAST N (UNCHANGED ON OUTPUT). 2035C THE MATRIX OF DATA VALUES. 2036C 2037C CLAB VECTOR OF 4-CHARACTER VARIABLES DIMENSIONED AT LEAST N. 2038C (UNCHANGED ON OUTPUT). 2039C THE LABELS OF THE VARIABLES. 2040C 2041C RLAB VECTOR OF 4-CHARACTER VARIABLES DIMENSIONED AT LEAST M. 2042C (UNCHANGED ON OUTPUT). 2043C THE LABELS OF THE CASES. 2044C 2045C TITLE 10-CHARACTER VARIABLE (UNCHANGED ON OUTPUT). 2046C TITLE OF THE DATA SET. 2047C 2048C KK INTEGER SCALAR (UNCHANGED ON OUTPUT). 2049C THE NUMBER OF CLUSTERS. 2050C 2051C DMSUM1 INTEGER SCALAR (UNCHANGED ON OUTPUT). 2052C THE FIRST DIMENSION OF THE MATRIX SUM. MUST BE AT LEAST 7. 2053C 2054C DMSUM2 INTEGER SCALAR (UNCHANGED ON OUTPUT). 2055C THE SECOND DIMENSION OF THE MATRIX SUM. MUST BE AT LEAST N. 2056C 2057C SUM REAL MATRIX WHOSE FIRST DIMENSION MUST BE DMSUM1, WHOSE SECOND 2058C DIMENSION MUST BE DMSUM2, AND WHOSE THIRD DIMENSION MUST 2059C BE AT LEAST K+1 (UNCHANGED ON OUTPUT). 2060C THE PARAMETERS FOR EACH CLUSTER. 2061C 2062C SUM(1,J,I) IS THE VALUE OF THE J-TH VARIABLE AT THE CENTER OF 2063C CLUSTER I 2064C SUM(2,J,I) IS THE NUMBER OF NON-MISSING OBSERVATIONS FOR THE 2065C J-TH VARIABLE IN CLUSTER I 2066C SUM(3,J,I) IS THE MEAN OF THE J-TH VARIABLE IN CLUSTER I 2067C SUM(4,J,I) IS THE STANDARD DEVIATION OF THE J-TH VARIABLE IN 2068C CLUSTER I 2069C SUM(5,J,I) IS THE MINIMUM OF THE J-TH VARIABLE IN CLUSTER I 2070C SUM(6,J,I) IS THE MAXIMUM OF THE J-TH VARIABLE IN CLUSTER I 2071C SUM(7,J,I) IS THE SUM OF SQUARED DEVIATIONS FOR THE J-TH 2072C VARIABLE FROM THE MEAN OF CLUSTER I 2073C 2074C THE K+1-ST ROW OF SUM STORES THE SAME CALCULATIONS AS ABOVE 2075C EXCEPT FOR THE ENTIRE DATA SET RATHER THAN FOR AN 2076C INDIVIDUAL CLUSTER 2077C 2078C NCLUS INTEGER VECTOR DIMENSIONED AT LEAST M (UNCHANGED ON OUTPUT). 2079C NCLUS(I) IS THE CLUSTER FOR CASE I. 2080C 2081C DCLUS REAL VECTOR DIMENSIONED AT LEAST M (UNCHANGED ON OUTPUT). 2082C DCLUS(I) IS THE DISTANCE OF EACH CASE TO THE CLOSEST CLUSTER. 2083C 2084C DD REAL VECTOR DIMENSIONED AT LEAST N. 2085C WORK VECTOR. 2086C 2087C R REAL VECTOR DIMENSIONED AT LEAST N. 2088C WORK VECTOR. 2089C 2090C CWORK VECTOR OF 4-CHARACTER VARIABLES DIMENSIONED AT LEAST N. 2091C WORK VECTOR. 2092C 2093C OUNIT INTEGER SCALAR (UNCHANGED ON OUTPUT). 2094C UNIT NUMBER FOR OUTPUT. 2095C 2096C REFERENCE 2097C --------- 2098C 2099C HARTIGAN, J. A. (1975). CLUSTERING ALGORITHMS, JOHN WILEY & 2100C SONS, INC., NEW YORK. PAGE 110. 2101C 2102C<><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><> 2103C 2104CCCCC INTEGER DMSUM1, DMSUM2, OUNIT 2105 INTEGER DMSUM1, DMSUM2 2106CCCCC DIMENSION SUM(DMSUM1,DMSUM2,*), NCLUS(*), DCLUS(*), A(MM,*), R(*), 2107 DIMENSION SUM(DMSUM1,DMSUM2,*), NCLUS(*), DCLUS(*), R(*), DD(*) 2108 CHARACTER*4 CLAB(*), RLAB(*), CWORK(*) 2109CCCCC CHARACTER*10 TITLE 2110C 2111 INCLUDE 'DPCOMC.INC' 2112 INCLUDE 'DPCOP2.INC' 2113C 2114C 1/2008: MODIFIED FOR DATAPLOT TO USE DATAPLOT I/O 2115C 2116 DATA NPAGE,LC/0,0/ 2117C 2118C OUTPUT MEAN SQUARE CALCULATION OVER ALL CLUSTERS 2119C 2120 999 FORMAT(1X) 2121C 2122 NPAGE=NPAGE+1 2123C 2124 WRITE(ICOUT,1) NPAGE 2125 1 FORMAT('1',110X,I5) 2126 WRITE(ICOUT,2) KK 2127 2 FORMAT(' OVERALL MEAN SQUARE CALCULATIONS, FOR EACH VARIABLE, ', 2128 1 ' WITH',I5,' CLUSTERS') 2129 CALL DPWRST('XXX','WRIT') 2130C 2131 ASSW=0. 2132 DO 20 J=1,N 2133 SD=0. 2134 SC=0. 2135 SSB=0. 2136 SSW=0. 2137 DO 10 K=1,KK 2138 SD=SD+SUM(3,J,K)*SUM(2,J,K) 2139 SSB=SSB+SUM(3,J,K)**2*SUM(2,J,K) 2140 SSW=SSW+SUM(7,J,K) 2141 SC=SC+SUM(2,J,K) 2142 10 CONTINUE 2143 DFB=KK-1. 2144 DFW=SC-DFB-1. 2145 ASSW=ASSW+SSW 2146 IF(SC.GT.0.) SSB=SSB-SD**2/SC 2147 IF(DFB.GT.0.) SSB=SSB/DFB 2148 IF(DFW.GT.0.) SSW=SSW/DFW 2149 RATIO=0. 2150 IF(LC.NE.0.AND.SSW.GT.0.) RATIO=(R(J)/SSW-1.)*(1.+DFW)+1. 2151 R(J)=SSW 2152C 2153 WRITE(ICOUT,3)CLAB(J),SSW,DFW,SSB,DFB,RATIO 2154 3 FORMAT(' VARIABLE',4X,A4,F20.6, 2155 1 '(WITHIN MEAN SQ.)',F4.0,'(WITHIN DF)',F20.6, 2156 1 '(BETWEEN MSQ)',F4.0,'(BETWEEN DF)',F6.1,'(FRATIO)') 2157 CALL DPWRST('XXX','WRIT') 2158C 2159 20 CONTINUE 2160C 2161 WRITE(ICOUT,4) ASSW 2162 4 FORMAT(' OVERALL WITHIN SUM OF SQUARES',F20.6) 2163 CALL DPWRST('XXX','WRIT') 2164C 2165 LC=LC+1 2166C 2167C OUTPUT STATISTICS FOR EACH CLUSTER 2168C 2169 DO 50 K=1,KK 2170C 2171 WRITE(ICOUT,5) 2172 5 FORMAT(1X,131('-')) 2173 CALL DPWRST('XXX','WRIT') 2174 WRITE(ICOUT,6) K,KK 2175 6 FORMAT(I5,' TH CLUSTER OF',I5) 2176 CALL DPWRST('XXX','WRIT') 2177 WRITE(ICOUT,999) 2178 CALL DPWRST('XXX','WRIT') 2179 WRITE(ICOUT,7) 2180 7 FORMAT('CLUSTER MEMBERS WITH THEIR DISTANCES TO THE ', 2181 1 'CLUSTER CENTER') 2182 CALL DPWRST('XXX','WRIT') 2183 WRITE(ICOUT,17)(I,I=1,10) 2184 17 FORMAT(13X,10I11) 2185 CALL DPWRST('XXX','WRIT') 2186C 2187 L=0 2188 DO 30 I=1,M 2189 IF(NCLUS(I).EQ.K) THEN 2190 L=L+1 2191 CWORK(L)=RLAB(I) 2192 DD(L)=DCLUS(I) 2193 ENDIF 2194 IF ((L.GE.10.OR.I.GE.M).AND.L.NE.0) THEN 2195C 2196 WRITE(ICOUT,8)(CWORK(LL),LL=1,L) 2197 8 FORMAT(15X,10(7X,A4)) 2198 CALL DPWRST('XXX','WRIT') 2199 WRITE(ICOUT,9)(DD(LL),LL=1,L) 2200 9 FORMAT(15X,10F11.4) 2201 CALL DPWRST('XXX','WRIT') 2202C 2203 L=0 2204 ENDIF 2205 30 CONTINUE 2206C 2207 WRITE(ICOUT,999) 2208 CALL DPWRST('XXX','WRIT') 2209 WRITE(ICOUT,11) 2210 11 FORMAT('SUMMARY STATISTICS FOR THE CLUSTER') 2211 CALL DPWRST('XXX','WRIT') 2212 WRITE(ICOUT,12) 2213 12 FORMAT(' LABEL',5X,'CENTRE',8X,'COUNT',12X,'AVE', 2214 1 13X,'SD',11X,'XMIN',11X,'XMAX',12X,'SSQ') 2215 CALL DPWRST('XXX','WRIT') 2216C 2217 DO 40 J=1,N 2218C 2219 WRITE(ICOUT,13)CLAB(J),(SUM(I,J,K),I=1,7) 2220 13 FORMAT(1X,A4,7F15.6) 2221 CALL DPWRST('XXX','WRIT') 2222C 2223 40 CONTINUE 2224 50 CONTINUE 2225 RETURN 2226 END 2227 SUBROUTINE MIX(MM, M, N, A, CLAB1, CLAB2, RLAB, TITLE, K, MXITER, 2228 * NCOV, DMWORK, WORK1, DMWRK1, DMWRK2, WORK2, DMWRK3, 2229 * WORK3, IWORK, 2230 * ICAPTY, ICAPSW, IFORSW, IERR) 2231C 2232C<><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><> 2233C 2234C PURPOSE 2235C ------- 2236C 2237C FITS THE MIXTURE MODEL BY A MAXIMUM LOG-LIKEHOOD CRITERION 2238C 2239C DESCRIPTION 2240C ----------- 2241C 2242C 1. THE DATA ARE ASSUMED TO BE A RANDOM SAMPLE OF SIZE M FROM A 2243C MIXTURE OF K MULTIVARIATE NORMAL DISTRIBUTIONS IN N DIMENSIONS. 2244C THE PROBABILITY THAT THE J-TH OBSERVATION WAS DRAWN FROM THE 2245C I-TH NORMAL FOR J=1,...,M I=1,...,K IS USED TO ESTIMATE WHICH 2246C NORMAL EACH OBSERVATION WAS SAMPLED FROM, AND HENCE GROUP THE 2247C OBSERVATIONS INTO K CLUSTERS. THE CRITERION TO BE MAXIMIZED IS 2248C THE LOG LIKELIHOOD 2249C 2250C SUM LOG(G(I)) OVER I=1,...,M 2251C 2252C WHERE G(I) IS THE PROBABILITY DENSITY OF THE I-TH OBSERVATION. 2253C 2254C SEE PAGE 116 OF THE REFERENCE FOR A FURTHER DESCRIPTION OF G. 2255C 2256C 2. THE MANY PARAMETERS PRESENT IN THE BETWEEN-NORMAL COVARIANCE 2257C MATRICES REQUIRE MUCH DATA FOR THEIR ESTIMATION. A RULE OF 2258C THUMB IS THAT M SHOULD BE GREATER THAN (N+1)(N+2)K/2. EVEN 2259C WITH MANY OBSERVATIONS, THE PROCEDURE IS VULNERABLE TO 2260C NONNORMALITY OR LINEAR DEPENDENCE AMONG THE VARIABLES. TO 2261C REDUCE THIS SENSITIVITY ONE CAN MAKE ASSUMPTIONS ON THESE 2262C COVARIANCE MATRICES BY SETTING THE NCOV PARAMETER TO: 2263C 2264C 1 IF THE COVARIANCE MATRICES ARE ARBITRARY 2265C 2 IF THE COVARIANCE MATRICES IN DIFFERENT NORMALS ARE EQUAL 2266C 3 IF THE COVARIANCE MATRICES ARE EQUAL AND DIAGONAL 2267C 4 IF ALL VARIABLES HAVE THE SAME VARIANCE AND ARE PAIRWISE 2268C INDEPENDENT 2269C 2270C 3. AFTER EVERY 5 ITERATIONS, THE CLUSTER PROBABILITIES, MEANS, AND 2271C DETERMINANTS OF COVARIANCE MATRICES ARE PRINTED OUT. ALSO, THE 2272C WITHIN-CLUSTER VARIANCES AND CORRELATIONS FOR EVERY PAIR OF 2273C VARIABLES FOR EACH CLUSTER, AND FINALLY EVERY OBSERVATION AND 2274C ITS BELONGING PROBABILILTY FOR EACH CLUSTER IS PRINTED. THE 2275C LOG LIKELIHOOD IS PRINTED AFTER EACH ITERATION. THE ITERATIONS 2276C STOP EITHER AFTER THE MAXIMUM NUMBER OF ITERATIONS HAVE BEEN 2277C REACHED OR AFTER THE INCREASE IN THE LOG LIKELIHOOD FROM ONE 2278C ITERATION TO ANOTHER IS LESS THAT .0001. ALL OUTPUT IS SENT TO 2279C FORTRAN UNIT OUNIT. 2280C 2281C INPUT PARAMETERS 2282C ---------------- 2283C 2284C MM INTEGER SCALAR (UNCHANGED ON OUTPUT). 2285C THE FIRST DIMENSION OF THE MATRIX A. MUST BE AT LEAST M. 2286C 2287C M INTEGER SCALAR (UNCHANGED ON OUTPUT). 2288C THE NUMBER OF CASES. 2289C 2290C N INTEGER SCALAR (UNCHANGED ON OUTPUT). 2291C THE NUMBER OF VARIABLES. 2292C 2293C A REAL MATRIX WHOSE FIRST DIMENSION MUST BE MM AND WHOSE SECOND 2294C DIMENSION MUST BE AT LEAST N (UNCHANGED ON OUTPUT). 2295C THE MATRIX OF DATA VALUES. 2296C 2297C A(I,J) IS THE VALUE FOR THE J-TH VARIABLE FOR THE I-TH CASE. 2298C 2299C CLAB VECTOR OF 4-CHARACTER VARIABLES DIMENSIONED AT LEAST N. 2300C (UNCHANGED ON OUTPUT). 2301C THE LABELS OF THE VARIABLES. 2302C 2303C RLAB VECTOR OF 4-CHARACTER VARIABLES DIMENSIONED AT LEAST M. 2304C (UNCHANGED ON OUTPUT). 2305C THE LABELS OF THE CASES. 2306C 2307C TITLE 10-CHARACTER VARIABLE (UNCHANGED ON OUTPUT). 2308C TITLE OF THE DATA SET. 2309C 2310C K INTEGER SCALAR (UNCHANGED ON OUTPUT). 2311C THE DESIRED NUMBER OF CLUSTERS. 2312C 2313C MXITER INTEGER SCALAR (UNCHANGED ON OUTPUT). 2314C THE MAXIMUM NUMBER OF ITERATIONS ALLOWED. 2315C 2316C NCOV INTEGER SCALAR (UNCHANGED ON OUTPUT). 2317C DETERMINES STRUCTURE OF THE WITHIN-CLUSTER COVARIANCE MATRIX 2318C 2319C NCOV = 1 GENERAL COVARIANCES 2320C NCOV = 2 COVARIANCES EQUAL BETWEEN CLUSTERS 2321C NCOV = 3 COVARIANCES EQUAL AND DIAGONAL 2322C NCOV = 4 COVARIANCES SPHERICAL 2323C 2324C DMWORK INTEGER SCALAR (UNCHANGED ON OUTPUT). 2325C THE LEADING DIMENSION OF THE MATRIX WORK1. MUST BE AT LEAST 2326C 2*M+N+1. 2327C 2328C WORK1 REAL MATRIX WHOSE FIRST DIMENSION MUST BE DMWORK AND WHOSE 2329C SECOND DIMENSION MUST BE AT LEAST K. 2330C WORK MATRIX. 2331C 2332C DMWRK1 INTEGER SCALAR (UNCHANGED ON OUTPUT). 2333C THE FIRST DIMENSION OF THE MATRIX WORK2. MUST BE AT LEAST N. 2334C 2335C DMWRK2 INTEGER SCALAR (UNCHANGED ON OUTPUT). 2336C THE SECOND DIMENSION OF THE MATRIX WORK2. MUST BE AT LEAST 2337C N+1. 2338C 2339C WORK2 REAL MATRIX WHOSE FIRST DIMENSION MUST BE DMWRK1, WHOSE SECOND 2340C DIMENSION MUST BE DMWRK2, AND WHOSE THIRD DIMENSION MUST BE 2341C AT LEAST K+1. 2342C WORK MATRIX. 2343C 2344C DMWRK3 INTEGER SCALAR (UNCHANGED ON OUTPUT). 2345C THE LEADING DIMENSION OF THE MATRIX WORK3. MUST BE AT LEAST 2346C N. 2347C 2348C WORK3 REAL MATRIX WHOSE FIRST DIMENSION MUST BE DMWRK3 AND WHOSE 2349C SECOND DIMENSION MUST BE AT LEAST N+1. 2350C WORK MATRIX. 2351C 2352C IWORK INTEGER VECTOR DIMENSIONED AT LEAST N. 2353C WORK VECTOR. 2354C 2355C OUNIT INTEGER SCALAR (UNCHANGED ON OUTPUT). 2356C UNIT NUMBER FOR OUTPUT. 2357C 2358C OUTPUT PARAMETER 2359C ---------------- 2360C 2361C IERR INTEGER SCALAR. 2362C ERROR FLAG. 2363C 2364C IF IERR = 0, NO ERROR WAS DETECTED. 2365C 2366C IF IERR = K, THE K-TH PIVOT BLOCK OF ONE OF THE COVARIANCE 2367C MATRICES WAS SINGULAR. THEREFORE, AN INVERSE 2368C COULD NOT BE CALCULATED AND EXECUTION WAS 2369C TERMINATED. THE ERROR FLAG WAS SET IN CMLIB 2370C SUBROUTINE SSIFA. 2371C 2372C REFERENCE 2373C --------- 2374C 2375C HARTIGAN, J. A. (1975). CLUSTERING ALGORITHMS, JOHN WILEY & 2376C SONS, INC., NEW YORK. PAGES 113-129. 2377C 2378C<><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><> 2379C 2380CCCCC INTEGER OUNIT, P, U, PMIX, T, DMWORK, DMWRK1, DMWRK2, DMWRK3 2381 INTEGER P, U, PMIX, T, DMWORK, DMWRK1, DMWRK2, DMWRK3 2382 DIMENSION A(MM,*), WORK1(DMWORK,*), WORK2(DMWRK1,DMWRK2,*), 2383 * DETER(2), IWORK(*), WORK3(DMWRK3,*) 2384 CHARACTER*4 CLAB1(*), CLAB2(*) 2385 CHARACTER*8 RLAB(*) 2386 CHARACTER*10 TITLE 2387 CHARACTER*4 ICAPTY 2388 CHARACTER*4 ICAPSW 2389 CHARACTER*4 IFORSW 2390 LOGICAL DONE 2391C 2392 INCLUDE 'DPCOMC.INC' 2393 INCLUDE 'DPCOP2.INC' 2394C 2395C INITIALIZE 2396C 2397 DONE = .FALSE. 2398 P = 0 2399 U = P + M 2400 PMIX = U + N + 1 2401 T = PMIX 2402 XLL1 = -R1MACH(2) 2403 DO 10 J=1,K 2404 WORK2(1,N+1,J)=0. 2405 10 CONTINUE 2406 DO 30 I=1,M 2407 DO 20 J=1,K 2408 WORK1(P+I,J)=0. 2409 20 CONTINUE 2410 J=(I*K)/(M+1)+1 2411 WORK1(P+I,J)=1. 2412 30 CONTINUE 2413 DO 200 IT=1,MXITER 2414C 2415C UPDATE MEANS AND COVARIANCES 2416C 2417 DO 40 J=1,K 2418 CALL CLUMOM(MM,M,N,A,J,WORK1(P+1,J),WORK1(U+1,J),DMWRK1, 2419 * DMWRK2,WORK2) 2420 40 CONTINUE 2421C 2422C UPDATE WEIGHTS 2423C 2424 WW=0. 2425 DO 60 J=1,K 2426 WORK1(PMIX,J)=0. 2427 DO 50 I=1,M 2428 WORK1(PMIX,J)=WORK1(PMIX,J)+WORK1(P+I,J) 2429 50 CONTINUE 2430 WW=WW+WORK1(PMIX,J) 2431 60 CONTINUE 2432 DO 70 J=1,K 2433 IF(WW.NE.0.) WORK1(PMIX,J)=WORK1(PMIX,J)/WW 2434 70 CONTINUE 2435C 2436C ADJUST FOR COVARIANCE STRUCTURE 2437C 2438 IF(NCOV.NE.1) THEN 2439 DO 100 I=1,N 2440 DO 105 II=1,N 2441 WORK2(I,II,1)=WORK1(PMIX,1)*WORK2(I,II,1) 2442 DO 80 J=2,K 2443 WORK2(I,II,1)=WORK2(I,II,1)+WORK2(I,II,J)* 2444 * WORK1(PMIX,J) 2445 80 CONTINUE 2446 IF(NCOV.GE.3.AND.I.NE.II) WORK2(I,II,1)=0. 2447 DO 90 J=2,K 2448 WORK2(I,II,J)=WORK2(I,II,1) 2449 90 CONTINUE 2450 105 CONTINUE 2451 100 CONTINUE 2452 IF (NCOV.EQ.4) THEN 2453 CC=0. 2454 DO 110 I=1,N 2455 CC=CC+WORK2(I,I,1) 2456 110 CONTINUE 2457 CC=CC/N 2458 DO 120 I=1,N 2459 DO 125 J=1,K 2460 WORK2(I,I,J)=CC 2461 125 CONTINUE 2462 120 CONTINUE 2463 ENDIF 2464 ENDIF 2465 II=IT-1 2466 IF(((II/5)*5.EQ.II.OR.DONE)) 2467CCCCC* CALL COVOUT(MM,M,N,A,CLAB1,CLAB2,RLAB,TITLE,K,DMWORK,WORK1, 2468 * CALL COVOUT(M,N,CLAB1,CLAB2,RLAB,TITLE,K,DMWORK,WORK1, 2469 * DMWRK1,DMWRK2,WORK2,WORK1(T+1,1), 2470 * ICAPTY,ICAPSW,IFORSW) 2471 IF (DONE) RETURN 2472C 2473C UPDATE BELONGING PROBABILITIES 2474C 2475 DO 160 J=1,K 2476C 2477C COMPUTE INVERSES AND DETERMINANTS OF COVARIANCE MATRICES 2478C 2479 DO 130 III = 1 , N 2480 DO 135 JJJ = 1 , N 2481 WORK3(III,JJJ) = WORK2(III,JJJ,J) 2482 135 CONTINUE 2483 130 CONTINUE 2484CCCCC CALL INVERT(DMWRK3,N,WORK3,DETER,WORK3(1,N+1),IWORK,IERR, 2485CCCCC* OUNIT) 2486 CALL INVERT(DMWRK3,N,WORK3,DETER,WORK3(1,N+1),IWORK,IERR) 2487 IF (IERR .NE. 0) RETURN 2488 DET = DETER(1) * (10. ** DETER(2)) 2489 DO 140 III = 1 , N 2490 DO 145 JJJ = 1 , N 2491 WORK2(III,JJJ,J) = WORK3(III,JJJ) 2492 145 CONTINUE 2493 140 CONTINUE 2494 IF(DET.EQ.0.) RETURN 2495 DET=SQRT(ABS(DET)) 2496 WORK2(1,N+1,J)=DET 2497C 2498C COMPUTE PROBABILITY DENSITY FOR THE I-TH OBSERVATION FROM THE J-TH 2499C NORMAL 2500C 2501 DO 165 I=1,M 2502 S=0. 2503 DO 150 L=1,N 2504 DO 155 LL=1,N 2505 S=S+WORK2(L,LL,J)*(A(I,L)-WORK1(U+L,J))*(A(I,LL)- 2506 * WORK1(U+LL,J)) 2507 155 CONTINUE 2508 150 CONTINUE 2509 IF(S.GT.100.) S=100. 2510 WORK1(T+I,J)=EXP(-S/2.)*WORK1(PMIX,J)/DET 2511 165 CONTINUE 2512 160 CONTINUE 2513C 2514C COMPUTES LOG LIKELIHOOD 2515C 2516 XLL=0. 2517 DO 180 I=1,M 2518 S=0. 2519 DO 170 J=1,K 2520 S=S+WORK1(T+I,J) 2521 170 CONTINUE 2522 IF(S.EQ.0.) S=R1MACH(4) 2523 XLL=XLL+LOG(S) 2524 DO 185 J=1,K 2525 WORK1(T+I,J)=WORK1(T+I,J)/S 2526 185 CONTINUE 2527 180 CONTINUE 2528 IF (IPRINT.EQ.'ON') THEN 2529 WRITE(ICOUT,1) IT,XLL 2530 1 FORMAT(' ITERATION = ',I5,' LOG LIKELIHOOD = ',F12.6) 2531 CALL DPWRST('XXX','WRIT') 2532 ENDIF 2533C 2534C UPDATE PROBABILITY THE I-TH OBSERVATION WAS DRAWN FROM THE J-TH 2535C NORMAL 2536C 2537 DO 190 I=1,M 2538 DO 195 J=1,K 2539 XIT=MXITER 2540 ALPHA=1.+.7*IT/XIT 2541 WORK1(P+I,J)=ALPHA*WORK1(T+I,J)-(ALPHA-1.)*WORK1(P+I,J) 2542C 2543C AT EVERY FIFTH ITERATION, SET PROBABILITIES TO EITHER ZERO OR ONE 2544C 2545 IF(IT.EQ.5.AND.WORK1(P+I,J).GT.0.5) WORK1(P+I,J)=1. 2546 IF(IT.EQ.5.AND.WORK1(P+I,J).LE.0.5) WORK1(P+I,J)=0. 2547 IF(WORK1(P+I,J).GT.1.) WORK1(P+I,J)=1. 2548 IF(WORK1(P+I,J).LT.0.) WORK1(P+I,J)=0. 2549 195 CONTINUE 2550 190 CONTINUE 2551C 2552C RETURN IF NO CHANGE IN LOG LIKELIHOOD 2553C 2554 IF (XLL-XLL1 .LE. .00001) DONE = .TRUE. 2555 XLL1 = XLL 2556 200 CONTINUE 2557 RETURN 2558 END 2559 SUBROUTINE MIXIND(MM, M, N, A, CLAB, RLAB, TITLE, K, DMWORK, 2560 * WORK1, WORK2) 2561CCCCC* WORK1, WORK2, OUNIT) 2562C 2563C<><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><> 2564C 2565C PURPOSE 2566C ------- 2567C 2568C FITS THE MIXTURE MODEL FROM K MULTIVARIATE NORMALS WHERE K IS 2569C THE DESIRED NUMBER OF CLUSTERS. THE VARIABLES ARE ASSUMED TO 2570C HAVE VARIANCE CONSTANT OVER DIFFERENT CLUSTERS 2571C 2572C DESCRIPTION 2573C ----------- 2574C 2575C 1. THE DATA ARE ASSUMED TO BE A RANDOM SAMPLE OF SIZE M FROM A 2576C MIXTURE OF K MULTIVARIATE NORMAL DISTRIBUTIONS IN N DIMENSIONS. 2577C THE SUBROUTINE PREDICTS THE DISTRIBUTION THAT EACH OBSERVATION 2578C WAS SAMPLED FROM AND HENCE GROUPS THE OBSERVATIONS INTO K 2579C CLUSTERS. SEE PAGE 113 OF THE REFERENCE FOR A FURTHER 2580C DESCRIPTION OF THE MIXTURE ALGORITHM. 2581C 2582C 2. THE ROUTINE BEGINS WITH THE CLUSTER OF ALL OBJECTS AND THEN 2583C DIVIDES INTO TWO, THEN THREE, ..., THEN FINALLY K CLUSTERS. 2584C THE RESULTS ARE PRINTED AFTER EACH DIVISION ON FORTRAN UNIT 2585C OUNIT. THE RESULTS CONSIST OF THE WITHIN-CLUSTER VARIANCES FOR 2586C EACH VARIABLE, THEN SETS UP A COLUMN FOR EACH CLUSTER. THE 2587C MIXTURE PROBABILITY IS THE PROBABILITY THAT A NEW OBJECT WILL 2588C BE GROUPED INTO THAT CLUSTER. THEN THE MEANS OF THE VARIABLES 2589C FOR THE CLUSTER ARE PRINTED, AS WELL AS THE PROBABILITIES THAT 2590C EACH CASE BELONGS TO EACH CLUSTER. 2591C 2592C INPUT PARAMETERS 2593C ---------------- 2594C 2595C MM INTEGER SCALAR (UNCHANGED ON OUTPUT). 2596C THE FIRST DIMENSION OF THE MATRIX A. MUST BE AT LEAST M. 2597C 2598C M INTEGER SCALAR (UNCHANGED ON OUTPUT). 2599C THE NUMBER OF CASES. 2600C 2601C N INTEGER SCALAR (UNCHANGED ON OUTPUT). 2602C THE NUMBER OF VARIABLES. 2603C 2604C A REAL MATRIX WHOSE FIRST DIMENSION MUST BE MM AND WHOSE SECOND 2605C DIMENSION MUST BE AT LEAST N (UNCHANGED ON OUTPUT). 2606C THE MATRIX OF DATA VALUES. 2607C 2608C A(I,J) IS THE VALUE FOR THE J-TH VARIABLE FOR THE I-TH CASE. 2609C 2610C CLAB VECTOR OF 4-CHARACTER VARIABLES DIMENSIONED AT LEAST N. 2611C (UNCHANGED ON OUTPUT). 2612C THE LABELS OF THE VARIABLES. 2613C 2614C RLAB VECTOR OF 4-CHARACTER VARIABLES DIMENSIONED AT LEAST M. 2615C (UNCHANGED ON OUTPUT). 2616C THE LABELS OF THE CASES. 2617C 2618C TITLE 10-CHARACTER VARIABLE (UNCHANGED ON OUTPUT). 2619C TITLE OF THE DATA SET. 2620C 2621C K INTEGER SCALAR (UNCHANGED ON OUTPUT). 2622C THE NUMBER OF CLUSTERS. 2623C 2624C DMWORK INTEGER SCALAR (UNCHANGED ON OUTPUT). 2625C THE LEADING DIMENSION OF THE MATRIX WORK1. MUST BE AT LEAST 2626C N+M+1. 2627C 2628C WORK1 REAL MATRIX WHOSE FIRST DIMENSION MUST BE DMWORK AND WHOSE 2629C SECOND DIMENSION MUST BE AT LEAST K. 2630C WORK MATRIX. 2631C 2632C WORK2 REAL VECTOR DIMENSIONED AT LEAST N. 2633C WORK VECTOR. 2634C 2635C OUNIT INTEGER SCALAR (UNCHANGED ON OUTPUT). 2636C UNIT NUMBER FOR OUTPUT. 2637C 2638C REFERENCE 2639C --------- 2640C 2641C HARTIGAN, J. A. (1975). CLUSTERING ALGORITHMS, JOHN WILEY & 2642C SONS, INC., NEW YORK. PAGES 113-129. 2643C 2644C<><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><> 2645C 2646CCCCC INTEGER DMWORK, U, P, PMIX, OUNIT 2647 INTEGER DMWORK, U, P, PMIX 2648 DIMENSION A(MM,*), WORK1(DMWORK,*), WORK2(*) 2649 CHARACTER*4 CLAB(*), RLAB(*) 2650 CHARACTER*10 TITLE 2651C 2652 INCLUDE 'DPCOMC.INC' 2653C 2654 U = 0 2655 P = U + N 2656 PMIX = P + M + 1 2657 XM=99999. 2658 TH=.0001 2659 DO 160 KK=1,K 2660C 2661C IF NOT FIRST PASS, FIND FURTHEST CASE FROM PRESENT MEANS 2662C 2663 DM=0. 2664 IM=1 2665 IF(KK.NE.1) THEN 2666 DO 30 I=1,M 2667 DI=R1MACH(2)/N 2668 DO 20 KL=1,KK-1 2669 DD=0. 2670 XC=0. 2671 DO 10 J=1,N 2672 IF(A(I,J).NE.XM) THEN 2673 XC=XC+1. 2674 DD=DD+(A(I,J)-WORK1(U+J,KL))**2 /WORK2(J) 2675 IF(DD.GT.DI*N) GO TO 20 2676 ENDIF 2677 10 CONTINUE 2678 IF(XC.EQ.0.) GO TO 30 2679 DD=DD/XC 2680 IF(DD.LT.DI) DI=DD 2681 20 CONTINUE 2682 IF(DI.GE.DM) THEN 2683 DM=DI 2684 IM=I 2685 ENDIF 2686 30 CONTINUE 2687 ENDIF 2688C 2689C BEGIN A NEW CLUSTER LABELED KK 2690C 2691 DO 40 J=1,N 2692 WORK1(U+J,KK)=A(IM,J) 2693 40 CONTINUE 2694 WORK1(PMIX,KK)=EXP(0.5*N) 2695 ITER=25 2696 DO 150 IT=1,ITER 2697C 2698C UPDATE PROBABILITIES OF BELONGING 2699C 2700 DO 90 I=1,M 2701 PP=0. 2702 DO 60 KL=1,KK 2703 DD=0. 2704 DO 50 J=1,N 2705 IF(A(I,J).NE.XM.AND.KK.NE.1) 2706 * DD=DD+(A(I,J)-WORK1(U+J,KL))**2/(WORK2(J)*2.) 2707 50 CONTINUE 2708 IF(DD.GT.100.) DD=100. 2709 WORK1(P+I,KL)=WORK1(PMIX,KL)*EXP(-DD) 2710 PP=PP+WORK1(P+I,KL) 2711 60 CONTINUE 2712 IF(PP.NE.0.) THEN 2713 PN=0. 2714 DO 70 KL=1,KK 2715 WORK1(P+I,KL)=WORK1(P+I,KL)/PP 2716 IF(WORK1(P+I,KL).LT.TH) WORK1(P+I,KL)=0. 2717 PN =PN+WORK1(P+I,KL) 2718 70 CONTINUE 2719 DO 80 KL=1,KK 2720 WORK1(P+I,KL)=WORK1(P+I,KL)/PN 2721 80 CONTINUE 2722 ENDIF 2723 90 CONTINUE 2724C 2725C UPDATE MIXTURE PROBABILITIES 2726C 2727 DO 100 KL=1,KK 2728 WORK1(PMIX,KL)=0. 2729 DO 105 I=1,M 2730 WORK1(PMIX,KL)=WORK1(PMIX,KL)+WORK1(P+I,KL)/M 2731 105 CONTINUE 2732 100 CONTINUE 2733C 2734C UPDATE CLUSTER ESTIMATES, EACH ONE A WEIGHTED MEAN 2735C 2736 DO 120 KL=1,KK 2737 DO 125 J=1,N 2738 WORK1(U+J,KL)=0. 2739 DO 110 I=1,M 2740 WORK1(U+J,KL)=WORK1(U+J,KL)+A(I,J)*WORK1(P+I,KL) 2741 110 CONTINUE 2742 IF(WORK1(PMIX,KL).NE.0.) 2743 * WORK1(U+J,KL)=WORK1(U+J,KL)/(WORK1(PMIX,KL)*M) 2744 125 CONTINUE 2745 120 CONTINUE 2746 DO 140 J=1,N 2747 WORK2(J)=0. 2748 DO 130 I=1,M 2749 DO 135 KL=1,KK 2750 WORK2(J)=WORK2(J)+(A(I,J)-WORK1(U+J,KL))**2* 2751 * WORK1(P+I,KL) 2752 135 CONTINUE 2753 130 CONTINUE 2754 WORK2(J)=WORK2(J)/M 2755 140 CONTINUE 2756 150 CONTINUE 2757C 2758C PRINT RESULTS OF ITERATION 2759C 2760CCCCC IF (OUNIT .GT. 0) CALL MIXOUT(MM,M,N,A,CLAB,RLAB,TITLE,KK, 2761CCCCC* DMWORK,WORK1,WORK2,OUNIT) 2762 CALL MIXOUT(M,N,CLAB,RLAB,TITLE,KK, 2763 * DMWORK,WORK1,WORK2) 2764 160 CONTINUE 2765 RETURN 2766 END 2767 SUBROUTINE MIXOUT(M, N, CLAB, RLAB, TITLE, K, DMWORK, 2768 * WORK1, WORK2) 2769CCCCC SUBROUTINE MIXOUT(MM, M, N, A, CLAB, RLAB, TITLE, K, DMWORK, 2770CCCCC* WORK1, WORK2, OUNIT) 2771C 2772C<><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><> 2773C 2774C PURPOSE 2775C ------- 2776C 2777C PRINTS THE RESULTS FOR EACH ITERATION OF MIXIND 2778C 2779C DESCRIPTION 2780C ----------- 2781C 2782C 1. SEE SUBROUTINE MIXIND FOR DESCRIPTION OF OUTPUT. 2783C 2784C INPUT PARAMETERS 2785C ---------------- 2786C 2787C K INTEGER SCALAR (UNCHANGED ON OUTPUT). 2788C THE CURRENT NUMBER OF CLUSTERS. 2789C 2790C FOR OTHER PARAMETERS -- SEE SUBROUTINE MIXIND 2791C 2792C REFERENCE 2793C --------- 2794C 2795C HARTIGAN, J. A. (1975). CLUSTERING ALGORITHMS, JOHN WILEY & 2796C SONS, INC., NEW YORK. PAGE 129. 2797C 2798C<><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><> 2799C 2800CCCCC INTEGER DMWORK, U, P, PMIX, OUNIT 2801CCCCC DIMENSION A(MM,*), WORK1(DMWORK,*), WORK2(*) 2802 INTEGER DMWORK, U, P, PMIX 2803 DIMENSION WORK1(DMWORK,*), WORK2(*) 2804 CHARACTER*4 CLAB(*), RLAB(*) 2805 CHARACTER*10 TITLE 2806C 2807 INCLUDE 'DPCOMC.INC' 2808 INCLUDE 'DPCOP2.INC' 2809C 2810 U = 0 2811 P = U + N 2812 PMIX = P + M + 1 2813C 2814 WRITE(ICOUT,999) 2815 999 FORMAT(1X) 2816 CALL DPWRST('XXX','WRIT') 2817 WRITE(ICOUT,1) TITLE,K 2818 1 FORMAT(' MIXTURE MODEL FOR',2X,A10,'WITH',I5,' CLUSTERS') 2819 CALL DPWRST('XXX','WRIT') 2820C 2821C PRINT VARIANCES 2822C 2823 WRITE(ICOUT,999) 2824 CALL DPWRST('XXX','WRIT') 2825 WRITE(ICOUT,2) 2826 2 FORMAT(' WITHIN CLUSTER VARIANCES') 2827 CALL DPWRST('XXX','WRIT') 2828 WRITE(ICOUT,222)(WORK2(J),CLAB(J),J=1,N) 2829 222 FORMAT(5(F15.6,'(',A4,')')) 2830 CALL DPWRST('XXX','WRIT') 2831C 2832C PRINT CLUSTER PROBABILITIES 2833C 2834 WRITE(ICOUT,3)(KK,KK=1,K) 2835 3 FORMAT(9X,' CLUSTER', 9(I3,1X,' CLUSTER')) 2836 CALL DPWRST('XXX','WRIT') 2837 WRITE(ICOUT,999) 2838 CALL DPWRST('XXX','WRIT') 2839 WRITE(ICOUT,4)(WORK1(PMIX,KK),KK=1,K) 2840 4 FORMAT(' MIXTURE PROBABILITIES',/(7X,10F12.6)) 2841 CALL DPWRST('XXX','WRIT') 2842C 2843C PRINT MEANS 2844C 2845 WRITE(ICOUT,999) 2846 CALL DPWRST('XXX','WRIT') 2847 WRITE(ICOUT,5) 2848 5 FORMAT(' CLUSTER MEANS') 2849 CALL DPWRST('XXX','WRIT') 2850C 2851 DO 10 J=1,N 2852 WRITE(ICOUT,6) CLAB(J),(WORK1(U+J,KK),KK=1,K) 2853 6 FORMAT(1X,A4,2X,10F12.4) 2854 CALL DPWRST('XXX','WRIT') 2855 10 CONTINUE 2856C 2857C PRINT PROBABILITIES 2858C 2859 WRITE(ICOUT,999) 2860 CALL DPWRST('XXX','WRIT') 2861 WRITE(ICOUT,7) 2862 7 FORMAT(' BELONGING PROBABILITIES') 2863 CALL DPWRST('XXX','WRIT') 2864C 2865 DO 20 I=1,M 2866 WRITE(ICOUT,8) RLAB(I),(WORK1(P+I,KK),KK=1,K) 2867 8 FORMAT(1X,A4,2X,10F12.6) 2868 CALL DPWRST('XXX','WRIT') 2869 20 CONTINUE 2870 RETURN 2871 END 2872 SUBROUTINE QUICK(MM, M, N, A, RLAB, THRESH, XMISS, 2873 * NC, IWORK, OUNIT) 2874CCCCC SUBROUTINE QUICK(MM, M, N, A, CLAB, RLAB, TITLE, THRESH, XMISS, 2875CCCCC* NC, IWORK, OUNIT) 2876C 2877C<><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><> 2878C 2879C PURPOSE 2880C ------- 2881C 2882C FINDS A QUICK PARTITION OF THE CASES BY COMPARING, TO A USER- 2883C DEFINED THRESHOLD, THE EUCLIDEAN DISTANCES TO THE EXISTING 2884C CLUSTER LEADERS 2885C 2886C DESCRIPTION 2887C ----------- 2888C 2889C 1. INITIALLY, THE FIRST CASE WILL BE ASSIGNED TO THE FIRST CLUSTER 2890C AND BECOMES THE LEADER OF THE FIRST CLUSTER. THEN, GIVEN A NEW 2891C CASE, CYCLE THROUGH THE EXISTING CLUSTERS IN ORDER. PLACE THE 2892C CASE IN THE FIRST CLUSTER WHERE THE DISTANCE BETWEEN THE CASE 2893C AND THE CLUSTER LEADER IS LESS THAN THE THRESHOLD. IF NO 2894C CLUSTER EXISTS, PLACE THE CASE IN A NEW CLUSTER MAKING IT THE 2895C CLUSTER LEADER. ONCE THE MAXIMUM NUMBER OF DESIRED CLUSTERS 2896C HAS BEEN REACHED, NO NEW CLUSTERS WILL BE FORMED AND CASES NOT 2897C BELONGING TO AN EXISTING CLUSTER WILL BE IGNORED. 2898C 2899C 2. THE DISTANCE FUNCTION USED IS THE EUCLIDEAN DISTANCE. THE 2900C VARIABLES SHOULD BE SCALED SIMILARLY (CLUSTER SUBROUTINE STAND 2901C CAN BE USED TO STANDARDIZE THE VARIABLES). ANY MISSING VALUES 2902C WILL BE IGNORED IN THE DISTANCE CALCULATION. 2903C 2904C 3. THE OUTPUT IS ON FORTRAN UNIT OUNIT, WHICH FOR EACH CLUSTER IS 2905C THE CLUSTER LEADER AND ITS VALUES FOLLOWED BY THE OTHER 2906C MEMBERS. 2907C 2908C INPUT PARAMETERS 2909C ---------------- 2910C 2911C MM INTEGER SCALAR (UNCHANGED ON OUTPUT). 2912C THE FIRST DIMENSION OF THE MATRIX A. MUST BE AT LEAST M. 2913C 2914C M INTEGER SCALAR (UNCHANGED ON OUTPUT). 2915C THE NUMBER OF CASES. 2916C 2917C N INTEGER SCALAR (UNCHANGED ON OUTPUT). 2918C THE NUMBER OF VARIABLES. 2919C 2920C A REAL MATRIX WHOSE FIRST DIMENSION MUST BE MM AND WHOSE SECOND 2921C DIMENSION MUST BE AT LEAST N (UNCHANGED ON OUTPUT). 2922C THE MATRIX OF DATA VALUES. 2923C 2924C A(I,J) IS THE VALUE FOR THE J-TH VARIABLE FOR THE I-TH CASE. 2925C 2926C CLAB VECTOR OF 4-CHARACTER VARIABLES DIMENSIONED AT LEAST N. 2927C (UNCHANGED ON OUTPUT). 2928C THE LABELS OF THE VARIABLES. 2929C 2930C RLAB VECTOR OF 4-CHARACTER VARIABLES DIMENSIONED AT LEAST M. 2931C (UNCHANGED ON OUTPUT). 2932C THE LABELS OF THE CASES. 2933C 2934C TITLE 10-CHARACTER VARIABLE (UNCHANGED ON OUTPUT). 2935C TITLE OF THE DATA SET. 2936C 2937C THRESH REAL SCALAR (UNCHANGED ON OUTPUT). 2938C THRESHOLD SUCH THAT ANY TWO CASES WHOSE DISTANCE IS LESS 2939C THAN THRESH WILL BE ASSIGNED TO THE SAME CLUSTER. 2940C 2941C XMISS REAL SCALAR (UNCHANGED ON OUTPUT). 2942C MISSING VALUE CODE. IF A(I,J) = XMISS, THEN THE VALUE FOR THE 2943C J-TH VARIABLE FOR THE I-TH CASE IS ASSUMED TO BE MISSING. 2944C 2945C NC INTEGER SCALAR (UNCHANGED ON OUTPUT). 2946C MAXIMUM NUMBER OF CLUSTERS DESIRED. 2947C 2948C IWORK INTEGER VECTOR DIMENSIONED AT LEAST M+NC. 2949C WORK VECTOR. 2950C 2951C OUNIT INTEGER SCALAR (UNCHANGED ON OUTPUT). 2952C UNIT NUMBER FOR OUTPUT. 2953C 2954C REFERENCE 2955C --------- 2956C 2957C HARTIGAN, J. A. (1975). CLUSTERING ALGORITHMS, JOHN WILEY & 2958C SONS, INC., NEW YORK. PAGES 74-83. 2959C 2960C<><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><> 2961C 2962 DIMENSION A(MM,*), IWORK(*) 2963 INTEGER OUNIT 2964CCCCC CHARACTER*4 CLAB(*), RLAB(*), AA(20) 2965 CHARACTER*4 RLAB(*), AA(20) 2966CCCCC CHARACTER*10 TITLE 2967C 2968 INCLUDE 'DPCOMC.INC' 2969 INCLUDE 'DPCOP2.INC' 2970C 2971 LL = 0 2972 LC = LL + NC 2973 IF(OUNIT.LE.0) OUNIT = IPR 2974C 2975C ASSIGN THE FIRST CASE TO THE FIRST CLUSTER 2976C 2977 KC=1 2978 IWORK(LL+1)=1 2979 DMAX=N * THRESH**2 2980 DO 30 I=1,M 2981 IWORK(LC+I)=0 2982 DO 20 KK=1,KC 2983 K=KC-KK+1 2984 L=IWORK(LL+K) 2985C 2986C COMPUTES DISTANCE BETWEEN CASE AND CLUSTER LEADER 2987C 2988 DD=0. 2989 DC=0. 2990 DO 10 J=1,N 2991 IF (A(L,J).NE.XMISS.AND.A(I,J).NE.XMISS) THEN 2992 DC=DC+1. 2993 DD=DD+(A(L,J)-A(I,J))**2 2994C 2995C GET NEXT CLUSTER IF DISTANCE IS TOO LARGE 2996C 2997 IF(DD.GT.DMAX) GO TO 20 2998 ENDIF 2999 10 CONTINUE 3000 IF(DC.NE.0.) DD=SQRT(DD/DC) 3001C 3002C ASSIGN CASE I TO CLUSTER K IF DISTANCE BELOW THRESHOLD 3003C 3004 IF (DD.LE.THRESH) THEN 3005 IWORK(LC+I)=K 3006 GO TO 30 3007 ENDIF 3008 20 CONTINUE 3009C 3010C CREATE NEW CLUSTER AND LEADER 3011C 3012 IF (KC.NE.NC) THEN 3013 KC=KC+1 3014 IWORK(LC+I)=KC 3015 IWORK(LL+KC)=I 3016 ENDIF 3017 30 CONTINUE 3018C 3019C OUTPUT CLUSTER LEADERS 3020C 3021CCCCC IF (OUNIT .LE. 0) GOTO9000 3022 IF (IPR .LE. 0) GOTO9000 3023C 3024 WRITE(ICOUT,1) 3025 1 FORMAT(' CLUSTER LEADERS') 3026 CALL DPWRST('XXX','WRIT') 3027C 3028 DO 40 K=1,KC 3029 I=IWORK(LL+K) 3030C 3031 WRITE(OUNIT,2) K, RLAB(I),(A(I,J),J=1,MAX(N,10)) 3032 2 FORMAT(' CLUSTER',I4,2X,A4,10F11.4) 3033 CALL DPWRST('XXX','WRIT') 3034 IF (N.GT.10)THEN 3035 WRITE(OUNIT,12) (A(I,J),J=11,N) 3036 12 FORMAT(18X,10F11.4) 3037 CALL DPWRST('XXX','WRIT') 3038 ENDIF 3039 40 CONTINUE 3040C 3041 WRITE(ICOUT,3) 3042 3 FORMAT(1X) 3043 CALL DPWRST('XXX','WRIT') 3044C 3045C OUTPUT CLUSTERS 3046C 3047 KC=KC+1 3048 DO 50 K=1,KC 3049 KK=K-1 3050 J=0 3051 DO 60 I=1,M 3052 IF (J.EQ.20) J=0 3053 IF (IWORK(LC+I).EQ.KK) THEN 3054 J=J+1 3055 AA(J)=RLAB(I) 3056 ENDIF 3057 IF (J.EQ.20.OR.(I.EQ.M.AND.J.NE.0)) THEN 3058 WRITE(OUNIT,4) KK,(AA(JJ),JJ=1,J) 3059 4 FORMAT(' CLUSTER',I5,20(1X,A4)) 3060 CALL DPWRST('XXX','WRIT') 3061 ENDIF 3062 60 CONTINUE 3063 50 CONTINUE 3064C 3065 9000 CONTINUE 3066 RETURN 3067 END 3068 SUBROUTINE RSPLIT(MM, N, A, RLAB, IR, KA, TH, IORD, DMIWRK, 3069CCCCC SUBROUTINE RSPLIT(MM, M, N, A, RLAB, IR, KA, TH, IORD, DMIWRK, 3070 * IWORK, DMWORK, WORK) 3071C 3072C<><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><> 3073C 3074C PURPOSE 3075C ------- 3076C 3077C FINDS OPTIMAL SPLIT OF THE CASES 3078C 3079C DESCRIPTION 3080C ----------- 3081C 3082C 1. INITIALLY, THE FIRST CLUSTER CONSISTS OF ALL CASES WITHIN THE 3083C BLOCK IR AND THE SECOND CLUSTER IS EMPTY. THE REDUCTION IN THE 3084C WITHIN-CLUSTER SUM OF SQUARES FOR MOVING EACH CASE FROM THE 3085C FIRST CLUSTER TO THE SECOND IS CALCULATED. THE CASE THAT 3086C REDUCES THE SUM OF SQUARES THE MOST IS MOVED AND THIS CONTINUES 3087C UNTIL ALL CASES ARE MOVED WITH EACH REDUCTION STORED. THEN THE 3088C SPLIT THAT HAD THE SMALLEST REDUCTION OF ALL IS RETURNED AS THE 3089C OPTIMUM SPLIT. 3090C 3091C INPUT PARAMETERS 3092C ---------------- 3093C 3094C MM, M, N, A, RLAB, TH, IORD, DMIWRK, DMWORK -- SEE SUBROUTINE SPLIT2 3095C 3096C IR INTEGER SCALAR (UNCHANGED ON OUTPUT). 3097C NUMBER OF BLOCK TO BE SPLIT. 3098C 3099C KA INTEGER SCALAR (UNCHANGED ON OUTPUT). 3100C NUMBER OF BLOCKS. 3101C 3102C IWORK INTEGER MATRIX WHOSE FIRST DIMENSION MUST BE DMIWRK AND SECOND 3103C DIMENSION MUST BE AT LEAST KA. 3104C THE MATRIX DEFINING THE BOUNDARIES OF THE BLOCKS. 3105C 3106C IWORK(1,I) IS 1 + THE FIRST ROW IN BLOCK I 3107C IWORK(2,I) IS 1 + THE LAST ROW IN BLOCK I 3108C IWORK(3,I) IS 1 + THE FIRST COLUMN IN BLOCK I 3109C IWORK(4,I) IS 1 + THE LAST COLUMN IN BLOCK I 3110C 3111C WORK REAL MATRIX WHOSE FIRST DIMENSION MUST BE DMWORK AND SECOND 3112C DIMENSION MUST BE AT LEAST MAX(M,N). 3113C 3114C WORK(1,I) = FIRST CASE IN CASE CLUSTER I 3115C WORK(2,I) = LAST CASE IN CASE CLUSTER I 3116C WORK(3,I) = REDUCTION IN SSQ DUE TO SPLITTING 3117C WORK(4,I) = LAST CASE IN FIRST CLUSTER OF SPLIT OF I 3118C WORK(5,I) = 1 IF CASE IS INCLUDED IN PRESENT VARIABLE SPLIT 3119C WORK(6,I) = NUMBER OF VARIABLES IN I-TH ROW OF PRESENT 3120C VARIABLE SPLIT 3121C WORK(7,I) = MEAN OF I-TH CASE, FIRST VARIABLE CLUSTER 3122C WORK(8,I) = NUMBER OF VARIABLES SECOND CLUSTER 3123C WORK(9,I) = MEAN OF I-TH CASE, SECOND CLUSTER 3124C 3125C WORK(10-18,I) ARE SIMILAR WITH VARIABLES AND CASES REVERSED. 3126C 3127C REFERENCE 3128C --------- 3129C 3130C HARTIGAN, J. A. (1975). CLUSTERING ALGORITHMS, JOHN WILEY & 3131C SONS, INC., NEW YORK. PAGE 277. 3132C 3133C<><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><> 3134C 3135 INTEGER DMWORK, DMIWRK 3136 DIMENSION A(MM,*),IWORK(DMIWRK,*),WORK(DMWORK,*) 3137 CHARACTER*4 RLAB(*), C 3138C 3139 INCLUDE 'DPCOMC.INC' 3140 INCLUDE 'DPCOP2.INC' 3141C 3142 XM=99999. 3143 DO 10 J=1,N 3144 WORK(14,J)=0. 3145 10 CONTINUE 3146C 3147C LOOK FOR BLOCKS WITHIN THRESHOLD 3148C 3149 IL=INT(WORK(1,IR)) 3150 IU=INT(WORK(2,IR)) 3151 DO 40 K=1,KA 3152 IF(IWORK(1,K).EQ.IL+1.AND.IWORK(2,K).EQ.IU+1) THEN 3153 JL=IWORK(3,K) 3154 JU=IWORK(4,K) 3155 IF(JL.LT.0) JL=-JL 3156C 3157C COMPUTE VARIANCES 3158C 3159 NC=0 3160 DO 30 J=JL-1,JU-1 3161 S1=0. 3162 S2=0. 3163 S3=0. 3164 DO 20 I=IL,IU 3165 IF(A(I,J).NE.XM) THEN 3166 S1=S1+1 3167 S2=S2+A(I,J) 3168 S3=S3+A(I,J)**2 3169 ENDIF 3170 20 CONTINUE 3171 WORK(15,J)=S1 3172 IF(S1.NE.0) THEN 3173 S3=S3/S1-(S2/S1)**2 3174 WORK(16,J)=S2/S1 3175 ENDIF 3176 IF(S3.GT.TH) THEN 3177 WORK(14,J)=1. 3178 NC=1 3179 ENDIF 3180 30 CONTINUE 3181 IF(NC.EQ.0) IWORK(1,K)=-IWORK(1,K) 3182 ENDIF 3183 40 CONTINUE 3184C 3185C FIND BEST CASE SPLIT 3186C 3187 DO 50 J=1,N 3188 WORK(17,J)=0. 3189 WORK(18,J)=0. 3190 50 CONTINUE 3191 DM=0. 3192 WORK(3,IR)=0. 3193 WORK(4,IR)=IL 3194 DO 100 I=IL,IU-1 3195 II=IU-I+IL 3196 ID=II 3197 DD=-R1MACH(2) 3198 DO 70 L=IL,II 3199 IF((IORD.NE.1.AND.IORD.NE.3).OR.L.EQ.II) THEN 3200 DL=0. 3201 DO 60 J=1,N 3202 IF(WORK(14,J).NE.0.AND.A(L,J).NE.XM) THEN 3203 DL=DL+(A(L,J)-WORK(16,J))**2*(WORK(15,J)+1)/ 3204 * WORK(15,J) 3205 DL=DL-(A(L,J)-WORK(18,J))**2*WORK(17,J)/ 3206 * (WORK(17,J)+1) 3207 ENDIF 3208 60 CONTINUE 3209 IF(DL.GT.DD) THEN 3210 DD=DL 3211 ID=L 3212 ENDIF 3213 ENDIF 3214 70 CONTINUE 3215C 3216C INTERCHANGE ID AND II 3217C 3218 DO 80 J=1,N 3219 CC=A(II,J) 3220 A(II,J)=A(ID,J) 3221 A(ID,J)=CC 3222 80 CONTINUE 3223 C = RLAB(II) 3224 RLAB(II) = RLAB(ID) 3225 RLAB(ID) = C 3226C 3227C UPDATE MEANS 3228C 3229 DO 90 J=1,N 3230 IF(WORK(14,J).NE.0.AND.A(II,J).NE.XM) THEN 3231 WORK(15,J)=WORK(15,J)-1. 3232 IF(WORK(15,J).NE.0.)WORK(16,J)=WORK(16,J)+ 3233 * (WORK(16,J)-A(II,J))/WORK(15,J) 3234 WORK(17,J)=WORK(17,J)+1. 3235 WORK(18,J)=WORK(18,J)-(WORK(18,J)-A(II,J))/WORK(17,J) 3236 ENDIF 3237 90 CONTINUE 3238 DM=DM+DD 3239 IF(DM.GE.WORK(3,IR)) THEN 3240 WORK(3,IR)=DM 3241 WORK(4,IR)=II-1 3242 ENDIF 3243 100 CONTINUE 3244 RETURN 3245 END 3246 SUBROUTINE SINGLE(X, COUNT, AVE, SD, XMIN, XMAX, SSQ) 3247C 3248C<><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><> 3249C 3250C PURPOSE 3251C ------- 3252C 3253C INCORPORATES A NEW VALUE INTO THE SUMMARY STATISTICS 3254C 3255C INPUT PARAMETERS 3256C ---------------- 3257C 3258C SEE SUBROUTINE BUILD FOR PARAMETER DESCRIPTIONS. 3259C 3260C REFERENCE 3261C --------- 3262C 3263C HARTIGAN, J. A. (1975). CLUSTERING ALGORITHMS, JOHN WILEY & 3264C SONS, INC., NEW YORK. PAGE 109. 3265C 3266C<><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><> 3267C 3268 INCLUDE 'DPCOMC.INC' 3269C 3270 IF(COUNT.EQ.0.) THEN 3271 AVE=0. 3272 SD=0. 3273 XMIN=R1MACH(2) 3274 XMAX=-R1MACH(2) 3275 SSQ=0. 3276 ENDIF 3277 COUNT=COUNT+1. 3278 AVE=AVE+(X-AVE)/COUNT 3279 IF(COUNT.NE.1.) SSQ=SSQ+COUNT*(X-AVE)**2/(COUNT-1.) 3280 SD=(SSQ/COUNT)**0.5 3281 IF(XMIN.GT.X) XMIN=X 3282 IF(XMAX.LT.X) XMAX=X 3283 RETURN 3284 END 3285 SUBROUTINE SPLIT(MM, N, A, RLAB, DMW, W, IL, IU, DMU, U, 3286CCCCC SUBROUTINE SPLIT(MM, M, N, A, CLAB, RLAB, DMW, W, IL, IU, DMU, U, 3287 * WCLAB, IM, DM) 3288C 3289C<><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><> 3290C 3291C PURPOSE 3292C ------- 3293C 3294C SPLITS A ROW CLUSTER ON SELECTED VARIABLES 3295C 3296C DESCRIPTION 3297C ----------- 3298C 3299C 1. INITIALLY, THE FIRST CLUSTER CONSISTS OF ALL CASES BETWEEN IL 3300C AND IU AND THE SECOND CLUSTER IS EMPTY. THE WEIGHTED MEANS ARE 3301C DETERMINED AND USED TO FIND THE REDUCTION IN THE WITHIN-CLUSTER 3302C SUM OF SQUARES FOR MOVING EACH CASE FROM THE FIRST CLUSTER TO 3303C THE SECOND. THE OBJECT THAT REDUCES THE SUM OF SQUARES THE 3304C MOST IS MOVED AND THIS CONTINUES UNTIL ALL OBJECTS ARE MOVED 3305C WITH EACH REDUCTION STORED. THEN THE SPLIT THAT HAD THE 3306C SMALLEST REDUCTION OF ALL IS RETURNED AS THE OPTIMUM SPLIT. 3307C 3308C INPUT PARAMETERS 3309C ---------------- 3310C 3311C MM, M, N, A, CLAB, RLAB, DMW, W -- SEE SUBROUTINE SPLIT1 3312C 3313C IL, IU INTEGER SCALARS (UNCHANGED ON OUTPUT). 3314C THE FIRST AND LAST OBJECTS IN THE BLOCK TO BE SPLIT. 3315C 3316C DMU INTEGER SCALAR (UNCHANGED ON OUTPUT). 3317C THE LEADING DIMENSION OF MATRIX U. MUST BE AT LEAST 4. 3318C 3319C U REAL MATRIX WHOSE FIRST DIMENSION MUST BE DMU AND SECOND 3320C DIMENSION MUST BE AT LEAST N (CHANGED ON OUTPUT). 3321C MATRIX OF CLUSTER MEANS. 3322C 3323C OUTPUT PARAMETERS 3324C ----------------- 3325C 3326C WCLAB INTEGER VECTOR DIMENSIONED AT LEAST N. 3327C WCLAB(I) WILL STORE THE CLUSTER (EITHER 1 OR 2) OBJECT I WAS 3328C ASSIGNED TO. 3329C 3330C IM INTEGER SCALAR. 3331C THE BORDER OF THE SPLIT. OBJECTS IL,...,IM WERE ASSIGNED TO 3332C CLUSTER 1 AND OBJECTS IM+1,...,IU WERE ASSIGNED TO CLUSTER 3333C 2. 3334C 3335C DM INTEGER SCALAR. 3336C THE REDUCTION IN THE WITHIN-CLUSTER SUM OF SQUARES. 3337C 3338C REFERENCE 3339C --------- 3340C 3341C HARTIGAN, J. A. (1975). CLUSTERING ALGORITHMS, JOHN WILEY & 3342C SONS, INC., NEW YORK. PAGE 272. 3343C 3344C<><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><> 3345C 3346 INTEGER DMW, DMU, WCLAB(*) 3347 DIMENSION W(DMW,*), A(MM,*), U(DMU,*) 3348CCCCC CHARACTER*4 CLAB(*), RLAB(*), CTEMP 3349 CHARACTER*4 RLAB(*), CTEMP 3350C 3351 INCLUDE 'DPCOMC.INC' 3352 INCLUDE 'DPCOP2.INC' 3353C 3354C FIND WEIGHTED MEAN OF ALL CASES 3355C 3356 TH=R1MACH(4) 3357 DO 10 J=1,N 3358 U(1,J)=0. 3359 U(3,J)=0. 3360 U(2,J)=TH 3361 U(4,J)=TH 3362 10 CONTINUE 3363 DO 30 J=1,N 3364 IF(WCLAB(J).NE.0) THEN 3365 DO 20 I=IL,IU 3366 U(1,J)=U(1,J)+A(I,J)*W(I,J) 3367 U(2,J)=U(2,J)+WCLAB(J) 3368 20 CONTINUE 3369 U(1,J)=U(1,J)/U(2,J) 3370 ENDIF 3371 30 CONTINUE 3372 DM=0. 3373 DD=0. 3374 DO 80 IC=IL,IU 3375 II=IU-IC+IL 3376 DMAX=-R1MACH(2) 3377 IMAX=II 3378C 3379C DETERMINE THE EFFECT OF MOVING ITH CASE 3380C 3381 DO 50 I=IL,II 3382 D=0. 3383 DO 40 J=1,N 3384 IF(WCLAB(J).NE.0) THEN 3385 IF(U(2,J).EQ.W(I,J)) U(2,J)=W(I,J)+TH 3386 D=D+W(I,J)*U(2,J)*(A(I,J)-U(1,J))**2/(U(2,J)-W(I,J)) 3387 D=D-W(I,J)*U(4,J)*(A(I,J)-U(3,J))**2/(U(4,J)+W(I,J)) 3388 ENDIF 3389 40 CONTINUE 3390C 3391C STORE THE LARGEST 3392C 3393 IF(D.GT.DMAX) THEN 3394 IMAX=I 3395 DMAX=D 3396 ENDIF 3397 50 CONTINUE 3398 DD=DD+DMAX 3399 IF(DD.GT.DM) IM=II-1 3400 IF(DD.GT.DM) DM=DD 3401C 3402C UPDATE MEANS OF THE TWO CLUSTERS 3403C 3404 I=IMAX 3405 DO 60 J=1,N 3406 IF(WCLAB(J).NE.0) THEN 3407 U(2,J)=U(2,J)-W(I,J) 3408 IF(U(2,J).LT.TH) U(2,J)=TH 3409 U(1,J)=U(1,J)+(U(1,J)-A(I,J))*W(I,J)/U(2,J) 3410 U(4,J)=U(4,J)+W(I,J) 3411 U(3,J)=U(3,J)-(U(3,J)-A(I,J))*W(I,J)/U(4,J) 3412 ENDIF 3413 60 CONTINUE 3414C 3415C INTERCHANGE SELECTED ROW WITH LAST FEASIBLE ROW 3416C 3417 DO 70 J=1,N 3418 C=A(I,J) 3419 A(I,J)=A(II,J) 3420 A(II,J)=C 3421 C=W(I,J) 3422 W(I,J)=W(II,J) 3423 W(II,J)=C 3424 70 CONTINUE 3425 CTEMP = RLAB(I) 3426 RLAB(I) = RLAB(II) 3427 RLAB(II) = CTEMP 3428 80 CONTINUE 3429 RETURN 3430 END 3431 SUBROUTINE SPLIT1(MM, M, N, A, CLAB, RLAB, TITLE, DMW, W, TH, 3432 * KD, IWORK, DMIWRK, IWORK1, DMWORK, WORK, IERR, 3433 * OUNIT) 3434C 3435C<><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><> 3436C 3437C PURPOSE 3438C ------- 3439C 3440C SPLITS THE CASES IN EACH VARIABLE UNTIL ALL WITHIN-CLUSTER 3441C VARIANCES ARE SMALLER THAN A USER-SPECIFIED THRESHOLD 3442C 3443C DESCRIPTION 3444C ----------- 3445C 3446C 1. THE THRESHOLD IS THE LARGEST WITHIN-CLUSTER VARIANCE FOR EACH 3447C VARIABLE. THE VARIABLES MUST BE SCALED SIMILARLY (CLUSTER 3448C SUBROUTINE STAND CAN BE USED TO STANDARDIZE THE VARIABLES). 3449C THE ROUTINE STARTS WITH ONE CLUSTER OF ALL CASES FOR EACH 3450C VARIABLE. FOR EACH CLUSTER WHOSE VARIANCE IS LARGER THAN THE 3451C THRESHOLD, IT IS SPLIT INTO TWO CLUSTERS SUCH THAT THE SUM OF 3452C THE TWO WITHIN-CLUSTER VARIANCES IS SMALLEST. THIS REPEATS 3453C UNTIL ALL CLUSTER VARIANCES ARE SMALLER THAN THE THRESHOLD. 3454C THE THRESHOLD SHOULD BE CHOSEN WISELY AS A LARGE THRESHOLD WILL 3455C PRODUCE A FEW LARGE CLUSTERS AND A SMALL THRESHOLD WILL PRODUCE 3456C MANY SMALL CLUSTERS. 3457C 3458C 2. A MATRIX CAN BE USED TO WEIGH THE DATA VALUES. A WEIGHT OF 1. 3459C WILL GIVE THE VALUE FULL WEIGHT, A WEIGHT OF 0. WILL GIVE THE 3460C VALUE NO WEIGHT (IE. A MISSING VALUE). ALL WEIGHTS MUST BE 3461C BETWEEN 0. AND 1., AND THE WEIGHT MATRIX WILL BE DESTROYED 3462C DURING EXECUTION. 3463C 3464C 3. THE OUTPUT DIAGRAM IS AN ARRAY WITH THE VARIABLES LABELING THE 3465C COLUMNS AND THE CASES LABELING THE ROWS AND THE VARIABLE VALUES 3466C MULTIPLIED BY 10 AS THE ELEMENTS OF THE ARRAY. THE HORIZONTAL 3467C LINES OUTLINE THE BLOCKS AS EACH BLOCK IS ASSUMED TO CONTAIN 3468C ONLY ONE VARIABLE AND HENCE, ONLY ONE COLUMN. THE OUTPUT 3469C DIAGRAM IS WRITTEN ON FORTRAN UNIT OUNIT. 3470C 3471C INPUT PARAMETERS 3472C ---------------- 3473C 3474C MM INTEGER SCALAR (UNCHANGED ON OUTPUT). 3475C THE LEADING DIMENSION OF MATRIX A. MUST BE AT LEAST M. 3476C 3477C M INTEGER SCALAR (UNCHANGED ON OUTPUT). 3478C THE NUMBER OF CASES. 3479C 3480C N INTEGER SCALAR (UNCHANGED ON OUTPUT). 3481C THE NUMBER OF VARIABLES. 3482C 3483C A REAL MATRIX WHOSE FIRST DIMENSION MUST BE MM AND SECOND 3484C DIMENSION MUST BE AT LEAST M (CHANGED ON OUTPUT). 3485C THE DATA MATRIX. 3486C 3487C A(I,J) IS THE VALUE FOR THE J-TH VARIABLE FOR THE I-TH CASE. 3488C 3489C CLAB VECTOR OF 4-CHARACTER VARIABLES DIMENSIONED AT LEAST N 3490C (UNCHANGED ON OUTPUT). 3491C LABELS OF THE VARIABLES. 3492C 3493C RLAB VECTOR OF 4-CHARACTER VARIABLES DIMENSIONED AT LEAST M 3494C (CHANGED ON OUTPUT). 3495C LABELS OF THE CASES. 3496C 3497C TITLE 10-CHARACTER VARIABLE (UNCHANGED ON OUTPUT). 3498C TITLE OF THE DATA SET. 3499C 3500C DMW INTEGER SCALAR (UNCHANGED ON OUTPUT). 3501C THE LEADING DIMENSION OF MATRIX W. MUST BE AT LEAST M. 3502C 3503C W REAL MATRIX WHOSE FIRST DIMENSION MUST BE DMW AND SECOND 3504C DIMENSION MUST BE AT LEAST N (CHANGED ON OUTPUT). 3505C W(I,J) IS THE WEIGHT OF VARIABLE J FOR CASE I AND SHOULD BE 3506C BETWEEN 0. AND 1. MISSING VALUES SHOULD BE GIVEN A WEIGHT 3507C OF 0. 3508C 3509C TH INTEGER SCALAR (UNCHANGED ON OUTPUT). 3510C THRESHOLD VARIANCE FOR VARIABLES WITHIN CLUSTERS. 3511C 3512C KD INTEGER SCALAR (UNCHANGED ON OUTPUT). 3513C THE MAXIMUM NUMBER OF BLOCKS ALLOCATED (SECOND DIMENSION OF 3514C IWORK1). THE SMALLEST K SHOULD BE IS M AND THE LARGEST IS 3515C N*M. 3516C 3517C IWORK INTEGER VECTOR DIMENSIONED AT LEAST 2*M+N. 3518C WORK VECTOR. 3519C 3520C DMIWRK INTEGER SCALAR (UNCHANGED ON OUTPUT). 3521C THE LEADING DIMENSION OF MATRIX IWORK1. MUST BE AT LEAST 4. 3522C 3523C IWORK1 INTEGER MATRIX WHOSE FIRST DIMENSION MUST BE DMIWRK AND 3524C SECOND DIMENSION MUST BE AT LEAST KD. 3525C WORK MATRIX. 3526C 3527C DMWORK INTEGER SCALAR (UNCHANGED ON OUTPUT). 3528C THE LEADING DIMENSION OF MATRIX WORK. MUST BE AT LEAST 4. 3529C 3530C WORK REAL MATRIX WHOSE FIRST DIMENSION MUST BE DMWORK AND SECOND 3531C SECOND MUST BE AT LEAST N (CHANGED ON OUTPUT). 3532C WORK MATRIX. 3533C 3534C OUNIT INTEGER SCALAR (UNCHANGED ON OUTPUT). 3535C UNIT NUMBER FOR OUTPUT. 3536C 3537C OUTPUT PARAMETER 3538C ---------------- 3539C 3540C IERR INTEGER SCALAR. 3541C ERROR FLAG. 3542C 3543C IERR = 0, NO ERRORS WERE DETECTED DURING EXECUTION 3544C 3545C IERR = 1, THE NUMBER OF BLOCKS NEEDED WAS LARGER THAN THE 3546C NUMBER OF BLOCKS ALLOCATED. EXECUTION IS 3547C TERMINATED. INCREASE KD. 3548C 3549C IERR = 2, EITHER THE FIRST AND LAST CASES OR THE CLUSTER 3550C DIAMETER FOR A CLUSTER IS OUT OF BOUNDS. THE 3551C CLUSTER AND ITS BOUNDARIES ARE PRINTED ON UNIT 3552C OUNIT. EXECUTION WILL CONTINUE WITH QUESTIONABLE 3553C RESULTS FOR THAT CLUSTER. 3554C 3555C REFERENCES 3556C ---------- 3557C 3558C HARTIGAN, J. A. (1975). CLUSTERING ALGORITHMS, JOHN WILEY & 3559C SONS, INC., NEW YORK. PAGES 251-271. 3560C 3561C HARTIGAN, J. A. (1975) PRINTER GRAPHICS FOR CLUSTERING. JOURNAL OF 3562C STATISTICAL COMPUTATION AND SIMULATION. VOLUME 4,PAGES 187-213. 3563C 3564C<><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><> 3565C 3566 INTEGER DMW, DMIWRK, DMWORK, OUNIT 3567 DIMENSION A(MM,*), W(DMW,*), IWORK1(DMIWRK,*), IWORK(*), 3568 * WORK(DMWORK,*) 3569 CHARACTER*4 CLAB(*), RLAB(*) 3570 CHARACTER*10 TITLE 3571C 3572 INCLUDE 'DPCOMC.INC' 3573 INCLUDE 'DPCOP2.INC' 3574C 3575C INTEGER WORK VECTOR OFFSETS 3576C 3577 IERR = 0 3578 IWCLAB=0 3579 INC1=N 3580 INC2=N+M 3581C 3582C INITIALIZE CLUSTER OF ALL ROWS 3583C 3584 IWORK(INC1+1)=1 3585 IWORK(INC2+1)=M 3586 KR=0 3587 KC=0 3588 10 KR=KR+1 3589 IF(KR.EQ.0) GOTO 50 3590 SP=0. 3591 IL=IWORK(INC1+KR) 3592 IU=IWORK(INC2+KR) 3593C 3594C IDENTIFY VARIABLES WITHIN THRESHOLD FOR WITHIN-CLUSTER VARIANCES 3595C 3596 DO 40 J=1,N 3597 IWORK(IWCLAB+J)=1 3598 S1=0. 3599 S2=0. 3600 S3=0. 3601 DO 20 I=IL,IU 3602 IF(W(I,J).NE.0.) THEN 3603 S1=S1+W(I,J) 3604 S2=S2+W(I,J)*A(I,J) 3605 S3=S3+W(I,J)*A(I,J)**2 3606 ENDIF 3607 20 CONTINUE 3608 IF(S1.NE.0.) THEN 3609 S2=S2/S1 3610 S3=S3/S1-S2**2 3611 IF(S3.GT.TH) THEN 3612 SP=1. 3613 GOTO 40 3614 ENDIF 3615 KC=KC+1 3616 IF (KC .GT. KD) THEN 3617 IF (OUNIT .GT. 0) THEN 3618 WRITE(OUNIT,*) 3619CCC22 FORMAT(' TOO MANY BLOCKS FOR SPACE ALLOCATED, ', 3620CCCCC1 'INCREASE KD AND SECOND DIMENSION OF IWORK1') 3621 CALL DPWRST('XXX','WRIT') 3622 ENDIF 3623 IERR = 1 3624 RETURN 3625 ENDIF 3626 IWORK1(1,KC)=IL+1 3627 IWORK1(2,KC)=IU+1 3628 IWORK1(3,KC)=J+1 3629 IWORK1(4,KC)=J+1 3630 DO 30 I=IL,IU 3631 W(I,J)=0. 3632 30 CONTINUE 3633 ENDIF 3634 IWORK(IWCLAB+J)=0 3635 40 CONTINUE 3636C 3637C SPLIT CLUSTER KR IF NECESSARY 3638C 3639 IF(SP.EQ.0.) THEN 3640 KR=KR-2 3641 GO TO 10 3642 ENDIF 3643CCCCC CALL SPLIT(MM,M,N,A,CLAB,RLAB,DMW,W,IL,IU,DMWORK,WORK, 3644 CALL SPLIT(MM,N,A,RLAB,DMW,W,IL,IU,DMWORK,WORK, 3645 * IWORK(IWCLAB+1),IM,DM) 3646 IWORK(INC2+KR+1)=IWORK(INC2+KR) 3647 IWORK(INC2+KR)=IM 3648 IWORK(INC1+KR+1)=IM+1 3649 GO TO 10 3650 50 CALL BLOCK(MM, M+1, N+1, A, CLAB, RLAB, TITLE, KC, DMIWRK, IWORK1, 3651 * IERR, OUNIT) 3652 RETURN 3653 END 3654 SUBROUTINE SPLIT2(MM, M, N, A, CLAB, RLAB, TITLE, KD, TH, IORD, 3655 * DMIWRK, IWORK, DMWORK, WORK, IERR, OUNIT) 3656C 3657C<><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><> 3658C 3659C PURPOSE 3660C ------- 3661C 3662C SPLITS MATRIX OF CASE-BY-VARIABLE DATA VALUES INTO BLOCKS UNTIL 3663C ALL WITHIN-BLOCK VARIANCES ARE LESS THAN A GIVEN THRESHOLD. 3664C INCLUDES USER-CONTROLLED CONSTRAINTS 3665C 3666C DESCRIPTION 3667C ----------- 3668C 3669C 1. THE THRESHOLD IS THE LARGEST VARIANCE FOR THE DATA VALUES IN 3670C THE BLOCKS. THE VARIABLES SHOULD BE SCALED SIMILARLY (CLUSTER 3671C SUBROUTINE CAN BE USED TO STANDARDIZE THE VARIABLES. THE 3672C ROUTINE STARTS WITH THE DATA MATRIX AS ONE BLOCK. THEN THE 3673C BLOCK WITH THE LARGEST VARIANCE IS CHOSEN AND IF THAT VARIANCE 3674C IS LARGER THAN THE THRESHOLD, THE BLOCK IS OPTIMALLY SPLIT BY 3675C BOTH CASES AND VARIABLES. THE VARIANCES FOR THE NEW BLOCKS ARE 3676C DETERMINED AND THE PROCESS REPEATS BY FINDING THE NEWEST 3677C LARGEST VARIANCE. ONCE THE LARGEST VARIANCE IS LESS THAN THE 3678C THRESHOLD, THE RESULTS ARE PRINTED IN A BLOCK DIAGRAM ON 3679C FORTRAN UNIT OUNIT. THE THRESHOLD SHOULD BE CHOSEN WISELY AS A 3680C LARGE THRESHOLD WILL PRODUCE A FEW LARGE BLOCKS AND A SMALL 3681C THRESHOLD WILL PRODUCE MANY SMALL BLOCKS. 3682C 3683C 2. MISSING VALUES SHOULD BE REPRESENTED BY 99999. 3684C 3685C 3. THE CASES AND/OR VARIABLES CAN BE CONSTRAINED BY THE IORD 3686C PARAMETER. SETTING IORD = 0 HAS BOTH CASES AND VARIABLES 3687C UNCONSTRAINED; SETTING IORD = 1 CONSTRAINS ONLY CASES; SETTING 3688C IORD = 2 CONSTRAINS ONLY VARIABLES; AND SETTING IORD = 3 3689C CONSTRAINS BOTH CASES AND VARIABLES. 3690C 3691C 3. THE BLOCK DIAGRAM IS THE DATA MATRIX WITH THE DATA VALUES 3692C MULTIPLIED BY 10. THE BLOCKS ARE OUTLINED BY THE VERTICAL AND 3693C HORIZONTAL LINES. 3694C 3695C INPUT PARAMETERS 3696C ---------------- 3697C 3698C MM INTEGER SCALAR (UNCHANGED ON OUTPUT). 3699C THE LEADING DIMENSION OF MATRIX A. MUST BE AT LEAST M. 3700C 3701C M INTEGER SCALAR (UNCHANGED ON OUTPUT). 3702C THE NUMBER OF OBJECTS. 3703C 3704C N INTEGER SCALAR (UNCHANGED ON OUTPUT). 3705C THE NUMBER OF VARIABLES. 3706C 3707C A REAL MATRIX WHOSE FIRST DIMENSION MUST BE MM AND SECOND 3708C DIMENSION MUST BE AT LEAST M (CHANGED ON OUTPUT). 3709C THE DATA MATRIX. 3710C 3711C A(I,J) IS THE VALUE FOR THE J-TH VARIABLE FOR THE I-TH CASE. 3712C 3713C CLAB VECTOR OF 4-CHARACTER VARIABLES DIMENSIONED AT LEAST N 3714C (CHANGED ON OUTPUT). 3715C ORDERED LABELS OF THE COLUMNS. 3716C 3717C RLAB VECTOR OF 4-CHARACTER VARIABLES DIMENSIONED AT LEAST M 3718C (CHANGED ON OUTPUT). 3719C ORDERED LABELS OF THE ROWS. 3720C 3721C TITLE 10-CHARACTER VARIABLE (UNCHANGED ON OUTPUT). 3722C TITLE OF DATA SET. 3723C 3724C KD INTEGER SCALAR (UNCHANGED ON OUTPUT). 3725C MAXIMUM NUMBER OF BLOCKS. SHOULD BE BETWEEN M AND N*M. 3726C 3727C TH REAL SCALAR (UNCHANGED ON OUTPUT). 3728C THRESHOLD VARIANCE FOR DATA VALUES WITHIN A BLOCK. 3729C 3730C IORD INTEGER SCALAR (UNCHANGED ON OUTPUT). 3731C ORDERING PARAMETER. 3732C 3733C IORD = 0 CASES AND VARIABLES ARE UNCONSTRAINED 3734C IORD = 1 CONSTRAIN CASES 3735C IORD = 2 CONSTRAIN VARIABLES 3736C IORD = 3 CASES AND VARIABLES ARE CONSTRAINED 3737C 3738C DMIWRK INTEGER SCALAR (UNCHANGED ON OUTPUT). 3739C THE LEADING DIMENSION OF MATRIX IWORK. MUST BE AT LEAST 4. 3740C 3741C IWORK INTEGER MATRIX WHOSE FIRST DIMENSION MUST BE DMIWRK AND SECOND 3742C DIMENSION MUST BE AT LEAST KC. 3743C WORK MATRIX. 3744C 3745C DMWORK INTEGER SCALAR (UNCHANGED ON OUTPUT). 3746C THE LEADING DIMENSION OF MATRIX WORK. MUST BE AT LEAST 18. 3747C 3748C WORK REAL MATRIX WHOSE FIRST DIMENSION MUST BE DMWORK AND SECOND 3749C DIMENSION MUST BE AT LEAST MAX(M,N). 3750C WORK MATRIX. 3751C 3752C OUNIT INTEGER SCALAR (UNCHANGED ON OUTPUT). 3753C UNIT NUMBER FOR OUTPUT. 3754C 3755C OUTPUT PARAMETER 3756C ---------------- 3757C 3758C IERR INTEGER SCALAR. 3759C ERROR FLAG. 3760C 3761C IERR = 0, NO ERRORS WERE DETECTED DURING EXECUTION 3762C 3763C IERR = 1, THE NUMBER OF BLOCKS NEEDED WAS LARGER THAN THE 3764C NUMBER OF BLOCKS ALLOCATED. EXECUTION IS 3765C TERMINATED. INCREASE KD. 3766C 3767C IERR = 2, EITHER THE FIRST AND LAST CASES OR THE CLUSTER 3768C DIAMETER FOR A CLUSTER IS OUT OF BOUNDS. THE 3769C CLUSTER AND ITS BOUNDARIES ARE PRINTED ON UNIT 3770C OUNIT. EXECUTION WILL CONTINUE WITH QUESTIONABLE 3771C RESULTS FOR THAT CLUSTER. 3772C 3773C REFERENCES 3774C ---------- 3775C 3776C HARTIGAN, J. A. (1972) "DIRECT CLUSTERING OF A DATA MATRIX." 3777C JOURNAL OF THE AMERICAN STATISTICAL ASSOCIATION. VOL. 67, 3778C PAGES 123-129. 3779C 3780C HARTIGAN, J. A. (1975). CLUSTERING ALGORITHMS, JOHN WILEY & 3781C SONS, INC., NEW YORK. PAGES 251-277. 3782C 3783C HARTIGAN, J. A. (1975) PRINTER GRAPHICS FOR CLUSTERING. JOURNAL OF 3784C STATISTICAL COMPUTATION AND SIMULATION. VOLUME 4,PAGES 187-213. 3785C 3786C<><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><> 3787C 3788 INTEGER DMIWRK, DMWORK, OUNIT 3789 DIMENSION A(MM,*), IWORK(DMIWRK,*), WORK(DMWORK,*) 3790 CHARACTER*4 CLAB(*), RLAB(*) 3791 CHARACTER*10 TITLE 3792C 3793 INCLUDE 'DPCOMC.INC' 3794 INCLUDE 'DPCOP2.INC' 3795C 3796C 3797C INITIALIZE BLOCKS AND ROW AND COLUMN CLUSTERS 3798C 3799 IERR = 0 3800 WORK(1,1)=1. 3801 WORK(2,1)=M 3802 WORK(10,1)=1. 3803 WORK(11,1)=N 3804 KR=1 3805 KC=1 3806 KA=1 3807 IWORK(1,1)=2 3808 IWORK(2,1)=M+1 3809 IWORK(3,1)=2 3810 IWORK(4,1)=N+1 3811 IR=1 3812 IC=1 3813 K=KD 3814CCCCC CALL RSPLIT(MM,M,N,A,RLAB,IR,KA,TH,IORD,DMIWRK,IWORK,DMWORK,WORK) 3815 CALL RSPLIT(MM,N,A,RLAB,IR,KA,TH,IORD,DMIWRK,IWORK,DMWORK,WORK) 3816CCCCC CALL CSPLIT(MM,M,N,A,CLAB,IC,KA,TH,IORD,DMIWRK,IWORK,DMWORK,WORK) 3817 CALL CSPLIT(MM,M,A,CLAB,IC,KA,TH,IORD,DMIWRK,IWORK,DMWORK,WORK) 3818 10 IF (KA .GT. KD) THEN 3819 IF (OUNIT .GT. 0) THEN 3820 WRITE(ICOUT,1) 3821 1 FORMAT(' NUMBER OF BLOCKS ALLOCATED IS TOO SMALL. ', 3822 1 'INCREASE KD') 3823 ENDIF 3824 IERR = 1 3825 RETURN 3826 ENDIF 3827C 3828C FIND BEST CASE OR VARIABLE SPLIT 3829C 3830 IB=1 3831 XB=0. 3832 DO 20 I=1,KR 3833 IF(WORK(3,I).GT.XB) THEN 3834 XB=WORK(3,I) 3835 IB=I 3836 ENDIF 3837 20 CONTINUE 3838 DO 30 J=1,KC 3839 IF(WORK(12,J).GT.XB) THEN 3840 XB=WORK(12,J) 3841 IB=J+M 3842 ENDIF 3843 30 CONTINUE 3844 IF(XB.EQ.0.) GOTO 60 3845C 3846C SPLIT CASE CLUSTER 3847C 3848 KKC=KA 3849 IF(IB.LE.M) THEN 3850 IL=INT(WORK(1,IB)) 3851 IU=INT(WORK(2,IB)) 3852 IM=INT(WORK(4,IB)) 3853 DO 40 K=1,KA 3854 IF(IWORK(1,K).EQ.IL+1.AND.IWORK(2,K).EQ.IU+1) THEN 3855 KKC=KKC+1 3856 IWORK(1,KKC)=IM+2 3857 IWORK(2,KKC)=IWORK(2,K) 3858 IWORK(2,K)=IM+1 3859 IWORK(3,KKC)=IWORK(3,K) 3860 IWORK(4,KKC)=IWORK(4,K) 3861 ENDIF 3862 40 CONTINUE 3863 KA=KKC 3864 WORK(2,IB)=IM 3865 KR=KR+1 3866 WORK(1,KR)=IM+1 3867 WORK(2,KR)=IU 3868CCCCC CALL RSPLIT(MM,M,N,A,RLAB,IB,KA,TH,IORD,DMIWRK,IWORK,DMWORK, 3869 CALL RSPLIT(MM,N,A,RLAB,IB,KA,TH,IORD,DMIWRK,IWORK,DMWORK, 3870 * WORK) 3871CCCCC CALL RSPLIT(MM,M,N,A,RLAB,KR,KA,TH,IORD,DMIWRK,IWORK,DMWORK, 3872 CALL RSPLIT(MM,N,A,RLAB,KR,KA,TH,IORD,DMIWRK,IWORK,DMWORK, 3873 * WORK) 3874 GO TO 10 3875 ELSE 3876C 3877C SPLIT VARIABLE CLUSTER 3878C 3879 JB=IB-M 3880 JL=INT(WORK(10,JB)) 3881 JU=INT(WORK(11,JB)) 3882 JM=INT(WORK(13,JB)) 3883 DO 50 K=1,KA 3884 IF(IWORK(3,K).EQ.JL+1.AND.IWORK(4,K).EQ.JU+1) THEN 3885 KKC=KKC+1 3886 IWORK(3,KKC)=JM+2 3887 IWORK(4,KKC)=IWORK(4,K) 3888 IWORK(4,K)=JM+1 3889 IWORK(1,KKC)=IWORK(1,K) 3890 IWORK(2,KKC)=IWORK(2,K) 3891 ENDIF 3892 50 CONTINUE 3893 KA=KKC 3894 WORK(11,JB)=JM 3895 KC=KC+1 3896 WORK(10,KC)=JM+1 3897 WORK(11,KC)=JU 3898CCCCC CALL CSPLIT(MM,M,N,A,CLAB,KC,KA,TH,IORD,DMIWRK,IWORK,DMWORK, 3899 CALL CSPLIT(MM,M,A,CLAB,KC,KA,TH,IORD,DMIWRK,IWORK,DMWORK, 3900 * WORK) 3901CCCCC CALL CSPLIT(MM,M,N,A,CLAB,JB,KA,TH,IORD,DMIWRK,IWORK,DMWORK, 3902 CALL CSPLIT(MM,M,A,CLAB,JB,KA,TH,IORD,DMIWRK,IWORK,DMWORK, 3903 * WORK) 3904 GO TO 10 3905 ENDIF 3906 60 CONTINUE 3907 DO 70 K=1,KA 3908 DO 75 J=1,4 3909 IF(IWORK(J,K).LT.0) IWORK(J,K)=-IWORK(J,K) 3910 75 CONTINUE 3911 70 CONTINUE 3912 CALL BLOCK(MM,M+1,N+1,A,CLAB,RLAB,TITLE,KA,DMIWRK,IWORK,IERR, 3913 * OUNIT) 3914 RETURN 3915 END 3916 SUBROUTINE SLINK(M, DMD, D, DMIWRK, IWORK, WORK) 3917CCCCC SUBROUTINE SLINK(M, DMD, D, DRLAB, DTITLE, DMIWRK, IWORK, WORK, 3918CCCCC* TLAB, IOUT, IERR, OUNIT) 3919C 3920C 2017/04: MODIIFIED FOR DATAPLOT. SUPPRESS TREE1 AND 3921C BLOCK1 ROUTINES (DISPLAYING CLUSTERS WILL BE 3922C DONE BY THE CALLING DATAPLOT ROUTINE). 3923C 3924C<><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><> 3925C 3926C PURPOSE 3927C ------- 3928C 3929C UTILIZES THE SINGLE-LINKAGE CLUSTERING ALGORITHM TO CONSTRUCT 3930C A TREE FROM A USER-SPECIFIED DISTANCE MATRIX 3931C 3932C DESCRIPTION 3933C ----------- 3934C 3935C 1. THE ALGORITHM TO COMPUTE SINGLE-LINKAGE TREES IS FOUND ON PAGES 3936C 191-195 OF THE REFERENCE. THE DATA MATRIX ARE THE DISTANCES 3937C BETWEEN THE CASES. THE DISTANCES SHOULD BE CALCULATED ON 3938C SCALED DATA (CLUSTER SUBROUTINE STAND CAN BE USED TO 3939C STANDARDIZE THE VARIABLES). THE OUTPUT CAN BE THE REGULAR 3940C REGULAR TREE OUTPUT OR THE BLOCK REPRESENTATION OF THE TREE AND 3941C IS WRITTEN ON FORTRAN UNIT OUNIT. 3942C 3943C 2. THE REGULAR TREE LISTS THE CASES VERTICALLY AND HAS HORIZONTAL 3944C LINES EMANATING FROM EACH CASE. EACH CLUSTER WILL CORRESPOND 3945C TO A VERTICAL LINE BETWEEN TWO HORIZONTAL LINES. THE CASES 3946C BETWEEN AND INCLUDED IN THE HORIZONTAL LINES ARE THE MEMBERS OF 3947C THE CLUSTER. THE DISTANCE FROM THE CASE NAMES TO THE VERTICAL 3948C LINES CORRESPOND TO THE CLUSTER DIAMETER OR THE DISTANCE 3949C BETWEEN THE FIRST AND LAST CASES. 3950C 3951C 3. THE BLOCK DIAGRAM PRINTS THE DISTANCE MATRIX WITH THE CASES 3952C LABELING BOTH HORIZONTAL AND VERTICAL AXES. THE DISTANCES HAVE 3953C BEEN MULTIPLIED BY 10. THE HORIZONTAL BOUNDARIES OF THE BLOCKS 3954C ARE REPRESENTED BY DASHES AND THE VERTICAL BOUNDARIES BY QUOTE 3955C MARKS. COMMAS REPRESENT THE CORNERS OF THE BLOCKS. 3956C 3957C INPUT PARAMETERS 3958C ---------------- 3959C 3960C M INTEGER SCALAR (UNCHANGED ON OUTPUT). 3961C THE NUMBER OF OBJECTS. 3962C 3963C DMD INTEGER SCALAR (UNCHANGED ON OUTPUT). 3964C THE LEADING DIMENSION OF MATRIX D. MUST BE AT LEAST M. 3965C 3966C D REAL MATRIX WHOSE FIRST DIMENSION MUST BE DMD AND SECOND 3967C DIMENSION MUST BE AT LEAST M (CHANGED ON OUTPUT). 3968C THE MATRIX OF DISTANCES. ORDERED ON OUTPUT SUCH THAT ALL 3969C CLUSTERS ARE CONTIGUOUS IN THE ORDER. 3970C 3971C D(I,J) = DISTANCE FROM CASE I TO CASE J 3972C 3973C DRLAB VECTOR OF 4-CHARACTER VARIABLES DIMENSIONED AT LEAST M 3974C (CHANGED ON OUTPUT). 3975C LABELS OF THE CASES. ORDERED ON OUTPUT. 3976C 3977C DTITLE 10-CHARACTER VARIABLE (UNCHANGED ON OUTPUT). 3978C TITLE OF THE DATA SET. 3979C 3980C DMIWRK INTEGER SCALAR (UNCHANGED ON OUTPUT). 3981C THE LEADING DIMENSION OF MATRIX IWORK. MUST BE AT LEAST 4. 3982C 3983C IWORK INTEGER VECTOR WHOSE FIRST DIMENSION MUST BE AT DMIWRK AND 3984C WHOSE SECOND DIMENSION MUST BE AT LEAST M+1. 3985C WORK VECTOR. 3986C 3987C WORK REAL VECTOR DIMENSIONED AT LEAST M+1. 3988C WORK VECTOR. 3989C 3990C TLAB VECTOR OF 4-CHARACTER VARIABLES DIMENSIONED AT LEAST M+1 3991C WORK VECTOR. 3992C 3993C IF THE REGULAR TREE DIAGRAM IS NOT CHOSEN, TLAB CAN HAVE 3994C LENGTH 1. 3995C 3996C IOUT INTEGER SCALAR (UNCHANGED ON OUTPUT). 3997C OPTION FOR CHOOSING FORM OF OUTPUT. IOUT HAS THE DECIMAL 3998C EXPANSION AB SUCH THAT IF 3999C 4000C A .NE. 0 THE REGULAR TREE WILL BE PRINTED 4001C B .NE. 0 THE BLOCKED TREE WILL BE PRINTED 4002C 4003C OUNIT INTEGER SCALAR (UNCHANGED ON OUTPUT). 4004C UNIT NUMBER FOR OUTPUT. 4005C 4006C OUTPUT PARAMETER 4007C ---------------- 4008C 4009C IERR INTEGER SCALAR. 4010C ERROR FLAG. 4011C 4012C IERR = 0, NO ERRORS WERE DETECTED DURING EXECUTION 4013C 4014C IERR = 1, EITHER THE FIRST AND LAST CASES OR THE CLUSTER 4015C DIAMETER FOR A CLUSTER IS OUT OF BOUNDS. THE 4016C CLUSTER AND ITS VALUES ARE PRINTED ON UNIT OUNIT. 4017C EXECUTION WILL CONTINUE WITH QUESTIONABLE RESULTS 4018C FOR THAT CLUSTER. ERROR FLAG SET IN THE REGULAR 4019C TREE OUTPUT ROUTINE. 4020C 4021C IERR = 2, EITHER THE FIRST AND LAST CASES OR THE CLUSTER 4022C DIAMETER FOR A CLUSTER IS OUT OF BOUNDS. THE 4023C CLUSTER AND ITS BOUNDARIES ARE PRINTED ON UNIT 4024C OUNIT. EXECUTION WILL CONTINUE WITH QUESTIONABLE 4025C RESULTS FOR THAT CLUSTER. ERROR FLAG SET IN THE 4026C BLOCK TREE OUTPUT ROUTINE. 4027C 4028C REFERENCES 4029C ---------- 4030C 4031C HARTIGAN, J. A. (1975). CLUSTERING ALGORITHMS, JOHN WILEY & 4032C SONS, INC., NEW YORK. PAGES 191-215. 4033C 4034C HARTIGAN, J. A. (1975) PRINTER GRAPHICS FOR CLUSTERING. JOURNAL OF 4035C STATISTICAL COMPUTATION AND SIMULATION. VOLUME 4,PAGES 187-213. 4036C 4037C<><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><> 4038C 4039CCCCC INTEGER DMIWRK, DMD, OUNIT 4040 INTEGER DMIWRK, DMD 4041 DIMENSION D(DMD,*), WORK(*), IWORK(DMIWRK,*) 4042CCCCC CHARACTER*4 DRLAB(*), TLAB(*), CTEMP 4043CCCCC CHARACTER*10 DTITLE 4044C 4045 INCLUDE 'DPCOMC.INC' 4046C 4047 IERR = 0 4048 DO 10 I = 1 , M 4049 IWORK(4,I) = I 4050 10 CONTINUE 4051 D(1,1)=R1MACH(2) 4052C 4053C FIND THE OBJECT CLOSEST TO THE FIRST OBJECT 4054C 4055 DO 20 K=2 , M 4056 IF(D(1,K).LT.D(1,1)) THEN 4057 D(1,1)=D(1,K) 4058 IWORK(4,1)=K 4059 ENDIF 4060 20 CONTINUE 4061C 4062C SET UP THE CLUSTERS 4063C 4064 DO 90 NEXT = 1,M-1 4065 J = NEXT + 1 4066 DMIN=R1MACH(2) 4067 IMIN=NEXT 4068C 4069C FIND THE SMALLEST OF THE SMALLEST DISTANCES SO FAR COMPUTED 4070C 4071 DO 30 I=1,NEXT 4072 IF(D(I,I).LT.DMIN) THEN 4073 DMIN=D(I,I) 4074 IMIN=I 4075 ENDIF 4076 30 CONTINUE 4077 WORK(J+1)=100.*DMIN 4078 I=IWORK(4,IMIN) 4079C 4080C PLACE THE OBJECT JUST DETERMINED IN THE NEXT POSITION BY 4081C EXCHANGING IT WITH THE ONE CURRENTLY THERE 4082C 4083 DO 40 K=1,M 4084 A=D(I,K) 4085 D(I,K)=D(J,K) 4086 D(J,K)=A 4087 40 CONTINUE 4088CNIST CTEMP = DRLAB(I) 4089CNIST DRLAB(I)= DRLAB(J) 4090CNIST DRLAB(J) = CTEMP 4091 DO 50 K=1,M 4092 A=D(K,I) 4093 D(K,I)=D(K,J) 4094 D(K,J)=A 4095 50 CONTINUE 4096 ITEMP = IWORK(4,I) 4097 IWORK(4,I) = IWORK(4,J) 4098 IWORK(4,J) = ITEMP 4099 DO 60 K=1,NEXT 4100 IF(IWORK(4,K).EQ.I) IWORK(4,K)=1 4101 IF(IWORK(4,K).EQ.J) IWORK(4,K)=I 4102 60 CONTINUE 4103C 4104C UPDATE THE SMALLEST DISTANCES 4105C 4106 DO 80 I=1,J 4107 IWORK(4,J)=J 4108 IF(IWORK(4,I).LE.J) THEN 4109 IWORK(4,I)=I 4110 D(I,I)=R1MACH(2) 4111 DO 70 K=J,M 4112 IF(K.NE.J.AND.D(I,K).LT.D(I,I)) THEN 4113 D(I,I)=D(I,K) 4114 IWORK(4,I)=K 4115 ENDIF 4116 70 CONTINUE 4117 ENDIF 4118 80 CONTINUE 4119 90 CONTINUE 4120C 4121C FIND BOUNDARIES OF CLUSTERS 4122C 4123 WORK(2)=R1MACH(2) 4124 M1 = M + 1 4125 DO 140 K=2,M1 4126 IWORK(1,K)=K 4127 IWORK(2,K)=K 4128 DO 100 L=K,M1 4129 IF(L.NE.K) THEN 4130 IF(WORK(L).GT.WORK(K)) GO TO 110 4131 ENDIF 4132 IWORK(2,K)=L 4133 100 CONTINUE 4134 110 CONTINUE 4135 DO 120 L=2,K 4136 LL=K-L+2 4137 IF(L.NE.2) THEN 4138 IF(WORK(LL).GT.WORK(K)) GO TO 130 4139 ENDIF 4140 120 CONTINUE 4141 130 IWORK(1,K)=LL 4142 140 CONTINUE 4143 MM2=M-1 4144 DO 160 K=1,MM2 4145 DO 150 L=1,2 4146 IWORK(L,K)=IWORK(L,K+2) 4147 150 CONTINUE 4148 WORK(K)=WORK(K+2) 4149 160 CONTINUE 4150C 4151C SCALE CLUSTER DIAMETERS BETWEEN 1 AND 100 4152C 4153 XMAX = 0. 4154 DO 170 K=1,MM2 4155 IF(XMAX.LT.WORK(K)) XMAX=WORK(K) 4156 170 CONTINUE 4157 DO 180 K=1,MM2 4158 IWORK(3,K)=INT((WORK(K)*100)/XMAX) 4159 180 CONTINUE 4160C 4161C REORDER DISTANCE MATRIX 4162C 4163 DO 190 I=1,M 4164 D(I,I)=0. 4165 190 CONTINUE 4166C 4167C PRODUCE OUTPUT 4168C 4169CNIST IA = IOUT / 10 4170CNIST IB = MOD(IOUT,10) 4171CNIST IF (IA .NE. 0) THEN 4172CNIST IF (OUNIT .GT. 0) WRITE(OUNIT,1) 4173CNIS1 FORMAT('1') 4174CNIST TLAB(1) = DTITLE 4175CNIST DO 200 I = 1 , M 4176CN200 TLAB(I+1) = DRLAB(I) 4177CNIST CALL TREE1(M+1,M-1,DMIWRK,IWORK,TLAB,IERR,OUNIT) 4178CNIST ENDIF 4179CNIST IF (IB .NE. 0) THEN 4180CNIST DO 210 K = 1, M-1 4181CNIST IWORK(3,K) = IWORK(1,K) 4182CNIST IWORK(4,K) = IWORK(2,K) 4183C210 CONTINUE 4184CNIST CALL BLOCK(DMD,M+1,M+1,D,DRLAB,DRLAB,DTITLE,M-1,DMIWRK,IWORK, 4185CNIST* IERR,OUNIT) 4186CNIST ENDIF 4187 RETURN 4188 END 4189 SUBROUTINE BSWAP(KK,NSAM,NREPR,DYSMA,DYSMB,BETER,DYS,SKY,S,IFLAG, 4190 1 LARGE,ISUBRO,IBUGA3) 4191CNIST SUBROUTINE BSWAP(KK,NSAM,NREPR,DYS,SKY,S,LUB) 4192C 4193C THE FOLLOWING CHANGES WERE MADE TO INCORPORATE INTO DATAPLOT 4194C 4195C 1. USE DATAPLOT OUTPUT 4196C 2. RECODE A BIT FOR BETTER READABILITY 4197C 3. ADD TEMPORARY ARRAYS TO CALL LIST 4198C 4199C NOTE THAT WE CAN USE THIS ROUTINE FOR BOTH "CLARA" AND 4200C "PAM". JUST NEED TO ADD TEMPORARY ARRAYS TO CALL LIST. 4201C PAM USES "NN" RATHER THAN "NSAMP", BUT THIS CAN BE TAKEN 4202C CARE OF IN THE CALLING ROUTINE. ALSO, THE FEEDBACK 4203C MESSAGE IS SLIGHTLY DIFFERENT FOR PAM, SO ADD A FLAG 4204C TO SPECIFY WHETHER BEING CALLED FROM CLARA OR PAM. 4205C 4206C KK = NUMBER OF CLUSTERS 4207C NSAM = NUMBER OF SAMPLES 4208C NREPR = 4209C DYSMA = 4210C DYSMB = 4211C BETER = 4212C DYS = THE OUTPUT MATRIX CONTAINING THE DISTANCES 4213C SKY = SUM OF DISSIMILARITIES/DISTANCES 4214C S = 4215C LUB = OUTPUT UNIT (DON'T USE) 4216C 4217 DIMENSION NREPR(*) 4218 DIMENSION DYS(*) 4219 DIMENSION DYSMA(*) 4220 DIMENSION DYSMB(*) 4221 DIMENSION BETER(*) 4222C 4223 INTEGER MEET 4224 EXTERNAL MEET 4225C 4226 CHARACTER*4 IFLAG 4227 CHARACTER*4 ISUBRO 4228 CHARACTER*4 IBUGA3 4229C 4230 INCLUDE 'DPCOP2.INC' 4231C 4232 IF(IBUGA3.EQ.'ON' .OR. ISUBRO.EQ.'SWAP')THEN 4233 WRITE(ICOUT,11) 4234 11 FORMAT('***** AT THE BEGINNING OF BSWAP--') 4235 CALL DPWRST('XXX','BUG ') 4236 WRITE(ICOUT,15)KK,NSAM,S,IFLAG 4237 15 FORMAT('KK,NSAM,S,IFLAG = ',2I8,G15.7,2X,A4) 4238 CALL DPWRST('XXX','BUG ') 4239 ENDIF 4240C 4241CC 4242CC FIRST ALGORITHM: BUILD. 4243CC 4244 NBEST=0 4245 KBEST=0 4246 NMAX=0 4247 NNY=0 4248 DO 17 J=1,NSAM 4249 NREPR(J)=0 4250 DYSMA(J)=1.1*S+1.0 4251 17 CONTINUE 4252C 4253 20 CONTINUE 4254 DO 22 JA=1,NSAM 4255 IF(NREPR(JA).NE.0)GO TO 22 4256 BETER(JA)=0. 4257 DO 21 J=1,NSAM 4258 NJAJ=MEET(JA,J) 4259 CMD=DYSMA(J)-DYS(NJAJ) 4260 IF(CMD.GT.0.0)BETER(JA)=BETER(JA)+CMD 4261 21 CONTINUE 4262 22 CONTINUE 4263 AMMAX=0. 4264 DO 31 JA=1,NSAM 4265 IF(NREPR(JA).NE.0)GO TO 31 4266 IF(BETER(JA).LT.AMMAX)GO TO 31 4267 AMMAX=BETER(JA) 4268 NMAX=JA 4269 31 CONTINUE 4270 NREPR(NMAX)=1 4271 NNY=NNY+1 4272 DO 41 J=1,NSAM 4273 NJN=MEET(NMAX,J) 4274 IF(DYS(NJN).LT.DYSMA(J))DYSMA(J)=DYS(NJN) 4275 41 CONTINUE 4276C 4277 IF(NNY.NE.KK)GO TO 20 4278C 4279 IF(IBUGA3.EQ.'ON' .OR. ISUBRO.EQ.'SWAP')THEN 4280 WRITE(ICOUT,9011)NNY 4281 9011 FORMAT('***** BEFORE 51 LOOP--, NNY = ',I8) 4282 CALL DPWRST('XXX','BUG ') 4283 DO9020II=1,NSAM 4284 WRITE(ICOUT,9025)II,NREPR(II) 4285 9025 FORMAT('II,NREPR(II) = ',2I8) 4286 CALL DPWRST('XXX','BUG ') 4287 9020 CONTINUE 4288 ENDIF 4289C 4290 SKY=0. 4291 DO 51 J=1,NSAM 4292 SKY=SKY+DYSMA(J) 4293 51 CONTINUE 4294 IF(KK.EQ.1)GOTO9090 4295 RSAM=NSAM 4296 ASKY=SKY/RSAM 4297CNIST WRITE(LUB,9100)ASKY 4298C9100 FORMAT(1X/33H RESULT OF BUILD FOR THIS SAMPLE/2X, 4299CNISTF ' AVERAGE DISTANCE = ',F12.3) 4300 WRITE(ICOUT,999) 4301 999 FORMAT(1X) 4302 CALL DPWRST('XXX','BUG ') 4303 IF(IFLAG.EQ.'CLAR')THEN 4304 IF(IPRINT.EQ.'ON' .AND. LARGE.EQ.2)THEN 4305 WRITE(ICOUT,9100) 4306 9100 FORMAT('RESULT OF BUILD FOR THIS SAMPLE') 4307 CALL DPWRST('XXX','BUG ') 4308 WRITE(ICOUT,9101)ASKY 4309 9101 FORMAT(' AVERAGE DISTANCE = ',F12.5) 4310 CALL DPWRST('XXX','BUG ') 4311 ENDIF 4312 ELSE 4313 IF(IPRINT.EQ.'ON' .AND. LARGE.EQ.2)THEN 4314 WRITE(ICOUT,9110) 4315 9110 FORMAT('RESULT OF BUILD') 4316 CALL DPWRST('XXX','BUG ') 4317 WRITE(ICOUT,9111)ASKY 4318 9111 FORMAT(' AVERAGE DISSIMILARITY = ',F12.5) 4319 CALL DPWRST('XXX','BUG ') 4320 ENDIF 4321 ENDIF 4322CC 4323CC SECOND ALGORITHM: SWAP. 4324CC 4325 60 CONTINUE 4326C 4327 DO 63 J=1,NSAM 4328 DYSMA(J)=1.1*S+1.0 4329 DYSMB(J)=1.1*S+1.0 4330 DO 62 JA=1,NSAM 4331 IF(NREPR(JA).EQ.0)GO TO 62 4332 NJAJ=MEET(JA,J) 4333 IF(DYS(NJAJ).GE.DYSMA(J))THEN 4334 IF(DYS(NJAJ).GE.DYSMB(J))GO TO 62 4335 DYSMB(J)=DYS(NJAJ) 4336 ELSE 4337 DYSMB(J)=DYSMA(J) 4338 DYSMA(J)=DYS(NJAJ) 4339 ENDIF 4340 62 CONTINUE 4341 63 CONTINUE 4342C 4343 IF(IBUGA3.EQ.'ON' .OR. ISUBRO.EQ.'SWAP')THEN 4344 WRITE(ICOUT,16) 4345 16 FORMAT('***** AFTER 63 LOOP--') 4346 CALL DPWRST('XXX','BUG ') 4347 DO116I=1,NSAM 4348 WRITE(ICOUT,117)I,DYS(I),DYSMA(I),DYSMB(I) 4349 117 FORMAT('I,DYS(I),DYSMA(I),DYSMB(I) = ',I8,3G15.7) 4350 CALL DPWRST('XXX','BUG ') 4351 116 CONTINUE 4352 ENDIF 4353C 4354 DZSKY=1.0 4355 DO 73 K=1,NSAM 4356 IF(NREPR(K).EQ.1)GO TO 73 4357C 4358 IF(IBUGA3.EQ.'ON' .OR. ISUBRO.EQ.'SWAP')THEN 4359 WRITE(ICOUT,171)K,NREPR(K),DZ,DZSKY 4360 171 FORMAT('K,NREPR(K),DZ,DZSKY = ',2I8,2G15.7) 4361 CALL DPWRST('XXX','BUG ') 4362 ENDIF 4363C 4364 DO 72 JA=1,NSAM 4365 IF(NREPR(JA).EQ.0)GO TO 72 4366 DZ=0. 4367 DO 71 J=1,NSAM 4368 NJAJ=MEET(JA,J) 4369 NKJ=MEET(K,J) 4370 IF(DYS(NJAJ).NE.DYSMA(J))THEN 4371 IF(DYS(NKJ).LT.DYSMA(J))DZ=DZ-DYSMA(J)+DYS(NKJ) 4372 ELSE 4373 SMALL=DYSMB(J) 4374 IF(DYS(NJAJ).LT.SMALL)SMALL=DYS(NKJ) 4375 DZ=DZ-DYSMA(J)+SMALL 4376 ENDIF 4377 71 CONTINUE 4378 IF(DZ.GE.DZSKY)GO TO 72 4379 DZSKY=DZ 4380 KBEST=K 4381 NBEST=JA 4382 72 CONTINUE 4383 73 CONTINUE 4384C 4385 IF(IBUGA3.EQ.'ON' .OR. ISUBRO.EQ.'SWAP')THEN 4386 WRITE(ICOUT,9017)DZSKY 4387 9017 FORMAT('***** AFTER 73 LOOP, DZSKY = ',G15.7) 4388 CALL DPWRST('XXX','BUG ') 4389 ENDIF 4390C 4391 IF(DZSKY.GE.0.0)GOTO9090 4392 NREPR(KBEST)=1 4393 NREPR(NBEST)=0 4394 SKY=SKY+DZSKY 4395 GO TO 60 4396 9090 CONTINUE 4397C 4398 IF(IBUGA3.EQ.'ON' .OR. ISUBRO.EQ.'SWAP')THEN 4399 WRITE(ICOUT,9001) 4400 9001 FORMAT('***** AT THE END OF BSWAP--') 4401 CALL DPWRST('XXX','BUG ') 4402 ENDIF 4403C 4404 RETURN 4405 END 4406 SUBROUTINE DYSTA(NSAM,JPP,NSEL,X,DYS,NDYST,AMISS,JHALT, 4407 1 ISUBRO,IBUGA3) 4408CNIST SUBROUTINE DYSTA(NSAM,JPP,NSEL,X,MAXXX,MAXTT,DYS,NDYST,JTMD, 4409CNIST1 VALMD,JHALT,LUB,FNAMEB) 4410C 4411C KAUFFMAN AND ROUSSEEUW CODE FROM CLARA ALGORITHM. THIS 4412C ROUTINE COMPUTES EITHER EUCLIDEAN DISTANCE OR MANHATTAN 4413C DISTANCE BETWEEN ALL OBJECTS OF A SAMPLE (CLARA VERSION) 4414C 4415C NSAM = NUMBER OF SAMPLES 4416C JPP = NUMBER OF VARIABLES 4417C NSEL = INTEGER ARRAY CONTAINING OBJECTS SELECTED 4418C X = THE DATA MATRIX 4419C MAXXX = THE MAXIMUM OF ROWS TIMES COLUMNS, WE DON'T USE 4420C MAXTT = THE MAXIMUM OF VARIABLES (COLUMNS), WE DON'T USE 4421C DYS = THE OUTPUT MATRIX CONTAINING THE DISTANCES 4422C NDYST = 1 => EUCLIDEAN DISTANCES 4423C 2 => MANHATTAN (= CITY BLOCK) DISTANCES 4424C JHALT = SET TO 1 FOR ERROR CONDITION 4425C JTMD = FOR MISSING VALUES, WE DON'T USE 4426C VALMD = FOR MISSING VALUES, WE DON'T USE 4427C LUB = OUTPUT UNIT 4428C (WE USE DATAPLOT OUTPUT STRUCTURE, SO 4429C REMOVE THIS) 4430C FNAMEB = OUTPUT FILE NAME 4431C (WE USE DATAPLOT OUTPUT STRUCTURE, SO 4432C REMOVE THIS) 4433C 4434C CHANGES FOR INCORPORATING INTO DATAPLOT: 4435C 4436C 1. USE DATAPLOT I/O ROUTINES 4437C 2. FOR DATAPLOT, ONLY USE A SINGLE VALUE TO DENOTE 4438C MISSING VALUES 4439C 3. RECODED SLIGHTLY TO REDUCE USE OF GO TO's (THIS 4440C WAS JUST TO IMPROVE READABILITY OF THE CODE) 4441C 4442CNIST DIMENSION X(MAXXX),DYS(4951) 4443CNIST DIMENSION NSEL(100),JTMD(MAXTT),VALMD(MAXTT) 4444CNIST CHARACTER*30 FNAMEB 4445C 4446 DIMENSION X(*) 4447 DIMENSION DYS(*) 4448 DIMENSION NSEL(*) 4449C 4450 CHARACTER*4 IBUGA3 4451 CHARACTER*4 ISUBRO 4452C 4453 INCLUDE 'DPCOP2.INC' 4454C 4455 IF(IBUGA3.EQ.'ON' .OR. ISUBRO.EQ.'YSTA')THEN 4456 WRITE(ICOUT,51) 4457 51 FORMAT('***** AT THE BEGINNING OF DYSTA--') 4458 CALL DPWRST('XXX','BUG ') 4459 WRITE(ICOUT,55)NSAM,JPP,AMISS 4460 55 FORMAT('NSAM,JPP,AMISS = ',2I8,G15.7) 4461 CALL DPWRST('XXX','BUG ') 4462 DO56I=1,NSAM 4463 WRITE(ICOUT,57)I,NSEL(I) 4464 57 FORMAT('I,NSEL(I) = ',2I8) 4465 CALL DPWRST('XXX','BUG ') 4466 56 CONTINUE 4467 DO58I=1,JPP 4468 WRITE(ICOUT,59)I,X(I) 4469 59 FORMAT('I,X(I) = ',I8,G15.7) 4470 CALL DPWRST('XXX','BUG ') 4471 58 CONTINUE 4472 ENDIF 4473C 4474 JHALT=0 4475 PP=JPP 4476 NLK=1 4477 DYS(1)=0.0 4478 DO 100 L=2,NSAM 4479 LSUBT=L-1 4480 LSEL=NSEL(L) 4481 DO 20 K=1,LSUBT 4482 KSEL=NSEL(K) 4483 CLK=0.0 4484 NLK=NLK+1 4485 NPRES=0 4486 DO 30 J=1,JPP 4487 NUMLJ=(LSEL-1)*JPP+J 4488 NUMKJ=(KSEL-1)*JPP+J 4489CNIST IF(JTMD(J).GE.0)GO TO 40 4490CNIST IF(X(NUMLJ).EQ.VALMD(J))GO TO 30 4491CNIST IF(X(NUMKJ).EQ.VALMD(J))GO TO 30 4492 IF(X(NUMLJ).EQ.AMISS)GO TO 30 4493 IF(X(NUMKJ).EQ.AMISS)GO TO 30 4494CNI40 CONTINUE 4495 NPRES=NPRES+1 4496 IF(NDYST.NE.1)THEN 4497 CLK=CLK+ABS(X(NUMLJ)-X(NUMKJ)) 4498 ELSE 4499 CLK=CLK+(X(NUMLJ)-X(NUMKJ))*(X(NUMLJ)-X(NUMKJ)) 4500 ENDIF 4501 30 CONTINUE 4502 RPRES=NPRES 4503 IF(NPRES.EQ.0)THEN 4504 JHALT=1 4505CNIST WRITE(LUB,9400)LSEL,KSEL 4506C9400 FORMAT(1X,8H OBJECTS,I8,4H AND,I8,23H HAVE NO COMMON MEASURE, 4507CNISTF6HMENTS,/49H SO THE DISTANCE BETWEEN THEM CANNOT BE COMPUTED) 4508CNIST IF(FNAMEB.NE.'CON')WRITE(*,9400)LSEL,KSEL 4509 WRITE(ICOUT,999) 4510 999 FORMAT(1X) 4511 CALL DPWRST('XXX','BUG ') 4512 WRITE(ICOUT,9401)LSEL,KSEL 4513 9401 FORMAT('***** OBJECTS ',I8,' AND ',I8,' HAVE NO ', 4514 1 'COMMON MEASURE, SO') 4515 CALL DPWRST('XXX','BUG ') 4516 WRITE(ICOUT,9403) 4517 9403 FORMAT(' THE DISTANCE BETWEEN THEM CANNOT BE ', 4518 1 'COMPUTED.') 4519 CALL DPWRST('XXX','BUG ') 4520 DYS(NLK)=0.0 4521 GO TO 20 4522 ENDIF 4523 IF(NDYST.EQ.1)THEN 4524 DYS(NLK)=SQRT(CLK*(PP/RPRES)) 4525 ELSE 4526 DYS(NLK)=CLK*(PP/RPRES) 4527 ENDIF 4528 20 CONTINUE 4529 100 CONTINUE 4530C 4531 IF(IBUGA3.EQ.'ON' .OR. ISUBRO.EQ.'YSTA')THEN 4532 WRITE(ICOUT,9051) 4533 9051 FORMAT('***** AT THE END OF DYSTA--') 4534 CALL DPWRST('XXX','BUG ') 4535 DO9056I=1,NLK 4536 WRITE(ICOUT,9057)I,DYS(I) 4537 9057 FORMAT('I,DYS(I) = ',I8,G15.7) 4538 CALL DPWRST('XXX','BUG ') 4539 9056 CONTINUE 4540 ENDIF 4541C 4542 RETURN 4543 END 4544 FUNCTION MEET(L,J) 4545 IF(L.GT.J)THEN 4546CC 4547CC J LESS THAN L 4548CC 4549 MEET=(L-2)*(L-1)/2+J+1 4550 ELSEIF(L.EQ.J)THEN 4551CC 4552CC J EQUALS L 4553CC 4554 MEET=1 4555 ELSE 4556CC 4557CC L LESS THAN J 4558CC 4559 MEET=(J-2)*(J-1)/2+L+1 4560 ENDIF 4561C 4562 RETURN 4563 END 4564 SUBROUTINE DYSTAP(NN,JPP,MAXNN,MAXPP,X,DYS,NDYST,AMISS,JHALT, 4565 1 ISUBRO,IBUGA3) 4566C 4567CNIST SUBROUTINE DYSTA(NN,JPP,MAXNN,MAXPP,MAXHH,X,DYS,NDYST,JTMD, 4568CNIST1 VALMD,LAB,JHALT,LUB,FNAMEB) 4569C 4570C KAUFFMAN AND ROUSSEEUW CODE FROM PAM ALGORITHM. THIS 4571C ROUTINE COMPUTES EITHER EUCLIDEAN DISTANCE OR MANHATTAN 4572C DISTANCE BETWEEN ALL OBJECTS OF A SAMPLE (CLARA VERSION) 4573C 4574C NN = NUMBER OF SAMPLES 4575C JPP = NUMBER OF VARIABLES 4576C MAXN = THE ROW DIMENSION OF X 4577C MAXPP = THE COLUMN DIMENSION OF X 4578C MAXHH = THE MAXIMUM DIMENSION FOR THE DISTANCES 4579C (DATAPLOT DOES NOT USE) 4580C X = THE DATA MATRIX 4581C DYS = THE OUTPUT MATRIX CONTAINING THE DISTANCES 4582C NDYST = 1 => EUCLIDEAN DISTANCES 4583C 2 => MANHATTAN (= CITY BLOCK) DISTANCES 4584C JTMD = FOR MISSING VALUES, WE DON'T USE 4585C VALMD = FOR MISSING VALUES, WE DON'T USE 4586C LAB = ... 4587C JHALT = SET TO 1 FOR ERROR CONDITION 4588C LUB = OUTPUT UNIT 4589C (DATAPLOT DOES NOT USE) 4590C FNAMEB = OUTPUT FILE NAME 4591C (DATAPLOT DOES NOT USE) 4592C 4593C CHANGES FOR INCORPORATING INTO DATAPLOT: 4594C 4595C 1. USE DATAPLOT I/O ROUTINES 4596C 2. FOR DATAPLOT, ONLY USE A SINGLE VALUE TO DENOTE 4597C MISSING VALUES 4598C 3. RECODED SLIGHTLY TO REDUCE USE OF GO TO's (THIS 4599C WAS JUST TO IMPROVE READABILITY OF THE CODE) 4600C 4601CNIST DIMENSION X(MAXNN,MAXPP),DYS(MAXHH),JTMD(MAXPP),VALMD(MAXPP) 4602CNIST CHARACTER LAB(3,MAXNN) 4603CNIST CHARACTER*30 FNAMEB 4604 DIMENSION X(MAXNN,MAXPP) 4605 DIMENSION DYS(*) 4606C 4607 CHARACTER*4 IBUGA3 4608 CHARACTER*4 ISUBRO 4609C 4610 INCLUDE 'DPCOP2.INC' 4611C 4612 IF(IBUGA3.EQ.'ON' .OR. ISUBRO.EQ.'YSTA')THEN 4613 WRITE(ICOUT,51) 4614 51 FORMAT('***** AT THE BEGINNING OF DYSTAP--') 4615 CALL DPWRST('XXX','BUG ') 4616 WRITE(ICOUT,55)NN,JPP,AMISS 4617 55 FORMAT('NN,JPP,AMISS = ',2I8,G15.7) 4618 CALL DPWRST('XXX','BUG ') 4619 DO58I=1,NN 4620 WRITE(ICOUT,59)I,(X(I,J),J=1,JPP) 4621 59 FORMAT('I,X(I,J) = ',I8,30G15.7) 4622 CALL DPWRST('XXX','BUG ') 4623 58 CONTINUE 4624 ENDIF 4625C 4626 JHALT=0 4627 PP=JPP 4628 NLK=1 4629 DYS(1)=0.0 4630 DO 100 L=2,NN 4631 LSUBT=L-1 4632 DO 20 K=1,LSUBT 4633 CLK=0.0 4634 NLK=NLK+1 4635 NPRES=0 4636 DO 30 J=1,JPP 4637CNIST IF(JTMD(J).GE.0)GOTO 40 4638CNIST IF(X(L,J).EQ.VALMD(J))GOTO 30 4639CNIST IF(X(K,J).EQ.VALMD(J))GOTO 30 4640 IF(X(L,J).EQ.AMISS)GOTO30 4641 IF(X(K,J).EQ.AMISS)GOTO30 4642CNI40 CONTINUE 4643 NPRES=NPRES+1 4644 IF(NDYST.NE.1)THEN 4645 CLK=CLK+ABS(X(L,J)-X(K,J)) 4646 ELSE 4647 CLK=CLK+(X(L,J)-X(K,J))*(X(L,J)-X(K,J)) 4648 ENDIF 4649 30 CONTINUE 4650 RPRES=NPRES 4651 IF(NPRES.EQ.0)THEN 4652 JHALT=1 4653CNIST WRITE(LUB,9400)LAB(1,L),LAB(2,L),LAB(3,L),LAB(1,K),LAB(2,K) 4654CNIST1 ,LAB(3,K) 4655C9400 FORMAT(' OBJECTS ',3A1,' AND ',3A1, 4656CNIST1 ' HAVE NO COMMON MEASUREMENTS.') 4657CNIST IF(FNAMEB.NE.'CON')WRITE(*,9400)LAB(1,L),LAB(2,L),LAB(3,L), 4658CNIST1 LAB(1,K),LAB(2,K),LAB(3,K) 4659 WRITE(ICOUT,999) 4660 999 FORMAT(1X) 4661 CALL DPWRST('XXX','BUG ') 4662 WRITE(ICOUT,9401)L,K 4663 9401 FORMAT('***** OBJECTS ',I8,' AND ',I8,' HAVE NO ', 4664 1 'COMMON MEASURE, SO') 4665 CALL DPWRST('XXX','BUG ') 4666 WRITE(ICOUT,9403) 4667 9403 FORMAT(' THE DISTANCE BETWEEN THEM CANNOT BE ', 4668 1 'COMPUTED.') 4669 CALL DPWRST('XXX','BUG ') 4670 DYS(NLK)=0.0 4671 GOTO 20 4672 ENDIF 4673 IF(NDYST.EQ.1)THEN 4674 DYS(NLK)=SQRT(CLK*(PP/RPRES)) 4675 ELSE 4676 DYS(NLK)=CLK*(PP/RPRES) 4677 ENDIF 4678 20 CONTINUE 4679 100 CONTINUE 4680C 4681 IF(IBUGA3.EQ.'ON' .OR. ISUBRO.EQ.'STAP')THEN 4682 WRITE(ICOUT,9051) 4683 9051 FORMAT('***** AT THE END OF DYSTA--') 4684 CALL DPWRST('XXX','BUG ') 4685 DO9056I=1,NLK 4686 WRITE(ICOUT,9057)I,DYS(I) 4687 9057 FORMAT('I,DYS(I) = ',I8,G15.7) 4688 CALL DPWRST('XXX','BUG ') 4689 9056 CONTINUE 4690 ENDIF 4691C 4692 RETURN 4693 END 4694 SUBROUTINE SELEC(KK,NN,JPP,NSTAN,NDYST,ZB,NSAM,MDATA, 4695 1 AMISS,NREPR,NSEL,DYS,X,NR,NAFS, 4696 1 TTD,RADUS,RATT, 4697 1 TTNEW,RDNEW, 4698 1 NRNEW,NSNEW,NPNEW,NS,NP,NEW, 4699 1 LARGE,ISUBRO,IBUGA3) 4700C 4701C THE FOLLOWING CHANGES WERE MADE TO INCORPORATE INTO DATAPLOT 4702C 4703C 1. USE DATAPLOT OUTPUT 4704C 2. RECODE A BIT FOR BETTER READABILITY 4705C 4706C KK = NUMBER OF CLUSTERS 4707C NN = NUMBER OF ROWS (CASES) 4708C JPP = NUMBER OF COLUMNS (VARIABLES) 4709C NSTAN = 0 => NO STANDARDIZATION APPLIED 4710C 1 => STANDARDIZATION APPLIED 4711C NDYST = 1 => EUCLIDEAN DISTANCES 4712C 2 => MANHATTAN (= CITY BLOCK) DISTANCES 4713C ZB = TOTAL DISTANCE 4714C NSAM = NUMBER OF SAMPLES 4715C MDATA = 0 => NO MISSING DATA 4716C 1 => THERE IS MISSING DATA 4717C AMISS = NUMBER DENOTING A MISSING VALUE 4718C NREPR = INTEGER ARRAY 4719C 0 => NOT A REPRESENTATIVE OBJECT 4720C 1 => IS A REPRESENTATIVE OBJECT 4721C NSEL = INTEGER ARRAY CONTAINING OBJECTS SELECTED 4722C DYS = THE OUTPUT MATRIX CONTAINING THE DISTANCES 4723C X = THE DATA MATRIX 4724C MAXTT = THE MAXIMUM OF ROWS TIMES COLUMNS 4725C MAXXX = THE MAXIMUM NUBER OF VARIABLES (COLUMNS) 4726C NR = 4727C NAFS = 4728C TTD = AVERAGE DISTANCE TO EACH MEDOID 4729C RADUS = MAXIMUM DISTANCE TO EACH MEDOID 4730C RATT = MAXIMUM DISTANCE OF MEDOID DIVIDED BY MINIMUM 4731C DISTANCE TO ANOTHER MEDOID 4732C 4733CNIST SUBROUTINE SELEC(KK,NN,JPP,NSTAN,NDYST,ZB,NSAM,LUB,MDATA, 4734CNISTF JTMD,VALMD,NREPR,NSEL,DYS,X,MAXXX,MAXTT,NR,NAFS, 4735CNISTF TTD,RADUS,RATT) 4736CNIST DIMENSION NREPR(100),NSEL(100),DYS(4951),X(MAXXX),NEW(30) 4737CNIST DIMENSION NRNEW(30),NSNEW(30),NPNEW(30),TTNEW(30),RDNEW(30) 4738CNIST DIMENSION NS(30),NR(30),NP(30),TTD(30),RADUS(30),RATT(30) 4739CNIST DIMENSION JTMD(MAXTT),VALMD(MAXTT) 4740 PARAMETER (MAXCLU=30) 4741C 4742 DIMENSION DYS(*) 4743 DIMENSION X(*) 4744 DIMENSION TTD(*) 4745 DIMENSION RADUS(*) 4746 DIMENSION RATT(*) 4747 DIMENSION NREPR(*) 4748 DIMENSION NSEL(*) 4749 DIMENSION NR(*) 4750C 4751 DIMENSION NRNEW(*) 4752 DIMENSION NSNEW(*) 4753 DIMENSION NPNEW(*) 4754 DIMENSION TTNEW(*) 4755 DIMENSION RDNEW(*) 4756 DIMENSION NS(*) 4757 DIMENSION NP(*) 4758 DIMENSION NEW(*) 4759C 4760 CHARACTER*4 ISUBRO 4761 CHARACTER*4 IBUGA3 4762C 4763 INCLUDE 'DPCOP2.INC' 4764C 4765 IF(IBUGA3.EQ.'ON' .OR. ISUBRO.EQ.'ELEC')THEN 4766 WRITE(ICOUT,51) 4767 51 FORMAT('***** AT THE BEGINNING OF SELEC--') 4768 CALL DPWRST('XXX','BUG ') 4769 WRITE(ICOUT,55)KK,NN,JPP,NSTAN,NDYST,NSAM 4770 55 FORMAT('KK,NN,JPP,NSTAN,NDYST,NSAM = ',6I8) 4771 CALL DPWRST('XXX','BUG ') 4772 ENDIF 4773CC 4774CC NAFS = 1 IF A DISTANCE CANNOT BE CALCULATED 4775CC 4776 NAFS=0 4777 JKABC=0 4778 DNULL=0.0 4779CC 4780CC IDENTIFICATION OF REPRESENTATIVE OBJECTS, AND INITIALIZATIONS 4781CC 4782 JK=0 4783 DO 10 J=1,NSAM 4784 IF(NREPR(J).EQ.0)GO TO 10 4785 JK=JK+1 4786 NR(JK)=NSEL(J) 4787 NS(JK)=0 4788 TTD(JK)=0. 4789 RADUS(JK)=-1. 4790 NP(JK)=J 4791 10 CONTINUE 4792CC 4793CC ASSIGNMENT OF THE OBJECTS OF THE ENTIRE DATA SET TO A CLUSTER, 4794CC COMPUTATION OF SOME STATISTICS, DETERMINATION OF THE 4795CC NEW ORDERING OF THE CLUSTERS 4796CC 4797 ZB=0. 4798 PP=JPP 4799 NEWF=0 4800 JN=0 4801 15 CONTINUE 4802 JN=JN+1 4803 IF(MDATA.NE.0)THEN 4804 PRES=0. 4805 DO 70 JK=1,KK 4806 DSUM=0. 4807 NRJK=NR(JK) 4808 ABC=0. 4809 DO 50 JP=1,JPP 4810 NA=(NRJK-1)*JPP+JP 4811 NB=(JN-1)*JPP+JP 4812 IF(X(NA).NE.AMISS .AND. X(NB).NE.AMISS)THEN 4813 ABC=ABC+1. 4814 TRA=ABS(X(NA)-X(NB)) 4815 IF(NDYST.EQ.1)TRA=TRA*TRA 4816 DSUM=DSUM+TRA 4817 ENDIF 4818 50 CONTINUE 4819 IF(ABC.LT.0.5)GO TO 70 4820 DSUM=DSUM*ABC/PP 4821 IF(PRES.GT.0.5)THEN 4822 IF(DSUM.GE.DNULL)GO TO 70 4823 ELSE 4824 PRES=1. 4825 ENDIF 4826 DNULL=DSUM 4827 JKABC=JK 4828 70 CONTINUE 4829 IF(PRES.GT.0.5)GO TO 80 4830CNIST WRITE(LUB,9000)JN 4831C9000 FORMAT(' OBJECT',I5,37H DOESNT HAVE COMMON MEASUREMENTS WITH, 4832CNISTF 53H ANY OF THE MEDOIDS AND THEREFORE CANNOT BE ASSIGNED.) 4833CNIST WRITE(LUB,9002) 4834C9002 FORMAT(1X,' THIS SAMPLE IS NOT CONSIDERED ANY FURTHER') 4835 WRITE(ICOUT,999) 4836 999 FORMAT(1X) 4837 CALL DPWRST('XXX','BUG ') 4838 WRITE(ICOUT,9000)JN 4839 9000 FORMAT('****** OBJECT ',I5,' DOES NOT HAVE COMMON MEASUREMENTS', 4840 1 'WITH ANY') 4841 CALL DPWRST('XXX','BUG ') 4842 WRITE(ICOUT,9001) 4843 9001 FORMAT(' OF THE MEDOIDS AND THEREFORE CANNOT BE ', 4844 1 'ASSIGNED.') 4845 CALL DPWRST('XXX','BUG ') 4846 WRITE(ICOUT,9002) 4847 9002 FORMAT(' THIS SAMPLE IS NOT CONSIDERED ANY FURTHER.') 4848 CALL DPWRST('XXX','BUG ') 4849 NAFS=1 4850 GOTO9090 4851 ENDIF 4852C 4853 DO 30 JK=1,KK 4854 DSUM=0. 4855 NRJK=NR(JK) 4856 DO 20 JP=1,JPP 4857 NA=(NRJK-1)*JPP+JP 4858 NB=(JN-1)*JPP+JP 4859 TRA=ABS(X(NA)-X(NB)) 4860 IF(NDYST.EQ.1)TRA=TRA*TRA 4861 DSUM=DSUM+TRA 4862 20 CONTINUE 4863 IF(JK.EQ.1 .OR. DSUM.LT.DNULL)THEN 4864 DNULL=DSUM 4865 JKABC=JK 4866 ENDIF 4867 30 CONTINUE 4868C 4869 80 CONTINUE 4870 IF(NDYST.EQ.1)DNULL=SQRT(DNULL) 4871 ZB=ZB+DNULL 4872 TTD(JKABC)=TTD(JKABC)+DNULL 4873 IF(DNULL.GT.RADUS(JKABC))RADUS(JKABC)=DNULL 4874 NS(JKABC)=NS(JKABC)+1 4875 IF(NEWF.GE.KK)GO TO 90 4876 IF(NEWF.GE.1)THEN 4877 DO 82 JNEW=1,NEWF 4878 IF(JKABC.EQ.NEW(JNEW))GO TO 90 4879 82 CONTINUE 4880 ENDIF 4881 NEWF=NEWF+1 4882 NEW(NEWF)=JKABC 4883 90 CONTINUE 4884 IF(JN.LT.NN)GO TO 15 4885CC 4886CC A PERMUTATION IS CARRIED OUT ON VECTORS NR,NS,NP,TTD,RADUS 4887CC USING THE INFORMATION IN VECTOR NEW. 4888CC 4889 DO 92 JK=1,KK 4890 NJK=NEW(JK) 4891 NRNEW(JK)=NR(NJK) 4892 NSNEW(JK)=NS(NJK) 4893 NPNEW(JK)=NP(NJK) 4894 TTNEW(JK)=TTD(NJK) 4895 RDNEW(JK)=RADUS(NJK) 4896 92 CONTINUE 4897 DO 94 JK=1,KK 4898 NR(JK)=NRNEW(JK) 4899 NS(JK)=NSNEW(JK) 4900 NP(JK)=NPNEW(JK) 4901 TTD(JK)=TTNEW(JK) 4902 RADUS(JK)=RDNEW(JK) 4903 94 CONTINUE 4904CC 4905CC PRINTING OF RESULTS FOR ENTIRE DATA SET 4906CC 4907 RNN=NN 4908 ZM=ZB/RNN 4909CNIST WRITE(LUB,9010)ZB,ZM 4910C9010 FORMAT(33H RESULTS FOR THE ENTIRE DATA SET/3X, 4911CNISTF 20H TOTAL DISTANCE =,F15.3/3X,20H AVERAGE DISTANCE =,F15.3) 4912C 4913 IF(IPRINT.EQ.'ON' .AND. LARGE.EQ.2)THEN 4914 WRITE(ICOUT,9010) 4915 9010 FORMAT('RESULTS FOR THE ENTIRE DATA SET') 4916 CALL DPWRST('XXX','BUG ') 4917 WRITE(ICOUT,9011)ZB 4918 9011 FORMAT(' TOTAL DISTANCE = ',F15.3) 4919 CALL DPWRST('XXX','BUG ') 4920 WRITE(ICOUT,9012)ZM 4921 9012 FORMAT(' AVERAGE DISTANCE = ',F15.3) 4922 CALL DPWRST('XXX','BUG ') 4923C 4924 WRITE(ICOUT,999) 4925 CALL DPWRST('XXX','BUG ') 4926 IF(NSTAN.EQ.0)THEN 4927CNIST WRITE(LUB,9020) 4928C9020 FORMAT(/46H CLUSTER SIZE MEDOID COORDINATES OF MEDOID) 4929 WRITE(ICOUT,9020) 4930 9020 FORMAT(' CLUSTER SIZE MEDOID COORDINATES OF MEDOID') 4931 CALL DPWRST('XXX','BUG ') 4932 ELSEIF(NSTAN.NE.0)THEN 4933CNIST WRITE(LUB,9025) 4934C9025 FORMAT(/46H CLUSTER SIZE MEDOID COORDINATES OF MEDOID, 4935CNISTF 28H (STANDARDIZED MEASUREMENTS)) 4936 WRITE(ICOUT,9025) 4937 9025 FORMAT(' CLUSTER SIZE MEDOID COORDINATES OF MEDOID ', 4938 1 ' (STANDARDIZED MEASUREMENTS)') 4939 CALL DPWRST('XXX','BUG ') 4940 ENDIF 4941 ENDIF 4942 DO 100 JK=1,KK 4943 JKA=(NR(JK)-1)*JPP+1 4944 JKB=JKA-1+JPP 4945 IF(IPRINT.EQ.'ON' .AND. LARGE.EQ.2)THEN 4946CNIST WRITE(LUB,9030)JK,NS(JK),NR(JK),(X(J),J=JKA,JKB) 4947C9030 FORMAT(/1X,I8,I5,I7,2X,5F11.2,20(/23X,5F11.2)) 4948 WRITE(ICOUT,999) 4949 CALL DPWRST('XXX','BUG ') 4950 JSTOP=JKA+4 4951 IF(JSTOP.GT.JKB)JSTOP=JKB 4952 WRITE(ICOUT,9030)JK,NS(JK),NR(JK),(X(J),J=JKA,JSTOP) 4953 9030 FORMAT(1X,I8,I5,I7,2X,5F11.2) 4954 CALL DPWRST('XXX','BUG ') 4955 ENDIF 4956 IF(JKB.GT.JKA+4)THEN 4957 JSTRT=JKA+5 4958 NTEMP=JKB - JSTRT + 1 4959 NLOOP=NTEMP/5 4960 IF(MOD(NTEMP,5).GT.0)NLOOP=NLOOP+1 4961 DO9031L=1,NLOOP 4962 JSTOP=JSTRT+4 4963 IF(JSTOP.GT.JKB)JSTOP=JKB 4964 IF(IPRINT.EQ.'ON' .AND. LARGE.EQ.2)THEN 4965 WRITE(ICOUT,9032)(X(J),J=JSTRT,JSTOP) 4966 9032 FORMAT(23X,5F11.2) 4967 CALL DPWRST('XXX','BUG ') 4968 ENDIF 4969 JSTRT=JSTRT+5 4970 9031 CONTINUE 4971 ENDIF 4972 100 CONTINUE 4973 DO 101 J=1,KK 4974 RNS=NS(J) 4975 TTD(J)=TTD(J)/RNS 4976 101 CONTINUE 4977CNIST WRITE(LUB,9040)(TTD(J),J=1,KK) 4978C9040 FORMAT(/33H AVERAGE DISTANCE TO EACH MEDOID,6(/2X,5F12.3)) 4979CNIST WRITE(LUB,9050)(RADUS(J),J=1,KK) 4980C9050 FORMAT(/33H MAXIMUM DISTANCE TO EACH MEDOID,6(/2X,5F12.3)) 4981 IF(IPRINT.EQ.'ON' .AND. LARGE.EQ.2)THEN 4982 WRITE(ICOUT,999) 4983 CALL DPWRST('XXX','BUG ') 4984 WRITE(ICOUT,9040) 4985 9040 FORMAT(' AVERAGE DISTANCE TO EACH MEDOID') 4986 CALL DPWRST('XXX','BUG ') 4987 NLOOP=KK/5 4988 IF(MOD(KK,5).GT.0)NLOOP=NLOOP+1 4989 JSTRT=1 4990 DO9041L=1,NLOOP 4991 JSTOP=JSTRT+1 4992 IF(JSTOP.GT.KK)JSTOP=KK 4993 WRITE(ICOUT,9042)(TTD(J),J=JSTRT,JSTOP) 4994 9042 FORMAT(2X,5F11.2) 4995 CALL DPWRST('XXX','BUG ') 4996 JSTRT=JSTRT+5 4997 9041 CONTINUE 4998 WRITE(ICOUT,999) 4999 CALL DPWRST('XXX','BUG ') 5000 WRITE(ICOUT,9050) 5001 9050 FORMAT(' MAXIMUM DISTANCE TO EACH MEDOID') 5002 CALL DPWRST('XXX','BUG ') 5003 JSTRT=1 5004 DO9051L=1,NLOOP 5005 JSTOP=JSTRT+1 5006 IF(JSTOP.GT.KK)JSTOP=KK 5007 WRITE(ICOUT,9042)(RADUS(J),J=JSTRT,JSTOP) 5008 CALL DPWRST('XXX','BUG ') 5009 JSTRT=JSTRT+5 5010 9051 CONTINUE 5011 ENDIF 5012C 5013 IF(KK.GT.1)THEN 5014CC 5015CC COMPUTATION OF MINIMAL DISTANCE OF MEDOID KA TO ANY 5016CC OTHER MEDOID FOR COMPARISON WITH THE RADIUS OF CLUSTER KA. 5017CC 5018 DO 120 KA=1,KK 5019 NSTRT=0 5020 NPA=NP(KA) 5021 DO 110 KB=1,KK 5022 IF(KB.EQ.KA)GO TO 110 5023 NPB=NP(KB) 5024 NPAB=MEET(NPA,NPB) 5025 IF(NSTRT.EQ.0)THEN 5026 NSTRT=1 5027 ELSE 5028 IF(DYS(NPAB).GE.RATT(KA))GO TO 110 5029 ENDIF 5030 RATT(KA)=DYS(NPAB) 5031 IF(RATT(KA).NE.0.)GO TO 110 5032 IF(IPRINT.EQ.'ON' .AND. LARGE.EQ.2)THEN 5033CNIST WRITE(LUB,9054)KA,KB 5034C9054 FORMAT(/51H THE DISSIMILARITY BETWEEN THE MEDOIDS OF CLUSTERS, 5035CNISTF I3,5H AND ,I3,9H IS ZERO.) 5036CNIST WRITE(LUB,9056) 5037C9056 FORMAT(' IN THE FOLLOWING VECTOR A VALUE OF -1 IS GIVEN TO', 5038CNISTF ' BOTH CLUSTERS.') 5039 WRITE(ICOUT,999) 5040 CALL DPWRST('XXX','BUG ') 5041 WRITE(ICOUT,9054) 5042 9054 FORMAT('THE DISSIMILARITY BETWEEN THE MEDOIDS OF ', 5043 1 'CLUSTERS',I3,' AND ',I3,' IS ZERO.') 5044 CALL DPWRST('XXX','BUG ') 5045 WRITE(ICOUT,9056) 5046 9056 FORMAT('IN THE FOLLOWING VECTOR A VALUE OF -1 IS GIVEN ', 5047 1 'TO BOTH CLUSTERS.') 5048 CALL DPWRST('XXX','BUG ') 5049 ENDIF 5050C 5051 RATT(KA)=-1. 5052 110 CONTINUE 5053 IF(RATT(KA).GT.(-0.5))RATT(KA)=RADUS(KA)/RATT(KA) 5054 120 CONTINUE 5055CNIST WRITE(LUB,9060)(RATT(J),J=1,KK) 5056C9060 FORMAT(/49H MAXIMUM DISTANCE TO A MEDOID DIVIDED BY MINIMUM/ 5057CNISTF 42H DISTANCE OF THE MEDOID TO ANOTHER MEDOID,6(/2X,5F12.3)) 5058 IF(IPRINT.EQ.'ON' .AND. LARGE.EQ.2)THEN 5059 WRITE(ICOUT,9060) 5060 9060 FORMAT(' MAXIMUM DISTANCE TO A MEDOID DIVIDED BY MINIMUM') 5061 CALL DPWRST('XXX','BUG ') 5062 WRITE(ICOUT,9061) 5063 9061 FORMAT(' DISTANCE OF THE MEDOID TO ANOTHER MEDOID') 5064 CALL DPWRST('XXX','BUG ') 5065 NLOOP=KK/5 5066 IF(MOD(KK,5).GT.0)NLOOP=NLOOP+1 5067 JSTRT=1 5068 DO9063L=1,NLOOP 5069 JSTOP=JSTRT+1 5070 IF(JSTOP.GT.KK)JSTOP=KK 5071 WRITE(ICOUT,9064)(RATT(J),J=JSTRT,JSTOP) 5072 9064 FORMAT(2X,5F11.2) 5073 CALL DPWRST('XXX','BUG ') 5074 JSTRT=JSTRT+5 5075 9063 CONTINUE 5076 ENDIF 5077 ENDIF 5078C 5079 9090 CONTINUE 5080 RETURN 5081 END 5082 SUBROUTINE RESUL(KK,NN,JPP,LARGE,NDYST,X,NRX,AMISS,IC1,IOUNI1) 5083CNIST SUBROUTINE RESUL(KK,NN,JPP,LARGE,NDYST,LUB,MDATA,JTMD, 5084CNIST1 VALMD,X,MAXXX,MAXTT,NRX) 5085C 5086C THE FOLLOWING CHANGES WERE MADE TO INCORPORATE INTO DATAPLOT 5087C 5088C 1. USE DATAPLOT OUTPUT 5089C 2. RECODE A BIT FOR BETTER READABILITY 5090C 5091C KK = NUMBER OF CLUSTERS 5092C NN = NUMBER OF ROWS (CASES) 5093C JPP = NUMBER OF COLUMNS (VARIABLES) 5094C LARGE = SPECIFY WHAT WILL BE OUTPUT 5095C NDYST = 1 => EUCLIDEAN DISTANCES 5096C 2 => MANHATTAN (= CITY BLOCK) DISTANCES 5097C LUB = OUTPUT UNIT (WE DON'T USE) 5098C MDATA = MISSING VALUES PRESENT (WE DON'T USE) 5099C JTMD = FOR MISSING VALUES (WE DON'T USE) 5100C VALMD = FOR MISSING VALUES (WE DON'T USE) 5101C X = THE DATA MATRIX 5102C MAXXX = THE MAXIMUM OF ROWS TIMES COLUMNS (WE DON'T USE) 5103C MAXTT = THE MAXIMUM NUMBER OF VARIABLES (COLUMNS) (WE DON'T USE) 5104C 5105 DIMENSION X(*) 5106 DIMENSION NRX(*) 5107 DIMENSION IC1(*) 5108C 5109 PARAMETER (LYNF=25) 5110 DIMENSION LYNE(LYNF) 5111C 5112 INCLUDE 'DPCOP2.INC' 5113C 5114 JKSKY=0 5115 PP=JPP 5116 DNULL=0.0 5117CC 5118CC CLUSTERING VECTOR IS INCORPORATED INTO X, AND PRINTED. 5119CC 5120 JN=0 5121 100 CONTINUE 5122 JN=JN+1 5123 NJNB=(JN-1)*JPP 5124 DO 145 JK=1,KK 5125 IF(NRX(JK).EQ.JN)GO TO 220 5126 145 CONTINUE 5127 JNA=(JN-1)*JPP+1 5128 DO 190 JK=1,KK 5129 DSUM=0. 5130 NRJK=(NRX(JK)-1)*JPP 5131 ABC=0. 5132 DO 180 J=1,JPP 5133 NA=NRJK+J 5134 NB=NJNB+J 5135 IF(X(NA).EQ.AMISS .OR. X(NB).EQ.AMISS)GO TO 180 5136 ABC=ABC+1. 5137 TRA=ABS(X(NA)-X(NB)) 5138 IF(NDYST.EQ.1)TRA=TRA*TRA 5139 DSUM=DSUM+TRA 5140 180 CONTINUE 5141 IF(NDYST.EQ.1)DSUM=SQRT(DSUM) 5142 DSUM=DSUM*ABC/PP 5143 IF(JK.EQ.1)DNULL=DSUM+0.1 5144 IF(DSUM.GE.DNULL)GO TO 190 5145 DNULL=DSUM 5146 JKSKY=JK 5147 190 CONTINUE 5148C 5149 X(JNA)=JKSKY 5150 220 CONTINUE 5151 IF(JN.LT.NN)GO TO 100 5152C 5153 DO 230 JK=1,KK 5154 NRJK=NRX(JK) 5155 NRJKA=(NRJK-1)*JPP+1 5156 X(NRJKA)=JK 5157 230 CONTINUE 5158CNIST WRITE(LUB,9110) 5159C9110 FORMAT(//2X,18H CLUSTERING VECTOR/3X,17(1H*)/) 5160 IF(IPRINT.EQ.'ON' .AND. LARGE.GE.1)THEN 5161 WRITE(ICOUT,999) 5162 999 FORMAT(1X) 5163 CALL DPWRST('XXX','BUG ') 5164 WRITE(ICOUT,999) 5165 CALL DPWRST('XXX','BUG ') 5166 WRITE(ICOUT,9110) 5167 9110 FORMAT(' CLUSTERING VECTOR') 5168 CALL DPWRST('XXX','BUG ') 5169 WRITE(ICOUT,9111) 5170 9111 FORMAT(' *****************') 5171 CALL DPWRST('XXX','BUG ') 5172 WRITE(ICOUT,999) 5173 CALL DPWRST('XXX','BUG ') 5174 ENDIF 5175C 5176 ICNT=0 5177 MTEL=0 5178 MTELP=LYNF 5179 240 CONTINUE 5180 DO 250 J=1,MTELP 5181 MTEL=MTEL+1 5182 MTELA=(MTEL-1)*JPP+1 5183 LYNE(J)=INT(X(MTELA)) 5184 ICNT=ICNT+1 5185 IC1(ICNT)=LYNE(J) 5186 250 CONTINUE 5187CNIST WRITE(LUB,9120)(LYNE(J),J=1,MTELP) 5188C 5189 IF(IPRINT.EQ.'ON' .AND. LARGE.GE.1)THEN 5190 WRITE(ICOUT,9120)(LYNE(J),J=1,MTELP) 5191 9120 FORMAT(4X,25I3) 5192 CALL DPWRST('XXX','BUG ') 5193 ENDIF 5194 DO9125J=1,MTELP 5195 WRITE(IOUNI1,'(I7)')LYNE(J) 5196 9125 CONTINUE 5197C 5198 IF(MTEL.GE.NN)GO TO 300 5199 NNTEL=NN-MTEL 5200 IF(NNTEL.GE.LYNF)GO TO 240 5201 MTELP=NN-MTEL 5202 GO TO 240 5203CC 5204CC WHEN LARGE IS NOT ZERO, LIST OF ALL CLUSTER ELEMENTS IN ENTIRE 5205CC DATA SET IS GIVEN. 5206CC 5207 300 CONTINUE 5208 IF(LARGE.LE.0)GO TO 330 5209CNIST WRITE(LUB,9130) 5210C9130 FORMAT(//4X,27HCLUSTER SIZE MEDOID OBJECTS) 5211 IF(IPRINT.EQ.'ON')THEN 5212 WRITE(ICOUT,999) 5213 CALL DPWRST('XXX','BUG ') 5214 WRITE(ICOUT,999) 5215 CALL DPWRST('XXX','BUG ') 5216 WRITE(ICOUT,9130) 5217 9130 FORMAT(' CLUSTER SIZE MEDOID OBJECTS') 5218 CALL DPWRST('XXX','BUG ') 5219 ENDIF 5220C 5221 DO 320 KA=1,KK 5222 MTT=0 5223 J=0 5224 325 CONTINUE 5225 J=J+1 5226 JA=(J-1)*JPP+1 5227 NXJA=INT(X(JA)+0.1) 5228 IF(NXJA.EQ.KA)MTT=MTT+1 5229 IF(J.LT.NN)GO TO 325 5230CNIST WRITE(LUB,9140)KA,MTT,NRX(KA) 5231C9140 FORMAT(/3X,I8,I5,I7) 5232 IF(IPRINT.EQ.'ON')THEN 5233 WRITE(ICOUT,999) 5234 CALL DPWRST('XXX','BUG ') 5235 WRITE(ICOUT,9140)KA,MTT,NRX(KA) 5236 9140 FORMAT(3X,I8,I5,I7) 5237 CALL DPWRST('XXX','BUG ') 5238 ENDIF 5239C 5240 MTT=0 5241 J=0 5242 315 CONTINUE 5243 J=J+1 5244 JA=(J-1)*JPP+1 5245 NXJA=INT(X(JA)+0.1) 5246 IF(NXJA.NE.KA)GO TO 310 5247 MTT=MTT+1 5248 LYNE(MTT)=J 5249 IF(MTT.NE.10)GO TO 310 5250 MTT=0 5251C 5252CNIST WRITE(LUB,9150)(LYNE(JJ),JJ=1,10) 5253 IF(IPRINT.EQ.'ON')THEN 5254 WRITE(ICOUT,9150)(LYNE(JJ),JJ=1,10) 5255 9150 FORMAT(24X,10I5) 5256 CALL DPWRST('XXX','BUG ') 5257 ENDIF 5258C 5259 310 CONTINUE 5260 IF(J.LT.NN)GO TO 315 5261CNIST IF(MTT.NE.0)WRITE(LUB,9150)(LYNE(JJ),JJ=1,MTT) 5262 IF(MTT.NE.0 .AND. IPRINT.EQ.'ON')THEN 5263 WRITE(ICOUT,9150)(LYNE(JJ),JJ=1,MTT) 5264 CALL DPWRST('XXX','BUG ') 5265 ENDIF 5266 320 CONTINUE 5267 330 CONTINUE 5268 RETURN 5269 END 5270 SUBROUTINE CSTAT(KK,NN,NSEND,NREPR,RADUS,DAMER,TTD,SEPAR,S, 5271 1 DYS,NCLUV,NELEM,JPP,MAXNN,MAXPP,X,JDYSS,NSTAN, 5272 1 IOUNI2,ISUBRO,IBUGA3) 5273CNIST SUBROUTINE CSTAT(KK,NN,NSEND,NREPR,RADUS,DAMER,TTD,SEPAR,Z,S, 5274CNIST1 MAXHH,DYS,NCLUV,NELEM,JPP,MAXNN,MAXPP,X,LAB, 5275CNIST1 LUB,JDYSS,NSTAN) 5276C 5277C THE FOLLOWING CHANGES WERE MADE TO INCORPORATE INTO DATAPLOT 5278C 5279C 1. USE DATAPLOT OUTPUT 5280C 2. RECODE A BIT FOR BETTER READABILITY 5281C 5282C KK = NUMBER OF CLUSTERS 5283C NN = NUMBER OF ROWS (CASES) 5284C NSEND = 5285C NREPR = 5286C RADUS = MAXIMUM DISSIMILARITY TO EACH MEDOID 5287C DAMER = 5288C TTD = AVERAGE DISSIMILARITY TO EACH MEDOID 5289C SEPAR = 5290C Z = 5291C S = 5292C MAXHH = MAXIMUM DIMENSION FOR DISTANCES 5293C (MAXNN*(MAXNN-1)/2 + 1) 5294C EQUALS 4951 IN ROUSSEEUW 5295C DYS = VECTOR CONTAINING THE DISSIMILARITIES 5296C NCLUV = 5297C NELEM = 5298C JPP = NUMBER OF COLUMNS (VARIABLES) 5299C MAXNN = MAXIMUM NUMBER OF ROWS 5300C SET TO 100 BY ROUSSEEUW 5301C MAXPP = MAXIMUM NUMBER OF VARIABLES 5302C SET TO 20 BY ROUSSEEUW 5303C X = THE DATA MATRIX 5304C LAB = VARIABLE LABELS 5305C USE "001", "002", ETC. 5306C JDYSS = 1 => DISSIMILARITY MATRIX 5307C 1 = MEASUREMENT DATA 5308C NSTAN = 0 => NO STANDARDIZATION OF VARIABLES 5309C 1 => VARIABLES ARE STANDARDIZED 5310C LUB = OUTPUT UNIT (WE DON'T USE) 5311C 5312 DIMENSION SEPAR(*) 5313 DIMENSION DAMER(*) 5314 DIMENSION TTD(*) 5315 DIMENSION DYS(*) 5316 DIMENSION X(MAXNN,MAXPP) 5317C 5318 DIMENSION NCLUV(*) 5319 DIMENSION NSEND(*) 5320 DIMENSION NREPR(*) 5321 DIMENSION NELEM(*) 5322 DIMENSION RADUS(*) 5323C 5324CCCCC CHARACTER*1 LAB(3,MAXNN) 5325 CHARACTER*1 JDRAW(30) 5326 CHARACTER*3 LAB1 5327C 5328 CHARACTER*4 IBUGA3 5329 CHARACTER*4 ISUBRO 5330C 5331 INCLUDE 'DPCOP2.INC' 5332C 5333 IF(IBUGA3.EQ.'ON' .OR. ISUBRO.EQ.'YSTA')THEN 5334 WRITE(ICOUT,999) 5335 999 FORMAT(1X) 5336 CALL DPWRST('XXX','BUG ') 5337 WRITE(ICOUT,51) 5338 51 FORMAT('***** AT THE BEGINNING OF CSTAT--') 5339 CALL DPWRST('XXX','BUG ') 5340 WRITE(ICOUT,55)KK,NN 5341 55 FORMAT('KK,NN = ',2I8) 5342 CALL DPWRST('XXX','BUG ') 5343 ENDIF 5344C 5345 KSMAL=0 5346C 5347 DO 130 J=1,NN 5348 IF(NREPR(J).EQ.1)THEN 5349 NSEND(J)=J 5350 ELSE 5351 DSMAL=1.1*S+1.0 5352 DO 110 K=1,NN 5353 IF(NREPR(K).EQ.0)GO TO 110 5354 NJAJ=MEET(K,J) 5355 IF(DYS(NJAJ).GE.DSMAL)GO TO 110 5356 DSMAL=DYS(NJAJ) 5357 KSMAL=K 5358 110 CONTINUE 5359 NSEND(J)=KSMAL 5360 ENDIF 5361 130 CONTINUE 5362C 5363 JK=1 5364 NPLAC=NSEND(1) 5365 DO 135 J=1,NN 5366 NCLUV(J)=0 5367 IF(NSEND(J).EQ.NPLAC)NCLUV(J)=1 5368 135 CONTINUE 5369C 5370 DO 145 JA=2,NN 5371 NPLAC=NSEND(JA) 5372 IF(NCLUV(NPLAC).NE.0)GO TO 145 5373 JK=JK+1 5374 DO 140 J=2,NN 5375 IF(NSEND(J).EQ.NPLAC)NCLUV(J)=JK 5376 140 CONTINUE 5377 IF(JK.EQ.KK)GO TO 148 5378 145 CONTINUE 5379C 5380C ANALYSIS OF THE CLUSTERING. 5381C 5382 148 CONTINUE 5383 IF(IPRINT.EQ.'ON')THEN 5384 WRITE(ICOUT,999) 5385 CALL DPWRST('XXX','BUG ') 5386 WRITE(ICOUT,9200) 5387 9200 FORMAT('CLUSTERS ') 5388 CALL DPWRST('XXX','BUG ') 5389 WRITE(ICOUT,9201) 5390 9201 FORMAT(2X,' NUMBER MEDOID SIZE OBJECTS') 5391 CALL DPWRST('XXX','BUG ') 5392 ENDIF 5393C 5394 DO 160 NUMCL=1,KK 5395 NTT=0 5396 RADUS(NUMCL)=-1.0 5397 TTT=0.0 5398C 5399 DO 150 J=1,NN 5400 IF(NCLUV(J).NE.NUMCL)GO TO 150 5401 NTT=NTT+1 5402 M=NSEND(J) 5403 NELEM(NTT)=J 5404 NJM=MEET(J,M) 5405 TTT=TTT+DYS(NJM) 5406 IF(DYS(NJM).GT.RADUS(NUMCL))RADUS(NUMCL)=DYS(NJM) 5407 150 CONTINUE 5408C 5409 RTT=NTT 5410 TTD(NUMCL)=TTT/RTT 5411 NSS=NTT 5412 IF(NSS.GT.10)NSS=10 5413 DO 152 L=1,NSS 5414 LEEN=3*(L-1)+1 5415 LTWE=3*(L-1)+2 5416 LDRE=3*L 5417 NCASE=NELEM(L) 5418 LAB1='000' 5419 WRITE(LAB1,'(I3)')NCASE 5420 JDRAW(LEEN)=LAB1(1:1) 5421 JDRAW(LTWE)=LAB1(2:2) 5422 JDRAW(LDRE)=LAB1(3:3) 5423 152 CONTINUE 5424C 5425 NSSDR=NSS*3 5426CNIST WRITE(LUB,9210)NUMCL,LAB(1,M),LAB(2,M),LAB(3,M),NTT, 5427CNIST1 (JDRAW(K),K=1,NSSDR) 5428C9210 FORMAT(/1X,I5,6X,3A1,2X,I6,5X,10(3A1,1X)) 5429 IF(IPRINT.EQ.'ON')THEN 5430 WRITE(ICOUT,999) 5431 CALL DPWRST('XXX','BUG ') 5432 LAB1='000' 5433 WRITE(LAB1,'(I3)')M 5434 WRITE(ICOUT,9210)NUMCL,LAB1,NTT, 5435 1 (JDRAW(K),K=1,NSSDR) 5436 9210 FORMAT(I5,6X,A3,2X,I6,5X,10(3A1,1X)) 5437 CALL DPWRST('XXX','BUG ') 5438 ENDIF 5439C 5440 IF(NTT.LE.10)GO TO 160 5441 KAUNT=0 5442 DO 154 L=11,NTT 5443 KAUNT=KAUNT+1 5444 LEEN=3*(KAUNT-1)+1 5445 LTWE=3*(KAUNT-1)+2 5446 LDRE=3*KAUNT 5447 NCASE=NELEM(L) 5448 LAB1='000' 5449 WRITE(LAB1,'(I3)')NCASE 5450 JDRAW(LEEN)=LAB1(1:1) 5451 JDRAW(LTWE)=LAB1(2:2) 5452 JDRAW(LDRE)=LAB1(3:3) 5453 IF(KAUNT.EQ.10)THEN 5454 IF(IPRINT.EQ.'ON')THEN 5455 WRITE(ICOUT,9215)(JDRAW(K),K=1,30) 5456 9215 FORMAT(28X,10(3A1,1X)) 5457 CALL DPWRST('XXX','BUG ') 5458 ENDIF 5459 KAUNT=0 5460 ENDIF 5461 154 CONTINUE 5462C 5463 IF(KAUNT.GE.1 .AND. IPRINT.EQ.'ON')THEN 5464 WRITE(ICOUT,9215)(JDRAW(K),K=1,LDRE) 5465 CALL DPWRST('XXX','BUG ') 5466 ENDIF 5467 160 CONTINUE 5468C 5469 IF(JDYSS.NE.1)THEN 5470 IF(IPRINT.EQ.'ON')THEN 5471 WRITE(ICOUT,999) 5472 CALL DPWRST('XXX','BUG ') 5473 WRITE(ICOUT,999) 5474 CALL DPWRST('XXX','BUG ') 5475 IF(NSTAN.EQ.0)THEN 5476 WRITE(ICOUT,9220) 5477 9220 FORMAT('COORDINATES OF MEDOIDS') 5478 CALL DPWRST('XXX','BUG ') 5479 WRITE(ICOUT,9221) 5480 9221 FORMAT('**********************') 5481 CALL DPWRST('XXX','BUG ') 5482 WRITE(ICOUT,999) 5483 CALL DPWRST('XXX','BUG ') 5484 ELSEIF(NSTAN.EQ.1)THEN 5485 WRITE(ICOUT,9230) 5486 9230 FORMAT('COORDINATES OF MEDOIDS (USING STANDARDIZED ', 5487 1 'MEASUREMENTS') 5488 CALL DPWRST('XXX','BUG ') 5489 WRITE(ICOUT,9231) 5490 9231 FORMAT( 5491 1 '*******************************************************') 5492 CALL DPWRST('XXX','BUG ') 5493 ENDIF 5494 ENDIF 5495 IF(IPRINT.EQ.'ON')THEN 5496 WRITE(ICOUT,999) 5497 CALL DPWRST('XXX','BUG ') 5498 ENDIF 5499C 5500 DO 210 NUMCL=1,KK 5501 DO 220 L=1,NN 5502 IF(NCLUV(L).EQ.NUMCL)GO TO 225 5503 220 CONTINUE 5504 225 CONTINUE 5505 M=NSEND(L) 5506 LAB1='000' 5507 WRITE(LAB1,'(I3)')M 5508 ILOOP=JPP/8 5509 IREM=MOD(JPP,8) 5510 IF(IREM.GT.0)ILOOP=ILOOP+1 5511 DO9243II=1,ILOOP 5512 ISTRT=(II-1)*8 + 1 5513 ISTOP=II*8 5514 IF(ISTOP.GT.JPP)ISTOP=JPP 5515 IF(IPRINT.EQ.'ON')THEN 5516 WRITE(ICOUT,9240)LAB1,(X(M,J),J=ISTRT,ISTOP) 5517 9240 FORMAT(1X,A3,2X,8F9.2) 5518 CALL DPWRST('XXX','BUG ') 5519 ENDIF 5520 WRITE(IOUNI2,9240)LAB1,(X(M,J),J=ISTRT,ISTOP) 5521 9243 CONTINUE 5522 210 CONTINUE 5523 ENDIF 5524C 5525 RNN=NN 5526 IF(KK.EQ.1)THEN 5527 DAMER(1)=S 5528 GO TO 300 5529 ENDIF 5530C 5531 IF(IPRINT.EQ.'ON')THEN 5532 WRITE(ICOUT,999) 5533 CALL DPWRST('XXX','BUG ') 5534 WRITE(ICOUT,9270) 5535 9270 FORMAT('CLUSTERING VECTOR') 5536 CALL DPWRST('XXX','BUG ') 5537 WRITE(ICOUT,9271) 5538 9271 FORMAT('*****************') 5539 CALL DPWRST('XXX','BUG ') 5540 WRITE(ICOUT,999) 5541 CALL DPWRST('XXX','BUG ') 5542C 5543 WRITE(ICOUT,9280)(NCLUV(J),J=1,NN) 5544 9280 FORMAT(11X,50(20I3/11X)) 5545 CALL DPWRST('XXX','BUG ') 5546C 5547 WRITE(ICOUT,999) 5548 CALL DPWRST('XXX','BUG ') 5549 WRITE(ICOUT,999) 5550 CALL DPWRST('XXX','BUG ') 5551 WRITE(ICOUT,9290) 5552 9290 FORMAT('CLUSTERING CHARACTERISTICS') 5553 CALL DPWRST('XXX','BUG ') 5554 WRITE(ICOUT,9291) 5555 9291 FORMAT('**************************') 5556 CALL DPWRST('XXX','BUG ') 5557 ENDIF 5558CC 5559CC NUML = NUMBER OF L-CLUSTERS. 5560CC 5561 NUML=0 5562 DO 40 K=1,KK 5563CC 5564CC IDENTIFICATION OF CLUSTER K: 5565CC NEL=NUMBER OF OBJECTS 5566CC NELEM=VECTOR OF OBJECTS 5567CC 5568 NEL=0 5569C 5570 DO 23 J=1,NN 5571 IF(NCLUV(J).NE.K)GO TO 23 5572 NEL=NEL+1 5573 NELEM(NEL)=J 5574 23 CONTINUE 5575C 5576 IF(NEL.EQ.1)THEN 5577 NVN=NELEM(1) 5578 DAMER(K)=0. 5579 SEPAR(K)=1.1*S+1.0 5580 DO 250 J=1,NN 5581 IF(J.EQ.NVN)GO TO 250 5582 MEVJ=MEET(NVN,J) 5583 IF(SEPAR(K).GT.DYS(MEVJ))SEPAR(K)=DYS(MEVJ) 5584 250 CONTINUE 5585CC 5586CC IS CLUSTER K 1) AN L-CLUSTER ? 5587CC 2) AN L*-CLUSTER ? 5588CC 5589 LAB1='000' 5590 WRITE(LAB1,'(I3)')NVN 5591 IF(SEPAR(K).NE.0.)THEN 5592 NUML=NUML+1 5593 IF(IPRINT.EQ.'ON')THEN 5594 WRITE(ICOUT,9310)K 5595 CALL DPWRST('XXX','BUG ') 5596 WRITE(ICOUT,9320)LAB1 5597 9320 FORMAT(8X,' IT IS A SINGLETON CONSISTING OF OBJECT ',A3) 5598 CALL DPWRST('XXX','BUG ') 5599 WRITE(ICOUT,9321)SEPAR(K) 5600 9321 FORMAT(8X,' ITS SEPARATION = ',F11.2) 5601 CALL DPWRST('XXX','BUG ') 5602 WRITE(ICOUT,999) 5603 CALL DPWRST('XXX','BUG ') 5604 ENDIF 5605 ELSE 5606 IF(IPRINT.EQ.'ON')THEN 5607 WRITE(ICOUT,9324)K,LAB1 5608 9324 FORMAT(' CLUSTER ',I4,' IS A SINGLETON CONSISTING OF', 5609 1 ' OBJECT ',A3,'. IT IS NOT ISOLATED.') 5610 CALL DPWRST('XXX','BUG ') 5611 WRITE(ICOUT,9326) 5612 9326 FORMAT(' ** IT IS NOT ADVISABLE TO DIVIDE THE DATA INTO', 5613 1 ' SO MANY CLUSTERS.') 5614 CALL DPWRST('XXX','BUG ') 5615 WRITE(ICOUT,999) 5616 CALL DPWRST('XXX','BUG ') 5617 ENDIF 5618 ENDIF 5619C 5620 ELSE 5621 DAM=-1. 5622 SEP=1.1*S+1.0 5623 KAND=1 5624C 5625 DO 26 JA=1,NEL 5626 NVNA=NELEM(JA) 5627 AJA=-1. 5628 AJB=1.1*S+1.0 5629 DO 25 JB=1,NN 5630 JNDZ=MEET(NVNA,JB) 5631 IF(NCLUV(JB).EQ.K)GO TO 30 5632 IF(DYS(JNDZ).LT.AJB)AJB=DYS(JNDZ) 5633 GO TO 25 5634 30 CONTINUE 5635 IF(DYS(JNDZ).GT.AJA)AJA=DYS(JNDZ) 5636 25 CONTINUE 5637 IF(AJA.GE.AJB)KAND=0 5638 IF(DAM.LT.AJA)DAM=AJA 5639 IF(SEP.GT.AJB)SEP=AJB 5640 26 CONTINUE 5641C 5642 SEPAR(K)=SEP 5643 DAMER(K)=DAM 5644 IF(KAND.EQ.0)GO TO 40 5645CC 5646CC DIAMETER AND SEPARATION OF ISOLATED CLUSTERS 5647CC 5648 IF(IPRINT.EQ.'ON')THEN 5649 WRITE(ICOUT,9310)K 5650 9310 FORMAT('CLUSTER ',I4,' IS ISOLATED') 5651 CALL DPWRST('XXX','BUG ') 5652 WRITE(ICOUT,9330)DAM,SEP 5653 9330 FORMAT(8X,' WITH DIAMETER =',F11.2,' AND SEPARATION =', 5654 1 F11.2) 5655 CALL DPWRST('XXX','BUG ') 5656 ENDIF 5657 NUML=NUML+1 5658 IF(DAM.LT.SEP)THEN 5659 IF(IPRINT.EQ.'ON')THEN 5660 WRITE(ICOUT,9350) 5661 9350 FORMAT(8X,' THEREFORE IT IS AN L*-CLUSTER.') 5662 CALL DPWRST('XXX','BUG ') 5663 WRITE(ICOUT,999) 5664 CALL DPWRST('XXX','BUG ') 5665 ENDIF 5666 ELSE 5667 IF(IPRINT.EQ.'ON')THEN 5668 WRITE(ICOUT,9340) 5669 9340 FORMAT(8X,' IT IS AN L-CLUSTER.') 5670 WRITE(ICOUT,999) 5671 CALL DPWRST('XXX','BUG ') 5672 ENDIF 5673 ENDIF 5674C 5675 ENDIF 5676C 5677 40 CONTINUE 5678C 5679 IF(IPRINT.EQ.'ON')THEN 5680 IF(NUML.EQ.0)THEN 5681 WRITE(ICOUT,9360) 5682 9360 FORMAT(' THERE ARE NO ISOLATED CLUSTERS') 5683 CALL DPWRST('XXX','BUG ') 5684 ELSEIF(NUML.GE.1)THEN 5685 WRITE(ICOUT,9365)NUML 5686 9365 FORMAT(' THE NUMBER OF ISOLATED CLUSTERS = ',I4) 5687 CALL DPWRST('XXX','BUG ') 5688 ENDIF 5689 ENDIF 5690C 5691 300 CONTINUE 5692C 5693 IF(IPRINT.EQ.'ON')THEN 5694 WRITE(ICOUT,999) 5695 CALL DPWRST('XXX','BUG ') 5696 WRITE(ICOUT,9370) 5697 9370 FORMAT(' DIAMETER OF EACH CLUSTER') 5698 CALL DPWRST('XXX','BUG ') 5699 ILOOP=KK/8 5700 IREM=MOD(KK,8) 5701 IF(IREM.GT.0)ILOOP=ILOOP+1 5702 DO9375II=1,ILOOP 5703 ISTRT=(II-1)*8 + 1 5704 ISTOP=II*8 5705 IF(ISTOP.GT.KK)ISTOP=KK 5706 WRITE(ICOUT,9371)(DAMER(J),J=ISTRT,ISTOP) 5707 9371 FORMAT(2X,8F9.2) 5708 CALL DPWRST('XXX','BUG ') 5709 9375 CONTINUE 5710 ENDIF 5711C 5712 IF(KK.NE.1)THEN 5713 IF(IPRINT.EQ.'ON')THEN 5714 WRITE(ICOUT,999) 5715 CALL DPWRST('XXX','BUG ') 5716 WRITE(ICOUT,9380) 5717 9380 FORMAT(' SEPARATION OF EACH CLUSTER') 5718 CALL DPWRST('XXX','BUG ') 5719 ILOOP=KK/8 5720 IREM=MOD(KK,8) 5721 IF(IREM.GT.0)ILOOP=ILOOP+1 5722 DO9385II=1,ILOOP 5723 ISTRT=(II-1)*8 + 1 5724 ISTOP=II*8 5725 IF(ISTOP.GT.KK)ISTOP=KK 5726 WRITE(ICOUT,9381)(SEPAR(J),J=ISTRT,ISTOP) 5727 9381 FORMAT(2X,8F9.2) 5728 CALL DPWRST('XXX','BUG ') 5729 9385 CONTINUE 5730 ENDIF 5731 ENDIF 5732 IF(IPRINT.EQ.'ON')THEN 5733 WRITE(ICOUT,999) 5734 CALL DPWRST('XXX','BUG ') 5735 WRITE(ICOUT,9390) 5736 9390 FORMAT(' AVERAGE DISSIMILARITY TO EACH MEDOID') 5737 CALL DPWRST('XXX','BUG ') 5738 ILOOP=KK/8 5739 IREM=MOD(KK,8) 5740 IF(IREM.GT.0)ILOOP=ILOOP+1 5741 DO9395II=1,ILOOP 5742 ISTRT=(II-1)*8 + 1 5743 ISTOP=II*8 5744 IF(ISTOP.GT.KK)ISTOP=KK 5745 WRITE(ICOUT,9391)(TTD(J),J=ISTRT,ISTOP) 5746 9391 FORMAT(2X,8F9.2) 5747 CALL DPWRST('XXX','BUG ') 5748 9395 CONTINUE 5749 WRITE(ICOUT,999) 5750 CALL DPWRST('XXX','BUG ') 5751 WRITE(ICOUT,9400) 5752 9400 FORMAT(' MAXIMUM DISSIMILARITY TO EACH MEDOID') 5753 CALL DPWRST('XXX','BUG ') 5754 ILOOP=KK/8 5755 IREM=MOD(KK,8) 5756 IF(IREM.GT.0)ILOOP=ILOOP+1 5757 DO9405II=1,ILOOP 5758 ISTRT=(II-1)*8 + 1 5759 ISTOP=II*8 5760 IF(ISTOP.GT.KK)ISTOP=KK 5761 WRITE(ICOUT,9401)(RADUS(J),J=1,KK) 5762 9401 FORMAT(2X,8F9.2) 5763 CALL DPWRST('XXX','BUG ') 5764 9405 CONTINUE 5765 ENDIF 5766C 5767 IF(IBUGA3.EQ.'ON' .OR. ISUBRO.EQ.'YSTA')THEN 5768 WRITE(ICOUT,999) 5769 CALL DPWRST('XXX','BUG ') 5770 WRITE(ICOUT,9051) 5771 9051 FORMAT('***** AT THE END OF CSTAT--') 5772 CALL DPWRST('XXX','BUG ') 5773 ENDIF 5774C 5775 RETURN 5776 END 5777 SUBROUTINE SUPCL(DYS,KKA,KKB,AREST,NER) 5778CNIST SUBROUTINE SUPCL(MAXHH,DYS,KKA,KKB,AREST,MAXNN,NER) 5779C 5780C DYS = VECTOR OF DISTANCES 5781C KKA = ... 5782C KKB = ... 5783C AREST = ... 5784C NER = ... 5785C 5786 DIMENSION DYS(*) 5787 DIMENSION NER(*) 5788C 5789 KKC=KKB-1 5790 AREST=0. 5791 DO 20 L=KKA,KKC 5792 LNER=NER(L) 5793 KKD=L+1 5794 DO 10 J=KKD,KKB 5795 JNER=NER(J) 5796 MLJ=MEET(LNER,JNER) 5797 IF(DYS(MLJ).GT.AREST)AREST=DYS(MLJ) 5798 10 CONTINUE 5799 20 CONTINUE 5800C 5801 RETURN 5802 END 5803 SUBROUTINE AVERL(NN,KWAN,NER,BAN,DYS, 5804 1 NCLUT,LAT,LBT,BANLAT,BANLBT, 5805 1 IOUNI2,IOUNI3,IOUNI4,IAGNME,ISUBRO,IBUGA3) 5806CNIST SUBROUTINE AVERL(NN,MAXNN,KWAN,NER,BAN,MAXHH,DYS,LUB) 5807C 5808C NN = NUMBER OF OBJECTS 5809C MAXNN = MAXIMUM NUMBER OF OBJECTS (DATAPLOT DOES NOT USE) 5810C KWAN = NUMBER OF OBJECTS IN EACH CLUSTER 5811C NER = FINAL ORDERING OF OBJECTS 5812C BAN = DISSIMILARITIES BETWEEN CLUSTERS 5813C MAXHH = MAXIMUM NUMBER OF DISSIMILARITIES (DATAPLOT DOES NOT USE) 5814C DYS = VECTOR OF DISSIMILARITIES 5815C LUB = OUTOPUT UNIT (DATAPLOT DOES NOT USE) 5816C 5817 DIMENSION DYS(*) 5818 DIMENSION BAN(*) 5819 DIMENSION BANLAT(*) 5820 DIMENSION BANLBT(*) 5821C 5822 DIMENSION NER(*) 5823 DIMENSION KWAN(*) 5824 DIMENSION NCLUT(*) 5825 DIMENSION LAT(*) 5826 DIMENSION LBT(*) 5827C 5828 CHARACTER*4 IAGNME 5829 CHARACTER*4 ISUBRO 5830 CHARACTER*4 IBUGA3 5831C 5832 INCLUDE 'DPCOP2.INC' 5833C 5834C INITIALIZATION 5835C 5836C NCLU = NUMBER OF CLUSTERS 5837C KWAN = NUMBER OF OBJECTS IN EACH CLUSTER 5838C NER = OBJECT ID'S FOR THE CLUSTER 5839C 5840 IF(IBUGA3.EQ.'ON')THEN 5841 WRITE(ICOUT,5)NN 5842 5 FORMAT('BEGINING OF AVERL: NN = ',I6) 5843 CALL DPWRST('XXX','BUG ') 5844 ENDIF 5845C 5846 FC=0.0 5847 ICNT=0 5848 NCLU=NN-1 5849 DO 10 L=1,NN 5850 KWAN(L)=1 5851 NER(L)=L 5852 10 CONTINUE 5853CC 5854CC FIND CLOSEST CLUSTERS 5855CC 5856 100 CONTINUE 5857 J=1 5858C 5859 80 CONTINUE 5860 J=J+1 5861 IF(KWAN(J).EQ.0)GOTO 80 5862 NEJ=MEET(1,J) 5863 SMALD=DYS(NEJ)*1.1+1.0 5864 NNS=NN-1 5865 DO 120 L=1,NNS 5866 IF(KWAN(L).EQ.0)GO TO 120 5867 LMUCH=L+1 5868 DO 110 J=LMUCH,NN 5869 IF(KWAN(J).EQ.0)GO TO 110 5870 NLJ=MEET(L,J) 5871 IF(DYS(NLJ).GT.SMALD)GO TO 110 5872 SMALD=DYS(NLJ) 5873 LA=L 5874 LB=J 5875 110 CONTINUE 5876 120 CONTINUE 5877CC 5878CC DETERMINE LFYRS AND LLAST 5879CC 5880 DO 200 L=1,NN 5881 IF(NER(L).EQ.LA)LFYRS=L 5882 IF(NER(L).EQ.LB)LLAST=L 5883 200 CONTINUE 5884 BAN(LLAST)=SMALD 5885CC 5886CC IF THE TWO CLUSTERS ARE NEXT TO EACH OTHER, 5887CC NER MUST NOT BE CHANGED 5888CC 5889 LNEXT=LFYRS+KWAN(LA) 5890 IF(LNEXT.NE.LLAST)THEN 5891CC 5892CC UPDATING NER AND BAN 5893CC 5894 LPUT=LFYRS+KWAN(LA) 5895 LNUM=LLAST-LPUT 5896 DO 220 L=1,LNUM 5897 LKA=NER(LPUT) 5898 AKB=BAN(LPUT) 5899 LENDA=LLAST+KWAN(LB)-2 5900 LENDB=LENDA+1 5901 DO 210 J=LPUT,LENDA 5902 NER(J)=NER(J+1) 5903 BAN(J)=BAN(J+1) 5904 210 CONTINUE 5905 NER(LENDB)=LKA 5906 BAN(LENDB)=AKB 5907 220 CONTINUE 5908 ENDIF 5909CC 5910CC CALCULATE NEW DISSIMILARITIES 5911CC 5912CC SUPPORT FOR DIFFERENT METHODS 5913CC 5914 IF(IAGNME.EQ.'SING')THEN 5915 DO 241 LQ=1,NN 5916 IF(LQ.EQ.LA.OR.LQ.EQ.LB)GO TO 241 5917 IF(KWAN(LQ).EQ.0)GO TO 241 5918 NAQ=MEET(LA,LQ) 5919 NBQ=MEET(LB,LQ) 5920 DNEW=DYS(NAQ) 5921 IF(DYS(NBQ).LT.DNEW)DNEW=DYS(NBQ) 5922 DYS(NAQ)=DNEW 5923 241 CONTINUE 5924 ELSEIF(IAGNME.EQ.'COMP')THEN 5925 DO 242 LQ=1,NN 5926 IF(LQ.EQ.LA.OR.LQ.EQ.LB)GO TO 242 5927 IF(KWAN(LQ).EQ.0)GO TO 242 5928 NAQ=MEET(LA,LQ) 5929 NBQ=MEET(LB,LQ) 5930 DNEW=DYS(NAQ) 5931 IF(DNEW.LT.DYS(NBQ))DNEW=DYS(NBQ) 5932 DYS(NAQ)=DNEW 5933 242 CONTINUE 5934 ELSEIF(IAGNME.EQ.'CENT')THEN 5935 DO 243 LQ=1,NN 5936 IF(LQ.EQ.LA.OR.LQ.EQ.LB)GO TO 243 5937 IF(KWAN(LQ).EQ.0)GO TO 243 5938 TA=KWAN(LA) 5939 TB=KWAN(LB) 5940 FA=TA/(TA+TB) 5941 FB=TB/(TA+TB) 5942 NAQ=MEET(LA,LQ) 5943 NBQ=MEET(LB,LQ) 5944 NAB=MEET(LA,LB) 5945 D=FA*DYS(NAQ)*DYS(NAQ) + FB*DYS(NBQ)*DYS(NBQ) 5946 D=D + FC*DYS(NAB)*DYS(NAB) 5947 DYS(NAQ)=SQRT(D) 5948 243 CONTINUE 5949 ELSEIF(IAGNME.EQ.'WARD')THEN 5950 DO 244 LQ=1,NN 5951 IF(LQ.EQ.LA.OR.LQ.EQ.LB)GO TO 244 5952 IF(KWAN(LQ).EQ.0)GO TO 244 5953 TA=KWAN(LA) 5954 TB=KWAN(LB) 5955 TQ=KWAN(LQ) 5956 FA=(TA + TQ)/(TA + TB + TQ) 5957 FB=(TB + TQ)/(TA + TB + TQ) 5958 FC=-TQ/(TA + TB + TQ) 5959 NAQ=MEET(LA,LQ) 5960 NBQ=MEET(LB,LQ) 5961 NAB=MEET(LA,LB) 5962 D=FA*DYS(NAQ)*DYS(NAQ) + FB*DYS(NBQ)*DYS(NBQ) 5963 D=D + FC*DYS(NAB)*DYS(NAB) 5964 DYS(NAQ)=SQRT(D) 5965 244 CONTINUE 5966 ELSEIF(IAGNME.EQ.'WAVL')THEN 5967 DO 245 LQ=1,NN 5968 IF(LQ.EQ.LA.OR.LQ.EQ.LB)GO TO 245 5969 IF(KWAN(LQ).EQ.0)GO TO 245 5970 NAQ=MEET(LA,LQ) 5971 NBQ=MEET(LB,LQ) 5972 DYS(NAQ)=(DYS(NAQ) + DYS(NBQ))/2.0 5973 245 CONTINUE 5974 ELSEIF(IAGNME.EQ.'GOWE')THEN 5975 DO 246 LQ=1,NN 5976 IF(LQ.EQ.LA.OR.LQ.EQ.LB)GO TO 246 5977 IF(KWAN(LQ).EQ.0)GO TO 246 5978 NAQ=MEET(LA,LQ) 5979 NBQ=MEET(LB,LQ) 5980 NAB=MEET(LA,LB) 5981 D=(DYS(NAQ)*DYS(NAQ) + DYS(NBQ)*DYS(NBQ))/2.0 5982 D=D - (DYS(NAB)*DYS(NAB))/4.0 5983 DYS(NAQ)=SQRT(D) 5984 246 CONTINUE 5985 ELSE 5986 DO 240 LQ=1,NN 5987 IF(LQ.EQ.LA.OR.LQ.EQ.LB)GO TO 240 5988 IF(KWAN(LQ).EQ.0)GO TO 240 5989 TA=KWAN(LA) 5990 TB=KWAN(LB) 5991 FA=TA/(TA+TB) 5992 FB=TB/(TA+TB) 5993 NAQ=MEET(LA,LQ) 5994 NBQ=MEET(LB,LQ) 5995 DYS(NAQ)=FA*DYS(NAQ)+FB*DYS(NBQ) 5996 240 CONTINUE 5997 ENDIF 5998C 5999 IF(NCLU.EQ.1 .AND. IPRINT.EQ.'ON')THEN 6000 WRITE(ICOUT,999) 6001 999 FORMAT(1X) 6002 CALL DPWRST('XXX','BUG ') 6003 WRITE(ICOUT,999) 6004 CALL DPWRST('XXX','BUG ') 6005 WRITE(ICOUT,9100) 6006 9100 FORMAT('THE FINAL ORDERING OF THE OBJECTS IS') 6007 CALL DPWRST('XXX','BUG ') 6008 WRITE(ICOUT,999) 6009 CALL DPWRST('XXX','BUG ') 6010 ILOOP=NN/5 6011 IF(MOD(NN,5).GT.0)ILOOP=ILOOP+1 6012 DO9111II=1,ILOOP 6013 ISTRT=(II-1)*5+1 6014 ISTOP=II*5 6015 IF(ISTOP.GT.NN)ISTOP=NN 6016 WRITE(ICOUT,9110)(NER(L),L=ISTRT,ISTOP) 6017 9110 FORMAT(5(I9,6X)) 6018 CALL DPWRST('XXX','BUG ') 6019 9111 CONTINUE 6020 WRITE(ICOUT,999) 6021 CALL DPWRST('XXX','BUG ') 6022 WRITE(ICOUT,999) 6023 CALL DPWRST('XXX','BUG ') 6024 WRITE(ICOUT,9120) 6025 9120 FORMAT('THE DISSIMILARITIES BETWEEN CLUSTERS ARE') 6026 CALL DPWRST('XXX','BUG ') 6027 WRITE(ICOUT,999) 6028 CALL DPWRST('XXX','BUG ') 6029 ILOOP=(NN-1)/5 6030 IF(MOD((NN-1),5).GT.0)ILOOP=ILOOP+1 6031 DO9131II=1,ILOOP 6032 ISTRT=(II-1)*5+2 6033 ISTOP=II*5 + 1 6034 IF(ISTOP.GT.NN)ISTOP=NN 6035 WRITE(ICOUT,9130)(BAN(L),L=ISTRT,ISTOP) 6036 9130 FORMAT(3X,5F15.3) 6037 CALL DPWRST('XXX','BUG ') 6038 9131 CONTINUE 6039 ENDIF 6040C 6041 KWAN(LA)=KWAN(LA)+KWAN(LB) 6042 KWAN(LB)=0 6043C 6044C PRINT RESULTS FROM CURRENT CLUSTER 6045C 6046 IF(IBUGA3.EQ.'ON' .OR. ISUBRO.EQ.'VERL')THEN 6047 WRITE(ICOUT,9910)NCLU,LA,LB,LFYRS,LLAST 6048 9910 FORMAT('AVERL: NCLU,LA,LB,LFYRS,LLAST = ',5I5) 6049 CALL DPWRST('XXX','BUG ') 6050 DO9920L=1,NN 6051 WRITE(ICOUT,9921)L,NER(L),KWAN(L),BAN(L) 6052 9921 FORMAT('L,NER(L),KWAN(L),BAN(L) = ',3I5,F12.5) 6053 CALL DPWRST('XXX','BUG ') 6054 9920 CONTINUE 6055 ENDIF 6056C 6057 DISTMX=-1.0 6058 DO1020II=1,NN 6059 IF(BAN(II).GT.DISTMX)DISTMX=BAN(II) 6060 1020 CONTINUE 6061C 6062 DO1025II=1,NN 6063 IF(LA.EQ.NER(II))THEN 6064 AVAL1=BAN(II) 6065 ELSEIF(LB.EQ.NER(II))THEN 6066 AVAL2=BAN(II) 6067 ENDIF 6068 1025 CONTINUE 6069 ICNT=ICNT+1 6070 WRITE(IOUNI4,'(3I5,2E15.7)')NCLU,LA,LB,AVAL1,AVAL2 6071 NCLU=NCLU-1 6072 IF(NCLU.GT.0)GOTO 100 6073C 6074C NOW CREATE DATA FOR: 6075C 6076C 1. DENDOGRAM (IOUNI3) 6077C 2. ICICLE PLOT (IOUNI2) 6078C 6079 REWIND(IOUNI4) 6080C 6081 DO2010KK=1,ICNT 6082 READ(IOUNI4,'(3I5,2E15.7)',END=2019,ERR=2019) 6083 1 NCLUT(KK),LAT(KK),LBT(KK),BANLAT(KK),BANLBT(KK) 6084 2010 CONTINUE 6085C 6086 ITAG=0 6087 DO2020KK=1,ICNT 6088C 6089C LB IDENTIFIES "RIGHT HAND SIDE" OF BRANCH. LA IDENTIFIES 6090C WHICH CLUSTER IT IS JOINING. NER WILL BE USED TO IDENTIFY 6091C THE APPROPRIATE X-COORDINATE. 6092C 6093 IFRST=LAT(KK) 6094 ISEC=LBT(KK) 6095 AVAL1=BANLBT(KK) 6096C 6097 IF(KK.EQ.1)THEN 6098C 6099C FIRST CLUSTER BEING FORMED 6100C 6101 XVAL1=1.0 6102 XVAL2=2.0 6103 DO2021JJ=1,NN 6104 IF(IFRST.EQ.NER(JJ))XVAL1=REAL(JJ) 6105 IF(ISEC.EQ.NER(JJ))XVAL2=REAL(JJ) 6106 2021 CONTINUE 6107 YVAL1=0.0 6108 YVAL2=AVAL1 6109 ITAG=ITAG+1 6110 WRITE(IOUNI3,'(3E15.7)')XVAL1,YVAL1,REAL(ITAG) 6111 WRITE(IOUNI3,'(3E15.7)')XVAL1,YVAL2,REAL(ITAG) 6112 WRITE(IOUNI3,'(3E15.7)')XVAL2,YVAL2,REAL(ITAG) 6113 WRITE(IOUNI3,'(3E15.7)')XVAL2,YVAL1,REAL(ITAG) 6114C 6115 ELSE 6116C 6117C IF NOT THE FIRST, THEN CHECK IF LA MATCHES ANY 6118C PREVIOUS LA. 6119C 6120 IFLAGL=0 6121 IFLAGR=0 6122 DO2030JJ=KK-1,1,-1 6123 IF(IFRST.EQ.LAT(JJ))THEN 6124C 6125C MATCH WITH PREVIOUS CLUSTER FOUND 6126C 6127 ISEC2=LBT(JJ) 6128 DO2031LL=1,NN 6129 IF(ISEC2.EQ.NER(LL))THEN 6130 XVAL1=REAL(LL) - 0.5 6131 XVAL3=REAL(LL) 6132 ENDIF 6133 IF(ISEC.EQ.NER(LL))THEN 6134 XVAL2=REAL(LL) 6135 XVAL4=XVAL2 6136 ENDIF 6137 2031 CONTINUE 6138 YVAL1=BANLBT(JJ) 6139 YVAL2=AVAL1 6140 YVAL3=0.0 6141 IFLAGL=1 6142 GOTO2039 6143 ENDIF 6144 2030 CONTINUE 6145 2039 CONTINUE 6146C 6147 IF(IBUGA3.EQ.'ON' .OR. ISUBRO.EQ.'VERL')THEN 6148 WRITE(ICOUT,2035)KK,IFLAGL 6149 2035 FORMAT('AFTER 2030 LOOP: KK,IFLAGL = ',2I6) 6150 CALL DPWRST('XXX','BUG ') 6151 WRITE(ICOUT,2038)XVAL1,XVAL2,YVAL1,YVAL2,YVAL3 6152 2038 FORMAT('XVAL1,XVAL2,YVAL1,YVAL2,YVAL3 = ',5G15.7) 6153 CALL DPWRST('XXX','BUG ') 6154 ENDIF 6155C 6156C IF NOT THE FIRST AND IF LA DOES NOT MATCH ANY PREVIOUS LA, 6157C THEN CHECK IF LB MATCHES ANY PREVIOUS LA. 6158C 6159 IF(IFLAGL.EQ.0)THEN 6160 DO2040JJ=KK-1,1,-1 6161 IF(ISEC.EQ.LAT(JJ))THEN 6162C 6163C MATCH WITH PREVIOUS CLUSTER FOUND 6164C 6165 ISEC2=LBT(JJ) 6166 DO2041LL=1,NN 6167 IF(IFRST.EQ.NER(LL))THEN 6168 XVAL1=REAL(LL) 6169 XVAL3=REAL(LL) 6170 ENDIF 6171 IF(ISEC2.EQ.NER(LL))THEN 6172 XVAL2=REAL(LL) - 0.5 6173 XVAL4=REAL(LL) 6174 ENDIF 6175 2041 CONTINUE 6176 YVAL1=0.0 6177 YVAL2=AVAL1 6178 YVAL3=BANLBT(JJ) 6179 IFLAGR=1 6180 GOTO2049 6181 ENDIF 6182 2040 CONTINUE 6183 2049 CONTINUE 6184C 6185 IF(IBUGA3.EQ.'ON' .OR. ISUBRO.EQ.'VERL')THEN 6186 WRITE(ICOUT,2045)KK,IFLAGR 6187 2045 FORMAT('AFTER 2040 LOOP: KK,IFLAGR = ',2I6) 6188 CALL DPWRST('XXX','BUG ') 6189 WRITE(ICOUT,2038)XVAL1,XVAL2,YVAL1,YVAL2,YVAL3 6190 CALL DPWRST('XXX','BUG ') 6191 ENDIF 6192C 6193 ELSEIF(IFLAGL.EQ.1)THEN 6194 DO2050JJ=KK-1,1,-1 6195 IF(ISEC.EQ.LAT(JJ))THEN 6196C 6197C MATCH WITH PREVIOUS CLUSTER FOUND 6198C 6199 ISEC2=LBT(JJ) 6200 DO2053LL=1,NN 6201 IF(ISEC2.EQ.NER(LL))THEN 6202 XVAL2=REAL(LL) - 0.5 6203 XVAL4=REAL(LL) 6204 ENDIF 6205 2053 CONTINUE 6206 YVAL2=AVAL1 6207 YVAL3=BANLBT(JJ) 6208 IFLAGR=1 6209 GOTO2059 6210 ENDIF 6211 2050 CONTINUE 6212 2059 CONTINUE 6213C 6214 IF(IBUGA3.EQ.'ON' .OR. ISUBRO.EQ.'VERL')THEN 6215 WRITE(ICOUT,2055)KK,IFLAGR 6216 2055 FORMAT('AFTER 2050 LOOP: KK,IFLAGR = ',2I6) 6217 CALL DPWRST('XXX','BUG ') 6218 WRITE(ICOUT,2038)XVAL1,XVAL2,YVAL1,YVAL2,YVAL3 6219 CALL DPWRST('XXX','BUG ') 6220 ENDIF 6221C 6222 ENDIF 6223C 6224 IF(IFLAGL.EQ.1 .OR. IFLAGR.EQ.1)THEN 6225 ITAG=ITAG+1 6226 WRITE(IOUNI3,'(3E15.7)')XVAL1,YVAL1,REAL(ITAG) 6227 WRITE(IOUNI3,'(3E15.7)')XVAL1,YVAL2,REAL(ITAG) 6228 WRITE(IOUNI3,'(3E15.7)')XVAL2,YVAL2,REAL(ITAG) 6229 WRITE(IOUNI3,'(3E15.7)')XVAL2,YVAL3,REAL(ITAG) 6230C 6231 ELSE 6232C 6233C NO MATCH WITH PREVIOUS CLUSTER, SO CREATING A 6234C NEW CLUSTER 6235C 6236 DO2061JJ=1,NN 6237 IF(IFRST.EQ.NER(JJ))THEN 6238 XVAL1=REAL(JJ) 6239 XVAL3=REAL(JJ) 6240 ENDIF 6241 IF(ISEC.EQ.NER(JJ))THEN 6242 XVAL2=REAL(JJ) 6243 XVAL4=REAL(JJ) 6244 ENDIF 6245 2061 CONTINUE 6246 YVAL1=0.0 6247 YVAL2=AVAL1 6248 ITAG=ITAG+1 6249 WRITE(IOUNI3,'(3E15.7)')XVAL1,YVAL1,REAL(ITAG) 6250 WRITE(IOUNI3,'(3E15.7)')XVAL1,YVAL2,REAL(ITAG) 6251 WRITE(IOUNI3,'(3E15.7)')XVAL2,YVAL2,REAL(ITAG) 6252 WRITE(IOUNI3,'(3E15.7)')XVAL2,YVAL1,REAL(ITAG) 6253C 6254 ENDIF 6255 ENDIF 6256C 6257 2020 CONTINUE 6258C 6259 ITAG=0 6260 DO3020KK=1,ICNT 6261C 6262C LB IDENTIFIES "RIGHT HAND SIDE" OF BRANCH. LA IDENTIFIES 6263C WHICH CLUSTER IT IS JOINING. NER WILL BE USED TO IDENTIFY 6264C THE APPROPRIATE X-COORDINATE. 6265C 6266 IFRST=LAT(KK) 6267 ISEC=LBT(KK) 6268 NCLU=NCLUT(KK) 6269 YVAL=REAL(NCLU) 6270C 6271 IF(KK.EQ.1)THEN 6272C 6273C FIRST CLUSTER BEING FORMED 6274C 6275 XVAL1=1.0 6276 XVAL2=2.0 6277 DO3021JJ=1,NN 6278 IF(IFRST.EQ.NER(JJ))XVAL1=REAL(JJ) 6279 IF(ISEC.EQ.NER(JJ))XVAL2=REAL(JJ) 6280 3021 CONTINUE 6281C 6282 XVAL1=(XVAL1-1.0)*2.0 + 1.0 6283 XVAL2=XVAL1 + 1.0 6284 XVAL3=(XVAL2-1.0)*2.0 + 1.0 6285 ITAG=ITAG+1 6286 WRITE(IOUNI2,'(3E15.7)')XVAL1,YVAL,REAL(ITAG) 6287 ITAG=ITAG+1 6288 WRITE(IOUNI2,'(3E15.7)')XVAL2,YVAL,REAL(ITAG) 6289 ITAG=ITAG+1 6290 WRITE(IOUNI2,'(3E15.7)')XVAL3,YVAL,REAL(ITAG) 6291C 6292 ELSEIF(KK.EQ.ICNT)THEN 6293C 6294C LAST CLUSTER BEING FORMED 6295C 6296 XVAL2=2.0 6297 DO307JJ=1,NN 6298 IF(ISEC.EQ.NER(JJ))THEN 6299 XVAL2=REAL(JJ) 6300 XVAL2=(XVAL2-1.0)*2.0 + 1.0 6301 XVAL1=XVAL2 - 1.0 6302 ENDIF 6303 307 CONTINUE 6304 ITAG=ITAG+1 6305 WRITE(IOUNI2,'(3E15.7)')XVAL1,YVAL,REAL(ITAG) 6306 ITAG=ITAG+1 6307 WRITE(IOUNI2,'(3E15.7)')XVAL2,YVAL,REAL(ITAG) 6308 ELSE 6309C 6310C IF NOT THE FIRST, THEN CHECK IF LA MATCHES ANY 6311C PREVIOUS LA. 6312C 6313 IFLAGL=0 6314 IFLAGR=0 6315 IFLAG3=0 6316 DO3030JJ=KK-1,1,-1 6317 IF(IFRST.EQ.LAT(JJ))THEN 6318C 6319C MATCH WITH PREVIOUS CLUSTER FOUND 6320C 6321 ISEC2=LBT(JJ) 6322 DO3031LL=1,NN 6323 IF(ISEC2.EQ.NER(LL))THEN 6324 XVAL1=REAL(LL) 6325 XVAL1=(XVAL1-1.0)*2.0 + 2.0 6326 ENDIF 6327 IF(ISEC.EQ.NER(LL))THEN 6328 XVAL2=REAL(LL) 6329 XVAL2=(XVAL2-1.0)*2.0 + 1.0 6330 IFLAG3=1 6331 ENDIF 6332 3031 CONTINUE 6333 IFLAGL=1 6334 GOTO3039 6335 ENDIF 6336 3030 CONTINUE 6337 3039 CONTINUE 6338C 6339 IF(IBUGA3.EQ.'ON' .OR. ISUBRO.EQ.'VERL')THEN 6340 WRITE(ICOUT,3035)KK,IFLAGL,XVAL1,XVAL2,YVAL 6341 3035 FORMAT('AFTER 3030 LOOP: KK,IFLAGL,XVAL1,XVAL2,YVAL = ', 6342 1 2I6,3G15.7) 6343 CALL DPWRST('XXX','BUG ') 6344 WRITE(ICOUT,3036)IFRST,ISEC,ISEC2 6345 3036 FORMAT(' IFRST,ISEC,ISEC2 = ',3I6) 6346 CALL DPWRST('XXX','BUG ') 6347 ENDIF 6348C 6349C IF NOT THE FIRST AND IF LA DOES NOT MATCH ANY PREVIOUS LA, 6350C THEN CHECK IF LB MATCHES ANY PREVIOUS LA. 6351C 6352 IF(IFLAGL.EQ.0)THEN 6353 DO3040JJ=KK-1,1,-1 6354 IF(ISEC.EQ.LAT(JJ))THEN 6355C 6356C MATCH WITH PREVIOUS CLUSTER FOUND 6357C 6358 ISEC2=LBT(JJ) 6359 DO3041LL=1,NN 6360 IF(IFRST.EQ.NER(LL))THEN 6361 XVAL1=REAL(LL) 6362 XVAL1=(XVAL1-1.0)*2.0 + 1.0 6363 ENDIF 6364 IF(ISEC2.EQ.NER(LL))THEN 6365 XVAL2=REAL(LL) 6366 XVAL2=(XVAL1-2.0)*2.0 + 1.0 6367 ENDIF 6368 3041 CONTINUE 6369 GOTO3049 6370 ENDIF 6371 3040 CONTINUE 6372 3049 CONTINUE 6373C 6374 IF(IBUGA3.EQ.'ON' .OR. ISUBRO.EQ.'VERL')THEN 6375 WRITE(ICOUT,3045)KK,IFLAGR 6376 3045 FORMAT('AFTER 3040 LOOP: KK,IFLAGR = ',2I6) 6377 CALL DPWRST('XXX','BUG ') 6378 WRITE(ICOUT,3048)XVAL1,XVAL2,YVAL 6379 3048 FORMAT('XVAL1,XVAL2,YVAL = ',3G15.7) 6380 CALL DPWRST('XXX','BUG ') 6381 ENDIF 6382C 6383 ELSEIF(IFLAGL.EQ.1 .AND. IFLAG3.EQ.0)THEN 6384 DO3050JJ=KK-1,1,-1 6385 IF(ISEC.EQ.LAT(JJ))THEN 6386C 6387C MATCH WITH PREVIOUS CLUSTER FOUND 6388C 6389 ISEC2=LBT(JJ) 6390 DO3053LL=1,NN 6391 IF(ISEC2.EQ.NER(LL))THEN 6392 XVAL2=REAL(LL) 6393 XVAL2=(XVAL2-2.0)*2.0 + 1.0 6394 ENDIF 6395 3053 CONTINUE 6396 IFLAGR=1 6397 GOTO3059 6398 ENDIF 6399 3050 CONTINUE 6400 3059 CONTINUE 6401C 6402 IF(IBUGA3.EQ.'ON' .OR. ISUBRO.EQ.'VERL')THEN 6403 WRITE(ICOUT,3055)KK,IFLAGR 6404 3055 FORMAT('AFTER 3050 LOOP: KK,IFLAGR = ',2I6) 6405 CALL DPWRST('XXX','BUG ') 6406 WRITE(ICOUT,3048)XVAL1,XVAL2,YVAL 6407 CALL DPWRST('XXX','BUG ') 6408 ENDIF 6409C 6410 ENDIF 6411C 6412 IF(IFLAGL.EQ.1 .OR. IFLAGR.EQ.1)THEN 6413 ITAG=ITAG+1 6414 WRITE(IOUNI2,'(3E15.7)')XVAL1,YVAL,REAL(ITAG) 6415 ITAG=ITAG+1 6416 WRITE(IOUNI2,'(3E15.7)')XVAL2,YVAL,REAL(ITAG) 6417 ELSE 6418C 6419C NO MATCH WITH PREVIOUS CLUSTER, SO CREATING A 6420C NEW CLUSTER 6421C 6422 DO3061JJ=1,NN 6423 IF(IFRST.EQ.NER(JJ))THEN 6424 XVAL1=REAL(JJ) 6425 XVAL1=2.0*XVAL1 6426 ENDIF 6427 IF(ISEC.EQ.NER(JJ))THEN 6428 XVAL2=REAL(JJ) 6429 XVAL2=2.0*(XVAL2-1.0) + 1.0 6430 ENDIF 6431 3061 CONTINUE 6432 ITAG=ITAG+1 6433 WRITE(IOUNI2,'(3E15.7)')XVAL1,YVAL,REAL(ITAG) 6434 ITAG=ITAG+1 6435 WRITE(IOUNI2,'(3E15.7)')XVAL2,YVAL,REAL(ITAG) 6436 ENDIF 6437 ENDIF 6438C 6439 3020 CONTINUE 6440C 6441 2019 CONTINUE 6442 RETURN 6443 END 6444 SUBROUTINE SPLYT(NN,KWAN,NER,BAN,DYS, 6445 1 NCLUT,LAT,LBT,BANLAT,BANLBT, 6446 1 IOUNI2,IOUNI3,IOUNI4,ISUBRO,IBUGA3) 6447C 6448CNIST SUBROUTINE SPLYT(NN,MAXNN,KWAN,NER,BAN,MAXHH,DYS,LUB) 6449C 6450C NN = NUMBER OF OBJECTS 6451C MAXNN = MAXIMUM NUMBER OF OBJECTS 6452C KWAN = NUMBER OF OBJECTS IN EACH CLUSTER 6453C NER = ORDERING OF OBJECTS 6454C BAN = DISSIMILARITIES BETWEEN CLUSTERS 6455C MAXHH = MAXIMUM NUMBER FO DISSIMILARITIES (DATAPLOT DOES NOT USE) 6456C DYS = VECTOR OF DISSIMILARITIES 6457C LUB = OUTPUT UNIT (DATAPLOT DOES NOT USE) 6458C 6459 DIMENSION KWAN(*) 6460 DIMENSION DYS(*) 6461 DIMENSION NER(*) 6462 DIMENSION BAN(*) 6463 DIMENSION BANLAT(*) 6464 DIMENSION BANLBT(*) 6465C 6466 DIMENSION NCLUT(*) 6467 DIMENSION LAT(*) 6468 DIMENSION LBT(*) 6469C 6470 CHARACTER*4 ISUBRO 6471 CHARACTER*4 IBUGA3 6472C 6473 INCLUDE 'DPCOP2.INC' 6474C 6475 IF(IBUGA3.EQ.'ON')THEN 6476 WRITE(ICOUT,5)NN 6477 5 FORMAT('BEGINING OF AVERL: NN = ',I6) 6478 CALL DPWRST('XXX','BUG ') 6479 ENDIF 6480C 6481 LXF=0 6482 LXG=0 6483 LNDSD=0 6484 JMB=0 6485 L=0 6486 JMA=0 6487 JAWAY=0 6488 ICNT=0 6489C 6490CC INITIALIZATION 6491CC 6492 NCLU=1 6493 NHALF=NN*(NN-1)/2+1 6494 DO 10 L=1,NN 6495 KWAN(L)=0 6496 BAN(L)=0. 6497 NER(L)=L 6498 10 CONTINUE 6499 KWAN(1)=NN 6500 JA=1 6501CC 6502CC COMPUTATION OF DIAMETER OF DATA SET 6503CC 6504 CS=0.0 6505 K=0 6506 20 CONTINUE 6507 K=K+1 6508 IF(DYS(K).GT.CS)CS=DYS(K) 6509 IF(K.LT.NHALF)GO TO 20 6510CC 6511CC PREPARE FOR SPLITTING 6512CC 6513 30 CONTINUE 6514 JB=JA+KWAN(JA)-1 6515 JMA=JB 6516CC 6517CC SPECIAL CASE OF A PAIR OF OBJECTS 6518CC 6519 IF(KWAN(JA).EQ.2)THEN 6520 KWAN(JA)=1 6521 KWAN(JB)=1 6522 JAN=NER(JA) 6523 JBN=NER(JB) 6524 JAB=MEET(JAN,JBN) 6525 BAN(JB)=DYS(JAB) 6526 GO TO 400 6527 ENDIF 6528CC 6529CC FINDING FIRST OBJECT TO BE SHIFTED 6530CC 6531 BYGSD=-1. 6532 DO 110 L=JA,JB 6533 LNER=NER(L) 6534 SD=0. 6535 DO 100 J=JA,JB 6536 JNER=NER(J) 6537 NLJ=MEET(LNER,JNER) 6538 SD=SD+DYS(NLJ) 6539 100 CONTINUE 6540 IF(SD.LE.BYGSD)GO TO 110 6541 BYGSD=SD 6542 LNDSD=L 6543 110 CONTINUE 6544CC 6545CC SHIFTING THE FIRST OBJECT 6546CC 6547 KWAN(JA)=KWAN(JA)-1 6548 KWAN(JB)=1 6549 IF(JB.NE.LNDSD)THEN 6550 LCHAN=NER(LNDSD) 6551 LMM=JB-1 6552 DO 112 LMMA=LNDSD,LMM 6553 LMMB=LMMA+1 6554 NER(LMMA)=NER(LMMB) 6555 112 CONTINUE 6556 NER(JB)=LCHAN 6557 ENDIF 6558 SPLYN=0. 6559 JMA=JB-1 6560CC 6561CC FINDING THE NEXT OBJECT TO BE SHIFTED 6562CC 6563 120 CONTINUE 6564 SPLYN=SPLYN+1. 6565 REST=JMA-JA 6566 BDYFF=-1. 6567 DO 150 L=JA,JMA 6568 LNER=NER(L) 6569 DA=0. 6570 DO 130 J=JA,JMA 6571 JNER=NER(J) 6572 NLJ=MEET(LNER,JNER) 6573 DA=DA+DYS(NLJ) 6574 130 CONTINUE 6575 DA=DA/REST 6576 DB=0. 6577 JMB=JMA+1 6578 DO 140 J=JMB,JB 6579 JNER=NER(J) 6580 NLJ=MEET(LNER,JNER) 6581 DB=DB+DYS(NLJ) 6582 140 CONTINUE 6583 DB=DB/SPLYN 6584 DYFF=DA-DB 6585 IF(DYFF.LE.BDYFF)GO TO 150 6586 BDYFF=DYFF 6587 JAWAY=L 6588 150 CONTINUE 6589 JMB=JMA+1 6590CC 6591CC SHIFTING THE NEXT OBJECT WHEN NECESSARY 6592CC 6593 IF(BDYFF.LE.0.)GO TO 200 6594 IF(JMA.NE.JAWAY)THEN 6595 LCHAN=NER(JAWAY) 6596 LMZ=JMA-1 6597 DO 160 LXX=JAWAY,LMZ 6598 LXXP=LXX+1 6599 NER(LXX)=NER(LXXP) 6600 160 CONTINUE 6601 NER(JMA)=LCHAN 6602 ENDIF 6603C 6604 DO 170 LXX=JMB,JB 6605 LXY=LXX-1 6606 IF(NER(LXY).LT.NER(LXX))GO TO 180 6607 LCHAN=NER(LXY) 6608 NER(LXY)=NER(LXX) 6609 NER(LXX)=LCHAN 6610 170 CONTINUE 6611 180 CONTINUE 6612 KWAN(JA)=KWAN(JA)-1 6613 KWAN(JMA)=KWAN(JMB)+1 6614 KWAN(JMB)=0 6615 JMA=JMA-1 6616 JMB=JMA+1 6617 IF(JMA.NE.JA)GO TO 120 6618CC 6619CC SWITCH THE TWO PARTS WHEN NECESSARY 6620CC 6621 200 CONTINUE 6622 IF(NER(JA).GE.NER(JMB))THEN 6623 LXXA=JA 6624 DO 220 LGRB=JMB,JB 6625 LXXA=LXXA+1 6626 LCHAN=NER(LGRB) 6627 DO 210 LXY=LXXA,LGRB 6628 LXF=LGRB-LXY+LXXA 6629 LXG=LXF-1 6630 NER(LXF)=NER(LXG) 6631 210 CONTINUE 6632 NER(LXG)=LCHAN 6633 220 CONTINUE 6634 LLQ=KWAN(JMB) 6635 KWAN(JMB)=0 6636 JMA=JA+JB-JMA-1 6637 JMB=JMA+1 6638 KWAN(JMB)=KWAN(JA) 6639 KWAN(JA)=LLQ 6640 ENDIF 6641CC 6642CC COMPUTE LEVEL FOR BANNER 6643CC 6644 IF(NCLU.EQ.1)BAN(JMB)=CS 6645 IF(NCLU.EQ.1)GO TO 400 6646 CALL SUPCL(DYS,JA,JB,AREST,NER) 6647 BAN(JMB)=AREST 6648C 6649 400 CONTINUE 6650C 6651 DO1025II=1,NN 6652 IF(JA.EQ.NER(II))THEN 6653 AVAL1=BAN(II) 6654 ELSEIF(JB.EQ.NER(II))THEN 6655 AVAL2=BAN(II) 6656 ENDIF 6657 1025 CONTINUE 6658 ICNT=ICNT+1 6659 WRITE(IOUNI4,'(3I5,2E15.7)')NCLU,JA,JB,AVAL1,AVAL2 6660C 6661 NCLU=NCLU+1 6662 IF(NCLU.EQ.2 .AND. IPRINT.EQ.'ON')THEN 6663 WRITE(ICOUT,999) 6664 999 FORMAT(1X) 6665 CALL DPWRST('XXX','BUG ') 6666 WRITE(ICOUT,999) 6667 CALL DPWRST('XXX','BUG ') 6668CNIST WRITE(ICOUT,9000)NN,JMA,KWAN(JMB) 6669C9000 FORMAT(//22H AT THE FIRST STEP THE,I4,20H OBJECTS ARE DIVIDED, 6670CNIST1 5H INTO/3X,I4,12H OBJECTS AND,I4,8H OBJECTS) 6671 WRITE(ICOUT,9000)NN 6672 9000 FORMAT('AT THE FIRST STEP THE',I4,' OBJECTS ARE DIVIDED INTO') 6673 CALL DPWRST('XXX','BUG ') 6674 WRITE(ICOUT,9001)JMA,KWAN(JMB) 6675 9001 FORMAT(2X,I4,' OBJECTS AND',I4,' OBJECTS') 6676 CALL DPWRST('XXX','BUG ') 6677 ENDIF 6678 IF(NCLU.EQ.NN)GOTO 500 6679CC 6680CC CONTINUE SPLITTING UNTIL ALL OBJECTS ARE SEPARATED 6681CC 6682 IF(JB.EQ.NN)GO TO 430 6683 420 CONTINUE 6684 JA=JA+KWAN(JA) 6685 IF(JA.GT.NN)GO TO 430 6686 IF(KWAN(JA).LE.1)GO TO 420 6687 GO TO 30 6688 430 CONTINUE 6689 JA=1 6690 IF(KWAN(JA).EQ.1)GO TO 420 6691 GO TO 30 6692C 6693 500 CONTINUE 6694C 6695 ILOOP=NN/5 6696 IREM=MOD(NN,5) 6697 IF(IREM.GT.0)ILOOP=ILOOP+1 6698C 6699 IF(IPRINT.EQ.'ON')THEN 6700 WRITE(ICOUT,999) 6701 CALL DPWRST('XXX','BUG ') 6702 WRITE(ICOUT,999) 6703 CALL DPWRST('XXX','BUG ') 6704 WRITE(ICOUT,9100) 6705 9100 FORMAT('THE FINAL ORDERING OF THE OBJECTS IS') 6706 CALL DPWRST('XXX','BUG ') 6707 WRITE(ICOUT,999) 6708 CALL DPWRST('XXX','BUG ') 6709 DO9111II=1,ILOOP 6710 ISTRT=(II-1)*5 + 1 6711 ISTOP=II*5 6712 IF(ISTOP.GT.NN)ISTOP=NN 6713 WRITE(ICOUT,9110)(NER(L),L=ISTRT,ISTOP) 6714 9110 FORMAT(5(I9,6X)) 6715 CALL DPWRST('XXX','BUG ') 6716 9111 CONTINUE 6717 WRITE(ICOUT,999) 6718 CALL DPWRST('XXX','BUG ') 6719 WRITE(ICOUT,999) 6720 CALL DPWRST('XXX','BUG ') 6721 WRITE(ICOUT,9120) 6722 9120 FORMAT('THE DIAMETERS OF THE CLUSTERS ARE') 6723 CALL DPWRST('XXX','BUG ') 6724 DO9131II=1,ILOOP 6725 ISTRT=(II-1)*5 + 2 6726 ISTOP=II*5 6727 IF(ISTOP.GT.NN)ISTOP=NN 6728 WRITE(ICOUT,9130)(BAN(L),L=ISTRT,ISTOP) 6729 9130 FORMAT(3X,5F15.3) 6730 CALL DPWRST('XXX','BUG ') 6731 9131 CONTINUE 6732 ENDIF 6733C 6734C NOW CREATE DATA FOR: 6735C 6736C 1. DENDOGRAM (IOUNI3) 6737C 2. ICICLE PLOT (IOUNI2) 6738C 6739 REWIND(IOUNI4) 6740C 6741 DO2010KK=1,ICNT 6742 READ(IOUNI4,'(3I5,2E15.7)',END=2019,ERR=2019) 6743 1 NCLUT(KK),LAT(KK),LBT(KK),BANLAT(KK),BANLBT(KK) 6744 2010 CONTINUE 6745C 6746 ITAG=0 6747 DO2020KK=1,ICNT 6748C 6749C LB IDENTIFIES "RIGHT HAND SIDE" OF BRANCH. LA IDENTIFIES 6750C WHICH CLUSTER IT IS JOINING. NER WILL BE USED TO IDENTIFY 6751C THE APPROPRIATE X-COORDINATE. 6752C 6753 IFRST=LAT(KK) 6754 ISEC=LBT(KK) 6755 AVAL1=BANLBT(KK) 6756C 6757 IF(KK.EQ.1)THEN 6758C 6759C FIRST CLUSTER BEING FORMED 6760C 6761 XVAL1=1.0 6762 XVAL2=2.0 6763 DO2021JJ=1,NN 6764 IF(IFRST.EQ.NER(JJ))XVAL1=REAL(JJ) 6765 IF(ISEC.EQ.NER(JJ))XVAL2=REAL(JJ) 6766 2021 CONTINUE 6767 YVAL1=0.0 6768 YVAL2=AVAL1 6769 ITAG=ITAG+1 6770 WRITE(IOUNI3,'(3E15.7)')XVAL1,YVAL1,REAL(ITAG) 6771 WRITE(IOUNI3,'(3E15.7)')XVAL1,YVAL2,REAL(ITAG) 6772 WRITE(IOUNI3,'(3E15.7)')XVAL2,YVAL2,REAL(ITAG) 6773 WRITE(IOUNI3,'(3E15.7)')XVAL2,YVAL1,REAL(ITAG) 6774C 6775 ELSE 6776C 6777C IF NOT THE FIRST, THEN CHECK IF LA MATCHES ANY 6778C PREVIOUS LA. 6779C 6780 IFLAGL=0 6781 IFLAGR=0 6782 DO2030JJ=KK-1,1,-1 6783 IF(IFRST.EQ.LAT(JJ))THEN 6784C 6785C MATCH WITH PREVIOUS CLUSTER FOUND 6786C 6787 ISEC2=LBT(JJ) 6788 DO2031LL=1,NN 6789 IF(ISEC2.EQ.NER(LL))THEN 6790 XVAL1=REAL(LL) - 0.5 6791 XVAL3=REAL(LL) 6792 ENDIF 6793 IF(ISEC.EQ.NER(LL))THEN 6794 XVAL2=REAL(LL) 6795 XVAL4=XVAL2 6796 ENDIF 6797 2031 CONTINUE 6798 YVAL1=BANLBT(JJ) 6799 YVAL2=AVAL1 6800 YVAL3=0.0 6801 IFLAGL=1 6802 GOTO2039 6803 ENDIF 6804 2030 CONTINUE 6805 2039 CONTINUE 6806C 6807 IF(IBUGA3.EQ.'ON' .OR. ISUBRO.EQ.'VERL')THEN 6808 WRITE(ICOUT,2035)KK,IFLAGL 6809 2035 FORMAT('AFTER 2030 LOOP: KK,IFLAGL = ',2I6) 6810 CALL DPWRST('XXX','BUG ') 6811 WRITE(ICOUT,2038)XVAL1,XVAL2,YVAL1,YVAL2,YVAL3 6812 2038 FORMAT('XVAL1,XVAL2,YVAL1,YVAL2,YVAL3 = ',5G15.7) 6813 CALL DPWRST('XXX','BUG ') 6814 ENDIF 6815C 6816C IF NOT THE FIRST AND IF LA DOES NOT MATCH ANY PREVIOUS LA, 6817C THEN CHECK IF LB MATCHES ANY PREVIOUS LA. 6818C 6819 IF(IFLAGL.EQ.0)THEN 6820 DO2040JJ=KK-1,1,-1 6821 IF(ISEC.EQ.LAT(JJ))THEN 6822C 6823C MATCH WITH PREVIOUS CLUSTER FOUND 6824C 6825 ISEC2=LBT(JJ) 6826 DO2041LL=1,NN 6827 IF(IFRST.EQ.NER(LL))THEN 6828 XVAL1=REAL(LL) 6829 XVAL3=REAL(LL) 6830 ENDIF 6831 IF(ISEC2.EQ.NER(LL))THEN 6832 XVAL2=REAL(LL) - 0.5 6833 XVAL4=REAL(LL) 6834 ENDIF 6835 2041 CONTINUE 6836 YVAL1=0.0 6837 YVAL2=AVAL1 6838 YVAL3=BANLBT(JJ) 6839 IFLAGR=1 6840 GOTO2049 6841 ENDIF 6842 2040 CONTINUE 6843 2049 CONTINUE 6844C 6845 IF(IBUGA3.EQ.'ON' .OR. ISUBRO.EQ.'VERL')THEN 6846 WRITE(ICOUT,2045)KK,IFLAGR 6847 2045 FORMAT('AFTER 2040 LOOP: KK,IFLAGR = ',2I6) 6848 CALL DPWRST('XXX','BUG ') 6849 WRITE(ICOUT,2038)XVAL1,XVAL2,YVAL1,YVAL2,YVAL3 6850 CALL DPWRST('XXX','BUG ') 6851 ENDIF 6852C 6853 ELSEIF(IFLAGL.EQ.1)THEN 6854 DO2050JJ=KK-1,1,-1 6855 IF(ISEC.EQ.LAT(JJ))THEN 6856C 6857C MATCH WITH PREVIOUS CLUSTER FOUND 6858C 6859 ISEC2=LBT(JJ) 6860 DO2053LL=1,NN 6861 IF(ISEC2.EQ.NER(LL))THEN 6862 XVAL2=REAL(LL) - 0.5 6863 XVAL4=REAL(LL) 6864 ENDIF 6865 2053 CONTINUE 6866 YVAL2=AVAL1 6867 YVAL3=BANLBT(JJ) 6868 IFLAGR=1 6869 GOTO2059 6870 ENDIF 6871 2050 CONTINUE 6872 2059 CONTINUE 6873C 6874 IF(IBUGA3.EQ.'ON' .OR. ISUBRO.EQ.'VERL')THEN 6875 WRITE(ICOUT,2055)KK,IFLAGR 6876 2055 FORMAT('AFTER 2050 LOOP: KK,IFLAGR = ',2I6) 6877 CALL DPWRST('XXX','BUG ') 6878 WRITE(ICOUT,2038)XVAL1,XVAL2,YVAL1,YVAL2,YVAL3 6879 CALL DPWRST('XXX','BUG ') 6880 ENDIF 6881C 6882 ENDIF 6883C 6884 IF(IFLAGL.EQ.1 .OR. IFLAGR.EQ.1)THEN 6885 ITAG=ITAG+1 6886 WRITE(IOUNI3,'(3E15.7)')XVAL1,YVAL1,REAL(ITAG) 6887 WRITE(IOUNI3,'(3E15.7)')XVAL1,YVAL2,REAL(ITAG) 6888 WRITE(IOUNI3,'(3E15.7)')XVAL2,YVAL2,REAL(ITAG) 6889 WRITE(IOUNI3,'(3E15.7)')XVAL2,YVAL3,REAL(ITAG) 6890C 6891 ELSE 6892C 6893C NO MATCH WITH PREVIOUS CLUSTER, SO CREATING A 6894C NEW CLUSTER 6895C 6896 DO2061JJ=1,NN 6897 IF(IFRST.EQ.NER(JJ))THEN 6898 XVAL1=REAL(JJ) 6899 XVAL3=REAL(JJ) 6900 ENDIF 6901 IF(ISEC.EQ.NER(JJ))THEN 6902 XVAL2=REAL(JJ) 6903 XVAL4=REAL(JJ) 6904 ENDIF 6905 2061 CONTINUE 6906 YVAL1=0.0 6907 YVAL2=AVAL1 6908 ITAG=ITAG+1 6909 WRITE(IOUNI3,'(3E15.7)')XVAL1,YVAL1,REAL(ITAG) 6910 WRITE(IOUNI3,'(3E15.7)')XVAL1,YVAL2,REAL(ITAG) 6911 WRITE(IOUNI3,'(3E15.7)')XVAL2,YVAL2,REAL(ITAG) 6912 WRITE(IOUNI3,'(3E15.7)')XVAL2,YVAL1,REAL(ITAG) 6913C 6914 ENDIF 6915 ENDIF 6916C 6917 2020 CONTINUE 6918C 6919 ITAG=0 6920 DO3020KK=1,ICNT 6921C 6922C LB IDENTIFIES "RIGHT HAND SIDE" OF BRANCH. LA IDENTIFIES 6923C WHICH CLUSTER IT IS JOINING. NER WILL BE USED TO IDENTIFY 6924C THE APPROPRIATE X-COORDINATE. 6925C 6926 IFRST=LAT(KK) 6927 ISEC=LBT(KK) 6928 NCLU=NCLUT(KK) 6929 YVAL=REAL(NCLU) 6930C 6931 IF(KK.EQ.1)THEN 6932C 6933C FIRST CLUSTER BEING FORMED 6934C 6935 XVAL1=1.0 6936 XVAL2=2.0 6937 DO3021JJ=1,NN 6938 IF(IFRST.EQ.NER(JJ))XVAL1=REAL(JJ) 6939 IF(ISEC.EQ.NER(JJ))XVAL2=REAL(JJ) 6940 3021 CONTINUE 6941C 6942 XVAL1=(XVAL1-1.0)*2.0 + 1.0 6943 XVAL2=XVAL1 + 1.0 6944 XVAL3=(XVAL2-1.0)*2.0 + 1.0 6945 ITAG=ITAG+1 6946 WRITE(IOUNI2,'(3E15.7)')XVAL1,YVAL,REAL(ITAG) 6947 ITAG=ITAG+1 6948 WRITE(IOUNI2,'(3E15.7)')XVAL2,YVAL,REAL(ITAG) 6949 ITAG=ITAG+1 6950 WRITE(IOUNI2,'(3E15.7)')XVAL3,YVAL,REAL(ITAG) 6951C 6952 ELSEIF(KK.EQ.ICNT)THEN 6953C 6954C LAST CLUSTER BEING FORMED 6955C 6956 XVAL2=2.0 6957 DO307JJ=1,NN 6958 IF(ISEC.EQ.NER(JJ))THEN 6959 XVAL2=REAL(JJ) 6960 XVAL2=(XVAL2-1.0)*2.0 + 1.0 6961 XVAL1=XVAL2 - 1.0 6962 ENDIF 6963 307 CONTINUE 6964 ITAG=ITAG+1 6965 WRITE(IOUNI2,'(3E15.7)')XVAL1,YVAL,REAL(ITAG) 6966 ITAG=ITAG+1 6967 WRITE(IOUNI2,'(3E15.7)')XVAL2,YVAL,REAL(ITAG) 6968 ELSE 6969C 6970C IF NOT THE FIRST, THEN CHECK IF LA MATCHES ANY 6971C PREVIOUS LA. 6972C 6973 IFLAGL=0 6974 IFLAGR=0 6975 IFLAG3=0 6976 DO3030JJ=KK-1,1,-1 6977 IF(IFRST.EQ.LAT(JJ))THEN 6978C 6979C MATCH WITH PREVIOUS CLUSTER FOUND 6980C 6981 ISEC2=LBT(JJ) 6982 DO3031LL=1,NN 6983 IF(ISEC2.EQ.NER(LL))THEN 6984 XVAL1=REAL(LL) 6985 XVAL1=(XVAL1-1.0)*2.0 + 2.0 6986 ENDIF 6987 IF(ISEC.EQ.NER(LL))THEN 6988 XVAL2=REAL(LL) 6989 XVAL2=(XVAL2-1.0)*2.0 + 1.0 6990 IFLAG3=1 6991 ENDIF 6992 3031 CONTINUE 6993 IFLAGL=1 6994 GOTO3039 6995 ENDIF 6996 3030 CONTINUE 6997 3039 CONTINUE 6998C 6999 IF(IBUGA3.EQ.'ON' .OR. ISUBRO.EQ.'VERL')THEN 7000 WRITE(ICOUT,3035)KK,IFLAGL,XVAL1,XVAL2,YVAL 7001 3035 FORMAT('AFTER 3030 LOOP: KK,IFLAGL,XVAL1,XVAL2,YVAL = ', 7002 1 2I6,3G15.7) 7003 CALL DPWRST('XXX','BUG ') 7004 WRITE(ICOUT,3036)IFRST,ISEC,ISEC2 7005 3036 FORMAT(' IFRST,ISEC,ISEC2 = ',3I6) 7006 CALL DPWRST('XXX','BUG ') 7007 ENDIF 7008C 7009C IF NOT THE FIRST AND IF LA DOES NOT MATCH ANY PREVIOUS LA, 7010C THEN CHECK IF LB MATCHES ANY PREVIOUS LA. 7011C 7012 IF(IFLAGL.EQ.0)THEN 7013 DO3040JJ=KK-1,1,-1 7014 IF(ISEC.EQ.LAT(JJ))THEN 7015C 7016C MATCH WITH PREVIOUS CLUSTER FOUND 7017C 7018 ISEC2=LBT(JJ) 7019 DO3041LL=1,NN 7020 IF(IFRST.EQ.NER(LL))THEN 7021 XVAL1=REAL(LL) 7022 XVAL1=(XVAL1-1.0)*2.0 + 1.0 7023 ENDIF 7024 IF(ISEC2.EQ.NER(LL))THEN 7025 XVAL2=REAL(LL) 7026 XVAL2=(XVAL1-2.0)*2.0 + 1.0 7027 ENDIF 7028 3041 CONTINUE 7029 GOTO3049 7030 ENDIF 7031 3040 CONTINUE 7032 3049 CONTINUE 7033C 7034 IF(IBUGA3.EQ.'ON' .OR. ISUBRO.EQ.'VERL')THEN 7035 WRITE(ICOUT,3045)KK,IFLAGR 7036 3045 FORMAT('AFTER 3040 LOOP: KK,IFLAGR = ',2I6) 7037 CALL DPWRST('XXX','BUG ') 7038 WRITE(ICOUT,3048)XVAL1,XVAL2,YVAL 7039 3048 FORMAT('XVAL1,XVAL2,YVAL = ',3G15.7) 7040 CALL DPWRST('XXX','BUG ') 7041 ENDIF 7042C 7043 ELSEIF(IFLAGL.EQ.1 .AND. IFLAG3.EQ.0)THEN 7044 DO3050JJ=KK-1,1,-1 7045 IF(ISEC.EQ.LAT(JJ))THEN 7046C 7047C MATCH WITH PREVIOUS CLUSTER FOUND 7048C 7049 ISEC2=LBT(JJ) 7050 DO3053LL=1,NN 7051 IF(ISEC2.EQ.NER(LL))THEN 7052 XVAL2=REAL(LL) 7053 XVAL2=(XVAL2-2.0)*2.0 + 1.0 7054 ENDIF 7055 3053 CONTINUE 7056 IFLAGR=1 7057 GOTO3059 7058 ENDIF 7059 3050 CONTINUE 7060 3059 CONTINUE 7061C 7062 IF(IBUGA3.EQ.'ON' .OR. ISUBRO.EQ.'VERL')THEN 7063 WRITE(ICOUT,3055)KK,IFLAGR 7064 3055 FORMAT('AFTER 3050 LOOP: KK,IFLAGR = ',2I6) 7065 CALL DPWRST('XXX','BUG ') 7066 WRITE(ICOUT,3048)XVAL1,XVAL2,YVAL 7067 CALL DPWRST('XXX','BUG ') 7068 ENDIF 7069C 7070 ENDIF 7071C 7072 IF(IFLAGL.EQ.1 .OR. IFLAGR.EQ.1)THEN 7073 ITAG=ITAG+1 7074 WRITE(IOUNI2,'(3E15.7)')XVAL1,YVAL,REAL(ITAG) 7075 ITAG=ITAG+1 7076 WRITE(IOUNI2,'(3E15.7)')XVAL2,YVAL,REAL(ITAG) 7077 ELSE 7078C 7079C NO MATCH WITH PREVIOUS CLUSTER, SO CREATING A 7080C NEW CLUSTER 7081C 7082 DO3061JJ=1,NN 7083 IF(IFRST.EQ.NER(JJ))THEN 7084 XVAL1=REAL(JJ) 7085 XVAL1=2.0*XVAL1 7086 ENDIF 7087 IF(ISEC.EQ.NER(JJ))THEN 7088 XVAL2=REAL(JJ) 7089 XVAL2=2.0*(XVAL2-1.0) + 1.0 7090 ENDIF 7091 3061 CONTINUE 7092 ITAG=ITAG+1 7093 WRITE(IOUNI2,'(3E15.7)')XVAL1,YVAL,REAL(ITAG) 7094 ITAG=ITAG+1 7095 WRITE(IOUNI2,'(3E15.7)')XVAL2,YVAL,REAL(ITAG) 7096 ENDIF 7097 ENDIF 7098C 7099 3020 CONTINUE 7100C 7101C 7102 2019 CONTINUE 7103C 7104 IF(IBUGA3.EQ.'ON')THEN 7105 WRITE(ICOUT,9905) 7106 9905 FORMAT('END OF SPLYT') 7107 CALL DPWRST('XXX','BUG ') 7108 ENDIF 7109C 7110 RETURN 7111 END 7112 SUBROUTINE BANAG(NN,BAN,NER,IOUNI5,IAGNBA,ISUBRO,IERROR) 7113C 7114CNIST SUBROUTINE BANAG(NN,MAXNN,BAN,NER,LAB,NUM,LUB) 7115C 7116C ORIGINAL ROUTINE USED TO DRAW "BANNER" LINE PRINTER GRAPH FOR 7117C AGGLOMERATIVE CLUSTERING ALGORITHM (AGNES). 7118C 7119C NN = NUMBER OF ROWS IN THE DISSIMILARITY MATRIX 7120C MAXNN = MAXIMUM NUMBER OF ROWS ALLOWED (NOT USED) 7121C BAN = DISSIMILARITIES BETWEEN CLUSTERS 7122C NER = FINAL ORDERING OF THE OBJECTS 7123C LAB = OBJECT LABELS 7124C NUM = LABELING STRING FOR BANNER PLOT 7125C LUB = OUTPUT UNIT FOR PRINTING (NOT USED) 7126C 7127 DIMENSION BAN(*) 7128 DIMENSION NER(*) 7129C 7130 CHARACTER*4 IAGNBA 7131 CHARACTER*4 ISUBRO 7132 CHARACTER*4 IERROR 7133C 7134CNIST CHARACTER*1 LAB(3,MAXNN) 7135 CHARACTER*3 LAB 7136 CHARACTER*1 JDRAW(78) 7137 CHARACTER*1 NUM(13) 7138 CHARACTER*1 JBLAN,JSTAR,JSEPA 7139C 7140 INCLUDE 'DPCOP2.INC' 7141C 7142 IERROR='NO' 7143C 7144 NUM(1)='0' 7145 NUM(2)='1' 7146 NUM(3)='2' 7147 NUM(4)='3' 7148 NUM(5)='4' 7149 NUM(6)='5' 7150 NUM(7)='6' 7151 NUM(8)='7' 7152 NUM(9)='8' 7153 NUM(10)='9' 7154 NUM(11)=' ' 7155 NUM(12)='*' 7156 NUM(13)='+' 7157 JBLAN=NUM(11) 7158 JSTAR=NUM(12) 7159 JSEPA=NUM(13) 7160C 7161 IF(IPRINT.EQ.'ON' .AND. IAGNBA.EQ.'ON')THEN 7162 WRITE(ICOUT,999) 7163 999 FORMAT(1X) 7164 CALL DPWRST('XXX','BUG ') 7165 WRITE(ICOUT,999) 7166 CALL DPWRST('XXX','BUG ') 7167 WRITE(ICOUT,999) 7168 CALL DPWRST('XXX','BUG ') 7169 WRITE(ICOUT,9000) 7170 9000 FORMAT(34X,'************') 7171 CALL DPWRST('XXX','BUG ') 7172 WRITE(ICOUT,9001) 7173 9001 FORMAT(34X,'*',10X,'*') 7174 CALL DPWRST('XXX','BUG ') 7175 WRITE(ICOUT,9002) 7176 9002 FORMAT(34X,'* BANNER *') 7177 CALL DPWRST('XXX','BUG ') 7178 WRITE(ICOUT,9001) 7179 CALL DPWRST('XXX','BUG ') 7180 WRITE(ICOUT,9000) 7181 CALL DPWRST('XXX','BUG ') 7182C 7183 WRITE(ICOUT,999) 7184 CALL DPWRST('XXX','BUG ') 7185 WRITE(ICOUT,999) 7186 CALL DPWRST('XXX','BUG ') 7187 WRITE(ICOUT,9200) 7188 9200 FORMAT(25('0 '),'1') 7189 CALL DPWRST('XXX','BUG ') 7190 WRITE(ICOUT,9201) 7191 9201 FORMAT(26('. ')) 7192 CALL DPWRST('XXX','BUG ') 7193C 7194 WRITE(ICOUT,9210) 7195 9210 FORMAT('0 0 0 1 1 2 2 2 3 3 4 4 4 5 5 ', 7196 1 '6 6 6 7 7 8 8 8 9 9 0') 7197 CALL DPWRST('XXX','BUG ') 7198C 7199 WRITE(ICOUT,9220) 7200 9220 FORMAT(5('0 4 8 2 6 '),'0') 7201 CALL DPWRST('XXX','BUG ') 7202 WRITE(ICOUT,999) 7203 CALL DPWRST('XXX','BUG ') 7204 WRITE(ICOUT,999) 7205 CALL DPWRST('XXX','BUG ') 7206 ENDIF 7207C 7208C SUP = MAXIMUM VALUE IN BAN 7209C 7210 SUP=0.0 7211 DO 70 K=2,NN 7212 IF(BAN(K).GT.SUP)SUP=BAN(K) 7213 70 CONTINUE 7214C 7215 AC=0.0 7216 DO 80 K=1,NN 7217 KEARL=K 7218 IF(K.EQ.1)KEARL=2 7219 KAFTE=K+1 7220 IF(K.EQ.NN)KAFTE=NN 7221 SYZE=BAN(KEARL) 7222 IF(BAN(KAFTE).LT.SYZE)SYZE=BAN(KAFTE) 7223C 7224 IF(ISUBRO.EQ.'ANAG')THEN 7225 WRITE(ICOUT,8001)IOUNI5,K,KEARL,KAFTE,SYZE 7226 8001 FORMAT('IOUNI5,K,KEARL,KAFTE,SYZE = ',4I8,G15.7) 7227 CALL DPWRST('XXX','BUG ') 7228 ENDIF 7229C 7230 AC=AC+1.0-(SYZE/SUP) 7231 LEMPT=INT((SYZE/SUP)*75.0+0.01) 7232 IF(LEMPT.NE.0)THEN 7233 DO 81 L=1,LEMPT 7234 JDRAW(L)=JBLAN 7235 81 CONTINUE 7236 ENDIF 7237 LADD=LEMPT+1 7238 KAUNT=0 7239 NCASE=NER(K) 7240C 7241 LAB='000' 7242 IF(NCASE.LE.9)THEN 7243 WRITE(LAB(3:3),'(I1)')NCASE 7244 ELSEIF(NCASE.LE.99)THEN 7245 WRITE(LAB(2:3),'(I2)')NCASE 7246 ELSE 7247 WRITE(LAB(1:3),'(I3)')NCASE 7248 ENDIF 7249C 7250 DO 83 L=LADD,78 7251 KAUNT=KAUNT+1 7252 IF(KAUNT.EQ.5)KAUNT=1 7253 IF(KAUNT.EQ.1)JDRAW(L)=LAB(1:1) 7254 IF(KAUNT.EQ.2)JDRAW(L)=LAB(2:2) 7255 IF(KAUNT.EQ.3)JDRAW(L)=LAB(3:3) 7256 IF(KAUNT.EQ.4)JDRAW(L)=JSEPA 7257 83 CONTINUE 7258 IF(IPRINT.EQ.'ON' .AND. IAGNBA.EQ.'ON')THEN 7259 WRITE(ICOUT,9100)(JDRAW(J),J=1,78) 7260 9100 FORMAT(1X,78A1) 7261 CALL DPWRST('XXX','BUG ') 7262 ENDIF 7263C 7264 IF(K.EQ.NN)GO TO 90 7265 SYZE=BAN(KAFTE) 7266 LEMPT=INT((SYZE/SUP)*75.0+0.01) 7267 IF(LEMPT.EQ.0)GOTO 86 7268 DO 85 L=1,LEMPT 7269 JDRAW(L)=JBLAN 7270 85 CONTINUE 7271 86 CONTINUE 7272 LADD=LEMPT+1 7273C 7274 DO 87 L=LADD,78 7275 JDRAW(L)=JSTAR 7276 87 CONTINUE 7277 IF(IPRINT.EQ.'ON' .AND. IAGNBA.EQ.'ON')THEN 7278 WRITE(ICOUT,9100)(JDRAW(J),J=1,78) 7279 CALL DPWRST('XXX','BUG ') 7280 ENDIF 7281 80 CONTINUE 7282C 7283 90 CONTINUE 7284C 7285 IF(IPRINT.EQ.'ON' .AND. IAGNBA.EQ.'ON')THEN 7286 WRITE(ICOUT,9200) 7287 CALL DPWRST('XXX','BUG ') 7288 WRITE(ICOUT,9201) 7289 CALL DPWRST('XXX','BUG ') 7290 WRITE(ICOUT,9210) 7291 CALL DPWRST('XXX','BUG ') 7292 WRITE(ICOUT,9220) 7293 CALL DPWRST('XXX','BUG ') 7294 WRITE(ICOUT,999) 7295 CALL DPWRST('XXX','BUG ') 7296 ENDIF 7297C 7298 IF(IPRINT.EQ.'ON')THEN 7299 WRITE(ICOUT,999) 7300 CALL DPWRST('XXX','BUG ') 7301 WRITE(ICOUT,9300)SUP 7302 9300 FORMAT(' THE ACTUAL HIGHEST LEVEL IS ',F25.10) 7303 CALL DPWRST('XXX','BUG ') 7304 ENDIF 7305C 7306 RNN=NN 7307 AC=AC/RNN 7308 IF(IPRINT.EQ.'ON')THEN 7309 WRITE(ICOUT,999) 7310 CALL DPWRST('XXX','BUG ') 7311 WRITE(ICOUT,999) 7312 CALL DPWRST('XXX','BUG ') 7313 WRITE(ICOUT,9310)AC 7314 9310 FORMAT(' THE AGGLOMERATIVE COEFFICIENT OF THIS DATA SET IS ', 7315 1 F5.2) 7316 CALL DPWRST('XXX','BUG ') 7317 ENDIF 7318C 7319 RETURN 7320 END 7321 SUBROUTINE BANDY(NN,BAN,NER,IOUNI5,IAGNBA,ISUBRO,IERROR) 7322C 7323CNIST SUBROUTINE BANDY(NN,MAXNN,BAN,NER,LAB,NUM,LUB) 7324C 7325C ORIGINAL ROUTINE USED TO DRAW "BANNER" LINE PRINTER GRAPH FOR 7326C DIVISIVE CLUSTERING ALGORITHM (DIANA). 7327C 7328C NN = NUMBER OF ROWS IN THE DISSIMILARITY MATRIX 7329C MAXNN = MAXIMUM NUMBER OF ROWS ALLOWED (NOT USED) 7330C BAN = DISSIMILARITIES BETWEEN CLUSTERS 7331C NER = FINAL ORDERING OF THE OBJECTS 7332C LAB = OBJECT LABELS 7333C NUM = LABELING STRING FOR BANNER PLOT 7334C LUB = OUTPUT UNIT FOR PRINTING (NOT USED) 7335C 7336 DIMENSION BAN(*) 7337 DIMENSION NER(*) 7338C 7339CNIST CHARACTER*1 LAB(3,MAXNN),JDRAW(78),NUM(13),JSTAR,JSEPA 7340 CHARACTER*4 IAGNBA 7341 CHARACTER*4 ISUBRO 7342 CHARACTER*4 IERROR 7343C 7344 CHARACTER*3 LAB 7345 CHARACTER*1 JDRAW(78) 7346 CHARACTER*1 NUM(13) 7347 CHARACTER*1 JBLAN,JSTAR,JSEPA 7348C 7349 INCLUDE 'DPCOP2.INC' 7350C 7351 IERROR='YES' 7352C 7353 IF(ISUBRO.EQ.'ANDY')THEN 7354 WRITE(ICOUT,52)IOUNI5 7355 52 FORMAT('IOUNI5 = ',I5) 7356 CALL DPWRST('XXX','BUG ') 7357 ENDIF 7358C 7359 NUM(1)='0' 7360 NUM(2)='1' 7361 NUM(3)='2' 7362 NUM(4)='3' 7363 NUM(5)='4' 7364 NUM(6)='5' 7365 NUM(7)='6' 7366 NUM(8)='7' 7367 NUM(9)='8' 7368 NUM(10)='9' 7369 NUM(11)=' ' 7370 NUM(12)='*' 7371 NUM(13)='+' 7372 JBLAN=NUM(11) 7373 JSTAR=NUM(12) 7374 JSEPA=NUM(13) 7375C 7376 IF(IPRINT.EQ.'ON' .AND. IAGNBA.EQ.'ON')THEN 7377 WRITE(ICOUT,999) 7378 999 FORMAT(1X) 7379 CALL DPWRST('XXX','BUG ') 7380 WRITE(ICOUT,999) 7381 CALL DPWRST('XXX','BUG ') 7382 WRITE(ICOUT,999) 7383 CALL DPWRST('XXX','BUG ') 7384 WRITE(ICOUT,9000) 7385 9000 FORMAT(34X,'************') 7386 CALL DPWRST('XXX','BUG ') 7387 WRITE(ICOUT,9001) 7388 9001 FORMAT(34X,'*',10X,'*') 7389 CALL DPWRST('XXX','BUG ') 7390 WRITE(ICOUT,9002) 7391 9002 FORMAT(34X,'* BANNER *') 7392 CALL DPWRST('XXX','BUG ') 7393 WRITE(ICOUT,9001) 7394 CALL DPWRST('XXX','BUG ') 7395 WRITE(ICOUT,9000) 7396 CALL DPWRST('XXX','BUG ') 7397C 7398 WRITE(ICOUT,999) 7399 CALL DPWRST('XXX','BUG ') 7400 WRITE(ICOUT,999) 7401 CALL DPWRST('XXX','BUG ') 7402 WRITE(ICOUT,9200) 7403 9200 FORMAT(25('0 '),'1') 7404 CALL DPWRST('XXX','BUG ') 7405 WRITE(ICOUT,9201) 7406 9201 FORMAT(26('. ')) 7407 CALL DPWRST('XXX','BUG ') 7408C 7409 WRITE(ICOUT,9210) 7410 9210 FORMAT('0 0 0 1 1 2 2 2 3 3 4 4 4 5 5 ', 7411 1 '6 6 6 7 7 8 8 8 9 9 0') 7412 CALL DPWRST('XXX','BUG ') 7413C 7414 WRITE(ICOUT,9220) 7415 9220 FORMAT(5('0 4 8 2 6 '),'0') 7416 CALL DPWRST('XXX','BUG ') 7417 WRITE(ICOUT,999) 7418 CALL DPWRST('XXX','BUG ') 7419 WRITE(ICOUT,999) 7420 CALL DPWRST('XXX','BUG ') 7421 ENDIF 7422C 7423C SUP = MAXIMUM VALUE IN BAN 7424C 7425 SUP=0.0 7426 DO 70 K=2,NN 7427 IF(BAN(K).GT.SUP)SUP=BAN(K) 7428 70 CONTINUE 7429 DO 71 K=2,NN 7430 BAN(K)=BAN(K)/SUP 7431 71 CONTINUE 7432C 7433 DC=0.0 7434 DO 80 K=1,NN 7435 NCASE=NER(K) 7436 DO 81 L=1,19 7437 LALFA=(L-1)*4+1 7438 LBETA=(L-1)*4+2 7439 LGAMA=(L-1)*4+3 7440 LDELT=L*4 7441 LAB='000' 7442 IF(NCASE.LE.9)THEN 7443 WRITE(LAB(3:3),'(I1)')NCASE 7444 ELSEIF(NCASE.LE.99)THEN 7445 WRITE(LAB(2:3),'(I2)')NCASE 7446 ELSE 7447 WRITE(LAB(1:3),'(I3)')NCASE 7448 ENDIF 7449 JDRAW(LALFA)=LAB(1:1) 7450 JDRAW(LBETA)=LAB(2:2) 7451 JDRAW(LGAMA)=LAB(3:3) 7452 JDRAW(LDELT)=JSEPA 7453 81 CONTINUE 7454C 7455 JDRAW(77)=LAB(1:1) 7456 JDRAW(78)=LAB(2:2) 7457 KEARL=K 7458 IF(K.EQ.1)KEARL=2 7459 KAFTE=K+1 7460 IF(K.EQ.NN)KAFTE=NN 7461 SYZE=BAN(KEARL) 7462 IF(BAN(KAFTE).LT.SYZE)SYZE=BAN(KAFTE) 7463 DC=DC+1.0-SYZE 7464 LENGT=INT((1.0-SYZE)*75.0+0.01)+3 7465C 7466 IF(IPRINT.EQ.'ON' .AND. IAGNBA.EQ.'ON')THEN 7467 WRITE(ICOUT,9100)(JDRAW(J),J=1,LENGT) 7468 9100 FORMAT(1X,78A1) 7469 CALL DPWRST('XXX','BUG ') 7470 ENDIF 7471C 7472 IF(K.EQ.NN)GO TO 90 7473 SYZE=BAN(KAFTE) 7474 LENGT=INT((1.0-SYZE)*75.0+0.01)+3 7475 DO 82 L=1,LENGT 7476 JDRAW(L)=JSTAR 7477 82 CONTINUE 7478C 7479 IF(IPRINT.EQ.'ON' .AND. IAGNBA.EQ.'ON')THEN 7480 WRITE(ICOUT,9100)(JDRAW(J),J=1,LENGT) 7481 CALL DPWRST('XXX','BUG ') 7482 ENDIF 7483C 7484 80 CONTINUE 7485 90 CONTINUE 7486C 7487 IF(IPRINT.EQ.'ON' .AND. IAGNBA.EQ.'ON')THEN 7488 WRITE(ICOUT,9200) 7489 CALL DPWRST('XXX','BUG ') 7490 WRITE(ICOUT,9201) 7491 CALL DPWRST('XXX','BUG ') 7492 WRITE(ICOUT,9210) 7493 CALL DPWRST('XXX','BUG ') 7494 WRITE(ICOUT,9220) 7495 CALL DPWRST('XXX','BUG ') 7496 WRITE(ICOUT,999) 7497 CALL DPWRST('XXX','BUG ') 7498 ENDIF 7499C 7500 IF(IPRINT.EQ.'ON')THEN 7501 WRITE(ICOUT,999) 7502 CALL DPWRST('XXX','BUG ') 7503 WRITE(ICOUT,999) 7504 CALL DPWRST('XXX','BUG ') 7505 WRITE(ICOUT,9300)SUP 7506 9300 FORMAT(' THE ACTUAL DIAMETER OF THIS DATA SET IS ',F25.10) 7507 CALL DPWRST('XXX','BUG ') 7508 ENDIF 7509C 7510 RNN=NN 7511 DC=DC/RNN 7512 IF(IPRINT.EQ.'ON')THEN 7513 WRITE(ICOUT,999) 7514 CALL DPWRST('XXX','BUG ') 7515 WRITE(ICOUT,999) 7516 CALL DPWRST('XXX','BUG ') 7517 WRITE(ICOUT,9310)DC 7518 9310 FORMAT(' THE DIVISIVE COEFFICIENT OF THIS DATA SET IS ',F5.2) 7519 CALL DPWRST('XXX','BUG ') 7520 ENDIF 7521C 7522 RETURN 7523 END 7524 SUBROUTINE DYSTAF(NN,JPP,MAXNN,MAXPP,X,DSS,NDYST,AMISS,JHALT, 7525 1 ISUBRO,IBUGA3) 7526C 7527CNIST SUBROUTINE DYSTA(NN,JPP,MAXNN,MAXPP,MAXHH,X,DSS,NDYST, 7528CNISTF JTMD,VALMD,LAB,JHALT,LUB,FNAMEB) 7529C 7530C KAUFFMAN AND ROUSSEEUW CODE FROM FANNY ALGORITHM. THIS 7531C ROUTINE COMPUTES EITHER EUCLIDEAN DISTANCE OR MANHATTAN 7532C DISTANCE BETWEEN ALL OBJECTS (FANNY VERSION). 7533C 7534C NN = NUMBER OF SAMPLES 7535C JPP = NUMBER OF VARIABLES 7536C MAXNNN = THE ROW DIMENSION OF X 7537C MAXPP = THE COLUMN DIMENSION OF X 7538C MAXHH = THE MAXIMUM DIMENSION FOR THE DISTANCES 7539C (DATAPLOT DOES NOT USE) 7540C X = THE DATA MATRIX 7541C DSS = THE OUTPUT MATRIX CONTAINING THE DISTANCES 7542C NDYST = 1 => EUCLIDEAN DISTANCES 7543C 2 => MANHATTAN (= CITY BLOCK) DISTANCES 7544C JTMD = FOR MISSING VALUES, WE DON'T USE 7545C VALMD = FOR MISSING VALUES, WE DON'T USE 7546C LAB = OBJECT LABELS 7547C (DATAPLOT AUTOMATICALLY USES ROW-ID) 7548C JHALT = SET TO 1 FOR ERROR CONDITION 7549C LUB = OUTPUT UNIT 7550C (DATAPLOT DOES NOT USE) 7551C FNAMEB = OUTPUT FILE NAME 7552C (DATAPLOT DOES NOT USE) 7553C 7554C CHANGES FOR INCORPORATING INTO DATAPLOT: 7555C 7556C 1. USE DATAPLOT I/O ROUTINES 7557C 2. FOR DATAPLOT, ONLY USE A SINGLE VALUE TO DENOTE 7558C MISSING VALUES 7559C 3. RECODED SLIGHTLY TO REDUCE USE OF GO TO's (THIS 7560C WAS JUST TO IMPROVE READABILITY OF THE CODE) 7561C 7562 DIMENSION X(MAXNN,MAXPP) 7563 DIMENSION DSS(*) 7564CNIST DIMENSION JTMD(MAXPP) 7565CNIST DIMENSION VALMD(MAXPP) 7566CNIST CHARACTER LAB(3,MAXNN) 7567CNIST CHARACTER*30 FNAMEB 7568C 7569 CHARACTER*4 IBUGA3 7570 CHARACTER*4 ISUBRO 7571C 7572 INCLUDE 'DPCOP2.INC' 7573C 7574 IF(IBUGA3.EQ.'ON' .OR. ISUBRO.EQ.'YSTA')THEN 7575 WRITE(ICOUT,51) 7576 51 FORMAT('***** AT THE BEGINNING OF DYSTAF--') 7577 CALL DPWRST('XXX','BUG ') 7578 WRITE(ICOUT,55)NN,JPP,AMISS 7579 55 FORMAT('NN,JPP,AMISS = ',2I8,G15.7) 7580 CALL DPWRST('XXX','BUG ') 7581 DO58I=1,NN 7582 WRITE(ICOUT,59)I,(X(I,J),J=1,JPP) 7583 59 FORMAT('I,X(I,J) = ',I8,30G15.7) 7584 CALL DPWRST('XXX','BUG ') 7585 58 CONTINUE 7586 ENDIF 7587C 7588 JHALT=0 7589 PP=JPP 7590 NNSUB=NN-1 7591 NLK=0 7592 DO 100 L=1,NNSUB 7593 LPLUS=L+1 7594 DO 20 K=LPLUS,NN 7595 CLK=0.0 7596 NLK=NLK+1 7597 NPRES=0 7598 DO 30 J=1,JPP 7599CNIST IF(JTMD(J).GE.0)GOTO 40 7600CNIST IF(X(L,J).EQ.VALMD(J))GOTO 30 7601CNIST IF(X(K,J).EQ.VALMD(J))GOTO 30 7602 IF(X(K,J).EQ.AMISS)GOTO30 7603CNI40 CONTINUE 7604 NPRES=NPRES+1 7605 IF(NDYST.NE.1)THEN 7606 CLK=CLK+ABS(X(L,J)-X(K,J)) 7607 ELSE 7608 CLK=CLK+(X(L,J)-X(K,J))*(X(L,J)-X(K,J)) 7609 ENDIF 7610 30 CONTINUE 7611 RPRES=NPRES 7612 IF(NPRES.EQ.0)THEN 7613 JHALT=1 7614CNIST WRITE(LUB,9400)LAB(1,L),LAB(2,L),LAB(3,L), 7615CNIST1 LAB(1,K),LAB(2,K),LAB(3,K) 7616C9400 FORMAT(' OBJECTS ',3A1,' AND ',3A1) 7617CNIST1 ' HAVE NO COMMON MEASUREMENTS') 7618CNIST IF(FNAMEB.NE.'CON')WRITE(*,9400)LAB(1,L),LAB(2,L),LAB(3,L), 7619CNIST1 LAB(1,K),LAB(2,K),LAB(3,K) 7620 WRITE(ICOUT,999) 7621 999 FORMAT(1X) 7622 CALL DPWRST('XXX','BUG ') 7623 WRITE(ICOUT,9401)L,K 7624 9401 FORMAT('***** OBJECTS ',I8,' AND ',I8,' HAVE NO ', 7625 1 'COMMON MEASURE, SO') 7626 CALL DPWRST('XXX','BUG ') 7627 WRITE(ICOUT,9403) 7628 9403 FORMAT(' THE DISTANCE BETWEEN THEM CANNOT BE ', 7629 1 'COMPUTED.') 7630 CALL DPWRST('XXX','BUG ') 7631 DSS(NLK)=0.0 7632 GOTO 20 7633 ENDIF 7634 IF(NDYST.NE.1)THEN 7635 DSS(NLK)=CLK*(PP/RPRES) 7636 ELSE 7637 DSS(NLK)=SQRT(CLK*(PP/RPRES)) 7638 ENDIF 7639 20 CONTINUE 7640 100 CONTINUE 7641C 7642 RETURN 7643 END 7644CC 7645CC 7646 SUBROUTINE CADDY(NN,MAXNN,P,K,KTRUE, 7647 1 NFUZZ,NCLUV,RDRAW,NELEM,EDA,EDB, 7648 1 IOUNI1,IOUNI2,IBUGA3,ISUBRO) 7649C 7650CNIST SUBROUTINE CADDY(NN,MAXNN,MAXKK,P,LAB,K,KTRUE,LUB, 7651CNIST1 NFUZZ,NCLUV,RDRAW,NELEM,EDA,EDB) 7652C 7653 DIMENSION P(MAXNN,*) 7654 DIMENSION RDRAW(*) 7655 DIMENSION NCLUV(*) 7656 DIMENSION NELEM(*) 7657 DIMENSION NFUZZ(*) 7658 CHARACTER JDRAW(30) 7659CNIST CHARACTER LAB(3,MAXNN) 7660 CHARACTER*3 LAB 7661C 7662 CHARACTER*4 ISUBRO 7663 CHARACTER*4 IBUGA3 7664C 7665 INCLUDE 'DPCOP2.INC' 7666C 7667 IF(IBUGA3.EQ.'ON' .OR. ISUBRO.EQ.'ADDY')THEN 7668 WRITE(ICOUT,5) 7669 5 FORMAT('AT THE BEGINNING OF CADDY') 7670 CALL DPWRST('XXX','BUG ') 7671 WRITE(ICOUT,6)NN,MAXNN 7672 6 FORMAT('NN,MAXNN = ',2I6) 7673 CALL DPWRST('XXX','BUG ') 7674 DO7II=1,NN 7675 WRITE(ICOUT,8)II,(P(II,JJ),JJ=1,MIN(K,10)) 7676 8 FORMAT('II,(P(II,JJ),JJ=1,K) = ',I6,10G15.7) 7677 CALL DPWRST('XXX','BUG ') 7678 7 CONTINUE 7679 ENDIF 7680C 7681 PBEST=P(1,1) 7682 NBEST=1 7683 KKK=0 7684 KM=0 7685C 7686 DO 10 L=2,K 7687 IF(P(1,L).LE.PBEST)GOTO10 7688 PBEST=P(1,L) 7689 NBEST=L 7690 10 CONTINUE 7691C 7692 NFUZZ(1)=NBEST 7693 NCLUV(1)=1 7694 KTRUE=1 7695C 7696 DO 20 M=2,NN 7697 PBEST=P(M,1) 7698 NBEST=1 7699 DO 30 L=2,K 7700 IF(P(M,L).LE.PBEST)GOTO30 7701 PBEST=P(M,L) 7702 NBEST=L 7703 30 CONTINUE 7704C 7705 JSTAY=0 7706 DO 40 KTRY=1,KTRUE 7707 IF(NFUZZ(KTRY).NE.NBEST)GO TO 40 7708 NCLUV(M)=KTRY 7709 JSTAY=1 7710 40 CONTINUE 7711C 7712 IF(JSTAY.EQ.1)GO TO 20 7713 KTRUE=KTRUE+1 7714 NFUZZ(KTRUE)=NBEST 7715 NCLUV(M)=KTRUE 7716 20 CONTINUE 7717C 7718 IF(KTRUE.GE.K)GO TO 100 7719C 7720 KNEXT=KTRUE+1 7721 DO 60 KWALK=KNEXT,K 7722 DO 70 KLEFT=1,K 7723 JSTAY=0 7724 KSUP=KWALK-1 7725 DO 80 KTRY=1,KSUP 7726 IF(NFUZZ(KTRY).NE.KLEFT)GO TO 80 7727 JSTAY=1 7728 80 CONTINUE 7729 IF(JSTAY.NE.1)THEN 7730 NFUZZ(KWALK)=KLEFT 7731 GO TO 60 7732 ENDIF 7733 70 CONTINUE 7734 60 CONTINUE 7735C 7736 100 CONTINUE 7737 IF(IPRINT.EQ.'ON')THEN 7738 WRITE(ICOUT,9210) 7739 9210 FORMAT(1X) 7740 CALL DPWRST('XXX','BUG ') 7741 WRITE(ICOUT,9210) 7742 CALL DPWRST('XXX','BUG ') 7743 WRITE(ICOUT,9200) 7744 9200 FORMAT('FUZZY CLUSTERING') 7745 CALL DPWRST('XXX','BUG ') 7746 WRITE(ICOUT,9201) 7747 9201 FORMAT('****************') 7748 CALL DPWRST('XXX','BUG ') 7749 ILOOP=K/10 7750 IF(MOD(K,10).GT.0)ILOOP=ILOOP+1 7751 DO9205II=1,ILOOP 7752 ISTRT=(II-1)*10 + 1 7753 ISTOP=II*10 7754 IF(ISTOP.GT.K)ISTOP=K 7755 WRITE(ICOUT,9202)(L,L=ISTRT,ISTOP) 7756 9202 FORMAT(3X,10I7) 7757 CALL DPWRST('XXX','BUG ') 7758 9205 CONTINUE 7759 ENDIF 7760C 7761 DO 110 M=1,NN 7762 DO 120 L=1,K 7763 LFUZZ=NFUZZ(L) 7764 RDRAW(L)=P(M,LFUZZ) 7765C 7766 IF(IBUGA3.EQ.'ON' .OR. ISUBRO.EQ.'ADDY')THEN 7767 WRITE(ICOUT,121)M,L,LFUZZ,RDRAW(L) 7768 121 FORMAT('M,L,LFUZZ,RDRAW(L) = ',3I6,F7.4) 7769 CALL DPWRST('XXX','BUG ') 7770 ENDIF 7771C 7772 120 CONTINUE 7773C 7774 IF(IPRINT.EQ.'ON')THEN 7775 LAB='000' 7776 IF(M.LE.9)THEN 7777 WRITE(LAB(3:3),'(I1)')M 7778 ELSEIF(M.LE.99)THEN 7779 WRITE(LAB(2:3),'(I2)')M 7780 ELSEIF(M.LE.999)THEN 7781 WRITE(LAB(1:3),'(I3)')M 7782 ENDIF 7783C 7784 ILOOP=K/10 7785 IF(MOD(K,10).GT.0)ILOOP=ILOOP+1 7786 DO9225II=1,ILOOP 7787 ISTRT=(II-1)*10 + 1 7788 ISTOP=II*10 7789 IF(ISTOP.GT.K)ISTOP=K 7790 IF(II.EQ.1)THEN 7791 WRITE(ICOUT,9220)LAB,(RDRAW(L),L=ISTRT,ISTOP) 7792 9220 FORMAT(A3,1X,10F7.4) 7793 CALL DPWRST('XXX','BUG ') 7794 ELSE 7795 WRITE(ICOUT,9221)(RDRAW(L),L=ISTRT,ISTOP) 7796 9221 FORMAT(4X,10F7.4) 7797 CALL DPWRST('XXX','BUG ') 7798 ENDIF 7799 9225 CONTINUE 7800 ENDIF 7801C 7802 WRITE(IOUNI1,9228)M,(RDRAW(L),L=1,K) 7803 9228 FORMAT(I5,30F10.4) 7804C 7805 110 CONTINUE 7806C 7807 IF(IPRINT.EQ.'ON')THEN 7808 WRITE(ICOUT,9210) 7809 CALL DPWRST('XXX','BUG ') 7810 WRITE(ICOUT,9300)EDA 7811 9300 FORMAT('PARTITION COEFFICIENT OF DUNN = ',F5.2) 7812 CALL DPWRST('XXX','BUG ') 7813 WRITE(ICOUT,9301)EDB 7814 9301 FORMAT('ITS NORMALIZED VERSION = ',F5.2) 7815 CALL DPWRST('XXX','BUG ') 7816 WRITE(ICOUT,9210) 7817 CALL DPWRST('XXX','BUG ') 7818 WRITE(ICOUT,9210) 7819 CALL DPWRST('XXX','BUG ') 7820 WRITE(ICOUT,9230) 7821 9230 FORMAT(' CLOSEST HARD CLUSTERING') 7822 CALL DPWRST('XXX','BUG ') 7823 WRITE(ICOUT,9231) 7824 9231 FORMAT('************************') 7825 CALL DPWRST('XXX','BUG ') 7826 WRITE(ICOUT,9210) 7827 CALL DPWRST('XXX','BUG ') 7828 ENDIF 7829C 7830 IF(KTRUE.LT.K)THEN 7831 IF(IPRINT.EQ.'ON')THEN 7832 WRITE(ICOUT,9210) 7833 CALL DPWRST('XXX','BUG ') 7834 WRITE(ICOUT,9240) 7835 9240 FORMAT('FOR THIS HARD CLUSTERING, IT TURNS OUT THAT') 7836 CALL DPWRST('XXX','BUG ') 7837 WRITE(ICOUT,9241)KTRUE 7838 9241 FORMAT('ONLY THE FIRST ',I4,' CLUSTERS ARE NONEMPTY.') 7839 CALL DPWRST('XXX','BUG ') 7840 WRITE(ICOUT,9210) 7841 CALL DPWRST('XXX','BUG ') 7842 WRITE(ICOUT,9210) 7843 CALL DPWRST('XXX','BUG ') 7844 ENDIF 7845 ENDIF 7846C 7847 IF(IPRINT.EQ.'ON')THEN 7848 WRITE(ICOUT,9250) 7849 9250 FORMAT('CLUSTER NUMBER SIZE OBJECTS') 7850 CALL DPWRST('XXX','BUG ') 7851 ENDIF 7852C 7853 DO 160 NUMCL=1,KTRUE 7854 NTT=0 7855 DO 150 J=1,NN 7856 IF(NCLUV(J).NE.NUMCL)GO TO 150 7857 NTT=NTT+1 7858 NELEM(NTT)=J 7859 150 CONTINUE 7860 NSS=NTT 7861 IF(NSS.GT.10)NSS=10 7862 DO 152 L=1,NSS 7863 LEEN=3*(L-1)+1 7864 LTWE=3*(L-1)+2 7865 LDRE=3*L 7866 NCASE=NELEM(L) 7867 LAB='000' 7868 IF(NCASE.LE.9)THEN 7869 WRITE(LAB(3:3),'(I1)')NCASE 7870 ELSEIF(NCASE.LE.99)THEN 7871 WRITE(LAB(2:3),'(I2)')NCASE 7872 ELSEIF(NCASE.LE.999)THEN 7873 WRITE(LAB(1:3),'(I3)')NCASE 7874 ENDIF 7875 JDRAW(LEEN)=LAB(1:1) 7876 JDRAW(LTWE)=LAB(2:2) 7877 JDRAW(LDRE)=LAB(3:3) 7878 152 CONTINUE 7879C 7880 NSSDR=NSS*3 7881 IF(IPRINT.EQ.'ON')THEN 7882 WRITE(ICOUT,9210) 7883 CALL DPWRST('XXX','BUG ') 7884 WRITE(ICOUT,9260)NUMCL,NTT,(JDRAW(LL),LL=1,NSSDR) 7885 9260 FORMAT(5X,I5,5X,I6,5X,10(3A1,1X)) 7886 CALL DPWRST('XXX','BUG ') 7887 ENDIF 7888C 7889 IF(NTT.LE.10)GO TO 160 7890 KAUNT=0 7891 DO 154 L=11,NTT 7892 KAUNT=KAUNT+1 7893 LEEN=3*(KAUNT-1)+1 7894 LTWE=3*(KAUNT-1)+2 7895 LDRE=3*KAUNT 7896 NCASE=NELEM(L) 7897 LAB='000' 7898 IF(NCASE.LE.9)THEN 7899 WRITE(LAB(3:3),'(I1)')NCASE 7900 ELSEIF(NCASE.LE.99)THEN 7901 WRITE(LAB(2:3),'(I2)')NCASE 7902 ELSEIF(NCASE.LE.999)THEN 7903 WRITE(LAB(1:3),'(I3)')NCASE 7904 ENDIF 7905 JDRAW(LEEN)=LAB(1:1) 7906 JDRAW(LTWE)=LAB(2:2) 7907 JDRAW(LDRE)=LAB(3:3) 7908 IF(KAUNT.EQ.10)THEN 7909 IF(IPRINT.EQ.'ON')THEN 7910 WRITE(ICOUT,9270)(JDRAW(LL),LL=1,30) 7911 9270 FORMAT(27X,10(3A1,1X)) 7912 CALL DPWRST('XXX','BUG ') 7913 KAUNT=0 7914 ENDIF 7915 ENDIF 7916 154 CONTINUE 7917C 7918 IF(KAUNT.GE.1 .AND. IPRINT.EQ.'ON')THEN 7919 WRITE(ICOUT,9270)(JDRAW(LL),LL=1,LDRE) 7920 CALL DPWRST('XXX','BUG ') 7921 ENDIF 7922 160 CONTINUE 7923C 7924 IF(IPRINT.EQ.'ON')THEN 7925 WRITE(ICOUT,9210) 7926 CALL DPWRST('XXX','BUG ') 7927 WRITE(ICOUT,9210) 7928 CALL DPWRST('XXX','BUG ') 7929 WRITE(ICOUT,9280) 7930 9280 FORMAT('CLUSTERING VECTOR') 7931 CALL DPWRST('XXX','BUG ') 7932 WRITE(ICOUT,9281) 7933 9281 FORMAT('*****************') 7934 CALL DPWRST('XXX','BUG ') 7935 WRITE(ICOUT,9210) 7936 CALL DPWRST('XXX','BUG ') 7937C 7938 ILOOP=NN/20 7939 IF(MOD(NN,20).GT.0)ILOOP=ILOOP+1 7940 DO9295II=1,ILOOP 7941 ISTRT=(II-1)*20 + 1 7942 ISTOP=II*20 7943 IF(ISTOP.GT.NN)ISTOP=NN 7944 WRITE(ICOUT,9290)(NCLUV(J),J=ISTRT,ISTOP) 7945 9290 FORMAT(10X,20I3) 7946 CALL DPWRST('XXX','BUG ') 7947 9295 CONTINUE 7948C 7949 ENDIF 7950C 7951 DO9296II=1,NN 7952 WRITE(IOUNI2,'(I5)')NCLUV(II) 7953 9296 CONTINUE 7954C 7955 IF(IBUGA3.EQ.'ON' .OR. ISUBRO.EQ.'ADDY')THEN 7956 WRITE(ICOUT,9910) 7957 9910 FORMAT('AT THE END OF CADDY') 7958 CALL DPWRST('XXX','BUG ') 7959 ENDIF 7960C 7961 RETURN 7962 END 7963 SUBROUTINE FUZZY(NN,MAXNN,P,DP,PT,DSS,ESP,EF,EDA,EDB,K, 7964 1 IBUGA3,ISUBRO) 7965C 7966CNIST SUBROUTINE FUZZY(NN,MAXNN,MAXKK,MAXHH,P,DP,PT,LAB,DSS,ESP,EF, 7967CNIST1 EDA,EDB,K,LUB) 7968C 7969 DIMENSION P(MAXNN,*),DP(MAXNN,*) 7970 DIMENSION DSS(*),PT(*),ESP(*),EF(*) 7971CNIST CHARACTER LAB(3,MAXNN) 7972C 7973 CHARACTER*4 IBUGA3 7974 CHARACTER*4 ISUBRO 7975C 7976CCCCC CHARACTER*3 LAB 7977C 7978 INCLUDE 'DPCOP2.INC' 7979C 7980 IF(IBUGA3.EQ.'ON' .OR. ISUBRO.EQ.'UZZY')THEN 7981 WRITE(ICOUT,10) 7982 10 FORMAT('AT THE BEGINNING OF FUZZY') 7983 CALL DPWRST('XXX','BUG ') 7984 WRITE(ICOUT,11)NN,MAXNN,K 7985 11 FORMAT('NN,MAXNN,K = ',3I6) 7986 CALL DPWRST('XXX','BUG ') 7987 ENDIF 7988CC 7989CC R IS THE EXPONENT, STRICTLY LARGER THAN 1.0 7990CC EPS IS THE PRECISION FOR THE ITERATIONS 7991CC NYT IS THE MAXIMAL NUMBER OF ITERATIONS 7992CC 7993 R=2.0 7994 EPS=0.000001 7995 NYT=500 7996CC 7997CC INITIAL FUZZY CLUSTERING 7998CC 7999 NNSUB=NN-1 8000 RVERS=1./R 8001 RKME=REAL(K-1) 8002 DO 30 M=1,NN 8003 DO 20 L=1,K 8004 DP(M,L)=0. 8005 P(M,L)=0.1/RKME 8006 20 CONTINUE 8007 30 CONTINUE 8008C 8009 NDK=NN/K 8010 ND=NDK 8011 L=1 8012 DO 50 M=1,NN 8013 P(M,L)=0.9 8014 IF(M.GE.ND)THEN 8015 ND=ND+NDK 8016 L=L+1 8017 IF(L.EQ.K)ND=NN 8018 ENDIF 8019 DO 40 LX=1,K 8020 P(M,LX)=P(M,LX)**R 8021 40 CONTINUE 8022 50 CONTINUE 8023C 8024 IF(IPRINT.EQ.'ON')THEN 8025 WRITE(ICOUT,999) 8026 999 FORMAT(1X) 8027 CALL DPWRST('XXX','BUG ') 8028 WRITE(ICOUT,999) 8029 CALL DPWRST('XXX','BUG ') 8030 WRITE(ICOUT,9110) 8031 9110 FORMAT(' ITERATION OBJECTIVE FUNCTION') 8032 CALL DPWRST('XXX','BUG ') 8033 WRITE(ICOUT,999) 8034 CALL DPWRST('XXX','BUG ') 8035 ENDIF 8036CC 8037CC INITIAL CRITERION VALUE 8038CC 8039 CRYT=0. 8040 DO 100 L=1,K 8041 ESP(L)=0. 8042 EF(L)=0. 8043 DO 90 M=1,NN 8044 ESP(L)=ESP(L)+P(M,L) 8045C 8046 IF(IBUGA3.EQ.'ON' .OR. ISUBRO.EQ.'UZZY')THEN 8047 WRITE(ICOUT,91)L,M,P(M,L),ESP(L) 8048 91 FORMAT('AT DO 100: L,M,P(M,L),ESP(L) = ',2I5,2G15.7) 8049 CALL DPWRST('XXX','BUG ') 8050 ENDIF 8051C 8052 DO 80 J=1,NN 8053 IF(J.EQ.M)GO TO 80 8054 J2=MIN0(M,J) 8055 J1=(J2-1)*NN-(J2*(J2+1))/2+MAX0(M,J) 8056 DP(M,L)=DP(M,L)+P(J,L)*DSS(J1) 8057 EF(L)=EF(L)+P(J,L)*P(M,L)*DSS(J1) 8058C 8059 IF(IBUGA3.EQ.'ON' .OR. ISUBRO.EQ.'UZZY')THEN 8060 WRITE(ICOUT,82)J,J1,J2,DSS(J1),DP(M,L),EF(L) 8061 82 FORMAT('AT DO 80: J,J1,J2,DSS(J1),DP(M,L),EF(L)=', 8062 1 3I5,3G15.7) 8063 CALL DPWRST('XXX','BUG ') 8064 ENDIF 8065C 8066 80 CONTINUE 8067 90 CONTINUE 8068 CRYT=CRYT+EF(L)/(ESP(L)*2.) 8069 100 CONTINUE 8070 CRT=CRYT 8071 REEN=1./(R-1.) 8072CC 8073CC START OF ITERATIONS 8074CC 8075 KAUNT=1 8076 M=0 8077CC 8078CC THE NEW MEMBERSHIP COEFFICIENTS OF THE OBJECTS ARE CALCULATED, 8079CC AND THE RESULTING VALUE OF THE CRITERION IS COMPUTED. 8080CC 8081 200 CONTINUE 8082 M=M+1 8083 DT=0. 8084 DO 210 L=1,K 8085 PT(L)=((2.*ESP(L)*ESP(L))/(2.*ESP(L)*DP(M,L)-EF(L)))**REEN 8086 DT=DT+PT(L) 8087 210 CONTINUE 8088C 8089 XX=0. 8090 DO 220 L=1,K 8091 PT(L)=PT(L)/DT 8092 IF(PT(L).LE.0.)XX=XX+PT(L) 8093 220 CONTINUE 8094C 8095 DO 240 L=1,K 8096 IF(PT(L).LE.0.)PT(L)=0. 8097 PT(L)=(PT(L)/(1.0-XX))**R 8098 ESP(L)=ESP(L)+PT(L)-P(M,L) 8099 DO 230 J=1,NN 8100 IF(J.EQ.M)GO TO 230 8101 J2=MIN0(M,J) 8102 J1=(J2-1)*NN-(J2*(J2+1))/2+MAX0(M,J) 8103 DDD=(PT(L)-P(M,L))*DSS(J1) 8104 DP(J,L)=DP(J,L)+DDD 8105 EF(L)=EF(L)+2.*P(J,L)*DDD 8106 230 CONTINUE 8107 P(M,L)=PT(L) 8108 240 CONTINUE 8109C 8110 IF(M.LT.NN)GO TO 200 8111 CRYT=0. 8112 EDA=0. 8113 DO 250 L=1,K 8114 ANN=NN 8115 EDA=EDA+ESP(L)/ANN 8116 CRYT=CRYT+EF(L)/(ESP(L)*2.) 8117 250 CONTINUE 8118CC 8119CC CRITERION IS PRINTED AND TESTED FOR CONVERGENCE 8120CC 8121 IF(IPRINT.EQ.'ON')THEN 8122 WRITE(ICOUT,9120)KAUNT,CRYT 8123 9120 FORMAT(I5,11X,F11.4) 8124 CALL DPWRST('XXX','BUG ') 8125 ENDIF 8126C 8127 IF((CRT/CRYT-1.).LE.EPS)GO TO 500 8128 IF(KAUNT.LT.NYT)THEN 8129 M=0 8130 KAUNT=KAUNT+1 8131 CRT=CRYT 8132 GO TO 200 8133 ENDIF 8134C 8135 IF(IPRINT.EQ.'ON')THEN 8136 WRITE(ICOUT,999) 8137 CALL DPWRST('XXX','BUG ') 8138 WRITE(ICOUT,9130)NYT 8139 9130 FORMAT('The maximum number of iterations (',I3, 8140 1 ') has been reached.') 8141 CALL DPWRST('XXX','BUG ') 8142 WRITE(ICOUT,9131) 8143 9131 FORMAT('The iterative procedure is therefore interrupted.') 8144 CALL DPWRST('XXX','BUG ') 8145 GO TO 500 8146 ENDIF 8147CC 8148CC NON-FUZZYNESS INDEX OF LIBERT IS COMPUTED 8149CC 8150 500 CONTINUE 8151 SMALL=1. 8152 FL=0. 8153 DO 410 MM=1,NN 8154 BBB=P(MM,1)**RVERS 8155 DO 400 J=2,K 8156 AAA=P(MM,J)**RVERS 8157 IF(AAA.GT.BBB)BBB=AAA 8158 400 CONTINUE 8159 IF(BBB.LT.SMALL)SMALL=BBB 8160 FL=FL+BBB 8161 410 CONTINUE 8162C 8163 RNN=NN 8164 FL=(FL/RNN+SMALL)/2. 8165 RK=K 8166 FL=(RK*FL-1.)/(RK-1.) 8167C 8168CC IF(IPRINT.EQ.'ON')THEN 8169CC WRITE(ICOUT,999) 8170CC CALL DPWRST('XXX','BUG ') 8171CC WRITE(ICOUT,9135)FL 8172C9135 FORMAT('NON-FUZZYNESS INDEX OF LIBERT = ',F5.2) 8173CC CALL DPWRST('XXX','BUG ') 8174CC WRITE(ICOUT,999) 8175CC CALL DPWRST('XXX','BUG ') 8176CC ENDIF 8177C 8178 ZK=K 8179 EDB=(ZK*EDA-1.)/(ZK-1.) 8180 DO 520 M=1,NN 8181 DO 510 L=1,K 8182 P(M,L)=P(M,L)**RVERS 8183 510 CONTINUE 8184 520 CONTINUE 8185C 8186 IF(IBUGA3.EQ.'ON' .OR. ISUBRO.EQ.'UZZY')THEN 8187 WRITE(ICOUT,9910) 8188 9910 FORMAT('AT THE END OF FUZZY') 8189 CALL DPWRST('XXX','BUG ') 8190 DO9911II=1,NN 8191 WRITE(ICOUT,9912)II,(P(II,JJ),JJ=1,K) 8192 9912 FORMAT('II,(P(II,JJ),JJ=1,K) = ',I6,20F7.4) 8193 CALL DPWRST('XXX','BUG ') 8194 9911 CONTINUE 8195 ENDIF 8196C 8197 RETURN 8198 END 8199