                 Performance of GA Communication Primitives
                 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
                               Jarek Nieplocha


The 710x710 matrix is distributed in uniform blocks.
The accumulate operation is atomic.
The program was run on 4 processors.
Process 0 holds block [1:355, 1:355], process 3 holds block [356:710, 356:710].
Process 0 first accesses local data and then remote data.
To reduce the effect of data caching, a different section of the matrix is
accessed each time.
For each operation, the tables give the time per transfer in seconds, followed
by the transfer rate.
................................................................................

machine: Cray T3D
name:    h4p.nersc.gov
options: -O1 -h inline3, readahead on, -DFLUSHCACHE
note:    lack of cache coherency degrades latency of "local" get
date:    Thu Dec 28 13:49:40 PST 1995

                          ACCESS [1:355, 1:355]
  bytes  loop           get                     put                  accumulate
      8   841  0.336E-04 0.24E+00MB/s  0.154E-04 0.52E+00MB/s  0.451E-04 0.18E+00MB/s
     72  1936  0.365E-04 0.20E+01MB/s  0.174E-04 0.41E+01MB/s  0.473E-04 0.15E+01MB/s
    128  2500  0.379E-04 0.34E+01MB/s  0.183E-04 0.70E+01MB/s  0.492E-04 0.26E+01MB/s
    648  1024  0.471E-04 0.14E+02MB/s  0.253E-04 0.26E+02MB/s  0.638E-04 0.10E+02MB/s
   2048   361  0.679E-04 0.30E+02MB/s  0.367E-04 0.56E+02MB/s  0.104E-03 0.20E+02MB/s
   7200   121  0.128E-03 0.56E+02MB/s  0.771E-04 0.93E+02MB/s  0.244E-03 0.29E+02MB/s
  32768    25  0.386E-03 0.85E+02MB/s  0.355E-03 0.92E+02MB/s  0.513E-03 0.64E+02MB/s
  66248     9  0.717E-03 0.92E+02MB/s  0.681E-03 0.97E+02MB/s  0.874E-03 0.76E+02MB/s
 131072     4  0.132E-02 0.10E+03MB/s  0.127E-02 0.10E+03MB/s  0.161E-02 0.82E+02MB/s
 233928     4  0.225E-02 0.10E+03MB/s  0.220E-02 0.11E+03MB/s  0.258E-02 0.91E+02MB/s
 524288     1  0.392E-02 0.13E+03MB/s  0.396E-02 0.13E+03MB/s  0.557E-02 0.94E+02MB/s
 996872     1  0.695E-02 0.14E+03MB/s  0.695E-02 0.14E+03MB/s  0.101E-01 0.98E+02MB/s

                          ACCESS [356:710, 356:710]
  bytes  loop           get                     put                  accumulate
      8   841  0.216E-04 0.37E+00MB/s  0.157E-04 0.51E+00MB/s  0.367E-04 0.22E+00MB/s
     72  1936  0.280E-04 0.26E+01MB/s  0.180E-04 0.40E+01MB/s  0.466E-04 0.15E+01MB/s
    128  2500  0.308E-04 0.42E+01MB/s  0.192E-04 0.67E+01MB/s  0.520E-04 0.25E+01MB/s
    648  1024  0.591E-04 0.11E+02MB/s  0.278E-04 0.23E+02MB/s  0.954E-04 0.68E+01MB/s
   2048   361  0.117E-03 0.17E+02MB/s  0.463E-04 0.44E+02MB/s  0.193E-03 0.11E+02MB/s
   7200   121  0.315E-03 0.23E+02MB/s  0.108E-03 0.67E+02MB/s  0.533E-03 0.14E+02MB/s
  32768    25  0.119E-02 0.28E+02MB/s  0.362E-03 0.90E+02MB/s  0.189E-02 0.17E+02MB/s
  66248     9  0.231E-02 0.29E+02MB/s  0.671E-03 0.99E+02MB/s  0.359E-02 0.18E+02MB/s
 131072     4  0.425E-02 0.31E+02MB/s  0.125E-02 0.11E+03MB/s  0.732E-02 0.18E+02MB/s
 233928     4  0.733E-02 0.32E+02MB/s  0.211E-02 0.11E+03MB/s  0.122E-01 0.19E+02MB/s
 524288     1  0.161E-01 0.32E+02MB/s  0.461E-02 0.11E+03MB/s  0.263E-01 0.20E+02MB/s
 996872     1  0.296E-01 0.34E+02MB/s  0.843E-02 0.12E+03MB/s  0.485E-01 0.21E+02MB/s


machine: Cray T3D
name:    h4p.nersc.gov
options: -O3 -h inline3, readahead on, implicit (automatic) cache invalidation
note:    lack of cache coherency degrades latency of "local" get
date:    Mon Mar 25 10:32:42 PST 1996

                          ACCESS [1:355, 1:355]
  bytes  loop           get                     put                  accumulate
      8   841  0.223E-04 0.35E+00MB/s  0.170E-04 0.47E+00MB/s  0.270E-04 0.30E+00MB/s
     72  1936  0.255E-04 0.28E+01MB/s  0.189E-04 0.38E+01MB/s  0.289E-04 0.25E+01MB/s
    128  2500  0.269E-04 0.48E+01MB/s  0.197E-04 0.65E+01MB/s  0.301E-04 0.43E+01MB/s
    648  1024  0.361E-04 0.18E+02MB/s  0.261E-04 0.25E+02MB/s  0.426E-04 0.15E+02MB/s
   2048   361  0.579E-04 0.35E+02MB/s  0.360E-04 0.57E+02MB/s  0.754E-04 0.27E+02MB/s
   7200   121  0.121E-03 0.59E+02MB/s  0.759E-04 0.95E+02MB/s  0.226E-03 0.32E+02MB/s
  32768    25  0.371E-03 0.88E+02MB/s  0.347E-03 0.95E+02MB/s  0.573E-03 0.57E+02MB/s
  66248     9  0.695E-03 0.95E+02MB/s  0.668E-03 0.99E+02MB/s  0.969E-03 0.68E+02MB/s
 131072     4  0.129E-02 0.10E+03MB/s  0.125E-02 0.10E+03MB/s  0.175E-02 0.75E+02MB/s
 233928     4  0.220E-02 0.11E+03MB/s  0.216E-02 0.11E+03MB/s  0.277E-02 0.84E+02MB/s
 524288     1  0.384E-02 0.14E+03MB/s  0.382E-02 0.14E+03MB/s  0.591E-02 0.89E+02MB/s
 996872     1  0.687E-02 0.15E+03MB/s  0.680E-02 0.15E+03MB/s  0.106E-01 0.94E+02MB/s

                          ACCESS [356:710, 356:710]
  bytes  loop           get                     put                  accumulate
      8   841  0.244E-04 0.33E+00MB/s  0.187E-04 0.43E+00MB/s  0.347E-04 0.23E+00MB/s
     72  1936  0.304E-04 0.24E+01MB/s  0.210E-04 0.34E+01MB/s  0.471E-04 0.15E+01MB/s
    128  2500  0.333E-04 0.38E+01MB/s  0.221E-04 0.58E+01MB/s  0.542E-04 0.24E+01MB/s
    648  1024  0.611E-04 0.11E+02MB/s  0.317E-04 0.20E+02MB/s  0.106E-03 0.61E+01MB/s
   2048   361  0.120E-03 0.17E+02MB/s  0.494E-04 0.41E+02MB/s  0.213E-03 0.96E+01MB/s
   7200   121  0.321E-03 0.22E+02MB/s  0.111E-03 0.65E+02MB/s  0.579E-03 0.12E+02MB/s
  32768    25  0.121E-02 0.27E+02MB/s  0.364E-03 0.90E+02MB/s  0.229E-02 0.14E+02MB/s
  66248     9  0.235E-02 0.28E+02MB/s  0.671E-03 0.99E+02MB/s  0.403E-02 0.16E+02MB/s
 131072     4  0.443E-02 0.30E+02MB/s  0.125E-02 0.11E+03MB/s  0.847E-02 0.15E+02MB/s
 233928     4  0.758E-02 0.31E+02MB/s  0.210E-02 0.11E+03MB/s  0.134E-01 0.17E+02MB/s
 524288     1  0.166E-01 0.32E+02MB/s  0.451E-02 0.12E+03MB/s  0.297E-01 0.18E+02MB/s
 996872     1  0.303E-01 0.33E+02MB/s  0.838E-02 0.12E+03MB/s  0.509E-01 0.20E+02MB/s
................................................................................
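
The rate columns above appear to be the transfer size divided by the time,
with 1 MB taken as 10^6 bytes, and each byte count corresponds to a square
patch of 8-byte elements (for example 996872 = 353 x 353 x 8). The Python
sketch below checks both relations against a few rows copied from the first
Cray T3D local table. The helper names and the 8-byte element size are
assumptions made for illustration; they are not part of the GA benchmark.

```python
import math

# (bytes, seconds per get) copied from the first Cray T3D table, local block.
T3D_LOCAL_GET = [
    (8, 0.336e-04),
    (7200, 0.128e-03),
    (996872, 0.695e-02),
]


def rate_mb_per_s(nbytes, seconds):
    """Transfer rate in units of 10**6 bytes/s (assumed meaning of MB/s)."""
    return nbytes / seconds / 1.0e6


def patch_edge(nbytes, elem_size=8):
    """Edge of a square patch of elem_size-byte elements, or None if not square.

    elem_size=8 assumes double precision data (an assumption of this sketch).
    """
    elems, rem = divmod(nbytes, elem_size)
    edge = math.isqrt(elems)
    return edge if rem == 0 and edge * edge == elems else None


for nbytes, seconds in T3D_LOCAL_GET:
    # Print the patch edge and the recomputed rate next to the byte count.
    print(nbytes, patch_edge(nbytes), f"{rate_mb_per_s(nbytes, seconds):.2g}")
```

The recomputed rates can be compared by eye with the MB/s entries in the same
rows; small differences are expected because the tables print only two
significant digits.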

machine: KSR-2
name:    circus.pnl.gov
options: -O1 -qdiv
date:    Mon Jan 23 09:22:48 PST 1995

                          ACCESS [1:355, 1:355]
  bytes  loop           get                     put                  accumulate
      8   841  0.357D-04 0.22D+00MB/s  0.350D-04 0.23D+00MB/s  0.379D-04 0.21D+00MB/s
     72  1936  0.415D-04 0.17D+01MB/s  0.419D-04 0.17D+01MB/s  0.444D-04 0.16D+01MB/s
    128  2500  0.454D-04 0.28D+01MB/s  0.463D-04 0.28D+01MB/s  0.493D-04 0.26D+01MB/s
    648  1024  0.714D-04 0.91D+01MB/s  0.753D-04 0.86D+01MB/s  0.813D-04 0.80D+01MB/s
   2048   361  0.119D-03 0.17D+02MB/s  0.129D-03 0.16D+02MB/s  0.147D-03 0.14D+02MB/s
   7200   121  0.311D-03 0.23D+02MB/s  0.331D-03 0.22D+02MB/s  0.377D-03 0.19D+02MB/s
  32768    25  0.109D-02 0.30D+02MB/s  0.113D-02 0.29D+02MB/s  0.128D-02 0.26D+02MB/s
  66248     9  0.234D-02 0.28D+02MB/s  0.229D-02 0.29D+02MB/s  0.260D-02 0.26D+02MB/s
 131072     4  0.446D-02 0.29D+02MB/s  0.444D-02 0.29D+02MB/s  0.495D-02 0.26D+02MB/s
 233928     4  0.774D-02 0.30D+02MB/s  0.769D-02 0.30D+02MB/s  0.873D-02 0.27D+02MB/s
 524288     1  0.171D-01 0.31D+02MB/s  0.171D-01 0.31D+02MB/s  0.191D-01 0.27D+02MB/s
 996872     1  0.308D-01 0.32D+02MB/s  0.307D-01 0.32D+02MB/s  0.348D-01 0.29D+02MB/s

                          ACCESS [356:710, 356:710]
  bytes  loop           get                     put                  accumulate
      8   841  0.373D-04 0.21D+00MB/s  0.364D-04 0.22D+00MB/s  0.448D-04 0.18D+00MB/s
     72  1936  0.427D-04 0.17D+01MB/s  0.430D-04 0.17D+01MB/s  0.537D-04 0.13D+01MB/s
    128  2500  0.469D-04 0.27D+01MB/s  0.479D-04 0.27D+01MB/s  0.604D-04 0.21D+01MB/s
    648  1024  0.751D-04 0.86D+01MB/s  0.801D-04 0.81D+01MB/s  0.118D-03 0.55D+01MB/s
   2048   361  0.135D-03 0.15D+02MB/s  0.144D-03 0.14D+02MB/s  0.160D-03 0.13D+02MB/s
   7200   121  0.365D-03 0.20D+02MB/s  0.376D-03 0.19D+02MB/s  0.410D-03 0.18D+02MB/s
  32768    25  0.133D-02 0.25D+02MB/s  0.137D-02 0.24D+02MB/s  0.148D-02 0.22D+02MB/s
  66248     9  0.276D-02 0.24D+02MB/s  0.276D-02 0.24D+02MB/s  0.307D-02 0.22D+02MB/s
 131072     4  0.543D-02 0.24D+02MB/s  0.541D-02 0.24D+02MB/s  0.588D-02 0.22D+02MB/s
 233928     4  0.928D-02 0.25D+02MB/s  0.917D-02 0.26D+02MB/s  0.101D-01 0.23D+02MB/s
 524288     1  0.209D-01 0.25D+02MB/s  0.210D-01 0.25D+02MB/s  0.227D-01 0.23D+02MB/s
 996872     1  0.353D-01 0.28D+02MB/s  0.347D-01 0.29D+02MB/s  0.387D-01 0.26D+02MB/s
................................................................................

machine: Intel Paragon
name:    trex.caltech.edu
options: -O2 -Msafeptr -Knoieee -Mquad -Mreentrant; run with -plk
note:    122K message buffer; masktrap responsible for latency of "local" accumulate
date:    Fri Dec 29 11:28:54 PST 1995

                          ACCESS [1:355, 1:355]
  bytes  loop           get                     put                  accumulate
      8   841  0.707D-04 0.11D+00MB/s  0.297D-04 0.27D+00MB/s  0.167D-03 0.48D-01MB/s
     72  1936  0.352D-04 0.20D+01MB/s  0.324D-04 0.22D+01MB/s  0.178D-03 0.41D+00MB/s
    128  2500  0.379D-04 0.34D+01MB/s  0.341D-04 0.38D+01MB/s  0.183D-03 0.70D+00MB/s
    648  1024  0.588D-04 0.11D+02MB/s  0.450D-04 0.14D+02MB/s  0.221D-03 0.29D+01MB/s
   2048   361  0.108D-03 0.19D+02MB/s  0.695D-04 0.29D+02MB/s  0.306D-03 0.67D+01MB/s
   7200   121  0.277D-03 0.26D+02MB/s  0.151D-03 0.48D+02MB/s  0.468D-03 0.15D+02MB/s
  32768    25  0.550D-03 0.60D+02MB/s  0.550D-03 0.60D+02MB/s  0.109D-02 0.30D+02MB/s
  66248     9  0.972D-03 0.68D+02MB/s  0.980D-03 0.68D+02MB/s  0.190D-02 0.35D+02MB/s
 131072     4  0.170D-02 0.77D+02MB/s  0.174D-02 0.75D+02MB/s  0.321D-02 0.41D+02MB/s
 233928     4  0.279D-02 0.84D+02MB/s  0.282D-02 0.83D+02MB/s  0.540D-02 0.43D+02MB/s
 524288     1  0.601D-02 0.87D+02MB/s  0.618D-02 0.85D+02MB/s  0.113D-01 0.47D+02MB/s
 996872     1  0.109D-01 0.92D+02MB/s  0.108D-01 0.93D+02MB/s  0.216D-01 0.46D+02MB/s

                          ACCESS [356:710, 356:710]
  bytes  loop           get                     put                  accumulate
      8   841  0.487D-03 0.16D-01MB/s  0.629D-04 0.13D+00MB/s  0.646D-04 0.12D+00MB/s
     72  1936  0.507D-03 0.14D+00MB/s  0.656D-04 0.11D+01MB/s  0.688D-04 0.10D+01MB/s
    128  2500  0.524D-03 0.24D+00MB/s  0.678D-04 0.19D+01MB/s  0.702D-04 0.18D+01MB/s
    648  1024  0.560D-03 0.12D+01MB/s  0.859D-04 0.75D+01MB/s  0.893D-04 0.73D+01MB/s
   2048   361  0.726D-03 0.28D+01MB/s  0.162D-03 0.13D+02MB/s  0.161D-03 0.13D+02MB/s
   7200   121  0.132D-02 0.55D+01MB/s  0.376D-03 0.19D+02MB/s  0.353D-03 0.20D+02MB/s
  32768    25  0.270D-02 0.12D+02MB/s  0.130D-02 0.25D+02MB/s  0.117D-02 0.28D+02MB/s
  66248     9  0.499D-02 0.13D+02MB/s  0.250D-02 0.27D+02MB/s  0.211D-02 0.31D+02MB/s
 131072     4  0.905D-02 0.14D+02MB/s  0.477D-02 0.27D+02MB/s  0.402D-02 0.33D+02MB/s
 233928     4  0.142D-01 0.17D+02MB/s  0.811D-02 0.29D+02MB/s  0.665D-02 0.35D+02MB/s
 524288     1  0.323D-01 0.16D+02MB/s  0.181D-01 0.29D+02MB/s  0.147D-01 0.36D+02MB/s
 996872     1  0.599D-01 0.17D+02MB/s  0.337D-01 0.30D+02MB/s  0.279D-01 0.36D+02MB/s
................................................................................

machine: IBM SP2 (MPL)
name:    et{0201,0301,0401,0601}nwmpp1.emsl.pnl.gov
options: -O (cc) -O2 (f77)
note:    122K message buffer; AIX4, TARGET=SP
date:    Wed Oct 9 12:12:39 PDT 1996

                          ACCESS [1:355, 1:355]
  bytes  loop          get                   put               accumulate
      8   841  .972D-05 .82D+00MB/s  .928D-05 .86D+00MB/s  .134D-04 .60D+00MB/s
     72  1936  .103D-04 .70D+01MB/s  .103D-04 .70D+01MB/s  .146D-04 .49D+01MB/s
    128  2500  .104D-04 .12D+02MB/s  .110D-04 .12D+02MB/s  .153D-04 .84D+01MB/s
    648  1024  .143D-04 .45D+02MB/s  .157D-04 .41D+02MB/s  .199D-04 .33D+02MB/s
   2048   361  .249D-04 .82D+02MB/s  .315D-04 .65D+02MB/s  .346D-04 .59D+02MB/s
   7200   121  .628D-04 .11D+03MB/s  .764D-04 .94D+02MB/s  .822D-04 .88D+02MB/s
  32768    25  .318D-03 .10D+03MB/s  .380D-03 .86D+02MB/s  .406D-03 .81D+02MB/s
  66248     9  .876D-03 .76D+02MB/s  .867D-03 .76D+02MB/s  .909D-03 .73D+02MB/s
 131072     4  .165D-02 .80D+02MB/s  .167D-02 .78D+02MB/s  .171D-02 .77D+02MB/s
 233928     4  .293D-02 .80D+02MB/s  .302D-02 .77D+02MB/s  .298D-02 .79D+02MB/s
 524288     1  .665D-02 .79D+02MB/s  .658D-02 .80D+02MB/s  .667D-02 .79D+02MB/s
 996872     1  .120D-01 .83D+02MB/s  .121D-01 .82D+02MB/s  .123D-01 .81D+02MB/s

                          ACCESS [356:710, 356:710]
  bytes  loop          get                   put               accumulate
      8   841  .256D-03 .31D-01MB/s  .981D-04 .82D-01MB/s  .904D-04 .88D-01MB/s
     72  1936  .253D-03 .28D+00MB/s  .115D-03 .63D+00MB/s  .943D-04 .76D+00MB/s
    128  2500  .255D-03 .50D+00MB/s  .105D-03 .12D+01MB/s  .947D-04 .14D+01MB/s
    648  1024  .299D-03 .22D+01MB/s  .139D-03 .47D+01MB/s  .121D-03 .53D+01MB/s
   2048   361  .394D-03 .52D+01MB/s  .181D-03 .11D+02MB/s  .192D-03 .11D+02MB/s
   7200   121  .679D-03 .11D+02MB/s  .392D-03 .18D+02MB/s  .366D-03 .20D+02MB/s
  32768    25  .251D-02 .13D+02MB/s  .191D-02 .17D+02MB/s  .192D-02 .17D+02MB/s
  66248     9  .448D-02 .15D+02MB/s  .333D-02 .20D+02MB/s  .347D-02 .19D+02MB/s
 131072     4  .791D-02 .17D+02MB/s  .619D-02 .21D+02MB/s  .635D-02 .21D+02MB/s
 233928     4  .134D-01 .17D+02MB/s  .105D-01 .22D+02MB/s  .103D-01 .23D+02MB/s
 524288     1  .330D-01 .16D+02MB/s  .241D-01 .22D+02MB/s  .241D-01 .22D+02MB/s
 996872     1  .580D-01 .17D+02MB/s  .466D-01 .21D+02MB/s  .436D-01 .23D+02MB/s
................................................................................

machine: IBM SP1 (MPL)
name:    spnode{022,024,025,026}mcs.anl.gov
options: -O (cc) -O2 (f77)
note:    122K message buffer; latency of rcvncall in MPL is much higher than in EUIH
date:    Fri Dec 29 15:36:19 CST 199

                          ACCESS [1:355, 1:355]
  bytes  loop          get                   put               accumulate
      8   841  .132D-04 .61D+00MB/s  .132D-04 .61D+00MB/s  .183D-04 .44D+00MB/s
     72  1936  .148D-04 .49D+01MB/s  .145D-04 .50D+01MB/s  .196D-04 .37D+01MB/s
    128  2500  .156D-04 .82D+01MB/s  .157D-04 .82D+01MB/s  .214D-04 .60D+01MB/s
    648  1024  .252D-04 .26D+02MB/s  .240D-04 .27D+02MB/s  .303D-04 .21D+02MB/s
   2048   361  .446D-04 .46D+02MB/s  .403D-04 .51D+02MB/s  .491D-04 .42D+02MB/s
   7200   121  .884D-04 .81D+02MB/s  .912D-04 .79D+02MB/s  .108D-03 .67D+02MB/s
  32768    25  .464D-03 .71D+02MB/s  .472D-03 .69D+02MB/s  .521D-03 .63D+02MB/s
  66248     9  .869D-03 .76D+02MB/s  .911D-03 .73D+02MB/s  .102D-02 .65D+02MB/s
 131072     4  .170D-02 .77D+02MB/s  .171D-02 .77D+02MB/s  .194D-02 .68D+02MB/s
 233928     4  .290D-02 .81D+02MB/s  .299D-02 .78D+02MB/s  .330D-02 .71D+02MB/s
 524288     1  .637D-02 .82D+02MB/s  .643D-02 .82D+02MB/s  .728D-02 .72D+02MB/s
 996872     1  .116D-01 .86D+02MB/s  .116D-01 .86D+02MB/s  .136D-01 .73D+02MB/s

                          ACCESS [356:710, 356:710]
  bytes  loop          get                   put               accumulate
      8   841  .517D-03 .15D-01MB/s  .752D-04 .11D+00MB/s  .771D-04 .10D+00MB/s
     72  1936  .549D-03 .13D+00MB/s  .978D-04 .74D+00MB/s  .949D-04 .76D+00MB/s
    128  2500  .570D-03 .22D+00MB/s  .973D-04 .13D+01MB/s  .994D-04 .13D+01MB/s
    648  1024  .625D-03 .10D+01MB/s  .145D-03 .45D+01MB/s  .147D-03 .44D+01MB/s
   2048   361  .699D-03 .29D+01MB/s  .277D-03 .74D+01MB/s  .284D-03 .72D+01MB/s
   7200   121  .113D-02 .64D+01MB/s  .655D-03 .11D+02MB/s  .651D-03 .11D+02MB/s
  32768    25  .332D-02 .99D+01MB/s  .255D-02 .13D+02MB/s  .253D-02 .13D+02MB/s
  66248     9  .531D-02 .12D+02MB/s  .449D-02 .15D+02MB/s  .444D-02 .15D+02MB/s
 131072     4  .103D-01 .13D+02MB/s  .810D-02 .16D+02MB/s  .810D-02 .16D+02MB/s
 233928     4  .166D-01 .14D+02MB/s  .134D-01 .17D+02MB/s  .142D-01 .16D+02MB/s
 524288     1  .370D-01 .14D+02MB/s  .298D-01 .18D+02MB/s  .300D-01 .17D+02MB/s
 996872     1  .703D-01 .14D+02MB/s  .602D-01 .17D+02MB/s  .576D-01 .17D+02MB/s
................................................................................
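
Several of the tables above print reals with a Fortran D (or d) exponent and,
on the IBM machines, without a leading zero, which most tools do not read
directly. A minimal Python sketch for turning one table row into numbers is
given below. The function names and the dictionary layout are hypothetical,
chosen only for this illustration; the sample row is the 2048-byte local row
of the IBM SP2 table.

```python
def fortran_float(token):
    """Convert Fortran-style output such as '.972D-05' or '0.131d-04' to float."""
    return float(token.replace("D", "E").replace("d", "e"))


def parse_row(line):
    """Split one data row into bytes, loop count, and (time, rate) per operation."""
    # The MB/s suffix is glued to the rate, so blank it out before splitting.
    fields = line.replace("MB/s", " ").split()
    values = [fortran_float(f) for f in fields[2:8]]
    row = {"bytes": int(fields[0]), "loop": int(fields[1])}
    for i, op in enumerate(("get", "put", "accumulate")):
        row[op] = {"time": values[2 * i], "rate": values[2 * i + 1]}
    return row


# Values copied from the IBM SP2 local table (2048-byte row).
row = parse_row("   2048   361  .249D-04 .82D+02MB/s  .315D-04 .65D+02MB/s"
                "  .346D-04 .59D+02MB/s")
print(row["get"]["time"], row["get"]["rate"])
```

The same function should also accept the E-exponent rows of the Cray T3D
tables, since only D and d are rewritten before conversion.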

machine: 4-processor 75MHz SGI Power Challenge
name:    coho.pnl.gov
options: -O3 (f77) -O (cc)
note:    SGI uslocks instead of semaphores used for locking
date:    Tue Oct 10 15:47:51 PDT 1995

                          ACCESS [1:355, 1:355]
  bytes  loop           get                     put                  accumulate
      8   841  0.106D-04 0.76D+00MB/s  0.930D-05 0.86D+00MB/s  0.115D-04 0.70D+00MB/s
     72  1936  0.120D-04 0.60D+01MB/s  0.107D-04 0.67D+01MB/s  0.124D-04 0.58D+01MB/s
    128  2500  0.136D-04 0.94D+01MB/s  0.121D-04 0.11D+02MB/s  0.129D-04 0.99D+01MB/s
    648  1024  0.177D-04 0.37D+02MB/s  0.164D-04 0.40D+02MB/s  0.164D-04 0.40D+02MB/s
   2048   361  0.259D-04 0.79D+02MB/s  0.243D-04 0.84D+02MB/s  0.213D-04 0.96D+02MB/s
   7200   121  0.461D-04 0.16D+03MB/s  0.440D-04 0.16D+03MB/s  0.396D-04 0.18D+03MB/s
  32768    25  0.125D-03 0.26D+03MB/s  0.121D-03 0.27D+03MB/s  0.114D-03 0.29D+03MB/s
  66248     9  0.208D-03 0.32D+03MB/s  0.192D-03 0.35D+03MB/s  0.222D-03 0.30D+03MB/s
 131072     4  0.390D-03 0.34D+03MB/s  0.370D-03 0.35D+03MB/s  0.389D-03 0.34D+03MB/s
 233928     4  0.674D-03 0.35D+03MB/s  0.542D-03 0.43D+03MB/s  0.677D-03 0.35D+03MB/s
 524288     1  0.226D-02 0.23D+03MB/s  0.127D-02 0.41D+03MB/s  0.147D-02 0.36D+03MB/s
 996872     1  0.331D-02 0.30D+03MB/s  0.211D-02 0.47D+03MB/s  0.276D-02 0.36D+03MB/s

                          ACCESS [356:710, 356:710]
  bytes  loop           get                     put                  accumulate
      8   841  0.133D-04 0.60D+00MB/s  0.118D-04 0.68D+00MB/s  0.126D-04 0.64D+00MB/s
     72  1936  0.146D-04 0.49D+01MB/s  0.132D-04 0.55D+01MB/s  0.151D-04 0.48D+01MB/s
    128  2500  0.158D-04 0.81D+01MB/s  0.144D-04 0.89D+01MB/s  0.160D-04 0.80D+01MB/s
    648  1024  0.261D-04 0.25D+02MB/s  0.245D-04 0.26D+02MB/s  0.277D-04 0.23D+02MB/s
   2048   361  0.510D-04 0.40D+02MB/s  0.490D-04 0.42D+02MB/s  0.569D-04 0.36D+02MB/s
   7200   121  0.128D-03 0.56D+02MB/s  0.125D-03 0.58D+02MB/s  0.156D-03 0.46D+02MB/s
  32768    25  0.495D-03 0.66D+02MB/s  0.492D-03 0.67D+02MB/s  0.674D-03 0.49D+02MB/s
  66248     9  0.103D-02 0.64D+02MB/s  0.954D-03 0.69D+02MB/s  0.128D-02 0.52D+02MB/s
 131072     4  0.187D-02 0.70D+02MB/s  0.186D-02 0.71D+02MB/s  0.250D-02 0.52D+02MB/s
 233928     4  0.324D-02 0.72D+02MB/s  0.323D-02 0.72D+02MB/s  0.439D-02 0.53D+02MB/s
 524288     1  0.802D-02 0.65D+02MB/s  0.719D-02 0.73D+02MB/s  0.990D-02 0.53D+02MB/s
 996872     1  0.143D-01 0.70D+02MB/s  0.131D-01 0.76D+02MB/s  0.182D-01 0.55D+02MB/s
................................................................................


machine: 4-processor Fujitsu VX-4
name:    kaiousei.fecit.co.uk
note:    initial port, performance optimizations not completed
date:    Thu Dec 4 12:04:49 PST 1997

                          ACCESS [1:710, 1:355]
  bytes  loop           get                     put                  accumulate
      8  1711  0.131d-04 0.61d+00MB/s  0.267d-04 0.30d+00MB/s  0.474d-04 0.17d+00MB/s
     72  3872  0.134d-04 0.54d+01MB/s  0.273d-04 0.26d+01MB/s  0.491d-04 0.15d+01MB/s
    128  5050  0.136d-04 0.94d+01MB/s  0.275d-04 0.46d+01MB/s  0.509d-04 0.25d+01MB/s
    648  2048  0.148d-04 0.44d+02MB/s  0.287d-04 0.23d+02MB/s  0.655d-04 0.99d+01MB/s
   2048   741  0.167d-04 0.12d+03MB/s  0.300d-04 0.68d+02MB/s  0.103d-03 0.20d+02MB/s
   7200   242  0.203d-04 0.35d+03MB/s  0.359d-04 0.20d+03MB/s  0.243d-03 0.30d+02MB/s
  32768    50  0.317d-04 0.10d+04MB/s  0.429d-04 0.76d+03MB/s  0.302d-03 0.11d+03MB/s
  66248    21  0.430d-04 0.15d+04MB/s  0.557d-04 0.12d+04MB/s  0.510d-03 0.13d+03MB/s
 131072    10  0.588d-04 0.22d+04MB/s  0.685d-04 0.19d+04MB/s  0.634d-03 0.21d+03MB/s
 233928     8  0.828d-04 0.28d+04MB/s  0.948d-04 0.25d+04MB/s  0.106d-02 0.22d+03MB/s
 524288     2  0.151d-03 0.35d+04MB/s  0.149d-03 0.35d+04MB/s  0.156d-02 0.34d+03MB/s
 996872     2  0.226d-03 0.44d+04MB/s  0.237d-03 0.42d+04MB/s  0.293d-02 0.34d+03MB/s

                          ACCESS [1:710, 356:710]
  bytes  loop           get                     put                  accumulate
      8  1711  0.260d-04 0.31d+00MB/s  0.301d-04 0.27d+00MB/s  0.673d-04 0.12d+00MB/s
     72  3872  0.544d-04 0.13d+01MB/s  0.343d-04 0.21d+01MB/s  0.103d-03 0.70d+00MB/s
    128  5050  0.629d-04 0.20d+01MB/s  0.358d-04 0.36d+01MB/s  0.121d-03 0.11d+01MB/s
    648  2048  0.112d-03 0.58d+01MB/s  0.447d-04 0.14d+02MB/s  0.221d-03 0.29d+01MB/s
   2048   741  0.179d-03 0.11d+02MB/s  0.562d-04 0.36d+02MB/s  0.374d-03 0.55d+01MB/s
   7200   242  0.315d-03 0.23d+02MB/s  0.864d-04 0.83d+02MB/s  0.768d-03 0.94d+01MB/s
  32768    50  0.667d-03 0.49d+02MB/s  0.177d-03 0.18d+03MB/s  0.158d-02 0.21d+02MB/s
  66248    21  0.979d-03 0.68d+02MB/s  0.273d-03 0.24d+03MB/s  0.234d-02 0.28d+02MB/s
 131072    10  0.149d-02 0.88d+02MB/s  0.212d-01 0.62d+01MB/s  0.334d-02 0.39d+02MB/s
 233928     8  0.206d-02 0.11d+03MB/s  0.676d-03 0.35d+03MB/s  0.498d-02 0.47d+02MB/s
 524288     2  0.327d-02 0.16d+03MB/s  0.131d-02 0.40d+03MB/s  0.781d-02 0.67d+02MB/s
 996872     2  0.509d-02 0.20d+03MB/s  0.228d-02 0.44d+03MB/s  0.128d-01 0.78d+02MB/s
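
One rough way to read the tables is to compare local and remote get on each
machine that uses the [1:355, 1:355] and [356:710, 356:710] sections. The
Python sketch below does this with values copied from the get columns: the
8-byte time and the rate at 996872 bytes. Treating the 8-byte time as a
latency proxy is an assumption of this sketch, and the names GET_SUMMARY and
summarize are hypothetical. The Fujitsu VX-4 is left out because its tables
use different sections.

```python
# (machine, local 8-byte get time [s], remote 8-byte get time [s],
#  local get rate at 996872 bytes [MB/s], remote get rate at 996872 bytes [MB/s])
GET_SUMMARY = [
    ("Cray T3D (1996 run)", 0.223e-04, 0.244e-04, 0.15e+03, 0.33e+02),
    ("KSR-2",               0.357e-04, 0.373e-04, 0.32e+02, 0.28e+02),
    ("Intel Paragon",       0.707e-04, 0.487e-03, 0.92e+02, 0.17e+02),
    ("IBM SP2 (MPL)",       0.972e-05, 0.256e-03, 0.83e+02, 0.17e+02),
    ("IBM SP1 (MPL)",       0.132e-04, 0.517e-03, 0.86e+02, 0.14e+02),
    ("SGI Power Challenge", 0.106e-04, 0.133e-04, 0.30e+03, 0.70e+02),
]


def summarize(rows):
    """Return remote/local time ratio and local/remote rate ratio per machine."""
    out = {}
    for name, t_local, t_remote, r_local, r_remote in rows:
        out[name] = {
            "latency_ratio": t_remote / t_local,  # > 1: remote get is slower
            "rate_ratio": r_local / r_remote,     # > 1: local get is faster
        }
    return out


summary = summarize(GET_SUMMARY)
for name, s in summary.items():
    print(f"{name:22s} time ratio {s['latency_ratio']:6.1f}   "
          f"rate ratio {s['rate_ratio']:5.1f}")
```

The ratios only summarize the numbers already in the tables; they say nothing
about the cause of the differences beyond what the per-machine notes state.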