                 Performance of GA Communication Primitives
                 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
                             Jarek Nieplocha


The test matrix is 710x710 with a uniform block distribution.
The accumulate operation is atomic.
The program was run on 4 processors.
Process 0 holds block [1:355, 1:355]; process 3 holds block [356:710, 356:710].
Process 0 first accesses local data and then remote data.
To reduce the effect of data caching, a different section of the matrix is
accessed each time.
In the tables, each operation column gives the elapsed time in seconds for one
transfer of the given size, followed by the resulting bandwidth in MB/s.
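
The block ownership described above can be sketched as follows. This is a
minimal illustration, not the GA implementation: the 2x2 process grid and the
column-major process numbering are assumptions, and the function names are
hypothetical. The report only states the blocks of processes 0 and 3, which
come out the same under either numbering.

```python
def block_bounds(n, nblocks, b):
    """Return 1-based inclusive (lo, hi) of block b when n indices are
    split into nblocks nearly equal blocks."""
    size = -(-n // nblocks)          # ceiling division
    lo = b * size + 1
    hi = min(n, lo + size - 1)
    return lo, hi


def owned_block(proc, n=710, grid=(2, 2)):
    """Return ((row_lo, row_hi), (col_lo, col_hi)) for a process.

    Assumption: processes are numbered column-major on the grid.
    """
    prow = proc % grid[0]
    pcol = proc // grid[0]
    return block_bounds(n, grid[0], prow), block_bounds(n, grid[1], pcol)


if __name__ == "__main__":
    for p in range(4):
        (r0, r1), (c0, c1) = owned_block(p)
        print(f"process {p}: [{r0}:{r1}, {c0}:{c1}]")
```

The blocks printed for processes 1 and 2 depend on the numbering assumption
and are not taken from the report.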
................................................................................

machine: Cray T3D
name: h4p.nersc.gov
options: -O1 -h inline3, readahead on, -DFLUSHCACHE
note: lack of cache coherency degrades latency of "local" get
date: Thu Dec 28 13:49:40 PST 1995

                     ACCESS [  1: 355,  1: 355]
 bytes  loop         get                    put                 accumulate
     8  841 0.336E-04 0.24E+00MB/s 0.154E-04 0.52E+00MB/s 0.451E-04 0.18E+00MB/s
    72 1936 0.365E-04 0.20E+01MB/s 0.174E-04 0.41E+01MB/s 0.473E-04 0.15E+01MB/s
   128 2500 0.379E-04 0.34E+01MB/s 0.183E-04 0.70E+01MB/s 0.492E-04 0.26E+01MB/s
   648 1024 0.471E-04 0.14E+02MB/s 0.253E-04 0.26E+02MB/s 0.638E-04 0.10E+02MB/s
  2048  361 0.679E-04 0.30E+02MB/s 0.367E-04 0.56E+02MB/s 0.104E-03 0.20E+02MB/s
  7200  121 0.128E-03 0.56E+02MB/s 0.771E-04 0.93E+02MB/s 0.244E-03 0.29E+02MB/s
 32768   25 0.386E-03 0.85E+02MB/s 0.355E-03 0.92E+02MB/s 0.513E-03 0.64E+02MB/s
 66248    9 0.717E-03 0.92E+02MB/s 0.681E-03 0.97E+02MB/s 0.874E-03 0.76E+02MB/s
131072    4 0.132E-02 0.10E+03MB/s 0.127E-02 0.10E+03MB/s 0.161E-02 0.82E+02MB/s
233928    4 0.225E-02 0.10E+03MB/s 0.220E-02 0.11E+03MB/s 0.258E-02 0.91E+02MB/s
524288    1 0.392E-02 0.13E+03MB/s 0.396E-02 0.13E+03MB/s 0.557E-02 0.94E+02MB/s
996872    1 0.695E-02 0.14E+03MB/s 0.695E-02 0.14E+03MB/s 0.101E-01 0.98E+02MB/s

                     ACCESS [356: 710,356: 710]
 bytes  loop         get                    put                 accumulate
     8  841 0.216E-04 0.37E+00MB/s 0.157E-04 0.51E+00MB/s 0.367E-04 0.22E+00MB/s
    72 1936 0.280E-04 0.26E+01MB/s 0.180E-04 0.40E+01MB/s 0.466E-04 0.15E+01MB/s
   128 2500 0.308E-04 0.42E+01MB/s 0.192E-04 0.67E+01MB/s 0.520E-04 0.25E+01MB/s
   648 1024 0.591E-04 0.11E+02MB/s 0.278E-04 0.23E+02MB/s 0.954E-04 0.68E+01MB/s
  2048  361 0.117E-03 0.17E+02MB/s 0.463E-04 0.44E+02MB/s 0.193E-03 0.11E+02MB/s
  7200  121 0.315E-03 0.23E+02MB/s 0.108E-03 0.67E+02MB/s 0.533E-03 0.14E+02MB/s
 32768   25 0.119E-02 0.28E+02MB/s 0.362E-03 0.90E+02MB/s 0.189E-02 0.17E+02MB/s
 66248    9 0.231E-02 0.29E+02MB/s 0.671E-03 0.99E+02MB/s 0.359E-02 0.18E+02MB/s
131072    4 0.425E-02 0.31E+02MB/s 0.125E-02 0.11E+03MB/s 0.732E-02 0.18E+02MB/s
233928    4 0.733E-02 0.32E+02MB/s 0.211E-02 0.11E+03MB/s 0.122E-01 0.19E+02MB/s
524288    1 0.161E-01 0.32E+02MB/s 0.461E-02 0.11E+03MB/s 0.263E-01 0.20E+02MB/s
996872    1 0.296E-01 0.34E+02MB/s 0.843E-02 0.12E+03MB/s 0.485E-01 0.21E+02MB/s


machine: Cray T3D
name: h4p.nersc.gov
options: -O3 -h inline3, readahead on, implicit (automatic) cache invalidation
note: lack of cache coherency degrades latency of "local" get
date: Mon Mar 25 10:32:42 PST 1996

                     ACCESS [  1: 355,  1: 355]
 bytes  loop         get                    put                 accumulate
     8  841 0.223E-04 0.35E+00MB/s 0.170E-04 0.47E+00MB/s 0.270E-04 0.30E+00MB/s
    72 1936 0.255E-04 0.28E+01MB/s 0.189E-04 0.38E+01MB/s 0.289E-04 0.25E+01MB/s
   128 2500 0.269E-04 0.48E+01MB/s 0.197E-04 0.65E+01MB/s 0.301E-04 0.43E+01MB/s
   648 1024 0.361E-04 0.18E+02MB/s 0.261E-04 0.25E+02MB/s 0.426E-04 0.15E+02MB/s
  2048  361 0.579E-04 0.35E+02MB/s 0.360E-04 0.57E+02MB/s 0.754E-04 0.27E+02MB/s
  7200  121 0.121E-03 0.59E+02MB/s 0.759E-04 0.95E+02MB/s 0.226E-03 0.32E+02MB/s
 32768   25 0.371E-03 0.88E+02MB/s 0.347E-03 0.95E+02MB/s 0.573E-03 0.57E+02MB/s
 66248    9 0.695E-03 0.95E+02MB/s 0.668E-03 0.99E+02MB/s 0.969E-03 0.68E+02MB/s
131072    4 0.129E-02 0.10E+03MB/s 0.125E-02 0.10E+03MB/s 0.175E-02 0.75E+02MB/s
233928    4 0.220E-02 0.11E+03MB/s 0.216E-02 0.11E+03MB/s 0.277E-02 0.84E+02MB/s
524288    1 0.384E-02 0.14E+03MB/s 0.382E-02 0.14E+03MB/s 0.591E-02 0.89E+02MB/s
996872    1 0.687E-02 0.15E+03MB/s 0.680E-02 0.15E+03MB/s 0.106E-01 0.94E+02MB/s

                     ACCESS [356: 710,356: 710]
 bytes  loop         get                    put                 accumulate
     8  841 0.244E-04 0.33E+00MB/s 0.187E-04 0.43E+00MB/s 0.347E-04 0.23E+00MB/s
    72 1936 0.304E-04 0.24E+01MB/s 0.210E-04 0.34E+01MB/s 0.471E-04 0.15E+01MB/s
   128 2500 0.333E-04 0.38E+01MB/s 0.221E-04 0.58E+01MB/s 0.542E-04 0.24E+01MB/s
   648 1024 0.611E-04 0.11E+02MB/s 0.317E-04 0.20E+02MB/s 0.106E-03 0.61E+01MB/s
  2048  361 0.120E-03 0.17E+02MB/s 0.494E-04 0.41E+02MB/s 0.213E-03 0.96E+01MB/s
  7200  121 0.321E-03 0.22E+02MB/s 0.111E-03 0.65E+02MB/s 0.579E-03 0.12E+02MB/s
 32768   25 0.121E-02 0.27E+02MB/s 0.364E-03 0.90E+02MB/s 0.229E-02 0.14E+02MB/s
 66248    9 0.235E-02 0.28E+02MB/s 0.671E-03 0.99E+02MB/s 0.403E-02 0.16E+02MB/s
131072    4 0.443E-02 0.30E+02MB/s 0.125E-02 0.11E+03MB/s 0.847E-02 0.15E+02MB/s
233928    4 0.758E-02 0.31E+02MB/s 0.210E-02 0.11E+03MB/s 0.134E-01 0.17E+02MB/s
524288    1 0.166E-01 0.32E+02MB/s 0.451E-02 0.12E+03MB/s 0.297E-01 0.18E+02MB/s
996872    1 0.303E-01 0.33E+02MB/s 0.838E-02 0.12E+03MB/s 0.509E-01 0.20E+02MB/s
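
The bandwidth column in these tables appears to be the byte count divided by
the elapsed time, with 1 MB taken as 10^6 bytes. The sketch below checks that
reading against three rows of the local "get" column from the first Cray T3D
run. The function name and the two-significant-digit comparison are
illustrative assumptions; because the printed times are themselves rounded,
other rows may differ in the last digit.

```python
def bandwidth_mb_per_s(nbytes, seconds):
    """Bandwidth in MB/s, assuming 1 MB = 10**6 bytes."""
    return nbytes / seconds / 1.0e6


def two_sig_digits(x):
    """Round x to two significant digits, as in the printed tables."""
    return float(f"{x:.2g}")


# (bytes, time in seconds, listed MB/s) copied from the first T3D table.
ROWS = [
    (8, 0.336e-04, 0.24),
    (7200, 0.128e-03, 56.0),
    (996872, 0.695e-02, 140.0),
]

if __name__ == "__main__":
    for nbytes, seconds, listed in ROWS:
        computed = bandwidth_mb_per_s(nbytes, seconds)
        print(nbytes, two_sig_digits(computed), listed)
```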
................................................................................

machine: KSR-2
name: circus.pnl.gov
options: -O1 -qdiv
date: Mon Jan 23 09:22:48 PST 1995

                     ACCESS [  1 :  355 ,   1 :  355 ]
 bytes  loop         get                    put                 accumulate
     8  841 0.357D-04 0.22D+00MB/s 0.350D-04 0.23D+00MB/s 0.379D-04 0.21D+00MB/s
    72 1936 0.415D-04 0.17D+01MB/s 0.419D-04 0.17D+01MB/s 0.444D-04 0.16D+01MB/s
   128 2500 0.454D-04 0.28D+01MB/s 0.463D-04 0.28D+01MB/s 0.493D-04 0.26D+01MB/s
   648 1024 0.714D-04 0.91D+01MB/s 0.753D-04 0.86D+01MB/s 0.813D-04 0.80D+01MB/s
  2048  361 0.119D-03 0.17D+02MB/s 0.129D-03 0.16D+02MB/s 0.147D-03 0.14D+02MB/s
  7200  121 0.311D-03 0.23D+02MB/s 0.331D-03 0.22D+02MB/s 0.377D-03 0.19D+02MB/s
 32768   25 0.109D-02 0.30D+02MB/s 0.113D-02 0.29D+02MB/s 0.128D-02 0.26D+02MB/s
 66248    9 0.234D-02 0.28D+02MB/s 0.229D-02 0.29D+02MB/s 0.260D-02 0.26D+02MB/s
131072    4 0.446D-02 0.29D+02MB/s 0.444D-02 0.29D+02MB/s 0.495D-02 0.26D+02MB/s
233928    4 0.774D-02 0.30D+02MB/s 0.769D-02 0.30D+02MB/s 0.873D-02 0.27D+02MB/s
524288    1 0.171D-01 0.31D+02MB/s 0.171D-01 0.31D+02MB/s 0.191D-01 0.27D+02MB/s
996872    1 0.308D-01 0.32D+02MB/s 0.307D-01 0.32D+02MB/s 0.348D-01 0.29D+02MB/s

                     ACCESS [  356 :  710 ,   356 :  710 ]
 bytes  loop         get                    put                 accumulate
     8  841 0.373D-04 0.21D+00MB/s 0.364D-04 0.22D+00MB/s 0.448D-04 0.18D+00MB/s
    72 1936 0.427D-04 0.17D+01MB/s 0.430D-04 0.17D+01MB/s 0.537D-04 0.13D+01MB/s
   128 2500 0.469D-04 0.27D+01MB/s 0.479D-04 0.27D+01MB/s 0.604D-04 0.21D+01MB/s
   648 1024 0.751D-04 0.86D+01MB/s 0.801D-04 0.81D+01MB/s 0.118D-03 0.55D+01MB/s
  2048  361 0.135D-03 0.15D+02MB/s 0.144D-03 0.14D+02MB/s 0.160D-03 0.13D+02MB/s
  7200  121 0.365D-03 0.20D+02MB/s 0.376D-03 0.19D+02MB/s 0.410D-03 0.18D+02MB/s
 32768   25 0.133D-02 0.25D+02MB/s 0.137D-02 0.24D+02MB/s 0.148D-02 0.22D+02MB/s
 66248    9 0.276D-02 0.24D+02MB/s 0.276D-02 0.24D+02MB/s 0.307D-02 0.22D+02MB/s
131072    4 0.543D-02 0.24D+02MB/s 0.541D-02 0.24D+02MB/s 0.588D-02 0.22D+02MB/s
233928    4 0.928D-02 0.25D+02MB/s 0.917D-02 0.26D+02MB/s 0.101D-01 0.23D+02MB/s
524288    1 0.209D-01 0.25D+02MB/s 0.210D-01 0.25D+02MB/s 0.227D-01 0.23D+02MB/s
996872    1 0.353D-01 0.28D+02MB/s 0.347D-01 0.29D+02MB/s 0.387D-01 0.26D+02MB/s
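
From this section on, the numbers are printed with a Fortran D exponent
(for example 0.357D-04), which most tools do not read directly. A minimal
sketch of one way to parse a table row follows. The function names and the
returned dictionary layout are hypothetical; the sample row is the first
KSR-2 row above.

```python
def parse_fortran_real(text):
    """Convert a Fortran-style real such as '0.357D-04' or '.82d+00' to float."""
    return float(text.strip().replace("D", "E").replace("d", "e"))


def parse_row(line):
    """Split one table row into bytes, loop count, and (seconds, MB/s) pairs."""
    fields = line.replace("MB/s", " ").split()
    values = [parse_fortran_real(f) for f in fields[2:]]
    return {
        "bytes": int(fields[0]),
        "loop": int(fields[1]),
        "get": (values[0], values[1]),
        "put": (values[2], values[3]),
        "accumulate": (values[4], values[5]),
    }


if __name__ == "__main__":
    sample = ("     8  841 0.357D-04 0.22D+00MB/s 0.350D-04 0.23D+00MB/s "
              "0.379D-04 0.21D+00MB/s")
    print(parse_row(sample))
```

The same helper also accepts the E and lowercase d exponents used in the
other sections, since Python's float() reads E notation natively.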
................................................................................

machine: Intel Paragon
name: trex.caltech.edu
options: -O2 -Msafeptr -Knoieee -Mquad -Mreentrant; run with -plk
note: 122K message buffer; masktrap responsible for latency of "local" accumulate
date: Fri Dec 29 11:28:54 PST 1995

                     ACCESS [  1: 355,  1: 355]
 bytes  loop         get                    put                 accumulate
     8  841 0.707D-04 0.11D+00MB/s 0.297D-04 0.27D+00MB/s 0.167D-03 0.48D-01MB/s
    72 1936 0.352D-04 0.20D+01MB/s 0.324D-04 0.22D+01MB/s 0.178D-03 0.41D+00MB/s
   128 2500 0.379D-04 0.34D+01MB/s 0.341D-04 0.38D+01MB/s 0.183D-03 0.70D+00MB/s
   648 1024 0.588D-04 0.11D+02MB/s 0.450D-04 0.14D+02MB/s 0.221D-03 0.29D+01MB/s
  2048  361 0.108D-03 0.19D+02MB/s 0.695D-04 0.29D+02MB/s 0.306D-03 0.67D+01MB/s
  7200  121 0.277D-03 0.26D+02MB/s 0.151D-03 0.48D+02MB/s 0.468D-03 0.15D+02MB/s
 32768   25 0.550D-03 0.60D+02MB/s 0.550D-03 0.60D+02MB/s 0.109D-02 0.30D+02MB/s
 66248    9 0.972D-03 0.68D+02MB/s 0.980D-03 0.68D+02MB/s 0.190D-02 0.35D+02MB/s
131072    4 0.170D-02 0.77D+02MB/s 0.174D-02 0.75D+02MB/s 0.321D-02 0.41D+02MB/s
233928    4 0.279D-02 0.84D+02MB/s 0.282D-02 0.83D+02MB/s 0.540D-02 0.43D+02MB/s
524288    1 0.601D-02 0.87D+02MB/s 0.618D-02 0.85D+02MB/s 0.113D-01 0.47D+02MB/s
996872    1 0.109D-01 0.92D+02MB/s 0.108D-01 0.93D+02MB/s 0.216D-01 0.46D+02MB/s

                     ACCESS [356: 710,356: 710]
 bytes  loop         get                    put                 accumulate
     8  841 0.487D-03 0.16D-01MB/s 0.629D-04 0.13D+00MB/s 0.646D-04 0.12D+00MB/s
    72 1936 0.507D-03 0.14D+00MB/s 0.656D-04 0.11D+01MB/s 0.688D-04 0.10D+01MB/s
   128 2500 0.524D-03 0.24D+00MB/s 0.678D-04 0.19D+01MB/s 0.702D-04 0.18D+01MB/s
   648 1024 0.560D-03 0.12D+01MB/s 0.859D-04 0.75D+01MB/s 0.893D-04 0.73D+01MB/s
  2048  361 0.726D-03 0.28D+01MB/s 0.162D-03 0.13D+02MB/s 0.161D-03 0.13D+02MB/s
  7200  121 0.132D-02 0.55D+01MB/s 0.376D-03 0.19D+02MB/s 0.353D-03 0.20D+02MB/s
 32768   25 0.270D-02 0.12D+02MB/s 0.130D-02 0.25D+02MB/s 0.117D-02 0.28D+02MB/s
 66248    9 0.499D-02 0.13D+02MB/s 0.250D-02 0.27D+02MB/s 0.211D-02 0.31D+02MB/s
131072    4 0.905D-02 0.14D+02MB/s 0.477D-02 0.27D+02MB/s 0.402D-02 0.33D+02MB/s
233928    4 0.142D-01 0.17D+02MB/s 0.811D-02 0.29D+02MB/s 0.665D-02 0.35D+02MB/s
524288    1 0.323D-01 0.16D+02MB/s 0.181D-01 0.29D+02MB/s 0.147D-01 0.36D+02MB/s
996872    1 0.599D-01 0.17D+02MB/s 0.337D-01 0.30D+02MB/s 0.279D-01 0.36D+02MB/s
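
The twelve byte counts used in every table are consistent with square n x n
patches of 8-byte elements, with n ranging from 1 to 353. The report does not
state this, so the sketch below treats it as an inference and simply checks
it; the 8-byte element size and the function name are assumptions.

```python
import math

# Byte counts from the "bytes" column of the tables.
BYTE_COUNTS = [8, 72, 128, 648, 2048, 7200, 32768, 66248,
               131072, 233928, 524288, 996872]


def square_patch_edge(nbytes, element_bytes=8):
    """Return n if nbytes is exactly an n x n patch of element_bytes-sized
    elements, otherwise None.  element_bytes=8 is an assumption
    (double precision)."""
    elements, remainder = divmod(nbytes, element_bytes)
    edge = math.isqrt(elements)
    if remainder == 0 and edge * edge == elements:
        return edge
    return None


if __name__ == "__main__":
    for nbytes in BYTE_COUNTS:
        print(nbytes, square_patch_edge(nbytes))
```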
................................................................................

machine: IBM SP2 (MPL)
name: et{0201,0301,0401,0601}nwmpp1.emsl.pnl.gov
options: -O (cc) -O2 (f77)
note: 122K message buffer; AIX4, TARGET=SP
date: Wed Oct  9 12:12:39 PDT 1996

                     ACCESS [  1: 355,  1: 355]
 bytes  loop         get                    put                 accumulate
     8  841  .972D-05  .82D+00MB/s  .928D-05  .86D+00MB/s  .134D-04  .60D+00MB/s
    72 1936  .103D-04  .70D+01MB/s  .103D-04  .70D+01MB/s  .146D-04  .49D+01MB/s
   128 2500  .104D-04  .12D+02MB/s  .110D-04  .12D+02MB/s  .153D-04  .84D+01MB/s
   648 1024  .143D-04  .45D+02MB/s  .157D-04  .41D+02MB/s  .199D-04  .33D+02MB/s
  2048  361  .249D-04  .82D+02MB/s  .315D-04  .65D+02MB/s  .346D-04  .59D+02MB/s
  7200  121  .628D-04  .11D+03MB/s  .764D-04  .94D+02MB/s  .822D-04  .88D+02MB/s
 32768   25  .318D-03  .10D+03MB/s  .380D-03  .86D+02MB/s  .406D-03  .81D+02MB/s
 66248    9  .876D-03  .76D+02MB/s  .867D-03  .76D+02MB/s  .909D-03  .73D+02MB/s
131072    4  .165D-02  .80D+02MB/s  .167D-02  .78D+02MB/s  .171D-02  .77D+02MB/s
233928    4  .293D-02  .80D+02MB/s  .302D-02  .77D+02MB/s  .298D-02  .79D+02MB/s
524288    1  .665D-02  .79D+02MB/s  .658D-02  .80D+02MB/s  .667D-02  .79D+02MB/s
996872    1  .120D-01  .83D+02MB/s  .121D-01  .82D+02MB/s  .123D-01  .81D+02MB/s

                     ACCESS [356: 710,356: 710]
 bytes  loop         get                    put                 accumulate
     8  841  .256D-03  .31D-01MB/s  .981D-04  .82D-01MB/s  .904D-04  .88D-01MB/s
    72 1936  .253D-03  .28D+00MB/s  .115D-03  .63D+00MB/s  .943D-04  .76D+00MB/s
   128 2500  .255D-03  .50D+00MB/s  .105D-03  .12D+01MB/s  .947D-04  .14D+01MB/s
   648 1024  .299D-03  .22D+01MB/s  .139D-03  .47D+01MB/s  .121D-03  .53D+01MB/s
  2048  361  .394D-03  .52D+01MB/s  .181D-03  .11D+02MB/s  .192D-03  .11D+02MB/s
  7200  121  .679D-03  .11D+02MB/s  .392D-03  .18D+02MB/s  .366D-03  .20D+02MB/s
 32768   25  .251D-02  .13D+02MB/s  .191D-02  .17D+02MB/s  .192D-02  .17D+02MB/s
 66248    9  .448D-02  .15D+02MB/s  .333D-02  .20D+02MB/s  .347D-02  .19D+02MB/s
131072    4  .791D-02  .17D+02MB/s  .619D-02  .21D+02MB/s  .635D-02  .21D+02MB/s
233928    4  .134D-01  .17D+02MB/s  .105D-01  .22D+02MB/s  .103D-01  .23D+02MB/s
524288    1  .330D-01  .16D+02MB/s  .241D-01  .22D+02MB/s  .241D-01  .22D+02MB/s
996872    1  .580D-01  .17D+02MB/s  .466D-01  .21D+02MB/s  .436D-01  .23D+02MB/s
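
One way to read a local/remote pair of tables is as a slowdown ratio: remote
time divided by local time for the same transfer size. The sketch below does
this for the smallest and largest "get" transfers in the IBM SP2 tables
above. The function name is hypothetical and the ratio is only an
illustration, not a metric used in the report.

```python
def remote_slowdown(local_seconds, remote_seconds):
    """Ratio of remote to local elapsed time for the same transfer size."""
    return remote_seconds / local_seconds


# (bytes, local get time, remote get time) copied from the SP2 tables.
SP2_GET = [
    (8, 0.972e-05, 0.256e-03),
    (996872, 0.120e-01, 0.580e-01),
]

if __name__ == "__main__":
    for nbytes, local_s, remote_s in SP2_GET:
        ratio = remote_slowdown(local_s, remote_s)
        print(f"{nbytes:>6} bytes: remote get is {ratio:.1f}x slower than local")
```

The ratio shrinks as the transfer grows, which is what one would expect when
a fixed per-message cost dominates the small transfers.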
................................................................................

machine: IBM SP1 (MPL)
name: spnode{022,024,025,026}mcs.anl.gov
options: -O (cc) -O2 (f77)
note: 122K message buffer; latency of rcvncall in MPL is much higher than in EUIH
date: Fri Dec 29 15:36:19 CST 199

                     ACCESS [  1: 355,  1: 355]
 bytes  loop         get                    put                 accumulate
     8  841  .132D-04  .61D+00MB/s  .132D-04  .61D+00MB/s  .183D-04  .44D+00MB/s
    72 1936  .148D-04  .49D+01MB/s  .145D-04  .50D+01MB/s  .196D-04  .37D+01MB/s
   128 2500  .156D-04  .82D+01MB/s  .157D-04  .82D+01MB/s  .214D-04  .60D+01MB/s
   648 1024  .252D-04  .26D+02MB/s  .240D-04  .27D+02MB/s  .303D-04  .21D+02MB/s
  2048  361  .446D-04  .46D+02MB/s  .403D-04  .51D+02MB/s  .491D-04  .42D+02MB/s
  7200  121  .884D-04  .81D+02MB/s  .912D-04  .79D+02MB/s  .108D-03  .67D+02MB/s
 32768   25  .464D-03  .71D+02MB/s  .472D-03  .69D+02MB/s  .521D-03  .63D+02MB/s
 66248    9  .869D-03  .76D+02MB/s  .911D-03  .73D+02MB/s  .102D-02  .65D+02MB/s
131072    4  .170D-02  .77D+02MB/s  .171D-02  .77D+02MB/s  .194D-02  .68D+02MB/s
233928    4  .290D-02  .81D+02MB/s  .299D-02  .78D+02MB/s  .330D-02  .71D+02MB/s
524288    1  .637D-02  .82D+02MB/s  .643D-02  .82D+02MB/s  .728D-02  .72D+02MB/s
996872    1  .116D-01  .86D+02MB/s  .116D-01  .86D+02MB/s  .136D-01  .73D+02MB/s

                     ACCESS [356: 710,356: 710]
 bytes  loop         get                    put                 accumulate
     8  841  .517D-03  .15D-01MB/s  .752D-04  .11D+00MB/s  .771D-04  .10D+00MB/s
    72 1936  .549D-03  .13D+00MB/s  .978D-04  .74D+00MB/s  .949D-04  .76D+00MB/s
   128 2500  .570D-03  .22D+00MB/s  .973D-04  .13D+01MB/s  .994D-04  .13D+01MB/s
   648 1024  .625D-03  .10D+01MB/s  .145D-03  .45D+01MB/s  .147D-03  .44D+01MB/s
  2048  361  .699D-03  .29D+01MB/s  .277D-03  .74D+01MB/s  .284D-03  .72D+01MB/s
  7200  121  .113D-02  .64D+01MB/s  .655D-03  .11D+02MB/s  .651D-03  .11D+02MB/s
 32768   25  .332D-02  .99D+01MB/s  .255D-02  .13D+02MB/s  .253D-02  .13D+02MB/s
 66248    9  .531D-02  .12D+02MB/s  .449D-02  .15D+02MB/s  .444D-02  .15D+02MB/s
131072    4  .103D-01  .13D+02MB/s  .810D-02  .16D+02MB/s  .810D-02  .16D+02MB/s
233928    4  .166D-01  .14D+02MB/s  .134D-01  .17D+02MB/s  .142D-01  .16D+02MB/s
524288    1  .370D-01  .14D+02MB/s  .298D-01  .18D+02MB/s  .300D-01  .17D+02MB/s
996872    1  .703D-01  .14D+02MB/s  .602D-01  .17D+02MB/s  .576D-01  .17D+02MB/s
................................................................................

machine: 4-processor 75MHz SGI Power Challenge
name: coho.pnl.gov
options: -O3 (f77) -O (cc)
note: SGI uslocks instead of semaphores used for locking
date: Tue Oct 10 15:47:51 PDT 1995

                     ACCESS [  1: 355,  1: 355]
 bytes  loop         get                    put                 accumulate
     8  841 0.106D-04 0.76D+00MB/s 0.930D-05 0.86D+00MB/s 0.115D-04 0.70D+00MB/s
    72 1936 0.120D-04 0.60D+01MB/s 0.107D-04 0.67D+01MB/s 0.124D-04 0.58D+01MB/s
   128 2500 0.136D-04 0.94D+01MB/s 0.121D-04 0.11D+02MB/s 0.129D-04 0.99D+01MB/s
   648 1024 0.177D-04 0.37D+02MB/s 0.164D-04 0.40D+02MB/s 0.164D-04 0.40D+02MB/s
  2048  361 0.259D-04 0.79D+02MB/s 0.243D-04 0.84D+02MB/s 0.213D-04 0.96D+02MB/s
  7200  121 0.461D-04 0.16D+03MB/s 0.440D-04 0.16D+03MB/s 0.396D-04 0.18D+03MB/s
 32768   25 0.125D-03 0.26D+03MB/s 0.121D-03 0.27D+03MB/s 0.114D-03 0.29D+03MB/s
 66248    9 0.208D-03 0.32D+03MB/s 0.192D-03 0.35D+03MB/s 0.222D-03 0.30D+03MB/s
131072    4 0.390D-03 0.34D+03MB/s 0.370D-03 0.35D+03MB/s 0.389D-03 0.34D+03MB/s
233928    4 0.674D-03 0.35D+03MB/s 0.542D-03 0.43D+03MB/s 0.677D-03 0.35D+03MB/s
524288    1 0.226D-02 0.23D+03MB/s 0.127D-02 0.41D+03MB/s 0.147D-02 0.36D+03MB/s
996872    1 0.331D-02 0.30D+03MB/s 0.211D-02 0.47D+03MB/s 0.276D-02 0.36D+03MB/s

                     ACCESS [356: 710,356: 710]
 bytes  loop         get                    put                 accumulate
     8  841 0.133D-04 0.60D+00MB/s 0.118D-04 0.68D+00MB/s 0.126D-04 0.64D+00MB/s
    72 1936 0.146D-04 0.49D+01MB/s 0.132D-04 0.55D+01MB/s 0.151D-04 0.48D+01MB/s
   128 2500 0.158D-04 0.81D+01MB/s 0.144D-04 0.89D+01MB/s 0.160D-04 0.80D+01MB/s
   648 1024 0.261D-04 0.25D+02MB/s 0.245D-04 0.26D+02MB/s 0.277D-04 0.23D+02MB/s
  2048  361 0.510D-04 0.40D+02MB/s 0.490D-04 0.42D+02MB/s 0.569D-04 0.36D+02MB/s
  7200  121 0.128D-03 0.56D+02MB/s 0.125D-03 0.58D+02MB/s 0.156D-03 0.46D+02MB/s
 32768   25 0.495D-03 0.66D+02MB/s 0.492D-03 0.67D+02MB/s 0.674D-03 0.49D+02MB/s
 66248    9 0.103D-02 0.64D+02MB/s 0.954D-03 0.69D+02MB/s 0.128D-02 0.52D+02MB/s
131072    4 0.187D-02 0.70D+02MB/s 0.186D-02 0.71D+02MB/s 0.250D-02 0.52D+02MB/s
233928    4 0.324D-02 0.72D+02MB/s 0.323D-02 0.72D+02MB/s 0.439D-02 0.53D+02MB/s
524288    1 0.802D-02 0.65D+02MB/s 0.719D-02 0.73D+02MB/s 0.990D-02 0.53D+02MB/s
996872    1 0.143D-01 0.70D+02MB/s 0.131D-01 0.76D+02MB/s 0.182D-01 0.55D+02MB/s
................................................................................

machine: 4-processor Fujitsu VX-4
name: kaiousei.fecit.co.uk
note: initial port, performance optimizations not completed
date: Thu Dec  4 12:04:49 PST 1997

                     ACCESS [  1: 710,  1: 355]
 bytes  loop         get                    put                 accumulate
     8 1711 0.131d-04 0.61d+00MB/s 0.267d-04 0.30d+00MB/s 0.474d-04 0.17d+00MB/s
    72 3872 0.134d-04 0.54d+01MB/s 0.273d-04 0.26d+01MB/s 0.491d-04 0.15d+01MB/s
   128 5050 0.136d-04 0.94d+01MB/s 0.275d-04 0.46d+01MB/s 0.509d-04 0.25d+01MB/s
   648 2048 0.148d-04 0.44d+02MB/s 0.287d-04 0.23d+02MB/s 0.655d-04 0.99d+01MB/s
  2048  741 0.167d-04 0.12d+03MB/s 0.300d-04 0.68d+02MB/s 0.103d-03 0.20d+02MB/s
  7200  242 0.203d-04 0.35d+03MB/s 0.359d-04 0.20d+03MB/s 0.243d-03 0.30d+02MB/s
 32768   50 0.317d-04 0.10d+04MB/s 0.429d-04 0.76d+03MB/s 0.302d-03 0.11d+03MB/s
 66248   21 0.430d-04 0.15d+04MB/s 0.557d-04 0.12d+04MB/s 0.510d-03 0.13d+03MB/s
131072   10 0.588d-04 0.22d+04MB/s 0.685d-04 0.19d+04MB/s 0.634d-03 0.21d+03MB/s
233928    8 0.828d-04 0.28d+04MB/s 0.948d-04 0.25d+04MB/s 0.106d-02 0.22d+03MB/s
524288    2 0.151d-03 0.35d+04MB/s 0.149d-03 0.35d+04MB/s 0.156d-02 0.34d+03MB/s
996872    2 0.226d-03 0.44d+04MB/s 0.237d-03 0.42d+04MB/s 0.293d-02 0.34d+03MB/s

                     ACCESS [  1: 710,356: 710]
 bytes  loop         get                    put                 accumulate
     8 1711 0.260d-04 0.31d+00MB/s 0.301d-04 0.27d+00MB/s 0.673d-04 0.12d+00MB/s
    72 3872 0.544d-04 0.13d+01MB/s 0.343d-04 0.21d+01MB/s 0.103d-03 0.70d+00MB/s
   128 5050 0.629d-04 0.20d+01MB/s 0.358d-04 0.36d+01MB/s 0.121d-03 0.11d+01MB/s
   648 2048 0.112d-03 0.58d+01MB/s 0.447d-04 0.14d+02MB/s 0.221d-03 0.29d+01MB/s
  2048  741 0.179d-03 0.11d+02MB/s 0.562d-04 0.36d+02MB/s 0.374d-03 0.55d+01MB/s
  7200  242 0.315d-03 0.23d+02MB/s 0.864d-04 0.83d+02MB/s 0.768d-03 0.94d+01MB/s
 32768   50 0.667d-03 0.49d+02MB/s 0.177d-03 0.18d+03MB/s 0.158d-02 0.21d+02MB/s
 66248   21 0.979d-03 0.68d+02MB/s 0.273d-03 0.24d+03MB/s 0.234d-02 0.28d+02MB/s
131072   10 0.149d-02 0.88d+02MB/s 0.212d-01 0.62d+01MB/s 0.334d-02 0.39d+02MB/s
233928    8 0.206d-02 0.11d+03MB/s 0.676d-03 0.35d+03MB/s 0.498d-02 0.47d+02MB/s
524288    2 0.327d-02 0.16d+03MB/s 0.131d-02 0.40d+03MB/s 0.781d-02 0.67d+02MB/s
996872    2 0.509d-02 0.20d+03MB/s 0.228d-02 0.44d+03MB/s 0.128d-01 0.78d+02MB/s