%
% perf.tex - the final performance report which includes the rest
%   of the data.
%
% Copyright (c) 1998 Phil Maker <pjm@gnu.org>
% All rights reserved.
%
% Redistribution and use in source and binary forms, with or without
% modification, are permitted provided that the following conditions
% are met:
% 1. Redistributions of source code must retain the above copyright
%    notice, this list of conditions and the following disclaimer.
% 2. Redistributions in binary form must reproduce the above copyright
%    notice, this list of conditions and the following disclaimer in the
%    documentation and/or other materials provided with the distribution.
%
% THIS SOFTWARE IS PROVIDED BY THE AUTHOR AND CONTRIBUTORS ``AS IS'' AND
% ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
% IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
% ARE DISCLAIMED.  IN NO EVENT SHALL THE AUTHOR OR CONTRIBUTORS BE LIABLE
% FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
% DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS
% OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION)
% HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT
% LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY
% OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF
% SUCH DAMAGE.
%
% $Id: perf.tex,v 1.1.1.1 1999/09/12 03:26:49 pjm Exp $
%

\documentclass[a4paper]{article}

\begin{document}
\title{Nana Performance Summary}
\author{P.J.Maker}
\maketitle

This document contains measurements of the space and time costs
of the nana library. The data provided includes:

\begin{itemize}
\item Time cost in ns.
\item Space in bytes.
\item Generated assembler code.
\item Results for various compiler options such as optimisation.
\end{itemize}

These test results were generated using:

\begin{itemize}
\item Operating system, release and CPU: \input{uname.mtex}
\item CPU speed is estimated by the supplied \verb+bogomips+ program,
  which gives a value of \input{bogomips.mtex}. BogoMips (bogus
  millions of instructions per second) is a standard unit of measure
  invented by Linus Torvalds for use in Linux. It represents something
  vaguely related to the number of instructions executed per second.
\item Note: nana should be installed using \verb|I_DEFAULT=fast ./configure|
  for these measurements.
\end{itemize}

The following table contains a summary of the results:

    \input{summary.mtex}

Note:

\begin{itemize}
\item \verb|assert()| is your system's standard assert macro.
\item \verb|TRAD_assert()| is the traditional implementation of assert,
  which calls \verb|fprintf()| and \verb|exit()| directly.
\item \verb|I()| is the nana equivalent of \verb|assert()|.
\item \verb|DI()| is implemented using the debugger. It is very space efficient
  but takes longer than inline C code (such as \verb|I()|).
\item \verb|I(A(...))| checks that all 10 values in the array
  \verb|a| are positive.
\item \verb|now()| measures the current time and returns a \verb|double|
  value.
\item \verb|L()| optionally prints a debugging message.
\item \verb|DL()| is the debugger equivalent of \verb|L()|.
\end{itemize}
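
To make the \verb|TRAD_assert()| and \verb|I(A(...))| entries concrete,
the following is a minimal sketch of what they amount to in plain C.
The names \verb|SKETCH_TRAD_ASSERT| and \verb|all_positive| are
hypothetical and are not part of nana; the real definitions are in the
nana headers and the measurement sources.

```c
#include <stddef.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical stand-in for a traditional assert: on failure it
 * reports the expression and its location with fprintf() and then
 * terminates the program with exit(). This is an illustration only,
 * not nana's actual TRAD_assert(). */
#define SKETCH_TRAD_ASSERT(e)                                    \
    do {                                                         \
        if (!(e)) {                                              \
            fprintf(stderr, "%s:%d: assertion failed: %s\n",     \
                    __FILE__, __LINE__, #e);                     \
            exit(EXIT_FAILURE);                                  \
        }                                                        \
    } while (0)

/* Hand-written equivalent of checking that every element of an
 * array is positive. Returns 1 if all n values are > 0, else 0. */
static int all_positive(const int *a, size_t n)
{
    size_t i;

    for (i = 0; i < n; i++) {
        if (a[i] <= 0) {
            return 0;
        }
    }
    return 1;
}
```

A caller would write something like
\verb|SKETCH_TRAD_ASSERT(all_positive(a, 10));| where the measured
fragment uses \verb|I(A(...))|.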

Note that the measurement code depends on GNU CC extensions and is not
a thing of great beauty.

\section{How was it measured?}

See \verb|Makefile.am| and \verb|measure.sh| for the true story; a
quick summary would be:

\begin{itemize}
\item The code fragments are stored one per line in a file such
  as \verb|summary.tst|.
\item The \verb|measure.sh| program takes as arguments a set of
  compiler flags such as \verb|-O|, which are used for each line
  in the input file.
\item The code fragment is copied 256 times by a macro inside a
  loop which in turn executes 1024 times. For small examples
  it is expected that the entire loop will fit inside the cache.
\item Time is measured using the nana \verb|now()| function.
\end{itemize}
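
The steps above can be sketched roughly as follows. This is an
illustration only and not the code generated by \verb|measure.sh|: the
macro names, the fragment being timed and \verb|now_seconds()| (a
stand-in for nana's \verb|now()| built on the POSIX
\verb|clock_gettime()| call) are all assumptions.

```c
#define _POSIX_C_SOURCE 199309L /* for clock_gettime() */
#include <time.h>

/* Textually repeat a code fragment 256 times (hypothetical macros). */
#define REP4(x)   x x x x
#define REP16(x)  REP4(x) REP4(x) REP4(x) REP4(x)
#define REP64(x)  REP16(x) REP16(x) REP16(x) REP16(x)
#define REP256(x) REP64(x) REP64(x) REP64(x) REP64(x)

#define LOOP_COUNT 1024

/* volatile so that the compiler cannot remove the accesses */
static volatile int counter;

/* Assumed stand-in for now(): the current time in seconds. */
static double now_seconds(void)
{
    struct timespec ts;

    clock_gettime(CLOCK_MONOTONIC, &ts);
    return (double)ts.tv_sec + (double)ts.tv_nsec / 1e9;
}

/* Time 256 copies of the fragment inside a loop of 1024 iterations
 * and return the average cost of one copy in nanoseconds. Loop
 * overhead is not subtracted in this sketch. */
static double measure_fragment_ns(void)
{
    double start, stop;
    int i;

    counter = 0;
    start = now_seconds();
    for (i = 0; i < LOOP_COUNT; i++) {
        REP256(counter = counter + 1;)
    }
    stop = now_seconds();
    return (stop - start) * 1e9 / (256.0 * LOOP_COUNT);
}
```

Unrolling the fragment 256 times keeps the loop counter and branch a
small part of the measured time, at the price of a larger loop body.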

The variables and code fragments used are defined in \verb|prelude.c|
and \verb|postlude.c|. All variables are declared \verb|volatile|
to prevent the compiler from optimising away accesses to them.
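
As a small illustration of why \verb|volatile| matters here, consider
the two functions below. The names are hypothetical and the fragment is
not taken from \verb|prelude.c|.

```c
static int plain_value;
static volatile int volatile_value;

/* With optimisation enabled the compiler is allowed to drop the
 * first store, because nothing can observe it before it is
 * overwritten by the second one. */
static void store_twice_plain(void)
{
    plain_value = 1;
    plain_value = 2;
}

/* Every access to a volatile object counts as an observable side
 * effect, so both stores have to be carried out. */
static void store_twice_volatile(void)
{
    volatile_value = 1;
    volatile_value = 2;
}
```

Both functions leave the value 2 behind; the difference only shows up
in the generated assembler code, which is what the measurements here
are concerned with.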

In addition all programs are compiled with the following options:

\begin{itemize}
\item \verb|-g| -- debugger information is always turned on since we
  need it for parts of the nana library. Note that \verb|gcc| happily
  optimises code with \verb|-g| enabled.
\item \verb|-fno-defer-pop| -- \verb|gcc| by default only pops
  arguments off the stack after a number of calls. This option
  causes each call to immediately pop its arguments off the stack.
\end{itemize}

\section{Detailed results}
This section contains some more detailed results.

\subsection{Assert}
\input{assert.mtex}

\subsection{Quantifiers}
\input{quantifiers.mtex}

\subsection{Log}
\input{log.mtex}

\subsection{Nop}
\input{nop.mtex}

\subsection{C Operations}
\input{c_ops.mtex}

\subsection{Data cache testing}
These are just some tests using a large array which should
hopefully exceed the size of the D-cache on your machine.

\input{dcache.mtex}
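
For readers unfamiliar with this kind of test, the following is a
minimal sketch of walking a large array. It is not the code behind the
results above: the 16 MiB array size, the 64 byte stride (an assumed
cache line size) and all names are hypothetical.

```c
#include <stddef.h>

/* Assumed to be larger than the D-cache of the machine under test. */
#define BIG_BYTES  ((size_t)16 * 1024 * 1024)
/* Assumed cache line size in bytes. */
#define LINE_BYTES ((size_t)64)

/* volatile so that the compiler cannot remove the accesses */
static volatile unsigned char big[BIG_BYTES];

/* Read and write one byte in every assumed cache line of the array.
 * Returns the number of bytes touched. */
static size_t touch_big_array(void)
{
    size_t i;
    size_t touched = 0;

    for (i = 0; i < BIG_BYTES; i += LINE_BYTES) {
        big[i] = (unsigned char)(big[i] + 1);
        touched++;
    }
    return touched;
}
```

Timing a call to \verb|touch_big_array()| and dividing by the returned
count gives a rough cost per access. Whether each access really misses
the cache depends on the hardware, for example on prefetching.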

\section{Code}
This section contains a listing of all the generated code fragments.

\begin{itemize}
\input{code.mtex}
\end{itemize}

\section{Conclusion}
Finally, if you have used this package on an interesting (or
uninteresting) architecture please mail me a copy of the results for
the nana home page.

\end{document}