%
% perf.tex - the final performance report which includes the rest
% of the data.
%
% Copyright (c) 1998 Phil Maker <pjm@gnu.org>
% All rights reserved.
%
% Redistribution and use in source and binary forms, with or without
% modification, are permitted provided that the following conditions
% are met:
% 1. Redistributions of source code must retain the above copyright
%    notice, this list of conditions and the following disclaimer.
% 2. Redistributions in binary form must reproduce the above copyright
%    notice, this list of conditions and the following disclaimer in the
%    documentation and/or other materials provided with the distribution.
%
% THIS SOFTWARE IS PROVIDED BY THE AUTHOR AND CONTRIBUTORS ``AS IS'' AND
% ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
% IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
% ARE DISCLAIMED. IN NO EVENT SHALL THE AUTHOR OR CONTRIBUTORS BE LIABLE
% FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
% DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS
% OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION)
% HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT
% LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY
% OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF
% SUCH DAMAGE.
%
% $Id: perf.tex,v 1.1.1.1 1999/09/12 03:26:49 pjm Exp $
%

\documentclass[a4paper]{article}

\begin{document}
\title{Nana Performance Summary}
\author{P.J.Maker}
\maketitle

This document contains some measurements for the space and time costs
for the nana library. Data provided includes:

\begin{itemize}
\item Time cost in ns.
\item Space in bytes.
\item Generated assembler code.
\item Results for various compiler options such as optimisation.
\end{itemize}

These test results were generated using:

\begin{itemize}
\item Operating System, release and cpu: \input{uname.mtex}
\item CPU speed is analysed by the supplied \verb+bogomips+ program
  which gives a value of \input{bogomips.mtex}. BogoMips (bogus
  millions of instructions per second) are a standard unit of measure
  invented by Linus Torvalds for use in Linux. They represent something
  vaguely related to the number of instructions executed per second.
\item Note: nana should be installed using \verb|I_DEFAULT=fast ./configure|
  for these measurements.
\end{itemize}

The following table contains a summary of the results:

 \input{summary.mtex}

Note:

\begin{itemize}
\item \verb|assert()| is your system's standard assert macro.
\item \verb|TRAD_assert()| is the traditional implementation of assert
  which calls \verb|fprintf()| and \verb|exit()| directly.
\item \verb|I()| is the nana equivalent of \verb|assert()|.
\item \verb|DI()| is implemented using the debugger. It is very space efficient
  but takes longer than inline C code (such as \verb|I()|).
\item \verb|I(A(...))| checks that all 10 values in the array
  \verb|a| are positive.
\item \verb|now()| measures the current time and returns a \verb|double|
  value.
\item \verb|L()| optionally prints a debugging message.
\item \verb|DL()| is the debugger equivalent of \verb|L()|.
\end{itemize}

Note that the measurement code depends on GNU CC extensions and is not
a thing of great beauty.

\section{How was it measured?}

See \verb|Makefile.am| and \verb|measure.sh| for the true story; a
quick summary would be:

\begin{itemize}
\item The code fragments are stored one per line in a file such
  as \verb|summary.tst|.
\item The \verb|measure.sh| program takes as arguments a set of
  compiler flags such as \verb|-O| which are used for each line
  in the input file.
\item The code fragment is copied 256 times by a macro inside a
  loop which in turn executes 1024 times. For small examples
  it is expected that the entire loop will fit inside the cache.
\item Time is measured using the nana \verb|now()| function.
\end{itemize}

The variables and code fragments used are defined in \verb|prelude.c|
and \verb|postlude.c|. All variables are declared \verb|volatile|
to prevent the compiler from optimising away accesses to them.

In addition all programs are compiled with the following options:

\begin{itemize}
\item \verb|-g| -- debugger information is always turned on since we
  need it for parts of the nana library. Note that \verb|gcc| happily
  optimises code with \verb|-g| enabled.
\item \verb|-fno-defer-pop| -- \verb|gcc| by default only pops
  arguments off the stack after a number of calls. This option
  causes each call to immediately pop its arguments off the stack.
\end{itemize}

\section{Detailed results}
This section contains some more detailed results.

\subsection{Assert}
\input{assert.mtex}

\subsection{Quantifiers}
\input{quantifiers.mtex}

\subsection{Log}
\input{log.mtex}

\subsection{Nop}
\input{nop.mtex}

\subsection{C Operations}
\input{c_ops.mtex}

\subsection{Data cache testing}
These are just some tests using a large array which should
hopefully exceed the size of the D-cache on your machine.

\input{dcache.mtex}

\section{Code}
This section contains a listing of all the generated code fragments.

\begin{itemize}
\input{code.mtex}
\end{itemize}

\section{Conclusion}
Finally, if you have used this package on an interesting (or
uninteresting) architecture please mail me a copy of the results for
the nana home page.

\end{document}