1.pn 0
2.ls1
3.EQ
4delim $$
5.EN
6.ev1
7.ps-2
8.vs-2
9.ev
10\&
11.sp 5
12.ps+4
13.ce
14ARITHMETIC CODING FOR DATA COMPRESSION
15.ps-4
16.sp4
17.ce
18Ian H. Witten, Radford M. Neal, and John G. Cleary
19.sp2
20.ce4
21Department of Computer Science
22The University of Calgary
232500 University Drive NW
24Calgary, Canada T2N 1N4
25.sp2
26.ce
August 1986, revised January 1987
28.sp 8
29.in+1i
30.ll-1i
31The state of the art in data compression is arithmetic coding, not
32the better-known Huffman method.
33Arithmetic coding gives greater compression, is faster for adaptive models,
34and clearly separates the model from the channel encoding.
35This paper presents a practical implementation of the technique.
36.sp 3
37.in +0.5i
38.ti -0.5i
39\fICR Categories and subject descriptors:\fR
40.br
41E.4 [DATA] Coding and Information Theory \(em Data Compaction and Compression
42.br
43H.1.1 [Models and Principles] Systems and Information Theory \(em Information Theory
44.sp
45.ti -0.5i
46\fIGeneral terms:\fR  algorithms, performance
47.sp
48.ti -0.5i
49\fIAdditional key words and phrases:\fR  arithmetic coding, Huffman coding, adaptive modeling
50.ll+1i
51.in 0
52.bp
53.sh "Introduction"
54.pp
55Arithmetic coding is superior in most respects to the better-known Huffman
56(1952) method.
57.[
58Huffman 1952 method construction minimum-redundancy codes
59.]
60It represents information at least as compactly, sometimes considerably more
61so.
62Its performance is optimal without the need for blocking of input data.
63It encourages a clear separation between the model for representing data and
64the encoding of information with respect to that model.
65It accommodates adaptive models easily.
66It is computationally efficient.
67Yet many authors and practitioners seem unaware of the technique.
68Indeed there is a widespread belief that Huffman coding cannot be improved
69upon.
70.pp
71This paper aims to rectify the situation by presenting an accessible
72implementation of arithmetic coding, and detailing its performance
73characteristics.
74The next section briefly reviews basic concepts of data compression and
75introduces the model-based approach which underlies most modern techniques.
76We then outline the idea of arithmetic coding using a simple example.
Following that, programs are presented for both encoding and decoding, in which
78the model occupies a separate module so that different ones can easily be
79used.
80Next we discuss the construction of fixed and adaptive models.
81The subsequent section details the compression efficiency and execution time
82of the programs, including the effect of different arithmetic word lengths on
83compression efficiency.
84Finally, we outline a few applications where arithmetic coding is appropriate.
85.sh "Data compression"
86.pp
87To many, data compression conjures up an assortment of \fIad hoc\fR
88techniques such as converting spaces in text to tabs, creating special codes
89for common words, or run-length coding of picture data (eg see Held, 1984).
90.[
91Held 1984 data compression techniques applications
92.]
93This contrasts with the more modern model-based paradigm for
94coding, where from an \fIinput string\fR of symbols and a \fImodel\fR, an
95\fIencoded string\fR is produced which is (usually) a compressed version of
96the input.
97The decoder, which must have access to the same model,
98regenerates the exact input string from the encoded string.
99Input symbols are drawn from some well-defined set such as the ASCII or
100binary alphabets;
101the encoded string is a plain sequence of bits.
102The model is a way of calculating, in any given context, the distribution of
103probabilities for the next input symbol.
104It must be possible for the decoder to produce exactly the same probability
105distribution in the same context.
106Compression is achieved by transmitting the more probable symbols in fewer
107bits than the less probable ones.
108.pp
109For example, the model may assign a predetermined probability to each symbol
110in the ASCII alphabet.
111No context is involved.
112These probabilities may be determined by counting frequencies in
113representative samples of text to be transmitted.
114Such a \fIfixed\fR model is communicated in advance to both encoder and
115decoder, after which it is used for many messages.
116.pp
117Alternatively, the probabilities the model assigns may change as each symbol
118is transmitted, based on the symbol frequencies seen \fIso far\fR in this
119message.
120This is an \fIadaptive\fR model.
121There is no need for a representative sample of text, because each message
122is treated as an independent unit, starting from scratch.
123The encoder's model changes with each symbol transmitted, and the decoder's
124changes with each symbol received, in sympathy.
125.pp
126More complex models can provide more accurate probabilistic predictions and
127hence achieve greater compression.
128For example, several characters of previous context could condition the
129next-symbol probability.
Such methods have enabled mixed-case English text to be encoded in around
2.2\ bit/char with two quite different kinds of model
(Cleary & Witten, 1984b; Cormack & Horspool, 1985).
133.[
134Cleary Witten 1984 data compression
135%D 1984b
136.]
137.[
138Cormack Horspool 1985 dynamic Markov
139%O April
140.]
141Techniques which do not separate modeling from coding so distinctly, like
142that of Ziv & Lempel (1978), do not seem to show such great potential for
143compression, although they may be appropriate when the aim is raw speed rather
144than compression performance (Welch, 1984).
145.[
146Ziv Lempel 1978 compression of individual sequences
147.]
148.[
149Welch 1984 data compression
150.]
151.pp
152The effectiveness of any model can be measured by the \fIentropy\fR of the
153message with respect to it, usually expressed in bits/symbol.
154Shannon's fundamental theorem of coding states that given messages randomly
generated from a model, it is impossible to encode them into fewer bits
156(on average) than the entropy of that model (Shannon & Weaver, 1949).
157.[
158Shannon Weaver 1949
159.]
160.pp
161A message can be coded with respect to a model using either Huffman or
162arithmetic coding.
163The former method is frequently advocated as the best possible technique for
164reducing the encoded data rate.
165But it is not.
166Given that each symbol in the alphabet must translate into an integral number
167of bits in the encoding, Huffman coding indeed achieves ``minimum
168redundancy''.
169In other words, it performs optimally if all symbol probabilities are
170integral powers of 1/2.
171But this is not normally the case in practice;
172indeed, Huffman coding can take up to one extra bit per symbol.
173The worst case is realized by a source in which one symbol has probability
174approaching unity.
175Symbols emanating from such a source convey negligible information on average,
176but require at least one bit to transmit (Gallagher, 1978).
177.[
178Gallagher 1978 variations on a theme by Huffman
179.]
180Arithmetic coding dispenses with the restriction that each symbol translates
181into an integral number of bits, thereby coding more efficiently.
182It actually achieves the theoretical entropy bound to compression efficiency
183for any source, including the one just mentioned.
184.pp
185In general, sophisticated models expose the deficiencies of Huffman coding
186more starkly than simple ones.
187This is because they more often predict symbols with probabilities close to
188one, the worst case for Huffman coding.
189For example, the techniques mentioned above which code English text in
1902.2\ bit/char both use arithmetic coding as the final step, and performance
191would be impacted severely if Huffman coding were substituted.
192Nevertheless, since our topic is coding and not modeling, the illustrations in
193this paper all employ simple models.
194Even then, as we shall see, Huffman coding is inferior to arithmetic coding.
195.pp
196The basic concept of arithmetic coding can be traced back to Elias in the
197early 1960s (see Abramson, 1963, pp 61-62).
198.[
199Abramson 1963
200.]
201Practical techniques were first introduced by Rissanen (1976) and
202Pasco (1976), and developed further in Rissanen (1979).
203.[
204Rissanen 1976 Generalized Kraft Inequality
205.]
206.[
207Pasco 1976
208.]
209.[
210Rissanen 1979 number representations
211.]
212.[
213Langdon 1981 tutorial arithmetic coding
214.]
215Details of the implementation presented here have not appeared in the
216literature before; Rubin (1979) is closest to our approach.
217.[
218Rubin 1979 arithmetic stream coding
219.]
220The reader interested in the broader class of arithmetic codes is referred
221to Rissanen & Langdon (1979);
222.[
223Rissanen Langdon 1979 Arithmetic coding
224.]
225a tutorial is available in Langdon (1981).
226.[
227Langdon 1981 tutorial arithmetic coding
228.]
229Despite these publications, the method is not widely known.
230A number of recent books and papers on data compression mention it only in
231passing, or not at all.
232.sh "The idea of arithmetic coding"
233.pp
234In arithmetic coding a message is represented by an
235interval of real numbers between 0 and 1.
236As the message becomes longer, the interval needed to represent it becomes
237smaller, and the number of bits needed to specify that interval grows.
238Successive symbols of the message reduce the size of the
239interval in accordance with the symbol probabilities generated by the
240model.
241The more likely symbols reduce the range by less than the unlikely symbols,
242and hence add fewer bits to the message.
243.pp
244Before anything is transmitted, the range for the message is the entire
245interval [0,\ 1)\(dg.
246.FN
247\(dg [0,\ 1) denotes the half-open interval 0\(<=\fIx\fR<1.
248.EF
249As each symbol is processed, the range is narrowed to that portion of it
250allocated to the symbol.
251For example, suppose the alphabet is {\fIa,\ e,\ i,\ o,\ u,\ !\fR}, and a
252fixed model is used with probabilities shown in Table\ 1.
253Imagine transmitting the message \fIeaii!\fR.
254Initially, both encoder and decoder know that the range is [0,\ 1).
255After seeing the first symbol, \fIe\fR, the encoder narrows it to
256[0.2,\ 0.5), the range the model allocates to this symbol.
257The second symbol, \fIa\fR, will narrow this new range to the first 1/5 of it,
258since \fIa\fR has been allocated [0,\ 0.2).
259This produces [0.2,\ 0.26) since the previous range was 0.3 units long and
2601/5 of that is 0.06.
261The next symbol, \fIi\fR, is allocated [0.5,\ 0.6), which when applied to
262[0.2,\ 0.26) gives the smaller range [0.23,\ 0.236).
263Proceeding in this way, the encoded message builds up as follows:
264.LB
265.nf
266.ta \w'after seeing   'u +0.5i +\w'[0.23354, 'u
267initially		[0,	1)
268after seeing	\fIe\fR	[0.2,	0.5)
269	\fIa\fR	[0.2,	0.26)
270	\fIi\fR	[0.23,	0.236)
271	\fIi\fR	[0.233,	0.2336)
272	\fI!\fR	[0.23354,	0.2336)
273.fi
274.LE
275Figure\ 1 shows another representation of the encoding process.
276The vertical bars with ticks represent the symbol probabilities stipulated
277by the model.
278After the first symbol, \fIe\fR, has been processed, the model is scaled
279into the range [0.2,\ 0.5), as shown in part (a).
280The second symbol, \fIa\fR, scales it again into the range [0.2,\ 0.26).
281But the picture cannot be continued in this way without a magnifying glass!
282Consequently Figure\ 1(b) shows the ranges expanded to full height at every
283stage, and marked with a scale which gives the endpoints as numbers.
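.pp
The narrowing process is easily expressed in code.
The following fragment is a sketch only, not part of the programs of
Figures\ 3 and 4; it hard-codes the Table\ 1 model as a cumulative
probability array and uses double-precision arithmetic to reproduce the
intervals above.
.LB
.nf
/* Sketch: floating-point narrowing with the Table 1 model.              */
/* Symbols 0 to 5 stand for a, e, i, o, u, ! in that order.              */
#include <stdio.h>

static const double cum[7] = { 0.0, 0.2, 0.5, 0.6, 0.8, 0.9, 1.0 };

int main(void)
{   int message[5] = { 1, 0, 2, 2, 5 };        /* the message  e a i i !  */
    double low = 0.0, high = 1.0;
    int i;
    for (i = 0; i < 5; i++) {
        double range = high - low;
        high = low + range*cum[message[i]+1];  /* upper end of the symbol */
        low  = low + range*cum[message[i]];    /* lower end of the symbol */
        printf("[%g, %g)   ", low, high);      /* the intervals tabulated above */
    }
    return 0;
}
.fi
.LE
Note that $high$ must be computed before $low$ is changed, since both use
the old value of $low$.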
284.pp
285Suppose all the decoder knows about the message is the final range,
286[0.23354,\ 0.2336).
287It can immediately deduce that the first character was \fIe\fR, since the
288range lies entirely within the space the model of Table\ 1 allocates for
289\fIe\fR.
290Now it can simulate the operation of the \fIen\fR\^coder:
291.LB
292.nf
293.ta \w'after seeing   'u +0.5i +\w'[0.2, 'u
294initially		[0,	1)
295after seeing	\fIe\fR	[0.2,	0.5)
296.fi
297.LE
298This makes it clear that the second character of the message is \fIa\fR,
299since this will produce the range
300.LB
301.nf
302.ta \w'after seeing   'u +0.5i +\w'[0.2, 'u
303after seeing	\fIa\fR	[0.2,	0.26)
304.fi
305.LE
306which entirely encloses the given range [0.23354,\ 0.2336).
307Proceeding like this, the decoder can identify the whole message.
308.pp
309It is not really necessary for the decoder to know both ends of the range
310produced by the encoder.
311Instead, a single number within the range \(em for example, 0.23355 \(em will
312suffice.
313(Other numbers, like 0.23354, 0.23357, or even 0.23354321, would do just as
314well.)  \c
315However, the decoder will face the problem of detecting the end of the
316message, to determine when to stop decoding.
317After all, the single number 0.0 could represent any of \fIa\fR, \fIaa\fR,
318\fIaaa\fR, \fIaaaa\fR, ...\ .
319To resolve the ambiguity we ensure that each message ends with a special
320EOF symbol known to both encoder and decoder.
321For the alphabet of Table\ 1, \fI!\fR will be used to terminate messages,
322and only to terminate messages.
323When the decoder sees this symbol it stops decoding.
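.pp
To make the decoding process equally concrete, here is a companion sketch,
again not taken from Figure\ 3, which starts from the single number 0.23355
and the Table\ 1 model and recovers \fIeaii!\fR, stopping at the terminating
symbol as just described.
.LB
.nf
/* Sketch: decoding a single number with the Table 1 model.              */
#include <stdio.h>

static const double cum[7] = { 0.0, 0.2, 0.5, 0.6, 0.8, 0.9, 1.0 };
static const char letters[6] = { 'a', 'e', 'i', 'o', 'u', '!' };

int main(void)
{   double value = 0.23355, low = 0.0, high = 1.0;
    for (;;) {
        double range = high - low;
        int s = 0;                             /* find the symbol whose    */
        while (low + range*cum[s+1] <= value)  /* allocated range still    */
            s++;                               /* contains value           */
        putchar(letters[s]);                   /* prints  e a i i !        */
        if (letters[s] == '!')                 /* the terminating symbol   */
            break;
        high = low + range*cum[s+1];           /* narrow the interval just */
        low  = low + range*cum[s];             /* as the encoder did       */
    }
    return 0;
}
.fi
.LE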
324.pp
325Relative to the fixed model of Table\ 1, the entropy of the 5-symbol message
326\fIeaii!\fR is
327.LB
328$- ~ log ~ 0.3 ~ - ~ log ~ 0.2 ~ - ~ log ~ 0.1 ~ - ~ log ~ 0.1 ~ - ~ log ~ 0.1 ~~=~~ - ~ log ~ 0.00006 ~~ approx ~~ 4.22$
329.LE
330(using base 10, since the above encoding was performed in decimal).
331This explains why it takes 5\ decimal digits to encode the message.
332In fact, the size of the final range is $0.2336 ~-~ 0.23354 ~~=~~ 0.00006$,
333and the entropy is the negative logarithm of this figure.
334Of course, we normally work in binary, transmitting binary digits and
335measuring entropy in bits.
336.pp
337Five decimal digits seems a lot to encode a message comprising four vowels!
It is perhaps unfortunate that our example ended up expanding the message
rather than compressing it.
340Needless to say, however, different models will give different entropies.
341The best single-character model of the message \fIeaii!\fR is the set of
342symbol frequencies
343{\fIe\fR\ (0.2),  \fIa\fR\ (0.2),  \fIi\fR\ (0.4),  \fI!\fR\ (0.2)},
344which gives an entropy for \fIeaii!\fR of 2.89\ decimal digits.
345Using this model the encoding would be only 3\ digits long.
Moreover, as noted earlier, more sophisticated models give much better
347performance in general.
348.sh "A program for arithmetic coding"
349.pp
350Figure\ 2 shows a pseudo-code fragment which summarizes the encoding and
351decoding procedures developed in the last section.
352Symbols are numbered 1, 2, 3, ...
353The frequency range for the $i$th symbol is from $cum_freq[i]$ to
354$cum_freq[i-1]$.
355$cum_freq[i]$ increases as $i$ decreases, and $cum_freq[0] = 1$.
356(The reason for this ``backwards'' convention is that later, $cum_freq[0]$
357will contain a normalizing factor, and it will be convenient to have it
358begin the array.)  \c
359The ``current interval'' is [$low$,\ $high$); and for both encoding and
360decoding this should be initialized to [0,\ 1).
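.pp
In this notation, a single encoding step reduces to a few lines of code.
The following fragment is a paraphrase, not a copy of Figure\ 2, with
$low$, $high$, and $cum_freq[ \| ]$ taken to be fractions in [0,\ 1):
.LB
.nf
/* Sketch: one encoding step under the backwards cum_freq[] convention.  */
void encode_step(double *low, double *high,
                 const double cum_freq[], int symbol)
{
    double range = *high - *low;
    *high = *low + range*cum_freq[symbol-1];   /* the larger bound first, */
    *low  = *low + range*cum_freq[symbol];     /* then the smaller one    */
}
.fi
.LE
The decoder performs the same narrowing once it has identified the symbol
whose allocated range contains the received value.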
361.pp
362Unfortunately, Figure\ 2 is overly simplistic.
363In practice, there are several factors which complicate both encoding and
364decoding.
365.LB
366.NP
367Incremental transmission and reception.
368.br
369The encode algorithm as described does not transmit anything until the entire
370message has been encoded; neither does the decode algorithm begin decoding
371until it has received the complete transmission.
372In most applications, an incremental mode of operation is necessary.
373.sp
374.NP
375The desire to use integer arithmetic.
376.br
377The precision required to represent the [$low$, $high$) interval grows with
378the length of the message.
379Incremental operation will help overcome this, but the potential for overflow
380and underflow must still be examined carefully.
381.sp
382.NP
383Representing the model so that it can be consulted efficiently.
384.br
385The representation used for the model should minimize the time required for
386the decode algorithm to identify the next symbol.
387Moreover, an adaptive model should be organized to minimize the
388time-consuming task of maintaining cumulative frequencies.
389.LE
390.pp
391Figure\ 3 shows working code, in C, for arithmetic encoding and decoding.
392It is considerably more detailed than the bare-bones sketch of Figure\ 2!
393Implementations of two different models are given in Figure\ 4;
394the Figure\ 3 code can use either one.
395.pp
396The remainder of this section examines the code of Figure\ 3 more closely,
397including a proof that decoding is still correct in the integer
398implementation and a review of constraints on word lengths in the program.
399.rh "Representing the model."
400Implementations of models are discussed in the next section; here we
401are concerned only with the interface to the model (lines 20-38).
402In C, a byte is represented as an integer between 0 and 255 (call this a
403$char$).
404Internally, we represent a byte as an integer between 1 and 257 inclusive
405(call this an $index$), EOF being treated as a 257th symbol.
406It is advantageous to sort the model into frequency order, to minimize the
407number of executions of the decoding loop (line 189).
408To permit such reordering, the $char$/$index$ translation is implemented as
409a pair of tables, $index_to_char[ \| ]$ and $char_to_index[ \| ]$.
410In one of our models, these tables simply form the $index$ by adding 1 to the
411$char$, but another implements a more complex translation which assigns small
412indexes to frequently-used symbols.
413.pp
414The probabilities in the model are represented as integer frequency counts,
415and cumulative counts are stored in the array $cum_freq[ \| ]$.
As before, this array is ``backwards,'' and the total frequency count \(em
417which is used to normalize all frequencies \(em appears in $cum_freq[0]$.
418Cumulative counts must not exceed a predetermined maximum, $Max_frequency$,
419and the model implementation must prevent overflow by scaling appropriately.
420It must also ensure that neighboring values in the $cum_freq[ \| ]$ array
421differ by at least 1; otherwise the affected symbol could not be transmitted.
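.pp
For concreteness, the interface just described amounts to declarations along
the following lines.
This is a paraphrase rather than a copy of lines 20-38; the names follow the
text, and the constants are chosen to be consistent with the conventions
given above.
.LB
.nf
#define No_of_chars   256                  /* number of character symbols   */
#define EOF_symbol    (No_of_chars + 1)    /* index of the EOF symbol       */
#define No_of_symbols (No_of_chars + 1)    /* total number of symbols       */
#define Max_frequency 16383                /* 2^14 - 1, limit on cum_freq[0]*/

extern int char_to_index[No_of_chars];                 /* char to index     */
extern unsigned char index_to_char[No_of_symbols+1];   /* index to char     */
extern int cum_freq[No_of_symbols+1];      /* backwards cumulative counts   */
.fi
.LE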
422.rh "Incremental transmission and reception."
423Unlike Figure\ 2, the program of Figure\ 3 represents $low$ and $high$ as
424integers.
425A special data type, $code_value$, is defined for these quantities,
426together with some useful constants:  \c
427$Top_value$, representing the largest possible $code_value$, and
428$First_qtr$, $Half$, and $Third_qtr$, representing parts of the range
429(lines 6-16).
430Whereas in Figure\ 2 the current interval is represented by
431[$low$,\ $high$), in Figure\ 3 it is [$low$,\ $high$]; that is, the range now
432includes the value of $high$.
433Actually, it is more accurate (though more confusing) to say that in the
434program of Figure\ 3 the interval represented is
435[$low$,\ $high + 0.11111 ...$).
436This is because when the bounds are scaled up to increase the precision, 0's
437are shifted into the low-order bits of $low$ but 1's are shifted into $high$.
438While it is possible to write the program to use a different convention,
439this one has some advantages in simplifying the code.
440.pp
441As the code range narrows, the top bits of $low$ and $high$ will become the
442same.
443Any bits which are the same can be transmitted immediately, since they cannot
444be affected by future narrowing.
445For encoding, since we know that $low ~ <= ~ high$, this requires code like
446.LB "nnnn"
447.nf
448.ta \w'nnnn'u +\w'if (high < 'u +\w'Half) { 'u +\w'output_bit(1); low = 2*(low\-Half); high = 2*(high\-Half)+1; 'u
449.ne 4
while (high < Half  ||  low $>=$ Half) {
451	if (high <	Half) {	output_bit(0); low = 2*low; high = 2*high+1;	}
452	if (low $>=$	Half) {	output_bit(1); low = 2*(low\-Half); high = 2*(high\-Half)+1;	}
453}
454.fi
455.LE "nnnn"
456which ensures that, upon completion, $low ~ < ~ Half ~ <= ~ high$.
457This can be found in lines 95-113 of $encode_symbol( \| )$,
458although there are some extra complications caused by underflow possibilities
459(see next subsection).
Care is taken to shift 1's in at the bottom when $high$ is scaled, as
461noted above.
462.pp
463Incremental reception is done using a number called $value$ as in Figure\ 2,
464in which processed bits flow out the top (high-significance) end and
465newly-received ones flow in the bottom.
466$start_decoding( \| )$ (lines 168-176) fills $value$ with received bits initially.
467Once $decode_symbol( \| )$ has identified the next input symbol, it shifts out
468now-useless high-order bits which are the same in $low$ and $high$, shifting
469$value$ by the same amount (and replacing lost bits by fresh input bits at the
470bottom end):
471.LB "nnnn"
472.nf
473.ta \w'nnnn'u +\w'if (high < 'u +\w'Half) { 'u +\w'value = 2*(value\-Half)+input_bit(\|); low = 2*(low\-Half); high = 2*(high\-Half)+1; 'u
474.ne 4
while (high < Half  ||  low $>=$ Half) {
476	if (high <	Half) {	value = 2*value+input_bit(\|); low = 2*low; high = 2*high+1;	}
477	if (low $>=$	Half) {	value = 2*(value\-Half)+input_bit(\|); low = 2*(low\-Half); high = 2*(high\-Half)+1;	}
478}
479.fi
480.LE "nnnn"
481(see lines 194-213, again complicated by precautions against underflow
482discussed below).
483.rh "Proof of decoding correctness."
484At this point it is worth checking that the identification of the next
485symbol by $decode_symbol( \| )$ works properly.
486Recall from Figure\ 2 that $decode_symbol( \| )$ must use $value$ to find that
487symbol which, when encoded, reduces the range to one that still includes
488$value$.
489Lines 186-188 in $decode_symbol( \| )$ identify the symbol for which
490.LB
491$cum_freq[symbol] ~ <= ~~
492left f {(value-low+1)*cum_freq[0] ~-~ 1} over {high-low+1} right f
493~~ < ~ cum_freq[symbol-1]$,
494.LE
495where $left f ~ right f$ denotes the ``integer part of'' function that
496comes from integer division with truncation.
497It is shown in the Appendix that this implies
498.LB "nnnn"
499$low ~+~~ left f {(high-low+1)*cum_freq[symbol]} over cum_freq[0] right f
~~ <= ~ value ~ <=  ~~
501low ~+~~ left f {(high-low+1)*cum_freq[symbol-1]} over cum_freq[0] right f ~~ - ~ 1$,
502.LE "nnnn"
503so $value$ lies within the new interval that $decode_symbol( \| )$
504calculates in lines 190-193.
505This is sufficient to guarantee that the decoding operation identifies each
506symbol correctly.
507.rh "Underflow."
508As Figure\ 1 shows, arithmetic coding works by scaling the cumulative
509probabilities given by the model into the interval [$low$,\ $high$] for
510each character transmitted.
511Suppose $low$ and $high$ are very close together, so close that this scaling
512operation maps some different symbols of the model on to the same integer in
513the [$low$,\ $high$] interval.
514This would be disastrous, because if such a symbol actually occurred it would
515not be possible to continue encoding.
516Consequently, the encoder must guarantee that the interval [$low$,\ $high$] is
517always large enough to prevent this.
518The simplest way is to ensure that this interval is at least as large as
519$Max_frequency$, the maximum allowed cumulative frequency count (line\ 36).
520.pp
521How could this condition be violated?
522The bit-shifting operation explained above ensures that $low$ and $high$ can
523only become close together when they straddle $Half$.
524Suppose in fact they become as close as
525.LB
526$First_qtr ~ <= ~ low ~<~ Half ~ <= ~ high ~<~ Third_qtr$.
527.LE
528Then the next two bits sent will have opposite polarity, either 01 or 10.
529For example, if the next bit turns out to be 0 (ie $high$ descends below
530$Half$ and [0,\ $Half$] is expanded to the full interval) the bit after
531that will be 1 since the range has to be above the midpoint of the expanded
532interval.
533Conversely if the next bit happens to be 1 the one after that will be 0.
534Therefore the interval can safely be expanded right now, if only we remember
535that whatever bit actually comes next, its opposite must be transmitted
536afterwards as well.
537Thus lines 104-109 expand [$First_qtr$,\ $Third_qtr$] into the whole interval,
538remembering in $bits_to_follow$ that the bit that is output next must be
539followed by an opposite bit.
540This explains why all output is done via $bit_plus_follow( \| )$
541(lines 128-135) instead of directly with $output_bit( \| )$.
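.pp
The idea behind $bit_plus_follow( \| )$ is small enough to sketch here.
The fragment below is a paraphrase rather than a copy of lines 128-135, and a
trivial $output_bit( \| )$ stub is supplied only so that it stands alone.
.LB
.nf
#include <stdio.h>

static int bits_to_follow = 0;        /* number of opposite bits pending    */

static void output_bit(int bit)       /* stub; the real routine buffers     */
{   putchar(bit ? '1' : '0');   }     /* bits into bytes                    */

static void bit_plus_follow(int bit)
{
    output_bit(bit);                  /* output the bit itself, then any    */
    while (bits_to_follow > 0) {      /* opposite bits saved up while the   */
        output_bit(!bit);             /* range straddled Half               */
        bits_to_follow -= 1;
    }
}
.fi
.LE
If three expansions have been remembered, for example, a call with a 0 bit
emits 0111 and a call with a 1 bit emits 1000, as required.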
542.pp
543But what if, after this operation, it is \fIstill\fR true that
544.LB
545$First_qtr ~ <= ~ low ~<~ Half ~ <= ~ high ~<~ Third_qtr$?
546.LE
547Figure\ 5 illustrates this situation, where the current [$low$,\ $high$]
548range (shown as a thick line) has been expanded a total of three times.
549Suppose the next bit will turn out to be 0, as indicated by the arrow in
550Figure\ 5(a) being below the halfway point.
551Then the next \fIthree\fR bits will be 1's, since not only is the arrow in the
552top half of the bottom half of the original range, it is in the top quarter,
553and moreover the top eighth, of that half \(em that is why the expansion
554can occur three times.
555Similarly, as Figure\ 5(b) shows, if the next bit turns out to be a 1 it will
556be followed by three 0's.
557Consequently we need only count the number of expansions and follow the next
558bit by that number of opposites (lines 106 and 131-134).
559.pp
560Using this technique, the encoder can guarantee that after the shifting
561operations, either
562.LB
563.ta \n(.lu-\n(.iuR
564$low ~<~ First_qtr ~<~ Half ~ <= ~ high$	(1a)
565.LE
566or
567.LB
568.ta \n(.lu-\n(.iuR
569$low ~<~ Half ~<~ Third_qtr ~ <= ~ high$.	(1b)
570.LE
571Therefore as long as the integer range spanned by the cumulative frequencies
572fits into a quarter of that provided by $code_value$s, the underflow problem
573cannot occur.
574This corresponds to the condition
575.LB
576$Max_frequency ~ <= ~~ {Top_value+1} over 4 ~~ + ~ 1$,
577.LE
578which is satisfied by Figure\ 3 since $Max_frequency ~=~ 2 sup 14 - 1$ and
579$Top_value ~=~ 2 sup 16 - 1$ (lines\ 36, 9).
580More than 14\ bits cannot be used to represent cumulative frequency
581counts without increasing the number of bits allocated to $code_value$s.
582.pp
583We have discussed underflow in the encoder only.
584Since the decoder's job, once each symbol has been decoded, is to track the
585operation of the encoder, underflow will be avoided if it performs the same
586expansion operation under the same conditions.
587.rh "Overflow."
588Now consider the possibility of overflow in the integer multiplications
589corresponding to those of Figure\ 2, which occur in lines 91-94 and 190-193
590of Figure\ 3.
591Overflow cannot occur provided the product
592.LB
593$range * Max_frequency$
594.LE
595fits within the integer word length available, since cumulative frequencies
596cannot exceed $Max_frequency$.
597$Range$ might be as large as $Top_value ~+~1$, so the largest possible product
598in Figure 3 is $2 sup 16 ( 2 sup 14 - 1 )$ which is less than $2 sup 30$.
599$Long$ declarations are used for $code_value$ (line\ 7) and $range$
600(lines\ 89, 183) to ensure that arithmetic is done to 32-bit precision.
601.rh "Constraints on the implementation."
602The constraints on word length imposed by underflow and overflow can
603be simplified by assuming frequency counts are represented in $f$\ bits, and
604$code_value$s in $c$\ bits.
605The implementation will work correctly provided
606.LB
607$f ~ <= ~ c ~ - ~2$
608.br
609$f ~+~ c ~ <= ~ p$, the precision to which arithmetic is performed.
610.LE
611In most C implementations, $p=31$ if $long$ integers are used, and $p=32$
612if they are $unsigned ~ long$.
613In Figure\ 3, $f=14$ and $c=16$.
614With appropriately modified declarations, $unsigned ~ long$ arithmetic with
615$f=15$, $c=17$ could be used.
616In assembly language $c=16$ is a natural choice because it expedites some
617comparisons and bit manipulations (eg those of lines\ 95-113 and 194-213).
618.pp
619If $p$ is restricted to 16\ bits, the best values possible are $c=9$ and
620$f=7$, making it impossible to encode a full alphabet of 256\ symbols, as each
621symbol must have a count of at least 1.
622A smaller alphabet (eg the letters, or 4-bit nibbles) could still be
623handled.
624.rh "Termination."
625To finish the transmission, it is necessary to send a unique terminating
626symbol ($EOF_symbol$, line 56) and then follow it by enough bits to ensure
627that the encoded string falls within the final range.
628Since $done_encoding( \| )$ (lines 119-123) can be sure that
629$low$ and $high$ are constrained by either (1a) or (1b) above, it need only
630transmit $01$ in the first case or $10$ in the second to remove the remaining
631ambiguity.
632It is convenient to do this using the $bit_plus_follow( \| )$ procedure
633discussed earlier.
634The $input_bit( \| )$ procedure will actually read a few more bits than were
635sent by $output_bit( \| )$ as it needs to keep the low end of the buffer full.
636It does not matter what value these bits have as the EOF is uniquely
637determined by the last two bits actually transmitted.
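.pp
The finishing step itself can be sketched in a few lines.
The fragment below paraphrases what $done_encoding( \| )$ must do; the
constants are defined to be consistent with lines 6-16, and $low$,
$bits_to_follow$, and $bit_plus_follow( \| )$ are assumed to be those of the
encoder.
.LB
.nf
#define Code_value_bits 16                    /* bits in a code value       */
#define Top_value (((long) 1 << Code_value_bits) - 1)  /* largest code value*/
#define First_qtr (Top_value/4 + 1)           /* point after first quarter  */

extern long low;                         /* low end of the current range    */
extern int  bits_to_follow;              /* opposite bits still pending     */
extern void bit_plus_follow(int bit);

void done_encoding(void)                 /* sketch of the finishing step    */
{
    bits_to_follow += 1;                 /* one more bit, plus its opposite,*/
    if (low < First_qtr)                 /* selects a quarter lying wholly  */
        bit_plus_follow(0);              /* inside the final range:         */
    else                                 /* 01 in case (1a), 10 in case (1b)*/
        bit_plus_follow(1);
}
.fi
.LE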
638.sh "Models for arithmetic coding"
639.pp
640The program of Figure\ 3 must be used with a model which provides
641a pair of translation tables $index_to_char[ \| ]$ and $char_to_index[ \| ]$,
642and a cumulative frequency array $cum_freq[ \| ]$.
643The requirements on the latter are that
644.LB
645.NP
646$cum_freq[ i-1 ] ~ >= ~ cum_freq[ i ]$;
647.NP
648an attempt is never made to encode a symbol $i$ for which
649$cum_freq[i-1] ~=~ cum_freq[i]$;
650.NP
651$cum_freq[0] ~ <= ~ Max_frequency$.
652.LE
653Provided these conditions are satisfied the values in the array need bear
654no relationship to the actual cumulative symbol frequencies in messages.
655Encoding and decoding will still work correctly, although encodings will
656occupy less space if the frequencies are accurate.
(Recall that we successfully encoded \fIeaii!\fR according to the model of
658Table\ 1, which does not actually reflect the frequencies in the message.)  \c
659.rh "Fixed models."
660The simplest kind of model is one in which symbol frequencies are fixed.
661The first model in Figure\ 4 has symbol frequencies which approximate those
662of English (taken from a part of the Brown Corpus, Kucera & Francis, 1967).
663.[
664%A Kucera, H.
665%A Francis, W.N.
666%D 1967
667%T Computational analysis of present-day American English
668%I Brown University Press
669%C Providence, RI
670.]
671However, bytes which did not occur in that sample have been given frequency
672counts of 1 in case they do occur in messages to be encoded
673(so, for example, this model will still work for binary files in which all
674256\ bytes occur).
675Frequencies have been normalized to total 8000.
676The initialization procedure $start_model( \| )$ simply computes a cumulative
677version of these frequencies (lines 48-51), having first initialized the
678translation tables (lines 44-47).
679Execution speed would be improved if these tables were used to re-order
680symbols and frequencies so that the most frequent came first in the
681$cum_freq[ \| ]$ array.
682Since the model is fixed, the procedure $update_model( \| )$, which is
683called from both $encode.c$ and $decode.c$, is null.
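.pp
The cumulative computation itself is brief.
The following sketch, which is not a copy of lines 48-51, builds the
``backwards'' array from a table of counts:
.LB
.nf
/* Sketch: build the backwards cumulative array from frequency counts.   */
void build_cum_freq(const int freq[], int cum_freq[], int no_of_symbols)
{
    int i, cum = 0;
    for (i = no_of_symbols; i >= 1; i--) {   /* cum_freq[i] is the total   */
        cum_freq[i] = cum;                   /* count of symbols numbered  */
        cum += freq[i];                      /* above i                    */
    }
    cum_freq[0] = cum;                       /* the normalizing factor     */
}
.fi
.LE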
684.pp
685An \fIexact\fR model is one where the symbol frequencies in the message are
686exactly as prescribed by the model.
687For example, the fixed model of Figure\ 4 is close to an exact model
688for the particular excerpt of the Brown Corpus from which it was taken.
689To be truly exact, however, symbols that did not occur in the excerpt would
690be assigned counts of 0, not 1 (sacrificing the capability of
691transmitting messages containing those symbols).
692Moreover, the frequency counts would not be scaled to a predetermined
693cumulative frequency, as they have been in Figure\ 4.
694The exact model may be calculated and transmitted before sending the message.
695It is shown in Cleary & Witten (1984a) that, under quite general conditions,
696this will \fInot\fR give better overall compression than adaptive coding,
697described next.
698.[
699Cleary Witten 1984 enumerative adaptive codes
700%D 1984a
701.]
702.rh "Adaptive models."
703An adaptive model represents the changing symbol frequencies seen \fIso far\fR
704in the message.
705Initially all counts might be the same (reflecting no initial information),
706but they are updated as each symbol is seen, to approximate the observed
707frequencies.
708Provided both encoder and decoder use the same initial values (eg equal
709counts) and the same updating algorithm, their models will remain in step.
710The encoder receives the next symbol, encodes it, and updates its model.
711The decoder identifies it according to its current model, and then updates its
712model.
713.pp
714The second half of Figure\ 4 shows such an adaptive model.
715This is the type of model recommended for use with Figure\ 3, for in practice
716it will outperform a fixed model in terms of compression efficiency.
717Initialization is the same as for the fixed model, except that all frequencies
718are set to 1.
719The procedure $update_model(symbol)$ is called by both $encode_symbol( \| )$
720and $decode_symbol( \| )$ (Figure\ 3 lines 54 and 151) after each symbol is
721processed.
722.pp
723Updating the model is quite expensive, because of the need to maintain
724cumulative totals.
725In the code of Figure\ 4, frequency counts, which must be maintained anyway,
726are used to optimize access by keeping the array in frequency order \(em an
727effective kind of self-organizing linear search (Hester & Hirschberg, 1985).
728.[
729Hester Hirschberg 1985
730.]
731$Update_model( \| )$ first checks to see if the new model will exceed
732the cumulative-frequency limit, and if so scales all frequencies down by a
733factor of 2 (taking care to ensure that no count scales to zero) and
734recomputes cumulative values (Figure\ 4, lines\ 29-37).
735Then, if necessary, $update_model( \| )$ re-orders the symbols to place the
736current one in its correct rank in the frequency ordering, altering the
737translation tables to reflect the change.
738Finally, it increments the appropriate frequency count and adjusts cumulative
739frequencies accordingly.
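.pp
The essentials of the update can be sketched as follows.
This fragment is a simplification, not the code of Figure\ 4: it keeps the
same ``backwards'' $cum_freq[ \| ]$ convention and the same halving rule, but
omits the frequency re-ordering and translation-table adjustment described
above.
.LB
.nf
#define No_of_symbols 257                  /* 256 characters plus EOF        */
#define Max_frequency 16383                /* 2^14 - 1, as in Figure 3       */

int freq[No_of_symbols + 1];               /* individual symbol counts       */
int cum_freq[No_of_symbols + 1];           /* backwards cumulative counts    */

/* Sketch of an adaptive update without the re-ordering of Figure 4.        */
void update_model(int symbol)
{   int i;
    if (cum_freq[0] == Max_frequency) {    /* at the limit, halve all counts */
        int cum = 0;                       /* (a count of 1 stays at 1) and  */
        for (i = No_of_symbols; i >= 0; i--) {   /* rebuild the cumulatives  */
            freq[i] = (freq[i] + 1) / 2;
            cum_freq[i] = cum;
            cum += freq[i];
        }
    }
    freq[symbol] += 1;                     /* count the symbol just seen and */
    for (i = symbol - 1; i >= 0; i--)      /* bump every cumulative count    */
        cum_freq[i] += 1;                  /* that includes it               */
}
.fi
.LE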
740.sh "Performance"
741.pp
742Now consider the performance of the algorithm of Figure\ 3, both
743in compression efficiency and execution time.
744.rh "Compression efficiency."
745In principle, when a message is coded using arithmetic coding, the number of
746bits in the encoded string is the same as the entropy of that message with
747respect to the model used for coding.
748Three factors cause performance to be worse than this in practice:
749.LB
750.NP
751message termination overhead
752.NP
the use of fixed-length rather than infinite-precision arithmetic
754.NP
755scaling of counts so that their total is at most $Max_frequency$.
756.LE
757None of these effects is significant, as we now show.
In order to isolate the effect of arithmetic coding, the model will be
759considered to be exact (as defined above).
760.pp
761Arithmetic coding must send extra bits at the end of each message, causing a
762message termination overhead.
763Two bits are needed, sent by $done_encoding( \| )$ (Figure\ 3 lines 119-123),
764in order to disambiguate the final symbol.
765In cases where a bit-stream must be blocked into 8-bit characters before
766encoding, it will be necessary to round out to the end of a block.
767Combining these, an extra 9\ bits may be required.
768.pp
769The overhead of using fixed-length arithmetic
770occurs because remainders are truncated on division.
771It can be assessed by comparing the algorithm's performance with
772the figure obtained from a theoretical entropy calculation
773which derives its frequencies from counts scaled exactly as for coding.
774It is completely negligible \(em on the order of $10 sup -4$ bits/symbol.
775.pp
776The penalty paid by scaling counts is somewhat larger, but still very
777small.
778For short messages (less than $2 sup 14$ bytes) no scaling need be done.
779Even with messages of $10 sup 5$ to $10 sup 6$ bytes, the overhead was found
780experimentally to be less than 0.25% of the encoded string.
781.pp
782The adaptive model of Figure\ 4 scales down all counts whenever the total
783threatens to exceed $Max_frequency$.
This has the effect of weighting recent events more heavily than those
earlier in the message.
786The statistics thus tend to track changes in the input sequence, which can be
787very beneficial.
788(For example, we have encountered cases where limiting counts to 6 or 7\ bits
789gives better results than working to higher precision.)  \c
790Of course, this depends on the source being modeled.
791Bentley \fIet al\fR (1986) consider other, more explicit, ways of
792incorporating a recency effect.
793.[
794Bentley Sleator Tarjan Wei 1986 locally adaptive
795%J Communications of the ACM
796.]
797.rh "Execution time."
798The program in Figure\ 3 has been written for clarity, not execution speed.
799In fact, with the adaptive model of Figure\ 4, it takes about 420\ $mu$s per
800input byte on a VAX-11/780 to encode a text file, and about the same for
801decoding.
802However, easily avoidable overheads such as procedure calls account for much
803of this, and some simple optimizations increase speed by a factor of 2.
804The following alterations were made to the C version shown:
805.LB
806.NP
807the procedures $input_bit( \| )$, $output_bit( \| )$, and
808$bit_plus_follow( \| )$ were converted to macros to eliminate
809procedure-call overhead;
810.NP
811frequently-used quantities were put in register variables;
812.NP
813multiplies by two were replaced by additions (C ``+='');
814.NP
815array indexing was replaced by pointer manipulation in the loops
816at line 189 of Figure\ 3 and lines 49-52 of the adaptive model in Figure\ 4.
817.LE
818.pp
819This mildly-optimized C implementation has an execution time of
214\ $mu$s/262\ $mu$s per input byte
821for encoding/decoding 100,000\ bytes of English text on a VAX-11/780, as shown
822in Table\ 2.
823Also given are corresponding figures for the same program on an
824Apple Macintosh and a SUN-3/75.
825As can be seen, coding a C source program of the same length took slightly
826longer in all cases, and a binary object program longer still.
827The reason for this will be discussed shortly.
828Two artificial test files were included to allow readers to replicate the
829results.
830``Alphabet'' consists of enough copies of the 26-letter alphabet to fill
831out 100,000\ characters (ending with a partially-completed alphabet).
832``Skew-statistics'' contains 10,000 copies of the string
833\fIaaaabaaaac\fR\^; it demonstrates that files may be encoded into less than
8341\ bit per character (output size of 12,092\ bytes = 96,736\ bits).
835All results quoted used the adaptive model of Figure\ 4.
836.pp
837A further factor of 2 can be gained by reprogramming in assembly language.
838A carefully optimized version of Figures\ 3 and 4 (adaptive model) was
839written in both VAX and M68000 assembly language.
840Full use was made of registers.
841Advantage was taken of the 16-bit $code_value$ to expedite some crucial
842comparisons and make subtractions of $Half$ trivial.
843The performance of these implementations on the test files is also shown in
844Table\ 2 in order to give the reader some idea of typical execution speeds.
845.pp
846The VAX-11/780 assembly language timings are broken down in Table\ 3.
847These figures were obtained with the U\s-2NIX\s+2 profile facility and are
848accurate only to within perhaps 10%\(dg.
849.FN
850\(dg This mechanism constructs a histogram of program counter values at
851real-time clock interrupts, and suffers from statistical variation as well as
852some systematic errors.
853.EF
854``Bounds calculation'' refers to the initial part of $encode_symbol( \| )$
855and $decode_symbol( \| )$ (Figure\ 3 lines 90-94 and 190-193)
856which contain multiply and divide operations.
857``Bit shifting'' is the major loop in both the encode and decode routines
858(lines 95-113 and 194-213).
``Symbol decode'' covers the $cum$ calculation in $decode_symbol( \| )$,
which requires a multiply/divide, and the following loop to identify the
next symbol (lines\ 187-189).
862Finally, ``Model update'' refers to the adaptive
863$update_model( \| )$ procedure of Figure\ 4 (lines\ 26-53).
864.pp
865As expected, the bounds calculation and model update take the same time for
866both encoding and decoding, within experimental error.
867Bit shifting was quicker for the text file than for the C program and object
868file because compression performance was better.
869The extra time for decoding over encoding is due entirely to the symbol
870decode step.
871This takes longer in the C program and object file tests because the loop of
872line\ 189 was executed more often (on average 9\ times, 13\ times, and
87335\ times respectively).
This also affects the model update time, because the same number of
cumulative counts must be incremented in Figure\ 4 lines\ 49-52.
876In the worst case, when the symbol frequencies are uniformly distributed,
877these loops are executed an average of 128 times.
878Worst-case performance would be improved by using a more complex tree
879representation for frequencies, but this would likely be slower for text
880files.
881.sh "Some applications"
882.pp
883Applications of arithmetic coding are legion.
884By liberating \fIcoding\fR with respect to a model from the \fImodeling\fR
885required for prediction, it encourages a whole new view of data compression
886(Rissanen & Langdon, 1981).
887.[
888Rissanen Langdon 1981 Universal modeling and coding
889.]
890This separation of function costs nothing in compression performance, since
891arithmetic coding is (practically) optimal with respect to the entropy of
892the model.
893Here we intend to do no more than suggest the scope of this view
894by briefly considering
895.LB
896.NP
897adaptive text compression
898.NP
899non-adaptive coding
900.NP
901compressing black/white images
902.NP
903coding arbitrarily-distributed integers.
904.LE
905Of course, as noted earlier, greater coding efficiencies could easily be
906achieved with more sophisticated models.
907Modeling, however, is an extensive topic in its own right and is beyond the
908scope of this paper.
909.pp
910.ul
911Adaptive text compression
912using single-character adaptive frequencies shows off arithmetic coding to
913good effect.
914The results obtained using the program of Figures\ 3 and 4 vary from
9154.8\-5.3\ bit/char for short English text files ($10 sup 3$\ to $10 sup 4$
916bytes) to 4.5\-4.7\ bit/char for long ones ($10 sup 5$ to $10 sup 6$ bytes).
917Although adaptive Huffman techniques do exist (eg Gallagher, 1978;
918Cormack & Horspool, 1984) they lack the conceptual simplicity of
919arithmetic coding.
920.[
921Gallagher 1978 variations on a theme by Huffman
922.]
923.[
924Cormack Horspool 1984 adaptive Huffman codes
925.]
926While competitive in compression efficiency for many files, they are slower.
927For example, Table\ 4 compares the performance of the mildly-optimized C
928implementation of arithmetic coding with that of the U\s-2NIX\s+2
929\fIcompact\fR program which implements adaptive Huffman coding using
930a similar model\(dg.
931.FN
932\(dg \fICompact\fR's model is essentially the same for long files (like those
933of Table\ 4) but is better for short files than the model used as an example
934in this paper.
935.EF
936Casual examination of \fIcompact\fR indicates that the care taken in
937optimization is roughly comparable for both systems, yet arithmetic coding
938halves execution time.
939Compression performance is somewhat better with arithmetic coding on all the
940example files.
941The difference would be accentuated with more sophisticated models that
942predict symbols with probabilities approaching one under certain circumstances
943(eg letter ``u'' following ``q'').
944.pp
945.ul
946Non-adaptive coding
947can be performed arithmetically using fixed, pre-specified models like that in
948the first part of Figure\ 4.
Compression performance will be better than with Huffman coding.
950In order to minimize execution time, the total frequency count,
951$cum_freq[0]$, should be chosen as a power of two so the divisions
952in the bounds calculations (Figure\ 3 lines 91-94 and 190-193) can be done
953as shifts.
954Encode/decode times of around 60\ $mu$s/90\ $mu$s should then be possible
955for an assembly language implementation on a VAX-11/780.
956A carefully-written implementation of Huffman coding, using table look-up for
957encoding and decoding, would be a bit faster in this application.
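.pp
To make the point concrete, the following sketch shows the bounds
calculation with the division replaced by a shift, on the assumption that
the model is built so that $cum_freq[0]$ is exactly $2 sup 13$; the
variables are otherwise those described for Figure\ 3.
.LB
.nf
#define Cum_freq_shift 13                  /* cum_freq[0] is 1 << 13         */

/* Sketch: narrowing the integer range when the total count is a power of 2. */
void narrow_range(long *low, long *high, const int cum_freq[], int symbol)
{
    long range = *high - *low + 1;
    *high = *low + ((range * cum_freq[symbol-1]) >> Cum_freq_shift) - 1;
    *low  = *low + ((range * cum_freq[symbol])   >> Cum_freq_shift);
}
.fi
.LE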
958.pp
959.ul
960Compressing black/white images
961using arithmetic coding has been investigated by Langdon & Rissanen (1981),
962who achieved excellent results using a model which conditioned the probability
963of a pixel's being black on a template of pixels surrounding it.
964.[
965Langdon Rissanen 1981 compression of black-white images
966.]
967The template contained a total of ten pixels, selected from those above and
968to the left of the current one so that they precede it in the raster scan.
This created 1024 different possible contexts, and for each the probability of
970the pixel being black was estimated adaptively as the picture was transmitted.
971Each pixel's polarity was then coded arithmetically according to this
972probability.
973A 20%\-30% improvement in compression was attained over earlier methods.
974To increase coding speed Langdon & Rissanen used an approximate method
975of arithmetic coding which avoided multiplication by representing
976probabilities as integer powers of 1/2.
977Huffman coding cannot be directly used in this application, as it never
978compresses with a two-symbol alphabet.
979Run-length coding, a popular method for use with two-valued alphabets,
980provides another opportunity for arithmetic coding.
981The model reduces the data to a sequence of lengths of runs of the same symbol
982(eg for picture coding, run-lengths of black followed by white followed by
983black followed by white ...).
984The sequence of lengths must be transmitted.
985The CCITT facsimile coding standard (Hunter & Robinson, 1980), for example,
986bases a Huffman code on the frequencies with which black and white runs of
987different lengths occur in sample documents.
988.[
989Hunter Robinson 1980 facsimile
990.]
991A fixed arithmetic code using these same frequencies would give better
992performance; adapting the frequencies to each particular document would be
993better still.
994.pp
995.ul
996Coding arbitrarily-distributed integers
997is often called for when using more sophisticated models of text, image,
998or other data.
999Consider, for instance, Bentley \fIet al\fR's (1986) locally-adaptive data
1000compression scheme, in which the encoder and decoder cache the last $N$
1001different words seen.
1002.[
1003Bentley Sleator Tarjan Wei 1986 locally adaptive
1004%J Communications of the ACM
1005.]
1006A word present in the cache is transmitted by sending the integer cache index.
1007Words not in the cache are transmitted by sending a new-word marker followed
1008by the characters of the word.
1009This is an excellent model for text in which words are used frequently over
1010short intervals and then fall into long periods of disuse.
1011Their paper discusses several variable-length codings for the integers used
1012as cache indexes.
1013Arithmetic coding allows \fIany\fR probability distribution to be used as the
1014basis for a variable-length encoding, including \(em amongst countless others
1015\(em the ones implied by the particular codes discussed there.
1016It also permits use of an adaptive model for cache
1017indexes, which is desirable if the distribution of cache hits is
1018difficult to predict in advance.
1019Furthermore, with arithmetic coding, the code space allotted to the cache
1020indexes can be scaled down to accommodate any desired probability for the
1021new-word marker.
1022.sh "Acknowledgement"
1023.pp
1024Financial support for this work has been provided by the
1025Natural Sciences and Engineering Research Council of Canada.
1026.sh "References"
1027.sp
1028.in+4n
1029.[
1030$LIST$
1031.]
1032.in 0
1033.bp
1034.sh "APPENDIX: Proof of decoding inequality"
1035.sp
1036Using 1-letter abbreviations for $cum_freq$, $symbol$, $low$, $high$, and
1037$value$, suppose
1038.LB
1039$c[s] ~ <= ~~ left f {(v-l+1) times c[0] ~-~ 1} over {h-l+1} right f ~~ < ~
1040c[s-1]$;
1041.LE
1042in other words,
1043.LB
1044.ta \n(.lu-\n(.iuR
1045$c[s] ~ <= ~~ {(v-l+1) times c[0] ~-~ 1} over {r} ~~-~~ epsilon ~~ <= ~
1046c[s-1] ~-~1$, 	(1)
1047.LE
1048.ta 8n
1049where	$r ~=~ h-l+1$,  $0 ~ <= ~ epsilon ~ <= ~ {r-1} over r $.
1050.sp
1051(The last inequality of (1) derives from the fact that $c[s-1]$ must be an
1052integer.)  \c
1053Then we wish to show that  $l' ~ <= ~ v ~ <= ~ h'$,  where $l'$ and $h'$
1054are the updated values for $low$ and $high$ as defined below.
1055.sp
1056.ta \w'(a)    'u
1057(a)	$l' ~ == ~~ l ~+~~ left f {r times c[s]} over c[0] right f ~~ mark
1058<= ~~ l ~+~~ {r} over c[0] ~ left [ ~ {(v-l+1) times c[0] ~-~ 1} over {r}
1059                          ~~ - ~ epsilon ~ right ]$    from (1),
1060.sp 0.5
1061$lineup <= ~~ v ~ + ~ 1 ~ - ~ 1 over c[0]$ ,
1062.sp 0.5
1063	so   $l' ~ <= ~~ v$   since both $v$ and $l'$ are integers
1064and $c[0] > 0$.
1065.sp
1066(b)	$h' ~ == ~~ l ~+~~ left f {r times c[s-1]} over c[0] right f ~~-~1~~ mark
1067>= ~~ l ~+~~  {r} over c[0] ~ left [ ~ {(v-l+1) times c[0] ~-~ 1} over {r}
1068                          ~~ + ~ 1 ~ - ~ epsilon ~ right ] ~~ - ~ 1
1069$    from (1),
1070.sp 0.5
1071$lineup >= ~~ v ~ + ~~ r over c[0] ~ left [ ~ - ~ 1 over r ~+~ 1
1072                                                  ~-~~ r-1 over r right ]
1073~~ = ~~ v$.
1074.bp
1075.sh "Captions for tables"
1076.sp
1077.nf
1078.ta \w'Figure 1  'u
1079Table 1	Example fixed model for alphabet {\fIa, e, i, o, u, !\fR}
1080Table 2	Results for encoding and decoding 100,000-byte files
1081Table 3	Breakdown of timings for VAX-11/780 assembly language version
1082Table 4	Comparison of arithmetic and adaptive Huffman coding
1083.fi
1084.sh "Captions for figures"
1085.sp
1086.nf
1087.ta \w'Figure 1  'u
1088Figure 1	(a) Representation of the arithmetic coding process
1089	(b) Like (a) but with the interval scaled up at each stage
1090Figure 2	Pseudo-code for the encoding and decoding procedures
1091Figure 3	C implementation of arithmetic encoding and decoding
1092Figure 4	Fixed and adaptive models for use with Figure 3
1093Figure 5	Scaling the interval to prevent underflow
1094.fi
1095.bp 0
1096.ev2
1097.nr x2 \w'symbol'/2
1098.nr x3 (\w'symbol'/2)+0.5i+(\w'probability'/2)
1099.nr x4 (\w'probability'/2)+0.5i
.nr x5 \w'[0.0, '
1101.nr x1 \n(x2+\n(x3+\n(x4+\n(x5+\w'0.0)'
1102.nr x0 (\n(.l-\n(x1)/2
1103.in \n(x0u
1104.ta \n(x2uC +\n(x3uC +\n(x4u +\n(x5u
1105\l'\n(x1u'
1106.sp
1107	symbol	probability	\0\0range
1108\l'\n(x1u'
1109.sp
1110	\fIa\fR	0.2	[0,	0.2)
1111	\fIe\fR	0.3	[0.2,	0.5)
1112	\fIi\fR	0.1	[0.5,	0.6)
1113	\fIo\fR	0.2	[0.6,	0.8)
1114	\fIu\fR	0.1	[0.8,	0.9)
1115	\fI!\fR	0.1	[0.9,	1.0)
1116\l'\n(x1u'
1117.sp
1118.in 0
1119.FE "Table 1  Example fixed model for alphabet {\fIa, e, i, o, u, !\fR}"
1120.bp 0
1121.ev2
1122.nr x1 0.5i+\w'\fIVAX object program\fR      '+\w'100,000      '+\w'time ($mu$s)  '+\w'time ($mu$s)    '+\w'time ($mu$s)  '+\w'time ($mu$s)    '+\w'time ($mu$s)  '+\w'time ($mu$s)'
1123.nr x0 (\n(.l-\n(x1)/2
1124.in \n(x0u
1125.ta 0.5i +\w'\fIVAX object program\fR      'u +\w'100,000      'u +\w'time ($mu$s)  'u +\w'time ($mu$s)    'u +\w'time ($mu$s)  'u +\w'time ($mu$s)    'u +\w'time ($mu$s)  'u
1126\l'\n(x1u'
1127.sp
1128			\0\0VAX-11/780	\0\0\0Macintosh	\0\0\0\0SUN-3/75
1129		output	 encode	 decode	 encode	 decode	 encode	 decode
1130		(bytes)	time ($mu$s)	time ($mu$s)	time ($mu$s)	time ($mu$s)	time ($mu$s)	time ($mu$s)
1131\l'\n(x1u'
1132.sp
1133Mildly optimized C implementation
1134.sp
1135	\fIText file\fR	\057718	\0\0214	\0\0262	\0\0687	\0\0881	\0\0\098	\0\0121
1136	\fIC program\fR	\062991	\0\0230	\0\0288	\0\0729	\0\0950	\0\0105	\0\0131
1137	\fIVAX object program\fR	\073501	\0\0313	\0\0406	\0\0950	\01334	\0\0145	\0\0190
1138	\fIAlphabet\fR	\059292	\0\0223	\0\0277	\0\0719	\0\0942	\0\0105	\0\0130
1139	\fISkew-statistics\fR	\012092	\0\0143	\0\0170	\0\0507	\0\0645	\0\0\070	\0\0\085
1140.sp
1141Carefully optimized assembly language implementation
1142.sp
1143	\fIText file\fR	\057718	\0\0104	\0\0135	\0\0194	\0\0243	\0\0\046	\0\0\058
1144	\fIC program\fR	\062991	\0\0109	\0\0151	\0\0208	\0\0266	\0\0\051	\0\0\065
1145	\fIVAX object program\fR	\073501	\0\0158	\0\0241	\0\0280	\0\0402	\0\0\075	\0\0107
1146	\fIAlphabet\fR	\059292	\0\0105	\0\0145	\0\0204	\0\0264	\0\0\051	\0\0\065
1147	\fISkew-statistics\fR	\012092	\0\0\063	\0\0\081	\0\0126	\0\0160	\0\0\028	\0\0\036
1148
1149\l'\n(x1u'
1150.sp 2
1151.nr x0 \n(.l
1152.ll \n(.lu-\n(.iu
1153.fi
1154.in \w'\fINotes:\fR  'u
1155.ti -\w'\fINotes:\fR  'u
1156\fINotes:\fR\ \ \c
1157Times are measured in $mu$s per byte of uncompressed data.
1158.sp 0.5
1159The VAX-11/780 had a floating-point accelerator, which reduces integer
1160multiply and divide times.
1161.sp 0.5
1162The Macintosh uses an 8\ MHz MC68000 with some memory wait states.
1163.sp 0.5
1164The SUN-3/75 uses a 16.67\ MHz MC68020.
1165.sp 0.5
1166All times exclude I/O and operating system overhead in support of I/O.
1167VAX and SUN figures give user time from the U\s-2NIX\s+2 \fItime\fR
1168command; on the Macintosh I/O was explicitly directed to an array.
1169.sp 0.5
1170The 4.2BSD C compiler was used for VAX and SUN; Aztec C 1.06g for Macintosh.
1171.sp
1172.ll \n(x0u
1173.nf
1174.in 0
1175.FE "Table 2  Results for encoding and decoding 100,000-byte files"
1176.bp 0
1177.ev2
1178.nr x1 \w'\fIVAX object program\fR        '+\w'Bounds calculation        '+\w'time ($mu$s)    '+\w'time ($mu$s)'
1179.nr x0 (\n(.l-\n(x1)/2
1180.in \n(x0u
1181.ta \w'\fIVAX object program\fR        'u +\w'Bounds calculation        'u +\w'time ($mu$s)    'u +\w'time ($mu$s)'u
1182\l'\n(x1u'
1183.sp
1184		 encode	 decode
1185		time ($mu$s)	time ($mu$s)
1186\l'\n(x1u'
1187.sp
1188\fIText file\fR	Bounds calculation	\0\0\032	\0\0\031
1189	Bit shifting	\0\0\039	\0\0\030
1190	Model update	\0\0\029	\0\0\029
1191	Symbol decode	\0\0\0\(em	\0\0\045
1192	Other	\0\0\0\04	\0\0\0\00
1193		\0\0\l'\w'100'u'	\0\0\l'\w'100'u'
1194		\0\0104	\0\0135
1195.sp
1196\fIC program\fR	Bounds calculation	\0\0\030	\0\0\028
1197	Bit shifting	\0\0\042	\0\0\035
1198	Model update	\0\0\033	\0\0\036
1199	Symbol decode	\0\0\0\(em	\0\0\051
1200	Other	\0\0\0\04	\0\0\0\01
1201		\0\0\l'\w'100'u'	\0\0\l'\w'100'u'
1202		\0\0109	\0\0151
1203.sp
1204\fIVAX object program\fR	Bounds calculation	\0\0\034	\0\0\031
1205	Bit shifting	\0\0\046	\0\0\040
1206	Model update	\0\0\075	\0\0\075
1207	Symbol decode	\0\0\0\(em	\0\0\094
1208	Other	\0\0\0\03	\0\0\0\01
1209		\0\0\l'\w'100'u'	\0\0\l'\w'100'u'
1210		\0\0158	\0\0241
1211\l'\n(x1u'
1212.in 0
1213.FE "Table 3  Breakdown of timings for VAX-11/780 assembly language version"
1214.bp 0
1215.ev2
1216.nr x1 \w'\fIVAX object program\fR      '+\w'100,000      '+\w'time ($mu$s)  '+\w'time ($mu$s)    '+\w'100,000      '+\w'time ($mu$s)  '+\w'time ($mu$s)'
1217.nr x0 (\n(.l-\n(x1)/2
1218.in \n(x0u
1219.ta \w'\fIVAX object program\fR      'u +\w'100,000      'u +\w'time ($mu$s)  'u +\w'time ($mu$s)    'u +\w'100,000      'u +\w'time ($mu$s)  'u +\w'time ($mu$s)'u
1220\l'\n(x1u'
1221.sp
1222	\0\0\0\0\0\0Arithmetic coding	\0\0\0Adaptive Huffman coding
1223	output	 encode	 decode	output	 encode	 decode
1224	(bytes)	time ($mu$s)	time ($mu$s)	(bytes)	time ($mu$s)	time ($mu$s)
1225\l'\n(x1u'
1226.sp
1227\fIText file\fR	\057718	\0\0214	\0\0262	\057781	\0\0550	\0\0414
1228\fIC program\fR	\062991	\0\0230	\0\0288	\063731	\0\0596	\0\0441
1229\fIVAX object program\fR	\073546	\0\0313	\0\0406	\076950	\0\0822	\0\0606
1230\fIAlphabet\fR	\059292	\0\0223	\0\0277	\060127	\0\0598	\0\0411
1231\fISkew-statistics\fR	\012092	\0\0143	\0\0170	\016257	\0\0215	\0\0132
1232\l'\n(x1u'
1233.sp 2
1234.nr x0 \n(.l
1235.ll \n(.lu-\n(.iu
1236.fi
1237.in +\w'\fINotes:\fR  'u
1238.ti -\w'\fINotes:\fR  'u
1239\fINotes:\fR\ \ \c
1240Mildly optimized C implementation used for arithmetic coding
1241.sp 0.5
1242U\s-2NIX\s+2 \fIcompact\fR used for adaptive Huffman coding
1243.sp 0.5
1244Times are for a VAX-11/780, and exclude I/O and operating system overhead in
1245support of I/O.
1246.sp
1247.ll \n(x0u
1248.nf
1249.in 0
1250.FE "Table 4  Comparison of arithmetic and adaptive Huffman coding"
1251