papers/pulldown/8.t

         $Id: 8.t,v 1.1 2001/07/04 05:29:25 itojun Exp $

.ds RH Comparisons

Comparisons

This section compares the following three approaches in terms of
their characteristics and actual behavior:
(1) 4.4BSD
 m_pullup, (2) NRL
 m_pullup2, and (3) KAME
 m_pulldown.

Comparison of assumption

Table 1 shows the assumptions made by each of the three approaches.
As mentioned earlier,
 m_pullup imposes too stringent requirement for the total length of packet headers.
 m_pullup2 is workable in most cases, although
this approach adds more restrictions than the specification claims.
 m_pulldown assumes that the single packet header is smaller than MCLBYTES,
but makes
no restriction regarding the total length of packet headers.
With a standard mbuf chain,
this is the best
 m_pulldown can do, since there is no way to hold continuous region longer than MCLBYTES.
This characteristic can contribute to better specification conformance,
since
 m_pulldown will impose fewer additional restrictions due to the
requirements of implementation.

Among the three approaches, only
 m_pulldown avoids making unnecessary copies of intermediate header data and
avoids pointer reinitialization after calls to these functions.
These attributes result in smaller overhead during input packet processing.

.nr table +1
At present,
we know of no other 4.4BSD-based IPv6/IPsec stack that addresses kernel
stack overflow issues,
although we are open to
new perspectives and new information.

Performance comparison based on simulated statistics

To compare the behavior and performance of
 m_pulldown against
 m_pullup and
 m_pullup2 using the same set of traffic and
mbuf chains, we have gathered simulated statistics for
 m_pullup and
 m_pullup2, in
 m_pulldown function.
By running a kernel using the modified
 m_pulldown function,
we can easily
gather statistics for these three functions against exactly the same traffic.

The comparison was made on a computer
(with Celeron 366MHz CPU, 192M bytes of memory)
running NetBSD 1.4.1 with the KAME IPv6/IPsec stack.
Network drivers allocate mbufs just as normal 4.4BSD does.
 m_pulldown is called whenever it is needed to ensure continuity in packet data
during inbound packet processing.
The role of the computer is as an end node, not a router.

To describe the content of the following table,
we must look at the source code fragment.
.nr figure +1
Figure \n[figure]
.nr figure -1
shows the code fragment from our source code.
The code fragment will
(1) make the TCP header on the mbuf chain
 m at offset
 hdrlen continuous, and (2) point the region with pointer
 th. We use a macro named IP6_EXTHDR_CHECK,
and the code before and after the macro expansion is shown in the figure.
.KF

.vs 7
\f[CR]/* ensure that *th from hdrlen is continuous */
/* before macro expansion... */
struct tcphdr *th;
IP6_EXTHDR_CHECK(th, struct tcphdr *, m,
 hdrlen, sizeof(*th));
if (th == NULL)
 return; /*m is already freed*/


/* after macro expansion... */
struct tcphdr *th;
int off;
struct mbuf *n;
if (m->m_len < hdrlen + sizeof(*th)) {
 n = m_pulldown(m, hdrlen, sizeof(*th), &off);
 if (n)
 th = (struct tcphdr *)(mtod(n, caddr_t) + off);
 else
 th = NULL;
} else
 th = (struct tcphdr *)(mtod(m, caddr_t) + hdrlen);
if (th == NULL)
 return;

.nr figure +1
Figure \n[figure]: code fragment for trimming mbuf chain.
.KE
In Table 2,
the first column identifies the test case.
The second column shows the number of times
the IP6_EXTHDR_CHECK macro was used.
In other words, it shows the number of times we have made checks against
mbuf length.
The remaining columns show, from left to right,
the number of times memory allocation/copy was performed in each of the variants.
In the case of
 m_pullup, we counted the number of cases we passed
 len in excess of MHLEN (96 bytes in this installation).
With
.I m_pullup2
and
.I m_pulldown,
there were no such failures.
This result suggests
that there was no packet with a packet header portion larger than
MCLBYTES (2048 bytes).
 The percentage in parentheses is ratio against the number on the first column.
In the evaluation we have used
 m_pulldown against IPv6 traffic only.
.1C
.KF


 m_pullup m_pullup2 m_pulldown
total header length MHLEN(100) MCLBYTES(2048) -
single header length - - MCLBYTES(2048)
avoids copy on intermediate headers
 no no yes
avoids pointer reinitialization
 no no yes


Table 1: assumptions in mbuf manipulation approaches.
.KE
.KF


test len checks m_pulldown m_pullup m_pullup2
 call alloc copy alloc copy fail alloc copy
(1) 204923 1706 1595 1596 165 165 1541 1596 1596
(2) 1063995 23786 22931 23008 1171 1229 22557 22895 22953
(3) 520028 1245 948 957 432 432 813 945 945
(4) 438602 180 6 6 178 178 2 24 24
(5) 5570 2236 206 206 812 812 1424 1424 1424


Table 2: number of mbuf allocation/copy against traffic
.KE
.KF


test IPv6 input TCP UDP ICMPv6 1 mbuf 2 mbufs ext mbuf(s)
(1) 29334 20892 2699 5739 3624 15632 10078
(2) 313218 215919 15930 80263 38751 172976 101491
(3) 132267 117822 8561 5882 12782 59799 59686
(4) 73160 66512 5249 1343 7475 42053 23632
(5) 1433 148 53 52 103 1203 127


Table 3: Traffic characteristics for tests in Table 2
.KE

From these measured results, we obtain several interesting observations.
 m_pullup actually failed on IPv6 trafic.
If an IPv6 implementation uses
 m_pullup for IPv6 input processing,
it must be coded carefully so as to avoid trying
 m_pullup against any length longer than MHLEN.
To achieve this end, the code copies the data portion from the mbuf
chain to a separate buffer, and the cost of memory copies becomes a penalty.

Due to the nature of this simulation,
the comparison described above may contain an implicit bias.
Since the IPv6 protocol processing code is written by using
 m_pulldown, the code is somewhat biased toward
 m_pulldown. If a programmer had to write the entire IPv6 protocol processing with
 m_pullup only, he or she would use
 m_copydata to copy intermediate
extension headers buried deep inside the header chains,
thus making it unnecessary to call
 m_pullup. In any case, a call to
 m_copydata will result in a data copy,
which causes extra overhead.
The author thinks that this bias toward
.I m_pulldown
is therefore negligible.

In all cases, the number of length checks (second column) exceeds the
number of inbound packets.
This behavior is the same as in the original 4.4BSD stack;
we did not add a significant number of length checks to the code.
This is because
 m_pulldown (or
 m_pullup in the 4.4BSD case)
is called
as necessary during the parsing of the headers.
For example, to process a TCP-over-IPv6 packet, at least 3
checks would be made against m->m_len;
these checks would be made
to grab the IPv6 header (40 bytes),
to grab the TCP header (20 bytes), and to grab the TCP header
and options (20 to 60 bytes).
The length of the TCP option part is kept inside the TCP header,
so the length needs to be checked twice for the TCP part.
If the function call overhead is more significant than the actual
.I m_pullup
or
.I m_pulldown
operation,
we may be able to blindly call
.I m_pulldown
with the maximum TCP option length
(60 bytes) in order to reduce the number of function calls.
.KF
S
Ao: box invis ht boxht*2
A: box at center of Ao "IPv6 header"
Bo: box invis ht boxht*2
B: box at center of Bo "TCP header" "(len)"
Co: box invis ht boxht*2
C: box at center of Co "TCP options"
D: box "payload"

arrow from 1/3 of the way between Ao.sw and Ao.se to Ao.sw
arrow from 2/3 of the way between Ao.sw and Ao.se to Ao.se
line invis from Ao.sw to Ao.se "40"
line from Ao.sw to 4/5 of the way between Ao.sw and A.sw
line from Ao.se to 4/5 of the way between Ao.se and A.se

arrow from 1/3 of the way between Bo.nw and Bo.ne to Bo.nw
arrow from 2/3 of the way between Bo.nw and Bo.ne to Bo.ne
line invis from Bo.nw to Bo.ne "20"
line from Bo.nw to 4/5 of the way between Bo.nw and B.nw
line from Bo.ne to 4/5 of the way between Bo.ne and B.ne

arrow from 1/3 of the way between Bo.sw and Co.se to Bo.sw
arrow from 2/3 of the way between Bo.sw and Co.se to Co.se
line invis from Bo.sw to Co.se "20 to 60"
line from Bo.sw to 4/5 of the way between Bo.sw and B.sw
line from Co.se to 4/5 of the way between Co.se and C.se
E

.nr figure +1
Figure \n[figure]: processing a TCP-over-IPv6 packet requires 3 length checks.
.KE
The results suggest that we call
 m_pulldown more frequently in ICMPv6 processing than in the processing of other protocols.
These additional calls are made for parsing of ICMPv6 and for neighbor discovery options.
The use of loopback interface also contributes to the use of
 m_pulldown.
In the tests, the number of copies made in the
 m_pullup2 case is similar to the number made in the
 m_pulldown case.
 m_pulldown makes less copies than
 m_pullup2 against packets like below:
 \(sq
A packet is kept in multiple mbuf.
With mbuf allocation policy in
 m_devget, we will see two mbufs to hold single packet
if the packet is larger than MHLEN and smaller than MHLEN + MLEN,
or the packet is larger than MCLBYTES.
 \(sq
We have extension headers in multiple mbufs.
Header portion in the packet needs to occupy first mbuf and
subsequent mbufs.

To demonstrate the difference, we have generated an IPv6 packet with a
routing header, with 4 IPv6 addresses.
The test result is presented as the 5th test in Table 2.
Packet will look like
.nr figure +1
Figure \n[figure].
.nr figure -1
First 112 bytes are occupied by an IPv6 header and a routing header,
and the remaining 16 bytes are used for an ICMPv6 header and payload.
The packet met the above condition, and
 m_pulldown made less copies than
 m_pullup2. To process single incoming ICMPv6 packet shown in the figure,
 m_pullup2 made 7 copies while
 m_pulldown made only 1 copy.
.KF

.vs 7
\f[CR]node A (source) = 2001:240:0:200:260:97ff:fe07:69ea
node B (destination) = 2001:240:0:200:a00:5aff:fe38:6f86
17:39:43.346078 A > B:
 srcrt (type=0,segleft=4,[0]B,[1]B,[2]B,[3]B):
 icmp6: echo request (len 88, hlim 64)
 6000 0000 0058 2b40 2001 0240 0000 0200
 0260 97ff fe07 69ea 2001 0240 0000 0200
 0a00 5aff fe38 6f86 3a08 0004 0000 0000
 2001 0240 0000 0200 0a00 5aff fe38 6f86
 2001 0240 0000 0200 0a00 5aff fe38 6f86
 2001 0240 0000 0200 0a00 5aff fe38 6f86
 2001 0240 0000 0200 0a00 5aff fe38 6f86
 8000 b650 030e 00c8 ce6e fd38 d553 0700


.nr figure +1
Figure \n[figure]: Packets with IPv6 routing header.
.KE

During the test, we experienced no kernel stack overflow,
thanks to a new calling sequence between IPv6 protocol handlers.

The number of copies and mbuf allocations vary very much by tests.
We need to investigate the traffic characteristic more carefully,
for example, about the average length of header portion in packets.