README.capture

This document attempts to shed some light on what happens when packets
are captured. Some things may be missing, and others may be wrong :-(
The following concentrates somewhat on the Windows port of Wireshark.


XXX: once the ongoing file reorganization is complete, the following
two lists may no longer be needed!

libpcap related source files:
-----------------------------
capture-pcap-util.c
capture-pcap-util.h
capture-pcap-util-int.h
capture-pcap-util-unix.c
capture-wpcap.c
capture-wpcap.h

Capture related source files:
-----------------------------
capture.c
capture.h
capture_loop.c
capture_loop.h
capture_opts.c
capture_sync.c
capture_ui_utils.c
capture_ui_utils.h


Capture driver
--------------
Wireshark doesn't have direct access to the capture hardware. Instead,
it uses the Libpcap/Winpcap library to capture data from network cards.

On Win32, capture-wpcap.c calls ws_module_open("wpcap.dll") to load
wpcap.dll. This DLL includes all the functions needed for packet
capturing.



Capture File
------------
The capture data can be written to one of several kinds of target:

-temporary file
-user specified "single" capture file
-user specified "ringbuffer" capture file

Which kind of file is used depends on the user settings. In principle there
is no difference in handling these files, so unless stated otherwise,
the target will simply be called the capture file.

The capture file is written using the wiretap library.


Overview
--------
Capturing is done using a two task model: the currently running (parent)
process spawns a child process to do the real capture work, namely
controlling libpcap. This two task model is used because it's necessary
to separate the capturing process (which should avoid dropping packets) from
the parent process, which might need significant time to display the data.

When a capture is started, the parent builds a "command line" and creates a
new child process with it. A pipe from the child to the parent is created,
which is used to transfer control messages.

The child will init libpcap and send the parent a "new capture file is used"
control message through the pipe.

The child cyclically takes the packet data from libpcap and saves it to disk.
From time to time it will send the parent a "new packets" control message.

If the parent process receives this "new packets" message and the option
"Update list of packets in real time" is used, it will read the packet data
from the file, dissect and display it.


If the user wants to stop the capture, this can be done in two ways: by
menu/toolbar of the parent process or the Stop button of the child process's
dialog box (which obviously cannot be used if this dialog is hidden).

The Stop button will stop the capture itself, close the control pipe and then
close itself. The parent will detect this and stop its part of the capture.

If the menu/toolbar is used, the parent will send a break signal to the child
which will lead to the same sequence as described above.

Win32 only: as the Windows implementation of signals simply doesn't work,
another pipe from the parent to the child is used to send a "close capture"
message instead of a signal.

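The child-to-parent control pipe described in this overview can be sketched
with POSIX pipe() and fork(). This is only an illustration under stated
assumptions: demo_read_child_message() and the plain-text message are
hypothetical, and Wireshark's real capture child is a separate program with
its own message format.

```c
#include <stdio.h>
#include <string.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical control message; the real wire format differs. */
#define DEMO_MSG_NEW_PACKETS "new packets"

/*
 * Fork a child that writes one control message to a pipe, and read
 * it back in the parent.  Returns the number of bytes received, or
 * -1 on error.  'buf' is NUL-terminated on success.
 */
int
demo_read_child_message(char *buf, size_t buflen)
{
    int fds[2];
    pid_t pid;
    ssize_t n, total = 0;

    if (buflen == 0 || pipe(fds) == -1)
        return -1;

    pid = fork();
    if (pid == -1) {
        close(fds[0]);
        close(fds[1]);
        return -1;
    }

    if (pid == 0) {
        /* Child: the "capture" side only writes to the pipe. */
        close(fds[0]);
        if (write(fds[1], DEMO_MSG_NEW_PACKETS,
                  strlen(DEMO_MSG_NEW_PACKETS)) == -1)
            _exit(1);
        close(fds[1]);
        _exit(0);
    }

    /* Parent: the "display" side only reads from the pipe. */
    close(fds[1]);
    while ((size_t)total < buflen - 1 &&
           (n = read(fds[0], buf + total, buflen - 1 - (size_t)total)) > 0)
        total += n;
    buf[total] = '\0';
    close(fds[0]);

    /* End-of-file on the pipe means the child closed its end. */
    waitpid(pid, NULL, 0);
    return (int)total;
}
```

The parent notices that the child has finished because read() reports
end-of-file once the child closes its end of the pipe, which mirrors how the
parent detects a stopped capture.
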

Start capture
-------------
A capture can be started in several ways, for example by requesting it on the
command line or by pressing the OK button in the "Capture Options" dialog box.
In every case, the capture is actually started by calling the capture_start()
function in capture.c.


Capture child (Loop)
--------------------
The capture child will open the target capture file, prepare pcap things,
init stop conditions, init the capture statistic dialog (if not hidden) and
start a loop which runs until the flag ld.go is FALSE.

Inside this loop,

-Qt main things are updated
-pcap_dispatch(capture_pcap_cb) is called
-the capture stop conditions are checked (ld.go is set to FALSE to finish)
-the capture statistic dialog is updated (if not hidden)

While this loop is running, pcap_dispatch() will call capture_pcap_cb()
for every packet captured. Inside this callback, the packet data is converted
into wtap (wiretap) format and saved to file. Besides saving, it tries to
do some basic dissecting (for the statistic window) by calling the
appropriate capture_xxx function.

When the user triggers a capture stop or one of the capture stop conditions
matches, the ld.go flag is set to FALSE, and the loop will stop shortly after.

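The shape of the capture loop can be sketched in a few lines of standalone C.
Everything named demo_* below is a hypothetical stand-in (for the "ld" loop
data, pcap_dispatch() and capture_pcap_cb()), and the only stop condition
modelled is a packet limit; the real loop also services the UI and writes
packets to file.

```c
#include <stdbool.h>

/* Hypothetical stand-in for the loop state ("ld" in the text). */
typedef struct {
    bool     go;            /* the loop runs while this is true */
    unsigned packet_count;  /* packets seen so far */
    unsigned max_packets;   /* stop condition: packet limit */
} demo_loop_data;

/* Stand-in for capture_pcap_cb(): called once per captured packet. */
static void
demo_packet_cb(demo_loop_data *ld)
{
    /* A real callback would convert the packet and save it to file. */
    ld->packet_count++;
}

/* Stand-in for pcap_dispatch(): delivers 'batch' packets per call. */
static void
demo_dispatch(demo_loop_data *ld, unsigned batch)
{
    for (unsigned i = 0; i < batch; i++)
        demo_packet_cb(ld);
}

/* Run the loop until a stop condition clears ld.go; return the count. */
unsigned
demo_capture_loop(unsigned batch, unsigned max_packets)
{
    demo_loop_data ld = { true, 0, max_packets };

    /* Guard: with no packets arriving, the loop below would never end. */
    if (batch == 0)
        return 0;

    while (ld.go) {
        demo_dispatch(&ld, batch);

        /* The stop conditions are checked after each dispatch. */
        if (ld.packet_count >= ld.max_packets)
            ld.go = false;
    }
    return ld.packet_count;
}
```

Because the stop condition is only checked between dispatch calls, the loop
can overshoot the limit by up to one batch, which is one reading of "the loop
will stop shortly after".
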

Capture parent
--------------
In the capture parent the cap_pipe_input_cb() function is called "cyclically"
(unix:waiting for pipe, win32:timer,1000ms) to read data from the pipe and show
it on the main screen. While the capture is in progress, no other capture file
can be opened.


Updating
--------
The actual packet capturing inside libpcap is done in its own task.
Catching and processing the packet data from libpcap is done using the
pcap_dispatch() function.


README.design

Unfortunately, the closest thing to a design document is the
"README.developer" document in the "doc" directory of the Wireshark
source tree; however, although that's useful for people adding new
protocol dissectors to Wireshark, it doesn't describe the operations of
the "core" of Wireshark.

We have no document describing that; however, a quick summary of the
part of the code you'd probably be working with is:

        for every capture file that Wireshark has open, there's a
        "capture_file" structure - Wireshark currently supports only one
        open capture file at a time, and that structure is named
        "cfile" (see the "file.h" header file);

        that structure has a member "plist", which points to a
        "frame_data" structure - every link-layer frame that Wireshark
        has read in has a "frame_data" structure (see the
        "epan/packet.h" header file), the "plist" member of "cfile"
        points to the first frame, and each frame has a "next" member
        that points to the next frame in the capture (or is null for the
        last frame);

        each "frame_data" struct has:

                a pointer to the next frame (null for the last frame);

                a pointer to the previous frame (null for the first
                frame);

                information such as the ordinal number of the frame in
                the capture, the time stamps for the capture, the size
                of the packet data in bytes, the size of the frame in
                bytes (which might not equal the size of the packet data
                if, for example, the program capturing the packets
                captured no more than the first N bytes of the capture,
                for some value of N);

                the byte offset in the capture file where the frame's
                data is located.

See the "print_packets()" routine in "file.c" for an example of a
routine that goes through all the packets in the capture; the loop does

        for (fdata = cf->plist; fdata != NULL; fdata = fdata->next) {

                update a progress bar (because it could take a
                    significant period of time to process all packets);

                read the packet data if the packet is to be printed;

                print the packet;

        }

The "wtap_seek_read()" call reads the packet data into memory; the
"epan_dissect_new()" call "dissects" that data, building a tree
structure for the fields in the packet.

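The list traversal described above can be tried out with a cut-down model of
the structures. The demo_* types below are simplified, hypothetical versions
of "frame_data" and "capture_file" (the field names are assumptions chosen for
this sketch), keeping only the members the summary mentions.

```c
#include <stddef.h>

/* Simplified, hypothetical version of the "frame_data" structure. */
typedef struct demo_frame_data {
    struct demo_frame_data *next;  /* next frame, NULL for the last */
    struct demo_frame_data *prev;  /* previous frame, NULL for the first */
    unsigned num;       /* ordinal number of the frame in the capture */
    unsigned pkt_len;   /* size of the frame, in bytes */
    unsigned cap_len;   /* bytes of packet data actually captured */
    long     file_off;  /* byte offset of the data in the capture file */
} demo_frame_data;

/* Simplified stand-in for "capture_file": only the list head. */
typedef struct {
    demo_frame_data *plist;
} demo_capture_file;

/* Walk the list the way the loop above does and total the captured bytes. */
unsigned long
demo_total_captured(const demo_capture_file *cf)
{
    unsigned long total = 0;
    const demo_frame_data *fdata;

    for (fdata = cf->plist; fdata != NULL; fdata = fdata->next)
        total += fdata->cap_len;
    return total;
}
```

An empty capture is simply a NULL "plist", so the loop body never runs and no
special case is needed.
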

README.developer

This file is a HOWTO for Wireshark developers. It describes general development
and coding practices for contributing to Wireshark no matter which part of
Wireshark you want to work on.

To learn how to write a dissector, read this first, then read the file
README.dissector.

This file is compiled to give in-depth information on Wireshark.
It is by no means all-inclusive or complete. Please feel free to send
remarks and patches to the developer mailing list.

0. Prerequisites.

Before starting to develop a new dissector, a "running" Wireshark build
environment is required - there's no such thing as a standalone "dissector
build toolkit".

How to set up such an environment is platform dependent; detailed
information about these steps can be found in the "Developer's Guide"
(available from: https://www.wireshark.org) and in the INSTALL and
README.md files of the sources root dir.

0.1. General README files.

You'll find additional information in the following README files:

- README.capture        - the capture engine internals
- README.design         - Wireshark software design - incomplete
- README.developer      - this file
- README.dissector      - How to dissect a packet
- README.display_filter - Display Filter Engine
- README.idl2wrs        - CORBA IDL converter
- README.packaging      - how to distribute a software package containing WS
- README.regression     - regression testing of WS and TS
- README.stats_tree     - a tree statistics counting specific packets
- README.tapping        - "tap" a dissector to get protocol specific events
- README.xml-output     - how to work with the PDML exported output
- wiretap/README.developer - how to add additional capture file types to
  Wiretap

0.2. Dissector related README files.

You'll find additional dissector related information in the file
README.dissector as well as the following README files:

- README.heuristic      - what are heuristic dissectors and how to write them
- README.plugins        - how to "pluginize" a dissector
- README.request_response_tracking - how to track req./resp. times and such
- README.wmem           - how to obtain "memory leak free" memory

0.3 Contributors

James Coe <jammer[AT]cin.net>
Gilbert Ramirez <gram[AT]alumni.rice.edu>
Jeff Foster <jfoste[AT]woodward.com>
Olivier Abad <oabad[AT]cybercable.fr>
Laurent Deniel <laurent.deniel[AT]free.fr>
Gerald Combs <gerald[AT]wireshark.org>
Guy Harris <guy[AT]alum.mit.edu>
Ulf Lamping <ulf.lamping[AT]web.de>

1. Portability.

Wireshark runs on many platforms, and can be compiled with a number of
different compilers; here are some rules for writing code that will work
on multiple platforms.

In general, not all C99 features can be used since some C compilers used to
compile Wireshark, such as Microsoft's C compiler, don't support all C99
features.  The C99 features that can be used are:

 - flexible array members
 - compound literals
 - designated initializers
 - "//" comments
 - mixed declarations and code
 - new block scopes for selection and iteration statements (that is, declaring
   the type in a for-loop like: for (int i = 0; i < n; i++) ;)
 - macros with a variable number of arguments (variadic macros)
 - trailing comma in enum declarations
 - inline functions (guaranteed only by use of glib.h)

Don't initialize global or static variables (variables with static
storage duration) in their declaration with non-constant values.  Not
all compilers support this.  E.g., if "i" is a static or global
variable, don't declare "i" as

    guint32 i = somearray[2];

outside a function, or as

    static guint32 i = somearray[2];

inside or outside a function; declare it as just

    guint32 i;

or

    static guint32 i;

and later, in code, initialize it with

    i = somearray[2];

instead.  Initialization of variables with automatic storage duration -
i.e., local variables - with non-constant values is permitted, so,
within a function

    guint32 i = somearray[2];

is allowed.

Don't use zero-length arrays as structure members; use flexible array members
instead.

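As a minimal sketch of the flexible array member rule, the record below keeps
its payload in a trailing "data[]" member rather than "data[0]". The demo_blob
type and demo_blob_new() are hypothetical, and plain malloc() is used only to
keep the example standalone; Wireshark code would normally allocate through
wmem (see README.wmem).

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical record with a variable-length payload. */
typedef struct {
    size_t        len;
    unsigned char data[];   /* flexible array member (C99), not data[0] */
} demo_blob;

/* Allocate a blob holding a copy of 'len' bytes from 'src'. */
demo_blob *
demo_blob_new(const unsigned char *src, size_t len)
{
    /* sizeof(demo_blob) does not include the flexible array. */
    demo_blob *blob = malloc(sizeof(demo_blob) + len);

    if (blob == NULL)
        return NULL;
    blob->len = len;
    if (len > 0)
        memcpy(blob->data, src, len);
    return blob;
}
```

The caller releases the blob with a single free(), since header and payload
share one allocation.
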
Don't use anonymous unions; not all compilers support them.
Example:

    typedef struct foo {
      guint32 foo;
      union {
        guint32 foo_l;
        guint16 foo_s;
      } u;  /* have a name here */
    } foo_t;

Don't use "uchar", "u_char", "ushort", "u_short", "uint", "u_int",
"ulong", "u_long" or "boolean"; they aren't defined on all platforms.
If you want an 8-bit unsigned quantity, use "guint8"; if you want an
8-bit character value with the 8th bit not interpreted as a sign bit,
use "guchar"; if you want a 16-bit unsigned quantity, use "guint16";
if you want a 32-bit unsigned quantity, use "guint32"; and if you want
an "int-sized" unsigned quantity, use "guint"; if you want a boolean,
use "gboolean".  Use "%d", "%u", "%x", and "%o" to print those types;
don't use "%ld", "%lu", "%lx", or "%lo", as longs are 64 bits long on
many platforms, but "guint32" is 32 bits long.

Don't use "long" to mean "signed 32-bit integer", and don't use
"unsigned long" to mean "unsigned 32-bit integer"; "long"s are 64 bits
long on many platforms.  Use "gint32" for signed 32-bit integers and use
"guint32" for unsigned 32-bit integers.

Don't use "long" to mean "signed 64-bit integer" and don't use "unsigned
long" to mean "unsigned 64-bit integer"; "long"s are 32 bits long on
many other platforms.  Don't use "long long" or "unsigned long long",
either, as not all platforms support them; use "gint64" or "guint64",
which will be defined as the appropriate types for 64-bit signed and
unsigned integers.

On LLP64 data model systems (notably 64-bit Windows), "int" and "long"
are 32 bits while "size_t" and "ptrdiff_t" are 64 bits. This means that
the following will generate a compiler warning:

    int i;
    i = strlen("hello, sailor");  /* Compiler warning */

Normally, you'd just make "i" a size_t. However, many GLib and Wireshark
functions won't accept a size_t on LLP64:

    size_t i;
    char greeting[] = "hello, sailor";
    guint byte_after_greet;

    i = strlen(greeting);
    byte_after_greet = tvb_get_guint8(tvb, i); /* Compiler warning */

Try to use the appropriate data type when you can. When you can't, you
will have to cast to a compatible data type, e.g.

    size_t i;
    char greeting[] = "hello, sailor";
    guint byte_after_greet;

    i = strlen(greeting);
    byte_after_greet = tvb_get_guint8(tvb, (gint) i); /* OK */

or

    gint i;
    char greeting[] = "hello, sailor";
    guint byte_after_greet;

    i = (gint) strlen(greeting);
    byte_after_greet = tvb_get_guint8(tvb, i); /* OK */

See http://www.unix.org/version2/whatsnew/lp64_wp.html for more
information on the sizes of common types in different data models.

When printing or displaying the values of 64-bit integral data types,
don't use "%lld", "%llu", "%llx", or "%llo" - not all platforms
support "%ll" for printing 64-bit integral data types.  Instead, for
GLib routines, and routines that use them, such as all the routines in
Wireshark that take format arguments, use G_GINT64_MODIFIER, for example:

    proto_tree_add_uint64_format_value(tree, hf_uint64, tvb, offset, len,
                                       val, "%" G_GINT64_MODIFIER "u", val);

When specifying an integral constant that doesn't fit in 32 bits, don't
use "LL" at the end of the constant - not all compilers use "LL" for
that.  Instead, put the constant in a call to the "G_GINT64_CONSTANT()"
macro, e.g.

    G_GINT64_CONSTANT(-11644473600), G_GUINT64_CONSTANT(11644473600)

rather than

    -11644473600LL, 11644473600ULL

Don't assume that you can scan through a va_list initialized by va_start
more than once without closing it with va_end and re-initializing it with
va_start.  This applies even if you're not scanning through it yourself,
but are calling a routine that scans through it, such as vfprintf() or
one of the routines in Wireshark that takes a format and a va_list as an
argument.  You must do

    va_start(ap, format);
    call_routine1(xxx, format, ap);
    va_end(ap);
    va_start(ap, format);
    call_routine2(xxx, format, ap);
    va_end(ap);

rather than

    va_start(ap, format);
    call_routine1(xxx, format, ap);
    call_routine2(xxx, format, ap);
    va_end(ap);

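A complete, standalone version of the va_list pattern might look like the
sketch below. demo_format_twice() is a hypothetical helper, with the standard
vsnprintf() playing the role of both call_routine1() and call_routine2().

```c
#include <stdarg.h>
#include <stdio.h>

/*
 * Format the same arguments into two buffers.  Each vsnprintf() call
 * scans through the va_list, so it is closed with va_end() and
 * re-initialized with va_start() before the second call.
 */
void
demo_format_twice(char *out1, char *out2, size_t size, const char *format, ...)
{
    va_list ap;

    va_start(ap, format);
    vsnprintf(out1, size, format, ap);
    va_end(ap);

    va_start(ap, format);
    vsnprintf(out2, size, format, ap);
    va_end(ap);
}
```
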
Don't use a label without a statement following it.  For example,
something such as

    if (...) {

        ...

    done:
    }

will not work with all compilers - you have to do

    if (...) {

        ...

    done:
        ;
    }

with some statement, even if it's a null statement, after the label.

Don't use "bzero()", "bcopy()", or "bcmp()"; instead, use the ANSI C
routines

    "memset()" (with zero as the second argument, so that it sets
    all the bytes to zero);

    "memcpy()" or "memmove()" (note that the first and second
    arguments to "memcpy()" are in the reverse order to the
    arguments to "bcopy()"; note also that "bcopy()" is typically
    guaranteed to work on overlapping memory regions, while
    "memcpy()" isn't, so if you may be copying from one region to a
    region that overlaps it, use "memmove()", not "memcpy()" - but
    "memcpy()" might be faster as a result of not guaranteeing
    correct operation on overlapping memory regions);

    and "memcmp()" (note that "memcmp()" returns zero, a positive
    value, or a negative value, doing an ordered comparison, rather
    than just returning 0 for "equal" and a non-zero value for
    "not equal", as "bcmp()" does).

Not all platforms necessarily have "bzero()"/"bcopy()"/"bcmp()", and
those that do might not declare them in the same header file in which
they're declared on your platform.

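The overlapping-copy case is the one that most often goes wrong, so here is a
small sketch of it. demo_drop_prefix() is a hypothetical helper that removes
the first bytes of a buffer in place; it uses only the standard memmove() and
memset().

```c
#include <string.h>

/*
 * Remove the first 'n' bytes of 'buf' by sliding the rest down.
 * Source and destination overlap, so memmove() is required;
 * memcpy() would have undefined behaviour here.
 */
void
demo_drop_prefix(char *buf, size_t len, size_t n)
{
    if (n > len)
        n = len;
    memmove(buf, buf + n, len - n);
    /* Zero the vacated tail; the fill value is memset()'s second argument. */
    memset(buf + (len - n), 0, n);
}
```
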
Don't use "index()" or "rindex()"; instead, use the ANSI C equivalents,
"strchr()" and "strrchr()".  Not all platforms necessarily have
"index()" or "rindex()", and those that do might not declare them in the
same header file in which they're declared on your platform.

Don't use "tvb_get_ptr()".  If you must use it, keep in mind that the pointer
returned by a call to "tvb_get_ptr()" is not guaranteed to be aligned on any
particular byte boundary; this means that you cannot safely cast it to any
data type other than a pointer to "char", "unsigned char", "guint8", or other
one-byte data types.  Casting a pointer returned by tvb_get_ptr() into any
multi-byte data type or structure may cause crashes on some platforms (even
if it does not crash on x86-based PCs).  Even if such mis-aligned accesses
don't crash on your platform they will be slower than properly aligned
accesses would be.  Furthermore, the data in a packet is not necessarily in
the byte order of the machine on which Wireshark is running.  Use the tvbuff
routines to extract individual items from the packet, or, better yet, use
"proto_tree_add_item()" and let it extract the items for you.

Don't use structures that overlay packet data, or into which you copy
packet data; the C programming language does not guarantee any
particular alignment of fields within a structure, and even the
extensions that try to guarantee that are compiler-specific and not
necessarily supported by all compilers used to build Wireshark.  Using
bitfields in those structures is even worse; the order of bitfields
is not guaranteed.

Don't use "ntohs()", "ntohl()", "htons()", or "htonl()"; the header
files required to define or declare them differ between platforms, and
you might be able to get away with not including the appropriate header
file on your platform but that might not work on other platforms.
Instead, use "g_ntohs()", "g_ntohl()", "g_htons()", and "g_htonl()";
those are declared by <glib.h>, and you'll need to include that anyway,
as Wireshark header files that all dissectors must include use stuff from
<glib.h>.

Don't fetch a little-endian value using "tvb_get_ntohs()" or
"tvb_get_ntohl()" and then use "g_ntohs()", "g_htons()", "g_ntohl()",
or "g_htonl()" on the resulting value - the g_ routines in question
convert between network byte order (big-endian) and *host* byte order,
not *little-endian* byte order; not all machines on which Wireshark runs
are little-endian, even though PCs are.  Fetch those values using
"tvb_get_letohs()" and "tvb_get_letohl()".

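To see why the fetch routines are independent of host byte order and of
alignment, consider how such a routine can be written. The demo_* functions
below are hypothetical stand-ins operating on a plain byte array, and they use
<stdint.h> types only so that the sketch compiles without GLib; the real
tvbuff accessors additionally check bounds.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for tvb_get_ntohs(): big-endian 16-bit fetch. */
uint16_t
demo_get_ntohs(const uint8_t *data, size_t offset)
{
    return (uint16_t)((data[offset] << 8) | data[offset + 1]);
}

/* Hypothetical stand-in for tvb_get_letohs(): little-endian 16-bit fetch. */
uint16_t
demo_get_letohs(const uint8_t *data, size_t offset)
{
    return (uint16_t)(data[offset] | (data[offset + 1] << 8));
}

/* Hypothetical stand-in for tvb_get_letohl(): little-endian 32-bit fetch. */
uint32_t
demo_get_letohl(const uint8_t *data, size_t offset)
{
    return (uint32_t)data[offset] |
           ((uint32_t)data[offset + 1] << 8) |
           ((uint32_t)data[offset + 2] << 16) |
           ((uint32_t)data[offset + 3] << 24);
}
```

Each value is assembled one byte at a time, so no pointer is ever cast to a
multi-byte type and the result is the same on big-endian and little-endian
hosts.
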
Do not use "open()", "rename()", "mkdir()", "stat()", "unlink()", "remove()",
"fopen()", "freopen()" directly.  Instead use "ws_open()", "ws_rename()",
"ws_mkdir()", "ws_stat()", "ws_unlink()", "ws_remove()", "ws_fopen()",
"ws_freopen()": these wrapper functions change the path and file name from
UTF-8 to UTF-16 on Windows, allowing the functions to work correctly when the
path or file name contains non-ASCII characters.

Also, use ws_read(), ws_write(), ws_lseek(), ws_dup(), ws_fstat(), and
ws_fdopen(), rather than read(), write(), lseek(), dup(), fstat(), and
fdopen() on descriptors returned by ws_open().

Those functions are declared in <wsutil/file_util.h>; include that
header in any code that uses any of those routines.

When opening a file with "ws_fopen()", "ws_freopen()", or "ws_fdopen()", if
the file contains ASCII text, use "r", "w", "a", and so on as the open mode
- but if it contains binary data, use "rb", "wb", and so on.  On
Windows, if a file is opened in a text mode, writing a byte with the
value of octal 12 (newline) to the file causes two bytes, one with the
value octal 15 (carriage return) and one with the value octal 12, to be
written to the file, and causes bytes with the value octal 15 to be
discarded when reading the file (to translate between C's UNIX-style
lines that end with newline and Windows' DEC-style lines that end with
carriage return/line feed).

In addition, that also means that when opening or creating a binary
file, you must use "ws_open()" (with O_CREAT and possibly O_TRUNC if the
file is to be created if it doesn't exist), and OR in the O_BINARY flag,
even on UN*X - O_BINARY is defined by <wsutil/file_util.h> as 0 on UN*X.

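The effect of the open mode can be checked with a round trip of bytes that
include octal 15 and octal 12. The helper below, demo_binary_roundtrip(), is
hypothetical, and it calls the standard fopen() only so that the sketch builds
without the Wireshark tree; in Wireshark code the same modes would be passed
to ws_fopen().

```c
#include <stdio.h>
#include <string.h>

/*
 * Write bytes that include CR and LF to 'path' and read them back,
 * using binary mode so that no newline translation takes place.
 * Returns 1 if the bytes survive unchanged, 0 otherwise.
 */
int
demo_binary_roundtrip(const char *path)
{
    static const unsigned char out[] = { 0x01, 0x0d, 0x0a, 0x0a, 0x0d, 0xff };
    unsigned char in[sizeof out];
    size_t nread;
    FILE *fp;

    fp = fopen(path, "wb");            /* "wb", not "w" */
    if (fp == NULL)
        return 0;
    if (fwrite(out, 1, sizeof out, fp) != sizeof out) {
        fclose(fp);
        return 0;
    }
    if (fclose(fp) != 0)
        return 0;

    fp = fopen(path, "rb");            /* "rb", not "r" */
    if (fp == NULL)
        return 0;
    nread = fread(in, 1, sizeof in, fp);
    fclose(fp);

    return nread == sizeof out && memcmp(in, out, sizeof out) == 0;
}
```
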
Do not include <unistd.h>, <fcntl.h>, or <io.h> to declare any of the
routines listed as replaced by routines in <wsutil/file_util.h>;
instead, just include <wsutil/file_util.h>.

If you need the declarations of other functions defined by <unistd.h>,
don't include it without protecting it with

    #ifdef HAVE_UNISTD_H

        ...

    #endif

Don't use forward declarations of static arrays without a specified size
in a fashion such as this:

    static const value_string foo_vals[];

        ...

    static const value_string foo_vals[] = {
        { 0,        "Red" },
        { 1,        "Green" },
        { 2,        "Blue" },
        { 0,        NULL }
    };

as some compilers will reject the first of those statements.  Instead,
initialize the array at the point at which it's first declared, so that
the size is known.

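A portable arrangement defines the table once, before its first use, and looks
values up by walking to the NULL terminator. The demo_value_string type and
the demo_* lookup functions below are simplified, hypothetical versions
written for this sketch; they are not the Wireshark value_string API.

```c
#include <stddef.h>

/* Simplified, hypothetical version of the value_string type. */
typedef struct {
    unsigned    value;
    const char *strptr;
} demo_value_string;

/* Initialized where first declared, so the compiler knows its size. */
static const demo_value_string demo_foo_vals[] = {
    { 0,        "Red" },
    { 1,        "Green" },
    { 2,        "Blue" },
    { 0,        NULL }      /* terminator: strptr == NULL */
};

/* Return the string for 'val', or NULL if it is not in the table. */
const char *
demo_val_to_str(unsigned val, const demo_value_string *vs)
{
    for (; vs->strptr != NULL; vs++) {
        if (vs->value == val)
            return vs->strptr;
    }
    return NULL;
}

/* Convenience wrapper bound to the table above. */
const char *
demo_foo_name(unsigned val)
{
    return demo_val_to_str(val, demo_foo_vals);
}
```

The terminator is recognised by its NULL string rather than by its value,
which is why a real entry with value 0 ("Red") can coexist with it.
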
For #define names and enum member names, prefix the names with a tag so
as to avoid collisions with other names - this might be more of an issue
on Windows, as it appears to #define names such as DELETE and
OPTIONAL.

Don't use the "positional parameters" extension that many UNIX printf's
implement, e.g.:

    snprintf(add_string, 30, " - (%1$d) (0x%1$04x)", value);

as not all UNIX printf's implement it, and Windows printf doesn't appear
to implement it.  Use something like

    snprintf(add_string, 30, " - (%d) (0x%04x)", value, value);

instead.

Don't use

    case N ... M:

as that's not supported by all compilers.

Prefer the C99 output functions from <stdio.h> instead of their GLib
replacements (note that positional format parameters are not part of C99).
In the past we used to recommend using g_snprintf() and g_vsnprintf()
instead, but since Visual Studio 2015 native C99 implementations are
available on all platforms we support. These are optimized better than
the gnulib (GLib) implementation, and on hot codepaths that can make a
noticeable difference in execution speed.

tmpnam() -> mkstemp()
tmpnam is insecure and should not be used any more. Wireshark brings its
own mkstemp implementation for use on platforms that lack mkstemp.
Note: mkstemp does not accept NULL as a parameter.

Wireshark requires minimum versions of each of the libraries it uses, in
particular GLib 2.32.0 and Qt 5.3.0 or newer. If you require a mechanism
that is available only in a newer version of a library then use its
version detection macros, e.g. "#if GLIB_CHECK_VERSION(...)" and "#if
QT_VERSION >= QT_VERSION_CHECK(...)" to conditionally compile code using
that mechanism.

When different code must be used on UN*X and Win32, use a #if or #ifdef
that tests _WIN32, not WIN32.  Try to write code portably whenever
possible, however; note that there are some routines in Wireshark with
platform-dependent implementations and platform-independent APIs, such
as the routines in epan/filesystem.c, allowing the code that calls them to
be written portably without #ifdefs.

Wireshark uses Libgcrypt as its general-purpose crypto library. To use it
from your dissector, do not include gcrypt.h directly, but use the wrapper
file wsutil/wsgcrypt.h instead.

2. String handling

Do not use functions such as strcat() or strcpy().
A lot of work has been done to remove the existing calls to these functions and
we do not want any new callers of these functions.

Instead use snprintf() since that function will, if used correctly, prevent
buffer overflows for large strings.

Be sure that all pointers passed to %s specifiers in format strings are
non-NULL. Some implementations will automatically replace NULL pointers with
the string "(NULL)", but most will not.

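What "used correctly" means for snprintf() can be shown in a few lines: pass
the real buffer size, check the return value for truncation, and never hand a
NULL pointer to %s. demo_format_field() and its "(unknown)" placeholder are
hypothetical and exist only for this sketch.

```c
#include <stdio.h>

/*
 * Build "<name>: <value>" in 'buf' without overflowing it.
 * Returns 1 if the whole string fitted, 0 if it was truncated.
 * A NULL 'name' is replaced explicitly instead of relying on how
 * the C library prints NULL for %s.
 */
int
demo_format_field(char *buf, size_t size, const char *name, unsigned value)
{
    int needed;

    if (name == NULL)
        name = "(unknown)";

    /* snprintf() never writes more than 'size' bytes, NUL included. */
    needed = snprintf(buf, size, "%s: %u", name, value);

    /* The return value is the length the full string would have had. */
    return needed >= 0 && (size_t)needed < size;
}
```

When the buffer is too small the output is cut short but still
NUL-terminated, so the caller can decide whether a truncated string is
acceptable.
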
When using a buffer to create a string, do not use a buffer stored on the stack.
I.e. do not use a buffer declared as

   char buffer[1024];

instead allocate a buffer dynamically using the string-specific or plain wmem
routines (see README.wmem) such as

   wmem_strbuf_t *strbuf;
   strbuf = wmem_strbuf_new(pinfo->pool, "");
   wmem_strbuf_append_printf(strbuf, ...

or

   char *buffer=NULL;
   ...
   #define MAX_BUFFER 1024
   buffer=wmem_alloc(pinfo->pool, MAX_BUFFER);
   buffer[0]='\0';
   ...
   snprintf(buffer, MAX_BUFFER, ...

This prevents the stack from being corrupted in case there is a bug in your
code that accidentally writes beyond the end of the buffer.


If you write a routine that will create and return a pointer to a filled in
string, and if that buffer will not be further processed or appended to after
the routine returns (except being added to the proto tree),
do not preallocate the buffer to fill in and pass it as a parameter. Instead,
pass a pointer to a pointer to the function and have it return a pointer to a
wmem-allocated buffer that will be automatically freed (see README.wmem).

I.e. do not write code such as

  static void
  foo_to_str(char *string, ... ){
     <fill in string>
  }
  ...
     char buffer[1024];
     ...
     foo_to_str(buffer, ...
     proto_tree_add_string(... buffer ...

instead write the code as

  static void
  foo_to_str(char **buffer, ... ){
    #define MAX_BUFFER x
    *buffer=wmem_alloc(pinfo->pool, MAX_BUFFER);
    <fill in *buffer>
  }
  ...
    char *buffer;
    ...
    foo_to_str(&buffer, ...
    proto_tree_add_string(... buffer ...

Use wmem_ allocated buffers. They are very fast and nice. These buffers are all
automatically free()d when the dissection of the current packet ends so you
don't have to worry about free()ing them explicitly in order to not leak memory.
Please read README.wmem.

Source files can use UTF-8 encoding, but characters outside the ASCII
range should be used sparingly. It should be safe to use non-ASCII
characters in comments and strings, but some compilers (such as GCC
versions prior to 10) may not support extended identifiers very well.
There is also no guarantee that a developer's text editor will interpret
the characters the way you intend them to be interpreted.

The majority of Wireshark encodes strings as UTF-8. The main exception
is the code that uses the Qt API, which uses UTF-16. Console output is
UTF-8, but as with the source code extended characters should be used
sparingly since some consoles (most notably Windows' cmd.exe) have
limited support for UTF-8.

3. Robustness.

Wireshark is not guaranteed to read only network traces that contain
correctly-formed packets. Wireshark is commonly used to track down networking
problems, and the problems might be due to a buggy protocol implementation
sending out bad packets.

Therefore, code does not only have to be able to handle
correctly-formed packets without, for example, crashing or looping
infinitely, it also has to be able to handle *incorrectly*-formed
packets without crashing or looping infinitely.

Here are some suggestions for making code more robust in the face
of incorrectly-formed packets:

Do *NOT* use "ws_assert()" or "ws_assert_not_reached()" with input data in
dissectors. *NO* value in a packet's data should be considered "wrong" in the
sense that it's a problem with the dissector if found; if it cannot do
anything else with a particular value from a packet's data, the
dissector should put into the protocol tree an indication that the
value is invalid, and should return.  The "expert" mechanism should be
used for that purpose.

Use assertions to catch logic errors in your program. A failed assertion
indicates a bug in the code. Use ws_assert() instead of g_assert() to
test a logic condition. Note that ws_assert() is compiled out when
WS_DISABLE_ASSERT is defined. Therefore assertions should not have any
side-effects, otherwise the program may behave inconsistently.

Use ws_assert_not_reached() instead of g_assert_not_reached() for
unreachable error conditions. For example if (and only if) you know
'myvar' can only have the values 1 and 2 do:

    switch(myvar) {
    case 1:
        (...)
        break;
    case 2:
        (...)
        break;
    default:
        ws_assert_not_reached();
        break;
    }

For dissectors use DISSECTOR_ASSERT() and DISSECTOR_ASSERT_NOT_REACHED()
instead, with the same caveats as above.

You should continue to use g_assert_true(), g_assert_cmpstr(), etc for
"test code", such as unit testing. These assertions are always active.
See the GLib Testing API documentation for the details on each of those
functions.

If there is a case where you are checking not for an invalid data item
in the packet, but for a bug in the dissector (for example, an
assumption being made at a particular point in the code about the
internal state of the dissector), use the DISSECTOR_ASSERT macro for
that purpose; this will put into the protocol tree an indication that
the dissector has a bug in it, and will not crash the application.

If you are allocating a chunk of memory to contain data from a packet,
or to contain information derived from data in a packet, and the size of
the chunk of memory is derived from a size field in the packet, make
583sure all the data is present in the packet before allocating the buffer.
584Doing so means that:
585
586    1) Wireshark won't leak that chunk of memory if an attempt to
587       fetch data not present in the packet throws an exception.
588
589and
590
591    2) it won't crash trying to allocate an absurdly-large chunk of
592       memory if the size field has a bogus large value.
593
594If you're fetching into such a chunk of memory a string from the buffer,
595and the string has a specified size, you can use "tvb_get_*_string()",
596which will check whether the entire string is present before allocating
597a buffer for the string, and will also put a trailing '\0' at the end of
598the buffer.
599
600If you're fetching into such a chunk of memory a 2-byte Unicode string
601from the buffer, and the string has a specified size, you can use
602"tvb_get_faked_unicode()", which will check whether the entire string
603is present before allocating a buffer for the string, and will also
604put a trailing '\0' at the end of the buffer.  The resulting string will be
605a sequence of single-byte characters; the only Unicode characters that
606will be handled correctly are those in the ASCII range.  (Wireshark's
607ability to handle non-ASCII strings is limited; it needs to be
608improved.)
609
610If you're fetching into such a chunk of memory a sequence of bytes from
611the buffer, and the sequence has a specified size, you can use
612"tvb_memdup()", which will check whether the entire sequence is present
613before allocating a buffer for it.
614
615Otherwise, you can check whether the data is present by using
616"tvb_ensure_bytes_exist()" although this frequently is not needed: the
617TVB-accessor routines can handle requests to read data beyond the end of
618the TVB (by throwing an exception which will either mark the frame as
619truncated--not all the data was captured--or as malformed).
620
621Note also that you should only fetch string data into a fixed-length
622buffer if the code ensures that no more bytes than will fit into the
623buffer are fetched ("the protocol ensures" isn't good enough, as
624protocol specifications can't ensure only packets that conform to the
625specification will be transmitted or that only packets for the protocol
626in question will be interpreted as packets for that protocol by
627Wireshark).  If there's no maximum length of string data to be fetched,
628routines such as "tvb_get_*_string()" are safer, as they allocate a buffer
629large enough to hold the string.  (Note that some variants of this call
630require you to free the string once you're finished with it.)
631
632If you have gotten a pointer using "tvb_get_ptr()" (which you should not
633have: you should seriously consider a better alternative to this function),
634you must make sure that you do not refer to any data past the length passed
635as the last argument to "tvb_get_ptr()"; while the various "tvb_get"
636routines perform bounds checking and throw an exception if you refer to data
637not available in the tvbuff, direct references through a pointer gotten from
638"tvb_get_ptr()" do not do any bounds checking.
639
640If you have a loop that dissects a sequence of items, each of which has
641a length field, with the offset in the tvbuff advanced by the length of
642the item, then, if the length field is the total length of the item, and
643thus can be zero, you *MUST* check for a zero-length item and abort the
644loop if you see one.  Otherwise, a zero-length item could cause the
645dissector to loop infinitely.  You should also check that the offset,
646after having the length added to it, is greater than the offset before
647the length was added to it, if the length field is greater than 24 bits
648long, so that, if the length value is *very* large and adding it to the
649offset causes an overflow, that overflow is detected.
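
The loop checks described above can be sketched as follows. This is only an
illustration written against a plain byte buffer so that it compiles on its
own: the item layout (a 4-byte big-endian length that counts the whole item)
and the name walk_items() are assumptions, not Wireshark code. A real
dissector would read through the tvb_get_*() accessors and report the problem
through the expert mechanism instead of returning -1.

```c
#include <stdint.h>

/* Hypothetical item layout: a 4-byte big-endian length that counts the
 * whole item (length field included), followed by the item body. */
#define ITEM_LEN_FIELD_SIZE 4u

/* Returns the number of well-formed items, or -1 if the data is malformed. */
int
walk_items(const uint8_t *buf, uint32_t buf_len)
{
    uint32_t offset = 0;
    int count = 0;

    while (offset < buf_len) {
        uint32_t item_len, next_offset;

        /* The length field itself must be present. */
        if (buf_len - offset < ITEM_LEN_FIELD_SIZE)
            return -1;

        item_len = ((uint32_t)buf[offset]     << 24) |
                   ((uint32_t)buf[offset + 1] << 16) |
                   ((uint32_t)buf[offset + 2] <<  8) |
                    (uint32_t)buf[offset + 3];

        /* A zero length would never advance the offset, so the loop would
         * spin forever; anything shorter than the length field is bogus too. */
        if (item_len < ITEM_LEN_FIELD_SIZE)
            return -1;

        /* Unsigned addition wraps on overflow; a wrapped result is not
         * greater than the old offset, which is how the overflow is caught. */
        next_offset = offset + item_len;
        if (next_offset <= offset)
            return -1;

        /* The item must lie entirely inside the buffer. */
        if (next_offset > buf_len)
            return -1;

        offset = next_offset;
        count++;
    }
    return count;
}
```

Note that both the zero-length check and the overflow check abort the loop
rather than trying to resynchronize; once a length is known to be bogus,
nothing after it can be trusted.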
650
651If you have a
652
653    for (i = {start}; i < {end}; i++)
654
655loop, make sure that the type of the loop index variable is large enough
656to hold the maximum {end} value plus 1; otherwise, the loop index
657variable can overflow before it ever reaches its maximum value.  In
658particular, be very careful when using gint8, guint8, gint16, or guint16
659variables as loop indices; you almost always want to use an "int"/"gint"
660or "unsigned int"/"guint" as the loop index rather than a shorter type.
661
662If you are fetching a length field from the buffer, corresponding to the
663length of a portion of the packet, and subtracting from that length a
664value corresponding to the length of, for example, a header in the
665packet portion in question, *ALWAYS* check that the value of the length
666field is greater than or equal to the length you're subtracting from it,
667and report an error in the packet and stop dissecting the packet if it's
668less than the length you're subtracting from it.  Otherwise, the
669resulting length value will be negative, which will either cause errors
670in the dissector or routines called by the dissector, or, if the value
671is interpreted as an unsigned integer, will cause the value to be
672interpreted as a very large positive value.
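
A minimal sketch of that check, assuming a made-up fixed header size and a
hypothetical helper name (neither comes from Wireshark). The point is only
that the comparison happens before the subtraction:

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical header size, for illustration only. */
#define TOY_HEADER_LEN 8u

/* Stores the payload length in *payload_len and returns true, or returns
 * false (leaving *payload_len untouched) if the length field is too small
 * to even cover the header. */
bool
toy_get_payload_length(uint32_t total_len, uint32_t *payload_len)
{
    if (total_len < TOY_HEADER_LEN)
        return false;   /* caller should flag the packet and stop dissecting */

    *payload_len = total_len - TOY_HEADER_LEN;
    return true;
}
```

Without the check, a length field of 7 would yield 7 - 8, which as an
unsigned 32-bit value is 4294967295.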
673
674Any tvbuff offset that is added to as processing is done on a packet
675should be stored in a 32-bit variable, such as an "int"; if you store it
676in an 8-bit or 16-bit variable, you run the risk of the variable
677overflowing.
678
679sprintf() -> snprintf()
Avoid the sprintf() function: it does not check the length of the output
buffer and may write into unintended memory areas. It is one of the main
causes of security problems such as buffer overflows, and of many other bugs
that are very hard to find. Use the snprintf() function declared by
<stdio.h> instead.
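
As a small illustration (the helper name format_endpoint() is made up for
this example), snprintf() never writes more than the given buffer size, and
its return value tells you whether the output was truncated:

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>

/* Formats "host:port" into buf. Returns true only if the whole string fit;
 * on truncation buf still holds a NUL-terminated prefix. */
bool
format_endpoint(char *buf, size_t buf_size, const char *host, unsigned int port)
{
    /* snprintf() returns the length the full output would have had,
     * not counting the terminating NUL, or a negative value on error. */
    int needed = snprintf(buf, buf_size, "%s:%u", host, port);

    return needed >= 0 && (size_t)needed < buf_size;
}
```

Checking the return value matters: a silently truncated string is not a
memory-safety bug, but it can still be a correctness bug.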
685
686You should test your dissector against incorrectly-formed packets.  This
687can be done using the randpkt and editcap utilities that come with the
688Wireshark distribution.  Testing using randpkt can be done by generating
689output at the same layer as your protocol, and forcing Wireshark/TShark
690to decode it as your protocol, e.g. if your protocol sits on top of UDP:
691
692    randpkt -c 50000 -t dns randpkt.pcap
693    tshark -nVr randpkt.pcap -d udp.port==53,<myproto>
694
695Testing using editcap can be done using preexisting capture files and the
696"-E" flag, which introduces errors in a capture file.  E.g.:
697
698    editcap -E 0.03 infile.pcap outfile.pcap
699    tshark -nVr outfile.pcap
700
701The script fuzz-test.sh is available to help automate these tests.
702
7034. Name convention.
704
705Wireshark uses the underscore_convention rather than the InterCapConvention for
706function names, so new code should probably use underscores rather than
707intercaps for functions and variable names. This is especially important if you
708are writing code that will be called from outside your code.  We are just
709trying to keep things consistent for other developers.
710
C symbols exported from libraries shipped with Wireshark should start with a
prefix that helps avoid name collisions with public symbols from other shared
libraries. The current suggested prefixes for newly added symbols are
ws_, wslua_, wmem_ and wtap_.
715
7165. White space convention.
717
Most of the C and C++ files in Wireshark use 4-space or 2-space indentation.
When creating new files you are strongly encouraged to use 4-space
indentation for source code in order to ensure consistency between files.
721
722Please avoid using tab expansions different from 8 column widths, as not all
723text editors in use by the developers support this. For a detailed discussion
724of tabs, spaces, and indentation, see
725
726    http://www.jwz.org/doc/tabs-vs-spaces.html
727
728We use EditorConfig (http://editorconfig.org) files to provide formatting
729hints. Most editors and IDEs support EditorConfig, either directly or via
730a plugin. If yours requires a plugin we encourage you to install it. Our
731default EditorConfig indentation style for C and C++ files is 4 spaces.
732
733Many files also have a short comment (modelines) on the indentation logic at
734the end of the file. This was required in the past but has been superseded by
735EditorConfig. See
736
737    https://www.wireshark.org/tools/modelines.html
738
739for more information.
740
741Please do not leave trailing whitespace (spaces/tabs) on lines.
742
743Quite a bit of our source code has varying indentation styles. When editing an
744existing file, try following the existing indentation logic. If you wish to
745convert a file to 4 space indentation, please do so in its own commit and be
746sure to remove its .editorconfig entry so that the default setting takes
747effect.
748
7496. Compiler warnings
750
You should write code that is free of compiler warnings. Such warnings
often indicate questionable code and sometimes even real bugs, so it's best
to avoid warnings altogether.
754
755The compiler flags in the Makefiles are set to "treat warnings as errors",
756so your code won't even compile when warnings occur.
757
7587. General observations about architecture
759
One day we might conceivably wish to load dissectors on demand, do other
more sophisticated kinds of unit test, or support other scenarios that are
not immediately obvious. For this to be possible it is important that the
code in epan/ does not depend on code in epan/dissectors, i.e., that it is
possible to compile epan without linking with dissector code. It helps to
view dissectors as clients of an API provided by epan (libwireshark being
constituted by two distinct components, "epan" and "dissectors", bundled
together, plus other bits and pieces). The reverse is *not* true; epan
should not be the client of an API provided by dissectors.
769
770The main way this separation of concerns is achieved is by using runtime
771registration interfaces in epan for dissectors, preferences, etc. that are
772dynamic and do not have any dissector routines hard coded. Naturally this
773is also an essential component of a plugin system (libwireshark has plugins
774for taps, dissectors and an experimental interface to augment dissection with
775new extension languages).
776
7778. Miscellaneous notes
778
779Each commit in your branch corresponds to a different VCSVERSION string
780automatically defined in the header 'vcs_version.h' during the build. If you happen
781to find it convenient to disable this feature it can be done using:
782
783    touch .git/wireshark-disable-versioning
784
785i.e., the file 'wireshark-disable-versioning' must exist in the git repo dir.
786
787/*
788 * Editor modelines  -  https://www.wireshark.org/tools/modelines.html
789 *
790 * Local variables:
791 * c-basic-offset: 4
792 * tab-width: 8
793 * indent-tabs-mode: nil
794 * End:
795 *
796 * vi: set shiftwidth=4 tabstop=8 expandtab:
797 * :indentSize=4:tabSize=8:noTabs=true:
798 */
799

README.display_filter

1(This is a consolidation of documentation written by stig, sahlberg, and gram)
2
3What is the display filter system?
4==================================
5The display filter system allows the user to select packets by testing
6for values in the proto_tree that Wireshark constructs for that packet.
7Every proto_item in the proto_tree has an 'abbrev' field
8and a 'type' field, which tells the display filter engine the name
9of the field and its type (what values it can hold).
10
11For example, this is the definition of the ip.proto field from packet-ip.c:
12
13{ &hf_ip_proto,
14      { "Protocol", "ip.proto", FT_UINT8, BASE_DEC | BASE_EXT_STRING,
15              &ipproto_val_ext, 0x0, NULL, HFILL }},
16
17This definition says that "ip.proto" is the display-filter name for
18this field, and that its field-type is FT_UINT8.
19
20The display filter system has 3 major parts to it:
21
22    1. A type system (field types, or "ftypes")
23    2. A parser, to convert a user's query to an internal representation
24    3. An engine that uses the internal representation to select packets.
25
26
27code:
28epan/dfilter/* - the display filter engine, including
29		scanner, parser, syntax-tree semantics checker, DFVM bytecode
30		generator, and DFVM engine.
31epan/ftypes/* - the definitions of the various FT_* field types.
32epan/proto.c   - proto_tree-related routines
33
34
35The field type system
36=====================
37The field type system is stored in epan/ftypes.
38
39The proto_tree system #includes ftypes.h, which gives it the ftenum
40definition, which is the enum of all possible ftypes:
41
/* field types */
enum ftenum {
    FT_NONE,        /* used for text labels with no value */
    FT_PROTOCOL,
    FT_BOOLEAN,     /* TRUE and FALSE come from <glib.h> */
    FT_UINT8,
    FT_UINT16,
    FT_UINT24,      /* really a UINT32, but displayed as 3 hex-digits if FD_HEX */
    FT_UINT32,
    FT_UINT64,
    /* etc., etc. */
};
54
55It also provides the definition of fvalue_t, the struct that holds the *value*
56that corresponds to the type. Each proto_item (proto_node) holds an fvalue_t
57due to having a field_info struct (defined in proto.h).
58
59The fvalue_t is mostly just a gigantic union of possible C-language types
60(as opposed to FT_* types):
61
62typedef struct _fvalue_t {
63        ftype_t *ftype;
64        union {
65                /* Put a few basic types in here */
66                guint32         uinteger;
67                gint32          sinteger;
68                guint64         integer64;
69                gdouble         floating;
70                gchar           *string;
71                guchar          *ustring;
72                GByteArray      *bytes;
73                ipv4_addr       ipv4;
74                ipv6_addr       ipv6;
75                e_guid_t        guid;
76                nstime_t        time;
77                tvbuff_t        *tvb;
78                GRegex          *re;
79        } value;
80
81        /* The following is provided for private use
82         * by the fvalue. */
83        gboolean        fvalue_gboolean1;
84
85} fvalue_t;
86
87
88Defining a field type
89---------------------
90The ftype system itself is designed to be modular, so that new field types
91can be added when necessary.
92
93Each field type must implement an ftype_t structure, also defined in
94ftypes.h. This is the way a field type is registered with the ftype engine.
95
96If you take a look at ftype-integer.c, you will see that it provides
97an ftype_register_integers() function, that fills in many such ftype_t
98structs. It creates one for each integer type: FT_UINT8, FT_UINT16,
99FT_UINT32, etc.
100
101The ftype_t struct defines the things needed for the ftype:
102
103    * its ftenum value
104    * a string representation of the FT name ("FT_UINT8")
105    * how much data it consumes in the packet
106    * how to store that value in an fvalue_t: new(), free(),
107        various value-related functions
108    * how to compare that value against another
109    * how to slice that value (strings and byte ranges can be sliced)
110
111Using an fvalue_t
112-----------------
113Once the value of a field is stored in an fvalue_t (stored in
114each proto_item via field_info), it's easy to use those values,
115thanks to the various fvalue_*() functions defined in ftypes.h.
116
117Functions like fvalue_get(), fvalue_eq(), etc., are all generic
118interfaces to get information about the field's value. They work
119on any field type because of the ftype_t struct, which is the lookup
120table that the field-type engine uses to work with any field type.
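
The idea of a per-type lookup table behind generic accessors can be sketched
in a few lines of standalone C. Everything named toy_* below is invented for
this illustration; the real ftype_t and fvalue_t carry many more members and
the real function signatures differ.

```c
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

struct toy_fvalue;

/* Per-type operations, standing in for the role ftype_t plays. */
typedef struct {
    const char *name;
    bool (*eq)(const struct toy_fvalue *a, const struct toy_fvalue *b);
} toy_ftype;

/* A value plus a pointer to the table that knows how to handle it. */
typedef struct toy_fvalue {
    const toy_ftype *ftype;
    union {
        uint32_t    uinteger;
        const char *string;
    } value;
} toy_fvalue;

static bool
toy_uint_eq(const toy_fvalue *a, const toy_fvalue *b)
{
    return a->value.uinteger == b->value.uinteger;
}

static bool
toy_string_eq(const toy_fvalue *a, const toy_fvalue *b)
{
    return strcmp(a->value.string, b->value.string) == 0;
}

static const toy_ftype toy_ft_uint32 = { "TOY_UINT32", toy_uint_eq };
static const toy_ftype toy_ft_string = { "TOY_STRING", toy_string_eq };

/* Generic entry point: it works for any type because it dispatches
 * through the value's own table instead of switching on the type. */
bool
toy_fvalue_eq(const toy_fvalue *a, const toy_fvalue *b)
{
    return a->ftype == b->ftype && a->ftype->eq(a, b);
}
```

Adding a new type in this scheme means adding one more table, without
touching toy_fvalue_eq() or any of its callers.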
121
122The display filter parser
123=========================
124The display filter parser (along with the comparison engine)
125is stored in epan/dfilter.
126
127The scanner/parser pair read the string representing the display filter
128and convert it into a very simple syntax tree.  The syntax tree is very
129simple in that it is possible that many of the nodes contain unparsed
130chunks of text from the display filter.
131
132There are four phases to parsing a user's request:
133
134 1. Scanning the string for dfilter syntax
135 2. Parsing the keywords according to the dfilter grammar, into a
136        syntax tree
137 3. Doing a semantic check of the nodes in that syntax tree
138 4. Converting the syntax tree into a series of DFVM byte codes
139
140The dfilter_compile() function, in epan/dfilter/dfilter.c,
141runs these 4 phases. The end result is a dfwork_t object (dfw), that
142can be passed to dfilter_apply() to actually run the display filter
143against a set of proto_trees.
144
145
146Scanning the display filter string
147----------------------------------
148epan/dfilter/scanner.l is the lex scanner for finding keywords
149in the user's display filter string.
150
151Its operation is simple. It finds the special function and comparison
152operators ("==", "!=", "eq", "ne", etc.), it finds slice operations
153( "[0:1]" ), quoted strings, IP addresses, numbers, and any other "special"
154keywords or string types.
155
Anything it doesn't know how to handle is passed to the grammar parser
as an unparsed string (TOKEN_UNPARSED). This includes field names. The
scanner does not interpret any protocol field names at all.
159
The scanner has to return a token type (TOKEN_*) and, in many cases,
a value. The value will be an stnode_t struct, which is a syntax
tree node object. Since the final storage of the parse will
be in a syntax tree, it is convenient for the scanner to fill in
syntax tree nodes with values when it can.
165
166The stnode_t definition is in epan/dfilter/syntax-tree.h
167
168
169Parsing the keywords according to the dfilter grammar
170-----------------------------------------------------
The grammar parser is implemented with the 'lemon' tool,
rather than the traditional yacc or bison grammar parser,
as lemon grammars were found to be easier to work with. The
lemon parser specification (epan/dfilter/grammar.lemon) is
much easier to read than its bison counterpart would be,
thanks to lemon's feature of being able to name fields, rather
than using numbers ($1, $2, etc.).
178
179The lemon tool is located in tools/lemon in the Wireshark
180distribution.
181
182An on-line introduction to lemon is available at:
183
184http://www.sqlite.org/src/doc/trunk/doc/lemon.html
185
186The grammar specifies which type of constructs are possible
187within the dfilter language ("dfilter-lang")
188
189An "expression" in dfilter-lang can be a relational test or a logical test.
190
191A relational test compares a value against another, which is usually
192a field (or a slice of a field) against some static value, like:
193
194    ip.proto == 1
195    eth.dst != ff:ff:ff:ff:ff:ff
196
197A logical test combines other expressions with "and", "or", and "not".
198
199At the end of the grammatical parsing, the dfw object will
200have a valid syntax tree, pointed at by dfw->st_root.
201
202If there is an error in the syntax, the parser will call dfilter_fail()
203with an appropriate error message, which the UI will need to report
204to the user.
205
206The syntax tree system
207----------------------
208The syntax tree is created as a result of running the lemon-based
209grammar parser on the scanned tokens. The syntax tree code
210is in epan/dfilter/syntax-tree* and epan/dfilter/sttree-*. It too
211uses a set of code modules that implement different syntax node types,
212similar to how the field-type system registers a set of ftypes
213with a central engine.
214
Each node (stnode_t) in the syntax tree has a type (sttype).
These sttypes are very much related to ftypes (field types), but there
is not a one-to-one correspondence. The syntax tree nodes are somewhat
higher-level. For example, there is only a single INTEGER sttype, unlike
the ftype system that has a type for UINT64, UINT32, UINT16, UINT8, etc.
220
221typedef enum {
222        STTYPE_UNINITIALIZED,
223        STTYPE_TEST,
224        STTYPE_UNPARSED,
225        STTYPE_STRING,
226        STTYPE_FIELD,
227        STTYPE_FVALUE,
228        STTYPE_INTEGER,
229        STTYPE_RANGE,
230        STTYPE_FUNCTION,
231        STTYPE_SET,
232        STTYPE_NUM_TYPES
233} sttype_id_t;
234
235
236The root node of the syntax tree is the main test or comparison
237being done.
238
239Semantic Check
240--------------
241After the parsing is done and a syntax tree is available, the
242code in semcheck.c does a semantic check of what is in the syntax
243tree.
244
245The semantics of the simple syntax tree are checked to make sure that
246the fields that are being compared are being compared to appropriate
247values.  For example, if a field is an integer, it can't be compared to
248a string, unless a value_string has been defined for that field.
249
250During the process of checking the semantics, the simple syntax tree is
251fleshed out and no longer contains nodes with unparsed information.  The
252syntax tree is no longer in its simple form, but in its complete form.
253
254For example, if the dfilter is slicing a field and comparing
255against a set of bytes, semcheck.c has to check that the field
256in question can indeed be sliced.
257
Or, can a field be compared against a certain type of value (string,
integer, float, IPv4 address, etc.)?
260
261The semcheck code also makes adjustments to the syntax tree
262when it needs to. The parser sometimes stores raw, unparsed strings
263in the syntax tree, and semcheck has to convert them to
264certain types. For example, the display filter may contain
265a value_string string (the "enum" type that protocols can use
266to define the possible textual descriptions of numeric fields), and
267semcheck will convert that value_string string into the correct
268integer value.
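
The value_string conversion can be pictured as a reverse lookup in a table of
number/description pairs. The sketch below uses a local stand-in struct and a
hypothetical function name so that it compiles on its own; it is not the code
in semcheck.c, and the table contents are just sample values.

```c
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

/* Stand-in for a value_string entry: a number and its description. */
typedef struct {
    uint32_t    value;
    const char *strptr;
} toy_value_string;

/* Sample table, terminated by an entry with a NULL description. */
static const toy_value_string toy_ipproto_vals[] = {
    { 1,  "ICMP" },
    { 6,  "TCP"  },
    { 17, "UDP"  },
    { 0,  NULL   }
};

/* Looks up a description and stores the matching number in *value.
 * Returns false if the string is not in the table, in which case a
 * semantic check would have to reject the filter. */
bool
toy_string_to_value(const toy_value_string *vals, const char *name,
                    uint32_t *value)
{
    for (; vals->strptr != NULL; vals++) {
        if (strcmp(vals->strptr, name) == 0) {
            *value = vals->value;
            return true;
        }
    }
    return false;
}
```

After such a conversion, a comparison against "TCP" is the same integer
comparison as one against 6, so later stages only ever see integers.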
269
270Truth be told, the semcheck.c code is a bit disorganized, and could
271be re-designed & re-written.
272
273DFVM Byte Codes
274---------------
275The syntax tree is analyzed to create a sequence of bytecodes in the
276"DFVM" language.  "DFVM" stands for Display Filter Virtual Machine.  The
277DFVM is similar in spirit, but not in definition, to the BPF VM that
278libpcap uses to analyze packets.
279
280A virtual bytecode is created and used so that the actual process of
281filtering packets will be fast.  That is, it should be faster to process
282a list of VM bytecodes than to attempt to filter packets directly from
283the syntax tree.  (heh...  no measurement has been made to support this
284supposition)
285
286The DFVM opcodes are defined in epan/dfilter/dfvm.h (dfvm_opcode_t).
287Similar to how the BPF opcode system works in libpcap, there is a
288limited set of opcodes. They operate by loading values from the
289proto_tree into registers, loading pre-defined values into
290registers, and comparing them. The opcodes are checked in sequence, and
291there are only 2 branching opcodes: IF_TRUE_GOTO and IF_FALSE_GOTO.
292Both of these can only branch forwards, and never backwards. In this way
293sets of DFVM instructions will never get into an infinite loop.
294
The epan/dfilter/gencode.c code converts the syntax tree
into a set of DFVM instructions.
297
The constants that are in the DFVM instructions (the constant
values that the user is checking against) are pre-loaded
into registers via the dfvm_init_const() call, and stored
in the dfilter_t structure for when the display filter is
actually applied.
303
304
305DFVM Engine
306===========
307Once the DFVM bytecode has been produced, it's a simple matter of
308running the DFVM engine against the proto_tree from the packet
309dissection, using the DFVM bytecodes as instructions.  If the DFVM
310bytecode is known before packet dissection occurs, the
311proto_tree-related code can be "primed" to store away pointers to
312field_info structures that are interesting to the display filter.  This
313makes lookup of those field_info structures during the filtering process
314faster.
315
The dfilter_apply() function runs a single pre-compiled
display filter against a single proto_tree, and returns
TRUE or FALSE, meaning that the filter matched or not.
319
320That function calls dfvm_apply(), which runs across the DFVM
321instructions, loading protocol field values into DFVM registers
322and doing the comparisons.
323
324There is a top-level Makefile target called 'dftest' which
325builds a 'dftest' executable that will print out the DFVM
326bytecode for any display filter given on the command-line.
327To build it, run:
328
329$ make dftest
330
331To use it, give it the display filter on the command-line:
332
333$ ./dftest 'ip.addr == 127.0.0.1'
334Filter: "ip.addr == 127.0.0.1"
335
336Constants:
33700000 PUT_FVALUE        127.0.0.1 <FT_IPv4> -> reg#1
338
339Instructions:
34000000 READ_TREE         ip.addr -> reg#0
34100001 IF-FALSE-GOTO     3
34200002 ANY_EQ            reg#0 == reg#1
34300003 RETURN
344
345
346The output shows the original display filter, then the opcodes
347that put constant values into registers. The registers are
348numbered, and are shown in the output as "reg#n", where 'n' is the
349identifying number.
350
351Then the instructions are shown. These are the instructions
352which are run for each proto_tree.
353
354This is what happens in this example:
355
35600000 READ_TREE         ip.addr -> reg#0
357
358Any ip.addr fields in the proto_tree are loaded into register 0. Yes,
359multiple values can be loaded into a single register. As a result
360of this READ_TREE, the accumulator will hold TRUE or FALSE, indicating
361if any field's value was loaded, or not.
362
36300001 IF-FALSE-GOTO     3
364
365If the load failed because there were no ip.addr fields
366in the proto_tree, then we jump to instruction 3.
367
36800002 ANY_EQ            reg#0 == reg#1
369
This checks to see if any of the fields in register 0
(which are all of the ip.addr fields in the proto tree) are equal
to any of the fields in register 1 (which has the pre-loaded
constant value of 127.0.0.1). The resulting value in the
accumulator will be TRUE if any of the fields match, or FALSE
if none match.
376
37700003 RETURN
378
379This returns the accumulator's value, either TRUE or FALSE.
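
The four-instruction program walked through above can be mimicked by a toy
interpreter. This is a sketch under heavy simplification, not the real
dfvm_apply(): the toy_* names are invented, field values are plain 32-bit
integers, and the "proto_tree" is just an array of the ip.addr values found
in the packet.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

typedef enum {
    TOY_READ_TREE,
    TOY_IF_FALSE_GOTO,
    TOY_ANY_EQ,
    TOY_RETURN
} toy_opcode;

typedef struct {
    toy_opcode op;
    size_t     target;   /* used only by TOY_IF_FALSE_GOTO */
} toy_insn;

/* A register holds a list of values, not a single value. */
typedef struct {
    const uint32_t *vals;
    size_t          n;
} toy_reg;

/* Program for "ip.addr == <constant>", mirroring the dftest listing. */
static const toy_insn toy_program[] = {
    { TOY_READ_TREE,     0 },
    { TOY_IF_FALSE_GOTO, 3 },
    { TOY_ANY_EQ,        0 },
    { TOY_RETURN,        0 },
};

bool
toy_dfvm_apply(const uint32_t *fields, size_t n_fields, uint32_t constant)
{
    /* reg#1 is pre-loaded with the constant; reg#0 starts out empty. */
    toy_reg reg[2] = { { NULL, 0 }, { &constant, 1 } };
    const size_t n_insns = sizeof toy_program / sizeof toy_program[0];
    bool acc = false;
    size_t pc = 0;

    while (pc < n_insns) {
        const toy_insn *insn = &toy_program[pc];

        switch (insn->op) {
        case TOY_READ_TREE:
            /* Load every instance of the field; acc says if any existed. */
            reg[0].vals = fields;
            reg[0].n = n_fields;
            acc = (n_fields > 0);
            pc++;
            break;
        case TOY_IF_FALSE_GOTO:
            /* Only forward jumps are honoured, so the loop must end. */
            pc = (!acc && insn->target > pc) ? insn->target : pc + 1;
            break;
        case TOY_ANY_EQ:
            acc = false;
            for (size_t i = 0; i < reg[0].n && !acc; i++)
                for (size_t j = 0; j < reg[1].n && !acc; j++)
                    acc = (reg[0].vals[i] == reg[1].vals[j]);
            pc++;
            break;
        case TOY_RETURN:
            return acc;
        }
    }
    return acc;
}
```

Because the program counter only ever moves forward, the number of executed
instructions is bounded by the program length, which is the termination
argument made for the real DFVM as well.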
380
381In addition to dftest, there is also a tools/dfilter-test script
382which is a unit-test script for the display filter engine.
383It makes use of text2pcap and tshark to run specific display
384filters against specific captures (embedded within dfilter-test).
385
386
387
388Display Filter Functions
389========================
390You define a display filter function by adding an entry to
391the df_functions table in epan/dfilter/dfunctions.c. The record struct
392is defined in dfunctions.h, and shown here:
393
394typedef struct {
395    char            *name;
396    DFFuncType      function;
397    ftenum_t        retval_ftype;
398    guint           min_nargs;
399    guint           max_nargs;
400    DFSemCheckType  semcheck_param_function;
401} df_func_def_t;
402
403name - the name of the function; this is how the user will call your
404    function in the display filter language
405
406function - this is the run-time processing of your function.
407
408retval_ftype - what type of FT_* type does your function return?
409
410min_nargs - minimum number of arguments your function accepts
411max_nargs - maximum number of arguments your function accepts
412
413semcheck_param_function - called during the semantic check of the
414    display filter string.
415
416DFFuncType function
417-------------------
418typedef gboolean (*DFFuncType)(GList *arg1list, GList *arg2list, GList **retval);
419
420The return value of your function is a gboolean; TRUE if processing went fine,
421or FALSE if there was some sort of exception.
422
For now, display filter functions can accept a maximum of 2 arguments.
The "arg1list" parameter is the GList for the first argument. The
"arg2list" parameter is the GList for the second argument. All arguments
to display filter functions are lists. This is because in the display
filter language a protocol field may have multiple instances. For example,
a field like "ip.addr" will exist more than once in a single frame. So
when the user invokes this display filter:
430
431    somefunc(ip.addr) == TRUE
432
433even though "ip.addr" is a single argument, the "somefunc" function will
434receive a GList of *all* the values of "ip.addr" in the frame.
435
436Similarly, the return value of the function needs to be a GList, since all
437values in the display filter language are lists. The GList** retval argument
438is passed to your function so you can set the pointer to your return value.

DFSemCheckType
--------------
typedef void (*DFSemCheckType)(int param_num, stnode_t *st_node);

For each parameter in the syntax tree, this function will be called.
"param_num" will indicate the number of the parameter, starting with 0.
The "stnode_t" is the syntax-tree node representing that parameter.
If everything is okay with the value of that stnode_t, your function
does nothing --- it merely returns. If something is wrong, however,
it should THROW a TypeError exception.


Example: add an 'in' display filter operation
=============================================

This example was discussed on wireshark-dev in April 2004. It illustrates
how a more complex operation can be added to the display filter language.

Question:

	If I want to add an 'in' display filter operation, I need to define
	several things. This can happen in different ways. For instance,
	every value from the "in" value collection will result in a test.
	There are 2 options here, either a test for a single value:

		(x in {a b c})

	or a test for a value in a given range:

		(x in {a ... z})

	or even a combination of both. The former example can be reduced to:

		((x == a) or (x == b) or (x == c))

	while the latter can be reduced to

		((x >= MIN(a, z)) and (x <= MAX(a, z)))

	I understand that I can replace "x in {" with the following steps:
	first store x in the "in" test buffer, then add "(" to the display
	filter expression internally.

	Similarly I can replace the closing brace "}" with the following
	steps: release x from the "in" test buffer and then add ")"
	to the display filter expression internally.

	How could I do this?

Answer:

	This could be done in grammar.lemon. The grammar would produce
	syntax tree nodes, combining them with "or", when it is given
	tokens that represent the "in" syntax.

	It could also be done later in the process, maybe in
	semcheck.c. But if you can do it earlier, in grammar.lemon,
	then you shouldn't have to worry about modifying anything in
	semcheck.c, as the syntax tree that is passed to semcheck.c
	won't contain any new type of operators... just lots of nodes
	combined with "or".
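
The two reductions from the question can be written out as plain C to make
the intended semantics concrete. This is a standalone sketch, not dfilter
code: "in_set" and "in_range" are hypothetical names, and plain "long"
values stand in for field values.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* (x in {a b c})  reduces to  ((x == a) or (x == b) or (x == c)) */
static bool in_set(long x, const long *members, size_t count)
{
    size_t i;

    assert(members != NULL || count == 0);
    for (i = 0; i < count; i++) {
        if (x == members[i])
            return true;
    }
    return false;
}

/* (x in {a ... z})  reduces to  ((x >= MIN(a, z)) and (x <= MAX(a, z))) */
static bool in_range(long x, long a, long z)
{
    long lo = (a < z) ? a : z;
    long hi = (a < z) ? z : a;

    return x >= lo && x <= hi;
}
```

Because the range form takes MIN and MAX of its bounds, the bounds may be
given in either order.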

How to add an operator FOO to the display filter language?
==========================================================

Go to wireshark/epan/dfilter/

Edit grammar.lemon and add the operator. Add the operator FOO and the
test logic (defining TEST_OP_FOO).

Edit scanner.l and add the operator name(s), hence defining
TOKEN_TEST_FOO. Also update simple() or add the new operand's code.

Edit sttype-test.h and add TEST_OP_FOO to the list of test operations.

Edit sttype-test.c and add TEST_OP_FOO to the num_operands() method.

Edit gencode.c and add TEST_OP_FOO to the gen_test() method by defining
ANY_FOO.

Edit dfvm.h and add ANY_FOO to the dfvm_opcode_t enum.

Edit dfvm.c and add ANY_FOO to dfvm_dump() (for the dftest display filter
test binary) and to dfvm_apply(), hence defining the fvalue_foo() methods.

Edit semcheck.c and check whether the check_relation_XXX() methods
still apply to the foo operator; if not, amend the code. Start from the
check_test() method to discover the logic.

Go to wireshark/epan/ftypes/

Edit ftypes.h and declare the fvalue_foo() and ftype_can_foo() methods.
Add the cmp_foo() method to the struct _ftype_t.

This is the first time that a make in wireshark/epan/dfilter/ can
succeed. If it fails, then some code in the previously edited files must
be corrected.

Edit ftypes.c and define the fvalue_foo() method with its associated
logic. Also define the ftype_can_foo() method.

Edit all ftype-*.c files and add the required fvalue_foo() methods.

This is the point where you should be able to compile without errors in
wireshark/epan/ftypes/. If not, first fix the errors.

Go to wireshark/epan/ and run make. If this one succeeds, then we're
almost done as no errors should occur here.

Go to wireshark/ and run make. One thing to do is make dftest and see
if you can construct valid display filters with your new operator. Or
you may want to move directly to the generation of Wireshark.

Also look at ui/qt/display_filter_expression_dialog.cpp and the display
filter expression generator.

How to add a new test to the test suite
=======================================

All display filter tests are located in test/suite_dfilter.
You can add a test to an existing file or create a new file.

Each new test class must define "trace_file", which names
a capture file in "test/captures". All the tests
run in that class will use that one capture file.

There are 2 fixtures you can use for testing:

checkDFilterCount(dfilter, expected_count)

    This will run the display filter through tshark, on the
    file named by "trace_file", and assert that the
    number of resulting packets equals "expected_count". This
    also asserts that tshark does not fail; success with zero
    matches is not the same as failure to compile the display
    filter string.

checkDFilterFail(dfilter, error)

    This will run dftest with the display filter, and check
    that it fails with a given error message. This is useful
    when expecting display filter syntax errors to be caught.

To execute tests:

# Run all dfilter tests
$ test/test.py suite_dfilter

# Run all tests from group_tvb.py:
$ test/test.py suite_dfilter.group_tvb

# For faster, parallel tests, install "pytest-xdist" first
# (for example, using "pip install pytest-xdist"), then:
$ pytest -nauto test -k suite_dfilter

# Run all tests from group_tvb.py, in parallel:
$ pytest -nauto test -k case_tvb

# Run a single test from group_tvb.py, case_tvb.test_slice_4:
$ pytest test -k "case_tvb and test_slice_4"

See also https://www.wireshark.org/docs/wsdg_html_chunked/ChapterTests.html

README.dissector

This file is a HOWTO for Wireshark developers interested in writing or working
on Wireshark protocol dissectors. It describes expected code patterns and the
use of some of the important functions and variables.

This file is compiled to give in-depth information on Wireshark.
It is by no means all-inclusive or complete. Please feel free to send
remarks and patches to the developer mailing list.

If you haven't read README.developer, read that first!

0. Prerequisites.

Before starting to develop a new dissector, a "running" Wireshark build
environment is required - there's no such thing as a standalone "dissector
build toolkit".

How to set up such an environment is platform dependent; detailed
information about these steps can be found in the "Developer's Guide"
(available from: https://www.wireshark.org) and in the INSTALL and
README.md files of the sources root dir.

0.1. Dissector related README files.

You'll find additional dissector related information in the following README
files:

- README.heuristic      - what are heuristic dissectors and how to write them
- README.plugins        - how to "pluginize" a dissector
- README.request_response_tracking - how to track req./resp. times and such
- README.wmem           - how to obtain "memory leak free" memory

0.2 Contributors

James Coe <jammer[AT]cin.net>
Gilbert Ramirez <gram[AT]alumni.rice.edu>
Jeff Foster <jfoste[AT]woodward.com>
Olivier Abad <oabad[AT]cybercable.fr>
Laurent Deniel <laurent.deniel[AT]free.fr>
Gerald Combs <gerald[AT]wireshark.org>
Guy Harris <guy[AT]alum.mit.edu>
Ulf Lamping <ulf.lamping[AT]web.de>
Barbu Paul - Gheorghe <barbu.paul.gheorghe[AT]gmail.com>

1. Setting up your protocol dissector code.

This section provides skeleton code for a protocol dissector. It also explains
the basic functions needed to enter values in the traffic summary columns,
add to the protocol tree, and work with registered header fields.

1.1 Skeleton code.

Wireshark requires certain things when setting up a protocol dissector.
We provide basic skeleton code for a dissector that you can copy to a new file
and fill in. Your dissector should follow the naming convention of "packet-"
followed by the abbreviated name for the protocol. It is recommended that where
possible you keep to the IANA abbreviated name for the protocol, if there is
one, or a commonly-used abbreviation for the protocol, if any.

The skeleton code lives in the file "packet-PROTOABBREV.c" in the same source
directory as this README.

If instead of using the skeleton you base your dissector on an existing real
dissector, please put a little note in the copyright header indicating which
dissector you started with.

Usually, you will put your newly created dissector file into the directory
epan/dissectors/, just like all the other packet-*.c files already in there.

Also, please add your dissector file to the corresponding makefiles,
described in section "1.8 Editing CMakeLists.txt to add your dissector" below.

Dissectors that use the dissector registration API to register with a lower
level protocol (this is the vast majority) don't need to define a prototype in
their .h file. For other dissectors the main dissector routine should have a
prototype in a header file whose name is "packet-", followed by the abbreviated
name for the protocol, followed by ".h"; any dissector file that calls your
dissector should be changed to include that file.

You may not need to include all the headers listed in the skeleton, and you may
need to include additional headers.

1.2 Explanation of needed substitutions in code skeleton.

In the skeleton sample code the following strings should be substituted with
your information.

YOUR_NAME       Your name, of course. You do want credit, don't you?
                It's the only payment you will receive....
YOUR_EMAIL_ADDRESS  Keep those cards and letters coming.
PROTONAME       The name of the protocol; this is displayed in the
                top-level protocol tree item for that protocol.
PROTOSHORTNAME  An abbreviated name for the protocol; this is displayed
                in the "Preferences" dialog box if your dissector has
                any preferences, in the dialog box of enabled protocols,
                and in the dialog box for filter fields when constructing
                a filter expression.
PROTOABBREV     A name for the protocol for use in filter expressions;
                it may contain only lower-case letters, digits, hyphens,
                underscores, and periods.
LICENSE         The license this dissector is under. Please use an SPDX License
                identifier.
YEARS           The years the above license is valid for.
FIELDNAME       The displayed name for the header field.
FIELDABBREV     The abbreviated name for the header field; it may contain
                only letters, digits, hyphens, underscores, and periods.
FIELDTYPE       FT_NONE, FT_BOOLEAN, FT_CHAR, FT_UINT8, FT_UINT16, FT_UINT24,
                FT_UINT32, FT_UINT40, FT_UINT48, FT_UINT56, FT_UINT64,
                FT_INT8, FT_INT16, FT_INT24, FT_INT32, FT_INT40, FT_INT48,
                FT_INT56, FT_INT64, FT_FLOAT, FT_DOUBLE, FT_ABSOLUTE_TIME,
                FT_RELATIVE_TIME, FT_STRING, FT_STRINGZ, FT_EUI64,
                FT_UINT_STRING, FT_ETHER, FT_BYTES, FT_UINT_BYTES, FT_IPv4,
                FT_IPv6, FT_IPXNET, FT_FRAMENUM, FT_PROTOCOL, FT_GUID, FT_OID,
                FT_REL_OID, FT_AX25, FT_VINES, FT_SYSTEM_ID, FT_FC, FT_FCWWN
FIELDDISPLAY    --For FT_UINT{8,16,24,32,40,48,56,64} and
                       FT_INT{8,16,24,32,40,48,56,64}:

                  BASE_DEC, BASE_HEX, BASE_OCT, BASE_DEC_HEX, BASE_HEX_DEC,
                  BASE_CUSTOM, or BASE_NONE, possibly ORed with
                  BASE_RANGE_STRING, BASE_EXT_STRING, BASE_VAL64_STRING,
                  BASE_ALLOW_ZERO, BASE_UNIT_STRING, BASE_SPECIAL_VALS,
                  BASE_NO_DISPLAY_VALUE, or BASE_SHOW_ASCII_PRINTABLE

                  BASE_NONE may be used with a non-NULL FIELDCONVERT when the
                  numeric value of the field itself is not of significance to
                  the user (for example, the number is a generated field).
                  When this is the case the numeric value is not shown to the
                  user in the protocol decode nor is it used when preparing
                  filters for the field in question.

                  BASE_NO_DISPLAY_VALUE will just display the field name with
                  no value. It is intended for byte arrays (FT_BYTES or
                  FT_UINT_BYTES) or header fields above a subtree. The
                  value will still be filterable, just not displayed.

                --For FT_UINT16:

                  BASE_PT_UDP, BASE_PT_TCP, BASE_PT_DCCP or BASE_PT_SCTP

                --For FT_UINT24:

                  BASE_OUI

                --For FT_CHAR:
                  BASE_HEX, BASE_OCT, BASE_CUSTOM, or BASE_NONE, possibly
                  ORed with BASE_RANGE_STRING, BASE_EXT_STRING or
                  BASE_VAL64_STRING.

                  BASE_NONE can be used in the same way as with FT_UINT8.

                --For FT_ABSOLUTE_TIME:

                  ABSOLUTE_TIME_LOCAL, ABSOLUTE_TIME_UTC, or
                  ABSOLUTE_TIME_DOY_UTC

                --For FT_BOOLEAN:

                  if BITMASK is non-zero:
                    Number of bits in the field containing the FT_BOOLEAN
                    bitfield.
                  otherwise:
                    (must be) BASE_NONE

                --For FT_STRING, FT_STRINGZ and FT_UINT_STRING:

                  STR_ASCII or STR_UNICODE

                --For FT_BYTES and FT_UINT_BYTES:

                  SEP_DOT, SEP_DASH, SEP_COLON, or SEP_SPACE to provide
                  a separator between bytes; BASE_NONE has no separator
                  between bytes.  These can be ORed with BASE_ALLOW_ZERO
                  and BASE_SHOW_ASCII_PRINTABLE.

                  BASE_ALLOW_ZERO displays <none> instead of <MISSING>
                  for a zero-sized byte array.
                  BASE_SHOW_ASCII_PRINTABLE will check whether the
                  field's value consists entirely of printable ASCII
                  characters and, if so, will display the field's value
                  as a string, in quotes.  The value will still be
                  filterable as a byte value.

                --For FT_IPv4:

                  BASE_NETMASK - Used for IPv4 addresses that should never
                                 be resolved (like netmasks)
                  otherwise:
                    (must be) BASE_NONE

                --For all other types:

                  BASE_NONE
FIELDCONVERT    VALS(x), VALS64(x), RVALS(x), TFS(x), CF_FUNC(x), NULL
BITMASK         Used to mask a field not 8-bit aligned or with a size other
                than a multiple of 8 bits
FIELDDESCR      A brief description of the field, or NULL. [Please do not use ""].
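
As an illustration of what BITMASK means for a field that is not byte
aligned, the following standalone sketch masks a fetched integer and shifts
the result down so that the lowest bit of the mask becomes bit 0.
"apply_bitmask" is a hypothetical helper written for this README, not a
Wireshark function; it assumes this mask-and-shift convention and a
non-zero mask.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Keep only the bits selected by "bitmask", then shift right until the
 * lowest selected bit sits at bit position 0.
 */
static uint32_t apply_bitmask(uint32_t raw, uint32_t bitmask)
{
    uint32_t value;

    assert(bitmask != 0);
    value = raw & bitmask;
    while ((bitmask & 1u) == 0) {
        bitmask >>= 1;
        value >>= 1;
    }
    return value;
}
```

With a raw octet of 0xA5, a mask of 0xF0 selects the upper nibble (0xA) and
a mask of 0x0F selects the lower nibble (0x5).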

If, for example, PROTONAME is "Internet Bogosity Discovery Protocol",
PROTOSHORTNAME would be "IBDP", and PROTOABBREV would be "ibdp". Try to
conform with IANA names.
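
The character rule for PROTOABBREV given above (lower-case letters, digits,
hyphens, underscores, and periods) can be checked mechanically. The helper
below is a hypothetical sketch for use while choosing a name; it is not a
Wireshark function, and rejecting the empty string is an assumption made
here.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Return true if "abbrev" uses only the characters allowed for PROTOABBREV. */
static bool is_valid_protoabbrev(const char *abbrev)
{
    const char *p;

    assert(abbrev != NULL);
    if (abbrev[0] == '\0')
        return false;   /* assumption: an empty name is not acceptable */
    for (p = abbrev; *p != '\0'; p++) {
        char c = *p;
        bool ok = (c >= 'a' && c <= 'z') || (c >= '0' && c <= '9') ||
                  c == '-' || c == '_' || c == '.';
        if (!ok)
            return false;
    }
    return true;
}
```

For the example names above, "ibdp" passes, while the PROTOSHORTNAME "IBDP"
does not, because upper-case letters are not allowed in PROTOABBREV.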

1.2.1 Automatic substitution in code skeleton

Instead of manual substitutions in the code skeleton, a tool to automate it can
be found under the tools directory. The script is called tools/generate-dissector.py
and takes all the needed options to generate a compilable dissector. Look at the
above fields to know how to set them. Some assumptions have been made in the
generation to shorten the list of required options. The script patches the
CMakeLists.txt file adding the new dissector in the proper list, alphabetically
sorted.

1.3 The dissector and the data it receives.


1.3.1 Header file.

This is only needed if the dissector doesn't use self-registration to
register itself with the lower level dissector, or if the protocol dissector
wants/needs to expose code to other subdissectors.

The dissector must be declared exactly as follows in the file
packet-PROTOABBREV.h:

int
dissect_PROTOABBREV(tvbuff_t *tvb, packet_info *pinfo, proto_tree *tree);


1.3.2 Extracting data from packets.

NOTE: See the file /epan/tvbuff.h for more details.

The "tvb" argument to a dissector points to a buffer containing the raw
data to be analyzed by the dissector; for example, for a protocol
running atop UDP, it contains the UDP payload (but not the UDP header,
or any protocol headers above it). A tvbuffer is an opaque data
structure; the internal data structures are hidden and the data must be
accessed via the tvbuffer accessors.

The accessors are:

Bit accessors for a maximum of 8 bits, 16 bits, 32 bits and 64 bits:

guint8 tvb_get_bits8(tvbuff_t *tvb, gint bit_offset, const gint no_of_bits);
guint16 tvb_get_bits16(tvbuff_t *tvb, guint bit_offset, const gint no_of_bits, const guint encoding);
guint32 tvb_get_bits32(tvbuff_t *tvb, guint bit_offset, const gint no_of_bits, const guint encoding);
guint64 tvb_get_bits64(tvbuff_t *tvb, guint bit_offset, const gint no_of_bits, const guint encoding);
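
To show what addressing data by bit offset involves, here is a standalone
sketch that pulls up to 8 bits out of a plain byte array. It assumes that
bit offset 0 is the most significant bit of the first byte, and that a
field may straddle a byte boundary. "get_bits8" and its "buf_len" parameter
are hypothetical; this is not the tvbuff implementation.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Read "no_of_bits" (1..8) bits starting at "bit_offset", counting bits
 * from the most significant bit of buf[0]. The result is right-aligned.
 */
static uint8_t get_bits8(const uint8_t *buf, size_t buf_len,
                         unsigned int bit_offset, unsigned int no_of_bits)
{
    uint8_t value = 0;
    unsigned int i;

    assert(buf != NULL);
    assert(no_of_bits >= 1 && no_of_bits <= 8);
    assert((bit_offset + no_of_bits + 7) / 8 <= buf_len);

    for (i = 0; i < no_of_bits; i++) {
        unsigned int bit = bit_offset + i;
        unsigned int shift = 7 - (bit % 8);

        value = (uint8_t)((value << 1) | ((buf[bit / 8] >> shift) & 1u));
    }
    return value;
}
```

For the two bytes 0xA5 0x3C, reading 4 bits at bit offset 0 gives 0xA, and
reading 8 bits at bit offset 4 gives 0x53, which spans both bytes.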

Single-byte accessors for 8-bit unsigned integers (guint8) and 8-bit
signed integers (gint8):

guint8  tvb_get_guint8(tvbuff_t *tvb, const gint offset);
gint8  tvb_get_gint8(tvbuff_t *tvb, const gint offset);

Network-to-host-order accessors:

16-bit unsigned (guint16) and signed (gint16) integers:

guint16 tvb_get_ntohs(tvbuff_t *tvb, const gint offset);
gint16 tvb_get_ntohis(tvbuff_t *tvb, const gint offset);

24-bit unsigned and signed integers:

guint32 tvb_get_ntoh24(tvbuff_t *tvb, const gint offset);
gint32 tvb_get_ntohi24(tvbuff_t *tvb, const gint offset);

32-bit unsigned (guint32) and signed (gint32) integers:

guint32 tvb_get_ntohl(tvbuff_t *tvb, const gint offset);
gint32 tvb_get_ntohil(tvbuff_t *tvb, const gint offset);

40-bit unsigned and signed integers:

guint64 tvb_get_ntoh40(tvbuff_t *tvb, const gint offset);
gint64 tvb_get_ntohi40(tvbuff_t *tvb, const gint offset);

48-bit unsigned and signed integers:

guint64 tvb_get_ntoh48(tvbuff_t *tvb, const gint offset);
gint64 tvb_get_ntohi48(tvbuff_t *tvb, const gint offset);

56-bit unsigned and signed integers:

guint64 tvb_get_ntoh56(tvbuff_t *tvb, const gint offset);
gint64 tvb_get_ntohi56(tvbuff_t *tvb, const gint offset);

64-bit unsigned (guint64) and signed (gint64) integers:

guint64 tvb_get_ntoh64(tvbuff_t *tvb, const gint offset);
gint64 tvb_get_ntohi64(tvbuff_t *tvb, const gint offset);

Single-precision and double-precision IEEE floating-point numbers:

gfloat tvb_get_ntohieee_float(tvbuff_t *tvb, const gint offset);
gdouble tvb_get_ntohieee_double(tvbuff_t *tvb, const gint offset);

Little-Endian-to-host-order accessors:

16-bit unsigned (guint16) and signed (gint16) integers:

guint16 tvb_get_letohs(tvbuff_t *tvb, const gint offset);
gint16 tvb_get_letohis(tvbuff_t *tvb, const gint offset);

24-bit unsigned and signed integers:

guint32 tvb_get_letoh24(tvbuff_t *tvb, const gint offset);
gint32 tvb_get_letohi24(tvbuff_t *tvb, const gint offset);

32-bit unsigned (guint32) and signed (gint32) integers:

guint32 tvb_get_letohl(tvbuff_t *tvb, const gint offset);
gint32 tvb_get_letohil(tvbuff_t *tvb, const gint offset);

40-bit unsigned and signed integers:

guint64 tvb_get_letoh40(tvbuff_t *tvb, const gint offset);
gint64 tvb_get_letohi40(tvbuff_t *tvb, const gint offset);

48-bit unsigned and signed integers:

guint64 tvb_get_letoh48(tvbuff_t *tvb, const gint offset);
gint64 tvb_get_letohi48(tvbuff_t *tvb, const gint offset);

56-bit unsigned and signed integers:

guint64 tvb_get_letoh56(tvbuff_t *tvb, const gint offset);
gint64 tvb_get_letohi56(tvbuff_t *tvb, const gint offset);

64-bit unsigned (guint64) and signed (gint64) integers:

guint64 tvb_get_letoh64(tvbuff_t *tvb, const gint offset);
gint64 tvb_get_letohi64(tvbuff_t *tvb, const gint offset);

NOTE: Although each of the integer accessors above returns a type with a
specific size, the returned values are subject to C's integer promotion
rules. It's often safer and more useful to use int or guint for 32-bit
and smaller types, and gint64 or guint64 for 40-bit and larger types.
Just because a value occupied 16 bits on the wire or over the air
doesn't mean it will within Wireshark.
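
The difference between the two byte orders, and the advice to hold small
values in plain int or unsigned int, can be sketched on a raw byte array.
The helpers below ("read_be16", "read_le16", "read_be24_signed") are
hypothetical and standalone; they only mimic what accessors such as
tvb_get_ntohs(), tvb_get_letohs() and tvb_get_ntohi24() are described as
doing, and they return int-sized results on purpose.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* 16 bits, most significant byte first (network order). */
static unsigned int read_be16(const uint8_t *p)
{
    assert(p != NULL);
    return ((unsigned int)p[0] << 8) | p[1];
}

/* 16 bits, least significant byte first. */
static unsigned int read_le16(const uint8_t *p)
{
    assert(p != NULL);
    return ((unsigned int)p[1] << 8) | p[0];
}

/*
 * 24 bits, network order, signed. A 24-bit value has no native C type,
 * so it is sign-extended by hand into an int.
 */
static int read_be24_signed(const uint8_t *p)
{
    int value;

    assert(p != NULL);
    value = (int)(((unsigned int)p[0] << 16) |
                  ((unsigned int)p[1] << 8) | p[2]);
    if (value & 0x800000)
        value -= 0x1000000;
    return value;
}
```

The same two bytes 0x12 0x34 read as 0x1234 in network order and as 0x3412
in little-endian order; the three bytes 0xFF 0xFF 0xFE read as -2 when
treated as a signed 24-bit network-order integer.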

Single-precision and double-precision IEEE floating-point numbers:

gfloat tvb_get_letohieee_float(tvbuff_t *tvb, const gint offset);
gdouble tvb_get_letohieee_double(tvbuff_t *tvb, const gint offset);

Encoding-to-host-order accessors:

16-bit unsigned (guint16) and signed (gint16) integers:

guint16 tvb_get_guint16(tvbuff_t *tvb, const gint offset, const guint encoding);
gint16 tvb_get_gint16(tvbuff_t *tvb, const gint offset, const guint encoding);

24-bit unsigned and signed integers:

guint32 tvb_get_guint24(tvbuff_t *tvb, const gint offset, const guint encoding);
gint32 tvb_get_gint24(tvbuff_t *tvb, const gint offset, const guint encoding);

32-bit unsigned (guint32) and signed (gint32) integers:

guint32 tvb_get_guint32(tvbuff_t *tvb, const gint offset, const guint encoding);
gint32 tvb_get_gint32(tvbuff_t *tvb, const gint offset, const guint encoding);

40-bit unsigned and signed integers:

guint64 tvb_get_guint40(tvbuff_t *tvb, const gint offset, const guint encoding);
gint64 tvb_get_gint40(tvbuff_t *tvb, const gint offset, const guint encoding);

48-bit unsigned and signed integers:

guint64 tvb_get_guint48(tvbuff_t *tvb, const gint offset, const guint encoding);
gint64 tvb_get_gint48(tvbuff_t *tvb, const gint offset, const guint encoding);

56-bit unsigned and signed integers:

guint64 tvb_get_guint56(tvbuff_t *tvb, const gint offset, const guint encoding);
gint64 tvb_get_gint56(tvbuff_t *tvb, const gint offset, const guint encoding);

64-bit unsigned (guint64) and signed (gint64) integers:

guint64 tvb_get_guint64(tvbuff_t *tvb, const gint offset, const guint encoding);
gint64 tvb_get_gint64(tvbuff_t *tvb, const gint offset, const guint encoding);

Single-precision and double-precision IEEE floating-point numbers:

gfloat tvb_get_ieee_float(tvbuff_t *tvb, const gint offset, const guint encoding);
gdouble tvb_get_ieee_double(tvbuff_t *tvb, const gint offset, const guint encoding);

"encoding" should be ENC_BIG_ENDIAN for Network-to-host-order,
ENC_LITTLE_ENDIAN for Little-Endian-to-host-order, or ENC_HOST_ENDIAN
for host order.
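
An accessor that takes the byte order as a parameter can be pictured as a
simple dispatch over the three cases named above. The sketch below is
standalone: the "byte_order" enum and "get_u32" are hypothetical stand-ins
for the ENC_* flags and the tvb_get_guint32() accessor, chosen so that the
example compiles without the Wireshark headers.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-ins for the ENC_* byte-order flags. */
enum byte_order {
    ORDER_BIG_ENDIAN,
    ORDER_LITTLE_ENDIAN,
    ORDER_HOST_ENDIAN
};

/* Read 4 bytes from "p" and interpret them in the requested byte order. */
static uint32_t get_u32(const uint8_t *p, enum byte_order order)
{
    uint32_t host_value;

    assert(p != NULL);
    switch (order) {
    case ORDER_BIG_ENDIAN:
        return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
               ((uint32_t)p[2] << 8) | (uint32_t)p[3];
    case ORDER_LITTLE_ENDIAN:
        return ((uint32_t)p[3] << 24) | ((uint32_t)p[2] << 16) |
               ((uint32_t)p[1] << 8) | (uint32_t)p[0];
    case ORDER_HOST_ENDIAN:
    default:
        /* Whatever layout the machine running the code uses. */
        memcpy(&host_value, p, sizeof host_value);
        return host_value;
    }
}
```

Passing the order as data is convenient when a protocol signals its byte
order in the packet itself, since one call site then serves both cases.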

Accessors for IPv4 and IPv6 addresses:

guint32 tvb_get_ipv4(tvbuff_t *tvb, const gint offset);
void tvb_get_ipv6(tvbuff_t *tvb, const gint offset, ws_in6_addr *addr);

NOTE: IPv4 addresses are not to be converted to host byte order before
being passed to "proto_tree_add_ipv4()". You should use "tvb_get_ipv4()"
to fetch them, not "tvb_get_ntohl()" *OR* "tvb_get_letohl()" - don't,
for example, try to use "tvb_get_ntohl()", find that it gives you the
wrong answer on the PC on which you're doing development, and try
"tvb_get_letohl()" instead, as "tvb_get_letohl()" will give the wrong
answer on big-endian machines.
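
Why neither byte-swapping accessor is right for an address can be seen in a
standalone sketch: if the four octets are copied unchanged, the in-memory
layout is the same on every host, so code that later reads the octets one by
one gets the same answer everywhere. "get_ipv4_raw" and "ipv4_to_str" are
hypothetical helpers written for this illustration, not the Wireshark
functions.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Copy the four address octets as-is; no byte swapping on any host. */
static uint32_t get_ipv4_raw(const uint8_t *p)
{
    uint32_t addr;

    assert(p != NULL);
    memcpy(&addr, p, sizeof addr);
    return addr;
}

/* Format an address held in network byte order by reading its octets. */
static void ipv4_to_str(uint32_t addr, char *out, size_t out_size)
{
    uint8_t o[4];

    assert(out != NULL && out_size >= sizeof "255.255.255.255");
    memcpy(o, &addr, sizeof o);
    snprintf(out, out_size, "%u.%u.%u.%u",
             (unsigned int)o[0], (unsigned int)o[1],
             (unsigned int)o[2], (unsigned int)o[3]);
}
```

The numeric value of "addr" differs between little-endian and big-endian
machines, but the octet sequence, and therefore the formatted string, does
not.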

gchar *tvb_ip_to_str(wmem_allocator_t *scope, tvbuff_t *tvb, const gint offset)
gchar *tvb_ip6_to_str(wmem_allocator_t *scope, tvbuff_t *tvb, const gint offset)

Returns a null-terminated buffer containing a string with the IPv4 or IPv6
address from the specified tvbuff, starting at the specified offset.

Accessors for GUID:

void tvb_get_ntohguid(tvbuff_t *tvb, const gint offset, e_guid_t *guid);
void tvb_get_letohguid(tvbuff_t *tvb, const gint offset, e_guid_t *guid);
void tvb_get_guid(tvbuff_t *tvb, const gint offset, e_guid_t *guid, const guint encoding);

String accessors:

guint8 *tvb_get_string_enc(wmem_allocator_t *scope, tvbuff_t *tvb, const gint offset, const gint length, const guint encoding);

Returns a null-terminated buffer allocated from the specified scope, containing
data from the specified tvbuff, starting at the specified offset, and containing
the specified length worth of characters. Reads data in the specified encoding
and produces UTF-8 in the buffer. See below for a list of input encoding values.

The buffer is allocated in the given wmem scope (see README.wmem for more
information).

guint8 *tvb_get_stringz_enc(wmem_allocator_t *scope, tvbuff_t *tvb, const gint offset, gint *lengthp, const guint encoding);

Returns a null-terminated buffer allocated from the specified scope,
containing data from the specified tvbuff, starting at the specified
offset, and containing all characters from the tvbuff up to and
including a terminating null character in the tvbuff. Reads data in the
specified encoding and produces UTF-8 in the buffer. See below for a
list of input encoding values. "*lengthp" will be set to the length of
the string, including the terminating null.

The buffer is allocated in the given wmem scope (see README.wmem for more
information).

const guint8 *tvb_get_const_stringz(tvbuff_t *tvb, const gint offset, gint *lengthp);

Returns a null-terminated const buffer containing data from the
specified tvbuff, starting at the specified offset, and containing all
bytes from the tvbuff up to and including a terminating null character
in the tvbuff. "*lengthp" will be set to the length of the string,
including the terminating null.

You do not need to free() this buffer; it will happen automatically once
the next packet is dissected. This function is slightly more efficient
than the others because it does not allocate memory and copy the string,
but it does not do any mapping to UTF-8 or checks for valid octet
sequences.

gint tvb_get_nstringz(tvbuff_t *tvb, const gint offset, const guint bufsize, guint8* buffer);
gint tvb_get_nstringz0(tvbuff_t *tvb, const gint offset, const guint bufsize, guint8* buffer);

Copies bufsize bytes, including the terminating NULL, to buffer. If a NULL
terminator is found before reaching bufsize, only the bytes up to and including
the NULL are copied. Returns the number of bytes copied (not including
terminating NULL), or -1 if the string was truncated in the buffer due to
not having reached the terminating NULL. In this case, the resulting
buffer is not NULL-terminated.
tvb_get_nstringz0() works like tvb_get_nstringz(), but never returns -1 since
the string is guaranteed to have a terminating NULL. If the string was truncated
when copied into buffer, a NULL is placed at the end of buffer to terminate it.
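
The difference between the two bounded copies is easiest to see on an
ordinary C string. The sketch below follows the description above;
"copy_nstringz" and "copy_nstringz0" are hypothetical stand-ins that read
from a NUL-terminated source instead of a tvbuff. Returning bufsize - 1
from the "0" variant on truncation is an assumption, derived from "the
number of bytes copied (not including terminating NULL)".

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Copy at most "bufsize" bytes, stopping after the terminator.
 * Returns the string length, or -1 if no terminator was reached, in
 * which case "buffer" is NOT terminated.
 */
static int copy_nstringz(const char *src, size_t bufsize, char *buffer)
{
    size_t i;

    assert(src != NULL && buffer != NULL && bufsize > 0);
    for (i = 0; i < bufsize; i++) {
        buffer[i] = src[i];
        if (src[i] == '\0')
            return (int)i;
    }
    return -1;
}

/* Same, but always leaves "buffer" terminated and never returns -1. */
static int copy_nstringz0(const char *src, size_t bufsize, char *buffer)
{
    int len = copy_nstringz(src, bufsize, buffer);

    if (len == -1) {
        buffer[bufsize - 1] = '\0';   /* overwrite the last copied byte */
        return (int)(bufsize - 1);
    }
    return len;
}
```

With a 4-byte buffer and the source "abcdef", the first variant leaves the
unterminated bytes "abcd" and returns -1, while the second leaves the string
"abc" and returns 3.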

gchar *tvb_get_ts_23_038_7bits_string(wmem_allocator_t *scope, tvbuff_t *tvb,
                                      const gint bit_offset, gint no_of_chars);

tvb_get_ts_23_038_7bits_string() returns a string of the given number of
characters, read from data encoded in the 3GPP TS 23.038 7-bit alphabet.

The buffer is allocated in the given wmem scope (see README.wmem for more
information).

Byte Array Accessors:

gchar *tvb_bytes_to_str(wmem_allocator_t *scope, tvbuff_t *tvb, const gint offset, const gint len);

Formats a bunch of data from a tvbuff as bytes, returning a pointer
to the string with the data formatted as two hex digits for each byte.
The string pointed to is stored in a "wmem_alloc'd" buffer which will be freed
depending on its scope (typically wmem_packet_scope which is freed after the frame).
The formatted string will contain the hex digits for at most the first 16 bytes of
the data. If len is greater than 16 bytes, a trailing "..." will be added to the string.

gchar *tvb_bytes_to_str_punct(wmem_allocator_t *scope, tvbuff_t *tvb,
                              const gint offset, const gint len, const gchar punct);

This function is similar to tvb_bytes_to_str(...) except that 'punct' is inserted
between the hex representation of each byte.
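
The formatting rules just described (two hex digits per byte, an optional
separator, at most 16 bytes followed by "...") can be sketched on a plain
byte array. "format_bytes" is a hypothetical standalone helper: it writes
into a caller-supplied buffer instead of wmem memory, uses lower-case hex
digits as an assumption, and treats a 'punct' of '\0' as "no separator".

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define MAX_BYTES_SHOWN   16
/* 2 digits + 1 separator per byte, plus "..." and the terminator. */
#define BYTES_STR_BUFSIZE (MAX_BYTES_SHOWN * 3 + 3 + 1)

static void format_bytes(const uint8_t *data, size_t len, char punct,
                         char out[BYTES_STR_BUFSIZE])
{
    static const char hex[] = "0123456789abcdef";
    size_t shown = (len > MAX_BYTES_SHOWN) ? MAX_BYTES_SHOWN : len;
    size_t pos = 0;
    size_t i;

    assert(out != NULL);
    assert(data != NULL || len == 0);
    for (i = 0; i < shown; i++) {
        if (i > 0 && punct != '\0')
            out[pos++] = punct;          /* separator between bytes only */
        out[pos++] = hex[data[i] >> 4];
        out[pos++] = hex[data[i] & 0x0f];
    }
    if (len > MAX_BYTES_SHOWN) {
        memcpy(out + pos, "...", 3);     /* mark the truncation */
        pos += 3;
    }
    out[pos] = '\0';
}
```

For the bytes DE AD BE EF this produces "deadbeef" without a separator and
"de:ad:be:ef" with ':' as 'punct'.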

GByteArray *tvb_get_string_bytes(tvbuff_t *tvb, const gint offset, const gint length,
                                 const guint encoding, GByteArray* bytes, gint *endoff)

Given a tvbuff, an offset into the tvbuff, and a length that starts
at that offset (which may be -1 for "all the way to the end of the
tvbuff"), fetch the hex-decoded byte values of the tvbuff into the
passed-in 'bytes' array, based on the passed-in encoding. In other
words, convert from a hex-ascii string in tvbuff, into the supplied
GByteArray.

gchar *tvb_bcd_dig_to_wmem_packet_str(tvbuff_t *tvb, const gint offset, const gint len, dgt_set_t *dgt, gboolean skip_first);

Given a tvbuff, an offset into the tvbuff, and a length that starts
at that offset (which may be -1 for "all the way to the end of the
tvbuff"), fetch BCD encoded digits from a tvbuff starting from either
the low or high half byte, formatting the digits according to an input digit set.
If the digit set is NULL, a default digit set of 0-9, returning "?" for
overdecadic digits, will be used.
A pointer to the packet scope allocated string will be returned.
Note: a tvbuff content of 0xf is considered a 'filler' and will end the conversion.
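
A standalone sketch may help to picture the BCD conversion. It makes
several assumptions that are not stated above: within each byte the low
half byte is emitted before the high half byte, "skip_first" drops the
first (low) half byte, and a 0xf in either position ends the conversion.
"bcd_to_str" and "default_digits" are hypothetical names, and the output
goes to a caller-supplied buffer instead of packet-scope memory.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Digits for half-byte values 0..14; values above 9 map to '?'. */
static const char default_digits[] = "0123456789?????";

static void bcd_to_str(const uint8_t *data, size_t len, const char *digits,
                       bool skip_first, char *out, size_t out_size)
{
    size_t pos = 0;
    size_t i;
    int n;

    assert(data != NULL || len == 0);
    assert(out != NULL && out_size >= 2 * len + 1);
    if (digits == NULL)
        digits = default_digits;

    for (i = 0; i < len; i++) {
        uint8_t half[2];

        half[0] = (uint8_t)(data[i] & 0x0f);   /* low half byte first */
        half[1] = (uint8_t)(data[i] >> 4);     /* then the high half byte */
        for (n = 0; n < 2; n++) {
            if (i == 0 && n == 0 && skip_first)
                continue;
            if (half[n] == 0x0f) {             /* filler: stop here */
                out[pos] = '\0';
                return;
            }
            out[pos++] = digits[half[n]];
        }
    }
    out[pos] = '\0';
}
```

Under these assumptions the bytes 0x21 0x43 0xf5 decode to "12345": the
final high half byte is a filler that pads an odd number of digits.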

Copying memory:
void* tvb_memcpy(tvbuff_t *tvb, void* target, const gint offset, size_t length);

Copies into the specified target the specified length's worth of data
from the specified tvbuff, starting at the specified offset.

void *tvb_memdup(wmem_allocator_t *scope, tvbuff_t *tvb, const gint offset, size_t length);

Returns a buffer containing a copy of the given TVB bytes. The buffer is
allocated in the given wmem scope (see README.wmem for more information).

Pointer-retrieval:
/* WARNING! Don't use this function. There is almost always a better way.
 * It's dangerous because once this pointer is given to the user, there's
 * no guarantee that the user will honor the 'length' and not overstep the
 * boundaries of the buffer. Also see the warning in the Portability section.
 */
const guint8* tvb_get_ptr(tvbuff_t *tvb, const gint offset, const gint length);

Length query:
Get amount of captured data in the buffer (which is *NOT* necessarily the
length of the packet). You probably want tvb_reported_length instead:

    guint tvb_captured_length(const tvbuff_t *tvb);

Get reported length of buffer:

    guint tvb_reported_length(const tvbuff_t *tvb);
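
The two lengths differ when a capture was taken with a snapshot length
shorter than the packet. The standalone sketch below classifies a fetch
against both lengths; "fetch_result", "range_fits" and "check_fetch" are
hypothetical names, and the three-way split is an illustration of the
distinction, not a description of how the tvbuff code reports errors.

```c
#include <assert.h>
#include <stdbool.h>

enum fetch_result {
    FETCH_OK,            /* data is present in the capture */
    FETCH_NOT_CAPTURED,  /* packet had the data, but it was cut off */
    FETCH_PAST_END       /* packet itself is too short */
};

/* True if [offset, offset + length) lies within "total" bytes. */
static bool range_fits(unsigned int total, unsigned int offset,
                       unsigned int length)
{
    return offset <= total && length <= total - offset;
}

static enum fetch_result check_fetch(unsigned int captured_len,
                                     unsigned int reported_len,
                                     unsigned int offset,
                                     unsigned int length)
{
    assert(captured_len <= reported_len);
    if (range_fits(captured_len, offset, length))
        return FETCH_OK;
    if (range_fits(reported_len, offset, length))
        return FETCH_NOT_CAPTURED;
    return FETCH_PAST_END;
}
```

For a 1500-byte packet of which only 64 bytes were captured, a read of 8
bytes at offset 60 falls inside the packet but outside the captured data.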
541
542
5431.4 Functions to handle columns in the traffic summary window.
544
545The topmost pane of the main window is a list of the packets in the
546capture, possibly filtered by a display filter.
547
548Each line corresponds to a packet, and has one or more columns, as
549configured by the user.
550
551Many of the columns are handled by code outside individual dissectors;
552most dissectors need only specify the value to put in the "Protocol" and
553"Info" columns.
554
555Columns are specified by COL_ values; the COL_ value for the "Protocol"
556field, typically giving an abbreviated name for the protocol (but not
557the all-lower-case abbreviation used elsewhere) is COL_PROTOCOL, and the
558COL_ value for the "Info" field, giving a summary of the contents of the
559packet for that protocol, is COL_INFO.
560
561The value for a column can be specified with one of several functions,
562all of which take the 'fd' argument to the dissector as their first
563argument, and the COL_ value for the column as their second argument.
564
5651.4.1 The col_set_str function.
566
567'col_set_str' takes a string as its third argument, and sets the value
568for the column to that value. It assumes that the pointer passed to it
569points to a string constant or a static "const" array, not to a
570variable, as it doesn't copy the string, it merely saves the pointer
571value; the argument can itself be a variable, as long as it always
572points to a string constant or a static "const" array.
573
It is more efficient than 'col_add_str' or 'col_add_fstr'; however, if
the dissector will be using 'col_append_str' or 'col_append_fstr' to
append more information to the column, the string will have to be copied
anyway, so it's best to use 'col_add_str' rather than 'col_set_str' in
that case.
579
580For example, to set the "Protocol" column
581to "PROTOABBREV":
582
583    col_set_str(pinfo->cinfo, COL_PROTOCOL, "PROTOABBREV");
584
585
1.4.2 The col_add_str function.
587
588'col_add_str' takes a string as its third argument, and sets the value
589for the column to that value. It takes the same arguments as
590'col_set_str', but copies the string, so that if the string is, for
591example, an automatic variable that won't remain in scope when the
592dissector returns, it's safe to use.
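
The difference between saving a pointer and copying the string can be
illustrated outside Wireshark with a small stand-alone model. The
'toy_column' type and the 'toy_col_*' functions below are hypothetical
stand-ins written for this sketch; they are not the epan implementation:

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical stand-in for one column; not the real epan column structure. */
typedef struct {
    const char *text;      /* what the column currently displays */
    char        buf[64];   /* private storage, used only when copying */
} toy_column;

/* Models col_set_str: only the pointer is saved, so 'str' must outlive the column. */
static void toy_col_set_str(toy_column *col, const char *str)
{
    col->text = str;
}

/* Models col_add_str: the string is copied, so the caller's buffer may go away. */
static void toy_col_add_str(toy_column *col, const char *str)
{
    snprintf(col->buf, sizeof col->buf, "%s", str);
    col->text = col->buf;
}

/* Builds a string in an automatic buffer, as a dissector often does. */
static void toy_fill_info(toy_column *col, unsigned n)
{
    char local[32];

    snprintf(local, sizeof local, "%u bytes", n);
    toy_col_add_str(col, local);   /* safe: 'local' is copied before it goes away */
    /* toy_col_set_str(col, local) here would leave a dangling pointer. */
}
```

After 'toy_fill_info' returns, the column still holds valid text because
the copy lives in the column itself; 'toy_col_set_str' remains the cheaper
choice for string constants such as "PROTOABBREV".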
593
594
1.4.3 The col_add_fstr function.
596
597'col_add_fstr' takes a 'printf'-style format string as its third
598argument, and 'printf'-style arguments corresponding to '%' format
599items in that string as its subsequent arguments. For example, to set
600the "Info" field to "<XXX> request, <N> bytes", where "reqtype" is a
601string containing the type of the request in the packet and "n" is an
602unsigned integer containing the number of bytes in the request:
603
604    col_add_fstr(pinfo->cinfo, COL_INFO, "%s request, %u bytes",
605                 reqtype, n);
606
607Don't use 'col_add_fstr' with a format argument of just "%s" -
608'col_add_str', or possibly even 'col_set_str' if the string that matches
609the "%s" is a static constant string, will do the same job more
610efficiently.
611
612
1.4.4 The col_clear function.
614
615If the Info column will be filled with information from the packet, that
616means that some data will be fetched from the packet before the Info
617column is filled in. If the packet is so small that the data in
618question cannot be fetched, the routines to fetch the data will throw an
619exception (see the comment at the beginning about tvbuffers improving
620the handling of short packets - the tvbuffers keep track of how much
621data is in the packet, and throw an exception on an attempt to fetch
622data past the end of the packet, so that the dissector won't process
623bogus data), causing the Info column not to be filled in.
624
625This means that the Info column will have data for the previous
626protocol, which would be confusing if, for example, the Protocol column
627had data for this protocol.
628
629Therefore, before a dissector fetches any data whatsoever from the
630packet (unless it's a heuristic dissector fetching data to determine
631whether the packet is one that it should dissect, in which case it
632should check, before fetching the data, whether there's any data to
633fetch; if there isn't, it should return FALSE), it should set the
634Protocol column and the Info column.
635
636If the Protocol column will ultimately be set to, for example, a value
637containing a protocol version number, with the version number being a
638field in the packet, the dissector should, before fetching the version
639number field or any other field from the packet, set it to a value
640without a version number, using 'col_set_str', and should later set it
641to a value with the version number after it's fetched the version
642number.
643
644If the Info column will ultimately be set to a value containing
645information from the packet, the dissector should, before fetching any
646fields from the packet, clear the column using 'col_clear' (which is
647more efficient than clearing it by calling 'col_set_str' or
648'col_add_str' with a null string), and should later set it to the real
649string after it's fetched the data to use when doing that.
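
The ordering described above (set the Protocol column, clear the Info
column, and only then fetch data) can be modelled without the Wireshark
headers. In this sketch the 'toy_columns' type and 'toy_dissect' function
are hypothetical, and an early return stands in for the exception that the
tvbuff accessors would throw on a short packet:

```c
#include <stddef.h>
#include <stdio.h>

/* Hypothetical stand-in for the Protocol and Info columns of one packet row. */
typedef struct {
    const char *protocol;
    char        info[64];
} toy_columns;

/*
 * Toy dissector: returns 0 on success and -1 when the packet is too short.
 * The early return stands in for the exception thrown by the real tvbuff
 * accessors; this model does not use tvbuffs at all.
 */
static int toy_dissect(toy_columns *cols, const unsigned char *data, size_t len)
{
    /* Set both columns before touching any packet data. */
    cols->protocol = "TOY";     /* models col_set_str(..., COL_PROTOCOL, ...) */
    cols->info[0] = '\0';       /* models col_clear(..., COL_INFO) */

    if (len < 1)
        return -1;              /* short packet: Info stays empty, not stale */

    snprintf(cols->info, sizeof cols->info, "Request type %u", (unsigned)data[0]);
    return 0;
}
```

If the columns still held the previous protocol's text when 'toy_dissect'
was called with an empty packet, the row ends up with "TOY" and an empty
Info column rather than "TOY" next to another protocol's summary.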
650
651
1.4.5 The col_append_str function.
653
654Sometimes the value of a column, especially the "Info" column, can't be
655conveniently constructed at a single point in the dissection process;
656for example, it might contain small bits of information from many of the
657fields in the packet. 'col_append_str' takes, as arguments, the same
658arguments as 'col_add_str', but the string is appended to the end of the
659current value for the column, rather than replacing the value for that
660column. (Note that no blank separates the appended string from the
661string to which it is appended; if you want a blank there, you must add
662it yourself as part of the string being appended.)
663
664
1.4.6 The col_append_fstr function.
666
667'col_append_fstr' is to 'col_add_fstr' as 'col_append_str' is to
668'col_add_str' - it takes, as arguments, the same arguments as
669'col_add_fstr', but the formatted string is appended to the end of the
670current value for the column, rather than replacing the value for that
671column.
672
1.4.7 The col_append_sep_str and col_append_sep_fstr functions.
674
675In specific situations the developer knows that a column's value will be
676created in a stepwise manner, where the appended values are listed. Both
677'col_append_sep_str' and 'col_append_sep_fstr' functions will add an item
678separator between two consecutive items, and will not add the separator at the
679beginning of the column. The remainder of the work both functions do is
680identical to what 'col_append_str' and 'col_append_fstr' do.
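
The separator rule can be sketched with a stand-alone model. The
'toy_column' type and 'toy_col_append_sep_str' below are hypothetical, and
passing the separator as an explicit parameter is an assumption of this
sketch rather than a statement about the real function's signature:

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical stand-in for a column's text buffer. */
typedef struct {
    char text[128];
} toy_column;

/*
 * Models col_append_sep_str: 'sep' is written only when the column already
 * holds something, so the column never starts with a separator.
 */
static void toy_col_append_sep_str(toy_column *col, const char *sep, const char *str)
{
    size_t used = strlen(col->text);

    snprintf(col->text + used, sizeof col->text - used, "%s%s",
             used > 0 ? sep : "", str);
}
```

Appending "SYN" and then "ACK" with a ", " separator to an empty column
gives "SYN, ACK", with no leading separator.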
681
1.4.8 The col_set_fence and col_prepend_fence_fstr functions.
683
684Sometimes a dissector may be called multiple times for different PDUs in the
685same frame (for example in the case of SCTP chunk bundling: several upper
686layer data packets may be contained in one SCTP packet). If the upper layer
687dissector calls 'col_set_str()' or 'col_clear()' on the Info column when it
688begins dissecting each of those PDUs then when the frame is fully dissected
689the Info column would contain only the string from the last PDU in the frame.
690The 'col_set_fence' function erects a "fence" in the column that prevents
691subsequent 'col_...' calls from clearing the data currently in that column.
692For example, the SCTP dissector calls 'col_set_fence' on the Info column
693after it has called any subdissectors for that chunk so that subdissectors
694of any subsequent chunks may only append to the Info column.
695'col_prepend_fence_fstr' prepends data before a fence (moving it if
696necessary). It will create a fence at the end of the prepended data if the
697fence does not already exist.
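
The effect of a fence can be modelled as an offset into the column text
below which nothing is erased. The types and functions in this sketch are
hypothetical stand-ins, not the epan column code:

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical stand-in for a column with a fence offset. */
typedef struct {
    char   text[128];
    size_t fence;      /* text before this offset is protected */
} toy_column;

/* Models col_clear: only the text after the fence is removed. */
static void toy_col_clear(toy_column *col)
{
    col->text[col->fence] = '\0';
}

/* Models col_append_str. */
static void toy_col_append_str(toy_column *col, const char *str)
{
    size_t used = strlen(col->text);

    snprintf(col->text + used, sizeof col->text - used, "%s", str);
}

/* Models col_set_fence: everything currently in the column becomes protected. */
static void toy_col_set_fence(toy_column *col)
{
    col->fence = strlen(col->text);
}

/* Toy upper-layer dissector that clears Info before writing its summary. */
static void toy_dissect_pdu(toy_column *info, const char *summary)
{
    toy_col_clear(info);
    toy_col_append_str(info, summary);
}
```

If the lower layer calls 'toy_col_set_fence' after each call to
'toy_dissect_pdu', the summaries of all PDUs in the frame accumulate;
without the fence only the last one would survive.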
698
699
1.4.9 The col_set_time function.
701
The 'col_set_time' function takes an nstime value as its third argument.
This nstime value is a relative value and will be added as such to the
column. The fourth argument is the name of the filter field holding this
value. This way, right-clicking on the column makes it possible to build a
filter based on the time value.
707
708For example:
709
710    col_set_time(pinfo->cinfo, COL_REL_TIME, &ts, "s4607.ploc.time");
711
712
1.5 Constructing the protocol tree.
714
715The middle pane of the main window, and the topmost pane of a packet
716popup window, are constructed from the "protocol tree" for a packet.
717
718The protocol tree, or proto_tree, is a GNode, the N-way tree structure
719available within GLIB. Of course the protocol dissectors don't care
720what a proto_tree really is; they just pass the proto_tree pointer as an
721argument to the routines which allow them to add items and new branches
722to the tree.
723
724When a packet is selected in the packet-list pane, or a packet popup
725window is created, a new logical protocol tree (proto_tree) is created.
726The pointer to the proto_tree (in this case, 'protocol tree'), is passed
727to the top-level protocol dissector, and then to all subsequent protocol
728dissectors for that packet, and then the GUI tree is drawn via
729proto_tree_draw().
730
731The logical proto_tree needs to know detailed information about the protocols
732and fields about which information will be collected from the dissection
733routines. By strictly defining (or "typing") the data that can be attached to a
734proto tree, searching and filtering becomes possible. This means that for
735every protocol and field (which I also call "header fields", since they are
736fields in the protocol headers) which might be attached to a tree, some
737information is needed.
738
Every dissector routine will need to register its protocols and fields
with the central protocol routines (in proto.c). At first I thought I
might keep all the protocol and field information about all the
dissectors in one file, but decentralization seemed like a better idea.
That one file would have gotten very large; one small change would have
required a re-compilation of the entire file. Also, by allowing
registration of protocols and fields at run-time, loadable modules of
protocol dissectors (perhaps even user-supplied) are feasible.
747
748To do this, each protocol should have a register routine, which will be
749called when Wireshark starts. The code to call the register routines is
750generated automatically; to arrange that a protocol's register routine
751be called at startup:
752
753    the file containing a dissector's "register" routine must be
754    added to "DISSECTOR_SRC" in "epan/dissectors/CMakeLists.txt";
755
756    the "register" routine must have a name of the form
757    "proto_register_XXX";
758
759    the "register" routine must take no argument, and return no
760    value;
761
762    the "register" routine's name must appear in the source file
763    either at the beginning of the line, or preceded only by "void "
764    at the beginning of the line (that would typically be the
765    definition) - other white space shouldn't cause a problem, e.g.:
766
767void proto_register_XXX(void) {
768
769    ...
770
771}
772
773and
774
775void
776proto_register_XXX( void )
777{
778
779    ...
780
781}
782
783    and so on should work.
784
For every protocol or field that a dissector wants to register, a variable of
type int needs to be used to keep track of the ID assigned to that protocol
or field. The IDs are needed for establishing parent/child relationships
between protocols and fields, as well as associating data with a particular
field so that it can be stored in the logical tree and displayed in the GUI
protocol tree.
791
792Some dissectors will need to create branches within their tree to help
793organize header fields. These branches should be registered as header
794fields. Only true protocols should be registered as protocols. This is
795so that a display filter user interface knows how to distinguish
796protocols from fields.
797
798A protocol is registered with the name of the protocol and its
799abbreviation.
800
801Here is how the frame "protocol" is registered.
802
803        int proto_frame;
804
805        proto_frame = proto_register_protocol (
806                /* name */            "Frame",
807                /* short name */      "Frame",
808                /* abbrev */          "frame" );
809
810A header field is also registered with its name and abbreviation, but
811information about its data type is needed. It helps to look at
812the header_field_info struct to see what information is expected:
813
814struct header_field_info {
815    const char      *name;
816    const char      *abbrev;
817    enum ftenum     type;
818    int             display;
819    const void      *strings;
820    guint64         bitmask;
821    const char      *blurb;
822    .....
823};
824
825name (FIELDNAME)
826----------------
827A string representing the name of the field. This is the name
828that will appear in the graphical protocol tree. It must be a non-empty
829string.
830
831abbrev (FIELDABBREV)
832--------------------
833A string with an abbreviation of the field. The abbreviation should start
834with the abbreviation of the parent protocol followed by a period as a
835separator. For example, the "src" field in an IP packet would have "ip.src"
836as an abbreviation. It is acceptable to have multiple levels of periods if,
837for example, you have fields in your protocol that are then subdivided into
838subfields. For example, TRMAC has multiple error fields, so the abbreviations
839follow this pattern: "trmac.errors.iso", "trmac.errors.noniso", etc.
840
841The abbreviation is the identifier used in a display filter. As such it
842cannot be an empty string.
843
844type (FIELDTYPE)
845----------------
846The type of value this field holds. The current field types are:
847
848    FT_NONE     No field type. Used for fields that
849                aren't given a value, and that can only
850                be tested for presence or absence; a
851                field that represents a data structure,
852                with a subtree below it containing
853                fields for the members of the structure,
854                or that represents an array with a
855                subtree below it containing fields for
856                the members of the array, might be an
857                FT_NONE field.
858    FT_PROTOCOL Used for protocols which will be placing
859                themselves as top-level items in the
860                "Packet Details" pane of the UI.
861    FT_BOOLEAN  0 means "false", any other value means
862                "true".
863    FT_FRAMENUM A frame number; if this is used, the "Go
864                To Corresponding Frame" menu item can
865                work on that field.
866    FT_CHAR     An 8-bit ASCII character. It's treated similarly to an
867                FT_UINT8, but is displayed as a C-style character
868                constant.
869    FT_UINT8    An 8-bit unsigned integer.
870    FT_UINT16   A 16-bit unsigned integer.
871    FT_UINT24   A 24-bit unsigned integer.
872    FT_UINT32   A 32-bit unsigned integer.
873    FT_UINT40   A 40-bit unsigned integer.
874    FT_UINT48   A 48-bit unsigned integer.
875    FT_UINT56   A 56-bit unsigned integer.
876    FT_UINT64   A 64-bit unsigned integer.
877    FT_INT8     An 8-bit signed integer.
878    FT_INT16    A 16-bit signed integer.
879    FT_INT24    A 24-bit signed integer.
880    FT_INT32    A 32-bit signed integer.
881    FT_INT40    A 40-bit signed integer.
882    FT_INT48    A 48-bit signed integer.
883    FT_INT56    A 56-bit signed integer.
884    FT_INT64    A 64-bit signed integer.
885    FT_FLOAT    A single-precision floating point number.
886    FT_DOUBLE   A double-precision floating point number.
887    FT_ABSOLUTE_TIME    An absolute time from some fixed point in time,
888                displayed as the date, followed by the time, as
889                hours, minutes, and seconds with 9 digits after
890                the decimal point.
    FT_RELATIVE_TIME    Seconds (4 bytes) and nanoseconds (4 bytes)
                of time relative to an arbitrary time,
                displayed as seconds and 9 digits
                after the decimal point.
895    FT_STRING   A string of characters, not necessarily
896                NULL-terminated, but possibly NULL-padded.
897                This, and the other string-of-characters
898                types, are to be used for text strings,
899                not raw binary data.
900    FT_STRINGZ  A NULL-terminated string of characters.
901                The string length is normally the length
902                given in the proto_tree_add_item() call.
903                However if the length given in the call
904                is -1, then the length used is that
905                returned by calling tvb_strsize().
906                This should only be used if the string,
907                in the packet, is always terminated with
908                a NULL character, either because the length
909                isn't otherwise specified or because a
910                character count *and* a NULL terminator are
911                both used.
912    FT_STRINGZPAD   A NULL-padded string of characters.
913                The length is given in the proto_tree_add_item()
914                call, but may be larger than the length of
915                the string, with extra bytes being NULL padding.
916                This is typically used for fixed-length fields
917                that contain a string value that might be shorter
918                than the fixed length.
919    FT_STRINGZTRUNC  A NULL-truncated string of characters.
920                The length is given in the proto_tree_add_item()
921                call, but may be larger than the length of
922                the string, with a NULL character after the last
923                character of the string, and the remaining bytes
924                being padding with unspecified contents.  This is
925                typically used for fixed-length fields that contain
926                a string value that might be shorter than the fixed
927                length.
928    FT_UINT_STRING  A counted string of characters, consisting
929                of a count (represented as an integral value,
930                of width given in the proto_tree_add_item()
931                call) followed immediately by that number of
932                characters.
933    FT_ETHER    A six octet string displayed in
934                Ethernet-address format.
935    FT_BYTES    A string of bytes with arbitrary values;
936                used for raw binary data.
937    FT_UINT_BYTES   A counted string of bytes, consisting
938                of a count (represented as an integral value,
939                of width given in the proto_tree_add_item()
940                call) followed immediately by that number of
941                arbitrary values; used for raw binary data.
942    FT_IPv4     A version 4 IP address (4 bytes) displayed
943                in dotted-quad IP address format (4
944                decimal numbers separated by dots).
945    FT_IPv6     A version 6 IP address (16 bytes) displayed
946                in standard IPv6 address format.
947    FT_IPXNET   An IPX address displayed in hex as a 6-byte
948                network number followed by a 6-byte station
949                address.
950    FT_GUID     A Globally Unique Identifier
951    FT_OID      An ASN.1 Object Identifier
952    FT_REL_OID  An ASN.1 Relative Object Identifier
    FT_EUI64    An EUI-64 Address
    FT_AX25     An AX.25 Address
955    FT_VINES    A Vines Address
956    FT_SYSTEM_ID  An OSI System-ID
957    FT_FCWWN    A Fibre Channel WWN Address
958
Some of these field types are still not handled in the display filter
routines, but the most common ones are. The FT_UINT* types all
represent unsigned integers, and the FT_INT* types all represent
signed integers; the number on the end represents how many bits are used
to represent the number.
964
Some constraints are imposed on the header fields depending on the type
(e.g. FT_BYTES) of the field. Fields of type FT_ABSOLUTE_TIME must use
'ABSOLUTE_TIME_{LOCAL,UTC,DOY_UTC}, NULL, 0x0' as values for the
'display', 'strings', and 'bitmask' fields, and all other non-integral
types (i.e., types that are _not_ FT_INT* and FT_UINT*) must use
'BASE_NONE, NULL, 0x0' as values for the 'display', 'strings', and 'bitmask'
fields. The reason is simply that the type itself implicitly defines the
nature of 'display', 'strings', and 'bitmask'.
973
974display (FIELDDISPLAY)
975----------------------
976The display field has a couple of overloaded uses. This is unfortunate,
977but since we're using C as an application programming language, this sometimes
978makes for cleaner programs. Right now I still think that overloading
979this variable was okay.
980
981For integer fields (FT_UINT* and FT_INT*), this variable represents the
982base in which you would like the value displayed. The acceptable bases
983are:
984
985    BASE_DEC,
986    BASE_HEX,
987    BASE_OCT,
988    BASE_DEC_HEX,
989    BASE_HEX_DEC,
990    BASE_CUSTOM
991
BASE_DEC, BASE_HEX, and BASE_OCT are decimal, hexadecimal, and octal,
respectively. BASE_DEC_HEX and BASE_HEX_DEC display the value in two bases
(the first representation followed by the second in parentheses).
995
996BASE_CUSTOM allows one to specify a callback function pointer that will
997format the value.
998
999For 32-bit and smaller values, custom_fmt_func_t can be used to declare
1000the callback function pointer. Specifically, this is defined as:
1001
1002    void func(gchar *, guint32);
1003
1004For values larger than 32-bits, custom_fmt_func_64_t can be used to declare
1005the callback function pointer. Specifically, this is defined as:
1006
1007    void func(gchar *, guint64);
1008
1009The first argument is a pointer to a buffer of the ITEM_LABEL_LENGTH size
1010and the second argument is the value to be formatted.
1011
1012Both custom_fmt_func_t and custom_fmt_func_64_t are defined in epan/proto.h.
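
As an illustration, a BASE_CUSTOM callback for a 32-bit or smaller field
might look like the sketch below. The typedefs and the ITEM_LABEL_LENGTH
value are stand-ins so that the example compiles on its own (in a real
dissector they come from the Wireshark and GLib headers, and the value
assumed here should not be relied upon); 'toy_format_half_db' is a
hypothetical name:

```c
#include <stdint.h>
#include <stdio.h>

/* Stand-ins so the sketch compiles outside the Wireshark tree (assumptions). */
typedef char     gchar;
typedef uint32_t guint32;
#define ITEM_LABEL_LENGTH 240   /* assumed value; use the one from epan/proto.h */

/*
 * Hypothetical formatter matching 'void func(gchar *, guint32)': shows a raw
 * value counted in units of 0.5 dB as a decimal number of dB.
 */
static void toy_format_half_db(gchar *buf, guint32 value)
{
    /* Never write more than ITEM_LABEL_LENGTH bytes into the label buffer. */
    snprintf(buf, ITEM_LABEL_LENGTH, "%u.%u dB",
             (unsigned)(value / 2), (unsigned)((value % 2) * 5));
}
```

A raw value of 7 is formatted as "3.5 dB". The callback only formats; it
should not assume that the buffer is any larger than ITEM_LABEL_LENGTH.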
1013
For FT_UINT16, 'display' can be used to select a transport layer protocol using
one of BASE_PT_UDP, BASE_PT_TCP, BASE_PT_DCCP or BASE_PT_SCTP. If transport name
resolution is enabled, the port field label is displayed in decimal and as a
well-known service name (if one is available).
1018
For FT_BOOLEAN fields that are also bitfields (i.e., 'bitmask' is non-zero),
'display' is used to specify a "field-width" (i.e., tell the proto_tree how
wide the parent bitfield is). (If the FT_BOOLEAN 'bitmask' is zero, then
'display' must be BASE_NONE).
1023
1024For integer fields a "field-width" is not needed since the type of
1025integer itself (FT_UINT8, FT_UINT16, FT_UINT24, FT_UINT32, FT_UINT40,
1026FT_UINT48, FT_UINT56, FT_UINT64, etc) tells the proto_tree how wide the
1027parent bitfield is. The same is true of FT_CHAR, as it's an 8-bit
1028character.
1029
1030For FT_ABSOLUTE_TIME fields, 'display' is used to indicate whether the
1031time is to be displayed as a time in the time zone for the machine on
1032which Wireshark/TShark is running or as UTC and, for UTC, whether the
1033date should be displayed as "{monthname} {day_of_month}, {year}" or as
1034"{year/day_of_year}".
1035
1036Additionally, BASE_NONE is used for 'display' as a NULL-value. That is, for
1037non-integers other than FT_ABSOLUTE_TIME fields, and non-bitfield
1038FT_BOOLEANs, you'll want to use BASE_NONE in the 'display' field. You may
1039not use BASE_NONE for integers.
1040
1041It is possible that in the future we will record the endianness of
1042integers. If so, it is likely that we'll use a bitmask on the display field
1043so that integers would be represented as BEND|BASE_DEC or LEND|BASE_HEX.
1044But that has not happened yet; note that there are protocols for which
1045no endianness is specified, such as the X11 protocol and the DCE RPC
1046protocol, so it would not be possible to record the endianness of all
1047integral fields.
1048
1049strings (FIELDCONVERT)
1050----------------------
1051-- value_string
1052Some integer fields, of type FT_UINT*, need labels to represent the true
1053value of a field. You could think of those fields as having an
1054enumerated data type, rather than an integral data type.
1055
1056A 'value_string' structure is a way to map values to strings.
1057
1058    typedef struct _value_string {
1059        guint32  value;
1060        gchar   *strptr;
1061    } value_string;
1062
1063For fields of that type, you would declare an array of "value_string"s:
1064
1065    static const value_string valstringname[] = {
1066        { INTVAL1, "Descriptive String 1" },
1067        { INTVAL2, "Descriptive String 2" },
1068        { 0,       NULL }
1069    };
1070
1071(the last entry in the array must have a NULL 'strptr' value, to
1072indicate the end of the array). The 'strings' field would be set to
1073'VALS(valstringname)'.
1074
1075If the field has a numeric rather than an enumerated type, the 'strings'
1076field would be set to NULL.
1077
1078If BASE_SPECIAL_VALS is also applied to the display bitmask, then if the
1079numeric value of a field doesn't match any values in the value_string
1080then just the numeric value is displayed (i.e. no "Unknown"). This is
1081intended for use when the value_string only gives special names for
1082certain field values and values not in the value_string are expected.
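
How a value_string array is consulted can be sketched with a stand-alone
linear lookup. The 'toy_value_string' type mirrors the layout shown above,
and 'toy_lookup' and 'toy_msg_types' are hypothetical names used only for
this example; Wireshark provides its own lookup helpers, which dissectors
should use instead:

```c
#include <stddef.h>
#include <stdint.h>

/* Stand-in mirroring the value_string layout (guint32 written as uint32_t). */
typedef struct {
    uint32_t    value;
    const char *strptr;
} toy_value_string;

/*
 * Hypothetical linear lookup: walks the array until the NULL 'strptr'
 * terminator and returns the matching label, or NULL when nothing matches.
 */
static const char *toy_lookup(uint32_t val, const toy_value_string *vs)
{
    for (size_t i = 0; vs[i].strptr != NULL; i++) {
        if (vs[i].value == val)
            return vs[i].strptr;
    }
    return NULL;
}

static const toy_value_string toy_msg_types[] = {
    { 1, "Request"  },
    { 2, "Response" },
    { 0, NULL }
};
```

A NULL result corresponds to the case in which the display code has to
choose between an "Unknown" label and, with BASE_SPECIAL_VALS, the bare
numeric value. Note that the terminator is recognised by its NULL
'strptr', not by its value, so 0 remains usable as a real entry.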
1083
1084-- Extended value strings
You can also use an extended version of the value_string for faster lookups.
It requires a value_string array as input.
If all of a contiguous range of values from min to max are present in the array
in ascending order, the value will be used as a direct index into the
value_string array.

If the values in the array are not contiguous (i.e., there are "gaps") but are
in ascending order, a binary search will be used.

Note: "gaps" in a value_string array can be filled with "empty" entries, e.g.
{value, "Unknown"}, so that direct access to the array is possible.
1095
1096Note: the value_string array values are *unsigned*; IOW: -1 is greater than 0.
1097      So:
1098      { -2, -1,  1,  2 }; wrong:   linear search will be used (note gap)
1099      {  1,  2, -2, -1 }; correct: binary search will be used
1100
1101      As a special case:
1102      { -2, -1,  0,  1,  2 }; OK: direct(indexed) access will be used (note no gap)
1103
1104The init macro (see below) will perform a check on the value string the first
1105time it is used to determine which search algorithm fits and fall back to a
1106linear search if the value_string does not meet the criteria above.
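
The selection rules above can be expressed as a small classifier. This is
a sketch written from the description in this section, not the actual
initialization code; 'toy_classify' and the 'TOY_*' names are
hypothetical. Values are compared as unsigned 32-bit numbers, which is
what makes the "special case" with negative values index correctly:

```c
#include <stddef.h>
#include <stdint.h>

/* Stand-in mirroring the value_string layout. */
typedef struct {
    uint32_t    value;
    const char *strptr;
} toy_value_string;

typedef enum {
    TOY_DIRECT_INDEX,
    TOY_BINARY_SEARCH,
    TOY_LINEAR_SEARCH
} toy_strategy;

/*
 * Hypothetical classifier.  'n' is the number of real entries, not
 * counting the { 0, NULL } terminator.
 */
static toy_strategy toy_classify(const toy_value_string *vs, size_t n)
{
    toy_strategy strategy = TOY_DIRECT_INDEX;
    uint32_t     first = vs[0].value;
    uint32_t     prev = first;

    for (size_t i = 0; i < n; i++) {
        /* Direct indexing needs entry i to hold first + i (modulo 2^32). */
        if (strategy == TOY_DIRECT_INDEX &&
            vs[i].value != (uint32_t)(first + (uint32_t)i))
            strategy = TOY_BINARY_SEARCH;

        /* Binary search needs ascending unsigned order. */
        if (strategy == TOY_BINARY_SEARCH &&
            (vs[i].value < prev || vs[i].value < first))
            return TOY_LINEAR_SEARCH;

        prev = vs[i].value;
    }
    return strategy;
}
```

Applied to the three arrays in the note above, the classifier yields a
linear search for { -2, -1, 1, 2 }, a binary search for { 1, 2, -2, -1 },
and direct indexing for { -2, -1, 0, 1, 2 }.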
1107
1108Use this macro to initialize the extended value_string at compile time:
1109
1110static value_string_ext valstringname_ext = VALUE_STRING_EXT_INIT(valstringname);
1111
1112Extended value strings can be created at run time by calling
1113   value_string_ext_new(<ptr to value_string array>,
1114                        <total number of entries in the value_string_array>, /* include {0, NULL} entry */
1115                        <value_string_name>);
1116
1117For hf[] array FT_(U)INT* fields that need a 'valstringname_ext' struct, the
1118'strings' field would be set to '&valstringname_ext'. Furthermore, the 'display'
1119field must be ORed with 'BASE_EXT_STRING' (e.g. BASE_DEC|BASE_EXT_STRING).
1120
1121-- val64_string
1122
1123val64_strings are like value_strings, except that the integer type
1124used is a guint64 (instead of guint32). Instead of using the VALS()
1125macro for the 'strings' field in the header_field_info struct array,
1126'VALS64()' is used.
1127
1128BASE_SPECIAL_VALS can also be used for val64_string.
1129
1130-- val64_string_ext
1131
1132val64_string_ext is like value_string_ext, except that the integer type
1133used is a guint64 (instead of guint32).
1134
1135Use this macro to initialize the extended val64_string at compile time:
1136
1137static val64_string_ext val64stringname_ext = VAL64_STRING_EXT_INIT(val64stringname);
1138
1139Extended val64 strings can be created at run time by calling
1140   val64_string_ext_new(<ptr to val64_string array>,
1141                        <total number of entries in the val64_string_array>, /* include {0, NULL} entry */
1142                        <val64_string_name>);
1143
1144For hf[] array FT_(U)INT* fields that need a 'val64stringname_ext' struct, the
1145'strings' field would be set to '&val64stringname_ext'. Furthermore, the 'display'
1146field must be ORed with both 'BASE_EXT_STRING' and 'BASE_VAL64_STRING'
1147(e.g. BASE_DEC|BASE_EXT_STRING|BASE_VAL64_STRING).
1148
1149-- Unit string
1150Some integer fields, of type FT_UINT* and float fields, of type FT_FLOAT
1151or FT_DOUBLE, need units of measurement to help convey the field value.
1152
1153A 'unit_name_string' structure is a way to add a unit suffix to a field.
1154
1155    typedef struct unit_name_string {
1156        char *singular;     /* name to use for 1 unit */
1157        char *plural;       /* name to use for < 1 or > 1 units */
1158    } unit_name_string;
1159
1160For fields with that unit name, you would declare a "unit_name_string":
1161
    static const unit_name_string unitname =
        { "single item name" , "multiple item name" };
1164
1165(the second entry can be NULL if there is no plural form of the unit name.
1166This is typically the case when abbreviations are used instead of full words.)
1167
1168There are several "common" unit name structures already defined in
1169epan/unit_strings.h. Dissector authors may choose to add the unit name
1170structure there rather than locally in a dissector.
1171
For hf[] array FT_(U)INT*, FT_FLOAT and FT_DOUBLE fields that need a
'unit_name_string' struct, the 'strings' field would be set to
'&units_second_seconds'. Furthermore, the 'display' field must be ORed
with 'BASE_UNIT_STRING' (e.g. BASE_DEC|BASE_UNIT_STRING).
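
The singular/plural choice, including the NULL-plural case mentioned
above, can be sketched as follows. The 'toy_unit_name_string' type mirrors
the struct shown earlier; 'toy_unit_suffix' and the two example unit
structs are hypothetical names for this sketch only:

```c
#include <stddef.h>
#include <stdint.h>

/* Stand-in mirroring the unit_name_string layout. */
typedef struct {
    const char *singular;   /* name to use for 1 unit */
    const char *plural;     /* name to use for < 1 or > 1 units; may be NULL */
} toy_unit_name_string;

/* Hypothetical helper: picks the unit name for an integer value. */
static const char *toy_unit_suffix(uint32_t value, const toy_unit_name_string *units)
{
    /* With no plural form, the singular name is used for every value. */
    if (units->plural == NULL || value == 1)
        return units->singular;
    return units->plural;
}

static const toy_unit_name_string toy_units_second = { "second", "seconds" };
static const toy_unit_name_string toy_units_ms     = { "ms", NULL };
```

With these definitions a value of 1 selects "second", 0 or 5 select
"seconds", and any value selects "ms" for the abbreviated unit.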
1176
1177-- Ranges
1178If the field has a numeric type that might logically fit in ranges of values
1179one can use a range_string struct.
1180
1181Thus a 'range_string' structure is a way to map ranges to strings.
1182
1183        typedef struct _range_string {
1184                guint32        value_min;
1185                guint32        value_max;
1186                const gchar   *strptr;
1187        } range_string;
1188
1189For fields of that type, you would declare an array of "range_string"s:
1190
    static const range_string rvalstringname[] = {
        { INTVAL_MIN1, INTVAL_MAX1, "Descriptive String 1" },
        { INTVAL_MIN2, INTVAL_MAX2, "Descriptive String 2" },
        { 0,           0,           NULL                   }
    };
1196
1197If INTVAL_MIN equals INTVAL_MAX for a given entry the range_string
1198behavior collapses to the one of value_string. Note that each range_string
1199within the array is tested in order, so any 'catch-all' entries need to come
1200after specific individual entries.
1201
For FT_(U)INT* fields that need a 'range_string' struct, the 'strings' field
would be set to 'RVALS(rvalstringname)'. Furthermore, the 'display' field must
be ORed with 'BASE_RANGE_STRING' (e.g. BASE_DEC|BASE_RANGE_STRING).
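
Because entries are tested in order, the position of a catch-all range
matters. The sketch below uses a stand-in 'toy_range_string' type that
mirrors the struct above; 'toy_range_lookup' and 'toy_port_classes' are
hypothetical and exist only to show the ordering rule:

```c
#include <stddef.h>
#include <stdint.h>

/* Stand-in mirroring the range_string layout. */
typedef struct {
    uint32_t    value_min;
    uint32_t    value_max;
    const char *strptr;
} toy_range_string;

/* Hypothetical lookup: the first entry whose range contains 'val' wins. */
static const char *toy_range_lookup(uint32_t val, const toy_range_string *rs)
{
    for (size_t i = 0; rs[i].strptr != NULL; i++) {
        if (val >= rs[i].value_min && val <= rs[i].value_max)
            return rs[i].strptr;
    }
    return NULL;
}

static const toy_range_string toy_port_classes[] = {
    { 80,   80,    "HTTP"       },   /* specific entry first */
    { 0,    1023,  "Well-known" },   /* catch-all for the low range */
    { 1024, 65535, "Other"      },
    { 0,    0,     NULL         }
};
```

If the "Well-known" entry were listed before the "HTTP" entry, a value of
80 would match the broader range first and the specific label would never
be used.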
1205
1206-- Booleans
1207FT_BOOLEANs have a default map of 0 = "False", 1 (or anything else) = "True".
1208Sometimes it is useful to change the labels for boolean values (e.g.,
1209to "Yes"/"No", "Fast"/"Slow", etc.). For these mappings, a struct called
1210true_false_string is used.
1211
1212    typedef struct true_false_string {
1213        char    *true_string;
1214        char    *false_string;
1215    } true_false_string;
1216
For Boolean fields for which "False" and "True" aren't the desired
labels, you would declare a "true_false_string":
1219
1220    static const true_false_string boolstringname = {
1221        "String for True",
1222        "String for False"
1223    };
1224
1225Its two fields are pointers to the string representing truth, and the
1226string representing falsehood. For FT_BOOLEAN fields that need a
1227'true_false_string' struct, the 'strings' field would be set to
1228'TFS(&boolstringname)'.
1229
1230If the Boolean field is to be displayed as "False" or "True", the
1231'strings' field would be set to NULL.
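
The label selection for Boolean fields amounts to the following sketch.
The 'toy_true_false_string' type mirrors the struct above, and
'toy_bool_label' and 'toy_tfs_yes_no' are hypothetical names; a NULL
pointer models a 'strings' field that was left as NULL:

```c
#include <stddef.h>
#include <stdint.h>

/* Stand-in mirroring the true_false_string layout. */
typedef struct {
    const char *true_string;
    const char *false_string;
} toy_true_false_string;

/* Hypothetical helper: a NULL 'tfs' selects the default "True"/"False" labels. */
static const char *toy_bool_label(uint64_t value, const toy_true_false_string *tfs)
{
    if (tfs == NULL)
        return value ? "True" : "False";
    /* 0 means false; any other value means true. */
    return value ? tfs->true_string : tfs->false_string;
}

static const toy_true_false_string toy_tfs_yes_no = { "Yes", "No" };
```

Note the order of the initializers: the string for truth comes first, which
is easy to get backwards when writing a new true_false_string by hand.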
1232
1233Wireshark predefines a whole range of ready made "true_false_string"s
1234in tfs.h, included via packet.h.
1235
1236-- Custom
1237Custom fields (BASE_CUSTOM) should use CF_FUNC(&custom_format_func) for the
1238'strings' field.
1239
1240-- Note to plugin authors
1241Data cannot get exported from DLLs. For this reason plugin authors cannot use
1242existing fieldconvert strings (e.g. from existing dissectors or those from
1243epan/unit_strings.h). Plugins must define value_strings, unit_name_strings,
1244range_strings and true_false_strings locally.
1245
1246bitmask (BITMASK)
1247-----------------
1248If the field is a bitfield, then the bitmask is the mask which will
1249leave only the bits needed to make the field when ANDed with a value.
1250The proto_tree routines will calculate 'bitshift' automatically
1251from 'bitmask', by finding the rightmost set bit in the bitmask.
1252This shift is applied before applying string mapping functions or
1253filtering.
1254
1255If the field is not a bitfield, then bitmask should be set to 0.
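
The mask-and-shift step can be pictured with the standalone sketch below.
It is not the code used by the proto_tree routines; mask_to_shift and
extract_bitfield are hypothetical names chosen for this example.

```c
#include <stdint.h>

/* Hypothetical helper: count the zero bits below the rightmost set bit
 * of the mask. A mask of 0 means "not a bitfield", so no shift. */
unsigned
mask_to_shift(uint64_t bitmask)
{
    unsigned shift = 0;

    if (bitmask == 0)
        return 0;
    while ((bitmask & 1) == 0) {
        bitmask >>= 1;
        shift++;
    }
    return shift;
}

/* Hypothetical helper: AND with the mask, then shift the result down so
 * that the field's least significant bit ends up in bit 0. */
uint64_t
extract_bitfield(uint64_t raw, uint64_t bitmask)
{
    if (bitmask == 0)
        return raw;
    return (raw & bitmask) >> mask_to_shift(bitmask);
}
```

With a mask of 0xf0 the shift is 4, so a raw byte of 0x3c yields the field
value 0x3.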

blurb (FIELDDESCR)
------------------
This is a string giving a proper description of the field. It should be
at least one grammatically complete sentence, or NULL in which case the
name field is used. (Please do not use "").

It is meant to provide a more detailed description of the field than the
name alone provides. This information will be used in the man page, and
in a future GUI display-filter creation tool. We might also add tooltips
to the labels in the GUI protocol tree, in which case the blurb would
be used as the tooltip text.


1.5.1 Field Registration.

Field registration is handled by creating an instance of the
hf_register_info struct (or an array of such structs), each of which
wraps a header_field_info struct, and calling the registration function
along with the registration ID of the protocol that is the parent of the
fields. Here is a complete example:

    static int proto_eg = -1;
    static int hf_field_a = -1;
    static int hf_field_b = -1;

    static hf_register_info hf[] = {

        { &hf_field_a,
        { "Field A", "proto.field_a", FT_UINT8, BASE_HEX, NULL,
            0xf0, "Field A represents Apples", HFILL }},

        { &hf_field_b,
        { "Field B", "proto.field_b", FT_UINT16, BASE_DEC, VALS(vs),
            0x0, "Field B represents Bananas", HFILL }}
    };

    proto_eg = proto_register_protocol("Example Protocol",
        "PROTO", "proto");
    proto_register_field_array(proto_eg, hf, array_length(hf));

Be sure that your array of hf_register_info structs is declared 'static',
since the proto_register_field_array() function does not create a copy
of the information in the array; it uses the static copy of the
information that the compiler created inside your array. Here's the
layout of the hf_register_info struct:

    typedef struct hf_register_info {
        int            *p_id; /* pointer to parent variable */
        header_field_info hfinfo;
    } hf_register_info;

Also be sure to use the handy array_length() macro found in packet.h
to have the compiler compute the array length for you at compile time.
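
For readers unfamiliar with the idiom, a macro of this kind is typically
built on sizeof. The sketch below uses a made-up name,
ARRAY_LENGTH_SKETCH, and is only an assumption about the general
technique, not a copy of the definition in packet.h.

```c
#include <stddef.h>

/* Hypothetical stand-in for array_length(): total size divided by the
 * size of one element. This works on true arrays only, not on pointers. */
#define ARRAY_LENGTH_SKETCH(x) (sizeof (x) / sizeof (x)[0])

/* Example data, invented for this sketch. */
const int example_ids[] = { 10, 20, 30 };

/* The count is computed by the compiler; adding an entry to the array
 * above changes the result without any other edit. */
size_t
example_id_count(void)
{
    return ARRAY_LENGTH_SKETCH(example_ids);
}
```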

If you don't have any fields to register, do *NOT* create a zero-length
"hf" array; not all compilers used to compile Wireshark support them.
Just omit the "hf" array, and the "proto_register_field_array()" call,
entirely.

It is OK to have header fields with a different format be registered with
the same abbreviation. For instance, the following is valid:

    static hf_register_info hf[] = {

        { &hf_field_8bit, /* 8-bit version of proto.field */
        { "Field (8 bit)", "proto.field", FT_UINT8, BASE_DEC, NULL,
            0x00, "Field represents FOO", HFILL }},

        { &hf_field_32bit, /* 32-bit version of proto.field */
        { "Field (32 bit)", "proto.field", FT_UINT32, BASE_DEC, NULL,
            0x00, "Field represents FOO", HFILL }}
    };

This way a filter expression can match a header field, irrespective of the
representation of it in the specific protocol context. This is useful
for protocols with variable-width header fields.

Note that the formats used must all belong to the same group as defined below:
- FT_INT8, FT_INT16, FT_INT24 and FT_INT32
- FT_CHAR, FT_UINT8, FT_UINT16, FT_UINT24, FT_UINT32, FT_IPXNET and FT_FRAMENUM
- FT_INT40, FT_INT48, FT_INT56 and FT_INT64
- FT_UINT40, FT_UINT48, FT_UINT56, FT_UINT64 and FT_EUI64
- FT_ABSOLUTE_TIME and FT_RELATIVE_TIME
- FT_STRING, FT_STRINGZ, FT_UINT_STRING, FT_STRINGZPAD, and FT_STRINGZTRUNC
- FT_FLOAT and FT_DOUBLE
- FT_BYTES, FT_UINT_BYTES, FT_ETHER, FT_AX25, FT_VINES and FT_FCWWN
- FT_OID, FT_REL_OID and FT_SYSTEM_ID

Any field type not in a group above should *NOT* be used in duplicate field
abbreviations. The current code does not prevent it, but someday in the future
it might.

The HFILL macro at the end of the struct will set reasonable default values
for internally used fields.

1.5.2 Adding Items and Values to the Protocol Tree.

A protocol item is added to an existing protocol tree with one of a
handful of proto_XXX_DO_YYY() functions.

Subtrees can be made with the proto_item_add_subtree() function:

    item = proto_tree_add_item(....);
    new_tree = proto_item_add_subtree(item, tree_type);

This will add a subtree under the item in question; a subtree can be
created under an item made by any of the "proto_tree_add_XXX" functions,
so that the tree can be given an arbitrary depth.

Subtree types are integers, assigned by
"proto_register_subtree_array()". To register subtree types, pass an
array of pointers to "gint" variables to hold the subtree type values to
"proto_register_subtree_array()":

    static gint ett_eg = -1;
    static gint ett_field_a = -1;

    static gint *ett[] = {
        &ett_eg,
        &ett_field_a
    };

    proto_register_subtree_array(ett, array_length(ett));

in your "register" routine, just as you register the protocol and the
fields for that protocol.

The ett_ variables identify a particular type of subtree so that if you
expand one of them, Wireshark keeps track of that and, when you click on
another packet, it automatically opens all subtrees of that type.
If you close one of them, all subtrees of that type will be closed when
you move to another packet.

There are many functions that the programmer can use to add either
protocol or field labels to the proto_tree, for example:

    proto_item*
    proto_tree_add_item(tree, id, tvb, start, length, encoding);

    proto_item*
    proto_tree_add_item_ret_int(tree, id, tvb, start, length, encoding,
        *retval);

    proto_tree*
    proto_tree_add_subtree(tree, tvb, start, length, idx, tree_item,
        text);

    proto_item *
    proto_tree_add_int_format_value(tree, id, tvb, start, length,
        value, format, ...);

    proto_item *
    proto_tree_add_checksum(proto_tree *tree, tvbuff_t *tvb, const guint offset,
        const int hf_checksum, const int hf_checksum_status,
        struct expert_field* bad_checksum_expert, packet_info *pinfo,
        guint32 computed_checksum, const guint encoding, const guint flags);

    proto_item *
    proto_tree_add_bitmask(tree, tvb, start, header, ett, fields,
        encoding);

    proto_item *
    proto_tree_add_bits_item(tree, id, tvb, bit_offset, no_of_bits,
        encoding);

The 'tree' argument is the tree to which the item is to be added. The
'tvb' argument is the tvbuff from which the item's value is being
extracted; the 'start' argument is the offset from the beginning of that
tvbuff of the item being added, and the 'length' argument is the length,
in bytes, of the item. For the bit-oriented functions, 'bit_offset' is
the offset in bits and 'no_of_bits' is the length in bits.

The length of some items cannot be determined until the item has been
dissected; to add such an item, add it with a length of -1, and, when the
dissection is complete, set the length with 'proto_item_set_len()':

    void
    proto_item_set_len(ti, length);

The "ti" argument is the value returned by the call that added the item
to the tree, and the "length" argument is the length of the item.

All available protocol tree functions are declared in epan/proto.h, with
their documentation. The details of these functions and their parameters
are described below.

proto_tree_add_item()
---------------------
proto_tree_add_item is used when you wish to do no special formatting.
The item added to the GUI tree will contain the name (as passed in the
proto_register_*() function) and a value. The value will be fetched
from the tvbuff by proto_tree_add_item(), based on the type of the field
and the encoding of the value as specified by the "encoding" argument.

For FT_NONE, FT_BYTES, FT_ETHER, FT_IPv6, FT_IPXNET, FT_OID, FT_REL_OID,
FT_AX25, FT_VINES, FT_SYSTEM_ID, FT_FCWWN fields, and 'protocol' fields
the encoding is not relevant; the 'encoding' argument should be
ENC_NA (Not Applicable).

For FT_UINT_BYTES fields, the byte order of the count must be specified
as well as the 'encoding' for the bytes, which should be ENC_NA,
e.g. ENC_LITTLE_ENDIAN|ENC_NA.

For integral, floating-point, Boolean, FT_GUID, and FT_EUI64 fields,
the encoding specifies the byte order of the value; the 'encoding'
argument should be ENC_LITTLE_ENDIAN if the value is little-endian
and ENC_BIG_ENDIAN if it is big-endian.
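
To make the difference concrete, the standalone sketch below decodes the
same bytes both ways. read_uint_be and read_uint_le are hypothetical
helpers written for this example; they are not Wireshark functions and do
not use the tvbuff API.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: interpret up to 4 bytes as a big-endian
 * (network order) unsigned integer; the first byte is most significant. */
uint32_t
read_uint_be(const uint8_t *p, size_t len)
{
    uint32_t v = 0;
    size_t i;

    for (i = 0; i < len && i < 4; i++)
        v = (v << 8) | p[i];
    return v;
}

/* Hypothetical helper: interpret up to 4 bytes as a little-endian
 * unsigned integer; the first byte is least significant. */
uint32_t
read_uint_le(const uint8_t *p, size_t len)
{
    uint32_t v = 0;
    size_t i;

    for (i = 0; i < len && i < 4; i++)
        v |= (uint32_t)p[i] << (8 * i);
    return v;
}
```

The two bytes 0x12 0x34 decode to 0x1234 when treated as big-endian and
to 0x3412 when treated as little-endian, which is why the wrong
'encoding' argument gives a wrong displayed value.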

For FT_IPv4 fields, the encoding also specifies the byte order of the
value. In almost all cases, the encoding is in network byte order,
hence big-endian, but in at least one protocol dissected by Wireshark,
at least one IPv4 address is byte-swapped, so it's in little-endian
order.

For string fields, the encoding specifies the character set used for the
string and the way individual code points in that character set are
encoded. For FT_UINT_STRING fields, the byte order of the count must be
specified; for UCS-2 and UTF-16, the byte order of the encoding must be
specified (for counted UCS-2 and UTF-16 strings, the byte order of the
count and the 16-bit values in the string must be the same). In other
cases, ENC_NA should be used. The character encodings that are
currently supported are:

    ENC_ASCII - ASCII (currently treated as UTF-8; in the future,
        all bytes with the 8th bit set will be treated as
        errors)
    ENC_UTF_8 - UTF-8-encoded Unicode
    ENC_UTF_16 - UTF-16-encoded Unicode, with surrogate pairs
    ENC_UCS_2 - UCS-2-encoded subset of Unicode, with no surrogate pairs
        and thus no code points above 0xFFFF
    ENC_UCS_4 - UCS-4-encoded Unicode
    ENC_WINDOWS_1250 - Windows-1250 code page
    ENC_WINDOWS_1251 - Windows-1251 code page
    ENC_WINDOWS_1252 - Windows-1252 code page
    ENC_ISO_646_BASIC - ISO 646 "basic code table"
    ENC_ISO_8859_1 - ISO 8859-1
    ENC_ISO_8859_2 - ISO 8859-2
    ENC_ISO_8859_3 - ISO 8859-3
    ENC_ISO_8859_4 - ISO 8859-4
    ENC_ISO_8859_5 - ISO 8859-5
    ENC_ISO_8859_6 - ISO 8859-6
    ENC_ISO_8859_7 - ISO 8859-7
    ENC_ISO_8859_8 - ISO 8859-8
    ENC_ISO_8859_9 - ISO 8859-9
    ENC_ISO_8859_10 - ISO 8859-10
    ENC_ISO_8859_11 - ISO 8859-11
    ENC_ISO_8859_13 - ISO 8859-13
    ENC_ISO_8859_14 - ISO 8859-14
    ENC_ISO_8859_15 - ISO 8859-15
    ENC_ISO_8859_16 - ISO 8859-16
    ENC_3GPP_TS_23_038_7BITS - GSM 7 bits alphabet as described
        in 3GPP TS 23.038
    ENC_3GPP_TS_23_038_7BITS_UNPACKED - GSM 7 bits alphabet where each
        7 bit character occupies a distinct octet
    ENC_ETSI_TS_102_221_ANNEX_A - Coding scheme for SIM cards with GSM 7 bit
        alphabet, UCS-2 characters, or a mixture of the two as described
        in ETSI TS 102 221 Annex A
    ENC_EBCDIC - EBCDIC
    ENC_EBCDIC_CP037 - EBCDIC code page 037
    ENC_MAC_ROMAN - MAC ROMAN
    ENC_CP437 - DOS code page 437
    ENC_CP855 - DOS code page 855
    ENC_CP866 - DOS code page 866
    ENC_ASCII_7BITS - 7 bits ASCII
    ENC_T61 - ITU T.61
    ENC_BCD_DIGITS_0_9 - packed BCD (one digit per nibble), digits 0-9
    ENC_KEYPAD_ABC_TBCD - keypad-with-a/b/c "telephony packed BCD" = 0-9, *, #, a, b, c
    ENC_KEYPAD_BC_TBCD - keypad-with-B/C "telephony packed BCD" = 0-9, B, C, *, #
    ENC_GB18030 - GB 18030
    ENC_EUC_KR - EUC-KR

Other encodings will be added in the future.

For FT_ABSOLUTE_TIME fields, the encoding specifies the form in which
the time stamp is specified, as well as its byte order. The time stamp
encodings that are currently supported are:

    ENC_TIME_SECS_NSECS - 8, 12, or 16 bytes. For 8 bytes, the first 4
        bytes are seconds and the next 4 bytes are nanoseconds; for 12
        bytes, the first 8 bytes are seconds and the next 4 bytes are
        nanoseconds; for 16 bytes, the first 8 bytes are seconds and
        the next 8 bytes are nanoseconds. The seconds are seconds
        since the UN*X epoch (1970-01-01 00:00:00 UTC). (I.e., a UN*X
        struct timespec with a 4-byte or 8-byte time_t or a structure
        with an 8-byte time_t and an 8-byte nanoseconds field.)

    ENC_TIME_NTP - 8 bytes; the first 4 bytes are seconds since the NTP
        epoch (1900-01-01 00:00:00 GMT) and the next 4 bytes are 1/2^32's of
        a second since that second. (I.e., a 64-bit count of 1/2^32's of a
        second since the NTP epoch, with the upper 32 bits first and the
        lower 32 bits second, even when little-endian.)

    ENC_TIME_TOD - 8 bytes, as a count of microseconds since the System/3x0
        and z/Architecture epoch (1900-01-01 00:00:00 GMT).

    ENC_TIME_RTPS - 8 bytes; the first 4 bytes are seconds since the UN*X
        epoch and the next 4 bytes are 1/2^32's of a second since that
        second. (I.e., it's the offspring of a mating between UN*X time and
        NTP time). It's used by the Object Management Group's Real-Time
        Publish-Subscribe Wire Protocol for the Data Distribution Service.

    ENC_TIME_SECS_USECS - 8 bytes; the first 4 bytes are seconds since the
        UN*X epoch and the next 4 bytes are microseconds since that
        second. (I.e., a UN*X struct timeval with a 4-byte time_t.)

    ENC_TIME_SECS - 4 to 8 bytes, representing a value in seconds since
        the UN*X epoch.

    ENC_TIME_MSECS - 6 to 8 bytes, representing a value in milliseconds
        since the UN*X epoch.

    ENC_TIME_NSECS - 8 bytes, representing a value in nanoseconds since
        the UN*X epoch.

    ENC_TIME_SECS_NTP - 4 bytes, representing a count of seconds since
        the NTP epoch.

    ENC_TIME_RFC_3971 - 8 bytes, representing a count of 1/64ths of a
        second since the UN*X epoch; see section 5.3.1 "Timestamp Option"
        in RFC 3971.

    ENC_TIME_MSEC_NTP - 4-8 bytes, representing a count of milliseconds since
        the NTP epoch.

    ENC_TIME_MIP6 - 8 bytes; the first 48 bits are seconds since the UN*X
        epoch and the remaining 16 bits indicate the number of 1/65536's of
        a second since that second.

    ENC_TIME_CLASSIC_MAC_OS_SECS - 4-8 bytes, representing a count of seconds
        since January 1, 1904, 00:00:00 UTC.
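
Several of the formats above store the fractional second as a binary
fraction rather than as nanoseconds. The standalone sketch below shows
the arithmetic for the NTP layout (32-bit seconds plus 1/2^32's) and for
the 48-bit seconds plus 1/65536's layout. All names are invented for this
example, the code does not reflect how Wireshark implements these
encodings, and NTP era rollover is deliberately ignored.

```c
#include <stdint.h>

/* Seconds between the NTP epoch (1900-01-01) and the UN*X epoch
 * (1970-01-01). */
#define NTP_TO_UNIX_OFFSET_SKETCH INT64_C(2208988800)

/* Local stand-in for a seconds/nanoseconds pair. */
typedef struct {
    int64_t secs;
    int32_t nsecs;
} time_sketch;

/* NTP layout: scale the 32-bit fraction from 1/2^32's of a second to
 * nanoseconds, and rebase the seconds onto the UN*X epoch. */
time_sketch
ntp_to_unix_sketch(uint32_t ntp_secs, uint32_t ntp_frac)
{
    time_sketch t;

    t.secs = (int64_t)ntp_secs - NTP_TO_UNIX_OFFSET_SKETCH;
    t.nsecs = (int32_t)(((uint64_t)ntp_frac * 1000000000u) >> 32);
    return t;
}

/* 48.16 fixed-point layout: upper 48 bits are seconds since the UN*X
 * epoch, lower 16 bits are 1/65536's of a second. */
time_sketch
fixed_48_16_to_unix_sketch(uint64_t value)
{
    time_sketch t;

    t.secs = (int64_t)(value >> 16);
    t.nsecs = (int32_t)(((value & 0xffff) * 1000000000u) >> 16);
    return t;
}
```

In both layouts a fraction with only its top bit set is exactly half a
second, i.e. 500000000 nanoseconds.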

For FT_RELATIVE_TIME fields, the encoding specifies the form in which
the time stamp is specified, as well as its byte order. The time stamp
encodings that are currently supported are:

    ENC_TIME_SECS_NSECS - 8, 12, or 16 bytes. For 8 bytes, the first 4
        bytes are seconds and the next 4 bytes are nanoseconds; for 12
        bytes, the first 8 bytes are seconds and the next 4 bytes are
        nanoseconds; for 16 bytes, the first 8 bytes are seconds and
        the next 8 bytes are nanoseconds.

    ENC_TIME_SECS_USECS - 8 bytes; the first 4 bytes are seconds and the
        next 4 bytes are microseconds.

    ENC_TIME_SECS - 4 to 8 bytes, representing a value in seconds.

    ENC_TIME_MSECS - 6 to 8 bytes, representing a value in milliseconds.

    ENC_TIME_NSECS - 8 bytes, representing a value in nanoseconds.

For other types, there is no support for proto_tree_add_item().

Now that definitions of fields have detailed information about bitfield
fields, you can use proto_tree_add_item() with no extra processing to
add bitfield values to your tree. Here's an example. Take the Format
Identifier (FID) field in the Transmission Header (TH) portion of the SNA
protocol. The FID is the high nibble of the first byte of the TH. The
FID would be registered like this:

    name        = "Format Identifier"
    abbrev      = "sna.th.fid"
    type        = FT_UINT8
    display     = BASE_HEX
    strings     = sna_th_fid_vals
    bitmask     = 0xf0

The bitmask contains the value which would leave only the FID if bitwise-ANDed
against the parent field, the first byte of the TH.

The code to add the FID to the tree would be:

    proto_tree_add_item(bf_tree, hf_sna_th_fid, tvb, offset, 1,
        ENC_BIG_ENDIAN);

The definition of the field already has the information about bitmasking
and bitshifting, so it does the work of masking and shifting for us!
This also means that you no longer have to create value_string structs
with the values bitshifted. The value_string for FID looks like this,
even though the FID value is actually contained in the high nibble.
(Without the automatic shift, you'd expect the values to be 0x0, 0x10,
0x20, etc.)

    /* Format Identifier */
    static const value_string sna_th_fid_vals[] = {
        { 0x0, "SNA device <--> Non-SNA Device" },
        { 0x1, "Subarea Node <--> Subarea Node" },
        { 0x2, "Subarea Node <--> PU2" },
        { 0x3, "Subarea Node or SNA host <--> Subarea Node" },
        { 0x4, "?" },
        { 0x5, "?" },
        { 0xf, "Adjacent Subarea Nodes" },
        { 0, NULL }
    };

The final implication of this is that display filters work the way you'd
naturally expect them to. You'd type "sna.th.fid == 0xf" to find Adjacent
Subarea Nodes. The user does not have to shift the value of the FID to
the high nibble of the byte ("sna.th.fid == 0xf0") as was necessary
in the past.
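
The effect can be reproduced outside Wireshark with the standalone sketch
below, which masks and shifts the first TH byte and then looks the result
up in a table keyed by the unshifted values. value_string_sketch,
lookup_sketch and fid_label_sketch are stand-ins invented for this
example; the real value_string helpers live in epan and are not used
here.

```c
#include <stddef.h>
#include <stdint.h>

/* Local stand-in for a value_string entry. */
typedef struct {
    uint32_t    value;
    const char *strptr;
} value_string_sketch;

/* A shortened copy of the FID table: keys are the shifted-down values. */
const value_string_sketch fid_vals_sketch[] = {
    { 0x0, "SNA device <--> Non-SNA Device" },
    { 0x1, "Subarea Node <--> Subarea Node" },
    { 0x2, "Subarea Node <--> PU2" },
    { 0xf, "Adjacent Subarea Nodes" },
    { 0, NULL }
};

/* Hypothetical helper: linear search, stopping at the NULL string that
 * terminates the table. Returns NULL when the value is not listed. */
const char *
lookup_sketch(uint32_t value, const value_string_sketch *vs)
{
    for (; vs->strptr != NULL; vs++) {
        if (vs->value == value)
            return vs->strptr;
    }
    return NULL;
}

/* Mask with 0xf0, shift down by 4, then map the result to a label. */
const char *
fid_label_sketch(uint8_t th_first_byte)
{
    uint32_t fid = (th_first_byte & 0xf0u) >> 4;

    return lookup_sketch(fid, fid_vals_sketch);
}
```

A first byte of 0xf3 gives a FID of 0xf and therefore the label
"Adjacent Subarea Nodes", the same value a user would type in the
display filter.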

proto_tree_add_item_ret_XXX()
-----------------------------
proto_tree_add_item_ret_XXX is used when you want the displayed value
returned for further processing. Only integer and unsigned integer types
up to 32 bits are supported; use of the proper FT_ type is checked.

proto_tree_add_XXX_item()
-------------------------
proto_tree_add_XXX_item is used when you wish to do no special formatting,
but also either wish for the retrieved value from the tvbuff to be handed
back (to avoid doing tvb_get_...), and/or wish to have the value be decoded
from the tvbuff in a string-encoded format.

The item added to the GUI tree will contain the name (as passed in the
proto_register_*() function) and a value. The value will be fetched
from the tvbuff, based on the type of the XXX name and the encoding of
the value as specified by the "encoding" argument.

This function retrieves the value even if the passed-in tree param is NULL,
so that it can be used by dissectors at all times to both get the value
and set the tree item to it.

Like other proto_tree_add functions, if there is a tree and the value cannot
be decoded from the tvbuff, then an expert info error is reported. For string
encoding, this means that a failure to decode the hex value from the string
results in an expert info error being added to the tree.

For string-decoding, the passed-in encoding argument needs to specify the
string encoding (e.g., ENC_ASCII, ENC_UTF_8) as well as the format. For
some XXX types, the format is constrained: for example, the encoding format
for proto_tree_add_time_item() can only be one of the ENC_ISO_8601_* ones
or ENC_RFC_822 or ENC_RFC_1123. For proto_tree_add_bytes_item() it can only
be ENC_STR_HEX bit-or'ed with one or more of the ENC_SEP_* separator types.

proto_tree_add_protocol_format()
--------------------------------
proto_tree_add_protocol_format is used to add the top-level item for the
protocol when the dissector routine wants complete control over how the
field and value will be represented on the GUI tree. The ID value for
the protocol is passed in as the "id" argument; the rest of the
arguments are a "printf"-style format and any arguments for that format.
The caller must include the name of the protocol in the format; it is
not added automatically as in proto_tree_add_item().

proto_tree_add_none_format()
----------------------------
proto_tree_add_none_format is used to add an item of type FT_NONE.
The caller must include the name of the field in the format; it is
not added automatically as in proto_tree_add_item().

proto_tree_add_bytes()
proto_tree_add_time()
proto_tree_add_ipxnet()
proto_tree_add_ipv4()
proto_tree_add_ipv6()
proto_tree_add_ether()
proto_tree_add_string()
proto_tree_add_boolean()
proto_tree_add_float()
proto_tree_add_double()
proto_tree_add_uint()
proto_tree_add_uint64()
proto_tree_add_int()
proto_tree_add_int64()
proto_tree_add_guid()
proto_tree_add_oid()
proto_tree_add_eui64()
----------------------
These routines are used to add items to the protocol tree if either:

    the value of the item to be added isn't just extracted from the
    packet data, but is computed from data in the packet; or

    the value was fetched into a variable.

The 'value' argument has the value to be added to the tree.

NOTE: in all cases where the 'value' argument is a pointer, a copy is
made of the object pointed to; if you have dynamically allocated a
buffer for the object, that buffer will not be freed when the protocol
tree is freed - you must free the buffer yourself when you don't need it
any more.

For proto_tree_add_bytes(), the 'value_ptr' argument is a pointer to a
sequence of bytes.

proto_tree_add_bytes_with_length() is similar to proto_tree_add_bytes(),
except that the length is not derived from the tvb length. Instead,
the displayed data size is controlled by 'ptr_length'.

For proto_tree_add_bytes_format() and proto_tree_add_bytes_format_value(), the
'value_ptr' argument is a pointer to a sequence of bytes or NULL if the bytes
should be taken from the given TVB using the given offset and length.

For proto_tree_add_time(), the 'value_ptr' argument is a pointer to an
"nstime_t", which is a structure containing the time to be added; it has
'secs' and 'nsecs' members, giving the integral part and the fractional
part of a time in units of seconds, with 'nsecs' being the number of
nanoseconds. For absolute times, "secs" is a UNIX-style seconds since
January 1, 1970, 00:00:00 GMT value.

For proto_tree_add_ipxnet(), the 'value' argument is a 32-bit IPX
network address.

For proto_tree_add_ipv4(), the 'value' argument is a 32-bit IPv4
address, in network byte order.

For proto_tree_add_ipv6(), the 'value_ptr' argument is a pointer to a
128-bit IPv6 address.

For proto_tree_add_ether(), the 'value_ptr' argument is a pointer to a
48-bit MAC address.

For proto_tree_add_string(), the 'value_ptr' argument is a pointer to a
text string; this string must be NUL-terminated even if the string in the
TVB is not (as may be the case with FT_STRINGs).

For proto_tree_add_boolean(), the 'value' argument is a 32-bit integer.
It is masked and shifted as defined by the field info, after which zero
means "false", and non-zero means "true".

For proto_tree_add_float(), the 'value' argument is a 'float' in the
host's floating-point format.

For proto_tree_add_double(), the 'value' argument is a 'double' in the
host's floating-point format.

For proto_tree_add_uint(), the 'value' argument is a 32-bit unsigned
integer value, in host byte order. (This routine cannot be used to add
64-bit integers.)

For proto_tree_add_uint64(), the 'value' argument is a 64-bit unsigned
integer value, in host byte order.

For proto_tree_add_int(), the 'value' argument is a 32-bit signed
integer value, in host byte order. (This routine cannot be used to add
64-bit integers.)

For proto_tree_add_int64(), the 'value' argument is a 64-bit signed
integer value, in host byte order.

For proto_tree_add_guid(), the 'value_ptr' argument is a pointer to an
e_guid_t structure.

For proto_tree_add_oid(), the 'value_ptr' argument is a pointer to an
ASN.1 Object Identifier.

For proto_tree_add_eui64(), the 'value' argument is a 64-bit integer
value.

proto_tree_add_bytes_format()
proto_tree_add_time_format()
proto_tree_add_ipxnet_format()
proto_tree_add_ipv4_format()
proto_tree_add_ipv6_format()
proto_tree_add_ether_format()
proto_tree_add_string_format()
proto_tree_add_boolean_format()
proto_tree_add_float_format()
proto_tree_add_double_format()
proto_tree_add_uint_format()
proto_tree_add_uint64_format()
proto_tree_add_int_format()
proto_tree_add_int64_format()
proto_tree_add_guid_format()
proto_tree_add_oid_format()
proto_tree_add_eui64_format()
-----------------------------
These routines are used to add items to the protocol tree when the
dissector routine wants complete control over how the field and value
will be represented on the GUI tree. The argument giving the value is
the same as the corresponding proto_tree_add_XXX() function; the rest of
the arguments are a "printf"-style format and any arguments for that
format. The caller must include the name of the field in the format; it
is not added automatically as in the proto_tree_add_XXX() functions.

proto_tree_add_bytes_format_value()
proto_tree_add_time_format_value()
proto_tree_add_ipxnet_format_value()
proto_tree_add_ipv4_format_value()
proto_tree_add_ipv6_format_value()
proto_tree_add_ether_format_value()
proto_tree_add_string_format_value()
proto_tree_add_boolean_format_value()
proto_tree_add_float_format_value()
proto_tree_add_double_format_value()
proto_tree_add_uint_format_value()
proto_tree_add_uint64_format_value()
proto_tree_add_int_format_value()
proto_tree_add_int64_format_value()
proto_tree_add_guid_format_value()
proto_tree_add_oid_format_value()
proto_tree_add_eui64_format_value()
-----------------------------------
These routines are used to add items to the protocol tree when the
dissector routine wants complete control over how the value will be
represented on the GUI tree. The argument giving the value is the same
as the corresponding proto_tree_add_XXX() function; the rest of the
arguments are a "printf"-style format and any arguments for that format.
With these routines, unlike the proto_tree_add_XXX_format() routines,
the name of the field is added automatically as in the
proto_tree_add_XXX() functions; only the value is added with the format.
One use case for this would be to add a unit of measurement string to
the value of the field; however, using BASE_UNIT_STRING in the hf_
definition is now preferred.

proto_tree_add_checksum()
-------------------------
proto_tree_add_checksum is used to add a checksum field. The hf field
provided must be the correct size of the checksum (FT_UINT8, FT_UINT16,
FT_UINT32, etc). Additional parameters are there to provide "status"
and expert info depending on whether the checksum matches the provided
value. The "status" and expert info can be used in all cases except
where PROTO_CHECKSUM_NO_FLAGS is used.

proto_tree_add_subtree()
------------------------
proto_tree_add_subtree() is used to add a label to the GUI tree and create
a subtree for other fields. It will contain no value, so it is not searchable
in the display filter process.

This should only be used for items with subtrees, which may not
have values themselves - the items in the subtree are the ones with values.

For a subtree, the label on the subtree might reflect some of the items
in the subtree. This means the label can't be set until at least some
of the items in the subtree have been dissected. To do this, use
'proto_item_set_text()' or 'proto_item_append_text()':

    void
    proto_item_set_text(proto_item *ti, const char *format, ...);

    void
    proto_item_append_text(proto_item *ti, const char *format, ...);

'proto_item_set_text()' takes as arguments the proto_item value returned
through the 'tree_item' parameter of 'proto_tree_add_subtree()', a
'printf'-style format string, and a set of arguments corresponding to '%'
format items in that string, and replaces the text for the item created by
'proto_tree_add_subtree()' with the result of applying the arguments to the
format string.

'proto_item_append_text()' is similar, but it appends to the text for
the item the result of applying the arguments to the format string.

For example, early in the dissection, one might do:

    subtree = proto_tree_add_subtree(tree, tvb, offset, length, ett, &ti, <label>);

and later do

    proto_item_set_text(ti, "%s: %s", type, value);

after the "type" and "value" fields have been extracted and dissected.
<label> would be a label giving what information about the subtree is
available without dissecting any of the data in the subtree.

Note that an exception might be thrown when trying to extract the values of
the items used to set the label, if not all the bytes of the item are
available. Thus, one should create the item with text that is as
meaningful as possible, and set it or append additional information to
it as the values needed to supply that information are extracted.

proto_tree_add_subtree_format()
-------------------------------
This is like proto_tree_add_subtree(), but uses printf-style arguments to
create the label; it is used to allow routines that take a printf-like
variable-length list of arguments to add a text item to the protocol
tree.

proto_tree_add_bits_item()
--------------------------
Adds a number of bits to the protocol tree; the bits do not have to be
byte-aligned. The offset and length are in bits.
Output format:

..10 1010 10.. .... "value" (formatted as FT_ indicates).
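
The layout of that bit pattern can be mimicked with the standalone sketch
below. It is only an illustration of the output format shown above, not
the code Wireshark uses; format_bits_sketch is a hypothetical name, and
the sketch assumes that bits are numbered from the most significant bit
of the first octet.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: render the octets that contain the field, showing
 * field bits as '0' or '1' and all other bits as '.', with a space after
 * every 4 bits. Returns 0 on success, -1 if 'out' is too small or the
 * field is empty. */
int
format_bits_sketch(const uint8_t *data, unsigned bit_offset,
                   unsigned no_of_bits, char *out, size_t out_len)
{
    unsigned first, last, bit;
    size_t pos = 0;

    if (no_of_bits == 0)
        return -1;

    /* First bit of the first octet and last bit of the last octet. */
    first = (bit_offset / 8) * 8;
    last = ((bit_offset + no_of_bits - 1) / 8) * 8 + 7;

    for (bit = first; bit <= last; bit++) {
        char c = '.';

        if (bit >= bit_offset && bit < bit_offset + no_of_bits)
            c = ((data[bit / 8] >> (7 - bit % 8)) & 1) ? '1' : '0';

        /* Separate nibbles with a space. */
        if (bit != first && bit % 4 == 0) {
            if (pos + 1 >= out_len)
                return -1;
            out[pos++] = ' ';
        }
        if (pos + 1 >= out_len)
            return -1;
        out[pos++] = c;
    }
    out[pos] = '\0';
    return 0;
}
```

For the two octets 0x2a 0x80 with a bit offset of 2 and a length of 8
bits, the sketch writes "..10 1010 10.. ....", the same pattern as in the
output format above.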

proto_tree_add_bits_ret_val()
-----------------------------
Works in the same way but also returns the value of the read bits.

proto_tree_add_split_bits_item_ret_val()
----------------------------------------
Similar, but is used for items that are made of 2 or more smaller sets of bits (crumbs)
which are not contiguous, but are concatenated to form the actual value. The size of
the crumbs and the order of assembly are specified in an array of crumb_spec structures.
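
The concatenation of crumbs can be sketched in standalone C as follows.
This is a simplified model, not the Wireshark implementation:
crumb_sketch, get_bits_sketch and assemble_crumbs_sketch are invented
names, offsets are taken as absolute bit positions counted from the most
significant bit of the first octet, the number of crumbs is passed
explicitly, and the first crumb is assumed to supply the most significant
bits of the result.

```c
#include <stddef.h>
#include <stdint.h>

/* Local stand-in describing one crumb: where it starts and how long it
 * is, both in bits. */
typedef struct {
    unsigned bit_offset;
    unsigned bit_length;
} crumb_sketch;

/* Hypothetical helper: read 'n' bits (n <= 64) starting at 'bit_offset',
 * most significant bit first. */
uint64_t
get_bits_sketch(const uint8_t *data, unsigned bit_offset, unsigned n)
{
    uint64_t v = 0;
    unsigned i;

    for (i = 0; i < n; i++) {
        unsigned bit = bit_offset + i;

        v = (v << 1) | ((data[bit / 8] >> (7 - bit % 8)) & 1u);
    }
    return v;
}

/* Concatenate the crumbs in array order. Assumes each crumb is shorter
 * than 64 bits and that the total length does not exceed 64 bits. */
uint64_t
assemble_crumbs_sketch(const uint8_t *data, const crumb_sketch *crumbs,
                       size_t n_crumbs)
{
    uint64_t value = 0;
    size_t i;

    for (i = 0; i < n_crumbs; i++) {
        value = (value << crumbs[i].bit_length) |
                get_bits_sketch(data, crumbs[i].bit_offset,
                                crumbs[i].bit_length);
    }
    return value;
}
```

With the octets 0xa5 0x3c, a 3-bit crumb at bit 0 (binary 101) followed
by a 4-bit crumb at bit 10 (binary 1111) assembles to binary 1011111,
i.e. 0x5f.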
1943
1944proto_tree_add_split_bits_crumb()
1945---------------------------------
1946Helper function for the above, to add text for each crumb as it is encountered.
1947
1948proto_tree_add_ts_23_038_7bits_item()
1949-------------------------------------
1950Adds a string of a given number of characters and encoded according to 3GPP TS 23.038 7 bits
1951alphabet.
1952
1953proto_tree_add_bitmask() et al.
1954-------------------------------
1955These functions provide easy to use and convenient dissection of many types of common
1956bitmasks into individual fields.
1957
'header' is an integer type and must be of type
FT_[U]INT{8|16|24|32|40|48|56|64}; it represents the entire dissectable
width of the bitmask.
1960
1961'header' and 'ett' are the hf fields and ett field respectively to create an
1962expansion that covers the bytes of the bitmask.
1963
1964'fields' is a NULL terminated array of pointers to hf fields representing
1965the individual subfields of the bitmask. These fields must either be integers
1966(usually of the same byte width as 'header') or of the type FT_BOOLEAN.
1967Each of the entries in 'fields' will be dissected as an item under the
1968'header' expansion and also IF the field is a boolean and IF it is set to 1,
1969then the name of that boolean field will be printed on the 'header' expansion
1970line. For integer type subfields that have a value_string defined, the
1971matched string from that value_string will be printed on the expansion line
1972as well.
1973
1974Example: (from the SCSI dissector)
1975    static int hf_scsi_inq_peripheral        = -1;
1976    static int hf_scsi_inq_qualifier         = -1;
1977    static int hf_scsi_inq_devtype           = -1;
1978    ...
1979    static gint ett_scsi_inq_peripheral = -1;
1980    ...
1981    static int * const peripheral_fields[] = {
1982        &hf_scsi_inq_qualifier,
1983        &hf_scsi_inq_devtype,
1984        NULL
1985    };
1986    ...
1987    /* Qualifier and DeviceType */
1988    proto_tree_add_bitmask(tree, tvb, offset, hf_scsi_inq_peripheral,
1989        ett_scsi_inq_peripheral, peripheral_fields, ENC_BIG_ENDIAN);
1990    offset+=1;
1991    ...
1992        { &hf_scsi_inq_peripheral,
1993          {"Peripheral", "scsi.inquiry.peripheral", FT_UINT8, BASE_HEX,
1994           NULL, 0, NULL, HFILL}},
1995        { &hf_scsi_inq_qualifier,
1996          {"Qualifier", "scsi.inquiry.qualifier", FT_UINT8, BASE_HEX,
1997           VALS (scsi_qualifier_val), 0xE0, NULL, HFILL}},
1998        { &hf_scsi_inq_devtype,
1999          {"Device Type", "scsi.inquiry.devtype", FT_UINT8, BASE_HEX,
2000           VALS (scsi_devtype_val), SCSI_DEV_BITS, NULL, HFILL}},
2001    ...
2002
This provides a very readable dissection of this one-byte bitmask:
2004
2005    Peripheral: 0x05, Qualifier: Device type is connected to logical unit, Device Type: CD-ROM
2006        000. .... = Qualifier: Device type is connected to logical unit (0x00)
2007        ...0 0101 = Device Type: CD-ROM (0x05)
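
The mask in each hf entry decides which bits a subfield covers. A minimal
stand-alone sketch of that extraction follows. The helper masked_field()
is hypothetical and this is not Wireshark's code; the value 0x1F for the
device-type mask is an assumption read off the "...0 0101" line above,
since SCSI_DEV_BITS is not defined in this excerpt.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: return the subfield selected by 'mask', shifted
 * down so that the lowest set bit of the mask becomes bit 0. */
static uint32_t
masked_field(uint32_t value, uint32_t mask)
{
    assert(mask != 0);
    while ((mask & 1u) == 0) {
        mask >>= 1;
        value >>= 1;
    }
    return value & mask;
}
```

For the byte 0x05 this gives 0 for the qualifier (mask 0xE0) and 5 for the
device type (assumed mask 0x1F), matching the two subfield lines.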
2008
The proto_tree_add_bitmask_text() function is an extended version of
the proto_tree_add_bitmask() function. In addition, it lets you:
- Provide a leading text (e.g. "Flags: ") that will appear before
  the comma-separated list of field values.
- Provide a fallback text (e.g. "None") that will be appended if
  no fields warranted a change to the top-level title.
- Use flags to specify which fields will affect the top-level title.
2016
2017There are the following flags defined:
2018
2019  BMT_NO_APPEND - the title is taken "as-is" from the 'name' argument.
2020  BMT_NO_INT - only boolean flags are added to the title.
2021  BMT_NO_FALSE - boolean flags are only added to the title if they are set.
2022  BMT_NO_TFS - only add flag name to the title, do not use true_false_string
2023
2024The proto_tree_add_bitmask_with_flags() function is an extended version
2025of the proto_tree_add_bitmask() function. It allows using flags to specify
2026which fields will affect the top-level title. The flags are the
2027same BMT_NO_* flags as used in the proto_tree_add_bitmask_text() function.
2028
The proto_tree_add_bitmask() behavior can be obtained by providing
both 'name' and 'fallback' arguments as NULL, and setting 'flags' to
(BMT_NO_FALSE|BMT_NO_TFS).
2032
2033The proto_tree_add_bitmask_len() function is intended for protocols where
2034bitmask length is permitted to vary, so a length is specified explicitly
2035along with the bitmask value. USB Video "bmControl" and "bControlSize"
2036fields follow this pattern. The primary intent of this is "forward
2037compatibility," enabling an interpreter coded for version M of a structure
2038to comprehend fields in version N of the structure, where N > M and
2039bControlSize increases from version M to version N.
2040
2041proto_tree_add_bitmask_len() is an extended version of proto_tree_add_bitmask()
2042that uses an explicitly specified (rather than inferred) length to control
2043dissection. Because of this, it may encounter two cases that
2044proto_tree_add_bitmask() and proto_tree_add_bitmask_text() may not:
2045- A length that exceeds that of the 'header' and bitmask subfields.
2046  In this case the least-significant bytes of the bitmask are dissected.
2047  An expert warning is generated in this case, because the dissection code
2048  likely needs to be updated for a new revision of the protocol.
2049- A length that is shorter than that of the 'header' and bitmask subfields.
2050  In this case, subfields whose data is fully present are dissected,
2051  and other subfields are not. No warning is generated in this case,
2052  because the dissection code is likely for a later revision of the protocol
2053  than the packet it was called to interpret.
2054
2055
2056proto_item_set_generated()
2057--------------------------
2058proto_item_set_generated is used to mark fields as not being read from the
2059captured data directly, but inferred from one or more values.
2060
2061One of the primary uses of this is the presentation of verification of
2062checksums. Every IP packet has a checksum line, which can present the result
2063of the checksum verification, if enabled in the preferences. The result is
2064presented as a subtree, where the result is enclosed in square brackets
2065indicating a generated field.
2066
2067  Header checksum: 0x3d42 [correct]
2068  [Checksum Status: Good (1)]
2069
2070proto_item_set_hidden()
2071-----------------------
2072proto_item_set_hidden is used to hide fields, which have already been added
2073to the tree, from being visible in the displayed tree.
2074
2075NOTE that creating hidden fields is actually quite a bad idea from a UI design
2076perspective because the user (someone who did not write nor has ever seen the
2077code) has no way of knowing that hidden fields are there to be filtered on
2078thus defeating the whole purpose of putting them there. A Better Way might
2079be to add the fields (that might otherwise be hidden) to a subtree where they
2080won't be seen unless the user opens the subtree--but they can be found if the
2081user wants.
2082
2083One use for hidden fields (which would be better implemented using visible
2084fields in a subtree) follows: The caller may want a value to be
2085included in a tree so that the packet can be filtered on this field, but
2086the representation of that field in the tree is not appropriate. An
2087example is the token-ring routing information field (RIF). The best way
2088to show the RIF in a GUI is by a sequence of ring and bridge numbers.
2089Rings are 3-digit hex numbers, and bridges are single hex digits:
2090
2091    RIF: 001-A-013-9-C0F-B-555
2092
2093In the case of RIF, the programmer should use a field with no value and
2094use proto_tree_add_none_format() to build the above representation. The
2095programmer can then add the ring and bridge values, one-by-one, with
2096proto_tree_add_item() and hide them with proto_item_set_hidden() so that the
2097user can then filter on or search for a particular ring or bridge. Here's a
2098skeleton of how the programmer might code this.
2099
2100    char *rif;
2101    rif = create_rif_string(...);
2102
2103    proto_tree_add_none_format(tree, hf_tr_rif_label, ..., "RIF: %s", rif);
2104
2105    for(i = 0; i < num_rings; i++) {
2106        proto_item *pi;
2107
2108        pi = proto_tree_add_item(tree, hf_tr_rif_ring, ...,
2109            ENC_BIG_ENDIAN);
2110        proto_item_set_hidden(pi);
2111    }
2112    for(i = 0; i < num_rings - 1; i++) {
2113        proto_item *pi;
2114
2115        pi = proto_tree_add_item(tree, hf_tr_rif_bridge, ...,
2116            ENC_BIG_ENDIAN);
2117        proto_item_set_hidden(pi);
2118    }
2119
2120The logical tree has these items:
2121
2122    hf_tr_rif_label, text="RIF: 001-A-013-9-C0F-B-555", value = NONE
2123    hf_tr_rif_ring,  hidden, value=0x001
2124    hf_tr_rif_bridge, hidden, value=0xA
2125    hf_tr_rif_ring,  hidden, value=0x013
2126    hf_tr_rif_bridge, hidden, value=0x9
2127    hf_tr_rif_ring,  hidden, value=0xC0F
2128    hf_tr_rif_bridge, hidden, value=0xB
2129    hf_tr_rif_ring,  hidden, value=0x555
2130
GUI or print code will not display the hidden fields, but a display
filter or "packet grep" routine will still see the values. The following
filter is then possible:
2134
2135    tr.rif_ring eq 0x013
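
The skeleton above leaves create_rif_string() unspecified. One possible
way to build the label is sketched below as a stand-alone function. The
name format_rif() and its parameters are hypothetical, and the sketch
writes into a caller-supplied buffer rather than using any Wireshark
memory allocator.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical helper: format rings as 3-digit hex and bridges as 1-digit
 * hex, alternating ring-bridge-ring. There is one bridge fewer than there
 * are rings. Returns 0 on success, -1 if 'out' is too small. */
static int
format_rif(char *out, size_t out_len, const unsigned *rings,
           const unsigned *bridges, size_t num_rings)
{
    size_t used = 0;
    size_t i;

    assert(num_rings >= 1);
    for (i = 0; i < num_rings; i++) {
        int n;

        if (i == 0)
            n = snprintf(out + used, out_len - used, "%03X",
                         rings[i] & 0xFFFu);
        else
            n = snprintf(out + used, out_len - used, "-%X-%03X",
                         bridges[i - 1] & 0xFu, rings[i] & 0xFFFu);
        if (n < 0 || (size_t)n >= out_len - used)
            return -1;      /* output would have been truncated */
        used += (size_t)n;
    }
    return 0;
}
```

Given rings 0x001, 0x013, 0xC0F, 0x555 and bridges 0xA, 0x9, 0xB, this
produces "001-A-013-9-C0F-B-555".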
2136
proto_item_set_url()
--------------------
2139proto_item_set_url is used to mark fields as containing a URL. This can only
2140be done with fields of type FT_STRING(Z). If these fields are presented they
2141are underlined, as could be done in a browser. These fields are sensitive to
2142clicks as well, launching the configured browser with this URL as parameter.
2143
21441.6 Utility routines.
2145
21461.6.1 val_to_str, val_to_str_const, try_val_to_str and try_val_to_str_idx
2147
2148A dissector may need to convert a value to a string, using a
2149'value_string' structure, by hand, rather than by declaring a field with
2150an associated 'value_string' structure; this might be used, for example,
2151to generate a COL_INFO line for a frame.
2152
2153val_to_str() handles the most common case:
2154
2155    const gchar*
2156    val_to_str(guint32 val, const value_string *vs, const char *fmt)
2157
2158If the value 'val' is found in the 'value_string' table pointed to by
2159'vs', 'val_to_str' will return the corresponding string; otherwise, it
2160will use 'fmt' as an 'sprintf'-style format, with 'val' as an argument,
2161to generate a string, and will return a pointer to that string.
2162You can use it in a call to generate a COL_INFO line for a frame such as
2163
    col_add_fstr(pinfo->cinfo, COL_INFO, ", %s", val_to_str(val, table, "Unknown %d"));
2165
2166If you don't need to display 'val' in your fmt string, you can use
2167val_to_str_const() which just takes a string constant instead and returns it
2168unmodified when 'val' isn't found.
2169
2170If you need to handle the failure case in some custom way, try_val_to_str()
2171will return NULL if val isn't found:
2172
2173    const gchar*
2174    try_val_to_str(guint32 val, const value_string *vs)
2175
Note that you must check whether 'try_val_to_str()' returns NULL, and arrange
2177that its return value not be dereferenced if it's NULL. 'try_val_to_str_idx()'
2178behaves similarly, except it also returns an index into the value_string array,
2179or -1 if 'val' was not found.
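
The difference between the formatting and the try_ variants can be seen in
a small stand-alone model. Everything here is hypothetical ('struct
vs_entry', model_try_lookup(), model_lookup()); the real functions live in
epan/value_string.h and manage their own result buffer, whereas this
sketch asks the caller for one.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical stand-in for a value_string entry; the table ends with
 * an entry whose 'str' is NULL. */
struct vs_entry {
    uint32_t    value;
    const char *str;
};

/* Like the try_ variant: return the matching string, or NULL. */
static const char *
model_try_lookup(uint32_t val, const struct vs_entry *vs)
{
    size_t i;

    assert(vs != NULL);
    for (i = 0; vs[i].str != NULL; i++) {
        if (vs[i].value == val)
            return vs[i].str;
    }
    return NULL;
}

/* Like the formatting variant: on a miss, format 'val' with 'fmt' (which
 * must contain exactly one %u-style conversion) into the caller's buffer
 * and return that buffer. */
static const char *
model_lookup(uint32_t val, const struct vs_entry *vs, const char *fmt,
             char *buf, size_t buf_len)
{
    const char *hit = model_try_lookup(val, vs);

    if (hit != NULL)
        return hit;
    snprintf(buf, buf_len, fmt, (unsigned)val);
    return buf;
}
```

A caller of the try_ style must test the result against NULL before using
it; a caller of the formatting style always gets a printable string.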
2180
2181The *_ext functions are "extended" versions of those already described. They
2182should be used for large value-string arrays which contain many entries. They
2183implement value to string conversions which will do either a direct access or
2184a binary search of the value string array if possible. See
"Extended Value Strings" under section 1.5 "Constructing the protocol tree" for
2186more information.
2187
2188See epan/value_string.h for detailed information on the various value_string
2189functions.
2190
To handle 64-bit values, there is an equivalent set of functions. These are:
2192
2193    const gchar *
2194    val64_to_str(const guint64 val, const val64_string *vs, const char *fmt)
2195
2196    const gchar *
2197    val64_to_str_const(const guint64 val, const val64_string *vs, const char *unknown_str);
2198
2199    const gchar *
2200    try_val64_to_str(const guint64 val, const val64_string *vs);
2201
2202    const gchar *
2203    try_val64_to_str_idx(const guint64 val, const val64_string *vs, gint *idx);
2204
2205
22061.6.2 rval_to_str, try_rval_to_str and try_rval_to_str_idx
2207
2208A dissector may need to convert a range of values to a string, using a
2209'range_string' structure.
2210
Most of the same functions exist as for regular value_strings (see section
1.6.1), with 'rval' in their names instead of 'val'.
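
A range lookup differs from a value lookup only in the comparison. The
following stand-alone model illustrates this; 'struct rs_entry' and
model_try_range_lookup() are hypothetical names, not the Wireshark types
or functions.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a range_string entry: every value in
 * [value_min, value_max] maps to 'str'. A NULL 'str' ends the table. */
struct rs_entry {
    uint32_t    value_min;
    uint32_t    value_max;
    const char *str;
};

/* Return the string of the first range containing 'val', or NULL. */
static const char *
model_try_range_lookup(uint32_t val, const struct rs_entry *rs)
{
    size_t i;

    assert(rs != NULL);
    for (i = 0; rs[i].str != NULL; i++) {
        if (val >= rs[i].value_min && val <= rs[i].value_max)
            return rs[i].str;
    }
    return NULL;
}
```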
2213
2214
22151.7 Calling Other Dissectors.
2216
2217As each dissector completes its portion of the protocol analysis, it
2218is expected to create a new tvbuff of type TVBUFF_SUBSET which
2219contains the payload portion of the protocol (that is, the bytes
2220that are relevant to the next dissector).
2221
2222To create a new TVBUFF_SUBSET that begins at a specified offset in a
2223parent tvbuff, and runs to the end of the parent tvbuff, the routine
tvb_new_subset_remaining() is used:
2225
2226    next_tvb = tvb_new_subset_remaining(tvb, offset);
2227
2228Where:
2229    tvb is the tvbuff that the dissector has been working on. It
2230    can be a tvbuff of any type.
2231
2232    next_tvb is the new TVBUFF_SUBSET.
2233
2234    offset is the byte offset of 'tvb' at which the new tvbuff
2235    should start. The first byte is the byte at offset 0.
2236
2237To create a new TVBUFF_SUBSET that begins at a specified offset in a
2238parent tvbuff, with a specified number of bytes in the payload, the
routine tvb_new_subset_length() is used:
2240
2241    next_tvb = tvb_new_subset_length(tvb, offset, reported_length);
2242
2243Where:
2244    tvb is the tvbuff that the dissector has been working on. It
2245    can be a tvbuff of any type.
2246
2247    next_tvb is the new TVBUFF_SUBSET.
2248
2249    offset is the byte offset of 'tvb' at which the new tvbuff
2250    should start. The first byte is the byte at offset 0.
2251
2252    reported_length is the number of bytes that the current protocol
2253    says should be in the payload.
2254
2255In the few cases where the number of bytes available in the new subset
2256must be explicitly specified, rather than being calculated based on the
2257number of bytes in the payload, the routine tvb_new_subset_length_caplen()
2258is used:
2259
2260    next_tvb = tvb_new_subset_length_caplen(tvb, offset, length, reported_length);
2261
2262Where:
2263    tvb is the tvbuff that the dissector has been working on. It
2264    can be a tvbuff of any type.
2265
2266    next_tvb is the new TVBUFF_SUBSET.
2267
2268    offset is the byte offset of 'tvb' at which the new tvbuff
2269    should start. The first byte is the byte at offset 0.
2270
2271    length is the number of bytes in the new TVBUFF_SUBSET. A length
2272    argument of -1 says to use as many bytes as are available in
2273    'tvb'.
2274
2275    reported_length is the number of bytes that the current protocol
2276    says should be in the payload. A reported_length of -1 says that
2277    the protocol doesn't say anything about the size of its payload.
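
The relationship between captured and reported lengths in a subset can be
modelled in a few lines. This is a simplified stand-alone sketch, not the
tvbuff implementation: 'struct view' and the view_* functions are
hypothetical, and the model asserts where the real routines would throw
exceptions.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a tvbuff: 'captured' bytes are actually
 * present at 'data'; 'reported' is how many the protocol says exist. */
struct view {
    const uint8_t *data;
    size_t         captured;
    size_t         reported;
};

/* Everything from 'offset' to the end of the parent. */
static struct view
view_subset_remaining(const struct view *parent, size_t offset)
{
    struct view v;

    assert(parent->captured <= parent->reported);
    assert(offset <= parent->captured);
    v.data = parent->data + offset;
    v.captured = parent->captured - offset;
    v.reported = parent->reported - offset;
    return v;
}

/* 'reported' bytes starting at 'offset'; the captured length is whatever
 * part of that is actually present in the parent. */
static struct view
view_subset_length(const struct view *parent, size_t offset, size_t reported)
{
    struct view v = view_subset_remaining(parent, offset);

    if (v.captured > reported)
        v.captured = reported;
    v.reported = reported;
    return v;
}
```

In this model a parent with 6 captured and 8 reported bytes, subset at
offset 2 with a reported length of 5, yields 4 captured bytes and a
reported length of 5: the payload is known to be cut short.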
2278
2279To call a dissector you need to get the handle of the dissector using
2280find_dissector(), passing it the string name of the dissector. The setting
2281of the handle is usually done once at startup during the proto_reg_handoff
2282function within the calling dissector.
2283
22841.7.1 Dissector Tables
2285
Another way to call a subdissector is to set up a dissector table. A dissector
table is a list of subdissectors grouped by a common identifier (integer or
string) in a dissector. Subdissectors register themselves with the dissector
table under their unique identifier, using one of the following APIs:
2290
2291    void dissector_add_uint(const char *abbrev, const guint32 pattern,
2292                            dissector_handle_t handle);
2293
2294    void dissector_add_uint_range(const char *abbrev, struct epan_range *range,
2295                            dissector_handle_t handle);
2296
2297    void dissector_add_string(const char *name, const gchar *pattern,
2298                            dissector_handle_t handle);
2299
2300    void dissector_add_for_decode_as(const char *name,
2301                            dissector_handle_t handle);
2302
2303    dissector_add_for_decode_as doesn't add a unique identifier in the dissector
2304    table, but it lets the user add it from the command line or, in Wireshark,
2305    through the "Decode As" UI.
2306
2307Then when the dissector hits the common identifier field, it will use one of the
2308following APIs to invoke the subdissector:
2309
2310    int dissector_try_uint(dissector_table_t sub_dissectors,
2311        const guint32 uint_val, tvbuff_t *tvb, packet_info *pinfo,
2312        proto_tree *tree);
2313
2314    int dissector_try_uint_new(dissector_table_t sub_dissectors,
2315        const guint32 uint_val, tvbuff_t *tvb, packet_info *pinfo,
2316        proto_tree *tree, const gboolean add_proto_name, void *data);
2317
2318    int dissector_try_string(dissector_table_t sub_dissectors, const gchar *string,
2319        tvbuff_t *tvb, packet_info *pinfo, proto_tree *tree, void *data);
2320
These pass a subset of the packet (typically the rest of the packet) to the
dissector table, which determines which subdissector is called. This allows
dissection of a packet to be extended outside of the dissector without
having to modify the dissector directly.
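
The register-then-dispatch pattern can be illustrated with a stand-alone
model. None of the names below are Wireshark APIs; the fixed-size array
and the simplified handler signature are assumptions made to keep the
sketch short.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical subdissector signature: returns the number of bytes it
 * consumed from the payload. */
typedef int (*subdissector_fn)(const uint8_t *payload, size_t len);

struct table_entry {
    uint32_t        pattern;
    subdissector_fn handler;
};

#define TABLE_MAX 8

struct dissector_table_model {
    struct table_entry entries[TABLE_MAX];
    size_t             count;
};

/* Registration: a subdissector claims one identifier value. */
static void
table_add_uint(struct dissector_table_model *t, uint32_t pattern,
               subdissector_fn handler)
{
    assert(t->count < TABLE_MAX);
    t->entries[t->count].pattern = pattern;
    t->entries[t->count].handler = handler;
    t->count++;
}

/* Dispatch: returns what the handler returned, or 0 if no handler is
 * registered for 'pattern'. */
static int
table_try_uint(const struct dissector_table_model *t, uint32_t pattern,
               const uint8_t *payload, size_t len)
{
    size_t i;

    for (i = 0; i < t->count; i++) {
        if (t->entries[i].pattern == pattern)
            return t->entries[i].handler(payload, len);
    }
    return 0;
}

/* Example handler that consumes the whole payload. */
static int
consume_all(const uint8_t *payload, size_t len)
{
    (void)payload;
    return (int)len;
}
```

The parent only knows the table and the identifier value; new handlers can
be added without touching the parent's code.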
2325
2326
23271.8 Editing CMakeLists.txt to add your dissector.
2328
2329To arrange that your dissector will be built as part of Wireshark, you
2330must add the name of the source file for your dissector to the DISSECTOR_SRC
2331section of epan/dissectors/CMakeLists.txt
2332
2333
23341.9 Using the git source code tree.
2335
2336  See <https://www.wireshark.org/develop.html>
2337
2338
23391.10 Submitting code for your new dissector.
2340
2341  See <https://www.wireshark.org/docs/wsdg_html_chunked/ChSrcContribute.html>
2342  and <https://gitlab.com/wireshark/wireshark/-/wikis/Development/SubmittingPatches>.
2343
2344  - VERIFY that your dissector code does not use prohibited or deprecated APIs
2345    as follows:
2346    perl <wireshark_root>/tools/checkAPIs.pl <source-filename(s)>
2347
2348  - VERIFY that your dissector code does not contain any header field related
2349    problems:
2350    perl <wireshark_root>/tools/checkhf.pl <source-filename(s)>
2351
2352  - VERIFY that your dissector code does not contain any display filter related
2353    problems:
2354    perl <wireshark_root>/tools/checkfiltername.pl <source-filename(s)>
2355
2356  - CHECK your dissector with CppCheck (http://cppcheck.sourceforge.net/) using
2357    Wireshark's customized configuration. This is particularly important on
2358    Windows, since Microsoft's compiler warnings are quite thin:
2359    ./tools/cppcheck/cppcheck.sh <source-filename(s)>
2360
2361  - TEST YOUR DISSECTOR BEFORE SUBMITTING IT.
2362    Use fuzz-test.sh and/or randpkt against your dissector. These are
2363    described at <https://gitlab.com/wireshark/wireshark/-/wikis/FuzzTesting>.
2364
2365  - Subscribe to <mailto:wireshark-dev[AT]wireshark.org> by sending an email to
2366    <mailto:wireshark-dev-request[AT]wireshark.org?body="help"> or visiting
2367    <https://www.wireshark.org/lists/>.
2368
2369  - 'git diff' to verify all your changes look good.
2370
2371  - 'git add' all the files you changed.
2372
2373  - 'git commit' to commit (locally) your changes. First line of commit message
2374    should be a summary of the changes followed by an empty line and a more
2375    verbose description.
2376
2377  - 'git push downstream HEAD' to push the changes to GitLab. (This assumes
2378    that you have a remote named "downstream" that points to a fork of
2379    https://gitlab.com/wireshark/wireshark.)
2380
2381  - Create a Wiki page on the protocol at <https://gitlab.com/wireshark/editor-wiki>.
2382    (You'll need to request access to https://gitlab.com/wireshark/wiki-editors.)
    A template is provided so it is easy to set up in a consistent style.
2384      See: <https://gitlab.com/wireshark/wireshark/-/wikis/HowToEdit>
2385      and  <https://gitlab.com/wireshark/wireshark/-/wikis/ProtocolReference>
2386
2387  - If possible, add sample capture files to the sample captures page at
2388    <https://gitlab.com/wireshark/wireshark/-/wikis/SampleCaptures>. These
2389    files are used by the automated build system for fuzz testing.
2390
2391  - If you don't think the wiki is the right place for your sample capture,
2392    submit a bug report to the Wireshark issue database, found at
2393    <https://gitlab.com/wireshark/wireshark/-/issues>, qualified as an
2394    enhancement and attach your sample capture there. Normally a new
2395    dissector won't be accepted without a sample capture! If you open a
2396    bug be sure to cross-link your GitLab merge request.
2397
23982. Advanced dissector topics.
2399
24002.1 Introduction.
2401
2402Some of the advanced features are being worked on constantly. When using them
2403it is wise to check the relevant header and source files for additional details.
2404
24052.2 Following "conversations".
2406
2407In Wireshark a conversation is defined as a series of data packets between two
2408address:port combinations. A conversation is not sensitive to the direction of
2409the packet. The same conversation will be returned for a packet bound from
2410ServerA:1000 to ClientA:2000 and the packet from ClientA:2000 to ServerA:1000.
2411
24122.2.1 Conversation Routines
2413
2414There are seven routines that you will use to work with a conversation:
2415conversation_new, find_conversation, find_or_create_conversation,
2416conversation_add_proto_data, conversation_get_proto_data,
2417conversation_delete_proto_data, and conversation_set_dissector.
2418
2419
24202.2.1.1 The conversation_init function.
2421
2422This is an internal routine for the conversation code. As such you
2423will not have to call this routine. Just be aware that this routine is
2424called at the start of each capture and before the packets are filtered
2425with a display filter. The routine will destroy all stored
2426conversations. This routine does NOT clean up any data pointers that are
2427passed in the conversation_add_proto_data 'data' variable. You are
2428responsible for this clean up if you pass a malloc'ed pointer
2429in this variable.
2430
See item 2.2.1.6 for more information about use of the 'data' pointer.
2432
2433
24342.2.1.2 The conversation_new function.
2435
2436This routine will create a new conversation based upon two address/port
2437pairs. If you want to associate with the conversation a pointer to a
2438private data structure you must use the conversation_add_proto_data
2439function. The ptype variable is used to differentiate between
conversations over different protocols, e.g. TCP and UDP. The options
2441variable is used to define a conversation that will accept any destination
2442address and/or port. Set options = 0 if the destination port and address
2443are known when conversation_new is called. See section 2.4 for more
2444information on usage of the options parameter.
2445
2446The conversation_new prototype:
2447    conversation_t *conversation_new(guint32 setup_frame, address *addr1,
2448        address *addr2, port_type ptype, guint32 port1, guint32 port2,
2449        guint options);
2450
2451Where:
2452    guint32 setup_frame = The lowest numbered frame for this conversation
2453    address* addr1      = first data packet address
2454    address* addr2      = second data packet address
2455    port_type ptype     = port type, this is defined in packet.h
2456    guint32 port1       = first data packet port
2457    guint32 port2       = second data packet port
2458    guint options       = conversation options, NO_ADDR2 and/or NO_PORT2
2459
2460setup_frame indicates the first frame for this conversation, and is used to
2461distinguish multiple conversations with the same addr1/port1 and addr2/port2
2462pair that occur within the same capture session.
2463
2464"addr1" and "port1" are the first address/port pair; "addr2" and "port2"
2465are the second address/port pair. A conversation doesn't have source
2466and destination address/port pairs - packets in a conversation go in
2467both directions - so "addr1"/"port1" may be the source or destination
2468address/port pair; "addr2"/"port2" would be the other pair.
2469
2470If NO_ADDR2 is specified, the conversation is set up so that a
2471conversation lookup will match only the "addr1" address; if NO_PORT2 is
2472specified, the conversation is set up so that a conversation lookup will
2473match only the "port1" port; if both are specified, i.e.
2474NO_ADDR2|NO_PORT2, the conversation is set up so that the lookup will
2475match only the "addr1"/"port1" address/port pair. This can be used if a
2476packet indicates that, later in the capture, a conversation will be
2477created using certain addresses and ports, in the case where the packet
2478doesn't specify the addresses and ports of both sides.
2479
24802.2.1.3 The find_conversation function.
2481
2482Call this routine to look up a conversation. If no conversation is found,
2483the routine will return a NULL value.
2484
2485The find_conversation prototype:
2486
2487    conversation_t *find_conversation(guint32 frame_num, address *addr_a,
2488        address *addr_b, port_type ptype, guint32 port_a, guint32 port_b,
2489        guint options);
2490
2491Where:
2492    guint32 frame_num = a frame number to match
2493    address* addr_a = first address
2494    address* addr_b = second address
2495    port_type ptype = port type
2496    guint32 port_a = first data packet port
2497    guint32 port_b = second data packet port
2498    guint options = conversation options, NO_ADDR_B and/or NO_PORT_B
2499
2500frame_num is a frame number to match. The conversation returned is where
2501    (frame_num >= conversation->setup_frame
2502    && frame_num < conversation->next->setup_frame)
2503Suppose there are a total of 3 conversations (A, B, and C) that match
2504addr_a/port_a and addr_b/port_b, where the setup_frame used in
2505conversation_new() for A, B and C are 10, 50, and 100 respectively. The
2506frame_num passed in find_conversation is compared to the setup_frame of each
2507conversation. So if (frame_num >= 10 && frame_num < 50), conversation A is
2508returned. If (frame_num >= 50 && frame_num < 100), conversation B is returned.
2509If (frame_num >= 100) conversation C is returned.
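
The selection rule for the A/B/C example can be written as a short
stand-alone function. pick_by_setup_frame() is a hypothetical name and
works on a plain sorted array; the real lookup walks the conversation
list instead.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* 'setup_frames' must be sorted in strictly ascending order. Returns the
 * index of the last entry whose setup frame is <= frame_num, or -1 if
 * frame_num precedes all of them. */
static int
pick_by_setup_frame(const uint32_t *setup_frames, size_t n,
                    uint32_t frame_num)
{
    int picked = -1;
    size_t i;

    for (i = 0; i < n; i++) {
        assert(i == 0 || setup_frames[i - 1] < setup_frames[i]);
        if (setup_frames[i] <= frame_num)
            picked = (int)i;
    }
    return picked;
}
```

With setup frames 10, 50 and 100, frame 49 selects index 0 (A), frame 50
selects index 1 (B), and any frame from 100 onwards selects index 2 (C).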
2510
2511"addr_a" and "port_a" are the first address/port pair; "addr_b" and
"port_b" are the second address/port pair. Again, a conversation
doesn't have source and destination address/port pairs, so
2514"addr_a"/"port_a" may be the source or destination address/port pair;
2515"addr_b"/"port_b" would be the other pair. The search will match the
2516"a" address/port pair against both the "1" and "2" address/port pairs,
2517and match the "b" address/port pair against both the "2" and "1"
2518address/port pairs; you don't have to worry about which side the "a" or
2519"b" pairs correspond to.
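
The direction-insensitive comparison can be sketched as a stand-alone
model. The types and function names are hypothetical, addresses are
reduced to 32-bit integers for brevity, and wildcard handling is left
out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical endpoint; a real address holds more than 32 bits. */
struct endpoint {
    uint32_t addr;
    uint32_t port;
};

/* The "1" and "2" pairs stored when the conversation was created. */
struct conv_key {
    struct endpoint e1;
    struct endpoint e2;
};

static bool
endpoint_equal(const struct endpoint *x, const struct endpoint *y)
{
    return x->addr == y->addr && x->port == y->port;
}

/* True if (a, b) names the same two endpoints as (e1, e2), in either
 * order. */
static bool
conv_matches(const struct conv_key *key, const struct endpoint *a,
             const struct endpoint *b)
{
    assert(key != NULL && a != NULL && b != NULL);
    return (endpoint_equal(&key->e1, a) && endpoint_equal(&key->e2, b)) ||
           (endpoint_equal(&key->e1, b) && endpoint_equal(&key->e2, a));
}
```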
2520
2521If the NO_ADDR_B flag was specified to "find_conversation()", the
2522"addr_b" address will be treated as matching any "wildcarded" address;
2523if the NO_PORT_B flag was specified, the "port_b" port will be treated
2524as matching any "wildcarded" port. If both flags are specified, i.e.
2525NO_ADDR_B|NO_PORT_B, the "addr_b" address will be treated as matching
2526any "wildcarded" address and the "port_b" port will be treated as
2527matching any "wildcarded" port.
2528
25292.2.1.4 The find_conversation_pinfo function.
2530
This convenience function will find an existing conversation (by calling
find_conversation()).
2533
2534The find_conversation_pinfo prototype:
2535
2536    extern conversation_t *find_conversation_pinfo(packet_info *pinfo, const guint options);
2537
2538Where:
2539    packet_info *pinfo = the packet_info structure
2540    const guint options = conversation options, NO_ADDR_B and/or NO_PORT_B
2541
2542The frame number and the addresses necessary for find_conversation() are
2543taken from the pinfo structure (as is commonly done).
2544
25452.2.1.5 The find_or_create_conversation function.
2546
2547This convenience function will find an existing conversation (by calling
2548find_conversation()) and, if a conversation does not already exist, create a
2549new conversation by calling conversation_new().
2550
2551The find_or_create_conversation prototype:
2552
2553    extern conversation_t *find_or_create_conversation(packet_info *pinfo);
2554
2555Where:
2556    packet_info *pinfo = the packet_info structure
2557
2558The frame number and the addresses necessary for find_conversation() and
2559conversation_new() are taken from the pinfo structure (as is commonly done)
2560and no 'options' are used.
2561
2562
25632.2.1.6 The conversation_add_proto_data function.
2564
2565Once you have created a conversation with conversation_new, you can
2566associate data with it using this function.
2567
2568The conversation_add_proto_data prototype:
2569
2570    void conversation_add_proto_data(conversation_t *conv, int proto,
2571        void *proto_data);
2572
2573Where:
2574    conversation_t *conv    = the conversation in question
2575    int proto               = registered protocol number
    void *proto_data        = dissector data structure

"conv" is the value returned by conversation_new. "proto" is a
unique protocol number created with proto_register_protocol. Protocols
are typically registered in the proto_register_XXXX section of your
dissector. "proto_data" is a pointer to the data you wish to associate
with the conversation. "proto_data" usually points to "wmem_alloc'd"
memory; the memory will be automatically freed each time a new dissection
begins and thus need not be managed (freed) by the dissector.
Using the protocol number allows several dissectors to
associate data with a given conversation.
2587
2588
25892.2.1.7 The conversation_get_proto_data function.
2590
2591After you have located a conversation with find_conversation, you can use
2592this function to retrieve any data associated with it.
2593
2594The conversation_get_proto_data prototype:
2595
2596    void *conversation_get_proto_data(conversation_t *conv, int proto);
2597
2598Where:
2599    conversation_t *conv    = the conversation in question
2600    int proto               = registered protocol number
2601
"conv" is the conversation created with conversation_new. "proto"
2603is a unique protocol number created with proto_register_protocol,
2604typically in the proto_register_XXXX portion of a dissector. The function
2605returns a pointer to the data requested, or NULL if no data was found.
2606
2607
26082.2.1.8 The conversation_delete_proto_data function.
2609
After you are finished with a conversation, you can remove your association
with this function. Please note that ONLY the conversation entry is
removed. If you have allocated any memory for your data (other than with
wmem_alloc), you must free it as well.
2614
2615The conversation_delete_proto_data prototype:
2616
2617    void conversation_delete_proto_data(conversation_t *conv, int proto);
2618
2619Where:
2620    conversation_t *conv = the conversation in question
2621    int proto            = registered protocol number
2622
"conv" is the conversation created with conversation_new. "proto"
2624is a unique protocol number created with proto_register_protocol,
2625typically in the proto_register_XXXX portion of a dissector.
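
Taken together, the add, get and delete routines behave much like a small
map keyed by protocol number. The stand-alone model below illustrates
that behaviour only; the names and the fixed-size array are hypothetical
and say nothing about how Wireshark stores the data internally.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_PROTO_SLOTS 4

/* Hypothetical stand-in for the per-conversation data list. */
struct proto_slot {
    int   proto;
    void *data;
};

struct conv_model {
    struct proto_slot slots[MAX_PROTO_SLOTS];
    size_t            used;
};

static void
model_add_proto_data(struct conv_model *c, int proto, void *data)
{
    assert(c->used < MAX_PROTO_SLOTS);
    c->slots[c->used].proto = proto;
    c->slots[c->used].data = data;
    c->used++;
}

/* Returns the data stored for 'proto', or NULL if there is none. */
static void *
model_get_proto_data(const struct conv_model *c, int proto)
{
    size_t i;

    for (i = 0; i < c->used; i++) {
        if (c->slots[i].proto == proto)
            return c->slots[i].data;
    }
    return NULL;
}

/* Removes the association only; the pointed-to data is not freed. */
static void
model_delete_proto_data(struct conv_model *c, int proto)
{
    size_t i;

    for (i = 0; i < c->used; i++) {
        if (c->slots[i].proto == proto) {
            c->slots[i] = c->slots[c->used - 1];
            c->used--;
            return;
        }
    }
}
```

Two dissectors using different protocol numbers can each keep their own
state on the same conversation without interfering with one another.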

2.2.1.9 The conversation_set_dissector function

This function sets the protocol dissector to be invoked whenever
conversation parameters (addresses, port_types, ports, etc) are matched
during the dissection of a packet.

The conversation_set_dissector prototype:

    void conversation_set_dissector(conversation_t *conversation, const dissector_handle_t handle);

Where:
    conversation_t *conversation    = the conversation in question
    const dissector_handle_t handle = the dissector handle.


2.2.2 Using timestamps relative to the conversation

There is a framework to calculate timestamps relative to the start of the
conversation. First of all, the timestamp of the first packet that has been
seen in the conversation must be kept in the protocol data to be able
to calculate the timestamp of the current packet relative to the start
of the conversation. The timestamp of the last packet that was seen in the
conversation should also be kept in the protocol data. This way the
delta time between the current packet and the previous packet in the
conversation can be calculated.

So add the following items to the struct that is used for the protocol data:

  nstime_t ts_first;
  nstime_t ts_prev;

The ts_prev value should only be set during the first run through the
packets (i.e. PINFO_FD_VISITED(pinfo) is false).

The next step is to use the per-packet information (described in section 2.5)
to keep the calculated delta timestamp, as it can only be calculated
on the first run through the packets. This is because a packet can be
selected in random order once the whole file has been read.

After calculating the conversation timestamps, it is time to put them in
the appropriate columns with the function 'col_set_time' (described in
section 1.5.9). The column used for relative timestamps is:

COL_REL_TIME, /* Delta time to last frame in conversation */

Last but not least, there MUST be a preference in each dissector that
uses conversation timestamps that makes it possible to enable and
disable the calculation of conversation timestamps. The main argument
for this is that a higher level conversation is able to overwrite
the values of lower level conversations in these two columns. Being
able to actively select which protocols may overwrite the conversation
timestamp columns gives the user the power to control these columns.
(A second reason is that conversation timestamps use the per-packet
data structure which uses additional memory, which should be avoided
if these timestamps are not needed.)

Have a look at the differences to packet-tcp.[ch] in SVN 22966 and
SVN 23058 to see the implementation of conversation timestamps for
the tcp-dissector.


2.2.3 The example conversation code using wmem_file_scope memory.

For a conversation between two IP addresses and ports you can use this as an
example. This example uses wmem_alloc() with wmem_file_scope() to allocate
memory and stores the data pointer in the conversation 'data' variable.

/************************ Global values ************************/

/* define your structure here */
typedef struct {

} my_entry_t;

/* Registered protocol number */
static int my_proto = -1;

/********************* in the dissector routine *********************/

/* the local variables in the dissector */

conversation_t *conversation;
my_entry_t *data_ptr;


/* look up the conversation */

conversation = find_conversation(pinfo->num, &pinfo->src, &pinfo->dst,
        pinfo->ptype, pinfo->srcport, pinfo->destport, 0);

/* if conversation found get the data pointer that you stored */
if (conversation)
    data_ptr = (my_entry_t*)conversation_get_proto_data(conversation, my_proto);
else {

    /* new conversation create local data structure */

    data_ptr = wmem_alloc(wmem_file_scope(), sizeof(my_entry_t));

    /*** add your code here to setup the new data structure ***/

    /* create the conversation with your data pointer  */

    conversation = conversation_new(pinfo->num,  &pinfo->src, &pinfo->dst, pinfo->ptype,
        pinfo->srcport, pinfo->destport, 0);
    conversation_add_proto_data(conversation, my_proto, (void *)data_ptr);
}

/* at this point the conversation data is ready */

/***************** in the protocol register routine *****************/

my_proto = proto_register_protocol("My Protocol", "My Protocol", "my_proto");


2.2.4 An example conversation code that starts at a specific frame number.

Sometimes a dissector has determined that a new conversation is needed that
starts at a specific frame number, when a capture session encompasses multiple
conversations that reuse the same src/dest ip/port pairs. You can compare the
conversation->setup_frame returned by find_conversation with
pinfo->num to determine whether or not there already exists a conversation
that starts at the specific frame number.

/* in the dissector routine */

    conversation = find_conversation(pinfo->num, &pinfo->src, &pinfo->dst,
        pinfo->ptype, pinfo->srcport, pinfo->destport, 0);
    if (conversation == NULL || (conversation->setup_frame != pinfo->num)) {
        /* It's not part of any conversation, or the returned
         * conversation->setup_frame doesn't match the current frame:
         * create a new one.
         */
        conversation = conversation_new(pinfo->num, &pinfo->src,
            &pinfo->dst, pinfo->ptype, pinfo->srcport, pinfo->destport,
            0);
    }
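
The effect of the setup_frame test can be modelled without any Wireshark
types. The sketch below is an illustration only: demo_conv_t, demo_find and
demo_needs_new are hypothetical names, and the lookup rule (prefer the most
recently set-up conversation that does not start after the given frame) is a
simplifying assumption about how conversations sharing one address/port tuple
are told apart.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, reduced view of a conversation. */
typedef struct {
    uint32_t setup_frame;   /* frame in which the conversation was created */
    int      id;            /* illustrative tag to tell entries apart */
} demo_conv_t;

/* Among conversations that share one address/port tuple, return the one
 * with the greatest setup_frame that is <= frame, or NULL if none. */
static const demo_conv_t *demo_find(const demo_conv_t *convs, size_t n,
                                    uint32_t frame)
{
    const demo_conv_t *best = NULL;
    size_t i;

    assert(convs != NULL || n == 0);
    for (i = 0; i < n; i++) {
        if (convs[i].setup_frame <= frame &&
            (best == NULL || convs[i].setup_frame > best->setup_frame)) {
            best = &convs[i];
        }
    }
    return best;
}

/* Same condition as in the example above: a new conversation is needed when
 * nothing was found, or when the one found started in another frame. */
static int demo_needs_new(const demo_conv_t *found, uint32_t frame)
{
    return found == NULL || found->setup_frame != frame;
}
```

Note that the condition is true for every frame other than the setup frame
itself, so a dissector would normally apply it only at the point where it has
already decided that the current frame starts a new conversation.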


2.2.5 The example conversation code using conversation index field.

Sometimes the conversation isn't enough to define a unique data storage
value for the network traffic. For example if you are storing information
about requests carried in a conversation, the request may have an
identifier that is used to define the request. In this case the
conversation and the identifier are required to find the data storage
pointer. You can use the conversation data structure index value to
uniquely define the conversation.

See packet-afs.c for an example of how to use the conversation index. In
this dissector multiple requests are sent in the same conversation. To store
information for each request the dissector has an internal hash table based
upon the conversation index and values inside the request packets.


    /* in the dissector routine */

    /* to find a request value, first look up the conversation to get its */
    /* index, then use the conversation index and request data to find    */
    /* the data in the local hash table */

    conversation = find_or_create_conversation(pinfo);

    request_key.conversation = conversation->conv_index;
    request_key.service = pntoh16(&rxh->serviceId);
    request_key.callnumber = pntoh32(&rxh->callNumber);

    request_val = (struct afs_request_val *)g_hash_table_lookup(
        afs_request_hash, &request_key);

    /* only allocate a new hash element when it's a request */
    opcode = 0;
    if (!request_val && !reply)
    {
        new_request_key = wmem_alloc(wmem_file_scope(), sizeof(struct afs_request_key));
        *new_request_key = request_key;

        request_val = wmem_alloc(wmem_file_scope(), sizeof(struct afs_request_val));
        request_val->opcode = pntoh32(&afsh->opcode);
        opcode = request_val->opcode;

        g_hash_table_insert(afs_request_hash, new_request_key,
            request_val);
    }



2.3 Dynamic conversation dissector registration.


NOTE:   This section assumes that all information is available to
    create a complete conversation, source port/address and
    destination port/address. If either the destination port or
    address is known, see section 2.4 Dynamic server port dissector
    registration.

For protocols that negotiate a secondary port connection, for example
packet-msproxy.c, a conversation can install a dissector to handle
the secondary protocol dissection. After the conversation is created
for the negotiated ports, use conversation_set_dissector to define
the dissection routine.

Before we create these conversations or assign a dissector to them, we should
first check that the conversation does not already exist and, if it exists,
whether it is registered to our protocol or not.
We should do this because it is uncommon, but it does happen, that multiple
different protocols use the same socketpair during different stages of
an application cycle. By keeping track of the frame number a conversation
was started in, Wireshark can still tell these different protocols apart.

The second argument to conversation_set_dissector is a dissector handle,
which is created with a call to create_dissector_handle or
register_dissector.

create_dissector_handle takes as arguments a pointer to the dissector
function and a protocol ID as returned by proto_register_protocol;
register_dissector takes as arguments a string giving a name for the
dissector, a pointer to the dissector function, and a protocol ID.

The protocol ID is the ID for the protocol dissected by the function.
The function will not be called if the protocol has been disabled by the
user; instead, the data for the protocol will be dissected as raw data.

An example -

/* the handle for the dynamic dissector */
static dissector_handle_t sub_dissector_handle;

/* prototype for the dynamic dissector */
static int sub_dissector(tvbuff_t *tvb, packet_info *pinfo,
                proto_tree *tree, void *data);

/* in the main protocol dissector, where the next dissector is setup */

/* if conversation has a data field, create it and load structure */

/* First check if a conversation already exists for this
    socketpair
*/
    conversation = find_conversation(pinfo->num,
                &pinfo->src, &pinfo->dst, protocol,
                src_port, dst_port,  0);

/* If there is no such conversation, or if there is one but for
   someone else's protocol then we just create a new conversation
   and assign our protocol to it.
*/
    if ( (conversation == NULL) ||
         (conversation->dissector_handle != sub_dissector_handle) ) {
        new_conv_info = wmem_alloc(wmem_file_scope(), sizeof(struct _new_conv_info));
        new_conv_info->data1 = value1;

/* create the conversation for the dynamic port */
        conversation = conversation_new(pinfo->num,
                &pinfo->src, &pinfo->dst, protocol,
                src_port, dst_port, 0);

/* attach the data to the new conversation under our protocol ID */
        conversation_add_proto_data(conversation, proto, new_conv_info);

/* set the dissector for the new conversation */
        conversation_set_dissector(conversation, sub_dissector_handle);
    }
    ...

void
proto_register_PROTOABBREV(void)
{
    ...

    sub_dissector_handle = create_dissector_handle(sub_dissector,
        proto);

    ...
}

2.4 Dynamic server port dissector registration.

NOTE: While this example uses both NO_ADDR2 and NO_PORT2 to create a
conversation with only one port and address set, this isn't a
requirement. Either the second port or the second address can be set
when the conversation is created.

For protocols that define a server address and port for a secondary
protocol, a conversation can be used to link a protocol dissector to
the server port and address. The key is to create the new
conversation with the second address and port set to the "accept
any" values.

Some server applications can use the same port for different protocols during
different stages of a transaction. For example it might initially use SNMP
to perform some discovery and later switch to use TFTP using the same port.
In order to handle this properly we must first check whether such a
conversation already exists or not, and if it exists we also check whether the
registered dissector_handle for that conversation is "our" dissector or not.
If not, we create a new conversation on top of the previous one and set this new
conversation to use our protocol.
Since Wireshark keeps track of the frame number where a conversation started,
Wireshark will still be able to keep the packets apart even though they do use
the same socketpair.
        (See packet-tftp.c and packet-snmp.c for examples of this)

There are two support routines that will allow the second port and/or
address to be set later.

void conversation_set_port2(conversation_t *conv, const guint32 port);
void conversation_set_addr2(conversation_t *conv, const address *addr);

These routines will change the second address or port for the
conversation. So, the server port conversation will be converted into a
more complete conversation definition. Don't use these routines if you
want to create a conversation between the server and client and retain the
server port definition; in that case you must create a new conversation.


An example -

/* the handle for the dynamic dissector */
static dissector_handle_t sub_dissector_handle;

    ...

/* in the main protocol dissector, where the next dissector is setup */

/* if conversation has a data field, create it and load structure */

    new_conv_info = wmem_alloc(wmem_file_scope(), sizeof(struct _new_conv_info));
    new_conv_info->data1 = value1;

/* create the conversation for the dynamic server address and port      */
/* NOTE: The second address and port values don't matter because the    */
/* NO_ADDR2 and NO_PORT2 options are set.                               */

/* First check if a conversation already exists for this
    IP/protocol/port
*/
    conversation = find_conversation(pinfo->num,
                &server_src_addr, NULL, protocol,
                server_src_port, 0, NO_ADDR_B | NO_PORT_B);
/* If there is no such conversation, or if there is one but for
   someone else's protocol then we just create a new conversation
   and assign our protocol to it.
*/
    if ( (conversation == NULL) ||
         (conversation->dissector_handle != sub_dissector_handle) ) {
        conversation = conversation_new(pinfo->num,
                &server_src_addr, NULL, protocol,
                server_src_port, 0, NO_ADDR2 | NO_PORT2);

/* attach the data to the new conversation under our protocol ID */
        conversation_add_proto_data(conversation, proto, new_conv_info);

/* set the dissector for the new conversation */
        conversation_set_dissector(conversation, sub_dissector_handle);
    }

2.5 Per-packet information.

Information can be stored for each data packet that is processed by the
dissector. The information is added with the p_add_proto_data function and
retrieved with the p_get_proto_data function. The data pointers passed into
p_add_proto_data are not managed by the proto_data routines; however, the
data pointer memory scope must match that of the scope parameter.
The two most common use cases for p_add_proto_data/p_get_proto_data are for
persistent data about the packet for the lifetime of the capture (file scope)
and to exchange data between dissectors across a single packet (packet scope).
It is also used to provide packet data for the Decode As dialog (packet scope).

These functions are declared in <epan/proto_data.h>.

void
p_add_proto_data(wmem_allocator_t *scope, packet_info *pinfo, int proto, guint32 key, void *proto_data);
void *
p_get_proto_data(wmem_allocator_t *scope, packet_info *pinfo, int proto, guint32 key);

Where:
    scope      - Lifetime of the data to be stored, typically wmem_file_scope()
                 or pinfo->pool (packet scope). Must match scope of data
                 allocated.
    pinfo      - The packet info pointer.
    proto      - Protocol id returned by the proto_register_protocol call
                 during initialization
    key        - key associated with 'proto_data'
    proto_data - pointer to the dissector data.


2.6 User Preferences.

If the dissector has user options, there is support for adding these preferences
to a configuration dialog.

You must register the module with the preferences routine with -

       module_t *prefs_register_protocol(int proto_id, void (*apply_cb)(void));
       or
       module_t *prefs_register_protocol_subtree(const char *subtree, int id,
              void (*apply_cb)(void));


Where: proto_id   - the value returned by "proto_register_protocol()" when
                    the protocol was registered.
       apply_cb   - Callback routine that is called when preferences are
                    applied. It may be NULL, which inhibits the callback.
       subtree    - grouping preferences tree node name (several protocols can
                    be grouped under one preferences subtree)

Then you can register the fields that can be configured by the user with these
routines -

    /* Register a preference with an unsigned integral value. */
    void prefs_register_uint_preference(module_t *module, const char *name,
        const char *title, const char *description, guint base, guint *var);

    /* Register a preference with a Boolean value. */
    void prefs_register_bool_preference(module_t *module, const char *name,
        const char *title, const char *description, gboolean *var);

    /* Register a preference with an enumerated value. */
    void prefs_register_enum_preference(module_t *module, const char *name,
        const char *title, const char *description, gint *var,
        const enum_val_t *enumvals, gboolean radio_buttons);

    /* Register a preference with a character-string value. */
    void prefs_register_string_preference(module_t *module, const char *name,
        const char *title, const char *description, char **var);

    /* Register a preference with a file name (string) value.
    * File name preferences are basically like string preferences
    * except that the GUI gives the user the ability to browse for the
    * file. Set for_writing TRUE to show a Save dialog instead of normal Open.
    */
    void prefs_register_filename_preference(module_t *module, const char *name,
        const char *title, const char *description, char **var,
        gboolean for_writing);

    /* Register a preference with a range of unsigned integers (e.g.,
     * "1-20,30-40").
     */
    void prefs_register_range_preference(module_t *module, const char *name,
        const char *title, const char *description, range_t **var,
        guint32 max_value);

Where: module - Returned by the prefs_register_protocol routine
     name     - This is appended to the name of the protocol, with a
            "." between them, to construct a name that identifies
            the field in the preference file; the name itself
            should not include the protocol name, as the name in
            the preference file will already have it. Make sure that
            only lower-case ASCII letters, numbers, underscores and
            dots appear in the preference name.
     title    - Field title in the preferences dialog
     description - Comments added to the preference file above the
            preference value and shown as tooltip in the GUI, or NULL
     var      - pointer to the storage location that is updated when the
            field is changed in the preference dialog box. Note that
            with string preferences the given pointer is overwritten
            with a pointer to a new copy of the string during the
            preference registration. The passed-in string may be
            freed, but you must keep another pointer to the string
            in order to free it.
     base     - Base that the unsigned integer is expected to be in,
            see strtoul(3).
     enumvals - an array of enum_val_t structures. This must be
            NULL-terminated; the members of that structure are:

            a short name, to be used with the "-o" flag - it
            should not contain spaces or upper-case letters,
            so that it's easier to put in a command line;

            a description, which is used in the GUI (and
            which, for compatibility reasons, is currently
            what's written to the preferences file) - it can
            contain spaces, capital letters, punctuation,
            etc.;

            the numerical value corresponding to that name
            and description
     radio_buttons - TRUE if the field is to be displayed in the
            preferences dialog as a set of radio buttons,
            FALSE if it is to be displayed as an option
            menu
     max_value - The maximum allowed value for a range (0 is the minimum).

These functions are declared in <epan/prefs.h>.
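
The shape of an enumvals array, as described in the parameter list above,
can be sketched as follows. This is a stand-alone illustration:
demo_enum_val_t mirrors the three members listed for enum_val_t but is
defined locally so the block compiles on its own, and the "auto"/"strict"
entries and demo_lookup are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Local stand-in with the three members described for enum_val_t. */
typedef struct {
    const char *name;         /* short name, usable on a command line */
    const char *description;  /* text shown in the GUI */
    int         value;        /* numerical value stored in the variable */
} demo_enum_val_t;

enum { DEMO_MODE_AUTO = 0, DEMO_MODE_STRICT = 1 };

/* The array ends with an entry whose name is NULL. */
static const demo_enum_val_t demo_mode_vals[] = {
    { "auto",   "Automatic detection", DEMO_MODE_AUTO },
    { "strict", "Strict checking",     DEMO_MODE_STRICT },
    { NULL,     NULL,                  0 }
};

/* Map a short name to its value, stopping at the NULL terminator;
 * return 'fallback' if the name is not in the array. */
static int demo_lookup(const demo_enum_val_t *vals, const char *name,
                       int fallback)
{
    assert(vals != NULL && name != NULL);

    for (; vals->name != NULL; vals++) {
        if (strcmp(vals->name, name) == 0)
            return vals->value;
    }
    return fallback;
}
```

The terminating entry is what lets code walk the array without a separate
length argument, so leaving it out would make such a loop run past the end.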

An example from packet-rtpproxy.c -

    proto_rtpproxy = proto_register_protocol ( "Sippy RTPproxy Protocol", "RTPproxy", "rtpproxy");

    ...

    rtpproxy_module = prefs_register_protocol(proto_rtpproxy, proto_reg_handoff_rtpproxy);

    prefs_register_bool_preference(rtpproxy_module, "establish_conversation",
                                 "Establish Media Conversation",
                                 "Specifies that RTP/RTCP/T.38/MSRP/etc streams are decoded based "
                                 "upon port numbers found in RTPproxy answers",
                                 &rtpproxy_establish_conversation);

    prefs_register_uint_preference(rtpproxy_module, "reply.timeout",
                                 "RTPproxy reply timeout", /* Title */
                                 "Maximum timeout value in waiting for reply from RTPProxy (in milliseconds).", /* Descr */
                                 10,
                                 &rtpproxy_timeout);

This will create preferences "rtpproxy.establish_conversation" and
"rtpproxy.reply.timeout", the first of which is a Boolean and the
second of which is an unsigned integer.

Note that a warning will pop up if you've saved such a preference to the
preference file and you subsequently take the code out. The way to make
a preference obsolete is to register it as such:

    /* Register a preference that used to be supported but no longer is. */
    void prefs_register_obsolete_preference(module_t *module,
        const char *name);

2.7 Reassembly/desegmentation for protocols running atop TCP.

There are two main ways of reassembling a Protocol Data Unit (PDU) which
spans across multiple TCP segments. The first approach is simpler, but
assumes you are running atop of TCP when this occurs (but your dissector
might run atop of UDP, too, for example), and that your PDUs consist of a
fixed amount of data that includes enough information to determine the PDU
length, possibly followed by additional data. The second method is more
generic but requires more code and is less efficient.

2.7.1 Using tcp_dissect_pdus().

For the first method, you register two different dissection methods, one
for the TCP case, and one for the other cases. It is a good idea to
also have a dissect_PROTO_common function which will parse the generic
content that you can find in all PDUs which is called from
dissect_PROTO_tcp when the reassembly is complete and from
dissect_PROTO_udp (or dissect_PROTO_other).

To register the distinct dissector functions, consider the following
example, stolen from packet-hartip.c:

    #include "packet-tcp.h"

    dissector_handle_t hartip_tcp_handle;
    dissector_handle_t hartip_udp_handle;

    hartip_tcp_handle = create_dissector_handle(dissect_hartip_tcp, proto_hartip);
    hartip_udp_handle = create_dissector_handle(dissect_hartip_udp, proto_hartip);

    dissector_add_uint("udp.port", HARTIP_PORT, hartip_udp_handle);
    dissector_add_uint_with_preference("tcp.port", HARTIP_PORT, hartip_tcp_handle);

The dissect_hartip_udp function does very little work and calls
dissect_hartip_common, while dissect_hartip_tcp calls tcp_dissect_pdus with a
reference to a callback which will be called with reassembled data:

    static int
    dissect_hartip_tcp(tvbuff_t *tvb, packet_info *pinfo, proto_tree *tree,
                   void *data)
    {
        if (!tvb_bytes_exist(tvb, 0, HARTIP_HEADER_LENGTH))
            return 0;

        tcp_dissect_pdus(tvb, pinfo, tree, hartip_desegment, HARTIP_HEADER_LENGTH,
                   get_dissect_hartip_len, dissect_hartip_pdu, data);
        return tvb_reported_length(tvb);
    }

(The dissect_hartip_pdu function acts similarly to dissect_hartip_udp.)
The arguments to tcp_dissect_pdus are:

    the tvbuff pointer, packet_info pointer, and proto_tree pointer
    passed to the dissector;

    a gboolean flag indicating whether desegmentation is enabled for
    your protocol;

    the number of bytes of PDU data required to determine the length
    of the PDU;

    a routine that takes as arguments a packet_info pointer, a tvbuff
    pointer and an offset value representing the offset into the tvbuff
    at which a PDU begins, and a void pointer for user data, and should
    return the total length of the PDU in bytes (or 0 if more bytes are
    needed to determine the message length).
    The routine must not throw exceptions (it is guaranteed that the
    number of bytes specified by the previous argument to
    tcp_dissect_pdus is available, but more data might not be available,
    so don't refer to any data past that);

    a dissector_t routine to dissect the PDU that's passed a tvbuff
    pointer, packet_info pointer, proto_tree pointer and a void pointer for
    user data, with the tvbuff containing a possibly-reassembled PDU. (The
    "reported_length" of the tvbuff will be the length of the PDU);

    a void pointer to user data that is passed to the length-determining
    routine, and the dissector routine referenced in the previous parameter.
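
The contract of the length-determining routine can be shown on a plain byte
buffer. The sketch below is an illustration only and does not use the tvbuff
API: the 4-byte header layout (2 bytes of type, then a 2-byte big-endian
payload length) and the names demo_get_pdu_len and demo_split_pdus are
hypothetical. It reads nothing beyond the fixed header, as required above,
and returns the total PDU length including the header.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical header: 2 bytes type, 2 bytes big-endian payload length. */
#define DEMO_HEADER_LEN 4u

/* Read a big-endian 16-bit value from two bytes. */
static uint16_t demo_get_be16(const uint8_t *p)
{
    return (uint16_t)((p[0] << 8) | p[1]);
}

/* Total PDU length (header plus payload) for the PDU starting at 'offset'.
 * Only the DEMO_HEADER_LEN header bytes are examined. */
static unsigned demo_get_pdu_len(const uint8_t *buf, size_t offset)
{
    assert(buf != NULL);
    return DEMO_HEADER_LEN + demo_get_be16(buf + offset + 2);
}

/* Count the complete PDUs in buf[0..len) and report the offset at which
 * leftover bytes (an incomplete PDU, if any) begin. */
static size_t demo_split_pdus(const uint8_t *buf, size_t len,
                              size_t *leftover_offset)
{
    size_t offset = 0;
    size_t count = 0;

    assert(leftover_offset != NULL);
    while (len - offset >= DEMO_HEADER_LEN) {
        unsigned pdu_len = demo_get_pdu_len(buf, offset);

        if (pdu_len > len - offset)
            break;              /* PDU continues in a later segment */
        offset += pdu_len;
        count++;
    }
    *leftover_offset = offset;
    return count;
}
```

demo_split_pdus only mimics the bookkeeping that a helper such as
tcp_dissect_pdus takes off the dissector's hands: it walks complete PDUs and
stops at the first one that needs more data.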

2.7.2 Modifying the pinfo struct.

The second reassembly mode is preferred when the dissector cannot determine
how many bytes it will need to read in order to determine the size of a PDU.
It may also be useful if your dissector needs to support reassembly from
protocols other than TCP.

Your dissect_PROTO will initially be passed a tvbuff containing the payload of
the first packet. It should dissect as much data as it can, noting that it may
contain more than one complete PDU. If the end of the provided tvbuff coincides
with the end of a PDU then all is well and your dissector can just return as
normal. (If it is a new-style dissector, it should return the number of bytes
successfully processed.)

If the dissector discovers that the end of the tvbuff does /not/ coincide with
the end of a PDU (i.e., there is half of a PDU at the end of the tvbuff), it can
indicate this to the parent dissector by updating the pinfo struct. The
desegment_offset field is the offset in the tvbuff at which the dissector will
continue processing when next called. The desegment_len field should contain
the estimated number of additional bytes required for completing the PDU. Next
time your dissect_PROTO is called, it will be passed a tvbuff composed of the
end of the data from the previous tvbuff together with desegment_len more bytes.

If the dissector cannot tell how many more bytes it will need, it should set
desegment_len=DESEGMENT_ONE_MORE_SEGMENT; it will then be called again as soon
as any more data becomes available. Dissectors should set the desegment_len to a
reasonable value when possible rather than always setting
DESEGMENT_ONE_MORE_SEGMENT as it will generally be more efficient. Also, you
*must not* set desegment_len=1 in this case, in the hope that you can change
your mind later: once you set desegment_len to a positive value, your PDU
boundary is set in stone.

static hf_register_info hf[] = {
    {&hf_cstring,
     {"C String", "c.string", FT_STRING, BASE_NONE, NULL, 0x0,
      NULL, HFILL}
     }
   };

/**
*   Dissect a buffer containing ASCII C strings.
*
*   @param  tvb     The buffer to dissect.
*   @param  pinfo   Packet Info.
*   @param  tree    The protocol tree.
*   @param  data    Optional data parameter given by parent dissector.
**/
static int dissect_cstr(tvbuff_t * tvb, packet_info * pinfo, proto_tree * tree, void *data _U_)
{
    guint offset = 0;
    while(offset < tvb_reported_length(tvb)) {
        gint available = tvb_reported_length_remaining(tvb, offset);
        gint len = tvb_strnlen(tvb, offset, available);

        if( -1 == len ) {
            /* we ran out of data: ask for more */
            pinfo->desegment_offset = offset;
            pinfo->desegment_len = DESEGMENT_ONE_MORE_SEGMENT;
            return (offset + available);
        }

        col_set_str(pinfo->cinfo, COL_INFO, "C String");

        len += 1; /* Add one for the '\0' */

        if (tree) {
            proto_tree_add_item(tree, hf_cstring, tvb, offset, len,
                ENC_ASCII|ENC_NA);
        }
        offset += (guint)len;
    }

    /* if we get here, then the end of the tvb coincided with the end of a
       string. Happy days. */
    return tvb_captured_length(tvb);
}

This simple dissector will repeatedly return DESEGMENT_ONE_MORE_SEGMENT
requesting more data until the tvbuff contains a complete C string. The C string
will then be added to the protocol tree. Note that there may be more
than one complete C string in the tvbuff, so the dissection is done in a
loop.

2.8 Using udp_dissect_pdus().

As noted in section 2.7.1, TCP has an API to dissect its PDU that can handle
a PDU spread across multiple packets or multiple PDUs spread across a single
packet. This section describes a similar mechanism for UDP, but is only
applicable for one or more PDUs in a single packet. If a protocol runs on top
of TCP as well as UDP, a common PDU dissection function can be created for both.

To register the distinct dissector functions, consider the following
example using UDP and TCP dissection, stolen from packet-dnp.c:

    #include "packet-tcp.h"
    #include "packet-udp.h"

    dissector_handle_t dnp3_tcp_handle;
    dissector_handle_t dnp3_udp_handle;

    dnp3_tcp_handle = create_dissector_handle(dissect_dnp3_tcp, proto_dnp3);
    dnp3_udp_handle = create_dissector_handle(dissect_dnp3_udp, proto_dnp3);

    dissector_add_uint("tcp.port", TCP_PORT_DNP, dnp3_tcp_handle);
    dissector_add_uint("udp.port", UDP_PORT_DNP, dnp3_udp_handle);

Both dissect_dnp3_tcp and dissect_dnp3_udp call tcp_dissect_pdus and
udp_dissect_pdus respectively, with a reference to the same callbacks which
are called to handle PDU data.

    static int
    dissect_dnp3_udp(tvbuff_t *tvb, packet_info *pinfo, proto_tree *tree, void *data)
    {
        return udp_dissect_pdus(tvb, pinfo, tree, DNP_HDR_LEN, dnp3_udp_check_header,
                   get_dnp3_message_len, dissect_dnp3_message, data);
    }

    static int
    dissect_dnp3_tcp(tvbuff_t *tvb, packet_info *pinfo, proto_tree *tree, void *data)
    {
        if (!check_dnp3_header(tvb, FALSE)) {
            return 0;
        }

        tcp_dissect_pdus(tvb, pinfo, tree, TRUE, DNP_HDR_LEN,
                   get_dnp3_message_len, dissect_dnp3_message, data);

        return tvb_captured_length(tvb);
    }

(udp_dissect_pdus accepts an optional heuristic check routine, while
tcp_dissect_pdus does not, so the TCP dissector performs the check itself
before calling tcp_dissect_pdus.)

The arguments to udp_dissect_pdus are:

    the tvbuff pointer, packet_info pointer, and proto_tree pointer
    passed to the dissector;

    the number of bytes of PDU data required to determine the length
    of the PDU;

    an optional routine (passing NULL is okay) that takes as arguments a
    packet_info pointer, a tvbuff pointer and an offset value representing the
    offset into the tvbuff at which a PDU begins, and a void pointer for user
    data, and should return TRUE if the packet belongs to the dissector.
    The routine must not throw exceptions (it is guaranteed that the
    number of bytes specified by the previous argument to
    udp_dissect_pdus is available, but more data might not be available,
    so don't refer to any data past that);

    a routine that takes as arguments a packet_info pointer, a tvbuff
    pointer and an offset value representing the offset into the tvbuff
    at which a PDU begins, and a void pointer for user data, and should
    return the total length of the PDU in bytes. If the return value is 0,
    it's treated the same as a failed heuristic.
    The routine must not throw exceptions (it is guaranteed that the
    number of bytes specified by the fixed-length argument to
    udp_dissect_pdus is available, but more data might not be available,
    so don't refer to any data past that);

    a dissector_t routine to dissect the PDU that's passed a tvbuff
    pointer, packet_info pointer, proto_tree pointer and a void pointer for
    user data, with the tvbuff containing one PDU. (The
    "reported_length" of the tvbuff will be the length of the PDU);

    a void pointer to user data that is passed to the heuristic check
    routine, the length-determining routine, and the dissector routine
    referenced in the previous parameter.
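
How these arguments interact can be sketched with a standalone model. This is
not the Wireshark implementation and uses no Wireshark API: split_pdus() and
the toy_* callbacks are hypothetical, and the toy protocol (byte 0 is 0x42,
byte 1 is the total PDU length) is invented purely for illustration.

```c
#include <assert.h>
#include <stddef.h>

typedef int    (*check_fn)(const unsigned char *pdu, size_t avail);
typedef size_t (*len_fn)(const unsigned char *pdu, size_t avail);
typedef void   (*pdu_fn)(const unsigned char *pdu, size_t len, void *user);

/* Splits one datagram into PDUs; returns the number of bytes consumed. */
static size_t
split_pdus(const unsigned char *buf, size_t len, size_t fixed_len,
           check_fn check, len_fn get_len, pdu_fn handle, void *user)
{
    size_t offset = 0;

    assert(buf != NULL && get_len != NULL && handle != NULL);

    while (offset < len) {
        size_t avail = len - offset;
        size_t plen;

        /* The callbacks may only look at fixed_len bytes, so require them. */
        if (avail < fixed_len)
            break;
        /* Optional heuristic check; NULL means "accept". */
        if (check != NULL && !check(buf + offset, fixed_len))
            break;
        plen = get_len(buf + offset, fixed_len);
        /* 0 is treated like a failed heuristic; also reject bad lengths. */
        if (plen == 0 || plen < fixed_len || plen > avail)
            break;
        handle(buf + offset, plen, user);
        offset += plen;
    }
    return offset;
}

/* Toy protocol: byte 0 must be 0x42, byte 1 holds the total PDU length. */
static int    toy_check(const unsigned char *p, size_t n) { return n >= 2 && p[0] == 0x42; }
static size_t toy_len(const unsigned char *p, size_t n)   { return n >= 2 ? p[1] : 0; }
static void   toy_count(const unsigned char *p, size_t n, void *user)
{
    (void)p; (void)n;
    (*(int *)user)++;   /* user data is a PDU counter in this sketch */
}
```

The design point mirrored here is that the check and length callbacks are
only ever handed the fixed-length header, so they cannot read past data that
is known to be present.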

2.9 PINOs (Protocols in name only)

For the typical dissector there is a 1-1 relationship between it and its
protocol. However, there are times when a protocol needs multiple "names"
because it has multiple dissection functions going into the same dissector
table. The multiple names remove confusion when picking a dissection through
the Decode As functionality.

Once the "main" protocol name has been created through proto_register_protocol,
additional "pinos" can be created with proto_register_protocol_in_name_only.
These pinos have all of the naming conventions of a protocol, but are stored
separately so that they are not confused with real protocols. Pinos inherit the
main protocol's properties for things like enable/disable; for example, if the
"main" protocol has been disabled, all of its pinos will be disabled as well.
Pinos should not have any fields registered with them or heuristic tables
associated with them.

Another use case for pinos is when a protocol contains a TLV design and it
wants to create a dissector table to handle dissection of the "V". Dissector
tables require a "protocol", but the dissection functions for that table
typically aren't a protocol. In this case proto_register_protocol_in_name_only
creates the necessary placeholder for the dissector table. In addition, because
a dissector table exists, "V"s of the TLVs can be dissected outside of the
original dissector file.

2.10 Creating Decode As functionality.

While the Decode As functionality is available through the GUI, the underlying
functionality is controlled by dissectors themselves. To create Decode As
functionality for a dissector, two things are required:
    1. A dissector table
    2. A series of structures to assist the GUI in how to present the dissector
       table data.

Consider the following example using IP dissection, stolen from packet-ip.c:

    static build_valid_func ip_da_build_value[1] = {ip_value};
    static decode_as_value_t ip_da_values = {ip_prompt, 1, ip_da_build_value};
    static decode_as_t ip_da = {"ip", "ip.proto", 1, 0, &ip_da_values, NULL, NULL,
                              decode_as_default_populate_list, decode_as_default_reset, decode_as_default_change, NULL};
    ...
    ip_dissector_table = register_dissector_table("ip.proto", "IP protocol", ip_proto, FT_UINT8, BASE_DEC);
    ...
    register_decode_as(&ip_da);

ip_da_build_value contains all of the function pointers (typically just 1) that
can be used to retrieve the value(s) that go into the dissector table. This is
usually data saved by the dissector during packet dissection with an API like
p_add_proto_data and retrieved in the "value" function with p_get_proto_data.

ip_da_values contains all of the function pointers (typically just 1) that
provide the text explaining the name and use of the value field that will
be passed to the dissector table to change the dissection output.

ip_da pulls everything together including the dissector (protocol) name, the
"layer type" of the dissector, the dissector table name, the function pointer
values as well as handlers for populating, applying and resetting the changes
to the dissector table through Decode As GUI functionality. For dissector
tables that are an integer or string type, the provided "default" handling
functions shown in the example should suffice.

All entries into a dissector table that use Decode As must have a unique
protocol ID. If a protocol wants multiple entries into a dissector table,
a pino should be used (see section 2.9).

2.11 ptvcursors.

The ptvcursor API allows a simpler approach to writing dissectors for
simple protocols. The ptvcursor API works best for protocols whose fields
are static and whose format does not depend on the value of other fields.
However, even if only a portion of your protocol is statically defined,
then that portion could make use of ptvcursors.

The ptvcursor API lets you extract data from a tvbuff, and add it to a
protocol tree in one step. It also keeps track of the position in the
tvbuff so that you can extract data again without having to compute any
offsets --- hence the "cursor" name of the API.
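
The cursor idea itself is independent of Wireshark. A minimal standalone
sketch, assuming big-endian fields: byte_cursor and cursor_read_uint() are
hypothetical names that only imitate the "read and advance" behavior and are
not part of the ptvcursor API.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a ptvcursor: a buffer plus a current offset. */
typedef struct {
    const uint8_t *data;
    size_t         len;
    size_t         offset;
} byte_cursor;

/* Reads a big-endian unsigned field of 1 to 4 bytes and advances the cursor. */
static uint32_t
cursor_read_uint(byte_cursor *c, size_t length)
{
    uint32_t value = 0;
    size_t   i;

    assert(c != NULL && length >= 1 && length <= 4);
    assert(length <= c->len - c->offset);   /* stay inside the buffer */

    for (i = 0; i < length; i++)
        value = (value << 8) | c->data[c->offset + i];
    c->offset += length;   /* the next read starts where this one ended */
    return value;
}
```

Successive calls need only field lengths, never absolute offsets, which is
the convenience the cursor provides.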

The three steps for a simple protocol are:
    1. Create a new ptvcursor with ptvcursor_new()
    2. Add fields with multiple calls of ptvcursor_add()
    3. Delete the ptvcursor with ptvcursor_free()

ptvcursor can also add subtrees to the tree. This takes three steps:
    1. Create a new subtree with ptvcursor_push_subtree(). The old subtree is
       pushed onto a stack and the new subtree will be used by ptvcursor.
    2. Add fields with multiple calls of ptvcursor_add(). The fields will be
       added in the new subtree created at the previous step.
    3. Pop the previous subtree with ptvcursor_pop_subtree(). The previous
       subtree is again used by ptvcursor.
Note that at the end of the parsing of a packet you must have popped each
subtree you pushed. Otherwise, the dissector will generate an error.

To use the ptvcursor API, include the "ptvcursor.h" file. The PGM dissector
is an example of how to use it, but the API description here should be
enough to get started without studying that dissector.

2.11.1 ptvcursor API.

ptvcursor_t*
ptvcursor_new(proto_tree* tree, tvbuff_t* tvb, gint offset)
    This creates a new ptvcursor_t object for iterating over a tvbuff.
You must call this and use this ptvcursor_t object so you can use the
ptvcursor API.

proto_item*
ptvcursor_add(ptvcursor_t* ptvc, int hf, gint length, const guint encoding)
    This will extract 'length' bytes from the tvbuff and place them in
the proto_tree as field 'hf', which is a registered header_field. The
pointer to the proto_item that is created is passed back to you. Internally,
the ptvcursor advances its cursor so the next call to ptvcursor_add
starts where this call finished. The 'encoding' parameter is relevant for
certain types of fields (see above under proto_tree_add_item()).

proto_item*
ptvcursor_add_ret_uint(ptvcursor_t* ptvc, int hf, gint length, const guint encoding, guint32 *retval);
    Like ptvcursor_add, but also stores the retrieved uint value in *retval.

proto_item*
ptvcursor_add_ret_int(ptvcursor_t* ptvc, int hf, gint length, const guint encoding, gint32 *retval);
    Like ptvcursor_add, but also stores the retrieved int value in *retval.

proto_item*
ptvcursor_add_ret_string(ptvcursor_t* ptvc, int hf, gint length, const guint encoding, wmem_allocator_t *scope, const guint8 **retval);
    Like ptvcursor_add, but also stores the retrieved string in *retval.

proto_item*
ptvcursor_add_ret_boolean(ptvcursor_t* ptvc, int hf, gint length, const guint encoding, gboolean *retval);
    Like ptvcursor_add, but also stores the retrieved boolean value in *retval.

proto_item*
ptvcursor_add_no_advance(ptvcursor_t* ptvc, int hf, gint length, const guint encoding)
    Like ptvcursor_add, but does not advance the internal cursor.

void
ptvcursor_advance(ptvcursor_t* ptvc, gint length)
    Advances the internal cursor without adding anything to the proto_tree.

void
ptvcursor_free(ptvcursor_t* ptvc)
    Frees the memory associated with the ptvcursor. You must call this
after your dissection with the ptvcursor API is completed.


proto_tree*
ptvcursor_push_subtree(ptvcursor_t* ptvc, proto_item* it, gint ett_subtree)
    Pushes the current subtree in the tree stack of the cursor, creates a new
one and sets this one as the working tree.

void
ptvcursor_pop_subtree(ptvcursor_t* ptvc);
    Pops a subtree in the tree stack of the cursor.

proto_tree*
ptvcursor_add_with_subtree(ptvcursor_t* ptvc, int hfindex, gint length,
                            const guint encoding, gint ett_subtree);
    Adds an item to the tree and creates a subtree.
If the length is unknown, length may be defined as SUBTREE_UNDEFINED_LENGTH.
In this case, at the next pop, the item length will be equal to the advancement
of the cursor since the creation of the subtree.

proto_tree*
ptvcursor_add_text_with_subtree(ptvcursor_t* ptvc, gint length,
                                gint ett_subtree, const char* format, ...);
    Add a text node to the tree and create a subtree.
If the length is unknown, length may be defined as SUBTREE_UNDEFINED_LENGTH.
In this case, at the next pop, the item length will be equal to the advancement
of the cursor since the creation of the subtree.
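
The push/pop discipline and the deferred length can be illustrated with a
standalone model. Everything below is hypothetical (cursor_model, model_push,
model_pop, UNDEFINED_LENGTH); it imitates the documented behavior and says
nothing about how ptvcursor is implemented internally.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_DEPTH        8
#define UNDEFINED_LENGTH (-1)   /* plays the role of SUBTREE_UNDEFINED_LENGTH */

typedef struct {
    size_t start;    /* cursor offset when the subtree was pushed */
    long   length;   /* item length, or UNDEFINED_LENGTH until popped */
} subtree_frame;

typedef struct {
    size_t        offset;             /* current cursor position */
    subtree_frame stack[MAX_DEPTH];   /* subtrees pushed and not yet popped */
    int           depth;
} cursor_model;

/* Opens a subtree at the current offset with a known or undefined length. */
static void
model_push(cursor_model *c, long length)
{
    assert(c != NULL && c->depth < MAX_DEPTH);
    c->stack[c->depth].start  = c->offset;
    c->stack[c->depth].length = length;
    c->depth++;
}

/* Closes the innermost subtree and returns its final length. */
static long
model_pop(cursor_model *c)
{
    subtree_frame *f;

    assert(c != NULL && c->depth > 0);   /* every pop needs a matching push */
    f = &c->stack[--c->depth];
    if (f->length == UNDEFINED_LENGTH) {
        /* Resolve the length from how far the cursor advanced since the push. */
        f->length = (long)(c->offset - f->start);
    }
    return f->length;
}
```

A depth of zero at the end of a packet corresponds to the rule that every
pushed subtree must have been popped.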

2.11.2 Miscellaneous functions.

tvbuff_t*
ptvcursor_tvbuff(ptvcursor_t* ptvc)
    Returns the tvbuff associated with the ptvcursor.

gint
ptvcursor_current_offset(ptvcursor_t* ptvc)
    Returns the current offset.

proto_tree*
ptvcursor_tree(ptvcursor_t* ptvc)
    Returns the proto_tree associated with the ptvcursor.

void
ptvcursor_set_tree(ptvcursor_t* ptvc, proto_tree *tree)
    Sets a new proto_tree for the ptvcursor.

proto_tree*
ptvcursor_set_subtree(ptvcursor_t* ptvc, proto_item* it, gint ett_subtree);
    Creates a subtree and adds it to the cursor as the working tree but does
not save the old working tree.

2.12 Optimizations

A protocol dissector may be called in 2 different ways - with, or
without a non-null "tree" argument.

If the proto_tree argument is null, Wireshark does not need to use
the protocol tree information from your dissector, and therefore is
passing the dissector a null "tree" argument so that it doesn't
need to do work necessary to build the protocol tree.

In the interest of speed, if "tree" is NULL, avoid building a
protocol tree and adding stuff to it, or even looking at any packet
data needed only if you're building the protocol tree, if possible.

Note, however, that you must fill in column information, create
conversations, reassemble packets, do calls to "expert" functions,
build any other persistent state needed for dissection, and call
subdissectors regardless of whether "tree" is NULL or not.

This might be inconvenient to do without doing most of the dissection
work. The routines for adding items to the protocol tree can be passed
a null protocol tree pointer, in which case they return a null item
pointer, and "proto_item_add_subtree()" returns a null tree pointer if
passed a null item pointer. So, if you're careful not to dereference
any null tree or item pointers, you can simply do all the dissection
work. This might not be as efficient as skipping that work when you're
not building a protocol tree, but if skipping it would require a lot of
tests for whether "tree" is null, you might still be better off doing
all that work regardless of whether "tree" is null or not.

Note also that there is no guarantee, the first time the dissector is
called, whether "tree" will be null or not; your dissector must work
correctly, building or updating whatever state information is
necessary, in either case.
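
The split between work that must always happen and work that is only needed
for the tree can be modeled in a few lines. This is a standalone sketch with
hypothetical types (toy_tree, toy_state) and a hypothetical toy_dissect()
function; it does not use the Wireshark API.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins: a "tree" that counts items, and persistent state. */
typedef struct { int items; } toy_tree;
typedef struct { unsigned packets_seen; unsigned last_type; } toy_state;

static void
toy_dissect(const unsigned char *pkt, size_t len, toy_tree *tree, toy_state *state)
{
    assert(pkt != NULL && state != NULL && len >= 1);

    /* Always done: persistent state must not depend on whether tree is NULL. */
    state->packets_seen++;
    state->last_type = pkt[0];

    /* Only done when a tree was requested: purely presentational work. */
    if (tree != NULL) {
        tree->items += (int)len;   /* pretend each byte becomes one tree item */
    }
}
```

Calling toy_dissect() first with a NULL tree and then with a real one leaves
the state identical to what two calls with a real tree would have produced.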

/*
 * Editor modelines  -  https://www.wireshark.org/tools/modelines.html
 *
 * Local variables:
 * c-basic-offset: 4
 * tab-width: 8
 * indent-tabs-mode: nil
 * End:
 *
 * vi: set shiftwidth=4 tabstop=8 expandtab:
 * :indentSize=4:tabSize=8:noTabs=true:
 */


README.heuristic

This file is a HOWTO for Wireshark developers. It describes how Wireshark
heuristic protocol dissectors work and how to write them.

This file is compiled to give in depth information on Wireshark.
It is by no means all inclusive and complete. Please feel free to send
remarks and patches to the developer mailing list.


Prerequisites
-------------
As this file is an addition to README.dissector, it is essential to read
and understand that document first.


Why heuristic dissectors?
-------------------------
When Wireshark "receives" a packet, it has to find the right dissector to
start decoding the packet data. Often this can be done by known conventions,
e.g. the Ethernet type 0x0800 means "IP on top of Ethernet" - an easy and
reliable match for Wireshark.

Unfortunately, these conventions are not always available, or (accidentally
or knowingly) some protocols don't care about those conventions and "reuse"
existing "magic numbers / tokens".

For example, TCP port 80 is assigned to HTTP traffic. But this
convention doesn't prevent anyone from using TCP port 80 for some different
protocol, or on the other hand using HTTP on a port number other than 80.

To deal with these problems, Wireshark introduced the so-called heuristic
dissector mechanism.


How does Wireshark use heuristic dissectors?
--------------------------------------------
When Wireshark starts, heuristic dissectors (HD) register themselves slightly
differently from "normal" dissectors, e.g. a HD can ask for any TCP packet, as
it *may* contain interesting packet data for this dissector. In reality more
than one HD will exist for e.g. TCP packet data.

So if Wireshark has to decode TCP packet data, it will first try to find a
dissector registered directly for the TCP port used in that packet. If it
finds such a registered dissector it will just hand over the packet data to it.

In case there is no such "normal" dissector, WS will hand over the packet data
to the first matching HD. Now the HD will look into the data and decide if that
data looks like something the dissector "is interested in". The return value
signals WS if the HD processed the data (so WS can stop working on that packet)
or if the heuristic didn't match (so WS tries the next HD until one matches -
or the data simply can't be processed).

Note that it is possible to configure WS through preference settings so that it
hands off a packet to the heuristic dissectors before the "normal" dissectors
are called. This allows the HD the chance to receive packets and process them
differently than they otherwise would be. Of course if no HD is interested in
the packet, then the packet will ultimately get handed off to the "normal"
dissector as if the HD wasn't involved at all. As of this writing, the DCCP,
SCTP, TCP, TIPC and UDP dissectors all provide this capability via their
"Try heuristic sub-dissectors first" preference, but none of them have this
option enabled by default.

Once a packet for a particular "connection" has been identified as belonging
to a particular protocol, Wireshark must then be set up to always directly
call the dissector for that protocol. This removes the overhead of having
to identify each packet of the connection heuristically.
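
The "try each heuristic in turn, then remember the winner" behavior can be
modeled without Wireshark. The sketch below is standalone and hypothetical:
toy_conversation, toy_dispatch() and the heur_a/heur_b checks are invented
for illustration and are not Wireshark data structures or functions.

```c
#include <assert.h>
#include <stddef.h>

typedef int (*heur_fn)(const unsigned char *data, size_t len);  /* 1 = "mine" */

/* Hypothetical per-connection record; NULL means "not identified yet". */
typedef struct { heur_fn pinned; } toy_conversation;

/*
 * Returns the handler that processed the data, or NULL if nobody claimed it.
 * *tries counts how many heuristic checks were run.
 */
static heur_fn
toy_dispatch(toy_conversation *conv, const heur_fn *list, size_t n,
             const unsigned char *data, size_t len, unsigned *tries)
{
    size_t i;

    assert(conv != NULL && list != NULL && tries != NULL);

    if (conv->pinned != NULL) {
        /* Already identified: call directly, without trying the list. */
        conv->pinned(data, len);
        return conv->pinned;
    }
    for (i = 0; i < n; i++) {
        (*tries)++;
        if (list[i](data, len)) {
            conv->pinned = list[i];   /* remember for later packets */
            return list[i];
        }
    }
    return NULL;
}

/* Two toy heuristics keyed on the first byte. */
static int heur_a(const unsigned char *d, size_t n) { return n > 0 && d[0] == 0x41; }
static int heur_b(const unsigned char *d, size_t n) { return n > 0 && d[0] == 0x42; }
```

After the first packet is claimed, later packets of the same connection reach
the same handler without any further heuristic checks.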


How do these heuristics work?
-----------------------------
It's difficult to give a general answer here. The usual heuristic works as follows:

A HD looks into the first few packet bytes and searches for common patterns that
are specific to the protocol in question. Most protocols start with a
specific header, so a specific pattern may look like (synthetic example):

1) first byte must be 0x42
2) second byte is a type field and can only contain values between 0x20 - 0x33
3) third byte is a flag field, where the lower 4 bits always contain the value 0
4) fourth and fifth bytes contain a 16 bit length field, where the value can't
   be larger than 10000 bytes

So the heuristic dissector will check incoming packet data for all four of
the above conditions. Only if all four conditions are true is there a
good chance that the packet really contains the expected protocol - and the
dissector continues to decode the packet data. If one condition fails, it's
almost certainly not the protocol in question and the dissector immediately
tells WS "this is not my protocol" - maybe some other heuristic dissector
is interested!

Obviously, this is *not* 100% bullet proof, but it's the best WS can offer to
its users here - and improving the heuristic is always possible if it turns out
that it's not good enough to distinguish between two given protocols.

Note: The heuristic code in a dissector *must not* cause an exception
      (before returning FALSE) as this will prevent following
      heuristic dissector handoffs. In practice, this normally means
      that a test must be done to verify that the required data is
      available in the tvb before fetching from the tvb. (See the
      example below).


Heuristic Code Example
----------------------
You can find a lot of code examples in the Wireshark sources, e.g.:
grep -l heur_dissector_add epan/dissectors/*.c
returns 177 files (October 2015).

For the above example criteria, the following code example might do the work
(combine this with the dissector skeleton in README.developer):

XXX - please note: The following code examples were not tried in reality,
please report problems to the dev-list!

--------------------------------------------------------------------------------------------

static dissector_handle_t PROTOABBREV_tcp_handle;
static dissector_handle_t PROTOABBREV_pdu_handle;

/* Heuristics test */
static gboolean
test_PROTOABBREV(packet_info *pinfo _U_, tvbuff_t *tvb, int offset, void *data _U_)
{
    /* 0) Verify needed bytes available in tvb so tvb_get...() doesn't cause exception. */
    if (tvb_captured_length_remaining(tvb, offset) < 5)
        return FALSE;

    /* 1) first byte must be 0x42 */
    if ( tvb_get_guint8(tvb, offset) != 0x42 )
        return FALSE;

    /* 2) second byte is a type field and only can contain values between 0x20-0x33 */
    if ( tvb_get_guint8(tvb, offset+1) < 0x20 || tvb_get_guint8(tvb, offset+1) > 0x33 )
        return FALSE;

    /* 3) third byte is a flag field, where the lower 4 bits always contain the value 0 */
    if ( tvb_get_guint8(tvb, offset+2) & 0x0f )
        return FALSE;

    /* 4) fourth and fifth bytes contain a 16 bit length field, where the value can't be longer than 10000 bytes */
    /* Assumes network byte order */
    if ( tvb_get_ntohs(tvb, offset+3) > 10000 )
        return FALSE;

    /* Assume it's your packet ... */
    return TRUE;
}

/* Dissect the complete PROTOABBREV pdu */
static int
dissect_PROTOABBREV_pdu(tvbuff_t *tvb, packet_info *pinfo, proto_tree *tree, void *data _U_)
{
    /* Dissection ... */

    return tvb_reported_length(tvb);
}

/* For tcp_dissect_pdus() */
static guint
get_PROTOABBREV_len(packet_info *pinfo _U_, tvbuff_t *tvb, int offset, void *data _U_)
{
    return (guint) tvb_get_ntohs(tvb, offset+3);
}

static int
dissect_PROTOABBREV_tcp(tvbuff_t *tvb, packet_info *pinfo, proto_tree *tree, void *data)
{
    tcp_dissect_pdus(tvb, pinfo, tree, TRUE, 5,
                     get_PROTOABBREV_len, dissect_PROTOABBREV_pdu, data);
    return tvb_reported_length(tvb);
}

static gboolean
dissect_PROTOABBREV_heur_tcp(tvbuff_t *tvb, packet_info *pinfo, proto_tree *tree, void *data)
{
    conversation_t *conversation;

    if (!test_PROTOABBREV(pinfo, tvb, 0, data))
        return FALSE;

    /* specify that dissect_PROTOABBREV is to be called directly from now on for
     * packets for this "connection" ... but only do this if your heuristic sits directly
     * on top of (was called by) a dissector which established a conversation for the
     * protocol "port type". In other words: only directly over TCP, UDP, DCCP, ...
     * otherwise you'll be overriding the dissector that called your heuristic dissector.
     */
    conversation = find_or_create_conversation(pinfo);
    conversation_set_dissector(conversation, PROTOABBREV_tcp_handle);

    /*   and do the dissection */
    dissect_PROTOABBREV_tcp(tvb, pinfo, tree, data);

    return (TRUE);
}

static int
dissect_PROTOABBREV_udp(tvbuff_t *tvb, packet_info *pinfo, proto_tree *tree, void *data)
{
    /* udp_dissect_pdus() takes no desegment flag; see README.dissector section 2.8 */
    udp_dissect_pdus(tvb, pinfo, tree, 5, NULL,
                     get_PROTOABBREV_len, dissect_PROTOABBREV_pdu, data);
    return tvb_reported_length(tvb);
}

static gboolean
dissect_PROTOABBREV_heur_udp(tvbuff_t *tvb, packet_info *pinfo, proto_tree *tree, void *data)
{
...
    /*   and do the dissection */
    return (udp_dissect_pdus(tvb, pinfo, tree, 5, test_PROTOABBREV,
                     get_PROTOABBREV_len, dissect_PROTOABBREV_pdu, data) != 0);
}

void
proto_reg_handoff_PROTOABBREV(void)
{
    PROTOABBREV_tcp_handle = create_dissector_handle(dissect_PROTOABBREV_tcp,
                                                     proto_PROTOABBREV);
    PROTOABBREV_pdu_handle = create_dissector_handle(dissect_PROTOABBREV_pdu,
                                                     proto_PROTOABBREV);

    /* register as heuristic dissector for both TCP and UDP */
    heur_dissector_add("tcp", dissect_PROTOABBREV_heur_tcp, "PROTOABBREV over TCP",
                       "PROTOABBREV_tcp", proto_PROTOABBREV, HEURISTIC_ENABLE);
    heur_dissector_add("udp", dissect_PROTOABBREV_heur_udp, "PROTOABBREV over UDP",
                       "PROTOABBREV_udp", proto_PROTOABBREV, HEURISTIC_ENABLE);

#ifdef OPTIONAL
    /* It's possible to write a dissector to be a dual heuristic/normal dissector */
    /*  by also registering the dissector "normally".                             */
    dissector_add_uint("ip.proto", IP_PROTO_PROTOABBREV, PROTOABBREV_pdu_handle);
#endif
}


Please note that registering a heuristic dissector is only possible for a
small variety of protocols. In most cases a heuristic is not needed, and
adding the support would only add unused code to the dissector.

TCP and UDP are prominent examples that support HDs, as there seems to be a
tendency to re-use known port numbers for new protocols. But TCP and UDP are
not the only dissectors that provide support for HDs.  You can find more
examples by searching the Wireshark sources as follows:
grep -l register_heur_dissector_list epan/dissectors/packet-*.c


README.idl2wrs

Copyright (C) 2001 Frank Singleton <frank.singleton@ericsson.com>


What is it ?
============

As you have probably guessed from the name, "idl2wrs" takes a
user specified IDL file and attempts to build a dissector that
can decode the IDL traffic over GIOP. The resulting file is
"C" code that should compile okay as a Wireshark dissector.

idl2wrs basically parses the data struct given to it by
the omniidl compiler, and using the GIOP API available in packet-giop.[ch],
generates get_CDR_xxx calls to decode the CORBA traffic on the wire.

It consists of 4 main files.

README.idl2wrs     - This document
wireshark_be.py    - The main compiler backend
wireshark_gen.py   - A helper class that generates the C code.
idl2wrs            - A simple shell script wrapper that the end user should
                     use to generate the dissector from the IDL file(s).

Why did you do this ?
=====================

It is important to understand what CORBA traffic looks
like over GIOP/IIOP, and to help build a tool that can assist
in troubleshooting CORBA interworking. This was especially the
case after seeing a lot of discussions about how particular
IDL types are represented inside an octet stream.

I have also had comments/feedback that this tool would be good for say
a CORBA class when teaching students what CORBA traffic looks like
"on the wire".

It is also COOL to work on a great Open Source project such as
"Wireshark" (https://www.wireshark.org)


How to use idl2wrs
==================

To use the idl2wrs to generate Wireshark dissectors, you
need the following.


1. Python must be installed
   http://python.org/

2. omniidl from the omniORB package must be available.
   http://omniorb.sourceforge.net/

3. Of course you need Wireshark installed to compile the
   code and tweak it if required. idl2wrs is part of the
   standard Wireshark distribution.


Procedure
=========

1.  To write the C code to stdout.

    idl2wrs  <your_file.idl>

    eg: idl2wrs echo.idl


2. To write to a file, just redirect the output.

    idl2wrs echo.idl > packet-test-idl.c

   You may wish to comment out the register_giop_user_module() code
   and that will leave you with heuristic dissection.


If you don't want to use the shell script wrapper, then try
steps 3 or 4 instead.

3.  To write the C code to stdout.

    Usage: omniidl  -p ./ -b wireshark_be <your_file.idl>

    eg: omniidl  -p ./ -b wireshark_be echo.idl


4. To write to a file, just redirect the output.

    omniidl  -p ./ -b wireshark_be echo.idl > packet-test-idl.c

   You may wish to comment out the register_giop_user_module() code
   and that will leave you with heuristic dissection.


5. Copy the resulting C code to your Wireshark src directory, edit the
   following file to include the packet-test-idl.c

   cp packet-test-idl.c /dir/where/wireshark/lives/epan/dissectors/
   cp /dir/where/wireshark/lives/epan/dissectors/CMakeLists.txt.example \
     /dir/where/wireshark/lives/epan/dissectors/CMakeLists.txt
   nano /dir/where/wireshark/lives/epan/dissectors/CMakeLists.txt


6. Run CMake

   cmake /dir/where/wireshark/lives


7. Compile the code

   make


8. Good Luck !!


TODO
====

1. Exception code not generated  (yet), but can be added manually.
2. Enums not converted to symbolic values (yet), but can be added manually.
3. Add command line options, etc.
4. More I am sure :-)


Limitations
===========

See TODO list inside packet-giop.c


Notes
=====

1. The "-p ./" option passed to omniidl indicates that the wireshark_be.py
   and wireshark_gen.py are residing in the current directory. This may need
   tweaking if you place these files somewhere else.

2. If it complains about being unable to find some modules (eg tempfile.py),
   you may want to check if PYTHONPATH is set correctly.
   On my Linux box, it is  PYTHONPATH=/usr/lib/python1.5/

Frank Singleton.



README.plugins

0. Plugins

There are a multitude of plugin options available in Wireshark that allow you
to extend its functionality without changing the source code itself.  Using the
available APIs gives you the means to do this.

Currently plugin APIs are available for dissectors (epan), capture file types
(wiretap) and media decoders (codecs).  This README focuses primarily on
dissector plugins; most of the descriptions are applicable to the other plugin
types as well.

1. Dissector plugins

Writing a "plugin" dissector is not very different from writing a standard
one.  In fact all of the functions described in README.dissector can be
used in the plugins exactly as they are used in standard dissectors.

(Note, however, that not all OSes on which Wireshark runs can support
plugins.)

If you've chosen "foo" as the name of your plugin (typically, that would
be a short name for your protocol, in all lower case), the following
instructions tell you how to implement it as a plugin.  All occurrences
of "foo" below should be replaced by the name of your plugin.

2. The directory for the plugin, and its files

The plugin should be placed in a new plugins/epan/foo directory which should
contain at least the following files:

CMakeLists.txt
README

The README can be brief but it should provide essential information relevant
to developers and users. Optionally AUTHORS and ChangeLog files can be added.
Optionally you can add your own plugin.rc.in.

And of course the source and header files for your dissector.

Examples of these files can be found in plugins/epan/gryphon.

2.1 CMakeLists.txt

For your plugins/epan/foo/CMakeLists.txt file, see the corresponding file in
plugins/epan/gryphon.  Replace all occurrences of "gryphon" in those files
with "foo" and add your source files to the DISSECTOR_SRC variable.

2.2 plugin.rc.in

Your plugins/epan/foo/plugin.rc.in is the Windows resource template file used
to add the plugin specific information as resources to the DLL.
If not provided the plugins/plugin.rc.in file will be used.

543. Changes to existing Wireshark files
55
56There are two ways to add your plugin dissector to the build, as a custom
57extension or as a permanent addition.  The custom extension is easy to
58configure, but won't be used for inclusion in the distribution if that's
59your goal.  Setting up the permanent addition is somewhat more involved.
60
613.1 Custom extension
62
63For CMake builds, either pass the custom plugin dir on the CMake generation
64step command line:
65
cmake ... -DCUSTOM_PLUGIN_SRC_DIR="plugins/epan/foo"
67
68or copy the top-level file CMakeListsCustom.txt.example to CMakeListsCustom.txt
69(also in the top-level source dir) and edit so that CUSTOM_PLUGIN_SRC_DIR is
70set() to the relative path of your plugin, e.g.
71
72set(CUSTOM_PLUGIN_SRC_DIR plugins/epan/foo)
73
74and re-run the CMake generation step.
75
76To build the plugin, run your normal Wireshark build step.
77
78If you want to add the plugin to your own Windows installer add a text
79file named custom_plugins.txt to the packaging/nsis directory, with a
80"File" statement for NSIS:
81
82File "${STAGING_DIR}\plugins\${VERSION_MAJOR}.${VERSION_MINOR}\epan\foo.dll"
83
843.2 Permanent addition
85
86In order to be able to permanently add a plugin take the following steps.
87You will need to change the following files:
88	CMakeLists.txt
89	packaging/nsis/wireshark.nsi
90
91You might also want to search your Wireshark development directory for
92occurrences of an existing plugin name, in case this document is out of
93date with the current directory structure.  For example,
94
95	grep -rl gryphon .
96
97could be used from a shell prompt.
98
993.2.1  Changes to CMakeLists.txt
100
101Add your plugin (in alphabetical order) to the PLUGIN_SRC_DIRS:
102
103if(ENABLE_PLUGINS)
104        ...
105        set(PLUGIN_SRC_DIRS
106                ...
107                plugins/epan/ethercat
108                plugins/epan/foo
109                plugins/epan/gryphon
110                plugins/epan/irda
111                ...
112
1133.2.2  Changes to the installers
114
115If you want to include your plugin in an installer you have to add lines
116in the NSIS installer wireshark.nsi file.
117
1183.2.2.1  Changes to packaging/nsis/wireshark.nsi
119
120Add the relative path of your plugin DLL (in alphabetical order) to the
121list of "File" statements in the "Dissector Plugins" section:
122
123File "${STAGING_DIR}\plugins\${VERSION_MAJOR}.${VERSION_MINOR}\epan\ethercat.dll"
124File "${STAGING_DIR}\plugins\${VERSION_MAJOR}.${VERSION_MINOR}\epan\foo.dll"
125File "${STAGING_DIR}\plugins\${VERSION_MAJOR}.${VERSION_MINOR}\epan\gryphon.dll"
126File "${STAGING_DIR}\plugins\${VERSION_MAJOR}.${VERSION_MINOR}\epan\irda.dll"
127
1283.2.2.2  Other installers
129
130The PortableApps installer copies plugins from the build directory
131and should not require configuration.
132
1334. Development and plugins on Unix
134
135Plugins make some aspects of development easier and some harder.
136
The first thing is that you'll have to run cmake once more to set up your
build environment.
139
140The good news is that if you are working on a single plugin then you will
141find recompiling the plugin MUCH faster than recompiling a dissector and
142then linking it back into Wireshark. Use "make plugins" to compile just
143your plugins.
144
The bad news is that Wireshark will not use the plugins unless the plugins
are installed in one of the places it expects to find them.
147
148One way of dealing with this problem is to set an environment variable
149when running Wireshark: WIRESHARK_RUN_FROM_BUILD_DIRECTORY=1.
150
Another way to deal with this problem is to set up a working root for
Wireshark, say in $HOME/build/root, and build Wireshark to install
there:
154
155cmake -D CMAKE_INSTALL_PREFIX=${HOME}/build/root && make install
156
157then subsequent rebuilds/installs of your plugin can be accomplished
158by going to the plugins/foo directory and running
159
160make install
161
1625. Update "old style" plugins
163
1645.1 How to update an "old style" plugin (since Wireshark 2.5)
165
Plugins need exactly four visible symbols: plugin_version, plugin_want_major,
plugin_want_minor and plugin_register. Each plugin is either a codec plugin,
a libwiretap plugin or a libwireshark plugin, and the library will call
"plugin_register" after loading the plugin. "plugin_register" in turn calls all
the hooks necessary to enable the plugin. So if you had two functions like so:
171
172    WS_DLL_PUBLIC void plugin_register(void);
173    WS_DLL_PUBLIC void plugin_reg_handoff(void);
174
175    void plugin_register(void) {...};
176    void plugin_reg_handoff(void) {...};
177
178You'll have to rewrite it as:
179
180    WS_DLL_PUBLIC void plugin_register(void);
181
182    static void proto_register_foo(void) {...};
183    static void proto_reg_handoff_foo(void) {...};
184
185    void plugin_register(void)
186    {
187	static proto_plugin plugin_foo;
188
189	plugin_foo.register_protoinfo = proto_register_foo;
190	plugin_foo.register_handoff = proto_reg_handoff_foo;
191	proto_register_plugin(&plugin_foo);
192    }
193
194See doc/plugins.example for an example.
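
The reason for bundling both hooks into one structure can be illustrated
with a small standalone model.  This is only a sketch and uses no Wireshark
headers: model_plugin_t, model_register_plugin() and model_host_run() are
hypothetical stand-ins for proto_plugin, proto_register_plugin() and the
host library, and the assumption is that the host runs every registration
hook before any handoff hook.

```c
/* Standalone model of two-phase plugin registration; not Wireshark code. */
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Stand-in for proto_plugin: two hooks supplied by the plugin. */
typedef struct {
    void (*register_protoinfo)(void);
    void (*register_handoff)(void);
} model_plugin_t;

#define MAX_PLUGINS 8
static const model_plugin_t *registered[MAX_PLUGINS];
static size_t n_registered;

/* Records the order in which hooks ran, one letter per call. */
static char call_log[16];

static void log_call(char c)
{
    size_t n = strlen(call_log);

    if (n + 1 < sizeof(call_log)) {
        call_log[n] = c;
        call_log[n + 1] = '\0';
    }
}

/* Stand-in for proto_register_plugin(): remember the hooks for later. */
void model_register_plugin(const model_plugin_t *plug)
{
    assert(plug != NULL);
    if (n_registered < MAX_PLUGINS)
        registered[n_registered++] = plug;
}

/* Plugin side, mirroring the rewritten plugin_register() above. */
static void proto_register_foo(void)    { log_call('R'); }
static void proto_reg_handoff_foo(void) { log_call('H'); }

void plugin_register(void)
{
    static model_plugin_t plugin_foo;

    plugin_foo.register_protoinfo = proto_register_foo;
    plugin_foo.register_handoff = proto_reg_handoff_foo;
    model_register_plugin(&plugin_foo);
}

/* Host side (assumed order): all registration hooks, then all handoffs. */
const char *model_host_run(void)
{
    size_t i;

    for (i = 0; i < n_registered; i++)
        if (registered[i]->register_protoinfo)
            registered[i]->register_protoinfo();
    for (i = 0; i < n_registered; i++)
        if (registered[i]->register_handoff)
            registered[i]->register_handoff();
    return call_log;
}
```

In this model the plugin only hands over function pointers; it never decides
when they are called.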
195
1965.2 How to update an "old style" plugin (using plugin_register and
197    plugin_reg_handoff functions).
198
The plugin registration has changed with the extension of the build
scripts. These now generate the additional code needed for plugin
encapsulation in plugin.c. When using the new-style build scripts,
strip the parts outlined below:
203
204    o Remove the following include statements:
205
206        #include <gmodule.h>
207        #include "moduleinfo.h"
208
    o Remove the definition:
210
211        #ifndef ENABLE_STATIC
212        WS_DLL_PUBLIC_DEF gchar version[] = VERSION;
213        #endif
214
215    o Move relevant code from the blocks and delete these functions:
216
217        #ifndef ENABLE_STATIC
218        plugin_reg_handoff()
219        ....
220        #endif
221
222        #ifndef ENABLE_STATIC
223        plugin_register()
224        ....
225        #endif
226
227This will leave a clean dissector source file without plugin specifics.
228
2295.3 How to update an "old style" plugin (using plugin_init function)
230
231The plugin registering has changed between 0.10.9 and 0.10.10; everyone
232is encouraged to update their plugins as outlined below:
233
234    o Remove following include statements from all plugin sources:
235
236	#include "plugins/plugin_api.h"
237	#include "plugins/plugin_api_defs.h"
238
239    o Remove the init function.
240
6. How to use plugin-related interface options
242
243To demonstrate the functionality of the plugin interface options, a
244demonstration plugin exists (pluginifdemo). To build it using cmake, the
245build option ENABLE_PLUGIN_IFDEMO has to be enabled.
246
2476.1 Implement a plugin GUI menu
248
A plugin (as well as built-in dissectors) may implement a menu within
Wireshark to be used to trigger options, start tools, open websites, etc.
251
252This menu structure is built using the plugin_if.h interface and its
253corresponding functions.
254
The menu items all call a callback provided by the plugin, which takes
a pointer to the menu item entry as data. This pointer may be used to
provide user data to each entry. The callback must be declared with
WS_DLL_PUBLIC_DEF and has the following structure:
259
260    WS_DLL_PUBLIC_DEF void
261    menu_cb(ext_menubar_gui_type gui_type, gpointer gui_data,
262            gpointer user_data _U_)
263    {
264        ... Do something ...
265    }
266
267The menu entries themselves are generated with the following code structure:
268
269    ext_menu_t * ext_menu, *os_menu = NULL;
270
271    ext_menu = ext_menubar_register_menu (
272            <your_proto_item>, "Some Menu Entry", TRUE );
273    ext_menubar_add_entry(ext_menu, "Test Entry 1",
274            "This is a tooltip", menu_cb, <user_data>);
275    ext_menubar_add_entry(ext_menu, "Test Entry 2",
276            NULL, menu_cb, <user_data>);
277
278    os_menu = ext_menubar_add_submenu(ext_menu, "Sub Menu" );
279    ext_menubar_add_entry(os_menu, "Test Entry A",
280            NULL, menu_cb, <user_data>);
281    ext_menubar_add_entry(os_menu, "Test Entry B",
282            NULL, menu_cb, <user_data>);
283
For more detailed information, please refer to plugin_if.h.
285
2866.2 Implement interactions with the main interface
287
288Due to memory constraints on most platforms, plugin functionality cannot be
289called directly from a DLL context. Instead special functions will be used,
290which will implement certain options for plugins to utilize.
291
292The following methods exist so far:
293
294	/* Applies the given filter string as display filter */
295	WS_DLL_PUBLIC void plugin_if_apply_filter
296		(const char * filter_string, gboolean force);
297
298	/* Saves the given preference to the main preference storage */
299	WS_DLL_PUBLIC void plugin_if_save_preference
300		(const char * pref_module, const char * pref_key, const char * pref_value);
301
302	/* Jumps to the given frame number */
303	WS_DLL_PUBLIC void plugin_if_goto_frame(guint32 framenr);
304
3056.3 Implement a plugin specific toolbar
306
A plugin may register a toolbar, which lets users interact with it from within
the main application. The toolbar is generated using the following code:
310
311    ext_toolbar_t * tb = ext_toolbar_register_toolbar("Plugin Interface Demo Toolbar");
312
313This registers a toolbar, which will be shown underneath "View->Additional Toolbars" in
314the main menu, as well as the popup action window when right-clicking on any other tool-
315or menubar.
316
317It behaves identically to the existing toolbars and can be hidden as well as defined to
318appear specific to selected profiles. The name with which it is being shown is the given
319name in this function call.
320
3216.3.1 Register elements for the toolbar
322
To add items to the toolbar, four different types of elements exist.
324
325  * BOOLEAN - a checkbox to select / unselect
326  * BUTTON - a button to click
327  * STRING - a text field with validation options
328  * SELECTOR - a dropdown selection field
329
To add an element to the toolbar, the following function is used:
331
332    ext_toolbar_add_entry( ext_toolbar_t * parent, ext_toolbar_item_t type, const gchar *label,
333        const gchar *defvalue, const gchar *tooltip, gboolean capture_only, GList * value_list,
334        gboolean is_required, const gchar * regex, ext_toolbar_action_cb callback, gpointer user_data)
335
    parent_bar - the parent toolbar for this entry, to be registered by ext_toolbar_register_toolbar
    name - the entry name (the internally used one) for the item, used to send updates to the element
    label - the entry label (the displayed name) for the item, visible to the user
    defvalue - the default value for the toolbar element
        - EXT_TOOLBAR_BOOLEAN - 1 is for a checked element, 0 is unchecked
        - EXT_TOOLBAR_STRING - Text already entered upon initial display
    tooltip - a tooltip to be displayed on mouse-over
    capture_only - the entry is only active while a capture is active
    callback - the action which will be invoked after the item is activated
    value_list - a non-null list of values created by ext_toolbar_add_val(), if the item type
        is EXT_TOOLBAR_SELECTOR
    valid_regex - a validation regular expression for EXT_TOOLBAR_STRING
    is_required - an empty entry for EXT_TOOLBAR_STRING is not allowed
    user_data - a user defined pointer, which will be passed to the toolbar callback
350
351In case of the toolbar type EXT_TOOLBAR_SELECTOR a value list has to be provided. This list
352is generated using ext_toolbar_add_val():
353
354    GList * entries = 0;
355    entries = ext_toolbar_add_val(entries, "1", "ABCD", FALSE );
356    entries = ext_toolbar_add_val(entries, "2", "EFG", FALSE );
357    entries = ext_toolbar_add_val(entries, "3", "HIJ", TRUE );
358    entries = ext_toolbar_add_val(entries, "4", "KLM", FALSE );
359
3606.3.2 Callback for activation of an item
361
When an item is activated, the provided callback is triggered.
363
364    void toolbar_cb(gpointer toolbar_item, gpointer item_data, gpointer user_data)
365
366For EXT_TOOLBAR_BUTTON the callback is triggered upon a click on the button, for
367EXT_TOOLBAR_BOOLEAN and EXT_TOOLBAR_SELECTOR the callback is triggered with every change
368of the selection.
369
370For EXT_TOOLBAR_STRING either the return key has to be hit or the apply button pressed.
371
372The parameters of the callback are defined as follows:
373
374    toolbar_item - an element of the type ext_toolbar_t * representing the item that has been
375                   activated
376    item_data - the data of the item during activation. The content depends on the item type:
377         - EXT_TOOLBAR_BUTTON - the entry is null
378         - EXT_TOOLBAR_BOOLEAN - the entry is 0 if the checkbox is unchecked and 1 if it is checked
         - EXT_TOOLBAR_STRING - a string representing the content of the textbox. Only valid strings
                   are passed; it can safely be assumed that any applied regular expression has
                   been checked.
382         - EXT_TOOLBAR_SELECTOR - the value of the selected entry
383    user_data - the data provided during element registration
384
3856.3.3 Sending updates to the toolbar items
386
A plugin may send updates to a toolbar entry using one of the following methods. The silent
parameter defines whether the registered toolbar callback is triggered by the update.
389
390    void ext_toolbar_update_value(ext_toolbar_t * entry, gpointer data, gboolean silent)
391
392    - EXT_TOOLBAR_BUTTON, EXT_TOOLBAR_STRING - the displayed text (on the button or in the textbox)
393        are being changed, in that case data is expected to be a string
394    - EXT_TOOLBAR_BOOLEAN - the checkbox value is being changed, to either 0 or 1, in both cases
395        data is expected to be an integer sent by GINT_TO_POINTER(n)
396    - EXT_TOOLBAR_SELECTOR - the display text to be changed. If no element exists with this text,
397        nothing will happen
398
399    void ext_toolbar_update_data(ext_toolbar_t * entry, gpointer data, gboolean silent)
400
401    - EXT_TOOLBAR_SELECTOR - change the value list to the one provided with data. Attention! this
402        does not change the list stored within the item just the one in the displayed combobox
403
404    void ext_toolbar_update_data_by_index(ext_toolbar_t * entry, gpointer data, gpointer value,
405        gboolean silent)
406
    - EXT_TOOLBAR_SELECTOR - change the display text for the entry with the provided value. Both
        data and value must be gchar * pointers.
409
410
411----------------
412
413Ed Warnicke <hagbard@physics.rutgers.edu>
414Guy Harris <guy@alum.mit.edu>
415
416Derived and expanded from the plugin section of README.developers
417which was originally written by
418
419James Coe <jammer@cin.net>
420Gilbert Ramirez <gram@alumni.rice.edu>
421Jeff Foster <jfoste@woodward.com>
422Olivier Abad <oabad@cybercable.fr>
423Laurent Deniel <laurent.deniel@free.fr>
424Jaap Keuter <jaap.keuter@xs4all.nl>
425

README.regression

1#
2# Wireshark/TShark Regression Testing
3#
4# This is a sample Makefile for regression testing of the
# Wireshark engine. These tests use 'tshark -V' to analyze all
# the frames of a capture file.
7#
8# You should probably rename this file as 'Makefile' in a separate directory
9# set aside for the sole purpose of regression testing. Two text files will
10# be created for each capture file you test, so expect to have lots of files.
11#
12# Set TSHARK, CAPTURE_DIR, and CAPTURE_FILES to values appropriate for
13# your system. Run 'make' to create the initial datasets. Type 'make accept'
14# to accept those files as the reference set.
15#
16# After you make changes to TShark, run 'make regress'. This will re-run
17# the tests and compare them against the accepted reference set of data.
18# The comparison, which is just an invocation of 'diff -u' for the output
19# of each trace file, will be put into a file called 'regress'. Examine
20# this file for any changes that you did or did not expect.
21#
22# If you have introduced a change to TShark that shows up in the tests, but
23# it is a valid change, run 'make accept' to accept those new data as your
24# reference set.
25#
26# Commands:
27#
28# 'make'		Creates tests
29# 'make regress'	Checks tests against accepted reference test results
30#			Report is put in file 'regress'
31# 'make accept'		Accept current tests; make them the reference test results
32# 'make clean'		Cleans any tests (but not references!)
33
34TSHARK=/home/gram/prj/wireshark/debug/linux-ix86/tshark
35
36CAPTURE_DIR=/home/gram/prj/sniff
37
38CAPTURE_FILES=\
39	dhcp-g.tr1	\
40	genbroad.snoop	\
41	ipv6-ripng.gz	\
42	ipx.pcap	\
43	pcmjh03.tr1	\
44	strange.iptrace	\
45	teardrop.toshiba.gz	\
46	zlip-1.pcap	\
47	zlip-2.pcap	\
48	zlip-3.pcap
49
50######################################## No need to modify below this line
51
52TESTS = $(CAPTURE_FILES:=.tether)
53REFERENCES = $(TESTS:.tether=.ref)
54
55all:	$(TESTS)
56
57clean:
58	rm -f $(TESTS)
59
60%.tether : $(CAPTURE_DIR)/% $(TSHARK)
61	$(TSHARK) -V -n -r $< > $@
62
63accept: $(REFERENCES)
64
65%.ref : %.tether
66	mv $< $@
67
68regress: $(TESTS)
69	@echo "Regression Report" 			> regress
70	@date						>> regress
71	@echo "BOF------------------------------------"	>> regress
72	@for file in $(CAPTURE_FILES); do \
73		echo Checking regression of $$file ; \
74		diff -u $${file}.ref $${file}.tether	>> regress ; \
75	done
76	@echo "EOF------------------------------------"	>> regress
77

README.request_response_tracking

1. Introduction
2
3It is often useful to enhance dissectors for request/response style protocols
4to match requests with responses.
5This allows you to display useful information in the decode tree such as which
6requests are matched to which response and the response time for individual
7transactions.
8
9This is also useful if you want to pass some data from the request onto the
10dissection of the actual response. The RPC dissector for example does
11something like this to pass the actual command opcode from the request onto
12the response dissector since the opcode itself is not part of the response
13packet and without the opcode we would not know how to decode the data.
14
It is also useful when you need to track information on a per conversation
basis such as when some parameters are negotiated during a login phase of the
protocol and when these parameters affect how future commands on that session
are to be decoded. The iSCSI dissector does something similar to track which
sessions have HeaderDigest activated and which ones do not.
20
2. Implementation
22
23The example below shows how simple this is to add to the dissector IF:
241. there is something like a transaction id in the header,
252. it is very unlikely that the transaction identifier is reused for the
26   same conversation.
27
28The example is taken from the PANA dissector:
29
30First we need to include the definitions for conversations.
31
32	#include <epan/conversation.h>
33
34Then we also need a few header fields to show the relations between request
35and response as well as the response time.
36
37	static int hf_pana_response_in = -1;
38	static int hf_pana_response_to = -1;
39	static int hf_pana_response_time = -1;
40
41We need a structure that holds all the information we need to remember
42between the request and the responses. One such structure will be allocated
43for each unique transaction.
44In the example we only keep the frame numbers of the request and the response
45as well as the timestamp for the request.
46But since this structure is persistent and also a unique one is allocated for
47each request/response pair, this is a good place to store other additional
48data you may want to keep track of from a request to a response.
49
50	typedef struct _pana_transaction_t {
51	        guint32 req_frame;
52	        guint32 rep_frame;
53	        nstime_t req_time;
54	} pana_transaction_t;
55
We also need a structure that holds persistent information for each
conversation. A conversation is identified by SRC/DST address, protocol and
SRC/DST port, see README.dissector, section 2.2.
In this case we only want to have a hash table to track the actual
transactions that occur for this unique conversation.
Some protocols negotiate session parameters during a login phase, and those
parameters may affect how later commands on the same session are to be decoded.
This structure would be a good place to store any such additional information
you may want to keep around.
65
66	typedef struct _pana_conv_info_t {
67	        wmem_map_t *pdus;
68	} pana_conv_info_t;
69
70Finally for the meat of it, add the conversation and tracking code to the
71actual dissector.
72
73	...
74	guint32 seq_num;
75	conversation_t *conversation;
76	pana_conv_info_t *pana_info;
77	pana_transaction_t *pana_trans;
78
79	...
80	/* Get the transaction identifier */
81	seq_num = tvb_get_ntohl(tvb, 8);
82	...
83
84	/*
85	 * We need to track some state for this protocol on a per conversation
86	 * basis so we can do neat things like request/response tracking
87	 */
88	conversation = find_or_create_conversation(pinfo);
89
90	/*
91	 * Do we already have a state structure for this conv
92	 */
93	pana_info = (pana_conv_info_t *)conversation_get_proto_data(conversation, proto_pana);
94	if (!pana_info) {
95		/*
96                 * No.  Attach that information to the conversation, and add
97		 * it to the list of information structures.
98		 */
99		pana_info = wmem_new(wmem_file_scope(), pana_conv_info_t);
100		pana_info->pdus=wmem_map_new(wmem_file_scope(), g_direct_hash, g_direct_equal);
101
102		conversation_add_proto_data(conversation, proto_pana, pana_info);
103	}
104	if (!PINFO_FD_VISITED(pinfo)) {
105		if (flags&PANA_FLAG_R) {
106			/* This is a request */
107			pana_trans=wmem_new(wmem_file_scope(), pana_transaction_t);
108			pana_trans->req_frame = pinfo->num;
109			pana_trans->rep_frame = 0;
110			pana_trans->req_time = pinfo->fd->abs_ts;
111			wmem_map_insert(pana_info->pdus, GUINT_TO_POINTER(seq_num), (void *)pana_trans);
112		} else {
113			pana_trans=(pana_transaction_t *)wmem_map_lookup(pana_info->pdus, GUINT_TO_POINTER(seq_num));
114			if (pana_trans) {
115				pana_trans->rep_frame = pinfo->num;
116			}
117		}
118	} else {
119		pana_trans=(pana_transaction_t *)wmem_map_lookup(pana_info->pdus, GUINT_TO_POINTER(seq_num));
120	}
121	if (!pana_trans) {
122		/* create a "fake" pana_trans structure */
123		pana_trans=wmem_new(pinfo->pool, pana_transaction_t);
124		pana_trans->req_frame = 0;
125		pana_trans->rep_frame = 0;
126		pana_trans->req_time = pinfo->fd->abs_ts;
127	}
128
129	/* print state tracking in the tree */
130	if (flags&PANA_FLAG_R) {
131		/* This is a request */
132		if (pana_trans->rep_frame) {
133			proto_item *it;
134
135			it = proto_tree_add_uint(pana_tree, hf_pana_response_in,
136					tvb, 0, 0, pana_trans->rep_frame);
137			proto_item_set_generated(it);
138		}
139	} else {
140		/* This is a reply */
141		if (pana_trans->req_frame) {
142			proto_item *it;
143			nstime_t ns;
144
145			it = proto_tree_add_uint(pana_tree, hf_pana_response_to,
146					tvb, 0, 0, pana_trans->req_frame);
147			proto_item_set_generated(it);
148
149			nstime_delta(&ns, &pinfo->fd->abs_ts, &pana_trans->req_time);
150			it = proto_tree_add_time(pana_tree, hf_pana_response_time, tvb, 0, 0, &ns);
151			proto_item_set_generated(it);
152		}
153	}
154
155Then we just need to declare the hf fields we used.
156
157	{ &hf_pana_response_in,
158		{ "Response In", "pana.response_in",
159		FT_FRAMENUM, BASE_NONE, FRAMENUM_TYPE(FT_FRAMENUM_RESPONSE), 0x0,
160		"The response to this PANA request is in this frame", HFILL }
161	},
162	{ &hf_pana_response_to,
163		{ "Request In", "pana.response_to",
164		FT_FRAMENUM, BASE_NONE, FRAMENUM_TYPE(FT_FRAMENUM_REQUEST), 0x0,
165		"This is a response to the PANA request in this frame", HFILL }
166	},
167	{ &hf_pana_response_time,
168		{ "Response Time", "pana.response_time",
169		FT_RELATIVE_TIME, BASE_NONE, NULL, 0x0,
170		"The time between the Call and the Reply", HFILL }
171	},
172
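
The matching logic in the dissector code above can also be studied in
isolation.  The following is a standalone sketch, not PANA or Wireshark code:
txn_table_t is a hypothetical fixed-size stand-in for the per-conversation
wmem map, timestamps are plain doubles instead of nstime_t, and all function
names are invented for illustration.

```c
/* Standalone model of request/response matching by transaction id. */
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_TXNS 64

typedef struct {
    uint32_t seq_num;    /* transaction identifier (the map key) */
    uint32_t req_frame;  /* frame number of the request */
    uint32_t rep_frame;  /* frame number of the reply, 0 if not seen yet */
    double   req_time;   /* request timestamp in seconds */
} txn_t;

/* Stand-in for the per-conversation map of transactions. */
typedef struct {
    txn_t  entries[MAX_TXNS];
    size_t count;
} txn_table_t;

static txn_t *txn_lookup(txn_table_t *t, uint32_t seq_num)
{
    size_t i;

    for (i = 0; i < t->count; i++)
        if (t->entries[i].seq_num == seq_num)
            return &t->entries[i];
    return NULL;
}

/* First pass over a request: record it, replacing any entry with the same id. */
txn_t *txn_on_request(txn_table_t *t, uint32_t seq_num, uint32_t frame, double ts)
{
    txn_t *txn;

    assert(t != NULL);
    txn = txn_lookup(t, seq_num);
    if (txn == NULL) {
        if (t->count == MAX_TXNS)
            return NULL;            /* table full; the model just gives up */
        txn = &t->entries[t->count++];
    }
    txn->seq_num = seq_num;
    txn->req_frame = frame;
    txn->rep_frame = 0;
    txn->req_time = ts;
    return txn;
}

/* First pass over a reply: attach it to its request, if one was seen. */
txn_t *txn_on_reply(txn_table_t *t, uint32_t seq_num, uint32_t frame)
{
    txn_t *txn;

    assert(t != NULL);
    txn = txn_lookup(t, seq_num);
    if (txn != NULL)
        txn->rep_frame = frame;
    return txn;
}

/* Response time as shown in the tree: reply time minus request time. */
double txn_response_time(const txn_t *txn, double reply_ts)
{
    assert(txn != NULL);
    return reply_ts - txn->req_time;
}
```

A reply whose identifier was never seen in a request yields NULL here, which
corresponds to the "fake" transaction case in the dissector code.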

README.stats_tree

Tapping with stats_tree
2
3Let's suppose that you want to write a tap only to keep counters, and you
4don't want to get involved with GUI programming or maybe you'd like to make
5it a plugin. A stats_tree might be the way to go. The stats_tree module takes
6care of the representation (GUI for Wireshark and text for TShark) of the
7tap data. So there's very little code to write to make a tap listener usable
8from both Wireshark and TShark.
9
10First, you should add the TAP to the dissector in question as described in
11README.tapping .
12
13Once the dissector in question is "tapped" you have to write the stats tree
14code which is made of three parts:
15
The init callback routine:
   It is executed before any packet is passed to the tap. Here you should
   create the "static" nodes of your tree and initialize your data.
20
21The (per)packet callback routine:
22   As the tap_packet callback is going to be called for every packet, it
23   should be used to increment the counters.
24
25The cleanup callback:
26   It is called at the destruction of the stats_tree and might be used to
27   free ....
28
29Other than that the stats_tree should be registered.
30
If you want to make it a plugin, stats_tree_register_plugin() should be
called by plugin_register_tap_listener(); read README.plugins for other
information regarding Wireshark plugins.
34
35If you want it as part of the dissector stats_tree_register() can be called
36either by proto_register_xxx() or if you prefer by proto_reg_handoff_xxx().
37
38
39A small example of a very basic stats_tree plugin follows.
40
41----- example stats_tree plugin ------
42/* udpterm_stats_tree.c
43 * A small example of stats_tree plugin that counts udp packets by termination
44 * 2005, Luis E. G. Ontanon
45 *
46 * Wireshark - Network traffic analyzer
47 * By Gerald Combs <gerald@wireshark.org>
48 * Copyright 1998 Gerald Combs
49 *
50 * SPDX-License-Identifier: GPL-2.0-or-later
51 */
52
53#include "config.h"
54
55#include <gmodule.h>
56
57#include <epan/stats_tree.h>
58#include <epan/dissectors/udp.h>
59
60static int st_udp_term;
61static gchar* st_str_udp_term = "UDP terminations";
62
63/* this one initializes the tree, creating the root nodes */
64extern void udp_term_stats_tree_init(stats_tree* st) {
65	/* we create a node under which we'll add every termination */
66	st_udp_term = stats_tree_create_node(st, st_str_udp_term, 0, TRUE);
67}
68
69/* this one will be called with every udp packet */
70extern int udp_term_stats_tree_packet(stats_tree *st, /* st as it was passed to us */
71                                      packet_info *pinfo,  /* we'll fetch the addresses from here */
72                                      epan_dissect_t *edt _U_, /* unused */
73                                      const void *p) /* we'll use this to fetch the ports */
74{
	static gchar str[128]; /* g_snprintf() expects a gchar buffer */
	const e_udphdr* udphdr = (const e_udphdr*) p; /* keep the const of p */
77
78	/* we increment by one (tick) the root node */
79	tick_stat_node(st, st_str_udp_term, 0, FALSE);
80
81	/* we then tick a node for this src_addr:src_port
82	   if the node doesn't exists it will be created */
83	g_snprintf(str, sizeof(str),"%s:%u",address_to_str(&pinfo->net_src),udphdr->sport);
84	tick_stat_node(st, str, st_udp_term, FALSE);
85
86	/* same thing for dst */
87	g_snprintf(str, sizeof(str),"%s:%u",address_to_str(&pinfo->net_dst),udphdr->dport);
88	tick_stat_node(st, str, st_udp_term, FALSE);
89
90	return 1;
91}
92
93WS_DLL_PUBLIC_DEF const gchar version[] = "0.0";
94
95WS_DLL_PUBLIC_DEF void plugin_register_tap_listener(void) {
96
97    stats_tree_register_plugin("udp", /* the proto we are going to "tap" */
98                               "udp_terms", /* the abbreviation for this tree (to be used as -z udp_terms,tree) */
99                               st_str_udp_term, /* the name of the menu and window (use "/" for sub menus)*/
100                               0, /* tap listener flags for per-packet callback */
101                               udp_term_stats_tree_packet, /* the per packet callback */
102                               udp_term_stats_tree_init, /* the init callback */
103                               NULL ); /* the cleanup callback (in this case there isn't) */
104
105}
106
107----- END ------
108
109the stats_tree API
110==================
 Every stats_tree callback has a stats_tree* parameter (st). stats_tree is an opaque
 data structure which should be passed to the API functions.
113
114stats_tree_register(tapname, abbr, name, flags, packet_cb, init_cb, cleanup_cb);
115 registers a new stats tree
116
117stats_tree_register_plugin(tapname, abbr, name, flags, packet_cb, init_cb, cleanup_cb);
118 registers a new stats tree from a plugin
119
120stats_tree_parent_id_by_name( st, parent_name)
121  returns the id of a candidate parent node given its name
122
123
124Node functions
125==============
126
127All the functions that operate on nodes return a parent_id
128
129stats_tree_create_node(st, name, parent_id, with_children)
  Creates a node in the tree (to be used in the init_cb)
    name: the name of the new node
    parent_id: the id of the parent_node (0 for root)
133    with_children: TRUE if this node will have "dynamically created" children
134                   (i.e. it will be a candidate parent)
135
136
137stats_tree_create_node_by_pname(st, name, parent_name, with_children);
138  As before but creates a node using its parent's name
139
140
141stats_tree_create_range_node(st, name, parent_id, ...)
142stats_tree_create_range_node_string(st, name, parent_id, num_str_ranges, str_ranges)
143stats_tree_range_node_with_pname(st, name, parent_name, ...)
144  Creates a node in the tree, that will contain a ranges list.
145    example:
146       stats_tree_create_range_node(st,name,parent_id,
147				"-99","100-199","200-299","300-399","400-", NULL);
148
149stats_tree_tick_range( st, name,  parent_id, value_in_range);
150stats_tree_tick_range_by_pname(st,name,parent_name,value_in_range)
151   Increases by one the ranged node and the sub node to whose range the value belongs
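
How a value is matched against such a list of range strings can be sketched
in standalone C.  This is only a model and not the stats_tree implementation:
model_range_t, model_parse_range() and model_find_range() are hypothetical
names, and the assumption is that a missing bound ("-99" or "400-") leaves
that side of the range open.

```c
/* Standalone model of matching a value against "lo-hi" range strings. */
#include <assert.h>
#include <limits.h>
#include <stdlib.h>
#include <string.h>

typedef struct {
    int floor;  /* lowest value in the range (inclusive) */
    int ceil;   /* highest value in the range (inclusive) */
} model_range_t;

/* Parses "N", "lo-hi", "-hi" or "lo-"; returns 1 on success, 0 otherwise.
 * Negative bounds are not supported, since a leading '-' means "no floor". */
int model_parse_range(const char *s, model_range_t *out)
{
    const char *dash;

    if (s == NULL || *s == '\0' || out == NULL)
        return 0;
    dash = strchr(s, '-');
    if (dash == NULL) {
        /* No delimiter: a single number */
        out->floor = out->ceil = (int)strtol(s, NULL, 10);
        return 1;
    }
    out->floor = (dash == s) ? INT_MIN : (int)strtol(s, NULL, 10);
    out->ceil = (dash[1] == '\0') ? INT_MAX : (int)strtol(dash + 1, NULL, 10);
    return 1;
}

/* Returns the index of the first range containing value, or -1 if none.
 * The list is terminated by NULL, as in the example above. */
int model_find_range(const char *const ranges[], int value)
{
    model_range_t r;
    int i;

    assert(ranges != NULL);
    for (i = 0; ranges[i] != NULL; i++)
        if (model_parse_range(ranges[i], &r) &&
            value >= r.floor && value <= r.ceil)
            return i;
    return -1;
}
```

Ticking a range node then amounts to increasing the parent and the sub node at
the returned index.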
152
153
154stats_tree_create_pivot(st, name, parent_id);
155stats_tree_create_pivot_by_pname(st, name, parent_name);
156  Creates a "pivot node"
157
158stats_tree_tick_pivot(st, pivot_id, pivoted_string);
 Each time a pivot node is ticked its counter is increased, and the child
 named by pivoted_string is increased (or created).
161
162
The following will either increase or create a node (with value 1) when called:
164
165tick_stat_node(st,name,parent_id,with_children)
166increases by one a stat_node
167
168increase_stat_node(st,name,parent_id,with_children,value)
169increases by value a stat_node
170
171set_stat_node(st,name,parent_id,with_children,value)
172sets the value of a stat_node
173
174zero_stat_node(st,name,parent_id,with_children)
175resets to zero a stat_node
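
The "increase or create" behaviour of these functions can be modelled in a few
lines of standalone C.  This sketch does not use the stats_tree API:
model_tree_t and the model_* functions are hypothetical stand-ins, nodes live
in a fixed array, and node 0 is assumed to be the root.

```c
/* Standalone model of nodes that are created on first use, then counted. */
#include <assert.h>
#include <string.h>

#define MAX_NODES 32
#define NAME_LEN  64

typedef struct {
    char name[NAME_LEN];
    int  parent_id;
    int  counter;
} model_node_t;

typedef struct {
    model_node_t nodes[MAX_NODES];
    int          count;   /* nodes in use; node 0 is the root */
} model_tree_t;

void model_tree_init(model_tree_t *t)
{
    assert(t != NULL);
    memset(t, 0, sizeof(*t));
    t->nodes[0].parent_id = -1;   /* the root has no parent */
    t->count = 1;
}

/* Looks up a child of parent_id by name; returns its id or -1. */
static int model_find(const model_tree_t *t, const char *name, int parent_id)
{
    int i;

    for (i = 1; i < t->count; i++)
        if (t->nodes[i].parent_id == parent_id &&
            strcmp(t->nodes[i].name, name) == 0)
            return i;
    return -1;
}

/* Adds value to the named node, creating it first if needed.
 * Returns the node id, or -1 if the table is full. */
int model_increase(model_tree_t *t, const char *name, int parent_id, int value)
{
    int id;

    assert(t != NULL && name != NULL);
    id = model_find(t, name, parent_id);
    if (id < 0) {
        if (t->count == MAX_NODES)
            return -1;
        id = t->count++;
        strncpy(t->nodes[id].name, name, NAME_LEN - 1);
        t->nodes[id].name[NAME_LEN - 1] = '\0';
        t->nodes[id].parent_id = parent_id;
        t->nodes[id].counter = 0;
    }
    t->nodes[id].counter += value;
    return id;
}

/* Ticking is increasing by one. */
int model_tick(model_tree_t *t, const char *name, int parent_id)
{
    return model_increase(t, name, parent_id, 1);
}
```

The returned id can be passed as the parent_id of later calls, which is how the
UDP example builds its second level of nodes.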
176
177Averages work by tracking both the number of items added to node (the ticking
178action) and the value of each item added to the node. This is done
179automatically for ranged nodes; for other node types you need to call one of
180the functions below to associate item values with each tick.
181
182avg_stat_node_add_value_notick(st,name,parent_id,with_children,value)
183avg_stat_node_add_value(st,name,parent_id,with_children,value)
184
The difference between the above functions is whether the item count is
increased or not. To properly compute the average you need to call either
avg_stat_node_add_value, or avg_stat_node_add_value_notick combined with
tick_stat_node. The latter sequence allows for plug-ins that are compatible
with older Wireshark versions, which ignore avg_stat_node_add_value because
they do not understand the command. Using avg_stat_node_add_value alone on
such versions would result in 0 counts for all nodes. It is preferred to use
avg_stat_node_add_value if you are not writing a plug-in.
193
194avg_stat_node_add_value is used the same way as tick_stat_node with the
195exception that you now specify an additional value associated with the tick.
196
197Do not mix increase_stat_node, set_stat_node or zero_stat_node
198with avg_stat_node_add_value as this will lead to incorrect results for the
199average value.
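
The relationship between ticking and adding values can be pictured with a
small stand-alone model. This is only a sketch of the bookkeeping described
above, not the stats_tree code: avg_node_sketch_t and the sketch_* functions
are hypothetical stand-ins for a node and for tick_stat_node,
avg_stat_node_add_value_notick and avg_stat_node_add_value.

```c
#include <assert.h>

/* Hypothetical stand-in for the per-node bookkeeping. */
typedef struct {
    int  count;  /* number of items (ticks) */
    long total;  /* sum of the item values */
} avg_node_sketch_t;

/* Models tick_stat_node: count one more item, no value attached. */
void sketch_tick(avg_node_sketch_t *n)
{
    assert(n != 0);
    n->count++;
}

/* Models avg_stat_node_add_value_notick: attach a value, no tick. */
void sketch_add_value_notick(avg_node_sketch_t *n, int value)
{
    assert(n != 0);
    n->total += value;
}

/* Models avg_stat_node_add_value: tick and attach the value at once. */
void sketch_add_value(avg_node_sketch_t *n, int value)
{
    sketch_tick(n);
    sketch_add_value_notick(n, value);
}

/* Average is the value sum divided by the item count (0 if no items). */
double sketch_average(const avg_node_sketch_t *n)
{
    return n->count ? (double)n->total / n->count : 0.0;
}
```

In this model, calling sketch_add_value(n, 10) gives the same result as
calling sketch_add_value_notick(n, 10) followed by sketch_tick(n), while
adding values without ever ticking leaves the count at 0.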
200
201stats_tree now also supports setting flags per node to control the behaviour
202of these nodes. This can be done using the stat_node_set_flags and
203stat_node_clear_flags functions. Currently these flags are defined:
204
205	ST_FLG_DEF_NOEXPAND: By default the top-level nodes in a tree are
206			automatically expanded in the GUI. Setting this flag on
207			such a node prevents the node from automatically expanding.
208	ST_FLG_SORT_TOP: Nodes with this flag are sorted separately from nodes
209			without this flag (in effect partitioning the tree into a top
210			and a bottom half). Each half is sorted normally. The top half
211			always appears first.
212
213You can find more examples of these in $srcdir/plugins/epan/stats_tree/pinfo_stats_tree.c
214
215Luis E. G. Ontanon.
216

README.tapping

1The TAP system in Wireshark is a powerful and flexible mechanism to get event
2driven notification on packets matching certain protocols and/or filters.
3In order to use the tapping system, very little knowledge of Wireshark
4internals is required.
5
6For examples of how to use the tap system, see the implementation of
7tap-rpcprogs.c (tshark version)
8ui/qt/rpc_service_response_time_dialog.cpp (wireshark version)
9
10If all you need is to keep some counters, there's the stats_tree API,
11which offers a simple way to make a GUI and tshark tap-listener; see
12README.stats_tree.  However, keep reading, as you'll need much of what's
13in this document.
14
15The tap system consists of two parts:
161, code in the actual dissectors to allow tapping data from that particular
17protocol dissector, and
182, event driven code in an extension such as tap-rpcstat.c that registers
19a tap listener and processes received data.
20
21So you want to hack together a tap application?
22
23TAP
24===
25First you must decide which protocol you are interested in writing a tap
26application for and check if that protocol has already got a tap installed
27in it.
28If it already has a tap device installed then you don't have to do anything.
29If not, then you have to add a tap but don't worry, this is extremely easy to
30do and is done in four easy steps:
31(see packet-rpc.c and search for tap for an example)
32
331, We need tap.h so just add '#include "tap.h"' (preceded by packet.h) to
34the includes.
35
362, We need a tap handler so just add 'static int <protocol>_tap = -1;'
37
383, Down in proto_register_<protocol>() you need to add
39'<protocol>_tap = register_tap("<protocol>");'
40
414, In the actual dissector for that protocol, after any child dissectors
42have returned, just add 'tap_queue_packet(<protocol>_tap, pinfo, <pointer>);'
43
44<pointer> is used if the tap has any special additional data to provide to the
45tap listeners. What this points to is dependent on the protocol that is tapped,
46or if there is no useful extra data to provide just specify NULL.  For
47packet-rpc.c what we specify there is the persistent structure 'rpc_call' which
48contains lots of useful information from the rpc layer that a listener might
49need.
50
51
52
53TAP LISTENER
54============
55(see tap-rpcprogs.c as an example)
56Interfacing your application is not that much harder either.
57Only 4 callbacks and two functions.
58
59The two functions to start or stop tapping are
60
61register_tap_listener(const char *tapname, void *tapdata, const char *fstring,
62	guint flags,
63	void (*reset)(void *tapdata),
64        tap_packet_status (*packet)(void *tapdata, packet_info *pinfo, epan_dissect_t *edt, const void *data),
65	void (*draw)(void *tapdata),
66	void (*finish)(void *tapdata));
67
68This function is used to register an instance of a tap application
69to the tap system.
70
71remove_tap_listener(void *tapdata);
72
73This function is used to deregister and stop a tap listener.
74
75The parameters have the following meanings:
76
77*tapname
78is the name of the tap we want to listen to. I.e. the name used in
79step 3 above.
80
81*tapdata
82is the instance identifier. The tap system uses the value of this
83pointer to distinguish between different instances of a tap.
84Just make sure that it is unique by letting it be the pointer to a struct
85holding all state variables. If you want to allow multiple concurrent
86instances, just put ALL state variables inside a struct allocated by
87g_malloc() and use that pointer.
88(tap-rpcstat.c uses this technique to allow multiple simultaneous instances)
89
90*fstring
91is a pointer to a filter string.
92If this is NULL, then the tap system will provide ALL packets passing the
93tapped protocol to your listener.
94If you specify a filter string here the tap system will first try
95to apply this string to the packet and then only pass those packets that
96matched the filter to your listener.
97The syntax for the filter string is identical to normal display filters.
98
99NOTE: Specifying filter strings will have a significant performance impact
100on your application and Wireshark. If possible it is MUCH better to take
101unfiltered data and just filter it yourself in the packet-callback than
102to specify a filter string.
103ONLY use a filter string if no other option exists.
104
105flags
106is a set of flags for the tap listener.  The flags that can be set are:
107
108    TL_REQUIRES_PROTO_TREE
109
110	set if your tap listener "packet" routine requires a protocol
111	tree to be built.  It will require a protocol tree to be
112	built if either
113
114		1) it looks at the protocol tree in edt->tree
115
116	or
117
118		2) the tap-specific data passed to it is constructed only if
119		   the protocol tree is being built.
120
121    TL_REQUIRES_COLUMNS
122
123	set if your tap listener "packet" routine requires the column
124	strings to be constructed.
125
126    If no flags are needed, use TL_REQUIRES_NOTHING.
127
128void (*reset)(void *tapdata)
129This callback is called whenever Wireshark wants to inform your
130listener that it is about to start [re]reading a capture file or a new capture
131from an interface and that your application should reset any state it has
132in the *tapdata instance.
133
134tap_packet_status (*packet)(void *tapdata, packet_info *pinfo, epan_dissect_t *edt, const void *data)
135This callback is called whenever a new packet has arrived at the tap and
136has passed the filter (if there is one).
137The *data structure type is specific to each tap.
138This function returns a tap_packet_status enum and it should return
139 TAP_PACKET_REDRAW, if the data in the packet caused state to be updated
140       (and thus a redraw of the window would later be required)
141 TAP_PACKET_DONT_REDRAW, if we don't need to redraw the window
142 TAP_PACKET_FAILED, if the tap failed and shouldn't be called again
143       in this pass (for example, if it's writing to a file and gets
144       an I/O error)
145NOTE: (*packet) should be as fast and efficient as possible. Use this
146function ONLY to store data for later and do the CPU-intensive processing
147or GUI updates down in (*draw) instead.
148
149
150void (*draw)(void *tapdata)
151This callback is used when Wireshark wants your application to redraw its
152output. It will usually not be called unless your application has received
153new data through the (*packet) callback.
154On some ports of Wireshark (Qt) (*draw) will be called asynchronously
155from a separate thread up to once every 2-3 seconds.
156On other ports it might only be called once when the capture is finished
157or the file has been [re]read completely.
158
159void (*finish)(void *tapdata)
160This callback is called when your listener is removed.
161
162
163So, create four callbacks:
1641, reset   to reset the state variables in the structure passed to it.
1652, packet  to update these state variables.
1663, draw    to take these state variables and draw them on the screen.
1674, finish  to free all state variables.
168
169then just make Wireshark call register_tap_listener() when you want to tap
170and call remove_tap_listener() when you are finished.
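
To make the division of labour between the four callbacks concrete, here is
a minimal stand-alone sketch of a listener that only counts packets. The
types packet_info, epan_dissect_t and tap_packet_status are declared here as
stand-ins so that the sketch compiles without Wireshark's headers; in a real
listener they come from those headers. counter_tap_t and the counter_*
functions are hypothetical names, and malloc-family calls stand in for
g_malloc().

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

/* Stand-ins for the Wireshark types named in the callback signatures. */
typedef struct packet_info_stub packet_info;
typedef struct epan_dissect_stub epan_dissect_t;
typedef enum {
    TAP_PACKET_DONT_REDRAW,
    TAP_PACKET_REDRAW,
    TAP_PACKET_FAILED
} tap_packet_status;

/* Hypothetical listener state: ALL state lives in one allocated struct,
 * so its pointer can serve as the unique *tapdata instance identifier. */
typedef struct {
    unsigned packets;  /* packets seen since the last reset */
} counter_tap_t;

counter_tap_t *counter_new(void)
{
    return calloc(1, sizeof(counter_tap_t));
}

/* 1, reset: forget everything learned from the previous pass. */
void counter_reset(void *tapdata)
{
    counter_tap_t *st = tapdata;
    st->packets = 0;
}

/* 2, packet: only store data; keep this as cheap as possible. */
tap_packet_status counter_packet(void *tapdata, packet_info *pinfo,
                                 epan_dissect_t *edt, const void *data)
{
    counter_tap_t *st = tapdata;
    (void)pinfo; (void)edt; (void)data;  /* unused in this sketch */
    st->packets++;
    return TAP_PACKET_REDRAW;  /* state changed, a redraw is needed later */
}

/* 3, draw: present the state; expensive work belongs here. */
void counter_draw(void *tapdata)
{
    const counter_tap_t *st = tapdata;
    printf("packets seen: %u\n", st->packets);
}

/* 4, finish: release all state. */
void counter_finish(void *tapdata)
{
    free(tapdata);
}
```

Following the signature given earlier, such a listener would be hooked up
with something like register_tap_listener("<protocol>", st, NULL,
TL_REQUIRES_NOTHING, counter_reset, counter_packet, counter_draw,
counter_finish) and removed again with remove_tap_listener(st).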
171
172
173WHEN DO TAP LISTENERS GET CALLED?
174===================================
175Tap listeners are only called when Wireshark reads a new capture for
176the first time or whenever Wireshark needs to rescan/redissect
177the capture.
178Redissection occurs when you apply a new display filter or if you
179change and Save/Apply a preference setting that might affect how
180packets are dissected.
181After each individual packet has been completely dissected and all
182dissectors have returned, all the tap listeners that have been flagged
183to receive tap data during the dissection of the frame will be called in
184sequence.
185The order in which the tap listeners will be called is not defined.
186Not until all tap listeners for the frame have been called and returned
187will Wireshark continue to dissect the next packet.
188This is why it is important to make the *_packet() callbacks execute as
189quickly as possible, else we create an extra delay until the next packet
190is dissected.
191
192Keep in mind though: for some protocols, such as IP, the protocol can
193appear multiple times in different layers inside the same packet.
194For example, IP encapsulated over IP which will call the ip dissector
195twice for the same packet.
196IF the tap is going to return private data using the last parameter to
197tap_queue_packet() and IF the protocol can appear multiple times inside the
198same packet, you will have to make sure that each instance of
199tap_queue_packet() is using its own instance of private struct variable
200so they don't overwrite each other.
201
202See packet-ip.c which has a simple solution to the problem. It creates
203a unique instance of the IP header using wmem_alloc().
204Previous versions used a static struct of 4 instances of the IP header
205struct and cycled through them each time the dissector was called. (4
206was just a number taken out of the blue but it should be enough for most
207cases.) This would fail if there were more than 4 IP headers in the same
208packet, but that was unlikely.
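
The overwrite problem is easy to reproduce outside Wireshark. The sketch
below is a stand-alone model, not code from packet-ip.c: the queue, the
hdr_sketch_t struct and the dissect_* functions are hypothetical, and
malloc() stands in for a packet-scoped wmem allocation. It queues a pointer
twice for the same packet, once from a shared static struct and once from a
per-call allocation.

```c
#include <assert.h>
#include <stdlib.h>

#define MAX_QUEUED 8

/* Hypothetical private data handed to listeners. */
typedef struct {
    int ttl;
} hdr_sketch_t;

/* Stand-in for the tap queue: it stores pointers, not copies. */
static const void *queued[MAX_QUEUED];
static int n_queued;

void sketch_queue(const void *data)
{
    assert(n_queued < MAX_QUEUED);
    queued[n_queued++] = data;
}

/* Problematic: every call reuses the same static struct, so an inner
 * header overwrites the outer one before the listeners run. */
void dissect_shared(int ttl)
{
    static hdr_sketch_t hdr;
    hdr.ttl = ttl;
    sketch_queue(&hdr);
}

/* Safe: every call queues its own instance. */
void dissect_private(int ttl)
{
    hdr_sketch_t *hdr = malloc(sizeof *hdr);
    assert(hdr != NULL);
    hdr->ttl = ttl;
    sketch_queue(hdr);
}

/* What a listener would read from queue entry i. */
int queued_ttl(int i)
{
    return ((const hdr_sketch_t *)queued[i])->ttl;
}

/* Empty the queue; free the entries if they were allocated per call. */
void sketch_flush(int owned)
{
    int i;
    for (i = 0; i < n_queued; i++)
        if (owned)
            free((void *)queued[i]);
    n_queued = 0;
}
```

After dissect_shared(64) and dissect_shared(32), both queue entries report a
TTL of 32; with dissect_private() the two entries keep 64 and 32.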
209
210
211TIPS
212====
213Of course, there is nothing that forces you to make (*draw) draw stuff
214on the screen.
215You can hand register_tap_listener() NULL for (*draw), (*reset) and (*finish)
216(well also for (*packet) but that would be a very boring extension).
217
218Perhaps you want an extension that will execute a certain command
219every time it sees a certain packet?
220Well, try this :
221	tap_packet_status packet(void *tapdata,...) {
222		...
223		system("mail ...");
224		return TAP_PACKET_DONT_REDRAW;
225	}
226
227	register_tap_listener("tcp", struct, "tcp.port==57", TL_REQUIRES_NOTHING, NULL, packet, NULL, NULL);
228
229	Let struct contain an email address?
230	Then you have something simple that will make Wireshark send an email
231	out automagically for each and every time it dissects
232	a packet containing TCP traffic to port 57.
233	Please put in some rate limitation if you do this.
234
235	Let struct contain a command line and make (*packet) execute it?
236	The possibilities are rather large.
237
238See tap.c as well. It contains lots of comments and descriptions on the tap
239system.
240

README.vagrant

11. Introduction
2
3Vagrant is a virtual machine management program that makes it trivial to build
4and configure reproducible virtual machines. Wireshark's source code includes
5a Vagrantfile which can be used to set up a complete development environment
6in a virtual machine, including all necessary compilers, dependent libraries,
7and tools like valgrind.
8
9Using vagrant can greatly simplify the creation of a Linux build environment
10for new developers, at the cost of running your builds in a virtual machine,
11thus with reduced performance.
12
132. Installation
14
15The Vagrantfile included in Wireshark's source directory is configured to use
16VirtualBox as the backing virtual machine technology provider. You must first
17install VirtualBox from https://www.virtualbox.org/.
18
19Now install vagrant itself from https://www.vagrantup.com/.
20
21Please note that vagrant is a CLI command and should typically be installed in
22your host system's $PATH. To better understand what vagrant is doing you may
23want to review vagrant's `Getting Started` web pages.
24
253. Setup
26
27By default vagrant looks for a file named Vagrantfile in the current
28directory. Wireshark's Vagrantfile is located in the root of the Wireshark
29source folder.
30
31Once both VirtualBox and vagrant are installed, setting up an Ubuntu Wireshark
32development VM is as simple as running `vagrant up ubuntu`.
33
34The first time that the `vagrant up` command is executed vagrant will initiate
35the download of a specific VM image (what they call a box) from HashiCorp's
36Atlas box catalog. Once the box is downloaded a VM will be instantiated and
37powered-on.
38
39Use the command `vagrant status` to determine the state of the VMs.
40
41The command `vagrant provision` will run any provisioning tasks defined in the
42Vagrantfile. Wireshark's Vagrantfile is configured to provision the machine
43and build the project using vagrant_build.sh.
44
45The vagrant_build.sh script sets up a cmake
46build environment which includes creating a ~/build folder initialized for an
47out-of-tree cmake build and then triggering a build.
48
494. Usage
50
51Running `vagrant ssh ubuntu` from the Wireshark source directory will log you
52into the Ubuntu VM as the user vagrant.
53
54The Ubuntu VM's build folder is located in ~/build. The Ubuntu VM's source
55folder is actually the source folder from the host system mounted as
56/home/vagrant/wireshark. Any changes made in the VM's ~/wireshark folder are
57reflected in the host system's Wireshark source folder and vice-versa.
58
59Installing the vagrant-vbguest plugin is strongly recommended to get synced
60folders working on all boxes and other niceties. You can install it with the
61command `vagrant plugin install vagrant-vbguest`.
62
63Once logged into the VM issue the command `cd ~/build` followed by `make` to
64trigger a new wireshark build based on whatever is in your host system's source
65tree (the VM's ~/wireshark folder).
66
67The various Wireshark applications can be run from the ~/build folder of the
68VM with commands such as `./run/wireshark`, `./run/tshark`, etc.
69
70To run the Wireshark GUI you will need an X server and the X authority file
71utility (xauth) installed in the guest. For Ubuntu use `apt-get install xauth`,
72for Fedora use `dnf install xorg-x11-xauth`.
73
74If you are using macOS (Mac OS X) as the host system then you would likely
75use XQuartz as your X server. XQuartz can be downloaded from
76https://www.xquartz.org/.
77
78The VM can be shut down or suspended from the host system with the
79commands `vagrant halt` and `vagrant suspend` respectively. In either case the
80VM can be brought back up with the command `vagrant up`.
81
825. Using Vagrant with multiple VMs
83Wireshark's Vagrantfile is configured with more than one box so most vagrant
84commands, those that apply to machines, need to be provided with the machine
85name. You can list all machines and their state with the `vagrant status`
86command.
87

README.wmem

11. Introduction
2
3The 'wmem' memory manager is Wireshark's memory management framework, replacing
4the old 'emem' framework which was removed in Wireshark 2.0.
5
6In order to make memory management easier and to reduce the probability of
7memory leaks, Wireshark provides its own memory management API. This API is
8implemented inside wsutil/wmem/ and provides memory pools and functions that make
9it easy to manage memory even in the face of exceptions (which many dissector
10functions can raise). Memory scopes for dissection are defined in epan/wmem_scopes.h.
11
12Correct use of these functions will make your code faster, and greatly reduce
13the chances that it will leak memory in exceptional cases.
14
15Wmem was originally conceived in this email to the wireshark-dev mailing list:
16https://www.wireshark.org/lists/wireshark-dev/201210/msg00178.html
17
182. Usage for Consumers
19
20If you're writing a dissector, or other "userspace" code, then using wmem
21should be very similar to using malloc or g_malloc or whatever else you're used
22to. All you need to do is include the header (epan/wmem_scopes.h) and optionally
23get a handle to a memory pool (if you want to *create* a memory pool, see the
24section "3. Usage for Producers" below).
25
26A memory pool is an opaque pointer to an object of type wmem_allocator_t, and
27it is the very first parameter passed to almost every call you make to wmem.
28Other than that parameter (and the fact that functions are prefixed wmem_)
29usage is very similar to glib and other utility libraries. For example:
30
31    wmem_alloc(myPool, 20);
32
33allocates 20 bytes in the pool pointed to by myPool.
34
352.1 Memory Pool Lifetimes
36
37Every memory pool should have a defined lifetime, or scope, after which all the
38memory in that pool is unconditionally freed. When you choose to allocate memory
39in a pool, you *must* be aware of its lifetime: if the lifetime is shorter than
40you need, your code will contain use-after-free bugs; if the lifetime is longer
41than you need, your code may contain undetectable memory leaks. In either case,
42the risks outweigh the benefits.
43
44If no pool exists whose lifetime matches the lifetime of your memory, you have
45two options: create a new pool (see section 3 of this document) or use the NULL
46pool. Any function that takes a pointer to a wmem_allocator_t can also be passed
47NULL instead, in which case the memory is managed manually (just like malloc or
48g_malloc). Memory allocated like this *must* be manually passed to wmem_free()
49in order to prevent memory leaks (however these memory leaks will at least show
50up in valgrind). Note that passing wmem_allocated memory directly to free()
51or g_free() is not safe; the backing type of manually managed memory may be
52changed without warning.
53
542.2 Wireshark Global Pools
55
56Dissectors that include the wmem_scopes.h header file will have three pools available
57to them automatically: pinfo->pool, wmem_file_scope() and
58wmem_epan_scope(); there is also a wmem_packet_scope() for cases when the
59`pinfo` argument is not accessible, but pinfo->pool should be preferred.
60
61The pinfo pool is scoped to the dissection of each packet, meaning that any
62memory allocated in it will be automatically freed at the end of the current
63packet. The file pool is similarly scoped to the dissection of each file,
64meaning that any memory allocated in it will be automatically freed when the
65current capture file is closed.
66
67NB: Using these pools outside of the appropriate scope (e.g. using the file
68    pool when there isn't a file open) will trigger an assertion failure.
69    See the comment in epan/wmem_scopes.c for details.
70
71The epan pool is scoped to the library's lifetime - memory allocated in it is
72not freed until epan_cleanup() is called, which is typically but not necessarily
73at the very end of the program.
74
752.3 The Pinfo Pool
76
77Certain allocations (such as AT_STRINGZ address allocations and anything that
78might end up being passed to add_new_data_source) need their memory to stick
79around a little longer than the usual packet scope - basically until the
80next packet is dissected. This is, in fact, the scope of Wireshark's pinfo
81structure, so the pinfo struct has a 'pool' member which is a wmem pool scoped
82to the lifetime of the pinfo struct.
83
842.4 API
85
86Full documentation for each function (parameters, return values, behaviours)
87lives (or will live) in Doxygen-format in the header files for those functions.
88This is just an overview of which header files you should be looking at.
89
902.4.1 Core API
91
92wmem_core.h
93 - Basic memory management functions (wmem_alloc, wmem_realloc, wmem_free).
94
952.4.2 Strings
96
97wmem_strutl.h
98 - Utility functions for manipulating null-terminated C-style strings.
99   Functions like strdup and strdup_printf.
100
101wmem_strbuf.h
102 - A managed string object implementation, similar to std::string in C++ or
103   GString from Glib.
104
1052.4.3 Container Data Structures
106
107wmem_array.h
108 - A growable array (AKA vector) implementation.
109
110wmem_list.h
111 - A doubly-linked list implementation.
112
113wmem_map.h
114 - A hash map (AKA hash table) implementation.
115
116wmem_queue.h
117 - A queue implementation (first-in, first-out).
118
119wmem_stack.h
120 - A stack implementation (last-in, first-out).
121
122wmem_tree.h
123 - A balanced binary tree (red-black tree) implementation.
124
1252.4.4 Miscellaneous Utilities
126
127wmem_miscutl.h
128 - Misc. utility functions like memdup.
129
1302.5 Callbacks
131
132WARNING: You probably don't actually need these; use them only when you're
133         sure you understand the dangers.
134
135Sometimes (though hopefully rarely) it may be necessary to store data in a wmem
136pool that requires additional cleanup before it is freed. For example, perhaps
137you have a pointer to a file-handle that needs to be closed. In this case, you
138can register a callback with the wmem_register_callback function
139declared in wmem_user_cb.h. Every time the memory in a pool is freed, all
140registered cleanup functions are called first.
141
142Note that callback calling order is not defined, you cannot rely on a
143certain callback being called before or after another.
144
145WARNING: Manually freeing or moving memory (with wmem_free or wmem_realloc)
146         will NOT trigger any callbacks. It is an error to call either of
147         those functions on memory if you have a callback registered to deal
148         with the contents of that memory.
149
1503. Usage for Producers
151
152NB: If you're just writing a dissector, you probably don't need to read
153    this section.
154
155One of the problems with the old emem framework was that there were basically
156two allocator backends (glib and mmap) that were all mixed together in a mess
157of if statements, environment variables and #ifdefs. In wmem the different
158allocator backends are cleanly separated out, and it's up to the owner of the
159pool to pick one.
160
1613.1 Available Allocator Back-Ends
162
163Each available allocator type has a corresponding entry in the
164wmem_allocator_type_t enumeration defined in wmem_core.h. See the doxygen
165comments in that header file for details on each type.
166
1673.2 Creating a Pool
168
169To create a pool, include the regular wmem header and call the
170wmem_allocator_new() function with the appropriate type value.
171For example:
172
173    #include <wsutil/wmem/wmem.h>
174
175    wmem_allocator_t *myPool;
176    myPool = wmem_allocator_new(WMEM_ALLOCATOR_SIMPLE);
177
178From here on in, you don't need to remember which type of allocator you used
179(although allocator authors are welcome to expose additional allocator-specific
180helper functions in their headers). The "myPool" variable can be passed around
181and used as normal in allocation requests as described in section 2 of this
182document.
183
1843.3 Destroying a Pool
185
186Regardless of which allocator you used to create a pool, it can be destroyed
187with a call to the function wmem_destroy_allocator(). For example:
188
189    #include <wsutil/wmem/wmem.h>
190
191    wmem_allocator_t *myPool;
192
193    myPool = wmem_allocator_new(WMEM_ALLOCATOR_SIMPLE);
194
195    /* Allocate some memory in myPool ... */
196
197    wmem_destroy_allocator(myPool);
198
199Destroying a pool will free all the memory allocated in it.
200
2013.4 Reusing a Pool
202
203It is possible to free all the memory in a pool without destroying it,
204allowing it to be reused later. Depending on the type of allocator, doing this
205(by calling wmem_free_all()) can be significantly cheaper than fully destroying
206and recreating the pool. This method is therefore recommended, especially when
207the pool would otherwise be scoped to a single iteration of a loop. For example:
208
209    #include <wsutil/wmem/wmem.h>
210
211    wmem_allocator_t *myPool;
212
213    myPool = wmem_allocator_new(WMEM_ALLOCATOR_SIMPLE);
214    for (...) {
215
216        /* Allocate some memory in myPool ... */
217
218        /* Free the memory, faster than destroying and recreating
219           the pool each time through the loop. */
220        wmem_free_all(myPool);
221    }
222    wmem_destroy_allocator(myPool);
223
2244. Internal Design
225
226Despite being written in Wireshark's standard C90, wmem follows a fairly
227object-oriented design pattern. Although efficiency is always a concern, the
228primary goals in writing wmem were maintainability and preventing memory
229leaks.
230
2314.1 struct _wmem_allocator_t
232
233The heart of wmem is the _wmem_allocator_t structure defined in the
234wmem_allocator.h header file. This structure uses C function pointers to
235implement a common object-oriented design pattern known as an interface (also
236known as an abstract class to those who are more familiar with C++).
237
238Different allocator implementations can provide exactly the same interface by
239assigning their own functions to the members of an instance of the structure.
240The structure has eight members in three groups.
241
2424.1.1 Implementation Details
243
244 - private_data
245 - type
246
247The private_data pointer is a void pointer that the allocator implementation can
248use to store whatever internal structures it needs. A pointer to private_data is
249passed to almost all of the other functions that the allocator implementation
250must define.
251
252The type field is an enumeration of type wmem_allocator_type_t (see
253section 3.1). Its value is set by the wmem_allocator_new() function, not
254by the implementation-specific constructor. This field should be considered
255read-only by the allocator implementation.
256
2574.1.2 Consumer Functions
258
259 - walloc()
260 - wfree()
261 - wrealloc()
262
263These function pointers should be set to functions with semantics obviously
264similar to their standard-library namesakes. Each one takes an extra parameter
265that is a copy of the allocator's private_data pointer.
266
267Note that wrealloc() and wfree() are not expected to be called directly by user
268code in most cases - they are primarily optimizations for use by data
269structures that wmem might want to implement (it's inefficient, for example, to
270implement a dynamically sized array without some form of realloc).
271
272Also note that allocators do not have to handle NULL pointers or 0-length
273requests in any way - those checks are done in an allocator-agnostic way
274higher up in wmem. Allocator authors can assume that all incoming pointers
275(to wrealloc and wfree) are non-NULL, and that all incoming lengths (to walloc
276and wrealloc) are non-0.
277
2784.1.3 Producer/Manager Functions
279
280 - free_all()
281 - gc()
282 - cleanup()
283
284All of these functions take only one parameter, which is the allocator's
285private_data pointer.
286
287The free_all() function should free all the memory currently allocated in the
288pool. Note that this is not necessarily exactly the same as calling free()
289on all the allocated blocks - free_all() is allowed to do additional cleanup
290or to make use of optimizations not available when freeing one block at a time.
291
292The gc() function should do whatever it can to reduce excess memory usage in
293the dissector by returning unused blocks to the OS, optimizing internal data
294structures, etc.
295
296The cleanup() function should do any final cleanup and free any and all memory.
297It is basically the equivalent of a destructor function. For simplicity, wmem
298is guaranteed to call free_all() immediately before calling this function. There
299is no such guarantee that gc() has (ever) been called.
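
The interface pattern described in this section can be sketched in a few
dozen lines of stand-alone C. This is a toy, not the contents of
wmem_allocator.h: toy_allocator_t only mirrors the eight members listed
above, and the backend (a doubly-linked list of malloc'd blocks) is an
assumption chosen to keep free_all() simple.

```c
#include <assert.h>
#include <stdlib.h>

/* Toy interface with the eight members described above. */
typedef struct {
    void *private_data;                                /* implementation */
    int   type;
    void *(*walloc)(void *pd, size_t size);            /* consumer */
    void  (*wfree)(void *pd, void *ptr);
    void *(*wrealloc)(void *pd, void *ptr, size_t size);
    void  (*free_all)(void *pd);                       /* producer/manager */
    void  (*gc)(void *pd);
    void  (*cleanup)(void *pd);
} toy_allocator_t;

/* Backend state: each block carries a header linking it into a list.
 * The payload follows the header, so it is aligned like two pointers. */
typedef struct toy_block { struct toy_block *prev, *next; } toy_block_t;
typedef struct { toy_block_t *head; size_t live; } toy_pool_t;

static void toy_link(toy_pool_t *p, toy_block_t *b)
{
    b->prev = NULL;
    b->next = p->head;
    if (p->head) p->head->prev = b;
    p->head = b;
    p->live++;
}

static void toy_unlink(toy_pool_t *p, toy_block_t *b)
{
    if (b->prev) b->prev->next = b->next; else p->head = b->next;
    if (b->next) b->next->prev = b->prev;
    p->live--;
}

static void *toy_walloc(void *pd, size_t size)
{
    toy_block_t *b = malloc(sizeof *b + size);
    if (b == NULL) return NULL;
    toy_link(pd, b);
    return b + 1;
}

static void toy_wfree(void *pd, void *ptr)
{
    toy_block_t *b = (toy_block_t *)ptr - 1;
    toy_unlink(pd, b);
    free(b);
}

static void *toy_wrealloc(void *pd, void *ptr, size_t size)
{
    toy_block_t *old = (toy_block_t *)ptr - 1, *b;
    toy_unlink(pd, old);               /* the block may move */
    b = realloc(old, sizeof *b + size);
    if (b == NULL) { toy_link(pd, old); return NULL; }
    toy_link(pd, b);
    return b + 1;
}

static void toy_free_all(void *pd)
{
    toy_pool_t *p = pd;
    while (p->head) { toy_block_t *b = p->head; p->head = b->next; free(b); }
    p->live = 0;
}

static void toy_gc(void *pd) { (void)pd; /* nothing to compact here */ }
static void toy_cleanup(void *pd) { free(pd); }

toy_allocator_t *toy_allocator_new(void)
{
    toy_allocator_t *a = malloc(sizeof *a);
    toy_pool_t *p = calloc(1, sizeof *p);
    if (a == NULL || p == NULL) { free(a); free(p); return NULL; }
    a->private_data = p;          a->type = 0;
    a->walloc = toy_walloc;       a->wfree = toy_wfree;
    a->wrealloc = toy_wrealloc;   a->free_all = toy_free_all;
    a->gc = toy_gc;               a->cleanup = toy_cleanup;
    return a;
}

/* free_all() is called immediately before cleanup(), as described above. */
void toy_allocator_destroy(toy_allocator_t *a)
{
    a->free_all(a->private_data);
    a->cleanup(a->private_data);
    free(a);
}

size_t toy_live_blocks(const toy_allocator_t *a)
{
    return ((const toy_pool_t *)a->private_data)->live;
}
```

A second backend would only need to supply its own six functions and
private data; code that calls through the function pointers would not
change.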
300
3014.2 Pool-Agnostic API
302
303One of the issues with emem was that the API (including the public data
304structures) required wrapper functions for each scope implemented. Even
305if there was a stack implementation in emem, it wasn't necessarily available
306for use with file-scope memory unless someone took the time to write se_stack_
307wrapper functions for the interface.
308
309In wmem, all public APIs take the pool as the first argument, so that they can
310be written once and used with any available memory pool. Data structures like
311wmem's stack implementation only take the pool when created - the provided
312pointer is stored internally with the data structure, and subsequent calls
313(like push and pop) will take the stack itself instead of the pool.
314
3154.3 Debugging
316
317The primary debugging control for wmem is the WIRESHARK_DEBUG_WMEM_OVERRIDE
318environment variable. If set, this value forces all calls to
319wmem_allocator_new() to return the same type of allocator, regardless of which
320type is requested normally by the code. It currently has four valid values:
321
322 - The value "simple" forces the use of WMEM_ALLOCATOR_SIMPLE. The valgrind
323   script currently sets this value, since the simple allocator is the only
324   one whose memory allocations are trackable properly by valgrind.
325
326 - The value "strict" forces the use of WMEM_ALLOCATOR_STRICT. The fuzz-test
327   script currently sets this value, since the goal when fuzz-testing is to find
328   as many errors as possible.
329
330 - The value "block" forces the use of WMEM_ALLOCATOR_BLOCK. This is not
331   currently used by any scripts, but is useful for stress-testing the block
332   allocator.
333
334 - The value "block_fast" forces the use of WMEM_ALLOCATOR_BLOCK_FAST. This is
335   not currently used by any scripts, but is useful for stress-testing the fast
336   block allocator.
337
338Note that regardless of the value of this variable, it will always be safe to
339call allocator-specific helper functions. They are required to be safe no-ops
340if the allocator argument is of the wrong type.
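
The effect of the override can be sketched as a lookup from the variable's
value to an allocator type. This is a stand-alone illustration, not the code
in wmem: the sketch_* names and the enumeration are hypothetical, and
falling back to the requested type when the value is unset or unrecognized
is an assumption.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical mirror of the four allocator types named above. */
typedef enum {
    SKETCH_ALLOCATOR_SIMPLE,
    SKETCH_ALLOCATOR_BLOCK,
    SKETCH_ALLOCATOR_STRICT,
    SKETCH_ALLOCATOR_BLOCK_FAST
} sketch_allocator_type_t;

/* Pick the type to use, given the override string (may be NULL) and the
 * type the calling code asked for. */
sketch_allocator_type_t
sketch_apply_override(const char *override, sketch_allocator_type_t requested)
{
    if (override == NULL)
        return requested;
    if (strcmp(override, "simple") == 0)
        return SKETCH_ALLOCATOR_SIMPLE;
    if (strcmp(override, "block") == 0)
        return SKETCH_ALLOCATOR_BLOCK;
    if (strcmp(override, "strict") == 0)
        return SKETCH_ALLOCATOR_STRICT;
    if (strcmp(override, "block_fast") == 0)
        return SKETCH_ALLOCATOR_BLOCK_FAST;
    return requested;  /* assumption: unknown values are ignored */
}

/* Same decision, reading the environment variable named above. */
sketch_allocator_type_t
sketch_effective_type(sketch_allocator_type_t requested)
{
    return sketch_apply_override(getenv("WIRESHARK_DEBUG_WMEM_OVERRIDE"),
                                 requested);
}
```

With the variable set to "strict", every request in this model resolves to
the strict type, whichever type the caller named.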
341
3424.4 Testing
343
344There is a simple test suite for wmem that lives in the file wmem_test.c and
345should get automatically built into the binary 'wmem_test' when building
346Wireshark. It contains at least basic tests for all existing functionality.
347The suite is run automatically by the build-bots via the Python script
348test/test.py which calls out to test/suite_unittests.py.
349
350New features added to wmem (allocators, data structures, utility
351functions, etc.) MUST also have tests added to this suite.
352
353The test suite could potentially use a clean-up by someone more
354intimately familiar with Glib's testing framework, but it does the job.
355
3565. A Note on Performance
357
358Because of my own bad judgment, there is the persistent idea floating around
359that wmem is somehow magically faster than other allocators in the general case.
360This is false.
361
362First, wmem supports multiple different allocator backends (see sections 3 and 4
363of this document), so it is confusing and misleading to try and compare the
364performance of "wmem" in general with another system anyway.
365
366Second, any modern system-provided malloc already has a very clever and
367efficient allocator algorithm that makes use of blocks, arenas and all sorts of
368other fancy tricks. Trying to be faster than libc's allocator is generally a
369waste of time unless you have a specific allocation pattern to optimize for.
370
371Third, while there were historically arguments to be made for putting something
372in front of the kernel to reduce the number of context-switches, modern libc
373implementations should already do that. Making a dynamic library call is still
374marginally more expensive than calling a locally-defined linker-optimized
375function, but it's a difference too small to care about.
376
377With all that said, it is true that *some* of wmem's allocators can be
378substantially faster than your standard libc malloc, in *some* use cases:
379 - The BLOCK and BLOCK_FAST allocators both provide very efficient free_all
380   operations, which can be many orders of magnitude faster than calling free()
381   on each individual allocation.
382 - The BLOCK_FAST allocator in particular is optimized for Wireshark's packet
383   scope pool. It has an extremely short, well-defined lifetime, and a very
384   regular pattern of allocations; I was able to use that knowledge to beat libc
385   rather handily, *in that specific use case*.
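
To make the free_all point concrete, here is a toy bump allocator. It is a
sketch under stated assumptions and NOT wmem's block allocator: all names are
hypothetical, it uses a single fixed buffer, and it assumes 16-byte alignment
is sufficient. What it shows is why releasing everything at once can be so
cheap: free_all is a single assignment, however many allocations were made.

```c
#include <stddef.h>
#include <stdlib.h>

#define DEMO_ALIGN 16  /* assumed alignment; must be a power of two */

/* A toy bump allocator over one fixed buffer (hypothetical, not wmem). */
typedef struct {
    unsigned char *buf;
    size_t         size;
    size_t         used;
} demo_arena_t;

/* Returns 1 on success, 0 if the backing buffer could not be allocated. */
static int
demo_arena_init(demo_arena_t *arena, size_t size)
{
    arena->buf  = malloc(size);
    arena->size = (arena->buf != NULL) ? size : 0;
    arena->used = 0;
    return arena->buf != NULL;
}

/* Hand out the next chunk, rounding the size up so chunks stay aligned. */
static void *
demo_arena_alloc(demo_arena_t *arena, size_t n)
{
    size_t rounded;
    void  *p;

    if (n > arena->size) {
        return NULL;
    }
    rounded = (n + DEMO_ALIGN - 1) & ~(size_t)(DEMO_ALIGN - 1);
    if (rounded > arena->size - arena->used) {
        return NULL; /* out of space */
    }
    p = arena->buf + arena->used;
    arena->used += rounded;
    return p;
}

/* Release every allocation at once: constant time, no per-chunk work. */
static void
demo_arena_free_all(demo_arena_t *arena)
{
    arena->used = 0;
}

static void
demo_arena_destroy(demo_arena_t *arena)
{
    free(arena->buf);
    arena->buf  = NULL;
    arena->size = 0;
    arena->used = 0;
}
```

The trade-off is that individual chunks cannot be freed early, which is only
acceptable when all allocations share one short, well-defined lifetime.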
386
387/*
388 * Editor modelines  -  https://www.wireshark.org/tools/modelines.html
389 *
390 * Local variables:
391 * c-basic-offset: 4
392 * tab-width: 8
393 * indent-tabs-mode: nil
394 * End:
395 *
396 * vi: set shiftwidth=4 tabstop=8 expandtab:
397 * :indentSize=4:tabSize=8:noTabs=true:
398 */
399

README.wslua

1README.wslua
2
3This is a HOWTO for adding support for new Lua hooks/functions/abilities in
4Wireshark.   If you see any errors or have any improvements, submit patches -
5free software is a community effort....
6
7This is NOT a guide for how to write Lua plugins - that's documented already
8on the Wireshark webpages.
9
10Contributors to this README:
11Hadriel Kaplan <hadrielk[AT]yahoo.com>
12
13==============================================================================
14
15Overview:
16
17The way Wireshark exposes functions for Lua is generally based on a
18callback/event model, letting Lua plugins register their custom Lua functions
19into event callbacks.  C-based "objects" are exposed as Lua tables with
20typical Lua USERDATA pointer dispatching, plain C-functions are registered as
21such in Lua, and C-based enums/variables are registered into Lua as table
22key=value (usually... though rarely they're registered as array indexed
23values).  All of that is very typical for applications that expose things
24into a Lua scripting environment.
25
26The details that make it a little different are (1) the process by which the
27code is bound/registered into Lua, and (2) the API documentation generator.
28Wireshark uses C-macros liberally, both for the usual reasons as well as for
29the binding generator and documentation generator scripts.  The macros are
30described within this document.
31
The API documentation is auto-generated by a Perl script called
'make-wsluarm.pl', which searches C-files for the known macros and generates
appropriate HTML documentation from them.  This includes using the C-comments
after the macros for the API document info.
36
Likewise, another Perl script called 'make-reg.pl' generates the C-files
'register_wslua.c' and 'declare_wslua.h', based on the C-macros it searches
for in existing source files.  The code this Perl script auto-generates is
what actually registers some classes/functions into Lua - you don't have to
write your own registration functions to get your new functions/classes into
Lua tables.  (You can do so, but it's not advisable.)
43
Both of the Perl scripts above are given the C-source files to search through
by the make process, generated from the lists in epan/wslua/CMakeLists.txt.
Naturally if you add new source files, you need to add them to the list in
epan/wslua/CMakeLists.txt. You also have to add the module name into
docbook/user-guide.xml and docbook/wsluarm.xml, and the source files into
docbook/CMakeLists.txt, to get it to be generated in the user guide.
50
Another Perl script is used as well, called 'make-init-lua.pl', which
generates the init.lua script.  A large part of it deals with exposing #define
values into the Lua global table, or sub-tables.  Unfortunately not all of
them are put in sub-tables, which means the global Lua table is quite polluted
now.  If you add new ones in here, please think of putting them in a subtable,
as they are for wtap, ftypes, and base.  For example, several were put in as
'PI_' prefixed names, such as 'PI_SEVERITY_MASK = 15728640'.  A common 'PI_'
prefix is an indicator that such names belong in a table of their own: just
because C-code doesn't have namespaces doesn't mean Lua can't.  That case has
since been fixed, and the PI_* names are now in two separate subtables of a
table named 'expert', namely 'expert.group' and 'expert.severity'.  Follow
that model in 'make-init-lua.pl'.
63
64
65Due to those documentation and registration scripts, you MUST follow some very
66specific conventions in the functions you write to expose C-side code to Lua,
67as described in this document.
68
69Naming conventions/rules:
70
71Class/object names must be UpperCamelCase, no numbers/underscores.
72Function and method names must be lower_underscore_case, no numbers.
73Constants/enums must be ALLCAPS, and can have numbers.
74
75The above rules are more than merely conventions - the Perl scripts which
76auto-generate stuff use regex patterns that require the naming syntax to be
77followed.
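
The naming rules can be approximated with simple character-class checks. The
sketch below is an illustration only: the function names are hypothetical, it
does not reproduce the actual regex patterns used by the Perl scripts, and it
assumes that underscores are allowed in constants (as in 'PI_SEVERITY_MASK').

```c
#include <ctype.h>
#include <stddef.h>

/* Class names: UpperCamelCase, letters only (no digits, no underscores). */
static int
demo_is_class_name(const char *s)
{
    if (s == NULL || !isupper((unsigned char)s[0])) {
        return 0;
    }
    for (s++; *s != '\0'; s++) {
        if (!isalpha((unsigned char)*s)) {
            return 0;
        }
    }
    return 1;
}

/* Function/method names: lower_underscore_case, no digits. */
static int
demo_is_function_name(const char *s)
{
    if (s == NULL || !islower((unsigned char)s[0])) {
        return 0;
    }
    for (; *s != '\0'; s++) {
        if (!islower((unsigned char)*s) && *s != '_') {
            return 0;
        }
    }
    return 1;
}

/* Constants: ALLCAPS, digits allowed; underscores assumed to be allowed. */
static int
demo_is_constant_name(const char *s)
{
    if (s == NULL || !isupper((unsigned char)s[0])) {
        return 0;
    }
    for (; *s != '\0'; s++) {
        if (!isupper((unsigned char)*s) && !isdigit((unsigned char)*s) &&
            *s != '_') {
            return 0;
        }
    }
    return 1;
}
```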
78
79==============================================================================
80
81Documenting things for the API docs:
82
As explained previously, the API documentation is auto-generated by a
Perl script called 'make-wsluarm.pl', which searches C-files for the known
macros and generates appropriate HTML documentation from them. This includes
using the C-comments after the macros for the API document info. The comments
are extremely important, because the API documentation is what most Lua script
authors will see - do *not* expect them to go looking through the C-source code
to figure things out.
90
91Please make sure to at least use the '@since' version notification markup
92in your comments, to let users know when the new class/function/etc. you
93created became available.
94
Because documentation is so important, the make-wsluarm.pl script supports
specific markup syntax in comments, and converts it to XML and ultimately
into the various documentation formats. The markup syntax is documented in
the top comments in make-wsluarm.pl, but is repeated here as well:
  - two (or more) line breaks in comments result in separate paragraphs
  - all '&' are converted into their entity names, except inside urls
  - all '<' and '>' are converted into their entity names everywhere
  - any word(s) wrapped in one star, e.g., *foo bar*, become italics
  - any word(s) wrapped in two stars, e.g., **foo bar**, become bold
  - any word(s) wrapped in backticks, e.g., `foo bar`, become bold (for now)
  - any word(s) wrapped in two backticks, e.g., ``foo bar``, become one backtick
  - any "[[url]]" becomes an XML ulink with the url as both the url and text
  - any "[[url|text]]" becomes an XML ulink with the url as the url and text as text
  - any indent with a single leading star '*' followed by space is a bulleted list item;
    reducing indent or having an extra linebreak stops the list
  - any indent with a leading digits-dot followed by space, i.e. "1. ", is a numbered list item;
    reducing indent or having an extra linebreak stops the list
  - supports meta-tagged info inside comment descriptions as follows:
    * a line starting with "@note" or "Note:" becomes an XML note line
    * a line starting with "@warning" or "Warning:" becomes an XML warning line
    * a line starting with "@version" or "@since" becomes a "Since:" line
    * a line starting with "@code" and ending with "@endcode" becomes an
      XML programlisting block, with no indenting/parsing within the block
    The above '@' commands are based on Doxygen commands
120
121==============================================================================
122
123Some implementation details:
124
125Creating new C-classes for Lua:
126
127Explaining the Lua class/object model and how it's bound to C-code functions
128and data types is beyond the scope of this document; if you don't already know
129how that works, I suggest you start reading lua-users.org's wiki, and
130lua.org's free reference manual.
131
Wireshark generally uses a model close to the typical binding
model: 'registering' class methods and metamethods, pushing objects into Lua
by applying the class' metatable to the USERDATA, etc.  This latter part is
mostly handled for you by the C-macros created by WSLUA_CLASS_DEFINE, such as
push/check, described later in this document.
137
138The actual way methods are dispatched is a little different from normal Lua
139bindings, because attributes are supported as well (see next section). The
140details won't be covered in this document - they're documented in the code
141itself in: wslua_internals.c above the wslua_reg_attributes function.
142
143Registering a class requires you to write some code: a WSLUA_METHODS table,
144a WSLUA_META table, and a registration function.  The WSLUA_METHODS table is an
145array of luaL_Reg structs, which map a string name that will be the function's
146name in Lua, to a C-function pointer which is the C-function to be invoked by
147Lua when the user calls the name.  Instead of defining this array of structs
148explicitly using strings and function names, you should use the WSLUA_METHODS
149macro name for the array, and use WSLUA_CLASS_FNREG macro for each entry.
150The WSLUA_META table follows the same behavior, with the WSLUA_CLASS_MTREG
151macro for each entry. Make sure your C-function names use two underscores
152instead of one (for instance, ClassName__tostring).
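
The shape of these tables can be sketched without Lua at all. The example
below is self-contained and uses hypothetical stand-ins: demo_reg_t mimics the
name/function-pointer layout of luaL_Reg, and the DEMO_FNREG and DEMO_MTREG
macros are modeled on the description above rather than copied from wslua.h.
It shows how a method name maps to ClassName_name, how a metamethod name maps
to ClassName__name (two underscores), and how the table ends with a sentinel.

```c
#include <stddef.h>
#include <string.h>

/* Stand-in for luaL_Reg: a Lua-visible name paired with a C function. */
typedef int (*demo_cfunction)(void *state);

typedef struct {
    const char     *name;
    demo_cfunction  func;
} demo_reg_t;

/* Hypothetical macros: build one table entry from class and member name. */
#define DEMO_FNREG(cls, name) { #name, cls##_##name }
#define DEMO_MTREG(cls, name) { "__" #name, cls##__##name }

/* A regular method and a metamethod of a hypothetical class DemoClass. */
static int DemoClass_get_name(void *state)  { (void)state; return 1; }
static int DemoClass__tostring(void *state) { (void)state; return 1; }

/* Method table, terminated by a { NULL, NULL } sentinel. */
static const demo_reg_t DemoClass_methods[] = {
    DEMO_FNREG(DemoClass, get_name),
    { NULL, NULL }
};

/* Metamethod table; the entry name becomes "__tostring". */
static const demo_reg_t DemoClass_meta[] = {
    DEMO_MTREG(DemoClass, tostring),
    { NULL, NULL }
};

/* Look a name up in a table; returns NULL if it is not registered. */
static demo_cfunction
demo_find(const demo_reg_t *table, const char *name)
{
    for (; table->name != NULL; table++) {
        if (strcmp(table->name, name) == 0) {
            return table->func;
        }
    }
    return NULL;
}
```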
153
Once you've created the appropriate array tables, define a registration
function named 'ClassName_register', where 'ClassName' is your class name, the
same one used in WSLUA_CLASS_DEFINE.  The make-reg.pl Perl script will search
your file for WSLUA_CLASS_DEFINE, and it generates a register_wslua.c which
will call your ClassName_register function during Wireshark initialization.
Define a wslua_class structure which describes the class and register this in
your ClassName_register function using one of:
 - wslua_register_classinstance_meta to create a metatable which allows
   instances of a class to have methods and attributes. C code can create such
   instances by setting the metatable on userdata; the class itself is not
   directly visible in the Lua scope.
 - wslua_register_class to additionally expose a class with static functions
   that is also directly visible in the Lua global scope.
Also, you should read the 'Memory management model' section later in this
document.
169
170Class member variable attributes (getters/setters):
171
The current implementation does not follow a single/common class-variable
attribute accessor model for the Lua API: some class member values are
populated/retrieved when a table field attribute is used that triggers the
__index or __newindex metamethods, and others are accessed through explicit
getter/setter method functions.  In other words from a Lua code perspective
some class object variables are retrieved as 'foo = myObj.var', while others
are done as 'foo = myObj:getVar()'.
179
From the C-side code perspective, some classes register no real method
functions but just have attributes (and use the WSLUA_ATTRIBUTE documentation
model for them). For example the FieldInfo class in wslua_field.c does this.
Other classes provide access to member variables through getter/setter method
functions (and thus use the WSLUA_METHOD documentation model). For example
the TvbRange class in wslua_tvb.c does this. Using the latter model of having
a getter/setter method function allows one to pass multiple arguments, whereas
the former __index/__newindex metamethod model does not. Both models are
fairly common in Lua APIs, although having a mixture of both in the same API
probably isn't.  There is even a third model in use: pre-loading the member
fields of the class table with the values, instead of waiting for the Lua
script to access a particular one to retrieve it; for example the Listener tap
extractors table is pre-populated (see files 'wslua_listener.c' and 'taps',
the latter of which is turned into 'taps_wslua.c' by the make-taps.pl Perl
script). The downside of that approach is the performance impact, filling
fields the Lua script may never access.  Lastly, the Field, FieldInfo, and
Tvb's ByteArray types each provide a __call metamethod as an accessor - I
strongly suggest you do NOT do that, as it's not a common model and will
confuse people since it doesn't follow the model of the other classes in
Wireshark.
199
200Attributes are handled internally like this:
201
    -- invoked on myObj.myAttribute
    function myObj.__metatable:__index(key)
        if "getter for key exists" then
            return getter(self)
        elseif "method for key exists" then
            -- ensures that myObj.myMethod() works
            return method
        else
            error("no such property error message")
        end
    end

    -- invoked on myObj.myAttribute = 1
    function myObj.__metatable:__newindex(key, value)
        if "setter for key exists" then
            return setter(self, value)
        else
            error("no such property error message")
        end
    end
221
222To add getters/setters in C, initialize the "attrs" member of the wslua_class
223structure. This should contain an array table similar to the WSLUA_METHODS and
224WSLUA_META tables, except using the macro name WSLUA_ATTRIBUTES. Inside this
225array, each entry should use one of the following macros: WSLUA_ATTRIBUTE_ROREG,
226WSLUA_ATTRIBUTE_WOREG, or WSLUA_ATTRIBUTE_RWREG. Those provide the hooks for
227a getter-only, setter-only, or both getter and setter function. The functions
228themselves need to follow a naming scheme of ClassName_get_attributename(),
229or ClassName_set_attributename(), for the respective getter vs. setter function.
230Trivial getters/setters have macros provided to make this automatic, for things
231such as getting numbers, strings, etc. The macros are in wslua.h. For example,
232the WSLUA_ATTRIBUTE_NAMED_BOOLEAN_GETTER(Foo,bar,choo) macro creates a getter
233function to get the boolean value of the Class Foo's choo member variable, as
234the Lua attribute named 'bar'.
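
The getter/setter lookup can be sketched in plain C. This is a simplified,
self-contained analogue, not the code in wslua_internals.c: the Foo structure,
the demo_attribute_t table, and the demo_index/demo_newindex functions are all
hypothetical, and only integer-valued attributes are handled. It follows the
ClassName_get_attributename/ClassName_set_attributename naming scheme and
shows how read-only, write-only and read-write entries differ.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical class data for a class named Foo. */
typedef struct {
    int choo;
    int count;
} Foo;

typedef int  (*demo_getter)(const Foo *self);
typedef void (*demo_setter)(Foo *self, int value);

typedef struct {
    const char  *name;
    demo_getter  getter;   /* NULL for write-only attributes */
    demo_setter  setter;   /* NULL for read-only attributes  */
} demo_attribute_t;

/* Getter exposing member 'choo' as the boolean attribute 'bar'. */
static int  Foo_get_bar(const Foo *self)        { return self->choo != 0; }
static int  Foo_get_count(const Foo *self)      { return self->count; }
static void Foo_set_count(Foo *self, int value) { self->count = value; }

static const demo_attribute_t Foo_attributes[] = {
    { "bar",   Foo_get_bar,   NULL          },   /* read-only  */
    { "count", Foo_get_count, Foo_set_count },   /* read-write */
    { NULL, NULL, NULL }
};

/* Analogue of __index: returns 0 on success, -1 for "no such property". */
static int
demo_index(const Foo *self, const char *key, int *out)
{
    const demo_attribute_t *a;

    for (a = Foo_attributes; a->name != NULL; a++) {
        if (strcmp(a->name, key) == 0 && a->getter != NULL) {
            *out = a->getter(self);
            return 0;
        }
    }
    return -1;
}

/* Analogue of __newindex: returns 0 on success, -1 if there is no setter. */
static int
demo_newindex(Foo *self, const char *key, int value)
{
    const demo_attribute_t *a;

    for (a = Foo_attributes; a->name != NULL; a++) {
        if (strcmp(a->name, key) == 0 && a->setter != NULL) {
            a->setter(self, value);
            return 0;
        }
    }
    return -1;
}
```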
235
236Callback function registration:
237
238For some callbacks, there are register_* Lua global functions, which take a
239user-defined Lua function name as the argument - the one to be hooked into
240that event.  Unlike in most Lua APIs, there's a unique register_foo() function
241for each event type, instead of a single register() with the event as an
242argument.  For example there's a register_postdissector() function.  In some
243cases the Lua functions are invoked based on a pre-defined function-name model
244instead of explicit register_foo(), whereby a C-object looks for a defined
245member variable in the Registry that represents a Lua function created by the
246plugin.  This would be the case if the Lua plugin had defined a pre-defined
247member key of its object's table in Lua, for that purpose.  For example if the
248Lua plugin sets the 'reset' member of the Listener object table to a function,
249then Wireshark creates a Registry entry for that Lua function, and executes
250that Lua function when the Listener resets. (see the example Listener Lua
251script in the online docs) That model is only useful if the object can only be
252owned by one plugin so only one function is ever hooked, obviously, and thus
253only if it's created by the Lua plugin (e.g., Listener.new()).
254
255Creating new Listener tap types:
256
257The Listener object is one of the more complicated ones.  When the Lua script
258creates a Listener (using Listener.new()), the code creates and returns a tap
259object.  The type of tap is based on the passed-in argument to Listener.new(),
260and it creates a Lua table of the tap member variables.  That happens in
261taps_wslua.c, which is an auto-generated file from make-taps.pl.  That Perl
262script reads from a file called 'taps', which identifies every struct name
263(and associated enum name) that should be exposed as a tap type.  The Perl
264script then generates the taps_wslua.c to push those whenever the Listener
265calls for a tap; and it also generates a taps.tx file documenting them all.
266So to add a new type, add the info to the taps file (or uncomment an existing
267one), and make sure every member of the tap struct you're exposing is of a
268type that make-taps.pl has in its Perl %types and %comments associative
269arrays.
270
271Note on Lua versions:
272
273Wireshark supports both Lua 5.1 and 5.2, which are defined as LUA_VERSION_NUM
274values 501 and 502 respectively.  When exposing things into Lua, make sure to
275use ifdef wrappers for things which changed between the versions of Lua.  See
276this for details: http://www.lua.org/manual/5.2/manual.html#8.3
277
278==============================================================================
279
280Defined Macros for Lua-API C-files:
281
282WSLUA_MODULE - this isn't actually used in real C-code, but rather only
283appears in C-comments at the top of .c files.  That's because it's purely used
284for documentation, and it makes a new section in the API documentation.
285
286For example, this appears near the top of the wslua_gui.c file:
287
    /* WSLUA_MODULE Gui GUI support */
289
That makes the API documentation have a section titled 'GUI support' (it's
currently section 11.7 in the API docs).  It does NOT mean there's any Lua
table named 'Gui' (in fact there isn't).  It's just for documentation.
If you look at the documentation, you'll see there is 'ProgDlg', 'TextWindow',
etc. in that 'GUI support' section.  That's because both ProgDlg and
TextWindow are defined in that same wslua_gui.c file using the
'WSLUA_CLASS_DEFINE' macro (see the description of that later).
make-wsluarm.pl created those in the same documentation section because
they're in the same C file as that WSLUA_MODULE comment.  You'll also note the
documentation includes a sub-section for 'Non Method Functions', which it
auto-generated from anything with a 'WSLUA_FUNCTION' macro (as opposed to class
member functions, which use the 'WSLUA_METHOD' and 'WSLUA_CONSTRUCTOR'
macros). Also, to make new wslua files generate documentation, it is not
sufficient to just add this macro to a new file and add the file to the
CMakeLists.txt; you also have to add the module name into
docbook/user-guide.xml and docbook/wsluarm.xml.
305
306
307WSLUA_CONTINUE_MODULE - like WSLUA_MODULE, except used at the top of a .c file
308to continue defining classes/functions/etc. within a previously declared module
309in a previous file (i.e., one that used WSLUA_MODULE). The module name must match
310the original one, and the .c file must be listed after the original one in the
311CMakeLists.txt lists in the docbook directory.
312
313
WSLUA_ATTRIBUTE - this is another documentation-only "macro", only used within
comments.  It makes the API docs generate documentation for a member variable
of a class, i.e. a key of a Lua table that is not called as a function in Lua,
but rather just retrieved or set.  The 'WSLUA_ATTRIBUTE' token is followed by
a 'RO', 'WO', or 'RW' token, for Read-Only, Write-Only, or Read-Write (i.e.,
whether the variable can be retrieved, written to, or both).  This read/write
mode indication gets put into the API documentation. After that comes the name
of the attribute, which must be the class name followed by the specific
attribute name.
323
324Example:
325
    /* WSLUA_ATTRIBUTE Pinfo_rel_ts RO Number of seconds passed since beginning of capture */
327
328
329
WSLUA_FUNCTION - this is used for any global Lua function (functions put into
the global table) you want to expose, but not for object-style methods (that's
the 'WSLUA_METHOD' macro), nor static functions within an object (that's
WSLUA_CONSTRUCTOR).  Unlike many of the macros here, the function name must
begin with 'wslua_'.  Everything after that prefix will be the name of the
function in Lua.  You can ONLY use lower-case letters and the underscore
character in this function name.  For example 'WSLUA_FUNCTION
wslua_get_foo(lua_State* L)' will become a Lua function named 'get_foo'.
Documentation for it will also be automatically generated, as it is for the
other macros.  Although from a Lua perspective it is a global function (not in
any class' table), the documentation will append it to the documentation page
of the module/file its source code is in, in a "Non Method Functions" section.
Descriptive text about the function must be located after the '{' and optional
whitespace, within a '/*' '*/' comment block on one line.
344
345Example:
346
    WSLUA_FUNCTION wslua_gui_enabled(lua_State* L) { /* Checks whether the GUI facility is enabled. */
        lua_pushboolean(L,GPOINTER_TO_INT(ops && ops->add_button));
        WSLUA_RETURN(1); /* A boolean: true if it is enabled, false if it isn't. */
    }
351
352
353WSLUA_CLASS_DEFINE - this is used to define/create a new Lua class type (i.e.,
354table with methods).  A Class name must begin with an uppercase letter,
355followed by any upper or lower case letters but not underscores; in other
356words, UpperCamelCase without numbers.  The macro is expanded to create a
357bunch of helper functions - see wslua.h.  Documentation for it will also be
358automatically generated, as it is for the other macros.
359
360Example:
361
    WSLUA_CLASS_DEFINE(ProgDlg,NOP,NOP); /* Manages a progress bar dialog. */
363
364
365WSLUA_CONSTRUCTOR - this is used to define a function of a class that is a
366static function rather than a per-object method; i.e., from a Lua perspective
367the function is called as 'myObj.func()' instead of 'myObj:func()'.  From a
368C-code perspective the code generated by make-reg.pl does not treat this
369differently than a WSLUA_METHOD, the only real difference being that the code
370you write within the function won't be checking the object instance as the
371first passed-in argument on the Lua-API stack.  But from a documentation
372perspective this macro correctly documents the usage using a '.' period rather
373than ':' colon.  This can also be used within comments, but then it's
374'_WSLUA_CONSTRUCTOR_'.  The name of the function must use the Class name
375first, followed by underscore, and then the specific lower_underscore name
376that will end up being the name of the function in Lua.
377
378Example:
379
    WSLUA_CONSTRUCTOR Dissector_get (lua_State *L) {
        /* Obtains a dissector reference by name */
    #define WSLUA_ARG_Dissector_get_NAME 1 /* The name of the dissector */
        const gchar* name = luaL_checkstring(L,WSLUA_ARG_Dissector_get_NAME);
        Dissector d;

        if (!name)
            WSLUA_ARG_ERROR(Dissector_get,NAME,"must be a string");

        if ((d = find_dissector(name))) {
            pushDissector(L, d);
            WSLUA_RETURN(1); /* The Dissector reference */
        } else
            WSLUA_ARG_ERROR(Dissector_get,NAME,"No such dissector");
    }
395
396
397WSLUA_METHOD - this is used for object-style class method definitions.  The
398documentation will use the colon syntax, and it will be called as
399'myObj:func()' in Lua, so your function needs to check the first argument of
400the stack for the object pointer.  Two helper functions are automatically made
401for this purpose, from the macro expansion of WSLUA_CLASS_DEFINE, of the
402signatures 'MyObj toMyObj(lua_State* L, int idx)' and 'MyObj
403checkMyObj(lua_State* L, int idx)'.  They do the same thing, but the former
404generates a Lua Error on failure, while the latter does not.
405
406Example:
407
    WSLUA_METHOD Listener_remove(lua_State* L) {
        /* Removes a tap listener */
        Listener tap = checkListener(L,1);
        if (!tap) return 0;
        remove_tap_listener(tap);
        return 0;
    }
415
416
WSLUA_METAMETHOD - this is used for defining object metamethods (i.e., Lua
metatable functions).  The documentation will describe these as well, although
currently it doesn't specify they're metamethods but rather makes them appear
as regular object methods.  The name of it must be the class name followed by
*two* underscores, or else it will not appear in the documentation.
422
423Example:
424
    WSLUA_METAMETHOD NSTime__eq(lua_State* L) { /* Compares two NSTimes */
        NSTime time1 = checkNSTime(L,1);
        NSTime time2 = checkNSTime(L,2);
        gboolean result = FALSE;

        if (!time1 || !time2)
            WSLUA_ERROR(NSTime__eq,"Both arguments must be NSTime objects");

        if (nstime_cmp(time1, time2) == 0)
            result = TRUE;

        lua_pushboolean(L,result);

        return 1;
    }
440
441
WSLUA_ARG_ - the prefix used in a #define statement, for a required
function/method argument (i.e., one without a default value).  It is defined to
an integer representing the index slot number of the Lua stack it will be at,
when calling the appropriate luaL_check*/luaL_opt* routine to get it from the
stack.  The make-wsluarm.pl Perl script will generate API documentation with
this argument name for the function/method, removing the 'WSLUA_ARG_' prefix.
The name following the 'WSLUA_ARG_' prefix must be the same name as the
function it's an argument for, followed by an underscore and then an ALLCAPS
argument name (including numbers is ok).  Although this last part is in
ALLCAPS, it is documented in lowercase.  The argument name itself is
meaningless since it does not exist in Lua or C code.
453
454Example: see the example in WSLUA_CONSTRUCTOR above, where
455WSLUA_ARG_Dissector_get_NAME is used.
456
457
WSLUA_OPTARG_ - the prefix used in a #define statement, for an optional
function/method argument (i.e., one with a default value).  It is defined to an
integer representing the index slot number of the Lua stack it will be at,
when calling the appropriate luaL_check*/luaL_opt* routine to get it from the
stack.  The make-wsluarm.pl Perl script will generate API documentation with
this argument name for the function/method, removing the 'WSLUA_OPTARG_'
prefix.  The rules for the name of the argument after the prefix are the same
as for 'WSLUA_ARG_' above.
466
467Example:
468
    #define WSLUA_OPTARG_Dumper_new_FILETYPE 2 /* The type of the file to be created */
470
471
472WSLUA_MOREARGS - a documentation-only macro used to document that more
473arguments are expected/supported.  This is useful when the number of
474arguments is not fixed, i.e., a vararg model.  The macro is followed by the
475name of the function it's an argument for (without the 'wslua_' prefix if the
476function is a WSLUA_FUNCTION type), and then followed by descriptive text.
477
478Example:
479
    WSLUA_FUNCTION wslua_critical( lua_State* L ) { /* Will add a log entry with critical severity */
        /* WSLUA_MOREARGS critical objects to be printed */
        wslua_log(L,G_LOG_LEVEL_CRITICAL);
        return 0;
    }
485
486
WSLUA_RETURN - a macro with parentheses containing the number of return
values, meaning the number of items pushed back to Lua.  Lua supports multiple
return values, although Wireshark usually just returns 0 or 1 value.  The
argument can be an integer literal or an integer variable, and is not actually
documented.  The API documentation will use the comments after this macro for
the return description.  This macro can also be within comments, but is then
'_WSLUA_RETURNS_'.
494
495Example:
496
    WSLUA_RETURN(1); /* The ethernet pseudoheader */
498
499
WSLUA_ERROR - this C macro takes arguments, and expands to call luaL_error()
using them, and returns 0.  The arguments it takes are the full function name
and a string describing the error.  For documentation, it uses the string
argument and displays it with the function it's associated with.

Example:

    if (!wtap_dump_can_write_encap(filetype, encap))
        WSLUA_ERROR(Dumper_new,"Not every filetype handles every encap");
508
509
510WSLUA_ARG_ERROR - this is a pure C macro and does not generate any
511documentation.  It is used for errors in type/value of function/method
512arguments.
513
Example: see the example in the WSLUA_CONSTRUCTOR above.
515
516
517==============================================================================
518
519Memory management model:
520
Lua uses a garbage collection model, which for all intents and purposes can
collect garbage at any time once an item is no longer referenced by something
in Lua.  When C-malloc'ed values are pushed into Lua, the Lua library has to
let you decide whether to try to free them or not.  This is done through the
'__gc' metamethod, so every Wireshark class created by WSLUA_CLASS_DEFINE must
implement a metamethod function to handle this.  The name of the function must
be 'ClassName__gc', where 'ClassName' is the same name as the class.  Even if
you decide to do nothing, you still have to define the function or the code
will fail to compile.  That requirement was introduced deliberately, around
the time of this writing, to make the programmer think about it and not
forget.
531
The thing to think about is the lifetime of the object/value.  If C-code
controls/manages the object after pushing it into Lua, then C-code MUST NOT
free it until it knows Lua has garbage collected it, which is only known by
the __gc metamethod being invoked.  Otherwise you run the risk of the Lua
script trying to use it later, which will dereference a pointer to something
that has been freed, and crash.  There are known ways to avoid this, but
those ways are not currently used in Wireshark's Lua API implementation;
except Tvb and TvbRange do implement a simple model of reference counting to
protect against this.
541
If possible/reasonable, the best model is to malloc the object when you push
it into Lua, usually in a class function (not method) named 'new', and then
free it in the __gc metamethod.  But if that's not reasonable, then the next
best model is to have a boolean member of the class called something like
'expired', which is set to true if the C-code decides it is
dead/no-longer-useful.  Every Lua-to-C accessor method for that class type
then checks that boolean before trying to use the object, and the __gc
metamethod either sets expired=true or, if the object was already expired by
C-side code, frees it; and vice-versa for the C-side code.
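
The 'expired' model can be sketched as follows. This is a minimal,
self-contained illustration under stated assumptions, not code from Wireshark:
demo_object_t and all function names are hypothetical, and no Lua state is
involved. Whichever side lets go first only marks the object expired; the
second side to let go frees it, and accessors refuse to touch an expired
object.

```c
#include <stdlib.h>

/* Hypothetical object shared between C code and the Lua garbage collector. */
typedef struct {
    int payload;
    int expired;   /* set by whichever side lets go of the object first */
} demo_object_t;

static demo_object_t *
demo_object_new(int payload)
{
    demo_object_t *obj = calloc(1, sizeof *obj);
    if (obj != NULL) {
        obj->payload = payload;
    }
    return obj;
}

/*
 * Shared release logic: the first release marks the object expired,
 * the second one frees it.  Returns 1 if the memory was freed.
 */
static int
demo_object_release(demo_object_t *obj)
{
    if (obj->expired) {
        free(obj);
        return 1;
    }
    obj->expired = 1;
    return 0;
}

/* What the C side would call once the object is dead/no longer useful. */
static int
demo_c_side_done(demo_object_t *obj)
{
    return demo_object_release(obj);
}

/* What a ClassName__gc metamethod would do when Lua collects the userdata. */
static int
demo_gc(demo_object_t *obj)
{
    return demo_object_release(obj);
}

/* Accessor: checks the flag first.  Returns 1 and fills *out if usable. */
static int
demo_object_get_payload(const demo_object_t *obj, int *out)
{
    if (obj->expired) {
        return 0;
    }
    *out = obj->payload;
    return 1;
}
```

The same two-step release works in either order, so neither side needs to
know whether the other has already let go.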
551
In some cases the class is exposed with a specific method to free/remove it,
typically called 'remove'; the Listener class does this, for example.  When
the Lua script calls myListener:remove(), the C-code for that class method
frees the Listener that was malloc'ed previously in Listener.new().  The
Listener__gc() metamethod does not do anything, since it's hopefully already
been freed.  The downside with this approach is if the script never calls
remove(), then it leaks memory; and if the script ever tries to use the
Listener userdata object after it called remove(), then Wireshark crashes.  Of
course either case would be a Lua script programming error, and easily
fixable, so it's not a huge deal.
562
563==============================================================================
564
565

README.xml-output

Protocol Dissection in XML Format
=================================
Copyright (c) 2003 by Gilbert Ramirez <gram@alumni.rice.edu>

Wireshark has the ability to export its protocol dissection in an
XML format; tshark has similar functionality through the "-Tpdml"
option.

The XML that Wireshark produces follows the Packet Details Markup
Language (PDML) specified by the group at the Politecnico Di Torino
working on Analyzer. The specification was found at:

http://analyzer.polito.it/30alpha/docs/dissectors/PDMLSpec.htm

That URL no longer works, but a copy can be found at the Internet
Archive:

https://web.archive.org/web/20050305174853/http://analyzer.polito.it/30alpha/docs/dissectors/PDMLSpec.htm

This is similar to the NetPDL language specification:

http://www.nbee.org/doku.php?id=netpdl:index

The domain registration there has also expired, but an Internet Archive
copy is available at:

https://web.archive.org/web/20160305211810/http://nbee.org/doku.php?id=netpdl:index

A related XML format, the Packet Summary Markup Language (PSML), is
also defined by the Analyzer group to provide packet summary information.
The PSML format is not documented in a publicly-available HTML document,
but its format is simple. Wireshark can export this format too, and
tshark can produce it with the "-Tpsml" option.

PDML
====
The PDML that Wireshark produces is known not to be loadable into Analyzer.
It causes Analyzer to crash. As such, the PDML that Wireshark produces
is labeled with a version number of "0", which means that the PDML does
not fully follow the PDML spec. Furthermore, a "creator" attribute in the
"<pdml>" tag gives the version number of Wireshark/tshark that produced the
PDML.

In that way, as the PDML produced by Wireshark matures, but still does not
meet the PDML spec, scripts can make intelligent decisions about how to
best parse the PDML, based on the "creator" attribute.

A PDML file is delimited by a "<pdml>" tag.
A PDML file contains multiple packets, denoted by the "<packet>" tag.
A packet will contain multiple protocols, denoted by the "<proto>" tag.
A protocol might contain one or more fields, denoted by the "<field>" tag.

A pseudo-protocol named "geninfo" is produced, as is required by the PDML
spec, and exported as the first protocol after the opening "<packet>" tag.
Its information comes from Wireshark's "frame" protocol, which serves
the similar purpose of storing packet meta-data. Both "geninfo" and
"frame" protocols are provided in the PDML output.

The "<pdml>" tag
================
Example:
	<pdml version="0" creator="wireshark/0.9.17">

The creator is "wireshark" version 0.9.17. The name refers to the "wireshark"
engine: it will always say "wireshark", not "tshark".
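
As a minimal sketch of how a script might act on the "creator" attribute,
the following uses Python's standard xml.etree.ElementTree module (not
WiresharkXML). The helper name parse_creator is hypothetical, and the
sample tag is the one from the example above:

```python
import xml.etree.ElementTree as ET

def parse_creator(pdml_root):
    """Split a creator value such as "wireshark/0.9.17" into (name, version tuple)."""
    creator = pdml_root.get("creator", "")
    name, _, version = creator.partition("/")
    parts = []
    for piece in version.split("."):
        # Keep only the leading digits of each component; stop at the
        # first component that has none (e.g. a non-numeric suffix).
        digits = ""
        for ch in piece:
            if not ch.isdigit():
                break
            digits += ch
        if not digits:
            break
        parts.append(int(digits))
    return name, tuple(parts)

root = ET.fromstring('<pdml version="0" creator="wireshark/0.9.17"></pdml>')
name, version = parse_creator(root)
print(name, version)
```

A script could then compare the version tuple against a threshold, for
example "version >= (1, 0)", before choosing a parsing strategy.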


The "<proto>" tag
=================
"<proto>" tags can have the following attributes:

	name - the display filter name for the protocol
	showname - the label used to describe this protocol in the protocol
		tree. This is usually the descriptive name of the protocol,
		but it can be modified by dissectors to include more data
		(tcp can do this)
	pos - the starting offset within the packet data where this
		protocol starts
	size - the number of octets in the packet data that this protocol
		covers.

The "<field>" tag
=================
"<field>" tags can have the following attributes:

	name - the display filter name for the field
	showname - the label used to describe this field in the protocol
		tree. This is usually the descriptive name of the field,
		followed by some representation of the value.
	pos - the starting offset within the packet data where this
		field starts
	size - the number of octets in the packet data that this field
		covers.
	value - the actual packet data, in hex, that this field covers
	show - the representation of the packet data ('value') as it would
		appear in a display filter.
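
The attributes listed above can be read with any XML parser. The sketch
below uses Python's standard xml.etree.ElementTree module on a hand-written
fragment; the fragment and its values are illustrative assumptions, not
captured tshark output:

```python
import xml.etree.ElementTree as ET

# Hypothetical fragment with a single field; real PDML nests many more.
SAMPLE = '''<proto name="ip" showname="Internet Protocol" pos="14" size="20">
  <field name="ip.ttl" showname="Time to live: 64" pos="22" size="1" value="40" show="64"/>
</proto>'''

proto = ET.fromstring(SAMPLE)
field = proto.find("field")

raw = bytes.fromhex(field.get("value"))  # 'value' holds the packet data in hex
offset = int(field.get("pos"))           # starting offset within the packet data
length = int(field.get("size"))          # number of octets covered
shown = field.get("show")                # display-filter representation

print(field.get("name"), offset, length, raw, shown)
```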


Deviations from the PDML standard
=================================
Various dissectors parse packets in a way that does not fit all the assumptions
in the PDML specification. In some cases Wireshark adjusts the output to match
the spec more closely, but exceptions exist.

Some dissectors sometimes place text into the protocol tree, without using
a field with a field-name. Those appear in PDML as "<field>" tags with no
'name' attribute, but with a 'show' attribute giving that text.

Some dissectors place field items at the top level instead of inside a
protocol. In these cases, in the PDML output the field items are placed
inside a fake "<proto>" element named "fake-field-wrapper" in order to
maximize compliance.

Many dissectors label the undissected payload of a protocol as belonging
to a "data" protocol, and the "data" protocol often resides inside
the last protocol dissected. In the PDML, the "data" protocol becomes
a "data" field, placed exactly where the "data" protocol is in Wireshark's
protocol tree. So, if Wireshark would normally show:

+-- Frame
|
+-- Ethernet
|
+-- IP
|
+-- TCP
|
+-- HTTP
    |
    +-- Data

In PDML, the "Data" protocol would become another field under HTTP:

<packet>
	<proto name="frame">
	...
	</proto>

	<proto name="eth">
	...
	</proto>

	<proto name="ip">
	...
	</proto>

	<proto name="tcp">
	...
	</proto>

	<proto name="http">
	...
		<field name="data" value="........."/>
	</proto>
</packet>

In cases where the "data" protocol appears at the top level, it is
still converted to a field, and placed inside the "fake-field-wrapper"
protocol, just as any other top level field.

Similarly, expert info items in Wireshark belong to an internal protocol
named "_ws.expert", which is likewise converted into a "<field>" element
of that name.

Some dissectors also place subdissected protocols in a subtree instead of
at the top level. Unlike with the "data" protocol, the PDML output does
_not_ change these protocols to fields, but rather outputs them as "<proto>"
elements. This results in well-formed XML that does, however, violate the
PDML spec, as "<proto>" elements should only appear as direct children of
"<packet>" elements, with only "<field>" elements nested therein.

Note that the "<packet>" tag may have nonstandard color attributes,
"foreground" and "background".
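
A parser that wants to cope with these deviations has to look for them
explicitly. The following sketch, again using Python's standard
xml.etree.ElementTree module, separates top-level protocols, nested
"<proto>" elements, and unnamed text fields. The sample packet and the
helper name classify are made up for illustration (the nested "json"
protocol is an assumed example):

```python
import xml.etree.ElementTree as ET

# Hand-written packet that exhibits the deviations described above.
SAMPLE = '''<packet>
  <proto name="geninfo"/>
  <proto name="frame"/>
  <proto name="http">
    <field name="data" value="0102"/>
    <field show="Some unnamed text"/>
    <proto name="json"/>
  </proto>
  <proto name="fake-field-wrapper">
    <field name="data" value="ff"/>
  </proto>
</packet>'''

def classify(packet):
    """Return (top-level protos, nested protos, text of unnamed fields)."""
    tops = packet.findall("proto")  # direct children only
    top_level = [p.get("name") for p in tops]
    # iter() also yields the starting element itself, so skip it.
    nested = [p.get("name") for top in tops
              for p in top.iter("proto") if p is not top]
    unnamed = [f.get("show") for f in packet.iter("field")
               if f.get("name") is None]
    return top_level, nested, unnamed

top_level, nested, unnamed = classify(ET.fromstring(SAMPLE))
print(top_level, nested, unnamed)
```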


tools/WiresharkXML.py
=====================
This is a Python module which provides some infrastructure for
Python developers who wish to parse PDML. It is designed to read
a PDML file and call a user's callback function every time a packet
is constructed from the protocols and fields for a single packet.

The Python user should import the module, define a callback function
which accepts one argument, and call the parse_fh function:

------------------------------------------------------------
import WiresharkXML

def my_callback(packet):
    # Called once per packet; 'packet' is a WiresharkXML.Packet object.
    pass  # do something

# If the PDML is stored in a file, you can:
with open(xml_filename) as fh:
    WiresharkXML.parse_fh(fh, my_callback)

# or, if the PDML is contained within a string, you can:
WiresharkXML.parse_string(my_string, my_callback)

# Now that the script has the packet data, do something.
------------------------------------------------------------

The object that is passed to the callback function is a
WiresharkXML.Packet object, which corresponds to a single packet.
WiresharkXML provides three classes, each of which corresponds to a PDML tag:

	Packet	 - "<packet>" tag
	Protocol - "<proto>" tag
	Field    - "<field>" tag

Each of these classes has accessors which will return the defined attributes:

	get_name()
	get_showname()
	get_pos()
	get_size()
	get_value()
	get_show()

Protocols and fields can contain other fields. Thus, the Protocol and
Field classes each have a "children" member, which is a simple list of the
Field objects, if any, that are contained. The "children" list can be
directly accessed by code using the object. The "children" list will be
empty if this Protocol or Field contains no Fields.

Furthermore, the Packet class is a sub-class of the PacketList class.
The PacketList class provides methods to look for protocols and fields.
The term "item" is used when the item being looked for can be
a protocol or a field:

	item_exists(name) - checks if an item exists in the PacketList
	get_items(name) - returns a PacketList of all matching items


General Notes
=============
Generally, parsing XML is slow. If you're writing a script to parse
the PDML output of tshark, pass a read filter with "-R" to tshark to
reduce the number of packets coming out of tshark as much as possible.
(Current tshark versions accept "-R" only together with "-2"; "-Y"
applies a display filter in a single pass.)
The less your script has to process, the faster it will be.
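
Independently of filtering, a script can keep its memory use bounded by
processing one "<packet>" at a time instead of loading the whole document.
This is a sketch using xml.etree.ElementTree.iterparse from the Python
standard library rather than WiresharkXML; the helper name iter_packets
and the in-memory sample are assumptions for illustration (in practice the
stream would be tshark's PDML output):

```python
import io
import xml.etree.ElementTree as ET

def iter_packets(stream):
    """Yield each <packet> element as soon as its end tag has been read."""
    for _event, elem in ET.iterparse(stream, events=("end",)):
        if elem.tag == "packet":
            yield elem
            # Runs when the consumer asks for the next packet: release the
            # children that have already been processed.
            elem.clear()

pdml = io.BytesIO(
    b'<pdml version="0" creator="wireshark/0.9.17">'
    b'<packet><proto name="frame"/></packet>'
    b'<packet><proto name="frame"/><proto name="eth"/></packet>'
    b'</pdml>'
)
counts = [len(pkt.findall("proto")) for pkt in iter_packets(pdml)]
print(counts)
```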

tools/msnchat
=============
tools/msnchat is a sample Python program that uses WiresharkXML to parse
PDML. Given one or more capture files, it runs tshark on each of them,
providing a read filter to reduce tshark's output. It finds MSN Chat
conversations in the capture file and produces nice HTML showing the
conversations. It has only been tested with capture files containing
non-simultaneous chat sessions, but was written to more-or-less handle any
number of simultaneous chat sessions.

pdml2html.xsl
=============
pdml2html.xsl is an XSLT file to convert PDML files into HTML.
See https://gitlab.com/wireshark/wireshark/-/wikis/PDML for more details.

254