January 7, 2002

MP4V2 LIBRARY INTERNALS
=======================

This document provides an overview of the internals of the mp4v2 library
to aid those who wish to modify and extend it. Before reading this document,
I recommend familiarizing yourself with the MP4 (or QuickTime) file format
standard and the mp4v2 library API. The API is described in a set of man pages
in mpeg4ip/doc/mp4v2, or, if you prefer, in mp4.h itself.

All the library code is written in C++; however, the library API uses
C calling conventions and hence is linkable by both C and C++ programs. The
library has been compiled and used on Linux, BSD, Windows, and Mac OS X.
Other than libc, the library has no external dependencies, and hence can
be used independently of the mpeg4ip package if desired. The library is
used for both real-time recording and playback in mpeg4ip, and its runtime
performance is up to those tasks. On the IA32 architecture compiled with gcc,
the stripped library is approximately 600 KB of code and initialized data.

It is useful to think of the mp4v2 library as consisting of four layers:
infrastructure, file format, generic tracks, and type specific track helpers.
A description of each layer follows, from the fundamental to the optional.


Infrastructure
==============

The infrastructure layer provides basic file I/O, memory allocation,
error handling, string utilities, and protected arrays. The source files
for this layer are mp4file_io, mp4util, and mp4array.

Note that the array classes use preprocessor macros instead of C++
templates. The rationale for this is to increase portability, given the
sometimes incomplete support for templates in some compilers.
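To make the macro approach concrete, here is a minimal sketch of how a
bounds-checked array class can be generated per element type without
templates. It is an illustration only, not the code in mp4array: the macro
name DECLARE_CHECKED_ARRAY, the class names, and the member layout are all
assumptions made for this example.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstdlib>
#include <stdexcept>

// Hypothetical macro, not taken from the library: declares a growable array
// class named Name holding elements of Type, with bounds-checked access.
// Because storage grows with realloc(), Type must be a plain scalar or POD.
// Copying is disabled by declaring the copy operations private.
#define DECLARE_CHECKED_ARRAY(Type, Name)                                   \
    class Name {                                                            \
    public:                                                                 \
        Name() : m_elements(NULL), m_size(0), m_capacity(0) {}              \
        ~Name() { std::free(m_elements); }                                  \
        std::size_t Size() const { return m_size; }                         \
        void Add(Type value) {                                              \
            if (m_size == m_capacity) {                                     \
                std::size_t newCapacity = m_capacity ? 2 * m_capacity : 4;  \
                void* grown = std::realloc(m_elements,                      \
                                           newCapacity * sizeof(Type));     \
                if (grown == NULL) {                                        \
                    throw std::runtime_error("out of memory");              \
                }                                                           \
                m_elements = static_cast<Type*>(grown);                     \
                m_capacity = newCapacity;                                   \
            }                                                               \
            m_elements[m_size++] = value;                                   \
        }                                                                   \
        Type& operator[](std::size_t index) {                               \
            if (index >= m_size) {                                          \
                throw std::out_of_range("array index out of range");        \
            }                                                               \
            return m_elements[index];                                       \
        }                                                                   \
    private:                                                                \
        Name(const Name&);                                                  \
        Name& operator=(const Name&);                                       \
        Type* m_elements;                                                   \
        std::size_t m_size;                                                 \
        std::size_t m_capacity;                                             \
    }

// One macro invocation per element type replaces a template instantiation.
DECLARE_CHECKED_ARRAY(std::uint32_t, Uint32Array);
DECLARE_CHECKED_ARRAY(double, DoubleArray);
```

An out-of-range access such as element 2 of a two-element Uint32Array
throws std::out_of_range instead of reading past the allocation, which is
the "protected" behavior the infrastructure layer is after.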


File Format
===========

The file format layer provides the translation from the on-disk MP4 file
format to in-memory C++ structures and back to disk. It is intended
to exactly match the MP4 specification in syntax and semantics. It
represents the majority of the code.

There are three key structures at the file format layer: atoms, properties,
and descriptors.

Atoms are the primary containers within an mp4 file. They can contain
any combination of properties, other atoms, or descriptors.

The mp4atom files contain the base class for all the atoms, and provide
generic functions that cover most cases. Atoms that only specify their
properties, or their possible child atoms in the case of a container atom,
are defined in atom_standard.cpp; this covers most atoms.

Atoms that have special read, generate, or write needs are implemented as
subclasses, each in its own file atom_<name>.cpp, where <name> is the four
letter name of the atom defined in the MP4 specification. In these more
specialized cases the atom specific file provides routines to initialize,
read, or write the atom.

Properties are the atomic pieces of information. The basic types of
properties are integers, floats, strings, and byte arrays. For integers
and floats there are subclasses that represent the different storage sizes,
e.g. 8, 16, 24, 32, and 64 bit integers. For strings, there is one property
class with a number of options regarding exact storage details, e.g. null
terminated, fixed length, counted.
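As an illustration of why each storage size needs its own handling, the
sketch below encodes and decodes unsigned integers of 1 to 8 bytes in
big-endian order, which is the byte order MP4 uses on disk. The function
names ReadBigEndian and AppendBigEndian are hypothetical and are not the
library's property classes; the 24 bit case is the one with no native C++
type.

```cpp
#include <cstddef>
#include <cstdint>
#include <stdexcept>
#include <vector>

// Decode an unsigned big-endian integer of numBytes bytes (1..8) starting
// at buf[offset]. Hypothetical helper for illustration only.
std::uint64_t ReadBigEndian(const std::vector<std::uint8_t>& buf,
                            std::size_t offset, std::size_t numBytes) {
    if (numBytes == 0 || numBytes > 8) {
        throw std::invalid_argument("numBytes must be between 1 and 8");
    }
    if (offset > buf.size() || numBytes > buf.size() - offset) {
        throw std::out_of_range("read past end of buffer");
    }
    std::uint64_t value = 0;
    for (std::size_t i = 0; i < numBytes; ++i) {
        value = (value << 8) | buf[offset + i];  // most significant first
    }
    return value;
}

// Append the low numBytes bytes (1..8) of value to buf in big-endian order.
void AppendBigEndian(std::vector<std::uint8_t>& buf, std::uint64_t value,
                     std::size_t numBytes) {
    if (numBytes == 0 || numBytes > 8) {
        throw std::invalid_argument("numBytes must be between 1 and 8");
    }
    for (std::size_t i = numBytes; i > 0; --i) {
        buf.push_back(static_cast<std::uint8_t>(value >> (8 * (i - 1))));
    }
}
```

Writing 0x010203 with numBytes set to 3 appends the bytes 01 02 03, and
reading those three bytes back yields the same value.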

For implementation reasons, there are also two special properties, table
and descriptor, that are actually containers for groups of properties.
Because these containers provide a property interface, much code can
be written in a generic fashion.

The mp4property files contain all the property related classes.

Descriptors are containers that derive from the MPEG conventions and use
different encoding rules than the atoms derived from the QuickTime file
format. This means more use of bitfields and conditional existence, with
an emphasis on bit efficiency at the cost of encoding/decoding complexity.
Descriptors can contain other descriptors and/or properties.

The mp4descriptor files contain the generic base class for descriptors.
In addition, the mp4property files have a descriptor wrapper class that
allows a descriptor to behave as if it were a property. The specific
descriptors are implemented as subclasses of the base descriptor class in
a manner similar to that of atoms. The descriptors, ocidescriptors, and
qosqualifiers files contain these implementations.

Each atom/property/descriptor has a name closely related to that in the
MP4 specification. The difference is that the mp4v2 library doesn't
use '-' or '_' in property names and instead capitalizes the first letter
of each word after the first, e.g. "thisIsAPropertyName". A complete name
specifies the complete container path. The names follow the C/C++ syntax
for elements and array indices.

Examples are:
	"moov.mvhd.duration"
	"moov.trak[2].tkhd.duration"
	"moov.trak[3].mdia.minf.stbl.stsz[101].sampleSize"

Note that "*" can be used as a wildcard for an atom name (only). This is
most useful when dealing with the stsd atom, which contains child atoms with
various names but shared property names.

Note that internally, when performance matters, the code looks up a property
by name once, and then stores the returned pointer to the property class.
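The naming scheme can be illustrated with a small parser that splits a
complete name into its path elements and optional array indices. This is a
sketch under my own assumptions, not the lookup code in the library: the
PathElement struct and the ParsePropertyPath and NameMatches functions are
hypothetical names introduced here.

```cpp
#include <cstddef>
#include <cstdlib>
#include <stdexcept>
#include <string>
#include <vector>

// One element of a dotted name such as "trak[2]" (hypothetical type).
struct PathElement {
    std::string name;     // atom or property name, or "*" for any atom
    bool hasIndex;        // true if the element carried an [n] suffix
    unsigned long index;  // the number written in the path, if hasIndex
};

// Split "moov.trak[2].tkhd.duration" into its elements.
std::vector<PathElement> ParsePropertyPath(const std::string& path) {
    std::vector<PathElement> elements;
    std::size_t start = 0;
    while (start <= path.size()) {
        std::size_t dot = path.find('.', start);
        if (dot == std::string::npos) dot = path.size();
        const std::string token = path.substr(start, dot - start);
        start = dot + 1;

        PathElement element;
        element.hasIndex = false;
        element.index = 0;
        const std::size_t open = token.find('[');
        if (open == std::string::npos) {
            element.name = token;
        } else {
            // Expect name[digits], with ']' as the last character.
            if (token.size() < open + 3 || token[token.size() - 1] != ']') {
                throw std::invalid_argument("malformed index in: " + token);
            }
            const std::string digits =
                token.substr(open + 1, token.size() - open - 2);
            for (std::size_t i = 0; i < digits.size(); ++i) {
                if (digits[i] < '0' || digits[i] > '9') {
                    throw std::invalid_argument("malformed index in: " + token);
                }
            }
            element.name = token.substr(0, open);
            element.hasIndex = true;
            element.index = std::strtoul(digits.c_str(), NULL, 10);
        }
        if (element.name.empty()) {
            throw std::invalid_argument("empty element in path: " + path);
        }
        elements.push_back(element);
    }
    return elements;
}

// A "*" pattern matches any atom name; anything else must match exactly.
bool NameMatches(const std::string& pattern, const std::string& atomName) {
    return pattern == "*" || pattern == atomName;
}
```

Parsing once and keeping the result mirrors the note above: the string
work is done a single time, and later accesses go through whatever the
lookup returned rather than through the name.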

To add an atom, first check whether an existing atom can be used. If not,
decide whether the new atom needs special read, write, or generate handling;
for example, a property in the atom may change which other properties are
present (adding or removing them). If there are no special cases, add the
atom properties to atom_standard.cpp. If there are special cases, add a
new file, add a new class to atoms.h, and add the class to
MP4Atom::CreateAtom in mp4atom.cpp.


Generic Tracks
==============

The two entities at this level are the mp4 file as a whole and the tracks
which are contained within it. The mp4file and mp4track files contain the
implementation.

The critical work done by this layer is to map the collection of atoms,
properties, and descriptors that represent a media track into a useful
and consistent set of operations. For example, reading or writing a media
sample of a track is a relatively simple operation from the library API
perspective. However, there are numerous pieces of information in the mp4
file that need to be properly used and updated to do this. This layer
handles all those details.

Given familiarity with the mp4 spec, the code should be straightforward.
What may not be immediately obvious are the functions to handle chunks of
media samples. These exist to allow optimization of the mp4 file layout by
reordering the chunks on disk to interleave the media sample chunks of
multiple tracks in time order. (See the MP4Optimize API doc.)
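The idea of interleaving chunks in time order can be sketched as follows.
This is not the algorithm used by the library; it is only an illustration
that assumes each chunk's start time is already known in seconds. The
ChunkRef struct and the InterleaveByTime function are hypothetical names.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <vector>

// Identifies one chunk of one track (hypothetical type for illustration).
struct ChunkRef {
    std::uint32_t trackId;
    std::uint32_t chunkIndex;
    double startSeconds;  // presentation time of the chunk's first sample
};

static bool StartsEarlier(const ChunkRef& a, const ChunkRef& b) {
    return a.startSeconds < b.startSeconds;
}

// Merge the per-track chunk lists into a single on-disk write order,
// sorted by start time. stable_sort keeps chunks with equal start times
// in the order the tracks were supplied.
std::vector<ChunkRef> InterleaveByTime(
        const std::vector<std::vector<ChunkRef> >& tracks) {
    std::vector<ChunkRef> order;
    for (std::size_t t = 0; t < tracks.size(); ++t) {
        order.insert(order.end(), tracks[t].begin(), tracks[t].end());
    }
    std::stable_sort(order.begin(), order.end(), StartsEarlier);
    return order;
}
```

With a video track whose chunks start at 0.0 and 1.0 seconds and an audio
track whose chunks start at 0.5 and 1.5 seconds, the resulting order
alternates video, audio, video, audio, so a player reading sequentially
finds the data for both tracks close together.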


Type Specific Track Helpers
===========================

This specialized code goes beyond the meta-information about tracks in
the mp4 file to understanding and manipulating the information in the
track samples. There are currently two helpers in the library:
the MPEG-4 Systems Helper, and the RTP Hint Track Helper.

The MPEG-4 Systems Helper is currently limited to creating the OD, BIFS,
and SDP information about a minimal audio/video scene consistent with
the Internet Streaming Media Alliance (ISMA) specifications. We will be
evaluating how best to generalize the library's helper functions for
MPEG-4 Systems without overburdening the implementation. The code for
this helper is found in the isma and odcommands files.

The RTP Hint Track Helper is more extensive in its support. The hint
tracks contain the track packetization information needed to build
RTP packets for streaming. The library can construct RTP packets based
on the hint track, making RTP based servers significantly easier to write.

All code related to RTP hint tracks is in the rtphint files. It would also
be useful to look at test/mp4broadcaster and mpeg4ip/server/mp4creator for
examples of how this part of the library API can be used.


Library API
===========

The library API is defined and implemented in the mp4 files. The API uses
C linkage conventions, and the mp4.h file adapts itself according to whether
C or C++ is the compilation mode.

All API calls are implemented in mp4.cpp and are basically pass-throughs to
the MP4File member functions. This ensures that the library has internal
access to the same functions as are available via the API. All the calls in
mp4.cpp use C++ try/catch blocks to protect against any runtime errors in the
library. Upon error the library will print a diagnostic message if the
verbosity level has MP4_DETAILS_ERROR set, and return a distinguished error
value, typically 0 or -1.
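The pass-through pattern can be sketched as below. None of these names
come from mp4.h or mp4.cpp: DemoFile, DemoGetTrackDuration, and
g_printErrors are hypothetical stand-ins, and the real library's handle
type, verbosity check, and exception type may well differ. The sketch only
shows a C-linkage function forwarding to a C++ member function and turning
any exception into a distinguished return value.

```cpp
#include <cstdint>
#include <cstdio>
#include <stdexcept>

// Hypothetical stand-in for the internal C++ class behind the API.
class DemoFile {
public:
    explicit DemoFile(std::uint32_t numTracks) : m_numTracks(numTracks) {}

    // Throws on a bad argument, as the internal code does on any
    // unrecoverable error. The returned value is made up for the demo.
    std::uint64_t GetTrackDuration(std::uint32_t trackIndex) const {
        if (trackIndex >= m_numTracks) {
            throw std::out_of_range("no such track");
        }
        return 1000u * (trackIndex + 1);
    }

private:
    std::uint32_t m_numTracks;
};

// Stand-in for "the verbosity level has the error flag set".
static bool g_printErrors = false;

// C-linkage entry point: callable from C, never lets an exception escape.
extern "C" std::uint64_t DemoGetTrackDuration(void* handle,
                                              std::uint32_t trackIndex) {
    try {
        if (handle == nullptr) {
            throw std::invalid_argument("null file handle");
        }
        return static_cast<DemoFile*>(handle)->GetTrackDuration(trackIndex);
    } catch (const std::exception& e) {
        if (g_printErrors) {
            std::fprintf(stderr, "DemoGetTrackDuration: %s\n", e.what());
        }
        return 0;  // distinguished error value
    } catch (...) {
        return 0;
    }
}
```

Because the wrapper does nothing except forward and translate errors, code
inside the library can call the member function directly and get the same
behavior, minus the error translation.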

The test and util subdirectories contain useful examples of how to
use the library. Also the mp4creator and mp4live programs within
mpeg4ip demonstrate more complete usage of the library API.


Debugging
=========

Since mp4 files are fairly complicated, extensive debugging support is
built into the library. Multi-level diagnostic messages are available
under the control of a verbosity bitmask described in the API.

Also the library provides the MP4Dump() call which provides an ASCII
version of the mp4 file meta-information. The mp4dump utility is a
wrapper executable around this function.

The mp4extract program, also provided in the utilities directory, is
useful for extracting a track from an mp4 file and putting the media
data into its own file. It can also extract each sample of a track into
its own file if that is desired.

When all else fails, mp4 files are amenable to debugging by direct
examination. Since the atom names are four letter ASCII codes, finding
reference points in a hex dump is feasible. On UNIX, the od command
is your friend: "od -t x1z -A x [-j 0xXXXXXX] foo.mp4" will print
a hex and ASCII dump, with hex addresses, optionally starting from
a specified offset. The library diagnostic messages can provide
information on where the library is reading or writing.


General caveats
===============

The coding convention is to use the C++ throw operator whenever an
unrecoverable error occurs. This throw is caught at the API layer
in mp4.cpp and translated into an error value.

Be careful about indices. Internally, we follow the C/C++ convention
and use zero-based indices. However, the MP4 spec uses one-based indices
for things like samples, and hence the library API uses that convention.
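One way to keep the two conventions from mixing is to convert at a single
boundary and validate there. The helpers below are a sketch with
hypothetical names (SampleIdToIndex, IndexToSampleId), not functions from
the library.

```cpp
#include <cstdint>
#include <stdexcept>

// Convert a one-based sample id, as used at the API, to a zero-based
// array index, rejecting 0 and ids beyond the number of samples.
std::uint32_t SampleIdToIndex(std::uint32_t sampleId,
                              std::uint32_t numSamples) {
    if (sampleId == 0 || sampleId > numSamples) {
        throw std::out_of_range("invalid sample id");
    }
    return sampleId - 1;
}

// Convert a zero-based array index back to a one-based sample id.
std::uint32_t IndexToSampleId(std::uint32_t index) {
    return index + 1;
}
```

Note that 0 is never a valid one-based id, which is one reason 0 is
available as an error value for calls that return such ids.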