README for the Dirac video codec
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

by the BBC R&D Dirac team (diracinfo@rd.bbc.co.uk)


1. Executive Summary
~~~~~~~~~~~~~~~~~~~~

Dirac is an open source video codec. It uses a traditional hybrid video codec
architecture, but with the wavelet transform instead of the usual block
transforms.  Motion compensation uses overlapped blocks to reduce block
artefacts that would upset the transform coding stage.

Dirac can code just about any size of video, from streaming up to HD and
beyond, although certain presets are defined for different applications and
standards.  These cover the parameters that need to be set for the encoder to
work, such as block sizes and temporal prediction structures, which must
otherwise be set by hand.

Dirac is intended to develop into real coding and decoding software, capable
of plugging into video processing applications and media players that need
compression. It is intended to develop into a simple set of reliable but
effective coding tools that work over a wide variety of content and formats,
using well-understood compression techniques, in a clear and accessible
software structure. It is not intended as a demonstration or reference coder.


2. Documentation
~~~~~~~~~~~~~~~~

Documentation can be found at
http://diracvideo.org/wiki/index.php/Main_Page#Documentation

3. Building and installing
~~~~~~~~~~~~~~~~~~~~~~~~~~

  GNU/Linux, Unix, MacOS X, Cygwin, Mingw
  ---------------------------------------
    ./configure --enable-debug
        (to enable extra debug compile options)
     OR
    ./configure --enable-profile
        (to enable the g++ profiling flag -pg)
     OR
    ./configure --disable-mmx
        (to disable MMX optimisation which is enabled by default)
     OR
    ./configure --enable-debug --enable-profile
        (to enable extra debug compile options and profiling options)
     OR
     ./configure

     By default, both shared and static libraries are built. To build all-static
     libraries use
     ./configure --disable-shared

     To build shared libraries only use
     ./configure --disable-static

     make
     make install

  The INSTALL file documents arguments to ./configure such as
  --prefix=/usr/local (specify the installation location prefix).


  MSYS and Microsoft Visual C++
  -----------------------------
     Download and install the no-cost Microsoft Visual C++ 2008 Express
     Edition from
     http://msdn.microsoft.com/vstudio/express/visualc/

     Download and install MSYS (the MinGW Minimal SYStem), MSYS-1.0.10.exe,
     from http://www.mingw.org/download.shtml. An MSYS icon will be available
     on the desktop.

     Click on the MSYS icon on the desktop to open an MSYS shell window.

     Create a .profile file to set up the environment variables required.
     vi .profile

     Include the following four lines in the .profile file.

     PATH=/c/Program\ Files/Microsoft\ Visual\ Studio\ 9.0/Common7/IDE:/c/Program\ Files/Microsoft\ Visual\ Studio\ 9.0/VC/BIN:/c/Program\ Files/Microsoft\ Visual\ Studio\ 9.0/Common7/Tools:/c/WINDOWS/Microsoft.NET/Framework/v3.5:/c/WINDOWS/Microsoft.NET/Framework/v2.0.50727:/c/Program\ Files/Microsoft\ Visual\ Studio\ 9.0/VC/VCPackages:$PATH

     INCLUDE=/c/Program\ Files/Microsoft\ Visual\ Studio\ 9.0/VC/INCLUDE:$INCLUDE
     LIB=/c/Program\ Files/Microsoft\ Visual\ Studio\ 9.0/VC/LIB:$LIB

     LIBPATH=/c/WINDOWS/Microsoft.NET/Framework/v3.5:/c/WINDOWS/Microsoft.NET/Framework/v2.0.50727:/c/Program\ Files/Microsoft\ Visual\ Studio\ 9.0/VC/LIB:$LIBPATH

     (Replace /c/Program\ Files/Microsoft\ Visual\ Studio\ 9.0/ with
     the location where VC++ 2008 is installed if necessary)

     Exit from the MSYS shell and click on the MSYS icon on the desktop to open
     a new MSYS shell window for the .profile to take effect.

     Change directory to the directory where Dirac was unpacked. By default
     only the dynamic libraries are built.

     ./configure CXX=cl LD=cl --enable-debug
         (to enable extra debug compile options)
     OR
     ./configure CXX=cl LD=cl --disable-shared
         (to build static libraries)
     OR
     ./configure CXX=cl LD=cl
     make
     make install

     The INSTALL file documents arguments to ./configure such as
     --prefix=/usr/local (specify the installation location prefix).

  Microsoft Visual C++ .NET 2008
  ------------------------------
  Download and install the no-cost Microsoft Visual C++ 2008 Express
  Edition from
  http://www.microsoft.com/express/download/

  The MS VC++ 2008 solution and project files are in the win32/VisualStudio
  directory.  Double-click on the solution file, dirac.sln, in the
  win32/VisualStudio directory.  The target 'Everything' builds the codec
  libraries and utilities. Eight build types are supported:

  Debug - builds unoptimised encoder and decoder dlls with debug symbols
  Release - builds optimised encoder and decoder dlls
  Debug-mmx - builds unoptimised encoder and decoder dlls with debug symbols
              and mmx optimisations enabled.
  Release-mmx - builds optimised encoder and decoder dlls with mmx
              optimisations enabled.
  Static-Debug - builds unoptimised encoder and decoder static libraries
                 with debug symbols
  Static-Release - builds optimised encoder and decoder static libraries
  Static-Debug-mmx - builds unoptimised encoder and decoder static libraries
                     with debug symbols and mmx optimisations enabled.
  Static-Release-mmx - builds optimised encoder and decoder static libraries
                       with mmx optimisations enabled.

  Static libraries are created in the win32/VisualStudio/build/lib/<build-type> directory.

  Encoder and decoder dlls and import libraries, and the encoder and decoder
  apps, are created in the win32/VisualStudio/build/bin/<build-type>
  directory. The "C" public API is exported using the __declspec(dllexport)
  mechanism.

  Conversion utilities are created in the
  win32/VisualStudio/build/utils/conversion/<build-type> directory. Only static
  versions are built.
  The instrumentation utility is created in the
  win32/VisualStudio/build/utils/instrumentation/<build-type> directory. Only
  static versions are built.


  Older editions of Microsoft Visual C++ (e.g. 2003 and 2005)
  -----------------------------------------------------------

  NOTE: Since Visual C++ 2008 Express edition is freely available to
  download, older editions of Visual C++ are no longer supported. Users of
  older editions are advised to upgrade their environment to VC++ 2008.

4. Running the example programs
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

4.1 Command-line parameters

At the moment there is a simple command-line parser class which is
used in all the executables. The general procedure for running a program
is to type:

  prog_name -<flag_name> flag_val ... param1 param2 ...

In other words, options are prefixed by a dash; some options take values,
while others are boolean options that enable specific features. For example,
when running the encoder, the -qf option requires a numeric argument
specifying the "quality factor" for encoding. The -verbose option enables
detailed output and does not require an argument.

Running any program without arguments will display a list of parameters and
options.
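
The convention described above can be sketched in a few lines of Python. This
is an illustration only, not the parser class used by the Dirac executables:
parse_args is a hypothetical helper, and the caller has to say which options
take a value, because the command line alone does not tell a boolean switch
from a value-taking option.

```python
def parse_args(argv, value_flags):
    """Split argv into (options, positional parameters).

    value_flags is the set of option names (without the dash) that consume
    the next token as their value; any other dash-prefixed token is treated
    as a boolean switch.
    """
    options = {}
    positionals = []
    i = 0
    while i < len(argv):
        token = argv[i]
        if token.startswith("-") and len(token) > 1:
            name = token[1:]
            if name in value_flags:
                if i + 1 >= len(argv):
                    raise ValueError("option -%s needs a value" % name)
                options[name] = argv[i + 1]
                i += 2
                continue
            options[name] = True  # boolean switch such as -verbose
        else:
            positionals.append(token)
        i += 1
    return options, positionals
```

For instance, calling parse_args with the tokens "-qf 9 -verbose test.yuv
test_out.drc" and value_flags={"qf"} yields qf="9", verbose=True and the two
file names as positional parameters.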

4.2 File formats

The example coder and decoder use raw 8-bit planar YUV data.  This means that
data is stored bytewise, with a frame of Y followed by a frame of U followed
by a frame of V, all scanned in the usual raster order. The video dimensions,
frame rate and chroma format are passed to the encoder via command-line
arguments.
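
As an illustration of this layout, the following Python sketch computes the
size of each plane and splits the bytes of one frame into Y, U and V. The
helper names are hypothetical, and the sketch assumes that the width and
height are multiples of the chroma subsampling factors; the chroma format
names are those accepted by the encoder's -cformat option.

```python
# Horizontal and vertical chroma subsampling factors for the chroma format
# names accepted by the encoder's -cformat option.
CHROMA_SUBSAMPLING = {"YUV444P": (1, 1), "YUV422P": (2, 1), "YUV420P": (2, 2)}


def plane_sizes(width, height, cformat):
    """Return the byte sizes (Y, U, V) of one 8-bit planar frame."""
    xdiv, ydiv = CHROMA_SUBSAMPLING[cformat]
    # Assumption: width and height are multiples of the subsampling factors.
    chroma = (width // xdiv) * (height // ydiv)
    return width * height, chroma, chroma


def split_frame(frame, width, height, cformat):
    """Split the bytes of one frame into its Y, U and V planes."""
    y_size, u_size, v_size = plane_sizes(width, height, cformat)
    if len(frame) != y_size + u_size + v_size:
        raise ValueError("buffer does not hold exactly one frame")
    y = frame[:y_size]
    u = frame[y_size:y_size + u_size]
    v = frame[y_size + u_size:]
    return y, u, v
```

Each plane returned by split_frame is in raster order, so row r of the Y plane
is y[r * width:(r + 1) * width].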

Other file formats are supported by means of conversion utilities that
may be found in the subdirectory util/conversion. These will convert to
and from raw RGB format, and support all the standard raw YUV formats as
well as bitmaps. Raw RGB can be obtained as an output from standard conversion
utilities such as ImageMagick.

Example.
  Compress an image sequence of 100 frames of 352x288 video in TIFF format.

  Step 1.

  Use your favourite conversion routine to produce a single raw RGB file of
  all the data. If your routine converts frame-by-frame then you will
  need to concatenate the output.

  Step 2.

  Convert from RGB to the YUV format of your choice. For example, to do
  420, type

  RGBtoYUV420 <file.rgb >file.yuv 352 288 100

  Note that this uses stdin and stdout to read and write the data.

  We have provided a script create_test_data.pl to help convert RGB format
  files into all the input formats supported by Dirac. The command-line
  arguments it supports can be listed using

  create_test_data.pl -use

  Sample usage is

  create_test_data.pl -width=352 -height=288 -num_frames=100 file.rgb

  (This assumes that the RGBtoYUV utilities are in a directory listed in the
  PATH variable. If they are not, use the -convutildir option to tell the
  script where to find the conversion utilities.)

  The script then outputs files in all chroma formats (420, 422,
  444) supported by Dirac to the current directory.


  Step 3.

  Run the encoder. This will produce a locally decoded output in the
  same format if the locally decoded output is enabled using the -local flag.

  Step 4.

  Convert back to RGB.

  YUV420toRGB <file.yuv >file.rgb 352 288 100

  Step 5.

  Use your favourite conversion utility to convert to the format of your
  choice.
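
For Step 1, if the conversion routine writes one raw RGB file per frame, the
concatenation could be sketched in Python as below. concatenate_frames is a
hypothetical helper and the per-frame file names in the usage note are
assumptions; any tool that appends the files in frame order does the same job.

```python
import shutil


def concatenate_frames(frame_paths, out_path):
    """Append each per-frame raw RGB file to out_path, in the order given."""
    with open(out_path, "wb") as out:
        for path in frame_paths:
            with open(path, "rb") as src:
                shutil.copyfileobj(src, out)
```

With files named frame_0000.rgb to frame_0099.rgb (hypothetical names), passing
the sorted list of names and "file.rgb" produces the single input file used in
Step 2. Zero-padded frame numbers keep the sorted order equal to the frame
order.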

You can also use the transcode utility to convert data to and from Dirac's
native formats (see http://zebra.fh-weingarten.de/~transcode/):

  This example uses a 720x576x50 DV source, and transcodes to 720x576 YUV in
  4:2:0 chroma format.  Cascading codecs (DV + Dirac) is generally a bad idea
  - use this only if you don't have any other source of uncompressed video.

    transcode -i source.dv -x auto,null --dv_yuy2_mode -k -V -y raw,null -o file.avi
    tcextract -i file.avi -x rgb > file.yuv

Viewing and playback utilities for uncompressed video include MPlayer and
ImageMagick's display command.

  Continuing the 352x288 4:2:0 example above, to display a single frame
  of raw YUV with ImageMagick use the following (use <spacebar> to see
  subsequent frames):

    display -size 352x288 test.yuv

  Raw YUV 420 data can also be played back in MPlayer - use the following
  MPlayer command:

    mplayer -fps 15 -rawvideo on:size=152064:w=352:h=288 test.yuv

  (at the time of writing MPlayer could not play back 4:2:2 or 4:4:4 YUV data)
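
The size=152064 value in the MPlayer command is the number of bytes in one
352x288 4:2:0 frame at 8 bits per sample: 352*288 luma bytes plus two chroma
planes of a quarter that size. A minimal Python sketch of the arithmetic
follows; frame_bytes and whole_frames are hypothetical helper names, and even
frame dimensions are assumed.

```python
def frame_bytes(width, height, cformat="YUV420P"):
    """Number of bytes in one 8-bit planar YUV frame."""
    luma = width * height
    if cformat == "YUV420P":
        chroma = luma // 4   # chroma halved horizontally and vertically
    elif cformat == "YUV422P":
        chroma = luma // 2   # chroma halved horizontally only
    elif cformat == "YUV444P":
        chroma = luma        # chroma at full resolution
    else:
        raise ValueError("unknown chroma format: %s" % cformat)
    return luma + 2 * chroma


def whole_frames(file_size, width, height, cformat="YUV420P"):
    """Number of frames in a raw file of file_size bytes.

    Raises ValueError if the file does not hold a whole number of frames,
    which usually means the dimensions or chroma format are wrong.
    """
    frames, leftover = divmod(file_size, frame_bytes(width, height, cformat))
    if leftover:
        raise ValueError("file size is not a whole number of frames")
    return frames
```

The same check is a cheap way to confirm the source parameters of a raw file
before encoding it: a 100-frame 352x288 4:2:0 file should be exactly 15206400
bytes long.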


4.3 Encoding

The basic encoding syntax is to type

dirac_encoder [options] file_in file_out

This will compress file_in and produce an output file_out of compressed data.

A locally decoded output file_out.local-dec.yuv and instrumentation data
file_out.imt (for debugging the encoder and of interest to developers only)
are also produced if the -local flag is enabled on the command line.

There are a large number of optional parameters that can be used to run the
encoder, all of which are listed below. To encode video, three types of
parameter need to be set:

a) quality factor or target bit rate
b) source parameters (width, height, frame rate, chroma format)
c) encoding parameters (motion compensation block sizes, preferred viewing
   distance)

In practice you don't have to set all these directly because presets can be
used to supply appropriate default values.

a) The most important parameters are the quality factor or target bit rate.

The quality factor is specified by using the option

qf     : Overall quality factor (>0)

The value must be greater than 0; the higher the number, the better
the quality. Typical high quality is 8-10, but it will vary from sequence to
sequence, sometimes higher and sometimes lower.

The target bit rate is set using the option

targetrate : Target bit rate in Kb/s

This will attempt to maintain constant bit rate over the sequence. It works
reasonably well, but actual bit rate, especially over short sequences, may be
slightly different from the target.

Setting -targetrate overrides -qf: constant bit rate (CBR) coding is applied,
although the initial quality will be set by the given qf value. This might help
the CBR algorithm to adapt faster.

Setting -lossless overrides both -qf and -targetrate, and enforces lossless
coding.
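
The precedence between the three options can be summarised in a short Python
sketch. It only restates the rules above and is not taken from the encoder
source; rate_control_mode and the dictionary keys are hypothetical names, and
what the encoder does when none of the options is given is not covered here.

```python
def rate_control_mode(qf=None, targetrate=None, lossless=False):
    """Return which rate control mode the given options select."""
    if lossless:
        # -lossless overrides both -qf and -targetrate
        return {"mode": "lossless"}
    if targetrate is not None:
        # -targetrate overrides -qf, but qf still seeds the initial quality
        return {"mode": "cbr", "targetrate_kbps": targetrate, "initial_qf": qf}
    if qf is not None:
        if qf <= 0:
            raise ValueError("qf must be greater than 0")
        return {"mode": "constant_quality", "qf": qf}
    # None of the three options was given; this README does not say what
    # the encoder falls back to, so the sketch just reports that.
    return {"mode": "unspecified"}
```

For example, rate_control_mode(qf=9, targetrate=1000) selects CBR coding at
1000 Kb/s with an initial quality factor of 9.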

b) Source parameters need to be set as the input is just a raw YUV file and
the encoder doesn't have any information about it.

The best way to set source parameters is to use a preset for
different video formats.

The available preset options are:
QSIF525   : width=176; height=120; 4:2:0 format; 14.98 frames/sec
QCIF      : width=176; height=144; 4:2:0 format; 12.5 frames/sec
SIF525    : width=352; height=240; 4:2:0 format; 14.98 frames/sec
CIF       : width=352; height=288; 4:2:0 format; 12.5 frames/sec
4SIF525   : width=704; height=480; 4:2:0 format; 14.98 frames/sec
4CIF      : width=704; height=576; 4:2:0 format; 12.5 frames/sec
SD480I60  : width=720; height=480; 4:2:2 format; 29.97 frames/sec
SD576I50  : width=720; height=576; 4:2:2 format; 25 frames/sec
HD720P60  : width=1280; height=720; 4:2:2 format; 60 frames/sec
HD720P50  : width=1280; height=720; 4:2:2 format; 50 frames/sec
HD1080I60 : width=1920; height=1080; 4:2:2 format; 29.97 frames/sec
HD1080I50 : width=1920; height=1080; 4:2:2 format; 25 frames/sec
HD1080P60 : width=1920; height=1080; 4:2:2 format; 59.94 frames/sec
HD1080P50 : width=1920; height=1080; 4:2:2 format; 50 frames/sec
DC2K24    : width=2048; height=1080; 4:2:2 format; 24 frames/sec
DC4K24    : width=4096; height=2160; 4:2:2 format; 24 frames/sec
UHDTV4K60 : width=3840; height=2160; 4:2:2 format; 59.94 frames/sec
UHDTV4K50 : width=3840; height=2160; 4:2:2 format; 50 frames/sec
UHDTV8K60 : width=7680; height=4320; 4:2:2 format; 59.94 frames/sec
UHDTV8K50 : width=7680; height=4320; 4:2:2 format; 50 frames/sec

The default format used is CUSTOM format which has the following preset values
width=640; height=480; 4:2:0 format; 23.97 frames/sec.

If your video is not one of these formats, you should pick the nearest preset
and override the parameters that are different.

Example 1. Simple coding example. Code a 720x576 sequence in planar 420 format
to high quality.

Solution.

  dirac_encoder -cformat YUV420P -SD576I50 -qf 9 test.yuv test_out.drc

Example 2. Code a 720x486 sequence at 29.97 frames/sec in 422 format to
medium quality.

Solution.

  dirac_encoder -SD576I50 -width 720 -height 486 -fr 29.97 -cformat YUV422P -qf 5.5 test.yuv test_out.drc
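
The "pick a preset and override what differs" rule can be sketched as a small
Python helper that builds the encoder command line. This is an illustration
under stated assumptions: encoder_command is a hypothetical name, only four of
the presets listed above are included, and the preset is placed before the
overrides as in Example 2 (whether option order matters to the encoder is not
stated here).

```python
from fractions import Fraction

# A few of the presets listed above: (width, height, chroma format, frame rate)
PRESETS = {
    "CIF":      (352, 288, "YUV420P", "12.5"),
    "SD480I60": (720, 480, "YUV422P", "29.97"),
    "SD576I50": (720, 576, "YUV422P", "25"),
    "HD720P50": (1280, 720, "YUV422P", "50"),
}


def encoder_command(preset, width, height, cformat, fr, qf, file_in, file_out):
    """Build a dirac_encoder argument list, overriding only what differs."""
    p_width, p_height, p_cformat, p_fr = PRESETS[preset]
    cmd = ["dirac_encoder", "-" + preset]
    if width != p_width:
        cmd += ["-width", str(width)]
    if height != p_height:
        cmd += ["-height", str(height)]
    if Fraction(fr) != Fraction(p_fr):
        cmd += ["-fr", fr]
    if cformat != p_cformat:
        cmd += ["-cformat", cformat]
    cmd += ["-qf", str(qf), file_in, file_out]
    return cmd
```

For the video in Example 2 the helper emits only -height 486 and -fr 29.97 as
overrides, because the width and chroma format already match SD576I50. Passing
them explicitly, as Example 2 does, is equally valid.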

Source parameters that affect coding are:

width           : Width of video frame
height          : Height of video frame
cformat         : Chroma sampling format. Acceptable values are
                  YUV444P, YUV422P and YUV420P.
fr              : Frame rate. Can be a decimal number or a fraction. Examples
                  of acceptable values are 25, 29.97, 12.5, 30000/1001.
source_sampling : Source material type: 0 (progressive) or 1 (interlaced)

For a complete list of source parameters, refer to Annex C of the Dirac
Specification.
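
Both forms of the -fr value can be read as exact rationals. The Python sketch
below uses the standard fractions module to do so; it illustrates the two
notations and is not the encoder's own parsing code, and parse_frame_rate is a
hypothetical name.

```python
from fractions import Fraction


def parse_frame_rate(text):
    """Parse a frame rate given as a decimal ("29.97") or fraction ("30000/1001")."""
    rate = Fraction(text)
    if rate <= 0:
        raise ValueError("frame rate must be positive")
    return rate
```

Note that the two spellings are not interchangeable: "29.97" parses to exactly
2997/100, while "30000/1001" is slightly larger (about 29.97003).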

WARNING!! If you use a preset but don't override source parameters that
are different, then Dirac will still compress, but the bit rate will be
much, much higher and there may well be serious artefacts. The encoder prints
out the parameters it's actually using before starting encoding (in verbose
mode only), so that you can abort at this point.

c) The presets ALSO set encoding parameters. That's why it's a very good idea
to use presets, as the encoding parameters are a bit obscure. They're still
supported for those who want to experiment, but use with care.

Encoding parameters are:

L1_sep        : the separation between L1 frames (frames that are predicted but
                also used as reference frames, like P frames in MPEG-2)
num_L1        : the number of L1 frames before the next intra frame
xblen         : the width of blocks used for motion compensation
yblen         : the height of blocks used for motion compensation
xbsep         : the horizontal separation between blocks. Always < xblen
ybsep         : the vertical separation between blocks. Always < yblen
cpd           : normalised viewing distance parameter, in cycles per degree.
iwlt_filter   : transform filter to use when encoding INTRA frames. Valid
                values are DD9_7, LEGALL5_3, DD13_7, HAAR0, HAAR1, FIDELITY,
                DAUB97. Default value is DD13_7.
rwlt_filter   : transform filter to use when encoding INTER frames. Valid
                values are DD9_7, LEGALL5_3, DD13_7, HAAR0, HAAR1, FIDELITY,
                DAUB97. Default value is DD13_7.
wlt_depth     : transform depth, i.e. the number of times the component is
                split while applying the wavelet transform
no_spartition : Do not split a subband into coefficient blocks before
                entropy coding
multi_quants  : If subbands are split into multiple coefficient blocks before
                entropy coding, assign different quantisers to each block
                within the subband.
prefilter     : Prefilter to apply to input video before encoding. The name of
                the filter to be used and the filter strength have to be
                supplied. Valid filter names are NO_PF, CWM, RECTLP and
                DIAGLP. Filter strength should be in the range 0-10.
                (Note: PSNR statistics will be computed relative to the
                filtered video if -local is enabled.)
lossless      : Lossless coding (overrides -qf and -targetrate)
mv_prec       : Motion vector precision. Valid values are 1 (pixel precision),
                1/2 (half-pixel precision), 1/4 (quarter-pixel precision, which
                is the default) and 1/8 (eighth-pixel precision).
full_search   : Use full search motion estimation
combined_me   : Use combination of all three components to do motion estimation
field_coding  : Code the input video as fields instead of frames.
                Default coding is frames.
use_vlc       : Use VLC for entropy coding of coefficients instead of
                arithmetic coding.

Modifying L1_sep and num_L1 allows for new GOP structures to be used, and
should be entirely safe. There are two non-GOP modes that can also be used for
encoding: setting num_L1=0 gives I-frame only coding, and setting L1_sep to
1 will do IP-only coding (no B-pictures). P-only coding isn't possible, but
setting num_L1 to a very large value with L1_sep=1 will approximate it.
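
The effect of L1_sep and num_L1 on the picture sequence can be sketched in
Python as follows. This is an illustration only: it assumes a GOP of
L1_sep*(num_L1+1) pictures in display order, starting with an intra picture,
which is an inference from the parameter descriptions and is not stated in
this README. frame_types is a hypothetical helper, and the letters P and B
stand for L1 frames and non-reference frames respectively.

```python
def frame_types(num_frames, l1_sep, num_l1):
    """Return a string of picture types (I, P, B) in display order."""
    if num_l1 == 0:
        # I-frame only coding
        return "I" * num_frames
    if l1_sep < 1:
        raise ValueError("L1_sep must be at least 1 when num_L1 > 0")
    # Assumed GOP length: one intra frame plus num_l1 L1 frames, l1_sep apart
    gop_len = l1_sep * (num_l1 + 1)
    types = []
    for n in range(num_frames):
        pos = n % gop_len
        if pos == 0:
            types.append("I")   # intra frame starts each GOP
        elif pos % l1_sep == 0:
            types.append("P")   # L1 frame: predicted and used as reference
        else:
            types.append("B")   # frames between the reference frames
    return "".join(types)
```

Under these assumptions, L1_sep=3 with num_L1=1 repeats the pattern IBBPBB,
L1_sep=1 gives IP-only sequences, and num_L1=0 gives intra frames only.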

Modifying the block parameters is strongly deprecated: it's likely to break
the encoder as there are many constraints. Modifying cpd will not break
anything, but will change the way noise is distributed which may be more (or
less) suitable for your application. Setting cpd equal to zero turns off
perceptual weighting altogether.

For more information, see the algorithm documentation on the website:
http://diracvideo.org/wiki/index.php/Dirac_Algorithm

Other options. The encoder also supports some other options, which are

verbose   : turn on verbosity (if you don't, you won't see the final bitrate!)
start     : code from this frame number
stop      : code up until this frame number
local     : Generate diagnostics and locally decoded output (to avoid running a
            decoder to see your video)

Using -start and -stop allows a small section to be coded, rather than the
whole thing.

If the -local flag is present in the command line, the encoder produces
diagnostic information about motion vectors that can be used to debug the
encoder algorithm. It also produces a locally decoded picture so that you
don't have to run the decoder to see what the pictures are like.

4.4 Decoding

Decoding is much simpler. Just point the decoder input at the bitstream and the
output to a file:

  dirac_decoder -verbose test_enc test_dec

will decode test_enc into test_dec with running commentary.