README for the Dirac video codec
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

by the BBC R&D Dirac team (diracinfo@rd.bbc.co.uk)


1. Executive Summary
~~~~~~~~~~~~~~~~~~~~

Dirac is an open source video codec. It uses a traditional hybrid video codec
architecture, but with the wavelet transform instead of the usual block
transforms. Motion compensation uses overlapped blocks to reduce block
artefacts that would upset the transform coding stage.

Dirac can code just about any size of video, from streaming up to HD and
beyond, although certain presets are defined for different applications and
standards. These cover the parameters that need to be set for the encoder to
work, such as block sizes and temporal prediction structures, which must
otherwise be set by hand.

Dirac is intended to develop into real coding and decoding software, capable
of plugging into video processing applications and media players that need
compression. It is intended to develop into a simple set of reliable but
effective coding tools that work over a wide variety of content and formats,
using well-understood compression techniques, in a clear and accessible
software structure. It is not intended as a demonstration or reference coder.


2. Documentation
~~~~~~~~~~~~~~~~

Documentation can be found at
http://diracvideo.org/wiki/index.php/Main_Page#Documentation

3. Building and installing
~~~~~~~~~~~~~~~~~~~~~~~~~~

  GNU/Linux, Unix, MacOS X, Cygwin, Mingw
  ---------------------------------------
    ./configure --enable-debug
        (to enable extra debug compile options)
     OR
    ./configure --enable-profile
        (to enable the g++ profiling flag -pg)
     OR
    ./configure --disable-mmx
        (to disable MMX optimisation which is enabled by default)
     OR
    ./configure --enable-debug --enable-profile
        (to enable extra debug compile options and profiling options)
     OR
    ./configure

    By default, both shared and static libraries are built. To build
    static libraries only, use
    ./configure --disable-shared

    To build shared libraries only, use
    ./configure --disable-static

    make
    make install

  The INSTALL file documents arguments to ./configure such as
  --prefix=/usr/local (specify the installation location prefix).


  MSYS and Microsoft Visual C++
  -----------------------------
    Download and install the no-cost Microsoft Visual C++ 2008 Express
    Edition from
    http://msdn.microsoft.com/vstudio/express/visualc/

    Download and install MSYS (the MinGW Minimal SYStem), MSYS-1.0.10.exe,
    from http://www.mingw.org/download.shtml. An MSYS icon will be available
    on the desktop.

    Click on the MSYS icon on the desktop to open an MSYS shell window.

    Create a .profile file to set up the environment variables required.
      vi .profile

    Include the following four lines in the .profile file.

      PATH=/c/Program\ Files/Microsoft\ Visual\ Studio\ 9.0/Common7/IDE:/c/Program\ Files/Microsoft\ Visual\ Studio\ 9.0/VC/BIN:/c/Program\ Files/Microsoft\ Visual\ Studio\ 9.0/Common7/Tools:/c/WINDOWS/Microsoft.NET/Framework/v3.5:/c/WINDOWS/Microsoft.NET/Framework/v2.0.50727:/c/Program\ Files/Microsoft\ Visual\ Studio\ 9.0/VC/VCPackages:$PATH

      INCLUDE=/c/Program\ Files/Microsoft\ Visual\ Studio\ 9.0/VC/INCLUDE:$INCLUDE
      LIB=/c/Program\ Files/Microsoft\ Visual\ Studio\ 9.0/VC/LIB:$LIB

      LIBPATH=/c/WINDOWS/Microsoft.NET/Framework/v3.5:/c/WINDOWS/Microsoft.NET/Framework/v2.0.50727:/c/Program\ Files/Microsoft\ Visual\ Studio\ 9.0/VC/LIB:$LIBPATH

    (Replace /c/Program\ Files/Microsoft\ Visual\ Studio\ 9.0/ with
     the location where VC++ 2008 is installed if necessary)

    Exit from the MSYS shell and click on the MSYS icon on the desktop to open
    a new MSYS shell window for the .profile to take effect.

    Change directory to the directory where Dirac was unpacked. By default
    only the dynamic libraries are built.

    ./configure CXX=cl LD=cl --enable-debug
        (to enable extra debug compile options)
     OR
    ./configure CXX=cl LD=cl --disable-shared
        (to build static libraries)
     OR
    ./configure CXX=cl LD=cl
    make
    make install

  The INSTALL file documents arguments to ./configure such as
  --prefix=/usr/local (specify the installation location prefix).

  Microsoft Visual C++ .NET 2008
  ------------------------------
  Download and install the no-cost Microsoft Visual C++ 2008 Express
  Edition from
  http://www.microsoft.com/express/download/

  The MS VC++ 2008 solution and project files are in the win32/VisualStudio
  directory. Double-click on the solution file, dirac.sln, in the
  win32/VisualStudio directory. The target 'Everything' builds the codec
  libraries and utilities.
  Eight build types are supported:

  Debug              - builds unoptimised encoder and decoder dlls with debug
                       symbols
  Release            - builds optimised encoder and decoder dlls
  Debug-mmx          - builds unoptimised encoder and decoder dlls with debug
                       symbols and mmx optimisations enabled
  Release-mmx        - builds optimised encoder and decoder dlls with mmx
                       optimisations enabled
  Static-Debug       - builds unoptimised encoder and decoder static libraries
                       with debug symbols
  Static-Release     - builds optimised encoder and decoder static libraries
  Static-Debug-mmx   - builds unoptimised encoder and decoder static libraries
                       with debug symbols and mmx optimisations enabled
  Static-Release-mmx - builds optimised encoder and decoder static libraries
                       with mmx optimisations enabled

  Static libraries are created in the win32/VisualStudio/build/lib/<build-type>
  directory.

  Encoder and decoder dlls and import libraries, and the encoder and decoder
  apps, are created in the win32/VisualStudio/build/bin/<build-type> directory.
  The "C" public API is exported using the __declspec(dllexport) mechanism.

  Conversion utilities are created in the
  win32/VisualStudio/build/utils/conversion/<build-type> directory. Only static
  versions are built.
  The instrumentation utility is created in the
  win32/VisualStudio/build/utils/instrumentation/<build-type> directory. Only
  static versions are built.


  Older editions of Microsoft Visual C++ (e.g. 2003 and 2005)
  -----------------------------------------------------------

  NOTE: Since Visual C++ 2008 Express Edition is freely available to
  download, older editions of Visual C++ are no longer supported. Users are
  advised to upgrade their VC++ environment to VC++ 2008.

4. Running the example programs
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

4.1 Command-line parameters

At the moment there is a simple command-line parser class which is
used in all the executables. The general procedure for running a program
is to type:

  prog_name -<flag_name> flag_val ... param1 param2 ...

In other words, options are prefixed by a dash; some options take values,
while others are boolean options that enable specific features. For example,
when running the encoder, the -qf option requires a numeric argument
specifying the "quality factor" for encoding. The -verbose option enables
detailed output and does not require an argument.

Running any program without arguments will display a list of parameters and
options.

4.2 File formats

The example coder and decoder use raw 8-bit planar YUV data. This means that
data is stored bytewise, with a frame of Y followed by a frame of U followed
by a frame of V, all scanned in the usual raster order. The video dimensions,
frame rate and chroma format are passed to the encoder via command line
arguments.

Other file formats are supported by means of conversion utilities that
may be found in the subdirectory util/conversion. These will convert to
and from raw RGB format, and support all the standard raw YUV formats as
well as bitmaps. Raw RGB can be obtained as an output from standard conversion
utilities such as ImageMagick.

Example.
  Compress an image sequence of 100 frames of 352x288 video in TIFF format.

  Step 1.

  Use your favourite conversion routine to produce a single raw RGB file of
  all the data. If your routine converts frame-by-frame then you will
  need to concatenate the output.

  Step 2.

  Convert from RGB to the YUV format of your choice.
  For example, to produce 4:2:0 output, type

  RGBtoYUV420 <file.rgb >file.yuv 352 288 100

  Note that this uses stdin and stdout to read and write the data.

  We have provided a script, create_test_data.pl, to help convert RGB format
  files into all the input formats supported by Dirac. The command line
  arguments it supports can be listed using

  create_test_data.pl -use

  Sample usage is

  create_test_data.pl -width=352 -height=288 -num_frames=100 file.rgb

  (This assumes that the RGBtoYUV utilities are in a directory listed in the
  PATH variable. If they are not, use the -convutildir option to tell the
  script where to find the conversion utilities.)

  The script then outputs files in all chroma formats (420, 422,
  444) supported by Dirac to the current directory.


  Step 3.

  Run the encoder. If the -local flag is given, this will also produce a
  locally decoded output in the same format.

  Step 4.

  Convert back to RGB.

  YUV420toRGB <file.yuv >file.rgb 352 288 100

  Step 5.

  Use your favourite conversion utility to convert to the format of your
  choice.

You can also use the transcode utility to convert data to and from Dirac's
native formats (see http://zebra.fh-weingarten.de/~transcode/):

  This example uses a 720x576x50 DV source, and transcodes to 720x576 YUV in
  4:2:0 chroma format. Cascading codecs (DV + Dirac) is generally a bad idea
  - use this only if you don't have any other source of uncompressed video.

    transcode -i source.dv -x auto,null --dv_yuy2_mode -k -V -y raw,null -o file.avi
    tcextract -i file.avi -x rgb > file.yuv

Viewing and playback utilities for uncompressed video include MPlayer and
ImageMagick's display command.

  Continuing the 352x288 4:2:0 example above, to display a single frame
  of raw YUV with ImageMagick use the following (use <spacebar> to see
  subsequent frames):

    display -size 352x288 test.yuv

  Raw YUV 420 data can also be played back in MPlayer - use the following
  MPlayer command:

    mplayer -fps 15 -rawvideo on:size=152064:w=352:h=288 test.yuv

  (at the time of writing MPlayer could not play back 4:2:2 or 4:4:4 YUV data)


4.3 Encoding

The basic encoding syntax is to type

dirac_encoder [options] file_in file_out

This will compress file_in and produce an output file_out of compressed data.

A locally decoded output file_out.local-dec.yuv and instrumentation data
file_out.imt (for debugging the encoder and of interest to developers only)
are also produced if the -local flag is enabled on the command line.

There are a large number of optional parameters that can be used to run the
encoder, all of which are listed below. To encode video, three types of
parameter need to be set:

a) quality factor or target bit rate
b) source parameters (width, height, frame rate, chroma format)
c) encoding parameters (motion compensation block sizes, preferred viewing
   distance)

In practice you don't have to set all these directly because presets supply
appropriate default values.

a) The most important parameters are the quality factor or target bit rate.

The quality factor is specified by using the option

qf : Overall quality factor (>0)

This value must be greater than 0; the higher the number, the better
the quality. Typical high quality is 8-10, but it will vary from sequence to
sequence, sometimes higher and sometimes lower.

The target bit rate is set using the option

targetrate : Target bit rate in Kb/s

This will attempt to maintain constant bit rate over the sequence.
It works reasonably well, but the actual bit rate, especially over short
sequences, may be slightly different from the target.

Setting -targetrate overrides -qf, in that CBR will still be applied, although
the initial quality will be set by the given qf value. This might help the CBR
algorithm to adapt faster.

Setting -lossless overrides both -qf and -targetrate, and enforces lossless
coding.

b) Source parameters need to be set as the input is just a raw YUV file and
the encoder doesn't have any information about it.

The best way to set source parameters is to use a preset for
different video formats.

The available preset options are:
QSIF525   : width=176;  height=120;  4:2:0 format; 14.98 frames/sec
QCIF      : width=176;  height=144;  4:2:0 format; 12.5 frames/sec
SIF525    : width=352;  height=240;  4:2:0 format; 14.98 frames/sec
CIF       : width=352;  height=288;  4:2:0 format; 12.5 frames/sec
4SIF525   : width=704;  height=480;  4:2:0 format; 14.98 frames/sec
4CIF      : width=704;  height=576;  4:2:0 format; 12.5 frames/sec
SD480I60  : width=720;  height=480;  4:2:2 format; 29.97 frames/sec
SD576I50  : width=720;  height=576;  4:2:2 format; 25 frames/sec
HD720P60  : width=1280; height=720;  4:2:2 format; 60 frames/sec
HD720P50  : width=1280; height=720;  4:2:2 format; 50 frames/sec
HD1080I60 : width=1920; height=1080; 4:2:2 format; 29.97 frames/sec
HD1080I50 : width=1920; height=1080; 4:2:2 format; 25 frames/sec
HD1080P60 : width=1920; height=1080; 4:2:2 format; 59.94 frames/sec
HD1080P50 : width=1920; height=1080; 4:2:2 format; 50 frames/sec
DC2K24    : width=2048; height=1080; 4:2:2 format; 24 frames/sec
DC4K24    : width=4096; height=2160; 4:2:2 format; 24 frames/sec
UHDTV4K60 : width=3840; height=2160; 4:2:2 format; 59.94 frames/sec
UHDTV4K50 : width=3840; height=2160; 4:2:2 format; 50 frames/sec
UHDTV8K60 : width=7680; height=4320; 4:2:2 format; 59.94 frames/sec
UHDTV8K50 : width=7680; height=4320; 4:2:2 format; 50 frames/sec

The default format is CUSTOM, which has the following preset values:
width=640; height=480; 4:2:0 format; 23.97 frames/sec.

If your video is not one of these formats, you should pick the nearest preset
and override the parameters that are different.

Example 1. Simple coding example. Code a 720x576 sequence in planar 420
format to high quality.

Solution.

  dirac_encoder -cformat YUV420P -SD576I50 -qf 9 test.yuv test_out.drc

Example 2. Code a 720x486 sequence at 29.97 frames/sec in 422 format to
medium quality.

Solution.

  dirac_encoder -SD576I50 -width 720 -height 486 -fr 29.97 -cformat YUV422P -qf 5.5 test.yuv test_out.drc

Source parameters that affect coding are:

width           : Width of video frame
height          : Height of video frame
cformat         : Chroma sampling format. Acceptable values are
                  YUV444P, YUV422P and YUV420P.
fr              : Frame rate. Can be a decimal number or a fraction. Examples
                  of acceptable values are 25, 29.97, 12.5, 30000/1001.
source_sampling : Source material type: 0 (progressive) or 1 (interlaced)

For a complete list of source parameters, refer to Annex C of the Dirac
Specification.

WARNING!! If you use a preset but don't override source parameters that
are different, then Dirac will still compress, but the bit rate will be
much, much higher and there may well be serious artefacts. The encoder prints
out the parameters it's actually using before starting encoding (in verbose
mode only), so that you can abort at this point.

c) The presets ALSO set encoding parameters. That's why it's a very good idea
to use presets, as the encoding parameters are a bit obscure. They're still
supported for those who want to experiment, but use with care.
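
The preset-plus-override pattern of Example 2 can be sketched in Python as
below. This is only an illustration: the helper name encoder_command is
hypothetical and not part of Dirac, and it does nothing more than assemble an
argument list from options documented above (the preset flag, -width, -height,
-fr, -cformat and -qf). Whether the command-line parser cares about option
order is not stated here, so the sketch simply mirrors the order used in
Example 2.

```python
def encoder_command(preset, infile, outfile, qf=None, **overrides):
    """Build a dirac_encoder argument list (hypothetical helper).

    preset    -- preset name from the table above, e.g. "SD576I50"
    overrides -- source parameters that differ from the preset,
                 e.g. width=720, height=486, fr="29.97"
    """
    args = ["dirac_encoder", "-" + preset]
    # Overrides are emitted in the order given (dicts keep insertion order).
    for name, value in overrides.items():
        args += ["-" + name, str(value)]
    if qf is not None:
        args += ["-qf", str(qf)]
    # Input and output files come last, as in "dirac_encoder [options] in out".
    args += [infile, outfile]
    return args


# Reproduces the command line of Example 2.
example = encoder_command(
    "SD576I50", "test.yuv", "test_out.drc", qf=5.5,
    width=720, height=486, fr="29.97", cformat="YUV422P",
)
print(" ".join(example))
```

The resulting list could be handed to subprocess.run(example, check=True),
assuming dirac_encoder is on the PATH.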

Encoding parameters are:

L1_sep        : the separation between L1 frames (frames that are predicted
                but also used as reference frames, like P frames in MPEG-2)
num_L1        : the number of L1 frames before the next intra frame
xblen         : the width of blocks used for motion compensation
yblen         : the height of blocks used for motion compensation
xbsep         : the horizontal separation between blocks. Always <xblen
ybsep         : the vertical separation between blocks. Always <yblen
cpd           : normalised viewing distance parameter, in cycles per degree.
iwlt_filter   : transform filter to use when encoding INTRA frames. Valid
                values are DD9_7, LEGALL5_3, DD13_7, HAAR0, HAAR1, FIDELITY,
                DAUB97. Default value is DD13_7.
rwlt_filter   : transform filter to use when encoding INTER frames. Valid
                values are DD9_7, LEGALL5_3, DD13_7, HAAR0, HAAR1, FIDELITY,
                DAUB97. Default value is DD13_7.
wlt_depth     : transform depth, i.e. the number of times the component is
                split while applying the wavelet transform
no_spartition : Do not split a subband into coefficient blocks before
                entropy coding
multi_quants  : If subbands are split into multiple coefficient blocks before
                entropy coding, assign different quantisers to each block
                within the subband.
prefilter     : Prefilter to apply to input video before encoding. The name of
                the filter to be used and the filter strength have to be
                supplied. Valid filter names are NO_PF, CWM, RECTLP and
                DIAGLP. Filter strength should be in the range 0-10.
                (Note: PSNR statistics will be computed relative to the
                filtered video if -local is enabled.)
lossless      : Lossless coding (overrides -qf and -targetrate)
mv_prec       : Motion vector precision. Valid values are 1 (pixel precision),
                1/2 (half-pixel precision), 1/4 (quarter-pixel precision,
                which is the default), 1/8 (eighth-pixel precision).
full_search   : Use full search motion estimation
combined_me   : Use combination of all three components to do motion estimation
field_coding  : Code the input video as fields instead of frames.
                Default coding is frames.
use_vlc       : Use VLC for entropy coding of coefficients instead of
                arithmetic coding.

Modifying L1_sep and num_L1 allows for new GOP structures to be used, and
should be entirely safe. There are two non-GOP modes that can also be used for
encoding: setting num_L1=0 gives I-frame only coding, and setting L1_sep to
1 will do IP-only coding (no B-pictures). P-only coding isn't possible, but
setting num_L1 to a very large value with L1_sep=1 will approximate it.

Modifying the block parameters is strongly deprecated: it's likely to break
the encoder as there are many constraints. Modifying cpd will not break
anything, but will change the way noise is distributed, which may be more (or
less) suitable for your application. Setting cpd to zero turns off
perceptual weighting altogether.

For more information, see the algorithm documentation on the website:
http://diracvideo.org/wiki/index.php/Dirac_Algorithm

Other options. The encoder also supports some other options, which are

verbose : turn on verbosity (if you don't, you won't see the final bitrate!)
start   : code from this frame number
stop    : code up until this frame number
local   : Generate diagnostics and locally decoded output (to avoid running a
          decoder to see your video)

Using -start and -stop allows a small section to be coded, rather than the
whole thing.

If the -local flag is present in the command line, the encoder produces
diagnostic information about motion vectors that can be used to debug the
encoder algorithm. It also produces a locally decoded picture so that you
don't have to run the decoder to see what the pictures are like.

4.4 Decoding

Decoding is much simpler.
Just point the decoder input at the bitstream and the
output to a file:

  dirac_decoder -verbose test_enc test_dec

will decode test_enc into test_dec with running commentary.
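
Since the example decoder writes raw 8-bit planar YUV (see section 4.2), one
quick sanity check on the decoded file is to compare its size with the
expected number of bytes per frame. A minimal Python sketch follows. The
function names are hypothetical, and rounding the chroma dimensions up for odd
picture sizes is an assumption rather than something this README specifies.

```python
# Horizontal and vertical chroma subsampling factors for each cformat value.
CHROMA_DIVISORS = {
    "YUV420P": (2, 2),  # chroma halved in both directions
    "YUV422P": (2, 1),  # chroma halved horizontally only
    "YUV444P": (1, 1),  # chroma at full resolution
}


def frame_bytes(width, height, cformat):
    """Bytes in one raw 8-bit planar frame: a Y plane, then U, then V."""
    xdiv, ydiv = CHROMA_DIVISORS[cformat]
    # Assumption: odd dimensions round the chroma plane size up.
    chroma_w = (width + xdiv - 1) // xdiv
    chroma_h = (height + ydiv - 1) // ydiv
    return width * height + 2 * chroma_w * chroma_h


def frame_count(file_size, width, height, cformat):
    """Whole frames in a file of file_size bytes; rejects partial frames."""
    size = frame_bytes(width, height, cformat)
    if file_size % size != 0:
        raise ValueError("file size is not a whole number of frames")
    return file_size // size


# 352x288 4:2:0 gives the size=152064 value used in the MPlayer command.
print(frame_bytes(352, 288, "YUV420P"))
```

For a decoded file this could be called as
frame_count(os.path.getsize("test_dec"), 352, 288, "YUV420P"), with the
dimensions and chroma format taken from whatever was used for encoding.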