dirac/dirac-1.0.2/README

README for the Dirac video codec
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

by the BBC R&D Dirac team (diracinfo@rd.bbc.co.uk)


1. Executive Summary
~~~~~~~~~~~~~~~~~~~~

Dirac is an open source video codec. It uses a traditional hybrid video codec
architecture, but with the wavelet transform instead of the usual block
transforms.  Motion compensation uses overlapped blocks to reduce block
artefacts that would upset the transform coding stage.

Dirac can code just about any size of video, from streaming up to HD and
beyond, although certain presets are defined for different applications and
standards.  These cover the parameters that need to be set for the encoder to
work, such as block sizes and temporal prediction structures, which must
otherwise be set by hand.

Dirac is intended to develop into real coding and decoding software, capable
of plugging into video processing applications and media players that need
compression. It is intended to develop into a simple set of reliable but
effective coding tools that work over a wide variety of content and formats,
using well-understood compression techniques, in a clear and accessible
software structure. It is not intended as a demonstration or reference coder.


2. Documentation
~~~~~~~~~~~~~~~~

Documentation can be found at
http://diracvideo.org/wiki/index.php/Main_Page#Documentation

3. Building and installing
~~~~~~~~~~~~~~~~~~~~~~~~~~

  GNU/Linux, Unix, MacOS X, Cygwin, Mingw
  ---------------------------------------
    ./configure --enable-debug
        (to enable extra debug compile options)
     OR
    ./configure --enable-profile
        (to enable the g++ profiling flag -pg)
     OR
    ./configure --disable-mmx
        (to disable MMX optimisation which is enabled by default)
     OR
    ./configure --enable-debug --enable-profile
        (to enable extra debug compile options and profiling options)
     OR
     ./configure

     By default, both shared and static libraries are built. To build all-static
     libraries use
     ./configure --disable-shared

     To build shared libraries only use
     ./configure --disable-static

     make
     make install

  The INSTALL file documents arguments to ./configure such as
  --prefix=/usr/local (specify the installation location prefix).


  MSYS and Microsoft Visual C++
  -----------------------------
     Download and install the no-cost Microsoft Visual C++ 2008 Express
     Edition  from
     http://msdn.microsoft.com/vstudio/express/visualc/

     Download and install MSYS (the MinGW Minimal SYStem), MSYS-1.0.10.exe,
     from http://www.mingw.org/download.shtml. An MSYS icon will be available
     on the desktop.

     Click on the MSYS icon on the desktop to open a MSYS shell window.

     Create a .profile file to set up the environment variables required.
     vi .profile

     Include the following four lines in the .profile file.

     PATH=/c/Program\ Files/Microsoft\ Visual\ Studio\ 9.0/Common7/IDE:/c/Program\ Files/Microsoft\ Visual\ Studio\ 9.0/VC/BIN:/c/Program\ Files/Microsoft\ Visual\ Studio\ 9.0/Common7/Tools:/c/WINDOWS/Microsoft.NET/Framework/v3.5:/c/WINDOWS/Microsoft.NET/Framework/v2.0.50727:/c/Program\ Files/Microsoft\ Visual\ Studio\ 9.0/VC/VCPackages:$PATH

     INCLUDE=/c/Program\ Files/Microsoft\ Visual\ Studio\ 9.0/VC/INCLUDE:$INCLUDE
     LIB=/c/Program\ Files/Microsoft\ Visual\ Studio\ 9.0/VC/LIB:$LIB

     LIBPATH=/c/WINDOWS/Microsoft.NET/Framework/v3.5:/c/WINDOWS/Microsoft.NET/Framework/v2.0.50727:/c/Program\ Files/Microsoft\ Visual\ Studio\ 9.0/VC/LIB:$LIBPATH

    (Replace /c/Program\ Files/Microsoft\ Visual\ Studio\ 9/ with
    the location where VC++ 2008 is installed if necessary)

     Exit from the MSYS shell and click on the MSYS icon on the desktop to open
     a new MSYS shell window for the .profile to take effect.

     Change directory to the directory where Dirac was unpacked. By default
     only the dynamic libraries are built.

     ./configure CXX=cl LD=cl --enable-debug
         (to enable extra debug compile options)
     OR
     ./configure CXX=cl LD=cl --disable-shared
         (to build static libraries)
     OR
     ./configure CXX=cl LD=cl
     make
     make install

     The INSTALL file documents arguments to ./configure such as
     --prefix=/usr/local (specify the installation location prefix).

  Microsoft Visual C++ .NET 2008
  ------------------------------
  Download and install the no-cost Microsoft Visual C++ 2008 Express
  Edition  from
  http://www.microsoft.com/express/download/

  The MS VC++ 2008 solution and project files are in win32/VisualStudio
  directory.  Double-click on the solution file, dirac.sln, in the
  win32/VisualStudio directory.  The target 'Everything' builds the codec
  libraries and utilities. Four build-types are supported

  Debug - builds unoptimised encoder and decoder dlls with debug symbols
  Release - builds optimised encoder and decoder dlls
  Debug-mmx - builds unoptimised encoder and decoder dlls with debug symbols
              and mmx optimisations enabled.
  Release-mmx - builds optimised encoder and decoder dlls  with mmx
              optimisations enabled.
  Static-Debug - builds unoptimised encoder and decoder static libraries
                 with debug symbols
  Static-Release - builds optimised encoder and decoder static libraries
  Static-Debug-mmx - builds unoptimised encoder and decoder static libraries
                     with debug symbols and mmx optmisations enabled.
  Static-Release-mmx - builds optimised encoder and decoder static libraries
                       with mmx optmisations enabled.

  Static libraries are created in the win32/VisualStudio/build/lib/<build-type> directory.

  Encoder and Decoder dlls and import libraries, encoder and decoder apps are
  created in the win32/VisualStudio/build/bin/<build-type> directory. The "C"
  public API is exported using the _declspec(dllexport) mechanism.

  Conversion utilites are created in the
  win32/VisualStudio/build/utils/conversion/<build-type> directory. Only static
  versions are  built.
  Instrumentation utility is created in the
  win32/VisualStudio/build/utils/instrumentation/<build-type> directory. Only
  static versions are built.


  Older editions of Microsoft Visual C++  (e.g. 2003 and 2005)
  -----------------------------------------------------------

  NOTE: Since Visual C++ 2008 Express edition is freely available to
  download, older versions of the Visual C++ editions are no longer
  supported. So it is suggested that the users upgrade their VC++ environment
  to VC++ 2008.

4. Running the example programs
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

4.1 Command-line parameters

At the moment there is a simple command-line parser class which is
used in all the executables. The general procedure for running a program
is to type:

  prog_name -<flag_name> flag_val ... param1 param2 ...

In other words, options are prefixed by a dash; some options take values,
while others are boolean options that enable specific features. For example:
When running the encoder, the -qf options requires a numeric argument
specifying the "quality factor" for encoding. The -verbose option enables
detailed output and does not require an argument.

Running any program without arguments will display a list of parameters and
options.

4.2 File formats

The example coder and decoder use raw 8-bit planar YUV data.  This means that
data is stored bytewise, with a frame of Y followed by a frame of U followed
by a frame of V, all scanned in the usual raster order. The video dimensions
, frame rate and chroma are passed to the encoder via command line arguments.

Other file formats are supported by means of conversion utilities that
may be found in the subdirectory util/conversion. These will convert to
and from raw RGB format, and support all the standard raw YUV formats as
well as bitmaps. Raw RGB can be obtained as an output from standard conversion
utilities such as ImageMagick.

Example.
  Compress an image sequence of 100 frames of 352x288 video in tiff format.

  Step 1.

  Use your favourite conversion routine to produce a single raw RGB file of
  all the data. If your routine converts frame-by-frame then you will
  need to concatenate the output.

  Step 2.

  Convert from RGB to the YUV format of your choice. For example, to do
  420, type

  RGBtoYUV420 <file.rgb >file.yuv 352 288 100

  Note that this uses stdin and stdout to read and write the data.

  We have provided a script create_test_data.pl to help convert rgb format
  files into all the input formats supported by Dirac. The command line
  arguments it supports can be listed using

  create_test_data.pl -use

  Sample usage is

  create_test_data.pl -width=352 -height=288 -num_frames=100 file.rgb

  (This assumes that the RGBtoYUV utilities  are in a directory specified in
  PATH variable. If not in the path, then use options -convutildir and to set
  the directories where the script can find the conversion utilities.)

  The scripts then outputs files in all chroma formats (420, 422,
  444) supported by Dirac to the current directory.


  Step 4.

  Run the encoder. This will produce a locally decoded output in the
  same format if the locally decoded output is enabled using the -local flag.

  Step 5.

  Convert back to RGB.

  YUV420toRGB <file.yuv >file.rgb 352 288 100

  Step 6.

  Use your favourite conversion utility to convert to the format of your
  choice.

You can also use the transcode utility to convert data to and from Dirac's
native formats (see http://zebra.fh-weingarten.de/~transcode/):

  This example uses a 720x576x50 DV source, and transcodes to 720x576 YUV in
  4:2:0 chroma format.  Cascading codecs (DV + Dirac) is generally a bad idea
  - use this only if you don't have any other source of uncompressed video.

    transcode -i source.dv -x auto,null --dv_yuy2_mode -k -V -y raw,null -o file.avi
    tcextract -i test.avi -x rgb > file.yuv

Viewing and playback utilities for uncompressed video include MPlayer and
ImageMagick's display command.

  Continuing the 352x288 4:2:0 example above, to display a single frame
  of raw YUV with ImageMagick use the following (use <spacebar> to see
  subsequent frames):

    display -size 352x288 test.yuv

  Raw YUV 420 data can also be played back in MPlayer - use the following
  MPlayer command:

    mplayer -fps 15 -rawvideo on:size=152064:w=352:h=288 test.yuv

  (at the time of writing MPlayer could not playback 4:2:2 or 4:4:4 YUV data)


4.3 Encoding

The basic encoding syntax is to type

dirac_encoder [options] file_in file_out

This will compress file_in and produce an output file_out of compressed data.

A locally decoded output file_out.local-dec.yuv and instrumentation data
file_out.imt  (for debugging the encoder and of interest to developers only)
are also produced if the -local flag is enabled on the command line.

There are a large number of optional parameters that can be used to run the
encoder, all of which are listed below. To encode video you need three types of
parameter need to be set:

a) quality factor or target bit rate
b) source parameters (width, height, frame rate, chroma format)
c) encoding parameters (motion compensation block sizes, preferred viewing
   distance)

In practice you don't have to set all these directly because presets can be used
to use appropriate default values.

a) The most important parameters are the quality factor or target bit rate.

The quality factor is specified by using the option

qf     : Overall quality factor (>0)

This value is greater than 0, the higher the number, the better
the quality. Typical high quality is 8-10, but it will vary from sequence to
sequence, sometimes higher and sometimes lower.

The target bit rate is set using the option

targetrate : Target bit rate in Kb/s

This will attempt to maintain constant bit rate over the sequence. It works
reasonably well, but actual bit rate, especially over short sequences, may be
slightly different from the target.

Setting -targetrate overrides -qf, in that CBR will still be applied, although
the initial quality will be set by the given qf value. This might help the CBR
algorithm to adapt faster.

Setting -lossless overrides both -qf and -targetrate, and enforces lossless
coding.

b) Source parameters need to be set as the imput is just a raw YUV file and
the encoder doesn't have any information about it.

The best way to set source parameters is to use a preset for
different video formats.

The available preset options  are:
QSIF525   : width=176; height=120; 4:2:0 format; 14.98 frames/sec
QCIF      : width=176; height=144; 4:2:0 format; 12.5 frames/sec
SIF525    : width=352; height=240; 4:2:0 format; 14.98 frames/sec
CIF       : width=352; height=288; 4:2:0 format; 12.5 frames/sec
4SIF525   : width=704; height=480; 4:2:0 format; 14.98 frames/sec
4CIF      : width=704; height=576; 4:2:0 format; 12.5  frames/sec
SD480I60  : width=720; height=480; 4:2:2 format; 29.97 frames/sec
SD576I50  : width=720; height=576; 4:2:2 format; 25 frames/sec
HD720P60  : width=1280; height=720; 4:2:2 format; 60 frames/sec
HD720P50  : width=1280; height=720; 4:2:2 format; 50 frames/sec
HD1080I60 : width=1920; height=1080; 4:2:2 format; 29,97 frames/sec
HD1080I50 : width=1920; height=1080; 4:2:2 format; 25 frames/sec
HD1080P60 : width=1920; height=1080; 4:2:2 format; 59.94 frames/sec
HD1080P50 : width=1920; height=1080; 4:2:2 format; 50 frames/sec
DC2K24    : width=2048; height=1080; 4:2:2 format; 24 frames/sec
DC4K24    : width=4096; height=2160; 4:2:2 format; 24 frames/sec
UHDTV4K60 : width=3840; height=2160; 4:2:2 format; 59.94 frames/sec
UHDTV4K50 : width=3840; height=2160; 4:2:2 format; 50 frames/sec
UHDTV8K60 : width=7680; height=4320; 4:2:2 format; 59.94 frames/sec
UHDTV8K50 : width=7680; height=4320; 4:2:2 format; 50 frames/sec

The default format used is CUSTOM format which has the following preset values
width=640; height=480; 4:2:0 format; 23.97 frames/sec.

If your video is not one of these formats, you should pick the nearest preset
and override the parameters that are different.

Example 1 Simple coding example. Code a 720x576 sequence in Planar 420 format to
high quality.

Solution.

  dirac_encoder -cformat YUV420P -SD576I50 -qf 9 test.yuv test_out.drc

Example 2. Code a 720x486 sequence at 29.97 frames/sec in 422 format to
medium quality

Solution

  dirac_encoder -SD576I50 -width 720 -height 486 -fr 29.97 -cformat YUV422P -qf 5.5 test.yuv test_out.drc

Source parameters that affect coding are:

width           : Width of video frame
height          : Height of video frame
cformat         : Chroma Sampling format. Acceptable values are
                  YUV444P, YUV422P and YUV420P.
fr              : Frame rate. Can be a decimal number or a fraction. Examples
                  of acceptable values are 25, 29.97, 12.5, 30000/1001.
source_sampling : Source material type - 0 - progressive or 1 - interlaced

For a complete list of source parameters, refer to Annex C of the Dirac
Specification.

WARNING!! If you use a preset but don't override source parameters that
are different, then Dirac will still compress, but the bit rate will be
much, much higher and there may well be serious artefacts. The encoder prints
out the parameters it's actually using before starting encoding (in verbose
mode only), so that you can abort at this point.

c) The presets ALSO set encoding parameters. That's why it's a very good idea
to use presets, as the encoding parameters are a bit obscure. They're still
supported for those who want to experiment, but use with care.

Encoding parameters are:

L1_sep        : the separation between L1 frames (frames that are predicted but
                also used as reference frames, like P frames in MPEG-2)
num_L1        : the number of L1 frames before the next intra frame
xblen         : the width of blocks used for motion compensation
yblen         : the height of blocks used for motion compensation
xbsep         : the horizontal separation between blocks. Always <xblen
ybsep         : the vertical separation between blocks. Always <yblen
cpd           : normalised viewing distance parameter, in cycles per degree.
iwlt_filter   : transform filter to use when encoding INTRA frames, Valid
                values are DD9_7, LEGALL5_3, DD13_7, HAAR0, HAAR1, FIDELITY,
                DAUB97. Default value is DD13_7.
rwlt_filter   : transform filter to use when encoding INTER frames, Valid
                values are DD9_7, LEGALL5_3, DD13_7, HAAR0, HAAR1, FIDELITY,
                DAUB97. Default value is DD13_7.
wlt_depth     : transform depth, i.e number of times the component is split
                while applying the wavelet transform
no_spartition : Do not split a subband into coefficient blocks before
                entropy coding
multi_quants  : If subbands are split into multiple coefficient blocks before
                entropy coding, assign different quantisers to each block
                within the subband.
prefilter     : Prefilter to apply to input video before encoding. The name of
                the filter to be used and the filter strength have to be
                supplied. Valid filter names are NO_PF, CWM, RECTLP and
                DIAGLP. Filter strenth range should be in the range 0-10.
                (note PSNR statistics will be computed relative to the
                filtered video if -local is enabled)
lossless      : Lossless coding.(overrides -qf and -targetrate)
mv_prec       : Motion vector precision. Valid values are 1 (Pixel precision),
                1/2 (half-pixel precision), 1/4 (quarter pixel precision which
                is the default), 1/8 ( Eighth pixel precision).
full_search   : Use full search motion estimation
combined_me   : Use combination of all three components to do motion estimation
field_coding  : Code the input video as fields instead of frames.
                Default coding is frames.
use_vlc       : Use VLC for entropy coding of coefficients instead of
                arithmetic coding.
Modifying L1_sep and num_L1 allows for new GOP structures to be used, and
should be entirely safe. There are two non-GOP modes that can also be used for
encoding: setting num_L1=0 gives I-frame only coding, and setting L1_sep to
1 will do IP-only coding (no B-pictures). P-only coding isn't possible, but
num_L1=very large and L1_sep=1 will approximate it.

Modifying the block parameters is strongly deprecated: it's likely to break
the encoder as there are many constraints. Modifying cpd will not break
anything, but will change the way noise is distributed which may be more (or
less) suitable for your application. Setting cpd equal zero turns off
perceptual weighting altogether.

For more information, see the algorithm documentation on the website:
http://diracvideo.org/wiki/index.php/Dirac_Algorithm

Other options. The encoder also supports some other options, which are

verbose   : turn on verbosity (if you don't, you won't see the final bitrate!)
start     : code from this frame number
stop      : code up until this frame number
local     : Generate diagnostics and locally decoded output (to avoid running a
            decoder to see your video)

Using -start and -stop allows a small section to be coded, rather than the
whole thing.

If the -local flag is present in the command line, the encoder produces
diagnostic information about motion vectors that can be used to debug the
encoder algorithm. It also produces a locally decoded picture so that you
don't have to run the decoder to see what the pictures are like.

4.4 Decoding

Decoding is much simpler. Just point the decoder input at the bitstream and the
output to a file:

  dirac_decoder -verbose test_enc test_dec

will decode test_enc into test_dec with running commentary.