src/mpe2/README.windows

                 MPE (Multi-Processing Environment) for Windows
                 ----------------------------------------------

                    Mathematics and Computer Science Division
                           Argonne National Laboratory

I.  INTRODUCTION
----------------

The Multi-Processing Environment (MPE) attempts to provide programmers with
a complete suite of performance analysis tools for their MPI programs based
on post processing approach.  These tools include a set of profiling libraries,
a set of utility programs, and a set of graphical tools.

The first set of tools to be used with user MPI programs is profiling libraries
which provide a collection of routines that create log files.  These log files
can be created manually by inserting MPE calls in the MPI program, or
automatically by linking with the appropriate MPE libraries, or by combining
the above two methods.  Currently, the MPE offers the following 4 profiling
libraries.

   1) Tracing Library - Traces all MPI calls.  Each MPI call is preceded by a
      line that contains the rank in MPI_COMM_WORLD of the calling process,
      and followed by another line indicating that the call has completed.
      Most send and receive routines also indicate the values of count, tag,
      and partner (destination for sends, source for receives).  Output is to
      standard output.

   2) Animation Libraries - A simple form of real-time program animation
      that requires X window routines.

   3) Logging Libraries - The most useful and widely used profiling libraries
      in MPE.  These libraries form the basis to generate log files from
      user MPI programs.  There are several different log file formats
      available in MPE.  The default log file format is CLOG2.  It is a low
      overhead logging format, a simple collection of single timestamp events.
      The old format ALOG, which is not being developed for years, is not
      distributed here.  The powerful visualization format is SLOG-2, stands
      for Scalable LOGfile format version II which is a total redesign of the
      original SLOG format.  SLOG-2 allows for much improved scalability for
      visualization purpose.  CLOG2 file can be easily converted to
      SLOG-2 file through the new SLOG-2 viewer, Jumpshot-4.

   4) Collective and datatype checking library - An argument consistency
      checking library for MPI collective calls.  It checks for datatype, root,
      and various argument consistency in MPI collective calls.

The set of utility programs in MPE includes log format converter (e.g.
clogTOslog2), logfile print (e.g. slog2print) and logfile viewer and
convertor (e.g. jumpshot).  These new tools, clog2TOslog2, slog2print and
jumpshot(Jumpshot-4) replace old tools, clog2slog, slog_print and logviewer
(i.e. Jumpshot-2 and Jumpshot-3).  For more information of various
logfile formats and their viewers, see

http://www.mcs.anl.gov/perfvis


II. USAGE
---------

II. a) CUSTOMIZING LOGFILES
---------------------------

In addition to using the predefined MPE logging libraries to log all MPI
calls, MPE logging calls can be inserted into user's MPI program to define
and log states.  These states are called User-Defined states.  States may
be nested, allowing one to define a state describing a user routine that
contains several MPI calls, and display both the user-defined state and
the MPI operations contained within it.

The simplest way to insert user-defined states is as follows:
1) Get handles from MPE logging library: MPE_Log_get_state_eventIDs()
   has to be used to get unique event IDs (MPE logging handles).
   This is important if you are writing a library that uses
   the MPE logging routines from the MPE system.

   PS. Hardwiring the eventIDs is considered a bad idea since it may cause
   eventID confict and so the practice isn't supported.  Older MPE libraries
   provide MPE_Log_get_event_number() which is still being supported but
   has been deprecated.  Users are strongly urged to use
   MPE_Log_get_state_eventIDs() instead.
2) Set the logged state's characteristics: MPE_Describe_state() sets the
   name and color of the states.
3) Log the events of the logged states: MPE_Log_event() are called twice
   to log the user-defined states.

Below is a simple example that uses the 3 steps outlined above.

\begin{verbatim}

int eventID_begin, eventID_end;
...
MPE_Log_get_state_eventIDs( &eventID_begin, &eventID_end );
...
MPE_Describe_state( eventID_begin, eventID_end, "Multiplication", "red" );
...
MyAmult( Matrix m, Vector v )
{
    /* Log the start event along with the size of the matrix */
    MPE_Log_event( eventID_begin, 0, NULL );
    ... Amult code, including MPI calls ...
    MPE_Log_event( eventID_end, 0, NULL );
}

\end{verbatim}

The logfile generated by this code will have the MPI routines nested within
the routine MyAmult().

Besides user-defined states, MPE2 also provides support for user-defined
events which can be defined through use of MPE_Log_get_solo_eventID()
and MPE_Describe_event.  For more details, e.g. see cpilog.c.

If the MPE logging library, liblmpe.a, is NOT linked with the user program,
MPE_Init_log() and MPE_Finish_log() need to be used before and after all
the MPE calls.   Sample programs cpilog.c and fpilog.f are available to
illustrate the use of these MPE routines.  They are in the MPE
source directory, mpe2/src/wrappers/test or the installed directory,
share/examples_logging to illustrate the use of these MPE routines.
For futher linking information, see section "Convenient Compiler Wrappers".

For undefined user-defined state, i.e. corresponding MPE_Describe_state()
has not been issued, new jumpshot (Jumpshot-4) may display the legend name as
"UnknownType-INDEX" where INDEX is the internal MPE category index.


II. b) ENVIRONMENTAL VARIABLES
------------------------------

For MPE logging, MPE_TMPDIR and MPE_LOGFILE_PREFIX are 2 environment variables
that most users find to be very useful.  So it is recommended to set these
2 env. variables before launching the MPI program during logging :

CLOG_BLOCK_SIZE: The integer value determines the clog2 buffer block size
                 which set the least minimum clog2 file size.  If
                 CLOG_BLOCK_SIZE is not set, 64K per block is assumed.

CLOG_BUFFERED_BLOCKS: The integer value determines the number of blocks
                      witin the CLOG2's internal buffer.  Together with
                      CLOG_BLOCK_SIZE, CLOG_BUFFERED_BLOCKS determines how
                      often the internal buffer is flushed to the disk.
                      The total buffer size is determined by the product of
                      CLOG_BLOCK_SIZE and CLOG_BUFFERED_BLOCKS.  These 2
                      environmental variables allows user to minimize MPE2
                      logging overhead when large local memory is available.
                      The default value is 128.

MPE_TMPDIR: MPE_TMPDIR takes precedence over TMPDIR.  It specifies a
            directory to be used as temporary storage for each process.
            By default, when MPE_TMPDIR and TMPDIR are NOT set,
            /tmp will be used.  When user needs to generate a very large
            logfile for long-running MPI job, user needs to make sure that
            MPE_TMPDIR(or TMPDIR) is big enough to hold the temporary local
            logfile which will be deleted if the merged logfile can be
            created successfully.  In order to minimize the overhead of the
            logging to the MPI program, it is highly recommended user to
            use a *local* file system for TMPDIR.

             Note : The final merged logfile will be written back to the
                    file system where process 0 is.

MPE_DELETE_LOCALFILE:  The boolean value determines whether to delete the
                       temporary local clog2 file.  When this flag is
                       set to true, user needs to collect from the temporary
                       clog2 files from each slave node's MPE_TMPDIR.
                       Then separate serial programs, clog2_join and
                       clog2_repair, can be used to merge the local clog2
                       files.  This process is useful e.g. when MPI_Finalize()
                       fails to complete properly, e.g.  due to user program
                       overwritten to MPE/MPI internal data structures.

MPE_CLOCKS_SYNC: The boolean value determines the behavior of
                 MPE_Log_sync_clocks() and the default clock synchronization
                 at the end of logging.  Users may way to force MPE
                 clock synchronization when the MPI implementation has
                 buggy clock synchronization mechanism, e.g. Some versions
                 of BG/L MPI's MPI_WTIME_IS_GLOBAL is incorrectly set
                 to true when 64-ways or 256-ways partition is used.

MPE_SYNC_ALGORITHM: specifies the clock synchronization algorithm.  The
                    accepted values are "DEFAULT", "SEQ", "BITREE"
                    and "ALTNGBR".
                    SEQ: a O(N) steps algorithm and is non-scalable and slowest
                         but is also the most accurate.
                    BITREE: a O(log2(N)) steps algorithm, scalable and much
                            faster than SEQ but less accurate than SEQ.
                            A good compromise.
                    ALTNGBR: a O(1) steps algorithm, perfectly parallel
                             is the fastest of 3 algorithms supported.
                             It is also the least accurate.
                    DEFAULT: uses SEQ when the number of processes <= 16.
                             uses BITREE when number of processes > 16.

MPE_SYNC_FREQUENCY: specifies the number of iterations of selected clock
                    synchronization.  In general, the higher of
                    MPI_SYNC_FREQUENCY, the higher the probability of
                    obtaining a accurate measurement of all the clocks,
                    i.e. less error.  Keep in mind, this is generally
                    not a guarantee and is highly dependent of the system
                    noise.  The default is 3.

MPE_LOGFILE_PREFIX: specifies the filename prefix of the output logfile.
                    The file extension will be determined by the output
                    logfile format, i.e. MPE_LOG_FORMAT.

MPE_LOG_FORMAT: determines the format of the logfile generated from the
                execution of application linked with MPE logging libraries.
                The allowed value for MPE_LOG_FORMAT is CLOG2 only.
                So there is no need to use this variable at the moment.

MPE_LOG_OVERHEAD: The boolean value determines to log MPE/CLOG2's internal
                  profiling state CLOG_Buffer_write2disk(). The default
                  setting is yes.  CLOG_Buffer_write2disk labels region
                  in each process that MPE/CLOG2 spends on flushing logging
                  data in the memory to the disk.  The frequency and location
                  of CLOG_Buffer_write2disk state can be altered by changing
                  CLOG_BLOCK_SIZE and/or CLOG_BUFFERED_BLOCKS.

Possible boolean values are "true", "false", "yes" and "no" in either
all lower or upper cases.


For MPE X11 graphics, environment variables DISPLAY set in each process
is read during MPE_Open_graphics.

DISPLAY: determines where MPE X11 graphics on each process is connected to.


II. c) EXAMPLE PROJECT FILE
---------------------------
In examples/, user can find example source code, cpilog.c, on how to customize
log files.

II. d) UTILITY PROGRAMS
-----------------------

In bin/, user can find several useful utility programs when manipulating
logfiles.  These includes log format converters, log format print programs,
and logfile display program,


Log Format Converters
---------------------

clog2TOslog2 : a CLOG2 to SLOG-2 logfile convertor.  For more details,
              do "clog2TOslog2 -h".

rlogTOslog2 : a RLOG to SLOG-2 logfile convertor.  For more details,
              do "rlogTOslog2 -h".  Where RLOG is an internal MPICH2 logging
              format.

logconvertor : a standalone GUI based convertor that invokes clog2TOslog2
               or rlogTOslog2 based on logfile extension.  The GUI also
               shows the progress of the conversion.  The same convertor
               can be invoked from within the logfile viewer, jumpshot.

slog2filter : a SLOG-2 to SLOG-2 logfile convertor.  It allows for removal
              unwanted categories (when used with slog2print -c).  It also
              allows for changing of the SLOG-2 internal structure, e.g.
              modify the duration of preview drawable.  The tool reads
              and writes SLOG-2 file of same version.

slog2updater: a SLOG-2 file format update utility.  It is essentially
              a slog2filter that reads in older SLOG-2 file and writes
              out the latest SLOG-2 file format.


Log Format Print Programs
-------------------------

clog2_print : a stdout print program for CLOG file.
              Java version is named as clogprint.

clog2_join  : a clog2 serial merging program that merges clog2 files
              1) temporary local clog2 files which all are from the
                 same MPI_COMM_WORLD.
              2) merged clog2 files from each MPI_COMM_WORLDs
                 (Incomplete!, timestamps are not sync'ed yet.)

clog2_repair : a clog2 repair program that tries to fix the missing data
               of a clog2 file (when the MPI program that is being profiled
               aborts) so that the file can be processed by other tools
               like clog2TOslog2.

rlog_print  : a stdout print program for SLOG-2 file.

slog2print  : a stdout print program for SLOG-2 file.


Log File Display Program
------------------------

jumpshot : the Jumpshot-4 launcher script.  Jumpshot-4 does logfile
           conversion as well as visualization.

To view a logfile, say fpilog.slog2, do

jumpshot fpilog.slog2

The command will select and invoke Jumpshot-4 to display the content
of SLOG-2 file if Jumpshot-4 has been built and installed successfully.

One can also do

jumpshot fpilog.clog2

or

jumpshot barrier.rlog

Both will invoke the logfile convertor first before visualization.