              MPE (Multi-Processing Environment) for Windows
              ----------------------------------------------

                Mathematics and Computer Science Division
                       Argonne National Laboratory

I. INTRODUCTION
----------------

The Multi-Processing Environment (MPE) attempts to provide programmers with
a complete suite of performance analysis tools for their MPI programs, based
on a post-processing approach. These tools include a set of profiling
libraries, a set of utility programs, and a set of graphical tools.

The first set of tools to be used with user MPI programs is the profiling
libraries, which provide a collection of routines that create log files.
These log files can be created manually by inserting MPE calls in the MPI
program, automatically by linking with the appropriate MPE libraries, or by
combining the two methods. Currently, MPE offers the following 4 profiling
libraries.

   1) Tracing Library - Traces all MPI calls. Each MPI call is preceded by a
      line that contains the rank in MPI_COMM_WORLD of the calling process,
      and followed by another line indicating that the call has completed.
      Most send and receive routines also indicate the values of count, tag,
      and partner (destination for sends, source for receives). Output is to
      standard output.

   2) Animation Libraries - A simple form of real-time program animation
      that requires X window routines.

   3) Logging Libraries - The most useful and widely used profiling libraries
      in MPE. These libraries form the basis for generating log files from
      user MPI programs. There are several different log file formats
      available in MPE. The default log file format is CLOG2. It is a low
      overhead logging format, a simple collection of single timestamp events.
      The old format ALOG, which has not been developed for years, is not
      distributed here.
      The powerful visualization format is SLOG-2, which stands for
      Scalable LOGfile format version II and is a total redesign of the
      original SLOG format. SLOG-2 allows for much improved scalability
      for visualization purposes. A CLOG2 file can be easily converted to
      a SLOG-2 file through the new SLOG-2 viewer, Jumpshot-4.

   4) Collective and datatype checking library - An argument consistency
      checking library for MPI collective calls. It checks for datatype,
      root, and various argument consistency in MPI collective calls.

The set of utility programs in MPE includes a log format converter (e.g.
clog2TOslog2), a logfile print program (e.g. slog2print) and a logfile
viewer and convertor (e.g. jumpshot). These new tools, clog2TOslog2,
slog2print and jumpshot (Jumpshot-4), replace the old tools clog2slog,
slog_print and logviewer (i.e. Jumpshot-2 and Jumpshot-3). For more
information on the various logfile formats and their viewers, see

http://www.mcs.anl.gov/perfvis



II. USAGE
---------

II. a) CUSTOMIZING LOGFILES
---------------------------

In addition to using the predefined MPE logging libraries to log all MPI
calls, MPE logging calls can be inserted into the user's MPI program to
define and log states. These states are called User-Defined states. States
may be nested, allowing one to define a state describing a user routine that
contains several MPI calls, and display both the user-defined state and
the MPI operations contained within it.

The simplest way to insert user-defined states is as follows:
1) Get handles from the MPE logging library: MPE_Log_get_state_eventIDs()
   has to be used to get unique event IDs (MPE logging handles).
   This is important if you are writing a library that uses
   the MPE logging routines from the MPE system.

   PS. Hardwiring the eventIDs is considered a bad idea since it may cause
   eventID conflicts, so the practice isn't supported.
   Older MPE libraries
   provide MPE_Log_get_event_number(), which is still supported but
   has been deprecated. Users are strongly urged to use
   MPE_Log_get_state_eventIDs() instead.
2) Set the logged state's characteristics: MPE_Describe_state() sets the
   name and color of the state.
3) Log the events of the logged state: MPE_Log_event() is called twice,
   once at the start and once at the end of the user-defined state.

Below is a simple example that uses the 3 steps outlined above.

\begin{verbatim}

int eventID_begin, eventID_end;
...
MPE_Log_get_state_eventIDs( &eventID_begin, &eventID_end );
...
MPE_Describe_state( eventID_begin, eventID_end, "Multiplication", "red" );
...
void MyAmult( Matrix m, Vector v )
{
    /* Log the start event; no extra data is attached in this example */
    MPE_Log_event( eventID_begin, 0, NULL );
    ... Amult code, including MPI calls ...
    /* Log the matching end event */
    MPE_Log_event( eventID_end, 0, NULL );
}

\end{verbatim}

The logfile generated by this code will have the MPI routines nested within
the routine MyAmult().

Besides user-defined states, MPE2 also provides support for user-defined
events, which can be defined through the use of MPE_Log_get_solo_eventID()
and MPE_Describe_event(). For more details, see e.g. cpilog.c.

If the MPE logging library, liblmpe.a, is NOT linked with the user program,
MPE_Init_log() and MPE_Finish_log() need to be called before and after all
the MPE calls. Sample programs cpilog.c and fpilog.f are available to
illustrate the use of these MPE routines. They are in the MPE source
directory, mpe2/src/wrappers/test, or the installed directory,
share/examples_logging.
For further linking information, see section "Convenient Compiler Wrappers".

For an undefined user-defined state, i.e. when the
corresponding MPE_Describe_state() has not been issued, the new jumpshot
(Jumpshot-4) may display the legend name as "UnknownType-INDEX", where INDEX
is the internal MPE category index.



II. b) ENVIRONMENTAL VARIABLES
------------------------------

For MPE logging, MPE_TMPDIR and MPE_LOGFILE_PREFIX are the 2 environment
variables that most users find very useful, so it is recommended to set
these 2 variables before launching the MPI program. The environment
variables that affect logging are:

CLOG_BLOCK_SIZE: The integer value determines the clog2 buffer block size,
                 which sets the minimum clog2 file size. If
                 CLOG_BLOCK_SIZE is not set, 64K per block is assumed.

CLOG_BUFFERED_BLOCKS: The integer value determines the number of blocks
                      within CLOG2's internal buffer. Together with
                      CLOG_BLOCK_SIZE, CLOG_BUFFERED_BLOCKS determines how
                      often the internal buffer is flushed to the disk.
                      The total buffer size is the product of
                      CLOG_BLOCK_SIZE and CLOG_BUFFERED_BLOCKS. These 2
                      environment variables allow the user to minimize MPE2
                      logging overhead when large local memory is available.
                      The default value of CLOG_BUFFERED_BLOCKS is 128.

MPE_TMPDIR: MPE_TMPDIR takes precedence over TMPDIR. It specifies a
            directory to be used as temporary storage for each process.
            By default, when MPE_TMPDIR and TMPDIR are NOT set,
            /tmp will be used. When generating a very large logfile
            for a long-running MPI job, the user needs to make sure that
            MPE_TMPDIR (or TMPDIR) is big enough to hold the temporary
            local logfile, which will be deleted if the merged logfile
            can be created successfully. In order to minimize the
            overhead of the logging to the MPI program, it is highly
            recommended to use a *local* file system for MPE_TMPDIR
            (or TMPDIR).

            Note : The final merged logfile will be written back to the
                   file system where process 0 is.

MPE_DELETE_LOCALFILE: The boolean value determines whether to delete the
                      temporary local clog2 file. When this flag is
                      set to false, the temporary files are kept and the
                      user needs to collect the temporary clog2 files from
                      each slave node's MPE_TMPDIR. Then the separate serial
                      programs, clog2_join and clog2_repair, can be used to
                      merge the local clog2 files. This process is useful
                      e.g. when MPI_Finalize() fails to complete properly
                      because the user program has overwritten MPE/MPI
                      internal data structures.

MPE_CLOCKS_SYNC: The boolean value determines the behavior of
                 MPE_Log_sync_clocks() and the default clock synchronization
                 at the end of logging. Users may want to force MPE
                 clock synchronization when the MPI implementation has
                 a buggy clock synchronization mechanism, e.g. in some
                 versions of BG/L MPI, MPI_WTIME_IS_GLOBAL is incorrectly
                 set to true when a 64-way or 256-way partition is used.

MPE_SYNC_ALGORITHM: specifies the clock synchronization algorithm. The
                    accepted values are "DEFAULT", "SEQ", "BITREE"
                    and "ALTNGBR".
                    SEQ: an O(N) steps algorithm. It is non-scalable and
                         the slowest, but is also the most accurate.
                    BITREE: an O(log2(N)) steps algorithm, scalable and much
                            faster than SEQ but less accurate than SEQ.
                            A good compromise.
                    ALTNGBR: an O(1) steps algorithm, perfectly parallel.
                             It is the fastest of the 3 algorithms supported
                             and also the least accurate.
                    DEFAULT: uses SEQ when the number of processes <= 16,
                             and BITREE when the number of processes > 16.

MPE_SYNC_FREQUENCY: specifies the number of iterations of the selected clock
                    synchronization. In general, the higher the value of
                    MPE_SYNC_FREQUENCY, the higher the probability of
                    obtaining an accurate measurement of all the clocks,
                    i.e. less error. Keep in mind that this is not
                    a guarantee and is highly dependent on the system
                    noise. The default is 3.

MPE_LOGFILE_PREFIX: specifies the filename prefix of the output logfile.
                    The file extension will be determined by the output
                    logfile format, i.e. MPE_LOG_FORMAT.

MPE_LOG_FORMAT: determines the format of the logfile generated from the
                execution of an application linked with the MPE logging
                libraries. The only allowed value for MPE_LOG_FORMAT is
                CLOG2, so there is no need to use this variable at the
                moment.

MPE_LOG_OVERHEAD: The boolean value determines whether to log MPE/CLOG2's
                  internal profiling state CLOG_Buffer_write2disk(). The
                  default setting is yes. CLOG_Buffer_write2disk labels the
                  region in each process where MPE/CLOG2 spends time
                  flushing logging data from memory to the disk. The
                  frequency and location of the CLOG_Buffer_write2disk
                  state can be altered by changing CLOG_BLOCK_SIZE and/or
                  CLOG_BUFFERED_BLOCKS.

Possible boolean values are "true", "false", "yes" and "no", in either
all lower or all upper case.


For MPE X11 graphics, the environment variable DISPLAY set in each process
is read during MPE_Open_graphics().

DISPLAY: determines where the MPE X11 graphics of each process connect to.



II. c) EXAMPLE PROJECT FILE
---------------------------
In examples/, the user can find example source code, cpilog.c, showing how
to customize log files.

II. d) UTILITY PROGRAMS
-----------------------

In bin/, the user can find several utility programs that are useful when
manipulating logfiles. These include log format converters, log format
print programs, and a logfile display program.


Log Format Converters
---------------------

clog2TOslog2 : a CLOG2 to SLOG-2 logfile convertor. For more details,
               do "clog2TOslog2 -h".

rlogTOslog2  : an RLOG to SLOG-2 logfile convertor. For more details,
               do "rlogTOslog2 -h". RLOG is an internal MPICH2 logging
               format.

logconvertor : a standalone GUI based convertor that invokes clog2TOslog2
               or rlogTOslog2 based on logfile extension. The GUI also
               shows the progress of the conversion. The same convertor
               can be invoked from within the logfile viewer, jumpshot.

slog2filter  : a SLOG-2 to SLOG-2 logfile convertor. It allows for removal
               of unwanted categories (when used with slog2print -c). It
               also allows for changing the SLOG-2 internal structure, e.g.
               modifying the duration of the preview drawable. The tool
               reads and writes SLOG-2 files of the same version.

slog2updater : a SLOG-2 file format update utility. It is essentially
               a slog2filter that reads in an older SLOG-2 file and writes
               out the latest SLOG-2 file format.


Log Format Print Programs
-------------------------

clog2_print  : a stdout print program for CLOG2 file.
               Java version is named clogprint.

clog2_join   : a clog2 serial merging program that merges clog2 files:
               1) temporary local clog2 files that are all from the
                  same MPI_COMM_WORLD.
               2) merged clog2 files from each MPI_COMM_WORLD
                  (Incomplete!, timestamps are not sync'ed yet.)

clog2_repair : a clog2 repair program that tries to fix the missing data
               of a clog2 file (when the MPI program that is being profiled
               aborts) so that the file can be processed by other tools
               like clog2TOslog2.

rlog_print   : a stdout print program for RLOG file.

slog2print   : a stdout print program for SLOG-2 file.



Log File Display Program
------------------------

jumpshot     : the Jumpshot-4 launcher script. Jumpshot-4 does logfile
               conversion as well as visualization.

To view a logfile, say fpilog.slog2, do

jumpshot fpilog.slog2

The command will select and invoke Jumpshot-4 to display the content
of the SLOG-2 file if Jumpshot-4 has been built and installed successfully.

One can also do

jumpshot fpilog.clog2

or

jumpshot barrier.rlog

Both will invoke the logfile convertor first before visualization.
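
The extension-based behavior described above (a SLOG-2 file is displayed
directly, while CLOG2 and RLOG files go through a convertor first) can be
sketched in C as follows. This is only an illustration, not Jumpshot-4's
actual implementation: the helper names has_suffix() and converter_for()
are hypothetical, and the mapping from file extension to tool is assumed
from the tool descriptions in this document.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Return nonzero if `name` ends with `suffix`. */
static int has_suffix(const char *name, const char *suffix)
{
    size_t n = strlen(name);
    size_t s = strlen(suffix);
    return n >= s && strcmp(name + n - s, suffix) == 0;
}

/*
 * Hypothetical helper (not part of MPE): choose the convertor for a
 * logfile based on its extension.
 *   - returns ""   when the file is already SLOG-2 (no conversion needed)
 *   - returns NULL when the extension is not recognized
 */
const char *converter_for(const char *logfile)
{
    assert(logfile != NULL);
    if (has_suffix(logfile, ".slog2"))
        return "";
    if (has_suffix(logfile, ".clog2"))
        return "clog2TOslog2";
    if (has_suffix(logfile, ".rlog"))
        return "rlogTOslog2";
    return NULL;
}
```

For example, converter_for("fpilog.clog2") returns "clog2TOslog2", while
converter_for("fpilog.slog2") returns "" because the file can be displayed
without conversion.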