.. _runtime-control:

Running a compiled program
==========================

.. index::
   single: runtime control of Haskell programs
   single: running, compiled program
   single: RTS options

To make an executable program, the GHC system compiles your code and
then links it with a non-trivial runtime system (RTS), which handles
storage management, thread scheduling, profiling, and so on.

The RTS has a lot of options to control its behaviour. For example, you
can change the context-switch interval, the default size of the heap,
and enable heap profiling. These options can be passed to the runtime
system in several ways; the next section
(:ref:`setting-rts-options`) describes the various methods, and the
following sections describe the RTS options themselves.

.. _setting-rts-options:

Setting RTS options
-------------------

.. index::
   single: RTS options, setting

There are four ways to set RTS options:

-  on the command line between ``+RTS ... -RTS``, when running the
   program (:ref:`rts-opts-cmdline`)

-  at compile-time, using :ghc-flag:`-with-rtsopts=⟨opts⟩`
   (:ref:`rts-opts-compile-time`)

-  with the environment variable :envvar:`GHCRTS`
   (:ref:`rts-options-environment`)

-  by overriding "hooks" in the runtime system (:ref:`rts-hooks`)

.. _rts-opts-cmdline:

Setting RTS options on the command line
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

.. index::
   single: +RTS
   single: -RTS
   single: --RTS

If you set the :ghc-flag:`-rtsopts[=⟨none|some|all|ignore|ignoreAll⟩]` flag
appropriately when linking (see :ref:`options-linker`), you can give RTS
options on the command line when running your program.

When your Haskell program starts up, the RTS extracts command-line
arguments bracketed between ``+RTS`` and ``-RTS`` as its own. For example:

.. code-block:: none

    $ ghc prog.hs -rtsopts
    [1 of 1] Compiling Main             ( prog.hs, prog.o )
    Linking prog ...
    $ ./prog -f +RTS -H32m -S -RTS -h foo bar

The RTS will snaffle ``-H32m -S`` for itself, and the remaining
arguments ``-f -h foo bar`` will be available to your program if/when it
calls ``System.Environment.getArgs``.

No ``-RTS`` option is required if the runtime-system options extend to
the end of the command line, as in this example:

.. code-block:: none

    % hls -ltr /usr/etc +RTS -A5m

If you absolutely positively want all the rest of the options in a
command line to go to the program (and not the RTS), use a
``--RTS``.

As always, for RTS options that take ⟨size⟩s: If the last character of
⟨size⟩ is a K or k, multiply by 1024; if an M or m, by 1024² (1,048,576);
if a G or g, by 1024³ (1,073,741,824). (And any wraparound in the
counters is *your* fault!)

Giving a ``+RTS -?`` option will print out the RTS
options actually available in your program (which vary, depending on how
you compiled).

.. note::
    Since GHC is itself compiled by GHC, you can change RTS options in
    the compiler using the normal ``+RTS ... -RTS`` combination. For instance, to set
    the maximum heap size for a compilation to 128M, you would add
    ``+RTS -M128m -RTS`` to the command line.

.. _rts-opts-compile-time:

Setting RTS options at compile time
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

GHC lets you change the default RTS options for a program at compile
time, using the ``-with-rtsopts`` flag (:ref:`options-linker`). A common
use for this is to give your program a default heap and/or stack size
that is greater than the default. For example, to set ``-H128m -K64m``,
link with ``-with-rtsopts="-H128m -K64m"``.

.. _rts-options-environment:

Setting RTS options with the ``GHCRTS`` environment variable
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

.. index::
   single: RTS options; from the environment
   single: environment variable; for setting RTS options
   single: GHCRTS environment variable

.. envvar:: GHCRTS

    If the ``-rtsopts`` flag is set to something other than ``none`` or ``ignoreAll``
    when linking, RTS options are also taken from the environment variable
    :envvar:`GHCRTS`. For example, to set the maximum heap size to 2G
    for all GHC-compiled programs (using an ``sh``\-like shell):

    .. code-block:: sh

        GHCRTS='-M2G'
        export GHCRTS

    RTS options taken from the :envvar:`GHCRTS` environment variable can be
    overridden by options given on the command line.

.. tip::
    Setting something like ``GHCRTS=-M2G`` in your environment is a
    handy way to avoid Haskell programs growing beyond the real memory in
    your machine, which is easy to do by accident and can cause the machine
    to slow to a crawl until the OS decides to kill the process (and you
    hope it kills the right one).

.. _rts-hooks:

"Hooks" to change RTS behaviour
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

.. index::
   single: hooks; RTS
   single: RTS hooks
   single: RTS behaviour, changing

GHC lets you exercise rudimentary control over certain RTS settings for
any given program, by compiling in a "hook" that is called by the
run-time system. The RTS contains stub definitions for these hooks, but
by writing your own version and linking it on the GHC command line, you
can override the defaults.

Owing to the vagaries of DLL linking, these hooks don't work under
Windows when the program is built dynamically.

Runtime events
##############

You can change the messages printed when the runtime system "blows up,"
e.g., on stack overflow. The hooks for these are as follows:

.. c:function:: void OutOfHeapHook (unsigned long, unsigned long)

    The heap-overflow message.

.. c:function:: void StackOverflowHook (long int)

    The stack-overflow message.

.. c:function:: void MallocFailHook (long int)

    The message printed if ``malloc`` fails.

.. _event_log_output_api:

Event log output
################

Furthermore, GHC lets you specify the way event log data (see
:rts-flag:`-l ⟨flags⟩`) is written through a custom :c:type:`EventLogWriter`:

.. c:type:: EventLogWriter

    A sink of event-log data.

    .. c:member:: void initEventLogWriter(void)

        Initializes your :c:type:`EventLogWriter`. This is optional.

    .. c:member:: bool writeEventLog(void *eventlog, size_t eventlog_size)

        Hands buffered event log data to your event log writer. Return true on success.
        Required for a custom :c:type:`EventLogWriter`.

    .. c:member:: void flushEventLog(void)

        Flush buffers (if any) of your custom :c:type:`EventLogWriter`. This can
        be ``NULL``.

    .. c:member:: void stopEventLogWriter(void)

        Called when event logging is about to stop. This can be ``NULL``.

To use an :c:type:`EventLogWriter` the RTS API provides the following functions:

.. c:function:: EventLogStatus eventLogStatus(void)

    Query whether the current runtime system supports the eventlog (e.g. whether
    the current executable was linked with :ghc-flag:`-eventlog`) and, if it
    is supported, whether it is currently logging.

.. c:function:: bool startEventLogging(const EventLogWriter *writer)

    Start logging events to the given :c:type:`EventLogWriter`. Returns true on
    success or false if another writer has already been configured.

.. c:function:: void endEventLogging()

    Tear down the active :c:type:`EventLogWriter`.

where the ``enum`` :c:type:`EventLogStatus` is:

.. c:type:: EventLogStatus

    * ``EVENTLOG_NOT_SUPPORTED``: The runtime system wasn't compiled with
      eventlog support.
    * ``EVENTLOG_NOT_CONFIGURED``: An :c:type:`EventLogWriter` has not yet been
      configured.
    * ``EVENTLOG_RUNNING``: An :c:type:`EventLogWriter` has been configured and
      is running.


.. _rts-options-misc:

Miscellaneous RTS options
-------------------------

.. rts-flag:: --install-signal-handlers=⟨yes|no⟩

    If yes (the default), the RTS installs signal handlers to catch
    things like :kbd:`Ctrl-C`. This option is primarily useful when you are
    using the Haskell code as a DLL, and want to set your own signal
    handlers.

    Note that even with ``--install-signal-handlers=no``, the RTS
    interval timer signal is still enabled. The timer signal is either
    SIGVTALRM or SIGALRM, depending on the RTS configuration and OS
    capabilities. To disable the timer signal, use the ``-V0`` RTS
    option (see :rts-flag:`-V ⟨secs⟩`).

.. rts-flag:: --install-seh-handlers=⟨yes|no⟩

    If yes (the default), the RTS on Windows installs exception handlers to
    catch unhandled exceptions using the Windows exception handling mechanism.
    This option is primarily useful when you are using the Haskell code as a
    DLL, and don't want the RTS to ungracefully terminate your application on
    errors such as segfaults.

.. rts-flag:: --generate-crash-dumps

    If yes (the default), the RTS on Windows will generate a core dump on
    any crash. These dumps can be inspected using debuggers such as WinDBG.
    The dumps record all code, registers and threading information at the time
    of the crash. Note that this implies ``--install-seh-handlers=yes``.

.. rts-flag:: --generate-stack-traces=<yes|no>

    If yes (the default), the RTS on Windows will generate a stack trace on
    crashes if exception handling is enabled. In order to get more information
    in compiled executables, symbols for C code or DLLs need to be available.

.. rts-flag:: --disable-delayed-os-memory-return

    If given, uses ``MADV_DONTNEED`` instead of ``MADV_FREE`` on platforms where
    this results in more accurate resident memory usage of the program as shown
    in memory usage reporting tools (e.g. the ``RSS`` column in ``top`` and ``htop``).

    Using this is expected to make the program slightly slower.

    On Linux, ``MADV_FREE`` is newer and faster because it can avoid zeroing
    pages if they are re-used by the process later (see ``man 2 madvise``).
    The trade-off is that memory inspection tools like ``top`` will
    not immediately reflect the freeing in their display of resident memory
    (RSS column): only under memory pressure will Linux actually remove
    the freed pages from the process and update its RSS statistics.
    Until then, the pages show up as ``LazyFree`` in ``/proc/PID/smaps``
    (see ``man 5 proc``).

    The delayed RSS update can confuse programmers debugging memory issues,
    production memory monitoring tools, and end users who may complain about
    undue memory usage shown in reporting tools, so with this flag it can
    be turned off.


.. rts-flag:: -xp

    On 64-bit machines, the runtime linker usually needs to map object code
    into the low 2Gb of the address space, due to the x86_64 small memory model
    where most symbol references are 32 bits. The problem is that this 2Gb of
    address space can fill up, especially if you're loading a very large number
    of object files into GHCi.

    This flag offers a workaround, albeit a slightly convoluted one. To be able
    to load an object file outside of the low 2Gb, the object code needs to be
    compiled with ``-fPIC -fexternal-dynamic-refs``. When the ``+RTS -xp`` flag
    is passed, the linker will assume that all object files were compiled with
    ``-fPIC -fexternal-dynamic-refs`` and load them anywhere in the address
    space. It's up to you to arrange that the object files you load (including
    all packages) were compiled in the right way. If this is not the case for
    an object, the linker will probably fail with an error message when the
    problem is detected.

    On some platforms where PIC is always the case, e.g. macOS and OpenBSD on
    x86_64, and macOS and Linux on aarch64, this flag is enabled by default.
    One repercussion of this is that referenced system libraries also need to be
    compiled with ``-fPIC`` if we need to load them in the runtime linker.

.. rts-flag:: -xm ⟨address⟩

    .. index::
       single: -xm; RTS option

    .. warning::

        This option is for working around memory allocation
        problems only. Do not use unless GHCi fails with a message like
        “\ ``failed to mmap() memory below 2Gb``\ ”. Consider recompiling
        the objects with ``-fPIC -fexternal-dynamic-refs`` and using the
        ``-xp`` flag instead. If you need to use this option to get GHCi
        working on your machine, please file a bug.

    On 64-bit machines, the RTS needs to allocate memory in the low 2Gb
    of the address space. Support for this across different operating
    systems is patchy, and sometimes fails. This option is there to give
    the RTS a hint about where it should be able to allocate memory in
    the low 2Gb of the address space. For example,
    ``+RTS -xm20000000 -RTS`` would hint that the RTS should allocate
    starting at the 0.5Gb mark. The default is to use the OS's built-in
    support for allocating memory in the low 2Gb if available (e.g.
    ``mmap`` with ``MAP_32BIT`` on Linux), or otherwise ``-xm40000000``.

.. rts-flag:: -xq ⟨size⟩

    :default: 100k

    This option relates to allocation limits; for more about this see
    :base-ref:`GHC.Conc.enableAllocationLimit`.
    When a thread hits its allocation limit, the RTS throws an exception
    to the thread, and the thread gets an additional quota of allocation
    before the exception is raised again, so that the
    thread can execute its exception handlers. The ``-xq`` option controls the
    size of this additional quota.

.. _rts-options-gc:

RTS options to control the garbage collector
--------------------------------------------

.. index::
   single: garbage collector; options
   single: RTS options; garbage collection

There are several options to give you precise control over garbage
collection. Hopefully, you won't need any of these in normal operation,
but there are several things that can be tweaked for maximum
performance.

.. rts-flag:: --copying-gc

    :default: on
    :since: 8.10.2
    :reverse: --nonmoving-gc

    Uses the generational copying garbage collector for all generations.
    This is the default.

.. rts-flag:: --nonmoving-gc

    :default: off
    :since: 8.10.1
    :reverse: --copying-gc

    .. index::
       single: concurrent mark and sweep

    Enable the concurrent mark-and-sweep garbage collector for old generation
    collections. Typically GHC uses a stop-the-world copying garbage collector
    for all generations. This can cause long pauses in execution during major
    garbage collections. :rts-flag:`--nonmoving-gc` enables the use of a
    concurrent mark-and-sweep garbage collector for oldest generation
    collections. Under this collection strategy oldest-generation garbage
    collection can proceed concurrently with mutation.

    Note that :rts-flag:`--nonmoving-gc` cannot be used with ``-G1``,
    :rts-flag:`profiling <-hc>`, or :rts-flag:`-c`.

.. rts-flag:: -xn

    :default: off
    :since: 8.10.1

    An alias for :rts-flag:`--nonmoving-gc`.

.. rts-flag:: -A ⟨size⟩

    :default: 1MB

    .. index::
       single: allocation area, size

    Set the allocation area size used by the garbage
    collector. The allocation area (actually generation 0 step 0) is
    fixed in size and is never resized (unless you use :rts-flag:`-H [⟨size⟩]`, below).

    Increasing the allocation area size may or may not give better
    performance (a bigger allocation area means worse cache behaviour
    but fewer garbage collections and less promotion).

    With only 1 generation (e.g. ``-G1``, see :rts-flag:`-G ⟨generations⟩`) the
    ``-A`` option specifies the minimum allocation area, since the actual size
    of the allocation area will be resized according to the amount of data in
    the heap (see :rts-flag:`-F ⟨factor⟩`, below).

.. rts-flag:: -AL ⟨size⟩

    :default: :rts-flag:`-A <-A ⟨size⟩>` value
    :since: 8.2.1

    .. index::
       single: allocation area for large objects, size

    Sets the limit on the total size of "large objects" (objects
    larger than about 3KB) that can be allocated before a GC is
    triggered. By default this limit is the same as the
    :rts-flag:`-A <-A ⟨size⟩>` value.

    Large objects are not allocated from the normal allocation area
    set by the ``-A`` flag, which is why there is a separate limit for
    these. Large objects tend to be much rarer than small objects, so
    most programs hit the ``-A`` limit before the ``-AL`` limit. However,
    the ``-A`` limit is per-capability, whereas the ``-AL`` limit is global,
    so as ``-N`` gets larger it becomes more likely that we hit the
    ``-AL`` limit first. To counteract this, it might be necessary to
    use a larger ``-AL`` limit when using a large ``-N``.

    To see whether you're making good use of all the memory reserved
    for the allocation area (``-A`` times ``-N``), look at the output of
    ``+RTS -S`` and check whether the amount of memory allocated between
    GCs is equal to ``-A`` times ``-N``. If not, there are two possible
    remedies: use ``-n`` to set a nursery chunk size, or use ``-AL`` to
    increase the limit for large objects.

.. rts-flag:: -O ⟨size⟩

    :default: 1m

    .. index::
       single: old generation, size

    Set the minimum size of the old generation. The old generation is collected
    whenever it grows to this size or the value of the :rts-flag:`-F ⟨factor⟩`
    option multiplied by the size of the live data at the previous major
    collection, whichever is larger.

.. rts-flag:: -n ⟨size⟩

    :default: 4m with :rts-flag:`-A16m <-A ⟨size⟩>` or larger, otherwise 0.

    .. index::
       single: allocation area, chunk size

    [Example: ``-n4m``] When set to a non-zero value, this
    option divides the allocation area (``-A`` value) into chunks of the
    specified size. During execution, when a processor exhausts its
    current chunk, it is given another chunk from the pool until the
    pool is exhausted, at which point a collection is triggered.

    This option is only useful when running in parallel (``-N2`` or
    greater). It allows the processor cores to make better use of the
    available allocation area, even when cores are allocating at
    different rates. Without ``-n``, each core gets a fixed-size
    allocation area specified by ``-A``, and the first core to
    exhaust its allocation area triggers a GC across all the cores. This
    can result in a collection happening when the allocation areas of
    some cores are only partially full, so the purpose of ``-n`` is
    to allow cores that are allocating faster to get more of the
    allocation area. This means less frequent GC, leading to a lower GC
    overhead for the same heap size.

    This is particularly useful in conjunction with larger ``-A``
    values, for example ``-A64m -n4m`` is a useful combination on larger core
    counts (8+).

.. rts-flag:: -c

    .. index::
       single: garbage collection; compacting
       single: compacting garbage collection

    Use a compacting algorithm for collecting the oldest generation. By
    default, the oldest generation is collected using a copying
    algorithm; this option causes it to be compacted in-place instead.
    The compaction algorithm is slower than the copying algorithm, but
    the savings in memory use can be considerable.

    For a given heap size (using the :rts-flag:`-H [⟨size⟩]` option),
    compaction can in fact reduce the GC cost by allowing fewer GCs to be
    performed. This is more likely when the ratio of live data to heap size is
    high, say greater than 30%.

    .. note::
       Compaction doesn't currently work when a single generation is
       requested using the ``-G1`` option.

.. rts-flag:: -c ⟨n⟩

    :default: 30

    Automatically enable compacting collection when the live data exceeds ⟨n⟩%
    of the maximum heap size (see the :rts-flag:`-M ⟨size⟩` option). Note that
    the maximum heap size is unlimited by default, so this option has no effect
    unless the maximum heap size is set with :rts-flag:`-M ⟨size⟩`.

.. rts-flag:: -F ⟨factor⟩

    :default: 2

    .. index::
       single: heap size, factor

    This option controls the amount of memory reserved for
    the older generations (and in the case of a two space collector the
    size of the allocation area) as a factor of the amount of live data.
    For example, if there was 2M of live data in the oldest generation
    when we last collected it, then by default we'll wait until it grows
    to 4M before collecting it again.

    The default seems to work well here. If you have plenty of memory, it is
    usually better to use ``-H ⟨size⟩`` (see :rts-flag:`-H [⟨size⟩]`) than to
    increase :rts-flag:`-F ⟨factor⟩`.
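
    As a rough illustration of this arithmetic, the following Haskell sketch
    computes the size at which the old generation would next be collected. The
    name ``nextMajorGcThreshold`` and its parameters are hypothetical and not
    part of any GHC API; the model only combines the :rts-flag:`-O ⟨size⟩`
    minimum with the ``-F`` factor and ignores everything else the collector
    takes into account (for example the :rts-flag:`-M ⟨size⟩` limit).

    .. code-block:: haskell

        -- | Size (in bytes) the old generation may reach before the next
        -- major collection, in a simplified model: the larger of the -O
        -- minimum and the -F factor times the live data at the last major GC.
        nextMajorGcThreshold
          :: Double   -- ^ the -F factor
          -> Integer  -- ^ the -O minimum old-generation size, in bytes
          -> Integer  -- ^ live bytes at the previous major collection
          -> Integer
        nextMajorGcThreshold factor minOldGen liveBytes =
          max minOldGen (ceiling (factor * fromIntegral liveBytes))

        main :: IO ()
        main = do
          let mib = 1024 * 1024
          -- 2M of live data with -F2 and -O1m gives a threshold of 4M.
          print (nextMajorGcThreshold 2 mib (2 * mib) `div` mib)  -- prints 4

    With very little live data the ``-O`` minimum dominates, which is why a
    small program may not see any effect from changing ``-F`` alone.
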

    The :rts-flag:`-F ⟨factor⟩` setting will be automatically reduced by the garbage
    collector when the maximum heap size (the :rts-flag:`-M ⟨size⟩` setting) is approaching.

.. rts-flag:: -G ⟨generations⟩

    :default: 2

    .. index::
       single: generations, number of

    Set the number of generations used by the garbage
    collector. The default of 2 seems to be good, but the garbage
    collector can support any number of generations. Anything larger
    than about 4 is probably not a good idea unless your program runs
    for a *long* time, because the oldest generation will hardly ever
    get collected.

    Specifying 1 generation with ``+RTS -G1`` gives you a simple 2-space
    collector, as you would expect. In a 2-space collector, the
    :rts-flag:`-A ⟨size⟩` option specifies the *minimum* allocation area size, since the
    allocation area will grow with the amount of live data in the heap. In a
    multi-generational collector the allocation area is a fixed size (unless
    you use the :rts-flag:`-H [⟨size⟩]` option).

.. rts-flag:: -qg ⟨gen⟩

    :default: 0
    :since: 6.12.1

    Use parallel GC in generation ⟨gen⟩ and higher. Omitting ⟨gen⟩ turns off the
    parallel GC completely, reverting to sequential GC.

    The default parallel GC settings are usually suitable for parallel programs
    (i.e. those using :base-ref:`GHC.Conc.par`, Strategies, or with
    multiple threads). However, it is sometimes beneficial to enable the
    parallel GC for a single-threaded sequential program too, especially if the
    program has a large amount of heap data and GC is a significant fraction of
    runtime. To use the parallel GC in a sequential program, enable the parallel
    runtime with a suitable :rts-flag:`-N ⟨x⟩` option, and additionally it might
    be beneficial to restrict parallel GC to the old generation with ``-qg1``.

.. rts-flag:: -qb ⟨gen⟩

    :default: 1 for :rts-flag:`-A <-A ⟨size⟩>` < 32M, 0 otherwise
    :since: 6.12.1

    Use load-balancing in the parallel GC in generation ⟨gen⟩ and higher.
    Omitting ⟨gen⟩ disables load-balancing entirely.

    Load-balancing shares out the work of GC between the available
    cores. This is a good idea when the heap is large and we need to
    parallelise the GC work; however, it is also pessimal for the short
    young-generation collections in a parallel program, because it can
    harm locality by moving data from the cache of the CPU where it is
    being used to the cache of another CPU. Hence the default is to do
    load-balancing only in the old generation. In fact, for a parallel
    program it is sometimes beneficial to disable load-balancing
    entirely with ``-qb``.

.. rts-flag:: -qn ⟨x⟩

    :default: the value of :rts-flag:`-N <-N ⟨x⟩>` or the number of CPU cores,
              whichever is smaller.
    :since: 8.2.1

    .. index::
       single: GC threads, setting the number of

    By default, all of the capabilities participate in parallel
    garbage collection. If we want to use a very large ``-N`` value,
    however, this can reduce the performance of the GC. For this
    reason, the ``-qn`` flag can be used to specify a lower number for
    the threads that should participate in GC. During GC, if there
    are more than this number of workers active, some of them will
    sleep for the duration of the GC.

    The ``-qn`` flag may be useful when running with a large ``-A`` value
    (so that GC is infrequent), and a large ``-N`` value (so as to make
    use of hyperthreaded cores, for example). For example, on a
    24-core machine with 2 hyperthreads per core, we might use
    ``-N48 -qn24 -A128m`` to specify that the mutator should use
    hyperthreads but the GC should only use real cores. Note that
    this configuration would use 6GB for the allocation area.

.. rts-flag:: -H [⟨size⟩]

    :default: 0

    .. index::
       single: heap size, suggested

    This option provides a "suggested heap size" for the garbage collector.
    Think of ``-Hsize`` as a variable :rts-flag:`-A ⟨size⟩` option. It says: I
    want to use at least ⟨size⟩ bytes, so use whatever is left over to increase
    the ``-A`` value.

    This option does not put a *limit* on the heap size: the heap may
    grow beyond the given size as usual.

    If ⟨size⟩ is omitted, then the garbage collector will take the size
    of the heap at the previous GC as the ⟨size⟩. This has the effect of
    allowing for a larger ``-A`` value but without increasing the
    overall memory requirements of the program. It can be useful when
    the default small ``-A`` value is suboptimal, as it can be in
    programs that create large amounts of long-lived data.

.. rts-flag:: -I ⟨seconds⟩

    :default: 0.3 seconds in the threaded runtime, 0 in the non-threaded runtime

    .. index::
       single: idle GC

    In the threaded and SMP versions of the RTS (see
    :ghc-flag:`-threaded`, :ref:`options-linker`), a major GC is automatically
    performed if the runtime has been idle (no Haskell computation has
    been running) for a period of time. The amount of idle time which
    must pass before a GC is performed is set by the ``-I ⟨seconds⟩``
    option. Specifying ``-I0`` disables the idle GC.

    For an interactive application, it is probably a good idea to use
    the idle GC, because this will allow finalizers to run and
    deadlocked threads to be detected in the idle time when no Haskell
    computation is happening. Also, it will mean that a GC is less
    likely to happen when the application is busy, and so responsiveness
    may be improved. However, if the amount of live data in the heap is
    particularly large, then the idle GC can cause a significant delay,
    and too small an interval could adversely affect interactive
    responsiveness.

    This is an experimental feature; please let us know if it causes
    problems and/or could benefit from further tuning.

.. rts-flag:: -Iw ⟨seconds⟩

    :default: 0 seconds

    .. index::
       single: idle GC

    By default, if idle GC is enabled in the threaded runtime, a major
    GC will be performed every time the process goes idle for a
    sufficiently long duration (see :rts-flag:`-I ⟨seconds⟩`). For
    large server processes accepting regular but infrequent requests
    (e.g., once per second), an expensive, major GC may run after
    every request. As an alternative to shutting off idle GC entirely
    (with ``-I0``), a minimum wait time between idle GCs can be
    specified with this flag. For example, ``-Iw60`` will ensure that
    an idle GC runs at most once per minute.

    This is an experimental feature; please let us know if it causes
    problems and/or could benefit from further tuning.

.. rts-flag:: -ki ⟨size⟩

    :default: 1k

    .. index::
       single: stack, initial size

    Set the initial stack size for new threads.

    Thread stacks (including the main thread's stack) live on the heap.
    As the stack grows, new stack chunks are added as required; if the
    stack shrinks again, these extra stack chunks are reclaimed by the
    garbage collector. The default initial stack size is deliberately
    small, in order to keep the time and space overhead for thread
    creation to a minimum, and to make it practical to spawn threads for
    even tiny pieces of work.

    .. note::
        This flag used to be simply ``-k``, but was renamed to ``-ki`` in
        GHC 7.2.1. The old name is still accepted for backwards
        compatibility, but that may be removed in a future version.

.. rts-flag:: -kc ⟨size⟩

    :default: 32k

    .. index::
       single: stack; chunk size

    Set the size of "stack chunks". When a thread's current stack overflows, a
    new stack chunk is created and added to the thread's stack, until the limit
    set by :rts-flag:`-K ⟨size⟩` is reached.

    The advantage of smaller stack chunks is that the garbage collector can
    avoid traversing stack chunks if they are known to be unmodified since the
    last collection, so reducing the chunk size means that the garbage
    collector can identify more stack as unmodified, and the GC overhead might
    be reduced. On the other hand, making stack chunks too small adds some
    overhead as there will be more overflow/underflow between chunks. The
    default setting of 32k appears to be a reasonable compromise in most cases.

.. rts-flag:: -kb ⟨size⟩

    :default: 1k

    .. index::
       single: stack; chunk buffer size

    Sets the stack chunk buffer size. When a stack chunk
    overflows and a new stack chunk is created, some of the data from
    the previous stack chunk is moved into the new chunk, to avoid an
    immediate underflow and repeated overflow/underflow at the boundary.
    The amount of stack moved is set by the ``-kb`` option.

    Note that to avoid wasting space, this value should typically be less than
    10% of the size of a stack chunk (:rts-flag:`-kc ⟨size⟩`), because in a
    chain of stack chunks, each chunk will have a gap of unused space of this
    size.

.. rts-flag:: -K ⟨size⟩

    :default: 80% of physical memory

    .. index::
       single: stack, maximum size

    Set the maximum stack size for
    an individual thread to ⟨size⟩ bytes. If the thread attempts to
    exceed this limit, it will be sent the ``StackOverflow`` exception.
    The limit can be disabled entirely by specifying a size of zero.
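
    The following Haskell program is a minimal sketch of how a thread might
    observe this exception. The helper names ``deepSum`` and ``describe`` are
    made up for the example. It assumes the program is linked so that RTS
    options are accepted (for instance with ``-rtsopts``) and is run with a
    deliberately small limit such as ``+RTS -K1m``; with the default limit the
    sum is expected to complete normally instead.

    .. code-block:: haskell

        import Control.Exception (AsyncException (StackOverflow), evaluate, try)

        -- | A deliberately non-tail-recursive sum, so that evaluating it
        -- consumes stack in proportion to its argument.
        deepSum :: Integer -> Integer
        deepSum n = foldr (+) 0 [1 .. n]

        -- | Turn the outcome of the guarded evaluation into a message.
        describe :: Either AsyncException Integer -> String
        describe (Left StackOverflow) = "stack overflow caught"
        describe (Left e)             = "other asynchronous exception: " ++ show e
        describe (Right n)            = "result: " ++ show n

        main :: IO ()
        main = do
          -- 'try' catches the StackOverflow exception sent by the RTS when
          -- the thread's stack would exceed the -K limit.
          outcome <- try (evaluate (deepSum 1000000))
          putStrLn (describe outcome)
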
768 769 This option is there mainly to stop the program eating up all the 770 available memory in the machine if it gets into an infinite loop. 771 772.. rts-flag:: -m ⟨n⟩ 773 774 :default: 3% 775 776 .. index:: 777 single: heap, minimum free 778 779 Minimum % ⟨n⟩ of heap which must be available for allocation. 780 781.. rts-flag:: -M ⟨size⟩ 782 783 :default: unlimited 784 785 .. index:: 786 single: heap size, maximum 787 788 Set the maximum heap size to ⟨size⟩ bytes. The 789 heap normally grows and shrinks according to the memory requirements 790 of the program. The only reason for having this option is to stop 791 the heap growing without bound and filling up all the available swap 792 space, which at the least will result in the program being summarily 793 killed by the operating system. 794 795 The maximum heap size also affects other garbage collection 796 parameters: when the amount of live data in the heap exceeds a 797 certain fraction of the maximum heap size, compacting collection 798 will be automatically enabled for the oldest generation, and the 799 ``-F`` parameter will be reduced in order to avoid exceeding the 800 maximum heap size. 801 802.. rts-flag:: -Mgrace=⟨size⟩ 803 804 :default: 1M 805 806 .. index:: 807 single: heap size, grace 808 809 If the program's heap exceeds the value set by :rts-flag:`-M ⟨size⟩`, the 810 RTS throws an exception to the program, and the program gets an 811 additional quota of allocation before the exception is raised 812 again, the idea being so that the program can execute its 813 exception handlers. ``-Mgrace=`` controls the size of this 814 additional quota. 815 816.. rts-flag:: --numa 817 --numa=<mask> 818 819 .. index:: 820 single: NUMA, enabling in the runtime 821 822 Enable NUMA-aware memory allocation in the runtime (only available 823 with ``-threaded``, and only on Linux and Windows currently). 

    Background: some systems have a Non-Uniform Memory Architecture,
    whereby main memory is split into banks which are "local" to
    specific CPU cores. Accessing local memory is faster than
    accessing remote memory. The OS provides APIs for allocating
    local memory and binding threads to particular CPU cores, so that
    we can ensure certain memory accesses are using local memory.

    The ``--numa`` option tells the RTS to tune its memory usage to
    maximize local memory accesses. In particular, the RTS will:

    - Determine the number of NUMA nodes (N) by querying the OS.
    - Manage separate memory pools for each node.
    - Map capabilities to NUMA nodes. Capability C is mapped to
      NUMA node C mod N.
    - Bind worker threads on a capability to the appropriate node.
    - Allocate the nursery from node-local memory.
    - Perform other memory allocation, including in the GC, from
      node-local memory.
    - Prefer, when load-balancing, to migrate threads to another
      Capability on the same node.

    The ``--numa`` flag is typically beneficial when a program is
    using all cores of a large multi-core NUMA system, with a large
    allocation area (``-A``). All memory accesses to the allocation
    area will go to local memory, which can save a significant amount
    of remote memory access. A runtime speedup on the order of 10%
    is typical, but can vary a lot depending on the hardware and the
    memory behaviour of the program.

    Note that the RTS will not set CPU affinity for bound threads and
    threads entering Haskell from C/C++, so if your program uses bound
    threads you should ensure that each bound thread calls the RTS API
    ``rts_setInCallCapability(c,1)`` from C/C++ before calling into
    Haskell. Otherwise there could be a mismatch between the CPU that
    the thread is running on and the memory it is using while running
    Haskell code, which will negate any benefits of ``--numa``.

    If given an explicit <mask>, the <mask> is interpreted as a bitmap
    that indicates the NUMA nodes on which to run the program. For
    example, ``--numa=3`` would run the program on NUMA nodes 0 and 1.

.. rts-flag:: --long-gc-sync
              --long-gc-sync=<seconds>

    .. index::
       single: GC sync time, measuring

    When a GC starts, all the running mutator threads have to stop and
    synchronise. The period between when the GC is initiated and all
    the mutator threads are stopped is called the GC synchronisation
    phase. If this phase is taking a long time (longer than 1ms is
    considered long), then it can have a severe impact on overall
    throughput.

    A long GC sync can be caused by a mutator thread that is inside an
    ``unsafe`` FFI call, or running in a loop that doesn't allocate
    memory and so doesn't yield. To fix the former, make the call
    ``safe``, and to fix the latter, either avoid calling the code in
    question or compile it with :ghc-flag:`-fno-omit-yields`.

    By default, the flag will cause a warning to be emitted to stderr
    when the sync time exceeds the specified time. This behaviour can
    be overridden, however: the ``longGCSync()`` hook is called when
    the sync time is exceeded during the sync period, and the
    ``longGCSyncEnd()`` hook at the end. Both of these hooks can be
    overridden in the ``RtsConfig`` when the runtime is started with
    ``hs_init_ghc()``. The default implementations of these hooks
    (``LongGCSync()`` and ``LongGCSyncEnd()`` respectively) print
    warnings to stderr.

    One way to use this flag is to set a breakpoint on
    ``LongGCSync()`` in the debugger, and find the thread that is
    delaying the sync. You probably want to use :ghc-flag:`-g` to
    provide more info to the debugger.

    The GC sync time, along with other GC stats, is available by
    calling the ``getRTSStats()`` function from C, or
    ``GHC.Stats.getRTSStats`` from Haskell.
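
    As a minimal sketch of the Haskell route, the program below reads the
    sync time of the most recent GC through ``GHC.Stats``. It assumes the
    program is run with ``+RTS -T -RTS`` (and was linked so that RTS options
    are accepted); ``nsToMs`` is a helper introduced here purely for
    illustration.

    .. code-block:: haskell

        import Data.Int (Int64)
        import GHC.Stats
          (GCDetails (..), RTSStats (..), getRTSStats, getRTSStatsEnabled)

        -- | Illustrative helper: convert nanoseconds to milliseconds.
        nsToMs :: Int64 -> Double
        nsToMs ns = fromIntegral ns / 1e6

        main :: IO ()
        main = do
          -- getRTSStats fails unless statistics collection is enabled,
          -- so check first.
          enabled <- getRTSStatsEnabled
          if not enabled
            then putStrLn "RTS statistics are off; rerun with +RTS -T -RTS"
            else do
              stats <- getRTSStats
              let lastGC = gc stats -- details of the most recent GC
              putStrLn ("GCs so far: " ++ show (gcs stats))
              putStrLn ("sync time of last GC (ms): "
                        ++ show (nsToMs (gcdetails_sync_elapsed_ns lastGC)))

    If no collection has happened yet, the reported details are simply zero.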

.. _rts-options-statistics:

RTS options to produce runtime statistics
-----------------------------------------

.. rts-flag:: -T
              -t [⟨file⟩]
              -s [⟨file⟩]
              -S [⟨file⟩]
              --machine-readable
              --internal-counters

    These options produce runtime-system statistics, such as the amount
    of time spent executing the program and in the garbage collector,
    the amount of memory allocated, the maximum size of the heap, and so
    on. The four variants give different levels of detail: ``-T``
    collects the data but produces no output, ``-t`` produces a single
    line of output in the same format as GHC's ``-Rghc-timing`` option,
    ``-s`` produces a more detailed summary at the end of the program,
    and ``-S`` additionally produces information about each and every
    garbage collection. Passing ``--internal-counters`` to a threaded
    runtime will cause a detailed summary to include various internal
    counts accumulated during the run; note that these are unspecified
    and may change between releases.

    The output is placed in ⟨file⟩. If ⟨file⟩ is omitted, then the
    output is sent to ``stderr``.

    If you use the ``-T`` flag, you should access the statistics
    using :base-ref:`GHC.Stats.`.

    If you use the ``-t`` flag then, when your program finishes, you
    will see something like this:

    .. code-block:: none

        <<ghc: 36169392 bytes, 69 GCs, 603392/1065272 avg/max bytes residency (2 samples), 3M in use, 0.00 INIT (0.00 elapsed), 0.02 MUT (0.02 elapsed), 0.07 GC (0.07 elapsed) :ghc>>

    This tells you:

    - The total number of bytes allocated by the program over the whole
      run.

    - The total number of garbage collections performed.

    - The average and maximum "residency", which is the amount of live
      data in bytes.
      The runtime can only determine the amount of live
      data during a major GC, which is why the number of samples
      corresponds to the number of major GCs (and is usually relatively
      small). To get a better picture of the heap profile of your
      program, use the :rts-flag:`-hT` RTS option (:ref:`rts-profiling`).

    - The peak memory the RTS has allocated from the OS.

    - The amount of CPU time and elapsed wall clock time while
      initialising the runtime system (INIT), running the program
      itself (MUT, the mutator), and garbage collecting (GC).

    You can also get this in a more future-proof, machine readable
    format, with ``-t --machine-readable``:

    ::

         [("bytes allocated", "36169392")
         ,("num_GCs", "69")
         ,("average_bytes_used", "603392")
         ,("max_bytes_used", "1065272")
         ,("num_byte_usage_samples", "2")
         ,("peak_megabytes_allocated", "3")
         ,("init_cpu_seconds", "0.00")
         ,("init_wall_seconds", "0.00")
         ,("mutator_cpu_seconds", "0.02")
         ,("mutator_wall_seconds", "0.02")
         ,("GC_cpu_seconds", "0.07")
         ,("GC_wall_seconds", "0.07")
         ]

    If you use the ``-s`` flag then, when your program finishes, you
    will see something like this (the exact details will vary depending
    on what sort of RTS you have, e.g. you will only see profiling data
    if your RTS is compiled for profiling):

    .. code-block:: none

              36,169,392 bytes allocated in the heap
               4,057,632 bytes copied during GC
               1,065,272 bytes maximum residency (2 sample(s))
                  54,312 bytes maximum slop
                       3 MB total memory in use (0 MB lost due to fragmentation)

          Generation 0:    67 collections,     0 parallel,  0.04s,  0.03s elapsed
          Generation 1:     2 collections,     0 parallel,  0.03s,  0.04s elapsed

          SPARKS: 359207 (557 converted, 149591 pruned)

          INIT  time    0.00s  (  0.00s elapsed)
          MUT   time    0.01s  (  0.02s elapsed)
          GC    time    0.07s  (  0.07s elapsed)
          EXIT  time    0.00s  (  0.00s elapsed)
          Total time    0.08s  (  0.09s elapsed)

          %GC time      89.5%  (75.3% elapsed)

          Alloc rate    4,520,608,923 bytes per MUT second

          Productivity  10.5% of total user, 9.1% of total elapsed

    - The "bytes allocated in the heap" is the total bytes allocated by
      the program over the whole run.

    - GHC uses a copying garbage collector by default. "bytes copied
      during GC" tells you how many bytes it had to copy during garbage
      collection.

    - The maximum space actually used by your program is the "bytes
      maximum residency" figure. This is only checked during major
      garbage collections, so it is only an approximation; the number
      of samples tells you how many times it is checked.

    - The "bytes maximum slop" tells you the most space that is ever
      wasted due to the way GHC allocates memory in blocks. Slop is
      memory at the end of a block that was wasted. There's no way to
      control this; we just like to see how much memory is being lost
      this way.

    - The "total memory in use" tells you the peak memory the RTS has
      allocated from the OS.

    - Next there is information about the garbage collections done.
      For each generation it says how many garbage collections were done,
      how many of those collections were done in parallel, the total
      CPU time used for garbage collecting that generation, and the
      total wall clock time elapsed while garbage collecting that
      generation.

    - The ``SPARKS`` statistic refers to the use of
      ``Control.Parallel.par`` and related functionality in the
      program. Each spark represents a call to ``par``; a spark is
      "converted" when it is executed in parallel; and a spark is
      "pruned" when it is found to be already evaluated and is
      discarded from the pool by the garbage collector. Any remaining
      sparks are discarded at the end of execution, so "converted" plus
      "pruned" does not necessarily add up to the total.

    - Next there is the CPU time and wall clock time elapsed broken
      down by what the runtime system was doing at the time. INIT is
      the runtime system initialisation. MUT is the mutator time, i.e.
      the time spent actually running your code. GC is the time spent
      doing garbage collection. RP is the time spent doing retainer
      profiling. PROF is the time spent doing other profiling. EXIT is
      the runtime system shutdown time. And finally, Total is, of
      course, the total.

      %GC time tells you what percentage GC is of Total. "Alloc rate"
      tells you the "bytes allocated in the heap" divided by the MUT
      CPU time. "Productivity" tells you what percentage of the Total
      CPU and wall clock elapsed times are spent in the mutator (MUT).

    The ``-S`` flag, as well as giving the same output as the ``-s``
    flag, prints information about each GC as it happens:

    .. code-block:: none

            Alloc    Copied     Live    GC    GC     TOT     TOT  Page Flts
            bytes     bytes     bytes  user  elap    user    elap
           528496     47728    141512  0.01  0.02    0.02    0.02    0    0  (Gen:  1)
        [...]
           524944    175944   1726384  0.00  0.00    0.08    0.11    0    0  (Gen:  0)

    For each garbage collection, we print:

    - How many bytes we allocated this garbage collection.

    - How many bytes we copied this garbage collection.

    - How many bytes are currently live.

    - How long this garbage collection took (CPU time and elapsed wall
      clock time).

    - How long the program has been running (CPU time and elapsed wall
      clock time).

    - How many page faults occurred this garbage collection.

    - How many page faults occurred since the end of the last garbage
      collection.

    - Which generation is being garbage collected.

RTS options for concurrency and parallelism
-------------------------------------------

The RTS options related to concurrency are described in
:ref:`using-concurrent`, and those for parallelism in
:ref:`parallel-options`.

.. _rts-profiling:

RTS options for profiling
-------------------------

Most profiling runtime options are only available when you compile your
program for profiling (see :ref:`prof-compiler-options`, and
:ref:`rts-options-heap-prof` for the runtime options). However, there is
one profiling option that is available for ordinary non-profiled
executables:

.. rts-flag:: -hT
              -h

    Generates a basic heap profile, in the file :file:`prog.hp`. To produce the
    heap profile graph, use :command:`hp2ps` (see :ref:`hp2ps`). The basic heap
    profile is broken down by data constructor, with other types of closures
    (functions, thunks, etc.) grouped into broad categories (e.g. ``FUN``,
    ``THUNK``). To get a more detailed profile, use the full profiling support
    (:ref:`profiling`). Can be shortened to :rts-flag:`-h`.

    .. note:: The meaning of the shortened :rts-flag:`-h` is dependent on whether
              your program was compiled for profiling.
              (See :ref:`rts-options-heap-prof` for details.)

.. rts-flag:: -L ⟨n⟩

    :default: 25 characters

    Sets the maximum length of the cost-centre names listed in the heap profile.

.. _rts-eventlog:

Tracing
-------

.. index::
   single: tracing
   single: events
   single: eventlog files

When the program is linked with the :ghc-flag:`-eventlog` option
(:ref:`options-linker`), runtime events can be logged in several ways:

- In binary format to a file for later analysis by a variety of tools.
  One such tool is
  `ThreadScope <http://www.haskell.org/haskellwiki/ThreadScope>`__,
  which interprets the event log to produce a visual parallel execution
  profile of the program.

- In binary format to a customized event log writer. This enables live
  analysis of the events while the program is running.

- As text to standard output, for debugging purposes.

.. rts-flag:: -l ⟨flags⟩

    Log events in binary format. Without any ⟨flags⟩ specified, this
    logs a default set of events, suitable for use with tools like ThreadScope.

    By default the events are written to :file:`{program}.eventlog`, though
    the mechanism for writing event log data can be overridden with a custom
    ``EventLogWriter``.

    For some special use cases you may want more control over which
    events are included. The ⟨flags⟩ is a sequence of zero or more
    characters indicating which classes of events to log. Currently
    these are the classes of events that can be enabled/disabled:

    - ``s`` — scheduler events, including Haskell thread creation and start/stop
      events. Enabled by default.

    - ``g`` — GC events, including GC start/stop. Enabled by default.

    - ``n`` — non-moving garbage collector (see :rts-flag:`--nonmoving-gc`)
      events including start and end of the concurrent mark and census
      information to characterise heap fragmentation. Disabled by default.

    - ``p`` — parallel sparks (sampled). Enabled by default.

    - ``f`` — parallel sparks (fully accurate). Disabled by default.

    - ``u`` — user events. These are events emitted from Haskell code using
      functions such as ``Debug.Trace.traceEvent``. Enabled by default.

    You can disable specific classes, or enable/disable all classes at
    once:

    - ``a`` — enable all event classes listed above
    - ``-⟨x⟩`` — disable the given class of events, for any event class listed above
    - ``-a`` — disable all classes

    For example, ``-l-ag`` would disable all event classes (``-a``) except for
    GC events (``g``).

    For spark events there are two modes: sampled and fully accurate.
    There are various events in the life cycle of each spark, usually
    just creating and running, but there are some more exceptional
    possibilities. In the sampled mode the number of occurrences of each
    kind of spark event is sampled at frequent intervals. In the fully
    accurate mode every spark event is logged individually. The latter
    has a higher runtime overhead and is not enabled by default.

    The format of the log file is described in this users guide in
    :ref:`eventlog-encodings`. It can be parsed in Haskell using the
    `ghc-events <http://hackage.haskell.org/package/ghc-events>`__
    library. To dump the contents of a ``.eventlog`` file as text, use
    the tool ``ghc-events show`` that comes with the
    `ghc-events <http://hackage.haskell.org/package/ghc-events>`__
    package.

    Each event is associated with a timestamp which is the number of
    nanoseconds since the start of execution of the running program.
    This is the elapsed time, not the CPU time.

.. rts-flag:: -ol ⟨filename⟩

    :default: :file:`<program>.eventlog`
    :since: 8.8

    Sets the destination for the eventlog produced with the
    :rts-flag:`-l ⟨flags⟩` flag.

.. rts-flag:: -v [⟨flags⟩]

    Log events as text to standard output, instead of to the
    ``.eventlog`` file. The ⟨flags⟩ are the same as for ``-l``, with the
    additional option ``t`` which indicates that each event printed
    should be preceded by a timestamp value (in the binary ``.eventlog``
    file, all events are automatically associated with a timestamp).

The debugging options ``-Dx`` also generate events which are logged
using the tracing framework. By default those events are dumped as text
to stdout (``-Dx`` implies ``-v``), but they may instead be stored in
the binary eventlog file by using the ``-l`` option.

.. _rts-options-debugging:

RTS options for hackers, debuggers, and over-interested souls
-------------------------------------------------------------

.. index::
   single: RTS options, hacking/debugging

These RTS options might be used (a) to avoid a GHC bug, (b) to see
"what's really happening", or (c) because you feel like it. Not
recommended for everyday use!

.. rts-flag:: -B

    Sound the bell at the start of each (major) garbage collection.

    Oddly enough, people really do use this option! Our pal in Durham
    (England), Paul Callaghan, writes: “Some people here use it for a
    variety of purposes—honestly!—e.g., confirmation that the
    code/machine is doing something, infinite loop detection, gauging
    cost of recently added code. Certain people can even tell what stage
    [the program] is in by the beep pattern. But the major use is for
    annoying others in the same office…”

.. rts-flag:: -D ⟨x⟩

    An RTS debugging flag; only available if the program was linked with
    the :ghc-flag:`-debug` option. Various values of ⟨x⟩ are provided to enable
    debug messages and additional runtime sanity checks in different
    subsystems in the RTS, for example ``+RTS -Ds -RTS`` enables debug
    messages from the scheduler. Use ``+RTS -?`` to find out which debug
    flags are supported.

    Full list of currently supported flags:

.. rts-flag::  -Ds  DEBUG: scheduler
.. rts-flag::  -Di  DEBUG: interpreter
.. rts-flag::  -Dw  DEBUG: weak
.. rts-flag::  -DG  DEBUG: gccafs
.. rts-flag::  -Dg  DEBUG: gc
.. rts-flag::  -Db  DEBUG: block
.. rts-flag::  -DS  DEBUG: sanity
.. rts-flag::  -DZ  DEBUG: zero freed memory on GC
.. rts-flag::  -Dt  DEBUG: stable
.. rts-flag::  -Dp  DEBUG: prof
.. rts-flag::  -Da  DEBUG: apply
.. rts-flag::  -Dl  DEBUG: linker
.. rts-flag::  -Dm  DEBUG: stm
.. rts-flag::  -Dz  DEBUG: stack squeezing
.. rts-flag::  -Dc  DEBUG: program coverage
.. rts-flag::  -Dr  DEBUG: sparks
.. rts-flag::  -DC  DEBUG: compact

    Debug messages will be sent to the binary event log file instead of
    stdout if the :rts-flag:`-l ⟨flags⟩` option is added. This might be useful
    for reducing the overhead of debug tracing.

    To figure out what exactly they do, the least bad way is to grep the
    ``rts/`` directory in the GHC source code for macros like
    ``DEBUG(scheduler`` or ``DEBUG_scheduler``.

.. rts-flag:: -r ⟨file⟩

    .. index::
       single: ticky ticky profiling
       single: profiling; ticky ticky

    Produce "ticky-ticky" statistics at the end of the program run (only
    available if the program was linked with :ghc-flag:`-debug`). The ⟨file⟩
    business works just like on the :rts-flag:`-S [⟨file⟩]` RTS option, above.

    For more information on ticky-ticky profiling, see
    :ref:`ticky-ticky`.

.. rts-flag:: -xc

    (Only available when the program is compiled for profiling.) When an
    exception is raised in the program, this option causes a stack trace
    to be dumped to ``stderr``.

    This can be particularly useful for debugging: if your program is
    complaining about a ``head []`` error and you haven't got a clue
    which bit of code is causing it, compiling with
    ``-prof -fprof-auto`` (see :ghc-flag:`-prof`) and running with
    ``+RTS -xc -RTS`` will tell you exactly the call stack at the point
    the error was raised.

    The output contains one report for each exception raised in the
    program (the program might raise and catch several exceptions during
    its execution), where each report looks something like this:

    .. code-block:: none

        *** Exception raised (reporting due to +RTS -xc), stack trace:
          GHC.List.CAF
          --> evaluated by: Main.polynomial.table_search,
          called from Main.polynomial.theta_index,
          called from Main.polynomial,
          called from Main.zonal_pressure,
          called from Main.make_pressure.p,
          called from Main.make_pressure,
          called from Main.compute_initial_state.p,
          called from Main.compute_initial_state,
          called from Main.CAF
          ...

    The stack trace may often begin with something uninformative like
    ``GHC.List.CAF``; this is an artifact of GHC's optimiser, which
    lifts out exceptions to the top-level where the profiling system
    assigns them to the cost centre "CAF". However, ``+RTS -xc`` doesn't
    just print the current stack, it looks deeper and reports the stack
    at the time the CAF was evaluated, and it may report further stacks
    until a non-CAF stack is found. In the example above, the next stack
    (after ``--> evaluated by``) contains plenty of information about
    what the program was doing when it evaluated ``head []``.
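
    To try this out, a small test program along the following lines may
    help. It is only a sketch: ``firstReading`` is a hypothetical function
    invented for the example, and the exact trace printed will depend on
    the GHC version and optimisation level. Note that the program catches
    the exception itself; when compiled with ``-prof -fprof-auto`` and run
    with ``+RTS -xc -RTS``, a report should still be produced, since
    reports are generated when an exception is raised rather than when it
    goes unhandled.

    .. code-block:: haskell

        import Control.Exception (SomeException, evaluate, try)

        -- Hypothetical function that fails on an empty list.
        firstReading :: [Int] -> Int
        firstReading xs = head xs

        main :: IO ()
        main = do
          -- Force the call so that the exception is raised here, then
          -- catch it.
          r <- try (evaluate (firstReading [])) :: IO (Either SomeException Int)
          case r of
            Left _  -> putStrLn "caught an exception"
            Right n -> print n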

    Implementation details aside, the function names in the stack should
    hopefully give you enough clues to track down the bug.

    See also the function ``traceStack`` in the module ``Debug.Trace``
    for another way to view call stacks.

.. rts-flag:: -Z

    Turn *off* update frame squeezing on context switch.
    (There's no particularly good reason to turn it off, except to
    ensure the accuracy of certain data collected regarding thunk entry
    counts.)

.. _ghc-info:

Getting information about the RTS
---------------------------------

.. index::
   single: RTS

.. rts-flag:: --info

    It is possible to ask the RTS to give some information about itself. To
    do this, use the :rts-flag:`--info` flag, e.g.

    .. code-block:: none

        $ ./a.out +RTS --info
         [("GHC RTS", "YES")
         ,("GHC version", "6.7")
         ,("RTS way", "rts_p")
         ,("Host platform", "x86_64-unknown-linux")
         ,("Host architecture", "x86_64")
         ,("Host OS", "linux")
         ,("Host vendor", "unknown")
         ,("Build platform", "x86_64-unknown-linux")
         ,("Build architecture", "x86_64")
         ,("Build OS", "linux")
         ,("Build vendor", "unknown")
         ,("Target platform", "x86_64-unknown-linux")
         ,("Target architecture", "x86_64")
         ,("Target OS", "linux")
         ,("Target vendor", "unknown")
         ,("Word size", "64")
         ,("Compiler unregisterised", "NO")
         ,("Tables next to code", "YES")
         ,("Flag -with-rtsopts", "")
         ]

    The information is formatted such that it can be read as a value of type
    ``[(String, String)]``. Currently the following fields are present:

    ``GHC RTS``
        Is this program linked against the GHC RTS? (always "YES").

    ``GHC version``
        The version of GHC used to compile this program.

    ``RTS way``
        The variant (“way”) of the runtime.
        The most common values are
        ``rts_v`` (vanilla), ``rts_thr`` (threaded runtime, i.e. linked
        using the :ghc-flag:`-threaded` option) and ``rts_p`` (profiling runtime,
        i.e. linked using the :ghc-flag:`-prof` option). Other variants include
        ``debug`` (linked using :ghc-flag:`-debug`), and ``dyn`` (the RTS is linked
        in dynamically, i.e. a shared library, rather than statically linked
        into the executable itself). These can be combined, e.g. you might
        have ``rts_thr_debug_p``.

    ``Target platform``\ ``Target architecture``\ ``Target OS``\ ``Target vendor``
        These describe the platform the program is compiled to run on.

    ``Build platform``\ ``Build architecture``\ ``Build OS``\ ``Build vendor``
        These describe the platform on which the program was built. (That is,
        the target platform of GHC itself.) Ordinarily this is identical to the
        target platform. (It could potentially be different if
        cross-compiling.)

    ``Host platform``\ ``Host architecture``\ ``Host OS``\ ``Host vendor``
        These describe the platform on which GHC itself was compiled. Again,
        this would normally be identical to the build and target platforms.

    ``Word size``
        Either ``"32"`` or ``"64"``, reflecting the word size of the target
        platform.

    ``Compiler unregisterised``
        Was this program compiled with an :ref:`"unregistered" <unreg>`
        version of GHC? (I.e., a version of GHC that has no
        platform-specific optimisations compiled in, usually because this is
        a currently unsupported platform.) This value will usually be no,
        unless you're using an experimental build of GHC.

    ``Tables next to code``
        Putting info tables directly next to entry code is a useful
        performance optimisation that is not available on all platforms.
        This field tells you whether the program has been compiled with this
        optimisation. (Usually yes, except on unusual platforms.)

    ``Flag -with-rtsopts``
        The value of the GHC flag :ghc-flag:`-with-rtsopts=⟨opts⟩` at compile/link time.
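
    Because the output has the shape of a Haskell ``[(String, String)]``
    literal, it can be parsed with the standard ``Read`` machinery. The
    following is a minimal sketch rather than a supported interface:
    ``parseRtsInfo`` and ``sampleInfo`` are names made up for this example,
    and the sample text is an abbreviated copy of the output shown above.

    .. code-block:: haskell

        import Text.Read (readMaybe)

        -- | Parse the text printed by @+RTS --info@ into key/value pairs.
        -- Returns Nothing if the text is not a well-formed list literal.
        parseRtsInfo :: String -> Maybe [(String, String)]
        parseRtsInfo = readMaybe

        -- Abbreviated sample of the --info output.
        sampleInfo :: String
        sampleInfo = unlines
          [ " [(\"GHC RTS\", \"YES\")"
          , " ,(\"RTS way\", \"rts_p\")"
          , " ,(\"Word size\", \"64\")"
          , " ]"
          ]

        main :: IO ()
        main = case parseRtsInfo sampleInfo of
          Nothing   -> putStrLn "could not parse --info output"
          Just info -> print (lookup "RTS way" info)

    In a real program the text would come from running the executable with
    ``+RTS --info`` and capturing its standard output.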