1.. _runtime-control:
2
3Running a compiled program
4==========================
5
6.. index::
7   single: runtime control of Haskell programs
8   single: running, compiled program
9   single: RTS options
10
11To make an executable program, the GHC system compiles your code and
12then links it with a non-trivial runtime system (RTS), which handles
13storage management, thread scheduling, profiling, and so on.
14
15The RTS has a lot of options to control its behaviour. For example, you
16can change the context-switch interval, the default size of the heap,
17and enable heap profiling. These options can be passed to the runtime
18system in a variety of different ways; the next section
19(:ref:`setting-rts-options`) describes the various methods, and the
20following sections describe the RTS options themselves.
21
22.. _setting-rts-options:
23
24Setting RTS options
25-------------------
26
27.. index::
28   single: RTS options, setting
29
30There are four ways to set RTS options:
31
32-  on the command line between ``+RTS ... -RTS``, when running the
33   program (:ref:`rts-opts-cmdline`)
34
35-  at compile-time, using :ghc-flag:`-with-rtsopts=⟨opts⟩`
36   (:ref:`rts-opts-compile-time`)
37
38-  with the environment variable :envvar:`GHCRTS`
39   (:ref:`rts-options-environment`)
40
41-  by overriding "hooks" in the runtime system (:ref:`rts-hooks`)
42
43.. _rts-opts-cmdline:
44
45Setting RTS options on the command line
46~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
47
48.. index::
49   single: +RTS
50   single: -RTS
51   single: --RTS
52
53If you set the :ghc-flag:`-rtsopts[=⟨none|some|all|ignore|ignoreAll⟩]` flag
54appropriately when linking (see :ref:`options-linker`), you can give RTS
55options on the command line when running your program.
56
57When your Haskell program starts up, the RTS extracts command-line
58arguments bracketed between ``+RTS`` and ``-RTS`` as its own. For example:
59
60.. code-block:: none
61
62    $ ghc prog.hs -rtsopts
63    [1 of 1] Compiling Main             ( prog.hs, prog.o )
64    Linking prog ...
65    $ ./prog -f +RTS -H32m -S -RTS -h foo bar
66
67The RTS will snaffle ``-H32m -S`` for itself, and the remaining
68arguments ``-f -h foo bar`` will be available to your program if/when it
69calls ``System.Environment.getArgs``.
70
71No ``-RTS`` option is required if the runtime-system options extend to
72the end of the command line, as in this example:
73
74.. code-block:: none
75
76    % hls -ltr /usr/etc +RTS -A5m
77
78If you absolutely positively want all the rest of the options in a
79command line to go to the program (and not the RTS), use a
80``--RTS``.
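
The splitting rules above can be pictured with a small model. The following
C sketch is illustrative only and is not the RTS's actual argument parser;
the function name ``split_rts_args`` is hypothetical, and the sketch assumes
that ``--RTS`` itself is dropped rather than handed to the program:

.. code-block:: c

    #include <stddef.h>
    #include <string.h>

    /* Illustrative model, not RTS code. Splits argv[1..] into RTS options
     * and program arguments. Returns the number of program arguments
     * written to prog_out; *n_rts receives the number of RTS options
     * written to rts_out. Both output arrays need room for argc entries. */
    size_t split_rts_args(int argc, char *argv[],
                          char *prog_out[], char *rts_out[], size_t *n_rts)
    {
        size_t n_prog = 0;
        int in_rts = 0;              /* between +RTS and -RTS? */
        int i = 1;

        *n_rts = 0;
        for (; i < argc; i++) {
            if (strcmp(argv[i], "--RTS") == 0) {
                i++;                 /* assumption: --RTS is consumed */
                break;               /* the rest belongs to the program */
            } else if (strcmp(argv[i], "+RTS") == 0) {
                in_rts = 1;
            } else if (strcmp(argv[i], "-RTS") == 0) {
                in_rts = 0;
            } else if (in_rts) {
                rts_out[(*n_rts)++] = argv[i];
            } else {
                prog_out[n_prog++] = argv[i];
            }
        }
        for (; i < argc; i++)        /* everything after --RTS */
            prog_out[n_prog++] = argv[i];
        return n_prog;
    }

Applied to ``./prog -f +RTS -H32m -S -RTS -h foo bar``, this model assigns
``-H32m -S`` to the RTS and ``-f -h foo bar`` to the program.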
81
As always, for RTS options that take ⟨size⟩s: If the last character of
⟨size⟩ is a K or k, multiply by 1024; if an M or m, by 1024 × 1024; if a G
or g, by 1024 × 1024 × 1024. (And any wraparound in the counters is *your*
fault!)
86
Passing ``+RTS -?`` will print out the RTS options actually available in
your program (which vary depending on how you compiled it).
90
91.. note::
92    Since GHC is itself compiled by GHC, you can change RTS options in
93    the compiler using the normal ``+RTS ... -RTS`` combination. For instance, to set
94    the maximum heap size for a compilation to 128M, you would add
95    ``+RTS -M128m -RTS`` to the command line.
96
97.. _rts-opts-compile-time:
98
99Setting RTS options at compile time
100~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
101
102GHC lets you change the default RTS options for a program at compile
103time, using the ``-with-rtsopts`` flag (:ref:`options-linker`). A common
104use for this is to give your program a default heap and/or stack size
105that is greater than the default. For example, to set ``-H128m -K64m``,
106link with ``-with-rtsopts="-H128m -K64m"``.
107
108.. _rts-options-environment:
109
110Setting RTS options with the ``GHCRTS`` environment variable
111~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
112
113.. index::
114   single: RTS options; from the environment
115   single: environment variable; for setting RTS options
116   single: GHCRTS environment variable
117
118.. envvar:: GHCRTS
119
120    If the ``-rtsopts`` flag is set to something other than ``none`` or ``ignoreAll``
121    when linking, RTS options are also taken from the environment variable
122    :envvar:`GHCRTS`. For example, to set the maximum heap size to 2G
123    for all GHC-compiled programs (using an ``sh``\-like shell):
124
125    .. code-block:: sh
126
127        GHCRTS='-M2G'
128        export GHCRTS
129
130    RTS options taken from the :envvar:`GHCRTS` environment variable can be
131    overridden by options given on the command line.
132
133.. tip::
134    Setting something like ``GHCRTS=-M2G`` in your environment is a
135    handy way to avoid Haskell programs growing beyond the real memory in
136    your machine, which is easy to do by accident and can cause the machine
137    to slow to a crawl until the OS decides to kill the process (and you
138    hope it kills the right one).
139
140.. _rts-hooks:
141
142"Hooks" to change RTS behaviour
143~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
144
145.. index::
146   single: hooks; RTS
147   single: RTS hooks
148   single: RTS behaviour, changing
149
150GHC lets you exercise rudimentary control over certain RTS settings for
151any given program, by compiling in a "hook" that is called by the
152run-time system. The RTS contains stub definitions for these hooks, but
153by writing your own version and linking it on the GHC command line, you
154can override the defaults.
155
156Owing to the vagaries of DLL linking, these hooks don't work under
157Windows when the program is built dynamically.
158
159Runtime events
160##############
161
162You can change the messages printed when the runtime system "blows up,"
163e.g., on stack overflow. The hooks for these are as follows:
164
165.. c:function:: void OutOfHeapHook (unsigned long, unsigned long)
166
167    The heap-overflow message.
168
169.. c:function:: void StackOverflowHook (long int)
170
171    The stack-overflow message.
172
173.. c:function:: void MallocFailHook (long int)
174
175    The message printed if ``malloc`` fails.
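
As a minimal sketch of overriding one of these hooks, the following C file
replaces the stack-overflow message. The parameter type follows the
signature listed above, which may differ between GHC versions, so check the
RTS headers of your installation. The helper ``format_stack_overflow``, the
program name ``myprog``, and the reading of the argument as a size in bytes
are assumptions made for illustration:

.. code-block:: c

    #include <stdio.h>

    /* Hypothetical helper: builds the message in buf so that the text can
     * be checked without actually overflowing a stack. Returns the value
     * of snprintf, i.e. the length of the formatted message. */
    int format_stack_overflow(char *buf, size_t len, long int stack_size)
    {
        return snprintf(buf, len,
                        "myprog: stack limit of %ld bytes exceeded; "
                        "try +RTS -K<size>\n", stack_size);
    }

    /* Overrides the stub definition in the RTS. Treating the argument as
     * the stack size in bytes is an assumption of this sketch. */
    void StackOverflowHook(long int stack_size)
    {
        char buf[128];
        format_stack_overflow(buf, sizeof buf, stack_size);
        fputs(buf, stderr);
    }

Such a file would be compiled and linked by naming it on the GHC command
line alongside the Haskell sources.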
176
177.. _event_log_output_api:
178
179Event log output
180################
181
Furthermore, GHC lets you specify the way event log data (see :rts-flag:`-l
⟨flags⟩`) is written through a custom :c:type:`EventLogWriter`:
184
185.. c:type:: EventLogWriter
186
187    A sink of event-log data.
188
189    .. c:member:: void initEventLogWriter(void)
190
191        Initializes your :c:type:`EventLogWriter`. This is optional.
192
193    .. c:member:: bool writeEventLog(void *eventlog, size_t eventlog_size)
194
195        Hands buffered event log data to your event log writer. Return true on success.
196        Required for a custom :c:type:`EventLogWriter`.
197
198    .. c:member:: void flushEventLog(void)
199
200        Flush buffers (if any) of your custom :c:type:`EventLogWriter`. This can
201        be ``NULL``.
202
203    .. c:member:: void stopEventLogWriter(void)
204
205        Called when event logging is about to stop. This can be ``NULL``.
206
207To use an :c:type:`EventLogWriter` the RTS API provides the following functions:
208
209.. c:function:: EventLogStatus eventLogStatus(void)
210
211   Query whether the current runtime system supports the eventlog (e.g. whether
212   the current executable was linked with :ghc-flag:`-eventlog`) and, if it
213   is supported, whether it is currently logging.
214
215.. c:function:: bool startEventLogging(const EventLogWriter *writer)
216
   Start logging events to the given :c:type:`EventLogWriter`. Returns true on
   success or false if another writer has already been configured.
219
220.. c:function:: void endEventLogging()
221
222   Tear down the active :c:type:`EventLogWriter`.
223
224where the ``enum`` :c:type:`EventLogStatus` is:
225
226.. c:type:: EventLogStatus
227
228    * ``EVENTLOG_NOT_SUPPORTED``: The runtime system wasn't compiled with
229      eventlog support.
230    * ``EVENTLOG_NOT_CONFIGURED``: An :c:type:`EventLogWriter` has not yet been
231      configured.
232    * ``EVENTLOG_RUNNING``: An :c:type:`EventLogWriter` has been configured and
233      is running.
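
To make the shape of a custom writer concrete, here is a sketch of one that
collects event-log data in a fixed in-memory buffer. So that it compiles on
its own, it declares a stand-in type ``SketchEventLogWriter`` mirroring the
members described above; a real program would use the RTS's own
:c:type:`EventLogWriter` from the RTS headers instead. The names
``memory_writer``, ``sink_bytes`` and the 4096-byte capacity are
hypothetical:

.. code-block:: c

    #include <stdbool.h>
    #include <stddef.h>
    #include <string.h>

    /* Stand-in for the RTS type, for illustration only. */
    typedef struct {
        void (*initEventLogWriter)(void);
        bool (*writeEventLog)(void *eventlog, size_t eventlog_size);
        void (*flushEventLog)(void);
        void (*stopEventLogWriter)(void);
    } SketchEventLogWriter;

    static unsigned char sink[4096];   /* in-memory sink */
    static size_t sink_used = 0;

    /* Start from an empty sink. */
    static void sink_init(void) { sink_used = 0; }

    /* Append the data handed over by the RTS; report failure instead of
     * truncating when the sink is full. */
    static bool sink_write(void *eventlog, size_t eventlog_size)
    {
        if (eventlog_size > sizeof sink - sink_used)
            return false;
        memcpy(sink + sink_used, eventlog, eventlog_size);
        sink_used += eventlog_size;
        return true;
    }

    /* Number of bytes collected so far. */
    size_t sink_bytes(void) { return sink_used; }

    const SketchEventLogWriter memory_writer = {
        .initEventLogWriter = sink_init,
        .writeEventLog      = sink_write,
        .flushEventLog      = NULL,    /* nothing is buffered separately */
        .stopEventLogWriter = NULL,
    };

With the real type, the address of such a structure would be passed to
:c:func:`startEventLogging`, whose result indicates whether the writer was
accepted.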
234
235
236.. _rts-options-misc:
237
238Miscellaneous RTS options
239-------------------------
240
241.. rts-flag:: --install-signal-handlers=⟨yes|no⟩
242
243    If yes (the default), the RTS installs signal handlers to catch
244    things like :kbd:`Ctrl-C`. This option is primarily useful for when you are
245    using the Haskell code as a DLL, and want to set your own signal
246    handlers.
247
248    Note that even with ``--install-signal-handlers=no``, the RTS
249    interval timer signal is still enabled. The timer signal is either
250    SIGVTALRM or SIGALRM, depending on the RTS configuration and OS
251    capabilities. To disable the timer signal, use the ``-V0`` RTS
252    option (see :rts-flag:`-V ⟨secs⟩`).
253
254.. rts-flag:: --install-seh-handlers=⟨yes|no⟩
255
256    If yes (the default), the RTS on Windows installs exception handlers to
257    catch unhandled exceptions using the Windows exception handling mechanism.
258    This option is primarily useful for when you are using the Haskell code as a
259    DLL, and don't want the RTS to ungracefully terminate your application on
260    errors such as segfaults.
261
262.. rts-flag:: --generate-crash-dumps
263
264    If yes (the default), the RTS on Windows will generate a core dump on
265    any crash. These dumps can be inspected using debuggers such as WinDBG.
266    The dumps record all code, registers and threading information at the time
267    of the crash. Note that this implies ``--install-seh-handlers=yes``.
268
269.. rts-flag:: --generate-stack-traces=<yes|no>
270
    If yes (the default), the RTS on Windows will generate a stack trace on
    crashes if exception handling is enabled. In order to get more information
    in compiled executables, symbols for C code or DLLs need to be available.
274
275.. rts-flag:: --disable-delayed-os-memory-return
276
277    If given, uses ``MADV_DONTNEED`` instead of ``MADV_FREE`` on platforms where
278    this results in more accurate resident memory usage of the program as shown
279    in memory usage reporting tools (e.g. the ``RSS`` column in ``top`` and ``htop``).
280
281    Using this is expected to make the program slightly slower.
282
283    On Linux, MADV_FREE is newer and faster because it can avoid zeroing
284    pages if they are re-used by the process later (see ``man 2 madvise``),
285    but for the trade-off that memory inspection tools like ``top`` will
286    not immediately reflect the freeing in their display of resident memory
287    (RSS column): Only under memory pressure will Linux actually remove
288    the freed pages from the process and update its RSS statistics.
289    Until then, the pages show up as ``LazyFree`` in ``/proc/PID/smaps``
290    (see ``man 5 proc``).
291
292    The delayed RSS update can confuse programmers debugging memory issues,
293    production memory monitoring tools, and end users who may complain about
294    undue memory usage shown in reporting tools, so with this flag it can
295    be turned off.
296
297
298.. rts-flag:: -xp
299
300    On 64-bit machines, the runtime linker usually needs to map object code
301    into the low 2Gb of the address space, due to the x86_64 small memory model
302    where most symbol references are 32 bits. The problem is that this 2Gb of
303    address space can fill up, especially if you're loading a very large number
304    of object files into GHCi.
305
306    This flag offers a workaround, albeit a slightly convoluted one. To be able
307    to load an object file outside of the low 2Gb, the object code needs to be
308    compiled with ``-fPIC -fexternal-dynamic-refs``. When the ``+RTS -xp`` flag
309    is passed, the linker will assume that all object files were compiled with
310    ``-fPIC -fexternal-dynamic-refs`` and load them anywhere in the address
311    space. It's up to you to arrange that the object files you load (including
312    all packages) were compiled in the right way. If this is not the case for
313    an object, the linker will probably fail with an error message when the
314    problem is detected.
315
    On some platforms where PIC is always the case, e.g. macOS and OpenBSD on
    x86_64, and macOS and Linux on aarch64, this flag is enabled by default.
318    One repercussion of this is that referenced system libraries also need to be
319    compiled with ``-fPIC`` if we need to load them in the runtime linker.
320
321.. rts-flag:: -xm ⟨address⟩
322
323    .. index::
324       single: -xm; RTS option
325
326    .. warning::
327
328        This option is for working around memory allocation
329        problems only. Do not use unless GHCi fails with a message like
330        “\ ``failed to mmap() memory below 2Gb``\ ”. Consider recompiling
331        the objects with ``-fPIC -fexternal-dynamic-refs`` and using the
332        ``-xp`` flag instead. If you need to use this option to get GHCi
333        working on your machine, please file a bug.
334
335    On 64-bit machines, the RTS needs to allocate memory in the low 2Gb
336    of the address space. Support for this across different operating
337    systems is patchy, and sometimes fails. This option is there to give
338    the RTS a hint about where it should be able to allocate memory in
339    the low 2Gb of the address space. For example,
340    ``+RTS -xm20000000 -RTS`` would hint that the RTS should allocate
341    starting at the 0.5Gb mark. The default is to use the OS's built-in
342    support for allocating memory in the low 2Gb if available (e.g.
343    ``mmap`` with ``MAP_32BIT`` on Linux), or otherwise ``-xm40000000``.
344
345.. rts-flag:: -xq ⟨size⟩
346
347    :default: 100k
348
349    This option relates to allocation limits; for more about this see
350    :base-ref:`GHC.Conc.enableAllocationLimit`.
    When a thread hits its allocation limit, the RTS throws an exception
    to the thread, and the thread gets an additional quota of allocation
    before the exception is raised again, so that the thread can execute
    its exception handlers. The ``-xq`` option controls the size of this
    additional quota.
356
357.. _rts-options-gc:
358
359RTS options to control the garbage collector
360--------------------------------------------
361
362.. index::
363   single: garbage collector; options
364   single: RTS options; garbage collection
365
366There are several options to give you precise control over garbage
367collection. Hopefully, you won't need any of these in normal operation,
368but there are several things that can be tweaked for maximum
369performance.
370
371.. rts-flag:: --copying-gc
372
373    :default: on
374    :since: 8.10.2
375    :reverse: --nonmoving-gc
376
377    Uses the generational copying garbage collector for all generations.
378    This is the default.
379
380.. rts-flag:: --nonmoving-gc
381
382    :default: off
383    :since: 8.10.1
384    :reverse: --copying-gc
385
386    .. index::
387       single: concurrent mark and sweep
388
    Enable the concurrent mark-and-sweep garbage collector for old generation
    collections. Typically GHC uses a stop-the-world copying garbage collector
391    for all generations. This can cause long pauses in execution during major
392    garbage collections. :rts-flag:`--nonmoving-gc` enables the use of a
393    concurrent mark-and-sweep garbage collector for oldest generation
394    collections. Under this collection strategy oldest-generation garbage
395    collection can proceed concurrently with mutation.
396
    Note that :rts-flag:`--nonmoving-gc` cannot be used with ``-G1``,
    :rts-flag:`profiling <-hc>`, or :rts-flag:`-c`.
399
400.. rts-flag:: -xn
401
402    :default: off
403    :since: 8.10.1
404
    An alias for :rts-flag:`--nonmoving-gc`.
406
407.. rts-flag:: -A ⟨size⟩
408
409    :default: 1MB
410
411    .. index::
412       single: allocation area, size
413
414    Set the allocation area size used by the garbage
415    collector. The allocation area (actually generation 0 step 0) is
416    fixed and is never resized (unless you use :rts-flag:`-H [⟨size⟩]`, below).
417
418    Increasing the allocation area size may or may not give better
419    performance (a bigger allocation area means worse cache behaviour
420    but fewer garbage collections and less promotion).
421
422    With only 1 generation (e.g. ``-G1``, see :rts-flag:`-G ⟨generations⟩`) the
423    ``-A`` option specifies the minimum allocation area, since the actual size
424    of the allocation area will be resized according to the amount of data in
425    the heap (see :rts-flag:`-F ⟨factor⟩`, below).
426
427.. rts-flag:: -AL ⟨size⟩
428
429    :default: :rts-flag:`-A <-A ⟨size⟩>` value
430    :since: 8.2.1
431
432    .. index::
433       single: allocation area for large objects, size
434
435    Sets the limit on the total size of "large objects" (objects
436    larger than about 3KB) that can be allocated before a GC is
437    triggered. By default this limit is the same as the :rts-flag:`-A <-A
438    ⟨size⟩>` value.
439
440    Large objects are not allocated from the normal allocation area
441    set by the ``-A`` flag, which is why there is a separate limit for
442    these.  Large objects tend to be much rarer than small objects, so
443    most programs hit the ``-A`` limit before the ``-AL`` limit.  However,
444    the ``-A`` limit is per-capability, whereas the ``-AL`` limit is global,
445    so as ``-N`` gets larger it becomes more likely that we hit the
446    ``-AL`` limit first.  To counteract this, it might be necessary to
447    use a larger ``-AL`` limit when using a large ``-N``.
448
    To see whether you're making good use of all the memory reserved
450    for the allocation area (``-A`` times ``-N``), look at the output of
451    ``+RTS -S`` and check whether the amount of memory allocated between
452    GCs is equal to ``-A`` times ``-N``. If not, there are two possible
453    remedies: use ``-n`` to set a nursery chunk size, or use ``-AL`` to
454    increase the limit for large objects.
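
    The comparison described above is simple arithmetic. The following sketch
    uses hypothetical function names and leaves the unit to the caller, as
    long as all arguments use the same one; it computes the reserved
    allocation area and the fraction of it that was actually filled between
    two collections:

    .. code-block:: c

        #include <stdint.h>

        /* Reserved allocation area: one -A sized area per capability. */
        uint64_t allocation_area_total(uint64_t a_size, uint64_t n_caps)
        {
            return a_size * n_caps;
        }

        /* Fraction of the reserved area filled between two GCs, given the
         * amount allocated between them (as reported by +RTS -S). */
        double allocation_area_utilisation(uint64_t allocated_between_gcs,
                                           uint64_t a_size, uint64_t n_caps)
        {
            uint64_t total = allocation_area_total(a_size, n_caps);
            return total == 0
                 ? 0.0
                 : (double)allocated_between_gcs / (double)total;
        }

    A utilisation well below 1 suggests that one of the two remedies above
    is worth trying.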
455
456.. rts-flag:: -O ⟨size⟩
457
458    :default: 1m
459
460    .. index::
461       single: old generation, size
462
463    Set the minimum size of the old generation. The old generation is collected
464    whenever it grows to this size or the value of the :rts-flag:`-F ⟨factor⟩`
465    option multiplied by the size of the live data at the previous major
466    collection, whichever is larger.
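
    The "whichever is larger" rule can be written down directly. This is a
    model of the rule as stated here, not the collector's actual code, and
    the function name is hypothetical:

    .. code-block:: c

        #include <stddef.h>

        /* Size the old generation may reach before it is collected: the
         * larger of the -O minimum and factor (-F) times the live data
         * found at the previous major collection. All sizes in bytes. */
        size_t old_gen_threshold(size_t min_old_gen, double factor,
                                 size_t live_bytes)
        {
            size_t scaled = (size_t)(factor * (double)live_bytes);
            return scaled > min_old_gen ? scaled : min_old_gen;
        }

    For example, with a factor of 2, a minimum of 1 MiB and 2 MiB of live
    data, the threshold is 4 MiB; with only 100 KiB of live data the 1 MiB
    minimum applies.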
467
468.. rts-flag:: -n ⟨size⟩
469
470    :default: 4m with :rts-flag:`-A16m <-A ⟨size⟩>` or larger, otherwise 0.
471
472    .. index::
473       single: allocation area, chunk size
474
475    [Example: ``-n4m`` ] When set to a non-zero value, this
476    option divides the allocation area (``-A`` value) into chunks of the
477    specified size. During execution, when a processor exhausts its
478    current chunk, it is given another chunk from the pool until the
479    pool is exhausted, at which point a collection is triggered.
480
481    This option is only useful when running in parallel (``-N2`` or
482    greater). It allows the processor cores to make better use of the
483    available allocation area, even when cores are allocating at
484    different rates. Without ``-n``, each core gets a fixed-size
485    allocation area specified by the ``-A``, and the first core to
486    exhaust its allocation area triggers a GC across all the cores. This
487    can result in a collection happening when the allocation areas of
488    some cores are only partially full, so the purpose of the ``-n`` is
489    to allow cores that are allocating faster to get more of the
    allocation area. This means less frequent GC, leading to a lower GC
    overhead for the same heap size.
492
493    This is particularly useful in conjunction with larger ``-A``
494    values, for example ``-A64m -n4m`` is a useful combination on larger core
495    counts (8+).
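
    The effect of chunking can be seen in a toy simulation. The sketch below
    is a deliberately simplified model and not how the RTS is implemented:
    time advances in ticks, core ``i`` allocates ``rate[i]`` units per tick,
    and a collection is triggered as soon as a core needs a new chunk and
    the pool is empty. Running without ``-n`` is modelled by making the
    chunk as large as the per-core area. All names are hypothetical:

    .. code-block:: c

        #include <stddef.h>

        #define MAX_CORES 64

        /* Returns the number of units allocated across all cores before a
         * GC is triggered, or 0 for invalid input. */
        size_t units_before_gc(size_t n_cores, size_t area_per_core,
                               size_t chunk, const size_t rate[])
        {
            size_t remaining[MAX_CORES] = {0};  /* space left in each core's chunk */
            size_t pool, total = 0;

            if (n_cores == 0 || n_cores > MAX_CORES || chunk == 0)
                return 0;
            pool = (n_cores * area_per_core) / chunk;  /* chunks available */

            for (;;) {
                size_t busy = 0;
                for (size_t i = 0; i < n_cores; i++) {
                    for (size_t u = 0; u < rate[i]; u++) {
                        if (remaining[i] == 0) {
                            if (pool == 0)
                                return total;   /* pool exhausted: GC */
                            pool--;
                            remaining[i] = chunk;
                        }
                        remaining[i]--;
                        total++;
                    }
                    busy += rate[i];
                }
                if (busy == 0)
                    return total;   /* nobody allocates; stop the model */
            }
        }

    With two cores, 16 units of allocation area each and rates of 3 and 1,
    the model reaches a collection after 21 units when the chunk size is 16
    (no chunking), but only after all 32 units when the chunk size is 4.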
496
497.. rts-flag:: -c
498
499    .. index::
500       single: garbage collection; compacting
501       single: compacting garbage collection
502
503    Use a compacting algorithm for collecting the oldest generation. By
504    default, the oldest generation is collected using a copying
505    algorithm; this option causes it to be compacted in-place instead.
506    The compaction algorithm is slower than the copying algorithm, but
507    the savings in memory use can be considerable.
508
509    For a given heap size (using the :rts-flag:`-H [⟨size⟩]` option),
510    compaction can in fact reduce the GC cost by allowing fewer GCs to be
511    performed. This is more likely when the ratio of live data to heap size is
512    high, say greater than 30%.
513
514    .. note::
515       Compaction doesn't currently work when a single generation is
516       requested using the ``-G1`` option.
517
518.. rts-flag:: -c ⟨n⟩
519
520    :default: 30
521
522    Automatically enable compacting collection when the live data exceeds ⟨n⟩%
523    of the maximum heap size (see the :rts-flag:`-M ⟨size⟩` option). Note that
524    the maximum heap size is unlimited by default, so this option has no effect
525    unless the maximum heap size is set with :rts-flag:`-M ⟨size⟩`.
526
527.. rts-flag:: -F ⟨factor⟩
528
529    :default: 2
530
531    .. index::
532       single: heap size, factor
533
534    This option controls the amount of memory reserved for
535    the older generations (and in the case of a two space collector the
536    size of the allocation area) as a factor of the amount of live data.
537    For example, if there was 2M of live data in the oldest generation
538    when we last collected it, then by default we'll wait until it grows
539    to 4M before collecting it again.
540
541    The default seems to work well here. If you have plenty of memory, it is
542    usually better to use ``-H ⟨size⟩`` (see :rts-flag:`-H [⟨size⟩]`) than to
543    increase :rts-flag:`-F ⟨factor⟩`.
544
545    The :rts-flag:`-F ⟨factor⟩` setting will be automatically reduced by the garbage
546    collector when the maximum heap size (the :rts-flag:`-M ⟨size⟩` setting) is approaching.
547
548.. rts-flag:: -G ⟨generations⟩
549
550    :default: 2
551
552    .. index::
553       single: generations, number of
554
555    Set the number of generations used by the garbage
556    collector. The default of 2 seems to be good, but the garbage
557    collector can support any number of generations. Anything larger
558    than about 4 is probably not a good idea unless your program runs
559    for a *long* time, because the oldest generation will hardly ever
560    get collected.
561
562    Specifying 1 generation with ``+RTS -G1`` gives you a simple 2-space
563    collector, as you would expect. In a 2-space collector, the :rts-flag:`-A
564    ⟨size⟩` option specifies the *minimum* allocation area size, since the
565    allocation area will grow with the amount of live data in the heap. In a
566    multi-generational collector the allocation area is a fixed size (unless
567    you use the :rts-flag:`-H [⟨size⟩]` option).
568
569.. rts-flag:: -qg ⟨gen⟩
570
571    :default: 0
572    :since: 6.12.1
573
574    Use parallel GC in generation ⟨gen⟩ and higher. Omitting ⟨gen⟩ turns off the
575    parallel GC completely, reverting to sequential GC.
576
577    The default parallel GC settings are usually suitable for parallel programs
578    (i.e. those using :base-ref:`GHC.Conc.par`, Strategies, or with
579    multiple threads). However, it is sometimes beneficial to enable the
580    parallel GC for a single-threaded sequential program too, especially if the
581    program has a large amount of heap data and GC is a significant fraction of
582    runtime. To use the parallel GC in a sequential program, enable the parallel
583    runtime with a suitable :rts-flag:`-N ⟨x⟩` option, and additionally it might
584    be beneficial to restrict parallel GC to the old generation with ``-qg1``.
585
586.. rts-flag:: -qb ⟨gen⟩
587
588    :default: 1 for :rts-flag:`-A <-A ⟨size⟩>` < 32M, 0 otherwise
589    :since: 6.12.1
590
591    Use load-balancing in the parallel GC in generation ⟨gen⟩ and higher.
592    Omitting ⟨gen⟩ disables load-balancing entirely.
593
594    Load-balancing shares out the work of GC between the available
    cores. This is a good idea when the heap is large and we need to
    parallelise the GC work; however, it is also pessimal for the short
    young-generation collections in a parallel program, because it can
    harm locality by moving data from the cache of the CPU where it is
    being used to the cache of another CPU. Hence the default is to do
600    load-balancing only in the old-generation. In fact, for a parallel
601    program it is sometimes beneficial to disable load-balancing
602    entirely with ``-qb``.
603
604.. rts-flag:: -qn ⟨x⟩
605
606    :default: the value of :rts-flag:`-N <-N ⟨x⟩>` or the number of CPU cores,
607              whichever is smaller.
608    :since: 8.2.1
609
610    .. index::
611       single: GC threads, setting the number of
612
613    By default, all of the capabilities participate in parallel
614    garbage collection.  If we want to use a very large ``-N`` value,
615    however, this can reduce the performance of the GC.  For this
616    reason, the ``-qn`` flag can be used to specify a lower number for
617    the threads that should participate in GC.  During GC, if there
618    are more than this number of workers active, some of them will
619    sleep for the duration of the GC.
620
621    The ``-qn`` flag may be useful when running with a large ``-A`` value
622    (so that GC is infrequent), and a large ``-N`` value (so as to make
623    use of hyperthreaded cores, for example).  For example, on a
624    24-core machine with 2 hyperthreads per core, we might use
625    ``-N48 -qn24 -A128m`` to specify that the mutator should use
626    hyperthreads but the GC should only use real cores.  Note that
627    this configuration would use 6GB for the allocation area.
628
629.. rts-flag:: -H [⟨size⟩]
630
631    :default: 0
632
633    .. index::
634       single: heap size, suggested
635
636    This option provides a "suggested heap size" for the garbage collector.
637    Think of ``-Hsize`` as a variable :rts-flag:`-A ⟨size⟩` option.  It says: I
638    want to use at least ⟨size⟩ bytes, so use whatever is left over to increase
639    the ``-A`` value.
640
641    This option does not put a *limit* on the heap size: the heap may
642    grow beyond the given size as usual.
643
644    If ⟨size⟩ is omitted, then the garbage collector will take the size
645    of the heap at the previous GC as the ⟨size⟩. This has the effect of
646    allowing for a larger ``-A`` value but without increasing the
647    overall memory requirements of the program. It can be useful when
648    the default small ``-A`` value is suboptimal, as it can be in
649    programs that create large amounts of long-lived data.
650
651.. rts-flag:: -I ⟨seconds⟩
652
653    :default: 0.3 seconds in the threaded runtime, 0 in the non-threaded runtime
654
655    .. index::
656       single: idle GC
657
658    In the threaded and SMP versions of the RTS (see
659    :ghc-flag:`-threaded`, :ref:`options-linker`), a major GC is automatically
660    performed if the runtime has been idle (no Haskell computation has
661    been running) for a period of time. The amount of idle time which
662    must pass before a GC is performed is set by the ``-I ⟨seconds⟩``
663    option. Specifying ``-I0`` disables the idle GC.
664
665    For an interactive application, it is probably a good idea to use
666    the idle GC, because this will allow finalizers to run and
667    deadlocked threads to be detected in the idle time when no Haskell
668    computation is happening. Also, it will mean that a GC is less
669    likely to happen when the application is busy, and so responsiveness
670    may be improved. However, if the amount of live data in the heap is
671    particularly large, then the idle GC can cause a significant delay,
672    and too small an interval could adversely affect interactive
673    responsiveness.
674
    This is an experimental feature; please let us know if it causes
    problems and/or could benefit from further tuning.
677
678.. rts-flag:: -Iw ⟨seconds⟩
679
680    :default: 0 seconds
681
682    .. index::
683       single: idle GC
684
685    By default, if idle GC is enabled in the threaded runtime, a major
686    GC will be performed every time the process goes idle for a
687    sufficiently long duration (see :rts-flag:`-I ⟨seconds⟩`).  For
688    large server processes accepting regular but infrequent requests
689    (e.g., once per second), an expensive, major GC may run after
690    every request.  As an alternative to shutting off idle GC entirely
691    (with ``-I0``), a minimum wait time between idle GCs can be
692    specified with this flag.  For example, ``-Iw60`` will ensure that
693    an idle GC runs at most once per minute.
694
    This is an experimental feature; please let us know if it causes
    problems and/or could benefit from further tuning.
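
    The interplay between ``-I`` and ``-Iw`` can be summarised as a single
    predicate. This is a model of the behaviour described here, under the
    assumption that both conditions are checked together; it is not the
    RTS's scheduler code, and the function name is hypothetical:

    .. code-block:: c

        #include <stdbool.h>

        /* Should an idle GC run now? All arguments are in seconds.
         * idle_interval corresponds to -I, min_wait to -Iw. */
        bool idle_gc_due(double idle_secs, double secs_since_last_idle_gc,
                         double idle_interval, double min_wait)
        {
            if (idle_interval <= 0.0)
                return false;            /* -I0 disables the idle GC */
            return idle_secs >= idle_interval
                && secs_since_last_idle_gc >= min_wait;
        }

    With ``-I0.3 -Iw60``, a process that has been idle for half a second
    would collect only if the previous idle GC was at least 60 seconds ago.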
697
698.. rts-flag:: -ki ⟨size⟩
699
700    :default: 1k
701
702    .. index::
703       single: stack, initial size
704
705    Set the initial stack size for new threads.
706
707    Thread stacks (including the main thread's stack) live on the heap.
708    As the stack grows, new stack chunks are added as required; if the
709    stack shrinks again, these extra stack chunks are reclaimed by the
710    garbage collector. The default initial stack size is deliberately
711    small, in order to keep the time and space overhead for thread
712    creation to a minimum, and to make it practical to spawn threads for
713    even tiny pieces of work.
714
715    .. note::
716        This flag used to be simply ``-k``, but was renamed to ``-ki`` in
717        GHC 7.2.1. The old name is still accepted for backwards
718        compatibility, but that may be removed in a future version.
719
720.. rts-flag:: -kc ⟨size⟩
721
722    :default: 32k
723
724    .. index::
725       single: stack; chunk size
726
727    Set the size of "stack chunks". When a thread's current stack overflows, a
728    new stack chunk is created and added to the thread's stack, until the limit
729    set by :rts-flag:`-K ⟨size⟩` is reached.
730
731    The advantage of smaller stack chunks is that the garbage collector can
732    avoid traversing stack chunks if they are known to be unmodified since the
733    last collection, so reducing the chunk size means that the garbage
734    collector can identify more stack as unmodified, and the GC overhead might
735    be reduced. On the other hand, making stack chunks too small adds some
736    overhead as there will be more overflow/underflow between chunks. The
737    default setting of 32k appears to be a reasonable compromise in most cases.
738
739.. rts-flag:: -kb ⟨size⟩
740
741    :default: 1k
742
743    .. index::
744       single: stack; chunk buffer size
745
746    Sets the stack chunk buffer size. When a stack chunk
747    overflows and a new stack chunk is created, some of the data from
748    the previous stack chunk is moved into the new chunk, to avoid an
749    immediate underflow and repeated overflow/underflow at the boundary.
750    The amount of stack moved is set by the ``-kb`` option.
751
752    Note that to avoid wasting space, this value should typically be less than
753    10% of the size of a stack chunk (:rts-flag:`-kc ⟨size⟩`), because in a
754    chain of stack chunks, each chunk will have a gap of unused space of this
755    size.
756
757.. rts-flag:: -K ⟨size⟩
758
759    :default: 80% of physical memory
760
761    .. index::
762       single: stack, maximum size
763
764    Set the maximum stack size for
765    an individual thread to ⟨size⟩ bytes. If the thread attempts to
766    exceed this limit, it will be sent the ``StackOverflow`` exception.
767    The limit can be disabled entirely by specifying a size of zero.
768
769    This option is there mainly to stop the program eating up all the
770    available memory in the machine if it gets into an infinite loop.
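
    As a minimal sketch of how a program can observe this exception, the
    example below catches ``StackOverflow`` (a constructor of
    ``Control.Exception.AsyncException``). The name ``deepSum`` and the list
    size are illustrative assumptions, not part of GHC.

    .. code-block:: haskell

        import Control.Exception (AsyncException (StackOverflow), evaluate, throwIO, try)

        -- A non-tail-recursive sum (hypothetical helper): the additions are
        -- only performed once the end of the list has been reached, so the
        -- stack grows with the length of the list.
        deepSum :: [Integer] -> Integer
        deepSum = foldr (+) 0

        main :: IO ()
        main = do
          r <- try (evaluate (deepSum [1 .. 1000000]))
          case r of
            Right total        -> putStrLn ("sum = " ++ show total)
            Left StackOverflow -> putStrLn "stack limit reached"
            Left other         -> throwIO other -- re-throw anything else

    With the default limit the sum should simply be printed. Running the same
    program with a small limit, for example ``+RTS -K1m -RTS``, would be
    expected to take the ``StackOverflow`` branch instead, although the exact
    threshold depends on the compiler version and optimisation level.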
771
772.. rts-flag:: -m ⟨n⟩
773
774    :default: 3%
775
776    .. index::
777       single: heap, minimum free
778
    Set the minimum percentage ⟨n⟩ of the heap that must be available for
    allocation.
780
781.. rts-flag:: -M ⟨size⟩
782
783    :default: unlimited
784
785    .. index::
786       single: heap size, maximum
787
788    Set the maximum heap size to ⟨size⟩ bytes. The
789    heap normally grows and shrinks according to the memory requirements
790    of the program. The only reason for having this option is to stop
791    the heap growing without bound and filling up all the available swap
792    space, which at the least will result in the program being summarily
793    killed by the operating system.
794
795    The maximum heap size also affects other garbage collection
796    parameters: when the amount of live data in the heap exceeds a
797    certain fraction of the maximum heap size, compacting collection
798    will be automatically enabled for the oldest generation, and the
799    ``-F`` parameter will be reduced in order to avoid exceeding the
800    maximum heap size.
801
802.. rts-flag:: -Mgrace=⟨size⟩
803
804    :default: 1M
805
806    .. index::
807       single: heap size, grace
808
    If the program's heap exceeds the value set by :rts-flag:`-M ⟨size⟩`, the
    RTS throws an exception to the program. The program then gets an
    additional quota of allocation before the exception is raised again, so
    that it has room to execute its exception handlers. ``-Mgrace=`` controls
    the size of this additional quota.
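
    The following sketch illustrates the kind of handler the grace quota is
    meant to leave room for. It assumes the exception in question is
    ``HeapOverflow`` from ``Control.Exception`` and that it is delivered to
    the main thread, neither of which is specified here; ``work`` and
    ``onOverflow`` are hypothetical names.

    .. code-block:: haskell

        import Control.Exception (AsyncException (HeapOverflow), finally, evaluate, handle, throwIO)

        -- Hypothetical workload: holds a large list in memory between two passes.
        work :: IO ()
        work = do
          let xs = [1 .. 1000000] :: [Int]
          n <- evaluate (length xs)
          print (n, sum xs)

        -- Report heap exhaustion; re-throw any other asynchronous exception.
        onOverflow :: AsyncException -> IO ()
        onOverflow HeapOverflow = putStrLn "heap limit reached"
        onOverflow other        = throwIO other

        main :: IO ()
        main = handle onOverflow (work `finally` putStrLn "cleanup ran")

    Both the ``finally`` clean-up action and the handler allocate a little
    memory themselves, which is why some headroom beyond the ``-M`` limit is
    needed for them to run.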
815
816.. rts-flag:: --numa
817              --numa=<mask>
818
819    .. index::
820       single: NUMA, enabling in the runtime
821
822    Enable NUMA-aware memory allocation in the runtime (only available
823    with ``-threaded``, and only on Linux and Windows currently).
824
825    Background: some systems have a Non-Uniform Memory Architecture,
826    whereby main memory is split into banks which are "local" to
827    specific CPU cores.  Accessing local memory is faster than
828    accessing remote memory.  The OS provides APIs for allocating
829    local memory and binding threads to particular CPU cores, so that
830    we can ensure certain memory accesses are using local memory.
831
832    The ``--numa`` option tells the RTS to tune its memory usage to
833    maximize local memory accesses.  In particular, the RTS will:
834
835       - Determine the number of NUMA nodes (N) by querying the OS.
836       - Manage separate memory pools for each node.
837       - Map capabilities to NUMA nodes.  Capability C is mapped to
838         NUMA node C mod N.
839       - Bind worker threads on a capability to the appropriate node.
840       - Allocate the nursery from node-local memory.
841       - Perform other memory allocation, including in the GC, from
842         node-local memory.
843       - When load-balancing, we prefer to migrate threads to another
844         Capability on the same node.
845
846    The ``--numa`` flag is typically beneficial when a program is
847    using all cores of a large multi-core NUMA system, with a large
848    allocation area (``-A``).  All memory accesses to the allocation
849    area will go to local memory, which can save a significant amount
850    of remote memory access.  A runtime speedup on the order of 10%
851    is typical, but can vary a lot depending on the hardware and the
852    memory behaviour of the program.
853
854    Note that the RTS will not set CPU affinity for bound threads and
855    threads entering Haskell from C/C++, so if your program uses bound
856    threads you should ensure that each bound thread calls the RTS API
    ``rts_setInCallCapability(c,1)`` from C/C++ before calling into
858    Haskell.  Otherwise there could be a mismatch between the CPU that
859    the thread is running on and the memory it is using while running
860    Haskell code, which will negate any benefits of ``--numa``.
861
862    If given an explicit <mask>, the <mask> is interpreted as a bitmap
863    that indicates the NUMA nodes on which to run the program.  For
864    example, ``--numa=3`` would run the program on NUMA nodes 0 and 1.
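
    To make the bitmap interpretation concrete, here is a small sketch that
    decodes a mask into node indices. It is not RTS code: the function
    ``nodesFromMask`` and its ``nodeCount`` parameter are assumptions made
    for illustration.

    .. code-block:: haskell

        import Data.Bits (testBit)

        -- Decode a mask into the list of selected NUMA node indices,
        -- looking only at the lowest nodeCount bits. Bit n set means
        -- node n is selected.
        nodesFromMask :: Int -> Integer -> [Int]
        nodesFromMask nodeCount mask =
          [n | n <- [0 .. nodeCount - 1], testBit mask n]

        main :: IO ()
        main = print (nodesFromMask 8 3) -- prints [0,1]

    By the same reading, a mask of 5 (binary ``101``) would select nodes 0
    and 2.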
865
866.. rts-flag:: --long-gc-sync
867              --long-gc-sync=<seconds>
868
869    .. index::
870       single: GC sync time, measuring
871
872    When a GC starts, all the running mutator threads have to stop and
873    synchronise.  The period between when the GC is initiated and all
874    the mutator threads are stopped is called the GC synchronisation
875    phase. If this phase is taking a long time (longer than 1ms is
876    considered long), then it can have a severe impact on overall
877    throughput.
878
879    A long GC sync can be caused by a mutator thread that is inside an
880    ``unsafe`` FFI call, or running in a loop that doesn't allocate
881    memory and so doesn't yield.  To fix the former, make the call
882    ``safe``, and to fix the latter, either avoid calling the code in
    question or compile it with ``-fno-omit-yields`` (that is, with
    :ghc-flag:`-fomit-yields` turned off).
884
885    By default, the flag will cause a warning to be emitted to stderr
886    when the sync time exceeds the specified time.  This behaviour can
887    be overridden, however: the ``longGCSync()`` hook is called when
888    the sync time is exceeded during the sync period, and the
889    ``longGCSyncEnd()`` hook at the end. Both of these hooks can be
890    overridden in the ``RtsConfig`` when the runtime is started with
891    ``hs_init_ghc()``. The default implementations of these hooks
    (``LongGCSync()`` and ``LongGCSyncEnd()`` respectively) print
893    warnings to stderr.
894
895    One way to use this flag is to set a breakpoint on
896    ``LongGCSync()`` in the debugger, and find the thread that is
897    delaying the sync. You probably want to use :ghc-flag:`-g` to
898    provide more info to the debugger.
899
    The GC sync time, along with other GC stats, is available by
901    calling the ``getRTSStats()`` function from C, or
902    ``GHC.Stats.getRTSStats`` from Haskell.
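
    As a rough sketch of the Haskell route, the program below prints the
    synchronisation time of the most recent collection. It assumes that your
    version of ``base`` exposes the ``gcdetails_sync_elapsed_ns`` field of
    ``GCDetails``, and that statistics collection has been enabled (for
    example with ``+RTS -T``).

    .. code-block:: haskell

        import GHC.Stats (GCDetails (..), RTSStats (..), getRTSStats, getRTSStatsEnabled)
        import System.Mem (performGC)

        main :: IO ()
        main = do
          enabled <- getRTSStatsEnabled
          if not enabled
            then putStrLn "statistics disabled; run with +RTS -T"
            else do
              performGC -- force a collection so there is a recent GC to inspect
              stats <- getRTSStats
              let syncNs = gcdetails_sync_elapsed_ns (gc stats)
              putStrLn ("last GC sync (ns): " ++ show syncNs)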
903
904.. _rts-options-statistics:
905
906RTS options to produce runtime statistics
907-----------------------------------------
908
909.. rts-flag:: -T
910              -t [⟨file⟩]
911              -s [⟨file⟩]
912              -S [⟨file⟩]
913              --machine-readable
914              --internal-counters
915
916    These options produce runtime-system statistics, such as the amount
917    of time spent executing the program and in the garbage collector,
918    the amount of memory allocated, the maximum size of the heap, and so
    on. The four variants give different levels of detail: ``-T``
    collects the data but produces no output, ``-t`` produces a single
    line of output in the same format as GHC's ``-Rghc-timing`` option,
922    ``-s`` produces a more detailed summary at the end of the program,
923    and ``-S`` additionally produces information about each and every
924    garbage collection. Passing ``--internal-counters`` to a threaded
925    runtime will cause a detailed summary to include various internal
926    counts accumulated during the run; note that these are unspecified
927    and may change between releases.
928
929    The output is placed in ⟨file⟩. If ⟨file⟩ is omitted, then the
930    output is sent to ``stderr``.
931
    If you use the ``-T`` flag, you should access the statistics
    using :base-ref:`GHC.Stats.`.
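
    For instance, a program might print a short summary of its own GC
    activity. This is only a sketch: ``summary`` is a hypothetical helper,
    and the fields used (``gcs``, ``major_gcs``, ``max_live_bytes``) are
    assumed to be present in the ``RTSStats`` record of your ``base``
    version.

    .. code-block:: haskell

        import GHC.Stats (RTSStats (..), getRTSStats, getRTSStatsEnabled)

        -- Render a few RTSStats fields as a one-line summary.
        summary :: RTSStats -> String
        summary s =
          unwords
            [ show (gcs s), "GCs,"
            , show (major_gcs s), "major,"
            , show (max_live_bytes s), "bytes maximum residency"
            ]

        main :: IO ()
        main = do
          -- getRTSStats fails if statistics are not being collected,
          -- so check first.
          enabled <- getRTSStatsEnabled
          if enabled
            then getRTSStats >>= putStrLn . summary
            else putStrLn "statistics disabled; run with +RTS -T"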
934
935    If you use the ``-t`` flag then, when your program finishes, you
936    will see something like this:
937
938    .. code-block:: none
939
940        <<ghc: 36169392 bytes, 69 GCs, 603392/1065272 avg/max bytes residency (2 samples), 3M in use, 0.00 INIT (0.00 elapsed), 0.02 MUT (0.02 elapsed), 0.07 GC (0.07 elapsed) :ghc>>
941
942    This tells you:
943
944    -  The total number of bytes allocated by the program over the whole
945       run.
946
947    -  The total number of garbage collections performed.
948
949    -  The average and maximum "residency", which is the amount of live
950       data in bytes. The runtime can only determine the amount of live
951       data during a major GC, which is why the number of samples
952       corresponds to the number of major GCs (and is usually relatively
953       small). To get a better picture of the heap profile of your
954       program, use the :rts-flag:`-hT` RTS option (:ref:`rts-profiling`).
955
956    -  The peak memory the RTS has allocated from the OS.
957
958    -  The amount of CPU time and elapsed wall clock time while
959       initialising the runtime system (INIT), running the program
960       itself (MUT, the mutator), and garbage collecting (GC).
961
962    You can also get this in a more future-proof, machine readable
963    format, with ``-t --machine-readable``:
964
965    ::
966
967         [("bytes allocated", "36169392")
968         ,("num_GCs", "69")
969         ,("average_bytes_used", "603392")
970         ,("max_bytes_used", "1065272")
971         ,("num_byte_usage_samples", "2")
972         ,("peak_megabytes_allocated", "3")
973         ,("init_cpu_seconds", "0.00")
974         ,("init_wall_seconds", "0.00")
975         ,("mutator_cpu_seconds", "0.02")
976         ,("mutator_wall_seconds", "0.02")
977         ,("GC_cpu_seconds", "0.07")
978         ,("GC_wall_seconds", "0.07")
979         ]
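
    Because this output has the shape of a Haskell list of string pairs, it
    can be consumed with ``read``. The sketch below assumes the text being
    parsed contains only the list itself; ``parseStats`` and
    ``gcCpuFraction`` are hypothetical helpers.

    .. code-block:: haskell

        import Text.Read (readMaybe)

        -- Parse the association list; Nothing if the text is malformed.
        parseStats :: String -> Maybe [(String, String)]
        parseStats = readMaybe

        -- Share of CPU time (GC plus mutator) spent in the garbage
        -- collector, if both fields are present and numeric.
        gcCpuFraction :: [(String, String)] -> Maybe Double
        gcCpuFraction kvs = do
          gcT  <- lookup "GC_cpu_seconds" kvs >>= readMaybe
          mutT <- lookup "mutator_cpu_seconds" kvs >>= readMaybe
          let total = gcT + mutT
          if total > 0 then Just (gcT / total) else Nothing

        main :: IO ()
        main = do
          let sample =
                "[(\"mutator_cpu_seconds\", \"0.02\")\n,(\"GC_cpu_seconds\", \"0.07\")\n]"
          print (parseStats sample >>= gcCpuFraction)

    Using ``readMaybe`` rather than ``read`` means that a missing or renamed
    field yields ``Nothing`` instead of a runtime error.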
980
981    If you use the ``-s`` flag then, when your program finishes, you
982    will see something like this (the exact details will vary depending
983    on what sort of RTS you have, e.g. you will only see profiling data
984    if your RTS is compiled for profiling):
985
986    .. code-block:: none
987
988              36,169,392 bytes allocated in the heap
989               4,057,632 bytes copied during GC
990               1,065,272 bytes maximum residency (2 sample(s))
991                  54,312 bytes maximum slop
992                       3 MB total memory in use (0 MB lost due to fragmentation)
993
994          Generation 0:    67 collections,     0 parallel,  0.04s,  0.03s elapsed
995          Generation 1:     2 collections,     0 parallel,  0.03s,  0.04s elapsed
996
997          SPARKS: 359207 (557 converted, 149591 pruned)
998
999          INIT  time    0.00s  (  0.00s elapsed)
1000          MUT   time    0.01s  (  0.02s elapsed)
1001          GC    time    0.07s  (  0.07s elapsed)
1002          EXIT  time    0.00s  (  0.00s elapsed)
1003          Total time    0.08s  (  0.09s elapsed)
1004
1005          %GC time      89.5%  (75.3% elapsed)
1006
1007          Alloc rate    4,520,608,923 bytes per MUT second
1008
1009          Productivity  10.5% of total user, 9.1% of total elapsed
1010
1011    -  The "bytes allocated in the heap" is the total bytes allocated by
1012       the program over the whole run.
1013
1014    -  GHC uses a copying garbage collector by default. "bytes copied
1015       during GC" tells you how many bytes it had to copy during garbage
1016       collection.
1017
1018    -  The maximum space actually used by your program is the "bytes
1019       maximum residency" figure. This is only checked during major
1020       garbage collections, so it is only an approximation; the number
1021       of samples tells you how many times it is checked.
1022
1023    -  The "bytes maximum slop" tells you the most space that is ever
1024       wasted due to the way GHC allocates memory in blocks. Slop is
1025       memory at the end of a block that was wasted. There's no way to
1026       control this; we just like to see how much memory is being lost
1027       this way.
1028
1029    -  The "total memory in use" tells you the peak memory the RTS has
1030       allocated from the OS.
1031
1032    -  Next there is information about the garbage collections done. For
1033       each generation it says how many garbage collections were done,
1034       how many of those collections were done in parallel, the total
1035       CPU time used for garbage collecting that generation, and the
1036       total wall clock time elapsed while garbage collecting that
1037       generation.
1038
1039    -  The ``SPARKS`` statistic refers to the use of
1040       ``Control.Parallel.par`` and related functionality in the
1041       program. Each spark represents a call to ``par``; a spark is
1042       "converted" when it is executed in parallel; and a spark is
1043       "pruned" when it is found to be already evaluated and is
1044       discarded from the pool by the garbage collector. Any remaining
1045       sparks are discarded at the end of execution, so "converted" plus
1046       "pruned" does not necessarily add up to the total.
1047
1048    -  Next there is the CPU time and wall clock time elapsed broken
1049       down by what the runtime system was doing at the time. INIT is
1050       the runtime system initialisation. MUT is the mutator time, i.e.
1051       the time spent actually running your code. GC is the time spent
1052       doing garbage collection. RP is the time spent doing retainer
1053       profiling. PROF is the time spent doing other profiling. EXIT is
1054       the runtime system shutdown time. And finally, Total is, of
1055       course, the total.
1056
1057       %GC time tells you what percentage GC is of Total. "Alloc rate"
1058       tells you the "bytes allocated in the heap" divided by the MUT
1059       CPU time. "Productivity" tells you what percentage of the Total
1060       CPU and wall clock elapsed times are spent in the mutator (MUT).
1061
1062    The ``-S`` flag, as well as giving the same output as the ``-s``
1063    flag, prints information about each GC as it happens:
1064
1065    .. code-block:: none
1066
1067            Alloc    Copied     Live    GC    GC     TOT     TOT  Page Flts
1068            bytes     bytes     bytes  user  elap    user    elap
1069           528496     47728    141512  0.01  0.02    0.02    0.02    0    0  (Gen:  1)
1070        [...]
1071           524944    175944   1726384  0.00  0.00    0.08    0.11    0    0  (Gen:  0)
1072
1073    For each garbage collection, we print:
1074
1075    -  How many bytes we allocated this garbage collection.
1076
1077    -  How many bytes we copied this garbage collection.
1078
1079    -  How many bytes are currently live.
1080
1081    -  How long this garbage collection took (CPU time and elapsed wall
1082       clock time).
1083
1084    -  How long the program has been running (CPU time and elapsed wall
1085       clock time).
1086
1087    -  How many page faults occurred this garbage collection.
1088
1089    -  How many page faults occurred since the end of the last garbage
1090       collection.
1091
1092    -  Which generation is being garbage collected.
1093
1094RTS options for concurrency and parallelism
1095-------------------------------------------
1096
1097The RTS options related to concurrency are described in
1098:ref:`using-concurrent`, and those for parallelism in
1099:ref:`parallel-options`.
1100
1101.. _rts-profiling:
1102
1103RTS options for profiling
1104-------------------------
1105
1106Most profiling runtime options are only available when you compile your
1107program for profiling (see :ref:`prof-compiler-options`, and
1108:ref:`rts-options-heap-prof` for the runtime options). However, there is
1109one profiling option that is available for ordinary non-profiled
1110executables:
1111
1112.. rts-flag:: -hT
1113              -h
1114
1115    Generates a basic heap profile, in the file :file:`prog.hp`. To produce the
1116    heap profile graph, use :command:`hp2ps` (see :ref:`hp2ps`). The basic heap
1117    profile is broken down by data constructor, with other types of closures
1118    (functions, thunks, etc.) grouped into broad categories (e.g. ``FUN``,
1119    ``THUNK``). To get a more detailed profile, use the full profiling support
1120    (:ref:`profiling`). Can be shortened to :rts-flag:`-h`.
1121
1122    .. note:: The meaning of the shortened :rts-flag:`-h` is dependent on whether
1123              your program was compiled for profiling.
1124              (See :ref:`rts-options-heap-prof` for details.)
1125
1126.. rts-flag:: -L ⟨n⟩
1127
1128    :default: 25 characters
1129
1130    Sets the maximum length of the cost-centre names listed in the heap profile.
1131
1132.. _rts-eventlog:
1133
1134Tracing
1135-------
1136
1137.. index::
1138   single: tracing
1139   single: events
1140   single: eventlog files
1141
1142When the program is linked with the :ghc-flag:`-eventlog` option
1143(:ref:`options-linker`), runtime events can be logged in several ways:
1144
1145-  In binary format to a file for later analysis by a variety of tools.
1146   One such tool is
1147   `ThreadScope <http://www.haskell.org/haskellwiki/ThreadScope>`__,
1148   which interprets the event log to produce a visual parallel execution
1149   profile of the program.
1150
-  In binary format to a customized event log writer. This enables live
1152   analysis of the events while the program is running.
1153
1154-  As text to standard output, for debugging purposes.
1155
1156.. rts-flag:: -l ⟨flags⟩
1157
1158    Log events in binary format. Without any ⟨flags⟩ specified, this
1159    logs a default set of events, suitable for use with tools like ThreadScope.
1160
    By default the events are written to :file:`{program}.eventlog`, though
    the mechanism for writing event log data can be overridden with a custom
    ``EventLogWriter``.
1164
1165    For some special use cases you may want more control over which
1166    events are included. The ⟨flags⟩ is a sequence of zero or more
    characters indicating which classes of events to log. Currently
    these are the classes of events that can be enabled/disabled:
1169
1170    - ``s`` — scheduler events, including Haskell thread creation and start/stop
1171      events. Enabled by default.
1172
1173    - ``g`` — GC events, including GC start/stop. Enabled by default.
1174
1175    - ``n`` — non-moving garbage collector (see :rts-flag:`--nonmoving-gc`)
1176      events including start and end of the concurrent mark and census
1177      information to characterise heap fragmentation. Disabled by default.
1178
1179    - ``p`` — parallel sparks (sampled). Enabled by default.
1180
1181    - ``f`` — parallel sparks (fully accurate). Disabled by default.
1182
1183    - ``u`` — user events. These are events emitted from Haskell code using
1184      functions such as ``Debug.Trace.traceEvent``. Enabled by default.
1185
1186    You can disable specific classes, or enable/disable all classes at
1187    once:
1188
1189    - ``a`` — enable all event classes listed above
1190    - ``-⟨x⟩`` — disable the given class of events, for any event class listed above
1191    - ``-a`` — disable all classes
1192
1193    For example, ``-l-ag`` would disable all event classes (``-a``) except for
1194    GC events (``g``).
1195
1196    For spark events there are two modes: sampled and fully accurate.
1197    There are various events in the life cycle of each spark, usually
1198    just creating and running, but there are some more exceptional
1199    possibilities. In the sampled mode the number of occurrences of each
1200    kind of spark event is sampled at frequent intervals. In the fully
1201    accurate mode every spark event is logged individually. The latter
1202    has a higher runtime overhead and is not enabled by default.
1203
    The format of the log file is described in this users guide in
    :ref:`eventlog-encodings`. It can be parsed in Haskell using the
1206    `ghc-events <http://hackage.haskell.org/package/ghc-events>`__
1207    library. To dump the contents of a ``.eventlog`` file as text, use
1208    the tool ``ghc-events show`` that comes with the
1209    `ghc-events <http://hackage.haskell.org/package/ghc-events>`__
1210    package.
1211
1212    Each event is associated with a timestamp which is the number of
    nanoseconds since the start of execution of the running program.
1214    This is the elapsed time, not the CPU time.
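
    As an illustration of the user event class (``u``), the sketch below
    brackets a computation with two events using
    ``Debug.Trace.traceEventIO``. The event names and the workload are
    arbitrary choices made for this example.

    .. code-block:: haskell

        import Control.Exception (evaluate)
        import Debug.Trace (traceEventIO)

        main :: IO ()
        main = do
          traceEventIO "START sum" -- user event emitted before the work
          total <- evaluate (sum [1 .. 100000 :: Int])
          traceEventIO "STOP sum"  -- user event emitted after the work
          print total

    Following the pattern of the ``-l-ag`` example above, running such a
    program with ``+RTS -l-au -RTS`` should log only the user events; the
    timestamps of the two events then give the elapsed time of the
    computation. When the program is run without ``-l``, the calls are
    expected to have no visible effect.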
1215
1216.. rts-flag:: -ol ⟨filename⟩
1217
1218    :default: :file:`<program>.eventlog`
1219    :since: 8.8
1220
1221    Sets the destination for the eventlog produced with the
1222    :rts-flag:`-l ⟨flags⟩` flag.
1223
1224.. rts-flag:: -v [⟨flags⟩]
1225
1226    Log events as text to standard output, instead of to the
1227    ``.eventlog`` file. The ⟨flags⟩ are the same as for ``-l``, with the
    additional option ``t`` which indicates that each event printed
1229    should be preceded by a timestamp value (in the binary ``.eventlog``
1230    file, all events are automatically associated with a timestamp).
1231
1232The debugging options ``-Dx`` also generate events which are logged
1233using the tracing framework. By default those events are dumped as text
1234to stdout (``-Dx`` implies ``-v``), but they may instead be stored in
1235the binary eventlog file by using the ``-l`` option.
1236
1237.. _rts-options-debugging:
1238
1239RTS options for hackers, debuggers, and over-interested souls
1240-------------------------------------------------------------
1241
1242.. index::
1243   single: RTS options, hacking/debugging
1244
1245These RTS options might be used (a) to avoid a GHC bug, (b) to see
1246"what's really happening", or (c) because you feel like it. Not
1247recommended for everyday use!
1248
1249.. rts-flag:: -B
1250
1251    Sound the bell at the start of each (major) garbage collection.
1252
1253    Oddly enough, people really do use this option! Our pal in Durham
1254    (England), Paul Callaghan, writes: “Some people here use it for a
1255    variety of purposes—honestly!—e.g., confirmation that the
1256    code/machine is doing something, infinite loop detection, gauging
1257    cost of recently added code. Certain people can even tell what stage
1258    [the program] is in by the beep pattern. But the major use is for
1259    annoying others in the same office…”
1260
1261.. rts-flag:: -D ⟨x⟩
1262
1263    An RTS debugging flag; only available if the program was linked with
1264    the :ghc-flag:`-debug` option. Various values of ⟨x⟩ are provided to enable
1265    debug messages and additional runtime sanity checks in different
1266    subsystems in the RTS, for example ``+RTS -Ds -RTS`` enables debug
1267    messages from the scheduler. Use ``+RTS -?`` to find out which debug
1268    flags are supported.
1269
1270    Full list of currently supported flags:
1271
1272.. rts-flag::  -Ds  DEBUG: scheduler
1273.. rts-flag::  -Di  DEBUG: interpreter
1274.. rts-flag::  -Dw  DEBUG: weak
1275.. rts-flag::  -DG  DEBUG: gccafs
1276.. rts-flag::  -Dg  DEBUG: gc
1277.. rts-flag::  -Db  DEBUG: block
1278.. rts-flag::  -DS  DEBUG: sanity
1279.. rts-flag::  -DZ  DEBUG: zero freed memory on GC
1280.. rts-flag::  -Dt  DEBUG: stable
1281.. rts-flag::  -Dp  DEBUG: prof
1282.. rts-flag::  -Da  DEBUG: apply
1283.. rts-flag::  -Dl  DEBUG: linker
1284.. rts-flag::  -Dm  DEBUG: stm
1285.. rts-flag::  -Dz  DEBUG: stack squeezing
1286.. rts-flag::  -Dc  DEBUG: program coverage
1287.. rts-flag::  -Dr  DEBUG: sparks
1288.. rts-flag::  -DC  DEBUG: compact
1289
1290    Debug messages will be sent to the binary event log file instead of
1291    stdout if the :rts-flag:`-l ⟨flags⟩` option is added. This might be useful
1292    for reducing the overhead of debug tracing.
1293
1294    To figure out what exactly they do, the least bad way is to grep the rts/ directory in
1295    the ghc code for macros like ``DEBUG(scheduler`` or ``DEBUG_scheduler``.
1296
1297.. rts-flag:: -r ⟨file⟩
1298
1299    .. index::
1300       single: ticky ticky profiling
1301       single: profiling; ticky ticky
1302
1303    Produce "ticky-ticky" statistics at the end of the program run (only
1304    available if the program was linked with :ghc-flag:`-debug`). The ⟨file⟩
1305    business works just like on the :rts-flag:`-S [⟨file⟩]` RTS option, above.
1306
1307    For more information on ticky-ticky profiling, see
1308    :ref:`ticky-ticky`.
1309
1310.. rts-flag:: -xc
1311
1312    (Only available when the program is compiled for profiling.) When an
1313    exception is raised in the program, this option causes a stack trace
1314    to be dumped to ``stderr``.
1315
1316    This can be particularly useful for debugging: if your program is
1317    complaining about a ``head []`` error and you haven't got a clue
1318    which bit of code is causing it, compiling with
1319    ``-prof -fprof-auto`` (see :ghc-flag:`-prof`) and running with ``+RTS -xc
1320    -RTS`` will tell you exactly the call stack at the point the error was
1321    raised.
1322
1323    The output contains one report for each exception raised in the
1324    program (the program might raise and catch several exceptions during
1325    its execution), where each report looks something like this:
1326
1327    .. code-block:: none
1328
1329        *** Exception raised (reporting due to +RTS -xc), stack trace:
1330          GHC.List.CAF
1331          --> evaluated by: Main.polynomial.table_search,
1332          called from Main.polynomial.theta_index,
1333          called from Main.polynomial,
1334          called from Main.zonal_pressure,
1335          called from Main.make_pressure.p,
1336          called from Main.make_pressure,
1337          called from Main.compute_initial_state.p,
1338          called from Main.compute_initial_state,
1339          called from Main.CAF
1340          ...
1341
1342    The stack trace may often begin with something uninformative like
1343    ``GHC.List.CAF``; this is an artifact of GHC's optimiser, which
1344    lifts out exceptions to the top-level where the profiling system
1345    assigns them to the cost centre "CAF". However, ``+RTS -xc`` doesn't
1346    just print the current stack, it looks deeper and reports the stack
1347    at the time the CAF was evaluated, and it may report further stacks
1348    until a non-CAF stack is found. In the example above, the next stack
1349    (after ``--> evaluated by``) contains plenty of information about
1350    what the program was doing when it evaluated ``head []``.
1351
1352    Implementation details aside, the function names in the stack should
1353    hopefully give you enough clues to track down the bug.
1354
1355    See also the function ``traceStack`` in the module ``Debug.Trace``
1356    for another way to view call stacks.
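
    A small program for trying this out might look like the sketch below.
    The function ``firstReading`` is a hypothetical stand-in for the buggy
    code; the exception is caught so that the program still exits normally.

    .. code-block:: haskell

        import Control.Exception (SomeException, evaluate, try)

        -- Hypothetical buggy function: fails when given an empty list.
        firstReading :: [Int] -> Int
        firstReading xs = head xs

        main :: IO ()
        main = do
          -- The exception is raised when the result is forced by evaluate.
          r <- try (evaluate (firstReading [])) :: IO (Either SomeException Int)
          putStrLn (either (const "caught an exception") show r)

    Compiled with ``-prof -fprof-auto`` and run with ``+RTS -xc -RTS``, this
    should produce a report for the ``head []`` failure even though the
    exception is caught, since a report is printed for each exception raised.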
1357
1358.. rts-flag:: -Z
1359
1360    Turn *off* update frame squeezing on context switch.
1361    (There's no particularly good reason to turn it off, except to
1362    ensure the accuracy of certain data collected regarding thunk entry
1363    counts.)
1364
1365.. _ghc-info:
1366
1367Getting information about the RTS
1368---------------------------------
1369
1370.. index::
1371   single: RTS
1372
1373.. rts-flag:: --info
1374
1375    It is possible to ask the RTS to give some information about itself. To
1376    do this, use the :rts-flag:`--info` flag, e.g.
1377
1378    .. code-block:: none
1379
1380        $ ./a.out +RTS --info
1381        [("GHC RTS", "YES")
1382        ,("GHC version", "6.7")
1383        ,("RTS way", "rts_p")
1384        ,("Host platform", "x86_64-unknown-linux")
1385        ,("Host architecture", "x86_64")
1386        ,("Host OS", "linux")
1387        ,("Host vendor", "unknown")
1388        ,("Build platform", "x86_64-unknown-linux")
1389        ,("Build architecture", "x86_64")
1390        ,("Build OS", "linux")
1391        ,("Build vendor", "unknown")
1392        ,("Target platform", "x86_64-unknown-linux")
1393        ,("Target architecture", "x86_64")
1394        ,("Target OS", "linux")
1395        ,("Target vendor", "unknown")
1396        ,("Word size", "64")
1397        ,("Compiler unregisterised", "NO")
1398        ,("Tables next to code", "YES")
1399        ,("Flag -with-rtsopts", "")
1400        ]
1401
    The information is formatted such that it can be read as a value of type
1403    ``[(String, String)]``. Currently the following fields are present:
1404
1405    ``GHC RTS``
1406        Is this program linked against the GHC RTS? (always "YES").
1407
1408    ``GHC version``
1409        The version of GHC used to compile this program.
1410
1411    ``RTS way``
1412        The variant (“way”) of the runtime. The most common values are
1413        ``rts_v`` (vanilla), ``rts_thr`` (threaded runtime, i.e. linked
1414        using the :ghc-flag:`-threaded` option) and ``rts_p`` (profiling runtime,
1415        i.e. linked using the :ghc-flag:`-prof` option). Other variants include
1416        ``debug`` (linked using :ghc-flag:`-debug`), and ``dyn`` (the RTS is linked
1417        in dynamically, i.e. a shared library, rather than statically linked
1418        into the executable itself). These can be combined, e.g. you might
1419        have ``rts_thr_debug_p``.
1420
    ``Target platform``, ``Target architecture``, ``Target OS``, ``Target vendor``
        These describe the platform the program is compiled to run on.
1423
    ``Build platform``, ``Build architecture``, ``Build OS``, ``Build vendor``
        These describe the platform on which the program was built. (That is, the
1426        target platform of GHC itself.) Ordinarily this is identical to the
1427        target platform. (It could potentially be different if
1428        cross-compiling.)
1429
    ``Host platform``, ``Host architecture``, ``Host OS``, ``Host vendor``
        These describe the platform where GHC itself was compiled. Again, this
1432        would normally be identical to the build and target platforms.
1433
1434    ``Word size``
1435        Either ``"32"`` or ``"64"``, reflecting the word size of the target
1436        platform.
1437
    ``Compiler unregisterised``
        Was this program compiled with an :ref:`"unregisterised" <unreg>`
1440        version of GHC? (I.e., a version of GHC that has no
1441        platform-specific optimisations compiled in, usually because this is
1442        a currently unsupported platform.) This value will usually be no,
1443        unless you're using an experimental build of GHC.
1444
1445    ``Tables next to code``
1446        Putting info tables directly next to entry code is a useful
1447        performance optimisation that is not available on all platforms.
1448        This field tells you whether the program has been compiled with this
1449        optimisation. (Usually yes, except on unusual platforms.)
1450
1451    ``Flag -with-rtsopts``
1452        The value of the GHC flag :ghc-flag:`-with-rtsopts=⟨opts⟩` at compile/link time.
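
    Since the output can be read as a ``[(String, String)]``, a tool can
    inspect it directly. The sketch below checks whether the ``RTS way``
    field mentions the threaded runtime; ``wayParts`` and ``isThreaded`` are
    hypothetical helpers, and the input string is a shortened sample rather
    than real output.

    .. code-block:: haskell

        import Text.Read (readMaybe)

        -- Split a way name on underscores: "rts_thr_p" becomes
        -- ["rts", "thr", "p"].
        wayParts :: String -> [String]
        wayParts s = case break (== '_') s of
          (part, [])       -> [part]
          (part, _ : rest) -> part : wayParts rest

        -- Nothing if the text cannot be parsed or has no "RTS way" field.
        isThreaded :: String -> Maybe Bool
        isThreaded info = do
          fields <- readMaybe info :: Maybe [(String, String)]
          way <- lookup "RTS way" fields
          pure ("thr" `elem` wayParts way)

        main :: IO ()
        main =
          print (isThreaded "[(\"GHC RTS\", \"YES\"),(\"RTS way\", \"rts_thr_debug_p\")]")
          -- prints Just True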
1453