# processx

> Execute and Control System Processes

<!-- badges: start -->

[![lifecycle](https://lifecycle.r-lib.org/articles/figures/lifecycle-stable.svg)](https://lifecycle.r-lib.org/articles/stages.html)
[![R build
status](https://github.com/r-lib/processx/workflows/R-CMD-check/badge.svg)](https://github.com/r-lib/processx/actions)
[![](https://www.r-pkg.org/badges/version/processx)](https://www.r-pkg.org/pkg/processx)
[![CRAN RStudio mirror
downloads](https://cranlogs.r-pkg.org/badges/processx)](https://www.r-pkg.org/pkg/processx)
[![Coverage
Status](https://img.shields.io/codecov/c/github/r-lib/processx/master.svg)](https://codecov.io/github/r-lib/processx?branch=master)
<!-- badges: end -->

Tools to run system processes in the background, read their standard
output and error, and kill them.

processx can poll the standard output and error of a single process, or
multiple processes, using the operating system’s polling and waiting
facilities, with a timeout.

-----

  - [Features](#features)
  - [Installation](#installation)
  - [Usage](#usage)
      - [Running an external process](#running-an-external-process)
          - [Errors](#errors)
          - [Showing output](#showing-output)
          - [Spinner](#spinner)
          - [Callbacks for I/O](#callbacks-for-io)
      - [Managing external processes](#managing-external-processes)
          - [Starting processes](#starting-processes)
          - [Killing a process](#killing-a-process)
          - [Standard output and error](#standard-output-and-error)
          - [End of output](#end-of-output)
          - [Polling the standard output and
            error](#polling-the-standard-output-and-error)
          - [Polling multiple processes](#polling-multiple-processes)
          - [Waiting on a process](#waiting-on-a-process)
          - [Exit statuses](#exit-statuses)
          - [Mixing processx and the parallel base R
            package](#mixing-processx-and-the-parallel-base-r-package)
          - [Errors](#errors-1)
  - [Related tools](#related-tools)
  - [Code of Conduct](#code-of-conduct)
  - [License](#license)

## Features

  - Start system processes in the background and find their process id.
  - Read the standard output and error, using non-blocking connections.
  - Poll the standard output and error connections of a single process
    or multiple processes.
  - Write to the standard input of background processes.
  - Check if a background process is running.
  - Wait on a background process, or multiple processes, with a timeout.
  - Get the exit status of a background process, if it has already
    finished.
  - Kill background processes.
  - Kill a background process when its associated object is garbage
    collected.
  - Kill background processes and all their child processes.
  - Works on Linux, macOS and Windows.
  - Lightweight: it depends only on the R6 and ps packages, which are
    lightweight themselves.

## Installation

Install the stable version from CRAN:

``` r
install.packages("processx")
```

## Usage

``` r
library(processx)
```

> Note: the following external commands are usually present on macOS and
> Linux systems, but not necessarily on Windows. We will also use the
> `px` command line tool (`px.exe` on Windows), which is a very simple
> program that can produce output to `stdout` and `stderr`, with the
> specified timings.

``` r
# Only one of the two paths exists on a given platform; system.file()
# returns "" for the other, so paste0() yields the path to the px binary.
px <- paste0(
  system.file(package = "processx", "bin", "px"),
  system.file(package = "processx", "bin", .Platform$r_arch, "px.exe")
)
px
```

    #> [1] "/private/var/folders/59/0gkmw1yj2w7bf2dfc3jznv5w0000gn/T/Rtmp7ipFsS/temp_libpathb89a55e5c2f9/processx/bin/px"

### Running an external process

The `run()` function runs an external command. It requires a single
command, and a character vector of arguments. You don’t need to quote
the command or the arguments, as they are passed directly to the
operating system, without an intermediate shell.

``` r
run("echo", "Hello R!")
```

    #> $status
    #> [1] 0
    #>
    #> $stdout
    #> [1] "Hello R!\n"
    #>
    #> $stderr
    #> [1] ""
    #>
    #> $timeout
    #> [1] FALSE

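Because no shell sits between R and the command, characters that a shell would normally expand or split on are passed through unchanged. The following is a minimal sketch that uses the `px` helper bundled with processx; the argument string is only an illustration.

``` r
library(processx)

# Path to the px helper bundled with processx.
px <- paste0(
  system.file(package = "processx", "bin", "px"),
  system.file(package = "processx", "bin", .Platform$r_arch, "px.exe")
)

# Each element of the character vector is one argument, passed verbatim:
# no word splitting, no variable expansion, no globbing.
res <- run(px, c("outln", "two words; $HOME *"))
res$stdout
```

The whole string arrives as a single argument, so no extra quoting or escaping is needed.
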
Here is a short summary of the `px` binary, which is used extensively
below:

``` r
result <- run(px, "--help", echo = TRUE)
```

    #> Usage: px [command arg] [command arg] ...
    #>
    #> Commands:
    #>   sleep  <seconds>           -- sleep for a number os seconds
    #>   out    <string>            -- print string to stdout
    #>   err    <string>            -- print string to stderr
    #>   outln  <string>            -- print string to stdout, add newline
    #>   errln  <string>            -- print string to stderr, add newline
    #>   errflush                   -- flush stderr stream
    #>   cat    <filename>          -- print file to stdout
    #>   return <exitcode>          -- return with exitcode
    #>   writefile <path> <string>  -- write to file
    #>   write <fd> <string>        -- write to file descriptor
    #>   echo <fd1> <fd2> <nbytes>  -- echo from fd to another fd
    #>   getenv <var>               -- environment variable to stdout

> Note: From version 3.0.1, processx does not let you specify a full
> shell command line, as this involves starting a grandchild process
> from the child process, and it is difficult to clean up the grandchild
> process when the child process is killed. The user can still start a
> shell (`sh` or `cmd.exe`) directly, of course, but then proper cleanup
> is the user’s responsibility.

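If shell features such as `&&` or pipes are really needed, the shell itself can be the command. The sketch below assumes a POSIX `sh` on the `PATH`; on Windows the rough equivalent would be `cmd.exe` with `/c`.

``` r
library(processx)

# Assumption: a POSIX shell named `sh` is available on the PATH.
# The complete shell command line is passed as ONE argument after "-c".
res <- run("sh", c("-c", "echo one && echo two"))
res$stdout
```

Any processes started by the shell are grandchildren of R, so cleaning them up is not handled automatically.
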
#### Errors

By default `run()` throws an error if the process exits with a non-zero
status code. To avoid this, specify `error_on_status = FALSE`:

``` r
run(px, c("out", "oh no!", "return", "2"), error_on_status = FALSE)
```

    #> $status
    #> [1] 2
    #>
    #> $stdout
    #> [1] "oh no!"
    #>
    #> $stderr
    #> [1] ""
    #>
    #> $timeout
    #> [1] FALSE

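When the default behavior is kept, the error can be handled like any other R condition. The sketch below catches it with `tryCatch()`, and also shows the `timeout` argument of `run()` (in seconds), which is what the `timeout` field of the result refers to. The variable names `outcome` and `slow` are only illustrative.

``` r
library(processx)

# Path to the px helper bundled with processx.
px <- paste0(
  system.file(package = "processx", "bin", "px"),
  system.file(package = "processx", "bin", .Platform$r_arch, "px.exe")
)

# run() throws a regular R error for a non-zero exit status.
outcome <- tryCatch(
  {
    run(px, c("return", "2"))
    "succeeded"
  },
  error = function(e) {
    message("Command failed: ", conditionMessage(e))
    "failed"
  }
)
outcome

# With a timeout, run() kills the process once the limit is reached.
# error_on_status = FALSE makes run() return a result instead of throwing.
slow <- run(px, c("sleep", "5"), timeout = 1, error_on_status = FALSE)
slow$timeout
```
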
#### Showing output

To show the output of the process on the screen, use the `echo`
argument. Note that the order of `stdout` and `stderr` lines may be
incorrect, because they are coming from two different connections.

``` r
result <- run(px,
  c("outln", "out", "errln", "err", "outln", "out again"),
  echo = TRUE)
```

    #> out
    #> out again
    #> err

If you have a terminal that supports ANSI colors, then the standard
error output is shown in red.

The standard output and error are still included in the result of the
`run()` call:

``` r
result
```

    #> $status
    #> [1] 0
    #>
    #> $stdout
    #> [1] "out\nout again\n"
    #>
    #> $stderr
    #> [1] "err\n"
    #>
    #> $timeout
    #> [1] FALSE

Note that `run()` is different from `system()`, and it always shows the
output of the process on R’s proper standard output, instead of writing
to the terminal directly. This means for example that you can capture
the output with `capture.output()` or use `sink()`, etc.:

``` r
out1 <- capture.output(r1 <- system("ls"))
out2 <- capture.output(r2 <- run("ls", echo = TRUE))
```

``` r
out1
```

    #> character(0)

``` r
out2
```

    #>  [1] "CODE_OF_CONDUCT.md" "DESCRIPTION"        "LICENSE"
    #>  [4] "LICENSE.md"         "Makefile"           "NAMESPACE"
    #>  [7] "NEWS.md"            "R"                  "README.Rmd"
    #> [10] "README.html"        "README.md"          "_pkgdown.yml"
    #> [13] "inst"               "man"                "processx.Rproj"
    #> [16] "src"                "tests"

#### Spinner

The `spinner` option of `run()` shows a calming spinner in the terminal
while the background program is running. The spinner is always shown in
the first character of the last line, so you can make it work nicely
with the regular output of the background process if you like. E.g. try
this in your R terminal:

    result <- run(px,
      c("out", "  foo",
        "sleep", "1",
        "out", "\r  bar",
        "sleep", "1",
        "out", "\rX foobar\n"),
      echo = TRUE, spinner = TRUE)

#### Callbacks for I/O

`run()` can call an R function for each line of the standard output or
error of the process; just supply the `stdout_line_callback` or the
`stderr_line_callback` arguments. The callback functions take two
arguments: the first one is a character scalar, the output line, and
the second one is the `process` object that represents the background
process. (See more below about `process` objects.) You can manipulate
this object in the callback, if you want. For example you can kill it in
response to an error or some text on the standard output:

``` r
# Print every line, and kill the process once it prints "done"
cb <- function(line, proc) {
  cat("Got:", line, "\n")
  if (line == "done") proc$kill()
}
result <- run(px,
  c("outln", "this", "outln", "that", "outln", "done",
    "outln", "still here", "sleep", "10", "outln", "dead by now"),
  stdout_line_callback = cb,
  error_on_status = FALSE
)
```

    #> Got: this
    #> Got: that
    #> Got: done
    #> Got: still here

``` r
result
```

    #> $status
    #> [1] -9
    #>
    #> $stdout
    #> [1] "this\nthat\ndone\nstill here\n"
    #>
    #> $stderr
    #> [1] ""
    #>
    #> $timeout
    #> [1] FALSE

Keep in mind that while the R callback is running, the background
process is not stopped; it keeps running. In the previous example,
whether `still here` is printed or not depends on the scheduling of the
R process and the background process by the OS. Typically, it is
printed, because the R callback takes a while to run.

In addition to the line-oriented callbacks, the `stdout_callback` and
`stderr_callback` arguments can specify callback functions that are
called with output chunks instead of single lines. A chunk may contain
multiple lines (separated by `\n` or `\r\n`), or even incomplete lines.

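Since chunk boundaries are arbitrary, a chunk callback usually has to buffer what it receives and split it into lines itself. The following is a sketch of that idea; it assumes the chunk callback is called with the same two arguments as the line callbacks, and the names `chunks` and `collect` are made up for this example.

``` r
library(processx)

# Path to the px helper bundled with processx.
px <- paste0(
  system.file(package = "processx", "bin", "px"),
  system.file(package = "processx", "bin", .Platform$r_arch, "px.exe")
)

# Accumulate the raw chunks as they arrive.
chunks <- character()
collect <- function(chunk, proc) {
  chunks <<- c(chunks, chunk)
}

res <- run(px, c("outln", "first", "outln", "second"),
  stdout_callback = collect)

# How the output is cut into chunks depends on timing, so glue the
# chunks together before splitting on line endings.
text <- paste(chunks, collapse = "")
strsplit(text, "\r?\n")[[1]]
```
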
### Managing external processes

If you need better control over possibly multiple background processes,
then you can use the R6 `process` class directly.

#### Starting processes

To start a new background process, create a new instance of the
`process` class.

``` r
p <- process$new("sleep", "20")
```

#### Killing a process

A process can be killed via the `kill()` method.

``` r
p$is_alive()
```

    #> [1] TRUE

``` r
p$kill()
```

    #> [1] TRUE

``` r
p$is_alive()
```

    #> [1] FALSE

Note that processes are finalized (and killed) automatically if the
corresponding `process` object goes out of scope, as soon as the object
is garbage collected by R:

``` r
p <- process$new("sleep", "20")
rm(p)
gc()
```

    #>          used (Mb) gc trigger (Mb) limit (Mb) max used (Mb)
    #> Ncells 493821 26.4    1069461 57.2         NA   682911 36.5
    #> Vcells 928674  7.1    8388608 64.0      16384  1883216 14.4

Here, the direct call to the garbage collector kills the `sleep` process
as well. See the `cleanup` option if you want to avoid this behavior.

#### Standard output and error

By default the standard output and error of the processes are ignored.
You can set the `stdout` and `stderr` constructor arguments to a file
name, and then they are redirected there, or to `"|"`, and then processx
creates connections to them. (Note that starting from processx 3.0.0
these connections are not regular R connections, because the public R
connection API was retroactively removed from R.)

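As a small illustration of the file-name variant, the sketch below redirects the standard output of a short-lived process to a temporary file and reads it back after the process has finished. The file name comes from `tempfile()` and is only an example.

``` r
library(processx)

# Path to the px helper bundled with processx.
px <- paste0(
  system.file(package = "processx", "bin", "px"),
  system.file(package = "processx", "bin", .Platform$r_arch, "px.exe")
)

# Redirect standard output to a file instead of a pipe.
out_file <- tempfile(fileext = ".log")
p <- process$new(px, c("outln", "written to a file"), stdout = out_file)

# Wait for the process to finish before reading the file.
p$wait()
readLines(out_file)
```

With a file there is no pipe buffer to drain, so the process cannot stall because R is not reading its output.
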
The `read_output_lines()` and `read_error_lines()` methods can be used
to read complete lines from the standard output or error connections.
They work similarly to the `readLines()` base R function.

Note that the connections have a buffer, which can fill up if R does
not read out the output. The process then stops until R reads from the
connection and frees up the buffer.

> **Always make sure that you read out the standard output and/or error
> of the pipes, otherwise the background process will stop running!**

If you don’t need the standard output or error any more, you can also
close it, like this:

``` r
close(p$get_output_connection())
close(p$get_error_connection())
```

Note that the connections used for reading the output and error streams
are non-blocking, so the read functions will return immediately, even if
there is no text to read from them. If you want to make sure that there
is data available to read, you need to poll, see below.

``` r
p <- process$new(px,
  c("sleep", "1", "outln", "foo", "errln", "bar", "outln", "foobar"),
  stdout = "|", stderr = "|")
p$read_output_lines()
```

    #> character(0)

``` r
p$read_error_lines()
```

    #> character(0)

#### End of output

The standard R way to query the end of the stream for a non-blocking
connection is to use the `isIncomplete()` function. *After a read
attempt*, this function returns `FALSE` if the connection definitely
has no more data. (If the read attempt returns no data, but
`isIncomplete()` returns `TRUE`, then the connection might deliver more
data in the future.)

The `is_incomplete_output()` and `is_incomplete_error()` methods work
similarly for `process` objects.

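Putting these pieces together, a common pattern is to keep reading until the process has exited and the connection reports that no more data can arrive. This is a sketch rather than the only way to do it; the 1000 ms poll timeout is an arbitrary choice.

``` r
library(processx)

# Path to the px helper bundled with processx.
px <- paste0(
  system.file(package = "processx", "bin", "px"),
  system.file(package = "processx", "bin", .Platform$r_arch, "px.exe")
)

p <- process$new(px,
  c("outln", "foo", "sleep", "1", "outln", "bar"),
  stdout = "|")

lines <- character()
# Loop while the process is running or its output may still hold data.
while (p$is_alive() || p$is_incomplete_output()) {
  p$poll_io(1000)                           # wait for data, at most 1 second
  lines <- c(lines, p$read_output_lines())  # non-blocking read
}
lines
```

Polling before each read avoids a busy loop, because the non-blocking read would otherwise return immediately with no data.
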
#### Polling the standard output and error

The `poll_io()` method waits for data on the standard output and/or
error of a process. It will return if any of the following events
happen:

  - Data is available on the standard output of the process (assuming
    there is a connection to the standard output).
  - Data is available on the standard error of the process (assuming
    there is a connection to the standard error).
  - The process has finished and the standard output and/or error
    connections were closed on the other end.
  - The specified timeout period expired.

For example the following code waits about a second for output.

``` r
p <- process$new(px, c("sleep", "1", "outln", "kuku"), stdout = "|")

## No output yet
p$read_output_lines()
```

    #> character(0)

``` r
## Wait at most 5 sec
p$poll_io(5000)
```

    #>   output    error  process
    #>  "ready" "nopipe" "nopipe"

``` r
## There is output now
p$read_output_lines()
```

    #> [1] "kuku"

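The return value of `poll_io()` is a named character vector, so code can branch on it. The sketch below polls with a timeout that is deliberately shorter than the time the process sleeps, so the poll is expected to time out; the 200 ms value is arbitrary.

``` r
library(processx)

# Path to the px helper bundled with processx.
px <- paste0(
  system.file(package = "processx", "bin", "px"),
  system.file(package = "processx", "bin", .Platform$r_arch, "px.exe")
)

p <- process$new(px, c("sleep", "2", "outln", "late"), stdout = "|")

# Wait at most 200 ms; the process only prints after 2 seconds.
status <- p$poll_io(200)

if (status[["output"]] == "ready") {
  print(p$read_output_lines())
} else {
  message("No output yet, poll result: ", status[["output"]])
}

# Clean up the example process.
p$kill()
```
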
#### Polling multiple processes

If you need to manage multiple background processes, and need to wait
for output from any of them, processx defines a `poll()` function that
does just that. It is similar to the `poll_io()` method, but it takes
multiple process objects, and returns as soon as one of them has data
on standard output or error, or a timeout expires. Here is an example:

``` r
p1 <- process$new(px, c("sleep", "1", "outln", "output"), stdout = "|")
p2 <- process$new(px, c("sleep", "2", "errln", "error"), stderr = "|")

## After 100ms no output yet
poll(list(p1 = p1, p2 = p2), 100)
```

    #> $p1
    #>    output     error   process
    #> "timeout"  "nopipe"  "nopipe"
    #>
    #> $p2
    #>    output     error   process
    #>  "nopipe" "timeout"  "nopipe"

``` r
## But now we surely have something
poll(list(p1 = p1, p2 = p2), 1000)
```

    #> $p1
    #>   output    error  process
    #>  "ready" "nopipe" "nopipe"
    #>
    #> $p2
    #>   output    error  process
    #> "nopipe" "silent" "nopipe"

``` r
p1$read_output_lines()
```

    #> [1] "output"

``` r
## Done with p1
close(p1$get_output_connection())
```

    #> NULL

``` r
## The second process should have data on stderr soonish
poll(list(p1 = p1, p2 = p2), 5000)
```

    #> $p1
    #>   output    error  process
    #> "closed" "nopipe" "nopipe"
    #>
    #> $p2
    #>   output    error  process
    #> "nopipe"  "ready" "nopipe"

``` r
p2$read_error_lines()
```

    #> [1] "error"

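The same calls can be combined into a loop that drains several processes until all of them are done. This is one possible sketch; the helper `still_open()` and the names `fast` and `slow` are invented for the example. Finished processes are left out of later `poll()` calls so that the loop only waits on connections that can still produce data.

``` r
library(processx)

# Path to the px helper bundled with processx.
px <- paste0(
  system.file(package = "processx", "bin", "px"),
  system.file(package = "processx", "bin", .Platform$r_arch, "px.exe")
)

procs <- list(
  fast = process$new(px, c("sleep", "1", "outln", "fast done"), stdout = "|"),
  slow = process$new(px, c("sleep", "2", "outln", "slow done"), stdout = "|")
)
output <- lapply(procs, function(p) character())

# A process needs attention while it runs or may still have unread output.
still_open <- function(p) p$is_alive() || p$is_incomplete_output()

repeat {
  open <- Filter(still_open, procs)
  if (length(open) == 0) break
  poll(open, 1000)                  # returns once any process has data
  for (name in names(open)) {
    output[[name]] <- c(output[[name]], open[[name]]$read_output_lines())
  }
}
output
```
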
#### Waiting on a process

As seen before, `is_alive()` checks if a process is running. The
`wait()` method can be used to wait until it has finished (or a
specified timeout expires). E.g. in the following code `wait()` needs
to wait about 2 seconds for the `px` `sleep` command to finish.

``` r
p <- process$new(px, c("sleep", "2"))
p$is_alive()
```

    #> [1] TRUE

``` r
Sys.time()
```

    #> [1] "2021-03-23 15:08:37 CET"

``` r
p$wait()
Sys.time()
```

    #> [1] "2021-03-23 15:08:39 CET"

It is safe to call `wait()` multiple times:

``` r
p$wait() # already finished!
```

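The timeout of `wait()` is given in milliseconds. A short sketch of waiting with a limit and then checking whether the process is still there (the 500 ms value is arbitrary):

``` r
library(processx)

# Path to the px helper bundled with processx.
px <- paste0(
  system.file(package = "processx", "bin", "px"),
  system.file(package = "processx", "bin", .Platform$r_arch, "px.exe")
)

p <- process$new(px, c("sleep", "5"))

# Give up waiting after 500 ms; the process itself keeps running.
p$wait(timeout = 500)
still_running <- p$is_alive()
still_running

# Clean up the example process.
p$kill()
```

A timed-out `wait()` does not kill the process, so it has to be killed explicitly if it is no longer needed.
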
#### Exit statuses

After a process has finished, its exit status can be queried via the
`get_exit_status()` method. If the process is still running, then this
method returns `NULL`.

``` r
p <- process$new(px, c("sleep", "2"))
p$get_exit_status()
```

    #> NULL

``` r
p$wait()
p$get_exit_status()
```

    #> [1] 0

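A non-zero status is reported the same way. The sketch below uses the `return` command of `px` to exit with a chosen status; the value 3 is arbitrary.

``` r
library(processx)

# Path to the px helper bundled with processx.
px <- paste0(
  system.file(package = "processx", "bin", "px"),
  system.file(package = "processx", "bin", .Platform$r_arch, "px.exe")
)

# `px return 3` exits immediately with status 3.
p <- process$new(px, c("return", "3"))
p$wait()

status <- p$get_exit_status()
if (status != 0) {
  message("Process failed with exit status ", status)
}
```

Unlike `run()`, a `process` object never throws an error because of a non-zero status; checking it is up to the caller.
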
#### Mixing processx and the parallel base R package

In general, mixing processx (via callr or not) and parallel works fine.
If you use parallel’s ‘fork’ clusters, e.g. via
`parallel::mcparallel()`, then you might see two issues. One is that
processx will not be able to determine the exit status of some processx
processes. This is because the status is read out by parallel, and
processx will set it to `NA`. The other one is that parallel might
complain that it could not clean up some subprocesses. This is not an
error, and it is harmless, but it does hold up R for about 10 seconds,
before parallel gives up. To work around this, you can set the
`PROCESSX_NOTIFY_OLD_SIGCHLD` environment variable to a non-empty value,
before you load processx. This behavior might be the default in the
future.

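In a script, the workaround could look like the sketch below. The value `"true"` is an arbitrary choice; according to the description above, any non-empty value should do.

``` r
# Set the variable first, e.g. at the very top of the script,
# so that it is already in place when processx is loaded.
Sys.setenv(PROCESSX_NOTIFY_OLD_SIGCHLD = "true")

library(processx)
```

Setting the variable in `.Renviron` or in the shell that starts R should have the same effect, since it is then also set before the package is loaded.
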
#### Errors

Errors are typically signalled via non-zero exit statuses. The `process`
constructor fails if the external program cannot be started, but it does
not deal with errors that happen after the program has successfully
started running.

``` r
p <- process$new("nonexistant-command-for-sure")
```

    #> Error in rethrow_call(c_processx_exec, command, c(command, args), pty, : cannot start processx process 'nonexistant-command-for-sure' (system error 2, No such file or directory) @unix/processx.c:610 (processx_exec)

``` r
p2 <- process$new(px, c("sleep", "1", "command-does-not-exist"))
p2$wait()
p2$get_exit_status()
```

    #> [1] 5

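A startup failure is an ordinary R error, so it can be caught with `tryCatch()`. A minimal sketch, assuming no program with this made-up name exists on the system:

``` r
library(processx)

# The constructor throws if the program cannot be started.
p <- tryCatch(
  process$new("nonexistent-command-for-sure"),
  error = function(e) {
    message("Could not start process: ", conditionMessage(e))
    NULL
  }
)

is.null(p)
```

Failures that happen after startup still have to be detected through the exit status, as in the second example above.
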
## Related tools

  - The [`ps` package](https://ps.r-lib.org/) can query, list,
    manipulate all system processes (not just subprocesses), and
    processx uses it internally for some of its functionality. You can
    also convert a `processx::process` object to a `ps::ps_handle` with
    the `as_ps_handle()` method.

  - The [`callr` package](https://callr.r-lib.org/) uses processx to
    start another R process, and run R code in it, in the foreground or
    background.

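As a brief illustration of the `as_ps_handle()` conversion mentioned in the list, the sketch below queries the same process through ps. It assumes the ps package is installed, which it should be because processx depends on it.

``` r
library(processx)

# Path to the px helper bundled with processx.
px <- paste0(
  system.file(package = "processx", "bin", "px"),
  system.file(package = "processx", "bin", .Platform$r_arch, "px.exe")
)

p <- process$new(px, c("sleep", "5"))

# Convert the processx object to a ps handle for the same process.
h <- p$as_ps_handle()
same_pid <- ps::ps_pid(h) == p$get_pid()
running <- ps::ps_is_running(h)
c(same_pid = same_pid, running = running)

# Clean up the example process.
p$kill()
```
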
## Code of Conduct

Please note that this project is released with a [Contributor Code of
Conduct](https://processx.r-lib.org/CODE_OF_CONDUCT.html). By
participating in this project you agree to abide by its terms.

## License

MIT © Mango Solutions, RStudio, Gábor Csárdi