# processx

> Execute and Control System Processes

<!-- badges: start -->

[![lifecycle](https://lifecycle.r-lib.org/articles/figures/lifecycle-stable.svg)](https://lifecycle.r-lib.org/articles/stages.html)
[![R build status](https://github.com/r-lib/processx/workflows/R-CMD-check/badge.svg)](https://github.com/r-lib/processx/actions)
[![](https://www.r-pkg.org/badges/version/processx)](https://www.r-pkg.org/pkg/processx)
[![CRAN RStudio mirror downloads](https://cranlogs.r-pkg.org/badges/processx)](https://www.r-pkg.org/pkg/processx)
[![Coverage Status](https://img.shields.io/codecov/c/github/r-lib/processx/master.svg)](https://codecov.io/github/r-lib/processx?branch=master)
<!-- badges: end -->

Tools to run system processes in the background, read their standard
output and error, and kill them.

processx can poll the standard output and error of a single process, or
multiple processes, using the operating system’s polling and waiting
facilities, with a timeout.

-----

  - [Features](#features)
  - [Installation](#installation)
  - [Usage](#usage)
      - [Running an external process](#running-an-external-process)
          - [Errors](#errors)
          - [Showing output](#showing-output)
          - [Spinner](#spinner)
          - [Callbacks for I/O](#callbacks-for-io)
      - [Managing external processes](#managing-external-processes)
          - [Starting processes](#starting-processes)
          - [Killing a process](#killing-a-process)
          - [Standard output and error](#standard-output-and-error)
          - [End of output](#end-of-output)
          - [Polling the standard output and
            error](#polling-the-standard-output-and-error)
          - [Polling multiple processes](#polling-multiple-processes)
          - [Waiting on a process](#waiting-on-a-process)
          - [Exit statuses](#exit-statuses)
          - [Mixing processx and the parallel base R
            package](#mixing-processx-and-the-parallel-base-r-package)
          - [Errors](#errors-1)
  - [Related tools](#related-tools)
  - [Code of Conduct](#code-of-conduct)
  - [License](#license)

## Features

  - Start system processes in the background and find their process id.
  - Read the standard output and error, using non-blocking connections.
  - Poll the standard output and error connections of a single process
    or multiple processes.
  - Write to the standard input of background processes.
  - Check if a background process is running.
  - Wait on a background process, or multiple processes, with a timeout.
  - Get the exit status of a background process, if it has already
    finished.
  - Kill background processes.
  - Kill a background process when its associated object is garbage
    collected.
  - Kill background processes and all their child processes.
  - Works on Linux, macOS and Windows.
  - Lightweight: it depends only on the R6 and ps packages, which are
    lightweight themselves.

## Installation

Install the stable version from CRAN:

``` r
install.packages("processx")
```

## Usage

``` r
library(processx)
```

> Note: the following external commands are usually present in macOS and
> Linux systems, but not necessarily on Windows. We will also use the
> `px` command line tool (`px.exe` on Windows), which is a very simple
> program that can produce output to `stdout` and `stderr`, with the
> specified timings.

``` r
px <- paste0(
  system.file(package = "processx", "bin", "px"),
  system.file(package = "processx", "bin", .Platform$r_arch, "px.exe")
)
px
```

    #> [1] "/private/var/folders/59/0gkmw1yj2w7bf2dfc3jznv5w0000gn/T/Rtmp7ipFsS/temp_libpathb89a55e5c2f9/processx/bin/px"

### Running an external process

The `run()` function runs an external command. It requires a single
command, and a character vector of arguments. You don’t need to quote
the command or the arguments, as they are passed directly to the
operating system, without an intermediate shell.

``` r
run("echo", "Hello R!")
```

    #> $status
    #> [1] 0
    #>
    #> $stdout
    #> [1] "Hello R!\n"
    #>
    #> $stderr
    #> [1] ""
    #>
    #> $timeout
    #> [1] FALSE

Short summary of the `px` binary we are using extensively below:

``` r
result <- run(px, "--help", echo = TRUE)
```

    #> Usage: px [command arg] [command arg] ...
    #>
    #> Commands:
    #>   sleep  <seconds>           --  sleep for a number os seconds
    #>   out    <string>            --  print string to stdout
    #>   err    <string>            --  print string to stderr
    #>   outln  <string>            --  print string to stdout, add newline
    #>   errln  <string>            --  print string to stderr, add newline
    #>   errflush                   --  flush stderr stream
    #>   cat    <filename>          --  print file to stdout
    #>   return <exitcode>          --  return with exitcode
    #>   writefile <path> <string>  --  write to file
    #>   write <fd> <string>        --  write to file descriptor
    #>   echo <fd1> <fd2> <nbytes>  --  echo from fd to another fd
    #>   getenv <var>               --  environment variable to stdout

> Note: From version 3.0.1, processx does not let you specify a full
> shell command line, as this involves starting a grandchild process
> from the child process, and it is difficult to clean up the grandchild
> process when the child process is killed. The user can still start a
> shell (`sh` or `cmd.exe`) directly of course, and then proper cleanup
> is the user’s responsibility.

#### Errors

By default `run()` throws an error if the process exits with a non-zero
status code. To avoid this, specify `error_on_status = FALSE`:

``` r
run(px, c("out", "oh no!", "return", "2"), error_on_status = FALSE)
```

    #> $status
    #> [1] 2
    #>
    #> $stdout
    #> [1] "oh no!"
    #>
    #> $stderr
    #> [1] ""
    #>
    #> $timeout
    #> [1] FALSE

#### Showing output

To show the output of the process on the screen, use the `echo`
argument. Note that the order of `stdout` and `stderr` lines may be
incorrect, because they are coming from two different connections.

``` r
result <- run(px,
  c("outln", "out", "errln", "err", "outln", "out again"),
  echo = TRUE)
```

    #> out
    #> out again
    #> err

If you have a terminal that supports ANSI colors, then the standard
error output is shown in red.

The standard output and error are still included in the result of the
`run()` call:

``` r
result
```

    #> $status
    #> [1] 0
    #>
    #> $stdout
    #> [1] "out\nout again\n"
    #>
    #> $stderr
    #> [1] "err\n"
    #>
    #> $timeout
    #> [1] FALSE

Note that `run()` is different from `system()`, and it always shows the
output of the process on R’s proper standard output, instead of writing
to the terminal directly. This means for example that you can capture
the output with `capture.output()` or use `sink()`, etc.:

``` r
out1 <- capture.output(r1 <- system("ls"))
out2 <- capture.output(r2 <- run("ls", echo = TRUE))
```

``` r
out1
```

    #> character(0)

``` r
out2
```

    #>  [1] "CODE_OF_CONDUCT.md" "DESCRIPTION"        "LICENSE"
    #>  [4] "LICENSE.md"         "Makefile"           "NAMESPACE"
    #>  [7] "NEWS.md"            "R"                  "README.Rmd"
    #> [10] "README.html"        "README.md"          "_pkgdown.yml"
    #> [13] "inst"               "man"                "processx.Rproj"
    #> [16] "src"                "tests"

#### Spinner

The `spinner` option of `run()` shows a calming spinner in the terminal
while the background program is running. The spinner is always shown in
the first character of the last line, so you can make it work nicely
with the regular output of the background process if you like. E.g. try
this in your R terminal:

    result <- run(px,
      c("out", " foo",
        "sleep", "1",
        "out", "\r bar",
        "sleep", "1",
        "out", "\rX foobar\n"),
      echo = TRUE, spinner = TRUE)

#### Callbacks for I/O

`run()` can call an R function for each line of the standard output or
error of the process; just supply the `stdout_line_callback` or the
`stderr_line_callback` arguments. The callback functions take two
arguments: the first one is a character scalar, the output line. The
second one is the `process` object that represents the background
process.
(See more below about `process` objects.) You can manipulate
this object in the callback, if you want. For example you can kill it in
response to an error or some text on the standard output:

``` r
cb <- function(line, proc) {
  cat("Got:", line, "\n")
  if (line == "done") proc$kill()
}
result <- run(px,
  c("outln", "this", "outln", "that", "outln", "done",
    "outln", "still here", "sleep", "10", "outln", "dead by now"),
  stdout_line_callback = cb,
  error_on_status = FALSE
)
```

    #> Got: this
    #> Got: that
    #> Got: done
    #> Got: still here

``` r
result
```

    #> $status
    #> [1] -9
    #>
    #> $stdout
    #> [1] "this\nthat\ndone\nstill here\n"
    #>
    #> $stderr
    #> [1] ""
    #>
    #> $timeout
    #> [1] FALSE

Keep in mind that the background process is not stopped while the R
callback is running; the two run concurrently. In the previous example,
whether `still here` is printed or not depends on the scheduling of the
R process and the background process by the OS. Typically, it is
printed, because the R callback takes a while to run.

In addition to the line-oriented callbacks, the `stdout_callback` and
`stderr_callback` arguments can specify callback functions that are
called with output chunks instead of single lines. A chunk may contain
multiple lines (separated by `\n` or `\r\n`), or even incomplete lines.

### Managing external processes

If you need better control over possibly multiple background processes,
then you can use the R6 `process` class directly.

#### Starting processes

To start a new background process, create a new instance of the
`process` class.

``` r
p <- process$new("sleep", "20")
```

#### Killing a process

A process can be killed via the `kill()` method.

``` r
p$is_alive()
```

    #> [1] TRUE

``` r
p$kill()
```

    #> [1] TRUE

``` r
p$is_alive()
```

    #> [1] FALSE

Note that processes are finalized (and killed) automatically if the
corresponding `process` object goes out of scope, as soon as the object
is garbage collected by R:

``` r
p <- process$new("sleep", "20")
rm(p)
gc()
```

    #>          used (Mb) gc trigger (Mb) limit (Mb) max used (Mb)
    #> Ncells 493821 26.4    1069461 57.2         NA   682911 36.5
    #> Vcells 928674  7.1    8388608 64.0      16384  1883216 14.4

Here, the direct call to the garbage collector kills the `sleep` process
as well. See the `cleanup` option if you want to avoid this behavior.

#### Standard output and error

By default the standard output and error of the processes are ignored.
You can set the `stdout` and `stderr` constructor arguments to a file
name, and then they are redirected there, or to `"|"`, and then processx
creates connections to them. (Note that starting from processx 3.0.0
these connections are not regular R connections, because the public R
connection API was retroactively removed from R.)

The `read_output_lines()` and `read_error_lines()` methods can be used
to read complete lines from the standard output or error connections.
They work similarly to the `readLines()` base R function.

Note that the connections have a buffer, which can fill up if R does
not read the output. The process then stops until R reads from the
connection and the buffer is freed.
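
One way to keep the buffer from filling up is to read the output in a loop until the process has exited and its pipe is exhausted. The following is only a minimal sketch, assuming the process was started with `stdout = "|"`. The helper name `drain_stdout` and the 200 ms polling interval are illustrative choices, not part of processx; the `poll_io()` and `is_incomplete_output()` methods it relies on are described later in this document.

``` r
library(processx)

# Path to the px helper program shipped with processx (same as above)
px <- paste0(
  system.file(package = "processx", "bin", "px"),
  system.file(package = "processx", "bin", .Platform$r_arch, "px.exe")
)

# Hypothetical helper, not part of processx: collect every stdout line
# of `proc`, reading regularly so the pipe buffer cannot fill up.
drain_stdout <- function(proc, poll_ms = 200) {
  lines <- character()
  # Keep going while the process runs or the pipe may still hold data
  while (proc$is_alive() || proc$is_incomplete_output()) {
    proc$poll_io(poll_ms)                        # wait for data or timeout
    lines <- c(lines, proc$read_output_lines())  # non-blocking read
  }
  lines
}

p <- process$new(px, c("outln", "one", "sleep", "1", "outln", "two"),
  stdout = "|")
out <- drain_stdout(p)
out
```

If the standard error is connected as well, it needs the same treatment, for example by also calling `read_error_lines()` inside the loop.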

> **Always make sure that you read out the standard output and/or
> error of the pipes, otherwise the background process will stop
> running!**

If you don’t need the standard output or error any more, you can also
close it, like this:

``` r
close(p$get_output_connection())
close(p$get_error_connection())
```

Note that the connections used for reading the output and error streams
are non-blocking, so the read functions will return immediately, even if
there is no text to read from them. If you want to make sure that there
is data available to read, you need to poll, see below.

``` r
p <- process$new(px,
  c("sleep", "1", "outln", "foo", "errln", "bar", "outln", "foobar"),
  stdout = "|", stderr = "|")
p$read_output_lines()
```

    #> character(0)

``` r
p$read_error_lines()
```

    #> character(0)

#### End of output

The standard R way to query the end of the stream for a non-blocking
connection is to use the `isIncomplete()` function. *After a read
attempt*, this function returns `FALSE` if the connection has surely no
more data. (If the read attempt returns no data, but `isIncomplete()`
returns `TRUE`, then the connection might deliver more data in the
future.)

The `is_incomplete_output()` and `is_incomplete_error()` functions work
similarly for `process` objects.

#### Polling the standard output and error

The `poll_io()` method waits for data on the standard output and/or
error of a process. It will return if any of the following events
happen:

  - Data is available on the standard output of the process (assuming
    there is a connection to the standard output).
  - Data is available on the standard error of the process (assuming
    there is a connection to the standard error).
  - The process has finished and the standard output and/or error
    connections were closed on the other end.
  - The specified timeout period expired.

For example the following code waits about a second for output.

``` r
p <- process$new(px, c("sleep", "1", "outln", "kuku"), stdout = "|")

## No output yet
p$read_output_lines()
```

    #> character(0)

``` r
## Wait at most 5 sec
p$poll_io(5000)
```

    #>   output    error  process
    #>  "ready" "nopipe" "nopipe"

``` r
## There is output now
p$read_output_lines()
```

    #> [1] "kuku"

#### Polling multiple processes

If you need to manage multiple background processes, and need to wait
for output from all of them, processx defines a `poll()` function that
does just that. It is similar to the `poll_io()` method, but it takes
multiple process objects, and returns as soon as one of them has data
on standard output or error, or a timeout expires. Here is an example:

``` r
p1 <- process$new(px, c("sleep", "1", "outln", "output"), stdout = "|")
p2 <- process$new(px, c("sleep", "2", "errln", "error"), stderr = "|")

## After 100ms no output yet
poll(list(p1 = p1, p2 = p2), 100)
```

    #> $p1
    #>    output     error   process
    #> "timeout"  "nopipe"  "nopipe"
    #>
    #> $p2
    #>    output     error   process
    #>  "nopipe" "timeout"  "nopipe"

``` r
## But now we surely have something
poll(list(p1 = p1, p2 = p2), 1000)
```

    #> $p1
    #>   output    error  process
    #>  "ready" "nopipe" "nopipe"
    #>
    #> $p2
    #>   output    error  process
    #> "nopipe" "silent" "nopipe"

``` r
p1$read_output_lines()
```

    #> [1] "output"

``` r
## Done with p1
close(p1$get_output_connection())
```

    #> NULL

``` r
## The second process should have data on stderr soonish
poll(list(p1 = p1, p2 = p2), 5000)
```

    #> $p1
    #>   output    error  process
    #> "closed" "nopipe" "nopipe"
    #>
    #> $p2
    #>   output    error  process
    #> "nopipe"  "ready" "nopipe"

``` r
p2$read_error_lines()
```

    #> [1] "error"

#### Waiting on a process

As seen before, `is_alive()` checks if a process is running. The
`wait()` method can be used to wait until it has finished (or a
specified timeout expires). E.g. in the following code `wait()` needs
to wait about 2 seconds for the `px` `sleep` command to finish.

``` r
p <- process$new(px, c("sleep", "2"))
p$is_alive()
```

    #> [1] TRUE

``` r
Sys.time()
```

    #> [1] "2021-03-23 15:08:37 CET"

``` r
p$wait()
Sys.time()
```

    #> [1] "2021-03-23 15:08:39 CET"

It is safe to call `wait()` multiple times:

``` r
p$wait() # already finished!
```

#### Exit statuses

After a process has finished, its exit status can be queried via the
`get_exit_status()` method. If the process is still running, then this
method returns `NULL`.

``` r
p <- process$new(px, c("sleep", "2"))
p$get_exit_status()
```

    #> NULL

``` r
p$wait()
p$get_exit_status()
```

    #> [1] 0

#### Mixing processx and the parallel base R package

In general, mixing processx (via callr or not) and parallel works fine.
If you use parallel’s ‘fork’ clusters, e.g. via
`parallel::mcparallel()`, then you might see two issues. One is that
processx will not be able to determine the exit status of some processx
processes. This is because the status is read out by parallel, and
processx will set it to `NA`. The other one is that parallel might
complain that it could not clean up some subprocesses. This is not an
error, and it is harmless, but it does hold up R for about 10 seconds,
before parallel gives up. To work around this, you can set the
`PROCESSX_NOTIFY_OLD_SIGCHLD` environment variable to a non-empty value,
before you load processx. This behavior might be the default in the
future.
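
As a rough illustration, the workaround could look like this at the top of a script or in a fresh R session. The value `"true"` is an arbitrary choice made for this sketch, since the text above only asks for a non-empty value.

``` r
# Set the variable first: per the note above it has to be in place
# before processx is loaded. "true" is an arbitrary non-empty value.
Sys.setenv(PROCESSX_NOTIFY_OLD_SIGCHLD = "true")

library(processx)
```

Setting the variable after processx has already been loaded in the session may have no effect, so a restart of R is the safer route in that case.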

#### Errors

Errors are typically signalled via non-zero exit statuses. The processx
constructor fails if the external program cannot be started, but it does
not deal with errors that happen after the program has successfully
started running.

``` r
p <- process$new("nonexistant-command-for-sure")
```

    #> Error in rethrow_call(c_processx_exec, command, c(command, args), pty, : cannot start processx process 'nonexistant-command-for-sure' (system error 2, No such file or directory) @unix/processx.c:610 (processx_exec)

``` r
p2 <- process$new(px, c("sleep", "1", "command-does-not-exist"))
p2$wait()
p2$get_exit_status()
```

    #> [1] 5

## Related tools

  - The [`ps` package](https://ps.r-lib.org/) can query, list, and
    manipulate all system processes (not just subprocesses), and
    processx uses it internally for some of its functionality. You can
    also convert a `processx::process` object to a `ps::ps_handle` with
    the `as_ps_handle()` method.

  - The [`callr` package](https://callr.r-lib.org/) uses processx to
    start another R process, and run R code in it, in the foreground or
    background.

## Code of Conduct

Please note that this project is released with a [Contributor Code of
Conduct](https://processx.r-lib.org/CODE_OF_CONDUCT.html). By
participating in this project you agree to abide by its terms.

## License

MIT © Mango Solutions, RStudio, Gábor Csárdi