1
2*********************
3GNU Parallel Tutorial
4*********************
5
6
This tutorial shows off much of GNU \ **parallel**\ 's functionality. The
tutorial is meant to teach the options and syntax of GNU
\ **parallel**\ . It is \ **not**\  meant to show realistic examples from the
real world.
11
12Reader's guide
13==============
14
15
If you prefer reading a book, buy \ **GNU Parallel 2018**\  at
17https://www.lulu.com/shop/ole-tange/gnu-parallel-2018/paperback/product-23558902.html
18or download it at: https://doi.org/10.5281/zenodo.1146014
19
20Otherwise start by watching the intro videos for a quick introduction:
21https://www.youtube.com/playlist?list=PL284C9FF2488BC6D1
22
23Then browse through the \ **EXAMPLE**\ s after the list of \ **OPTIONS**\  in
24\ **man parallel**\  (Use \ **LESS=+/EXAMPLE: man parallel**\ ). That will give
25you an idea of what GNU \ **parallel**\  is capable of.
26
27If you want to dive even deeper: spend a couple of hours walking
28through the tutorial (\ **man parallel_tutorial**\ ). Your command line
29will love you for it.
30
31Finally you may want to look at the rest of the manual (\ **man
32parallel**\ ) if you have special needs not already covered.
33
34If you want to know the design decisions behind GNU \ **parallel**\ , try:
35\ **man parallel_design**\ . This is also a good intro if you intend to
36change GNU \ **parallel**\ .
37
38
39
40*************
41Prerequisites
42*************
43
44
45To run this tutorial you must have the following:
46
47
48- parallel >= version 20160822
49
50 Install the newest version using your package manager (recommended for
51 security reasons), the way described in README, or with this command:
52
53
54 .. code-block:: perl
55
56    $ (wget -O - pi.dk/3 || lynx -source pi.dk/3 || curl pi.dk/3/ || \
57       fetch -o - http://pi.dk/3 ) > install.sh
58    $ sha1sum install.sh
59    12345678 3374ec53 bacb199b 245af2dd a86df6c9
60    $ md5sum install.sh
61    029a9ac0 6e8b5bc6 052eac57 b2c3c9ca
62    $ sha512sum install.sh
63    40f53af6 9e20dae5 713ba06c f517006d 9897747b ed8a4694 b1acba1b 1464beb4
64    60055629 3f2356f3 3e9c4e3c 76e3f3af a9db4b32 bd33322b 975696fc e6b23cfb
65    $ bash install.sh
66
67
68 This will also install the newest version of the tutorial which you
69 can see by running this:
70
71
72 .. code-block:: perl
73
74    man parallel_tutorial
75
76
77 Most of the tutorial will work on older versions, too.
78
79
80
81- abc-file:
82
83 The file can be generated by this command:
84
85
86 .. code-block:: perl
87
88    parallel -k echo ::: A B C > abc-file
89
90
91
92
93- def-file:
94
95 The file can be generated by this command:
96
97
98 .. code-block:: perl
99
100    parallel -k echo ::: D E F > def-file
101
102
103
104
105- abc0-file:
106
107 The file can be generated by this command:
108
109
110 .. code-block:: perl
111
112    perl -e 'printf "A\0B\0C\0"' > abc0-file
113
114
115
116
117- abc_-file:
118
119 The file can be generated by this command:
120
121
122 .. code-block:: perl
123
124    perl -e 'printf "A_B_C_"' > abc_-file
125
126
127
128
129- tsv-file.tsv
130
131 The file can be generated by this command:
132
133
134 .. code-block:: perl
135
136    perl -e 'printf "f1\tf2\nA\tB\nC\tD\n"' > tsv-file.tsv
137
138
139
140
141- num8
142
143 The file can be generated by this command:
144
145
146 .. code-block:: perl
147
148    perl -e 'for(1..8){print "$_\n"}' > num8
149
150
151
152
153- num128
154
155 The file can be generated by this command:
156
157
158 .. code-block:: perl
159
160    perl -e 'for(1..128){print "$_\n"}' > num128
161
162
163
164
165- num30000
166
167 The file can be generated by this command:
168
169
170 .. code-block:: perl
171
172    perl -e 'for(1..30000){print "$_\n"}' > num30000
173
174
175
176
177- num1000000
178
179 The file can be generated by this command:
180
181
182 .. code-block:: perl
183
184    perl -e 'for(1..1000000){print "$_\n"}' > num1000000
185
186
187
188
189- num_%header
190
191 The file can be generated by this command:
192
193
194 .. code-block:: perl
195
196    (echo %head1; echo %head2; \
197     perl -e 'for(1..10){print "$_\n"}') > num_%header
198
199
200
201
202- fixedlen
203
204 The file can be generated by this command:
205
206
207 .. code-block:: perl
208
209    perl -e 'print "HHHHAAABBBCCC"' > fixedlen
210
211
212
213
- For remote running: ssh login without a password must work on the 2 servers given in $SERVER1 and $SERVER2.
215
216
217 .. code-block:: perl
218
219    SERVER1=server.example.com
220    SERVER2=server2.example.net
221
222
223 So you must be able to do this without entering a password:
224
225
226 .. code-block:: perl
227
228    ssh $SERVER1 echo works
229    ssh $SERVER2 echo works
230
231
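 It can be set up by running 'ssh-keygen -t dsa; ssh-copy-id $SERVER1'
 and using an empty passphrase, or you can use \ **ssh-agent**\ . Recent
 OpenSSH versions no longer accept DSA keys by default, so another key
 type (for example \ **-t ed25519**\ ) may be needed.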
234
235
236
237
238*************
239Input sources
240*************
241
242
243GNU \ **parallel**\  reads input from input sources. These can be files, the
244command line, and stdin (standard input or a pipe).
245
246A single input source
247=====================
248
249
250Input can be read from the command line:
251
252
253.. code-block:: perl
254
255   parallel echo ::: A B C
256
257
258Output (the order may be different because the jobs are run in
259parallel):
260
261
262.. code-block:: perl
263
264   A
265   B
266   C
267
268
269The input source can be a file:
270
271
272.. code-block:: perl
273
274   parallel -a abc-file echo
275
276
277Output: Same as above.
278
279STDIN (standard input) can be the input source:
280
281
282.. code-block:: perl
283
284   cat abc-file | parallel echo
285
286
287Output: Same as above.
288
289
290Multiple input sources
291======================
292
293
294GNU \ **parallel**\  can take multiple input sources given on the command
295line. GNU \ **parallel**\  then generates all combinations of the input
296sources:
297
298
299.. code-block:: perl
300
301   parallel echo ::: A B C ::: D E F
302
303
304Output (the order may be different):
305
306
307.. code-block:: perl
308
309   A D
310   A E
311   A F
312   B D
313   B E
314   B F
315   C D
316   C E
317   C F
318
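
A rough way to picture "all combinations" is a nested loop in plain
shell, with one loop per input source. The sketch below only shows which
argument pairs are generated; it runs sequentially and is not how GNU
\ **parallel**\  is implemented. The function name \ **combos**\  is made up
for this illustration:


.. code-block:: bash

   # Sequential sketch: one loop per input source.
   combos() {
     for first in A B C; do        # first input source
       for second in D E F; do     # second input source
         echo "$first $second"
       done
     done
   }
   combos


The loop always prints the pairs in the order listed above, whereas GNU
\ **parallel**\  may print them in a different order unless \ **-k**\  is
given.
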
319
320The input sources can be files:
321
322
323.. code-block:: perl
324
325   parallel -a abc-file -a def-file echo
326
327
328Output: Same as above.
329
330STDIN (standard input) can be one of the input sources using \ **-**\ :
331
332
333.. code-block:: perl
334
335   cat abc-file | parallel -a - -a def-file echo
336
337
338Output: Same as above.
339
340Instead of \ **-a**\  files can be given after \ **::::**\ :
341
342
343.. code-block:: perl
344
345   cat abc-file | parallel echo :::: - def-file
346
347
348Output: Same as above.
349
350::: and :::: can be mixed:
351
352
353.. code-block:: perl
354
355   parallel echo ::: A B C :::: def-file
356
357
358Output: Same as above.
359
360Linking arguments from input sources
361------------------------------------
362
363
364With \ **--link**\  you can link the input sources and get one argument
365from each input source:
366
367
368.. code-block:: perl
369
370   parallel --link echo ::: A B C ::: D E F
371
372
373Output (the order may be different):
374
375
376.. code-block:: perl
377
378   A D
379   B E
380   C F
381
382
383If one of the input sources is too short, its values will wrap:
384
385
386.. code-block:: perl
387
388   parallel --link echo ::: A B C D E ::: F G
389
390
391Output (the order may be different):
392
393
394.. code-block:: perl
395
396   A F
397   B G
398   C F
399   D G
400   E F
401
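
The wrapping can be pictured as indexing the shorter input source
modulo its length. This is a sequential sketch in plain shell under that
assumption, not GNU \ **parallel**\ 's own code; \ **link_wrap**\  and its
variables are hypothetical names:


.. code-block:: bash

   # Sketch of --link with wrapping: the shorter list is reused cyclically.
   link_wrap() {
     short_count=2                   # number of values in the shorter source
     i=0
     for long in A B C D E; do       # the longer input source
       set -- F G                    # the shorter input source
       shift $(( i % short_count ))  # skip to value number (i mod 2)
       echo "$long $1"
       i=$(( i + 1 ))
     done
   }
   link_wrap
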
402
403For more flexible linking you can use \ **:::+**\  and \ **::::+**\ . They work
404like \ **:::**\  and \ **::::**\  except they link the previous input source to
405this input source.
406
407This will link ABC to GHI:
408
409
410.. code-block:: perl
411
412   parallel echo :::: abc-file :::+ G H I :::: def-file
413
414
415Output (the order may be different):
416
417
418.. code-block:: perl
419
420   A G D
421   A G E
422   A G F
423   B H D
424   B H E
425   B H F
426   C I D
427   C I E
428   C I F
429
430
431This will link GHI to DEF:
432
433
434.. code-block:: perl
435
436   parallel echo :::: abc-file ::: G H I ::::+ def-file
437
438
439Output (the order may be different):
440
441
442.. code-block:: perl
443
444   A G D
445   A H E
446   A I F
447   B G D
448   B H E
449   B I F
450   C G D
451   C H E
452   C I F
453
454
455If one of the input sources is too short when using \ **:::+**\  or
456\ **::::+**\ , the rest will be ignored:
457
458
459.. code-block:: perl
460
461   parallel echo ::: A B C D E :::+ F G
462
463
464Output (the order may be different):
465
466
467.. code-block:: perl
468
469   A F
470   B G
471
472
473
474
Changing the argument separator
===============================
477
478
479GNU \ **parallel**\  can use other separators than \ **:::**\  or \ **::::**\ . This is
480typically useful if \ **:::**\  or \ **::::**\  is used in the command to run:
481
482
483.. code-block:: perl
484
485   parallel --arg-sep ,, echo ,, A B C :::: def-file
486
487
488Output (the order may be different):
489
490
491.. code-block:: perl
492
493   A D
494   A E
495   A F
496   B D
497   B E
498   B F
499   C D
500   C E
501   C F
502
503
504Changing the argument file separator:
505
506
507.. code-block:: perl
508
509   parallel --arg-file-sep // echo ::: A B C // def-file
510
511
512Output: Same as above.
513
514
515Changing the argument delimiter
516===============================
517
518
519GNU \ **parallel**\  will normally treat a full line as a single argument: It
520uses \ **\n**\  as argument delimiter. This can be changed with \ **-d**\ :
521
522
523.. code-block:: perl
524
525   parallel -d _ echo :::: abc_-file
526
527
528Output (the order may be different):
529
530
531.. code-block:: perl
532
533   A
534   B
535   C
536
537
538NUL can be given as \ **\0**\ :
539
540
541.. code-block:: perl
542
543   parallel -d '\0' echo :::: abc0-file
544
545
546Output: Same as above.
547
548A shorthand for \ **-d '\0'**\  is \ **-0**\  (this will often be used to read files
549from \ **find ... -print0**\ ):
550
551
552.. code-block:: perl
553
554   parallel -0 echo :::: abc0-file
555
556
557Output: Same as above.
558
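
The reason NUL is a safer delimiter for file names is that a file name
may itself contain a newline, but never a NUL. The sketch below counts
the items in a small, made-up NUL-terminated list both ways, using only
standard tools; the helper name \ **names**\  and its content are
assumptions for the illustration:


.. code-block:: bash

   # Hypothetical input: two NUL-terminated names, the first of which
   # contains a newline.
   names() { printf 'one\ntwo\0three\0'; }

   # Treating both newline and NUL as delimiters gives 3 items ...
   by_newline=$(names | tr '\0' '\n' | wc -l | tr -d ' ')
   # ... while counting only the NUL terminators gives the intended 2.
   by_nul=$(names | tr -cd '\0' | wc -c | tr -d ' ')
   echo "newline-delimited: $by_newline, NUL-delimited: $by_nul"
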
559
560End-of-file value for input source
561==================================
562
563
564GNU \ **parallel**\  can stop reading when it encounters a certain value:
565
566
567.. code-block:: perl
568
569   parallel -E stop echo ::: A B stop C D
570
571
572Output:
573
574
575.. code-block:: perl
576
577   A
578   B
579
580
581
582Skipping empty lines
583====================
584
585
586Using \ **--no-run-if-empty**\  GNU \ **parallel**\  will skip empty lines.
587
588
589.. code-block:: perl
590
591   (echo 1; echo; echo 2) | parallel --no-run-if-empty echo
592
593
594Output:
595
596
597.. code-block:: perl
598
599   1
600   2
601
602
603
604
605*************************
606Building the command line
607*************************
608
609
610No command means arguments are commands
611=======================================
612
613
614If no command is given after parallel the arguments themselves are
615treated as commands:
616
617
618.. code-block:: perl
619
620   parallel ::: ls 'echo foo' pwd
621
622
623Output (the order may be different):
624
625
626.. code-block:: perl
627
628   [list of files in current dir]
629   foo
630   [/path/to/current/working/dir]
631
632
633The command can be a script, a binary or a Bash function if the function is
634exported using \ **export -f**\ :
635
636
637.. code-block:: perl
638
639   # Only works in Bash
640   my_func() {
641     echo in my_func $1
642   }
643   export -f my_func
644   parallel my_func ::: 1 2 3
645
646
647Output (the order may be different):
648
649
650.. code-block:: perl
651
652   in my_func 1
653   in my_func 2
654   in my_func 3
655
656
657
658Replacement strings
659===================
660
661
662The 7 predefined replacement strings
663------------------------------------
664
665
666GNU \ **parallel**\  has several replacement strings. If no replacement
667strings are used the default is to append \ **{}**\ :
668
669
670.. code-block:: perl
671
672   parallel echo ::: A/B.C
673
674
675Output:
676
677
678.. code-block:: perl
679
680   A/B.C
681
682
683The default replacement string is \ **{}**\ :
684
685
686.. code-block:: perl
687
688   parallel echo {} ::: A/B.C
689
690
691Output:
692
693
694.. code-block:: perl
695
696   A/B.C
697
698
699The replacement string \ **{.}**\  removes the extension:
700
701
702.. code-block:: perl
703
704   parallel echo {.} ::: A/B.C
705
706
707Output:
708
709
710.. code-block:: perl
711
712   A/B
713
714
715The replacement string \ **{/}**\  removes the path:
716
717
718.. code-block:: perl
719
720   parallel echo {/} ::: A/B.C
721
722
723Output:
724
725
726.. code-block:: perl
727
728   B.C
729
730
731The replacement string \ **{//}**\  keeps only the path:
732
733
734.. code-block:: perl
735
736   parallel echo {//} ::: A/B.C
737
738
739Output:
740
741
742.. code-block:: perl
743
744   A
745
746
747The replacement string \ **{/.}**\  removes the path and the extension:
748
749
750.. code-block:: perl
751
752   parallel echo {/.} ::: A/B.C
753
754
755Output:
756
757
758.. code-block:: perl
759
760   B
761
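
For simple paths like the one above, the four path-related replacement
strings behave much like ordinary shell parameter expansion. The
following sketch shows rough equivalents; it is only an analogy, and
corner cases (for example a dot in a directory name) may be handled
differently by GNU \ **parallel**\ . The variable names are made up:


.. code-block:: bash

   arg=A/B.C
   no_ext=${arg%.*}          # like {.}   -> A/B
   base=${arg##*/}           # like {/}   -> B.C
   dir=$(dirname "$arg")     # like {//}  -> A
   base_no_ext=${base%.*}    # like {/.}  -> B
   echo "$no_ext $base $dir $base_no_ext"
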
762
763The replacement string \ **{#}**\  gives the job number:
764
765
766.. code-block:: perl
767
768   parallel echo {#} ::: A B C
769
770
771Output (the order may be different):
772
773
774.. code-block:: perl
775
776   1
777   2
778   3
779
780
781The replacement string \ **{%}**\  gives the job slot number (between 1 and
782number of jobs to run in parallel):
783
784
785.. code-block:: perl
786
787   parallel -j 2 echo {%} ::: A B C
788
789
790Output (the order may be different and 1 and 2 may be swapped):
791
792
793.. code-block:: perl
794
795   1
796   2
797   1
798
799
800
801Changing the replacement strings
802--------------------------------
803
804
805The replacement string \ **{}**\  can be changed with \ **-I**\ :
806
807
808.. code-block:: perl
809
810   parallel -I ,, echo ,, ::: A/B.C
811
812
813Output:
814
815
816.. code-block:: perl
817
818   A/B.C
819
820
821The replacement string \ **{.}**\  can be changed with \ **--extensionreplace**\ :
822
823
824.. code-block:: perl
825
826   parallel --extensionreplace ,, echo ,, ::: A/B.C
827
828
829Output:
830
831
832.. code-block:: perl
833
834   A/B
835
836
The replacement string \ **{/}**\  can be changed with \ **--basenamereplace**\ :
838
839
840.. code-block:: perl
841
842   parallel --basenamereplace ,, echo ,, ::: A/B.C
843
844
845Output:
846
847
848.. code-block:: perl
849
850   B.C
851
852
853The replacement string \ **{//}**\  can be changed with \ **--dirnamereplace**\ :
854
855
856.. code-block:: perl
857
858   parallel --dirnamereplace ,, echo ,, ::: A/B.C
859
860
861Output:
862
863
864.. code-block:: perl
865
866   A
867
868
869The replacement string \ **{/.}**\  can be changed with \ **--basenameextensionreplace**\ :
870
871
872.. code-block:: perl
873
874   parallel --basenameextensionreplace ,, echo ,, ::: A/B.C
875
876
877Output:
878
879
880.. code-block:: perl
881
882   B
883
884
885The replacement string \ **{#}**\  can be changed with \ **--seqreplace**\ :
886
887
888.. code-block:: perl
889
890   parallel --seqreplace ,, echo ,, ::: A B C
891
892
893Output (the order may be different):
894
895
896.. code-block:: perl
897
898   1
899   2
900   3
901
902
903The replacement string \ **{%}**\  can be changed with \ **--slotreplace**\ :
904
905
906.. code-block:: perl
907
908   parallel -j2 --slotreplace ,, echo ,, ::: A B C
909
910
911Output (the order may be different and 1 and 2 may be swapped):
912
913
914.. code-block:: perl
915
916   1
917   2
918   1
919
920
921
922Perl expression replacement string
923----------------------------------
924
925
When predefined replacement strings are not flexible enough a perl
expression can be used instead. One example is to remove two
extensions, so that foo.tar.gz becomes foo:
929
930
931.. code-block:: perl
932
933   parallel echo '{= s:\.[^.]+$::;s:\.[^.]+$::; =}' ::: foo.tar.gz
934
935
936Output:
937
938
939.. code-block:: perl
940
941   foo
942
943
944In \ **{= =}**\  you can access all of GNU \ **parallel**\ 's internal functions
945and variables. A few are worth mentioning.
946
947\ **total_jobs()**\  returns the total number of jobs:
948
949
950.. code-block:: perl
951
952   parallel echo Job {#} of {= '$_=total_jobs()' =} ::: {1..5}
953
954
955Output:
956
957
958.. code-block:: perl
959
960   Job 1 of 5
961   Job 2 of 5
962   Job 3 of 5
963   Job 4 of 5
964   Job 5 of 5
965
966
967\ **Q(...)**\  shell quotes the string:
968
969
970.. code-block:: perl
971
972   parallel echo {} shell quoted is {= '$_=Q($_)' =} ::: '*/!#$'
973
974
975Output:
976
977
978.. code-block:: perl
979
980   */!#$ shell quoted is \*/\!\#\$
981
982
983\ **skip()**\  skips the job:
984
985
986.. code-block:: perl
987
988   parallel echo {= 'if($_==3) { skip() }' =} ::: {1..5}
989
990
991Output:
992
993
994.. code-block:: perl
995
996   1
997   2
998   4
999   5
1000
1001
1002\ **@arg**\  contains the input source variables:
1003
1004
1005.. code-block:: perl
1006
1007   parallel echo {= 'if($arg[1]==$arg[2]) { skip() }' =} \
1008     ::: {1..3} ::: {1..3}
1009
1010
1011Output:
1012
1013
1014.. code-block:: perl
1015
1016   1 2
1017   1 3
1018   2 1
1019   2 3
1020   3 1
1021   3 2
1022
1023
1024If the strings \ **{=**\  and \ **=}**\  cause problems they can be replaced with \ **--parens**\ :
1025
1026
1027.. code-block:: perl
1028
1029   parallel --parens ,,,, echo ',, s:\.[^.]+$::;s:\.[^.]+$::; ,,' \
1030     ::: foo.tar.gz
1031
1032
1033Output:
1034
1035
1036.. code-block:: perl
1037
1038   foo
1039
1040
1041To define a shorthand replacement string use \ **--rpl**\ :
1042
1043
1044.. code-block:: perl
1045
1046   parallel --rpl '.. s:\.[^.]+$::;s:\.[^.]+$::;' echo '..' \
1047     ::: foo.tar.gz
1048
1049
1050Output: Same as above.
1051
1052If the shorthand starts with \ **{**\  it can be used as a positional
1053replacement string, too:
1054
1055
1056.. code-block:: perl
1057
   parallel --rpl '{..} s:\.[^.]+$::;s:\.[^.]+$::;' echo '{..}' \
     ::: foo.tar.gz
1060
1061
1062Output: Same as above.
1063
If the shorthand contains matching parentheses, the replacement string
becomes a dynamic replacement string and the string in the parentheses
can be accessed as $$1. If there are multiple matching parentheses,
the matched strings can be accessed using $$2, $$3 and so on.
1068
1069You can think of this as giving arguments to the replacement
1070string. Here we give the argument \ **.tar.gz**\  to the replacement string
1071\ **{%\ \*string\*\ }**\  which removes \ *string*\ :
1072
1073
1074.. code-block:: perl
1075
1076   parallel --rpl '{%(.+?)} s/$$1$//;' echo {%.tar.gz}.zip ::: foo.tar.gz
1077
1078
1079Output:
1080
1081
1082.. code-block:: perl
1083
1084   foo.zip
1085
1086
1087Here we give the two arguments \ **tar.gz**\  and \ **zip**\  to the replacement
1088string \ **{/\ \*string1\*\ /\ \*string2\*\ }**\  which replaces \ *string1*\  with
1089\ *string2*\ :
1090
1091
1092.. code-block:: perl
1093
1094   parallel --rpl '{/(.+?)/(.*?)} s/$$1/$$2/;' echo {/tar.gz/zip} \
1095     ::: foo.tar.gz
1096
1097
1098Output:
1099
1100
1101.. code-block:: perl
1102
1103   foo.zip
1104
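
One way to read the two examples above is to write out the substitution
with $$1 and $$2 already filled in. The sketch below does that with
\ **sed**\  instead of perl, assuming the captured text is inserted into
the expression verbatim (so a dot still acts as a regular expression
wildcard). It is an illustration only, and the variable names are made
up:


.. code-block:: bash

   # {%.tar.gz}: s/$$1$// with $$1 = .tar.gz
   strip_suffix=$(printf '%s\n' foo.tar.gz | sed 's/.tar.gz$//')
   # {/tar.gz/zip}: s/$$1/$$2/ with $$1 = tar.gz and $$2 = zip
   swap=$(printf '%s\n' foo.tar.gz | sed 's/tar.gz/zip/')
   echo "$strip_suffix.zip"   # corresponds to {%.tar.gz}.zip
   echo "$swap"               # corresponds to {/tar.gz/zip}
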
1105
GNU \ **parallel**\ 's 7 replacement strings are implemented like this:
1107
1108
1109.. code-block:: perl
1110
1111   --rpl '{} '
1112   --rpl '{#} $_=$job->seq()'
1113   --rpl '{%} $_=$job->slot()'
1114   --rpl '{/} s:.*/::'
1115   --rpl '{//} $Global::use{"File::Basename"} ||=
1116            eval "use File::Basename; 1;"; $_ = dirname($_);'
1117   --rpl '{/.} s:.*/::; s:\.[^/.]+$::;'
1118   --rpl '{.} s:\.[^/.]+$::'
1119
1120
1121
1122Positional replacement strings
1123------------------------------
1124
1125
1126With multiple input sources the argument from the individual input
1127sources can be accessed with \ **{**\ number\ **}**\ :
1128
1129
1130.. code-block:: perl
1131
1132   parallel echo {1} and {2} ::: A B ::: C D
1133
1134
1135Output (the order may be different):
1136
1137
1138.. code-block:: perl
1139
1140   A and C
1141   A and D
1142   B and C
1143   B and D
1144
1145
1146The positional replacement strings can also be modified using \ **/**\ , \ **//**\ , \ **/.**\ , and  \ **.**\ :
1147
1148
1149.. code-block:: perl
1150
1151   parallel echo /={1/} //={1//} /.={1/.} .={1.} ::: A/B.C D/E.F
1152
1153
1154Output (the order may be different):
1155
1156
1157.. code-block:: perl
1158
1159   /=B.C //=A /.=B .=A/B
1160   /=E.F //=D /.=E .=D/E
1161
1162
If a position is negative, it will refer to the input source counted
from the end:
1165
1166
1167.. code-block:: perl
1168
1169   parallel echo 1={1} 2={2} 3={3} -1={-1} -2={-2} -3={-3} \
1170     ::: A B ::: C D ::: E F
1171
1172
1173Output (the order may be different):
1174
1175
1176.. code-block:: perl
1177
1178   1=A 2=C 3=E -1=E -2=C -3=A
1179   1=A 2=C 3=F -1=F -2=C -3=A
1180   1=A 2=D 3=E -1=E -2=D -3=A
1181   1=A 2=D 3=F -1=F -2=D -3=A
1182   1=B 2=C 3=E -1=E -2=C -3=B
1183   1=B 2=C 3=F -1=F -2=C -3=B
1184   1=B 2=D 3=E -1=E -2=D -3=B
1185   1=B 2=D 3=F -1=F -2=D -3=B
1186
1187
1188
1189Positional perl expression replacement string
1190---------------------------------------------
1191
1192
To use a perl expression as a positional replacement string simply
prefix the perl expression with the number and a space:
1195
1196
1197.. code-block:: perl
1198
1199   parallel echo '{=2 s:\.[^.]+$::;s:\.[^.]+$::; =} {1}' \
1200     ::: bar ::: foo.tar.gz
1201
1202
1203Output:
1204
1205
1206.. code-block:: perl
1207
1208   foo bar
1209
1210
1211If a shorthand defined using \ **--rpl**\  starts with \ **{**\  it can be used as
1212a positional replacement string, too:
1213
1214
1215.. code-block:: perl
1216
1217   parallel --rpl '{..} s:\.[^.]+$::;s:\.[^.]+$::;' echo '{2..} {1}' \
1218     ::: bar ::: foo.tar.gz
1219
1220
1221Output: Same as above.
1222
1223
1224Input from columns
1225------------------
1226
1227
1228The columns in a file can be bound to positional replacement strings
1229using \ **--colsep**\ . Here the columns are separated by TAB (\t):
1230
1231
1232.. code-block:: perl
1233
1234   parallel --colsep '\t' echo 1={1} 2={2} :::: tsv-file.tsv
1235
1236
1237Output (the order may be different):
1238
1239
1240.. code-block:: perl
1241
1242   1=f1 2=f2
1243   1=A 2=B
1244   1=C 2=D
1245
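
The effect of \ **--colsep**\  on this file is comparable to splitting
each line on TAB in a shell read loop. This is a sequential sketch that
feeds the same content as tsv-file.tsv through a pipe; the function
name \ **split_columns**\  is hypothetical:


.. code-block:: bash

   tab=$(printf '\t')
   split_columns() {
     # Read one line at a time and split it on TAB into two fields.
     while IFS=$tab read -r col1 col2; do
       echo "1=$col1 2=$col2"
     done
   }
   printf 'f1\tf2\nA\tB\nC\tD\n' | split_columns
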
1246
1247
1248Header defined replacement strings
1249----------------------------------
1250
1251
1252With \ **--header**\  GNU \ **parallel**\  will use the first value of the input
1253source as the name of the replacement string. Only the non-modified
1254version \ **{}**\  is supported:
1255
1256
1257.. code-block:: perl
1258
1259   parallel --header : echo f1={f1} f2={f2} ::: f1 A B ::: f2 C D
1260
1261
1262Output (the order may be different):
1263
1264
1265.. code-block:: perl
1266
1267   f1=A f2=C
1268   f1=A f2=D
1269   f1=B f2=C
1270   f1=B f2=D
1271
1272
1273It is useful with \ **--colsep**\  for processing files with TAB separated values:
1274
1275
1276.. code-block:: perl
1277
1278   parallel --header : --colsep '\t' echo f1={f1} f2={f2} \
1279     :::: tsv-file.tsv
1280
1281
1282Output (the order may be different):
1283
1284
1285.. code-block:: perl
1286
1287   f1=A f2=B
1288   f1=C f2=D
1289
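
In plain shell the same result can be sketched by reading the first
line as column names and the remaining lines as values. The sketch is
sequential and only mimics the output of the command above; the
function name \ **with_header**\  is made up:


.. code-block:: bash

   tab=$(printf '\t')
   with_header() {
     IFS=$tab read -r name1 name2          # first line: column names
     while IFS=$tab read -r val1 val2; do  # remaining lines: values
       echo "$name1=$val1 $name2=$val2"
     done
   }
   printf 'f1\tf2\nA\tB\nC\tD\n' | with_header
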
1290
1291
1292More pre-defined replacement strings with --plus
1293------------------------------------------------
1294
1295
1296\ **--plus**\  adds the replacement strings \ **{+/} {+.} {+..} {+...} {..}  {...}
{/..} {/...} {##}**\ . The idea is that \ **{+foo}**\  matches the opposite of \ **{foo}**\ ,
so that \ **{}**\  = \ **{+/}**\ /\ **{/}**\  = \ **{.}**\ .\ **{+.}**\  = \ **{+/}**\ /\ **{/.}**\ .\ **{+.}**\  = \ **{..}**\ .\ **{+..}**\  =
1299\ **{+/}**\ /\ **{/..}**\ .\ **{+..}**\  = \ **{...}**\ .\ **{+...}**\  = \ **{+/}**\ /\ **{/...}**\ .\ **{+...}**\ .
1300
1301
1302.. code-block:: perl
1303
1304   parallel --plus echo {} ::: dir/sub/file.ex1.ex2.ex3
1305   parallel --plus echo {+/}/{/} ::: dir/sub/file.ex1.ex2.ex3
1306   parallel --plus echo {.}.{+.} ::: dir/sub/file.ex1.ex2.ex3
1307   parallel --plus echo {+/}/{/.}.{+.} ::: dir/sub/file.ex1.ex2.ex3
1308   parallel --plus echo {..}.{+..} ::: dir/sub/file.ex1.ex2.ex3
1309   parallel --plus echo {+/}/{/..}.{+..} ::: dir/sub/file.ex1.ex2.ex3
1310   parallel --plus echo {...}.{+...} ::: dir/sub/file.ex1.ex2.ex3
1311   parallel --plus echo {+/}/{/...}.{+...} ::: dir/sub/file.ex1.ex2.ex3
1312
1313
1314Output:
1315
1316
1317.. code-block:: perl
1318
1319   dir/sub/file.ex1.ex2.ex3
1320
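
The way the pieces fit together can be checked with shell parameter
expansion. The sketch below splits the same argument into rough
equivalents of some of the \ **--plus**\  replacement strings and joins
them again; it is an analogy for this particular argument, and the
variable names are made up:


.. code-block:: bash

   arg=dir/sub/file.ex1.ex2.ex3
   dir=${arg%/*}             # like {+/}  -> dir/sub
   base=${arg##*/}           # like {/}   -> file.ex1.ex2.ex3
   no_ext1=${arg%.*}         # like {.}   -> dir/sub/file.ex1.ex2
   ext1=${arg##*.}           # like {+.}  -> ex3
   no_ext2=${no_ext1%.*}     # like {..}  -> dir/sub/file.ex1
   ext2=${arg#"$no_ext2".}   # like {+..} -> ex2.ex3
   # Each line prints the original argument again.
   echo "$dir/$base"
   echo "$no_ext1.$ext1"
   echo "$no_ext2.$ext2"
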
1321
\ **{##}**\  is simply the total number of jobs:
1323
1324
1325.. code-block:: perl
1326
1327   parallel --plus echo Job {#} of {##} ::: {1..5}
1328
1329
1330Output:
1331
1332
1333.. code-block:: perl
1334
1335   Job 1 of 5
1336   Job 2 of 5
1337   Job 3 of 5
1338   Job 4 of 5
1339   Job 5 of 5
1340
1341
1342
1343Dynamic replacement strings with --plus
1344---------------------------------------
1345
1346
1347\ **--plus**\  also defines these dynamic replacement strings:
1348
1349
1350- \ **{:-\ \*string\*\ }**\
1351
1352 Default value is \ *string*\  if the argument is empty.
1353
1354
1355
1356- \ **{:\ \*number\*\ }**\
1357
1358 Substring from \ *number*\  till end of string.
1359
1360
1361
1362- \ **{:\ \*number1\*\ :\ \*number2\*\ }**\
1363
 Substring of length \ *number2*\  starting at offset \ *number1*\  (as in Bash's \ **${var:offset:length}**\ ).
1365
1366
1367
1368- \ **{#\ \*string\*\ }**\
1369
1370 If the argument starts with \ *string*\ , remove it.
1371
1372
1373
1374- \ **{%\ \*string\*\ }**\
1375
1376 If the argument ends with \ *string*\ , remove it.
1377
1378
1379
1380- \ **{/\ \*string1\*\ /\ \*string2\*\ }**\
1381
1382 Replace \ *string1*\  with \ *string2*\ .
1383
1384
1385
1386- \ **{^\ \*string\*\ }**\
1387
1388 If the argument starts with \ *string*\ , upper case it. \ *string*\  must
1389 be a single letter.
1390
1391
1392
1393- \ **{^^\ \*string\*\ }**\
1394
1395 If the argument contains \ *string*\ , upper case it. \ *string*\  must be a
1396 single letter.
1397
1398
1399
1400- \ **{,\ \*string\*\ }**\
1401
1402 If the argument starts with \ *string*\ , lower case it. \ *string*\  must
1403 be a single letter.
1404
1405
1406
1407- \ **{,,\ \*string\*\ }**\
1408
1409 If the argument contains \ *string*\ , lower case it. \ *string*\  must be a
1410 single letter.
1411
1412
1413
They are inspired by \ **Bash**\ :
1415
1416
1417.. code-block:: perl
1418
1419   unset myvar
1420   echo ${myvar:-myval}
1421   parallel --plus echo {:-myval} ::: "$myvar"
1422
1423   myvar=abcAaAdef
1424   echo ${myvar:2}
1425   parallel --plus echo {:2} ::: "$myvar"
1426
1427   echo ${myvar:2:3}
1428   parallel --plus echo {:2:3} ::: "$myvar"
1429
1430   echo ${myvar#bc}
1431   parallel --plus echo {#bc} ::: "$myvar"
1432   echo ${myvar#abc}
1433   parallel --plus echo {#abc} ::: "$myvar"
1434
1435   echo ${myvar%de}
1436   parallel --plus echo {%de} ::: "$myvar"
1437   echo ${myvar%def}
1438   parallel --plus echo {%def} ::: "$myvar"
1439
1440   echo ${myvar/def/ghi}
1441   parallel --plus echo {/def/ghi} ::: "$myvar"
1442
1443   echo ${myvar^a}
1444   parallel --plus echo {^a} ::: "$myvar"
1445   echo ${myvar^^a}
1446   parallel --plus echo {^^a} ::: "$myvar"
1447
1448   myvar=AbcAaAdef
1449   echo ${myvar,A}
1450   parallel --plus echo '{,A}' ::: "$myvar"
1451   echo ${myvar,,A}
1452   parallel --plus echo '{,,A}' ::: "$myvar"
1453
1454
1455Output:
1456
1457
1458.. code-block:: perl
1459
1460   myval
1461   myval
1462   cAaAdef
1463   cAaAdef
1464   cAa
1465   cAa
1466   abcAaAdef
1467   abcAaAdef
1468   AaAdef
1469   AaAdef
1470   abcAaAdef
1471   abcAaAdef
1472   abcAaA
1473   abcAaA
1474   abcAaAghi
1475   abcAaAghi
1476   AbcAaAdef
1477   AbcAaAdef
1478   AbcAAAdef
1479   AbcAAAdef
1480   abcAaAdef
1481   abcAaAdef
1482   abcaaadef
1483   abcaaadef
1484
1485
1486
1487
1488More than one argument
1489======================
1490
1491
1492With \ **--xargs**\  GNU \ **parallel**\  will fit as many arguments as possible on a
1493single line:
1494
1495
1496.. code-block:: perl
1497
1498   cat num30000 | parallel --xargs echo | wc -l
1499
1500
1501Output (if you run this under Bash on GNU/Linux):
1502
1503
1504.. code-block:: perl
1505
1506   2
1507
1508
The 30000 arguments fit on 2 lines.

The maximum length of a single line can be set with \ **-s**\ . With a maximum
line length of 10000 chars, 17 commands will be run:
1513
1514
1515.. code-block:: perl
1516
1517   cat num30000 | parallel --xargs -s 10000 echo | wc -l
1518
1519
1520Output:
1521
1522
1523.. code-block:: perl
1524
1525   17
1526
1527
For better parallelism GNU \ **parallel**\  can distribute the arguments
between all the parallel jobs when the end of the input is reached.
This is done with \ **-m**\ .
1530
1531Below GNU \ **parallel**\  reads the last argument when generating the second
1532job. When GNU \ **parallel**\  reads the last argument, it spreads all the
1533arguments for the second job over 4 jobs instead, as 4 parallel jobs
1534are requested.
1535
1536The first job will be the same as the \ **--xargs**\  example above, but the
1537second job will be split into 4 evenly sized jobs, resulting in a
1538total of 5 jobs:
1539
1540
1541.. code-block:: perl
1542
1543   cat num30000 | parallel --jobs 4 -m echo | wc -l
1544
1545
1546Output (if you run this under Bash on GNU/Linux):
1547
1548
1549.. code-block:: perl
1550
1551   5
1552
1553
1554This is even more visible when running 4 jobs with 10 arguments. The
155510 arguments are being spread over 4 jobs:
1556
1557
1558.. code-block:: perl
1559
1560   parallel --jobs 4 -m echo ::: 1 2 3 4 5 6 7 8 9 10
1561
1562
1563Output:
1564
1565
1566.. code-block:: perl
1567
1568   1 2 3
1569   4 5 6
1570   7 8 9
1571   10
1572
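
The output above is consistent with giving each of the 4 jobs the
number of arguments divided by 4, rounded up. The sketch below
reproduces that grouping in plain shell. It is inferred from the output
shown here and ignores the maximal line length, which GNU
\ **parallel**\  also takes into account; \ **spread**\  is a hypothetical
helper:


.. code-block:: bash

   # Spread the arguments over SLOTS command lines, giving each line
   # ceil(count / SLOTS) arguments.
   spread() {
     slots=$1; shift
     per_line=$(( ($# + slots - 1) / slots ))
     while [ $# -gt 0 ]; do
       line=
       n=0
       while [ $# -gt 0 ] && [ "$n" -lt "$per_line" ]; do
         line="${line:+$line }$1"    # append the next argument
         shift
         n=$(( n + 1 ))
       done
       echo "$line"
     done
   }
   spread 4 1 2 3 4 5 6 7 8 9 10
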
1573
1574A replacement string can be part of a word. \ **-m**\  will not repeat the context:
1575
1576
1577.. code-block:: perl
1578
1579   parallel --jobs 4 -m echo pre-{}-post ::: A B C D E F G
1580
1581
1582Output (the order may be different):
1583
1584
1585.. code-block:: perl
1586
1587   pre-A B-post
1588   pre-C D-post
1589   pre-E F-post
1590   pre-G-post
1591
1592
1593To repeat the context use \ **-X**\  which otherwise works like \ **-m**\ :
1594
1595
1596.. code-block:: perl
1597
1598   parallel --jobs 4 -X echo pre-{}-post ::: A B C D E F G
1599
1600
1601Output (the order may be different):
1602
1603
1604.. code-block:: perl
1605
1606   pre-A-post pre-B-post
1607   pre-C-post pre-D-post
1608   pre-E-post pre-F-post
1609   pre-G-post
1610
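
The difference between the two can be sketched in plain shell: in the
first function the context surrounds the whole group of arguments, in
the second it is repeated around every argument. The function names
\ **m_style**\  and \ **x_style**\  are made up, and the sketch only mimics
how one command line is built:


.. code-block:: bash

   # -m style: the context surrounds the whole group of arguments.
   m_style() { echo "pre-$*-post"; }

   # -X style: the context is repeated around every argument.
   x_style() {
     line=
     for arg in "$@"; do
       line="${line:+$line }pre-$arg-post"
     done
     echo "$line"
   }
   m_style A B
   x_style A B
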
1611
1612To limit the number of arguments use \ **-N**\ :
1613
1614
1615.. code-block:: perl
1616
1617   parallel -N3 echo ::: A B C D E F G H
1618
1619
1620Output (the order may be different):
1621
1622
1623.. code-block:: perl
1624
1625   A B C
1626   D E F
1627   G H
1628
1629
1630\ **-N**\  also sets the positional replacement strings:
1631
1632
1633.. code-block:: perl
1634
1635   parallel -N3 echo 1={1} 2={2} 3={3} ::: A B C D E F G H
1636
1637
1638Output (the order may be different):
1639
1640
1641.. code-block:: perl
1642
1643   1=A 2=B 3=C
1644   1=D 2=E 3=F
1645   1=G 2=H 3=
1646
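
The grouping resembles consuming the positional parameters three at a
time in a shell loop, where a missing parameter expands to nothing.
This is a sequential sketch of that idea, not GNU \ **parallel**\ 's
implementation; \ **group_by_three**\  is a hypothetical name:


.. code-block:: bash

   group_by_three() {
     while [ $# -gt 0 ]; do
       # ${2-} and ${3-} expand to nothing when the argument is missing.
       echo "1=$1 2=${2-} 3=${3-}"
       if [ $# -ge 3 ]; then shift 3; else shift $#; fi
     done
   }
   group_by_three A B C D E F G H
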
1647
1648\ **-N0**\  reads 1 argument but inserts none:
1649
1650
1651.. code-block:: perl
1652
1653   parallel -N0 echo foo ::: 1 2 3
1654
1655
1656Output:
1657
1658
1659.. code-block:: perl
1660
1661   foo
1662   foo
1663   foo
1664
1665
1666
1667Quoting
1668=======
1669
1670
1671Command lines that contain special characters may need to be protected from the shell.
1672
1673The \ **perl**\  program \ **print "@ARGV\n"**\  basically works like \ **echo**\ .
1674
1675
1676.. code-block:: perl
1677
1678   perl -e 'print "@ARGV\n"' A
1679
1680
1681Output:
1682
1683
1684.. code-block:: perl
1685
1686   A
1687
1688
To run that in parallel the command needs to be quoted. Without extra quoting it does not work:
1690
1691
1692.. code-block:: perl
1693
1694   parallel perl -e 'print "@ARGV\n"' ::: This wont work
1695
1696
1697Output:
1698
1699
1700.. code-block:: perl
1701
1702   [Nothing]
1703
1704
1705To quote the command use \ **-q**\ :
1706
1707
1708.. code-block:: perl
1709
1710   parallel -q perl -e 'print "@ARGV\n"' ::: This works
1711
1712
1713Output (the order may be different):
1714
1715
1716.. code-block:: perl
1717
1718   This
1719   works
1720
1721
1722Or you can quote the critical part using \ **\'**\ :
1723
1724
1725.. code-block:: perl
1726
1727   parallel perl -e \''print "@ARGV\n"'\' ::: This works, too
1728
1729
1730Output (the order may be different):
1731
1732
1733.. code-block:: perl
1734
1735   This
1736   works,
1737   too
1738
1739
GNU \ **parallel**\  can also backslash-quote full lines. Simply run this:
1741
1742
1743.. code-block:: perl
1744
1745   parallel --shellquote
1746   Warning: Input is read from the terminal. You either know what you
1747   Warning: are doing (in which case: YOU ARE AWESOME!) or you forgot
1748   Warning: ::: or :::: or to pipe data into parallel. If so
1749   Warning: consider going through the tutorial: man parallel_tutorial
1750   Warning: Press CTRL-D to exit.
1751   perl -e 'print "@ARGV\n"'
1752   [CTRL-D]
1753
1754
1755Output:
1756
1757
1758.. code-block:: perl
1759
1760   perl\ -e\ \'print\ \"@ARGV\\n\"\'
1761
1762
1763This can then be used as the command:
1764
1765
1766.. code-block:: perl
1767
1768   parallel perl\ -e\ \'print\ \"@ARGV\\n\"\' ::: This also works
1769
1770
1771Output (the order may be different):
1772
1773
1774.. code-block:: perl
1775
1776   This
1777   also
1778   works
1779
1780
1781
1782Trimming space
1783==============
1784
1785
Space can be trimmed from the arguments using \ **--trim**\ . To trim on the right side:
1787
1788
1789.. code-block:: perl
1790
1791   parallel --trim r echo pre-{}-post ::: ' A '
1792
1793
1794Output:
1795
1796
1797.. code-block:: perl
1798
1799   pre- A-post
1800
1801
1802To trim on the left side:
1803
1804
1805.. code-block:: perl
1806
1807   parallel --trim l echo pre-{}-post ::: ' A '
1808
1809
1810Output:
1811
1812
1813.. code-block:: perl
1814
1815   pre-A -post
1816
1817
To trim on both sides:
1819
1820
1821.. code-block:: perl
1822
1823   parallel --trim lr echo pre-{}-post ::: ' A '
1824
1825
1826Output:
1827
1828
1829.. code-block:: perl
1830
1831   pre-A-post
1832
1833
1834
1835Respecting the shell
1836====================
1837
1838
1839This tutorial uses Bash as the shell. GNU \ **parallel**\  respects which
1840shell you are using, so in \ **zsh**\  you can do:
1841
1842
1843.. code-block:: perl
1844
1845   parallel echo \={} ::: zsh bash ls
1846
1847
1848Output:
1849
1850
1851.. code-block:: perl
1852
1853   /usr/bin/zsh
1854   /bin/bash
1855   /bin/ls
1856
1857
1858In \ **csh**\  you can do:
1859
1860
1861.. code-block:: perl
1862
1863   parallel 'set a="{}"; if( { test -d "$a" } ) echo "$a is a dir"' ::: *
1864
1865
1866Output:
1867
1868
1869.. code-block:: perl
1870
1871   [somedir] is a dir
1872
1873
1874This also becomes useful if you use GNU \ **parallel**\  in a shell script:
1875GNU \ **parallel**\  will use the same shell as the shell script.
1876
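A minimal sketch of such a script, assuming it is run by Bash;
\ **greet**\  and \ **greet_all**\  are hypothetical names used only for
illustration. Because GNU \ **parallel**\  is started from Bash, the jobs
also run in Bash and can see the exported function:


.. code-block:: bash

   #!/bin/bash

   # Hypothetical helper; exported so that child Bash processes can use it.
   greet() { echo "hello $1"; }
   export -f greet

   # Run greet once per argument; -k keeps the output in argument order.
   greet_all() { parallel -k greet ::: "$@"; }

   # Usage: greet_all A B C


Calling \ **greet_all A B C**\  should print one greeting per argument.
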
1877
1878
1879**********************
1880Controlling the output
1881**********************
1882
1883
The output can be prefixed with the argument using \ **--tag**\ :
1885
1886
1887.. code-block:: perl
1888
1889   parallel --tag echo foo-{} ::: A B C
1890
1891
1892Output (the order may be different):
1893
1894
1895.. code-block:: perl
1896
1897   A       foo-A
1898   B       foo-B
1899   C       foo-C
1900
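The tag and the job's output are separated by a TAB, so tagged lines are
easy to split again. A minimal sketch, assuming that default TAB
separator; \ **lines_for_tag**\  is a hypothetical helper name, not part of
GNU \ **parallel**\ :


.. code-block:: bash

   # lines_for_tag TAG: read --tag output on stdin and print only the
   # output that belongs to TAG, with the tag column removed.
   lines_for_tag() {
     awk -F '\t' -v tag="$1" '$1 == tag { sub(/^[^\t]*\t/, ""); print }'
   }

   # Usage: parallel --tag echo foo-{} ::: A B C | lines_for_tag B
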
1901
1902To prefix it with another string use \ **--tagstring**\ :
1903
1904
1905.. code-block:: perl
1906
1907   parallel --tagstring {}-bar echo foo-{} ::: A B C
1908
1909
1910Output (the order may be different):
1911
1912
1913.. code-block:: perl
1914
1915   A-bar   foo-A
1916   B-bar   foo-B
1917   C-bar   foo-C
1918
1919
1920To see what commands will be run without running them use \ **--dryrun**\ :
1921
1922
1923.. code-block:: perl
1924
1925   parallel --dryrun echo {} ::: A B C
1926
1927
1928Output (the order may be different):
1929
1930
1931.. code-block:: perl
1932
1933   echo A
1934   echo B
1935   echo C
1936
1937
To print the commands before running them use \ **--verbose**\ :
1939
1940
1941.. code-block:: perl
1942
1943   parallel --verbose echo {} ::: A B C
1944
1945
1946Output (the order may be different):
1947
1948
1949.. code-block:: perl
1950
1951   echo A
1952   echo B
1953   A
1954   echo C
1955   B
1956   C
1957
1958
1959GNU \ **parallel**\  will postpone the output until the command completes:
1960
1961
1962.. code-block:: perl
1963
1964   parallel -j2 'printf "%s-start\n%s" {} {};
1965     sleep {};printf "%s\n" -middle;echo {}-end' ::: 4 2 1
1966
1967
1968Output:
1969
1970
1971.. code-block:: perl
1972
1973   2-start
1974   2-middle
1975   2-end
1976   1-start
1977   1-middle
1978   1-end
1979   4-start
1980   4-middle
1981   4-end
1982
1983
1984To get the output immediately use \ **--ungroup**\ :
1985
1986
1987.. code-block:: perl
1988
1989   parallel -j2 --ungroup 'printf "%s-start\n%s" {} {};
1990     sleep {};printf "%s\n" -middle;echo {}-end' ::: 4 2 1
1991
1992
1993Output:
1994
1995
1996.. code-block:: perl
1997
1998   4-start
1999   42-start
2000   2-middle
2001   2-end
2002   1-start
2003   1-middle
2004   1-end
2005   -middle
2006   4-end
2007
2008
2009\ **--ungroup**\  is fast, but can cause half a line from one job to be mixed
2010with half a line of another job. That has happened in the second line,
2011where the line '4-middle' is mixed with '2-start'.
2012
2013To avoid this use \ **--linebuffer**\ :
2014
2015
2016.. code-block:: perl
2017
2018   parallel -j2 --linebuffer 'printf "%s-start\n%s" {} {};
2019     sleep {};printf "%s\n" -middle;echo {}-end' ::: 4 2 1
2020
2021
2022Output:
2023
2024
2025.. code-block:: perl
2026
2027   4-start
2028   2-start
2029   2-middle
2030   2-end
2031   1-start
2032   1-middle
2033   1-end
2034   4-middle
2035   4-end
2036
2037
2038To force the output in the same order as the arguments use \ **--keep-order**\ /\ **-k**\ :
2039
2040
2041.. code-block:: perl
2042
2043   parallel -j2 -k 'printf "%s-start\n%s" {} {};
2044     sleep {};printf "%s\n" -middle;echo {}-end' ::: 4 2 1
2045
2046
2047Output:
2048
2049
2050.. code-block:: perl
2051
2052   4-start
2053   4-middle
2054   4-end
2055   2-start
2056   2-middle
2057   2-end
2058   1-start
2059   1-middle
2060   1-end
2061
2062
2063Saving output into files
2064========================
2065
2066
2067GNU \ **parallel**\  can save the output of each job into files:
2068
2069
2070.. code-block:: perl
2071
2072   parallel --files echo ::: A B C
2073
2074
2075Output will be similar to this:
2076
2077
2078.. code-block:: perl
2079
2080   /tmp/pAh6uWuQCg.par
2081   /tmp/opjhZCzAX4.par
2082   /tmp/W0AT_Rph2o.par
2083
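The printed file names can be passed on to another command. A minimal
sketch, assuming the files are no longer needed once they have been read;
\ **print_and_remove**\  is a hypothetical helper name:


.. code-block:: bash

   # print_and_remove: read file names on stdin, print the content of
   # each file, and delete the file afterwards.
   print_and_remove() {
     while IFS= read -r file; do
       cat "$file" && rm -f "$file"
     done
   }

   # Usage: parallel --files echo ::: A B C | print_and_remove
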
2084
2085By default GNU \ **parallel**\  will cache the output in files in \ **/tmp**\ . This
2086can be changed by setting \ **$TMPDIR**\  or \ **--tmpdir**\ :
2087
2088
2089.. code-block:: perl
2090
2091   parallel --tmpdir /var/tmp --files echo ::: A B C
2092
2093
2094Output will be similar to this:
2095
2096
2097.. code-block:: perl
2098
2099   /var/tmp/N_vk7phQRc.par
2100   /var/tmp/7zA4Ccf3wZ.par
2101   /var/tmp/LIuKgF_2LP.par
2102
2103
2104Or:
2105
2106
2107.. code-block:: perl
2108
2109   TMPDIR=/var/tmp parallel --files echo ::: A B C
2110
2111
2112Output: Same as above.
2113
2114The output files can be saved in a structured way using \ **--results**\ :
2115
2116
2117.. code-block:: perl
2118
2119   parallel --results outdir echo ::: A B C
2120
2121
2122Output:
2123
2124
2125.. code-block:: perl
2126
2127   A
2128   B
2129   C
2130
2131
These files were also generated. They contain the standard output
(stdout), standard error (stderr), and the sequence number (seq) of each job:
2134
2135
2136.. code-block:: perl
2137
2138   outdir/1/A/seq
2139   outdir/1/A/stderr
2140   outdir/1/A/stdout
2141   outdir/1/B/seq
2142   outdir/1/B/stderr
2143   outdir/1/B/stdout
2144   outdir/1/C/seq
2145   outdir/1/C/stderr
2146   outdir/1/C/stdout
2147
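Since the layout is regular, the saved output can be collected with
standard tools. A minimal sketch, assuming the directory layout shown
above; \ **show_stdout**\  is a hypothetical helper name:


.. code-block:: bash

   # show_stdout DIR: print every stdout file below DIR, each one
   # preceded by a line with its path.
   show_stdout() {
     find "$1" -type f -name stdout | sort | while IFS= read -r file; do
       printf '== %s\n' "$file"
       cat "$file"
     done
   }

   # Usage: show_stdout outdir
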
2148
2149\ **--header :**\  will take the first value as name and use that in the
2150directory structure. This is useful if you are using multiple input
2151sources:
2152
2153
2154.. code-block:: perl
2155
2156   parallel --header : --results outdir echo ::: f1 A B ::: f2 C D
2157
2158
2159Generated files:
2160
2161
2162.. code-block:: perl
2163
2164   outdir/f1/A/f2/C/seq
2165   outdir/f1/A/f2/C/stderr
2166   outdir/f1/A/f2/C/stdout
2167   outdir/f1/A/f2/D/seq
2168   outdir/f1/A/f2/D/stderr
2169   outdir/f1/A/f2/D/stdout
2170   outdir/f1/B/f2/C/seq
2171   outdir/f1/B/f2/C/stderr
2172   outdir/f1/B/f2/C/stdout
2173   outdir/f1/B/f2/D/seq
2174   outdir/f1/B/f2/D/stderr
2175   outdir/f1/B/f2/D/stdout
2176
2177
2178The directories are named after the variables and their values.
2179
2180
2181
2182*************************
2183Controlling the execution
2184*************************
2185
2186
2187Number of simultaneous jobs
2188===========================
2189
2190
2191The number of concurrent jobs is given with \ **--jobs**\ /\ **-j**\ :
2192
2193
2194.. code-block:: perl
2195
2196   /usr/bin/time parallel -N0 -j64 sleep 1 :::: num128
2197
2198
2199With 64 jobs in parallel the 128 \ **sleep**\ s will take 2-8 seconds to run -
2200depending on how fast your machine is.
2201
2202By default \ **--jobs**\  is the same as the number of CPU cores. So this:
2203
2204
2205.. code-block:: perl
2206
2207   /usr/bin/time parallel -N0 sleep 1 :::: num128
2208
2209
2210should take twice the time of running 2 jobs per CPU core:
2211
2212
2213.. code-block:: perl
2214
2215   /usr/bin/time parallel -N0 --jobs 200% sleep 1 :::: num128
2216
2217
2218\ **--jobs 0**\  will run as many jobs in parallel as possible:
2219
2220
2221.. code-block:: perl
2222
2223   /usr/bin/time parallel -N0 --jobs 0 sleep 1 :::: num128
2224
2225
2226which should take 1-7 seconds depending on how fast your machine is.
2227
2228\ **--jobs**\  can read from a file which is re-read when a job finishes:
2229
2230
2231.. code-block:: perl
2232
2233   echo 50% > my_jobs
2234   /usr/bin/time parallel -N0 --jobs my_jobs sleep 1 :::: num128 &
2235   sleep 1
2236   echo 0 > my_jobs
2237   wait
2238
2239
During the first second only 50% of the CPU cores will run a job. Then
\ **0**\  is put into \ **my_jobs**\ , and the rest of the jobs are started
in parallel.
2243
2244Instead of basing the percentage on the number of CPU cores
2245GNU \ **parallel**\  can base it on the number of CPUs:
2246
2247
2248.. code-block:: perl
2249
2250   parallel --use-cpus-instead-of-cores -N0 sleep 1 :::: num8
2251
2252
2253
2254Shuffle job order
2255=================
2256
2257
2258If you have many jobs (e.g. by multiple combinations of input
2259sources), it can be handy to shuffle the jobs, so you get different
2260values run. Use \ **--shuf**\  for that:
2261
2262
2263.. code-block:: perl
2264
2265   parallel --shuf echo ::: 1 2 3 ::: a b c ::: A B C
2266
2267
2268Output:
2269
2270
2271.. code-block:: perl
2272
2273   All combinations but different order for each run.
2274
2275
2276
2277Interactivity
2278=============
2279
2280
2281GNU \ **parallel**\  can ask the user if a command should be run using \ **--interactive**\ :
2282
2283
2284.. code-block:: perl
2285
2286   parallel --interactive echo ::: 1 2 3
2287
2288
2289Output:
2290
2291
2292.. code-block:: perl
2293
2294   echo 1 ?...y
2295   echo 2 ?...n
2296   1
2297   echo 3 ?...y
2298   3
2299
2300
2301GNU \ **parallel**\  can be used to put arguments on the command line for an
2302interactive command such as \ **emacs**\  to edit one file at a time:
2303
2304
2305.. code-block:: perl
2306
2307   parallel --tty emacs ::: 1 2 3
2308
2309
Or give multiple arguments in one go to open multiple files:
2311
2312
2313.. code-block:: perl
2314
2315   parallel -X --tty vi ::: 1 2 3
2316
2317
2318
2319A terminal for every job
2320========================
2321
2322
2323Using \ **--tmux**\  GNU \ **parallel**\  can start a terminal for every job run:
2324
2325
2326.. code-block:: perl
2327
2328   seq 10 20 | parallel --tmux 'echo start {}; sleep {}; echo done {}'
2329
2330
2331This will tell you to run something similar to:
2332
2333
2334.. code-block:: perl
2335
2336   tmux -S /tmp/tmsrPrO0 attach
2337
2338
2339Using normal \ **tmux**\  keystrokes (CTRL-b n or CTRL-b p) you can cycle
2340between windows of the running jobs. When a job is finished it will
2341pause for 10 seconds before closing the window.
2342
2343
2344Timing
2345======
2346
2347
2348Some jobs do heavy I/O when they start. To avoid a thundering herd GNU
2349\ **parallel**\  can delay starting new jobs. \ **--delay**\  \ *X*\  will make
2350sure there is at least \ *X*\  seconds between each start:
2351
2352
2353.. code-block:: perl
2354
2355   parallel --delay 2.5 echo Starting {}\;date ::: 1 2 3
2356
2357
2358Output:
2359
2360
2361.. code-block:: perl
2362
2363   Starting 1
2364   Thu Aug 15 16:24:33 CEST 2013
2365   Starting 2
2366   Thu Aug 15 16:24:35 CEST 2013
2367   Starting 3
2368   Thu Aug 15 16:24:38 CEST 2013
2369
2370
2371If jobs taking more than a certain amount of time are known to fail,
2372they can be stopped with \ **--timeout**\ . The accuracy of \ **--timeout**\  is
23732 seconds:
2374
2375
2376.. code-block:: perl
2377
2378   parallel --timeout 4.1 sleep {}\; echo {} ::: 2 4 6 8
2379
2380
2381Output:
2382
2383
2384.. code-block:: perl
2385
2386   2
2387   4
2388
2389
2390GNU \ **parallel**\  can compute the median runtime for jobs and kill those
2391that take more than 200% of the median runtime:
2392
2393
2394.. code-block:: perl
2395
2396   parallel --timeout 200% sleep {}\; echo {} ::: 2.1 2.2 3 7 2.3
2397
2398
2399Output:
2400
2401
2402.. code-block:: perl
2403
2404   2.1
2405   2.2
2406   3
2407   2.3
2408
2409
2410
2411Progress information
2412====================
2413
2414
2415Based on the runtime of completed jobs GNU \ **parallel**\  can estimate the
2416total runtime:
2417
2418
2419.. code-block:: perl
2420
2421   parallel --eta sleep ::: 1 3 2 2 1 3 3 2 1
2422
2423
2424Output:
2425
2426
2427.. code-block:: perl
2428
2429   Computers / CPU cores / Max jobs to run
2430   1:local / 2 / 2
2431
2432   Computer:jobs running/jobs completed/%of started jobs/
2433     Average seconds to complete
2434   ETA: 2s 0left 1.11avg  local:0/9/100%/1.1s
2435
2436
2437GNU \ **parallel**\  can give progress information with \ **--progress**\ :
2438
2439
2440.. code-block:: perl
2441
2442   parallel --progress sleep ::: 1 3 2 2 1 3 3 2 1
2443
2444
2445Output:
2446
2447
2448.. code-block:: perl
2449
2450   Computers / CPU cores / Max jobs to run
2451   1:local / 2 / 2
2452
2453   Computer:jobs running/jobs completed/%of started jobs/
2454     Average seconds to complete
2455   local:0/9/100%/1.1s
2456
2457
2458A progress bar can be shown with \ **--bar**\ :
2459
2460
2461.. code-block:: perl
2462
2463   parallel --bar sleep ::: 1 3 2 2 1 3 3 2 1
2464
2465
2466And a graphic bar can be shown with \ **--bar**\  and \ **zenity**\ :
2467
2468
2469.. code-block:: perl
2470
2471   seq 1000 | parallel -j10 --bar '(echo -n {};sleep 0.1)' \
2472     2> >(perl -pe 'BEGIN{$/="\r";$|=1};s/\r/\n/g' |
2473          zenity --progress --auto-kill --auto-close)
2474
2475
2476A logfile of the jobs completed so far can be generated with \ **--joblog**\ :
2477
2478
2479.. code-block:: perl
2480
2481   parallel --joblog /tmp/log exit  ::: 1 2 3 0
2482   cat /tmp/log
2483
2484
2485Output:
2486
2487
2488.. code-block:: perl
2489
2490   Seq Host Starttime      Runtime Send Receive Exitval Signal Command
2491   1   :    1376577364.974 0.008   0    0       1       0      exit 1
2492   2   :    1376577364.982 0.013   0    0       2       0      exit 2
2493   3   :    1376577364.990 0.013   0    0       3       0      exit 3
2494   4   :    1376577365.003 0.003   0    0       0       0      exit 0
2495
2496
2497The log contains the job sequence, which host the job was run on, the
2498start time and run time, how much data was transferred, the exit
2499value, the signal that killed the job, and finally the command being
2500run.
2501
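Because the joblog is a plain text table, it can be inspected with
standard tools. A minimal sketch, assuming the columns are TAB-separated
and appear in the order of the header shown above;
\ **failed_jobs**\  is a hypothetical helper name:


.. code-block:: bash

   # failed_jobs LOGFILE: print "seq<TAB>command" for every job whose
   # exit value (column 7) or signal (column 8) is not 0.
   failed_jobs() {
     awk -F '\t' 'NR > 1 && ($7 != 0 || $8 != 0) { print $1 "\t" $9 }' "$1"
   }

   # Usage: failed_jobs /tmp/log
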
With a joblog GNU \ **parallel**\  can be stopped and later pick up where it
left off. It is important that the input of the completed jobs is
unchanged.
2505
2506
2507.. code-block:: perl
2508
2509   parallel --joblog /tmp/log exit  ::: 1 2 3 0
2510   cat /tmp/log
2511   parallel --resume --joblog /tmp/log exit  ::: 1 2 3 0 0 0
2512   cat /tmp/log
2513
2514
2515Output:
2516
2517
2518.. code-block:: perl
2519
2520   Seq Host Starttime      Runtime Send Receive Exitval Signal Command
2521   1   :    1376580069.544 0.008   0    0       1       0      exit 1
2522   2   :    1376580069.552 0.009   0    0       2       0      exit 2
2523   3   :    1376580069.560 0.012   0    0       3       0      exit 3
2524   4   :    1376580069.571 0.005   0    0       0       0      exit 0
2525
2526   Seq Host Starttime      Runtime Send Receive Exitval Signal Command
2527   1   :    1376580069.544 0.008   0    0       1       0      exit 1
2528   2   :    1376580069.552 0.009   0    0       2       0      exit 2
2529   3   :    1376580069.560 0.012   0    0       3       0      exit 3
2530   4   :    1376580069.571 0.005   0    0       0       0      exit 0
2531   5   :    1376580070.028 0.009   0    0       0       0      exit 0
2532   6   :    1376580070.038 0.007   0    0       0       0      exit 0
2533
2534
Note how the start times of the last 2 jobs show that they are clearly from the second run.
2536
2537With \ **--resume-failed**\  GNU \ **parallel**\  will re-run the jobs that failed:
2538
2539
2540.. code-block:: perl
2541
2542   parallel --resume-failed --joblog /tmp/log exit  ::: 1 2 3 0 0 0
2543   cat /tmp/log
2544
2545
2546Output:
2547
2548
2549.. code-block:: perl
2550
2551   Seq Host Starttime      Runtime Send Receive Exitval Signal Command
2552   1   :    1376580069.544 0.008   0    0       1       0      exit 1
2553   2   :    1376580069.552 0.009   0    0       2       0      exit 2
2554   3   :    1376580069.560 0.012   0    0       3       0      exit 3
2555   4   :    1376580069.571 0.005   0    0       0       0      exit 0
2556   5   :    1376580070.028 0.009   0    0       0       0      exit 0
2557   6   :    1376580070.038 0.007   0    0       0       0      exit 0
2558   1   :    1376580154.433 0.010   0    0       1       0      exit 1
2559   2   :    1376580154.444 0.022   0    0       2       0      exit 2
2560   3   :    1376580154.466 0.005   0    0       3       0      exit 3
2561
2562
Note how the jobs with seq 1, 2, and 3 have been repeated because their
exit values were different from 0.
2565
2566\ **--retry-failed**\  does almost the same as \ **--resume-failed**\ . Where
2567\ **--resume-failed**\  reads the commands from the command line (and
2568ignores the commands in the joblog), \ **--retry-failed**\  ignores the
2569command line and reruns the commands mentioned in the joblog.
2570
2571
2572.. code-block:: perl
2573
2574   parallel --retry-failed --joblog /tmp/log
2575   cat /tmp/log
2576
2577
2578Output:
2579
2580
2581.. code-block:: perl
2582
2583   Seq Host Starttime      Runtime Send Receive Exitval Signal Command
2584   1   :    1376580069.544 0.008   0    0       1       0      exit 1
2585   2   :    1376580069.552 0.009   0    0       2       0      exit 2
2586   3   :    1376580069.560 0.012   0    0       3       0      exit 3
2587   4   :    1376580069.571 0.005   0    0       0       0      exit 0
2588   5   :    1376580070.028 0.009   0    0       0       0      exit 0
2589   6   :    1376580070.038 0.007   0    0       0       0      exit 0
2590   1   :    1376580154.433 0.010   0    0       1       0      exit 1
2591   2   :    1376580154.444 0.022   0    0       2       0      exit 2
2592   3   :    1376580154.466 0.005   0    0       3       0      exit 3
2593   1   :    1376580164.633 0.010   0    0       1       0      exit 1
2594   2   :    1376580164.644 0.022   0    0       2       0      exit 2
2595   3   :    1376580164.666 0.005   0    0       3       0      exit 3
2596
2597
2598
2599Termination
2600===========
2601
2602
2603Unconditional termination
2604-------------------------
2605
2606
2607By default GNU \ **parallel**\  will wait for all jobs to finish before exiting.
2608
If you send GNU \ **parallel**\  the \ **TERM**\  signal, GNU \ **parallel**\  will
stop spawning new jobs and wait for the remaining jobs to finish. If
you send GNU \ **parallel**\  the \ **TERM**\  signal again, GNU
\ **parallel**\  will kill all running jobs and exit.
2613
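The effect of a single \ **TERM**\  signal can be tried with a sketch like
this one, assuming Bash; \ **term_demo**\  is a hypothetical name, and the
job lengths are chosen only to make the effect visible:


.. code-block:: bash

   # term_demo: start four 3-second jobs, two at a time, and send one
   # TERM signal to GNU parallel after one second.
   term_demo() {
     parallel -j2 'sleep 3; echo {} done' ::: A B C D &
     local pid=$!
     sleep 1
     kill -TERM "$pid"   # first TERM: running jobs finish, no new ones start
     wait "$pid"
   }

   # Usage: term_demo


Here A and B should be reported as done, while C and D are never started.
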
2614
2615Termination dependent on job status
2616-----------------------------------
2617
2618
2619For certain jobs there is no need to continue if one of the jobs fails
2620and has an exit code different from 0. GNU \ **parallel**\  will stop spawning new jobs
2621with \ **--halt soon,fail=1**\ :
2622
2623
2624.. code-block:: perl
2625
2626   parallel -j2 --halt soon,fail=1 echo {}\; exit {} ::: 0 0 1 2 3
2627
2628
2629Output:
2630
2631
2632.. code-block:: perl
2633
2634   0
2635   0
2636   1
2637   parallel: This job failed:
2638   echo 1; exit 1
2639   parallel: Starting no more jobs. Waiting for 1 jobs to finish.
2640   2
2641
2642
2643With \ **--halt now,fail=1**\  the running jobs will be killed immediately:
2644
2645
2646.. code-block:: perl
2647
2648   parallel -j2 --halt now,fail=1 echo {}\; exit {} ::: 0 0 1 2 3
2649
2650
2651Output:
2652
2653
2654.. code-block:: perl
2655
2656   0
2657   0
2658   1
2659   parallel: This job failed:
2660   echo 1; exit 1
2661
2662
2663If \ **--halt**\  is given a percentage this percentage of the jobs must fail
2664before GNU \ **parallel**\  stops spawning more jobs:
2665
2666
2667.. code-block:: perl
2668
2669   parallel -j2 --halt soon,fail=20% echo {}\; exit {} \
2670     ::: 0 1 2 3 4 5 6 7 8 9
2671
2672
2673Output:
2674
2675
2676.. code-block:: perl
2677
2678   0
2679   1
2680   parallel: This job failed:
2681   echo 1; exit 1
2682   2
2683   parallel: This job failed:
2684   echo 2; exit 2
2685   parallel: Starting no more jobs. Waiting for 1 jobs to finish.
2686   3
2687   parallel: This job failed:
2688   echo 3; exit 3
2689
2690
2691If you are looking for success instead of failures, you can use
2692\ **success**\ . This will finish as soon as the first job succeeds:
2693
2694
2695.. code-block:: perl
2696
2697   parallel -j2 --halt now,success=1 echo {}\; exit {} ::: 1 2 3 0 4 5 6
2698
2699
2700Output:
2701
2702
2703.. code-block:: perl
2704
2705   1
2706   2
2707   3
2708   0
2709   parallel: This job succeeded:
2710   echo 0; exit 0
2711
2712
2713GNU \ **parallel**\  can retry the command with \ **--retries**\ . This is useful if a
2714command fails for unknown reasons now and then.
2715
2716
2717.. code-block:: perl
2718
2719   parallel -k --retries 3 \
2720     'echo tried {} >>/tmp/runs; echo completed {}; exit {}' ::: 1 2 0
2721   cat /tmp/runs
2722
2723
2724Output:
2725
2726
2727.. code-block:: perl
2728
2729   completed 1
2730   completed 2
2731   completed 0
2732
2733   tried 1
2734   tried 2
2735   tried 1
2736   tried 2
2737   tried 1
2738   tried 2
2739   tried 0
2740
2741
Note how jobs 1 and 2 were tried 3 times, but 0 was not retried because it had exit code 0.
2743
2744
2745Termination signals (advanced)
2746------------------------------
2747
2748
2749Using \ **--termseq**\  you can control which signals are sent when killing
2750children. Normally children will be killed by sending them \ **SIGTERM**\ ,
2751waiting 200 ms, then another \ **SIGTERM**\ , waiting 100 ms, then another
2752\ **SIGTERM**\ , waiting 50 ms, then a \ **SIGKILL**\ , finally waiting 25 ms
2753before giving up. It looks like this:
2754
2755
2756.. code-block:: perl
2757
2758   show_signals() {
2759     perl -e 'for(keys %SIG) {
2760         $SIG{$_} = eval "sub { print \"Got $_\\n\"; }";
2761       }
2762       while(1){sleep 1}'
2763   }
2764   export -f show_signals
2765   echo | parallel --termseq TERM,200,TERM,100,TERM,50,KILL,25 \
2766     -u --timeout 1 show_signals
2767
2768
2769Output:
2770
2771
2772.. code-block:: perl
2773
2774   Got TERM
2775   Got TERM
2776   Got TERM
2777
2778
2779Or just:
2780
2781
2782.. code-block:: perl
2783
2784   echo | parallel -u --timeout 1 show_signals
2785
2786
2787Output: Same as above.
2788
2789You can change this to \ **SIGINT**\ , \ **SIGTERM**\ , \ **SIGKILL**\ :
2790
2791
2792.. code-block:: perl
2793
2794   echo | parallel --termseq INT,200,TERM,100,KILL,25 \
2795     -u --timeout 1 show_signals
2796
2797
2798Output:
2799
2800
2801.. code-block:: perl
2802
2803   Got INT
2804   Got TERM
2805
2806
2807The \ **SIGKILL**\  does not show because it cannot be caught, and thus the
2808child dies.
2809
2810
2811
2812Limiting the resources
2813======================
2814
2815
2816To avoid overloading systems GNU \ **parallel**\  can look at the system load
2817before starting another job:
2818
2819
2820.. code-block:: perl
2821
2822   parallel --load 100% echo load is less than {} job per cpu ::: 1
2823
2824
2825Output:
2826
2827
2828.. code-block:: perl
2829
   [when the load is less than the number of cpu cores]
2831   load is less than 1 job per cpu
2832
2833
2834GNU \ **parallel**\  can also check if the system is swapping.
2835
2836
2837.. code-block:: perl
2838
2839   parallel --noswap echo the system is not swapping ::: now
2840
2841
2842Output:
2843
2844
2845.. code-block:: perl
2846
   [when the system is not swapping]
2848   the system is not swapping now
2849
2850
2851Some jobs need a lot of memory, and should only be started when there
2852is enough memory free. Using \ **--memfree**\  GNU \ **parallel**\  can check if
there is enough memory free. Additionally, GNU \ **parallel**\  will kill
off the youngest job if the free memory falls below 50% of the given
size. The killed job will be put back on the queue and retried later.
2856
2857
2858.. code-block:: perl
2859
2860   parallel --memfree 1G echo will run if more than 1 GB is ::: free
2861
2862
2863GNU \ **parallel**\  can run the jobs with a nice value. This will work both
2864locally and remotely.
2865
2866
2867.. code-block:: perl
2868
2869   parallel --nice 17 echo this is being run with nice -n ::: 17
2870
2871
2872Output:
2873
2874
2875.. code-block:: perl
2876
2877   this is being run with nice -n 17
2878
2879
2880
2881
2882****************
2883Remote execution
2884****************
2885
2886
2887GNU \ **parallel**\  can run jobs on remote servers. It uses \ **ssh**\  to
2888communicate with the remote machines.
2889
2890Sshlogin
2891========
2892
2893
2894The most basic sshlogin is \ **-S**\  \ *host*\ :
2895
2896
2897.. code-block:: perl
2898
2899   parallel -S $SERVER1 echo running on ::: $SERVER1
2900
2901
2902Output:
2903
2904
2905.. code-block:: perl
2906
2907   running on [$SERVER1]
2908
2909
2910To use a different username prepend the server with \ *username@*\ :
2911
2912
2913.. code-block:: perl
2914
2915   parallel -S username@$SERVER1 echo running on ::: username@$SERVER1
2916
2917
2918Output:
2919
2920
2921.. code-block:: perl
2922
2923   running on [username@$SERVER1]
2924
2925
2926The special sshlogin \ **:**\  is the local machine:
2927
2928
2929.. code-block:: perl
2930
2931   parallel -S : echo running on ::: the_local_machine
2932
2933
2934Output:
2935
2936
2937.. code-block:: perl
2938
2939   running on the_local_machine
2940
2941
2942If \ **ssh**\  is not in $PATH it can be prepended to $SERVER1:
2943
2944
2945.. code-block:: perl
2946
2947   parallel -S '/usr/bin/ssh '$SERVER1 echo custom ::: ssh
2948
2949
2950Output:
2951
2952
2953.. code-block:: perl
2954
2955   custom ssh
2956
2957
2958The \ **ssh**\  command can also be given using \ **--ssh**\ :
2959
2960
2961.. code-block:: perl
2962
2963   parallel --ssh /usr/bin/ssh -S $SERVER1 echo custom ::: ssh
2964
2965
2966or by setting \ **$PARALLEL_SSH**\ :
2967
2968
2969.. code-block:: perl
2970
2971   export PARALLEL_SSH=/usr/bin/ssh
2972   parallel -S $SERVER1 echo custom ::: ssh
2973
2974
2975Several servers can be given using multiple \ **-S**\ :
2976
2977
2978.. code-block:: perl
2979
2980   parallel -S $SERVER1 -S $SERVER2 echo ::: running on more hosts
2981
2982
2983Output (the order may be different):
2984
2985
2986.. code-block:: perl
2987
2988   running
2989   on
2990   more
2991   hosts
2992
2993
2994Or they can be separated by \ **,**\ :
2995
2996
2997.. code-block:: perl
2998
2999   parallel -S $SERVER1,$SERVER2 echo ::: running on more hosts
3000
3001
3002Output: Same as above.
3003
3004Or newline:
3005
3006
3007.. code-block:: perl
3008
3009   # This gives a \n between $SERVER1 and $SERVER2
3010   SERVERS="`echo $SERVER1; echo $SERVER2`"
3011   parallel -S "$SERVERS" echo ::: running on more hosts
3012
3013
3014They can also be read from a file (replace \ *user@*\  with the user on \ **$SERVER2**\ ):
3015
3016
3017.. code-block:: perl
3018
3019   echo $SERVER1 > nodefile
3020   # Force 4 cores, special ssh-command, username
3021   echo 4//usr/bin/ssh user@$SERVER2 >> nodefile
3022   parallel --sshloginfile nodefile echo ::: running on more hosts
3023
3024
3025Output: Same as above.
3026
Every time a job finishes, the \ **--sshloginfile**\  will be re-read, so
it is possible to both add and remove hosts while running.
3029
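Editing the file while jobs are running can be done with any tool. A
minimal sketch, assuming one sshlogin per line as in the \ **nodefile**\
example above; \ **add_host**\  and \ **remove_host**\  are hypothetical
helper names:


.. code-block:: bash

   # add_host FILE SSHLOGIN: append SSHLOGIN unless it is already listed.
   add_host() {
     grep -qxF -- "$2" "$1" 2>/dev/null || printf '%s\n' "$2" >> "$1"
   }

   # remove_host FILE SSHLOGIN: drop the line that is exactly SSHLOGIN.
   remove_host() {
     { grep -vxF -- "$2" "$1" || true; } > "$1.tmp"
     mv "$1.tmp" "$1"
   }

   # Usage: add_host nodefile 4/$SERVER2; remove_host nodefile $SERVER1
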
3030The special \ **--sshloginfile ..**\  reads from \ **~/.parallel/sshloginfile**\ .
3031
To force GNU \ **parallel**\  to treat a server as having a given number of
CPU cores, prepend the number of cores followed by \ **/**\  to the sshlogin:
3034
3035
3036.. code-block:: perl
3037
3038   parallel -S 4/$SERVER1 echo force {} cpus on server ::: 4
3039
3040
3041Output:
3042
3043
3044.. code-block:: perl
3045
3046   force 4 cpus on server
3047
3048
3049Servers can be put into groups by prepending \ *@groupname*\  to the
3050server and the group can then be selected by appending \ *@groupname*\  to
3051the argument if using \ **--hostgroup**\ :
3052
3053
3054.. code-block:: perl
3055
3056   parallel --hostgroup -S @grp1/$SERVER1 -S @grp2/$SERVER2 echo {} \
3057     ::: run_on_grp1@grp1 run_on_grp2@grp2
3058
3059
3060Output:
3061
3062
3063.. code-block:: perl
3064
3065   run_on_grp1
3066   run_on_grp2
3067
3068
3069A host can be in multiple groups by separating the groups with \ **+**\ , and
3070you can force GNU \ **parallel**\  to limit the groups on which the command
3071can be run with \ **-S**\  \ *@groupname*\ :
3072
3073
3074.. code-block:: perl
3075
   parallel -S @grp1 -S @grp1+grp2/$SERVER1 -S @grp2/$SERVER2 echo {} \
3077     ::: run_on_grp1 also_grp1
3078
3079
3080Output:
3081
3082
3083.. code-block:: perl
3084
3085   run_on_grp1
3086   also_grp1
3087
3088
3089
3090Transferring files
3091==================
3092
3093
3094GNU \ **parallel**\  can transfer the files to be processed to the remote
3095host. It does that using rsync.
3096
3097
3098.. code-block:: perl
3099
3100   echo This is input_file > input_file
3101   parallel -S $SERVER1 --transferfile {} cat ::: input_file
3102
3103
3104Output:
3105
3106
3107.. code-block:: perl
3108
3109   This is input_file
3110
3111
3112If the files are processed into another file, the resulting file can be
3113transferred back:
3114
3115
3116.. code-block:: perl
3117
3118   echo This is input_file > input_file
3119   parallel -S $SERVER1 --transferfile {} --return {}.out \
3120     cat {} ">"{}.out ::: input_file
3121   cat input_file.out
3122
3123
3124Output: Same as above.
3125
3126To remove the input and output file on the remote server use \ **--cleanup**\ :
3127
3128
3129.. code-block:: perl
3130
3131   echo This is input_file > input_file
3132   parallel -S $SERVER1 --transferfile {} --return {}.out --cleanup \
3133     cat {} ">"{}.out ::: input_file
3134   cat input_file.out
3135
3136
3137Output: Same as above.
3138
3139There is a shorthand for \ **--transferfile {} --return --cleanup**\  called \ **--trc**\ :
3140
3141
3142.. code-block:: perl
3143
3144   echo This is input_file > input_file
3145   parallel -S $SERVER1 --trc {}.out cat {} ">"{}.out ::: input_file
3146   cat input_file.out
3147
3148
3149Output: Same as above.
3150
3151Some jobs need a common database for all jobs. GNU \ **parallel**\  can
3152transfer that using \ **--basefile**\  which will transfer the file before the
3153first job:
3154
3155
3156.. code-block:: perl
3157
3158   echo common data > common_file
3159   parallel --basefile common_file -S $SERVER1 \
3160     cat common_file\; echo {} ::: foo
3161
3162
3163Output:
3164
3165
3166.. code-block:: perl
3167
3168   common data
3169   foo
3170
3171
3172To remove it from the remote host after the last job use \ **--cleanup**\ .
3173
3174
3175Working dir
3176===========
3177
3178
3179The default working dir on the remote machines is the login dir. This
3180can be changed with \ **--workdir**\  \ *mydir*\ .
3181
3182Files transferred using \ **--transferfile**\  and \ **--return**\  will be relative
3183to \ *mydir*\  on remote computers, and the command will be executed in
3184the dir \ *mydir*\ .
3185
3186The special \ *mydir*\  value \ **...**\  will create working dirs under
3187\ **~/.parallel/tmp**\  on the remote computers. If \ **--cleanup**\  is given
3188these dirs will be removed.
3189
3190The special \ *mydir*\  value \ **.**\  uses the current working dir.  If the
3191current working dir is beneath your home dir, the value \ **.**\  is
3192treated as the relative path to your home dir. This means that if your
3193home dir is different on remote computers (e.g. if your login is
3194different) the relative path will still be relative to your home dir.
3195
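The rule for \ **.**\  can be sketched in plain shell. This is only an
illustration of the path arithmetic, not the code GNU \ **parallel**\  runs.
The paths and variable names are made up, and keeping a dir outside the
home dir unchanged is an assumption:

```shell
# Hypothetical paths, used only for illustration.
home_dir=/home/alice
current_dir=/home/alice/projects/demo

# Mimic the rule for "--workdir .": a dir beneath the home dir is
# expressed relative to the home dir; anything else is kept as-is
# (the second branch is an assumption, not taken from the text above).
case $current_dir in
  "$home_dir"/*) remote_workdir=${current_dir#"$home_dir"/} ;;
  *)             remote_workdir=$current_dir ;;
esac
echo "$remote_workdir"
```

Here the result is projects/demo, which the remote side resolves against
its own home dir, whatever that happens to be.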
3196
3197.. code-block:: perl
3198
3199   parallel -S $SERVER1 pwd ::: ""
3200   parallel --workdir . -S $SERVER1 pwd ::: ""
3201   parallel --workdir ... -S $SERVER1 pwd ::: ""
3202
3203
3204Output:
3205
3206
3207.. code-block:: perl
3208
3209   [the login dir on $SERVER1]
3210   [current dir relative on $SERVER1]
3211   [a dir in ~/.parallel/tmp/...]
3212
3213
3214
3215Avoid overloading sshd
3216======================
3217
3218
3219If many jobs are started on the same server, \ **sshd**\  can be
3220overloaded. GNU \ **parallel**\  can insert a delay between each job run on
3221the same server:
3222
3223
3224.. code-block:: perl
3225
3226   parallel -S $SERVER1 --sshdelay 0.2 echo ::: 1 2 3
3227
3228
3229Output (the order may be different):
3230
3231
3232.. code-block:: perl
3233
3234   1
3235   2
3236   3
3237
3238
3239\ **sshd**\  will be less overloaded if using \ **--controlmaster**\ , which will
3240multiplex ssh connections:
3241
3242
3243.. code-block:: perl
3244
3245   parallel --controlmaster -S $SERVER1 echo ::: 1 2 3
3246
3247
3248Output: Same as above.
3249
3250
3251Ignore hosts that are down
3252==========================
3253
3254
In clusters with many hosts a few of them are often down. GNU
\ **parallel**\  can ignore those hosts. In this case the host 173.194.32.46
is down:
3257
3258
3259.. code-block:: perl
3260
3261   parallel --filter-hosts -S 173.194.32.46,$SERVER1 echo ::: bar
3262
3263
3264Output:
3265
3266
3267.. code-block:: perl
3268
3269   bar
3270
3271
3272
3273Running the same commands on all hosts
3274======================================
3275
3276
3277GNU \ **parallel**\  can run the same command on all the hosts:
3278
3279
3280.. code-block:: perl
3281
3282   parallel --onall -S $SERVER1,$SERVER2 echo ::: foo bar
3283
3284
3285Output (the order may be different):
3286
3287
3288.. code-block:: perl
3289
3290   foo
3291   bar
3292   foo
3293   bar
3294
3295
Often you will just want to run a single command on all hosts without
arguments. \ **--nonall**\  is \ **--onall**\  with no arguments:
3298
3299
3300.. code-block:: perl
3301
3302   parallel --nonall -S $SERVER1,$SERVER2 echo foo bar
3303
3304
3305Output:
3306
3307
3308.. code-block:: perl
3309
3310   foo bar
3311   foo bar
3312
3313
3314When \ **--tag**\  is used with \ **--nonall**\  and \ **--onall**\  the \ **--tagstring**\  is the host:
3315
3316
3317.. code-block:: perl
3318
3319   parallel --nonall --tag -S $SERVER1,$SERVER2 echo foo bar
3320
3321
3322Output (the order may be different):
3323
3324
3325.. code-block:: perl
3326
3327   $SERVER1 foo bar
3328   $SERVER2 foo bar
3329
3330
3331\ **--jobs**\  sets the number of servers to log in to in parallel.
3332
3333
3334Transferring environment variables and functions
3335================================================
3336
3337
3338\ **env_parallel**\  is a shell function that transfers all aliases,
functions, variables, and arrays. You activate it by running:
3340
3341
3342.. code-block:: perl
3343
3344   source `which env_parallel.bash`
3345
3346
3347Replace \ **bash**\  with the shell you use.
3348
3349Now you can use \ **env_parallel**\  instead of \ **parallel**\  and still have
3350your environment:
3351
3352
3353.. code-block:: perl
3354
3355   alias myecho=echo
3356   myvar="Joe's var is"
3357   env_parallel -S $SERVER1 'myecho $myvar' ::: green
3358
3359
3360Output:
3361
3362
3363.. code-block:: perl
3364
3365   Joe's var is green
3366
3367
The disadvantage is that \ **env_parallel**\  will fail if your environment
is huge.
3370
3371When \ **env_parallel**\  fails, you can still use \ **--env**\  to tell GNU
3372\ **parallel**\  to transfer an environment variable to the remote system.
3373
3374
3375.. code-block:: perl
3376
3377   MYVAR='foo bar'
3378   export MYVAR
3379   parallel --env MYVAR -S $SERVER1 echo '$MYVAR' ::: baz
3380
3381
3382Output:
3383
3384
3385.. code-block:: perl
3386
3387   foo bar baz
3388
3389
3390This works for functions, too, if your shell is Bash:
3391
3392
3393.. code-block:: perl
3394
3395   # This only works in Bash
3396   my_func() {
3397     echo in my_func $1
3398   }
3399   export -f my_func
3400   parallel --env my_func -S $SERVER1 my_func ::: baz
3401
3402
3403Output:
3404
3405
3406.. code-block:: perl
3407
3408   in my_func baz
3409
3410
3411GNU \ **parallel**\  can copy all user defined variables and functions to
3412the remote system. It just needs to record which ones to ignore in
3413\ **~/.parallel/ignored_vars**\ . Do that by running this once:
3414
3415
3416.. code-block:: perl
3417
3418   parallel --record-env
3419   cat ~/.parallel/ignored_vars
3420
3421
3422Output:
3423
3424
3425.. code-block:: perl
3426
3427   [list of variables to ignore - including $PATH and $HOME]
3428
3429
3430Now all other variables and functions defined will be copied when
3431using \ **--env _**\ .
3432
3433
3434.. code-block:: perl
3435
3436   # The function is only copied if using Bash
3437   my_func2() {
3438     echo in my_func2 $VAR $1
3439   }
3440   export -f my_func2
3441   VAR=foo
3442   export VAR
3443
3444   parallel --env _ -S $SERVER1 'echo $VAR; my_func2' ::: bar
3445
3446
3447Output:
3448
3449
3450.. code-block:: perl
3451
3452   foo
3453   in my_func2 foo bar
3454
3455
3456If you use \ **env_parallel**\  the variables, functions, and aliases do
3457not even need to be exported to be copied:
3458
3459
3460.. code-block:: perl
3461
3462   NOT='not exported var'
3463   alias myecho=echo
3464   not_ex() {
3465     myecho in not_exported_func $NOT $1
3466   }
3467   env_parallel --env _ -S $SERVER1 'echo $NOT; not_ex' ::: bar
3468
3469
3470Output:
3471
3472
3473.. code-block:: perl
3474
3475   not exported var
3476   in not_exported_func not exported var bar
3477
3478
3479
3480Showing what is actually run
3481============================
3482
3483
3484\ **--verbose**\  will show the command that would be run on the local
3485machine.
3486
3487When using \ **--cat**\ , \ **--pipepart**\ , or when a job is run on a remote
3488machine, the command is wrapped with helper scripts. \ **-vv**\  shows all
3489of this.
3490
3491
3492.. code-block:: perl
3493
3494   parallel -vv --pipepart --block 1M wc :::: num30000
3495
3496
3497Output:
3498
3499
3500.. code-block:: perl
3501
3502   <num30000 perl -e 'while(@ARGV) { sysseek(STDIN,shift,0) || die;
3503   $left = shift; while($read = sysread(STDIN,$buf, ($left > 131072
3504   ? 131072 : $left))){ $left -= $read; syswrite(STDOUT,$buf); } }'
3505   0 0 0 168894 | (wc)
3506     30000   30000  168894
3507
3508
When the command gets more complex, the output becomes so hard to read
that it is only useful for debugging:
3511
3512
3513.. code-block:: perl
3514
3515   my_func3() {
3516     echo in my_func $1 > $1.out
3517   }
3518   export -f my_func3
3519   parallel -vv --workdir ... --nice 17 --env _ --trc {}.out \
3520     -S $SERVER1 my_func3 {} ::: abc-file
3521
3522
3523Output will be similar to:
3524
3525
3526.. code-block:: perl
3527
3528   ( ssh server -- mkdir -p ./.parallel/tmp/aspire-1928520-1;rsync
3529   --protocol 30 -rlDzR -essh ./abc-file
3530   server:./.parallel/tmp/aspire-1928520-1 );ssh server -- exec perl -e
3531   \''@GNU_Parallel=("use","IPC::Open3;","use","MIME::Base64");
3532   eval"@GNU_Parallel";my$eval=decode_base64(join"",@ARGV);eval$eval;'\'
3533   c3lzdGVtKCJta2RpciIsIi1wIiwiLS0iLCIucGFyYWxsZWwvdG1wL2FzcGlyZS0xOTI4N
3534   TsgY2hkaXIgIi5wYXJhbGxlbC90bXAvYXNwaXJlLTE5Mjg1MjAtMSIgfHxwcmludChTVE
3535   BhcmFsbGVsOiBDYW5ub3QgY2hkaXIgdG8gLnBhcmFsbGVsL3RtcC9hc3BpcmUtMTkyODU
3536   iKSAmJiBleGl0IDI1NTskRU5WeyJPTERQV0QifT0iL2hvbWUvdGFuZ2UvcHJpdmF0L3Bh
3537   IjskRU5WeyJQQVJBTExFTF9QSUQifT0iMTkyODUyMCI7JEVOVnsiUEFSQUxMRUxfU0VRI
3538   0BiYXNoX2Z1bmN0aW9ucz1xdyhteV9mdW5jMyk7IGlmKCRFTlZ7IlNIRUxMIn09fi9jc2
3539   ByaW50IFNUREVSUiAiQ1NIL1RDU0ggRE8gTk9UIFNVUFBPUlQgbmV3bGluZXMgSU4gVkF
3540   TL0ZVTkNUSU9OUy4gVW5zZXQgQGJhc2hfZnVuY3Rpb25zXG4iOyBleGVjICJmYWxzZSI7
3541   YXNoZnVuYyA9ICJteV9mdW5jMygpIHsgIGVjaG8gaW4gbXlfZnVuYyBcJDEgPiBcJDEub
3542   Xhwb3J0IC1mIG15X2Z1bmMzID4vZGV2L251bGw7IjtAQVJHVj0ibXlfZnVuYzMgYWJjLW
3543   RzaGVsbD0iJEVOVntTSEVMTH0iOyR0bXBkaXI9Ii90bXAiOyRuaWNlPTE3O2RveyRFTlZ
3544   MRUxfVE1QfT0kdG1wZGlyLiIvcGFyIi5qb2luIiIsbWFweygwLi45LCJhIi4uInoiLCJB
3545   KVtyYW5kKDYyKV19KDEuLjUpO313aGlsZSgtZSRFTlZ7UEFSQUxMRUxfVE1QfSk7JFNJ
3546   fT1zdWJ7JGRvbmU9MTt9OyRwaWQ9Zm9yazt1bmxlc3MoJHBpZCl7c2V0cGdycDtldmFse
3547   W9yaXR5KDAsMCwkbmljZSl9O2V4ZWMkc2hlbGwsIi1jIiwoJGJhc2hmdW5jLiJAQVJHVi
3548   JleGVjOiQhXG4iO31kb3skcz0kczwxPzAuMDAxKyRzKjEuMDM6JHM7c2VsZWN0KHVuZGV
3549   mLHVuZGVmLCRzKTt9dW50aWwoJGRvbmV8fGdldHBwaWQ9PTEpO2tpbGwoU0lHSFVQLC0k
3550   dW5sZXNzJGRvbmU7d2FpdDtleGl0KCQ/JjEyNz8xMjgrKCQ/JjEyNyk6MSskPz4+OCk=;
3551   _EXIT_status=$?; mkdir -p ./.; rsync --protocol 30 --rsync-path=cd\
3552   ./.parallel/tmp/aspire-1928520-1/./.\;\ rsync -rlDzR -essh
3553   server:./abc-file.out ./.;ssh server -- \(rm\ -f\
3554   ./.parallel/tmp/aspire-1928520-1/abc-file\;\ sh\ -c\ \'rmdir\
3555   ./.parallel/tmp/aspire-1928520-1/\ ./.parallel/tmp/\ ./.parallel/\
3556   2\>/dev/null\'\;rm\ -rf\ ./.parallel/tmp/aspire-1928520-1\;\);ssh
3557   server -- \(rm\ -f\ ./.parallel/tmp/aspire-1928520-1/abc-file.out\;\
3558   sh\ -c\ \'rmdir\ ./.parallel/tmp/aspire-1928520-1/\ ./.parallel/tmp/\
3559   ./.parallel/\ 2\>/dev/null\'\;rm\ -rf\
3560   ./.parallel/tmp/aspire-1928520-1\;\);ssh server -- rm -rf
3561   .parallel/tmp/aspire-1928520-1; exit $_EXIT_status;
3562
3563
3564
3565
3566*******************************************
3567Saving output to shell variables (advanced)
3568*******************************************
3569
3570
3571GNU \ **parset**\  will set shell variables to the output of GNU
3572\ **parallel**\ . GNU \ **parset**\  has one important limitation: It cannot be
3573part of a pipe. In particular this means it cannot read anything from
3574standard input (stdin) or pipe output to another program.
3575
To use GNU \ **parset**\ , prepend the command with the destination variables:
3577
3578
3579.. code-block:: perl
3580
3581   parset myvar1,myvar2 echo ::: a b
3582   echo $myvar1
3583   echo $myvar2
3584
3585
3586Output:
3587
3588
3589.. code-block:: perl
3590
3591   a
3592   b
3593
3594
3595If you only give a single variable, it will be treated as an array:
3596
3597
3598.. code-block:: perl
3599
3600   parset myarray seq {} 5 ::: 1 2 3
3601   echo "${myarray[1]}"
3602
3603
3604Output:
3605
3606
3607.. code-block:: perl
3608
3609   2
3610   3
3611   4
3612   5
3613
3614
3615The commands to run can be an array:
3616
3617
3618.. code-block:: perl
3619
3620   cmd=("echo '<<joe  \"double  space\"  cartoon>>'" "pwd")
3621   parset data ::: "${cmd[@]}"
3622   echo "${data[0]}"
3623   echo "${data[1]}"
3624
3625
3626Output:
3627
3628
3629.. code-block:: perl
3630
3631   <<joe  "double  space"  cartoon>>
3632   [current dir]
3633
3634
3635
3636********************************
3637Saving to an SQL base (advanced)
3638********************************
3639
3640
3641GNU \ **parallel**\  can save into an SQL base. Point GNU \ **parallel**\  to a
3642table and it will put the joblog there together with the variables and
3643the output each in their own column.
3644
3645CSV as SQL base
3646===============
3647
3648
3649The simplest is to use a CSV file as the storage table:
3650
3651
3652.. code-block:: perl
3653
3654   parallel --sqlandworker csv:///%2Ftmp/log.csv \
3655     seq ::: 10 ::: 12 13 14
3656   cat /tmp/log.csv
3657
3658
3659Note how '/' in the path must be written as %2F.
3660
3661Output will be similar to:
3662
3663
3664.. code-block:: perl
3665
3666   Seq,Host,Starttime,JobRuntime,Send,Receive,Exitval,_Signal,
3667     Command,V1,V2,Stdout,Stderr
3668   1,:,1458254498.254,0.069,0,9,0,0,"seq 10 12",10,12,"10
3669   11
3670   12
3671   ",
3672   2,:,1458254498.278,0.080,0,12,0,0,"seq 10 13",10,13,"10
3673   11
3674   12
3675   13
3676   ",
3677   3,:,1458254498.301,0.083,0,15,0,0,"seq 10 14",10,14,"10
3678   11
3679   12
3680   13
3681   14
3682   ",
3683
3684
3685A proper CSV reader (like LibreOffice or R's read.csv) will read this
3686format correctly - even with fields containing newlines as above.
3687
3688If the output is big you may want to put it into files using \ **--results**\ :
3689
3690
3691.. code-block:: perl
3692
3693   parallel --results outdir --sqlandworker csv:///%2Ftmp/log2.csv \
3694     seq ::: 10 ::: 12 13 14
3695   cat /tmp/log2.csv
3696
3697
3698Output will be similar to:
3699
3700
3701.. code-block:: perl
3702
3703   Seq,Host,Starttime,JobRuntime,Send,Receive,Exitval,_Signal,
3704     Command,V1,V2,Stdout,Stderr
3705   1,:,1458824738.287,0.029,0,9,0,0,
3706     "seq 10 12",10,12,outdir/1/10/2/12/stdout,outdir/1/10/2/12/stderr
3707   2,:,1458824738.298,0.025,0,12,0,0,
3708     "seq 10 13",10,13,outdir/1/10/2/13/stdout,outdir/1/10/2/13/stderr
3709   3,:,1458824738.309,0.026,0,15,0,0,
3710     "seq 10 14",10,14,outdir/1/10/2/14/stdout,outdir/1/10/2/14/stderr
3711
3712
3713
3714DBURL as table
3715==============
3716
3717
3718The CSV file is an example of a DBURL.
3719
3720GNU \ **parallel**\  uses a DBURL to address the table. A DBURL has this format:
3721
3722
3723.. code-block:: perl
3724
   vendor://[[user][:password]@][host][:port]/[database[/table]]
3726
3727
3728Example:
3729
3730
3731.. code-block:: perl
3732
3733   mysql://scott:tiger@my.example.com/mydatabase/mytable
3734   postgresql://scott:tiger@pg.example.com/mydatabase/mytable
3735   sqlite3:///%2Ftmp%2Fmydatabase/mytable
3736   csv:///%2Ftmp/log.csv
3737
3738
3739To refer to \ **/tmp/mydatabase**\  with \ **sqlite**\  or \ **csv**\  you need to
3740encode the \ **/**\  as \ **%2F**\ .
3741
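The encoding can be done by hand or with a small helper. A minimal
sketch, where the function name dburl_encode_path is made up for this
example and only \ **/**\  is encoded (other special characters are not
handled):

```shell
# Percent-encode every "/" in a path so it can be used inside a DBURL.
# dburl_encode_path is a hypothetical helper, not part of GNU parallel.
dburl_encode_path() {
  printf '%s' "$1" | sed 's|/|%2F|g'
}

db_path=/tmp/mydatabase   # the path used in the text above
DBURL="sqlite3:///$(dburl_encode_path "$db_path")"
DBURLTABLE="$DBURL/mytable"
echo "$DBURLTABLE"
```

This prints sqlite3:///%2Ftmp%2Fmydatabase/mytable, the same DBURL as in
the example list above.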
3742Run a job using \ **sqlite**\  on \ **mytable**\  in \ **/tmp/mydatabase**\ :
3743
3744
3745.. code-block:: perl
3746
3747   DBURL=sqlite3:///%2Ftmp%2Fmydatabase
3748   DBURLTABLE=$DBURL/mytable
3749   parallel --sqlandworker $DBURLTABLE echo ::: foo bar ::: baz quuz
3750
3751
3752To see the result:
3753
3754
3755.. code-block:: perl
3756
3757   sql $DBURL 'SELECT * FROM mytable ORDER BY Seq;'
3758
3759
3760Output will be similar to:
3761
3762
3763.. code-block:: perl
3764
3765   Seq|Host|Starttime|JobRuntime|Send|Receive|Exitval|_Signal|
3766     Command|V1|V2|Stdout|Stderr
3767   1|:|1451619638.903|0.806||8|0|0|echo foo baz|foo|baz|foo baz
3768   |
3769   2|:|1451619639.265|1.54||9|0|0|echo foo quuz|foo|quuz|foo quuz
3770   |
3771   3|:|1451619640.378|1.43||8|0|0|echo bar baz|bar|baz|bar baz
3772   |
3773   4|:|1451619641.473|0.958||9|0|0|echo bar quuz|bar|quuz|bar quuz
3774   |
3775
3776
3777The first columns are well known from \ **--joblog**\ . \ **V1**\  and \ **V2**\  are
3778data from the input sources. \ **Stdout**\  and \ **Stderr**\  are standard
3779output and standard error, respectively.
3780
3781
3782Using multiple workers
3783======================
3784
3785
Using an SQL base as storage costs overhead on the order of 1 second
per job.
3788
3789One of the situations where it makes sense is if you have multiple
3790workers.
3791
3792You can then have a single master machine that submits jobs to the SQL
3793base (but does not do any of the work):
3794
3795
3796.. code-block:: perl
3797
3798   parallel --sqlmaster $DBURLTABLE echo ::: foo bar ::: baz quuz
3799
3800
3801On the worker machines you run exactly the same command except you
3802replace \ **--sqlmaster**\  with \ **--sqlworker**\ .
3803
3804
3805.. code-block:: perl
3806
3807   parallel --sqlworker $DBURLTABLE echo ::: foo bar ::: baz quuz
3808
3809
To run a master and a worker on the same machine use
\ **--sqlandworker**\  as shown earlier.
3812
3813
3814
3815******
3816--pipe
3817******
3818
3819
3820The \ **--pipe**\  functionality puts GNU \ **parallel**\  in a different mode:
3821Instead of treating the data on stdin (standard input) as arguments
3822for a command to run, the data will be sent to stdin (standard input)
3823of the command.
3824
3825The typical situation is:
3826
3827
3828.. code-block:: perl
3829
3830   command_A | command_B | command_C
3831
3832
3833where command_B is slow, and you want to speed up command_B.
3834
3835Chunk size
3836==========
3837
3838
3839By default GNU \ **parallel**\  will start an instance of command_B, read a
3840chunk of 1 MB, and pass that to the instance. Then start another
3841instance, read another chunk, and pass that to the second instance.
3842
3843
3844.. code-block:: perl
3845
3846   cat num1000000 | parallel --pipe wc
3847
3848
3849Output (the order may be different):
3850
3851
3852.. code-block:: perl
3853
3854   165668  165668 1048571
3855   149797  149797 1048579
3856   149796  149796 1048572
3857   149797  149797 1048579
3858   149797  149797 1048579
3859   149796  149796 1048572
3860    85349   85349  597444
3861
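A quick way to check that no data is lost between the chunks is to add
up the byte counts. The sketch below sums the third column of the output
above and compares it with the size of the input. It assumes that
num1000000 was created with \ **seq 1000000**\ :

```shell
# Byte counts (third column of wc) copied from the sample output above.
chunk_total=$(awk '{ sum += $3 } END { print sum }' <<'EOF'
165668  165668 1048571
149797  149797 1048579
149796  149796 1048572
149797  149797 1048579
149797  149797 1048579
149796  149796 1048572
 85349   85349  597444
EOF
)

# Size of the input, assuming num1000000 was made with: seq 1000000
input_total=$(seq 1000000 | wc -c | tr -d ' ')

echo "chunks: $chunk_total input: $input_total"
```

Both numbers are 6888896. On a live run the same awk command can be
appended to the pipeline after \ **wc**\ .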
3862
The size of the chunk is not exactly 1 MB because GNU \ **parallel**\  only
passes full lines - never half a line. The block size is therefore only
1 MB on average. You can change the block size to 2 MB with \ **--block**\ :
3866
3867
3868.. code-block:: perl
3869
3870   cat num1000000 | parallel --pipe --block 2M wc
3871
3872
3873Output (the order may be different):
3874
3875
3876.. code-block:: perl
3877
3878   315465  315465 2097150
3879   299593  299593 2097151
3880   299593  299593 2097151
3881    85349   85349  597444
3882
3883
3884GNU \ **parallel**\  treats each line as a record. If the order of records
3885is unimportant (e.g. you need all lines processed, but you do not care
3886which is processed first), then you can use \ **--roundrobin**\ . Without
3887\ **--roundrobin**\  GNU \ **parallel**\  will start a command per block; with
3888\ **--roundrobin**\  only the requested number of jobs will be started
3889(\ **--jobs**\ ). The records will then be distributed between the running
3890jobs:
3891
3892
3893.. code-block:: perl
3894
3895   cat num1000000 | parallel --pipe -j4 --roundrobin wc
3896
3897
3898Output will be similar to:
3899
3900
3901.. code-block:: perl
3902
3903   149797  149797 1048579
3904   299593  299593 2097151
3905   315465  315465 2097150
3906   235145  235145 1646016
3907
3908
One of the 4 instances got a single block, 2 instances got 2 full
blocks each, and one instance got 1 full and 1 partial block.
3911
3912
3913Records
3914=======
3915
3916
3917GNU \ **parallel**\  sees the input as records. The default record is a single
3918line.
3919
3920Using \ **-N140000**\  GNU \ **parallel**\  will read 140000 records at a time:
3921
3922
3923.. code-block:: perl
3924
3925   cat num1000000 | parallel --pipe -N140000 wc
3926
3927
3928Output (the order may be different):
3929
3930
3931.. code-block:: perl
3932
3933   140000  140000  868895
3934   140000  140000  980000
3935   140000  140000  980000
3936   140000  140000  980000
3937   140000  140000  980000
3938   140000  140000  980000
3939   140000  140000  980000
3940    20000   20000  140001
3941
3942
Note that the last job could not get the full 140000 lines, but
only 20000 lines.
3945
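The byte counts in the output follow directly from the input. A small
sketch that recomputes three of them without GNU \ **parallel**\ , assuming
num1000000 holds the numbers 1 to 1000000 with one number per line:

```shell
# Each job gets 140000 consecutive lines; count the bytes in three of them.
first_job=$(seq 1 140000 | wc -c | tr -d ' ')        # lines 1..140000
second_job=$(seq 140001 280000 | wc -c | tr -d ' ')  # lines 140001..280000
last_job=$(seq 980001 1000000 | wc -c | tr -d ' ')   # the remaining 20000 lines
echo "$first_job $second_job $last_job"
```

The first job is smaller because it contains the short numbers. From the
second job on, every line is 6 digits plus a newline, which gives
140000 \* 7 = 980000 bytes.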
If a record is 75 lines, \ **-L**\  can be used:
3947
3948
3949.. code-block:: perl
3950
3951   cat num1000000 | parallel --pipe -L75 wc
3952
3953
3954Output (the order may be different):
3955
3956
3957.. code-block:: perl
3958
3959   165600  165600 1048095
3960   149850  149850 1048950
3961   149775  149775 1048425
3962   149775  149775 1048425
3963   149850  149850 1048950
3964   149775  149775 1048425
3965    85350   85350  597450
3966       25      25     176
3967
3968
Note how GNU \ **parallel**\  still reads a block of around 1 MB; but
instead of passing full lines to \ **wc**\  it passes whole records of 75
lines at a time. This of course does not hold for the last job (which
in this case got 25 lines).
3973
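That every job except the last one received whole 75-line records can be
checked from the line counts. A minimal sketch using the first column of
the sample output above:

```shell
# Line counts (first wc column) copied from the sample output above.
# Print the remainder of each count after division by 75.
remainders=$(awk '{ printf "%d ", $1 % 75 }' <<'EOF'
165600
149850
149775
149775
149850
149775
85350
25
EOF
)
echo "$remainders"
```

All remainders are 0 except for the last job, which got the 25 lines
that were left over.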
3974
3975Fixed length records
3976====================
3977
3978
Fixed length records can be processed by setting \ **--recend ''**\  and
\ **--block**\  \ *recordsize*\ . A header of size \ *n*\  can be processed with
\ **--header .{**\ *n*\ **}**\ .
3982
3983Here is how to process a file with a 4-byte header and a 3-byte record
3984size:
3985
3986
3987.. code-block:: perl
3988
3989   cat fixedlen | parallel --pipe --header .{4} --block 3 --recend '' \
3990     'echo start; cat; echo'
3991
3992
3993Output:
3994
3995
3996.. code-block:: perl
3997
3998   start
3999   HHHHAAA
4000   start
4001   HHHHCCC
4002   start
4003   HHHHBBB
4004
4005
It may be more efficient to increase \ **--block**\  to a multiple of the
record size.
4008
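To see which bytes end up where, the split in the example above can be
mimicked with standard tools. This is only a sketch of the byte layout,
not how GNU \ **parallel**\  does it, and the file content HHHHAAABBBCCC is
inferred from the output shown above:

```shell
# Sample input: a 4-byte header followed by three 3-byte records.
tmp=$(mktemp)
printf 'HHHHAAABBBCCC' > "$tmp"

header=$(head -c 4 "$tmp")
# Skip the 4 header bytes, then cut the rest into 3-byte records.
records=$(tail -c +5 "$tmp" | fold -b -w 3)

# Prepend the header to every record, as --header does for every job.
split_out=$(for rec in $records; do printf '%s%s\n' "$header" "$rec"; done)
printf '%s\n' "$split_out"
rm -f "$tmp"
```

This prints HHHHAAA, HHHHBBB, and HHHHCCC on separate lines. GNU
\ **parallel**\  hands each of these to its own job, so the order there may
differ.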
4009
4010Record separators
4011=================
4012
4013
4014GNU \ **parallel**\  uses separators to determine where two records split.
4015
\ **--recstart**\  gives the string that starts a record; \ **--recend**\  gives the
string that ends a record. The default is \ **--recend '\\n'**\  (newline).
4018
4019If both \ **--recend**\  and \ **--recstart**\  are given, then the record will only
4020split if the recend string is immediately followed by the recstart
4021string.
4022
4023Here the \ **--recend**\  is set to \ **', '**\ :
4024
4025
4026.. code-block:: perl
4027
4028   echo /foo, bar/, /baz, qux/, | \
4029     parallel -kN1 --recend ', ' --pipe echo JOB{#}\;cat\;echo END
4030
4031
4032Output:
4033
4034
4035.. code-block:: perl
4036
4037   JOB1
4038   /foo, END
4039   JOB2
4040   bar/, END
4041   JOB3
4042   /baz, END
4043   JOB4
4044   qux/,
4045   END
4046
4047
4048Here the \ **--recstart**\  is set to \ **/**\ :
4049
4050
4051.. code-block:: perl
4052
4053   echo /foo, bar/, /baz, qux/, | \
4054     parallel -kN1 --recstart / --pipe echo JOB{#}\;cat\;echo END
4055
4056
4057Output:
4058
4059
4060.. code-block:: perl
4061
4062   JOB1
4063   /foo, barEND
4064   JOB2
4065   /, END
4066   JOB3
4067   /baz, quxEND
4068   JOB4
4069   /,
4070   END
4071
4072
4073Here both \ **--recend**\  and \ **--recstart**\  are set:
4074
4075
4076.. code-block:: perl
4077
4078   echo /foo, bar/, /baz, qux/, | \
4079     parallel -kN1 --recend ', ' --recstart / --pipe \
4080     echo JOB{#}\;cat\;echo END
4081
4082
4083Output:
4084
4085
4086.. code-block:: perl
4087
4088   JOB1
4089   /foo, bar/, END
4090   JOB2
4091   /baz, qux/,
4092   END
4093
4094
4095Note the difference between setting one string and setting both strings.
4096
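The two splitting rules can be compared without GNU \ **parallel**\ . In this
sketch \ **awk**\  stands in for the record splitter and marks every split
point with a newline. It only illustrates where the splits fall and is
not the way GNU \ **parallel**\  is implemented:

```shell
input='/foo, bar/, /baz, qux/,'

# --recend ', ' alone: a record ends after every ", ".
recend_only=$(echo "$input" | awk '{ gsub(/, /, ", \n"); print }')

# --recend ', ' together with --recstart /: split only where ", " is
# immediately followed by "/".
both_set=$(echo "$input" | awk '{ gsub(/, \//, ", \n/"); print }')

echo "recend only: $(printf '%s\n' "$recend_only" | wc -l | tr -d ' ') records"
echo "both set:    $(printf '%s\n' "$both_set" | wc -l | tr -d ' ') records"
```

The first rule gives 4 records and the second gives 2, matching the
number of jobs in the outputs above.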
With \ **--regexp**\  the \ **--recend**\  and \ **--recstart**\  strings will be treated as
regular expressions:
4099
4100
4101.. code-block:: perl
4102
4103   echo foo,bar,_baz,__qux, | \
4104     parallel -kN1 --regexp --recend ,_+ --pipe \
4105     echo JOB{#}\;cat\;echo END
4106
4107
4108Output:
4109
4110
4111.. code-block:: perl
4112
4113   JOB1
4114   foo,bar,_END
4115   JOB2
4116   baz,__END
4117   JOB3
4118   qux,
4119   END
4120
4121
4122GNU \ **parallel**\  can remove the record separators with
4123\ **--remove-rec-sep**\ /\ **--rrs**\ :
4124
4125
4126.. code-block:: perl
4127
4128   echo foo,bar,_baz,__qux, | \
4129     parallel -kN1 --rrs --regexp --recend ,_+ --pipe \
4130     echo JOB{#}\;cat\;echo END
4131
4132
4133Output:
4134
4135
4136.. code-block:: perl
4137
4138   JOB1
4139   foo,barEND
4140   JOB2
4141   bazEND
4142   JOB3
4143   qux,
4144   END
4145
4146
4147
4148Header
4149======
4150
4151
4152If the input data has a header, the header can be repeated for each
4153job by matching the header with \ **--header**\ . If headers start with
4154\ **%**\  you can do this:
4155
4156
4157.. code-block:: perl
4158
4159   cat num_%header | \
4160     parallel --header '(%.*\n)*' --pipe -N3 echo JOB{#}\;cat
4161
4162
4163Output (the order may be different):
4164
4165
4166.. code-block:: perl
4167
4168   JOB1
4169   %head1
4170   %head2
4171   1
4172   2
4173   3
4174   JOB2
4175   %head1
4176   %head2
4177   4
4178   5
4179   6
4180   JOB3
4181   %head1
4182   %head2
4183   7
4184   8
4185   9
4186   JOB4
4187   %head1
4188   %head2
4189   10
4190
4191
4192If the header is 2 lines, \ **--header**\  2 will work:
4193
4194
4195.. code-block:: perl
4196
4197   cat num_%header | parallel --header 2 --pipe -N3 echo JOB{#}\;cat
4198
4199
4200Output: Same as above.
4201
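What \ **--header 2**\  combined with \ **-N3**\  does to the data can be emulated
with \ **awk**\ . This is a sequential sketch for illustration only. It
assumes the input file consists of the two lines %head1 and %head2
followed by the numbers 1 to 10:

```shell
# Recreate the assumed input: two %-prefixed header lines, then 1..10.
input=$(printf '%%head1\n%%head2\n'; seq 10)

# Emulate "--header 2 --pipe -N3": remember the first two lines and
# repeat them in front of every group of three records.
header_out=$(printf '%s\n' "$input" | awk '
  NR <= 2           { header = header $0 "\n"; next }
  (NR - 3) % 3 == 0 { printf "JOB%d\n%s", ++job, header }
                    { print }
')
printf '%s\n' "$header_out"
```

The result has the same content as the output above, with the jobs in
sequence order.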
4202
4203--pipepart
4204==========
4205
4206
\ **--pipe**\  is not very efficient. It maxes out at around 500
MB/s. \ **--pipepart**\  can easily deliver 5 GB/s. But there are a few
limitations: the input has to be a normal file (not a pipe) given by
\ **-a**\  or \ **::::**\ , and \ **-L**\ /\ **-l**\ /\ **-N**\  do not work. \ **--recend**\  and
\ **--recstart**\ , however, \ *do*\  work, and records can often be split on
that alone.
4213
4214
4215.. code-block:: perl
4216
4217   parallel --pipepart -a num1000000 --block 3m wc
4218
4219
4220Output (the order may be different):
4221
4222
4223.. code-block:: perl
4224
4225  444443  444444 3000002
4226  428572  428572 3000004
4227  126985  126984  888890
4228
4229
4230
4231
4232*******
4233Shebang
4234*******
4235
4236
4237Input data and parallel command in the same file
4238================================================
4239
4240
GNU \ **parallel**\  is often called like this:
4242
4243
4244.. code-block:: perl
4245
4246   cat input_file | parallel command
4247
4248
4249With \ **--shebang**\  the \ *input_file*\  and \ **parallel**\  can be combined into the same script.
4250
4251UNIX shell scripts start with a shebang line like this:
4252
4253
4254.. code-block:: perl
4255
4256   #!/bin/bash
4257
4258
4259GNU \ **parallel**\  can do that, too. With \ **--shebang**\  the arguments can be
4260listed in the file. The \ **parallel**\  command is the first line of the
4261script:
4262
4263
4264.. code-block:: perl
4265
4266   #!/usr/bin/parallel --shebang -r echo
4267
4268   foo
4269   bar
4270   baz
4271
4272
4273Output (the order may be different):
4274
4275
4276.. code-block:: perl
4277
4278   foo
4279   bar
4280   baz
4281
4282
4283
4284Parallelizing existing scripts
4285==============================
4286
4287
GNU \ **parallel**\  is often called like this:
4289
4290
4291.. code-block:: perl
4292
4293   cat input_file | parallel command
4294   parallel command ::: foo bar
4295
4296
If \ **command**\  is a script, \ **parallel**\  and the script can be combined into a
single file, so that this will run the script in parallel:
4299
4300
4301.. code-block:: perl
4302
4303   cat input_file | command
4304   command foo bar
4305
4306
4307This \ **perl**\  script \ **perl_echo**\  works like \ **echo**\ :
4308
4309
4310.. code-block:: perl
4311
4312   #!/usr/bin/perl
4313
4314   print "@ARGV\n"
4315
4316
It can be called like this:
4318
4319
4320.. code-block:: perl
4321
4322   parallel perl_echo ::: foo bar
4323
4324
4325By changing the \ **#!**\ -line it can be run in parallel:
4326
4327
4328.. code-block:: perl
4329
4330   #!/usr/bin/parallel --shebang-wrap /usr/bin/perl
4331
4332   print "@ARGV\n"
4333
4334
4335Thus this will work:
4336
4337
4338.. code-block:: perl
4339
4340   perl_echo foo bar
4341
4342
4343Output (the order may be different):
4344
4345
4346.. code-block:: perl
4347
4348   foo
4349   bar
4350
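The effect can be pictured as one interpreter run per argument, with the
runs started side by side. A rough emulation in plain shell, where
\ **sh**\  plays the role of the interpreter and the temporary script is a
made-up stand-in for perl_echo:

```shell
# A tiny stand-in script (hypothetical); sh plays the interpreter.
script=$(mktemp)
printf '%s\n' 'echo "Arguments $@"' > "$script"

# Rough emulation of what --shebang-wrap arranges: one interpreter
# run per argument, started concurrently, then wait for all of them.
wrap_out=$(
  for arg in foo bar; do
    sh "$script" "$arg" &
  done
  wait
)
rm -f "$script"
printf '%s\n' "$wrap_out" | sort
```

Unlike this loop, GNU \ **parallel**\  also limits the number of simultaneous
jobs and keeps the output of each job together.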
4351
4352This technique can be used for:
4353
4354
4355- Perl:
4356
4357
4358 .. code-block:: perl
4359
4360    #!/usr/bin/parallel --shebang-wrap /usr/bin/perl
4361
4362    print "Arguments @ARGV\n";
4363
4364
4365
4366
4367- Python:
4368
4369
4370 .. code-block:: perl
4371
4372    #!/usr/bin/parallel --shebang-wrap /usr/bin/python
4373
4374    import sys
    print('Arguments', str(sys.argv))
4376
4377
4378
4379
4380- Bash/sh/zsh/Korn shell:
4381
4382
4383 .. code-block:: perl
4384
4385    #!/usr/bin/parallel --shebang-wrap /bin/bash
4386
4387    echo Arguments "$@"
4388
4389
4390
4391
4392- csh:
4393
4394
4395 .. code-block:: perl
4396
4397    #!/usr/bin/parallel --shebang-wrap /bin/csh
4398
4399    echo Arguments "$argv"
4400
4401
4402
4403
4404- Tcl:
4405
4406
4407 .. code-block:: perl
4408
4409    #!/usr/bin/parallel --shebang-wrap /usr/bin/tclsh
4410
4411    puts "Arguments $argv"
4412
4413
4414
4415
4416- R:
4417
4418
4419 .. code-block:: perl
4420
4421    #!/usr/bin/parallel --shebang-wrap /usr/bin/Rscript --vanilla --slave
4422
4423    args <- commandArgs(trailingOnly = TRUE)
4424    print(paste("Arguments ",args))
4425
4426
4427
4428
4429- GNUplot:
4430
4431
4432 .. code-block:: perl
4433
4434    #!/usr/bin/parallel --shebang-wrap ARG={} /usr/bin/gnuplot
4435
4436    print "Arguments ", system('echo $ARG')
4437
4438
4439
4440
4441- Ruby:
4442
4443
4444 .. code-block:: perl
4445
4446    #!/usr/bin/parallel --shebang-wrap /usr/bin/ruby
4447
4448    print "Arguments "
4449    puts ARGV
4450
4451
4452
4453
4454- Octave:
4455
4456
4457 .. code-block:: perl
4458
4459    #!/usr/bin/parallel --shebang-wrap /usr/bin/octave
4460
4461    printf ("Arguments");
4462    arg_list = argv ();
4463    for i = 1:nargin
4464      printf (" %s", arg_list{i});
4465    endfor
4466    printf ("\n");
4467
4468
4469
4470
4471- Common LISP:
4472
4473
4474 .. code-block:: perl
4475
4476    #!/usr/bin/parallel --shebang-wrap /usr/bin/clisp
4477
4478    (format t "~&~S~&" 'Arguments)
4479    (format t "~&~S~&" *args*)
4480
4481
4482
4483
4484- PHP:
4485
4486
4487 .. code-block:: perl
4488
4489    #!/usr/bin/parallel --shebang-wrap /usr/bin/php
4490    <?php
4491    echo "Arguments";
4492    foreach(array_slice($argv,1) as $v)
4493    {
4494      echo " $v";
4495    }
4496    echo "\n";
4497    ?>
4498
4499
4500
4501
4502- Node.js:
4503
4504
4505 .. code-block:: perl
4506
4507    #!/usr/bin/parallel --shebang-wrap /usr/bin/node
4508
4509    var myArgs = process.argv.slice(2);
4510    console.log('Arguments ', myArgs);
4511
4512
4513
4514
- Lua:
4516
4517
4518 .. code-block:: perl
4519
4520    #!/usr/bin/parallel --shebang-wrap /usr/bin/lua
4521
4522    io.write "Arguments"
4523    for a = 1, #arg do
4524      io.write(" ")
4525      io.write(arg[a])
4526    end
4527    print("")
4528
4529
4530
4531
4532- C#:
4533
4534
4535 .. code-block:: perl
4536
4537    #!/usr/bin/parallel --shebang-wrap ARGV={} /usr/bin/csharp
4538
4539    var argv = Environment.GetEnvironmentVariable("ARGV");
4540    print("Arguments "+argv);
4541
4542
4543
4544
4545
4546
4547*********
4548Semaphore
4549*********
4550
4551
4552GNU \ **parallel**\  can work as a counting semaphore. This is slower and less
4553efficient than its normal mode.
4554
4555A counting semaphore is like a row of toilets. People needing a toilet
4556can use any toilet, but if there are more people than toilets, they
4557will have to wait for one of the toilets to become available.
4558
4559An alias for \ **parallel --semaphore**\  is \ **sem**\ .
4560
4561\ **sem**\  will follow a person to the toilets, wait until a toilet is
4562available, leave the person in the toilet and exit.
4563
4564\ **sem --fg**\  will follow a person to the toilets, wait until a toilet is
4565available, stay with the person in the toilet and exit when the person
4566exits.
4567
4568\ **sem --wait**\  will wait for all persons to leave the toilets.
4569
4570\ **sem**\  does not have a queue discipline, so the next person is chosen
4571randomly.
4572
4573\ **-j**\  sets the number of toilets.
4574
4575Mutex
4576=====
4577
4578
4579The default is to have only one toilet (this is called a mutex). The
4580program is started in the background and \ **sem**\  exits immediately. Use
4581\ **--wait**\  to wait for all \ **sem**\ s to finish:
4582
4583
4584.. code-block:: perl
4585
4586   sem 'sleep 1; echo The first finished' &&
4587     echo The first is now running in the background &&
4588     sem 'sleep 1; echo The second finished' &&
4589     echo The second is now running in the background
4590   sem --wait
4591
4592
4593Output:
4594
4595
4596.. code-block:: perl
4597
4598   The first is now running in the background
4599   The first finished
4600   The second is now running in the background
4601   The second finished
4602
4603
4604The command can be run in the foreground with \ **--fg**\ , which will only
4605exit when the command completes:
4606
4607
4608.. code-block:: perl
4609
4610   sem --fg 'sleep 1; echo The first finished' &&
4611     echo The first finished running in the foreground &&
4612     sem --fg 'sleep 1; echo The second finished' &&
4613     echo The second finished running in the foreground
4614   sem --wait
4615
4616
The difference between this and just running the command is that a
mutex is set: if other \ **sem**\ s were running in the background, only one
would run at a time.
4620
4621To control which semaphore is used, use
4622\ **--semaphorename**\ /\ **--id**\ . Run this in one terminal:
4623
4624
4625.. code-block:: perl
4626
4627   sem --id my_id -u 'echo First started; sleep 10; echo First done'
4628
4629
4630and simultaneously this in another terminal:
4631
4632
4633.. code-block:: perl
4634
4635   sem --id my_id -u 'echo Second started; sleep 10; echo Second done'
4636
4637
4638Note how the second will only be started when the first has finished.
4639
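The effect of the mutex can be sketched with a small loop. This is only
an illustration: the semaphore name \ **tutorial_mutex**\  and the temporary
log file are made up for the example. Every job appends a start and an
end line to the same file. Because only one job holds the mutex at a
time, each start line should be followed directly by its matching end
line:


.. code-block:: bash

   # Assumed names: tutorial_mutex and the temp file exist only in this sketch.
   log=$(mktemp)
   for i in 1 2 3; do
     # sem returns at once; the job itself waits in the background for the mutex.
     sem --id tutorial_mutex "echo start $i >> $log; sleep 0.2; echo end $i >> $log"
   done
   # Wait until all jobs using this semaphore have finished.
   sem --wait --id tutorial_mutex
   cat "$log"


The order of the three jobs may differ between runs, since \ **sem**\  has no
queue discipline, but the start/end pairs should not interleave.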
4640
4641Counting semaphore
4642==================
4643
4644
4645A mutex is like having a single toilet: When it is in use everyone
4646else will have to wait. A counting semaphore is like having multiple
4647toilets: Several people can use the toilets, but when they all are in
4648use, everyone else will have to wait.
4649
4650\ **sem**\  can emulate a counting semaphore. Use \ **--jobs**\  to set the
4651number of toilets like this:
4652
4653
4654.. code-block:: perl
4655
4656   sem --jobs 3 --id my_id -u 'echo Start 1; sleep 5; echo 1 done' &&
4657   sem --jobs 3 --id my_id -u 'echo Start 2; sleep 6; echo 2 done' &&
4658   sem --jobs 3 --id my_id -u 'echo Start 3; sleep 7; echo 3 done' &&
4659   sem --jobs 3 --id my_id -u 'echo Start 4; sleep 8; echo 4 done' &&
4660   sem --wait --id my_id
4661
4662
4663Output:
4664
4665
4666.. code-block:: perl
4667
4668   Start 1
4669   Start 2
4670   Start 3
4671   1 done
4672   Start 4
4673   2 done
4674   3 done
4675   4 done
4676
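When the commands are generated in a loop, the same pattern limits how
many of them run at once. The following is a sketch under assumed names
(\ **tutorial_pool**\  and the temporary file are not part of GNU
\ **parallel**\ ). It runs five short jobs with at most two active at any time
and then prints what they wrote:


.. code-block:: bash

   # Assumed names: tutorial_pool and the temp file exist only in this sketch.
   out=$(mktemp)
   for n in 1 2 3 4 5; do
     # At most 2 of these jobs run at the same time.
     sem --jobs 2 --id tutorial_pool "sleep 0.1; echo job $n >> $out"
   done
   # Wait until all jobs using this semaphore have finished.
   sem --wait --id tutorial_pool
   # Sort, because the completion order is not fixed.
   result=$(sort "$out")
   echo "$result"


If all arguments are known up front, the normal mode (for example
\ **parallel --jobs 2 'sleep 0.1; echo job {}' ::: 1 2 3 4 5**\ ) does the same
work with less overhead.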
4677
4678
4679Timeout
4680=======
4681
4682
\ **--semaphoretimeout**\  sets how many seconds to wait for the semaphore.
With a positive number the command is run anyway after that period; with
a negative number \ **sem**\  gives up without running the command:
4685
4686
4687.. code-block:: perl
4688
4689   sem --id foo -u 'echo Slow started; sleep 5; echo Slow ended' &&
4690   sem --id foo --semaphoretimeout 1 'echo Forced running after 1 sec' &&
4691   sem --id foo --semaphoretimeout -2 'echo Give up after 2 secs'
4692   sem --id foo --wait
4693
4694
4695Output:
4696
4697
4698.. code-block:: perl
4699
4700   Slow started
4701   parallel: Warning: Semaphore timed out. Stealing the semaphore.
4702   Forced running after 1 sec
4703   parallel: Warning: Semaphore timed out. Exiting.
4704   Slow ended
4705
4706
Note that the 'Give up' command was never run.
4708
4709
4710
4711*************
4712Informational
4713*************
4714
4715
4716GNU \ **parallel**\  has some options to give short information about the
4717configuration.
4718
4719\ **--help**\  will print a summary of the most important options:
4720
4721
4722.. code-block:: perl
4723
4724   parallel --help
4725
4726
4727Output:
4728
4729
4730.. code-block:: perl
4731
4732   Usage:
4733
4734   parallel [options] [command [arguments]] < list_of_arguments
4735   parallel [options] [command [arguments]] (::: arguments|:::: argfile(s))...
4736   cat ... | parallel --pipe [options] [command [arguments]]
4737
4738   -j n            Run n jobs in parallel
4739   -k              Keep same order
4740   -X              Multiple arguments with context replace
4741   --colsep regexp Split input on regexp for positional replacements
4742   {} {.} {/} {/.} {#} {%} {= perl code =} Replacement strings
4743   {3} {3.} {3/} {3/.} {=3 perl code =}    Positional replacement strings
4744   With --plus:    {} = {+/}/{/} = {.}.{+.} = {+/}/{/.}.{+.} = {..}.{+..} =
4745                   {+/}/{/..}.{+..} = {...}.{+...} = {+/}/{/...}.{+...}
4746
4747   -S sshlogin     Example: foo@server.example.com
4748   --slf ..        Use ~/.parallel/sshloginfile as the list of sshlogins
4749   --trc {}.bar    Shorthand for --transfer --return {}.bar --cleanup
4750   --onall         Run the given command with argument on all sshlogins
4751   --nonall        Run the given command with no arguments on all sshlogins
4752
4753   --pipe          Split stdin (standard input) to multiple jobs.
4754   --recend str    Record end separator for --pipe.
4755   --recstart str  Record start separator for --pipe.
4756
4757   See 'man parallel' for details
4758
4759   Academic tradition requires you to cite works you base your article on.
4760   When using programs that use GNU Parallel to process data for publication
4761   please cite:
4762
4763     O. Tange (2011): GNU Parallel - The Command-Line Power Tool,
4764     ;login: The USENIX Magazine, February 2011:42-47.
4765
4766   This helps funding further development; AND IT WON'T COST YOU A CENT.
4767   If you pay 10000 EUR you should feel free to use GNU Parallel without citing.
4768
4769
4770When asking for help, always report the full output of this:
4771
4772
4773.. code-block:: perl
4774
4775   parallel --version
4776
4777
4778Output:
4779
4780
4781.. code-block:: perl
4782
4783   GNU parallel 20210122
4784   Copyright (C) 2007-2021 Ole Tange, http://ole.tange.dk and Free Software
4785   Foundation, Inc.
4786   License GPLv3+: GNU GPL version 3 or later <https://gnu.org/licenses/gpl.html>
4787   This is free software: you are free to change and redistribute it.
4788   GNU parallel comes with no warranty.
4789
4790   Web site: https://www.gnu.org/software/parallel
4791
4792   When using programs that use GNU Parallel to process data for publication
4793   please cite as described in 'parallel --citation'.
4794
4795
4796In scripts \ **--minversion**\  can be used to ensure the user has at least
4797this version:
4798
4799
4800.. code-block:: perl
4801
4802   parallel --minversion 20130722 && \
4803     echo Your version is at least 20130722.
4804
4805
Output (the first line is the installed version, so it will vary):
4807
4808
4809.. code-block:: perl
4810
4811   20160322
4812   Your version is at least 20130722.
4813
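In a script the version printed on standard output is often just noise.
A minimal sketch of a guard, assuming the script needs the version
listed in this tutorial's prerequisites (20160822) and relying only on
the exit status of \ **--minversion**\ :


.. code-block:: bash

   # Version from the prerequisites of this tutorial; adjust to your needs.
   required=20160822
   # --minversion exits non-zero if the installed version is older.
   # A missing parallel binary also ends up in the else branch.
   if parallel --minversion "$required" >/dev/null 2>&1; then
     status="ok"
   else
     status="too old or not installed"
   fi
   echo "GNU parallel >= $required: $status"


The variable names \ **required**\  and \ **status**\  are only illustrative; a
real script would probably exit with an error message in the else
branch.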
4814
4815If you are using GNU \ **parallel**\  for research the BibTeX citation can be
4816generated using \ **--citation**\ :
4817
4818
4819.. code-block:: perl
4820
4821   parallel --citation
4822
4823
4824Output:
4825
4826
4827.. code-block:: perl
4828
4829   Academic tradition requires you to cite works you base your article on.
4830   When using programs that use GNU Parallel to process data for publication
4831   please cite:
4832
4833   @article{Tange2011a,
4834     title = {GNU Parallel - The Command-Line Power Tool},
4835     author = {O. Tange},
4836     address = {Frederiksberg, Denmark},
4837     journal = {;login: The USENIX Magazine},
4838     month = {Feb},
4839     number = {1},
4840     volume = {36},
4841     url = {https://www.gnu.org/s/parallel},
4842     year = {2011},
4843     pages = {42-47},
4844     doi = {10.5281/zenodo.16303}
4845   }
4846
4847   (Feel free to use \nocite{Tange2011a})
4848
4849   This helps funding further development; AND IT WON'T COST YOU A CENT.
4850   If you pay 10000 EUR you should feel free to use GNU Parallel without citing.
4851
4852   If you send a copy of your published article to tange@gnu.org, it will be
4853   mentioned in the release notes of next version of GNU Parallel.
4854
4855
With \ **--max-line-length-allowed**\  GNU \ **parallel**\  will report the maximum
length of the command line:
4858
4859
4860.. code-block:: perl
4861
4862   parallel --max-line-length-allowed
4863
4864
4865Output (may vary on different systems):
4866
4867
4868.. code-block:: perl
4869
4870   131071
4871
4872
4873\ **--number-of-cpus**\  and \ **--number-of-cores**\  run system specific code to
4874determine the number of CPUs and CPU cores on the system. On
4875unsupported platforms they will return 1:
4876
4877
4878.. code-block:: perl
4879
4880   parallel --number-of-cpus
4881   parallel --number-of-cores
4882
4883
4884Output (may vary on different systems):
4885
4886
4887.. code-block:: perl
4888
4889   4
4890   64
4891
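The reported numbers can be used to size a run. As a sketch (the halving
policy is just an assumption for the example, not a recommendation from
GNU \ **parallel**\ ), this uses half of the detected cores, but at least one
job slot:


.. code-block:: bash

   cores=$(parallel --number-of-cores)
   # Use half of the cores, but never fewer than one job slot.
   slots=$(( cores / 2 ))
   [ "$slots" -ge 1 ] || slots=1
   # -k keeps the output in the same order as the input.
   result=$(parallel --jobs "$slots" -k echo ::: A B C)
   echo "$result"


This prints A, B and C on separate lines, whatever number of cores was
detected.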
4892
4893
4894********
4895Profiles
4896********
4897
4898
4899The defaults for GNU \ **parallel**\  can be changed systemwide by putting the
4900command line options in \ **/etc/parallel/config**\ . They can be changed for
4901a user by putting them in \ **~/.parallel/config**\ .
4902
4903Profiles work the same way, but have to be referred to with \ **--profile**\ :
4904
4905
4906.. code-block:: perl
4907
4908   echo '--nice 17' > ~/.parallel/nicetimeout
4909   echo '--timeout 300%' >> ~/.parallel/nicetimeout
4910   parallel --profile nicetimeout echo ::: A B C
4911
4912
4913Output:
4914
4915
4916.. code-block:: perl
4917
4918   A
4919   B
4920   C
4921
4922
4923Profiles can be combined:
4924
4925
4926.. code-block:: perl
4927
4928   echo '-vv --dry-run' > ~/.parallel/dryverbose
4929   parallel --profile dryverbose --profile nicetimeout echo ::: A B C
4930
4931
4932Output:
4933
4934
4935.. code-block:: perl
4936
4937   echo A
4938   echo B
4939   echo C
4940
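A profile is simply a file with command line options, so its effect is
easy to check. The sketch below creates a hypothetical profile called
\ **tagged**\  (the name and the choice of options are assumptions for this
example) and runs a command with it:


.. code-block:: bash

   mkdir -p ~/.parallel
   # Hypothetical profile: keep the input order and tag each output line
   # with the argument that produced it.
   echo '--keep-order --tag' > ~/.parallel/tagged
   result=$(parallel --profile tagged echo 'got {}' ::: A B C)
   echo "$result"


Each output line should start with the argument, followed by a TAB and
the output of the command.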
4941
4942
4943***************
4944Spread the word
4945***************
4946
4947
4948I hope you have learned something from this tutorial.
4949
4950If you like GNU \ **parallel**\ :
4951
4952
- (Re-)walk through the tutorial if you have not done so in the past year
  (https://www.gnu.org/software/parallel/parallel_tutorial.html)

- Give a demo at your local user group/your team/your colleagues

- Post the intro videos and the tutorial on Reddit, Mastodon, Diaspora\*,
  forums, blogs, Identi.ca, Google+, Twitter, Facebook, Linkedin, and
  mailing lists

- Request or write a review for your favourite blog or magazine
  (especially if you do something cool with GNU \ **parallel**\ )

- Invite me for your next conference
4984
4985
4986
4987If you use GNU \ **parallel**\  for research:
4988
4989
- Please cite GNU \ **parallel**\  in your publications (use \ **--citation**\ )
4993
4994
4995
4996If GNU \ **parallel**\  saves you money:
4997
4998
- (Have your company) donate to FSF or become a member
  https://my.fsf.org/donate/
5003
5004
5005
5006(C) 2013-2021 Ole Tange, GFDLv1.3+ (See
5007LICENSES/GFDL-1.3-or-later.txt)
5008
5009