=encoding utf8

=head1 NAME

perlthrtut - Tutorial on threads in Perl

=head1 DESCRIPTION

This tutorial describes the use of Perl interpreter threads (sometimes
referred to as I<ithreads>). In this
model, each thread runs in its own Perl interpreter, and any data sharing
between threads must be explicit. The user-level interface for I<ithreads>
uses the L<threads> class.

B<NOTE>: There was another older Perl threading flavor called the 5.005 model
that used the L<Thread> class. This old model was known to have problems, is
deprecated, and was removed for release 5.10. You are
strongly encouraged to migrate any existing 5.005 threads code to the new
model as soon as possible.

You can see which (or neither) threading flavour you have by
running C<perl -V> and looking at the C<Platform> section.
If you have C<useithreads=define> you have ithreads; if you
have C<use5005threads=define> you have 5.005 threads.
If you have neither, you don't have any thread support built in.
If you have both, you are in trouble.

The L<threads> and L<threads::shared> modules are included in the core Perl
distribution. Additionally, they are maintained as separate modules on
CPAN, so you can check there for any updates.

=head1 What Is A Thread Anyway?

A thread is a flow of control through a program with a single
execution point.

Sounds an awful lot like a process, doesn't it? Well, it should.
Threads are one of the pieces of a process. Every process has at least
one thread and, up until now, every process running Perl had only one
thread. With 5.8, though, you can create extra threads. We're going
to show you how, when, and why.

=head1 Threaded Program Models

There are three basic ways that you can structure a threaded
program. Which model you choose depends on what you need your program
to do.
For many non-trivial threaded programs, you'll need to choose
different models for different pieces of your program.

=head2 Boss/Worker

The boss/worker model usually has one I<boss> thread and one or more
I<worker> threads. The boss thread gathers or generates tasks that need
to be done, then parcels those tasks out to the appropriate worker
thread.

This model is common in GUI and server programs, where a main thread
waits for some event and then passes that event to the appropriate
worker threads for processing. Once the event has been passed on, the
boss thread goes back to waiting for another event.

The boss thread does relatively little work. While tasks aren't
necessarily performed faster than with any other method, it tends to
have the best user-response times.

=head2 Work Crew

In the work crew model, several threads are created that do
essentially the same thing to different pieces of data. It closely
mirrors classical parallel processing and vector processors, where a
large array of processors do the exact same thing to many pieces of
data.

This model is particularly useful if the system running the program
will distribute multiple threads across different processors. It can
also be useful in ray tracing or rendering engines, where the
individual threads can pass on interim results to give the user visual
feedback.

=head2 Pipeline

The pipeline model divides up a task into a series of steps, and
passes the results of one step on to the thread processing the
next. Each thread does one thing to each piece of data and passes the
results to the next thread in line.

This model makes the most sense if you have multiple processors so two
or more threads will be executing in parallel, though it can often
make sense in other contexts as well.
It tends to keep the individual
tasks small and simple, as well as allowing some parts of the pipeline
to block (on I/O or system calls, for example) while other parts keep
going. If you're running different parts of the pipeline on different
processors you may also take advantage of the caches on each
processor.

This model is also handy for a form of recursive programming where,
rather than having a subroutine call itself, it instead creates
another thread. Prime and Fibonacci generators both map well to this
form of the pipeline model. (A version of a prime number generator is
presented later on.)

=head1 What kind of threads are Perl threads?

If you have experience with other thread implementations, you might
find that things aren't quite what you expect. It's very important to
remember when dealing with Perl threads that I<Perl Threads Are Not X
Threads> for all values of X. They aren't POSIX threads, or
DecThreads, or Java's Green threads, or Win32 threads. There are
similarities, and the broad concepts are the same, but if you start
looking for implementation details you're going to be either
disappointed or confused. Possibly both.

This is not to say that Perl threads are completely different from
everything that's ever come before. They're not. Perl's threading
model owes a lot to other thread models, especially POSIX. Just as
Perl is not C, though, Perl threads are not POSIX threads. So if you
find yourself looking for mutexes, or thread priorities, it's time to
step back a bit and think about what you want to do and how Perl can
do it.

However, it is important to remember that Perl threads cannot magically
do things unless your operating system's threads allow it. So if your
system blocks the entire process on C<sleep()>, Perl usually will, as well.

B<Perl Threads Are Different.>

=head1 Thread-Safe Modules

The addition of threads has changed Perl's internals
substantially. There are implications for people who write
modules with XS code or external libraries. However, since Perl data is
not shared among threads by default, Perl modules stand a high chance of
being thread-safe or can be made thread-safe easily. Modules that are not
tagged as thread-safe should be tested or code reviewed before being used
in production code.

Not all modules that you might use are thread-safe, and you should
always assume a module is unsafe unless the documentation says
otherwise. This includes modules that are distributed as part of the
core. Threads are a relatively new feature, and even some of the standard
modules aren't thread-safe.

Even if a module is thread-safe, it doesn't mean that the module is optimized
to work well with threads. A module could possibly be rewritten to utilize
the new features in threaded Perl to increase performance in a threaded
environment.

If you're using a module that's not thread-safe for some reason, you
can protect yourself by using it from one thread, and only one thread.
If you need multiple threads to access such a module, you can use semaphores and
lots of programming discipline to control access to it. Semaphores
are covered in L</"Basic semaphores">.

See also L</"Thread-Safety of System Libraries">.

=head1 Thread Basics

The L<threads> module provides the basic functions you need to write
threaded programs. In the following sections, we'll cover the basics,
showing you what you need to do to create a threaded program. After
that, we'll go over some of the features of the L<threads> module that
make threaded programming easier.

=head2 Basic Thread Support

Thread support is a Perl compile-time option.
It's something that's
turned on or off when Perl is built at your site, rather than when
your programs are compiled. If your Perl wasn't compiled with thread
support enabled, then any attempt to use threads will fail.

Your programs can use the Config module to check whether threads are
enabled. If your program can't run without them, you can say something
like:

    use Config;
    $Config{useithreads} or
        die('Recompile Perl with threads to run this program.');

A possibly-threaded program using a possibly-threaded module might
have code like this:

    use Config;
    use MyMod;

    BEGIN {
        if ($Config{useithreads}) {
            # We have threads
            require MyMod_threaded;
            MyMod_threaded->import();
        } else {
            require MyMod_unthreaded;
            MyMod_unthreaded->import();
        }
    }

Since code that runs both with and without threads is usually pretty
messy, it's best to isolate the thread-specific code in its own
module. In our example above, that's what C<MyMod_threaded> is, and it's
only imported if we're running on a threaded Perl.

=head2 A Note about the Examples

In a real situation, care should be taken that all threads are finished
executing before the program exits. That care has B<not> been taken in these
examples in the interest of simplicity. Running these examples I<as is> will
produce error messages, usually caused by the fact that there are still
threads running when the program exits. You should not be alarmed by this.

=head2 Creating Threads

The L<threads> module provides the tools you need to create new
threads. Like any other module, you need to tell Perl that you want to use
it; C<use threads;> imports all the pieces you need to create basic
threads.

The simplest, most straightforward way to create a thread is with C<create()>:

    use threads;

    my $thr = threads->create(\&sub1);

    sub sub1 {
        print("In the thread\n");
    }

The C<create()> method takes a reference to a subroutine and creates a new
thread that starts executing in the referenced subroutine. Control
then passes both to the subroutine and the caller.

If you need to, your program can pass parameters to the subroutine as
part of the thread startup. Just include the list of parameters as
part of the C<threads-E<gt>create()> call, like this:

    use threads;

    my $Param3 = 'foo';
    my $thr1 = threads->create(\&sub1, 'Param 1', 'Param 2', $Param3);
    my @ParamList = (42, 'Hello', 3.14);
    my $thr2 = threads->create(\&sub1, @ParamList);
    my $thr3 = threads->create(\&sub1, qw(Param1 Param2 Param3));

    sub sub1 {
        my @InboundParameters = @_;
        print("In the thread\n");
        print('Got parameters >', join('<>',@InboundParameters), "<\n");
    }

The last example illustrates another feature of threads. You can spawn
off several threads using the same subroutine. Each thread executes
the same subroutine, but in a separate thread with a separate
environment and potentially separate arguments.

C<new()> is a synonym for C<create()>.

=head2 Waiting For A Thread To Exit

Since threads are also subroutines, they can return values. To wait
for a thread to exit and extract any values it might return, you can
use the C<join()> method:

    use threads;

    my ($thr) = threads->create(\&sub1);

    my @ReturnData = $thr->join();
    print('Thread returned ', join(', ', @ReturnData), "\n");

    sub sub1 { return ('Fifty-six', 'foo', 2); }

In the example above, the C<join()> method returns as soon as the thread
ends.
In addition to waiting for a thread to finish and gathering up
any values that the thread might have returned, C<join()> also performs
any OS cleanup necessary for the thread. That cleanup might be
important, especially for long-running programs that spawn lots of
threads. If you don't want the return values and don't want to wait
for the thread to finish, you should call the C<detach()> method
instead, as described next.

NOTE: In the example above, the thread returns a list, thus necessitating
that the thread creation call be made in list context (i.e., C<my ($thr)>).
See L<< threads/"$thr->join()" >> and L<threads/"THREAD CONTEXT"> for more
details on thread context and return values.

=head2 Ignoring A Thread

C<join()> does three things: it waits for a thread to exit, cleans up
after it, and returns any data the thread may have produced. But what
if you're not interested in the thread's return values, and you don't
really care when the thread finishes? All you want is for the thread
to be cleaned up when it's done.

In this case, you use the C<detach()> method. Once a thread is detached,
it'll run until it's finished; then Perl will clean up after it
automatically.

    use threads;

    my $thr = threads->create(\&sub1);   # Spawn the thread

    $thr->detach();   # Now we officially don't care any more

    sleep(15);        # Let thread run for awhile

    sub sub1 {
        my $count = 0;
        while (1) {
            $count++;
            print("\$count is $count\n");
            sleep(1);
        }
    }

Once a thread is detached, it may not be joined, and any return data
that it might have produced (if it was done and waiting for a join) is
lost.
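
The restriction on joining a detached thread can be seen in a short
sketch. This is only an illustration: wrapping the call in C<eval> is
one possible way to trap the failure, and the variable names are made
up for the example. It checks the thread's state with C<is_detached()>
and then shows that a later C<join()> raises an error instead of
returning the thread's value:

    use threads;

    my $thr = threads->create(sub { return 42; });
    $thr->detach();

    # is_detached() reports whether detach() has been called
    print("Thread is detached\n") if $thr->is_detached();

    # join() on a detached thread dies, so trap the error with eval
    my @result = eval { $thr->join() };
    print("join() failed: $@") if $@;

If your program needs the return value, decide that before calling
C<detach()>; afterwards there is no way to get it back.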

C<detach()> can also be called as a class method to allow a thread to
detach itself:

    use threads;

    my $thr = threads->create(\&sub1);

    sub sub1 {
        threads->detach();
        # Do more work
    }

=head2 Process and Thread Termination

With threads one must be careful to make sure they all have a chance to
run to completion, assuming that is what you want.

An action that terminates a process will terminate I<all> running
threads. C<die()> and C<exit()> have this property,
and perl does an exit when the main thread exits,
perhaps implicitly by falling off the end of your code,
even if that's not what you want.

As an example of this case, this code prints the message
"Perl exited with active threads: 2 running and unjoined":

    use threads;
    my $thr1 = threads->new(\&thrsub, "test1");
    my $thr2 = threads->new(\&thrsub, "test2");
    sub thrsub {
        my ($message) = @_;
        sleep 1;
        print "thread $message\n";
    }

But when the following lines are added at the end:

    $thr1->join();
    $thr2->join();

it prints two lines of output, a perhaps more useful outcome.

=head1 Threads And Data

Now that we've covered the basics of threads, it's time for our next
topic: Data. Threading introduces a couple of complications to data
access that non-threaded programs never need to worry about.

=head2 Shared And Unshared Data

The biggest difference between Perl I<ithreads> and the old 5.005 style
threading, or for that matter, to most other threading systems out there,
is that by default, no data is shared. When a new Perl thread is created,
all the data associated with the current thread is copied to the new
thread, and is subsequently private to that new thread!
This is similar in feel to what happens when a Unix process forks,
except that in this case, the data is just copied to a different part of
memory within the same process rather than a real fork taking place.

To make use of threading, however, one usually wants the threads to share
at least some data between themselves. This is done with the
L<threads::shared> module and the C<:shared> attribute:

    use threads;
    use threads::shared;

    my $foo :shared = 1;
    my $bar = 1;
    threads->create(sub { $foo++; $bar++; })->join();

    print("$foo\n");  # Prints 2 since $foo is shared
    print("$bar\n");  # Prints 1 since $bar is not shared

In the case of a shared array, all the array's elements are shared, and for
a shared hash, all the keys and values are shared. This places
restrictions on what may be assigned to shared array and hash elements: only
simple values or references to shared variables are allowed - this is
so that a private variable can't accidentally become shared. A bad
assignment will cause the thread to die. For example:

    use threads;
    use threads::shared;

    my $var          = 1;
    my $svar :shared = 2;
    my %hash :shared;

    ... create some threads ...

    $hash{a} = 1;       # All threads see exists($hash{a})
                        # and $hash{a} == 1
    $hash{a} = $var;    # okay - copy-by-value: same effect as previous
    $hash{a} = $svar;   # okay - copy-by-value: same effect as previous
    $hash{a} = \$svar;  # okay - a reference to a shared variable
    $hash{a} = \$var;   # This will die
    delete($hash{a});   # okay - all threads will see !exists($hash{a})

Note that a shared variable guarantees that if two or more threads try to
modify it at the same time, the internal state of the variable will not
become corrupted. However, there are no guarantees beyond this, as
explained in the next section.
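
Because only simple values and references to shared data may be stored
in a shared hash or array, a nested structure has to be made shared
before it is assigned. One way to do that, sketched below, is the
C<shared_clone()> function exported by L<threads::shared>, which
returns a shared copy of a data structure. The hash name C<%config>
and the port numbers are invented for this illustration:

    use threads;
    use threads::shared;

    my %config :shared;

    # A plain anonymous array is private, so store a shared copy instead
    $config{ports} = shared_clone([80, 443]);

    # The worker's push is visible to the main thread after join()
    threads->create(sub { push(@{ $config{ports} }, 8080); })->join();

    print(join(', ', @{ $config{ports} }), "\n");

Assigning C<[80, 443]> directly, without C<shared_clone()>, would be one
of the bad assignments described above.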

=head2 Thread Pitfalls: Races

While threads bring a new set of useful tools, they also bring a
number of pitfalls. One pitfall is the race condition:

    use threads;
    use threads::shared;

    my $x :shared = 1;
    my $thr1 = threads->create(\&sub1);
    my $thr2 = threads->create(\&sub2);

    $thr1->join();
    $thr2->join();
    print("$x\n");

    sub sub1 { my $foo = $x; $x = $foo + 1; }
    sub sub2 { my $bar = $x; $x = $bar + 1; }

What do you think C<$x> will be? The answer, unfortunately, is I<it
depends>. Both C<sub1()> and C<sub2()> access the global variable C<$x>, once
to read and once to write. Depending on factors ranging from your
thread implementation's scheduling algorithm to the phase of the moon,
C<$x> can be 2 or 3.

Race conditions are caused by unsynchronized access to shared
data. Without explicit synchronization, there's no way to be sure that
nothing has happened to the shared data between the time you access it
and the time you update it. Even this simple code fragment has the
possibility of error:

    use threads;
    use threads::shared;   # Needed for the :shared attribute

    my $x :shared = 2;
    my $y :shared;
    my $z :shared;
    my $thr1 = threads->create(sub { $y = $x; $x = $y + 1; });
    my $thr2 = threads->create(sub { $z = $x; $x = $z + 1; });
    $thr1->join();
    $thr2->join();

Two threads both access C<$x>. Each thread can potentially be interrupted
at any point, or be executed in any order. At the end, C<$x> could be 3
or 4, and both C<$y> and C<$z> could be 2 or 3.

Even C<$x += 5> or C<$x++> are not guaranteed to be atomic.

Whenever your program accesses data or resources that can be accessed
by other threads, you must take steps to coordinate access or risk
data inconsistency and race conditions. Note that Perl will protect its
internals from your race conditions, but it won't protect you from you.
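
As a rough sketch of what "coordinating access" can look like, the
first example in this section can be rewritten so that the read and the
write happen as one unit. It uses C<lock()>, which is described in
L</"Controlling access: lock()">; the subroutine name C<increment> is
made up for this illustration:

    use threads;
    use threads::shared;

    my $x :shared = 1;

    sub increment {
        lock($x);           # Only one thread at a time gets past here
        my $copy = $x;
        $x = $copy + 1;
    }                       # Lock is released when the block exits

    my @thrs = map { threads->create(\&increment) } 1 .. 2;
    $_->join() for @thrs;
    print("$x\n");          # 3, whichever thread runs first

Both threads still run in an unpredictable order, but neither can read
C<$x> while the other is between its read and its write.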

=head1 Synchronization and control

Perl provides a number of mechanisms to coordinate the interactions
between threads and their data, to avoid race conditions and the like.
Some of these are designed to resemble the common techniques used in thread
libraries such as C<pthreads>; others are Perl-specific. Often, the
standard techniques are clumsy and difficult to get right (such as
condition waits). Where possible, it is usually easier to use Perlish
techniques such as queues, which remove some of the hard work involved.

=head2 Controlling access: lock()

The C<lock()> function takes a shared variable and puts a lock on it.
No other thread may lock the variable until the variable is unlocked
by the thread holding the lock. Unlocking happens automatically
when the locking thread exits the block that contains the call to the
C<lock()> function. Using C<lock()> is straightforward: This example has
several threads doing some calculations in parallel, and occasionally
updating a running total:

    use threads;
    use threads::shared;

    my $total :shared = 0;

    sub calc {
        while (1) {
            my $result;
            # (... do some calculations and set $result ...)
            {
                lock($total);  # Block until we obtain the lock
                $total += $result;
            } # Lock implicitly released at end of scope
            last if $result == 0;
        }
    }

    my $thr1 = threads->create(\&calc);
    my $thr2 = threads->create(\&calc);
    my $thr3 = threads->create(\&calc);
    $thr1->join();
    $thr2->join();
    $thr3->join();
    print("total=$total\n");

C<lock()> blocks the thread until the variable being locked is
available. When C<lock()> returns, your thread can be sure that no other
thread can lock that variable until the block containing the
lock exits.

It's important to note that locks don't prevent access to the variable
in question, only lock attempts.
This is in keeping with Perl's
longstanding tradition of courteous programming, and the advisory file
locking that C<flock()> gives you.

You may lock arrays and hashes as well as scalars. Locking an array,
though, will not block subsequent locks on array elements, just lock
attempts on the array itself.

Locks are recursive, which means it's okay for a thread to
lock a variable more than once. The lock will last until the outermost
C<lock()> on the variable goes out of scope. For example:

    my $x :shared;
    doit();

    sub doit {
        {
            {
                lock($x); # Wait for lock
                lock($x); # NOOP - we already have the lock
                {
                    lock($x); # NOOP
                    {
                        lock($x); # NOOP
                        lockit_some_more();
                    }
                }
            } # *** Implicit unlock here ***
        }
    }

    sub lockit_some_more {
        lock($x); # NOOP
    } # Nothing happens here

Note that there is no C<unlock()> function - the only way to unlock a
variable is to allow it to go out of scope.

A lock can either be used to guard the data contained within the variable
being locked, or it can be used to guard something else, like a section
of code. In this latter case, the variable in question does not hold any
useful data, and exists only for the purpose of being locked. In this
respect, the variable behaves like the mutexes and basic semaphores of
traditional thread libraries.

=head2 A Thread Pitfall: Deadlocks

Locks are a handy tool to synchronize access to data, and using them
properly is the key to safe shared data. Unfortunately, locks aren't
without their dangers, especially when multiple locks are involved.
Consider the following code:

    use threads;
    use threads::shared;   # Needed for the :shared attribute

    my $x :shared = 4;
    my $y :shared = 'foo';
    my $thr1 = threads->create(sub {
        lock($x);
        sleep(20);
        lock($y);
    });
    my $thr2 = threads->create(sub {
        lock($y);
        sleep(20);
        lock($x);
    });

This program will probably hang until you kill it. The only way it
won't hang is if one of the two threads acquires both locks
first. A guaranteed-to-hang version is more complicated, but the
principle is the same.

The first thread will grab a lock on C<$x>, then, after a pause during which
the second thread has probably had time to do some work, try to grab a
lock on C<$y>. Meanwhile, the second thread grabs a lock on C<$y>, then later
tries to grab a lock on C<$x>. The second lock attempt for both threads will
block, each waiting for the other to release its lock.

This condition is called a deadlock, and it occurs whenever two or
more threads are trying to get locks on resources that the others
own. Each thread will block, waiting for the other to release a lock
on a resource. That never happens, though, since the thread with the
resource is itself waiting for a lock to be released.

There are a number of ways to handle this sort of problem. The best
way is to always have all threads acquire locks in the exact same
order. If, for example, you lock variables C<$x>, C<$y>, and C<$z>, always lock
C<$x> before C<$y>, and C<$y> before C<$z>. It's also best to hold on to locks for
as short a period of time as possible, to minimize the risk of deadlock.

The other synchronization primitives described below can suffer from
similar problems.

=head2 Queues: Passing Data Around

A queue is a special thread-safe object that lets you put data in one
end and take it out the other without having to worry about
synchronization issues.
Queues are pretty straightforward to use, and look like
this:

    use threads;
    use Thread::Queue;

    my $DataQueue = Thread::Queue->new();
    my $thr = threads->create(sub {
        while (my $DataElement = $DataQueue->dequeue()) {
            print("Popped $DataElement off the queue\n");
        }
    });

    $DataQueue->enqueue(12);
    $DataQueue->enqueue("A", "B", "C");
    sleep(10);
    $DataQueue->enqueue(undef);
    $thr->join();

You create the queue with C<Thread::Queue-E<gt>new()>. Then you can
add lists of scalars onto the end with C<enqueue()>, and pop scalars off
the front of it with C<dequeue()>. A queue has no fixed size, and can grow
as needed to hold everything pushed on to it.

If a queue is empty, C<dequeue()> blocks until another thread enqueues
something. This makes queues ideal for event loops and other
communications between threads.

=head2 Semaphores: Synchronizing Data Access

Semaphores are a kind of generic locking mechanism. In their most basic
form, they behave very much like lockable scalars, except that they
can't hold data, and that they must be explicitly unlocked. In their
advanced form, they act like a kind of counter, and can allow multiple
threads to have the I<lock> at any one time.

=head2 Basic semaphores

Semaphores have two methods, C<down()> and C<up()>: C<down()> decrements the resource
count, while C<up()> increments it. Calls to C<down()> will block if the
semaphore's current count would decrement below zero.
This program
gives a quick demonstration:

    use threads;
    use Thread::Semaphore;

    my $semaphore = Thread::Semaphore->new();
    my $GlobalVariable :shared = 0;

    my $thr1 = threads->create(\&sample_sub, 1);
    my $thr2 = threads->create(\&sample_sub, 2);
    my $thr3 = threads->create(\&sample_sub, 3);

    sub sample_sub {
        my $SubNumber = shift(@_);
        my $TryCount = 10;
        my $LocalCopy;
        sleep(1);
        while ($TryCount--) {
            $semaphore->down();
            $LocalCopy = $GlobalVariable;
            print("$TryCount tries left for sub $SubNumber "
                 ."(\$GlobalVariable is $GlobalVariable)\n");
            sleep(2);
            $LocalCopy++;
            $GlobalVariable = $LocalCopy;
            $semaphore->up();
        }
    }

    $thr1->join();
    $thr2->join();
    $thr3->join();

The three invocations of the subroutine all operate in sync. The
semaphore, though, makes sure that only one thread is accessing the
global variable at once.

=head2 Advanced Semaphores

By default, semaphores behave like locks, letting only one thread
C<down()> them at a time. However, there are other uses for semaphores.

Each semaphore has a counter attached to it. By default, semaphores are
created with the counter set to one, C<down()> decrements the counter by
one, and C<up()> increments by one. However, we can override any or all
of these defaults simply by passing in different values:

    use threads;
    use Thread::Semaphore;

    my $semaphore = Thread::Semaphore->new(5);
                    # Creates a semaphore with the counter set to five

    my $thr1 = threads->create(\&sub1);
    my $thr2 = threads->create(\&sub1);

    sub sub1 {
        $semaphore->down(5); # Decrements the counter by five
        # Do stuff here
        $semaphore->up(5); # Increment the counter by five
    }

    $thr1->detach();
    $thr2->detach();

If C<down()> attempts to decrement the counter below zero, it blocks until
the counter is large enough.
Note that while a semaphore can be created
with a starting count of zero, any C<up()> or C<down()> always changes the
counter by at least one, and so C<< $semaphore->down(0) >> is the same as
C<< $semaphore->down(1) >>.

The question, of course, is why would you do something like this? Why
create a semaphore with a starting count that's not one, or why
decrement or increment it by more than one? The answer is resource
availability. Many resources that you want to manage access for can be
safely used by more than one thread at once.

For example, let's take a GUI driven program. It has a semaphore that
it uses to synchronize access to the display, so only one thread is
ever drawing at once. Handy, but of course you don't want any thread
to start drawing until things are properly set up. In this case, you
can create a semaphore with a counter set to zero, and up it when
things are ready for drawing.

Semaphores with counters greater than one are also useful for
establishing quotas. Say, for example, that you have a number of
threads that can do I/O at once. You don't want all the threads
reading or writing at once though, since that can potentially swamp
your I/O channels, or deplete your process's quota of filehandles. You
can use a semaphore initialized to the number of concurrent I/O
requests (or open files) that you want at any one time, and have your
threads quietly block and unblock themselves.

Larger increments or decrements are handy in those cases where a
thread needs to check out or return a number of resources at once.

=head2 Waiting for a Condition

The functions C<cond_wait()> and C<cond_signal()>
can be used in conjunction with locks to notify
co-operating threads that a resource has become available. They are
very similar in use to the functions found in C<pthreads>. However
for most purposes, queues are simpler to use and more intuitive.
See
L<threads::shared> for more details.

=head2 Giving up control

There are times when you may find it useful to have a thread
explicitly give up the CPU to another thread. You may be doing something
processor-intensive and want to make sure that the user-interface thread
gets called frequently. Regardless, there are times that you might want
a thread to give up the processor.

Perl's threading package provides the C<yield()> function that does
this. C<yield()> is pretty straightforward, and works like this:

    use threads;

    sub loop {
        my $thread = shift;
        my $foo = 50;
        while($foo--) { print("In thread $thread\n"); }
        threads->yield();
        $foo = 50;
        while($foo--) { print("In thread $thread\n"); }
    }

    my $thr1 = threads->create(\&loop, 'first');
    my $thr2 = threads->create(\&loop, 'second');
    my $thr3 = threads->create(\&loop, 'third');

It is important to remember that C<yield()> is only a hint to give up the CPU;
what actually happens depends on your hardware, OS and threading libraries.
B<On many operating systems, yield() is a no-op.> Therefore you should
not build the scheduling of your threads around
C<yield()> calls. It might work on your platform but not on another
platform.

=head1 General Thread Utility Routines

We've covered the workhorse parts of Perl's threading package, and
with these tools you should be well on your way to writing threaded
code and packages. There are a few useful little pieces that didn't
really fit in anyplace else.

=head2 What Thread Am I In?

The C<threads-E<gt>self()> class method provides your program with a way to
get an object representing the thread it's currently in. You can use this
object in the same way as the ones returned from thread creation.
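
A small sketch may make this concrete. The subroutine name C<report>
and the variable names are invented for the illustration, and the
C<tid()> method it calls is described in the next section. Each thread
asks for its own object and returns that object's thread ID:

    use threads;

    sub report {
        my $self = threads->self();   # Object for whichever thread runs this
        return $self->tid();
    }

    my $thr        = threads->create(\&report);
    my $worker_tid = $thr->join();
    my $main_tid   = threads->self()->tid();

    print("worker: $worker_tid, main: $main_tid\n");

The same call works in the main thread, as the last lines show, so
code that needs to know where it is running does not have to be passed
a thread object explicitly.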

=head2 Thread IDs

C<tid()> is a thread object method that returns the thread ID of the
thread the object represents. Thread IDs are integers, with the main
thread in a program being 0. Currently Perl assigns a unique TID to
every thread ever created in your program, assigning the first thread
to be created a TID of 1, and increasing the TID by 1 for each new
thread that's created. When used as a class method, C<threads-E<gt>tid()>
can be used by a thread to get its own TID.

=head2 Are These Threads The Same?

The C<equal()> method takes two thread objects and returns true
if the objects represent the same thread, and false if they don't.

Thread objects also have an overloaded C<==> comparison so that you can do
comparison on them as you would with normal objects.

=head2 What Threads Are Running?

C<threads-E<gt>list()> returns a list of thread objects, one for each thread
that's currently running and not detached. Handy for a number of things,
including cleaning up at the end of your program (from the main Perl thread,
of course):

    # Loop through all the threads
    foreach my $thr (threads->list()) {
        $thr->join();
    }

If some threads have not finished running when the main Perl thread
ends, Perl will warn you about it and die, since it is impossible for Perl
to clean up itself while other threads are running.

NOTE: The main Perl thread (thread 0) is in a I<detached> state, and so
does not appear in the list returned by C<threads-E<gt>list()>.

=head1 A Complete Example

Confused yet? It's time for an example program to show some of the
things we've covered. This program finds prime numbers using threads.

     1 #!/usr/bin/perl
     2 # prime-pthread, courtesy of Tom Christiansen
     3
     4 use v5.36;
     5
     6 use threads;
     7 use Thread::Queue;
     8
     9 sub check_num ($upstream, $cur_prime) {
    10     my $kid;
    11     my $downstream = Thread::Queue->new();
    12     while (my $num = $upstream->dequeue()) {
    13         next unless ($num % $cur_prime);
    14         if ($kid) {
    15             $downstream->enqueue($num);
    16         } else {
    17             print("Found prime: $num\n");
    18             $kid = threads->create(\&check_num, $downstream, $num);
    19             if (! $kid) {
    20                 warn("Sorry. Ran out of threads.\n");
    21                 last;
    22             }
    23         }
    24     }
    25     if ($kid) {
    26         $downstream->enqueue(undef);
    27         $kid->join();
    28     }
    29 }
    30
    31 my $stream = Thread::Queue->new(3..1000, undef);
    32 check_num($stream, 2);

This program uses the pipeline model to generate prime numbers. Each
thread in the pipeline has an input queue that feeds numbers to be
checked, a prime number that it's responsible for, and an output queue
into which it funnels numbers that have failed the check. If the thread
has a number that's failed its check and there's no child thread, then
the thread must have found a new prime number. In that case, a new
child thread is created for that prime and stuck on the end of the
pipeline.

This probably sounds a bit more confusing than it really is, so let's
go through this program piece by piece and see what it does. (For
those of you who might be trying to remember exactly what a prime
number is, it's a number that's only evenly divisible by itself and 1.)

The bulk of the work is done by the C<check_num()> subroutine, which
takes a reference to its input queue and a prime number that it's
responsible for. We create a new queue (line 11) and reserve a scalar
for the thread that we're likely to create later (line 10).

The while loop from line 12 to line 24 grabs a scalar off the input
queue and checks it against the prime this thread is responsible
for. Line 13 checks to see if there's a remainder when we divide the
number to be checked by our prime. If there is one, the number
must not be evenly divisible by our prime, so we need to either pass
it on to the next thread if we've created one (line 15) or create a
new thread if we haven't.

The new thread creation is line 18. We pass on to it a reference to
the queue we've created, and the prime number we've found. In lines 19
through 22, we check to make sure that our new thread got created, and
if not, we stop checking any remaining numbers in the queue.

Finally, once the loop terminates (because we got a 0 or C<undef> in the
queue, which serves as a note to terminate), we pass on the notice to our
child, and wait for it to exit if we've created a child (lines 25
through 28).

Meanwhile, back in the main thread, we first create a queue (line 31) and
queue up all the numbers from 3 to 1000 for checking, plus a termination
notice. Then all we have to do to get the ball rolling is pass the queue
and the first prime to the C<check_num()> subroutine (line 32).

That's how it works. It's pretty simple; as with many Perl programs,
the explanation is much longer than the program.

=head1 Different implementations of threads

Some background on thread implementations from the operating system
viewpoint. There are three basic categories of threads: user-mode threads,
kernel threads, and multiprocessor kernel threads.

User-mode threads are threads that live entirely within a program and
its libraries. In this model, the OS knows nothing about threads. As
far as it's concerned, your process is just a process.

This is the easiest way to implement threads, and the way most OSes
start.
The big disadvantage is that, since the OS knows nothing about
threads, if one thread blocks they all do. Typical blocking activities
include most system calls, most I/O, and things like C<sleep()>.

Kernel threads are the next step in thread evolution. The OS knows
about kernel threads, and makes allowances for them. The main
difference between a kernel thread and a user-mode thread is
blocking. With kernel threads, things that block a single thread don't
block other threads. This is not the case with user-mode threads,
where the kernel blocks at the process level and not the thread level.

This is a big step forward, and can give a threaded program quite a
performance boost over non-threaded programs. Threads that block
performing I/O, for example, won't block threads that are doing other
things. Each process still has only one thread running at once,
though, regardless of how many CPUs a system might have.

Since kernel threading can interrupt a thread at any time, it will
uncover some of the implicit locking assumptions you may make in your
program. For example, something as simple as C<$x = $x + 2> can behave
unpredictably with kernel threads if C<$x> is visible to other
threads, as another thread may have changed C<$x> between the time it
was fetched on the right hand side and the time the new value is
stored.

Multiprocessor kernel threads are the final step in thread
support. With multiprocessor kernel threads on a machine with multiple
CPUs, the OS may schedule two or more threads to run simultaneously on
different CPUs.

This can give a serious performance boost to your threaded program,
since more than one thread will be executing at the same time. As a
tradeoff, though, any of those nagging synchronization issues that
might not have shown with basic kernel threads will appear with a
vengeance.
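
The C<$x = $x + 2> hazard described above can be sketched, along with one
way to guard against it, using C<lock()> from L<threads::shared>. This is an
illustration under stated assumptions rather than a recommended design: the
subroutine name C<add_two()>, the thread count, and the iteration count are
all made up for the example.

```perl
use strict;
use warnings;
use threads;
use threads::shared;

my $x :shared = 0;

# Hypothetical worker: add 2 to the shared counter $count times.
sub add_two {
    my ($count) = @_;
    for (1 .. $count) {
        lock($x);       # released when this loop body's block exits
        $x = $x + 2;    # fetch and store now happen under the lock
    }
    return;
}

my @workers = map { threads->create(\&add_two, 1000) } 1 .. 4;
$_->join() for @workers;

print "x = $x\n";   # 4 threads * 1000 iterations * 2 = 8000
```

Without the C<lock()> call, two threads could fetch the same old value of
C<$x> and one update would be lost, so the final total could come out lower
than expected on a system that runs the threads concurrently.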

In addition to the different levels of OS involvement in threads,
different OSes (and different thread implementations for a particular
OS) allocate CPU cycles to threads in different ways.

Cooperative multitasking systems have running threads give up control
if one of two things happens. If a thread calls a yield function, it
gives up control. It also gives up control if the thread does
something that would cause it to block, such as perform I/O. In a
cooperative multitasking implementation, one thread can starve all the
others for CPU time if it so chooses.

Preemptive multitasking systems interrupt threads at regular intervals
while the system decides which thread should run next. In a preemptive
multitasking system, one thread usually won't monopolize the CPU.

On some systems, there can be cooperative and preemptive threads
running simultaneously. (Threads running with realtime priorities
often behave cooperatively, for example, while threads running at
normal priorities behave preemptively.)

Most modern operating systems support preemptive multitasking nowadays.

=head1 Performance considerations

The main thing to bear in mind when comparing Perl's I<ithreads> to other
threading models is the fact that for each new thread created, a complete
copy of all the variables and data of the parent thread has to be taken.
Thus, thread creation can be quite expensive, both in terms of memory usage
and time spent in creation. The ideal way to reduce these costs is to have a
relatively small number of long-lived threads, all created fairly early
on (before the base thread has accumulated too much data). Of course, this
may not always be possible, so compromises have to be made. However, after
a thread has been created, its performance and extra memory usage should
be little different from ordinary code.
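
One way to follow this advice is to start a small, fixed pool of workers
before the main thread has built up any large data, and then feed them
through a L<Thread::Queue>. The sketch below assumes that approach; the
C<worker()> subroutine, the pool size of 3, and the squaring task are
placeholders invented for the example.

```perl
use strict;
use warnings;
use threads;
use Thread::Queue;

# Create the queues and the pool first, while the main thread is still
# small, so each new thread has little data to copy.
my $jobs    = Thread::Queue->new();
my $results = Thread::Queue->new();

# Hypothetical worker: square each number until an undef arrives.
sub worker {
    while (defined(my $n = $jobs->dequeue())) {
        $results->enqueue($n * $n);
    }
    return;
}

my $pool_size = 3;
my @pool = map { threads->create(\&worker) } 1 .. $pool_size;

# Only now does the work (and any large data) come into existence.
$jobs->enqueue(1 .. 20);
$jobs->enqueue((undef) x $pool_size);   # one termination notice per worker
$_->join() for @pool;

# All workers have finished, so drain the results without blocking.
my $sum = 0;
while (defined(my $r = $results->dequeue_nb())) {
    $sum += $r;
}
print "sum of squares 1..20 = $sum\n";
```

The same three threads handle every job, so the cost of copying the parent's
data is paid three times in total rather than once per job.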

Also note that under the current implementation, shared variables
use a little more memory and are a little slower than ordinary variables.

=head1 Process-scope Changes

Note that while threads themselves are separate execution threads and
Perl data is thread-private unless explicitly shared, threads can
change process-scope state, which affects all the threads.

The most common example of this is changing the current working
directory using C<chdir()>. One thread calls C<chdir()>, and the working
directory of all the threads changes.

An even more drastic example of a process-scope change is C<chroot()>:
the root directory of all the threads changes, and no thread can
undo it (as opposed to C<chdir()>).

Further examples of process-scope changes include C<umask()> and
changing uids and gids.

Thinking of mixing C<fork()> and threads? Please lie down and wait
until the feeling passes. Be aware that the semantics of C<fork()> vary
between platforms. For example, some Unix systems copy all the current
threads into the child process, while others only copy the thread that
called C<fork()>. You have been warned!

Similarly, mixing signals and threads may be problematic.
Implementations are platform-dependent, and even the POSIX
semantics may not be what you expect (and Perl doesn't even
give you the full POSIX API). For example, there is no way to
guarantee that a signal sent to a multi-threaded Perl application
will get intercepted by any particular thread. (However, the
L<threads> module does provide the capability to send signals between
threads. See L<threads/THREAD SIGNALLING> for more details.)

=head1 Thread-Safety of System Libraries

Whether various library calls are thread-safe is outside the control
of Perl.
Calls that are often not thread-safe include
C<localtime()>, C<gmtime()>, functions fetching user, group and
network information (such as C<getgrent()>, C<gethostent()>,
C<getnetent()> and so on), C<readdir()>, C<rand()>, and C<srand()>. In
general, these are calls that depend on some global external state.

If the system on which Perl is compiled has thread-safe variants of such
calls, they will be used. Beyond that, Perl is at the mercy of
the thread-safety or -unsafety of the calls. Please consult your
C library call documentation.

On some platforms the thread-safe library interfaces may fail if the
result buffer is too small (for example, the user group databases may
be rather large, and the reentrant interfaces may have to carry around
a full snapshot of those databases). Perl will start with a small
buffer, but keep retrying and growing the result buffer
until the result fits. If this limitless growing sounds bad for
security or memory consumption reasons, you can recompile Perl with
C<PERL_REENTRANT_MAXSIZE> defined to the maximum number of bytes you will
allow.

=head1 Conclusion

A complete thread tutorial could fill a book (and has, many times),
but with what we've covered in this introduction, you should be well
on your way to becoming a threaded Perl expert.

=head1 SEE ALSO

Annotated POD for L<threads>:
L<https://web.archive.org/web/20171028020148/http://annocpan.org/?mode=search&field=Module&name=threads>

Latest version of L<threads> on CPAN:
L<https://metacpan.org/pod/threads>

Annotated POD for L<threads::shared>:
L<https://web.archive.org/web/20171028020148/http://annocpan.org/?mode=search&field=Module&name=threads%3A%3Ashared>

Latest version of L<threads::shared> on CPAN:
L<https://metacpan.org/pod/threads::shared>

Perl threads mailing list:
L<https://lists.perl.org/list/ithreads.html>

=head1 Bibliography

Here's a short bibliography courtesy of Jürgen Christoffel:

=head2 Introductory Texts

Birrell, Andrew D. An Introduction to Programming with
Threads. Digital Equipment Corporation, 1989, DEC-SRC Research Report
#35 online as
L<https://www.hpl.hp.com/techreports/Compaq-DEC/SRC-RR-35.pdf>
(highly recommended)

Robbins, Kay A., and Steven Robbins. Practical Unix Programming: A
Guide to Concurrency, Communication, and
Multithreading. Prentice-Hall, 1996.

Lewis, Bill, and Daniel J. Berg. Multithreaded Programming with
Pthreads. Prentice Hall, 1997, ISBN 0-13-443698-9 (a well-written
introduction to threads).

Nelson, Greg (editor). Systems Programming with Modula-3. Prentice
Hall, 1991, ISBN 0-13-590464-1.

Nichols, Bradford, Dick Buttlar, and Jacqueline Proulx Farrell.
Pthreads Programming. O'Reilly & Associates, 1996, ISBN 1-56592-115-1
(covers POSIX threads).

=head2 OS-Related References

Boykin, Joseph, David Kirschen, Alan Langerman, and Susan
LoVerso. Programming under Mach. Addison-Wesley, 1994, ISBN
0-201-52739-1.

Tanenbaum, Andrew S. Distributed Operating Systems. Prentice Hall,
1995, ISBN 0-13-219908-4 (great textbook).

Silberschatz, Abraham, and Peter B. Galvin.
Operating System Concepts,
4th ed. Addison-Wesley, 1995, ISBN 0-201-59292-4.

=head2 Other References

Arnold, Ken and James Gosling. The Java Programming Language, 2nd
ed. Addison-Wesley, 1998, ISBN 0-201-31006-6.

comp.programming.threads FAQ,
L<http://www.serpentine.com/~bos/threads-faq/>

Le Sergent, T. and B. Berthomieu. "Incremental MultiThreaded Garbage
Collection on Virtually Shared Memory Architectures" in Memory
Management: Proc. of the International Workshop IWMM 92, St. Malo,
France, September 1992, Yves Bekkers and Jacques Cohen, eds. Springer,
1992, ISBN 3-540-55940-X (real-life thread applications).

Artur Bergman, "Where Wizards Fear To Tread", June 11, 2002,
L<http://www.perl.com/pub/a/2002/06/11/threads.html>

=head1 Acknowledgements

Thanks (in no particular order) to Chaim Frenkel, Steve Fink, Gurusamy
Sarathy, Ilya Zakharevich, Benjamin Sugars, Jürgen Christoffel, Joshua
Pritikin, and Alan Burlison, for their help in reality-checking and
polishing this article. Big thanks to Tom Christiansen for his rewrite
of the prime number generator.

=head1 AUTHOR

Dan Sugalski E<lt>dan@sidhe.orgE<gt>

Slightly modified by Artur Bergman to fit the new thread model/module.

Reworked slightly by Jörg Walter E<lt>jwalt@cpan.orgE<gt> to be more concise
about thread-safety of Perl code.

Rearranged slightly by Elizabeth Mattijsen E<lt>liz@dijkmat.nlE<gt> to put
less emphasis on yield().

=head1 Copyrights

The original version of this article appeared in The Perl
Journal #10, and is copyright 1998 The Perl Journal. It appears courtesy
of Jon Orwant and The Perl Journal. This document may be distributed
under the same terms as Perl itself.

=cut