This file documents non-portable functions and other issues.

Non-portable functions included in pthreads-win32
-------------------------------------------------

BOOL
pthread_win32_test_features_np(int mask)

        This routine allows an application to check which
        run-time auto-detected features are available within
        the library.

        The possible features are:

                PTW32_SYSTEM_INTERLOCKED_COMPARE_EXCHANGE
                        Return TRUE if the native version of
                        InterlockedCompareExchange() is being used.
                        This feature is not meaningful in recent
                        library versions as MSVC builds only support
                        system implemented ICE. Note that all Mingw
                        builds use inlined asm versions of all the
                        Interlocked routines.
                PTW32_ALERTABLE_ASYNC_CANCEL
                        Return TRUE if the QueueUserAPCEx package
                        QUSEREX.DLL is available and the AlertDrv.sys
                        driver is loaded into Windows, providing
                        alertable (pre-emptive) asynchronous thread
                        cancelation. If this feature returns FALSE
                        then the default async cancel scheme is in
                        use, which cannot cancel blocked threads.

        Features may be Or'ed into the mask parameter, in which case
        the routine returns TRUE if any of the Or'ed features would
        return TRUE. At this stage it doesn't make sense to Or features
        but it may some day.


void *
pthread_timechange_handler_np(void *)

        This routine improves tolerance against operator or time
        service initiated system clock changes.

        This routine can be called by an application when it
        receives a WM_TIMECHANGE message from the system. At
        present it broadcasts all condition variables so that
        waiting threads can wake up and re-evaluate their
        conditions and restart their timed waits if required.

        It has the same return type and argument type as a
        thread routine so that it may be called directly
        through pthread_create(), i.e. as a separate thread.

        Parameters

        Although a parameter must be supplied, it is ignored.
        The value NULL can be used.

        Return values

        It can return an error EAGAIN to indicate that not
        all condition variables were broadcast for some reason.
        Otherwise, 0 is returned.

        If run as a thread, the return value is returned
        through pthread_join().

        The return value should be cast to an integer.


HANDLE
pthread_getw32threadhandle_np(pthread_t thread);

        Returns the win32 thread handle that the POSIX
        thread "thread" is running as.

        Applications can use the win32 handle to set
        win32 specific attributes of the thread.

DWORD
pthread_getw32threadid_np (pthread_t thread)

        Returns the Windows native thread ID that the POSIX
        thread "thread" is running as.

        Only valid when the library is built where
        ! (defined(__MINGW64__) || defined(__MINGW32__)) || defined (__MSVCRT__) || defined (__DMC__)
        and otherwise returns 0.


int
pthread_mutexattr_setkind_np(pthread_mutexattr_t * attr, int kind)

int
pthread_mutexattr_getkind_np(pthread_mutexattr_t * attr, int *kind)

        These two routines are included for Linux compatibility
        and are direct equivalents to the standard routines
                pthread_mutexattr_settype
                pthread_mutexattr_gettype

        pthread_mutexattr_setkind_np accepts the following
        mutex kinds:
                PTHREAD_MUTEX_FAST_NP
                PTHREAD_MUTEX_ERRORCHECK_NP
                PTHREAD_MUTEX_RECURSIVE_NP

        These are really just equivalent to (respectively):
                PTHREAD_MUTEX_NORMAL
                PTHREAD_MUTEX_ERRORCHECK
                PTHREAD_MUTEX_RECURSIVE

int
pthread_delay_np (const struct timespec *interval);

        This routine causes a thread to delay execution for a specific period of time.
        This period ends at the current time plus the specified interval. The routine
        will not return before the end of the period is reached, but may return an
        arbitrary amount of time after the period has gone by. This can be due to
        system load, thread priorities, and system timer granularity.
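
The interval passed to pthread_delay_np() is an ordinary struct timespec. As a
minimal sketch, the helper below builds one from a millisecond count so that
tv_nsec always stays below one second. The name ms_to_timespec is hypothetical
and is not part of pthreads-win32.

```c
#include <assert.h>
#include <time.h>

/* Hypothetical helper (not a library routine): convert a millisecond count
 * into a timespec whose tv_nsec lies in the range 0..999999999. */
struct timespec ms_to_timespec(unsigned long ms)
{
    struct timespec ts;

    ts.tv_sec  = (time_t)(ms / 1000UL);
    ts.tv_nsec = (long)(ms % 1000UL) * 1000000L;

    /* Sanity check: nanoseconds must stay below one second. */
    assert(ts.tv_nsec >= 0 && ts.tv_nsec < 1000000000L);
    return ts;
}
```

A caller would pass the address of the result to pthread_delay_np() and check
the return value for EINVAL.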

        Specifying an interval of zero (0) seconds and zero (0) nanoseconds is
        allowed and can be used to force the thread to give up the processor or to
        deliver a pending cancelation request.

        This routine is a cancelation point.

        The timespec structure contains the following two fields:

                tv_sec is an integer number of seconds.
                tv_nsec is an integer number of nanoseconds.

        Return Values

        If an error condition occurs, this routine returns an integer value
        indicating the type of error. Possible return values are as follows:

        0               Successful completion.
        [EINVAL]        The value specified by interval is invalid.

int
pthread_num_processors_np (void)

        This routine (found on HPUX systems) returns the number of processors
        in the system. This implementation actually returns the number of
        processors available to the process, which can be a lower number
        than the system's number, depending on the process's affinity mask.

BOOL
pthread_win32_process_attach_np (void);

BOOL
pthread_win32_process_detach_np (void);

BOOL
pthread_win32_thread_attach_np (void);

BOOL
pthread_win32_thread_detach_np (void);

        These functions contain the code normally run via DllMain
        when the library is used as a DLL but which needs to be
        called explicitly by an application when the library
        is statically linked. As of version 2.9.0 of the library, static
        builds using either MSC or GCC will call pthread_win32_process_*
        automatically at application startup and exit respectively.

        Otherwise, you will need to call pthread_win32_process_attach_np()
        before you can call any pthread routines when statically linking.
        You should call pthread_win32_process_detach_np() before
        exiting your application to clean up.

        pthread_win32_thread_attach_np() is currently a no-op, but
        pthread_win32_thread_detach_np() is needed to clean up
        the implicit pthread handle that is allocated to a Win32 thread if
        it calls any pthreads routines. Call this routine when the
        Win32 thread exits.

        Threads created through pthread_create() do not need to call
        pthread_win32_thread_detach_np().

        These functions invariably return TRUE except for
        pthread_win32_process_attach_np() which will return FALSE
        if pthreads-win32 initialisation fails.

int
pthreadCancelableWait (HANDLE waitHandle);

int
pthreadCancelableTimedWait (HANDLE waitHandle, DWORD timeout);

        These two functions provide hooks into the pthread_cancel
        mechanism that will allow you to wait on a Windows handle
        and make it a cancellation point. Both functions block
        until either the given w32 handle is signaled, or
        pthread_cancel has been called. They are implemented using
        WaitForMultipleObjects on 'waitHandle' and a manually
        reset w32 event used to implement pthread_cancel.


Non-portable issues
-------------------

Thread priority

        POSIX defines a single contiguous range of numbers that determine a
        thread's priority. Win32 defines priority classes and priority
        levels relative to these classes. Classes are simply base priority
        levels that the defined priority levels are relative to, such that
        changing a process's priority class will change the priority of all
        of its threads, while the threads retain the same priority relative
        to each other.

        A Win32 system defines a single contiguous monotonic range of values
        that define system priority levels, just like POSIX. However, Win32
        restricts individual threads to a subset of this range on a
        per-process basis.

        The following table shows the base priority levels for combinations
        of priority class and priority value in Win32.

        Base  Process Priority Class            Thread Priority Level
        -----------------------------------------------------------------------
           1  IDLE_PRIORITY_CLASS               THREAD_PRIORITY_IDLE
           1  BELOW_NORMAL_PRIORITY_CLASS       THREAD_PRIORITY_IDLE
           1  NORMAL_PRIORITY_CLASS             THREAD_PRIORITY_IDLE
           1  ABOVE_NORMAL_PRIORITY_CLASS       THREAD_PRIORITY_IDLE
           1  HIGH_PRIORITY_CLASS               THREAD_PRIORITY_IDLE
           2  IDLE_PRIORITY_CLASS               THREAD_PRIORITY_LOWEST
           3  IDLE_PRIORITY_CLASS               THREAD_PRIORITY_BELOW_NORMAL
           4  IDLE_PRIORITY_CLASS               THREAD_PRIORITY_NORMAL
           4  BELOW_NORMAL_PRIORITY_CLASS       THREAD_PRIORITY_LOWEST
           5  IDLE_PRIORITY_CLASS               THREAD_PRIORITY_ABOVE_NORMAL
           5  BELOW_NORMAL_PRIORITY_CLASS       THREAD_PRIORITY_BELOW_NORMAL
           5  Background NORMAL_PRIORITY_CLASS  THREAD_PRIORITY_LOWEST
           6  IDLE_PRIORITY_CLASS               THREAD_PRIORITY_HIGHEST
           6  BELOW_NORMAL_PRIORITY_CLASS       THREAD_PRIORITY_NORMAL
           6  Background NORMAL_PRIORITY_CLASS  THREAD_PRIORITY_BELOW_NORMAL
           7  BELOW_NORMAL_PRIORITY_CLASS       THREAD_PRIORITY_ABOVE_NORMAL
           7  Background NORMAL_PRIORITY_CLASS  THREAD_PRIORITY_NORMAL
           7  Foreground NORMAL_PRIORITY_CLASS  THREAD_PRIORITY_LOWEST
           8  BELOW_NORMAL_PRIORITY_CLASS       THREAD_PRIORITY_HIGHEST
           8  NORMAL_PRIORITY_CLASS             THREAD_PRIORITY_ABOVE_NORMAL
           8  Foreground NORMAL_PRIORITY_CLASS  THREAD_PRIORITY_BELOW_NORMAL
           8  ABOVE_NORMAL_PRIORITY_CLASS       THREAD_PRIORITY_LOWEST
           9  NORMAL_PRIORITY_CLASS             THREAD_PRIORITY_HIGHEST
           9  Foreground NORMAL_PRIORITY_CLASS  THREAD_PRIORITY_NORMAL
           9  ABOVE_NORMAL_PRIORITY_CLASS       THREAD_PRIORITY_BELOW_NORMAL
          10  Foreground NORMAL_PRIORITY_CLASS  THREAD_PRIORITY_ABOVE_NORMAL
          10  ABOVE_NORMAL_PRIORITY_CLASS       THREAD_PRIORITY_NORMAL
          11  Foreground NORMAL_PRIORITY_CLASS  THREAD_PRIORITY_HIGHEST
          11  ABOVE_NORMAL_PRIORITY_CLASS       THREAD_PRIORITY_ABOVE_NORMAL
          11  HIGH_PRIORITY_CLASS               THREAD_PRIORITY_LOWEST
          12  ABOVE_NORMAL_PRIORITY_CLASS       THREAD_PRIORITY_HIGHEST
          12  HIGH_PRIORITY_CLASS               THREAD_PRIORITY_BELOW_NORMAL
          13  HIGH_PRIORITY_CLASS               THREAD_PRIORITY_NORMAL
          14  HIGH_PRIORITY_CLASS               THREAD_PRIORITY_ABOVE_NORMAL
          15  HIGH_PRIORITY_CLASS               THREAD_PRIORITY_HIGHEST
          15  HIGH_PRIORITY_CLASS               THREAD_PRIORITY_TIME_CRITICAL
          15  IDLE_PRIORITY_CLASS               THREAD_PRIORITY_TIME_CRITICAL
          15  BELOW_NORMAL_PRIORITY_CLASS       THREAD_PRIORITY_TIME_CRITICAL
          15  NORMAL_PRIORITY_CLASS             THREAD_PRIORITY_TIME_CRITICAL
          15  ABOVE_NORMAL_PRIORITY_CLASS       THREAD_PRIORITY_TIME_CRITICAL
          16  REALTIME_PRIORITY_CLASS           THREAD_PRIORITY_IDLE
          17  REALTIME_PRIORITY_CLASS           -7
          18  REALTIME_PRIORITY_CLASS           -6
          19  REALTIME_PRIORITY_CLASS           -5
          20  REALTIME_PRIORITY_CLASS           -4
          21  REALTIME_PRIORITY_CLASS           -3
          22  REALTIME_PRIORITY_CLASS           THREAD_PRIORITY_LOWEST
          23  REALTIME_PRIORITY_CLASS           THREAD_PRIORITY_BELOW_NORMAL
          24  REALTIME_PRIORITY_CLASS           THREAD_PRIORITY_NORMAL
          25  REALTIME_PRIORITY_CLASS           THREAD_PRIORITY_ABOVE_NORMAL
          26  REALTIME_PRIORITY_CLASS           THREAD_PRIORITY_HIGHEST
          27  REALTIME_PRIORITY_CLASS           3
          28  REALTIME_PRIORITY_CLASS           4
          29  REALTIME_PRIORITY_CLASS           5
          30  REALTIME_PRIORITY_CLASS           6
          31  REALTIME_PRIORITY_CLASS           THREAD_PRIORITY_TIME_CRITICAL

        Windows NT: Values -7, -6, -5, -4, -3, 3, 4, 5, and 6 are not supported.


        As you can see, the real priority levels available to any individual
        Win32 thread are non-contiguous.

        An application using pthreads-win32 should not make assumptions about
        the numbers used to represent thread priority levels, except that they
        are monotonic between the values returned by sched_get_priority_min()
        and sched_get_priority_max(). E.g. Windows 95, 98, NT, 2000, XP make
        available a non-contiguous range of numbers between -15 and 15, while
        at least one version of WinCE (3.0) defines the minimum priority
        (THREAD_PRIORITY_LOWEST) as 5, and the maximum priority
        (THREAD_PRIORITY_HIGHEST) as 1.
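
One way to avoid hard-coded priority numbers is to ask the implementation for
its end points and keep a requested value between them, whichever end point is
numerically larger. The sketch below assumes only the standard
sched_get_priority_min() and sched_get_priority_max() calls; the helper names
are hypothetical, error returns from the sched_* calls are not handled, and
the result is only guaranteed to lie inside the reported range, not to be one
of the distinct levels within it.

```c
#include <assert.h>
#include <sched.h>

/* Hypothetical helper: move `wanted` to the nearest value lying between the
 * two end points, whichever of them is numerically larger. */
int clamp_between(int wanted, int end_a, int end_b)
{
    int lo = end_a < end_b ? end_a : end_b;
    int hi = end_a < end_b ? end_b : end_a;
    int result = wanted < lo ? lo : (wanted > hi ? hi : wanted);

    assert(result >= lo && result <= hi);
    return result;
}

/* Hypothetical wrapper: query the range for `policy` instead of assuming it. */
int usable_priority(int policy, int wanted)
{
    return clamp_between(wanted,
                         sched_get_priority_min(policy),
                         sched_get_priority_max(policy));
}
```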

        Internally, pthreads-win32 maps any priority levels between
        THREAD_PRIORITY_IDLE and THREAD_PRIORITY_LOWEST to THREAD_PRIORITY_LOWEST,
        or between THREAD_PRIORITY_TIME_CRITICAL and THREAD_PRIORITY_HIGHEST to
        THREAD_PRIORITY_HIGHEST. Currently, this also applies to
        REALTIME_PRIORITY_CLASS even if levels -7, -6, -5, -4, -3, 3, 4, 5, and 6
        are supported.

        If it wishes, a Win32 application using pthreads-win32 can use the Win32
        defined priority macros THREAD_PRIORITY_IDLE through
        THREAD_PRIORITY_TIME_CRITICAL.


The opacity of the pthread_t datatype
-------------------------------------
and possible solutions for portable null/compare/hash, etc
----------------------------------------------------------

Because pthread_t is an opaque datatype an implementation is permitted to define
pthread_t in any way it wishes. That includes defining some bits, if it is
scalar, or members, if it is an aggregate, to store information that may be
extra to the unique identifying value of the ID. As a result, pthread_t values
may not be directly comparable.

If you want your code to be portable you must adhere to the following constraints:

1) Don't assume it is a scalar data type, e.g. an integer or pointer value. There
are several other implementations where pthread_t is also a struct. See our FAQ
Question 11 for our reasons for defining pthread_t as a struct.

2) You must not compare them using relational or equality operators. You must use
the API function pthread_equal() to test for equality.

3) Never attempt to reference individual members.


The problem

Certain applications would like to be able to access only the 'pure' pthread_t
id values, primarily to use as keys into data structures to manage threads or
thread-related data, but this is not possible in a maximally portable and
standards compliant way for current POSIX threads implementations.
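
What the constraints above do allow is a linear search using pthread_equal().
As a rough sketch, a small application-level table keyed by thread ID might
look like the following. All names are hypothetical, no locking is shown, and
each lookup is O(n), which is exactly the cost that an ordering or hashing
function would avoid.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

#define TID_TABLE_MAX 64   /* arbitrary capacity chosen for this sketch */

/* Hypothetical application-level table of thread IDs and attached data. */
typedef struct {
    pthread_t ids[TID_TABLE_MAX];
    void     *data[TID_TABLE_MAX];
    size_t    count;
} tid_table_t;

/* Return the slot holding `id`, or -1 if absent. Only pthread_equal() is
 * used to compare, so scalar and struct pthread_t types both work. */
int tid_table_find(const tid_table_t *tab, pthread_t id)
{
    size_t i;

    assert(tab != NULL);
    for (i = 0; i < tab->count; i++) {
        if (pthread_equal(tab->ids[i], id))
            return (int)i;
    }
    return -1;
}

/* Add `id` if absent and return its slot, or -1 when the table is full. */
int tid_table_put(tid_table_t *tab, pthread_t id, void *data)
{
    int slot = tid_table_find(tab, id);

    if (slot < 0) {
        if (tab->count == TID_TABLE_MAX)
            return -1;
        slot = (int)tab->count++;
        tab->ids[slot] = id;
    }
    tab->data[slot] = data;
    return slot;
}
```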

For implementations that define pthread_t as a scalar, programmers often employ
direct relational and equality operators on pthread_t. This code will break when
ported to an implementation that defines pthread_t as an aggregate type.

For implementations that define pthread_t as an aggregate, e.g. a struct,
programmers can use memcmp etc., but then face the prospect that the struct may
include alignment padding bytes or bits as well as extra implementation-specific
members that are not part of the unique identifying value.

[While this is not currently the case for pthreads-win32, opacity also
means that an implementation is free to change the definition, which should
generally only require that applications be recompiled and relinked, not
rewritten.]


Doesn't the compiler take care of padding?

The C89 and later standards only effectively guarantee element-by-element
equivalence following an assignment or pass by value of a struct or union.
Undefined areas of any two otherwise equivalent pthread_t instances can
therefore still compare differently, so comparing two such pthread_t
variables byte-by-byte, e.g. memcmp(&t1, &t2, sizeof(pthread_t)), may give an
incorrect result. In practice I'm reasonably confident that compilers routinely
also copy the padding bytes, mainly because assignment of unions would be far
too complicated otherwise. But it just isn't guaranteed by the standard.

Illustration:

We have two thread IDs t1 and t2

pthread_t t1, t2;

In an application we create the threads and intend to store the thread IDs in an
ordered data structure (linked list, tree, etc) so we need to be able to compare
them in order to insert them initially and also to traverse.
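
The padding concern can be reproduced with an ordinary struct. The sketch below
uses a stand-in type (fake_tid_t, which is not the pthreads-win32 definition)
and contrasts member-by-member equality with byte-wise equality. Zero-filling
before setting the members is a practical measure only: strictly, the standard
lets padding take unspecified values whenever a member is stored, although
compilers generally leave it alone.

```c
#include <assert.h>
#include <string.h>

/* Stand-in for an aggregate thread ID. On many ABIs there are padding
 * bytes between `tag` and `seq`. */
typedef struct {
    char tag;
    long seq;
} fake_tid_t;

/* Member-by-member equality: well defined whatever the padding holds. */
int fake_tid_equal(const fake_tid_t *a, const fake_tid_t *b)
{
    assert(a != NULL && b != NULL);
    return a->tag == b->tag && a->seq == b->seq;
}

/* Byte-wise equality: only meaningful if the padding bytes are known to
 * match in both objects. */
int fake_tid_bytes_equal(const fake_tid_t *a, const fake_tid_t *b)
{
    assert(a != NULL && b != NULL);
    return memcmp(a, b, sizeof *a) == 0;
}

/* Zero-fill the whole object first so that padding starts out known. */
void fake_tid_init(fake_tid_t *t, char tag, long seq)
{
    assert(t != NULL);
    memset(t, 0, sizeof *t);
    t->tag = tag;
    t->seq = seq;
}
```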

Suppose pthread_t contains undefined padding bits and our compiler copies our
pthread_t [struct] element-by-element, then for the assignment:

pthread_t temp = t1;

temp and t1 will be equivalent and correct but a byte-for-byte comparison such as
memcmp(&temp, &t1, sizeof(pthread_t)) == 0 may not return true as we expect because
the undefined bits may not have the same values in the two variable instances.

Similarly if passing by value under the same conditions.

If, on the other hand, the undefined bits are at least constant through every
assignment and pass-by-value then the byte-for-byte comparison
memcmp(&temp, &t1, sizeof(pthread_t)) == 0 will always return the expected result.
How can we force the behaviour we need?


Solutions

Adding new functions to the standard API or as non-portable extensions is
the only reliable and portable way to provide the necessary operations.
Remember also that POSIX is not tied to the C language. The most common
functions that have been suggested are:

pthread_null()
pthread_compare()
pthread_hash()

A single more general purpose function could also be defined as a
basis for at least the last two of the above functions.

First we need to list the freedoms and constraints with respect
to pthread_t so that we can be sure our solution is compatible with the
standard.

What is known or may be deduced from the standard:
1) pthread_t must be able to be passed by value, so it must be a single object.
2) from (1) it must be copyable so cannot embed thread-state information, locks
or other volatile objects required to manage the thread it associates with.
3) pthread_t may carry additional information, e.g. for debugging or to manage
itself.
4) there is an implicit requirement that the size of pthread_t is determinable
at compile-time and size-invariant, because it must be able to copy the object
(i.e. through assignment and pass-by-value). Such copies must be genuine
duplicates, not merely a copy of a pointer to a common instance such as
would be the case if pthread_t were defined as an array.


Suppose we define the following function:

/* This function shall return its argument */
pthread_t* pthread_normalize(pthread_t* thread);

For scalar or aggregate pthread_t types this function would simply zero any bits
within the pthread_t that don't uniquely identify the thread, including padding,
such that client code can obtain consistent results from operations done on the
result. If the additional bits are a pointer to an associate structure then
this function would ensure that the memory used to store that associate
structure does not leak. After normalization the following compare would be
valid and repeatable:

memcmp(pthread_normalize(&t1),pthread_normalize(&t2),sizeof(pthread_t))

Note 1: such comparisons are intended merely to order and sort pthread_t values
and allow them to index various data structures. They are not intended to reveal
anything about the relationships between threads, like startup order.

Note 2: the normalized pthread_t is also a valid pthread_t that uniquely
identifies the same thread.

Advantages:
1) In most existing implementations this function would reduce to a no-op that
emits no additional instructions, i.e. after in-lining or optimisation, or if
defined as a macro:
#define pthread_normalize(tptr) (tptr)

2) This single function allows an application to portably derive
application-level versions of any of the other required functions.

3) It is a generic function that could enable unanticipated uses.

Disadvantages:
1) Less efficient than dedicated compare or hash functions for implementations
that include significant extra non-id elements in pthread_t.

2) Still need to be concerned about padding if copying normalized pthread_t.
See the later section on defining pthread_t to neutralise padding issues.

Generally a pthread_t may need to be normalized every time it is used,
which could have a significant impact. However, this is a design decision
for the implementor in a competitive environment. An implementation is free
to define a pthread_t in a way that minimises or eliminates padding or
renders this function a no-op.

Hazards:
1) Pass-by-reference directly modifies 'thread' so the application must
synchronise access or ensure that the pointer refers to a copy. The alternative
of pass-by-value/return-by-value was considered but then this requires two copy
operations, disadvantaging implementations where this function is not a no-op
in terms of speed of execution. This function is intended to be used in high
frequency situations and needs to be efficient, or at least not unnecessarily
inefficient. The alternative also sits awkwardly with functions like memcmp.

2) [Non-compliant] code that uses relational and equality operators on
arithmetic or pointer style pthread_t types would need to be rewritten, but it
should be rewritten anyway.


C implementation of null/compare/hash functions using pthread_normalize():

/* In pthread.h */
pthread_t* pthread_normalize(pthread_t* thread);

/* In user code */
#include <string.h>  /* memcmp, size_t */

/* User-level bitclear function - clear bits in loc corresponding to mask */
void* bitclear (void* loc, void* mask, size_t count);

typedef unsigned int hash_t;

/* User-level hash function */
hash_t hash(void* ptr, size_t count);

/*
 * User-level pthr_null function - modifies the origin thread handle.
 * The concept of a null pthread_t is highly implementation dependent
 * and this design may be far from the mark. For example, in an
 * implementation "null" may mean setting a special value inside one
 * element of pthread_t to mean "INVALID". However, if that value was zero and
 * formed part of the id component then we may get away with this design.
 */
pthread_t* pthr_null(pthread_t* tp)
{
  /*
   * This should have the same effect as memset(tp, 0, sizeof(pthread_t))
   * We're just showing that we can do it.
   */
  void* p = (void*) pthread_normalize(tp);
  return (pthread_t*) bitclear(p, p, sizeof(pthread_t));
}

/*
 * Safe user-level pthr_compare function - modifies temporary thread handle copies
 */
int pthr_compare_safe(pthread_t thread1, pthread_t thread2)
{
  return memcmp(pthread_normalize(&thread1), pthread_normalize(&thread2), sizeof(pthread_t));
}

/*
 * Fast user-level pthr_compare function - modifies origin thread handles
 */
int pthr_compare_fast(pthread_t* thread1, pthread_t* thread2)
{
  /* The arguments are already pointers, so pass them through unchanged */
  return memcmp(pthread_normalize(thread1), pthread_normalize(thread2), sizeof(pthread_t));
}

/*
 * Safe user-level pthr_hash function - modifies temporary thread handle copy
 */
hash_t pthr_hash_safe(pthread_t thread)
{
  return hash((void *) pthread_normalize(&thread), sizeof(pthread_t));
}

/*
 * Fast user-level pthr_hash function - modifies origin thread handle
 */
hash_t pthr_hash_fast(pthread_t* thread)
{
  return hash((void *) pthread_normalize(thread), sizeof(pthread_t));
}

/* User-level bitclear function - modifies the origin array */
void* bitclear(void* loc, void* mask, size_t count)
{
  unsigned char* l = (unsigned char*) loc;
  const unsigned char* m = (const unsigned char*) mask;
  size_t i;
  for (i = 0; i < count; i++) {
    l[i] &= (unsigned char) ~m[i];
  }
  return loc;
}

/* Donald Knuth hash */
hash_t hash(void* ptr, size_t count)
{
  const unsigned char* p = (const unsigned char*) ptr;
  hash_t h = (hash_t) count;
  size_t i;

  for (i = 0; i < count; i++)
    {
      h = ((h << 5) ^ (h >> 27)) ^ p[i];
    }
  return h;
}

/* Example of advantage point (3) - split a thread handle into its id and non-id values */
pthread_t id = thread, non_id = thread;
bitclear((void*) &non_id, (void*) pthread_normalize(&id), sizeof(pthread_t));


A pthread_t type change proposal to neutralise the effects of padding

Even if pthread_normalize() is available, padding is still a problem because
the standard only guarantees element-by-element equivalence through
copy operations (assignment and pass-by-value). So padding bit values can
still change randomly after calls to pthread_normalize().

[I suspect that most compilers take the easy path and always byte-copy anyway,
partly because it becomes too complex to do otherwise (e.g. unions that contain
sub-aggregates) but also because programmers can easily design their aggregates
to minimise and often eliminate padding].

How can we eliminate the problem of padding bytes in structs? Could
defining pthread_t as a union rather than a struct provide a solution?

In fact, the Linux pthread.h defines most of its pthread_*_t objects (but not
pthread_t itself) as unions, possibly for this and/or other reasons. We'll
borrow some element naming from there but the ideas themselves are well known
- the __align element used to force alignment of the union comes from K&R's
storage allocator example.

/* Essentially our current pthread_t renamed */
typedef struct {
  struct thread_state_t * __p;
  long __x; /* sequence counter */
} thread_id_t;

Ensuring that the last element in the above struct is a long ensures that the
overall struct size is a multiple of sizeof(long), so there should be no trailing
padding in this struct or the union we define below.
(Later we'll see that we can handle internal but not trailing padding.)
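
Whether a struct of this shape really ends without trailing padding depends on
the ABI (for example, where long is narrower than a pointer the struct may be
rounded up to pointer alignment), so it seems worth measuring rather than
assuming. The sketch below uses a stand-in type with non-reserved member names
(sketch_tid_t, p and x are hypothetical) and reports both kinds of padding via
offsetof. It assumes a C11 compiler for static_assert.

```c
#include <assert.h>   /* static_assert (C11) */
#include <stddef.h>   /* offsetof, size_t */

struct thread_state_t;              /* opaque here, as in the text above */

/* Stand-in with the same shape as thread_id_t. */
typedef struct {
    struct thread_state_t *p;
    long x;                         /* sequence counter */
} sketch_tid_t;

static_assert(offsetof(sketch_tid_t, p) == 0,
              "the first member starts the struct");

/* Bytes between the end of the last member and the end of the struct. */
size_t sketch_tid_trailing_padding(void)
{
    return sizeof(sketch_tid_t) - (offsetof(sketch_tid_t, x) + sizeof(long));
}

/* Bytes between the end of the first member and the start of the last. */
size_t sketch_tid_internal_padding(void)
{
    return offsetof(sketch_tid_t, x) - sizeof(struct thread_state_t *);
}
```

If sketch_tid_trailing_padding() returns a non-zero value on a given platform,
the no-trailing-padding assumption does not hold there.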

/* New pthread_t */
typedef union {
  char __size[sizeof(thread_id_t)]; /* array as the first element */
  thread_id_t __tid;
  long __align;  /* Ensure that the union starts on long boundary */
} pthread_t;

This guarantees that, during an assignment or pass-by-value, the compiler copies
every byte in our thread_id_t because the compiler guarantees that the __size
array, which we have ensured is the equal-largest element in the union, retains
equivalence.

This means that pthread_t values stored, assigned and passed by value will at least
carry the value of any undefined padding bytes along and therefore ensure that
those values remain consistent. Our comparisons will return consistent results and
our hashes of [zero initialised] pthread_t values will also return consistent
results.

We have also removed the need for a pthread_null() function; we can initialise
at declaration time or easily create our own const pthread_t to use in assignments
later:

const pthread_t null_tid = {0}; /* braces are required */

pthread_t t;
...
t = null_tid;


Note that we don't have to explicitly make use of the __size array at all. It's
there just to force the compiler behaviour we want.


Partial solutions without a pthread_normalize function


An application-level pthread_null and pthread_compare proposal
(and pthread_hash proposal by extension)

In order to deal with the problem of scalar/aggregate pthread_t type disparity in
portable code I suggest using an old-fashioned union, e.g.:

Constraints:
- there is no padding, or padding values are preserved through assignment and
  pass-by-value (see above);
- there are no extra non-id values in the pthread_t.


Example 1: A null initialiser for pthread_t variables...

typedef union {
  unsigned char b[sizeof(pthread_t)];
  pthread_t t;
} init_t;

const init_t initial = {0};

pthread_t tid = initial.t; /* init tid to all zeroes */


Example 2: A comparison function for pthread_t values

typedef union {
  unsigned char b[sizeof(pthread_t)];
  pthread_t t;
} pthcmp_t;

int pthcmp(pthread_t left, pthread_t right)
{
  /*
   * Compare two pthread handles in a way that imposes a repeatable but arbitrary
   * ordering on them.
   * I.e. given the same set of pthread_t handles the ordering should be the same
   * each time but the order has no particular meaning other than that. E.g.
   * the ordering does not imply the thread start sequence, or any other
   * relationship between threads.
   *
   * Return values are:
   *  1 : left is greater than right
   *  0 : left is equal to right
   * -1 : left is less than right
   */
  size_t i;
  pthcmp_t L, R;
  L.t = left;
  R.t = right;
  for (i = 0; i < sizeof(pthread_t); i++)
    {
      if (L.b[i] > R.b[i])
        return 1;
      else if (L.b[i] < R.b[i])
        return -1;
    }
  return 0;
}

It has been pointed out that the C99 standard allows for the possibility that
integer types also may include padding bits, which could invalidate the above
method. This addition to C99 was specifically included after it was pointed
out that there was one, presumably not particularly well known, architecture
that included a padding bit in its 32 bit integer type. See section 6.2.6.2
of both the standard and the rationale, specifically the paragraph starting at
line 16 on page 43 of the rationale.


An aside

Certain compilers, e.g. gcc and one of the IBM compilers, include a feature
extension: provided the union contains a member of the same type as the
object then the object may be cast to the union itself.

We could use this feature to speed up the pthcmp() function from example 2
above by casting rather than assigning the pthread_t arguments to the union, e.g.:

int pthcmp(pthread_t left, pthread_t right)
{
  /*
   * Compare two pthread handles in a way that imposes a repeatable but arbitrary
   * ordering on them.
   * I.e. given the same set of pthread_t handles the ordering should be the same
   * each time but the order has no particular meaning other than that. E.g.
   * the ordering does not imply the thread start sequence, or any other
   * relationship between threads.
   *
   * Return values are:
   *  1 : left is greater than right
   *  0 : left is equal to right
   * -1 : left is less than right
   */
  size_t i;
  for (i = 0; i < sizeof(pthread_t); i++)
    {
      if (((pthcmp_t)left).b[i] > ((pthcmp_t)right).b[i])
        return 1;
      else if (((pthcmp_t)left).b[i] < ((pthcmp_t)right).b[i])
        return -1;
    }
  return 0;
}


Result thus far

We can't remove undefined bits if they are there in pthread_t already, but we have
attempted to render them inert for comparison and hashing functions by making them
consistent through assignment, copy and pass-by-value.

Note: Hashing pthread_t values requires that all pthread_t variables be initialised
to the same value (usually all zeros) before being assigned a proper thread ID, i.e.
to ensure that any padding bits are zero, or at least the same value for all
pthread_t. Since all pthread_t values are generated by the library in the first
instance this need not be an application-level operation.


Conclusion

I've attempted to resolve the multiple issues of type opacity and the possible
presence of undefined bits and bytes in pthread_t values, which prevent
applications from comparing or hashing pthread handles.

Two complementary partial solutions have been proposed, one an application-level
scheme to handle both scalar and aggregate pthread_t types equally, plus a
definition of pthread_t itself that neutralises padding bits and bytes by
coercing semantics out of the compiler to eliminate variations in the values of
padding bits.

I have not provided any solution to the problem of handling extra values embedded
in pthread_t, e.g. debugging or trap information that an implementation is entitled
to include. Therefore none of this replaces the portability and flexibility of API
functions, but what functions are needed? The threads standard is unlikely to
include functions that can be implemented by a combination of existing features
and more generic functions (several references in the threads rationale suggest
this). Therefore I propose that the following function could replace the several
functions that have been suggested in conversations:

pthread_t * pthread_normalize(pthread_t * handle);

For most existing pthreads implementations this function, or macro, would reduce to
a no-op with zero call overhead.
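
As a closing illustration of the application-level scheme, the hash counterpart
to pthcmp() might be sketched as follows. It is only valid under the same
constraints as pthcmp() (no extra non-id members, padding absent or preserved),
it assumes a 32 bit unsigned int like the hash shown earlier, and the names
pthhash and pthhash_t are invented here. It assumes a C11 compiler for
static_assert.

```c
#include <assert.h>   /* static_assert (C11) */
#include <pthread.h>
#include <string.h>

typedef union {
    unsigned char b[sizeof(pthread_t)];
    pthread_t t;
} pthhash_t;

static_assert(sizeof(pthhash_t) == sizeof(pthread_t),
              "the union adds no bytes of its own");

/* Hypothetical application-level hash over the bytes of a pthread_t. */
unsigned int pthhash(pthread_t tid)
{
    pthhash_t u;
    unsigned int h = (unsigned int) sizeof(pthread_t);
    size_t i;

    memset(&u, 0, sizeof u);   /* start every byte from a known value */
    u.t = tid;
    for (i = 0; i < sizeof(pthread_t); i++)
        h = ((h << 5) ^ (h >> 27)) ^ u.b[i];
    return h;
}
```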