=head1 NAME

perliol - C API for Perl's implementation of IO in Layers.

=head1 SYNOPSIS

    /* Defining a layer ... */
    #include <perliol.h>

=head1 DESCRIPTION

This document describes the behavior and implementation of the PerlIO
abstraction described in L<perlapio> when C<USE_PERLIO> is defined (and
C<USE_SFIO> is not).

=head2 History and Background

The PerlIO abstraction was introduced in perl5.003_02 but languished as
just an abstraction until perl5.7.0. However, during that time a number
of perl extensions switched to using it, so the API is mostly fixed to
maintain (source) compatibility.

The aim of the implementation is to provide the PerlIO API in a flexible
and platform neutral manner. It is also a trial of an "Object Oriented
C, with vtables" approach which may be applied to Perl 6.

=head2 Basic Structure

PerlIO is a stack of layers.

The low levels of the stack work with the low-level operating system
calls (file descriptors in C) getting bytes in and out; the higher
layers of the stack buffer, filter, and otherwise manipulate the I/O,
and return characters (or bytes) to Perl. The terms I<above> and I<below>
are used to refer to the relative positioning of the stack layers.

A layer contains a "vtable", the table of I/O operations (at C level
a table of function pointers), and status flags. The functions in the
vtable implement operations like "open", "read", and "write".

When I/O, for example "read", is requested, the request goes from Perl
first down the stack using "read" functions of each layer, then at the
bottom the input is requested from the operating system services, then
the result is returned up the stack, finally being interpreted as Perl
data.

The requests do not necessarily always go all the way down to the
operating system: that's where PerlIO buffering comes into play.
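
The way a "read" request travels down a stack of layers through each
layer's vtable can be pictured with a small stand-alone model. The sketch
below is only an illustration: every name in it (C<toy_layer>,
C<toy_vtable>, C<toy_os_read>, C<toy_upper_read>, C<toy_read_via_stack>)
is hypothetical and none of it is the real PerlIO API. It assumes a
two-layer stack in which the bottom layer stands in for the operating
system and the upper layer upper-cases whatever comes back from below.

```c
#include <stddef.h>
#include <string.h>

/* Toy model only: these are NOT the real PerlIO types. */
typedef struct toy_layer toy_layer;

typedef struct {
    const char *name;
    /* Read up to count bytes into buf; returns the bytes read. */
    size_t (*read)(toy_layer *self, char *buf, size_t count);
} toy_vtable;

struct toy_layer {
    toy_layer        *next;   /* layer below, NULL at the bottom */
    const toy_vtable *tab;    /* "vtable" of operations */
    int               flags;  /* status flags (unused in this sketch) */
    const char       *src;    /* bottom layer only: pretend OS data */
    size_t            pos;    /* bottom layer only: read offset */
};

/* Bottom layer: stands in for the operating system call. */
static size_t toy_os_read(toy_layer *self, char *buf, size_t count)
{
    size_t left = strlen(self->src) - self->pos;
    size_t n = count < left ? count : left;
    memcpy(buf, self->src + self->pos, n);
    self->pos += n;
    return n;
}

/* Upper layer: asks the layer below, then transforms the result. */
static size_t toy_upper_read(toy_layer *self, char *buf, size_t count)
{
    size_t n = self->next->tab->read(self->next, buf, count);
    for (size_t i = 0; i < n; i++)
        if (buf[i] >= 'a' && buf[i] <= 'z')
            buf[i] = (char)(buf[i] - 'a' + 'A');
    return n;
}

static const toy_vtable toy_os_tab    = { "toy_os",    toy_os_read };
static const toy_vtable toy_upper_tab = { "toy_upper", toy_upper_read };

/* Read through the top of a two-layer stack built over `data`. */
static size_t toy_read_via_stack(const char *data, char *buf, size_t count)
{
    toy_layer bottom = { NULL,    &toy_os_tab,    0, data, 0 };
    toy_layer top    = { &bottom, &toy_upper_tab, 0, NULL, 0 };
    return top.tab->read(&top, buf, count);
}
```

The caller only ever talks to the top layer; the request reaches the
bottom layer through the C<next> pointer, and the data is transformed on
the way back up. A buffering layer would differ only in that it could
answer some requests from its own buffer without calling the layer below.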

When you do an open() and specify extra PerlIO layers to be deployed,
the layers you specify are "pushed" on top of the already existing
default stack. One way to see it is that "operating system is
on the left" and "Perl is on the right".

What exact layers are in this default stack depends on a lot of
things: your operating system, Perl version, Perl compile time
configuration, and Perl runtime configuration. See L<PerlIO>,
L<perlrun/PERLIO>, and L<open> for more information.

binmode() operates similarly to open(): by default the specified
layers are pushed on top of the existing stack.

However, note that even as the specified layers are "pushed on top"
for open() and binmode(), this doesn't mean that the effects are
limited to the "top": PerlIO layers can be very 'active' and inspect
and affect layers deeper in the stack as well. As an example there
is a layer called "raw" which repeatedly "pops" layers until
it reaches the first layer that has declared itself capable of
handling binary data. The "pushed" layers are processed in left-to-right
order.

sysopen() operates (unsurprisingly) at a lower level in the stack than
open(). For example in Unix or Unix-like systems sysopen() operates
directly at the level of file descriptors: in the terms of PerlIO
layers, it uses only the "unix" layer, which is a rather thin wrapper
on top of the Unix file descriptors.

=head2 Layers vs Disciplines

Initial discussion of the ability to modify IO stream behaviour used
the term "discipline" for the entities which were added. This came (I
believe) from the use of the term in "sfio", which in turn borrowed it
from "line disciplines" on Unix terminals. However, this document (and
the C code) uses the term "layer".

This is, I hope, a natural term given the implementation, and should
avoid connotations that are inherent in earlier uses of "discipline"
for things which are rather different.

=head2 Data Structures

The basic data structure is a PerlIOl:

 typedef struct _PerlIO PerlIOl;
 typedef struct _PerlIO_funcs PerlIO_funcs;
 typedef PerlIOl *PerlIO;

 struct _PerlIO
 {
  PerlIOl *      next;   /* Lower layer */
  PerlIO_funcs * tab;    /* Functions for this layer */
  IV             flags;  /* Various flags for state */
 };

A C<PerlIOl *> is a pointer to the struct, and the I<application>
level C<PerlIO *> is a pointer to a C<PerlIOl *> - i.e. a pointer
to a pointer to the struct. This allows the application level C<PerlIO *>
to remain constant while the actual C<PerlIOl *> underneath
changes. (Compare perl's C<SV *> which remains constant while its
C<sv_any> field changes as the scalar's type changes.) An IO stream is
then in general represented as a pointer to this linked-list of
"layers".

It should be noted that because of the double indirection in a C<PerlIO *>,
a C<< &(perlio->next) >> "is" a C<PerlIO *>, and so to some degree
at least one layer can use the "standard" API on the next layer down.

A "layer" is composed of two parts:

=over 4

=item 1.

The functions and attributes of the "layer class".

=item 2.

The per-instance data for a particular handle.

=back

=head2 Functions and Attributes

The functions and attributes are accessed via the "tab" (for table)
member of C<PerlIOl>. The functions (methods of the layer "class") are
fixed, and are defined by the C<PerlIO_funcs> type.
They are broadly the same as the public C<PerlIO_xxxxx> functions:

 struct _PerlIO_funcs
 {
  Size_t     fsize;
  char *     name;
  Size_t     size;
  IV         kind;
  IV         (*Pushed)(pTHX_ PerlIO *f,const char *mode,SV *arg, PerlIO_funcs *tab);
  IV         (*Popped)(pTHX_ PerlIO *f);
  PerlIO *   (*Open)(pTHX_ PerlIO_funcs *tab,
                     PerlIO_list_t *layers, IV n,
                     const char *mode,
                     int fd, int imode, int perm,
                     PerlIO *old,
                     int narg, SV **args);
  IV         (*Binmode)(pTHX_ PerlIO *f);
  SV *       (*Getarg)(pTHX_ PerlIO *f, CLONE_PARAMS *param, int flags);
  IV         (*Fileno)(pTHX_ PerlIO *f);
  PerlIO *   (*Dup)(pTHX_ PerlIO *f, PerlIO *o, CLONE_PARAMS *param, int flags);
  /* Unix-like functions - cf sfio line disciplines */
  SSize_t    (*Read)(pTHX_ PerlIO *f, void *vbuf, Size_t count);
  SSize_t    (*Unread)(pTHX_ PerlIO *f, const void *vbuf, Size_t count);
  SSize_t    (*Write)(pTHX_ PerlIO *f, const void *vbuf, Size_t count);
  IV         (*Seek)(pTHX_ PerlIO *f, Off_t offset, int whence);
  Off_t      (*Tell)(pTHX_ PerlIO *f);
  IV         (*Close)(pTHX_ PerlIO *f);
  /* Stdio-like buffered IO functions */
  IV         (*Flush)(pTHX_ PerlIO *f);
  IV         (*Fill)(pTHX_ PerlIO *f);
  IV         (*Eof)(pTHX_ PerlIO *f);
  IV         (*Error)(pTHX_ PerlIO *f);
  void       (*Clearerr)(pTHX_ PerlIO *f);
  void       (*Setlinebuf)(pTHX_ PerlIO *f);
  /* Perl's snooping functions */
  STDCHAR *  (*Get_base)(pTHX_ PerlIO *f);
  Size_t     (*Get_bufsiz)(pTHX_ PerlIO *f);
  STDCHAR *  (*Get_ptr)(pTHX_ PerlIO *f);
  SSize_t    (*Get_cnt)(pTHX_ PerlIO *f);
  void       (*Set_ptrcnt)(pTHX_ PerlIO *f,STDCHAR *ptr,SSize_t cnt);
 };

The first few members of the struct give a function table size for a
compatibility check, a "name" for the layer, the size to C<malloc> for
the per-instance data, and some flags which are attributes of the class
as a whole (such as whether it is a buffering layer). The functions then
follow, falling into four basic groups:

=over 4

=item 1.

Opening and setup functions

=item 2.

Basic IO operations

=item 3.

Stdio class buffering options.

=item 4.

Functions to support Perl's traditional "fast" access to the buffer.

=back

A layer does not have to implement all the functions, but the whole
table has to be present. Unimplemented slots can be NULL (which will
result in an error when called) or can be filled in with stubs to
"inherit" behaviour from a "base class". This "inheritance" is fixed
for all instances of the layer, but as the layer chooses which stubs
to populate the table, limited "multiple inheritance" is possible.

=head2 Per-instance Data

The per-instance data are held in memory beyond the basic PerlIOl
struct, by making a PerlIOl the first member of the layer's struct
thus:

 typedef struct
 {
  struct _PerlIO base;    /* Base "class" info */
  STDCHAR *      buf;     /* Start of buffer */
  STDCHAR *      end;     /* End of valid part of buffer */
  STDCHAR *      ptr;     /* Current position in buffer */
  Off_t          posn;    /* Offset of buf into the file */
  Size_t         bufsiz;  /* Real size of buffer */
  IV             oneword; /* Emergency buffer */
 } PerlIOBuf;

In this way (as for perl's scalars) a pointer to a PerlIOBuf can be
treated as a pointer to a PerlIOl.

=head2 Layers in action.

              table           perlio          unix
          |           |
          +-----------+    +----------+    +--------+
 PerlIO ->|           |--->|  next    |--->|  NULL  |
          +-----------+    +----------+    +--------+
          |           |    |  buffer  |    |   fd   |
          +-----------+    |          |    +--------+
          |           |    +----------+


The above attempts to show how the layer scheme works in a simple case.
The application's C<PerlIO *> points to an entry in the table(s)
representing open (allocated) handles. For example the first three slots
in the table correspond to C<stdin>, C<stdout> and C<stderr>.
The table
in turn points to the current "top" layer for the handle - in this case
an instance of the generic buffering layer "perlio". That layer in turn
points to the next layer down - in this case the low-level "unix" layer.

The above is roughly equivalent to a "stdio" buffered stream, but with
much more flexibility:

=over 4

=item *

If Unix level C<read>/C<write>/C<lseek> is not appropriate for (say)
sockets then the "unix" layer can be replaced (at open time or even
dynamically) with a "socket" layer.

=item *

Different handles can have different buffering schemes. The "top"
layer could be the "mmap" layer if reading disk files was quicker
using C<mmap> than C<read>. An "unbuffered" stream can be implemented
simply by not having a buffer layer.

=item *

Extra layers can be inserted to process the data as it flows through.
This was the driving need for including the scheme in perl 5.7.0+ - we
needed a mechanism to allow data to be translated between perl's
internal encoding (conceptually at least Unicode as UTF-8), and the
"native" format used by the system. This is provided by the
":encoding(xxxx)" layer which typically sits above the buffering layer.

=item *

A layer can be added that does "\n" to CRLF translation. This layer
can be used on any platform, not just those that normally do such
things.

=back

=head2 Per-instance flag bits

The generic flag bits are a hybrid of C<O_XXXXX> style flags deduced
from the mode string passed to C<PerlIO_open()>, and state bits for
typical buffer layers.

=over 4

=item PERLIO_F_EOF

End of file.

=item PERLIO_F_CANWRITE

Writes are permitted, i.e. opened as "w" or "r+" or "a", etc.

=item PERLIO_F_CANREAD

Reads are permitted i.e. opened "r" or "w+" (or even "a+" - ick).

=item PERLIO_F_ERROR

An error has occurred (for C<PerlIO_error()>).

=item PERLIO_F_TRUNCATE

Truncate file suggested by open mode.

=item PERLIO_F_APPEND

All writes should be appends.

=item PERLIO_F_CRLF

Layer is performing Win32-like "\n" mapped to CR,LF for output and CR,LF
mapped to "\n" for input. Normally the provided "crlf" layer is the only
layer that need bother about this. C<PerlIO_binmode()> will mess with this
flag rather than add/remove layers if the C<PERLIO_K_CANCRLF> bit is set
for the layer's class.

=item PERLIO_F_UTF8

Data written to this layer should be UTF-8 encoded; data provided
by this layer should be considered UTF-8 encoded. Can be set on any layer
by the ":utf8" dummy layer. Also set on the ":encoding" layer.

=item PERLIO_F_UNBUF

Layer is unbuffered - i.e. write to next layer down should occur for
each write to this layer.

=item PERLIO_F_WRBUF

The buffer for this layer currently holds data written to it but not sent
to next layer.

=item PERLIO_F_RDBUF

The buffer for this layer currently holds unconsumed data read from
layer below.

=item PERLIO_F_LINEBUF

Layer is line buffered. Write data should be passed to next layer down
whenever a "\n" is seen. Any data beyond the "\n" should then be
processed.

=item PERLIO_F_TEMP

File has been C<unlink()>ed, or should be deleted on C<close()>.

=item PERLIO_F_OPEN

Handle is open.

=item PERLIO_F_FASTGETS

This instance of this layer supports the "fast C<gets>" interface.
Normally set based on C<PERLIO_K_FASTGETS> for the class and by the
existence of the function(s) in the table. However a class that
normally provides that interface may need to avoid it on a
particular instance. The "pending" layer needs to do this when
it is pushed above a layer which does not support the interface.
(Perl's C<sv_gets()> does not expect the stream's fast C<gets> behaviour
to change during one "get".)

=back

=head2 Methods in Detail

=over 4

=item fsize

 Size_t fsize;

Size of the function table. This is compared against the value PerlIO
code "knows" as a compatibility check. Future versions I<may> be able
to tolerate layers compiled against an old version of the headers.

=item name

 char * name;

The name of the layer whose open() method Perl should invoke on
open(). For example if the layer is called APR, you will call:

 open $fh, ">:APR", ...

and Perl knows that it has to invoke the PerlIOAPR_open() method
implemented by the APR layer.

=item size

 Size_t size;

The size of the per-instance data structure, e.g.:

 sizeof(PerlIOAPR)

If this field is zero then C<PerlIO_pushed> does not malloc anything
and assumes the layer's Pushed function will do any required layer stack
manipulation - used to avoid malloc/free overhead for dummy layers.
If the field is non-zero it must be at least the size of C<PerlIOl>;
C<PerlIO_pushed> will then allocate memory for the layer's data
structures and link the new layer onto the stream's stack. (If the
layer's Pushed method returns an error indication the layer is popped
again.)

=item kind

 IV kind;

=over 4

=item * PERLIO_K_BUFFERED

The layer is buffered.

=item * PERLIO_K_RAW

The layer is acceptable to have in a binmode(FH) stack - i.e. it does not
(or will configure itself not to) transform bytes passing through it.

=item * PERLIO_K_CANCRLF

Layer can translate between "\n" and CRLF line ends.

=item * PERLIO_K_FASTGETS

Layer allows buffer snooping.

=item * PERLIO_K_MULTIARG

Used when the layer's open() accepts more arguments than usual. The
extra arguments should not come before the C<MODE> argument.
When this
flag is used it's up to the layer to validate the args.

=back

=item Pushed

 IV (*Pushed)(pTHX_ PerlIO *f, const char *mode,
              SV *arg, PerlIO_funcs *tab);

The only absolutely mandatory method. Called when the layer is pushed
onto the stack. The C<mode> argument may be NULL if this occurs
post-open. The C<arg> will be non-C<NULL> if an argument string was
passed. In most cases this should call C<PerlIOBase_pushed()> to
convert C<mode> into the appropriate C<PERLIO_F_XXXXX> flags in
addition to any actions the layer itself takes. If a layer is not
expecting an argument it need neither save the one passed to it, nor
provide C<Getarg()> (it could perhaps C<Perl_warn> that the argument
was unexpected).

Returns 0 on success. On failure returns -1 and should set errno.

=item Popped

 IV (*Popped)(pTHX_ PerlIO *f);

Called when the layer is popped from the stack. A layer will normally
be popped after C<Close()> is called. But a layer can be popped
without being closed if the program is dynamically managing layers on
the stream. In such cases C<Popped()> should free any resources
(buffers, translation tables, ...) not held directly in the layer's
struct. It should also C<Unread()> any unconsumed data that has been
read and buffered from the layer below back to that layer, so that it
can be re-provided to whatever is now above.

Returns 0 on both success and failure. If C<Popped()> returns I<true> then
I<perlio.c> assumes that either the layer has popped itself, or the
layer is super special and needs to be retained for other reasons.
In most cases it should return I<false>.

=item Open

 PerlIO * (*Open)(...);

The C<Open()> method has lots of arguments because it combines the
functions of perl's C<open>, C<PerlIO_open>, perl's C<sysopen>,
C<PerlIO_fdopen> and C<PerlIO_reopen>.
The full prototype is as
follows:

 PerlIO * (*Open)(pTHX_ PerlIO_funcs *tab,
                  PerlIO_list_t *layers, IV n,
                  const char *mode,
                  int fd, int imode, int perm,
                  PerlIO *old,
                  int narg, SV **args);

Open should (perhaps indirectly) call C<PerlIO_allocate()> to allocate
a slot in the table and associate it with the layers information for
the opened file, by calling C<PerlIO_push>. The I<layers> argument is an
array of all the layers destined for the C<PerlIO *>, and any
arguments passed to them; I<n> is the index into that array of the
layer being called. The macro C<PerlIOArg> will return a (possibly
C<NULL>) SV * for the argument passed to the layer.

The I<mode> string is an "C<fopen()>-like" string which would match
the regular expression C</^[I#]?[rwa]\+?[bt]?$/>.

The C<'I'> prefix is used during creation of C<stdin>..C<stderr> via
special C<PerlIO_fdopen> calls; the C<'#'> prefix means that this is
C<sysopen> and that I<imode> and I<perm> should be passed to
C<PerlLIO_open3>; C<'r'> means B<r>ead, C<'w'> means B<w>rite and
C<'a'> means B<a>ppend. The C<'+'> suffix means that both reading and
writing/appending are permitted. The C<'b'> suffix means file should
be binary, and C<'t'> means it is text. (Almost all layers should do
the IO in binary mode, and ignore the b/t bits. The C<:crlf> layer
should be pushed to handle the distinction.)

If I<old> is not C<NULL> then this is a C<PerlIO_reopen>. Perl itself
does not use this (yet?) and semantics are a little vague.

If I<fd> is not negative then it is the numeric file descriptor I<fd>,
which will be open in a manner compatible with the supplied mode
string; the call is thus equivalent to C<PerlIO_fdopen>. In this case
I<narg> will be zero.

If I<narg> is greater than zero then it gives the number of arguments
passed to C<open>, otherwise it will be 1 if for example
C<PerlIO_open> was called.
In simple cases SvPV_nolen(*args) is the
pathname to open.

If a layer provides C<Open()> it should normally call the C<Open()>
method of next layer down (if any) and then push itself on top if that
succeeds. C<PerlIOBase_open> is provided to do exactly that, so in
most cases you don't have to write your own C<Open()> method. If this
method is not defined, other layers may have difficulty pushing
themselves on top of it during open.

If C<PerlIO_push> was performed and the open has failed, the layer must
C<PerlIO_pop> itself; otherwise the layer won't be removed
and may cause bad problems.

Returns C<NULL> on failure.

=item Binmode

 IV (*Binmode)(pTHX_ PerlIO *f);

Optional. Used when the C<:raw> layer is pushed (explicitly or as a result
of binmode(FH)). If not present the layer will be popped. If present it
should configure the layer as binary (or pop itself) and return 0.
If it returns -1 for error, C<binmode> will fail with the layer
still on the stack.

=item Getarg

 SV * (*Getarg)(pTHX_ PerlIO *f,
                CLONE_PARAMS *param, int flags);

Optional. If present should return an SV * representing the string
argument passed to the layer when it was
pushed. e.g. ":encoding(ascii)" would return an SvPV with value
"ascii". (I<param> and I<flags> arguments can be ignored in most
cases)

C<Dup> uses C<Getarg> to retrieve the argument originally passed to
C<Pushed>, so you must implement this function if your layer has an
extra argument to C<Pushed> and will ever be C<Dup>ed.

=item Fileno

 IV (*Fileno)(pTHX_ PerlIO *f);

Returns the Unix/Posix numeric file descriptor for the handle. Normally
C<PerlIOBase_fileno()> (which just asks next layer down) will suffice
for this.

Returns -1 on error, which is considered to include the case where the
layer cannot provide such a file descriptor.

=item Dup

 PerlIO * (*Dup)(pTHX_ PerlIO *f, PerlIO *o,
                 CLONE_PARAMS *param, int flags);

XXX: Needs more docs.

Used as part of the "clone" process when a thread is spawned (in which
case param will be non-NULL) and when a stream is being duplicated via
'&' in the C<open>.

Similar to C<Open>, returns PerlIO* on success, C<NULL> on failure.

=item Read

 SSize_t (*Read)(pTHX_ PerlIO *f, void *vbuf, Size_t count);

Basic read operation.

Typically will call C<Fill> and manipulate pointers (possibly via the
API). C<PerlIOBuf_read()> may be suitable for derived classes which
provide "fast gets" methods.

Returns actual bytes read, or -1 on an error.

=item Unread

 SSize_t (*Unread)(pTHX_ PerlIO *f,
                   const void *vbuf, Size_t count);

A superset of stdio's C<ungetc()>. Should arrange for future reads to
see the bytes in C<vbuf>. If there is no obviously better implementation
then C<PerlIOBase_unread()> provides the function by pushing a "fake"
"pending" layer above the calling layer.

Returns the number of unread chars.

=item Write

 SSize_t (*Write)(pTHX_ PerlIO *f, const void *vbuf, Size_t count);

Basic write operation.

Returns bytes written or -1 on an error.

=item Seek

 IV (*Seek)(pTHX_ PerlIO *f, Off_t offset, int whence);

Position the file pointer. Should normally call its own C<Flush>
method and then the C<Seek> method of next layer down.

Returns 0 on success, -1 on failure.

=item Tell

 Off_t (*Tell)(pTHX_ PerlIO *f);

Return the file pointer. May be based on the layer's cached concept of
position to avoid overhead.

Returns -1 on failure to get the file pointer.

=item Close

 IV (*Close)(pTHX_ PerlIO *f);

Close the stream.
Should normally call C<PerlIOBase_close()> to flush
itself and close layers below, and then deallocate any data structures
(buffers, translation tables, ...) not held directly in the data
structure.

Returns 0 on success, -1 on failure.

=item Flush

 IV (*Flush)(pTHX_ PerlIO *f);

Should make stream's state consistent with layers below. That is, any
buffered write data should be written, and file position of lower layers
adjusted for data read from below but not actually consumed.
(Should perhaps C<Unread()> such data to the lower layer.)

Returns 0 on success, -1 on failure.

=item Fill

 IV (*Fill)(pTHX_ PerlIO *f);

The buffer for this layer should be filled (for read) from layer
below. When you "subclass" PerlIOBuf layer, you want to use its
I<_read> method and to supply your own fill method, which fills the
PerlIOBuf's buffer.

Returns 0 on success, -1 on failure.

=item Eof

 IV (*Eof)(pTHX_ PerlIO *f);

Return end-of-file indicator. C<PerlIOBase_eof()> is normally sufficient.

Returns 0 on end-of-file, 1 if not end-of-file, -1 on error.

=item Error

 IV (*Error)(pTHX_ PerlIO *f);

Return error indicator. C<PerlIOBase_error()> is normally sufficient.

Returns 1 if there is an error (usually when C<PERLIO_F_ERROR> is set),
0 otherwise.

=item Clearerr

 void (*Clearerr)(pTHX_ PerlIO *f);

Clear end-of-file and error indicators. Should call C<PerlIOBase_clearerr()>
to set the C<PERLIO_F_XXXXX> flags, which may suffice.

=item Setlinebuf

 void (*Setlinebuf)(pTHX_ PerlIO *f);

Mark the stream as line buffered. C<PerlIOBase_setlinebuf()> sets the
PERLIO_F_LINEBUF flag and is normally sufficient.

=item Get_base

 STDCHAR * (*Get_base)(pTHX_ PerlIO *f);

Allocate (if not already done so) the read buffer for this layer and
return pointer to it. Return NULL on failure.

=item Get_bufsiz

 Size_t (*Get_bufsiz)(pTHX_ PerlIO *f);

Return the number of bytes that last C<Fill()> put in the buffer.

=item Get_ptr

 STDCHAR * (*Get_ptr)(pTHX_ PerlIO *f);

Return the current read pointer relative to this layer's buffer.

=item Get_cnt

 SSize_t (*Get_cnt)(pTHX_ PerlIO *f);

Return the number of bytes left to be read in the current buffer.

=item Set_ptrcnt

 void (*Set_ptrcnt)(pTHX_ PerlIO *f,
                    STDCHAR *ptr, SSize_t cnt);

Adjust the read pointer and count of bytes to match C<ptr> and/or C<cnt>.
The application (or layer above) must ensure they are consistent.
(Checking is allowed by the paranoid.)

=back

=head2 Utilities

To ask for the next layer down use PerlIONext(PerlIO *f).

To check that a PerlIO* is valid use PerlIOValid(PerlIO *f). (All
this really does is check that the pointer is non-NULL and
that the pointer behind that is non-NULL.)

PerlIOBase(PerlIO *f) returns the "Base" pointer, or in other words,
the C<PerlIOl*> pointer.

PerlIOSelf(PerlIO* f, type) returns the PerlIOBase cast to a type.

Perl_PerlIO_or_Base(PerlIO* f, callback, base, failure, args) either
calls the I<callback> from the functions of the layer I<f> (just by
the name of the IO function, like "Read") with the I<args>, or if
there is no such callback, calls the I<base> version of the callback
with the same args, or if the f is invalid, sets errno to EBADF and
returns I<failure>.

Perl_PerlIO_or_fail(PerlIO* f, callback, failure, args) either calls
the I<callback> of the functions of the layer I<f> with the I<args>,
or if there is no such callback, sets errno to EINVAL. Or if the f is
invalid, sets errno to EBADF and returns I<failure>.
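
The dispatch rule described for C<Perl_PerlIO_or_Base> and
C<Perl_PerlIO_or_fail> can be sketched as ordinary functions. This is a
stand-alone toy, not the real macros: the types and names (C<toy_layer>,
C<toy_funcs>, C<toy_io>, C<toy_valid>, C<toy_tell_or_base>,
C<toy_tell_or_fail>) are hypothetical, the table has a single C<Tell>
slot, and -1 is assumed as the I<failure> value.

```c
#include <errno.h>
#include <stddef.h>

/* Toy stand-ins, NOT the real perliol.h definitions. */
typedef struct toy_layer toy_layer;
typedef struct { long (*Tell)(toy_layer **f); } toy_funcs;
struct toy_layer { toy_layer *next; const toy_funcs *tab; };
typedef toy_layer *toy_io;          /* plays the role of PerlIO */

/* Valid means: the handle and the layer pointer behind it are non-NULL. */
static int toy_valid(toy_io *f) { return f != NULL && *f != NULL; }

/* "Base class" fallback used when a layer leaves the slot NULL. */
static long toy_base_tell(toy_io *f) { (void)f; return 0; }

/* Mirrors the or_Base rule: callback, else base; EBADF if invalid. */
static long toy_tell_or_base(toy_io *f)
{
    if (!toy_valid(f)) { errno = EBADF; return -1; }
    if ((*f)->tab->Tell) return (*f)->tab->Tell(f);
    return toy_base_tell(f);
}

/* Mirrors the or_fail rule: callback, else EINVAL; EBADF if invalid. */
static long toy_tell_or_fail(toy_io *f)
{
    if (!toy_valid(f)) { errno = EBADF; return -1; }
    if ((*f)->tab->Tell) return (*f)->tab->Tell(f);
    errno = EINVAL;
    return -1;
}
```

The two variants differ only in what happens when the slot is NULL: one
falls back to a base implementation, the other reports EINVAL. The
validity test is the same double NULL check described for PerlIOValid.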

Perl_PerlIO_or_Base_void(PerlIO* f, callback, base, args) either calls
the I<callback> of the functions of the layer I<f> with the I<args>,
or if there is no such callback, calls the I<base> version of the
callback with the same args, or if the f is invalid, sets errno to
EBADF.

Perl_PerlIO_or_fail_void(PerlIO* f, callback, args) either calls the
I<callback> of the functions of the layer I<f> with the I<args>, or if
there is no such callback, sets errno to EINVAL. Or if the f is
invalid, sets errno to EBADF.

=head2 Implementing PerlIO Layers

If you find the implementation document unclear or not sufficient,
look at the existing PerlIO layer implementations, which include:

=over

=item * C implementations

The F<perlio.c> and F<perliol.h> in the Perl core implement the
"unix", "perlio", "stdio", "crlf", "utf8", "byte", "raw", "pending"
layers, and also the "mmap" and "win32" layers if applicable.
(The "win32" layer is currently unfinished and unused; to see what is
used instead on Win32, see L<PerlIO/"Querying the layers of filehandles">.)

PerlIO::encoding, PerlIO::scalar, PerlIO::via in the Perl core.

PerlIO::gzip and APR::PerlIO (mod_perl 2.0) on CPAN.

=item * Perl implementations

PerlIO::via::QuotedPrint in the Perl core and PerlIO::via::* on CPAN.

=back

If you are creating a PerlIO layer, you may want to be lazy, in other
words, implement only the methods that interest you. The other methods
you can either replace with the "blank" methods

    PerlIOBase_noop_ok
    PerlIOBase_noop_fail

(which do nothing, and return zero and -1, respectively) or for
certain methods you may assume a default behaviour by using a NULL
method. The Open method looks for help in the 'parent' layer.
The following table summarizes the behaviour:

    method      behaviour with NULL

    Clearerr    PerlIOBase_clearerr
    Close       PerlIOBase_close
    Dup         PerlIOBase_dup
    Eof         PerlIOBase_eof
    Error       PerlIOBase_error
    Fileno      PerlIOBase_fileno
    Fill        FAILURE
    Flush       SUCCESS
    Getarg      SUCCESS
    Get_base    FAILURE
    Get_bufsiz  FAILURE
    Get_cnt     FAILURE
    Get_ptr     FAILURE
    Open        INHERITED
    Popped      SUCCESS
    Pushed      SUCCESS
    Read        PerlIOBase_read
    Seek        FAILURE
    Set_cnt     FAILURE
    Set_ptrcnt  FAILURE
    Setlinebuf  PerlIOBase_setlinebuf
    Tell        FAILURE
    Unread      PerlIOBase_unread
    Write       FAILURE

 FAILURE        Set errno (to EINVAL in Unixish, to LIB$_INVARG in VMS) and
                return -1 (for numeric return values) or NULL (for pointers)
 INHERITED      Inherited from the layer below
 SUCCESS        Return 0 (for numeric return values) or a pointer

=head2 Core Layers

The file C<perlio.c> provides the following layers:

=over 4

=item "unix"

A basic non-buffered layer which calls Unix/POSIX C<read()>, C<write()>,
C<lseek()>, C<close()>. No buffering. Even on platforms that distinguish
between O_TEXT and O_BINARY this layer is always O_BINARY.

=item "perlio"

A very complete generic buffering layer which provides the whole of
PerlIO API. It is also intended to be used as a "base class" for other
layers. (For example its C<Read()> method is implemented in terms of
the C<Get_cnt()>/C<Get_ptr()>/C<Set_ptrcnt()> methods).

"perlio" over "unix" provides a complete replacement for stdio as seen
via PerlIO API. This is the default for USE_PERLIO on systems whose stdio
does not permit perl's "fast gets" access, and which do not
distinguish between C<O_TEXT> and C<O_BINARY>.

=item "stdio"

A layer which provides the PerlIO API via the layer scheme, but
implements it by calling system's stdio.
This is (currently) the default
on systems whose stdio provides sufficient access to allow perl's
"fast gets" access and which do not distinguish between C<O_TEXT> and
C<O_BINARY>.

=item "crlf"

A layer derived using "perlio" as a base class. It provides Win32-like
"\n" to CR,LF translation. Can either be applied above "perlio" or serve
as the buffer layer itself. "crlf" over "unix" is the default if system
distinguishes between C<O_TEXT> and C<O_BINARY> opens. (At some point
"unix" will be replaced by a "native" Win32 IO layer on that platform,
as Win32's read/write layer has various drawbacks.) The "crlf" layer is
a reasonable model for a layer which transforms data in some way.

=item "mmap"

If Configure detects C<mmap()> functions this layer is provided (with
"perlio" as a "base") which does "read" operations by mmap()ing the
file. Performance improvement is marginal on modern systems, so it is
mainly there as a proof of concept. It is likely to be unbundled from
the core at some point. The "mmap" layer is a reasonable model for a
minimalist "derived" layer.

=item "pending"

An "internal" derivative of "perlio" which can be used to provide
Unread() function for layers which have no buffer or cannot be
bothered. (Basically this layer's C<Fill()> pops itself off the stack
and so resumes reading from layer below.)

=item "raw"

A dummy layer which never exists on the layer stack. Instead when
"pushed" it actually pops the stack, removing itself; it then calls the
Binmode function table entry on all the layers in the stack - normally
this (via PerlIOBase_binmode) removes any layers which do not have
C<PERLIO_K_RAW> bit set. Layers can modify that behaviour by defining
their own Binmode entry.

=item "utf8"

Another dummy layer.
When pushed it pops itself and sets the
C<PERLIO_F_UTF8> flag on the layer which was (and now is once more)
the top of the stack.

=back

In addition F<perlio.c> also provides a number of C<PerlIOBase_xxxx()>
functions which are intended to be used in the table slots of classes
which do not need to do anything special for a particular method.

=head2 Extension Layers

Layers can be made available by extension modules. When an unknown layer
is encountered the PerlIO code will perform the equivalent of:

   use PerlIO 'layer';

Where I<layer> is the unknown layer. F<PerlIO.pm> will then attempt to:

   require PerlIO::layer;

If after that process the layer is still not defined then the C<open>
will fail.

The following extension layers are bundled with perl:

=over 4

=item ":encoding"

   use Encoding;

makes this layer available, although F<PerlIO.pm> "knows" where to
find it. It is an example of a layer which takes an argument as it is
called thus:

   open( $fh, "<:encoding(iso-8859-7)", $pathname );

=item ":scalar"

Provides support for reading data from and writing data to a scalar.

   open( $fh, "+<:scalar", \$scalar );

When a handle is so opened, then reads get bytes from the string value
of I<$scalar>, and writes change the value. In both cases the position
in I<$scalar> starts as zero but can be altered via C<seek>, and
determined via C<tell>.

Please note that this layer is implied when calling open() thus:

   open( $fh, "+<", \$scalar );

=item ":via"

Provided to allow layers to be implemented as Perl code. For instance:

   use PerlIO::via::StripHTML;
   open( my $fh, "<:via(StripHTML)", "index.html" );

See L<PerlIO::via> for details.

=back

=head1 TODO

Things that need to be done to improve this document.


=over

=item *

Explain how to make a valid fh without going through open() (i.e. apply
a layer). For example if the file is not opened through perl, but we
want to get back a fh, like it was opened by Perl.

How does PerlIO_apply_layera fit in, where are its docs, and was it
made public?

Currently the example could be something like this:

  PerlIO *foo_to_PerlIO(pTHX_ char *mode, ...)
  {
      /* mode is "w", "r", etc */
      const char *layers = ":APR"; /* the layer name */
      PerlIO *f = PerlIO_allocate(aTHX);
      if (!f) {
          return NULL;
      }

      PerlIO_apply_layers(aTHX_ f, mode, layers);

      if (f) {
          PerlIOAPR *st = PerlIOSelf(f, PerlIOAPR);
          /* fill in the st struct, as in _open(); 'file' is assumed
             to come from the elided arguments */
          st->file = file;
          PerlIOBase(f)->flags |= PERLIO_F_OPEN;

          return f;
      }
      return NULL;
  }

=item *

Fix/add the documentation in places marked as XXX.

=item *

The handling of errors by the layer is not specified, e.g. when $!
should be set explicitly, and when the error handling should be just
delegated to the top layer.

Probably give some hints on using SETERRNO() or pointers to where they
can be found.

=item *

I think it would help to give some concrete examples to make it easier
to understand the API. Of course I agree that the API has to be
concise, but since there is no second document that is more of a
guide, I think that it'd make it easier to start with the doc which is
an API, but has examples in it in places where things are unclear, to
a person who is not a PerlIO guru (yet).

=back

=cut