=head1 NAME

perliol - C API for Perl's implementation of IO in Layers.

=head1 SYNOPSIS

    /* Defining a layer ... */
    #include <perliol.h>

=head1 DESCRIPTION

This document describes the behavior and implementation of the PerlIO
abstraction described in L<perlapio> when C<USE_PERLIO> is defined.

=head2 History and Background

The PerlIO abstraction was introduced in perl5.003_02 but languished as
just an abstraction until perl5.7.0. However during that time a number
of perl extensions switched to using it, so the API is mostly fixed to
maintain (source) compatibility.

The aim of the implementation is to provide the PerlIO API in a flexible
and platform neutral manner. It is also a trial of an "Object Oriented
C, with vtables" approach which may be applied to Raku.

=head2 Basic Structure

PerlIO is a stack of layers.

The low levels of the stack work with the low-level operating system
calls (file descriptors in C) getting bytes in and out, the higher
layers of the stack buffer, filter, and otherwise manipulate the I/O,
and return characters (or bytes) to Perl.  Terms I<above> and I<below>
are used to refer to the relative positioning of the stack layers.

A layer contains a "vtable", the table of I/O operations (at C level
a table of function pointers), and status flags.  The functions in the
vtable implement operations like "open", "read", and "write".

When I/O, for example "read", is requested, the request goes from Perl
first down the stack using "read" functions of each layer, then at the
bottom the input is requested from the operating system services, then
the result is returned up the stack, finally being interpreted as Perl
data.

The requests do not necessarily go always all the way down to the
operating system: that's where PerlIO buffering comes into play.
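
The request flow described above can be modelled in standalone C. The
following is a toy sketch under stated assumptions, not code from
F<perlio.c>: the types C<ToyLayer> and C<ToyFuncs> and the functions
C<src_read>, C<upper_read> and C<toy_read> are invented for
illustration, the "vtable" has a single C<Read> slot, and a fixed
string stands in for the operating system.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <string.h>

typedef struct ToyLayer ToyLayer;

/* Toy "vtable": a single slot standing in for the real table of operations. */
typedef struct {
    const char *name;
    long (*Read)(ToyLayer *l, char *buf, size_t count);
} ToyFuncs;

struct ToyLayer {
    ToyLayer       *next;  /* layer below, NULL at the bottom of the stack */
    const ToyFuncs *tab;   /* functions for this layer */
    const char     *src;   /* bottom layer only: pretend operating system data */
    size_t          pos;   /* bottom layer only: how much of src was consumed */
};

/* Bottom layer: hands out bytes from a fixed string, like read() on an fd. */
static long src_read(ToyLayer *l, char *buf, size_t count)
{
    size_t left = strlen(l->src) - l->pos;
    if (count > left)
        count = left;
    memcpy(buf, l->src + l->pos, count);
    l->pos += count;
    return (long)count;
}

/* Upper layer: asks the layer below, then transforms the data on the way up. */
static long upper_read(ToyLayer *l, char *buf, size_t count)
{
    long got, i;
    assert(l->next != NULL);   /* this toy layer cannot sit at the bottom */
    got = l->next->tab->Read(l->next, buf, count);
    for (i = 0; i < got; i++)
        buf[i] = (char)toupper((unsigned char)buf[i]);
    return got;
}

static const ToyFuncs toy_src   = { "src",   src_read };
static const ToyFuncs toy_upper = { "upper", upper_read };

/* A request always enters at whatever layer is currently on top. */
static long toy_read(ToyLayer *top, char *buf, size_t count)
{
    return top->tab->Read(top, buf, count);
}
```

With a C<toy_upper> layer stacked on a C<toy_src> layer whose string is
"hello, layers", a call to C<toy_read()> on the top layer goes down to
C<src_read()> and comes back up through C<upper_read()>, yielding the
13 bytes "HELLO, LAYERS". A further call returns 0, which is this toy's
end-of-file.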

When you do an open() and specify extra PerlIO layers to be deployed,
the layers you specify are "pushed" on top of the already existing
default stack.  One way to see it is that "operating system is
on the left" and "Perl is on the right".

What exact layers are in this default stack depends on a lot of
things: your operating system, Perl version, Perl compile time
configuration, and Perl runtime configuration.  See L<PerlIO>,
L<perlrun/PERLIO>, and L<open> for more information.

binmode() operates similarly to open(): by default the specified
layers are pushed on top of the existing stack.

However, note that even as the specified layers are "pushed on top"
for open() and binmode(), this doesn't mean that the effects are
limited to the "top": PerlIO layers can be very 'active' and inspect
and affect layers also deeper in the stack.  As an example there
is a layer called "raw" which repeatedly "pops" layers until
it reaches the first layer that has declared itself capable of
handling binary data.  The "pushed" layers are processed in left-to-right
order.

sysopen() operates (unsurprisingly) at a lower level in the stack than
open().  For example in Unix or Unix-like systems sysopen() operates
directly at the level of file descriptors: in the terms of PerlIO
layers, it uses only the "unix" layer, which is a rather thin wrapper
on top of the Unix file descriptors.

=head2 Layers vs Disciplines

Initial discussion of the ability to modify IO streams behaviour used
the term "discipline" for the entities which were added.  This came (I
believe) from the use of the term in "sfio", which in turn borrowed it
from "line disciplines" on Unix terminals.  However, this document (and
the C code) uses the term "layer".

This is, I hope, a natural term given the implementation, and should
avoid connotations that are inherent in earlier uses of "discipline"
for things which are rather different.

=head2 Data Structures

The basic data structure is a PerlIOl:

 typedef struct _PerlIO PerlIOl;
 typedef struct _PerlIO_funcs PerlIO_funcs;
 typedef PerlIOl *PerlIO;

 struct _PerlIO
 {
  PerlIOl *      next;   /* Lower layer */
  PerlIO_funcs * tab;    /* Functions for this layer */
  U32            flags;  /* Various flags for state */
 };

A C<PerlIOl *> is a pointer to the struct, and the I<application>
level C<PerlIO *> is a pointer to a C<PerlIOl *> - i.e. a pointer
to a pointer to the struct.  This allows the application level C<PerlIO *>
to remain constant while the actual C<PerlIOl *> underneath
changes.  (Compare perl's C<SV *> which remains constant while its
C<sv_any> field changes as the scalar's type changes.)  An IO stream is
then in general represented as a pointer to this linked-list of
"layers".

It should be noted that because of the double indirection in a C<PerlIO *>,
a C<< &(perlio->next) >> "is" a C<PerlIO *>, and so to some degree
at least one layer can use the "standard" API on the next layer down.

A "layer" is composed of two parts:

=over 4

=item 1.

The functions and attributes of the "layer class".

=item 2.

The per-instance data for a particular handle.

=back

=head2 Functions and Attributes

The functions and attributes are accessed via the "tab" (for table)
member of C<PerlIOl>.  The functions (methods of the layer "class") are
fixed, and are defined by the C<PerlIO_funcs> type.
They are broadly the
same as the public C<PerlIO_xxxxx> functions:

 struct _PerlIO_funcs
 {
  Size_t     fsize;
  char *     name;
  Size_t     size;
  IV         kind;
  IV         (*Pushed)(pTHX_ PerlIO *f,
                       const char *mode,
                       SV *arg,
                       PerlIO_funcs *tab);
  IV         (*Popped)(pTHX_ PerlIO *f);
  PerlIO *   (*Open)(pTHX_ PerlIO_funcs *tab,
                     PerlIO_list_t *layers, IV n,
                     const char *mode,
                     int fd, int imode, int perm,
                     PerlIO *old,
                     int narg, SV **args);
  IV         (*Binmode)(pTHX_ PerlIO *f);
  SV *       (*Getarg)(pTHX_ PerlIO *f,
                       CLONE_PARAMS *param, int flags);
  IV         (*Fileno)(pTHX_ PerlIO *f);
  PerlIO *   (*Dup)(pTHX_ PerlIO *f,
                    PerlIO *o,
                    CLONE_PARAMS *param,
                    int flags);
  /* Unix-like functions - cf sfio line disciplines */
  SSize_t    (*Read)(pTHX_ PerlIO *f, void *vbuf, Size_t count);
  SSize_t    (*Unread)(pTHX_ PerlIO *f, const void *vbuf, Size_t count);
  SSize_t    (*Write)(pTHX_ PerlIO *f, const void *vbuf, Size_t count);
  IV         (*Seek)(pTHX_ PerlIO *f, Off_t offset, int whence);
  Off_t      (*Tell)(pTHX_ PerlIO *f);
  IV         (*Close)(pTHX_ PerlIO *f);
  /* Stdio-like buffered IO functions */
  IV         (*Flush)(pTHX_ PerlIO *f);
  IV         (*Fill)(pTHX_ PerlIO *f);
  IV         (*Eof)(pTHX_ PerlIO *f);
  IV         (*Error)(pTHX_ PerlIO *f);
  void       (*Clearerr)(pTHX_ PerlIO *f);
  void       (*Setlinebuf)(pTHX_ PerlIO *f);
  /* Perl's snooping functions */
  STDCHAR *  (*Get_base)(pTHX_ PerlIO *f);
  Size_t     (*Get_bufsiz)(pTHX_ PerlIO *f);
  STDCHAR *  (*Get_ptr)(pTHX_ PerlIO *f);
  SSize_t    (*Get_cnt)(pTHX_ PerlIO *f);
  void       (*Set_ptrcnt)(pTHX_ PerlIO *f,STDCHAR *ptr,SSize_t cnt);
 };

The first few members of the struct give a function table size for
compatibility checking, a "name" for the layer, the size to C<malloc>
for the per-instance data, and some flags which are attributes of the
class as a whole (such as whether it is a buffering layer). Then follow
the functions, which fall into four basic groups:

=over 4

=item
1.

Opening and setup functions

=item 2.

Basic IO operations

=item 3.

Stdio class buffering options.

=item 4.

Functions to support Perl's traditional "fast" access to the buffer.

=back

A layer does not have to implement all the functions, but the whole
table has to be present.  Unimplemented slots can be NULL (which will
result in an error when called) or can be filled in with stubs to
"inherit" behaviour from a "base class".  This "inheritance" is fixed
for all instances of the layer, but as the layer chooses which stubs
to populate the table, limited "multiple inheritance" is possible.

=head2 Per-instance Data

The per-instance data are held in memory beyond the basic PerlIOl
struct, by making a PerlIOl the first member of the layer's struct
thus:

 typedef struct
 {
  struct _PerlIO base;       /* Base "class" info */
  STDCHAR *      buf;        /* Start of buffer */
  STDCHAR *      end;        /* End of valid part of buffer */
  STDCHAR *      ptr;        /* Current position in buffer */
  Off_t          posn;       /* Offset of buf into the file */
  Size_t         bufsiz;     /* Real size of buffer */
  IV             oneword;    /* Emergency buffer */
 } PerlIOBuf;

In this way (as for perl's scalars) a pointer to a PerlIOBuf can be
treated as a pointer to a PerlIOl.

=head2 Layers in action

                table           perlio          unix
            |           |
            +-----------+    +----------+    +--------+
   PerlIO ->|           |--->|  next    |--->|  NULL  |
            +-----------+    +----------+    +--------+
            |           |    |  buffer  |    |   fd   |
            +-----------+    |          |    +--------+
            |           |    +----------+


The above attempts to show how the layer scheme works in a simple case.
The application's C<PerlIO *> points to an entry in the table(s)
representing open (allocated) handles.  For example the first three slots
in the table correspond to C<stdin>, C<stdout> and C<stderr>.
The table
in turn points to the current "top" layer for the handle - in this case
an instance of the generic buffering layer "perlio".  That layer in turn
points to the next layer down - in this case the low-level "unix" layer.

The above is roughly equivalent to a "stdio" buffered stream, but with
much more flexibility:

=over 4

=item *

If Unix level C<read>/C<write>/C<lseek> is not appropriate for (say)
sockets then the "unix" layer can be replaced (at open time or even
dynamically) with a "socket" layer.

=item *

Different handles can have different buffering schemes.  The "top"
layer could be the "mmap" layer if reading disk files was quicker
using C<mmap> than C<read>.  An "unbuffered" stream can be implemented
simply by not having a buffer layer.

=item *

Extra layers can be inserted to process the data as it flows through.
This was the driving need for including the scheme in perl 5.7.0+ - we
needed a mechanism to allow data to be translated between perl's
internal encoding (conceptually at least Unicode as UTF-8), and the
"native" format used by the system.  This is provided by the
":encoding(xxxx)" layer which typically sits above the buffering layer.

=item *

A layer can be added that does "\n" to CRLF translation.  This layer
can be used on any platform, not just those that normally do such
things.

=back

=head2 Per-instance flag bits

The generic flag bits are a hybrid of C<O_XXXXX> style flags deduced
from the mode string passed to C<PerlIO_open()>, and state bits for
typical buffer layers.

=over 4

=item PERLIO_F_EOF

End of file.

=item PERLIO_F_CANWRITE

Writes are permitted, i.e. opened as "w" or "r+" or "a", etc.

=item PERLIO_F_CANREAD

Reads are permitted i.e. opened "r" or "w+" (or even "a+" - ick).

=item PERLIO_F_ERROR

An error has occurred (for C<PerlIO_error()>).

=item PERLIO_F_TRUNCATE

Truncate file suggested by open mode.

=item PERLIO_F_APPEND

All writes should be appends.

=item PERLIO_F_CRLF

Layer is performing Win32-like "\n" mapped to CR,LF for output and CR,LF
mapped to "\n" for input.  Normally the provided "crlf" layer is the only
layer that need bother about this.  C<PerlIO_binmode()> will mess with this
flag rather than add/remove layers if the C<PERLIO_K_CANCRLF> bit is set
for the layer's class.

=item PERLIO_F_UTF8

Data written to this layer should be UTF-8 encoded; data provided
by this layer should be considered UTF-8 encoded.  Can be set on any layer
by ":utf8" dummy layer.  Also set on ":encoding" layer.

=item PERLIO_F_UNBUF

Layer is unbuffered - i.e. write to next layer down should occur for
each write to this layer.

=item PERLIO_F_WRBUF

The buffer for this layer currently holds data written to it but not sent
to next layer.

=item PERLIO_F_RDBUF

The buffer for this layer currently holds unconsumed data read from
layer below.

=item PERLIO_F_LINEBUF

Layer is line buffered.  Write data should be passed to next layer down
whenever a "\n" is seen.  Any data beyond the "\n" should then be
processed.

=item PERLIO_F_TEMP

File has been C<unlink()>ed, or should be deleted on C<close()>.

=item PERLIO_F_OPEN

Handle is open.

=item PERLIO_F_FASTGETS

This instance of this layer supports the "fast C<gets>" interface.
Normally set based on C<PERLIO_K_FASTGETS> for the class and by the
existence of the function(s) in the table.  However a class that
normally provides that interface may need to avoid it on a
particular instance.  The "pending" layer needs to do this when
it is pushed above a layer which does not support the interface.
(Perl's C<sv_gets()> does not expect the stream's fast C<gets> behaviour
to change during one "get".)

=for apidoc Amnh||PERLIO_F_APPEND
=for apidoc_item || PERLIO_F_CANREAD
=for apidoc_item ||PERLIO_F_CANWRITE
=for apidoc_item ||PERLIO_F_CRLF
=for apidoc_item ||PERLIO_F_EOF
=for apidoc_item ||PERLIO_F_ERROR
=for apidoc_item ||PERLIO_F_FASTGETS
=for apidoc_item ||PERLIO_F_LINEBUF
=for apidoc_item ||PERLIO_F_OPEN
=for apidoc_item ||PERLIO_F_RDBUF
=for apidoc_item ||PERLIO_F_TEMP
=for apidoc_item ||PERLIO_F_TRUNCATE
=for apidoc_item ||PERLIO_F_UNBUF
=for apidoc_item ||PERLIO_F_UTF8
=for apidoc_item ||PERLIO_F_WRBUF

=back

=head2 Methods in Detail

=over 4

=item fsize

 Size_t fsize;

Size of the function table.  This is compared against the value PerlIO
code "knows" as a compatibility check.  Future versions I<may> be able
to tolerate layers compiled against an old version of the headers.

=item name

 char * name;

The name of the layer whose open() method Perl should invoke on
open().  For example if the layer is called APR, you will call:

  open $fh, ">:APR", ...

and Perl knows that it has to invoke the PerlIOAPR_open() method
implemented by the APR layer.

=item size

 Size_t size;

The size of the per-instance data structure, e.g.:

  sizeof(PerlIOAPR)

If this field is zero then C<PerlIO_pushed> does not malloc anything
and assumes the layer's Pushed function will do any required layer stack
manipulation - used to avoid malloc/free overhead for dummy layers.
If the field is non-zero it must be at least the size of C<PerlIOl>;
C<PerlIO_pushed> will allocate memory for the layer's data structures
and link the new layer onto the stream's stack.  (If the layer's Pushed
method returns an error indication the layer is popped again.)

=item kind

 IV kind;

=over 4

=item * PERLIO_K_BUFFERED

The layer is buffered.

=item * PERLIO_K_RAW

The layer is acceptable to have in a binmode(FH) stack - i.e. it does not
(or will configure itself not to) transform bytes passing through it.

=item * PERLIO_K_CANCRLF

Layer can translate between "\n" and CRLF line ends.

=item * PERLIO_K_FASTGETS

Layer allows buffer snooping.

=item * PERLIO_K_MULTIARG

Used when the layer's open() accepts more arguments than usual.  The
extra arguments should not come before the C<MODE> argument.  When this
flag is used it's up to the layer to validate the args.

=for apidoc Amnh|| PERLIO_K_BUFFERED
=for apidoc_item || PERLIO_K_CANCRLF
=for apidoc_item || PERLIO_K_FASTGETS
=for apidoc_item || PERLIO_K_MULTIARG
=for apidoc_item || PERLIO_K_RAW

=back

=item Pushed

 IV (*Pushed)(pTHX_ PerlIO *f, const char *mode,
              SV *arg, PerlIO_funcs *tab);

The only absolutely mandatory method.  Called when the layer is pushed
onto the stack.  The C<mode> argument may be NULL if this occurs
post-open.  The C<arg> will be non-C<NULL> if an argument string was
passed.  In most cases this should call C<PerlIOBase_pushed()> to
convert C<mode> into the appropriate C<PERLIO_F_XXXXX> flags in
addition to any actions the layer itself takes.  If a layer is not
expecting an argument it need neither save the one passed to it, nor
provide C<Getarg()> (it could perhaps C<Perl_warn> that the argument
was un-expected).

Returns 0 on success.  On failure returns -1 and should set errno.

=item Popped

 IV (*Popped)(pTHX_ PerlIO *f);

Called when the layer is popped from the stack.  A layer will normally
be popped after C<Close()> is called.  But a layer can be popped
without being closed if the program is dynamically managing layers on
the stream.
In such cases C<Popped()> should free any resources
(buffers, translation tables, ...) not held directly in the layer's
struct.  It should also C<Unread()> any unconsumed data that has been
read and buffered from the layer below back to that layer, so that it
can be re-provided to whatever is now above.

Returns 0 on both success and failure.  If C<Popped()> returns I<true> then
I<perlio.c> assumes that either the layer has popped itself, or the
layer is super special and needs to be retained for other reasons.
In most cases it should return I<false>.

=item Open

 PerlIO * (*Open)(...);

The C<Open()> method has lots of arguments because it combines the
functions of perl's C<open>, C<PerlIO_open>, perl's C<sysopen>,
C<PerlIO_fdopen> and C<PerlIO_reopen>.  The full prototype is as
follows:

 PerlIO * (*Open)(pTHX_ PerlIO_funcs *tab,
                  PerlIO_list_t *layers, IV n,
                  const char *mode,
                  int fd, int imode, int perm,
                  PerlIO *old,
                  int narg, SV **args);

Open should (perhaps indirectly) call C<PerlIO_allocate()> to allocate
a slot in the table and associate it with the layers information for
the opened file, by calling C<PerlIO_push>.  The I<layers> is an
array of all the layers destined for the C<PerlIO *>, and any
arguments passed to them, I<n> is the index into that array of the
layer being called.  The macro C<PerlIOArg> will return a (possibly
C<NULL>) SV * for the argument passed to the layer.

Where a layer opens or takes ownership of a file descriptor, that layer is
responsible for getting the file descriptor's close-on-exec flag into the
correct state.  The flag should be clear for a file descriptor numbered
less than or equal to C<PL_maxsysfd>, and set for any file descriptor
numbered higher.  For thread safety, when a layer opens a new file
descriptor it should if possible open it with the close-on-exec flag
initially set.
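
The close-on-exec rule above can be sketched with plain POSIX calls.
This is an illustration only, not the code Perl itself uses:
C<toy_fix_cloexec> and C<toy_open_ro> are hypothetical helpers, the
C<max_sys_fd> parameter merely stands in for C<PL_maxsysfd>, and the
sketch assumes a platform that provides C<O_CLOEXEC>.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <fcntl.h>
#include <unistd.h>

/* Put fd's close-on-exec flag into the state described above: clear for
 * descriptors up to and including max_sys_fd, set for anything higher.
 * Returns 0 on success, -1 on failure (errno is set by fcntl). */
static int toy_fix_cloexec(int fd, int max_sys_fd)
{
    int flags;
    assert(fd >= 0);                 /* caller must pass an open descriptor */
    flags = fcntl(fd, F_GETFD);
    if (flags == -1)
        return -1;
    if (fd <= max_sys_fd)
        flags &= ~FD_CLOEXEC;
    else
        flags |= FD_CLOEXEC;
    return fcntl(fd, F_SETFD, flags);
}

/* Open a file read-only with the flag set from the very start, so that
 * another thread calling exec cannot leak it, then settle the final state. */
static int toy_open_ro(const char *path, int max_sys_fd)
{
    int fd = open(path, O_RDONLY | O_CLOEXEC);
    if (fd == -1)
        return -1;
    if (toy_fix_cloexec(fd, max_sys_fd) == -1) {
        close(fd);
        return -1;
    }
    return fd;
}
```

Opening with the flag already set and clearing it afterwards where
needed keeps the window in which a descriptor could leak across an
C<exec> as small as possible, which is the point of the thread-safety
advice above.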

=for apidoc Amnh||PL_maxsysfd

The I<mode> string is an "C<fopen()>-like" string which would match
the regular expression C</^[I#]?[rwa]\+?[bt]?$/>.

The C<'I'> prefix is used during creation of C<stdin>..C<stderr> via
special C<PerlIO_fdopen> calls; the C<'#'> prefix means that this is
C<sysopen> and that I<imode> and I<perm> should be passed to
C<PerlLIO_open3>; C<'r'> means B<r>ead, C<'w'> means B<w>rite and
C<'a'> means B<a>ppend.  The C<'+'> suffix means that both reading and
writing/appending are permitted.  The C<'b'> suffix means file should
be binary, and C<'t'> means it is text.  (Almost all layers should do
the IO in binary mode, and ignore the b/t bits.  The C<:crlf> layer
should be pushed to handle the distinction.)

If I<old> is not C<NULL> then this is a C<PerlIO_reopen>.  Perl itself
does not use this (yet?) and semantics are a little vague.

If I<fd> is not negative then it is the numeric file descriptor I<fd>,
which will be open in a manner compatible with the supplied mode
string; the call is thus equivalent to C<PerlIO_fdopen>.  In this case
I<narg> will be zero.
The file descriptor may have the close-on-exec flag either set or clear;
it is the responsibility of the layer that takes ownership of it to get
the flag into the correct state.

If I<narg> is greater than zero then it gives the number of arguments
passed to C<open>, otherwise it will be 1 if for example
C<PerlIO_open> was called.  In simple cases SvPV_nolen(*args) is the
pathname to open.

If a layer provides C<Open()> it should normally call the C<Open()>
method of next layer down (if any) and then push itself on top if that
succeeds.  C<PerlIOBase_open> is provided to do exactly that, so in
most cases you don't have to write your own C<Open()> method.  If this
method is not defined, other layers may have difficulty pushing
themselves on top of it during open.

If C<PerlIO_push> was performed and the open has failed, the layer must
C<PerlIO_pop> itself; otherwise the layer won't be removed
and may cause bad problems.

Returns C<NULL> on failure.

=item Binmode

 IV (*Binmode)(pTHX_ PerlIO *f);

Optional.  Used when C<:raw> layer is pushed (explicitly or as a result
of binmode(FH)).  If not present layer will be popped.  If present
should configure layer as binary (or pop itself) and return 0.
If it returns -1 for error C<binmode> will fail with layer
still on the stack.

=item Getarg

 SV * (*Getarg)(pTHX_ PerlIO *f,
                CLONE_PARAMS *param, int flags);

Optional.  If present should return an SV * representing the string
argument passed to the layer when it was
pushed.  e.g. ":encoding(ascii)" would return an SvPV with value
"ascii".  (I<param> and I<flags> arguments can be ignored in most
cases)

C<Dup> uses C<Getarg> to retrieve the argument originally passed to
C<Pushed>, so you must implement this function if your layer has an
extra argument to C<Pushed> and will ever be C<Dup>ed.

=item Fileno

 IV (*Fileno)(pTHX_ PerlIO *f);

Returns the Unix/Posix numeric file descriptor for the handle.  Normally
C<PerlIOBase_fileno()> (which just asks next layer down) will suffice
for this.

Returns -1 on error, which is considered to include the case where the
layer cannot provide such a file descriptor.

=item Dup

 PerlIO * (*Dup)(pTHX_ PerlIO *f, PerlIO *o,
                 CLONE_PARAMS *param, int flags);

XXX: Needs more docs.

Used as part of the "clone" process when a thread is spawned (in which
case param will be non-NULL) and when a stream is being duplicated via
'&' in the C<open>.

Similar to C<Open>, returns PerlIO* on success, C<NULL> on failure.

=item Read

 SSize_t (*Read)(pTHX_ PerlIO *f, void *vbuf, Size_t count);

Basic read operation.

Typically will call C<Fill> and manipulate pointers (possibly via the
API).  C<PerlIOBuf_read()> may be suitable for derived classes which
provide "fast gets" methods.

Returns actual bytes read, or -1 on an error.

=item Unread

 SSize_t (*Unread)(pTHX_ PerlIO *f,
                   const void *vbuf, Size_t count);

A superset of stdio's C<ungetc()>.  Should arrange for future reads to
see the bytes in C<vbuf>.  If there is no obviously better implementation
then C<PerlIOBase_unread()> provides the function by pushing a "fake"
"pending" layer above the calling layer.

Returns the number of unread chars.

=item Write

 SSize_t (*Write)(pTHX_ PerlIO *f, const void *vbuf, Size_t count);

Basic write operation.

Returns bytes written or -1 on an error.

=item Seek

 IV (*Seek)(pTHX_ PerlIO *f, Off_t offset, int whence);

Position the file pointer.  Should normally call its own C<Flush>
method and then the C<Seek> method of next layer down.

Returns 0 on success, -1 on failure.

=item Tell

 Off_t (*Tell)(pTHX_ PerlIO *f);

Return the file pointer.  May be based on the layer's cached concept of
position to avoid overhead.

Returns -1 on failure to get the file pointer.

=item Close

 IV (*Close)(pTHX_ PerlIO *f);

Close the stream.  Should normally call C<PerlIOBase_close()> to flush
itself and close layers below, and then deallocate any data structures
(buffers, translation tables, ...) not held directly in the data
structure.

Returns 0 on success, -1 on failure.

=item Flush

 IV (*Flush)(pTHX_ PerlIO *f);

Should make stream's state consistent with layers below.  That is, any
buffered write data should be written, and file position of lower layers
adjusted for data read from below but not actually consumed.
(Should perhaps C<Unread()> such data to the lower layer.)

Returns 0 on success, -1 on failure.

=item Fill

 IV (*Fill)(pTHX_ PerlIO *f);

The buffer for this layer should be filled (for read) from layer
below.  When you "subclass" PerlIOBuf layer, you want to use its
I<_read> method and to supply your own fill method, which fills the
PerlIOBuf's buffer.

Returns 0 on success, -1 on failure.

=item Eof

 IV (*Eof)(pTHX_ PerlIO *f);

Return end-of-file indicator.  C<PerlIOBase_eof()> is normally sufficient.

Returns 1 at end-of-file, 0 if not at end-of-file, -1 on error.

=item Error

 IV (*Error)(pTHX_ PerlIO *f);

Return error indicator.  C<PerlIOBase_error()> is normally sufficient.

Returns 1 if there is an error (usually when C<PERLIO_F_ERROR> is set),
0 otherwise.

=item Clearerr

 void (*Clearerr)(pTHX_ PerlIO *f);

Clear end-of-file and error indicators.  Should call C<PerlIOBase_clearerr()>
to set the C<PERLIO_F_XXXXX> flags, which may suffice.

=item Setlinebuf

 void (*Setlinebuf)(pTHX_ PerlIO *f);

Mark the stream as line buffered.  C<PerlIOBase_setlinebuf()> sets the
PERLIO_F_LINEBUF flag and is normally sufficient.

=item Get_base

 STDCHAR * (*Get_base)(pTHX_ PerlIO *f);

Allocate (if not already done so) the read buffer for this layer and
return pointer to it.  Return NULL on failure.

=item Get_bufsiz

 Size_t (*Get_bufsiz)(pTHX_ PerlIO *f);

Return the number of bytes that last C<Fill()> put in the buffer.

=item Get_ptr

 STDCHAR * (*Get_ptr)(pTHX_ PerlIO *f);

Return the current read pointer relative to this layer's buffer.

=item Get_cnt

 SSize_t (*Get_cnt)(pTHX_ PerlIO *f);

Return the number of bytes left to be read in the current buffer.

=item Set_ptrcnt

 void (*Set_ptrcnt)(pTHX_ PerlIO *f,
                    STDCHAR *ptr, SSize_t cnt);

Adjust the read pointer and count of bytes to match C<ptr> and/or C<cnt>.
The application (or layer above) must ensure they are consistent.
(Checking is allowed by the paranoid.)

=back

=head2 Utilities

To ask for the next layer down use PerlIONext(PerlIO *f).

To check that a PerlIO* is valid use PerlIOValid(PerlIO *f).  (All
this really does is check that the pointer is non-NULL and
that the pointer behind that is non-NULL.)

PerlIOBase(PerlIO *f) returns the "Base" pointer, or in other words,
the C<PerlIOl*> pointer.

PerlIOSelf(PerlIO* f, type) returns the PerlIOBase cast to a type.

Perl_PerlIO_or_Base(PerlIO* f, callback, base, failure, args) either
calls the I<callback> from the functions of the layer I<f> (just by
the name of the IO function, like "Read") with the I<args>, or if
there is no such callback, calls the I<base> version of the callback
with the same args, or if the f is invalid, sets errno to EBADF and
returns I<failure>.

Perl_PerlIO_or_fail(PerlIO* f, callback, failure, args) either calls
the I<callback> of the functions of the layer I<f> with the I<args>,
or if there is no such callback, sets errno to EINVAL.  Or if the f is
invalid, sets errno to EBADF and returns I<failure>.

Perl_PerlIO_or_Base_void(PerlIO* f, callback, base, args) either calls
the I<callback> of the functions of the layer I<f> with the I<args>,
or if there is no such callback, calls the I<base> version of the
callback with the same args, or if the f is invalid, sets errno to
EBADF.

Perl_PerlIO_or_fail_void(PerlIO* f, callback, args) either calls the
I<callback> of the functions of the layer I<f> with the I<args>, or if
there is no such callback, sets errno to EINVAL.  Or if the f is
invalid, sets errno to EBADF.
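
The dispatch rules of the "or_Base" and "or_fail" helpers can be
modelled in standalone C.  The sketch below is a toy under stated
assumptions, not the real macros: C<MiniLayer>, C<MiniFuncs>, C<MiniIO>
and every C<mini_> function are invented names, the table has a single
C<Tell> slot, and -1 is used as the I<failure> value.  It also assumes
that the missing-callback case of "or_fail" returns I<failure> after
setting errno.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

typedef struct MiniLayer MiniLayer;

/* One-slot function table; a NULL slot means "not implemented". */
typedef struct {
    long (*Tell)(MiniLayer *l);
} MiniFuncs;

struct MiniLayer {
    MiniLayer       *next;
    const MiniFuncs *tab;
    long             pos;
};

/* Like the application-level handle: a pointer to a layer pointer. */
typedef MiniLayer *MiniIO;

/* Valid means the handle pointer and the pointer behind it are non-NULL. */
static int mini_valid(MiniIO *f)
{
    return f != NULL && *f != NULL;
}

/* A layer's own method, and the "base class" fallback. */
static long mini_own_tell(MiniLayer *l)  { return l->pos; }
static long mini_base_tell(MiniLayer *l) { assert(l != NULL); return 0; }

/* "or_Base" flavour: own slot if present, else the base version;
 * an invalid handle sets errno to EBADF and returns failure. */
static long mini_tell_or_base(MiniIO *f)
{
    if (!mini_valid(f)) {
        errno = EBADF;
        return -1;
    }
    if ((*f)->tab != NULL && (*f)->tab->Tell != NULL)
        return (*f)->tab->Tell(*f);
    return mini_base_tell(*f);
}

/* "or_fail" flavour: own slot if present, else EINVAL;
 * an invalid handle sets errno to EBADF. Both return failure. */
static long mini_tell_or_fail(MiniIO *f)
{
    if (!mini_valid(f)) {
        errno = EBADF;
        return -1;
    }
    if ((*f)->tab != NULL && (*f)->tab->Tell != NULL)
        return (*f)->tab->Tell(*f);
    errno = EINVAL;
    return -1;
}
```

A layer whose table fills the C<Tell> slot with C<mini_own_tell> is
answered by that function in both flavours.  A layer that leaves the
slot NULL gets C<mini_base_tell> from the first flavour and an EINVAL
failure from the second.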

=head2 Implementing PerlIO Layers

If you find the implementation document unclear or not sufficient,
look at the existing PerlIO layer implementations, which include:

=over

=item * C implementations

The F<perlio.c> and F<perliol.h> in the Perl core implement the
"unix", "perlio", "stdio", "crlf", "utf8", "byte", "raw", "pending"
layers, and also the "mmap" and "win32" layers if applicable.
(The "win32" layer is currently unfinished and unused; to see what is
used instead in Win32, see L<PerlIO/"Querying the layers of filehandles">.)

PerlIO::encoding, PerlIO::scalar, PerlIO::via in the Perl core.

PerlIO::gzip and APR::PerlIO (mod_perl 2.0) on CPAN.

=item * Perl implementations

PerlIO::via::QuotedPrint in the Perl core and PerlIO::via::* on CPAN.

=back

If you are creating a PerlIO layer, you may want to be lazy, in other
words, implement only the methods that interest you.  The other methods
you can either replace with the "blank" methods

    PerlIOBase_noop_ok
    PerlIOBase_noop_fail

(which do nothing, and return zero and -1, respectively) or for
certain methods you may assume a default behaviour by using a NULL
method.  The Open method looks for help in the 'parent' layer.
The following table summarizes the behaviour:

    method      behaviour with NULL

    Clearerr    PerlIOBase_clearerr
    Close       PerlIOBase_close
    Dup         PerlIOBase_dup
    Eof         PerlIOBase_eof
    Error       PerlIOBase_error
    Fileno      PerlIOBase_fileno
    Fill        FAILURE
    Flush       SUCCESS
    Getarg      SUCCESS
    Get_base    FAILURE
    Get_bufsiz  FAILURE
    Get_cnt     FAILURE
    Get_ptr     FAILURE
    Open        INHERITED
    Popped      SUCCESS
    Pushed      SUCCESS
    Read        PerlIOBase_read
    Seek        FAILURE
    Set_cnt     FAILURE
    Set_ptrcnt  FAILURE
    Setlinebuf  PerlIOBase_setlinebuf
    Tell        FAILURE
    Unread      PerlIOBase_unread
    Write       FAILURE

 FAILURE        Set errno (to EINVAL in Unixish, to LIB$_INVARG in VMS)
                and return -1 (for numeric return values) or NULL (for
                pointers)
 INHERITED      Inherited from the layer below
 SUCCESS        Return 0 (for numeric return values) or a pointer

=head2 Core Layers

The file C<perlio.c> provides the following layers:

=over 4

=item "unix"

A basic non-buffered layer which calls Unix/POSIX C<read()>, C<write()>,
C<lseek()>, C<close()>.  No buffering.  Even on platforms that distinguish
between O_TEXT and O_BINARY this layer is always O_BINARY.

=item "perlio"

A very complete generic buffering layer which provides the whole of
PerlIO API.  It is also intended to be used as a "base class" for other
layers.  (For example its C<Read()> method is implemented in terms of
the C<Get_cnt()>/C<Get_ptr()>/C<Set_ptrcnt()> methods).

"perlio" over "unix" provides a complete replacement for stdio as seen
via PerlIO API.  This is the default for USE_PERLIO when system's stdio
does not permit perl's "fast gets" access, and which do not
distinguish between C<O_TEXT> and C<O_BINARY>.

=item "stdio"

A layer which provides the PerlIO API via the layer scheme, but
implements it by calling system's stdio.
This is (currently) the default if the system's stdio provides
sufficient access to allow perl's "fast gets" access and the system
does not distinguish between C<O_TEXT> and C<O_BINARY>.

=item "crlf"

A layer derived using "perlio" as a base class. It provides Win32-like
"\n" to CR,LF translation. It can either be applied above "perlio" or
serve as the buffer layer itself. "crlf" over "unix" is the default if
the system distinguishes between C<O_TEXT> and C<O_BINARY> opens. (At
some point "unix" will be replaced by a "native" Win32 IO layer on that
platform, as Win32's read/write layer has various drawbacks.) The "crlf"
layer is a reasonable model for a layer which transforms data in some way.

=item "mmap"

If Configure detects C<mmap()> functions this layer is provided (with
"perlio" as a "base") which does "read" operations by mmap()ing the
file. Performance improvement is marginal on modern systems, so it is
mainly there as a proof of concept. It is likely to be unbundled from
the core at some point. The "mmap" layer is a reasonable model for a
minimalist "derived" layer.

=item "pending"

An "internal" derivative of "perlio" which can be used to provide an
Unread() function for layers which have no buffer or cannot be
bothered. (Basically this layer's C<Fill()> pops itself off the stack
and so resumes reading from the layer below.)

=item "raw"

A dummy layer which never exists on the layer stack. Instead, when
"pushed" it pops itself off the stack and then calls the Binmode
function table entry on all the layers in the stack. Normally this
(via PerlIOBase_binmode) removes any layers which do not have the
C<PERLIO_K_RAW> bit set. Layers can modify that behaviour by defining
their own Binmode entry.

=item "utf8"

Another dummy layer.
When pushed it pops itself and sets the
C<PERLIO_F_UTF8> flag on the layer which was (and now is once more)
the top of the stack.

=back

In addition F<perlio.c> also provides a number of C<PerlIOBase_xxxx()>
functions which are intended to be used in the table slots of classes
which do not need to do anything special for a particular method.

=head2 Extension Layers

Layers can be made available by extension modules. When an unknown layer
is encountered the PerlIO code will perform the equivalent of:

    use PerlIO 'layer';

Where I<layer> is the unknown layer. F<PerlIO.pm> will then attempt to:

    require PerlIO::layer;

If after that process the layer is still not defined then the C<open>
will fail.

The following extension layers are bundled with perl:

=over 4

=item ":encoding"

    use PerlIO::encoding;

makes this layer available, although F<PerlIO.pm> "knows" where to
find it. It is an example of a layer which takes an argument, as it is
called thus:

    open( $fh, "<:encoding(iso-8859-7)", $pathname );

=item ":scalar"

Provides support for reading data from and writing data to a scalar.

    open( $fh, "+<:scalar", \$scalar );

When a handle is opened this way, reads get bytes from the string value
of I<$scalar>, and writes change the value. In both cases the position
in I<$scalar> starts at zero but can be altered via C<seek>, and
determined via C<tell>.

Please note that this layer is implied when calling open() thus:

    open( $fh, "+<", \$scalar );

=item ":via"

Provided to allow layers to be implemented as Perl code. For instance:

    use PerlIO::via::StripHTML;
    open( my $fh, "<:via(StripHTML)", "index.html" );

See L<PerlIO::via> for details.

=back

=head1 TODO

Things that need to be done to improve this document.

=over

=item *

Explain how to make a valid fh without going through open() (i.e. apply
a layer). For example if the file is not opened through perl, but we
want to get back a fh, as if it had been opened by Perl.

How does PerlIO_apply_layera fit in, where are its docs, and was it
made public?

Currently the example could be something like this:

    PerlIO *foo_to_PerlIO(pTHX_ char *mode, ...)
    {
        /* mode is "w", "r", etc */
        const char *layers = ":APR"; /* the layer name */
        PerlIO *f = PerlIO_allocate(aTHX);
        if (!f) {
            return NULL;
        }

        PerlIO_apply_layers(aTHX_ f, mode, layers);

        if (f) {
            PerlIOAPR *st = PerlIOSelf(f, PerlIOAPR);
            /* fill in the st struct, as in _open() */
            /* "file" is assumed to come from the elided arguments */
            st->file = file;
            PerlIOBase(f)->flags |= PERLIO_F_OPEN;

            return f;
        }
        return NULL;
    }

=item *

Fix/add the documentation in places marked as XXX.

=item *

The handling of errors by the layer is not specified, e.g. when $!
should be set explicitly, and when the error handling should just be
delegated to the top layer.

Probably give some hints on using SETERRNO() or pointers to where they
can be found.

=item *

I think it would help to give some concrete examples to make it easier
to understand the API. Of course I agree that the API has to be
concise, but since there is no second document that is more of a
guide, I think it would be easier for a person who is not a PerlIO guru
(yet) to start with an API document that has examples in the places
where things are unclear.

=back

=cut