=head1 NAME

perliol - C API for Perl's implementation of IO in Layers.

=head1 SYNOPSIS

    /* Defining a layer ... */
    #include <perliol.h>

=head1 DESCRIPTION

This document describes the behavior and implementation of the PerlIO
abstraction described in L<perlapio> when C<USE_PERLIO> is defined.

=head2 History and Background

The PerlIO abstraction was introduced in perl5.003_02 but languished as
just an abstraction until perl5.7.0. However during that time a number
of perl extensions switched to using it, so the API is mostly fixed to
maintain (source) compatibility.

The aim of the implementation is to provide the PerlIO API in a flexible
and platform neutral manner. It is also a trial of an "Object Oriented
C, with vtables" approach which may be applied to Raku.

=head2 Basic Structure

PerlIO is a stack of layers.

The low levels of the stack work with the low-level operating system
calls (file descriptors in C) getting bytes in and out, the higher
layers of the stack buffer, filter, and otherwise manipulate the I/O,
and return characters (or bytes) to Perl. Terms I<above> and I<below>
are used to refer to the relative positioning of the stack layers.

A layer contains a "vtable", the table of I/O operations (at C level
a table of function pointers), and status flags. The functions in the
vtable implement operations like "open", "read", and "write".

When I/O, for example "read", is requested, the request goes from Perl
first down the stack using "read" functions of each layer, then at the
bottom the input is requested from the operating system services, then
the result is returned up the stack, finally being interpreted as Perl
data.

The requests do not necessarily go always all the way down to the
operating system: that's where PerlIO buffering comes into play.
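
The request flow described above can be sketched with a toy model in
plain C. This is not the PerlIO implementation: the names C<Layer>,
C<LayerFuncs>, C<os_read>, C<buf_read> and C<stack_read> are
hypothetical, and only a "read" slot is modelled. It shows a buffering
layer refilling from the layer below only when its own buffer is empty,
so that most reads never reach the bottom of the stack.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Toy model only: none of these names exist in perliol.h. */
typedef struct Layer Layer;

typedef struct {
    const char *name;
    /* Read up to count bytes into vbuf; returns the bytes read. */
    ptrdiff_t (*read)(Layer *self, void *vbuf, size_t count);
} LayerFuncs;

struct Layer {
    Layer            *next;     /* lower layer, NULL at the bottom */
    const LayerFuncs *tab;      /* function table for this layer */
    /* Per-instance state, kept inline to keep the sketch short. */
    const char *src;            /* bottom layer: the "file" contents */
    size_t      src_len, src_pos;
    char        buf[8];         /* buffering layer: a small buffer */
    size_t      buf_len, buf_pos;
    int         calls;          /* times this layer's read was entered */
};

/* Bottom layer: stands in for the operating system. */
static ptrdiff_t os_read(Layer *self, void *vbuf, size_t count)
{
    size_t left = self->src_len - self->src_pos;
    size_t n = count < left ? count : left;
    self->calls++;
    memcpy(vbuf, self->src + self->src_pos, n);
    self->src_pos += n;
    return (ptrdiff_t)n;
}

/* Buffering layer: asks the layer below only when its buffer is empty. */
static ptrdiff_t buf_read(Layer *self, void *vbuf, size_t count)
{
    char *out = vbuf;
    size_t done = 0;
    self->calls++;
    while (done < count) {
        if (self->buf_pos == self->buf_len) {
            ptrdiff_t got = self->next->tab->read(self->next, self->buf,
                                                  sizeof self->buf);
            if (got <= 0)
                break;          /* nothing more below */
            self->buf_len = (size_t)got;
            self->buf_pos = 0;
        }
        out[done++] = self->buf[self->buf_pos++];
    }
    return (ptrdiff_t)done;
}

static const LayerFuncs os_funcs  = { "os",  os_read  };
static const LayerFuncs buf_funcs = { "buf", buf_read };

/* Enter the stack at the top layer, as the caller above it would. */
static ptrdiff_t stack_read(Layer *top, void *vbuf, size_t count)
{
    return top->tab->read(top, vbuf, count);
}
```

With a C<buf> layer stacked on an C<os> layer, two short calls to
C<stack_read> enter C<os_read> only once; the second call is served
from the buffer.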

When you do an open() and specify extra PerlIO layers to be deployed,
the layers you specify are "pushed" on top of the already existing
default stack. One way to see it is that "operating system is
on the left" and "Perl is on the right".

What exact layers are in this default stack depends on a lot of
things: your operating system, Perl version, Perl compile time
configuration, and Perl runtime configuration. See L<PerlIO>,
L<perlrun/PERLIO>, and L<open> for more information.

binmode() operates similarly to open(): by default the specified
layers are pushed on top of the existing stack.

However, note that even as the specified layers are "pushed on top"
for open() and binmode(), this doesn't mean that the effects are
limited to the "top": PerlIO layers can be very 'active' and inspect
and affect layers also deeper in the stack. As an example there
is a layer called "raw" which repeatedly "pops" layers until
it reaches the first layer that has declared itself capable of
handling binary data. The "pushed" layers are processed in left-to-right
order.

sysopen() operates (unsurprisingly) at a lower level in the stack than
open(). For example in Unix or Unix-like systems sysopen() operates
directly at the level of file descriptors: in the terms of PerlIO
layers, it uses only the "unix" layer, which is a rather thin wrapper
on top of the Unix file descriptors.

=head2 Layers vs Disciplines

Initial discussion of the ability to modify IO streams behaviour used
the term "discipline" for the entities which were added. This came (I
believe) from the use of the term in "sfio", which in turn borrowed it
from "line disciplines" on Unix terminals. However, this document (and
the C code) uses the term "layer".

This is, I hope, a natural term given the implementation, and should
avoid connotations that are inherent in earlier uses of "discipline"
for things which are rather different.

=head2 Data Structures

The basic data structure is a PerlIOl:

    typedef struct _PerlIO PerlIOl;
    typedef struct _PerlIO_funcs PerlIO_funcs;
    typedef PerlIOl *PerlIO;

    struct _PerlIO
    {
        PerlIOl *      next;   /* Lower layer */
        PerlIO_funcs * tab;    /* Functions for this layer */
        U32            flags;  /* Various flags for state */
    };

A C<PerlIOl *> is a pointer to the struct, and the I<application>
level C<PerlIO *> is a pointer to a C<PerlIOl *> - i.e. a pointer
to a pointer to the struct. This allows the application level C<PerlIO *>
to remain constant while the actual C<PerlIOl *> underneath
changes. (Compare perl's C<SV *> which remains constant while its
C<sv_any> field changes as the scalar's type changes.) An IO stream is
then in general represented as a pointer to this linked-list of
"layers".

It should be noted that because of the double indirection in a C<PerlIO *>,
a C<< &(perlio->next) >> "is" a C<PerlIO *>, and so to some degree
at least one layer can use the "standard" API on the next layer down.

A "layer" is composed of two parts:

=over 4

=item 1.

The functions and attributes of the "layer class".

=item 2.

The per-instance data for a particular handle.

=back

=head2 Functions and Attributes

The functions and attributes are accessed via the "tab" (for table)
member of C<PerlIOl>. The functions (methods of the layer "class") are
fixed, and are defined by the C<PerlIO_funcs> type.
They are broadly the
same as the public C<PerlIO_xxxxx> functions:

    struct _PerlIO_funcs
    {
        Size_t     fsize;
        char *     name;
        Size_t     size;
        IV         kind;
        IV         (*Pushed)(pTHX_ PerlIO *f,
                             const char *mode,
                             SV *arg,
                             PerlIO_funcs *tab);
        IV         (*Popped)(pTHX_ PerlIO *f);
        PerlIO *   (*Open)(pTHX_ PerlIO_funcs *tab,
                           PerlIO_list_t *layers, IV n,
                           const char *mode,
                           int fd, int imode, int perm,
                           PerlIO *old,
                           int narg, SV **args);
        IV         (*Binmode)(pTHX_ PerlIO *f);
        SV *       (*Getarg)(pTHX_ PerlIO *f, CLONE_PARAMS *param, int flags);
        IV         (*Fileno)(pTHX_ PerlIO *f);
        PerlIO *   (*Dup)(pTHX_ PerlIO *f,
                          PerlIO *o,
                          CLONE_PARAMS *param,
                          int flags);
        /* Unix-like functions - cf sfio line disciplines */
        SSize_t    (*Read)(pTHX_ PerlIO *f, void *vbuf, Size_t count);
        SSize_t    (*Unread)(pTHX_ PerlIO *f, const void *vbuf, Size_t count);
        SSize_t    (*Write)(pTHX_ PerlIO *f, const void *vbuf, Size_t count);
        IV         (*Seek)(pTHX_ PerlIO *f, Off_t offset, int whence);
        Off_t      (*Tell)(pTHX_ PerlIO *f);
        IV         (*Close)(pTHX_ PerlIO *f);
        /* Stdio-like buffered IO functions */
        IV         (*Flush)(pTHX_ PerlIO *f);
        IV         (*Fill)(pTHX_ PerlIO *f);
        IV         (*Eof)(pTHX_ PerlIO *f);
        IV         (*Error)(pTHX_ PerlIO *f);
        void       (*Clearerr)(pTHX_ PerlIO *f);
        void       (*Setlinebuf)(pTHX_ PerlIO *f);
        /* Perl's snooping functions */
        STDCHAR *  (*Get_base)(pTHX_ PerlIO *f);
        Size_t     (*Get_bufsiz)(pTHX_ PerlIO *f);
        STDCHAR *  (*Get_ptr)(pTHX_ PerlIO *f);
        SSize_t    (*Get_cnt)(pTHX_ PerlIO *f);
        void       (*Set_ptrcnt)(pTHX_ PerlIO *f,STDCHAR *ptr,SSize_t cnt);
    };

The first few members of the struct give a function table size for
compatibility check, the "name" for the layer, the size to C<malloc>
for the per-instance data, and some flags which are attributes of the
class as a whole (such as whether it is a buffering layer). Then follow
the functions, which fall into four basic groups:

=over 4

=item
1.

Opening and setup functions

=item 2.

Basic IO operations

=item 3.

Stdio class buffering options.

=item 4.

Functions to support Perl's traditional "fast" access to the buffer.

=back

A layer does not have to implement all the functions, but the whole
table has to be present. Unimplemented slots can be NULL (which will
result in an error when called) or can be filled in with stubs to
"inherit" behaviour from a "base class". This "inheritance" is fixed
for all instances of the layer, but as the layer chooses which stubs
to populate the table, limited "multiple inheritance" is possible.

=head2 Per-instance Data

The per-instance data are held in memory beyond the basic PerlIOl
struct, by making a PerlIOl the first member of the layer's struct
thus:

    typedef struct
    {
        struct _PerlIO base;       /* Base "class" info */
        STDCHAR *      buf;        /* Start of buffer */
        STDCHAR *      end;        /* End of valid part of buffer */
        STDCHAR *      ptr;        /* Current position in buffer */
        Off_t          posn;       /* Offset of buf into the file */
        Size_t         bufsiz;     /* Real size of buffer */
        IV             oneword;    /* Emergency buffer */
    } PerlIOBuf;

In this way (as for perl's scalars) a pointer to a PerlIOBuf can be
treated as a pointer to a PerlIOl.

=head2 Layers in action.

                 table           perlio          unix
             |           |
             +-----------+    +----------+    +--------+
    PerlIO ->|           |--->|  next    |--->|  NULL  |
             +-----------+    +----------+    +--------+
             |           |    |  buffer  |    |   fd   |
             +-----------+    |          |    +--------+
             |           |    +----------+


The above attempts to show how the layer scheme works in a simple case.
The application's C<PerlIO *> points to an entry in the table(s)
representing open (allocated) handles. For example the first three slots
in the table correspond to C<stdin>, C<stdout> and C<stderr>.
The table
in turn points to the current "top" layer for the handle - in this case
an instance of the generic buffering layer "perlio". That layer in turn
points to the next layer down - in this case the low-level "unix" layer.

The above is roughly equivalent to a "stdio" buffered stream, but with
much more flexibility:

=over 4

=item *

If Unix level C<read>/C<write>/C<lseek> is not appropriate for (say)
sockets then the "unix" layer can be replaced (at open time or even
dynamically) with a "socket" layer.

=item *

Different handles can have different buffering schemes. The "top"
layer could be the "mmap" layer if reading disk files was quicker
using C<mmap> than C<read>. An "unbuffered" stream can be implemented
simply by not having a buffer layer.

=item *

Extra layers can be inserted to process the data as it flows through.
This was the driving need for including the scheme in perl 5.7.0+ - we
needed a mechanism to allow data to be translated between perl's
internal encoding (conceptually at least Unicode as UTF-8), and the
"native" format used by the system. This is provided by the
":encoding(xxxx)" layer which typically sits above the buffering layer.

=item *

A layer can be added that does "\n" to CRLF translation. This layer
can be used on any platform, not just those that normally do such
things.

=back

=head2 Per-instance flag bits

The generic flag bits are a hybrid of C<O_XXXXX> style flags deduced
from the mode string passed to C<PerlIO_open()>, and state bits for
typical buffer layers.

=over 4

=item PERLIO_F_EOF

End of file.

=item PERLIO_F_CANWRITE

Writes are permitted, i.e. opened as "w" or "r+" or "a", etc.

=item PERLIO_F_CANREAD

Reads are permitted i.e. opened "r" or "w+" (or even "a+" - ick).

=item PERLIO_F_ERROR

An error has occurred (for C<PerlIO_error()>).

=item PERLIO_F_TRUNCATE

Truncate file suggested by open mode.

=item PERLIO_F_APPEND

All writes should be appends.

=item PERLIO_F_CRLF

Layer is performing Win32-like "\n" mapped to CR,LF for output and CR,LF
mapped to "\n" for input. Normally the provided "crlf" layer is the only
layer that need bother about this. C<PerlIO_binmode()> will mess with this
flag rather than add/remove layers if the C<PERLIO_K_CANCRLF> bit is set
for the layer's class.

=item PERLIO_F_UTF8

Data written to this layer should be UTF-8 encoded; data provided
by this layer should be considered UTF-8 encoded. Can be set on any layer
by the ":utf8" dummy layer. Also set on the ":encoding" layer.

=item PERLIO_F_UNBUF

Layer is unbuffered - i.e. write to next layer down should occur for
each write to this layer.

=item PERLIO_F_WRBUF

The buffer for this layer currently holds data written to it but not sent
to next layer.

=item PERLIO_F_RDBUF

The buffer for this layer currently holds unconsumed data read from
layer below.

=item PERLIO_F_LINEBUF

Layer is line buffered. Write data should be passed to next layer down
whenever a "\n" is seen. Any data beyond the "\n" should then be
processed.

=item PERLIO_F_TEMP

File has been C<unlink()>ed, or should be deleted on C<close()>.

=item PERLIO_F_OPEN

Handle is open.

=item PERLIO_F_FASTGETS

This instance of this layer supports the "fast C<gets>" interface.
Normally set based on C<PERLIO_K_FASTGETS> for the class and by the
existence of the function(s) in the table. However a class that
normally provides that interface may need to avoid it on a
particular instance. The "pending" layer needs to do this when
it is pushed above a layer which does not support the interface.
(Perl's C<sv_gets()> does not expect the stream's fast C<gets> behaviour
to change during one "get".)

=back

=head2 Methods in Detail

=over 4

=item fsize

    Size_t fsize;

Size of the function table. This is compared against the value PerlIO
code "knows" as a compatibility check. Future versions I<may> be able
to tolerate layers compiled against an old version of the headers.

=item name

    char * name;

The name of the layer whose open() method Perl should invoke on
open(). For example if the layer is called APR, you will call:

    open $fh, ">:APR", ...

and Perl knows that it has to invoke the PerlIOAPR_open() method
implemented by the APR layer.

=item size

    Size_t size;

The size of the per-instance data structure, e.g.:

    sizeof(PerlIOAPR)

If this field is zero then C<PerlIO_pushed> does not malloc anything
and assumes the layer's Pushed function will do any required layer stack
manipulation - used to avoid malloc/free overhead for dummy layers.
If the field is non-zero it must be at least the size of C<PerlIOl>;
C<PerlIO_pushed> will allocate memory for the layer's data structures
and link the new layer onto the stream's stack. (If the layer's Pushed
method returns an error indication the layer is popped again.)

=item kind

    IV kind;

=over 4

=item * PERLIO_K_BUFFERED

The layer is buffered.

=item * PERLIO_K_RAW

The layer is acceptable to have in a binmode(FH) stack - i.e. it does not
(or will configure itself not to) transform bytes passing through it.

=item * PERLIO_K_CANCRLF

Layer can translate between "\n" and CRLF line ends.

=item * PERLIO_K_FASTGETS

Layer allows buffer snooping.

=item * PERLIO_K_MULTIARG

Used when the layer's open() accepts more arguments than usual. The
extra arguments should not come before the C<MODE> argument.
When this
flag is used it's up to the layer to validate the args.

=back

=item Pushed

    IV (*Pushed)(pTHX_ PerlIO *f, const char *mode,
                 SV *arg, PerlIO_funcs *tab);

The only absolutely mandatory method. Called when the layer is pushed
onto the stack. The C<mode> argument may be NULL if this occurs
post-open. The C<arg> will be non-C<NULL> if an argument string was
passed. In most cases this should call C<PerlIOBase_pushed()> to
convert C<mode> into the appropriate C<PERLIO_F_XXXXX> flags in
addition to any actions the layer itself takes. If a layer is not
expecting an argument it need neither save the one passed to it, nor
provide C<Getarg()> (it could perhaps C<Perl_warn> that the argument
was unexpected).

Returns 0 on success. On failure returns -1 and should set errno.

=item Popped

    IV (*Popped)(pTHX_ PerlIO *f);

Called when the layer is popped from the stack. A layer will normally
be popped after C<Close()> is called. But a layer can be popped
without being closed if the program is dynamically managing layers on
the stream. In such cases C<Popped()> should free any resources
(buffers, translation tables, ...) not held directly in the layer's
struct. It should also C<Unread()> any unconsumed data that has been
read and buffered from the layer below back to that layer, so that it
can be re-provided to whatever is now above.

Returns 0 on both success and failure. If C<Popped()> returns I<true> then
I<perlio.c> assumes that either the layer has popped itself, or the
layer is super special and needs to be retained for other reasons.
In most cases it should return I<false>.

=item Open

    PerlIO * (*Open)(...);

The C<Open()> method has lots of arguments because it combines the
functions of perl's C<open>, C<PerlIO_open>, perl's C<sysopen>,
C<PerlIO_fdopen> and C<PerlIO_reopen>.
The full prototype is as
follows:

    PerlIO * (*Open)(pTHX_ PerlIO_funcs *tab,
                     PerlIO_list_t *layers, IV n,
                     const char *mode,
                     int fd, int imode, int perm,
                     PerlIO *old,
                     int narg, SV **args);

Open should (perhaps indirectly) call C<PerlIO_allocate()> to allocate
a slot in the table and associate it with the layers information for
the opened file, by calling C<PerlIO_push>. The I<layers> is an
array of all the layers destined for the C<PerlIO *>, and any
arguments passed to them, I<n> is the index into that array of the
layer being called. The macro C<PerlIOArg> will return a (possibly
C<NULL>) SV * for the argument passed to the layer.

Where a layer opens or takes ownership of a file descriptor, that layer is
responsible for getting the file descriptor's close-on-exec flag into the
correct state. The flag should be clear for a file descriptor numbered
less than or equal to C<PL_maxsysfd>, and set for any file descriptor
numbered higher. For thread safety, when a layer opens a new file
descriptor it should if possible open it with the close-on-exec flag
initially set.

The I<mode> string is an "C<fopen()>-like" string which would match
the regular expression C</^[I#]?[rwa]\+?[bt]?$/>.

The C<'I'> prefix is used during creation of C<stdin>..C<stderr> via
special C<PerlIO_fdopen> calls; the C<'#'> prefix means that this is
C<sysopen> and that I<imode> and I<perm> should be passed to
C<PerlLIO_open3>; C<'r'> means B<r>ead, C<'w'> means B<w>rite and
C<'a'> means B<a>ppend. The C<'+'> suffix means that both reading and
writing/appending are permitted. The C<'b'> suffix means file should
be binary, and C<'t'> means it is text. (Almost all layers should do
the IO in binary mode, and ignore the b/t bits. The C<:crlf> layer
should be pushed to handle the distinction.)

If I<old> is not C<NULL> then this is a C<PerlIO_reopen>.
Perl itself
does not use this (yet?) and semantics are a little vague.

If I<fd> is not negative then it is the numeric file descriptor I<fd>,
which will be open in a manner compatible with the supplied mode
string; the call is thus equivalent to C<PerlIO_fdopen>. In this case
I<narg> will be zero.
The file descriptor may have the close-on-exec flag either set or clear;
it is the responsibility of the layer that takes ownership of it to get
the flag into the correct state.

If I<narg> is greater than zero then it gives the number of arguments
passed to C<open>, otherwise it will be 1 if for example
C<PerlIO_open> was called. In simple cases SvPV_nolen(*args) is the
pathname to open.

If a layer provides C<Open()> it should normally call the C<Open()>
method of next layer down (if any) and then push itself on top if that
succeeds. C<PerlIOBase_open> is provided to do exactly that, so in
most cases you don't have to write your own C<Open()> method. If this
method is not defined, other layers may have difficulty pushing
themselves on top of it during open.

If C<PerlIO_push> was performed and open has failed, the layer must
C<PerlIO_pop> itself; otherwise it will not be removed
and may cause bad problems.

Returns C<NULL> on failure.

=item Binmode

    IV (*Binmode)(pTHX_ PerlIO *f);

Optional. Used when C<:raw> layer is pushed (explicitly or as a result
of binmode(FH)). If not present layer will be popped. If present
should configure layer as binary (or pop itself) and return 0.
If it returns -1 for error C<binmode> will fail with layer
still on the stack.

=item Getarg

    SV * (*Getarg)(pTHX_ PerlIO *f,
                   CLONE_PARAMS *param, int flags);

Optional. If present should return an SV * representing the string
argument passed to the layer when it was
pushed. e.g. ":encoding(ascii)" would return an SvPV with value
"ascii".
(I<param> and I<flags> arguments can be ignored in most
cases)

C<Dup> uses C<Getarg> to retrieve the argument originally passed to
C<Pushed>, so you must implement this function if your layer has an
extra argument to C<Pushed> and will ever be C<Dup>ed.

=item Fileno

    IV (*Fileno)(pTHX_ PerlIO *f);

Returns the Unix/Posix numeric file descriptor for the handle. Normally
C<PerlIOBase_fileno()> (which just asks next layer down) will suffice
for this.

Returns -1 on error, which is considered to include the case where the
layer cannot provide such a file descriptor.

=item Dup

    PerlIO * (*Dup)(pTHX_ PerlIO *f, PerlIO *o,
                    CLONE_PARAMS *param, int flags);

XXX: Needs more docs.

Used as part of the "clone" process when a thread is spawned (in which
case param will be non-NULL) and when a stream is being duplicated via
'&' in the C<open>.

Similar to C<Open>, returns PerlIO* on success, C<NULL> on failure.

=item Read

    SSize_t (*Read)(pTHX_ PerlIO *f, void *vbuf, Size_t count);

Basic read operation.

Typically will call C<Fill> and manipulate pointers (possibly via the
API). C<PerlIOBuf_read()> may be suitable for derived classes which
provide "fast gets" methods.

Returns actual bytes read, or -1 on an error.

=item Unread

    SSize_t (*Unread)(pTHX_ PerlIO *f,
                      const void *vbuf, Size_t count);

A superset of stdio's C<ungetc()>. Should arrange for future reads to
see the bytes in C<vbuf>. If there is no obviously better implementation
then C<PerlIOBase_unread()> provides the function by pushing a "fake"
"pending" layer above the calling layer.

Returns the number of unread chars.

=item Write

    SSize_t (*Write)(pTHX_ PerlIO *f, const void *vbuf, Size_t count);

Basic write operation.

Returns bytes written or -1 on an error.

=item Seek

    IV (*Seek)(pTHX_ PerlIO *f, Off_t offset, int whence);

Position the file pointer. Should normally call its own C<Flush>
method and then the C<Seek> method of next layer down.

Returns 0 on success, -1 on failure.

=item Tell

    Off_t (*Tell)(pTHX_ PerlIO *f);

Return the file pointer. May be based on the layer's cached concept of
position to avoid overhead.

Returns -1 on failure to get the file pointer.

=item Close

    IV (*Close)(pTHX_ PerlIO *f);

Close the stream. Should normally call C<PerlIOBase_close()> to flush
itself and close layers below, and then deallocate any data structures
(buffers, translation tables, ...) not held directly in the data
structure.

Returns 0 on success, -1 on failure.

=item Flush

    IV (*Flush)(pTHX_ PerlIO *f);

Should make stream's state consistent with layers below. That is, any
buffered write data should be written, and file position of lower layers
adjusted for data read from below but not actually consumed.
(Should perhaps C<Unread()> such data to the lower layer.)

Returns 0 on success, -1 on failure.

=item Fill

    IV (*Fill)(pTHX_ PerlIO *f);

The buffer for this layer should be filled (for read) from layer
below. When you "subclass" PerlIOBuf layer, you want to use its
I<_read> method and to supply your own fill method, which fills the
PerlIOBuf's buffer.

Returns 0 on success, -1 on failure.

=item Eof

    IV (*Eof)(pTHX_ PerlIO *f);

Return end-of-file indicator. C<PerlIOBase_eof()> is normally sufficient.

Returns 0 on end-of-file, 1 if not end-of-file, -1 on error.

=item Error

    IV (*Error)(pTHX_ PerlIO *f);

Return error indicator. C<PerlIOBase_error()> is normally sufficient.

Returns 1 if there is an error (usually when C<PERLIO_F_ERROR> is set),
0 otherwise.

=item Clearerr

    void (*Clearerr)(pTHX_ PerlIO *f);

Clear end-of-file and error indicators. Should call C<PerlIOBase_clearerr()>
to set the C<PERLIO_F_XXXXX> flags, which may suffice.

=item Setlinebuf

    void (*Setlinebuf)(pTHX_ PerlIO *f);

Mark the stream as line buffered. C<PerlIOBase_setlinebuf()> sets the
PERLIO_F_LINEBUF flag and is normally sufficient.

=item Get_base

    STDCHAR * (*Get_base)(pTHX_ PerlIO *f);

Allocate (if not already done) the read buffer for this layer and
return a pointer to it. Return NULL on failure.

=item Get_bufsiz

    Size_t (*Get_bufsiz)(pTHX_ PerlIO *f);

Return the number of bytes that last C<Fill()> put in the buffer.

=item Get_ptr

    STDCHAR * (*Get_ptr)(pTHX_ PerlIO *f);

Return the current read pointer relative to this layer's buffer.

=item Get_cnt

    SSize_t (*Get_cnt)(pTHX_ PerlIO *f);

Return the number of bytes left to be read in the current buffer.

=item Set_ptrcnt

    void (*Set_ptrcnt)(pTHX_ PerlIO *f,
                       STDCHAR *ptr, SSize_t cnt);

Adjust the read pointer and count of bytes to match C<ptr> and/or C<cnt>.
The application (or layer above) must ensure they are consistent.
(Checking is allowed by the paranoid.)

=back

=head2 Utilities

To ask for the next layer down use PerlIONext(PerlIO *f).

To check that a PerlIO* is valid use PerlIOValid(PerlIO *f). (All
this really does is check that the pointer is non-NULL and
that the pointer behind that is non-NULL.)

PerlIOBase(PerlIO *f) returns the "Base" pointer, or in other words,
the C<PerlIOl*> pointer.

PerlIOSelf(PerlIO* f, type) returns the PerlIOBase cast to a type.
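
As an illustration of the double indirection behind these macros, here
is a small self-contained model. The C<ToyIO> names are stand-ins
invented for this sketch, and the macro bodies are assumptions based
only on the descriptions above; consult F<perliol.h> for the real
definitions.

```c
#include <assert.h>
#include <stddef.h>

/* Toy stand-ins; the real types and macros live in perliol.h. */
typedef struct ToyIOl ToyIOl;
typedef ToyIOl *ToyIO;            /* mirrors "typedef PerlIOl *PerlIO" */

struct ToyIOl {
    ToyIOl     *next;             /* lower layer */
    const char *name;             /* stands in for the "tab" pointer */
    unsigned    flags;
};

/* A derived layer: the base struct is the first member. */
typedef struct {
    ToyIOl base;
    int    fd;
} ToyUnix;

/* Valid means: the handle and the pointer behind it are both non-NULL. */
#define ToyIOValid(f)      ((f) != NULL && *(f) != NULL)
/* The "Base" pointer is what the handle points at. */
#define ToyIOBase(f)       (*(f))
/* The same pointer, viewed as the layer's own struct type. */
#define ToyIOSelf(f, type) ((type *)ToyIOBase(f))
/* The address of the "next" member is itself a handle. */
#define ToyIONext(f)       (&(ToyIOBase(f)->next))

/* Count layers by stepping down until the slot is empty. */
static int toy_depth(ToyIO *f)
{
    int n = 0;
    while (ToyIOValid(f)) {
        n++;
        f = ToyIONext(f);
    }
    return n;
}
```

Because C<ToyIONext(f)> yields another C<ToyIO *>, the same macros and
functions can be applied to the layer below without special cases.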

Perl_PerlIO_or_Base(PerlIO* f, callback, base, failure, args) either
calls the I<callback> from the functions of the layer I<f> (just by
the name of the IO function, like "Read") with the I<args>, or if
there is no such callback, calls the I<base> version of the callback
with the same args, or if the f is invalid, sets errno to EBADF and
returns I<failure>.

Perl_PerlIO_or_fail(PerlIO* f, callback, failure, args) either calls
the I<callback> of the functions of the layer I<f> with the I<args>,
or if there is no such callback, sets errno to EINVAL. Or if the f is
invalid, sets errno to EBADF and returns I<failure>.

Perl_PerlIO_or_Base_void(PerlIO* f, callback, base, args) either calls
the I<callback> of the functions of the layer I<f> with the I<args>,
or if there is no such callback, calls the I<base> version of the
callback with the same args, or if the f is invalid, sets errno to
EBADF.

Perl_PerlIO_or_fail_void(PerlIO* f, callback, args) either calls the
I<callback> of the functions of the layer I<f> with the I<args>, or if
there is no such callback, sets errno to EINVAL. Or if the f is
invalid, sets errno to EBADF.

=head2 Implementing PerlIO Layers

If you find the implementation document unclear or not sufficient,
look at the existing PerlIO layer implementations, which include:

=over

=item * C implementations

The F<perlio.c> and F<perliol.h> in the Perl core implement the
"unix", "perlio", "stdio", "crlf", "utf8", "byte", "raw", "pending"
layers, and also the "mmap" and "win32" layers if applicable.
(The "win32" is currently unfinished and unused; to see what is used
instead in Win32, see L<PerlIO/"Querying the layers of filehandles">.)

PerlIO::encoding, PerlIO::scalar, PerlIO::via in the Perl core.

PerlIO::gzip and APR::PerlIO (mod_perl 2.0) on CPAN.

=item * Perl implementations

PerlIO::via::QuotedPrint in the Perl core and PerlIO::via::* on CPAN.

=back

If you are creating a PerlIO layer, you may want to be lazy, in other
words, implement only the methods that interest you. The other methods
you can either replace with the "blank" methods

    PerlIOBase_noop_ok
    PerlIOBase_noop_fail

(which do nothing, and return zero and -1, respectively) or for
certain methods you may assume a default behaviour by using a NULL
method. The Open method looks for help in the 'parent' layer.
The following table summarizes the behaviour:

    method      behaviour with NULL

    Clearerr    PerlIOBase_clearerr
    Close       PerlIOBase_close
    Dup         PerlIOBase_dup
    Eof         PerlIOBase_eof
    Error       PerlIOBase_error
    Fileno      PerlIOBase_fileno
    Fill        FAILURE
    Flush       SUCCESS
    Getarg      SUCCESS
    Get_base    FAILURE
    Get_bufsiz  FAILURE
    Get_cnt     FAILURE
    Get_ptr     FAILURE
    Open        INHERITED
    Popped      SUCCESS
    Pushed      SUCCESS
    Read        PerlIOBase_read
    Seek        FAILURE
    Set_cnt     FAILURE
    Set_ptrcnt  FAILURE
    Setlinebuf  PerlIOBase_setlinebuf
    Tell        FAILURE
    Unread      PerlIOBase_unread
    Write       FAILURE

    FAILURE     Set errno (to EINVAL in Unixish, to LIB$_INVARG in VMS)
                and return -1 (for numeric return values) or NULL (for
                pointers)
    INHERITED   Inherited from the layer below
    SUCCESS     Return 0 (for numeric return values) or a pointer

=head2 Core Layers

The file C<perlio.c> provides the following layers:

=over 4

=item "unix"

A basic non-buffered layer which calls Unix/POSIX C<read()>, C<write()>,
C<lseek()>, C<close()>. No buffering. Even on platforms that distinguish
between O_TEXT and O_BINARY this layer is always O_BINARY.

=item "perlio"

A very complete generic buffering layer which provides the whole of
PerlIO API. It is also intended to be used as a "base class" for other
layers.
(For example its C<Read()> method is implemented in terms of
the C<Get_cnt()>/C<Get_ptr()>/C<Set_ptrcnt()> methods).

"perlio" over "unix" provides a complete replacement for stdio as seen
via PerlIO API. This is the default for USE_PERLIO when system's stdio
does not permit perl's "fast gets" access, and which do not
distinguish between C<O_TEXT> and C<O_BINARY>.

=item "stdio"

A layer which provides the PerlIO API via the layer scheme, but
implements it by calling system's stdio. This is (currently) the default
if system's stdio provides sufficient access to allow perl's "fast gets"
access and which do not distinguish between C<O_TEXT> and C<O_BINARY>.

=item "crlf"

A layer derived using "perlio" as a base class. It provides Win32-like
"\n" to CR,LF translation. Can either be applied above "perlio" or serve
as the buffer layer itself. "crlf" over "unix" is the default if system
distinguishes between C<O_TEXT> and C<O_BINARY> opens. (At some point
"unix" will be replaced by a "native" Win32 IO layer on that platform,
as Win32's read/write layer has various drawbacks.) The "crlf" layer is
a reasonable model for a layer which transforms data in some way.

=item "mmap"

If Configure detects C<mmap()> functions this layer is provided (with
"perlio" as a "base") which does "read" operations by mmap()ing the
file. Performance improvement is marginal on modern systems, so it is
mainly there as a proof of concept. It is likely to be unbundled from
the core at some point. The "mmap" layer is a reasonable model for a
minimalist "derived" layer.

=item "pending"

An "internal" derivative of "perlio" which can be used to provide
Unread() function for layers which have no buffer or cannot be
bothered. (Basically this layer's C<Fill()> pops itself off the stack
and so resumes reading from layer below.)

=item "raw"

A dummy layer which never exists on the layer stack. Instead, when
"pushed" it actually pops the stack, removing itself; it then calls the
Binmode function table entry on all the layers in the stack. Normally
this (via PerlIOBase_binmode) removes any layers which do not have the
C<PERLIO_K_RAW> bit set. Layers can modify that behaviour by defining
their own Binmode entry.

=item "utf8"

Another dummy layer. When pushed it pops itself and sets the
C<PERLIO_F_UTF8> flag on the layer which was (and now is once more)
the top of the stack.

=back

In addition F<perlio.c> also provides a number of C<PerlIOBase_xxxx()>
functions which are intended to be used in the table slots of classes
which do not need to do anything special for a particular method.

=head2 Extension Layers

Layers can be made available by extension modules. When an unknown layer
is encountered the PerlIO code will perform the equivalent of:

    use PerlIO 'layer';

where I<layer> is the unknown layer. F<PerlIO.pm> will then attempt to:

    require PerlIO::layer;

If after that process the layer is still not defined then the C<open>
will fail.

The following extension layers are bundled with perl:

=over 4

=item ":encoding"

    use PerlIO::encoding;

makes this layer available, although F<PerlIO.pm> "knows" where to
find it. It is an example of a layer which takes an argument, as it is
called thus:

    open( $fh, "<:encoding(iso-8859-7)", $pathname );

=item ":scalar"

Provides support for reading data from and writing data to a scalar.

    open( $fh, "+<:scalar", \$scalar );

When a handle is so opened, then reads get bytes from the string value
of I<$scalar>, and writes change the value. In both cases the position
in I<$scalar> starts as zero but can be altered via C<seek>, and
determined via C<tell>.
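
The read, write, C<seek> and C<tell> behaviour just described can be
sketched roughly as follows. This is a minimal illustration rather than
anything taken from the perl sources; the sample string and the
variable names C<$buf> and C<$pos> are assumptions made for the example.

    use strict;
    use warnings;

    my $scalar = "hello world";    # sample data (assumed)

    # "+<" opens the scalar for reading and writing without
    # truncating its existing value.
    open( my $fh, "+<:scalar", \$scalar ) or die "open failed: $!";

    # The position starts at zero, so this reads the first 5 bytes.
    read( $fh, my $buf, 5 );

    # tell() reports the current position within $scalar.
    my $pos = tell($fh);

    # seek() moves the position; the write then overwrites the
    # bytes of $scalar in place from offset 6 onwards.
    seek( $fh, 6, 0 );
    print {$fh} "there";
    close($fh);

    # $buf is now "hello", $pos is 5, $scalar is "hello there"
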

Please note that this layer is implied when calling open() thus:

    open( $fh, "+<", \$scalar );

=item ":via"

Provided to allow layers to be implemented as Perl code.  For instance:

    use PerlIO::via::StripHTML;
    open( my $fh, "<:via(StripHTML)", "index.html" );

See L<PerlIO::via> for details.

=back

=head1 TODO

Things that need to be done to improve this document.

=over

=item *

Explain how to make a valid fh without going through open() (i.e. apply
a layer). For example if the file is not opened through perl, but we
want to get back a fh, like it was opened by Perl.

How does PerlIO_apply_layera fit in, where are its docs, and was it
made public?

Currently the example could be something like this:

    PerlIO *foo_to_PerlIO(pTHX_ char *mode, ...)
    {
        /* mode is "w", "r", etc */
        const char *layers = ":APR"; /* the layer name */
        PerlIO *f = PerlIO_allocate(aTHX);
        if (!f) {
            return NULL;
        }

        /* PerlIO_apply_layers() returns 0 on success */
        if (PerlIO_apply_layers(aTHX_ f, mode, layers) == 0) {
            PerlIOAPR *st = PerlIOSelf(f, PerlIOAPR);
            /* fill in the st struct, as in _open();
               'file' is assumed to come from the elided arguments */
            st->file = file;
            PerlIOBase(f)->flags |= PERLIO_F_OPEN;

            return f;
        }
        return NULL;
    }

=item *

Fix/add the documentation in places marked as XXX.

=item *

The handling of errors by the layer is not specified, for example when
$! should be set explicitly and when the error handling should be just
delegated to the top layer.

Probably give some hints on using SETERRNO() or pointers to where they
can be found.

=item *

I think it would help to give some concrete examples to make it easier
to understand the API.
Of course I agree that the API has to be
concise, but since there is no second document that is more of a
guide, I think that it'd make it easier to start with the doc which is
an API, but has examples in it in places where things are unclear, to
a person who is not a PerlIO guru (yet).

=back

=cut