1------------------------------------------------------------------------------ 2-- -- 3-- GNAT LIBRARY COMPONENTS -- 4-- -- 5-- G N A T . S P I T B O L . P A T T E R N S -- 6-- -- 7-- B o d y -- 8-- -- 9-- Copyright (C) 1998-2003, Ada Core Technologies, Inc. -- 10-- -- 11-- GNAT is free software; you can redistribute it and/or modify it under -- 12-- terms of the GNU General Public License as published by the Free Soft- -- 13-- ware Foundation; either version 2, or (at your option) any later ver- -- 14-- sion. GNAT is distributed in the hope that it will be useful, but WITH- -- 15-- OUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY -- 16-- or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License -- 17-- for more details. You should have received a copy of the GNU General -- 18-- Public License distributed with GNAT; see file COPYING. If not, write -- 19-- to the Free Software Foundation, 59 Temple Place - Suite 330, Boston, -- 20-- MA 02111-1307, USA. -- 21-- -- 22-- As a special exception, if other files instantiate generics from this -- 23-- unit, or you link this unit with other files to produce an executable, -- 24-- this unit does not by itself cause the resulting executable to be -- 25-- covered by the GNU General Public License. This exception does not -- 26-- however invalidate any other reasons why the executable file might be -- 27-- covered by the GNU Public License. -- 28-- -- 29-- GNAT was originally developed by the GNAT team at New York University. -- 30-- Extensive contributions were provided by Ada Core Technologies Inc. -- 31-- -- 32------------------------------------------------------------------------------ 33 34-- Note: the data structures and general approach used in this implementation 35-- are derived from the original MINIMAL sources for SPITBOL. The code is not 36-- a direct translation, but the approach is followed closely. In particular, 37-- we use the one stack approach developed in the SPITBOL implementation. 38 39with Ada.Exceptions; use Ada.Exceptions; 40with Ada.Strings.Maps; use Ada.Strings.Maps; 41with Ada.Strings.Unbounded.Aux; use Ada.Strings.Unbounded.Aux; 42 43with GNAT.Debug_Utilities; use GNAT.Debug_Utilities; 44 45with System; use System; 46 47with Unchecked_Conversion; 48with Unchecked_Deallocation; 49 50package body GNAT.Spitbol.Patterns is 51 52 ------------------------ 53 -- Internal Debugging -- 54 ------------------------ 55 56 Internal_Debug : constant Boolean := False; 57 -- Set this flag to True to activate some built-in debugging traceback 58 -- These are all lines output with PutD and Put_LineD. 59 60 procedure New_LineD; 61 pragma Inline (New_LineD); 62 -- Output new blank line with New_Line if Internal_Debug is True 63 64 procedure PutD (Str : String); 65 pragma Inline (PutD); 66 -- Output string with Put if Internal_Debug is True 67 68 procedure Put_LineD (Str : String); 69 pragma Inline (Put_LineD); 70 -- Output string with Put_Line if Internal_Debug is True 71 72 ----------------------------- 73 -- Local Type Declarations -- 74 ----------------------------- 75 76 subtype String_Ptr is Ada.Strings.Unbounded.String_Access; 77 subtype File_Ptr is Ada.Text_IO.File_Access; 78 79 function To_Address is new Unchecked_Conversion (PE_Ptr, Address); 80 -- Used only for debugging output purposes 81 82 subtype AFC is Ada.Finalization.Controlled; 83 84 N : constant PE_Ptr := null; 85 -- Shorthand used to initialize Copy fields to null 86 87 type Natural_Ptr is access all Natural; 88 type Pattern_Ptr is access all Pattern; 89 90 -------------------------------------------------- 91 -- Description of Algorithm and Data Structures -- 92 -------------------------------------------------- 93 94 -- A pattern structure is represented as a linked graph of nodes 95 -- with the following structure: 96 97 -- +------------------------------------+ 98 -- I Pcode I 99 -- +------------------------------------+ 100 -- I Index I 101 -- +------------------------------------+ 102 -- I Pthen I 103 -- +------------------------------------+ 104 -- I parameter(s) I 105 -- +------------------------------------+ 106 107 -- Pcode is a code value indicating the type of the patterm node. This 108 -- code is used both as the discriminant value for the record, and as 109 -- the case index in the main match routine that branches to the proper 110 -- match code for the given element. 111 112 -- Index is a serial index number. The use of these serial index 113 -- numbers is described in a separate section. 114 115 -- Pthen is a pointer to the successor node, i.e the node to be matched 116 -- if the attempt to match the node succeeds. If this is the last node 117 -- of the pattern to be matched, then Pthen points to a dummy node 118 -- of kind PC_EOP (end of pattern), which initiales pattern exit. 119 120 -- The parameter or parameters are present for certain node types, 121 -- and the type varies with the pattern code. 122 123 type Pattern_Code is ( 124 PC_Arb_Y, 125 PC_Assign, 126 PC_Bal, 127 PC_BreakX_X, 128 PC_Cancel, 129 PC_EOP, 130 PC_Fail, 131 PC_Fence, 132 PC_Fence_X, 133 PC_Fence_Y, 134 PC_R_Enter, 135 PC_R_Remove, 136 PC_R_Restore, 137 PC_Rest, 138 PC_Succeed, 139 PC_Unanchored, 140 141 PC_Alt, 142 PC_Arb_X, 143 PC_Arbno_S, 144 PC_Arbno_X, 145 146 PC_Rpat, 147 148 PC_Pred_Func, 149 150 PC_Assign_Imm, 151 PC_Assign_OnM, 152 PC_Any_VP, 153 PC_Break_VP, 154 PC_BreakX_VP, 155 PC_NotAny_VP, 156 PC_NSpan_VP, 157 PC_Span_VP, 158 PC_String_VP, 159 160 PC_Write_Imm, 161 PC_Write_OnM, 162 163 PC_Null, 164 PC_String, 165 166 PC_String_2, 167 PC_String_3, 168 PC_String_4, 169 PC_String_5, 170 PC_String_6, 171 172 PC_Setcur, 173 174 PC_Any_CH, 175 PC_Break_CH, 176 PC_BreakX_CH, 177 PC_Char, 178 PC_NotAny_CH, 179 PC_NSpan_CH, 180 PC_Span_CH, 181 182 PC_Any_CS, 183 PC_Break_CS, 184 PC_BreakX_CS, 185 PC_NotAny_CS, 186 PC_NSpan_CS, 187 PC_Span_CS, 188 189 PC_Arbno_Y, 190 PC_Len_Nat, 191 PC_Pos_Nat, 192 PC_RPos_Nat, 193 PC_RTab_Nat, 194 PC_Tab_Nat, 195 196 PC_Pos_NF, 197 PC_Len_NF, 198 PC_RPos_NF, 199 PC_RTab_NF, 200 PC_Tab_NF, 201 202 PC_Pos_NP, 203 PC_Len_NP, 204 PC_RPos_NP, 205 PC_RTab_NP, 206 PC_Tab_NP, 207 208 PC_Any_VF, 209 PC_Break_VF, 210 PC_BreakX_VF, 211 PC_NotAny_VF, 212 PC_NSpan_VF, 213 PC_Span_VF, 214 PC_String_VF); 215 216 type IndexT is range 0 .. +(2 **15 - 1); 217 218 type PE (Pcode : Pattern_Code) is record 219 220 Index : IndexT; 221 -- Serial index number of pattern element within pattern. 222 223 Pthen : PE_Ptr; 224 -- Successor element, to be matched after this one 225 226 case Pcode is 227 228 when PC_Arb_Y | 229 PC_Assign | 230 PC_Bal | 231 PC_BreakX_X | 232 PC_Cancel | 233 PC_EOP | 234 PC_Fail | 235 PC_Fence | 236 PC_Fence_X | 237 PC_Fence_Y | 238 PC_Null | 239 PC_R_Enter | 240 PC_R_Remove | 241 PC_R_Restore | 242 PC_Rest | 243 PC_Succeed | 244 PC_Unanchored => null; 245 246 when PC_Alt | 247 PC_Arb_X | 248 PC_Arbno_S | 249 PC_Arbno_X => Alt : PE_Ptr; 250 251 when PC_Rpat => PP : Pattern_Ptr; 252 253 when PC_Pred_Func => BF : Boolean_Func; 254 255 when PC_Assign_Imm | 256 PC_Assign_OnM | 257 PC_Any_VP | 258 PC_Break_VP | 259 PC_BreakX_VP | 260 PC_NotAny_VP | 261 PC_NSpan_VP | 262 PC_Span_VP | 263 PC_String_VP => VP : VString_Ptr; 264 265 when PC_Write_Imm | 266 PC_Write_OnM => FP : File_Ptr; 267 268 when PC_String => Str : String_Ptr; 269 270 when PC_String_2 => Str2 : String (1 .. 2); 271 272 when PC_String_3 => Str3 : String (1 .. 3); 273 274 when PC_String_4 => Str4 : String (1 .. 4); 275 276 when PC_String_5 => Str5 : String (1 .. 5); 277 278 when PC_String_6 => Str6 : String (1 .. 6); 279 280 when PC_Setcur => Var : Natural_Ptr; 281 282 when PC_Any_CH | 283 PC_Break_CH | 284 PC_BreakX_CH | 285 PC_Char | 286 PC_NotAny_CH | 287 PC_NSpan_CH | 288 PC_Span_CH => Char : Character; 289 290 when PC_Any_CS | 291 PC_Break_CS | 292 PC_BreakX_CS | 293 PC_NotAny_CS | 294 PC_NSpan_CS | 295 PC_Span_CS => CS : Character_Set; 296 297 when PC_Arbno_Y | 298 PC_Len_Nat | 299 PC_Pos_Nat | 300 PC_RPos_Nat | 301 PC_RTab_Nat | 302 PC_Tab_Nat => Nat : Natural; 303 304 when PC_Pos_NF | 305 PC_Len_NF | 306 PC_RPos_NF | 307 PC_RTab_NF | 308 PC_Tab_NF => NF : Natural_Func; 309 310 when PC_Pos_NP | 311 PC_Len_NP | 312 PC_RPos_NP | 313 PC_RTab_NP | 314 PC_Tab_NP => NP : Natural_Ptr; 315 316 when PC_Any_VF | 317 PC_Break_VF | 318 PC_BreakX_VF | 319 PC_NotAny_VF | 320 PC_NSpan_VF | 321 PC_Span_VF | 322 PC_String_VF => VF : VString_Func; 323 324 end case; 325 end record; 326 327 subtype PC_Has_Alt is Pattern_Code range PC_Alt .. PC_Arbno_X; 328 -- Range of pattern codes that has an Alt field. This is used in the 329 -- recursive traversals, since these links must be followed. 330 331 EOP_Element : aliased constant PE := (PC_EOP, 0, N); 332 -- This is the end of pattern element, and is thus the representation of 333 -- a null pattern. It has a zero index element since it is never placed 334 -- inside a pattern. Furthermore it does not need a successor, since it 335 -- marks the end of the pattern, so that no more successors are needed. 336 337 EOP : constant PE_Ptr := EOP_Element'Unrestricted_Access; 338 -- This is the end of pattern pointer, that is used in the Pthen pointer 339 -- of other nodes to signal end of pattern. 340 341 -- The following array is used to determine if a pattern used as an 342 -- argument for Arbno is eligible for treatment using the simple Arbno 343 -- structure (i.e. it is a pattern that is guaranteed to match at least 344 -- one character on success, and not to make any entries on the stack. 345 346 OK_For_Simple_Arbno : constant array (Pattern_Code) of Boolean := 347 (PC_Any_CS | 348 PC_Any_CH | 349 PC_Any_VF | 350 PC_Any_VP | 351 PC_Char | 352 PC_Len_Nat | 353 PC_NotAny_CS | 354 PC_NotAny_CH | 355 PC_NotAny_VF | 356 PC_NotAny_VP | 357 PC_Span_CS | 358 PC_Span_CH | 359 PC_Span_VF | 360 PC_Span_VP | 361 PC_String | 362 PC_String_2 | 363 PC_String_3 | 364 PC_String_4 | 365 PC_String_5 | 366 PC_String_6 => True, 367 others => False); 368 369 ------------------------------- 370 -- The Pattern History Stack -- 371 ------------------------------- 372 373 -- The pattern history stack is used for controlling backtracking when 374 -- a match fails. The idea is to stack entries that give a cursor value 375 -- to be restored, and a node to be reestablished as the current node to 376 -- attempt an appropriate rematch operation. The processing for a pattern 377 -- element that has rematch alternatives pushes an appropriate entry or 378 -- entry on to the stack, and the proceeds. If a match fails at any point, 379 -- the top element of the stack is popped off, resetting the cursor and 380 -- the match continues by accessing the node stored with this entry. 381 382 type Stack_Entry is record 383 384 Cursor : Integer; 385 -- Saved cursor value that is restored when this entry is popped 386 -- from the stack if a match attempt fails. Occasionally, this 387 -- field is used to store a history stack pointer instead of a 388 -- cursor. Such cases are noted in the documentation and the value 389 -- stored is negative since stack pointer values are always negative. 390 391 Node : PE_Ptr; 392 -- This pattern element reference is reestablished as the current 393 -- Node to be matched (which will attempt an appropriate rematch). 394 395 end record; 396 397 subtype Stack_Range is Integer range -Stack_Size .. -1; 398 399 type Stack_Type is array (Stack_Range) of Stack_Entry; 400 -- The type used for a history stack. The actual instance of the stack 401 -- is declared as a local variable in the Match routine, to properly 402 -- handle recursive calls to Match. All stack pointer values are negative 403 -- to distinguish them from normal cursor values. 404 405 -- Note: the pattern matching stack is used only to handle backtracking. 406 -- If no backtracking occurs, its entries are never accessed, and never 407 -- popped off, and in particular it is normal for a successful match 408 -- to terminate with entries on the stack that are simply discarded. 409 410 -- Note: in subsequent diagrams of the stack, we always place element 411 -- zero (the deepest element) at the top of the page, then build the 412 -- stack down on the page with the most recent (top of stack) element 413 -- being the bottom-most entry on the page. 414 415 -- Stack checking is handled by labeling every pattern with the maximum 416 -- number of stack entries that are required, so a single check at the 417 -- start of matching the pattern suffices. There are two exceptions. 418 419 -- First, the count does not include entries for recursive pattern 420 -- references. Such recursions must therefore perform a specific 421 -- stack check with respect to the number of stack entries required 422 -- by the recursive pattern that is accessed and the amount of stack 423 -- that remains unused. 424 425 -- Second, the count includes only one iteration of an Arbno pattern, 426 -- so a specific check must be made on subsequent iterations that there 427 -- is still enough stack space left. The Arbno node has a field that 428 -- records the number of stack entries required by its argument for 429 -- this purpose. 430 431 --------------------------------------------------- 432 -- Use of Serial Index Field in Pattern Elements -- 433 --------------------------------------------------- 434 435 -- The serial index numbers for the pattern elements are assigned as 436 -- a pattern is consructed from its constituent elements. Note that there 437 -- is never any sharing of pattern elements between patterns (copies are 438 -- always made), so the serial index numbers are unique to a particular 439 -- pattern as referenced from the P field of a value of type Pattern. 440 441 -- The index numbers meet three separate invariants, which are used for 442 -- various purposes as described in this section. 443 444 -- First, the numbers uniquely identify the pattern elements within a 445 -- pattern. If Num is the number of elements in a given pattern, then 446 -- the serial index numbers for the elements of this pattern will range 447 -- from 1 .. Num, so that each element has a separate value. 448 449 -- The purpose of this assignment is to provide a convenient auxiliary 450 -- data structure mechanism during operations which must traverse a 451 -- pattern (e.g. copy and finalization processing). Once constructed 452 -- patterns are strictly read only. This is necessary to allow sharing 453 -- of patterns between tasks. This means that we cannot go marking the 454 -- pattern (e.g. with a visited bit). Instead we cosntuct a separate 455 -- vector that contains the necessary information indexed by the Index 456 -- values in the pattern elements. For this purpose the only requirement 457 -- is that they be uniquely assigned. 458 459 -- Second, the pattern element referenced directly, i.e. the leading 460 -- pattern element, is always the maximum numbered element and therefore 461 -- indicates the total number of elements in the pattern. More precisely, 462 -- the element referenced by the P field of a pattern value, or the 463 -- element returned by any of the internal pattern construction routines 464 -- in the body (that return a value of type PE_Ptr) always is this 465 -- maximum element, 466 467 -- The purpose of this requirement is to allow an immediate determination 468 -- of the number of pattern elements within a pattern. This is used to 469 -- properly size the vectors used to contain auxiliary information for 470 -- traversal as described above. 471 472 -- Third, as compound pattern structures are constructed, the way in which 473 -- constituent parts of the pattern are constructed is stylized. This is 474 -- an automatic consequence of the way that these compounjd structures 475 -- are constructed, and basically what we are doing is simply documenting 476 -- and specifying the natural result of the pattern construction. The 477 -- section describing compound pattern structures gives details of the 478 -- numbering of each compound pattern structure. 479 480 -- The purpose of specifying the stylized numbering structures for the 481 -- compound patterns is to help simplify the processing in the Image 482 -- function, since it eases the task of retrieving the original recursive 483 -- structure of the pattern from the flat graph structure of elements. 484 -- This use in the Image function is the only point at which the code 485 -- makes use of the stylized structures. 486 487 type Ref_Array is array (IndexT range <>) of PE_Ptr; 488 -- This type is used to build an array whose N'th entry references the 489 -- element in a pattern whose Index value is N. See Build_Ref_Array. 490 491 procedure Build_Ref_Array (E : PE_Ptr; RA : out Ref_Array); 492 -- Given a pattern element which is the leading element of a pattern 493 -- structure, and a Ref_Array with bounds 1 .. E.Index, fills in the 494 -- Ref_Array so that its N'th entry references the element of the 495 -- referenced pattern whose Index value is N. 496 497 ------------------------------- 498 -- Recursive Pattern Matches -- 499 ------------------------------- 500 501 -- The pattern primitive (+P) where P is a Pattern_Ptr or Pattern_Func 502 -- causes a recursive pattern match. This cannot be handled by an actual 503 -- recursive call to the outer level Match routine, since this would not 504 -- allow for possible backtracking into the region matched by the inner 505 -- pattern. Indeed this is the classical clash between recursion and 506 -- backtracking, and a simple recursive stack structure does not suffice. 507 508 -- This section describes how this recursion and the possible associated 509 -- backtracking is handled. We still use a single stack, but we establish 510 -- the concept of nested regions on this stack, each of which has a stack 511 -- base value pointing to the deepest stack entry of the region. The base 512 -- value for the outer level is zero. 513 514 -- When a recursive match is established, two special stack entries are 515 -- made. The first entry is used to save the original node that starts 516 -- the recursive match. This is saved so that the successor field of 517 -- this node is accessible at the end of the match, but it is never 518 -- popped and executed. 519 520 -- The second entry corresponds to a standard new region action. A 521 -- PC_R_Remove node is stacked, whose cursor field is used to store 522 -- the outer stack base, and the stack base is reset to point to 523 -- this PC_R_Remove node. Then the recursive pattern is matched and 524 -- it can make history stack entries in the normal matter, so now 525 -- the stack looks like: 526 527 -- (stack entries made by outer level) 528 529 -- (Special entry, node is (+P) successor 530 -- cursor entry is not used) 531 532 -- (PC_R_Remove entry, "cursor" value is (negative) <-- Stack base 533 -- saved base value for the enclosing region) 534 535 -- (stack entries made by inner level) 536 537 -- If a subsequent failure occurs and pops the PC_R_Remove node, it 538 -- removes itself and the special entry immediately underneath it, 539 -- restores the stack base value for the enclosing region, and then 540 -- again signals failure to look for alternatives that were stacked 541 -- before the recursion was initiated. 542 543 -- Now we need to consider what happens if the inner pattern succeeds, as 544 -- signalled by accessing the special PC_EOP pattern primitive. First we 545 -- recognize the nested case by looking at the Base value. If this Base 546 -- value is Stack'First, then the entire match has succeeded, but if the 547 -- base value is greater than Stack'First, then we have successfully 548 -- matched an inner pattern, and processing continues at the outer level. 549 550 -- There are two cases. The simple case is when the inner pattern has made 551 -- no stack entries, as recognized by the fact that the current stack 552 -- pointer is equal to the current base value. In this case it is fine to 553 -- remove all trace of the recursion by restoring the outer base value and 554 -- using the special entry to find the appropriate successor node. 555 556 -- The more complex case arises when the inner match does make stack 557 -- entries. In this case, the PC_EOP processing stacks a special entry 558 -- whose cursor value saves the saved inner base value (the one that 559 -- references the corresponding PC_R_Remove value), and whose node 560 -- pointer references a PC_R_Restore node, so the stack looks like: 561 562 -- (stack entries made by outer level) 563 564 -- (Special entry, node is (+P) successor, 565 -- cursor entry is not used) 566 567 -- (PC_R_Remove entry, "cursor" value is (negative) 568 -- saved base value for the enclosing region) 569 570 -- (stack entries made by inner level) 571 572 -- (PC_Region_Replace entry, "cursor" value is (negative) 573 -- stack pointer value referencing the PC_R_Remove entry). 574 575 -- If the entire match succeeds, then these stack entries are, as usual, 576 -- ignored and abandoned. If on the other hand a subsequent failure 577 -- causes the PC_Region_Replace entry to be popped, it restores the 578 -- inner base value from its saved "cursor" value and then fails again. 579 -- Note that it is OK that the cursor is temporarily clobbered by this 580 -- pop, since the second failure will reestablish a proper cursor value. 581 582 --------------------------------- 583 -- Compound Pattern Structures -- 584 --------------------------------- 585 586 -- This section discusses the compound structures used to represent 587 -- constructed patterns. It shows the graph structures of pattern 588 -- elements that are constructed, and in the case of patterns that 589 -- provide backtracking possibilities, describes how the history 590 -- stack is used to control the backtracking. Finally, it notes the 591 -- way in which the Index numbers are assigned to the structure. 592 593 -- In all diagrams, solid lines (built witth minus signs or vertical 594 -- bars, represent successor pointers (Pthen fields) with > or V used 595 -- to indicate the direction of the pointer. The initial node of the 596 -- structure is in the upper left of the diagram. A dotted line is an 597 -- alternative pointer from the element above it to the element below 598 -- it. See individual sections for details on how alternatives are used. 599 600 ------------------- 601 -- Concatenation -- 602 ------------------- 603 604 -- In the pattern structures listed in this section, a line that looks 605 -- lile ----> with nothing to the right indicates an end of pattern 606 -- (EOP) pointer that represents the end of the match. 607 608 -- When a pattern concatenation (L & R) occurs, the resulting structure 609 -- is obtained by finding all such EOP pointers in L, and replacing 610 -- them to point to R. This is the most important flattening that 611 -- occurs in constructing a pattern, and it means that the pattern 612 -- matching circuitry does not have to keep track of the structure 613 -- of a pattern with respect to concatenation, since the appropriate 614 -- succesor is always at hand. 615 616 -- Concatenation itself generates no additional possibilities for 617 -- backtracking, but the constituent patterns of the concatenated 618 -- structure will make stack entries as usual. The maximum amount 619 -- of stack required by the structure is thus simply the sum of the 620 -- maximums required by L and R. 621 622 -- The index numbering of a concatenation structure works by leaving 623 -- the numbering of the right hand pattern, R, unchanged and adjusting 624 -- the numbers in the left hand pattern, L up by the count of elements 625 -- in R. This ensures that the maximum numbered element is the leading 626 -- element as required (given that it was the leading element in L). 627 628 ----------------- 629 -- Alternation -- 630 ----------------- 631 632 -- A pattern (L or R) constructs the structure: 633 634 -- +---+ +---+ 635 -- | A |---->| L |----> 636 -- +---+ +---+ 637 -- . 638 -- . 639 -- +---+ 640 -- | R |----> 641 -- +---+ 642 643 -- The A element here is a PC_Alt node, and the dotted line represents 644 -- the contents of the Alt field. When the PC_Alt element is matched, 645 -- it stacks a pointer to the leading element of R on the history stack 646 -- so that on subsequent failure, a match of R is attempted. 647 648 -- The A node is the higest numbered element in the pattern. The 649 -- original index numbers of R are unchanged, but the index numbers 650 -- of the L pattern are adjusted up by the count of elements in R. 651 652 -- Note that the difference between the index of the L leading element 653 -- the index of the R leading element (after building the alt structure) 654 -- indicates the number of nodes in L, and this is true even after the 655 -- structure is incorporated into some larger structure. For example, 656 -- if the A node has index 16, and L has index 15 and R has index 657 -- 5, then we know that L has 10 (15-5) elements in it. 658 659 -- Suppose that we now concatenate this structure to another pattern 660 -- with 9 elements in it. We will now have the A node with an index 661 -- of 25, L with an index of 24 and R with an index of 14. We still 662 -- know that L has 10 (24-14) elements in it, numbered 15-24, and 663 -- consequently the successor of the alternation structure has an 664 -- index with a value less than 15. This is used in Image to figure 665 -- out the original recursive structure of a pattern. 666 667 -- To clarify the interaction of the alternation and concatenation 668 -- structures, here is a more complex example of the structure built 669 -- for the pattern: 670 671 -- (V or W or X) (Y or Z) 672 673 -- where A,B,C,D,E are all single element patterns: 674 675 -- +---+ +---+ +---+ +---+ 676 -- I A I---->I V I---+-->I A I---->I Y I----> 677 -- +---+ +---+ I +---+ +---+ 678 -- . I . 679 -- . I . 680 -- +---+ +---+ I +---+ 681 -- I A I---->I W I-->I I Z I----> 682 -- +---+ +---+ I +---+ 683 -- . I 684 -- . I 685 -- +---+ I 686 -- I X I------------>+ 687 -- +---+ 688 689 -- The numbering of the nodes would be as follows: 690 691 -- +---+ +---+ +---+ +---+ 692 -- I 8 I---->I 7 I---+-->I 3 I---->I 2 I----> 693 -- +---+ +---+ I +---+ +---+ 694 -- . I . 695 -- . I . 696 -- +---+ +---+ I +---+ 697 -- I 6 I---->I 5 I-->I I 1 I----> 698 -- +---+ +---+ I +---+ 699 -- . I 700 -- . I 701 -- +---+ I 702 -- I 4 I------------>+ 703 -- +---+ 704 705 -- Note: The above structure actually corresponds to 706 707 -- (A or (B or C)) (D or E) 708 709 -- rather than 710 711 -- ((A or B) or C) (D or E) 712 713 -- which is the more natural interpretation, but in fact alternation 714 -- is associative, and the construction of an alternative changes the 715 -- left grouped pattern to the right grouped pattern in any case, so 716 -- that the Image function produces a more natural looking output. 717 718 --------- 719 -- Arb -- 720 --------- 721 722 -- An Arb pattern builds the structure 723 724 -- +---+ 725 -- | X |----> 726 -- +---+ 727 -- . 728 -- . 729 -- +---+ 730 -- | Y |----> 731 -- +---+ 732 733 -- The X node is a PC_Arb_X node, which matches null, and stacks a 734 -- pointer to Y node, which is the PC_Arb_Y node that matches one 735 -- extra character and restacks itself. 736 737 -- The PC_Arb_X node is numbered 2, and the PC_Arb_Y node is 1. 738 739 ------------------------- 740 -- Arbno (simple case) -- 741 ------------------------- 742 743 -- The simple form of Arbno can be used where the pattern always 744 -- matches at least one character if it succeeds, and it is known 745 -- not to make any history stack entries. In this case, Arbno (P) 746 -- can construct the following structure: 747 748 -- +-------------+ 749 -- | ^ 750 -- V | 751 -- +---+ | 752 -- | S |----> | 753 -- +---+ | 754 -- . | 755 -- . | 756 -- +---+ | 757 -- | P |---------->+ 758 -- +---+ 759 760 -- The S (PC_Arbno_S) node matches null stacking a pointer to the 761 -- pattern P. If a subsequent failure causes P to be matched and 762 -- this match succeeds, then node A gets restacked to try another 763 -- instance if needed by a subsequent failure. 764 765 -- The node numbering of the constituent pattern P is not affected. 766 -- The S node has a node number of P.Index + 1. 767 768 -------------------------- 769 -- Arbno (complex case) -- 770 -------------------------- 771 772 -- A call to Arbno (P), where P can match null (or at least is not 773 -- known to require a non-null string) and/or P requires pattern stack 774 -- entries, constructs the following structure: 775 776 -- +--------------------------+ 777 -- | ^ 778 -- V | 779 -- +---+ | 780 -- | X |----> | 781 -- +---+ | 782 -- . | 783 -- . | 784 -- +---+ +---+ +---+ | 785 -- | E |---->| P |---->| Y |--->+ 786 -- +---+ +---+ +---+ 787 788 -- The node X (PC_Arbno_X) matches null, stacking a pointer to the 789 -- E-P-X structure used to match one Arbno instance. 790 791 -- Here E is the PC_R_Enter node which matches null and creates two 792 -- stack entries. The first is a special entry whose node field is 793 -- not used at all, and whose cursor field has the initial cursor. 794 795 -- The second entry corresponds to a standard new region action. A 796 -- PC_R_Remove node is stacked, whose cursor field is used to store 797 -- the outer stack base, and the stack base is reset to point to 798 -- this PC_R_Remove node. Then the pattern P is matched, and it can 799 -- make history stack entries in the normal manner, so now the stack 800 -- looks like: 801 802 -- (stack entries made before assign pattern) 803 804 -- (Special entry, node field not used, 805 -- used only to save initial cursor) 806 807 -- (PC_R_Remove entry, "cursor" value is (negative) <-- Stack Base 808 -- saved base value for the enclosing region) 809 810 -- (stack entries made by matching P) 811 812 -- If the match of P fails, then the PC_R_Remove entry is popped and 813 -- it removes both itself and the special entry underneath it, 814 -- restores the outer stack base, and signals failure. 815 816 -- If the match of P succeeds, then node Y, the PC_Arbno_Y node, pops 817 -- the inner region. There are two possibilities. If matching P left 818 -- no stack entries, then all traces of the inner region can be removed. 819 -- If there are stack entries, then we push an PC_Region_Replace stack 820 -- entry whose "cursor" value is the inner stack base value, and then 821 -- restore the outer stack base value, so the stack looks like: 822 823 -- (stack entries made before assign pattern) 824 825 -- (Special entry, node field not used, 826 -- used only to save initial cursor) 827 828 -- (PC_R_Remove entry, "cursor" value is (negative) 829 -- saved base value for the enclosing region) 830 831 -- (stack entries made by matching P) 832 833 -- (PC_Region_Replace entry, "cursor" value is (negative) 834 -- stack pointer value referencing the PC_R_Remove entry). 835 836 -- Now that we have matched another instance of the Arbno pattern, 837 -- we need to move to the successor. There are two cases. If the 838 -- Arbno pattern matched null, then there is no point in seeking 839 -- alternatives, since we would just match a whole bunch of nulls. 840 -- In this case we look through the alternative node, and move 841 -- directly to its successor (i.e. the successor of the Arbno 842 -- pattern). If on the other hand a non-null string was matched, 843 -- we simply follow the successor to the alternative node, which 844 -- sets up for another possible match of the Arbno pattern. 845 846 -- As noted in the section on stack checking, the stack count (and 847 -- hence the stack check) for a pattern includes only one iteration 848 -- of the Arbno pattern. To make sure that multiple iterations do not 849 -- overflow the stack, the Arbno node saves the stack count required 850 -- by a single iteration, and the Concat function increments this to 851 -- include stack entries required by any successor. The PC_Arbno_Y 852 -- node uses this count to ensure that sufficient stack remains 853 -- before proceeding after matching each new instance. 854 855 -- The node numbering of the constituent pattern P is not affected. 856 -- Where N is the number of nodes in P, the Y node is numbered N + 1, 857 -- the E node is N + 2, and the X node is N + 3. 858 859 ---------------------- 860 -- Assign Immediate -- 861 ---------------------- 862 863 -- Immediate assignment (P * V) constructs the following structure 864 865 -- +---+ +---+ +---+ 866 -- | E |---->| P |---->| A |----> 867 -- +---+ +---+ +---+ 868 869 -- Here E is the PC_R_Enter node which matches null and creates two 870 -- stack entries. The first is a special entry whose node field is 871 -- not used at all, and whose cursor field has the initial cursor. 872 873 -- The second entry corresponds to a standard new region action. A 874 -- PC_R_Remove node is stacked, whose cursor field is used to store 875 -- the outer stack base, and the stack base is reset to point to 876 -- this PC_R_Remove node. Then the pattern P is matched, and it can 877 -- make history stack entries in the normal manner, so now the stack 878 -- looks like: 879 880 -- (stack entries made before assign pattern) 881 882 -- (Special entry, node field not used, 883 -- used only to save initial cursor) 884 885 -- (PC_R_Remove entry, "cursor" value is (negative) <-- Stack Base 886 -- saved base value for the enclosing region) 887 888 -- (stack entries made by matching P) 889 890 -- If the match of P fails, then the PC_R_Remove entry is popped 891 -- and it removes both itself and the special entry underneath it, 892 -- restores the outer stack base, and signals failure. 893 894 -- If the match of P succeeds, then node A, which is the actual 895 -- PC_Assign_Imm node, executes the assignment (using the stack 896 -- base to locate the entry with the saved starting cursor value), 897 -- and the pops the inner region. There are two possibilities, if 898 -- matching P left no stack entries, then all traces of the inner 899 -- region can be removed. If there are stack entries, then we push 900 -- an PC_Region_Replace stack entry whose "cursor" value is the 901 -- inner stack base value, and then restore the outer stack base 902 -- value, so the stack looks like: 903 904 -- (stack entries made before assign pattern) 905 906 -- (Special entry, node field not used, 907 -- used only to save initial cursor) 908 909 -- (PC_R_Remove entry, "cursor" value is (negative) 910 -- saved base value for the enclosing region) 911 912 -- (stack entries made by matching P) 913 914 -- (PC_Region_Replace entry, "cursor" value is the (negative) 915 -- stack pointer value referencing the PC_R_Remove entry). 916 917 -- If a subsequent failure occurs, the PC_Region_Replace node restores 918 -- the inner stack base value and signals failure to explore rematches 919 -- of the pattern P. 920 921 -- The node numbering of the constituent pattern P is not affected. 922 -- Where N is the number of nodes in P, the A node is numbered N + 1, 923 -- and the E node is N + 2. 924 925 --------------------- 926 -- Assign On Match -- 927 --------------------- 928 929 -- The assign on match (**) pattern is quite similar to the assign 930 -- immediate pattern, except that the actual assignment has to be 931 -- delayed. The following structure is constructed: 932 933 -- +---+ +---+ +---+ 934 -- | E |---->| P |---->| A |----> 935 -- +---+ +---+ +---+ 936 937 -- The operation of this pattern is identical to that described above 938 -- for deferred assignment, up to the point where P has been matched. 939 940 -- The A node, which is the PC_Assign_OnM node first pushes a 941 -- PC_Assign node onto the history stack. This node saves the ending 942 -- cursor and acts as a flag for the final assignment, as further 943 -- described below. 944 945 -- It then stores a pointer to itself in the special entry node field. 946 -- This was otherwise unused, and is now used to retrive the address 947 -- of the variable to be assigned at the end of the pattern. 948 949 -- After that the inner region is terminated in the usual manner, 950 -- by stacking a PC_R_Restore entry as described for the assign 951 -- immediate case. Note that the optimization of completely 952 -- removing the inner region does not happen in this case, since 953 -- we have at least one stack entry (the PC_Assign one we just made). 954 -- The stack now looks like: 955 956 -- (stack entries made before assign pattern) 957 958 -- (Special entry, node points to copy of 959 -- the PC_Assign_OnM node, and the 960 -- cursor field saves the initial cursor). 961 962 -- (PC_R_Remove entry, "cursor" value is (negative) 963 -- saved base value for the enclosing region) 964 965 -- (stack entries made by matching P) 966 967 -- (PC_Assign entry, saves final cursor) 968 969 -- (PC_Region_Replace entry, "cursor" value is (negative) 970 -- stack pointer value referencing the PC_R_Remove entry). 971 972 -- If a subsequent failure causes the PC_Assign node to execute it 973 -- simply removes itself and propagates the failure. 974 975 -- If the match succeeds, then the history stack is scanned for 976 -- PC_Assign nodes, and the assignments are executed (examination 977 -- of the above diagram will show that all the necessary data is 978 -- at hand for the assignment). 979 980 -- To optimize the common case where no assign-on-match operations 981 -- are present, a global flag Assign_OnM is maintained which is 982 -- initialize to False, and gets set True as part of the execution 983 -- of the PC_Assign_OnM node. The scan of the history stack for 984 -- PC_Assign entries is done only if this flag is set. 985 986 -- The node numbering of the constituent pattern P is not affected. 987 -- Where N is the number of nodes in P, the A node is numbered N + 1, 988 -- and the E node is N + 2. 989 990 --------- 991 -- Bal -- 992 --------- 993 994 -- Bal builds a single node: 995 996 -- +---+ 997 -- | B |----> 998 -- +---+ 999 1000 -- The node B is the PC_Bal node which matches a parentheses balanced 1001 -- string, starting at the current cursor position. It then updates 1002 -- the cursor past this matched string, and stacks a pointer to itself 1003 -- with this updated cursor value on the history stack, to extend the 1004 -- matched string on a subequent failure. 1005 1006 -- Since this is a single node it is numbered 1 (the reason we include 1007 -- it in the compound patterns section is that it backtracks). 1008 1009 ------------ 1010 -- BreakX -- 1011 ------------ 1012 1013 -- BreakX builds the structure 1014 1015 -- +---+ +---+ 1016 -- | B |---->| A |----> 1017 -- +---+ +---+ 1018 -- ^ . 1019 -- | . 1020 -- | +---+ 1021 -- +<------| X | 1022 -- +---+ 1023 1024 -- Here the B node is the BreakX_xx node that performs a normal Break 1025 -- function. The A node is an alternative (PC_Alt) node that matches 1026 -- null, but stacks a pointer to node X (the PC_BreakX_X node) which 1027 -- extends the match one character (to eat up the previously detected 1028 -- break character), and then rematches the break. 1029 1030 -- The B node is numbered 3, the alternative node is 1, and the X 1031 -- node is 2. 1032 1033 ----------- 1034 -- Fence -- 1035 ----------- 1036 1037 -- Fence builds a single node: 1038 1039 -- +---+ 1040 -- | F |----> 1041 -- +---+ 1042 1043 -- The element F, PC_Fence, matches null, and stacks a pointer to a 1044 -- PC_Cancel element which will abort the match on a subsequent failure. 1045 1046 -- Since this is a single element it is numbered 1 (the reason we 1047 -- include it in the compound patterns section is that it backtracks). 1048 1049 -------------------- 1050 -- Fence Function -- 1051 -------------------- 1052 1053 -- A call to the Fence function builds the structure: 1054 1055 -- +---+ +---+ +---+ 1056 -- | E |---->| P |---->| X |----> 1057 -- +---+ +---+ +---+ 1058 1059 -- Here E is the PC_R_Enter node which matches null and creates two 1060 -- stack entries. The first is a special entry which is not used at 1061 -- all in the fence case (it is present merely for uniformity with 1062 -- other cases of region enter operations). 1063 1064 -- The second entry corresponds to a standard new region action. A 1065 -- PC_R_Remove node is stacked, whose cursor field is used to store 1066 -- the outer stack base, and the stack base is reset to point to 1067 -- this PC_R_Remove node. Then the pattern P is matched, and it can 1068 -- make history stack entries in the normal manner, so now the stack 1069 -- looks like: 1070 1071 -- (stack entries made before fence pattern) 1072 1073 -- (Special entry, not used at all) 1074 1075 -- (PC_R_Remove entry, "cursor" value is (negative) <-- Stack Base 1076 -- saved base value for the enclosing region) 1077 1078 -- (stack entries made by matching P) 1079 1080 -- If the match of P fails, then the PC_R_Remove entry is popped 1081 -- and it removes both itself and the special entry underneath it, 1082 -- restores the outer stack base, and signals failure. 1083 1084 -- If the match of P succeeds, then node X, the PC_Fence_X node, gets 1085 -- control. One might be tempted to think that at this point, the 1086 -- history stack entries made by matching P can just be removed since 1087 -- they certainly are not going to be used for rematching (that is 1088 -- whole point of Fence after all!) However, this is wrong, because 1089 -- it would result in the loss of possible assign-on-match entries 1090 -- for deferred pattern assignments. 1091 1092 -- Instead what we do is to make a special entry whose node references 1093 -- PC_Fence_Y, and whose cursor saves the inner stack base value, i.e. 1094 -- the pointer to the PC_R_Remove entry. Then the outer stack base 1095 -- pointer is restored, so the stack looks like: 1096 1097 -- (stack entries made before assign pattern) 1098 1099 -- (Special entry, not used at all) 1100 1101 -- (PC_R_Remove entry, "cursor" value is (negative) 1102 -- saved base value for the enclosing region) 1103 1104 -- (stack entries made by matching P) 1105 1106 -- (PC_Fence_Y entry, "cursor" value is (negative) stack 1107 -- pointer value referencing the PC_R_Remove entry). 1108 1109 -- If a subsequent failure occurs, then the PC_Fence_Y entry removes 1110 -- the entire inner region, including all entries made by matching P, 1111 -- and alternatives prior to the Fence pattern are sought. 1112 1113 -- The node numbering of the constituent pattern P is not affected. 1114 -- Where N is the number of nodes in P, the X node is numbered N + 1, 1115 -- and the E node is N + 2. 1116 1117 ------------- 1118 -- Succeed -- 1119 ------------- 1120 1121 -- Succeed builds a single node: 1122 1123 -- +---+ 1124 -- | S |----> 1125 -- +---+ 1126 1127 -- The node S is the PC_Succeed node which matches null, and stacks 1128 -- a pointer to itself on the history stack, so that a subsequent 1129 -- failure repeats the same match. 1130 1131 -- Since this is a single node it is numbered 1 (the reason we include 1132 -- it in the compound patterns section is that it backtracks). 1133 1134 --------------------- 1135 -- Write Immediate -- 1136 --------------------- 1137 1138 -- The structure built for a write immediate operation (P * F, where 1139 -- F is a file access value) is: 1140 1141 -- +---+ +---+ +---+ 1142 -- | E |---->| P |---->| W |----> 1143 -- +---+ +---+ +---+ 1144 1145 -- Here E is the PC_R_Enter node and W is the PC_Write_Imm node. The 1146 -- handling is identical to that described above for Assign Immediate, 1147 -- except that at the point where a successful match occurs, the matched 1148 -- substring is written to the referenced file. 1149 1150 -- The node numbering of the constituent pattern P is not affected. 1151 -- Where N is the number of nodes in P, the W node is numbered N + 1, 1152 -- and the E node is N + 2. 1153 1154 -------------------- 1155 -- Write On Match -- 1156 -------------------- 1157 1158 -- The structure built for a write on match operation (P ** F, where 1159 -- F is a file access value) is: 1160 1161 -- +---+ +---+ +---+ 1162 -- | E |---->| P |---->| W |----> 1163 -- +---+ +---+ +---+ 1164 1165 -- Here E is the PC_R_Enter node and W is the PC_Write_OnM node. The 1166 -- handling is identical to that described above for Assign On Match, 1167 -- except that at the point where a successful match has completed, 1168 -- the matched substring is written to the referenced file. 1169 1170 -- The node numbering of the constituent pattern P is not affected. 1171 -- Where N is the number of nodes in P, the W node is numbered N + 1, 1172 -- and the E node is N + 2. 1173 ----------------------- 1174 -- Constant Patterns -- 1175 ----------------------- 1176 1177 -- The following pattern elements are referenced only from the pattern 1178 -- history stack. In each case the processing for the pattern element 1179 -- results in pattern match abort, or futher failure, so there is no 1180 -- need for a successor and no need for a node number 1181 1182 CP_Assign : aliased PE := (PC_Assign, 0, N); 1183 CP_Cancel : aliased PE := (PC_Cancel, 0, N); 1184 CP_Fence_Y : aliased PE := (PC_Fence_Y, 0, N); 1185 CP_R_Remove : aliased PE := (PC_R_Remove, 0, N); 1186 CP_R_Restore : aliased PE := (PC_R_Restore, 0, N); 1187 1188 ----------------------- 1189 -- Local Subprograms -- 1190 ----------------------- 1191 1192 function Alternate (L, R : PE_Ptr) return PE_Ptr; 1193 function "or" (L, R : PE_Ptr) return PE_Ptr renames Alternate; 1194 -- Build pattern structure corresponding to the alternation of L, R. 1195 -- (i.e. try to match L, and if that fails, try to match R). 1196 1197 function Arbno_Simple (P : PE_Ptr) return PE_Ptr; 1198 -- Build simple Arbno pattern, P is a pattern that is guaranteed to 1199 -- match at least one character if it succeeds and to require no 1200 -- stack entries under all circumstances. The result returned is 1201 -- a simple Arbno structure as previously described. 1202 1203 function Bracket (E, P, A : PE_Ptr) return PE_Ptr; 1204 -- Given two single node pattern elements E and A, and a (possible 1205 -- complex) pattern P, construct the concatenation E-->P-->A and 1206 -- return a pointer to E. The concatenation does not affect the 1207 -- node numbering in P. A has a number one higher than the maximum 1208 -- number in P, and E has a number two higher than the maximum 1209 -- number in P (see for example the Assign_Immediate structure to 1210 -- understand a typical use of this function). 1211 1212 function BreakX_Make (B : PE_Ptr) return Pattern; 1213 -- Given a pattern element for a Break patternx, returns the 1214 -- corresponding BreakX compound pattern structure. 1215 1216 function Concat (L, R : PE_Ptr; Incr : Natural) return PE_Ptr; 1217 -- Creates a pattern eelement that represents a concatenation of the 1218 -- two given pattern elements (i.e. the pattern L followed by R). 1219 -- The result returned is always the same as L, but the pattern 1220 -- referenced by L is modified to have R as a successor. This 1221 -- procedure does not copy L or R, so if a copy is required, it 1222 -- is the responsibility of the caller. The Incr parameter is an 1223 -- amount to be added to the Nat field of any P_Arbno_Y node that is 1224 -- in the left operand, it represents the additional stack space 1225 -- required by the right operand. 1226 1227 function C_To_PE (C : PChar) return PE_Ptr; 1228 -- Given a character, constructs a pattern element that matches 1229 -- the single character. 1230 1231 function Copy (P : PE_Ptr) return PE_Ptr; 1232 -- Creates a copy of the pattern element referenced by the given 1233 -- pattern element reference. This is a deep copy, which means that 1234 -- it follows the Next and Alt pointers. 1235 1236 function Image (P : PE_Ptr) return String; 1237 -- Returns the image of the address of the referenced pattern element. 1238 -- This is equivalent to Image (To_Address (P)); 1239 1240 function Is_In (C : Character; Str : String) return Boolean; 1241 pragma Inline (Is_In); 1242 -- Determines if the character C is in string Str. 1243 1244 procedure Logic_Error; 1245 -- Called to raise Program_Error with an appropriate message if an 1246 -- internal logic error is detected. 1247 1248 function Str_BF (A : Boolean_Func) return String; 1249 function Str_FP (A : File_Ptr) return String; 1250 function Str_NF (A : Natural_Func) return String; 1251 function Str_NP (A : Natural_Ptr) return String; 1252 function Str_PP (A : Pattern_Ptr) return String; 1253 function Str_VF (A : VString_Func) return String; 1254 function Str_VP (A : VString_Ptr) return String; 1255 -- These are debugging routines, which return a representation of the 1256 -- given access value (they are called only by Image and Dump) 1257 1258 procedure Set_Successor (Pat : PE_Ptr; Succ : PE_Ptr); 1259 -- Adjusts all EOP pointers in Pat to point to Succ. No other changes 1260 -- are made. In particular, Succ is unchanged, and no index numbers 1261 -- are modified. Note that Pat may not be equal to EOP on entry. 1262 1263 function S_To_PE (Str : PString) return PE_Ptr; 1264 -- Given a string, constructs a pattern element that matches the string 1265 1266 procedure Uninitialized_Pattern; 1267 pragma No_Return (Uninitialized_Pattern); 1268 -- Called to raise Program_Error with an appropriate error message if 1269 -- an uninitialized pattern is used in any pattern construction or 1270 -- pattern matching operation. 1271 1272 procedure XMatch 1273 (Subject : String; 1274 Pat_P : PE_Ptr; 1275 Pat_S : Natural; 1276 Start : out Natural; 1277 Stop : out Natural); 1278 -- This is the common pattern match routine. It is passed a string and 1279 -- a pattern, and it indicates success or failure, and on success the 1280 -- section of the string matched. It does not perform any assignments 1281 -- to the subject string, so pattern replacement is for the caller. 1282 -- 1283 -- Subject The subject string. The lower bound is always one. In the 1284 -- Match procedures, it is fine to use strings whose lower bound 1285 -- is not one, but we perform a one time conversion before the 1286 -- call to XMatch, so that XMatch does not have to be bothered 1287 -- with strange lower bounds. 1288 -- 1289 -- Pat_P Points to initial pattern element of pattern to be matched 1290 -- 1291 -- Pat_S Maximum required stack entries for pattern to be matched 1292 -- 1293 -- Start If match is successful, starting index of matched section. 1294 -- This value is always non-zero. A value of zero is used to 1295 -- indicate a failed match. 1296 -- 1297 -- Stop If match is successful, ending index of matched section. 1298 -- This can be zero if we match the null string at the start, 1299 -- in which case Start is set to zero, and Stop to one. If the 1300 -- Match fails, then the contents of Stop is undefined. 1301 1302 procedure XMatchD 1303 (Subject : String; 1304 Pat_P : PE_Ptr; 1305 Pat_S : Natural; 1306 Start : out Natural; 1307 Stop : out Natural); 1308 -- Identical in all respects to XMatch, except that trace information is 1309 -- output on Standard_Ouput during execution of the match. This is the 1310 -- version that is called if the original Match call has Debug => True. 1311 1312 --------- 1313 -- "&" -- 1314 --------- 1315 1316 function "&" (L : PString; R : Pattern) return Pattern is 1317 begin 1318 return (AFC with R.Stk, Concat (S_To_PE (L), Copy (R.P), R.Stk)); 1319 end "&"; 1320 1321 function "&" (L : Pattern; R : PString) return Pattern is 1322 begin 1323 return (AFC with L.Stk, Concat (Copy (L.P), S_To_PE (R), 0)); 1324 end "&"; 1325 1326 function "&" (L : PChar; R : Pattern) return Pattern is 1327 begin 1328 return (AFC with R.Stk, Concat (C_To_PE (L), Copy (R.P), R.Stk)); 1329 end "&"; 1330 1331 function "&" (L : Pattern; R : PChar) return Pattern is 1332 begin 1333 return (AFC with L.Stk, Concat (Copy (L.P), C_To_PE (R), 0)); 1334 end "&"; 1335 1336 function "&" (L : Pattern; R : Pattern) return Pattern is 1337 begin 1338 return (AFC with L.Stk + R.Stk, Concat (Copy (L.P), Copy (R.P), R.Stk)); 1339 end "&"; 1340 1341 --------- 1342 -- "*" -- 1343 --------- 1344 1345 -- Assign immediate 1346 1347 -- +---+ +---+ +---+ 1348 -- | E |---->| P |---->| A |----> 1349 -- +---+ +---+ +---+ 1350 1351 -- The node numbering of the constituent pattern P is not affected. 1352 -- Where N is the number of nodes in P, the A node is numbered N + 1, 1353 -- and the E node is N + 2. 1354 1355 function "*" (P : Pattern; Var : VString_Var) return Pattern is 1356 Pat : constant PE_Ptr := Copy (P.P); 1357 E : constant PE_Ptr := new PE'(PC_R_Enter, 0, EOP); 1358 A : constant PE_Ptr := 1359 new PE'(PC_Assign_Imm, 0, EOP, Var'Unrestricted_Access); 1360 1361 begin 1362 return (AFC with P.Stk + 3, Bracket (E, Pat, A)); 1363 end "*"; 1364 1365 function "*" (P : PString; Var : VString_Var) return Pattern is 1366 Pat : constant PE_Ptr := S_To_PE (P); 1367 E : constant PE_Ptr := new PE'(PC_R_Enter, 0, EOP); 1368 A : constant PE_Ptr := 1369 new PE'(PC_Assign_Imm, 0, EOP, Var'Unrestricted_Access); 1370 1371 begin 1372 return (AFC with 3, Bracket (E, Pat, A)); 1373 end "*"; 1374 1375 function "*" (P : PChar; Var : VString_Var) return Pattern is 1376 Pat : constant PE_Ptr := C_To_PE (P); 1377 E : constant PE_Ptr := new PE'(PC_R_Enter, 0, EOP); 1378 A : constant PE_Ptr := 1379 new PE'(PC_Assign_Imm, 0, EOP, Var'Unrestricted_Access); 1380 1381 begin 1382 return (AFC with 3, Bracket (E, Pat, A)); 1383 end "*"; 1384 1385 -- Write immediate 1386 1387 -- +---+ +---+ +---+ 1388 -- | E |---->| P |---->| W |----> 1389 -- +---+ +---+ +---+ 1390 1391 -- The node numbering of the constituent pattern P is not affected. 1392 -- Where N is the number of nodes in P, the W node is numbered N + 1, 1393 -- and the E node is N + 2. 1394 1395 function "*" (P : Pattern; Fil : File_Access) return Pattern is 1396 Pat : constant PE_Ptr := Copy (P.P); 1397 E : constant PE_Ptr := new PE'(PC_R_Enter, 0, EOP); 1398 W : constant PE_Ptr := new PE'(PC_Write_Imm, 0, EOP, Fil); 1399 1400 begin 1401 return (AFC with 3, Bracket (E, Pat, W)); 1402 end "*"; 1403 1404 function "*" (P : PString; Fil : File_Access) return Pattern is 1405 Pat : constant PE_Ptr := S_To_PE (P); 1406 E : constant PE_Ptr := new PE'(PC_R_Enter, 0, EOP); 1407 W : constant PE_Ptr := new PE'(PC_Write_Imm, 0, EOP, Fil); 1408 1409 begin 1410 return (AFC with 3, Bracket (E, Pat, W)); 1411 end "*"; 1412 1413 function "*" (P : PChar; Fil : File_Access) return Pattern is 1414 Pat : constant PE_Ptr := C_To_PE (P); 1415 E : constant PE_Ptr := new PE'(PC_R_Enter, 0, EOP); 1416 W : constant PE_Ptr := new PE'(PC_Write_Imm, 0, EOP, Fil); 1417 1418 begin 1419 return (AFC with 3, Bracket (E, Pat, W)); 1420 end "*"; 1421 1422 ---------- 1423 -- "**" -- 1424 ---------- 1425 1426 -- Assign on match 1427 1428 -- +---+ +---+ +---+ 1429 -- | E |---->| P |---->| A |----> 1430 -- +---+ +---+ +---+ 1431 1432 -- The node numbering of the constituent pattern P is not affected. 1433 -- Where N is the number of nodes in P, the A node is numbered N + 1, 1434 -- and the E node is N + 2. 1435 1436 function "**" (P : Pattern; Var : VString_Var) return Pattern is 1437 Pat : constant PE_Ptr := Copy (P.P); 1438 E : constant PE_Ptr := new PE'(PC_R_Enter, 0, EOP); 1439 A : constant PE_Ptr := 1440 new PE'(PC_Assign_OnM, 0, EOP, Var'Unrestricted_Access); 1441 1442 begin 1443 return (AFC with P.Stk + 3, Bracket (E, Pat, A)); 1444 end "**"; 1445 1446 function "**" (P : PString; Var : VString_Var) return Pattern is 1447 Pat : constant PE_Ptr := S_To_PE (P); 1448 E : constant PE_Ptr := new PE'(PC_R_Enter, 0, EOP); 1449 A : constant PE_Ptr := 1450 new PE'(PC_Assign_OnM, 0, EOP, Var'Unrestricted_Access); 1451 1452 begin 1453 return (AFC with 3, Bracket (E, Pat, A)); 1454 end "**"; 1455 1456 function "**" (P : PChar; Var : VString_Var) return Pattern is 1457 Pat : constant PE_Ptr := C_To_PE (P); 1458 E : constant PE_Ptr := new PE'(PC_R_Enter, 0, EOP); 1459 A : constant PE_Ptr := 1460 new PE'(PC_Assign_OnM, 0, EOP, Var'Unrestricted_Access); 1461 1462 begin 1463 return (AFC with 3, Bracket (E, Pat, A)); 1464 end "**"; 1465 1466 -- Write on match 1467 1468 -- +---+ +---+ +---+ 1469 -- | E |---->| P |---->| W |----> 1470 -- +---+ +---+ +---+ 1471 1472 -- The node numbering of the constituent pattern P is not affected. 1473 -- Where N is the number of nodes in P, the W node is numbered N + 1, 1474 -- and the E node is N + 2. 1475 1476 function "**" (P : Pattern; Fil : File_Access) return Pattern is 1477 Pat : constant PE_Ptr := Copy (P.P); 1478 E : constant PE_Ptr := new PE'(PC_R_Enter, 0, EOP); 1479 W : constant PE_Ptr := new PE'(PC_Write_OnM, 0, EOP, Fil); 1480 1481 begin 1482 return (AFC with P.Stk + 3, Bracket (E, Pat, W)); 1483 end "**"; 1484 1485 function "**" (P : PString; Fil : File_Access) return Pattern is 1486 Pat : constant PE_Ptr := S_To_PE (P); 1487 E : constant PE_Ptr := new PE'(PC_R_Enter, 0, EOP); 1488 W : constant PE_Ptr := new PE'(PC_Write_OnM, 0, EOP, Fil); 1489 1490 begin 1491 return (AFC with 3, Bracket (E, Pat, W)); 1492 end "**"; 1493 1494 function "**" (P : PChar; Fil : File_Access) return Pattern is 1495 Pat : constant PE_Ptr := C_To_PE (P); 1496 E : constant PE_Ptr := new PE'(PC_R_Enter, 0, EOP); 1497 W : constant PE_Ptr := new PE'(PC_Write_OnM, 0, EOP, Fil); 1498 1499 begin 1500 return (AFC with 3, Bracket (E, Pat, W)); 1501 end "**"; 1502 1503 --------- 1504 -- "+" -- 1505 --------- 1506 1507 function "+" (Str : VString_Var) return Pattern is 1508 begin 1509 return 1510 (AFC with 0, 1511 new PE'(PC_String_VP, 1, EOP, Str'Unrestricted_Access)); 1512 end "+"; 1513 1514 function "+" (Str : VString_Func) return Pattern is 1515 begin 1516 return (AFC with 0, new PE'(PC_String_VF, 1, EOP, Str)); 1517 end "+"; 1518 1519 function "+" (P : Pattern_Var) return Pattern is 1520 begin 1521 return 1522 (AFC with 3, 1523 new PE'(PC_Rpat, 1, EOP, P'Unrestricted_Access)); 1524 end "+"; 1525 1526 function "+" (P : Boolean_Func) return Pattern is 1527 begin 1528 return (AFC with 3, new PE'(PC_Pred_Func, 1, EOP, P)); 1529 end "+"; 1530 1531 ---------- 1532 -- "or" -- 1533 ---------- 1534 1535 function "or" (L : PString; R : Pattern) return Pattern is 1536 begin 1537 return (AFC with R.Stk + 1, S_To_PE (L) or Copy (R.P)); 1538 end "or"; 1539 1540 function "or" (L : Pattern; R : PString) return Pattern is 1541 begin 1542 return (AFC with L.Stk + 1, Copy (L.P) or S_To_PE (R)); 1543 end "or"; 1544 1545 function "or" (L : PString; R : PString) return Pattern is 1546 begin 1547 return (AFC with 1, S_To_PE (L) or S_To_PE (R)); 1548 end "or"; 1549 1550 function "or" (L : Pattern; R : Pattern) return Pattern is 1551 begin 1552 return (AFC with 1553 Natural'Max (L.Stk, R.Stk) + 1, Copy (L.P) or Copy (R.P)); 1554 end "or"; 1555 1556 function "or" (L : PChar; R : Pattern) return Pattern is 1557 begin 1558 return (AFC with 1, C_To_PE (L) or Copy (R.P)); 1559 end "or"; 1560 1561 function "or" (L : Pattern; R : PChar) return Pattern is 1562 begin 1563 return (AFC with 1, Copy (L.P) or C_To_PE (R)); 1564 end "or"; 1565 1566 function "or" (L : PChar; R : PChar) return Pattern is 1567 begin 1568 return (AFC with 1, C_To_PE (L) or C_To_PE (R)); 1569 end "or"; 1570 1571 function "or" (L : PString; R : PChar) return Pattern is 1572 begin 1573 return (AFC with 1, S_To_PE (L) or C_To_PE (R)); 1574 end "or"; 1575 1576 function "or" (L : PChar; R : PString) return Pattern is 1577 begin 1578 return (AFC with 1, C_To_PE (L) or S_To_PE (R)); 1579 end "or"; 1580 1581 ------------ 1582 -- Adjust -- 1583 ------------ 1584 1585 -- No two patterns share the same pattern elements, so the adjust 1586 -- procedure for a Pattern assignment must do a deep copy of the 1587 -- pattern element structure. 1588 1589 procedure Adjust (Object : in out Pattern) is 1590 begin 1591 Object.P := Copy (Object.P); 1592 end Adjust; 1593 1594 --------------- 1595 -- Alternate -- 1596 --------------- 1597 1598 function Alternate (L, R : PE_Ptr) return PE_Ptr is 1599 begin 1600 -- If the left pattern is null, then we just add the alternation 1601 -- node with an index one greater than the right hand pattern. 1602 1603 if L = EOP then 1604 return new PE'(PC_Alt, R.Index + 1, EOP, R); 1605 1606 -- If the left pattern is non-null, then build a reference vector 1607 -- for its elements, and adjust their index values to acccomodate 1608 -- the right hand elements. Then add the alternation node. 1609 1610 else 1611 declare 1612 Refs : Ref_Array (1 .. L.Index); 1613 1614 begin 1615 Build_Ref_Array (L, Refs); 1616 1617 for J in Refs'Range loop 1618 Refs (J).Index := Refs (J).Index + R.Index; 1619 end loop; 1620 end; 1621 1622 return new PE'(PC_Alt, L.Index + 1, L, R); 1623 end if; 1624 end Alternate; 1625 1626 --------- 1627 -- Any -- 1628 --------- 1629 1630 function Any (Str : String) return Pattern is 1631 begin 1632 return (AFC with 0, new PE'(PC_Any_CS, 1, EOP, To_Set (Str))); 1633 end Any; 1634 1635 function Any (Str : VString) return Pattern is 1636 begin 1637 return Any (S (Str)); 1638 end Any; 1639 1640 function Any (Str : Character) return Pattern is 1641 begin 1642 return (AFC with 0, new PE'(PC_Any_CH, 1, EOP, Str)); 1643 end Any; 1644 1645 function Any (Str : Character_Set) return Pattern is 1646 begin 1647 return (AFC with 0, new PE'(PC_Any_CS, 1, EOP, Str)); 1648 end Any; 1649 1650 function Any (Str : access VString) return Pattern is 1651 begin 1652 return (AFC with 0, new PE'(PC_Any_VP, 1, EOP, VString_Ptr (Str))); 1653 end Any; 1654 1655 function Any (Str : VString_Func) return Pattern is 1656 begin 1657 return (AFC with 0, new PE'(PC_Any_VF, 1, EOP, Str)); 1658 end Any; 1659 1660 --------- 1661 -- Arb -- 1662 --------- 1663 1664 -- +---+ 1665 -- | X |----> 1666 -- +---+ 1667 -- . 1668 -- . 1669 -- +---+ 1670 -- | Y |----> 1671 -- +---+ 1672 1673 -- The PC_Arb_X element is numbered 2, and the PC_Arb_Y element is 1. 1674 1675 function Arb return Pattern is 1676 Y : constant PE_Ptr := new PE'(PC_Arb_Y, 1, EOP); 1677 X : constant PE_Ptr := new PE'(PC_Arb_X, 2, EOP, Y); 1678 1679 begin 1680 return (AFC with 1, X); 1681 end Arb; 1682 1683 ----------- 1684 -- Arbno -- 1685 ----------- 1686 1687 function Arbno (P : PString) return Pattern is 1688 begin 1689 if P'Length = 0 then 1690 return (AFC with 0, EOP); 1691 1692 else 1693 return (AFC with 0, Arbno_Simple (S_To_PE (P))); 1694 end if; 1695 end Arbno; 1696 1697 function Arbno (P : PChar) return Pattern is 1698 begin 1699 return (AFC with 0, Arbno_Simple (C_To_PE (P))); 1700 end Arbno; 1701 1702 function Arbno (P : Pattern) return Pattern is 1703 Pat : constant PE_Ptr := Copy (P.P); 1704 1705 begin 1706 if P.Stk = 0 1707 and then OK_For_Simple_Arbno (Pat.Pcode) 1708 then 1709 return (AFC with 0, Arbno_Simple (Pat)); 1710 end if; 1711 1712 -- This is the complex case, either the pattern makes stack entries 1713 -- or it is possible for the pattern to match the null string (more 1714 -- accurately, we don't know that this is not the case). 1715 1716 -- +--------------------------+ 1717 -- | ^ 1718 -- V | 1719 -- +---+ | 1720 -- | X |----> | 1721 -- +---+ | 1722 -- . | 1723 -- . | 1724 -- +---+ +---+ +---+ | 1725 -- | E |---->| P |---->| Y |--->+ 1726 -- +---+ +---+ +---+ 1727 1728 -- The node numbering of the constituent pattern P is not affected. 1729 -- Where N is the number of nodes in P, the Y node is numbered N + 1, 1730 -- the E node is N + 2, and the X node is N + 3. 1731 1732 declare 1733 E : constant PE_Ptr := new PE'(PC_R_Enter, 0, EOP); 1734 X : constant PE_Ptr := new PE'(PC_Arbno_X, 0, EOP, E); 1735 Y : constant PE_Ptr := new PE'(PC_Arbno_Y, 0, X, P.Stk + 3); 1736 EPY : constant PE_Ptr := Bracket (E, Pat, Y); 1737 1738 begin 1739 X.Alt := EPY; 1740 X.Index := EPY.Index + 1; 1741 return (AFC with P.Stk + 3, X); 1742 end; 1743 end Arbno; 1744 1745 ------------------ 1746 -- Arbno_Simple -- 1747 ------------------ 1748 1749 -- +-------------+ 1750 -- | ^ 1751 -- V | 1752 -- +---+ | 1753 -- | S |----> | 1754 -- +---+ | 1755 -- . | 1756 -- . | 1757 -- +---+ | 1758 -- | P |---------->+ 1759 -- +---+ 1760 1761 -- The node numbering of the constituent pattern P is not affected. 1762 -- The S node has a node number of P.Index + 1. 1763 1764 -- Note that we know that P cannot be EOP, because a null pattern 1765 -- does not meet the requirements for simple Arbno. 1766 1767 function Arbno_Simple (P : PE_Ptr) return PE_Ptr is 1768 S : constant PE_Ptr := new PE'(PC_Arbno_S, P.Index + 1, EOP, P); 1769 1770 begin 1771 Set_Successor (P, S); 1772 return S; 1773 end Arbno_Simple; 1774 1775 --------- 1776 -- Bal -- 1777 --------- 1778 1779 function Bal return Pattern is 1780 begin 1781 return (AFC with 1, new PE'(PC_Bal, 1, EOP)); 1782 end Bal; 1783 1784 ------------- 1785 -- Bracket -- 1786 ------------- 1787 1788 function Bracket (E, P, A : PE_Ptr) return PE_Ptr is 1789 begin 1790 if P = EOP then 1791 E.Pthen := A; 1792 E.Index := 2; 1793 A.Index := 1; 1794 1795 else 1796 E.Pthen := P; 1797 Set_Successor (P, A); 1798 E.Index := P.Index + 2; 1799 A.Index := P.Index + 1; 1800 end if; 1801 1802 return E; 1803 end Bracket; 1804 1805 ----------- 1806 -- Break -- 1807 ----------- 1808 1809 function Break (Str : String) return Pattern is 1810 begin 1811 return (AFC with 0, new PE'(PC_Break_CS, 1, EOP, To_Set (Str))); 1812 end Break; 1813 1814 function Break (Str : VString) return Pattern is 1815 begin 1816 return Break (S (Str)); 1817 end Break; 1818 1819 function Break (Str : Character) return Pattern is 1820 begin 1821 return (AFC with 0, new PE'(PC_Break_CH, 1, EOP, Str)); 1822 end Break; 1823 1824 function Break (Str : Character_Set) return Pattern is 1825 begin 1826 return (AFC with 0, new PE'(PC_Break_CS, 1, EOP, Str)); 1827 end Break; 1828 1829 function Break (Str : access VString) return Pattern is 1830 begin 1831 return (AFC with 0, new PE'(PC_Break_VP, 1, EOP, VString_Ptr (Str))); 1832 end Break; 1833 1834 function Break (Str : VString_Func) return Pattern is 1835 begin 1836 return (AFC with 0, new PE'(PC_Break_VF, 1, EOP, Str)); 1837 end Break; 1838 1839 ------------ 1840 -- BreakX -- 1841 ------------ 1842 1843 function BreakX (Str : String) return Pattern is 1844 begin 1845 return BreakX_Make (new PE'(PC_BreakX_CS, 3, N, To_Set (Str))); 1846 end BreakX; 1847 1848 function BreakX (Str : VString) return Pattern is 1849 begin 1850 return BreakX (S (Str)); 1851 end BreakX; 1852 1853 function BreakX (Str : Character) return Pattern is 1854 begin 1855 return BreakX_Make (new PE'(PC_BreakX_CH, 3, N, Str)); 1856 end BreakX; 1857 1858 function BreakX (Str : Character_Set) return Pattern is 1859 begin 1860 return BreakX_Make (new PE'(PC_BreakX_CS, 3, N, Str)); 1861 end BreakX; 1862 1863 function BreakX (Str : access VString) return Pattern is 1864 begin 1865 return BreakX_Make (new PE'(PC_BreakX_VP, 3, N, VString_Ptr (Str))); 1866 end BreakX; 1867 1868 function BreakX (Str : VString_Func) return Pattern is 1869 begin 1870 return BreakX_Make (new PE'(PC_BreakX_VF, 3, N, Str)); 1871 end BreakX; 1872 1873 ----------------- 1874 -- BreakX_Make -- 1875 ----------------- 1876 1877 -- +---+ +---+ 1878 -- | B |---->| A |----> 1879 -- +---+ +---+ 1880 -- ^ . 1881 -- | . 1882 -- | +---+ 1883 -- +<------| X | 1884 -- +---+ 1885 1886 -- The B node is numbered 3, the alternative node is 1, and the X 1887 -- node is 2. 1888 1889 function BreakX_Make (B : PE_Ptr) return Pattern is 1890 X : constant PE_Ptr := new PE'(PC_BreakX_X, 2, B); 1891 A : constant PE_Ptr := new PE'(PC_Alt, 1, EOP, X); 1892 1893 begin 1894 B.Pthen := A; 1895 return (AFC with 2, B); 1896 end BreakX_Make; 1897 1898 --------------------- 1899 -- Build_Ref_Array -- 1900 --------------------- 1901 1902 procedure Build_Ref_Array (E : PE_Ptr; RA : out Ref_Array) is 1903 1904 procedure Record_PE (E : PE_Ptr); 1905 -- Record given pattern element if not already recorded in RA, 1906 -- and also record any referenced pattern elements recursively. 1907 1908 procedure Record_PE (E : PE_Ptr) is 1909 begin 1910 PutD (" Record_PE called with PE_Ptr = " & Image (E)); 1911 1912 if E = EOP or else RA (E.Index) /= null then 1913 Put_LineD (", nothing to do"); 1914 return; 1915 1916 else 1917 Put_LineD (", recording" & IndexT'Image (E.Index)); 1918 RA (E.Index) := E; 1919 Record_PE (E.Pthen); 1920 1921 if E.Pcode in PC_Has_Alt then 1922 Record_PE (E.Alt); 1923 end if; 1924 end if; 1925 end Record_PE; 1926 1927 -- Start of processing for Build_Ref_Array 1928 1929 begin 1930 New_LineD; 1931 Put_LineD ("Entering Build_Ref_Array"); 1932 Record_PE (E); 1933 New_LineD; 1934 end Build_Ref_Array; 1935 1936 ------------- 1937 -- C_To_PE -- 1938 ------------- 1939 1940 function C_To_PE (C : PChar) return PE_Ptr is 1941 begin 1942 return new PE'(PC_Char, 1, EOP, C); 1943 end C_To_PE; 1944 1945 ------------ 1946 -- Cancel -- 1947 ------------ 1948 1949 function Cancel return Pattern is 1950 begin 1951 return (AFC with 0, new PE'(PC_Cancel, 1, EOP)); 1952 end Cancel; 1953 1954 ------------ 1955 -- Concat -- 1956 ------------ 1957 1958 -- Concat needs to traverse the left operand performing the following 1959 -- set of fixups: 1960 1961 -- a) Any successor pointers (Pthen fields) that are set to EOP are 1962 -- reset to point to the second operand. 1963 1964 -- b) Any PC_Arbno_Y node has its stack count field incremented 1965 -- by the parameter Incr provided for this purpose. 1966 1967 -- d) Num fields of all pattern elements in the left operand are 1968 -- adjusted to include the elements of the right operand. 1969 1970 -- Note: we do not use Set_Successor in the processing for Concat, since 1971 -- there is no point in doing two traversals, we may as well do everything 1972 -- at the same time. 1973 1974 function Concat (L, R : PE_Ptr; Incr : Natural) return PE_Ptr is 1975 begin 1976 if L = EOP then 1977 return R; 1978 1979 elsif R = EOP then 1980 return L; 1981 1982 else 1983 declare 1984 Refs : Ref_Array (1 .. L.Index); 1985 -- We build a reference array for L whose N'th element points to 1986 -- the pattern element of L whose original Index value is N. 1987 1988 P : PE_Ptr; 1989 1990 begin 1991 Build_Ref_Array (L, Refs); 1992 1993 for J in Refs'Range loop 1994 P := Refs (J); 1995 1996 P.Index := P.Index + R.Index; 1997 1998 if P.Pcode = PC_Arbno_Y then 1999 P.Nat := P.Nat + Incr; 2000 end if; 2001 2002 if P.Pthen = EOP then 2003 P.Pthen := R; 2004 end if; 2005 2006 if P.Pcode in PC_Has_Alt and then P.Alt = EOP then 2007 P.Alt := R; 2008 end if; 2009 end loop; 2010 end; 2011 2012 return L; 2013 end if; 2014 end Concat; 2015 2016 ---------- 2017 -- Copy -- 2018 ---------- 2019 2020 function Copy (P : PE_Ptr) return PE_Ptr is 2021 begin 2022 if P = null then 2023 Uninitialized_Pattern; 2024 2025 else 2026 declare 2027 Refs : Ref_Array (1 .. P.Index); 2028 -- References to elements in P, indexed by Index field 2029 2030 Copy : Ref_Array (1 .. P.Index); 2031 -- Holds copies of elements of P, indexed by Index field. 2032 2033 E : PE_Ptr; 2034 2035 begin 2036 Build_Ref_Array (P, Refs); 2037 2038 -- Now copy all nodes 2039 2040 for J in Refs'Range loop 2041 Copy (J) := new PE'(Refs (J).all); 2042 end loop; 2043 2044 -- Adjust all internal references 2045 2046 for J in Copy'Range loop 2047 E := Copy (J); 2048 2049 -- Adjust successor pointer to point to copy 2050 2051 if E.Pthen /= EOP then 2052 E.Pthen := Copy (E.Pthen.Index); 2053 end if; 2054 2055 -- Adjust Alt pointer if there is one to point to copy 2056 2057 if E.Pcode in PC_Has_Alt and then E.Alt /= EOP then 2058 E.Alt := Copy (E.Alt.Index); 2059 end if; 2060 2061 -- Copy referenced string 2062 2063 if E.Pcode = PC_String then 2064 E.Str := new String'(E.Str.all); 2065 end if; 2066 end loop; 2067 2068 return Copy (P.Index); 2069 end; 2070 end if; 2071 end Copy; 2072 2073 ---------- 2074 -- Dump -- 2075 ---------- 2076 2077 procedure Dump (P : Pattern) is 2078 2079 subtype Count is Ada.Text_IO.Count; 2080 Scol : Count; 2081 -- Used to keep track of column in dump output 2082 2083 Refs : Ref_Array (1 .. P.P.Index); 2084 -- We build a reference array whose N'th element points to the 2085 -- pattern element whose Index value is N. 2086 2087 Cols : Natural := 2; 2088 -- Number of columns used for pattern numbers, minimum is 2 2089 2090 E : PE_Ptr; 2091 2092 procedure Write_Node_Id (E : PE_Ptr); 2093 -- Writes out a string identifying the given pattern element. 2094 2095 procedure Write_Node_Id (E : PE_Ptr) is 2096 begin 2097 if E = EOP then 2098 Put ("EOP"); 2099 2100 for J in 4 .. Cols loop 2101 Put (' '); 2102 end loop; 2103 2104 else 2105 declare 2106 Str : String (1 .. Cols); 2107 N : Natural := Natural (E.Index); 2108 2109 begin 2110 Put ("#"); 2111 2112 for J in reverse Str'Range loop 2113 Str (J) := Character'Val (48 + N mod 10); 2114 N := N / 10; 2115 end loop; 2116 2117 Put (Str); 2118 end; 2119 end if; 2120 end Write_Node_Id; 2121 2122 begin 2123 New_Line; 2124 Put ("Pattern Dump Output (pattern at " & 2125 Image (P'Address) & 2126 ", S = " & Natural'Image (P.Stk) & ')'); 2127 2128 Scol := Col; 2129 New_Line; 2130 2131 while Col < Scol loop 2132 Put ('-'); 2133 end loop; 2134 2135 New_Line; 2136 2137 -- If uninitialized pattern, dump line and we are done 2138 2139 if P.P = null then 2140 Put_Line ("Uninitialized pattern value"); 2141 return; 2142 end if; 2143 2144 -- If null pattern, just dump it and we are all done 2145 2146 if P.P = EOP then 2147 Put_Line ("EOP (null pattern)"); 2148 return; 2149 end if; 2150 2151 Build_Ref_Array (P.P, Refs); 2152 2153 -- Set number of columns required for node numbers 2154 2155 while 10 ** Cols - 1 < Integer (P.P.Index) loop 2156 Cols := Cols + 1; 2157 end loop; 2158 2159 -- Now dump the nodes in reverse sequence. We output them in reverse 2160 -- sequence since this corresponds to the natural order used to 2161 -- construct the patterns. 2162 2163 for J in reverse Refs'Range loop 2164 E := Refs (J); 2165 Write_Node_Id (E); 2166 Set_Col (Count (Cols) + 4); 2167 Put (Image (E)); 2168 Put (" "); 2169 Put (Pattern_Code'Image (E.Pcode)); 2170 Put (" "); 2171 Set_Col (21 + Count (Cols) + Address_Image_Length); 2172 Write_Node_Id (E.Pthen); 2173 Set_Col (24 + 2 * Count (Cols) + Address_Image_Length); 2174 2175 case E.Pcode is 2176 2177 when PC_Alt | 2178 PC_Arb_X | 2179 PC_Arbno_S | 2180 PC_Arbno_X => 2181 Write_Node_Id (E.Alt); 2182 2183 when PC_Rpat => 2184 Put (Str_PP (E.PP)); 2185 2186 when PC_Pred_Func => 2187 Put (Str_BF (E.BF)); 2188 2189 when PC_Assign_Imm | 2190 PC_Assign_OnM | 2191 PC_Any_VP | 2192 PC_Break_VP | 2193 PC_BreakX_VP | 2194 PC_NotAny_VP | 2195 PC_NSpan_VP | 2196 PC_Span_VP | 2197 PC_String_VP => 2198 Put (Str_VP (E.VP)); 2199 2200 when PC_Write_Imm | 2201 PC_Write_OnM => 2202 Put (Str_FP (E.FP)); 2203 2204 when PC_String => 2205 Put (Image (E.Str.all)); 2206 2207 when PC_String_2 => 2208 Put (Image (E.Str2)); 2209 2210 when PC_String_3 => 2211 Put (Image (E.Str3)); 2212 2213 when PC_String_4 => 2214 Put (Image (E.Str4)); 2215 2216 when PC_String_5 => 2217 Put (Image (E.Str5)); 2218 2219 when PC_String_6 => 2220 Put (Image (E.Str6)); 2221 2222 when PC_Setcur => 2223 Put (Str_NP (E.Var)); 2224 2225 when PC_Any_CH | 2226 PC_Break_CH | 2227 PC_BreakX_CH | 2228 PC_Char | 2229 PC_NotAny_CH | 2230 PC_NSpan_CH | 2231 PC_Span_CH => 2232 Put (''' & E.Char & '''); 2233 2234 when PC_Any_CS | 2235 PC_Break_CS | 2236 PC_BreakX_CS | 2237 PC_NotAny_CS | 2238 PC_NSpan_CS | 2239 PC_Span_CS => 2240 Put ('"' & To_Sequence (E.CS) & '"'); 2241 2242 when PC_Arbno_Y | 2243 PC_Len_Nat | 2244 PC_Pos_Nat | 2245 PC_RPos_Nat | 2246 PC_RTab_Nat | 2247 PC_Tab_Nat => 2248 Put (S (E.Nat)); 2249 2250 when PC_Pos_NF | 2251 PC_Len_NF | 2252 PC_RPos_NF | 2253 PC_RTab_NF | 2254 PC_Tab_NF => 2255 Put (Str_NF (E.NF)); 2256 2257 when PC_Pos_NP | 2258 PC_Len_NP | 2259 PC_RPos_NP | 2260 PC_RTab_NP | 2261 PC_Tab_NP => 2262 Put (Str_NP (E.NP)); 2263 2264 when PC_Any_VF | 2265 PC_Break_VF | 2266 PC_BreakX_VF | 2267 PC_NotAny_VF | 2268 PC_NSpan_VF | 2269 PC_Span_VF | 2270 PC_String_VF => 2271 Put (Str_VF (E.VF)); 2272 2273 when others => null; 2274 2275 end case; 2276 2277 New_Line; 2278 end loop; 2279 2280 New_Line; 2281 end Dump; 2282 2283 ---------- 2284 -- Fail -- 2285 ---------- 2286 2287 function Fail return Pattern is 2288 begin 2289 return (AFC with 0, new PE'(PC_Fail, 1, EOP)); 2290 end Fail; 2291 2292 ----------- 2293 -- Fence -- 2294 ----------- 2295 2296 -- Simple case 2297 2298 function Fence return Pattern is 2299 begin 2300 return (AFC with 1, new PE'(PC_Fence, 1, EOP)); 2301 end Fence; 2302 2303 -- Function case 2304 2305 -- +---+ +---+ +---+ 2306 -- | E |---->| P |---->| X |----> 2307 -- +---+ +---+ +---+ 2308 2309 -- The node numbering of the constituent pattern P is not affected. 2310 -- Where N is the number of nodes in P, the X node is numbered N + 1, 2311 -- and the E node is N + 2. 2312 2313 function Fence (P : Pattern) return Pattern is 2314 Pat : constant PE_Ptr := Copy (P.P); 2315 E : constant PE_Ptr := new PE'(PC_R_Enter, 0, EOP); 2316 X : constant PE_Ptr := new PE'(PC_Fence_X, 0, EOP); 2317 2318 begin 2319 return (AFC with P.Stk + 1, Bracket (E, Pat, X)); 2320 end Fence; 2321 2322 -------------- 2323 -- Finalize -- 2324 -------------- 2325 2326 procedure Finalize (Object : in out Pattern) is 2327 2328 procedure Free is new Unchecked_Deallocation (PE, PE_Ptr); 2329 procedure Free is new Unchecked_Deallocation (String, String_Ptr); 2330 2331 begin 2332 -- Nothing to do if already freed 2333 2334 if Object.P = null then 2335 return; 2336 2337 -- Otherwise we must free all elements 2338 2339 else 2340 declare 2341 Refs : Ref_Array (1 .. Object.P.Index); 2342 -- References to elements in pattern to be finalized 2343 2344 begin 2345 Build_Ref_Array (Object.P, Refs); 2346 2347 for J in Refs'Range loop 2348 if Refs (J).Pcode = PC_String then 2349 Free (Refs (J).Str); 2350 end if; 2351 2352 Free (Refs (J)); 2353 end loop; 2354 2355 Object.P := null; 2356 end; 2357 end if; 2358 end Finalize; 2359 2360 ----------- 2361 -- Image -- 2362 ----------- 2363 2364 function Image (P : PE_Ptr) return String is 2365 begin 2366 return Image (To_Address (P)); 2367 end Image; 2368 2369 function Image (P : Pattern) return String is 2370 begin 2371 return S (Image (P)); 2372 end Image; 2373 2374 function Image (P : Pattern) return VString is 2375 2376 Kill_Ampersand : Boolean := False; 2377 -- Set True to delete next & to be output to Result 2378 2379 Result : VString := Nul; 2380 -- The result is accumulated here, using Append 2381 2382 Refs : Ref_Array (1 .. P.P.Index); 2383 -- We build a reference array whose N'th element points to the 2384 -- pattern element whose Index value is N. 2385 2386 procedure Delete_Ampersand; 2387 -- Deletes the ampersand at the end of Result 2388 2389 procedure Image_Seq (E : PE_Ptr; Succ : PE_Ptr; Paren : Boolean); 2390 -- E refers to a pattern structure whose successor is given by Succ. 2391 -- This procedure appends to Result a representation of this pattern. 2392 -- The Paren parameter indicates whether parentheses are required if 2393 -- the output is more than one element. 2394 2395 procedure Image_One (E : in out PE_Ptr); 2396 -- E refers to a pattern structure. This procedure appends to Result 2397 -- a representation of the single simple or compound pattern structure 2398 -- at the start of E and updates E to point to its successor. 2399 2400 ---------------------- 2401 -- Delete_Ampersand -- 2402 ---------------------- 2403 2404 procedure Delete_Ampersand is 2405 L : constant Natural := Length (Result); 2406 2407 begin 2408 if L > 2 then 2409 Delete (Result, L - 1, L); 2410 end if; 2411 end Delete_Ampersand; 2412 2413 --------------- 2414 -- Image_One -- 2415 --------------- 2416 2417 procedure Image_One (E : in out PE_Ptr) is 2418 2419 ER : PE_Ptr := E.Pthen; 2420 -- Successor set as result in E unless reset 2421 2422 begin 2423 case E.Pcode is 2424 2425 when PC_Cancel => 2426 Append (Result, "Cancel"); 2427 2428 when PC_Alt => Alt : declare 2429 2430 Elmts_In_L : constant IndexT := E.Pthen.Index - E.Alt.Index; 2431 -- Number of elements in left pattern of alternation. 2432 2433 Lowest_In_L : constant IndexT := E.Index - Elmts_In_L; 2434 -- Number of lowest index in elements of left pattern 2435 2436 E1 : PE_Ptr; 2437 2438 begin 2439 -- The successor of the alternation node must have a lower 2440 -- index than any node that is in the left pattern or a 2441 -- higher index than the alternation node itself. 2442 2443 while ER /= EOP 2444 and then ER.Index >= Lowest_In_L 2445 and then ER.Index < E.Index 2446 loop 2447 ER := ER.Pthen; 2448 end loop; 2449 2450 Append (Result, '('); 2451 2452 E1 := E; 2453 loop 2454 Image_Seq (E1.Pthen, ER, False); 2455 Append (Result, " or "); 2456 E1 := E1.Alt; 2457 exit when E1.Pcode /= PC_Alt; 2458 end loop; 2459 2460 Image_Seq (E1, ER, False); 2461 Append (Result, ')'); 2462 end Alt; 2463 2464 when PC_Any_CS => 2465 Append (Result, "Any (" & Image (To_Sequence (E.CS)) & ')'); 2466 2467 when PC_Any_VF => 2468 Append (Result, "Any (" & Str_VF (E.VF) & ')'); 2469 2470 when PC_Any_VP => 2471 Append (Result, "Any (" & Str_VP (E.VP) & ')'); 2472 2473 when PC_Arb_X => 2474 Append (Result, "Arb"); 2475 2476 when PC_Arbno_S => 2477 Append (Result, "Arbno ("); 2478 Image_Seq (E.Alt, E, False); 2479 Append (Result, ')'); 2480 2481 when PC_Arbno_X => 2482 Append (Result, "Arbno ("); 2483 Image_Seq (E.Alt.Pthen, Refs (E.Index - 2), False); 2484 Append (Result, ')'); 2485 2486 when PC_Assign_Imm => 2487 Delete_Ampersand; 2488 Append (Result, "* " & Str_VP (Refs (E.Index - 1).VP)); 2489 2490 when PC_Assign_OnM => 2491 Delete_Ampersand; 2492 Append (Result, "** " & Str_VP (Refs (E.Index - 1).VP)); 2493 2494 when PC_Any_CH => 2495 Append (Result, "Any ('" & E.Char & "')"); 2496 2497 when PC_Bal => 2498 Append (Result, "Bal"); 2499 2500 when PC_Break_CH => 2501 Append (Result, "Break ('" & E.Char & "')"); 2502 2503 when PC_Break_CS => 2504 Append (Result, "Break (" & Image (To_Sequence (E.CS)) & ')'); 2505 2506 when PC_Break_VF => 2507 Append (Result, "Break (" & Str_VF (E.VF) & ')'); 2508 2509 when PC_Break_VP => 2510 Append (Result, "Break (" & Str_VP (E.VP) & ')'); 2511 2512 when PC_BreakX_CH => 2513 Append (Result, "BreakX ('" & E.Char & "')"); 2514 ER := ER.Pthen; 2515 2516 when PC_BreakX_CS => 2517 Append (Result, "BreakX (" & Image (To_Sequence (E.CS)) & ')'); 2518 ER := ER.Pthen; 2519 2520 when PC_BreakX_VF => 2521 Append (Result, "BreakX (" & Str_VF (E.VF) & ')'); 2522 ER := ER.Pthen; 2523 2524 when PC_BreakX_VP => 2525 Append (Result, "BreakX (" & Str_VP (E.VP) & ')'); 2526 ER := ER.Pthen; 2527 2528 when PC_Char => 2529 Append (Result, ''' & E.Char & '''); 2530 2531 when PC_Fail => 2532 Append (Result, "Fail"); 2533 2534 when PC_Fence => 2535 Append (Result, "Fence"); 2536 2537 when PC_Fence_X => 2538 Append (Result, "Fence ("); 2539 Image_Seq (E.Pthen, Refs (E.Index - 1), False); 2540 Append (Result, ")"); 2541 ER := Refs (E.Index - 1).Pthen; 2542 2543 when PC_Len_Nat => 2544 Append (Result, "Len (" & E.Nat & ')'); 2545 2546 when PC_Len_NF => 2547 Append (Result, "Len (" & Str_NF (E.NF) & ')'); 2548 2549 when PC_Len_NP => 2550 Append (Result, "Len (" & Str_NP (E.NP) & ')'); 2551 2552 when PC_NotAny_CH => 2553 Append (Result, "NotAny ('" & E.Char & "')"); 2554 2555 when PC_NotAny_CS => 2556 Append (Result, "NotAny (" & Image (To_Sequence (E.CS)) & ')'); 2557 2558 when PC_NotAny_VF => 2559 Append (Result, "NotAny (" & Str_VF (E.VF) & ')'); 2560 2561 when PC_NotAny_VP => 2562 Append (Result, "NotAny (" & Str_VP (E.VP) & ')'); 2563 2564 when PC_NSpan_CH => 2565 Append (Result, "NSpan ('" & E.Char & "')"); 2566 2567 when PC_NSpan_CS => 2568 Append (Result, "NSpan (" & Image (To_Sequence (E.CS)) & ')'); 2569 2570 when PC_NSpan_VF => 2571 Append (Result, "NSpan (" & Str_VF (E.VF) & ')'); 2572 2573 when PC_NSpan_VP => 2574 Append (Result, "NSpan (" & Str_VP (E.VP) & ')'); 2575 2576 when PC_Null => 2577 Append (Result, """"""); 2578 2579 when PC_Pos_Nat => 2580 Append (Result, "Pos (" & E.Nat & ')'); 2581 2582 when PC_Pos_NF => 2583 Append (Result, "Pos (" & Str_NF (E.NF) & ')'); 2584 2585 when PC_Pos_NP => 2586 Append (Result, "Pos (" & Str_NP (E.NP) & ')'); 2587 2588 when PC_R_Enter => 2589 Kill_Ampersand := True; 2590 2591 when PC_Rest => 2592 Append (Result, "Rest"); 2593 2594 when PC_Rpat => 2595 Append (Result, "(+ " & Str_PP (E.PP) & ')'); 2596 2597 when PC_Pred_Func => 2598 Append (Result, "(+ " & Str_BF (E.BF) & ')'); 2599 2600 when PC_RPos_Nat => 2601 Append (Result, "RPos (" & E.Nat & ')'); 2602 2603 when PC_RPos_NF => 2604 Append (Result, "RPos (" & Str_NF (E.NF) & ')'); 2605 2606 when PC_RPos_NP => 2607 Append (Result, "RPos (" & Str_NP (E.NP) & ')'); 2608 2609 when PC_RTab_Nat => 2610 Append (Result, "RTab (" & E.Nat & ')'); 2611 2612 when PC_RTab_NF => 2613 Append (Result, "RTab (" & Str_NF (E.NF) & ')'); 2614 2615 when PC_RTab_NP => 2616 Append (Result, "RTab (" & Str_NP (E.NP) & ')'); 2617 2618 when PC_Setcur => 2619 Append (Result, "Setcur (" & Str_NP (E.Var) & ')'); 2620 2621 when PC_Span_CH => 2622 Append (Result, "Span ('" & E.Char & "')"); 2623 2624 when PC_Span_CS => 2625 Append (Result, "Span (" & Image (To_Sequence (E.CS)) & ')'); 2626 2627 when PC_Span_VF => 2628 Append (Result, "Span (" & Str_VF (E.VF) & ')'); 2629 2630 when PC_Span_VP => 2631 Append (Result, "Span (" & Str_VP (E.VP) & ')'); 2632 2633 when PC_String => 2634 Append (Result, Image (E.Str.all)); 2635 2636 when PC_String_2 => 2637 Append (Result, Image (E.Str2)); 2638 2639 when PC_String_3 => 2640 Append (Result, Image (E.Str3)); 2641 2642 when PC_String_4 => 2643 Append (Result, Image (E.Str4)); 2644 2645 when PC_String_5 => 2646 Append (Result, Image (E.Str5)); 2647 2648 when PC_String_6 => 2649 Append (Result, Image (E.Str6)); 2650 2651 when PC_String_VF => 2652 Append (Result, "(+" & Str_VF (E.VF) & ')'); 2653 2654 when PC_String_VP => 2655 Append (Result, "(+" & Str_VP (E.VP) & ')'); 2656 2657 when PC_Succeed => 2658 Append (Result, "Succeed"); 2659 2660 when PC_Tab_Nat => 2661 Append (Result, "Tab (" & E.Nat & ')'); 2662 2663 when PC_Tab_NF => 2664 Append (Result, "Tab (" & Str_NF (E.NF) & ')'); 2665 2666 when PC_Tab_NP => 2667 Append (Result, "Tab (" & Str_NP (E.NP) & ')'); 2668 2669 when PC_Write_Imm => 2670 Append (Result, '('); 2671 Image_Seq (E, Refs (E.Index - 1), True); 2672 Append (Result, " * " & Str_FP (Refs (E.Index - 1).FP)); 2673 ER := Refs (E.Index - 1).Pthen; 2674 2675 when PC_Write_OnM => 2676 Append (Result, '('); 2677 Image_Seq (E.Pthen, Refs (E.Index - 1), True); 2678 Append (Result, " ** " & Str_FP (Refs (E.Index - 1).FP)); 2679 ER := Refs (E.Index - 1).Pthen; 2680 2681 -- Other pattern codes should not appear as leading elements 2682 2683 when PC_Arb_Y | 2684 PC_Arbno_Y | 2685 PC_Assign | 2686 PC_BreakX_X | 2687 PC_EOP | 2688 PC_Fence_Y | 2689 PC_R_Remove | 2690 PC_R_Restore | 2691 PC_Unanchored => 2692 Append (Result, "???"); 2693 2694 end case; 2695 2696 E := ER; 2697 end Image_One; 2698 2699 --------------- 2700 -- Image_Seq -- 2701 --------------- 2702 2703 procedure Image_Seq (E : PE_Ptr; Succ : PE_Ptr; Paren : Boolean) is 2704 Indx : constant Natural := Length (Result); 2705 E1 : PE_Ptr := E; 2706 Mult : Boolean := False; 2707 2708 begin 2709 -- The image of EOP is "" (the null string) 2710 2711 if E = EOP then 2712 Append (Result, """"""); 2713 2714 -- Else generate appropriate concatenation sequence 2715 2716 else 2717 loop 2718 Image_One (E1); 2719 exit when E1 = Succ; 2720 exit when E1 = EOP; 2721 Mult := True; 2722 2723 if Kill_Ampersand then 2724 Kill_Ampersand := False; 2725 else 2726 Append (Result, " & "); 2727 end if; 2728 end loop; 2729 end if; 2730 2731 if Mult and Paren then 2732 Insert (Result, Indx + 1, "("); 2733 Append (Result, ")"); 2734 end if; 2735 end Image_Seq; 2736 2737 -- Start of processing for Image 2738 2739 begin 2740 Build_Ref_Array (P.P, Refs); 2741 Image_Seq (P.P, EOP, False); 2742 return Result; 2743 end Image; 2744 2745 ----------- 2746 -- Is_In -- 2747 ----------- 2748 2749 function Is_In (C : Character; Str : String) return Boolean is 2750 begin 2751 for J in Str'Range loop 2752 if Str (J) = C then 2753 return True; 2754 end if; 2755 end loop; 2756 2757 return False; 2758 end Is_In; 2759 2760 --------- 2761 -- Len -- 2762 --------- 2763 2764 function Len (Count : Natural) return Pattern is 2765 begin 2766 -- Note, the following is not just an optimization, it is needed 2767 -- to ensure that Arbno (Len (0)) does not generate an infinite 2768 -- matching loop (since PC_Len_Nat is OK_For_Simple_Arbno). 2769 2770 if Count = 0 then 2771 return (AFC with 0, new PE'(PC_Null, 1, EOP)); 2772 2773 else 2774 return (AFC with 0, new PE'(PC_Len_Nat, 1, EOP, Count)); 2775 end if; 2776 end Len; 2777 2778 function Len (Count : Natural_Func) return Pattern is 2779 begin 2780 return (AFC with 0, new PE'(PC_Len_NF, 1, EOP, Count)); 2781 end Len; 2782 2783 function Len (Count : access Natural) return Pattern is 2784 begin 2785 return (AFC with 0, new PE'(PC_Len_NP, 1, EOP, Natural_Ptr (Count))); 2786 end Len; 2787 2788 ----------------- 2789 -- Logic_Error -- 2790 ----------------- 2791 2792 procedure Logic_Error is 2793 begin 2794 Raise_Exception 2795 (Program_Error'Identity, 2796 "Internal logic error in GNAT.Spitbol.Patterns"); 2797 end Logic_Error; 2798 2799 ----------- 2800 -- Match -- 2801 ----------- 2802 2803 function Match 2804 (Subject : VString; 2805 Pat : Pattern) 2806 return Boolean 2807 is 2808 Start, Stop : Natural; 2809 2810 begin 2811 if Debug_Mode then 2812 XMatchD (Get_String (Subject).all, Pat.P, Pat.Stk, Start, Stop); 2813 else 2814 XMatch (Get_String (Subject).all, Pat.P, Pat.Stk, Start, Stop); 2815 end if; 2816 2817 return Start /= 0; 2818 end Match; 2819 2820 function Match 2821 (Subject : String; 2822 Pat : Pattern) 2823 return Boolean 2824 is 2825 Start, Stop : Natural; 2826 subtype String1 is String (1 .. Subject'Length); 2827 2828 begin 2829 if Debug_Mode then 2830 XMatchD (String1 (Subject), Pat.P, Pat.Stk, Start, Stop); 2831 else 2832 XMatch (String1 (Subject), Pat.P, Pat.Stk, Start, Stop); 2833 end if; 2834 2835 return Start /= 0; 2836 end Match; 2837 2838 function Match 2839 (Subject : VString_Var; 2840 Pat : Pattern; 2841 Replace : VString) 2842 return Boolean 2843 is 2844 Start, Stop : Natural; 2845 2846 begin 2847 if Debug_Mode then 2848 XMatchD (Get_String (Subject).all, Pat.P, Pat.Stk, Start, Stop); 2849 else 2850 XMatch (Get_String (Subject).all, Pat.P, Pat.Stk, Start, Stop); 2851 end if; 2852 2853 if Start = 0 then 2854 return False; 2855 else 2856 Replace_Slice 2857 (Subject'Unrestricted_Access.all, 2858 Start, Stop, Get_String (Replace).all); 2859 return True; 2860 end if; 2861 end Match; 2862 2863 function Match 2864 (Subject : VString_Var; 2865 Pat : Pattern; 2866 Replace : String) 2867 return Boolean 2868 is 2869 Start, Stop : Natural; 2870 2871 begin 2872 if Debug_Mode then 2873 XMatchD (Get_String (Subject).all, Pat.P, Pat.Stk, Start, Stop); 2874 else 2875 XMatch (Get_String (Subject).all, Pat.P, Pat.Stk, Start, Stop); 2876 end if; 2877 2878 if Start = 0 then 2879 return False; 2880 else 2881 Replace_Slice 2882 (Subject'Unrestricted_Access.all, Start, Stop, Replace); 2883 return True; 2884 end if; 2885 end Match; 2886 2887 procedure Match 2888 (Subject : VString; 2889 Pat : Pattern) 2890 is 2891 Start, Stop : Natural; 2892 2893 begin 2894 if Debug_Mode then 2895 XMatchD (Get_String (Subject).all, Pat.P, Pat.Stk, Start, Stop); 2896 else 2897 XMatch (Get_String (Subject).all, Pat.P, Pat.Stk, Start, Stop); 2898 end if; 2899 2900 end Match; 2901 2902 procedure Match 2903 (Subject : String; 2904 Pat : Pattern) 2905 is 2906 Start, Stop : Natural; 2907 subtype String1 is String (1 .. Subject'Length); 2908 begin 2909 if Debug_Mode then 2910 XMatchD (String1 (Subject), Pat.P, Pat.Stk, Start, Stop); 2911 else 2912 XMatch (String1 (Subject), Pat.P, Pat.Stk, Start, Stop); 2913 end if; 2914 end Match; 2915 2916 procedure Match 2917 (Subject : in out VString; 2918 Pat : Pattern; 2919 Replace : VString) 2920 is 2921 Start, Stop : Natural; 2922 2923 begin 2924 if Debug_Mode then 2925 XMatchD (Get_String (Subject).all, Pat.P, Pat.Stk, Start, Stop); 2926 else 2927 XMatch (Get_String (Subject).all, Pat.P, Pat.Stk, Start, Stop); 2928 end if; 2929 2930 if Start /= 0 then 2931 Replace_Slice (Subject, Start, Stop, Get_String (Replace).all); 2932 end if; 2933 end Match; 2934 2935 procedure Match 2936 (Subject : in out VString; 2937 Pat : Pattern; 2938 Replace : String) 2939 is 2940 Start, Stop : Natural; 2941 2942 begin 2943 if Debug_Mode then 2944 XMatchD (Get_String (Subject).all, Pat.P, Pat.Stk, Start, Stop); 2945 else 2946 XMatch (Get_String (Subject).all, Pat.P, Pat.Stk, Start, Stop); 2947 end if; 2948 2949 if Start /= 0 then 2950 Replace_Slice (Subject, Start, Stop, Replace); 2951 end if; 2952 end Match; 2953 2954 function Match 2955 (Subject : VString; 2956 Pat : PString) 2957 return Boolean 2958 is 2959 Pat_Len : constant Natural := Pat'Length; 2960 Sub_Len : constant Natural := Length (Subject); 2961 Sub_Str : constant String_Access := Get_String (Subject); 2962 2963 begin 2964 if Anchored_Mode then 2965 if Pat_Len > Sub_Len then 2966 return False; 2967 else 2968 return Pat = Sub_Str.all (1 .. Pat_Len); 2969 end if; 2970 2971 else 2972 for J in 1 .. Sub_Len - Pat_Len + 1 loop 2973 if Pat = Sub_Str.all (J .. J + (Pat_Len - 1)) then 2974 return True; 2975 end if; 2976 end loop; 2977 2978 return False; 2979 end if; 2980 end Match; 2981 2982 function Match 2983 (Subject : String; 2984 Pat : PString) 2985 return Boolean 2986 is 2987 Pat_Len : constant Natural := Pat'Length; 2988 Sub_Len : constant Natural := Subject'Length; 2989 SFirst : constant Natural := Subject'First; 2990 2991 begin 2992 if Anchored_Mode then 2993 if Pat_Len > Sub_Len then 2994 return False; 2995 else 2996 return Pat = Subject (SFirst .. SFirst + Pat_Len - 1); 2997 end if; 2998 2999 else 3000 for J in SFirst .. SFirst + Sub_Len - Pat_Len loop 3001 if Pat = Subject (J .. J + (Pat_Len - 1)) then 3002 return True; 3003 end if; 3004 end loop; 3005 3006 return False; 3007 end if; 3008 end Match; 3009 3010 function Match 3011 (Subject : VString_Var; 3012 Pat : PString; 3013 Replace : VString) 3014 return Boolean 3015 is 3016 Start, Stop : Natural; 3017 3018 begin 3019 if Debug_Mode then 3020 XMatchD (Get_String (Subject).all, S_To_PE (Pat), 0, Start, Stop); 3021 else 3022 XMatch (Get_String (Subject).all, S_To_PE (Pat), 0, Start, Stop); 3023 end if; 3024 3025 if Start = 0 then 3026 return False; 3027 else 3028 Replace_Slice 3029 (Subject'Unrestricted_Access.all, 3030 Start, Stop, Get_String (Replace).all); 3031 return True; 3032 end if; 3033 end Match; 3034 3035 function Match 3036 (Subject : VString_Var; 3037 Pat : PString; 3038 Replace : String) 3039 return Boolean 3040 is 3041 Start, Stop : Natural; 3042 3043 begin 3044 if Debug_Mode then 3045 XMatchD (Get_String (Subject).all, S_To_PE (Pat), 0, Start, Stop); 3046 else 3047 XMatch (Get_String (Subject).all, S_To_PE (Pat), 0, Start, Stop); 3048 end if; 3049 3050 if Start = 0 then 3051 return False; 3052 else 3053 Replace_Slice 3054 (Subject'Unrestricted_Access.all, Start, Stop, Replace); 3055 return True; 3056 end if; 3057 end Match; 3058 3059 procedure Match 3060 (Subject : VString; 3061 Pat : PString) 3062 is 3063 Start, Stop : Natural; 3064 3065 begin 3066 if Debug_Mode then 3067 XMatchD (Get_String (Subject).all, S_To_PE (Pat), 0, Start, Stop); 3068 else 3069 XMatch (Get_String (Subject).all, S_To_PE (Pat), 0, Start, Stop); 3070 end if; 3071 end Match; 3072 3073 procedure Match 3074 (Subject : String; 3075 Pat : PString) 3076 is 3077 Start, Stop : Natural; 3078 subtype String1 is String (1 .. Subject'Length); 3079 3080 begin 3081 if Debug_Mode then 3082 XMatchD (String1 (Subject), S_To_PE (Pat), 0, Start, Stop); 3083 else 3084 XMatch (String1 (Subject), S_To_PE (Pat), 0, Start, Stop); 3085 end if; 3086 end Match; 3087 3088 procedure Match 3089 (Subject : in out VString; 3090 Pat : PString; 3091 Replace : VString) 3092 is 3093 Start, Stop : Natural; 3094 3095 begin 3096 if Debug_Mode then 3097 XMatchD (Get_String (Subject).all, S_To_PE (Pat), 0, Start, Stop); 3098 else 3099 XMatch (Get_String (Subject).all, S_To_PE (Pat), 0, Start, Stop); 3100 end if; 3101 3102 if Start /= 0 then 3103 Replace_Slice (Subject, Start, Stop, Get_String (Replace).all); 3104 end if; 3105 end Match; 3106 3107 procedure Match 3108 (Subject : in out VString; 3109 Pat : PString; 3110 Replace : String) 3111 is 3112 Start, Stop : Natural; 3113 3114 begin 3115 if Debug_Mode then 3116 XMatchD (Get_String (Subject).all, S_To_PE (Pat), 0, Start, Stop); 3117 else 3118 XMatch (Get_String (Subject).all, S_To_PE (Pat), 0, Start, Stop); 3119 end if; 3120 3121 if Start /= 0 then 3122 Replace_Slice (Subject, Start, Stop, Replace); 3123 end if; 3124 end Match; 3125 3126 function Match 3127 (Subject : VString_Var; 3128 Pat : Pattern; 3129 Result : Match_Result_Var) 3130 return Boolean 3131 is 3132 Start, Stop : Natural; 3133 3134 begin 3135 if Debug_Mode then 3136 XMatchD (Get_String (Subject).all, Pat.P, Pat.Stk, Start, Stop); 3137 else 3138 XMatch (Get_String (Subject).all, Pat.P, Pat.Stk, Start, Stop); 3139 end if; 3140 3141 if Start = 0 then 3142 Result'Unrestricted_Access.all.Var := null; 3143 return False; 3144 3145 else 3146 Result'Unrestricted_Access.all.Var := Subject'Unrestricted_Access; 3147 Result'Unrestricted_Access.all.Start := Start; 3148 Result'Unrestricted_Access.all.Stop := Stop; 3149 return True; 3150 end if; 3151 end Match; 3152 3153 procedure Match 3154 (Subject : in out VString; 3155 Pat : Pattern; 3156 Result : out Match_Result) 3157 is 3158 Start, Stop : Natural; 3159 3160 begin 3161 if Debug_Mode then 3162 XMatchD (Get_String (Subject).all, Pat.P, Pat.Stk, Start, Stop); 3163 else 3164 XMatch (Get_String (Subject).all, Pat.P, Pat.Stk, Start, Stop); 3165 end if; 3166 3167 if Start = 0 then 3168 Result.Var := null; 3169 3170 else 3171 Result.Var := Subject'Unrestricted_Access; 3172 Result.Start := Start; 3173 Result.Stop := Stop; 3174 end if; 3175 end Match; 3176 3177 --------------- 3178 -- New_LineD -- 3179 --------------- 3180 3181 procedure New_LineD is 3182 begin 3183 if Internal_Debug then 3184 New_Line; 3185 end if; 3186 end New_LineD; 3187 3188 ------------ 3189 -- NotAny -- 3190 ------------ 3191 3192 function NotAny (Str : String) return Pattern is 3193 begin 3194 return (AFC with 0, new PE'(PC_NotAny_CS, 1, EOP, To_Set (Str))); 3195 end NotAny; 3196 3197 function NotAny (Str : VString) return Pattern is 3198 begin 3199 return NotAny (S (Str)); 3200 end NotAny; 3201 3202 function NotAny (Str : Character) return Pattern is 3203 begin 3204 return (AFC with 0, new PE'(PC_NotAny_CH, 1, EOP, Str)); 3205 end NotAny; 3206 3207 function NotAny (Str : Character_Set) return Pattern is 3208 begin 3209 return (AFC with 0, new PE'(PC_NotAny_CS, 1, EOP, Str)); 3210 end NotAny; 3211 3212 function NotAny (Str : access VString) return Pattern is 3213 begin 3214 return (AFC with 0, new PE'(PC_NotAny_VP, 1, EOP, VString_Ptr (Str))); 3215 end NotAny; 3216 3217 function NotAny (Str : VString_Func) return Pattern is 3218 begin 3219 return (AFC with 0, new PE'(PC_NotAny_VF, 1, EOP, Str)); 3220 end NotAny; 3221 3222 ----------- 3223 -- NSpan -- 3224 ----------- 3225 3226 function NSpan (Str : String) return Pattern is 3227 begin 3228 return (AFC with 0, new PE'(PC_NSpan_CS, 1, EOP, To_Set (Str))); 3229 end NSpan; 3230 3231 function NSpan (Str : VString) return Pattern is 3232 begin 3233 return NSpan (S (Str)); 3234 end NSpan; 3235 3236 function NSpan (Str : Character) return Pattern is 3237 begin 3238 return (AFC with 0, new PE'(PC_NSpan_CH, 1, EOP, Str)); 3239 end NSpan; 3240 3241 function NSpan (Str : Character_Set) return Pattern is 3242 begin 3243 return (AFC with 0, new PE'(PC_NSpan_CS, 1, EOP, Str)); 3244 end NSpan; 3245 3246 function NSpan (Str : access VString) return Pattern is 3247 begin 3248 return (AFC with 0, new PE'(PC_NSpan_VP, 1, EOP, VString_Ptr (Str))); 3249 end NSpan; 3250 3251 function NSpan (Str : VString_Func) return Pattern is 3252 begin 3253 return (AFC with 0, new PE'(PC_NSpan_VF, 1, EOP, Str)); 3254 end NSpan; 3255 3256 --------- 3257 -- Pos -- 3258 --------- 3259 3260 function Pos (Count : Natural) return Pattern is 3261 begin 3262 return (AFC with 0, new PE'(PC_Pos_Nat, 1, EOP, Count)); 3263 end Pos; 3264 3265 function Pos (Count : Natural_Func) return Pattern is 3266 begin 3267 return (AFC with 0, new PE'(PC_Pos_NF, 1, EOP, Count)); 3268 end Pos; 3269 3270 function Pos (Count : access Natural) return Pattern is 3271 begin 3272 return (AFC with 0, new PE'(PC_Pos_NP, 1, EOP, Natural_Ptr (Count))); 3273 end Pos; 3274 3275 ---------- 3276 -- PutD -- 3277 ---------- 3278 3279 procedure PutD (Str : String) is 3280 begin 3281 if Internal_Debug then 3282 Put (Str); 3283 end if; 3284 end PutD; 3285 3286 --------------- 3287 -- Put_LineD -- 3288 --------------- 3289 3290 procedure Put_LineD (Str : String) is 3291 begin 3292 if Internal_Debug then 3293 Put_Line (Str); 3294 end if; 3295 end Put_LineD; 3296 3297 ------------- 3298 -- Replace -- 3299 ------------- 3300 3301 procedure Replace 3302 (Result : in out Match_Result; 3303 Replace : VString) 3304 is 3305 begin 3306 if Result.Var /= null then 3307 Replace_Slice 3308 (Result.Var.all, 3309 Result.Start, 3310 Result.Stop, 3311 Get_String (Replace).all); 3312 Result.Var := null; 3313 end if; 3314 end Replace; 3315 3316 ---------- 3317 -- Rest -- 3318 ---------- 3319 3320 function Rest return Pattern is 3321 begin 3322 return (AFC with 0, new PE'(PC_Rest, 1, EOP)); 3323 end Rest; 3324 3325 ---------- 3326 -- Rpos -- 3327 ---------- 3328 3329 function Rpos (Count : Natural) return Pattern is 3330 begin 3331 return (AFC with 0, new PE'(PC_RPos_Nat, 1, EOP, Count)); 3332 end Rpos; 3333 3334 function Rpos (Count : Natural_Func) return Pattern is 3335 begin 3336 return (AFC with 0, new PE'(PC_RPos_NF, 1, EOP, Count)); 3337 end Rpos; 3338 3339 function Rpos (Count : access Natural) return Pattern is 3340 begin 3341 return (AFC with 0, new PE'(PC_RPos_NP, 1, EOP, Natural_Ptr (Count))); 3342 end Rpos; 3343 3344 ---------- 3345 -- Rtab -- 3346 ---------- 3347 3348 function Rtab (Count : Natural) return Pattern is 3349 begin 3350 return (AFC with 0, new PE'(PC_RTab_Nat, 1, EOP, Count)); 3351 end Rtab; 3352 3353 function Rtab (Count : Natural_Func) return Pattern is 3354 begin 3355 return (AFC with 0, new PE'(PC_RTab_NF, 1, EOP, Count)); 3356 end Rtab; 3357 3358 function Rtab (Count : access Natural) return Pattern is 3359 begin 3360 return (AFC with 0, new PE'(PC_RTab_NP, 1, EOP, Natural_Ptr (Count))); 3361 end Rtab; 3362 3363 ------------- 3364 -- S_To_PE -- 3365 ------------- 3366 3367 function S_To_PE (Str : PString) return PE_Ptr is 3368 Len : constant Natural := Str'Length; 3369 3370 begin 3371 case Len is 3372 when 0 => 3373 return new PE'(PC_Null, 1, EOP); 3374 3375 when 1 => 3376 return new PE'(PC_Char, 1, EOP, Str (1)); 3377 3378 when 2 => 3379 return new PE'(PC_String_2, 1, EOP, Str); 3380 3381 when 3 => 3382 return new PE'(PC_String_3, 1, EOP, Str); 3383 3384 when 4 => 3385 return new PE'(PC_String_4, 1, EOP, Str); 3386 3387 when 5 => 3388 return new PE'(PC_String_5, 1, EOP, Str); 3389 3390 when 6 => 3391 return new PE'(PC_String_6, 1, EOP, Str); 3392 3393 when others => 3394 return new PE'(PC_String, 1, EOP, new String'(Str)); 3395 3396 end case; 3397 end S_To_PE; 3398 3399 ------------------- 3400 -- Set_Successor -- 3401 ------------------- 3402 3403 -- Note: this procedure is not used by the normal concatenation circuit, 3404 -- since other fixups are required on the left operand in this case, and 3405 -- they might as well be done all together. 3406 3407 procedure Set_Successor (Pat : PE_Ptr; Succ : PE_Ptr) is 3408 begin 3409 if Pat = null then 3410 Uninitialized_Pattern; 3411 3412 elsif Pat = EOP then 3413 Logic_Error; 3414 3415 else 3416 declare 3417 Refs : Ref_Array (1 .. Pat.Index); 3418 -- We build a reference array for L whose N'th element points to 3419 -- the pattern element of L whose original Index value is N. 3420 3421 P : PE_Ptr; 3422 3423 begin 3424 Build_Ref_Array (Pat, Refs); 3425 3426 for J in Refs'Range loop 3427 P := Refs (J); 3428 3429 if P.Pthen = EOP then 3430 P.Pthen := Succ; 3431 end if; 3432 3433 if P.Pcode in PC_Has_Alt and then P.Alt = EOP then 3434 P.Alt := Succ; 3435 end if; 3436 end loop; 3437 end; 3438 end if; 3439 end Set_Successor; 3440 3441 ------------ 3442 -- Setcur -- 3443 ------------ 3444 3445 function Setcur (Var : access Natural) return Pattern is 3446 begin 3447 return (AFC with 0, new PE'(PC_Setcur, 1, EOP, Natural_Ptr (Var))); 3448 end Setcur; 3449 3450 ---------- 3451 -- Span -- 3452 ---------- 3453 3454 function Span (Str : String) return Pattern is 3455 begin 3456 return (AFC with 0, new PE'(PC_Span_CS, 1, EOP, To_Set (Str))); 3457 end Span; 3458 3459 function Span (Str : VString) return Pattern is 3460 begin 3461 return Span (S (Str)); 3462 end Span; 3463 3464 function Span (Str : Character) return Pattern is 3465 begin 3466 return (AFC with 0, new PE'(PC_Span_CH, 1, EOP, Str)); 3467 end Span; 3468 3469 function Span (Str : Character_Set) return Pattern is 3470 begin 3471 return (AFC with 0, new PE'(PC_Span_CS, 1, EOP, Str)); 3472 end Span; 3473 3474 function Span (Str : access VString) return Pattern is 3475 begin 3476 return (AFC with 0, new PE'(PC_Span_VP, 1, EOP, VString_Ptr (Str))); 3477 end Span; 3478 3479 function Span (Str : VString_Func) return Pattern is 3480 begin 3481 return (AFC with 0, new PE'(PC_Span_VF, 1, EOP, Str)); 3482 end Span; 3483 3484 ------------ 3485 -- Str_BF -- 3486 ------------ 3487 3488 function Str_BF (A : Boolean_Func) return String is 3489 function To_A is new Unchecked_Conversion (Boolean_Func, Address); 3490 3491 begin 3492 return "BF(" & Image (To_A (A)) & ')'; 3493 end Str_BF; 3494 3495 ------------ 3496 -- Str_FP -- 3497 ------------ 3498 3499 function Str_FP (A : File_Ptr) return String is 3500 begin 3501 return "FP(" & Image (A.all'Address) & ')'; 3502 end Str_FP; 3503 3504 ------------ 3505 -- Str_NF -- 3506 ------------ 3507 3508 function Str_NF (A : Natural_Func) return String is 3509 function To_A is new Unchecked_Conversion (Natural_Func, Address); 3510 3511 begin 3512 return "NF(" & Image (To_A (A)) & ')'; 3513 end Str_NF; 3514 3515 ------------ 3516 -- Str_NP -- 3517 ------------ 3518 3519 function Str_NP (A : Natural_Ptr) return String is 3520 begin 3521 return "NP(" & Image (A.all'Address) & ')'; 3522 end Str_NP; 3523 3524 ------------ 3525 -- Str_PP -- 3526 ------------ 3527 3528 function Str_PP (A : Pattern_Ptr) return String is 3529 begin 3530 return "PP(" & Image (A.all'Address) & ')'; 3531 end Str_PP; 3532 3533 ------------ 3534 -- Str_VF -- 3535 ------------ 3536 3537 function Str_VF (A : VString_Func) return String is 3538 function To_A is new Unchecked_Conversion (VString_Func, Address); 3539 3540 begin 3541 return "VF(" & Image (To_A (A)) & ')'; 3542 end Str_VF; 3543 3544 ------------ 3545 -- Str_VP -- 3546 ------------ 3547 3548 function Str_VP (A : VString_Ptr) return String is 3549 begin 3550 return "VP(" & Image (A.all'Address) & ')'; 3551 end Str_VP; 3552 3553 ------------- 3554 -- Succeed -- 3555 ------------- 3556 3557 function Succeed return Pattern is 3558 begin 3559 return (AFC with 1, new PE'(PC_Succeed, 1, EOP)); 3560 end Succeed; 3561 3562 --------- 3563 -- Tab -- 3564 --------- 3565 3566 function Tab (Count : Natural) return Pattern is 3567 begin 3568 return (AFC with 0, new PE'(PC_Tab_Nat, 1, EOP, Count)); 3569 end Tab; 3570 3571 function Tab (Count : Natural_Func) return Pattern is 3572 begin 3573 return (AFC with 0, new PE'(PC_Tab_NF, 1, EOP, Count)); 3574 end Tab; 3575 3576 function Tab (Count : access Natural) return Pattern is 3577 begin 3578 return (AFC with 0, new PE'(PC_Tab_NP, 1, EOP, Natural_Ptr (Count))); 3579 end Tab; 3580 3581 --------------------------- 3582 -- Uninitialized_Pattern -- 3583 --------------------------- 3584 3585 procedure Uninitialized_Pattern is 3586 begin 3587 Raise_Exception 3588 (Program_Error'Identity, 3589 "uninitialized value of type GNAT.Spitbol.Patterns.Pattern"); 3590 end Uninitialized_Pattern; 3591 3592 ------------ 3593 -- XMatch -- 3594 ------------ 3595 3596 procedure XMatch 3597 (Subject : String; 3598 Pat_P : PE_Ptr; 3599 Pat_S : Natural; 3600 Start : out Natural; 3601 Stop : out Natural) 3602 is 3603 Node : PE_Ptr; 3604 -- Pointer to current pattern node. Initialized from Pat_P, and then 3605 -- updated as the match proceeds through its constituent elements. 3606 3607 Length : constant Natural := Subject'Length; 3608 -- Length of string (= Subject'Last, since Subject'First is always 1) 3609 3610 Cursor : Integer := 0; 3611 -- If the value is non-negative, then this value is the index showing 3612 -- the current position of the match in the subject string. The next 3613 -- character to be matched is at Subject (Cursor + 1). Note that since 3614 -- our view of the subject string in XMatch always has a lower bound 3615 -- of one, regardless of original bounds, that this definition exactly 3616 -- corresponds to the cursor value as referenced by functions like Pos. 3617 -- 3618 -- If the value is negative, then this is a saved stack pointer, 3619 -- typically a base pointer of an inner or outer region. Cursor 3620 -- temporarily holds such a value when it is popped from the stack 3621 -- by Fail. In all cases, Cursor is reset to a proper non-negative 3622 -- cursor value before the match proceeds (e.g. by propagating the 3623 -- failure and popping a "real" cursor value from the stack. 3624 3625 PE_Unanchored : aliased PE := (PC_Unanchored, 0, Pat_P); 3626 -- Dummy pattern element used in the unanchored case. 3627 3628 Stack : Stack_Type; 3629 -- The pattern matching failure stack for this call to Match 3630 3631 Stack_Ptr : Stack_Range; 3632 -- Current stack pointer. This points to the top element of the stack 3633 -- that is currently in use. At the outer level this is the special 3634 -- entry placed on the stack according to the anchor mode. 3635 3636 Stack_Init : constant Stack_Range := Stack'First + 1; 3637 -- This is the initial value of the Stack_Ptr and Stack_Base. The 3638 -- initial (Stack'First) element of the stack is not used so that 3639 -- when we pop the last element off, Stack_Ptr is still in range. 3640 3641 Stack_Base : Stack_Range; 3642 -- This value is the stack base value, i.e. the stack pointer for the 3643 -- first history stack entry in the current stack region. See separate 3644 -- section on handling of recursive pattern matches. 3645 3646 Assign_OnM : Boolean := False; 3647 -- Set True if assign-on-match or write-on-match operations may be 3648 -- present in the history stack, which must then be scanned on a 3649 -- successful match. 3650 3651 procedure Pop_Region; 3652 pragma Inline (Pop_Region); 3653 -- Used at the end of processing of an inner region. if the inner 3654 -- region left no stack entries, then all trace of it is removed. 3655 -- Otherwise a PC_Restore_Region entry is pushed to ensure proper 3656 -- handling of alternatives in the inner region. 3657 3658 procedure Push (Node : PE_Ptr); 3659 pragma Inline (Push); 3660 -- Make entry in pattern matching stack with current cursor valeu 3661 3662 procedure Push_Region; 3663 pragma Inline (Push_Region); 3664 -- This procedure makes a new region on the history stack. The 3665 -- caller first establishes the special entry on the stack, but 3666 -- does not push the stack pointer. Then this call stacks a 3667 -- PC_Remove_Region node, on top of this entry, using the cursor 3668 -- field of the PC_Remove_Region entry to save the outer level 3669 -- stack base value, and resets the stack base to point to this 3670 -- PC_Remove_Region node. 3671 3672 ---------------- 3673 -- Pop_Region -- 3674 ---------------- 3675 3676 procedure Pop_Region is 3677 begin 3678 -- If nothing was pushed in the inner region, we can just get 3679 -- rid of it entirely, leaving no traces that it was ever there 3680 3681 if Stack_Ptr = Stack_Base then 3682 Stack_Ptr := Stack_Base - 2; 3683 Stack_Base := Stack (Stack_Ptr + 2).Cursor; 3684 3685 -- If stuff was pushed in the inner region, then we have to 3686 -- push a PC_R_Restore node so that we properly handle possible 3687 -- rematches within the region. 3688 3689 else 3690 Stack_Ptr := Stack_Ptr + 1; 3691 Stack (Stack_Ptr).Cursor := Stack_Base; 3692 Stack (Stack_Ptr).Node := CP_R_Restore'Access; 3693 Stack_Base := Stack (Stack_Base).Cursor; 3694 end if; 3695 end Pop_Region; 3696 3697 ---------- 3698 -- Push -- 3699 ---------- 3700 3701 procedure Push (Node : PE_Ptr) is 3702 begin 3703 Stack_Ptr := Stack_Ptr + 1; 3704 Stack (Stack_Ptr).Cursor := Cursor; 3705 Stack (Stack_Ptr).Node := Node; 3706 end Push; 3707 3708 ----------------- 3709 -- Push_Region -- 3710 ----------------- 3711 3712 procedure Push_Region is 3713 begin 3714 Stack_Ptr := Stack_Ptr + 2; 3715 Stack (Stack_Ptr).Cursor := Stack_Base; 3716 Stack (Stack_Ptr).Node := CP_R_Remove'Access; 3717 Stack_Base := Stack_Ptr; 3718 end Push_Region; 3719 3720 -- Start of processing for XMatch 3721 3722 begin 3723 if Pat_P = null then 3724 Uninitialized_Pattern; 3725 end if; 3726 3727 -- Check we have enough stack for this pattern. This check deals with 3728 -- every possibility except a match of a recursive pattern, where we 3729 -- make a check at each recursion level. 3730 3731 if Pat_S >= Stack_Size - 1 then 3732 raise Pattern_Stack_Overflow; 3733 end if; 3734 3735 -- In anchored mode, the bottom entry on the stack is an abort entry 3736 3737 if Anchored_Mode then 3738 Stack (Stack_Init).Node := CP_Cancel'Access; 3739 Stack (Stack_Init).Cursor := 0; 3740 3741 -- In unanchored more, the bottom entry on the stack references 3742 -- the special pattern element PE_Unanchored, whose Pthen field 3743 -- points to the initial pattern element. The cursor value in this 3744 -- entry is the number of anchor moves so far. 3745 3746 else 3747 Stack (Stack_Init).Node := PE_Unanchored'Unchecked_Access; 3748 Stack (Stack_Init).Cursor := 0; 3749 end if; 3750 3751 Stack_Ptr := Stack_Init; 3752 Stack_Base := Stack_Ptr; 3753 Cursor := 0; 3754 Node := Pat_P; 3755 goto Match; 3756 3757 ----------------------------------------- 3758 -- Main Pattern Matching State Control -- 3759 ----------------------------------------- 3760 3761 -- This is a state machine which uses gotos to change state. The 3762 -- initial state is Match, to initiate the matching of the first 3763 -- element, so the goto Match above starts the match. In the 3764 -- following descriptions, we indicate the global values that 3765 -- are relevant for the state transition. 3766 3767 -- Come here if entire match fails 3768 3769 <<Match_Fail>> 3770 Start := 0; 3771 Stop := 0; 3772 return; 3773 3774 -- Come here if entire match succeeds 3775 3776 -- Cursor current position in subject string 3777 3778 <<Match_Succeed>> 3779 Start := Stack (Stack_Init).Cursor + 1; 3780 Stop := Cursor; 3781 3782 -- Scan history stack for deferred assignments or writes 3783 3784 if Assign_OnM then 3785 for S in Stack_Init .. Stack_Ptr loop 3786 if Stack (S).Node = CP_Assign'Access then 3787 declare 3788 Inner_Base : constant Stack_Range := 3789 Stack (S + 1).Cursor; 3790 Special_Entry : constant Stack_Range := 3791 Inner_Base - 1; 3792 Node_OnM : constant PE_Ptr := 3793 Stack (Special_Entry).Node; 3794 Start : constant Natural := 3795 Stack (Special_Entry).Cursor + 1; 3796 Stop : constant Natural := Stack (S).Cursor; 3797 3798 begin 3799 if Node_OnM.Pcode = PC_Assign_OnM then 3800 Set_String (Node_OnM.VP.all, Subject (Start .. Stop)); 3801 3802 elsif Node_OnM.Pcode = PC_Write_OnM then 3803 Put_Line (Node_OnM.FP.all, Subject (Start .. Stop)); 3804 3805 else 3806 Logic_Error; 3807 end if; 3808 end; 3809 end if; 3810 end loop; 3811 end if; 3812 3813 return; 3814 3815 -- Come here if attempt to match current element fails 3816 3817 -- Stack_Base current stack base 3818 -- Stack_Ptr current stack pointer 3819 3820 <<Fail>> 3821 Cursor := Stack (Stack_Ptr).Cursor; 3822 Node := Stack (Stack_Ptr).Node; 3823 Stack_Ptr := Stack_Ptr - 1; 3824 goto Match; 3825 3826 -- Come here if attempt to match current element succeeds 3827 3828 -- Cursor current position in subject string 3829 -- Node pointer to node successfully matched 3830 -- Stack_Base current stack base 3831 -- Stack_Ptr current stack pointer 3832 3833 <<Succeed>> 3834 Node := Node.Pthen; 3835 3836 -- Come here to match the next pattern element 3837 3838 -- Cursor current position in subject string 3839 -- Node pointer to node to be matched 3840 -- Stack_Base current stack base 3841 -- Stack_Ptr current stack pointer 3842 3843 <<Match>> 3844 3845 -------------------------------------------------- 3846 -- Main Pattern Match Element Matching Routines -- 3847 -------------------------------------------------- 3848 3849 -- Here is the case statement that processes the current node. The 3850 -- processing for each element does one of five things: 3851 3852 -- goto Succeed to move to the successor 3853 -- goto Match_Succeed if the entire match succeeds 3854 -- goto Match_Fail if the entire match fails 3855 -- goto Fail to signal failure of current match 3856 3857 -- Processing is NOT allowed to fall through 3858 3859 case Node.Pcode is 3860 3861 -- Cancel 3862 3863 when PC_Cancel => 3864 goto Match_Fail; 3865 3866 -- Alternation 3867 3868 when PC_Alt => 3869 Push (Node.Alt); 3870 Node := Node.Pthen; 3871 goto Match; 3872 3873 -- Any (one character case) 3874 3875 when PC_Any_CH => 3876 if Cursor < Length 3877 and then Subject (Cursor + 1) = Node.Char 3878 then 3879 Cursor := Cursor + 1; 3880 goto Succeed; 3881 else 3882 goto Fail; 3883 end if; 3884 3885 -- Any (character set case) 3886 3887 when PC_Any_CS => 3888 if Cursor < Length 3889 and then Is_In (Subject (Cursor + 1), Node.CS) 3890 then 3891 Cursor := Cursor + 1; 3892 goto Succeed; 3893 else 3894 goto Fail; 3895 end if; 3896 3897 -- Any (string function case) 3898 3899 when PC_Any_VF => declare 3900 U : constant VString := Node.VF.all; 3901 Str : constant String_Access := Get_String (U); 3902 3903 begin 3904 if Cursor < Length 3905 and then Is_In (Subject (Cursor + 1), Str.all) 3906 then 3907 Cursor := Cursor + 1; 3908 goto Succeed; 3909 else 3910 goto Fail; 3911 end if; 3912 end; 3913 3914 -- Any (string pointer case) 3915 3916 when PC_Any_VP => declare 3917 Str : constant String_Access := Get_String (Node.VP.all); 3918 3919 begin 3920 if Cursor < Length 3921 and then Is_In (Subject (Cursor + 1), Str.all) 3922 then 3923 Cursor := Cursor + 1; 3924 goto Succeed; 3925 else 3926 goto Fail; 3927 end if; 3928 end; 3929 3930 -- Arb (initial match) 3931 3932 when PC_Arb_X => 3933 Push (Node.Alt); 3934 Node := Node.Pthen; 3935 goto Match; 3936 3937 -- Arb (extension) 3938 3939 when PC_Arb_Y => 3940 if Cursor < Length then 3941 Cursor := Cursor + 1; 3942 Push (Node); 3943 goto Succeed; 3944 else 3945 goto Fail; 3946 end if; 3947 3948 -- Arbno_S (simple Arbno initialize). This is the node that 3949 -- initiates the match of a simple Arbno structure. 3950 3951 when PC_Arbno_S => 3952 Push (Node.Alt); 3953 Node := Node.Pthen; 3954 goto Match; 3955 3956 -- Arbno_X (Arbno initialize). This is the node that initiates 3957 -- the match of a complex Arbno structure. 3958 3959 when PC_Arbno_X => 3960 Push (Node.Alt); 3961 Node := Node.Pthen; 3962 goto Match; 3963 3964 -- Arbno_Y (Arbno rematch). This is the node that is executed 3965 -- following successful matching of one instance of a complex 3966 -- Arbno pattern. 3967 3968 when PC_Arbno_Y => declare 3969 Null_Match : constant Boolean := 3970 Cursor = Stack (Stack_Base - 1).Cursor; 3971 3972 begin 3973 Pop_Region; 3974 3975 -- If arbno extension matched null, then immediately fail 3976 3977 if Null_Match then 3978 goto Fail; 3979 end if; 3980 3981 -- Here we must do a stack check to make sure enough stack 3982 -- is left. This check will happen once for each instance of 3983 -- the Arbno pattern that is matched. The Nat field of a 3984 -- PC_Arbno pattern contains the maximum stack entries needed 3985 -- for the Arbno with one instance and the successor pattern 3986 3987 if Stack_Ptr + Node.Nat >= Stack'Last then 3988 raise Pattern_Stack_Overflow; 3989 end if; 3990 3991 goto Succeed; 3992 end; 3993 3994 -- Assign. If this node is executed, it means the assign-on-match 3995 -- or write-on-match operation will not happen after all, so we 3996 -- is propagate the failure, removing the PC_Assign node. 3997 3998 when PC_Assign => 3999 goto Fail; 4000 4001 -- Assign immediate. This node performs the actual assignment. 4002 4003 when PC_Assign_Imm => 4004 Set_String 4005 (Node.VP.all, 4006 Subject (Stack (Stack_Base - 1).Cursor + 1 .. Cursor)); 4007 Pop_Region; 4008 goto Succeed; 4009 4010 -- Assign on match. This node sets up for the eventual assignment 4011 4012 when PC_Assign_OnM => 4013 Stack (Stack_Base - 1).Node := Node; 4014 Push (CP_Assign'Access); 4015 Pop_Region; 4016 Assign_OnM := True; 4017 goto Succeed; 4018 4019 -- Bal 4020 4021 when PC_Bal => 4022 if Cursor >= Length or else Subject (Cursor + 1) = ')' then 4023 goto Fail; 4024 4025 elsif Subject (Cursor + 1) = '(' then 4026 declare 4027 Paren_Count : Natural := 1; 4028 4029 begin 4030 loop 4031 Cursor := Cursor + 1; 4032 4033 if Cursor >= Length then 4034 goto Fail; 4035 4036 elsif Subject (Cursor + 1) = '(' then 4037 Paren_Count := Paren_Count + 1; 4038 4039 elsif Subject (Cursor + 1) = ')' then 4040 Paren_Count := Paren_Count - 1; 4041 exit when Paren_Count = 0; 4042 end if; 4043 end loop; 4044 end; 4045 end if; 4046 4047 Cursor := Cursor + 1; 4048 Push (Node); 4049 goto Succeed; 4050 4051 -- Break (one character case) 4052 4053 when PC_Break_CH => 4054 while Cursor < Length loop 4055 if Subject (Cursor + 1) = Node.Char then 4056 goto Succeed; 4057 else 4058 Cursor := Cursor + 1; 4059 end if; 4060 end loop; 4061 4062 goto Fail; 4063 4064 -- Break (character set case) 4065 4066 when PC_Break_CS => 4067 while Cursor < Length loop 4068 if Is_In (Subject (Cursor + 1), Node.CS) then 4069 goto Succeed; 4070 else 4071 Cursor := Cursor + 1; 4072 end if; 4073 end loop; 4074 4075 goto Fail; 4076 4077 -- Break (string function case) 4078 4079 when PC_Break_VF => declare 4080 U : constant VString := Node.VF.all; 4081 Str : constant String_Access := Get_String (U); 4082 4083 begin 4084 while Cursor < Length loop 4085 if Is_In (Subject (Cursor + 1), Str.all) then 4086 goto Succeed; 4087 else 4088 Cursor := Cursor + 1; 4089 end if; 4090 end loop; 4091 4092 goto Fail; 4093 end; 4094 4095 -- Break (string pointer case) 4096 4097 when PC_Break_VP => declare 4098 Str : constant String_Access := Get_String (Node.VP.all); 4099 4100 begin 4101 while Cursor < Length loop 4102 if Is_In (Subject (Cursor + 1), Str.all) then 4103 goto Succeed; 4104 else 4105 Cursor := Cursor + 1; 4106 end if; 4107 end loop; 4108 4109 goto Fail; 4110 end; 4111 4112 -- BreakX (one character case) 4113 4114 when PC_BreakX_CH => 4115 while Cursor < Length loop 4116 if Subject (Cursor + 1) = Node.Char then 4117 goto Succeed; 4118 else 4119 Cursor := Cursor + 1; 4120 end if; 4121 end loop; 4122 4123 goto Fail; 4124 4125 -- BreakX (character set case) 4126 4127 when PC_BreakX_CS => 4128 while Cursor < Length loop 4129 if Is_In (Subject (Cursor + 1), Node.CS) then 4130 goto Succeed; 4131 else 4132 Cursor := Cursor + 1; 4133 end if; 4134 end loop; 4135 4136 goto Fail; 4137 4138 -- BreakX (string function case) 4139 4140 when PC_BreakX_VF => declare 4141 U : constant VString := Node.VF.all; 4142 Str : constant String_Access := Get_String (U); 4143 4144 begin 4145 while Cursor < Length loop 4146 if Is_In (Subject (Cursor + 1), Str.all) then 4147 goto Succeed; 4148 else 4149 Cursor := Cursor + 1; 4150 end if; 4151 end loop; 4152 4153 goto Fail; 4154 end; 4155 4156 -- BreakX (string pointer case) 4157 4158 when PC_BreakX_VP => declare 4159 Str : constant String_Access := Get_String (Node.VP.all); 4160 4161 begin 4162 while Cursor < Length loop 4163 if Is_In (Subject (Cursor + 1), Str.all) then 4164 goto Succeed; 4165 else 4166 Cursor := Cursor + 1; 4167 end if; 4168 end loop; 4169 4170 goto Fail; 4171 end; 4172 4173 -- BreakX_X (BreakX extension). See section on "Compound Pattern 4174 -- Structures". This node is the alternative that is stacked to 4175 -- skip past the break character and extend the break. 4176 4177 when PC_BreakX_X => 4178 Cursor := Cursor + 1; 4179 goto Succeed; 4180 4181 -- Character (one character string) 4182 4183 when PC_Char => 4184 if Cursor < Length 4185 and then Subject (Cursor + 1) = Node.Char 4186 then 4187 Cursor := Cursor + 1; 4188 goto Succeed; 4189 else 4190 goto Fail; 4191 end if; 4192 4193 -- End of Pattern 4194 4195 when PC_EOP => 4196 if Stack_Base = Stack_Init then 4197 goto Match_Succeed; 4198 4199 -- End of recursive inner match. See separate section on 4200 -- handing of recursive pattern matches for details. 4201 4202 else 4203 Node := Stack (Stack_Base - 1).Node; 4204 Pop_Region; 4205 goto Match; 4206 end if; 4207 4208 -- Fail 4209 4210 when PC_Fail => 4211 goto Fail; 4212 4213 -- Fence (built in pattern) 4214 4215 when PC_Fence => 4216 Push (CP_Cancel'Access); 4217 goto Succeed; 4218 4219 -- Fence function node X. This is the node that gets control 4220 -- after a successful match of the fenced pattern. 4221 4222 when PC_Fence_X => 4223 Stack_Ptr := Stack_Ptr + 1; 4224 Stack (Stack_Ptr).Cursor := Stack_Base; 4225 Stack (Stack_Ptr).Node := CP_Fence_Y'Access; 4226 Stack_Base := Stack (Stack_Base).Cursor; 4227 goto Succeed; 4228 4229 -- Fence function node Y. This is the node that gets control on 4230 -- a failure that occurs after the fenced pattern has matched. 4231 4232 -- Note: the Cursor at this stage is actually the inner stack 4233 -- base value. We don't reset this, but we do use it to strip 4234 -- off all the entries made by the fenced pattern. 4235 4236 when PC_Fence_Y => 4237 Stack_Ptr := Cursor - 2; 4238 goto Fail; 4239 4240 -- Len (integer case) 4241 4242 when PC_Len_Nat => 4243 if Cursor + Node.Nat > Length then 4244 goto Fail; 4245 else 4246 Cursor := Cursor + Node.Nat; 4247 goto Succeed; 4248 end if; 4249 4250 -- Len (Integer function case) 4251 4252 when PC_Len_NF => declare 4253 N : constant Natural := Node.NF.all; 4254 4255 begin 4256 if Cursor + N > Length then 4257 goto Fail; 4258 else 4259 Cursor := Cursor + N; 4260 goto Succeed; 4261 end if; 4262 end; 4263 4264 -- Len (integer pointer case) 4265 4266 when PC_Len_NP => 4267 if Cursor + Node.NP.all > Length then 4268 goto Fail; 4269 else 4270 Cursor := Cursor + Node.NP.all; 4271 goto Succeed; 4272 end if; 4273 4274 -- NotAny (one character case) 4275 4276 when PC_NotAny_CH => 4277 if Cursor < Length 4278 and then Subject (Cursor + 1) /= Node.Char 4279 then 4280 Cursor := Cursor + 1; 4281 goto Succeed; 4282 else 4283 goto Fail; 4284 end if; 4285 4286 -- NotAny (character set case) 4287 4288 when PC_NotAny_CS => 4289 if Cursor < Length 4290 and then not Is_In (Subject (Cursor + 1), Node.CS) 4291 then 4292 Cursor := Cursor + 1; 4293 goto Succeed; 4294 else 4295 goto Fail; 4296 end if; 4297 4298 -- NotAny (string function case) 4299 4300 when PC_NotAny_VF => declare 4301 U : constant VString := Node.VF.all; 4302 Str : constant String_Access := Get_String (U); 4303 4304 begin 4305 if Cursor < Length 4306 and then 4307 not Is_In (Subject (Cursor + 1), Str.all) 4308 then 4309 Cursor := Cursor + 1; 4310 goto Succeed; 4311 else 4312 goto Fail; 4313 end if; 4314 end; 4315 4316 -- NotAny (string pointer case) 4317 4318 when PC_NotAny_VP => declare 4319 Str : constant String_Access := Get_String (Node.VP.all); 4320 4321 begin 4322 if Cursor < Length 4323 and then 4324 not Is_In (Subject (Cursor + 1), Str.all) 4325 then 4326 Cursor := Cursor + 1; 4327 goto Succeed; 4328 else 4329 goto Fail; 4330 end if; 4331 end; 4332 4333 -- NSpan (one character case) 4334 4335 when PC_NSpan_CH => 4336 while Cursor < Length 4337 and then Subject (Cursor + 1) = Node.Char 4338 loop 4339 Cursor := Cursor + 1; 4340 end loop; 4341 4342 goto Succeed; 4343 4344 -- NSpan (character set case) 4345 4346 when PC_NSpan_CS => 4347 while Cursor < Length 4348 and then Is_In (Subject (Cursor + 1), Node.CS) 4349 loop 4350 Cursor := Cursor + 1; 4351 end loop; 4352 4353 goto Succeed; 4354 4355 -- NSpan (string function case) 4356 4357 when PC_NSpan_VF => declare 4358 U : constant VString := Node.VF.all; 4359 Str : constant String_Access := Get_String (U); 4360 4361 begin 4362 while Cursor < Length 4363 and then Is_In (Subject (Cursor + 1), Str.all) 4364 loop 4365 Cursor := Cursor + 1; 4366 end loop; 4367 4368 goto Succeed; 4369 end; 4370 4371 -- NSpan (string pointer case) 4372 4373 when PC_NSpan_VP => declare 4374 Str : constant String_Access := Get_String (Node.VP.all); 4375 4376 begin 4377 while Cursor < Length 4378 and then Is_In (Subject (Cursor + 1), Str.all) 4379 loop 4380 Cursor := Cursor + 1; 4381 end loop; 4382 4383 goto Succeed; 4384 end; 4385 4386 -- Null string 4387 4388 when PC_Null => 4389 goto Succeed; 4390 4391 -- Pos (integer case) 4392 4393 when PC_Pos_Nat => 4394 if Cursor = Node.Nat then 4395 goto Succeed; 4396 else 4397 goto Fail; 4398 end if; 4399 4400 -- Pos (Integer function case) 4401 4402 when PC_Pos_NF => declare 4403 N : constant Natural := Node.NF.all; 4404 4405 begin 4406 if Cursor = N then 4407 goto Succeed; 4408 else 4409 goto Fail; 4410 end if; 4411 end; 4412 4413 -- Pos (integer pointer case) 4414 4415 when PC_Pos_NP => 4416 if Cursor = Node.NP.all then 4417 goto Succeed; 4418 else 4419 goto Fail; 4420 end if; 4421 4422 -- Predicate function 4423 4424 when PC_Pred_Func => 4425 if Node.BF.all then 4426 goto Succeed; 4427 else 4428 goto Fail; 4429 end if; 4430 4431 -- Region Enter. Initiate new pattern history stack region 4432 4433 when PC_R_Enter => 4434 Stack (Stack_Ptr + 1).Cursor := Cursor; 4435 Push_Region; 4436 goto Succeed; 4437 4438 -- Region Remove node. This is the node stacked by an R_Enter. 4439 -- It removes the special format stack entry right underneath, and 4440 -- then restores the outer level stack base and signals failure. 4441 4442 -- Note: the cursor value at this stage is actually the (negative) 4443 -- stack base value for the outer level. 4444 4445 when PC_R_Remove => 4446 Stack_Base := Cursor; 4447 Stack_Ptr := Stack_Ptr - 1; 4448 goto Fail; 4449 4450 -- Region restore node. This is the node stacked at the end of an 4451 -- inner level match. Its function is to restore the inner level 4452 -- region, so that alternatives in this region can be sought. 4453 4454 -- Note: the Cursor at this stage is actually the negative of the 4455 -- inner stack base value, which we use to restore the inner region. 4456 4457 when PC_R_Restore => 4458 Stack_Base := Cursor; 4459 goto Fail; 4460 4461 -- Rest 4462 4463 when PC_Rest => 4464 Cursor := Length; 4465 goto Succeed; 4466 4467 -- Initiate recursive match (pattern pointer case) 4468 4469 when PC_Rpat => 4470 Stack (Stack_Ptr + 1).Node := Node.Pthen; 4471 Push_Region; 4472 4473 if Stack_Ptr + Node.PP.all.Stk >= Stack_Size then 4474 raise Pattern_Stack_Overflow; 4475 else 4476 Node := Node.PP.all.P; 4477 goto Match; 4478 end if; 4479 4480 -- RPos (integer case) 4481 4482 when PC_RPos_Nat => 4483 if Cursor = (Length - Node.Nat) then 4484 goto Succeed; 4485 else 4486 goto Fail; 4487 end if; 4488 4489 -- RPos (integer function case) 4490 4491 when PC_RPos_NF => declare 4492 N : constant Natural := Node.NF.all; 4493 4494 begin 4495 if Length - Cursor = N then 4496 goto Succeed; 4497 else 4498 goto Fail; 4499 end if; 4500 end; 4501 4502 -- RPos (integer pointer case) 4503 4504 when PC_RPos_NP => 4505 if Cursor = (Length - Node.NP.all) then 4506 goto Succeed; 4507 else 4508 goto Fail; 4509 end if; 4510 4511 -- RTab (integer case) 4512 4513 when PC_RTab_Nat => 4514 if Cursor <= (Length - Node.Nat) then 4515 Cursor := Length - Node.Nat; 4516 goto Succeed; 4517 else 4518 goto Fail; 4519 end if; 4520 4521 -- RTab (integer function case) 4522 4523 when PC_RTab_NF => declare 4524 N : constant Natural := Node.NF.all; 4525 4526 begin 4527 if Length - Cursor >= N then 4528 Cursor := Length - N; 4529 goto Succeed; 4530 else 4531 goto Fail; 4532 end if; 4533 end; 4534 4535 -- RTab (integer pointer case) 4536 4537 when PC_RTab_NP => 4538 if Cursor <= (Length - Node.NP.all) then 4539 Cursor := Length - Node.NP.all; 4540 goto Succeed; 4541 else 4542 goto Fail; 4543 end if; 4544 4545 -- Cursor assignment 4546 4547 when PC_Setcur => 4548 Node.Var.all := Cursor; 4549 goto Succeed; 4550 4551 -- Span (one character case) 4552 4553 when PC_Span_CH => declare 4554 P : Natural := Cursor; 4555 4556 begin 4557 while P < Length 4558 and then Subject (P + 1) = Node.Char 4559 loop 4560 P := P + 1; 4561 end loop; 4562 4563 if P /= Cursor then 4564 Cursor := P; 4565 goto Succeed; 4566 else 4567 goto Fail; 4568 end if; 4569 end; 4570 4571 -- Span (character set case) 4572 4573 when PC_Span_CS => declare 4574 P : Natural := Cursor; 4575 4576 begin 4577 while P < Length 4578 and then Is_In (Subject (P + 1), Node.CS) 4579 loop 4580 P := P + 1; 4581 end loop; 4582 4583 if P /= Cursor then 4584 Cursor := P; 4585 goto Succeed; 4586 else 4587 goto Fail; 4588 end if; 4589 end; 4590 4591 -- Span (string function case) 4592 4593 when PC_Span_VF => declare 4594 U : constant VString := Node.VF.all; 4595 Str : constant String_Access := Get_String (U); 4596 P : Natural := Cursor; 4597 4598 begin 4599 while P < Length 4600 and then Is_In (Subject (P + 1), Str.all) 4601 loop 4602 P := P + 1; 4603 end loop; 4604 4605 if P /= Cursor then 4606 Cursor := P; 4607 goto Succeed; 4608 else 4609 goto Fail; 4610 end if; 4611 end; 4612 4613 -- Span (string pointer case) 4614 4615 when PC_Span_VP => declare 4616 Str : constant String_Access := Get_String (Node.VP.all); 4617 P : Natural := Cursor; 4618 4619 begin 4620 while P < Length 4621 and then Is_In (Subject (P + 1), Str.all) 4622 loop 4623 P := P + 1; 4624 end loop; 4625 4626 if P /= Cursor then 4627 Cursor := P; 4628 goto Succeed; 4629 else 4630 goto Fail; 4631 end if; 4632 end; 4633 4634 -- String (two character case) 4635 4636 when PC_String_2 => 4637 if (Length - Cursor) >= 2 4638 and then Subject (Cursor + 1 .. Cursor + 2) = Node.Str2 4639 then 4640 Cursor := Cursor + 2; 4641 goto Succeed; 4642 else 4643 goto Fail; 4644 end if; 4645 4646 -- String (three character case) 4647 4648 when PC_String_3 => 4649 if (Length - Cursor) >= 3 4650 and then Subject (Cursor + 1 .. Cursor + 3) = Node.Str3 4651 then 4652 Cursor := Cursor + 3; 4653 goto Succeed; 4654 else 4655 goto Fail; 4656 end if; 4657 4658 -- String (four character case) 4659 4660 when PC_String_4 => 4661 if (Length - Cursor) >= 4 4662 and then Subject (Cursor + 1 .. Cursor + 4) = Node.Str4 4663 then 4664 Cursor := Cursor + 4; 4665 goto Succeed; 4666 else 4667 goto Fail; 4668 end if; 4669 4670 -- String (five character case) 4671 4672 when PC_String_5 => 4673 if (Length - Cursor) >= 5 4674 and then Subject (Cursor + 1 .. Cursor + 5) = Node.Str5 4675 then 4676 Cursor := Cursor + 5; 4677 goto Succeed; 4678 else 4679 goto Fail; 4680 end if; 4681 4682 -- String (six character case) 4683 4684 when PC_String_6 => 4685 if (Length - Cursor) >= 6 4686 and then Subject (Cursor + 1 .. Cursor + 6) = Node.Str6 4687 then 4688 Cursor := Cursor + 6; 4689 goto Succeed; 4690 else 4691 goto Fail; 4692 end if; 4693 4694 -- String (case of more than six characters) 4695 4696 when PC_String => declare 4697 Len : constant Natural := Node.Str'Length; 4698 4699 begin 4700 if (Length - Cursor) >= Len 4701 and then Node.Str.all = Subject (Cursor + 1 .. Cursor + Len) 4702 then 4703 Cursor := Cursor + Len; 4704 goto Succeed; 4705 else 4706 goto Fail; 4707 end if; 4708 end; 4709 4710 -- String (function case) 4711 4712 when PC_String_VF => declare 4713 U : constant VString := Node.VF.all; 4714 Str : constant String_Access := Get_String (U); 4715 Len : constant Natural := Str'Length; 4716 4717 begin 4718 if (Length - Cursor) >= Len 4719 and then Str.all = Subject (Cursor + 1 .. Cursor + Len) 4720 then 4721 Cursor := Cursor + Len; 4722 goto Succeed; 4723 else 4724 goto Fail; 4725 end if; 4726 end; 4727 4728 -- String (pointer case) 4729 4730 when PC_String_VP => declare 4731 S : constant String_Access := Get_String (Node.VP.all); 4732 Len : constant Natural := S'Length; 4733 4734 begin 4735 if (Length - Cursor) >= Len 4736 and then S.all = Subject (Cursor + 1 .. Cursor + Len) 4737 then 4738 Cursor := Cursor + Len; 4739 goto Succeed; 4740 else 4741 goto Fail; 4742 end if; 4743 end; 4744 4745 -- Succeed 4746 4747 when PC_Succeed => 4748 Push (Node); 4749 goto Succeed; 4750 4751 -- Tab (integer case) 4752 4753 when PC_Tab_Nat => 4754 if Cursor <= Node.Nat then 4755 Cursor := Node.Nat; 4756 goto Succeed; 4757 else 4758 goto Fail; 4759 end if; 4760 4761 -- Tab (integer function case) 4762 4763 when PC_Tab_NF => declare 4764 N : constant Natural := Node.NF.all; 4765 4766 begin 4767 if Cursor <= N then 4768 Cursor := N; 4769 goto Succeed; 4770 else 4771 goto Fail; 4772 end if; 4773 end; 4774 4775 -- Tab (integer pointer case) 4776 4777 when PC_Tab_NP => 4778 if Cursor <= Node.NP.all then 4779 Cursor := Node.NP.all; 4780 goto Succeed; 4781 else 4782 goto Fail; 4783 end if; 4784 4785 -- Unanchored movement 4786 4787 when PC_Unanchored => 4788 4789 -- All done if we tried every position 4790 4791 if Cursor > Length then 4792 goto Match_Fail; 4793 4794 -- Otherwise extend the anchor point, and restack ourself 4795 4796 else 4797 Cursor := Cursor + 1; 4798 Push (Node); 4799 goto Succeed; 4800 end if; 4801 4802 -- Write immediate. This node performs the actual write 4803 4804 when PC_Write_Imm => 4805 Put_Line 4806 (Node.FP.all, 4807 Subject (Stack (Stack_Base - 1).Cursor + 1 .. Cursor)); 4808 Pop_Region; 4809 goto Succeed; 4810 4811 -- Write on match. This node sets up for the eventual write 4812 4813 when PC_Write_OnM => 4814 Stack (Stack_Base - 1).Node := Node; 4815 Push (CP_Assign'Access); 4816 Pop_Region; 4817 Assign_OnM := True; 4818 goto Succeed; 4819 4820 end case; 4821 4822 -- We are NOT allowed to fall though this case statement, since every 4823 -- match routine must end by executing a goto to the appropriate point 4824 -- in the finite state machine model. 4825 4826 Logic_Error; 4827 4828 end XMatch; 4829 4830 ------------- 4831 -- XMatchD -- 4832 ------------- 4833 4834 -- Maintenance note: There is a LOT of code duplication between XMatch 4835 -- and XMatchD. This is quite intentional, the point is to avoid any 4836 -- unnecessary debugging overhead in the XMatch case, but this does mean 4837 -- that any changes to XMatchD must be mirrored in XMatch. In case of 4838 -- any major changes, the proper approach is to delete XMatch, make the 4839 -- changes to XMatchD, and then make a copy of XMatchD, removing all 4840 -- calls to Dout, and all Put and Put_Line operations. This copy becomes 4841 -- the new XMatch. 4842 4843 procedure XMatchD 4844 (Subject : String; 4845 Pat_P : PE_Ptr; 4846 Pat_S : Natural; 4847 Start : out Natural; 4848 Stop : out Natural) 4849 is 4850 Node : PE_Ptr; 4851 -- Pointer to current pattern node. Initialized from Pat_P, and then 4852 -- updated as the match proceeds through its constituent elements. 4853 4854 Length : constant Natural := Subject'Length; 4855 -- Length of string (= Subject'Last, since Subject'First is always 1) 4856 4857 Cursor : Integer := 0; 4858 -- If the value is non-negative, then this value is the index showing 4859 -- the current position of the match in the subject string. The next 4860 -- character to be matched is at Subject (Cursor + 1). Note that since 4861 -- our view of the subject string in XMatch always has a lower bound 4862 -- of one, regardless of original bounds, that this definition exactly 4863 -- corresponds to the cursor value as referenced by functions like Pos. 4864 -- 4865 -- If the value is negative, then this is a saved stack pointer, 4866 -- typically a base pointer of an inner or outer region. Cursor 4867 -- temporarily holds such a value when it is popped from the stack 4868 -- by Fail. In all cases, Cursor is reset to a proper non-negative 4869 -- cursor value before the match proceeds (e.g. by propagating the 4870 -- failure and popping a "real" cursor value from the stack. 4871 4872 PE_Unanchored : aliased PE := (PC_Unanchored, 0, Pat_P); 4873 -- Dummy pattern element used in the unanchored case. 4874 4875 Region_Level : Natural := 0; 4876 -- Keeps track of recursive region level. This is used only for 4877 -- debugging, it is the number of saved history stack base values. 4878 4879 Stack : Stack_Type; 4880 -- The pattern matching failure stack for this call to Match 4881 4882 Stack_Ptr : Stack_Range; 4883 -- Current stack pointer. This points to the top element of the stack 4884 -- that is currently in use. At the outer level this is the special 4885 -- entry placed on the stack according to the anchor mode. 4886 4887 Stack_Init : constant Stack_Range := Stack'First + 1; 4888 -- This is the initial value of the Stack_Ptr and Stack_Base. The 4889 -- initial (Stack'First) element of the stack is not used so that 4890 -- when we pop the last element off, Stack_Ptr is still in range. 4891 4892 Stack_Base : Stack_Range; 4893 -- This value is the stack base value, i.e. the stack pointer for the 4894 -- first history stack entry in the current stack region. See separate 4895 -- section on handling of recursive pattern matches. 4896 4897 Assign_OnM : Boolean := False; 4898 -- Set True if assign-on-match or write-on-match operations may be 4899 -- present in the history stack, which must then be scanned on a 4900 -- successful match. 4901 4902 procedure Dout (Str : String); 4903 -- Output string to standard error with bars indicating region level. 4904 4905 procedure Dout (Str : String; A : Character); 4906 -- Calls Dout with the string S ('A') 4907 4908 procedure Dout (Str : String; A : Character_Set); 4909 -- Calls Dout with the string S ("A") 4910 4911 procedure Dout (Str : String; A : Natural); 4912 -- Calls Dout with the string S (A) 4913 4914 procedure Dout (Str : String; A : String); 4915 -- Calls Dout with the string S ("A") 4916 4917 function Img (P : PE_Ptr) return String; 4918 -- Returns a string of the form #nnn where nnn is P.Index 4919 4920 procedure Pop_Region; 4921 pragma Inline (Pop_Region); 4922 -- Used at the end of processing of an inner region. if the inner 4923 -- region left no stack entries, then all trace of it is removed. 4924 -- Otherwise a PC_Restore_Region entry is pushed to ensure proper 4925 -- handling of alternatives in the inner region. 4926 4927 procedure Push (Node : PE_Ptr); 4928 pragma Inline (Push); 4929 -- Make entry in pattern matching stack with current cursor valeu 4930 4931 procedure Push_Region; 4932 pragma Inline (Push_Region); 4933 -- This procedure makes a new region on the history stack. The 4934 -- caller first establishes the special entry on the stack, but 4935 -- does not push the stack pointer. Then this call stacks a 4936 -- PC_Remove_Region node, on top of this entry, using the cursor 4937 -- field of the PC_Remove_Region entry to save the outer level 4938 -- stack base value, and resets the stack base to point to this 4939 -- PC_Remove_Region node. 4940 4941 ---------- 4942 -- Dout -- 4943 ---------- 4944 4945 procedure Dout (Str : String) is 4946 begin 4947 for J in 1 .. Region_Level loop 4948 Put ("| "); 4949 end loop; 4950 4951 Put_Line (Str); 4952 end Dout; 4953 4954 procedure Dout (Str : String; A : Character) is 4955 begin 4956 Dout (Str & " ('" & A & "')"); 4957 end Dout; 4958 4959 procedure Dout (Str : String; A : Character_Set) is 4960 begin 4961 Dout (Str & " (" & Image (To_Sequence (A)) & ')'); 4962 end Dout; 4963 4964 procedure Dout (Str : String; A : Natural) is 4965 begin 4966 Dout (Str & " (" & A & ')'); 4967 end Dout; 4968 4969 procedure Dout (Str : String; A : String) is 4970 begin 4971 Dout (Str & " (" & Image (A) & ')'); 4972 end Dout; 4973 4974 --------- 4975 -- Img -- 4976 --------- 4977 4978 function Img (P : PE_Ptr) return String is 4979 begin 4980 return "#" & Integer (P.Index) & " "; 4981 end Img; 4982 4983 ---------------- 4984 -- Pop_Region -- 4985 ---------------- 4986 4987 procedure Pop_Region is 4988 begin 4989 Region_Level := Region_Level - 1; 4990 4991 -- If nothing was pushed in the inner region, we can just get 4992 -- rid of it entirely, leaving no traces that it was ever there 4993 4994 if Stack_Ptr = Stack_Base then 4995 Stack_Ptr := Stack_Base - 2; 4996 Stack_Base := Stack (Stack_Ptr + 2).Cursor; 4997 4998 -- If stuff was pushed in the inner region, then we have to 4999 -- push a PC_R_Restore node so that we properly handle possible 5000 -- rematches within the region. 5001 5002 else 5003 Stack_Ptr := Stack_Ptr + 1; 5004 Stack (Stack_Ptr).Cursor := Stack_Base; 5005 Stack (Stack_Ptr).Node := CP_R_Restore'Access; 5006 Stack_Base := Stack (Stack_Base).Cursor; 5007 end if; 5008 end Pop_Region; 5009 5010 ---------- 5011 -- Push -- 5012 ---------- 5013 5014 procedure Push (Node : PE_Ptr) is 5015 begin 5016 Stack_Ptr := Stack_Ptr + 1; 5017 Stack (Stack_Ptr).Cursor := Cursor; 5018 Stack (Stack_Ptr).Node := Node; 5019 end Push; 5020 5021 ----------------- 5022 -- Push_Region -- 5023 ----------------- 5024 5025 procedure Push_Region is 5026 begin 5027 Region_Level := Region_Level + 1; 5028 Stack_Ptr := Stack_Ptr + 2; 5029 Stack (Stack_Ptr).Cursor := Stack_Base; 5030 Stack (Stack_Ptr).Node := CP_R_Remove'Access; 5031 Stack_Base := Stack_Ptr; 5032 end Push_Region; 5033 5034 -- Start of processing for XMatchD 5035 5036 begin 5037 New_Line; 5038 Put_Line ("Initiating pattern match, subject = " & Image (Subject)); 5039 Put ("--------------------------------------"); 5040 5041 for J in 1 .. Length loop 5042 Put ('-'); 5043 end loop; 5044 5045 New_Line; 5046 Put_Line ("subject length = " & Length); 5047 5048 if Pat_P = null then 5049 Uninitialized_Pattern; 5050 end if; 5051 5052 -- Check we have enough stack for this pattern. This check deals with 5053 -- every possibility except a match of a recursive pattern, where we 5054 -- make a check at each recursion level. 5055 5056 if Pat_S >= Stack_Size - 1 then 5057 raise Pattern_Stack_Overflow; 5058 end if; 5059 5060 -- In anchored mode, the bottom entry on the stack is an abort entry 5061 5062 if Anchored_Mode then 5063 Stack (Stack_Init).Node := CP_Cancel'Access; 5064 Stack (Stack_Init).Cursor := 0; 5065 5066 -- In unanchored more, the bottom entry on the stack references 5067 -- the special pattern element PE_Unanchored, whose Pthen field 5068 -- points to the initial pattern element. The cursor value in this 5069 -- entry is the number of anchor moves so far. 5070 5071 else 5072 Stack (Stack_Init).Node := PE_Unanchored'Unchecked_Access; 5073 Stack (Stack_Init).Cursor := 0; 5074 end if; 5075 5076 Stack_Ptr := Stack_Init; 5077 Stack_Base := Stack_Ptr; 5078 Cursor := 0; 5079 Node := Pat_P; 5080 goto Match; 5081 5082 ----------------------------------------- 5083 -- Main Pattern Matching State Control -- 5084 ----------------------------------------- 5085 5086 -- This is a state machine which uses gotos to change state. The 5087 -- initial state is Match, to initiate the matching of the first 5088 -- element, so the goto Match above starts the match. In the 5089 -- following descriptions, we indicate the global values that 5090 -- are relevant for the state transition. 5091 5092 -- Come here if entire match fails 5093 5094 <<Match_Fail>> 5095 Dout ("match fails"); 5096 New_Line; 5097 Start := 0; 5098 Stop := 0; 5099 return; 5100 5101 -- Come here if entire match succeeds 5102 5103 -- Cursor current position in subject string 5104 5105 <<Match_Succeed>> 5106 Dout ("match succeeds"); 5107 Start := Stack (Stack_Init).Cursor + 1; 5108 Stop := Cursor; 5109 Dout ("first matched character index = " & Start); 5110 Dout ("last matched character index = " & Stop); 5111 Dout ("matched substring = " & Image (Subject (Start .. Stop))); 5112 5113 -- Scan history stack for deferred assignments or writes 5114 5115 if Assign_OnM then 5116 for S in Stack'First .. Stack_Ptr loop 5117 if Stack (S).Node = CP_Assign'Access then 5118 declare 5119 Inner_Base : constant Stack_Range := 5120 Stack (S + 1).Cursor; 5121 Special_Entry : constant Stack_Range := 5122 Inner_Base - 1; 5123 Node_OnM : constant PE_Ptr := 5124 Stack (Special_Entry).Node; 5125 Start : constant Natural := 5126 Stack (Special_Entry).Cursor + 1; 5127 Stop : constant Natural := Stack (S).Cursor; 5128 5129 begin 5130 if Node_OnM.Pcode = PC_Assign_OnM then 5131 Set_String (Node_OnM.VP.all, Subject (Start .. Stop)); 5132 Dout 5133 (Img (Stack (S).Node) & 5134 "deferred assignment of " & 5135 Image (Subject (Start .. Stop))); 5136 5137 elsif Node_OnM.Pcode = PC_Write_OnM then 5138 Put_Line (Node_OnM.FP.all, Subject (Start .. Stop)); 5139 Dout 5140 (Img (Stack (S).Node) & 5141 "deferred write of " & 5142 Image (Subject (Start .. Stop))); 5143 5144 else 5145 Logic_Error; 5146 end if; 5147 end; 5148 end if; 5149 end loop; 5150 end if; 5151 5152 New_Line; 5153 return; 5154 5155 -- Come here if attempt to match current element fails 5156 5157 -- Stack_Base current stack base 5158 -- Stack_Ptr current stack pointer 5159 5160 <<Fail>> 5161 Cursor := Stack (Stack_Ptr).Cursor; 5162 Node := Stack (Stack_Ptr).Node; 5163 Stack_Ptr := Stack_Ptr - 1; 5164 5165 if Cursor >= 0 then 5166 Dout ("failure, cursor reset to " & Cursor); 5167 end if; 5168 5169 goto Match; 5170 5171 -- Come here if attempt to match current element succeeds 5172 5173 -- Cursor current position in subject string 5174 -- Node pointer to node successfully matched 5175 -- Stack_Base current stack base 5176 -- Stack_Ptr current stack pointer 5177 5178 <<Succeed>> 5179 Dout ("success, cursor = " & Cursor); 5180 Node := Node.Pthen; 5181 5182 -- Come here to match the next pattern element 5183 5184 -- Cursor current position in subject string 5185 -- Node pointer to node to be matched 5186 -- Stack_Base current stack base 5187 -- Stack_Ptr current stack pointer 5188 5189 <<Match>> 5190 5191 -------------------------------------------------- 5192 -- Main Pattern Match Element Matching Routines -- 5193 -------------------------------------------------- 5194 5195 -- Here is the case statement that processes the current node. The 5196 -- processing for each element does one of five things: 5197 5198 -- goto Succeed to move to the successor 5199 -- goto Match_Succeed if the entire match succeeds 5200 -- goto Match_Fail if the entire match fails 5201 -- goto Fail to signal failure of current match 5202 5203 -- Processing is NOT allowed to fall through 5204 5205 case Node.Pcode is 5206 5207 -- Cancel 5208 5209 when PC_Cancel => 5210 Dout (Img (Node) & "matching Cancel"); 5211 goto Match_Fail; 5212 5213 -- Alternation 5214 5215 when PC_Alt => 5216 Dout 5217 (Img (Node) & "setting up alternative " & Img (Node.Alt)); 5218 Push (Node.Alt); 5219 Node := Node.Pthen; 5220 goto Match; 5221 5222 -- Any (one character case) 5223 5224 when PC_Any_CH => 5225 Dout (Img (Node) & "matching Any", Node.Char); 5226 5227 if Cursor < Length 5228 and then Subject (Cursor + 1) = Node.Char 5229 then 5230 Cursor := Cursor + 1; 5231 goto Succeed; 5232 else 5233 goto Fail; 5234 end if; 5235 5236 -- Any (character set case) 5237 5238 when PC_Any_CS => 5239 Dout (Img (Node) & "matching Any", Node.CS); 5240 5241 if Cursor < Length 5242 and then Is_In (Subject (Cursor + 1), Node.CS) 5243 then 5244 Cursor := Cursor + 1; 5245 goto Succeed; 5246 else 5247 goto Fail; 5248 end if; 5249 5250 -- Any (string function case) 5251 5252 when PC_Any_VF => declare 5253 U : constant VString := Node.VF.all; 5254 Str : constant String_Access := Get_String (U); 5255 5256 begin 5257 Dout (Img (Node) & "matching Any", Str.all); 5258 5259 if Cursor < Length 5260 and then Is_In (Subject (Cursor + 1), Str.all) 5261 then 5262 Cursor := Cursor + 1; 5263 goto Succeed; 5264 else 5265 goto Fail; 5266 end if; 5267 end; 5268 5269 -- Any (string pointer case) 5270 5271 when PC_Any_VP => declare 5272 Str : constant String_Access := Get_String (Node.VP.all); 5273 5274 begin 5275 Dout (Img (Node) & "matching Any", Str.all); 5276 5277 if Cursor < Length 5278 and then Is_In (Subject (Cursor + 1), Str.all) 5279 then 5280 Cursor := Cursor + 1; 5281 goto Succeed; 5282 else 5283 goto Fail; 5284 end if; 5285 end; 5286 5287 -- Arb (initial match) 5288 5289 when PC_Arb_X => 5290 Dout (Img (Node) & "matching Arb"); 5291 Push (Node.Alt); 5292 Node := Node.Pthen; 5293 goto Match; 5294 5295 -- Arb (extension) 5296 5297 when PC_Arb_Y => 5298 Dout (Img (Node) & "extending Arb"); 5299 5300 if Cursor < Length then 5301 Cursor := Cursor + 1; 5302 Push (Node); 5303 goto Succeed; 5304 else 5305 goto Fail; 5306 end if; 5307 5308 -- Arbno_S (simple Arbno initialize). This is the node that 5309 -- initiates the match of a simple Arbno structure. 5310 5311 when PC_Arbno_S => 5312 Dout (Img (Node) & 5313 "setting up Arbno alternative " & Img (Node.Alt)); 5314 Push (Node.Alt); 5315 Node := Node.Pthen; 5316 goto Match; 5317 5318 -- Arbno_X (Arbno initialize). This is the node that initiates 5319 -- the match of a complex Arbno structure. 5320 5321 when PC_Arbno_X => 5322 Dout (Img (Node) & 5323 "setting up Arbno alternative " & Img (Node.Alt)); 5324 Push (Node.Alt); 5325 Node := Node.Pthen; 5326 goto Match; 5327 5328 -- Arbno_Y (Arbno rematch). This is the node that is executed 5329 -- following successful matching of one instance of a complex 5330 -- Arbno pattern. 5331 5332 when PC_Arbno_Y => declare 5333 Null_Match : constant Boolean := 5334 Cursor = Stack (Stack_Base - 1).Cursor; 5335 5336 begin 5337 Dout (Img (Node) & "extending Arbno"); 5338 Pop_Region; 5339 5340 -- If arbno extension matched null, then immediately fail 5341 5342 if Null_Match then 5343 Dout ("Arbno extension matched null, so fails"); 5344 goto Fail; 5345 end if; 5346 5347 -- Here we must do a stack check to make sure enough stack 5348 -- is left. This check will happen once for each instance of 5349 -- the Arbno pattern that is matched. The Nat field of a 5350 -- PC_Arbno pattern contains the maximum stack entries needed 5351 -- for the Arbno with one instance and the successor pattern 5352 5353 if Stack_Ptr + Node.Nat >= Stack'Last then 5354 raise Pattern_Stack_Overflow; 5355 end if; 5356 5357 goto Succeed; 5358 end; 5359 5360 -- Assign. If this node is executed, it means the assign-on-match 5361 -- or write-on-match operation will not happen after all, so we 5362 -- is propagate the failure, removing the PC_Assign node. 5363 5364 when PC_Assign => 5365 Dout (Img (Node) & "deferred assign/write cancelled"); 5366 goto Fail; 5367 5368 -- Assign immediate. This node performs the actual assignment. 5369 5370 when PC_Assign_Imm => 5371 Dout 5372 (Img (Node) & "executing immediate assignment of " & 5373 Image (Subject (Stack (Stack_Base - 1).Cursor + 1 .. Cursor))); 5374 Set_String 5375 (Node.VP.all, 5376 Subject (Stack (Stack_Base - 1).Cursor + 1 .. Cursor)); 5377 Pop_Region; 5378 goto Succeed; 5379 5380 -- Assign on match. This node sets up for the eventual assignment 5381 5382 when PC_Assign_OnM => 5383 Dout (Img (Node) & "registering deferred assignment"); 5384 Stack (Stack_Base - 1).Node := Node; 5385 Push (CP_Assign'Access); 5386 Pop_Region; 5387 Assign_OnM := True; 5388 goto Succeed; 5389 5390 -- Bal 5391 5392 when PC_Bal => 5393 Dout (Img (Node) & "matching or extending Bal"); 5394 if Cursor >= Length or else Subject (Cursor + 1) = ')' then 5395 goto Fail; 5396 5397 elsif Subject (Cursor + 1) = '(' then 5398 declare 5399 Paren_Count : Natural := 1; 5400 5401 begin 5402 loop 5403 Cursor := Cursor + 1; 5404 5405 if Cursor >= Length then 5406 goto Fail; 5407 5408 elsif Subject (Cursor + 1) = '(' then 5409 Paren_Count := Paren_Count + 1; 5410 5411 elsif Subject (Cursor + 1) = ')' then 5412 Paren_Count := Paren_Count - 1; 5413 exit when Paren_Count = 0; 5414 end if; 5415 end loop; 5416 end; 5417 end if; 5418 5419 Cursor := Cursor + 1; 5420 Push (Node); 5421 goto Succeed; 5422 5423 -- Break (one character case) 5424 5425 when PC_Break_CH => 5426 Dout (Img (Node) & "matching Break", Node.Char); 5427 5428 while Cursor < Length loop 5429 if Subject (Cursor + 1) = Node.Char then 5430 goto Succeed; 5431 else 5432 Cursor := Cursor + 1; 5433 end if; 5434 end loop; 5435 5436 goto Fail; 5437 5438 -- Break (character set case) 5439 5440 when PC_Break_CS => 5441 Dout (Img (Node) & "matching Break", Node.CS); 5442 5443 while Cursor < Length loop 5444 if Is_In (Subject (Cursor + 1), Node.CS) then 5445 goto Succeed; 5446 else 5447 Cursor := Cursor + 1; 5448 end if; 5449 end loop; 5450 5451 goto Fail; 5452 5453 -- Break (string function case) 5454 5455 when PC_Break_VF => declare 5456 U : constant VString := Node.VF.all; 5457 Str : constant String_Access := Get_String (U); 5458 5459 begin 5460 Dout (Img (Node) & "matching Break", Str.all); 5461 5462 while Cursor < Length loop 5463 if Is_In (Subject (Cursor + 1), Str.all) then 5464 goto Succeed; 5465 else 5466 Cursor := Cursor + 1; 5467 end if; 5468 end loop; 5469 5470 goto Fail; 5471 end; 5472 5473 -- Break (string pointer case) 5474 5475 when PC_Break_VP => declare 5476 Str : constant String_Access := Get_String (Node.VP.all); 5477 5478 begin 5479 Dout (Img (Node) & "matching Break", Str.all); 5480 5481 while Cursor < Length loop 5482 if Is_In (Subject (Cursor + 1), Str.all) then 5483 goto Succeed; 5484 else 5485 Cursor := Cursor + 1; 5486 end if; 5487 end loop; 5488 5489 goto Fail; 5490 end; 5491 5492 -- BreakX (one character case) 5493 5494 when PC_BreakX_CH => 5495 Dout (Img (Node) & "matching BreakX", Node.Char); 5496 5497 while Cursor < Length loop 5498 if Subject (Cursor + 1) = Node.Char then 5499 goto Succeed; 5500 else 5501 Cursor := Cursor + 1; 5502 end if; 5503 end loop; 5504 5505 goto Fail; 5506 5507 -- BreakX (character set case) 5508 5509 when PC_BreakX_CS => 5510 Dout (Img (Node) & "matching BreakX", Node.CS); 5511 5512 while Cursor < Length loop 5513 if Is_In (Subject (Cursor + 1), Node.CS) then 5514 goto Succeed; 5515 else 5516 Cursor := Cursor + 1; 5517 end if; 5518 end loop; 5519 5520 goto Fail; 5521 5522 -- BreakX (string function case) 5523 5524 when PC_BreakX_VF => declare 5525 U : constant VString := Node.VF.all; 5526 Str : constant String_Access := Get_String (U); 5527 5528 begin 5529 Dout (Img (Node) & "matching BreakX", Str.all); 5530 5531 while Cursor < Length loop 5532 if Is_In (Subject (Cursor + 1), Str.all) then 5533 goto Succeed; 5534 else 5535 Cursor := Cursor + 1; 5536 end if; 5537 end loop; 5538 5539 goto Fail; 5540 end; 5541 5542 -- BreakX (string pointer case) 5543 5544 when PC_BreakX_VP => declare 5545 Str : constant String_Access := Get_String (Node.VP.all); 5546 5547 begin 5548 Dout (Img (Node) & "matching BreakX", Str.all); 5549 5550 while Cursor < Length loop 5551 if Is_In (Subject (Cursor + 1), Str.all) then 5552 goto Succeed; 5553 else 5554 Cursor := Cursor + 1; 5555 end if; 5556 end loop; 5557 5558 goto Fail; 5559 end; 5560 5561 -- BreakX_X (BreakX extension). See section on "Compound Pattern 5562 -- Structures". This node is the alternative that is stacked 5563 -- to skip past the break character and extend the break. 5564 5565 when PC_BreakX_X => 5566 Dout (Img (Node) & "extending BreakX"); 5567 5568 Cursor := Cursor + 1; 5569 goto Succeed; 5570 5571 -- Character (one character string) 5572 5573 when PC_Char => 5574 Dout (Img (Node) & "matching '" & Node.Char & '''); 5575 5576 if Cursor < Length 5577 and then Subject (Cursor + 1) = Node.Char 5578 then 5579 Cursor := Cursor + 1; 5580 goto Succeed; 5581 else 5582 goto Fail; 5583 end if; 5584 5585 -- End of Pattern 5586 5587 when PC_EOP => 5588 if Stack_Base = Stack_Init then 5589 Dout ("end of pattern"); 5590 goto Match_Succeed; 5591 5592 -- End of recursive inner match. See separate section on 5593 -- handing of recursive pattern matches for details. 5594 5595 else 5596 Dout ("terminating recursive match"); 5597 Node := Stack (Stack_Base - 1).Node; 5598 Pop_Region; 5599 goto Match; 5600 end if; 5601 5602 -- Fail 5603 5604 when PC_Fail => 5605 Dout (Img (Node) & "matching Fail"); 5606 goto Fail; 5607 5608 -- Fence (built in pattern) 5609 5610 when PC_Fence => 5611 Dout (Img (Node) & "matching Fence"); 5612 Push (CP_Cancel'Access); 5613 goto Succeed; 5614 5615 -- Fence function node X. This is the node that gets control 5616 -- after a successful match of the fenced pattern. 5617 5618 when PC_Fence_X => 5619 Dout (Img (Node) & "matching Fence function"); 5620 Stack_Ptr := Stack_Ptr + 1; 5621 Stack (Stack_Ptr).Cursor := Stack_Base; 5622 Stack (Stack_Ptr).Node := CP_Fence_Y'Access; 5623 Stack_Base := Stack (Stack_Base).Cursor; 5624 Region_Level := Region_Level - 1; 5625 goto Succeed; 5626 5627 -- Fence function node Y. This is the node that gets control on 5628 -- a failure that occurs after the fenced pattern has matched. 5629 5630 -- Note: the Cursor at this stage is actually the inner stack 5631 -- base value. We don't reset this, but we do use it to strip 5632 -- off all the entries made by the fenced pattern. 5633 5634 when PC_Fence_Y => 5635 Dout (Img (Node) & "pattern matched by Fence caused failure"); 5636 Stack_Ptr := Cursor - 2; 5637 goto Fail; 5638 5639 -- Len (integer case) 5640 5641 when PC_Len_Nat => 5642 Dout (Img (Node) & "matching Len", Node.Nat); 5643 5644 if Cursor + Node.Nat > Length then 5645 goto Fail; 5646 else 5647 Cursor := Cursor + Node.Nat; 5648 goto Succeed; 5649 end if; 5650 5651 -- Len (Integer function case) 5652 5653 when PC_Len_NF => declare 5654 N : constant Natural := Node.NF.all; 5655 5656 begin 5657 Dout (Img (Node) & "matching Len", N); 5658 5659 if Cursor + N > Length then 5660 goto Fail; 5661 else 5662 Cursor := Cursor + N; 5663 goto Succeed; 5664 end if; 5665 end; 5666 5667 -- Len (integer pointer case) 5668 5669 when PC_Len_NP => 5670 Dout (Img (Node) & "matching Len", Node.NP.all); 5671 5672 if Cursor + Node.NP.all > Length then 5673 goto Fail; 5674 else 5675 Cursor := Cursor + Node.NP.all; 5676 goto Succeed; 5677 end if; 5678 5679 -- NotAny (one character case) 5680 5681 when PC_NotAny_CH => 5682 Dout (Img (Node) & "matching NotAny", Node.Char); 5683 5684 if Cursor < Length 5685 and then Subject (Cursor + 1) /= Node.Char 5686 then 5687 Cursor := Cursor + 1; 5688 goto Succeed; 5689 else 5690 goto Fail; 5691 end if; 5692 5693 -- NotAny (character set case) 5694 5695 when PC_NotAny_CS => 5696 Dout (Img (Node) & "matching NotAny", Node.CS); 5697 5698 if Cursor < Length 5699 and then not Is_In (Subject (Cursor + 1), Node.CS) 5700 then 5701 Cursor := Cursor + 1; 5702 goto Succeed; 5703 else 5704 goto Fail; 5705 end if; 5706 5707 -- NotAny (string function case) 5708 5709 when PC_NotAny_VF => declare 5710 U : constant VString := Node.VF.all; 5711 Str : constant String_Access := Get_String (U); 5712 5713 begin 5714 Dout (Img (Node) & "matching NotAny", Str.all); 5715 5716 if Cursor < Length 5717 and then 5718 not Is_In (Subject (Cursor + 1), Str.all) 5719 then 5720 Cursor := Cursor + 1; 5721 goto Succeed; 5722 else 5723 goto Fail; 5724 end if; 5725 end; 5726 5727 -- NotAny (string pointer case) 5728 5729 when PC_NotAny_VP => declare 5730 Str : constant String_Access := Get_String (Node.VP.all); 5731 5732 begin 5733 Dout (Img (Node) & "matching NotAny", Str.all); 5734 5735 if Cursor < Length 5736 and then 5737 not Is_In (Subject (Cursor + 1), Str.all) 5738 then 5739 Cursor := Cursor + 1; 5740 goto Succeed; 5741 else 5742 goto Fail; 5743 end if; 5744 end; 5745 5746 -- NSpan (one character case) 5747 5748 when PC_NSpan_CH => 5749 Dout (Img (Node) & "matching NSpan", Node.Char); 5750 5751 while Cursor < Length 5752 and then Subject (Cursor + 1) = Node.Char 5753 loop 5754 Cursor := Cursor + 1; 5755 end loop; 5756 5757 goto Succeed; 5758 5759 -- NSpan (character set case) 5760 5761 when PC_NSpan_CS => 5762 Dout (Img (Node) & "matching NSpan", Node.CS); 5763 5764 while Cursor < Length 5765 and then Is_In (Subject (Cursor + 1), Node.CS) 5766 loop 5767 Cursor := Cursor + 1; 5768 end loop; 5769 5770 goto Succeed; 5771 5772 -- NSpan (string function case) 5773 5774 when PC_NSpan_VF => declare 5775 U : constant VString := Node.VF.all; 5776 Str : constant String_Access := Get_String (U); 5777 5778 begin 5779 Dout (Img (Node) & "matching NSpan", Str.all); 5780 5781 while Cursor < Length 5782 and then Is_In (Subject (Cursor + 1), Str.all) 5783 loop 5784 Cursor := Cursor + 1; 5785 end loop; 5786 5787 goto Succeed; 5788 end; 5789 5790 -- NSpan (string pointer case) 5791 5792 when PC_NSpan_VP => declare 5793 Str : constant String_Access := Get_String (Node.VP.all); 5794 5795 begin 5796 Dout (Img (Node) & "matching NSpan", Str.all); 5797 5798 while Cursor < Length 5799 and then Is_In (Subject (Cursor + 1), Str.all) 5800 loop 5801 Cursor := Cursor + 1; 5802 end loop; 5803 5804 goto Succeed; 5805 end; 5806 5807 when PC_Null => 5808 Dout (Img (Node) & "matching null"); 5809 goto Succeed; 5810 5811 -- Pos (integer case) 5812 5813 when PC_Pos_Nat => 5814 Dout (Img (Node) & "matching Pos", Node.Nat); 5815 5816 if Cursor = Node.Nat then 5817 goto Succeed; 5818 else 5819 goto Fail; 5820 end if; 5821 5822 -- Pos (Integer function case) 5823 5824 when PC_Pos_NF => declare 5825 N : constant Natural := Node.NF.all; 5826 5827 begin 5828 Dout (Img (Node) & "matching Pos", N); 5829 5830 if Cursor = N then 5831 goto Succeed; 5832 else 5833 goto Fail; 5834 end if; 5835 end; 5836 5837 -- Pos (integer pointer case) 5838 5839 when PC_Pos_NP => 5840 Dout (Img (Node) & "matching Pos", Node.NP.all); 5841 5842 if Cursor = Node.NP.all then 5843 goto Succeed; 5844 else 5845 goto Fail; 5846 end if; 5847 5848 -- Predicate function 5849 5850 when PC_Pred_Func => 5851 Dout (Img (Node) & "matching predicate function"); 5852 5853 if Node.BF.all then 5854 goto Succeed; 5855 else 5856 goto Fail; 5857 end if; 5858 5859 -- Region Enter. Initiate new pattern history stack region 5860 5861 when PC_R_Enter => 5862 Dout (Img (Node) & "starting match of nested pattern"); 5863 Stack (Stack_Ptr + 1).Cursor := Cursor; 5864 Push_Region; 5865 goto Succeed; 5866 5867 -- Region Remove node. This is the node stacked by an R_Enter. 5868 -- It removes the special format stack entry right underneath, and 5869 -- then restores the outer level stack base and signals failure. 5870 5871 -- Note: the cursor value at this stage is actually the (negative) 5872 -- stack base value for the outer level. 5873 5874 when PC_R_Remove => 5875 Dout ("failure, match of nested pattern terminated"); 5876 Stack_Base := Cursor; 5877 Region_Level := Region_Level - 1; 5878 Stack_Ptr := Stack_Ptr - 1; 5879 goto Fail; 5880 5881 -- Region restore node. This is the node stacked at the end of an 5882 -- inner level match. Its function is to restore the inner level 5883 -- region, so that alternatives in this region can be sought. 5884 5885 -- Note: the Cursor at this stage is actually the negative of the 5886 -- inner stack base value, which we use to restore the inner region. 5887 5888 when PC_R_Restore => 5889 Dout ("failure, search for alternatives in nested pattern"); 5890 Region_Level := Region_Level + 1; 5891 Stack_Base := Cursor; 5892 goto Fail; 5893 5894 -- Rest 5895 5896 when PC_Rest => 5897 Dout (Img (Node) & "matching Rest"); 5898 Cursor := Length; 5899 goto Succeed; 5900 5901 -- Initiate recursive match (pattern pointer case) 5902 5903 when PC_Rpat => 5904 Stack (Stack_Ptr + 1).Node := Node.Pthen; 5905 Push_Region; 5906 Dout (Img (Node) & "initiating recursive match"); 5907 5908 if Stack_Ptr + Node.PP.all.Stk >= Stack_Size then 5909 raise Pattern_Stack_Overflow; 5910 else 5911 Node := Node.PP.all.P; 5912 goto Match; 5913 end if; 5914 5915 -- RPos (integer case) 5916 5917 when PC_RPos_Nat => 5918 Dout (Img (Node) & "matching RPos", Node.Nat); 5919 5920 if Cursor = (Length - Node.Nat) then 5921 goto Succeed; 5922 else 5923 goto Fail; 5924 end if; 5925 5926 -- RPos (integer function case) 5927 5928 when PC_RPos_NF => declare 5929 N : constant Natural := Node.NF.all; 5930 5931 begin 5932 Dout (Img (Node) & "matching RPos", N); 5933 5934 if Length - Cursor = N then 5935 goto Succeed; 5936 else 5937 goto Fail; 5938 end if; 5939 end; 5940 5941 -- RPos (integer pointer case) 5942 5943 when PC_RPos_NP => 5944 Dout (Img (Node) & "matching RPos", Node.NP.all); 5945 5946 if Cursor = (Length - Node.NP.all) then 5947 goto Succeed; 5948 else 5949 goto Fail; 5950 end if; 5951 5952 -- RTab (integer case) 5953 5954 when PC_RTab_Nat => 5955 Dout (Img (Node) & "matching RTab", Node.Nat); 5956 5957 if Cursor <= (Length - Node.Nat) then 5958 Cursor := Length - Node.Nat; 5959 goto Succeed; 5960 else 5961 goto Fail; 5962 end if; 5963 5964 -- RTab (integer function case) 5965 5966 when PC_RTab_NF => declare 5967 N : constant Natural := Node.NF.all; 5968 5969 begin 5970 Dout (Img (Node) & "matching RPos", N); 5971 5972 if Length - Cursor >= N then 5973 Cursor := Length - N; 5974 goto Succeed; 5975 else 5976 goto Fail; 5977 end if; 5978 end; 5979 5980 -- RTab (integer pointer case) 5981 5982 when PC_RTab_NP => 5983 Dout (Img (Node) & "matching RPos", Node.NP.all); 5984 5985 if Cursor <= (Length - Node.NP.all) then 5986 Cursor := Length - Node.NP.all; 5987 goto Succeed; 5988 else 5989 goto Fail; 5990 end if; 5991 5992 -- Cursor assignment 5993 5994 when PC_Setcur => 5995 Dout (Img (Node) & "matching Setcur"); 5996 Node.Var.all := Cursor; 5997 goto Succeed; 5998 5999 -- Span (one character case) 6000 6001 when PC_Span_CH => declare 6002 P : Natural := Cursor; 6003 6004 begin 6005 Dout (Img (Node) & "matching Span", Node.Char); 6006 6007 while P < Length 6008 and then Subject (P + 1) = Node.Char 6009 loop 6010 P := P + 1; 6011 end loop; 6012 6013 if P /= Cursor then 6014 Cursor := P; 6015 goto Succeed; 6016 else 6017 goto Fail; 6018 end if; 6019 end; 6020 6021 -- Span (character set case) 6022 6023 when PC_Span_CS => declare 6024 P : Natural := Cursor; 6025 6026 begin 6027 Dout (Img (Node) & "matching Span", Node.CS); 6028 6029 while P < Length 6030 and then Is_In (Subject (P + 1), Node.CS) 6031 loop 6032 P := P + 1; 6033 end loop; 6034 6035 if P /= Cursor then 6036 Cursor := P; 6037 goto Succeed; 6038 else 6039 goto Fail; 6040 end if; 6041 end; 6042 6043 -- Span (string function case) 6044 6045 when PC_Span_VF => declare 6046 U : constant VString := Node.VF.all; 6047 Str : constant String_Access := Get_String (U); 6048 P : Natural := Cursor; 6049 6050 begin 6051 Dout (Img (Node) & "matching Span", Str.all); 6052 6053 while P < Length 6054 and then Is_In (Subject (P + 1), Str.all) 6055 loop 6056 P := P + 1; 6057 end loop; 6058 6059 if P /= Cursor then 6060 Cursor := P; 6061 goto Succeed; 6062 else 6063 goto Fail; 6064 end if; 6065 end; 6066 6067 -- Span (string pointer case) 6068 6069 when PC_Span_VP => declare 6070 Str : constant String_Access := Get_String (Node.VP.all); 6071 P : Natural := Cursor; 6072 6073 begin 6074 Dout (Img (Node) & "matching Span", Str.all); 6075 6076 while P < Length 6077 and then Is_In (Subject (P + 1), Str.all) 6078 loop 6079 P := P + 1; 6080 end loop; 6081 6082 if P /= Cursor then 6083 Cursor := P; 6084 goto Succeed; 6085 else 6086 goto Fail; 6087 end if; 6088 end; 6089 6090 -- String (two character case) 6091 6092 when PC_String_2 => 6093 Dout (Img (Node) & "matching " & Image (Node.Str2)); 6094 6095 if (Length - Cursor) >= 2 6096 and then Subject (Cursor + 1 .. Cursor + 2) = Node.Str2 6097 then 6098 Cursor := Cursor + 2; 6099 goto Succeed; 6100 else 6101 goto Fail; 6102 end if; 6103 6104 -- String (three character case) 6105 6106 when PC_String_3 => 6107 Dout (Img (Node) & "matching " & Image (Node.Str3)); 6108 6109 if (Length - Cursor) >= 3 6110 and then Subject (Cursor + 1 .. Cursor + 3) = Node.Str3 6111 then 6112 Cursor := Cursor + 3; 6113 goto Succeed; 6114 else 6115 goto Fail; 6116 end if; 6117 6118 -- String (four character case) 6119 6120 when PC_String_4 => 6121 Dout (Img (Node) & "matching " & Image (Node.Str4)); 6122 6123 if (Length - Cursor) >= 4 6124 and then Subject (Cursor + 1 .. Cursor + 4) = Node.Str4 6125 then 6126 Cursor := Cursor + 4; 6127 goto Succeed; 6128 else 6129 goto Fail; 6130 end if; 6131 6132 -- String (five character case) 6133 6134 when PC_String_5 => 6135 Dout (Img (Node) & "matching " & Image (Node.Str5)); 6136 6137 if (Length - Cursor) >= 5 6138 and then Subject (Cursor + 1 .. Cursor + 5) = Node.Str5 6139 then 6140 Cursor := Cursor + 5; 6141 goto Succeed; 6142 else 6143 goto Fail; 6144 end if; 6145 6146 -- String (six character case) 6147 6148 when PC_String_6 => 6149 Dout (Img (Node) & "matching " & Image (Node.Str6)); 6150 6151 if (Length - Cursor) >= 6 6152 and then Subject (Cursor + 1 .. Cursor + 6) = Node.Str6 6153 then 6154 Cursor := Cursor + 6; 6155 goto Succeed; 6156 else 6157 goto Fail; 6158 end if; 6159 6160 -- String (case of more than six characters) 6161 6162 when PC_String => declare 6163 Len : constant Natural := Node.Str'Length; 6164 6165 begin 6166 Dout (Img (Node) & "matching " & Image (Node.Str.all)); 6167 6168 if (Length - Cursor) >= Len 6169 and then Node.Str.all = Subject (Cursor + 1 .. Cursor + Len) 6170 then 6171 Cursor := Cursor + Len; 6172 goto Succeed; 6173 else 6174 goto Fail; 6175 end if; 6176 end; 6177 6178 -- String (function case) 6179 6180 when PC_String_VF => declare 6181 U : constant VString := Node.VF.all; 6182 Str : constant String_Access := Get_String (U); 6183 Len : constant Natural := Str'Length; 6184 6185 begin 6186 Dout (Img (Node) & "matching " & Image (Str.all)); 6187 6188 if (Length - Cursor) >= Len 6189 and then Str.all = Subject (Cursor + 1 .. Cursor + Len) 6190 then 6191 Cursor := Cursor + Len; 6192 goto Succeed; 6193 else 6194 goto Fail; 6195 end if; 6196 end; 6197 6198 -- String (vstring pointer case) 6199 6200 when PC_String_VP => declare 6201 S : constant String_Access := Get_String (Node.VP.all); 6202 Len : constant Natural := 6203 Ada.Strings.Unbounded.Length (Node.VP.all); 6204 6205 begin 6206 Dout 6207 (Img (Node) & "matching " & Image (S.all)); 6208 6209 if (Length - Cursor) >= Len 6210 and then S.all = Subject (Cursor + 1 .. Cursor + Len) 6211 then 6212 Cursor := Cursor + Len; 6213 goto Succeed; 6214 else 6215 goto Fail; 6216 end if; 6217 end; 6218 6219 -- Succeed 6220 6221 when PC_Succeed => 6222 Dout (Img (Node) & "matching Succeed"); 6223 Push (Node); 6224 goto Succeed; 6225 6226 -- Tab (integer case) 6227 6228 when PC_Tab_Nat => 6229 Dout (Img (Node) & "matching Tab", Node.Nat); 6230 6231 if Cursor <= Node.Nat then 6232 Cursor := Node.Nat; 6233 goto Succeed; 6234 else 6235 goto Fail; 6236 end if; 6237 6238 -- Tab (integer function case) 6239 6240 when PC_Tab_NF => declare 6241 N : constant Natural := Node.NF.all; 6242 6243 begin 6244 Dout (Img (Node) & "matching Tab ", N); 6245 6246 if Cursor <= N then 6247 Cursor := N; 6248 goto Succeed; 6249 else 6250 goto Fail; 6251 end if; 6252 end; 6253 6254 -- Tab (integer pointer case) 6255 6256 when PC_Tab_NP => 6257 Dout (Img (Node) & "matching Tab ", Node.NP.all); 6258 6259 if Cursor <= Node.NP.all then 6260 Cursor := Node.NP.all; 6261 goto Succeed; 6262 else 6263 goto Fail; 6264 end if; 6265 6266 -- Unanchored movement 6267 6268 when PC_Unanchored => 6269 Dout ("attempting to move anchor point"); 6270 6271 -- All done if we tried every position 6272 6273 if Cursor > Length then 6274 goto Match_Fail; 6275 6276 -- Otherwise extend the anchor point, and restack ourself 6277 6278 else 6279 Cursor := Cursor + 1; 6280 Push (Node); 6281 goto Succeed; 6282 end if; 6283 6284 -- Write immediate. This node performs the actual write 6285 6286 when PC_Write_Imm => 6287 Dout (Img (Node) & "executing immediate write of " & 6288 Subject (Stack (Stack_Base - 1).Cursor + 1 .. Cursor)); 6289 6290 Put_Line 6291 (Node.FP.all, 6292 Subject (Stack (Stack_Base - 1).Cursor + 1 .. Cursor)); 6293 Pop_Region; 6294 goto Succeed; 6295 6296 -- Write on match. This node sets up for the eventual write 6297 6298 when PC_Write_OnM => 6299 Dout (Img (Node) & "registering deferred write"); 6300 Stack (Stack_Base - 1).Node := Node; 6301 Push (CP_Assign'Access); 6302 Pop_Region; 6303 Assign_OnM := True; 6304 goto Succeed; 6305 6306 end case; 6307 6308 -- We are NOT allowed to fall though this case statement, since every 6309 -- match routine must end by executing a goto to the appropriate point 6310 -- in the finite state machine model. 6311 6312 Logic_Error; 6313 6314 end XMatchD; 6315 6316end GNAT.Spitbol.Patterns; 6317