------------------------------------------------------------------------------
--                                                                          --
--                         GNAT COMPILER COMPONENTS                         --
--                                                                          --
--                          G N A T . R E G E X P                           --
--                                                                          --
--                                 S p e c                                  --
--                                                                          --
--           Copyright (C) 1998-2003 Ada Core Technologies, Inc.            --
--                                                                          --
-- GNAT is free software; you can redistribute it and/or modify it under    --
-- terms of the GNU General Public License as published by the Free Soft-   --
-- ware Foundation; either version 2, or (at your option) any later ver-    --
-- sion. GNAT is distributed in the hope that it will be useful, but WITH-  --
-- OUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY   --
-- or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License  --
-- for more details. You should have received a copy of the GNU General     --
-- Public License distributed with GNAT; see file COPYING. If not, write    --
-- to the Free Software Foundation, 59 Temple Place - Suite 330, Boston,    --
-- MA 02111-1307, USA.                                                      --
--                                                                          --
-- As a special exception, if other files instantiate generics from this    --
-- unit, or you link this unit with other files to produce an executable,   --
-- this unit does not by itself cause the resulting executable to be        --
-- covered by the GNU General Public License. This exception does not       --
-- however invalidate any other reasons why the executable file might be    --
-- covered by the GNU Public License.                                       --
--                                                                          --
-- GNAT was originally developed by the GNAT team at New York University.   --
-- Extensive contributions were provided by Ada Core Technologies Inc.      --
--                                                                          --
------------------------------------------------------------------------------

--  Simple Regular expression matching

--  This package provides a simple implementation of a regular expression
--  pattern matching algorithm, using a subset of the syntax of regular
--  expressions copied from familiar Unix style utilities.

------------------------------------------------------------
-- Summary of Pattern Matching Packages in GNAT Hierarchy --
------------------------------------------------------------

--  There are three related packages that perform pattern matching
--  functions. The following is an outline of these packages, to help you
--  determine which is best for your needs.

--     GNAT.Regexp (files g-regexp.ads/g-regexp.adb)
--       This is a simple package providing Unix-style regular expression
--       matching with the restriction that it matches entire strings. It
--       is particularly useful for file name matching, and in particular
--       it provides "globbing patterns" that are useful in implementing
--       Unix or DOS style wild card matching for file names.

--     GNAT.Regpat (files g-regpat.ads/g-regpat.adb)
--       This is a more complete implementation of Unix-style regular
--       expressions, copied from the original V7 style regular expression
--       library written in C by Henry Spencer. It is functionally the
--       same as that C library, and uses the same internal data structures
--       stored in a binary compatible manner.

--     GNAT.Spitbol.Patterns (files g-spipat.ads/g-spipat.adb)
--       This is a completely general pattern matching package based on the
--       pattern language of SNOBOL4, as implemented in SPITBOL. The pattern
--       language is modeled on context free grammars, with context sensitive
--       extensions that provide full (type 0) computational capabilities.

with Ada.Finalization;

package GNAT.Regexp is

   --  The regular expression must first be compiled, using the Compile
   --  function, which creates a finite state matching table, allowing
   --  very fast matching once the expression has been compiled.

   --  The form of a regular expression, expressed in Ada Reference Manual
   --  style BNF, is as follows:

   --     regexp ::= term

   --     regexp ::= term | term          -- alternation (term or term ...)

   --     term ::= item

   --     term ::= item item ...          -- concatenation (item then item)

   --     item ::= elmt                   -- match elmt
   --     item ::= elmt *                 -- zero or more elmt's
   --     item ::= elmt +                 -- one or more elmt's
   --     item ::= elmt ?                 -- matches elmt or nothing

   --     elmt ::= nchr                   -- matches given character
   --     elmt ::= [nchr nchr ...]        -- matches any character listed
   --     elmt ::= [^ nchr nchr ...]      -- matches any character not listed
   --     elmt ::= [char - char]          -- matches chars in given range
   --     elmt ::= .                      -- matches any single character
   --     elmt ::= ( regexp )             -- parens used for grouping

   --     char ::= any character, including special characters
   --     nchr ::= any character except \()[].*+?^ or \char to match char
   --     ... is used to indicate repetition (one or more terms)

   --  See also regexp(1) man page on Unix systems for further details

   --  A second kind of regular expression is provided. This one is more
   --  like the wild card patterns used in file names by the Unix shell (or
   --  DOS prompt) command lines. The grammar is the following:

   --     regexp ::= term

   --     term   ::= elmt

   --     term   ::= elmt elmt ...        -- concatenation (elmt then elmt)
   --     term   ::= *                    -- any string of 0 or more characters
   --     term   ::= ?                    -- matches any character
   --     term   ::= [char char ...]      -- matches any character listed
   --     term   ::= [char - char]        -- matches any character in given range
   --     term   ::= {elmt, elmt, ...}    -- alternation (matches any of elmt)

   --  Important note: This package was mainly intended to match regular
   --  expressions against file names. The whole string has to match the
   --  regular expression. If only a substring matches, then the function
   --  Match will return False.
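
   --  Usage sketch (illustrative only, not part of the original package
   --  documentation). The procedure name Demo, the variable names and the
   --  sample strings are hypothetical. The expected results noted below
   --  follow from the two grammars and the whole-string rule described
   --  above. The patterns are compiled inside the statement part so that
   --  the handler for Error_In_Regexp covers the calls to Compile.

   --     with Ada.Text_IO; use Ada.Text_IO;
   --     with GNAT.Regexp; use GNAT.Regexp;

   --     procedure Demo is
   --        Re       : Regexp;
   --        File_Pat : Regexp;
   --     begin
   --        Re       := Compile ("a(b|c)*d");
   --        File_Pat := Compile ("*.ad[sb]", Glob => True);

   --        Put_Line (Boolean'Image (Match ("abcbd", Re)));
   --        --  Expected True: the whole string matches the first grammar

   --        Put_Line (Boolean'Image (Match ("xabcbd", Re)));
   --        --  Expected False: only a substring matches

   --        Put_Line (Boolean'Image (Match ("g-regexp.ads", File_Pat)));
   --        --  Expected True: globbing pattern applied to a file name

   --     exception
   --        when Error_In_Regexp =>
   --           Put_Line ("invalid pattern");
   --     end Demo;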

   type Regexp is private;
   --  Private type used to represent a regular expression

   Error_In_Regexp : exception;
   --  Exception raised when an error is found in the regular expression

   function Compile
     (Pattern        : String;
      Glob           : Boolean := False;
      Case_Sensitive : Boolean := True)
      return           Regexp;
   --  Compiles the regular expression Pattern. If the syntax of the given
   --  expression is invalid (does not match the grammar above),
   --  Error_In_Regexp is raised. If Glob is True, the pattern is treated as
   --  a "globbing pattern", that is, a pattern as given by the second
   --  grammar above. If Case_Sensitive is False, letter case is ignored
   --  when matching. As a special case, if Pattern is the empty string it
   --  will always match.

   function Match (S : String; R : Regexp) return Boolean;
   --  True if S matches R, otherwise False. Raises Constraint_Error if
   --  R is an uninitialized regular expression value.

private
   type Regexp_Value;

   type Regexp_Access is access Regexp_Value;

   type Regexp is new Ada.Finalization.Controlled with record
      R : Regexp_Access := null;
   end record;

   pragma Finalize_Storage_Only (Regexp);

   procedure Finalize (R : in out Regexp);
   --  Free the memory occupied by R

   procedure Adjust (R : in out Regexp);
   --  Called after an assignment (makes a copy of the designated
   --  Regexp_Value, i.e. Regexp_Access.all)

end GNAT.Regexp;