regex
=====
A Rust library for parsing, compiling, and executing regular expressions. Its
syntax is similar to Perl-style regular expressions, but lacks a few features
like look around and backreferences. In exchange, all searches execute in
linear time with respect to the size of the regular expression and search text.
Much of the syntax and implementation is inspired
by [RE2](https://github.com/google/re2).

[![Build status](https://github.com/rust-lang/regex/workflows/ci/badge.svg)](https://github.com/rust-lang/regex/actions)
[![](https://meritbadge.herokuapp.com/regex)](https://crates.io/crates/regex)
[![Rust](https://img.shields.io/badge/rust-1.41.1%2B-blue.svg?maxAge=3600)](https://github.com/rust-lang/regex)

### Documentation

[Module documentation with examples](https://docs.rs/regex).
The module documentation also includes a comprehensive description of the
syntax supported.

Documentation with examples for the various matching functions and iterators
can be found on the
[`Regex` type](https://docs.rs/regex/*/regex/struct.Regex.html).

### Usage

Add this to your `Cargo.toml`:

```toml
[dependencies]
regex = "1.5"
```

Here's a simple example that matches a date in YYYY-MM-DD format and prints the
year, month and day:

```rust
use regex::Regex;

fn main() {
    let re = Regex::new(r"(?x)
(?P<year>\d{4})  # the year
-
(?P<month>\d{2}) # the month
-
(?P<day>\d{2})   # the day
").unwrap();
    let caps = re.captures("2010-03-14").unwrap();

    assert_eq!("2010", &caps["year"]);
    assert_eq!("03", &caps["month"]);
    assert_eq!("14", &caps["day"]);
}
```

If you have lots of dates in text that you'd like to iterate over, then it's
easy to adapt the above example with an iterator:

```rust
use regex::Regex;

const TO_SEARCH: &str = "
On 2010-03-14, foo happened. On 2014-10-14, bar happened.
";

fn main() {
    let re = Regex::new(r"(\d{4})-(\d{2})-(\d{2})").unwrap();

    for caps in re.captures_iter(TO_SEARCH) {
        // Note that all of the unwraps are actually OK for this regex
        // because the only way for the regex to match is if all of the
        // capture groups match. This is not true in general though!
        println!("year: {}, month: {}, day: {}",
                 caps.get(1).unwrap().as_str(),
                 caps.get(2).unwrap().as_str(),
                 caps.get(3).unwrap().as_str());
    }
}
```

This example outputs:

```text
year: 2010, month: 03, day: 14
year: 2014, month: 10, day: 14
```

### Usage: Avoid compiling the same regex in a loop

It is an anti-pattern to compile the same regular expression in a loop since
compilation is typically expensive. (It takes anywhere from a few microseconds
to a few **milliseconds** depending on the size of the regex.) Not only is
compilation itself expensive, but this also prevents optimizations that reuse
allocations internally to the matching engines.

In Rust, it can sometimes be a pain to pass regular expressions around if
they're used from inside a helper function. Instead, we recommend using the
[`lazy_static`](https://crates.io/crates/lazy_static) crate to ensure that
regular expressions are compiled exactly once.

For example:

```rust,ignore
use lazy_static::lazy_static;
use regex::Regex;

fn some_helper_function(text: &str) -> bool {
    lazy_static! {
        // Compiled on first access, then reused by every later call.
        static ref RE: Regex = Regex::new("...").unwrap();
    }
    RE.is_match(text)
}
```

Specifically, in this example, the regex will be compiled when it is used for
the first time. On subsequent uses, it will reuse the previous compilation.

### Usage: match regular expressions on `&[u8]`

The main API of this crate (`regex::Regex`) requires the caller to pass a
`&str` for searching. In Rust, an `&str` is required to be valid UTF-8, which
means the main API can't be used for searching arbitrary bytes.

To match on arbitrary bytes, use the `regex::bytes::Regex` API. The API
is identical to the main API, except that it takes an `&[u8]` to search
on instead of an `&str`. It also permits disabling Unicode mode with the `u`
flag even when the pattern could then match invalid UTF-8. For example,
`(?-u:.)` matches any *byte* except `\n` and is only allowed in
`regex::bytes::Regex`, while `.` in Unicode mode (the default in both APIs)
matches the UTF-8 encoding of any *Unicode scalar value* except `\n`.

This example shows how to find all null-terminated strings in a slice of bytes:

```rust
use regex::bytes::Regex;

// `(?-u)` disables Unicode mode so the class below matches raw bytes.
let re = Regex::new(r"(?-u)(?P<cstr>[^\x00]+)\x00").unwrap();
let text = b"foo\x00bar\x00baz\x00";

// Extract all of the strings without the null terminator from each match.
// The unwrap is OK here since a match requires the `cstr` capture to match.
let cstrs: Vec<&[u8]> =
    re.captures_iter(text)
      .map(|c| c.name("cstr").unwrap().as_bytes())
      .collect();
assert_eq!(vec![&b"foo"[..], &b"bar"[..], &b"baz"[..]], cstrs);
```

Notice here that, with Unicode mode disabled by `(?-u)`, `[^\x00]+` will match
any *byte* except for `NUL`, including bytes such as `\xFF` that are not valid
UTF-8. When using the main API, `[^\x00]+` would instead match any valid UTF-8
sequence except for `NUL`.

### Usage: match multiple regular expressions simultaneously

This demonstrates how to use a `RegexSet` to match multiple (possibly
overlapping) regular expressions in a single scan of the search text:

```rust
use regex::RegexSet;

let set = RegexSet::new(&[
    r"\w+",
    r"\d+",
    r"\pL+",
    r"foo",
    r"bar",
    r"barfoo",
    r"foobar",
]).unwrap();

// Iterate over and collect all of the matches.
let matches: Vec<_> = set.matches("foobar").into_iter().collect();
assert_eq!(matches, vec![0, 2, 3, 4, 6]);

// You can also test whether a particular regex matched:
let matches = set.matches("foobar");
assert!(!matches.matched(5));
assert!(matches.matched(6));
```

### Usage: enable SIMD optimizations

SIMD optimizations are enabled automatically on Rust stable 1.27 and newer.
For nightly versions of Rust, this requires a recent version with the SIMD
features stabilized.

### Usage: a regular expression parser

This repository contains a crate that provides a well-tested regular expression
parser, abstract syntax and a high-level intermediate representation for
convenient analysis. It provides no facilities for compilation or execution.
This may be useful if you're implementing your own regex engine or otherwise
need to do analysis on the syntax of a regular expression. It is otherwise not
recommended for general use.

[Documentation for `regex-syntax`.](https://docs.rs/regex-syntax)

### Crate features

This crate comes with several features that permit tweaking the trade-off
between binary size, compilation time and runtime performance. Users of this
crate can selectively disable Unicode tables, or choose which of the
optimizations performed by this crate to disable.

When all of these features are disabled, runtime match performance may be much
worse, but if you're matching on short strings, or if high performance isn't
necessary, then such a configuration is perfectly serviceable. To disable
all such features, use the following `Cargo.toml` dependency configuration:

```toml
[dependencies.regex]
version = "1.3"
default-features = false
# regex currently requires the standard library, so you must re-enable it.
features = ["std"]
```

This will reduce the dependency tree of `regex` down to a single crate
(`regex-syntax`).

The full set of features one can disable is listed
[in the "Crate features" section of the documentation](https://docs.rs/regex/*/#crate-features).

### Minimum Rust version policy

This crate's minimum supported `rustc` version is `1.41.1`.

The current **tentative** policy is that the minimum Rust version required
to use this crate can be increased in minor version updates. For example, if
regex 1.0 requires Rust 1.20.0, then regex 1.0.z for all values of `z` will
also require Rust 1.20.0 or newer. However, regex 1.y for `y > 0` may require a
newer minimum version of Rust.

In general, this crate will be conservative with respect to the minimum
supported version of Rust.

### License

This project is licensed under either of

 * Apache License, Version 2.0, ([LICENSE-APACHE](LICENSE-APACHE) or
   https://www.apache.org/licenses/LICENSE-2.0)
 * MIT license ([LICENSE-MIT](LICENSE-MIT) or
   https://opensource.org/licenses/MIT)

at your option.

The data in `regex-syntax/src/unicode_tables/` is licensed under the Unicode
License Agreement
([LICENSE-UNICODE](https://www.unicode.org/copyright.html#License)).