/*!
A lower level API for packed multiple substring search, principally for a small
number of patterns.

This sub-module provides vectorized routines for quickly finding matches of a
small number of patterns. In general, users of this crate shouldn't need to
interface with this module directly, as the primary
[`AhoCorasick`](../struct.AhoCorasick.html)
searcher will use these routines automatically as a prefilter when applicable.
However, in some cases, callers may want to bypass the Aho-Corasick machinery
entirely and use this vectorized searcher directly.

# Overview

The primary types in this sub-module are:

* [`Searcher`](struct.Searcher.html) executes the actual search algorithm to
  report matches in a haystack.
* [`Builder`](struct.Builder.html) accumulates patterns incrementally and can
  construct a `Searcher`.
* [`Config`](struct.Config.html) permits tuning the searcher, and itself will
  produce a `Builder` (which can then be used to build a `Searcher`).
  Currently, the only tunable knob is the match semantics, but this may be
  expanded in the future.

# Examples

This example shows how to create a searcher from an iterator of patterns.
By default, leftmost-first match semantics are used. (See the top-level
[`MatchKind`](../enum.MatchKind.html) type for more details about match
semantics, which apply similarly to packed substring search.)

```
use aho_corasick::packed::Searcher;

# fn example() -> Option<()> {
let searcher = Searcher::new(["foobar", "foo"].iter().cloned())?;
let matches: Vec<usize> = searcher
    .find_iter("foobar")
    .map(|mat| mat.pattern())
    .collect();
assert_eq!(vec![0], matches);
# Some(()) }
# if cfg!(target_arch = "x86_64") {
#     example().unwrap()
# } else {
#     assert!(example().is_none());
# }
```

This example shows how to use [`Config`](struct.Config.html) to change the
match semantics to leftmost-longest:

```
use aho_corasick::packed::{Config, MatchKind};

# fn example() -> Option<()> {
let searcher = Config::new()
    .match_kind(MatchKind::LeftmostLongest)
    .builder()
    .add("foo")
    .add("foobar")
    .build()?;
let matches: Vec<usize> = searcher
    .find_iter("foobar")
    .map(|mat| mat.pattern())
    .collect();
assert_eq!(vec![1], matches);
# Some(()) }
# if cfg!(target_arch = "x86_64") {
#     example().unwrap()
# } else {
#     assert!(example().is_none());
# }
```
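
To make the difference between the two match semantics concrete, the following
is a minimal, deliberately naive reference sketch that uses only the standard
library. It is not how this sub-module is implemented (the real searchers are
vectorized), and the names `Semantics` and `naive_find` are hypothetical,
introduced only for illustration.

```rust
/// Hypothetical stand-in for the crate's match semantics; illustration only.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
enum Semantics {
    /// Among matches at the leftmost position, prefer the pattern listed first.
    LeftmostFirst,
    /// Among matches at the leftmost position, prefer the longest pattern.
    LeftmostLongest,
}

/// Naive reference search. Returns `(pattern index, start, end)` for the
/// leftmost match, or `None` if no pattern occurs in the haystack.
fn naive_find(
    patterns: &[&str],
    haystack: &str,
    semantics: Semantics,
) -> Option<(usize, usize, usize)> {
    let hay = haystack.as_bytes();
    for start in 0..hay.len() {
        // Best candidate at this start position: (pattern index, length).
        let mut best: Option<(usize, usize)> = None;
        for (id, pat) in patterns.iter().enumerate() {
            let pat = pat.as_bytes();
            // Empty patterns are skipped; only prefix matches are candidates.
            if pat.is_empty() || !hay[start..].starts_with(pat) {
                continue;
            }
            let better = match (semantics, best) {
                (_, None) => true,
                (Semantics::LeftmostFirst, Some(_)) => false,
                (Semantics::LeftmostLongest, Some((_, len))) => pat.len() > len,
            };
            if better {
                best = Some((id, pat.len()));
            }
        }
        // The first position with any candidate is the leftmost match.
        if let Some((id, len)) = best {
            return Some((id, start, start + len));
        }
    }
    None
}
```

With the patterns `foo` and `foobar` (in that order) and the haystack `foobar`,
this sketch reports pattern `0` under leftmost-first semantics and pattern `1`
under leftmost-longest semantics.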

# Packed substring searching

Packed substring searching refers to the use of SIMD (Single Instruction,
Multiple Data) to accelerate the detection of matches in a haystack. Unlike
conventional algorithms, such as Aho-Corasick, SIMD algorithms for substring
search tend to do better with a small number of patterns, whereas Aho-Corasick
generally maintains reasonably consistent performance regardless of the number
of patterns you give it. Because of this, the vectorized searcher in this
sub-module cannot be used as a general purpose searcher, since building the
searcher may fail. However, in exchange, when searching for a small number of
patterns, searching can be quite a bit faster than Aho-Corasick (sometimes by
an order of magnitude).

The key takeaway here is that constructing a searcher from a list of patterns
is a fallible operation. While the precise conditions under which building a
searcher can fail are an implementation detail, here are some common reasons:

* Too many patterns were given. Typically, the limit is on the order of 100 or
  so, but this limit may fluctuate based on available CPU features.
* The available packed algorithms require CPU features that aren't available.
  For example, currently, this crate only provides packed algorithms for
  `x86_64`. Therefore, constructing a packed searcher on any other target
  (e.g., ARM) will always fail.
* Zero patterns were given, or one of the patterns given was empty. Packed
  searchers require at least one pattern and that all patterns are non-empty.
* Something else about the nature of the patterns (typically based on
  heuristics) suggests that a packed searcher would perform very poorly, so
  no searcher is built.
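
Some of these conditions can be checked cheaply ahead of time, so a caller
might validate a pattern set before attempting to build a packed searcher. The
sketch below uses only the standard library. `Rejected`, `precheck` and the
`assumed_limit` parameter are hypothetical names that are not part of this
crate, and the checks cover only the pattern-count and empty-pattern cases
listed above. Passing such a check does not guarantee that construction
succeeds, since the real conditions remain an implementation detail.

```rust
/// Hypothetical reasons a caller-side pre-check might reject a pattern set.
#[derive(Debug, PartialEq, Eq)]
enum Rejected {
    /// No patterns were given at all.
    NoPatterns,
    /// The pattern at this index is empty.
    EmptyPattern(usize),
    /// More patterns were given than the caller's assumed limit.
    TooManyPatterns { given: usize, limit: usize },
}

/// Checks a pattern set against a caller-chosen limit. The limit is an
/// assumption made by the caller, not a value exposed by the crate.
fn precheck(patterns: &[&str], assumed_limit: usize) -> Result<(), Rejected> {
    if patterns.is_empty() {
        return Err(Rejected::NoPatterns);
    }
    if let Some(index) = patterns.iter().position(|p| p.is_empty()) {
        return Err(Rejected::EmptyPattern(index));
    }
    if patterns.len() > assumed_limit {
        return Err(Rejected::TooManyPatterns {
            given: patterns.len(),
            limit: assumed_limit,
        });
    }
    Ok(())
}
```

Even when such a check passes, the result of building the searcher still needs
to be handled as fallible, typically by falling back to the general purpose
Aho-Corasick searcher.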
*/

pub use packed::api::{Builder, Config, FindIter, MatchKind, Searcher};

mod api;
mod pattern;
mod rabinkarp;
mod teddy;
#[cfg(test)]
mod tests;
#[cfg(target_arch = "x86_64")]
mod vector;