# Regular Expression Tokenizer

Tokenizes strings that represent regular expressions.

[![Build Status](https://secure.travis-ci.org/fent/ret.js.svg)](http://travis-ci.org/fent/ret.js)
[![Dependency Status](https://david-dm.org/fent/ret.js.svg)](https://david-dm.org/fent/ret.js)
[![codecov](https://codecov.io/gh/fent/ret.js/branch/master/graph/badge.svg)](https://codecov.io/gh/fent/ret.js)

# Usage

```js
var ret = require('ret');

var tokens = ret(/foo|bar/.source);
```

`tokens` will contain the following object:

```js
{
  "type": ret.types.ROOT,
  "options": [
    [ { "type": ret.types.CHAR, "value": 102 },
      { "type": ret.types.CHAR, "value": 111 },
      { "type": ret.types.CHAR, "value": 111 } ],
    [ { "type": ret.types.CHAR, "value": 98 },
      { "type": ret.types.CHAR, "value": 97 },
      { "type": ret.types.CHAR, "value": 114 } ]
  ]
}
```

# Token Types

`ret.types` is a collection of the various token types exported by ret.

### ROOT

Only used in the root of the regexp. This is needed due to the possibility of the root containing a pipe `|` character. In that case, the token will have an `options` key that will be an array of arrays of tokens. If not, it will contain a `stack` key that is an array of tokens.

```js
{
  "type": ret.types.ROOT,
  "stack": [token1, token2...],
}
```

```js
{
  "type": ret.types.ROOT,
  "options": [
    [token1, token2...],
    [othertoken1, othertoken2...]
    ...
  ],
}
```

### GROUP

Groups contain the tokens that are inside a pair of parentheses. If the group begins with `?` followed by another character, it's a special type of group. A `:` tells the group not to be remembered when `exec` is used. `=` means the previous token matches only if followed by this group, and `!` means the previous token matches only if NOT followed by it.

Like root, it can contain an `options` key instead of `stack` if there is a pipe.
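
Since both ROOT and GROUP tokens carry either a `stack` or an `options` key, code that walks the tree may find it convenient to normalize the two cases first. The following is only a minimal sketch: the helpers `alternatives` and `charsToString` are hypothetical names that are not part of ret, and the local `types` object is a stand-in for `ret.types` so the snippet runs without the package installed.

```js
// Stand-in for ret.types (an assumption made so this sketch is self-contained).
// With ret installed, use `require('ret').types` instead.
var types = { ROOT: 'ROOT', GROUP: 'GROUP', CHAR: 'CHAR' };

// Hypothetical helper: returns the alternatives of a ROOT or GROUP token
// as an array of token arrays, whether or not the token contains a pipe.
function alternatives(token) {
  return token.options || [token.stack];
}

// Hypothetical helper: turns an array of CHAR tokens back into a string.
function charsToString(tokens) {
  return tokens.map(function (t) {
    return String.fromCharCode(t.value);
  }).join('');
}

// Hand-built token tree in the shape documented for /foo|bar/.
var root = {
  type: types.ROOT,
  options: [
    [ { type: types.CHAR, value: 102 },
      { type: types.CHAR, value: 111 },
      { type: types.CHAR, value: 111 } ],
    [ { type: types.CHAR, value: 98 },
      { type: types.CHAR, value: 97 },
      { type: types.CHAR, value: 114 } ]
  ]
};

var words = alternatives(root).map(charsToString);
console.log(words); // [ 'foo', 'bar' ]
```

A token with only a `stack` key goes through the same path, since `alternatives` wraps the stack in a one-element array.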

```js
{
  "type": ret.types.GROUP,
  "remember": true,
  "followedBy": false,
  "notFollowedBy": false,
  "stack": [token1, token2...],
}
```

```js
{
  "type": ret.types.GROUP,
  "remember": true,
  "followedBy": false,
  "notFollowedBy": false,
  "options": [
    [token1, token2...],
    [othertoken1, othertoken2...]
    ...
  ],
}
```

### POSITION

`\b`, `\B`, `^`, and `$` specify positions in the regexp.

```js
{
  "type": ret.types.POSITION,
  "value": "^",
}
```

### SET

Contains a key `set` specifying what tokens are allowed and a key `not` specifying if the set should be negated. A set can contain other sets, ranges, and characters.

```js
{
  "type": ret.types.SET,
  "set": [token1, token2...],
  "not": false,
}
```

### RANGE

Used in set tokens to specify a character range. `from` and `to` are character codes.

```js
{
  "type": ret.types.RANGE,
  "from": 97,
  "to": 122,
}
```

### REPETITION

```js
{
  "type": ret.types.REPETITION,
  "min": 0,
  "max": Infinity,
  "value": token,
}
```

### REFERENCE

References a group token. `value` is 1-9.

```js
{
  "type": ret.types.REFERENCE,
  "value": 1,
}
```

### CHAR

Represents a single character token. `value` is the character code. Emitting one token per character may look more cluttered than concatenating characters together, but since repetition tokens only repeat the last token, and not the last clause as the pipe does, it's simpler to do it this way.

```js
{
  "type": ret.types.CHAR,
  "value": 123,
}
```

## Errors

ret.js throws an error if given a string containing an invalid regular expression. The possible errors are:

* Invalid group. Thrown when the `?` at the start of a group is followed by an invalid character. It can only be followed by `!`, `=`, or `:`. Example: `/(?_abc)/`
* Nothing to repeat.
Thrown when a repetition token is used as the first token in the current clause, that is, at the very beginning of the regexp or group, or right after a pipe. Examples: `/foo|?bar/`, `/{1,3}foo|bar/`, `/foo(+bar)/`
* Unmatched ). A group was not opened, but was closed. Example: `/hello)2u/`
* Unterminated group. A group was not closed. Example: `/(1(23)4/`
* Unterminated character class. A custom character set was not closed. Example: `/[abc/`


# Install

    npm install ret


# Tests

Tests are written with [vows](http://vowsjs.org/).

```bash
npm test
```

# License

MIT