# Optimization/Design notes

This is my scratch pad for optimization ideas. Some of these I will implement, some I have implemented, and some are just speculative.

# Scopes

## Representation ideas:

- Normal arrays of strings
- Arrays of 32-bit or 64-bit atoms (maybe using Servo's atom library)
- Atoms packed into one or two `u64`s
    - fast equality checking
    - potentially fast prefix checking
    - needs unsafe code

## Potential packings:

- Variable-width atoms: either 7 bits plus a tag bit for the top 128 atoms, or 13 bits plus 3 tag bits for the rest
    - can fit all but 33 of the scopes present
- Tagged pointer (taking advantage of alignment): either a pointer to a slow path, or the first 4 bits set followed by a packed representation (one of the others mentioned)
- 6 10-bit atoms referencing unique values by position (see the by-position stats below)
- 5 11-bit atoms plus one 8-bit atom in the first position (2^11 = 2048, 2^8 = 256), with one bit remaining as a tag marker

## Stats:

- ~7000 scopes are referenced in the Sublime default packages, 3537 of them unique; all stats below are based on the non-unique data
- All but 33 scopes in the default packages could fit in 64 bits with a combined 8-bit/16-bit atom encoding
- There are only 1219 unique atoms in the default package set
- The top 128 atoms account for ~90% of all atom references in syntax files
- There are 26 unique first atoms and 145 unique last atoms
- Every position (1st atom, 2nd atom, ...) has under 878 possibilities; only the 2nd, 3rd and 4th have >256
- 99.8% of scopes have 6 or fewer atoms, 97% have 5 or fewer, 70% have 4 or fewer
    - for unique scopes: {2=>81, 3=>621, 4=>1752, 5=>935, 6=>140, 7=>8} ----> ~95% of uniques have 5 or fewer
    - for non-unique scopes: {2=>125, 3=>1505, 4=>3383, 5=>1891, 6=>202, 7=>9}

# Checking prefix

operation: `fn extent_matched(potential_prefix: Scope, s: Scope) -> u8`
idea: a prefix matches if any differing bits lie beyond the length of the prefix.
figure this out by XORing, then ctz/clz, then a compare against the prefix length (however that works exactly).

```
XXXXYYYY00000000 # prefix
XXXXYYYYZZZZ0000 # testee
00000000ZZZZ0000 # = xored, differences start past the prefix

XXXXYYYYQQQQ0000 # non-prefix
XXXXYYYYZZZZ0000 # testee
00000000GGGG0000 # = xored, differences start inside the prefix

XXXXQQQQ00000000 # non-prefix
XXXXYYYYZZZZ0000 # testee
0000BBBBZZZZ0000 # = xored, differences start inside the prefix
```

# Parsing

- Problem: need to reduce the number of regex search calls
- Solution: cache better

## Stats

```bash
# On stats branch
$ cargo run --release --example syncat testdata/jquery.js | grep cmiss | wc -l
     Running `target/release/examples/syncat testdata/jquery.js`
   61266
$ cargo run --release --example syncat testdata/jquery.js | grep ptoken | wc -l
   Compiling syntect v0.1.0 (file:///Users/tristan/Box/Dev/Projects/syntect)
     Running `target/release/examples/syncat testdata/jquery.js`
   98714
$ wc -l testdata/jquery.js
    9210 testdata/jquery.js
$ cargo run --release --example syncat testdata/jquery.js | grep cclear | wc -l
   Compiling syntect v0.1.0 (file:///Users/tristan/Box/Dev/Projects/syntect)
     Running `target/release/examples/syncat testdata/jquery.js`
   71302
$ cargo run --release --example syncat testdata/jquery.js | grep freshcachetoken | wc -l
   Compiling syntect v0.1.0 (file:///Users/tristan/Box/Dev/Projects/syntect)
     Running `target/release/examples/syncat testdata/jquery.js`
   80512
# On stats-2 branch
$ cargo run --example syncat testdata/jquery.js | grep cachehit | wc -l
     Running `target/debug/examples/syncat testdata/jquery.js`
  527774
$ cargo run --example syncat testdata/jquery.js | grep regsearch | wc -l
     Running `target/debug/examples/syncat testdata/jquery.js`
 2862948
$ cargo run --example syncat testdata/jquery.js | grep regmatch | wc -l
   Compiling syntect v0.6.0 (file:///Users/tristan/Box/Dev/Projects/syntect)
     Running `target/debug/examples/syncat testdata/jquery.js`
  296127
$ cargo run --example syncat \
testdata/jquery.js | grep leastmatch | wc -l
   Compiling syntect v0.6.0 (file:///Users/tristan/Box/Dev/Projects/syntect)
     Running `target/debug/examples/syncat testdata/jquery.js`
  137842
# With search caching
$ cargo run --example syncat testdata/jquery.js | grep searchcached | wc -l
   Compiling syntect v0.6.0 (file:///Users/tristan/Box/Dev/Projects/syntect)
     Running `target/debug/examples/syncat testdata/jquery.js`
 2440527
$ cargo run --example syncat testdata/jquery.js | grep regsearch | wc -l
     Running `target/debug/examples/syncat testdata/jquery.js`
  950195
```

The average number of unique regexes per line is 87.58; the non-unique average is regsearch/lines = 2862948/9210 ≈ 311.

Ideally we should have only a couple of fresh cache searches per line, not the ~9 the stats show (freshcachetoken/linecount = 80512/9210 ≈ 8.7).

In a fantabulous world these stats mean a possible 10x speed improvement, but since caching has its own cost and we can't always cache, the real win will likely be nice but not that high.

## Issues

- Stack transitions always bust the cache, even when, for example, JS just pushes another group
- We don't cache actual matches, only whether a regex matched or not

## Attacks

- Cache based on the actual context; only search if it is a prototype we haven't searched before
    - hash maps keyed by casting the `Rc` ref to a pointer and hashing that? (there is a `Hash` impl for pointers)
- For new searches, store matched regexes for the context in a `BTreeMap`, like TextMate does
    - for subsequent tokens in the same context, just pop off the `BTreeMap`, and re-search if the cached match is before `curpos`
- Cache per `Regex`
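A minimal sketch of the first attack above, keying a cache on context *identity* rather than contents. The `Context` struct, `SearchCache`, and `search_or_cached` are all hypothetical names invented for illustration; the only real mechanisms used are `Rc::as_ptr` and the fact that raw pointers implement `Hash + Eq`, so two clones of the same `Rc` map to the same entry without any deep comparison.

```rust
use std::collections::HashMap;
use std::rc::Rc;

// Hypothetical stand-in for a syntax context; the real type is more complex.
struct Context {
    name: String,
}

// Cache keyed by the context's address, not its contents.
// Raw pointers implement Hash + Eq, so they work directly as HashMap keys.
struct SearchCache {
    searched: HashMap<*const Context, bool>,
}

impl SearchCache {
    fn new() -> Self {
        SearchCache { searched: HashMap::new() }
    }

    // Return the cached result for this exact context, or run `search`
    // once and remember its answer.
    fn search_or_cached(
        &mut self,
        ctx: &Rc<Context>,
        search: impl FnOnce(&Context) -> bool,
    ) -> bool {
        *self
            .searched
            .entry(Rc::as_ptr(ctx))
            .or_insert_with(|| search(ctx))
    }
}
```

Note the trade-off this encodes: identity-keyed caching is only correct while the `Rc` stays alive, since a freed-and-reallocated context could reuse the same address.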
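The `BTreeMap` attack could work roughly like this. A sketch with made-up positions and regex indices (`next_cached_match` is a hypothetical helper, not an existing API): cached entries before `curpos` are stale because the text there has already been consumed, so they get evicted and reported back for re-searching, while the earliest surviving entry is the next candidate match.

```rust
use std::collections::BTreeMap;

// Hypothetical cached matches for one context:
// key = byte position where a regex matched, value = regex index.
// Returns (regexes needing a re-search from curpos, next candidate match).
fn next_cached_match(
    matches: &mut BTreeMap<usize, usize>,
    curpos: usize,
) -> (Vec<usize>, Option<(usize, usize)>) {
    // Entries before curpos are stale; collect their regex indices.
    let stale: Vec<usize> = matches
        .range(..curpos)
        .map(|(_, &regex)| regex)
        .collect();
    // split_off keeps only entries at or after curpos.
    *matches = matches.split_off(&curpos);
    // The earliest remaining entry is the next candidate match.
    let next = matches.iter().next().map(|(&pos, &regex)| (pos, regex));
    (stale, next)
}
```

The `BTreeMap` ordering is what makes "pop the earliest match" and "drop everything before `curpos`" both cheap.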
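Returning to the `extent_matched` idea from the "Checking prefix" section: a sketch assuming scopes are packed left-aligned into a single `u64` with unused low bits zero (the real packing, atom widths, and the `u8` extent return are still open questions; this version just answers yes/no). `trailing_zeros`/`leading_zeros` are Rust's ctz/clz.

```rust
/// Does `potential_prefix` prefix-match `s`?
/// Assumes a scope is packed left-aligned into one u64, unused low bits zero.
fn is_prefix(potential_prefix: u64, s: u64) -> bool {
    // Bits the prefix actually uses, counted from the left.
    // trailing_zeros(0) == 64, so an empty prefix uses 0 bits.
    let prefix_bits = 64 - potential_prefix.trailing_zeros();
    // XOR leaves a 1 wherever the two scopes disagree.
    let diff = potential_prefix ^ s;
    // A match means every disagreement starts past the end of the prefix.
    // leading_zeros(0) == 64, so identical scopes always pass.
    diff.leading_zeros() >= prefix_bits
}
```

A `u8`-returning variant could presumably round `diff.leading_zeros()` down to an atom boundary to count how many leading atoms match, but that depends on the atom width chosen.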