Name | Date | Size | #Lines | LOC | ||
---|---|---|---|---|---|---|
.. | 03-May-2022 | - | ||||
.gitignore | H A D | 22-Dec-2020 | 273 | |||
LICENSE | H A D | 22-Dec-2020 | 1.1 KiB | |||
README.md | H A D | 22-Dec-2020 | 2.7 KiB | |||
bucket.go | H A D | 22-Dec-2020 | 611 | |||
cuckoofilter.go | H A D | 22-Dec-2020 | 3.5 KiB | |||
cuckoofilter_test.go | H A D | 22-Dec-2020 | 1.7 KiB | |||
doc.go | H A D | 22-Dec-2020 | 2.2 KiB | |||
go.mod | H A D | 22-Dec-2020 | 275 | |||
scalable_cuckoofilter.go | H A D | 22-Dec-2020 | 3.5 KiB | |||
scalable_cuckoofilter_test.go | H A D | 22-Dec-2020 | 1.2 KiB | |||
util.go | H A D | 22-Dec-2020 | 1 KiB | |||
util_test.go | H A D | 22-Dec-2020 | 1.8 KiB |
README.md
1# Cuckoo Filter 2 3[![GoDoc](https://godoc.org/github.com/seiflotfy/cuckoofilter?status.svg)](https://godoc.org/github.com/seiflotfy/cuckoofilter) [![CodeHunt.io](https://img.shields.io/badge/vote-codehunt.io-02AFD1.svg)](http://codehunt.io/sub/cuckoo-filter/?utm_source=badge&utm_medium=badge&utm_campaign=pr-badge) 4 5Cuckoo filter is a Bloom filter replacement for approximated set-membership queries. While Bloom filters are well-known space-efficient data structures to serve queries like "if item x is in a set?", they do not support deletion. Their variances to enable deletion (like counting Bloom filters) usually require much more space. 6 7Cuckoo filters provide the flexibility to add and remove items dynamically. A cuckoo filter is based on cuckoo hashing (and therefore named as cuckoo filter). It is essentially a cuckoo hash table storing each key's fingerprint. Cuckoo hash tables can be highly compact, thus a cuckoo filter could use less space than conventional Bloom filters, for applications that require low false positive rates (< 3%). 8 9For details about the algorithm and citations please use this article for now 10 11["Cuckoo Filter: Better Than Bloom" by Bin Fan, Dave Andersen and Michael Kaminsky](https://www.cs.cmu.edu/~dga/papers/cuckoo-conext2014.pdf) 12 13## Implementation details 14 15The paper cited above leaves several parameters to choose. In this implementation 16 171. Every element has 2 possible bucket indices 182. Buckets have a static size of 4 fingerprints 193. Fingerprints have a static size of 8 bits 20 211 and 2 are suggested to be the optimum by the authors. The choice of 3 comes down to the desired false positive rate. Given a target false positive rate of `r` and a bucket size `b`, they suggest choosing the fingerprint size `f` using 22 23 f >= log2(2b/r) bits 24 25With the 8 bit fingerprint size in this repository, you can expect `r ~= 0.03`. 26[Other implementations](https://github.com/panmari/cuckoofilter) use 16 bit, which correspond to a false positive rate of `r ~= 0.0001`. 27 28## Example usage: 29```go 30package main 31 32import "fmt" 33import "github.com/seiflotfy/cuckoofilter" 34 35func main() { 36 cf := cuckoo.NewFilter(1000) 37 cf.InsertUnique([]byte("geeky ogre")) 38 39 // Lookup a string (and it a miss) if it exists in the cuckoofilter 40 cf.Lookup([]byte("hello")) 41 42 count := cf.Count() 43 fmt.Println(count) // count == 1 44 45 // Delete a string (and it a miss) 46 cf.Delete([]byte("hello")) 47 48 count = cf.Count() 49 fmt.Println(count) // count == 1 50 51 // Delete a string (a hit) 52 cf.Delete([]byte("geeky ogre")) 53 54 count = cf.Count() 55 fmt.Println(count) // count == 0 56 57 cf.Reset() // reset 58} 59``` 60 61## Documentation: 62["Cuckoo Filter on GoDoc"](http://godoc.org/github.com/seiflotfy/cuckoofilter) 63