• Home
  • History
  • Annotate
Name Date Size #Lines LOC

..03-May-2022-

.gitignoreH A D22-Dec-2020273

LICENSEH A D22-Dec-20201.1 KiB

README.mdH A D22-Dec-20202.7 KiB

bucket.goH A D22-Dec-2020611

cuckoofilter.goH A D22-Dec-20203.5 KiB

cuckoofilter_test.goH A D22-Dec-20201.7 KiB

doc.goH A D22-Dec-20202.2 KiB

go.modH A D22-Dec-2020275

scalable_cuckoofilter.goH A D22-Dec-20203.5 KiB

scalable_cuckoofilter_test.goH A D22-Dec-20201.2 KiB

util.goH A D22-Dec-20201 KiB

util_test.goH A D22-Dec-20201.8 KiB

README.md

1# Cuckoo Filter
2
3[![GoDoc](https://godoc.org/github.com/seiflotfy/cuckoofilter?status.svg)](https://godoc.org/github.com/seiflotfy/cuckoofilter) [![CodeHunt.io](https://img.shields.io/badge/vote-codehunt.io-02AFD1.svg)](http://codehunt.io/sub/cuckoo-filter/?utm_source=badge&utm_medium=badge&utm_campaign=pr-badge)
4
5Cuckoo filter is a Bloom filter replacement for approximated set-membership queries. While Bloom filters are well-known space-efficient data structures to serve queries like "if item x is in a set?", they do not support deletion. Their variances to enable deletion (like counting Bloom filters) usually require much more space.
6
7Cuckoo filters provide the flexibility to add and remove items dynamically. A cuckoo filter is based on cuckoo hashing (and therefore named as cuckoo filter). It is essentially a cuckoo hash table storing each key's fingerprint. Cuckoo hash tables can be highly compact, thus a cuckoo filter could use less space than conventional Bloom filters, for applications that require low false positive rates (< 3%).
8
9For details about the algorithm and citations please use this article for now
10
11["Cuckoo Filter: Better Than Bloom" by Bin Fan, Dave Andersen and Michael Kaminsky](https://www.cs.cmu.edu/~dga/papers/cuckoo-conext2014.pdf)
12
13## Implementation details
14
15The paper cited above leaves several parameters to choose. In this implementation
16
171. Every element has 2 possible bucket indices
182. Buckets have a static size of 4 fingerprints
193. Fingerprints have a static size of 8 bits
20
211 and 2 are suggested to be the optimum by the authors. The choice of 3 comes down to the desired false positive rate. Given a target false positive rate of `r` and a bucket size `b`, they suggest choosing the fingerprint size `f` using
22
23    f >= log2(2b/r) bits
24
25With the 8 bit fingerprint size in this repository, you can expect `r ~= 0.03`.
26[Other implementations](https://github.com/panmari/cuckoofilter) use 16 bit, which correspond to a false positive rate of `r ~= 0.0001`.
27
28## Example usage:
29```go
30package main
31
32import "fmt"
33import "github.com/seiflotfy/cuckoofilter"
34
35func main() {
36  cf := cuckoo.NewFilter(1000)
37  cf.InsertUnique([]byte("geeky ogre"))
38
39  // Lookup a string (and it a miss) if it exists in the cuckoofilter
40  cf.Lookup([]byte("hello"))
41
42  count := cf.Count()
43  fmt.Println(count) // count == 1
44
45  // Delete a string (and it a miss)
46  cf.Delete([]byte("hello"))
47
48  count = cf.Count()
49  fmt.Println(count) // count == 1
50
51  // Delete a string (a hit)
52  cf.Delete([]byte("geeky ogre"))
53
54  count = cf.Count()
55  fmt.Println(count) // count == 0
56
57  cf.Reset()    // reset
58}
59```
60
61## Documentation:
62["Cuckoo Filter on GoDoc"](http://godoc.org/github.com/seiflotfy/cuckoofilter)
63