# crc32
CRC32 hash with x64 optimizations

This package is a drop-in replacement for the standard library `hash/crc32` package. It features SSE 4.2 optimizations on x64 platforms, for a speedup of roughly 10x.

[![Build Status](https://travis-ci.org/klauspost/crc32.svg?branch=master)](https://travis-ci.org/klauspost/crc32)

# usage

Install using `go get github.com/klauspost/crc32`. This library is based on Go 1.5 code and requires Go 1.3 or newer.

Replace `import "hash/crc32"` with `import "github.com/klauspost/crc32"` and you are good to go.
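
As a minimal sketch of what calling code looks like, the example below uses the standard library import so that it runs without extra dependencies. Assuming the drop-in claim above holds, changing the import path should be the only edit needed. The helper names `checksumIEEE` and `checksumChunks` are made up for illustration.

```go
package main

import (
	"fmt"
	"hash/crc32" // per the text above, swap for "github.com/klauspost/crc32"
)

// checksumIEEE hashes a byte slice in one call using the IEEE polynomial.
func checksumIEEE(data []byte) uint32 {
	return crc32.ChecksumIEEE(data)
}

// checksumChunks feeds the data piece by piece through the hash.Hash32
// interface, which is how streaming input would typically be hashed.
func checksumChunks(chunks ...[]byte) uint32 {
	h := crc32.NewIEEE()
	for _, c := range chunks {
		h.Write(c) // Write on a hash.Hash never returns an error
	}
	return h.Sum32()
}

func main() {
	// "123456789" is the usual CRC test input; its IEEE CRC-32 is cbf43926.
	fmt.Printf("%08x\n", checksumIEEE([]byte("123456789")))
	fmt.Printf("%08x\n", checksumChunks([]byte("1234"), []byte("56789")))
}
```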

# changes
* Oct 20, 2016: Changes have been merged to upstream Go. Package updated to match.
* Dec 4, 2015: Uses the "slice-by-8" trick more extensively, which gives a 1.5 to 2.5x speedup if assembler is unavailable.
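
For readers unfamiliar with "slice-by-8", here is a rough sketch of the general technique: eight lookup tables let the loop consume eight input bytes per iteration instead of one. It is not taken from this package's source. The names `slicing8Table`, `makeSlicing8Table` and `updateSlicing8` are illustrative, and the result is checked against the standard library in `main`.

```go
package main

import (
	"fmt"
	"hash/crc32"
)

// slicing8Table holds eight 256-entry tables. Table 0 is the classic
// byte-at-a-time table; table j advances a byte through j further zero bytes.
type slicing8Table [8][256]uint32

// makeSlicing8Table builds the tables for a reversed (LSB-first) polynomial.
func makeSlicing8Table(poly uint32) *slicing8Table {
	t := new(slicing8Table)
	for i := 0; i < 256; i++ {
		crc := uint32(i)
		for j := 0; j < 8; j++ {
			if crc&1 == 1 {
				crc = (crc >> 1) ^ poly
			} else {
				crc >>= 1
			}
		}
		t[0][i] = crc
	}
	for i := 0; i < 256; i++ {
		crc := t[0][i]
		for j := 1; j < 8; j++ {
			crc = t[0][crc&0xFF] ^ (crc >> 8)
			t[j][i] = crc
		}
	}
	return t
}

// updateSlicing8 folds p into crc, eight bytes per iteration, then finishes
// any remaining tail bytes with the plain one-byte-at-a-time loop.
func updateSlicing8(crc uint32, tab *slicing8Table, p []byte) uint32 {
	crc = ^crc
	for len(p) >= 8 {
		crc ^= uint32(p[0]) | uint32(p[1])<<8 | uint32(p[2])<<16 | uint32(p[3])<<24
		crc = tab[0][p[7]] ^ tab[1][p[6]] ^ tab[2][p[5]] ^ tab[3][p[4]] ^
			tab[4][crc>>24] ^ tab[5][(crc>>16)&0xFF] ^
			tab[6][(crc>>8)&0xFF] ^ tab[7][crc&0xFF]
		p = p[8:]
	}
	for _, b := range p {
		crc = tab[0][byte(crc)^b] ^ (crc >> 8)
	}
	return ^crc
}

func main() {
	tab := makeSlicing8Table(crc32.IEEE)
	data := []byte("The quick brown fox jumps over the lazy dog")
	// Both columns should print the same value.
	fmt.Printf("%08x %08x\n", updateSlicing8(0, tab, data), crc32.ChecksumIEEE(data))
}
```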

# performance

For *Go 1.7* performance is equivalent to the standard library. So if you use this package for Go 1.7 you can switch back.

For IEEE tables (the most common), there is approximately a factor 10 speedup with the "CLMUL" (carry-less multiplication) instruction:
```
benchmark            old ns/op     new ns/op     delta
BenchmarkCrc32KB     99955         10258         -89.74%

benchmark            old MB/s     new MB/s     speedup
BenchmarkCrc32KB     327.83       3194.20      9.74x
```
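
The columns above are related by simple arithmetic, sketched below. This assumes the `32KB` in the benchmark name means a 32768-byte buffer and that 1 MB is 10^6 bytes. The helper names are illustrative, and results computed from the rounded ns/op figures may differ slightly from the MB/s column.

```go
package main

import "fmt"

// deltaPercent is the relative change in ns/op, as in the "delta" column.
func deltaPercent(oldNs, newNs float64) float64 {
	return (newNs - oldNs) / oldNs * 100
}

// throughputMBps converts bytes per operation and ns per operation to MB/s.
// bytes/ns equals GB/s, so multiplying by 1e3 gives MB/s.
func throughputMBps(bytesPerOp int, nsPerOp float64) float64 {
	return float64(bytesPerOp) / nsPerOp * 1e3
}

func main() {
	const size = 32 << 10 // assumed buffer size in bytes
	oldNs, newNs := 99955.0, 10258.0

	fmt.Printf("delta:   %.2f%%\n", deltaPercent(oldNs, newNs))
	fmt.Printf("old:     %.2f MB/s\n", throughputMBps(size, oldNs))
	fmt.Printf("new:     %.2f MB/s\n", throughputMBps(size, newNs))
	fmt.Printf("speedup: %.2fx\n", oldNs/newNs)
}
```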

For other tables and "CLMUL" capable machines the performance is the same as the standard library.
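
"Other tables" here means any polynomial other than IEEE. A minimal sketch of selecting the Castagnoli table through the standard `hash/crc32` API, which this package is said to mirror, might look like this (the helper name `checksumCastagnoli` is illustrative):

```go
package main

import (
	"fmt"
	"hash/crc32" // per the usage section, swap for "github.com/klauspost/crc32"
)

// Build the table once and reuse it; MakeTable is not free.
var castagnoliTable = crc32.MakeTable(crc32.Castagnoli)

// checksumCastagnoli computes CRC-32C over data.
func checksumCastagnoli(data []byte) uint32 {
	return crc32.Checksum(data, castagnoliTable)
}

func main() {
	data := []byte("123456789")
	fmt.Printf("IEEE:       %08x\n", crc32.ChecksumIEEE(data))
	fmt.Printf("Castagnoli: %08x\n", checksumCastagnoli(data))
}
```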

Here are some detailed benchmarks, compared to the Go 1.5 standard library with and without assembler enabled.

```
Std:   Standard Go 1.5 library
Crc:   Indicates IEEE type CRC.
40B:   Size of each slice encoded.
NoAsm: Assembler was disabled (i.e. not an AMD64 or SSE 4.2+ capable machine).
Castagnoli: Castagnoli CRC type.

BenchmarkStdCrc40B-4            10000000               158 ns/op         252.88 MB/s
BenchmarkCrc40BNoAsm-4          20000000               105 ns/op         377.38 MB/s (slice8)
BenchmarkCrc40B-4               20000000               105 ns/op         378.77 MB/s (slice8)

BenchmarkStdCrc1KB-4              500000              3604 ns/op         284.10 MB/s
BenchmarkCrc1KBNoAsm-4           1000000              1463 ns/op         699.79 MB/s (slice8)
BenchmarkCrc1KB-4                3000000               396 ns/op        2583.69 MB/s (asm)

BenchmarkStdCrc8KB-4              200000             11417 ns/op         717.48 MB/s (slice8)
BenchmarkCrc8KBNoAsm-4            200000             11317 ns/op         723.85 MB/s (slice8)
BenchmarkCrc8KB-4                 500000              2919 ns/op        2805.73 MB/s (asm)

BenchmarkStdCrc32KB-4              30000             45749 ns/op         716.24 MB/s (slice8)
BenchmarkCrc32KBNoAsm-4            30000             45109 ns/op         726.42 MB/s (slice8)
BenchmarkCrc32KB-4                100000             11497 ns/op        2850.09 MB/s (asm)

BenchmarkStdNoAsmCastagnol40B-4 10000000               161 ns/op         246.94 MB/s
BenchmarkStdCastagnoli40B-4     50000000              28.4 ns/op        1410.69 MB/s (asm)
BenchmarkCastagnoli40BNoAsm-4   20000000               100 ns/op         398.01 MB/s (slice8)
BenchmarkCastagnoli40B-4        50000000              28.2 ns/op        1419.54 MB/s (asm)

BenchmarkStdNoAsmCastagnoli1KB-4  500000              3622 ns/op        282.67 MB/s
BenchmarkStdCastagnoli1KB-4     10000000               144 ns/op        7099.78 MB/s (asm)
BenchmarkCastagnoli1KBNoAsm-4    1000000              1475 ns/op         694.14 MB/s (slice8)
BenchmarkCastagnoli1KB-4        10000000               146 ns/op        6993.35 MB/s (asm)

BenchmarkStdNoAsmCastagnoli8KB-4  50000              28781 ns/op         284.63 MB/s
BenchmarkStdCastagnoli8KB-4      1000000              1029 ns/op        7957.89 MB/s (asm)
BenchmarkCastagnoli8KBNoAsm-4     200000             11410 ns/op         717.94 MB/s (slice8)
BenchmarkCastagnoli8KB-4         1000000              1000 ns/op        8188.71 MB/s (asm)

BenchmarkStdNoAsmCastagnoli32KB-4  10000            115426 ns/op         283.89 MB/s
BenchmarkStdCastagnoli32KB-4      300000              4065 ns/op        8059.13 MB/s (asm)
BenchmarkCastagnoli32KBNoAsm-4     30000             45171 ns/op         725.41 MB/s (slice8)
BenchmarkCastagnoli32KB-4         500000              4077 ns/op        8035.89 MB/s (asm)
```
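
To get comparable numbers on your own machine, something along these lines could be used. This is a sketch built on `testing.Benchmark` from the standard library, not the benchmark code that produced the table above. The `measure` helper is hypothetical, and only two of the four buffer sizes are run to keep it quick.

```go
package main

import (
	"fmt"
	"hash/crc32" // swap the import path to measure another implementation
	"testing"
)

// sink keeps the compiler from discarding the checksum call.
var sink uint32

// measure times crc32.Checksum over a zero-filled buffer of the given size
// for the given reversed polynomial, returning ns/op and MB/s (1 MB = 1e6 B).
func measure(size int, poly uint32) (nsPerOp int64, mbPerSec float64) {
	tab := crc32.MakeTable(poly)
	buf := make([]byte, size)
	r := testing.Benchmark(func(b *testing.B) {
		b.SetBytes(int64(size))
		for i := 0; i < b.N; i++ {
			sink = crc32.Checksum(buf, tab)
		}
	})
	return r.NsPerOp(), float64(size) * float64(r.N) / r.T.Seconds() / 1e6
}

func main() {
	polys := []struct {
		name string
		poly uint32
	}{
		{"IEEE", crc32.IEEE},
		{"Castagnoli", crc32.Castagnoli},
	}
	for _, p := range polys {
		for _, size := range []int{40, 32 << 10} {
			ns, mb := measure(size, p.poly)
			fmt.Printf("%-10s %6d B %8d ns/op %10.2f MB/s\n", p.name, size, ns, mb)
		}
	}
}
```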

The IEEE assembler optimizations have been submitted and will be part of the Go 1.6 standard library.

However, the improved use of slice-by-8 has not, but will probably be submitted for Go 1.7.

# license

Standard Go license. Changes are Copyright (c) 2015 Klaus Post under the same conditions.