liberasurecode
==============

liberasurecode is an Erasure Code API library written in C with pluggable Erasure Code backends.

----

Highlights
==========

 * Unified Erasure Coding interface for common storage workloads.

 * Pluggable Erasure Code backends - liberasurecode supports the following backends:

    - 'liberasurecode_rs_vand' - Native, software-only Erasure Coding implementation that supports a Reed-Solomon backend
    - 'Jerasure' - Erasure Coding library that supports Reed-Solomon, Cauchy backends [1]
    - 'ISA-L' - Intel Storage Acceleration Library - SIMD accelerated Erasure Coding backends [2]
    - 'SHSS' - NTT Lab Japan's hybrid Erasure Coding backend [4]
    - 'Flat XOR HD' - built-in to liberasurecode, based on [3]
    - 'NULL' template backend implemented to help future backend writers

 * True 'plugin' architecture - liberasurecode uses Dynamically Loaded (DL)
   libraries to realize a true 'plugin' architecture.  This also allows one to
   build liberasurecode independently of the Erasure Code backend libraries.

 * Cross-platform - liberasurecode is known to work on Linux (Fedora/Debian
   flavors), Solaris, BSD and Darwin/Mac OS X.

 * Community support - Developed alongside Erasure Code authority Kevin
   Greenan, liberasurecode is an actively maintained open-source project with
   growing community involvement (OpenStack Swift, Ceph, PyECLib, NTT Labs).

----

License
==========

liberasurecode is distributed under the terms of the **BSD** license.

----

Active Users
====================

 * PyECLib - Python EC library:  https://github.com/openstack/pyeclib
 * OpenStack Swift Object Store - https://wiki.openstack.org/wiki/Swift


----

Build and Install
=================

Install dependencies -

  Debian/Ubuntu hosts:

```sh
 $ sudo apt-get install build-essential autoconf automake libtool
```

  Fedora/RedHat/CentOS hosts:

```sh
 $ sudo yum install -y gcc make autoconf automake libtool
```

To build the liberasurecode repository, perform the following from the
top-level directory:

``` sh
 $ ./autogen.sh
 $ ./configure
 $ make
 $ make test
 $ sudo make install
```

----

liberasurecode API Definition
=============================

``` c

/* liberasurecode frontend API functions */

/**
 * Create a liberasurecode instance and return a descriptor
 * for use with EC operations (encode, decode, reconstruct)
 *
 * @param id - one of the supported backends as
 *        defined by ec_backend_id_t
 * @param args - arguments to the EC backend
 *        arguments common to all backends
 *          k - number of data fragments
 *          m - number of parity fragments
 *          w - word size, in bits
 *          hd - Hamming distance (=m for Reed-Solomon)
 *          ct - fragment checksum type (stored with the fragment metadata)
 *        backend-specific arguments
 *          null_args - arguments for the null backend
 *          flat_xor_hd, jerasure do not require any special args
 *
 * @return liberasurecode instance descriptor (int > 0)
 */
int liberasurecode_instance_create(const ec_backend_id_t id,
                                   struct ec_args *args);

/**
 * Close a liberasurecode instance
 *
 * @param desc - liberasurecode descriptor to close
 *
 * @return 0 on success, otherwise non-zero error code
 */
int liberasurecode_instance_destroy(int desc);


/**
 * Erasure encode a data buffer
 *
 * @param desc - liberasurecode descriptor/handle
 *        from liberasurecode_instance_create()
 * @param orig_data - data to encode
 * @param orig_data_size - length of data to encode
 * @param encoded_data - pointer to _output_ array (char **) of k data
 *        fragments (char *), allocated by the callee
 * @param encoded_parity - pointer to _output_ array (char **) of m parity
 *        fragments (char *), allocated by the callee
 * @param fragment_len - pointer to _output_ length of each fragment, assuming
 *        all fragments are the same length
 *
 * @return 0 on success, -error code otherwise
 */
int liberasurecode_encode(int desc,
        const char *orig_data, uint64_t orig_data_size, /* input */
        char ***encoded_data, char ***encoded_parity,   /* output */
        uint64_t *fragment_len);                        /* output */

/**
 * Cleanup structures allocated by liberasurecode_encode
 *
 * The caller has no context, so cannot safely free memory
 * allocated by liberasurecode, so it must pass the
 * deallocation responsibility back to liberasurecode.
 *
 * @param desc - liberasurecode descriptor/handle
 *        from liberasurecode_instance_create()
 * @param encoded_data - (char **) array of k data
 *        fragments (char *), allocated by liberasurecode_encode
 * @param encoded_parity - (char **) array of m parity
 *        fragments (char *), allocated by liberasurecode_encode
 *
 * @return 0 on success; -error otherwise
 */
int liberasurecode_encode_cleanup(int desc, char **encoded_data,
        char **encoded_parity);

/**
 * Reconstruct original data from a set of k encoded fragments
 *
 * @param desc - liberasurecode descriptor/handle
 *        from liberasurecode_instance_create()
 * @param available_fragments - erasure encoded fragments (>= k)
 * @param num_fragments - number of fragments being passed in
 * @param fragment_len - length of each fragment (assume they are the same)
 * @param force_metadata_checks - force fragment metadata checks (default: 0)
 * @param out_data - _output_ pointer to decoded data
 * @param out_data_len - _output_ length of decoded output
 *          (both output data pointers are allocated by liberasurecode,
 *           caller invokes liberasurecode_decode_cleanup() after it has
 *           read decoded data in 'out_data')
 *
 * @return 0 on success, -error code otherwise
 */
int liberasurecode_decode(int desc,
        char **available_fragments,                     /* input */
        int num_fragments, uint64_t fragment_len,       /* input */
        int force_metadata_checks,                      /* input */
        char **out_data, uint64_t *out_data_len);       /* output */

/**
 * Cleanup structures allocated by liberasurecode_decode
 *
 * The caller has no context, so cannot safely free memory
 * allocated by liberasurecode, so it must pass the
 * deallocation responsibility back to liberasurecode.
 *
 * @param desc - liberasurecode descriptor/handle
 *        from liberasurecode_instance_create()
 * @param data - (char *) buffer of data decoded by liberasurecode_decode
 *
 * @return 0 on success; -error otherwise
 */
int liberasurecode_decode_cleanup(int desc, char *data);

/**
 * Reconstruct a missing fragment from a subset of available fragments
 *
 * @param desc - liberasurecode descriptor/handle
 *        from liberasurecode_instance_create()
 * @param available_fragments - erasure encoded fragments
 * @param num_fragments - number of fragments being passed in
 * @param fragment_len - size in bytes of the fragments
 * @param destination_idx - missing idx to reconstruct
 * @param out_fragment - output of reconstruct
 *
 * @return 0 on success, -error code otherwise
 */
int liberasurecode_reconstruct_fragment(int desc,
        char **available_fragments,                     /* input */
        int num_fragments, uint64_t fragment_len,       /* input */
        int destination_idx,                            /* input */
        char* out_fragment);                            /* output */

/**
 * Return a list of lists with valid rebuild indexes given
 * a list of missing indexes.
 *
 * @param desc - liberasurecode instance descriptor (obtained with
 *        liberasurecode_instance_create)
 * @param fragments_to_reconstruct - list of indexes to reconstruct
 * @param fragments_to_exclude - list of indexes to exclude from
 *        reconstruction equation
 * @param fragments_needed - list of fragments needed to reconstruct
 *        fragments in fragments_to_reconstruct
 *
 * @return 0 on success, non-zero on error
 */
int liberasurecode_fragments_needed(int desc,
        int *fragments_to_reconstruct,
        int *fragments_to_exclude,
        int *fragments_needed);

```

Erasure Code Fragment Checksum Types Supported
----------------------------------------------

``` c

/* Checksum types supported for fragment metadata stored in each fragment */
typedef enum {
    CHKSUM_NONE                     = 0, /* "none" (default) */
    CHKSUM_CRC32                    = 1, /* "crc32" */
    CHKSUM_TYPES_MAX,
} ec_checksum_type_t;

```

Erasure Code Fragment Checksum API
----------------------------------

``` c

typedef struct __attribute__((__packed__))
fragment_metadata
{
    uint32_t    idx;                /* 4 */
    uint32_t    size;               /* 4 */
    uint32_t    frag_backend_metadata_size;    /* 4 */
    uint64_t    orig_data_size;     /* 8 */
    uint8_t     chksum_type;        /* 1 */
    uint32_t    chksum[LIBERASURECODE_MAX_CHECKSUM_LEN]; /* 32 */
    uint8_t     chksum_mismatch;    /* 1 */
    uint8_t     backend_id;         /* 1 */
    uint32_t    backend_version;    /* 4 */
} fragment_metadata_t;

#define FRAGSIZE_2_BLOCKSIZE(fragment_size) \
    ((fragment_size) - sizeof(fragment_header_t))

/**
 * Get opaque metadata for a fragment.  The metadata is opaque to the
 * client, but meaningful to the underlying library.  It is used to verify
 * stripes in liberasurecode_verify_stripe_metadata().
 *
 * @param fragment - fragment data pointer
 * @param fragment_metadata - pointer to allocated buffer of size at least
 *        sizeof(struct fragment_metadata) to hold fragment metadata struct
 *
 * @return 0 on success, non-zero on error
 */
//EDL: This needs to be implemented
int liberasurecode_get_fragment_metadata(char *fragment,
        fragment_metadata_t *fragment_metadata);

/**
 * Verify that the specified pointer points to a well-formed fragment that can
 * be processed by both this instance of liberasurecode and the specified
 * backend.
 *
 * @param desc - liberasurecode descriptor/handle
 *        from liberasurecode_instance_create()
 * @param fragment - fragment to verify
 *
 * @return 1 if fragment validation fails, 0 otherwise.
 */
int is_invalid_fragment(int desc, char *fragment);

/**
 * Verify a subset of fragments generated by encode()
 *
 * @param desc - liberasurecode descriptor/handle
 *        from liberasurecode_instance_create()
 * @param fragments - fragments part of the EC stripe to verify
 * @param num_fragments - number of fragments part of the EC stripe
 *
 * @return 1 if stripe checksum verification is successful, 0 otherwise
 */
int liberasurecode_verify_stripe_metadata(int desc,
        char **fragments, int num_fragments);

/* ==~=*=~===~=*=~==~=*=~== liberasurecode Helpers ==~*==~=*=~==~=~=*=~==~= */

/**
 * This computes the aligned size of a buffer passed into
 * the encode function.  The encode function must pad fragments
 * to be aligned with the word size (w) and the last fragment also
 * needs to be aligned.  This computes the sum of the aligned fragment
 * sizes for a given buffer to encode.
 *
 * @param desc - liberasurecode descriptor/handle
 *        from liberasurecode_instance_create()
 * @param data_len - original data length in bytes
 *
 * @return aligned length, or -error code on error
 */
int liberasurecode_get_aligned_data_size(int desc, uint64_t data_len);

/**
 * This will return the minimum encode size, which is the minimum
 * buffer size that can be encoded.
 *
 * @param desc - liberasurecode descriptor/handle
 *        from liberasurecode_instance_create()
 *
 * @return minimum data length, or -error code on error
 */
int liberasurecode_get_minimum_encode_size(int desc);

/**
 * This will return the fragment size, which is the length of each
 * fragment the backend will allocate when encoding.
 *
 * @param desc - liberasurecode descriptor/handle
 *        from liberasurecode_instance_create()
 * @param data_len - original data length in bytes
 *
 * @return fragment size, i.e. sizeof(fragment_header) + size
 *         + frag_backend_metadata_size
 */
int liberasurecode_get_fragment_size(int desc, int data_len);
```

----

Code organization
=================
```
 |-- include
 |   +-- erasurecode
 |   |   +-- erasurecode.h            --> liberasurecode frontend API header
 |   |   +-- erasurecode_backend.h    --> liberasurecode backend API header
 |   +-- xor_codes                    --> headers for the built-in XOR codes
 |
 |-- src
 |   |-- erasurecode.c                --> liberasurecode API implementation
 |   |                                    (frontend + backend)
 |   |-- backends
 |   |   +-- null
 |   |       +-- null.c               --> 'null' erasure code backend (template backend)
 |   |   +-- xor
 |   |       +-- flat_xor_hd.c        --> 'flat_xor_hd' erasure code backend (built-in)
 |   |   +-- jerasure
 |   |       +-- jerasure_rs_cauchy.c --> 'jerasure_rs_cauchy' erasure code backend (jerasure.org)
 |   |       +-- jerasure_rs_vand.c   --> 'jerasure_rs_vand' erasure code backend (jerasure.org)
 |   |   +-- isa-l
 |   |       +-- isa_l_rs_vand.c      --> 'isa_l_rs_vand' erasure code backend (Intel)
 |   |   +-- shss
 |   |       +-- shss.c               --> 'shss' erasure code backend (NTT Labs)
 |   |
 |   |-- builtin
 |   |   +-- xor_codes                --> XOR HD code backend, built-in erasure
 |   |   |                                code implementation (shared library)
 |   |       +-- xor_code.c
 |   |       +-- xor_hd_code.c
 |   |   +-- rs_vand                  --> liberasurecode native Reed-Solomon codes
 |   |
 |   +-- utils
 |       +-- chksum                   --> fragment checksum utils for erasure
 |           +-- alg_sig.c                coded fragments
 |           +-- crc32.c
 |
 |-- doc                              --> API Documentation
 |   +-- Doxyfile
 |   +-- html
 |
 |-- test                             --> Test routines
 |   +-- builtin
 |   |   +-- xor_codes
 |   +-- liberasurecode_test.c
 |   +-- utils
 |
 |-- autogen.sh
 |-- configure.ac
 |-- Makefile.am
 |-- README
 |-- NEWS
 |-- COPYING
 |-- AUTHORS
 |-- INSTALL
 +-- ChangeLog
```
----

References
==========

 [1] Jerasure, C library that supports erasure coding in storage applications, http://jerasure.org

 [2] Intel(R) Storage Acceleration Library (Open Source Version), https://01.org/intel%C2%AE-storage-acceleration-library-open-source-version

 [3] Greenan, Kevin M. et al., "Flat XOR-based erasure codes in storage systems", http://www.kaymgee.com/Kevin_Greenan/Publications_files/greenan-msst10.pdf

 [4] Kota Tsuyuzaki <tsuyuzaki.kota@lab.ntt.co.jp>, "NTT SHSS Erasure Coding backend"
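
----

Example: encode and reconstruct in miniature
============================================

The API definition describes encode as producing k data fragments plus m parity fragments of equal length, and reconstruct as rebuilding one missing fragment from the others. The sketch below illustrates that idea with the simplest possible code: k data fragments and a single XOR parity fragment (m = 1). It is not liberasurecode's implementation and it does not call the library. Every `sketch_*` name is hypothetical, the fragments carry no metadata header or checksum, and the padding rule (zero-fill to `ceil(data_len / k)`) is an assumption made for illustration only.

``` c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* All sketch_* names are hypothetical; none belong to liberasurecode. */

/* Length of each fragment when data_len bytes are split into k equal,
 * zero-padded pieces: ceil(data_len / k). */
static size_t sketch_fragment_len(size_t data_len, int k)
{
    assert(k > 0);
    return (data_len + (size_t)k - 1) / (size_t)k;
}

/* Free the first n fragments of an array and the array itself. */
static void sketch_free(char **frags, int n)
{
    if (frags == NULL)
        return;
    for (int i = 0; i < n; i++)
        free(frags[i]);
    free(frags);
}

/* Split data into k data fragments followed by one XOR parity fragment
 * (index k).  Returns k + 1 buffers of *fragment_len bytes each, or NULL
 * if an allocation fails. */
static char **sketch_encode(const char *data, size_t data_len, int k,
                            size_t *fragment_len)
{
    size_t len = sketch_fragment_len(data_len, k);
    char **frags = calloc((size_t)k + 1, sizeof(*frags));
    if (frags == NULL)
        return NULL;

    for (int i = 0; i <= k; i++) {
        frags[i] = calloc(len + 1, 1);      /* +1 avoids a zero-byte request */
        if (frags[i] == NULL) {
            sketch_free(frags, i);          /* release what was allocated so far */
            return NULL;
        }
    }

    for (int i = 0; i < k; i++) {
        size_t off = (size_t)i * len;
        size_t n = off < data_len ? data_len - off : 0;
        if (n > len)
            n = len;
        if (n > 0)
            memcpy(frags[i], data + off, n); /* tail stays zero-padded */
        for (size_t b = 0; b < len; b++)
            frags[k][b] ^= frags[i][b];      /* accumulate XOR parity */
    }

    *fragment_len = len;
    return frags;
}

/* Rebuild the fragment at index `missing` (0..k) into `out` by XOR-ing the
 * other k fragments.  frags[missing] is never read. */
static void sketch_reconstruct(char **frags, int k, size_t fragment_len,
                               int missing, char *out)
{
    memset(out, 0, fragment_len);
    for (int i = 0; i <= k; i++) {
        if (i == missing)
            continue;
        for (size_t b = 0; b < fragment_len; b++)
            out[b] ^= frags[i][b];
    }
}
```

A caller would run `sketch_encode()` once, treat any single fragment as lost, and pass the remaining ones to `sketch_reconstruct()` to get its bytes back. With one parity fragment only one loss can be tolerated. The Reed-Solomon and flat XOR backends listed under Highlights exist to survive more, and the real entry points for that are `liberasurecode_encode()` and `liberasurecode_reconstruct_fragment()`.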