Noteworthy changes in release 1.4.1 (8th May 2017)
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

This is primarily a security bug fix update.

* Fixed SECURITY issue with buffer overruns caused by malicious data. (#514)

* S3 support for non-Amazon AWS endpoints. (#506)

* Support for variant breakpoints in bcftools. (#516)

* Improved handling of BCF NaNs. (#485)

* Compilation / portability improvements. (#255, #423, #498, #488)

* Miscellaneous bug fixes (#482, #521, #522, #523, #524).

* Sanitise headers (#509)


Release 1.4 (13 March 2017)

* Incompatible changes: several functions and data types have been changed
  in this release, and the shared library soversion has been bumped to 2.

  - bam_pileup1_t has an additional field (which holds user data)
  - bam1_core_t has been modified to allow for >64K CIGAR operations
    and (along with bam1_t) so that CIGAR entries are aligned in memory
  - hopen() has vararg arguments for setting URL scheme-dependent options
  - the various tbx_conf_* presets are now const
  - auxiliary fields in bam1_t are now always stored in little-endian byte
    order (previously this depended on whether you read a BAM, SAM or
    CRAM file)
  - index metadata (accessible via hts_idx_get_meta()) is now always
    stored in little-endian byte order (previously this depended on
    whether the index was in tbi or csi format)
  - bam_aux2i() now returns an int64_t value
  - fai_load() will no longer save local copies of remote fasta indexes
  - hts_idx_get_meta() now takes a uint32_t * for l_meta (was int32_t *)

* HTSlib now links against libbz2 and liblzma by default.  To remove these
  dependencies, run configure with options --disable-bz2 and --disable-lzma,
  but note that this may make some CRAM files produced elsewhere unreadable.

* Added a thread pool interface and replaced the bgzf multi-threading
  code to use this pool.  BAM and CRAM decoding is now multi-threaded
  too, using the pool to automatically balance the number of threads
  between decode, encode and any data processing jobs.

* New errmod_cal(), probaln_glocal(), sam_cap_mapq(), and sam_prob_realn()
  functions, previously internal to SAMtools, have been added to HTSlib.

* Files can now be accessed via Google Cloud Storage using gs: URLs, when
  HTSlib is configured to use libcurl for network file access rather than
  the included basic knetfile networking.

* S3 file access now also supports the "host_base" setting in the
  $HOME/.s3cfg configuration file.

* Data URLs ("data:,text") now follow the standard RFC 2397 format and may
  be base64-encoded (when written as "data:;base64,text") or may include
  percent-encoded characters.  HTSlib's previous over-simplified "data:text"
  format is no longer supported -- you will need to add an initial comma.

* When plugins are enabled, S3 support is now provided by a separate
  hfile_s3 plugin rather than by hfile_libcurl itself as previously.
  When --enable-libcurl is used, by default both GCS and S3 support
  and plugins will also be built; they can be individually disabled
  via --disable-gcs and --disable-s3.

* The iRODS file access plugin has been moved to a separate repository.
  Configure no longer has a --with-irods option; instead build the plugin
  found at <https://github.com/samtools/htslib-plugins>.

* APIs to portably read and write (possibly unaligned) data in little-endian
  byte order have been added.

* New functions bam_auxB_len(), bam_auxB2i() and bam_auxB2f() have been
  added to make accessing array-type auxiliary data easier.  bam_aux2i()
  can now return the full range of values that can be stored in an integer
  tag (including unsigned 32 bit tags).  bam_aux2f() will return the value
  of integer tags (as a double) as well as floating-point ones.  All of
  the bam_aux2 and bam_auxB2 functions will set errno if the requested
  conversion is not valid.

* New functions fai_load3() and fai_build3() allow fasta indexes to be
  stored in a different location to the indexed fasta file.

* New functions bgzf_index_dump_hfile() and bgzf_index_load_hfile()
  allow bgzf index files (.gzi) to be written to / read from an existing
  hFILE handle.

* hts_idx_push() will report when trying to add a range to an index that
  is beyond the limits that the given index can handle.  This means trying
  to index chromosomes longer than 2^29 bases with a .bai or .tbi index
  will report an error instead of apparently working but creating an invalid
  index entry.

* VCF formatting is now approximately 4x faster.  (Whether this is
  noticeable depends on what was creating the VCF.)

* CRAM lossy_names mode now works with TLEN of 0 or TLEN within +/- 1
  of the computed value.  Note in these situations TLEN will be
  generated / fixed during CRAM decode.

* CRAM now supports bzip2 and lzma codecs.  Within htslib these are
  disabled by default, but can be enabled by specifying "use_bzip2" or
  "use_lzma" in an hts_opt_add() call or via the mode string of the
  hts_open_format() function.


Noteworthy changes in release 1.3.2 (13 September 2016)

* Corrected bin calculation when converting directly from CRAM to BAM.
  Previously a small fraction of converted reads would fail Picard's
  validation with "bin field of BAM record does not equal value computed"
  (SAMtools issue #574).

* Plugins can now signal to HTSlib which of RTLD_LOCAL and RTLD_GLOBAL
  they wish to be opened with -- previously they were always RTLD_LOCAL.


Noteworthy changes in release 1.3.1 (22 April 2016)

* Improved error checking and reporting, especially of I/O errors when
  writing output files (#17, #315, PR #271, PR #317).

* Build fixes for 32-bit systems; be sure to run configure to enable
  large file support and access to 2GiB+ files.

* Numerous VCF parsing fixes (#321, #322, #323, #324, #325; PR #370).
  Particular thanks to Kostya Kortchinsky of the Google Security Team
  for testing and numerous input parsing bug reports.

* HTSlib now prints an informational message when initially creating a
  CRAM reference cache in the default location under your $HOME directory.
  (No message is printed if you are using $REF_CACHE to specify a location.)

* Avoided rare race condition when caching downloaded CRAM reference sequence
  files, by using distinctive names for temporary files (in addition to O_EXCL,
  which has always been used).  Occasional corruption would previously occur
  when multiple tools were simultaneously caching the same reference sequences
  on an NFS filesystem that did not support O_EXCL (PR #320).

* Prevented race condition in file access plugin loading (PR #341).

* Fixed mpileup memory leak, so no more "[bam_plp_destroy] memory leak [...]
  Continue anyway" warning messages (#299).

* Various minor CRAM fixes.

* Fixed documentation problems #348 and #358.


Noteworthy changes in release 1.3 (15 December 2015)

* Files can now be accessed via HTTPS and Amazon S3 in addition to HTTP
  and FTP, when HTSlib is configured to use libcurl for network file access
  rather than the included basic knetfile networking.

* HTSlib can be built to use remote access hFILE backends (such as iRODS
  and libcurl) via a plugin mechanism.  This allows other backends to be
  easily added and facilitates building tools that use HTSlib, as they
  don't need to be linked with the backends' various required libraries.

* When writing CRAM output, sam_open() etc now default to writing CRAM v3.0
  rather than v2.1.

* fai_build() and samtools faidx now accept initial whitespace in ">"
  headers (e.g., "> chr1 description" is taken to refer to "chr1").

* tabix --only-header works again (was broken in 1.2.x; #249).

* HTSlib's configure script and Makefile now fully support the standard
  convention of allowing CC/CPPFLAGS/CFLAGS/LDFLAGS/LIBS to be overridden
  as needed.  Previously the Makefile listened to $(LDLIBS) instead; if you
  were overriding that, you should now override LIBS rather than LDLIBS.

* Fixed bugs #168, #172, #176, #197, #206, #225, #245, #265, #295, and #296.


Noteworthy changes in release 1.2.1 (3 February 2015)

* Reinstated hts_file_type() and FT_* macros, which were available until 1.1
  but briefly removed in 1.2.  This function is deprecated and will be removed
  in a future release -- you should use hts_detect_format() etc instead.


Noteworthy changes in release 1.2 (2 February 2015)

* HTSlib now has a configure script which checks your build environment
  and allows for selection of optional extras.  See INSTALL for details

* By default, reference sequences are fetched from the EBI CRAM Reference
  Registry and cached in your $HOME cache directory.  This behaviour can
  be controlled by setting REF_PATH and REF_CACHE environment variables
  (see the samtools(1) man page for details)

* Numerous CRAM improvements:
  - Support for CRAM v3.0, an upcoming revision to CRAM supporting
    better compression and per-container checksums
  - EOF checking for v2.1 and v3.0 (similar to checking BAM EOF blocks)
  - Non-standard values for PNEXT and TLEN fields are now preserved
  - hts_set_fai_filename() now provides a reference file when encoding
  - Generated read names are now numbered from 1, rather than being
    labelled 'slice:record-in-slice'
  - Multi-threading and speed improvements

* New htsfile command for identifying file formats, and corresponding
  file format detection APIs

* New tabix --regions FILE, --targets FILE options for filtering via BED files

* Optional iRODS file access, disabled by default.  Configure with --with-irods
  to enable accessing iRODS data objects directly via 'irods:DATAOBJ'

* All occurrences of 2^29 in the source have been eliminated, so indexing
  and querying against reference sequences larger than 512Mbp works (when
  using CSI indices)

* Support for plain GZIP compression in various places

* VCF header editing speed improvements

* Added seq_nt16_int[] (equivalent to the samtools API's bam_nt16_nt4_table)

* Reinstated faidx_fetch_nseq(), which was accidentally removed from 1.1.
  Now faidx_fetch_nseq() and faidx_nseq() are equivalent; eventually
  faidx_fetch_nseq() will be deprecated and removed [#156]

* Fixed bugs #141, #152, #155, #158, #159, and various memory leaks
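
Note on the 2^29 index limit mentioned in the 1.4 and 1.2 entries above:
the figure follows from the binning parameters of the index.  The sketch
below is an illustration only, not HTSlib's own code.  It assumes the
usual binning scheme in which the smallest bins span 2^min_shift bases
and each of the n_lvls levels above them is 8 times wider, with .bai and
.tbi fixed at min_shift = 14 and n_lvls = 5.  The function names
max_indexable_length() and range_is_indexable() are hypothetical helpers
invented for this example.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper (not an HTSlib function): the number of bases a
 * binning index with these parameters can address.  The smallest bins
 * span 2^min_shift bases and each level above widens the span by 2^3. */
static int64_t max_indexable_length(int min_shift, int n_lvls)
{
    /* Keep the shift within the range of a signed 64-bit integer. */
    assert(min_shift >= 0 && n_lvls >= 0 && min_shift + 3 * n_lvls < 63);
    return (int64_t)1 << (min_shift + 3 * n_lvls);
}

/* Hypothetical helper: returns 1 if the half-open range [beg, end)
 * lies within that limit, 0 otherwise. */
static int range_is_indexable(int64_t beg, int64_t end,
                              int min_shift, int n_lvls)
{
    return beg >= 0 && beg <= end
        && end <= max_indexable_length(min_shift, n_lvls);
}
```

With the .bai/.tbi parameters, max_indexable_length(14, 5) is 2^29
(536870912, i.e. 512Mbp).  A CSI index created with one extra level
(min_shift = 14, n_lvls = 6) would reach 2^32, which is why longer
reference sequences need CSI.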