1Release 0.7.17 (23 October 2017)
2--------------------------------
3
This release adds option -q to preserve the mapping quality of a split alignment
with a lower alignment score than the primary alignment. Option -5
automatically applies -q as well.
7
8(0.7.17: 23 October 2017, r1188)
9
10
11
12Release 0.7.16 (30 July 2017)
13-----------------------------
14
15This release added a couple of minor features and incorporated multiple pull
16requests, including:
17
18 * Added option -5, which is useful to some Hi-C pipelines.
19
20 * Fixed an error with samtools sorting (#129). Updated download link for
21   GRCh38 (#123). Fixed README MarkDown formatting (#70). Addressed multiple
22   issues via a collected pull request #139 by @jmarshall. Avoid malformatted
23   SAM header when -R is used with TAB (#84). Output mate CIGAR (#138).
24
25(0.7.16: 30 July 2017, r1180)
26
27
28
29Release 0.7.15 (31 May 2016)
30----------------------------
31
32Fixed a long existing bug which potentially leads to underestimated insert size
33upper bound. This bug should have little effect in practice.
34
35(0.7.15: 31 May 2016, r1140)
36
37
38
39Release 0.7.14 (4 May 2016)
40---------------------------
41
42In the ALT mapping mode, this release adds the "AH:*" header tag to SQ lines
43corresponding to alternate haplotypes.
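
As a minimal sketch of how a downstream script might use this tag, the Python
snippet below collects the names of @SQ header lines that carry an AH field.
The function name `alt_contigs` and the two sample header lines are
illustrative assumptions, not output captured from BWA.

```python
def alt_contigs(sam_header_lines):
    """Return the SN names of @SQ lines that carry an AH tag."""
    names = []
    for line in sam_header_lines:
        fields = line.rstrip("\n").split("\t")
        if fields[0] != "@SQ":
            continue
        # Header fields look like "SN:chr1"; split on the first colon only.
        tags = dict(f.split(":", 1) for f in fields[1:] if ":" in f)
        if "AH" in tags:
            names.append(tags.get("SN"))
    return names


# Hypothetical header lines, for illustration only.
header = [
    "@HD\tVN:1.5\tSO:unsorted",
    "@SQ\tSN:chr1\tLN:1000",
    "@SQ\tSN:chr1_example_alt\tLN:500\tAH:*",
]
print(alt_contigs(header))
```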
44
45(0.7.14: 4 May 2016, r1136)
46
47
48
Release 0.7.13 (23 February 2016)
---------------------------------
51
52This release fixes a few minor bugs in the previous version and adds a few
53minor features. All BWA algorithms should produce identical output to 0.7.12
54when there are no ALT contigs.
55
56Detailed changes:
57
58 * Fixed a bug in "bwa-postalt.js". The old version may produce 0.5% of wrong
59   bases for reads mapped to the ALT contigs.
60
61 * Fixed a potential bug in the multithreading mode. It may occur when mapping
62   is much faster than file reading, which should almost never happen in
63   practice.
64
65 * Changed the download URL of GRCh38.
66
67 * Removed the read overlap mode. It is not working well.
68
69 * Added the ropebwt2 algorithm as an alternative to index large genomes.
70   Ropebwt2 is slower than the "bwtsw" algorithm, but it has a permissive
71   license. This allows us to create an Apache2-licensed BWA (in the "Apache2"
72   branch) for commercial users who are concerned with GPL.
73
(0.7.13: 23 February 2016, r1126)
75
76
77
78Release 0.7.12 (28 December 2014)
79---------------------------------
80
This release fixed a bug in the paired-end mode when ALT contigs are present. It
leads to undercalling in regions overlapping ALT contigs.
83
84(0.7.12: 28 December 2014, r1039)
85
86
87
88Release 0.7.11 (23 December, 2014)
89----------------------------------
90
91A major change to BWA-MEM is the support of mapping to ALT contigs in addition
92to the primary assembly. Part of the ALT mapping strategy is implemented in
93BWA-MEM and the rest in a postprocessing script for now. Due to the extra
94layer of complexity on generating the reference genome and on the two-step
95mapping, we start to provide a wrapper script and precompiled binaries since
96this release. The package may be more convenient to some specific use cases.
97For general uses, the single BWA binary still works like the old way.
98
Another major addition to BWA-MEM is HLA typing, which is made possible by the
new ALT mapping strategy. Necessary data and programs are included in the
binary release. The wrapper script also optionally performs HLA typing when HLA
genes are included in the reference genome as additional ALT contigs.
103
104Other notable changes to BWA-MEM:
105
106 * Added option `-b` to `bwa index`. This option tunes the batch size used in
107   the construction of BWT. It is advised to use large `-b` for huge reference
108   sequences such as the BLAST *nt* database.
109
110 * Optimized for PacBio data. This includes a change to scoring based on a
111   study done by Aaron Quinlan and a heuristic speedup. Further speedup is
112   possible, but needs more careful investigation.
113
114 * Dropped PacBio read-to-read alignment for now. BWA-MEM is good for finding
115   the best hit, but is not very sensitive to suboptimal hits. Option `-x pbread`
116   is still available, but hidden on the command line. This may be removed in
117   future releases.
118
119 * Added a new pre-setting for Oxford Nanopore 2D reads. LAST is still a little
120   more sensitive on older bacterial data, but bwa-mem is as good on more
121   recent data and is times faster for mapping against mammalian genomes.
122
123 * Added LAST-like seeding. This improves the accuracy for longer reads.
124
125 * Added option `-H` to insert arbitrary header lines.
126
 * Smarter option `-p`. Given an interleaved FASTQ stream, old bwa-mem identifies
   the 2i-th and (2i+1)-th reads as a read pair. The new version identifies
   adjacent reads with the same read name as a read pair. It is possible to mix
   single-end and paired-end reads in one FASTQ.
131
132 * Improved parallelization. Old bwa-mem waits for I/O. The new version puts
133   I/O on a separate thread. It performs mapping while reading FASTQ and
134   writing SAM. This saves significant wall-clock time when reading from
135   or writing to a slow Unix pipe.
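
The name-based pairing rule for `-p` described above can be sketched in Python
as follows. This is only an illustration of the idea, not the code used in
bwa-mem: the function name `group_interleaved` and the record layout
`(name, sequence)` are assumptions, and read names are assumed to have any
`/1` or `/2` suffix already removed.

```python
def group_interleaved(records):
    """Group adjacent records that share a name into pairs.

    `records` is an iterable of (name, sequence) tuples in file order.
    Returns a list of tuples: 2-tuples for pairs, 1-tuples for single-end reads.
    """
    groups = []
    pending = None  # the previous record, still waiting for a possible mate
    for rec in records:
        if pending is not None and pending[0] == rec[0]:
            groups.append((pending, rec))  # same name as neighbour: a pair
            pending = None
        else:
            if pending is not None:
                groups.append((pending,))  # no mate followed: single-end
            pending = rec
    if pending is not None:
        groups.append((pending,))
    return groups


# Mixed single-end and paired-end reads in one stream.
reads = [("r1", "ACGT"), ("r1", "TTGA"), ("r2", "GGCC"), ("r3", "AAAA"), ("r3", "CCCC")]
for group in group_interleaved(reads):
    print(len(group), [name for name, _ in group])
```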
136
137With the new release, the recommended way to map Illumina reads to GRCh38 is to
138use the bwakit binary package:
139
140    bwa.kit/run-gen-ref hs38DH
141    bwa.kit/bwa index hs38DH.fa
142    bwa.kit/run-bwamem -t8 -H -o out-prefix hs38DH.fa read1.fq.gz read2.fq.gz | sh
143
144Please check bwa.kit/README.md for details and command line options.
145
146(0.7.11: 23 December 2014, r1034)
147
148
149
150Release 0.7.10 (13 July, 2014)
151------------------------------
152
153Notable changes to BWA-MEM:
154
155 * Fixed a segmentation fault due to an alignment bridging the forward-reverse
156   boundary. This is a bug.
157
158 * Use the PacBio heuristic to map contigs to the reference genome. The old
159   heuristic evaluates the necessity of full extension for each chain. This may
160   not work in long low-complexity regions. The PacBio heuristic performs
161   SSE2-SW around each short seed. It works better. Note that the heuristic is
162   only applied to long query sequences. For Illumina reads, the output is
163   identical to the previous version.
164
165(0.7.10: 13 July 2014, r789)
166
167
168
169Release 0.7.9 (19 May, 2014)
170----------------------------
171
172This release brings several major changes to BWA-MEM. Notably, BWA-MEM now
173formally supports PacBio read-to-reference alignment and experimentally supports
174PacBio read-to-read alignment. BWA-MEM also runs faster at a minor cost of
175accuracy. The speedup is more significant when GRCh38 is in use. More
176specifically:
177
178 * Support PacBio subread-to-reference alignment. Although older BWA-MEM works
179   with PacBio data in principle, the resultant alignments are frequently
180   fragmented. In this release, we fine tuned existing methods and introduced
181   new heuristics to improve PacBio alignment. These changes are not used by
182   default. Users need to add option "-x pacbio" to enable the feature.
183
184 * Support PacBio subread-to-subread alignment (EXPERIMENTAL). This feature is
185   enabled with option "-x pbread". In this mode, the output only gives the
186   overlapping region between a pair of reads without detailed alignment.
187
188 * Output alternative hits in the XA tag if there are not so many of them. This
189   is a BWA-backtrack feature.
190
191 * Support mapping to ALT contigs in GRCh38 (EXPERIMENTAL). We provide a script
192   to postprocess hits in the XA tag to adjust the mapping quality and generate
193   new primary alignments to all overlapping ALT contigs. We would *NOT*
194   recommend this feature for production uses.
195
196 * Improved alignments to many short reference sequences. Older BWA-MEM may
197   generate an alignment bridging two or more adjacent reference sequences.
198   Such alignments are split at a later step as postprocessing. This approach
199   is complex and does not always work. This release forbids these alignments
200   from the very beginning. BWA-MEM should not produce an alignment bridging
201   two or more reference sequences any more.
202
203 * Reduced the maximum seed occurrence from 10000 to 500. Reduced the maximum
204   rounds of Smith-Waterman mate rescue from 100 to 50. Added a heuristic to
205   lower the mapping quality if a read contains seeds with excessive
206   occurrences. These changes make BWA-MEM faster at a minor cost of accuracy
207   in highly repetitive regions.
208
209 * Added an option "-Y" to use soft clipping for supplementary alignments.
210
211 * Bugfix: incomplete alignment extension in corner cases.
212
213 * Bugfix: integer overflow when aligning long query sequences.
214
215 * Bugfix: chain score is not computed correctly (almost no practical effect)
216
217 * General code cleanup
218
219 * Added FAQs to README
220
221Changes in BWA-backtrack:
222
223 * Bugfix: a segmentation fault when an alignment stands out of the end of the
224   last chromosome.
225
226(0.7.9: 19 May 2014, r783)
227
228
229
230Release 0.7.8 (31 March, 2014)
231------------------------------
232
233Changes in BWA-MEM:
234
235 * Bugfix: off-diagonal X-dropoff (option -d) not working as intended.
236   Short-read alignment is not affected.
237
 * Bugfix: unnecessarily large bandwidth used during global alignment,
   which reduces the mapping speed by ~5% for short reads. Results are not
   affected.
241
242 * Bugfix: when the matching score is not one, paired-end mapping quality is
243   inaccurate.
244
245 * When the matching score (option -A) is changed, scale all score-related
246   options accordingly unless overridden by users.
247
 * Allow specifying different gap open (or extension) penalties for deletions
   and insertions separately.

 * Allow specifying the insert size distribution.
252
253 * Better and more detailed debugging information.
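
The "scale unless overridden" behaviour for option -A can be illustrated with
a short Python sketch. The option letters and base values in `BASE` are
placeholders chosen for the example and are not taken from this document; only
the rule itself (multiply score-related defaults by the matching score, but
keep anything the user set explicitly) comes from the note above.

```python
# Hypothetical score-related defaults, expressed for a matching score of 1.
BASE = {"B": 4, "O": 6, "E": 1, "L": 5}


def scaled_options(match_score, user_set):
    """Scale defaults by match_score, keeping values the user set explicitly."""
    opts = {"A": match_score}
    for name, base in BASE.items():
        if name in user_set:
            opts[name] = user_set[name]       # explicit value wins, unscaled
        else:
            opts[name] = base * match_score   # default follows the match score
    return opts


print(scaled_options(2, {"B": 5}))
```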
254
255With the default setting, 0.7.8 and 0.7.7 gave identical output on one million
256100bp read pairs.
257
258(0.7.8: 31 March 2014, r455)
259
260
261
Release 0.7.7 (25 February, 2014)
---------------------------------
264
265This release fixes incorrect MD tags in the BWA-MEM output.
266
A note about short-read mapping to GRCh38. The new human reference genome
GRCh38 contains 60Mbp of program-generated alpha repeat arrays, some of which
are hard masked as they cannot be localized. These highly repetitive arrays make
BWA-MEM ~50% slower. If you are concerned with the performance of BWA-MEM, you
may consider using option "-c2000 -m50". On simulated data, this setting helps
the performance at a very minor cost in accuracy. I may consider changing the
default in future releases.
274
(0.7.7: 25 February 2014, r441)
276
277
278
Release 0.7.6 (31 January, 2014)
--------------------------------
281
282Changes in BWA-MEM:
283
284 * Changed the way mapping quality is estimated. The new method tends to give
285   the same alignment a higher mapping quality. On paired-end reads, the change
286   is minor as with pairing, the mapping quality is usually high. For short
287   single-end reads, the difference is considerable.
288
289 * Improved load balance when many threads are spawned. However, bwa-mem is
290   still not very thread efficient, probably due to the frequent heap memory
291   allocation. Further improvement is a little difficult and may affect the
292   code stability.
293
294 * Allow to use different clipping penalties for 5'- and 3'-ends. This helps
295   when we do not want to clip one end.
296
297 * Print the @PG line, including the command line options.
298
 * Improved the band width estimate: a) fixed a bug causing the band
   width estimated from extension not to be used in the final global
   alignment; b) try doubled band width if the global alignment score is
   smaller. Insufficient band width leads to wrong CIGAR and spurious
   mismatches/indels.
303
304 * Added a new option -D to fine tune a heuristic on dropping suboptimal hits.
305   Reducing -D increases accuracy but decreases the mapping speed. If unsure,
306   leave it to the default.
307
308 * Bugfix: for a repetitive single-end read, the reported hit is not randomly
309   distributed among equally best hits.
310
311 * Bugfix: missing paired-end hits due to unsorted list of SE hits.
312
313 * Bugfix: incorrect CIGAR caused by a defect in the global alignment.
314
315 * Bugfix: incorrect CIGAR caused by failed SW rescue.
316
317 * Bugfix: alignments largely mapped to the same position are regarded to be
318   distinct from each other, which leads to underestimated mapping quality.
319
320 * Added the MD tag.
321
There are no changes to BWA-backtrack in this release. However, it has a few
known issues yet to be fixed. If you prefer BWA-backtrack, it is still advised
to use bwa-0.6.x.
325
326While I developed BWA-MEM, I also found a few issues with BWA-SW. It is now
327possible to improve BWA-SW with the lessons learned from BWA-MEM. However, as
328BWA-MEM is usually better, I will not improve BWA-SW until I find applications
329where BWA-SW may excel.
330
331(0.7.6: 31 January 2014, r432)
332
333
334
335Release 0.7.5a (30 May, 2013)
336-----------------------------
337
338Fixed a bug in BWA-backtrack which leads to off-by-one mapping errors in rare
339cases.
340
341(0.7.5a: 30 May 2013, r405)
342
343
344
345Release 0.7.5 (29 May, 2013)
346----------------------------
347
348Changes in all components:
349
350 * Improved error checking on memory allocation and file I/O. Patches provided
351   by Rob Davies.
352
353 * Updated README.
354
355 * Bugfix: return code is zero upon errors.
356
357Changes in BWA-MEM:
358
359 * Changed the way a chimeric alignment is reported (conforming to the upcoming
360   SAM spec v1.5). With 0.7.5, if the read has a chimeric alignment, the paired
361   or the top hit uses soft clipping and is marked with neither 0x800 nor 0x100
362   bits. All the other hits part of the chimeric alignment will use hard
363   clipping and be marked with 0x800 if option "-M" is not in use, or marked
364   with 0x100 otherwise.
365
366 * Other hits part of a chimeric alignment are now reported in the SA tag,
367   conforming to the SAM spec v1.5.
368
369 * Better method for resolving an alignment bridging two or more short
370   reference sequences. The current strategy maps the query to the reference
371   sequence that covers the middle point of the alignment. For most
372   applications, this change has no effects.
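
To make the flag convention for chimeric alignments concrete, here is a small
Python sketch that classifies a SAM record from its FLAG field. The bit values
0x100 and 0x800 are the ones named above; the helper names `classify` and
`is_hard_clipped` are made up for this example.

```python
SECONDARY = 0x100      # set on extra chimeric hits when "-M" is in use
SUPPLEMENTARY = 0x800  # set on extra chimeric hits when "-M" is not in use


def classify(flag):
    """Return 'supplementary', 'secondary' or 'primary' for a SAM FLAG value."""
    if flag & SUPPLEMENTARY:
        return "supplementary"
    if flag & SECONDARY:
        return "secondary"
    return "primary"


def is_hard_clipped(cigar):
    """True if the CIGAR string contains a hard-clip (H) operation."""
    return "H" in cigar


# Illustrative records: (FLAG, CIGAR)
for flag, cigar in [(0, "60M40S"), (2048, "60H40M"), (256, "60H40M")]:
    print(flag, cigar, classify(flag), is_hard_clipped(cigar))
```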
373
374Changes in BWA-backtrack:
375
376 * Added a magic number to .sai files. This prevents samse/sampe from reading
377   corrupted .sai (e.g. a .sai file containing LSF log) or incompatible .sai
378   generated by a different version of bwa.
379
380 * Bugfix: alignments in the XA:Z: tag were wrong.
381
382 * Keep track of #ins and #del during backtracking. This simplifies the code
383   and reduces errors in rare corner cases. I should have done this in the
384   early days of bwa.
385
386In addition, if you use BWA-MEM or the fastmap command of BWA, please cite:
387
388 - Li H. (2013) Aligning sequence reads, clone sequences and assembly contigs
389   with BWA-MEM. arXiv:1303.3997v2 [q-bio.GN].
390
391Thank you.
392
393(0.7.5: 29 May 2013, r404)
394
395
396
397Release 0.7.4 (23 April, 2013)
398------------------------------
399
This is a bugfix release. Most of the bugs are considered minor and occur only
very rarely.
402
403 * Bugfix: wrong CIGAR when a query sequence bridges three or more target
404   sequences. This only happens when aligning reads to short assembly contigs.
405
406 * Bugfix: leading "D" operator in CIGAR.
407
408 * Extend more seeds for better alignment around tandem repeats. This is also
409   a cause of the leading "D" operator in CIGAR.
410
411 * Bugfix: SSE2-SSW may occasionally find incorrect query starting position
412   around tandem repeat. This will lead to a suboptimal CIGAR in BWA-MEM and
413   a wrong CIGAR in BWA.
414
415 * Bugfix: clipping penalty does not work as is intended when there is a gap
416   towards the end of a read.
417
418 * Fixed an issue caused by a bug in the libc from Mac/Darwin. In Darwin,
419   fread() is unable to read a data block longer than 2GB due to an integer
420   overflow bug in its implementation.
421
422Since version 0.7.4, BWA-MEM is considered to reach similar stability to
423BWA-backtrack for short-read mapping.
424
(0.7.4: 23 April 2013, r385)
426
427
428
429Release 0.7.3a (15 March, 2013)
430-------------------------------
431
432In 0.7.3, the wrong CIGAR bug was only fixed in one scenario, but not fixed
433in another corner case.
434
435(0.7.3a: 15 March 2013, r367)
436
437
438
439Release 0.7.3 (15 March, 2013)
440------------------------------
441
442Changes to BWA-MEM:
443
444 * Bugfix: pairing score is inaccurate when option -A does not take the default
445   value. This is a very minor issue even if it happens.
446
447 * Bugfix: occasionally wrong CIGAR. This happens when in the alignment there
448   is a 1bp deletion and a 1bp insertion which are close to the end of the
449   reads, and there are no other substitutions or indels. BWA-MEM would not do
450   a gapped alignment due to the bug.
451
452 * New feature: output other non-overlapping alignments in the XP tag such that
453   we can see the entire picture of alignment from one SAM line. XP gives the
454   position, CIGAR, NM and mapQ of each aligned subsequence of the query.
455
BWA-MEM has been used to align ~300Gbp 100-700bp SE/PE reads. SNP/indel calling
has also been evaluated on part of these data. BWA-MEM generally gives better
pre-filtered SNP calls than BWA. No significant issues have been observed since
0.7.2, though minor improvements or bugs (e.g. the bug fixed in this release)
are still possible. If you find potential issues, please send bug reports to
<bio-bwa-help@lists.sourceforge.net> (free registration required).
462
463In addition, more detailed description of the BWA-MEM algorithm can be found at
464<https://github.com/lh3/mem-paper>.
465
466(0.7.3: 15 March 2013, r366)
467
468
469
470Release 0.7.2 (9 March, 2013)
471-----------------------------
472
473Emergent bug fix: 0.7.0 and 0.7.1 give a wrong sign to TLEN. In addition,
474flagging 'properly paired' also gets improved a little.
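
For readers checking TLEN signs in their own data, the sketch below follows
the convention in the SAM specification: the leftmost segment gets a positive
value and the rightmost a negative one. It is not the code used in BWA, and
the tie-breaking rule for mates starting at the same position (first read
positive) is a choice made for this example only.

```python
def tlen_for(pos, end, mate_pos, mate_end, is_first=True):
    """Signed template length for one read of a pair on the same reference.

    Coordinates are 1-based and inclusive. The magnitude is the span from the
    leftmost to the rightmost mapped base of the two reads.
    """
    span = max(end, mate_end) - min(pos, mate_pos) + 1
    if pos < mate_pos:
        return span       # this read is the leftmost segment
    if pos > mate_pos:
        return -span      # this read is the rightmost segment
    return span if is_first else -span  # tie: arbitrary but opposite signs


print(tlen_for(100, 199, 300, 399))
print(tlen_for(300, 399, 100, 199, is_first=False))
```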
475
476(0.7.2: 9 March 2013, r351)
477
478
479
480Release 0.7.1 (8 March, 2013)
481-----------------------------
482
483Changes to BWA-MEM:
484
485 * Bugfix: rare segmentation fault caused by a partial hit to the end of the
486   last sequence.
487
488 * Bugfix: occasional mis-pairing given an interleaved fastq.
489
490 * Bugfix: wrong mate information when the mate is unmapped. SAM generated by
491   BWA-MEM can now be validated with Picard.
492
493 * Improved the performance and accuracy for ultra-long query sequences.
494   Short-read alignment is not affected.
495
496Changes to other components:
497
498 * In BWA-backtrack and BWA-SW, replaced the code for global alignment,
499   Smith-Waterman and SW extension. The performance and accuracy of the two
500   algorithms stay the same.
501
502 * Added an experimental subcommand to merge overlapping paired ends. The
503   algorithm is very conservative: it may miss true overlaps but rarely makes
504   mistakes.
505
506An important note is that like BWA-SW, BWA-MEM may output multiple primary
507alignments for a read, which may cause problems to some tools. For aligning
508sequence reads, it is advised to use '-M' to flag extra hits as secondary. This
509option is not the default because multiple primary alignments are theoretically
510possible in sequence alignment.
511
512(0.7.1: 8 March 2013, r347)
513
514
515
Beta Release 0.7.0 (28 February, 2013)
--------------------------------------
518
519This release comes with a new alignment algorithm, BWA-MEM, for 70bp-1Mbp query
520sequences. BWA-MEM essentially seeds alignments with a variant of the fastmap
521algorithm and extends seeds with banded affine-gap-penalty dynamic programming
522(i.e. the Smith-Waterman-Gotoh algorithm). For typical Illumina 100bp reads or
523longer low-divergence query sequences, BWA-MEM is about twice as fast as BWA
524and BWA-SW and is more accurate. It also supports split alignments like BWA-SW
525and may optionally output multiple hits like BWA. BWA-MEM does not guarantee
526to find hits within a certain edit distance, but BWA is not efficient for such
527task given longer reads anyway, and the edit-distance criterion is arguably
528not as important in long-read alignment.
529
530In addition to the algorithmic improvements, BWA-MEM also implements a few
531handy features in practical aspects:
532
533 1. BWA-MEM automatically switches between local and glocal (global wrt reads;
534    local wrt reference) alignment. It reports the end-to-end glocal alignment
535    if the glocal alignment is not much worse than the optimal local alignment.
536    Glocal alignment reduces reference bias.
537
 2. BWA-MEM automatically infers pair orientation from a batch of single-end
    alignments. It allows more than one orientation if there are sufficient
    supporting reads. This feature has not been tested on reads from Illumina
    jumping library yet. (EXPERIMENTAL)
542
543 3. BWA-MEM optionally takes one interleaved fastq for paired-end mapping. It
544    is possible to convert a name-sorted BAM to an interleaved fastq on the fly
545    and feed the data stream to BWA-MEM for mapping.
546
547 4. BWA-MEM optionally copies FASTA/Q comments to the final SAM output, which
548    helps to transfer individual read annotations to the output.
549
 5. BWA-MEM supports more advanced piping. Users can now run:
    (bwa mem ref.fa '<bzcat r1.fq.bz2' '<bzcat r2.fq.bz2') to map bzip'd read
    files without relying on bash features.
553
554 6. BWA-MEM provides a few basic APIs for single-end mapping. The 'example.c'
555    program in the source code directory implements a full single-end mapper in
556    50 lines of code.
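
The local versus glocal decision in item 1 can be pictured with a toy Python
function. The exact rule is not spelled out here, so the threshold used below
(prefer the end-to-end alignment when its score is within a clipping penalty
of the best local score) is an assumption for illustration, and
`choose_alignment` and `clip_penalty` are hypothetical names.

```python
def choose_alignment(local_score, glocal_score, clip_penalty):
    """Pick 'glocal' when the end-to-end score is not much worse than local.

    "Not much worse" is modelled as: the glocal score exceeds the local score
    minus a clipping penalty. This threshold is an assumed rule for the sketch.
    """
    if glocal_score > local_score - clip_penalty:
        return "glocal"
    return "local"


print(choose_alignment(100, 97, 5))
print(choose_alignment(100, 90, 5))
```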
557
558The BWA-MEM algorithm is in the beta phase. It is not advised to use BWA-MEM
559for production use yet. However, when the implementation becomes stable after a
560few release cycles, existing BWA users are recommended to migrate to BWA-MEM
561for 76bp or longer Illumina reads and long query sequences. The original BWA
562short-read algorithm will not deliver satisfactory results for 150bp+ Illumina
563reads. Change of mappers will be necessary sooner or later.
564
(0.7.0 beta: 28 February 2013, r313)
566
567
568
569Release 0.6.2 (19 June, 2012)
570-----------------------------
571
572This is largely a bug-fix release. Notable changes in BWA-short and BWA-SW:
573
574 * Bugfix: BWA-SW may give bad alignments due to incorrect band width.
575
576 * Bugfix: A segmentation fault due to an out-of-boundary error. The fix is a
577   temporary solution. The real cause has not been identified.
578
579 * Attempt to read index from prefix.64.bwt, such that the 32-bit and 64-bit
580   index can coexist.
581
582 * Added options '-I' and '-S' to control BWA-SW pairing.
583
584(0.6.2: 19 June 2012, r126)
585
586
587
588Release 0.6.1 (28 November, 2011)
589---------------------------------
590
591Notable changes to BWA-short:
592
593 * Bugfix: duplicated alternative hits in the XA tag.
594
595 * Bugfix: when trimming enabled, bwa-aln trims 1bp less.
596
597 * Disabled the color-space alignment. 0.6.x is not working with SOLiD reads at
598   present.
599
600Notable changes to BWA-SW:
601
602 * Bugfix: segfault due to excessive ambiguous bases.
603
604 * Bugfix: incorrect mate position in the SE mode.
605
606 * Bugfix: rare segfault in the PE mode
607
608 * When macro _NO_SSE2 is in use, fall back to the standard Smith-Waterman
609   instead of SSE2-SW.
610
611 * Optionally mark split hits with lower alignment scores as secondary.
612
613Changes to fastmap:
614
615 * Bugfix: infinite loop caused by ambiguous bases.
616
617 * Optionally output the query sequence.
618
619(0.6.1: 28 November 2011, r104)
620
621
622
623Release 0.5.10 and 0.6.0 (12 November, 2011)
624--------------------------------------------
625
The 0.6.0 release comes with two major changes. Firstly, the index data
structure has been changed to support genomes longer than 4GB. The forward and
reverse backward genome is now integrated in one index. This change speeds up
BWA-short by about 20% and BWA-SW by 90% with the mapping accuracy largely
unchanged. A tradeoff is that BWA requires more memory, but this is the price
almost all mappers that index the genome have to pay.
632
Secondly, BWA-SW in 0.6.0 now works with paired-end data. It is more accurate
for highly unique reads and more robust to long indels and structural
variations. However, BWA-short still has an edge for reads with many suboptimal
hits. It is not yet known which algorithm is the best for variant calling.
637
6380.5.10 is a bugfix release only and is likely to be the last release in the 0.5
639branch unless I find critical bugs in future.
640
641Other notable changes:
642
643 * Added the 'fastmap' command that finds super-maximal exact matches. It does
644   not give the final alignment, but runs much faster. It can be a building
645   block for other alignment algorithms. [0.6.0 only]
646
647 * Output the timing information before BWA exits. This also tells users that
648   the task has been finished instead of being killed or aborted. [0.6.0 only]
649
650 * Sped up multi-threading when using many (>20) CPU cores.
651
652 * Check I/O error.
653
654 * Increased the maximum barcode length to 63bp.
655
656 * Automatically choose the indexing algorithm.
657
658 * Bugfix: very rare segfault due to an uninitialized variable. The bug also
659   affects the placement of suboptimal alignments. The effect is very minor.
660
661This release involves quite a lot of tricky changes. Although it has been
662tested on a few data sets, subtle bugs may be still hidden. It is *NOT*
663recommended to use this release in a production pipeline. In future, however,
664BWA-SW may be better when reads continue to go longer. I would encourage users
665to try the 0.6 release. I would also like to hear the users' experience. Thank
666you.
667
668(0.6.0: 12 November 2011, r85)
669
670
671
672Beta Release 0.5.9 (24 January, 2011)
673-------------------------------------
674
675Notable changes:
676
677 * Feature: barcode support via the '-B' option.
678
679 * Feature: Illumina 1.3+ read format support via the '-I' option.
680
681 * Bugfix: RG tags are not attached to unmapped reads.
682
683 * Bugfix: very rare bwasw mismappings
684
685 * Recommend options for PacBio reads in bwasw help message.
686
687
688Also, since January 13, the BWA master repository has been moved to github:
689
690  https://github.com/lh3/bwa
691
692The revision number has been reset. All recent changes will be first
693committed to this repository.
694
695(0.5.9: 24 January 2011, r16)
696
697
698
699Beta Release Candidate 0.5.9rc1 (10 December, 2010)
700---------------------------------------------------
701
702Notable changes in bwasw:
703
704 * Output unmapped reads.
705
706 * For a repetitive read, choose a random hit instead of a fixed
707   one. This is not well tested.
708
709Notable changes in bwa-short:
710
711 * Fixed a bug in the SW scoring system, which may lead to unexpected
712   gaps towards the end of a read.
713
714 * Fixed a bug which invalidates the randomness of repetitive reads.
715
716 * Fixed a rare memory leak.
717
718 * Allowed to specify the read group at the command line.
719
720 * Take name-grouped BAM files as input.
721
722Changes to this release are usually safe in that they do not interfere
723with the key functionality. However, the release has only been tested on
724small samples instead of on large-scale real data. If anything weird
725happens, please report the bugs to the bio-bwa-help mailing list.
726
727(0.5.9rc1: 10 December 2010, r1561)
728
729
730
731Beta Release 0.5.8 (8 June, 2010)
732---------------------------------
733
734Notable changes in bwasw:
735
736 * Fixed an issue of missing alignments. This should happen rarely and
737   only when the contig/read alignment is multi-part. Very rarely, bwasw
738   may still miss a segment in a multi-part alignment. This is difficult
739   to fix, although possible.
740
741Notable changes in bwa-short:
742
 * Discard the SW alignment when the best single-end alignment is much
   better. Such an SW alignment may be caused by structural variations, and
   forcing it to be aligned leads to false alignment. This fix has not
   been tested thoroughly. It would be great to receive more user
   feedback on this issue.
748
749 * Fixed a typo/bug in sampe which leads to unnecessarily large memory
750   usage in some cases.
751
752 * Further reduced the chance of reporting 'weird pairing'.
753
754(0.5.8: 8 June 2010, r1442)
755
756
757
758Beta Release 0.5.7 (1 March, 2010)
759----------------------------------
760
761This release only has an effect on paired-end data with fat insert-size
762distribution. Users are still recommended to update as the new release
763improves the robustness to poor data.
764
765 * The fix for 'weird pairing' was not working in version 0.5.6, pointed
766   out by Carol Scott. It should work now.
767
768 * Optionally output to a normal file rather than to stdout (by Tim
769   Fennel).
770
771(0.5.7: 1 March 2010, r1310)
772
773
774
Beta Release 0.5.6 (10 February, 2010)
--------------------------------------
777
778Notable changes in bwa-short:
779
 * Report multiple hits in the SAM format at a new tag XA encoded as:
   (chr,pos,CIGAR,NM;)*. By default, if a paired or single-end read has
   4 or fewer hits, they will all be reported; if a read in an anomalous
   pair has 11 or fewer hits, all of them will be reported.
784
785 * Perform Smith-Waterman alignment also for anomalous read pairs when
786   both ends have quality higher than 17. This reduces false positives
787   for some SV discovery algorithms.
788
789 * Do not report "weird pairing" when the insert size distribution is
790   too fat or has a mean close to zero.
791
 * If a read is bridging two adjacent chromosomes, flag it as unmapped.
793
794 * Fixed a small but long existing memory leak in paired-end mapping.
795
796 * Multiple bug fixes in SOLiD mapping: a) quality "-1" can be correctly
797   parsed by solid2fastq.pl; b) truncated quality string is resolved; c)
798   SOLiD read mapped to the reverse strand is complemented.
799
800 * Bwa now calculates skewness and kurtosis of the insert size
801   distribution.
802
803 * Deploy a Bayesian method to estimate the maximum distance for a read
804   pair considered to be paired properly. The method is proposed by
805   Gerton Lunter, but bwa only implements a simplified version.
806
807 * Export more functions for Java bindings, by Matt Hanna (See:
808   http://www.broadinstitute.org/gsa/wiki/index.php/Sting_BWA/C_bindings)
809
810 * Abstract bwa CIGAR for further extension, by Rodrigo Goya.
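
The XA encoding "(chr,pos,CIGAR,NM;)*" given above is simple to parse. The
Python sketch below is one way to do it; the function name `parse_xa` is
hypothetical, and treating a leading "+" or "-" on the position as the strand
is an assumption about the output rather than something stated in these notes.

```python
def parse_xa(value):
    """Parse the value of an XA tag into a list of hit dictionaries.

    `value` is the text after "XA:Z:", e.g. "chr1,+1234,100M,0;".
    """
    hits = []
    for entry in value.split(";"):
        if not entry:
            continue  # the string ends with ';', leaving one empty piece
        # Split from the right so the last three fields are always found.
        chrom, pos, cigar, nm = entry.rsplit(",", 3)
        hits.append({
            "chr": chrom,
            "strand": "-" if pos.startswith("-") else "+",  # assumed meaning
            "pos": abs(int(pos)),
            "cigar": cigar,
            "nm": int(nm),
        })
    return hits


for hit in parse_xa("chr1,+1234,100M,0;chr2,-5678,50M1I49M,2;"):
    print(hit)
```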
811
(0.5.6: 10 February 2010, r1303)
813
814
815
816Beta Release 0.5.5 (10 November, 2009)
817--------------------------------------
818
819This is a bug fix release:
820
821 * Fixed a serious bug/typo in aln which does not occur given short
822   reads, but will lead to segfault for >500bp reads. Of course, the aln
823   command is not recommended for reads longer than 200bp, but this is a
824   bug anyway.
825
826 * Fixed a minor bug/typo which leads to incorrect single-end mapping
827   quality when one end is moved to meet the mate-pair requirement.
828
829 * Fixed a bug in samse for mapping in the color space. This bug is
830   caused by quality filtration added since 0.5.1.
831
832(0.5.5: 10 November 2009, r1273)
833
834
835
836Beta Release 0.5.4 (9 October, 2009)
837------------------------------------
838
839Since this version, the default seed length used in the "aln" command is
840changed to 32.
841
842Notable changes in bwa-short:
843
844 * Added a new tag "XC:i" which gives the length of clipped reads.
845
846 * In sampe, skip alignments in case of a bug in the Smith-Waterman
847   alignment module.
848
849 * In sampe, fixed a bug in pairing when the read sequence is identical
850   to its reverse complement.
851
852 * In sampe, optionally preload the entire FM-index into memory to
853   reduce disk operations.
854
855Notable changes in dBWT-SW/BWA-SW:
856
857 * Changed name dBWT-SW to BWA-SW.
858
859 * Optionally use "hard clipping" in the SAM output.
860
861(0.5.4: 9 October 2009, r1245)
862
863
864
865Beta Release 0.5.3 (15 September, 2009)
866---------------------------------------
867
868Fixed a critical bug in bwa-short: reads mapped to the reverse strand
869are not complemented.
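
For context, SAM stores every read as it appears on the forward strand
of the reference, so a read mapped to the reverse strand is written
reverse-complemented with its quality string reversed. A minimal sketch
of that conversion follows; the function name is hypothetical and this
is not the code used in bwa.

```python
# Translation table for complementing nucleotides; N stays N.
_COMPLEMENT = str.maketrans("ACGTNacgtn", "TGCANtgcan")

def to_forward_strand(seq, qual, is_reverse):
    """Return (seq, qual) as they should appear in a SAM record.

    seq and qual are given in the original read orientation;
    is_reverse says whether the read was mapped to the reverse strand.
    """
    if not is_reverse:
        return seq, qual
    # Complement each base, then reverse both sequence and qualities.
    return seq.translate(_COMPLEMENT)[::-1], qual[::-1]
```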
870
871(0.5.3: 15 September 2009, r1225)
872
873
874
875Beta Release 0.5.2 (13 September, 2009)
876---------------------------------------
877
878Notable changes in bwa-short:
879
880 * Optionally trim reads before alignment. See the manual page on 'aln
881   -q' for detailed description.
882
883 * Fixed a bug in calculating the NM tag for a gapped alignment.
884
885 * Fixed a bug given a mixture of reads with some longer than the seed
886   length and some shorter.
887
888 * Print SAM header.
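
The read trimming above can be sketched as follows, assuming the rule
given in the manual page: a read of length l is trimmed down to
argmax_x{sum_{i=x+1}^l (INT-q_i)} if q_l<INT, where INT is the value
passed to 'aln -q'. The function name is hypothetical, and the real
implementation may add safeguards such as a minimum read length.

```python
def trimmed_length(quals, threshold):
    """Return how many leading bases to keep after 3'-end trimming.

    quals: list of integer base qualities, 5' to 3'.
    threshold: the quality threshold (INT in the manual page formula).
    """
    n = len(quals)
    # Trimming only applies when the last base is below the threshold.
    if n == 0 or quals[-1] >= threshold:
        return n
    best_sum = 0
    best_len = n
    running = 0
    # Walk from the 3' end; running is the sum of (threshold - q) over
    # the bases that would be removed if we kept only the first x bases.
    for x in range(n - 1, -1, -1):
        running += threshold - quals[x]
        if running > best_sum:
            best_sum = running
            best_len = x
    return best_len
```

For example, with qualities 30,30,30,10,5,2 and a threshold of 20, the
last three bases are removed and three bases are kept.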
889
890Notable changes in dBWT-SW:
891
892 * Changed the default value of -T to 30. As a result, the accuracy is a
893   little higher for short reads at the cost of speed.
894
895(0.5.2: 13 September 2009, r1223)
896
897
898
899Beta Release 0.5.1 (2 September, 2009)
900--------------------------------------
901
902Notable changes in the short read alignment component:
903
904 * Fixed a bug in samse: do not write mate coordinates.
905
906Notable changes in dBWT-SW:
907
 * Randomly choose one alignment if the read is repetitive.
909
910 * Fixed a flaw when a read is mapped across two adjacent reference
911   sequences. However, wrong alignment reports may still occur rarely in
912   this case.
913
914 * Changed the default band width to 50. The speed is slower due to this
915   change.
916
917 * Improved the mapping quality a little given long query sequences.
918
919(0.5.1: 2 September 2009, r1209)
920
921
922
923Beta Release 0.5.0 (20 August, 2009)
924------------------------------------
925
926This release implements a novel algorithm, dBWT-SW, specifically
927designed for long reads. It is 10-50 times faster than SSAHA2, depending
928on the characteristics of the input data, and achieves comparable
929alignment accuracy while allowing chimera detection. In comparison to
930BLAT, dBWT-SW is several times faster and much more accurate especially
931when the error rate is high. Please read the manual page for more
932information.
933
934The dBWT-SW algorithm is kind of developed for future sequencing
935technologies which produce much longer reads with a little higher error
936rate. It is still at its early development stage. Some features are
937missing and it may be buggy although I have evaluated on several
938simulated and real data sets. But following the "release early"
939paradigm, I would like the users to try it first.
940
941Other notable changes in BWA are:
942
943 * Fixed a rare bug in the Smith-Waterman alignment module.
944
945 * Fixed a rare bug about the wrong alignment coordinate when a read is
946   poorly aligned.
947
948 * Fixed a bug in generating the "mate-unmap" SAM tag when both ends in
949   a pair are unmapped.
950
951(0.5.0: 20 August 2009, r1200)
952
953
954
955Beta Release 0.4.9 (19 May, 2009)
956---------------------------------
957
Interestingly, the integer overflow bug claimed to be fixed in 0.4.7 was
not in fact fixed. Now I have fixed the bug. Sorry for this, and thanks to
Quan Long for pointing out the bug (again).
961
962(0.4.9: 19 May 2009, r1075)
963
964
965
966Beta Release 0.4.8 (18 May, 2009)
967---------------------------------
968
969One change to "aln -R". Now by default, if there are no more than '-R'
970equally best hits, bwa will search for suboptimal hits. This change
971affects the ability in finding SNPs in segmental duplications.
972
973I have not tested this option thoroughly, but this simple change is less
974likely to cause new bugs. Hope I am right.
975
976(0.4.8: 18 May 2009, r1073)
977
978
979
980Beta Release 0.4.7 (12 May, 2009)
981---------------------------------
982
983Notable changes:
984
 * Output SM (single-end mapping quality) and AM (smaller mapping
   quality among the two ends) tags in SAM output.
987
988 * Improved the functionality of stdsw.
989
990 * Made the XN tag more accurate.
991
992 * Fixed a very rare segfault caused by integer overflow.
993
 * Improved the insert size estimation.
995
996 * Fixed compiling errors for some Linux systems.
997
998(0.4.7: 12 May 2009, r1066)
999
1000
1001
1002Beta Release 0.4.6 (9 March, 2009)
1003----------------------------------
1004
1005This release improves the SOLiD support. First, a script for converting
1006SOLiD raw data is provided. This script is adapted from solid2fastq.pl
1007in the MAQ package. Second, a nucleotide reference file can be directly
1008used with 'bwa index'. Third, SOLiD paired-end support is
1009completed. Fourth, color-space reads will be converted to nucleotides
1010when SAM output is generated. Color errors are corrected in this
1011process. Please note that like MAQ, BWA cannot make use of the primer
1012base and the first color.
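
As background on the color-to-nucleotide conversion, each SOLiD color
encodes the transition between two adjacent bases. With A=0, C=1, G=2
and T=3, the color is the XOR of the two base codes. The sketch below
decodes colors naively from a known starting base. It is not what BWA
does: BWA converts reads with the help of the alignment so that color
errors can be corrected. The function name is hypothetical.

```python
_BASES = "ACGT"  # A=0, C=1, G=2, T=3

def decode_colors(first_base, colors):
    """Decode a string of colors (digits 0-3) into nucleotides.

    first_base is the known base preceding the first color; it is not
    included in the returned sequence.
    """
    code = _BASES.index(first_base)
    decoded = []
    for c in colors:
        color = int(c)
        if not 0 <= color <= 3:
            raise ValueError("color out of range: %r" % c)
        # The next base code is the previous code XOR the color.
        code ^= color
        decoded.append(_BASES[code])
    return "".join(decoded)
```

With naive decoding, a single wrong color changes every base after it,
which is why correcting color errors during conversion matters.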
1013
1014In addition, the calculation of mapping quality is also improved a
1015little bit, although end-users may barely observe the difference.
1016
1017(0.4.6: 9 March 2009, r915)
1018
1019
1020
Beta Release 0.4.5 (18 February, 2009)
1022--------------------------------------
1023
1024Not much happened, but I think it would be good to let the users use the
1025latest version.
1026
Notable changes (thanks to Bob Handsaker for catching the two bugs):

 * Improved boundary check. The previous version may still give incorrect
   alignment coordinates in rare cases.
1031
1032 * Fixed a bug in SW alignment when no residue matches. This only
1033   affects the 'sampe' command.
1034
1035 * Robustly estimate insert size without setting the maximum on the
1036   command line. Since this release 'sampe -a' only has an effect if
1037   there are not enough good pairs to infer the insert size
1038   distribution.
1039
1040 * Reduced false PE alignments a little bit by using the inferred insert
1041   size distribution. This fix may be more important for long insert
1042   size libraries.
1043
(0.4.5: 18 February 2009, r829)
1045
1046
1047
Beta Release 0.4.4 (15 February, 2009)
1049--------------------------------------
1050
1051This is mainly a bug fix release. Notable changes are:
1052
 * Imposed boundary check for extracting subsequence from the
   genome. Previously, this caused memory problems in rare cases.

 * Fixed a bug in detecting whether an alignment overlaps with N on
   the genome.
1058
1059 * Changed MD tag to meet the latest SAM specification.
1060
(0.4.4: 15 February 2009, r815)
1062
1063
1064
1065Beta Release 0.4.3 (22 January, 2009)
-------------------------------------
1067
1068Notable changes:
1069
 * Treat an ambiguous base N as a mismatch. Previous versions would not
   map reads containing any N.
1072
1073 * Automatically choose the maximum allowed number of differences. This
1074   is important when reads of different lengths are mixed together.
1075
1076 * Print mate coordinate if only one end is unmapped.
1077
1078 * Generate MD tag. This tag encodes the mismatching positions and the
1079   reference bases at these positions. Deletions from the reference will
1080   also be printed.
1081
1082 * Optionally dump multiple hits from samse, in another concise format
1083   rather than SAM.
1084
1085 * Optionally disable iterative search. This is VERY SLOOOOW, though.
1086
 * Fixed a bug in generating SAM.
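
To illustrate the MD tag from the list above, here is a minimal parser
sketch. It assumes the MD syntax of the current SAM specification
(numbers for matching runs, a letter for a mismatched reference base,
"^" followed by letters for bases deleted from the reference), which may
differ in detail from what 0.4.3 emitted. The function name is
hypothetical.

```python
import re

# A run of matches, a deletion (^ plus reference bases), or a mismatch.
_MD_TOKEN = re.compile(r"(\d+)|\^([A-Za-z]+)|([A-Za-z])")

def parse_md(md):
    """Split an MD string into (kind, value) operations."""
    ops = []
    pos = 0
    for m in _MD_TOKEN.finditer(md):
        if m.start() != pos:
            raise ValueError("unexpected character at offset %d" % pos)
        pos = m.end()
        if m.group(1) is not None:
            n = int(m.group(1))
            if n:  # zero-length runs only act as separators
                ops.append(("match", n))
        elif m.group(2) is not None:
            ops.append(("deletion", m.group(2)))
        else:
            ops.append(("mismatch", m.group(3)))
    if pos != len(md):
        raise ValueError("unexpected character at offset %d" % pos)
    return ops
```

For example, "10A5^AC6" reads as 10 matches, a mismatch where the
reference has A, 5 matches, a deletion of AC from the reference, and 6
matches.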
1088
1089(0.4.3: 22 January 2009, r787)
1090
1091
1092
1093Beta Release 0.4.2 (9 January, 2009)
1094------------------------------------
1095
1096Aaron Quinlan found a bug in the indexer: the bwa indexer segfaults if
1097there are no comment texts in the FASTA header. This is a critical
1098bug. Nothing else was changed.
1099
1100(0.4.2: 9 January 2009, r769)
1101
1102
1103
1104Beta Release 0.4.1 (7 January, 2009)
1105------------------------------------
1106
I am sorry for the quick updates these days. I would like to set a
milestone for BWA, and this release seems to be one. For paired-end reads,
BWA also does Smith-Waterman alignment for an unmapped read whose mate can
be mapped confidently. With this strategy, BWA achieves similar accuracy
to maq. The benchmark is also updated accordingly.
1112
1113(0.4.1: 7 January 2009, r760)
1114
1115
1116
1117Beta Release 0.4.0 (6 January, 2009)
1118------------------------------------
1119
1120In comparison to the release two days ago, this release is mainly tuned
1121for performance with some tricks I learnt from Bowtie. However, as the
1122indexing format has also been changed, I have to increase the version
1123number to 0.4.0 to emphasize that *DATABASE MUST BE RE-INDEXED* with
1124'bwa index'.
1125
1126 * Improved the speed by about 20%.
1127
1128 * Added multi-threading to 'bwa aln'.
1129
1130(0.4.0: 6 January 2009, r756)
1131
1132
1133
1134Beta Release 0.3.0 (4 January, 2009)
1135------------------------------------
1136
1137 * Added paired-end support by separating SA calculation and alignment
1138   output.
1139
1140 * Added SAM output.
1141
1142 * Added evaluation to the documentation.
1143
1144(0.3.0: 4 January 2009, r741)
1145
1146
1147
Beta Release 0.2.0 (15 August, 2008)
------------------------------------
1150
1151 * Take the subsequence at the 5'-end as seed. Seeding strategy greatly
1152   improves the speed for long reads, at the cost of missing a few true
   hits that contain many differences in the seed. Seeding also increases
   memory usage by 800MB.
1155
1156 * Fixed a bug which may miss some gapped alignments. Fixing the bug
1157   also slows the speed a little.
1158
1159(0.2.0: 15 August 2008, r428)
1160
1161
1162
Beta Release 0.1.6 (08 August, 2008)
------------------------------------
1165
1166 * Give accurate CIGAR string.
1167
 * Add a simple interface to SW/NW alignment.
1169
1170(0.1.6: 08 August 2008, r414)
1171
1172
1173
1174Beta Release 0.1.5 (27 July, 2008)
1175----------------------------------
1176
1177 * Improve the speed. This version is expected to give the same results.
1178
1179(0.1.5: 27 July 2008, r400)
1180
1181
1182
1183Beta Release 0.1.4 (22 July, 2008)
1184----------------------------------
1185
1186 * Fixed a bug which may cause missing gapped alignments.
1187
1188 * More clearly define what alignments can be found by BWA (See
1189   manual). Now BWA runs a little slower because it will visit more
1190   potential gapped alignments.
1191
 * A bit of code cleanup.
1193
1194(0.1.4: 22 July 2008, r387)
1195
1196
1197
1198Beta Release 0.1.3 (21 July, 2008)
1199----------------------------------
1200
Improve the speed with some tricks on retrieving occurrences. The results
should be exactly the same as those of 0.1.2.
1203
1204(0.1.3: 21 July 2008, r382)
1205
1206
1207
1208Beta Release 0.1.2 (17 July, 2008)
1209----------------------------------
1210
Support gapped alignment. Code for ungapped alignment has been removed.
1212
1213(0.1.2: 17 July 2008, r371)
1214
1215
1216
1217Beta Release 0.1.1 (03 June, 2008)
----------------------------------
1219
This is the first release of BWA, Burrows-Wheeler Alignment tool. Please
read the man page for more information about this software.
1222
1223(0.1.1: 03 June 2008, r349)
1224