1News for the Biopython Project
2==============================
3
4This file contains release notes and general news about the Biopython project.
5See also the DEPRECATED file which tracks the removal of obsolete modules or
6functions, and online https://biopython.org/wiki/News and
7https://www.open-bio.org/category/obf-projects/biopython/
8
9The latest news is at the top of this file.
10
11(In progress, not yet released): Biopython 1.80
12===============================================
13
141 June 2021: Biopython 1.79
15================================
16
17This is intended to be our final release supporting Python 3.6. It also
18supports Python 3.7, 3.8 and 3.9, and has also been tested on PyPy3.6.1 v7.1.1.
19
20The ``Seq`` and ``MutableSeq`` classes in ``Bio.Seq`` now store their sequence
contents as ``bytes`` and ``bytearray`` objects, respectively. Previously, for
22``Seq`` objects a string object was used, and a Unicode array object for
23``MutableSeq`` objects. This was maintained during the transition from Python2
24to Python3. However, a Python2 string object corresponds to a ``bytes`` object
in Python3, storing the string as a series of 8-bit characters. While
non-ASCII characters could be stored in Python2 strings, they were not treated
as such. For example:
28
29In Python2::
30
31    >>> s = "Генетика"
32    >>> type(s)
    <type 'str'>
34    >>> len(s)
35    16
36
37In Python3::
38
39    >>> s = "Генетика"
40    >>> type(s)
41    <class 'str'>
42    >>> len(s)
43    8
44
45In Python3, storing the sequence contents as ``bytes`` and ``bytearray``
46objects has the further advantage that both support the buffer protocol.
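
As a small illustration using only built-in types (not Biopython itself), the
following sketch shows the difference between the character count and the
encoded byte count, and how ``memoryview`` relies on the buffer protocol to
access ``bytes`` and ``bytearray`` data without copying. The variable names
are chosen for this example only::

    text = "Генетика"
    data = text.encode("utf-8")  # each Cyrillic letter takes two bytes in UTF-8

    n_chars = len(text)  # number of Unicode characters
    n_bytes = len(data)  # number of encoded bytes

    # bytes and bytearray both expose their memory via the buffer protocol.
    dna = bytearray(b"ACGT")
    view = memoryview(dna)
    view[0] = ord("T")  # writes through to the underlying bytearray
    first_base = bytes(view[:1])

    print(n_chars, n_bytes, dna)

A ``memoryview`` of a ``bytes`` object works the same way for reading, but is
read-only.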
47
48Taking advantage of the similarity between ``bytes`` and ``bytearray``, the
49``Seq`` and ``MutableSeq`` classes now inherit from an abstract base class
50``_SeqAbstractBaseClass`` in ``Bio.Seq`` that implements most of the ``Seq``
and ``MutableSeq`` methods, ensuring their consistency with each other.
Methods that modify the sequence contents take an optional ``inplace`` argument
that specifies whether a new sequence object should be returned with the new
sequence contents (if ``inplace`` is ``False``, the default) or whether the
sequence object itself should be modified (if ``inplace`` is ``True``). For
``Seq`` objects, which are immutable, using ``inplace=True`` raises an
exception. For ``inplace=False``, the default, ``Seq`` and ``MutableSeq``
objects behave consistently.
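
The ``inplace`` pattern can be sketched with toy classes. This is not
Biopython's implementation: the class names ``ToySeqBase``, ``ToySeq`` and
``ToyMutableSeq`` are hypothetical, and raising ``TypeError`` for the immutable
case is an assumption made for this example::

    class ToySeqBase:
        """Hypothetical shared base class holding bytes or bytearray data."""

        def __init__(self, data):
            self._data = data

        def upper(self, inplace=False):
            new_data = self._data.upper()
            if inplace:
                if not isinstance(self._data, bytearray):
                    raise TypeError("immutable sequence cannot be modified")
                self._data[:] = new_data  # overwrite the existing buffer
                return self
            return type(self)(new_data)  # leave the original untouched

        def __bytes__(self):
            return bytes(self._data)


    class ToySeq(ToySeqBase):
        def __init__(self, data):
            super().__init__(bytes(data))  # immutable storage


    class ToyMutableSeq(ToySeqBase):
        def __init__(self, data):
            super().__init__(bytearray(data))  # mutable storage


    immutable = ToySeq(b"acgt")
    mutable = ToyMutableSeq(b"acgt")
    copy = mutable.upper()  # new object, original unchanged
    mutable_after_copy = bytes(mutable)
    mutable.upper(inplace=True)  # modifies the object itself

Because both toy classes share one method body, the default
``inplace=False`` behaviour is identical for the mutable and immutable
variants.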
59
60As before, ``Seq`` and ``MutableSeq`` objects can be initialized using a string
61object, which will be converted to a ``bytes`` or ``bytearray`` object assuming
62an ASCII encoding. Alternatively, a ``bytes`` or ``bytearray`` object can be
63used, or an instance of any class inheriting from the new
64``SequenceDataAbstractBaseClass`` abstract base class in ``Bio.Seq``. This
requires that the class implements the ``__len__`` and ``__getitem__`` methods
that return the sequence length and sequence contents on demand. Initializing a
``Seq`` instance using an instance of a class inheriting from
``SequenceDataAbstractBaseClass`` allows the ``Seq`` object to be lazy, meaning
that its sequence is provided on demand only, without having to load the full
sequence up front. This feature is now used in ``BioSQL``, providing on-demand
71sequence loading from an SQL database, as well as in a new parser for twoBit
72(.2bit) sequence data added to ``Bio.SeqIO``. This is a lazy parser that allows
73fast access to genome-size DNA sequence files by not having to read the full
genome sequence. The new ``_UndefinedSequenceData`` class in ``Bio.Seq`` also
inherits from ``SequenceDataAbstractBaseClass`` to represent sequences of known
length but unknown sequence contents. This provides an alternative to
``UnknownSeq``, which is now deprecated as its definition was ambiguous. For
example, in the following cases the ``UnknownSeq`` is interpreted as a sequence
with well-defined sequence contents::
80
81    >>> s = UnknownSeq(3, character="A")
82    >>> s.translate()
83    UnknownSeq(1, character='K')
84    >>> s + "A"
    Seq('AAAA')
86
A sequence object with undefined sequence contents can now be created by
88using ``None`` when creating the ``Seq`` object, together with the sequence
89length. Trying to access its sequence contents raises an
90``UndefinedSequenceError``::
91
92    >>> s = Seq(None, length=6)
93    >>> s
94    Seq(None, length=6)
95    >>> len(s)
96    6
97    >>> "A" in s
98    Traceback (most recent call last):
99    ...
100    Bio.Seq.UndefinedSequenceError: Sequence content is undefined
101    >>> print(s)
102    Traceback (most recent call last):
    ...
104    Bio.Seq.UndefinedSequenceError: Sequence content is undefined
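
To illustrate the on-demand contract described above (``__len__`` plus
``__getitem__``), here is a minimal standard-library sketch. The class
``LazyFileSequenceData`` is hypothetical and does not inherit from the
Biopython base class; it assumes a raw sequence file without header or line
breaks, and only shows how the length can be known while the contents are read
from disk when requested::

    import os
    import tempfile


    class LazyFileSequenceData:
        """Hypothetical on-demand sequence data (illustration only)."""

        def __init__(self, path):
            self._path = path
            self._length = os.path.getsize(path)  # metadata only
            self.reads = 0  # counts how often the file contents were read

        def __len__(self):
            return self._length

        def __getitem__(self, key):
            if isinstance(key, int):
                if key < 0:
                    key += self._length
                if not 0 <= key < self._length:
                    raise IndexError("sequence index out of range")
                start, stop = key, key + 1
            else:
                start, stop, step = key.indices(self._length)
                if step != 1:
                    raise ValueError("only contiguous slices are supported")
            with open(self._path, "rb") as handle:
                handle.seek(start)
                self.reads += 1
                return handle.read(max(0, stop - start))


    # Write a small raw sequence file to demonstrate.
    with tempfile.NamedTemporaryFile(suffix=".seq", delete=False) as tmp:
        tmp.write(b"ACGTACGTTTGA")
    data = LazyFileSequenceData(tmp.name)
    length = len(data)  # no file contents read
    fragment = data[4:8]  # reads only the requested four bytes
    os.remove(tmp.name)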
105
106Element assignment in Bio.PDB.Atom now returns "X" when the element cannot be
107unambiguously guessed from the atom name, in accordance with PDB structures.
108
109Bio.PDB entities now have a ``center_of_mass()`` method that calculates either
110centers of gravity or geometry.
111
112New method ``disordered_remove()`` implemented in Bio.PDB DisorderedAtom and
113DisorderedResidue to remove children.
114
115New module Bio.PDB.SASA implements the Shrake-Rupley algorithm to calculate
116atomic solvent accessible areas without third-party tools.
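
For readers unfamiliar with the Shrake-Rupley algorithm, the following
standard-library sketch shows the idea: place test points on a sphere around
each atom (atomic radius plus probe radius) and count the points that are not
inside any neighbouring atom's expanded sphere. It is not the ``Bio.PDB.SASA``
code; the function names, the golden-spiral point placement, the brute-force
neighbour loop and the default values are assumptions made for this example::

    import math


    def sphere_points(n):
        """Return n roughly evenly spaced points on the unit sphere."""
        golden_angle = math.pi * (3.0 - math.sqrt(5.0))
        points = []
        for i in range(n):
            z = 1.0 - (2.0 * i + 1.0) / n
            r = math.sqrt(1.0 - z * z)
            phi = i * golden_angle
            points.append((r * math.cos(phi), r * math.sin(phi), z))
        return points


    def shrake_rupley(coords, radii, probe=1.4, n_points=100):
        """Return one solvent accessible surface area per atom."""
        unit = sphere_points(n_points)
        areas = []
        for i, (center, radius) in enumerate(zip(coords, radii)):
            r_i = radius + probe
            accessible = 0
            for ux, uy, uz in unit:
                px = center[0] + r_i * ux
                py = center[1] + r_i * uy
                pz = center[2] + r_i * uz
                buried = False
                for j, (other, other_radius) in enumerate(zip(coords, radii)):
                    if j == i:
                        continue
                    r_j = other_radius + probe
                    d2 = (
                        (px - other[0]) ** 2
                        + (py - other[1]) ** 2
                        + (pz - other[2]) ** 2
                    )
                    if d2 < r_j * r_j:  # point lies inside a neighbour
                        buried = True
                        break
                if not buried:
                    accessible += 1
            sphere_area = 4.0 * math.pi * r_i * r_i
            areas.append(sphere_area * accessible / n_points)
        return areas


    isolated = shrake_rupley([(0.0, 0.0, 0.0)], [1.7])
    pair = shrake_rupley([(0.0, 0.0, 0.0), (2.0, 0.0, 0.0)], [1.7, 1.7])

An isolated atom keeps its full expanded sphere area, while each atom of the
close pair loses the part of its sphere that lies inside the other.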
117
Expected ``TypeError`` behaviour has been restored to the ``Seq`` object's
string-like methods (fixing a regression in Biopython 1.78).
120
121The KEGG ``KGML_Pathway`` KGML output was fixed to produce output that complies
122with KGML v0.7.2.
123
124Parsing motifs in ``pfm-four-rows`` format can now handle motifs with values
125in scientific notation.
126
Parsing motifs in ``minimal`` MEME format will use ``nsites`` when making
the count matrix from the frequency matrix, instead of multiplying the
frequency matrix by 1000000.
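
The arithmetic behind this change can be sketched as follows. The helper
``counts_from_frequencies`` is hypothetical, and rounding to the nearest
integer is an assumption of this example rather than a statement about the
parser's exact behaviour::

    def counts_from_frequencies(frequencies, nsites):
        """Convert per-position letter frequencies into integer counts."""
        return [
            {letter: round(freq * nsites) for letter, freq in column.items()}
            for column in frequencies
        ]


    frequencies = [
        {"A": 0.25, "C": 0.25, "G": 0.5, "T": 0.0},
        {"A": 1.0, "C": 0.0, "G": 0.0, "T": 0.0},
    ]
    counts = counts_from_frequencies(frequencies, nsites=20)

Scaling by the number of sites keeps the counts on the scale of the data that
produced the motif, instead of an arbitrary total of one million per column.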
130
Bio.UniProt.GOA now parses Gene Product Information (GPI) files version 1.2.
These files can be downloaded from the EBI FTP site:
ftp://ftp.ebi.ac.uk/pub/databases/GO/goa/
134
135
136Many thanks to the Biopython developers and community for making this release
137possible, especially the following contributors:
138
139- Damien Goutte-Gattat
140- Gert Hulselmans
141- João Rodrigues
142- Markus Piotrowski
143- Sergio Valqui
144- Suyash Gupta
145- Vini Salazar (first contribution)
146- Leighton Pritchard
147
1484 September 2020: Biopython 1.78
149================================
150
151This release of Biopython supports Python 3.6, 3.7 and 3.8. It has also been
152tested on PyPy3.6.1 v7.1.1.
153
154The main change is that ``Bio.Alphabet`` is no longer used. In some cases you
155will now have to specify expected letters, molecule type (DNA, RNA, protein),
156or gap character explicitly. Please consult the updated Tutorial and API
157documentation for guidance. This simplification has sped up many ``Seq``
158object methods. See https://biopython.org/wiki/Alphabet for more information.
159
160``Bio.SeqIO.parse()`` is faster with "fastq" format due to small improvements
161in the ``Bio.SeqIO.QualityIO`` module.
162
163The ``SeqFeature`` object's ``.extract()`` method can now be used for
164trans-spliced locations via an optional dictionary of references.
165
166As in recent releases, more of our code is now explicitly available under
167either our original "Biopython License Agreement", or the very similar but
168more commonly used "3-Clause BSD License".  See the ``LICENSE.rst`` file for
169more details.
170
171Additionally, a number of small bugs and typos have been fixed with additions
172to the test suite. There has been further work to follow the Python PEP8,
173PEP257 and best practice standard coding style, and all of the tests have
174been reformatted with the ``black`` tool to match the main code base.
175
176Many thanks to the Biopython developers and community for making this release
177possible, especially the following contributors:
178
179- Adam Sjøgren (first contribution)
180- Carlos Pena
181- Chris Daley
182- Chris Rands
183- Christian Brueffer
184- Damien Goutte-Gattat
185- João Rodrigues
186- João Vitor F Cavalcante (first contribution)
187- Marie Crane
188- Markus Piotrowski
189- Michiel de Hoon
190- Peter Cock
191- Sergio Valqui
192- Yogesh Kulkarni (first contribution)
193- Zheng Ruan
194
19525 May 2020: Biopython 1.77
196===========================
197
This release of Biopython supports Python 3.6, 3.7 and 3.8. It has also been
199tested on PyPy3.6.1 v7.1.1-beta0.
200
201**We have dropped support for Python 2 now.**
202
203``pairwise2`` now allows the input of parameters with keywords and returns the
204alignments as a list of ``namedtuples``.
205
206The codon tables have been updated to NCBI genetic code table version 4.5,
207which adds Cephalodiscidae mitochondrial as table 33.
208
209Updated ``Bio.Restriction`` to the January 2020 release of REBASE.
210
211A major contribution by Rob Miller to ``Bio.PDB`` provides new methods to
212handle protein structure transformations using dihedral angles (internal
213coordinates). The new framework supports lossless interconversion between
214internal and cartesian coordinates, which, among other uses, simplifies the
analysis and manipulation of coordinates of protein structures.
216
217As in recent releases, more of our code is now explicitly available under
218either our original "Biopython License Agreement", or the very similar but
219more commonly used "3-Clause BSD License".  See the ``LICENSE.rst`` file for
220more details.
221
222Additionally, a number of small bugs and typos have been fixed with further
223additions to the test suite. There has been further work to follow the Python
224PEP8, PEP257 and best practice standard coding style, and all the main code
225base has been reformatted with the ``black`` tool.
226
227Many thanks to the Biopython developers and community for making this release
228possible, especially the following contributors:
229
230- Alexander Decurnou (first contribution)
231- Andrei Istrate (first contribution)
232- Andrey Raspopov
233- Artemi Bendandi (first contribution)
234- Austin Varela (first contribution)
235- Chris Daley
236- Chris Rands
237- Deepak Khatri
238- Hielke Walinga (first contribution)
239- Kai Blin
240- Karthikeyan Singaravelan (first contribution)
241- Konstantinos Zisis (first contribution)
242- Markus Piotrowski
243- Michiel de Hoon
244- Peter Cock
245- Rob Miller
246- Sergio Valqui
247- Steve Bond
248- Sujan Dulal (first contribution)
249- Tianyi Shi (first contribution)
250
25120 December 2019: Biopython 1.76
252================================
253
254This release of Biopython supports Python 2.7, 3.5, 3.6, 3.7 and 3.8. It has
255also been tested on PyPy2.7.13 v7.1.1 and PyPy3.6.1 v7.1.1-beta0.
256
257We intend this to be our final release supporting Python 2.7 and 3.5.
258
259As in recent releases, more of our code is now explicitly available under
260either our original "Biopython License Agreement", or the very similar but
261more commonly used "3-Clause BSD License".  See the ``LICENSE.rst`` file for
262more details.
263
264
``PDBParser`` and ``PDBIO`` now support PQR format file parsing and
input/output.
267
268In addition to the mainstream ``x86_64`` aka ``AMD64`` CPU architecture, we
269now also test every contribution on the ``ARM64``, ``ppc64le``, and ``s390x``
270CPUs under Linux thanks to Travis CI. Further post-release testing done by
271Debian and other packagers and distributors of Biopython also covers these
272CPUs.
273
274``Bio.motifs.PositionSpecificScoringMatrix.search()`` method has been
275re-written: it now applies ``.calculate()`` to chunks of the sequence
276to maintain a low memory footprint for long sequences.
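
The chunking idea can be sketched in plain Python. This is a simplified
illustration, not the ``Bio.motifs`` implementation: the function names are
hypothetical, the matrix is a list of per-position score dictionaries, and
only the forward strand is scanned. The key detail is that each chunk is
extended by the motif length minus one, so windows spanning a chunk boundary
are not missed::

    def score_window(pssm, window):
        """Sum the per-position scores of one motif-length window."""
        return sum(column[letter] for column, letter in zip(pssm, window))


    def search_chunked(pssm, sequence, threshold, chunk_size=1000):
        """Yield (position, score) for windows scoring at least threshold."""
        m = len(pssm)
        n = len(sequence)
        for start in range(0, n - m + 1, chunk_size):
            # Only this slice is held and scored at any one time.
            chunk = sequence[start:start + chunk_size + m - 1]
            for offset in range(len(chunk) - m + 1):
                score = score_window(pssm, chunk[offset:offset + m])
                if score >= threshold:
                    yield start + offset, score


    # Toy matrix that rewards the motif "ACG".
    pssm = [
        {"A": 1.0, "C": -1.0, "G": -1.0, "T": -1.0},
        {"A": -1.0, "C": 1.0, "G": -1.0, "T": -1.0},
        {"A": -1.0, "C": -1.0, "G": 1.0, "T": -1.0},
    ]
    sequence = "ACGTTACGACGA"
    hits = list(search_chunked(pssm, sequence, threshold=3.0, chunk_size=4))

Any chunk size gives the same hits as scanning the whole sequence at once;
only the amount of data scored per step changes.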
277
278Additionally, a number of small bugs and typos have been fixed with further
279additions to the test suite. There has been further work to follow the Python
280PEP8, PEP257 and best practice standard coding style, and more of the code
281style has been reformatted with the ``black`` tool.
282
283Many thanks to the Biopython developers and community for making this release
284possible, especially the following contributors:
285
286- Chris Daley (first contribution)
287- Chris Rands
288- Christian Brueffer
289- Ilya Flyamer (first contribution)
290- Jakub Lipinski (first contribution)
291- Michael R. Crusoe (first contribution)
292- Michiel de Hoon
293- Peter Cock
294- Sergio Valqui
295
2966 November 2019: Biopython 1.75
297===============================
298
299This release of Biopython supports Python 2.7, 3.5, 3.6, 3.7 and is expected
300to work on the soon to be released Python 3.8. It has also been tested on
301PyPy2.7.13 v7.1.1 and PyPy3.6.1 v7.1.1-beta0.
302
303Note we intend to drop Python 2.7 support in early 2020.
304
305The restriction enzyme list in ``Bio.Restriction`` has been updated to the
306August 2019 release of REBASE.
307
308``Bio.SeqIO`` now supports reading and writing files in the native format of
309Christian Marck's DNA Strider program ("xdna" format, also used by Serial
310Cloner), as well as reading files in the native formats of GSL Biotech's
311SnapGene ("snapgene") and Textco Biosoftware's Gene Construction Kit ("gck").
312
313``Bio.AlignIO`` now supports GCG MSF multiple sequence alignments as the "msf"
314format (work funded by the National Marrow Donor Program).
315
316The main ``Seq`` object now has string-like ``.index()`` and ``.rindex()``
317methods, matching the existing ``.find()`` and ``.rfind()`` implementations.
318The ``MutableSeq`` object retains its more list-like ``.index()`` behaviour.
319
320The ``MMTFIO`` class has been added that allows writing of MMTF file format
321files from a Biopython structure object. ``MMTFIO`` has a similar interface to
322``PDBIO`` and ``MMCIFIO``, including the use of a ``Select`` class to write
323out a specified selection. This final addition to read/write support for
324PDB/mmCIF/MMTF in Biopython allows conversion between all three file formats.
325
326Values from mmCIF files are now read in as a list even when they consist of a
327single value. This change improves consistency and reduces the likelihood of
328making an error, but will require user code to be updated accordingly.
329
``Bio.motifs.meme`` has been updated to parse XML output files from MEME
instead of the plain-text output file. The goal of this change is to parse a more
332structured data source with minimal loss of functionality upon future MEME
333releases.
334
335``Bio.PDB`` has been updated to support parsing REMARK 99 header entries from
336PDB-style Astral files.
337
338A new keyword parameter ``full_sequences`` was added to ``Bio.pairwise2``'s
339pretty print method ``format_alignment`` to restore the output of local
340alignments to the 'old' format (showing the whole sequences including the
341un-aligned parts instead of only showing the aligned parts).
342
343A new function ``charge_at_pH(pH)`` has been added to ``ProtParam`` and
344``IsoelectricPoint`` in ``Bio.SeqUtils``.
345
346The ``PairwiseAligner`` in ``Bio.Align`` was extended to allow generalized
347pairwise alignments, i.e. alignments of any Python object, for example
348three-letter amino acid sequences, three-nucleotide codons, and arrays of
349integers.
350
351A new module ``substitution_matrices`` was added to ``Bio.Align``, which
352includes an ``Array`` class that can be used as a substitution matrix. As
353the ``Array`` class is a subclass of a numpy array, mathematical operations
354can be applied to it directly, and C code that makes use of substitution
355matrices can directly access the numerical values stored in the substitution
356matrices. This module is intended as a replacement of ``Bio.SubsMat``,
357which is currently unmaintained.
358
359As in recent releases, more of our code is now explicitly available under
360either our original "Biopython License Agreement", or the very similar but
361more commonly used "3-Clause BSD License".  See the ``LICENSE.rst`` file for
362more details.
363
364Additionally, a number of small bugs and typos have been fixed with further
365additions to the test suite, and there has been further work to follow the
366Python PEP8, PEP257 and best practice standard coding style. We have also
367started to use the ``black`` Python code formatting tool.
368
369Many thanks to the Biopython developers and community for making this release
370possible, especially the following contributors:
371
372- Chris MacRaild
373- Chris Rands
374- Damien Goutte-Gattat (first contribution)
375- Devang Thakkar
376- Harry Jubb
377- Joe Greener
378- Kiran Mukhyala (first contribution)
379- Konstantin Vdovkin
380- Mark Amery
381- Markus Piotrowski
382- Michiel de Hoon
383- Mike Moritz (first contribution)
384- Mustafa Anil Tuncel
385- Nick Negretti
386- Osvaldo Zagordi (first contribution)
387- Peter Cock
388- Peter Kerpedjiev
389- Sergio Valqui
390- Spencer Bliven
391- Victor Lin
392
393
39416 July 2019: Biopython 1.74
395============================
396
397This release of Biopython supports Python 2.7, 3.4, 3.5, 3.6 and 3.7. However,
398it will be the last release to support Python 3.4 which is now at end-of-life.
399It has also been tested on PyPy2.7 v6.0.0 and PyPy3.5 v6.0.0.
400
401As in recent releases, more of our code is now explicitly available under
402either our original "Biopython License Agreement", or the very similar but
403more commonly used "3-Clause BSD License".  See the ``LICENSE.rst`` file for
404more details.
405
406Our core sequence objects (``Seq``, ``UnknownSeq``, and ``MutableSeq``) now
407have a string-like ``.join()`` method.
408
409The NCBI now allows longer accessions in the GenBank file LOCUS line, meaning
410the fields may not always follow the historical column based positions. We
411no longer give a warning when parsing these. We now allow writing such files
412(although with a warning as support for reading them is not yet widespread).
413
414Support for the ``mysqlclient`` package, a fork of MySQLdb, has been added.
415
416We now capture the IDcode field from PDB Header records.
417
418``Bio.pairwise2``'s pretty-print output from ``format_alignment`` has been
optimized for local alignments: If they do not consist of the whole sequences,
only the aligned sections of the sequences are shown, together with the start
421positions of the sequences (in 1-based notation). Alignments of lists will now
422also be prettily printed.
423
424``Bio.SearchIO`` now supports parsing the text output of the HHsuite protein
sequence search tool. The format names are ``hhsuite2-text`` and
426``hhsuite3-text``, for versions 2 and 3 of HHsuite, respectively.
427
``Bio.SearchIO`` HSP objects have a new attribute called ``output_index``. This
429attribute is meant for capturing the order by which the HSP were output in the
430parsed file and is set with a default value of -1 for all HSP objects. It is
431also used for sorting the output of ``QueryResult.hsps``.
432
433``Bio.SeqIO.AbiIO`` has been updated to preserve bytes value when parsing. The
goal of this change is to make the parser more robust by being able to extract
435string-values that are not utf-8-encoded. This affects all tag values, except
436for ID and description values, where they need to be extracted as strings
437to conform to the ``SeqRecord`` interface. In this case, the parser will
438attempt to decode using ``utf-8`` and fall back to the system encoding if that
439fails. This change affects Python 3 only.
440
``Bio.motifs.mast`` has been updated to parse XML output files from MAST
instead of the plain-text output file. The goal of this change is to parse a more
443structured data source with minimal loss of functionality upon future MAST
444releases. Class structure remains the same plus an additional attribute
445``Record.strand_handling`` required for diagram parsing.
446
447``Bio.Entrez`` now automatically retries HTTP requests on failure. The
448maximum number of tries and the sleep between them can be configured by
449changing ``Bio.Entrez.max_tries`` and ``Bio.Entrez.sleep_between_tries``.
450(The defaults are 3 tries and 15 seconds, respectively.)
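
The retry behaviour can be pictured with a generic helper. This is a sketch
of the pattern only, not the ``Bio.Entrez`` code: ``call_with_retries`` is a
hypothetical function, its parameter names merely mirror the two settings
named above, and catching ``OSError`` (the base class of
``urllib.error.URLError``) is an assumption::

    import time


    def call_with_retries(func, max_tries=3, sleep_between_tries=15,
                          sleep=time.sleep):
        """Call func(), retrying on OSError up to max_tries times."""
        for attempt in range(1, max_tries + 1):
            try:
                return func()
            except OSError:
                if attempt == max_tries:
                    raise  # out of tries, report the last failure
                sleep(sleep_between_tries)


    attempts = []


    def flaky():
        """Fail twice, then succeed (stands in for a network request)."""
        attempts.append(1)
        if len(attempts) < 3:
            raise OSError("temporary failure")
        return "ok"


    pauses = []
    # Record the pauses instead of really sleeping, to keep the demo fast.
    result = call_with_retries(flaky, sleep=pauses.append)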
451
452The restriction enzyme list in ``Bio.Restriction`` has been updated to the May
4532019 release of REBASE.
454
455All tests using the older print-and-compare approach have been replaced by
456unittests following Python's standard testing framework.
457
458On the documentation side, all the public modules, classes, methods and
459functions now have docstrings (built in help strings). Furthermore, the PDF
460version of the *Biopython Tutorial and Cookbook* now uses syntax coloring
461for code snippets.
462
463Additionally, a number of small bugs and typos have been fixed with further
464additions to the test suite, and there has been further work to follow the
465Python PEP8, PEP257 and best practice standard coding style.
466
467Many thanks to the Biopython developers and community for making this release
468possible, especially the following contributors:
469
470- Andrey Raspopov (first contribution)
471- Antony Lee
472- Benjamin Rowell (first contribution)
473- Bernhard Thiel
474- Brandon Invergo
475- Catherine Lesuisse
476- Chris Rands
477- Deepak Khatri (first contribution)
478- Gert Hulselmans
479- Jared Andrews
480- Jens Thomas (first contribution)
481- Konstantin Vdovkin
482- Lenna Peterson
483- Mark Amery
484- Markus Piotrowski
485- Micky Yun Chan (first contribution)
486- Nick Negretti
487- Peter Cock
488- Peter Kerpedjiev
489- Ralf Stephan
490- Rob Miller (first contribution)
491- Sergio Valqui
492- Victor Lin
493- Wibowo 'Bow' Arindrarto
494- Zheng Ruan
495
496
49718 December 2018: Biopython 1.73
498================================
499
500This release of Biopython supports Python 2.7, 3.4, 3.5, 3.6 and 3.7.
501It has also been tested on PyPy2.7 v6.0.0 and PyPy3.5 v6.0.0.
502
503As in recent releases, more of our code is now explicitly available under
504either our original "Biopython License Agreement", or the very similar but
505more commonly used "3-Clause BSD License".  See the ``LICENSE.rst`` file for
506more details.
507
508The dictionary-like indexing in SeqIO and SearchIO will now explicitly preserve
509record order to match a behaviour change in the Python standard dict object.
510This means looping over the index will load the records in the on-disk order,
511which will be much faster (previously it would be effectively at random, based
512on the key hash sorting).
513
514The "grant" matrix in Bio.SubsMat.MatrixInfo has been replaced as our original
515values taken from Gerhard Vogt's old webpages at EMBL Heidelberg were
516discovered to be in error. The new values have been transformed following
517Vogt's approach, taking the global maximum 215 minus the similarity scores
518from the original paper Grantham (1974), to give a distance measure.
519
520Additionally, a number of small bugs and typos have been fixed with further
521additions to the test suite, and there has been further work to follow the
522Python PEP8, PEP257 and best practice standard coding style.
523
524Double-quote characters in GenBank feature qualifier values in ``Bio.SeqIO``
525are now escaped as per the NCBI standard. Improperly escaped values trigger a
526warning on parsing.
527
528There is a new command line wrapper for the BWA-MEM sequence mapper.
529
530The string-based FASTA parsers in ``Bio.SeqIO.FastaIO`` have been optimised,
531which also speeds up parsing FASTA files using ``Bio.SeqIO.parse()``.
532
533Many thanks to the Biopython developers and community for making this release
534possible, especially the following contributors:
535
536- Alona Levy-Jurgenson (first contribution)
537- Ariel Aptekmann
538- Brandon Invergo
539- Catherine Lesuisse
540- Chris Rands
541- Darcy Mason (first contribution)
542- Devang Thakkar (first contribution)
543- Ivan Antonov (first contribution)
544- Jeremy LaBarage (first contribution)
545- Juraj Szász (first contribution)
546- Kai Blin
547- Konstantin Vdovkin (first contribution)
548- Manuel Nuno Melo (first contribution)
549- Maximilian Greil
550- Nick Negretti (first contribution)
551- Peter Cock
552- Rona Costello (first contribution)
553- Spencer Bliven
554- Wibowo 'Bow' Arindrarto
555- Yi Hsiao (first contribution)
556
557
55821 June 2018: Biopython 1.72
559============================
560
561This release of Biopython supports Python 2.7, 3.4, 3.5 and 3.6.
562It has also been tested on PyPy2.7 v6.0.0 and PyPy3.5 v6.0.0.
563
564Internal changes to Bio.SeqIO have sped up the SeqRecord .format method and
565SeqIO.write (especially when used in a for loop).
566
567The MAF alignment indexing in Bio.AlignIO.MafIO has been updated to use
568inclusive end co-ordinates to better handle searches at end points. This
569will require you to rebuild any existing MAF index files.
570
571In this release more of our code is now explicitly available under either our
572original "Biopython License Agreement", or the very similar but more commonly
573used "3-Clause BSD License".  See the ``LICENSE.rst`` file for more details.
574
575The Entrez module now supports the NCBI API key. Also you can now set a custom
576directory for DTD and XSD files. This allows Entrez to be used in environments
577like AWS Lambda, which restricts write access to specific directories.
578Improved support for parsing NCBI Entrez XML files that use XSD schemas.
579
580Internal changes to our C code mean that NumPy is no longer required at
581compile time - only at run time (and only for those modules which use NumPy).
582
583Seq, UnknownSeq, MutableSeq and derived classes now support integer
584multiplication methods, matching native Python string methods.
585
586A translate method has been added to Bio.SeqFeature that will extract a
587feature and translate it using the codon_start and transl_table qualifiers
588of the feature if they are present.
589
590Bio.SearchIO is no longer considered experimental, and so it does not raise
591warnings anymore when imported.
592
593A new pairwise sequence aligner is available in Bio.Align, as an alternative
594to the existing pairwise sequence aligner in Bio.pairwise2.
595
596Many thanks to the Biopython developers and community for making this release
597possible, especially the following contributors:
598
599- Benjamin Vaisvil (first contribution)
600- Blaise Li
601- Chad Parmet
602- Chris Rands
603- Connor T. Skennerton
604- Francesco Gastaldello
605- Michiel de Hoon
606- Pamela Russell (first contribution)
607- Peter Cock
608- Spencer Bliven
609- Stefans Mezulis
610- Wibowo 'Bow' Arindrarto
611
612
6133 April 2018: Biopython 1.71
614============================
615
616This release of Biopython supports Python 2.7, 3.4, 3.5 and 3.6.
617It has also been tested on PyPy2.7 v5.10.0 and PyPy3.5 v5.10.1.
618
619Python 3 is the primary development platform for Biopython. We will drop
620support for Python 2.7 no later than 2020, in line with the end-of-life or
621sunset date for Python 2.7 itself.
622
623Encoding issues have been fixed in several parsers when reading data files
624with non-ASCII characters, like accented letters in people's names. This would
625raise ``UnicodeDecodeError: 'ascii' codec can't decode byte ...`` under some
626system locale settings.
627
628Bio.KEGG can now parse Gene files.
629
630The multiple-sequence-alignment object used by Bio.AlignIO etc now supports
631a per-column annotation dictionary, useful for richly annotated alignments
632in the Stockholm/PFAM format.
633
634The SeqRecord object now has a translate method, following the approach used
635for its existing reverse_complement method etc.
636
637The output of function ``format_alignment`` in ``Bio.pairwise2`` for displaying
638a pairwise sequence alignment as text now indicates gaps and mis-matches.
639
640Bio.SeqIO now supports reading and writing two-line-per-record FASTA files
641under the format name "fasta-2line", useful if you wish to work without
642line-wrapped sequences.
643
644Bio.PDB now contains a writer for the mmCIF file format, which has been the
645standard PDB archive format since 2014. This allows structural objects to be
646written out and facilitates conversion between the PDB and mmCIF file formats.
647
648Bio.Emboss.Applications has been updated to fix a wrong parameter in fuzznuc
649wrapper and include a new wrapper for fuzzpro.
650
651The restriction enzyme list in ``Bio.Restriction`` has been updated to the
652November 2017 release of REBASE.
653
654New codon tables 27-31 from NCBI (NCBI genetic code table version 4.2)
655were added to Bio.Data.CodonTable. Note that tables 27, 28 and 31 contain
656no dedicated stop codons; the stop codons in these codes have a context
657dependent encoding as either STOP or as amino acid.
658
659IO functions such as ``SeqIO.parse`` now accept any objects which can be passed
660to the builtin ``open`` function. Specifically, this allows using
661``pathlib.Path`` objects under Python 3.6 and newer, as per `PEP 519
662<https://www.python.org/dev/peps/pep-0519/>`_.
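
A minimal sketch of this pattern, using only the standard library: try to
``open`` whatever was passed in, and fall back to treating it as an already
open handle. The helper ``read_first_line`` is hypothetical and only
illustrates the approach::

    import os
    import tempfile
    from pathlib import Path


    def read_first_line(source):
        """Accept a str path, a path-like object, or an open text handle."""
        try:
            handle = open(source)
        except TypeError:
            # Not something open() accepts; assume it is an open handle.
            return source.readline().rstrip("\n")
        with handle:
            return handle.readline().rstrip("\n")


    with tempfile.TemporaryDirectory() as tmp_dir:
        path = Path(tmp_dir) / "example.fasta"
        path.write_text(">seq1\nACGT\n")
        from_path = read_first_line(path)
        from_str = read_first_line(os.fspath(path))
        with open(path) as stream:
            from_handle = read_first_line(stream)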
663
664Bio.SearchIO can now parse InterProScan XML files.
665
666For Python 3 compatibility, comparison operators for the entities within a
667Bio.PDB Structure object were implemented. These allow the comparison of
models, chains, residues, and atoms with the common operators (==, !=, >, ...).
669Comparisons are based on IDs and take the parents of the entity up to the
670model level into account. For consistent behaviour of all entities the
671operators for atoms were modified to also consider the parent IDs. NOTE: this
672represents a change in behaviour in respect to v1.70 for Atom comparisons. In
673order to mimic the behaviour of previous versions, comparison will have to be
674done for Atom IDs and alternative locations specifically.
675
676In this release more of our code is now explicitly available under either our
677original "Biopython License Agreement", or the very similar but more commonly
678used "3-Clause BSD License".  See the ``LICENSE.rst`` file for more details.
679
680Additionally, a number of small bugs and typos have been fixed with further
681additions to the test suite, and there has been further work to follow the
682Python PEP8, PEP257 and best practice standard coding style.
683
684Many thanks to the Biopython developers and community for making this release
685possible, especially the following contributors:
686
687- Adhemar Zerlotini
688- Ariel Aptekmann
689- Chris Rands
690- Christian Brueffer
691- Connor T. Skennerton
692- Erik Cederstrand (first contribution)
693- Fei Qi (first contribution)
694- Francesco Gastaldello
695- James Jeffryes (first contribution)
696- Jerven Bolleman (first contribution)
697- Joe Greener (first contribution)
698- Joerg Schaarschmidt (first contribution)
699- João Rodrigues
700- Jeroen Van Goey
701- Jun Aruga (first contribution)
702- Kai Blin
703- Kozo Nishida
704- Lewis A. Marshall (first contribution)
705- Markus Piotrowski
706- Michiel de Hoon
707- Nicolas Fontrodona (first contribution)
708- Peter Cock
709- Philip Bergstrom (first contribution)
710- rht (first contribution)
711- Saket Choudhary
712- Shuichiro MAKIGAKI (first contribution)
713- Shyam Saladi (first contribution)
714- Siong Kong
715- Spencer Bliven
716- Stefans Mezulis
717- Steve Bond
718- Yasar L. Ahmed (first contribution)
719- Zachary Sailer (first contribution)
720- Zaid Ur-Rehman (first contribution)
721
722
72310 July 2017: Biopython 1.70
724============================
725
726This release of Biopython supports Python 2.7, 3.4, 3.5 and 3.6 (we have now
727dropped support for Python 3.3). It has also been tested on PyPy v5.7,
728PyPy3.5 v5.8 beta, and Jython 2.7 (although support for Jython is deprecated).
729
730Biopython now has a new logo, contributed by Patrick Kunzmann. Drawing on our
731original logo and the current Python logo, this shows a yellow and blue snake
732forming a double helix.
733
734For installation Biopython now assumes ``setuptools`` is present, and takes
735advantage of this to declare we require NumPy at install time (except under
736Jython). This should help ensure ``pip install biopython`` works smoothly.
737
738Bio.AlignIO now supports Mauve's eXtended Multi-FastA (XMFA) file format
739under the format name "mauve" (contributed by Eric Rasche).
740
741Bio.ExPASy was updated to fix fetching PROSITE and PRODOC records, and return
742text-mode handles for use under Python 3.
743
744Two new arguments for reading and writing blast-xml files have been added
745to the Bio.SearchIO functions (read/parse and write, respectively). They
746are 'use_raw_hit_ids' and 'use_raw_query_ids'. Check out the relevant
747SearchIO.BlastIO documentation for a complete description of what these
748arguments do.
749
750Bio.motifs was updated to support changes in MEME v4.11.4 output.
751
The Bio.Seq sequence objects now have a ``.count_overlap()`` method to
supplement the Python string-like, non-overlapping ``.count()`` method.
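
The difference between the two counting styles can be illustrated with a
minimal pure-Python sketch on plain strings. The helper name
``count_overlapping`` is hypothetical and this is not Biopython's own
implementation:

```python
def count_overlapping(text, sub):
    """Count occurrences of sub in text, allowing matches to overlap."""
    if not sub:
        raise ValueError("sub must not be empty")
    count = 0
    start = text.find(sub)
    while start != -1:
        count += 1
        # Advance by one character only, so overlapping matches are found.
        start = text.find(sub, start + 1)
    return count


sequence = "AAAA"
non_overlapping = sequence.count("AA")           # str.count skips past each match
overlapping = count_overlapping(sequence, "AA")  # matches start at 0, 1 and 2
```

Here the non-overlapping count is 2 while the overlapping count is 3.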
754
755The Bio.SeqFeature location objects can now be compared for equality.
756
757Bio.Phylo.draw_graphviz is now deprecated. We recommend using Bio.Phylo.draw
758instead, or another library or program if more advanced plotting functionality
759is needed.
760
761In Bio.Phylo.TreeConstruction, the DistanceMatrix class (previously
762_DistanceMatrix) has a new method 'format_phylip' to write Phylip-compatible
763distance matrix files (contributed by Jordan Willis).
764
765Additionally, a number of small bugs have been fixed with further additions
766to the test suite, and there has been further work to follow the Python PEP8,
767PEP257 and best practice standard coding style.
768
769Many thanks to the Biopython developers and community for making this release
770possible, especially the following contributors:
771
772- Aaron Kitzmiller (first contribution)
773- Adil Iqbal (first contribution)
774- Allis Tauri
775- Andrew Guy
776- Ariel Aptekmann (first contribution)
777- Ben Fulton
778- Bertrand Caron (first contribution)
779- Chris Rands (first contribution)
780- Connor T. Skennerton
781- Eric Rasche
782- Eric Talevich
783- Francesco Gastaldello
784- François Coste (first contribution)
785- Frederic Sapet (first contribution)
786- Jimmy O'Donnell (first contribution)
787- Jared Andrews (first contribution)
788- John Kern (first contribution)
789- Jordan Willis (first contribution)
790- João Rodrigues
791- Kai Blin
792- Markus Piotrowski
793- Mateusz Korycinski (first contribution)
794- Maximilian Greil
795- Michiel de Hoon
796- morrme (first contribution)
797- Noam Kremen (first contribution)
798- Patrick Kunzmann (first contribution)
799- Peter Cock
800- Rasmus Fonseca (first contribution)
801- Rodrigo Dorantes-Gilardi (first contribution)
802- Sacha Laurent (first contribution)
803- Sourav Singh
804- Ted Cybulski (first contribution)
805- Tiago Antao
806- Wibowo 'Bow' Arindrarto
807- Zheng Ruan
808
809
8106 April 2017: Biopython 1.69
811============================
812
813This release of Biopython supports Python 2.7, 3.3, 3.4, 3.5 and 3.6 (we have
814now dropped support for Python 2.6). It has also been tested on PyPy v5.7,
815PyPy3.5 v5.7 beta, and Jython 2.7.
816
817We have started to dual-license Biopython under both our original liberal
818"Biopython License Agreement", and the very similar but more commonly used
819"3-Clause BSD License". In this release a small number of the Python files
820are explicitly available under either license, but most of the code remains
821under the "Biopython License Agreement" only. See the ``LICENSE.rst`` file
822for more details.
823
824We now expect and take advantage of NumPy under PyPy, and compile most of the
825Biopython C code modules as well.
826
827Bio.AlignIO now supports the UCSC Multiple Alignment Format (MAF) under the
828format name "maf", using new module Bio.AlignIO.MafIO which also offers
829indexed access to these potentially large files using SQLite3 (contributed by
830Andrew Sczesnak, with additional refinements from Adam Novak).
831
Bio.SeqIO.AbiIO has been extended to support parsing FSA files. The
underlying format (ABIF) remains the same as AB1 files and so the string
'abi' is the expected format argument in the main SeqIO functions. AbiIO
determines whether the file is AB1 or FSA based on the presence of specific
tags.
837
838The Uniprot parser is now able to parse "submittedName" elements in XML files.
839
840The NEXUS parser handling of internal node comments has been improved, which
841should help if working with tools like the BEAST TreeAnnotator. Slashes are
842now also allowed in identifiers.
843
844New parser for ExPASy Cellosaurus, a cell line database, cell line catalogue,
845and cell line ontology (contributed by Steve Marshall).
846
847For consistency the Bio.Seq module now offers a complement function (already
848available as a method on the Seq and MutableSeq objects).
849
The SeqFeature object's qualifiers attribute is now an explicitly ordered
dictionary (note that as of Python 3.6 the Python dict is ordered by default
anyway). This helps reproduce GenBank/EMBL files on input/output.
853
854The Bio.SeqIO UniProt-XML parser was updated to cope with features with
855unknown locations which can be found in mass spec data.
856
857The Bio.SeqIO GenBank, EMBL, and IMGT parsers now record the molecule type
858from the LOCUS/ID line explicitly in the record.annotations dictionary.
859The Bio.SeqIO EMBL parser was updated to cope with more variants seen in
860patent data files, and the related IMGT parser was updated to cope with
861IPD-IMGT/HLA database files after release v3.16.0 when their ID line changed.
862The GenBank output now uses colon space to match current NCBI DBLINK lines.
863
864The Bio.Affy package supports Affymetrix version 4 of the CEL file format,
865in addition to version 3.
866
867The restriction enzyme list in ``Bio.Restriction`` has been updated to the
868February 2017 release of REBASE.
869
Bio.PDB.PDBList can now download PDBx/mmCif (new default), PDB (old default),
PDBML/XML and mmtf format protein structures. This is in line with the RCSB
recommendation to use PDBx/mmCif and deprecate the PDB file format. Biopython
already has support for parsing mmCif files.
874
875Additionally, a number of small bugs have been fixed with further additions
876to the test suite, and there has been further work to follow the Python PEP8,
877PEP257 and best practice standard coding style.
878
879Many thanks to the Biopython developers and community for making this release
880possible, especially the following contributors:
881
882- Aaron Rosenfeld
883- Adam Kurkiewicz (first contribution)
884- Adam Novak (first contribution)
885- Adrian Altenhoff (first contribution)
886- Allis Tauri (first contribution)
887- Andrew Dalke
888- Andrew Guy (first contribution)
889- Andrew Sczesnak (first contribution)
890- Ben Fulton
891- Bernhard Thiel (first contribution)
892- Bertrand Néron
893- Blaise Li (first contribution)
894- Brandon Carter (first contribution)
895- Brandon Invergo
896- Carlos Pena
897- Carlos Ríos
898- Chris Warth
899- Emmanuel Noutahi
900- Foen Peng (first contribution)
901- Francesco Gastaldello (first contribution)
902- Francisco Pina-Martins (first contribution)
903- Hector Martinez (first contribution)
904- Jacek Śmietański
905- Jack Twilley (first contribution)
906- Jeroen Van Goey (first contribution)
907- Joshua Meyers (first contribution)
908- Kurt Graff (first contribution)
909- Lenna Peterson
910- Leonhard Heizinger (first contribution)
911- Marcin Magnus (first contribution)
912- Markus Piotrowski
913- Maximilian Greil (first contribution)
914- Michał J. Gajda (first contribution)
915- Michiel de Hoon
916- Milind Luthra (first contribution)
917- Oscar G. Garcia (first contribution)
918- Owen Solberg
919- Peter Cock
920- Richard Neher (first contribution)
921- Sebastian Bassi
922- Sourav Singh (first contribution)
923- Spencer Bliven (first contribution)
924- Stefans Mezulis
925- Steve Bond
926- Steve Marshall (first contribution)
927- Uri Laserson
928- Veronika Berman (first contribution)
929- Vincent Davis
930- Wibowo 'Bow' Arindrarto
931
932
93325 August 2016: Biopython 1.68
934==============================
935
936This release of Biopython supports Python 2.6, 2.7, 3.3, 3.4 and 3.5, but
937this will be our final release to run on Python 2.6. It has also been tested
938on PyPy 5.0, PyPy3 version 2.4, and Jython 2.7.
939
Bio.PDB has been extended to parse the RCSB's new binary Macromolecular
941Transmission Format (MMTF, see http://mmtf.rcsb.org), in addition to the
942mmCIF and PDB file formats (contributed by Anthony Bradley). This requires
943an optional external dependency on the mmtf-python library.
944
945Module Bio.pairwise2 has been re-written (contributed by Markus Piotrowski).
946It is now faster, addresses some problems with local alignments, and also
947now allows gap insertions after deletions, and vice versa, inspired by the
948https://doi.org/10.1101/031500 preprint from Flouri et al.
949
950The two sample graphical tools SeqGui (Sequence Graphical User Interface)
951and xbbtools were rewritten (SeqGui) or updated (xbbtools) using the tkinter
952library (contributed by Markus Piotrowski). SeqGui allows simple nucleotide
953transcription, back-transcription and translation into amino acids using
Bio.Seq internally, offering the NCBI genetic codes supported in Biopython.
955xbbtools is able to open Fasta formatted files, does simple nucleotide
956operations and translations in any reading frame using one of the NCBI genetic
957codes. In addition, it supports standalone Blast installations to do local
958Blast searches.
959
960New NCBI genetic code table 26 (Pachysolen tannophilus Nuclear Code) has been
961added to Bio.Data (and the translation functionality), and table 11 is now
962also available under the alias Archaeal.
963
964In line with NCBI website changes, Biopython now uses HTTPS rather than HTTP
965to connect to the NCBI Entrez and QBLAST API.
966
967Additionally, a number of small bugs have been fixed with further additions
968to the test suite, and there has been further work to follow the Python PEP8
969and best practice standard coding style.
970
971Many thanks to the Biopython developers and community for making this release
972possible, especially the following contributors:
973
974- Anthony Bradley (first contribution)
975- Ben Fulton
976- Carlos Pena
977- Connor T. Skennerton
978- Iddo Friedberg
979- Kai Blin
980- Kristian Davidsen (first contribution)
981- Markus Piotrowski
982- Olivier Morelle (first contribution)
983- Peter Cock
984- Stefans Mezulis (first contribution)
985- Tiago Antao
986- Travis Wrightsman
987- Uwe Schmitt (first contribution)
988- Xiaoyu Zhuo (first contribution)
989
990
9918 June 2016: Biopython 1.67
992===========================
993
994This release of Biopython supports Python 2.6, 2.7, 3.3, 3.4 and 3.5, but
995support for Python 2.6 is considered to be deprecated. It has also been
996tested on PyPy 5.0, PyPy3 version 2.4, and Jython 2.7.
997
998Comparison of SeqRecord objects until now has used the default Python object
999comparison (are they the same instance in memory?). This can be surprising, but
1000comparing all of the attributes would be too complex. As of this release
1001attempting to compare SeqRecord objects should raise an exception instead. If
1002you want the old behaviour, use id(record1) == id(record2) instead.
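
As a rough illustration of this behaviour, the sketch below uses a
hypothetical ``Record`` class rather than the real SeqRecord. The exception
type ``NotImplementedError`` is an assumption made for the example; the note
above only states that an exception is raised:

```python
class Record:
    """Hypothetical stand-in for a record type that refuses comparison."""

    def __init__(self, name):
        self.name = name

    def __eq__(self, other):
        # Assumed exception type, chosen for illustration only.
        raise NotImplementedError("Compare attributes, or use id() for identity.")

    # Defining __eq__ removes the default __hash__; restore identity hashing.
    __hash__ = object.__hash__


first = Record("example")
alias = first
second = Record("example")

# The old behaviour (same instance in memory?) is still available via id().
same_object = id(first) == id(alias)
different_objects = id(first) == id(second)

try:
    first == second
    comparison_raised = False
except NotImplementedError:
    comparison_raised = True
```

Comparing individual attributes explicitly (for example the identifiers or
the sequences) is the alternative when equality of content is what matters.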
1003
1004New experimental module Bio.phenotype is for working with Phenotype Microarray
1005plates in JSON and the machine vendor's CSV format (contributed by Marco
1006Galardini).
1007
1008Following the convention used elsewhere in Biopython, there is a new function
1009Bio.KEGG.read(...) for parsing KEGG files expected to contain a single record
1010only - the existing function Bio.KEGG.parse(...) is intended to be used to
1011iterate over multi-record files.
1012
1013When a gap character is defined, Bio.Seq will now translate gap codons
1014(e.g. "---") into a single gap ("-") in the protein sequence. The gap character
1015is inferred from the Seq object's alphabet, but it can also be passed as an
1016argument to the translate method.
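
The idea can be sketched in plain Python as follows. The function name
``translate_with_gaps`` and the deliberately tiny codon table are
assumptions for illustration only, not the Bio.Seq implementation:

```python
# Deliberately tiny, illustrative codon table (not a full genetic code).
CODON_TABLE = {"ATG": "M", "GCC": "A", "AAA": "K", "TAA": "*"}


def translate_with_gaps(nucleotides, gap="-"):
    """Translate codon by codon, mapping an all-gap codon to a single gap."""
    if len(nucleotides) % 3:
        raise ValueError("sequence length must be a multiple of three")
    protein = []
    for i in range(0, len(nucleotides), 3):
        codon = nucleotides[i:i + 3]
        if codon == gap * 3:
            protein.append(gap)
        else:
            # Raises KeyError for codons outside the toy table,
            # including codons that mix gaps and bases.
            protein.append(CODON_TABLE[codon])
    return "".join(protein)


protein = translate_with_gaps("ATG---GCCAAA")
```

With this toy table the result is ``M-AK``: the gap codon becomes a single
gap character in the protein.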
1017
1018The new NCBI genetic code table 25, covering Candidate Division SR1 and
1019Gracilibacteria, has been added to Bio.Data (and the translation
1020functionality).
1021
1022The Bio.Entrez interface will automatically use an HTTP POST rather than
1023HTTP GET if the URL would exceed 1000 characters. This is based on NCBI
1024guidelines and the fact that very long queries like complex searches can
1025otherwise trigger an HTTP Error 414 Request URI too long.
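
A minimal sketch of this kind of length check is shown below. The function
name ``choose_http_method`` is hypothetical and the logic is a simplification
under stated assumptions, not the actual Bio.Entrez code:

```python
from urllib.parse import urlencode

URL_LENGTH_LIMIT = 1000  # threshold quoted in the note above


def choose_http_method(base_url, params):
    """Return "POST" if the full GET URL would exceed the limit, else "GET"."""
    full_url = base_url + "?" + urlencode(params)
    return "POST" if len(full_url) > URL_LENGTH_LIMIT else "GET"


base = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi"
short_query = {"db": "pubmed", "term": "biopython"}
long_query = {"db": "pubmed", "term": " OR ".join(["biopython"] * 200)}

short_method = choose_http_method(base, short_query)
long_method = choose_http_method(base, long_query)
```

The short search stays a GET request, while the long search term pushes the
URL past the limit and is sent as a POST.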
1026
1027Foreign keys are now used when creating BioSQL databases with SQLite3 (this
1028was not possible until SQLite version 3.6.19). The BioSQL taxonomy code now
1029updates the taxon table left/right keys when updating the taxonomy.
1030
There have been some fixes to the MMCIF structure parser which now uses
identifiers which better match results from the PDB structure parser.
1033
1034The restriction enzyme list in ``Bio.Restriction`` has been updated to the
1035May 2016 release of REBASE.
1036
1037The mmCIF parser in Bio.PDB.MMCIFParser has been joined by a second version
1038which only looks at the ATOM and HETATM lines and can be much faster.
1039
1040The Bio.KEGG.REST will now return unicode text-based handles, except for
1041images which remain as binary bytes-based handles, making it easier to use
1042with the mostly text-based parsers in Biopython.
1043
1044Note that the BioSQL test configuration information is now in a new file
1045Tests/biosql.ini rather than directly in Tests/test_BioSQL_*.py as before.
1046You can make a copy of the provided example file Tests/biosql.ini.sample
1047as Tests/biosql.ini and edit this if you wish to run the BioSQL tests.
1048
1049Additionally, a number of small bugs have been fixed with further additions
1050to the test suite, and there has been further work to follow the Python PEP8
1051standard coding style, and in converting our docstring documentation to use
1052the reStructuredText markup style.
1053
1054Many thanks to the Biopython developers and community for making this release
1055possible, especially the following contributors:
1056
1057- Aaron Rosenfeld (first contribution)
1058- Anders Pitman (first contribution)
1059- Barbara Mühlemann (first contribution)
1060- Ben Fulton
1061- Ben Woodcroft (first contribution)
1062- Brandon Invergo
1063- Brian Osborne (first contribution)
1064- Carlos Pena
1065- Chaitanya Gupta (first contribution)
1066- Chris Warth (first contribution)
1067- Christiam Camacho (first contribution)
1068- Connor T. Skennerton
1069- David Koppstein (first contribution)
1070- Eric Talevich
1071- Jacek Śmietański (first contribution)
1072- João D Ferreira (first contribution)
1073- João Rodrigues
1074- Joe Cora (first contribution)
1075- Kai Blin
1076- Leighton Pritchard
1077- Lenna Peterson
1078- Marco Galardini (first contribution)
1079- Markus Piotrowski
1080- Matt Ruffalo (first contribution)
1081- Matteo Sticco (first contribution)
1082- Nader Morshed (first contribution)
1083- Owen Solberg (first contribution)
1084- Peter Cock
1085- Steve Bond (first contribution)
1086- Terry Jones (first contribution)
1087- Vincent Davis
1088- Zheng Ruan
1089
1090
109121 October 2015: Biopython 1.66
1092===============================
1093
1094This release of Biopython supports Python 2.6, 2.7, 3.3, 3.4 and 3.5, but
1095support for Python 2.6 is considered to be deprecated. It has also been
1096tested on PyPy 2.4 to 2.6, PyPy3 version 2.4, and Jython 2.7.
1097
1098Further work on the Bio.KEGG and Bio.Graphics modules now allows drawing KGML
1099pathways with transparency.
1100
1101The Bio.SeqIO "abi" parser now decodes almost all the documented fields used
1102by the ABIF instruments - including the individual color channels.
1103
1104Bio.PDB now has a QCPSuperimposer module using the Quaternion Characteristic
1105Polynomial algorithm for superimposing structures. This is a fast alternative
1106to the existing SVDSuperimposer code using singular value decomposition.
1107
1108Bio.Entrez now implements the NCBI Entrez Citation Matching function
1109(ECitMatch), which retrieves PubMed IDs (PMIDs) that correspond to a set of
1110input citation strings.
1111
1112Bio.Entrez.parse(...) now supports NCBI XML files using XSD schemas, which
1113will be downloaded and cached like NCBI DTD files.
1114
1115A subtle bug in how multi-part GenBank/EMBL locations on the reverse strand
1116were parsed into CompoundLocations was fixed: complement(join(...)) as used
1117by NCBI worked, but join(complement(...),complement(...),...) as used by
1118EMBL/ENSEMBL gave the CompoundLocation parts in the wrong order. A related
1119bug when taking the reverse complement of a SeqRecord containing features
1120with CompoundLocations was also fixed.
1121
1122Additionally, a number of small bugs have been fixed with further additions
1123to the test suite, and there has been further work on conforming to the
1124Python PEP8 standard coding style.
1125
1126Many thanks to the Biopython developers and community for making this release
1127possible, especially the following contributors:
1128
1129- Alan Medlar (first contribution)
1130- Anthony Mathelier (first contribution)
1131- Antony Lee (first contribution)
1132- Anuj Sharma (first contribution)
1133- Ben Fulton (first contribution)
1134- Bertrand Néron
1135- Brandon Invergo
1136- Carlos Pena
1137- Christian Brueffer
1138- Connor T. Skennerton (first contribution)
1139- David Arenillas (first contribution)
1140- David Nicholson (first contribution)
1141- Emmanuel Noutahi (first contribution)
1142- Eric Rasche (first contribution)
1143- Fabio Madeira (first contribution)
1144- Franco Caramia (first contribution)
1145- Gert Hulselmans (first contribution)
1146- Gleb Kuznetsov (first contribution)
1147- João Rodrigues
1148- John Bradley (first contribution)
1149- Kai Blin
1150- Kian Ho (first contribution)
1151- Kozo Nishida (first contribution)
1152- Kuan-Yi Li (first contribution)
1153- Leighton Pritchard
1154- Lucas Sinclair
1155- Michiel de Hoon
1156- Peter Cock
1157- Saket Choudhary
1158- Sunhwan Jo (first contribution)
1159- Tarcisio Fedrizzi (first contribution)
1160- Tiago Antao
1161- Vincent Davis
1162
1163
116417 December 2014: Biopython 1.65 released.
1165==========================================
1166
The Biopython sequence objects now use string comparison, rather than Python's
object comparison. This has been planned for a long time with warning messages
in place (under Python 2; the warnings were sadly missing under Python 3).
1170
1171The Bio.KEGG and Bio.Graphics modules have been expanded with support for
1172the online KEGG REST API, and parsing, representing and drawing KGML pathways.
1173
1174The Pterobranchia Mitochondrial genetic code has been added to Bio.Data (and
1175the translation functionality), which is the new NCBI genetic code table 24.
1176
1177The Bio.SeqIO parser for the ABI capillary file format now exposes all the raw
1178data in the SeqRecord's annotation as a dictionary. This allows further
1179in-depth analysis by advanced users.
1180
1181Bio.SearchIO QueryResult objects now allow Hit retrieval using its alternative
1182IDs (any IDs listed after the first one, for example as used with the NCBI
1183BLAST NR database).
1184
1185We have also done some more work applying PEP8 coding styles to Biopython.
1186
1187Bio.SeqUtils.MeltingTemp has been rewritten with new functionality.
1188
1189The new experimental module Bio.CodonAlign has been renamed Bio.codonalign
1190(and similar lower case PEP8 style module names have been used for the
1191sub-modules within this).
1192
1193Bio.SeqIO.index_db(...) and Bio.SearchIO.index_db(...) now store any relative
1194filenames relative to the index file, rather than (as before) relative to the
1195current directory at the time the index was built. This makes the indexes
1196less fragile, so that they can be used from other working directories. NOTE:
1197This change is backward compatible (old index files work as before), however
1198relative paths in new indexes will not work on older versions of Biopython!
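
The path handling described above can be sketched with the standard library.
The two helper names are hypothetical, and ``posixpath`` is used only so that
the example behaves the same on every platform; this is not the index code
itself:

```python
import posixpath


def path_stored_in_index(index_path, data_path):
    """Return data_path expressed relative to the index file's directory."""
    index_dir = posixpath.dirname(posixpath.abspath(index_path))
    return posixpath.relpath(posixpath.abspath(data_path), index_dir)


def path_resolved_from_index(index_path, stored_path):
    """Rebuild the data file location from the index file's location."""
    index_dir = posixpath.dirname(posixpath.abspath(index_path))
    return posixpath.normpath(posixpath.join(index_dir, stored_path))


stored = path_stored_in_index(
    "/data/project/reads.idx", "/data/project/fastq/run1.fastq"
)
resolved = path_resolved_from_index("/data/project/reads.idx", stored)
```

Because the stored path depends only on where the index file lives, the
result is the same whatever the current working directory is.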
1199
1200Biopython also seems to work fine under PyPy3 2.4 which implements Python 3.2
1201plus unicode string literals.
1202
1203Many thanks to the Biopython developers and community for making this release
1204possible, especially the following contributors:
1205
1206- Alan Du (first contribution)
1207- Carlos Pena (first contribution)
1208- Colin Lappala (first contribution)
1209- Christian Brueffer
1210- David Bulger (first contribution)
1211- Eric Talevich
1212- Evan Parker (first contribution)
1213- Hongbo Zhu
1214- Kai Blin
1215- Kevin Wu (first contribution)
1216- Leighton Pritchard
1217- Leszek Pryszcz (first contribution)
1218- Markus Piotrowski
1219- Matt Shirley (first contribution)
1220- Mike Cariaso (first contribution)
1221- Peter Cock
1222- Seth Sims (first contribution)
1223- Tiago Antao
1224- Travis Wrightsman (first contribution)
1225- Tyghe Vallard (first contribution)
1226- Vincent Davis
1227- Wibowo 'Bow' Arindrarto
1228- Zheng Ruan
1229
1230
123129 May 2014: Biopython 1.64 released.
1232=====================================
1233
This release of Biopython supports Python 2.6, 2.7, 3.3 and also the
new 3.4 version. It is also tested on PyPy 2.0 to 2.3, and Jython 2.7b2.
1236
1237The new experimental module Bio.CodonAlign facilitates building codon
1238alignment and further analysis upon it. This work is from the Google
1239Summer of Code (GSoC) project by Zheng Ruan.
1240
1241Bio.Phylo now has tree construction and consensus modules, from the
1242GSoC work by Yanbo Ye.
1243
1244Bio.Entrez will now automatically download and cache new NCBI DTD files for
1245XML parsing under the user's home directory (using ``~/.biopython`` on
1246Unix like systems, and ``$APPDATA/biopython`` on Windows).
1247
1248Bio.Sequencing.Applications now includes a wrapper for the samtools command
1249line tool.
1250
1251Bio.PopGen.SimCoal now also supports fastsimcoal.
1252
1253SearchIO hmmer3-text, hmmer3-tab, and hmmer3-domtab now support output from
1254hmmer3.1b1.
1255
1256The ``accession`` of QueryResult and Hit objects created when using the
1257'hmmer3-tab' format are now properly named as ``accession`` (previously they
1258were ``acc``, deviating from the documentation).
1259
The ``homology`` key in the ``aln_annotation`` attribute of an HSP object in
1261Bio.SearchIO has been renamed to ``similarity``.
1262
1263The Bio.SeqUtils masses and molecular_weight function have been updated.
1264
1265BioSQL can now use the mysql-connector package (available for Python 2, 3
1266and PyPy) as an alternative to MySQLdb (Python 2 only) to connect to a MySQL
1267database.
1268
1269Many thanks to the Biopython developers and community for making this release
1270possible, especially the following contributors:
1271
1272- Chunlei Wu (first contribution)
1273- Edward Liaw (first contribution)
1274- Eric Talevich
1275- Leighton Pritchard
1276- Manlio Calvi (first contribution)
1277- Markus Piotrowski (first contribution)
1278- Melissa Gymrek (first contribution)
1279- Michiel de Hoon
1280- Nigel Delaney
1281- Peter Cock
1282- Saket Choudhary
1283- Tiago Antao
1284- Vincent Davis (first contribution)
1285- Wibowo 'Bow' Arindrarto
1286- Yanbo Ye (first contribution)
1287- Zheng Ruan (first contribution)
1288
1289
12904 December 2013: Biopython 1.63 released.
1291=========================================
1292
1293This release supports Python 3.3 onwards without conversion via the 2to3
1294library. See the Biopython 1.63 beta release notes below for details. Since
1295the beta release we have made some minor bug fixes and test improvements.
1296
1297The restriction enzyme list in Bio.Restriction has been updated to the
1298December 2013 release of REBASE.
1299
1300Additional contributors since the beta:
1301
1302- Gokcen Eraslan (first contribution)
1303
1304
130512 November 2013: Biopython 1.63 beta released.
1306===============================================
1307
This is a beta release for testing purposes. The main reason for a
beta version is the large number of changes imposed by the removal of
the 2to3 library previously required for the support of Python 3.X.
This was made possible by dropping Python 2.5 (and Jython 2.5).
1312
1313This release of Biopython supports Python 2.6 and 2.7, and also Python
13143.3.
1315
1316The Biopython Tutorial & Cookbook, and the docstring examples in the source
1317code, now use the Python 3 style print function in place of the Python 2
1318style print statement. This language feature is available under Python 2.6
1319and 2.7 via::
1320
1321    from __future__ import print_function
1322
1323Similarly we now use the Python 3 style built-in next function in place of
1324the Python 2 style iterators' .next() method. This language feature is also
1325available under Python 2.6 and 2.7.
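
A short sketch of the built-in ``next`` function and the print function in
use (the variable names are illustrative only):

```python
records = iter(["first", "second"])

first = next(records)            # replaces records.next() from Python 2
second = next(records)
exhausted = next(records, None)  # optional default instead of StopIteration

print(first, second, exhausted)  # print is called as a function
```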
1326
1327Many thanks to the Biopython developers and community for making this release
1328possible, especially the following contributors:
1329
1330- Chris Mitchell (first contribution)
1331- Christian Brueffer
1332- Eric Talevich
1333- Josha Inglis (first contribution)
1334- Konstantin Tretyakov (first contribution)
1335- Lenna Peterson
1336- Martin Mokrejs
1337- Nigel Delaney (first contribution)
1338- Peter Cock
1339- Sergei Lebedev (first contribution)
1340- Tiago Antao
1341- Wayne Decatur (first contribution)
1342- Wibowo 'Bow' Arindrarto
1343
1344
134528 August 2013: Biopython 1.62 released.
1346========================================
1347
1348This is our first release to officially support Python 3, however it is
1349also our final release supporting Python 2.5. Specifically this release
1350is supported and tested on standard Python 2.5, 2.6, 2.7 and 3.3.
1351It was also tested under Jython 2.5, 2.7 and PyPy 1.9, 2.0.
1352
1353See the Biopython 1.62 beta release notes below for most changes. Since the
1354beta release we have added several minor bug fixes and test improvements.
1355Additional contributors since the beta:
1356
1357- Bertrand Néron (first contribution)
1358- Lenna Peterson
1359- Martin Mokrejs
1360- Matsuyuki Shirota (first contribution)
1361
1362
136315 July 2013: Biopython 1.62 beta released.
1364===========================================
1365
1366This is a beta release for testing purposes, both for new features added,
1367and changes to location parsing, but more importantly Biopython 1.62 will
1368be our first release to officially support Python 3.
1369
1370Specifically we intend Biopython 1.62 to support standard Python 2.5, 2.6, 2.7
1371and 3.3, but the release will also be tested under Jython 2.5, 2.7 and PyPy
13721.9, 2.0 as well. It will be our final release supporting Python 2.5.
1373
1374The translation functions will give a warning on any partial codons (and this
1375will probably become an error in a future release). If you know you are dealing
1376with partial sequences, either pad with N to extend the sequence length to a
1377multiple of three, or explicitly trim the sequence.
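
Both options can be sketched on plain strings as follows. The helper names
``pad_to_codons`` and ``trim_to_codons`` are hypothetical and are not part of
Biopython:

```python
def pad_to_codons(sequence, pad="N"):
    """Append pad characters until the length is a multiple of three."""
    remainder = len(sequence) % 3
    if remainder:
        sequence += pad * (3 - remainder)
    return sequence


def trim_to_codons(sequence):
    """Drop trailing bases that do not form a complete codon."""
    return sequence[: len(sequence) - len(sequence) % 3]


padded = pad_to_codons("ATGGCCA")    # 7 bases become 9
trimmed = trim_to_codons("ATGGCCA")  # 7 bases become 6
```

Padding keeps the trailing bases (the final codon then translates as an
ambiguous residue), whereas trimming discards them.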
1378
1379The handling of joins and related complex features in Genbank/EMBL files has
1380been changed with the introduction of a CompoundLocation object. Previously
1381a SeqFeature for something like a multi-exon CDS would have a child SeqFeature
1382(under the sub_features attribute) for each exon. The sub_features property
1383will still be populated for now, but is deprecated and will in future be
1384removed. Please consult the examples in the help (docstrings) and Tutorial.
1385
1386Thanks to the efforts of Ben Morris, the Phylo module now supports the file
1387formats NeXML and CDAO. The Newick parser is also significantly faster, and can
1388now optionally extract bootstrap values from the Newick comment field (like
1389Molphy and Archaeopteryx do). Nate Sutton added a wrapper for FastTree to
1390Bio.Phylo.Applications.
1391
1392New module Bio.UniProt adds parsers for the GAF, GPA and GPI formats from
1393UniProt-GOA.
1394
1395The BioSQL module is now supported in Jython. MySQL and PostgreSQL databases
1396can be used. The relevant JDBC driver should be available in the CLASSPATH.
1397
1398Feature labels on circular GenomeDiagram figures now support the label_position
1399argument (start, middle or end) in addition to the current default placement,
1400and in a change to prior releases these labels are outside the features which
1401is now consistent with the linear diagrams.
1402
1403The code for parsing 3D structures in mmCIF files was updated to use the
1404Python standard library's shlex module instead of C code using flex.
1405
1406The Bio.Sequencing.Applications module now includes a BWA command line wrapper.
1407
Bio.motifs supports JASPAR format files with multiple position-frequency
matrices.
1410
1411Additionally there have been other minor bug fixes and more unit tests.
1412
1413Many thanks to the Biopython developers and community for making this release
1414possible, especially the following contributors:
1415
1416- Alexander Campbell (first contribution)
1417- Andrea Rizzi (first contribution)
1418- Anthony Mathelier (first contribution)
1419- Ben Morris (first contribution)
1420- Brad Chapman
1421- Christian Brueffer
1422- David Arenillas (first contribution)
1423- David Martin (first contribution)
1424- Eric Talevich
1425- Iddo Friedberg
1426- Jian-Long Huang (first contribution)
1427- Joao Rodrigues
1428- Kai Blin
1429- Michiel de Hoon
1430- Nate Sutton (first contribution)
1431- Peter Cock
1432- Petra Kubincová (first contribution)
1433- Phillip Garland
1434- Saket Choudhary (first contribution)
1435- Tiago Antao
1436- Wibowo 'Bow' Arindrarto
1437- Xabier Bello (first contribution)
1438
1439
14405 February 2013: Biopython 1.61 released.
1441=========================================
1442
1443GenomeDiagram has three new sigils (shapes to illustrate features). OCTO shows
1444an octagonal shape, like the existing BOX sigil but with the corners cut off.
1445JAGGY shows a box with jagged edges at the start and end, intended for things
1446like NNNNN regions in draft genomes. Finally BIGARROW is like the existing
1447ARROW sigil but is drawn straddling the axis. This is useful for drawing
1448vertically compact figures where you do not have overlapping genes.
1449
1450New module Bio.Graphics.ColorSpiral can generate colors along a spiral path
1451through HSV color space. This can be used to make arbitrary 'rainbow' scales,
1452for example to color features or cross-links on a GenomeDiagram figure.
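
To give a feel for the general idea, the sketch below samples colors along a
simple spiral in HSV space using only the standard library. The function
name ``spiral_colors`` and all of its parameters are assumptions for
illustration; they do not reproduce the Bio.Graphics.ColorSpiral interface
or its exact path:

```python
import colorsys


def spiral_colors(n, start_hue=0.0, turns=0.8, v_start=0.6, v_end=1.0):
    """Return n (r, g, b) tuples sampled along a spiral in HSV space."""
    colors = []
    for i in range(n):
        t = i / max(n - 1, 1)                    # position along the path, 0..1
        hue = (start_hue + turns * t) % 1.0      # rotate around the hue axis
        value = v_start + (v_end - v_start) * t  # climb in brightness
        colors.append(colorsys.hsv_to_rgb(hue, 1.0, value))
    return colors


palette = spiral_colors(5)
```

Each tuple holds red, green and blue components between 0 and 1, which is a
form most plotting code can consume directly.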
1453
1454The Bio.SeqIO module now supports reading sequences from PDB files in two
1455different ways. The "pdb-atom" format determines the sequence as it appears in
1456the structure based on the atom coordinate section of the file (via Bio.PDB,
1457so NumPy is currently required for this). Alternatively, you can use the
1458"pdb-seqres" format to read the complete protein sequence as it is listed in
1459the PDB header, if available.
1460
The Bio.SeqUtils module now has a seq1 function to turn a sequence using three
letter amino acid codes into one using the more common one letter codes. This
acts as the inverse of the existing seq3 function.
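
The relationship between the two directions can be sketched in plain Python.
The helper names and the small mapping below are illustrative assumptions
covering only a few residues, not the tables used by Bio.SeqUtils:

```python
# Small illustrative subset of the three letter amino acid codes.
THREE_TO_ONE = {"Met": "M", "Ala": "A", "Lys": "K", "Gly": "G"}
ONE_TO_THREE = {one: three for three, one in THREE_TO_ONE.items()}


def three_to_one(sequence):
    """Convert text such as 'MetAlaLys' into 'MAK'."""
    if len(sequence) % 3:
        raise ValueError("length must be a multiple of three")
    return "".join(
        THREE_TO_ONE[sequence[i:i + 3]] for i in range(0, len(sequence), 3)
    )


def one_to_three(sequence):
    """Convert text such as 'MAK' into 'MetAlaLys'."""
    return "".join(ONE_TO_THREE[letter] for letter in sequence)


short_form = three_to_one("MetAlaLys")
round_trip = one_to_three(short_form)
```

Applying one conversion after the other returns the original text, which is
what "inverse" means here.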
1464
The multiple-sequence-alignment object used by Bio.AlignIO etc now supports
an annotation dictionary. Additional support for per-column annotation is
planned, with addition and slicing to work like that for the SeqRecord
per-letter annotation.
1469
1470A new warning, Bio.BiopythonExperimentalWarning, has been introduced. This
1471marks any experimental code included in the otherwise stable release. Such
1472'beta' level code is ready for wider testing, but still likely to change and
1473should only be tried by early adopters to give feedback via the biopython-dev
1474mailing list. We'd expect such experimental code to reach stable status in
one or two releases' time, at which point our normal policies about trying to
1476preserve backwards compatibility would apply. See also the README file.
1477
1478This release also includes Bow's Google Summer of Code work writing a unified
1479parsing framework for NCBI BLAST (assorted formats including tabular and XML),
1480HMMER, BLAT, and other sequence searching tools. This is currently available
1481with the new BiopythonExperimentalWarning to indicate that this is still
1482somewhat experimental. We're bundling it with the main release to get more
1483public feedback, but with the big warning that the API is likely to change.
In fact, even the current name of Bio.SearchIO may change since, unless you
are familiar with BioPerl, its purpose isn't immediately clear.
1486
1487The Bio.Motif module has been updated and reorganized. To allow for a clean
1488deprecation of the old code, the new motif code is stored in a new module
1489Bio.motifs, and a PendingDeprecationWarning was added to Bio.Motif.
1490
A faster, low-level FASTA parser, SimpleFastaParser, has been added to
Bio.SeqIO.FastaIO. It works with plain strings and, like its sister function
for FASTQ files, does not have the overhead of constructing SeqRecord objects.
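
To show what a string-only parser of this kind looks like, here is a minimal
standard library sketch. It is not the SimpleFastaParser source; the name
``iter_fasta`` is hypothetical, and the sketch assumes each record starts with
a ``>`` title line::

    import io


    def iter_fasta(handle):
        """Yield (title, sequence) string tuples from a FASTA handle."""
        title, chunks = None, []
        for line in handle:
            line = line.rstrip()
            if line.startswith(">"):
                if title is not None:
                    yield title, "".join(chunks)  # finish the previous record
                title, chunks = line[1:], []
            elif title is not None:
                chunks.append(line)  # sequence may span several lines
        if title is not None:
            yield title, "".join(chunks)


    records = list(iter_fasta(io.StringIO(">alpha first\nACGT\nAC\n>beta\nGG\n")))

Working with plain tuples avoids creating an object per record, which is where
the speed gain over a full record parser would come from.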
1494
1495Additionally there have been other minor bug fixes and more unit tests.
1496
1497Finally, we are phasing out support for Python 2.5. We will continue support
1498for at least one further release (Biopython 1.62). This could be extended
1499given feedback from our users (or if the Jython 2.7 release is delayed, since
1500the current stable release Jython 2.5 implemented Python 2.5 only). Focusing
1501on Python 2.6 and 2.7 only will make writing Python 3 compatible code easier.
1502
1503Many thanks to the Biopython developers and community for making this release
1504possible, especially the following contributors:
1505
1506- Brandon Invergo
1507- Bryan Lunt (first contribution)
1508- Christian Brueffer (first contribution)
1509- David Cain
1510- Eric Talevich
1511- Grace Yeo (first contribution)
1512- Jeffrey Chang
1513- Jingping Li (first contribution)
1514- Kai Blin (first contribution)
1515- Leighton Pritchard
1516- Lenna Peterson
1517- Lucas Sinclair (first contribution)
1518- Michiel de Hoon
1519- Nick Semenkovich (first contribution)
1520- Peter Cock
1521- Robert Ernst (first contribution)
1522- Tiago Antao
1523- Wibowo 'Bow' Arindrarto
1524
1525
152625 June 2012: Biopython 1.60 released.
1527======================================
1528
1529New module Bio.bgzf supports reading and writing BGZF files (Blocked GNU
1530Zip Format), a variant of GZIP with efficient random access, most commonly
1531used as part of the BAM file format. This uses Python's zlib library
1532internally, and provides a simple interface like Python's gzip library.
1533Using this the Bio.SeqIO indexing functions now support BGZF compressed
1534sequence files.
1535
1536The GenBank/EMBL parser will now give a warning on unrecognised feature
1537locations and continue parsing (leaving the feature's location as None).
1538Previously it would abort with an exception, which was often unhelpful.
1539
1540The Bio.PDB.MMCIFParser is now compiled by default (but is still not
1541available under Jython, PyPy or Python 3).
1542
1543The SFF parser in Bio.SeqIO now decodes Roche 454 'universal accession
1544number' 14 character read names, which encode the timestamp of the run,
1545the region the read came from, and the location of the well.
1546
1547In the Phylo module, the "draw" function for plotting tree objects has become
1548much more flexible, with improved support for matplotlib conventions and new
1549parameters for specifying branch and taxon labels. Writing in the PhyloXML
1550format has been updated to more closely match the output of other programs. A
1551wrapper for the program RAxML has been added under Bio.Phylo.Applications,
1552alongside the existing wrapper for PhyML.
1553
1554Additionally there have been other minor bug fixes and more unit tests.
1555
1556Many thanks to the Biopython developers and community for making this release
1557possible, especially the following contributors:
1558
1559- Brandon Invergo
1560- Eric Talevich
1561- Jeff Hussmann (first contribution)
1562- John Comeau (first contribution)
1563- Kamil Slowikowski (first contribution)
1564- Kevin Jacobs
1565- Lenna Peterson (first contribution)
1566- Matt Fenwick (first contribution)
1567- Peter Cock
1568- Paul T. Bathen
1569- Wibowo Arindrarto
1570
1571
157224 February 2012: Biopython 1.59 released.
1573==========================================
1574
1575Please note that this release will *not* work on Python 2.4 (while the recent
1576releases have worked despite us not officially supporting this).
1577
1578The position objects used in Bio.SeqFeature now act almost like integers,
1579making dealing with fuzzy locations in EMBL/GenBank files much easier. Note as
1580part of this work, the arguments to create fuzzy positions OneOfPosition and
1581WithinPosition have changed in a non-backwards compatible way.
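
One way to picture "almost like integers" is a subclass of ``int`` that
carries the fuzzy meaning only in its string form. The class below is a
hypothetical stand-in written for this note, not one of the Bio.SeqFeature
position classes::

    class FuzzyBefore(int):
        """An integer position meaning "at or somewhere before this index"."""

        def __str__(self):
            return "<%d" % int(self)


    start = FuzzyBefore(5)
    end = 20
    length = end - start  # ordinary integer arithmetic still works
    label = str(start)  # the fuzziness survives in the text form
    fragment = "ACGTACGTACGT"[start:]  # usable directly as a slice index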
1582
1583The SeqFeature's strand and any database reference are now properties of the
1584FeatureLocation object (a more logical placement), with proxy methods for
1585backwards compatibility. As part of this change, if you print a location
1586object it will now display any strand and database reference information.
1587
1588The installation setup.py now supports 'install_requires' when setuptools
1589is installed. This avoids the manual dialog when installing Biopython via
1590easy_install or pip and numpy is not installed. It also allows user libraries
1591that require Biopython to include it in their install_requires and get
automatic installation of dependencies.
1593
1594Bio.Graphics.BasicChromosome has been extended to allow simple sub-features to
1595be drawn on chromosome segments, suitable to show the position of genes, SNPs
1596or other loci. Note Bio.Graphics requires the ReportLab library.
1597
1598Bio.Graphics.GenomeDiagram has been extended to allow cross-links between
1599tracks, and track specific start/end positions for showing regions. This can
1600be used to imitate the output from the Artemis Comparison Tool (ACT).
1601Also, a new attribute circle_core makes it easier to have an empty space in
1602the middle of a circular diagram (see tutorial).
1603
1604Bio.Align.Applications now includes a wrapper for command line tool Clustal
1605Omega for protein multiple sequence alignment.
1606
1607Bio.AlignIO now supports sequential PHYLIP files (as well as interlaced
1608PHYLIP files) as a separate format variant.
1609
1610New module Bio.TogoWS offers a wrapper for the TogoWS REST API, a web service
1611based in Japan offering access to KEGG, DDBJ, PDBj, CBRC plus access to some
1612NCBI, EBI resources including PubMed, GenBank and UniProt. This is much easier
to use than the NCBI Entrez API, and should be especially useful for Biopython
1614users based in Asia.
1615
The Bio.Entrez function efetch has been updated to handle the NCBI's stricter
handling of multiple ID arguments in EFetch 2.0. However, the NCBI have also
changed the retmode default argument, so you may need to make this explicit,
e.g. retmode="text".
1620
1621Additionally there have been other minor bug fixes and more unit tests.
1622
1623Many thanks to the Biopython developers and community for making this release
1624possible, especially the following contributors:
1625
1626- Andreas Wilm (first contribution)
1627- Alessio Papini (first contribution)
1628- Brad Chapman
1629- Brandon Invergo
1630- Connor McCoy
1631- Eric Talevich
1632- João Rodrigues
1633- Konrad Förstner (first contribution)
1634- Michiel de Hoon
1635- Matej Repič (first contribution)
1636- Leighton Pritchard
1637- Peter Cock
1638
1639
164018 August 2011: Biopython 1.58 released.
1641========================================
1642
A new interface and parsers for the PAML (Phylogenetic Analysis by Maximum
Likelihood) package of programs, supporting codeml, baseml and yn00 as well
as a Python re-implementation of chi2, have been added as the Bio.Phylo.PAML
module.
1646
Bio.SeqIO now includes read and write support for SeqXML, a simple XML
1648format offering basic annotation support. See Schmitt et al (2011) in
1649Briefings in Bioinformatics, https://doi.org/10.1093/bib/bbr025
1650
1651Bio.SeqIO now includes read support for ABI files ("Sanger" capillary
1652sequencing trace files, containing called sequence with PHRED qualities).
1653
1654The Bio.AlignIO "fasta-m10" parser was updated to cope with the >>><<< lines
as used in Bill Pearson's FASTA version 3.36. Without this fix the parser
1656would only return alignments for the first query sequence.
1657
1658The Bio.AlignIO "phylip" parser and writer now treat a dot/period in the
1659sequence as an error, in line with the official PHYLIP specification. Older
1660versions of our code didn't do anything special with this character. Also,
1661support for "phylip-relaxed" has been added which allows longer record names
1662as used in RAxML and PHYML.
1663
1664Of potential interest to anyone subclassing Biopython objects, any remaining
1665"old style" Python classes have been switched to "new style" classes. This
1666allows things like defining properties.
1667
1668Bio.HMM's Viterbi algorithm now expects the initial probabilities explicitly.
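
For readers unfamiliar with the term, the sketch below is a generic textbook
Viterbi decoder in which the initial state probabilities are passed in
explicitly. It does not use the Bio.HMM API; the function signature and the
toy fair/loaded coin model are assumptions chosen for illustration, and the
observation sequence must not be empty::

    def viterbi(observations, states, initial_p, transition_p, emission_p):
        """Return (probability, path) for the most likely state sequence."""
        # best[s] holds the best (probability, path) ending in state s so far.
        best = {
            s: (initial_p[s] * emission_p[s][observations[0]], [s])
            for s in states
        }
        for symbol in observations[1:]:
            updated = {}
            for s in states:
                # Choose the predecessor that maximises the chance of reaching s.
                prob, path = max(
                    (best[prev][0] * transition_p[prev][s], best[prev][1])
                    for prev in states
                )
                updated[s] = (prob * emission_p[s][symbol], path + [s])
            best = updated
        return max(best.values())


    states = ("F", "L")  # fair and loaded coin
    initial_p = {"F": 0.5, "L": 0.5}
    transition_p = {"F": {"F": 0.9, "L": 0.1}, "L": {"F": 0.1, "L": 0.9}}
    emission_p = {"F": {"H": 0.5, "T": 0.5}, "L": {"H": 0.9, "T": 0.1}}
    probability, path = viterbi("HHHH", states, initial_p, transition_p, emission_p)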
1669
1670Many thanks to the Biopython developers and community for making this release
1671possible, especially the following contributors:
1672
1673- Aaron Gallagher (first contribution)
1674- Bartek Wilczynski
1675- Bogdan T. (first contribution)
1676- Brandon Invergo (first contribution)
1677- Connor McCoy (first contribution)
1678- David Cain (first contribution)
1679- Eric Talevich
1680- Fábio Madeira (first contribution)
1681- Hongbo Zhu
1682- Joao Rodrigues
1683- Michiel de Hoon
1684- Peter Cock
1685- Thomas Schmitt (first contribution)
1686- Tiago Antao
1687- Walter Gillett
1688- Wibowo Arindrarto (first contribution)
1689
1690
16912 April 2011: Biopython 1.57 released.
1692======================================
1693
1694Bio.SeqIO now includes an index_db() function which extends the existing
1695indexing functionality to allow indexing many files, and more importantly
1696this keeps the index on disk in a simple SQLite3 database rather than in
1697memory in a Python dictionary.
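
The general technique of keeping a record offset index in SQLite can be
sketched with the standard library alone. This is not how index_db() is
implemented or called; the table layout and the name ``build_offset_index``
are hypothetical, and only FASTA input is handled::

    import os
    import sqlite3
    import tempfile


    def build_offset_index(db_path, fasta_paths):
        """Store record id -> (file number, byte offset) in an SQLite file."""
        con = sqlite3.connect(db_path)
        con.execute(
            "CREATE TABLE offsets "
            "(record_id TEXT PRIMARY KEY, file_no INTEGER, pos INTEGER)"
        )
        for file_no, path in enumerate(fasta_paths):
            with open(path, "rb") as handle:
                while True:
                    pos = handle.tell()  # byte offset of the line about to be read
                    line = handle.readline()
                    if not line:
                        break
                    fields = line[1:].split()
                    if line.startswith(b">") and fields:
                        con.execute(
                            "INSERT INTO offsets VALUES (?, ?, ?)",
                            (fields[0].decode(), file_no, pos),
                        )
        con.commit()
        return con


    workdir = tempfile.mkdtemp()
    paths = []
    for name, text in [("a.fasta", ">r1\nACGT\n>r2\nGG\n"), ("b.fasta", ">r3\nTT\n")]:
        path = os.path.join(workdir, name)
        with open(path, "w", newline="\n") as out:
            out.write(text)
        paths.append(path)

    con = build_offset_index(os.path.join(workdir, "index.sqlite"), paths)
    query = "SELECT file_no, pos FROM offsets WHERE record_id = ?"
    row = con.execute(query, ("r2",)).fetchone()
    total = con.execute("SELECT COUNT(*) FROM offsets").fetchone()[0]
    con.close()

Because the index lives in a file, it can be reopened later without scanning
the sequence files again.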
1698
1699Bio.Blast.Applications now includes a wrapper for the BLAST+ blast_formatter
1700tool from NCBI BLAST 2.2.24+ or later. This release of BLAST+ added the
1701ability to run the BLAST tools and save the output as ASN.1 format, and then
1702convert this to any other supported BLAST output format (plain text, tabular,
1703XML, or HTML) with the blast_formatter tool. The wrappers were also updated
1704to include new arguments added in BLAST 2.2.25+ such as -db_hard_mask.
1705
1706The SeqRecord object now has a reverse_complement method (similar to that of
the Seq object). This is most useful for reversing per-letter-annotation (such
1708as quality scores from FASTQ) or features (such as annotation from GenBank).
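
The reason per-letter annotation must be reversed too can be seen in a small
standalone sketch. It uses plain strings and lists rather than SeqRecord
objects, and the helper name is hypothetical::

    COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


    def reverse_complement_with_quality(sequence, qualities):
        """Reverse complement a DNA string, keeping scores aligned to bases."""
        if len(sequence) != len(qualities):
            raise ValueError("need exactly one quality score per base")
        # Both are reversed so that score i still describes base i.
        return sequence.translate(COMPLEMENT)[::-1], qualities[::-1]


    seq, quals = reverse_complement_with_quality("AACG", [40, 30, 20, 10])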
1709
1710Bio.SeqIO.write's QUAL output has been sped up, and Bio.SeqIO.convert now
1711uses an optimised routine for FASTQ to QUAL making this much faster.
1712
1713Biopython can now be installed with pip. Thanks to David Koppstein and
1714James Casbon for reporting the problem.
1715
1716Bio.SeqIO.write now uses lower case for the sequence for GenBank, EMBL and
1717IMGT output.
1718
1719The Bio.PDB module received several fixes and improvements, including starting
1720to merge João's work from GSoC 2010; consequently Atom objects now know
1721their element type and IUPAC mass. (The new features that use these
1722attributes won't be included in Biopython until the next release, though, so
1723stay tuned.)
1724
1725The nodetype hierarchy in the Bio.SCOP.Cla.Record class is now a dictionary
1726(previously it was a list of key,value tuples) to better match the standard.
1727
1728Many thanks to the Biopython developers and community for making this release
1729possible, especially the following contributors:
1730
1731- Brad Chapman
1732- Eric Talevich
1733- Erick Matsen (first contribution)
1734- Hongbo Zhu
1735- Jeffrey Finkelstein (first contribution)
1736- Joanna & Dominik Kasprzak (first contribution)
1737- Joao Rodrigues
1738- Kristian Rother
1739- Leighton Pritchard
1740- Michiel de Hoon
1741- Peter Cock
1742- Peter Thorpe (first contribution)
1743- Phillip Garland
1744- Walter Gillett (first contribution)
1745
1746
174726 November 2010: Biopython 1.56 released.
1748==========================================
1749
This is planned to be our last release to support Python 2.4. However, this
could be delayed given immediate feedback from our users (e.g. if this proves
1752to be a problem in combination with other libraries or a popular Linux
1753distribution).
1754
1755Bio.SeqIO can now read and index UniProt XML files (under format name
1756"uniprot-xml", which was agreed with EMBOSS and BioPerl for when/if they
1757support it too).
1758
1759Bio.SeqIO can now read, write and index IMGT files. These are a variant of
1760the EMBL sequence text file format with longer feature indentation.
1761
1762Bio.SeqIO now supports protein EMBL files (used in the EMBL patents database
1763file epo_prt.dat) - previously we only expected nucleotide EMBL files.
1764
1765The Bio.Seq translation methods and function will now accept an arbitrary
1766CodonTable object (for those of you working on very unusual organisms).
1767
1768The SeqFeature object now supports len(feature) giving the length consistent
1769with the existing extract method. Also, it now supports iteration giving the
1770coordinate (with respect to the parent sequence) of each letter within the
1771feature (in frame aware order), and "in" which allows you to check if a
1772(parent based) coordinate is within the feature location.
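
These three behaviours correspond to Python's ``__len__``, ``__iter__`` and
``__contains__`` hooks. The class below is a hypothetical single-span
stand-in, not the SeqFeature class itself, and it assumes 0-based,
end-exclusive coordinates with reverse-strand features iterated from the
highest coordinate down::

    class SimpleFeature:
        """A single-span feature on a parent sequence."""

        def __init__(self, start, end, strand=+1):
            self.start, self.end, self.strand = start, end, strand

        def __len__(self):
            return self.end - self.start

        def __iter__(self):
            positions = range(self.start, self.end)
            return iter(positions if self.strand >= 0 else reversed(positions))

        def __contains__(self, position):
            return self.start <= position < self.end


    gene = SimpleFeature(3, 7, strand=-1)
    size = len(gene)
    order = list(gene)
    inside = 5 in gene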
1773
1774Bio.Entrez will now try to download any missing NCBI DTD files and cache them
1775in the user's home directory.
1776
1777The provisional database schema for BioSQL support on SQLite which Biopython
1778has been using since Release 1.53 has now been added to BioSQL, and updated
1779slightly.
1780
1781Bio.PopGen.FDist now supports the DFDist command line tool as well as FDist2.
1782
1783Bio.Motif now has a chapter in the Tutorial.
1784
1785(At least) 13 people have contributed to this release, including 6 new people:
1786
1787- Andrea Pierleoni (first contribution)
1788- Bart de Koning (first contribution)
1789- Bartek Wilczynski
1790- Bartosz Telenczuk (first contribution)
1791- Cymon Cox
1792- Eric Talevich
1793- Frank Kauff
1794- Michiel de Hoon
1795- Peter Cock
1796- Phillip Garland (first contribution)
1797- Siong Kong (first contribution)
1798- Tiago Antao
1799- Uri Laserson (first contribution)
1800
1801
180231 August 2010: Biopython 1.55 released.
1803========================================
1804
1805See the notes below for the Biopython 1.55 beta release for changes since
1806Biopython 1.54 was released. Since the beta release we have marked a few
1807modules as obsolete or deprecated, and removed some deprecated code. There
1808have also been a few bug fixes, extra unit tests, and documentation
1809improvements.
1810
1811(At least) 12 people have contributed to this release, including 6 new people:
1812
1813- Andres Colubri (first contribution)
1814- Carlos Ríos (first contribution)
1815- Claude Paroz (first contribution)
1816- Cymon Cox
1817- Eric Talevich
1818- Frank Kauff
1819- Joao Rodrigues (first contribution)
1820- Konstantin Okonechnikov (first contribution)
1821- Michiel de Hoon
1822- Nathan Edwards (first contribution)
1823- Peter Cock
1824- Tiago Antao
1825
1826
182718 August 2010: Biopython 1.55 beta released.
1828=============================================
1829
1830This is a beta release for testing purposes, both for new features added,
1831and more importantly updates to avoid code deprecated in Python 2.7 or in
1832Python 3. This is an important step towards Python 3 support.
1833
1834We are phasing out support for Python 2.4. We will continue to support it
1835for at least one further release (Biopython 1.56). This could be delayed
1836given feedback from our users (e.g. if this proves to be a problem in
1837combination with other libraries or a popular Linux distribution).
1838
1839The SeqRecord object now has upper and lower methods (like the Seq object and
1840Python strings), which return a new SeqRecord with the sequence in upper or
1841lower case and a copy of all the annotation unchanged.
1842
1843Several small issues with Bio.PDB have been resolved, which includes better
1844handling of model numbers, and files missing the element column.
1845
1846Feature location parsing for GenBank and EMBL files has been rewritten,
1847making the parser much faster.
1848
1849Ace parsing by SeqIO now uses zero rather than None for the quality score of
1850any gaps (insertions) in the contig sequence.
1851
1852The BioSQL classes DBServer and BioSeqDatabase now act more like Python
1853dictionaries, making it easier to count, delete, iterate over, or check for
1854membership of namespaces and records.
1855
1856The command line tool application wrapper classes are now executable, so you
1857can use them to call the tool (using the subprocess module internally) and
1858capture the output and any error messages as strings (stdout and stderr).
1859This avoids having to worry about the details of how best to use subprocess.
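
The pattern of a callable wrapper object can be sketched as follows. The
``CommandLine`` class is hypothetical and much simpler than the Biopython
wrappers; it runs the current Python interpreter as a stand-in for a real
command line tool::

    import subprocess
    import sys


    class CommandLine:
        """Callable wrapper that runs a program and captures its output."""

        def __init__(self, program, *arguments):
            self.args = [program, *arguments]

        def __call__(self, stdin=None):
            result = subprocess.run(
                self.args,
                input=stdin,
                capture_output=True,  # collect stdout and stderr
                text=True,  # return strings rather than bytes
                check=True,  # raise CalledProcessError on a non-zero exit
            )
            return result.stdout, result.stderr


    tool = CommandLine(sys.executable, "-c", "print('hello')")
    stdout, stderr = tool()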
1860
1861(At least) 10 people have contributed to this release, including 5 new people:
1862
1863- Andres Colubri (first contribution)
1864- Carlos Ríos (first contribution)
1865- Claude Paroz (first contribution)
1866- Eric Talevich
1867- Frank Kauff
1868- Joao Rodrigues (first contribution)
1869- Konstantin Okonechnikov (first contribution)
1870- Michiel de Hoon
1871- Peter Cock
1872- Tiago Antao
1873
1874
1875May 20, 2010: Biopython 1.54 released.
1876======================================
1877
1878See the notes below for the Biopython 1.54 beta release for changes since
1879Biopython 1.53 was released. Since then there have been some changes to
1880the new Bio.Phylo module, more documentation, and a number of smaller
1881bug fixes.
1882
1883
1884April 2, 2010: Biopython 1.54 beta released.
1885============================================
1886
1887We are phasing out support for Python 2.4. We will continue to support it
1888for at least two further releases, and at least one year (whichever takes
1889longer), before dropping support for Python 2.4. This could be delayed
1890given feedback from our users (e.g. if this proves to be a problem in
1891combination with other libraries or a popular Linux distribution).
1892
1893New module Bio.Phylo includes support for reading, writing and working with
1894phylogenetic trees from Newick, Nexus and phyloXML files. This was work by
1895Eric Talevich on a Google Summer of Code 2009 project, under The National
1896Evolutionary Synthesis Center (NESCent), mentored by Brad Chapman and
1897Christian Zmasek.
1898
1899Bio.Entrez includes some more DTD files, in particular eLink_090910.dtd,
1900needed for our NCBI Entrez Utilities XML parser.
1901
1902The parse, read and write functions in Bio.SeqIO and Bio.AlignIO will now
1903accept filenames as well as handles. This follows a general shift from
1904other Python libraries, and does make usage a little simpler. Also
1905the write functions will now accept a single SeqRecord or alignment.
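
A common way to accept either a filename or a handle is a small context
manager that opens the file only when given a path. The sketch below shows the
pattern with the standard library; ``open_source`` and ``count_fasta_records``
are hypothetical helpers, not Bio.SeqIO functions::

    import contextlib
    import io
    import os
    import tempfile


    @contextlib.contextmanager
    def open_source(source, mode="r"):
        """Yield a handle, opening (and later closing) it if given a path."""
        if isinstance(source, (str, os.PathLike)):
            with open(source, mode) as handle:
                yield handle
        else:
            yield source  # already a handle, so the caller keeps ownership


    def count_fasta_records(source):
        with open_source(source) as handle:
            return sum(1 for line in handle if line.startswith(">"))


    text = ">a\nAC\n>b\nGT\n"
    with tempfile.NamedTemporaryFile("w", suffix=".fasta", delete=False) as tmp:
        tmp.write(text)
    from_name = count_fasta_records(tmp.name)
    from_handle = count_fasta_records(io.StringIO(text))
    os.remove(tmp.name)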
1906
1907Bio.SeqIO now supports writing EMBL files (DNA and RNA sequences only).
1908
1909The dictionary-like objects from Bio.SeqIO.index() now support a get_raw
1910method for most file formats, giving you the original unparsed data from the
1911file as a string. This is useful for selecting a subset of records from a
1912file where Bio.SeqIO.write() does not support the file format (e.g. the
1913"swiss" format) or where you need to exactly preserve the original layout.
1914
1915Based on code from Jose Blanca (author of sff_extract), Bio.SeqIO now
1916supports reading, indexing and writing Standard Flowgram Format (SFF)
1917files which are used by 454 Life Sciences (Roche) sequencers. This means
1918you can use SeqIO to convert from SFF to FASTQ, FASTA and QUAL (as
1919trimmed or untrimmed reads).
1920
1921An improved multiple sequence alignment object has been introduced,
1922and is used by Bio.AlignIO for input. This is a little stricter than the
1923old class but should otherwise be backwards compatible.
1924
1925(At least) 11 people contributed to this release, including 5 new people:
1926
1927- Anne Pajon (first contribution)
1928- Brad Chapman
1929- Christian Zmasek
1930- Diana Jaunzeikare (first contribution)
1931- Eric Talevich
1932- Jose Blanca (first contribution)
1933- Kevin Jacobs (first contribution)
1934- Leighton Pritchard
1935- Michiel de Hoon
1936- Peter Cock
1937- Thomas Holder (first contribution)
1938
1939
1940December 15, 2009: Biopython 1.53 released.
1941===========================================
1942
1943Biopython is now using git for source code control, currently on github. Our
1944old CVS repository will remain on the OBF servers in the short/medium term
1945as a backup, but will not be updated in future.
1946
The Bio.Blast.Applications wrappers now cover the new NCBI BLAST C++ tools
1948(where blastall is replaced by blastp, blastn, etc, and the command line
1949switches have all been renamed). These will be replacing the old wrappers in
1950Bio.Blast.NCBIStandalone which are now obsolete, and will be deprecated in
1951our next release.
1952
1953The plain text BLAST parser has been updated, and should cope with recent
1954versions of NCBI BLAST, including the new C++ based version. Nevertheless,
1955we (and the NCBI) still recommend using the XML output for parsing.
1956
1957The Seq (and related UnknownSeq) objects gained upper and lower methods,
1958like the string methods of the same name but alphabet aware. The Seq object
1959also gained a new ungap method for removing gap characters in an alphabet
1960aware manner.
1961
1962The SeqFeature object now has an extract method, used with the parent
1963sequence (as a string or Seq object) to get the region of that sequence
1964described by the feature's location information (including the strand and
1965any sub-features for a join). As an example, this is useful to get the
1966nucleotide sequence for features in GenBank or EMBL files.
1967
1968SeqRecord objects now support addition, giving a new SeqRecord with the
1969combined sequence, all the SeqFeatures, and any common annotation.
1970
1971Bio.Entrez includes the new (Jan 2010) DTD files from the NCBI for parsing
1972MedLine/PubMed data.
1973
1974The NCBI codon tables have been updated from version 3.4 to 3.9, which adds
1975a few extra start codons, and a few new tables (Tables 16, 21, 22 and 23).
1976Note that Table 14 which used to be called "Flatworm Mitochondrial" is now
1977called "Alternative Flatworm Mitochondrial", and "Flatworm Mitochondrial" is
1978now an alias for Table 9 ("Echinoderm Mitochondrial").
1979
1980The restriction enzyme list in Bio.Restriction has been updated to the
1981Nov 2009 release of REBASE.
1982
1983The Bio.PDB parser and output code has been updated to understand the
1984element column in ATOM and HETATM lines (based on patches contributed by
1985Hongbo Zhu and Frederik Gwinner). Bio.PDB.PDBList has also been updated
1986for recent changes to the PDB FTP site (Paul T. Bathen).
1987
1988SQLite support was added for BioSQL databases (Brad Chapman), allowing access
1989to BioSQL through a lightweight embedded SQL engine. Python 2.5+ includes
1990support for SQLite built in, but on Python 2.4 the optional sqlite3 library
1991must be installed to use this. We currently use a draft BioSQL on SQLite
1992schema, which will be merged with the main BioSQL release for use in other
1993projects.
1994
1995Support for running Biopython under Jython (using the Java Virtual Machine)
1996has been much improved thanks to input from Kyle Ellrott. Note that Jython
1997does not support C code - this means NumPy isn't available, and nor are a
1998selection of Biopython modules (including Bio.Cluster, Bio.PDB and BioSQL).
1999Also, currently Jython does not parse DTD files, which means the XML parser
2000in Bio.Entrez won't work. However, most of the Biopython modules seem fine
2001from testing Jython 2.5.0 and 2.5.1.
2002
2003(At least) 12 people contributed to this release, including 3 first timers:
2004
2005- Bartek Wilczynski
2006- Brad Chapman
2007- Chris Lasher
2008- Cymon Cox
2009- Frank Kauff
2010- Frederik Gwinner (first contribution)
2011- Hongbo Zhu (first contribution)
2012- Kyle Ellrott
2013- Leighton Pritchard
2014- Michiel de Hoon
2015- Paul Bathen (first contribution)
2016- Peter Cock
2017
2018
2019September 22, 2009: Biopython 1.52 released.
2020============================================
2021
2022The Population Genetics module now allows the calculation of several tests,
2023and statistical estimators via a wrapper to GenePop. Supported are tests for
2024Hardy-Weinberg equilibrium, linkage disequilibrium and estimates for various
F statistics (Cockerham and Weir Fst and Fis, Robertson and Hill Fis, etc),
2026null allele frequencies and number of migrants among many others. Isolation
2027By Distance (IBD) functionality is also supported.
2028
2029New helper functions Bio.SeqIO.convert() and Bio.AlignIO.convert() allow an
2030easier way to use Biopython for simple file format conversions. Additionally,
2031these new functions allow Biopython to offer important file format specific
2032optimisations (e.g. FASTQ to FASTA, and interconverting FASTQ variants).
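
To illustrate why a format specific route can be faster, the sketch below
converts FASTQ to FASTA by copying the title and sequence lines and never
decoding the quality string. It is not the Bio.SeqIO.convert() code; the
function name is hypothetical, and it assumes unwrapped four-line FASTQ
records with no trailing blank lines::

    import io


    def fastq_to_fasta(in_handle, out_handle):
        """Copy titles and sequences to FASTA; return the record count."""
        count = 0
        while True:
            title = in_handle.readline()
            if not title:
                break
            sequence = in_handle.readline()
            in_handle.readline()  # the "+" separator line
            in_handle.readline()  # the quality string, not needed for FASTA
            if not title.startswith("@"):
                raise ValueError("expected '@' at the start of a FASTQ record")
            out_handle.write(">" + title[1:] + sequence)
            count += 1
        return count


    fasta = io.StringIO()
    fastq = io.StringIO("@r1\nACGT\n+\nIIII\n@r2\nGG\n+\n!!\n")
    converted = fastq_to_fasta(fastq, fasta)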
2033
2034New function Bio.SeqIO.index() allows indexing of most sequence file formats
2035(but not alignment file formats), allowing dictionary like random access to
2036all the entries in the file as SeqRecord objects, keyed on the record id.
2037This is especially useful for very large sequencing files, where all the
2038records cannot be held in memory at once. This supplements the more flexible
2039but memory demanding Bio.SeqIO.to_dict() function.
2040
Bio.SeqIO can now write "phd" format files (used by PHRED, PHRAP and CONSED),
2042allowing interconversion with FASTQ files, or FASTA+QUAL files.
2043
2044Bio.Emboss.Applications now includes wrappers for the "new" PHYLIP EMBASSY
2045package (e.g. fneighbor) which replace the "old" PHYLIP EMBASSY package (e.g.
2046eneighbor) whose Biopython wrappers are now obsolete.
2047
2048See also the DEPRECATED file, as several old deprecated modules have finally
2049been removed (e.g. Bio.EUtils which had been replaced by Bio.Entrez).
2050
2051On a technical note, this will be the last release using CVS for source code
2052control. Biopython is moving from CVS to git.
2053
2054
2055August 17, 2009: Biopython 1.51 released.
2056=========================================
2057
2058FASTQ support in Bio.SeqIO has been improved, extended and sped up since
2059Biopython 1.50. Support for Illumina 1.3+ style FASTQ files was added in the
20601.51 beta release. Furthermore, we now follow the interpretation agreed on
2061the OBF mailing lists with EMBOSS, BioPerl, BioJava and BioRuby for inter-
2062conversion and the valid score range for each FASTQ variant. This means
2063Solexa FASTQ scores can be from -5 to 62 (format name "fastq-solexa" in
2064Bio.SeqIO), Illumina 1.3+ FASTQ files have PHRED scores from 0 to 62 (format
2065name "fastq-illumina"), and Sanger FASTQ files have PHRED scores from 0 to
206693 (format name "fastq" or "fastq-sanger").
2067
2068Bio.Sequencing.Phd has been updated, for example to cope with missing peak
2069positions. The "phd" support in Bio.SeqIO has also been updated to record
2070the PHRED qualities (and peak positions) in the SeqRecord's per-letter
2071annotation. This allows conversion of PHD files into FASTQ or QUAL which may
2072be useful for meta-assembly.
2073
2074See the notes below for the Biopython 1.50 beta release for changes since
2075Biopython 1.49 was released. This includes dropping support for Python 2.3,
2076removing our deprecated parsing infrastructure (Martel and Bio.Mindy), and
2077hence removing any dependence on mxTextTools.
2078
2079Additionally, since the beta, a number of small bugs have been fixed, and
2080there have been further additions to the test suite and documentation.
2081
2082
2083June 23, 2009: Biopython 1.51 beta released.
2084============================================
2085
2086Biopython no longer supports Python 2.3.  Currently we support Python 2.4,
20872.5 and 2.6.
2088
2089Our deprecated parsing infrastructure (Martel and Bio.Mindy) has been
2090removed.  This means Biopython no longer has any dependence on mxTextTools.
2091
2092A few cosmetic issues in GenomeDiagram with arrow sigils and labels on
2093circular diagrams have been fixed.
2094
2095Bio.SeqIO will now write GenBank files with the feature table (previously
2096omitted), and a couple of obscure errors parsing ambiguous locations have
2097been fixed.
2098
2099Bio.SeqIO can now read and write Illumina 1.3+ style FASTQ files (which use
2100PHRED quality scores with an ASCII offset of 64) under the format name
2101"fastq-illumina". Biopython 1.50 supported just "fastq" (the original Sanger
2102style FASTQ files using PHRED scores with an ASCII offset of 33), and
2103"fastq-solexa" (the original Solexa/Illumina FASTQ format variant holding
Solexa scores with an ASCII offset of 64).
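
The ASCII offsets can be made concrete with a few lines of standard library
Python. The helper names are hypothetical, and the Solexa to PHRED formula is
the usual published conversion rather than a copy of the Bio.SeqIO code::

    import math


    def decode_qualities(quality_string, offset):
        """Turn a FASTQ quality string into a list of integer scores."""
        return [ord(character) - offset for character in quality_string]


    def solexa_to_phred(solexa_score):
        """Convert one Solexa score to the equivalent PHRED score."""
        return 10 * math.log10(10 ** (solexa_score / 10) + 1)


    sanger = decode_qualities("!I", 33)  # "fastq" uses an offset of 33
    illumina = decode_qualities("@h", 64)  # "fastq-illumina" uses 64
    low = solexa_to_phred(-5)  # Solexa scores can be negative

The two score types agree closely at high quality and differ most at the low
end, which is why the variants must not be mixed up.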
2105
2106For parsing the "swiss" format, Bio.SeqIO now uses the new Bio.SwissProt
2107parser, making it about twice as fast as in Biopython 1.50, where the older
2108now deprecated Bio.SwissProt.SProt was used. There should be no functional
2109differences as a result of this change.
2110
2111Our command line wrapper objects have been updated to support accessing
2112parameters via python properties, and setting of parameters at initiation
2113with keyword arguments.  Additionally Cymon Cox has contributed several new
2114multiple alignment wrappers under Bio.Align.Applications.
2115
2116A few more issues with Biopython's BioSQL support have been fixed (mostly by
2117Cymon Cox). In particular, the default PostgreSQL schema includes some rules
2118intended for BioPerl support only, which were causing problems in Biopython
2119(see BioSQL bug 2839).
2120
2121There have also been additions to the tutorial, such as the new alignment
2122wrappers, with a whole chapter for the SeqRecord object. We have also added
2123to the unit test coverage.
2124
2125
2126April 20, 2009: Biopython 1.50 released.
2127========================================
2128
2129See the notes below for the Biopython 1.50 beta release for more details,
2130but the highlights are:
2131
2132* The SeqRecord supports slicing and per-letter-annotation
2133* Bio.SeqIO can read and write FASTQ and QUAL files
2134* Bio.Seq now has an UnknownSeq object
2135* GenomeDiagram has been integrated into Biopython
2136* New module Bio.Motif will later replace Bio.AlignAce and Bio.MEME
2137* This will be the final release to support Python 2.3
2138* This will be the final release with Martel and Bio.Mindy
2139
2140Since the 1.50 beta release:
2141
2142* The NCBI's Entrez EFetch no longer supports rettype="genbank"
2143  and "gb" (or "gp") should be used instead.
2144* Bio.SeqIO now supports "gb" as an alias for "genbank".
2145* The Seq object now has string-like startswith and endswith methods
2146* Bio.Blast.NCBIXML now has a read function for single record files
2147* A few more unit tests were added
2148* More documentation
2149
2150
2151April 3, 2009: Biopython 1.50 beta released.
2152============================================
2153
2154The SeqRecord object has a new dictionary attribute, letter_annotations,
2155which is for holding per-letter-annotation information like sequence
2156quality scores or secondary structure predictions.  As part of this work,
2157the SeqRecord object can now be sliced to give a new SeqRecord covering
2158just part of the sequence.  This will slice the per-letter-annotation to
2159match, and will also include any SeqFeature objects as appropriate.
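
The slicing behaviour can be pictured with a small stand-alone sketch. The
``ToySeqRecord`` class below is hypothetical and much simpler than the real
SeqRecord (it ignores SeqFeature objects entirely); it only shows how a slice
can be applied to the sequence and to every per-letter annotation together::

    from dataclasses import dataclass, field


    @dataclass
    class ToySeqRecord:
        """Hypothetical record holding a sequence and per-letter annotation."""

        seq: str
        letter_annotations: dict = field(default_factory=dict)

        def __post_init__(self):
            # Each annotation must supply exactly one value per letter.
            for key, values in self.letter_annotations.items():
                if len(values) != len(self.seq):
                    raise ValueError(f"{key!r} must have one value per letter")

        def __getitem__(self, index):
            if not isinstance(index, slice):
                raise TypeError("this sketch only supports slices")
            sliced = {
                key: values[index]
                for key, values in self.letter_annotations.items()
            }
            return ToySeqRecord(self.seq[index], sliced)


    record = ToySeqRecord(
        "ACGTACGT", {"phred_quality": [40, 38, 35, 30, 30, 22, 15, 10]}
    )
    trimmed = record[:5]
    print(trimmed.seq)  # ACGTA
    print(trimmed.letter_annotations["phred_quality"])  # [40, 38, 35, 30, 30]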
2160
2161Bio.SeqIO can now read and write FASTQ and QUAL quality files using PHRED
2162quality scores (Sanger style, also used for Roche 454 sequencing), and FASTQ
2163files using Solexa/Illumina quality scores.
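
For readers unfamiliar with the two score types, here is a minimal sketch of
how the quality strings are commonly decoded, assuming the usual conventions
of an ASCII offset of 33 for Sanger FASTQ (PHRED scores) and 64 for the early
Solexa/Illumina FASTQ (Solexa scores). The function names are made up for
this example and are not part of Bio.SeqIO::

    import math


    def phred_from_sanger(quality_string):
        """Decode a Sanger FASTQ quality string (ASCII offset 33)."""
        return [ord(letter) - 33 for letter in quality_string]


    def solexa_from_solexa_fastq(quality_string):
        """Decode an early Solexa/Illumina quality string (ASCII offset 64)."""
        return [ord(letter) - 64 for letter in quality_string]


    def phred_from_solexa(solexa_quality):
        """Convert one Solexa score to the equivalent PHRED score."""
        return 10 * math.log10(10 ** (solexa_quality / 10) + 1)


    print(phred_from_sanger("I5+"))  # [40, 20, 10]
    print(solexa_from_solexa_fastq("h;"))  # [40, -5]
    print(round(phred_from_solexa(-5), 2))

The two scales nearly coincide for high quality scores but diverge for poor
ones, which is why Solexa scores can be negative while PHRED scores cannot.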
2164
2165The Bio.Seq module now has an UnknownSeq object, used for when we have a
2166sequence of known length, but unknown content.  This is used in parsing
2167GenBank and EMBL files where the sequence may not be present (e.g. for a
contig record) and when parsing QUAL files (which don't have the sequence).
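
The appeal of such an object is that it records only a length and a
placeholder letter instead of storing the whole string. The
``ToyUnknownSeq`` class below is a hypothetical sketch of that idea, not the
Bio.Seq implementation::

    class ToyUnknownSeq:
        """Hypothetical sequence of known length but unknown content."""

        def __init__(self, length, character="N"):
            if length < 0:
                raise ValueError("length must not be negative")
            self._length = length
            self._character = character

        def __len__(self):
            return self._length

        def __str__(self):
            # The full string is only built when explicitly requested.
            return self._character * self._length

        def __getitem__(self, index):
            if isinstance(index, slice):
                new_length = len(range(*index.indices(self._length)))
                return ToyUnknownSeq(new_length, self._character)
            if not -self._length <= index < self._length:
                raise IndexError("sequence index out of range")
            return self._character


    contig = ToyUnknownSeq(1_000_000)
    print(len(contig))  # 1000000
    print(str(contig[10:20]))  # NNNNNNNNNN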
2169
GenomeDiagram by Leighton Pritchard has been integrated into Biopython as
the Bio.Graphics.GenomeDiagram module.  If you use this code, please cite the
publication Pritchard et al. (2006), Bioinformatics 22 616-617.  Note that
like Bio.Graphics, this requires the ReportLab Python library.
2174
2175A new module Bio.Motif has been added, which is intended to replace the
2176existing Bio.AlignAce and Bio.MEME modules.
2177
2178The set of NCBI DTD files included with Bio.Entrez has been updated with the
2179revised files the NCBI introduced on 1 Jan 2009.
2180
2181Minor fix to BioSQL for retrieving references and comments.
2182
2183Bio.SwissProt has a new faster parser which will be replacing the older
2184slower code in Bio.SwissProt.SProt (which we expect to deprecate in the next
2185release).
2186
We've also made some changes to our test framework, which is now given a
whole chapter in the tutorial.  This is intended to help new developers or
contributors wanting to improve our unit test coverage.
2190
2191
2192November 21, 2008: Biopython 1.49 released.
2193===========================================
2194
2195See the notes below for the Biopython 1.49 beta release for more details,
2196but the highlights are:
2197
2198* Biopython has transitioned from Numeric to NumPy
2199* Martel and Bio.Mindy are now deprecated
2200
2201Since the 1.49 beta release:
2202
2203* A couple of NumPy issues have been resolved
2204* Further small improvements to BioSQL
2205* Bio.PopGen.SimCoal should now work on Windows
2206* A few more unit tests were added
2207
2208
2209November 7, 2008: Biopython 1.49 beta released.
2210===============================================
2211
2212Biopython has transitioned from Numeric to NumPy.  Please move to NumPy.
2213
2214A number of small changes have been made to support Python 2.6 (mostly
2215avoiding deprecated functionality), and further small changes have been
2216made for better compatibility with Python 3 (this work is still ongoing).
2217However, we intend to support Python 2.3 for only a couple more releases.
2218
As part of the Numeric to NumPy migration, Bio.KDTree has been rewritten in
C instead of C++, which simplifies building Biopython from source.
2221
2222Martel and Bio.Mindy are now considered to be deprecated, meaning mxTextTools
2223is no longer required to use Biopython.  See the DEPRECATED file for details
2224of other deprecations.
2225
The Seq object now supports more string-like methods (gaining find, rfind,
2227split, rsplit, strip, lstrip and rstrip in addition to previously supported
2228methods like count).  Also, biological methods transcribe, back_transcribe
2229and translate have been added, joining the pre-existing reverse_complement
2230and complement methods.  Together these changes allow a more object
2231orientated programming style using the Seq object.
2232
2233The behaviour of the Bio.Seq module's translate function has changed so that
2234ambiguous codons which could be a stop codon like "TAN" or "NNN" are now
2235translated as "X" (consistent with EMBOSS and BioPerl - Biopython previously
2236raised an exception), and a bug was fixed so that invalid codons (like "A-T")
2237now raise an exception (previously these were translated as stop codons).
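
The rule described above can be sketched for a single codon as follows:
expand the ambiguity codes, translate every possible unambiguous codon with
the standard genetic code, and report "X" unless all possibilities agree.
This is an illustrative stand-alone sketch with made-up names, not the code
used in Bio.Seq, and the real function may differ in edge cases::

    import itertools

    # Standard genetic code, with codons enumerated in TCAG order.
    BASES = "TCAG"
    AMINO_ACIDS = (
        "FFLLSSSSYY**CC*W"
        "LLLLPPPPHHQQRRRR"
        "IIIMTTTTNNKKSSRR"
        "VVVVAAAADDEEGGGG"
    )
    CODON_TABLE = {
        "".join(codon): amino_acid
        for codon, amino_acid in zip(
            itertools.product(BASES, repeat=3), AMINO_ACIDS
        )
    }

    # IUPAC nucleotide codes and the bases each one stands for.
    IUPAC = {
        "A": "A", "C": "C", "G": "G", "T": "T",
        "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
        "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
    }


    def translate_codon(codon):
        """Translate one codon, returning "X" if the result is ambiguous."""
        if len(codon) != 3:
            raise ValueError(f"Invalid codon {codon!r}")
        try:
            choices = [IUPAC[letter] for letter in codon.upper()]
        except KeyError:
            # Gaps and other non-nucleotide letters are rejected outright.
            raise ValueError(f"Invalid codon {codon!r}") from None
        amino_acids = {
            CODON_TABLE["".join(bases)]
            for bases in itertools.product(*choices)
        }
        return amino_acids.pop() if len(amino_acids) == 1 else "X"


    print(translate_codon("GCN"))  # A (all four codons encode alanine)
    print(translate_codon("TAN"))  # X (could be tyrosine or a stop)
    print(translate_codon("NNN"))  # X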
2238
2239BioSQL had a few bugs fixed, and can now optionally fetch the NCBI taxonomy
2240on demand when loading sequences (via Bio.Entrez) allowing you to populate
2241the taxon/taxon_name tables gradually.  This has been tested in combination
2242with the BioSQL load_ncbi_taxonomy.pl script used to populate or update the
2243taxon/taxon_name tables.  BioSQL should also now work with the psycopg2
2244driver for PostgreSQL as well as the older psycopg driver.
2245
2246The PDB and PopGen sections of the Tutorial have been promoted to full
2247chapters, and a new chapter has been added on supervised learning methods
2248like logistic regression.  The "Cookbook" section now has a few graphical
2249examples using Biopython to calculate sequence properties, and matplotlib
2250(pylab) to plot them.
2251
2252The input functions in Bio.SeqIO and Bio.AlignIO now accept an optional
2253argument to specify the expected sequence alphabet.
2254
The somewhat quirky unit test GUI has been removed; the unit tests are now
run via the command line by default.
2257
2258
2259September 8, 2008: Biopython 1.48 released.
2260===========================================
2261
2262The SeqRecord and Alignment objects have a new method to format the object as
2263a string in a requested file format (handled via Bio.SeqIO and Bio.AlignIO).
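
One way to picture such a method is as a dispatch from a format name to a
writer function. The ``ToyFormatRecord`` class below is hypothetical and
handles only two simple formats itself, whereas the real objects hand the
work to Bio.SeqIO and Bio.AlignIO::

    class ToyFormatRecord:
        """Hypothetical record that can render itself in simple formats."""

        def __init__(self, id, seq, description=""):
            self.id = id
            self.seq = seq
            self.description = description

        def format(self, format_name):
            """Return the record as a string in the requested format."""
            writers = {"fasta": self._as_fasta, "tab": self._as_tab}
            if format_name not in writers:
                raise ValueError(f"Unknown format {format_name!r}")
            return writers[format_name]()

        def _as_fasta(self):
            title = f"{self.id} {self.description}".strip()
            # Wrap the sequence at 60 letters per line.
            lines = [
                self.seq[i:i + 60] for i in range(0, len(self.seq), 60)
            ]
            return f">{title}\n" + "\n".join(lines) + "\n"

        def _as_tab(self):
            return f"{self.id}\t{self.seq}\n"


    record = ToyFormatRecord("seq1", "ACGT", "toy example")
    print(record.format("fasta"), end="")
    print(record.format("tab"), end="")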
2264
2265Additional file formats supported in Bio.SeqIO and Bio.AlignIO:
2266
2267- reading and writing "tab" format (simple tab separated)
2268- writing "nexus" files.
2269- reading "pir" files (NBRF/PIR)
2270- basic support for writing "genbank" files (GenBank plain text)
2271
2272Fixed some problems reading Clustal alignments (introduced in Biopython 1.46
2273when consolidating Bio.AlignIO and Bio.Clustalw).
2274
2275Updates to the Bio.Sequencing parsers.
2276
2277Bio.PubMed and the online code in Bio.GenBank are now considered obsolete,
2278and we intend to deprecate them after the next release. For accessing PubMed
2279and GenBank, please use Bio.Entrez instead.
2280
Bio.Fasta is now considered to be obsolete; please use Bio.SeqIO instead. We
do intend to deprecate this module eventually; however, for several years
this was the primary FASTA parsing module in Biopython and is likely to be in
use in many existing scripts.
2285
2286Martel and Bio.Mindy are now considered to be obsolete, and are likely to be
2287deprecated and removed in a future release.
2288
In addition, a number of other modules have been deprecated, including:
Bio.MetaTool, Bio.EUtils, Bio.Saf, Bio.NBRF, and Bio.IntelliGenetics.
See the DEPRECATED file for full details.
2292
2293
2294July 5, 2008: Biopython 1.47 released.
2295======================================
2296
2297Improved handling of ambiguous nucleotides in Bio.Seq.Translate().
2298Better handling of stop codons in the alphabet from a translation.
2299Fixed some codon tables (problem introduced in Biopython 1.46).
2300
2301Updated Nexus file handling.
2302
2303Fixed a bug in Bio.Cluster potentially causing segfaults in the
2304single-linkage hierarchical clustering library.
2305
2306Added some DTDs to be able to parse EFetch results from the
2307nucleotide database.
2308
2309Added IntelliGenetics/MASE parsing to Bio.SeqIO (as the "ig" format).
2310
2311
2312June 29, 2008: Biopython 1.46 released.
2313=======================================
2314
2315Bio.Entrez now has several Entrez format XML parsers, and a chapter
2316in the tutorial.
2317
2318Addition of new Bio.AlignIO module for working with sequence alignments
2319in the style introduced with Bio.SeqIO in recent releases, with a whole
2320chapter in the tutorial.
2321
2322A problem parsing certain EMBL files was fixed.
2323
2324Several minor fixes were made to the NCBI BLAST XML parser, including
2325support for the online version 2.2.18+ introduced in May 2008.
2326
2327The NCBIWWW.qblast() function now allows other programs (blastx, tblastn,
2328tblastx) in addition to just blastn and blastp.
2329
2330Bio.EUtils has been updated to explicitly enforce the NCBI's rule of at
2331most one query every 3 seconds, rather than assuming the user would obey
2332this.
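
Enforcing a minimum delay between queries can be sketched as below. The
``RateLimiter`` class is a hypothetical illustration rather than the
Bio.EUtils code; the clock and sleep functions are injectable only so that
the behaviour is easy to test, and the demonstration uses a short interval
instead of three seconds::

    import time


    class RateLimiter:
        """Block so that calls to wait() are at least min_interval apart."""

        def __init__(self, min_interval=3.0, clock=time.monotonic,
                     sleep=time.sleep):
            self._min_interval = min_interval
            self._clock = clock
            self._sleep = sleep
            self._last = None  # time of the previous query, if any

        def wait(self):
            now = self._clock()
            if self._last is not None:
                remaining = self._min_interval - (now - self._last)
                if remaining > 0:
                    self._sleep(remaining)
                    now = self._clock()
            self._last = now


    limiter = RateLimiter(min_interval=0.05)  # 3.0 would match the NCBI rule
    for query in ["first", "second", "third"]:
        limiter.wait()  # returns at once the first time, then paces calls
        print("sending", query)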
2333
Iterators in Bio.Medline, Bio.SCOP, Bio.Prosite, Bio.Prosite.Prodoc,
Bio.SwissProt, and others were updated to make them more generally usable.
2336
2337Phylip export added to Bio.Nexus.
2338
2339Improved handling of ambiguous nucleotides and stop codons in
2340Bio.Seq.Translate (plus introduced a regression fixed in Biopython 1.47).
2341
2342
2343March 22, 2008: Biopython 1.45 released.
2344========================================
2345
The Seq and MutableSeq objects act more like Python strings, in particular
2347str(object) now returns the full sequence as a plain string.  The existing
2348tostring() method is preserved for backwards compatibility.
2349
2350BioSQL has had some bugs fixed, and has an additional unit test which loads
2351records into a database using Bio.SeqIO and then checks the records can be
2352retrieved correctly.  The DBSeq and DBSeqRecord classes now subclass the
2353Seq and SeqRecord classes, which provides more functionality.
2354
2355The modules under Bio.WWW are being deprecated.
2356Functionality in Bio.WWW.NCBI, Bio.WWW.SCOP, Bio.WWW.InterPro and
2357Bio.WWW.ExPASy is now available from Bio.Entrez, Bio.SCOP, Bio.InterPro and
2358Bio.ExPASy instead. Bio.Entrez was used to fix a nasty bug in Bio.GenBank.
2359
2360Tiago Antao has included more functionality in the Population Genetics
2361module, Bio.PopGen.
2362
2363The Bio.Cluster module has been updated to be more consistent with other
2364Biopython code.
2365
2366The tutorial has been updated, including devoting a whole chapter to
2367Swiss-Prot, Prosite, Prodoc, and ExPASy. There is also a new chapter on
2368Bio.Entrez.
2369
2370Bio.biblio was deprecated.
2371
2372
2373October 28, 2007: Biopython 1.44 released.
2374==========================================
2375
2376NOTE: This release includes some rather drastic code changes, which were
2377necessary to get Biopython to work with the new release of mxTextTools.
2378
2379The (reverse)complement functions in Bio.Seq support ambiguous nucleotides.
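
Supporting ambiguous nucleotides amounts to complementing the IUPAC codes as
well as A, C, G and T (for example R, meaning A or G, becomes Y, meaning C or
T). A minimal stand-alone sketch using a translation table, which is not
necessarily how Bio.Seq does it::

    # Each IUPAC letter in the first string maps to the letter at the same
    # position in the second string.
    _COMPLEMENT = str.maketrans(
        "ACGTMRWSYKVHDBNacgtmrwsykvhdbn",
        "TGCAKYWSRMBDHVNtgcakywsrmbdhvn",
    )


    def complement(sequence):
        """Return the complement of a DNA string, keeping ambiguity codes."""
        return sequence.translate(_COMPLEMENT)


    def reverse_complement(sequence):
        """Return the reverse complement of a DNA string."""
        return complement(sequence)[::-1]


    print(complement("ACGTRYN"))  # TGCAYRN
    print(reverse_complement("ATGCRN"))  # NYGCAT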
2380
2381Bio.Kabat, which was previously deprecated, is now removed from Biopython.
2382
2383Bio.MarkupEditor was deprecated, as it does not appear to have any users.
2384
Bio.Blast.NCBIWWW.qblast() updated with more URL options, thanks to a patch
from Chang Soon Ong.
2387
2388Several fixes to the Blast parser.
2389
2390The deprecated Bio.Blast.NCBIWWW functions blast and blasturl were removed.
2391
The standalone Blast functions blastall and blastpgp now create XML output by
default.
2394
2395Bio.SeqIO.FASTA and Bio.SeqIO.generic have been deprecated in favour of
2396the new Bio.SeqIO module.
2397
2398Bio.FormatIO has been removed (a gradual deprecation was not possible).
2399Please look at Bio.SeqIO for sequence input/output instead.
2400
2401Fix for a bug in Bio.Cluster, which caused kcluster() to hang on some
2402platforms.
2403
2404Bio.expressions has been deprecated.
2405
2406Bio.SeqUtils.CheckSum created, including new methods from Sebastian Bassi,
2407and functions crc32 and crc64 which were moved from Bio/crc.py.
2408Bio.crc is now deprecated. Bio.lcc was updated and moved to Bio.SeqUtils.lcc.
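
As a reminder of what such a checksum is for, here is a sketch of a CRC-32
over a sequence string using only the standard library. The helper name
``sequence_crc32`` is made up, and no claim is made that it returns exactly
the same values as the Bio.SeqUtils.CheckSum function::

    import zlib


    def sequence_crc32(sequence):
        """Return the CRC-32 of a sequence given as an ASCII string."""
        return zlib.crc32(sequence.encode("ascii"))


    # Identical sequences share a checksum; a single changed letter is
    # always detected by CRC-32.
    print(sequence_crc32("ACGTACGT") == sequence_crc32("ACGTACGT"))  # True
    print(sequence_crc32("ACGTACGT") == sequence_crc32("ACGTACGA"))  # False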
2409
2410Bio.SwissProt parser updated to cope with recent file format updates.
2411
Bio.Fasta, Bio.KEGG and Bio.Geo updated to pure Python parsers which
don't rely on Martel.
2414
2415Numerous fixes in the Genbank parser.
2416
2417Several fixes in Bio.Nexus.
2418
Bio.MultiProc and Bio.Medline.NLMMedlineXML were deprecated, as they failed
on some platforms, and seemed to have no users. Deprecated concurrent
behavior in Bio.config.DBRegistry and timeouts in Bio.dbdefs.swissprot,
which rely on Bio.MultiProc.
2423
Tiago Antao has started work on a Population Genetics module, Bio.PopGen.
2425
2426Updates to the tutorial, including giving Bio.Seq and Bio.SeqIO a whole
2427chapter each.
2428
2429
2430March 17, 2007: Biopython 1.43 released.
2431========================================
2432
2433New Bio.SeqIO module for reading and writing biological sequence files
2434in various formats, based on SeqRecord objects.  This includes a new fasta
2435parser which is much faster than Bio.Fasta, particularly for larger files.
2436Easier to use, too.
2437
2438Various improvements in Bio.SeqRecord.
2439
2440Running Blast using Bio.Blast.NCBIStandalone now generates output in XML
2441format by default.
2442The new function Bio.Blast.NCBIXML.parse can parse multiple Blast records
2443in XML format.
2444
2445Bio.Cluster no longer uses ranlib, but uses its own random number generator
2446instead. Some modifications to make Bio.Cluster more compatible with the new
2447NumPy (we're not quite there yet though).
2448
2449New Bio.UniGene parser.
2450
2451Numerous improvements in Bio.PDB.
2452
2453Bug fixes in Bio.SwissProt, BioSQL, Bio.Nexus, and other modules.
2454
2455Faster parsing of large GenBank files.
2456
New EMBL parser under Bio.GenBank and also integrated into (new) Bio.SeqIO.
2458
2459Compilation of KDTree (C++ code) is optional (setup.py asks the user if it
2460should be compiled). For the Windows installer, C++ code is now included.
2461
2462Nominating Bio.Kabat for removal.
2463
2464Believe it or not, even the documentation was updated.
2465
2466
2467July 16, 2006: Biopython 1.42 released.
2468=======================================
2469
2470Bio.GenBank: New parser by Peter, which doesn't rely on Martel.
2471
2472Numerous updates in Bio.Nexus and Bio.Geo.
2473
2474Bio.Cluster became (somewhat) object-oriented.
2475
2476Lots of bug fixes, and updates to the documentation.
2477
2478
2479October 28, 2005: Biopython 1.41 released.
2480==========================================
2481
2482Major changes:
2483
2484NEW: Bio.MEME -- thanks to Jason Hackney
2485
2486Added transcribe, translate, and reverse_complement functions to Bio.Seq that
2487work both on Seq objects and plain strings.
2488
2489Major code optimization in cpairwise2module.
2490
2491CompareACE support added to AlignAce.
2492
2493Updates to Blast parsers in Bio.Blast, in particular use of the XML parser
2494in NCBIXML contributed by Bertrand Frottier, and the BLAT parser by Yair
2495Benita.
2496
2497Pairwise single-linkage hierarchical clustering in Bio.Cluster became much
2498faster and memory-efficient, allowing clustering of large data sets.
2499
2500Bio.Emboss: Added command lines for einverted and palindrome.
2501
2502Bio.Nexus: Added support for StringIO objects.
2503
2504Numerous updates in Bio.PDB.
2505
2506Lots of fixes in the documentation.
2507
2508March 29, 2005: MEME parser added. Thanks to Jason Hackney
2509
2510
2511Feb 18, 2005: Biopython 1.40 beta
2512=================================
Major changes since v1.30. For a full list of changes please see the CVS
history.
2514
2515IMPORTANT: Biopython now works with Python version >= 2.3
2516
2517NEW: Bio.Nexus -- thanks to Frank Kauff
2518Bio.Nexus is a Nexus file parser. Nexus is a common format for phylogenetic
2519trees.
2520
2521NEW: CAPS module -- Thanks to Jonathan Taylor.
2522
2523NEW: Restriction enzyme package contributed by Frederic Sohm. This includes
2524classes for manipulating enzymes, updating from Rebase, as well as
2525documentation and Tests.
2526
2527CHANGED: Bio.PDB -- thanks to Thomas Hamelryck.
2528
2529- Added atom serial number.
2530- Epydoc style documentation.
2531- Added secondary structure support (through DSSP).
2532- Added Accessible Surface Area support (through DSSP).
2533- Added Residue Depth support (through MSMS).
2534- Added Half Sphere Exposure.
- Added Fragment classification of the protein backbone (see Kolodny et al.,
  JMB, 2002).
2537- Corrected problem on Windows with PDBList (thanks to Matt Dimmic)
2538- Added StructureAlignment module to superimpose structures based on a FASTA
2539  sequence alignment.
2540- Various additions to Polypeptide.
2541- Various bug corrections in Vector.
2542- Lots of smaller bug corrections and additional features
2543
2544CHANGED: MutableSeq -- thanks to Michiel De Hoon
2545Added the functions 'complement' and 'reverse_complement' to Bio.Seq's Seq and
2546MutableSeq objects. Similar functions previously existed in various locations
in Biopython:
2548
2549- forward_complement, reverse_complement in Bio.GFF.easy
2550- complement, antiparallel in Bio.SeqUtils
2551
These functions have now been deprecated, and will issue a DeprecationWarning
when used. The functions complement and reverse_complement, when applied to a
Seq object, will return a new Seq object. The same functions applied to a
MutableSeq object will modify the MutableSeq object itself, and do not return
anything.
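
The difference between the two behaviours as described for this release can
be illustrated with Python's own immutable ``str`` and mutable ``bytearray``
types. The function names below are hypothetical, and only unambiguous
upper case DNA is handled::

    _STR_TABLE = str.maketrans("ACGT", "TGCA")
    _BYTES_TABLE = bytes.maketrans(b"ACGT", b"TGCA")


    def complement(sequence):
        """Immutable style: return a new string, leaving the input alone."""
        return sequence.translate(_STR_TABLE)


    def complement_in_place(sequence):
        """Mutable style: overwrite the bytearray and return nothing."""
        sequence[:] = sequence.translate(_BYTES_TABLE)


    fixed = "AACG"
    print(complement(fixed), fixed)  # TTGC AACG

    editable = bytearray(b"AACG")
    result = complement_in_place(editable)
    print(result, editable.decode())  # None TTGC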
2557
2558
2559May 14, 2004: Biopython 1.30
2560============================
2561
2562- Affy package added for dealing with Affymetrix cel files -- thanks to Harry
2563  Zuzan.
2564- Added code for parsing Blast XML output -- thanks to Bertrand Frottier.
2565- Added code for parsing Compass output -- thanks to James Casbon.
2566- New melting temperature calculation module -- thanks to Sebastian Bassi.
- Added lowess function for non-parametric regression -- thanks to Michiel.
- Reduced protein alphabet support added -- thanks to Iddo.
2569
2570- Added documentation for Logistic Regression and Bio.PDB -- thanks to Michiel
2571  and Thomas.
2572- Documentation added for converting between file formats.
2573- Updates to install documentation for non-root users -- thanks to Jakob
2574  Fredslund.
2575- epydoc now used for automatic generation of documentation.
2576
2577- Fasta parser updated to use Martel for parsing and indexing, allowing better
2578  speed and dealing with large data files.
- Updates to Registry code. Now 'from Bio import db' gives you a number of new
  retrieval options, including embl, fasta, genbank, interpro, prodoc and
  swissprot.
2582- GenBank parser uses new Martel format. GenBank retrieval now uses EUtils
2583  instead of the old non-working entrez scripts. GenBank indexing uses standard
2584  Mindy indexing. Fix for valueless qualifiers in feature keys -- thanks to
2585  Leighton Pritchard.
- Numerous updates to Bio.PDB modules -- thanks to Thomas. PDB can now parse
  headers -- thanks to Kristian Rother.
2588- Updates to the Ace parser -- thanks to Frank Kauff and Leighton Pritchard.
2589
2590- Added pgdb (PyGreSQL) support to BioSQL -- thanks to Marc Colosimo.
2591- Fix problems with using py2exe and Biopython -- thanks to Michael Cariaso.
2592- PSIBlast parser fixes -- thanks to Jer-Yee John Chuang and James Casbon.
2593- Fix to NCBIWWW retrieval so that HTML results are returned correctly.
2594- Fix to Clustalw to handle question marks in title names -- thanks to Ashleigh
2595  Smythe.
- Fix to NBRF parsing so it accepts files produced by Clustalw -- thanks to
  Ashleigh Smythe.
- Fixes to the Enzyme module -- thanks to Marc Colosimo.
2599- Fix for bugs in SeqUtils -- thanks to Frank Kauff.
2600- Fix for optional hsps in ncbiblast Martel format -- thanks to Heiko.
2601- Fix to Fasta parsing to allow # comment lines -- thanks to Karl Diedrich.
2602- Updates to the C clustering library -- thanks to Michiel.
2603- Fixes for breakage in the SCOP module and addition of regression tests to
2604  framework -- thanks to Gavin.
2605- Various fixes to Bio.Wise -- thanks to Michael.
- Fix for bug in FastaReader -- thanks to Michael.
2607- Fix EUtils bug where efetch would only return 500 sequences.
2608- Updates for Emboss commandlines, water and tranalign.
2609- Fixes to the FormatIO system of file conversion.
2610
2611- C++ code (KDTree, Affy) now compiled by default on most platforms -- thanks
2612  to Michael for some nice distutils hacks and many people for testing.
2613- Deprecated Bio.sequtils -- use Bio.SeqUtils instead.
2614- Deprecated Bio.SVM -- use libsvm instead.
- Deprecated Bio.kMeans and Bio.xkMeans -- use Bio.Cluster instead.
2616- Deprecated RecordFile -- doesn't appear to be finished code.
2617
2618
2619Feb 16, 2004: Biopython 1.24
2620============================
2621
2622- New parsers for Phred and Ace format files -- thanks to Frank Kauff
2623- New Code for dealing with NMR data -- thanks to Bob Bussell
2624- New SeqUtils modules for codon usage, isoelectric points and other
2625  protein properties -- thanks to Yair Benita
2626- New code for dealing with Wise contributed by Michael
2627- EZ-Retrieve sequence retrieval now supported thanks to Jeff
2628- Bio.Cluster updated along with documentation by Michiel
2629- BioSQL fixed so it now works with the current SQL schema -- thanks to Yves
2630  Bastide for patches
2631- Patches to Bio/__init__ to make it compatible with py2exe -- thanks to
2632  Leighton Pritchard
2633- Added __iter__ to all Biopython Iterators to make them Python 2.2 compatible
2634- Fixes to NCBIWWW for retrieving from NCBI -- thanks to Chris Wroe
2635- Retrieval of multiple alignment objects from BLAST records -- thanks to
2636  James Casbon
2637- Fixes to GenBank format for new tags by Peter
- Parsing fixes in clustalw parser -- thanks to Greg Singer and Iddo
2639- Fasta Indexes can have a specified filename -- thanks to Chunlei Wu
2640- Fix to Prosite parser -- thanks to Mike Liang
2641- Fix in GenBank parsing -- mRNAs now get strand information
2642
2643
2644Oct 18, 2003: Biopython 1.23
2645============================
2646
2647- Fixed distribution of files in Bio/Cluster
2648- Now distributing Bio/KDTree/_KDTree.swig.C
2649- minor updates in installation code
2650- added mmCIF support for PDB files
2651
2652
2653Oct 9, 2003: Biopython 1.22
2654===========================
2655
2656- Added Peter Slicker's patches for speeding up modules under Python 2.3
2657- Fixed Martel installation.
2658- Does not install Bio.Cluster without Numeric.
2659- Distribute EUtils DTDs.
2660- Yves Bastide patched NCBIStandalone.Iterator to be Python 2.0 iterator
- Ashleigh's string coercion fixes in Clustalw.
2662- Yair Benita added precision to the protein molecular weights.
2663- Bartek updated AlignAce.Parser and added Motif.sim method
2664- bug fixes in Michiel De Hoon's clustering library
2665- Iddo's bug fixes to Bio.Enzyme and new RecordConsumer
2666- Guido Draheim added patches for fixing import path to xbb scripts
2667- regression tests updated to be Python 2.3 compatible
2668- GenBank.NCBIDictionary is smarter about guessing the format
2669
2670
2671Jul 28, 2003: Biopython 1.21
2672============================
2673
2674- Martel added back into the released package
2675- new AlignACE module by Bartek Wilczynski
2676- Andreas Kuntzagk fix for GenBank Iterator on empty files
2677
2678
2679Jul 27, 2003: Biopython 1.20
2680============================
2681
2682- added Andrew Dalke's EUtils library
2683- added Michiel de Hoon's gene expression analysis package
2684- updates to setup code, now smarter about dependencies
2685- updates to test suite, now smarter about code that is imported
2686- Michael Hoffman's fixes to DocSQL
2687- syntax fixes in triemodule.c to compile on SGI, Python 2.1 compatible
2688- updates in NCBIStandalone, short query error
2689- Sebastian Bassi submitted code to calculate LCC complexity
2690- Greg Kettler's NCBIStandalone fix for long query lengths
2691- slew of miscellaneous fixes from George Paci
2692- miscellaneous cleanups and updates from Andreas Kuntzagk
2693- Peter Bienstman's fixes to Genbank code -- now parses whole database
2694- Kayte Lindner's LocusLink package
2695- miscellaneous speedups and code cleanup in ParserSupport by Brad Chapman
2696- miscellaneous BLAST fixes and updates
2697- Iddo added new code to parse BLAST table output format
2698- Karl Diedrich's patch to read T_Coffee files
2699- Larry Heisler's fix for primer3 output
2700- Bio.Medline now uses proper iterator objects
2701- copen now handles SIGTERM correctly
2702- small bugfixes and updates in Thomas Hamelryck's PDB package
2703- bugfixes and updates to SeqIO.FASTA reader
2704- updates to Registry system, conforms to 2003 hackathon OBDA spec
2705- Yu Huang patch to support tblastn in wublast expression
2706
2707
2708Dec 17, 2002: Biopython 1.10
2709============================
2710
2711- Python requirement bumped up to 2.2
2712- hierarchy reorg, many things moved upwards into Bio namespace
2713- pairwise2 replaces fastpairwise and pairwise
2714- removed deprecated Sequence.py package
2715- minor bug fix in File.SGMLStripper
2716- added Scripts/debug/debug_blast_parser.py to diagnose blast parsing errors
2717- IPI supported by SwissProt/SProt.py parser
2718- large speedup for kmeans
2719- new registry framework for generic access to databases and parsers
2720- small bug fix in stringfns.split
2721- scripts that access NCBI moved over to new EUtils system
2722- new crc module
2723- biblio.py supports the EBI Bibliographic database
2724- new CDD parser
2725- new Ndb parser
2726- new ECell parser
2727- new Geo parser
2728- access to GFF databases
2729- new KDTree data structure
2730- new LocusLink parser
2731- new MarkovModel algorithm
2732- new Saf parser
2733- miscellaneous sequence handling functions in sequtils
2734- new SVDSuperimpose algorithm
2735
2736
2737Dec 18, 2001: Biopython1.00a4
2738=============================
2739
2740- minor bug fix in NCBIStandalone.blastall
2741- optimization in dynamic programming code
2742- new modules for logistic regression and maximum entropy
2743- minor bug fix in ParserSupport
2744- minor bug fixes in SCOP package
2745- minor updates in the kMeans cluster selection code
2746- minor bug fixes in SubsMat code
2747- support for XML-formatted MEDLINE files
2748- added MultiProc.run to simplify splitting code across processors
2749- listfns.items now supports lists with unhashable items
2750- new data type for pathways
2751- new support for intelligenetics format
2752- new support for metatool format
2753- new support for NBRF format
2754- new support for generalized launching of applications
2755- new support for genetic algorithms
2756- minor bug fixes in GenBank parsing
2757- new support for Primer in the Emboss package
2758- new support for chromosome graphics
2759- new support for HMMs
2760- new support for NeuralNetwork
2761- slew of Martel fixes (see Martel docs)
2762
2763
2764Sept 3, 2001: Biopython1.00a3
2765=============================
2766
2767- added package to support KEGG
2768- added sequtils module for computations on sequences
2769- added pairwise sequence alignment algorithm
2770- major bug fixes in UndoHandle
2771- format updates in PubMed
2772- Tk interface to kMeans clustering
2773
2774
2775July 5, 2001: Biopython1.00a2
2776=============================
2777
2778- deprecated old regression testing frameworks
2779- deprecated Sequence.py
2780- Swiss-Prot parser bug fixes
2781- GenBank parser bug fixes
2782- Can now output GenBank format
2783- can now download many sequences at a time from GenBank
2784- kMeans clustering algorithm
2785- Kabat format now supported
2786- FSSP format now supported
2787- more functionality for alignment code
2788- SubsMat bug fixes and updates
- fixed memory leak in listfns
2790- Martel bundled and part of the install procedure
2791- Medline.Parser bug fixes
2792- PubMed.download_many handles broken IDs better
2793
2794
2795Mar 3, 2001: Biopython 1.00a1
2796=============================
2797
2798- Refactoring of modules.  X/X.py moved to X/__init__.py.
2799- Can search sequences for Prosite patterns at ExPASy
2800- Can do BLAST searches against stable URL at NCBI
2801- Prosite Pattern bug fixes
2802- GenBank parser
2803- Complete Seq and SeqFeatures framework
2804- distutils cleanup
2805- compile warning cleanups
2806- support for UniGene
2807- code for working with substitution matrices
2808- Tools.MultiProc package for rudimentary multiprocessing stuff
2809
2810
2811Nov 10, 2000: Biopython 0.90d04
2812===============================
2813
2814- Added support for multiple alignments, ClustalW
2815- BLAST updates, bug fixes, and BlastErrorParser
2816- Fixes for PSI-BLAST in master-slave mode
2817- Minor update in stringfns, split separators can be negated
2818- Added download_many function to PubMed
2819- xbbtools updates
2820- Prodoc parser now accepts a copyright at the end of a record
2821- Swiss-Prot parser now handles taxonomy ID tag
2822
2823
2824Sept 6, 2000: Biopython 0.90d03
2825===============================
2826
2827- Blast updates:
2828
2829  - bug fixes in NCBIStandalone, NCBIWWW
2830  - some __str__ methods in Record.py implemented (incomplete)
2831
2832- Tests:
2833
2834  - new BLAST regression tests
2835  - prosite tests fixed
2836
2837- New parsers for Rebase, Gobase
2838- pure python implementation of C-based tools
2839- Thomas Sicheritz-Ponten's xbbtools
2840- can now generate documentation from docstrings using HappyDoc
2841
2842
Aug 17-18, 2000: Bioinformatics Open Source Conference 2000
===========================================================
2845
2846We had a very good Birds-of-a-Feather meeting:
2847http://mailman.open-bio.org/pipermail/biopython/2000-August/000360.html
2848
2849
2850Aug 2, 2000: Biopython 0.90d02 is released.
2851===========================================
2852
- Blast updates:

  - now works with v2.0.14
  - HSP.identities and HSP.positives now tuples
  - HSP.gaps added

- SCOP updates:

  - Lin.Iterator now works with release 50

- Starting a tutorial
- New regression tests for Prodoc
2861
2862
2863July 6, 2000: Biopython 0.90d01 is released.
2864============================================
2865
2866
2867February 8, 2000: Anonymous CVS made available.
2868===============================================
2869
2870
2871August 1999: Biopython project founded.
2872=======================================
2873
2874Call for Participation sent out to relevant mailing lists, news
2875groups.
2876
2877The Biopython Project (https://www.biopython.org/) is a new open
2878collaborative effort to develop freely available Python libraries and
2879applications that address the needs of current and future work in
2880bioinformatics, including sequence analysis, structural biology,
2881pathways, expression data, etc.  When available, the source code will
2882be released as open source (https://github.com/biopython/biopython/blob/9c4785fc9eaf8a3bc436c6c0b16e7a05019cade1/LICENSE)
2883under terms similar to Python.
2884
2885This is a Call for Participation for interested people to join the
2886project.  We are hoping to attract people from a diverse set of
2887backgrounds to help with code development, site maintenance,
2888scientific discussion, etc.  This project is open to everyone.  If
2889you're interested, please visit the web page, join the biopython
2890mailing list, and let us know what you think!
2891
2892Jeffrey Chang <jchang@smi.stanford.edu>
2893Andrew Dalke <dalke@bioreason.com>
2894