News for the Biopython Project
==============================

This file contains release notes and general news about the Biopython project.
See also the DEPRECATED file which tracks the removal of obsolete modules or
functions, and online https://biopython.org/wiki/News and
https://www.open-bio.org/category/obf-projects/biopython/

The latest news is at the top of this file.

(In progress, not yet released): Biopython 1.80
===============================================

1 June 2021: Biopython 1.79
===========================

This is intended to be our final release supporting Python 3.6. It also
supports Python 3.7, 3.8 and 3.9, and has also been tested on PyPy3.6.1 v7.1.1.

The ``Seq`` and ``MutableSeq`` classes in ``Bio.Seq`` now store their sequence
contents as ``bytes`` and ``bytearray`` objects, respectively. Previously, for
``Seq`` objects a string object was used, and a Unicode array object for
``MutableSeq`` objects. This was maintained during the transition from Python2
to Python3. However, a Python2 string object corresponds to a ``bytes`` object
in Python3, storing the string as a series of 8-bit characters. While
non-ASCII characters could be stored in Python2 strings, they were not treated
as single characters but as their encoded bytes. For example:

In Python2::

    >>> s = "Генетика"
    >>> type(s)
    <type 'str'>
    >>> len(s)
    16

In Python3::

    >>> s = "Генетика"
    >>> type(s)
    <class 'str'>
    >>> len(s)
    8

In Python3, storing the sequence contents as ``bytes`` and ``bytearray``
objects has the further advantage that both support the buffer protocol.

Taking advantage of the similarity between ``bytes`` and ``bytearray``, the
``Seq`` and ``MutableSeq`` classes now inherit from an abstract base class
``_SeqAbstractBaseClass`` in ``Bio.Seq`` that implements most of the ``Seq``
and ``MutableSeq`` methods, ensuring their consistency with each other.
For methods that modify the sequence contents, an optional ``inplace``
argument specifies whether a new sequence object should be returned with the
new sequence contents (if ``inplace`` is ``False``, the default) or whether
the sequence object itself should be modified (if ``inplace`` is ``True``).
For ``Seq`` objects, which are immutable, using ``inplace=True`` raises an
exception. For ``inplace=False``, the default, ``Seq`` objects and
``MutableSeq`` behave consistently.

As before, ``Seq`` and ``MutableSeq`` objects can be initialized using a string
object, which will be converted to a ``bytes`` or ``bytearray`` object assuming
an ASCII encoding. Alternatively, a ``bytes`` or ``bytearray`` object can be
used, or an instance of any class inheriting from the new
``SequenceDataAbstractBaseClass`` abstract base class in ``Bio.Seq``. This
requires that the class implements the ``__len__`` and ``__getitem__`` methods
that return the sequence length and sequence contents on demand. Initializing a
``Seq`` instance using an instance of a class inheriting from
``SequenceDataAbstractBaseClass`` allows the ``Seq`` object to be lazy, meaning
that its sequence is provided on demand only, without requiring the full
sequence to be initialized. This feature is now used in ``BioSQL``, providing
on-demand sequence loading from an SQL database, as well as in a new parser for
twoBit (.2bit) sequence data added to ``Bio.SeqIO``. This is a lazy parser that
allows fast access to genome-size DNA sequence files by not having to read the
full genome sequence. The new ``_UndefinedSequenceData`` class in ``Bio.Seq``
also inherits from ``SequenceDataAbstractBaseClass`` to represent sequences of
known length but unknown sequence contents. This provides an alternative to
``UnknownSeq``, which is now deprecated as its definition was ambiguous.
For example, in the following cases the ``UnknownSeq`` is interpreted as a
sequence with well-defined sequence contents::

    >>> s = UnknownSeq(3, character="A")
    >>> s.translate()
    UnknownSeq(1, character='K')
    >>> s + "A"
    Seq('AAAA')

A sequence object with undefined sequence contents can now be created by
using ``None`` when creating the ``Seq`` object, together with the sequence
length. Trying to access its sequence contents raises an
``UndefinedSequenceError``::

    >>> s = Seq(None, length=6)
    >>> s
    Seq(None, length=6)
    >>> len(s)
    6
    >>> "A" in s
    Traceback (most recent call last):
    ...
    Bio.Seq.UndefinedSequenceError: Sequence content is undefined
    >>> print(s)
    Traceback (most recent call last):
    ...
    Bio.Seq.UndefinedSequenceError: Sequence content is undefined

Element assignment in Bio.PDB.Atom now returns "X" when the element cannot be
unambiguously guessed from the atom name, in accordance with PDB structures.

Bio.PDB entities now have a ``center_of_mass()`` method that calculates either
centers of gravity or geometry.

New method ``disordered_remove()`` implemented in Bio.PDB DisorderedAtom and
DisorderedResidue to remove children.

New module Bio.PDB.SASA implements the Shrake-Rupley algorithm to calculate
atomic solvent accessible areas without third-party tools.

Expected ``TypeError`` behaviour has been restored to the ``Seq`` object's
string like methods (fixing a regression in Biopython 1.78).

The KEGG ``KGML_Pathway`` KGML output was fixed to produce output that complies
with KGML v0.7.2.

Parsing motifs in ``pfm-four-rows`` format can now handle motifs with values
in scientific notation.

Parsing motifs in ``minimal`` MEME format will use ``nsites`` when making
the count matrix from the frequency matrix, instead of multiplying the
frequency matrix by 1000000.
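
As a rough illustration of the ``nsites`` change described above, the sketch
below scales a letter-frequency matrix to counts using the number of sites
rather than a fixed factor. This is a minimal pure-Python sketch and not the
``Bio.motifs`` implementation: the helper name ``frequencies_to_counts``, the
dictionary layout and the rounding step are all assumptions made for
illustration.

```python
def frequencies_to_counts(frequencies, nsites):
    """Scale per-position letter frequencies to counts (hypothetical helper).

    ``frequencies`` maps each letter to a list of per-position frequencies;
    ``nsites`` is the number of sites the motif was built from.
    """
    return {
        letter: [round(value * nsites) for value in row]
        for letter, row in frequencies.items()
    }


# Toy two-position DNA motif; the frequencies at each position sum to 1.
frequencies = {
    "A": [0.5, 0.0],
    "C": [0.25, 0.0],
    "G": [0.25, 0.0],
    "T": [0.0, 1.0],
}
counts = frequencies_to_counts(frequencies, nsites=20)
```

With ``nsites=20`` the counts at each position add up to 20, which reflects
the number of sequences behind the motif rather than an arbitrary scale.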

Bio.UniProt.GOA now parses Gene Product Information (GPI) files version 1.2;
these files can be downloaded from the EBI ftp site:
ftp://ftp.ebi.ac.uk/pub/databases/GO/goa/


Many thanks to the Biopython developers and community for making this release
possible, especially the following contributors:

- Damien Goutte-Gattat
- Gert Hulselmans
- João Rodrigues
- Markus Piotrowski
- Sergio Valqui
- Suyash Gupta
- Vini Salazar (first contribution)
- Leighton Pritchard

4 September 2020: Biopython 1.78
================================

This release of Biopython supports Python 3.6, 3.7 and 3.8. It has also been
tested on PyPy3.6.1 v7.1.1.

The main change is that ``Bio.Alphabet`` is no longer used. In some cases you
will now have to specify expected letters, molecule type (DNA, RNA, protein),
or gap character explicitly. Please consult the updated Tutorial and API
documentation for guidance. This simplification has sped up many ``Seq``
object methods. See https://biopython.org/wiki/Alphabet for more information.

``Bio.SeqIO.parse()`` is faster with "fastq" format due to small improvements
in the ``Bio.SeqIO.QualityIO`` module.

The ``SeqFeature`` object's ``.extract()`` method can now be used for
trans-spliced locations via an optional dictionary of references.

As in recent releases, more of our code is now explicitly available under
either our original "Biopython License Agreement", or the very similar but
more commonly used "3-Clause BSD License". See the ``LICENSE.rst`` file for
more details.

Additionally, a number of small bugs and typos have been fixed with additions
to the test suite. There has been further work to follow the Python PEP8,
PEP257 and best practice standard coding style, and all of the tests have
been reformatted with the ``black`` tool to match the main code base.

Many thanks to the Biopython developers and community for making this release
possible, especially the following contributors:

- Adam Sjøgren (first contribution)
- Carlos Pena
- Chris Daley
- Chris Rands
- Christian Brueffer
- Damien Goutte-Gattat
- João Rodrigues
- João Vitor F Cavalcante (first contribution)
- Marie Crane
- Markus Piotrowski
- Michiel de Hoon
- Peter Cock
- Sergio Valqui
- Yogesh Kulkarni (first contribution)
- Zheng Ruan

25 May 2020: Biopython 1.77
===========================

This release of Biopython supports Python 3.6, 3.7 and 3.8. It has also been
tested on PyPy3.6.1 v7.1.1-beta0.

**We have dropped support for Python 2 now.**

``pairwise2`` now allows the input of parameters with keywords and returns the
alignments as a list of ``namedtuples``.

The codon tables have been updated to NCBI genetic code table version 4.5,
which adds Cephalodiscidae mitochondrial as table 33.

Updated ``Bio.Restriction`` to the January 2020 release of REBASE.

A major contribution by Rob Miller to ``Bio.PDB`` provides new methods to
handle protein structure transformations using dihedral angles (internal
coordinates). The new framework supports lossless interconversion between
internal and cartesian coordinates, which, among other uses, simplifies the
analysis and manipulation of coordinates of protein structures.

As in recent releases, more of our code is now explicitly available under
either our original "Biopython License Agreement", or the very similar but
more commonly used "3-Clause BSD License". See the ``LICENSE.rst`` file for
more details.

Additionally, a number of small bugs and typos have been fixed with further
additions to the test suite.
There has been further work to follow the Python
PEP8, PEP257 and best practice standard coding style, and all the main code
base has been reformatted with the ``black`` tool.

Many thanks to the Biopython developers and community for making this release
possible, especially the following contributors:

- Alexander Decurnou (first contribution)
- Andrei Istrate (first contribution)
- Andrey Raspopov
- Artemi Bendandi (first contribution)
- Austin Varela (first contribution)
- Chris Daley
- Chris Rands
- Deepak Khatri
- Hielke Walinga (first contribution)
- Kai Blin
- Karthikeyan Singaravelan (first contribution)
- Konstantinos Zisis (first contribution)
- Markus Piotrowski
- Michiel de Hoon
- Peter Cock
- Rob Miller
- Sergio Valqui
- Steve Bond
- Sujan Dulal (first contribution)
- Tianyi Shi (first contribution)

20 December 2019: Biopython 1.76
================================

This release of Biopython supports Python 2.7, 3.5, 3.6, 3.7 and 3.8. It has
also been tested on PyPy2.7.13 v7.1.1 and PyPy3.6.1 v7.1.1-beta0.

We intend this to be our final release supporting Python 2.7 and 3.5.

As in recent releases, more of our code is now explicitly available under
either our original "Biopython License Agreement", or the very similar but
more commonly used "3-Clause BSD License". See the ``LICENSE.rst`` file for
more details.


``PDBParser`` and ``PDBIO`` now support PQR format file parsing and
input/output.

In addition to the mainstream ``x86_64`` aka ``AMD64`` CPU architecture, we
now also test every contribution on the ``ARM64``, ``ppc64le``, and ``s390x``
CPUs under Linux thanks to Travis CI. Further post-release testing done by
Debian and other packagers and distributors of Biopython also covers these
CPUs.

The ``Bio.motifs.PositionSpecificScoringMatrix.search()`` method has been
re-written: it now applies ``.calculate()`` to chunks of the sequence
to maintain a low memory footprint for long sequences.

Additionally, a number of small bugs and typos have been fixed with further
additions to the test suite. There has been further work to follow the Python
PEP8, PEP257 and best practice standard coding style, and more of the code
style has been reformatted with the ``black`` tool.

Many thanks to the Biopython developers and community for making this release
possible, especially the following contributors:

- Chris Daley (first contribution)
- Chris Rands
- Christian Brueffer
- Ilya Flyamer (first contribution)
- Jakub Lipinski (first contribution)
- Michael R. Crusoe (first contribution)
- Michiel de Hoon
- Peter Cock
- Sergio Valqui

6 November 2019: Biopython 1.75
===============================

This release of Biopython supports Python 2.7, 3.5, 3.6, 3.7 and is expected
to work on the soon to be released Python 3.8. It has also been tested on
PyPy2.7.13 v7.1.1 and PyPy3.6.1 v7.1.1-beta0.

Note we intend to drop Python 2.7 support in early 2020.

The restriction enzyme list in ``Bio.Restriction`` has been updated to the
August 2019 release of REBASE.

``Bio.SeqIO`` now supports reading and writing files in the native format of
Christian Marck's DNA Strider program ("xdna" format, also used by Serial
Cloner), as well as reading files in the native formats of GSL Biotech's
SnapGene ("snapgene") and Textco Biosoftware's Gene Construction Kit ("gck").

``Bio.AlignIO`` now supports GCG MSF multiple sequence alignments as the "msf"
format (work funded by the National Marrow Donor Program).

The main ``Seq`` object now has string-like ``.index()`` and ``.rindex()``
methods, matching the existing ``.find()`` and ``.rfind()`` implementations.
The ``MutableSeq`` object retains its more list-like ``.index()`` behaviour.

The ``MMTFIO`` class has been added that allows writing of MMTF file format
files from a Biopython structure object. ``MMTFIO`` has a similar interface to
``PDBIO`` and ``MMCIFIO``, including the use of a ``Select`` class to write
out a specified selection. This final addition to read/write support for
PDB/mmCIF/MMTF in Biopython allows conversion between all three file formats.

Values from mmCIF files are now read in as a list even when they consist of a
single value. This change improves consistency and reduces the likelihood of
making an error, but will require user code to be updated accordingly.

``Bio.motifs.meme`` has been updated to parse XML output files from MEME
instead of the plain-text output file. The goal of this change is to parse a
more structured data source with minimal loss of functionality upon future
MEME releases.

``Bio.PDB`` has been updated to support parsing REMARK 99 header entries from
PDB-style Astral files.

A new keyword parameter ``full_sequences`` was added to ``Bio.pairwise2``'s
pretty print method ``format_alignment`` to restore the output of local
alignments to the 'old' format (showing the whole sequences including the
un-aligned parts instead of only showing the aligned parts).

A new function ``charge_at_pH(pH)`` has been added to ``ProtParam`` and
``IsoelectricPoint`` in ``Bio.SeqUtils``.

The ``PairwiseAligner`` in ``Bio.Align`` was extended to allow generalized
pairwise alignments, i.e. alignments of any Python object, for example
three-letter amino acid sequences, three-nucleotide codons, and arrays of
integers.

A new module ``substitution_matrices`` was added to ``Bio.Align``, which
includes an ``Array`` class that can be used as a substitution matrix.
As the ``Array`` class is a subclass of a numpy array, mathematical operations
can be applied to it directly, and C code that makes use of substitution
matrices can directly access the numerical values stored in the substitution
matrices. This module is intended as a replacement of ``Bio.SubsMat``,
which is currently unmaintained.

As in recent releases, more of our code is now explicitly available under
either our original "Biopython License Agreement", or the very similar but
more commonly used "3-Clause BSD License". See the ``LICENSE.rst`` file for
more details.

Additionally, a number of small bugs and typos have been fixed with further
additions to the test suite, and there has been further work to follow the
Python PEP8, PEP257 and best practice standard coding style. We have also
started to use the ``black`` Python code formatting tool.

Many thanks to the Biopython developers and community for making this release
possible, especially the following contributors:

- Chris MacRaild
- Chris Rands
- Damien Goutte-Gattat (first contribution)
- Devang Thakkar
- Harry Jubb
- Joe Greener
- Kiran Mukhyala (first contribution)
- Konstantin Vdovkin
- Mark Amery
- Markus Piotrowski
- Michiel de Hoon
- Mike Moritz (first contribution)
- Mustafa Anil Tuncel
- Nick Negretti
- Osvaldo Zagordi (first contribution)
- Peter Cock
- Peter Kerpedjiev
- Sergio Valqui
- Spencer Bliven
- Victor Lin


16 July 2019: Biopython 1.74
============================

This release of Biopython supports Python 2.7, 3.4, 3.5, 3.6 and 3.7. However,
it will be the last release to support Python 3.4 which is now at end-of-life.
It has also been tested on PyPy2.7 v6.0.0 and PyPy3.5 v6.0.0.

As in recent releases, more of our code is now explicitly available under
either our original "Biopython License Agreement", or the very similar but
more commonly used "3-Clause BSD License". See the ``LICENSE.rst`` file for
more details.

Our core sequence objects (``Seq``, ``UnknownSeq``, and ``MutableSeq``) now
have a string-like ``.join()`` method.

The NCBI now allows longer accessions in the GenBank file LOCUS line, meaning
the fields may not always follow the historical column based positions. We
no longer give a warning when parsing these. We now allow writing such files
(although with a warning as support for reading them is not yet widespread).

Support for the ``mysqlclient`` package, a fork of MySQLdb, has been added.

We now capture the IDcode field from PDB Header records.

``Bio.pairwise2``'s pretty-print output from ``format_alignment`` has been
optimized for local alignments: If they do not consist of the whole sequences,
only the aligned sections of the sequences are shown, together with the start
positions of the sequences (in 1-based notation). Alignments of lists will now
also be prettily printed.

``Bio.SearchIO`` now supports parsing the text output of the HHsuite protein
sequence search tool. The format names are ``hhsuite2-text`` and
``hhsuite3-text``, for versions 2 and 3 of HHsuite, respectively.

``Bio.SearchIO`` HSP objects have a new attribute called ``output_index``. This
attribute is meant for capturing the order in which the HSPs were output in the
parsed file and is set with a default value of -1 for all HSP objects. It is
also used for sorting the output of ``QueryResult.hsps``.

``Bio.SeqIO.AbiIO`` has been updated to preserve bytes value when parsing. The
goal of this change is to make the parser more robust by being able to extract
string-values that are not utf-8-encoded.
This affects all tag values, except
for ID and description values, where they need to be extracted as strings
to conform to the ``SeqRecord`` interface. In this case, the parser will
attempt to decode using ``utf-8`` and fall back to the system encoding if that
fails. This change affects Python 3 only.

``Bio.motifs.mast`` has been updated to parse XML output files from MAST
instead of the plain-text output file. The goal of this change is to parse a
more structured data source with minimal loss of functionality upon future
MAST releases. Class structure remains the same plus an additional attribute
``Record.strand_handling`` required for diagram parsing.

``Bio.Entrez`` now automatically retries HTTP requests on failure. The
maximum number of tries and the sleep between them can be configured by
changing ``Bio.Entrez.max_tries`` and ``Bio.Entrez.sleep_between_tries``.
(The defaults are 3 tries and 15 seconds, respectively.)

The restriction enzyme list in ``Bio.Restriction`` has been updated to the May
2019 release of REBASE.

All tests using the older print-and-compare approach have been replaced by
unittests following Python's standard testing framework.

On the documentation side, all the public modules, classes, methods and
functions now have docstrings (built in help strings). Furthermore, the PDF
version of the *Biopython Tutorial and Cookbook* now uses syntax coloring
for code snippets.

Additionally, a number of small bugs and typos have been fixed with further
additions to the test suite, and there has been further work to follow the
Python PEP8, PEP257 and best practice standard coding style.

Many thanks to the Biopython developers and community for making this release
possible, especially the following contributors:

- Andrey Raspopov (first contribution)
- Antony Lee
- Benjamin Rowell (first contribution)
- Bernhard Thiel
- Brandon Invergo
- Catherine Lesuisse
- Chris Rands
- Deepak Khatri (first contribution)
- Gert Hulselmans
- Jared Andrews
- Jens Thomas (first contribution)
- Konstantin Vdovkin
- Lenna Peterson
- Mark Amery
- Markus Piotrowski
- Micky Yun Chan (first contribution)
- Nick Negretti
- Peter Cock
- Peter Kerpedjiev
- Ralf Stephan
- Rob Miller (first contribution)
- Sergio Valqui
- Victor Lin
- Wibowo 'Bow' Arindrarto
- Zheng Ruan


18 December 2018: Biopython 1.73
================================

This release of Biopython supports Python 2.7, 3.4, 3.5, 3.6 and 3.7.
It has also been tested on PyPy2.7 v6.0.0 and PyPy3.5 v6.0.0.

As in recent releases, more of our code is now explicitly available under
either our original "Biopython License Agreement", or the very similar but
more commonly used "3-Clause BSD License". See the ``LICENSE.rst`` file for
more details.

The dictionary-like indexing in SeqIO and SearchIO will now explicitly preserve
record order to match a behaviour change in the Python standard dict object.
This means looping over the index will load the records in the on-disk order,
which will be much faster (previously it would be effectively at random, based
on the key hash sorting).

The "grant" matrix in Bio.SubsMat.MatrixInfo has been replaced as our original
values taken from Gerhard Vogt's old webpages at EMBL Heidelberg were
discovered to be in error. The new values have been transformed following
Vogt's approach, taking the global maximum 215 minus the similarity scores
from the original paper Grantham (1974), to give a distance measure.
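
The transformation described for the "grant" matrix amounts to subtracting
each published score from the global maximum of 215. The snippet below is a
minimal sketch of that arithmetic only. The helper name ``transform_score``
and the toy scores are assumptions for illustration; they are not the values
stored in ``Bio.SubsMat.MatrixInfo``.

```python
GLOBAL_MAXIMUM = 215  # global maximum quoted in the release note


def transform_score(score, maximum=GLOBAL_MAXIMUM):
    """Return ``maximum - score`` (hypothetical helper for illustration)."""
    return maximum - score


# Toy scores keyed by residue pair; the numbers are illustrative only.
toy_scores = {("C", "W"): 215, ("I", "L"): 5, ("A", "G"): 60}
transformed = {pair: transform_score(value) for pair, value in toy_scores.items()}
```

A pair at the global maximum maps to 0, and a pair scoring 5 maps to 210.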

Additionally, a number of small bugs and typos have been fixed with further
additions to the test suite, and there has been further work to follow the
Python PEP8, PEP257 and best practice standard coding style.

Double-quote characters in GenBank feature qualifier values in ``Bio.SeqIO``
are now escaped as per the NCBI standard. Improperly escaped values trigger a
warning on parsing.

There is a new command line wrapper for the BWA-MEM sequence mapper.

The string-based FASTA parsers in ``Bio.SeqIO.FastaIO`` have been optimised,
which also speeds up parsing FASTA files using ``Bio.SeqIO.parse()``.

Many thanks to the Biopython developers and community for making this release
possible, especially the following contributors:

- Alona Levy-Jurgenson (first contribution)
- Ariel Aptekmann
- Brandon Invergo
- Catherine Lesuisse
- Chris Rands
- Darcy Mason (first contribution)
- Devang Thakkar (first contribution)
- Ivan Antonov (first contribution)
- Jeremy LaBarage (first contribution)
- Juraj Szász (first contribution)
- Kai Blin
- Konstantin Vdovkin (first contribution)
- Manuel Nuno Melo (first contribution)
- Maximilian Greil
- Nick Negretti (first contribution)
- Peter Cock
- Rona Costello (first contribution)
- Spencer Bliven
- Wibowo 'Bow' Arindrarto
- Yi Hsiao (first contribution)


21 June 2018: Biopython 1.72
============================

This release of Biopython supports Python 2.7, 3.4, 3.5 and 3.6.
It has also been tested on PyPy2.7 v6.0.0 and PyPy3.5 v6.0.0.

Internal changes to Bio.SeqIO have sped up the SeqRecord .format method and
SeqIO.write (especially when used in a for loop).

The MAF alignment indexing in Bio.AlignIO.MafIO has been updated to use
inclusive end co-ordinates to better handle searches at end points. This
will require you to rebuild any existing MAF index files.

In this release more of our code is now explicitly available under either our
original "Biopython License Agreement", or the very similar but more commonly
used "3-Clause BSD License". See the ``LICENSE.rst`` file for more details.

The Entrez module now supports the NCBI API key. Also you can now set a custom
directory for DTD and XSD files. This allows Entrez to be used in environments
like AWS Lambda, which restricts write access to specific directories.
Support for parsing NCBI Entrez XML files that use XSD schemas has also been
improved.

Internal changes to our C code mean that NumPy is no longer required at
compile time - only at run time (and only for those modules which use NumPy).

Seq, UnknownSeq, MutableSeq and derived classes now support integer
multiplication methods, matching native Python string methods.

A translate method has been added to Bio.SeqFeature that will extract a
feature and translate it using the codon_start and transl_table qualifiers
of the feature if they are present.

Bio.SearchIO is no longer considered experimental, and so it does not raise
warnings anymore when imported.

A new pairwise sequence aligner is available in Bio.Align, as an alternative
to the existing pairwise sequence aligner in Bio.pairwise2.

Many thanks to the Biopython developers and community for making this release
possible, especially the following contributors:

- Benjamin Vaisvil (first contribution)
- Blaise Li
- Chad Parmet
- Chris Rands
- Connor T. Skennerton
- Francesco Gastaldello
- Michiel de Hoon
- Pamela Russell (first contribution)
- Peter Cock
- Spencer Bliven
- Stefans Mezulis
- Wibowo 'Bow' Arindrarto


3 April 2018: Biopython 1.71
============================

This release of Biopython supports Python 2.7, 3.4, 3.5 and 3.6.
It has also been tested on PyPy2.7 v5.10.0 and PyPy3.5 v5.10.1.

Python 3 is the primary development platform for Biopython. We will drop
support for Python 2.7 no later than 2020, in line with the end-of-life or
sunset date for Python 2.7 itself.

Encoding issues have been fixed in several parsers when reading data files
with non-ASCII characters, like accented letters in people's names. This would
raise ``UnicodeDecodeError: 'ascii' codec can't decode byte ...`` under some
system locale settings.

Bio.KEGG can now parse Gene files.

The multiple-sequence-alignment object used by Bio.AlignIO etc now supports
a per-column annotation dictionary, useful for richly annotated alignments
in the Stockholm/PFAM format.

The SeqRecord object now has a translate method, following the approach used
for its existing reverse_complement method etc.

The output of function ``format_alignment`` in ``Bio.pairwise2`` for displaying
a pairwise sequence alignment as text now indicates gaps and mis-matches.

Bio.SeqIO now supports reading and writing two-line-per-record FASTA files
under the format name "fasta-2line", useful if you wish to work without
line-wrapped sequences.

Bio.PDB now contains a writer for the mmCIF file format, which has been the
standard PDB archive format since 2014. This allows structural objects to be
written out and facilitates conversion between the PDB and mmCIF file formats.

Bio.Emboss.Applications has been updated to fix a wrong parameter in the
fuzznuc wrapper and to include a new wrapper for fuzzpro.

The restriction enzyme list in ``Bio.Restriction`` has been updated to the
November 2017 release of REBASE.

New codon tables 27-31 from NCBI (NCBI genetic code table version 4.2)
were added to Bio.Data.CodonTable. Note that tables 27, 28 and 31 contain
no dedicated stop codons; the stop codons in these codes have a context
dependent encoding as either STOP or as amino acid.

IO functions such as ``SeqIO.parse`` now accept any objects which can be passed
to the builtin ``open`` function. Specifically, this allows using
``pathlib.Path`` objects under Python 3.6 and newer, as per `PEP 519
<https://www.python.org/dev/peps/pep-0519/>`_.

Bio.SearchIO can now parse InterProScan XML files.

For Python 3 compatibility, comparison operators for the entities within a
Bio.PDB Structure object were implemented. These allow the comparison of
models, chains, residues, and atoms with the common operators (==, !=, >, ...).
Comparisons are based on IDs and take the parents of the entity up to the
model level into account. For consistent behaviour of all entities the
operators for atoms were modified to also consider the parent IDs. NOTE: this
represents a change in behaviour with respect to v1.70 for Atom comparisons. In
order to mimic the behaviour of previous versions, comparison will have to be
done for Atom IDs and alternative locations specifically.

In this release more of our code is now explicitly available under either our
original "Biopython License Agreement", or the very similar but more commonly
used "3-Clause BSD License". See the ``LICENSE.rst`` file for more details.

Additionally, a number of small bugs and typos have been fixed with further
additions to the test suite, and there has been further work to follow the
Python PEP8, PEP257 and best practice standard coding style.

Many thanks to the Biopython developers and community for making this release
possible, especially the following contributors:

- Adhemar Zerlotini
- Ariel Aptekmann
- Chris Rands
- Christian Brueffer
- Connor T.
  Skennerton
- Erik Cederstrand (first contribution)
- Fei Qi (first contribution)
- Francesco Gastaldello
- James Jeffryes (first contribution)
- Jerven Bolleman (first contribution)
- Joe Greener (first contribution)
- Joerg Schaarschmidt (first contribution)
- João Rodrigues
- Jeroen Van Goey
- Jun Aruga (first contribution)
- Kai Blin
- Kozo Nishida
- Lewis A. Marshall (first contribution)
- Markus Piotrowski
- Michiel de Hoon
- Nicolas Fontrodona (first contribution)
- Peter Cock
- Philip Bergstrom (first contribution)
- rht (first contribution)
- Saket Choudhary
- Shuichiro MAKIGAKI (first contribution)
- Shyam Saladi (first contribution)
- Siong Kong
- Spencer Bliven
- Stefans Mezulis
- Steve Bond
- Yasar L. Ahmed (first contribution)
- Zachary Sailer (first contribution)
- Zaid Ur-Rehman (first contribution)


10 July 2017: Biopython 1.70
============================

This release of Biopython supports Python 2.7, 3.4, 3.5 and 3.6 (we have now
dropped support for Python 3.3). It has also been tested on PyPy v5.7,
PyPy3.5 v5.8 beta, and Jython 2.7 (although support for Jython is deprecated).

Biopython now has a new logo, contributed by Patrick Kunzmann. Drawing on our
original logo and the current Python logo, this shows a yellow and blue snake
forming a double helix.

For installation Biopython now assumes ``setuptools`` is present, and takes
advantage of this to declare we require NumPy at install time (except under
Jython). This should help ensure ``pip install biopython`` works smoothly.

Bio.AlignIO now supports Mauve's eXtended Multi-FastA (XMFA) file format
under the format name "mauve" (contributed by Eric Rasche).

Bio.ExPASy was updated to fix fetching PROSITE and PRODOC records, and return
text-mode handles for use under Python 3.

Two new arguments for reading and writing blast-xml files have been added
to the Bio.SearchIO functions (read/parse and write, respectively). They
are 'use_raw_hit_ids' and 'use_raw_query_ids'. Check out the relevant
SearchIO.BlastIO documentation for a complete description of what these
arguments do.

Bio.motifs was updated to support changes in MEME v4.11.4 output.

The Bio.Seq sequence objects now have a ``.count_overlap()`` method to
supplement the Python string-like, non-overlapping ``.count()`` method.

The Bio.SeqFeature location objects can now be compared for equality.

Bio.Phylo.draw_graphviz is now deprecated. We recommend using Bio.Phylo.draw
instead, or another library or program if more advanced plotting functionality
is needed.

In Bio.Phylo.TreeConstruction, the DistanceMatrix class (previously
_DistanceMatrix) has a new method 'format_phylip' to write Phylip-compatible
distance matrix files (contributed by Jordan Willis).

Additionally, a number of small bugs have been fixed with further additions
to the test suite, and there has been further work to follow the Python PEP8,
PEP257 and best practice standard coding style.

Many thanks to the Biopython developers and community for making this release
possible, especially the following contributors:

- Aaron Kitzmiller (first contribution)
- Adil Iqbal (first contribution)
- Allis Tauri
- Andrew Guy
- Ariel Aptekmann (first contribution)
- Ben Fulton
- Bertrand Caron (first contribution)
- Chris Rands (first contribution)
- Connor T. Skennerton
- Eric Rasche
- Eric Talevich
- Francesco Gastaldello
- François Coste (first contribution)
- Frederic Sapet (first contribution)
- Jimmy O'Donnell (first contribution)
- Jared Andrews (first contribution)
- John Kern (first contribution)
- Jordan Willis (first contribution)
- João Rodrigues
- Kai Blin
- Markus Piotrowski
- Mateusz Korycinski (first contribution)
- Maximilian Greil
- Michiel de Hoon
- morrme (first contribution)
- Noam Kremen (first contribution)
- Patrick Kunzmann (first contribution)
- Peter Cock
- Rasmus Fonseca (first contribution)
- Rodrigo Dorantes-Gilardi (first contribution)
- Sacha Laurent (first contribution)
- Sourav Singh
- Ted Cybulski (first contribution)
- Tiago Antao
- Wibowo 'Bow' Arindrarto
- Zheng Ruan


6 April 2017: Biopython 1.69
============================

This release of Biopython supports Python 2.7, 3.3, 3.4, 3.5 and 3.6 (we have
now dropped support for Python 2.6). It has also been tested on PyPy v5.7,
PyPy3.5 v5.7 beta, and Jython 2.7.

We have started to dual-license Biopython under both our original liberal
"Biopython License Agreement", and the very similar but more commonly used
"3-Clause BSD License". In this release a small number of the Python files
are explicitly available under either license, but most of the code remains
under the "Biopython License Agreement" only. See the ``LICENSE.rst`` file
for more details.

We now expect and take advantage of NumPy under PyPy, and compile most of the
Biopython C code modules as well.

Bio.AlignIO now supports the UCSC Multiple Alignment Format (MAF) under the
format name "maf", using new module Bio.AlignIO.MafIO which also offers
indexed access to these potentially large files using SQLite3 (contributed by
Andrew Sczesnak, with additional refinements from Adam Novak).

Bio.SeqIO.AbiIO has been extended to support parsing FSA files. The
underlying format (ABIF) remains the same as AB1 files and so the string
'abi' is the expected format argument in the main SeqIO functions. AbiIO
determines whether the file is AB1 or FSA based on the presence of specific
tags.

The Uniprot parser is now able to parse "submittedName" elements in XML files.

The NEXUS parser handling of internal node comments has been improved, which
should help if working with tools like the BEAST TreeAnnotator. Slashes are
now also allowed in identifiers.

New parser for ExPASy Cellosaurus, a cell line database, cell line catalogue,
and cell line ontology (contributed by Steve Marshall).

For consistency the Bio.Seq module now offers a complement function (already
available as a method on the Seq and MutableSeq objects).

The SeqFeature object's qualifiers is now an explicitly ordered dictionary
(note that as of Python 3.6 the Python dict is ordered by default anyway).
This helps reproduce GenBank/EMBL files on input/output.

The Bio.SeqIO UniProt-XML parser was updated to cope with features with
unknown locations which can be found in mass spec data.

The Bio.SeqIO GenBank, EMBL, and IMGT parsers now record the molecule type
from the LOCUS/ID line explicitly in the record.annotations dictionary.
The Bio.SeqIO EMBL parser was updated to cope with more variants seen in
patent data files, and the related IMGT parser was updated to cope with
IPD-IMGT/HLA database files after release v3.16.0 when their ID line changed.
The GenBank output now uses colon space to match current NCBI DBLINK lines.

The Bio.Affy package supports Affymetrix version 4 of the CEL file format,
in addition to version 3.

The restriction enzyme list in ``Bio.Restriction`` has been updated to the
February 2017 release of REBASE.

Bio.PDB.PDBList now can download PDBx/mmCif (new default), PDB (old default),
PDBML/XML and mmtf format protein structures. This is in line with the RCSB
recommendation to use PDBx/mmCif and deprecate the PDB file format. Biopython
already has support for parsing mmCif files.

Additionally, a number of small bugs have been fixed with further additions
to the test suite, and there has been further work to follow the Python PEP8,
PEP257 and best practice standard coding style.

Many thanks to the Biopython developers and community for making this release
possible, especially the following contributors:

- Aaron Rosenfeld
- Adam Kurkiewicz (first contribution)
- Adam Novak (first contribution)
- Adrian Altenhoff (first contribution)
- Allis Tauri (first contribution)
- Andrew Dalke
- Andrew Guy (first contribution)
- Andrew Sczesnak (first contribution)
- Ben Fulton
- Bernhard Thiel (first contribution)
- Bertrand Néron
- Blaise Li (first contribution)
- Brandon Carter (first contribution)
- Brandon Invergo
- Carlos Pena
- Carlos Ríos
- Chris Warth
- Emmanuel Noutahi
- Foen Peng (first contribution)
- Francesco Gastaldello (first contribution)
- Francisco Pina-Martins (first contribution)
- Hector Martinez (first contribution)
- Jacek Śmietański
- Jack Twilley (first contribution)
- Jeroen Van Goey (first contribution)
- Joshua Meyers (first contribution)
- Kurt Graff (first contribution)
- Lenna Peterson
- Leonhard Heizinger (first contribution)
- Marcin Magnus (first contribution)
- Markus Piotrowski
- Maximilian Greil (first contribution)
- Michał J. Gajda (first contribution)
- Michiel de Hoon
- Milind Luthra (first contribution)
- Oscar G. Garcia (first contribution)
- Owen Solberg
- Peter Cock
- Richard Neher (first contribution)
- Sebastian Bassi
- Sourav Singh (first contribution)
- Spencer Bliven (first contribution)
- Stefans Mezulis
- Steve Bond
- Steve Marshall (first contribution)
- Uri Laserson
- Veronika Berman (first contribution)
- Vincent Davis
- Wibowo 'Bow' Arindrarto


25 August 2016: Biopython 1.68
==============================

This release of Biopython supports Python 2.6, 2.7, 3.3, 3.4 and 3.5, but
this will be our final release to run on Python 2.6. It has also been tested
on PyPy 5.0, PyPy3 version 2.4, and Jython 2.7.

Bio.PDB has been extended to parse the RCSB's new binary Macromolecular
Transmission Format (MMTF, see http://mmtf.rcsb.org), in addition to the
mmCIF and PDB file formats (contributed by Anthony Bradley). This requires
an optional external dependency on the mmtf-python library.

Module Bio.pairwise2 has been re-written (contributed by Markus Piotrowski).
It is now faster, addresses some problems with local alignments, and also
now allows gap insertions after deletions, and vice versa, inspired by the
https://doi.org/10.1101/031500 preprint from Flouri et al.

The two sample graphical tools SeqGui (Sequence Graphical User Interface)
and xbbtools were rewritten (SeqGui) or updated (xbbtools) using the tkinter
library (contributed by Markus Piotrowski). SeqGui allows simple nucleotide
transcription, back-transcription and translation into amino acids using
Bio.Seq internally, offering any of the NCBI genetic codes supported in
Biopython. xbbtools is able to open Fasta formatted files, does simple
nucleotide operations and translations in any reading frame using one of the
NCBI genetic codes. In addition, it supports standalone Blast installations to
do local Blast searches.

New NCBI genetic code table 26 (Pachysolen tannophilus Nuclear Code) has been
added to Bio.Data (and the translation functionality), and table 11 is now
also available under the alias Archaeal.

In line with NCBI website changes, Biopython now uses HTTPS rather than HTTP
to connect to the NCBI Entrez and QBLAST API.

Additionally, a number of small bugs have been fixed with further additions
to the test suite, and there has been further work to follow the Python PEP8
and best practice standard coding style.

Many thanks to the Biopython developers and community for making this release
possible, especially the following contributors:

- Anthony Bradley (first contribution)
- Ben Fulton
- Carlos Pena
- Connor T. Skennerton
- Iddo Friedberg
- Kai Blin
- Kristian Davidsen (first contribution)
- Markus Piotrowski
- Olivier Morelle (first contribution)
- Peter Cock
- Stefans Mezulis (first contribution)
- Tiago Antao
- Travis Wrightsman
- Uwe Schmitt (first contribution)
- Xiaoyu Zhuo (first contribution)


8 June 2016: Biopython 1.67
===========================

This release of Biopython supports Python 2.6, 2.7, 3.3, 3.4 and 3.5, but
support for Python 2.6 is considered to be deprecated. It has also been
tested on PyPy 5.0, PyPy3 version 2.4, and Jython 2.7.

Comparison of SeqRecord objects until now has used the default Python object
comparison (are they the same instance in memory?). This can be surprising, but
comparing all of the attributes would be too complex. As of this release
attempting to compare SeqRecord objects should raise an exception instead. If
you want the old behaviour, use id(record1) == id(record2) instead.

New experimental module Bio.phenotype is for working with Phenotype Microarray
plates in JSON and the machine vendor's CSV format (contributed by Marco
Galardini).
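
The SeqRecord comparison change described above can be pictured with a small
stand-in class. This is only a sketch: the ``Record`` class is hypothetical
and is not Bio.SeqRecord, and the choice of ``NotImplementedError`` is an
assumption, since the text above only says that an exception should be raised.

```python
class Record:
    """Hypothetical stand-in for a record class; not Bio.SeqRecord itself."""

    def __init__(self, name):
        self.name = name

    def __eq__(self, other):
        # Assumed exception type: refuse ambiguous comparisons outright
        # rather than silently falling back to an identity test.
        raise NotImplementedError("compare specific attributes instead")

    # Defining __eq__ disables the default hash, so restore identity hashing.
    __hash__ = object.__hash__


record1 = Record("alpha")
record2 = record1          # same object in memory
record3 = Record("alpha")  # equal-looking, but a different object

# The "old behaviour" is an explicit identity check.
same_instance = id(record1) == id(record2)
different_instance = id(record1) != id(record3)
print(same_instance, different_instance)
```

Comparing chosen attributes explicitly, for example ``record1.name ==
record3.name``, keeps the intent of the comparison visible to the reader.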

Following the convention used elsewhere in Biopython, there is a new function
Bio.KEGG.read(...) for parsing KEGG files expected to contain a single record
only - the existing function Bio.KEGG.parse(...) is intended to be used to
iterate over multi-record files.

When a gap character is defined, Bio.Seq will now translate gap codons
(e.g. "---") into a single gap ("-") in the protein sequence. The gap character
is inferred from the Seq object's alphabet, but it can also be passed as an
argument to the translate method.

The new NCBI genetic code table 25, covering Candidate Division SR1 and
Gracilibacteria, has been added to Bio.Data (and the translation
functionality).

The Bio.Entrez interface will automatically use an HTTP POST rather than
HTTP GET if the URL would exceed 1000 characters. This is based on NCBI
guidelines and the fact that very long queries like complex searches can
otherwise trigger an HTTP Error 414 Request URI too long.

Foreign keys are now used when creating BioSQL databases with SQLite3 (this
was not possible until SQLite version 3.6.19). The BioSQL taxonomy code now
updates the taxon table left/right keys when updating the taxonomy.

There have been some fixes to the MMCIF structure parser which now uses
identifiers which better match results from the PDB structure parser.

The restriction enzyme list in ``Bio.Restriction`` has been updated to the
May 2016 release of REBASE.

The mmCIF parser in Bio.PDB.MMCIFParser has been joined by a second version
which only looks at the ATOM and HETATM lines and can be much faster.

The Bio.KEGG.REST will now return unicode text-based handles, except for
images which remain as binary bytes-based handles, making it easier to use
with the mostly text-based parsers in Biopython.
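
To illustrate the difference between text-based and bytes-based handles
mentioned for Bio.KEGG.REST, here is a minimal standard-library sketch. It
does not call Bio.KEGG.REST; the ``as_text_handle`` helper and the sample
record are hypothetical, and UTF-8 is an assumed encoding.

```python
import io


def as_text_handle(binary_handle, encoding="utf-8"):
    """Wrap a bytes-based handle so that reading from it yields str."""
    # Hypothetical helper for illustration only; not part of Biopython.
    return io.TextIOWrapper(binary_handle, encoding=encoding)


# Stand-in for a binary response body (made-up KEGG-style text).
raw = io.BytesIO(b"ENTRY       K00001\nNAME        example\n")

text_handle = as_text_handle(raw)
first_line = text_handle.readline()

# A text handle yields str lines, which line-oriented text parsers expect.
print(type(first_line).__name__, repr(first_line))
```

Binary payloads such as images would be left unwrapped and read as bytes.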

Note that the BioSQL test configuration information is now in a new file
Tests/biosql.ini rather than directly in Tests/test_BioSQL_*.py as before.
You can make a copy of the provided example file Tests/biosql.ini.sample
as Tests/biosql.ini and edit this if you wish to run the BioSQL tests.

Additionally, a number of small bugs have been fixed with further additions
to the test suite, and there has been further work to follow the Python PEP8
standard coding style, and in converting our docstring documentation to use
the reStructuredText markup style.

Many thanks to the Biopython developers and community for making this release
possible, especially the following contributors:

- Aaron Rosenfeld (first contribution)
- Anders Pitman (first contribution)
- Barbara Mühlemann (first contribution)
- Ben Fulton
- Ben Woodcroft (first contribution)
- Brandon Invergo
- Brian Osborne (first contribution)
- Carlos Pena
- Chaitanya Gupta (first contribution)
- Chris Warth (first contribution)
- Christiam Camacho (first contribution)
- Connor T. Skennerton
- David Koppstein (first contribution)
- Eric Talevich
- Jacek Śmietański (first contribution)
- João D Ferreira (first contribution)
- João Rodrigues
- Joe Cora (first contribution)
- Kai Blin
- Leighton Pritchard
- Lenna Peterson
- Marco Galardini (first contribution)
- Markus Piotrowski
- Matt Ruffalo (first contribution)
- Matteo Sticco (first contribution)
- Nader Morshed (first contribution)
- Owen Solberg (first contribution)
- Peter Cock
- Steve Bond (first contribution)
- Terry Jones (first contribution)
- Vincent Davis
- Zheng Ruan


21 October 2015: Biopython 1.66
===============================

This release of Biopython supports Python 2.6, 2.7, 3.3, 3.4 and 3.5, but
support for Python 2.6 is considered to be deprecated. It has also been
tested on PyPy 2.4 to 2.6, PyPy3 version 2.4, and Jython 2.7.

Further work on the Bio.KEGG and Bio.Graphics modules now allows drawing KGML
pathways with transparency.

The Bio.SeqIO "abi" parser now decodes almost all the documented fields used
by the ABIF instruments - including the individual color channels.

Bio.PDB now has a QCPSuperimposer module using the Quaternion Characteristic
Polynomial algorithm for superimposing structures. This is a fast alternative
to the existing SVDSuperimposer code using singular value decomposition.

Bio.Entrez now implements the NCBI Entrez Citation Matching function
(ECitMatch), which retrieves PubMed IDs (PMIDs) that correspond to a set of
input citation strings.

Bio.Entrez.parse(...) now supports NCBI XML files using XSD schemas, which
will be downloaded and cached like NCBI DTD files.

A subtle bug in how multi-part GenBank/EMBL locations on the reverse strand
were parsed into CompoundLocations was fixed: complement(join(...)) as used
by NCBI worked, but join(complement(...),complement(...),...) as used by
EMBL/ENSEMBL gave the CompoundLocation parts in the wrong order. A related
bug when taking the reverse complement of a SeqRecord containing features
with CompoundLocations was also fixed.

Additionally, a number of small bugs have been fixed with further additions
to the test suite, and there has been further work on conforming to the
Python PEP8 standard coding style.

Many thanks to the Biopython developers and community for making this release
possible, especially the following contributors:

- Alan Medlar (first contribution)
- Anthony Mathelier (first contribution)
- Antony Lee (first contribution)
- Anuj Sharma (first contribution)
- Ben Fulton (first contribution)
- Bertrand Néron
- Brandon Invergo
- Carlos Pena
- Christian Brueffer
- Connor T. Skennerton (first contribution)
- David Arenillas (first contribution)
- David Nicholson (first contribution)
- Emmanuel Noutahi (first contribution)
- Eric Rasche (first contribution)
- Fabio Madeira (first contribution)
- Franco Caramia (first contribution)
- Gert Hulselmans (first contribution)
- Gleb Kuznetsov (first contribution)
- João Rodrigues
- John Bradley (first contribution)
- Kai Blin
- Kian Ho (first contribution)
- Kozo Nishida (first contribution)
- Kuan-Yi Li (first contribution)
- Leighton Pritchard
- Lucas Sinclair
- Michiel de Hoon
- Peter Cock
- Saket Choudhary
- Sunhwan Jo (first contribution)
- Tarcisio Fedrizzi (first contribution)
- Tiago Antao
- Vincent Davis


17 December 2014: Biopython 1.65 released.
==========================================

The Biopython sequence objects now use string comparison, rather than Python's
object comparison. This has been planned for a long time with warning messages
in place (under Python 2; sadly the warnings were missing under Python 3).

The Bio.KEGG and Bio.Graphics modules have been expanded with support for
the online KEGG REST API, and parsing, representing and drawing KGML pathways.

The Pterobranchia Mitochondrial genetic code has been added to Bio.Data (and
the translation functionality), which is the new NCBI genetic code table 24.

The Bio.SeqIO parser for the ABI capillary file format now exposes all the raw
data in the SeqRecord's annotation as a dictionary. This allows further
in-depth analysis by advanced users.

Bio.SearchIO QueryResult objects now allow Hit retrieval using its alternative
IDs (any IDs listed after the first one, for example as used with the NCBI
BLAST NR database).

We have also done some more work applying PEP8 coding styles to Biopython.

Bio.SeqUtils.MeltingTemp has been rewritten with new functionality.

The new experimental module Bio.CodonAlign has been renamed Bio.codonalign
(and similar lower case PEP8 style module names have been used for the
sub-modules within this).

Bio.SeqIO.index_db(...) and Bio.SearchIO.index_db(...) now store any relative
filenames relative to the index file, rather than (as before) relative to the
current directory at the time the index was built. This makes the indexes
less fragile, so that they can be used from other working directories. NOTE:
This change is backward compatible (old index files work as before), however
relative paths in new indexes will not work on older versions of Biopython!

Biopython also seems to work fine under PyPy3 2.4 which implements Python 3.2
plus unicode string literals.
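
The change to how ``index_db`` stores relative filenames, described above, can
be sketched with the standard library alone. This is not Biopython's
implementation: ``path_for_index`` and ``resolve_from_index`` are hypothetical
helpers, and the file names are made up.

```python
import os


def path_for_index(filename, index_filename):
    """Return filename expressed relative to the index file's directory."""
    # Hypothetical helper for illustration; not Biopython's actual code.
    index_dir = os.path.dirname(os.path.abspath(index_filename))
    return os.path.relpath(os.path.abspath(filename), index_dir)


def resolve_from_index(stored, index_filename):
    """Turn a stored relative path back into an absolute path."""
    index_dir = os.path.dirname(os.path.abspath(index_filename))
    return os.path.normpath(os.path.join(index_dir, stored))


# The sequence file sits next to the index, so only its base name is stored.
stored = path_for_index("data/reads.fasta", "data/reads.idx")
resolved = resolve_from_index(stored, "data/reads.idx")
print(stored)
```

Because the stored path is anchored to the index file rather than to the
working directory, resolving it gives the same answer wherever the script is
started from, provided the index and the data files keep their relative
layout.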

Many thanks to the Biopython developers and community for making this release
possible, especially the following contributors:

- Alan Du (first contribution)
- Carlos Pena (first contribution)
- Colin Lappala (first contribution)
- Christian Brueffer
- David Bulger (first contribution)
- Eric Talevich
- Evan Parker (first contribution)
- Hongbo Zhu
- Kai Blin
- Kevin Wu (first contribution)
- Leighton Pritchard
- Leszek Pryszcz (first contribution)
- Markus Piotrowski
- Matt Shirley (first contribution)
- Mike Cariaso (first contribution)
- Peter Cock
- Seth Sims (first contribution)
- Tiago Antao
- Travis Wrightsman (first contribution)
- Tyghe Vallard (first contribution)
- Vincent Davis
- Wibowo 'Bow' Arindrarto
- Zheng Ruan


29 May 2014: Biopython 1.64 released.
=====================================

This release of Biopython supports Python 2.6 and 2.7, 3.3 and also the
new 3.4 version. It is also tested on PyPy 2.0 to 2.3, and Jython 2.7b2.

The new experimental module Bio.CodonAlign facilitates building codon
alignment and further analysis upon it. This work is from the Google
Summer of Code (GSoC) project by Zheng Ruan.

Bio.Phylo now has tree construction and consensus modules, from the
GSoC work by Yanbo Ye.

Bio.Entrez will now automatically download and cache new NCBI DTD files for
XML parsing under the user's home directory (using ``~/.biopython`` on
Unix like systems, and ``$APPDATA/biopython`` on Windows).

Bio.Sequencing.Applications now includes a wrapper for the samtools command
line tool.

Bio.PopGen.SimCoal now also supports fastsimcoal.

SearchIO hmmer3-text, hmmer3-tab, and hmmer3-domtab now support output from
hmmer3.1b1.

The ``accession`` of QueryResult and Hit objects created when using the
'hmmer3-tab' format are now properly named as ``accession`` (previously they
were ``acc``, deviating from the documentation).

The ``homology`` key in the ``aln_annotation`` attribute of an HSP object in
Bio.SearchIO has been renamed to ``similarity``.

The Bio.SeqUtils masses and molecular_weight function have been updated.

BioSQL can now use the mysql-connector package (available for Python 2, 3
and PyPy) as an alternative to MySQLdb (Python 2 only) to connect to a MySQL
database.

Many thanks to the Biopython developers and community for making this release
possible, especially the following contributors:

- Chunlei Wu (first contribution)
- Edward Liaw (first contribution)
- Eric Talevich
- Leighton Pritchard
- Manlio Calvi (first contribution)
- Markus Piotrowski (first contribution)
- Melissa Gymrek (first contribution)
- Michiel de Hoon
- Nigel Delaney
- Peter Cock
- Saket Choudhary
- Tiago Antao
- Vincent Davis (first contribution)
- Wibowo 'Bow' Arindrarto
- Yanbo Ye (first contribution)
- Zheng Ruan (first contribution)


4 December 2013: Biopython 1.63 released.
=========================================

This release supports Python 3.3 onwards without conversion via the 2to3
library. See the Biopython 1.63 beta release notes below for details. Since
the beta release we have made some minor bug fixes and test improvements.

The restriction enzyme list in Bio.Restriction has been updated to the
December 2013 release of REBASE.

Additional contributors since the beta:

- Gokcen Eraslan (first contribution)


12 November 2013: Biopython 1.63 beta released.
===============================================

This is a beta release for testing purposes; the main reason for a
beta version is the large number of changes imposed by the removal of
the 2to3 library previously required for the support of Python 3.X.
This was made possible by dropping Python 2.5 (and Jython 2.5).

This release of Biopython supports Python 2.6 and 2.7, and also Python
3.3.

The Biopython Tutorial & Cookbook, and the docstring examples in the source
code, now use the Python 3 style print function in place of the Python 2
style print statement. This language feature is available under Python 2.6
and 2.7 via::

    from __future__ import print_function

Similarly we now use the Python 3 style built-in next function in place of
the Python 2 style iterators' .next() method. This language feature is also
available under Python 2.6 and 2.7.

Many thanks to the Biopython developers and community for making this release
possible, especially the following contributors:

- Chris Mitchell (first contribution)
- Christian Brueffer
- Eric Talevich
- Josha Inglis (first contribution)
- Konstantin Tretyakov (first contribution)
- Lenna Peterson
- Martin Mokrejs
- Nigel Delaney (first contribution)
- Peter Cock
- Sergei Lebedev (first contribution)
- Tiago Antao
- Wayne Decatur (first contribution)
- Wibowo 'Bow' Arindrarto


28 August 2013: Biopython 1.62 released.
========================================

This is our first release to officially support Python 3, however it is
also our final release supporting Python 2.5. Specifically this release
is supported and tested on standard Python 2.5, 2.6, 2.7 and 3.3.
It was also tested under Jython 2.5, 2.7 and PyPy 1.9, 2.0.

See the Biopython 1.62 beta release notes below for most changes.
Since the
beta release we have added several minor bug fixes and test improvements.
Additional contributors since the beta:

- Bertrand Néron (first contribution)
- Lenna Peterson
- Martin Mokrejs
- Matsuyuki Shirota (first contribution)


15 July 2013: Biopython 1.62 beta released.
===========================================

This is a beta release for testing purposes, both for new features added,
and changes to location parsing, but more importantly Biopython 1.62 will
be our first release to officially support Python 3.

Specifically we intend Biopython 1.62 to support standard Python 2.5, 2.6, 2.7
and 3.3, but the release will also be tested under Jython 2.5, 2.7 and PyPy
1.9, 2.0 as well. It will be our final release supporting Python 2.5.

The translation functions will give a warning on any partial codons (and this
will probably become an error in a future release). If you know you are dealing
with partial sequences, either pad with N to extend the sequence length to a
multiple of three, or explicitly trim the sequence.

The handling of joins and related complex features in Genbank/EMBL files has
been changed with the introduction of a CompoundLocation object. Previously
a SeqFeature for something like a multi-exon CDS would have a child SeqFeature
(under the sub_features attribute) for each exon. The sub_features property
will still be populated for now, but is deprecated and will in future be
removed. Please consult the examples in the help (docstrings) and Tutorial.

Thanks to the efforts of Ben Morris, the Phylo module now supports the file
formats NeXML and CDAO. The Newick parser is also significantly faster, and can
now optionally extract bootstrap values from the Newick comment field (like
Molphy and Archaeopteryx do). Nate Sutton added a wrapper for FastTree to
Bio.Phylo.Applications.

New module Bio.UniProt adds parsers for the GAF, GPA and GPI formats from
UniProt-GOA.

The BioSQL module is now supported in Jython. MySQL and PostgreSQL databases
can be used. The relevant JDBC driver should be available in the CLASSPATH.

Feature labels on circular GenomeDiagram figures now support the label_position
argument (start, middle or end) in addition to the current default placement,
and in a change to prior releases these labels are outside the features which
is now consistent with the linear diagrams.

The code for parsing 3D structures in mmCIF files was updated to use the
Python standard library's shlex module instead of C code using flex.

The Bio.Sequencing.Applications module now includes a BWA command line wrapper.

Bio.motifs supports JASPAR format files with multiple position-frequency
matrices.

Additionally there have been other minor bug fixes and more unit tests.

Many thanks to the Biopython developers and community for making this release
possible, especially the following contributors:

- Alexander Campbell (first contribution)
- Andrea Rizzi (first contribution)
- Anthony Mathelier (first contribution)
- Ben Morris (first contribution)
- Brad Chapman
- Christian Brueffer
- David Arenillas (first contribution)
- David Martin (first contribution)
- Eric Talevich
- Iddo Friedberg
- Jian-Long Huang (first contribution)
- Joao Rodrigues
- Kai Blin
- Michiel de Hoon
- Nate Sutton (first contribution)
- Peter Cock
- Petra Kubincová (first contribution)
- Phillip Garland
- Saket Choudhary (first contribution)
- Tiago Antao
- Wibowo 'Bow' Arindrarto
- Xabier Bello (first contribution)


5 February 2013: Biopython 1.61 released.
=========================================

GenomeDiagram has three new sigils (shapes to illustrate features). OCTO shows
an octagonal shape, like the existing BOX sigil but with the corners cut off.
JAGGY shows a box with jagged edges at the start and end, intended for things
like NNNNN regions in draft genomes. Finally BIGARROW is like the existing
ARROW sigil but is drawn straddling the axis. This is useful for drawing
vertically compact figures where you do not have overlapping genes.

New module Bio.Graphics.ColorSpiral can generate colors along a spiral path
through HSV color space. This can be used to make arbitrary 'rainbow' scales,
for example to color features or cross-links on a GenomeDiagram figure.

The Bio.SeqIO module now supports reading sequences from PDB files in two
different ways. The "pdb-atom" format determines the sequence as it appears in
the structure based on the atom coordinate section of the file (via Bio.PDB,
so NumPy is currently required for this). Alternatively, you can use the
"pdb-seqres" format to read the complete protein sequence as it is listed in
the PDB header, if available.

The Bio.SeqUtils module now has a seq1 function to turn a sequence using three
letter amino acid codes into one using the more common one letter codes. This
acts as the inverse of the existing seq3 function.

The multiple-sequence-alignment object used by Bio.AlignIO etc now supports
an annotation dictionary. Additional support for per-column annotation is
planned, with addition and splicing to work like that for the SeqRecord
per-letter annotation.

A new warning, Bio.BiopythonExperimentalWarning, has been introduced. This
marks any experimental code included in the otherwise stable release.
Such
'beta' level code is ready for wider testing, but still likely to change and
should only be tried by early adopters to give feedback via the biopython-dev
mailing list. We'd expect such experimental code to reach stable status in
one or two releases' time, at which point our normal policies about trying to
preserve backwards compatibility would apply. See also the README file.

This release also includes Bow's Google Summer of Code work writing a unified
parsing framework for NCBI BLAST (assorted formats including tabular and XML),
HMMER, BLAT, and other sequence searching tools. This is currently available
with the new BiopythonExperimentalWarning to indicate that this is still
somewhat experimental. We're bundling it with the main release to get more
public feedback, but with the big warning that the API is likely to change.
In fact, even the current name of Bio.SearchIO may change since unless you
are familiar with BioPerl its purpose isn't immediately clear.

The Bio.Motif module has been updated and reorganized. To allow for a clean
deprecation of the old code, the new motif code is stored in a new module
Bio.motifs, and a PendingDeprecationWarning was added to Bio.Motif.

A faster low level string FASTA based parser SimpleFastaParser has been added
to Bio.SeqIO.FastaIO which like its sister function for FASTQ files does not
have the overhead of constructing SeqRecord objects.

Additionally there have been other minor bug fixes and more unit tests.

Finally, we are phasing out support for Python 2.5. We will continue support
for at least one further release (Biopython 1.62). This could be extended
given feedback from our users (or if the Jython 2.7 release is delayed, since
the current stable release Jython 2.5 implemented Python 2.5 only). Focusing
on Python 2.6 and 2.7 only will make writing Python 3 compatible code easier.

Many thanks to the Biopython developers and community for making this release
possible, especially the following contributors:

- Brandon Invergo
- Bryan Lunt (first contribution)
- Christian Brueffer (first contribution)
- David Cain
- Eric Talevich
- Grace Yeo (first contribution)
- Jeffrey Chang
- Jingping Li (first contribution)
- Kai Blin (first contribution)
- Leighton Pritchard
- Lenna Peterson
- Lucas Sinclair (first contribution)
- Michiel de Hoon
- Nick Semenkovich (first contribution)
- Peter Cock
- Robert Ernst (first contribution)
- Tiago Antao
- Wibowo 'Bow' Arindrarto


25 June 2012: Biopython 1.60 released.
======================================

New module Bio.bgzf supports reading and writing BGZF files (Blocked GNU
Zip Format), a variant of GZIP with efficient random access, most commonly
used as part of the BAM file format. This uses Python's zlib library
internally, and provides a simple interface like Python's gzip library.
Using this the Bio.SeqIO indexing functions now support BGZF compressed
sequence files.

The GenBank/EMBL parser will now give a warning on unrecognised feature
locations and continue parsing (leaving the feature's location as None).
Previously it would abort with an exception, which was often unhelpful.

The Bio.PDB.MMCIFParser is now compiled by default (but is still not
available under Jython, PyPy or Python 3).

The SFF parser in Bio.SeqIO now decodes Roche 454 'universal accession
number' 14 character read names, which encode the timestamp of the run,
the region the read came from, and the location of the well.

In the Phylo module, the "draw" function for plotting tree objects has become
much more flexible, with improved support for matplotlib conventions and new
parameters for specifying branch and taxon labels.
Writing in the PhyloXML format has been updated to more closely match the
output of other programs. A wrapper for the program RAxML has been added under
Bio.Phylo.Applications, alongside the existing wrapper for PhyML.

Additionally there have been other minor bug fixes and more unit tests.

Many thanks to the Biopython developers and community for making this release
possible, especially the following contributors:

- Brandon Invergo
- Eric Talevich
- Jeff Hussmann (first contribution)
- John Comeau (first contribution)
- Kamil Slowikowski (first contribution)
- Kevin Jacobs
- Lenna Peterson (first contribution)
- Matt Fenwick (first contribution)
- Peter Cock
- Paul T. Bathen
- Wibowo Arindrarto


24 February 2012: Biopython 1.59 released.
==========================================

Please note that this release will *not* work on Python 2.4 (while the recent
releases have worked despite us not officially supporting this).

The position objects used in Bio.SeqFeature now act almost like integers,
making dealing with fuzzy locations in EMBL/GenBank files much easier. Note as
part of this work, the arguments to create fuzzy positions OneOfPosition and
WithinPosition have changed in a non-backwards compatible way.

The SeqFeature's strand and any database reference are now properties of the
FeatureLocation object (a more logical placement), with proxy methods for
backwards compatibility. As part of this change, if you print a location
object it will now display any strand and database reference information.

The installation setup.py now supports 'install_requires' when setuptools
is installed. This avoids the manual dialog when installing Biopython via
easy_install or pip and numpy is not installed.
It also allows user libraries that require Biopython to include it in their
install_requires and get automatic installation of dependencies.

Bio.Graphics.BasicChromosome has been extended to allow simple sub-features to
be drawn on chromosome segments, suitable to show the position of genes, SNPs
or other loci. Note Bio.Graphics requires the ReportLab library.

Bio.Graphics.GenomeDiagram has been extended to allow cross-links between
tracks, and track specific start/end positions for showing regions. This can
be used to imitate the output from the Artemis Comparison Tool (ACT).
Also, a new attribute circle_core makes it easier to have an empty space in
the middle of a circular diagram (see tutorial).

Bio.Align.Applications now includes a wrapper for the command line tool
Clustal Omega for protein multiple sequence alignment.

Bio.AlignIO now supports sequential PHYLIP files (as well as interlaced
PHYLIP files) as a separate format variant.

New module Bio.TogoWS offers a wrapper for the TogoWS REST API, a web service
based in Japan offering access to KEGG, DDBJ, PDBj, CBRC plus access to some
NCBI and EBI resources including PubMed, GenBank and UniProt. This is much
easier to use than the NCBI Entrez API, and should be especially useful for
Biopython users based in Asia.

Bio.Entrez function efetch has been updated to handle the NCBI's stricter
handling of multiple ID arguments in EFetch 2.0. However, the NCBI have also
changed the retmode default argument so you may need to make this explicit,
e.g. retmode="text"

Additionally there have been other minor bug fixes and more unit tests.

Many thanks to the Biopython developers and community for making this release
possible, especially the following contributors:

- Andreas Wilm (first contribution)
- Alessio Papini (first contribution)
- Brad Chapman
- Brandon Invergo
- Connor McCoy
- Eric Talevich
- João Rodrigues
- Konrad Förstner (first contribution)
- Michiel de Hoon
- Matej Repič (first contribution)
- Leighton Pritchard
- Peter Cock


18 August 2011: Biopython 1.58 released.
========================================

A new interface and parsers for the PAML (Phylogenetic Analysis by Maximum
Likelihood) package of programs, supporting codeml, baseml and yn00 as well
as a Python re-implementation of chi2, have been added as the Bio.Phylo.PAML
module.

Bio.SeqIO now includes read and write support for SeqXML, a simple XML
format offering basic annotation support. See Schmitt et al (2011) in
Briefings in Bioinformatics, https://doi.org/10.1093/bib/bbr025

Bio.SeqIO now includes read support for ABI files ("Sanger" capillary
sequencing trace files, containing called sequence with PHRED qualities).

The Bio.AlignIO "fasta-m10" parser was updated to cope with the >>><<< lines
as used in Bill Pearson's FASTA version 3.36. Without this fix the parser
would only return alignments for the first query sequence.

The Bio.AlignIO "phylip" parser and writer now treat a dot/period in the
sequence as an error, in line with the official PHYLIP specification. Older
versions of our code didn't do anything special with this character. Also,
support for "phylip-relaxed" has been added which allows longer record names
as used in RAxML and PHYML.

Of potential interest to anyone subclassing Biopython objects, any remaining
"old style" Python classes have been switched to "new style" classes. This
allows things like defining properties.

Bio.HMM's Viterbi algorithm now expects the initial probabilities explicitly.

Many thanks to the Biopython developers and community for making this release
possible, especially the following contributors:

- Aaron Gallagher (first contribution)
- Bartek Wilczynski
- Bogdan T. (first contribution)
- Brandon Invergo (first contribution)
- Connor McCoy (first contribution)
- David Cain (first contribution)
- Eric Talevich
- Fábio Madeira (first contribution)
- Hongbo Zhu
- Joao Rodrigues
- Michiel de Hoon
- Peter Cock
- Thomas Schmitt (first contribution)
- Tiago Antao
- Walter Gillett
- Wibowo Arindrarto (first contribution)


2 April 2011: Biopython 1.57 released.
======================================

Bio.SeqIO now includes an index_db() function which extends the existing
indexing functionality to allow indexing many files, and more importantly
this keeps the index on disk in a simple SQLite3 database rather than in
memory in a Python dictionary.

Bio.Blast.Applications now includes a wrapper for the BLAST+ blast_formatter
tool from NCBI BLAST 2.2.24+ or later. This release of BLAST+ added the
ability to run the BLAST tools and save the output as ASN.1 format, and then
convert this to any other supported BLAST output format (plain text, tabular,
XML, or HTML) with the blast_formatter tool. The wrappers were also updated
to include new arguments added in BLAST 2.2.25+ such as -db_hard_mask.

The SeqRecord object now has a reverse_complement method (similar to that of
the Seq object). This is most useful for reversing per-letter-annotation (such
as quality scores from FASTQ) or features (such as annotation from GenBank).

Bio.SeqIO.write's QUAL output has been sped up, and Bio.SeqIO.convert now
uses an optimised routine for FASTQ to QUAL making this much faster.
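
To illustrate why a dedicated FASTQ to QUAL routine can be much faster than
building full record objects, here is a minimal sketch that converts at the
text level. It is not Biopython's implementation: the helper name
fastq_to_qual is hypothetical, and it assumes strict four-line Sanger style
FASTQ records (ASCII offset 33) with no line wrapping in either the input or
the QUAL output.

```python
def fastq_to_qual(fastq_text, offset=33):
    """Convert four-line FASTQ text to QUAL text without building records.

    Hypothetical helper for illustration only (not a Biopython function).
    Assumes unwrapped four-line records and a PHRED ASCII offset of 33.
    """
    lines = fastq_text.splitlines()
    if len(lines) % 4 != 0:
        raise ValueError("Expected a multiple of four lines")
    output = []
    for i in range(0, len(lines), 4):
        title, seq, plus, quals = lines[i : i + 4]
        if not title.startswith("@") or not plus.startswith("+"):
            raise ValueError("Malformed FASTQ record at line %d" % (i + 1))
        if len(seq) != len(quals):
            raise ValueError("Sequence and quality lengths differ")
        # Each quality character maps directly to one integer score.
        scores = " ".join(str(ord(char) - offset) for char in quals)
        output.append(">%s\n%s\n" % (title[1:], scores))
    return "".join(output)


example = "@read1\nACGT\n+\n!+5I\n"
qual_text = fastq_to_qual(example)
```

Because the sequence letters are only length-checked and never wrapped in an
object, the per-record cost is a handful of string operations.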

Biopython can now be installed with pip. Thanks to David Koppstein and
James Casbon for reporting the problem.

Bio.SeqIO.write now uses lower case for the sequence for GenBank, EMBL and
IMGT output.

The Bio.PDB module received several fixes and improvements, including starting
to merge João's work from GSoC 2010; consequently Atom objects now know
their element type and IUPAC mass. (The new features that use these
attributes won't be included in Biopython until the next release, though, so
stay tuned.)

The nodetype hierarchy in the Bio.SCOP.Cla.Record class is now a dictionary
(previously it was a list of key,value tuples) to better match the standard.

Many thanks to the Biopython developers and community for making this release
possible, especially the following contributors:

- Brad Chapman
- Eric Talevich
- Erick Matsen (first contribution)
- Hongbo Zhu
- Jeffrey Finkelstein (first contribution)
- Joanna & Dominik Kasprzak (first contribution)
- Joao Rodrigues
- Kristian Rother
- Leighton Pritchard
- Michiel de Hoon
- Peter Cock
- Peter Thorpe (first contribution)
- Phillip Garland
- Walter Gillett (first contribution)


26 November 2010: Biopython 1.56 released.
==========================================

This is planned to be our last release to support Python 2.4, however this
could be delayed given immediate feedback from our users (e.g. if this proves
to be a problem in combination with other libraries or a popular Linux
distribution).

Bio.SeqIO can now read and index UniProt XML files (under format name
"uniprot-xml", which was agreed with EMBOSS and BioPerl for when/if they
support it too).

Bio.SeqIO can now read, write and index IMGT files. These are a variant of
the EMBL sequence text file format with longer feature indentation.

Bio.SeqIO now supports protein EMBL files (used in the EMBL patents database
file epo_prt.dat) - previously we only expected nucleotide EMBL files.

The Bio.Seq translation methods and function will now accept an arbitrary
CodonTable object (for those of you working on very unusual organisms).

The SeqFeature object now supports len(feature) giving the length consistent
with the existing extract method. Also, it now supports iteration giving the
coordinate (with respect to the parent sequence) of each letter within the
feature (in frame aware order), and "in" which allows you to check if a
(parent based) coordinate is within the feature location.

Bio.Entrez will now try to download any missing NCBI DTD files and cache them
in the user's home directory.

The provisional database schema for BioSQL support on SQLite which Biopython
has been using since Release 1.53 has now been added to BioSQL, and updated
slightly.

Bio.PopGen.FDist now supports the DFDist command line tool as well as FDist2.

Bio.Motif now has a chapter in the Tutorial.

(At least) 13 people have contributed to this release, including 6 new people:

- Andrea Pierleoni (first contribution)
- Bart de Koning (first contribution)
- Bartek Wilczynski
- Bartosz Telenczuk (first contribution)
- Cymon Cox
- Eric Talevich
- Frank Kauff
- Michiel de Hoon
- Peter Cock
- Phillip Garland (first contribution)
- Siong Kong (first contribution)
- Tiago Antao
- Uri Laserson (first contribution)


31 August 2010: Biopython 1.55 released.
========================================

See the notes below for the Biopython 1.55 beta release for changes since
Biopython 1.54 was released. Since the beta release we have marked a few
modules as obsolete or deprecated, and removed some deprecated code.
There have also been a few bug fixes, extra unit tests, and documentation
improvements.

(At least) 12 people have contributed to this release, including 6 new people:

- Andres Colubri (first contribution)
- Carlos Ríos (first contribution)
- Claude Paroz (first contribution)
- Cymon Cox
- Eric Talevich
- Frank Kauff
- Joao Rodrigues (first contribution)
- Konstantin Okonechnikov (first contribution)
- Michiel de Hoon
- Nathan Edwards (first contribution)
- Peter Cock
- Tiago Antao


18 August 2010: Biopython 1.55 beta released.
=============================================

This is a beta release for testing purposes, both for new features added,
and more importantly updates to avoid code deprecated in Python 2.7 or in
Python 3. This is an important step towards Python 3 support.

We are phasing out support for Python 2.4. We will continue to support it
for at least one further release (Biopython 1.56). This could be delayed
given feedback from our users (e.g. if this proves to be a problem in
combination with other libraries or a popular Linux distribution).

The SeqRecord object now has upper and lower methods (like the Seq object and
Python strings), which return a new SeqRecord with the sequence in upper or
lower case and a copy of all the annotation unchanged.

Several small issues with Bio.PDB have been resolved, which includes better
handling of model numbers, and files missing the element column.

Feature location parsing for GenBank and EMBL files has been rewritten,
making the parser much faster.

Ace parsing by SeqIO now uses zero rather than None for the quality score of
any gaps (insertions) in the contig sequence.

The BioSQL classes DBServer and BioSeqDatabase now act more like Python
dictionaries, making it easier to count, delete, iterate over, or check for
membership of namespaces and records.

The command line tool application wrapper classes are now executable, so you
can use them to call the tool (using the subprocess module internally) and
capture the output and any error messages as strings (stdout and stderr).
This avoids having to worry about the details of how best to use subprocess.

(At least) 10 people have contributed to this release, including 5 new people:

- Andres Colubri (first contribution)
- Carlos Ríos (first contribution)
- Claude Paroz (first contribution)
- Eric Talevich
- Frank Kauff
- Joao Rodrigues (first contribution)
- Konstantin Okonechnikov (first contribution)
- Michiel de Hoon
- Peter Cock
- Tiago Antao


May 20, 2010: Biopython 1.54 released.
======================================

See the notes below for the Biopython 1.54 beta release for changes since
Biopython 1.53 was released. Since then there have been some changes to
the new Bio.Phylo module, more documentation, and a number of smaller
bug fixes.


April 2, 2010: Biopython 1.54 beta released.
============================================

We are phasing out support for Python 2.4. We will continue to support it
for at least two further releases, and at least one year (whichever takes
longer), before dropping support for Python 2.4. This could be delayed
given feedback from our users (e.g. if this proves to be a problem in
combination with other libraries or a popular Linux distribution).

New module Bio.Phylo includes support for reading, writing and working with
phylogenetic trees from Newick, Nexus and phyloXML files.
This was work by Eric Talevich on a Google Summer of Code 2009 project, under
The National Evolutionary Synthesis Center (NESCent), mentored by Brad Chapman
and Christian Zmasek.

Bio.Entrez includes some more DTD files, in particular eLink_090910.dtd,
needed for our NCBI Entrez Utilities XML parser.

The parse, read and write functions in Bio.SeqIO and Bio.AlignIO will now
accept filenames as well as handles. This follows a general shift from
other Python libraries, and does make usage a little simpler. Also
the write functions will now accept a single SeqRecord or alignment.

Bio.SeqIO now supports writing EMBL files (DNA and RNA sequences only).

The dictionary-like objects from Bio.SeqIO.index() now support a get_raw
method for most file formats, giving you the original unparsed data from the
file as a string. This is useful for selecting a subset of records from a
file where Bio.SeqIO.write() does not support the file format (e.g. the
"swiss" format) or where you need to exactly preserve the original layout.

Based on code from Jose Blanca (author of sff_extract), Bio.SeqIO now
supports reading, indexing and writing Standard Flowgram Format (SFF)
files which are used by 454 Life Sciences (Roche) sequencers. This means
you can use SeqIO to convert from SFF to FASTQ, FASTA and QUAL (as
trimmed or untrimmed reads).

An improved multiple sequence alignment object has been introduced,
and is used by Bio.AlignIO for input. This is a little stricter than the
old class but should otherwise be backwards compatible.

(At least) 11 people contributed to this release, including 5 new people:

- Anne Pajon (first contribution)
- Brad Chapman
- Christian Zmasek
- Diana Jaunzeikare (first contribution)
- Eric Talevich
- Jose Blanca (first contribution)
- Kevin Jacobs (first contribution)
- Leighton Pritchard
- Michiel de Hoon
- Peter Cock
- Thomas Holder (first contribution)


December 15, 2009: Biopython 1.53 released.
===========================================

Biopython is now using git for source code control, currently on github. Our
old CVS repository will remain on the OBF servers in the short/medium term
as a backup, but will not be updated in future.

The Bio.Blast.Applications wrappers now cover the new NCBI BLAST C++ tools
(where blastall is replaced by blastp, blastn, etc, and the command line
switches have all been renamed). These will be replacing the old wrappers in
Bio.Blast.NCBIStandalone which are now obsolete, and will be deprecated in
our next release.

The plain text BLAST parser has been updated, and should cope with recent
versions of NCBI BLAST, including the new C++ based version. Nevertheless,
we (and the NCBI) still recommend using the XML output for parsing.

The Seq (and related UnknownSeq) objects gained upper and lower methods,
like the string methods of the same name but alphabet aware. The Seq object
also gained a new ungap method for removing gap characters in an alphabet
aware manner.

The SeqFeature object now has an extract method, used with the parent
sequence (as a string or Seq object) to get the region of that sequence
described by the feature's location information (including the strand and
any sub-features for a join). As an example, this is useful to get the
nucleotide sequence for features in GenBank or EMBL files.
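
The idea behind extracting a feature from its parent sequence can be sketched
in plain Python as below. This is an illustration only, not the SeqFeature
code: the helper extract_region and its (start, end, strand) tuples are
hypothetical, coordinates are assumed to be zero-based and half-open, and the
parts of a join are simply concatenated in the order given (how Biopython
orders reverse-strand joins is not covered in these notes).

```python
# Translation table for complementing unambiguous DNA letters.
COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(dna):
    """Return the reverse complement of an unambiguous DNA string."""
    return dna.translate(COMPLEMENT)[::-1]


def extract_region(parent, parts):
    """Join the sub-regions of parent described by parts.

    Hypothetical helper: parts is a list of (start, end, strand) tuples
    using zero-based, half-open coordinates; strand is +1 or -1.
    """
    pieces = []
    for start, end, strand in parts:
        piece = parent[start:end]
        if strand == -1:
            piece = reverse_complement(piece)
        pieces.append(piece)
    return "".join(pieces)


parent = "ATGCCGTAAGGA"
forward_join = extract_region(parent, [(0, 3, +1), (6, 9, +1)])
reverse_gene = extract_region(parent, [(3, 6, -1)])
```

Here forward_join mimics a two-part join on the forward strand, while
reverse_gene shows a single reverse-strand region being reverse complemented.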

SeqRecord objects now support addition, giving a new SeqRecord with the
combined sequence, all the SeqFeatures, and any common annotation.

Bio.Entrez includes the new (Jan 2010) DTD files from the NCBI for parsing
MedLine/PubMed data.

The NCBI codon tables have been updated from version 3.4 to 3.9, which adds
a few extra start codons, and a few new tables (Tables 16, 21, 22 and 23).
Note that Table 14 which used to be called "Flatworm Mitochondrial" is now
called "Alternative Flatworm Mitochondrial", and "Flatworm Mitochondrial" is
now an alias for Table 9 ("Echinoderm Mitochondrial").

The restriction enzyme list in Bio.Restriction has been updated to the
Nov 2009 release of REBASE.

The Bio.PDB parser and output code has been updated to understand the
element column in ATOM and HETATM lines (based on patches contributed by
Hongbo Zhu and Frederik Gwinner). Bio.PDB.PDBList has also been updated
for recent changes to the PDB FTP site (Paul T. Bathen).

SQLite support was added for BioSQL databases (Brad Chapman), allowing access
to BioSQL through a lightweight embedded SQL engine. Python 2.5+ includes
support for SQLite built in, but on Python 2.4 the optional sqlite3 library
must be installed to use this. We currently use a draft BioSQL on SQLite
schema, which will be merged with the main BioSQL release for use in other
projects.

Support for running Biopython under Jython (using the Java Virtual Machine)
has been much improved thanks to input from Kyle Ellrott. Note that Jython
does not support C code - this means NumPy isn't available, and nor are a
selection of Biopython modules (including Bio.Cluster, Bio.PDB and BioSQL).
Also, currently Jython does not parse DTD files, which means the XML parser
in Bio.Entrez won't work. However, most of the Biopython modules seem fine
from testing Jython 2.5.0 and 2.5.1.

(At least) 12 people contributed to this release, including 3 first timers:

- Bartek Wilczynski
- Brad Chapman
- Chris Lasher
- Cymon Cox
- Frank Kauff
- Frederik Gwinner (first contribution)
- Hongbo Zhu (first contribution)
- Kyle Ellrott
- Leighton Pritchard
- Michiel de Hoon
- Paul Bathen (first contribution)
- Peter Cock


September 22, 2009: Biopython 1.52 released.
============================================

The Population Genetics module now allows the calculation of several tests,
and statistical estimators via a wrapper to GenePop. Supported are tests for
Hardy-Weinberg equilibrium, linkage disequilibrium and estimates for various
F statistics (Cockerham and Weir Fst and Fis, Robertson and Hill Fis, etc),
null allele frequencies and number of migrants among many others. Isolation
By Distance (IBD) functionality is also supported.

New helper functions Bio.SeqIO.convert() and Bio.AlignIO.convert() allow an
easier way to use Biopython for simple file format conversions. Additionally,
these new functions allow Biopython to offer important file format specific
optimisations (e.g. FASTQ to FASTA, and interconverting FASTQ variants).

New function Bio.SeqIO.index() allows indexing of most sequence file formats
(but not alignment file formats), allowing dictionary like random access to
all the entries in the file as SeqRecord objects, keyed on the record id.
This is especially useful for very large sequencing files, where all the
records cannot be held in memory at once. This supplements the more flexible
but memory demanding Bio.SeqIO.to_dict() function.

Bio.SeqIO can now write "phd" format files (used by PHRED, PHRAP and CONSED),
allowing interconversion with FASTQ files, or FASTA+QUAL files.

Bio.Emboss.Applications now includes wrappers for the "new" PHYLIP EMBASSY
package (e.g.
fneighbor) which replace the "old" PHYLIP EMBASSY package (e.g. eneighbor)
whose Biopython wrappers are now obsolete.

See also the DEPRECATED file, as several old deprecated modules have finally
been removed (e.g. Bio.EUtils which had been replaced by Bio.Entrez).

On a technical note, this will be the last release using CVS for source code
control. Biopython is moving from CVS to git.


August 17, 2009: Biopython 1.51 released.
=========================================

FASTQ support in Bio.SeqIO has been improved, extended and sped up since
Biopython 1.50. Support for Illumina 1.3+ style FASTQ files was added in the
1.51 beta release. Furthermore, we now follow the interpretation agreed on
the OBF mailing lists with EMBOSS, BioPerl, BioJava and BioRuby for
interconversion and the valid score range for each FASTQ variant. This means
Solexa FASTQ scores can be from -5 to 62 (format name "fastq-solexa" in
Bio.SeqIO), Illumina 1.3+ FASTQ files have PHRED scores from 0 to 62 (format
name "fastq-illumina"), and Sanger FASTQ files have PHRED scores from 0 to
93 (format name "fastq" or "fastq-sanger").

Bio.Sequencing.Phd has been updated, for example to cope with missing peak
positions. The "phd" support in Bio.SeqIO has also been updated to record
the PHRED qualities (and peak positions) in the SeqRecord's per-letter
annotation. This allows conversion of PHD files into FASTQ or QUAL which may
be useful for meta-assembly.

See the notes below for the Biopython 1.50 beta release for changes since
Biopython 1.49 was released. This includes dropping support for Python 2.3,
removing our deprecated parsing infrastructure (Martel and Bio.Mindy), and
hence removing any dependence on mxTextTools.
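
The score ranges for the three FASTQ variants listed in these 1.51 notes,
combined with the ASCII offsets given in the 1.51 beta notes below, can be
sketched in plain Python as follows. This is an illustration rather than the
Bio.SeqIO code: the names FASTQ_VARIANTS, decode_scores and solexa_to_phred
are hypothetical, and the Solexa to PHRED formula is the standard published
relation between the two scales, which is not spelled out in these notes.

```python
import math

# (ASCII offset, minimum score, maximum score) per Bio.SeqIO format name,
# as described in the release notes.
FASTQ_VARIANTS = {
    "fastq-sanger": (33, 0, 93),
    "fastq-solexa": (64, -5, 62),
    "fastq-illumina": (64, 0, 62),
}


def decode_scores(quality_string, variant):
    """Decode a FASTQ quality string and check the variant's score range."""
    offset, lowest, highest = FASTQ_VARIANTS[variant]
    scores = [ord(char) - offset for char in quality_string]
    for score in scores:
        if not lowest <= score <= highest:
            raise ValueError("Score %d out of range for %s" % (score, variant))
    return scores


def solexa_to_phred(solexa_score):
    """Convert a Solexa score to the PHRED scale (returns a float)."""
    return 10 * math.log10(10 ** (solexa_score / 10.0) + 1)


sanger_scores = decode_scores("!I~", "fastq-sanger")
solexa_scores = decode_scores(";h", "fastq-solexa")
```

The two scales agree closely for high quality scores and differ most near
zero, which is why interconversion between the variants needs care.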

Additionally, since the beta, a number of small bugs have been fixed, and
there have been further additions to the test suite and documentation.


June 23, 2009: Biopython 1.51 beta released.
============================================

Biopython no longer supports Python 2.3. Currently we support Python 2.4,
2.5 and 2.6.

Our deprecated parsing infrastructure (Martel and Bio.Mindy) has been
removed. This means Biopython no longer has any dependence on mxTextTools.

A few cosmetic issues in GenomeDiagram with arrow sigils and labels on
circular diagrams have been fixed.

Bio.SeqIO will now write GenBank files with the feature table (previously
omitted), and a couple of obscure errors parsing ambiguous locations have
been fixed.

Bio.SeqIO can now read and write Illumina 1.3+ style FASTQ files (which use
PHRED quality scores with an ASCII offset of 64) under the format name
"fastq-illumina". Biopython 1.50 supported just "fastq" (the original Sanger
style FASTQ files using PHRED scores with an ASCII offset of 33), and
"fastq-solexa" (the original Solexa/Illumina FASTQ format variant holding
Solexa scores with an ASCII offset of 64).

For parsing the "swiss" format, Bio.SeqIO now uses the new Bio.SwissProt
parser, making it about twice as fast as in Biopython 1.50, where the older
now deprecated Bio.SwissProt.SProt was used. There should be no functional
differences as a result of this change.

Our command line wrapper objects have been updated to support accessing
parameters via python properties, and setting of parameters at initiation
with keyword arguments. Additionally Cymon Cox has contributed several new
multiple alignment wrappers under Bio.Align.Applications.

A few more issues with Biopython's BioSQL support have been fixed (mostly by
Cymon Cox).
In particular, the default PostgreSQL schema includes some rules intended for
BioPerl support only, which were causing problems in Biopython (see BioSQL bug
2839).

There have also been additions to the tutorial, such as the new alignment
wrappers, with a whole chapter for the SeqRecord object. We have also added
to the unit test coverage.


April 20, 2009: Biopython 1.50 released.
========================================

See the notes below for the Biopython 1.50 beta release for more details,
but the highlights are:

* The SeqRecord supports slicing and per-letter-annotation
* Bio.SeqIO can read and write FASTQ and QUAL files
* Bio.Seq now has an UnknownSeq object
* GenomeDiagram has been integrated into Biopython
* New module Bio.Motif will later replace Bio.AlignAce and Bio.MEME
* This will be the final release to support Python 2.3
* This will be the final release with Martel and Bio.Mindy

Since the 1.50 beta release:

* The NCBI's Entrez EFetch no longer supports rettype="genbank"
  and "gb" (or "gp") should be used instead.
* Bio.SeqIO now supports "gb" as an alias for "genbank".
* The Seq object now has string-like startswith and endswith methods
* Bio.Blast.NCBIXML now has a read function for single record files
* A few more unit tests were added
* More documentation


April 3, 2009: Biopython 1.50 beta released.
============================================

The SeqRecord object has a new dictionary attribute, letter_annotations,
which is for holding per-letter-annotation information like sequence
quality scores or secondary structure predictions. As part of this work,
the SeqRecord object can now be sliced to give a new SeqRecord covering
just part of the sequence. This will slice the per-letter-annotation to
match, and will also include any SeqFeature objects as appropriate.
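
The way slicing keeps per-letter annotation in step with the sequence can be
sketched with a small stand-in class. MiniRecord below is hypothetical and
far simpler than SeqRecord: it only models the sequence and the
letter_annotations dictionary, and does not attempt the SeqFeature handling
mentioned above.

```python
class MiniRecord:
    """Toy stand-in for a record with per-letter annotation (illustration)."""

    def __init__(self, seq, letter_annotations=None):
        self.seq = seq
        self.letter_annotations = dict(letter_annotations or {})
        # Every annotation must have exactly one entry per sequence letter.
        for key, values in self.letter_annotations.items():
            if len(values) != len(seq):
                raise ValueError("Annotation %r has the wrong length" % key)

    def __len__(self):
        return len(self.seq)

    def __getitem__(self, index):
        if not isinstance(index, slice):
            # A single position just returns that letter.
            return self.seq[index]
        # Apply the same slice to the sequence and to each annotation.
        sliced = {
            key: values[index]
            for key, values in self.letter_annotations.items()
        }
        return MiniRecord(self.seq[index], sliced)


record = MiniRecord(
    "ACGTACGT", {"phred_quality": [10, 20, 30, 40, 40, 30, 20, 10]}
)
trimmed = record[2:6]
```

Applying one slice object to both the sequence and each annotation list is
what keeps them the same length after trimming.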

Bio.SeqIO can now read and write FASTQ and QUAL quality files using PHRED
quality scores (Sanger style, also used for Roche 454 sequencing), and FASTQ
files using Solexa/Illumina quality scores.

The Bio.Seq module now has an UnknownSeq object, used for when we have a
sequence of known length, but unknown content. This is used in parsing
GenBank and EMBL files where the sequence may not be present (e.g. for a
contig record) and when parsing QUAL files (which don't have the sequence).

GenomeDiagram by Leighton Pritchard has been integrated into Biopython as
the Bio.Graphics.GenomeDiagram module. If you use this code, please cite the
publication Pritchard et al. (2006), Bioinformatics 22 616-617. Note that
like Bio.Graphics, this requires the ReportLab python library.

A new module Bio.Motif has been added, which is intended to replace the
existing Bio.AlignAce and Bio.MEME modules.

The set of NCBI DTD files included with Bio.Entrez has been updated with the
revised files the NCBI introduced on 1 Jan 2009.

Minor fix to BioSQL for retrieving references and comments.

Bio.SwissProt has a new faster parser which will be replacing the older
slower code in Bio.SwissProt.SProt (which we expect to deprecate in the next
release).

We've also made some changes to our test framework, which is now given a
whole chapter in the tutorial. This is intended to help new developers or
contributors wanting to improve our unit test coverage.


November 21, 2008: Biopython 1.49 released.
===========================================

See the notes below for the Biopython 1.49 beta release for more details,
but the highlights are:

* Biopython has transitioned from Numeric to NumPy
* Martel and Bio.Mindy are now deprecated

Since the 1.49 beta release:

* A couple of NumPy issues have been resolved
* Further small improvements to BioSQL
* Bio.PopGen.SimCoal should now work on Windows
* A few more unit tests were added


November 7, 2008: Biopython 1.49 beta released.
===============================================

Biopython has transitioned from Numeric to NumPy. Please move to NumPy.

A number of small changes have been made to support Python 2.6 (mostly
avoiding deprecated functionality), and further small changes have been
made for better compatibility with Python 3 (this work is still ongoing).
However, we intend to support Python 2.3 for only a couple more releases.

As part of the Numeric to NumPy migration, Bio.KDTree has been rewritten in
C instead of C++ which therefore simplifies building Biopython from source.

Martel and Bio.Mindy are now considered to be deprecated, meaning mxTextTools
is no longer required to use Biopython. See the DEPRECATED file for details
of other deprecations.

The Seq object now supports more string-like methods (gaining find, rfind,
split, rsplit, strip, lstrip and rstrip in addition to previously supported
methods like count). Also, biological methods transcribe, back_transcribe
and translate have been added, joining the pre-existing reverse_complement
and complement methods. Together these changes allow a more object-oriented
programming style using the Seq object.
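
The method-based style mentioned above amounts to chaining calls on the
sequence object rather than passing it through module-level functions. The
following is a minimal sketch only: ``MiniSeq`` is a hypothetical toy class
built on ``str`` (it is not Bio.Seq.Seq), and it handles unambiguous DNA
letters only:

```python
class MiniSeq(str):
    """Hypothetical toy sequence class (illustration only, not Bio.Seq.Seq)."""

    # Translation table for unambiguous DNA letters only.
    _COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")

    def complement(self):
        # str.translate applies the per-character mapping table.
        return MiniSeq(str.translate(self, self._COMPLEMENT))

    def reverse_complement(self):
        return MiniSeq(self.complement()[::-1])

    def transcribe(self):
        # DNA to RNA: replace thymine with uracil.
        return MiniSeq(self.replace("T", "U").replace("t", "u"))

    def back_transcribe(self):
        # RNA to DNA: replace uracil with thymine.
        return MiniSeq(self.replace("U", "T").replace("u", "t"))


dna = MiniSeq("ATGGCC")
position = dna.find("GG")  # string-like method inherited from str
rna = dna.reverse_complement().transcribe()  # chained biological methods
print(position, rna)
```

Subclassing ``str`` is a shortcut taken for this sketch so that find, strip
and similar methods come for free; it says nothing about how the Seq object
itself is implemented.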

The behaviour of the Bio.Seq module's translate function has changed so that
ambiguous codons which could be a stop codon like "TAN" or "NNN" are now
translated as "X" (consistent with EMBOSS and BioPerl - Biopython previously
raised an exception), and a bug was fixed so that invalid codons (like "A-T")
now raise an exception (previously these were translated as stop codons).

BioSQL had a few bugs fixed, and can now optionally fetch the NCBI taxonomy
on demand when loading sequences (via Bio.Entrez) allowing you to populate
the taxon/taxon_name tables gradually. This has been tested in combination
with the BioSQL load_ncbi_taxonomy.pl script used to populate or update the
taxon/taxon_name tables. BioSQL should also now work with the psycopg2
driver for PostgreSQL as well as the older psycopg driver.

The PDB and PopGen sections of the Tutorial have been promoted to full
chapters, and a new chapter has been added on supervised learning methods
like logistic regression. The "Cookbook" section now has a few graphical
examples using Biopython to calculate sequence properties, and matplotlib
(pylab) to plot them.

The input functions in Bio.SeqIO and Bio.AlignIO now accept an optional
argument to specify the expected sequence alphabet.

The somewhat quirky unit test GUI has been removed; the unit tests are now
run via the command line by default.


September 8, 2008: Biopython 1.48 released.
===========================================

The SeqRecord and Alignment objects have a new method to format the object as
a string in a requested file format (handled via Bio.SeqIO and Bio.AlignIO).

Additional file formats supported in Bio.SeqIO and Bio.AlignIO:

- reading and writing "tab" format (simple tab separated)
- writing "nexus" files.
- reading "pir" files (NBRF/PIR)
- basic support for writing "genbank" files (GenBank plain text)

Fixed some problems reading Clustal alignments (introduced in Biopython 1.46
when consolidating Bio.AlignIO and Bio.Clustalw).

Updates to the Bio.Sequencing parsers.

Bio.PubMed and the online code in Bio.GenBank are now considered obsolete,
and we intend to deprecate them after the next release. For accessing PubMed
and GenBank, please use Bio.Entrez instead.

Bio.Fasta is now considered to be obsolete; please use Bio.SeqIO instead. We
do intend to deprecate this module eventually. However, for several years
this was the primary FASTA parsing module in Biopython and it is likely to
be in use in many existing scripts.

Martel and Bio.Mindy are now considered to be obsolete, and are likely to be
deprecated and removed in a future release.

In addition a number of other modules have been deprecated, including:
Bio.MetaTool, Bio.EUtils, Bio.Saf, Bio.NBRF, and Bio.IntelliGenetics.
See the DEPRECATED file for full details.


July 5, 2008: Biopython 1.47 released.
======================================

Improved handling of ambiguous nucleotides in Bio.Seq.Translate().
Better handling of stop codons in the alphabet from a translation.
Fixed some codon tables (problem introduced in Biopython 1.46).

Updated Nexus file handling.

Fixed a bug in Bio.Cluster potentially causing segfaults in the
single-linkage hierarchical clustering library.

Added some DTDs to be able to parse EFetch results from the
nucleotide database.

Added IntelliGenetics/MASE parsing to Bio.SeqIO (as the "ig" format).


June 29, 2008: Biopython 1.46 released.
=======================================

Bio.Entrez now has several Entrez format XML parsers, and a chapter
in the tutorial.

Addition of new Bio.AlignIO module for working with sequence alignments
in the style introduced with Bio.SeqIO in recent releases, with a whole
chapter in the tutorial.

A problem parsing certain EMBL files was fixed.

Several minor fixes were made to the NCBI BLAST XML parser, including
support for the online version 2.2.18+ introduced in May 2008.

The NCBIWWW.qblast() function now allows other programs (blastx, tblastn,
tblastx) in addition to just blastn and blastp.

Bio.EUtils has been updated to explicitly enforce the NCBI's rule of at
most one query every 3 seconds, rather than assuming the user would obey
this.

Iterators in Bio.Medline, Bio.SCOP, Bio.Prosite, Bio.Prosite.Prodoc,
Bio.SwissProt, and others to make them more generally usable.

Phylip export added to Bio.Nexus.

Improved handling of ambiguous nucleotides and stop codons in
Bio.Seq.Translate (plus introduced a regression fixed in Biopython 1.47).


March 22, 2008: Biopython 1.45 released.
========================================

The Seq and MutableSeq objects act more like Python strings, in particular
str(object) now returns the full sequence as a plain string. The existing
tostring() method is preserved for backwards compatibility.

BioSQL has had some bugs fixed, and has an additional unit test which loads
records into a database using Bio.SeqIO and then checks the records can be
retrieved correctly. The DBSeq and DBSeqRecord classes now subclass the
Seq and SeqRecord classes, which provides more functionality.

The modules under Bio.WWW are being deprecated.
Functionality in Bio.WWW.NCBI, Bio.WWW.SCOP, Bio.WWW.InterPro and
Bio.WWW.ExPASy is now available from Bio.Entrez, Bio.SCOP, Bio.InterPro and
Bio.ExPASy instead. Bio.Entrez was used to fix a nasty bug in Bio.GenBank.

Tiago Antao has included more functionality in the Population Genetics
module, Bio.PopGen.

The Bio.Cluster module has been updated to be more consistent with other
Biopython code.

The tutorial has been updated, including devoting a whole chapter to
Swiss-Prot, Prosite, Prodoc, and ExPASy. There is also a new chapter on
Bio.Entrez.

Bio.biblio was deprecated.


October 28, 2007: Biopython 1.44 released.
==========================================

NOTE: This release includes some rather drastic code changes, which were
necessary to get Biopython to work with the new release of mxTextTools.

The (reverse)complement functions in Bio.Seq support ambiguous nucleotides.

Bio.Kabat, which was previously deprecated, is now removed from Biopython.

Bio.MarkupEditor was deprecated, as it does not appear to have any users.

Bio.Blast.NCBIWWW.qblast() updated with more URL options, thanks to a patch
from Chang Soon Ong.

Several fixes to the Blast parser.

The deprecated Bio.Blast.NCBIWWW functions blast and blasturl were removed.

The standalone Blast functions blastall, blastpgp now create XML output by
default.

Bio.SeqIO.FASTA and Bio.SeqIO.generic have been deprecated in favour of
the new Bio.SeqIO module.

Bio.FormatIO has been removed (a gradual deprecation was not possible).
Please look at Bio.SeqIO for sequence input/output instead.

Fix for a bug in Bio.Cluster, which caused kcluster() to hang on some
platforms.

Bio.expressions has been deprecated.

Bio.SeqUtils.CheckSum created, including new methods from Sebastian Bassi,
and functions crc32 and crc64 which were moved from Bio/crc.py.
Bio.crc is now deprecated. Bio.lcc was updated and moved to Bio.SeqUtils.lcc.

Bio.SwissProt parser updated to cope with recent file format updates.

Bio.Fasta, Bio.KEGG and Bio.Geo updated to pure Python parsers which
don't rely on Martel.

Numerous fixes in the GenBank parser.

Several fixes in Bio.Nexus.

Bio.MultiProc and Bio.Medline.NLMMedlineXML were deprecated, as they failed
on some platforms, and seemed to have no users. Deprecated concurrent
behavior in Bio.config.DBRegistry and timeouts in Bio.dbdefs.swissprot,
which rely on Bio.MultiProc.

Tiago Antao has started work on a Population Genetics module, Bio.PopGen.

Updates to the tutorial, including giving Bio.Seq and Bio.SeqIO a whole
chapter each.


March 17, 2007: Biopython 1.43 released.
========================================

New Bio.SeqIO module for reading and writing biological sequence files
in various formats, based on SeqRecord objects. This includes a new fasta
parser which is much faster than Bio.Fasta, particularly for larger files.
Easier to use, too.

Various improvements in Bio.SeqRecord.

Running Blast using Bio.Blast.NCBIStandalone now generates output in XML
format by default.
The new function Bio.Blast.NCBIXML.parse can parse multiple Blast records
in XML format.

Bio.Cluster no longer uses ranlib, but uses its own random number generator
instead. Some modifications to make Bio.Cluster more compatible with the new
NumPy (we're not quite there yet though).

New Bio.UniGene parser.

Numerous improvements in Bio.PDB.

Bug fixes in Bio.SwissProt, BioSQL, Bio.Nexus, and other modules.

Faster parsing of large GenBank files.

New EMBL parser under Bio.GenBank and also integrated into (new) Bio.SeqIO.

Compilation of KDTree (C++ code) is optional (setup.py asks the user if it
should be compiled). For the Windows installer, C++ code is now included.

Nominating Bio.Kabat for removal.

Believe it or not, even the documentation was updated.


July 16, 2006: Biopython 1.42 released.
=======================================

Bio.GenBank: New parser by Peter, which doesn't rely on Martel.

Numerous updates in Bio.Nexus and Bio.Geo.

Bio.Cluster became (somewhat) object-oriented.

Lots of bug fixes, and updates to the documentation.


October 28, 2005: Biopython 1.41 released.
==========================================

Major changes:

NEW: Bio.MEME -- thanks to Jason Hackney

Added transcribe, translate, and reverse_complement functions to Bio.Seq that
work both on Seq objects and plain strings.

Major code optimization in cpairwise2module.

CompareACE support added to AlignAce.

Updates to Blast parsers in Bio.Blast, in particular use of the XML parser
in NCBIXML contributed by Bertrand Frottier, and the BLAT parser by Yair
Benita.

Pairwise single-linkage hierarchical clustering in Bio.Cluster became much
faster and memory-efficient, allowing clustering of large data sets.

Bio.Emboss: Added command lines for einverted and palindrome.

Bio.Nexus: Added support for StringIO objects.

Numerous updates in Bio.PDB.

Lots of fixes in the documentation.

March 29, 2005: MEME parser added. Thanks to Jason Hackney.


Feb 18, 2005: Biopython 1.40 beta
=================================
Major Changes since v1.30. For a full list of changes please see the CVS.

IMPORTANT: Biopython now works with Python version >= 2.3

NEW: Bio.Nexus -- thanks to Frank Kauff
Bio.Nexus is a Nexus file parser. Nexus is a common format for phylogenetic
trees.

NEW: CAPS module -- Thanks to Jonathan Taylor.

NEW: Restriction enzyme package contributed by Frederic Sohm.
This includes
classes for manipulating enzymes, updating from Rebase, as well as
documentation and tests.

CHANGED: Bio.PDB -- thanks to Thomas Hamelryck.

- Added atom serial number.
- Epydoc style documentation.
- Added secondary structure support (through DSSP).
- Added Accessible Surface Area support (through DSSP).
- Added Residue Depth support (through MSMS).
- Added Half Sphere Exposure.
- Added Fragment classification of the protein backbone (see Kolodny et al.,
  JMB, 2002).
- Corrected problem on Windows with PDBList (thanks to Matt Dimmic)
- Added StructureAlignment module to superimpose structures based on a FASTA
  sequence alignment.
- Various additions to Polypeptide.
- Various bug corrections in Vector.
- Lots of smaller bug corrections and additional features

CHANGED: MutableSeq -- thanks to Michiel De Hoon
Added the functions 'complement' and 'reverse_complement' to Bio.Seq's Seq and
MutableSeq objects. Similar functions previously existed in various locations
in Biopython:

- forward_complement, reverse_complement in Bio.GFF.easy
- complement, antiparallel in Bio.SeqUtils

These functions have now been deprecated, and will issue a DeprecationWarning
when used. The functions complement and reverse_complement, when applied to a
Seq object, will return a new Seq object. The same functions applied to a
MutableSeq object will modify the MutableSeq object itself, and do not return
anything.


May 14, 2004: Biopython 1.30
============================

- Affy package added for dealing with Affymetrix cel files -- thanks to Harry
  Zuzan.
- Added code for parsing Blast XML output -- thanks to Bertrand Frottier.
- Added code for parsing Compass output -- thanks to James Casbon.
- New melting temperature calculation module -- thanks to Sebastian Bassi.
- Added lowess function for non-parametric regression -- thanks to Michiel.
- Reduced protein alphabet support added -- thanks to Iddo.

- Added documentation for Logistic Regression and Bio.PDB -- thanks to Michiel
  and Thomas.
- Documentation added for converting between file formats.
- Updates to install documentation for non-root users -- thanks to Jakob
  Fredslund.
- epydoc now used for automatic generation of documentation.

- Fasta parser updated to use Martel for parsing and indexing, allowing better
  speed and dealing with large data files.
- Updates to Registry code. Now 'from Bio import db' gives you a number of new
  retrieval options, including embl, fasta, genbank, interpro, prodoc and
  swissprot.
- GenBank parser uses new Martel format. GenBank retrieval now uses EUtils
  instead of the old non-working entrez scripts. GenBank indexing uses standard
  Mindy indexing. Fix for valueless qualifiers in feature keys -- thanks to
  Leighton Pritchard.
- Numerous updates to Bio.PDB modules -- thanks to Thomas. PDB can now parse
  headers -- thanks to Kristian Rother.
- Updates to the Ace parser -- thanks to Frank Kauff and Leighton Pritchard.

- Added pgdb (PyGreSQL) support to BioSQL -- thanks to Marc Colosimo.
- Fix problems with using py2exe and Biopython -- thanks to Michael Cariaso.
- PSIBlast parser fixes -- thanks to Jer-Yee John Chuang and James Casbon.
- Fix to NCBIWWW retrieval so that HTML results are returned correctly.
- Fix to Clustalw to handle question marks in title names -- thanks to Ashleigh
  Smythe.
- Fix to NBRF parsing so it accepts files produced by Clustalw -- thanks to
  Ashleigh Smythe.
- Fixes to the Enzyme module -- thanks to Marc Colosimo.
- Fix for bugs in SeqUtils -- thanks to Frank Kauff.
- Fix for optional hsps in ncbiblast Martel format -- thanks to Heiko.
- Fix to Fasta parsing to allow # comment lines -- thanks to Karl Diedrich.
- Updates to the C clustering library -- thanks to Michiel.
- Fixes for breakage in the SCOP module and addition of regression tests to
  framework -- thanks to Gavin.
- Various fixes to Bio.Wise -- thanks to Michael.
- Fix for bug in FastaReader -- thanks to Michael.
- Fix EUtils bug where efetch would only return 500 sequences.
- Updates for Emboss commandlines, water and tranalign.
- Fixes to the FormatIO system of file conversion.

- C++ code (KDTree, Affy) now compiled by default on most platforms -- thanks
  to Michael for some nice distutils hacks and many people for testing.
- Deprecated Bio.sequtils -- use Bio.SeqUtils instead.
- Deprecated Bio.SVM -- use libsvm instead.
- Deprecated Bio.kMeans and Bio.xkMeans -- use Bio.Cluster instead.
- Deprecated RecordFile -- doesn't appear to be finished code.


Feb 16, 2004: Biopython 1.24
============================

- New parsers for Phred and Ace format files -- thanks to Frank Kauff
- New Code for dealing with NMR data -- thanks to Bob Bussell
- New SeqUtils modules for codon usage, isoelectric points and other
  protein properties -- thanks to Yair Benita
- New code for dealing with Wise contributed by Michael
- EZ-Retrieve sequence retrieval now supported thanks to Jeff
- Bio.Cluster updated along with documentation by Michiel
- BioSQL fixed so it now works with the current SQL schema -- thanks to Yves
  Bastide for patches
- Patches to Bio/__init__ to make it compatible with py2exe -- thanks to
  Leighton Pritchard
- Added __iter__ to all Biopython Iterators to make them Python 2.2 compatible
- Fixes to NCBIWWW for retrieving from NCBI -- thanks to Chris Wroe
- Retrieval of multiple alignment objects from BLAST records -- thanks to
  James Casbon
- Fixes to GenBank format for new tags by Peter
- Parsing fixes in clustalw parser -- thanks to Greg Singer and Iddo
- Fasta Indexes can have a specified filename -- thanks to Chunlei Wu
- Fix to Prosite parser -- thanks to Mike Liang
- Fix in GenBank parsing -- mRNAs now get strand information


Oct 18, 2003: Biopython 1.23
============================

- Fixed distribution of files in Bio/Cluster
- Now distributing Bio/KDTree/_KDTree.swig.C
- minor updates in installation code
- added mmCIF support for PDB files


Oct 9, 2003: Biopython 1.22
===========================

- Added Peter Slicker's patches for speeding up modules under Python 2.3
- Fixed Martel installation.
- Does not install Bio.Cluster without Numeric.
- Distribute EUtils DTDs.
- Yves Bastide patched NCBIStandalone.Iterator to be Python 2.0 iterator
- Ashleigh's string coercion fixes in Clustalw.
- Yair Benita added precision to the protein molecular weights.
- Bartek updated AlignAce.Parser and added Motif.sim method
- bug fixes in Michiel De Hoon's clustering library
- Iddo's bug fixes to Bio.Enzyme and new RecordConsumer
- Guido Draheim added patches for fixing import path to xbb scripts
- regression tests updated to be Python 2.3 compatible
- GenBank.NCBIDictionary is smarter about guessing the format


Jul 28, 2003: Biopython 1.21
============================

- Martel added back into the released package
- new AlignACE module by Bartek Wilczynski
- Andreas Kuntzagk fix for GenBank Iterator on empty files


Jul 27, 2003: Biopython 1.20
============================

- added Andrew Dalke's EUtils library
- added Michiel de Hoon's gene expression analysis package
- updates to setup code, now smarter about dependencies
- updates to test suite, now smarter about code that is imported
- Michael Hoffman's fixes to DocSQL
- syntax fixes in triemodule.c to compile on SGI, Python 2.1 compatible
- updates in NCBIStandalone, short query error
- Sebastian Bassi submitted code to calculate LCC complexity
- Greg Kettler's NCBIStandalone fix for long query lengths
- slew of miscellaneous fixes from George Paci
- miscellaneous cleanups and updates from Andreas Kuntzagk
- Peter Bienstman's fixes to GenBank code -- now parses whole database
- Kayte Lindner's LocusLink package
- miscellaneous speedups and code cleanup in ParserSupport by Brad Chapman
- miscellaneous BLAST fixes and updates
- Iddo added new code to parse BLAST table output format
- Karl Diedrich's patch to read T_Coffee files
- Larry Heisler's fix for primer3 output
- Bio.Medline now uses proper iterator objects
- copen now handles SIGTERM correctly
- small bugfixes and updates in Thomas Hamelryck's PDB package
- bugfixes and updates to SeqIO.FASTA reader
- updates to Registry system, conforms to 2003 hackathon OBDA spec
- Yu Huang patch to support tblastn in wublast expression


Dec 17, 2002: Biopython 1.10
============================

- Python requirement bumped up to 2.2
- hierarchy reorg, many things moved upwards into Bio namespace
- pairwise2 replaces fastpairwise and pairwise
- removed deprecated Sequence.py package
- minor bug fix in File.SGMLStripper
- added Scripts/debug/debug_blast_parser.py to diagnose blast parsing errors
- IPI supported by SwissProt/SProt.py parser
- large speedup for kmeans
- new registry framework for generic access to databases and parsers
- small bug fix in stringfns.split
- scripts that access NCBI moved over to new EUtils system
- new crc module
- biblio.py supports the EBI Bibliographic database
- new CDD parser
- new Ndb parser
- new ECell parser
- new Geo parser
- access to GFF databases
- new KDTree data structure
- new LocusLink parser
- new MarkovModel algorithm
- new Saf parser
- miscellaneous sequence handling functions in sequtils
- new SVDSuperimpose algorithm


Dec 18, 2001: Biopython 1.00a4
==============================

- minor bug fix in NCBIStandalone.blastall
- optimization in dynamic programming code
- new modules for logistic regression and maximum entropy
- minor bug fix in ParserSupport
- minor bug fixes in SCOP package
- minor updates in the kMeans cluster selection code
- minor bug fixes in SubsMat code
- support for XML-formatted MEDLINE files
- added MultiProc.run to simplify splitting code across processors
- listfns.items now supports lists with unhashable items
- new data type for pathways
- new support for intelligenetics format
- new support for metatool format
- new support for NBRF format
- new support for generalized launching of applications
- new support for genetic algorithms
- minor bug fixes in GenBank parsing
- new support for Primer in the Emboss package
- new support for chromosome graphics
- new support for HMMs
- new support for NeuralNetwork
- slew of Martel fixes (see Martel docs)


Sept 3, 2001: Biopython 1.00a3
==============================

- added package to support KEGG
- added sequtils module for computations on sequences
- added pairwise sequence alignment algorithm
- major bug fixes in UndoHandle
- format updates in PubMed
- Tk interface to kMeans clustering


July 5, 2001: Biopython 1.00a2
==============================

- deprecated old regression testing frameworks
- deprecated Sequence.py
- Swiss-Prot parser bug fixes
- GenBank parser bug fixes
- Can now output GenBank format
- can now download many sequences at a time from GenBank
- kMeans clustering algorithm
- Kabat format now supported
- FSSP format now supported
- more functionality for alignment code
- SubsMat bug fixes and updates
- fixed memory leak in listfns bug fixes
- Martel bundled and part of the install procedure
- Medline.Parser bug fixes
- PubMed.download_many handles broken IDs better


Mar 3, 2001: Biopython 1.00a1
=============================

- Refactoring of modules. X/X.py moved to X/__init__.py.
- Can search sequences for Prosite patterns at ExPASy
- Can do BLAST searches against stable URL at NCBI
- Prosite Pattern bug fixes
- GenBank parser
- Complete Seq and SeqFeatures framework
- distutils cleanup
- compile warning cleanups
- support for UniGene
- code for working with substitution matrices
- Tools.MultiProc package for rudimentary multiprocessing stuff


Nov 10, 2000: Biopython 0.90d04
===============================

- Added support for multiple alignments, ClustalW
- BLAST updates, bug fixes, and BlastErrorParser
- Fixes for PSI-BLAST in master-slave mode
- Minor update in stringfns, split separators can be negated
- Added download_many function to PubMed
- xbbtools updates
- Prodoc parser now accepts a copyright at the end of a record
- Swiss-Prot parser now handles taxonomy ID tag


Sept 6, 2000: Biopython 0.90d03
===============================

- Blast updates:

  - bug fixes in NCBIStandalone, NCBIWWW
  - some __str__ methods in Record.py implemented (incomplete)

- Tests:

  - new BLAST regression tests
  - prosite tests fixed

- New parsers for Rebase, Gobase
- pure python implementation of C-based tools
- Thomas Sicheritz-Ponten's xbbtools
- can now generate documentation from docstrings using HappyDoc


Aug 17-18, 2000: Bioinformatics Open Source Conference 2000
===========================================================

We had a very good Birds-of-a-Feather meeting:
http://mailman.open-bio.org/pipermail/biopython/2000-August/000360.html


Aug 2, 2000: Biopython 0.90d02 is released.
===========================================

- Blast updates:

  - now works with v2.0.14
  - HSP.identities and HSP.positives now tuples
  - HSP.gaps added

- SCOP updates:

  - Lin.Iterator now works with release 50

- Starting a tutorial
- New regression tests for Prodoc


July 6, 2000: Biopython 0.90d01 is released.
============================================


February 8, 2000: Anonymous CVS made available.
===============================================


August 1999: Biopython project founded.
=======================================

Call for Participation sent out to relevant mailing lists, news
groups.

The Biopython Project (https://www.biopython.org/) is a new open
collaborative effort to develop freely available Python libraries and
applications that address the needs of current and future work in
bioinformatics, including sequence analysis, structural biology,
pathways, expression data, etc. When available, the source code will
be released as open source (https://github.com/biopython/biopython/blob/9c4785fc9eaf8a3bc436c6c0b16e7a05019cade1/LICENSE)
under terms similar to Python.

This is a Call for Participation for interested people to join the
project. We are hoping to attract people from a diverse set of
backgrounds to help with code development, site maintenance,
scientific discussion, etc. This project is open to everyone. If
you're interested, please visit the web page, join the biopython
mailing list, and let us know what you think!

Jeffrey Chang <jchang@smi.stanford.edu>
Andrew Dalke <dalke@bioreason.com>