Bioperl 1.5.1

I am pleased to announce the 1.5.1 developer release of Bioperl.

Essential links here



MD5 sum

Please see my mailing list post for more information.

I have appended the Change log from the Bioperl core components below.
1.5.1 Developer release

o Major problem with how Annotations were written out with
Bio::Seq is fixed by reverting to old behavior for
Bio::Annotation objects.

o Bio::SeqIO

* bug #1871; REFLOOP’ parsing loop, I changed the pattern to
expect at least 9 spaces at the beginning of a line to
indicate line wrapping.

* Treat multi-line SOURCE sections correctly; this defect broke
both common_name() and classification()

* parse swissprot fields in genpept file

* parse WGS genbank records

* Changed regexp for ID line. The capturing parentheses are
the same; the difference is an optional repeated-not-semicolon
expression following the captured \S+. This means the
regexp works when the division looks like /PRO;/ or when the
division looks like /ANG ;/ (the latter is from EMBL)

* fix ID line parsing: the molecule string can have spaces in
it. Like: “genomic DNA”

– bugs #1727, #1734

* Added parser for entrezgene ASN1 (text format) files.
Uses Bio::ASN1::EntrezGene as a low level parser (get it from CPAN)
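
The ID line change above can be illustrated with a small sketch in plain Perl. The pattern and the helper name division_of are hypothetical, written only to show the idea of a captured \S+ followed by optional non-semicolon characters; this is not the regexp used in Bio::SeqIO.

```perl
use strict;
use warnings;

# Hypothetical pattern (an assumption, not the Bio::SeqIO regexp): capture a
# run of non-whitespace, then allow optional non-semicolon characters
# before the terminating ';'.
my $division_re = qr/(\S+)[^;]*;/;

# Hypothetical helper: return the captured division code, or undef.
sub division_of {
    my ($field) = @_;
    return $field =~ $division_re ? $1 : undef;
}

# Both the tight form and the EMBL-style form with a space before ';'.
for my $field ( 'PRO;', 'ANG ;' ) {
    print "'$field' -> ", division_of($field), "\n";
}
```

Without the optional [^;]* part, the field with a space before the semicolon would not match at all.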

o Bio::AlignIO

– coordinate problem fixed

o Bio::Taxonomy and Bio::DB::Taxonomy

– Parse NCBI XML now so that nearly all the taxonomy up-and-down
can be done via Web without downloading all the sequence.

o Bio::Tools::Run::RemoteBlast supports more options and complies
with changes to the NCBI interface. It is recommended that you
retrieve the data in XML instead of the plain-text BLAST report to
ensure proper parsing and retrieval of all information, as NCBI
fully expects to change things in the future.

o Bio::Tree and Bio::TreeIO

– Fixes so that re-rooting a tree works properly

– Writing out nhx format from a newick/nexus file will properly output
bootstrap information. The user must move the internal node labels over
to bootstraps.
for my $node ( grep { ! $_->is_Leaf } $tree->get_nodes ) {
    $node->bootstrap($node->id);
    $node->id('');
}
– Nexus parsing is much more flexible now, does not care about

– Cladogram drawing module in Bio::Tree::Draw

– Node height and depth now properly calculated

– fix tree pruning algorithm so that node with 1 child gets merged

o Graphics tweaks. Glyph::xyplot improved. Many other small-medium sized
bugs and improvements were added, see Gbrowse mailing list for most of

o Bio::DB::GFF partially supports GFF3. See information about
gff3_munge flag in scripts/Bio-DB-GFF/

o Better location parsing in Bio::Factory::FTLocationFactory –
this is part of the engine for parsing EMBL/GenBank feature table
locations. Nested join/order-by/complement are allowed now

o Bio::PrimarySeqI->translate now takes named parameters

o Bio::Tools::Phylo::PAML – parsing RST (ancestral sequence
reconstruction) is now supported. Parsing different models and
branch-specific parameters is now supported.

o Bio::Factory::FTLocationFactory – parse hierarchical locations
(joins of joins)

o Bio::Matrix::DistanceMatrix returns arrayrefs instead of arrays
for getter/setter functions

o Bio::SearchIO

– blast bug #1739; match scientific notation in score
and possible e+ values

– reads more WU-BLAST parameters, matches
a full database pathname

– Handle NCBI web and newer BLAST formats specifically: the
(Query|Sbjct:) match in alignment blocks can now be (Query|Sbjct).

– psl off-by-one error fixed

– exonerate parsing much improved, CIGAR and VULGAR can be parsed
and HSPs can be constructed from them.

– HSPs query/hit now have a seqdesc field filled out (this was
always available via $hit->description and

– can parse -A0 hmmpfam files

– Writer::GbrowseGFF more customizable.
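
The scientific-notation item above (bug #1739) concerns score fields such as 1e+03. A rough sketch in plain Perl follows; the pattern and the helper name parse_score are hypothetical and are not taken from Bio::SearchIO.

```perl
use strict;
use warnings;

# Hypothetical pattern (an assumption, not the Bio::SearchIO code): digits
# with an optional fraction and an optional exponent signed '+' or '-'.
my $number_re = qr/^([+-]?\d+(?:\.\d+)?(?:[eE][+-]?\d+)?)$/;

# Hypothetical helper: return the numeric value of a score token, or undef
# when the token is not a number in one of the accepted forms.
sub parse_score {
    my ($token) = @_;
    return $token =~ $number_re ? $1 + 0 : undef;
}

for my $token ( '857', '1e+03', '2.5e-10', 'n/a' ) {
    my $value = parse_score($token);
    print "$token -> ", ( defined $value ? $value : 'no match' ), "\n";
}
```

A pattern that only accepts plain digits would reject the exponent forms, which is the kind of failure the fix addresses.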

o Bio::Tools::Hmmpfam

– make e-value default score displayed in gff, rather than raw score

– allow parse of multiple records