Newsletter:2006 Winter
Winter 2006 Open Bioinformatics Foundation Newsletter
Open Bioinformatics Foundation Report
President's Report
Jason Stajich
2005 OBF Board Meeting Report
The 2005 Open Bioinformatics Foundation board meeting was held during the BOSC 2005 conference in Detroit, MI. Several items of business were attended to, including the following personnel changes:
- Jason Stajich replaced Ewan Birney as President
- Hilmar Lapp was elected Parliamentarian
- Past President Ewan Birney was added as an at-large member
In addition, the Board agreed on guidelines for OBF membership requirements.
Financial Overview
Chris Dagdigian
Website and Mailing list statistics
maybe will drop this if I don't have time Jason 16:31, 26 January 2006 (EST)
BOSC Reports
Contributed by Darin London
Each year, the O|B|F sponsors the Bioinformatics Open Source Conference (BOSC) as a Special Interest Group in association with the Intelligent Systems for Molecular Biology conference (ISMB), which is sponsored by the International Society for Computational Biology (ISCB). The conferences over the last two years have been very successful, and we look forward to continuing to serve the Open Source Bioinformatics Community, and the wider research community, in 2006.
BOSC 2004
BOSC 2004 was held along with ISMB 2004 in Glasgow, Scotland. The conference was a great success, with 136 people in attendance. Our Keynote address featured Wolfgang Huber, delivering an insightful overview of the very popular BioConductor Open Source application system, for which he is responsible. His talk was followed, over the course of two days, by eleven excellent 20-minute presentations selected from a very competitive pool of applicants; a special session on Annotation Database Systems, where the leaders in the field discussed their respective projects; and 29 lightning talks. These talks informed us of a huge range of Open Source/Open Standards bioinformatics software development and usage activities within the community, including endeavors devoted to semantically identifying life science objects, easing the computational processing of the ever-growing array of biological datasets, integrating disparate datasets, and validating the diverse Open Source software systems available to the research community. As always, attendees were given the opportunity to organize Birds of a Feather discussions devoted to a diverse array of more specialized interests after the main session each day.
BOSC 2005
BOSC 2005 was held in conjunction with ISMB 2005 in Detroit, Michigan. The meeting was very successful, with 99 people in attendance. The meeting featured two Keynote addresses. The first was delivered by Jason Stajich, one of the core developers of BioPerl. He spoke of the challenges and rewards he has experienced in developing Open Source Bioinformatics Software, and discussed the challenges he sees facing Bioinformatics Software developers in the future. The second Keynote address was delivered by Hilmar Lapp, Open Bio Foundation Parliamentarian. He explained the exciting changes that the O|B|F was undergoing in transitioning to a more active not-for-profit organization, and invited BOSC attendees to become involved with the foundation in ways that were not available to them before. Over the course of two days, attendees were provided a wealth of information from 15 excellent 20-minute presentations selected from a very competitive pool of applicants, and from 17 lightning talks, covering a broad array of topics. Birds of a Feather groups then formed in the afternoon around a variety of specific topics, such as the DAS2 specification and the various Bio* tools.
BOSC 2006
The BOSC committee is already in the process of planning BOSC 2006, to be held in conjunction with ISMB 2006 in Fortaleza, Brazil. We are looking forward to a great meeting.
O|B|F Projects Reports
CGL
Contributed by Mark Yandell
About CGL: the Comparative Genomics Library
CGL is an open source software library for comparative genomics using genome annotations. The library is designed to employ the contents of genome-annotation databases for purposes of large-scale inquiries into the structure, function and evolution of genes. To facilitate these analyses, CGL provides three core functionalities. First, CGL can convert the annotations of many different providers into a single standardized format (Chaos.xml); thus the software can be used to assemble very large repositories of annotations that encompass the contents of multiple genome databases. Second, the library extends the bioperl HSP objects so that information about annotated gene-structures is mapped to sequence alignments. Finally, CGL can make use of existing annotations to extract information about gene structure from unannotated, partially assembled genomes. For more information on downloading and using CGL, go to www.yandell-lab.org/cgl.
Current Release
Get the latest release of CGL from www.yandell-lab.org/cgl.
Have a look at the main reference for CGL: Large-Scale Trends in the Evolution of Gene Structures within 11 Animal Genomes. Mark Yandell, Chris J. Mungall, Chris Smith, Simon Prochnik, Joshua Kaminker, George Hartzell, Suzanna Lewis and Gerald M. Rubin. In press, PLoS Computational Biology.
Contact us
General questions and comments about CGL should be directed to myandell-at-fruitfly.org.
BioJava
Contributed by Mark Schreiber
Biojava in 2005
2005 saw an increasing amount of activity for biojava. The long-awaited biojava1.4 was finally released in June, which re-energised the project. A big spike in web activity for the biojava web-site was observed shortly after the release: hits increased from about 6,000 per day to about 10,000 per day. The binary (JAR) distribution of biojava 1.4 has been downloaded more than 4,000 times since its release, with a typical month seeing about 700 downloads.
The new activity has also increased the traffic on the mailing list and has prompted a jump in the number of active developers. At the same time, career changes and work pressures have meant a few of the original lead developers have taken more of a back-seat role. Fortunately, their Oracular wisdom can still be heard from time to time drifting from the mists of the ethernet.
During 2005 the Biojava in Anger site remained popular and saw a big increase in the number of tutorials available. There are now 64 short tutorials which cover most of the questions commonly asked in biojava.
New developments
The release of biojava1.4 also cleared the way for new developments to begin in earnest. Two exciting new developments are the biojavax extensions and the biojava structure package.
Biojavax
Biojavax is an API extension to biojava. It has been designed to extend the core interfaces of biojava without breaking any of the old design. Incompatibility between versions has been a key problem with previous biojava releases. The idea is to increase the functionality of the core API, particularly in the areas of file parsing and data persistence. The relationship between biojavax and biojava is best compared to the relationship of javax and java. Javax extends, improves and sometimes deprecates the java API without being essential or breaking old code.
The need for biojavax came about when Richard Holland and I were working on a biojava / biosql model for the dengue virus database dengueinfo. We quickly found deficiencies in biojava's biosql interaction model. We also identified areas where flatfile I/O could be improved. Some fixes were made to the code base, but we also decided that the biojava core interfaces and I/O model could be usefully extended. We also redesigned the biosql mappings so that all persistence and transactions with biosql are now handled seamlessly by Hibernate. Although this work is still experimental, we hope to have a preview release of biojava1.5 including biojavax by early 2006.
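The "extend without breaking" idea can be illustrated with a minimal sketch. It is written in Python purely for brevity and is not the biojavax API: the class names `Sequence` and `AnnotatedSequence` and the function `gc_content` are hypothetical, chosen only to show that client code written against a core type keeps working when it is handed an extended type.

```python
class Sequence:
    """Stand-in for an original core type (hypothetical, not a biojava class)."""

    def __init__(self, name, residues):
        self.name = name
        self.residues = residues

    def length(self):
        return len(self.residues)


class AnnotatedSequence(Sequence):
    """Hypothetical extension: adds metadata without changing the core type."""

    def __init__(self, name, residues, accession=None, version=1):
        super().__init__(name, residues)
        self.accession = accession
        self.version = version

    def identifier(self):
        # Fall back to the plain name when no accession is known.
        if self.accession:
            return f"{self.accession}.{self.version}"
        return self.name


def gc_content(seq):
    """'Old' client code: relies only on what the core type offers."""
    if seq.length() == 0:
        return 0.0
    gc = sum(1 for base in seq.residues.upper() if base in "GC")
    return gc / seq.length()
```

Here `gc_content` accepts either type unchanged, while newer code can opt in to `identifier()` when it knows it holds the extended type.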
Biojava Structure Package
Biojava's structure package first appeared in biojava1.4 and continues to develop. The package is designed to allow the I/O of PDB files. Additionally, it contains an object model that describes molecular structures and allows for powerful manipulation and transformation of those structures. The main developer of the package is Andreas Prlic, who is using it to develop a DAS client for structures called SPICE.
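To give a feel for what PDB I/O plus a simple object model involves, here is a small sketch. It is written in Python for brevity and does not use or imitate the biojava structure API: the `Atom` class and the functions `parse_atom`, `format_atom` and `translate` are hypothetical names. It assumes the standard fixed-column layout of PDB `ATOM` records and handles only the fields shown.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Atom:
    """Minimal atom model (hypothetical; only a subset of ATOM fields)."""
    serial: int
    name: str
    res_name: str
    chain: str
    res_seq: int
    x: float
    y: float
    z: float


def parse_atom(line):
    """Read one fixed-column PDB ATOM record into an Atom."""
    return Atom(
        serial=int(line[6:11]),        # columns 7-11
        name=line[12:16].strip(),      # columns 13-16
        res_name=line[17:20].strip(),  # columns 18-20
        chain=line[21],                # column 22
        res_seq=int(line[22:26]),      # columns 23-26
        x=float(line[30:38]),          # columns 31-38
        y=float(line[38:46]),          # columns 39-46
        z=float(line[46:54]),          # columns 47-54
    )


def format_atom(atom):
    """Write an Atom back out as the first 54 columns of an ATOM record."""
    # By convention, atom names shorter than four characters start in column 14.
    name = atom.name if len(atom.name) == 4 else f" {atom.name:<3}"
    return (
        f"{'ATOM':<6}{atom.serial:>5} {name} {atom.res_name:>3} "
        f"{atom.chain}{atom.res_seq:>4}    "
        f"{atom.x:8.3f}{atom.y:8.3f}{atom.z:8.3f}"
    )


def translate(atoms, dx, dy, dz):
    """Return new atoms shifted by (dx, dy, dz); the inputs are left unchanged."""
    return [replace(a, x=a.x + dx, y=a.y + dy, z=a.z + dz) for a in atoms]
```

A typical use would be to parse every line starting with `ATOM`, apply `translate` (or any other transformation), and write the result back with `format_atom`. A real parser would also need to cover HETATM records, occupancy, B-factors, alternate locations and multiple models, none of which are handled here.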
What is planned for 2006?
One of the core goals is to roll out biojava1.5 including the biojavax APIs. Following that, I would like to extend the work we have done with Hibernate to make use of the Spring Framework. The intended goal is to make biojava able to act as the middleware and/or data object layer of an enterprise application.
Another area for development in 2006 will be improving biojava's abilities in data integration and interaction with semantic web technologies like RDF by building on code developed by Matthew Pocock when he was experimenting with his bjv2 API.
BioPerl
Contributed by Brian Osborne
The last 2 years in BioPerl have been characterized by significant acceptance and use by the academic and commercial communities. Up until 2003 roughly 100 papers were published referring to the BioPerl package. Since then more than 200 articles have appeared that cite BioPerl, primarily in the areas of genome annotation and database construction. These figures, and data on Web site use, show a steady increase in the popularity of BioPerl.
On the development side there has been much work on standardizing documentation and increased testing, a full "bug sweep", and a developer's release, version 1.5. In addition some 100 new modules have been added in the last 2 years, for a total of 816 modules and roughly 173,000 lines of code.
On the community side a small group of dedicated supporters continued to man the bioperl-l mailing list, which receives a steady stream of queries. The most exciting news in this area is the new Wiki-based BioPerl site which has been created in the last few months and will be released in January 2006.
GMOD
Contributed by Scott Cain
The Generic Model Organism Database (GMOD) Project is a largely open source project to develop a complete set of software for creating and administering a model organism database. Components of this project include genome visualization and editing tools, literature curation tools, a robust database schema (known as Chado), and a generic web front end. Several established model organism databases contribute to the project, including WormBase, the Saccharomyces Genome Database, FlyBase, Gramene, TAIR, the Rat Genome Database, and EcoCyc. Many components of GMOD rely on BioPerl, including one of the "flagship" projects, the Generic Genome Browser (GBrowse), the widely adopted, web-based genome feature visualizer. Over 150 organizations have adopted some tools from GMOD.
Chado
add stuff here from Chris Mungall about chado, including a few adopting orgs, like dictyBase and ParameciumDB
Contact info
For more information, please see www.gmod.org or contact Scott Cain (cain@cshl.edu).
PISE
Contributed by Catherine Letondal
BioPython
Contributed by Iddo Friedberg
BioPython had two releases in 2005, both comprising major additions to the code base. With 518 members subscribed to the general mailing list, and 210 on the developers list, BioPython has grown to become an important tool for computational biology. Traffic on the developers list is high, with contributions and patches submitted on a weekly basis. Currently the project comprises some 237,000 lines of code.
Python's forte is its ease of use for non-programmers, which makes it an ideal language for life scientists who wish to program. On the other hand, BioPython's documentation, as in many collaborative projects, has been sorely lagging behind development. Thanks to efforts made by a few of our members, the documentation base has been greatly improved. We see the impact of this in the mailing list, where many emails start with "I am new to programming and to Python, and I have been using Biopython so far and it is great! I just have one problem..." (We sometimes even manage to help with that one problem).
About 20 new modules were added in 2005, and for a full list please see the release notes [1].
DAS
BioMOBY
Contributed by Mark Wilkinson
In 2005, BioMoby enjoyed its most successful year so far! This was the year that it emerged from its adolescence as "just an amusing prototype", to mature into a serious codebase with a rigorous test suite, increasingly comprehensive documentation, and a formal RFC procedure for code changes. The number of independent Moby Web Service registries has grown to 5 (Canada, Germany, Spain, Australia, and the Philippines), and the number of interoperable services in the primary public registry at UBC in Vancouver exceeds 500. Powerful and intuitive tooling for service providers and for end-users has also become available in the past year, lowering the bar for participation and use, and greatly enhancing the accessibility of the BioMoby platform for the "newbie". The BioMoby community of core developers has doubled this year, and we have continued the tradition of having face-to-face developers meetings every 6 months to ensure that the project does not stagnate. Moby continues to enjoy a strong base of financial support from the original Genome Canada award, and from a strong and productive collaboration with the UK-based myGrid project, in addition to the many participating members worldwide who invest their own resources into tooling and core code improvements. We predict that 2006 will be the year that BioMoby truly appears on the biologist's radar and starts making a difference to the lives of researchers around the world!
BlipKit
Contributed by Chris Mungall
Blipkit (Biological Logic Programming Knowledge Integration Kit), aka Blip, aka BioProlog, is a new addition to the OBF fold. Blip is an integrated application programming library and a lightweight deductive database system, and contains modules and schemas for representing and using various biological and biomedical datatypes, such as sequence features, phenotypes, pathways and phylogenies/species taxonomies. Blip also has strong support for ontologies (both OBO and OWL).
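For readers unfamiliar with the term, a deductive database stores facts together with rules that derive new facts from them. The sketch below illustrates the idea in Python rather than in Prolog and is in no way Blip's own code: the function `infer_ancestors` is a hypothetical name, and the ontology terms are simplified toy facts. It applies the single rule "if X is_a Y and Y is_a Z, then X is_a Z" repeatedly until nothing new can be derived.

```python
def infer_ancestors(is_a_facts):
    """Return all (child, ancestor) pairs implied by the transitivity rule."""
    derived = set(is_a_facts)
    changed = True
    while changed:
        changed = False
        # Iterate over snapshots so the set can grow during the pass.
        for child, parent in list(derived):
            for other_child, ancestor in list(derived):
                if parent == other_child and (child, ancestor) not in derived:
                    derived.add((child, ancestor))
                    changed = True
    return derived


# Toy facts for illustration only (not taken from any real ontology release).
FACTS = {
    ("mitochondrion", "organelle"),
    ("organelle", "cellular_component"),
}
```

Calling `infer_ancestors(FACTS)` yields the two stored facts plus the derived pair `("mitochondrion", "cellular_component")`. A logic programming system expresses the same rule declaratively and evaluates it on demand instead of computing the whole closure up front.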
Some end-user scripts and applications are provided, including an AmiGO clone. The current software is in alpha; early adopters are encouraged to view:
or the alternate url:
Newsletter edited by Jason Stajich