Member Projects

BioJava is an open-source project dedicated to providing a Java framework for processing biological data. It provides analytical and statistical routines and parsers for common file formats, and it allows the manipulation of biological sequences and 3D structures. The main goal of the project is to facilitate rapid application development for bioinformatics.

The Bioperl Project is an international association of users and developers of open source Perl tools for bioinformatics, genomics and life science. BioPerl[1][2] is a collection of Perl modules that facilitate the development of Perl scripts for bioinformatics applications. It has played an integral role in the Human Genome Project.

Biopython is a set of freely available tools for biological computation written in Python by an international team of developers.

It is a distributed collaborative effort to develop Python libraries and applications which address the needs of current and future work in bioinformatics. The source code is made available under the Biopython License, which is extremely liberal and compatible with almost every license in the world.

BioRuby is a collection of open-source Ruby code, comprising classes for computational molecular biology and bioinformatics. It contains classes for DNA and protein sequence analysis, sequence alignment, biological database parsing, structural biology and other bioinformatics tasks.

BioSQL is a generic unifying schema for storing sequences from different sources, for instance Genbank or Swissprot. BioSQL is meant to be a common data storage layer supported by all the different Bio* projects, Bioperl, Biojava, Biopython, and Bioruby. Entries stored through an application written in, say, Bioperl could be retrieved by another written in Biojava.
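The shared-storage idea can be illustrated with a minimal sketch. It does not use the actual BioSQL schema: the single `entry` table, its columns, the helper functions and the accession `EXAMPLE001` are all hypothetical, and an in-memory SQLite database stands in for a real shared database server. The point is only that a writer and a reader that agree on one schema can exchange entries without knowing anything about each other.

```python
import sqlite3


def create_store(conn):
    # Hypothetical one-table layout; the real BioSQL schema is far richer.
    conn.execute(
        "CREATE TABLE IF NOT EXISTS entry ("
        "accession TEXT PRIMARY KEY, "
        "source TEXT NOT NULL, "
        "sequence TEXT NOT NULL)"
    )


def store_entry(conn, accession, source, sequence):
    # Plays the role of the application that writes an entry.
    conn.execute(
        "INSERT INTO entry (accession, source, sequence) VALUES (?, ?, ?)",
        (accession, source, sequence),
    )
    conn.commit()


def fetch_entry(conn, accession):
    # Plays the role of a second application reading the same store.
    # Returns a (source, sequence) tuple, or None if the accession is unknown.
    return conn.execute(
        "SELECT source, sequence FROM entry WHERE accession = ?",
        (accession,),
    ).fetchone()


if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")
    create_store(conn)
    store_entry(conn, "EXAMPLE001", "Genbank", "ATGCGT")
    print(fetch_entry(conn, "EXAMPLE001"))
```

In practice the writer and reader would be separate programs, possibly in different languages, connecting to the same database.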

The Distributed Annotation System (DAS) defines a communication protocol used to exchange annotations on genomic or protein sequences. It is motivated by the idea that such annotations should not be provided by single centralized databases, but should instead be spread over multiple sites. Data distribution, performed by DAS servers, is separated from visualization, which is done by DAS clients. The advantages of this system are that control over the data is retained by data providers, data is freed from the constraints of specific organisations, and the normal issues of release cycles, API updates and data duplication are avoided.

The DAS/2 specification describes how different annotation servers can make annotations on the same genomic sequence. It uses a set of globally defined names to label each segment. A DAS client merges data from two annotation servers if they both use segments having the same global reference name.
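The merge rule described above can be sketched as follows. This is an illustration only, not the DAS/2 wire format or any real client API: each server response is modelled as a plain dictionary mapping a global segment name to a list of annotation labels, and the function name, segment names and labels are all hypothetical.

```python
def merge_shared_segments(server_a, server_b):
    """Combine annotations for segments that both servers name identically.

    Each argument is a dict mapping a global segment name to a list of
    annotations (a hypothetical stand-in for a parsed server response).
    Segments reported by only one server are left out of the result.
    """
    shared_names = server_a.keys() & server_b.keys()
    return {
        name: list(server_a[name]) + list(server_b[name])
        for name in sorted(shared_names)
    }


if __name__ == "__main__":
    genes = {
        "example-genome/chr1": ["gene A"],
        "example-genome/chr2": ["gene B"],
    }
    repeats = {
        "example-genome/chr1": ["repeat X"],
        "example-genome/chr3": ["repeat Y"],
    }
    # Only chr1 is named identically by both servers, so only it is merged.
    print(merge_shared_segments(genes, repeats))
```

Keying on the shared name is what lets the client combine data without the two servers coordinating with each other.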

EMBOSS is “The European Molecular Biology Open Software Suite”. EMBOSS is a free Open Source software analysis package specially developed for the needs of the molecular biology (e.g. EMBnet) user community. The software automatically copes with data in a variety of formats and even allows transparent retrieval of sequence data from the web. Also, as extensive libraries are provided with the package, it is a platform to allow other scientists to develop and release software in true open source spirit. EMBOSS also integrates a range of currently available packages and tools for sequence analysis into a seamless whole.

Affiliated Projects

GBrowse (Generic Genome Browser), a part of GMOD, is a combination of database and interactive web pages for manipulating and displaying annotations on genomes.

Ontologies and Definitions

  • Cyclone :

    Provides an open source Java API to the pathway tool BioCyc.

  • GO-Perl :

    Parsers and object model for OBO ontologies.

  • Chaos-XML :

    File format based on the sequence modules of the Chado relational schema.

  • Sequence Ontology :

    Collaborative ontology project for the definition of sequence features used in biological sequence annotation.

  • Gene Ontology :

    Defines concepts/classes used to describe gene function, and relationships between these concepts.

  • Obol :

    Software for automatically generating cross-product definitions (aka genus-differentia definitions) from the names of terms/classes in OBO ontologies.

Retired Projects

  • MOBY :

    A system for interoperability between biological data hosts and analytical services.

  • OBDA :

    Standardization of access to sequence data resources.

  • Wikiomics :

    Bioinformatics Wiki.

  • Pise :

    A tool to generate Web interfaces, along with software for submitting jobs to these interfaces.

  • OpenBQS :

    Bibliographic query system.

  • BioClipse :

    Visual platform based on Eclipse.

  • BioWeka :

    Adds bioinformatics functionality, such as alignments, to the popular machine learning framework Weka.

  • BlipKit :

    Chris Mungall’s Prolog toolkit for Bioinformatics and BioMedical Informatics.

  • Biolib :

    Cross-project/language C bindings for a shared codebase across the different Bio* projects (Pjotr Prins).