EU Codefest 2012
EU-Codefest 2012 was held on 19 and 20 July in Lodi, Italy.
Fifteen people attended the EU-Codefest, which took place on 19 and 20 July 2012 in Lodi, Italy, near Bergamo and Milano.
The EU-Codefest is modeled on the successful biohackathons in Japan, though we prefer a more free-flowing format.
On the following weekend, 21 and 22 July, participants continued (unofficially) to have a good time and organised some outings.
Organising committee: Francesco Strozzi, Raoul Bonnal and Pjotr Prins
Three main topics were worked on during the CodeFest:
- NGS and high performance parsers for OpenBio projects.
- RDF and semantic web for bioinformatics.
- Bioinformatics pipelines definition, execution and distribution.
During the conference we discussed current issues and added the following topics:
- Service discovery and integration (Debian, BioMoby, biogems.info, BioLinux)
- Workflow (bio-ngs, Galaxy, Taverna)
- PBS (bio-ngs)
- Introducing bio-maf into the pipeline
- biogems.info (Debian packages)
- bio-table (RDF support)
The following talks/introductions/micro-presentations were given at the Codefest:
- Semantic Web for the Integrated Genomic Data by Toshiaki Katayama, Database Center for Life Science, Tokyo, Japan
- Writing the world's fastest GFF3 parser by Marjan Povolni, Faculty of Technical Sciences Novi Sad, Serbia
- Debian packaging by Steffen Möller, University of Lübeck, Germany
- BioWSR Bioinformatic Web Services Registry project update by José María Fernández González, Protein Design Group, CNB-CSIC Cantoblanco, Madrid, Spain
- RNA-seq project introduction by James Tojo, Karolinska University, Stockholm, Sweden
- Software resources for Bioinformatics, including http://biogems.info/ by Pjotr Prins, Wageningen University, The Netherlands
We worked on developing bio-ngs to better handle pipelines and reproducible workflows, plus semantic databases to manage and query NGS secondary analysis data.
Bio RDF support
During the codefest we discussed the ins and outs of graph stores and the semantic web (triple stores). We now share a vision of how to move forward using RDF for biological analysis. The next biohackathon will be in Japan in September 2012, and four of us will be there to continue this work.
Adding a list of BioLinux Software packages to biogems.info
We recognise that it is often hard to find relevant information on current software packages. At the USA codefest, a week earlier, Brad Chapman and Hervé Ménager created a manifest of the bioinformatics packages installed on a running CloudBioLinux system (VM). At the EU codefest a web representation was created that parses the manifest, pulls additional information from the Debian package metadata, and generates the page http://www.biogems.info/biolinux.html.
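As a rough illustration of the manifest-to-page step, the sketch below renders a package manifest as a simple HTML table. The manifest layout (a mapping from package name to `version` and `description`), the `render_manifest` function, and the sample entries are all assumptions made for this example. The real CloudBioLinux manifest format and the biogems.info generator may differ.

```python
from html import escape

# Hypothetical manifest: package name -> metadata.
# Names, versions and descriptions are placeholders, not real package data.
MANIFEST = {
    "example-aligner": {"version": "1.0", "description": "Placeholder read aligner"},
    "example-parser": {"version": "0.2", "description": "Placeholder GFF3 parser"},
}


def render_manifest(manifest):
    """Return an HTML table with one row per package, sorted by name."""
    header = "<tr><th>Package</th><th>Version</th><th>Description</th></tr>"
    rows = []
    for name in sorted(manifest):
        info = manifest[name]
        rows.append(
            "<tr><td>{}</td><td>{}</td><td>{}</td></tr>".format(
                escape(name),
                escape(str(info.get("version", ""))),      # missing fields become empty cells
                escape(str(info.get("description", ""))),
            )
        )
    return "<table>\n" + "\n".join([header] + rows) + "\n</table>"


if __name__ == "__main__":
    print(render_manifest(MANIFEST))
```

Extra per-package information, such as a field taken from the Debian package metadata, could be added as an additional column in the same way.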
The new overview led to a discussion on BioLinux/Debian packaging:
The overview helps the different parties focus on priorities (with packaging software).
Brad writes: From the Debian/Bio-Linux/packaging side, I'm happy to modify where we pull packages from. The goal is to have everything in single repositories to be community maintained so I'm similarly happy if some of the CloudBioLinux specific builds pick up package maintainers. The (CloudBioLinux) custom scripts now are there of necessity and not meant to be permanent. I'm also digging into nix so if there are any packages there that we've missed and would be useful, we can move that direction as well.
It was also noted that proper Debian packages also track popularity statistics (based on installation of Debian).
Steffen added: My general concern (and motivation) is community forming and an avoidance of redundancies. That is why a nice friendly embracing email sent to the Debian/Ubuntu folks on Debian Med I consider important. Your efforts should be perceived as a continuation of the Debian Med efforts, not as a competition.
In bioinformatics, and in the life sciences in general, there are plenty of web services built following the different web service paradigms and bioinformatic web service variants: REST, SOAP, BioMOBY, DAS, BioMart, etc. Although there are already web service repositories like BioMOBY and web service catalogues like BioCatalogue or DASRegistry, we feel that many of the existing solutions are not extensible enough to be applied to new web service paradigms. Even worse, in some cases the source code of the registry is not available, so it cannot be reused, improved or extended.
So we (at INB) started the design and implementation of a new web service repository (in Ruby) that will allow us (and anyone using the codebase we are developing) to document and manage the web services developed by different groups, while being open and extensible to the entire range of web service paradigms and variants used in bioinformatics. Because the INB has a large web service portfolio, one of the main concerns when designing BioWSR was that web services from any paradigm should be describable at a certain level. As the new registry is strongly inspired by the BioMOBY repository, one requisite was that its RDF triples should be a superset of the ones used in BioMOBY, to make it backward compatible.
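The backward-compatibility requirement can be pictured as a set inclusion: everything expressible in the older description must still be present in the newer one. The sketch below is only an illustration of that idea. It represents triples as Python tuples and uses made-up `ex:` terms, not the actual BioMOBY or BioWSR vocabularies, and the function names are hypothetical.

```python
def is_backward_compatible(legacy_triples, new_triples):
    """Return True if every legacy triple is also present in the new description."""
    return set(legacy_triples) <= set(new_triples)


def missing_triples(legacy_triples, new_triples):
    """Return the legacy triples that the new description lacks, sorted."""
    return sorted(set(legacy_triples) - set(new_triples))


# Hypothetical (subject, predicate, object) triples, for illustration only.
LEGACY = {
    ("ex:service1", "ex:hasName", "getSequence"),
    ("ex:service1", "ex:hasInput", "ex:SequenceId"),
}
# The newer description keeps all legacy triples and adds one of its own.
NEW = LEGACY | {("ex:service1", "ex:paradigm", "REST")}

if __name__ == "__main__":
    print(is_backward_compatible(LEGACY, NEW))
    print(missing_triples(LEGACY, NEW))
```

A real check would more likely compare the vocabularies (classes and predicates) than individual service descriptions, but the superset relation is the same.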
During the codefest we collected information on what requirements there are for GFF3/GTF tools, and worked on Debian packages for those tools.
We introduced a few people to Debian packaging and key signing. For CloudBioLinux we encourage further Debian packaging of both BioLinux packages and custom build scripts. It was noted, for example, that Debian has a more recent version of Jalview than the biolinux-jalview package.
We were working on NGS, more specifically looking at our RNA sequencing project. I just wanted to check with you guys whether everyone follows the same methods and pipeline for RNA sequencing. I also managed to set up GBrowse with the help of Paulo for our RNA-seq project, and looked at other visualization tools. Meeting you guys and learning from the group was really helpful. Now I feel like working on some of the open source projects, but I guess I need to learn a bit more to reach that level.