Cordon Bleu Bioinformatics

I attended the Bioinformatics Open Source Conference (BOSC 2019), organized this year along with ISMB/ECCB in Basel, Switzerland, from July 21st to 25th. BOSC 2019 brought a lot of ‘firsts’ for me. I was attending my first ISMB/ECCB. It also happened to be my first time in Europe. It was the first time I was putting faces and voices to a lot of names. As at most conferences these days, I met a lot of Twitter-verse friends for the very first time. And above all, this was my first ever BOSC. I was funded in part by the Open Bioinformatics Foundation’s Travel Award and ISMB/ECCB’s Travel Fellowship. My travel and the learnings I summarize here would have been impossible without both.

Nomi Harris introducing the 20th BOSC 2019 (Couldn’t capture Nomi at this angle)

BOSC and me

I had been trying to go to BOSC for around three years. I submitted three abstracts over the three years, all of which were accepted for a lightning talk. However, a lack of funding made it impossible for me to make it to BOSC 2016 and BOSC 2018. This year, I submitted an abstract for a poster and talk about pysradb, and it was selected for both a lightning talk (a short five-minute talk) and a poster.

ISMB/ECCB Single Cell Workshop

I arrived in Switzerland on July 19th, two days earlier than the official start date of the meeting and a day earlier than the Tutorial tracks.

OBF’s Travel Award not only covered my ISMB registration but was also generous enough to cover the registration costs for the tutorials. ISMB/ECCB 2019 had multiple tutorial tracks spanning interpretation of deep learning in biology, computational drug discovery, statistical methods for single-cell RNA-seq, biological data visualization, biomarker discovery and tools for reproducible analysis. I decided to attend the tutorial on statistical methods for single-cell RNA-seq, primarily to get an overview of the single-cell world, which is not very distant from my current research interest of deciphering translation regulation. The tutorial was very comprehensive and gave an overview of the technology, its nuances and the methods developed to tackle them. I expected it to be a bit more methods-oriented, but it is difficult to have the best of everything for a diverse audience.

Keynotes at ISMB/ECCB

ISMB/ECCB, being a large conference (it is two conferences combined into one), is organized into multiple tracks. The upside of a large conference is that there is so much new to learn. An obvious downside is that it can be overwhelming. I followed an “on-the-day” optimization strategy: early each morning, I would look up the schedule of talks for that day. This would have been practically impossible without the ISMB conference app. Though there are traditional conference booklets, an app comes in handy for scheduling. Given the multitude of tracks involved, I kept running from one room to another. The calm dusk at the Rhine river came to the rescue after each day’s heated effort, quite literally.

I made sure I attended all the keynotes. Day 1 had Dr. Nikolaus Rajewsky’s keynote on how single-cell RNA-seq is improving our understanding of gene regulation on a much more granular scale in space and time. Day 2’s keynote had Dr. William Noble walking us through the applied machine learning world in genomics: an embedding method for inferring the corresponding cells given data from two biological assays, and an imputation method for learning a latent representation of the epigenome. Day 3 had Dr. Alexis Battle talking about modeling the complex impact of common and rare genetic variation on gene expression. The field of genomics is full of examples of missing ground truth. On Day 4, Dr. Christophe Dessimoz spoke on “Challenges and rewards of benchmarking – how to cope with a biased, incomplete, or even entirely missing ground truth”. I had been aware of Dr. Dessimoz’s work through his paper on the ortholog-paralog conjecture. Finally, Dr. Bonnie Berger’s keynote introduced me to the perils of data sharing in the genomic world and the solutions her lab has worked on over the years.

Poster Sessions

The poster sessions were spread over 4 days of ISMB/ECCB. With approximately 1000 posters, it is overwhelming to digest all the information you obtain. I again relied on the ISMB/ECCB mobile app to shortlist ones that I really wanted to learn about. I am sure I still missed out on a bunch. 

Keynote at BOSC

Dr. Nicola Mulder’s keynote introduced us to the challenges faced by African countries, owing to their history of exploitation and inequitable scientific partnerships. 

H3ABioNet is a Pan-African bioinformatics network that is trying to address these challenges by making controlled-access data findable and interoperable for both data providers and users.

BOSC Takeaways

BOSC runs over two days of the main conference and is often followed by another two days of CoFest. Unfortunately, I was not able to participate in CoFest. BOSC was packed with sessions on data crunching, data modeling and formats, containers, open science, workflows and building open source communities. Besides these sessions, there were also independent Birds of a Feather sessions (BoFs), which are informal self-organized meetups on a myriad of topics, spread over the two days. I participated in the “Reimagining the paper” BoF organized by Dr. Emmy Tsang of eLife Innovation. We discussed how the traditional model of publishing is limited in terms of reproducibility. Journals need to define a minimum standard for ensuring reproducibility, but defining what that standard should be remains a challenge in itself.

Summarizing all the talks at BOSC would be near impossible here. So I will describe the ones that got me super excited. The first one was by Dr. Malvika Sharan, on how promoting inclusiveness in Open Science communities requires an “open by design” approach where information sharing is scalable, the community is welcoming and supportive of newcomers and there are appropriate channels for skill transfer.

Dr. Devon Ryan’s talk on snakePipes was another one that stuck with me, as it solves a critical problem that I have faced as a computational biology PhD student. A lot of data analysis involves mundane steps. snakePipes is designed to make these steps accessible to everyone with such a need, without forcing them to be familiar with the command line (CLI).

If you are wondering what “eierlegendewollmilchsau” is, to the best of my knowledge (and memory), it translates to “has only advantages, satisfies all needs, meets all demands” and probably fits in well with the ideology behind snakePipes.

Mateusz Kuzak from ELIXIR, in his talk “ELIXIR Europe on the Road to Sustainable Research Software”, encouraged everyone to open source their work from day one. No, there is no evidence that your work will be scooped if you choose to do so. Here is his advice:

I need to mention my fellow OBF Travel awardee Dr. Aziz Khan’s efforts at building ECRcentral, which helps early-stage researchers find and discuss funding opportunities, besides providing them a platform to share their experiences and mentor peers. As an early-stage researcher myself, I have found the resources there extremely useful.

Finally, a project that I will keep an eye on is Biotite by Patrick Kunzmann, designed as a comprehensive and efficient computational molecular biology library. But don’t we already have Biopython? Well, yes and no. Biotite’s extensive use of NumPy and Cython results in reduced runtimes.

Biotite and Biopython: Open source wins
Biotite’s performance compared across multiple tasks

My lightning talk at BOSC

My schedule before leaving for ISMB/ECCB 2019 from Los Angeles was jam-packed. I made the grave mistake of overestimating my ability to put together a presentation for a 5-minute lightning talk. You just have to talk for five minutes, but then you have only five minutes. My presentation was only complete the night before my actual talk and would not have been possible without the feedback of Meghna Verma, a PhD candidate at Virginia Tech. I also met her for the first time at ISMB. I had a lightning talk on Day 1 where I talked about pysradb, which I built over the last year to help me in my Ph.D. project. In the future, I plan not to overestimate my abilities; doing so is not worth the potential ill effects.

I also had a poster on pysradb on the day of my lightning talk. I got a lot of visitors and a bunch of feedback. I hope pysradb also gets more contributors in the days to come!

P-01: pysradb poster

Applying for BOSC Travel Award

If you are thinking of traveling to any event promoting open-source bioinformatics software development and open science in the biological research community, OBF covers up to USD 1000 and it is possible to request a higher amount. The application process is one of the smoothest I have ever come across!

Application Deadlines: April 15, August 15, December 15 every year.

Applying for ISMB Travel Fellowship

ISCB also provides Travel Fellowships that can come in handy if your original source of funding is not sufficient. Travel to Switzerland was a bit expensive, so I applied for a Travel Fellowship once I got an acceptance. If you are submitting to ISMB in the future, you will receive instructions by email on applying for a Travel Fellowship. Both the OBF and ISMB applications are very easy to fill out.

Cordon Bleu?

Though the term is more often associated with a dish, I chose it as the title of this post for the high quality of talks at ISMB/ECCB and BOSC. I learned a lot in the span of five days and look forward to participating in the future as well. Again, I can’t thank the Open Bioinformatics Foundation and ISMB/ECCB enough for making this cordon bleu experience possible.

Cordon Bleu – the dish

Dusk at the Rhine river, a few minutes’ walk from the conference venue