The color of bioinformatics: what is it and how can it be modified?

This is a guest blog post from Tendai Mutangadura, who was supported by the ongoing Open Bioinformatics Foundation travel fellowship program to attend the GCC-BOSC 2018 meeting in Portland, June 2018. The OBF’s Travel Fellowship program continues to help open source bioinformatics software developers with funding to attend conferences or workshops. This was one of three awards from our April 2018 travel fellowships call. Our August call recently closed, and the current call closes 15 December 2018, so you might want to apply.

When I was selected as one of three recipients of the April 2018 OBF Travel Fellowships, I wanted this to signify a turning point in my career. I expected to meet and interact with many great minds in open science and bioinformatics, and the GCC-BOSC 2018 meeting in beautiful Portland exceeded my expectations. Because I was travelling light and had been to Portland once before, I chose to use public transport from PDX to get to Reed College, the meeting venue. This afforded me a mini tour of Portland before getting to the serious but fun business of the meeting. When I stepped off the bus at the Reed College stop, I flagged down the first person I saw to ask for directions to the registration venue, and this person was none other than Anton, one of the scientists instrumental in the development of the Galaxy project. Great start. As we chatted en route to registration, I took the opportunity to brag to him that I had recently figured out the causal mutation associated with a neurodegenerative disease in one of our whole-genome-sequenced dogs using the web-based Galaxy platform.

I attended as many Galaxy-related training sessions as I could on Day 1 of training. I have been a Galaxy platform user for more than three years and had previously attended GCC 2016, so it was great to meet new and old acquaintances this time round. I even had the opportunity to get help with aspects of my Jetstream account virtual machines during one of the two CollaborationFest days that I attended. I found the CollaborationFest very useful for making new contacts and discussing potential future collaborations.

On Day 2 of training, the highlights, based on my bioinformatics needs, included a 2.5-hour GATK training session and the bcbio workshop. In the latter, Brad Chapman talked about and demonstrated how communities can work together to make giant strides in developing robust open source software pipelines and making these freely accessible to anyone, anywhere. For someone like me, for whom having access to computing resources and setting aside the time to focus on developing or tweaking code as part of my day job can sometimes be an uphill struggle, the bcbio workshop was a godsend. Bcbio allows me to carry out my day-job duties and do bioinformatics too. After the meeting, I immediately contacted one of my XSEDE Extended Collaborative Support Services (ECSS) team members, Phil Blood, to discuss the possibility of putting together a species-agnostic variant-calling pipeline. I have already started this project using my XSEDE start-up grant computing resources allocation on Bridges, at the Pittsburgh Supercomputing Center (PSC). So far, I am off to a good start. For those who may not be aware, many great free computing resources such as the ones I have mentioned above exist for anyone to take advantage of.

Now, a light-hearted reference to a serious (in my view) but common observation at conferences, including but definitely not limited to GCC-BOSC: the lack of diversity among attendees. This is what prompted me to title my blog post the way I did. My first answer was (metaphorically speaking) that the color of bioinformatics should be ‘rainbow’. But when I googled ‘rainbow colors’ it occurred to me that the colors black and white are not part of the rainbow. I also refreshed my rusty optical physics and found explanations of why this is the case. Now, to get back on track, would it not be wonderful if more people of color were involved in this bioinformatics revolution? What can be done to redress this current state of affairs? Thumbs up to the OBF for recognizing and doing something about this lack of diversity by creating the OBF Travel Fellowship! For my part, if and when I complete any pipeline(s) based on bcbio code, I plan to publish a collection of such pipelines online as self-paced tutorials (the website will go live soon) in a very user-friendly format aimed at those who are command-line challenged, from any community, to encourage them to get started in bioinformatics analyses, or at least to analyze their own data without buying expensive commercial packages. This would be my way, albeit on a very small scale, of democratizing bioinformatics. One of the advantages of giving people from as diverse backgrounds as possible basic training in bioinformatics and genomics is that this may help reduce the mistrust linked to unfortunate historical incidents such as the Tuskegee experiments, not only in countries like the US but anywhere around the world where similar types of mistrust may exist.