Saturday, July 27, 2013

BOSC-2013: my personal summary

ISMB/ECCB 2013 is over, and now I have some time to wrap up my thoughts and impressions.

Actually, I would like to talk about one of the SIGs that preceded ISMB: the Bioinformatics Open Source Conference, or simply BOSC 2013. This conference is organized by the Open Bioinformatics Foundation (the guys responsible for Biopython, BioJava and other awesome projects) and is devoted to all aspects of developing open source software for bioinformatics research. Starting this year, the scope of the conference was extended from open software to open science in general.

This was the second time I attended BOSC, and once again I greatly enjoyed the people and the talks. Although BOSC is not the biggest SIG, it has this awesome atmosphere of a community of enthusiasts. For me it is an event where I can meet and talk to people who share my views on how bioinformatics research should be conducted. A lot of things are discussed: new open source tools and pipelines, visualization methods, best practices for bioinformatics analysis, and the balance between being a software engineer and being a researcher who produces scientific results.

All the talks were great, and I highly recommend checking the presentation slides. Videos will be available soon as well. I would like to highlight several talks that I found especially interesting.

Jug: a Python library that lets you use decorators to easily launch multiple instances of your pipeline.

Ten simple rules for the open development of scientific software: based on a manuscript of the same name, which I had somehow missed. This paper is part of the bigger PLOS "Ten simple rules ..." collection.

DGE-Vis: an interesting interactive way to visualize RNA-seq expression studies.

Unipro UGENE: a sequence analysis toolkit with a sexy GUI. This is a project I've been involved with for a long time, so I was very happy that it was presented at BOSC.

bcbbio: a community-developed pipeline for SNP calling. Even if you are not interested in SNP discovery, this post by Brad Chapman is still worth looking at, because it contains many nice ideas on how to develop a reliable bioinformatics pipeline.

This time I also gave a talk at BOSC. It was about Qualimap, a tool for quality assessment of next-generation sequencing alignment data. I have been working on it since I started my PhD at MPIIB in October 2011. We published the paper almost a year ago, and since then Qualimap has become more mature and gathered a small community of users. We continue to maintain the tool and improve it based on user feedback. As I mentioned during the conference, we are soon moving to a Bitbucket repo. Subscribe to the Google group for updates.

Btw, one of the speakers was Sean Eddy. During his talk I felt a bit like I was at a concert of a favorite rock band \m/ I wish I had my copy of the Biological Sequence Analysis book with me to get an autograph :) One cool idea from his talk: sometimes reinventing the wheel and developing your own home-brewed set of tools for common tasks is actually normal. This activity should be considered an exercise that helps you better understand the things you work on. After spending enough time playing with the concept, it is still recommended to switch to the best instruments available ;)

The conference also included several panel discussions. An important one was devoted to the funding of open source scientific software development. It is commonly assumed that the main result of scientific research should be a manuscript. However, for a computational biologist the software is most often the true result of the research, while the paper is only "an advertisement". Unfortunately, grant committees and lab bosses usually do not consider software a relevant result. The discussion showed that the bioinformatics community is slowly moving away from this misconception, but a lot of work still has to be done.

So, to sum up: BOSC is awesome, don't miss it next year in Boston.
