Resilience in computational genomics work highlighted

Event Date: September 23, 2017
Work by CRISP researchers Kanak Mahadik, Associate Director Milind Kulkarni, and Director Saurabh Bagchi on resilience in computational genomics was highlighted at the ACM Conference on Bioinformatics, Computational Biology, and Health Informatics (ACM BCB), held in Boston in August 2017. The work shows for the first time how to speed up genomic assembly while correcting errors in reads by running on a cluster of Hadoop nodes.

The work appeared in a paper titled “Scalable Genomic Assembly through Parallel de Bruijn Graph Construction for Multiple K-mers.” It shows how errors in genomic reads can be corrected together with the assembly process, in a scalable manner, on a cluster of Hadoop nodes. This technique breaks a scalability barrier that had stood for more than seven years and makes it possible to assemble the human genome 6.7X faster than the current state of the art while preserving the quality of the assembled genome.

The full citation is as follows:

“Scalable Genomic Assembly through Parallel de Bruijn Graph Construction for Multiple K-mers,” Kanak Mahadik, Christopher Wright, Milind Kulkarni, Saurabh Bagchi, Somali Chaterji. In Proceedings of the 8th ACM Conference on Bioinformatics, Computational Biology, and Health Informatics (ACM BCB), pp. 425-431, Aug 20-23, 2017, Boston, MA.