MetaFast: fast reference-free graph-based comparison of shotgun metagenomic data

Overview of attention for article published in Bioinformatics, June 2016
  • In the top 5% of all research outputs scored by Altmetric
  • Among the highest-scoring outputs from this source (#13 of 7,651)
  • High Attention Score compared to outputs of the same age (97th percentile)
  • High Attention Score compared to outputs of the same age and source (98th percentile)

9 news outlets
2 blogs
56 tweeters
1 Facebook page


72 Mendeley
6 CiteULike
Published in
Bioinformatics, June 2016
DOI 10.1093/bioinformatics/btw312
Vladimir I. Ulyantsev, Sergey V. Kazakov, Veronika B. Dubinkina, Alexander V. Tyakht, Dmitry G. Alexeev


High-throughput metagenomic sequencing has revolutionized our view of the structure and metabolic potential of microbial communities. However, analysis of metagenomic composition is often complicated by the high complexity of the community and the lack of related reference genomic sequences. As a starting point for comparative metagenomic analysis, researchers require efficient means of assessing the pairwise similarity of metagenomes (beta-diversity). A number of approaches are used to address this task; however, most of them have inherent disadvantages that limit their scope of applicability. For instance, reference-based methods perform poorly on metagenomes from previously unstudied niches, while composition-based methods appear to be too abstract for straightforward interpretation and do not allow identification of differentially abundant features.

We developed MetaFast, an approach that represents a shotgun metagenome from an arbitrary environment as a modified de Bruijn graph consisting of simplified components. For multiple metagenomes, the resulting representation is used to obtain a pairwise similarity matrix. The dimensional structure of the metagenomic components preserved in our algorithm reflects the inherent subspecies-level diversity of microbiota. The method is computationally efficient and especially promising for the analysis of metagenomes from novel environmental niches.

Source code and binaries are freely available for download at https://github.com/ctlab/metafast. The code is written in Java and is platform independent (tested on Linux and Windows x86_64).

CONTACT: VIU: ulyantsev@rain.ifmo.ru, SVK: svkazakov@rain.ifmo.ru, VBD: dubinkina@phystech.edu, AVT: at@niifhm.ru, DGA: exappeal@gmail.com

SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
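The general idea in the abstract, graph components shared across samples and then a pairwise matrix, can be illustrated with a toy sketch. This is not MetaFast's implementation (the real tool is written in Java and simplifies the graph in ways not described here). The function names, the use of plain connected components, the choice of Bray-Curtis dissimilarity, and the omission of reverse-complement handling are all assumptions made for illustration. The sketch returns dissimilarities, so 0.0 means identical component profiles and 1.0 means no shared components.

```python
from collections import Counter
from itertools import combinations


def kmer_counts(reads, k):
    """Count every k-mer occurring in a list of read strings."""
    counts = Counter()
    for read in reads:
        for i in range(len(read) - k + 1):
            counts[read[i:i + k]] += 1
    return counts


def graph_components(kmers, k):
    """Map each k-mer to the connected component of a de Bruijn graph.

    Each k-mer is treated as an edge joining its (k-1)-mer prefix and
    suffix; union-find merges nodes that share an edge.
    """
    parent = {}

    def find(x):
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    for kmer in kmers:
        a, b = find(kmer[:-1]), find(kmer[1:])
        if a != b:
            parent[a] = b
    return {kmer: find(kmer[:-1]) for kmer in kmers}


def bray_curtis(u, v):
    """Bray-Curtis dissimilarity between two equal-length count vectors."""
    total = sum(u) + sum(v)
    if total == 0:
        return 0.0
    return sum(abs(a - b) for a, b in zip(u, v)) / total


def dissimilarity_matrix(samples, k=4):
    """Return (names, matrix) of pairwise dissimilarities between samples.

    `samples` maps a sample name to a list of reads (hypothetical input
    format chosen for this sketch).
    """
    names = sorted(samples)
    counts = {name: kmer_counts(samples[name], k) for name in names}
    all_kmers = set().union(*counts.values())
    component_of = graph_components(all_kmers, k)
    component_ids = sorted(set(component_of.values()))

    # One feature per component: total count of the sample's k-mers in it.
    vectors = {}
    for name in names:
        per_component = Counter()
        for kmer, n in counts[name].items():
            per_component[component_of[kmer]] += n
        vectors[name] = [per_component[c] for c in component_ids]

    size = len(names)
    matrix = [[0.0] * size for _ in range(size)]
    for i, j in combinations(range(size), 2):
        d = bray_curtis(vectors[names[i]], vectors[names[j]])
        matrix[i][j] = matrix[j][i] = d
    return names, matrix
```

Calling `dissimilarity_matrix({"a": ["ACGTACGT"], "b": ["ACGTACGT"], "c": ["TTTTTTTT"]}, k=4)` yields a symmetric 3x3 matrix in which samples `a` and `b` are indistinguishable and `c` shares no component with either.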

