
Metabarcoding Course, 28-11-2019

University of Otago, Dunedin

Taxonomy Reference Databases


Curated databases for Qiime and other programs

For 16S, 18S, or fungal ITS data, curated reference databases are available already formatted for Qiime (with regular versions for other programs), along with pre-trained classifiers for standard 16S primer regions. Details and links are on the Qiime data resources page.


Public microbiome data

The Qiita website is an online resource containing data from hundreds of microbiome studies. It includes both sequence data and sample metadata, so you can compare your study with hundreds of others. One useful application is the Qiime Clawback plugin, which uses these data to create weights for your taxonomy classifier based on your sample type (e.g. gut microbiome). This is meant to increase the accuracy of your taxonomic assignments. There is a full tutorial on the Qiime forum page. See the ‘NB-Bespoke’ results in the Bokulich et al. 2018 paper or on the Taxonomy classification page of this course.
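To make the idea of classifier weights concrete, here is a minimal sketch of how habitat-specific weights can change a taxonomic call. It is not the Clawback implementation: the taxon names, likelihoods and weights are made-up numbers, and the `posterior` function is a hypothetical helper that simply applies Bayes' rule.

```python
def posterior(likelihoods, weights):
    """Combine per-taxon likelihoods with class weights (priors) using Bayes' rule."""
    unnormalised = {taxon: likelihoods[taxon] * weights[taxon] for taxon in likelihoods}
    total = sum(unnormalised.values())
    return {taxon: value / total for taxon, value in unnormalised.items()}


# Hypothetical likelihoods of one read under two candidate taxa (illustrative only).
likelihoods = {"Taxon_A": 0.40, "Taxon_B": 0.60}

# Uniform weights: every taxon is assumed equally common.
uniform = {"Taxon_A": 0.5, "Taxon_B": 0.5}

# Hypothetical habitat weights: Taxon_A is assumed to be far more common
# in this sample type than Taxon_B.
habitat = {"Taxon_A": 0.9, "Taxon_B": 0.1}

uniform_post = posterior(likelihoods, uniform)
habitat_post = posterior(likelihoods, habitat)

# Pick the most probable taxon under each set of weights.
best_uniform = max(uniform_post, key=uniform_post.get)
best_habitat = max(habitat_post, key=habitat_post.get)
print(best_uniform, best_habitat)
```

With uniform weights the read is assigned to the taxon with the higher likelihood. With the habitat weights the assignment switches to the taxon that is expected to be common in that sample type, which is the effect the weighting is designed to produce.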


Other Databases

The NCBI and EMBL sequence databases hold the largest collections of sequences. Of the two, EMBL is better curated. However, both contain errors, and searches on them can be inexact: results may include sequences from the wrong gene, sequences from misidentified taxa, or sequences with no taxonomic assignment.
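One way to guard against these problems is to screen search results before building a reference set. The sketch below is only an illustration under stated assumptions: the `Hit` record layout, the accessions, the organism names and the list of "vague" terms are all hypothetical, and real records would need checks suited to your marker and taxa.

```python
from dataclasses import dataclass


@dataclass
class Hit:
    accession: str    # placeholder accession, not a real record
    description: str  # free-text record title
    organism: str     # organism name as given in the record


# Assumed markers of missing or imprecise taxonomy (adjust for your own data).
VAGUE_TERMS = ("uncultured", "unidentified", "environmental sample", " sp.")


def keep_hit(hit, gene_terms):
    """Keep a hit only if it names the target gene and has a usable organism name."""
    description = hit.description.lower()
    organism = hit.organism.lower()
    has_gene = any(term.lower() in description for term in gene_terms)
    is_vague = any(term in organism for term in VAGUE_TERMS)
    return has_gene and not is_vague


# Made-up search results covering the three failure modes described above.
hits = [
    Hit("ACC0001", "Examplus primus cytochrome c oxidase subunit I (COI) gene", "Examplus primus"),
    Hit("ACC0002", "Examplus primus 12S ribosomal RNA gene", "Examplus primus"),
    Hit("ACC0003", "uncultured eukaryote COI gene", "uncultured eukaryote"),
    Hit("ACC0004", "Examplus sp. X1 cytochrome c oxidase subunit I", "Examplus sp. X1"),
]

gene_terms = ("COI", "cytochrome c oxidase subunit I")
kept = [hit.accession for hit in hits if keep_hit(hit, gene_terms)]
print(kept)
```

Simple substring matching like this is deliberately crude. It cannot detect a sequence that is labelled with the right gene and a full species name but was misidentified, so a check against trusted sequences is still needed.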