View on GitHub

Metabarcoding Course, 28-11-2019

University of Otago, Dunedin

Denoising and clustering

Now that the initial sequences are processed, we will create representative sequences from them, where each sequence will represent multiple replicates and similar sequences. The Qiime2 overview tutorial has a good description of the two main processes, which will try here.

Denoising

qiime dada2 denoise-single \
  --i-demultiplexed-seqs {DEMUX_SEQS}.qza \
  --p-trim-left 0 \
  --p-trunc-len 120 \
  --o-representative-sequences {REP-SEQS}.qza \
  --o-table {FREQ-TABLE}.qza \
  --o-denoising-stats {DENOISING-STATS}.qza

qiime metadata tabulate \
  --m-input-file {DENOISING-STATS}.qza \
  --o-visualization {DENOISING-STATS_VIZ}.qza

qiime feature-table summarize \
  --i-table {FREQ-TABLE}.qza \
  --o-visualization {FREQ-TABLE_VIZ}.qzv \
  --m-sample-metadata-file sample_metadata.tsv

qiime feature-table tabulate-seqs \
  --i-data {REP-SEQS}.qza \
  --o-visualization {REP-SEQS_VIZ}.qzv

Clustering with vsearch

Qiime2 provides a vsearch clustering tutorial, an example from which we are doing here.

To cluster with vsearch first, dereplicate the sequences

qiime vsearch dereplicate-sequences \
  --i-sequences {DEMUX_SEQS}.qza \
  --o-dereplicated-table {FREQ-TABLE-DEREP}.qza \
  --o-dereplicated-sequences {DEREP-SEQS}.qza

qiime vsearch cluster-features-de-novo \
  --i-table {FREQ-TABLE-DEREP}.qza \
  --i-sequences {DEREP-SEQS}.qza \
  --p-perc-identity 0.99 \
  --o-clustered-table {FREQ-TABLE}.qza \
  --o-clustered-sequences {REP-SEQS}.qza

tabulate and summarise

qiime feature-table summarize \
  --i-table {FREQ-TABLE}.qza \
  --o-visualization {TABLE-SUMMARIZE-VIZ}.qzv \
  --m-sample-metadata-file sample_metadata.tsv

time qiime feature-table tabulate-seqs \
  --i-data {REP-SEQS}.qza \
  --o-visualization {REP-SEQS_VIZ}.qzv

qiime tools view {TABLE-SUMMARIZE-VIZ}.qzv

qiime tools view {REP-SEQS_VIZ}.qzv

questions: what kind of differences do you observe between the two