MTP name Help

About MTP

MTP / Sample name

Memo

-

Tag

-

Target taxon

-

Database version

Region

-

About read counts

Total valid reads
The number of reads used for data analysis after passing QC. Non-specific amplicons, amplicons not assigned to the target taxon, and chimeras are removed in the QC process.
-

Removed

Low quality amplicons
The number of low-quality amplicons (too short, too long, erroneously sequenced, or non-specific products created during PCR). If left unfiltered, these sequences may be misidentified as spurious novel species.
-
Non-target amplicons
The number of reads that do not belong to the PCR target taxon (e.g., reads identified as Archaea or Eukarya when the target taxon is Bacteria).
-
Chimeric amplicons
The number of chimeric reads created during PCR. If left unfiltered, these may be misidentified as novel species.
-
Total reads after pre-filter
The number of reads remaining after a pre-filter removes low-quality reads from the raw data produced by an NGS platform. The pre-filter removes short reads and reads with low Q-values; for paired-end sequencing, unmerged reads are also filtered out.
-

About read lengths

Min
Max

- bp

- bp

Average

- bp

About taxonomic assignment

No. of reads identified at the species level
The number of reads that were successfully identified against reference databases at the species level with a 97% similarity cutoff. This can indicate the taxonomic coverage of a database.

  • EzBioCloud
  • Greengenes

No. of species found
The number of unique species identified using reference databases.

  • EzBioCloud
  • Greengenes

OTU-picking

Method
This section indicates what clustering method was used to form OTUs from sequenced reads.

CL_OPEN_REF_UCLUST_MC2: each read is identified at the species level against the reference database with a given similarity cutoff. Reads that fall below this cutoff are pooled, and UCLUST is used to cluster them de novo into additional OTUs. This strategy is called open-reference OTU picking. Finally, OTUs that contain a single read (singletons) are omitted from further analysis.

* uclust : http://drive5.com/usearch/manual/uclust_algo.html
* cdhit : http://www.bioinformatics.org/cd-hit/
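
The open-reference strategy described above can be sketched roughly as follows. This is a toy illustration, not the pipeline's implementation: `identity` is a hypothetical position-by-position similarity that stands in for real sequence alignment, the de novo step is a simple greedy centroid clustering rather than UCLUST itself, and the function names, the example sequences, and the 0.9 cutoff (chosen so that 10-base toy reads work) are all assumptions.

```python
def identity(a, b):
    """Toy similarity: fraction of matching positions (stand-in for alignment)."""
    length = max(len(a), len(b))
    matches = sum(x == y for x, y in zip(a, b))
    return matches / length if length else 0.0


def open_reference_otus(reads, reference, cutoff=0.97):
    """Assign reads to reference OTUs, cluster the rest de novo, drop singletons."""
    otus = {}        # OTU label -> list of member reads
    unassigned = []  # reads below the cutoff against every reference sequence

    # Step 1: reference-based identification.
    for read in reads:
        best_name, best_score = None, 0.0
        for name, ref_seq in reference.items():
            score = identity(read, ref_seq)
            if score > best_score:
                best_name, best_score = name, score
        if best_name is not None and best_score >= cutoff:
            otus.setdefault(best_name, []).append(read)
        else:
            unassigned.append(read)

    # Step 2: greedy de novo clustering of the unassigned reads.
    centroids = []  # list of (label, centroid sequence)
    for read in unassigned:
        for label, centroid in centroids:
            if identity(read, centroid) >= cutoff:
                otus[label].append(read)
                break
        else:
            label = f"denovo_{len(centroids) + 1}"
            centroids.append((label, read))
            otus[label] = [read]

    # Step 3: omit OTUs that contain a single read.
    return {label: members for label, members in otus.items() if len(members) > 1}


reference = {"sp_A": "ACGTACGTAC"}
reads = ["ACGTACGTAC", "ACGTACGTAT", "TTTTGGGGCC", "TTTTGGGGCC", "GGGGGGGGGG"]
otus = open_reference_otus(reads, reference, cutoff=0.9)
```

In this example the first two reads join the reference OTU, the two identical reads form one de novo OTU, and the last read forms a singleton OTU that is discarded.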

Cutoff
This is the sequence similarity value used for OTU calculation, species-level identification against the reference database, and de novo clustering. 97% is commonly used for Bacteria.

No. of OTUs found in the sample
An Operational Taxonomic Unit (OTU) is a group of sequences clustered by sequence similarity. Because many bacterial species share more than 97% sequence similarity with other species, the OTU count does not necessarily equal the actual number of different species. This value is the number of OTUs observed in the sequenced reads, and it may differ from the total number of OTUs (species richness) in the sample.

Good's coverage of library (%)
This index estimates how well the sequencing reads used for analysis represent the actual species population of the sample. The value ranges from 0 to 100%, with 100% indicating complete sampling, meaning that additional sequencing is unlikely to find any new species.

Reference(s):
Good, I. J. "The population frequencies of species and the estimation of population parameters." Biometrika (1953): 237-264.
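
As a worked illustration, Good's coverage is commonly computed as 1 - (number of singleton OTUs / total number of reads). The sketch below assumes that form and a plain list of per-OTU read counts; the function name and the example counts are hypothetical, and the exact calculation used for this report may differ.

```python
def goods_coverage(otu_counts):
    """Return Good's coverage (%) from a list of per-OTU read counts."""
    total_reads = sum(otu_counts)
    if total_reads == 0:
        raise ValueError("at least one read is required")
    # Singletons are OTUs observed exactly once.
    singletons = sum(1 for count in otu_counts if count == 1)
    return 100.0 * (1.0 - singletons / total_reads)


# Hypothetical sample: 100 reads across 6 OTUs, 2 of them singletons.
coverage = goods_coverage([50, 30, 15, 3, 1, 1])
```

With 2 singletons among 100 reads, the coverage is about 98%.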

Diversity indices
Diversity indices are measures of species diversity, based on the number and pattern of OTUs observed in the sample. The indices include statistical estimates of species richness (ACE, Chao1, Jackknife), and estimates of species evenness (Shannon, Simpson, NPShannon).

ACE
ACE (abundance-based coverage estimator) is an indicator of species richness (total number of species in a sample) that is sensitive to rare OTUs (conventionally OTUs with 10 or fewer reads, including singletons and doubletons). Higher values indicate higher diversity.

Reference(s):
Chao, A., and Lee, S.-M. "Estimating the number of classes via sample coverage." Journal of the American statistical Association 87.417 (1992): 210-217.

  • LCI
  • Value
  • HCI

Chao1
Chao1 is an indicator of species richness (total number of species in a sample) that is sensitive to rare OTUs (singletons and doubletons). Higher values indicate higher diversity.

Reference(s):
Chao, A. "Estimating the population size for capture-recapture data with unequal catchability." Biometrics (1987): 783-791.

  • LCI
  • Value
  • HCI
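
To show how singletons and doubletons drive Chao1, here is a minimal sketch of the bias-corrected form, S_obs + F1(F1 - 1) / (2(F2 + 1)), where F1 and F2 are the numbers of singleton and doubleton OTUs. Which variant this report uses is not stated, so treat the formula choice, the function name, and the example counts as assumptions. Confidence intervals (LCI, HCI) are not computed here.

```python
def chao1(otu_counts):
    """Bias-corrected Chao1 richness estimate from per-OTU read counts."""
    observed = sum(1 for count in otu_counts if count > 0)
    singletons = sum(1 for count in otu_counts if count == 1)
    doubletons = sum(1 for count in otu_counts if count == 2)
    # Estimated number of unseen OTUs, added to the observed count.
    unseen = singletons * (singletons - 1) / (2 * (doubletons + 1))
    return observed + unseen


# Hypothetical sample: 8 observed OTUs, 3 singletons, 2 doubletons.
estimate = chao1([50, 30, 15, 2, 2, 1, 1, 1])
```

Here the estimate is 8 + (3 × 2) / (2 × 3) = 9 OTUs, one more than observed.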

Jackknife
Jackknife is an indicator of species richness (total number of species in a sample) that is sensitive to rare OTUs (singletons and doubletons) as well as to abundant OTUs (tripletons and more). Higher values indicate higher diversity.

Reference(s):
Burnham, K. P. & Overton, W. S. (1979) Robust estimation of population size when capture probabilities vary among animals. Ecology, 60, 927-936.

  • LCI
  • Value
  • HCI

Shannon
Shannon is an indicator of species evenness (proportional distribution of the number of each species in a sample) that takes values of 0 or greater.
Higher values indicate higher diversity, and for a given number of species the maximum value is reached when all species are present in equal numbers.

Reference(s):
Magurran, A. E. (2013). Measuring biological diversity. John Wiley & Sons.

  • LCI
  • Value
  • HCI
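
The Shannon index is usually written as H = -Σ p_i ln(p_i), where p_i is the proportion of reads in OTU i. The sketch below assumes the natural logarithm and a plain list of per-OTU read counts; the function name and the example data are hypothetical.

```python
import math


def shannon(otu_counts):
    """Shannon index (natural log) from per-OTU read counts."""
    total = sum(otu_counts)
    if total == 0:
        raise ValueError("at least one read is required")
    proportions = [count / total for count in otu_counts if count > 0]
    return -sum(p * math.log(p) for p in proportions)


even = shannon([25, 25, 25, 25])    # four equally abundant OTUs
uneven = shannon([97, 1, 1, 1])     # one dominant OTU
```

The even community reaches the maximum for four OTUs, ln(4), while the uneven community scores much lower.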

Simpson
Simpson is an indicator of species evenness (proportional distribution of the number of each species in a sample) that displays the probability that two randomly selected sequences are of the same species.
Values range from 0 to 1, and lower values indicate higher diversity.

Reference(s):
Magurran, A. E. (2013). Measuring biological diversity. John Wiley & Sons.

  • LCI
  • Value
  • HCI
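
The probability described above can be sketched as Σ n_i(n_i - 1) / (N(N - 1)), where n_i is the read count of OTU i and N is the total number of reads, that is, two reads drawn without replacement. Whether this report uses this form or the with-replacement form Σ p_i² is not stated, so the choice here is an assumption, as are the function name and the example counts.

```python
def simpson(otu_counts):
    """Probability that two reads drawn without replacement share an OTU."""
    total = sum(otu_counts)
    if total < 2:
        raise ValueError("at least two reads are required")
    same_otu_pairs = sum(count * (count - 1) for count in otu_counts)
    return same_otu_pairs / (total * (total - 1))


dominated = simpson([10])      # a single OTU: the two reads always match
balanced = simpson([5, 5])     # two equally abundant OTUs
```

A value of 1 means every pair of reads comes from the same OTU, so lower values correspond to higher diversity.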

NPShannon
NPShannon is an indicator of species evenness (proportional distribution of the number of each species in a sample) that estimates diversity when there are unseen species and unknown abundance.
Values are greater than 0, and higher values indicate higher diversity.

Reference(s):
Chao, A., & Shen, T. J. (2003). Nonparametric estimation of Shannon’s index of diversity when there are unseen species in sample. Environmental and ecological statistics, 10(4), 429-443.

  • Value

Phylogenetic diversity
Phylogenetic diversity is a measure of biodiversity which incorporates phylogenetic difference between species. It is defined and calculated as "the sum of the lengths of all those branches that are members of the corresponding minimum spanning path", in which 'branch' is a segment of a cladogram, and the minimum spanning path is the minimum distance between the two nodes.

Reference(s):
1. https://en.wikipedia.org/wiki/Phylogenetic_diversity
2. DP Faith. 1992. Conservation evaluation and phylogenetic diversity. Biological Conservation 61: 1-10

  • Value
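
As a rough illustration of summing branch lengths, the sketch below computes a rooted variant of phylogenetic diversity: the total length of all branches on the paths from each observed taxon up to the root, with each branch counted once. The tree encoding (a `parent` map and a `branch_length` map), the function name, and the toy tree are assumptions, and the report may use a different variant, for example one that excludes the path to the root.

```python
def phylogenetic_diversity(parent, branch_length, observed_tips):
    """Sum the lengths of branches connecting the observed tips to the root."""
    visited = set()
    total = 0.0
    for tip in observed_tips:
        node = tip
        # Walk toward the root, stopping at branches already counted.
        while node in parent and node not in visited:
            visited.add(node)
            total += branch_length[node]
            node = parent[node]
    return total


# Toy tree: root -> X (1.0), root -> C (3.0), X -> A (1.0), X -> B (2.0).
parent = {"A": "X", "B": "X", "X": "root", "C": "root"}
branch_length = {"A": 1.0, "B": 2.0, "X": 1.0, "C": 3.0}

pd_close = phylogenetic_diversity(parent, branch_length, ["A", "B"])
pd_distant = phylogenetic_diversity(parent, branch_length, ["A", "C"])
```

Two closely related taxa (A and B) share the branch to X and give a smaller total than two distantly related taxa (A and C).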

Rarefaction curve
The rarefaction curve is a graph that expresses species diversity by plotting the number of OTUs observed against the number of reads sampled.
The x-axis represents the number of sampled reads, and the y-axis represents the number of OTUs discovered. In general, as the number of reads increases, the number of OTUs levels off toward a maximum value.
The steeper the slope of the curve, the higher the species diversity.

Reference(s):
Heck, K. L., van Belle, G., & Simberloff, D. (1975). Explicit calculation of the rarefaction diversity measurement and the determination of sufficient sample size. Ecology, 56(6), 1459-1461.
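
One common way to compute the points of such a curve is the analytical rarefaction formula, in which the expected number of OTUs in a subsample of n reads is Σ [1 - C(N - N_i, n) / C(N, n)], with N the total number of reads and N_i the read count of OTU i. The sketch below assumes that formula; the function name, the example counts, and the step size of 10 reads are hypothetical, and the report may instead use repeated random subsampling.

```python
from math import comb


def rarefied_otus(otu_counts, n):
    """Expected number of OTUs seen in a random subsample of n reads."""
    total = sum(otu_counts)
    if not 0 <= n <= total:
        raise ValueError("n must be between 0 and the total number of reads")
    all_subsamples = comb(total, n)
    # For each OTU: 1 minus the probability that the subsample misses it.
    return sum(1 - comb(total - count, n) / all_subsamples for count in otu_counts)


counts = [50, 30, 15, 3, 1, 1]
curve = [(n, rarefied_otus(counts, n)) for n in range(0, sum(counts) + 1, 10)]
```

Each `(n, expected_otus)` pair is one point on the curve; at `n` equal to the total read count the value equals the number of observed OTUs.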

Rank abundance curve
The rank abundance curve can be used to observe species evenness. The x-axis represents the rank of each OTU, from most to least abundant, and the y-axis represents the relative abundance of the OTU at each rank. The curve declines toward 0, and the steeper its slope, the lower the species diversity.

Reference(s):
Whittaker, R. H. (1965). Dominance and diversity in land plant communities. Science, 147(3655), 250-260.
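
The data behind such a curve can be derived by sorting OTUs by read count and dividing each count by the total. The sketch below assumes a plain list of per-OTU read counts; the function name and the example counts are hypothetical.

```python
def rank_abundance(otu_counts):
    """Return (rank, relative abundance) pairs, most abundant OTU first."""
    total = sum(otu_counts)
    if total == 0:
        raise ValueError("at least one read is required")
    ranked = sorted((count for count in otu_counts if count > 0), reverse=True)
    return [(rank, count / total) for rank, count in enumerate(ranked, start=1)]


points = rank_abundance([10, 60, 30])
```

Plotting the second element of each pair against the first gives the curve; a sharp drop after the first few ranks indicates a community dominated by a small number of OTUs.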

EzBioCloud

Name

Taxa included in the group

Greengenes

Contig
A contig is a set of identical and sometimes overlapping sequences that together represent a consensus region of DNA.

  • Top Hit -
    • Similarity
      contig similarity
    • Count
      contig count

Clone
A clone is an individual sequence that was not included in contigs.

  • Top Hit -
    • Similarity
      -

Search

  • Search taxa

Pre-defined group

  • -
    -

Abundance of a selected taxon

  • - -
  • -