Introduction
Global distribution of soil organisms. Data deposited in this project represent unique (non-clustered) sequences. These sequences are members of the curated OTU list (tag-jump filtered, chimera-free, and clustered at a 98% similarity threshold) from the GSMc dataset (Tedersoo et al., Fungal Diversity, 2021, https://doi.org/10.1007/s13225-021-00493-7). For each OTU in each sample, within-OTU sequences were dereplicated ignoring terminal gaps; where sequence variants differed only in the length of homopolymeric regions, only the most abundant variant was preserved. Taxonomic annotation was transferred from the representative sequence of each OTU to all unique sequences clustered in it. The current dataset includes additional soil samples not covered by the published article (Tedersoo et al., Fungal Diversity, 2021); these additional samples were collected following a slightly different sampling protocol. Taxon occurrences originating from these samples can be filtered out by Dataset name ('Global soil samples subproject (sequences from additional samples)') and Dataset ID (108273). Number of distinct sampling sites: 3 736; number of sampling events: 4 514. Number of unique taxa based on UNITE species hypotheses at a 1.5% distance threshold: 292 413.
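The within-OTU dereplication described above can be illustrated with a minimal Python sketch. It is not the pipeline used to build the dataset: the function names (`strip_terminal_gaps`, `homopolymer_key`, `dereplicate`) are hypothetical, gaps are assumed to be written as `-`, ties between equally abundant variants are assumed to go to the first variant seen, and the reported count is assumed to be that of the preserved variant alone.

```python
from collections import Counter
from itertools import groupby


def strip_terminal_gaps(seq: str) -> str:
    """Drop leading and trailing gap characters (assumed to be '-')."""
    return seq.strip("-")


def homopolymer_key(seq: str) -> str:
    """Collapse every run of identical characters to a single character."""
    return "".join(base for base, _ in groupby(seq))


def dereplicate(reads):
    """Return {preserved_sequence: count} for the reads of one OTU in one sample.

    Step 1: merge reads that are identical once terminal gaps are ignored.
    Step 2: among variants that differ only in homopolymer length,
            keep the most abundant one (first seen wins a tie; an assumption).
    """
    counts = Counter(strip_terminal_gaps(r.upper()) for r in reads)
    best = {}
    for seq, n in counts.items():
        key = homopolymer_key(seq)
        if key not in best or n > best[key][1]:
            best[key] = (seq, n)
    return {seq: n for seq, n in best.values()}


if __name__ == "__main__":
    reads = ["--ACGT-", "ACGT", "AACGT", "ACGTT--", "ACGA"]
    print(dereplicate(reads))
```

In this toy input, `AACGT` and `ACGTT` differ from `ACGT` only in homopolymer length, so only `ACGT` (the most abundant of the three) is kept alongside the distinct variant `ACGA`.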
Data license and citation format: Creative Commons Attribution (CC-BY) 4.0 License
DOI: 10.15468/xaofbe
Contact information
- Senckenberg, Senckenberganlage 25, D-60325 Frankfurt am Main, Germany
- info_sesam@senckenberg.de