Estimate repertoire diversity across different sequencing depths using rarefaction (interpolation) and extrapolation. This uses the iNEXT algorithm to help determine if sequencing depth is sufficient to capture repertoire diversity.
Arguments
- sample_table
A tibble from readImmunoSeq() or productiveSeq() containing "junction_aa", "duplicate_count", and "repertoire_id" columns. Can contain one or multiple repertoires.
- q
Diversity order to calculate:
0 (default): Species richness (number of unique clones)
1: Shannon diversity (exponential of Shannon entropy; weights clones by their frequency, so it accounts for evenness)
2: Simpson diversity (inverse Simpson index; emphasizes abundant clones)
- endpoint
Maximum sequencing depth for extrapolation. Default is 100000. Set higher to predict diversity at deeper sequencing.
- nboot
Number of bootstrap iterations for confidence intervals (default 10). Higher values give more precise estimates but take longer.
- conf
Confidence level for intervals (default 0.95 for 95% CI)
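The diversity orders accepted by q correspond to Hill numbers. As a rough illustration of how the three orders differ, the sketch below computes plug-in (empirical) Hill numbers in base R for toy clone counts. The helper hill_number() and the counts are hypothetical and are not part of LymphoSeq2 or iNEXT; iNEXT reports estimates at specific depths rather than these raw plug-in values.

```r
# Hypothetical helper: plug-in Hill number of order q from clone counts.
hill_number <- function(counts, q) {
  p <- counts[counts > 0] / sum(counts)   # relative clone frequencies
  if (q == 1) {
    exp(-sum(p * log(p)))                 # limit as q -> 1: exp(Shannon entropy)
  } else {
    sum(p^q)^(1 / (1 - q))                # general Hill number formula
  }
}

# Toy repertoires (illustrative values only)
even_counts   <- c(5, 5, 5, 5)            # four equally abundant clones
skewed_counts <- c(97, 1, 1, 1)           # one dominant clone

even_hill   <- sapply(c(0, 1, 2), function(q) hill_number(even_counts, q))
skewed_hill <- sapply(c(0, 1, 2), function(q) hill_number(skewed_counts, q))
```

For a perfectly even repertoire all three orders equal the number of clones, while for a skewed repertoire the value drops as q increases, because higher orders give more weight to abundant clones.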
Value
A tibble with rarefaction/extrapolation results:
m: Sample size (sequencing depth)
Method: "Rarefaction", "Observed", or "Extrapolation"
Order.q: Diversity order (same as the q parameter)
qD: Estimated diversity at depth m
qD.LCL: Lower confidence limit
qD.UCL: Upper confidence limit
SC: Estimated sample coverage at depth m
repertoire_id: Sample identifier
Details
This function wraps the iNEXT package (Chao et al. 2014) for rarefaction and extrapolation analysis. It converts the sample table into a matrix format where rows are unique sequences and columns are repertoires, then runs iNEXT on all samples simultaneously.
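The reshaping step described above can be sketched in base R. This is only an illustration of the idea, not the package's actual implementation: the toy table and the object names (toy, abund_matrix) are assumptions, and the commented iNEXT call shows how such a matrix might be passed on if the iNEXT package is installed.

```r
# Toy long-format table with the three required columns (illustrative values)
toy <- data.frame(
  repertoire_id   = c("A", "A", "A", "B", "B"),
  junction_aa     = c("CASSL", "CASSP", "CASSQ", "CASSL", "CASSR"),
  duplicate_count = c(10, 3, 1, 5, 2),
  stringsAsFactors = FALSE
)

# Cross-tabulate: rows are unique sequences, columns are repertoires.
# Sequences absent from a repertoire get a count of 0.
abund <- xtabs(duplicate_count ~ junction_aa + repertoire_id, data = toy)
abund_matrix <- matrix(
  as.numeric(abund),
  nrow = nrow(abund),
  dimnames = dimnames(abund)
)

# With iNEXT installed, a matrix like this could be analysed in one call, e.g.:
# iNEXT::iNEXT(abund_matrix, q = 0, datatype = "abundance",
#              endpoint = 100000, nboot = 10, conf = 0.95)
```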
Rarefaction vs Extrapolation:
Rarefaction (m less than observed): Subsample sequences to depth m and count unique clones. Shows how diversity increases with sequencing depth. Useful to compare samples at equal depth.
Extrapolation (m greater than observed): Predict diversity at deeper sequencing. For species richness (q = 0), the prediction is based on the Chao1 estimator of the number of unseen clones. Shows whether sequencing is complete (the curve plateaus) or more diversity remains (the curve is still increasing).
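To make the two regimes concrete for q = 0, here is a minimal base R sketch of the richness formulas from Chao et al. (2014): the expected number of distinct clones in a subsample of size m, and the Chao1-based prediction for m_star additional reads. The function names and toy counts are hypothetical; iNEXT's own implementation also covers q = 1 and q = 2 and adds bootstrap confidence intervals.

```r
# Toy clone counts for a single repertoire (illustrative values only)
counts <- c(10, 3, 2, 2, 1, 1, 1, 1)
n      <- sum(counts)        # observed depth
s_obs  <- sum(counts > 0)    # observed richness

# Rarefaction: expected distinct clones in a random subsample of m reads.
rarefy_richness <- function(x, m) {
  n <- sum(x)
  # P(clone i is absent from the subsample) = choose(n - x_i, m) / choose(n, m)
  p_absent <- exp(lchoose(n - x, m) - lchoose(n, m))
  sum(1 - p_absent)
}

# Chao1 estimate of the number of undetected clones (f0_hat),
# based on singleton (f1) and doubleton (f2) counts.
f1 <- sum(counts == 1)
f2 <- sum(counts == 2)
f0_hat <- if (f2 > 0) {
  (n - 1) / n * f1^2 / (2 * f2)
} else {
  (n - 1) / n * f1 * (f1 - 1) / 2
}

# Extrapolation: predicted richness after m_star additional reads.
extrapolate_richness <- function(m_star) {
  if (f0_hat == 0) return(s_obs)   # no evidence of unseen clones
  s_obs + f0_hat * (1 - (1 - f1 / (n * f0_hat + f1))^m_star)
}

rarefied_curve     <- vapply(1:n, function(m) rarefy_richness(counts, m), numeric(1))
extrapolated_curve <- vapply(c(0, 10, 100, 1000), extrapolate_richness, numeric(1))
```

The rarefied curve ends at the observed richness when m equals n, and the extrapolated curve rises from the observed richness toward the asymptote s_obs + f0_hat.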
How to interpret the curve:
Plateau reached: Sequencing depth is sufficient; most clones are captured.
Still increasing steeply: Deeper sequencing is needed to capture the full diversity.
Comparing samples: Use rarefied diversity at the same depth, not raw counts.
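The point about comparing samples at equal depth can be illustrated with a small base R sketch. The two toy repertoires and the helper rarefy_richness() are hypothetical; in practice the rarefied values would come from the qD column of the result at a shared value of m.

```r
# Hypothetical helper: expected distinct clones in a subsample of m reads.
rarefy_richness <- function(x, m) {
  n <- sum(x)
  sum(1 - exp(lchoose(n - x, m) - lchoose(n, m)))
}

# Two toy repertoires sequenced to different depths (illustrative values)
rep_deep    <- c(rep(1, 40), rep(5, 10), 50, 100)  # 240 reads, 52 clones
rep_shallow <- c(rep(1, 30), rep(2, 10), 20)       # 70 reads, 41 clones

# Raw clone counts are confounded by sequencing depth
raw_richness <- c(deep = sum(rep_deep > 0), shallow = sum(rep_shallow > 0))

# Rarefy both repertoires to the shallower sample's depth before comparing
common_depth <- min(sum(rep_deep), sum(rep_shallow))
rarefied_richness <- c(
  deep    = rarefy_richness(rep_deep, common_depth),
  shallow = rarefy_richness(rep_shallow, common_depth)
)
```

In this toy case the deeper sample has more clones in the raw counts, but at the common depth the shallower sample is the richer one, which is why rarefied values are the safer basis for comparison.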
References
Chao, A., et al. (2014). Rarefaction and extrapolation with Hill numbers: a framework for sampling and estimation in species diversity studies. Ecological Monographs, 84(1), 45-67.
Examples
if (FALSE) { # \dontrun{
  file_path <- system.file("extdata", "TCRB_sequencing",
    package = "LymphoSeq2"
  )
  study_table <- LymphoSeq2::readImmunoSeq(path = file_path, threads = 1)
  amino_table <- LymphoSeq2::productiveSeq(study_table,
    aggregate = "junction_aa",
    prevalence = TRUE
  )
  # Run on all samples at once
  rarefaction_table <- LymphoSeq2::runINext(amino_table)
} # }
