
Combines alpha and beta TCR chains from the same cell into a single observation, creating paired-chain clonotypes for single-cell VDJ data. Uses conservative counting to avoid inflating clone frequencies.

Usage

merge_chains(airr_data, mode = "none", chain_separator = "|")

Arguments

airr_data

A tibble with AIRR-formatted data from read10x()

mode

Merging mode:

  • "none" (default): Keep all chains separate (no merging)

  • "strict": Only keep cells with exactly one alpha and one beta chain

  • "best": Select the most frequent alpha and beta chains per cell

chain_separator

Character string placed between the alpha and beta sequences in the combined junction and junction_aa fields. Default: "|"

Value

For mode = "strict" or "best", a tibble with one row per cell containing both chains; for mode = "none", the input is returned unchanged. The merged output includes:

  • cell_id: Cell barcode

  • repertoire_id: Sample identifier

  • junction: Combined nucleotide CDR3 (alpha|beta)

  • junction_aa: Combined amino acid CDR3 (alpha|beta)

  • v_call_alpha, v_call_beta: Separate V gene calls

  • j_call_alpha, j_call_beta: Separate J gene calls

  • duplicate_count: The smaller of the two chains' counts

  • duplicate_frequency: Frequency recalculated from the paired duplicate_count

  • paired_clonotype: Combined identifier for the pair

Details

This function pairs TRA and TRB chains from the same cell while maintaining compatibility with downstream LymphoSeq2 functions. Key features:

Conservative Counting: Uses the smaller duplicate_count of the two paired chains to avoid overestimating clone frequency. This matters because the alpha and beta chains of the same cell can have different UMI counts.
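The counting rule can be illustrated on a toy table. This is a minimal base R sketch, not the package's implementation: the column names follow the AIRR fields listed under Value, the data are invented, and the frequency step (each pair's share of the total paired count) is an assumption about what "recalculated" means here.

```r
# Toy per-chain table; values are made up for illustration.
chains <- data.frame(
  cell_id         = c("cell1", "cell1", "cell2", "cell2"),
  locus           = c("TRA", "TRB", "TRA", "TRB"),
  duplicate_count = c(3, 7, 5, 2)
)

# Split by chain and join on the cell barcode.
alpha <- chains[chains$locus == "TRA", c("cell_id", "duplicate_count")]
beta  <- chains[chains$locus == "TRB", c("cell_id", "duplicate_count")]
paired <- merge(alpha, beta, by = "cell_id", suffixes = c("_alpha", "_beta"))

# Conservative count: the smaller of the two chain counts per cell.
paired$duplicate_count <- pmin(paired$duplicate_count_alpha,
                               paired$duplicate_count_beta)

# Assumed recalculation: each pair's share of the total paired count.
paired$duplicate_frequency <- paired$duplicate_count / sum(paired$duplicate_count)
```

Taking the minimum means a pair is never counted more often than its less abundant chain was observed.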

Chain Identification: Stores individual chain information in separate columns (alpha/beta) so gene usage and other metrics can still be calculated.

Combined Sequences: The junction and junction_aa fields contain both chains, alpha first, joined by chain_separator ("|" by default), for use in diversity and similarity analyses.
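As a rough illustration of the combined field, the base R sketch below joins two invented CDR3 amino acid sequences with the separator and splits them again. The variable names are hypothetical and the code does not call merge_chains(). Because "|" is a regular-expression metacharacter, the split uses fixed = TRUE.

```r
# Hypothetical CDR3 amino acid sequences for one cell.
chain_separator   <- "|"
junction_aa_alpha <- "CAVRDSNYQLIW"
junction_aa_beta  <- "CASSLGQAYEQYF"

# Combined field: alpha first, then beta, joined by the separator.
junction_aa <- paste(junction_aa_alpha, junction_aa_beta, sep = chain_separator)

# Recover the individual chains; fixed = TRUE treats "|" literally.
parts <- strsplit(junction_aa, chain_separator, fixed = TRUE)[[1]]
```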

Mode Selection:

  • strict: Most stringent - requires exactly 1 alpha + 1 beta

  • best: More permissive - selects top chain when multiples exist

  • none: No merging (returns input unchanged)
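The difference between the strict and best modes can be sketched on a toy table in base R. This is an illustration under stated assumptions, not the package's code: the locus column is the standard AIRR field with values "TRA" and "TRB", "most frequent" is taken to mean the highest duplicate_count, and dropping cells that lack one of the two chains in best mode is an assumption.

```r
# Toy data: cell1 has one TRA and one TRB, cell2 has two TRA and one TRB,
# cell3 has only a TRB chain. Sequences and counts are invented.
chains <- data.frame(
  cell_id         = c("cell1", "cell1", "cell2", "cell2", "cell2", "cell3"),
  locus           = c("TRA", "TRB", "TRA", "TRA", "TRB", "TRB"),
  junction_aa     = c("CAVA", "CASSA", "CAVB", "CAVC", "CASSB", "CASSC"),
  duplicate_count = c(4, 6, 9, 2, 5, 3)
)

# Chains per cell and locus: rows are cells, columns are TRA and TRB.
n_chains <- table(chains$cell_id, chains$locus)

# "strict": cells with exactly one TRA and exactly one TRB chain.
strict_cells <- rownames(n_chains)[n_chains[, "TRA"] == 1 & n_chains[, "TRB"] == 1]

# "best": keep the highest-count chain per cell and locus ...
ordered <- chains[order(chains$cell_id, chains$locus, -chains$duplicate_count), ]
best <- ordered[!duplicated(ordered[, c("cell_id", "locus")]), ]
# ... then (assumption) keep only cells that still have both chains.
best_cells <- names(which(table(best$cell_id) == 2))
```

In this toy data, strict keeps only cell1, while best also keeps cell2 by choosing its higher-count alpha chain.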

Examples

if (FALSE) { # \dontrun{
library(LymphoSeq2)
library(dplyr) # provides %>% and select() used below

# Read 10X AIRR data
sc_data <- read10x("path/to/airr.tsv")

# Strict pairing
paired <- merge_chains(sc_data, mode = "strict")

# Use with standard analyses
diversity <- clonality(paired)
top_clones <- topSeqs(paired, top = 100)

# Gene frequency on alpha chains
alpha_genes <- paired %>%
  select(repertoire_id, duplicate_count, v_call = v_call_alpha) %>%
  geneFreq(locus = "V")
} # }