
Search a study table for published T cell receptor beta (TCRB) CDR3 amino acid sequences with known antigen specificity.

Usage

searchPublished(study_table)

Arguments

study_table

A tibble generated by the LymphoSeq2 functions readImmunoSeq() or productiveSeq(). The columns "junction_aa", "duplicate_frequency", and "duplicate_count" are required.

Value

Returns a tibble containing one row for each occurrence of a published TCR sequence in the study table, together with additional information including antigen specificity, epitope, HLA type, and the PubMed ID (PMID) of the reference in which the sequence was characterized.
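
Conceptually, a lookup of this kind resembles an inner join between the study table and a reference table keyed on "junction_aa". The following is a minimal base R sketch of that idea, not the package's actual implementation. The reference table `published_toy`, the sample names, and all sequences and annotations are invented for illustration.

```r
# Toy study table with the three required columns plus a sample identifier.
# All values are invented for illustration.
study_toy <- data.frame(
  repertoire_id = c("sample_A", "sample_A", "sample_B"),
  junction_aa = c("CASSAAAAEQYF", "CASSBBBBTIYF", "CASSAAAAEQYF"),
  duplicate_count = c(120L, 30L, 8L),
  duplicate_frequency = c(0.80, 0.20, 1.00),
  stringsAsFactors = FALSE
)

# Hypothetical reference table of annotated sequences.
# This is NOT the table shipped with LymphoSeq2.
published_toy <- data.frame(
  junction_aa = "CASSAAAAEQYF",
  antigen = "example_antigen",
  epitope = "EXAMPLEPEPTIDE",
  HLA = "example_HLA",
  PMID = "PMID_PLACEHOLDER",
  stringsAsFactors = FALSE
)

# merge() performs an inner join by default, so only study rows whose
# junction_aa also appears in the reference table are kept.
hits <- merge(study_toy, published_toy, by = "junction_aa")
print(hits)
```

In this sketch each matching row of the study table is returned with the annotation columns attached, so a sequence found in two samples yields two rows.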

Examples

# Locate the example TCRB sequencing files bundled with the package
file_path <- system.file("extdata", "TCRB_sequencing",
  package = "LymphoSeq2")
study_table <- LymphoSeq2::readImmunoSeq(path = file_path, threads = 1)
#> Dataset Analysis:
#>   Files: 10, Total: 0.00 GB, Largest: 0.0 MB
#>   Available memory: 11.4 GB
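# Keep the top 100 sequences from each repertoire
study_table <- LymphoSeq2::topSeqs(study_table, top = 100)
# Aggregate productive sequences by CDR3 amino acid sequence
amino_table <- LymphoSeq2::productiveSeq(study_table = study_table,
  aggregate = "junction_aa")
# Search the aggregated table for published sequences
LymphoSeq2::searchPublished(amino_table)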
#> # A tibble: 810 × 16
#>    repertoire_id junction_aa     v_call d_call j_call v_family d_family j_family
#>    <chr>         <chr>           <chr>  <chr>  <chr>  <chr>    <chr>    <chr>   
#>  1 TRB_CD4_949   CAISVGGSSPLHF   TRBV1… TRBD0… TRBJ0… V10      D02      J01     
#>  2 TRB_CD4_949   CASDGGFRNTIYF   TRBV1… TRBD0… TRBJ0… V19      D02      J01     
#>  3 TRB_CD4_949   CASGGLNTEAFF    NA     NA     TRBJ0… NA       NA       J01     
#>  4 TRB_CD4_949   CASGLVAGSTLGGE… TRBV1… TRBD0… TRBJ0… V12      D02      J02     
#>  5 TRB_CD4_949   CASGTGGETQYF    TRBV0… TRBD0… TRBJ0… V06      D02      J02     
#>  6 TRB_CD4_949   CASHSSGNTIYF    TRBV0… NA     TRBJ0… V06      NA       J01     
#>  7 TRB_CD4_949   CASKPPGQGGYGYTF TRBV0… TRBD0… TRBJ0… V06      D01      J01     
#>  8 TRB_CD4_949   CASMIDPSGNTIYF  TRBV0… NA     TRBJ0… V05      NA       J01     
#>  9 TRB_CD4_949   CASNARVDSPLHF   TRBV0… TRBD0… TRBJ0… V06      D01      J01     
#> 10 TRB_CD4_949   CASRLGESPLHF    NA     NA     TRBJ0… NA       NA       J01     
#> # ℹ 800 more rows
#> # ℹ 8 more variables: reading_frame <chr>, duplicate_count <int>,
#> #   duplicate_frequency <dbl>, PMID <fct>, HLA <fct>, antigen <fct>,
#> #   epitope <fct>, prevalence <dbl>
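
A result of this shape can be summarized further, for example by adding up the frequency of matched sequences per repertoire and antigen. The sketch below uses a small invented table, `hits_toy`, that stands in for the output above and carries only the columns needed for the summary. The sample names, antigen labels, and frequencies are assumptions made for illustration.

```r
# Invented stand-in for a searchPublished() result, reduced to three columns.
hits_toy <- data.frame(
  repertoire_id = c("sample_A", "sample_A", "sample_B"),
  antigen = c("antigen_1", "antigen_1", "antigen_2"),
  duplicate_frequency = c(0.10, 0.05, 0.20),
  stringsAsFactors = FALSE
)

# Sum the frequency of matched sequences within each repertoire and antigen.
antigen_summary <- aggregate(
  duplicate_frequency ~ repertoire_id + antigen,
  data = hits_toy,
  FUN = sum
)
print(antigen_summary)
```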