Table 1 Number of sequences with correctly identified V, D, and J genes or alleles in public sequence data sets
| Data set / Tool | V gene | V allele | D gene | D allele | J gene | J allele |
|---|---|---|---|---|---|---|
| Data set of TRB (24 sequences) | | | | | | |
| IMonitor | 24 (100%) | 23 (96%) | 17 (71%) | 12 (50%) | 24 (100%) | 24 (100%) |
| IgBLAST | 23 (96%) | 22 (92%) | 13 (54%) | 11 (46%) | 24 (100%) | 24 (100%) |
| Decombinator^a | 21 (87%) | | | | 23 (96%) | |
| Data set of IGH (1763 sequences) | | | | | | |
| IMonitor | 1735 (98%) | 1509 (86%) | 1037 (59%) | 952 (54%) | 1619 (92%) | 1533 (87%) |
| IgBLAST | 1716 (97%) | 1518 (86%) | 986 (56%) | 956 (54%) | 1563 (89%) | 1498 (85%) |
  • The data sets were obtained from the IMGT/LIGM-DB database (http://www.imgt.org/ligmdb/) by searching for “Homo sapiens,” “rearranged,” and “TRB” or “IGH,” and then selecting sequences that were annotated manually (Annot. level == “manual”) and annotated with V, D, and J genes, so these sequences have a fairly high level of annotation confidence. Because the data sets and HighV-QUEST come from the same website, HighV-QUEST was not used here. All tools used the same reference sequences, which were taken from the IMGT database (http://www.imgt.org).

  • a Decombinator analyzed only the gene level of V and J, so no D or allele results are reported for it.
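
Each percentage in Table 1 is the number of correctly identified sequences divided by the size of the data set. The following is a minimal sketch of that arithmetic for the IMonitor row of the TRB data set. The function name `percent_correct` is hypothetical, and rounding to the nearest whole percent is an assumption, since the table does not state its rounding rule.

```python
# Counts copied from the IMonitor row of the TRB data set in Table 1.
TOTAL_TRB = 24
imonitor_trb = {
    "V_gene": 24, "V_allele": 23, "D_gene": 17,
    "D_allele": 12, "J_gene": 24, "J_allele": 24,
}


def percent_correct(correct: int, total: int) -> int:
    """Share of correctly identified sequences, as a whole percent."""
    # Assumption: nearest-integer rounding; the table does not state its rule.
    return round(100 * correct / total)


percentages = {
    column: percent_correct(count, TOTAL_TRB)
    for column, count in imonitor_trb.items()
}
print(percentages)
```

The same function applies to the IGH rows with a total of 1763 sequences.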