Deep learning techniques create a comprehensive reference atlas of CD8+ T cells, integrating single-cell RNA-seq data and T-cell receptor diversity to enhance understanding of immune responses across diseases.
In a recent study published in Nature Methods, researchers developed scAtlasVAE, a deep learning model that integrates large-scale single-cell ribonucleic acid (RNA) sequencing (scRNA-seq) data, and used it to create a comprehensive CD8+ T cell atlas with paired T cell receptor (TCR) information.
They discovered associations between CD8+ T cell subtypes, characterized three distinct exhausted T cell subtypes, and revealed diverse clonal and transcriptomic patterns in autoimmunity and immune-related inflammation.
Background
Recent studies have highlighted the involvement of CD8+ T cells in autoimmune diseases, where they contribute to both disease progression and protective mechanisms. Advances in single-cell immune profiling have enabled simultaneous analysis of the CD8+ T cell transcriptome and TCR repertoire, revealing cellular heterogeneity, clonal dynamics, and functional transitions. However, CD8+ T cell subtypes and their clonal status across different conditions remain incompletely understood.
Variational autoencoder (VAE)-based methods are well suited to integrating large datasets, but they struggle with comparisons between atlases and with semi-supervised training across diverse annotation criteria. In this study, the researchers developed a VAE-based deep learning model to integrate and harmonize scRNA-seq datasets across studies, allowing cell subtype annotations to be transferred between them.
About the study
The CD8+ T Cell Atlas was created using data from 68 studies, 961 single-cell immune profiling samples, and over 1.1 million cells across 42 conditions. Data were processed and quality controlled to exclude cells with insufficient gene expression or TCR information.
A deep learning framework, scAtlasVAE, was developed to integrate and analyze this dataset. It employs a batch-unconditional encoder and a batch-conditional decoder to correct for batch effects, and it reconstructs gene expression data using a zero-inflated negative binomial distribution. scAtlasVAE supports both unsupervised and supervised modes, enabling tasks such as atlas integration, cell subtype annotation, and transfer learning.
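To make the two ideas in this description more concrete, the following is a minimal sketch in plain Python, not the scAtlasVAE code itself. It assumes a common parameterization of the zero-inflated negative binomial (mean `mu`, inverse dispersion `theta`, dropout probability `pi`) and illustrates batch conditioning by appending a one-hot batch code to the latent vector before decoding. The function names and parameter names are hypothetical and chosen only for illustration.

```python
import math


def one_hot(index, size):
    """Return a one-hot list of length `size` with a 1.0 at `index`."""
    return [1.0 if i == index else 0.0 for i in range(size)]


def decoder_input(z, batch_index, n_batches):
    """Batch-conditional decoding (illustrative assumption): the decoder sees
    the latent vector z together with a one-hot code of the cell's batch,
    while the encoder would see the expression profile only."""
    return list(z) + one_hot(batch_index, n_batches)


def nb_log_pmf(x, mu, theta):
    """Log-probability of count x under a negative binomial with
    mean mu > 0 and inverse dispersion theta > 0."""
    return (
        math.lgamma(x + theta) - math.lgamma(theta) - math.lgamma(x + 1)
        + theta * math.log(theta / (theta + mu))
        + x * math.log(mu / (theta + mu))
    )


def zinb_log_pmf(x, mu, theta, pi):
    """Log-probability of count x under a zero-inflated negative binomial.
    With probability pi (0 <= pi < 1) the count is a structural zero;
    otherwise it is drawn from the negative binomial."""
    nb = nb_log_pmf(x, mu, theta)
    if x == 0:
        return math.log(pi + (1.0 - pi) * math.exp(nb))
    return math.log(1.0 - pi) + nb


# Reconstruction term for one cell: sum of per-gene log-likelihoods.
counts = [0, 3, 0, 1]          # toy observed counts for four genes
means = [0.5, 2.5, 0.1, 1.0]   # toy decoder outputs (assumed, not real data)
log_lik = sum(zinb_log_pmf(x, m, theta=2.0, pi=0.2) for x, m in zip(counts, means))
```

In a VAE of this kind, the negative of `log_lik` would typically serve as the reconstruction loss, with the zero-inflation term absorbing the excess of zeros seen in scRNA-seq counts.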
scAtlasVAE was benchmarked against existing methods. Clonotype analysis was performed to identify different T cell subtypes with unique TCR repertoires and clonotypes shared among different conditions. This model was further validated for its ability to predict CD8+ T cell subtypes within the query dataset.
Furthermore, this study used various bioinformatics tools for gene expression analysis and regulatory network inference to investigate the functional role of CD8+ T cells in various diseases.
Results and discussion
scAtlasVAE demonstrated superior performance across multiple benchmark tasks, including single atlas integration, cross-atlas integration, and cell subtype annotation transfer. When compared with existing methods such as scVI, scANVI, scPoli, SCALEX, Scanorama, Harmony, and Seurat, scAtlasVAE was found to have improved batch effect correction and biological variance preservation.
Benchmarking on two established atlases, TCellLandscape and TCellMap, confirmed its effectiveness in both zero-shot and full-shot transfer learning modes. Unlike other methods, scAtlasVAE incorporates independent predictors of individual cell subtypes, allowing for better alignment of annotations across datasets.
Using this framework, the researchers clustered CD8+ T cells into 18 subtypes grouped into eight major categories: naïve T cells, central/effector memory T cells, recently activated effector T cells, mucosa-associated invariant T (MAIT) cells, highly cytotoxic innate-like T cells (ILTCK), tissue-resident memory T cells, exhausted T cells (Tex), and proliferating cells. Integration between atlases validated these subtypes and revealed novel populations such as ILTCK-like cells (ILTCK-LC). Paired TCR analysis showed decreased diversity and increased clonal expansion in disease states and tumors. Among the exhausted T cells, three subtypes (GZMK+, ITGAE+, and XBP1+) exhibited distinct transcriptomic profiles and tumor-specific enrichment, reflecting their diverse roles in cancer immunity.
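The notions of repertoire diversity and clonal expansion mentioned above can be illustrated with a short sketch. It assumes one clonotype label per cell and uses Shannon entropy, a normalized clonality score, and the fraction of cells in clones of size two or more. These are widely used summaries, but the article does not state which metrics the study itself used, and the function name and toy labels below are hypothetical.

```python
import math
from collections import Counter


def clonotype_stats(clonotypes):
    """Summarize a TCR repertoire given one clonotype label per cell."""
    counts = Counter(clonotypes)
    n_cells = sum(counts.values())
    if n_cells == 0:
        raise ValueError("at least one cell is required")

    # Shannon entropy of the clonotype frequency distribution.
    freqs = [c / n_cells for c in counts.values()]
    entropy = -sum(p * math.log(p) for p in freqs)

    # Clonality = 1 - normalized entropy; defined as 1.0 for a single clonotype.
    if len(counts) > 1:
        clonality = 1.0 - entropy / math.log(len(counts))
    else:
        clonality = 1.0

    # Fraction of cells that belong to a clone observed at least twice.
    expanded = sum(c for c in counts.values() if c >= 2) / n_cells

    return {
        "n_clonotypes": len(counts),
        "shannon_entropy": entropy,
        "clonality": clonality,
        "expanded_fraction": expanded,
    }


# Toy repertoires (made-up labels, not data from the study).
diverse = clonotype_stats(["c1", "c2", "c3", "c4"])
skewed = clonotype_stats(["c1", "c1", "c1", "c2"])
```

On these toy inputs the skewed repertoire has lower entropy and a larger expanded fraction than the diverse one, which is the qualitative pattern described for disease states and tumors.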
Clonotype sharing revealed that GZMK+ Tex cells share clonotypes with both tissue-resident and circulating subtypes, whereas ITGAE+ and XBP1+ Tex cells have a more restricted sharing pattern. Higher-resolution analysis identified further Tex subtypes, such as ISG+, DUSP1+, TCF7+, and TNFRSF9+ cells. Inflammation induced by checkpoint inhibitors (CPI), an immune-related adverse event (irAE), was shown to be dominated by ITGAE+ Tex cells, which distinguishes it from autoimmune inflammation. ITGAE+ Tex cells also exhibited enhanced cytotoxic pathways.
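Clonotype sharing of this kind can be sketched as a set comparison. The example below assumes each cell is represented by a (subtype, clonotype) pair and reports the number of shared clonotypes and a Jaccard index for two subtypes. The study may quantify sharing differently, and the subtype names and data here are placeholders rather than results.

```python
from collections import defaultdict


def clonotypes_by_subtype(cells):
    """Group clonotype labels by subtype.
    `cells` is an iterable of (subtype, clonotype) pairs, one per cell."""
    groups = defaultdict(set)
    for subtype, clonotype in cells:
        groups[subtype].add(clonotype)
    return dict(groups)


def sharing(groups, a, b):
    """Return (number of shared clonotypes, Jaccard index) for subtypes a and b."""
    shared = groups[a] & groups[b]
    union = groups[a] | groups[b]
    jaccard = len(shared) / len(union) if union else 0.0
    return len(shared), jaccard


# Toy input with placeholder names (not data from the study).
cells = [
    ("subtype_A", "c1"), ("subtype_A", "c2"), ("subtype_A", "c3"),
    ("subtype_B", "c2"), ("subtype_B", "c3"), ("subtype_B", "c4"),
    ("subtype_C", "c5"),
]
groups = clonotypes_by_subtype(cells)
n_ab, jaccard_ab = sharing(groups, "subtype_A", "subtype_B")
n_ac, jaccard_ac = sharing(groups, "subtype_A", "subtype_C")
```

A subtype with broad sharing would show non-zero overlap with several other subtypes, while a restricted subtype would overlap with few or none, as with `subtype_C` in this toy input.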
The scAtlasVAE model successfully annotated external datasets, confirmed the alignment of Tex subtypes across the atlas, and highlighted similarities between irAE inflammation and tumor-infiltrating lymphocytes (TILs) in cancer. Clonotype sharing between ILTCK-LC and MAIT cells suggested functional diversity.
Limitations of this study include reliance on user-defined annotations for rare subtypes, a scope restricted to CD8+ T cells, and the need for experimental validation of newly defined subtypes.
Conclusion
In conclusion, the TCR-integrated atlas and scAtlasVAE developed in this study provide a valuable resource for studying CD8+ T cell heterogeneity and dynamics. Together, they enable integration between datasets and enhance insight into diverse biological systems.