This analysis is based on the single-cell RNA-seq dataset published in:
Tu, J. et al. (2021). Single-Cell Transcriptomics of Human Nucleus Pulposus Cells: Understanding Cell Heterogeneity and Degeneration. Advanced Science, 8(23), 2103631. https://doi.org/10.1002/advs.202103631
From the plots generated:

- UMAP/tSNE: Reveal 6 biologically interpretable NPC subtypes.
- DotPlot: Confirms canonical marker expression in annotated subtypes.
- Pseudotime: Suggests HT-CLNPs are early progenitors transitioning into mature states such as effector or fibroNPCs.
This analysis uses the GEO dataset GSE165722.
Filter low-quality cells based on UMI count, gene number, and mitochondrial content using Seurat. This step ensures robust downstream analysis by removing potential doublets and stressed or dying cells.
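As a minimal sketch of this filtering step, the snippet below builds a small simulated count matrix (a stand-in, not the GSE165722 data), computes the three QC metrics with Seurat, and applies cutoffs. The thresholds and the object names `npc` and `npc_qc` are illustrative assumptions, not the values used in this analysis.

```r
library(Seurat)
library(Matrix)

set.seed(42)
n_cells <- 100
n_genes <- 290
n_mt    <- 10

# Simulated stand-in for one sample's raw counts (genes x cells).
nuclear <- matrix(rpois(n_genes * n_cells, lambda = 1), nrow = n_genes)
mito    <- matrix(rpois(n_mt * n_cells, lambda = 1), nrow = n_mt)
# Make the last 10 cells look stressed by inflating their mitochondrial counts.
mito[, 91:100] <- rpois(n_mt * 10, lambda = 30)

counts <- rbind(nuclear, mito)
rownames(counts) <- c(paste0("GENE", seq_len(n_genes)),
                      paste0("MT-G", seq_len(n_mt)))
colnames(counts) <- paste0("cell", seq_len(n_cells))
counts <- Matrix(counts, sparse = TRUE)

npc <- CreateSeuratObject(counts = counts, project = "sim")

# Percentage of counts coming from mitochondrial genes (prefix "MT-").
npc[["percent.mt"]] <- PercentageFeatureSet(npc, pattern = "^MT-")

# Violin plots of the three QC metrics.
qc_plot <- VlnPlot(npc,
                   features = c("nFeature_RNA", "nCount_RNA", "percent.mt"),
                   ncol = 3)

# Assumed cutoffs: tune them to the distributions seen in the violin plots.
npc_qc <- subset(npc,
                 subset = nFeature_RNA > 100 &
                          nCount_RNA < 1000 &
                          percent.mt < 20)
ncol(npc_qc)
```

In practice the cutoffs are chosen per dataset after inspecting the violin plots. An upper bound on `nCount_RNA` or `nFeature_RNA` is a common heuristic for removing likely doublets.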
QC Violin Plot
Normalize gene expression data, identify variable genes, scale the matrix, and reduce dimensionality using PCA and UMAP. Perform graph-based clustering to define transcriptionally distinct cell groups.
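The steps above can be sketched with the standard Seurat workflow. This example runs on `pbmc_small`, a tiny demo object shipped with SeuratObject, which stands in for the QC-filtered NP data. The number of variable features, the number of PCs, and the clustering resolution are assumptions chosen to suit the demo object.

```r
library(Seurat)

# Demo object (80 cells, 230 genes) standing in for the filtered NP object.
data("pbmc_small")
obj <- pbmc_small

obj <- NormalizeData(obj, normalization.method = "LogNormalize",
                     scale.factor = 10000)
obj <- FindVariableFeatures(obj, selection.method = "vst", nfeatures = 100)
obj <- ScaleData(obj)                           # scales the variable features
obj <- RunPCA(obj, npcs = 10, verbose = FALSE)

# Graph-based clustering on the first 10 PCs (assumed dimensionality).
obj <- FindNeighbors(obj, dims = 1:10, verbose = FALSE)
obj <- FindClusters(obj, resolution = 0.5, verbose = FALSE)

# UMAP embedding for visualization only; clusters come from the graph above.
obj <- RunUMAP(obj, dims = 1:10, verbose = FALSE)
umap_plot <- DimPlot(obj, reduction = "umap", label = TRUE)

table(Idents(obj))
```

For a full dataset, the number of PCs is usually chosen from an elbow plot, and several resolutions are compared before settling on a clustering.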
Global cell type identities are assigned using SingleR, referencing transcriptional profiles from datasets such as the Human Primary Cell Atlas. This method enables the identification of broad cell categories (e.g., MSCs, T cells) based on cross-tissue gene expression similarity.
UMAP Clusters
This approach evaluates the activation level of each NPC subtype signature at the single-cell level using AUCell. While umap_clusters_celltype.png displays general cell type classifications inferred from global references, umap_npc_subtypes_auc.png reveals functionally distinct NPC subtypes based on enrichment of tissue-specific marker genes.
UMAP Clusters
Identify differentially expressed genes for each cluster using Seurat’s `FindAllMarkers()`. Combine marker gene expression with reference-based annotation tools (e.g., SingleR) to assign cell types.
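|   | p_val | avg_log2FC | pct.1 | pct.2 | p_val_adj | cluster | gene  |
|---|-------|------------|-------|-------|-----------|---------|-------|
| 1 | 0     | 1.8229819  | 0.868 | 0.317 | 0         | 0       | CYR61 |
| 2 | 0     | 1.2440728  | 0.977 | 0.431 | 0         | 0       | DCN   |
| 3 | 0     | 1.8586392  | 0.877 | 0.345 | 0         | 0       | CTGF  |
| 4 | 0     | 1.3793767  | 0.947 | 0.417 | 0         | 0       | LUM   |
| 5 | 0     | 1.7458917  | 0.940 | 0.456 | 0         | 0       | FN1   |
| 6 | 0     | 0.6984552  | 0.915 | 0.438 | 0         | 0       | CLU   |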
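The marker table above has the column layout returned by `FindAllMarkers()`. A call along the following lines would produce a table of that shape. This sketch uses the `pbmc_small` demo object as a stand-in for the clustered NP data, and the filtering parameters are assumptions, not the settings used in this analysis.

```r
library(Seurat)

# Demo object with precomputed cluster identities.
data("pbmc_small")

markers <- FindAllMarkers(
  pbmc_small,
  only.pos = TRUE,         # keep genes up-regulated in each cluster
  min.pct = 0.25,          # gene detected in at least 25% of cells in either group
  logfc.threshold = 0.25,  # minimum average log2 fold change
  verbose = FALSE
)

# Top 3 markers per cluster by average log2 fold change (base R).
markers <- markers[order(markers$cluster, -markers$avg_log2FC), ]
top3 <- do.call(rbind, lapply(split(markers, markers$cluster), head, n = 3))
head(top3)

# Dot plot of the selected markers across clusters.
marker_dotplot <- DotPlot(pbmc_small, features = unique(top3$gene)) +
  RotatedAxis()
```

A `p_val` of 0 in such a table generally means the value fell below the smallest number R can represent, not that the test was exact.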
Top Marker DotPlot
Cell Type Distribution Barplot
Use Monocle3 to infer dynamic developmental trajectories among selected clusters. Order cells along pseudotime and visualize progression paths to study lineage relationships during degeneration.
Pseudotime trajectory of NPC populations
Perform ligand-receptor analysis using CellChat to explore cell-cell interactions among NPC subpopulations or all clusters. Identify enriched signaling pathways and visualize the intercellular network.
Cell-cell communication of NPC populations
Use CytoTRACE to infer differentiation potential and cellular plasticity based on gene counts and transcriptional diversity. This step helps to highlight progenitor-like populations within NPC clusters.
Apply the SCENIC pipeline to construct gene regulatory networks and infer regulon activity across single cells. This identifies key transcription factors and their target modules associated with IVDD progression.
Perform gene set enrichment analysis using the fgsea method with Hallmark gene sets. This highlights stage-specific biological pathways such as TNF, MAPK, or unfolded protein response during disc degeneration.
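A minimal sketch of an fgsea run is shown below. It uses the example pathways and ranked statistics bundled with the fgsea package as stand-ins. In this analysis the pathways would be the Hallmark gene sets, and the ranks would be a named vector of gene-level statistics (for example, log fold changes named by gene symbol). The size limits are assumptions.

```r
library(fgsea)

# Example data bundled with fgsea (stand-ins for Hallmark sets and real ranks).
data(examplePathways)   # named list: pathway name -> vector of gene IDs
data(exampleRanks)      # named numeric vector: gene ID -> ranking statistic

set.seed(42)
res <- fgsea(
  pathways = examplePathways,
  stats    = exampleRanks,
  minSize  = 15,    # skip very small gene sets
  maxSize  = 500    # skip very large gene sets
)

# Convert to a plain data frame and sort by adjusted p-value.
res <- as.data.frame(res)
res <- res[order(res$padj), ]
head(res[, c("pathway", "padj", "NES", "size")])
```

Pathways with a positive NES are enriched at the top of the ranked list and those with a negative NES at the bottom, so the sign convention of the ranking statistic determines how "up" and "down" are read.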
Use AUCell to compute the activation level of gene signatures (e.g., SASP, ECM remodeling) in individual cells. This provides fine-grained insights into heterogeneous transcriptional programs across NPC states.
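The scoring step can be sketched as follows, using a simulated expression matrix and a hypothetical 20-gene signature (`demo_signature`). Neither reflects the SASP or ECM gene lists used in this analysis. AUCell ranks the genes within each cell and then measures how strongly the signature genes are concentrated near the top of that ranking.

```r
library(AUCell)

set.seed(42)
# Simulated expression matrix (genes x cells); a stand-in for the real counts.
expr <- matrix(
  as.numeric(rpois(500 * 40, lambda = 2)),
  nrow = 500,
  dimnames = list(paste0("GENE", 1:500), paste0("cell", 1:40))
)

# Hypothetical signature, boosted in the first 20 cells only.
sig <- paste0("GENE", 1:20)
expr[sig, 1:20] <- expr[sig, 1:20] + 20

gene_sets <- list(demo_signature = sig)

# Step 1: rank genes within each cell.
rankings <- AUCell_buildRankings(expr, plotStats = FALSE, verbose = FALSE)

# Step 2: AUC of the signature's recovery curve per cell
# (by default computed over the top 5% of ranked genes).
cells_auc <- AUCell_calcAUC(gene_sets, rankings, verbose = FALSE)

auc <- getAUC(cells_auc)   # matrix: gene sets x cells
round(tapply(auc["demo_signature", ], rep(c("boosted", "baseline"), each = 20), mean), 2)
```

The per-cell AUC values can be added to the Seurat metadata and plotted on the UMAP, which is one way to produce a signature-activity map such as the NPC subtype plot.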
Isolate immune cell subsets (e.g., G-MDSCs, macrophages, T cells) for focused reclustering and downstream analysis. This enables functional dissection of the immune microenvironment in NP degeneration.
`.counts.tsv.gz` and `.cellname.txt.gz` pairs are expected in `data/GSE165722/extracted/` `filtered_feature_bc_matrix/` directories for each sample.

## 📊 Additional Static Results