iSCALE Unlocks Molecular Maps of Large Human Tissues
Dr. Mingyao Li co-authored a study demonstrating iSCALE, a method that maps large human tissues at molecular and spatial resolution, uncovering disease features previously invisible.
Leveraging cutting-edge statistics and AI to advance single-cell genomics, spatial omics, and digital pathology for precision medicine and disease understanding.
Explore the latest research breakthroughs and publications from the SC2SG research team.
Read about Dr. Mingyao Li’s recent publication in Nature, which maps early human brain development using spatial transcriptomics. The study reveals that cortical layers and areas are specified earlier than previously thought, offering new insight into how the brain takes shape.
DESC is an unsupervised deep learning algorithm for clustering scRNA-seq data. The algorithm constructs a non-linear mapping function from the original scRNA-seq data space to a low-dimensional feature space by iteratively learning cluster-specific gene expression representations and cluster assignments with a deep neural network. This iterative procedure moves each cell to its nearest cluster, balances biological and technical differences between clusters, and reduces the influence of batch effects. DESC also enables soft clustering by assigning cluster-specific probabilities to each cell, which facilitates the identification of cells clustered with high confidence and the interpretation of results.
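The soft clustering idea can be illustrated with a minimal sketch. It assumes the cells have already been mapped to a low-dimensional feature space and uses a Student's t kernel, as in deep embedded clustering, to turn distances to cluster centroids into probabilities. The kernel choice, the function names soft_assign and confident_cells, and the 0.9 threshold are illustrative assumptions, not the DESC package API.

```python
import math

def soft_assign(embedding, centroids, alpha=1.0):
    """Return, for each cell, a list of probabilities over clusters.

    Assumption: Student's t kernel on squared Euclidean distance,
    normalized so each cell's probabilities sum to 1.
    """
    probs = []
    for z in embedding:
        weights = []
        for mu in centroids:
            d2 = sum((a - b) ** 2 for a, b in zip(z, mu))
            weights.append((1.0 + d2 / alpha) ** (-(alpha + 1.0) / 2.0))
        total = sum(weights)
        probs.append([w / total for w in weights])
    return probs

def confident_cells(probs, threshold=0.9):
    """Indices of cells whose largest cluster probability meets the threshold."""
    return [i for i, p in enumerate(probs) if max(p) >= threshold]

# Toy 2-D embedding: two cells near a centroid, one halfway between the two.
cells = [[0.0, 0.1], [5.0, 5.1], [2.5, 2.5]]
centroids = [[0.0, 0.0], [5.0, 5.0]]
probs = soft_assign(cells, centroids)
confident = confident_cells(probs)
```

In this toy example the cell that lies midway between the two centroids receives equal probability for both clusters, so it is excluded from the high-confidence set.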
ItClust is an Iterative Transfer learning algorithm for scRNA-seq Clustering. It starts by building a training neural network to extract gene-expression signatures from a well-labeled source dataset. The parameters estimated from this training network are then used to initialize the target network. The target network leverages information in the target dataset to iteratively fine-tune its parameters in an unsupervised manner, so that target-data-specific gene-expression signatures are captured. Once fine-tuning is finished, the target network returns the clustered cells in the target data. ItClust has been shown to be a powerful tool for scRNA-seq clustering and cell type classification. It can accurately extract information from source data and apply it to help cluster cells in target data. It is robust to strong batch effects between source and target data, and it is able to separate unseen cell types in the target data. Furthermore, it provides confidence scores that facilitate cell type assignment. With the increasing popularity of scRNA-seq in biomedical research, we expect that ItClust will make better use of the vast amount of existing well-annotated scRNA-seq datasets and enable researchers to accurately cluster and annotate cells in scRNA-seq data.
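The two-stage idea (learn from labeled source data, then fine-tune on unlabeled target data) can be sketched with a deliberately simplified stand-in. Instead of neural networks, the toy code below uses class centroids: they are estimated from the labeled source cells, used to initialize the target model, and then refined on the target cells without labels. The centroid model and the names class_means, nearest, and fine_tune are assumptions for illustration only and are not part of ItClust.

```python
def class_means(points, labels):
    """Supervised step: one centroid per labeled source cell type."""
    groups = {}
    for p, lab in zip(points, labels):
        groups.setdefault(lab, []).append(p)
    return {lab: [sum(c) / len(ps) for c in zip(*ps)]
            for lab, ps in groups.items()}

def nearest(p, centroids):
    """Label of the centroid closest to point p (squared Euclidean distance)."""
    return min(centroids,
               key=lambda lab: sum((a - b) ** 2 for a, b in zip(p, centroids[lab])))

def fine_tune(target, centroids, n_iter=10):
    """Unsupervised step: start from source centroids, refine on target cells."""
    centroids = dict(centroids)  # copy so the source estimates are not modified
    for _ in range(n_iter):
        assign = [nearest(p, centroids) for p in target]
        for lab in centroids:
            members = [p for p, a in zip(target, assign) if a == lab]
            if members:  # keep the old centroid if no target cell was assigned
                centroids[lab] = [sum(c) / len(members) for c in zip(*members)]
    return [nearest(p, centroids) for p in target], centroids

# Toy data: the target cells are shifted relative to the source (a batch effect).
source = [[0.0, 0.0], [0.0, 1.0], [5.0, 5.0], [6.0, 5.0]]
source_labels = ["T", "T", "B", "B"]
target = [[1.0, 1.0], [1.2, 1.5], [6.5, 6.0], [7.0, 6.2]]

init = class_means(source, source_labels)
target_labels, tuned = fine_tune(target, init)
```

After fine-tuning, the centroids have moved from the source estimates toward the shifted target cells, which is the behavior the transfer step is meant to capture.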
CarDEC is a joint deep learning computational tool that is useful for analyses of single-cell RNA-seq data. CarDEC can be used to:
sciPENN is a deep learning computational tool that is useful for analyses of CITE-seq data. sciPENN can be used to:
SpaGCN is a graph convolutional network that integrates gene expression and histology to identify spatial domains and spatially variable genes. To jointly model all spots in a tissue slide, SpaGCN integrates information from gene expression, spatial locations, and histological pixel intensities across spots into an undirected weighted graph. Each vertex in the graph contains the gene expression information of a spot, and the edge weight between two vertices quantifies their expression similarity, which is driven by the spatial dependency of their coordinates and the corresponding histology. To aggregate the gene expression of each spot from its neighboring spots, SpaGCN uses a convolutional layer based on the edge weights specified by the graph. The aggregated gene expression is then fed into a deep embedding clustering algorithm to cluster the spots into different spatial domains. After spatial domains are identified, genes that are enriched in each spatial domain can be detected by differential expression analysis between domains. SpaGCN is applicable to both in-situ transcriptomics with single-cell resolution (seqFISH, seqFISH+, MERFISH, STARmap, and FISSEQ) and spatial barcoding-based transcriptomics (Spatial Transcriptomics, SLIDE-seq, SLIDE-seqV2, HDST, 10x Visium, DBiT-seq, Stereo-seq, and PIXEL-seq) data.
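The graph construction and aggregation steps described above can be sketched as follows. The sketch assumes that each spot's histology intensity is treated as a third coordinate, that edge weights come from a Gaussian kernel on the resulting distance, and that aggregation is a weighted average over all spots. The kernel, the scale and length parameters, and the function names are illustrative assumptions rather than the SpaGCN package API.

```python
import math

def edge_weights(coords, hist, scale=1.0, length=1.0):
    """Symmetric weight matrix from spatial location and histology.

    Assumption: the histology intensity acts as a third coordinate
    (scaled by `scale`), and weights decay as a Gaussian of distance.
    """
    n = len(coords)
    w = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            dx = coords[i][0] - coords[j][0]
            dy = coords[i][1] - coords[j][1]
            dz = scale * (hist[i] - hist[j])
            d2 = dx * dx + dy * dy + dz * dz
            w[i][j] = math.exp(-d2 / (2.0 * length ** 2))
    return w

def aggregate(weights, expr):
    """Weighted average of expression over the graph, one row per spot."""
    n_genes = len(expr[0])
    out = []
    for row in weights:
        total = sum(row)
        out.append([sum(wij * x[g] for wij, x in zip(row, expr)) / total
                    for g in range(n_genes)])
    return out

# Toy slide: spots 0 and 1 are adjacent and look alike; spot 2 is far away.
coords = [(0.0, 0.0), (1.0, 0.0), (10.0, 0.0)]
hist = [0.20, 0.25, 0.90]
expr = [[1.0, 0.0], [0.8, 0.2], [0.0, 1.0]]

w = edge_weights(coords, hist)
smoothed = aggregate(w, expr)
```

In the full method, the aggregated expression would then be passed to the clustering step; here the sketch stops at the smoothed expression matrix.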
TESLA is a machine learning framework for multi-level tissue annotation on the histology image with pixel-level resolution in Spatial Transcriptomics (ST). By integrating information from the high-resolution histology image, TESLA can impute gene expression at superpixels and fill in missing gene expression in tissue gaps. The increased gene expression resolution makes it possible to treat gene expression data as images, which enables integration with histological features for joint tissue segmentation and annotation of different cell types directly on the histology image with pixel-level resolution. Additionally, TESLA can detect unique structures of the tumor immune microenvironment, such as Tertiary Lymphoid Structures (TLSs), and separate a tumor into core and edge to examine their cellular compositions, expression features, and molecular processes. TESLA has been evaluated on five cancer datasets. Our results consistently showed that TESLA can generate high-quality super-resolution gene expression images, which facilitate the downstream multi-level tissue annotation.
MISO is a deep-learning-based method developed for the integration and clustering of multi-modal spatial omics data. MISO requires minimal hyperparameter tuning and can be applied to any number of omic and imaging data modalities from any multi-modal spatial omics experiment. MISO has been evaluated on datasets from experiments including spatial transcriptomics (transcriptomics and histology), spatial epigenome-transcriptome co-profiling (chromatin accessibility, histone modification, and transcriptomics), spatial CITE-seq (transcriptomics, proteomics, and histology), and spatial transcriptomics and metabolomics (transcriptomics, metabolomics, and histology).
This software package implements iSCALE (Inferring Spatially resolved Cellular Architectures for Large-sized tissue Environments), a novel framework designed to integrate multiple daughter captures and utilize H&E information from large tissue samples, enabling the prediction of gene expression in large-sized tissues with near single-cell resolution.
Abedini A, Levinsohn J, Klötzer KA, Dumoulin B, Ma Z, Frederick J, Dhillon P, Balzer MS, Shrestha R, Liu H, Vitale S, Bergeson AM, Devalaraja‑Narashimha K, Grandi P, Bhattacharyya T, Hu E, Pullen SS, Boustany‑Kari C, Guarnieri P, Karihaloo A, Traum D, Yan H, Coleman K, Palmer M, Sarov‑Blat L, Morton L, Hunter CA, Kaestner KH, Li M, Susztak K. Single-cell multi‑omic and spatial profiling of human kidneys implicates the fibrotic microenvironment in kidney disease progression. Nat Genet. 2024 Aug;56(8):1712–1724. doi:10.1038/s41588-024-01802-x.
Coleman K, Schroeder A, Loth M, Zhang D, Park JH, Sung JY, Blank N, Cowan AJ, Qian X, Chen J, Jiang J, Yan H, Samarah LZ, Clemenceau JR, Jang I, Kim M, Barnfather I, Rabinowitz JD, Deng Y, Lee EB, Lazar A, Gao J, Furth EE, Hwang TH, Wang L, Thaiss CA, Hu J, Li M. Resolving tissue complexity by multimodal spatial omics modeling with MISO. Nat Methods. 2025 Mar;22(3):530–538. doi:10.1038/s41592-024-02574-2.
Govek KW, Nicodemus P, Lin Y, et al. CAJAL enables analysis and integration of single-cell morphological data using metric geometry. Nat Commun. 2023;14(1):3672. doi:10.1038/s41467-023-39424-2.
Guo P, Mao L, Chen Y, Lee CN, Cardilla A, Li M, Bartosovic M, Deng Y, et al. Multiplexed spatial mapping of chromatin features, transcriptome and proteins in tissues. Nat Methods. 2025 Mar;22(3):520–529. doi:10.1038/s41592-024-02576-0.
Hu J, Li X, Coleman K, Schroeder A, Ma N, Irwin DJ, Lee EB, Shinohara RT, Li M. SpaGCN: integrating gene expression, spatial location and histology to identify spatial domains and spatially variable genes by graph convolutional network. Nat Methods. 2021 Nov;18(11):1342–1351. doi:10.1038/s41592-021-01255-8.
Perlman BS, Burget N, Zhou Y, Schwartz GW, Petrovic J, Modrusan Z, Faryabi RB. Enhancer-promoter hubs organize transcriptional networks promoting oncogenesis and drug resistance. Nat Commun. 2024;15(1):8070. doi:10.1038/s41467-024-52375-6.
Wilson PC, Verma A, Yoshimura Y, Muto Y, Li H, Malvin NP, Dixon EE, Humphreys BD. Mosaic loss of Y chromosome is associated with aging and epithelial injury in chronic kidney disease. Genome Biol. 2024 Jan 29;25(1):36. doi:10.1186/s13059-024-03173-2.
Zhang Z, Mathew D, Lim TL, et al. Recovery of biological signals lost in single‑cell batch integration with CellANOVA. Nat Biotechnol. 2024 Nov 26;42(11). doi:10.1038/s41587-024-02463-1.
Zhang D, Wang X, Shivashankar GV, Uhler C. Inferring super-resolution tissue architecture by integrating spatial transcriptomics with histology. Nat Biotechnol. 2024;42(1):22–31. doi:10.1038/s41587-023-02019-9.