In recent years, the combination of cutting-edge microscopy and gene-specific in situ labeling approaches has given rise to the nascent field of image-based spatial transcriptomics, which makes it possible to generate large gene expression profiles of biological specimens at subcellular scale while retaining the native spatial context. At EPFL, for example, we recently established an experimental method called Hybridization In Situ Sequencing (HybISS) for profiling the expression of up to two hundred genes simultaneously in mouse tissue sections. HybISS achieves this multiplexing by sequentially imaging the same tissue section with a changing set of fluorescent labels, each specific to a subset of gene transcripts, followed by a decoding step that combinatorially reconstructs the gene identity. Like other spatial transcriptomics approaches, HybISS relies heavily on computational processing methods that can accurately detect several hundred thousand transcript localizations in noisy tissue images and efficiently track (link) all of these localizations across imaging cycles to produce the final gene expression map. Importantly, these methods need to tolerate the wide variability in tissue distortions, labeling artifacts, and misalignments typically present in acquired images. However, currently available computational pipelines employ only simple classical heuristics for transcript detection and linking, such as filtering and thresholding. As a result, they are hard to use and error-prone, must be re-tuned for every experiment, and are neither robust to nor adapted for the frequently changing and challenging imaging conditions.
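To make the decoding step described above concrete, the following is a minimal sketch of combinatorial decoding as a codebook lookup. It assumes that each gene is assigned one fluorescence channel per imaging cycle; the gene names, the channel indices, the number of cycles, and the `max_mismatches` tolerance are all hypothetical and are not taken from the HybISS protocol.

```python
# Hypothetical codebook: gene name -> channel expected in each imaging cycle.
# Names, channel indices, and the number of cycles are illustrative only.
CODEBOOK = {
    "GeneA": (0, 1, 2, 3),
    "GeneB": (1, 1, 0, 2),
    "GeneC": (3, 0, 0, 1),
}


def hamming(a, b):
    """Number of cycles in which two channel sequences disagree."""
    return sum(x != y for x, y in zip(a, b))


def decode(barcode, codebook=CODEBOOK, max_mismatches=0):
    """Return the gene whose code is uniquely closest to `barcode`.

    Returns None if the closest code differs in more than `max_mismatches`
    cycles, or if two genes are equally close (ambiguous read-out).
    """
    scored = sorted((hamming(barcode, code), gene) for gene, code in codebook.items())
    best_dist, best_gene = scored[0]
    if best_dist > max_mismatches:
        return None
    if len(scored) > 1 and scored[1][0] == best_dist:
        return None
    return best_gene


if __name__ == "__main__":
    print(decode((0, 1, 2, 3)))                    # exact match
    print(decode((0, 1, 2, 0)))                    # one wrong cycle, rejected
    print(decode((0, 1, 2, 0), max_mismatches=1))  # one wrong cycle, tolerated
```

With the default `max_mismatches=0`, only channel sequences that match a codebook entry exactly are decoded, which is why errors in detection or linking translate directly into lost or misassigned transcripts.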
In this project, scientists propose a computational spatial transcriptomics framework, codebook-aware ILP detection and tracking (CBAIDT), that combines modern computer vision techniques with a novel, problem-adapted tracking approach to substantially increase the accuracy and robustness of gene expression map generation. Specifically, they will generate the first comprehensive training set of HybISS images with annotated transcript detections. This training set will be used to develop a deep-learning-based transcript detection method with a novel loss function, providing a scalable detection module that is robust to changing imaging conditions. They also propose a complementary tracking module that uses discrete optimization via integer linear programming (ILP), together with a novel codebook-based linking constraint, to generate globally optimal tracks. The increased accuracy and spatial precision afforded by this framework will allow them to reconstruct the gene expression map of a mouse embryo at the beginning of neurulation and provide new insights into the distribution of patterning transcription factors and signaling molecules. Finally, they plan, together with SV-BIOP and the Histology core facility, to make their framework accessible as the CBAIDT software package. This project should provide the spatial transcriptomics community with an easy-to-use yet powerful tool for the quantitative reconstruction of transcriptomics experiments with unparalleled accuracy.
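The idea behind the codebook constraint in the tracking module can be illustrated on a toy instance. The sketch below is not the CBAIDT formulation, which this abstract does not specify: it assumes a binary selection variable per candidate track, an objective of a fixed reward minus the total displacement per selected track, and the constraint that each detection belongs to at most one track. For brevity, the tiny problem is solved by exhaustive enumeration rather than by an ILP solver; the objective and constraints are the ones an ILP would encode. All coordinates, codes, and parameter values are hypothetical.

```python
from itertools import combinations, product
from math import dist

# Hypothetical two-cycle codebook: channel sequence -> gene.
CODEBOOK = {(0, 1): "GeneA", (1, 0): "GeneB"}

# DETECTIONS[cycle] is a list of (x, y, channel) spots; values are illustrative.
DETECTIONS = [
    [(10.0, 10.0, 0), (11.0, 10.0, 1)],  # cycle 0
    [(10.1, 10.0, 0), (10.9, 10.0, 1)],  # cycle 1
]


def candidate_tracks(detections, codebook, max_step=2.0):
    """Enumerate one-detection-per-cycle tracks allowed by the codebook."""
    tracks = []
    for idx in product(*(range(len(cycle)) for cycle in detections)):
        spots = [detections[c][i] for c, i in enumerate(idx)]
        channels = tuple(s[2] for s in spots)
        if channels not in codebook:
            continue  # codebook constraint: the track must spell a valid code
        steps = [dist(a[:2], b[:2]) for a, b in zip(spots, spots[1:])]
        if any(step > max_step for step in steps):
            continue  # spatial gating between consecutive cycles
        tracks.append((idx, codebook[channels], sum(steps)))
    return tracks


def solve_linking(detections, codebook, reward=5.0, max_step=2.0):
    """Select disjoint tracks maximizing the sum of (reward - displacement)."""
    cands = candidate_tracks(detections, codebook, max_step)
    best_score, best = 0.0, []
    for k in range(1, len(cands) + 1):
        for subset in combinations(cands, k):
            used = [(c, i) for idx, _, _ in subset for c, i in enumerate(idx)]
            if len(used) != len(set(used)):
                continue  # a detection would be shared by two tracks
            score = sum(reward - cost for _, _, cost in subset)
            if score > best_score:
                best_score, best = score, list(subset)
    return best_score, best


if __name__ == "__main__":
    score, tracks = solve_linking(DETECTIONS, CODEBOOK)
    for idx, gene, cost in tracks:
        print(gene, idx, round(cost, 3))
```

In this toy instance, linking each spot to its nearest neighbor in the next cycle would produce the channel sequences (0, 0) and (1, 1), neither of which is a valid code. The codebook constraint instead forces the two slightly longer links that decode to valid genes, which is the kind of global consistency a purely local heuristic cannot enforce.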