The Data Sprint on Imaging Transcriptomics, organized by the Peter Mac Computational Biology program, recently brought together over 50 bioinformaticians and computational biologists for a two-day collaboration. Focused on data from the Phenomics Australia node, the Victorian Centre for Functional Genomics (VCFG) at Peter Mac, the sprint delved into image-based profiling in 2D and 3D cells, and Multiplexed Analysis of Cells (MAC-seq).
This collaborative success signals progress in computational biology, advancing our understanding of imaging transcriptomics for drug discovery and disease research.
In a recent data sprint, organised by the Peter Mac Computational Biology program, bioinformaticians and computational biologists from various backgrounds were brought together for two days of intense scientific collaboration and networking. This event, now in its second year, revolved around the analysis of data provided by the Victorian Centre for Functional Genomics (VCFG), offering a set of rewarding challenges.
Over 50 participants attended, representing a diverse mix of post-docs, students, and research staff. The participants formed teams based on their specific interests in the data and quickly developed prototypes and analyses. These teams then presented their findings to the wider group, fostering an environment of collaboration and knowledge sharing. This event served as a catalyst for the further development of bioinformatics tools dedicated to analysing data from high-throughput sequencing and integrating it with imaging data (Imaging Transcriptomics).
The datasets at the centre of our sprint, all of which originated from the VCFG, therefore focussed on these two omics methods:
1.) Image-based profiling in 2D and 3D cells.
This approach condenses the rich information contained in biological images into multidimensional profiles of image-based features. These profiles can be invaluable for understanding disease mechanisms and predicting the activity, toxicity, or mechanism of action of drugs. However, changes that don’t manifest as alterations in cell morphology, especially in organoids and spheroids, can be overlooked. Additionally, interpreting imaging features in isolation can be challenging.
2.) Multiplexed Analysis of Cells (MAC-seq).
This mRNA sequencing method allows cells to be lysed directly in 384-well plates of 2D and 3D cells, tagged with barcodes, and run as a single pooled sample, which dramatically reduces the sequencing cost per sample. MAC-seq is a 3’ methodology, similar to 10X in a single-cell experiment. The poly-A selection process allows transcripts to be tagged with a well ID and a unique molecular identifier (UMI), and a full-length cDNA intermediate is generated at this point. Fragmentation and adapter ligation then allow for a specific PCR step that results in a 3’ library that is read in two steps: Read 1 covers the well ID and UMI, and Read 2 provides the 3’ gene sequence.
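The two-read layout described above can be illustrated with a short Python sketch. It is not the VCFG or sprint pipeline: the barcode and UMI lengths (10 bases each), the barcode-to-well map, and the function names are assumptions made for the example, and Read 2 is represented by a gene name already assigned by an upstream aligner rather than by a raw sequence.

```python
from collections import defaultdict

# Assumed layout for illustration only: the article does not state these lengths.
BARCODE_LEN = 10
UMI_LEN = 10


def split_read1(read1):
    """Split Read 1 into (well barcode, UMI) under the assumed layout."""
    if len(read1) < BARCODE_LEN + UMI_LEN:
        raise ValueError("Read 1 is shorter than barcode + UMI")
    barcode = read1[:BARCODE_LEN]
    umi = read1[BARCODE_LEN:BARCODE_LEN + UMI_LEN]
    return barcode, umi


def count_umis(read_pairs, barcode_to_well):
    """Count unique UMIs per (well, gene).

    read_pairs: iterable of (read1_sequence, gene) tuples, where gene is the
    gene that Read 2 was assigned to by an aligner (not shown here).
    Reads whose barcode is not in barcode_to_well are skipped.
    """
    seen = defaultdict(set)
    for read1, gene in read_pairs:
        barcode, umi = split_read1(read1)
        well = barcode_to_well.get(barcode)
        if well is None:
            continue
        # Identical UMIs for the same well and gene collapse to one molecule.
        seen[(well, gene)].add(umi)
    return {key: len(umis) for key, umis in seen.items()}


if __name__ == "__main__":
    # Hypothetical barcodes, wells, and reads.
    wells = {"AAAAAAAAAA": "A01", "CCCCCCCCCC": "A02"}
    pairs = [
        ("AAAAAAAAAA" + "ACGTACGTAC", "TP53"),
        ("AAAAAAAAAA" + "ACGTACGTAC", "TP53"),  # PCR duplicate: same UMI
        ("AAAAAAAAAA" + "TTTTGGGGCC", "TP53"),
        ("CCCCCCCCCC" + "ACGTACGTAC", "MYC"),
        ("GGGGGGGGGG" + "ACGTACGTAC", "MYC"),  # unknown barcode, skipped
    ]
    print(count_umis(pairs, wells))
```

Counting unique UMIs rather than raw reads is what lets pooled samples be separated by well afterwards while PCR duplicates are collapsed; a production workflow would also need to tolerate sequencing errors in barcodes and UMIs, which this sketch ignores.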
The perturbations and conditions included in our analysis encompassed a drug library with well-defined mechanisms of action (MOA), consisting of 69 compounds at four different concentrations, as well as control groups. Additionally, we targeted 48 genes using CRISPR/RNAi, reflecting the pathways affected by the drugs in the library, along with approximately 20 additional targets commonly altered in cancer or during cancer drug treatment.
The outcomes of our sprint were diverse and included:
- The development of a Nextflow pipeline for MAC-seq quality control, annotation, and normalization.
- The integration of phenotypic and transcriptional signatures of pathway perturbations, such as unsupervised clustering of gene expression and the identification of cluster-associated phenotypic features.
- A systematic assessment of different R methods to effectively integrate RNA-seq and phenotype data.
- The integration of shared targets between drugs, siRNA, and sgRNA from sequencing data.
- Exploration of whether phenotype imaging data from cell painting can predict viability/toxicity.
- Collation of expression data into Seurat objects and an exploration of assumptions for different normalization methods and batch effects. This also involved investigating correlations between cell number and sequencing depths between replicates, as well as analysing gene expression changes for known drug targets and genes in pathways.
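As one illustration of the replicate checks listed above, the relationship between cell number and sequencing depth can be summarised with a Pearson correlation. The sketch below is a minimal example in Python rather than the sprint teams' own code (the article notes that R and Seurat were used), and the per-well values and variable names are made up for demonstration.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / math.sqrt(var_x * var_y)


if __name__ == "__main__":
    # Hypothetical per-well values: cell counts from imaging and
    # total UMI counts (sequencing depth) from the same wells.
    cell_counts = [1200, 950, 400, 1500, 800]
    umi_depth = [61000, 48000, 22000, 70000, 43000]
    r = pearson(cell_counts, umi_depth)
    print(f"Pearson r between cell number and depth: {r:.3f}")
```

A strong positive correlation would suggest that depth differences between wells largely reflect how many cells were lysed, which is one reason the choice of normalization method matters when comparing treated and control wells.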
Overall, our data sprint was a collaborative success, bringing together experts in the field to address complex challenges in computational biology and advance our understanding of imaging transcriptomics and its applications in drug discovery and disease research. We will keep exploring the new collaborations and are in the process of forming a working group to formalise many of the workflows and pipelines that were used during the sprint, with the aim of standardising the way MAC-seq data is analysed across Peter Mac and beyond.
The VCFG provides a collaborative and innovative partnership. It primarily operates a ‘researcher driven, staff assisted’ model, working with researchers each step of the way through assay development, optimisation, transfection and analysis. This partnership begins with a discussion with A/Prof Kaylene Simpson, followed by embedding into the laboratory, training on instruments and performing experiments alongside us until project completion. Comprehensive user guides and associated instrument guides are provided. All data generated remains the intellectual property of the researcher. Importantly, each project is customised to the specific biological question, helping drive the project to the best screen outcome possible.
Through the Victorian Centre for Functional Genomics (VCFG) at the Peter MacCallum Cancer Centre, the Harry Perkins Institute of Medical Research (Perkins), and most recently through the ANU Centre for Therapeutic Discovery (ACTD, The John Curtin School of Medical Research, ANU), Monash University, and the University of Adelaide (in partnership with SAHMRI), Phenomics Australia Functional Genomics and High-throughput screening services provide biomedical researchers Australia-wide with the ability to perform novel discovery-based screens using multiple platforms.