Stream 2: Genomics Signal Integration
Overview
Stream 2 is the data backbone of the QuantOmics pipeline. Trainees in this stream tackle the grand challenge of converting noisy, high-dimensional quantum sensor data into high-fidelity, analytically tractable multi-omic datasets. Without Stream 2, the hardware signals from Stream 1 cannot be interpreted, and the AI models of Stream 3 have no validated data to learn from.
The stream bridges two worlds: the messy, real-world signals from quantum probes operating in complex biological matrices, and the clean, structured representations required by state-of-the-art computational genomics and AI methods.
Research Focus Areas
Whole-Genome Sequencing & Variant Analysis
Long- and short-read sequencing data generated by quantum-enhanced nanopore platforms requires sophisticated analysis. Trainees work on:
- Applying comprehensive whole-genome sequencing (WGS) analyses to brain organoids to identify therapeutic targets for neurodevelopmental disorders such as autism
- Developing improved variant calling algorithms that account for unique noise characteristics of quantum-coupled sequencing
- Building tools for structural variant detection in complex genomic regions
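To make the variant-calling theme concrete, here is a minimal sketch of likelihood-based genotype calling with an explicit per-read error rate — the kind of noise parameter that would need recalibration for quantum-coupled sequencing. All values and names are illustrative, not part of any trainee's actual pipeline.

```python
# Toy genotype-likelihood calculation at a single site. The per-read
# error rate `err` is the knob a quantum-coupled platform would need
# recalibrated; the pileup is synthetic.
from math import log

def genotype_log_likelihood(reads, ref, alt, err=0.05):
    """Log-likelihoods of hom-ref, het, and hom-alt given observed bases."""
    # (P(observe ref base), P(observe alt base)) under each genotype
    models = {"0/0": (1 - err, err), "0/1": (0.5, 0.5), "1/1": (err, 1 - err)}
    ll = {}
    for gt, (p_ref, p_alt) in models.items():
        # Assumes reads carry only ref or alt bases (toy simplification)
        ll[gt] = sum(log(p_ref if b == ref else p_alt) for b in reads)
    return ll

pileup = ["A", "A", "G", "A", "G", "G", "G"]  # mixed support at one site
ll = genotype_log_likelihood(pileup, ref="A", alt="G")
print(max(ll, key=ll.get))
```

Raising `err` flattens the hom-ref and hom-alt likelihoods, which is why noise characterization directly changes which genotype wins.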
Multi-Omic Data Fusion
Integrating data across molecular scales — from DNA to protein to cellular phenotype. Key projects include:
- Fusing electrical and optical sensor data streams with nanopore sequencing reads and methylation maps
- Building end-to-end Snakemake/Nextflow pipelines for reproducible multi-omic analysis
- Developing early vs. late fusion strategies for genomic, transcriptomic, proteomic, and EHR data
- Applying representation learning to extract shared latent features across data modalities
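The early- vs. late-fusion distinction above can be sketched in a few lines. This is a deliberately tiny illustration with synthetic feature vectors and a stand-in "model"; real projects would use trained predictors and learned fusion weights.

```python
# Contrast between early and late fusion for two omic modalities.
# Data, models, and the 0.5 weighting are hypothetical placeholders.

def early_fusion(genomic_vec, methylation_vec, model):
    """Concatenate modality features first, then apply one model."""
    fused = genomic_vec + methylation_vec  # list concatenation
    return model(fused)

def late_fusion(genomic_vec, methylation_vec, model_a, model_b, w=0.5):
    """Score each modality with its own model, then combine predictions."""
    return w * model_a(genomic_vec) + (1 - w) * model_b(methylation_vec)

# Toy "model": the feature mean stands in for a trained predictor.
mean = lambda v: sum(v) / len(v)

g = [0.2, 0.4, 0.6]   # e.g. variant-burden features
m = [0.8, 1.0]        # e.g. methylation beta values

print(early_fusion(g, m, mean))       # one model on the joint vector
print(late_fusion(g, m, mean, mean))  # weighted average of two models
```

Early fusion lets the model exploit cross-modality interactions; late fusion tolerates missing modalities and mismatched sample sizes — a recurring trade-off when EHR data joins the mix.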
Computational Tools for Sensor-Derived Genomic Data
Novel sensors generate novel data formats. Stream 2 trainees develop new bioinformatics tools:
- Signal processing algorithms for denoising raw quantum sensor output before genomic analysis
- Probabilistic models for base-calling from quantum-coupled nanopore reads
- Quality control frameworks tailored to attomolar-sensitivity assay data
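As a minimal sketch of the denoising step, here is a centered moving-average filter applied to a raw sensor trace. The window size and the trace are illustrative; actual pipelines would use filters matched to the sensor's measured noise spectrum.

```python
# Moving-average smoothing of a 1-D sensor trace before base-calling.
# Window size and trace values are illustrative only.

def moving_average(signal, window=3):
    """Smooth a 1-D signal with a centered moving average."""
    half = window // 2
    out = []
    for i in range(len(signal)):
        lo, hi = max(0, i - half), min(len(signal), i + half + 1)
        out.append(sum(signal[lo:hi]) / (hi - lo))
    return out

raw = [1.0, 1.2, 5.0, 1.1, 0.9, 1.0]  # spike at index 2 mimics a noise burst
print(moving_average(raw))
```

The spike is attenuated but not removed — one reason trainees develop probabilistic models downstream rather than relying on smoothing alone.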
Epigenomics & Methylation Analysis
DNA methylation patterns hold rich information about cell state and disease. Trainees build:
- Pipelines for integrating methylation data with quantum sensor readout from epigenetic biosensors
- Methods for differential methylation analysis in disease-relevant organoid models
- Tools for cross-referencing methylation signatures with EHR phenotype data
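A toy version of the differential-methylation idea: flag CpG sites whose mean beta value differs between disease and control samples by more than a fixed delta. Site names, values, and the 0.2 threshold are invented for illustration; production analyses would add proper statistics and multiple-testing correction.

```python
# Toy differential-methylation filter over per-site beta values.
# All sites, values, and the delta threshold are illustrative.
from statistics import mean

disease = {"cg001": [0.80, 0.85, 0.90], "cg002": [0.40, 0.45, 0.42]}
control = {"cg001": [0.30, 0.35, 0.25], "cg002": [0.41, 0.39, 0.44]}

def differentially_methylated(d, c, delta=0.2):
    """Return sites where |mean(disease) - mean(control)| exceeds delta."""
    return [site for site in d
            if abs(mean(d[site]) - mean(c[site])) > delta]

print(differentially_methylated(disease, control))
```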
Validation Against Real Disease Models
Stream 2 research is validated against clinically relevant biological systems, primarily using patient-derived organoids developed in collaboration with Stream 1. Key application areas:
- Neurodevelopmental disorders — applying WGS to brain organoids to identify new therapeutic targets for autism, building on existing team expertise in the genomic architecture of these conditions
- Cancer — integrating multi-omic data to characterize tumor heterogeneity and identify drug-resistant cell populations in organoid cultures
- Cardiotoxicity — developing computational pipelines for analyzing multi-omic data from cardiomyocyte organ-on-chip models under drug perturbation
Example Trainee Projects
- Whole-genome sequencing of autism brain organoids — identifying novel de novo variants and gene regulatory disruptions associated with autism spectrum disorder
- Snakemake pipeline for multi-omic fusion — building a reproducible, containerized pipeline integrating nanopore sequencing, ATAC-seq, and EHR metadata
- Methylation-guided biomarker discovery — developing an algorithm that identifies disease-specific methylation patterns detectable by quantum epigenetic biosensors
- Cross-ancestry variant classification — building genomic models that integrate population-diverse genetic marker data to reduce bias in variant pathogenicity prediction
Equity & Diversity in Genomic Data
A critical challenge in Stream 2 is the profound lack of diversity in genomic reference databases, which overwhelmingly represent populations of European ancestry. Trainees in this stream are explicitly trained to:
- Integrate genetic marker data from diverse, underrepresented populations
- Understand how racial bias in genomic AI can lead to misclassification for underrepresented groups
- Apply ethical frameworks for Indigenous data sovereignty and community consent in genomic research
- Validate tools across diverse cell lines to ensure equitable performance
Stream 2 Co-Leads
- Dr. Brett Trost (UofT / SickKids) — Computational genomics, multi-omic pipeline development
- Dr. Jacques Corbeil (U Laval / MILA) — Medical genomics, AI-driven data integration, organ-on-chip resources
- Dr. Brenda Andrews (UofT) — Functional genomics, systems biology, gene networks (Tier 1 CRC)
Courses Supporting Stream 2
- Course 1.3 — AI in Genomics
- Bootcamp 1.4 — Multimodal-Omics Data Integration (the core course for this stream)
- Course 1.5 — Responsible Innovation & EDI in Precision Health (Indigenous data sovereignty, bias in genomic AI)