Stream 2: Genomics Signal Integration

Overview

Stream 2 is the data backbone of the QuantOmics pipeline. Trainees in this stream tackle the grand challenge of converting noisy, high-dimensional quantum sensor data into high-fidelity, analytically tractable multi-omic datasets. Without Stream 2, the hardware signals from Stream 1 cannot be interpreted, and the AI models of Stream 3 have no validated data to learn from.

The stream bridges two worlds: the messy, real-world signals from quantum probes operating in complex biological matrices, and the clean, structured representations required by state-of-the-art computational genomics and AI methods.


Research Focus Areas

Whole-Genome Sequencing & Variant Analysis

Long- and short-read sequencing data generated by quantum-enhanced nanopore platforms requires sophisticated analysis. Trainees work on:

  • Applying comprehensive whole-genome sequencing (WGS) analyses to brain organoids to identify therapeutic targets for neurodevelopmental disorders such as autism
  • Developing improved variant calling algorithms that account for the distinctive noise characteristics of quantum-coupled sequencing
  • Building tools for structural variant detection in complex genomic regions
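Variant calling ultimately reduces candidate sites to those passing quality and depth filters. As a minimal illustrative sketch (not the stream's actual tooling), the following filters records from a standard VCF text stream; the thresholds and the `filter_variants` helper are assumptions chosen for illustration:

```python
def filter_variants(vcf_lines, min_qual=30.0, min_depth=10):
    """Yield VCF lines whose records pass illustrative QUAL and DP thresholds.

    Header lines (starting with '#') are passed through unchanged.
    """
    for line in vcf_lines:
        if line.startswith("#"):
            yield line
            continue
        fields = line.rstrip("\n").split("\t")
        qual = float(fields[5])  # VCF QUAL column
        # Parse key=value pairs from the INFO column into a dict
        info = dict(kv.split("=", 1) for kv in fields[7].split(";") if "=" in kv)
        if qual >= min_qual and int(info.get("DP", 0)) >= min_depth:
            yield line
```

Production pipelines would instead use established libraries (e.g. pysam or cyvcf2) and model-based filters, but the pass/fail structure is the same.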

Multi-Omic Data Fusion

Integrating data across molecular scales — from DNA to protein to cellular phenotype. Key projects include:

  • Fusing electrical and optical sensor data streams with nanopore sequencing reads and methylation maps
  • Building end-to-end Snakemake/Nextflow pipelines for reproducible multi-omic analysis
  • Developing early vs. late fusion strategies for genomic, transcriptomic, proteomic, and EHR data
  • Applying representation learning to extract shared latent features across data modalities
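The early vs. late fusion distinction above can be sketched in a few lines. This is a toy illustration with random matrices standing in for real modalities; the array shapes and the averaging-based "late" combiner are assumptions, not the stream's actual models:

```python
import numpy as np

rng = np.random.default_rng(0)
genomic = rng.normal(size=(8, 5))    # 8 samples x 5 genomic features
proteomic = rng.normal(size=(8, 3))  # 8 samples x 3 proteomic features

# Early fusion: concatenate modality features, then fit ONE model
# on the joint matrix.
early = np.concatenate([genomic, proteomic], axis=1)  # shape (8, 8)

# Late fusion: score each modality with its OWN model, then combine
# the per-modality predictions (here, a simple average).
score_g = genomic.mean(axis=1)    # stand-in for modality-specific model output
score_p = proteomic.mean(axis=1)
late = 0.5 * score_g + 0.5 * score_p
```

Early fusion lets a single model learn cross-modality interactions but requires aligned samples across all modalities; late fusion tolerates missing modalities at the cost of losing those interactions.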

Computational Tools for Sensor-Derived Genomic Data

Novel sensors generate novel data formats. Stream 2 trainees develop new bioinformatics tools:

  • Signal processing algorithms for denoising raw quantum sensor output before genomic analysis
  • Probabilistic models for base-calling from quantum-coupled nanopore reads
  • Quality control frameworks tailored to attomolar-sensitivity assay data
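To make the denoising step concrete: before base-calling, a raw 1-D sensor trace is typically smoothed to suppress high-frequency noise. A minimal sketch using a centered moving average is shown below; real pipelines would likely use wavelet, Kalman, or learned filters, and the window size here is an arbitrary illustrative choice:

```python
import numpy as np

def moving_average(signal, window=5):
    """Smooth a 1-D sensor trace with a centered moving-average filter.

    Returns an array the same length as the input; edge values are
    attenuated because the window extends past the signal boundary.
    """
    kernel = np.ones(window) / window
    return np.convolve(np.asarray(signal, dtype=float), kernel, mode="same")
```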

Epigenomics & Methylation Analysis

DNA methylation patterns hold rich information about cell state and disease. Trainees build:

  • Pipelines for integrating methylation data with quantum sensor readout from epigenetic biosensors
  • Methods for differential methylation analysis in disease-relevant organoid models
  • Tools for cross-referencing methylation signatures with EHR phenotype data
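At its simplest, differential methylation analysis compares per-site methylation levels (beta values, in [0, 1]) between groups. The sketch below flags CpG sites whose group means differ by an effect-size threshold; the `min_delta` cutoff is an illustrative assumption, and a real analysis would add statistical testing and multiple-testing correction (as in tools like methylKit or limma):

```python
import numpy as np

def differential_methylation(beta_case, beta_ctrl, min_delta=0.2):
    """Flag CpG sites with a large mean beta-value difference between groups.

    beta_case, beta_ctrl: arrays of shape (n_samples, n_sites), values in [0, 1].
    Returns (indices of flagged sites, per-site delta array).
    """
    delta = beta_case.mean(axis=0) - beta_ctrl.mean(axis=0)
    flagged = np.flatnonzero(np.abs(delta) >= min_delta)
    return flagged, delta
```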

Validation Against Real Disease Models

Stream 2 research is validated against clinically relevant biological systems, primarily using patient-derived organoids developed in collaboration with Stream 1. Key application areas:

  • Neurodevelopmental disorders — applying WGS to brain organoids to identify new therapeutic targets for autism, building on existing team expertise in the genomic architecture of these conditions
  • Cancer — integrating multi-omic data to characterize tumor heterogeneity and identify drug-resistant cell populations in organoid cultures
  • Cardiotoxicity — developing computational pipelines for analyzing multi-omic data from cardiomyocyte organ-on-chip models under drug perturbation

Example Trainee Projects

  • Whole-genome sequencing of autism brain organoids — identifying novel de novo variants and gene regulatory disruptions associated with autism spectrum disorder
  • Snakemake pipeline for multi-omic fusion — building a reproducible, containerized pipeline integrating nanopore sequencing, ATAC-seq, and EHR metadata
  • Methylation-guided biomarker discovery — developing an algorithm that identifies disease-specific methylation patterns detectable by quantum epigenetic biosensors
  • Cross-ancestry variant classification — building genomic models that integrate population-diverse genetic marker data to reduce bias in variant pathogenicity prediction

Equity & Diversity in Genomic Data

A critical challenge in Stream 2 is the profound lack of diversity in genomic reference databases, whose samples are overwhelmingly of European ancestry. Trainees in this stream are explicitly trained to:

  • Integrate genetic marker data from ancestrally diverse populations
  • Understand how racial bias in genomic AI can lead to misclassification for underrepresented groups
  • Apply ethical frameworks for Indigenous data sovereignty and community consent in genomic research
  • Validate tools across diverse cell lines to ensure equitable performance

Stream 2 Co-Leads

  • Dr. Brett Trost (UofT / SickKids) — Computational genomics, multi-omic pipeline development
  • Dr. Jacques Corbeil (U Laval / MILA) — Medical genomics, AI-driven data integration, organ-on-chip resources
  • Dr. Brenda Andrews (UofT) — Functional genomics, systems biology, gene networks (Tier 1 CRC)

Courses Supporting Stream 2

  • Course 1.3 — AI in Genomics
  • Bootcamp 1.4 — Multimodal-Omics Data Integration (the core course for this stream)
  • Course 1.5 — Responsible Innovation & EDI in Precision Health (Indigenous data sovereignty, bias in genomic AI)