Research

You can also find my articles on my Google Scholar profile.

Nanopore signal classification with pangenome indexes

Nanopore sequencing generates reads by measuring an electrical current signal, which is then converted to a nucleic acid sequence, typically with a neural network. This basecalling step is a bottleneck in real-time classification pipelines. We developed Sigmoni, a novel method that classifies nanopore reads directly from the signal using the r-index, a full-text index that scales to pangenomes. Sigmoni is significantly faster and more accurate than existing methods for classifying nanopore reads against large pangenomes.

Building and visualizing pangenomes

Pangenomes require an underlying alignment for interpretability and usability. This structure can take the form of a multiple alignment or a graph, but both are impractical to compute. We developed a novel method, Mumemto, to compute multi-MUMs (maximal unique matches shared across all sequences), which are commonly used as alignment anchors, at the scale of hundreds of human genomes. Mumemto can visualize pangenome synteny (example below), accelerate graph construction, identify misassemblies, and even improve full-text index-based classification (see further work).

Figure: Mumemto synteny visualization of a potato pangenome (ref)
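
To make the idea of a multi-MUM concrete, here is a minimal brute-force sketch in Python. It is not Mumemto's algorithm and would not scale beyond toy inputs; it only illustrates the definition assumed here, namely a substring that occurs exactly once in every sequence and cannot be extended to the left or right while remaining a match in all of them. The function names `count_occurrences` and `multi_mums` and the `min_len` parameter are hypothetical, chosen for this example.

```python
def count_occurrences(text: str, pattern: str) -> int:
    """Count occurrences of pattern in text, allowing overlaps."""
    count = 0
    start = text.find(pattern)
    while start != -1:
        count += 1
        start = text.find(pattern, start + 1)
    return count


def multi_mums(seqs: list[str], min_len: int = 1) -> list[tuple[str, list[int]]]:
    """Brute-force multi-MUMs (illustrative only, not Mumemto's method).

    Returns (match, start position in each sequence) pairs, ordered by
    position in the first sequence.
    """
    first = seqs[0]
    mums = []
    for i in range(len(first)):
        for j in range(i + min_len, len(first) + 1):
            match = first[i:j]
            # Unique: the match must occur exactly once in every sequence.
            if any(count_occurrences(s, match) != 1 for s in seqs):
                continue
            starts = [s.find(match) for s in seqs]
            ends = [p + len(match) for p in starts]
            # Flanking characters; None marks a sequence boundary.
            left = {s[p - 1] if p > 0 else None for s, p in zip(seqs, starts)}
            right = {s[e] if e < len(s) else None for s, e in zip(seqs, ends)}
            # Extendable only if every sequence has the same flanking character.
            extendable_left = len(left) == 1 and None not in left
            extendable_right = len(right) == 1 and None not in right
            # Maximal: cannot be extended in either direction.
            if not extendable_left and not extendable_right:
                mums.append((match, starts))
    return mums


print(multi_mums(["GATTACA", "ATTACGG", "TTATTAC"]))
# [('ATTAC', [1, 0, 2])]
```

The per-sequence start positions are what make these matches usable as anchors: each multi-MUM pins one unambiguous location in every genome, so ordered anchors can outline collinear (syntenic) regions.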