Lessons in Spectrometry: Deep learning and rule based annotation of tandem mass spectra for glycan sequencing
Abstract
Glycans, a broad class of complex sugar molecules, are versatile tools in every living organism. They serve as crucial mediators of information transfer as well as modulators of higher-level structural and chemical features. The exact downstream function of most glycan structures on most proteins is not currently understood, either causally or mechanistically. Reaching that understanding requires addressing the difficulty of characterising and quantifying glycan structures from biological samples. While tandem mass spectrometry (MS/MS) has been used to measure glycans for decades, completely characterising structural detail in novel biological contexts remains a major bottleneck, taking weeks of dedicated work. Public experimental data also remains varied in format and scope between research groups, which makes collective progress uneven. We collected all public glycomics MS/MS data and catalogued the published glycan representation formats. We then established a conversion scheme to translate these representations from all commonly used chemical nomenclatures to a canonical IUPAC-condensed format. Using this data, we applied interpretable tree-based methods to devise simpler and more general rules for structural annotation from specific observed masses. Finally, we showed that a deep residual network with 1D convolutions can learn an approximate mapping from tandem mass spectra to glycan structure classes. This work contributes to faster, simpler and more harmonised glycomics MS/MS analysis whilst promoting greater standardisation across structural annotations.
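To make the phrase "residual network with 1D convolutions" concrete, the following is a minimal sketch of a single residual block applied to a spectrum represented as a fixed-length intensity vector. It is not the architecture used in this work: the function names, the kernel values, the odd kernel length and the toy intensity vector are all assumptions chosen for illustration, and a real model would learn its kernels and stack many such blocks.

```python
from typing import List


def conv1d_same(signal: List[float], kernel: List[float]) -> List[float]:
    """Slide an odd-length kernel over the signal with zero padding,
    so the output has the same length as the input."""
    half = len(kernel) // 2
    padded = [0.0] * half + list(signal) + [0.0] * half
    return [
        sum(k * padded[i + j] for j, k in enumerate(kernel))
        for i in range(len(signal))
    ]


def relu(values: List[float]) -> List[float]:
    """Clamp negative values to zero."""
    return [max(0.0, v) for v in values]


def residual_block(
    signal: List[float], kernel_a: List[float], kernel_b: List[float]
) -> List[float]:
    """Two 1D convolutions whose result is added back to the input
    (the skip connection), followed by a ReLU."""
    hidden = relu(conv1d_same(signal, kernel_a))
    update = conv1d_same(hidden, kernel_b)
    return relu([s + u for s, u in zip(signal, update)])


# Hypothetical toy input: intensities in six consecutive m/z bins.
toy_spectrum = [0.0, 1.0, 0.0, 0.5, 0.0, 0.0]
# Hypothetical hand-picked kernels; in a trained network these are learned.
features = residual_block(toy_spectrum, [0.25, 0.5, 0.25], [0.0, 1.0, 0.0])
```

The skip connection means that a block whose kernels are all zero passes a non-negative input through unchanged, which is the property that makes deep stacks of such blocks practical to train.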
Articles
Urban J, Joeres R, Bojar D. Bridging worlds: connecting glycan representations with glycoinformatics via Universal Input and a canonicalized nomenclature. Bioinformatics Advances 2025; 5 (1), vbaf310. doi:10.1093/bioadv/vbaf310
Urban J, Jin C, Thomsson KA, Karlsson NG, Ives CM, Fadda E, Bojar D. Predicting glycan structure from tandem mass spectrometry via deep learning. Nature Methods 2024; 21, 1206–1215. doi:10.1038/s41592-024-02314-6