Motivation: Understanding chemical reactions requires bridging fine-grained molecular edits with broader semantic context. Reaction mechanisms are determined not only by local atom–bond transformations but also by the global reaction class. However, most existing approaches treat these tasks separately or rely on external atom-mapping tools, introducing noise and limiting end-to-end learnability. We introduce MARCC (Mapping-Assisted Reaction Center and Classification), a multi-task graph neural network that jointly predicts atom mappings, reaction centers, and reaction classes within a unified architecture.

Results: MARCC integrates three key innovations: (i) a mapping-guided cross-attention mechanism that aligns reactants and products for local edit detection, (ii) a dual-graph design that explicitly reasons about bond-level transformations, and (iii) pooled product embeddings for global reaction classification. On the USPTO-50K benchmark, MARCC achieves state-of-the-art results when trained with both reactants and products, including 98.2% atom mapping accuracy, 99.1% Top-1 edit localization accuracy, and 97.2% reaction classification accuracy. Even under the products-only setting, MARCC delivers competitive performance comparable to specialized baselines. Ablation studies confirm the value of mapping-guided attention and multi-task supervision, which enhance both predictive accuracy and interpretability. By unifying atom-level alignment, local reactivity, and global classification, MARCC provides a structured and interpretable framework for reaction understanding. Beyond benchmarks, MARCC has the potential to support applications in reaction annotation, template discovery, and mechanism inference; with additional domain-specific modeling and data, it could be extended to biochemical domains such as enzyme-catalyzed transformations and metabolic pathway modeling.

Availability and Implementation: The source code and implementation details are available at https://github.com/maryamastero/MARCC and archived at https://doi.org/10.5281/zenodo.18500230.

Contact: [email protected], [email protected]

Supplementary Information: Supplementary data are available at Bioinformatics online.

A cross-attentive multi-task graph learning framework for chemical reaction modeling / M. Astero, A. Li, E. Casiraghi, J. Rousu. - In: BIOINFORMATICS. - ISSN 1367-4811. - (2025), pp. btag193.1-btag193.9. [Epub ahead of print] [10.1093/bioinformatics/btag193]

A cross-attentive multi-task graph learning framework for chemical reaction modeling

E. Casiraghi (penultimate author); 2025

Sector INFO-01/A - Computer Science
Sector CHEM-04/A - Industrial Chemistry
2025
20-apr-2026
Article (author)
Files in this item:
File: bioinformatics_MARCC_Maryam.pdf
Access: open access
Type: Post-print, accepted manuscript, etc. (version accepted by the publisher)
License: Creative Commons
Size: 1.62 MB
Format: Adobe PDF

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/2434/1239936