A cross-attentive multi-task graph learning framework for chemical reaction modeling / M. Astero, A. Li, E. Casiraghi, J. Rousu. - In: BIOINFORMATICS. - ISSN 1367-4811. - (2025), pp. btag193.1-btag193.9. [Epub ahead of print] [10.1093/bioinformatics/btag193]
A cross-attentive multi-task graph learning framework for chemical reaction modeling
E. Casiraghi (penultimate author)
2025
Abstract
Motivation: Understanding chemical reactions requires bridging fine-grained molecular edits with broader semantic context. Reaction mechanisms are determined not only by local atom–bond transformations but also by the global reaction class. However, most existing approaches treat these tasks separately or rely on external atom-mapping tools, introducing noise and limiting end-to-end learnability. We introduce MARCC (Mapping-Assisted Reaction Center and Classification), a multi-task graph neural network that jointly predicts atom mappings, reaction centers, and reaction classes within a unified architecture.

Results: MARCC integrates three key innovations: (i) a mapping-guided cross-attention mechanism that aligns reactants and products for local edit detection, (ii) a dual-graph design that explicitly reasons about bond-level transformations, and (iii) pooled product embeddings for global reaction classification. On the USPTO-50K benchmark, MARCC achieves state-of-the-art results when trained with both reactants and products, including 98.2% atom mapping accuracy, 99.1% Top-1 edit localization accuracy, and 97.2% reaction classification accuracy. Even under the products-only setting, MARCC delivers competitive performance comparable to specialized baselines. Ablation studies confirm the value of mapping-guided attention and multi-task supervision, which enhance both predictive accuracy and interpretability. By unifying atom-level alignment, local reactivity, and global classification, MARCC provides a structured and interpretable framework for reaction understanding. Beyond benchmarks, MARCC has the potential to support applications in reaction annotation, template discovery, and mechanism inference; with additional domain-specific modeling and data, it could be extended to biochemical domains such as enzyme-catalyzed transformations and metabolic pathway modeling.
Availability and Implementation: The source code and implementation details are available at https://github.com/maryamastero/MARCC and archived at https://doi.org/10.5281/zenodo.18500230.

Contact: [email protected], [email protected]

Supplementary Information: Supplementary data are available at Bioinformatics online.

| File | Access | Type | License | Size | Format |
|---|---|---|---|---|---|
| bioinformatics_MARCC_Maryam.pdf | Open access | Post-print, accepted manuscript, etc. (version accepted by the publisher) | Creative Commons | 1.62 MB | Adobe PDF |
Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.




