Metabolite Identification Data in Drug Discovery, Part 2: Site-of-Metabolism Annotation, Analysis, and Exploration for Machine Learning / Y. Chen, S. Winiwarter, R.A. Jacob, M. Ahlqvist, A. Mazzolari, F. Miljković, J. Kirchmair. - In: MOLECULAR PHARMACEUTICS. - ISSN 1543-8384. - 22:11(2025 Nov 03), pp. 6772-6787. [10.1021/acs.molpharmaceut.5c00740]

Metabolite Identification Data in Drug Discovery, Part 2: Site-of-Metabolism Annotation, Analysis, and Exploration for Machine Learning

A. Mazzolari
2025

Abstract

The ability to pinpoint and predict sites of metabolism (SoMs) is essential for designing and optimizing effective and safe bioactive small molecules. However, the number of molecules with annotated SoMs is limited, hindering the advancement of data-driven methods such as machine learning for metabolism prediction. Here, we provide a comprehensive characterization of SoM data obtained from the readouts of a human hepatocyte assay conducted at AstraZeneca Gothenburg. We explore a new strategy for SoM annotation that accounts for uncertainty in the experimental data, and we relate our findings to the most comprehensive SoM data collection available to date. Our study includes entropy analysis of SoM annotations, accompanied by representative examples that highlight the complexities of interpreting and working with metabolism data. Furthermore, we demonstrate the impact and value of the new metabolism data on SoM prediction. Importantly, a substantial portion of the data generated and analyzed as part of this work is made publicly available.
Keywords: data analysis; data annotation; data sets; drug metabolism; sites of metabolism (SoMs); xenobiotic metabolism
Academic discipline: CHEM-07/A - Pharmaceutical Chemistry
3 Nov 2025
21 Oct 2025
Article (author)
Files in this item:
metabolite-identification-data-in-drug-discovery-part-2-site-of-metabolism-annotation-analysis-and-exploration-for.pdf
Access: open access
Type: Publisher's version/PDF
License: Creative Commons
Size: 6.05 MB
Format: Adobe PDF
Documents in IRIS are protected by copyright, and all rights are reserved unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/2434/1242962