DOME Registry: Implementing community-wide recommendations for reporting supervised machine learning in biology / O. Abdelghani Attafi, D. Clementel, K. Kyritsis, E. Capriotti, G. Farrell, S. Fragkouli, L. Jael Castro, A. Hatos, T. Lenaerts, S. Mazurenko, S. Mozaffari, F. Pradelli, P. Ruch, C. Savojardo, P. Turina, F. Zambelli, D. Piovesan, A. Miguel Monzon, F. Psomopoulos, S.C.E. Tosatto. - (2024 Aug 14). [10.48550/arxiv.2408.07721]

DOME Registry: Implementing community-wide recommendations for reporting supervised machine learning in biology

F. Zambelli;
2024

Abstract

Supervised machine learning (ML) is used extensively in biology and deserves closer scrutiny. The DOME recommendations aim to enhance the validation and reproducibility of ML research by establishing standards for key aspects such as data handling and processing, optimization, evaluation, and model interpretability. The recommendations help to ensure that key details are reported transparently by providing a structured set of questions. Here, we introduce the DOME Registry (URL: this http URL), a database that allows scientists to manage and access comprehensive DOME-related information on published ML studies. The registry uses external resources like ORCID, APICURON and the Data Stewardship Wizard to streamline the annotation process and ensure comprehensive documentation. By assigning unique identifiers and DOME scores to publications, the registry fosters a standardized evaluation of ML methods. Future plans include continuing to grow the registry through community curation, improving the DOME score definition and encouraging publishers to adopt DOME standards, promoting transparency and reproducibility of ML in the life sciences.
machine learning; standards; transparency; reproducibility
Sector BIO/11 - Molecular Biology
Sector BIO/10 - Biochemistry
14 Aug 2024
https://arxiv.org/abs/2408.07721
Files in this record:
2408.07721v1.pdf (open access)
Description: Preprint
Type: Pre-print (manuscript submitted to the publisher)
Size: 795.06 kB
Format: Adobe PDF
Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/2434/1087728