LIGAND- AND STRUCTURE-BASED METHODS FOR IN SILICO TARGET IDENTIFICATION / A. Pisati ; tutor: G. Vistoli ; co-tutor: A. Pedretti ; PhD coordinator: G. Vistoli. Dipartimento di Scienze Farmaceutiche, 2026 Apr 17. 38th cycle, Academic Year 2025/2026.

LIGAND- AND STRUCTURE-BASED METHODS FOR IN SILICO TARGET IDENTIFICATION

A. Pisati
2026

Abstract

Understanding the interactions between small molecules and their protein targets is a central challenge in drug discovery, underpinning the development of new therapeutics and the rational design of chemical probes. Although large-scale bioactivity data are available, significant variability in experimental conditions, reporting formats, and structural representations often limits their utility for computational modeling and machine learning applications. Curated and standardized datasets are therefore essential, because they provide reliable benchmarks for method development, predictive modeling, and AI-driven analyses. This thesis presents DELTA, a curated and standardized ligand–target database designed to support such analyses in drug discovery. DELTA provides benchmark subsets of 200 ligands per target, evenly divided between active and inactive compounds, alongside extended collections (DELTA-X) to facilitate robust predictive modeling. The database integrates high-quality 3D structures for both targets and ligands, with detailed annotations on physiological function, pathological relevance, and binding sites. Machine learning models were developed to predict ligand activity for all protein targets contained in DELTA. Random forest models, optimized through hyperparameter tuning and feature selection, achieved robust and generalizable performance. Integrating these models into reverse screening pipelines enables efficient target prioritization, reducing computational cost while focusing on biologically relevant interactions. Overall, DELTA is a comprehensive and versatile resource for ligand–target data, 3D structural analysis, and predictive modeling, establishing a foundation for AI-driven drug discovery, mechanistic studies, and rational experimental prioritization.
17 April 2026
Sector CHEM-07/A - Pharmaceutical Chemistry
Database; Machine Learning; DELTA; Activity; Prediction
VISTOLI, GIULIO
Doctoral Thesis
Files in this item:
File: phd_unimi_R14161.pdf
Access: open access
Description: Doctoral thesis
Type: Publisher's version/PDF
License: Creative Commons
Size: 6.04 MB
Format: Adobe PDF
Use this identifier to cite or link to this document: https://hdl.handle.net/2434/1242338