LIGAND- AND STRUCTURE-BASED METHODS FOR IN SILICO TARGET IDENTIFICATION / A. Pisati ; tutor: G. Vistoli ; co-tutor: A. Pedretti ; PhD coordinator: G. Vistoli. Dipartimento di Scienze Farmaceutiche, 2026 Apr 17. 38th cycle, Academic Year 2025/2026.
LIGAND- AND STRUCTURE-BASED METHODS FOR IN SILICO TARGET IDENTIFICATION
A. Pisati
2026
Abstract
Understanding the interactions between small molecules and their protein targets is a central challenge in drug discovery, underpinning the development of new therapeutics and the rational design of chemical probes. Despite the availability of large-scale bioactivity data, significant variability in experimental conditions, reporting formats, and structural representations often limits the utility of these data for computational modeling and machine learning applications. To address these challenges, curated and standardized datasets are essential, providing reliable benchmarks for method development, predictive modeling, and AI-driven analyses. This thesis presents DELTA, a curated and standardized ligand–target database designed to support AI-driven analyses and predictive modeling in drug discovery. DELTA provides benchmark subsets of 200 ligands per target, evenly divided between active and inactive compounds, alongside extended collections (DELTA-X) to facilitate robust predictive modeling. The database integrates high-quality 3D structures for both targets and ligands, with detailed annotations on physiological function, pathological relevance, and binding sites. Machine learning models were developed to predict ligand activity across all protein targets contained in DELTA. Random forest models, optimized through hyperparameter tuning and feature selection, achieved robust and generalizable performance. Integration of these models into reverse screening pipelines enables efficient target prioritization, reducing computational cost while focusing on biologically relevant interactions. Overall, DELTA represents a comprehensive and versatile resource for ligand–target data, 3D structural analysis, and predictive modeling, establishing a foundation for AI-driven drug discovery, mechanistic studies, and rational experimental prioritization.

| File | Description | Type | License | Access | Size | Format |
|---|---|---|---|---|---|---|
| phd_unimi_R14161.pdf | Doctoral thesis | Publisher's version/PDF | Creative Commons | Open access | 6.04 MB | Adobe PDF |
Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.




