Structure-based virtual screening approaches such as molecular docking rely on accurately identifying and precisely calculating binding pockets to search efficiently for potential ligands. In this paper, we introduce GENEOnet, a machine learning model designed for volumetric protein pocket detection that employs Group Equivariant Non-Expansive Operators (GENEOs). These operators reduce model complexity and allow domain knowledge to be integrated in a more informed way: for each operator, one selects the specific physical and chemical properties it focuses on and how it should react to them. Compared with other methods in this field, GENEOnet has fewer model parameters, which reduces training costs, and offers greater explainability, since the parameters can be easily interpreted. GENEOnet processes the empty space within a protein by converting it into a 3D grid of uniform blocks, known as ‘voxels’. It then identifies regions of the grid with an output value above a threshold, thus producing a list of predicted pockets, ranked according to the model’s average output value. Our experimental results show that GENEOnet performs robustly even with small training datasets of 200 proteins and surpasses other established state-of-the-art methods in various metrics. Specifically, on our PDBbind test set, GENEOnet’s score indicating the probability that the top-ranked pocket is the correct one is 0.764, compared to 0.702 for P2Rank, the next best performing algorithm. Moreover, a case study considering various ABL1 kinase conformations demonstrates the excellent agreement between GENEOnet’s predictions and experimental sites. GENEOnet is available as a web service at https://geneonet.exscalate.eu, where users can access the pre-trained model for detecting and ranking protein cavities.
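
The thresholding and ranking step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function name `rank_pockets`, the toy grid values, the threshold of 0.5 and the choice of 6-connectivity (face-sharing voxels) are all assumptions made for illustration. In GENEOnet the voxel values would come from the GENEO-based model, which is not reproduced here.

```python
from collections import deque

def rank_pockets(grid, threshold):
    """Group voxels with value > threshold into face-connected regions
    and rank the regions by their mean voxel value (highest first).

    `grid` is a nested list indexed as grid[x][y][z] (hypothetical layout).
    """
    nx, ny, nz = len(grid), len(grid[0]), len(grid[0][0])
    # Assumption: two voxels belong to the same pocket if they share a face.
    steps = [(1, 0, 0), (-1, 0, 0), (0, 1, 0), (0, -1, 0), (0, 0, 1), (0, 0, -1)]
    seen = set()
    pockets = []
    for x in range(nx):
        for y in range(ny):
            for z in range(nz):
                if (x, y, z) in seen or grid[x][y][z] <= threshold:
                    continue
                # Breadth-first search collects one connected region.
                queue = deque([(x, y, z)])
                seen.add((x, y, z))
                voxels = []
                while queue:
                    cx, cy, cz = queue.popleft()
                    voxels.append((cx, cy, cz))
                    for dx, dy, dz in steps:
                        px, py, pz = cx + dx, cy + dy, cz + dz
                        inside = 0 <= px < nx and 0 <= py < ny and 0 <= pz < nz
                        if (inside and (px, py, pz) not in seen
                                and grid[px][py][pz] > threshold):
                            seen.add((px, py, pz))
                            queue.append((px, py, pz))
                score = sum(grid[i][j][k] for i, j, k in voxels) / len(voxels)
                pockets.append({"voxels": voxels, "score": score})
    # Rank by average output value, as the abstract describes.
    pockets.sort(key=lambda p: p["score"], reverse=True)
    return pockets

# Toy 3x3x3 grid with made-up output values.
grid = [[[0.0] * 3 for _ in range(3)] for _ in range(3)]
grid[0][0][0] = 0.9
grid[0][0][1] = 0.7
grid[2][2][2] = 0.95
pockets = rank_pockets(grid, threshold=0.5)
```

On this toy grid the sketch yields two candidate pockets: the single voxel at (2, 2, 2) ranks first, and the two-voxel region at the origin ranks second because its average value is lower.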

GENEOnet: a breakthrough in protein binding pocket detection using group equivariant non-expansive operators / G. Bocchi, P. Frosini, A. Micheletti, A. Pedretti, G. Palermo, D. Gadioli, C. Gratteri, F. Lunghini, A.D. Biswas, P.F.W. Stouten, A.R. Beccari, A. Fava, C. Talarico. - In: SCIENTIFIC REPORTS. - ISSN 2045-2322. - 15:1(2025 Oct 03), pp. 34597.1-34597.15. [10.1038/s41598-025-18132-5]

GENEOnet: a breakthrough in protein binding pocket detection using group equivariant non-expansive operators

G. Bocchi (first author); A. Micheletti; A. Pedretti; A.D. Biswas
2025

Keywords: Equivariance; GENEO; Molecular docking; Pocket detection
Sector MATH-03/B - Probability and mathematical statistics
Sector STAT-01/A - Statistics
Sector CHEM-07/A - Pharmaceutical chemistry
Sector INFO-01/A - Computer science
3 Oct 2025
Article (author)
Files in this record:

unpaywall-bitstream--1003552863.pdf
Access: open access
Description: Main article
Type: Publisher's version/PDF
License: Creative Commons
Size: 3.78 MB
Format: Adobe PDF

41598_2025_18132_MOESM1_ESM.pdf
Access: open access
Description: Supplementary information
Type: Other
License: Creative Commons
Size: 4.05 MB
Format: Adobe PDF
Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/2434/1188696