Machine learning has accelerated protein design and enabled more efficient and accurate modeling of protein-ligand interfaces. Because biological systems are complex, selecting optimal candidates from the heterogeneous outputs of generative protein design tools remains a persistent challenge. In this work, we introduce a consensus ranking framework that integrates five state-of-the-art inverse folding models (ProteinMPNN, LigandMPNN, ESM-IF1, CARBonAra, and ProRefiner) applied to 25,716 curated protein-ligand complexes from the BioLip database. Our approach frames design selection as a supervised learning-to-rank problem and uses a LightGBM-based LambdaMART model to fuse heterogeneous scoring features into a unified ranking. We show that consensus-ranked sequences outperform individual model selections in stability, binding affinity, and structural fidelity, as evaluated using Schrödinger and MOE free energy difference calculations. In a case study on three enzymes (NOV1, CYP153A, and LCD), our method consistently improves design quality, suggesting that consensus ranking can significantly enhance the success rate and efficiency of AI-driven protein engineering.

Benchmarking and Consensus Ranking of Inverse Folding Models for Protein-Ligand Interface Design / Y. Wei, U. Guerrini, I. Eberini. In: BCB Companion '25: Companion / edited by M. Xinghua Shi, X. Qian. [s.l.]: ACM, 2025. ISBN 979-8-4007-2222-6. pp. 1-7 (16th International Conference on Bioinformatics, Computational Biology and Health Informatics, Philadelphia, 2025) [10.1145/3768322.3769031].

Benchmarking and Consensus Ranking of Inverse Folding Models for Protein-Ligand Interface Design

Y. Wei; U. Guerrini; I. Eberini
2025

English
Machine Learning; Protein Design
Sector BIOS-07/A - Biochemistry
Conference contribution
Yes, but type not specified
Scientific publication
   Metal-containing Radical Enzymes (MetRaZymes)
   MetRaZymes
   EUROPEAN COMMISSION
   101073546
Volume with international distribution
Gold
Files in this item:
Benchmarking and Consensus Ranking of Inverse Folding Models for Protein-Ligand Interface Design.pdf
Access: open access
Type: Publisher's version/PDF
License: Creative Commons
Size: 696.78 kB
Format: Adobe PDF

Use this identifier to cite or link to this document: https://hdl.handle.net/2434/1203255