Fine-tuning of conditional Transformers improves in silico enzyme prediction and generation / M. Nicolini, E. Saitto, R.E. Jimenez Franco, E. Cavalleri, A.J. Galeano Alfonso, D. Malchiodi, A. Paccanaro, P.N. Robinson, E. Casiraghi, G. Valentini. - In: COMPUTATIONAL AND STRUCTURAL BIOTECHNOLOGY JOURNAL. - ISSN 2001-0370. - 27:(2025), pp. 1318-1334. [10.1016/j.csbj.2025.03.037]

Fine-tuning of conditional Transformers improves in silico enzyme prediction and generation

M. Nicolini (first author); E. Cavalleri; D. Malchiodi; E. Casiraghi (penultimate author); G. Valentini (last author)
2025

Abstract

We introduce Finenzyme, a Protein Language Model (PLM) for the in silico modeling of enzymes that combines three learning strategies: transfer learning from a decoder-based Transformer, conditional learning using specific functional keywords, and fine-tuning. Our experiments show that Finenzyme significantly enhances generalist PLMs such as ProGen for the in silico prediction and generation of enzymes belonging to specific Enzyme Commission (EC) categories. Our in silico experiments demonstrate that Finenzyme-generated sequences can diverge from natural ones while retaining similar predicted tertiary structure, predicted functions, and the active sites of their natural counterparts. We show that embeddings of the generated sequences, computed by both Finenzyme and ESMFold, closely resemble those of natural sequences, making them suitable for downstream tasks such as EC classification. Clustering analysis based on the primary and predicted tertiary structure of sequences reveals that the generated enzymes form clusters that largely overlap with those of natural enzymes. Overall, these in silico validation experiments indicate that Finenzyme effectively captures the structural and functional properties of target enzymes and could support targeted enzyme engineering tasks.
Large Language Models; Protein Language Models; Fine-tuning of Large Language Models; Conditional Transformers; In silico enzyme design and modeling
Academic sector INFO-01/A - Informatica (Computer Science)
2025
Article (author)
Files in this item:
1-s2.0-S2001037025001072-main.pdf (open access)
Description: Research Article
Type: Publisher's version/PDF
Size: 6.56 MB
Format: Adobe PDF

Use this identifier to cite or link to this document: https://hdl.handle.net/2434/1157185