How Instruction-Tuning Imparts Length Control: A Cross-Lingual Mechanistic Analysis / E. Rocchetti, A. Ferrara. - (2025 Sep 02). [10.48550/arXiv.2509.02075]

How Instruction-Tuning Imparts Length Control: A Cross-Lingual Mechanistic Analysis

E. Rocchetti (first author); A. Ferrara (last author)
2025

Abstract

Adhering to explicit length constraints, such as generating text with a precise word count, remains a significant challenge for Large Language Models (LLMs). This study investigates the differences between foundation models and their instruction-tuned (IT) counterparts on length-controlled text generation in English and Italian. We analyze both performance and internal component contributions using Cumulative Weighted Attribution, a metric derived from Direct Logit Attribution. Our findings reveal that instruction-tuning substantially improves length control, primarily by specializing components in deeper model layers. Specifically, attention heads in the later layers of IT models show increasingly positive contributions, particularly in English. In Italian, while attention contributions are more attenuated, final-layer MLPs exhibit a stronger positive role, suggesting a compensatory mechanism. These results indicate that instruction-tuning reconfigures later layers for task adherence, with component-level strategies potentially adapting to linguistic context.
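For context, the sketch below illustrates the general idea behind Direct Logit Attribution (DLA), from which the paper's metric is derived: because a transformer's final logits are (approximately) linear in the residual stream, a target token's logit decomposes into per-component contributions from individual attention heads and MLPs. This is a minimal NumPy sketch under assumed shapes; the function names, the linear treatment of the final LayerNorm, and the cumulative weighting shown here are illustrative assumptions, not the paper's actual definition of Cumulative Weighted Attribution.

    import numpy as np

    def direct_logit_attribution(component_writes, W_U, target_token_id):
        # component_writes: (n_components, d_model); each row is the vector a
        # single attention head or MLP adds to the residual stream at the
        # final position (assumed to already account for the final LayerNorm,
        # treated as linear, as is common in DLA).
        # W_U: (d_model, vocab_size) unembedding matrix.
        # Returns (n_components,): each component's direct contribution to
        # the target token's logit, exploiting the linearity of the logit
        # in the residual stream.
        return component_writes @ W_U[:, target_token_id]

    def cumulative_weighted_attribution(per_step_attributions, weights):
        # One plausible reading of a "cumulative weighted" variant of DLA:
        # accumulate each component's per-step attribution over generation
        # steps under a step-level weighting. `weights` is a hypothetical
        # placeholder; the paper's exact weighting scheme is not shown here.
        # per_step_attributions: (n_steps, n_components); weights: (n_steps,)
        return (weights[:, None] * per_step_attributions).sum(axis=0)

In practice, the per-component residual-stream writes are typically collected with activation hooks (for example via a library such as TransformerLens) rather than computed by hand.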
Keywords: large language models; mechanistic interpretability; constrained text generation
Disciplinary sector: INFO-01/A - Computer Science
2 Sep 2025
http://arxiv.org/abs/2509.02075v1
Files in this record:

File: arXiv 2509.02075.pdf (open access)
Description: How Instruction-Tuning Imparts Length Control: A Cross-Lingual Mechanistic Analysis
Type: Pre-print (manuscript submitted to the publisher)
License: Creative Commons
Size: 458.21 kB
Format: Adobe PDF

Documents in IRIS are protected by copyright, and all rights are reserved unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/2434/1189259