IRIS Institutional Research Information System - AIR Archivio Istituzionale della Ricerca

How LLMs Follow Instructions: Skillful Coordination, Not a Universal Mechanism / E. Rocchetti, A. Ferrara. - (2026 Apr 07).

How LLMs Follow Instructions: Skillful Coordination, Not a Universal Mechanism

E. Rocchetti (first author); A. Ferrara (last author)

2026

Abstract

Instruction tuning is commonly assumed to endow language models with a domain-general ability to follow instructions, yet the underlying mechanism remains poorly understood. Does instruction-following rely on a universal mechanism or compositional skill deployment? We investigate this through diagnostic probing across nine diverse tasks in three instruction-tuned models. Our analysis provides converging evidence against a universal mechanism. First, general probes trained across all tasks consistently underperform task-specific specialists, indicating limited representational sharing. Second, cross-task transfer is weak and clustered by skill similarity. Third, causal ablation reveals sparse asymmetric dependencies rather than shared representations. Tasks also stratify by complexity across layers, with structural constraints emerging early and semantic tasks emerging late. Finally, temporal analysis shows constraint satisfaction operates as dynamic monitoring during generation rather than pre-generation planning. These findings indicate that instruction-following is better characterized as skillful coordination of diverse linguistic capabilities rather than deployment of a single abstract constraint-checking process.

Keywords: Computer Science - Artificial Intelligence

Scientific-disciplinary sectors of the pre-print (valid from 9 May 2024): INFO-01/A - Informatica (Computer Science)

Pre-print deposit date: 7 Apr 2026

Pre-print URL: http://arxiv.org/abs/2604.06015v1

Appears in types: 24 - Pre-print

Files in this record:

File	Size	Format
2604.06015v1 (1).pdf (open access; type: Pre-print, manuscript submitted to the publisher; license: Creative Commons)	842.21 kB	Adobe PDF

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/2434/1234035
