Do It for HER: First-Order Temporal Logic Reward Specification in Reinforcement Learning

Olivieri, P.; Lasca, F.; Gianola, A.; Papini, M.

doi:10.1609/aaai.v40i29.39645

In this work, we propose a novel framework for the logical specification of non-Markovian rewards in Markov Decision Processes (MDPs) with large state spaces. Our approach leverages Linear Temporal Logic Modulo Theories over finite traces (LTLfMT), a more expressive extension of classical temporal logic in which predicates are first-order formulas of arbitrary first-order theories rather than simple Boolean variables. This enhanced expressiveness enables the specification of complex tasks over unstructured and heterogeneous data domains, promoting a unified and reusable framework that eliminates the need for manual predicate encoding. However, the increased expressive power of LTLfMT introduces additional theoretical and computational challenges compared to standard LTLf specifications. We address these challenges from a theoretical standpoint, identifying a fragment of LTLfMT that is tractable yet sufficiently expressive for reward specification in an infinite-state-space context. From a practical perspective, we introduce a method based on reward machines and Hindsight Experience Replay (HER) to translate first-order logic specifications and address reward sparsity. We evaluate this approach in a continuous-control setting using Non-Linear Arithmetic Theory, showing that it enables natural specification of complex tasks. Experimental results show that a tailored implementation of HER is essential for solving tasks with complex goals.
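As an illustration of the idea summarized in the abstract, the following is a minimal sketch of a reward machine whose transitions are guarded by non-linear arithmetic predicates evaluated directly on a continuous state, rather than by hand-encoded Boolean propositions. It is not the authors' implementation: the names `RewardMachine` and `in_disc`, the 2-D state, the regions, and the reward values are all hypothetical, and the HER component is not covered here.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple

State = Tuple[float, float]      # hypothetical 2-D continuous observation (x, y)
Guard = Callable[[State], bool]  # first-order predicate evaluated on one state


def in_disc(cx: float, cy: float, r: float) -> Guard:
    """Non-linear arithmetic predicate: (x - cx)^2 + (y - cy)^2 <= r^2."""
    return lambda s: (s[0] - cx) ** 2 + (s[1] - cy) ** 2 <= r ** 2


@dataclass
class RewardMachine:
    # transitions[u] is an ordered list of (guard, next_state, reward)
    transitions: Dict[int, List[Tuple[Guard, int, float]]]
    initial: int = 0

    def step(self, u: int, s: State) -> Tuple[int, float]:
        """Take the first transition from u whose guard holds on s."""
        for guard, u_next, reward in self.transitions.get(u, []):
            if guard(s):
                return u_next, reward
        return u, 0.0  # no guard fired: stay in u, no reward

    def run(self, trace: List[State]) -> Tuple[int, float]:
        """Feed a finite trace through the machine; return (final state, total reward)."""
        u, total = self.initial, 0.0
        for s in trace:
            u, r = self.step(u, s)
            total += r
        return u, total


# Assumed task: "eventually reach region A, and afterwards eventually reach region B".
rm = RewardMachine(
    transitions={
        0: [(in_disc(1.0, 0.0, 0.5), 1, 0.0)],  # region A reached
        1: [(in_disc(0.0, 2.0, 0.5), 2, 1.0)],  # region B reached after A
    }
)

trace = [(0.0, 0.0), (0.9, 0.1), (0.5, 1.0), (0.1, 1.8)]
final_state, total_reward = rm.run(trace)
```

The reward depends on the history of the trace and not only on the current state: a trace that visits region B without first visiting region A leaves the machine in its initial state and earns no reward, which is what makes the reward non-Markovian with respect to the original state space.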

Do It for HER: First-Order Temporal Logic Reward Specification in Reinforcement Learning / P. Olivieri, F. Lasca, A. Gianola, M. Papini. (Proceedings of the ... AAAI Conference on Artificial Intelligence). In: Proceedings of the 40th Annual AAAI Conference on Artificial Intelligence. [s.l.]: AAAI Press, 2026. pp. 24613-24621. Presented at the 40th AAAI Conference on Artificial Intelligence, Singapore, 2026. [10.1609/aaai.v40i29.39645]

Olivieri, Pierriccardo; Lasca, Fausto; Gianola, Alessandro; Papini, M.

	Scientific-disciplinary sectors of the contribution (valid from 09/05/2024)
				Sector INFO-01/A - Computer Science
				Sector IINF-05/A - Information Processing Systems
	Publication date
				2026
	DOI
				https://dx.doi.org/10.1609/aaai.v40i29.39645
	URL
				https://ojs.aaai.org/index.php/AAAI/article/view/39645
	Type
				Book Part (author)
	Appears in types:
				03 - Contribution in volume

Files in this item:

File	Size	Format
39645-Article Text-43736-1-2-20260314.pdf (open access; type: Publisher's version/PDF; license: Creative Commons)	171.71 kB	Adobe PDF

Use this identifier to cite or link to this document: https://hdl.handle.net/2434/1231423

