Spatial Reconstruction and Joint Training in Transformer Network for Cross-Domain Remote Sensing Images Semantic Segmentation

Zeng, J.; Deng, S.; Zhai, Y.; Jia, X.; Qin, C.; Coscia, P.; Genovese, A.; Piuri, V.; Scotti, F.

doi:10.1109/tgrs.2025.3599841

Recently, Unsupervised Domain Adaptation (UDA) methods have attracted considerable attention in the semantic segmentation of Remote Sensing Images (RSI). However, cross-domain RSI exhibit diverse scales, imbalanced distributions within domains, and significant inter-domain variations. In response to these challenges, we combine Spatial reconstruction and Joint training with the Transformer Network (SJT-Net). This framework introduces a spatial reconstruction method to address the issue of inconsistent ground sampling distances in cross-domain RSI, which is rarely considered in existing approaches. Transferring domain knowledge at a similar spatial scale improves the spatial representation ability of UDA models. Unlike traditional adversarial training, which uses ResNet for feature extraction, SJT-Net employs SegFormer, which enhances the model’s ability to capture in-class features across domains and improves global dependency modeling. Passing these refined features to the discriminator allows for more precise feature-level domain alignment. To enhance feature decoding, an interactive global-local decoder is constructed to efficiently capture both global relationships and local details of landform objects. Our framework leverages adversarial training to generate highly confident model weights and pseudo-labels for self-training in the target domain. Through iterative updates, the model’s generalization capability is gradually improved, eventually achieving optimal segmentation performance. Experimental results demonstrate that SJT-Net outperforms current UDA approaches and achieves state-of-the-art (SOTA) segmentation accuracy. The repository can be accessed at https://github.com/AnsonD0820/SJT-Net.
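The abstract does not say how spatial reconstruction or pseudo-label selection is implemented, so the following is only a minimal sketch of two ideas it names, under stated assumptions: that matching ground sampling distances (GSD) can be approximated by resampling an image to a common GSD, and that "highly confident" pseudo-labels can be approximated by thresholding the top class probability. The function names, the GSD values, the 0.9 threshold, and the ignore index 255 are all hypothetical and are not taken from the paper or its repository.

```python
def rescale_size(height, width, source_gsd, target_gsd):
    """Pixel size an image needs so that its GSD becomes target_gsd.

    GSD is metres per pixel, so a coarser target GSD means fewer pixels.
    """
    scale = source_gsd / target_gsd
    return max(1, round(height * scale)), max(1, round(width * scale))


def resize_nearest(image, new_h, new_w):
    """Nearest-neighbour resize of a 2-D list (rows of pixel values)."""
    old_h, old_w = len(image), len(image[0])
    return [
        [
            image[min(old_h - 1, int(r * old_h / new_h))][
                min(old_w - 1, int(c * old_w / new_w))
            ]
            for c in range(new_w)
        ]
        for r in range(new_h)
    ]


def pseudo_labels(probabilities, threshold=0.9, ignore_index=255):
    """Keep the arg-max class only where the model is confident.

    probabilities: one list of class probabilities per pixel.
    Pixels below the (assumed) threshold get ignore_index so that a
    self-training loss can skip them.
    """
    labels = []
    for probs in probabilities:
        best = max(range(len(probs)), key=probs.__getitem__)
        labels.append(best if probs[best] >= threshold else ignore_index)
    return labels


if __name__ == "__main__":
    # Hypothetical GSDs: bring a 0.05 m/pixel tile to 0.10 m/pixel.
    print(rescale_size(512, 512, source_gsd=0.05, target_gsd=0.10))
    print(resize_nearest([[1, 2], [3, 4]], 4, 4))
    print(pseudo_labels([[0.95, 0.05], [0.6, 0.4], [0.02, 0.98]]))
```

In practice a deep learning framework's interpolation routine would replace `resize_nearest`; the pure-Python version is used here only to keep the sketch self-contained.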

Spatial Reconstruction and Joint Training in Transformer Network for Cross-Domain Remote Sensing Images Semantic Segmentation / J. Zeng, S. Deng, Y. Zhai, X. Jia, C. Qin, P. Coscia, A. Genovese, V. Piuri, F. Scotti. - In: IEEE TRANSACTIONS ON GEOSCIENCE AND REMOTE SENSING. - ISSN 0196-2892. - 63:(2025), pp. 5406618.1-5406618.18. [10.1109/tgrs.2025.3599841]

Zeng, Junying (first author); Deng, Senyao; Zhai, Yikui; Jia, Xudong; Qin, Chuanbo; P. Coscia; A. Genovese; V. Piuri (second-to-last author); F. Scotti (last author)

2025

Scientific-disciplinary sectors of the article (valid from 9 May 2024)
	Sector INFO-01/A - Computer Science
	Sector IINF-05/A - Information Processing Systems
Publication date
	2025
Ahead-of-print date or print date
	18 Aug 2025
Journal in ANCE
	IEEE TRANSACTIONS ON GEOSCIENCE AND REMOTE SENSING
DOI
	https://dx.doi.org/10.1109/tgrs.2025.3599841
Type
	Article (author)
Appears in types:
	01 - Journal article

Files in this item:

File	Size	Format	Access	Type	License
Spatial_Reconstruction_and_Joint_Training_in_Transformer_Network_for_Cross-Domain_Remote_Sensing_Images_Semantic_Segmentation (final).pdf	10.1 MB	Adobe PDF	Restricted	Publisher's version/PDF	No license

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/2434/1179962

IRIS Institutional Research Information System - AIR Archivio Istituzionale della Ricerca