DLGCNet: Multimodal remote sensing semantic segmentation via dual diagonal low-rank adaptation and graph convolutional feature fusion

Zeng, J.; Xu, J.; Jia, X.; Deng, B.; Zhai, Y.; Qin, C.; Coscia, P.; Genovese, A.; Tian, X.

doi:10.1016/j.knosys.2026.116299

Multimodal remote sensing images (MRSI) carry complementary cross-modal information owing to their heterogeneous characteristics, enabling more detailed and effective scene interpretation than single-modality data. To address challenges in existing multimodal fusion methods, this paper proposes DLGCNet, a multimodal segmentation network that jointly optimizes a dual diagonal low-rank adaptation (D2LoRA) training framework for visual foundation models (VFMs) and a graph convolutional feature fusion (GCFF) module. To better adapt to the data distributions of MRSI, D2LoRA introduces two trainable diagonal matrices that perform row-wise and column-wise transformations on the low-rank weight matrix, thereby improving the VFM’s adaptability and feature extraction performance for MRSI. To overcome the limited cross-modal modeling capacity of convolutional neural network-based fusion and the excessive complexity of transformer-based fusion, GCFF dynamically adjusts the graph Laplacian according to modal information and establishes long-range cross-modal dependencies with lower computational complexity than transformer-based fusion methods. Experimental results show that DLGCNet achieves the best segmentation results compared with current state-of-the-art multimodal data fusion methods on three datasets: Potsdam, Vaihingen, and WHU-OPT-SAR. The source code is available at https://github.com/2023xjh2023/DLGCNet.
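
The abstract describes D2LoRA only in words, so the following is a minimal NumPy sketch of one plausible reading, not the authors' implementation: a LoRA-style low-rank update B A is rescaled row-wise and column-wise by two diagonal matrices before being added to a frozen weight. The function name d2lora_delta, the variable names, the shapes, and the exact form diag(row_scale) (B A) diag(col_scale) are all assumptions made for illustration; the repository linked above is the authoritative reference.

```python
import numpy as np


def d2lora_delta(B, A, row_scale, col_scale):
    """Hypothetical update: diag(row_scale) @ (B @ A) @ diag(col_scale).

    Broadcasting applies the two diagonal matrices without materializing them.
    """
    low_rank = B @ A  # shape (d_out, d_in), rank at most r
    return row_scale[:, None] * low_rank * col_scale[None, :]


rng = np.random.default_rng(0)
d_out, d_in, r = 6, 4, 2              # illustrative sizes, not from the paper

W0 = rng.normal(size=(d_out, d_in))   # frozen pretrained weight (assumed)
B = rng.normal(size=(d_out, r))       # trainable low-rank factor
A = rng.normal(size=(r, d_in))        # trainable low-rank factor
row_scale = np.ones(d_out)            # trainable diagonal, row-wise scaling
col_scale = np.ones(d_in)             # trainable diagonal, column-wise scaling

# With both diagonals set to ones, this reduces to a plain LoRA-style update.
W_adapted = W0 + d2lora_delta(B, A, row_scale, col_scale)

# Trainable parameters under this reading: two factors plus two diagonals.
n_trainable = r * (d_out + d_in) + d_out + d_in
```

Under this assumed form, the two diagonals add only d_out + d_in parameters on top of the low-rank factors and cannot raise the rank of the update above r.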

DLGCNet: Multimodal remote sensing semantic segmentation via dual diagonal low-rank adaptation and graph convolutional feature fusion / J. Zeng, J.X.. - In: KNOWLEDGE-BASED SYSTEMS. - ISSN 0950-7051. - 347:(2026 Jul 19), pp. 116299.1-116299.13. [10.1016/j.knosys.2026.116299]

Zeng, Junying; Xu, Jiahua; Jia, Xudong; Deng, Bin; Zhai, Yikui; Qin, Chuanbo; P. Coscia; A. Genovese; Tian, Xiaolin

2026

	Keywords
	
				Multimodal fusion; Remote sensing; Semantic segmentation; Parameter-efficient fine-tuning
			
	Scientific-disciplinary sectors of the article (valid from 9 May 2024)
	
				Sector INFO-01/A - Informatica (Computer Science)
			
	Publication date
	
				19 Jul 2026
			
	Ahead-of-print or print date
	
				25 May 2026
			
	Journal in ANCE
	
				KNOWLEDGE-BASED SYSTEMS
			
	DOI
	
				https://dx.doi.org/10.1016/j.knosys.2026.116299
			
	Type
	
				Article (author)
			
	Appears in types:
	
				01 - Journal article

Files in this item:

File	Size	Format
1-s2.0-S0950705126010257-main.pdf (restricted access; type: Publisher's version/PDF; license: none)	4.77 MB	Adobe PDF

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/2434/1249949

IRIS Institutional Research Information System - AIR Archivio Istituzionale della Ricerca