GMTNet: Dense Object Detection via Global Dynamically Matching Transformer Network / C. Dong, C. Wang, Y. Zhai, Y. Li, J. Zhou, P. Coscia, A. Genovese, V. Piuri, F. Scotti. - In: IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS FOR VIDEO TECHNOLOGY. - ISSN 1051-8215. - (2024), pp. 1-14. [Epub ahead of print] [10.1109/tcsvt.2024.3522661]
GMTNet: Dense Object Detection via Global Dynamically Matching Transformer Network
P. Coscia; A. Genovese; V. Piuri; F. Scotti
2024
Abstract
In recent years, object detection models have been extensively applied across various industries, leveraging learned samples to recognize and locate objects. However, industrial environments present unique challenges, including complex backgrounds, dense object distributions, object stacking, and occlusion. To address these challenges, we propose the Global Dynamic Matching Transformer Network (GMTNet). GMTNet partitions images into blocks and employs a sliding window approach to capture information from each block and their interrelationships, mitigating background interference while acquiring global information for dense object recognition. By reweighting key-value pairs in multi-scale feature maps, GMTNet enhances global information relevance and effectively handles occlusion and overlap between objects. Furthermore, we introduce a dynamic sample matching method to tackle the issue of excessive candidate boxes in dense detection tasks. This method adaptively adjusts the number of matched positive samples according to the specific detection task, enabling the model to reduce the learning of irrelevant features and simplify post-processing. Experimental results demonstrate that GMTNet excels in dense detection tasks and outperforms current mainstream algorithms. The code will be available at http://github.com/yikuizhai/GMTNet.
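The abstract only sketches the two core mechanisms, so the PyTorch-style snippets below illustrate, under stated assumptions, what they could look like; they are not the authors' released implementation (which is announced at the GitHub link above).

First, block partitioning with per-window self-attention: the abstract describes partitioning the image into blocks and using a sliding window so that attention stays local to each block while block relationships are still modeled. A minimal sketch, assuming non-overlapping windows and a standard multi-head attention layer; the class name WindowSelfAttention and the window size are placeholders, not the paper's module:

```python
# Hypothetical sketch of block (window) partitioning with per-window attention.
import torch
import torch.nn as nn

class WindowSelfAttention(nn.Module):
    def __init__(self, dim, window=7, heads=4):
        super().__init__()
        self.window = window
        self.attn = nn.MultiheadAttention(dim, heads, batch_first=True)

    def forward(self, x):  # x: (B, H, W, C), H and W divisible by `window`
        B, H, W, C = x.shape
        w = self.window
        # Partition the feature map into non-overlapping w x w blocks.
        x = x.view(B, H // w, w, W // w, w, C)
        x = x.permute(0, 1, 3, 2, 4, 5).reshape(-1, w * w, C)
        # Self-attention inside each block limits background interference
        # to the local window.
        x, _ = self.attn(x, x, x)
        # Reverse the partition back to (B, H, W, C).
        x = x.view(B, H // w, W // w, w, w, C)
        x = x.permute(0, 1, 3, 2, 4, 5).reshape(B, H, W, C)
        return x
```

Second, dynamic sample matching: the abstract states that the number of matched positive samples is adapted to the detection task, but does not give the rule. The sketch below uses a generic dynamic-k heuristic (in the spirit of SimOTA) purely as an illustration; dynamic_k_matching, cost, ious, and max_k are hypothetical names:

```python
# Hypothetical sketch of dynamic positive-sample matching (dynamic-k).
import torch

def dynamic_k_matching(cost, ious, max_k=10):
    """Assign each ground-truth box a dynamic number of positive candidates.

    cost: (num_gt, num_anchors) matching cost (e.g., cls + box losses)
    ious: (num_gt, num_anchors) pairwise IoU between GT boxes and candidates
    Returns a boolean (num_gt, num_anchors) matrix of positive assignments.
    """
    num_gt, num_anchors = cost.shape
    matching = torch.zeros_like(cost, dtype=torch.bool)

    # Estimate k per GT from the IoU mass of its best candidates, so densely
    # covered objects receive more positives than sparsely covered ones.
    topk_ious, _ = torch.topk(ious, min(max_k, num_anchors), dim=1)
    dynamic_ks = torch.clamp(topk_ious.sum(dim=1).int(), min=1)

    for g in range(num_gt):
        k = int(dynamic_ks[g])
        _, idx = torch.topk(cost[g], k, largest=False)  # lowest-cost candidates
        matching[g, idx] = True

    # Resolve candidates claimed by several GTs: keep only the cheapest match.
    multi = matching.sum(dim=0) > 1
    if multi.any():
        best_gt = cost[:, multi].argmin(dim=0)
        matching[:, multi] = False
        matching[best_gt, torch.where(multi)[0]] = True
    return matching
```

In both cases the actual GMTNet modules may differ substantially; the sketches only make the abstract's description concrete.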
File | Access | Type | Size | Format
---|---|---|---|---
csvt24.pdf | Open access | Post-print, accepted manuscript (version accepted by the publisher) | 8.05 MB | Adobe PDF
Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.