Self-Supervised CLIP-Guided for Few-Shot Industrial Anomaly Detection / Y. Chen, Y. Xu, T. Wang, Y. Zhai, K. Tan, J. Zhou, P. Coscia, A. Genovese, C.L.P. Chen. - In: IEEE TRANSACTIONS ON INSTRUMENTATION AND MEASUREMENT. - ISSN 0018-9456. - (2026), pp. 1-16. [Epub ahead of print] [10.1109/tim.2026.3661696]
Self-Supervised CLIP-Guided for Few-Shot Industrial Anomaly Detection
P. Coscia; A. Genovese
2026
Abstract
Few-shot industrial anomaly detection aims to identify unseen defects using only a limited number of normal samples. However, most existing approaches still rely heavily on auxiliary industrial datasets for training. In this paper, we propose a novel self-supervised CLIP-guided framework for few-shot industrial anomaly detection that eliminates the need for auxiliary industrial data. Specifically, we first introduce a pseudo-anomaly generation strategy that synthesizes both structural and textural anomalies. Then, leveraging the cross-modal semantic understanding capability of CLIP, we contrast multi-scale visual features with learnable textual prompts to achieve anomaly localization grounded in language semantics. Inspired by the human cognitive process of identifying anomalies through reference comparison, we introduce a support set composed of a few normal samples and perform semantic-level feature alignment with the query set via the CLIP visual encoder, thereby enhancing anomaly discrimination. Furthermore, we introduce an Adapter to alleviate the semantic offset between CLIP's text and image modalities in industrial scenarios and to enhance the model's robustness to spatial-structure differences between the query and support sets. Extensive experiments on the MVTec AD, VisA, BTAD, and MPDD datasets demonstrate that our method achieves competitive results under the few-shot setting. Moreover, its effectiveness and deployability are validated through a real-world application in battery spot-welding defect inspection. The code is available at https://github.com/YiKuiZhai/SCF-AD.
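The two sketches below are not the authors' implementation (their official code is linked above); they only illustrate, in minimal Python form, two ideas named in the abstract. The first assumes a simple cut-paste and noise-blending scheme as a stand-in for the structural and textural pseudo-anomaly generation; the patch size, blending ratio, and NumPy pipeline are illustrative assumptions.

```python
# Minimal sketch (illustrative, not the authors' strategy): a structural pseudo-anomaly
# pastes a patch copied from elsewhere in the image; a textural one blends noise into
# a random region. Returns the corrupted image and its pixel-level pseudo-label mask.
import numpy as np

def pseudo_anomaly(normal: np.ndarray, rng: np.random.Generator, size: int = 32):
    """normal: (H, W, 3) uint8 normal image; returns (augmented image, binary mask)."""
    img = normal.copy()
    h, w = img.shape[:2]
    mask = np.zeros((h, w), dtype=np.uint8)
    y, x = rng.integers(0, h - size), rng.integers(0, w - size)
    if rng.random() < 0.5:
        # Structural anomaly: copy a patch from another location and paste it here.
        ys, xs = rng.integers(0, h - size), rng.integers(0, w - size)
        img[y:y + size, x:x + size] = img[ys:ys + size, xs:xs + size]
    else:
        # Textural anomaly: blend uniform noise into the region to corrupt local texture.
        noise = rng.integers(0, 256, (size, size, 3), dtype=np.uint8)
        blended = 0.5 * img[y:y + size, x:x + size] + 0.5 * noise
        img[y:y + size, x:x + size] = blended.astype(np.uint8)
    mask[y:y + size, x:x + size] = 1
    return img, mask
```

The second sketch shows language-guided anomaly localization in the spirit of contrasting visual features with textual prompts: patch tokens from a frozen CLIP vision encoder are projected into the joint embedding space and scored against a "normal" and an "anomalous" prompt. The Hugging Face `transformers` checkpoint, the fixed prompt wording (the paper uses learnable prompts), and the reuse of the CLS projection for patch tokens are all assumptions of this sketch.

```python
# Minimal sketch (illustrative, not the authors' model): per-patch CLIP similarity
# against a normal vs. anomalous text prompt, yielding a coarse anomaly map.
import torch
from PIL import Image
from transformers import CLIPModel, CLIPProcessor

model = CLIPModel.from_pretrained("openai/clip-vit-base-patch16").eval()
processor = CLIPProcessor.from_pretrained("openai/clip-vit-base-patch16")

# Fixed prompts stand in for the learnable textual prompts described in the abstract.
prompts = ["a photo of a flawless object", "a photo of a damaged object"]

@torch.no_grad()
def anomaly_map(image: Image.Image) -> torch.Tensor:
    """Return a (14 x 14) patch-level anomaly probability map for a 224x224 input."""
    text_in = processor(text=prompts, return_tensors="pt", padding=True)
    text_emb = model.get_text_features(**text_in)                 # (2, 512)
    text_emb = text_emb / text_emb.norm(dim=-1, keepdim=True)

    img_in = processor(images=image, return_tensors="pt")
    vis_out = model.vision_model(pixel_values=img_in["pixel_values"])
    tokens = vis_out.last_hidden_state[:, 1:, :]                  # drop CLS token
    tokens = model.vision_model.post_layernorm(tokens)
    patch_emb = model.visual_projection(tokens)                   # project to joint space
    patch_emb = patch_emb / patch_emb.norm(dim=-1, keepdim=True)

    # Softmax over the two prompts gives a per-patch probability of "anomalous".
    logits = 100.0 * patch_emb @ text_emb.T                       # (1, N, 2)
    scores = logits.softmax(dim=-1)[..., 1].squeeze(0)            # (N,)
    side = int(scores.numel() ** 0.5)                             # 14 for ViT-B/16 at 224 px
    return scores.reshape(side, side)
```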
| File | Type | Access | License | Size | Format |
|---|---|---|---|---|---|
| tim26_compressed.pdf | Post-print, accepted manuscript (version accepted by the publisher) | Restricted access | No license | 1.02 MB | Adobe PDF |