
Deep neural networks compression: A comparative survey and choice recommendations / G.C. Marinó, A. Petrini, D. Malchiodi, M. Frasca. - In: NEUROCOMPUTING. - ISSN 0925-2312. - 520:(2023 Feb 01), pp. 152-170. [10.1016/j.neucom.2022.11.072]

Deep neural networks compression: A comparative survey and choice recommendations

A. Petrini (second author); D. Malchiodi (penultimate author); M. Frasca (last author)
2023

Abstract

State-of-the-art performance on several real-world problems is currently achieved by deep neural networks and, in particular, convolutional neural networks (CNNs). Such learning models exploit recent results in the field of deep learning, leading to highly performing yet very large neural networks, typically with millions to billions of parameters. As a result, such models are often redundant and excessively oversized, with a detrimental effect on the environment in terms of unnecessary energy consumption and a limitation to their deployment on low-resource devices. The need for compression techniques able to reduce the number of model parameters and their resource demand is therefore increasingly felt by the research community. In this paper we propose, to the best of our knowledge, the first extensive comparison of the main lossy and structure-preserving approaches for compressing pre-trained CNNs, applicable in principle to any existing model. Our study is intended to provide preliminary guidance in choosing the most suitable compression technique when the memory footprint of a pre-trained model needs to be reduced. Both convolutional and fully-connected layers are included in the analysis. Our experiments involved two pre-trained state-of-the-art CNNs (proposed to solve classification or regression problems) and five benchmarks, and yielded important insights into the applicability and performance of such techniques with respect to the type of layer to be compressed and the category of problem tackled.
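As a purely illustrative sketch of two of the technique families compared in the paper, connection pruning and weight quantization/sharing, the snippet below applies magnitude-based pruning followed by uniform quantization to a single layer's weight matrix. It assumes NumPy, and the sparsity and bit-width settings are hypothetical choices made for the example; it is not the implementation used in the paper.

# Illustrative sketch only: magnitude-based connection pruning followed by
# weight sharing via uniform quantization, applied to one layer's weights.
import numpy as np

def prune_by_magnitude(weights, sparsity=0.8):
    # Zero out the fraction `sparsity` of weights with smallest absolute value.
    threshold = np.quantile(np.abs(weights), sparsity)
    mask = np.abs(weights) > threshold
    return weights * mask, mask

def quantize_uniform(weights, n_bits=4):
    # Map weights onto 2**n_bits shared values spread uniformly over their range.
    levels = 2 ** n_bits
    w_min, w_max = weights.min(), weights.max()
    if w_max == w_min:
        return weights.copy()
    step = (w_max - w_min) / (levels - 1)
    codes = np.round((weights - w_min) / step)   # integer code per weight
    return w_min + codes * step                  # shared reconstructed values

rng = np.random.default_rng(0)
w = rng.normal(scale=0.1, size=(256, 128))       # stand-in for a dense layer's weights
w_pruned, mask = prune_by_magnitude(w, sparsity=0.8)
w_shared = quantize_uniform(w_pruned, n_bits=4) * mask
print(f"non-zero weights: {mask.mean():.0%}, distinct values: {np.unique(w_shared).size}")

In practice the resulting integer codes could be further entropy-coded (e.g., with Huffman coding, also listed among the keywords below) to reduce storage.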
No
English
CNN compression; Connection pruning; Weight quantization; Weight sharing; Huffman coding; Succinct Deep Neural Networks
Settore INF/01 - Informatica
Article
Anonymous referees
Basic research
Scientific publication
Goal 9: Industry, Innovation, and Infrastructure
   Multi-criteria optimized data structures: from compressed indexes to learned indexes, and beyond
   MINISTERO DELL'ISTRUZIONE E DEL MERITO
   2017WR7SHH_004
1 Feb 2023
25 Nov 2022
Elsevier
520
152
170
19
Published
Journal of international relevance
crossref
I adhere
info:eu-repo/semantics/article
open
Research products::01 - Journal article
4
262
Article (author)
Journal with Impact Factor
G.C. Marinó, A. Petrini, D. Malchiodi, M. Frasca
Files in this product:
File: deep-neural-network-compression-neucom.pdf
Access: open access
Type: Publisher's version/PDF
Size: 1.69 MB
Format: Adobe PDF

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/2434/948823
Citations
  • PubMed Central: not available
  • Scopus: 29
  • Web of Science: 21