
Deep neural networks compression: A comparative survey and choice recommendations / G.C. Marinó, A. Petrini, D. Malchiodi, M. Frasca. - In: NEUROCOMPUTING. - ISSN 0925-2312. - 520:(2023 Feb 01), pp. 152-170. [10.1016/j.neucom.2022.11.072]

Deep neural networks compression: A comparative survey and choice recommendations

A. Petrini (second author); D. Malchiodi (penultimate author); M. Frasca (last author)
2023

Abstract

State-of-the-art performance on several real-world problems is currently achieved by deep neural networks and, in particular, convolutional neural networks (CNNs). Such learning models exploit recent results in the field of deep learning, leading to highly performing yet very large neural networks, typically with millions to billions of parameters. As a result, such models are often redundant and excessively oversized, with a detrimental effect on the environment in terms of unnecessary energy consumption and a limitation to their deployment on low-resource devices. The need for compression techniques able to reduce the number of model parameters and their resource demand is therefore increasingly felt by the research community. In this paper we propose, to the best of our knowledge, the first extensive comparison of the main lossy and structure-preserving approaches for compressing pre-trained CNNs, applicable in principle to any existing model. Our study is intended to provide preliminary guidance in choosing the most suitable compression technique when the memory footprint of a pre-trained model needs to be reduced. Both convolutional and fully-connected layers are included in the analysis. Our experiments involved two pre-trained state-of-the-art CNNs (proposed to solve classification or regression problems) and five benchmarks, and yielded important insights into the applicability and performance of such techniques with respect to the type of layer to be compressed and the category of problem tackled.
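As a purely illustrative sketch of two of the technique families compared in the paper, connection pruning and weight quantization/sharing, the snippet below applies magnitude-based pruning followed by uniform quantization to a single layer's weight matrix. It assumes NumPy, and the sparsity and bit-width settings are hypothetical choices made for the example; it is not the implementation used in the paper.

# Illustrative sketch only: magnitude-based connection pruning followed by
# weight sharing via uniform quantization, applied to one layer's weights.
import numpy as np

def prune_by_magnitude(weights, sparsity=0.8):
    # Zero out the fraction `sparsity` of weights with smallest absolute value.
    threshold = np.quantile(np.abs(weights), sparsity)
    mask = np.abs(weights) > threshold
    return weights * mask, mask

def quantize_uniform(weights, n_bits=4):
    # Map weights onto 2**n_bits shared values spread uniformly over their range.
    levels = 2 ** n_bits
    w_min, w_max = weights.min(), weights.max()
    if w_max == w_min:
        return weights.copy()
    step = (w_max - w_min) / (levels - 1)
    codes = np.round((weights - w_min) / step)   # integer code per weight
    return w_min + codes * step                  # shared reconstructed values

rng = np.random.default_rng(0)
w = rng.normal(scale=0.1, size=(256, 128))       # stand-in for a dense layer's weights
w_pruned, mask = prune_by_magnitude(w, sparsity=0.8)
w_shared = quantize_uniform(w_pruned, n_bits=4) * mask
print(f"non-zero weights: {mask.mean():.0%}, distinct values: {np.unique(w_shared).size}")

In practice the resulting integer codes could be further entropy-coded (e.g., with Huffman coding, also listed among the keywords below) to reduce storage.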
No
English
CNN compression; Connection pruning; Weight quantization; Weight sharing; Huffman coding; Succinct Deep Neural Networks
Settore INF/01 - Informatica
Article
Anonymous referees
Basic research
Scientific publication
Goal 9: Industry, Innovation, and Infrastructure
   Multi-criteria optimized data structures: from compressed indexes to learned indexes, and beyond
   MINISTERO DELL'ISTRUZIONE E DEL MERITO
   2017WR7SHH_004
1 Feb 2023
25 Nov 2022
Elsevier
520
152
170
19
Published
Journal of international relevance
crossref
I adhere
info:eu-repo/semantics/article
open
Research products::01 - Journal article
4
262
Article (author)
Journal with Impact Factor
G.C. Marinó, A. Petrini, D. Malchiodi, M. Frasca
Files in this product:
File: deep-neural-network-compression-neucom.pdf
Access: open access
Type: Publisher's version/PDF
Size: 1.69 MB
Format: Adobe PDF

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/2434/948823
Citations
  • PubMed Central: not available
  • Scopus: 29
  • Web of Science: 21