L. Baresi, M. Garlini, G. Quattrocchi. Dynamic Resource Allocation for Deadline-Constrained Neural Network Training. In: 2025 IEEE/ACM 20th Symposium on Software Engineering for Adaptive and Self-Managing Systems (SEAMS), Ottawa, 2025. IEEE, pp. 39-49. ISBN 9798331501815. DOI: 10.1109/SEAMS66627.2025.00013.
Dynamic Resource Allocation for Deadline-Constrained Neural Network Training
L. Baresi, M. Garlini, G. Quattrocchi
2025
Abstract
Neural networks (NNs) serve as the backbone for various applications, including computer vision, speech recognition, and natural language processing. Due to their iterative nature, training NNs is a highly compute-intensive task that is typically executed using a statically allocated set of devices (e.g., CPUs or GPUs). This static allocation prevents priorities from being adjusted: resources cannot be reassigned to urgent tasks, and high-priority training jobs may therefore miss their expected completion times. This paper proposes DECOR-NN (DEadline COnstrained Resource allocation for Neural Networks), a control mechanism for NN training that dynamically allocates resources according to a user-defined deadline (i.e., a Service Level Agreement), ensuring that the training phase completes within the specified time. The solution leverages control theory and has been developed on top of PyTorch, a widely used framework for training NNs. DECOR-NN dynamically allocates either GPUs or fractions of CPUs to meet user deadlines and also allows users to modify the deadline at runtime to accommodate changes in job priorities. A comprehensive empirical evaluation using three benchmark applications demonstrates that DECOR-NN successfully completes training jobs with an average deviation from the deadline of only 1.75%.
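The paper's actual controller is not reproduced on this record page, so the following is only a minimal sketch of the idea the abstract describes: a proportional controller that, after each training epoch, compares the observed epoch duration against the per-epoch time budget implied by the deadline and rescales a fractional CPU allocation accordingly. The class name `DeadlineController`, the gain value, and the share bounds are all illustrative assumptions, not DECOR-NN's parameters.

```python
import time

class DeadlineController:
    """Illustrative sketch (not DECOR-NN's actual algorithm): a proportional
    controller that adjusts the CPU share of a training job so that the
    remaining epochs finish by a user-defined deadline."""

    def __init__(self, deadline_s: float, total_epochs: int,
                 min_share: float = 0.1, max_share: float = 1.0,
                 gain: float = 0.5):
        self.deadline_s = deadline_s      # user-defined deadline (the SLA)
        self.total_epochs = total_epochs
        self.min_share = min_share        # smallest CPU fraction we may assign
        self.max_share = max_share        # whole machine
        self.gain = gain                  # proportional gain (assumed value)
        self.share = max_share            # start with all available capacity
        self.start = time.monotonic()

    def set_deadline(self, new_deadline_s: float) -> None:
        """Runtime deadline changes (e.g., a priority shift) simply replace
        the target; the next update() reacts to the new budget."""
        self.deadline_s = new_deadline_s

    def update(self, epochs_done: int, last_epoch_s: float) -> float:
        """Call after every epoch; returns the CPU share for the next one."""
        remaining = self.total_epochs - epochs_done
        if remaining <= 0:
            return self.share
        elapsed = time.monotonic() - self.start
        budget_per_epoch = max((self.deadline_s - elapsed) / remaining, 1e-6)
        # error > 0: running late, grow the share; error < 0: shrink it.
        error = (last_epoch_s - budget_per_epoch) / budget_per_epoch
        self.share = min(self.max_share,
                         max(self.min_share,
                             self.share * (1.0 + self.gain * error)))
        return self.share
```

In a real deployment, the returned share would have to be enforced through an OS-level mechanism such as Linux cgroups CPU quotas (e.g., a container runtime's CPU limit), and the same loop could also decide whether the job runs on GPUs or CPUs; those mechanics are omitted here.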
| File | Type | Access | License | Size | Format |
|---|---|---|---|---|---|
| SEAMS_25___Dynamic_Resource_Allocation_for_Deadline_Constrained__Neural_Network_Training-11.pdf | Post-print / accepted manuscript | Restricted access | No license | 448.58 kB | Adobe PDF |
| Dynamic_Resource_Allocation_for_Deadline-Constrained_Neural_Network_Training.pdf | Publisher's version/PDF | Restricted access | No license | 644.29 kB | Adobe PDF |