On the relevance of patch-based extraction methods for monocular depth estimation / P. Coscia, A. Fusillo, A. Genovese, V. Piuri, F. Scotti. - In: IMAGE AND VISION COMPUTING. - ISSN 0262-8856. - (2025), pp. 105857.1-105857.34. [Epub ahead of print] [10.1016/j.imavis.2025.105857]

On the relevance of patch-based extraction methods for monocular depth estimation

P. Coscia (first author); A. Fusillo (second author); A. Genovese; V. Piuri (second-to-last author); F. Scotti (last author)
2025

Abstract

Scene geometry estimation from images plays a key role in robotics, augmented reality, and autonomous systems. In particular, Monocular Depth Estimation (MDE) predicts depth from a single RGB image, avoiding the need for expensive sensors. State-of-the-art approaches use deep learning models that process images as a whole and therefore exploit their spatial information sub-optimally. A recent research direction instead focuses on smaller image patches, since depth information varies across different regions of an image. This approach reduces model complexity and improves performance by capturing finer spatial details. From this perspective, we propose a novel warp patch-based extraction method that corrects perspective camera distortions, and employ it in tailored training and inference pipelines. Our experimental results show that our patch-based approach outperforms both its full-image-trained counterpart and the classical crop patch-based extraction. With our technique, we obtain general performance improvements over recent state-of-the-art models. Code will be available at https://github.com/AntonioFusillo/PatchMDE
Monocular Depth Estimation (MDE); Autonomous Driving (AD); Patch-based approach; Single Image Depth Estimation (SIDE); Metric depth;
Sector INFO-01/A - Computer Science
Sector IINF-05/A - Information Processing Systems
   Edge AI Technologies for Optimised Performance Embedded Processing (EdgeAI)
   EdgeAI
   MINISTERO DELLO SVILUPPO ECONOMICO
   101097300

   SEcurity and RIghts in the CyberSpace (SERICS)
   SERICS
   MINISTERO DELL'UNIVERSITA' E DELLA RICERCA
   identification code PE00000014
2025
3 Dec 2025
Article (author)
Files in this item:
File Size Format
imavis25b.pdf

open access

Type: Post-print, accepted manuscript, etc. (version accepted by the publisher)
License: Creative Commons
Size: 10.61 MB
Format: Adobe PDF
imavis25b_compressed.pdf

open access

Type: Post-print, accepted manuscript, etc. (version accepted by the publisher)
License: Creative Commons
Size: 4.96 MB
Format: Adobe PDF
Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/2434/1202284