This paper presents the digitization and online dissemination of the Manzini & Savoia (2005) corpus, one of the most comprehensive resources on morphosyntactic variation in Italian and Romansh dialects. Developed within Project CHANGES (Cultural Heritage Active Innovation for Sustainable Society), the initiative responds to the urgent need for systematic documentation and open-access preservation of linguistic diversity. The new platform integrates a relational PostgreSQL database, a Strapi-based backend, and an interactive web interface, offering multiple modes of exploration, including map-based navigation, morphosyntactic queries, and access to the original fieldwork notebooks. The entire dataset (64,472 examples with IPA transcription and metadata) is openly available on Zenodo for independent research and reuse. The project also explores experimental applications of Large Language Models (LLMs) for automatic annotation, demonstrating the potential of computational approaches in dialectology. This work provides a replicable model for sustainable digital archiving and fosters interdisciplinary research across linguistic, computational, and cultural heritage domains.

Morphosyntactic Variation in Italian and Romansh Dialects: The Manzini & Savoia (2005) Corpus Within Project CHANGES / G. Mazzaggio, C. Zoli, N. Binazzi, L.A. Ludovico, M.V. Vena, M. Rita Manzini, L. Maria Savoia - In: DH2025 : Digital Heritage International Congress 2025 / edited by S. Campana, D. Ferdani, H. Graf, G. Guidi, Z. Hegarty, S. Pescarin, F. Remondino. - [s.l] : Eurographics - The European Association for Computer Graphics, 2025 Sep. - ISBN 978-3-03868-277-6. - pp. 1-8 (Conference: Digital Heritage (DH) World Congress & Expo, held in Siena in 2025) [10.2312/dh.20253066].

Morphosyntactic Variation in Italian and Romansh Dialects: The Manzini & Savoia (2005) Corpus Within Project CHANGES

L.A. Ludovico; M.V. Vena
2025

Human-centered computing → Web-based interaction; Information systems → Relational database model; Applied computing → Language translation; Social and professional topics → Cultural characteristics
Sector INFO-01/A - Computer Science
Sector LIFI-01/A - Italian Linguistics
Sep 2025
Book Part (author)
Use this identifier to cite or link to this document: https://hdl.handle.net/2434/1182916