Discours numériques et approches outillées : quelques réflexions sur les apports des métadonnées / C. Cagninelli. - In: LINGUE E LINGUAGGI. - ISSN 2239-0359. - 65:(2024), pp. 385-412. [10.1285/i22390359v65p385]

Discours numériques et approches outillées : quelques réflexions sur les apports des métadonnées

C. Cagninelli
2024

Abstract

This paper examines the role of metadata and annotations in the analysis of digital corpora with linguistic analysis tools. The study is grounded in a discursive approach to the analysis of linguistic data. This conceptual framework determines both the concept of corpus adopted for the study and the methodological principles guiding its exploration. The structuring of the corpus through metadata is a pivotal juncture between the digital constitution of the observed material and the way in which it can be explored with corpus analysis tools. The paper first addresses the methodological transformations and epistemological implications brought about by digital resources, and highlights the evolving conceptualization of empirical data and the nature of the objects of study. A typology of metadata is then proposed, based on two main parameters: the types of information the metadata represent, on the one hand, and the characteristics of the data, on the other. Particular attention is given to the digital discourses of Web 2.0.
Keywords: corpus; digital discourse; corpus analysis tools; corpus structure; annotations
Academic discipline: FRAN-01/B - French language, translation and linguistics
Article (author)
File in this record:
Cagninelli_LL65_30031-148826-1-PB.pdf (Publisher's version/PDF, Adobe PDF, 848.32 kB, open access, Creative Commons license)

Use this identifier to cite or link to this document: https://hdl.handle.net/2434/1173166