Modeling the evolution of topics and forecasting future trends are crucial tasks when analyzing scientific papers. In this work, we propose tASKE (temporal Automated System for Knowledge Extraction), a dynamic topic modeling approach that exploits zero-shot classification and contextual embeddings to track topic evolution over time. The approach is evaluated on a corpus of data science papers, assessing the ability of tASKE to correctly classify documents and to retrieve relevant derivation relationships between older and newer topics.
Exploiting Contextual Embeddings to Extract Topic Genealogy from Scientific Literature / A. Ferrara, S. Montanelli, S. Picascia, D. Riva (CEUR WORKSHOP PROCEEDINGS). - In: SDU 2023 : Scientific Document Understanding 2023 / edited by A. Pouran Ben Veyseh, F. Dernoncourt, T. Huu Nguyen, V. Dack Lai. - [s.l.] : CEUR-WS, 2023. - pp. 1-9 (Proceedings of the Workshop on Scientific Document Understanding, co-located with the 37th AAAI Conference on Artificial Intelligence (AAAI 2023), held online in 2023).
Exploiting Contextual Embeddings to Extract Topic Genealogy from Scientific Literature
A. Ferrara (first author); S. Montanelli (second author); S. Picascia (penultimate author); D. Riva (last author)
2023
File | Access | Type | Size | Format
---|---|---|---|---
paper5.pdf | Open access | Publisher's version/PDF | 884.83 kB | Adobe PDF
Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.