An Empirical Analysis of Task Relations in the Multi-Task Annotation of an Arabizi Corpus / Gugliotta, Elisa; Dinarelli, Marco. - (2023), pp. 154-165. [10.34619/srmk-injj]
An Empirical Analysis of Task Relations in the Multi-Task Annotation of an Arabizi Corpus
Gugliotta, Elisa (Conceptualization)
2023-01-01
Abstract
This study addresses the design of computational-linguistic resources and strategies for analyzing under-resourced languages. In particular, we present empirical analyses aimed at identifying the best path for semi-automatically annotating a dialectal Arabic corpus with a neural multi-task architecture. The architecture automatically generates several levels of linguistic annotation, which can be evaluated against the gold annotation. Changing the order in which the annotations are produced can affect the quantitative results. Through multiple sets of experiments, we show how to obtain the best performance with this methodology.