Given the lack of resources for Arabic dialects, constructing corpora, lexical resources, and tools is a non-trivial challenge. This article describes our work in progress to address these deficiencies. We start with the Moroccan and Tunisian dialects, for which we provide annotated corpora and corpus-based lexical resources. We also aim to extend an existing morphological engine with linguistic resources built ad hoc for each dialect. In addition, we are developing a component integrated into the morphological engine to better handle linguistic and sociolinguistic characteristics while preserving the integrity of dialectal texts.

Challenges and Progress in Constructing Arabic Dialect Corpora and Linguistic tools: A Focus on Moroccan and Tunisian Dialects / Nahli, Ouafae; Gugliotta, Elisa; Khlif, Nadia; Benotto, Giulia. - INTERNATIONAL IEEE CONGRESS ON INFORMATION SCIENCE AND TECHNOLOGY:(2023), pp. 293-298. [10.1109/cist56084.2023.10410009]

Challenges and Progress in Constructing Arabic Dialect Corpora and Linguistic tools: A Focus on Moroccan and Tunisian Dialects

Gugliotta, Elisa (Writing – Original Draft Preparation); 2023-01-01

2023
English
INTERNATIONAL IEEE CONGRESS ON INFORMATION SCIENCE AND TECHNOLOGY
293
298
6
http://dx.doi.org/10.1109/cist56084.2023.10410009
Anonymous experts
Arabic dialects, corpora, lexical resources, morphological engine, annotated corpora, Moroccan dialect, Tunisian dialect, linguistic resources, sociolinguistic characteristics, dialectal texts
International
No
Nahli, Ouafae; Gugliotta, Elisa; Khlif, Nadia; Benotto, Giulia
info:eu-repo/semantics/article
1 Contribution in Journal::1.1 Journal article
   A lexical corpus-based model of Contemporary Written Arabic
   CWALM
   MUR
   PRIN2020

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/11388/361749