Towards the Automatic Generation of Arabic Lexical Recognition Tests Using Orthographic and Phonological Similarity Maps
Date
2021
Authors
Salah, Saeed
Nassar, Mohammad
Zaghal, Raid
Hamed, Osama
Publisher
Journal of King Saud University – Computer and Information Sciences, Elsevier, 2021
Abstract
Lexical Recognition Tests (LRTs) are among the main methods used to measure proficiency in widely studied
languages such as English, German, and Spanish. Similar research for Arabic, however, is still at an early
stage, and existing proposals rely mainly on human-crafted methods. This paper proposes a new methodology,
based on a newly developed algorithm, for automatically constructing high-quality nonwords for a quick
measurement of Arabic proficiency levels (an Arabic LRT). The algorithm generates nonwords automatically
from characteristics specific to Arabic: orthography (spelling), phonology (pronunciation), n-grams, and
the word frequency map, which is an important factor in creating a multi-level test. The algorithm was
evaluated on a large dataset of Arabic vocabulary. For this purpose, a Web-based application following the
proposed methodology was designed and implemented to facilitate collecting and analyzing learners'
responses. The experimental results show that the automatically generated LRT questions confused the
learners: the confusion matrix shows that one third (1/3) of the generated nonwords were able to distract
the learners (with an accuracy of 65%). Consequently, recall and precision have smaller values, 0.52 and
0.48, respectively.
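To make the reported figures easier to read, the following minimal sketch shows how accuracy, precision, and recall are conventionally derived from a 2x2 confusion matrix. The function name `confusion_metrics`, the choice of "real word" as the positive class, and the counts in the usage lines are illustrative assumptions; they are not taken from the paper and are not intended to reproduce its results.

```python
def confusion_metrics(tp, fp, fn, tn):
    """Return (accuracy, precision, recall) from confusion-matrix counts.

    Assumed convention (not stated in the abstract): the positive class
    is "real word", so a false positive is a nonword that a learner
    accepted as a real word.
    """
    total = tp + fp + fn + tn
    # Share of all responses that were correct.
    accuracy = (tp + tn) / total if total else 0.0
    # Share of items accepted as words that really were words.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    # Share of real words that were recognised as words.
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return accuracy, precision, recall


# Hypothetical counts, chosen only to illustrate the arithmetic.
accuracy, precision, recall = confusion_metrics(tp=30, fp=20, fn=10, tn=40)
print(f"accuracy={accuracy:.2f} precision={precision:.2f} recall={recall:.2f}")
```

Under this assumed convention, more distracting nonwords produce more false positives, which lowers precision while leaving recall dependent only on how the real words were answered.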
Keywords
NLP, LRT, n-gram, Dialects, MSA, Orthographic, Phonological
Citation
S. Salah, M. Nassar, R. Zaghal, and O. Hamed, “Towards the Automatic Generation of Arabic Lexical Recognition Tests Using Orthographic and Phonological Similarity Maps,” Journal of King Saud University – Computer and Information Sciences, Elsevier, Article in Press, 2021.