MORPHOLOGY BASED SPELLING CHECKER FOR GEEZ LANGUAGE
Date
2023-03-06
Publisher
Hawassa University
Abstract
Geez is an ancient language that belongs to the Semitic language family. Much ancient
literature and many books were written in Geez. Currently, Geez courses are offered in various
colleges, universities, and some primary schools. However, the NLP applications developed for
this language are still insufficient. A spelling checker is a critical NLP application for
writing error-free Geez text in less time. It is a tool that detects spelling errors in a block
of text and suggests close corrections for the erroneous words. A previous attempt was made to
develop a spelling checker for the Geez language, but it focused only on errors caused by
interchanging homophone characters.
In this study, we propose a morphology-based approach (dictionary lookup combined with a
morphological analyzer) to spelling checking for the Geez language. The system has three main
components: text preprocessing, error detection, and error correction. To achieve the objective
of this study, we built one main dictionary that contains the Geez lexicon and its
morphological features. The dictionary holds 6115 unique Geez lexicon entries, and 955 rules
were defined. We adopted the Hunspell dictionary and affix file formats to design the lexicon
(i.e., the knowledge base component) and used a hashing algorithm for searching. Hunspell is an
open-source spelling checker tool designed especially for languages with complex morphology.
Finally, we developed a prototype of the system to test the functionality and performance of
the Geez language spelling checker (GLSC). The accuracy of error detection is expressed in terms
of precision and recall, and the accuracy of suggestion is expressed in terms of suggestion
adequacy. We obtained a lexical recall of 91.9%, an error recall of 83.7%, a lexical precision
of 97.2%, and an error precision of 62.2%; 87.5% of the suggestions provided by GLSC were
correct. The overall performance of the system is 90.05%. We conclude that increasing the size
of the dictionary and developing well-organized rules will increase the overall performance of
the Geez language spelling checker.
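The detection and correction steps described in the abstract can be illustrated with a minimal Python sketch. It is not the thesis's implementation: the stems and suffix rules below are hypothetical Latin-letter placeholders rather than entries from the 6115-word Geez lexicon or the 955 rules, a Python set stands in for the hash-based search, and ranking suggestions by edit distance is an assumption about how close candidates might be chosen.

```python
from typing import Dict, List, Set

# Hypothetical placeholder stems; the real lexicon holds Geez words.
LEXICON: Set[str] = {"nagar", "bet", "qal"}

# Hypothetical suffix rules, loosely mirroring entries of an affix file.
SUFFIXES: List[str] = ["at", "u", "a"]


def is_known(word: str) -> bool:
    """Error detection: hash lookup of the word, then of its suffix-stripped stem."""
    if word in LEXICON:
        return True
    return any(
        word.endswith(suffix) and word[: -len(suffix)] in LEXICON
        for suffix in SUFFIXES
    )


def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance computed row by row."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1, cur[j - 1] + 1, prev[j - 1] + (ca != cb)))
        prev = cur
    return prev[-1]


def surface_forms() -> Set[str]:
    """All forms the toy lexicon and suffix rules can generate."""
    forms = set(LEXICON)
    for stem in LEXICON:
        for suffix in SUFFIXES:
            forms.add(stem + suffix)
    return forms


def suggest(word: str, max_distance: int = 1, limit: int = 5) -> List[str]:
    """Error correction: generated forms within max_distance, closest first."""
    scored = sorted((edit_distance(word, form), form) for form in surface_forms())
    return [form for dist, form in scored if dist <= max_distance][:limit]


def check(text: str) -> Dict[str, List[str]]:
    """Preprocess (whitespace tokenization only), detect errors, attach suggestions."""
    return {tok: suggest(tok) for tok in text.split() if not is_known(tok)}


if __name__ == "__main__":
    print(check("bet betu bett"))
```

In this sketch only the unrecognized token is reported, together with its nearest generated forms; a real system would load its stems and affix rules from Hunspell-format dictionary and affix files instead of hard-coding them.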
Keywords
Error Detection, Error Correction, Spell Checker, Morphology, Non-Word Error, Real-Word Error, Geez Language, Dictionary Lookup
