Kebebush Kamiso
2026-01-26
2023-04-06
https://etd.hu.edu.et/handle/123456789/225

Machine translation (MT) is the area of Natural Language Processing (NLP) that focuses on obtaining a target-language text from a source-language text using automatic techniques. It is a multidisciplinary field, and the challenge has been approached from various points of view, including linguistics and statistics. MT usually involves one or more approaches. In this study, we develop a bidirectional Sidaamu Afoo - Amharic machine translation system using a statistical machine translation (SMT) approach. To conduct the experiments, a parallel corpus was collected from all available sources, mostly the Old and New Testaments of the Holy Bible in both languages. We also used the monolingual Contemporary Amharic Corpus and the Sidaamu Afoo corpus compiled by a research team in the Informatics Faculty of Hawassa University. Preprocessing tasks such as tokenization, cleaning, and normalization were carried out to make the corpus suitable for the system. To accomplish the objective of this thesis, we conducted four experiments using word-based and morpheme-based translation units with SMT for the Sidaamu Afoo - Amharic language pair. The first two experiments focus on word-based SMT and the next two on morpheme-based translation using the unsupervised morphological segmentation tool Morfessor. For each experiment, we used 30,100 parallel sentences. Of these, approximately 80% (24,100) randomly selected parallel sentences were used for training, 10% (3,000) for tuning, and another 10% (3,000) for testing. The basic tools used are Moses for the translation process, together with MGIZA++ for word and morpheme alignment and KenLM for language modeling, and Morfessor for morphological segmentation. For evaluation, we used the SacreBLEU package to compute the BLEU, ChrF, and TER metrics.
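The random split described above can be sketched as follows. This is a minimal illustration only: the abstract does not say how the random selection was implemented, so the function name `split_parallel_corpus`, the fixed seed, and the placeholder sentence pairs are all assumptions made for the example.

```python
import random


def split_parallel_corpus(pairs, n_tune=3000, n_test=3000, seed=0):
    """Shuffle sentence pairs and split them into train, tune, and test sets.

    The seed and the helper itself are hypothetical; only the set sizes
    (24,100 / 3,000 / 3,000 out of 30,100) come from the abstract.
    """
    if n_tune + n_test >= len(pairs):
        raise ValueError("tuning and test sets must leave room for training data")
    shuffled = list(pairs)                      # copy so the input is not modified
    random.Random(seed).shuffle(shuffled)       # reproducible random order
    test = shuffled[:n_test]
    tune = shuffled[n_test:n_test + n_tune]
    train = shuffled[n_test + n_tune:]          # everything that remains
    return train, tune, test


# Placeholder data standing in for the 30,100 aligned sentence pairs.
pairs = [(f"source sentence {i}", f"target sentence {i}") for i in range(30100)]
train, tune, test = split_parallel_corpus(pairs)
print(len(train), len(tune), len(test))
```

Keeping the source and target sentences together as pairs before shuffling ensures that the alignment between the two sides of the corpus is preserved in every subset.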
According to the experimental findings, with word-based alignment the differences between Amharic to Sidaamu Afoo and Sidaamu Afoo to Amharic translation were 6.2, 16, and 1.9 for BLEU, ChrF2, and TER, respectively. With morpheme-based alignment, the differences between Amharic to Sidaamu Afoo and Sidaamu Afoo to Amharic translation were 7.5, 20.4, and 5.1 for BLEU, ChrF2, and TER, respectively. In conclusion, the results show that morpheme-based alignment performs better than word-based alignment, and that translation from Amharic to Sidaamu Afoo performs better than translation from Sidaamu Afoo to Amharic.

en
SMT
morpheme level alignment
word level alignment
morfessor
Bi-Directional Sidaamu Afoo - Amharic Statistical Machine Translation
Thesis