Computer Science

Permanent URI for this collection: https://etd.hu.edu.et/handle/123456789/76

  • Item
    DETECTION AND CLASSIFICATION OF INDIGENOUS ROCK MINERALS USING DEEP LEARNING TECHNIQUES
    (Hawassa University, 2023-03-08) HADIA BERKA WAKO
    Ethiopia is undoubtedly a place of riches, with a vast and diverse landmass rich in resources. However, little attention has been given to applying computing disciplines such as Artificial Intelligence to current problems in Ethiopian mineral mining. The Guji Zone is one of Oromia's 20 administrative zones and is blessed with a variety of mineral resources. Although mining makes a lion's-share contribution to Ethiopia's economy, little has been done to modernize the country's mining industry, especially to empower the small-scale artisanal mining community, and Guji is among the zones still relying on outmoded techniques to identify minerals. Conventional rock-mineral detection and classification involves laboratory testing of physical and chemical properties at both the micro- and macro-scale, which is expensive and time-consuming; identifying tiny rock minerals and verifying their authenticity with traditional procedures takes too long, and identification through visual observation alone is often erroneous. To address these problems, a deep learning approach for the classification and detection of rock minerals is proposed. The design-science research methodology was followed to achieve the objectives of the research. To conduct this study, 2,000 images were collected from the Guji Zone and the Mindat.org website. After collection, image pre-processing techniques such as resizing, segmentation using Roboflow, and annotation were performed, and data augmentation was applied to balance the dataset and increase the number of images. This work focuses on classifying and detecting fifteen types of rock minerals. Using the YOLOv7 deep learning model, 70% of the dataset was used to train the model and 30% to test its performance.
    Finally, the developed model was evaluated against other models using accuracy, precision, recall, and mAP. Experimental results show that YOLOv7 obtained 76% mAP for large objects, outperforming the other models, and the pretrained YOLOv7 weights achieved 97.3% accuracy in classifying and detecting minerals in additional images.
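Detection metrics such as precision, recall, and mAP all rest on matching predicted boxes to ground-truth boxes by intersection-over-union (IoU). As a minimal illustration of that underlying computation (not the thesis's evaluation code), IoU for axis-aligned boxes can be computed as:

```python
def iou(box_a, box_b):
    """Intersection-over-Union of two boxes given as (x1, y1, x2, y2)."""
    # Coordinates of the overlap rectangle (empty if boxes are disjoint).
    ix1, iy1 = max(box_a[0], box_b[0]), max(box_a[1], box_b[1])
    ix2, iy2 = min(box_a[2], box_b[2]), min(box_a[3], box_b[3])
    inter = max(0, ix2 - ix1) * max(0, iy2 - iy1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union else 0.0
```

A prediction is usually counted as a true positive when its IoU with a ground-truth box exceeds a threshold (commonly 0.5); mAP averages precision over such thresholds and classes.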
  • Item
    EXPLORING A BETTER FEATURE EXTRACTION METHOD FOR AMHARIC HATE SPEECH DETECTION
    (Hawassa University, 2021-10-08) YESUF MOHAMED YIMAM
    Hate speech is speech that causes people to be attacked, discriminated against, and hated because of their personal and collective identities. When hate speech grows, it leads to death and to the displacement of people from their homes and properties, and social media can spread it widely. To address this problem, various researchers have studied ways to detect social media hate speech in both international and local languages. Because the problem is so serious, it needs to be studied carefully and addressed with a variety of solutions. Previous studies detect hate speech based on the frequency (occurrence) of a word in a given dataset, which ignores the role each word plays in a sentence. The main purpose of this study is to design a method that generates hate speech features from text by identifying the role of a word in a sentence, so that hate speech can be distinguished from other forms of speech more reliably. To this end, research related to this study was reviewed, and a new feature extraction method for Amharic hate speech detection was created. Because the model requires training and testing data, posts and comments from 25 popular Facebook pages were collected to build the dataset. Whether a speech is hateful should be determined by the law that prohibits hate speech; accordingly, texts containing religious, ethnic, and hate words were collected using different filtration methods and given to law experts for manual annotation. The law experts labeled 2,590 instances into three classes: Religion-hate, Ethnic-hate, and Non-hate. After dataset preparation, a new feature extraction method that can distinguish hate speech from other speech was developed.
    The new method and the feature extraction methods used in related studies were implemented and compared using three machine learning classification algorithms: SVM, NB, and RF. Results across different evaluation metrics show that the new feature extraction method performed better with all the classification algorithms. Using 80% of the 2,590 labeled instances as a training set and the rest as a test set, an average accuracy of 96.2% was achieved by combining SVM with the new feature extraction method.
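The frequency-based features that the study compares against are conventionally TF-IDF weights. The thesis's new role-aware extractor is not specified here, so the stdlib-only sketch below shows only the frequency baseline it improves on:

```python
import math
from collections import Counter

def tf_idf(docs):
    """TF-IDF vectors for a list of tokenised documents (frequency baseline)."""
    n = len(docs)
    df = Counter()                      # document frequency of each term
    for doc in docs:
        df.update(set(doc))
    vectors = []
    for doc in docs:
        tf = Counter(doc)               # raw term counts in this document
        vectors.append({t: (tf[t] / len(doc)) * math.log(n / df[t]) for t in tf})
    return vectors
```

A term appearing in every document gets weight zero (log 1 = 0), which is exactly the "frequency only, no sentence role" behaviour the thesis argues is insufficient for hate speech.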
  • Item
    DEVELOPING IMAGE-BASED ENSET PLANT DISEASE IDENTIFICATION USING CONVOLUTIONAL NEURAL NETWORK
    (Hawassa University, 2020-11-07) UMER NURI MOHAMMED
    Nowadays, a decline in food plant productivity is a major problem causing food insecurity, and plant disease is one of its causes. Early identification and accurate diagnosis of the health status of food plants is therefore critical to limit the spread of plant diseases, and it should be done technologically rather than by manual labor alone. Traditional observation by farmers or domain experts is time-consuming, expensive, and sometimes inaccurate. The literature suggests that deep learning approaches are the most accurate models for plant disease detection. The Convolutional Neural Network (CNN) is a popular approach in which computational models composed of multiple processing layers learn representations of image data at multiple levels of abstraction. These models have dramatically improved the state of the art in visual object recognition and image classification, which makes them well suited to enset plant disease classification. For this purpose, we used an appropriate CNN-based model for identifying and classifying the three most critical diseases of enset plants: enset bacterial wilt, enset leaf spot, and root mealybug. Enset is a major source of food in the southern, central, and southwestern parts of Ethiopia. A total of 14,992 images, including augmented images, in four categories (three diseased and one healthy) were obtained from agricultural sectors stationed at Hawassa and Worabe, Ethiopia, and provided as input to the proposed model. Under a 10-fold cross-validation strategy, the experimental results show that the proposed model can effectively detect and classify the four classes with a best classification accuracy of 99.53%, higher than classical deep learning models such as MobileNet and Inception v3.
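The 10-fold cross-validation strategy mentioned above partitions the image set into ten folds, training on nine and evaluating on the held-out one, then averaging the scores. A minimal index-splitting sketch (illustrative, not the authors' code):

```python
def k_fold_indices(n_samples, k=10):
    """Yield (train_idx, test_idx) pairs for k-fold cross-validation."""
    indices = list(range(n_samples))
    fold_size, remainder = divmod(n_samples, k)
    start = 0
    for fold in range(k):
        # Early folds absorb the remainder so every sample is used exactly once.
        size = fold_size + (1 if fold < remainder else 0)
        test = indices[start:start + size]
        train = indices[:start] + indices[start + size:]
        yield train, test
        start += size
```

In practice the indices would be shuffled (often stratified by class) before splitting, so each fold mirrors the overall class balance.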
  • Item
    A MODEL TOWARDS PRICE PREDICTION FOR COMMODITIES USING DEEP LEARNING: CASE OF ETHIOPIAN COMMODITY EXCHANGE
    (Hawassa University, 2022-10-03) SOLEN GOBENA
    The development of information technology makes it possible to collect and store large amounts of data every second. Market enterprises are generating large amounts of data, and it is difficult to use traditional data analysis methods to analyze and predict future market prices. Price predictions are an integral component of trade and policy analysis. The prices of agricultural commodities directly influence the real income of farmers and also affect national foreign currency earnings. The haricot bean is produced in many areas of Ethiopia; it is rich in starch, protein, and dietary fiber, and is an excellent source of minerals and vitamins. It has also been a main agricultural commodity traded on the Ethiopian Commodity Exchange (ECX) market for the past 10 years. Although price prediction work exists for various crops in Ethiopia and abroad using machine learning and deep learning approaches, to the best of our knowledge price prediction for the haricot bean has not been studied with machine learning. The main objective of this study is to develop a model that predicts future prices of haricot beans traded at the ECX based on time series data. Ten years of data, a sample of 12,272 records, were obtained from the ECX. Simple linear regression (SLR), multiple linear regression (MLR), and long short-term memory (LSTM) were evaluated as predictive models. The results showed that LSTM outperformed the other models on all measures of performance, achieving a coefficient of determination (R2) of 0.97, a mean absolute percentage error (MAPE) of 0.015, and a mean absolute error (MAE) of 0.032.
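The reported metrics (R2, MAPE, and MAE) follow directly from the predicted and actual prices. A small stdlib sketch of the three formulas (toy values, not ECX data):

```python
def regression_metrics(y_true, y_pred):
    """Return (R2, MAPE, MAE) for paired actual/predicted values."""
    n = len(y_true)
    mae = sum(abs(t - p) for t, p in zip(y_true, y_pred)) / n
    # MAPE assumes no true value is zero.
    mape = sum(abs((t - p) / t) for t, p in zip(y_true, y_pred)) / n
    mean = sum(y_true) / n
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))   # residual sum of squares
    ss_tot = sum((t - mean) ** 2 for t in y_true)                # total sum of squares
    r2 = 1 - ss_res / ss_tot
    return r2, mape, mae
```

R2 near 1 means the model explains almost all price variance; MAPE and MAE are error magnitudes, so lower is better, which matches the direction of the reported LSTM results.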
  • Item
PART-OF-SPEECH TAGGER FOR SIDAMA LANGUAGE USING THE HIDDEN MARKOV MODEL WITH VITERBI ALGORITHM
    (Hawassa University, 2022-04-07) BELACHEW KEBEDE ESHETU
    The part-of-speech (POS) tagger is an essential low-level tool in many natural language processing (NLP) applications. POS tagging is the process of assigning to each word a tag that describes how it is used in a sentence. There are different approaches to POS tagging; the most common are rule-based, stochastic, and hybrid. In this paper, the stochastic approach, specifically the Hidden Markov Model (HMM) with the Viterbi algorithm, was applied to develop a POS tagger for Sidaama. The HMM POS tagger assigns tags based on the most probable tag sequence for the words. For training and testing, 9,660 Sidaama sentences containing 130,847 tokens (words, punctuation, and symbols) were collected, and 4 experts in the language undertook the POS annotation using thirty-one (31) POS tags. The sources of the corpus are fables, news, reading passages, and some scripts from the Bible. 90% of the corpus was used for training and the remaining 10% for testing. The tagger was implemented using the Python programming language (Python 3.7.0) and the Natural Language Toolkit (NLTK 3.0.0). Its performance was tested and validated using a ten-fold cross-validation technique; in the performance analysis experiment, the model achieved an accuracy of 91.25% with the HMM alone and 98.46% with the Viterbi algorithm.
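The Viterbi algorithm finds the most probable tag sequence under an HMM by dynamic programming over the tag states. A compact sketch with toy probabilities (not the trained Sidaama model) might look like:

```python
def viterbi(obs, states, start_p, trans_p, emit_p):
    """Most probable state (tag) sequence for an observed word sequence."""
    # V[t][s]: probability of the best path ending in state s at position t.
    V = [{s: start_p[s] * emit_p[s].get(obs[0], 0.0) for s in states}]
    back = [{}]  # backpointers to recover the best path
    for t in range(1, len(obs)):
        V.append({})
        back.append({})
        for s in states:
            prob, prev = max(
                (V[t - 1][p] * trans_p[p].get(s, 0.0) * emit_p[s].get(obs[t], 0.0), p)
                for p in states
            )
            V[t][s] = prob
            back[t][s] = prev
    best = max(V[-1], key=V[-1].get)
    path = [best]
    for t in range(len(obs) - 1, 0, -1):   # follow backpointers right to left
        path.insert(0, back[t][path[0]])
    return path
```

In a trained tagger the start, transition, and emission probabilities are estimated from the annotated corpus; real implementations also work in log space to avoid underflow on long sentences.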
  • Item
    DECISION FRAMEWORK FOR THE USAGE OF CLOUD TECHNOLOGY IN ETHIOPIA HIGHER EDUCATION INSTITUTIONS
    (Hawassa University, 2019-10-06) SELAM DESALGNE
    Rapid technology advancements are always creating new opportunities and new ways of working. Cloud technology is becoming popular across the world, especially in academic institutions. It is not a new technology but rather a new delivery model for information and services using existing technologies. The paradigm has recently been recognized as a key enabler of efficient and effective technological services that will reshape the delivery and support of educational services. This study explores the critical determinants that influence the adoption of cloud technology in public Ethiopian higher educational institutions. Although cloud computing offers a great deal of opportunity, its adoption is hampered by a lack of standards, and the relative absence of a general framework has created a dilemma for institutions about how to approach cloud adoption. An exploratory study was carried out. This research proposes the TOETAD conceptual framework, built on the Technology-Organization-Environment (TOE) model, Diffusion of Innovation (DOI) theory, and the Technology Acceptance Model (TAM), with a Decision Maker context added to the model. Adoption determinants are examined through the lens of this integrated model. The framework factors were identified by critically reviewing the literature together with factors from industrial standards within the context of Ethiopian higher education institutions. Data were collected through an online questionnaire survey of IT managers, lecturers, e-learning coordinators, and team leaders from 17 selected Ethiopian higher educational institutions, with a total of 103 respondents, and the proposed framework was evaluated by an expert to validate it. The results also help public higher educational institutions in Ethiopia understand the nature of the problem and increase their awareness of the factors to be considered when adopting cloud computing.
  • Item
    QUERY EXPANSION FOR AFAAN OROMO INFORMATION RETRIEVAL USING AUTOMATIC THESAURUS
    (Hawassa University, 2021-03-05) SAMUEL MESFIN BAYU
    Recently, the amount of textual information written in the Afaan Oromo language has been increasing rapidly, and with it the need to access that information. However, it is difficult to retrieve information that satisfies one's need, because users struggle to formulate good queries and because of terminological variation, or term mismatch, between readers and authors. Query expansion is an effective mechanism for reducing term mismatch and improving the retrieval performance of IR systems; the idea is to reformulate the user's original query by adding related terms. In this study, an automatic Afaan Oromo thesaurus is constructed from manually collected documents. After text preprocessing of the document corpus, the preprocessed words are vectorized in a multidimensional space using Word2Vec's skip-gram model, in which words that share similar contexts have similar vector representations. A cosine similarity measure was then applied to construct the thesaurus. A one-to-many association approach was employed to select expansion terms: the top five terms with the highest similarity score to the entire query were selected from the thesaurus and added to the user's original query, and the reformulated query was used to retrieve more relevant documents. Experiments were performed to assess the quality of the constructed thesaurus and the effect of integrating query expansion into the Afaan Oromo IR system. The results show that the thesaurus generates related terms with an average relatedness accuracy of 62.1%, and that query expansion improved recall by 14.3% and F-measure by 2.9%, with a 5.5% decrease in precision.
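The expansion step, embedding the query terms, averaging them, and adding the most cosine-similar thesaurus terms, can be sketched in plain Python. The two-dimensional word vectors below are toy stand-ins for real Word2Vec embeddings, and the vocabulary is illustrative only:

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0

def expand_query(query_terms, embeddings, top_k=5):
    """Append the top_k thesaurus terms most similar to the query centroid."""
    dim = len(next(iter(embeddings.values())))
    known = [t for t in query_terms if t in embeddings]
    centroid = [0.0] * dim
    for t in known:
        centroid = [c + e for c, e in zip(centroid, embeddings[t])]
    centroid = [c / len(known) for c in centroid]
    candidates = [t for t in embeddings if t not in query_terms]
    ranked = sorted(candidates, key=lambda t: cosine(centroid, embeddings[t]),
                    reverse=True)
    return list(query_terms) + ranked[:top_k]
```

With `top_k=5` this mirrors the thesis's choice of adding the five highest-scoring thesaurus terms to the original query.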
  • Item
    COMPUTATIONAL MODEL FOR WOLAYITTA LANGUAGE SPELLING CHECKER
    (Hawassa University, 2020-08-10) RAHEL SHUME
    Spelling checker systems are built, and research is conducted, worldwide to meet the needs of different languages. In Ethiopia, spelling checker research had been carried out only for the Amharic language; that work has paved the way for research on other languages such as Wolayitta. In this research, a computational spelling checker is proposed and adapted for the Wolayitta language. A word-level spelling checker was built based on an edit-distance algorithm and a language model, using open-source tools and platforms. A dictionary database was constructed from 13,313 words from a Wolayitta lexicon dictionary and 20,000 words from the Wolayitta Bible, and was then split into training and testing sets. The statistical model accepts input from the user and checks whether each word is in the dictionary; if it is, nothing is done, and if not, suggestions are given. Testing was done in two phases: the first measured the error detection rate and the second evaluated error correction. On user input data, an accuracy of 94.57% was achieved for spelling error detection, while an accuracy of around 90.98% was achieved for the spelling suggestion model. The results are promising, and further research can be pursued following the recommendations made by the researcher.
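The edit-distance core of such a checker is the Levenshtein distance: a word absent from the dictionary is flagged, and the nearest dictionary entries are offered as suggestions. A minimal sketch (not the thesis's implementation; the sample words are illustrative):

```python
def edit_distance(a, b):
    """Levenshtein distance via dynamic programming (one row at a time)."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1,            # deletion
                           cur[j - 1] + 1,         # insertion
                           prev[j - 1] + (ca != cb)))  # substitution / match
        prev = cur
    return prev[-1]

def suggest(word, dictionary, max_suggestions=3):
    """Closest dictionary words for a misspelt word; empty list if correct."""
    if word in dictionary:
        return []
    return sorted(dictionary, key=lambda w: edit_distance(word, w))[:max_suggestions]
```

Real checkers usually cap the distance considered (typically 1 or 2 edits) and re-rank candidates with the language model rather than by distance alone.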
  • Item
Predictions of the Status of Undernutrition for Children below Five Using Ensemble Methods
    (Hawassa University, 2023-08-02) Natnael Abate Choreno
    Undernutrition is one of the main causes of morbidity and mortality in children under five in most developing countries, including Ethiopia. It increases the risk of infectious diseases, impairs cognitive and physical development, reduces school performance and productivity, and perpetuates intergenerational cycles of poverty and malnutrition. The primary goal of this thesis is to build an ensemble model that predicts the undernutrition status of children under five using data from the 2019 EMDHS. The experiments covered 15,082 instances and 20 attributes. Ensemble methods combine several models to deliver better results; typically, results from an ensemble approach are more accurate than those from a single model. The selected method consists of preprocessing, feature selection, k-fold cross-validation, model building, an ensemble classifier, and final prediction steps. In this work, machine learning classification models, namely the Decision Tree, Support Vector Machine, K-Nearest Neighbors, and Naive Bayes classifiers, were used as base models, with accuracies of 0.92, 0.94, 0.92, and 0.75, respectively. The final result was combined by the stacking ensemble method with logistic regression, which produced the most accurate predictive model, with a 96% accuracy rate. HAZ, WAZ, WHZ, age in five-year groups, region, source of drinking water, education level, type of toilet facility, wealth index, total children born, number of antenatal visits, vaccination, breastfeeding duration, whether the child was ever given nutritious food, and whether plain water was given are the major features contributing to undernutrition in children under five. The findings provide encouraging evidence that the ensemble method can support a predictive model of the nutritional status of children under five in Ethiopia.
    Future research could produce better results by combining large clinical and hospital datasets, and could also include children over the age of five and children with obesity as a malnutrition status.
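Stacking feeds the base models' predictions to a meta-learner as features. The sketch below uses toy threshold rules in place of the actual DT/SVM/KNN/NB base models, and a majority vote standing in for the logistic-regression meta-learner; every name and threshold here is illustrative, not from the thesis:

```python
def stack_predict(base_models, meta_rule, sample):
    """Stacking: base-model predictions become the meta-learner's features."""
    meta_features = [model(sample) for model in base_models]
    return meta_rule(meta_features)

# Toy base learners flagging stunting/underweight/wasting from z-scores
# (stand-ins for the trained DT, SVM, KNN and NB classifiers).
stunted = lambda s: int(s["haz"] < -2)       # height-for-age z-score
underweight = lambda s: int(s["waz"] < -2)   # weight-for-age z-score
wasted = lambda s: int(s["whz"] < -2)        # weight-for-height z-score

# Majority vote as a simple stand-in for the logistic-regression meta-learner.
majority = lambda preds: int(sum(preds) >= (len(preds) + 1) // 2)
```

In the real pipeline the meta-learner is trained on out-of-fold base predictions so that it never sees predictions made on a base model's own training data.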
  • Item
    Improving delay tolerant network buffer management approach for rural areas' health professionals' information exchange system
    (Hawassa University, 2022-08-06) Mulusew Abebe
    Delay-tolerant networks (DTNs) are mobile wireless networks designed to provide end-to-end connectivity in areas where networks are unreliable and often susceptible to interference. Despite the rapid advancement of communication technology, there are still rural places not connected to the Internet, and health information exchange between rural and urban areas is still hampered by inadequate telecommunication infrastructure coverage, intermittent connectivity, and the absence of end-to-end connectivity. The DTN concept was introduced to bridge communication gaps in places that have not been connected to the Internet. With current TCP/IP technology, communication is possible only when an end-to-end path is available; as a result, the usual Internet and TCP/IP networking is not viable in harsh environments characterized by the lack of a direct path between nodes, frequent power outages, and intermittent connectivity. In this work, the researcher investigated the performance of various DTN routing protocols and selected MaxProp as the most suitable for the proposed framework. Most DTN routing algorithms assume unlimited node buffer space, which is not the case in reality; because flooding-based routing relies on buffering a copy of every message at every node, buffer space has a substantial impact on delivery probability. Existing buffer management policies compute in a biased way, driven by a single parameter while other relevant parameters are neglected, and so cannot make a reasonable selection. Therefore, the researcher proposed a buffer management approach for situations with short contact durations, limited bandwidth, and limited buffers.
    The proposed approach improves buffer availability by implementing three buffer management strategies, namely scheduling, dropping, and clearing buffers entirely, computed from three parameters: message type, hop count, and time to live. The performance of the proposed approach was validated through simulation using the Opportunistic Network Environment (ONE) simulator and analyzed on three metrics: delivery probability, average latency, and overhead ratio. The simulation results show that when node buffers become constrained, the proposed MaxProp Routing based on Message Type Priority (MPRMTP) performs better than the existing buffer management policy, increasing message delivery quality and decreasing the overhead ratio; when there is sufficient buffer space, MaxProp and MPRMTP show comparable performance.
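A drop policy driven jointly by message type, hop count, and time to live, rather than a single parameter, can be sketched as a sort key: when the buffer overflows, the least valuable messages (low-priority type, many hops already traveled, little TTL left) are evicted first. The field names and ordering below are assumptions for illustration, not the MPRMTP implementation:

```python
def enqueue(buffer, msg, capacity):
    """Insert msg, evicting the least valuable messages when over capacity."""
    buffer.append(msg)
    # Most valuable first: high type priority, few hops (under-replicated),
    # long remaining time-to-live.
    buffer.sort(key=lambda m: (-m["priority"], m["hops"], -m["ttl"]))
    del buffer[capacity:]   # drop everything past capacity (the tail)
    return buffer
```

Combining the three parameters into one ordering is what avoids the single-parameter bias the thesis criticizes, since no message is dropped on hop count or TTL alone.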