DEVELOPING LOAN DEFAULT PREDICTION MODEL USING MACHINE LEARNING TECHNIQUES
Date
2024-07-10
Publisher
Hawassa University
Abstract
Loan defaults pose a significant risk to financial institutions, leading to substantial financial losses
and impacting their stability and profitability. Existing predictive models often overlook key
borrower characteristics, resulting in less accurate predictions. This study aims to improve loan
default prediction by integrating borrower-specific features and loan characteristics using a
blending ensemble model. Specifically, we focus on borrower characteristics such as business
location, loan product type, yearly business income, location of collateral, total years of
experience, and educational status, which are used by some Ethiopian banks for risk assessment
but have been underexplored in previous studies.
We employ three base models: logistic regression, multilayer perceptron, and random forest. These
models are combined using a weighted average blending ensemble approach to enhance predictive
performance. The dataset, consisting of 18,184 records from a single bank, was split using a 70/30
ratio for training and testing.
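
The weighted average blending step described above can be sketched as follows. This is a minimal illustration, not the study's implementation: the function names, the example probabilities, the weights, and the 0.5 decision threshold are all assumptions made for the example, since the abstract does not state them.

```python
from typing import List, Sequence


def blend_probabilities(
    model_probs: Sequence[Sequence[float]], weights: Sequence[float]
) -> List[float]:
    """Return the weighted average of per-model default probabilities.

    model_probs holds one sequence of probabilities per base model,
    all aligned on the same borrowers.
    """
    if len(model_probs) != len(weights):
        raise ValueError("need exactly one weight per base model")
    total = sum(weights)
    if total <= 0:
        raise ValueError("weights must sum to a positive value")
    n = len(model_probs[0])
    if any(len(p) != n for p in model_probs):
        raise ValueError("all models must score the same borrowers")
    # Normalise by the weight total so the result stays a probability.
    return [
        sum(w * probs[i] for w, probs in zip(weights, model_probs)) / total
        for i in range(n)
    ]


def predict_default(blended: Sequence[float], threshold: float = 0.5) -> List[int]:
    """Label a borrower 1 (default) when the blended probability reaches the threshold."""
    return [1 if p >= threshold else 0 for p in blended]


# Hypothetical predicted default probabilities for three borrowers.
lr_probs = [0.20, 0.70, 0.55]   # logistic regression
mlp_probs = [0.10, 0.90, 0.40]  # multilayer perceptron
rf_probs = [0.30, 0.80, 0.45]   # random forest

# Hypothetical blending weights (the abstract does not report the real ones).
weights = [0.2, 0.3, 0.5]

blended = blend_probabilities([lr_probs, mlp_probs, rf_probs], weights)
labels = predict_default(blended)
```

In practice the weights would typically be chosen on held-out data, for example in proportion to each base model's validation score.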
Our findings demonstrate that the blending ensemble model outperforms the individual base models in
predicting loan defaults, achieving higher accuracy (98.62%), precision, recall, and F1-score. The
most significant predictors identified include sex, collected total, educational status,
employment status, and age, while gender and marital status showed lesser impact. This study
contributes to the field by providing a more robust predictive model that incorporates
underexplored borrower characteristics, offering financial institutions a more accurate tool for risk
assessment and decision-making.
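
For reference, the four evaluation metrics named above can be computed from predicted and true labels as in the sketch below. The function name and the toy labels are illustrative assumptions and do not reproduce the study's data or results; a label of 1 is assumed to mean "default".

```python
from typing import Dict, Sequence


def classification_metrics(
    y_true: Sequence[int], y_pred: Sequence[int]
) -> Dict[str, float]:
    """Compute accuracy, precision, recall, and F1 for binary labels (1 = default)."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("label sequences must be non-empty and of equal length")
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # defaults caught
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # non-defaults cleared
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false alarms
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # missed defaults

    accuracy = (tp + tn) / len(pairs)
    # Guard against division by zero when a class is never predicted or present.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (
        2 * precision * recall / (precision + recall)
        if (precision + recall)
        else 0.0
    )
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}


# Toy labels for eight hypothetical loans.
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
y_pred = [1, 0, 0, 1, 0, 1, 1, 0]
metrics = classification_metrics(y_true, y_pred)
```

Because default datasets are often imbalanced, precision, recall, and F1 are usually read alongside accuracy rather than replaced by it.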
Keywords
Loan default, machine learning, loan status, normal loan, special mention, substandard loans, doubtful loans, loss loan, blending ensemble, multilayer perceptron, random forest, logistic regression
