How Secure Are Good Loans: Validating Loan-Granting Decisions And Predicting Default Rates On Consumer Loans

Machine learning predictivity applied to consumer creditworthiness

Future Business Journal, 2020

Credit risk evaluation plays an important role for financial institutions, since lending may result in real and immediate losses. In particular, default prediction is one of the most challenging activities in managing credit risk. This study analyzes the adequacy of borrower classification models using a Brazilian bank's loan database and explores machine learning techniques. We develop Support Vector Machine, Decision Tree, Bagging, AdaBoost and Random Forest models, and compare their predictive accuracy with a benchmark based on a Logistic Regression model. Comparisons are based on the usual classification performance metrics. Our results show that Random Forest and AdaBoost perform better than the other models. Moreover, Support Vector Machine models perform poorly with both linear and nonlinear kernels. Our findings suggest that there are value-creating opportunities for banks to improve default prediction models by exploring machine learning techniques.
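As an illustration of the "usual classification performance metrics" mentioned above, the following minimal sketch computes accuracy, precision, recall and F1 for two sets of predictions and reports the gain over a logistic regression benchmark. The labels and predictions are hypothetical toy values, not results from the study, and the model names are used only as dictionary keys.

```python
def classification_metrics(y_true, y_pred):
    """Return accuracy, precision, recall and F1 for binary labels (1 = default)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return {
        "accuracy": (tp + tn) / len(pairs),
        "precision": precision,
        "recall": recall,
        "f1": f1,
    }


# Hypothetical ground truth and predictions (assumptions for illustration only).
y_true = [1, 0, 0, 1, 0, 1, 0, 0, 1, 0]
predictions = {
    "logistic_regression": [1, 0, 0, 0, 0, 1, 1, 0, 0, 0],
    "random_forest":       [1, 0, 0, 1, 0, 1, 0, 0, 0, 0],
}

results = {name: classification_metrics(y_true, pred)
           for name, pred in predictions.items()}
benchmark = results["logistic_regression"]

for name, metrics in results.items():
    gain = metrics["f1"] - benchmark["f1"]
    print(name, metrics, "F1 gain vs benchmark:", round(gain, 3))
```

Reporting several metrics side by side matters in default prediction because defaults are usually the rarer class, so accuracy alone can hide poor recall on defaulters.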

A New Prediction Approach for Preventing Default Customers from Applying Personal Loans Using Machine Learning

IJCSMC, 2021

In the Egyptian banking industry, loan officers use pure judgment to make personal loan approval decisions. In this paper, we develop a new method for predicting customers' loan defaults using machine learning. The method uses the available personal data and historical credit data to evaluate the creditworthiness of customers applying for loans. We used the ABE dataset for training and testing, with 10 features drawn from the application form and the i-score report class; these could help credit officers make the right decision and avoid selecting customers at random. The collected dataset was analysed using various machine learning classifiers based on the important selected features, to obtain high accuracy. We compared the performance of several machine learning classifiers before and after feature selection. We found that, in terms of accuracy, the most important features are (activity, income, loan), and that the decision tree classifier outperformed every other machine learning classifier, with a prediction accuracy of almost 94.85%.

Statistical Learning for Analysis of Credit Risk Data

In the financial sector, credit risk and financial modeling have been widely explored in practice, first through pre-existing models and now through the introduction of machine learning approaches. Our investigation builds a prediction model on the "Give Me Some Credit" dataset from Kaggle to help understand credit scoring and potential patterns of delinquency. Using various analytical models based on machine learning methods, risk levels of future credit loans are identified by predicting the probability of an individual experiencing future financial distress. The accuracy and quality of the classifier are inspected through the ROC curve. The ability to build a precise model that can validate an individual's credit behaviour is further investigated in the report, along with insight into the significant variables. Modelling an individual's credit score is imperative, as the categorization is the initial and indicative impression of their financial responsibility.
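The ROC-based inspection described above is often summarized by the area under the curve (AUC). The sketch below computes AUC from its rank interpretation: the probability that a randomly chosen distressed borrower receives a higher score than a randomly chosen non-distressed one, with ties counted as one half. The labels and scores are hypothetical and are not taken from the Kaggle dataset.

```python
def roc_auc(y_true, scores):
    """Area under the ROC curve via pairwise comparison of positives and negatives."""
    positives = [s for t, s in zip(y_true, scores) if t == 1]
    negatives = [s for t, s in zip(y_true, scores) if t == 0]
    if not positives or not negatives:
        raise ValueError("AUC needs at least one example of each class")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0      # positive ranked above negative
            elif p == n:
                wins += 0.5      # ties count as half a win
    return wins / (len(positives) * len(negatives))


# Hypothetical labels (1 = financial distress) and predicted probabilities.
y_true = [0, 0, 1, 1]
scores = [0.1, 0.4, 0.35, 0.8]
auc = roc_auc(y_true, scores)
print(auc)
```

The pairwise form is quadratic in the number of examples, which is fine for a sketch; for large data sets a sort-based computation would be the more usual choice.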

Reassessment and Monitoring of Loan Applications with Machine Learning

Applied Artificial Intelligence, 2018

Credit scoring and monitoring are the two important dimensions of the decision-making process for loan institutions. In the first part of this study, we investigate the role of machine learning in applicant reassessment and propose a complementary screening step to an existing scoring system. We use a real data set from one of the prominent loan companies in Turkey. The information provided by the applicants forms the variables in our analysis. The company's experts have already labeled the clients as bad or good according to their ongoing payments. Using this labeled data set, we apply several methods to classify the bad applicants and to identify the significant variables in this classification. As the data set consists of applicants who have passed the initial scoring system, most of the clients are marked as good. To deal with this imbalance, we employ a set of different approaches to improve the performance of predicting the applicants who are likely to default. In the second part of this study, we aim to predict the payment behavior of clients based on their static (demographic and financial) and dynamic (payment) information. Furthermore, we analyze the effect of the length of the payment history and the staying power of the proposed prediction models.
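The abstract does not say which approaches were used to handle the class imbalance. One common option, shown here only as an assumed example, is random oversampling: minority-class ("bad") records are duplicated at random until both classes have the same size. The function name, the toy records and the seed are hypothetical.

```python
import random


def random_oversample(rows, labels, minority_label, seed=0):
    """Duplicate random minority-class rows until both classes are the same size."""
    rng = random.Random(seed)
    minority = [(r, l) for r, l in zip(rows, labels) if l == minority_label]
    majority = [(r, l) for r, l in zip(rows, labels) if l != minority_label]
    if not minority:
        raise ValueError("no rows with the minority label")
    # Number of extra minority copies needed to match the majority class.
    shortfall = max(0, len(majority) - len(minority))
    extra = [rng.choice(minority) for _ in range(shortfall)]
    combined = majority + minority + extra
    rng.shuffle(combined)
    return [r for r, _ in combined], [l for _, l in combined]


# Hypothetical toy data: 8 "good" clients and 2 "bad" ones.
rows = [{"client_id": i} for i in range(10)]
labels = ["good"] * 8 + ["bad"] * 2

balanced_rows, balanced_labels = random_oversample(rows, labels, "bad")
print(balanced_labels.count("good"), balanced_labels.count("bad"))
```

Oversampling is normally applied to the training split only, so that duplicated records do not leak into the data used for evaluation.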

Unveiling the Future: Machine Learning Predicts Credit Card Scores

International Journal of Scientific Research in Computer Science, Engineering and Information Technology, 2023

The review of credit issuance decisions has undergone significant enhancements by incorporating manual judgement and statistical analysis into the decision-making processes. As financial institution databases grow in size, this integration has remarkably improved the reliability and efficiency of credit issuance decisions. Machine learning algorithms, especially Artificial Neural Network (ANN), have played a pivotal role in assisting with credit approval decisions. However, the varying algorithms and parameter selections among prediction models have led to differences in prediction performance. This study aims to improve model construction in the credit scoring process and analyze the forecast effectiveness of prevalent models. By setting a predetermined performance objective, numerous regression models and classifiers, including Decision Trees and Random Forest, were evaluated for their prediction accuracy. Through rigorous experimentation, ANN emerged as the top-performing model, exhibiting the highest performance score in terms of balanced accuracy. The findings of this research contribute to refining credit approval decision-making and offer valuable insights for financial institutions seeking to adopt robust machine learning models for credit scoring, ultimately enhancing the overall credit assessment process.
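Balanced accuracy, the score used above to rank the models, is the mean of the per-class recalls. The following minimal sketch uses hypothetical labels to show why it differs from plain accuracy when one class dominates.

```python
def balanced_accuracy(y_true, y_pred):
    """Mean of per-class recall, so every class contributes equally."""
    recalls = []
    for cls in sorted(set(y_true)):
        indices = [i for i, t in enumerate(y_true) if t == cls]
        correct = sum(1 for i in indices if y_pred[i] == cls)
        recalls.append(correct / len(indices))
    return sum(recalls) / len(recalls)


# Hypothetical data: 9 approved-and-repaid cases (0) and 1 default (1).
y_true = [0] * 9 + [1]
always_approve = [0] * 10   # a trivial model that never predicts default

plain_accuracy = sum(t == p for t, p in zip(y_true, always_approve)) / len(y_true)
balanced = balanced_accuracy(y_true, always_approve)
print(plain_accuracy, balanced)
```

The trivial model looks strong on plain accuracy but scores no better than chance on balanced accuracy, which is why the latter is a reasonable target for credit data with few defaults.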

Credit Risk Assessment Using Statistical and Machine Learning: Basic Methodology and Risk Modeling Applications

Computational Economics, 2000

Risk assessment of financial intermediaries is an area of renewed interest due to the financial crises of the 1980's and 90's. An accurate estimation of risk, and its use in corporate or global financial risk models, could be translated into a more efficient use of resources. One important ingredient to accomplish this goal is to find accurate predictors of individual risk in the credit portfolios of institutions. In this context we make a comparative analysis of different statistical and machine learning modeling methods of classification on a mortgage loan data set with the motivation to understand their limitations and potential. We introduced a specific modeling methodology based on the study of error curves. Using state-of-the-art modeling techniques we built more than 9,000 models as part of the study. The results show that CART decision-tree models provide the best estimation for default with an average 8.31% error rate for a training sample of 2,000 records. As a result of the error curve analysis for this model we conclude that if more data were available, approximately 22,000 records, a potential 7.32% error rate could be achieved. Neural Networks provided the second best results with an average error of 11.00%. The K-Nearest Neighbor algorithm had an average error rate of 14.95%. These results outperformed the standard Probit algorithm which attained an average error rate of 15.13%. Finally we discuss the possibilities to use this type of accurate predictive model as ingredients of institutional and global risk models.
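The abstract does not give the functional form of its error curves. As a rough sketch of the general idea of extrapolating error against training-set size, the code below fits the assumed form error(n) = a + b / n by least squares and evaluates it at a larger sample size. Both the functional form and the data points are hypothetical and do not reproduce the paper's figures.

```python
def fit_error_curve(sizes, errors):
    """Least-squares fit of error(n) = a + b / n; returns (a, b)."""
    xs = [1.0 / n for n in sizes]
    count = len(xs)
    mean_x = sum(xs) / count
    mean_y = sum(errors) / count
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, errors))
    b = sxy / sxx              # slope with respect to 1/n
    a = mean_y - b * mean_x    # asymptotic error as n grows
    return a, b


def predict_error(a, b, n):
    """Evaluate the fitted curve at training-set size n."""
    return a + b / n


# Hypothetical (training size, test error) measurements.
sizes = [500, 1000, 2000, 4000]
errors = [0.11, 0.09, 0.08, 0.075]

a, b = fit_error_curve(sizes, errors)
projected = predict_error(a, b, 22000)
print(round(a, 4), round(b, 2), round(projected, 4))
```

The intercept `a` plays the role of the error floor the model approaches with unlimited data, so the gap between the current error and `a` indicates how much additional records could help.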

Prediction Model for Loan Default Using Machine Learning

The International Journal of Science & Technoledge

Business firms and households sometimes seek extra funding to fulfill certain needs. The demand that arises from this need is met by the credit market. Banks and other financial lending institutions are the key players in this market (Gaigaliene and Cesnys, 2018). Loans are among the most important products of most financial institutions. All financial lenders try to find effective business strategies for persuading customers to apply for loans. However, some borrowers default on loan payments (Begum and Deniz, 2019). During a loan term, default may occur when the borrower fails to make required payments. Therefore, an assessment of a borrower's default risk over time is essential to enable timely risk management. Credit officers determine whether borrowers can fulfill their requirements through manual analysis of the borrower's credit history. In the last decade, this practice has changed with technological advancement (Rehman, 2017). In recent years, financial lending institutions have been using automated loan default models as credit risk scoring tools when granting loans to potential borrowers (Bao et al., 2019). Machine Learning (ML) algorithms have been applied to assess the credit risk of borrowers in financial lending institutions (Djeundj and Crook, 2018). Reliable credit risk models play an important role in loss control and revenue maximization (Luo and Nie, 2016). Earlier research treated loan default prediction as a binary classification problem, where a loan is classified as either creditworthy or non-creditworthy (Rosenberg and Gleit, 1994). Linear Discriminant Analysis (LDA) and logistic regression (LR) are the two most popular tools for constructing credit scoring models (Wiginton, 1980). Subsequently, other classification algorithms, such as artificial neural networks (ANN) (Gulsoy and Kulluk, 2019), support vector machines (SVM) (Alaka et al., 2018), decision trees (DT) (Liu et al., 2015), and Bayesian classifiers (BC) (Carta et al., 2020), have been used to estimate borrowers' probability of default. Recently, time-to-default modeling has attracted increasing research interest (Dirick et al., 2017). Time-to-default data fall into the category of lifetime data in general, which is commonly analyzed by survival analysis (SA) (Malekipirbazari and Aksakalli, 2015). In loan prediction, two types of errors inevitably lead to inefficiency in prediction
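To make the survival analysis view of time-to-default data concrete, here is a minimal sketch of the Kaplan-Meier estimator, a standard survival analysis tool that the excerpt itself does not name. Loans that are still performing when observation ends are treated as censored. The durations and default flags are hypothetical.

```python
def kaplan_meier(durations, observed):
    """Return [(time, survival_probability)] at each time a default was observed.

    durations: months until default or until observation ended (censoring).
    observed:  True if the loan defaulted, False if it was censored.
    """
    event_times = sorted({t for t, e in zip(durations, observed) if e})
    survival = 1.0
    curve = []
    for t in event_times:
        # Loans still under observation just before time t.
        at_risk = sum(1 for d in durations if d >= t)
        # Defaults occurring exactly at time t.
        events = sum(1 for d, e in zip(durations, observed) if d == t and e)
        survival *= 1.0 - events / at_risk
        curve.append((t, survival))
    return curve


# Hypothetical loan histories, in months.
durations = [3, 5, 5, 8, 12, 12]
observed = [True, True, False, True, False, False]

curve = kaplan_meier(durations, observed)
for month, prob in curve:
    print(month, round(prob, 4))
```

Unlike a binary classifier, this estimate describes when defaults tend to happen, and it uses censored loans for as long as they were observed instead of discarding them.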

Recent developments in consumer credit risk assessment

European Journal of Operational …, 2007

Consumer credit risk assessment involves the use of risk assessment tools to manage a borrower's account from the time of pre-screening a potential application through to the management of the account during its life and possible write-off. The riskiness of lending to a credit applicant is usually estimated using a logistic regression model, though researchers have considered many other types of classifier. Whilst preliminary evidence suggests that support vector machines are the most accurate, data quality issues may prevent these laboratory-based results from being achieved in practice. The training of a classifier on a sample of accepted applicants rather than on a sample representative of the applicant population seems not to result in bias, though it does result in difficulties in setting the cut-off. Profit scoring is a promising line of research, and the Basel 2 accord has had profound implications for the way in which credit applicants are assessed and bank policies adopted.
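The role of the cut-off in a logistic regression scorecard can be sketched as follows. The feature names, coefficients and cut-off values are hypothetical assumptions chosen for illustration; they are not taken from the paper or from any real scorecard.

```python
import math


def default_probability(features, weights, intercept):
    """Logistic model: probability of default from a linear score."""
    z = intercept + sum(w * x for w, x in zip(weights, features))
    return 1.0 / (1.0 + math.exp(-z))


def decide(probability, cutoff):
    """Reject the application when the estimated default risk reaches the cut-off."""
    return "reject" if probability >= cutoff else "accept"


# Hypothetical coefficients for [debt_to_income, past_delinquencies].
weights = [2.0, 0.8]
intercept = -3.0

applicant = [1.0, 1]          # hypothetical applicant
p = default_probability(applicant, weights, intercept)

print(round(p, 3), decide(p, cutoff=0.5), decide(p, cutoff=0.3))
```

The same estimated probability leads to opposite decisions under the two cut-offs, which illustrates why choosing the cut-off is a separate problem from fitting the model.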

Bank Loan Prediction Using Machine Learning

In our banking system, banks sell many products, but credit lines are the main source of income for any bank, since banks earn interest on the loans they extend. A bank's profit or loss depends heavily on its loans, that is, on whether customers repay or default. By predicting loan defaults, banks can reduce their bad assets, which makes the study of this problem very important. Earlier research has shown that there are numerous ways to study the problem of credit default management. However, making correct predictions is so important to maximizing profits that it is essential to study the properties of different methods and how they compare. The predictive analytics approach used to study loan default prediction consists of (i) data collection, (ii) data cleaning, and (iii) performance evaluation. Experimental tests show that the Naive Bayes model outperforms other models in terms of credit prediction.