Latent factor models for credit scoring in P2P systems

Network Based Scoring Models to Improve Credit Risk Management in Peer to Peer Lending Platforms

Frontiers in Artificial Intelligence

Financial intermediation has changed extensively over the course of the last two decades. One of the most significant changes has been the emergence of FinTech. In the context of credit services, fintech peer-to-peer lenders have introduced many opportunities, including improved speed, better customer experience, and reduced costs. However, peer-to-peer lending platforms also lead to higher risks, including higher credit risk (not owned by the lenders) and systemic risk (due to the high interconnectedness among borrowers generated by the platform). This calls for new and more accurate credit risk models to protect consumers and preserve financial stability. In this paper we propose to enhance the credit risk accuracy of peer-to-peer platforms by leveraging topological information embedded in similarity networks derived from borrowers' financial information. Topological coefficients describing borrowers' importance and community structures are employed as additional explanatory variables, leading to improved predictive performance of credit scoring models.
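
The idea of deriving a similarity network from borrowers' financial information and feeding a topological coefficient back into the scoring model can be illustrated with a minimal sketch. It is not the authors' implementation: the Euclidean distance, the `threshold` cutoff, the choice of degree centrality, and the toy ratios are all assumptions made for illustration.

```python
import math

def similarity_network(features, threshold):
    """Link two borrowers when the Euclidean distance between their
    feature vectors is below `threshold` (a hypothetical cutoff)."""
    n = len(features)
    adjacency = [[0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            if math.dist(features[i], features[j]) < threshold:
                adjacency[i][j] = adjacency[j][i] = 1
    return adjacency

def degree_centrality(adjacency):
    """Share of the other borrowers that each borrower is linked to."""
    n = len(adjacency)
    return [sum(row) / (n - 1) for row in adjacency]

# Toy data (made-up numbers): two financial ratios per borrower.
borrowers = [[0.10, 0.20], [0.12, 0.22], [0.11, 0.18], [0.90, 0.80]]
adjacency = similarity_network(borrowers, threshold=0.1)
centrality = degree_centrality(adjacency)

# Append the centrality to each row as an additional explanatory variable.
augmented = [row + [c] for row, c in zip(borrowers, centrality)]
```

In this toy sample the first three borrowers are linked to one another and the fourth is isolated, so the added column separates a dense group from an outlier. The augmented rows could then be passed to any scoring model.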

Credit Scoring for Peer-to-Peer Lending

Risks

This paper shows how to improve the measurement of credit scoring by means of factor clustering. The improved measurement applies, in particular, to small and medium enterprises (SMEs) involved in P2P lending. The approach explores the concept of familiarity, which relies on the notion that the more familiar or similar things are, the closer they are in terms of functionality or hidden characteristics (latent factors that drive the observed data). The approach uses singular value decomposition to extract the factors underlying the observed financial performance ratios of SMEs. We then cluster the factors using the standard k-means algorithm. This enables us to segment the heterogeneous population into clusters with more homogeneous characteristics. The results show that clusters with relatively fewer SMEs produce a more parsimonious and interpretable credit scoring model with better default predictive performance.
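
The two steps described above, extracting factors with singular value decomposition and clustering them with k-means, can be sketched as follows. This is a minimal illustration assuming NumPy is available. The standardization step, the number of factors, the deterministic seeding of the centers, and the toy ratios are assumptions, not details taken from the paper.

```python
import numpy as np

def latent_factors(ratios, n_factors):
    """Project standardized ratios onto the leading singular directions."""
    standardized = (ratios - ratios.mean(axis=0)) / ratios.std(axis=0)
    u, s, _ = np.linalg.svd(standardized, full_matrices=False)
    return u[:, :n_factors] * s[:n_factors]  # factor scores, one row per firm

def k_means(points, k, n_iter=50):
    """Plain Lloyd's algorithm, seeded with the first k points so that
    the sketch is repeatable (a simplification, not a recommended seeding)."""
    centers = points[:k].copy()
    for _ in range(n_iter):
        # Distance from every point to every center, shape (n_points, k).
        distances = np.linalg.norm(points[:, None, :] - centers[None, :, :], axis=2)
        labels = distances.argmin(axis=1)
        for c in range(k):
            if np.any(labels == c):
                centers[c] = points[labels == c].mean(axis=0)
    return labels

# Toy data (made-up numbers): three financial ratios for six SMEs.
ratios = np.array([
    [0.10, 0.20, 0.15],
    [0.90, 0.80, 0.85],
    [0.12, 0.22, 0.14],
    [0.88, 0.83, 0.90],
    [0.11, 0.19, 0.16],
    [0.91, 0.79, 0.86],
])
factors = latent_factors(ratios, n_factors=2)
labels = k_means(factors, k=2)
```

Each cluster label identifies a more homogeneous segment, and a separate scoring model could then be fitted within each segment.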

Network-Based Models to Improve Credit Scoring Accuracy

2018 IEEE 5th International Conference on Data Science and Advanced Analytics (DSAA), 2018

Technological advancements have prompted the emergence of peer-to-peer credit services which improve user experience and offer significant reductions in costs. These advantages may be offset by a higher credit risk, due to disintermediation and information asymmetries. We postulate that network-based information can be employed as a tool for reducing risks through an improved credit scoring model that increases the accuracy of default predictions. Our research assumption is confirmed by an empirical analysis showing that including network parameters in classical scoring algorithms, such as logistic regression and CART, does indeed improve predictive accuracy.

Credit-worthiness Prediction in Microfinance using Mobile Data: A Spatio-network Approach

2016

Many communities in underdeveloped and developing economies of the world suffer from lack of access to personal credit via formal financial institutions, like banks. However, with the rapid increase in Internet and mobile phone penetration rates, firms are now trying to circumvent this problem using novel technology-enabled approaches. In this research, we leverage a real-world dataset obtained in collaboration with a microfinance firm to show that locational data from mobile phones, coupled with information about communication networks, can be effectively exploited to improve prediction of loan default rates. Specifically, we draw upon recent work in network cohesion based regression modeling to develop a model that uses locational predictors, but within a networked context. We contend that the results from our research can not only illuminate how locational data might be used in assessing creditworthiness, but also empower microfinance firms in resource-poor communities with novel methods for credit scoring.

Predictive Analysis of Default Risk in Peer-to-Peer Lending Platforms: Empirical Evidence from LendingClub

Journal of Financial Risk Management, 2023

In recent years, the expansion of Fintech has accelerated the development of the online peer-to-peer lending market, offering a huge investment opportunity by directly connecting borrowers to lenders without traditional financial intermediaries. This innovative approach is, however, accompanied by increasing default risk, since information asymmetry tends to rise with online businesses. This paper aims to predict the borrower's probability of default using data from LendingClub, the leading American online peer-to-peer lending platform. For this purpose, three machine learning methods were employed: logistic regression, random forest and neural network. Before the scoring models were built, the LendingClub model was assessed using the grades attributed to the borrowers in the dataset. The results indicated that the LendingClub model performed poorly, with an AUC of 0.67, whereas the logistic regression (0.9), the random forest (0.9) and the neural network (0.93) displayed better predictive power. The neural network classifier outperformed the other models with the highest AUC. No difference was noted in their accuracy, which was 0.9 for all three. In addition, to improve their investment decisions, investors might take into consideration the relationship between some variables and the likelihood of default. For instance, the higher the loan amount, the higher the likelihood of default; the higher the debt-to-income ratio, the higher the likelihood of default; and the higher the annual income, the lower the probability of default. The probability of default also tends to decline as the number of total open accounts rises.
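
The models above are compared by AUC, which can be read as the probability that a randomly chosen defaulter receives a higher predicted score than a randomly chosen non-defaulter. The sketch below computes it directly from that definition. The labels and scores are made-up numbers, not LendingClub data, and the pairwise loop is chosen for clarity rather than speed.

```python
def auc(labels, scores):
    """Area under the ROC curve via pairwise comparison:
    the share of (defaulter, non-defaulter) pairs ranked correctly,
    with ties counted as half."""
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))

# Toy example: 1 = default, 0 = repaid; scores are predicted default probabilities.
defaults = [1, 0, 1, 0, 0, 1]
predicted = [0.9, 0.2, 0.6, 0.7, 0.1, 0.8]
model_auc = auc(defaults, predicted)  # 8 of 9 pairs are ranked correctly
```

An AUC of 0.5 corresponds to random ranking and 1.0 to perfect ranking, which is the scale on which the reported 0.67 and 0.93 should be read.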

Spatial Regression Models to Improve P2P Credit Risk Management

Frontiers in Artificial Intelligence

Calabrese et al. (2017) have shown how binary spatial regression models can be exploited to measure contagion effects in credit risk arising from bank failures. To illustrate their methodology, the authors have employed the Bank for International Settlements' data on flows between country banking systems. Here we apply a binary spatial regression model to measure contagion effects arising from corporate failures. To derive interconnectedness measures, we use the World Input-Output Trade (WIOT) statistics between economic sectors. Our application is based on a sample of 1,185 Italian companies. We provide evidence of high levels of contagion risk, which increases the individual credit risk of each company.

Predicting Peer to Peer Lending Loan Risk Using Classification Approach

International Journal of Advanced Science Computing and Engineering

Technological innovations have affected all sectors of life, especially the financial sector, with the emergence of financial technology. One such innovation is Peer-to-Peer Lending ("P2P Lending"). Credit risk management is essential to P2P Lending because it directly affects business results. It is therefore important for P2P lending platforms to predict which borrowers are most likely to become good or bad loans based on their profile or characteristics. In the experiments, five classification algorithms are used: Gradient Boosted Trees, Naïve Bayes, Random Forest, Decision Tree and Logistic Regression. Two models performed well: Random Forest, with an accuracy of 93.38%, and Decision Tree, with 92.35%.

The use of profit scoring as an alternative to credit scoring systems in peer-to-peer (P2P) lending

Decision Support Systems, 2016

This study goes beyond peer-to-peer (P2P) lending credit scoring systems by proposing a profit scoring system. Credit scoring systems estimate loan default probability. Although failed borrowers do not reimburse the entire loan, certain amounts may be recovered. Moreover, the riskiest types of loans possess a high probability of default, but they also pay high interest rates that can compensate for delinquent loans. Unlike prior studies, which generally seek to determine the probability of default, we focus on predicting the expected profitability of investing in P2P loans, measured by the internal rate of return. Overall, 40,901 P2P loans are examined in this study. Factors that determine loan profitability are analyzed, finding that these factors differ from factors that determine the probability of default. The results show that P2P lending is not currently a fully efficient market. This means that data mining techniques are able to identify the most profitable loans, or in financial jargon, “beat the market.” In the analyzed sample, it is found that a lender selecting loans by applying a profit scoring system using multivariate regression outperforms the results obtained by using a traditional credit scoring system based on logistic regression.
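
Profitability here is measured by the internal rate of return (IRR), the per-period rate at which the net present value of a loan's cash flows is zero. A minimal sketch of that calculation follows. The bisection solver, its search bounds, and the example cash flows are assumptions for illustration and are not taken from the study.

```python
def npv(rate, cash_flows):
    """Net present value of cash flows indexed by period (period 0 first)."""
    return sum(cf / (1 + rate) ** t for t, cf in enumerate(cash_flows))

def irr(cash_flows, low=-0.99, high=1.0, tol=1e-10):
    """Per-period IRR by bisection. Assumes one initial outflow followed by
    inflows, so NPV falls as the rate rises and crosses zero once in [low, high]."""
    for _ in range(200):
        mid = (low + high) / 2
        if npv(mid, cash_flows) > 0:
            low = mid   # rate too low: NPV still positive
        else:
            high = mid  # rate too high: NPV negative or zero
        if high - low < tol:
            break
    return (low + high) / 2

# Hypothetical loans: lend 1000, then receive the listed amounts per period.
repaid_loan = [-1000, 0, 0, 1331]       # repaid in full with interest
defaulted_loan = [-1000, 400, 400, 0]   # partial recovery before default

repaid_irr = irr(repaid_loan)
defaulted_irr = irr(defaulted_loan)
```

The second loan shows why profit scoring differs from default prediction: a default with partial recovery still has a well-defined, though here negative, return that can be compared across loans.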

PUTTING THE BRAIN TO WORK: CREDIT INDEX EVALUATION FOR P2P LENDING BASED ON ARTIFICIAL NEURAL NETWORK MODELING

Compendium, 2018

Effective assessment of a borrower's various credit indexes is key to unravelling the problem of information asymmetry in the context of Peer-to-Peer (P2P) lending. Adverse selection of borrowers with high default potential continues to plague P2P lending platforms. In order to understand which factors determine borrower credit status (i.e., loan approval, loan repayment potential, risk of default), this study applies an Artificial Neural Network model to one of the most popular P2P lending platforms. Our results show that the interest rate, the ratio of loan to income and the loan term are the most important indicators of the borrower's credit status, while the frequency of inquiries and the borrowing category have a relatively low degree of importance. This study finds that the borrower's credit index status is better explained at the lower quantiles and becomes more difficult to discern at higher quantiles. This work also finds that for longer loan terms, the borrower repayment pressure and the default rates rise with higher loan-to-income ratios and higher interest rates. Additionally, we find that higher credit rankings and higher expected returns lead to higher probabilities of defaulting. To reduce the probability of borrower default, this study recommends building lending groups or lending pools, selecting higher income credit candidates and increasing credit limits. To validate our results, we perform robustness tests that modify the learning coefficient and the training-to-validation data ratio in order to show that the empirical results of this paper are robust and effective.

Lending Behavior and Community Structure in an Online Peer-to-Peer Economic Network

2009 International Conference on Computational Science and Engineering, 2009

Increasingly, economic transactions are taking place over social networks. We study the static and dynamic characteristics of a peer-to-peer lending network through 350,000 loan listings and accompanying member profiles from the online marketplace Prosper.com. Our results imply that social factors such as participation in affinity groups and descriptive profile text are correlated with financial indicators; at the same time, we see evidence of suboptimal lending decisions, minimal learning, and herding behavior in the network. We discuss implications and suggest possible improvements to the online peer-to-peer lending model.