LightMIRM: Light Meta-learned Invariant Risk Minimization for Trustworthy Loan Default Prediction | IEEE Conference Publication | IEEE Xplore

Abstract:

Machine learning models are increasingly applied to loan default prediction to reduce the labor cost of financial institutions and the waiting time of loan applicants. We find that existing loan default prediction models still lack minimax fairness, i.e., they suffer significant performance drops on underrepresented subpopulations. The main cause of this trustworthiness issue is the pursuit of Empirical Risk Minimization over the whole population, which overlooks the underrepresented subpopulations. To tackle this issue, we split the training data into subpopulations (a.k.a. environments) and conduct Invariant Risk Minimization (IRM) to learn the optimal prediction model across environments. A technical challenge is the computation cost of directly using existing IRM methods suitable for loan default prediction, such as meta-IRM, whose cost grows quadratically with the number of environments. To reduce the training complexity, we propose a light meta-IRM method that reduces the time complexity to linear through environment sampling and loss replaying strategies. We apply the light meta-IRM to train a representative loan default prediction model and conduct both online and offline evaluations on a large auto loan platform. Extensive experimental results validate the advantage of the proposed light meta-IRM w.r.t. overall accuracy, minimax fairness, and training cost.
Date of Conference: 03-07 April 2023
Date Added to IEEE Xplore: 26 July 2023
Conference Location: Anaheim, CA, USA

I. Introduction

Loan default prediction [1], [2], which estimates whether a loan will default, plays an important role for financial institutions and the banking industry. In the current financial system, human approvers are overwhelmed by the massive volume of loan applications, which lengthens the average waiting time [1]. To accelerate the review procedure, machine learning techniques [3]–[6] are increasingly adopted to share the workload; they predict loan defaults from the applicant's profile, such as occupation, income, and credit records. Explainable machine learning models such as GBDT [4] and Logistic Regression [3] are the standard choices for loan default prediction because practice demands trustworthiness, a requirement reinforced by the growing number of regulations on financial algorithms introduced by different countries [7].
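To make the notion of minimax fairness from the abstract concrete, the following is a minimal sketch that compares a model's overall error with its error on the worst-performing environment. It is an illustration only, not the authors' evaluation protocol: the misclassification-rate metric, the function names, the environment labels "A" and "B", and the toy labels and predictions are all assumptions introduced here.

```python
from collections import defaultdict


def error_by_environment(y_true, y_pred, env_ids):
    """Return {environment: misclassification rate} for binary default labels."""
    wrong = defaultdict(int)
    total = defaultdict(int)
    for y, p, e in zip(y_true, y_pred, env_ids):
        total[e] += 1
        wrong[e] += int(y != p)
    return {e: wrong[e] / total[e] for e in total}


def overall_error(y_true, y_pred):
    """Misclassification rate over the whole population."""
    return sum(int(y != p) for y, p in zip(y_true, y_pred)) / len(y_true)


def worst_environment_error(y_true, y_pred, env_ids):
    """Return (environment, error) for the environment with the highest error."""
    per_env = error_by_environment(y_true, y_pred, env_ids)
    worst = max(per_env, key=per_env.get)
    return worst, per_env[worst]


# Hypothetical toy data: environment "A" is well represented (8 samples),
# environment "B" is underrepresented (2 samples).
y_true = [0, 0, 1, 0, 1, 0, 0, 1, 1, 0]
y_pred = [0, 0, 1, 0, 1, 0, 0, 1, 0, 1]
env_ids = ["A"] * 8 + ["B"] * 2

print("overall error:", overall_error(y_true, y_pred))
print("per-environment error:", error_by_environment(y_true, y_pred, env_ids))
print("worst environment:", worst_environment_error(y_true, y_pred, env_ids))
```

In this toy case the overall error looks moderate because the majority environment dominates the average, while every sample in the small environment is misclassified. That gap between the population-level figure and the worst-environment figure is the kind of failure the abstract attributes to optimizing a single risk over the whole population.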
