Credit Risk Modeling Architecture for Banking System
- The Importance of Data in Credit Risk Modeling
- Credit Risk Components
- Credit Risk Model Architecture
The importance of credit risk modeling is hard to overestimate. In today's fast-changing, globalised world it is essential to have a far-reaching risk assessment horizon. The banking system is now facing new challenges in risk modeling, and new machine learning techniques should be adopted to improve credit risk models.
In this blog post we're going to decompose credit risk into its components. This will bring us to a multi-level credit risk architecture that can be used in software. Then we'll examine various challenges that you might face during development of credit risk modeling software, first of all: data quality requirements.
The Importance of Data in Credit Risk Modeling
Credit risk models are very important these days because they increasingly steer the strategic decisions of banks. The outcomes of credit risk models are used directly by banks and financial institutions to decide upon their buffer and equity capital.
In any financial system we want to make sure that the bank is well capitalized, because an undercapitalized bank puts depositors' savings at risk. In other words, the minimum equity or buffer that a bank holds is directly determined by credit risk models, as well as by:
– market risk models;
– operational risk models;
– fraud risk models;
– insurance risk models, etc.
All of these types of models are quantified directly by means of big data and analytics. So while building software we had better make sure that both the data and the analytical models are of good quality.
It must be mentioned that these analytical models are increasingly subject to regulation, for example Basel II, Basel III and Solvency II. In essence, all these regulations dictate what the inputs of the analytical models are and how the outputs should be defined.
If there are errors in the analytical models, they will directly affect profitability, solvency, shareholder value, the macroeconomy and society as a whole.
Credit Risk Components
One such important model that keeps the economy out of trouble is the credit risk model. It consists of various components, and three of them play a crucial role:
- Probability of default (PD) (decimal, between 0 and 1): the probability that a counterparty defaults over a one-year period, as defined by regulation worldwide (introduced by the Basel Committee).
The main challenge here is how to map credit scores, which can be any number, even a negative one, to PDs.
- Loss given default (LGD) (decimal): the ratio of the loss on an exposure, due to the default of a counterparty, to the amount outstanding. This parameter should be estimated using analytics.
- Exposure at default (EAD): measured in currency terms, it represents the amount outstanding.
Usually this parameter does not need to be estimated: we can simply look at the outstanding amount of a mortgage loan or other instalment loan.
- Expected loss (EL) = PD × LGD × EAD
- Unexpected loss (UL) = f(PD, LGD, EAD)
These are the three key parameters, and they are combined to quantify both expected and unexpected loss. Any change in the PD and LGD estimates propagates directly into the expected and unexpected loss, and hence into the capital requirement.
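The expected-loss formula above can be sketched in a few lines of code. The loan figures below are made up purely for illustration:

```python
# Expected loss for a small illustrative portfolio.
# Each loan is a (PD, LGD, EAD) tuple; the values are hypothetical.
loans = [
    (0.02, 0.45, 100_000),  # e.g. a mortgage: low PD, partial recovery
    (0.10, 0.80, 20_000),   # e.g. an unsecured loan: higher PD and LGD
]

# EL = PD x LGD x EAD, summed over the portfolio
expected_loss = sum(pd * lgd * ead for pd, lgd, ead in loans)
print(f"Portfolio expected loss: {expected_loss:,.2f}")
```

Unexpected loss has no such simple product form; it depends on the distribution of losses around this expectation, which is why it is written as a function f(PD, LGD, EAD).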
Credit Risk Model Architecture
Those three risk components serve as the main inputs to the modeling of credit risk. Now let's examine a basic three-layer credit risk model architecture.
The first layer is the preparation of data. Here we're talking about data collected internally by the bank or financial institution, plus external data from credit bureaus and companies such as Equifax, Experian, Vida, etc.
Plus, we'll add expert judgement to these data in order to steer the analytical model in the right direction.
The next step is the implementation of data pre-processing algorithms: outlier detection, outlier treatment, missing-value treatment, categorization, weight-of-evidence scoring, information filtering, etc.
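Weight-of-evidence (WoE) scoring, mentioned above, replaces a category with the log-ratio of its share of non-defaulters to its share of defaulters. A minimal sketch, using a made-up sample of applicants:

```python
import math
from collections import Counter

# Toy data: (category, defaulted?) pairs — values are hypothetical.
observations = [
    ("owner", 0), ("owner", 0), ("owner", 1),
    ("renter", 1), ("renter", 1), ("renter", 0), ("renter", 0),
]

goods = Counter(c for c, d in observations if d == 0)
bads = Counter(c for c, d in observations if d == 1)
total_goods = sum(goods.values())
total_bads = sum(bads.values())

def woe(category):
    # WoE = ln( (share of goods in category) / (share of bads in category) )
    return math.log((goods[category] / total_goods) /
                    (bads[category] / total_bads))

for cat in ("owner", "renter"):
    print(cat, round(woe(cat), 3))
```

A positive WoE means the category is over-represented among good payers; a negative WoE means it is over-represented among defaulters. In practice zero counts must also be handled (e.g. by smoothing), which this sketch omits.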
Once the data is pre-processed, we're going to feed it to the model. In a PD context this will be an application scorecard or a behavioral scorecard.
- An application scorecard is used to score new credit applications, to decide whether to grant the mortgage or the credit card (yes/no).
- A behavioral scorecard is constructed for ongoing monitoring. Once the customer has entered your portfolio, you'll monitor their repayment behaviour by means of the behavioral scorecard.
Typically, both application and behavioral scorecards are developed using logistic regression. Logistic regression (the logit model) is one of the most widely used scorecard-building techniques in the industry.
The scorecard gives us a score that allows us to rank-order obligors in terms of default risk. This continuous score is used to discriminate risky obligors from non-risky obligors.
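To make the logit model concrete, here is a minimal scoring sketch. The intercept, coefficient values, and feature names are invented for illustration; in a real scorecard they would come from fitting the regression on historical default data:

```python
import math

# Hypothetical coefficients of a fitted logit PD model (illustrative only).
INTERCEPT = -3.0
COEFFS = {"debt_ratio": 2.5, "late_payments": 0.9}

def predict_pd(applicant):
    """Probability of default via the logistic (sigmoid) function."""
    z = INTERCEPT + sum(COEFFS[k] * applicant[k] for k in COEFFS)
    return 1.0 / (1.0 + math.exp(-z))

low_risk = predict_pd({"debt_ratio": 0.1, "late_payments": 0})
high_risk = predict_pd({"debt_ratio": 0.9, "late_payments": 3})
print(round(low_risk, 3), round(high_risk, 3))
```

Because the sigmoid is monotone in the linear score z, ranking obligors by z and ranking them by predicted PD give the same order, which is exactly the rank-ordering property the scorecard relies on.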
Definition of Ratings and Calibration
At this level of the architecture, the scores are mapped to risk ratings (default ratings), and each rating is accompanied by a probability of default (PD). This is where we define the ratings and calibrate the risk measures.
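The mapping from calibrated PDs to rating grades can be sketched as a simple lookup against cutoff values. The rating labels and PD thresholds below are hypothetical, not mandated by any regulation:

```python
# Illustrative rating scale: (upper PD bound, rating grade).
# Thresholds and labels are made up for demonstration.
RATING_BANDS = [
    (0.0025, "AAA"),
    (0.01,   "A"),
    (0.05,   "BBB"),
    (0.20,   "BB"),
    (1.00,   "B"),
]

def assign_rating(pd):
    """Return the first rating grade whose PD bound covers this obligor."""
    for threshold, rating in RATING_BANDS:
        if pd <= threshold:
            return rating
    return "D"  # fallback bucket

print(assign_rating(0.004))
```

In practice each rating bucket is then calibrated: the long-run average default rate observed within the bucket becomes the PD assigned to that rating.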
This multilevel architectural framework is typically adopted in PD modeling, as well as in LGD and EAD modeling. The framework facilitates both the software development process and the modeling exercise.
Moreover, this framework can also be used for validation of the model. Such back-testing can be situated at the data-processing, model-creation and calibration levels of the framework.
In addition, back-testing within the framework can serve benchmarking and stress testing of the credit risk model: you can stress-test the data at the first level, stress-test the model at the second level, and test the ratings and calibration at the final level.
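A first-level (data) stress test can be as simple as scaling the input PDs and recomputing portfolio loss. The loan data and the stress multiplier below are hypothetical:

```python
# Minimal data-level stress-test sketch.
# Loans are (PD, LGD, EAD) tuples with made-up values.
loans = [
    (0.02, 0.45, 100_000),
    (0.10, 0.80, 20_000),
]

def portfolio_el(loans, pd_stress=1.0):
    """Expected loss with PDs scaled by a stress multiplier (capped at 1)."""
    return sum(min(pd * pd_stress, 1.0) * lgd * ead
               for pd, lgd, ead in loans)

base = portfolio_el(loans)
stressed = portfolio_el(loans, pd_stress=2.0)  # e.g. a downturn scenario
print(f"base EL = {base:,.0f}, stressed EL = {stressed:,.0f}")
```

Regulatory stress tests are of course far richer, shifting macroeconomic drivers that feed PD, LGD and EAD jointly, but the principle of perturbing inputs at a given layer and observing the loss impact is the same.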