Developing a scorecard is a collective process. Within an organisation, it takes the collaborative effort of different teams, such as analytics, sales, IT and legal, to build and deploy a scorecard successfully. Scorecard development takes place in the following five stages:
- Scorecard Planning & Segmentation
The planning stage of scorecard development is driven by business needs and requirements. The role of the scorecard should be consistent with organisational goals such as reducing bad debt, increasing the approval rate or improving profitability. The next step is to select a segmentation scheme for the portfolio, which helps to assess risk for specific populations within it. The segmentation scheme should be in line with the target market segments of the business. While product managers can help decide on the market segments, it is good practice to check that the segmentation does not contradict existing laws and regulations. Segmentation can also be done statistically, by forming clusters of accounts with similar default rates.
- Setting up Scorecard Parameters
Before starting work on scorecard development, one must ensure that sufficient and reliable data is available, with at least a minimum number of goods and bads in the population. The quantity of data should be consistent with the concepts of statistical significance and randomness. Accounts that do not follow the same scoring process can be excluded when selecting the development sample. Similarly, if the company has ceased to operate in certain geographic or business areas, those accounts should also be removed from the analysis.
The next step is to select the modelling time-frames (sample window and performance window). The sample window is the time-frame from which accounts are picked to form the modelling population, whereas the performance window is the time-frame over which the performance of those accounts is monitored. A vintage curve is critical in selecting the performance window. Further, roll rate analysis and capture rate analysis are conducted to finalize the bad definition for the model.
- Modelling Data Preparation & Feature Engineering
Once the scorecard parameters are set, a large set of characteristics is generated for each account from the available data. The characteristics may be developed using various data sources such as internal, bureau or vendor data. One should avoid building model characteristics on ambiguous data, such as unverified income taken from an application form. It is also seen that several derived characteristics, such as LTV (loan to value), perform better than the individual characteristics they are built from (loan amount and asset value). A modelling exercise can start with about 1,500 candidate characteristics. These are then filtered using information value (IV) and variable clustering, on the basis of predictive power, robustness and business sense, to arrive at a final set of characteristics that will be passed through the model. The characteristics are further transformed into appropriate categories with the help of binning and weight of evidence (WOE) transformation.
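The WOE transformation and IV filter mentioned above can be illustrated with a minimal sketch. It assumes the common convention WOE = ln(% of goods in the bin / % of bads in the bin) and a target coded 1 for bad and 0 for good; the function name `woe_iv`, the `smoothing` parameter and the toy LTV bins are hypothetical and only there for illustration.

```python
import math
from collections import defaultdict


def woe_iv(bins, targets, smoothing=0.5):
    """Return ({bin_label: woe}, iv) for one binned characteristic.

    bins      : bin label of each account (e.g. "low_ltv", "high_ltv")
    targets   : 1 for a bad account, 0 for a good account
    smoothing : small count added to every cell so that an empty cell
                does not produce log(0); an assumption of this sketch
    """
    goods = defaultdict(float)
    bads = defaultdict(float)
    for label, target in zip(bins, targets):
        if target == 1:
            bads[label] += 1
        else:
            goods[label] += 1

    labels = sorted(set(goods) | set(bads))
    total_good = sum(goods.values()) + smoothing * len(labels)
    total_bad = sum(bads.values()) + smoothing * len(labels)

    woe = {}
    iv = 0.0
    for label in labels:
        pct_good = (goods[label] + smoothing) / total_good
        pct_bad = (bads[label] + smoothing) / total_bad
        woe[label] = math.log(pct_good / pct_bad)
        # Each bin contributes (%good - %bad) * WOE to the information value
        iv += (pct_good - pct_bad) * woe[label]
    return woe, iv


# Toy data: 8 goods / 2 bads in the low-LTV bin, 2 goods / 8 bads in the high-LTV bin
example_bins = ["low_ltv"] * 10 + ["high_ltv"] * 10
example_targets = [0] * 8 + [1] * 2 + [0] * 2 + [1] * 8
example_woe, example_iv = woe_iv(example_bins, example_targets)
```

Under this convention a positive WOE marks a bin with proportionally more goods than bads, and a characteristic with a higher IV separates goods from bads more strongly. The IV cut-off used for filtering is a modelling choice and is not fixed here.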
Random sampling is also performed on the modelling population to separate the development sample from the validation sample. Usually an 80:20 ratio between development sample and validation sample is used for model development.
- Scorecard Model Building
Logistic regression is the industry standard for building a scorecard model. The filtered characteristics, along with a target variable, are passed through the model to arrive at the final scorecard. Logistic regression can be run in three different variable-selection modes: forward selection, backward selection or stepwise. These techniques help to arrive at a good combination of characteristics in the final scorecard. However, model building is an iterative process. One must always check that the sign of the parameter estimate for every characteristic is as expected and that the contribution of variables to the scorecard is not highly skewed. A threshold should also be put on the p-values (or chi-square statistics) so that only variables with decisive predictive power make it into the scorecard. Variables that show high correlation or multicollinearity (for example, a high variance inflation factor, VIF) with other variables should be eliminated. After making these checks, the process should be repeated with the new set of characteristics, and continued until all the checks are met.
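As a rough sketch of the fit-then-check loop described above, the block below fits a logistic regression on a WOE-transformed characteristic and flags coefficients with an unexpected sign. It makes several assumptions: plain batch gradient descent stands in for the maximum-likelihood routine of a statistics package, the target is coded 1 for bad, and WOE is defined as ln(% good / % bad), so every coefficient is expected to be negative. The names `fit_logistic` and `wrong_sign` are hypothetical, and variable selection, p-value thresholds and VIF checks are not shown.

```python
import math


def fit_logistic(X, y, lr=0.1, epochs=2000):
    """Fit [intercept, coef_1, ..., coef_k] by batch gradient descent on the log-loss."""
    n, k = len(X), len(X[0])
    w = [0.0] * (k + 1)  # w[0] is the intercept
    for _ in range(epochs):
        grad = [0.0] * (k + 1)
        for row, target in zip(X, y):
            z = w[0] + sum(wj * xj for wj, xj in zip(w[1:], row))
            p = 1.0 / (1.0 + math.exp(-z))  # predicted probability of bad
            err = p - target
            grad[0] += err
            for j, xj in enumerate(row):
                grad[j + 1] += err * xj
        # Step against the average gradient
        w = [wj - lr * gj / n for wj, gj in zip(w, grad)]
    return w


def wrong_sign(coefs, names, expected_sign=-1):
    """Return the names of characteristics whose coefficient does not have the expected sign."""
    return [name for name, c in zip(names, coefs) if c * expected_sign <= 0]


# Toy data: one WOE-transformed characteristic with two bins.
# Bin with WOE = ln(4): 8 goods, 2 bads. Bin with WOE = -ln(4): 2 goods, 8 bads.
woe_value = math.log(4)
X = [[woe_value]] * 10 + [[-woe_value]] * 10
y = [0] * 8 + [1] * 2 + [0] * 2 + [1] * 8

weights = fit_logistic(X, y)
flagged = wrong_sign(weights[1:], ["ltv_woe"])  # empty list when all signs are as expected
```

Any characteristic returned by `wrong_sign` would be a candidate for review or removal before the next iteration.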
The predictive power of the model is judged using the Gini coefficient, whereas the KS (Kolmogorov-Smirnov) statistic indicates the separation between goods and bads that the model can achieve. The robustness of the model is then tested by scoring the validation sample with the developed model. The model should remain consistent in terms of PSI (population stability index), CSI (characteristic stability index), Gini and KS when this test is performed.
- Implementation
Once the final scorecard is ready, the probability weights and characteristics can be sent to the IT or implementation team. The team then integrates the scorecard at the right point in the sequence to complete the decisioning process.
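To make the hand-off concrete, here is a minimal sketch of how an implementation team might apply a scorecard to one applicant. Everything in it is an assumption made for illustration: the `SCORECARD` layout, the characteristic names, the bin boundaries, the weights (taken here as log-odds-of-bad contributions per bin) and the cut-off are hypothetical and do not come from a real model.

```python
import math

# Hypothetical scorecard: an intercept plus, per characteristic, a list of
# (lower bound inclusive, upper bound exclusive, weight) bins.
SCORECARD = {
    "intercept": -2.0,
    "characteristics": {
        "ltv": [
            (0.0, 0.6, -0.5),
            (0.6, 0.9, 0.0),
            (0.9, float("inf"), 0.8),
        ],
        "months_on_book": [
            (0, 12, 0.6),
            (12, float("inf"), -0.4),
        ],
    },
}


def bin_weight(bins, value):
    """Look up the weight of the bin that contains the value."""
    for lower, upper, weight in bins:
        if lower <= value < upper:
            return weight
    raise ValueError(f"value {value!r} falls outside all bins")


def probability_of_bad(scorecard, applicant):
    """Sum the intercept and bin weights, then map the log-odds to a probability."""
    log_odds = scorecard["intercept"]
    for name, bins in scorecard["characteristics"].items():
        log_odds += bin_weight(bins, applicant[name])
    return 1.0 / (1.0 + math.exp(-log_odds))


def decide(scorecard, applicant, cutoff=0.2):
    """Toy decision rule: approve below the cut-off, otherwise refer for review."""
    return "approve" if probability_of_bad(scorecard, applicant) < cutoff else "refer"


low_risk = {"ltv": 0.5, "months_on_book": 24}
high_risk = {"ltv": 0.95, "months_on_book": 3}
decisions = [decide(SCORECARD, low_risk), decide(SCORECARD, high_risk)]
```

Keeping the scorecard as data rather than hard-coded logic means a redeveloped model can be deployed by replacing the table, without changing the scoring code.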