5 November 2019
The successful applicant will work on research using transactions data for credit risk modelling with the ID Co. and be supervised by Jonathan Crook and Galina Andreeva.
Traditional credit scoring models use application form and behavioural variables, with credit reference agency variables giving additional information on accounts held with other lenders. However, these predictors are measured monthly at most; the application variables (for example income and address) are not updated; and, crucially, none of them gives an accurate, direct indication of the account holder’s ability to repay any loan granted. Essentially, these variables do not indicate an account holder’s cash flow.
On the other hand, account-level transactions data provides daily information on all receipts and expenditures for an account holder, for each account for which data is obtained. This information allows a very accurate daily measure of income (stable and volatile) and a fine classification of expenditures by service/product type and by merchant. Once expenditures have been categorised, and income has been aggregated across sources and classified into stable and volatile components, a full cash flow analysis for each account holder may be obtained on a daily basis. When used as covariates in a probability of default (PD) model, these cash flow measures are expected to predict PD for each account holder much more accurately than current models.
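The aggregation steps described above can be illustrated with a minimal sketch. It is not the project's methodology: the sample transactions, the function names and the rule for separating stable from volatile income (a source counts as stable if it pays in at least three distinct months and its monthly totals have a coefficient of variation of at most 0.1) are all assumptions made for illustration.

```python
from collections import defaultdict
from datetime import date
from statistics import mean, pstdev

# Hypothetical transactions: (date, source or merchant, amount).
# Positive amounts are receipts, negative amounts are expenditures.
transactions = [
    (date(2019, 1, 28), "employer", 2000.0),
    (date(2019, 2, 28), "employer", 2000.0),
    (date(2019, 3, 28), "employer", 2000.0),
    (date(2019, 1, 15), "freelance", 300.0),
    (date(2019, 3, 2), "freelance", 900.0),
    (date(2019, 1, 28), "supermarket", -60.0),
    (date(2019, 2, 3), "landlord", -800.0),
]


def daily_net_cash_flow(txns):
    """Sum receipts and expenditures for each calendar day."""
    flow = defaultdict(float)
    for day, _, amount in txns:
        flow[day] += amount
    return dict(flow)


def classify_income(txns, max_cv=0.1, min_months=3):
    """Label each income source as 'stable' or 'volatile'.

    Assumed rule (illustrative only): a source is stable if it pays in at
    least `min_months` distinct months and the coefficient of variation of
    its monthly totals is at most `max_cv`.
    """
    monthly = defaultdict(lambda: defaultdict(float))
    for day, source, amount in txns:
        if amount > 0:  # receipts only
            monthly[source][(day.year, day.month)] += amount

    labels = {}
    for source, by_month in monthly.items():
        totals = list(by_month.values())
        cv = pstdev(totals) / mean(totals)  # mean > 0 because receipts only
        is_stable = len(totals) >= min_months and cv <= max_cv
        labels[source] = "stable" if is_stable else "volatile"
    return labels


flow = daily_net_cash_flow(transactions)
labels = classify_income(transactions)
```

Here `flow` maps each day to its net cash flow and `labels` maps each income source to its assumed class; quantities of this kind could then be summarised per account holder and supplied as covariates to a PD model.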
This project will develop a methodology for incorporating a novel type of digital data (financial transactions) into credit risk and affordability models. Transactional data provides more accurate and up-to-date information about the financial status and behaviour of the borrower than traditional data, which is static and often outdated. Despite the great potential of transactional data, its current use is limited by technical problems, which this project sets out to overcome. The project will experiment with innovative categorisation and aggregation algorithms. It will also estimate application and behavioural credit risk models using a range of advanced statistical and machine-learning algorithms.