Project 3: Lending Club Loan Status

You are provided with historical loan data issued by Lending Club. The goal is to build a model to predict the chance of default for a loan.

Source

There are two sets of Lending Club data on Kaggle.

We will use data from the 2nd site: accepted_2007_to_2018Q2.csv.

The dataset has over 100 features, but some of them have too many NA values, and some are not supposed to be available at the beginning of the loan. For example, it is not meaningful to predict the status of a loan if we already know the date or amount of its last payment. So we focus on the following features (5 features in each row and 30 features in total, including the response ‘loan_status’):

‘addr_state’, ‘annual_inc’, ‘application_type’, ‘dti’, ‘earliest_cr_line’,
‘emp_length’, ‘emp_title’, ‘fico_range_high’, ‘fico_range_low’, ‘grade’,
‘home_ownership’, ‘initial_list_status’, ‘installment’, ‘int_rate’, ‘id’,
‘loan_amnt’, ‘loan_status’, ‘mort_acc’, ‘open_acc’, ‘pub_rec’,
‘pub_rec_bankruptcies’, ‘purpose’, ‘revol_bal’, ‘revol_util’, ‘sub_grade’,
‘term’, ‘title’, ‘total_acc’, ‘verification_status’, ‘zip_code’

Students do not need to download the data from Kaggle. A copy of the cleaned data (with 30 features) is available on Piazza: loan_stat542.csv
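As a rough illustration, the cleaned file could be read and checked against the 30 feature names listed above. This is only a sketch: it assumes loan_stat542.csv has a header row using exactly those names, and `EXPECTED_COLUMNS` and `read_loans` are hypothetical helpers, not part of the project materials. A tiny made-up sample stands in for the real file.

```python
import csv
import io

# The 30 feature names listed in the project description.
EXPECTED_COLUMNS = [
    "addr_state", "annual_inc", "application_type", "dti", "earliest_cr_line",
    "emp_length", "emp_title", "fico_range_high", "fico_range_low", "grade",
    "home_ownership", "initial_list_status", "installment", "int_rate", "id",
    "loan_amnt", "loan_status", "mort_acc", "open_acc", "pub_rec",
    "pub_rec_bankruptcies", "purpose", "revol_bal", "revol_util", "sub_grade",
    "term", "title", "total_acc", "verification_status", "zip_code",
]


def read_loans(handle):
    """Read loan records from an open CSV file, keeping the 30 expected columns.

    Raises ValueError if the header lacks any expected column
    (assumption: the file has a header row with these exact names).
    """
    reader = csv.DictReader(handle)
    missing = sorted(set(EXPECTED_COLUMNS) - set(reader.fieldnames or []))
    if missing:
        raise ValueError(f"missing columns: {missing}")
    return [{name: row[name] for name in EXPECTED_COLUMNS} for row in reader]


# Made-up one-row sample standing in for loan_stat542.csv.
sample = io.StringIO()
writer = csv.DictWriter(sample, fieldnames=EXPECTED_COLUMNS)
writer.writeheader()
row = {name: "" for name in EXPECTED_COLUMNS}
row.update({"id": "1", "loan_status": "Fully Paid"})
writer.writerow(row)
sample.seek(0)

loans = read_loans(sample)
```

With the real file, the same function would be called as `with open("loan_stat542.csv", newline="") as f: loans = read_loans(f)`. All values come back as strings, so numeric columns would still need converting.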

[What do the different Note statuses mean?] After a loan is issued by Lending Club, the loan becomes “Current”.

  • The ideal scenario: Lending Club keeps receiving the monthly installment payment from the borrower, and eventually the loan is paid off and the loan status becomes “Fully Paid”.
  • Signs of trouble: the loan is past due by up to 15 days (In Grace Period), late for 15-30 days, or late for 31-120 days.
  • Once a loan is past due for more than 120 days, its status will be “Default” or “Charged Off”. Lending Club explains the difference between these two [Here]. For this project, we will treat them the same.

We focus on closed loans, i.e., loan status being one of the following:

  • Class 1 (bad loans): ‘Default’ or ‘Charged Off’;
  • Class 0 (good loans): ‘Fully Paid’.
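The class definition above can be sketched as a small labeling step. This is an illustrative sketch rather than the required implementation: `loan_class` and `closed_loans` are hypothetical helper names, and the status strings are assumed to appear in the data exactly as spelled in the list above.

```python
# Status strings as given in the project description (assumed to match the data exactly).
BAD_STATUSES = {"Default", "Charged Off"}
GOOD_STATUSES = {"Fully Paid"}


def loan_class(status):
    """Return 1 for a bad loan, 0 for a good loan, None for a loan that is not closed."""
    status = status.strip()
    if status in BAD_STATUSES:
        return 1
    if status in GOOD_STATUSES:
        return 0
    return None  # e.g. "Current" or a late status: not a closed loan


def closed_loans(rows):
    """Keep only closed loans and pair each record with its 0/1 label."""
    labeled = []
    for row in rows:
        label = loan_class(row["loan_status"])
        if label is not None:
            labeled.append((row, label))
    return labeled


# Made-up records for illustration only.
records = [
    {"id": "1", "loan_status": "Fully Paid"},
    {"id": "2", "loan_status": "Charged Off"},
    {"id": "3", "loan_status": "Current"},
    {"id": "4", "loan_status": "Default"},
]
labeled = closed_loans(records)
labels = [label for _, label in labeled]
```

Loans that are still open (for example “Current” or any late status) are dropped rather than assigned to a class, which matches the focus on closed loans.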
