Problem 3 (45 marks)
Dataset: credit.csv. The description of the variables is in an Excel file named
This data set consists of genuine credit records from a South German bank. The aim would generally
be to predict which customers will repay the loan in full and which of them will not. There are 1000
records and all amounts are in Deutschmarks. Answer the following using suitable approaches,
whether descriptive, graphical or inferential, and a suitable package, e.g. StatTools. Justify your
answers in the main text and include all workings as an appendix.
a) Wherever possible and meaningful, provide a brief analysis of each variable, including its
distribution, outliers, etc.
b) Do there seem to be differences in age, length of loan, or amount of loan between those who repaid
their loans and those who defaulted?
c) Explore and describe the association of each variable with the credit status.
d) Does the Length of the loan vary with the use of the loan?
e) Determine relationships, if any, between Age, Length of loan and Amount of loan.
f) Construct a 3-way contingency table from the factors credit, record and use, and analyse it. You
must state your final conclusions in detail.
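
As a rough illustration of part f), the sketch below counts records for every combination of three factors, slices the result into two-way tables at each level of the third factor, and computes a Pearson chi-square statistic for each slice. It is only one possible approach, written in plain Python rather than StatTools. The field names "credit", "record" and "use" are taken from the wording of part f), but the actual column headers in credit.csv are not known here, so treat them as assumptions. The category levels and the toy rows are invented purely for demonstration and are not drawn from the dataset.

```python
from collections import Counter


def three_way_counts(records, f1, f2, f3):
    """Count records for each combination of three categorical fields."""
    return Counter((r[f1], r[f2], r[f3]) for r in records)


def partial_table(counts, level):
    """Two-way table (rows: first factor, columns: second factor)
    restricted to one level of the third factor."""
    rows = sorted({key[0] for key in counts})
    cols = sorted({key[1] for key in counts})
    return [[counts.get((r, c, level), 0) for c in cols] for r in rows]


def chi_square(table):
    """Pearson chi-square statistic and degrees of freedom for a two-way table."""
    row_tot = [sum(row) for row in table]
    col_tot = [sum(col) for col in zip(*table)]
    if 0 in row_tot or 0 in col_tot:
        raise ValueError("every row and column needs at least one observation")
    n = sum(row_tot)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_tot[i] * col_tot[j] / n
            stat += (observed - expected) ** 2 / expected
    dof = (len(table) - 1) * (len(table[0]) - 1)
    return stat, dof


def make_rows(credit, record, use, n):
    """Build n identical toy rows (hypothetical field names and levels)."""
    return [{"credit": credit, "record": record, "use": use}] * n


# Invented toy data, NOT taken from credit.csv.
toy = (
    make_rows("good", "clean", "car", 3) + make_rows("good", "late", "car", 1)
    + make_rows("bad", "clean", "car", 1) + make_rows("bad", "late", "car", 3)
    + make_rows("good", "clean", "other", 2) + make_rows("good", "late", "other", 2)
    + make_rows("bad", "clean", "other", 2) + make_rows("bad", "late", "other", 2)
)

counts = three_way_counts(toy, "credit", "record", "use")
for level in sorted({key[2] for key in counts}):
    table = partial_table(counts, level)
    stat, dof = chi_square(table)
    print(level, table, round(stat, 3), dof)
```

With the real data, each statistic would be compared against the chi-square critical value for its degrees of freedom (3.84 at the 5% level for 1 degree of freedom). Whether the credit/record association differs across levels of use is exactly what a written conclusion for part f) would need to discuss. Small expected counts in any cell would make the chi-square approximation unreliable.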