In this post I will use linear regression to model how interest rates are determined on peer-to-peer loans provided by the Lending Club. Like other peer-to-peer services, the Lending Club aims to directly connect producers and consumers, or in this case borrowers and lenders, by cutting out the middleman. Borrowers apply for loans online and provide details about the desired loan as well as their financial status (such as their FICO score). Lenders use the information provided to choose which loans to invest in. The Lending Club, finally, uses a proprietary algorithm to determine the interest charged on an applicant …
Articles with the report tag
Categorisation of inertial activity data
The ubiquity of mobile phones equipped with a wide range of sensors presents interesting opportunities for data mining applications. In this report we aim to find out whether data from accelerometers and gyroscopes can be used to identify physical activities performed by subjects wearing mobile phones on their waist.
Methods
The data used in this analysis is based on the “Human activity recognition using smartphones” data set available from the UCI Machine Learning Repository [1]. A preprocessed version was downloaded from the Data Analysis online course [2]. The set contains data derived from 3-axial linear acceleration and 3-axial angular velocity …
Titanic survival prediction
In this report I will provide an overview of my solution to Kaggle’s “Titanic” competition. The aim of this competition is to predict the survival of passengers aboard the Titanic using information such as a passenger’s gender, age or socio-economic status. I will explain my data munging process, explore the available predictor variables, and compare the prediction performance of a number of different classification algorithms. All analysis presented here was performed in R. The corresponding source code is available on GitHub.
Data munging
The data set provided by Kaggle contains 1309 records of passengers aboard the …