Articles with the report tag

Oct 07 · Reports

Interest rates on P2P loans

In this post I use linear regression to model how the interest rate is determined on peer-to-peer loans provided by the Lending Club. Like other peer-to-peer services, the Lending Club aims to connect producers and consumers directly, in this case borrowers and lenders, by cutting out the middleman. Borrowers apply for loans online and provide details about the desired loan as well as their financial status (such as their FICO score). Lenders use the information provided to choose which loans to invest in. Finally, the Lending Club uses a proprietary algorithm to determine the interest charged on an applicant …

Nov 06 · Reports

Categorisation of inertial activity data

The ubiquity of mobile phones equipped with a wide range of sensors presents interesting opportunities for data mining applications. In this report we aim to find out whether data from accelerometers and gyroscopes can be used to identify physical activities performed by subjects wearing mobile phones on their waist.


Methods

The data used in this analysis is based on the “Human activity recognition using smartphones” data set available from the UCI Machine Learning Repository [1]. A preprocessed version was downloaded from the Data Analysis online course [2]. The set contains data derived from 3-axial linear acceleration and 3-axial angular velocity …

Oct 23 · Reports

Titanic survival prediction

In this report I will provide an overview of my solution to Kaggle’s “Titanic” competition. The aim of this competition is to predict the survival of passengers aboard the Titanic using information such as a passenger’s gender, age or socio-economic status. I will explain my data munging process, explore the available predictor variables, and compare a number of different classification algorithms in terms of their prediction performance. All analysis presented here was performed in R. The corresponding source code is available on GitHub.


Data munging

The data set provided by Kaggle contains 1309 records of passengers aboard the …