CA2 – Airbnb’s Analysis of Variance

Take the dataset from Airbnb’s New User Bookings scenario on Kaggle:

https://www.kaggle.com/c/airbnb-recruiting-new-user-bookings

Analyse the dataset to see whether it is possible to create a model that describes where people are likely to travel for their first trip on Airbnb.

Your model can be one of the many algorithms covered on the course, or any model you’ve come across previously.

You are required to calculate the goodness of fit of your model to your data, outlining the significant independent variables, if any.
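
As a starting point, one possible way to flag significant categorical predictors is a chi-square test of independence against the destination. The sketch below is only an illustration: the helper name significant_categoricals and the column names (signup_method, language, country_destination) are assumptions, so check them against the files you download, and the toy rows are made up rather than taken from the dataset.

```python
import pandas as pd
from scipy.stats import chi2_contingency


def significant_categoricals(df, target, predictors, alpha=0.05):
    """Return {predictor: p_value} for predictors associated with the target.

    Hypothetical helper: runs a chi-square test of independence on the
    contingency table of each categorical predictor against the target.
    """
    results = {}
    for col in predictors:
        table = pd.crosstab(df[col], df[target])
        chi2, p_value, dof, expected = chi2_contingency(table)
        if p_value < alpha:
            results[col] = p_value
    return results


# Made-up toy rows; column names are assumed, not confirmed by the brief.
toy = pd.DataFrame({
    "signup_method": ["basic", "facebook"] * 50,       # tracks the target exactly
    "language": ["en"] * 60 + ["fr"] * 40,             # independent of the target
    "country_destination": ["US", "FR"] * 50,
})

flagged = significant_categoricals(
    toy, "country_destination", ["signup_method", "language"]
)
print(flagged)
```

A test like this only measures association one variable at a time; a fitted model (for example a multinomial logistic regression) would still need its own goodness-of-fit measure.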

You are also required to verify whether the data is normally distributed, and whether it should be treated parametrically or non-parametrically.
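
One common way to check normality in Python is the Shapiro-Wilk test from SciPy. The sketch below assumes a numeric column has already been extracted as an array; looks_normal is a hypothetical helper name, and the two sample arrays are synthetic, not drawn from the Airbnb data. SciPy's documentation notes that the p-value may be inaccurate for samples above 5000 values, so testing a random subsample is a reasonable precaution on a large file.

```python
import numpy as np
from scipy import stats


def looks_normal(values, alpha=0.05):
    """Shapiro-Wilk check: return (is_normal, p_value) after dropping NaNs.

    Hypothetical helper. A p-value below alpha rejects normality, which
    points towards non-parametric methods for that variable.
    """
    clean = np.asarray(values, dtype=float)
    clean = clean[~np.isnan(clean)]
    statistic, p_value = stats.shapiro(clean)
    return bool(p_value >= alpha), float(p_value)


# Synthetic examples: ideal normal quantiles versus a strongly skewed series.
normal_like = stats.norm.ppf((np.arange(1, 201) - 0.5) / 200)
skewed = np.exp(np.linspace(0, 5, 200))

print(looks_normal(normal_like))
print(looks_normal(skewed))
```

A histogram or Q-Q plot alongside the test is worth including in the paper, since a formal test on a large sample can reject normality for deviations that are too small to matter.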

Please also examine the dataset with an emphasis on data quality, and mention some data scrubbing techniques that could be used to make the data easier to work with going forward.
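
To illustrate the kind of scrubbing meant here, the sketch below uses pandas to turn placeholder categories into missing values, blank out implausible ages, and parse date strings. Everything specific in it is an assumption to verify against the real files: the column names (gender, age, date_account_created), the "-unknown-" placeholder, the age bounds, and the helper name scrub_users. The toy rows are invented.

```python
import pandas as pd


def scrub_users(df, min_age=18, max_age=100):
    """Return a cleaned copy of a users table (hypothetical helper).

    Assumed columns: gender, age, date_account_created.
    """
    out = df.copy()
    # Treat the assumed placeholder category as missing.
    out["gender"] = out["gender"].mask(out["gender"] == "-unknown-")
    # Keep ages inside the chosen bounds; everything else becomes NaN.
    out["age"] = out["age"].where(out["age"].between(min_age, max_age))
    # Parse dates; unparseable strings become NaT instead of raising.
    out["date_account_created"] = pd.to_datetime(
        out["date_account_created"], errors="coerce"
    )
    return out


# Invented toy rows showing each kind of problem.
raw = pd.DataFrame({
    "gender": ["MALE", "-unknown-", "FEMALE", "OTHER"],
    "age": [25, 2014, None, 5],
    "date_account_created": ["2014-01-01", "not a date", "2013-06-30", "2012-12-12"],
})

cleaned = scrub_users(raw)
print(cleaned)
```

Marking bad values as missing, rather than dropping rows, keeps the decision about imputation or exclusion separate and easy to document in the paper.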

Your solution will include a program, runnable in R or Python, as well as a Word document outlining your research.

Your document should be written as a formal paper.

You are also required to create a blog post on your <name>.dbsdataprojects.com blog with your research – including your runnable program script.
