This repo contains an analysis of Airbnb's Seattle dataset for 2016-2017.
This project was completed as part of the Udacity Data Scientist Nanodegree. The datasets were collected from the Kaggle repository. There are three datasets, called reviews, listings and calendar, with data collected from 2016. The aim of this project is to analyse the datasets and answer some questions about them.
This project follows the CRISP-DM process, which involves the following steps:
- Business Understanding
- Data Understanding
- Data Preparation
- Modeling
- Evaluation
- Deployment
You can find the full analysis, steps and findings in the Medium post at the following link:
https://medium.com/@ArwaData/seattle-airbnb-udacity-project-2be5670f5f67
- numpy
- pandas
- matplotlib
- seaborn
Information provided by Airbnb
The dataset that Airbnb provides has three sub-datasets:
- A Calendar dataset where we can see the availability of a certain listing and the corresponding price.
- A Listing dataset with a full description of each listing.
- A Reviews dataset with information about the review from each guest.
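
As a minimal sketch of the data preparation step, the snippet below cleans a calendar-like table so that availability and price can be analysed numerically. It assumes the calendar data has `listing_id`, `date`, `available` and `price` columns, with availability stored as `t`/`f` and prices stored as strings such as `$1,250.00`. These column names and formats are assumptions for illustration, and `clean_price` is a hypothetical helper, not code taken from this project's notebook. A small inline table stands in for the real file.

```python
import pandas as pd


def clean_price(series: pd.Series) -> pd.Series:
    """Convert price strings such as '$1,250.00' to floats; missing values become NaN."""
    stripped = series.str.replace(r"[$,]", "", regex=True)
    return pd.to_numeric(stripped, errors="coerce")


# Toy stand-in for the calendar dataset (column names and formats are assumed).
calendar = pd.DataFrame({
    "listing_id": [1, 1, 2],
    "date": ["2016-01-04", "2016-01-05", "2016-01-04"],
    "available": ["t", "f", "t"],
    "price": ["$85.00", None, "$1,250.00"],
})

calendar["date"] = pd.to_datetime(calendar["date"])      # parse date strings
calendar["available"] = calendar["available"].eq("t")    # 't'/'f' -> True/False
calendar["price"] = clean_price(calendar["price"])       # '$85.00' -> 85.0

print(calendar)
```

With the real data, the inline table would be replaced by a `pd.read_csv(...)` call on the downloaded calendar file.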
- What are the factors that have the most effect on the price?
- What is the best month based on availability?
- What is the best month based on the price?
- What is the best neighborhood based on availability?
- What is the best neighborhood based on the price?
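
One way to approach the month questions above is to group calendar rows by month and compare the share of available days and the mean price. The sketch below is an illustration under assumptions, not the project's actual code: it expects a cleaned table with a datetime `date` column, a boolean `available` column and a numeric `price` column, and `monthly_summary` is a hypothetical helper name.

```python
import pandas as pd


def monthly_summary(calendar: pd.DataFrame) -> pd.DataFrame:
    """Return the availability rate and mean price for each calendar month."""
    month = calendar["date"].dt.month.rename("month")
    return calendar.groupby(month).agg(
        availability_rate=("available", "mean"),  # share of rows marked available
        mean_price=("price", "mean"),             # NaN prices are skipped
    )


# Toy data standing in for the cleaned calendar dataset.
toy = pd.DataFrame({
    "date": pd.to_datetime(
        ["2016-01-04", "2016-01-05", "2016-02-01", "2016-02-02"]
    ),
    "available": [True, True, True, False],
    "price": [80.0, 100.0, 120.0, float("nan")],
})

summary = monthly_summary(toy)
print(summary)
print("Most available month:", summary["availability_rate"].idxmax())
print("Cheapest month:", summary["mean_price"].idxmin())
```

The neighborhood questions could be handled the same way by grouping on a neighborhood column instead of the month, after joining the calendar rows to the listings table.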
- Dataset source: Kaggle
- The number of people a listing accommodates, together with its bedrooms, bathrooms, beds and guests included, are the main factors determining the price.
- January is the most available month of the year.
- The first two months of the year have the lowest prices.
- Roxhill, Fairmount Park and Holly Park are the top three neighborhoods for the number of available options.
- The most expensive neighborhood is Fairmount Park, and the lowest prices are in the Roxhill neighborhood.
- Roxhill is the best choice if you want more options at a lower price.
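
A finding such as the first one above, on which listing attributes relate most to price, can be checked with a simple correlation ranking. The sketch below is one possible check, not necessarily the method used in this analysis. The column names (`accommodates`, `bedrooms`, `bathrooms`, `beds`, `guests_included`, `price`) are assumed, `price_correlations` is a hypothetical helper, and the numbers are made up for illustration only.

```python
import pandas as pd

# Candidate numeric features (assumed column names in the listings dataset).
FEATURES = ["accommodates", "bedrooms", "bathrooms", "beds", "guests_included"]


def price_correlations(listings: pd.DataFrame, features: list) -> pd.Series:
    """Rank features by their Pearson correlation with price, highest first."""
    return listings[features].corrwith(listings["price"]).sort_values(ascending=False)


# Made-up toy listings; real values would come from the listings dataset.
toy_listings = pd.DataFrame({
    "accommodates": [2, 4, 6, 8],
    "bedrooms": [1, 2, 3, 4],
    "bathrooms": [1.0, 1.0, 2.0, 2.0],
    "beds": [1, 2, 3, 5],
    "guests_included": [1, 2, 2, 4],
    "price": [60.0, 110.0, 150.0, 240.0],
})

corr = price_correlations(toy_listings, FEATURES)
print(corr)
```

Correlation only measures linear association, so a high value here suggests a relationship with price rather than proving that a feature determines it.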