ThiagoBorges81/house_rocket

House Rocket

Disclaimer: House Rocket is a fictional company. Although the business is fictional, I present here a real-world solution using data science methods. The procedures and discussions are intended to showcase my data science skills and to serve practice and educational purposes. The assumptions, questions, and conclusions/proposals are fictional and made solely to meet the requirements of this fictional business. The data used for this project are publicly available for download on the Kaggle website (https://www.kaggle.com/datasets/shivachandel/kc-house-data).

1. Understanding the company

The real estate industry has multiple ways to generate revenue, including acting as a real estate agent, wholesaling, wholetailing, buy-and-hold investing, house flipping, and remote investing, to mention a few. Each option has its pros and cons, and the model of choice depends on the context and goals of the company. For the purpose of this project, the fictional company House Rocket uses a model in which it searches for good purchasing opportunities: low-priced properties in good condition that it can potentially sell at a profit.

2. Business question

The company faces big challenges, such as finding the best market opportunities and generating revenue. Fortunately, the company’s CEO is a data-driven person who makes decisions based on solid evidence backed by data. The company currently has a large database, and the information it contains should drive purchasing and selling decisions. Therefore, House Rocket’s CEO approached the company’s Data Science team with two questions whose answers would support the company’s decisions:

1. What are the properties House Rocket should buy and at what price?
2. Once the property is bought, when is the best time to sell the property? And for what price?

2.1 The database

The company has a large database of properties located in King County, Washington, USA. The property attributes are described in Table 1.

Table 1. Property attributes

| Property attribute | Description |
| --- | --- |
| id | Unique identification code of the property |
| date | Date the property was sold |
| price | Property’s selling price |
| bedrooms | Property’s number of bedrooms |
| bathrooms | Property’s number of bathrooms |
| sqft_living | Building’s size (square feet) |
| sqft_lot | Size of the land lot (square feet) |
| floors | Property’s number of floors |
| waterfront | Does the property have a water view? (0 = no; 1 = yes) |
| view | Property’s quality-of-view index, from 0 to 4 |
| condition | Property’s overall condition index, from 1 to 5 |
| grade | Property’s overall quality index (construction and design), from 1 to 13 |
| sqft_above | Property’s space above the ground (square feet) |
| sqft_basement | Property’s basement size (square feet) |
| yr_built | Property’s construction year |
| yr_renovated | Year of the last renovation |
| zipcode | Postal code |
| lat | Latitude of the property |
| long | Longitude of the property |
| sqft_living15 | Interior size of the 15 nearest neighbouring properties |
| sqft_lot15 | Land lot size of the 15 nearest neighbouring properties |

2.2 Description of the database

• Fractional bathroom counts (e.g. 1.5) indicate bathrooms that do not have all bathroom items;
• The scores of the “view” column mean: 0 = no view, 1 = regular, 2 = medium, 3 = good, 4 = excellent;
• The scores of the “condition” column mean: 1 = very bad, 2 = bad, 3 = medium, 4 = good, 5 = very good;
• The scores of the “grade” column were classified using a score band for each level: 1 to 3 = very low, 4 to 6 = low, 7 = medium, 8 to 10 = high, 11 to 13 = very high;
• The price fluctuates between the warmer and colder seasons of the year.
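The score descriptions above can be written down as simple lookups. The following is a minimal sketch using only the Python standard library; the names `VIEW_LABELS`, `CONDITION_LABELS`, `GRADE_BANDS`, `grade_band`, and `has_partial_bathroom` are hypothetical helpers introduced for illustration and are not taken from the project code.

```python
# Label lookups copied from the score descriptions above.
VIEW_LABELS = {0: "no view", 1: "regular", 2: "medium", 3: "good", 4: "excellent"}
CONDITION_LABELS = {1: "very bad", 2: "bad", 3: "medium", 4: "good", 5: "very good"}

# (lowest grade, highest grade, band label), in ascending order.
GRADE_BANDS = [
    (1, 3, "very low"),
    (4, 6, "low"),
    (7, 7, "medium"),
    (8, 10, "high"),
    (11, 13, "very high"),
]


def grade_band(grade: int) -> str:
    """Return the band label for a grade between 1 and 13."""
    for low, high, label in GRADE_BANDS:
        if low <= grade <= high:
            return label
    raise ValueError(f"grade out of range: {grade}")


def has_partial_bathroom(bathrooms: float) -> bool:
    """True when the bathroom count has a fractional part (e.g. 2.5)."""
    return bathrooms != int(bathrooms)
```

For example, `grade_band(9)` falls in the 8 to 10 band and returns `"high"`, and `has_partial_bathroom(1.5)` is true.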

3. Planning the Solution

The solution plan has three components: the deliverable, the tools used, and the process. Each is detailed below.

3.1 The deliverable

These are the products that will be sent to the CEO to fulfill his requirements:

• A table with the suggested buying properties;
• A table with the suggested time and price for selling;
• A web application for data exploration.

3.2 Tools

For this project, I have used the following tools:
• Python 3.10.4
• Microsoft Visual Studio Code
• Streamlit
• Streamlit share

3.3 Process

This project aims to respond to business questions using exploratory analysis of the available data. Therefore, the dataset was processed to answer these questions:

3.3.1. What are the properties House Rocket should buy and at what price?

Location and condition play an important role in property pricing. Therefore, the dataset was grouped by location and the median price of each region was calculated; the median was used to avoid the influence of extreme values. Then, the properties priced below the median for their region and with a condition rating of good or above were selected.
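The selection rule described above could be sketched as follows. This is an illustration under stated assumptions, not the project's actual code: it uses `zipcode` as the location key, treats a `condition` of 4 ("good") or higher as qualifying, works on made-up toy rows, and relies only on the standard library. The function names `regional_medians` and `buy_candidates` are hypothetical.

```python
from collections import defaultdict
from statistics import median

# Toy rows standing in for the dataset; all values are made up.
properties = [
    {"id": 1, "zipcode": "98001", "price": 200_000, "condition": 4},
    {"id": 2, "zipcode": "98001", "price": 300_000, "condition": 5},
    {"id": 3, "zipcode": "98001", "price": 900_000, "condition": 3},
    {"id": 4, "zipcode": "98002", "price": 150_000, "condition": 2},
    {"id": 5, "zipcode": "98002", "price": 380_000, "condition": 4},
    {"id": 6, "zipcode": "98002", "price": 500_000, "condition": 5},
    {"id": 7, "zipcode": "98002", "price": 450_000, "condition": 3},
]

MIN_CONDITION = 4  # assumption: "good" (4) or better qualifies


def regional_medians(rows):
    """Median price per zipcode."""
    prices = defaultdict(list)
    for row in rows:
        prices[row["zipcode"]].append(row["price"])
    return {zipcode: median(values) for zipcode, values in prices.items()}


def buy_candidates(rows, min_condition=MIN_CONDITION):
    """Rows priced below their regional median and in good enough condition."""
    medians = regional_medians(rows)
    return [
        row
        for row in rows
        if row["price"] < medians[row["zipcode"]]
        and row["condition"] >= min_condition
    ]


medians = regional_medians(properties)
to_buy = buy_candidates(properties)
```

With these toy rows, property 4 is cheap but rejected for its condition, and property 2 is in very good condition but priced at its regional median, so only properties 1 and 5 are suggested.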

3.3.2. Once the property is bought, when is the best time and price for selling?

Considering the seasonality, the chosen properties were re-grouped by location and season of the year and a new median was calculated. After this processing, the following conditions were applied for selling:

3.3.2.a. If the purchasing price was above the regional median, considering the seasonality, the selling price should be 10% above the purchasing price;

3.3.2.b. If the purchasing price was below the regional median, considering the seasonality, the selling price should be 30% above the purchasing price.
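A minimal sketch of the two selling rules above, using only the standard library. Several details are assumptions, since the text does not specify them: seasons are derived from the month of the `date` field using Northern Hemisphere meteorological seasons, `zipcode` is the location key, a price exactly equal to the median receives the 10% markup, and the rows are made up. The helpers `season_of` and `selling_plan` are hypothetical names.

```python
from collections import defaultdict
from datetime import date
from statistics import median


def season_of(d: date) -> str:
    """Assumed mapping from month to season (Northern Hemisphere)."""
    if d.month in (3, 4, 5):
        return "spring"
    if d.month in (6, 7, 8):
        return "summer"
    if d.month in (9, 10, 11):
        return "autumn"
    return "winter"


# Toy rows standing in for the properties chosen for purchase.
purchased = [
    {"id": 1, "zipcode": "98001", "date": date(2014, 7, 10), "price": 200_000},
    {"id": 2, "zipcode": "98001", "date": date(2014, 8, 2), "price": 260_000},
    {"id": 3, "zipcode": "98001", "date": date(2015, 1, 15), "price": 240_000},
]


def selling_plan(rows):
    """Suggest a selling price per property from the region/season median."""
    groups = defaultdict(list)
    for row in rows:
        groups[(row["zipcode"], season_of(row["date"]))].append(row["price"])
    medians = {key: median(prices) for key, prices in groups.items()}

    plan = []
    for row in rows:
        key = (row["zipcode"], season_of(row["date"]))
        # Below the seasonal regional median: 30% markup; otherwise 10%.
        markup_pct = 30 if row["price"] < medians[key] else 10
        sell_price = row["price"] * (100 + markup_pct) / 100
        plan.append(
            {
                "id": row["id"],
                "season": key[1],
                "sell_price": sell_price,
                "profit": sell_price - row["price"],
            }
        )
    return plan


plan = selling_plan(purchased)
```

The markup is kept as an integer percentage so that the toy prices produce exact values. The `profit` field is the difference between the suggested selling price and the purchasing price.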

4. Main insights from the data

Assessing the data made it possible to formulate a few hypotheses. Verifying these initial hypotheses gave a better understanding of the real estate market in King County, Washington.

4.1. The hypotheses:

4.1.1: Properties with a water view are 30% more expensive than those without a water view.
FALSE – properties with a water view are more expensive, but by 68.0%.

4.1.2: Properties built before 1955 are 50% cheaper than those built after 1955.
FALSE – properties built before 1955 are cheaper than those built after 1955, but only by 0.8%.

4.1.3: Properties without a basement are 40% larger than properties with a basement.
FALSE – properties without a basement are 18.4% larger than properties with a basement.

4.1.4: Property prices grow 10% year-over-year (YoY).
FALSE – the YoY growth is -0.1%.

4.1.5: Properties with 3 bedrooms present a 15% month-over-month (MoM) price growth.
FALSE – the MoM growth is 0%.
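Each hypothesis above compares an observed percentage with a claimed one. The sketch below shows one way such a check could be computed with the standard library. The text does not state which statistic or acceptance criterion was used, so comparing group means and the 1-point `tolerance` are assumptions, the prices are made up, and `pct_difference` and `check_hypothesis` are hypothetical names.

```python
from statistics import mean


def pct_difference(group, baseline):
    """Percentage by which the mean of `group` exceeds the mean of `baseline`."""
    base = mean(baseline)
    return (mean(group) - base) / base * 100


def check_hypothesis(observed_pct, claimed_pct, tolerance=1.0):
    """Assumed criterion: true when observed is within `tolerance` points of the claim."""
    return abs(observed_pct - claimed_pct) <= tolerance


# Toy prices standing in for the two groups of hypothesis 4.1.1.
waterfront_prices = [900_000, 1_100_000]
other_prices = [400_000, 600_000]

observed = pct_difference(waterfront_prices, other_prices)
verdict = check_hypothesis(observed, claimed_pct=30)
```

With these toy prices the group means are 1,000,000 and 500,000, so the observed difference is far from the claimed 30% and the hypothesis is rejected, mirroring the structure of the FALSE verdicts above.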

5. Financial impact

With the strategies implemented in this project, the financial impact was an average profit of $71,700.00 per property.

6. Conclusion and next steps

This project answered the CEO’s business questions and addressed the issues related to the company’s business model. There is also room for improvement in future analyses, such as generating further insights from the metrics present in the data and evaluating the best predictors of selling price using regression models.
