Conversation

@abinayagoudjandhyala
Contributor

@abinayagoudjandhyala abinayagoudjandhyala commented Sep 22, 2025

Pull Request for PyVerse 💡

Requesting to submit a pull request to the PyVerse repository.


Issue Title

Add Airbnb Data Analysis Project under datascience folder

  • I have provided the issue title.

Info about the Related Issue

What's the goal of the project?
To contribute an Airbnb Data Analysis Project under the datascience folder, providing a comprehensive analysis of Airbnb listing data using Tableau, Excel, and visual presentations. The aim is to highlight pricing trends, availability, and location patterns in the Airbnb rental market.

  • I have described the aim of the project.

Name

Please mention your name.
Abinaya Goud Jandhyala

  • I have provided my name.

GitHub ID

Please mention your GitHub ID.
abinayagoudjandhyala

  • I have provided my GitHub ID.

Email ID

Please mention your email ID for further communication.
[email protected]

  • I have provided my email ID.

Identify Yourself

Mention in which program you are contributing (e.g., WoB, GSSOC, SSOC, SWOC).
GSSOC 2025

  • I have mentioned my participant role.

Closes

Enter the issue number that will be closed through this PR.
Closes: #1799

  • I have provided the issue number.

Describe the Add-ons or Changes You've Made

Give a clear description of what you have added or modified.
Added a complete Airbnb Data Analysis project featuring Tableau visualizations, Excel datasets, and a summary presentation. The project offers insights into average prices per bedroom, pricing by zip code, weekly revenue patterns, and listing breakdowns. A README details key insights and suggestions for future enhancements, such as integrating review/amenities data and ML predictive analysis.

  • I have described my changes.

Type of Change

Select the type of change:

  • Bug fix (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds functionality)
  • Code style update (formatting, local variables)
  • Breaking change (fix or feature that would cause existing functionality to not work as expected)
  • This change requires a documentation update

How Has This Been Tested?

Describe how your changes have been tested.
The files were checked for integrity and correctness. The Tableau workbook was loaded to verify that the visualizations are interactive and consistent. The Excel dataset was reviewed for completeness and for accurate linking to the Tableau workbook. The README was checked for clear instructions and a reproducible setup.

  • I have described my testing process.

Checklist

Please confirm the following:

  • My code follows the guidelines of this project.
  • I have performed a self-review of my own code.
  • I have commented my code, particularly wherever it was hard to understand.
  • I have made corresponding changes to the documentation.
  • My changes generate no new warnings.
  • I have added things that prove my fix is effective or that my feature works.
  • Any dependent changes have been merged and published in downstream modules.

Summary by CodeRabbit

  • Documentation
    • Added comprehensive end-user documentation for the Airbnb Data Analysis project, including overview, included assets, objectives, analysis focus areas (average price per bedroom, price distribution by zip code, weekly revenue trends, listing types), key insights, setup/usage instructions, and potential future enhancements. This improves discoverability and onboarding for readers. No functional behavior or data logic was changed.

@coderabbitai
Contributor

coderabbitai bot commented Sep 22, 2025

Walkthrough

Added a new README.md for the Airbnb Data Analysis project describing project overview, included artifacts (presentation and Tableau workbook), objectives, insights, setup/usage instructions, and potential future enhancements. No code or public APIs were modified.

Changes

| Cohort / File(s) | Summary of Changes |
| --- | --- |
| Documentation: Data_Science/Airbnb-Data-Analysis/README.md | Introduced README detailing project scope, files (AirbnbProject.pptx, TableauProject.twbx), objectives, analysis focus areas, setup steps, key insights, and future enhancements. |

Estimated code review effort

🎯 1 (Trivial) | ⏱️ ~3 minutes

Suggested labels

Contributor, Status: Review Ongoing 🔄, level2, gssoc25

Suggested reviewers

  • TheChaoticor
  • UTSAVS26

Poem

A hop, a skip, a README appears—
Charts in Tableau, insights to cheer!
Bedrooms, zips, and weekly flow,
Carrots crunch as trends all show. 🥕
A slide, a twbx—neatly aligned,
Thump-thump! says bunny: “Data refined.”

Pre-merge checks and finishing touches

❌ Failed checks (1 warning)
| Check name | Status | Explanation | Resolution |
| --- | --- | --- | --- |
| Linked Issues Check | ⚠️ Warning | The PR objectives state it will add a Tableau workbook, an Excel dataset, a presentation, and a README under /datascience/Airbnb-Data-Analysis/ to satisfy issue [#1799], but the provided raw_summary shows only README.md was added and explicitly notes no data or workbook files were included; therefore the required deliverables (TableauProject.twbx, AIRBNB PROJECT.pptx, and the Excel dataset) are missing from the PR. This mismatch between the PR description and the actual file changes means the PR does not currently meet the linked issue's coding/delivery requirements. Because the key artifacts referenced in the issue are not present, the check cannot pass. | Please add and commit the missing deliverables (TableauProject.twbx, AIRBNB PROJECT.pptx, and the Excel dataset) to /datascience/Airbnb-Data-Analysis/, verify the Tableau workbook loads against the included dataset, update README.md to reference the exact filenames and paths, and then update the PR description so this check can be re-run and issue [#1799] can be closed. |
✅ Passed checks (4 passed)
| Check name | Status | Explanation |
| --- | --- | --- |
| Out of Scope Changes Check | ✅ Passed | Based on the raw_summary the only change in this PR is README.md under datascience/Airbnb-Data-Analysis/ and no source code, config, or other directories were modified, so there are no apparent out-of-scope or unrelated changes; the PR also notes no exported/public declarations were changed. The modifications are limited to project documentation and thus align with the linked issue's scope. If additional files are later added, their inclusion should be rechecked for scope. |
| Docstring Coverage | ✅ Passed | No functions found in the changes. Docstring coverage check skipped. |
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |
| Title Check | ✅ Passed | The pull request title "Add Airbnb Data Analysis Project under datascience folder #1799" clearly summarizes the main change in the changeset. The title accurately reflects that this PR introduces a new Airbnb Data Analysis project to the datascience folder, which aligns with the PR objectives to add a complete project with Tableau workbook, Excel dataset, presentation, and README. The title is concise, readable, and avoids vague terms or excessive noise. A teammate scanning git history would immediately understand that this PR adds a new project without needing further clarification. |
✨ Finishing touches
  • 📝 Generate docstrings
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Post copyable unit tests in a comment

@github-actions

👋 Thank you for opening this pull request! We're excited to review your contribution. Please give us a moment, and we'll get back to you shortly!

Feel free to join our community on Discord to discuss more!

@github-actions

✅ PR validation passed! Syncing labels and assignees from the linked issue...

@github-actions github-actions bot added the Contributor, gssoc25, level1, and Status: Review Ongoing 🔄 labels Sep 22, 2025
Contributor

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 4

🧹 Nitpick comments (4)
Data_Science/Airbnb-Data-Analysis/README.md (4)

17-22: Define metrics precisely (weekly revenue, zip source, bed_type status, date range).

Document formulas and data fields to remove ambiguity and ease validation.

Apply:

 ## Key Insights
 - **Average Price Per Bedroom**: Larger properties generally have higher average prices, with steep increases for properties with 4+ bedrooms.
 - **Price by Zip Code**: Some zip codes show significantly higher prices, possibly indicating more desirable or premium areas.
 - **Revenue Trends**: Weekly revenue peaks during certain seasons, reflecting high-travel demand periods.
 - **Common Listing Types**: Real beds are the most frequent, and properties with 1-2 bedrooms make up the majority of listings.
+
+## Methodology & Metric Definitions
+- Date range analyzed: <insert coverage window>.
+- Weekly revenue: <define formula, e.g., price × (7 − weekly_availability) or based on calendar data>; specify fields used.
+- Zip code source: <state field or geocoding method>. Note any imputation rules.
+- Bed type and bedroom count fields: <list exact field names>; note if `bed_type` exists in this dataset/version.
+- Outlier handling: <winsorization/filtering rules, if any>.
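
As an illustration of the kind of definition requested above, here is a minimal sketch of the example formula price × (7 − weekly_availability). The function and parameter names are hypothetical, and the assumption that every unavailable night was actually booked is a simplification, not something the PR or the dataset confirms.

```python
def estimate_weekly_revenue(nightly_price: float, available_nights: int) -> float:
    """Estimate one week's revenue as nightly price times booked nights.

    Assumption (not confirmed by the PR): every night that is not listed
    as available during the week was booked at the nightly price.
    """
    if not 0 <= available_nights <= 7:
        raise ValueError("available_nights must be between 0 and 7")
    booked_nights = 7 - available_nights
    return nightly_price * booked_nights


# Example: 2 nights still available, so 5 nights are treated as booked.
print(estimate_weekly_revenue(120.0, 2))
```

Stating the formula this explicitly in the README would let a reviewer recompute a few rows by hand and compare them with the Tableau view.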

28-33: Note privacy when adding reviews and outline basic ML evaluation.

Minor content tweak to future work to set expectations.

Apply:

 ## Future Enhancements
 Potential future improvements include:
 - Adding More Data Features: Incorporating additional features like reviews or amenities.
-- Predictive Analysis: Using machine learning to predict pricing trends based on data.
+- Predictive Analysis: Using machine learning to predict pricing trends based on data; report RMSE/MAE and train/test split methodology.
 - Interactive Web Dashboard: Creating a web-based, interactive dashboard for public access.
+ - Privacy: If incorporating reviews/amenities, ensure text is anonymized and complies with the dataset license.
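
For reference, the two error metrics named in the suggestion can be sketched in plain Python as follows. The helper names, the 20% test fraction, and the sample prices are illustrative assumptions; the project does not currently contain any model code.

```python
import math
import random


def split_indices(n_rows, test_fraction=0.2, seed=0):
    """Shuffle row indices reproducibly and split them into (train, test)."""
    indices = list(range(n_rows))
    random.Random(seed).shuffle(indices)
    n_test = max(1, round(n_rows * test_fraction))
    return indices[n_test:], indices[:n_test]


def rmse(actual, predicted):
    """Root mean squared error: penalizes large misses more heavily."""
    squared = [(a - p) ** 2 for a, p in zip(actual, predicted)]
    return math.sqrt(sum(squared) / len(squared))


def mae(actual, predicted):
    """Mean absolute error: the average size of a miss, in price units."""
    errors = [abs(a - p) for a, p in zip(actual, predicted)]
    return sum(errors) / len(errors)


# Made-up held-out prices and predictions, purely for illustration.
actual = [100.0, 150.0, 200.0]
predicted = [110.0, 140.0, 200.0]
print(f"RMSE={rmse(actual, predicted):.2f}  MAE={mae(actual, predicted):.2f}")
```

Reporting both metrics on the held-out rows only, together with the seed and split fraction, is what makes the evaluation reproducible.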

23-27: Provide a non‑Tableau fallback (exported images/PDF or public link).

This widens accessibility for reviewers without Tableau Desktop.

Apply:

 - Open **TableauProject.twbx**: Use Tableau Desktop to open the file and explore the visualizations.
 - Review the Presentation: Open **AirbnbProject.pptx** to see summarized findings.
+ - No Tableau? Include/export key dashboards as PNG/PDF in `docs/` or provide a public Tableau link for viewing.

17-22: Qualify insight statements with dataset scope to avoid overgeneralization.

Add time/place context so claims aren’t read as universal truths.

Apply:

-- **Average Price Per Bedroom**: Larger properties generally have higher average prices, with steep increases for properties with 4+ bedrooms.
-- **Price by Zip Code**: Some zip codes show significantly higher prices, possibly indicating more desirable or premium areas.
-- **Revenue Trends**: Weekly revenue peaks during certain seasons, reflecting high-travel demand periods.
-- **Common Listing Types**: Real beds are the most frequent, and properties with 1-2 bedrooms make up the majority of listings.
+- **Average Price Per Bedroom** (in this dataset/timeframe): Larger properties show higher average prices, with steeper increases for 4+ bedrooms.
+- **Price by Zip Code** (study region): Certain zip codes trend higher, indicating premium areas.
+- **Revenue Trends**: Weekly revenue peaks during specific seasons in the analyzed window.
+- **Common Listing Types**: Real beds are most frequent, and 1–2 bedroom listings dominate this dataset.
📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 45a0fbe and bc96083.

⛔ Files ignored due to path filters (1)
  • Data_Science/Airbnb-Data-Analysis/AIRBNB PROJECT.pptx is excluded by !**/*.pptx
📒 Files selected for processing (1)
  • Data_Science/Airbnb-Data-Analysis/README.md (1 hunks)
🔇 Additional comments (1)
Data_Science/Airbnb-Data-Analysis/README.md (1)

1-1: Confirm directory casing: Data_Science vs datascience.

Found: ./Data_Science/Airbnb-Data-Analysis. PR references "/datascience/...". Confirm intended casing and update paths or rename to avoid duplicate/ambiguous paths on case‑insensitive filesystems.
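
A check for case-only path collisions could look like the sketch below. The function name and sample paths are hypothetical. Note that `Data_Science` and `datascience` also differ by an underscore, so they would not collide on a case-insensitive filesystem; the sketch only flags paths that differ by letter case alone.

```python
from collections import defaultdict


def find_case_collisions(paths):
    """Group paths that are identical except for letter case."""
    groups = defaultdict(set)
    for path in paths:
        groups[path.casefold()].add(path)
    # Keep only groups where more than one spelling of the same path exists.
    return [sorted(variants) for variants in groups.values() if len(variants) > 1]


# Hypothetical listing: two spellings of the same folder plus an unrelated file.
sample = [
    "Data_Science/Airbnb-Data-Analysis/README.md",
    "data_science/Airbnb-Data-Analysis/README.md",
    "datascience/notes.md",
]
for group in find_case_collisions(sample):
    print("Case-only collision:", group)
```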

Comment on lines +3 to +4
## Overview
This project provides a comprehensive analysis of Airbnb listing data to uncover trends in pricing, availability, and other key factors. Using Tableau for visualization, the project explores the Airbnb rental market, highlighting patterns based on geographic location, room types, and pricing variations.
Contributor

⚠️ Potential issue

Add data source, license, and non‑affiliation/privacy notes.

Provenance and licensing are required for compliance and reuse; add a short disclaimer on Airbnb affiliation and PII handling.

Apply:

 ## Overview
 This project provides a comprehensive analysis of Airbnb listing data to uncover trends in pricing, availability, and other key factors. Using Tableau for visualization, the project explores the Airbnb rental market, highlighting patterns based on geographic location, room types, and pricing variations.
+
+## Data Source & Licensing
+- Source: <dataset name/provider, URL, retrieval date>.
+- License: <dataset license>; confirm that redistribution of the workbook and excerpts complies.
+- Privacy: dataset must not include precise addresses, emails, phone numbers, or other PII. Any such fields should be removed or generalized.
+- Non‑affiliation: This project is for educational purposes and is not affiliated with or endorsed by Airbnb.
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
-## Overview
-This project provides a comprehensive analysis of Airbnb listing data to uncover trends in pricing, availability, and other key factors. Using Tableau for visualization, the project explores the Airbnb rental market, highlighting patterns based on geographic location, room types, and pricing variations.
+## Overview
+This project provides a comprehensive analysis of Airbnb listing data to uncover trends in pricing, availability, and other key factors. Using Tableau for visualization, the project explores the Airbnb rental market, highlighting patterns based on geographic location, room types, and pricing variations.
+
+## Data Source & Licensing
+- Source: <dataset name/provider, URL, retrieval date>.
+- License: <dataset license>; confirm that redistribution of the workbook and excerpts complies.
+- Privacy: dataset must not include precise addresses, emails, phone numbers, or other PII. Any such fields should be removed or generalized.
+- Non-affiliation: This project is for educational purposes and is not affiliated with or endorsed by Airbnb.
🤖 Prompt for AI Agents
In Data_Science/Airbnb-Data-Analysis/README.md around lines 3-4, add brief
provenance, licensing and privacy/non‑affiliation statements: specify the
original data source (e.g., "Dataset obtained from Airbnb public listings via
[source/link] on [date]" or the exact file name), include a license or reuse
terms (e.g., CC BY‑NC or link to dataset's license) and a short non‑affiliation
disclaimer ("This project is not affiliated with Airbnb"), and add a PII/privacy
note explaining any personal data removal/anonymization steps and that no
attempt was made to contact hosts/guests. Keep each note one or two sentences
and place them under a new "Data Sources and License" or "Provenance and
Privacy" section near the top of the README.

Comment on lines +6 to +9
## Project Files
- **AirbnbProject.pptx**: PowerPoint presentation summarizing the main insights from the analysis.
- **TableauProject.twbx**: Tableau workbook for creating and exploring interactive visualizations.

Contributor

⚠️ Potential issue

🧩 Analysis chain

Include the Excel dataset and reconcile file names; avoid spaces/case drift.

  • Project files list omits the Excel dataset mentioned in the PR objectives.
  • Filenames differ from the PR text (e.g., “AIRBNB PROJECT.pptx” vs “AirbnbProject.pptx”). Normalize and match the actual committed files to prevent broken references across OSes.

Apply this minimal update (adjust names to whatever is actually in the repo):

 ## Project Files
-- **AirbnbProject.pptx**: PowerPoint presentation summarizing the main insights from the analysis.
-- **TableauProject.twbx**: Tableau workbook for creating and exploring interactive visualizations.
+- **Airbnb-Project.pptx**: PowerPoint presentation summarizing the main insights from the analysis.
+- **TableauProject.twbx**: Tableau workbook for creating and exploring interactive visualizations.
+- **data/Airbnb_Listings.xlsx**: Source dataset used by the workbook (if not embedded in the TWBX).

Run this to verify filenames and the folder that actually exists:


🏁 Script executed:

#!/usr/bin/env bash
set -euo pipefail
for d in Data_Science/Airbnb-Data-Analysis datascience/Airbnb-Data-Analysis; do
  if [ -d "$d" ]; then
    echo "Contents of $d:"
    ls -al "$d"
  fi
done

Length of output: 377


Normalize filenames in README and add the missing Excel dataset

  • Repo (Data_Science/Airbnb-Data-Analysis) contains: AIRBNB PROJECT.pptx and TableauProject.twbx; README lists AirbnbProject.pptx and mentions an Excel dataset that is not present.
  • Action (choose one):
    • Rename Data_Science/Airbnb-Data-Analysis/"AIRBNB PROJECT.pptx" → Data_Science/Airbnb-Data-Analysis/AirbnbProject.pptx (remove spaces/case drift) and add the source dataset at data/Airbnb_Listings.xlsx if it's not embedded in the TWBX.
    • OR update README to reference the exact committed filenames (Data_Science/Airbnb-Data-Analysis/"AIRBNB PROJECT.pptx", TableauProject.twbx) and explicitly state whether the source dataset is embedded in the TWBX or where the Excel file is located.
  • Use consistent, space-free filenames for cross-OS compatibility.
🤖 Prompt for AI Agents
In Data_Science/Airbnb-Data-Analysis/README.md around lines 6–9, the README
mentions "AirbnbProject.pptx" and an Excel dataset that do not match the repo
contents (the repo contains "AIRBNB PROJECT.pptx" and TableauProject.twbx and
the Excel file is missing or embedded); fix by either (A) renaming the repo file
to AirbnbProject.pptx (remove spaces/change case) and adding the source dataset
at Data_Science/Airbnb-Data-Analysis/data/Airbnb_Listings.xlsx if it is not
embedded in the TWBX, or (B) update the README to reference the exact committed
filenames ("AIRBNB PROJECT.pptx" and "TableauProject.twbx") and explicitly state
whether the source dataset is embedded in the TWBX or where the Excel file can
be found; ensure filenames use consistent, space-free naming if you choose
rename, and commit the README and any file renames/additions together.
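
The mismatch described above can be checked mechanically. The following is a rough sketch with hypothetical helper names; the filename lists are copied from this review, and hyphenating whitespace is only one possible space-free convention.

```python
import re


def space_free(name):
    """Replace each run of whitespace in a filename with a single hyphen."""
    return re.sub(r"\s+", "-", name.strip())


def unmatched_references(readme_names, committed_names):
    """Return README filenames with no exact, case-sensitive match in the repo."""
    committed = set(committed_names)
    return [name for name in readme_names if name not in committed]


# Filenames as reported in this review thread.
readme_names = ["AirbnbProject.pptx", "TableauProject.twbx"]
committed_names = ["AIRBNB PROJECT.pptx", "TableauProject.twbx", "README.md"]

print("Not found in repo:", unmatched_references(readme_names, committed_names))
print("Space-free form:", space_free("AIRBNB PROJECT.pptx"))
```

An exact, case-sensitive comparison is deliberate here: a reference that only resolves on a case-insensitive filesystem will still break for contributors on Linux.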

⚠️ Potential issue

🧩 Analysis chain

Track large binaries with Git LFS to keep repo lean.

TWBX/PPTX (and XLSX) often exceed normal size thresholds; ensure they’re LFS‑tracked.

Use this check:


🏁 Script executed:

#!/usr/bin/env bash
set -euo pipefail
echo "Potential large binaries:"
python - <<'PY'
import os
exts={'.twbx','.twb','.pptx','.xlsx','.xls','.csv'}
for root,_,files in os.walk('.'):
  for f in files:
    if os.path.splitext(f)[1].lower() in exts:
      p=os.path.join(root,f)
      try:
        sz=os.path.getsize(p)
      except OSError:
        continue
      if sz>=5*1024*1024:
        print(f"{p}\t{sz/1024/1024:.1f} MB")
PY
echo
echo ".gitattributes entries referencing LFS:"
[ -f .gitattributes ] && rg -n 'filter=lfs' .gitattributes || echo "No .gitattributes found"

Length of output: 1760


Track large binaries with Git LFS and migrate existing large files.

No .gitattributes found; the repo contains multiple files >=5MB. Add a .gitattributes and LFS-track binary/artifact patterns (e.g. *.twbx, *.pptx, *.xlsx, *.xls and large *.csv), then migrate the existing large files into LFS or move them to external storage (git lfs migrate import --include="path,...").
Notable large files found:

  • Data_Science/Airbnb-Data-Analysis/TableauProject.twbx — 5.1 MB
  • Machine_Learning/Book Recommendation System/books.csv — 73.9 MB
  • Deep_Learning/toxicity insult analyser/jigsaw-toxic-comment-classification-challenge/train.csv/train.csv — 65.6 MB
  • Deep_Learning/toxicity insult analyser/jigsaw-toxic-comment-classification-challenge/test.csv/test.csv — 57.6 MB
  • Machine_Learning/Fake news detection project/dataset/Fake.csv — 59.9 MB
  • Machine_Learning/Fake news detection project/dataset/True.csv — 51.1 MB
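
For orientation, the `.gitattributes` entries that `git lfs track` writes have the form `<pattern> filter=lfs diff=lfs merge=lfs -text`. The sketch below only builds those lines for the extensions mentioned above; the helper name is hypothetical, and whether to include `*.csv` is a judgment call for the maintainers, since it would also capture small CSV files.

```python
def lfs_attribute_lines(patterns):
    """Build .gitattributes lines in the format that `git lfs track` writes."""
    return [f"{pattern} filter=lfs diff=lfs merge=lfs -text" for pattern in patterns]


# Binary and artifact patterns named in the comment above.
patterns = ["*.twbx", "*.pptx", "*.xlsx", "*.xls"]
print("\n".join(lfs_attribute_lines(patterns)))
```

Adding these lines affects only future commits; files already in history stay as regular blobs until they are migrated.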

Comment on lines +23 to +27
## Setup and Usage
To interact with the project and explore the data:
- Open **TableauProject.twbx**: Use Tableau Desktop to open the file and explore the visualizations.
- Review the Presentation: Open **AirbnbProject.pptx** to see summarized findings.

Contributor

⚠️ Potential issue

Specify Tableau Desktop version and whether data is embedded; add relinking steps.

Without this, opening the TWBX may fail or silently point to the wrong data.

Apply:

 ## Setup and Usage
-To interact with the project and explore the data:
-- Open **TableauProject.twbx**: Use Tableau Desktop to open the file and explore the visualizations.
-- Review the Presentation: Open **AirbnbProject.pptx** to see summarized findings.
+To interact with the project and explore the data:
+- Prerequisite: Tableau Desktop (specify tested version, e.g., 2023.3 or later).
+- Open **TableauProject.twbx** in Tableau Desktop.
+  - If the workbook expects an external Excel file (not embedded), go to Data > [Data Source] > Edit Connection and relink to `data/Airbnb_Listings.xlsx`.
+  - If an extract is used, Data > [Data Source] > Extract > Refresh after relinking.
+- Review the presentation: open **Airbnb-Project.pptx** for summarized findings.
📝 Committable suggestion

Suggested change
-## Setup and Usage
-To interact with the project and explore the data:
-- Open **TableauProject.twbx**: Use Tableau Desktop to open the file and explore the visualizations.
-- Review the Presentation: Open **AirbnbProject.pptx** to see summarized findings.
+## Setup and Usage
+To interact with the project and explore the data:
+- Prerequisite: Tableau Desktop (specify tested version, e.g., 2023.3 or later).
+- Open **TableauProject.twbx** in Tableau Desktop.
+  - If the workbook expects an external Excel file (not embedded), go to Data > [Data Source] > Edit Connection and relink to `data/Airbnb_Listings.xlsx`.
+  - If an extract is used, Data > [Data Source] > Extract > Refresh after relinking.
+- Review the presentation: open **Airbnb-Project.pptx** for summarized findings.
🤖 Prompt for AI Agents
In Data_Science/Airbnb-Data-Analysis/README.md around lines 23 to 27, the TWBX
instructions lack Tableau version and data-link guidance; update the README to
state the minimum Tableau Desktop version used (e.g., Tableau Desktop 2023.1)
and explicitly note whether the TWBX contains embedded/extracted data or
references external files, then add concise relinking steps: open Tableau
Desktop, go to Data > Replace Data Source (or Data > Extract > Refresh) if the
workbook points to external CSV/Hyper files, and provide the expected relative
path(s) to the data files in the repo (or note that data is embedded so no
relinking is needed).

@github-actions

This PR is stale because it has been open for 30 days with no activity. Remove the stale label or comment, or this will be closed in 5 days.

@github-actions github-actions bot added stale and removed stale labels Oct 23, 2025

Labels

  • Contributor: Denotes issues or PRs submitted by contributors to acknowledge their participation.
  • gssoc25
  • level1
  • Status: Review Ongoing 🔄: PR is currently under review and awaiting feedback from reviewers.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Add Airbnb Data Analysis Project under datascience folder

1 participant