@nij-patel nij-patel commented Mar 3, 2025

Changes Made

1. New GitHub action to automatically close non-Simplify listings that are inactive

Added validate_listings.py, a script that reads every listing in listings.json and, for each one not sourced from Simplify, checks whether the posting is still active. It loads the page in a headless browser via Selenium and flags the listing if the page 404s or matches a regex of phrases suggesting the posting is closed ("job not found", "no longer accepting applications", etc.). Any listing flagged this way is marked inactive.
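A minimal sketch of the check described above, not the actual contents of validate_listings.py. It leaves the Selenium page load out and works on a status code and page text supplied by a hypothetical `fetch` callable. The field names `source`, `active`, and `url` are assumptions about the listings.json schema, and the pattern list contains only the two phrases quoted in this description.

```python
import re

# Only the two phrases quoted in the PR description; the real list is likely longer.
INACTIVE_PATTERNS = re.compile(
    r"job not found|no longer accepting applications",
    re.IGNORECASE,
)


def looks_inactive(status_code, page_text):
    """Return True if the page 404s or contains a phrase suggesting closure."""
    if status_code == 404:
        return True
    return INACTIVE_PATTERNS.search(page_text) is not None


def mark_inactive(listings, fetch):
    """Set active=False on non-Simplify listings whose page looks inactive.

    `fetch` is a hypothetical callable: url -> (status_code, page_text).
    Returns the listings that were changed.
    """
    closed = []
    for listing in listings:
        # Skip Simplify-sourced listings and ones that are already inactive.
        if listing.get("source") == "Simplify" or not listing.get("active", True):
            continue
        status_code, page_text = fetch(listing["url"])
        if looks_inactive(status_code, page_text):
            listing["active"] = False
            closed.append(listing)
    return closed
```

Keeping the decision in a pure function like `looks_inactive` makes it possible to test the matching rules without launching a browser.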

The script runs daily as a GitHub Action via validate_listings.yml and covers all the listings. It can also be triggered manually.

2. Dash Delimiting for Title Bloat

To clean up bloated titles in the internship listings, the script also strips the extra parts of job titles by splitting on the dash.
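A rough illustration of what splitting on the dash might look like. The function name `trim_title` is hypothetical. Two choices here are assumptions, not something the description states: it splits only on a dash surrounded by spaces, and it keeps only the first segment. Requiring the spaces leaves hyphenated words such as "Co-op" intact.

```python
def trim_title(title):
    """Keep only the part of a job title before the first spaced dash."""
    return title.split(" - ", 1)[0].strip()


print(trim_title("Software Engineer Intern - Summer 2025 - Remote"))
print(trim_title("Co-op Engineer"))
```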

Testing

To test the script, I copied the same listings into test_listings.json and ran the script on that file several times. The script successfully removed the listings that were no longer active, which I verified by clicking the links afterwards. I also temporarily wrote the listings found to be inactive to a closed_listings.json file and inspected it as a further check.

For the dash delimiting, I had the script print out the titles of the jobs, confirming it behaved as expected.

@rushilsrivastava rushilsrivastava added the enhancement New feature or request label Mar 5, 2025