Home
Joanne edited this page Sep 30, 2025 · 24 revisions
- Purpose: create tools to clean the consultant-delivered Household Travel Survey (HTS) data
- Goals: bring the HTS data to the level of quality needed for PSRC analysis and modeling
Assessment of data quality
- develop metrics (number of error flags, number of NAs) to assess whether the data needs cleaning and how much cleaner it is after each process
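The two metrics named above can be sketched as a small counting routine. This is only an illustration: the record layout, the `error_flag` and `mode` field names, and the flag labels are assumptions made for the example, not the schema used in hhts_cleaning.

```python
from collections import Counter

def quality_metrics(records, flag_field="error_flag"):
    """Count missing values per field and occurrences of each error flag.

    `records` is a list of dicts (one per trip); None stands in for NA.
    The field names are hypothetical, not the hhts_cleaning schema.
    """
    na_counts = Counter()
    flag_counts = Counter()
    for rec in records:
        for field, value in rec.items():
            # An empty flag means "no error", so it is not counted as an NA.
            if field != flag_field and value is None:
                na_counts[field] += 1
        flag = rec.get(flag_field)
        if flag:
            flag_counts[flag] += 1
    return {
        "rows": len(records),
        "na_total": sum(na_counts.values()),
        "na_by_field": dict(na_counts),
        "flag_total": sum(flag_counts.values()),
        "flags_by_type": dict(flag_counts),
    }

# Made-up rows before and after one cleaning step.
before = [
    {"trip_id": 1, "mode": None, "dest_purpose": "work", "error_flag": "excessive speed"},
    {"trip_id": 2, "mode": "walk", "dest_purpose": None, "error_flag": "missing purpose"},
    {"trip_id": 3, "mode": "drive", "dest_purpose": "home", "error_flag": None},
]
after = [
    {"trip_id": 1, "mode": "drive", "dest_purpose": "work", "error_flag": None},
    {"trip_id": 2, "mode": "walk", "dest_purpose": None, "error_flag": "missing purpose"},
    {"trip_id": 3, "mode": "drive", "dest_purpose": "home", "error_flag": None},
]

m_before = quality_metrics(before)
m_after = quality_metrics(after)
print(m_before["na_total"], m_before["flag_total"])
print(m_after["na_total"], m_after["flag_total"])
```

Running the same function before and after each cleaning step gives a comparable pair of numbers, which is one way to express "how much cleaner" the data has become.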
Automatic data cleaning
- a set of SQL scripts for the automatic data cleaning process, which identify error flags and generate the tables ready for Shiny-Fixie
Manual data cleaning with Shiny-Fixie
Shiny-Fixie includes:
- a Shiny user interface designed for manual data cleaning
- a set of stored procedures (psrc/hhts_cleaning/hhts_cleaning/Stored Procedures) that update tables in the database
Post-Fixie cleaning
- update all derived variables from cleaned data
hhts_cleaning database (hhts_cleaning repo)
- a database that stores all data tables, views and stored procedures
- temporal data tables in the hhts_cleaning database that track all data edits (they keep all previous records and when each record was valid from and valid to, but not who made the edits)
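To illustrate how valid-from and valid-to times let earlier states of a record be reconstructed, here is a minimal sketch. It assumes a half-open validity interval (valid_from <= t < valid_to) and a far-future end time for the current version; the column names `valid_from` and `valid_to`, the `trip_id` key and the sample rows are hypothetical rather than taken from the hhts_cleaning schema.

```python
from datetime import datetime

# Stand-in for the end time of a version that is still current (assumption).
OPEN_END = datetime.max

def as_of(history, when):
    """Return {trip_id: row} for the versions that were valid at `when`.

    A version counts as valid when valid_from <= when < valid_to.
    """
    return {
        row["trip_id"]: row
        for row in history
        if row["valid_from"] <= when < row["valid_to"]
    }

# Made-up edit history: one trip whose missing purpose was filled in on 2 March.
history = [
    {"trip_id": 101, "dest_purpose": None,
     "valid_from": datetime(2025, 1, 10), "valid_to": datetime(2025, 3, 2)},
    {"trip_id": 101, "dest_purpose": "Home",
     "valid_from": datetime(2025, 3, 2), "valid_to": OPEN_END},
]

snapshot_feb = as_of(history, datetime(2025, 2, 1))
snapshot_apr = as_of(history, datetime(2025, 4, 1))
print(snapshot_feb[101]["dest_purpose"])
print(snapshot_apr[101]["dest_purpose"])
```

Because each version carries only its validity window, a lookup like this can say what a record looked like at a given time, but not who changed it.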
Codes 1-6 are recodes of a missing or erroneous dest_purpose:
- 1 - Return leg of loop trip to 'Home'
- 2 - Pickup/dropoff by behavior
- 4 - School purpose by location
- 5 - Home or work purpose by location
- 6 - Missing purpose assigned by common destination within household
Other codes:
- 7 - Impute missing mode by speed
- 8 - Link trip
- 12 - Revise excessive speed (with distance matrix API travel time)
- 13 - Impute purpose from destination (using location recognition API)
- 15 - Shiny-Fixie manual data edit
Code 3 is obsolete (we don't collect license data anymore). I've chosen not to run the procedures behind codes 9-11 (silent passenger insertions; recode when work purpose is assigned to accompanying dependents). Code 14 isn't relevant either (split trip from traces; trace data changed in 2023).
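For analysis it can help to turn the numeric codes back into their descriptions. The lookup below is a minimal sketch built only from the codes listed above; the function names and the assumption that a record's codes arrive as a list of integers are illustrative and do not describe how the database stores them.

```python
# Descriptions copied from the code list above.
REVISION_CODES = {
    1: "Return leg of loop trip to 'Home'",
    2: "Pickup/dropoff by behavior",
    4: "School purpose by location",
    5: "Home or work purpose by location",
    6: "Missing purpose assigned by common destination within household",
    7: "Impute missing mode by speed",
    8: "Link trip",
    12: "Revise excessive speed (with distance matrix API travel time)",
    13: "Impute purpose from destination (using location recognition API)",
    15: "Shiny-Fixie manual data edit",
}

def is_purpose_recode(code):
    """True for codes in the 1-6 range, which recode dest_purpose."""
    return 1 <= code <= 6

def describe_revisions(codes):
    """Map each code to its description; unlisted codes are reported as such."""
    return [
        REVISION_CODES.get(code, f"code {code}: not described in this list")
        for code in codes
    ]

labels = describe_revisions([1, 15])
print(labels)
```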
The wiki describes the data cleaning process for the PSRC Household Travel Survey Program.