diff --git a/README.md b/README.md
index d1532ec..a8c29e0 100644
--- a/README.md
+++ b/README.md
@@ -45,7 +45,7 @@ The app can return results in either CSV or JSON format. The fields in either ca
 
 Data for travel time estimation through the app are sourced from [HERE](https://github.com/CityofToronto/bdit_data-sources/tree/master/here)'s [traffic API](https://developer.here.com/documentation/traffic-api/api-reference.html) and are available back to about 2017. HERE collects data from motor vehicles that report their speed and position to HERE, most likely as a by-poduct of the driver making use of an in-car navigation system connected to the Internet.
 
-The number of vehicles within the City of Toronto reporting their position to HERE in this way has been estimated to be around 500 vehicles during the AM and PM peak periods, with lower numbers in the off hours. While this may seem like a lot, in practice many of these vehicles are on the highways and the coverage of any particular city street within a several hour time window can be very minimal if not nil. For this reason, we are currently restricting travel time estimates to "arterial" streets and highways.
+The number of vehicles within the City of Toronto reporting their position to HERE in this way has been [estimated](./analysis/total-fleet-size.r) to be around 2,000 to 3,000 vehicles during the AM and PM peak periods, with lower numbers in the off hours. While this may seem like a lot, in practice many of these vehicles are on the highways and the coverage of any particular city street within a several hour time window can be very minimal if not nil. For this reason, we are currently restricting travel time estimates to "arterial" streets and highways.
 
 Travel times are provided to us in the form of _average speeds_ along links of the street network in 5-minute time bins. Given the sparseness of the vehicle probe data, most links, in most time bins are empty.
 The scond most common sample size is a single vehicle observation.
diff --git a/analysis/total-fleet-size.r b/analysis/total-fleet-size.r
new file mode 100644
index 0000000..d017e90
--- /dev/null
+++ b/analysis/total-fleet-size.r
@@ -0,0 +1,25 @@
+
+library('tidyverse')
+library('dbplyr')
+
+con <- DBI::dbConnect(
+    RPostgres::Postgres(),
+    host = 'insert DB host here',
+    user = 'nwessel',
+    dbname = 'bigdata',
+    password = rstudioapi::askForPassword("Database password")
+)
+
+# The idea here is to convert average speeds and lengths of links into *time*
+# The total amount of time spent traveling by vehicles in a given unit of time
+# just is the average total number of vehicles reporting their locations to Here
+tbl( con, in_schema('here','ta_path') ) %>%
+    filter( dt == '2024-07-18' ) %>%
+    mutate( hours = sample_size * ((length / 1000) / mean) ) %>%
+    group_by( tod ) %>%
+    summarize(
+        # x 12 because otherwise we'd have the number of hours in a five-minute bin
+        probe_count = 12 * sum(hours)
+    ) %>%
+    ggplot( aes(x = tod, y = probe_count) ) +
+    geom_line()
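
The vehicle-hours reasoning in `analysis/total-fleet-size.r` may be easier to check against a small worked example that needs no database. The sketch below is illustrative only: the three rows are made-up values, and it assumes (as the `/ 1000` in the script suggests, but the diff does not state) that `length` is in metres, `mean` is a speed in km/h, and `sample_size` counts probe observations on a link within one five-minute bin.

```r
# Toy data for a single five-minute bin. Values are hypothetical and the
# units (metres, km/h) are an assumption inferred from the script.
links <- data.frame(
  tod         = "08:00:00",
  length      = c(500, 1000, 250),  # link length in metres (assumed)
  mean        = c(30, 60, 15),      # mean speed in km/h (assumed)
  sample_size = c(2, 3, 1)          # probe observations on the link
)

# Vehicle-hours on each link: observations x time to traverse the link
links$hours <- links$sample_size * ((links$length / 1000) / links$mean)

# Sum vehicle-hours per bin, then scale by the 12 five-minute bins in an
# hour to get the average number of vehicles present during the bin
probe_count <- 12 * tapply(links$hours, links$tod, sum)
print(probe_count)
```

With these numbers the three links contribute 1/30, 1/20 and 1/60 of a vehicle-hour, or 0.1 vehicle-hours in total, so the bin holds about 1.2 vehicles on average. The same arithmetic applied to every link in `here.ta_path` is what produces the per-bin `probe_count` that the script plots.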