Update summary stats on Here probe fleet size #139

Merged 6 commits on Jul 24, 2024
2 changes: 1 addition & 1 deletion README.md
@@ -45,7 +45,7 @@ The app can return results in either CSV or JSON format. The fields in either ca

Data for travel time estimation through the app are sourced from [HERE](https://github.com/CityofToronto/bdit_data-sources/tree/master/here)'s [traffic API](https://developer.here.com/documentation/traffic-api/api-reference.html) and are available back to about 2017. HERE collects data from motor vehicles that report their speed and position to HERE, most likely as a by-product of the driver making use of an in-car navigation system connected to the Internet.

- The number of vehicles within the City of Toronto reporting their position to HERE in this way has been estimated to be around 500 vehicles during the AM and PM peak periods, with lower numbers in the off hours. While this may seem like a lot, in practice many of these vehicles are on the highways and the coverage of any particular city street within a several hour time window can be very minimal if not nil. For this reason, we are currently restricting travel time estimates to "arterial" streets and highways.
+ The number of vehicles within the City of Toronto reporting their position to HERE in this way has been [estimated](./analysis/total-fleet-size.r) to be around 2,000 to 3,000 vehicles during the AM and PM peak periods, with lower numbers in the off hours. While this may seem like a lot, in practice many of these vehicles are on the highways and the coverage of any particular city street within a several hour time window can be very minimal if not nil. For this reason, we are currently restricting travel time estimates to "arterial" streets and highways.

Travel times are provided to us in the form of _average speeds_ along links of the street network in 5-minute time bins. Given the sparseness of the vehicle probe data, most links in most time bins are empty. The second most common sample size is a single vehicle observation.

25 changes: 25 additions & 0 deletions analysis/total-fleet-size.r
@@ -0,0 +1,25 @@

library('tidyverse')
library('dbplyr')

con <- DBI::dbConnect(
  RPostgres::Postgres(),
  host = 'insert DB host here',
  user = 'nwessel',
  dbname = 'bigdata',
  password = rstudioapi::askForPassword("Database password")
)

# The idea here is to convert average speeds and lengths of links into travel *time*.
# The total amount of time spent traveling by vehicles in a given unit of time
# is just the average total number of vehicles reporting their locations to HERE.
tbl(con, in_schema('here', 'ta_path')) %>%
  filter(dt == '2024-07-18') %>%
  mutate(hours = sample_size * ((length / 1000) / mean)) %>%
  group_by(tod) %>%
  summarize(
    # x 12 because sum(hours) is vehicle-hours per five-minute bin, and there are 12 bins per hour
    probe_count = 12 * sum(hours)
  ) %>%
  ggplot(aes(x = tod, y = probe_count)) +
  geom_line()
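
The conversion from vehicle-hours to a vehicle count can be checked on a small in-memory table, without a database connection. The sketch below is only illustrative: the column names mirror those used in the script above, the units (metres for `length`, km/h for `mean`) are an assumption inferred from the `/ 1000` conversion, and every value in `toy` is invented.

```r
library(dplyr)

# Hypothetical link observations for two 5-minute bins (all values invented).
# Assumed units: length in metres, mean speed in km/h.
toy <- tibble(
  tod         = c("08:00", "08:00", "08:05"),
  length      = c(500, 1000, 250),
  mean        = c(30, 60, 15),
  sample_size = c(2, 1, 4)
)

probe_counts <- toy %>%
  # vehicle-hours spent on each link: vehicles x (km / (km/h))
  mutate(hours = sample_size * ((length / 1000) / mean)) %>%
  group_by(tod) %>%
  # 12 five-minute bins per hour: vehicle-hours per bin -> average vehicles present
  summarize(probe_count = 12 * sum(hours))

print(probe_counts)
```

In the 08:00 bin the two links accumulate 0.05 vehicle-hours within five minutes, which corresponds to 0.6 vehicles present on average; the 08:05 bin works out to 0.8.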