Update summary stats on Here probe fleet size (#139)
* estimate size of the HERE fleet

* add comment

* obscure DB host

* update docs

* add link

* comment/document
Nate-Wessel authored Jul 24, 2024
1 parent 65c6a59 commit 2a82edd
Showing 2 changed files with 26 additions and 1 deletion.
2 changes: 1 addition & 1 deletion README.md
@@ -45,7 +45,7 @@ The app can return results in either CSV or JSON format. The fields in either ca

Data for travel time estimation through the app are sourced from [HERE](https://github.com/CityofToronto/bdit_data-sources/tree/master/here)'s [traffic API](https://developer.here.com/documentation/traffic-api/api-reference.html) and are available back to about 2017. HERE collects data from motor vehicles that report their speed and position to HERE, most likely as a by-product of the driver making use of an in-car navigation system connected to the Internet.

- The number of vehicles within the City of Toronto reporting their position to HERE in this way has been estimated to be around 500 vehicles during the AM and PM peak periods, with lower numbers in the off hours. While this may seem like a lot, in practice many of these vehicles are on the highways and the coverage of any particular city street within a several hour time window can be very minimal if not nil. For this reason, we are currently restricting travel time estimates to "arterial" streets and highways.
+ The number of vehicles within the City of Toronto reporting their position to HERE in this way has been [estimated](./analysis/total-fleet-size.r) to be around 2,000 to 3,000 vehicles during the AM and PM peak periods, with lower numbers in the off hours. While this may seem like a lot, in practice many of these vehicles are on the highways and the coverage of any particular city street within a several hour time window can be very minimal if not nil. For this reason, we are currently restricting travel time estimates to "arterial" streets and highways.

Travel times are provided to us in the form of _average speeds_ along links of the street network in 5-minute time bins. Given the sparseness of the vehicle probe data, most links in most time bins are empty. The second most common sample size is a single vehicle observation.

25 changes: 25 additions & 0 deletions analysis/total-fleet-size.r
@@ -0,0 +1,25 @@

library('tidyverse')
library('dbplyr')

con <- DBI::dbConnect(
  RPostgres::Postgres(),
  host = 'insert DB host here',
  user = 'nwessel',
  dbname = 'bigdata',
  password = rstudioapi::askForPassword("Database password")
)

# The idea here is to convert the average speeds and lengths of links into travel *time*.
# The total vehicle-hours of travel observed per hour of clock time is equal to
# the average number of vehicles reporting their locations to HERE at any moment.
tbl( con, in_schema('here','ta_path') ) %>%
  filter( dt == '2024-07-18' ) %>%
  mutate( hours = sample_size * ((length / 1000) / mean) ) %>%
  group_by( tod ) %>%
  summarize(
    # x 12 because otherwise we'd have the number of hours in a five-minute bin
    probe_count = 12 * sum(hours)
  ) %>%
  ggplot( aes(x = tod, y = probe_count) ) +
  geom_line()
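
The arithmetic in `analysis/total-fleet-size.r` can be checked by hand on a tiny example. The sketch below is not part of the commit: the `links` data frame is a made-up stand-in for the rows of `here.ta_path` in a single five-minute bin, and the units (`length` in metres, `mean` in km/h) are assumptions inferred from the `/ 1000` in the query rather than confirmed by the source.

```r
# Hypothetical rows for one five-minute bin; values are invented for illustration.
links <- data.frame(
  length      = c(500, 1000, 250),  # link length, assumed to be in metres
  mean        = c(30, 60, 15),      # average speed, assumed to be in km/h
  sample_size = c(2, 1, 4)          # probe observations on the link in this bin
)

# Vehicle-hours spent on each link: observations x (km / (km/h))
links$hours <- links$sample_size * ((links$length / 1000) / links$mean)

# Total vehicle-hours observed within the five-minute bin
bin_hours <- sum(links$hours)

# A five-minute bin is 1/12 of an hour, so multiplying by 12 converts
# vehicle-hours per bin into the average number of vehicles present
# at any moment during that bin.
probe_count <- 12 * bin_hours
print(probe_count)
```

Under these assumptions, each toy link contributes a multiple of one vehicle-minute, so the result is easy to verify mentally. The same reasoning is what the `12 * sum(hours)` step applies per `tod` bin.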
