- 2.
miovision_api
Table Structure - PostgreSQL Functions
- 3. Finding Gaps and Malfunctioning Camera
This folder contains sql scripts used in both the API and the old data dump process. The csv_data/
sub-folder contains sql
files unique to processing the data from csv dumps.
Reference table for each unique intersection at which data has been collected, you can also see them on this map.:
Field Name | Data Type | Description | Example |
---|---|---|---|
intersection_uid | integer | Unique identifier for table | 10 |
id | text | Unique id from Miovision API | 990cd89a-430a-409a-b0e7-d37338394148 |
intersection_name | text | Intersection short name in format of [E / W street] / [N / S street] | King / Bathurst |
date_installed | date | Installation date of the camera (date of the first available timestamp) | 2017-10-03 |
date_decommissioned | date | Decommissioned date of the camera (date of the last available timestamp) | NULL |
lat | numeric | Latitude of intersection location | 43.643945 |
lng | numeric | Longitude of intersection location | -79.402667 |
street_main | text | Name of primary street | King |
street_cross | text | Name of secondary street | Bathurst |
int_id | bigint | int_id linked to centrelines | 13467722 |
px | integer | px linked to traffic lights | 201 |
geom | geometry | Point geometry of that intersection | 0101000020E61000006B0BCF4BC5D953C01CB62DCA6CD24540 |
n_leg_restricted | boolean | Whether that leg is restricted to vehicles | NULL |
e_leg_restricted | boolean | Whether that leg is restricted to vehicles | NULL |
s_leg_restricted | boolean | Whether that leg is restricted to vehicles | NULL |
w_leg_restricted | boolean | Whether that leg is restricted to vehicles | NULL |
api_name | text | Intersection name from API | Bathurst Street and King Street West |
Reference table for all classifications:
Field Name | Data Type | Description | Example |
---|---|---|---|
classification_uid | serial | Unique identifier for table | 2 |
classification | text | Textual description of mode | Bicycles |
location_only | boolean | If TRUE, represents movement on crosswalk (as opposed to road) | FALSE |
class_type | text | General class category (Vehicles, Pedestrians, or Cyclists) | Cyclists |
Here is a description of the classification_uids and corresponding types.
Note that bicycles are available at both a turning movement level and at an approach level. Approach level bicycle counts (classification_uid = 10
) should be used for the large majority of applications as the data is considered more accurate.
classification_uid | classification | location_only | class_type | definition / notes |
---|---|---|---|---|
1 | Light | false | "Vehicles" | Cars and other passenger vehicles (like vans, SUVs or pick-up trucks) |
2 | Bicycle | false | "Cyclists" | do not use - poor data quality. Tracks bicycle turning movements |
3 | Bus | false | A large vehicle that provides transportation for many humans. Since 2019-08-22 this includes streetcars, however initially they were classified under 9, MotorizedVehicle. Note: This classification is excluded from "Vehicles" class_type because transit vehicle volumes are irrelevant to most of our analyses. |
|
4 | SingleUnitTruck | false | "Vehicles" | A truck that has a non-detachable cab and trailer system |
5 | ArticulatedTruck | false | "Vehicles" | A truck that has a detachable cab and trailer system |
6 | Pedestrian | true | "Pedestrians" | A walker. May or may not include zombies... |
7 | Bicycle | true | "Cyclists" | Bicycle in crosswalk. Same movement_uids as 6, Pedestrian. Unclear if it is necessarily being walked or ridden. This movement is exlcuded from aggregate tables. |
8 | WorkVan | A van used for commercial purposes Workvan classification was folded in to "Light" vehicles in the API. | ||
9 | MotorizedVehicle | false | "Vehicles" | Miscellaneous vehicles. Prior to 2019-08-22 this included streetcars. |
10 | Bicycle | false | "Cyclists" | The preferred bike classification. Tracks bicycle entrances and exits passing through manually drawn zones on the SmartSense overlay. Exits are exlcuded from aggregate tables. This classification is only available from 2021-07-11 on. Bicycle data is not great - stay tuned. |
Reference table for road user movements:
Field Name | Data Type | Description | Example |
---|---|---|---|
movement_uid | integer | Identifier representing current turning movement | 1 |
movement_name | text | Short description of movement | thru |
crosswalk_movement | boolean | Whether the movement describes pedestrians on crosswalks | false |
movement_pretty_name | text | Long description of movement | Through |
Here is a description of the movement_uids and corresponding types:
movement_uid | movement_pretty_name | definition |
---|---|---|
1 | Through | Vehicle drove through the intersection (no turns) |
2 | Left | Vehicle turned left |
3 | Right | Vehicle turned right |
4 | U-Turn | Vehicle went back from whence it came - usually excluded from counts |
5 | Clockwise | Pedestrian (or bike in crosswalk) proceeded clockwise around the intersection (a pedestrian on the north leg going clockwise would be heading eastbound) |
6 | Counter Clockwise | Pedestrian (or bike in crosswalk) proceeded counter clockwise around the intersection (a pedestrian on the north leg going counter clockwise would be heading westbound) |
7 | Bicycle Entrance | Used to determine where bicycles entered the intersection. Although typically only used for bicycles, special detection zones set up to monitor TTC right-of-way intrusion also used movement_uid = 7. |
8 | Bicycle Exit | Used to determine where bicycles exited the intersection |
Data table storing all 1-minute observations in its transformed form. Records represent total 1-minute volumes for each [intersection]-[classification]-[leg]-[turning movement] combination.
Partitioned by year and month using declarative partitioning.
Field Name | Data Type | Description | Example |
---|---|---|---|
volume_uid | serial | Unique identifier for table | 5100431 |
intersection_uid | integer | Identifier linking to specific intersection stored in intersections |
31 |
datetime_bin | timestamp without time zone | Start of 1-minute time bin in EDT | 2017-10-13 09:07:00 |
classification_uid | text | Identifier linking to specific mode class stored in classifications |
1 |
leg | text | Entry leg of movement | E |
movement_uid | integer | Identifier linking to specific turning movement stored in movements |
2 |
volume | integer | Total 1-minute volume | 12 |
volume_15min_mvt_uid | serial | Unenforced foreign key to volumes_15min_mvt The foreign key constraint was removed to support partitioning. |
14524 |
- A Unique constraint exists on
miovision_api.volumes
based onintersection_uid
,datetime_bin
,classification_uid
,leg
, andmovement_uid
.
Data are aggregated from 1-minute volume data into two types of 15-minute volume products: Turning Movement Count (TMC) (in volumes_15min_mvt
) and Automatic Traffic Recorder (ATR) (in volumes_15min
) equivalents. Have a look at Understanding Legs, Movement and Direction of Travel in getting_started.md
for a visual explanation of the differences between the two tables. The diagram below briefly describes the flow of data between key aggregate tables.
erDiagram
volumes }|--|| volumes_15min_mvt : "aggregate into 15min TMC bins"
volumes
volumes_15min_mvt ||--|{ volumes_15min : "transform into ATR style"
volumes_15min_mvt {
integer intersection_uid
integer classification_uid
date datetime_bin
integer movement_uid
text leg
integer volume
}
volumes }|--|| volumes_daily_unfiltered : "aggregate to daily"
volumes }|--|| unacceptable_gaps : "identify runs of zeros"
unacceptable_gaps }|--|| volumes_15min_mvt : "exclude 15min bins"
unacceptable_gaps {
integer intersection_uid
datetime datetime_bin
}
volumes_15min_mvt ||--|{ anomalous_ranges : "identify data anomalies (manual and automated)"
anomalous_ranges ||--|{ volumes_daily : "exclude"
anomalous_ranges ||--|{ volumes_15min_filtered : "exclude"
anomalous_ranges ||--|{ volumes_15min_mvt_filtered : "exclude"
anomalous_ranges {
integer intersection_uid
integer classification_uid
datetime range_start
datetime range_end
}
volumes_daily_unfiltered ||--|{ volumes_daily : "filtered"
volumes_15min ||--|{ volumes_15min_filtered : "filtered"
volumes_15min_mvt ||--|{ volumes_15min_mvt_filtered : "filtered"
volumes_daily_unfiltered {
integer intersection_uid
integer classification_uid
date dt
integer volume
}
volumes_daily {
integer intersection_uid
integer classification_uid
date dt
integer volume
}
volumes_15min {
integer intersection_uid
integer classification_uid
date datetime_bin
text leg
integer volume
}
volumes_15min_filtered {
integer intersection_uid
integer classification_uid
date datetime_bin
text leg
integer volume
}
volumes_15min_mvt_filtered {
integer intersection_uid
integer classification_uid
date datetime_bin
integer movement_uid
text leg
integer volume
}
(Use view volumes_15min_mvt_filtered
to exclude anomalous_ranges)
volumes_15min_mvt
contains data aggregated into 15 minute bins. In order to make averaging hourly volumes simpler, the volume can be NULL
(for all modes) or 0
for classifications 1, 2, 6, 10 (which corresponds to light vehicles, bicycles (classifications 2 and 10) and pedestrians).
The 1-min data do not identify if a camera is malfunctioning, so gaps in data could either mean there was no volume, or that the camera malfunctioned. Because we have continuous data from these counters, we no longer try to interpolate
data during gaps. When our heuristics identify unacceptable_gaps
, then the entire hour of data is thrown out and the volume is set to NULL
to imply that the data has been processed for this hour, but the results have been discarded.
A 0
value implies the process identifies the camera was working, but there was no volume for that mode. Only volumes for pedestrians, cyclists and light vehicles (classification_uid IN (1,2,6,10)
) are filled in because those are the modes we report on more frequently. Other modes are not filled because they have much lower volumes, so the 0s would expand the size of the dataset considerably.
The aggregate_15_min_mvt()
function performs zero-filling by cross-joining a table containing all possible movements described in (intersection_movements
). The only type of movement tracked in the 1-minute volume data, but not the aggregated data, is bicycle exits (classification_uid = 10 and movement_uid = 8
). The vendor recommended that bicycle exits not be used due to data quality concerns.
Field Name | Data Type | Description | Example |
---|---|---|---|
volume_15min_mvt_uid | integer | Unique identifier for table from sequence miovision_api.volumes_15min_mvt_volume_15min_mvt_uid_seq . |
14524 |
intersection_uid | integer | Identifier linking to specific intersection stored in intersections |
31 |
datetime_bin | timestamp without time zone | Start of 15-minute time bin in EDT | 2017-12-11 14:15:00 |
classification_uid | text | Identifier linking to specific mode class stored in classifications |
1 |
leg | text | Entry leg of movement | E |
movement_uid | integer | Identifier linking to specific turning movement stored in movements |
2 |
volume | integer | Total 15-minute volume | 78 |
processed | boolean | Flag if data has been aggregated to miovision_15min |
TRUE |
Please note that movements for vehicles (including bicycles) are different than those for pedestrians.
Please see this diagram for a visualization of turning movements for vehicles (including bicycles) and this diagram for a visualization of pedestrian movements.
- A Unique constraint was added to
miovision_api.volumes_15min_mvt
table based onintersection_uid
,datetime_bin
,classification_uid
,leg
andmovement_uid
.
(Use view volumes_15min_filtered
to exclude anomalous_ranges)
Data table storing ATR versions of the 15-minute turning movement data. Data in
volumes
is stored in TMC format, so must be converted to ATR to be included in
volumes_15min
.
ATR movements define leg as the approach direction of vehicles (like TMCs), and direction as the cardinal direction of traffic travelling through that side of the intersection. For a typical '+' intersection, there will be 8 possible ATR since there are 4 legs and 2 directions of travel for each ATR leg.
If you are having trouble picturing it, check out this diagram.
Field Name | Data Type | Description | Example |
---|---|---|---|
volume_15min_uid | integer | Unique identifier for table from sequence miovision_api.volumes_15min_volume_15min_uid_seq . |
12412 |
intersection_uid | integer | Identifier linking to specific intersection stored in intersections |
31 |
datetime_bin | timestamp without time zone | Start of 15-minute time bin in EDT | 2017-12-11 14:15:00 |
classification_uid | text | Identifier linking to specific mode class stored in classifications |
1 |
leg | text | Segment leg of intersection | E |
dir | text | Direction of traffic on specific leg | EB |
volume | integer | Total 15-minute volume | 107 |
miovision_api.movement_map
is used to convert the TMC data to the ATR data.
A Unique constraint was added to the miovision_api.volumes_15min
table based on intersection_uid
, datetime_bin
, classification_uid
, leg
and dir
.
Daily volumes by intersection_uid, classification_uid. Excludes anomalous_ranges
(use discouraged based on investigations) but does not exclude time around unacceptable_gaps
(zero volume periods).
Note the table volumes_daily_unfiltered
can be used (with caution) to include data labelled as anomalous.
Field Name | Comments | Data Type | Exmple |
---|---|---|---|
dt | date | 2023-09-05 | |
intersection_uid | integer | 1 | |
classification_uid | integer | 1 | |
daily_volume | integer | 13830 | |
isodow | Use WHERE isodow <= 5 AND holiday is False for non-holiday weekdays. |
smallint | 2 |
holiday | boolean | False | |
datetime_bins_missing | Minutes with zero vehicle volumes out of a total of possible 1440 minutes. | smallint | 69 |
unacceptable_gap_minutes | Periods of consecutive zero volumes deemed unacceptable based on avg intersection volume in that hour. | smallint | 0 |
avg_historical_gap_vol | Avg historical volume for that classification and gap duration based on averages from a 60 day lookback in that hour. | integer |
Data table storing all the hours containing gaps larger than 5-20 minutes (this minimum threshold is set based on gapsize_lookup
) for each intersection, including those with no data. Generated daily using the find_gaps(start_date)
function.
More information can be found at #3. Finding gaps and malfunctioning camera. This table is used in the aggregate_15_min_mvt
function to set any data during unacceptable hours to null
to prevent it from being included in AVG calculations.
Field Name | Comments | Data Type | Example |
---|---|---|---|
dt | The date for which the function miovision_api.find_gaps was run to insert this row. |
date | 2023-09-05 |
intersection_uid | integer | 4 | |
gap_start | The timestamp when the unacceptable gap starts. | timestamp without time zone | 2023-09-05 07:57:00 |
gap_end | The timestamp when the unacceptable gap ends. | timestamp without time zone | 2023-09-05 10:18:00 |
gap_minutes_total | Duration of gap in minutes (gap_end - gap_start) | integer | 141 |
allowable_total_gap_threshold | The minimum duration of zero volume to be considered an unacceptable gap for this intersection, hour, and day type (weekend/weekday) based on a 60 day lookback and calculated in miovision_api.gapsize_lookup_insert. | integer | 5 |
datetime_bin | A 15 datetime_bin which falls within the gap, to be used for joining to volumes_15min* tables. | timestamp without time zone | 2023-09-05 07:45:00 |
gap_minutes_15min | The portion of the total gap which falls within the 15 minute bin starting with datetime_bin. | integer | 3 |
Data table storing a 60 day lookback average hourly volume for each intersection_uid
, classification_uid
.
Used to determine the maximum acceptable gap for use in unacceptable_gaps
table and to determine the average historical volume associated with those gaps by mode.
column_name | Comments | data_type | sample |
---|---|---|---|
dt | The date for which miovision_api.gapsize_lookup_insert was run to generate this row. The lookback period used to is the 60 days preceeding this date, matching the same day type (weekday/weekend) |
date | 2023-11-01 |
intersection_uid | nan | integer | 20 |
classification_uid | A null classification_uid refers to the total volume for that intersection which is used to determine the gap_tolerance. | integer | 5 |
hour_bin | Hour of the day from 0-23. | smallint | 18 |
weekend | True if Saturday/Sunday or holiday (based on ref.holiday table). | boolean | False |
avg_hour_vol | The average volume for this hour/intersection/weekend combination based on a 60 day lookback. | numeric | 1.6666666666666667 |
gap_tolerance | The minimum gap duration to be considered an unacceptable_gap. Only valid for the overall intersection volume (classification_uid IS NULL). | smallint |
The breaks table contains Moments in time when data collection methods changed in such a way that we would expect clear pre- and post-change paradigms that may not be intercomparable
.
column_name | Comments | data_type | sample |
---|---|---|---|
uid | simple incrementing primary key | serial | 1 |
intersection_uid | Reference to intersections table. A null intersection_uid refers to all intersections. | integer | |
classification_uid | Reference to classifications table. A null classification_uid refers to all classifications. | integer | |
break_time | Moment the change takes place | timestamp | |
give_or_take | approximate bounds if the precise time is not known | interval | |
notes | required description of what changed - be verbose! | text |
The anomalous_ranges
table is used to log issues related to specific intersection / classification combos as identified either by a human or an automated process. See below for more information and examples of how to use this table.
Column | Description |
---|---|
uid | simple incrementing primary key |
intersection_uid | the intersection; NULL if applies to all intersections, e.g. an algorithm change |
classification_uid | the classification; NULL if applies to all classifications e.g. a badly misaligned camera |
range_start | the beginning of the anomalous range in question; may be open-ended (NULL). Inclusive. The precision here is to the second, so if you're unsure about alignment with time bins, it may be best to be conservative with this and extend the range_start slightly before the problem area. |
range_end | the end of the anomalous range in question; may be open-ended (NULL). Exclusive. The precision here is to the second, so if you're unsure about alignment with time bins, it may be best to be conservative with this and extend the range_end slightly past the problem area. |
notes | as detailed a description of the issue as reasonably possible; if there are unknowns or investigations in progress, describe them here also |
investigation_level | references miovision_api.anomaly_investigation_levels ; indicates the degree to which the issue has been investigated. Is it just a suspicion? Has it been authoritatively confirmed? Etc. |
problem_level | references miovision_api.anomaly_problem_levels ; indicates the degree or nature of the problem. e.g. valid with a caveat vs do-not-use under any circumstance |
leg | Intersection leg which is affected by range. Null refers to all legs. |
open_issues
is a complementary table to anomalous_ranges
. It is similar to a view of anomalous ranges which are currently open, except it is a table to allow interactive editing in PgAdmin. Each day via Airflow DAG, issues which are no longer active are removed, but the "logged" and "reviewer_notes" columns remain and other columns may be updated.
column_name | data_type | sample |
---|---|---|
uid | smallint | 1797 |
intersection_uid | smallint | 1 |
intersection_id | text | 8184ba7d-a2e3-4a1c-b70f-31da15e7462a |
intersection_name | text | Bathurst Street and Adelaide Street West |
classification_uid | smallint | 2 |
classification | text | Bicycle TMC |
leg | text | W |
range_start | date | 2024-11-20 |
num_days | integer | 8 |
notes | text | Zero counts, identified by a daily airflow process running function miovision_api.identify_zero_counts |
volume | bigint | |
alerts | text | |
logged | boolean | |
reviewer_notes | text |
These two tables are used to enforce standardized descriptions in the investigation_level
and problem_level
columns of anomalous_ranges
.
Column | Description |
---|---|
uid | very short descriptive text; primary key |
description | full description / documentation of the category; refer directly to these tables for documentation of the available classifications. |
Reference table for transforming aggregated turning movement counts (see volumes_15min_mvt
) into segment-level volumes (see volumes_15min
):
Field Name | Data Type | Description | Example |
---|---|---|---|
leg_new | text | Intersection leg on which 15-minute volume will be assigned | E |
dir | text | Direction on which 15-minute volume will be assigned | EB |
leg_old | text | Intersection leg on which 15-minute turning movement volume is currently assigned | W |
movement_uid | integer | Identifier representing current turning movement - see movements |
1 |
Here are some example rows from the table:
leg_new | dir | leg_old | movement_uid | description of movement |
---|---|---|---|---|
E | EB | E | 4 | Approached intersection from the east leg, u-turned, exited intersection from the east leg |
E | EB | S | 3 | Approached intersection from the south leg, turned right, exited intersection from the east leg |
E | EB | W | 1 | Approached intersection from the west leg, proceeded straight ahead, exited intersection from the east leg |
E | EB | N | 2 | Approached intersection from the north leg, turned left, exited intersection from the east leg |
leg_new
(leg for ATR) - anything that crosses that side of the intersectiondir
- heading of traffic crossingleg_new
leg_old
(leg for TMC) - direction the vehicles approach into intersectionmovement_uid
- turning movement stored inmovements
The example above represents a mapping from TMC to ATR E
leg and EB
direction. The blue and green arrows in this diagram will help you visualize the movements described in the table.
Reference table for all unique time periods. Used primarily to aggregate 15-minute data for reporting purposes:
Field Name | Data Type | Description | Example |
---|---|---|---|
period_id | integer | Unique identifier for table | 3 |
day_type | text | Day type for date filter | [Weekday OR Weekend] |
period_name | text | Textual description of period | 14 Hour |
period_range | timerange | Specific start and end times of period | [06:00:00,20:00:00) |
report_flag | boolean | Indicates whether the period is used in a report | true |
This was created using create-table-intersection_movements.sql
and is a reference table of all observed movements for each classification at each intersection. This is used in aggregating to the 15-minute TMC's in order to fill in 0s in the volumes. Subsequently, movements present in the volumes data which were erroneous were deleted from the table. This table will include movements which are illegal, such as left turns at intersections with turn restrictions but not movements like a turn onto the wrong way of a one-way street. It will need to be manually updated when a new location is added.
Since this reference table must be updated every time a new intersection is added, there are several iterations of it. The earliest is miovision_api.intersection_movements_20200805
; the latest is intersection_movements_20210712
. Users should use intersection_movements
and may find their permissions restricted on the dated versions of this table.
Field Name | Data Type | Description | Example |
---|---|---|---|
intersection_uid | integer | ID for intersection | 1 |
classification_uid | integer | Identifier linking to specific mode class stored in classifications |
1 |
leg | text | Entry leg of movement | E |
movement_uid | integer | Identifier linking to specific turning movement stored in movements |
2 |
This table maps all miovision intersection legs to centreline street segments. It needs to be updated manually using this script when intersections are added.
Field Name | Data Type | Description | Example |
---|---|---|---|
centreline_id | numeric | Corresponds to centreline_id in gis_core.centreline |
14016757 |
intersection_uid | integer | ID for intersection | 1 |
leg | text | A segment that forms part of a miovision intersection, identified by its location relative to the centre of the intersection | W |
Known issues:
- 91: Lakeshore and Spadina - Two centrelines matched to West leg are actually legit. One is for the Gardiner off-ramp (Left turns only) and one is for Lakeshore (Thru + Right). They could be differentiated by movement.
- 78: Bloor and Kingsway is a 4 legged intersection, but the south leg is not in the centreline (private road / cemetery entrance).
- 68: Steeles and Jane, N leg is outside of TO.
This table contains alerts for Miovision intersections pulled daily from the API by the pull_alerts
task in the miovision_pull
DAG. Both ongoing and closed issues that intersect with the current day are pulled. The alerts can also be found in Miovision Alert emails or in the in the Miovision One UI.
Field Name | Data Type | Description | Example |
---|---|---|---|
alert_id | text | A unique id for the alert, from the API. | 75dc5b77-faa4-487e-a3de-a6b11358fdf5 |
intersection_id | text | The intersection id, corresponding to intersections.intersection_id column | c04704a0-e1e2-4101-9c29-6823d0f41c52 |
intersection_uid | integer | The intersection uid, a foreign key referencing intersections.intersection_uid column | 6 |
alert | text | Short text description of the alert. More detail and tips for resolution are available in this Miovision help article | PERIPHERAL_UNAVAILABLE |
start_time | timestamp | Start of the alert. | 2024-01-12 10:20:00 |
end_time | timestamp | Start of the alert. Note when the end_time is null it means the alert is ongoing. | 2024-01-21 15:35:00 |
Below is an example of how to anti-join the alerts table:
SELECT ...
FROM miovision_api.volumes AS v
--anti join alerts
LEFT JOIN miovision_api.alerts AS a
ON a.intersection_uid = v.intersection_uid
AND v.datetime_bin >= a.start_time
AND v.datetime_bin < a.end_time
WHERE a.intersection_uid IS NULL
This table contains details of Miovision cameras, which we are sometimes required to provide to maintenance. It is updated daily by miovision_hardware
Airflow DAG. Join to miovision_api.active_intersections AS ai ON ai.id = camera_details.intersection_id
.
column_name | data_type | sample |
---|---|---|
intersection_id | text | 253a327c-4e4b-4e4e-b3a9-c2c3e7753825 |
camera_id | text | Miovision SmartView 360 NWC |
camera_label | text | 1ebf4ec0-88fd-49ec-8cf4-3e0ae0af0128 |
last_seen | date | 2024-12-05 |
This table stores the last updated date of Miovision detection configurations. This may be useful at some point in the future to determine for which dates calibration studies are relevant. It was only populated starting 2024-12-05, so the MIN updated_time
was the most recent update at that point. Further configuration details can be seen in Miovision One.
column_name | data_type | sample |
---|---|---|
intersection_uid | integer | 97 |
updated_time | timestamp without time zone | 2024-02-23 03:09:36.684000 |
To create explicit relationships between tables, volumes
, volume_15min_mvt
, atr_mvt_uid
and volume_15min
have primary and foreign keys. Primary keys are unique identifiers for each entry in the table, while foreign keys refer to a primary key in another table and show how an entry is related to that entry.
volumes
has the primary keyvolume_uid
and foreign keyvolume_15min_mvt_uid
which refers tovolume_15min_mvt
volumes_15min_mvt
has the primary keyvolume_15min_mvt_uid
volume_15min
has the primary keyvolume_15min_uid
The current primary purpose for the keys is so that on deletion, the delete cascades through all tables. The keys also indicate whether it is new data if the foreign key is null, and tells the function to aggregate the data if it is new data. The keys can also be used in selecting data.
The tables below are produced using functions explained here. They produce a lookup table of date-intersection combinations to be used for checking purposes or even for formal reporting. They are not used regularly.
Table | Purpose |
---|---|
api_log |
Contains a record of the start_date and end_date for an intersection_uid and when the data was pulled as date_added |
missing_dates |
Contains a record of the intersection_uid and the dt that were missing in the volumes_15min table, with period_type stated |
report_dates |
Contains a record for each intersection-date combination in which at least forty 15-minute time bins exist between 6AM and 8PM |
This is a crossover table to link volumes_15min_mvt
to the volumes_15min
table. As described above, the TMC to ATR relationship is a many to many relationship. The aggregate_15_min()
function that populates volumes_15min
also populates this table so that a record of which volume_15min_mvt
bin corresponds to which volume_15min
bin is kept, and vice versa. As a result, multiple entries of both volume_15min_uid
and volume_15min_mvt_uid
can be found in the query.
Field Name | Data Type | Description | Example |
---|---|---|---|
volume_15min_mvt_uid | int | Unique identifier for volumes_15min_mvt table |
14524 |
volume_15min_uid | serial | Unique identifier for table | 12412 |
This section describes the SQL functions in the miovision_api
schema used to aggregate data and many helper functions including those used to clear data for re-processing and to create partition tables.
Function | Comment |
---|---|
aggregate_15_min(start_date date, end_date date, intersections integer[]) |
Aggregates data from miovision_api.volumes_15min_mvt (turning movements counts/TMC) into miovision_api.volumes_15min (automatic traffic recorder /ATR). Also updates miovision_api.volumes_mvt_atr_xover and miovision_api.volumes_15min_mvt.processed column. Takes an optional intersection array parameter to aggregate only specific intersections. Use clear_volumes_15min() to remove existing values before summarizing. |
aggregate_15_min_mvt(start_date date, end_date date) |
Aggregates valid movements from miovision_api.volumes in to miovision_api.volumes_15min_mvt as 15 minute turning movement counts (TMC) bins and fills in gaps with 0-volume bins. Also updates foreign key in miovision_api.volumes . Takes an optional intersection array parameter to aggregate only specific intersections. Use clear_15_min_mvt() to remove existing values before summarizing. |
aggregate_volumes_daily(start_date date, end_date date) |
Aggregates data from miovision_api.volumes_15min_mvt into miovision_api.volumes_daily . Includes a delete clause to clear the table for those dates before any inserts. |
api_log(start_date date, end_date date, intersections integer[]) |
Logs inserts from the api to miovision_api.volumes via the miovision_api.api_log table. Takes an optional intersection array parameter to aggregate only specific intersections. Use clear_api_log() to remove existing values before summarizing. |
get_report_dates(start_date timestamp, end_date timestamp, intersections integer[]) |
Logs the intersections/classes/dates added to miovision_api.volumes_15min to miovision_api.report_dates . Takes an optional intersection array parameter to aggregate only specific intersections. Use clear_report_dates() to remove existing values before summarizing. |
find_gaps(start_date date, end_date date) |
Find unacceptable gaps and insert into table miovision_api.unacceptable_gaps . |
identify_zero_counts(start_date date) |
Identifies intersection / classification (only classification_uids 1,2,6,10) combos with zero volumes for the start_date called. Inserts or updates anomaly into anomalous_range table unless an existing, overlapping, manual entery exists. |
Function | Comment |
---|---|
clear_15_min_mvt(start_date timestamp, end_date timestamp, intersections integer[]) |
Clears data from miovision_api.volumes_15min_mvt in order to facilitate re-pulling. intersections param defaults to all intersections. |
clear_api_log(_start_date date, _end_date date, intersections integer[]) |
Clears data from miovision_api.api_log in order to facilitate re-pulling. intersections param defaults to all intersections. |
clear_report_dates(_start_date date, _end_date date, intersections integer[]) |
Clears data from miovision_api.report_dates in order to facilitate re-pulling. intersections param defaults to all intersections. |
clear_volumes(start_date timestamp, end_date timestamp, intersections integer[]) |
Clears data from miovision_api.volumes in order to facilitate re-pulling. intersections param defaults to all intersections. |
clear_volumes_15min(start_date timestamp, end_date timestamp, intersections integer[]) |
Clears data from miovision_api.volumes_15min in order to facilitate re-pulling. intersections param defaults to all intersections. |
Function | Comment |
---|---|
find_invalid_movements(start_date timestamp, end_date timestamp) |
Used exclusively within intersection_tmc.py insert_data function to raise notice in the logs about invalid movements. |
get_intersections_uids(intersections integer[]) |
Returns all intersection_uids if optional intersections param is omitted, otherwise returns only the intersection_uids provided as an integer array to intersections param. Used in miovision_api.clear_* functions. Example usage: SELECT miovision_api.get_intersections_uids() --returns all intersection_uids or SELECT miovision_api.get_intersections_uids(ARRAY[1,2,3]::integer[]) --returns only {1,2,3} |
Function | Comment |
---|---|
create_mm_nested_volumes_partitions(base_table text, year_ integer, mm_ integer) |
Create a new month partition under the parent year table base_table . Only to be used for miovision_api volumes_15min and volumes_15min_mvt tables. Example: SELECT miovision_api.create_yyyy_volumes_partition('volumes_15min', 2023) |
create_yyyy_volumes_15min_partition(base_table text, year_ integer) |
Create a new year partition under the parent table base_table . Only to be used for miovision_api volumes_15min and volumes_15min_mvt tables. Example: SELECT miovision_api.create_yyyy_volumes_partition('volumes_15min', 2023) |
create_yyyy_volumes_partition(base_table text, year_ integer, datetime_col text) |
Create a new year partition under the parent table base_table . Only to be used for miovision_api volumes table. Use parameter datetime_col to specify the partitioning timestamp column, ie. datetime_bin . Example: SELECT miovision_api.create_yyyy_volumes_partition('volumes', 2023, 'datetime_bin') |
Function | Comment |
---|---|
determine_working_machine(start_date date, end_date date) |
Function no longer in use. Previously used in check_miovision DAG to determine if any cameras had gaps larger than 4 hours. See: miovision_check DAG check_gaps task for new implementation. |
missing_dates(_date date) |
Function no longer in use. Previously used to log dates with missing data to miovision_api.missing_dates . |
In order to better determine if a camera is still working, we have decided to use the gaps and islands method to figure where the gaps are (gaps as in the unfilled space or interval between the 1min bins; a break in continuity) and their sizes. There are two parts of this in the whole process.
The following process is used to determine camera wide data outages and then find out if the gaps are within the acceptable range or not. The timebins exceeding the allowed gap_size will then be inserted an unacceptable_gaps
table. Finding gaps is important so that we know how reliable the data is for that time period based on past volume and not include those bins found within the unacceptable gaps range.
- Within
find_gaps
function, thegapsize_lookup_insert
function is run to populate the tablegapsize_lookup
with acceptable gap sizes based on avg hourly volumes from a 60 day rolling lookback.
- The set of acceptable
gap_tolerance
implemented is based on an investigation stated in this notebook.
- Then, the function
miovision_api.find_gaps
identifies all gaps of data in the tablemiovision_api.volumes
and check if they are within the acceptable range of gap sizes or not based on the information from thegapsize_lookup
table above. Sensors with no data are not identified - these are instead included inanomalous_ranges
viaidentify_zero_counts
function. - Unacceptable gaps are cross joined to 15 minute bins and inserted into the table
miovision_api.unacceptable_gaps
. - Based on the
unacceptable_gaps
table,aggregate_15min_mvt
function will not aggregate 1min bins found within within 15 minute periods matching unacceptable_gaps'datetime_bin
.
The following process is to determine if a Miovision camera is still working. It is different from the process above because the gap sizes used above are small and do not say much about whether a camera is still
working. We roughly define a camera to be malfunctioning if that camera/intersection has a gap greater than 4 hours. This is checked daily by miovision_check
DAG, task check_gaps
. A slack notification will be sent if there's at least 1 camera that is not working. The function also returns a list of intersections that are not working and from what time to what time that the gaps happen which is helpful in figuring out what has happened.
The table miovision_api.anomalous_ranges
is used to flag times and places in the data where counts are potentially questionable, suspicious, or irregular. Sometimes counts are clearly not valid or data are missing, and sometimes counts look very weird but have been investigated and confirmed as valid or subject to some specific caveat. This table helps us track the status of these investigations and their results.
The basic idea is to identify sections of data (by timerange, intersection, and classification) that have been flagged as suspicious/anomalous for whatever reason, and state what we know about that data-subset in a semi-structured way. This lets us further investigate where that is required and carve out data that has been formally cast into doubt from queries that need only-the-best.
When looking for only "typical" data, you can use filtered views volumes_daily
, volumes_15min_filtered
, volumes_15min_mvt_filtered
which filter out anomalous ranges with an anti-join like so:
--anti join anomalous_ranges
LEFT JOIN miovision_api.anomalous_ranges AS ar
ON ar.problem_level = ANY(ARRAY['do-not-use', 'questionable'])
AND ar.intersection_uid = v15.intersection_uid
AND (
ar.classification_uid = v15.classification_uid
OR ar.classification_uid IS NULL
) AND (
ar.leg = v15.leg
OR ar.leg IS NULL
) AND v15.datetime_bin >= ar.range_start
AND (
v15.datetime_bin <= ar.range_end
OR ar.range_end IS NULL
)
WHERE ar.uid IS NULL
You may also wish to use ref.holiday
to filter out holidays.
WHERE
NOT EXISTS ( -- also exclude official holidays
SELECT 1 FROM ref.holiday WHERE holiday.dt = v.datetime_bin::date
)
The anomalous_ranges
table is populated in different ways:
- manually by
miovision_data_detectives
after visual inspection or prompting from an Airflow alert - automatically by a daily script which identifys zero volume days by intersection/classification.
- There is an intention to eventually flag more unusual volumes automagically (see Issue #630).
- initially some records were added after some manual QC work which can can be found in the dev_notebooks readme.md (including notebooks and code).