
Commit 96f1034

committed
adds fill feature, tests and readme
1 parent 5384d28 commit 96f1034

18 files changed

+1062183
-268
lines changed

README.md

+165-57
@@ -1,12 +1,24 @@
1-
# OSM Building Extractor
2-
3-
*Under Construction*
1+
# OSM Ox
42

53
A tool for extracting locations and features from Open Street Map (OSM) data.
64

5+
## Why?
6+
7+
We use Osmox to extract locations from OSM for city or national scale agent based models. This tends to focus on extracting buildings and their likely usages, for example `homes`, `schools`, `medical facilities` and `places of work`, but can also be abstracted to other objects such as parks or land use.
8+
9+
Under the hood Osmox is a collection of labelling and GIS type operations:
10+
11+
- filtering
12+
- activity labelling
13+
- simple spatial activity inference
14+
- feature extraction (such as floor areas)
15+
- filling missing data
16+
17+
Assembled together they form part of our wider pipeline. But as a standalone tool, Osmox is useful for extracting insights from OSM in a highly reproducible manner.
18+
719
## Install
820

9-
```
21+
```{sh}
1022
git clone [email protected]:arup-group/osmox.git
1123
pip install osmox
1224
# or pip install -e osmox
@@ -15,7 +27,54 @@ pytest
1527
osmox --help
1628
```
1729

18-
## Run
30+
## Quick Start
31+
32+
Extract `home`, `work`, `education`, `shop` and various other activity locations ("facilities") for the Isle of Man:
33+
34+
`osmox run configs/example.json example_data/isle-of-man.osm example -crs "epsg:27700"` (paths given from osmox project root)
35+
36+
After about 30 seconds you should find locations for the extracted facilities in the specified `example` directory. Each facility includes a number of features as per the config:
37+
38+
```{geojson}
39+
{
40+
"type": "FeatureCollection",
41+
"features": [
42+
...
43+
{
44+
"id": "13589",
45+
"type": "Feature",
46+
"properties": {
47+
"activities": "home",
48+
"area": 196,
49+
"distance_to_nearest_education": 816.4434678355371,
50+
"distance_to_nearest_medical": 366.81198701080626,
51+
"distance_to_nearest_shop": 133.12877450643526,
52+
"distance_to_nearest_transit": 122.33125535187033,
53+
"floor_area": 392.0,
54+
"id": 1869954720,
55+
"levels": 2.0,
56+
"units": 1
57+
},
58+
"geometry": {
59+
"type": "Point",
60+
"coordinates": [220894.60596542264, 467332.85704661923]
61+
}
62+
},
63+
...
64+
```
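To get a quick feel for an output like the one above, the features can be summarised with a few lines of Python. This is only a minimal sketch using the standard library: the `sample` collection is made up for illustration (its first feature loosely mirrors the example above, the second is invented), `floor_area_by_activity` is a hypothetical helper rather than part of osmox, and it assumes `activities` holds a single string. In practice you would load whichever `.geojson` file osmox wrote to your output directory.

```python
import json
from collections import defaultdict

# Illustrative stand-in for an osmox output. Only the first feature mirrors
# the README example; the second is invented for this sketch.
sample = json.loads("""
{
  "type": "FeatureCollection",
  "features": [
    {"type": "Feature",
     "properties": {"activities": "home", "area": 196, "floor_area": 392.0},
     "geometry": {"type": "Point", "coordinates": [220894.6, 467332.9]}},
    {"type": "Feature",
     "properties": {"activities": "shop", "area": 50, "floor_area": 50.0},
     "geometry": {"type": "Point", "coordinates": [220900.0, 467340.0]}}
  ]
}
""")


def floor_area_by_activity(collection):
    """Sum the `floor_area` property per `activities` value."""
    totals = defaultdict(float)
    for feature in collection["features"]:
        props = feature["properties"]
        # Facilities without a floor_area contribute nothing to the total.
        totals[props["activities"]] += props.get("floor_area", 0.0)
    return dict(totals)


print(floor_area_by_activity(sample))
```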
65+
66+
Outputs are written as WGS 84 (epsg:4326), so that they can be quickly inspected via [kepler](https://kepler.gl) or equivalent:
67+
68+
![isle of man floor areas](./readme_fixtures/floor-areas.png)
69+
*^ Isle of Man facility `floor_area` feature. Approximated based on polygon areas and floor labels or sensible defaults.*
70+
71+
![isle of man activities](./readme_fixtures/activities.png)
72+
*^ Isle of Man `activities` feature. For simulations we use this information to control what agents can do where, but this is also a good disaggregate proxy for land-use. In this example, blue areas are residential, orange commercial and brown is other work places.*
73+
74+
![isle of man distance_to_nearest_transit](./readme_fixtures/distance-to-transit.png)
75+
*^ Isle of Man `distance_to_nearest_transit`. Note that other distances are available, such as distance to nearest education.*
76+
77+
## Osmox Run
1978

2079
`osmox run <CONFIG_PATH> <INPUT_PATH> <OUTPUT_PATH>` is the main entry point for OSMOX:
2180

@@ -30,11 +89,11 @@ Options:
3089
3190
```
3291

33-
We describe configs below. The `<INPUT_PATH>` should point to an OSM map dataset (for example `osm.pbf`). The `<OUTPUT_PATH>` should point to an output directory.
92+
We describe configs below. The `<INPUT_PATH>` should point to an OSM map dataset (`osm` (xml) and `osm.pbf` are supported). The `<OUTPUT_PATH>` should point to an existing or new output directory.
3493

3594
## Configs
3695

37-
Configs are important. So we provide some examples in `mc/configs` and a validation method:
96+
Configs are important. So we provide some examples in `mc/configs` and a validation method for when you start editing or building your own configs:
3897

3998
```{sh}
4099
osmox validate <CONFIG PATH>
@@ -76,7 +135,7 @@ INFO:osmox.main:Done.
76135

77136
Once complete you will find osmox has created one or two output `.geojson` files in the specified `<OUTPUT_PATH>`. If you have specified a crs, you will find your outputs in both this crs and epsg:4326.
78137

79-
We generally refer to the outputs collectively as `facilities` and the properties as `features`. Note that each facility has a unique id, a bunch of features (depending on the configuration) and a point geometry. In the case of areas or polygons, such as buildings, the point represents the centroid. Measured features such as `floor_area` and `distance_to_nearest_X` are measured in the specified crs. Generally we assume you will specify a grid crs such as 27700 for the UK.
138+
We generally refer to the outputs collectively as `facilities` and the properties as `features`. Note that each facility has a unique id, a bunch of features (depending on the configuration) and a point geometry. In the case of areas or polygons, such as buildings, the point represents the centroid.
80139

81140
```{geojson}
82141
{
@@ -101,12 +160,16 @@ We generally refer to the outputs collectively as `facilities` and the propertie
101160
}
102161
```
103162

104-
## Definitions
163+
In the quick start demo, we specified the coordinate reference system as `epsg:27700` (this is the default, but we specified it for visibility) so that distance and area based features would have sensible units (metres in this case). If extracting data from other areas we would encourage using the relevant grid crs for that area.
164+
165+
## Configuration
166+
167+
### Definitions
105168

106169
**OSMObjects** - objects extracted from OSM. These can be points, lines or polygons. Objects have features.
107170
**OSMFeatures** - OSM objects have features. Features typically include a key and value based on the [OSM wiki](https://wiki.openstreetmap.org/wiki/Map_features).
108171

109-
## Primary Functionality
172+
### Primary Functionality
110173

111174
The primary use case for osmox is for extracting a representation of places where people can do various activities ('education' or 'work' or 'shop' for example). This is done applying a configured mapping to OSM tags:
112175

@@ -119,17 +182,19 @@ The primary use case for osmox is for extracting a representation of places wher
119182
"kindergarden",
120183
"school",
121184
"university",
122-
"college"
185+
"college",
186+
"yes"
123187
]
124-
}
188+
}, ...
125189
}
126190
```
127191

128192
- **Activity Map** object activities based on OSM tags (eg: this building type 'university' is an education facility). Activity mapping is based on the same `config.json`, but we add a new section `activity_mapping`. For each OSMTag (a key such as `building` and a value such as `hotel`), we map a list of activities:
129193

130194
```{json}
131195
{
132-
"activity_mapping": {
196+
...
197+
"activity_mapping": {
133198
"building": {
134199
"hotel": ["work", "visit"],
135200
"residential": ["home"]
@@ -138,20 +203,39 @@ The primary use case for osmox is for extracting a representation of places wher
138203
}
139204
```
140205

206+
Because an OSM tag key is often sufficient to make an activity mapping, we allow use of `*` as "all":
207+
208+
```{json}
209+
{
210+
...
211+
"activity_mapping": {
212+
...
213+
"office": {
214+
"*": ["work"]
215+
}
216+
}
217+
}
218+
```
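As a rough illustration of how such a mapping could be resolved, the sketch below looks up each tag by key and then by value, treating `*` as a match for any value. The function name `map_activities` is hypothetical, and letting an exact match and the wildcard both contribute is an assumption made for this example; it is not taken from the osmox source.

```python
def map_activities(tags, activity_mapping):
    """Return the sorted activities implied by a list of (key, value) OSM tags."""
    activities = set()
    for key, value in tags:
        by_value = activity_mapping.get(key, {})
        # Assumption: an exact value match and the "*" wildcard both contribute.
        activities.update(by_value.get(value, []))
        activities.update(by_value.get("*", []))
    return sorted(activities)


mapping = {
    "building": {"hotel": ["work", "visit"], "residential": ["home"]},
    "office": {"*": ["work"]},
}

print(map_activities([("office", "architect")], mapping))  # ['work']
print(map_activities([("building", "hotel")], mapping))    # ['visit', 'work']
print(map_activities([("building", "yes")], mapping))      # []
```

The last call shows why a bare `building:yes` object needs further help: nothing in this mapping matches it, so no activities are assigned.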
219+
220+
Note that the filter controls the final objects that get extracted but that the activity mapping is more general. It is typical to map tags that are not included in the filter because these can be used by subsequent steps (such as inference) to assign activities where otherwise useful tags aren't included. There is no harm in over specifying the mapping.
221+
141222
These configs get looong - but we've supplied some full examples in the project.
142223

143-
## Spatial Inference
224+
### Spatial Inference
144225

145-
Because OSMObjects do not always contain useful tags, we also infer object tags based on spatial operations with surrounding tags. The most common use case for this are building objects that are simply tagged as `building:yes`. We use the below logic to infer useful tags, such as 'building:shop' or 'building:residential'.
226+
Because OSMObjects do not always contain useful tags, we also infer object tags based on spatial operations with surrounding tags.
146227

147-
- **Contains.** - If an OSMObject has no activity mappable tags (eg `building:yes`), tags are assigned based on the tags of objects that are contained within. For example, a building that contains an `amenity:shop` point is then tagged as `amenity:shop`.
228+
The most common use case for this is building objects that are simply tagged as `building:yes`. We use the below logic to infer useful tags, such as 'building:shop' or 'building:residential'.
229+
230+
- **Contains.** - If an OSMObject has no mappable tags (eg `building:yes`), tags are assigned based on the tags of objects that are contained within. For example, a building that contains an `amenity:shop` point is then tagged as `amenity:shop`.
148231
- **Within.** - Where an OSM object *still* does not have a useful OSM tag - tags are assigned based on the tags of objects that contain the object. The most common case is for untagged buildings to be assigned based on landuse objects. For example, a building within a `landuse:residential` area will be assigned with `landuse:residential`.
149232

150233
In both cases we need to add the OSMTags we plan to use to the `activity_mapping` config, eg:
151234

152235
```{json}
153236
{
154-
"activity_mapping": {
237+
....
238+
"activity_mapping": {
155239
"building": {
156240
"hotel": ["work", "visit"],
157241
"residential": ["home"]
@@ -169,81 +253,105 @@ In both cases we need to add the OSMTags we plan to use to the `activity_mapping
169253
- **Default.** - Where an OSMObject *still* does not have a useful OSM tag, we can optionally apply defaults. Again, these are set in the config:
170254

171255
```{json}
172-
"default_activities": ["home"]
256+
{
257+
...
258+
"default_tags": [["building", "residential"]],
259+
...
260+
}
173261
```
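The order of the contains, within and default steps can be sketched in plain Python. Everything here is a simplified assumption for illustration: `point_in_polygon` and `infer_tags` are hypothetical helpers, objects are plain dictionaries, "within" is approximated by testing the building centroid, and `mappable` is a flat set of `(key, value)` tags rather than a full `activity_mapping`.

```python
def point_in_polygon(point, polygon):
    """Ray casting test; `polygon` is a list of (x, y) vertices."""
    x, y = point
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        # Count the edges crossed by a horizontal ray running right from the point.
        if (y1 > y) != (y2 > y):
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


def infer_tags(building, points, areas, mappable, default_tags):
    """Apply the contains, within and default steps in that order."""
    own = [t for t in building["tags"] if t in mappable]
    if own:
        return own
    # Contains: borrow mappable tags from points inside the building.
    contained = [t for p in points
                 if point_in_polygon(p["xy"], building["polygon"])
                 for t in p["tags"] if t in mappable]
    if contained:
        return contained
    # Within: borrow mappable tags from areas holding the building centroid.
    within = [t for a in areas
              if point_in_polygon(building["centroid"], a["polygon"])
              for t in a["tags"] if t in mappable]
    if within:
        return within
    # Default: fall back to the configured default tags.
    return list(default_tags)


building = {"tags": [("building", "yes")], "centroid": (5, 5),
            "polygon": [(0, 0), (10, 0), (10, 10), (0, 10)]}
shop = {"tags": [("amenity", "shop")], "xy": (2, 2)}
landuse = {"tags": [("landuse", "residential")],
           "polygon": [(-50, -50), (50, -50), (50, 50), (-50, 50)]}
mappable = {("amenity", "shop"), ("landuse", "residential")}
defaults = [("building", "residential")]

print(infer_tags(building, [shop], [landuse], mappable, defaults))
print(infer_tags(building, [], [landuse], mappable, defaults))
print(infer_tags(building, [], [], mappable, defaults))
```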
174262

175-
## Feature Extraction
263+
### Feature Extraction
176264

177265
Beyond simple assignment of human activities based on OSM tags, we also support the extraction of other features:
178266

179-
- tags (eg 'building:yes')
267+
- areas
268+
- floors
180269
- floor areas
181270
- units (eg residential units in a building)
182271

183272
These can be configured as follows:
184273

185274
```{json}
186-
"features_config": ["units", "floors", "area", "floor_area"]
275+
{
276+
...
277+
"features_config": ["units", "floors", "area", "floor_area"]
278+
...
279+
}
187280
```
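The quick start example suggests how these features relate: a facility with `area` 196 and `levels` 2.0 has a `floor_area` of 392.0. A minimal sketch of that arithmetic follows. The function and its `default_levels` fallback are hypothetical and only illustrate the idea of using a default when no floor label is available; the defaults osmox actually applies are not shown here.

```python
def floor_area(area, levels=None, default_levels=1.0):
    """Approximate floor area as footprint area times number of levels."""
    if levels is None:
        # Hypothetical fallback for objects without a floor label.
        levels = default_levels
    return area * levels


print(floor_area(196, 2.0))  # 392.0
print(floor_area(196))       # 196.0
```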
188281

189-
## Distance to Nearest Extraction
282+
### Distance to Nearest Extraction
190283

191284
OSMOX also supports calculating distance to nearest features based on object activities. For example we can extract nearest distance to `transit`, `education`, `shop` and `medical` by adding the following to the config:
192285

193286
```{json}
194-
"distance_to_nearest": ["transit", "education", "shop", "medical"],
287+
{
288+
...
289+
"distance_to_nearest": ["transit", "education", "shop", "medical"],
290+
...
291+
}
195292
```
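Conceptually, each `distance_to_nearest_X` feature is the straight-line distance from a facility to the closest facility with activity `X`, measured in the projected crs. The brute-force sketch below illustrates the idea only; `distance_to_nearest` is a hypothetical helper and the coordinates are invented. A tool working at city or national scale may well use a spatial index rather than comparing every pair.

```python
import math


def distance_to_nearest(origin, targets):
    """Straight-line distance from `origin` to the closest of `targets`.

    Returns None when there are no targets to measure against.
    """
    if not targets:
        return None
    return min(math.dist(origin, target) for target in targets)


home = (0.0, 0.0)
transit_stops = [(3.0, 4.0), (30.0, 40.0)]
print(distance_to_nearest(home, transit_stops))  # 5.0
```

Because the inputs are plain coordinates, the result is only meaningful in a grid crs with metric units (such as epsg:27700); the same calculation on longitude and latitude values would not give metres.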
196293

197294
Note that the selected activities are based on the activity mapping config. Any activities should therefore be included in the activity mapping part of the config. You can use `osmox validate <CONFIG PATH>` to check if a config is correctly configured.
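The kind of consistency check described here can be sketched as follows. This is not the implementation behind `osmox validate`; `unmapped_activities` is a hypothetical helper that simply reports any `distance_to_nearest` activity that no entry in `activity_mapping` can produce.

```python
def unmapped_activities(config):
    """Return `distance_to_nearest` activities missing from `activity_mapping`."""
    mapped = {
        activity
        for values in config.get("activity_mapping", {}).values()
        for activities in values.values()
        for activity in activities
    }
    return sorted(set(config.get("distance_to_nearest", [])) - mapped)


config = {
    "activity_mapping": {
        "building": {"hotel": ["work", "visit"], "residential": ["home"]},
        "amenity": {"school": ["education"]},
    },
    "distance_to_nearest": ["transit", "education"],
}

print(unmapped_activities(config))  # ['transit']
```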
198295

199-
## Fill Missing Activities
296+
### Fill Missing Activities
200297

201-
We have noted that it is not uncommon for some small urban areas to not have building objects, but to have an appropriate landuse area tagged as 'residential'.
298+
We have noted that it is not uncommon for some small areas to not have building objects, but to have an appropriate landuse area tagged as 'residential'.
202299

203-
We therefore provide a general solution for filling such areas with a grid of objects. This fill method only fills areas that to not have the required activity.
300+
We therefore provide a very ad-hoc solution for filling such areas with a grid of objects. This fill method only fills areas that do not have the required activities already within them.
204301

205302
For example, given an area tagged as `landuse:residential` by OSM, that does not contain any object of activity type `home`, the fill method will add a grid of new objects tagged `building:house`. The new objects will also have activity type `home`, size `10 by 10` and be spaced at `25 by 25`:
206303

207304
```{json}
208-
"fill_missing_activities":
209-
[
210-
{
211-
"area_tags": [["landuse", "residential"]],
212-
"required_acts": ["home"],
213-
"new_tags": [["building", "house"]],
214-
"size": [10, 10],
215-
"spacing": [25, 25]
216-
}
217-
]
305+
{
306+
...
307+
"fill_missing_activities":
308+
[
309+
{
310+
"area_tags": [["landuse", "residential"]],
311+
"required_acts": ["home"],
312+
"new_tags": [["building", "house"]],
313+
"size": [10, 10],
314+
"spacing": [25, 25]
315+
}
316+
]
317+
}
218318
```
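To make `size` and `spacing` concrete, the sketch below lays out a grid of object centres across a rectangular area and builds a footprint around each one. It is a deliberate simplification: the area is treated as a bounding box `(minx, miny, maxx, maxy)` rather than an arbitrary polygon, the check for existing `required_acts` is left out, and both helper names and the choice to start the grid half a spacing in from the corner are assumptions made for illustration.

```python
def grid_centres(bounds, spacing):
    """Centres of grid cells laid out across `bounds` = (minx, miny, maxx, maxy)."""
    minx, miny, maxx, maxy = bounds
    dx, dy = spacing
    centres = []
    y = miny + dy / 2
    while y < maxy:
        x = minx + dx / 2
        while x < maxx:
            centres.append((x, y))
            x += dx
        y += dy
    return centres


def footprint(centre, size):
    """Corner coordinates of a rectangle of `size` (width, height) around `centre`."""
    (cx, cy), (w, h) = centre, size
    return [(cx - w / 2, cy - h / 2), (cx + w / 2, cy - h / 2),
            (cx + w / 2, cy + h / 2), (cx - w / 2, cy + h / 2)]


centres = grid_centres((0, 0, 100, 50), spacing=(25, 25))
print(len(centres))                      # 8
print(footprint(centres[0], (10, 10)))   # corners of the first 10 by 10 object
```

With a 100 by 50 area and `spacing` of `[25, 25]`, this gives 4 columns and 2 rows of new objects, each 10 by 10.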
219319

220-
Multiple groups can be defined:
221-
222-
```{json}
223-
"fill_missing_activities":
224-
[
225-
{
226-
"area_tags": [["landuse", "residential"]],
227-
"required_acts": ["home"],
228-
"new_tags": [["building", "house"]],
229-
"size": [10, 10],
230-
"spacing": [25, 25]
231-
},
232-
{
233-
"area_tags": [["landuse", "forest"], ["landuse", "orchard"]],
234-
"required_acts": ["tree_climbing", "glamping"],
235-
"new_tags": [["amenity", "tree"], ["building", "tree house"]],
236-
"size": [3, 3],
237-
"spacing": [8, 8]
238-
}
239-
]
240-
```
320+
![isle of man activity fill](./readme_fixtures/activity-fill.png)
321+
*^ Example Isle of Man activity filling in action for a residential area without building locations.*
241322

242323
Note that the selected activities are based on the activity mapping config. Any activities should therefore be included in the activity mapping part of the config. You can use `osmox validate <CONFIG PATH>` to check if a config is correctly configured.
243324

325+
Multiple groups can also be defined, for example:
326+
327+
```{json}
328+
{
329+
...
330+
"fill_missing_activities":
331+
[
332+
{
333+
"area_tags": [["landuse", "residential"]],
334+
"required_acts": ["home"],
335+
"new_tags": [["building", "house"]],
336+
"size": [10, 10],
337+
"spacing": [25, 25]
338+
},
339+
{
340+
"area_tags": [["landuse", "forest"], ["landuse", "orchard"]],
341+
"required_acts": ["tree_climbing", "glamping"],
342+
"new_tags": [["amenity", "tree"], ["building", "tree house"]],
343+
"size": [3, 3],
344+
"spacing": [8, 8]
345+
}
346+
]
347+
....
348+
}
349+
```
244350

245351
## TODO
246352

353+
- move to toml/yaml configs
247354
- add support to keep original geometries
248355
- add .shp option
249356
- add other distance or similar type features, eg count of nearest neighbours
357+
- warning or feedback when trying to process really large datasets
