Maintain history on the cities.geojson file #3

Open
iandees opened this issue May 18, 2018 · 10 comments
Labels
enhancement New feature or request

Comments

@iandees
Contributor

iandees commented May 18, 2018

@migurski suggested that I keep the history of the cities.geojson file. This is a great idea as there are a ton of different contributors to it.

Here's how I might do it:

  1. Check out the old repo to a new place
  2. Use git filter-branch to keep only the commits that touch the cities.json file, on a branch of their own
  3. Add this stripped-out local repo as a remote to this repo (on my machine)
  4. Run git fetch on that remote
  5. Run git branch repo1 remotes/repo1/master to create a new branch with the history of the cities.json file
  6. Run the .json to .geojson script, rename the file, and commit that change to the new branch
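
The steps above might look roughly like the following shell sketch. The function name `import_history`, the `run` dry-run helper, the `repo1` remote name, and all paths are placeholders. The filter in step 2 is only one possible recipe (a `find`-based tree filter that keeps `cities.*` files), and step 6 is left as a comment because the conversion script is not named here.

```shell
# Sketch under assumptions: every path, URL, and the remote name "repo1"
# is a placeholder supplied by the caller or chosen for illustration.

# run CMD...: print the command when DRY_RUN is set, otherwise execute it.
run() {
    if [ -n "${DRY_RUN:-}" ]; then
        printf '%s\n' "$*"
    else
        "$@"
    fi
}

# import_history OLD_URL SCRATCH_DIR TARGET_REPO
import_history() {
    old_url=$1
    scratch=$2
    target=$3

    # 1. Check out the old repo to a new place.
    run git clone "$old_url" "$scratch" || return 1

    # 2. Rewrite the current branch so each commit keeps only cities.* files
    #    (one possible filter; other recipes exist).
    run git -C "$scratch" filter-branch --tree-filter \
        "find . ! -name 'cities.*' -delete" -f || return 1

    # 3. Add the stripped-out local repo as a remote of the target repo.
    run git -C "$target" remote add repo1 "$scratch" || return 1

    # 4. Fetch that remote.
    run git -C "$target" fetch repo1 || return 1

    # 5. Create a local branch that carries the imported history.
    run git -C "$target" branch repo1 remotes/repo1/master || return 1

    # 6. Not automated here: on that branch, run the .json to .geojson
    #    conversion, rename the file (e.g. with git mv), and commit.
}
```

Calling `DRY_RUN=1 import_history <old-url> <scratch-dir> <target-repo>` prints the commands without running them, which is a cheap way to review the plan before touching either repository.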
@iandees iandees added the enhancement New feature or request label May 18, 2018
@migurski

Here’s a stripped-out version of the repository: https://github.com/migurski/metro-extracts/tree/migurski/prepare-nextzen

I ran git filter-branch twice:

  • Remove current master files other than cities.json:

      git filter-branch --tree-filter 'rm -rf .gitignore App Dockerfile LICENSE.txt Procfile app.json circle.yml docs requirements.txt run-debug.py runtime.txt docker*' -f
    
  • Remove any other historical files other than cities.json and older cities.txt:

      git filter-branch --tree-filter "find . ! -name 'cities.*' -delete" -f
    

I tried a run of this with --prune-empty set, but it was too aggressive and removed some past contributors’ commits.
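
One way to check whether a rewrite dropped contributors is to compare the author list before and after the filter. The sketch below assumes two refs are available, one untouched and one rewritten; the helper names `authors` and `missing_from` and the ref names in the usage comment are hypothetical.

```shell
# authors REF: unique author names reachable from REF, sorted (needs git).
authors() {
    git log --format='%aN' "$1" | LC_ALL=C sort -u
}

# missing_from BEFORE AFTER: lines of sorted file BEFORE absent from AFTER.
missing_from() {
    LC_ALL=C comm -23 "$1" "$2"
}

# Example use (ref names are assumptions about the local clone):
#   authors origin/master > before.txt   # untouched history
#   authors master        > after.txt    # rewritten history
#   missing_from before.txt after.txt    # contributors the rewrite dropped
```

An empty result from `missing_from` would suggest that every original author still has at least one commit in the rewritten history.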

@iandees
Contributor Author

iandees commented May 20, 2018

👏 thanks for doing this!

@drewda

drewda commented May 21, 2018

Hi guys, FYI, after suggestions from @migurski on Twitter, @irees did this in https://github.com/interline-io/osm-extracts/blob/master/cities.json

How about we find a common place where both projects can source the same GeoJSON file? Slightly different pipeline internals and output formats -- but we probably have the same notion of metros/regions in mind.

@iandees
Contributor Author

iandees commented May 21, 2018

Yea, good idea @drewda. I'm happy to use yours if you'd like. I do have plans to support polygons rather than just bounding boxes. Does the tool you're using support that? It's ok if not, maybe we could store the bounding box separately.

@irees

irees commented May 22, 2018

@iandees Yes, I plan to support polygons. Currently I generate a simple bbox based on the polygon geometry, but I am going to add full polygon clipping support (it's using osmconvert). Once per day, after updating the planet, I spin up a temporary ~200-CPU cluster and do the extracts in about 30 minutes; I'm working on a blog post describing it.

@migurski

FWIW, I was pretty intentional about using bboxes instead of polygons in 2011: people were using extracts to do local renders, and we had a large extra chunk of space around each metro center to ensure that you could zoom out and still see a meaningful map. Having a big fringe or even merging neighboring metros (like Tijuana and San Diego) was a goal, rather than clipping polygonal areas restricted to specific town boundaries.

@drewda

drewda commented May 22, 2018

> I'm happy to use yours if you'd like.

@iandees sounds good. Feel free to edit the interline-io/osm-extracts readme, or I can do that, to link off to your extracts when they're ready to be made public. Also, if you need to access the GeoJSON file with Content-Type headers and all, we reverse-proxy it out at http://app.interline.io/osm_extracts/bounding_boxes.geojson (with 1 hour caching behind the scenes).

> FWIW, I was pretty intentional about using bboxes instead of polygons in 2011

@migurski thanks for the background. Enforcing a bounding box makes a lot of sense when the primary application is tile rendering. For the routing engine graphs that @irees and I build from our extracts, we don't have those constraints. Still, it's hard to imagine situations that would warrant non-convex polygons or other odd shapes.

@iandees are there specific regions you have in mind for switching from bboxes to polygons?

@iandees
Contributor Author

iandees commented May 22, 2018

> are there specific regions you have in mind for switching from bboxes to polygons?

I don't have any particular need for it right now, but I wanted to let people submit pull requests for polygons.

Two thoughts:

  1. Metro areas like Minneapolis/St Paul that are more round than square. On the other hand, including the extra bits in the corners of a rectangular version of a metro area doesn't really increase the size of the extract that much.
  2. We could generate extracts for continents or countries.

@drewda

drewda commented May 23, 2018

@iandees sounds good. Feel free to open PRs against the cities.json whenever you want. I can also give you push access, so we have more hands to handle any incoming PRs.

@drewda

drewda commented May 24, 2018

Related: we now have CircleCI running JSON Schema validation, to decrease the chances of improperly formed GeoJSON entering either of our systems: https://github.com/interline-io/osm-extracts/blob/master/cities-schema.json


4 participants