Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

simplify() can return invalid geometries (even with preserve_topology=True) #2165

Open
jmealo opened this issue Oct 4, 2024 · 6 comments
Open

Comments

@jmealo
Copy link

jmealo commented Oct 4, 2024

The fix for this has been Back-ported to 3.11 at libgeos/geos@5297e1e

Expected behavior and actual behavior.

Calling .simplify(tolerance=0.01, preserve_topology=True) only returns valid Geometry (as indicated by the docs).

Steps to reproduce the problem.

  • Call .simplify(tolerance=0.01, preserve_topology=True) on any of the 47 test cases provided network_as_polygon_edgecase.zip
  • Check is_valid on the returned value from simplify -- it'll be false

Workaround for issue:

def simplify(
        shape: shapely.geometry.base.BaseGeometry,
        tolerance: float,
        preserve_topology: bool = True
) -> shapely.geometry.base.BaseGeometry:
    """
    Simplify a Shapely geometry object.

    This function attempts to simplify a given Shapely geometry object while preserving its topology.
    If the simplified geometry is invalid, the tolerance is reduced iteratively until a valid geometry is obtained.
    This function is a workaround for a bug in Shapely's simplify method, which can return invalid geometries.

    As of Shapely 2.0.6, this bug still exists.

    :param shapely.geometry.base.BaseGeometry shape: The geometry object to simplify.
    :param float tolerance: The tolerance level for simplification.
    :param bool preserve_topology: Whether to preserve the topology of the geometry. Default is True.

    :return: The simplified geometry object.
    :rtype: shapely.geometry.base.BaseGeometry
    """
    # Workaround: Simplify can return invalid geometries, so we need to keep reducing the tolerance until we get a
    # valid geometry. This appears to be a bug in Shapely, as the simplify method is supposed to return a valid
    # geometry (when preserve_topology=True). We have tests in place to verify when this can safely be removed.
    # The geometries that caused this issue (requiring this workaround) are available in:
    # tests/fixtures/network_as_polygons_edgecase.zip

    if not shape.is_valid:
        return shape

    while True:
        simplified_shape = shape.simplify(
            tolerance,
            preserve_topology=preserve_topology,
        )

        if simplified_shape.is_valid:
            return simplified_shape
        else:
            tolerance /= 10

pytest Code Validating the Issue

If you place network_as_polygon_edgecase.zip in tests/fixtures, the following pytest can verify that Shapely returns an invalid shape for all 47 test cases. (I'm using a zip file because the files are larger than what I'd like to keep track of in my repository).

class GeoJSONEdgeCaseReader:
    def __init__(self, zip_file_path):
        self.zip_file_path = zip_file_path
        self.zip_file = zipfile.ZipFile(zip_file_path, 'r')
        self.geojson_files = [f for f in self.zip_file.namelist() if f.endswith('.geojson')]
        self.file_count = len(self.geojson_files)
        self.current_index = 0

    def get_next_geojson(self) -> Dict[str, str]:
        if self.current_index >= len(self.geojson_files):
            return None

        file_name = self.geojson_files[self.current_index]

        with self.zip_file.open(file_name) as file:
            print (f"Reading file: {file_name}")
            geojson_data = file.read().decode('utf-8')
        self.current_index += 1

        return {
            'data': geojson_data,
            'file_name': file_name
        }

def test_simplify_road_network_as_polygon_edge_case():
    fixture_type = 'network_as_polygon_edgecase'
    reader = GeoJSONEdgeCaseReader(os.path.join(os.path.dirname(__file__), 'fixtures', f'{fixture_type}.zip'))

    invalid_shapely_simplifications = []

    while True:
        fixture = reader.get_next_geojson()
        if fixture is None:
            break

        data = fixture['data']
        file_path = f"fixtures/{fixture_type}.zip/{fixture['file_name']}"
        shape_to_simplify = shape((json.loads(data)))

        # attempt to simplify edge cases with shapely and our internal simplify function
        shapely_simplified = shape_to_simplify.simplify(tolerance=0.01)

        if not shapely_simplified.is_valid:
            invalid_shapely_simplifications.append(file_path)

    assert len(invalid_shapely_simplifications) == reader.file_count, (
        f"Upstream fix likely, please examine why: Shapely simplifications succeeded for: {invalid_shapely_simplifications}"
    )

Operating system

  • Mac OS 14.5 (23F79)
  • Ubuntu 20

Shapely version and provenance

Behavior observed on both 1.8.4 and 2.0.6 installed via PDM.

@jmealo
Copy link
Author

jmealo commented Oct 4, 2024

Unrelated to:
#1425
#640
#1969

@theroggy
Copy link
Member

theroggy commented Oct 4, 2024

The other issues are not related.

Shapely is mainly a python wrapper around the C++ library GEOS. For simplify it calls the GEOS function GEOSTopologyPreserveSimplify.

So, it's best to post your issue there as well and reference this issue here for further tracking.

@jmealo
Copy link
Author

jmealo commented Oct 7, 2024

@theroggy: Thanks for the quick reply. I've posted an upstream issue here: libgeos/geos#1180

@jmealo
Copy link
Author

jmealo commented Oct 22, 2024

@theroggy: libgeos/geos#1180 <~~~ The fixes for this have been back ported to GEOS 3.11. Would it be possible to cut a new release?

@theroggy
Copy link
Member

The next stap would be that geos cuts a new 3.11 release (3.11.5): https://github.com/libgeos/geos/releases, and then this can be included in new shapely 2.0.7 wheels.

FYI: Note that using conda you have more flexibility in the geos version you can install... so that way I've been using geos 3.12 and 3.13 for a while now combined with shapely 2.0.x.

@jmealo
Copy link
Author

jmealo commented Oct 23, 2024

@theroggy Thanks for the tip! I'm pretty new to Python and I've been wondering about keeping geospatial dependencies pinned at particular versions.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

3 participants