Refine description of PyForestScan capabilities.

iosefa · iosefa · commit 03507c2ce3b9 · 2025-02-04T08:51:41.000+13:00
Clarified the limitations of existing tools and emphasized `PyForestScan`'s unique features, such as native tiling, multi-format support, and its focus on scalable forest analysis. These updates align the narrative with the tool's core strengths and application in large-scale ecological research.
diff --git a/paper/paper.md b/paper/paper.md
@@ -35,9 +35,9 @@ bibliography: paper.bib
 
 Remote sensing data, particularly point cloud data from airborne lidar sensors, are now widely used to understand forest ecosystems at fine spatial resolutions over large areas. Such data enable the calculation of metrics like canopy height, canopy cover, PAI, PAD, FHD, as well as DTMs, which are essential for forest management, biodiversity conservation, and carbon accounting [@mcelhinnyForestWoodlandStand2005; @drakeEstimationTropicalForest2002; @pascualRoleImprovedGround2020; @guerra-hernandezUsingBitemporalALS2024; @pascualIntegratedAssessmentCarbon2023; @pascualNewRemoteSensingbased2021a]. 
 
-Despite Python's prominence as a powerful language for geospatial and ecological analysis, there remains a scarcity of open-source Python tools dedicated to computing forest structural metrics from airborne lidar point-cloud data. This gap is significant given Python's extensive libraries for data science and its increasingly important role in ecology and deep learning [@doi:10.1111/2041-210X.13901]. Existing open-source solutions that offer some of these metrics are primarily available in the R programming language. For instance, `lidR` [@rousselLidRPackageAnalysis2020a; @rousselAirborneLiDARData2024] provides functions for point cloud manipulation, metric computation, and visualization but lacks native calculations for FHD and PAI. Another tool, `leafR` [@dealmeidaLeafRCalculatesLeaf2021], calculates FHD, leaf area index (LAI), and leaf area density (LAD) - both of which are very similar to PAI and PAD - but is limited in processing large datasets due to the absence of tiling functionality. Moreover, the importance of scale in lidar-based analyses of forest structure is well-documented [@doi:10.1111/2041-210X.14040], and `leafR` does not allow users to modify voxel depth, which can be important for accurate estimation of structural metrics across different forest types and scales. Similarly, `canopyLazR` [@kamoskeLeafAreaDensity2019] provides tools to calculate LAD and LAI from point cloud lidar data but only allows the calculation of these metrics and lacks support for tiling mechanisms, limiting its applicability to large datasets. Proprietary solutions like LAStools [@lastools], FUSION [@fusion], and Global Mapper [@globalmapper] offer tools to calculate some of these metrics -mostly canopy height- but may not provide the flexibility required for diverse ecological contexts and are often inaccessible due to licensing costs. This lack of a comprehensive, scalable Python-based solution makes it challenging for researchers, ecologists, and forest managers to integrate point-cloud-based analysis into their Python workflows efficiently. This is particularly problematic when working with large datasets or when integrating analyses with other Python-based tools, such as those used for processing space-based waveform lidar data from the Global Ecosystem Dynamics Investigation (GEDI) mission [@tangAlgorithmTheoreticalBasis2019; @DUBAYAH2020100002], which also provides data on PAI, plant area volume density (PAVD), and FHD.
+Despite Python's prominence as a powerful language for geospatial and ecological analysis, there remains a scarcity of open-source Python tools dedicated to computing forest structural metrics from airborne lidar point-cloud data. This gap is significant given Python's extensive libraries for data science and its increasingly important role in ecology and deep learning [@doi:10.1111/2041-210X.13901]. Existing open-source solutions that offer some of these metrics are primarily available in the R programming language. For instance, `lidR` [@rousselLidRPackageAnalysis2020a; @rousselAirborneLiDARData2024] provides functions for point cloud manipulation, metric computation, and visualization but lacks native calculations for FHD and PAI. Another tool, `leafR` [@dealmeidaLeafRCalculatesLeaf2021], calculates FHD, leaf area index (LAI), and leaf area density (LAD) - both of which are very similar to PAI and PAD - but is limited in processing large datasets due to the absence of tiling functionality. Moreover, the importance of scale in lidar-based analyses of forest structure is well-documented [@doi:10.1111/2041-210X.14040], and `leafR` does not allow users to modify voxel depth, which can be important for accurate estimation of structural metrics across different forest types and scales. Similarly, `canopyLazR` [@kamoskeLeafAreaDensity2019] focuses on LAD and LAI but omits broader metrics and does not provide native support for large-scale tiling. Proprietary solutions like LAStools [@lastools], FUSION [@fusion], and Global Mapper [@globalmapper] offer tools to calculate some of these metrics -mostly canopy height- but may not provide the flexibility required for diverse ecological contexts and are often inaccessible due to licensing costs. This lack of a comprehensive, scalable Python-based solution makes it challenging for researchers, ecologists, and forest managers to integrate point-cloud-based analysis into their Python workflows efficiently. This is particularly problematic when working with large datasets or when integrating analyses with other Python-based tools, such as those used for processing space-based waveform lidar data from the Global Ecosystem Dynamics Investigation (GEDI) mission [@tangAlgorithmTheoreticalBasis2019; @DUBAYAH2020100002], which also provides data on PAI, plant area volume density (PAVD), and FHD.
 
-`PyForestScan` was developed to fill this gap by providing an open-source, Python-based solution to calculate forest structural metrics that can handle large-scale point-cloud data while remaining accessible and efficient. By leveraging IO capabilities of `PDAL`, it handles large-scale analyses by allowing users to work with more efficient point-cloud data structure, such as spatially indexed hierarchical octree formats like EPT or COPC. `PyForestScan` supports commonly used formats such as .las, .laz, as well as more efficient formats such as COPC and EPT, and integrates with well-established geospatial frameworks for point clouds like `PDAL` [@howard_butler_2024_13993879; @BUTLER2021104680]. The more mathematically intensive calculations of PAD, PAI, and FHD are calculated following established methods by @kamoskeLeafAreaDensity2019 and @hurlbertNonconceptSpeciesDiversity1971, and details are provided in the documentation. `PyForestScan` provides native tiling mechanisms to calculate metrics across large landscapes, IO support across multiple formats, point cloud processing tools to filter points and create ground surfaces, as well as simple visualization functions for core metrics. `PyForestScan` brings this functionality to Python, while also introducing capabilities not found in any single existing open-source software. By focusing on forest structural metrics, `PyForestScan` provides an essential tool for the growing need to analyze forest structure at scale in the context of environmental monitoring, conservation, and climate-related research.
+`PyForestScan` was developed to fill this gap by providing an open-source, Python-based solution to calculate forest structural metrics that can handle large-scale point-cloud data while remaining accessible and efficient. By leveraging IO capabilities of `PDAL`, it handles large-scale analyses by allowing users to work with more efficient point-cloud data structure, such as spatially indexed hierarchical octree formats like EPT or COPC. `PyForestScan` supports commonly used formats such as .las, .laz, as well as more efficient formats such as COPC and EPT, and integrates with well-established geospatial frameworks for point clouds like `PDAL` [@howard_butler_2024_13993879; @BUTLER2021104680]. The more mathematically intensive calculations of PAD, PAI, and FHD are calculated following established methods by @kamoskeLeafAreaDensity2019 and @hurlbertNonconceptSpeciesDiversity1971, and details are provided in the documentation. `PyForestScan` provides native tiling mechanisms to calculate metrics across large landscapes, IO support across multiple formats, point cloud processing tools to filter points and create ground surfaces, as well as simple visualization functions for core metrics. By combining these features, `PyForestScan` meets the growing need to analyze forest structure in environmental monitoring, conservation, and climate-focused research.
 
 # Usage