A comprehensive geospatial analysis tool for assessing flood vulnerability and green infrastructure alignment in rail corridors. This tool implements the methodology described in COMPLETE_METHODOLOGY_GUIDE.txt for analyzing spatial relationships between permeable pavement infrastructure and flood vulnerability indicators.
To what extent does current permeable pavement distribution align with stormwater flood vulnerability in the urban rail corridors between Seattle and Tacoma?
All workflows, data requirements, and reporting in this repository are organized to answer that question as directly as possible. See docs/research_question_alignment.md for a guided walkthrough of the analysis and interpretation.
This tool provides automated geospatial analysis capabilities for:
- Vulnerability Assessment: Computing multi-factor flood vulnerability indices based on topography, slope, soil type, imperviousness, and drainage proximity
- Infrastructure Density Analysis: Calculating permeable pavement density within corridor buffers
- Alignment Assessment: Evaluating spatial correlation between vulnerability and infrastructure placement
- Gap Analysis: Identifying priority areas with high vulnerability but low infrastructure coverage
- Spatial Clustering: Detecting hot spots and cold spots using Local Moran's I and Getis-Ord Gi* statistics
- Runoff Modeling: Estimating stormwater runoff using SCS Curve Number method
- Multi-buffer analysis (100m, 250m, 500m)
- Coordinate system standardization (WA State Plane South EPSG:2927)
- Spatial data validation and cleaning
- Corridor segmentation at station locations
- Composite vulnerability index calculation
- Infrastructure density metrics
- Correlation and regression analysis
- Quadrant classification for priority setting
- Gap index computation
- Global Moran's I for spatial autocorrelation
- Local Indicators of Spatial Association (LISA)
- Getis-Ord Gi* hot spot analysis
- Distance decay pattern analysis
- SCS Curve Number runoff estimation
- Design storm scenario analysis (2-year, 10-year, 25-year)
- Infrastructure optimization scenarios
- Runoff reduction benefit quantification
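To illustrate the Global Moran's I statistic listed above, here is a minimal pure-Python sketch. It assumes corridor segments arranged in a row, with binary adjacency weights between consecutive segments. The function name global_morans_i and the neighbor dictionary are illustrative assumptions, not the repository's implementation, which (judging by the dependency list) likely relies on esda and libpysal.

```python
def global_morans_i(values, neighbors):
    """Global Moran's I with binary (0/1) spatial weights.

    values: list of numbers, one per segment.
    neighbors: dict mapping a segment index to the indices of its neighbors.
    """
    n = len(values)
    mean = sum(values) / n
    z = [v - mean for v in values]                       # deviations from the mean
    s0 = sum(len(nbrs) for nbrs in neighbors.values())   # sum of all weights
    cross = sum(z[i] * z[j] for i, nbrs in neighbors.items() for j in nbrs)
    denom = sum(d * d for d in z)
    return (n / s0) * (cross / denom)


# Five hypothetical segments in a row; each neighbors the adjacent ones.
chain = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2, 4], 4: [3]}
morans_i = global_morans_i([1, 2, 3, 4, 5], chain)
print(round(morans_i, 3))  # 0.5
```

A steadily increasing series gives a positive value (similar values sit next to each other), while an alternating series such as 1, 5, 1, 5, 1 gives a negative one. Significance testing, which the tool reports as a p-value, is not covered by this sketch.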
Main Class: GeospatialAnalysisTool
- __init__(data_dir, output_dir, config_path): Initialize analysis tool with configuration
- load_data(rail_path, infrastructure_path): Load and validate spatial data, create buffers
- calculate_vulnerability(imperviousness_raster, dem_path, soils_path): Compute multi-factor vulnerability index
- analyze_infrastructure_density(): Calculate infrastructure density metrics per segment
- assess_alignment(): Evaluate correlation between vulnerability and infrastructure placement
- perform_spatial_clustering(variable_col): Execute spatial autocorrelation analysis
- perform_runoff_modeling(storm_events, soil_type): Model stormwater runoff scenarios
- generate_report(): Create comprehensive analysis summary
- save_results(formats): Export results to multiple formats (Shapefile, GeoPackage, CSV, GeoJSON)

- calculate_morans_i(segments, variable_col): Compute Global Moran's I for spatial autocorrelation
- interpret_morans_i(I, p_value): Interpret Moran's I significance and clustering pattern
- calculate_local_morans(segments, variable_col): Perform Local Indicators of Spatial Association (LISA)
- calculate_hot_spots(segments, variable_col, distance_threshold): Getis-Ord Gi* hot spot analysis
- perform_spatial_clustering_analysis(segments, variable_col): Complete spatial clustering workflow

- prepare_curve_numbers(segments, soil_type): Prepare SCS Curve Numbers based on land cover
- adjust_cn_for_gsi(cn_current, density_sqft_per_acre): Adjust CN for green infrastructure impact
- calculate_runoff_volumes(segments, storm_events): Estimate runoff for design storm scenarios
- optimize_infrastructure_allocation(segments, total_infrastructure_sqft): Optimize GSI placement
- calculate_optimization_benefit(segments, storm_event): Quantify runoff reduction benefits
- perform_runoff_modeling(segments, storm_events, soil_type): Complete runoff modeling workflow

- fetch_ssurgo_soils_by_bbox(bbox): Download SSURGO soils data from USDA NRCS API
- fetch_nlcd_impervious(year): Instructions for NLCD imperviousness raster download
- fetch_fema_nfhl_by_bbox(bbox): Download FEMA flood zones via ArcGIS REST API
- fetch_noaa_atlas14_depths(lat, lon): Retrieve NOAA Atlas 14 precipitation depths
- clip_file_to_bbox(input_path, bbox, out_subdir, out_name, target_epsg): Clip spatial data to study area
- parse_bbox_arg(s): Parse bounding box string to coordinate dictionary
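The bounding-box parsing step could look roughly like the sketch below. It assumes the string is ordered west, south, east, north, which matches the example value "-122.36,47.58,-122.30,47.62" used in this README. The dictionary keys and the function name parse_bbox are hypothetical, and the actual parse_bbox_arg may return a different structure.

```python
def parse_bbox(s):
    """Parse 'west,south,east,north' into a dict of floats (hypothetical keys)."""
    parts = [p.strip() for p in s.split(",")]
    if len(parts) != 4:
        raise ValueError(f"expected 4 comma-separated values, got {len(parts)}")
    west, south, east, north = (float(p) for p in parts)
    # Reject boxes whose corners are swapped or degenerate.
    if not (west < east and south < north):
        raise ValueError("bbox must satisfy west < east and south < north")
    return {"west": west, "south": south, "east": east, "north": north}


bbox = parse_bbox("-122.36,47.58,-122.30,47.62")
print(bbox["west"], bbox["north"])  # -122.36 47.62
```

Validating the corner order up front means a mistyped box fails immediately instead of producing an empty download.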
Main Class: DataPipelineScheduler
- add_data_source(name, fetch_fn, refresh_days, dependencies): Register data source
- schedule_refresh(source_name): Schedule automated data refresh
- run_pipeline(force_refresh): Execute complete data acquisition pipeline
- generate_status_report(): Create data freshness and quality report
- export_metadata(): Export data provenance and lineage information
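The refresh_days and dependencies parameters suggest two pieces of logic: deciding whether a source is stale, and refreshing sources in dependency order. The sketch below shows one way to do both with the standard library. The helper names refresh_order and is_stale and the example source names are hypothetical; the scheduler's real internals may differ.

```python
from datetime import datetime, timedelta
from graphlib import TopologicalSorter  # standard library, Python 3.9+


def refresh_order(dependencies):
    """Return source names ordered so that dependencies come first.

    dependencies: dict mapping a source name to the set of names it depends on.
    """
    return list(TopologicalSorter(dependencies).static_order())


def is_stale(last_fetched, refresh_days, now):
    """True if the source was never fetched or is older than refresh_days."""
    return last_fetched is None or now - last_fetched > timedelta(days=refresh_days)


# Hypothetical sources and their dependencies.
deps = {
    "boundary": set(),
    "soils": {"boundary"},
    "vulnerability": {"soils", "boundary"},
}
order = refresh_order(deps)
print(order)  # ['boundary', 'soils', 'vulnerability']
```

TopologicalSorter also raises an error on circular dependencies, which gives an early check when sources are registered.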
Main Class: MultiJurisdictionConsolidator
- register_jurisdiction(name, bbox, data_sources): Register jurisdiction-specific data sources
- harmonize_schemas(): Standardize attribute schemas across jurisdictions
- consolidate_infrastructure(): Merge infrastructure data from multiple jurisdictions
- generate_acquisition_status(): Track data completeness across jurisdictions
Main Class: SeattleOpenDataClient
- fetch_gsi_facilities(bbox, facility_types): Download green infrastructure facilities
- fetch_stormwater_infrastructure(bbox): Download storm drains and catch basins
- fetch_land_use(bbox): Download zoning and land use data
- fetch_boundary(jurisdiction_name): Download municipal boundaries
- cache_and_validate(data, output_path): Cache and validate downloaded data
Main Class: NOAACDOClient
- __init__(api_key): Initialize with NOAA CDO API key
- fetch_precipitation_history(station_id, start_date, end_date): Historical precipitation data
- fetch_wet_season_totals(station_id, years): Aggregate October-March precipitation
- find_stations_near(lat, lon, radius_km): Locate nearby weather stations
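The October-March aggregation can be sketched without any API access. The example below assumes daily records are already available as (date, precipitation) pairs, and it labels each wet season by the calendar year in which it starts. Both the labeling convention and the function name wet_season_totals are assumptions; the client's actual method may group seasons differently.

```python
from collections import defaultdict
from datetime import date


def wet_season_totals(daily):
    """Sum October-March precipitation, keyed by the season's starting year.

    daily: iterable of (date, precipitation) pairs in any consistent unit.
    """
    totals = defaultdict(float)
    for day, precip in daily:
        if day.month >= 10:        # Oct-Dec: season starts this year
            totals[day.year] += precip
        elif day.month <= 3:       # Jan-Mar: season started the previous year
            totals[day.year - 1] += precip
        # April-September records fall outside the wet season and are skipped.
    return dict(totals)


# Hypothetical daily records (not real station data).
records = [
    (date(2022, 10, 15), 1.2),
    (date(2023, 1, 5), 0.8),
    (date(2023, 7, 4), 0.1),
    (date(2023, 11, 20), 2.0),
]
totals = wet_season_totals(records)
print(sorted(totals))  # [2022, 2023]
```

Keying by the starting year keeps a season that spans New Year in one bucket, so the January 2023 record is counted with October 2022.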
Main Class: NWSForecastClient
- get_gridpoint_forecast(lat, lon): Retrieve 7-day precipitation forecast
- get_extended_outlook(lat, lon): Retrieve 6-10 day outlook
- apply_climate_scenario(baseline_precip, scenario): Apply RCP climate projections
- estimate_future_vulnerability(current_vuln, scenario, horizon_year): Project future conditions
Main Class: USGSWaterServicesClient
- get_streamflow_current(site_code): Real-time streamflow data
- get_streamflow_history(site_code, start_date, end_date): Historical streamflow
- compare_to_flood_stage(site_code, current_flow): Compare to NWS flood stages
- find_nearby_gages(lat, lon, radius_km): Locate nearby stream gages

- load_results(output_dir): Load analysis results from output directory
- plot_correlation(gdf, output_dir): Create vulnerability vs. density scatter plot
- plot_quadrant_counts(gdf, output_dir): Visualize quadrant distribution bar chart
- plot_map(gdf, column, title, filename, output_dir, cmap): Generate thematic map
- main(): Generate complete visualization suite
Streamlit Interactive Dashboard Functions:
- load_segment_frame(): Load analysis segments for interactive exploration
- load_infrastructure_raw(): Load raw infrastructure point data
- apply_weighted_vulnerability(buffer_distance, weight_tuple): Recalculate vulnerability with custom weights
- compute_runoff_scenarios(serialized_segments, events): Interactive runoff modeling
- build_multilayer_map(data): Create interactive Folium map
- build_correlation_scatter(data): Create interactive Plotly scatter plot
- filter_segments(gdf, vuln_range, density_range, jurisdictions, quadrants): Dynamic segment filtering

- load_analysis_segments(): Load segments with all metrics
- load_infrastructure(): Load infrastructure facilities
- compute_summary_statistics(segments, infrastructure): Calculate summary metrics
- create_sample_charts_data(segments): Prepare chart-ready datasets
- export_lightweight_geojson(segments): Export simplified geometry for web display
- generate_data_manifest(stats, charts): Create metadata manifest

- convert_gpkg_to_shp(root_dir): Batch convert GeoPackage to Shapefile format

- merge_infrastructure(): Consolidate infrastructure data from multiple sources

- download_svi_2020(): Download CDC Social Vulnerability Index
- download_ssurgo_soils(): Download SSURGO soils via Web Soil Survey
- download_osm_infrastructure(): Extract green infrastructure from OpenStreetMap
- download_osm_rail(): Extract rail corridors from OpenStreetMap
- download_sound_transit_boundary(): Download Sound Transit service area

- validate_spatial_data(gdf, dataset_name): Validate geometry and CRS
- reproject_to_standard(gdf, target_epsg): Reproject to standard coordinate system
- create_buffers(gdf, distances_meters): Generate multiple buffer distances
- split_line_at_points(line, points): Segment corridor at station locations
- calculate_infrastructure_density(segments, infrastructure, buffer_gdf): Spatial join and density calculation

- calculate_runoff_depth(precip_inches, curve_number): SCS Curve Number runoff equation
- calculate_cn_from_imperviousness(imperv_pct, hsg): Derive CN from imperviousness
- correlation_analysis(x, y, method): Pearson and Spearman correlation
- classify_vulnerability(score, low_threshold, high_threshold): Classify vulnerability level
- assign_quadrant(vuln_score, density, vuln_median, density_median): Quadrant classification
- calculate_gap_index(vuln_score, density, adequacy_threshold): Compute protection gap metric
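For readers unfamiliar with the SCS Curve Number method behind the runoff functions above, the sketch below shows the standard equations: potential retention S = 1000/CN - 10, initial abstraction Ia = 0.2 S, and runoff Q = (P - Ia)^2 / (P - Ia + S) when P exceeds Ia. The composite CN assumes impervious surfaces have CN 98 and takes the pervious CN directly as an argument. That differs from the hsg parameter listed above, so treat the signatures and the example numbers as illustrative rather than as the repository's code.

```python
def cn_from_imperviousness(imperv_pct, pervious_cn):
    """Area-weighted composite CN; impervious surfaces are assigned CN 98."""
    frac = imperv_pct / 100.0
    return 98.0 * frac + pervious_cn * (1.0 - frac)


def runoff_depth(precip_inches, curve_number):
    """SCS Curve Number runoff depth in inches, using Ia = 0.2 * S."""
    s = 1000.0 / curve_number - 10.0   # potential maximum retention (inches)
    ia = 0.2 * s                       # initial abstraction (inches)
    if precip_inches <= ia:
        return 0.0                     # rainfall is fully abstracted, no runoff
    return (precip_inches - ia) ** 2 / (precip_inches - ia + s)


# Illustrative values only: a 3.0 inch storm is not a NOAA Atlas 14 depth.
print(runoff_depth(3.0, 80))             # 1.25
print(cn_from_imperviousness(50, 74))    # 86.0
```

Because runoff grows faster than linearly with CN at a fixed storm depth, small reductions in effective CN from permeable pavement can produce a noticeable drop in modeled runoff.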
- Python 3.9 or higher
- pip package manager
- GDAL/OGR (for spatial operations)
pip install -r requirements.txt
- Geospatial: geopandas, rasterio, fiona, shapely, rtree, pyproj
- Analysis: numpy, pandas, scipy, scikit-learn, statsmodels
- Spatial Statistics: pysal, esda, libpysal, splot
- Visualization: matplotlib, seaborn, contextily
- Utilities: click, rasterstats
- Ensure required datasets listed in docs/research_question_alignment.md are present.
- Run the alignment workflow:
python scripts/geospatial_analysis.py \
--rail data/raw/rail/corridor.shp \
--infrastructure data/raw/infrastructure/permeable_pavement.shp \
--imperviousness data/raw/landcover/nlcd_2019_impervious_aoi.tif \
--dem data/raw/elevation/dem_aoi.tif \
--soils data/processed/soils/ssurgo_aoi.gpkg \
--config config.yaml \
--verbose
- Review the synthesized findings in data/outputs/analysis_summary.txt (mirrored in reports/), which provides the direct answer to the research question along with actionable statistics.
Complete 6-phase vulnerability and alignment assessment:
python scripts/geospatial_analysis.py \
--rail data/raw/rail/corridor.shp \
--infrastructure data/raw/infrastructure/permeable_pavement.shp \
--imperviousness data/raw/landcover/nlcd_2019_impervious_aoi.tif \
--dem data/raw/elevation/dem_aoi.tif \
--soils data/processed/soils/ssurgo_aoi.gpkg \
--config config.yaml \
--verbose
Download external datasets for your study area:
python download_data.py --bbox "-122.36,47.58,-122.30,47.62" --verbose
Download supplementary datasets (CDC SVI, OSM data, etc.):
python scripts/download_additional_data.py
Generate publication-ready maps and charts:
python scripts/visualize_results.py --output-dir data/outputs_final
Launch web-based Streamlit dashboard for dynamic exploration:
streamlit run scripts/dashboard.py
Features:
- Interactive vulnerability weight adjustment
- Real-time runoff scenario modeling
- Multi-layer mapping with Folium
- Correlation analysis and quadrant filtering
- Data export and download capabilities
Prepare lightweight datasets for web dashboard:
python scripts/generate_dashboard_data.py
Standalone spatial statistics module:
from scripts.spatial_clustering import perform_spatial_clustering_analysis
results = perform_spatial_clustering_analysis(segments, variable_col='gap_index')
Standalone hydrological modeling:
from scripts.runoff_modeling import perform_runoff_modeling
results = perform_runoff_modeling(segments, storm_events=['2-year', '10-year', '25-year'])
Automate data refresh and quality monitoring:
python scripts/data_pipeline_scheduler.py --schedule daily --report
Batch convert GeoPackage to Shapefile:
python scripts/convert_formats.py
Consolidate infrastructure data from multiple sources:
python scripts/merge_data.py
Interactive step-by-step analysis:
jupyter notebook notebooks/interactive_exploration.ipynb

# bbox is the study-area bounding box, defined beforehand
from scripts.integrations.seattle_opendata import SeattleOpenDataClient
client = SeattleOpenDataClient()
facilities = client.fetch_gsi_facilities(bbox, facility_types=['rain_garden', 'bioswale'])

from scripts.integrations.noaa_cdo import NOAACDOClient
client = NOAACDOClient(api_key='YOUR_KEY')
precip = client.fetch_precipitation_history('GHCND:USW00024233', '2020-01-01', '2024-12-31')

from scripts.integrations.nws_forecast import NWSForecastClient
client = NWSForecastClient()
forecast = client.get_gridpoint_forecast(47.6062, -122.3321)

from scripts.integrations.usgs_water import USGSWaterServicesClient
client = USGSWaterServicesClient()
streamflow = client.get_streamflow_current('12113000')  # Green River near Auburn (feeds the Duwamish River)

# bbox and data_sources are defined beforehand for each jurisdiction
from scripts.integrations.multi_jurisdiction import MultiJurisdictionConsolidator
consolidator = MultiJurisdictionConsolidator()
consolidator.register_jurisdiction('Seattle', bbox, data_sources)
consolidator.register_jurisdiction('Tacoma', bbox, data_sources)
consolidated = consolidator.consolidate_infrastructure()

Before running analysis, download required datasets:
# Download external data
python download_data.py --bbox "-122.36,47.58,-122.30,47.62"
See DATA_ACQUISITION_WORKFLOW.md for complete instructions.
Use the tool programmatically:
from scripts.geospatial_analysis import GeospatialAnalysisTool
# Initialize
tool = GeospatialAnalysisTool(
data_dir='data',
output_dir='data/outputs',
config_path='config.yaml'
)
# Load data (required)
tool.load_data(
rail_path='data/raw/rail/corridor.shp',
infrastructure_path='data/raw/infrastructure/permeable_pavement.shp'
)
# Run analyses with real data
tool.calculate_vulnerability(
imperviousness_raster='data/raw/landcover/nlcd_2019_impervious_aoi.tif',
dem_path='data/raw/elevation/dem_aoi.tif',
soils_path='data/processed/soils/ssurgo_aoi.gpkg'
)
tool.analyze_infrastructure_density()
tool.assess_alignment()
# Generate outputs
tool.generate_report()
tool.save_results()
Use download_data.py to fetch external datasets for your area of interest:
python download_data.py --bbox "-122.36,47.58,-122.30,47.62" --verbose
This script will:
- ✅ Automatically download: FEMA flood zones, SSURGO soils metadata
- ⚠️ Provide instructions for: NLCD imperviousness, elevation/DEM, rail corridors, infrastructure
Automated (via API):
- fetch_fema_nfhl_by_bbox(bbox): FEMA NFHL flood zones (✅ working)
- fetch_ssurgo_soils_by_bbox(bbox): USDA SSURGO soils (⚠️ may need manual processing)
Manual Download Required:
- NLCD imperviousness: https://www.mrlc.gov/viewer/
- Elevation/DEM: https://apps.nationalmap.gov/downloader/
- Rail corridors: WSDOT Portal, OSM, or local agencies
- Infrastructure: Seattle Open Data or local jurisdiction
See DATA_ACQUISITION_WORKFLOW.md for complete instructions.
Downloaded files are cached in data/raw/, and processed data are saved to data/processed/ as GeoPackages reprojected to Washington State Plane South (EPSG:2927).
GeospatialAnalysis/
│
├── COMPLETE_METHODOLOGY_GUIDE.txt    # Comprehensive methodology documentation
├── README.md                         # This file
├── LICENSE                           # MIT License
├── requirements.txt                  # Python dependencies
│
├── data/                             # Data directory
│   ├── raw/                          # Raw input data
│   │   ├── rail/                     # Rail corridor shapefiles
│   │   ├── infrastructure/           # Permeable pavement data
│   │   ├── elevation/                # DEM/LiDAR data
│   │   ├── soils/                    # SSURGO soil data
│   │   ├── landcover/                # NLCD imperviousness
│   │   └── drainage/                 # Storm drain infrastructure
│   ├── processed/                    # Intermediate processing outputs
│   └── outputs/                      # Final analysis results
│
├── scripts/                          # Analysis scripts
│   ├── geospatial_analysis.py        # Main analysis tool
│   └── utils/                        # Utility functions
│       ├── __init__.py
│       ├── gis_functions.py          # GIS operations
│       └── statistics.py             # Statistical functions
│
├── analysis/                         # Advanced analysis scripts
│   └── (R scripts for spatial regression)
│
└── figures/                          # Output visualizations
    ├── maps/                         # Map outputs
    └── charts/                       # Chart outputs
- Rail Corridor: Line or polygon shapefile of rail corridor
- Infrastructure: Point or polygon shapefile of permeable pavement facilities
- Elevation: LiDAR-derived DEM (3-6 foot resolution)
- Soils: SSURGO hydrologic soil groups
- Land Cover: NLCD imperviousness raster
- Drainage: Storm drain lines and catch basins
- Precipitation: NOAA Atlas 14 design storm depths
- Jurisdictions: Municipal boundaries for jurisdictional analysis
See COMPLETE_METHODOLOGY_GUIDE.txt Section 1 for detailed data acquisition instructions:
- WSDOT GeoData Portal (rail infrastructure)
- Seattle Open Data (permeable pavement, drainage)
- USDA Web Soil Survey (soils)
- USGS National Map (elevation, land cover)
- NOAA Climate Data (precipitation)
- analysis_segments.shp: Segment-level analysis results with all metrics
- infrastructure_processed.shp: Processed infrastructure data

- analysis_summary.txt: Text summary of key findings
- analysis_segments.csv: Tabular data for further analysis

- vuln_mean: Composite vulnerability index (0-10 scale)
- vuln_class: Vulnerability classification (Low/Moderate/High)
- density_sqft_per_acre: Infrastructure density
- facility_count: Number of facilities per segment
- quadrant: Alignment quadrant classification
- gap_index: Protection gap metric
- imperv_mean: Mean imperviousness percentage
- buffer_area_acres: Analysis area in acres
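The quadrant column can be read as a median split on vulnerability and density. The sketch below follows the assign_quadrant signature listed earlier and treats Q3 as high vulnerability with low coverage, consistent with the sample summary's "Priority gap segments (Q3)". The labels for the other three quadrants, the tie handling (values equal to the median count as high), and the segment values are assumptions for illustration.

```python
from statistics import median


def assign_quadrant(vuln_score, density, vuln_median, density_median):
    """Classify a segment by a median split on vulnerability and density."""
    high_vuln = vuln_score >= vuln_median
    high_density = density >= density_median
    if high_vuln and high_density:
        return "Q1"   # high vulnerability, high coverage
    if not high_vuln and high_density:
        return "Q2"   # low vulnerability, high coverage
    if high_vuln and not high_density:
        return "Q3"   # high vulnerability, low coverage: priority gap
    return "Q4"       # low vulnerability, low coverage


# Hypothetical segments as (vuln_mean, density_sqft_per_acre) pairs.
segments = [(7.5, 200.0), (3.0, 900.0), (8.0, 1200.0), (2.5, 100.0)]
vuln_median = median(v for v, _ in segments)
density_median = median(d for _, d in segments)
quadrants = [assign_quadrant(v, d, vuln_median, density_median) for v, d in segments]
print(quadrants)  # ['Q3', 'Q2', 'Q1', 'Q4']
```

Using medians rather than fixed thresholds makes the split relative to the corridor being studied, so roughly half the segments fall on each side of each axis.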
This tool implements a six-phase methodology:
- Coordinate system standardization
- Buffer generation (100m, 250m, 500m)
- Corridor segmentation
- Data validation
- Topographic position analysis
- Slope calculation
- Soil drainage classification
- Imperviousness assessment
- Drainage proximity
- Weighted composite index
- Spatial join with buffers
- Density calculation (sq ft/acre)
- Temporal cohort analysis
- Jurisdictional comparison
- Pearson and Spearman correlation
- Quadrant classification
- Gap index calculation
- Multiple regression modeling
- Global Moran's I
- Local Moran's I (LISA)
- Getis-Ord Gi* hot spots
- SCS Curve Number preparation
- Runoff volume calculation
- Optimization scenarios
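The weighted composite index step can be sketched as a weighted mean of the five factor scores (topographic position, slope, soil drainage, imperviousness, drainage proximity). The example assumes each factor has already been rescaled to the 0-10 range used by vuln_mean. The weights, factor keys, and function name composite_vulnerability are hypothetical; the actual weights belong in the methodology guide or config.yaml.

```python
def composite_vulnerability(scores, weights):
    """Weighted mean of factor scores; both dicts must use the same keys."""
    if scores.keys() != weights.keys():
        raise ValueError("scores and weights must cover the same factors")
    total_weight = sum(weights.values())
    if total_weight <= 0:
        raise ValueError("weights must sum to a positive value")
    # Dividing by the total lets callers pass weights that do not sum to 1.
    return sum(scores[k] * weights[k] for k in scores) / total_weight


# Hypothetical weights and 0-10 factor scores for one segment.
weights = {"topography": 0.25, "slope": 0.15, "soil": 0.20,
           "imperviousness": 0.25, "drainage": 0.15}
scores = {"topography": 8.0, "slope": 4.0, "soil": 6.0,
          "imperviousness": 9.0, "drainage": 2.0}
index = composite_vulnerability(scores, weights)
print(round(index, 2))  # 6.35
```

Normalizing by the weight total keeps the result on the same 0-10 scale as the inputs, which is convenient when weights are adjusted interactively.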
VULNERABILITY ASSESSMENT
Mean vulnerability: 5.23
High vulnerability segments: 8
Moderate vulnerability segments: 12
Low vulnerability segments: 5
INFRASTRUCTURE DENSITY
Mean density: 847.3 sq ft/acre
Median density: 592.1 sq ft/acre
Segments with zero infrastructure: 3
ALIGNMENT ANALYSIS
Correlation (r): -0.342
P-value: 0.0156
⚠ Significant INVERSE correlation detected
Priority gap segments (Q3): 7 segments
Mean gap index: 3.45
MIT License - see LICENSE file for details
Copyright (c) 2025 Christopher Tritt
This tool implements methodologies developed for rail corridor flood resilience analysis in the Seattle-Tacoma region. The comprehensive methodology guide provides detailed step-by-step instructions for conducting spatial analysis of green infrastructure and vulnerability indicators.