
Commit e6c501c: rename saber
1 parent f57b495

20 files changed: +159 -154 lines

README.md (+70 -73)
@@ -1,7 +1,4 @@
-# Hydrological Bias Correction on Large Mode
-This repository contains Python code which can be used to calibrate biased, non-gridded hydrologic models. Most of the
-code in this repository will work on any model's results. The data preprocessing and automated calibration functions
-are programmed to expect data following the GEOGloWS ECMWF Streamflow Service's structure and format.
+# Stream Analysis for Bias Estimation and Reduction

## Theory
Basins and streams will be used interchangeably to refer to the specific stream subunit.
@@ -45,16 +42,16 @@ file formats are acceptable
5. Historical simulated discharge for each stream segment and for as long (temporally) as is available.
6. Observed discharge data for as many stream reaches as possible within the target region.
7. The units of the simulation and observation data must be in the same units.
-8. A working directory folder on the computer where the scripts are going to be run.
+8. A working directory on the computer where the scripts are going to be run.

## Process
### 1 Create a Working Directory

```python
-import hbc
+import saber as saber

path_to_working_directory = '/my/file/path'
-hbc.prep.scaffold_workdir(path_to_working_directory)
+saber.prep.scaffold_workdir(path_to_working_directory)
```

Your working directory should exactly like this.
@@ -112,12 +109,12 @@ gdf.to_file('/file/path/to/save', driver='GeoJSON')

Your table should look like this:

-downstream_model_id | model_id | drainage_area_mod | stream_order | x | y |
-------------------- | ----------------- | ----------------- | ------------- | --- | --- |
-unique_stream_# | unique_stream_# | area in km^2 | stream_order | ## | ## |
-unique_stream_# | unique_stream_# | area in km^2 | stream_order | ## | ## |
-unique_stream_# | unique_stream_# | area in km^2 | stream_order | ## | ## |
-... | ... | ... | ... | ... | ... |
+| downstream_model_id | model_id | drainage_area_mod | stream_order | x | y |
+|---------------------|-----------------|-------------------|--------------|-----|-----|
+| unique_stream_# | unique_stream_# | area in km^2 | stream_order | ## | ## |
+| unique_stream_# | unique_stream_# | area in km^2 | stream_order | ## | ## |
+| unique_stream_# | unique_stream_# | area in km^2 | stream_order | ## | ## |
+| ... | ... | ... | ... | ... | ... |

2. Prepare a csv of the attribute table of the gauge locations shapefile.
- You need the columns:
@@ -127,12 +124,12 @@ unique_stream_# | unique_stream_# | area in km^2 | stream_order | ##

Your table should look like this (column order is irrelevant):

-model_id | drainage_area_obs | gauge_id
------------------ | ----------------- | ------------
-unique_stream_num | area in km^2 | unique_gauge_num
-unique_stream_num | area in km^2 | unique_gauge_num
-unique_stream_num | area in km^2 | unique_gauge_num
-... | ... | ...
+| model_id | drainage_area_obs | gauge_id |
+|-------------------|-------------------|------------------|
+| unique_stream_num | area in km^2 | unique_gauge_num |
+| unique_stream_num | area in km^2 | unique_gauge_num |
+| unique_stream_num | area in km^2 | unique_gauge_num |
+| ... | ... | ... |

Your project's working directory now looks like
```
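A minimal sketch of building this gauge table csv from the gauge shapefile's attribute table, assuming geopandas (already used for the drain table above); the column names are the ones required here and the file paths are placeholders, not paths from the repository:

```python
import geopandas as gpd

# read the gauge locations shapefile and keep only the required columns
gdf = gpd.read_file('/path/to/gauge_locations.shp')
gauge_table = gdf[['model_id', 'drainage_area_obs', 'gauge_id']]

# save the csv into the project working directory (placeholder path)
gauge_table.to_csv('/path/to/project/directory/gauge_table.csv', index=False)
```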
@@ -162,17 +159,17 @@ The Assignments Table is the core of the regional bias correction method it is a
stream segment in the model and several columns of other information which are filled in during the RBC algorithm. It
looks like this:

-downstream_model_id | model_id | drainage_area | stream_order | gauge_id
-------------------- | ----------------- | ------------- | ------------ | ----------------
-unique_stream_num | unique_stream_num | area in km^2 | stream_order | unique_gauge_num
-unique_stream_num | unique_stream_num | area in km^2 | stream_order | unique_gauge_num
-unique_stream_num | unique_stream_num | area in km^2 | stream_order | unique_gauge_num
-... | ... | ... | ... | ...
+| downstream_model_id | model_id | drainage_area | stream_order | gauge_id |
+|---------------------|-------------------|---------------|--------------|------------------|
+| unique_stream_num | unique_stream_num | area in km^2 | stream_order | unique_gauge_num |
+| unique_stream_num | unique_stream_num | area in km^2 | stream_order | unique_gauge_num |
+| unique_stream_num | unique_stream_num | area in km^2 | stream_order | unique_gauge_num |
+| ... | ... | ... | ... | ... |

```python
-import hbc
+import saber as saber
workdir = '/path/to/project/directory/'
-hbc.prep.gen_assignments_table(workdir)
+saber.prep.gen_assignments_table(workdir)
```

Your project's working directory now looks like
@@ -211,45 +208,45 @@ Use the dat

1. Create a single large csv of the historical simulation data with a datetime column and 1 column per stream segment labeled by the stream's ID number.

-datetime | model_id_1 | model_id_2 | model_id_3
------------- | ----------- | ----------- | -----------
-1979-01-01 | 50 | 50 | 50
-1979-01-02 | 60 | 60 | 60
-1979-01-03 | 70 | 70 | 70
-... | ... | ... | ...
-
+| datetime | model_id_1 | model_id_2 | model_id_3 |
+|------------|------------|------------|------------|
+| 1979-01-01 | 50 | 50 | 50 |
+| 1979-01-02 | 60 | 60 | 60 |
+| 1979-01-03 | 70 | 70 | 70 |
+| ... | ... | ... | ... |
+
2. Process the large simulated discharge csv to create a 2nd csv with the flow duration curve on each segment (script provided).

-p_exceed | model_id_1 | model_id_2 | model_id_3
------------- | ----------- | ----------- | -----------
-100 | 0 | 0 | 0
-99 | 10 | 10 | 10
-98 | 20 | 20 | 20
-... | ... | ... | ...
+| p_exceed | model_id_1 | model_id_2 | model_id_3 |
+|----------|------------|------------|------------|
+| 100 | 0 | 0 | 0 |
+| 99 | 10 | 10 | 10 |
+| 98 | 20 | 20 | 20 |
+| ... | ... | ... | ... |

3. Process the large historical discharge csv to create a 3rd csv with the monthly averages on each segment (script provided).

-month | model_id_1 | model_id_2 | model_id_3
------------- | ----------- | ----------- | -----------
-1 | 60 | 60 | 60
-2 | 30 | 30 | 30
-3 | 70 | 70 | 70
-... | ... | ... | ...
+| month | model_id_1 | model_id_2 | model_id_3 |
+|-------|------------|------------|------------|
+| 1 | 60 | 60 | 60 |
+| 2 | 30 | 30 | 30 |
+| 3 | 70 | 70 | 70 |
+| ... | ... | ... | ... |

```python
-import hbc
+import saber as saber

workdir = '/path/to/working/directory'

-hbc.prep.historical_simulation(
+saber.prep.historical_simulation(
    workdir,
    '/path/to/historical/simulation/netcdf.nc' # optional - if nc not stored in data_inputs folder
)
-hbc.prep.hist_sim_table(
+saber.prep.hist_sim_table(
    workdir,
    '/path/to/historical/simulation/netcdf.nc' # optional - if nc not stored in data_inputs folder
)
-hbc.prep.observed_data(
+saber.prep.observed_data(
    workdir,
    '/path/to/obs/csv/directory' # optional - if csvs not stored in workdir/data_inputs/obs_csvs
)
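The two derived tables above (flow duration curve and monthly averages) can be illustrated with plain pandas. This is a rough sketch of what the packaged prep functions produce, assuming the wide csv layout shown in step 1; the file name is a placeholder and the exact implementation inside the package may differ:

```python
import numpy as np
import pandas as pd

# wide table: a datetime column plus one discharge column per model_id
sim = pd.read_csv('hist_simulation.csv', index_col='datetime', parse_dates=True)

# flow duration curve: discharge exceeded p percent of the time, per segment
p_exceed = np.arange(100, -1, -1)
fdc = pd.DataFrame(
    {col: np.percentile(sim[col].dropna(), 100 - p_exceed) for col in sim.columns},
    index=pd.Index(p_exceed, name='p_exceed'),
)

# monthly averages: mean discharge for each calendar month, per segment
monthly = sim.groupby(sim.index.month).mean()
monthly.index.name = 'month'
```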
@@ -296,10 +293,10 @@ For each of the following, generate and store clusters for many group sizes- bet
Use this code:

```python
-import hbc
+import saber as saber

workdir = '/path/to/project/directory/'
-hbc.cluster.generate(workdir)
+saber.cluster.generate(workdir)
```

This function creates trained kmeans models saved as pickle files, plots (from matplotlib) of what each of the clusters
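As a rough illustration of that step (the packaged call is saber.cluster.generate(workdir)): k-means models are fit on per-segment flow characteristics and pickled once per candidate group size. The scikit-learn usage, the choice of features, the range of group sizes, and the file names below are assumptions for illustration, not the package's exact implementation:

```python
import pickle

import pandas as pd
from sklearn.cluster import KMeans
from sklearn.preprocessing import StandardScaler

# one row of features per stream segment, built from the fdc table (columns = model_id)
fdc = pd.read_csv('fdc_table.csv', index_col='p_exceed')
features = StandardScaler().fit_transform(fdc.T.values)

# fit and store a kmeans model for each candidate number of clusters
for n_clusters in range(2, 13):
    model = KMeans(n_clusters=n_clusters).fit(features)
    with open(f'kmeans-{n_clusters}.pickle', 'wb') as f:
        pickle.dump(model, f)
```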
@@ -354,12 +351,12 @@ The justification for this is obvious. The observations are the actual streamflo
- The reason listed for this assignment is "gauged"

```python
-import hbc
+import saber as saber

-# assign_table = pandas DataFrame (see hbc.table module)
+# assign_table = pandas DataFrame (see saber.table module)
workdir = '/path/to/project/directory/'
-assign_table = hbc.table.read(workdir)
-hbc.assign.gauged(assign_table)
+assign_table = saber.table.read(workdir)
+saber.assign.gauged(assign_table)
```

### 7 Assign basins by Propagation (hydraulically connected to a gauge)
@@ -375,12 +372,12 @@ be less sensitive to changes in flows up stream, may connect basins with differe
i is the number of stream segments up/down from the gauge the river is.

```python
-import hbc
+import saber as saber

-# assign_table = pandas DataFrame (see hbc.table module)
+# assign_table = pandas DataFrame (see saber.table module)
workdir = '/path/to/project/directory/'
-assign_table = hbc.table.read(workdir)
-hbc.assign.propagation(assign_table)
+assign_table = saber.table.read(workdir)
+saber.assign.propagation(assign_table)
```

### 8 Assign basins by Clusters (hydrologically similar basins)
@@ -391,12 +388,12 @@ Using the results of the optimal clusters
- Review assignments spatially. Run tests and view improvements. Adjust clusters and reassign as necessary.

```python
-import hbc
+import saber as saber

-# assign_table = pandas DataFrame (see hbc.table module)
+# assign_table = pandas DataFrame (see saber.table module)
workdir = '/path/to/project/directory/'
-assign_table = hbc.table.read(workdir)
-hbc.assign.clusters_by_dist(assign_table)
+assign_table = saber.table.read(workdir)
+saber.assign.clusters_by_dist(assign_table)
```

### 9 Generate GIS files of the assignments
@@ -405,18 +402,18 @@ use to visualize the results of this process. These GIS files help you investiga
used at each step. Use this to monitor the results.

```python
-import hbc
+import saber as saber

workdir = '/path/to/project/directory/'
-assign_table = hbc.table.read(workdir)
+assign_table = saber.table.read(workdir)
drain_shape = '/my/file/path/'
-hbc.gis.clip_by_assignment(workdir, assign_table, drain_shape)
-hbc.gis.clip_by_cluster(workdir, assign_table, drain_shape)
-hbc.gis.clip_by_unassigned(workdir, assign_table, drain_shape)
+saber.gis.clip_by_assignment(workdir, assign_table, drain_shape)
+saber.gis.clip_by_cluster(workdir, assign_table, drain_shape)
+saber.gis.clip_by_unassigned(workdir, assign_table, drain_shape)

# or if you have a specific set of ID's to check on
list_of_model_ids = [123, 456, 789]
-hbc.gis.clip_by_ids(workdir, list_of_model_ids, drain_shape)
+saber.gis.clip_by_ids(workdir, list_of_model_ids, drain_shape)
```

After this step, your project directory should look like this:
@@ -509,13 +506,13 @@ excluded each time. The code provided will help you partition your gauge table i
against the observed data which was withheld from the bias correction process.

```python
-import hbc
+import saber as saber
workdir = '/path/to/project/directory'
drain_shape = '/path/to/drainageline/gis/file.shp'
obs_data_dir = '/path/to/obs/data/directory' # optional - if data not in workdir/data_inputs/obs_csvs

-hbc.validate.sample_gauges(workdir)
-hbc.validate.run_series(workdir, drain_shape, obs_data_dir)
+saber.validate.sample_gauges(workdir)
+saber.validate.run_series(workdir, drain_shape, obs_data_dir)
```

After this step your working directory should look like this:
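Taken together, the user-facing effect of this commit on the README is the package name: the workflow is unchanged apart from importing saber instead of hbc. A condensed sketch of that workflow under the new name, using only calls that appear in the diff above, with placeholder paths:

```python
import saber as saber

workdir = '/path/to/project/directory'
drain_shape = '/path/to/drainageline/gis/file.shp'
obs_data_dir = '/path/to/obs/data/directory'

saber.prep.scaffold_workdir(workdir)          # 1. create the working directory
saber.prep.gen_assignments_table(workdir)     # generate the assignments table
saber.prep.historical_simulation(workdir)     # prepare the simulated discharge data
saber.prep.observed_data(workdir)             # prepare the observed discharge data
saber.cluster.generate(workdir)               # cluster the simulated flow data

assign_table = saber.table.read(workdir)
saber.assign.gauged(assign_table)             # 6. gauged basins
saber.assign.propagation(assign_table)        # 7. hydraulically connected basins
saber.assign.clusters_by_dist(assign_table)   # 8. hydrologically similar basins

saber.gis.clip_by_assignment(workdir, assign_table, drain_shape)  # 9. GIS review files
saber.validate.sample_gauges(workdir)         # validation
saber.validate.run_series(workdir, drain_shape, obs_data_dir)
```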

examples/colombia-magdalena/magdalena_example.py (+23 -23)
@@ -2,7 +2,7 @@

import numpy as np

-import hbc
+import saber


np.seterr(all="ignore")
@@ -13,47 +13,47 @@
obs_data_dir = os.path.join(workdir, 'data_inputs', 'obs_csvs')

# Only need to do this step 1x ever
-# hbc.prep.scaffold_working_directory(workdir)
+# saber.prep.scaffold_working_directory(workdir)

# Create the gauge_table and drain_table.csv
# Scripts not provided, check readme for instructions

# Generate the assignments table
-# assign_table = hbc.table.gen(workdir)
-# hbc.table.cache(workdir, assign_table)
+# assign_table = saber.table.gen(workdir)
+# saber.table.cache(workdir, assign_table)
# Or read the existing table
-# assign_table = hbc.table.read(workdir)
+# assign_table = saber.table.read(workdir)

# Prepare the observation and simulation data
# Only need to do this step 1x ever
-# hbc.prep.historical_simulation(os.path.join(workdir, 'data_simulated', 'south_america_era5_qout.nc'), workdir)
-# hbc.prep.observation_data(workdir)
+# saber.prep.historical_simulation(os.path.join(workdir, 'data_simulated', 'south_america_era5_qout.nc'), workdir)
+# saber.prep.observation_data(workdir)

# Generate the clusters using the historical simulation data
-# hbc.cluster.generate(workdir)
-# assign_table = hbc.cluster.summarize(workdir, assign_table)
-# hbc.table.cache(workdir, assign_table)
+# saber.cluster.generate(workdir)
+# assign_table = saber.cluster.summarize(workdir, assign_table)
+# saber.table.cache(workdir, assign_table)

# Assign basins which are gauged and propagate those gauges
-# assign_table = hbc.assign.gauged(assign_table)
-# assign_table = hbc.assign.propagation(assign_table)
-# assign_table = hbc.assign.clusters_by_dist(assign_table)
-# todo assign_table = hbc.assign.clusters_by_monavg(assign_table)
+# assign_table = saber.assign.gauged(assign_table)
+# assign_table = saber.assign.propagation(assign_table)
+# assign_table = saber.assign.clusters_by_dist(assign_table)
+# todo assign_table = saber.assign.clusters_by_monavg(assign_table)

# Cache the assignments table with the updates
-# hbc.table.cache(workdir, assign_table)
+# saber.table.cache(workdir, assign_table)

# Generate GIS files so you can go explore your progress graphically
-# hbc.gis.clip_by_assignment(workdir, assign_table, drain_shape)
-# hbc.gis.clip_by_cluster(workdir, assign_table, drain_shape)
-# hbc.gis.clip_by_unassigned(workdir, assign_table, drain_shape)
+# saber.gis.clip_by_assignment(workdir, assign_table, drain_shape)
+# saber.gis.clip_by_cluster(workdir, assign_table, drain_shape)
+# saber.gis.clip_by_unassigned(workdir, assign_table, drain_shape)

# Compute the corrected simulation data
-# assign_table = hbc.table.read(workdir)
-# hbc.calibrate_region(workdir, assign_table)
-# vtab = hbc.validate.gen_val_table(workdir)
-hbc.gis.validation_maps(workdir, gauge_shape)
-hbc.analysis.plot(workdir, obs_data_dir, 9007721)
+# assign_table = saber.table.read(workdir)
+# saber.calibrate_region(workdir, assign_table)
+# vtab = saber.validate.gen_val_table(workdir)
+saber.gis.validation_maps(workdir, gauge_shape)
+saber.analysis.plot(workdir, obs_data_dir, 9007721)


# import pandas as pd

examples/example_inputs.py (+2 -2)
@@ -2,14 +2,14 @@


# COLOMBIA
-workdir = '/Users/rchales/data/regional-bias-correction/colombia-magdalena'
+workdir = '/Users/rchales/data/saber/colombia-magdalena'
drain_shape = os.path.join(workdir, 'gis_inputs', 'magdalena_dl_attrname_xy.json')
gauge_shape = os.path.join(workdir, 'gis_inputs', 'ideam_stations.json')
obs_data_dir = os.path.join(workdir, 'data_inputs', 'obs_csvs')
hist_sim_nc = os.path.join(workdir, 'data_inputs', 'south_america_era5_qout.nc')

# TEXAS
-workdir = '/Users/rchales/data/regional-bias-correction/texas'
+workdir = '/Users/rchales/data/saber/texas'
drain_shape = os.path.join(workdir, 'shapefiles', 'texas-dl.json')
gauge_shape = os.path.join(workdir, 'shapefiles', 'texas-gauges.shp')
obs_data_dir = os.path.join(workdir, 'data_inputs', 'obs_csvs')
