- # Hydrological Bias Correction on Large Models
- This repository contains Python code which can be used to calibrate biased, non-gridded hydrologic models. Most of the
- code in this repository will work on any model's results. The data preprocessing and automated calibration functions
- are programmed to expect data following the GEOGloWS ECMWF Streamflow Service's structure and format.
+ # Stream Analysis for Bias Estimation and Reduction

## Theory

Basins and streams will be used interchangeably to refer to the specific stream subunit.
@@ -45,16 +42,16 @@ file formats are acceptable
5. Historical simulated discharge for each stream segment and for as long (temporally) as is available.
6. Observed discharge data for as many stream reaches as possible within the target region.
7. The simulation and observation data must be in the same units.
- 8. A working directory folder on the computer where the scripts are going to be run.
+ 8. A working directory on the computer where the scripts are going to be run.

## Process
### 1 Create a Working Directory
```python
- import hbc
+ import saber as saber

path_to_working_directory = '/my/file/path'
- hbc.prep.scaffold_workdir(path_to_working_directory)
+ saber.prep.scaffold_workdir(path_to_working_directory)
```

Your working directory should look exactly like this.
@@ -112,12 +109,12 @@ gdf.to_file('/file/path/to/save', driver='GeoJSON')
Your table should look like this:

- downstream_model_id | model_id | drainage_area_mod | stream_order | x | y |
- ------------------- | ----------------- | ----------------- | ------------- | --- | --- |
- unique_stream_# | unique_stream_# | area in km^2 | stream_order | ## | ## |
- unique_stream_# | unique_stream_# | area in km^2 | stream_order | ## | ## |
- unique_stream_# | unique_stream_# | area in km^2 | stream_order | ## | ## |
- ... | ... | ... | ... | ... | ... |
+ | downstream_model_id | model_id | drainage_area_mod | stream_order | x | y |
+ | ------------------- | ---------------- | ----------------- | ------------ | --- | --- |
+ | unique_stream_# | unique_stream_# | area in km^2 | stream_order | ## | ## |
+ | unique_stream_# | unique_stream_# | area in km^2 | stream_order | ## | ## |
+ | unique_stream_# | unique_stream_# | area in km^2 | stream_order | ## | ## |
+ | ... | ... | ... | ... | ... | ... |

2. Prepare a csv of the attribute table of the gauge locations shapefile.
    - You need the columns:
@@ -127,12 +124,12 @@ unique_stream_# | unique_stream_# | area in km^2 | stream_order | ##
Your table should look like this (column order is irrelevant):

- model_id | drainage_area_obs | gauge_id
- ----------------- | ----------------- | ------------
- unique_stream_num | area in km^2 | unique_gauge_num
- unique_stream_num | area in km^2 | unique_gauge_num
- unique_stream_num | area in km^2 | unique_gauge_num
- ... | ... | ...
+ | model_id | drainage_area_obs | gauge_id |
+ | ----------------- | ----------------- | ---------------- |
+ | unique_stream_num | area in km^2 | unique_gauge_num |
+ | unique_stream_num | area in km^2 | unique_gauge_num |
+ | unique_stream_num | area in km^2 | unique_gauge_num |
+ | ... | ... | ... |
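
Both csvs can usually be exported straight from the shapefile attribute tables. A minimal sketch, assuming geopandas (the library implied by the `gdf.to_file(..., driver='GeoJSON')` call above) and that the attribute columns already carry the names shown in the two tables; all file paths and output names here are placeholders:

```python
import geopandas as gpd

# Sketch only: paths are placeholders and the columns are assumed to already
# match the tables above (rename them first if your attributes differ).
drain_gdf = gpd.read_file('/path/to/drainageline/shapefile.shp')
drain_cols = ['downstream_model_id', 'model_id', 'drainage_area_mod', 'stream_order', 'x', 'y']
drain_gdf[drain_cols].to_csv('/path/to/working/directory/drain_table.csv', index=False)

gauge_gdf = gpd.read_file('/path/to/gauge/locations/shapefile.shp')
gauge_cols = ['model_id', 'drainage_area_obs', 'gauge_id']
gauge_gdf[gauge_cols].to_csv('/path/to/working/directory/gauge_table.csv', index=False)
```
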
Your project's working directory now looks like
```
@@ -162,17 +159,17 @@ The Assignments Table is the core of the regional bias correction method it is a
stream segment in the model and several columns of other information which are filled in during the RBC algorithm. It
looks like this:

- downstream_model_id | model_id | drainage_area | stream_order | gauge_id
- ------------------- | ----------------- | ------------- | ------------ | ----------------
- unique_stream_num | unique_stream_num | area in km^2 | stream_order | unique_gauge_num
- unique_stream_num | unique_stream_num | area in km^2 | stream_order | unique_gauge_num
- unique_stream_num | unique_stream_num | area in km^2 | stream_order | unique_gauge_num
- ... | ... | ... | ... | ...
+ | downstream_model_id | model_id | drainage_area | stream_order | gauge_id |
+ | ------------------- | ----------------- | ------------- | ------------ | ---------------- |
+ | unique_stream_num | unique_stream_num | area in km^2 | stream_order | unique_gauge_num |
+ | unique_stream_num | unique_stream_num | area in km^2 | stream_order | unique_gauge_num |
+ | unique_stream_num | unique_stream_num | area in km^2 | stream_order | unique_gauge_num |
+ | ... | ... | ... | ... | ... |

```python
- import hbc
+ import saber as saber
workdir = '/path/to/project/directory/'
- hbc.prep.gen_assignments_table(workdir)
+ saber.prep.gen_assignments_table(workdir)
```
Your project's working directory now looks like
@@ -211,45 +208,45 @@ Use the dat

1. Create a single large csv of the historical simulation data with a datetime column and 1 column per stream segment labeled by the stream's ID number.

- datetime | model_id_1 | model_id_2 | model_id_3
- ----------- | ----------- | ----------- | -----------
- 1979-01-01 | 50 | 50 | 50
- 1979-01-02 | 60 | 60 | 60
- 1979-01-03 | 70 | 70 | 70
- ... | ... | ... | ...
-
+ | datetime | model_id_1 | model_id_2 | model_id_3 |
+ | ---------- | ---------- | ---------- | ---------- |
+ | 1979-01-01 | 50 | 50 | 50 |
+ | 1979-01-02 | 60 | 60 | 60 |
+ | 1979-01-03 | 70 | 70 | 70 |
+ | ... | ... | ... | ... |
+

2. Process the large simulated discharge csv to create a 2nd csv with the flow duration curve on each segment (script provided).

- p_exceed | model_id_1 | model_id_2 | model_id_3
- ----------- | ----------- | ----------- | -----------
- 100 | 0 | 0 | 0
- 99 | 10 | 10 | 10
- 98 | 20 | 20 | 20
- ... | ... | ... | ...
+ | p_exceed | model_id_1 | model_id_2 | model_id_3 |
+ | -------- | ---------- | ---------- | ---------- |
+ | 100 | 0 | 0 | 0 |
+ | 99 | 10 | 10 | 10 |
+ | 98 | 20 | 20 | 20 |
+ | ... | ... | ... | ... |

3. Process the large historical discharge csv to create a 3rd csv with the monthly averages on each segment (script provided).

- month | model_id_1 | model_id_2 | model_id_3
- ----------- | ----------- | ----------- | -----------
- 1 | 60 | 60 | 60
- 2 | 30 | 30 | 30
- 3 | 70 | 70 | 70
- ... | ... | ... | ...
+ | month | model_id_1 | model_id_2 | model_id_3 |
+ | ----- | ---------- | ---------- | ---------- |
+ | 1 | 60 | 60 | 60 |
+ | 2 | 30 | 30 | 30 |
+ | 3 | 70 | 70 | 70 |
+ | ... | ... | ... | ... |
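
The repository provides scripts for steps 2 and 3 (called in the code block below); purely for orientation, here is a minimal pandas sketch of both computations, assuming the step 1 csv layout shown above and placeholder file paths:

```python
import pandas as pd

# Sketch only -- the prep functions shown below are the supported route.
sim = pd.read_csv('/path/to/simulated_discharge.csv',  # placeholder path
                  parse_dates=['datetime'], index_col='datetime')

# Step 2: flow duration curve -- the flow equalled or exceeded p percent of the time.
p_exceed = list(range(100, -1, -1))
fdc = sim.quantile([1 - p / 100 for p in p_exceed])
fdc.index = pd.Index(p_exceed, name='p_exceed')

# Step 3: average simulated flow in each calendar month.
monthly = sim.groupby(sim.index.month).mean()
monthly.index.name = 'month'
```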

```python
- import hbc
+ import saber as saber

workdir = '/path/to/working/directory'

- hbc.prep.historical_simulation(
+ saber.prep.historical_simulation(
    workdir,
    '/path/to/historical/simulation/netcdf.nc'  # optional - if nc not stored in data_inputs folder
)
- hbc.prep.hist_sim_table(
+ saber.prep.hist_sim_table(
    workdir,
    '/path/to/historical/simulation/netcdf.nc'  # optional - if nc not stored in data_inputs folder
)
- hbc.prep.observed_data(
+ saber.prep.observed_data(
    workdir,
    '/path/to/obs/csv/directory'  # optional - if csvs not stored in workdir/data_inputs/obs_csvs
)
@@ -296,10 +293,10 @@ For each of the following, generate and store clusters for many group sizes- bet
Use this code:

```python
- import hbc
+ import saber as saber

workdir = '/path/to/project/directory/'
- hbc.cluster.generate(workdir)
+ saber.cluster.generate(workdir)
```
This function creates trained kmeans models saved as pickle files, plots (from matplotlib) of what each of the clusters
@@ -354,12 +351,12 @@ The justification for this is obvious. The observations are the actual streamflo
- The reason listed for this assignment is "gauged"

```python
- import hbc
+ import saber as saber

- # assign_table = pandas DataFrame (see hbc.table module)
+ # assign_table = pandas DataFrame (see saber.table module)
workdir = '/path/to/project/directory/'
- assign_table = hbc.table.read(workdir)
- hbc.assign.gauged(assign_table)
+ assign_table = saber.table.read(workdir)
+ saber.assign.gauged(assign_table)
```
### 7 Assign basins by Propagation (hydraulically connected to a gauge)
@@ -375,12 +372,12 @@ be less sensitive to changes in flows up stream, may connect basins with differe
i is the number of stream segments up/down from the gauge the river is.

```python
- import hbc
+ import saber as saber

- # assign_table = pandas DataFrame (see hbc.table module)
+ # assign_table = pandas DataFrame (see saber.table module)
workdir = '/path/to/project/directory/'
- assign_table = hbc.table.read(workdir)
- hbc.assign.propagation(assign_table)
+ assign_table = saber.table.read(workdir)
+ saber.assign.propagation(assign_table)
```
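
To make the propagation idea concrete, here is a hypothetical sketch (not the library's implementation) of the downstream half of that search, using only the model_id and downstream_model_id columns of the assignments table; the terminal sentinel value and step limit are assumptions:

```python
import pandas as pd

def walk_downstream(assign_table: pd.DataFrame, gauged_id, max_steps: int = 5) -> dict:
    """Map each segment up to max_steps downstream of gauged_id to its distance i
    (the i described above). Sketch only, not the saber implementation."""
    distances = {}
    current = gauged_id
    for i in range(1, max_steps + 1):
        row = assign_table.loc[assign_table['model_id'] == current]
        if row.empty:
            break
        next_id = row['downstream_model_id'].iloc[0]
        if pd.isna(next_id) or next_id == -1:  # -1 assumed to mark a terminal segment
            break
        distances[next_id] = i
        current = next_id
    return distances

# The upstream direction is the reverse lookup: select rows whose
# downstream_model_id equals the current segment (possibly several tributaries).
```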
### 8 Assign basins by Clusters (hydrologically similar basins)
@@ -391,12 +388,12 @@ Using the results of the optimal clusters
- Review assignments spatially. Run tests and view improvements. Adjust clusters and reassign as necessary.

```python
- import hbc
+ import saber as saber

- # assign_table = pandas DataFrame (see hbc.table module)
+ # assign_table = pandas DataFrame (see saber.table module)
workdir = '/path/to/project/directory/'
- assign_table = hbc.table.read(workdir)
- hbc.assign.clusters_by_dist(assign_table)
+ assign_table = saber.table.read(workdir)
+ saber.assign.clusters_by_dist(assign_table)
```
### 9 Generate GIS files of the assignments
@@ -405,18 +402,18 @@ use to visualize the results of this process. These GIS files help you investiga
used at each step. Use this to monitor the results.

```python
- import hbc
+ import saber as saber

workdir = '/path/to/project/directory/'
- assign_table = hbc.table.read(workdir)
+ assign_table = saber.table.read(workdir)
drain_shape = '/my/file/path/'
- hbc.gis.clip_by_assignment(workdir, assign_table, drain_shape)
- hbc.gis.clip_by_cluster(workdir, assign_table, drain_shape)
- hbc.gis.clip_by_unassigned(workdir, assign_table, drain_shape)
+ saber.gis.clip_by_assignment(workdir, assign_table, drain_shape)
+ saber.gis.clip_by_cluster(workdir, assign_table, drain_shape)
+ saber.gis.clip_by_unassigned(workdir, assign_table, drain_shape)

# or if you have a specific set of IDs to check on
list_of_model_ids = [123, 456, 789]
- hbc.gis.clip_by_ids(workdir, list_of_model_ids, drain_shape)
+ saber.gis.clip_by_ids(workdir, list_of_model_ids, drain_shape)
```
After this step, your project directory should look like this:
@@ -509,13 +506,13 @@ excluded each time. The code provided will help you partition your gauge table i
against the observed data which was withheld from the bias correction process.

```python
- import hbc
+ import saber as saber
workdir = '/path/to/project/directory'
drain_shape = '/path/to/drainageline/gis/file.shp'
obs_data_dir = '/path/to/obs/data/directory'  # optional - if data not in workdir/data_inputs/obs_csvs

- hbc.validate.sample_gauges(workdir)
- hbc.validate.run_series(workdir, drain_shape, obs_data_dir)
+ saber.validate.sample_gauges(workdir)
+ saber.validate.run_series(workdir, drain_shape, obs_data_dir)
```
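
The sample_gauges call handles partitioning the gauge table; purely as an illustration of the withholding idea, here is a rough pandas sketch in which the withheld fraction and file path are assumptions:

```python
import pandas as pd

# Sketch only: randomly reserve 10% of gauges for validation and keep the rest
# for bias correction (fraction and path are placeholder assumptions).
gauges = pd.read_csv('/path/to/gauge_table.csv')
validation_gauges = gauges.sample(frac=0.1, random_state=0)
correction_gauges = gauges.drop(validation_gauges.index)
```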

After this step, your working directory should look like this: