Commit 36fcea1

DAT: Iss2 (#4)
* docs: corrections in the variable units
* doc: correcting typos
* adding example to use wfdei-gem-capa dataset
* Adding the script to process WFDEI-GEM-CaPa dataset: https://doi.org/10.20383/101.0111
* adding capability of call wfdei-gem-capa script
1 parent d570160 commit 36fcea1

6 files changed

+303
-10
lines changed

canrcm4_wfdei_gem_capa/README.md

Lines changed: 10 additions & 8 deletions
Original file line number | Diff line number | Diff line change
@@ -2,7 +2,7 @@
22
In this file, the details of the dataset are explained.
33

44
## Location of Dataset Files
5-
The `CanRCM4-WFDEI-GEM-CaPA` dataset is located under the following directory accessible from Compute Canada (CC) Graham Cluster:
5+
The `CanRCM4-WFDEI-GEM-CaPA` dataset is located under the following directory, accessible from the Graham cluster of the Digital Research Alliance of Canada (formerly Compute Canada):
66
```
77
/project/rpp-kshook/Model_Output/280_CanRCM4_Cor_WFDEI-GEM-CaPA
88
```
@@ -74,16 +74,18 @@ Each NetCDF file belongs to a single variable. The list of variables included in
7474
The spatial extent of the dataset spans latitudes from `31.0625` to `71.9375` and longitudes from `-149.9375` to `-50.0625`, covering North America. The resolution is 0.125 degrees.
7575

7676
## Temporal Extent
77-
The time-steps are hourly covering from `January 1951` to `December 2100`.
77+
The time-steps are 3-hourly covering from `January 1951` to `December 2100`.
7878

7979
## Short Description on Dataset Variables
80-
In most hydrological modelling applications, usually 7 variables are needed detailed as following: 1) specific humidity at 1.5 (or 2) meters, 2) surface pressure, 3) air temperature at 1.5 (or 2) meters, 4) wind speed at 10 meters, 5) precipitation, 6) downward short wave radiation, and 7) downward long wave radiation. These variables are available through `RDRS` v2.1 dataset and their details are described in the table below:
80+
Most hydrological modelling applications need 7 variables: 1) specific humidity at the Lowest Model Level (sigma=0.995), 2) surface pressure, 3) air temperature at the Lowest Model Level (sigma=0.995), 4) wind speed at the Lowest Model Level (sigma=0.995), 5) precipitation, 6) downward shortwave radiation, and 7) downward longwave radiation. These variables are available in the `CanRCM4-WFDEI-GEM-CaPA` dataset and are detailed in the table below:
81+
8182
|Variable Name |Dataset Variable |Unit |IPCC abbreviation|Comments |
8283
|----------------------|-------------------|-----|-----------------|----------------------|
83-
|surface pressure |ps |Pa |ps | |
84-
|specific humidity@1.5m|hus |1 |huss | |
85-
|air tempreature @1.5m |ta |K |tas | |
86-
|wind speed @10m |wind |m/s |wspd |Wind Modulus at Lowest Model Level (sigma=0.995)|
87-
|precipitation |pr |mm/hr| | |
84+
|surface pressure |ps |Pa |ps |surface pressure |
85+
|specific humidity@1.5m|hus |1 |huss |Specific Humidity at Lowest Model Level (sigma=0.995)|
86+
|air temperature @1.5m |ta |K |tas |Air Temperature at Lowest Model Level (sigma=0.995)|
87+
|wind speed @10m |wind |m s-1|wspd |Wind Modulus at Lowest Model Level (sigma=0.995)|
88+
|precipitation |pr |kg m-2 s-1| |precipitation flux |
8889
|short wave radiation |rsds |W m-2|rsds |Surface Downwelling Shortwave Flux|
8990
|long wave radiation |lsds |W m-2|rlds |Surface Downwelling Longwave Flux|
91+

example/canrcm4_wfdei_gem_capa_example_ssrb_1980_2018.sh

Lines changed: 1 addition & 1 deletion
@@ -18,7 +18,7 @@
1818
# You should have received a copy of the GNU General Public License
1919
# along with this program. If not, see <http://www.gnu.org/licenses/>.
2020

21-
# This is a simple example to extract ECMWF ERA5 data for the
21+
# This is a simple example to extract CanRCM4 data for the
2222
# South Saskatchewan River Basin (SSRB) approximate extents
2323
# from Jan 1980 to Dec 2020.
2424

Lines changed: 44 additions & 0 deletions
@@ -0,0 +1,44 @@
1+
#!/bin/bash
2+
3+
# Meteorological Data Processing Workflow
4+
# Copyright (C) 2022, University of Saskatchewan
5+
#
6+
# This file is part of Meteorological Data Processing Workflow
7+
#
8+
# This program is free software: you can redistribute it and/or modify
9+
# it under the terms of the GNU General Public License as published by
10+
# the Free Software Foundation, either version 3 of the License, or
11+
# (at your option) any later version.
12+
#
13+
# This program is distributed in the hope that it will be useful,
14+
# but WITHOUT ANY WARRANTY; without even the implied warranty of
15+
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
16+
# GNU General Public License for more details.
17+
#
18+
# You should have received a copy of the GNU General Public License
19+
# along with this program. If not, see <http://www.gnu.org/licenses/>.
20+
21+
# This is a simple example to extract WFDEI-GEM-CaPA data for the
22+
# South Saskatchewan River Basin (SSRB) approximate extents
23+
# from Jan 1980 to Dec 2015.
24+
25+
# As is mentioned on the main webpage of the repository, it is
26+
# recommended to submit annual jobs for this dataset.
27+
28+
# Always call the script in the root directory of the repository
29+
cd ..
30+
echo "The current directory is: $(pwd)"
31+
32+
# First, submitting without disaggregation
33+
./extract-dataset.sh --dataset="wfdei_gem_capa" \
34+
--dataset-dir="/project/rpp-kshook/Model_Output/181_WFDEI-GEM-CaPA_1979-2016" \
35+
--output-dir="$HOME/scratch/wfdei_gem_capa_output/" \
36+
--start-date="1980-01-01 00:00:00" \
37+
--end-date="2015-12-31 21:00:00" \
38+
--lat-lims=49,54 \
39+
--lon-lims=-120,-98 \
40+
--variable="pr,hus,wind" \
41+
--prefix="wfdei_" \
42+
--email="[email protected]" \
43+
-j;
44+

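The comment in the example above recommends submitting annual jobs for this dataset. A minimal sketch of how one call per year could be generated is shown here. `print_annual_jobs` is a hypothetical helper that is not part of the repository, and it only prints the commands (a dry run) rather than submitting anything. The remaining options (`--dataset-dir`, `--output-dir`, `--lat-lims`, `--lon-lims`, `--variable`, `--email`) are assumed to be added as in the example.

```shell
#!/bin/bash
# Sketch only: prints one extract-dataset.sh call per year instead of running it.
# print_annual_jobs is a hypothetical helper, not part of the repository.
print_annual_jobs () {
  local startYear="$1" endYear="$2" year
  for year in $(seq "$startYear" "$endYear"); do
    # the last 3-hourly time-step of a day is 21:00:00, as in the example script
    echo "./extract-dataset.sh --dataset=\"wfdei_gem_capa\"" \
         "--start-date=\"${year}-01-01 00:00:00\"" \
         "--end-date=\"${year}-12-31 21:00:00\"" \
         "--prefix=\"wfdei_\" -j"
  done
}

# Print the calls for three consecutive years
print_annual_jobs 1980 1982
```

Piping the printed lines to a file makes it possible to review them before running them from the root directory of the repository.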
extract-dataset.sh

Lines changed: 12 additions & 1 deletion
@@ -358,10 +358,21 @@ case "${dataset,,}" in
358358
if [[ "$parallel" == true ]]; then
359359
echo "$(basename $0): Warning: Parallel processing is not supported for CanRCM4-WFDEI-GEM-CaPA dataset;"
360360
echo "$(basename $0): For quasi-parallel processing, consider submitting individual jobs for each ensemble member;"
361-
echo "$(basename $0): Continuing with serial processing of the requested domain."
361+
echo "$(basename $0): Continuing with serial processing of the requested spatial and temporal domain."
362362
fi
363363
call_processing_func "$(dirname $0)/canrcm4_wfdei_gem_capa/canrcm4_wfdei_gem_capa.sh"
364364
;;
365+
366+
# WFDEI-GEM-CaPA
367+
"wfdei-gem-capa" | "wfdei_gem_capa" | "wfdei-gem_capa" | "wfdei_gem-capa")
368+
# adding the non-parallel argument
369+
if [[ "$parallel" == true ]]; then
370+
echo "$(basename $0): Warning: Parallel processing is not supported for WFDEI-GEM-CaPA dataset;"
371+
echo "$(basename $0): For quasi-parallel processing, consider submitting individual jobs for each variable;"
372+
echo "$(basename $0): Continuing with serial processing of the requested spatial and temporal domain."
373+
fi
374+
call_processing_func "$(dirname $0)/wfdei_gem_capa/wfdei_gem_capa.sh"
375+
;;
365376

366377
# dataset not included above
367378
*)

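The `case` block above matches on `${dataset,,}`, which lowercases the value (bash 4 or later), so any capitalisation of the four accepted spellings selects the same processing script. A minimal sketch of that matching is shown here. `resolve_dataset` is a hypothetical stand-in that echoes the script path instead of calling `call_processing_func`.

```shell
#!/bin/bash
# Sketch only: resolve_dataset is a hypothetical stand-in for the case block
# in extract-dataset.sh; it echoes the script path instead of running it.
resolve_dataset () {
  local dataset="$1"
  # ${dataset,,} lowercases the value (requires bash >= 4)
  case "${dataset,,}" in
    "wfdei-gem-capa" | "wfdei_gem_capa" | "wfdei-gem_capa" | "wfdei_gem-capa")
      echo "wfdei_gem_capa/wfdei_gem_capa.sh" ;;
    *)
      echo "unknown" ;;
  esac
}

resolve_dataset "WFDEI-GEM-CaPA"   # mixed case, hyphens
resolve_dataset "wfdei_gem_capa"   # lower case, underscores
resolve_dataset "era5"             # not handled by this sketch
```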
wfdei_gem_capa/README.md

Lines changed: 51 additions & 0 deletions
@@ -0,0 +1,51 @@
1+
# CCRN `WFDEI-GEM-CaPA`
2+
In this file, the details of the dataset are explained.
3+
4+
## Location of Dataset Files
5+
The `WFDEI-GEM-CaPA` dataset is located under the following directory, accessible from the Graham cluster of the Digital Research Alliance of Canada (formerly Compute Canada):
6+
7+
```
8+
/project/rpp-kshook/Model_Output/181_WFDEI-GEM-CaPA_1979-2016
9+
```
10+
and the structure of the dataset files is as follows:
11+
12+
```console
13+
/project/rpp-kshook/Model_Output/181_WFDEI-GEM-CaPA_1979-2016
14+
├── hus_WFDEI_GEM_1979_2016.Feb29.nc
15+
├── pr_WFDEI_GEM_1979_2016.Feb29.nc
16+
├── ps_WFDEI_GEM_1979_2016.Feb29.nc
17+
├── rlds_WFDEI_GEM_1979_2016.Feb29.nc
18+
├── rsds_WFDEI_GEM_1979_2016.Feb29.nc
19+
├── ta_WFDEI_GEM_1979_2016.Feb29.nc
20+
└── wind_WFDEI_GEM_1979_2016.Feb29.nc
21+
```
22+
23+
## Coordinate Variables and Time-stamps
24+
25+
### Coordinate Variables
26+
The coordinate variables of the `WFDEI-GEM-CaPA` simulations are `lon` and `lat` representing the longitude and latitude points, respectively.
27+
28+
### Time-stamps
29+
The time-stamps are included in the original files.
30+
31+
## Dataset Variables
32+
The list of variables included in the dataset is described in [Short Description on Dataset Variables](#short-description-on-dataset-variables).
33+
34+
## Spatial Extent
35+
The spatial extent of the dataset spans latitudes from `31.0625` to `71.9375` and longitudes from `-149.9375` to `-50.0625`, covering North America. The resolution is 0.125 degrees.
36+
37+
## Temporal Extent
38+
The time-steps are 3-hourly, covering `1979` to `2016` as indicated by the dataset file names.
39+
40+
## Short Description on Dataset Variables
41+
Most hydrological modelling applications need 7 variables: 1) specific humidity at 40 meters, 2) surface pressure, 3) air temperature at 40 meters, 4) wind speed at 40 meters, 5) precipitation, 6) downward shortwave radiation, and 7) downward longwave radiation. These variables are available in the `WFDEI-GEM-CaPA` dataset and are detailed in the table below:
42+
|Variable Name |Dataset Variable |Unit |IPCC abbreviation|Comments |
43+
|----------------------|-------------------|-----|-----------------|----------------------|
44+
|surface pressure |ps |Pa |ps |surface pressure at time stamp|
45+
|specific humidity@40m |hus |kg/kg|huss |specific humidity elevated to 40m at time stamp|
46+
|air temperature @40m |ta |K |tas |air temperature elevated to 40m at time stamp|
47+
|wind speed @40m |wind |m/s |wspd |wind speed elevated to 40m at time stamp|
48+
|precipitation |pr |kg m-2 s-1| |Mean rainfall rate over the previous 3 hours|
49+
|short wave radiation |rsds |W m-2|rsds |Mean surface incident shortwave radiation over the previous 3 hours|
50+
|long wave radiation |rlds |W m-2|rlds |Mean surface incident longwave radiation over the previous 3 hours|
51+

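The table above lists precipitation as a flux in kg m-2 s-1, averaged over the previous 3 hours. A minimal sketch of converting that flux to a depth in mm is shown here, assuming liquid water at 1000 kg m-3 so that 1 kg m-2 corresponds to a 1 mm layer. `pr_flux_to_mm` is a hypothetical helper and not part of the repository.

```shell
#!/bin/bash
# Sketch only: pr_flux_to_mm is a hypothetical helper, not part of the repository.
# Assumption: 1 kg m-2 of liquid water is a 1 mm layer (density 1000 kg m-3),
# so flux [kg m-2 s-1] * duration [s] = depth [mm].
pr_flux_to_mm () {
  local flux="$1" seconds="${2:-10800}"   # 10800 s = one 3-hour time-step
  awk -v f="$flux" -v s="$seconds" 'BEGIN { printf "%.3f\n", f * s }'
}

pr_flux_to_mm 0.0001        # depth accumulated over one 3-hour time-step
pr_flux_to_mm 0.0001 3600   # depth accumulated over one hour at the same rate
```

The second argument is optional, which keeps the 3-hourly case short while still allowing other accumulation periods.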
wfdei_gem_capa/wfdei_gem_capa.sh

Lines changed: 185 additions & 0 deletions
@@ -0,0 +1,185 @@
1+
#!/bin/bash
2+
# Meteorological Data Processing Workflow
3+
# Copyright (C) 2022, University of Saskatchewan
4+
#
5+
# This file is part of Meteorological Data Processing Workflow
6+
#
7+
# This program is free software: you can redistribute it and/or modify
8+
# it under the terms of the GNU General Public License as published by
9+
# the Free Software Foundation, either version 3 of the License, or
10+
# (at your option) any later version.
11+
#
12+
# This program is distributed in the hope that it will be useful,
13+
# but WITHOUT ANY WARRANTY; without even the implied warranty of
14+
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
15+
# GNU General Public License for more details.
16+
#
17+
# You should have received a copy of the GNU General Public License
18+
# along with this program. If not, see <http://www.gnu.org/licenses/>.
19+
20+
# =========================
21+
# Credits and contributions
22+
# =========================
23+
# 1. Parts of the code are taken from https://www.shellscript.sh/tips/getopt/index.html
24+
25+
26+
# ================
27+
# General comments
28+
# ================
29+
# * All variables are camelCased for distinguishing from function names;
30+
# * function names are all in lower_case with words separated by underscores for legibility;
31+
# * shell style is based on Google Open Source Projects'
32+
# Style Guide: https://google.github.io/styleguide/shellguide.html
33+
34+
35+
# ===============
36+
# Usage Functions
37+
# ===============
38+
short_usage() {
39+
echo "usage: $(basename $0) [-cio DIR] [-v VARS] [-se DATE] [-t CHAR] [-ln REAL,REAL] [-p STR]"
40+
}
41+
42+
43+
# argument parsing using getopt - WORKS ONLY ON LINUX BY DEFAULT
44+
parsedArguments=$(getopt -a -n wfdei_gem_capa -o i:v:o:s:e:t:l:n:p:c:m: --long dataset-dir:,variables:,output-dir:,start-date:,end-date:,time-scale:,lat-lims:,lon-lims:,prefix:,cache:,ensemble: -- "$@")
45+
validArguments=$?
46+
if [ "$validArguments" != "0" ]; then
47+
short_usage;
48+
exit 1;
49+
fi
50+
51+
# check if no options were passed
52+
if [ $# -eq 0 ]; then
53+
echo "$(basename $0): ERROR! arguments missing";
54+
exit 1;
55+
fi
56+
57+
# check long and short options passed
58+
eval set -- "$parsedArguments"
59+
while :
60+
do
61+
case "$1" in
62+
-i | --dataset-dir) datasetDir="$2" ; shift 2 ;; # required
63+
-v | --variables) variables="$2" ; shift 2 ;; # required
64+
-o | --output-dir) outputDir="$2" ; shift 2 ;; # required
65+
-s | --start-date) startDate="$2" ; shift 2 ;; # required
66+
-e | --end-date) endDate="$2" ; shift 2 ;; # required
67+
-t | --time-scale) timeScale="$2" ; shift 2 ;; # redundant - added for compatibility
68+
-l | --lat-lims) latLims="$2" ; shift 2 ;; # required
69+
-n | --lon-lims) lonLims="$2" ; shift 2 ;; # required
70+
-p | --prefix) prefix="$2" ; shift 2 ;; # optional
71+
-c | --cache) cache="$2" ; shift 2 ;; # required
72+
-m | --ensemble) ensemble="$2" ; shift 2 ;; # redundant - added for compatibility
73+
74+
# -- means the end of the arguments; drop this, and break out of the while loop
75+
--) shift; break ;;
76+
77+
# in case of invalid option
78+
*)
79+
echo "$(basename $0): ERROR! invalid option '$1'";
80+
short_usage; exit 1 ;;
81+
esac
82+
done
83+
84+
# raise an error if the --ensemble argument is provided
85+
if [[ -n "$ensemble" ]]; then
86+
echo "$(basename $0): ERROR! invalid option '--ensemble'"
87+
exit 1
88+
fi
89+
90+
# make array of variable names
91+
IFS=',' read -ra variablesArr <<< "$(echo "$variables")"
92+
93+
# set a default prefix if none is given
94+
if [[ -z $prefix ]]; then
95+
prefix="data"
96+
fi
97+
98+
99+
# =====================
100+
# Necessary Assumptions
101+
# =====================
102+
# TZ to be set to UTC to avoid invalid dates due to Daylight Saving
103+
alias date='TZ=UTC date'
104+
105+
# expand aliases for the one stated above
106+
shopt -s expand_aliases
107+
108+
109+
# ==========================
110+
# Necessary Global Variables
111+
# ==========================
112+
# the structure of file names is as follows: "%var_WFDEI_GEM_1979_2016.Feb29.nc"
113+
format="%Y-%m-%dT%H:%M:%S" # date format
114+
fileStruct="_WFDEI_GEM_1979_2016.Feb29.nc" # source dataset files' suffix constant
115+
116+
latVar="lat"
117+
lonVar="lon"
118+
timeVar="time"
119+
120+
121+
# ===================
122+
# Necessary Functions
123+
# ===================
124+
# Modules below available on Compute Canada (CC) Graham Cluster Server
125+
load_core_modules () {
126+
module -q load cdo/2.0.4
127+
module -q load nco/5.0.6
128+
}
129+
load_core_modules
130+
131+
132+
#######################################
133+
# useful one-liners
134+
#######################################
135+
#calculate Unix EPOCH time in seconds from 1970-01-01 00:00:00
136+
unix_epoch () { date --date="$@" +"%s"; }
137+
138+
#check whether the input is float or real
139+
check_real () { if [[ "$1" == *'.'* ]]; then echo 'float'; else echo 'int'; fi; }
140+
141+
#convert to float if the number is 'int'
142+
to_float () { if [[ $(check_real $1) == 'int' ]]; then printf "%.1f" "$1"; echo; else printf "$1"; echo; fi; }
143+
144+
#join array element by the specified delimiter
145+
join_by () { local IFS="$1"; shift; echo "$*"; }
146+
147+
#to_float the latLims and lonLims, real numbers delimited by ','
148+
lims_to_float () { IFS=',' read -ra l <<< $@; f_arr=(); for i in "${l[@]}"; do f_arr+=($(to_float $i)); done; echo $(join_by , "${f_arr[@]}"); }
149+
150+
151+
# ===============
152+
# Data Processing
153+
# ===============
154+
# display info
155+
echo "$(basename $0): processing CCRN WFDEI-GEM-CaPA..."
156+
157+
# make the output directory
158+
echo "$(basename $0): creating output directory under $outputDir"
159+
mkdir -p "$outputDir"
160+
161+
# reformat $startDate and $endDate
162+
startDateFormated="$(date --date="$startDate" +"$format")" # startDate
163+
endDateFormated="$(date --date="$endDate" +"$format")" # endDate
164+
165+
# extract $startYear and $endYear
166+
startYear="$(date --date="$startDate" +"%Y")"
167+
endYear="$(date --date="$endDate" +"%Y")"
168+
169+
# making the output directory
170+
mkdir -p "$outputDir"
171+
172+
# loop over variables
173+
for var in "${variablesArr[@]}"; do
174+
ncks -O -d "$latVar",$(lims_to_float "$latLims") \
175+
-d "$lonVar",$(lims_to_float "$lonLims") \
176+
-d "$timeVar","$startDateFormated","$endDateFormated" \
177+
"$datasetDir/${var}${fileStruct}" "$outputDir/${prefix}${var}_WFDEI_GEM_${startYear}_${endYear}.Feb29.nc"
178+
179+
done
180+
181+
# wait to assure the loop is over
182+
wait
183+
184+
echo "$(basename $0): results are produced under $outputDir."
185+

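The script passes `--lat-lims` and `--lon-lims` through `lims_to_float` before handing them to `ncks -d`. NCO interprets hyperslab limits without a decimal point as indices and limits with a decimal point as coordinate values, so integer limits such as `49,54` need a decimal point to be read as degrees. A minimal sketch of that conversion is shown here, adapted from the one-liners above and split over several lines with quoting added. It illustrates the behaviour and is not a drop-in replacement.

```shell
#!/bin/bash
# Sketch adapted from the helper one-liners in wfdei_gem_capa.sh.

# report whether the input contains a decimal point
check_real () { if [[ "$1" == *'.'* ]]; then echo 'float'; else echo 'int'; fi; }

# append a decimal part to integers; leave other values unchanged
to_float () {
  if [[ $(check_real "$1") == 'int' ]]; then printf "%.1f\n" "$1"; else printf "%s\n" "$1"; fi
}

# join the remaining arguments with the delimiter given first
join_by () { local IFS="$1"; shift; echo "$*"; }

# convert a comma-delimited pair of limits
lims_to_float () {
  local l f_arr=() i
  IFS=',' read -ra l <<< "$1"
  for i in "${l[@]}"; do f_arr+=("$(to_float "$i")"); done
  join_by , "${f_arr[@]}"
}

lims_to_float "49,54"        # integer limits gain a decimal point
lims_to_float "-120,-98.5"   # a value that already has one is left as-is
```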