captainceramic edited this page Oct 27, 2014 · 1 revision

Datasets in the CWSLab plugin.

The CWSLab VisTrails plugin uses the concept of a ‘Dataset’ (note that this is different from a netCDF dataset). A Dataset defines a subset of all the files accessible to the system: the files of a particular type that satisfy a set of constraints. Examples of Datasets are:

  • All downloaded, time-dependent CMIP5 files from the ACCESS1-3 model
  • All seasonal climatologies with the variable ‘rsds’, the perturbed physics value ‘2’ and the model ‘MIROC5’
  • All time slice change files with variable ‘tos’, comparing the period 2080-2099 with the period 1986-2005
  • All ERA-INT reanalysis files at monthly frequency
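
To make the idea of "a file type plus a set of constraints" concrete, here is a minimal Python sketch. It is not the plugin's own code: the function name select_files, the attribute names and the way files are described as dictionaries are all assumptions made for illustration.

```python
# Hypothetical illustration only: none of these names come from the CWSLab plugin.

def select_files(files, constraints):
    """Return the files whose attributes satisfy every constraint.

    `files` is a list of dicts mapping attribute name -> value.
    `constraints` maps an attribute name to the set of allowed values.
    """
    return [
        f for f in files
        if all(f.get(key) in allowed for key, allowed in constraints.items())
    ]


# Assumed, made-up descriptions of three files of the same type.
files = [
    {"type": "seasonal_climatology", "variable": "rsds", "model": "MIROC5"},
    {"type": "seasonal_climatology", "variable": "tos", "model": "MIROC5"},
    {"type": "seasonal_climatology", "variable": "rsds", "model": "ACCESS1-3"},
]

# "All seasonal climatologies with the variable 'rsds' and the model 'MIROC5'".
constraints = {"variable": {"rsds"}, "model": {"MIROC5"}}
selected = select_files(files, constraints)
```

In this sketch only the first file ends up in selected, because it is the only one that meets both constraints.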

Individual modules in a workflow made with the MAS plug-in have Datasets as their inputs and outputs. This allows you to use the system to perform actions on Datasets like:

  • Re-gridding all of the maximum temperature data for the MIROC5 and ACCESS1-0 models to a common r240x120 grid.
  • Generating maps of rainfall data for all ensemble members of the rcp 6.0 experiment for the CESM-CM5 model.

Dataset definitions.

The CWSLab plugin contains a number of pre-made Datasets representing the CMIP5 archive and some regional climate model output. However, you may be interested in creating your own to represent particular data that you have access to.

In the plugin, a Dataset is represented by two pieces of information: a filename pattern and a set of constraints. For example, imagine you had the following set of files in your home directory:

  • /home/tim/data_red.txt
  • /home/tim/data_blue.txt
  • /home/tim/data_green.txt

This set of files can be represented by the filename pattern /home/tim/data_%colour%.txt, where the word between the % signs (%colour%) names a tag whose value changes between files in the Dataset. Here the tag takes the values red, blue and green.
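
One way to see how such a pattern relates to actual file paths is to turn it into a regular expression and read the tag values back out. The sketch below does this in plain Python. The helper names pattern_to_regex and extract_tags are hypothetical and are not part of the plugin, and the sketch assumes that each tag appears only once in a pattern and that tag values never contain a "/".

```python
import re


def pattern_to_regex(pattern):
    """Convert a pattern such as '/home/tim/data_%colour%.txt' to a regex.

    Hypothetical helper, not the plugin's API. Assumes each %tag% name
    appears only once and that tag values do not contain '/'.
    """
    # Splitting on a capturing group leaves literal text at even indices
    # and tag names at odd indices.
    parts = re.split(r"%(\w+)%", pattern)
    regex = ""
    for index, part in enumerate(parts):
        if index % 2 == 0:
            regex += re.escape(part)              # literal text
        else:
            regex += "(?P<%s>[^/]+)" % part       # named group for the tag
    return re.compile("^" + regex + "$")


def extract_tags(pattern, paths):
    """Map each matching path to a dict of its tag values."""
    compiled = pattern_to_regex(pattern)
    result = {}
    for path in paths:
        match = compiled.match(path)
        if match:
            result[path] = match.groupdict()
    return result


paths = [
    "/home/tim/data_red.txt",
    "/home/tim/data_blue.txt",
    "/home/tim/data_green.txt",
    "/home/tim/notes.txt",  # does not fit the pattern, so it is skipped
]
tags = extract_tags("/home/tim/data_%colour%.txt", paths)
```

For the three data files, tags holds one entry per path, each with a single colour value, and notes.txt is left out because it does not match the pattern.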
