Data Formats

This page explains the different data formatting options available in otoole. Firstly, the format of the user configuration file is explained. Following this, the different input data formats are explained.

.. SEEALSO::
   See the Simplicity_ repository for a full example of these formats

User Configuration File

.. versionadded:: v1.0.0
    The user configuration file is now required for data conversion

Overview

Most commands in otoole require the user to specify a configuration file that describes the parameters, sets, and results in the model. This configuration file is written in yaml and is typically saved as config.yaml. This section will cover how to format the user configuration file. If the user incorrectly enters data, validation checks in otoole should catch this.

Information Required

The table below highlights what information is required for each Set, Parameter and Result definition in the configuration file. Required values are given by X, while optional values are given by (X).

	Set	Parameter	Result
name	X	X	X
short_name	(X)	(X)	(X)
dtype	X	X	X
type	X	X	X
default		X	X

.. deprecated:: v1.0.3
    The ``Calculated`` keyword is no longer needed for Result definitions

Warning

Names longer than 31 characters require a short_name field. This is due to character limits on excel sheet names. otoole will raise an error if a short_name is not provided in these instances.

Sets Format

Sets are defined as follows:

SETNAME:
    short_name: SET (Optional)
    dtype: "int" or "string"
    type: set

Note

It's convention in OSeMOSYS to capitalize set names

Parameters Format

Parameters are defined as follows. When referencing set indices use the full name, not the short_name:

ParameterName:
    short_name: ParamName (Optional)
    indices: [SETNAME, SETNAME, ...]
    type: param
    dtype: "int" or "float"
    default: 0

Note

It's convention in OSeMOSYS to use Pascal case for parameter names

Results Format

Results are defined as follows. When referencing set indices use the full name, not the short_name:

AnnualEmissions:
    short_name: ParamName (Optional)
    indices: [SETNAME, SETNAME, ...]
    type: result
    dtype: "int" or "float"
    default: 0

Note

It's convention in OSeMOSYS to use Pascal case for result names

Examples

Below are examples of correctly formatted configuration file values. See the Simplicity repository for a complete example.

Set definition of TECHNOLOGY:

TECHNOLOGY:
    dtype: str
    type: set

Parameter definition of AccumulatedAnnualDemand:

AccumulatedAnnualDemand:
    short_name: AccAnnualDemand
    indices: [REGION,FUEL,YEAR]
    type: param
    dtype: float
    default: 0

Result definition of AnnualEmissions:

AnnualEmissions:
    indices: [REGION,EMISSION,YEAR]
    type: result
    dtype: float
    default: 0

Tip

See the :ref:`examples` page to create a template configuration file

Input Data

.. deprecated:: v1.0.0
    The ``datapackage`` format is no longer supported

Overview

This section will describe how to format data for excel, csv, and datafile formats.

Excel

Interfacing with otoole through excel is a very user-friendly method to handle OSeMOSYS input data. In the excel workbook (an *.xlsx file), each sheet will correspond to a single parameter or set. Parameters that are indexed over years are pivoted on the YEAR index. This creates a wide formatted dataset, where each year is the column header, with the first columns holding the remaining indices.

For example, referencing the Simplicity model, the AccumulatedAnnualDemand parameter data will be under the AccumulatedAnnualDemand sheet and contain the data

REGION	TECHNOLOGY	2014	2015	2016	2017	2018	2019	2020
SIMPLICITY	BACKSTOP1	999999	999999	999999	999999	999999	999999	999999
SIMPLICITY	BACKSTOP2	999999	999999	999999	999999	999999	999999	999999
SIMPLICITY	ETHPLANT	25	25	25	25	25	25	25
SIMPLICITY	GRID_EXP	4000	4000	4000	4000	4000	4000	4000
SIMPLICITY	HYD1	4500	4500	4500	4500	4500	4500	4500
SIMPLICITY	HYD2	3500	3500	3500	3500	3500	3500	3500
...	...	...	...	...	...	...	...	...

Parameters that are not indexed over years will have an extra column titled VALUE. This column will hold the input value for that parameter. For example, the OperationalLife parameter in the Simplicity example will be formatted as shown

REGION	TECHNOLOGY	VALUE
SIMPLICITY	BACKSTOP1	1
SIMPLICITY	BACKSTOP2	1
SIMPLICITY	ETHPLANT	30
SIMPLICITY	GAS_EXTRACTION	1
SIMPLICITY	GAS_IMPORT	1
SIMPLICITY	GRID_EXP	50
SIMPLICITY	HYD1	80
SIMPLICITY	HYD2	80
...	...	...

Set definitions will have a single column, titled VALUE. For example, the set TECHNOLOGY will be formatted as shown

VALUE
BACKSTOP1
BACKSTOP2
ETHPLANT
GAS_EXTRACTION
GAS_IMPORT
GRID_EXP
HYD1
HYD2
...

CSV

Interfacing with otoole through a folder of CSV files is the most "computer friendly" way to handle input data. This is due to csv files being easy to read and write, and independent of the program, programming language, and operating system. This allows otoole to easily integrate into workflows.

When working with CSV data, all parameters and sets are saved under their name given in the configuration file, and nested in a single directory. CSV data will follow long formatting standards, where each column is the name of the index, and the final column is titled VALUE.

For example, the following data for AccumulatedAnnualDemand will be under the file data/AccumulatedAnnualDemand.csv

REGION	FUEL	YEAR	VALUE
SIMPLICITY	ETH	2014	1
SIMPLICITY	RAWSUG	2014	0.5
SIMPLICITY	ETH	2015	1.03
SIMPLICITY	RAWSUG	2015	0.51
SIMPLICITY	ETH	2016	1.061
SIMPLICITY	RAWSUG	2016	0.519
SIMPLICITY	ETH	2017	1.093
SIMPLICITY	RAWSUG	2017	0.529
SIMPLICITY	ETH	2018	1.126
...	...	...	...

While the TECHNOLOGY set data will be under the file data/TECHNOLOGY.csv` and formatted as shown with a single VALUE column.

VALUE
BACKSTOP1
BACKSTOP2
ETHPLANT
GAS_EXTRACTION
GAS_IMPORT
GRID_EXP
HYD1
HYD2
...

Datafile

Datafiles are the least user-friendly method of handling data, however, they are required for the OSeMOSYS GNU MathProg version of OSeMOSYS. Datafiles are written in MathProg, which shares syntax with the AMPL programming language.

Datafiles contain all model data in one file (often a *.txt file), and will follow a similar data standard to long formatted CSV data. However, the default value for the parameter is included in its declaration statement.

For example, in the file data.txt, the parameter AccumulatedAnnualDemand will be defined as follows:

param default 0.0 : AccumulatedAnnualDemand :=
    SIMPLICITY ETH 2014 1
    SIMPLICITY RAWSUG 2014 0.5
    SIMPLICITY ETH 2015 1.03
    SIMPLICITY RAWSUG 2015 0.51
    SIMPLICITY ETH 2016 1.061
    SIMPLICITY RAWSUG 2016 0.519
    SIMPLICITY ETH 2017 1.093
    SIMPLICITY RAWSUG 2017 0.529
    SIMPLICITY ETH 2018 1.126
    SIMPLICITY RAWSUG 2018 0.538
    SIMPLICITY ETH 2019 1.159
    SIMPLICITY RAWSUG 2019 0.548
    SIMPLICITY ETH 2020 1.194
    SIMPLICITY RAWSUG 2020 0.558
    ...

And in the same data.txt file, the set TECHNOLOGY will be defined as follows:

set TECHNOLOGY :=
    BACKSTOP1
    BACKSTOP2
    ETHPLANT
    GAS_EXTRACTION
    GAS_IMPORT
    GRID_EXP
    HYD1
    HYD2
    ...

.. SEEALSO::
   For reading and writing between Python and AMPL_, see the amply_ Python package.

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

data.rst

data.rst

Data Formats

User Configuration File

Overview

Information Required

Sets Format

Parameters Format

Results Format

Examples

Input Data

Overview

Excel

CSV

Datafile

Files

data.rst

Latest commit

History

data.rst

File metadata and controls

Data Formats

User Configuration File

Overview

Information Required

Sets Format

Parameters Format

Results Format

Examples

Input Data

Overview

Excel

CSV

Datafile