diff --git a/docs/examples.rst b/docs/examples.rst index 3c2f1885..281a9831 100644 --- a/docs/examples.rst +++ b/docs/examples.rst @@ -6,7 +6,7 @@ Examples This page will present examples to show the full functionality of ``otoole``. It will walk through the ``convert``, ``results``, ``setup``, ``viz`` and ``validate`` -functionality in seperate simple use cases. +functionality in separate simple use cases. .. NOTE:: To follow these examples, clone the Simplicity_ repository and run all commands @@ -34,12 +34,12 @@ abbreviated instructions are shown below To install GLPK on **Linux**, run the command:: - sudo apt-get update - sudo apt-get install glpk glpk-utils + $ sudo apt-get update + $ sudo apt-get install glpk glpk-utils To install GLPK on **Mac**, run the command:: - brew install glpk + $ brew install glpk To install GLPK on **Windows**, follow the instructions on the `GLPK Website`_. Be sure to add GLPK to @@ -48,7 +48,7 @@ your environment variables after installation Alternatively, if you use Anaconda_ to manage your Python packages, you can install GLPK via the command:: - conda install -c conda-forge glpk + $ conda install -c conda-forge glpk 2. Test the GLPK install ~~~~~~~~~~~~~~~~~~~~~~~~ @@ -58,6 +58,9 @@ Once installed, you should be able to call the ``glpsol`` command:: GLPSOL: GLPK LP/MIP Solver, v4.65 No input problem file specified; try glpsol --help +.. TIP:: + See the `GLPK Wiki`_ for more information on the ``glpsol`` command + 3. Install CBC ~~~~~~~~~~~~~~ @@ -67,11 +70,11 @@ instructions are shown below To install CBC on **Linux**, run the command:: - sudo apt-get install coinor-cbc coinor-libcbc-dev + $ sudo apt-get install coinor-cbc coinor-libcbc-dev To install CBC on **Mac**, run the command:: - brew install coin-or-tools/coinor/cbc + $ brew install coin-or-tools/coinor/cbc To install CBC on **Windows**, follow the install instruction on the CBC_ website. @@ -79,7 +82,7 @@ website. 
Alternatively, if you use Anaconda_ to manage your Python packages, you can install CBC via the command:: -    conda install -c conda-forge coincbc +    $ conda install -c conda-forge coincbc 4. Test the CBC install ~~~~~~~~~~~~~~~~~~~~~~~ @@ -96,122 +99,103 @@ Once installed, you should be able to directly call CBC:: You can exit the solver by typing ``quit`` -Data Conversion with CSVs ------------------------- +Input Data Conversion +--------------------- Objective ~~~~~~~~~ -Use a folder of CSV data to build and solve an OSeMOSYS model with CBC_. Generate -the full suite of OSeMOSYS results. - -1. ``otoole`` Convert -~~~~~~~~~~~~~~~~~~~~~ -We first want to convert the folder of Simplicity_ CSVs into -an OSeMOSYS datafile called ``simplicity.txt``:: - -    $ otoole convert csv datafile data simplicity.txt config.yaml - -2. Build the Model -~~~~~~~~~~~~~~~~~~~ -Use GLPK_ to build the model and save it as ``simplicity.lp``:: +Convert input data between CSV, Excel, and GNU MathProg data formats. -    $ glpsol -m OSeMOSYS.txt -d simplicity.txt --wlp simplicity.lp --check +1. Clone ``Simplicity`` +~~~~~~~~~~~~~~~~~~~~~~~ +If you have not already done so, clone the Simplicity_ repository:: -.. TIP:: -    See the `GLPK Wiki`_ for more information on the ``glpsol`` command +    $ git clone https://github.com/OSeMOSYS/simplicity.git +    $ cd simplicity -3. Solve the Model -~~~~~~~~~~~~~~~~~~ -Use CBC_ to solve the model and save the solution file as ``simplicity.sol``:: +.. NOTE:: +   Further information on the ``config.yaml`` file is in the :ref:`template-setup` section -    $ cbc simplicity.lp solve -solu simplicity.sol +2. Convert CSV data into MathProg data +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ +Convert the folder of Simplicity_ CSVs (``data/``) into an OSeMOSYS datafile called ``simplicity.txt``:: -4.
Generate the full set of results -~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ -Use ``otoole``'s ``result`` package to generate the results file:: + $ otoole convert csv datafile data simplicity.txt config.yaml - $ otoole results cbc csv simplicity.sol results datafile simplicity.txt config.yaml +3. Convert MathProg data into Excel Data +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ +Convert the new Simplicity_ datafile (``simplicity.txt``) into Excel data called ``simplicity.xlsx``:: -5. View Results -~~~~~~~~~~~~~~~ -Results are now viewable in the files ``results/*.csv`` + $ otoole convert datafile excel simplicity.txt simplicity.xlsx config.yaml .. TIP:: - Before moving onto the next section, remove all the generated files:: + Excel workbooks are an easy way for humans to interface with OSeMOSYS data! + +4. Convert Excel Data into CSV data +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ +Convert the new Simplicity_ excel data (``simplicity.xlsx``) into a folder of CSV data +called ``simplicity/``. Note that this data will be the exact same as the original CSV data folder (``data/``):: - $ rm simplicity.lp simplicity.sol simplicity.txt results/* + $ otoole convert excel csv simplicity.xlsx simplicity config.yaml -Data Conversion with Excel --------------------------- +Process Solutions from Different Solvers +---------------------------------------- Objective ~~~~~~~~~ -Use an excel worksheet to build and solve an OSeMOSYS model with CBC. +Process solutions from GLPK_, CBC_, Gurobi_, and CPLEX_. This example assumes +you have an existing GNU MathProg datafile called ``simplicity.txt`` (from the +previous example). -1. Create the Excel Workbook -~~~~~~~~~~~~~~~~~~~~~~~~~~~~ -Use the example CSV data to create an Excel Workbook using ``otoole convert``:: - - $ otoole convert csv excel data simplicity.xlsx config.yaml +1. 
Process a solution from GLPK +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ +Use GLPK_ to build the model, save the problem as ``simplicity.glp``, solve the model, and +save the solution as ``simplicity.sol``. Use otoole to create a folder of CSV results called ``results-glpk/``. +When processing solutions from GLPK, the model file (``*.glp``) must also be passed:: -    Excel workbooks are an easy way for humans to interface with OSeMOSYS data! +    $ glpsol -m OSeMOSYS.txt -d simplicity.txt --wglp simplicity.glp --write simplicity.sol -2. Create the MathProg datafile -~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ -Next, we want to convert the excel workbook (``simplicity.xlsx``) into -an OSeMOSYS datafile (``simplicity.txt``):: +    $ otoole results glpk csv simplicity.sol results-glpk datafile simplicity.txt config.yaml --glpk_model simplicity.glp -    $ otoole convert excel datafile simplicity.xlsx simplicity.txt config.yaml +.. NOTE:: +   By default, MathProg OSeMOSYS models will write out a folder of CSV results to a ``results/`` +   directory if solving via GLPK. However, using ``otoole`` allows the user to programmatically access results +   and control read/write locations -3. Build the Model -~~~~~~~~~~~~~~~~~~ -Use GLPK_ to build the model and save it as ``simplicity.lp``:: +2. Process a solution from CBC +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ +Use GLPK_ to build the model and save the problem as ``simplicity.lp``. Use CBC_ to solve the model and +save the solution as ``simplicity.sol``. Use otoole to create a folder of CSV results called ``results/`` from the solution file:: $ glpsol -m OSeMOSYS.txt -d simplicity.txt --wlp simplicity.lp --check -4. Solve the Model -~~~~~~~~~~~~~~~~~~ -Use CBC_ to solve the model and save the solution file as ``simplicity.sol``:: - $ cbc simplicity.lp solve -solu simplicity.sol -5.
Generate the selected results -~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ -Use ``otoole``'s ``result`` package to generate the result CSVs:: - - $ otoole results cbc csv simplicity.sol results datafile simplicity.txt config.yaml - -Data Processing with GLPK -------------------------- + $ otoole results cbc csv simplicity.sol results csv data config.yaml -Objective -~~~~~~~~~ +3. Process a solution from Gurobi +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ +Use GLPK_ to build the model and save the problem as ``simplicity.lp``. Use Gurobi_ to solve the model and +save the solution as ``simplicity.sol``. Use otoole to create a folder of CSV results called ``results/`` from the solution file:: -Build and solve a model using GLPK and otoole + $ glpsol -m OSeMOSYS.txt -d simplicity.txt --wlp simplicity.lp --check -1. Build the solve the model using GLPK -~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ -Use GLPK_ to build the model, save the problem as ``simplicity.glp``, solve the model, -and save the solution as ``simplicity.sol```:: + $ gurobi_cl ResultFile=simplicity.sol simplicity.lp - $ glpsol -m OSeMOSYS.txt -d simplicity.txt --wglp simplicity.glp --write simplicity.sol + $ otoole results gurobi csv simplicity.sol results csv data config.yaml -2. Use otoole to process the solution in CSVs -~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ -Use ``otoole``'s ``results`` command to transform the soltuion file into a folder of CSVs -under the directory ``results-glpk``. When processing solutions from GLPK, the model file (``*.glp``) -must also be passed:: +4. Process a solution from CPLEX +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ +Use GLPK_ to build the model and save the problem as ``simplicity.lp``. Use CPLEX_ to solve the model and +save the solution as ``simplicity.sol``. 
Use otoole to create a folder of CSV results called ``results/`` from the solution file:: - $ otoole results glpk csv simplicity.sol results-glpk datafile simplicity.txt config.yaml --glpk_model simplicity.glp + $ glpsol -m OSeMOSYS.txt -d simplicity.txt --wlp simplicity.lp --check -.. NOTE:: - By default, MathProg OSeMOSYS models will write out folder of CSV results to a ``results/`` - directory if solving via GLPK. However, for programatically accessing results, using ``otoole`` - to control the read/write location, and for supporting future implementations of OSeMOSYS, - using ``otoole`` can be benifical. + $ cplex -c "read simplicity.lp" "optimize" "write simplicity.sol" + $ otoole results cplex csv simplicity.sol results csv data config.yaml Model Visualization ------------------- @@ -238,6 +222,8 @@ displayed .. image:: _static/simplicity_res.png +.. _template-setup: + Template Setup -------------- @@ -284,13 +270,12 @@ horizon. For example, if the model horizon is from 2020 to 2050, the .. NOTE:: While this step in not technically required, by filling out the years in - CSV format, ``otoole`` will pivot all the Excel sheets on the years - during the conversion process. This will save significant formatting time! + CSV format ``otoole`` will pivot all the Excel sheets on these years. + This will save significant formatting time! 4. Convert the CSV Template Data ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ -To convert the template CSV data into Excel formatted data, run the following -``convert`` command:: +Convert the template CSV data into Excel formatted data:: $ otoole convert csv excel template_data template.xlsx template_config.yaml @@ -500,3 +485,4 @@ will also flag it as an isolated fuel. This means the fuel is unconnected from t .. _CBC: https://github.com/coin-or/Cbc .. _CPLEX: https://www.ibm.com/products/ilog-cplex-optimization-studio/cplex-optimizer .. _Anaconda: https://www.anaconda.com/ +.. 
_Gurobi: https://www.gurobi.com/ diff --git a/docs/functionality.rst b/docs/functionality.rst index abb61bb6..b923f1b4 100644 --- a/docs/functionality.rst +++ b/docs/functionality.rst @@ -89,25 +89,26 @@ so as to speed up the model matrix generation and solution times. ``otoole results`` ~~~~~~~~~~~~~~~~~~ -The ``results`` command creates a folder of CSV result files from a CBC_, CLP_, +The ``results`` command creates a folder of CSV result files from a GLPK_, CBC_, CLP_, Gurobi_ or CPLEX_ solution file together with the input data:: $ otoole results --help - usage: otoole results [-h] [--write_defaults] + usage: otoole results [-h] [--glpk_model GLPK_MODEL] [--write_defaults] {cbc,cplex,gurobi} {csv} from_path to_path {csv,datafile,excel} input_path config positional arguments: - {cbc,cplex,glpk,gurobi} Result data format to convert from - {csv} Result data format to convert to - from_path Path to file or folder to convert from - to_path Path to file or folder to convert to - {csv,datafile,excel} Input data format - input_path Path to input_data - config Path to config YAML file + {cbc,cplex,glpk,gurobi} Result data format to convert from + {csv} Result data format to convert to + from_path Path to file or folder to convert from + to_path Path to file or folder to convert to + {csv,datafile,excel} Input data format + input_path Path to input_data + config Path to config YAML file optional arguments: - -h, --help show this help message and exit - --write_defaults Writes default values + -h, --help show this help message and exit + --glpk_model GLPK_MODEL GLPK model file required for processing GLPK results + --write_defaults Writes default values .. 
versionadded:: v1.0.0 The ``config`` positional argument is now required diff --git a/src/otoole/convert.py b/src/otoole/convert.py index 9a315886..3cecd343 100644 --- a/src/otoole/convert.py +++ b/src/otoole/convert.py @@ -31,7 +31,7 @@ def read_results( input_path: str, glpk_model: str = None, ) -> Tuple[Dict[str, pd.DataFrame], Dict[str, float]]: -    """Read OSeMOSYS results from CBC, GLPK or Gurobi results files +    """Read OSeMOSYS results from CBC, GLPK, Gurobi, or CPLEX results files Arguments --------- diff --git a/src/otoole/results/results.py b/src/otoole/results/results.py index ee688a77..dfb7c426 100644 --- a/src/otoole/results/results.py +++ b/src/otoole/results/results.py @@ -6,7 +6,6 @@ import pandas as pd from otoole.input import ReadStrategy -from otoole.preprocess.longify_data import check_datatypes from otoole.results.result_package import ResultsPackage LOGGER = logging.getLogger(__name__) @@ -21,7 +20,7 @@ def read( Arguments --------- filepath : str, TextIO - A path name or file buffer pointing to the CBC solution file + A path name or file buffer pointing to the solution file input_data : dict, default=None dict of dataframes @@ -88,13 +87,13 @@ def _convert_to_dataframe(self, file_path: Union[str, TextIO]) -> pd.DataFrame: def _convert_wide_to_long(self, data: pd.DataFrame) -> Dict[str, pd.DataFrame]: """Convert from wide to long format - Converts a pandas DataFrame containing all CBC results to reformatted + Converts a pandas DataFrame containing all wide format results to a reformatted dictionary of pandas DataFrames in long format ready to write out Arguments --------- data : pandas.DataFrame - CBC results stored in a dataframe + results stored in a dataframe Example ------- @@ -179,91 +178,23 @@ def rename_duplicate_column(index: List) -> List: return column -class ReadCplex(ReadResults): -    """ """ +class ReadCplex(ReadWideResults): +    """Read a CPLEX solution file into memory""" -    def get_results_from_file( -        self, filepath: Union[str,
TextIO], input_data - ) -> Dict[str, pd.DataFrame]: - - if input_data: - years = input_data["YEAR"].values # type: List - start_year = int(years[0]) - end_year = int(years[-1]) - else: - raise RuntimeError("To process CPLEX results please provide the input file") - - if isinstance(filepath, str): - with open(filepath, "r") as sol_file: - data = self.extract_rows(sol_file, start_year, end_year) - elif isinstance(filepath, StringIO): - data = self.extract_rows(filepath, start_year, end_year) - else: - raise TypeError("Argument filepath type must be a string or an open file") - - results = {} - - for name in data.keys(): - results[name] = self.convert_df(data[name], name, start_year, end_year) - - return results + def _convert_to_dataframe(self, file_path: Union[str, TextIO]) -> pd.DataFrame: + """Reads a Cplex solution file into a pandas DataFrame - def extract_rows( - self, sol_file: TextIO, start_year: int, end_year: int - ) -> Dict[str, List[List[str]]]: - """ """ - data = {} # type: Dict[str, List[List[str]]] - for linenum, line in enumerate(sol_file): - line = line.replace("\n", "") - try: - row_as_list = line.split("\t") # type: List[str] - name = row_as_list[0] # type: str - - if name in data.keys(): - data[name].append(row_as_list) - else: - data[name] = [row_as_list] - except ValueError as ex: - msg = "Error caused at line {}: {}. 
{}" - raise ValueError(msg.format(linenum, line, ex)) - return data - - def extract_variable_dimensions_values(self, data: List) -> Tuple[str, Tuple, List]: - """Extracts useful information from a line of a results file""" - variable = data[0] - try: - number = len(self.results_config[variable]["indices"]) - except KeyError as ex: - print(data) - raise KeyError(ex) - dimensions = tuple(data[1:(number)]) - values = data[(number):] - return (variable, dimensions, values) - - def convert_df( - self, data: List[List[str]], variable: str, start_year: int, end_year: int - ) -> pd.DataFrame: - """Read the cplex lines into a pandas DataFrame""" - index = self.results_config[variable]["indices"] - columns = ["variable"] + index[:-1] + list(range(start_year, end_year + 1, 1)) - df = pd.DataFrame(data=data, columns=columns) - df, index = check_duplicate_index(df, columns, index) - df = df.drop(columns="variable") - - LOGGER.debug( - f"Attempting to set index for {variable} with columns {index[:-1]}" - ) - try: - df = df.set_index(index[:-1]) - except NotImplementedError as ex: - LOGGER.error(f"Error setting index for {df.head()}") - raise NotImplementedError(ex) - df = df.melt(var_name="YEAR", value_name="VALUE", ignore_index=False) - df = df.reset_index() - df = check_datatypes(df, self.user_config, variable) - df = df.set_index(index) - df = df[(df != 0).any(axis=1)] - return df + Arguments + --------- + user_config : Dict[str, Dict] + file_path : Union[str, TextIO] + """ + df = pd.read_xml(file_path, xpath=".//variable", parser="etree") + df[["Variable", "Index"]] = df["name"].str.split("(", expand=True) + df["Index"] = df["Index"].str.replace(")", "", regex=False) + LOGGER.debug(df) + df = df[(df["value"] != 0)].reset_index().rename(columns={"value": "Value"}) + return df[["Variable", "Index", "Value"]].astype({"Value": float}) class ReadGurobi(ReadWideResults): @@ -274,7 +205,8 @@ def _convert_to_dataframe(self, file_path: Union[str, TextIO]) -> pd.DataFrame: Arguments 
--------- - file_path : str + user_config : Dict[str, Dict] + file_path : Union[str, TextIO] """ df = pd.read_csv( file_path, @@ -295,8 +227,8 @@ class ReadCbc(ReadWideResults): Arguments --------- - user_config - results_config + user_config : Dict[str, Dict] + results_config : Dict[str, Dict] """ def _convert_to_dataframe(self, file_path: Union[str, TextIO]) -> pd.DataFrame: @@ -335,7 +267,7 @@ class ReadGlpk(ReadWideResults): Arguments --------- - user_config + user_config : Dict[str, Dict] glpk_model: Union[str, TextIO] Path to GLPK model file. Can be created using the `--wglp` flag. """ diff --git a/tests/test_read_strategies.py b/tests/test_read_strategies.py index 5658dd68..a7a20d10 100644 --- a/tests/test_read_strategies.py +++ b/tests/test_read_strategies.py @@ -23,212 +23,128 @@ class TestReadCplex: - cplex_empty = ( - "AnnualFixedOperatingCost REGION AOBACKSTOP 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0" - ) - cplex_short = "AnnualFixedOperatingCost REGION CDBACKSTOP 0.0 0.0 137958.8400384134 305945.38410619126 626159.9611543404 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0" - cplex_long = "RateOfActivity REGION S1D1 CGLFRCFURX 1 0.0 0.0 0.0 0.0 0.0 0.3284446367303371 0.3451714779880536 0.3366163200621617 0.3394945166233896 0.3137488154250392 0.28605725055560716 0.2572505015401749 0.06757558148965725 0.0558936625751148 0.04330608461292407 0.0" - - cplex_mid_empty = ( - pd.DataFrame( - data=[], - columns=["REGION", "TECHNOLOGY", "YEAR", "VALUE"], - ) - .astype({"VALUE": float}) - .set_index(["REGION", "TECHNOLOGY", "YEAR"]) - ) - - cplex_mid_short = pd.DataFrame( - data=[ - ["REGION", "CDBACKSTOP", 2017, 137958.8400384134], - ["REGION", "CDBACKSTOP", 2018, 305945.38410619126], - ["REGION", "CDBACKSTOP", 2019, 626159.9611543404], - ], - columns=["REGION", "TECHNOLOGY", "YEAR", "VALUE"], - ).set_index(["REGION", "TECHNOLOGY", "YEAR"]) - - cplex_mid_long = pd.DataFrame( - data=[ - ["REGION", "S1D1", "CGLFRCFURX", 1, 2020, 0.3284446367303371], - ["REGION", "S1D1", 
"CGLFRCFURX", 1, 2021, 0.3451714779880536], - ["REGION", "S1D1", "CGLFRCFURX", 1, 2022, 0.3366163200621617], - ["REGION", "S1D1", "CGLFRCFURX", 1, 2023, 0.3394945166233896], - ["REGION", "S1D1", "CGLFRCFURX", 1, 2024, 0.3137488154250392], - ["REGION", "S1D1", "CGLFRCFURX", 1, 2025, 0.28605725055560716], - ["REGION", "S1D1", "CGLFRCFURX", 1, 2026, 0.2572505015401749], - ["REGION", "S1D1", "CGLFRCFURX", 1, 2027, 0.06757558148965725], - ["REGION", "S1D1", "CGLFRCFURX", 1, 2028, 0.0558936625751148], - ["REGION", "S1D1", "CGLFRCFURX", 1, 2029, 0.04330608461292407], - ], - columns=[ - "REGION", - "TIMESLICE", - "TECHNOLOGY", - "MODE_OF_OPERATION", - "YEAR", - "VALUE", - ], - ).set_index(["REGION", "TIMESLICE", "TECHNOLOGY", "MODE_OF_OPERATION", "YEAR"]) - - dataframe_short = { - "AnnualFixedOperatingCost": pd.DataFrame( - data=[ - ["REGION", "CDBACKSTOP", 2017, 137958.8400384134], - ["REGION", "CDBACKSTOP", 2018, 305945.3841061913], - ["REGION", "CDBACKSTOP", 2019, 626159.9611543404], - ], - columns=["REGION", "TECHNOLOGY", "YEAR", "VALUE"], - ).set_index(["REGION", "TECHNOLOGY", "YEAR"]) - } + cplex_data = """ + +
+<CPLEXSolution version="1.2"> + <variables> +  <variable name="NewCapacity(SIMPLICITY,ETHPLANT,2014)" index="0" value="0"/> +  <variable name="NewCapacity(SIMPLICITY,ETHPLANT,2015)" index="1" value="0.030000000000000027"/> +  <variable name="NewCapacity(SIMPLICITY,ETHPLANT,2016)" index="2" value="0.030999999999999917"/> +  <variable name="RateOfActivity(SIMPLICITY,ID,HYD1,1,2020)" index="3" value="0.25228800000000001"/> +  <variable name="RateOfActivity(SIMPLICITY,ID,HYD1,1,2021)" index="4" value="0.25228800000000001"/> +  <variable name="RateOfActivity(SIMPLICITY,ID,HYD1,1,2022)" index="5" value="0.25228800000000001"/> + </variables> +</CPLEXSolution>""" -    dataframe_long = { -        "RateOfActivity": pd.DataFrame( -            data=[ -                ["REGION", "S1D1", "CGLFRCFURX", 1, 2020, 0.3284446367303371], -                ["REGION", "S1D1", "CGLFRCFURX", 1, 2021, 0.3451714779880536], -                ["REGION", "S1D1", "CGLFRCFURX", 1, 2022, 0.3366163200621617], -                ["REGION", "S1D1", "CGLFRCFURX", 1, 2023, 0.3394945166233896], -                ["REGION", "S1D1", "CGLFRCFURX", 1, 2024, 0.3137488154250392], -                ["REGION", "S1D1", "CGLFRCFURX", 1, 2025, 0.28605725055560716], -                ["REGION", "S1D1", "CGLFRCFURX", 1, 2026, 0.2572505015401749], -                ["REGION", "S1D1", "CGLFRCFURX", 1, 2027, 0.06757558148965725], -                ["REGION", "S1D1", "CGLFRCFURX", 1, 2028, 0.0558936625751148], -                ["REGION", "S1D1", "CGLFRCFURX", 1, 2029, 0.04330608461292407], -            ], -            columns=[ -                "REGION", -                "TIMESLICE", -                "TECHNOLOGY", -                "MODE_OF_OPERATION", -                "YEAR", -                "VALUE", +    def test_convert_to_dataframe(self, user_config): +        input_file = self.cplex_data +        reader = ReadCplex(user_config) +        with StringIO(input_file) as file_buffer: +            actual = reader._convert_to_dataframe(file_buffer) +        # print(actual) +        expected = pd.DataFrame( +            [ +                ["NewCapacity", "SIMPLICITY,ETHPLANT,2015", 0.030000000000000027], +                ["NewCapacity", "SIMPLICITY,ETHPLANT,2016", 0.030999999999999917], +                ["RateOfActivity", "SIMPLICITY,ID,HYD1,1,2020", 0.25228800000000001], +                ["RateOfActivity", "SIMPLICITY,ID,HYD1,1,2021", 0.25228800000000001], +                ["RateOfActivity", "SIMPLICITY,ID,HYD1,1,2022", 0.25228800000000001], ], -        ).set_index(["REGION", "TIMESLICE", "TECHNOLOGY", "MODE_OF_OPERATION", "YEAR"]) -    } - -    test_data = [ -        (cplex_short, dataframe_short), -        (cplex_long, dataframe_long), -    ] - -    @mark.parametrize("cplex_input,expected", test_data, ids=["short", "long"]) -    def test_read_cplex_to_dataframe(self, cplex_input, expected, user_config): -        cplex_reader = ReadCplex(user_config=user_config) - -        input_data = { -            "YEAR": pd.DataFrame(data=list(range(2015, 2031, 1)), columns=["VALUE"]), -            "REGION":
pd.DataFrame(data=["REGION"], columns=["VALUE"]), - "TECHNOLOGY": pd.DataFrame( - data=["CDBACKSTOP", "CGLFRCFURX"], columns=["VALUE"] - ), - "MODE_OF_OPERATION": pd.DataFrame(data=[1], columns=["VALUE"]), - "TIMESLICE": pd.DataFrame(data=["S1D1"], columns=["VALUE"]), - } - - with StringIO(cplex_input) as file_buffer: - actual, _ = cplex_reader.read(file_buffer, input_data=input_data) - for name, item in actual.items(): - pd.testing.assert_frame_equal(item, expected[name]) - - test_data_mid = [(cplex_short, cplex_mid_short), (cplex_long, cplex_mid_long)] - - def test_read_empty_cplex_to_dataframe(self, user_config): - cplex_input = self.cplex_empty - - cplex_reader = ReadCplex(user_config) + columns=["Variable", "Index", "Value"], + ).astype({"Variable": str, "Index": str, "Value": float}) - input_data = { - "YEAR": pd.DataFrame(data=list(range(2015, 2031, 1)), columns=["VALUE"]) - } + pd.testing.assert_frame_equal(actual, expected) - with StringIO(cplex_input) as file_buffer: - data, _ = cplex_reader.read(file_buffer, input_data=input_data) - assert "AnnualFixedOperatingCost" in data + def test_solution_to_dataframe(self, user_config): + input_file = self.cplex_data + reader = ReadCplex(user_config) + with StringIO(input_file) as file_buffer: + actual = reader.read(file_buffer) + # print(actual) expected = ( pd.DataFrame( - data=[], + [ + ["SIMPLICITY", "ETHPLANT", 2015, 0.030000000000000027], + ["SIMPLICITY", "ETHPLANT", 2016, 0.030999999999999917], + ], columns=["REGION", "TECHNOLOGY", "YEAR", "VALUE"], ) - .astype({"REGION": str, "VALUE": float, "YEAR": int, "TECHNOLOGY": str}) + .astype({"REGION": str, "TECHNOLOGY": str, "YEAR": int, "VALUE": float}) .set_index(["REGION", "TECHNOLOGY", "YEAR"]) ) - actual = data["AnnualFixedOperatingCost"] - pd.testing.assert_frame_equal(actual, expected, check_index_type=False) - - test_data_to_cplex = [ - (cplex_empty, cplex_mid_empty), - (cplex_short, cplex_mid_short), - (cplex_long, cplex_mid_long), - ] - - 
@mark.parametrize( - "cplex_input,expected", test_data_to_cplex, ids=["empty", "short", "long"] - ) - def test_convert_cplex_to_df(self, cplex_input, expected, user_config): - data = cplex_input.split("\t") - variable = data[0] - cplex_reader = ReadCplex(user_config=user_config) - actual = cplex_reader.convert_df([data], variable, 2015, 2030) - pd.testing.assert_frame_equal(actual, expected, check_index_type=False) + pd.testing.assert_frame_equal(actual[0]["NewCapacity"], expected) - def test_convert_lines_to_df_empty(self, user_config): - - data = [ - [ - "AnnualFixedOperatingCost", - "REGION", - "AOBACKSTOP", - "0", - "0", - "0", - "0", - "0", - "0", - "0", - "0", - "0", - ] - ] - variable = "AnnualFixedOperatingCost" - cplex_reader = ReadCplex(user_config) - actual = cplex_reader.convert_df(data, variable, 2015, 2023) - pd.testing.assert_frame_equal( - actual, + expected = ( pd.DataFrame( - data=[], - columns=["REGION", "TECHNOLOGY", "YEAR", "VALUE"], + [ + ["SIMPLICITY", "ID", "HYD1", 1, 2020, 0.25228800000000001], + ["SIMPLICITY", "ID", "HYD1", 1, 2021, 0.25228800000000001], + ["SIMPLICITY", "ID", "HYD1", 1, 2022, 0.25228800000000001], + ], + columns=[ + "REGION", + "TIMESLICE", + "TECHNOLOGY", + "MODE_OF_OPERATION", + "YEAR", + "VALUE", + ], + ) + .astype( + { + "REGION": str, + "TIMESLICE": str, + "TECHNOLOGY": str, + "MODE_OF_OPERATION": int, + "YEAR": int, + "VALUE": float, + } + ) + .set_index( + ["REGION", "TIMESLICE", "TECHNOLOGY", "MODE_OF_OPERATION", "YEAR"] ) - .astype({"REGION": str, "TECHNOLOGY": str, "YEAR": int, "VALUE": float}) - .set_index(["REGION", "TECHNOLOGY", "YEAR"]), - check_index_type=False, ) - - def test_check_datatypes_with_empty(self): - - df = pd.DataFrame(data=[], columns=["REGION", "FUEL", "YEAR", "VALUE"]) - - parameter = "AccumulatedAnnualDemand" - - config_dict = { - "AccumulatedAnnualDemand": { - "indices": ["REGION", "FUEL", "YEAR"], - "type": "param", - "dtype": float, - "default": 0, - }, - "REGION": {"dtype": "str", 
"type": "set"}, - "FUEL": {"dtype": "str", "type": "set"}, - "YEAR": {"dtype": "int", "type": "set"}, - } - - actual = check_datatypes(df, config_dict, parameter) - - expected = pd.DataFrame( - data=[], columns=["REGION", "FUEL", "YEAR", "VALUE"] - ).astype({"REGION": str, "FUEL": str, "YEAR": int, "VALUE": float}) - - pd.testing.assert_frame_equal(actual, expected, check_index_type=False) + pd.testing.assert_frame_equal(actual[0]["RateOfActivity"], expected) class TestReadGurobi: @@ -1247,3 +1163,43 @@ def test_whitespace_converter( reader = ReadCsv(user_config=user_config, keep_whitespace=keep_whitespace) actual = reader._whitespace_converter(indices) assert actual == expected + + +class TestLongifyData: + """Tests for the preprocess.longify_data module""" + + # example availability factor data + data_valid = pd.DataFrame( + [ + ["SIMPLICITY", "ETH", 2014, 1.0], + ["SIMPLICITY", "RAWSUG", 2014, 0.5], + ["SIMPLICITY", "ETH", 2015, 1.03], + ["SIMPLICITY", "RAWSUG", 2015, 0.51], + ["SIMPLICITY", "ETH", 2016, 1.061], + ["SIMPLICITY", "RAWSUG", 2016, 0.519], + ], + columns=["REGION", "FUEL", "YEAR", "VALUE"], + ) + + data_invalid = pd.DataFrame( + [ + ["SIMPLICITY", "ETH", "invalid", 1.0], + ["SIMPLICITY", "RAWSUG", 2014, 0.5], + ], + columns=["REGION", "FUEL", "YEAR", "VALUE"], + ) + + def test_check_datatypes_valid(self, user_config): + df = self.data_valid.astype( + {"REGION": str, "FUEL": str, "YEAR": int, "VALUE": float} + ) + actual = check_datatypes(df, user_config, "AvailabilityFactor") + expected = df.copy() + + pd.testing.assert_frame_equal(actual, expected) + + def test_check_datatypes_invalid(self, user_config): + df = self.data_invalid + + with raises(ValueError): + check_datatypes(df, user_config, "AvailabilityFactor")