Skip to content

hydrogeohc/Energy-forecasting-API

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

12 Commits
 
 
 
 
 
 
 
 
 
 
 
 

Repository files navigation

Energy consumption forecasting

Overview

This project focuses on modeling and predicting energy consumption using a variety of machine-learning techniques. It involves loading, preprocessing, and exploring energy consumption data, engineering relevant features, building predictive models, and evaluating their performance. The project also implements a Flask API to serve the model evaluation results.

File Structure

├── app.py                       # Flask application for serving model results via API
├── data_preprocessing.py        # Data loading, preprocessing, feature engineering, and visualization
├── model_building.py            # Model training, evaluation (including LSTM), and time series cross-validation
├── energy_consumption_data.csv  # The dataset used for training and evaluation
├── requirements.txt             # List of Python dependencies for the project
└── README.md                    # This file (project overview and setup instructions)

Dependencies

The project relies on the following Python libraries:

  • pandas: For data manipulation and analysis
  • scikit-learn: For various machine learning models and evaluation metrics
  • tensorflow/keras: For building and training the LSTM model
  • Flask: For creating the web API
  • numpy: For numerical computations

To install all required dependencies, it is recommended to use a virtual environment and run:

pip install -r requirements.txt

A sample requirements.txt is shown below:

flask==3.1.0
pandas==2.2.3
scikit-learn==1.6.1
tensorflow==2.18.0
numpy==2.02
seaborn==0.13.2
matplotlib==3.10.1

Usage

  1. Clone the repository:

    git clone https://github.com/hydrogeohc/Optim_Energy
    cd Optim_Energy
    
  2. Create a virtual environment:

    python3 -m venv venv  # create the virtual environment
    source venv/bin/activate   # activate on macOS/Linux
    # venv\Scripts\activate   # activate on Windows
    
  3. Install dependencies:

    pip install --upgrade pip # upgrade pip
    pip install -r requirements.txt # Install the dependencies
    
  4. Run the Flask API:

    python app.py
    
  5. Access the API endpoints:

    • http://127.0.0.1:5000/ - Returns a welcome message.
    • http://127.0.0.1:5000/model_results - Retrieves model evaluation results in JSON format. Note: The AirPlay Receiver service on macOS may conflict with port 5000. If you encounter issues, try changing the Flask port to 5001 in app.py and accessing http://127.0.0.1:5001/model_results.

Models

The following machine learning models are implemented and evaluated in this project:

  • Linear Regression
  • Random Forest
  • Gradient Boosting
  • LSTM (Long Short-Term Memory)

Note: The traditional models (Linear Regression, Random Forest, and Gradient Boosting) are evaluated using Time Series Cross-Validation. However, the current implementation evaluates the LSTM using a single train/test split. The LSTM results are therefore not directly comparable to the other models.

Data

The energy_consumption_data.csv file contains the historical energy consumption data used to train and evaluate the models. The dataset includes features such as:

  • timestamp: Date and time of the reading
  • temperature: Temperature at the time of the reading
  • humidity: Humidity at the time of the reading
  • energy_consumption: The target variable, representing the energy consumed

The project also engineers new features such as:

  • temp_humidity_interaction: Interaction term between temperature and humidity
  • energy_consumption_lag1: Energy consumption from the previous time period (lag 1)
  • energy_consumption_lag2: Energy consumption from two time periods prior (lag 2)
  • hour, day_of_week, is_weekend: Temporal features derived from the timestamp.

Evaluation Metrics

The models are evaluated based on the following metrics:

  • Mean Squared Error (MSE)
  • Mean Absolute Error (MAE)
  • Bias

Model Selection

Model Evaluation Results

The following table presents the model evaluation results, including Mean Squared Error (MSE), Mean Absolute Error (MAE), and Bias:

Model MSE MAE Bias
Random Forest 14245.34 104.40 -8.63
Gradient Boosting 15696.48 106.75 -9.58
Linear Regression 16427.58 109.70 -12.03
LSTM 18726.25 109.30 -18.30

Base on the above results, the Random Forest seems to be the best with all the lower MSE, MAE, and closeer to zero bias.

  • note:
  • MSE might be the most important metric when the project wants to avoid large errors
  • MAE might be preferred when the project want a model that's generally accurate
  • Bias is important when the project need a model that doesn't systematically over- or under-predict.

Troubleshooting

  • ModuleNotFoundError: Ensure that all Python dependencies are installed in your virtual environment. If you see the error, install the libraries that are missing.
  • File Permissions: Make sure that the energy_consumption_data.csv file is readable by the Flask application.

License

MIT License

About

No description, website, or topics provided.

Resources

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published

Languages