Process-guided deep learning for lake water temperature

This repository contains the code used to predict daily water temperature profiles (0-20 m) in Lake Mendota using deep learning models with physical constraints.

Overview

The project investigates four aspects of lake temperature modeling:

Base Models — Compare four deep learning architectures (LSTM, Transformer, CNN-LSTM, AttentionLSTM) trained on observational data with varying training set sizes (20%-100%).
Pretraining — Pretrain models on simulation data from the General Lake Model (GLM), then finetune on real observations.
Ensemble — Combine predictions from the four base models using a depth-wise weighted ensemble.
Process-Guided Loss — Add an energy conservation constraint to the loss function to improve physical consistency.

Repository Structure

process_guided_deep_learning/
├── Base Model/              # Train four DL models on observational data
├── Pretraining/             # Pretrain on simulation, then finetune on observations
├── Ensemble/                # Depth-wise weighted ensemble of base models
├── Loss Function/           # Energy-conservation-constrained ensemble
├── Simulation/              # GLM setup and simulation script
└── Validation/              # Monthly energy balance analysis

Environment and Data Setup

This repository was developed primarily for Google Colab, with data and outputs stored on Google Drive.

The project uses daily weather and water temperature data from Lake Mendota, together with lake bathymetry, ice-cover records, and GLM simulation output. The expected input files are referenced in each module's environment_configuration.py.

Before running the scripts:

Mount Google Drive in Colab.
Update the paths in each environment_configuration.py file if your folder structure is different.
Check the additional hard-coded Google Drive paths in the main training and evaluation scripts, such as:
- Base Model/basemodel_training.py
- Base Model/figures_plot.py
- Pretraining/basemodel_pretraining.py
- Pretraining/basemodel_training.py
- Pretraining/figures_plot.py
- Ensemble/ensemble_data_processing.py
- Ensemble/ensemble_evaluation.py
- Ensemble/parameter_tuning_ensemble.py
- Loss Function/ensemble_data_processing.py
- Loss Function/energy_parameter_tuning.py
- Loss Function/ensemble_energy_evaluation.py
- Validation/monthly_energy_analysis.py
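
One way to keep these path updates manageable is to derive every module folder from a single root. The sketch below illustrates that idea only: `PROJECT_ROOT`, `build_paths`, and `missing_paths` are hypothetical names that do not appear in the repository's environment_configuration.py files, and the Drive location is an assumption about a typical Colab mount point.

```python
from pathlib import Path

# Assumed location of the project on a mounted Google Drive; adjust to your layout.
PROJECT_ROOT = Path("/content/drive/MyDrive/process_guided_deep_learning")

# Module folder names as listed under Repository Structure.
MODULES = ["Base Model", "Pretraining", "Ensemble",
           "Loss Function", "Simulation", "Validation"]


def build_paths(root=PROJECT_ROOT):
    """Map each module name to its folder under a single project root."""
    root = Path(root)
    return {name: root / name for name in MODULES}


def missing_paths(paths):
    """Return the module names whose folders do not exist on disk."""
    return [name for name, folder in paths.items() if not folder.exists()]
```

Calling `missing_paths(build_paths())` after mounting Drive gives a quick list of folders that still need to be created or re-pointed before any training script is run.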

Requirements

Python 3.8+
PyTorch
TensorFlow
hyperopt
scikit-learn
numpy
pandas
matplotlib
seaborn
tqdm
For simulation: R with packages GLM3r, glmtools, rLakeAnalyzer, tidyverse, ncdf4

Recommended Execution Order

1. Simulation

If you want to regenerate GLM simulation output:

Run Simulation/general_lake_model.R
Launch this script with the Simulation folder as the working directory.

2. Base Model

Run Base Model/basemodel_training.py
Run Base Model/figures_plot.py

3. Pretraining

Run Pretraining/basemodel_pretraining.py
Run Pretraining/basemodel_training.py
Run Pretraining/figures_plot.py
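
The pretrain-then-finetune sequence can be illustrated with a deliberately tiny stand-in. The sketch below is not the repository's training code: it swaps the deep learning models for a one-variable linear model in plain Python, and the "simulated" and "observed" values are synthetic numbers invented for the example. It only shows the pattern of fitting on plentiful simulation output first, then continuing from those weights on sparse observations with a smaller learning rate.

```python
def train(weights, data, lr, epochs):
    """Gradient descent on mean squared error for y = w * x + b."""
    w, b = weights
    n = len(data)
    for _ in range(epochs):
        grad_w = sum(2 * (w * x + b - y) * x for x, y in data) / n
        grad_b = sum(2 * (w * x + b - y) for x, y in data) / n
        w -= lr * grad_w
        b -= lr * grad_b
    return w, b


def mse(weights, data):
    """Mean squared error of the linear model on a dataset."""
    w, b = weights
    return sum((w * x + b - y) ** 2 for x, y in data) / len(data)


# Synthetic stand-ins (assumptions): dense simulation output, sparse observations.
simulated = [(k / 10, 2.0 * (k / 10) + 1.0) for k in range(50)]
observed = [(0.5, 2.3), (2.0, 5.4), (4.0, 9.6)]

# Stage 1: pretrain from scratch on the simulated data.
pretrained = train((0.0, 0.0), simulated, lr=0.01, epochs=500)

# Stage 2: finetune on observations, starting from the pretrained weights.
finetuned = train(pretrained, observed, lr=0.005, epochs=200)
```

Starting the second stage from `pretrained` rather than from zero is the whole point of the design: the observations only need to correct the simulation-derived fit, not establish it.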

4. Ensemble

Run Ensemble/parameter_tuning_ensemble.py
Run Ensemble/ensemble_evaluation.py
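
For readers unfamiliar with depth-wise weighting, the following minimal sketch shows one possible form: each depth gets its own weight vector over the base models, normalized to sum to 1. The function names, the data layout, and the normalization step are assumptions for illustration, and the repository's tuning procedure in parameter_tuning_ensemble.py may differ.

```python
def normalize(weights):
    """Scale non-negative weights so that they sum to 1."""
    total = sum(weights)
    if total <= 0:
        raise ValueError("weights at each depth must have a positive sum")
    return [w / total for w in weights]


def depthwise_ensemble(predictions, weights):
    """Combine model predictions using a separate weight vector per depth.

    predictions[m][d]: temperature (deg C) from model m at depth index d.
    weights[d][m]: non-negative weight of model m at depth index d.
    """
    combined = []
    for d, depth_weights in enumerate(weights):
        w = normalize(depth_weights)
        combined.append(sum(w[m] * predictions[m][d]
                            for m in range(len(predictions))))
    return combined


# Four models, two depths (illustrative numbers only).
predictions = [[20.0, 12.0], [22.0, 12.0], [21.0, 10.0], [21.0, 14.0]]
weights = [[1, 1, 1, 1],   # equal weighting at the first depth
           [2, 1, 1, 0]]   # fourth model ignored at the second depth
profile = depthwise_ensemble(predictions, weights)
```

Allowing the weights to vary with depth lets a model that performs well near the surface contribute less at depths where another model is more accurate.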

5. Process-Guided Loss

Run Loss Function/energy_parameter_tuning.py
Run Loss Function/ensemble_energy_evaluation.py
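
As a rough illustration of an energy conservation constraint, the sketch below penalizes predictions whose day-to-day change in lake heat content disagrees with the net surface heat flux. It is a simplified sketch under stated assumptions, not the repository's loss: it ignores ice cover, inflows, and sediment exchange, takes the net flux as a given input, and uses hypothetical function names, a hypothetical tolerance `tol`, and a hypothetical weight `lam`.

```python
RHO_WATER = 1000.0         # kg m^-3, approximate density of water
CP_WATER = 4186.0          # J kg^-1 K^-1, approximate specific heat of water
SECONDS_PER_DAY = 86400.0


def thermal_energy(temps, areas, dz):
    """Heat content (J) of one profile: temps in deg C, areas in m^2, dz in m."""
    return sum(RHO_WATER * CP_WATER * a * t * dz for t, a in zip(temps, areas))


def energy_penalty(profiles, areas, dz, net_flux, tol=0.0):
    """Mean excess mismatch (W m^-2) between heat-content change and surface flux.

    profiles[i]: predicted temperature profile on day i.
    net_flux[i]: net surface heat flux (W m^-2, positive into the lake)
                 between day i and day i + 1.
    """
    surface_area = areas[0]
    excess = []
    for day in range(1, len(profiles)):
        d_energy = (thermal_energy(profiles[day], areas, dz)
                    - thermal_energy(profiles[day - 1], areas, dz))
        implied_flux = d_energy / (SECONDS_PER_DAY * surface_area)
        mismatch = abs(implied_flux - net_flux[day - 1])
        excess.append(max(0.0, mismatch - tol))
    return sum(excess) / len(excess)


def combined_loss(data_loss, penalty, lam):
    """Data-fit loss plus a weighted physical-consistency penalty."""
    return data_loss + lam * penalty
```

The tolerance lets small imbalances pass unpenalized, which matters because flux estimates are themselves uncertain; the weight `lam` trades data fit against physical consistency and would need tuning.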

6. Validation

Run Validation/monthly_energy_analysis.py

Notes

The repository is organized around comparative experiments on model architecture, pretraining, ensemble learning, and process-guided loss design.
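The current implementation splits sliding-window samples at random. Because neighboring windows can cover overlapping days, training and test samples are not fully independent, so the reported errors are best used to compare models under the same protocol rather than as estimates of performance on unseen periods.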
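
The effect of randomly splitting sliding-window samples can be checked with a small sketch. The window length, series length, split fraction, and all function names below are assumptions chosen for illustration; they are not taken from the repository's scripts.

```python
import random


def make_windows(n_days, window):
    """Return (start, end) day-index ranges for every sliding window."""
    return [(s, s + window) for s in range(n_days - window + 1)]


def random_split(samples, train_fraction, seed=0):
    """Shuffle samples with a fixed seed and split them into train and test."""
    rng = random.Random(seed)
    shuffled = list(samples)
    rng.shuffle(shuffled)
    cut = int(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]


def count_overlapping(train, test):
    """Count test windows that share at least one day with a training window."""
    train_days = {d for start, end in train for d in range(start, end)}
    return sum(any(d in train_days for d in range(start, end))
               for start, end in test)


windows = make_windows(n_days=100, window=7)
train, test = random_split(windows, train_fraction=0.8)
shared = count_overlapping(train, test)
```

A nonzero `shared` means some test windows contain days that the model has already seen during training, which is why results from this kind of split are most meaningful when every model is evaluated on the same split.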
