Modules
The pbdl package consists of three modules, each designed for specific use cases in physics-based deep learning:
- pbdl.loader: Provides basic dataset access using NumPy arrays.
- pbdl.torch.loader: Supports dataset loading for training models in PyTorch.
- pbdl.torch.phi.loader: Supports dataset loading for training models in PyTorch with an integrated solver.
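Each module exposes its own Dataloader class, so the import line determines which variant is used (all three imports appear in the examples below):

# pick exactly one of these, depending on the use case
from pbdl.loader import Dataloader            # NumPy arrays
from pbdl.torch.loader import Dataloader      # PyTorch tensors
from pbdl.torch.phi.loader import Dataloader  # PyTorch tensors + solver integration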
pbdl.loader
This module is suitable for loading datasets when no training is involved and NumPy arrays are sufficient.
A Dataloader instance requires at least two arguments:
- dataset name (positional): The name of the dataset to be loaded.
- time_steps: The interval between input and target frame. If set to None, this interval is maximal (number of frames in the simulation minus one).
Additionally, it accepts the following keyword arguments:
- sel_sims: Select specific simulations. By default, all simulations are included.
- trim_start/trim_end: Discard the initial or final sequence of frames, which may be uninteresting.
- step_size: Use every k-th frame (thinning out datasets with many frames). By default, the step size is 1.
- normalize_data/normalize_const: Choose from the available normalization strategies. By default, normalization is disabled.
- batch_size: Define the number of samples in each batch.
- shuffle: Determine whether the samples should be provided in a random order.
- intermediate_time_steps: If enabled, not only the initial and target frames are supplied but also all intermediate frames. Useful for computing accumulated errors over multiple time steps.
For a convenient way to use all simulation frames, set the all_time_steps flag. Note that this flag also controls related settings like time steps, step size, and intermediate time steps; a sketch of its use follows.
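As an illustrative sketch (it assumes, per the note above, that the flag takes the place of the time_steps, step_size, and intermediate_time_steps settings):

# hedged sketch: all_time_steps is assumed to override time_steps,
# step_size, and intermediate_time_steps, so none of them are passed here
from pbdl.loader import Dataloader

loader = Dataloader(
    "incompressible-wake-flow-tiny",
    all_time_steps=True,  # use every frame of each simulation
    batch_size=3,
)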
The following code provides a minimal example:
from pbdl.loader import Dataloader
import matplotlib.pyplot as plt
loader = Dataloader(
    "incompressible-wake-flow-tiny",
    time_steps=10,  # interval between input and target frame
    sel_sims=[0],  # select first simulation
    batch_size=3,
    shuffle=True,
)

inputs, targets = next(iter(loader))

for i in range(len(inputs)):
    plt.subplot(2, len(inputs), i + 1)
    plt.imshow(inputs[i][0])  # display field at index 0
    plt.axis("off")
    plt.title("input {}".format(i + 1))

for i in range(len(targets)):
    plt.subplot(2, len(targets), len(targets) + i + 1)
    plt.imshow(targets[i][0])  # display field at index 0
    plt.axis("off")
    plt.title("target {}".format(i + 1))
plt.show()

pbdl.torch.loader
This module is suitable for loading datasets for training with PyTorch. Unlike the dataloader in the previous module, the dataloader from pbdl.torch.loader returns a pair (input tensor, target tensor), where both elements are PyTorch tensors. Each layer of the tensors represents a physical field or a constant.
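A quick, illustrative way to see this layout is to inspect the shapes of one batch (channel counts are dataset-dependent; the shape comments below are placeholders, not guaranteed values):

# hedged sketch: inspect the (input, target) pair returned by the loader
from pbdl.torch.loader import Dataloader

loader = Dataloader("transonic-cylinder-flow-tiny", time_steps=10, batch_size=3)
input, target = next(iter(loader))

print(input.shape)   # (batch, field layers + constant layers, spatial dims...)
print(target.shape)  # (batch, field layers, spatial dims...); the constant
                     # layers are typically not part of the target (see
                     # cat_constants in pbdl.torch.phi.loader below)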
The following code provides a minimal example:
import torch
import numpy as np
from pbdl.torch.loader import Dataloader
import examples.tcf.net_small as net_small
loader = Dataloader(
    "transonic-cylinder-flow-tiny",
    time_steps=10,
    sel_sims=[0, 1],
    step_size=3,
    normalize_data="std",
    batch_size=3,
    shuffle=True,
)
net = net_small.NetworkSmall()
criterionL2 = torch.nn.MSELoss()
optimizer = torch.optim.Adam(net.parameters(), lr=0.0001, weight_decay=0.0)
for epoch in range(5):
    for i, (input, target) in enumerate(loader):
        net.zero_grad()
        output = net(input)
        loss = criterionL2(output, target)
        loss.backward()
        optimizer.step()
    print(f"epoch {epoch}, loss {loss.item()}")

pbdl.torch.phi.loader
This module is suitable if you want to integrate a (PhiFlow) solver into the training loop of your PyTorch program. It introduces new features that must be enabled using the following parameters:
- batch_by_const: A list of indices representing constants. It ensures that all samples in a batch share the same constant values. This is useful when using a solver function that requires a batch of samples but only one scalar value for each constant (see the sketch below).
- ret_batch_const: When enabled, the loader also returns the non-normalized constants for the batch. This option is only available if batching by constants is enabled.
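As a hedged sketch of these two flags (dataset name and constant index are taken from the full example below; the printed value is a placeholder):

# illustrative sketch: every sample in a batch shares the same constant,
# and ret_batch_const additionally yields it in non-normalized form
from pbdl.torch.phi.loader import Dataloader

loader = Dataloader(
    "ks-dataset",
    5,                     # interval between input and target frame
    batch_size=16,
    batch_by_const=[0],    # group batches by the constant at index 0
    ret_batch_const=True,  # also return the raw constants per batch
)

input, target, const = next(iter(loader))
print(const[0])  # one shared, non-normalized value for the whole batch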
Additionally, the module provides auxiliary functions for converting tensors between PyTorch and PhiFlow:
- to_phiflow(t): Converts network input to solver input by removing constant layers.
- from_phiflow(t): Converts solver output to match the network output format.
- cat_constants(t, l): Concatenates the constant layers from tensor l onto tensor t. This is useful because the network output does not include the constant layers required for the network input in the next iteration (sketched below).
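A hedged sketch of how the three helpers fit together in a single rollout step; this is a fragment, not a complete program, and loader, net, diff_ks, input, and domain_size are the names set up in the full example below:

# illustrative fragment: one solver/correction step using the helpers
solver_in = loader.to_phiflow(input)       # drop constant layers for the solver
solver_out = diff_ks.etd1(solver_in, domain_size)
output = loader.from_phiflow(solver_out)   # back to network-output layout

# re-attach the constant layers so the result can serve as the next input
next_input = loader.cat_constants(output, input)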
The following code provides a minimal example:
import torch
from pbdl.torch.phi.loader import Dataloader
from examples.ks.ks_networks import ConvResNet1D
from examples.ks.ks_solver import DifferentiableKS
# solver parameters
DOMAIN_SIZE_BASE = 8
PREDHORZ = 5
device = "cuda:0" if torch.cuda.is_available() else "cpu"
diff_ks = DifferentiableKS(resolution=48, dt=0.5)
loader = Dataloader(
    "ks-dataset",
    PREDHORZ,
    step_size=20,
    intermediate_time_steps=True,
    batch_size=16,
    batch_by_const=[0],
    ret_batch_const=True,
)
net = ConvResNet1D(16, 3, device=device)
optimizer = torch.optim.Adam(net.parameters(), lr=1e-4)
loss = torch.nn.MSELoss()
for epoch in range(4):
    for i, (input, targets, const) in enumerate(loader):
        input = input.to(device)
        targets = targets.to(device)

        optimizer.zero_grad()
        domain_size = const[0]

        inputs = [input]
        outputs = []

        for _ in range(PREDHORZ):
            output_solver = diff_ks.etd1(
                loader.to_phiflow(inputs[-1]), DOMAIN_SIZE_BASE * domain_size
            )
            correction = diff_ks.dt * net(inputs[-1])
            output_combined = loader.from_phiflow(output_solver) + correction

            outputs.append(output_combined)
            inputs.append(loader.cat_constants(outputs[-1], inputs[0]))

        outputs = torch.stack(outputs, dim=1)
        loss_value = loss(outputs, targets)

        loss_value.backward()
        optimizer.step()

    print(f"epoch {epoch}, loss {loss_value.item() * 10000.0:.3f}")