
Reproducing ETTh results in the original paper #237

@Ivan-Fan

Description

Describe the bug
Hi, I am a graduate student, currently working on evaluation studies for time-series forecasting models. Really appreciate the splendid work! And we (with my collaborators) really appreciate the help (if possible) in reproducing the results of Moirai, one of the SOTA TS foundation models nowadays. We will be more than happy to report the valid performance of all baselines in our work.

We are trying to reproduce the zero-shot performance of Moirai reported in "Unified Training of Universal Time Series Forecasting Transformers" (we will move on to the MoE and second versions later). The paper reports MSE/MAE of 0.375/0.402 for Moirai-Small on ETTh1 and 0.281/0.334 on ETTh2 with prediction_length=96. However, we can only get 0.415/0.419 on ETTh1 and a loss greater than 10 on ETTh2. To avoid misrepresenting this powerful model: could you share a small snippet that reproduces the zero-shot results on one small dataset such as ETTh1? We found the fine-tuning pipeline but not the exact zero-shot evaluation (or we missed it somewhere). Any guidance on model configuration and pipeline setup would also be appreciated.

To Reproduce
Here is the code we use to initialize Moirai (version 1, small) and produce forecasts.

import torch
from uni2ts.model.moirai import MoiraiForecast, MoiraiModule

# initialize the model with parameters for ETTh1/2
model = MoiraiForecast(
    module=MoiraiModule.from_pretrained("Salesforce/moirai-1.1-R-small"),
    prediction_length=96,
    context_length=1000,
    patch_size=32,
    num_samples=100,
    target_dim=7,
    feat_dynamic_real_dim=0,
    past_feat_dynamic_real_dim=0,
).to(self.device)

# we follow the data loader and normalization from Time-series-library: https://github.com/thuml/Time-Series-Library/blob/main/data_provider/data_loader.py

# for batch prediction, we follow the Jupyter notebook tutorial and adapt it to torch tensor input.
# We also noticed there is GluonTS dataset format support. Is there any difference between these two?
with torch.no_grad():
    for i, (batch_x, batch_y, _, _) in enumerate(test_loader):

        # batch_x shape: (batch_size, seq_len, n_variates)
        batch_x = batch_x.to(self.device).float()
        batch_size, seq_len, _ = batch_x.shape  # seq_len/batch_size were previously undefined

        past_target = batch_x[:, -seq_len:, :]
        past_observed_target = torch.ones_like(past_target, dtype=torch.bool)
        past_is_pad = torch.zeros((batch_size, seq_len), dtype=torch.bool, device=self.device)

        # Moirai forward: returns forecast samples with a sample dimension at dim=1
        pred = model(
            past_target=past_target,
            past_observed_target=past_observed_target,
            past_is_pad=past_is_pad,
        )

        # mean over samples for a point forecast
        # shape: (batch_size, pred_len, n_variates)
        pred = pred.mean(dim=1)
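For completeness, here is how we compute the reported MSE/MAE from the mean forecast. This is a minimal, self-contained sketch with random stand-in arrays (in the real run, `pred` comes from the loop above and `true` is the matching slice of `batch_y`); we assume, following the Time-Series-Library convention, that metrics are computed in the standardized space, with channel-wise mean/std fitted on the train split.

```python
import numpy as np

# Hypothetical stand-ins for the real data, with the same shapes as above:
# pred: (batch_size, pred_len, n_variates), true: matching ground truth.
rng = np.random.default_rng(0)
train = rng.normal(loc=5.0, scale=2.0, size=(1000, 7))     # raw training split
pred = rng.normal(size=(4, 96, 7))                          # forecast, already in scaled space
true_raw = rng.normal(loc=5.0, scale=2.0, size=(4, 96, 7))  # raw test window

# Channel-wise standardization fitted on the train split only
# (this mirrors how Time-Series-Library-style loaders scale the data).
mean, std = train.mean(axis=0), train.std(axis=0)
true = (true_raw - mean) / std

mse = float(np.mean((pred - true) ** 2))
mae = float(np.mean(np.abs(pred - true)))
```

If the ground truth were left unscaled while the model consumes scaled inputs, the error would blow up, which may explain the >10 loss we see on ETTh2 if our scaling is off.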

Expected behavior
As discussed above.

Error message or code output
As discussed above.

Environment

  • Operating system: Ubuntu 22.04.3 LTS
  • Python version: Python 3.11.11
  • PyTorch version: 2.4.1
  • uni2ts version: 2.0.0 (latest)
