rope.frequencies missing exception #518

Open
kc334 opened this issue Jan 2, 2025 · 0 comments
kc334 commented Jan 2, 2025

I tried to use RoPE with this command:

dora --verbose run  \
    solver=musicgen/musicgen_base_32khz_test \
    model/lm/model_scale=xsmall \
    conditioner=none \
    transformer_lm.positional_embedding=sin_rope \
    --clear

This is the configuration file musicgen_base_32khz_test:

# @package __global__

# This is the training loop solver
# for the base MusicGen model (text-to-music)
# on monophonic audio sampled at 32 kHz
defaults:
  - musicgen/default
  - /model: lm/musicgen_lm
  # - override /dset: audio/default
  # - override /model/lm/model_scale: xsmall
  - override /dset: audio/example
  - _self_

autocast: true
autocast_dtype: float16

# EnCodec large trained on mono-channel music audio sampled at 32khz
# with a total stride of 640 leading to 50 frames/s.
# rvq.n_q=4, rvq.bins=2048, no quantization dropout
# (transformer_lm card and n_q must be compatible)
compression_model_checkpoint: //pretrained/facebook/encodec_32khz

channels: 1
sample_rate: 32000

deadlock:
  use: true  # deadlock detection

dataset:
  batch_size: 2  # 32 GPUs
  # segment_duration: 1
  sample_on_weight: false  # Uniform sampling all the way
  sample_on_duration: false  # Uniform sampling all the way
  valid:
    num_samples: 1
  generate:
    num_samples: 1
  evaluate:
    num_samples: 0
  num_workers: 2


evaluate:
  metrics:
    kld: false

metrics:
  kld:
    use_gt: false
    model: passt
    passt:
      pretrained_length: 20

generate:
  lm:
    use_sampling: true
    top_k: 250
    top_p: 0.0
  

optim:
  epochs: 1
  updates_per_epoch: 1
  optimizer: dadam
  lr: 1e-6
  ema:
    use: true
    updates: 10
    device: cpu

logging:
  log_tensorboard: true

schedule:
  lr_scheduler: cosine
  cosine:
    warmup: 100
    lr_min_ratio: 0.0
    cycle_length: 1.0

tensorboard:
  name: debug_test

and got the following exception after finishing the training epoch:

Exception has occurred: RuntimeError       (note: full exception trace is shown but execution is paused at: _run_module_as_main)
Error(s) in loading state_dict for LMModel:
	Missing key(s) in state_dict: "transformer.layers.0.self_attn.rope.frequencies", "transformer.layers.1.self_attn.rope.frequencies". 

I suspect a similar issue could occur with xpos turned on.
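
For context, this looks like a mismatch between a checkpoint that does not contain the RoPE frequencies buffer and a freshly built LMModel that expects it. Below is a minimal, self-contained PyTorch sketch of that mechanism only; RopeLike, its arguments, and the EMA remark are illustrative assumptions, not audiocraft's actual rope.py or a confirmed root cause.

import torch
import torch.nn as nn

class RopeLike(nn.Module):
    """Hypothetical stand-in for a rotary-embedding module that keeps a
    deterministic `frequencies` buffer (not audiocraft's real code)."""
    def __init__(self, dim: int = 8, max_period: float = 10000.0, persistent: bool = True):
        super().__init__()
        adim = torch.arange(0, dim, 2, dtype=torch.float32)
        # The buffer is fully recomputable from the constructor arguments,
        # so it does not strictly need to live in the checkpoint.
        self.register_buffer("frequencies", 1.0 / (max_period ** (adim / dim)),
                             persistent=persistent)

# A state_dict produced by a module whose buffer is not persisted
# (e.g. registered with persistent=False, or a state that only tracks
# parameters) ...
state = RopeLike(persistent=False).state_dict()  # contains no 'frequencies' key

# ... fails to load into a module that does expect the buffer:
try:
    RopeLike(persistent=True).load_state_dict(state)
except RuntimeError as exc:
    print(exc)  # Missing key(s) in state_dict: "frequencies".

# Possible workarounds on either side (assumptions, not a confirmed fix):
#  * register the buffer with persistent=False, since __init__ recreates it, or
#  * load with strict=False and keep the freshly initialized buffer.
RopeLike(persistent=True).load_state_dict(state, strict=False)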
