timbre_transfer error when loading a model trained by ae.gin #333

Open
XinjianOUYANG opened this issue Mar 10, 2021 · 2 comments

Comments

@XinjianOUYANG

Hi,
I have a problem when I load a model trained with ae.gin into the timbre_transfer Colab notebook.
image
I guess the reason is that audio_features doesn't have the key 'z' (it only has 'f0_hz', 'f0_confidence', and 'loudness_db'). But I have no idea how to get 'z' from the original audio file. Could you give me some advice?
Thanks a lot!

@XinjianOUYANG
Author

The previous error has been solved, but I got another one when I loaded a model trained with ae.gin into the timbre_transfer Colab notebook.

InvalidArgumentError Traceback (most recent call last)
in
58
59 #z_feature = model_z.encode(audio_features)
---> 60 outputs_z = model_z(audio_features, training=False) # Run the forward pass, add losses, and create a dictionary of outputs.
61
62 # print(outputs.keys())

~/opt/anaconda3/envs/DDSP/lib/python3.8/site-packages/ddsp/training/models/model.py in __call__(self, return_losses, *args, **kwargs)
52 # Run model.
53 self._losses_dict = {}
---> 54 outputs = super().__call__(*args, **kwargs)
55
56 # Get total loss.

~/opt/anaconda3/envs/DDSP/lib/python3.8/site-packages/tensorflow/python/keras/engine/base_layer.py in __call__(self, *args, **kwargs)
1010 with autocast_variable.enable_auto_cast_variables(
1011 self._compute_dtype_object):
-> 1012 outputs = call_fn(inputs, *args, **kwargs)
1013
1014 if self._activity_regularizer:

~/opt/anaconda3/envs/DDSP/lib/python3.8/site-packages/ddsp/training/models/autoencoder.py in call(self, features, training)
56 def call(self, features, training=True):
57 """Run the core of the network, get predictions and loss."""
---> 58 features = self.encode(features, training=training)
59 features.update(self.decoder(features, training=training))
60

~/opt/anaconda3/envs/DDSP/lib/python3.8/site-packages/ddsp/training/models/autoencoder.py in encode(self, features, training)
42 features.update(self.preprocessor(features, training=training))
43 if self.encoder is not None:
---> 44 features.update(self.encoder(features))
45 return features
46

~/opt/anaconda3/envs/DDSP/lib/python3.8/site-packages/ddsp/training/nn.py in __call__(self, *inputs, **kwargs)
134
135 # Run input tensors through the model.
--> 136 outputs = super().__call__(*inputs, **kwargs)
137
138 # Return dict if call() returns it.

~/opt/anaconda3/envs/DDSP/lib/python3.8/site-packages/tensorflow/python/keras/engine/base_layer.py in __call__(self, *args, **kwargs)
1010 with autocast_variable.enable_auto_cast_variables(
1011 self._compute_dtype_object):
-> 1012 outputs = call_fn(inputs, *args, **kwargs)
1013
1014 if self._activity_regularizer:

~/opt/anaconda3/envs/DDSP/lib/python3.8/site-packages/ddsp/training/encoders.py in call(self, *args, **unused_kwargs)
45 time_steps = int(args[-1].shape[1])
46 inputs = args[:-1] # Last input just used for time_steps.
---> 47 z = self.compute_z(*inputs)
48 return self.expand_z(z, time_steps)
49

~/opt/anaconda3/envs/DDSP/lib/python3.8/site-packages/ddsp/training/encoders.py in compute_z(self, audio)
120
121 # Normalize.
--> 122 z = self.z_norm(mfccs[:, :, tf.newaxis, :])[:, :, 0, :]
123 # Run an RNN over the latents.
124 z = self.rnn(z)

~/opt/anaconda3/envs/DDSP/lib/python3.8/site-packages/tensorflow/python/util/dispatch.py in wrapper(*args, **kwargs)
199 """Call target, and fall back on dispatchers if there is a TypeError."""
200 try:
--> 201 return target(*args, **kwargs)
202 except (TypeError, ValueError):
203 # Note: convert_to_eager_tensor currently raises a ValueError, not a

~/opt/anaconda3/envs/DDSP/lib/python3.8/site-packages/tensorflow/python/ops/array_ops.py in _slice_helper(tensor, slice_spec, var)
1034 var_empty = constant([], dtype=dtypes.int32)
1035 packed_begin = packed_end = packed_strides = var_empty
-> 1036 return strided_slice(
1037 tensor,
1038 packed_begin,

~/opt/anaconda3/envs/DDSP/lib/python3.8/site-packages/tensorflow/python/util/dispatch.py in wrapper(*args, **kwargs)
199 """Call target, and fall back on dispatchers if there is a TypeError."""
200 try:
--> 201 return target(*args, **kwargs)
202 except (TypeError, ValueError):
203 # Note: convert_to_eager_tensor currently raises a ValueError, not a

~/opt/anaconda3/envs/DDSP/lib/python3.8/site-packages/tensorflow/python/ops/array_ops.py in strided_slice(input_, begin, end, strides, begin_mask, end_mask, ellipsis_mask, new_axis_mask, shrink_axis_mask, var, name)
1207 strides = ones_like(begin)
1208
-> 1209 op = gen_array_ops.strided_slice(
1210 input=input_,
1211 begin=begin,

~/opt/anaconda3/envs/DDSP/lib/python3.8/site-packages/tensorflow/python/ops/gen_array_ops.py in strided_slice(input, begin, end, strides, begin_mask, end_mask, ellipsis_mask, new_axis_mask, shrink_axis_mask, name)
10445 return _result
10446 except _core._NotOkStatusException as e:
> 10447 _ops.raise_from_not_ok_status(e, name)
10448 except _core._FallbackException:
10449 pass

~/opt/anaconda3/envs/DDSP/lib/python3.8/site-packages/tensorflow/python/framework/ops.py in raise_from_not_ok_status(e, name)
6860 message = e.message + (" name: " + name if name is not None else "")
6861 # pylint: disable=protected-access
-> 6862 six.raise_from(core._status_to_exception(e.code, message), None)
6863 # pylint: enable=protected-access
6864

~/opt/anaconda3/envs/DDSP/lib/python3.8/site-packages/six.py in raise_from(value, from_value)

InvalidArgumentError: Index out of range using input dim 2; input has only 2 dims [Op:StridedSlice] name: autoencoder_16/mfcc_time_distributed_rnn_encoder_12/strided_slice/

I checked the code, but I don't know how to solve this issue.
Could you give me some advice?

@jesseengel
Contributor

Sorry for the delayed response. As you've observed, the timbre transfer notebook is designed to work with solo_instrument.gin, which does not have an additional 'z' vector. It looks like you're trying to first encode z and then run the full call, and it's tripping up on running the encoder a second time, because the MFCC tensor has only 2 dims where 3 were expected. This might be due to the preprocessing running twice on the audio and adding extra dimensions, but I'm not entirely sure from this trace alone.
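
For readers hitting the same message, the "has only 2 dims" failure can be reproduced in isolation. The sketch below uses NumPy instead of TensorFlow, and the array shapes are made-up assumptions rather than values taken from this model. It only shows that the indexing pattern on the failing line of compute_z needs a 3-D [batch, time, n_mfcc] input and breaks on a 2-D one.

```python
import numpy as np


def add_and_drop_axis(mfccs):
    """Apply the same indexing pattern as the failing line in compute_z."""
    return mfccs[:, :, np.newaxis, :][:, :, 0, :]


# Assumed shapes, for illustration only: 1 batch, 250 frames, 30 coefficients.
batched = np.zeros((1, 250, 30))
unbatched = np.zeros((250, 30))  # Same data with one axis missing.

print(add_and_drop_axis(batched).shape)

try:
    add_and_drop_axis(unbatched)
except IndexError as err:
    # NumPy's counterpart of the TensorFlow "Index out of range" error.
    print("2-D input fails:", err)

# Restoring the missing leading axis makes the same call work again.
print(add_and_drop_axis(unbatched[np.newaxis, ...]).shape)
```

Printing the shape of the tensors in the features dict just before the model call is a cheap way to check whether an axis has gone missing.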

One thing you could try is to run model_z.decode(features) manually after you run the encoding manually, so that the encoder does not run twice on the same dict. Hope that helps.
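
A minimal sketch of that encode-once, then-decode pattern follows. ToyAutoencoder is a hypothetical stand-in written for this illustration, not DDSP code; only the method names encode and decode mirror the ones mentioned in this thread, and the arithmetic inside them is arbitrary. The counter shows that a full call on an already encoded dict runs the encoder a second time, while decode does not.

```python
import numpy as np


class ToyAutoencoder:
    """Hypothetical stand-in for a model with encode/decode; not DDSP code."""

    def __init__(self):
        self.encoder_runs = 0

    def encode(self, features):
        # Copy so the caller's dict is not mutated, then add a latent 'z'.
        features = dict(features)
        self.encoder_runs += 1
        features["z"] = features["audio"].mean(axis=-1, keepdims=True)
        return features

    def decode(self, features):
        # Uses the already computed 'z'; never touches the encoder.
        return features["z"] * np.ones_like(features["audio"])

    def __call__(self, features):
        # Full forward pass: always encodes first, then decodes.
        return self.decode(self.encode(features))


model = ToyAutoencoder()
audio_features = {"audio": np.ones((1, 8))}

encoded = model.encode(audio_features)  # Encoder runs once here.
audio_out = model.decode(encoded)       # No second encoder run.
assert model.encoder_runs == 1

model(encoded)                          # Full call encodes again.
assert model.encoder_runs == 2
```

With the real model, the equivalent would be to keep the dict returned by the encode step and pass it to decode, rather than passing it back through the full model call.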
