timbre_transfer error when loading a model trained by ae.gin #333

XinjianOUYANG · 2021-03-10T19:24:45Z

Hi,
I have a problem when I load a model trained with ae.gin into the timbre_transfer Colab notebook.

I guess the reason is that audio_features doesn't have the key 'z' (it only has 'f0_hz', 'f0_confidence', and 'loudness_db'). But I have no idea how to get 'z' from the original audio file. Could you give me some advice?
Thanks a lot!

XinjianOUYANG · 2021-03-12T22:44:01Z

The previous error has been solved, but I got another one when I loaded a model trained with ae.gin into the timbre_transfer Colab notebook.

InvalidArgumentError Traceback (most recent call last)
in
58
59 #z_feature = model_z.encode(audio_features)
---> 60 outputs_z = model_z(audio_features, training=False) # Run the forward pass, add losses, and create a dictionary of outputs.
61
62 # print(outputs.keys())

~/opt/anaconda3/envs/DDSP/lib/python3.8/site-packages/ddsp/training/models/model.py in call(self, return_losses, *args, **kwargs)
52 # Run model.
53 self._losses_dict = {}
---> 54 outputs = super().call(*args, **kwargs)
55
56 # Get total loss.

~/opt/anaconda3/envs/DDSP/lib/python3.8/site-packages/tensorflow/python/keras/engine/base_layer.py in call(self, *args, **kwargs)
1010 with autocast_variable.enable_auto_cast_variables(
1011 self._compute_dtype_object):
-> 1012 outputs = call_fn(inputs, *args, **kwargs)
1013
1014 if self._activity_regularizer:

~/opt/anaconda3/envs/DDSP/lib/python3.8/site-packages/ddsp/training/models/autoencoder.py in call(self, features, training)
56 def call(self, features, training=True):
57 """Run the core of the network, get predictions and loss."""
---> 58 features = self.encode(features, training=training)
59 features.update(self.decoder(features, training=training))
60

~/opt/anaconda3/envs/DDSP/lib/python3.8/site-packages/ddsp/training/models/autoencoder.py in encode(self, features, training)
42 features.update(self.preprocessor(features, training=training))
43 if self.encoder is not None:
---> 44 features.update(self.encoder(features))
45 return features
46

~/opt/anaconda3/envs/DDSP/lib/python3.8/site-packages/ddsp/training/nn.py in call(self, *inputs, **kwargs)
134
135 # Run input tensors through the model.
--> 136 outputs = super().call(*inputs, **kwargs)
137
138 # Return dict if call() returns it.

~/opt/anaconda3/envs/DDSP/lib/python3.8/site-packages/tensorflow/python/keras/engine/base_layer.py in call(self, *args, **kwargs)
1010 with autocast_variable.enable_auto_cast_variables(
1011 self._compute_dtype_object):
-> 1012 outputs = call_fn(inputs, *args, **kwargs)
1013
1014 if self._activity_regularizer:

~/opt/anaconda3/envs/DDSP/lib/python3.8/site-packages/ddsp/training/encoders.py in call(self, *args, **unused_kwargs)
45 time_steps = int(args[-1].shape[1])
46 inputs = args[:-1] # Last input just used for time_steps.
---> 47 z = self.compute_z(*inputs)
48 return self.expand_z(z, time_steps)
49

~/opt/anaconda3/envs/DDSP/lib/python3.8/site-packages/ddsp/training/encoders.py in compute_z(self, audio)
120
121 # Normalize.
--> 122 z = self.z_norm(mfccs[:, :, tf.newaxis, :])[:, :, 0, :]
123 # Run an RNN over the latents.
124 z = self.rnn(z)

~/opt/anaconda3/envs/DDSP/lib/python3.8/site-packages/tensorflow/python/util/dispatch.py in wrapper(*args, **kwargs)
199 """Call target, and fall back on dispatchers if there is a TypeError."""
200 try:
--> 201 return target(*args, **kwargs)
202 except (TypeError, ValueError):
203 # Note: convert_to_eager_tensor currently raises a ValueError, not a

~/opt/anaconda3/envs/DDSP/lib/python3.8/site-packages/tensorflow/python/ops/array_ops.py in _slice_helper(tensor, slice_spec, var)
1034 var_empty = constant([], dtype=dtypes.int32)
1035 packed_begin = packed_end = packed_strides = var_empty
-> 1036 return strided_slice(
1037 tensor,
1038 packed_begin,

~/opt/anaconda3/envs/DDSP/lib/python3.8/site-packages/tensorflow/python/util/dispatch.py in wrapper(*args, **kwargs)
199 """Call target, and fall back on dispatchers if there is a TypeError."""
200 try:
--> 201 return target(*args, **kwargs)
202 except (TypeError, ValueError):
203 # Note: convert_to_eager_tensor currently raises a ValueError, not a

~/opt/anaconda3/envs/DDSP/lib/python3.8/site-packages/tensorflow/python/ops/array_ops.py in strided_slice(input_, begin, end, strides, begin_mask, end_mask, ellipsis_mask, new_axis_mask, shrink_axis_mask, var, name)
1207 strides = ones_like(begin)
1208
-> 1209 op = gen_array_ops.strided_slice(
1210 input=input_,
1211 begin=begin,

~/opt/anaconda3/envs/DDSP/lib/python3.8/site-packages/tensorflow/python/ops/gen_array_ops.py in strided_slice(input, begin, end, strides, begin_mask, end_mask, ellipsis_mask, new_axis_mask, shrink_axis_mask, name)
10445 return _result
10446 except _core._NotOkStatusException as e:
> 10447 _ops.raise_from_not_ok_status(e, name)
10448 except _core._FallbackException:
10449 pass

~/opt/anaconda3/envs/DDSP/lib/python3.8/site-packages/tensorflow/python/framework/ops.py in raise_from_not_ok_status(e, name)
6860 message = e.message + (" name: " + name if name is not None else "")
6861 # pylint: disable=protected-access
-> 6862 six.raise_from(core._status_to_exception(e.code, message), None)
6863 # pylint: enable=protected-access
6864

~/opt/anaconda3/envs/DDSP/lib/python3.8/site-packages/six.py in raise_from(value, from_value)

InvalidArgumentError: Index out of range using input dim 2; input has only 2 dims [Op:StridedSlice] name: autoencoder_16/mfcc_time_distributed_rnn_encoder_12/strided_slice/

I checked the code, but I don't know how to solve this issue.
Could you give me some advice?

jesseengel · 2021-03-27T20:56:01Z

Sorry for the delayed response. As you've observed, the timbre transfer notebook is designed to work with solo_instrument.gin, which does not have an additional 'z' vector. It looks like you're trying to first encode z and then run the full call, and the encoder trips up when it runs a second time because the MFCC tensor has only 2 dims when 3 were expected. This might be due to the preprocessing running twice on the audio and adding extra dimensions, but I'm not entirely sure from this trace alone.

One thing you could try is to manually run model_z.decode(features) after you manually run the encoding, so that the encoder does not run twice on the same dict. Hope that helps.
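
The control flow suggested above can be sketched with a toy stand-in. This is not DDSP code: `ToyAutoencoder`, its methods, and the way `encode()` flattens the audio are all hypothetical, chosen only to mimic the in-place `features.update(...)` calls visible in the traceback. The trace does not establish what actually changes the shape in the real model. With a real model, the equivalent calls would presumably be `model_z.encode(...)` followed by `model_z.decode(...)`, as named in this thread.

```python
"""Toy illustration of 'encode once, then decode' (hypothetical, not DDSP)."""


class ToyAutoencoder:
    """Stand-in model whose encode() mutates the feature dict in place."""

    def encode(self, features):
        audio = features["audio"]
        # Assumption: the encoder expects audio shaped [batch, samples].
        if not (audio and isinstance(audio[0], list)):
            raise ValueError("expected audio shaped [batch, samples]")
        # Contrived in-place change: the dict entry is overwritten, so a
        # second encode() on the same dict sees a different shape.
        features["audio"] = [sample for row in audio for sample in row]
        features["z"] = [sum(row) / len(row) for row in audio]
        return features

    def decode(self, features):
        # Decoding only reads values that encode() already produced.
        return [z * 2.0 for z in features["z"]]

    def __call__(self, features):
        # The full forward pass always starts by calling encode().
        return self.decode(self.encode(features))


model = ToyAutoencoder()
features = {"audio": [[0.0, 0.5, 1.0]]}

encoded = model.encode(features)   # run the encoder exactly once
audio_out = model.decode(encoded)  # then decode directly, skipping __call__

# Calling the full model on the already-encoded dict repeats encode()
# and fails, which is the pattern to avoid.
try:
    model(encoded)
    second_pass_error = None
except ValueError as err:
    second_pass_error = str(err)
```

If the full forward pass is still needed for comparison, passing it a fresh copy of the original features (rather than the dict returned by `encode()`) avoids the repeated encoding in this toy setup.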
