how to update sample_rate of ddsp_run #355

Open
nindidooo opened this issue May 26, 2021 · 4 comments

Comments

@nindidooo

I am using the DDSP train_autoencoder colab notebook here:
https://colab.research.google.com/github/magenta/ddsp/blob/master/ddsp/colab/demos/train_autoencoder.ipynb

I want to train the autoencoder to use 48kHz audio, not 16kHz (default).

To do so, I have made the following changes to sections of the notebook (adding --sample_rate=48000):

!ddsp_prepare_tfrecord \
  --input_audio_filepatterns=$AUDIO_FILEPATTERN \
  --output_tfrecord_path=$TRAIN_TFRECORD \
  --num_shards=10 \
  --sample_rate=48000 \
  --alsologtostderr

Save dataset statistics for timbre transfer

import os

from ddsp.colab import colab_utils
import ddsp.training

data_provider = ddsp.training.data.TFRecordProvider(TRAIN_TFRECORD_FILEPATTERN,
                                                    sample_rate=48000)
dataset = data_provider.get_dataset(shuffle=False)
PICKLE_FILE_PATH = os.path.join(SAVE_DIR, 'dataset_statistics.pkl')


and then for training:

!ddsp_run \
  --mode=train \
  --alsologtostderr \
  --save_dir="$SAVE_DIR" \
  --gin_file=models/solo_instrument.gin \
  --gin_file=datasets/tfrecord.gin \
  --gin_param="TFRecordProvider.file_pattern='$TRAIN_TFRECORD_FILEPATTERN'" \
  --gin_param="batch_size=16" \
  --gin_param="sample_rate=48000" \
  --gin_param="train_util.train.num_steps=30000" \
  --gin_param="train_util.train.steps_per_save=300" \
  --gin_param="trainers.Trainer.checkpoints_to_keep=10"

The result is below. I cannot figure out why it doesn't work.
Can someone please tell me what I am doing wrong? Thanks very much in advance!

2021-05-26 01:50:25.950745: I tensorflow/stream_executor/platform/default/dso_loader.cc:53] Successfully opened dynamic library libcudart.so.11.0
WARNING:root:Argument whitelist is deprecated. Please use allowlist.
WARNING:tensorflow:From /usr/local/lib/python3.7/dist-packages/tensorflow_probability/python/internal/variadic_reduce.py:115: calling function (from tensorflow.python.eager.def_function) with experimental_compile is deprecated and will be removed in a future version.
Instructions for updating:
experimental_compile is deprecated, use jit_compile instead
WARNING:tensorflow:From /usr/local/lib/python3.7/dist-packages/tensorflow_probability/python/internal/variadic_reduce.py:115: calling function (from tensorflow.python.eager.def_function) with experimental_compile is deprecated and will be removed in a future version.
Instructions for updating:
experimental_compile is deprecated, use jit_compile instead
I0526 01:50:29.118762 139803541604224 ddsp_run.py:176] Restore Dir: /content/drive/MyDrive/drum_sample_solo/SHORT-ddsp-solo-instrument
I0526 01:50:29.119161 139803541604224 ddsp_run.py:177] Save Dir: /content/drive/MyDrive/drum_sample_solo/SHORT-ddsp-solo-instrument
I0526 01:50:29.121088 139803541604224 resource_reader.py:50] system_path_file_exists:optimization/base.gin
E0526 01:50:29.121450 139803541604224 resource_reader.py:55] Path not found: optimization/base.gin
I0526 01:50:29.125206 139803541604224 resource_reader.py:50] system_path_file_exists:eval/basic.gin
E0526 01:50:29.125523 139803541604224 resource_reader.py:55] Path not found: eval/basic.gin
I0526 01:50:29.127723 139803541604224 resource_reader.py:50] system_path_file_exists:models/solo_instrument.gin
E0526 01:50:29.127967 139803541604224 resource_reader.py:55] Path not found: models/solo_instrument.gin
I0526 01:50:29.128289 139803541604224 resource_reader.py:50] system_path_file_exists:models/ae.gin
E0526 01:50:29.128505 139803541604224 resource_reader.py:55] Path not found: models/ae.gin
I0526 01:50:29.135938 139803541604224 resource_reader.py:50] system_path_file_exists:datasets/tfrecord.gin
E0526 01:50:29.136190 139803541604224 resource_reader.py:55] Path not found: datasets/tfrecord.gin
I0526 01:50:29.136540 139803541604224 resource_reader.py:50] system_path_file_exists:datasets/base.gin
E0526 01:50:29.136765 139803541604224 resource_reader.py:55] Path not found: datasets/base.gin
I0526 01:50:29.169140 139803541604224 train_util.py:78] Defaulting to MirroredStrategy
2021-05-26 01:50:29.170495: I tensorflow/stream_executor/platform/default/dso_loader.cc:53] Successfully opened dynamic library libcuda.so.1
2021-05-26 01:50:29.179300: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:937] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2021-05-26 01:50:29.179911: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1733] Found device 0 with properties:
pciBusID: 0000:00:04.0 name: Tesla V100-SXM2-16GB computeCapability: 7.0
coreClock: 1.53GHz coreCount: 80 deviceMemorySize: 15.78GiB deviceMemoryBandwidth: 836.37GiB/s
2021-05-26 01:50:29.179946: I tensorflow/stream_executor/platform/default/dso_loader.cc:53] Successfully opened dynamic library libcudart.so.11.0
2021-05-26 01:50:29.182908: I tensorflow/stream_executor/platform/default/dso_loader.cc:53] Successfully opened dynamic library libcublas.so.11
2021-05-26 01:50:29.182982: I tensorflow/stream_executor/platform/default/dso_loader.cc:53] Successfully opened dynamic library libcublasLt.so.11
2021-05-26 01:50:29.184593: I tensorflow/stream_executor/platform/default/dso_loader.cc:53] Successfully opened dynamic library libcufft.so.10
2021-05-26 01:50:29.184967: I tensorflow/stream_executor/platform/default/dso_loader.cc:53] Successfully opened dynamic library libcurand.so.10
2021-05-26 01:50:29.186709: I tensorflow/stream_executor/platform/default/dso_loader.cc:53] Successfully opened dynamic library libcusolver.so.10
2021-05-26 01:50:29.187344: I tensorflow/stream_executor/platform/default/dso_loader.cc:53] Successfully opened dynamic library libcusparse.so.11
2021-05-26 01:50:29.187547: I tensorflow/stream_executor/platform/default/dso_loader.cc:53] Successfully opened dynamic library libcudnn.so.8
2021-05-26 01:50:29.187663: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:937] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2021-05-26 01:50:29.188345: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:937] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2021-05-26 01:50:29.188900: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1871] Adding visible gpu devices: 0
2021-05-26 01:50:29.189253: I tensorflow/core/platform/cpu_feature_guard.cc:142] This TensorFlow binary is optimized with oneAPI Deep Neural Network Library (oneDNN) to use the following CPU instructions in performance-critical operations: AVX512F
To enable them in other operations, rebuild TensorFlow with the appropriate compiler flags.
2021-05-26 01:50:29.189600: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:937] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2021-05-26 01:50:29.190186: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1733] Found device 0 with properties:
pciBusID: 0000:00:04.0 name: Tesla V100-SXM2-16GB computeCapability: 7.0
coreClock: 1.53GHz coreCount: 80 deviceMemorySize: 15.78GiB deviceMemoryBandwidth: 836.37GiB/s
2021-05-26 01:50:29.190261: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:937] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2021-05-26 01:50:29.190840: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:937] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2021-05-26 01:50:29.191472: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1871] Adding visible gpu devices: 0
2021-05-26 01:50:29.191521: I tensorflow/stream_executor/platform/default/dso_loader.cc:53] Successfully opened dynamic library libcudart.so.11.0
2021-05-26 01:50:29.701107: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1258] Device interconnect StreamExecutor with strength 1 edge matrix:
2021-05-26 01:50:29.701162: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1264] 0
2021-05-26 01:50:29.701171: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1277] 0: N
2021-05-26 01:50:29.701381: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:937] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2021-05-26 01:50:29.702050: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:937] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2021-05-26 01:50:29.702700: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:937] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2021-05-26 01:50:29.703268: W tensorflow/core/common_runtime/gpu/gpu_bfc_allocator.cc:39] Overriding allow_growth setting because the TF_FORCE_GPU_ALLOW_GROWTH environment variable is set. Original config value was 0.
2021-05-26 01:50:29.703314: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1418] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 13787 MB memory) -> physical GPU (device: 0, name: Tesla V100-SXM2-16GB, pci bus id: 0000:00:04.0, compute capability: 7.0)
WARNING:tensorflow:Collective ops is not configured at program startup. Some performance features may not be enabled.
W0526 01:50:29.705162 139803541604224 mirrored_strategy.py:379] Collective ops is not configured at program startup. Some performance features may not be enabled.
INFO:tensorflow:Using MirroredStrategy with devices ('/job:localhost/replica:0/task:0/device:GPU:0',)
I0526 01:50:29.708120 139803541604224 mirrored_strategy.py:369] Using MirroredStrategy with devices ('/job:localhost/replica:0/task:0/device:GPU:0',)
2021-05-26 01:50:30.023877: I tensorflow/compiler/mlir/mlir_graph_optimization_pass.cc:176] None of the MLIR Optimization Passes are enabled (registered 2)
2021-05-26 01:50:30.024463: I tensorflow/core/platform/profile_utils/cpu_utils.cc:114] CPU Frequency: 2000179999 Hz
2021-05-26 01:50:30.091103: W tensorflow/core/framework/op_kernel.cc:1767] OP_REQUIRES failed at example_parsing_ops.cc:94 : Invalid argument: Key: audio. Can't parse serialized Example.
2021-05-26 01:50:30.091209: W tensorflow/core/framework/op_kernel.cc:1767] OP_REQUIRES failed at example_parsing_ops.cc:94 : Invalid argument: Key: audio. Can't parse serialized Example.
2021-05-26 01:50:30.091287: W tensorflow/core/framework/op_kernel.cc:1767] OP_REQUIRES failed at example_parsing_ops.cc:94 : Invalid argument: Key: audio. Can't parse serialized Example.
2021-05-26 01:50:30.091416: W tensorflow/core/framework/op_kernel.cc:1767] OP_REQUIRES failed at example_parsing_ops.cc:94 : Invalid argument: Key: audio. Can't parse serialized Example.
2021-05-26 01:50:30.091468: W tensorflow/core/framework/op_kernel.cc:1767] OP_REQUIRES failed at example_parsing_ops.cc:94 : Invalid argument: Key: audio. Can't parse serialized Example.
2021-05-26 01:50:30.091860: W tensorflow/core/framework/op_kernel.cc:1767] OP_REQUIRES failed at example_parsing_ops.cc:94 : Invalid argument: Key: audio. Can't parse serialized Example.
2021-05-26 01:50:30.092173: W tensorflow/core/framework/op_kernel.cc:1767] OP_REQUIRES failed at example_parsing_ops.cc:94 : Invalid argument: Key: audio. Can't parse serialized Example.
2021-05-26 01:50:30.092210: W tensorflow/core/framework/op_kernel.cc:1767] OP_REQUIRES failed at example_parsing_ops.cc:94 : Invalid argument: Key: audio. Can't parse serialized Example.
Traceback (most recent call last):
File "/usr/local/bin/ddsp_run", line 8, in <module>
sys.exit(console_entry_point())
File "/usr/local/lib/python3.7/dist-packages/ddsp/training/ddsp_run.py", line 224, in console_entry_point
app.run(main)
File "/usr/local/lib/python3.7/dist-packages/absl/app.py", line 303, in run
_run_main(main, args)
File "/usr/local/lib/python3.7/dist-packages/absl/app.py", line 251, in _run_main
sys.exit(main(argv))
File "/usr/local/lib/python3.7/dist-packages/ddsp/training/ddsp_run.py", line 199, in main
report_loss_to_hypertune=FLAGS.hypertune)
File "/usr/local/lib/python3.7/dist-packages/gin/config.py", line 1069, in gin_wrapper
utils.augment_exception_message_and_reraise(e, err_str)
File "/usr/local/lib/python3.7/dist-packages/gin/utils.py", line 41, in augment_exception_message_and_reraise
raise proxy.with_traceback(exception.__traceback__) from None
File "/usr/local/lib/python3.7/dist-packages/gin/config.py", line 1046, in gin_wrapper
return fn(*new_args, **new_kwargs)
File "/usr/local/lib/python3.7/dist-packages/ddsp/training/train_util.py", line 198, in train
trainer.build(next(dataset_iter))
File "/usr/local/lib/python3.7/dist-packages/tensorflow/python/distribute/input_lib.py", line 686, in next
return self.get_next()
File "/usr/local/lib/python3.7/dist-packages/tensorflow/python/distribute/input_lib.py", line 717, in get_next
2021-05-26 01:50:30.093915: W tensorflow/core/framework/op_kernel.cc:1767] OP_REQUIRES failed at example_parsing_ops.cc:94 : Invalid argument: Key: audio. Can't parse serialized Example.
self._iterators[i].get_next_as_list_static_shapes(new_name))
File "/usr/local/lib/python3.7/dist-packages/tensorflow/python/distribute/input_lib.py", line 1927, in get_next_as_list_static_shapes
return self._format_data_list_with_options(self._iterator.get_next())
File "/usr/local/lib/python3.7/dist-packages/tensorflow/python/data/ops/multi_device_iterator_ops.py", line 585, in get_next
result.append(self._device_iterators[i].get_next())
File "/usr/local/lib/python3.7/dist-packages/tensorflow/python/data/ops/iterator_ops.py", line 814, in get_next
return self._next_internal()
File "/usr/local/lib/python3.7/dist-packages/tensorflow/python/data/ops/iterator_ops.py", line 747, in _next_internal
output_shapes=self._flat_output_shapes)
File "/usr/local/lib/python3.7/dist-packages/tensorflow/python/ops/gen_dataset_ops.py", line 2728, in iterator_get_next
_ops.raise_from_not_ok_status(e, name)
File "/usr/local/lib/python3.7/dist-packages/tensorflow/python/framework/ops.py", line 6897, in raise_from_not_ok_status
six.raise_from(core._status_to_exception(e.code, message), None)
File "<string>", line 3, in raise_from
tensorflow.python.framework.errors_impl.InvalidArgumentError: Key: audio. Can't parse serialized Example.
[[{{node ParseSingleExample/ParseExample/ParseExampleV2}}]]
[[MultiDeviceIteratorGetNextFromShard]]
[[RemoteCall]] [Op:IteratorGetNext]
In call to configurable 'train' (<function train at 0x7f25fe4f68c0>)

colab_utils.save_dataset_statistics(data_provider, PICKLE_FILE_PATH, batch_size=1)

@YuvoBowmar4244

Did you ever make any progress on this? I'd be interested in this as well.

@nglazyrin

I think I managed to update all the numbers for sample_rate = 44100. For 48000 it should be a bit easier.

The first step is to correct the sample rate and everything that depends on it:
  --gin_param="TFRecordProvider.sample_rate=44100" \
  --gin_param="Harmonic.sample_rate=44100" \
  --gin_param="FilteredNoise.n_samples=176400" \
  --gin_param="Harmonic.n_samples=176400" \
  --gin_param="Reverb.reverb_length=132300"

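These numbers assume 4-second files at 44100 Hz (4 × 44100 = 176400 samples) and a 3-second reverb (3 × 44100 = 132300 samples). For 48000 Hz you can probably stop here, since 48000 is divisible by the default frame rate of 250.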
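As a rough sketch of the arithmetic behind those overrides, the helper below scales them to another sample rate. The function names (`ddsp_gin_overrides`, `as_flags`) are hypothetical and not part of DDSP; only the gin parameter names are taken from the flags above, and whether this set is complete for a given model config is an assumption.

```python
def ddsp_gin_overrides(sample_rate, clip_seconds, reverb_seconds):
    """Scale the sample-rate-dependent overrides listed above.

    Hypothetical helper: it only reproduces the arithmetic
    (samples = seconds * sample_rate) for the parameters named in this thread.
    """
    n_samples = int(round(sample_rate * clip_seconds))
    reverb_length = int(round(sample_rate * reverb_seconds))
    return {
        "TFRecordProvider.sample_rate": sample_rate,
        "Harmonic.sample_rate": sample_rate,
        "FilteredNoise.n_samples": n_samples,
        "Harmonic.n_samples": n_samples,
        "Reverb.reverb_length": reverb_length,
    }


def as_flags(overrides):
    """Format the overrides as --gin_param flags for a ddsp_run call."""
    return " ".join(f'--gin_param="{key}={value}"' for key, value in overrides.items())


if __name__ == "__main__":
    # 4-second clips and a 3-second reverb at 48 kHz.
    print(as_flags(ddsp_gin_overrides(48000, 4, 3)))
```

For 44100 Hz with the same durations this reproduces the 176400 and 132300 values used above.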

But then I got this exception:
ValueError: For upsampling, the target the number of timesteps must be divisible by the number of input frames. (timesteps:176400, frames:1001, add_endpoint=True).

What it actually says is that it cannot upsample the loudness/f0 values (which have a default frame rate of 250, i.e. 1000 frames for 4 seconds) to 176400 samples, because 176400 is not divisible by 1000. So the frame rate needs to be a divisor of 44100, e.g. 210. That means re-creating your dataset with
ddsp_prepare_tfrecord --frame_rate=210
and then adding
--gin_param='F0LoudnessPreprocessor.time_steps=840' --gin_param="TFRecordProvider.frame_rate=210"
to the ddsp_run call (840 = 4 seconds × 210 frames per second).
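A minimal sketch of that divisibility check, for picking a frame rate before re-creating the dataset. The helper names are hypothetical and this is a simplified version of the condition in the error message, not DDSP's own code; ddsp_prepare_tfrecord may impose further constraints that are not modelled here.

```python
def upsampling_ok(sample_rate, frame_rate, clip_seconds):
    """True if the clip's sample count is a whole multiple of its frame count."""
    n_samples = sample_rate * clip_seconds
    n_frames = frame_rate * clip_seconds
    return n_samples % n_frames == 0


def candidate_frame_rates(sample_rate, low=100, high=500):
    """List integer frame rates in [low, high] that divide the sample rate."""
    return [rate for rate in range(low, high + 1) if sample_rate % rate == 0]


def time_steps(frame_rate, clip_seconds):
    """Number of f0/loudness frames per clip (the time_steps value above)."""
    return frame_rate * clip_seconds


if __name__ == "__main__":
    print(upsampling_ok(44100, 250, 4))   # default frame rate at 44.1 kHz
    print(upsampling_ok(44100, 210, 4))   # frame rate suggested above
    print(candidate_frame_rates(44100, 200, 300))
    print(time_steps(210, 4))
```

Listing the candidates this way also shows other divisors of 44100 close to the default of 250, which may be worth trying if 210 does not give good results.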

@PratikStar

Wow! Thanks a lot, @nglazyrin! I am going to try your method for my 44.1kHz audio.

@PratikStar

@nglazyrin I am not able to get good results with frame_rate=210 or 700 for my 44.1kHz audio. I have correctly updated the other parameters (time_steps, frame_rate in each component, etc.) both when creating the dataset and when training.

Were you actually able to resynthesize the 44.1kHz audio by following the commands in your comment only? Is there anything else that I might be missing?
I am training on a custom guitar dataset without success!! :(

cc: @jesseengel
