OSError: sndfile library not found #91

Open
danicuki opened this issue May 25, 2020 · 9 comments

@danicuki

I'm trying to run Jukebox in a Docker container:

$ docker run -i -t continuumio/miniconda /bin/bash

After installing, I get this error:

(jukebox) root@182b585df72d:/jukebox# python jukebox/sample.py --model=5b_lyrics --name=sample_5b --levels=3 --sample_length_in_seconds=20 \
> --total_sample_length_in_seconds=180 --sr=44100 --n_samples=6 --hop_fraction=0.5,0.5,0.125
Traceback (most recent call last):
  File "jukebox/sample.py", line 7, in <module>
    from jukebox.utils.audio_utils import save_wav, load_audio
  File "/jukebox/jukebox/utils/audio_utils.py", line 4, in <module>
    import soundfile
  File "/opt/conda/envs/jukebox/lib/python3.7/site-packages/soundfile.py", line 142, in <module>
    raise OSError('sndfile library not found')
OSError: sndfile library not found
@johndpope
Contributor

That's the wrong command. It loads the image built from this Dockerfile: https://hub.docker.com/r/continuumio/miniconda/dockerfile

Try starting with this Dockerfile, which is specific to jukebox:
https://github.com/btrude/jukebox-docker

This line
https://github.com/johndpope/jukebox-docker/blob/master/Dockerfile#L198
should import soundfile.

N.B.: check nvidia-smi on your host for your CUDA version. It should match the FROM statement in the Dockerfile, so you may need to change the tag (e.g. cuda:10.2 or cuda:10.0).
(Side note: NVIDIA also has cudagl Docker images, but they are not applicable here.)
https://hub.docker.com/r/nvidia/cuda/
FROM nvidia/cuda:10.1-devel-ubuntu18.04
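
To make the version check above concrete, here is a minimal Python sketch. It assumes the nvidia-smi header on your host contains a "CUDA Version: X.Y" field (newer drivers print this; older ones may not). The helper names are hypothetical, the sample text is an illustrative fragment rather than real output, and whether a tag for a given version actually exists on Docker Hub is something you would still need to confirm.

```python
import re
from typing import Optional


def cuda_version_from_smi(smi_output: str) -> Optional[str]:
    """Return the 'CUDA Version' value found in nvidia-smi text, or None."""
    match = re.search(r"CUDA Version:\s*(\d+\.\d+)", smi_output)
    return match.group(1) if match else None


def base_image_line(cuda_version: str) -> str:
    """Build a FROM line in the same pattern as the Dockerfile above.

    Assumption: a tag with this version exists on Docker Hub.
    """
    return f"FROM nvidia/cuda:{cuda_version}-devel-ubuntu18.04"


# Illustrative fragment only; paste the real header from your host instead.
sample_text = "CUDA Version: 10.2"
version = cuda_version_from_smi(sample_text)
if version is None:
    print("No CUDA version found; is the NVIDIA driver installed?")
else:
    print(base_image_line(version))
```

Note that the version nvidia-smi reports is the highest CUDA version the installed driver supports, so an image with an older CUDA version may also work.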

@danicuki
Author

Thanks for the help. Now I've got this error:

Found no NVIDIA driver on your system. Please check that you
have an NVIDIA GPU and installed a driver from
http://www.nvidia.com/Download/index.aspx
root@af4016dfb45c:/opt/jukebox# exit

How do I run my docker host with NVIDIA on a Mac?

@btrude

btrude commented May 26, 2020

Regarding "How do I run my docker host with NVIDIA on a Mac?":

You can't. It is currently only supported on Linux.

@danicuki
Author

danicuki commented May 26, 2020 via email

@perlman-izzy

I'm a noob, but I was able to get rid of that error by running conda install -c conda-forge libsndfile. I think that's supposed to be covered by one of the install steps, though, so it could be a red flag that you didn't install the libraries properly. That's what happened to me.
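
A quick way to check whether the library is visible before rerunning sample.py is sketched below. It assumes that soundfile locates libsndfile through a lookup similar to ctypes.util.find_library, which is an assumption about its internals and may not hold for every version or packaging of soundfile. The function name sndfile_status is hypothetical.

```python
from ctypes.util import find_library


def sndfile_status() -> str:
    """Describe whether a libsndfile shared library can be located."""
    name = find_library("sndfile")
    if name is None:
        # Install hint taken from this thread; adjust for your environment.
        return ("libsndfile not found: try "
                "conda install -c conda-forge libsndfile")
    return f"libsndfile found: {name}"


print(sndfile_status())
```

If this reports the library as found but import soundfile still fails, the lookup assumption above probably does not match your soundfile version.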

@perlman-izzy

I can get the program to run for like 2 minutes and then I get the error below. Anybody have any suggestions? Running on a vast.ai server.

py", line 581, in _load
deserialized_objects[key]._set_from_file(f, offset, f_should_read_directly)
RuntimeError: unexpected EOF, expected 20664312 more bytes. The file might be corrupted.
terminate called after throwing an instance of 'c10::Error'
what(): owning_ptr == NullType::singleton() || owning_ptr->refcount_.load() > 0 ASSERT FAILED at /opt/conda/conda-bld/pytorch_1556653114079/work/c10/util/intrusive_ptr.h:350, please report a bug to PyTorch. intrusive_ptr: Can only intrusive_ptr::reclaim() owning pointers that were created using intrusive_ptr::release(). (reclaim at /opt/conda/conda-bld/pytorch_1556653114079/work/c10/util/intrusive_ptr.h:350)
frame #0: c10::Error::Error(c10::SourceLocation, std::string const&) + 0x45 (0x7fc5a1c8ddc5 in /opt/conda/envs/jukebox/lib/python3.7/site-packages/torch/lib/libc10.so)
frame #1: THStorage_free + 0xca (0x7fc5a29d120a in /opt/conda/envs/jukebox/lib/python3.7/site-packages/torch/lib/libcaffe2.so)
frame #2: + 0x14872d (0x7fc5d0cb272d in /opt/conda/envs/jukebox/lib/python3.7/site-packages/torch/lib/libtorch_python.so)
frame #26: __libc_start_main + 0xf0 (0x7fc5df535830 in /lib/x86_64-linux-gnu/libc.so.6)
Aborted (core dumped)

@btrude

btrude commented May 31, 2020

Most likely one of the audio files you transferred to vast was corrupted or failed to transfer fully, or you started training before all the files in the directory had finished transferring. My vast servers have been egregiously slow this weekend, so I contacted support yesterday and they told me to spin up a bunch of servers, find the one that doesn't have extremely slow network speeds, and destroy all the others. (I have just built a PC specifically for ML at home, so my vast days are thankfully behind me now, and I'm definitely rethinking my recommendation given their slowness.)

I'll also note that if you are looking to train with your own music, it is mostly pointless to do anything other than finetune the 1b model with your own genre/artist tag replacing existing one(s). Even with a local GPU with 24 GB of VRAM I do not have enough memory to finetune or train from scratch at the depth of the 5b models, and training the small priors/vqvae results in significantly worse quality than just finetuning the 1b. I was able to get uncanny results finetuning with 1.5 hours of my own music on 1x Tesla M40 for only 8 hours (I am currently continuing that training, so I would expect better results with even more training, properly annealing the learning rate, etc.).
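
One way to test the incomplete-transfer theory is to compare size and checksum for each file on the source machine and on the server. The sketch below uses only the Python standard library. The function name file_fingerprint and the example path are hypothetical, and the same check applies to any transferred file, including model checkpoints.

```python
import hashlib
import os
from typing import Tuple


def file_fingerprint(path: str, chunk_size: int = 1 << 20) -> Tuple[int, str]:
    """Return (size_in_bytes, sha256_hex) for a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        # Read fixed-size chunks so large files do not need to fit in memory.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return os.path.getsize(path), digest.hexdigest()


# Usage (hypothetical path): run this on both machines and compare the output.
# print(file_fingerprint("/path/to/your/file"))
```

If the sizes or digests differ, the copy on the server is incomplete or corrupted and should be transferred again before starting training.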

@Jekyll233

Quoting the original post: trying to run in a Docker container (docker run -i -t continuumio/miniconda /bin/bash); after installing, I get the same error, OSError: sndfile library not found.

How was it solved?

@btrude

btrude commented Feb 16, 2023

apt-get update && apt-get install -y libsndfile1
