Segmentation fault (core dumped) #516

Open
zhulongcc opened this issue Jan 1, 2025 · 0 comments

I ran into this problem. (Environment: Ubuntu 20.04, CUDA 11.8, torch 2.4.0, 915b82d)

Here is the command I ran in the terminal:
python launch.py --config configs/fantasia3d.yaml --train --gpu 0 system.prompt_processor.prompt="hulk" system.geometry.shape_init=mesh:load/shapes/human.obj system.geometry.shape_init_params=0.9 system.geometry.shape_init_mesh_up=+y system.geometry.shape_init_mesh_front=+z

To print the fault message, I added these two lines of code to launch.py:

import faulthandler
faulthandler.enable()
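
By default, `faulthandler.enable()` writes its dump to `sys.stderr`. A minimal sketch of a variant that writes to a file instead, which may help when the terminal output is lost. The helper name `enable_fault_log` and the path `fault.log` are assumptions for illustration, not part of threestudio:

```python
import faulthandler


def enable_fault_log(path="fault.log"):
    """Send faulthandler dumps to `path` instead of stderr.

    The returned file object must stay open for as long as the
    handler is enabled, so keep a reference to it.
    """
    log_file = open(path, "a")
    # all_threads=True dumps every thread, matching the default behaviour.
    faulthandler.enable(file=log_file, all_threads=True)
    return log_file


# Hypothetical usage near the top of launch.py:
# fault_log = enable_fault_log("fault.log")
```

The output below comes from the plain two-line version above, not from this variant.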

It then printed the following message in the terminal:

Epoch 0: |                                     | 4540/? [07:57<00:00,  9.51it/s]Fatal Python error: Segmentation fault                                          

Thread 0x00007f57168aa700 (most recent call first):
  <no Python frame>

Thread 0x00007f5741fff700 (most recent call first):
  File "/home/jane/anaconda3/envs/3s/lib/python3.10/threading.py", line 324 in wait
  File "/home/jane/anaconda3/envs/3s/lib/python3.10/threading.py", line 607 in wait
  File "/home/jane/anaconda3/envs/3s/lib/python3.10/site-packages/tqdm/_monitor.py", line 60 in run
  File "/home/jane/anaconda3/envs/3s/lib/python3.10/threading.py", line 1016 in _bootstrap_inner
  File "/home/jane/anaconda3/envs/3s/lib/python3.10/threading.py", line 973 in _bootstrap

Thread 0x00007f58548cf700 (most recent call first):
  File "/home/jane/anaconda3/envs/3s/lib/python3.10/threading.py", line 324 in wait
  File "/home/jane/anaconda3/envs/3s/lib/python3.10/threading.py", line 607 in wait
  File "/home/jane/anaconda3/envs/3s/lib/python3.10/site-packages/tqdm/_monitor.py", line 60 in run
  File "/home/jane/anaconda3/envs/3s/lib/python3.10/threading.py", line 1016 in _bootstrap_inner
  File "/home/jane/anaconda3/envs/3s/lib/python3.10/threading.py", line 973 in _bootstrap

Thread 0x00007f5864935700 (most recent call first):
  File "/home/jane/anaconda3/envs/3s/lib/python3.10/threading.py", line 324 in wait
  File "/home/jane/anaconda3/envs/3s/lib/python3.10/queue.py", line 180 in get
  File "/home/jane/anaconda3/envs/3s/lib/python3.10/site-packages/tensorboard/summary/writer/event_file_writer.py", line 269 in _run
  File "/home/jane/anaconda3/envs/3s/lib/python3.10/site-packages/tensorboard/summary/writer/event_file_writer.py", line 244 in run
  File "/home/jane/anaconda3/envs/3s/lib/python3.10/threading.py", line 1016 in _bootstrap_inner
  File "/home/jane/anaconda3/envs/3s/lib/python3.10/threading.py", line 973 in _bootstrap

Current thread 0x00007f5ab7804280 (most recent call first):
  File "/home/jane/Desktop/threestudio/threestudio/utils/base.py", line 43 in do_update_step_end
  File "/home/jane/Desktop/threestudio/threestudio/systems/base.py", line 125 in on_train_batch_end
  File "/home/jane/anaconda3/envs/3s/lib/python3.10/site-packages/pytorch_lightning/trainer/call.py", line 167 in _call_lightning_module_hook
  File "/home/jane/anaconda3/envs/3s/lib/python3.10/site-packages/pytorch_lightning/loops/training_epoch_loop.py", line 270 in advance
  File "/home/jane/anaconda3/envs/3s/lib/python3.10/site-packages/pytorch_lightning/loops/training_epoch_loop.py", line 140 in run
  File "/home/jane/anaconda3/envs/3s/lib/python3.10/site-packages/pytorch_lightning/loops/fit_loop.py", line 363 in advance
  File "/home/jane/anaconda3/envs/3s/lib/python3.10/site-packages/pytorch_lightning/loops/fit_loop.py", line 205 in run
  File "/home/jane/anaconda3/envs/3s/lib/python3.10/site-packages/pytorch_lightning/trainer/trainer.py", line 1025 in _run_stage
  File "/home/jane/anaconda3/envs/3s/lib/python3.10/site-packages/pytorch_lightning/trainer/trainer.py", line 981 in _run
  File "/home/jane/anaconda3/envs/3s/lib/python3.10/site-packages/pytorch_lightning/trainer/trainer.py", line 574 in _fit_impl
  File "/home/jane/anaconda3/envs/3s/lib/python3.10/site-packages/pytorch_lightning/trainer/call.py", line 47 in _call_and_handle_interrupt
  File "/home/jane/anaconda3/envs/3s/lib/python3.10/site-packages/pytorch_lightning/trainer/trainer.py", line 538 in fit
  File "/home/jane/Desktop/threestudio/launch.py", line 250 in main
  File "/home/jane/Desktop/threestudio/launch.py", line 307 in <module>

Extension modules: numpy.core._multiarray_umath, numpy.core._multiarray_tests, numpy.linalg._umath_linalg, numpy.fft._pocketfft_internal, numpy.random._common, numpy.random.bit_generator, numpy.random._bounded_integers, numpy.random._mt19937, numpy.random.mtrand, numpy.random._philox, numpy.random._pcg64, numpy.random._sfc64, numpy.random._generator, torch._C, torch._C._fft, torch._C._linalg, torch._C._nested, torch._C._nn, torch._C._sparse, torch._C._special, gmpy2.gmpy2, scipy._lib._ccallback_c, scipy.signal._sigtools, scipy.linalg._fblas, scipy.linalg._flapack, scipy.linalg.cython_lapack, scipy.linalg._cythonized_array_utils, scipy.linalg._solve_toeplitz, scipy.linalg._decomp_lu_cython, scipy.linalg._matfuncs_sqrtm_triu, scipy.linalg.cython_blas, scipy.linalg._matfuncs_expm, scipy.linalg._decomp_update, scipy.sparse._sparsetools, _csparsetools, scipy.sparse._csparsetools, scipy.sparse.linalg._dsolve._superlu, scipy.sparse.linalg._eigen.arpack._arpack, scipy.sparse.linalg._propack._spropack, scipy.sparse.linalg._propack._dpropack, scipy.sparse.linalg._propack._cpropack, scipy.sparse.linalg._propack._zpropack, scipy.sparse.csgraph._tools, scipy.sparse.csgraph._shortest_path, scipy.sparse.csgraph._traversal, scipy.sparse.csgraph._min_spanning_tree, scipy.sparse.csgraph._flow, scipy.sparse.csgraph._matching, scipy.sparse.csgraph._reordering, scipy.special._ufuncs_cxx, scipy.special._ufuncs, scipy.special._specfun, scipy.special._comb, scipy.special._ellip_harm_2, scipy._lib._uarray._uarray, scipy.signal._max_len_seq_inner, scipy.signal._upfirdn_apply, scipy.signal._spline, scipy.spatial._ckdtree, scipy._lib.messagestream, scipy.spatial._qhull, scipy.spatial._voronoi, scipy.spatial._distance_wrap, scipy.spatial._hausdorff, scipy.spatial.transform._rotation, scipy.interpolate._fitpack, scipy.interpolate._dfitpack, scipy.optimize._group_columns, scipy.optimize._trlib._trlib, scipy.optimize._lbfgsb, _moduleTNC, scipy.optimize._moduleTNC, scipy.optimize._cobyla, 
scipy.optimize._slsqp, scipy.optimize._minpack, scipy.optimize._lsq.givens_elimination, scipy.optimize._zeros, scipy.optimize._highs.cython.src._highs_wrapper, scipy.optimize._highs._highs_wrapper, scipy.optimize._highs.cython.src._highs_constants, scipy.optimize._highs._highs_constants, scipy.linalg._interpolative, scipy.optimize._bglu_dense, scipy.optimize._lsap, scipy.optimize._direct, scipy.interpolate._bspl, scipy.interpolate._ppoly, scipy.interpolate.interpnd, scipy.interpolate._rbfinterp_pythran, scipy.interpolate._rgi_cython, scipy.ndimage._nd_image, _ni_label, scipy.ndimage._ni_label, scipy.signal._sosfilt, scipy.signal._spectral, scipy.integrate._odepack, scipy.integrate._quadpack, scipy.integrate._vode, scipy.integrate._dop, scipy.integrate._lsoda, scipy.special.cython_special, scipy.stats._stats, scipy.stats._biasedurn, scipy.stats._levy_stable.levyst, scipy.stats._stats_pythran, scipy.stats._ansari_swilk_statistics, scipy.stats._sobol, scipy.stats._qmc_cy, scipy.stats._mvn, scipy.stats._rcont.rcont, scipy.stats._unuran.unuran_wrapper, scipy.signal._peak_finding_utils, PIL._imaging, kiwisolver._cext, regex._regex, _brotli, yaml._yaml, sentencepiece._sentencepiece, PIL._imagingft, skimage.morphology._misc_cy, skimage.measure._ccomp, _skeletonize_lee_cy, skimage.morphology._skeletonize_lee_cy, skimage.morphology._skeletonize_various_cy, skimage._shared.geometry, skimage.measure._pnpoly, skimage.morphology._convex_hull, skimage.morphology._grayreconstruct, skimage.morphology._extrema_cy, skimage.morphology._flood_fill_cy, skimage.morphology._max_tree, google._upb._message, psutil._psutil_linux, psutil._psutil_posix, lxml._elementpath, lxml.etree, xxhash._xxhash, embreex.rtcore, embreex.rtcore_scene, embreex.mesh_construction, shapely.lib, shapely._geos, shapely._geometry_helpers, mcubes._mcubes, markupsafe._speedups, sklearn.__check_build._check_build, pandas._libs.tslibs.ccalendar, pandas._libs.tslibs.np_datetime, pandas._libs.tslibs.dtypes, 
pandas._libs.tslibs.base, pandas._libs.tslibs.nattype, pandas._libs.tslibs.timezones, pandas._libs.tslibs.fields, pandas._libs.tslibs.timedeltas, pandas._libs.tslibs.tzconversion, pandas._libs.tslibs.timestamps, pandas._libs.properties, pandas._libs.tslibs.offsets, pandas._libs.tslibs.strptime, pandas._libs.tslibs.parsing, pandas._libs.tslibs.conversion, pandas._libs.tslibs.period, pandas._libs.tslibs.vectorized, pandas._libs.ops_dispatch, pandas._libs.missing, pandas._libs.hashtable, pandas._libs.algos, pandas._libs.interval, pandas._libs.lib, pandas._libs.ops, pandas._libs.hashing, pandas._libs.arrays, pandas._libs.tslib, pandas._libs.sparse, pandas._libs.internals, pandas._libs.indexing, pandas._libs.index, pandas._libs.writers, pandas._libs.join, pandas._libs.window.aggregations, pandas._libs.window.indexers, pandas._libs.reshape, pandas._libs.groupby, pandas._libs.json, pandas._libs.parsers, pandas._libs.testing, sklearn.utils._isfinite, sklearn.utils.sparsefuncs_fast, sklearn.utils.murmurhash, sklearn.utils._openmp_helpers, sklearn.metrics.cluster._expected_mutual_info_fast, sklearn.preprocessing._csr_polynomial_expansion, sklearn.preprocessing._target_encoder_fast, sklearn.metrics._dist_metrics, sklearn.metrics._pairwise_distances_reduction._datasets_pair, sklearn.utils._cython_blas, sklearn.metrics._pairwise_distances_reduction._base, sklearn.metrics._pairwise_distances_reduction._middle_term_computer, sklearn.utils._heap, sklearn.utils._sorting, sklearn.metrics._pairwise_distances_reduction._argkmin, sklearn.metrics._pairwise_distances_reduction._argkmin_classmode, sklearn.utils._vector_sentinel, sklearn.metrics._pairwise_distances_reduction._radius_neighbors, sklearn.metrics._pairwise_distances_reduction._radius_neighbors_classmode, sklearn.metrics._pairwise_fast, sklearn.neighbors._partition_nodes, sklearn.neighbors._ball_tree, sklearn.neighbors._kd_tree, sklearn.utils.arrayfuncs, sklearn.utils._random, sklearn.utils._seq_dataset, 
sklearn.linear_model._cd_fast, _loss, sklearn._loss._loss, sklearn.svm._liblinear, sklearn.svm._libsvm, sklearn.svm._libsvm_sparse, sklearn.linear_model._sag_fast, sklearn.utils._weight_vector, sklearn.linear_model._sgd_fast, sklearn.decomposition._online_lda_fast, sklearn.decomposition._cdnmf_fast, numba.core.typeconv._typeconv, numba._helperlib, numba._dynfunc, numba._dispatcher, numba.core.runtime._nrt_python, numba.np.ufunc._internal, numba.experimental.jitclass._box (total: 230)
Segmentation fault (core dumped)

So the fault seems to occur at File "/home/jane/Desktop/threestudio/threestudio/utils/base.py", line 43 in do_update_step_end. That line is:
module = getattr(self, attr)
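
As far as I understand, faulthandler only shows the Python line that was executing when the signal arrived; the crash itself happens in native code, which may or may not be related to that line. One reason a plain `getattr` can reach other code is that the attribute may be a property. A minimal sketch of that behaviour follows; the class `Holder` and the attribute `geometry` are hypothetical and are not taken from threestudio:

```python
class Holder:
    """Hypothetical stand-in for an object whose attributes are scanned."""

    def __init__(self):
        self.calls = []

    @property
    def geometry(self):
        # This body runs on every attribute access. In a real project
        # it could call into a native (C/CUDA) extension.
        self.calls.append("geometry")
        return "geometry-object"


holder = Holder()
for attr in ("geometry",):
    module = getattr(holder, attr)  # same pattern as the reported line
```

So the reported line may only be where the crash surfaced, not necessarily its cause.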

Sometimes the training process terminates unexpectedly and the whole system crashes, so I have to force a shutdown. Since there is no output message in that case, I cannot figure out what happened.
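
For the case with no output at all, one thing that might help is a heartbeat file that is forced to disk at every step, so the last completed step is still known after a forced shutdown. A minimal sketch; `write_heartbeat` and the log path are hypothetical helpers, not part of threestudio or PyTorch Lightning:

```python
import os
import time


def write_heartbeat(path, step):
    """Append one timestamped line and force it to disk."""
    with open(path, "a") as f:
        f.write(f"{time.time():.3f} step={step}\n")
        f.flush()               # push Python's buffer to the OS
        os.fsync(f.fileno())    # ask the OS to write it to the device
```

Calling this once per training step (for example from a batch-end hook) costs one small synced write per step, so it is probably only worth enabling while debugging.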

I have no idea how to solve this problem.

I hope someone can help me, please... o(╥﹏╥)o
