Segmentation Fault:11 when running NetPyNE simulation in parallel (MPI) #3339

Open
livia5810 opened this issue Feb 25, 2025 · 8 comments

@livia5810

Context

We tried to run a Python script implementing a NetPyNE model in parallel using MPI. The simulation itself ran without errors, but saving the output at the end failed.

Overview of the issue

The output of the simulation could not be saved and this error was shown:

Thread 1 "nrniv" received signal SIGSEGV, Segmentation fault.
warning: 1376   ../Objects/obmalloc.c: No such file or directory
0x00007fffecb78fc0 in get_state () at ../Objects/obmalloc.c:1376
...
#0  0x00007fffecb78fc0 in get_state () at ../Objects/obmalloc.c:1376
#1  _PyObject_Free (ctx=<optimized out>, p=0x555555a2ac90) at ../Objects/obmalloc.c:2421
#2  _PyObject_Free (ctx=<optimized out>, p=0x555555a2ac90) at ../Objects/obmalloc.c:2414
#3  0x00007ffff4546d3d in del_wcargv (argc=<optimized out>) at ./src/nrnpython/nrnpython.cpp:100
#4  del_wcargv (argc=<optimized out>) at ./src/nrnpython/nrnpython.cpp:97
#5  nrnpython_start (b=<optimized out>) at ./src/nrnpython/nrnpython.cpp:176
#6  0x00007ffff7bb6949 in ivocmain_session (argc=<optimized out>, argv=0x7fffffffd1b8, env=0x7fffffffd1e0, start_session=start_session@entry=1) at ./src/ivoc/ivocmain.cpp:855
#7  0x00007ffff7bb6eee in ivocmain (argc=<optimized out>, argv=<optimized out>, env=<optimized out>) at ./src/ivoc/ivocmain.cpp:408
#8  0x00005555555550da in main (argc=<optimized out>, argv=<optimized out>, env=0x7fffffffd1e0) at ./src/ivoc/nrnmain.cpp:53

The crash appears to originate in src/nrnpython/nrnpython.cpp. A workaround was to remove both the PyMem_Free(wcargv[i]); calls and the final PyMem_Free(wcargv); call in del_wcargv().
Could this be a race condition in which wcargv is modified from two places at once?

NEURON setup

  • Version: 8.2.2
  • Installation method: Debian
  • OS + Version: Debian 13
  • Compiler + Version: GCC 14.2
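
A minimal sketch for collecting these details, together with the Python version, when filing a report like this one. The helper name collect_environment is hypothetical, and the sketch assumes the neuron module exposes a __version__ attribute; if NEURON cannot be imported it simply says so.

```python
import platform
import sys


def collect_environment():
    """Return interpreter, OS and (if importable) NEURON version details."""
    info = {
        "python": platform.python_version(),
        "implementation": platform.python_implementation(),
        "os": platform.platform(),
        "executable": sys.executable,
    }
    try:
        import neuron  # assumption: NEURON's Python module is on the path
        info["neuron"] = getattr(neuron, "__version__", "unknown")
    except ImportError:
        info["neuron"] = "not importable"
    return info


if __name__ == "__main__":
    for key, value in collect_environment().items():
        print(f"{key}: {value}")
```

Running the script the same way as the simulation (through nrniv -python) reports the interpreter that NEURON actually uses, which may differ from the default python3 on the system.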

Minimal working example - MWE

mpiexec -n 4 nrniv -python -mpi tut1.py (example from NetPyNE tutorial: http://doc.netpyne.org/tutorial.html)

@livia5810 livia5810 added the bug label Feb 25, 2025
@nrnhines
Member

So far I'm not able to reproduce the issue. Starting with a netpyne folder with tut1.py and HHTut.py and using Python 3.12.3

python3.12 -m venv env
source env/bin/activate
pip install --upgrade pip
pip install netpyne
pip install neuron
which nrniv # /home/hines/models/netpyne/env/bin/nrniv
mpiexec -n 4 nrniv -python -mpi tut1.py

The last command ran and exited normally with the last few lines being

Plotting 2D representation of network cell locations and connections...
  Done; plotting time = 1.16 s

Total time = 2.16 s
(env) hines@hines-ThinkStation-P5:~/models/netpyne$ 

Note:

(env) hines@hines-ThinkStation-P5:~/models/netpyne$ nrniv
/home/hines/models/netpyne/env/bin/nrniv:10: DeprecationWarning: pkg_resources is deprecated as an API. See https://setuptools.pypa.io/en/latest/pkg_resources.html
  from pkg_resources import working_set
NEURON -- VERSION 8.2.6 HEAD (078a34a9d) 2024-07-24
Duke, Yale, and the BlueBrain Project -- Copyright 1984-2022
See http://neuron.yale.edu/neuron/credits

oc>

@ximion

ximion commented Feb 26, 2025

Thank you for your reply! I can also reproduce the issue (easy, as I am in the same group as the original reporter), and we are using Python 3.13 (currently 3.13.2 to be precise, using the Debian-packaged Neuron).

Does that help? Can you reproduce the issue with Python 3.13? Given where the crash occurs, the Python version may well have something to do with it (but I haven't found anything obvious yet).

@nrnhines
Member

What is the URL for the Debian-packaged Neuron you mentioned above?

With Python 3.13.1 and the current neuron-nightly (VERSION 9.0a-493-gb17d87243) I'm not seeing a problem.

pip install neuron-nightly

NEURON 8.2.6 definitely has a problem with Python 3.13. See #3316.
I successfully ran your model with the prospective 8.2.7 version (VERSION 8.2.6-10-gf0aec2603 hines/8.2-py13).

Since you mention you are using 8.2 with Python 3.13, I am guessing your Debian-packaged Neuron does not contain the commits 6d91299 and c624c94.
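
The version combinations reported so far can be summarised in a small check. This is only a heuristic sketch based on the reports in this thread (8.2.x up to 8.2.6 with Python 3.13 is problematic, the nightly 9.0a is not); the function names parse_release and likely_affected are hypothetical, and the check knows nothing about distribution backports.

```python
import re


def parse_release(version):
    """Extract (major, minor, patch) from strings such as '8.2.6-10-gf0aec2603'."""
    match = re.match(r"(\d+)\.(\d+)(?:\.(\d+))?", version)
    if match is None:
        raise ValueError(f"unrecognised version string: {version!r}")
    major, minor, patch = match.groups()
    return int(major), int(minor), int(patch or 0)


def likely_affected(neuron_version, python_version):
    """Heuristic from this thread only: NEURON <= 8.2.6 combined with Python >= 3.13."""
    too_old = parse_release(neuron_version) <= (8, 2, 6)
    new_python = tuple(python_version[:2]) >= (3, 13)
    return too_old and new_python


if __name__ == "__main__":
    import sys

    # '8.2.2' is the Debian package version mentioned in the report.
    print(likely_affected("8.2.2", sys.version_info))
```

Because only the leading release number is parsed, a development build such as 8.2.6-10-gf0aec2603 is treated like plain 8.2.6 even though it already carries the fixes, so a positive result is a hint to look closer rather than a verdict.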

@ximion

ximion commented Feb 27, 2025

You can find the package here: https://packages.debian.org/source/sid/neuron

Looking at https://salsa.debian.org/science-team/neuron/-/tree/debian/master/debian/patches?ref_type=heads, it looks like the package includes 6d91299 (and its dependencies), but not c624c94.
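
A rough way to repeat this kind of check on a local checkout of the packaging repository is to search the patch files for the upstream commit hashes. The sketch below assumes the patches still mention the upstream hash somewhere in their text (as patches exported with git format-patch do in their first line), which Debian patches are not required to do; the function name commits_mentioned and the example path are hypothetical.

```python
from pathlib import Path


def commits_mentioned(patch_dir, short_hashes):
    """Map each short hash to the patch files whose text mentions it."""
    found = {h: [] for h in short_hashes}
    for path in sorted(Path(patch_dir).iterdir()):
        if not path.is_file():
            continue
        text = path.read_text(encoding="utf-8", errors="replace")
        for h in short_hashes:
            if h in text:
                found[h].append(path.name)
    return found


if __name__ == "__main__":
    # Hypothetical location of a local clone of the Debian packaging repository.
    result = commits_mentioned("salsa/debian/patches", ["6d91299", "c624c94"])
    for commit, files in result.items():
        print(commit, "->", files or "not mentioned")
```

An empty result for a hash does not prove the change is absent, since a patch may have been reworked or had its header stripped; comparing the patched source against the upstream commit is the more reliable check.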

How strange! I can include the latter patch, but I'd be surprised if it changes anything. We can also try the nightly version and report back!
Without freeing wcargv (no del_wcargv call) we haven't found any other issues, and the simulation results are the same as before (but of course that's not a fix at all, just papering over whatever the real problem is...).

@nrnhines
Member

I thought I'd try building the package

git clone https://salsa.debian.org/science-team/neuron.git salsa
cd salsa
mkdir build
cd build
cmake .. -G Ninja -DCMAKE_INSTALL_PREFIX=install

but unfortunately the last cmake command generated

CMake Error at cmake/CheckGitDescribeCompatibility.cmake:47 (message):
  Failed to parse Git tag: ''
Call Stack (most recent call first):
  CMakeLists.txt:219 (include)

Does anyone know a simple workaround for that? git tag test did not do the trick.
Note:

$ git describe
debian/8.2.2-8

One thing that bothers me about this is that the current distribution tag is 8.2.6, and I think there are a lot of bug fixes between 8.2.2 and 8.2.6.

@ximion

ximion commented Feb 27, 2025

Are you familiar with building Debian packages and are you on Debian? One extremely quick way to get a package out would be doing these steps:

sudo apt install git-buildpackage debspawn
# create a container for Debian 13 to build in
debspawn create unstable
cd /path/to/package/git/checkout
# build the package
gbp buildpackage --git-builder='debspawn build unstable'

That will create the packages in /var/lib/debspawn/results/

The key thing to note is that the Git checkout won't have the patches applied, so if you build it like you did, you'll be missing some of the changes.

If you just want a quick-and-dirty build to inspect the generated binary artifacts, you can try this:

sudo apt install devscripts
# download the package and unpack it in the current directory
dget -ux http://deb.debian.org/debian/pool/main/n/neuron/neuron_8.2.2-8.dsc
cd neuron-8.2.2/
# run just the build step
make -f debian/rules build

I haven't tested that, but that should create a directory for you where cmake is configured and Neuron is built, with the exact changes present in the package itself :-)

Thank you for going through all this trouble!

I looked through the Neuron Git history, and it looks like patch f58692e completely fixes the issue from this bug - however, that's a major refactoring and sadly not in a stable Neuron release yet...

@nrnhines
Member

nrnhines commented Mar 3, 2025

I'm not familiar with building Debian packages. My OS is Ubuntu 24.04. Taking your second approach above, after sudo apt install ... the dget command generated

-rw-rw-r--  1 hines hines     2359 Feb 24 10:28 neuron_8.2.2-8.dsc
-rw-rw-r--  1 hines hines    33156 Feb 24 10:28 neuron_8.2.2-8.debian.tar.xz
drwxrwxr-x 18 hines hines     4096 Mar  3 08:34 neuron-8.2.2

but

~/neuron$ cd neuron-8.2.2
~/neuron/neuron-8.2.2$  make -f debian/rules build
dh build --buildsystem=cmake
make: dh: No such file or directory

However, I'm not so happy with this approach, as it loses the usefulness of working in a git repository.

I also tried the first method, but I lack the context to understand what is going on. The second command seemed to create some container, but I don't know what you meant by cd /path/to/package/git/checkout. I cloned the nrn repository into temp and ran:

$ cd temp
hines@hines-ThinkStation-P5:~/neuron/temp$ gbp buildpackage --git-builder='debspawn build unstable'
gbp:error: Can't determine package type: Failed to read changelog: [Errno 2] No such file or directory: './debian/changelog'
hines@hines-ThinkStation-P5:~/neuron/temp

I guess I'll continue developing a normal nrn release/8.2.7, which will allow pip install neuron; a Debian distribution can be created from that if needed.

@JCGoran
Collaborator

JCGoran commented Mar 5, 2025

Note that the developers of NEURON are not the same people as the maintainers of the NEURON Debian package, so for any problems with the Debian package itself, please report this to the downstream bug tracker first if your issue persists. Due to the fact that the maintainers of downstream packages often apply various patches to the original source distributions, we are only able to provide proper support if NEURON is installed in one of the following ways:

Also, as @nrnhines mentioned above, the 8.2 release of NEURON does not yet support Python 3.13.
