[mvapich-discuss] mvapich2 crash

SpiglerG spiglerg at gmail.com
Tue Nov 11 12:51:26 EST 2008


Hi.
Since I started using the BASS cluster, I have been porting my
applications to the new system (mainly library compatibility), and I
ran into some odd problems with the installed MPI library.
After some work, I managed to reduce the problem to a simple case.
It seems that an MPI program that uses the pthread library and has
threads calling malloc/free crashes with
`munmap_chunk(): invalid pointer` errors.
This might depend on some memory locking or strict memory handling
done by the MPI system; could someone help me solve it?
I'm attaching the stripped-down source code I'm using, along with an
example of a crash.
I'm compiling with `mpicc -o pt pt.c -lpthread -fPIC` (-fPIC just to
get some more debug information), and running with
`mpirun -np 1 -machinefile machines $(pwd)/pt` (where machines contains
a single line with an allocated machine, e.g. one obtained with
`qlogin -l gpus=4`, as I'm working on GPU nodes for my apps).
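
The attached pt.c is scrubbed from the archive, so the following is only
a minimal sketch of the kind of program described above, not the
original source. The names `worker`, `run_workers`, `USE_MPI` and
`PT_MAIN`, the thread count of 5 and the 1 MiB buffer size are all
assumptions made for illustration. The buffer size is picked to be above
glibc's default mmap threshold (128 KiB), so that free() goes through
the munmap path named in the error message. The MPI calls are guarded so
the same file can be built with and without MPI to compare behaviour.

```c
/* pt_sketch.c: hypothetical stand-in for the scrubbed pt.c attachment. */
#include <assert.h>
#include <pthread.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#ifdef USE_MPI          /* assumed macro: define it when building with mpicc */
#include <mpi.h>
#endif

#define MAX_THREADS 64

struct worker_arg {
    int    id;     /* thread index, one copy per thread */
    size_t bytes;  /* size of the buffer this thread allocates */
    int    ok;     /* set to 1 once malloc/free has completed */
};

static void *worker(void *p)
{
    struct worker_arg *a = p;
    char *buf = malloc(a->bytes);

    if (buf != NULL) {
        memset(buf, 0, a->bytes);  /* touch the memory so it is really used */
        free(buf);                 /* the reported abort was raised in free() */
        a->ok = 1;
        printf("thread %d: malloc/free of %lu bytes done\n",
               a->id, (unsigned long)a->bytes);
    }
    return NULL;
}

/* Starts nthreads workers, joins them, and returns how many of them
 * completed their malloc/free pair. */
int run_workers(int nthreads, size_t bytes)
{
    pthread_t tid[MAX_THREADS];
    struct worker_arg arg[MAX_THREADS];
    int i, started = 0, done = 0;

    assert(nthreads > 0 && nthreads <= MAX_THREADS);
    for (i = 0; i < nthreads; i++) {
        arg[i].id = i;
        arg[i].bytes = bytes;
        arg[i].ok = 0;
        if (pthread_create(&tid[i], NULL, worker, &arg[i]) != 0)
            break;
        started++;
    }
    for (i = 0; i < started; i++) {
        pthread_join(tid[i], NULL);
        done += arg[i].ok;
    }
    return done;
}

#ifdef PT_MAIN          /* assumed macro: define it to get a standalone program */
int main(int argc, char **argv)
{
    int done;
#ifdef USE_MPI
    MPI_Init(&argc, &argv);   /* the worker threads make no MPI calls */
#else
    (void)argc;
    (void)argv;
#endif
    done = run_workers(5, (size_t)1 << 20);
    printf("%d of 5 threads completed malloc/free\n", done);
#ifdef USE_MPI
    MPI_Finalize();
#endif
    return done == 5 ? 0 : 1;
}
#endif
```

Building once with `cc -DPT_MAIN -o pt_sketch pt_sketch.c -lpthread` and
once with `mpicc -DPT_MAIN -DUSE_MPI -o pt_sketch pt_sketch.c -lpthread`
should show whether the abort appears only when the MPI library is
linked in.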

Hope someone can help me.
              Giacomo Spigler
-------------- next part --------------
In main: creating thread 0
In main: creating thread 1
In main: creating thread 2
In main: creating thread 3
In main: creating thread 4
Hello, World! It's me, thread #5!
Hello, World! It's me, thread #5!
Hello, World! It's me, thread #5!
Hello, World! It's me, thread #5!
Hello, World! It's me, thread #5!
*** glibc detected *** /home/spiglerg/bugtest/pt: munmap_chunk(): invalid pointer: 0x00002aaab3e80020 ***
======= Backtrace: =========
/lib64/libc.so.6(cfree+0x1b6)[0x3376e74d86]
/home/spiglerg/bugtest/pt(PrintHello+0x6b)[0x400911]
/lib64/libpthread.so.0[0x3377a062f7]
/lib64/libc.so.6(clone+0x6d)[0x3376ed1b6d]
======= Memory map: ========
00400000-00401000 r-xp 00000000 00:16 151175235                          /home/spiglerg/bugtest/pt
00600000-00601000 rw-p 00000000 00:16 151175235                          /home/spiglerg/bugtest/pt
19b8a000-19bc4000 rw-p 19b8a000 00:00 0 
19bc4000-19bc5000 rw-p 19bc4000 00:00 0 
19bc5000-19bf6000 rw-p 19bc5000 00:00 0 
41417000-41418000 ---p 41417000 00:00 0 
41418000-49418000 rw-p 41418000 00:00 0 
49418000-49419000 ---p 49418000 00:00 0 
49419000-51419000 rw-p 49419000 00:00 0 
51419000-5141a000 ---p 51419000 00:00 0 
5141a000-5941a000 rw-p 5141a000 00:00 0 
5941a000-5941b000 ---p 5941a000 00:00 0 
5941b000-6141b000 rw-p 5941b000 00:00 0 
6141b000-6141c000 ---p 6141b000 00:00 0 
6141c000-6941c000 rw-p 6141c000 00:00 0 
6941c000-6941d000 ---p 6941c000 00:00 0 
6941d000-7141d000 rw-p 6941d000 00:00 0 
7141d000-7141e000 ---p 7141d000 00:00 0 
7141e000-7941e000 rw-p 7141e000 00:00 0 
3375600000-337561a000 r-xp 00000000 fd:00 965129                         /lib64/ld-2.5.so
337581a000-337581b000 r--p 0001a000 fd:00 965129                         /lib64/ld-2.5.so
337581b000-337581c000 rw-p 0001b000 fd:00 965129                         /lib64/ld-2.5.so
3375a00000-3375a27000 r-xp 00000000 fd:00 965160                         /lib64/libibt.so.0.0
3375a27000-3375c27000 ---p 00027000 fd:00 965160                         /lib64/libibt.so.0.0
3375c27000-3375c29000 rw-p 00027000 fd:00 965160                         /lib64/libibt.so.0.0
3375e00000-3375e20000 r-xp 00000000 fd:00 964873                         /lib64/libpublic.so.0.0
3375e20000-3376020000 ---p 00020000 fd:00 964873                         /lib64/libpublic.so.0.0
3376020000-3376021000 rw-p 00020000 fd:00 964873                         /lib64/libpublic.so.0.0
3376200000-3376201000 r-xp 00000000 fd:00 965087                         /lib64/libmosal.so.0.0
3376201000-3376400000 ---p 00001000 fd:00 965087                         /lib64/libmosal.so.0.0
3376400000-3376401000 rw-p 00000000 fd:00 965087                         /lib64/libmosal.so.0.0
3376600000-337660f000 r-xp 00000000 fd:00 964860                         /lib64/libvapi.so.0.0
337660f000-337680e000 ---p 0000f000 fd:00 964860                         /lib64/libvapi.so.0.0
337680e000-337680f000 rw-p 0000e000 fd:00 964860                         /lib64/libvapi.so.0.0
3376a00000-3376a02000 r-xp 00000000 fd:00 965091                         /lib64/libmtl_common.so.0.0
3376a02000-3376c01000 ---p 00002000 fd:00 965091                         /lib64/libmtl_common.so.0.0
3376c01000-3376c02000 rw-p 00001000 fd:00 965091                         /lib64/libmtl_common.so.0.0
3376e00000-3376f4a000 r-xp 00000000 fd:00 965132                         /lib64/libc-2.5.so
3376f4a000-3377149000 ---p 0014a000 fd:00 965132                         /lib64/libc-2.5.so
3377149000-337714d000 r--p 00149000 fd:00 965132                         /lib64/libc-2.5.so
337714d000-337714e000 rw-p 0014d000 fd:00 965132                         /lib64/libc-2.5.so
337714e000-3377153000 rw-p 337714e000 00:00 0 
3377200000-3377202000 r-xp 00000000 fd:00 965004                         /lib64/libdl-2.5.so
3377202000-3377402000 ---p 00002000 fd:00 965004                         /lib64/libdl-2.5.so
3377402000-3377403000 r--p 00002000 fd:00 965004                         /lib64/libdl-2.5.so
3377403000-3377404000 rw-p 00003000 fd:00 965004                         /lib64/libdl-2.5.so
3377600000-3377682000 r-xp 00000000 fd:00 965138                         /lib64/libm-2.5.so
3377682000-3377881000 ---p 00082000 fd:00 965138                         /lib64/libm-2.5.so
3377881000-3377882000 r--p 00081000 fd:00 965138                         /lib64/libm-2.5.so
3377882000-3377883000 rw-p 00082000 fd:00 965138                         /lib64/libm-2.5.so
3377a00000-3377a15000 r-xp 00000000 fd:00 965134                         /lib64/libpthread-2.5.so
3377a15000-3377c14000 ---p 00015000 fd:00 965134                         /lib64/libpthread-2.5.so
3377c14000-3377c15000 r--p 00014000 fd:00 965134                         /lib64/libpthread-2.5.so
3377c15000-3377c16000 rw-p 00015000 fd:00 965134                         /lib64/libpthread-2.5.so
3377c16000-3377c1a000 rw-p 3377c16000 00:00 0 
3377e00000-3377e01000 r-xp 00000000 fd:00 965111                         /lib64/libmpga.so.0.0
3377e01000-3378000000 ---p 00001000 fd:00 965111                         /lib64/libmpga.so.0.0
3378000000-3378001000 rw-p 00000000 fd:00 965111                         /lib64/libmpga.so.0.0
3378e00000-3378e0d000 r-xp 00000000 fd:00 965068                         /lib64/libgcc_s-4.1.2-20080102.so.1
3378e0d000-337900d000 ---p 0000d000 fd:00 965068                         /lib64/libgcc_s-4.1.2-20080102.so.1
337900d000-337900e000 rw-p 0000d000 fd:00 965068                         /lib64/libgcc_s-4.1.2-20080102.so.1
3379200000-33792e6000 r-xp 00000000 fd:02 461360                         /usr/lib64/libstdc++.so.6.0.8
33792e6000-33794e5000 ---p 000e6000 fd:02 461360                         /usr/lib64/libstdc++.so.6.0.8
33794e5000-33794eb000 r--p 000e5000 fd:02 461360                         /usr/lib64/libstdc++.so.6.0.8
33794eb000-33794ee000 rw-p 000eb000 fd:02 461360                         /usr/lib64/libstdc++.so.6.0.8
33794ee000-3379500000 rw-p 33794ee000 00:00 0 
2aaaaaaab000-2aaaaaab4000 r-xp 00000000 fd:00 965103                     /lib64/libmt25218vpd.so.0.0
2aaaaaab4000-2aaaaacb3000 ---p 00009000 fd:00 965103                     /lib64/libmt25218vpd.so.0.0
2aaaaacb3000-2aaaaacb4000 rw-p 00008000 fd:00 965103                     /lib64/libmt25218vpd.so.0.0
2aaaaacb4000-2aaaaacb5000 rw-s 00000000 00:10 7445                       /dev/SysIbt
2aaaaacb5000-2aaaaacb6000 rw-s 00000000 00:10 7445                       /dev/SysIbt
2aaaaacb6000-2aaaaacb7000 rw-s ffffc20000a5a000 00:10 7445               /dev/SysIbt
2aaaaacb7000-2aaaaacb8000 rw-s ffff81024bced000 00:10 7445               /dev/SysIbt
2aaaaacb8000-2aaaaacb9000 rw-s ffff8102459da000 00:10 7445               /dev/SysIbt
2aaaaacb9000-2aaaaacba000 rw-p 2aaaaacb9000 00:00 0 
2aaaaacba000-2aaaaaeba000 rw-p 2aaaaacba000 00:00 0 
2aaaaaeba000-2aaaaaebb000 rw-p 2aaaaaeba000 00:00 0 
2aaaaaede000-2aaaaaee8000 r-xp 00000000 fd:00 964828                     /lib64/libnss_files-2.5.so
2aaaaaee8000-2aaaab0e7000 ---p 0000a000 fd:00 964828                     /lib64/libnss_files-2.5.so
2aaaab0e7000-2aaaab0e8000 r--p 00009000 fd:00 964828                     /lib64/libnss_files-2.5.so
2aaaab0e8000-2aaaab0e9000 rw-p 0000a000 fd:00 964828                     /lib64/libnss_files-2.5.so
2aaaab0e9000-2aaaabaae000 rw-p 2aaaab0e9000 00:00 0 
2aaaabaae000-2aaaabaaf000 ---p 2aaaabaae000 00:00 0 
2aaaabaaf000-2aaab3aaf000 rwxp 2aaaabaaf000 00:00 0 
2aaab3e80000-2aaab4622000 rw-p 2aaab3e80000 00:00 0 
2aaab8000000-2aaab83f2000 rw-p 2aaab8000000 00:00 0 
2aaab83f2000-2aaabc000000 ---p 2aaab83f2000 00:00 0 
2aaabc000000-2aaabc021000 rw-p 2aaabc000000 00:00 0 
2aaabc021000-2aaac0000000 ---p 2aaabc021000 00:00 0 
2b222e787000-2b222e789000 rw-p 2b222e787000 00:00 0 
2b222e789000-2b222e78a000 rw-s 00000000 00:10 7445                       /dev/SysIbt
2b222e7ab000-2b222e7ac000 rw-p 2b222e7ab000 00:00 0 
2b222e7ac000-2b222e832000 r-xp 00000000 fd:00 1029187                    /opt/iba/lib64/shared/libmpich.so.1.0
2b222e832000-2b222ea31000 ---p 00086000 fd:00 1029187                    /opt/iba/lib64/shared/libmpich.so.1.0
2b222ea31000-2b222ea36000 rw-p 00085000 fd:00 1029187                    /opt/iba/lib64/shared/libmpich.so.1.0
2b222ea36000-2b222ea87000 rw-p 2b222ea36000 00:00 0 
2b222ea87000-2b222ea9a000 r-xp 00000000 fd:00 965128                     /lib64/libmpicm.so.1.0
2b222ea9a000-2b222ec99000 ---p 00013000 fd:00 965128                     /lib64/libmpicm.so.1.0
2b222ec99000-2b222ec9b000 rw-p 00012000 fd:00 965128                     /lib64/libmpicm.so.1.0
2b222ec9b000-2b222ecbe000 rw-p 2b222ec9b000 00:00 0 
7fff7c30d000-7fff7c323000 rw-p 7fff7c30d000 00:00 0                      [stack]
ffffffffff600000-ffffffffffe00000 ---p 00000000 00:00 0                  [vdso]
bash: line 1:  5887 Aborted                 /usr/bin/env MPIRUN_MPD=0 MPIRUN_HOST=bass-gpu26.cs.unc.edu MPIRUN_PORT=47559 MPIRUN_PROCESSES='bass-gpu26:' MPIRUN_RANK=0 MPIRUN_NPROCS=1 MPIRUN_ID=5881 /home/spiglerg/bugtest/pt
-------------- next part --------------
A non-text attachment was scrubbed...
Name: pt.c
Type: text/x-csrc
Size: 823 bytes
Desc: not available
Url : http://mail.cse.ohio-state.edu/pipermail/mvapich-discuss/attachments/20081111/115df7f0/pt-0001.bin

