[mvapich-discuss] problem w/MVAPICH in the frames of Gen1

Andrey Slepuhin andrey.slepuhin at t-platforms.ru
Thu Aug 10 13:17:37 EDT 2006


So the problem is with memory locking: see the failed mlock() at the 
end of the output. Regardless of the rlimits, the capability to lock 
memory may be enabled only for root. I don't remember the exact /proc 
entry where you can disable this restriction, but you can try searching for it.
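
A quick way to check both possibilities from the unprivileged account is
sketched below in Python. It is only a sketch under assumptions: it prints
RLIMIT_MEMLOCK as seen by the process and then tries to mlock() a single
page through ctypes. The helper names format_limit and try_mlock are made up
for this example. If the limit looks generous but the call still fails with
EPERM, that points at a privilege check rather than the rlimit; as far as I
know, Linux kernels before 2.6.9 allowed mlock() only for processes with
CAP_IPC_LOCK, whatever the rlimit said.

```python
import ctypes
import errno
import mmap
import resource


def format_limit(value):
    """Render an rlimit value, mapping RLIM_INFINITY to 'unlimited'."""
    return "unlimited" if value == resource.RLIM_INFINITY else f"{value} bytes"


def try_mlock(length):
    """Try to lock `length` bytes of anonymous memory.

    Returns (True, None) on success or (False, errno_name) on failure.
    """
    # CDLL(None) exposes the C library symbols already loaded in the process.
    libc = ctypes.CDLL(None, use_errno=True)
    libc.mlock.argtypes = [ctypes.c_void_p, ctypes.c_size_t]
    libc.mlock.restype = ctypes.c_int
    libc.munlock.argtypes = [ctypes.c_void_p, ctypes.c_size_t]
    libc.munlock.restype = ctypes.c_int

    buf = mmap.mmap(-1, length)  # anonymous, page-aligned mapping
    view = ctypes.c_char.from_buffer(buf)
    addr = ctypes.addressof(view)
    try:
        if libc.mlock(addr, length) == 0:
            libc.munlock(addr, length)
            return True, None
        return False, errno.errorcode.get(ctypes.get_errno(), "unknown")
    finally:
        del view      # release the buffer export before closing the mapping
        buf.close()


if __name__ == "__main__":
    soft, hard = resource.getrlimit(resource.RLIMIT_MEMLOCK)
    print("RLIMIT_MEMLOCK soft:", format_limit(soft))
    print("RLIMIT_MEMLOCK hard:", format_limit(hard))
    ok, err = try_mlock(mmap.PAGESIZE)
    print("mlock of one page:", "ok" if ok else f"failed with {err}")
```

Run it once as root and once as the guest user, ideally through the same
rsh/ssh path that mpirun_rsh uses, and compare the output.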

Best regards,
Andrey

Mikhail Kuzminsky wrote:
> In message from Andrey Slepuhin <andrey.slepuhin at t-platforms.ru> (Wed, 
> 09 Aug 2006 22:08:09 +0400):
>> Mikhail, did you check the permissions of the special files in the /dev 
>> filesystem?
> Andrey,
> thanks for the ideas!
> 
> /dev/mst is 755
> /dev/mst/mt23108* permissions: -c and rw for everybody.
> 
>> Anyway, it is a good idea to run the program under strace to see what's 
>> going wrong.
> 
> I attached strace output to my first message here,
> and now (below) attach strace output for the simpler perf_main: as I wrote 
> in my previous message, the problem isn't limited to mvapich.
> Yours
> Mikhail
> 
> strace on server part (strace perf_main -trc ...)           
> -------------------------------------------------------------
> <skipped>
> open("/usr/local/ifort/lib/x86_64/libvapi.so", O_RDONLY) = -1 ENOENT (No 
> such file or directory)
> stat("/usr/local/ifort/lib/x86_64", 0x7fbfffe6d0) = -1 ENOENT (No such 
> file or directory)
> open("/usr/local/ifort/lib/libvapi.so", O_RDONLY) = -1 ENOENT (No such 
> file or directory)
> stat("/usr/local/ifort/lib", {st_mode=S_IFDIR|0755, st_size=4096, ...}) = 0
> <skipped>
> open("/home/local/ibgd/driver/infinihost/lib64/libvapi.so", O_RDONLY) = 3
> read(3, "\177ELF\2\1\1\0\0\0\0\0\0\0\0\0\3\0>\0\1\0\0\0\220m\0\0"..., 
> 640) = 640
> fstat(3, {st_mode=S_IFREG|0755, st_size=329519, ...}) = 0
> mmap(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) 
> = 0x2a95589000
> mmap(NULL, 1181736, PROT_READ|PROT_EXEC, MAP_PRIVATE, 3, 0) = 0x2a9566d000
> mprotect(0x2a9568a000, 1062952, PROT_NONE) = 0
> mmap(0x2a9576d000, 135168, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED, 
> 3, 0)= 0x2a9576d000
> close(3)                                = 0
> open("/home/SGE/lib/lx24-amd64/libmosal.so", O_RDONLY) = -1 ENOENT (No 
> such file or directory)
> <etc skipped>
> open("/home/local/ibgd/driver/infinihost/lib64/libmosal.so", O_RDONLY) = 3
> read(3, "\177ELF\2\1\1\0\0\0\0\0\0\0\0\0\3\0>\0\1\0\0\0\240B\0\0"..., 
> 640) = 640
> fstat(3, {st_mode=S_IFREG|0755, st_size=165285, ...}) = 0
> mmap(NULL, 1107048, PROT_READ|PROT_EXEC, MAP_PRIVATE, 3, 0) = 0x2a9578e000
> mprotect(0x2a9579b000, 1053800, PROT_NONE) = 0
> mmap(0x2a9588e000, 61440, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED, 
> 3, 0) = 0x2a9588e000
> close(3)                                = 0
> open("/home/SGE/lib/lx24-amd64/libpthread.so.0", O_RDONLY) = -1 ENOENT 
> (No such file or directory)
> <etc skipped>
> open("/lib64/libpthread.so.0", O_RDONLY) = 3
> read(3, "\177ELF\2\1\1\0\0\0\0\0\0\0\0\0\3\0>\0\1\0\0\0PS\0\0\0"..., 
> 640) = 640
> fstat(3, {st_mode=S_IFREG|0755, st_size=93461, ...}) = 0
> mmap(NULL, 1653792, PROT_READ|PROT_EXEC, MAP_PRIVATE, 3, 0) = 0x2a9589d000
> mprotect(0x2a958ab000, 1596448, PROT_NONE) = 0
> mmap(0x2a9599d000, 65536, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED, 
> 3, 0) = 0x2a9599d000
> mmap(0x2a959ad000, 539680, PROT_READ|PROT_WRITE,
> MAP_PRIVATE|MAP_FIXED|MAP_ANONYMOUS, -1, 0) = 0x2a959ad000
> close(3)                                = 0
> open("/home/SGE/lib/lx24-amd64/libc.so.6", O_RDONLY) = -1 ENOENT (No 
> such file or directory)
> <etc skipped>
> open("/lib64/libc.so.6", O_RDONLY)      = 3
> read(3, "\177ELF\2\1\1\0\0\0\0\0\0\0\0\0\3\0>\0\1\0\0\0\20\335\1"..., 
> 640) = 640
> fstat(3, {st_mode=S_IFREG|0755, st_size=1534814, ...}) = 0
> mmap(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) 
> = 0x2a9558a000
> mmap(NULL, 2365888, PROT_READ|PROT_EXEC, MAP_PRIVATE, 3, 0) = 0x2a95a31000
> mprotect(0x2a95b55000, 1169856, PROT_NONE) = 0
> mmap(0x2a95c31000, 253952, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED, 
> 3, 0x100000) = 0x2a95c31000
> mmap(0x2a95c6f000, 14784, PROT_READ|PROT_WRITE, 
> MAP_PRIVATE|MAP_FIXED|MAP_ANONYMOUS, -1, 0) = 0x2a95c6f000
> close(3)                                = 0
> open("/home/SGE/lib/lx24-amd64/libdl.so.2", O_RDONLY) = -1 ENOENT (No 
> such file or directory)
> <etc skipped>
> open("/lib64/libdl.so.2", O_RDONLY)     = 3
> read(3, "\177ELF\2\1\1\0\0\0\0\0\0\0\0\0\3\0>\0\1\0\0\0\260\37\0"..., 
> 640) = 640
> fstat(3, {st_mode=S_IFREG|0755, st_size=16506, ...}) = 0
> mmap(NULL, 1058696, PROT_READ|PROT_EXEC, MAP_PRIVATE, 3, 0) = 0x2a95c73000
> mprotect(0x2a95c76000, 1046408, PROT_NONE) = 0
> mmap(0x2a95d73000, 12288, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED, 
> 3, 0) = 0x2a95d73000
> close(3)                                = 0
> munmap(0x2a9556c000, 117044)            = 0
> brk(0)                                  = 0x51cae0
> brk(0x53dae0)                           = 0x53dae0
> brk(0)                                  = 0x53dae0
> brk(0x53e000)                           = 0x53e000
> arch_prctl(0x1002, 0x51cf80)            = 0
> getpid()                                = 16991
> rt_sigaction(SIGRTMIN, {0x2a958a5f20, [], 0x4000000}, NULL, 8) = 0
> rt_sigaction(SIGRT_1, {0x2a958a5f60, [], 0x4000000}, NULL, 8) = 0
> rt_sigaction(SIGRT_2, {0x2a958a6070, [], 0x4000000}, NULL, 8) = 0
> rt_sigprocmask(SIG_BLOCK, [RTMIN], NULL, 8) = 0
> rt_sigprocmask(SIG_UNBLOCK, [33], NULL, 8) = 0
> _sysctl({{CTL_KERN, KERN_VERSION}, 2, 0x7fbfffed10, 35, (nil), 0}) = 0
> open("/dev/mosal", O_RDONLY)            = 3
> getpid()                                = 16991
> ioctl(3, 0x7800, 0x7fbfffead0)          = 0
> fstat(1, {st_mode=S_IFCHR|0620, st_rdev=makedev(136, 0), ...}) = 0
> mmap(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) 
> = 0x2a9556c000
> write(1, "\n", 1)                       = 1
> write(1, "********************************"..., 
> 45********************************************) = 45
> write(1, "*********  perf_main version 10."..., 45*********  perf_main 
> version 10.3  *********) = 45
> write(1, "*********  CPU is: 1594.86 Mcps "..., 45*********  CPU is: 
> 1594.86 Mcps    *********) = 45
> write(1, "*********  Architecture X86_64  "..., 42********* Architecture 
> X86_64  *********) = 42
> write(1, "********************************"..., 
> 45********************************************) = 45
> write(1, "\n", 1)                       = 1
> socket(PF_INET, SOCK_STREAM, IPPROTO_TCP) = 4
> setsockopt(4, SOL_SOCKET, SO_REUSEADDR, [1], 4) = 0
> bind(4, {sa_family=AF_INET, sin_port=htons(4000), 
> sin_addr=inet_addr("0.0.0.0")}, 16) = 0
> listen(4, 1)                            = 0
> accept(4, {sa_family=AF_INET, sin_port=htons(32840), 
> sin_addr=inet_addr("10.0.0.2")}, [5224175576339709968]) = 5
> sendto(5, "\1\10size=128000 iter=1000 mtu=-1 t"..., 266, 0, NULL, 0) = 266
> getpid()                                = 16991
> open("/dev/vipkl", O_RDONLY)            = 6
> ioctl(6, 0x80287801, 0x7fbfffe400)      = 0
> ioctl(6, 0x80287801, 0x7fbfffe400)      = 0
> getpid()                                = 16991
> ioctl(6, 0x80287801, 0x7fbfffe3e0)      = 0
> ioctl(6, 0x80287801, 0x7fbfffe3e0)      = 0
> getpid()                                = 16991
> ioctl(6, 0x80287801, 0x7fbfffe870)      = 0
> ioctl(6, 0x80287801, 0x7fbfffe870)      = 0
> ioctl(6, 0x80287801, 0x7fbfffe840)      = 0
> ioctl(6, 0x80287801, 0x7fbfffe7b0)      = 0
> ioctl(6, 0x80287801, 0x7fbfffe820)      = 0
> ioctl(6, 0x80287801, 0x7fbfffe730)      = 0
> ioctl(6, 0x80287801, 0x7fbfffe6e0)      = 0
> getrlimit(0x3, 0x7fbfffe5b0)            = 0
> pipe([7, 8])                            = 0
> clone(child_stack=0x523ea0, 
> flags=CLONE_VM|CLONE_FS|CLONE_FILES|CLONE_SIGHAND) = 16992
> write(8, "@\251X\225*\0\0\0\5\0\0\0\177\0\0\0\320\347\377\277\177"..., 
> 168) = 168
> rt_sigprocmask(SIG_SETMASK, NULL, [RTMIN], 8) = 0
> write(8, "\200\317Q\0\0\0\0\0\0\0\0\0*\0\0\0@\350\377\277\177\0\0"..., 
> 168) = 168
> rt_sigprocmask(SIG_SETMASK, NULL, [RTMIN], 8) = 0
> rt_sigsuspend([] <unfinished ...>
> --- SIGRTMIN (Unknown signal 32) @ 0 (0) ---
> <... rt_sigsuspend resumed> )           = -1 EINTR (Interrupted system 
> call)
> rt_sigreturn(0x20)                      = -1 ENOSYS (Function not 
> implemented)
> write(8, "\200\317Q\0\0\0\0\0\0\0\0\0*\0\0\0@\350\377\277\177\0\0"..., 
> 168) = 168
> rt_sigprocmask(SIG_SETMASK, NULL, [RTMIN], 8) = 0
> rt_sigsuspend([] <unfinished ...>
> --- SIGRTMIN (Unknown signal 32) @ 0 (0) ---
> <... rt_sigsuspend resumed> )           = -1 EINTR (Interrupted system 
> call)
> rt_sigreturn(0x20)                      = -1 ENOSYS (Function not 
> implemented)
> open("/usr/local/ifort/lib/libthhul.so", O_RDONLY) = -1 ENOENT (No such 
> file or directory)
> open("/usr/local/intel/compiler70/ia32/lib/libthhul.so", O_RDONLY) = -1 
> ENOENT (No such file or directory)
> open("/etc/ld.so.cache", O_RDONLY)      = 9
> fstat(9, {st_mode=S_IFREG|0644, st_size=117044, ...}) = 0
> mmap(NULL, 117044, PROT_READ, MAP_PRIVATE, 9, 0) = 0x2a9558b000
> close(9)                                = 0
> open("/home/local/ibgd/driver/infinihost/lib64/libthhul.so", O_RDONLY) = 9
> read(9, "\177ELF\2\1\1\0\0\0\0\0\0\0\0\0\3\0>\0\1\0\0\0\240.\0\0"..., 
> 640) = 640
> fstat(9, {st_mode=S_IFREG|0755, st_size=296063, ...}) = 0
> mmap(NULL, 1135944, PROT_READ|PROT_EXEC, MAP_PRIVATE, 9, 0) = 0x2a96176000
> mprotect(0x2a9618a000, 1054024, PROT_NONE) = 0
> mmap(0x2a96276000, 90112, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED, 
> 9, 0) = 0x2a96276000
> close(9)                                = 0
> munmap(0x2a9558b000, 117044)            = 0
> mmap(NULL, 266240, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 
> 0) = 0x2a9558b000
> brk(0)                                  = 0x53e000
> brk(0x560000)                           = 0x560000
> brk(0)                                  = 0x560000
> brk(0x582000)                           = 0x582000
> brk(0)                                  = 0x582000
> brk(0x5a3000)                           = 0x5a3000
> brk(0)                                  = 0x5a3000
> brk(0x5c5000)                           = 0x5c5000
> brk(0)                                  = 0x5c5000
> brk(0x5e7000)                           = 0x5e7000
> brk(0)                                  = 0x5e7000
> brk(0x609000)                           = 0x609000
> brk(0)                                  = 0x609000
> brk(0x62b000)                           = 0x62b000
> ioctl(6, 0x80287801, 0x7fbfffe800)      = 0
> mlock(0x627000, 2150135809)             = -1 EPERM (Operation not permitted)
> write(1, "Error: Allocating PD : Invalid V"..., 47Error: Allocating PD : 
> Invalid Virtual Address) = 47
> --- SIGSEGV (Segmentation fault) @ 0 (0) ---
> +++ killed by SIGSEGV +++
> ----------------------------end of strace---------------------------
> strace on client (strace perf_main -a10.0.0.1):
> open("/usr/local/intel/compiler70/ia32/lib/libvapi.so", O_RDONLY) = -1 
> ENOENT (No such file or directory)
> stat("/usr/local/intel/compiler70/ia32/lib", {st_mode=S_IFDIR|0777, 
> st_size=4096, ...}) = 0
> <etc skipped>
> open("/etc/ld.so.cache", O_RDONLY)      = 3
> fstat(3, {st_mode=S_IFREG|0644, st_size=116851, ...}) = 0
> mmap(NULL, 116851, PROT_READ, MAP_PRIVATE, 3, 0) = 0x2a9556c000
> close(3)                                = 0
> open("/home/local/ibgd/driver/infinihost/lib64/libvapi.so", O_RDONLY) = 3
> read(3, "\177ELF\2\1\1\0\0\0\0\0\0\0\0\0\3\0>\0\1\0\0\0\220m\0\0"..., 
> 640) = 640
> fstat(3, {st_mode=S_IFREG|0755, st_size=329519, ...}) = 0
> mmap(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) 
> = 0x2a95589000
> mmap(NULL, 1181736, PROT_READ|PROT_EXEC, MAP_PRIVATE, 3, 0) = 0x2a9566d000
> mprotect(0x2a9568a000, 1062952, PROT_NONE) = 0
> mmap(0x2a9576d000, 135168, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED, 
> 3, 0)= 0x2a9576d000
> close(3)                                = 0
> open("/home/SGE/lib/lx24-amd64/libmosal.so", O_RDONLY) = -1 ENOENT (No 
> such file or directory)
> <etc skipped>
> open("/home/local/ibgd/driver/infinihost/lib64/libmosal.so", O_RDONLY) = 3
> read(3, "\177ELF\2\1\1\0\0\0\0\0\0\0\0\0\3\0>\0\1\0\0\0\240B\0\0"..., 
> 640) = 640
> fstat(3, {st_mode=S_IFREG|0755, st_size=165285, ...}) = 0
> mmap(NULL, 1107048, PROT_READ|PROT_EXEC, MAP_PRIVATE, 3, 0) = 0x2a9578e000
> mprotect(0x2a9579b000, 1053800, PROT_NONE) = 0
> mmap(0x2a9588e000, 61440, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED, 
> 3, 0) = 0x2a9588e000
> close(3)                                = 0
> open("/home/SGE/lib/lx24-amd64/libpthread.so.0", O_RDONLY) = -1 ENOENT 
> (No such file or directory)
> <etc skipped>
> open("/lib64/libpthread.so.0", O_RDONLY) = 3
> read(3, "\177ELF\2\1\1\0\0\0\0\0\0\0\0\0\3\0>\0\1\0\0\0PS\0\0\0"..., 
> 640) = 640
> fstat(3, {st_mode=S_IFREG|0755, st_size=93461, ...}) = 0
> mmap(NULL, 1653792, PROT_READ|PROT_EXEC, MAP_PRIVATE, 3, 0) = 0x2a9589d000
> mprotect(0x2a958ab000, 1596448, PROT_NONE) = 0
> mmap(0x2a9599d000, 65536, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED, 
> 3, 0) = 0x2a9599d000
> mmap(0x2a959ad000, 539680, PROT_READ|PROT_WRITE, 
> MAP_PRIVATE|MAP_FIXED|MAP_ANONYMOUS, -1, 0) = 0x2a959ad000
> close(3)                                = 0
> open("/home/SGE/lib/lx24-amd64/libc.so.6", O_RDONLY) = -1 ENOENT (No 
> such file or directory)
> open("/usr/local/ifort/lib/libc.so.6", O_RDONLY) = -1 ENOENT (No such 
> file or directory)
> open("/usr/local/intel/compiler70/ia32/lib/libc.so.6", O_RDONLY) = -1 
> ENOENT (No such file or directory)
> open("/opt/globus/lib/libc.so.6", O_RDONLY) = -1 ENOENT (No such file or 
> directory)
> open("/lib64/libc.so.6", O_RDONLY)      = 3
> read(3, "\177ELF\2\1\1\0\0\0\0\0\0\0\0\0\3\0>\0\1\0\0\0\20\335\1"..., 
> 640) = 640
> fstat(3, {st_mode=S_IFREG|0755, st_size=1534814, ...}) = 0
> mmap(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) 
> = 0x2a9558a000
> mmap(NULL, 2365888, PROT_READ|PROT_EXEC, MAP_PRIVATE, 3, 0) = 0x2a95a31000
> mprotect(0x2a95b55000, 1169856, PROT_NONE) = 0
> mmap(0x2a95c31000, 253952, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED, 
> 3, 0x100000) = 0x2a95c31000
> mmap(0x2a95c6f000, 14784, PROT_READ|PROT_WRITE, 
> MAP_PRIVATE|MAP_FIXED|MAP_ANONYMOUS, -1, 0) = 0x2a95c6f000
> close(3)                                = 0
> open("/home/SGE/lib/lx24-amd64/libdl.so.2", O_RDONLY) = -1 ENOENT (No 
> such file or directory)
> <etc skipped>
> open("/lib64/libdl.so.2", O_RDONLY)     = 3
> read(3, "\177ELF\2\1\1\0\0\0\0\0\0\0\0\0\3\0>\0\1\0\0\0\260\37\0"..., 
> 640) = 640
> fstat(3, {st_mode=S_IFREG|0755, st_size=16506, ...}) = 0
> mmap(NULL, 1058696, PROT_READ|PROT_EXEC, MAP_PRIVATE, 3, 0) = 0x2a95c73000
> mprotect(0x2a95c76000, 1046408, PROT_NONE) = 0
> mmap(0x2a95d73000, 12288, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED, 
> 3, 0) = 0x2a95d73000
> close(3)                                = 0
> munmap(0x2a9556c000, 116851)            = 0
> brk(0)                                  = 0x51cae0
> brk(0x53dae0)                           = 0x53dae0
> brk(0)                                  = 0x53dae0
> brk(0x53e000)                           = 0x53e000
> arch_prctl(0x1002, 0x51cf80)            = 0
> getpid()                                = 23995
> rt_sigaction(SIGRTMIN, {0x2a958a5f20, [], 0x4000000}, NULL, 8) = 0
> rt_sigaction(SIGRT_1, {0x2a958a5f60, [], 0x4000000}, NULL, 8) = 0
> rt_sigaction(SIGRT_2, {0x2a958a6070, [], 0x4000000}, NULL, 8) = 0
> rt_sigprocmask(SIG_BLOCK, [RTMIN], NULL, 8) = 0
> rt_sigprocmask(SIG_UNBLOCK, [33], NULL, 8) = 0
> _sysctl({{CTL_KERN, KERN_VERSION}, 2, 0x7fbfffecc0, 35, (nil), 0}) = 0
> open("/dev/mosal", O_RDONLY)            = 3
> getpid()                                = 23995
> ioctl(3, 0x7800, 0x7fbfffea80)          = 0
> fstat(1, {st_mode=S_IFCHR|0620, st_rdev=makedev(136, 0), ...}) = 0
> mmap(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) 
> = 0x2a9556c000
> write(1, "\n", 1)                       = 1
> write(1, "********************************"..., 
> 45********************************************) = 45
> write(1, "*********  perf_main version 10."..., 45*********  perf_main 
> version 10.3  *********) = 45
> write(1, "*********  CPU is: 1593.92 Mcps "..., 45*********  CPU is: 
> 1593.92 Mcps    *********) = 45
> write(1, "*********  Architecture X86_64  "..., 42********* Architecture 
> X86_64  *********) = 42
> write(1, "********************************"..., 
> 45********************************************) = 45
> write(1, "\n", 1)                       = 1
> socket(PF_INET, SOCK_STREAM, IPPROTO_TCP) = 4
> setsockopt(4, SOL_SOCKET, SO_REUSEADDR, [17179869184001], 4) = 0
> connect(4, {sa_family=AF_INET, sin_port=htons(4000), 
> sin_addr=inet_addr("10.0.0.1")}, 16) = 0
> recvfrom(4, "\1\10", 2, 0, NULL, NULL)  = 2
> recvfrom(4, "size=128000 iter=1000 mtu=-1 tra"..., 264, 0, NULL, NULL) = 
> 264
> getpid()                                = 23995
> open("/dev/vipkl", O_RDONLY)            = 5
> ioctl(5, 0x80287801, 0x7fbfffe3b0)      = 0
> ioctl(5, 0x80287801, 0x7fbfffe3b0)      = 0
> getpid()                                = 23995
> ioctl(5, 0x80287801, 0x7fbfffe390)      = 0
> ioctl(5, 0x80287801, 0x7fbfffe390)      = 0
> getpid()                                = 23995
> ioctl(5, 0x80287801, 0x7fbfffe820)      = 0
> ioctl(5, 0x80287801, 0x7fbfffe820)      = 0
> ioctl(5, 0x80287801, 0x7fbfffe7f0)      = 0
> ioctl(5, 0x80287801, 0x7fbfffe760)      = 0
> ioctl(5, 0x80287801, 0x7fbfffe7d0)      = 0
> ioctl(5, 0x80287801, 0x7fbfffe6e0)      = 0
> ioctl(5, 0x80287801, 0x7fbfffe690)      = 0
> getrlimit(0x3, 0x7fbfffe560)            = 0
> pipe([6, 7])                            = 0
> clone(child_stack=0x523ea0, 
> flags=CLONE_VM|CLONE_FS|CLONE_FILES|CLONE_SIGHAND) = 23996
> write(7, "@\251X\225*\0\0\0\5\0\0\0\177\0\0\0\200\347\377\277\177"..., 
> 168) = 168
> rt_sigprocmask(SIG_SETMASK, NULL, [RTMIN], 8) = 0
> write(7, "\200\317Q\0\0\0\0\0\0\0\0\0*\0\0\0\360\347\377\277\177"..., 
> 168) = 168
> rt_sigprocmask(SIG_SETMASK, NULL, [RTMIN], 8) = 0
> rt_sigsuspend([] <unfinished ...>
> --- SIGRTMIN (Unknown signal 32) @ 0 (0) ---
> <... rt_sigsuspend resumed> )           = -1 EINTR (Interrupted system 
> call)
> rt_sigreturn(0x20)                      = -1 ENOSYS (Function not 
> implemented)
> write(7, "\200\317Q\0\0\0\0\0\0\0\0\0*\0\0\0\360\347\377\277\177"..., 
> 168) = 168
> rt_sigprocmask(SIG_SETMASK, NULL, [RTMIN], 8) = 0
> rt_sigsuspend([] <unfinished ...>
> --- SIGRTMIN (Unknown signal 32) @ 0 (0) ---
> <... rt_sigsuspend resumed> )           = -1 EINTR (Interrupted system 
> call)
> rt_sigreturn(0x20)                      = -1 ENOSYS (Function not 
> implemented)
> open("/usr/local/ifort/lib/libthhul.so", O_RDONLY) = -1 ENOENT (No such 
> file or directory)
> <etc skipped>
> open("/etc/ld.so.cache", O_RDONLY)      = 8
> fstat(8, {st_mode=S_IFREG|0644, st_size=116851, ...}) = 0
> mmap(NULL, 116851, PROT_READ, MAP_PRIVATE, 8, 0) = 0x2a9558b000
> close(8)                                = 0
> open("/home/local/ibgd/driver/infinihost/lib64/libthhul.so", O_RDONLY) = 8
> read(8, "\177ELF\2\1\1\0\0\0\0\0\0\0\0\0\3\0>\0\1\0\0\0\240.\0\0"..., 
> 640) = 640
> fstat(8, {st_mode=S_IFREG|0755, st_size=296063, ...}) = 0
> mmap(NULL, 1135944, PROT_READ|PROT_EXEC, MAP_PRIVATE, 8, 0) = 0x2a96176000
> mprotect(0x2a9618a000, 1054024, PROT_NONE) = 0
> mmap(0x2a96276000, 90112, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED, 
> 8, 0) = 0x2a96276000
> close(8)                                = 0
> munmap(0x2a9558b000, 116851)            = 0
> mmap(NULL, 266240, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 
> 0) = 0x2a9558b000
> brk(0)                                  = 0x53e000
> brk(0x560000)                           = 0x560000
> brk(0)                                  = 0x560000
> brk(0x582000)                           = 0x582000
> brk(0)                                  = 0x582000
> brk(0x5a3000)                           = 0x5a3000
> brk(0)                                  = 0x5a3000
> brk(0x5c5000)                           = 0x5c5000
> brk(0)                                  = 0x5c5000
> brk(0x5e7000)                           = 0x5e7000
> brk(0)                                  = 0x5e7000
> brk(0x609000)                           = 0x609000
> brk(0)                                  = 0x609000
> brk(0x62b000)                           = 0x62b000
> ioctl(5, 0x80287801, 0x7fbfffe7b0)      = 0
> mlock(0x627000, 2150135809)             = -1 EPERM (Operation not permitted)
> write(1, "Error: Allocating PD : Invalid V"..., 47Error: Allocating PD : 
> Invalid Virtual Address) = 47
> --- SIGSEGV (Segmentation fault) @ 0 (0) ---
> +++ killed by SIGSEGV +++
> ------------------------end of strace -------------------------------
> 
>>
>> WBR,
>> Andrey
>>
>> Mikhail Kuzminsky wrote:
>>> To be more exact and brief:
>>>
>>> 1) All the limits are the same for root and "guest" (used for the MPI test)
>>>
>>> 2) This is now true for both rsh and ssh calls
>>>
>>> 3) mvapich-0.9.8 works OK as root
>>>
>>> 4) But as guest, mpirun_rsh with both -rsh and -ssh fails in viainit.c
>>>
>>> So what else differs between root and guest?
>>>
>>> Yours
>>> Mikhail
>>
>> -- 
>> A right thing should be simple (tm)

-- 
A right thing should be simple (tm)

