[mvapich-discuss] port problem using MVAPICH2
Stephan Gerber
StephanGerber at gmx.net
Tue Dec 11 10:37:04 EST 2007
Dear MVAPICH users and developers,
I have some problems using MVAPICH2.
To start with, MVAPICH (1) and Open MPI both run on our InfiniBand cluster, but neither scales as well as I had hoped.
So I want to use MVAPICH2 to achieve better results (hopefully...).
My system is a dual-Opteron cluster with 4 nodes; each node has 2 processors, and each processor has two cores (16 cores in total).
I tried booting with:
mpdboot --totalnum=4 -1 --file=/Users/gerber/mac/mpd.hosts --rsh=ssh --verbose --ncpus=16
In this case I end up with the following error message:
mpdboot_n01.local (handle_mpd_output 373): from mpd on n01, invalid port info:
Does anyone know what the problem might be and how to solve it?
If I use the --chkup(only) option, I see that only one node out of four is up!?
If I don't use the option --totalnum=4, mpdboot works fine, but afterwards mpdtrace still shows that only one host is up...
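One thing that may be worth ruling out for the boot problem is a mismatch between --totalnum and the hosts file. Below is a minimal Python sketch, assuming the hosts file lists one host per line as "host" or "host:ncpus". The helper name parse_mpd_hosts and the host names n02 to n04 are hypothetical, and the sample text only stands in for the real mpd.hosts, whose contents are not shown here.

```python
def parse_mpd_hosts(text):
    """Return a list of (host, ncpus) pairs from mpd.hosts-style text.

    Assumed format: one entry per line, "host" or "host:ncpus";
    blank lines and '#' comments are skipped.
    """
    hosts = []
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()   # drop comments and whitespace
        if not line:
            continue
        entry = line.split()[0]               # ignore any trailing fields
        host, _, ncpus = entry.partition(":")
        hosts.append((host, int(ncpus) if ncpus else 1))
    return hosts


# Hypothetical stand-in for the real hosts file (4 nodes, 4 cores each).
sample = """\
n01:4
n02:4
n03:4
n04:4
"""

hosts = parse_mpd_hosts(sample)
total_cores = sum(n for _, n in hosts)
print(len(hosts), "hosts,", total_cores, "cores")
```

If the number of hosts found this way is smaller than the value given to --totalnum, that alone could explain why fewer mpds come up than expected; it would not by itself explain the "invalid port info" message.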
With the second boot approach I tried starting mpiexec, but I end up with the following error:
rank 3 in job 1 n01.local_39320 caused collective abort of all ranks
exit status of rank 3: return code 13
[rdma_iba_init.c:91] Error initializing MVAPICH2 malloc library
Fatal error in MPI_Init: Other MPI error, error stack:
MPIR_Init_thread(230): Initialization failed
MPID_Init(81)........: channel initialization failed
(unknown)(): Other MPI error[gerber at n01]
Any help would be appreciated!
Thanks in advance.
Best regards,
Stephan