[mvapich-discuss] Problem solved: program crashing running
mvapich over infiniband
Derek Stewart
stewart at cnf.cornell.edu
Thu Feb 5 00:12:23 EST 2009
Hi Matthew,
I just wanted to update you and let you know that I solved the problem. It
turns out the local copies of the input file on the different nodes were not
identical. The process on one node was set up for a larger-memory
calculation, so it expected far more data than the small-memory versions
started on the other nodes actually sent. Once I made the input files
identical on every node, it ran smoothly.
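In case it helps anyone who runs into the same warning, here is a minimal
sketch of the kind of check that would have caught this before the job
started: hash the input file on each node and compare the digests. This is
plain Python (standard library only), not part of abinit or mvapich, and the
function names and node names are hypothetical. It assumes you collect one
digest per node yourself, for example by running file_digest on each node
through ssh or your job launcher.

```python
import hashlib
from collections import defaultdict


def file_digest(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of the file at `path`."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        # Read in chunks so large input files do not need to fit in memory.
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


def group_by_digest(digests):
    """Group node names by digest.

    `digests` maps a node name (hypothetical, e.g. "node1") to the digest
    computed on that node. More than one group means the copies differ.
    """
    groups = defaultdict(list)
    for node, digest in sorted(digests.items()):
        groups[digest].append(node)
    return dict(groups)


def copies_match(digests):
    """Return True if every node reported the same digest."""
    return len(set(digests.values())) <= 1
```

If copies_match returns False, group_by_digest shows which nodes hold which
version of the file, so it is easy to see which copy is the odd one out.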
Thanks again,
Derek
Matthew Koop writes:
> Hi Derek,
>
> Thanks for reporting this problem. Can you give us some additional
> information about the run/system? How many processes are you running with
> and what HCAs are you using?
>
> We're also interested in trying to reproduce the problem here on our
> machines. Is there a dataset that you are using that you could send to us?
>
> Matt
>
> On Wed, 4 Feb 2009, Derek Stewart wrote:
>
>> Hi all,
>>
>> I was wondering if anyone would have a suggestion for this error. I am
>> running abinit version 5.4.4p compiled with mvapich2-1.2p1 and gcc (GCC)
>> 3.4.6 and gfortran 4.1.2, Linux 2.6.9-78.0.13.ELsmp 64-bit.
>>
>> Warning! Rndv Receiver is receiving (36680 < 1263624) less than as expected
>> rank 1 in job 1  c32_32836  caused collective abort of all ranks
>> exit status of rank 1: killed by signal 9
>>
>>
>> Thanks,
>>
>> Derek
>>
>> ################################
>> Derek Stewart, Ph. D.
>> Scientific Computation Associate
>> http://www.people.cornell.edu/pages/das248/
>> 250 Duffield Hall
>> Cornell Nanoscale Facility (CNF)
>> Ithaca, NY 14853
>> stewart (at) cnf.cornell.edu
>> (607) 255-2856
>> _______________________________________________
>> mvapich-discuss mailing list
>> mvapich-discuss at cse.ohio-state.edu
>> http://mail.cse.ohio-state.edu/mailman/listinfo/mvapich-discuss
>>
>
################################
Derek Stewart, Ph. D.
Scientific Computation Associate
http://www.people.cornell.edu/pages/das248/
250 Duffield Hall
Cornell Nanoscale Facility (CNF)
Ithaca, NY 14853
stewart (at) cnf.cornell.edu
(607) 255-2856