[mvapich-discuss] MPI Job crash on multi-node settings only

Kin Fai Tse kftse20031207 at gmail.com
Tue Nov 18 17:33:56 EST 2014


Hello Hari,

My MVAPICH2 was configured with only the compilers set to Intel Compilers
11.1, and I launch the job with this line:

mpirun_rsh -np 2 -hostfile nf ./a.out

where nf contains 2 lines, z1-1 and z1-4, i.e. the hostfile is simply:
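
z1-1
z1-4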

I have actually run the same program on several clusters that I have access
to, and only that particular one (z1-x) shows the problem. The problem does
not appear even on another cluster (z0) that was purchased together with z1
but is connected to a different InfiniBand switch.

While investigating this, I heard that z1's InfiniBand connection might be
set up differently from z0's, so I do suspect an InfiniBand problem. However,
I don't know how to interpret the error that is occasionally reported:
Cannot allocate memory (12).
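
One thing I plan to check (this is only a guess on my part, not something I
have confirmed) is whether the nodes allow enough memory to be locked and
registered for InfiniBand, e.g. by comparing the output of

ulimit -l

on the z0 and z1 nodes, since registration failures can also show up as
"Cannot allocate memory".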

Regards,
Kin Fai

On Wednesday, November 19, 2014, Hari Subramoni <subramoni.1 at osu.edu> wrote:

> Hello Kin,
>
> I ran your program on our local cluster in a multi-node setting multiple
> times with the latest MVAPICH2-2.1a and was not able to reproduce the failure
> you were describing.
>
> From the error message, it looks like there might be a firewall running
> on your system that prevents mpirun_rsh from reaching the second node, leading
> to the error. Could you please consult your system administrator and
> disable any firewalls that could be running and retry? Could you also let
> us know how you're launching the job using mpirun_rsh, your hostfile, and
> how you configured MVAPICH2?
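> The output of "mpiname -a" would also help, since it lists the exact MVAPICH2
> version and the configure options used to build it (assuming the MVAPICH2 bin
> directory is on your PATH).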
>
>
> Regards,
> Hari.
>
> On Mon, Nov 17, 2014 at 6:42 PM, Kin Fai Tse <kftse20031207 at gmail.com> wrote:
>
>> Dear all,
>>
>> I am running a small MPI program on cluster using mpirun_rsh.
>>
>> When the 2 processes are on the same node, there is no problem.
>> But when I use 2 processes on 2 different nodes, communicating a small
>> part (approximately 1562500 elements) of a very large static array
>> immediately crashes the program during launch.
>>
>> The error is:
>>
>> [z1-4:mpispawn_1][child_handler] MPI process (rank: 1, pid: 29421)
>> terminated with signal 11 -> abort job
>> [z1-0:mpirun_rsh][process_mpispawn_connection] mpispawn_1 from node z1-4
>> aborted: MPI process error (1)
>> [z1-0:mpispawn_0][read_size] read() failed on file descriptor 8:
>> Connection reset by peer (104)
>> [z1-0:mpispawn_0][read_size] Unexpected End-Of-File on file descriptor 8.
>> MPI process died?
>> [z1-0:mpispawn_0][error_sighandler] Caught error: Segmentation fault
>> (signal 11)
>> [unset]: Error reading initack on 6
>> Error on readline:: Connection reset by peer
>> /bin/bash: line 1: 29409 Segmentation fault
>>
>>
>> and occasionally I get a delayed error message, up to 30s after running
>> the program:
>>
>> [z1-0:mpi_rank_0][handle_cqe] Send desc error in msg to 1, wc_opcode=0
>> [z1-0:mpi_rank_0][handle_cqe] Msg from 1: wc.status=12,
>> wc.wr_id=0xc8d1c0, wc.opcode=0, vbuf->phead->type=0 =
>> MPIDI_CH3_PKT_EAGER_SEND
>> [z1-0:mpi_rank_0][handle_cqe]
>> src/mpid/ch3/channels/mrail/src/gen2/ibv_channel_manager.c:573: [] Got
>> completion with error 12, vendor code=0x81, dest rank=1
>> : Cannot allocate memory (12)
>>
>>
>> Here is my program for your reference:
>> I am sure the crash is due to the MPI communication: the program never
>> crashes when I comment out both the Send and the Recv, but commenting out
>> only one of them does not help.
>>
>> #include "mpi.h"
>> #include <cstdio>
>>
>> #define MAXBLOCK 9999999
>> #define INIT 1000
>> #define INCR 1000
>>
>> int main(int argc, char* argv[]){
>>  int rank, size;
>>  int i;
>>  double time;
>>  double data[MAXBLOCK];
>>  double data2[2];
>>  MPI::Status status;
>>  MPI::Init();
>>  time=MPI::Wtime();
>>  rank = MPI::COMM_WORLD.Get_rank();
>>  size = MPI::COMM_WORLD.Get_size();
>>  if(rank == 0){
>>   for(i = INIT; i < MAXBLOCK; i+=INCR){
>>   data[i]=data[i];
>>    MPI::COMM_WORLD.Send(data, i, MPI::DOUBLE, 1, 0);
>>    printf("Size: %d sent.\n", i);
>>   }
>>  } else {
>>   i = INIT;
>>   while(i < MAXBLOCK){
>>   data[i]=data[i];
>>    MPI::COMM_WORLD.Recv(data, i, MPI::DOUBLE, 0, 0, status);
>>    i+=INCR;
>>   }
>>  }
>>  MPI::Finalize();
>>  return 0;
>> }
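>>
>> (For completeness: I build the program with the MVAPICH2 C++ wrapper, e.g.
>> something like "mpicxx test.cpp -o a.out" -- the source file name here is
>> just a placeholder -- and launch it with mpirun_rsh as mentioned above.)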
>>
>> I am quite frustrated about why communicating only a fraction of the data
>> in the whole array crashes in a multi-node setting.
>>
>> Best regards,
>> Kin Fai
>>
>> _______________________________________________
>> mvapich-discuss mailing list
>> mvapich-discuss at cse.ohio-state.edu
>> http://mailman.cse.ohio-state.edu/mailman/listinfo/mvapich-discuss
>>
>>
>