[mvapich-discuss] MPIDI_CH3I_SMP_Init(1852): write: Success

Xiaoyi Lu lu.932 at osu.edu
Fri Jan 6 13:10:20 EST 2017


Hi, Qiuyi,

Thanks for your interest in our project.

To run MPI jobs efficiently in Docker containers, please try the MVAPICH2-Virt package, which is available on our website.
You can follow the steps in our user guide (http://mvapich.cse.ohio-state.edu/userguide/virt/) to install and configure it.
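
Independently of which package is used, it may be worth confirming that every container named in the hostfile is reachable before launching, since mpirun_rsh starts its mpispawn helpers on each host over ssh (or rsh). The following is only a minimal sketch of such a preflight check, not part of MVAPICH2: the function names unique_hosts and preflight are hypothetical, and it assumes a hostfile with one host per line, optionally followed by ":N", with "#" comments allowed. It does not diagnose the mt_checkin "Connection refused" message itself, which is reported when connecting back to the launcher.

```shell
#!/bin/sh
# Hypothetical helper (not an MVAPICH2 tool): print the unique host names
# found in a hostfile. Assumes one host per line, optionally "host:N";
# "#" comments and blank lines are ignored.
unique_hosts() {
    sed -e 's/#.*//' \
        -e 's/^[[:space:]]*//' \
        -e 's/[[:space:]]*$//' \
        -e '/^$/d' "$1" \
        | cut -d: -f1 | sort -u
}

# Hypothetical preflight: attempt a non-interactive ssh login to each host
# and report which ones fail. Returns non-zero if any host is unreachable.
preflight() {
    hostfile=$1
    failed=0
    for h in $(unique_hosts "$hostfile"); do
        if ssh -o BatchMode=yes -o ConnectTimeout=5 "$h" true 2>/dev/null; then
            echo "ok   $h"
        else
            echo "FAIL $h"
            failed=$((failed + 1))
        fi
    done
    [ "$failed" -eq 0 ]
}
```

With the hostfile from the original message, this could be run inside the launching container as `preflight /home/lqy/new/host`. Any host reported as FAIL cannot be reached with password-less ssh from that container.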

Please feel free to let us know if you have any questions.

Thanks,
Xiaoyi

> On Jan 6, 2017, at 4:27 AM, 吕秋义 <lvqiuyi at 126.com> wrote:
> 
> Hello,   
>     I want to run Gromacs in Docker using mvapich2-2.1. InfiniBand is installed on my host, and the same software runs well directly on the host. But when I run Gromacs in Docker, I get this error:
> 
> connect [mt_checkin]: Connection refused
> connect [mt_checkin]: Connection refused
> connect [mt_checkin]: Connection refused
> [infiniband-mynode2:mpirun_rsh][child_handler] Error in init phase, aborting! (1/12 mpispawn connections)
> connect [mt_checkin]: Connection refused
> [infiniband-mynode2:mpirun_rsh][child_handler] Error in init phase, aborting! (1/12 mpispawn connections)
> connect [mt_checkin]: Connection refused
> connect [mt_checkin]: Connection refused
> 
> The command I use is "mpirun_rsh -np 192 -hostfile /home/lqy/new/host  /home/lqy/gromacs-4.5.3/bin/mdrun_mpi -s /home/lqy/new/lmd_10.tpr -deffnm lmd_10"
> 
> When I use the command "mpirun -np 192 -hostfile /home/lqy/new/host  /home/lqy/gromacs-4.5.3/bin/mdrun_mpi -s /home/lqy/new/lmd_10.tpr -deffnm lmd_10", I get the following error:
> 
> [cli_0]: aborting job:
> Fatal error in MPI_Init:
> Other MPI error, error stack:
> MPIR_Init_thread(514)....: 
> MPID_Init(359)...........: channel initialization failed
> MPIDI_CH3_Init(446)......: 
> MPIDI_CH3I_SMP_Init(1852): write: Success
> 
> [infiniband-mynode2:mpi_rank_144][error_sighandler] Caught error: Segmentation fault (signal 11)
> [infiniband-mynode2:mpi_rank_84][error_sighandler] Caught error: Segmentation fault (signal 11)
> [infiniband-mynode2:mpi_rank_108][error_sighandler] Caught error: Segmentation fault (signal 11)
> [infiniband-mynode2:mpi_rank_36][error_sighandler] Caught error: Segmentation fault (signal 11)
> [infiniband-mynode2:mpi_rank_60][error_sighandler] Caught error: Segmentation fault (signal 11)
> [infiniband-mynode2:mpi_rank_96][error_sighandler] Caught error: Segmentation fault (signal 11)
> [infiniband-mynode2:mpi_rank_48][error_sighandler] Caught error: Segmentation fault (signal 11)
> [infiniband-mynode2:mpi_rank_156][error_sighandler] Caught error: Segmentation fault (signal 11)
> [infiniband-mynode2:mpi_rank_72][error_sighandler] Caught error: Segmentation fault (signal 11)
>     
>     
>     I use IPoIB, and I use pipework to pass the IB interface's IP address to the container.
>     I can't find any information about this error. Any help would be appreciated!
> 
