[mvapich-discuss] Error running test program after MVAPICH2-1.4.1 installation

senthil.natesan at ndsu.edu senthil.natesan at ndsu.edu
Mon May 17 15:38:18 EDT 2010


Thanks, Jonathan. I tried both approaches: the full path to the binary
worked, but ./pmemd did not. Would you still like me to recompile with the
flags you mentioned and run the program again? I would be happy to do so.
Please let me know.
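
For reference, a wrapper that does not rely on PATH being preserved by
mpirun_rsh/mpispawn could be sketched as below. This is only a sketch:
the helper name resolve_binary and the optional fallback directory are
hypothetical, chosen for illustration, and are not part of MVAPICH2 or
pmemd.

```shell
#!/bin/sh
# Hypothetical helper (not part of MVAPICH2): print an absolute path for
# a program name. PATH is searched first; if that fails, a fallback
# directory is checked (assumed default: the current directory).
resolve_binary() {
    name=$1
    fallback_dir=${2:-$PWD}

    # command -v prints an absolute path for programs found on PATH;
    # builtins and aliases print something else, so accept only "/...".
    found=$(command -v "$name" 2>/dev/null)
    case $found in
        /*) printf '%s\n' "$found"; return 0 ;;
    esac

    if [ -x "$fallback_dir/$name" ]; then
        printf '%s\n' "$fallback_dir/$name"
        return 0
    fi

    # Report the PATH seen by this process to help debug remote launches.
    echo "resolve_binary: cannot find $name (PATH=$PATH)" >&2
    return 1
}

# Intended use inside pmemd-wrapper (left commented so that this block
# runs without pmemd installed):
#   PMEMD=$(resolve_binary pmemd) || exit 1
#   exec "$PMEMD" -O -i min1.in -o min1.out -inf min1info \
#       -p C0101sol.prmtop -c C0101sol.inpcrd \
#       -r C0101solmin1.rst -ref C0101sol.inpcrd
```

On failure the helper prints the PATH it actually saw, which should show
whether the environment differs on the compute nodes.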

Thanks,

Senthil



> Senthil:
> I have cc'ed this to an internal developers list.  Your pmemd-wrapper
> script is being executed however it still cannot find pmemd.  I
> believe that somehow the PATH is not being set or preserved correctly
> when mpirun_rsh/mpispawn is used.  Can you try either specifying
> ./pmemd in your script or the full path to the binary?
>
> It may also be useful for you to set
>     CFLAGS="-DSPAWN_DEBUG -DMPISPAWN_DEBUG"
> when you're compiling mvapich2 so that we can see more debugging
> information while mpirun_rsh/mpispawn is running.
>
> On Sat, May 15, 2010 at 7:45 PM,  <senthil.natesan at ndsu.edu> wrote:
>>
>> Thank you, Jonathan, for your time and suggestions. I tried the
>> wrapper script, but it does not fix the problem. I get the
>> following error:
>>
>> /usr/bin/xauth:  error in locking authority file
>> /home/senthil/.Xauthority
>> /usr/bin/xauth:  error in locking authority file
>> /home/senthil/.Xauthority
>> pmemd-wrapper: line 2: pmemd: command not found
>> MPI process (rank: 1) terminated unexpectedly on compute-0-1.local
>> Exit code -5 signaled from compute-0-1
>> pmemd-wrapper: line 2: pmemd: command not found
>> pmemd-wrapper: line 2: pmemd: command not found
>> MPI process (rank: 0) terminated unexpectedly on compute-0-0.local
>> MPI process (rank: 2) terminated unexpectedly on compute-0-2.local
>> pmemd-wrapper: line 2: pmemd: command not found
>> MPI process (rank: 3) terminated unexpectedly on compute-0-3.local
>>
>>
>> I once again want to emphasize that the same command "pmemd" works well
>> with mpiexec.
>>
>> thanks very much.
>>
>> Senthil
>>
>>
>>
>>
>>
>>> It looks like we're having problems with executing the command due to
>>> the extra options that need to be passed.  The error message indicates
>>> that it is trying to execute `pmemd -O -i min1.in -o min1.out -inf
>>> ...', instead of `pmemd'.  The problem seen in your first email seems
>>> to be resolved since `mpispawn' is now found but we'll look into this
>>> second issue.  As a workaround in the meantime, you should be able to
>>> create a script that simply calls the real binary with the needed
>>> options.
>>>
>>> Ex:
>>> #!/bin/sh
>>> # filename: pmemd-wrapper
>>> # permissions: 755
>>> pmemd -O -i min1.in -o min1.out -inf min1info -p C0101sol.prmtop \
>>>     -c C0101sol.inpcrd -r C0101solmin1.rst -ref C0101sol.inpcrd
>>>
>>> mpirun_rsh -np 4 -hostfile hostlist pmemd-wrapper
>>>
>>> On Sat, May 15, 2010 at 4:24 PM,  <senthil.natesan at ndsu.edu> wrote:
>>>>
>>>>> Jonathan Perkins wrote:
>>>>
>>>>> Just to check, did you install mvapich2 on all of the machines in the
>>>>> hostfile?
>>>>>
>>>>
>>>>
>>>> I installed mvapich2 on all compute nodes, but I am still getting the
>>>> following error.
>>>>
>>>>
>>>> mpirun_rsh -np 4 -hostfile hostlist pmemd -O -i min1.in -o min1.out \
>>>>     -inf min1info -p C0101sol.prmtop -c C0101sol.inpcrd \
>>>>     -r C0101solmin1.rst -ref C0101sol.inpcrd
>>>>
>>>> ****************************************
>>>> /usr/bin/xauth:  error in locking authority file
>>>> /home/senthil/.Xauthority
>>>> /usr/bin/xauth:  error in locking authority file
>>>> /home/senthil/.Xauthority
>>>> /usr/bin/xauth:  error in locking authority file
>>>> /home/senthil/.Xauthority
>>>> execv: No such file or directory
>>>> pmemd -O -i min1.in -o min1.out -inf min1info -p C0101sol.prmtop -c
>>>> C0101sol.inpcrd -r C0101solmin1.rst -ref C0101sol.inpcrd
>>>> execv: No such file or directory
>>>> pmemd execv: No such file or directory
>>>> pmemd -O -i min1.in -o min1.out -inf min1info -p C0101sol.prmtop -c
>>>> C0101sol.inpcrd -r C0101solmin1.rst -ref C0101sol.inpcrd
>>>> execv: No such file or directory
>>>> pmemd -O -i min1.in -o min1.out -inf min1info -p C0101sol.prmtop -c
>>>> C0101sol.inpcrd -r C0101solmin1.rst -ref C0101sol.inpcrd
>>>> MPI process (rank: 1) terminated unexpectedly on compute-0-1.local
>>>> -O -i min1.in -o min1.out -inf min1info MPI process (rank: 2)
>>>> terminated
>>>> unexpectedly on compute-0-2.local
>>>> -p Exit code -5 signaled from compute-0-1
>>>> C0101sol.prmtop MPI process (rank: 3) terminated unexpectedly on
>>>> compute-0-3.local
>>>> -c C0101sol.inpcrd -r C0101solmin1.rst -ref C0101sol.inpcrd
>>>> MPI process (rank: 0) terminated unexpectedly on compute-0-0.local
>>>>
>>>> *******************************************************
>>>>
>>>>
>>>> The same program runs fine with mpiexec (from mvapich2). The
>>>> submission script is as follows.
>>>>
>>>>
>>>> mpdboot -v -n 5 --ifhn=craycx1 -f mpd.hosts
>>>>
>>>> mpiexec -machinefile hostlist -np 4 pmemd -O -i min1.in -o min1.out \
>>>>     -inf min1info -p C0101sol.prmtop -c C0101sol.inpcrd \
>>>>     -r C0101solmin1.rst -ref C0101sol.inpcrd
>>>>
>>>> mpdallexit
>>>>
>>>> Thanks in advance for your suggestions.
>>>>
>>>>
>>>> Senthil Natesan
>>>>
>>>>
>>>>
>>>>
>>>>
>>>
>>>
>>>
>>> --
>>> Jonathan Perkins
>>>
>>>
>>
>>
>>
>
>
>
> --
> Jonathan Perkins
>
>



