[mvapich-discuss] mpirun_rsh issue with different users on the
same computers
Krishna Chaitanya Kandalla
kandalla at cse.ohio-state.edu
Thu Jan 7 13:34:31 EST 2010
Robert,
Once you have started the second job, are the two jobs making any
progress? Can you try disabling CPU affinity by setting
MV2_ENABLE_AFFINITY=0 when launching both jobs? In MVAPICH2, CPU
affinity is on by default, and the library itself maps processes to
cores. However, if you run more than one job on the same compute node,
it is very likely that processes from the two jobs will be bound to the
same cores, since each job computes its mapping independently. In such
cases, we are better off letting the Linux kernel take care of mapping
the processes to cores. Of course, this assumes that each node has a few
"idle" cores when you start the second job. Also, which version of
MVAPICH/MVAPICH2 are you using?
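
To make the suggestion concrete, here is a minimal sketch in Python, not an official MVAPICH2 tool. It does two things: it assembles an mpirun_rsh command line with MV2_ENABLE_AFFINITY=0 placed before the executable, and it reports which cores are claimed by more than one pinned process. The function names, the hostfile name "hosts", the executable "./a.out", and the rule "a mask smaller than the node's core count means pinned" are all assumptions made for illustration.

```python
import os
from collections import Counter


def build_launch_command(num_procs, hostfile, executable, enable_affinity=False):
    """Return an mpirun_rsh argument list with MV2_ENABLE_AFFINITY set explicitly.

    mpirun_rsh takes environment settings as NAME=VALUE arguments placed
    before the executable. The helper itself is hypothetical.
    """
    affinity = "MV2_ENABLE_AFFINITY={}".format(1 if enable_affinity else 0)
    return ["mpirun_rsh", "-np", str(num_procs),
            "-hostfile", hostfile, affinity, executable]


def contended_cores(affinities, total_cpus):
    """Return the sorted cores that more than one pinned process is bound to.

    affinities maps a PID to the set of CPUs it may run on. A process whose
    mask covers every CPU is treated as unpinned and ignored (an assumption
    of this sketch).
    """
    counts = Counter()
    for cpus in affinities.values():
        if len(cpus) < total_cpus:
            counts.update(cpus)
    return sorted(cpu for cpu, n in counts.items() if n > 1)


if __name__ == "__main__":
    # Hypothetical hostfile and executable names.
    print(" ".join(build_launch_command(4, "hosts", "./a.out")))
    # On Linux, show which CPUs the current process is allowed to run on.
    if hasattr(os, "sched_getaffinity"):
        print(sorted(os.sched_getaffinity(0)))
```

On a shared node, the masks of the MPI processes could be collected with os.sched_getaffinity(pid) and passed to contended_cores. A non-empty result would suggest that the two jobs are pinned to the same cores.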
Thanks,
Krishna
Robert Soliday wrote:
> If I launch a job with mpirun_rsh it runs without issues. But if while
> I am running this job, another user starts another job with mpirun_rsh
> and it lands on some of the same nodes then both jobs pause until one
> of them is killed. Then the other that was not killed finishes
> normally. Is there a port conflict or something that I should be
> looking for?
>
> Thanks,
> --Bob