[mvapich-discuss] mpirun_rsh issue with different users on the same computers

Krishna Chaitanya Kandalla kandalla at cse.ohio-state.edu
Thu Jan 7 13:34:31 EST 2010


Robert,

Once you have started the second job, are the two jobs making any
progress? Can you try disabling CPU affinity by setting
MV2_ENABLE_AFFINITY=0 when launching both jobs? In MVAPICH2, CPU
affinity is on by default, and the library itself maps processes to
cores. However, if you run more than one job on the same compute node,
it's very likely that processes from both jobs will be bound to the same
cores, since each job does its mapping without knowing about the other.
In such cases, we are better off letting the Linux kernel take care of
mapping the processes to cores. Of course, this assumes that each node
still has a few idle cores when you start the second job. Also, which
version of MVAPICH/MVAPICH2 are you using?
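
One way to check whether the two jobs really are pinned to the same cores is to look at the CPU affinity mask of each MPI process on a shared node. The following is only a minimal sketch, not part of MVAPICH2: it assumes a Linux node with Python 3, that you have already collected the PIDs of the ranks from both jobs (for example with ps), and the helper names allowed_cores and shared_cores are made up for illustration.

```python
import os
import sys
from collections import defaultdict


def allowed_cores(pid):
    """Return the set of CPU ids the given process may run on (Linux only).

    A pid of 0 means the calling process.
    """
    return set(os.sched_getaffinity(pid))


def shared_cores(core_sets):
    """Map each core id to the PIDs allowed on it, keeping only cores
    that more than one PID is allowed to use."""
    users = defaultdict(list)
    for pid, cores in core_sets.items():
        for core in cores:
            users[core].append(pid)
    return {core: pids for core, pids in users.items() if len(pids) > 1}


if __name__ == "__main__":
    # Usage (hypothetical script name): python3 check_affinity.py PID [PID ...]
    pids = [int(arg) for arg in sys.argv[1:]]
    core_sets = {pid: allowed_cores(pid) for pid in pids}
    for pid, cores in core_sets.items():
        print("pid", pid, "allowed cores", sorted(cores))
    for core, sharing in sorted(shared_cores(core_sets).items()):
        print("core", core, "shared by pids", sharing)
```

If ranks from both jobs each show a single allowed core and those cores coincide, the jobs are competing for the same cores. With MV2_ENABLE_AFFINITY=0 every rank should instead report all cores of the node, so every core will be listed as shared; that is expected, because the kernel is then free to spread the processes out. With mpirun_rsh the variable goes on the command line before the executable, e.g. "mpirun_rsh -np 4 -hostfile hosts MV2_ENABLE_AFFINITY=0 ./a.out", where the process count, hostfile name, and binary are placeholders.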

Thanks,
Krishna

Robert Soliday wrote:
> If I launch a job with mpirun_rsh, it runs without issues. But if,
> while this job is running, another user starts another job with
> mpirun_rsh and it lands on some of the same nodes, then both jobs
> pause until one of them is killed. The job that was not killed then
> finishes normally. Is there a port conflict or something that I
> should be looking for?
>
> Thanks,
> --Bob

