[mvapich-discuss] Hard coded /tmp patch for shared memory files

Adam Moody moody20 at llnl.gov
Tue Jul 29 17:16:35 EDT 2008


Hi Lei,
In practice, we found there are some disadvantages with using shared
memory segments as well.  Some codes may seg fault or be killed early
by the user, which leaves their shared memory segments orphaned.  Over
time, the cluster runs into problems with resource exhaustion.  It's
difficult to know which segments can be freed, especially on nodes that
may be running several jobs.  We encountered such problems with another
MPI implementation on a cluster that is CPU-scheduled, so that each
node may run multiple jobs at once.

We don't see this problem when using files in /tmp, since they are 
unlinked very soon after they are created (so that the OS will do the 
cleanup) and before MPI returns control to the user application from 
MPI_Init.  It may be good to keep both methods available.  I think we'd 
prefer the /tmp files here.
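
To illustrate the pattern, here is a minimal single-process sketch (not
MVAPICH's actual code); in the real MPI case the other local ranks would
open the file before it is unlinked:

/* Create a backing file in /tmp, map it, then unlink it immediately so
 * the kernel reclaims the storage once every process that mapped it has
 * exited, even if that exit is a crash. */
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

int main(void)
{
    char path[] = "/tmp/shmem-example-XXXXXX";  /* illustrative name only */
    size_t size = 1 << 20;                      /* 1 MiB region */

    int fd = mkstemp(path);                     /* create a unique file */
    if (fd < 0) { perror("mkstemp"); return 1; }
    if (ftruncate(fd, size) != 0) { perror("ftruncate"); return 1; }

    void *buf = mmap(NULL, size, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
    if (buf == MAP_FAILED) { perror("mmap"); return 1; }

    /* Unlink right away: the mapping stays valid, but no orphaned file
     * is left behind if the process is killed before it can clean up. */
    unlink(path);
    close(fd);

    strcpy(buf, "hello from the mapped region");
    printf("%s\n", (char *)buf);

    munmap(buf, size);
    return 0;
}
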
-Adam Moody
Lawrence Livermore National Laboratory


Lei Chai wrote:

> Hi John,
>
> Thanks for reporting the problem and sending the patch to us. We have
> also recognized this limitation and have come up with a solution that
> does not require an actual file path for shared memory communication
> (it uses the shmget and shmat calls instead, thanks to suggestions from
> TACC). The new solution will be available in the next mvapich2 release.
>
> Thanks again,
> Lei
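
As a rough sketch of the System V approach Lei describes (this is not
the MVAPICH2 implementation, just an illustration), the segment needs no
file path at all, and one common way to limit the orphaning mentioned
above is to mark the segment for removal once it has been attached:

/* Allocate, attach, and mark a System V shared memory segment. */
#include <stdio.h>
#include <string.h>
#include <sys/ipc.h>
#include <sys/shm.h>

int main(void)
{
    size_t size = 1 << 20;                  /* 1 MiB for illustration */

    /* IPC_PRIVATE: the kernel picks an unused key.  A real MPI library
     * would share the segment id (or a key) among the local ranks. */
    int shmid = shmget(IPC_PRIVATE, size, IPC_CREAT | 0600);
    if (shmid < 0) { perror("shmget"); return 1; }

    char *buf = shmat(shmid, NULL, 0);
    if (buf == (void *) -1) { perror("shmat"); return 1; }

    /* Mark the segment for deletion; it persists until the last detach.
     * A real library would do this only after all ranks have attached. */
    if (shmctl(shmid, IPC_RMID, NULL) != 0) { perror("shmctl"); return 1; }

    strcpy(buf, "hello from the SysV segment");
    printf("%s\n", buf);

    shmdt(buf);
    return 0;
}
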
>
>
> John Partridge wrote:
>
>> We recently had a customer issue with the shared memory file path being
>> hard-coded to /tmp. The circumstances were that the system was
>> a diskless cluster with /tmp being an in-memory file system.
>>
>> The /tmp file system was not large enough to hold the shared
>> memory files, so the customer asked if we could make mvapich use
>> an alternative path for them.
>>
>> The version the customer is using is mvapich-0.9.9-1326 (from ofed-1.3),
>> and we produced a patch that selects an alternative path via an
>> environment variable. The patch is attached in case you want to include
>> it in a future release of mvapich/mvapich2.
>>
>> Regards
>> John
>>
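
For context, an environment-variable override of the kind John describes
generally looks something like the sketch below; the variable name
MVAPICH_SHMEM_DIR and the file name are hypothetical and not necessarily
what the attached patch uses:

/* Choose the directory for the shared memory file from an environment
 * variable, falling back to the old hard-coded /tmp default. */
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>

int main(void)
{
    const char *dir = getenv("MVAPICH_SHMEM_DIR");   /* hypothetical name */
    if (dir == NULL || dir[0] == '\0')
        dir = "/tmp";                                /* previous behavior */

    char path[4096];
    snprintf(path, sizeof(path), "%s/mpi_shmem-%d.tmp", dir, (int) getpid());
    printf("shared memory file would be created at: %s\n", path);
    return 0;
}
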

