[mvapich-discuss] OMB collective memory limit

Panda, Dhabaleswar panda at cse.ohio-state.edu
Sat Sep 10 09:57:42 EDT 2016


Hi,

As you might have noticed, we released OMB 5.3.2 together with MVAPICH2 2.2 GA and MVAPICH2-X 2.2 GA yesterday. The latest OMB 5.3.2 has the support you requested. Please try this latest version and let us know if you see any issues.

Thanks,

DK
________________________________
From: mvapich-discuss-bounces at cse.ohio-state.edu on behalf of Subramoni, Hari
Sent: Monday, September 05, 2016 11:38 PM
To: Andreas Kempf
Cc: mvapich-discuss at cse.ohio-state.edu
Subject: Re: [mvapich-discuss] OMB collective memory limit


Dear Dr. Kempf,

Thanks for reporting the issue to us. We will make appropriate changes. The fix should be available in an upcoming release of OMB.

Best Regards,
Hari.

On Sep 5, 2016 11:31 PM, "Andreas Kempf" <Andreas.Kempf at emea.nec.com> wrote:
Dear MVAPICH team,

While running large OSU benchmarks (5.3.1 and 5.3.0) using the "collective" test cases, I noticed that the -M parameter for setting the "per process maximum memory consumption" does not accept values larger than 2 GiB:

$ mpirun -np 2 ./osu_alltoall -M 33221225472
Requested memory limit too low, using [1048576] instead.Requested memory limit too low, using [1048576] instead.

Apparently, while most of the code treats large values properly, the routine set_max_memlimit in osu_coll.c accepts its parameter as a regular integer:

osu_coll.c line 348
set_max_memlimit(atoll(optarg));

osu_coll.c lines 227-239
static int
set_max_memlimit (int value)
{
    options.max_mem_limit = value;

    if (value < MAX_MEM_LOWER_LIMIT) {
        options.max_mem_limit = MAX_MEM_LOWER_LIMIT;
        fprintf(stderr,"Requested memory limit too low, using [%d] instead.",
                MAX_MEM_LOWER_LIMIT);
    }

    return 0;
}

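So optarg is properly parsed as a 64-bit value by atoll but then passed to set_max_memlimit as a regular int. After this narrowing conversion, a large value no longer fits and can wrap to a negative or much smaller number, so the check against MAX_MEM_LOWER_LIMIT triggers and the lower limit is used instead of the requested value. Declaring the routine as "set_max_memlimit (uint64_t value)" seems to allow for larger memory allocations.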
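
A minimal sketch of the truncation described above, not the actual OMB code. It assumes a platform with a 32-bit int and wrap-around narrowing (the result of an out-of-range conversion to int is implementation-defined in C). The helper names limit_via_int, limit_via_uint64 and self_check are hypothetical, and the value of MAX_MEM_LOWER_LIMIT is taken from the "[1048576]" in the error output.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Assumed value, inferred from "[1048576]" in the error message. */
#define MAX_MEM_LOWER_LIMIT 1048576

/* Mirrors the reported behaviour: the 64-bit request is narrowed to int
 * before the lower-limit check. The narrowing is implementation-defined;
 * with a 32-bit int it typically wraps modulo 2^32. */
uint64_t limit_via_int(long long requested)
{
    int value = (int)requested;          /* truncation happens here */

    if (value < MAX_MEM_LOWER_LIMIT)
        return MAX_MEM_LOWER_LIMIT;
    return (uint64_t)value;
}

/* Hypothetical 64-bit variant: the comparison is done on the full value,
 * so large requests survive; negative input is still clamped. */
uint64_t limit_via_uint64(long long requested)
{
    if (requested < MAX_MEM_LOWER_LIMIT)
        return MAX_MEM_LOWER_LIMIT;
    return (uint64_t)requested;
}

/* Exercises both helpers with the value from the command line above. */
void self_check(void)
{
    long long requested = atoll("33221225472");

    /* 33221225472 mod 2^32 = 3156454400, which wraps to a negative int,
     * so the narrowed version falls back to the lower limit. */
    assert(limit_via_int(requested) == MAX_MEM_LOWER_LIMIT);

    /* The 64-bit version keeps the requested value. */
    assert(limit_via_uint64(requested) == 33221225472ULL);
}
```

Calling self_check() from a test program should pass on common 64-bit Linux toolchains. A signed 64-bit parameter is used in the second helper only so that negative input can still be rejected; the uint64_t signature suggested above would work as well if the caller validates the sign first.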

Please fix the collective test cases to allow for larger memory allocations. On several thousand processors, 2 GiB can become a real limitation.

Best regards,

Andreas Kempf

----------------
Dr. Andreas Kempf
Benchmarking Analyst at NEC Deutschland GmbH
Tel: +49 211 5369 310
----
NEC Deutschland GmbH, Prinzenallee 11, 40549 Düsseldorf
Geschäftsführer Yuichi Kojima
Handelsregister Düsseldorf HRB 57941; VAT ID DE129424743


