[mvapich-discuss] Oversubscription support
Maksym Planeta
mplaneta at os.inf.tu-dresden.de
Mon Oct 26 10:47:04 EDT 2015
On 10/26/2015 03:45 PM, Jonathan Perkins wrote:
> Sorry, I meant to ask if you were setting MV2_USE_BLOCKING to 1.
>
No problem, I understood.
Anyway, the error is not related to blocking. The error is:
[54] Error parsing CPU mapping string
[54] INTERNAL ERROR: invalid error code ffffffff (Ring Index out of range) in MPIDI_CH3I_set_affinity:119
[54] [cli_54]: aborting job:
[54] Fatal error in MPI_Init:
[54] Other MPI error, error stack:
[54] MPIR_Init_thread(514):
[54] MPID_Init(359).......: channel initialization failed
[54] MPIDI_CH3_Init(469)..:
[54]
This happens because mv2_get_assigned_cpu_core returns -1 for ranks
whose local_id is larger than the number of cores.
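
To illustrate what I mean, here is a minimal sketch. It is not the actual
MVAPICH code; the function names strict_core_for_rank and
wrapped_core_for_rank are hypothetical, and I assume zero-based local ids,
so the strict check is written as local_id >= num_cores. The first function
mimics the behaviour I observe (ranks beyond the core count get -1), the
second shows one possible way to allow oversubscription by wrapping around
the available cores.

```c
#include <assert.h>

/* Hypothetical strict mapping: one rank per core at most.
 * Mimics the observed behaviour, where a local rank beyond the
 * number of cores gets -1 and affinity setup then fails. */
int strict_core_for_rank(int local_id, int num_cores)
{
    assert(num_cores > 0);
    if (local_id < 0 || local_id >= num_cores)
        return -1;              /* no core left for this rank */
    return local_id;
}

/* Hypothetical oversubscription-friendly mapping: local ranks
 * wrap around the cores, so several ranks may share one core. */
int wrapped_core_for_rank(int local_id, int num_cores)
{
    assert(num_cores > 0);
    if (local_id < 0)
        return -1;              /* still reject invalid ids */
    return local_id % num_cores;
}
```

With 4 cores, the strict variant gives local rank 4 the value -1, while the
wrapped variant places it on core 0 again. Whether wrapping is the right
policy for the library is of course a separate question.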
> On Mon, Oct 26, 2015 at 10:41 AM Jonathan Perkins
> <perkinjo at cse.ohio-state.edu> wrote:
>
> When you're running with oversubscription, were you
> setting MV2_USE_BLOCKING to 0? If so, what type of errors were you
> hitting?
>
> On Mon, Oct 26, 2015 at 10:34 AM Maksym Planeta
> <mplaneta at os.inf.tu-dresden.de> wrote:
>
> Hi,
>
> I'm interested in using the MVAPICH library with oversubscription,
> i.e. with more than one rank per core. In version 2.1,
> oversubscription worked up to a certain limit, and beyond that the
> library simply broke because of bugs.
>
> So I updated to 2.2a and found that the new version contains
> additional checks (for example in the function
> mv2_get_assigned_cpu_core) that basically forbid having more than
> one rank per core.
>
> Could you tell me the reason for that? Have you ever tried
> running MVAPICH with oversubscription? And would you at least
> consider patches for oversubscription support?
>
> --
> Regards,
> Maksym Planeta
>
--
Regards,
Maksym Planeta