[mvapich-discuss] MVAPICH2-PSM 1.4 InfiniPath context sharing problems, including patch

Dhabaleswar Panda panda at cse.ohio-state.edu
Mon Feb 1 15:54:54 EST 2010


Dear Avneesh and Ben,

Thanks for your prompt replies. Yes, it looks like the version numbers are
reported inconsistently in different places, and this led to the confusion.
We actually have QLogic OFED+ version 5.1.0.0.49 installed on our system,
and MVAPICH2 1.4 has been tested with it. However, we tested only a single
MPI job per node, so we did not encounter the issue reported in this e-mail
thread. We will carry out testing with multiple jobs per node to reproduce
the problem, and then incorporate the patch and any necessary fixes to
resolve the issue.

Thanks,

DK

On Mon, 1 Feb 2010, Ben Truscott wrote:

> Dear Prof. Panda
>
> Many thanks for your reply. It seems that the version number reported in
> various places can unfortunately be somewhat inconsistent, but what I have
> referred to below as version 2.8 (due to the output of ipath_control;
> possibly this refers to the HCA's driver rather than the overall software
> package) is referred to on QLogic's website as version 5.1.0.0.49, which
> contains InfiniPath 2.3 and OFED+ 1.4.2, and supplies a version of PSM
> having PSM_VERNO == 0x10c. This package is currently available for
> download from <http://support.qlogic.com/support/drivers_software.aspx>,
> although it is an extremely large distribution (nearly 570MB) and has
> appeared only quite recently.
>
> Before the update to the new version (whatever its proper version number
> may be), I did experience periodic failures to acquire contexts. Since
> these were not consistent, I did not at the time attribute them to the
> behaviour of MVAPICH2; in retrospect, however, that may indeed have been
> the root cause, since it is clear from inspection of psm_entry.c that
> MPI_LOCALRANKS and MPI_LOCALRANKID were not being set even though the PSM
> library expects them. I certainly noticed that the value of PSM_DEVICES I
> had passed at MPI launch was not being honoured: our login nodes happen to
> run the InfiniPath software without any HCAs installed, and if one
> requests a PSM_DEVICES setting that includes the ipath device in this
> situation, the program simply exits with an error rather than printing,
> for example, the usual usage message.
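>
> To make the first point concrete, a minimal sketch of the kind of thing
> psm_entry.c would need to do -- purely illustrative, not the contents of
> the patch, and with local_rank/local_size standing in for whatever the
> process manager actually reports -- might look like this:
>
>     #include <stdio.h>
>     #include <stdlib.h>
>
>     /* Export the node-local rank information that the PSM library expects
>      * to find in the environment before psm_init() is called, so that it
>      * can arrange context sharing among the ranks on this node. */
>     static void export_local_rank_info(int local_rank, int local_size)
>     {
>         char buf[16];
>
>         snprintf(buf, sizeof(buf), "%d", local_rank);
>         setenv("MPI_LOCALRANKID", buf, 1);  /* this rank's index on the node */
>
>         snprintf(buf, sizeof(buf), "%d", local_size);
>         setenv("MPI_LOCALRANKS", buf, 1);   /* number of ranks on the node */
>     }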
>
> Best regards
>
> Ben Truscott
>
> On Mon, February 1, 2010 7:48 pm, Dhabaleswar Panda wrote:
> > Hi Ben,
> >
> > Thanks for your note. To the best of our knowledge, InfiniPath 2.8 is
> > not publicly available, so MVAPICH2 1.4 has not yet been tested with it.
> > It has been tested with InfiniPath version 2.2. Do you see any problem
> > with MVAPICH2 1.4 and InfiniPath version 2.2? Once we have access to
> > InfiniPath 2.8, we will carry out tests with upcoming versions of
> > MVAPICH2. Thanks for sending us the design guidelines for InfiniPath 2.8
> > and the patch. We will review them and incorporate them into the next
> > MVAPICH2 release as appropriate.
> >
> > Thanks,
> >
> > DK
> >
> > On Mon, 1 Feb 2010, Ben Truscott wrote:
> >
> >> Dear all
> >>
> >> I am using MVAPICH2 1.4 built for the PSM device on a cluster equipped
> >> with QLogic InfiniPath QLE7140 InfiniBand HCAs. After a recent update
> >> of our InfiniPath software from version 2.2 to the recently released
> >> version 2.8 (the next major version after 2.2, also known as QLogic
> >> OFED+ 1.4), I began to notice consistent job failures caused by an
> >> inability to acquire the proper number of InfiniPath contexts whenever
> >> two or more MPI jobs were scheduled on the same node at the same time.
> >>
> >> Using the PSM environment variable PSM_VERBOSE_ENV -- a new addition
> >> in version 2.8 (PSM_TRACEMASK having disappeared) that prints the
> >> effective and default values of all variables affecting the operation
> >> of PSM -- I was able to determine that the failures were due to the
> >> effective value of PSM_SHAREDCONTEXTS_MAX being set to 16 regardless of
> >> the value I had passed to the job. The QLE7140 has four hardware
> >> contexts, each of which can be shared four ways within a single MPI
> >> job; however, because PSM's behaviour changed from eager sharing to
> >> greedy context acquisition in the latest version, forcing
> >> PSM_SHAREDCONTEXTS_MAX=16 (default value: 4) caused the first job to
> >> start on each node to acquire one context per process without employing
> >> context sharing, leaving insufficient contexts available for subsequent
> >> jobs.
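> >>
> >> The resource accounting is easy to see in a toy form. The following is
> >> only an illustration (it is not PSM or MVAPICH2 code; the constants are
> >> simply the QLE7140 figures quoted above):
> >>
> >>     #include <stdio.h>
> >>
> >>     #define HW_CONTEXTS  4   /* hardware contexts on a QLE7140 */
> >>     #define SHARE_WAYS   4   /* processes that can share one context */
> >>
> >>     int main(void)
> >>     {
> >>         /* A 4-process job started first on the node; under greedy
> >>          * acquisition each rank takes a whole hardware context. */
> >>         int first_job_ranks = 4;
> >>         int used = first_job_ranks < HW_CONTEXTS ? first_job_ranks
> >>                                                   : HW_CONTEXTS;
> >>         int left = HW_CONTEXTS - used;
> >>
> >>         printf("contexts left for later jobs: %d "
> >>                "(at most %d further ranks even with sharing)\n",
> >>                left, left * SHARE_WAYS);
> >>         return 0;
> >>     }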
> >>
> >> Since I had experienced no problems with the version of PSM supplied
> >> with the InfiniPath 2.2 distribution, I initially suspected a bug in
> >> PSM itself and contacted QLogic, but they were unable to reproduce the
> >> problem. After verifying correct behaviour under Open MPI, I was
> >> persuaded that the problem must be specific to MVAPICH2 and so examined
> >> the file psm_entry.c, which I found to contain a number of logic
> >> errors, including hard-coded resetting of the PSM environment to values
> >> that are, in general, likely to give rise to problems of the sort I
> >> encountered. I therefore submit the attached (commented) patch for your
> >> consideration, with a view to its possible inclusion in the next
> >> version of MVAPICH2. Although I hope its original author will not take
> >> offence at my saying so, I should also note that this file looks as if
> >> it was written very hastily and generates more compiler warnings (using
> >> Intel C 11.1) than the rest of the distribution combined. While the
> >> patch is intended solely to correct the erroneous context-sharing
> >> behaviour and, admittedly, does not introduce any additional warnings,
> >> it may be worthwhile to revisit psm_entry.c with a view to rewriting it
> >> for a future release.
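> >>
> >> The underlying idea is simply that MVAPICH2 should supply PSM defaults
> >> without overriding values already present in the environment. A minimal
> >> sketch of that idea (illustrative only, not the patch itself) is:
> >>
> >>     #include <stdlib.h>
> >>
> >>     /* Supply a default for a PSM environment variable without
> >>      * clobbering a value the user (or the batch system) has already
> >>      * set: setenv() with overwrite == 0 leaves existing values alone. */
> >>     static void psm_default_env(const char *name, const char *value)
> >>     {
> >>         setenv(name, value, 0);
> >>     }
> >>
> >>     /* For example, let PSM's documented default of 4 stand unless the
> >>      * user has asked for something else:
> >>      *     psm_default_env("PSM_SHAREDCONTEXTS_MAX", "4");            */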
> >>
> >> Best regards,
> >>
> >> Yours
> >>
> >> Ben Truscott
> >> School of Chemistry
> >> University of Bristol (UK)
>
>


