[mvapich-discuss] Help with polled desc error

wei huang huanwei at cse.ohio-state.edu
Thu Jan 31 13:05:06 EST 2008


Hi Scott,

We went up to 256 processes (32 nodes) and did not see the problem in a few
hundred runs (cpi). Thus, to narrow down the problem, we want to make sure
the fabric and system setup are OK. To diagnose this, we suggest you run
the mpiGraph program from http://sourceforge.net/projects/mpigraph.
This test stresses the interconnects. If there is a problem with your system
setup, it should fail at a much higher frequency than the simple cpi program.
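
To compare failure frequencies between cpi and a stress test, it can help to
repeat the same launch command many times and count the failed runs. Below is
a minimal sketch in Python, not a tested recipe for this cluster. It assumes
that a failed run shows up as a non-zero exit status of the launch command,
and the helper name count_failures is made up for illustration.

```python
import subprocess


def count_failures(cmd, runs):
    """Run `cmd` (a list of arguments) `runs` times.

    Returns how many runs ended with a non-zero exit status.
    """
    failures = 0
    for _ in range(runs):
        # Assumption: the launcher reports a failed job through a
        # non-zero exit status. Output is discarded to keep the log short.
        result = subprocess.run(
            cmd,
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
        )
        if result.returncode != 0:
            failures += 1
    return failures
```

Pass your usual launch command as a list of arguments, once for cpi and once
for the stress test, with the same number of runs, and compare the two counts.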

Thanks.

Regards,
Wei Huang

774 Dreese Lab, 2015 Neil Ave,
Dept. of Computer Science and Engineering
Ohio State University
OH 43210
Tel: (614)292-8501


On Wed, 30 Jan 2008, Scott A. Friedman wrote:

> My co-worker passed this along...
>
> Yes, the error happens on the cpi.c program too.  It happened 2 times
> among the 9 cases I ran.
>
> I was using 128 processes (on 32 4-core nodes).
>
> ---
>
> and another...
>
>    It happens for a simple MPI program which just does MPI_Init and
> MPI_Finalize and prints out the number of processors.  It happened for
> anything from 4 nodes (16 processors) and up.
>
> What environment variables should we look for?
>
> Thanks,
> Scott
>
> wei huang wrote:
> > Hi Scott,
> >
> > On how many processes (and how many nodes) did you run your program? Do
> > you have any environment variables set when you are running the program?
> > Does the error happen on a simple test like cpi?
> >
> > Thanks.
> >
> > Regards,
> > Wei Huang
> >
> > 774 Dreese Lab, 2015 Neil Ave,
> > Dept. of Computer Science and Engineering
> > Ohio State University
> > OH 43210
> > Tel: (614)292-8501
> >
> >
> > On Wed, 30 Jan 2008, Scott A. Friedman wrote:
> >
> >> The low level ibv tests work fine.
> >
> > _______________________________________________
> > mvapich-discuss mailing list
> > mvapich-discuss at cse.ohio-state.edu
> > http://mail.cse.ohio-state.edu/mailman/listinfo/mvapich-discuss
>



