[mvapich-discuss] mvapich2 errors with openfoam

Purkayastha, Avi Avi.Purkayastha at nrel.gov
Wed Nov 20 00:00:30 EST 2013


Hi Sreeram,
Here are the two situations I tried with OpenFOAM built with the shared 2.0a version of the MVAPICH2 library:

1) By default (without any environment variables set), it ran with 512 procs but crashed with 1024 procs.
2) With "MV2_USE_LAZY_MEM_UNREGISTER=0" set, as you suggested.

In both cases, the error message is:
WARNING: Error in initializing MVAPICH2 ptmalloc library.Continuing without InfiniBand registration cache support.
[0]
[0]
[0] --> FOAM FATAL IO ERROR:
[0] error in IOstream "IOstream" for operation operator>>(Istream&, List<T>&) : reading first token
[0]
[0] file: IOstream at line 0.
[0]
[0]     From function IOstream::fatalCheck(const char*) const
[0]     in file db/IOstreams/IOstreams/IOstream.C at line 114.
[0]
FOAM parallel run exiting
[0]
[cli_0]: aborting job:
application called MPI_Abort(MPI_COMM_WORLD, 1) - process 0
[512]
[512]
[512] --> FOAM FATAL IO ERROR:
[512] error in IOstream "IOstream" for operation operator>>(Istream&, List<T>&) : reading first token
[512]
[512] file: IOstream at line 0.
[512]
[512]     From function IOstream::fatalCheck(const char*) const
[512]     in file db/IOstreams/IOstreams/IOstream.C at line 114.
[512]
FOAM parallel run exiting
[512]
[cli_512]: aborting job:
application called MPI_Abort(MPI_COMM_WORLD, 1) - process 512
[proxy:0:1 at n0746] HYD_pmcd_pmip_control_cmd_cb (./pm/pmiserv/pmip_cb.c:913): assert (!closed) failed
[proxy:0:1 at n0746] HYDT_dmxu_poll_wait_for_event (./tools/demux/demux_poll.c:77): callback returned error status
:
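
To see at a glance which ranks reported the fatal error in a long log like the one above, the rank-prefixed "--> FOAM FATAL" lines can be scanned. This is only a minimal sketch in Python, assuming the "[rank] --> FOAM FATAL ..." line format shown above; the function name failing_ranks is purely illustrative and not part of OpenFOAM or MVAPICH2.

```python
import re

# Matches lines such as "[512] --> FOAM FATAL IO ERROR:" and captures the rank.
# Assumption: the rank prefix format is exactly the one seen in the log above.
FATAL_LINE = re.compile(r"^\[(\d+)\]\s*-->\s*FOAM FATAL")


def failing_ranks(log_text):
    """Return the sorted, de-duplicated ranks that printed a FOAM FATAL line."""
    ranks = set()
    for line in log_text.splitlines():
        match = FATAL_LINE.match(line.strip())
        if match:
            ranks.add(int(match.group(1)))
    return sorted(ranks)


if __name__ == "__main__":
    # Short excerpt of the log above, used as sample input.
    sample = (
        "[0] --> FOAM FATAL IO ERROR:\n"
        "[0] file: IOstream at line 0.\n"
        "[512] --> FOAM FATAL IO ERROR:\n"
        "[512] file: IOstream at line 0.\n"
    )
    print(failing_ranks(sample))  # prints [0, 512]
```

On the excerpt above this reports ranks 0 and 512, the two ranks that appear in the error output.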

The OpenFOAM output was:

:
n0913.15256
n0913.15257
n0913.15258
n0913.15259
)

Pstream initialized with:
    floatTransfer     : 0
    nProcsSimpleSum   : 0
    commsType         : nonBlocking
fileModificationChecking : Monitoring run-time modified files using timeStampMaster
allowSystemOperations : Disallowing user-supplied system call operations

// * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * //
Create time

Create mesh for time = 475


===================================================================================
=   BAD TERMINATION OF ONE OF YOUR APPLICATION PROCESSES
=   EXIT CODE: 1
=   CLEANING UP REMAINING PROCESSES
=   YOU CAN IGNORE THE BELOW CLEANUP MESSAGES
===================================================================================

Let me know if you need any other information. Please suggest what I should try next.

Many thanks for your help.

-- Avi




