[mvapich-discuss] Fwd: fortran system calls

David Stuebe dstuebe at umassd.edu
Thu Sep 18 21:15:25 EDT 2008


Hello MVAPICH

I am helping set up a new cluster and I have run into a problem
using mvapich to compile and run a Fortran90 code which uses system
calls. The program compiles, but will not run on more than one node,
even though only one processor makes the system call. Very strange!

All is well when run on only one node of the cluster.

Running:
mvapich2 1.0.2
Intel 10.1 compiler
OFED 1.2.5.3
Linux X86_64 2.6.9-67.0.7.ELsmp

Cluster built by Aspen Systems - dual-processor, quad-core hardware.

Has anyone seen anything similar? I am not sure it is worth trying to
fix, but if posting it saves someone else some time, I will feel
warm and fuzzy inside...

!==================================================
program mpi_test
  USE MPI
  implicit none

   INTEGER:: MYID,NPROCS, IERR

   WRITE(6,*)"START TEST"
   CALL MPI_INIT(IERR)
   WRITE(6,*)"MPI_INIT: MPI_COMM_WORLD,IERR",MPI_COMM_WORLD,IERR

   CALL MPI_COMM_RANK(MPI_COMM_WORLD,MYID,IERR)
   WRITE(6,*)"MPI_COMM_RANK: MYID,IERR",MYID,IERR
   CALL MPI_COMM_SIZE(MPI_COMM_WORLD,NPROCS,IERR)
   WRITE(6,*)"MPI_COMM_RANK: NPROCS,IERR",NPROCS,IERR

   CALL MPI_BARRIER(MPI_COMM_WORLD,IERR)

   WRITE(6,*) "CALLED BARRIER: myid",myid,IERR


   IF(MYID==0) THEN

      CALL SYSTEM( "uptime > up_out" )
      WRITE(6,*) "CALLED SYSTEM: myid",myid
   END IF

   CALL MPI_BARRIER(MPI_COMM_WORLD,IERR)

   WRITE(6,*) "CALLED BARRIER: myid",myid,IERR



   CALL MPI_FINALIZE(IERR)


end program mpi_test
!==================================================

RESULT FROM RUN: mpiexec -n 2 ./mpit

 START TEST
 START TEST
 MPI_INIT: MPI_COMM_WORLD,IERR  1140850688           0
 MPI_COMM_RANK: MYID,IERR           0           0
 MPI_COMM_RANK: NPROCS,IERR           2           0
 MPI_INIT: MPI_COMM_WORLD,IERR  1140850688           0
 MPI_COMM_RANK: MYID,IERR           1           0
 MPI_COMM_RANK: NPROCS,IERR           2           0
 CALLED BARRIER: myid           1           0
 CALLED BARRIER: myid           0           0
 CALLED SYSTEM: myid           0
 CALLED BARRIER: myid           0           0
send desc error
[0] Abort: [] Got completion with error 4, vendor code=52, dest rank=1
 at line 513 in file ibv_channel_manager.c
rank 0 in job 50  cpr_52824   caused collective abort of all ranks
  exit status of rank 0: killed by signal 9
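
One way to narrow this down might be to do the same job without
spawning a child process, since CALL SYSTEM hands the command to a
shell in a new process. The sketch below is only a guess at a
diagnostic, not a confirmed fix: it assumes the child process is what
upsets the InfiniBand layer, and it assumes a Linux /proc filesystem.
The subroutine name write_uptime_nofork and the unit numbers are made
up for illustration. It writes the uptime in seconds rather than the
formatted output of the uptime command.

```fortran
program uptime_nofork_test
  implicit none

  ! Stand-alone driver; no MPI needed to try the subroutine itself.
  call write_uptime_nofork("up_out")
  write(6,*) "WROTE up_out WITHOUT CALLING SYSTEM"

contains

  ! Hypothetical helper: copy the uptime from /proc/uptime into a file
  ! using only Fortran I/O, so no child process is created.
  subroutine write_uptime_nofork(outfile)
    character(len=*), intent(in) :: outfile
    double precision :: up_seconds, idle_seconds
    integer :: ios

    ! /proc/uptime holds two numbers: uptime and idle time, in seconds.
    open(unit=10, file="/proc/uptime", status="old", action="read", iostat=ios)
    if (ios /= 0) then
       write(6,*) "COULD NOT OPEN /proc/uptime, IOSTAT=", ios
       return
    end if
    read(10, *, iostat=ios) up_seconds, idle_seconds
    close(10)
    if (ios /= 0) then
       write(6,*) "COULD NOT READ /proc/uptime, IOSTAT=", ios
       return
    end if

    ! Write the result where the original test wrote the uptime output.
    open(unit=11, file=outfile, status="replace", action="write", iostat=ios)
    if (ios /= 0) then
       write(6,*) "COULD NOT OPEN OUTPUT FILE, IOSTAT=", ios
       return
    end if
    write(11,*) "uptime (seconds):", up_seconds
    close(11)
  end subroutine write_uptime_nofork

end program uptime_nofork_test
```

In the test program above, the CALL SYSTEM line inside the
IF(MYID==0) block could be swapped for a call to this subroutine. If
the two-node run then completes, that would point at the SYSTEM call
rather than at the barriers.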


Thanks so much

David

