[mvapich-discuss] error when extending jobs on 37 nodes.

teng ma xiaok1981 at gmail.com
Wed Nov 2 22:23:48 EDT 2011


I am using MVAPICH2 1.7, configured as follows:

$ ./configure --prefix /home/tma/opt/mvapich217-limic2 --with-limic2 \
    LDFLAGS=-Wl,-rpath=/usr/local/lib

LiMIC2 version: 0.5.5

Hardware: 20 Gb/s InfiniBand, 24 cores per node. Benchmark: IMB.

I bind one process to each core with this command:

mpirun_rsh -np 888 -hostfile ~/rankfile MV2_CPU_BINDING_POLICY=bunch \
    ./IMB-MPI1 Bcast -npmin 888

There are no problems when the test runs on fewer than 35 nodes. With more
than 35 nodes it sometimes works and sometimes fails with the errors shown
below. Once it reaches 37 nodes (888 processes), it fails every time.
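
For reference, the process count follows directly from the node size: 888
processes at 24 cores per node is exactly 37 nodes. The sketch below works
through that arithmetic and also maps a rank to a node index. The mapping
assumes ranks are placed in consecutive blocks of 24 per node in hostfile
order, which is an assumption on my part, although it agrees with the log
(rank 388 is reported by mpispawn_16). The function names are only
illustrative.

```python
CORES_PER_NODE = 24  # from the post: 24 cores per node


def nodes_needed(nprocs, cores_per_node=CORES_PER_NODE):
    """Number of nodes required to host nprocs, one process per core."""
    # Ceiling division without floating point.
    return -(-nprocs // cores_per_node)


def node_index(rank, cores_per_node=CORES_PER_NODE):
    """Zero-based node index of a rank, ASSUMING block placement
    (ranks 0-23 on node 0, ranks 24-47 on node 1, and so on)."""
    return rank // cores_per_node


print(nodes_needed(888))  # 37
print(node_index(388))    # 16
```

Under the same assumption, rank 725 maps to node index 30, which matches
the mpispawn_30 entries in the log.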

Thanks for your help,
Teng

[parapluie-19.rennes.grid5000.fr:mpi_rank_388][error_sighandler] Caught
error: Bus error (signal 7)
[parapluie-19.rennes.grid5000.fr:mpi_rank_392][error_sighandler] Caught
error: Bus error (signal 7)
[parapluie-19.rennes.grid5000.fr:mpi_rank_403][error_sighandler] Caught
error: Bus error (signal 7)
[parapluie-19.rennes.grid5000.fr:mpi_rank_386][error_sighandler] Caught
error: Bus error (signal 7)
[parapluie-19.rennes.grid5000.fr:mpi_rank_385][error_sighandler] Caught
error: Bus error (signal 7)
[parapluie-19.rennes.grid5000.fr:mpi_rank_397][error_sighandler] Caught
error: Bus error (signal 7)
[parapluie-19.rennes.grid5000.fr:mpi_rank_396][error_sighandler] Caught
error: Bus error (signal 7)
[parapluie-19.rennes.grid5000.fr:mpi_rank_407][error_sighandler] Caught
error: Bus error (signal 7)
[parapluie-19.rennes.grid5000.fr:mpi_rank_404][error_sighandler] Caught
error: Bus error (signal 7)
[parapluie-19.rennes.grid5000.fr:mpi_rank_393][error_sighandler] Caught
error: Bus error (signal 7)
[parapluie-19.rennes.grid5000.fr:mpi_rank_395][error_sighandler] Caught
error: Bus error (signal 7)
[parapluie-19.rennes.grid5000.fr:mpi_rank_391][error_sighandler] Caught
error: Bus error (signal 7)
[parapluie-19.rennes.grid5000.fr:mpi_rank_394][error_sighandler] Caught
error: Bus error (signal 7)
[parapluie-19.rennes.grid5000.fr:mpi_rank_401][error_sighandler] Caught
error: Bus error (signal 7)
[parapluie-19.rennes.grid5000.fr:mpi_rank_399][error_sighandler] Caught
error: Bus error (signal 7)
[parapluie-19.rennes.grid5000.fr:mpi_rank_390][error_sighandler] Caught
error: Bus error (signal 7)
[parapluie-19.rennes.grid5000.fr:mpi_rank_389][error_sighandler] Caught
error: Bus error (signal 7)
[parapluie-19.rennes.grid5000.fr:mpispawn_16][readline] Unexpected
End-Of-File on file descriptor 8. MPI process died?
[parapluie-19.rennes.grid5000.fr:mpispawn_16][mtpmi_processops] Error while
reading PMI socket. MPI process died?
[parapluie-19.rennes.grid5000.fr:mpispawn_16][child_handler] MPI process
(rank: 392, pid: 16153) terminated with signal 7 -> abort job
[parapluie-2.rennes.grid5000.fr:mpirun_rsh][process_mpispawn_connection]
mpispawn_16 from node parapluie-26.rennes.grid5000.fr aborted: MPI process
error (1)
[parapluie-34.rennes.grid5000.fr:mpi_rank_723][error_sighandler] Caught
error: Bus error (signal 7)
[parapluie-34.rennes.grid5000.fr:mpi_rank_725][error_sighandler] Caught
error: Bus error (signal 7)
[parapluie-34.rennes.grid5000.fr:mpispawn_30][child_handler] MPI process
(rank: 725, pid: 15984) terminated with signal 7 -> abort job
[parapluie-34.rennes.grid5000.fr:mpispawn_30][readline] Unexpected
End-Of-File on file descriptor 7. MPI process died?
[parapluie-34.rennes.grid5000.fr:mpispawn_30][mtpmi_processops] Error while
reading PMI socket. MPI process died?
[parapluie-9.rennes.grid5000.fr:mpispawn_7][read_size] Unexpected
End-Of-File on file descriptor 31. MPI process died?
[parapluie-9.rennes.grid5000.fr:mpispawn_7][handle_mt_peer] Error while
reading PMI socket. MPI process died?
[parapluie-9.rennes.grid5000.fr:mpispawn_7][child_handler] MPI process
(rank: 182, pid: 15057) terminated with signal 2 -> abort job
[parapluie-5.rennes.grid5000.fr:mpispawn_3][read_size] Unexpected
End-Of-File on file descriptor 33. MPI process died?
[parapluie-5.rennes.grid5000.fr:mpispawn_3][handle_mt_peer] Error while
reading PMI socket. MPI process died?
[parapluie-33.rennes.grid5000.fr:mpispawn_29][read_size] Unexpected
End-Of-File on file descriptor 29. MPI process died?
[parapluie-33.rennes.grid5000.fr:mpispawn_29][handle_mt_peer] Error while
reading PMI socket. MPI process died?
[parapluie-2.rennes.grid5000.fr:mpispawn_1][read_size] Unexpected
End-Of-File on file descriptor 34. MPI process died?
[parapluie-2.rennes.grid5000.fr:mpispawn_1][handle_mt_peer] Error while
reading PMI socket. MPI process died?
[parapluie-35.rennes.grid5000.fr:mpispawn_31][read_size] Unexpected
End-Of-File on file descriptor 29. MPI process died?
[parapluie-35.rennes.grid5000.fr:mpispawn_31][handle_mt_peer] Error while
reading PMI socket. MPI process died?
[parapluie-36.rennes.grid5000.fr:mpispawn_32][read_size] Unexpected
End-Of-File on file descriptor 29. MPI process died?
[parapluie-36.rennes.grid5000.fr:mpispawn_32][handle_mt_peer] Error while
reading PMI socket. MPI process died?
[parapluie-14.rennes.grid5000.fr:mpi_rank_265][error_sighandler] Caught
error: Bus error (signal 7)
[parapluie-14.rennes.grid5000.fr:mpi_rank_285][error_sighandler] Caught
error: Bus error (signal 7)
[parapluie-14.rennes.grid5000.fr:mpi_rank_282][error_sighandler] Caught
error: Bus error (signal 7)
[parapluie-14.rennes.grid5000.fr:mpi_rank_268][error_sighandler] Caught
error: Bus error (signal 7)
[parapluie-14.rennes.grid5000.fr:mpi_rank_281][error_sighandler] Caught
error: Bus error (signal 7)
[parapluie-14.rennes.grid5000.fr:mpispawn_11][readline] Unexpected
End-Of-File on file descriptor 21. MPI process died?
[parapluie-14.rennes.grid5000.fr:mpispawn_11][mtpmi_processops] Error while
reading PMI socket. MPI process died?
[parapluie-14.rennes.grid5000.fr:mpispawn_11][child_handler] MPI process
(rank: 282, pid: 16217) terminated with signal 7 -> abort job
[parapluie-15.rennes.grid5000.fr:mpispawn_12][read_size] Unexpected
End-Of-File on file descriptor 29. MPI process died?
[parapluie-15.rennes.grid5000.fr:mpispawn_12][handle_mt_peer] Error while
reading PMI socket. MPI process died?
[parapluie-13.rennes.grid5000.fr:mpispawn_10][read_size] Unexpected
End-Of-File on file descriptor 29. MPI process died?
[parapluie-13.rennes.grid5000.fr:mpispawn_10][handle_mt_peer] Error while
reading PMI socket. MPI process died?
[parapluie-15.rennes.grid5000.fr:mpispawn_12][child_handler] MPI process
(rank: 306, pid: 16126) terminated with signal 2 -> abort job
[parapluie-13.rennes.grid5000.fr:mpispawn_10][child_handler] MPI process
(rank: 258, pid: 16181) terminated with signal 2 -> abort job
[parapluie-12.rennes.grid5000.fr:mpispawn_9][read_size] Unexpected
End-Of-File on file descriptor 29. MPI process died?
[parapluie-12.rennes.grid5000.fr:mpispawn_9][handle_mt_peer] Error while
reading PMI socket. MPI process died?
[parapluie-2.rennes.grid5000.fr:mpispawn_1][child_handler] MPI process
(rank: 42, pid: 20462) terminated with signal 2 -> abort job
[parapluie-36.rennes.grid5000.fr:mpispawn_32][child_handler] MPI process
(rank: 776, pid: 14656) terminated with signal 2 -> abort job
[parapluie-6.rennes.grid5000.fr:mpispawn_4][read_size] Unexpected
End-Of-File on file descriptor 32. MPI process died?
[parapluie-6.rennes.grid5000.fr:mpispawn_4][handle_mt_peer] Error while
reading PMI socket. MPI process died?
[parapluie-22.rennes.grid5000.fr:mpispawn_19][readline] Unexpected
End-Of-File on file descriptor 19. MPI process died?
[parapluie-22.rennes.grid5000.fr:mpispawn_19][mtpmi_processops] Error while
reading PMI socket. MPI process died?
[parapluie-8.rennes.grid5000.fr:mpispawn_6][read_size] Unexpected
End-Of-File on file descriptor 29. MPI process died?
[parapluie-8.rennes.grid5000.fr:mpispawn_6][handle_mt_peer] Error while
reading PMI socket. MPI process died?
[parapluie-10.rennes.grid5000.fr:mpispawn_8][read_size] Unexpected
End-Of-File on file descriptor 29. MPI process died?
[parapluie-10.rennes.grid5000.fr:mpispawn_8][handle_mt_peer] Error while
reading PMI socket. MPI process died?
[parapluie-1.rennes.grid5000.fr:mpispawn_0][read_size] Unexpected
End-Of-File on file descriptor 31. MPI process died?
[parapluie-1.rennes.grid5000.fr:mpispawn_0][handle_mt_peer] Error while
reading PMI socket. MPI process died?
[parapluie-4.rennes.grid5000.fr:mpispawn_2][read_size] Unexpected
End-Of-File on file descriptor 32. MPI process died?
[parapluie-4.rennes.grid5000.fr:mpispawn_2][handle_mt_peer] Error while
reading PMI socket. MPI process died?
[parapluie-4.rennes.grid5000.fr:mpispawn_2][child_handler] MPI process
(rank: 52, pid: 14639) terminated with signal 2 -> abort job
[parapluie-1.rennes.grid5000.fr:mpispawn_0][child_handler] MPI process
(rank: 19, pid: 15701) terminated with signal 2 -> abort job
[parapluie-8.rennes.grid5000.fr:mpispawn_6][child_handler] MPI process
(rank: 154, pid: 14891) terminated with signal 2 -> abort job
[parapluie-35.rennes.grid5000.fr:mpispawn_31][child_handler] MPI process
(rank: 747, pid: 15915) terminated with signal 2 -> abort job
[parapluie-16.rennes.grid5000.fr:mpispawn_13][read_size] Unexpected
End-Of-File on file descriptor 29. MPI process died?
[parapluie-16.rennes.grid5000.fr:mpispawn_13][handle_mt_peer] Error while
reading PMI socket. MPI process died?
[parapluie-6.rennes.grid5000.fr:mpispawn_4][child_handler] MPI process
(rank: 104, pid: 14631) terminated with signal 2 -> abort job
[parapluie-16.rennes.grid5000.fr:mpispawn_13][child_handler] MPI process
(rank: 327, pid: 16129) terminated with signal 2 -> abort job
[parapluie-17.rennes.grid5000.fr:mpispawn_14][read_size] Unexpected
End-Of-File on file descriptor 29. MPI process died?
[parapluie-17.rennes.grid5000.fr:mpispawn_14][handle_mt_peer] Error while
reading PMI socket. MPI process died?
[parapluie-23.rennes.grid5000.fr:mpispawn_20][readline] Unexpected
End-Of-File on file descriptor 6. MPI process died?
[parapluie-23.rennes.grid5000.fr:mpispawn_20][mtpmi_processops] Error while
reading PMI socket. MPI process died?
[parapluie-27.rennes.grid5000.fr:mpispawn_23][read_size] Unexpected
End-Of-File on file descriptor 29. MPI process died?
[parapluie-27.rennes.grid5000.fr:mpispawn_23][handle_mt_peer] Error while
reading PMI socket. MPI process died?
[parapluie-25.rennes.grid5000.fr:mpispawn_21][read_size] Unexpected
End-Of-File on file descriptor 29. MPI process died?
[parapluie-25.rennes.grid5000.fr:mpispawn_21][handle_mt_peer] Error while
reading PMI socket. MPI process died?
[parapluie-20.rennes.grid5000.fr:mpispawn_17][read_size] Unexpected
End-Of-File on file descriptor 29. MPI process died?
[parapluie-20.rennes.grid5000.fr:mpispawn_17][handle_mt_peer] Error while
reading PMI socket. MPI process died?
[parapluie-7.rennes.grid5000.fr:mpispawn_5][read_size] Unexpected
End-Of-File on file descriptor 29. MPI process died?
[parapluie-7.rennes.grid5000.fr:mpispawn_5][handle_mt_peer] Error while
reading PMI socket. MPI process died?
[parapluie-7.rennes.grid5000.fr:mpispawn_5][child_handler] MPI process
(rank: 125, pid: 14833) terminated with signal 2 -> abort job
[parapluie-21.rennes.grid5000.fr:mpispawn_18][readline] Unexpected
End-Of-File on file descriptor 5. MPI process died?
[parapluie-21.rennes.grid5000.fr:mpispawn_18][mtpmi_processops] Error while
reading PMI socket. MPI process died?
[parapluie-26.rennes.grid5000.fr:mpispawn_22][read_size] Unexpected
End-Of-File on file descriptor 29. MPI process died?
[parapluie-26.rennes.grid5000.fr:mpispawn_22][handle_mt_peer] Error while
reading PMI socket. MPI process died?
[parapluie-18.rennes.grid5000.fr:mpispawn_15][read_size] Unexpected
End-Of-File on file descriptor 29. MPI process died?
[parapluie-18.rennes.grid5000.fr:mpispawn_15][handle_mt_peer] Error while
reading PMI socket. MPI process died?
[parapluie-27.rennes.grid5000.fr:mpispawn_23][child_handler] MPI process
(rank: 556, pid: 16057) terminated with signal 2 -> abort job
[parapluie-29.rennes.grid5000.fr:mpispawn_25][read_size] Unexpected
End-Of-File on file descriptor 29. MPI process died?
[parapluie-29.rennes.grid5000.fr:mpispawn_25][handle_mt_peer] Error while
reading PMI socket. MPI process died?
[parapluie-10.rennes.grid5000.fr:mpispawn_8][child_handler] MPI process
(rank: 203, pid: 15479) terminated with signal 2 -> abort job
[parapluie-32.rennes.grid5000.fr:mpispawn_28][read_size] Unexpected
End-Of-File on file descriptor 29. MPI process died?
[parapluie-32.rennes.grid5000.fr:mpispawn_28][handle_mt_peer] Error while
reading PMI socket. MPI process died?
[parapluie-40.rennes.grid5000.fr:mpispawn_36][read_size] Unexpected
End-Of-File on file descriptor 29. MPI process died?
[parapluie-40.rennes.grid5000.fr:mpispawn_36][handle_mt_peer] Error while
reading PMI socket. MPI process died?
[parapluie-26.rennes.grid5000.fr:mpispawn_22][child_handler] MPI process
(rank: 547, pid: 16123) terminated with signal 2 -> abort job
[parapluie-33.rennes.grid5000.fr:mpispawn_29][child_handler] MPI process
(rank: 707, pid: 16064) terminated with signal 2 -> abort job
[parapluie-32.rennes.grid5000.fr:mpispawn_28][child_handler] MPI process
(rank: 679, pid: 15969) terminated with signal 2 -> abort job
[parapluie-30.rennes.grid5000.fr:mpispawn_26][read_size] Unexpected
End-Of-File on file descriptor 29. MPI process died?
[parapluie-30.rennes.grid5000.fr:mpispawn_26][handle_mt_peer] Error while
reading PMI socket. MPI process died?
[parapluie-29.rennes.grid5000.fr:mpispawn_25][child_handler] MPI process
(rank: 602, pid: 16120) terminated with signal 2 -> abort job
[parapluie-39.rennes.grid5000.fr:mpispawn_35][read_size] Unexpected
End-Of-File on file descriptor 29. MPI process died?
[parapluie-39.rennes.grid5000.fr:mpispawn_35][handle_mt_peer] Error while
reading PMI socket. MPI process died?
[parapluie-38.rennes.grid5000.fr:mpispawn_34][read_size] Unexpected
End-Of-File on file descriptor 29. MPI process died?
[parapluie-38.rennes.grid5000.fr:mpispawn_34][handle_mt_peer] Error while
reading PMI socket. MPI process died?
[parapluie-31.rennes.grid5000.fr:mpispawn_27][read_size] Unexpected
End-Of-File on file descriptor 29. MPI process died?
[parapluie-31.rennes.grid5000.fr:mpispawn_27][handle_mt_peer] Error while
reading PMI socket. MPI process died?
[parapluie-30.rennes.grid5000.fr:mpispawn_26][child_handler] MPI process
(rank: 638, pid: 16126) terminated with signal 2 -> abort job
[parapluie-38.rennes.grid5000.fr:mpispawn_34][child_handler] MPI process
(rank: 836, pid: 14263) terminated with signal 2 -> abort job
[parapluie-5.rennes.grid5000.fr:mpispawn_3][child_handler] MPI process
(rank: 77, pid: 14608) terminated with signal 2 -> abort job
[parapluie-28.rennes.grid5000.fr:mpispawn_24][read_size] Unexpected
End-Of-File on file descriptor 29. MPI process died?
[parapluie-28.rennes.grid5000.fr:mpispawn_24][handle_mt_peer] Error while
reading PMI socket. MPI process died?
[parapluie-28.rennes.grid5000.fr:mpispawn_24][child_handler] MPI process
(rank: 591, pid: 16064) terminated with signal 2 -> abort job
[parapluie-20.rennes.grid5000.fr:mpispawn_17][child_handler] MPI process
(rank: 410, pid: 16147) terminated with signal 2 -> abort job
[parapluie-31.rennes.grid5000.fr:mpispawn_27][child_handler] MPI process
(rank: 651, pid: 16140) terminated with signal 2 -> abort job
[parapluie-39.rennes.grid5000.fr:mpispawn_35][child_handler] MPI process
(rank: 861, pid: 13947) terminated with signal 2 -> abort job
[parapluie-25.rennes.grid5000.fr:mpispawn_21][child_handler] MPI process
(rank: 524, pid: 16096) terminated with signal 2 -> abort job
[parapluie-40.rennes.grid5000.fr:mpispawn_36][child_handler] MPI process
(rank: 872, pid: 13689) terminated with signal 2 -> abort job
[parapluie-18.rennes.grid5000.fr:mpispawn_15][child_handler] MPI process
(rank: 380, pid: 16144) terminated with signal 2 -> abort job
[parapluie-37.rennes.grid5000.fr:mpispawn_33][read_size] Unexpected
End-Of-File on file descriptor 29. MPI process died?
[parapluie-37.rennes.grid5000.fr:mpispawn_33][handle_mt_peer] Error while
reading PMI socket. MPI process died?
[parapluie-15.rennes.grid5000.fr:mpi_rank_308][error_sighandler] Caught
error: Bus error (signal 7)
[parapluie-15.rennes.grid5000.fr:mpi_rank_289][error_sighandler] Caught
error: Bus error (signal 7)
[parapluie-15.rennes.grid5000.fr:mpi_rank_305][error_sighandler] Caught
error: Bus error (signal 7)
[parapluie-15.rennes.grid5000.fr:mpi_rank_294][error_sighandler] Caught
error: Bus error (signal 7)
[parapluie-15.rennes.grid5000.fr:mpi_rank_307][error_sighandler] Caught
error: Bus error (signal 7)
[parapluie-15.rennes.grid5000.fr:mpi_rank_311][error_sighandler] Caught
error: Bus error (signal 7)
[parapluie-15.rennes.grid5000.fr:mpi_rank_300][error_sighandler] Caught
error: Bus error (signal 7)
[parapluie-15.rennes.grid5000.fr:mpi_rank_301][error_sighandler] Caught
error: Bus error (signal 7)
[parapluie-15.rennes.grid5000.fr:mpi_rank_292][error_sighandler] Caught
error: Bus error (signal 7)
[parapluie-15.rennes.grid5000.fr:mpi_rank_303][error_sighandler] Caught
error: Bus error (signal 7)
[parapluie-15.rennes.grid5000.fr:mpi_rank_309][error_sighandler] Caught
error: Bus error (signal 7)
[parapluie-15.rennes.grid5000.fr:mpi_rank_295][error_sighandler] Caught
error: Bus error (signal 7)
[parapluie-15.rennes.grid5000.fr:mpispawn_12][readline] Unexpected
End-Of-File on file descriptor 20. MPI process died?
[parapluie-15.rennes.grid5000.fr:mpispawn_12][mtpmi_processops] Error while
reading PMI socket. MPI process died?
[parapluie-33.rennes.grid5000.fr:mpi_rank_697][error_sighandler] Caught
error: Bus error (signal 7)
[parapluie-33.rennes.grid5000.fr:mpi_rank_698][error_sighandler] Caught
error: Bus error (signal 7)
[parapluie-33.rennes.grid5000.fr:mpi_rank_702][error_sighandler] Caught
error: Bus error (signal 7)
[parapluie-33.rennes.grid5000.fr:mpi_rank_717][error_sighandler] Caught
error: Bus error (signal 7)
[parapluie-33.rennes.grid5000.fr:mpi_rank_715][error_sighandler] Caught
error: Bus error (signal 7)
[parapluie-33.rennes.grid5000.fr:mpi_rank_707][error_sighandler] Caught
error: Bus error (signal 7)
[parapluie-33.rennes.grid5000.fr:mpi_rank_713][error_sighandler] Caught
error: Bus error (signal 7)
[parapluie-33.rennes.grid5000.fr:mpi_rank_700][error_sighandler] Caught
error: Bus error (signal 7)
[parapluie-33.rennes.grid5000.fr:mpispawn_29][readline] Unexpected
End-Of-File on file descriptor 8. MPI process died?
[parapluie-33.rennes.grid5000.fr:mpispawn_29][mtpmi_processops] Error while
reading PMI socket. MPI process died?
[parapluie-15.rennes.grid5000.fr:mpispawn_12][child_handler] MPI process
(rank: 303, pid: 16225) terminated with signal 7 -> abort job
[parapluie-2.rennes.grid5000.fr:mpirun_rsh][process_mpispawn_connection]
mpispawn_12 from node parapluie-21.rennes.grid5000.fr aborted: MPI process
error (1)
[parapluie-4.rennes.grid5000.fr:mpispawn_2][read_size] Unexpected
End-Of-File on file descriptor 33. MPI process died?
[parapluie-4.rennes.grid5000.fr:mpispawn_2][handle_mt_peer] Error while
reading PMI socket. MPI process died?
[parapluie-4.rennes.grid5000.fr:mpispawn_2][child_handler] MPI process
(rank: 61, pid: 14747) terminated with signal 2 -> abort job
[parapluie-14.rennes.grid5000.fr:mpispawn_11][read_size] Unexpected
End-Of-File on file descriptor 29. MPI process died?
[parapluie-14.rennes.grid5000.fr:mpispawn_11][handle_mt_peer] Error while
reading PMI socket. MPI process died?
[parapluie-12.rennes.grid5000.fr:mpispawn_9][read_size] Unexpected
End-Of-File on file descriptor 29. MPI process died?
[parapluie-12.rennes.grid5000.fr:mpispawn_9][handle_mt_peer] Error while
reading PMI socket. MPI process died?
[parapluie-1.rennes.grid5000.fr:mpispawn_0][read_size] Unexpected
End-Of-File on file descriptor 31. MPI process died?
[parapluie-1.rennes.grid5000.fr:mpispawn_0][handle_mt_peer] Error while
reading PMI socket. MPI process died?
[parapluie-13.rennes.grid5000.fr:mpispawn_10][read_size] Unexpected
End-Of-File on file descriptor 29. MPI process died?
[parapluie-13.rennes.grid5000.fr:mpispawn_10][handle_mt_peer] Error while
reading PMI socket. MPI process died?
[parapluie-18.rennes.grid5000.fr:mpi_rank_367][error_sighandler] Caught
error: Bus error (signal 7)
[parapluie-18.rennes.grid5000.fr:mpi_rank_364][error_sighandler] Caught
error: Bus error (signal 7)
[parapluie-18.rennes.grid5000.fr:mpi_rank_366][error_sighandler] Caught
error: Bus error (signal 7)
[parapluie-18.rennes.grid5000.fr:mpi_rank_374][error_sighandler] Caught
error: Bus error (signal 7)
[parapluie-18.rennes.grid5000.fr:mpi_rank_371][error_sighandler] Caught
error: Bus error (signal 7)
[parapluie-18.rennes.grid5000.fr:mpi_rank_378][error_sighandler] Caught
error: Bus error (signal 7)
[parapluie-18.rennes.grid5000.fr:mpi_rank_377][error_sighandler] Caught
error: Bus error (signal 7)
[parapluie-18.rennes.grid5000.fr:mpi_rank_381][error_sighandler] Caught
error: Bus error (signal 7)
[parapluie-18.rennes.grid5000.fr:mpi_rank_379][error_sighandler] Caught
error: Bus error (signal 7)
[parapluie-18.rennes.grid5000.fr:mpi_rank_365][error_sighandler] Caught
error: Bus error (signal 7)
[parapluie-18.rennes.grid5000.fr:mpi_rank_375][error_sighandler] Caught
error: Bus error (signal 7)
[parapluie-18.rennes.grid5000.fr:mpi_rank_372][error_sighandler] Caught
error: Bus error (signal 7)
[parapluie-18.rennes.grid5000.fr:mpi_rank_368][error_sighandler] Caught
error: Bus error (signal 7)
[parapluie-18.rennes.grid5000.fr:mpi_rank_380][error_sighandler] Caught
error: Bus error (signal 7)
[parapluie-18.rennes.grid5000.fr:mpi_rank_382][error_sighandler] Caught
error: Bus error (signal 7)
[parapluie-18.rennes.grid5000.fr:mpi_rank_361][error_sighandler] Caught
error: Bus error (signal 7)
[parapluie-18.rennes.grid5000.fr:mpispawn_15][readline] Unexpected
End-Of-File on file descriptor 6. MPI process died?
[parapluie-18.rennes.grid5000.fr:mpispawn_15][mtpmi_processops] Error while
reading PMI socket. MPI process died?
[parapluie-14.rennes.grid5000.fr:mpispawn_11][child_handler] MPI process
(rank: 281, pid: 16276) terminated with signal 2 -> abort job
[parapluie-13.rennes.grid5000.fr:mpispawn_10][child_handler] MPI process
(rank: 253, pid: 16278) terminated with signal 2 -> abort job
[parapluie-12.rennes.grid5000.fr:mpispawn_9][child_handler] MPI process
(rank: 218, pid: 15584) terminated with signal 2 -> abort job
[parapluie-1.rennes.grid5000.fr:mpispawn_0][child_handler] MPI process
(rank: 4, pid: 15788) terminated with signal 2 -> abort job
[parapluie-5.rennes.grid5000.fr:mpispawn_3][read_size] Unexpected
End-Of-File on file descriptor 29. MPI process died?
[parapluie-5.rennes.grid5000.fr:mpispawn_3][handle_mt_peer] Error while
reading PMI socket. MPI process died?
[parapluie-6.rennes.grid5000.fr:mpispawn_4][read_size] Unexpected
End-Of-File on file descriptor 29. MPI process died?
[parapluie-6.rennes.grid5000.fr:mpispawn_4][handle_mt_peer] Error while
reading PMI socket. MPI process died?
[parapluie-18.rennes.grid5000.fr:mpispawn_15][child_handler] MPI process
(rank: 378, pid: 16225) terminated with signal 7 -> abort job
[parapluie-33.rennes.grid5000.fr:mpispawn_29][child_handler] MPI process
(rank: 707, pid: 16163) terminated with signal 7 -> abort job
[parapluie-2.rennes.grid5000.fr:mpispawn_1][read_size] Unexpected
End-Of-File on file descriptor 31. MPI process died?
[parapluie-2.rennes.grid5000.fr:mpispawn_1][handle_mt_peer] Error while
reading PMI socket. MPI process died?
[parapluie-5.rennes.grid5000.fr:mpispawn_3][child_handler] MPI process
(rank: 73, pid: 14706) terminated with signal 2 -> abort job
[parapluie-17.rennes.grid5000.fr:mpispawn_14][read_size] Unexpected
End-Of-File on file descriptor 29. MPI process died?
[parapluie-17.rennes.grid5000.fr:mpispawn_14][handle_mt_peer] Error while
reading PMI socket. MPI process died?
[parapluie-19.rennes.grid5000.fr:mpispawn_16][read_size] Unexpected
End-Of-File on file descriptor 29. MPI process died?
[parapluie-19.rennes.grid5000.fr:mpispawn_16][handle_mt_peer] Error while
reading PMI socket. MPI process died?
[parapluie-16.rennes.grid5000.fr:mpispawn_13][read_size] Unexpected
End-Of-File on file descriptor 29. MPI process died?
[parapluie-16.rennes.grid5000.fr:mpispawn_13][handle_mt_peer] Error while
reading PMI socket. MPI process died?
[parapluie-2.rennes.grid5000.fr:mpispawn_1][child_handler] MPI process
(rank: 27, pid: 20617) terminated with signal 2 -> abort job
[parapluie-20.rennes.grid5000.fr:mpispawn_17][read_size] Unexpected
End-Of-File on file descriptor 29. MPI process died?
[parapluie-20.rennes.grid5000.fr:mpispawn_17][handle_mt_peer] Error while
reading PMI socket. MPI process died?
[parapluie-34.rennes.grid5000.fr:mpispawn_30][read_size] Unexpected
End-Of-File on file descriptor 29. MPI process died?
[parapluie-34.rennes.grid5000.fr:mpispawn_30][handle_mt_peer] Error while
reading PMI socket. MPI process died?
[parapluie-39.rennes.grid5000.fr:mpispawn_35][read_size] Unexpected
End-Of-File on file descriptor 29. MPI process died?
[parapluie-39.rennes.grid5000.fr:mpispawn_35][handle_mt_peer] Error while
reading PMI socket. MPI process died?
[parapluie-37.rennes.grid5000.fr:mpispawn_33][read_size] Unexpected
End-Of-File on file descriptor 29. MPI process died?
[parapluie-37.rennes.grid5000.fr:mpispawn_33][handle_mt_peer] Error while
reading PMI socket. MPI process died?
[parapluie-10.rennes.grid5000.fr:mpispawn_8][read_size] Unexpected
End-Of-File on file descriptor 29. MPI process died?
[parapluie-10.rennes.grid5000.fr:mpispawn_8][handle_mt_peer] Error while
reading PMI socket. MPI process died?
[parapluie-10.rennes.grid5000.fr:mpispawn_8][child_handler] MPI process
(rank: 205, pid: 15577) terminated with signal 2 -> abort job
[parapluie-23.rennes.grid5000.fr:mpispawn_20][read_size] Unexpected
End-Of-File on file descriptor 29. MPI process died?
[parapluie-23.rennes.grid5000.fr:mpispawn_20][handle_mt_peer] Error while
reading PMI socket. MPI process died?
[parapluie-38.rennes.grid5000.fr:mpispawn_34][read_size] Unexpected
End-Of-File on file descriptor 29. MPI process died?
[parapluie-38.rennes.grid5000.fr:mpispawn_34][handle_mt_peer] Error while
reading PMI socket. MPI process died?
[parapluie-26.rennes.grid5000.fr:mpispawn_22][read_size] Unexpected
End-Of-File on file descriptor 29. MPI process died?
[parapluie-26.rennes.grid5000.fr:mpispawn_22][handle_mt_peer] Error while
reading PMI socket. MPI process died?
[parapluie-6.rennes.grid5000.fr:mpispawn_4][child_handler] MPI process
(rank: 100, pid: 14729) terminated with signal 2 -> abort job
[parapluie-22.rennes.grid5000.fr:mpispawn_19][read_size] Unexpected
End-Of-File on file descriptor 29. MPI process died?
[parapluie-22.rennes.grid5000.fr:mpispawn_19][handle_mt_peer] Error while
reading PMI socket. MPI process died?
[parapluie-22.rennes.grid5000.fr:mpispawn_19][child_handler] MPI process
(rank: 456, pid: 16225) terminated with signal 2 -> abort job
[parapluie-17.rennes.grid5000.fr:mpispawn_14][child_handler] MPI process
(rank: 342, pid: 16240) terminated with signal 2 -> abort job
[parapluie-40.rennes.grid5000.fr:mpispawn_36][read_size] Unexpected
End-Of-File on file descriptor 29. MPI process died?
[parapluie-40.rennes.grid5000.fr:mpispawn_36][handle_mt_peer] Error while
reading PMI socket. MPI process died?
[parapluie-40.rennes.grid5000.fr:mpispawn_36][child_handler] MPI process
(rank: 873, pid: 13792) terminated with signal 2 -> abort job
[parapluie-21.rennes.grid5000.fr:mpispawn_18][read_size] Unexpected
End-Of-File on file descriptor 29. MPI process died?
[parapluie-21.rennes.grid5000.fr:mpi_rank_435][error_sighandler] Caught
error: Bus error (signal 7)
[parapluie-21.rennes.grid5000.fr:mpi_rank_433][error_sighandler] Caught
error: Bus error (signal 7)
[parapluie-21.rennes.grid5000.fr:mpi_rank_451][error_sighandler] Caught
error: Bus error (signal 7)
[parapluie-21.rennes.grid5000.fr:mpi_rank_434][error_sighandler] Caught
error: Bus error (signal 7)
[parapluie-21.rennes.grid5000.fr:mpi_rank_445][error_sighandler] Caught
error: Bus error (signal 7)
[parapluie-21.rennes.grid5000.fr:mpi_rank_436][error_sighandler] Caught
error: Bus error (signal 7)
[parapluie-21.rennes.grid5000.fr:mpi_rank_443][error_sighandler] Caught
error: Bus error (signal 7)
[parapluie-21.rennes.grid5000.fr:mpi_rank_446][error_sighandler] Caught
error: Bus error (signal 7)
[parapluie-21.rennes.grid5000.fr:mpi_rank_440][error_sighandler] Caught
error: Bus error (signal 7)
[parapluie-21.rennes.grid5000.fr:mpi_rank_454][error_sighandler] Caught
error: Bus error (signal 7)
[parapluie-21.rennes.grid5000.fr:mpi_rank_441][error_sighandler] Caught
error: Bus error (signal 7)
[parapluie-21.rennes.grid5000.fr:mpi_rank_437][error_sighandler] Caught
error: Bus error (signal 7)
[parapluie-21.rennes.grid5000.fr:mpispawn_18][handle_mt_peer] Error while
reading PMI socket. MPI process died?
[parapluie-21.rennes.grid5000.fr:mpispawn_18][child_handler] MPI process
(rank: 441, pid: 16143) terminated with signal 7 -> abort job
[parapluie-7.rennes.grid5000.fr:mpispawn_5][read_size] Unexpected
End-Of-File on file descriptor 29. MPI process died?
[parapluie-7.rennes.grid5000.fr:mpispawn_5][handle_mt_peer] Error while
reading PMI socket. MPI process died?
[parapluie-7.rennes.grid5000.fr:mpispawn_5][child_handler] MPI process
(rank: 129, pid: 14939) terminated with signal 2 -> abort job
[parapluie-39.rennes.grid5000.fr:mpispawn_35][child_handler] MPI process
(rank: 853, pid: 14041) terminated with signal 2 -> abort job
[parapluie-34.rennes.grid5000.fr:mpispawn_30][child_handler] MPI process
(rank: 722, pid: 16032) terminated with signal 2 -> abort job
[parapluie-30.rennes.grid5000.fr:mpi_rank_639][error_sighandler] Caught
error: Bus error (signal 7)
[parapluie-30.rennes.grid5000.fr:mpi_rank_635][error_sighandler] Caught
error: Bus error (signal 7)
[parapluie-30.rennes.grid5000.fr:mpi_rank_633][error_sighandler] Caught
error: Bus error (signal 7)
[parapluie-30.rennes.grid5000.fr:mpi_rank_644][error_sighandler] Caught
error: Bus error (signal 7)
[parapluie-30.rennes.grid5000.fr:mpi_rank_628][error_sighandler] Caught
error: Bus error (signal 7)
[parapluie-30.rennes.grid5000.fr:mpi_rank_647][error_sighandler] Caught
error: Bus error (signal 7)
[parapluie-30.rennes.grid5000.fr:mpispawn_26][readline] Unexpected
End-Of-File on file descriptor 7. MPI process died?
[parapluie-30.rennes.grid5000.fr:mpispawn_26][mtpmi_processops] Error while
reading PMI socket. MPI process died?
[parapluie-27.rennes.grid5000.fr:mpispawn_23][read_size] Unexpected
End-Of-File on file descriptor 29. MPI process died?
[parapluie-27.rennes.grid5000.fr:mpispawn_23][handle_mt_peer] Error while
reading PMI socket. MPI process died?
[parapluie-9.rennes.grid5000.fr:mpispawn_7][read_size] Unexpected
End-Of-File on file descriptor 30. MPI process died?
[parapluie-9.rennes.grid5000.fr:mpispawn_7][handle_mt_peer] Error while
reading PMI socket. MPI process died?
[parapluie-9.rennes.grid5000.fr:mpispawn_7][child_handler] MPI process
(rank: 177, pid: 15154) terminated with signal 2 -> abort job
[parapluie-8.rennes.grid5000.fr:mpispawn_6][read_size] Unexpected
End-Of-File on file descriptor 29. MPI process died?
[parapluie-8.rennes.grid5000.fr:mpispawn_6][handle_mt_peer] Error while
reading PMI socket. MPI process died?
[parapluie-28.rennes.grid5000.fr:mpispawn_24][read_size] Unexpected
End-Of-File on file descriptor 29. MPI process died?
[parapluie-28.rennes.grid5000.fr:mpispawn_24][handle_mt_peer] Error while
reading PMI socket. MPI process died?
[parapluie-8.rennes.grid5000.fr:mpispawn_6][child_handler] MPI process
(rank: 157, pid: 14996) terminated with signal 2 -> abort job
[parapluie-35.rennes.grid5000.fr:mpispawn_31][read_size] Unexpected
End-Of-File on file descriptor 29. MPI process died?
[parapluie-35.rennes.grid5000.fr:mpispawn_31][handle_mt_peer] Error while
reading PMI socket. MPI process died?
[parapluie-35.rennes.grid5000.fr:mpispawn_31][child_handler] MPI process
(rank: 751, pid: 16020) terminated with signal 2 -> abort job
[parapluie-25.rennes.grid5000.fr:mpispawn_21][read_size] Unexpected
End-Of-File on file descriptor 29. MPI process died?
[parapluie-25.rennes.grid5000.fr:mpispawn_21][handle_mt_peer] Error while
reading PMI socket. MPI process died?
[parapluie-19.rennes.grid5000.fr:mpispawn_16][child_handler] MPI process
(rank: 388, pid: 16245) terminated with signal 2 -> abort job
[parapluie-37.rennes.grid5000.fr:mpispawn_33][child_handler] MPI process
(rank: 806, pid: 14630) terminated with signal 2 -> abort job
[parapluie-30.rennes.grid5000.fr:mpispawn_26][child_handler] MPI process
(rank: 635, pid: 16225) terminated with signal 7 -> abort job
[parapluie-16.rennes.grid5000.fr:mpispawn_13][child_handler] MPI process
(rank: 316, pid: 16220) terminated with signal 2 -> abort job
[parapluie-36.rennes.grid5000.fr:mpispawn_32][read_size] Unexpected
End-Of-File on file descriptor 29. MPI process died?
[parapluie-36.rennes.grid5000.fr:mpispawn_32][handle_mt_peer] Error while
reading PMI socket. MPI process died?
[parapluie-36.rennes.grid5000.fr:mpispawn_32][child_handler] MPI process
(rank: 781, pid: 14760) terminated with signal 2 -> abort job
[parapluie-27.rennes.grid5000.fr:mpispawn_23][child_handler] MPI process
(rank: 560, pid: 16163) terminated with signal 2 -> abort job
[parapluie-25.rennes.grid5000.fr:mpispawn_21][child_handler] MPI process
(rank: 519, pid: 16193) terminated with signal 2 -> abort job
[parapluie-31.rennes.grid5000.fr:mpispawn_27][read_size] Unexpected
End-Of-File on file descriptor 29. MPI process died?
[parapluie-31.rennes.grid5000.fr:mpispawn_27][handle_mt_peer] Error while
reading PMI socket. MPI process died?
[parapluie-29.rennes.grid5000.fr:mpispawn_25][read_size] Unexpected
End-Of-File on file descriptor 29. MPI process died?
[parapluie-29.rennes.grid5000.fr:mpispawn_25][handle_mt_peer] Error while
reading PMI socket. MPI process died?
[parapluie-28.rennes.grid5000.fr:mpispawn_24][child_handler] MPI process
(rank: 578, pid: 16153) terminated with signal 2 -> abort job
[parapluie-29.rennes.grid5000.fr:mpispawn_25][child_handler] MPI process
(rank: 605, pid: 16225) terminated with signal 2 -> abort job
[parapluie-26.rennes.grid5000.fr:mpispawn_22][child_handler] MPI process
(rank: 531, pid: 16209) terminated with signal 2 -> abort job
[parapluie-31.rennes.grid5000.fr:mpispawn_27][child_handler] MPI process
(rank: 663, pid: 16254) terminated with signal 2 -> abort job
[parapluie-23.rennes.grid5000.fr:mpispawn_20][child_handler] MPI process
(rank: 482, pid: 16203) terminated with signal 2 -> abort job
[parapluie-32.rennes.grid5000.fr:mpispawn_28][read_size] Unexpected
End-Of-File on file descriptor 29. MPI process died?
[parapluie-32.rennes.grid5000.fr:mpispawn_28][handle_mt_peer] Error while
reading PMI socket. MPI process died?
[parapluie-32.rennes.grid5000.fr:mpispawn_28][child_handler] MPI process
(rank: 680, pid: 16072) terminated with signal 2 -> abort job
[parapluie-38.rennes.grid5000.fr:mpispawn_34][child_handler] MPI process
(rank: 837, pid: 14366) terminated with signal 2 -> abort job
[parapluie-20.rennes.grid5000.fr:mpispawn_17][child_handler] MPI process
(rank: 417, pid: 16256) terminated with signal 2 -> abort job
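
A log of this size is easier to read once the bus errors are grouped by
host. The following is a minimal sketch, not part of MVAPICH2, that pulls
the "Bus error (signal 7)" records out of the text and lists the affected
ranks per node. The function name and the sample excerpt are only for
illustration; the regular expression tolerates the line wrapping seen in
the output above.

```python
import re
from collections import defaultdict

# One "Caught error: Bus error (signal 7)" record. \s+ also matches the
# newline that splits each record across two lines in the wrapped log.
SIGBUS_RE = re.compile(
    r"\[([^\]:]+):mpi_rank_(\d+)\]\[error_sighandler\]\s+Caught\s+error:\s+"
    r"Bus\s+error\s+\(signal\s+7\)"
)

# Short excerpt copied from the log above, used only as sample input.
SAMPLE = """\
[parapluie-19.rennes.grid5000.fr:mpi_rank_388][error_sighandler] Caught
error: Bus error (signal 7)
[parapluie-19.rennes.grid5000.fr:mpi_rank_392][error_sighandler] Caught
error: Bus error (signal 7)
[parapluie-34.rennes.grid5000.fr:mpi_rank_723][error_sighandler] Caught
error: Bus error (signal 7)
[parapluie-9.rennes.grid5000.fr:mpispawn_7][child_handler] MPI process
(rank: 182, pid: 15057) terminated with signal 2 -> abort job
"""


def sigbus_ranks_by_host(log_text):
    """Return {hostname: sorted list of ranks that caught a bus error}."""
    hosts = defaultdict(set)
    for host, rank in SIGBUS_RE.findall(log_text):
        hosts[host].add(int(rank))
    return {host: sorted(ranks) for host, ranks in hosts.items()}


if __name__ == "__main__":
    for host, ranks in sorted(sigbus_ranks_by_host(SAMPLE).items()):
        print(host, ranks)
```

The "terminated with signal 2" entries are deliberately ignored here. They
are most likely the remaining ranks being interrupted while the job is torn
down, so the hosts reporting signal 7 are probably the ones to look at
first.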