[mvapich-discuss] MPI_Comm_accept failed when some client connected to the server

Hari Subramoni subramoni.1 at osu.edu
Sun Apr 19 10:58:36 EDT 2015


Hello,

Can you please send the following:

1. The output of "mpiname -a"
2. The exact command used to run the application
3. Any run-time parameters used

Can you re-compile MVAPICH2 with debugging options and run with
"MV2_DEBUG_SHOW_BACKTRACE=1"?

Please refer to the following sections of the userguide for more
information.

http://mvapich.cse.ohio-state.edu/static/media/mvapich/mvapich2-2.1-userguide.html#x1-1250009.1.14
http://mvapich.cse.ohio-state.edu/static/media/mvapich/mvapich2-2.1-userguide.html#x1-15500010.5

Regards,
Hari.

On Sun, Apr 19, 2015 at 12:22 AM, 马凯 <makailove123 at 163.com> wrote:

>     I have run into another problem when using port-based communication.
>     MPI_Open_port seems to work fine, and it printed this:
> server available at
> tag#0$description#"#RANK:00000000(00000001:0000034a:00000001:00000000)#"$
>
>     Then I launched the client to connect to that port using the
> port_name. At that moment, the server aborted and printed this:
>  [gpu-cluster-1:mpi_rank_0][error_sighandler] Caught error: Segmentation
> fault (signal 11)
> [gpu-cluster-1:mpispawn_0][readline] Unexpected End-Of-File on file
> descriptor 5. MPI process died?
> [gpu-cluster-1:mpispawn_0][mtpmi_processops] Error while reading PMI
> socket. MPI process died?
> [gpu-cluster-1:mpispawn_0][child_handler] MPI process (rank: 0, pid:
> 12810) terminated with signal 11 -> abort job
> [gpu-cluster-1:mpirun_rsh][process_mpispawn_connection] mpispawn_0 from
> node 192.168.2.1 aborted: Error while reading a PMI socket (4)
>     The client then did nothing and simply hung.
>     I would appreciate any help!
>
>
>     The following is my server code:
> #include <mpi.h>
> #include <stdio.h>
> #include <stdlib.h>
>
> int main(int argc, char *argv[]) {
>     int size;
>     MPI_Comm client;
>     MPI_Status status;
>     char port_name[MPI_MAX_PORT_NAME];
>     char buf[1024];
>
>     MPI_Init(&argc, &argv);
>     MPI_Comm_size(MPI_COMM_WORLD, &size);
>     if (size > 1) {
>         printf("Server too big\n");
>         MPI_Finalize();
>         exit(EXIT_FAILURE);
>     }
>
>     MPI_Open_port(MPI_INFO_NULL, port_name);
>     printf("server available at %s\n", port_name);
>
> //  MPI_Publish_name("server", MPI_INFO_NULL, port_name);
>
>     MPI_Comm_accept(port_name, MPI_INFO_NULL, 0, MPI_COMM_WORLD, &client);
>     printf("Accept succeeded\n");
>
>     MPI_Recv(buf, 1024, MPI_CHAR, MPI_ANY_SOURCE, MPI_ANY_TAG, client,
>              &status);
>
>     MPI_Close_port(port_name);
>     /* MPI_Comm_disconnect sets client to MPI_COMM_NULL, so a
>        subsequent MPI_Comm_free on it would be erroneous. */
>     MPI_Comm_disconnect(&client);
> //  MPI_Unpublish_name("server", MPI_INFO_NULL, port_name);
>     MPI_Finalize();
>     return 0;
> }
>
>     The following is my client code:
> #include <mpi.h>
> #include <stdio.h>
> #include <stdlib.h>
>
> int main(int argc, char *argv[]) {
>     int size;
>     MPI_Comm server;
>     char port_name[MPI_MAX_PORT_NAME];
>     char buf[1024];
>
>     MPI_Init(&argc, &argv);
>
>     /* Check arguments after MPI_Init, since MPI may strip its own
>        command-line options from argv. */
>     if (argc < 2) {
>         printf("too few arguments\n");
>         MPI_Finalize();
>         exit(EXIT_FAILURE);
>     }
>
>     MPI_Comm_size(MPI_COMM_WORLD, &size);
>     if (size > 1) {
>         printf("Client too big\n");
>         MPI_Finalize();
>         exit(EXIT_FAILURE);
>     }
>
>     //MPI_Lookup_name("server", MPI_INFO_NULL, port_name);
>
>     /* argv[1] is the port name printed by the server. */
>     MPI_Comm_connect(argv[1], MPI_INFO_NULL, 0, MPI_COMM_WORLD, &server);
>     printf("Connect succeeded\n");
>
>     /* Zero-length message just to signal the server. */
>     MPI_Send(buf, 0, MPI_CHAR, 0, 100, server);
>
>     /* MPI_Comm_disconnect sets server to MPI_COMM_NULL, so a
>        subsequent MPI_Comm_free on it would be erroneous. */
>     MPI_Comm_disconnect(&server);
>     MPI_Finalize();
>     return 0;
> }
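>
>     In case it is useful, this is roughly how I launch the two programs
> (the host name and binary paths below are just placeholders):
>
> mpirun_rsh -np 1 node01 ./server
> mpirun_rsh -np 1 node02 ./client "<port_name printed by the server>"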