[mvapich-discuss] Problem with MPI_Init

Schwind Michael michael.schwind at informatik.tu-chemnitz.de
Thu Dec 7 04:36:50 EST 2006


Hello,

I am currently testing mvapich2 on 5 nodes; each has a
Mellanox PCI-Express card connected to an 8-port switch.
The motherboard is a Tyan S4881 with 4 Opteron 865
processors.

We also have a 6th cluster node with 8 Opterons and
two InfiniBand adapters (PCI-Express), and an older
7th cluster node (Tyan S4881, 4 Opteron 865) with one
Mellanox PCI-X adapter.

For now, I am testing mvapich2 only on the 5
homogeneous nodes.

ibv_devinfo -v gives:

hca_id: mthca0
        fw_ver:                         3.5.0
        node_guid:                      0002:c902:0022:33c8
        sys_image_guid:                 0002:c902:0022:33cb
        vendor_id:                      0x02c9
        vendor_part_id:                 23108
        hw_ver:                         0xA1
        board_id:                       MT_0280120001
        phys_port_cnt:                  2
        max_mr_size:                    0xffffffffffffffff
        page_size_cap:                  0xfffff000
        max_qp:                         64512
        max_qp_wr:                      65535
        device_cap_flags:               0x00001c76
        max_sge:                        28
        max_sge_rd:                     0
        max_cq:                         65408
        max_cqe:                        131071
        max_mr:                         131056
        max_pd:                         32768
        max_qp_rd_atom:                 4
        max_ee_rd_atom:                 0
        max_res_rd_atom:                258048
        max_qp_init_rd_atom:            128
        max_ee_init_rd_atom:            0
        atomic_cap:                     ATOMIC_HCA (1)
        max_ee:                         0
        max_rdd:                        0
        max_mw:                         0
        max_raw_ipv6_qp:                0
        max_raw_ethy_qp:                0
        max_mcast_grp:                  8192
        max_mcast_qp_attach:            8
        max_total_mcast_qp_attach:      65536
        max_ah:                         0
        max_fmr:                        0
        max_srq:                        1008
        max_srq_wr:                     65535
        max_srq_sge:                    28
        max_pkeys:                      64
        local_ca_ack_delay:             15
                port:   1
                        state:                  PORT_ACTIVE (4)
                        max_mtu:                2048 (4)
                        active_mtu:             2048 (4)
                        sm_lid:                 12
                        port_lid:               12
                        port_lmc:               0x00
                        max_msg_sz:             0x80000000
                        port_cap_flags:         0x02510a6a
                        max_vl_num:             4
                        bad_pkey_cntr:          0x0
                        qkey_viol_cntr:         0x0
                        sm_sl:                  0
                        pkey_tbl_len:           64
                        gid_tbl_len:            32
                        subnet_timeout:         18
                        init_type_reply:        0
                        active_width:           4X (2)
                        active_speed:           2.5 Gbps (1)
                        phys_state:             LINK_UP (5)
                        GID[  0]:               
fe80:0000:0000:0000:0002:c902:0022:33c9

                port:   2
                        state:                  PORT_INIT (2)
                        max_mtu:                2048 (4)
                        active_mtu:             512 (2)
                        sm_lid:                 0
                        port_lid:               0
                        port_lmc:               0x00
                        max_msg_sz:             0x80000000
                        port_cap_flags:         0x02510a68
                        max_vl_num:             4
                        bad_pkey_cntr:          0x0
                        qkey_viol_cntr:         0x0
                        sm_sl:                  0
                        pkey_tbl_len:           64
                        gid_tbl_len:            32
                        subnet_timeout:         0
                        init_type_reply:        0
                        active_width:           4X (2)
                        active_speed:           2.5 Gbps (1)
                        phys_state:             LINK_UP (5)
                        GID[  0]:              
fe80:0000:0000:0000:0002:c902:0022:33ca
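
For reference, the same device and port attributes that ibv_devinfo
prints can also be read programmatically through libibverbs. The
following is only a minimal sketch (the choice of device index 0 and
the build line in the comment are assumptions, nothing mvapich2-specific):

/* query_hca.c - minimal libibverbs sketch
 * build (assumed): gcc query_hca.c -o query_hca -libverbs */
#include <stdio.h>
#include <infiniband/verbs.h>

int main(void)
{
    int num, p;
    struct ibv_device **list = ibv_get_device_list(&num);
    if (!list || num == 0) {
        fprintf(stderr, "no RDMA devices found\n");
        return 1;
    }

    /* open the first HCA (mthca0 in the output above) */
    struct ibv_context *ctx = ibv_open_device(list[0]);
    struct ibv_device_attr dev_attr;
    ibv_query_device(ctx, &dev_attr);
    printf("%s: %d port(s), max_qp=%d\n",
           ibv_get_device_name(list[0]),
           dev_attr.phys_port_cnt, dev_attr.max_qp);

    /* per-port state: 4 == PORT_ACTIVE, 2 == PORT_INIT,
     * matching the numbers in the ibv_devinfo output above */
    for (p = 1; p <= dev_attr.phys_port_cnt; p++) {
        struct ibv_port_attr port_attr;
        ibv_query_port(ctx, p, &port_attr);
        printf("port %d: state=%d lid=%d sm_lid=%d\n",
               p, port_attr.state, port_attr.lid, port_attr.sm_lid);
    }

    ibv_close_device(ctx);
    ibv_free_device_list(list);
    return 0;
}

In the output above, port 1 is PORT_ACTIVE (4) while port 2 is only
PORT_INIT (2), i.e. its link is up but the subnet manager has not yet
brought it to the active state; only active ports carry normal traffic.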

Thank you,
Michael

On Thursday 07 December 2006 09:26, Pavel Shamis (Pasha) wrote:
> Hi Michael,
> Can you please provide more information about your hardware (ibv_devinfo
> -v)
>
> Regards,
> Pavel Shamis (Pasha)
>
> Schwind Michael wrote:
> > Hello,
> >
> > First, let me say thank you for your great
> > work on mvapich.
> >
> > Now my problem:
> >
> > My program hangs in MPI_Init when I start it.
> >
> >
> > The problem seems to be that the thread
> > blocks while waiting to lock a mutex.
> >
> > The mutex is taken in MPI_Init through
> > MPID_CS_ENTER() on line 89 in init.c.
> >
> > The same thread then calls the macro MPID_CS_ENTER()
> > a second time via MPI_Comm_rank, which MPI_Init
> > itself calls a little later (a standalone illustration
> > of this follows at the end of this message):
> >
> > if (split_comm == 1){
> >        int my_id, size;
> >        MPI_Comm_rank(MPI_COMM_WORLD, &my_id);
> >
> > My distribution is Debian testing on amd64.
> > I use openmpi 1.0 compiled with gcc 4.1.2.
> >
> >
> > What's wrong with my setup?
> >
> > Thanks
> >
> > Michael
> >
>
> _______________________________________________
> mvapich-discuss mailing list
> mvapich-discuss at cse.ohio-state.edu
> http://mail.cse.ohio-state.edu/mailman/listinfo/mvapich-discuss
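
As a standalone illustration of the hang described in the quoted
message above (MPID_CS_ENTER() taken once in MPI_Init and then again
via MPI_Comm_rank before the first lock is released), here is a
minimal sketch using a plain, non-recursive pthread mutex. The
function names are hypothetical stand-ins, not mvapich2's actual code:

/* deadlock_sketch.c - self-deadlock on a non-recursive mutex
 * build (assumed): gcc deadlock_sketch.c -o deadlock_sketch -lpthread */
#include <pthread.h>
#include <stdio.h>

static pthread_mutex_t global_cs;     /* stands in for the MPI critical section */

static void fake_comm_rank(void)
{
    /* second "MPID_CS_ENTER()": the same thread already holds the lock,
     * and a PTHREAD_MUTEX_NORMAL mutex never lets it in again */
    pthread_mutex_lock(&global_cs);
    puts("never reached");
    pthread_mutex_unlock(&global_cs);
}

static void fake_init(void)
{
    pthread_mutex_lock(&global_cs);   /* first "MPID_CS_ENTER()" in MPI_Init */
    fake_comm_rank();                 /* stands in for the MPI_Comm_rank call */
    pthread_mutex_unlock(&global_cs);
}

int main(void)
{
    pthread_mutexattr_t attr;
    pthread_mutexattr_init(&attr);
    pthread_mutexattr_settype(&attr, PTHREAD_MUTEX_NORMAL); /* not recursive */
    pthread_mutex_init(&global_cs, &attr);

    fake_init();                      /* hangs here, like the reported MPI_Init hang */
    return 0;
}

Whether the right fix is to make the critical-section lock recursive or
to avoid re-entering it from inside MPI_Init is of course a question for
the mvapich2 developers; the sketch only reproduces the blocking
behaviour that was reported.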

