[mvapich-discuss] Problem with MPI_Init
Schwind Michael
michael.schwind at informatik.tu-chemnitz.de
Thu Dec 7 04:36:50 EST 2006
Hello,
I am currently testing mvapich2 on
5 nodes, each with a
Mellanox PCI-Express card
connected to an 8-port switch.
The motherboard is a Tyan S4881,
which has 4 Opteron 865 processors.
We also have a 6th cluster node with 8 Opterons
and two InfiniBand adapters (PCI-Express),
and an older 7th cluster node (Tyan S4881,
4 Opteron 865) with one Mellanox
PCI-X adapter.
But at the moment I am testing mvapich2 only
on the 5 homogeneous nodes.
ibv_devinfo -v gives:
hca_id: mthca0
fw_ver: 3.5.0
node_guid: 0002:c902:0022:33c8
sys_image_guid: 0002:c902:0022:33cb
vendor_id: 0x02c9
vendor_part_id: 23108
hw_ver: 0xA1
board_id: MT_0280120001
phys_port_cnt: 2
max_mr_size: 0xffffffffffffffff
page_size_cap: 0xfffff000
max_qp: 64512
max_qp_wr: 65535
device_cap_flags: 0x00001c76
max_sge: 28
max_sge_rd: 0
max_cq: 65408
max_cqe: 131071
max_mr: 131056
max_pd: 32768
max_qp_rd_atom: 4
max_ee_rd_atom: 0
max_res_rd_atom: 258048
max_qp_init_rd_atom: 128
max_ee_init_rd_atom: 0
atomic_cap: ATOMIC_HCA (1)
max_ee: 0
max_rdd: 0
max_mw: 0
max_raw_ipv6_qp: 0
max_raw_ethy_qp: 0
max_mcast_grp: 8192
max_mcast_qp_attach: 8
max_total_mcast_qp_attach: 65536
max_ah: 0
max_fmr: 0
max_srq: 1008
max_srq_wr: 65535
max_srq_sge: 28
max_pkeys: 64
local_ca_ack_delay: 15
port: 1
state: PORT_ACTIVE (4)
max_mtu: 2048 (4)
active_mtu: 2048 (4)
sm_lid: 12
port_lid: 12
port_lmc: 0x00
max_msg_sz: 0x80000000
port_cap_flags: 0x02510a6a
max_vl_num: 4
bad_pkey_cntr: 0x0
qkey_viol_cntr: 0x0
sm_sl: 0
pkey_tbl_len: 64
gid_tbl_len: 32
subnet_timeout: 18
init_type_reply: 0
active_width: 4X (2)
active_speed: 2.5 Gbps (1)
phys_state: LINK_UP (5)
GID[ 0]:
fe80:0000:0000:0000:0002:c902:0022:33c9
port: 2
state: PORT_INIT (2)
max_mtu: 2048 (4)
active_mtu: 512 (2)
sm_lid: 0
port_lid: 0
port_lmc: 0x00
max_msg_sz: 0x80000000
port_cap_flags: 0x02510a68
max_vl_num: 4
bad_pkey_cntr: 0x0
qkey_viol_cntr: 0x0
sm_sl: 0
pkey_tbl_len: 64
gid_tbl_len: 32
subnet_timeout: 0
init_type_reply: 0
active_width: 4X (2)
active_speed: 2.5 Gbps (1)
phys_state: LINK_UP (5)
GID[ 0]:
fe80:0000:0000:0000:0002:c902:0022:33ca
Thank you,
Michael
On Thursday 07 December 2006 09:26, Pavel Shamis (Pasha) wrote:
> Hi Michael,
> Can you please provide more information about your hardware (ibv_devinfo
> -v)
>
> Regards,
> Pavel Shamis (Pasha)
>
> Schwind Michael wrote:
> > Hello,
> >
> > first, let me say thank you for your great
> > work on mvapich.
> >
> > Now my problem:
> >
> > My program hangs in MPI_Init when I start it.
> >
> >
> > The problem seems to be that the thread
> > is waiting to lock a mutex.
> >
> > The mutex was acquired in MPI_Init through
> > MPID_CS_ENTER() on line 89 of init.c.
> >
> > The thread then calls the MPID_CS_ENTER()
> > macro a second time in MPI_Comm_rank,
> > which is invoked later within MPI_Init:
> >
> > if (split_comm == 1){
> > int my_id, size;
> > MPI_Comm_rank(MPI_COMM_WORLD, &my_id);
> >
> > My distribution is Debian testing on amd64.
> > I use openmpi 1.0 compiled with gcc 4.1.2.
> >
> >
> > What's wrong with my setup?
> >
> > Thanks
> >
> > Michael
> >
> >
> >
> >
> > _______________________________________________
> > mvapich-discuss mailing list
> > mvapich-discuss at cse.ohio-state.edu
> > http://mail.cse.ohio-state.edu/mailman/listinfo/mvapich-discuss
>