[mvapich-discuss] InfiniBand problems

Jens Glaser jglaser at umn.edu
Sat Nov 3 21:47:32 EDT 2012


Hi,

I suspect I am having trouble with InfiniBand support on the cluster I am using (the Keeneland final system).

The latency tests (osu_latency) run fine, but osu_bw and osu_bibw hang.
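
For reference, I launch the benchmarks along these lines (the hostnames are placeholders for two of our compute nodes):

$ mpirun_rsh -np 2 node001 node002 ./osu_latency    # completes normally
$ mpirun_rsh -np 2 node001 node002 ./osu_bw         # hangs
$ mpirun_rsh -np 2 node001 node002 ./osu_bibw       # hangs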

The system has Mellanox FDR adapters.

Here is the adapter information:

$ ibstat
CA 'mlx4_0'
	CA type: MT4099
	Number of ports: 2
	Firmware version: 2.10.5380
	Hardware version: 0
	Node GUID: 0x0002c903003ff800
	System image GUID: 0x0002c903003ff803
	Port 1:
		State: Active
		Physical state: LinkUp
		Rate: 56
		Base lid: 64
		LMC: 0
		SM lid: 264
		Capability mask: 0x02514868
		Port GUID: 0x0002c903003ff801
		Link layer: InfiniBand
	Port 2:
		State: Down
		Physical state: Disabled
		Rate: 40
		Base lid: 0
		LMC: 0
		SM lid: 0
		Capability mask: 0x02514868
		Port GUID: 0x0002c903003ff802
		Link layer: InfiniBand

The library was configured with CUDA support; I am using the latest version (1.9a).
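
For completeness, the build was configured along these lines (the CUDA path is a placeholder; the exact flags on our system may differ):

$ ./configure --enable-cuda --with-cuda=/usr/local/cuda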

Any ideas?

Jens

