[mvapich-discuss] ch3:psm MPI-3 RMA limited to $MV2_SHMEM_COLL_NUM_COMM windows
Jeff Hammond
jeff.science at gmail.com
Thu Mar 27 17:04:41 EDT 2014
I cannot allocate more RMA windows than the value of
MV2_SHMEM_COLL_NUM_COMM. This breaks ARMCI-MPI and thus NWChem. I do
not see this problem with non-PSM builds of MVAPICH2, but I have not
updated the Mellanox builds within the last few weeks, so a recent
change may have broken those as well.
I am working with the MVAPICH2 svn trunk. Please let me know when this
issue is resolved so that I can support NWChem users running on QLogic
IB.
Thanks,
Jeff
[jhammond at blogin2 tests]$ export MV2_SHMEM_COLL_NUM_COMM=1000
[jhammond at blogin2 tests]$ /home/jhammond/MPI/gcc482-mv2trunk/bin/mpiexec -n 4 ./test_malloc
Starting ARMCI memory allocation test with 4 processes
+ allocation 0
+ allocation 1
+ allocation 2
+ allocation 3
+ allocation 4
+ allocation 5
+ allocation 6
+ allocation 7
+ allocation 8
+ allocation 9
+ allocation 10
+ allocation 11
+ allocation 12
+ allocation 13
+ allocation 14
+ allocation 15
+ allocation 16
+ allocation 17
+ allocation 18
+ allocation 19
+ allocation 20
+ allocation 21
+ allocation 22
+ allocation 23
+ allocation 24
+ allocation 25
+ allocation 26
+ allocation 27
+ allocation 28
+ allocation 29
+ allocation 30
+ allocation 31
+ allocation 32
+ allocation 33
+ allocation 34
+ allocation 35
+ allocation 36
+ allocation 37
+ allocation 38
+ allocation 39
+ allocation 40
+ allocation 41
+ allocation 42
+ allocation 43
+ allocation 44
+ allocation 45
+ allocation 46
+ allocation 47
+ allocation 48
+ allocation 49
+ allocation 50
+ allocation 51
+ allocation 52
+ allocation 53
+ allocation 54
+ allocation 55
+ allocation 56
+ allocation 57
+ allocation 58
+ allocation 59
+ allocation 60
+ allocation 61
+ allocation 62
+ allocation 63
+ allocation 64
+ allocation 65
+ allocation 66
+ allocation 67
+ allocation 68
+ allocation 69
+ allocation 70
+ allocation 71
+ allocation 72
+ allocation 73
+ allocation 74
+ allocation 75
+ allocation 76
+ allocation 77
+ allocation 78
+ allocation 79
+ allocation 80
+ allocation 81
+ allocation 82
+ allocation 83
+ allocation 84
+ allocation 85
+ allocation 86
+ allocation 87
+ allocation 88
+ allocation 89
+ allocation 90
+ allocation 91
+ allocation 92
+ allocation 93
+ allocation 94
+ allocation 95
+ allocation 96
+ allocation 97
+ allocation 98
+ allocation 99
+ free 0
+ free 1
+ free 2
+ free 3
+ free 4
+ free 5
+ free 6
+ free 7
+ free 8
+ free 9
+ free 10
+ free 11
+ free 12
+ free 13
+ free 14
+ free 15
+ free 16
+ free 17
+ free 18
+ free 19
+ free 20
+ free 21
+ free 22
+ free 23
+ free 24
+ free 25
+ free 26
+ free 27
+ free 28
+ free 29
+ free 30
+ free 31
+ free 32
+ free 33
+ free 34
+ free 35
+ free 36
+ free 37
+ free 38
+ free 39
+ free 40
+ free 41
+ free 42
+ free 43
+ free 44
+ free 45
+ free 46
+ free 47
+ free 48
+ free 49
+ free 50
+ free 51
+ free 52
+ free 53
+ free 54
+ free 55
+ free 56
+ free 57
+ free 58
+ free 59
+ free 60
+ free 61
+ free 62
+ free 63
+ free 64
+ free 65
+ free 66
+ free 67
+ free 68
+ free 69
+ free 70
+ free 71
+ free 72
+ free 73
+ free 74
+ free 75
+ free 76
+ free 77
+ free 78
+ free 79
+ free 80
+ free 81
+ free 82
+ free 83
+ free 84
+ free 85
+ free 86
+ free 87
+ free 88
+ free 89
+ free 90
+ free 91
+ free 92
+ free 93
+ free 94
+ free 95
+ free 96
+ free 97
+ free 98
+ free 99
Test complete: PASS.
[jhammond at blogin2 tests]$ export MV2_SHMEM_COLL_NUM_COMM=100
[jhammond at blogin2 tests]$ /home/jhammond/MPI/gcc482-mv2trunk/bin/mpiexec -n 4 ./test_malloc
Starting ARMCI memory allocation test with 4 processes
+ allocation 0
+ allocation 1
+ allocation 2
+ allocation 3
+ allocation 4
+ allocation 5
+ allocation 6
+ allocation 7
+ allocation 8
+ allocation 9
+ allocation 10
+ allocation 11
+ allocation 12
+ allocation 13
+ allocation 14
+ allocation 15
+ allocation 16
+ allocation 17
+ allocation 18
+ allocation 19
+ allocation 20
+ allocation 21
+ allocation 22
+ allocation 23
+ allocation 24
+ allocation 25
+ allocation 26
+ allocation 27
+ allocation 28
+ allocation 29
+ allocation 30
+ allocation 31
+ allocation 32
+ allocation 33
+ allocation 34
+ allocation 35
+ allocation 36
+ allocation 37
+ allocation 38
+ allocation 39
+ allocation 40
+ allocation 41
+ allocation 42
+ allocation 43
+ allocation 44
+ allocation 45
+ allocation 46
+ allocation 47
+ allocation 48
+ allocation 49
+ allocation 50
+ allocation 51
+ allocation 52
+ allocation 53
+ allocation 54
+ allocation 55
+ allocation 56
+ allocation 57
+ allocation 58
+ allocation 59
+ allocation 60
+ allocation 61
+ allocation 62
+ allocation 63
+ allocation 64
+ allocation 65
+ allocation 66
+ allocation 67
+ allocation 68
+ allocation 69
+ allocation 70
+ allocation 71
+ allocation 72
+ allocation 73
+ allocation 74
+ allocation 75
+ allocation 76
+ allocation 77
+ allocation 78
+ allocation 79
+ allocation 80
+ allocation 81
+ allocation 82
+ allocation 83
+ allocation 84
+ allocation 85
+ allocation 86
+ allocation 87
+ allocation 88
+ allocation 89
+ allocation 90
+ allocation 91
+ allocation 92
+ allocation 93
+ allocation 94
+ allocation 95
+ allocation 96
+ allocation 97
+ allocation 98
+ allocation 99
test_malloc:116454 terminated with signal 11 at PC=2ae87e750a0e SP=7fffbdc4aa80. Backtrace:
test_malloc:116456 terminated with signal 11 at PC=2b432f382a0e SP=7fffd024a820. Backtrace:
test_malloc:116455 terminated with signal 11 at PC=2b8d22b46a0e SP=7fff916b67c0. Backtrace:
test_malloc:116457 terminated with signal 11 at PC=2b1aeb8aea0e SP=7fff6490d480. Backtrace:
/home/jhammond/MPI/gcc482-mv2trunk/lib/libmpich.so.12(+0xb9a0e)[0x2b432f382a0e]
/home/jhammond/MPI/gcc482-mv2trunk/lib/libmpich.so.12(+0xb9a0e)[0x2ae87e750a0e]
/home/jhammond/MPI/gcc482-mv2trunk/lib/libmpich.so.12(MPIDI_CH3U_Win_allocate+0x79)[0x2b432f373869]
/home/jhammond/MPI/gcc482-mv2trunk/lib/libmpich.so.12(+0xb9a0e)[0x2b1aeb8aea0e]
/home/jhammond/MPI/gcc482-mv2trunk/lib/libmpich.so.12(+0xb9a0e)[0x2b8d22b46a0e]
/home/jhammond/MPI/gcc482-mv2trunk/lib/libmpich.so.12(MPIDI_CH3U_Win_allocate+0x79)[0x2b1aeb89f869]
/home/jhammond/MPI/gcc482-mv2trunk/lib/libmpich.so.12(MPIDI_CH3U_Win_allocate+0x79)[0x2ae87e741869]
/home/jhammond/MPI/gcc482-mv2trunk/lib/libmpich.so.12(MPID_Win_allocate+0x9f)[0x2b432f37a4ef]
/home/jhammond/MPI/gcc482-mv2trunk/lib/libmpich.so.12(MPIDI_CH3U_Win_allocate+0x79)[0x2b8d22b37869]
/home/jhammond/MPI/gcc482-mv2trunk/lib/libmpich.so.12(PMPI_Win_allocate+0x221)[0x2b432f462a01]
./test_malloc[0x4023ca]
./test_malloc[0x401f7a]
/home/jhammond/MPI/gcc482-mv2trunk/lib/libmpich.so.12(MPID_Win_allocate+0x9f)[0x2ae87e7484ef]
./test_malloc[0x401ee8]
./test_malloc[0x401dfc]
/home/jhammond/MPI/gcc482-mv2trunk/lib/libmpich.so.12(MPID_Win_allocate+0x9f)[0x2b1aeb8a64ef]
/home/jhammond/MPI/gcc482-mv2trunk/lib/libmpich.so.12(MPID_Win_allocate+0x9f)[0x2b8d22b3e4ef]
/lib64/libc.so.6(__libc_start_main+0xfd)[0x3b6ea1ed1d]
./test_malloc[0x401bc9]
/home/jhammond/MPI/gcc482-mv2trunk/lib/libmpich.so.12(PMPI_Win_allocate+0x221)[0x2ae87e830a01]
./test_malloc[0x4023ca]
/home/jhammond/MPI/gcc482-mv2trunk/lib/libmpich.so.12(PMPI_Win_allocate+0x221)[0x2b1aeb98ea01]
./test_malloc[0x4023ca]
./test_malloc[0x401f7a]
./test_malloc[0x401ee8]
./test_malloc[0x401dfc]
./test_malloc[0x401f7a]
./test_malloc[0x401ee8]
./test_malloc[0x401dfc]
/home/jhammond/MPI/gcc482-mv2trunk/lib/libmpich.so.12(PMPI_Win_allocate+0x221)[0x2b8d22c26a01]
./test_malloc[0x4023ca]
./test_malloc[0x401f7a]
./test_malloc[0x401ee8]
./test_malloc[0x401dfc]
/lib64/libc.so.6(__libc_start_main+0xfd)[0x3b6ea1ed1d]
./test_malloc[0x401bc9]
/lib64/libc.so.6(__libc_start_main+0xfd)[0x3b6ea1ed1d]
./test_malloc[0x401bc9]
/lib64/libc.so.6(__libc_start_main+0xfd)[0x3b6ea1ed1d]
./test_malloc[0x401bc9]
===================================================================================
= BAD TERMINATION OF ONE OF YOUR APPLICATION PROCESSES
= PID 116456 RUNNING AT blogin2
= EXIT CODE: 1
= CLEANING UP REMAINING PROCESSES
= YOU CAN IGNORE THE BELOW CLEANUP MESSAGES
===================================================================================
[jhammond at blogin2 tests]$ export MV2_SHMEM_COLL_NUM_COMM=10
[jhammond at blogin2 tests]$ /home/jhammond/MPI/gcc482-mv2trunk/bin/mpiexec -n 4 ./test_malloc
Starting ARMCI memory allocation test with 4 processes
+ allocation 0
+ allocation 1
+ allocation 2
+ allocation 3
+ allocation 4
+ allocation 5
+ allocation 6
+ allocation 7
+ allocation 8
+ allocation 9
+ allocation 10
test_malloc:116472 terminated with signal 11 at PC=2ba320df5a0e SP=7fff1d1d1c10. Backtrace:
test_malloc:116473 terminated with signal 11 at PC=2adc34339a0e SP=7fff87299800. Backtrace:
test_malloc:116474 terminated with signal 11 at PC=2ba62c9dca0e SP=7fff5457a550. Backtrace:
test_malloc:116475 terminated with signal 11 at PC=2b6069991a0e SP=7ffff4ba8b70. Backtrace:
/home/jhammond/MPI/gcc482-mv2trunk/lib/libmpich.so.12(+0xb9a0e)[0x2b6069991a0e]
/home/jhammond/MPI/gcc482-mv2trunk/lib/libmpich.so.12(+0xb9a0e)[0x2ba320df5a0e]
/home/jhammond/MPI/gcc482-mv2trunk/lib/libmpich.so.12(+0xb9a0e)[0x2adc34339a0e]
/home/jhammond/MPI/gcc482-mv2trunk/lib/libmpich.so.12(+0xb9a0e)[0x2ba62c9dca0e]
/home/jhammond/MPI/gcc482-mv2trunk/lib/libmpich.so.12(MPIDI_CH3U_Win_allocate+0x79)[0x2b6069982869]
/home/jhammond/MPI/gcc482-mv2trunk/lib/libmpich.so.12(MPID_Win_allocate+0x9f)[0x2b60699894ef]
/home/jhammond/MPI/gcc482-mv2trunk/lib/libmpich.so.12(MPIDI_CH3U_Win_allocate+0x79)[0x2ba320de6869]
/home/jhammond/MPI/gcc482-mv2trunk/lib/libmpich.so.12(MPIDI_CH3U_Win_allocate+0x79)[0x2adc3432a869]
/home/jhammond/MPI/gcc482-mv2trunk/lib/libmpich.so.12(MPIDI_CH3U_Win_allocate+0x79)[0x2ba62c9cd869]
/home/jhammond/MPI/gcc482-mv2trunk/lib/libmpich.so.12(PMPI_Win_allocate+0x221)[0x2b6069a71a01]
./test_malloc[0x4023ca]
./test_malloc[0x401f7a]
./test_malloc[0x401ee8]
./test_malloc[0x401dfc]
/lib64/libc.so.6(__libc_start_main+0xfd)[0x3b6ea1ed1d]
./test_malloc[0x401bc9]
/home/jhammond/MPI/gcc482-mv2trunk/lib/libmpich.so.12(MPID_Win_allocate+0x9f)[0x2ba62c9d44ef]
/home/jhammond/MPI/gcc482-mv2trunk/lib/libmpich.so.12(MPID_Win_allocate+0x9f)[0x2ba320ded4ef]
/home/jhammond/MPI/gcc482-mv2trunk/lib/libmpich.so.12(MPID_Win_allocate+0x9f)[0x2adc343314ef]
/home/jhammond/MPI/gcc482-mv2trunk/lib/libmpich.so.12(PMPI_Win_allocate+0x221)[0x2ba320ed5a01]
./test_malloc[0x4023ca]
./test_malloc[0x401f7a]
./test_malloc[0x401ee8]
./test_malloc[0x401dfc]
/home/jhammond/MPI/gcc482-mv2trunk/lib/libmpich.so.12(PMPI_Win_allocate+0x221)[0x2adc34419a01]
./test_malloc[0x4023ca]
./test_malloc[0x401f7a]
./test_malloc[0x401ee8]
./test_malloc[0x401dfc]
/home/jhammond/MPI/gcc482-mv2trunk/lib/libmpich.so.12(PMPI_Win_allocate+0x221)[0x2ba62cabca01]
./test_malloc[0x4023ca]
./test_malloc[0x401f7a]
./test_malloc[0x401ee8]
./test_malloc[0x401dfc]
/lib64/libc.so.6(__libc_start_main+0xfd)[0x3b6ea1ed1d]
./test_malloc[0x401bc9]
/lib64/libc.so.6(__libc_start_main+0xfd)[0x3b6ea1ed1d]
./test_malloc[0x401bc9]
/lib64/libc.so.6(__libc_start_main+0xfd)[0x3b6ea1ed1d]
./test_malloc[0x401bc9]
===================================================================================
= BAD TERMINATION OF ONE OF YOUR APPLICATION PROCESSES
= PID 116475 RUNNING AT blogin2
= EXIT CODE: 1
= CLEANING UP REMAINING PROCESSES
= YOU CAN IGNORE THE BELOW CLEANUP MESSAGES
===================================================================================
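
The three runs above are long, so a small script may help when comparing them or when repeating the test with other values of MV2_SHMEM_COLL_NUM_COMM. The following is only a sketch: the function name summarize_run and the returned keys are made up for illustration, and it assumes the output format is exactly what is pasted above ("+ allocation N", "+ free N", "terminated with signal 11", "Test complete: PASS."). It counts logged lines only. The post does not say whether test_malloc prints "+ allocation N" before or after creating the window, so the count may differ by one from the number of windows actually created.

```python
import re

# Line formats taken from the test_malloc output pasted in this post.
ALLOC_RE = re.compile(r"^\+ allocation (\d+)$")
FREE_RE = re.compile(r"^\+ free (\d+)$")
CRASH_RE = re.compile(r"^\S+:(\d+) terminated with signal 11\b")


def summarize_run(log_text):
    """Summarise one captured test_malloc run (hypothetical helper).

    Returns how many allocation and free lines were logged, which PIDs
    reported signal 11, and whether the PASS line was seen.
    """
    allocations = 0
    frees = 0
    crashed_pids = []
    passed = False
    for raw in log_text.splitlines():
        line = raw.strip()
        if ALLOC_RE.match(line):
            allocations += 1
        elif FREE_RE.match(line):
            frees += 1
        elif line.startswith("Test complete: PASS"):
            passed = True
        else:
            crash = CRASH_RE.match(line)
            if crash:
                crashed_pids.append(int(crash.group(1)))
    return {
        "allocations_logged": allocations,
        "frees_logged": frees,
        "crashed_pids": crashed_pids,
        "passed": passed,
    }


# Abbreviated copy of the MV2_SHMEM_COLL_NUM_COMM=10 run shown above
# (only the first crash report is included here).
sample = "\n".join(
    ["Starting ARMCI memory allocation test with 4 processes"]
    + ["+ allocation %d" % i for i in range(11)]
    + [
        "test_malloc:116472 terminated with signal 11 at PC=2ba320df5a0e",
        "SP=7fff1d1d1c10. Backtrace:",
    ]
)

summary = summarize_run(sample)
print(summary)
```

Feeding each run's captured output to the same function gives a quick side-by-side of how many allocations were logged before the segfault in MPIDI_CH3U_Win_allocate for each setting.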
--
Jeff Hammond
jeff.science at gmail.com