[mvapich-discuss] ch3:psm MPI-3 RMA limited to $MV2_SHMEM_COLL_NUM_COMM windows
Mingzhe Li
li.2192 at osu.edu
Fri Mar 28 13:43:20 EDT 2014
Hi Jeff,
Thanks for reporting. We will take a look and get back to you.
Mingzhe
On Thu, Mar 27, 2014 at 5:04 PM, Jeff Hammond <jeff.science at gmail.com> wrote:
> I cannot allocate more RMA windows than the value of
> MV2_SHMEM_COLL_NUM_COMM. This breaks ARMCI-MPI and thus NWChem. I do
> not see this problem with non-PSM builds of MVAPICH2, although I have
> not updated the Mellanox builds within the last few weeks, so perhaps
> it is pervasively broken by a recent change.
>
> I am working on the MVAPICH2 svn trunk. Please let me know when this
> issue is resolved so I can support NWChem users running on QLogic
> InfiniBand.
>
> Thanks,
>
> Jeff
>
> [jhammond at blogin2 tests]$ export MV2_SHMEM_COLL_NUM_COMM=1000
> [jhammond at blogin2 tests]$ /home/jhammond/MPI/gcc482-mv2trunk/bin/mpiexec -n 4 ./test_malloc
> Starting ARMCI memory allocation test with 4 processes
> + allocation 0
> + allocation 1
> + allocation 2
> + allocation 3
> + allocation 4
> + allocation 5
> + allocation 6
> + allocation 7
> + allocation 8
> + allocation 9
> + allocation 10
> + allocation 11
> + allocation 12
> + allocation 13
> + allocation 14
> + allocation 15
> + allocation 16
> + allocation 17
> + allocation 18
> + allocation 19
> + allocation 20
> + allocation 21
> + allocation 22
> + allocation 23
> + allocation 24
> + allocation 25
> + allocation 26
> + allocation 27
> + allocation 28
> + allocation 29
> + allocation 30
> + allocation 31
> + allocation 32
> + allocation 33
> + allocation 34
> + allocation 35
> + allocation 36
> + allocation 37
> + allocation 38
> + allocation 39
> + allocation 40
> + allocation 41
> + allocation 42
> + allocation 43
> + allocation 44
> + allocation 45
> + allocation 46
> + allocation 47
> + allocation 48
> + allocation 49
> + allocation 50
> + allocation 51
> + allocation 52
> + allocation 53
> + allocation 54
> + allocation 55
> + allocation 56
> + allocation 57
> + allocation 58
> + allocation 59
> + allocation 60
> + allocation 61
> + allocation 62
> + allocation 63
> + allocation 64
> + allocation 65
> + allocation 66
> + allocation 67
> + allocation 68
> + allocation 69
> + allocation 70
> + allocation 71
> + allocation 72
> + allocation 73
> + allocation 74
> + allocation 75
> + allocation 76
> + allocation 77
> + allocation 78
> + allocation 79
> + allocation 80
> + allocation 81
> + allocation 82
> + allocation 83
> + allocation 84
> + allocation 85
> + allocation 86
> + allocation 87
> + allocation 88
> + allocation 89
> + allocation 90
> + allocation 91
> + allocation 92
> + allocation 93
> + allocation 94
> + allocation 95
> + allocation 96
> + allocation 97
> + allocation 98
> + allocation 99
> + free 0
> + free 1
> + free 2
> + free 3
> + free 4
> + free 5
> + free 6
> + free 7
> + free 8
> + free 9
> + free 10
> + free 11
> + free 12
> + free 13
> + free 14
> + free 15
> + free 16
> + free 17
> + free 18
> + free 19
> + free 20
> + free 21
> + free 22
> + free 23
> + free 24
> + free 25
> + free 26
> + free 27
> + free 28
> + free 29
> + free 30
> + free 31
> + free 32
> + free 33
> + free 34
> + free 35
> + free 36
> + free 37
> + free 38
> + free 39
> + free 40
> + free 41
> + free 42
> + free 43
> + free 44
> + free 45
> + free 46
> + free 47
> + free 48
> + free 49
> + free 50
> + free 51
> + free 52
> + free 53
> + free 54
> + free 55
> + free 56
> + free 57
> + free 58
> + free 59
> + free 60
> + free 61
> + free 62
> + free 63
> + free 64
> + free 65
> + free 66
> + free 67
> + free 68
> + free 69
> + free 70
> + free 71
> + free 72
> + free 73
> + free 74
> + free 75
> + free 76
> + free 77
> + free 78
> + free 79
> + free 80
> + free 81
> + free 82
> + free 83
> + free 84
> + free 85
> + free 86
> + free 87
> + free 88
> + free 89
> + free 90
> + free 91
> + free 92
> + free 93
> + free 94
> + free 95
> + free 96
> + free 97
> + free 98
> + free 99
> Test complete: PASS.
>
> [jhammond at blogin2 tests]$ export MV2_SHMEM_COLL_NUM_COMM=100
> [jhammond at blogin2 tests]$ /home/jhammond/MPI/gcc482-mv2trunk/bin/mpiexec -n 4 ./test_malloc
> Starting ARMCI memory allocation test with 4 processes
> + allocation 0
> + allocation 1
> + allocation 2
> + allocation 3
> + allocation 4
> + allocation 5
> + allocation 6
> + allocation 7
> + allocation 8
> + allocation 9
> + allocation 10
> + allocation 11
> + allocation 12
> + allocation 13
> + allocation 14
> + allocation 15
> + allocation 16
> + allocation 17
> + allocation 18
> + allocation 19
> + allocation 20
> + allocation 21
> + allocation 22
> + allocation 23
> + allocation 24
> + allocation 25
> + allocation 26
> + allocation 27
> + allocation 28
> + allocation 29
> + allocation 30
> + allocation 31
> + allocation 32
> + allocation 33
> + allocation 34
> + allocation 35
> + allocation 36
> + allocation 37
> + allocation 38
> + allocation 39
> + allocation 40
> + allocation 41
> + allocation 42
> + allocation 43
> + allocation 44
> + allocation 45
> + allocation 46
> + allocation 47
> + allocation 48
> + allocation 49
> + allocation 50
> + allocation 51
> + allocation 52
> + allocation 53
> + allocation 54
> + allocation 55
> + allocation 56
> + allocation 57
> + allocation 58
> + allocation 59
> + allocation 60
> + allocation 61
> + allocation 62
> + allocation 63
> + allocation 64
> + allocation 65
> + allocation 66
> + allocation 67
> + allocation 68
> + allocation 69
> + allocation 70
> + allocation 71
> + allocation 72
> + allocation 73
> + allocation 74
> + allocation 75
> + allocation 76
> + allocation 77
> + allocation 78
> + allocation 79
> + allocation 80
> + allocation 81
> + allocation 82
> + allocation 83
> + allocation 84
> + allocation 85
> + allocation 86
> + allocation 87
> + allocation 88
> + allocation 89
> + allocation 90
> + allocation 91
> + allocation 92
> + allocation 93
> + allocation 94
> + allocation 95
> + allocation 96
> + allocation 97
> + allocation 98
> + allocation 99
>
> test_malloc:116454 terminated with signal 11 at PC=2ae87e750a0e
> SP=7fffbdc4aa80. Backtrace:
>
> test_malloc:116456 terminated with signal 11 at PC=2b432f382a0e
> SP=7fffd024a820. Backtrace:
>
> test_malloc:116455 terminated with signal 11 at PC=2b8d22b46a0e
> SP=7fff916b67c0. Backtrace:
>
> test_malloc:116457 terminated with signal 11 at PC=2b1aeb8aea0e
> SP=7fff6490d480. Backtrace:
>
> /home/jhammond/MPI/gcc482-mv2trunk/lib/libmpich.so.12(+0xb9a0e)[0x2b432f382a0e]
>
> /home/jhammond/MPI/gcc482-mv2trunk/lib/libmpich.so.12(+0xb9a0e)[0x2ae87e750a0e]
>
> /home/jhammond/MPI/gcc482-mv2trunk/lib/libmpich.so.12(MPIDI_CH3U_Win_allocate+0x79)[0x2b432f373869]
>
> /home/jhammond/MPI/gcc482-mv2trunk/lib/libmpich.so.12(+0xb9a0e)[0x2b1aeb8aea0e]
>
> /home/jhammond/MPI/gcc482-mv2trunk/lib/libmpich.so.12(+0xb9a0e)[0x2b8d22b46a0e]
>
> /home/jhammond/MPI/gcc482-mv2trunk/lib/libmpich.so.12(MPIDI_CH3U_Win_allocate+0x79)[0x2b1aeb89f869]
>
> /home/jhammond/MPI/gcc482-mv2trunk/lib/libmpich.so.12(MPIDI_CH3U_Win_allocate+0x79)[0x2ae87e741869]
>
> /home/jhammond/MPI/gcc482-mv2trunk/lib/libmpich.so.12(MPID_Win_allocate+0x9f)[0x2b432f37a4ef]
>
> /home/jhammond/MPI/gcc482-mv2trunk/lib/libmpich.so.12(MPIDI_CH3U_Win_allocate+0x79)[0x2b8d22b37869]
>
> /home/jhammond/MPI/gcc482-mv2trunk/lib/libmpich.so.12(PMPI_Win_allocate+0x221)[0x2b432f462a01]
> ./test_malloc[0x4023ca]
> ./test_malloc[0x401f7a]
>
> /home/jhammond/MPI/gcc482-mv2trunk/lib/libmpich.so.12(MPID_Win_allocate+0x9f)[0x2ae87e7484ef]
> ./test_malloc[0x401ee8]
> ./test_malloc[0x401dfc]
>
> /home/jhammond/MPI/gcc482-mv2trunk/lib/libmpich.so.12(MPID_Win_allocate+0x9f)[0x2b1aeb8a64ef]
>
> /home/jhammond/MPI/gcc482-mv2trunk/lib/libmpich.so.12(MPID_Win_allocate+0x9f)[0x2b8d22b3e4ef]
> /lib64/libc.so.6(__libc_start_main+0xfd)[0x3b6ea1ed1d]
> ./test_malloc[0x401bc9]
>
> /home/jhammond/MPI/gcc482-mv2trunk/lib/libmpich.so.12(PMPI_Win_allocate+0x221)[0x2ae87e830a01]
> ./test_malloc[0x4023ca]
>
> /home/jhammond/MPI/gcc482-mv2trunk/lib/libmpich.so.12(PMPI_Win_allocate+0x221)[0x2b1aeb98ea01]
> ./test_malloc[0x4023ca]
> ./test_malloc[0x401f7a]
> ./test_malloc[0x401ee8]
> ./test_malloc[0x401dfc]
> ./test_malloc[0x401f7a]
> ./test_malloc[0x401ee8]
> ./test_malloc[0x401dfc]
>
> /home/jhammond/MPI/gcc482-mv2trunk/lib/libmpich.so.12(PMPI_Win_allocate+0x221)[0x2b8d22c26a01]
> ./test_malloc[0x4023ca]
> ./test_malloc[0x401f7a]
> ./test_malloc[0x401ee8]
> ./test_malloc[0x401dfc]
> /lib64/libc.so.6(__libc_start_main+0xfd)[0x3b6ea1ed1d]
> ./test_malloc[0x401bc9]
> /lib64/libc.so.6(__libc_start_main+0xfd)[0x3b6ea1ed1d]
> ./test_malloc[0x401bc9]
> /lib64/libc.so.6(__libc_start_main+0xfd)[0x3b6ea1ed1d]
> ./test_malloc[0x401bc9]
>
>
> ===================================================================================
> = BAD TERMINATION OF ONE OF YOUR APPLICATION PROCESSES
> = PID 116456 RUNNING AT blogin2
> = EXIT CODE: 1
> = CLEANING UP REMAINING PROCESSES
> = YOU CAN IGNORE THE BELOW CLEANUP MESSAGES
>
> ===================================================================================
>
>
> [jhammond at blogin2 tests]$ export MV2_SHMEM_COLL_NUM_COMM=10
> [jhammond at blogin2 tests]$ /home/jhammond/MPI/gcc482-mv2trunk/bin/mpiexec -n 4 ./test_malloc
> Starting ARMCI memory allocation test with 4 processes
> + allocation 0
> + allocation 1
> + allocation 2
> + allocation 3
> + allocation 4
> + allocation 5
> + allocation 6
> + allocation 7
> + allocation 8
> + allocation 9
> + allocation 10
>
> test_malloc:116472 terminated with signal 11 at PC=2ba320df5a0e
> SP=7fff1d1d1c10. Backtrace:
>
> test_malloc:116473 terminated with signal 11 at PC=2adc34339a0e
> SP=7fff87299800. Backtrace:
>
> test_malloc:116474 terminated with signal 11 at PC=2ba62c9dca0e
> SP=7fff5457a550. Backtrace:
>
> test_malloc:116475 terminated with signal 11 at PC=2b6069991a0e
> SP=7ffff4ba8b70. Backtrace:
>
> /home/jhammond/MPI/gcc482-mv2trunk/lib/libmpich.so.12(+0xb9a0e)[0x2b6069991a0e]
>
> /home/jhammond/MPI/gcc482-mv2trunk/lib/libmpich.so.12(+0xb9a0e)[0x2ba320df5a0e]
>
> /home/jhammond/MPI/gcc482-mv2trunk/lib/libmpich.so.12(+0xb9a0e)[0x2adc34339a0e]
>
> /home/jhammond/MPI/gcc482-mv2trunk/lib/libmpich.so.12(+0xb9a0e)[0x2ba62c9dca0e]
>
> /home/jhammond/MPI/gcc482-mv2trunk/lib/libmpich.so.12(MPIDI_CH3U_Win_allocate+0x79)[0x2b6069982869]
>
> /home/jhammond/MPI/gcc482-mv2trunk/lib/libmpich.so.12(MPID_Win_allocate+0x9f)[0x2b60699894ef]
>
> /home/jhammond/MPI/gcc482-mv2trunk/lib/libmpich.so.12(MPIDI_CH3U_Win_allocate+0x79)[0x2ba320de6869]
>
> /home/jhammond/MPI/gcc482-mv2trunk/lib/libmpich.so.12(MPIDI_CH3U_Win_allocate+0x79)[0x2adc3432a869]
>
> /home/jhammond/MPI/gcc482-mv2trunk/lib/libmpich.so.12(MPIDI_CH3U_Win_allocate+0x79)[0x2ba62c9cd869]
>
> /home/jhammond/MPI/gcc482-mv2trunk/lib/libmpich.so.12(PMPI_Win_allocate+0x221)[0x2b6069a71a01]
> ./test_malloc[0x4023ca]
> ./test_malloc[0x401f7a]
> ./test_malloc[0x401ee8]
> ./test_malloc[0x401dfc]
> /lib64/libc.so.6(__libc_start_main+0xfd)[0x3b6ea1ed1d]
> ./test_malloc[0x401bc9]
>
> /home/jhammond/MPI/gcc482-mv2trunk/lib/libmpich.so.12(MPID_Win_allocate+0x9f)[0x2ba62c9d44ef]
>
> /home/jhammond/MPI/gcc482-mv2trunk/lib/libmpich.so.12(MPID_Win_allocate+0x9f)[0x2ba320ded4ef]
>
> /home/jhammond/MPI/gcc482-mv2trunk/lib/libmpich.so.12(MPID_Win_allocate+0x9f)[0x2adc343314ef]
>
> /home/jhammond/MPI/gcc482-mv2trunk/lib/libmpich.so.12(PMPI_Win_allocate+0x221)[0x2ba320ed5a01]
> ./test_malloc[0x4023ca]
> ./test_malloc[0x401f7a]
> ./test_malloc[0x401ee8]
> ./test_malloc[0x401dfc]
>
> /home/jhammond/MPI/gcc482-mv2trunk/lib/libmpich.so.12(PMPI_Win_allocate+0x221)[0x2adc34419a01]
> ./test_malloc[0x4023ca]
> ./test_malloc[0x401f7a]
> ./test_malloc[0x401ee8]
> ./test_malloc[0x401dfc]
>
> /home/jhammond/MPI/gcc482-mv2trunk/lib/libmpich.so.12(PMPI_Win_allocate+0x221)[0x2ba62cabca01]
> ./test_malloc[0x4023ca]
> ./test_malloc[0x401f7a]
> ./test_malloc[0x401ee8]
> ./test_malloc[0x401dfc]
> /lib64/libc.so.6(__libc_start_main+0xfd)[0x3b6ea1ed1d]
> ./test_malloc[0x401bc9]
> /lib64/libc.so.6(__libc_start_main+0xfd)[0x3b6ea1ed1d]
> ./test_malloc[0x401bc9]
> /lib64/libc.so.6(__libc_start_main+0xfd)[0x3b6ea1ed1d]
> ./test_malloc[0x401bc9]
>
>
> ===================================================================================
> = BAD TERMINATION OF ONE OF YOUR APPLICATION PROCESSES
> = PID 116475 RUNNING AT blogin2
> = EXIT CODE: 1
> = CLEANING UP REMAINING PROCESSES
> = YOU CAN IGNORE THE BELOW CLEANUP MESSAGES
>
> ===================================================================================
>
> --
> Jeff Hammond
> jeff.science at gmail.com
>
> _______________________________________________
> mvapich-discuss mailing list
> mvapich-discuss at cse.ohio-state.edu
> http://mailman.cse.ohio-state.edu/mailman/listinfo/mvapich-discuss
>
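For anyone comparing runs like the ones quoted above, here is a minimal sketch of a helper that counts how many "+ allocation N" lines a single run printed before the first "terminated with signal" line. The function name summarize_run and the regular expressions are hypothetical and written only for this illustration; the line formats are taken from the pasted output, and nothing here is part of MVAPICH2 or ARMCI-MPI.

```python
import re

# Line formats are assumptions based on the pasted test_malloc output;
# an optional leading "> " quote prefix is tolerated.
ALLOC_RE = re.compile(r"^\s*(?:>\s*)?\+ allocation (\d+)\s*$")
CRASH_RE = re.compile(r"terminated with signal (\d+)")


def summarize_run(log_text):
    """Return (allocation_lines_seen, crashed) for the output of one run.

    Counting stops at the first crash line, so interleaved backtraces
    from the other ranks are ignored.
    """
    allocations = 0
    for line in log_text.splitlines():
        if CRASH_RE.search(line):
            return allocations, True
        if ALLOC_RE.match(line):
            allocations += 1
    return allocations, False


if __name__ == "__main__":
    # Hypothetical sample shaped like the MV2_SHMEM_COLL_NUM_COMM=10 run.
    sample = "\n".join(
        ["> + allocation %d" % i for i in range(11)]
        + ["> test_malloc:116472 terminated with signal 11 at PC=2ba320df5a0e"]
    )
    print(summarize_run(sample))
```

Applied to each quoted run separately, this counts 100 allocation lines and no crash with MV2_SHMEM_COLL_NUM_COMM=1000, 100 allocation lines before the crash with 100, and 11 before the crash with 10. Whether test_malloc prints each line before or after the corresponding window is created is not stated in the thread, so the counts only bound where the failure occurs.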