[mvapich-discuss] Hang in free

Khuvis, Samuel skhuvis at osc.edu
Tue Mar 24 16:10:13 EDT 2020


Hi,

A code I am working on hangs with MVAPICH2 when run on multiple nodes, but not with Intel MPI or with MVAPICH2 on a single node. Based on the backtrace below, it appears to hang inside a call to free() made from an OpenMP worker thread. I found a thread from 2009 (http://mailman.cse.ohio-state.edu/pipermail/mvapich-discuss/2009-December/002654.html) that suggested setting MV2_USE_LAZY_MEM_UNREGISTER=0 at runtime. This seems to prevent the hang, but it could hurt performance. Is there a better solution for this issue? Let me know if you need any further information about the code.


#0  0x00002acec0eaa831 in find_and_free_dregs_inside () from /opt/mvapich2/intel/19.0/2.3.3/lib/libmpi.so.12
#1  0x00002acec0ed98bc in mvapich2_munmap () from /opt/mvapich2/intel/19.0/2.3.3/lib/libmpi.so.12
#2  0x00002acec0edaf19 in _int_free () from /opt/mvapich2/intel/19.0/2.3.3/lib/libmpi.so.12
#3  0x00002acec0edcb2a in free () from /opt/mvapich2/intel/19.0/2.3.3/lib/libmpi.so.12
#4  0x000000000044e7e1 in VerticalLineLocus_blunder (proinfo=0xbfe698910011c2ee, nccresult=0x2ace00000007, MagImages=0x2ace0011c2d6, DEM_resolution=0,
    im_resolution=6.9526732747253049e-310, RPCs=0x58ce2e, Imagesizes_ori=0x3fffe0, Imagesizes=0x40b1800000000000, Images=0x7ffcc8959168, Template_size=96 '`',
    Size_Grid2D=..., param=..., GridPts=0x7ffcc8959038, Grid_wgs=0x7ffcc8959068, GridPT3=0x7ffcc8959070, NumofIAparam=144 '\220', ImageAdjust=0x7ffcc8959078,
    Pyramid_step=152 '\230', Startpos=0x7ffcc8959020, save_filepath=0x7ffcc89590a0 "", tile_row=168 '\250', tile_col=176 '\260', iteration=184 '\270',
    bl_count=192 '\300', Boundary=0x7ffcc895903c, ori_images=0x7ffcc8959040, blunder_selected_level=-929722300, bblunder=200) at setsm_code.cpp:14584
#5  0x00002acec1c71d43 in __kmp_invoke_microtask () from /opt/intel/19.0.5/compilers_and_libraries_2019/linux/lib/intel64_lin/libiomp5.so
#6  0x00002acec1c0163f in __kmp_invoke_task_func (gtid=-1058373488) at ../../src/kmp_runtime.cpp:7426
#7  0x00002acec1c0065c in __kmp_launch_thread (this_thr=0x2acec0ea8090 <vma_compare_search>) at ../../src/kmp_runtime.cpp:6041
#8  0x00002acec1c722fb in _INTERNAL_26_______src_z_Linux_util_cpp_cabc1a3b::__kmp_launch_worker (thr=0x2acec0ea8090 <vma_compare_search>)
    at ../../src/z_Linux_util.cpp:586
#9  0x00002acec215edd5 in start_thread () from /lib64/libpthread.so.0
#10 0x00002acec247102d in clone () from /lib64/libc.so.6


Thanks,
Samuel Khuvis
Scientific Applications Engineer
Ohio Supercomputer Center (OSC) <https://osc.edu/>
A member of the Ohio Technology Consortium <https://oh-tech.org/>
1224 Kinnear Road, Columbus, Ohio 43212
Office: (614) 292-5178 • Fax: (614) 292-7168


