[mvapich-discuss] Announcing the Release of MVAPICH2 1.8RC1 and OSU Micro-Benchmarks (OMB) 3.5.2

Devendar Bureddy bureddy at cse.ohio-state.edu
Thu Mar 22 21:10:04 EDT 2012


Hi Jens,

Thank you for letting us know about this issue.  This is a corner case
we missed in heterogeneous GPU configurations.  The issue should occur
only when running with 3 processes in your configuration (GPU0 on
IOH1, GPU1 & GPU2 on IOH2).  The attached patch should fix it.  Could
you please try the patch and let us know whether it works for you?

Please follow the instructions below to apply the patch, then rebuild
and reinstall MVAPICH2 before re-running your test.

$ tar xf mvapich2-1.8rc1.tar.gz
$ cd mvapich2-1.8rc1
$ patch -p1 < diff.patch
patching file src/mpid/ch3/channels/mrail/src/gen2/ibv_cuda_util.c
$

Thanks
Devendar

On Thu, Mar 22, 2012 at 3:14 PM, Jens Glaser <jglaser at umn.edu> wrote:
> Hi,
>
> I am having trouble using the new version of MVAPICH2 with CUDA support.
>
> I am running on a host with 3 GPUs connected to two I/O hubs (GPU0 to IOH1, GPU1 & GPU2 to IOH2), and MPI_Init hangs on this system when I run the test with mpirun -np 3.
>
> Details:
>
> Configure line:
>
> ./configure --prefix=/nics/d/home/jglaser/mpich2-install --enable-cuda --with-cuda-include=/sw/keeneland/cuda/4.1/linux_binary/include/ --with-cuda-libpath=/sw/keeneland/cuda/4.1/linux_binary/lib64 --enable-shared --with-ib-libpath=/usr/lib64/
>
> Test program:
> ================
> #include <mpi.h>
> #include <cuda_runtime.h>
> #include <stdlib.h>
> #include <stdio.h>   /* for printf */
>
> int main(int argc, char ** argv)
>    {
>    /* Select the GPU for this process from the node-local rank that
>       the launcher exports in the environment before MPI_Init is called. */
>    cudaSetDevice(atoi(getenv("MV2_COMM_WORLD_LOCAL_RANK")));
>    printf("before init\n");
>    MPI_Init(&argc,&argv);
>    printf("after init\n");
>    MPI_Finalize();
>    printf("after finalize\n");
>    return 0;
>    }
> ================
>
> Compile with NVCC and appropriate options (obtained from mpicc -show)
>
> Test program output
>
> mpirun -np 3 ./mpitest
> before init
> before init
> before init
> Ctrl-C caught... cleaning up processes
> (it hangs)
>
> It works with two GPUs:
> mpirun -np 2 ./mpitest
> before init
> before init
> after init
> after init
> after finalize
> after finalize
>
> The previous version of MVAPICH2 (1.8a2) worked without problems.
>
> Any ideas?
>
> Thanks,
>
> Jens
>
> On Mar 22, 2012, at 12:21 PM, Dhabaleswar Panda wrote:
>
>> The MVAPICH team is pleased to announce the release of MVAPICH2 1.8RC1
>> and OSU Micro-Benchmarks (OMB) 3.5.2.
>>
>> Features, Enhancements, and Bug Fixes for MVAPICH2 1.8RC1 are listed
>> here.
>>
>> * New Features and Enhancements (since 1.8a2):
>>
>>    - New design for intra-node communication from GPU Device buffers
>>      using CUDA IPC for better performance and correctness
>>        - Thanks to Joel Scherpelz from NVIDIA for his suggestions
>>    - Enabled shared memory communication for host transfers when CUDA is
>>      enabled
>>    - Optimized and tuned collectives for GPU device buffers
>>    - Enhanced pipelined inter-node device transfers
>>    - Enhanced shared memory design for GPU device transfers for
>>      large messages
>>    - Enhanced support for CPU binding with socket and numanode level
>>      granularity
>>    - Support suspend/resume functionality with mpirun_rsh
>>    - Exporting local rank, local size, global rank and global size
>>      through environment variables (both mpirun_rsh and hydra); see
>>      the code sketch after this list
>>    - Update to hwloc v1.4
>>    - Checkpoint-Restart support in OFA-IB-Nemesis interface
>>    - Enabling run-through stabilization support to handle process
>>      failures in OFA-IB-Nemesis interface
>>    - Enhancing OFA-IB-Nemesis interface to handle IB errors gracefully
>>    - Performance tuning on various platforms
>>    - Support for Mellanox IB FDR adapter
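>>
>> As a minimal illustration of the exported variables, here is a sketch of
>> selecting a GPU from the node-local rank before calling MPI_Init; the
>> fallback to device 0 when the variable is unset is an assumption:
>>
>> #include <mpi.h>
>> #include <cuda_runtime.h>
>> #include <stdlib.h>
>>
>> int main(int argc, char **argv)
>> {
>>     /* mpirun_rsh and hydra export the node-local rank in the
>>        environment before the application starts. */
>>     const char *lrank = getenv("MV2_COMM_WORLD_LOCAL_RANK");
>>
>>     /* Bind this process to one GPU per local rank; assume device 0
>>        if the variable is not set. */
>>     cudaSetDevice(lrank ? atoi(lrank) : 0);
>>
>>     MPI_Init(&argc, &argv);
>>     /* ... application work on the selected device ... */
>>     MPI_Finalize();
>>     return 0;
>> }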
>>
>> * Bug Fixes (since 1.8a2):
>>
>>    - Fix a hang issue on InfiniHost SDR/DDR cards
>>        - Thanks to Nirmal Seenu from Fermilab for the report
>>    - Fix an issue with runtime parameter MV2_USE_COALESCE usage
>>    - Fix an issue with LiMIC2 when CUDA is enabled
>>    - Fix an issue with intra-node communication using datatypes and GPU
>>      device buffers
>>    - Fix an issue with Dynamic Process Management when launching
>>      processes on multiple nodes
>>        - Thanks to Rutger Hofman from VU Amsterdam for the report
>>    - Fix build issue in hwloc source with mcmodel=medium flags
>>        - Thanks to Nirmal Seenu from Fermilab for the report
>>    - Fix a build issue in hwloc with --disable-shared or
>>      --disable-static options
>>    - Use portable stdout and stderr redirection
>>        - Thanks to Dr. Axel Philipp from MTU Aero Engines for the patch
>>    - Fix a build issue with PGI 12.2
>>        - Thanks to Thomas Rothrock from U.S. Army SMDC for the patch
>>    - Fix an issue with send message queue in OFA-IB-Nemesis interface
>>    - Fix a process cleanup issue in Hydra when MPI_ABORT is called
>>      (upstream MPICH2 patch)
>>    - Fix an issue with non-contiguous datatypes in MPI_Gather
>>    - Fix a few memory leaks and warnings
>>
>> The bug fix for OSU Micro-Benchmarks (OMB) 3.5.2 is listed here.
>>
>> * Bug Fix (since OMB 3.5.1):
>>  - Fix typo which led to use of incorrect buffers
>>
>> The complete set of features and enhancements in MVAPICH2 1.8RC1 compared
>> to MVAPICH2 1.7 is as follows:
>>
>> * Features & Enhancements:
>>    - Support for MPI communication from NVIDIA GPU device memory
>>      (illustrated by the code sketch after this list)
>>        - High performance RDMA-based inter-node point-to-point
>>          communication (GPU-GPU, GPU-Host and Host-GPU)
>>        - High performance intra-node point-to-point communication for
>>          multi-GPU adapters/node (GPU-GPU, GPU-Host and Host-GPU)
>>        - Taking advantage of CUDA IPC (available in CUDA 4.1) in
>>          intra-node communication for multiple GPU adapters/node
>>        - Optimized and tuned collectives for GPU device buffers
>>        - MPI datatype support for point-to-point and collective
>>          communication from GPU device buffers
>>    - Support suspend/resume functionality with mpirun_rsh
>>    - Enhanced support for CPU binding with socket and numanode level
>>      granularity
>>    - Exporting local rank, local size, global rank and global size
>>      through environment variables (both mpirun_rsh and hydra)
>>    - Update to hwloc v1.4
>>    - Checkpoint-Restart support in OFA-IB-Nemesis interface
>>    - Enabling run-through stabilization support to handle process
>>      failures in OFA-IB-Nemesis interface
>>    - Enhancing OFA-IB-Nemesis interface to handle IB errors gracefully
>>    - Performance tuning on various architecture clusters
>>    - Support for Mellanox IB FDR adapter
>>    - Adjust shared-memory communication block size at runtime
>>    - Enable XRC by default at configure time
>>    - New shared memory design for enhanced intra-node small message
>>      performance
>>    - Tuned inter-node and intra-node performance on different cluster
>>      architectures
>>    - Support for fallback to R3 rendezvous protocol if RGET fails
>>    - SLURM integration with mpiexec.mpirun_rsh to use SLURM allocated
>>      hosts without specifying a hostfile
>>    - Support added to automatically use PBS_NODEFILE in Torque and PBS
>>      environments
>>    - Enable signal-triggered (SIGUSR2) migration
>>    - Reduced memory footprint of the library
>>    - Enhanced one-sided communication design with reduced memory
>>      requirement
>>    - Enhancements and tuned collectives (Bcast and Alltoallv)
>>    - Flexible HCA selection with Nemesis interface
>>        - Thanks to Grigori Inozemtsev, Queens University
>>    - Support iWARP interoperability between Intel NE020 and
>>      Chelsio T4 Adapters
>>    - The environment variable used to enable RoCE has been renamed from
>>      MV2_USE_RDMAOE to MV2_USE_RoCE
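>>
>> As a minimal illustration of communication from GPU device memory, here
>> is a sketch that sends directly from a device buffer between two ranks;
>> it assumes a CUDA-enabled build with MV2_USE_CUDA=1 set at run time, and
>> the buffer name and message size are only illustrative:
>>
>> #include <mpi.h>
>> #include <cuda_runtime.h>
>>
>> int main(int argc, char **argv)
>> {
>>     int rank;
>>     const int n = 1024;
>>     float *d_buf;
>>
>>     MPI_Init(&argc, &argv);
>>     MPI_Comm_rank(MPI_COMM_WORLD, &rank);
>>
>>     /* Allocate the message buffer in GPU device memory; the device
>>        pointer is passed directly to the MPI calls. */
>>     cudaMalloc((void **)&d_buf, n * sizeof(float));
>>
>>     if (rank == 0)
>>         MPI_Send(d_buf, n, MPI_FLOAT, 1, 0, MPI_COMM_WORLD);
>>     else if (rank == 1)
>>         MPI_Recv(d_buf, n, MPI_FLOAT, 0, 0, MPI_COMM_WORLD,
>>                  MPI_STATUS_IGNORE);
>>
>>     cudaFree(d_buf);
>>     MPI_Finalize();
>>     return 0;
>> }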
>>
>> Sample performance numbers for MPI communication from NVIDIA GPU memory
>> using MVAPICH2 1.8RC1 and OMB 3.5.2 can be obtained from the following
>> URL:
>>
>> http://mvapich.cse.ohio-state.edu/performance/gpu.shtml
>>
>> For downloading MVAPICH2 1.8RC1, OMB 3.5.2, associated user guide, quick
>> start guide, and accessing the SVN, please visit the following URL:
>>
>> http://mvapich.cse.ohio-state.edu
>>
>> All questions, feedback, bug reports, hints for performance tuning,
>> patches, and enhancements are welcome. Please post them to the
>> mvapich-discuss mailing list (mvapich-discuss at cse.ohio-state.edu).
>>
>> We are also happy to report that the number of downloads from the MVAPICH
>> project site has crossed 100,000. The MVAPICH team extends its thanks to
>> all MVAPICH/MVAPICH2 users and their organizations.
>>
>> Thanks,
>>
>> The MVAPICH Team
>>
>>



-- 
Devendar
-------------- next part --------------
A non-text attachment was scrubbed...
Name: diff.patch
Type: text/x-patch
Size: 1818 bytes
Desc: not available
Url : http://mail.cse.ohio-state.edu/pipermail/mvapich-discuss/attachments/20120322/3388809e/diff.bin

