<html>
<head>
<meta http-equiv="content-type" content="text/html; charset=UTF-8">
</head>
<body>
The README in the OMB 5.8 tarball says the following:<br>
<blockquote><font face="monospace">ROCm, CUDA and OpenACC Extensions
to OMB</font><br>
<font face="monospace">----------------------------------------</font><br>
<font face="monospace">CUDA Extensions to OMB can be enable by
configuring the benchmark suite with</font><br>
<font face="monospace">--enable-cuda option as shown below. </font><br>
<br>
<font face="monospace"> ./configure CC=/path/to/mpicc </font><br>
<font face="monospace"> CXX=/path/to/mpicxx</font><br>
<font face="monospace"> --enable-cuda </font><br>
<font face="monospace">
--with-cuda-include=/path/to/cuda/include</font><br>
<font face="monospace">
--with-cuda-libpath=/path/to/cuda/lib</font><br>
<font face="monospace"> make</font><br>
<font face="monospace"> make install</font><br>
<br>
<font face="monospace"> .......</font><br>
<br>
<font face="monospace">Similarly, OpenACC Extensions can be
enabled by specifying the --enable-openacc</font><br>
<font face="monospace">option. The MPI library used should be
able to support MPI communication from</font><br>
<font face="monospace">buffers in GPU Device memory.</font><br>
<br>
<font face="monospace">The following benchmarks have been extended
to evaluate performance of</font><br>
<font face="monospace">MPI communication using buffers on AMD and
NVIDIA GPU devices.</font><br>
<br>
<font face="monospace"> osu_bibw - Bidirectional
Bandwidth Test</font><br>
<font face="monospace"> osu_bw - Bandwidth Test</font><br>
<font face="monospace"> osu_latency - Latency Test</font><br>
<font face="monospace"> osu_mbw_mr - Multiple Bandwidth
/ Message Rate Test</font><br>
<font face="monospace"> osu_multi_lat - Multi-pair Latency
Test</font><br>
<b><font face="monospace"> osu_latency_mt - Multi-threaded
Latency Test</font></b><b><br>
</b><b><font face="monospace"> osu_latency_mp -
Multi-process Latency Test</font></b><br>
<br>
<font face="monospace">......</font><br>
<br>
<b><font face="monospace">If both CUDA and OpenACC support is
enabled you can switch between the modes</font></b><b><br>
</b><font face="monospace"><b>using the -d [cuda|openacc] option
to the benchmarks.</b> If ROCm support is</font><br>
<font face="monospace">enabled, you need to use -d rocm option to
make the benchmarks use this feature.</font><br>
<font face="monospace">Whether a process allocates its
communication buffers on the GPU device or on</font><br>
<font face="monospace">the host can be controlled at run-time.
Use the -h option for more help.</font><br>
</blockquote>
However, the <font face="monospace">-d cuda</font> and <font
face="monospace">-d openacc</font> options are rejected by the <font
face="monospace">osu_latency_mp</font> and <font
face="monospace">osu_latency_mt</font> benchmarks:<br>
<blockquote><font face="monospace">Invalid option [-d]</font><br>
<font face="monospace">Usage: osu_latency_mp [options]<br>
.....<br>
</font></blockquote>
Also, the online documentation<br>
<blockquote><a moz-do-not-send="true"
href="https://mvapich.cse.ohio-state.edu/benchmarks/">https://mvapich.cse.ohio-state.edu/benchmarks/</a><br>
</blockquote>
includes <font face="monospace">osu_latency_mt</font> in its
list but not <font face="monospace">osu_latency_mp</font>. It
looks like the docs just need to be updated.<br>
<br>
</body>
</html>