OpenMPI UCX GDRCopy nv_peer_mem

This document describes how to compile the following components:

  • nv_peer_mem (required for multi-node)

  • gdrcopy (required for multi-node)

  • ucx (required for multi-node)

  • openmpi

Tip

Consider using the NVIDIA HPC-X package and its documentation instead of compiling each of these components individually. See NVidia GPU/Infiniband HPC Setup.

These instructions are intended as general guidelines.

NV PEER MEM

References –

Build packages:

wget https://www.mellanox.com/sites/default/files/downloads/ofed/nvidia-peer-memory_1.1.tar.gz
tar xzf nvidia-peer-memory_1.1.tar.gz
cd nvidia_peer_memory-1.1
./build_module.sh

# Expected output -- Note location of built packages
#
#   Building source rpm for nvidia_peer_memory...
#   Building debian tarball for nvidia-peer-memory...
#   Built: /tmp/nvidia_peer_memory-1.1-0.src.rpm
#   Built: /tmp/nvidia-peer-memory_1.1.orig.tar.gz

Install on RPM based OS:

rpmbuild --rebuild /tmp/nvidia_peer_memory-1.1-0.src.rpm
rpm -ivh  <path to generated binary rpm file>

Install on DEB based OS:

cd /tmp
tar xzf /tmp/nvidia-peer-memory_1.1.orig.tar.gz
cd nvidia-peer-memory-1.1
dpkg-buildpackage -us -uc
dpkg -i <path to generated deb files>

Check kernel module startup:

# Check kernel module status
service nv_peer_mem status

# alternative command
lsmod | grep nv_peer_mem

# Start the kernel module if it is not already loaded
service nv_peer_mem start

# alternative command
modprobe nv_peer_mem
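
The status and start commands above can be combined into a small helper, for example in a node provisioning script. This is a minimal sketch and not part of the nv_peer_mem package: the function name module_loaded and its optional second argument (a file in /proc/modules format, which makes the helper testable without the module) are assumptions made for illustration.

```shell
# Hypothetical helper: succeed if a kernel module appears in the module list.
# $1 = module name
# $2 = module list file in /proc/modules format (defaults to /proc/modules)
module_loaded() {
    awk -v m="$1" '$1 == m { found = 1 } END { exit !found }' "${2:-/proc/modules}"
}

# Example use (loading the module requires root):
#   module_loaded nv_peer_mem || modprobe nv_peer_mem
```

Matching on the first field avoids false positives from modules whose names merely contain the string, which a plain grep on lsmod output would accept.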

GDR COPY

Reference – https://github.com/NVIDIA/gdrcopy

Install on RPM based OS:

# dkms can be installed from epel-release. See https://fedoraproject.org/wiki/EPEL.
sudo yum groupinstall 'Development Tools'
sudo yum install dkms rpm-build make check check-devel subunit subunit-devel

# Save under the name expected by the tar command below
wget -O gdrcopy-2.2.tar.gz https://github.com/NVIDIA/gdrcopy/archive/refs/tags/v2.2.tar.gz
tar xzf gdrcopy-2.2.tar.gz
cd gdrcopy-2.2/packages

# Specify the absolute path to the CUDA 11.1 top-level directory, e.g. /usr/local/cuda-11.1
CUDA=<cuda-install-top-dir> ./build-rpm-packages.sh
sudo rpm -Uvh gdrcopy-kmod-<version>.<platform>.rpm
sudo rpm -Uvh gdrcopy-<version>.<platform>.rpm
sudo rpm -Uvh gdrcopy-devel-<version>.<platform>.rpm

Install on DEB based OS:

sudo apt install build-essential devscripts debhelper libsubunit0 check libsubunit-dev fakeroot pkg-config dkms

wget -O gdrcopy-2.2.tar.gz https://github.com/NVIDIA/gdrcopy/archive/refs/tags/v2.2.tar.gz
tar xzf gdrcopy-2.2.tar.gz
cd gdrcopy-2.2/packages

# Specify the absolute path to the CUDA 11.1 top-level directory, e.g. /usr/local/cuda-11.1
CUDA=<cuda-install-top-dir> ./build-deb-packages.sh
sudo dpkg -i gdrdrv-dkms_<version>_<platform>.deb
sudo dpkg -i gdrcopy_<version>_<platform>.deb
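
After installing the packages on either OS family, it can be useful to confirm that the gdrdrv kernel driver is active before building UCX against GDRCopy. The sketch below checks for the /dev/gdrdrv character device that the driver creates. The function name gdrdrv_ready and its optional path argument are assumptions made for illustration, not part of GDRCopy.

```shell
# Hypothetical helper: succeed if the gdrdrv character device exists.
# $1 = device path (defaults to /dev/gdrdrv)
gdrdrv_ready() {
    [ -c "${1:-/dev/gdrdrv}" ]
}

# Example use:
#   gdrdrv_ready || echo "gdrdrv device not found; is the gdrdrv module loaded?" >&2
```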

UCX

Using version 1.10.x or higher is recommended.

git clone https://github.com/openucx/ucx.git
cd ucx
git checkout v1.10.x
./autogen.sh

mkdir build
cd build
../contrib/configure-release \
    --prefix=<ucx-install-path> \
    --with-xpmem=</path/to/xpmem> \
    --with-cuda=<cuda/runtime/install/path> \
    --with-gdrcopy=<gdr_copy/install/path> \
    --enable-mt

make
make install
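
Once UCX is installed, the ucx_info tool from the install prefix can list the transports that were built. The sketch below filters the output of ucx_info -d for a given transport name such as cuda_copy or gdr_copy. The function name ucx_has_transport is hypothetical, and the "Transport:" line format it matches is an assumption about the ucx_info output that may need adjusting for other UCX versions.

```shell
# Hypothetical helper: read `ucx_info -d` output on stdin and succeed if the
# named transport is listed on a "Transport:" line.
# $1 = transport name, e.g. cuda_copy or gdr_copy
ucx_has_transport() {
    grep -Eq "Transport:[[:space:]]*$1([[:space:]]|\$)"
}

# Example use:
#   <ucx-install-path>/bin/ucx_info -v
#   <ucx-install-path>/bin/ucx_info -d | ucx_has_transport gdr_copy
```

If gdr_copy is missing, a likely cause is that the --with-gdrcopy path passed to configure was wrong or that the gdrdrv driver was not loaded.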

OpenMPI

Using OpenMPI 3.1.3 through 4.1.1 is recommended.

Download

wget https://download.open-mpi.org/release/open-mpi/v4.1/openmpi-4.1.1.tar.gz
tar xzf openmpi-4.1.1.tar.gz
cd openmpi-4.1.1

mkdir build
cd build

Configure with UCX

Compile OpenMPI with UCX and CUDA support. This configuration is generally used for HPC clusters and servers where inter-node communication over UCX/GDRCopy is required.

../configure --prefix=<openmpi-install-path> \
             --enable-mca-no-build=btl-uct \
             --with-pmix=internal \
             --with-cuda=<cuda-install-top-dir> \
             --with-ucx=<ucx-install-path> \
             --enable-mpirun-prefix-by-default

Or simple configuration

Compile OpenMPI with CUDA support only. This configuration is generally used for workstations and single nodes.

../configure --prefix=<openmpi-install-path> \
             --with-cuda=<cuda-install-top-dir> \
             --enable-mpirun-prefix-by-default

Compile/Install

make
make install
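
To confirm that the resulting OpenMPI build is CUDA-aware, the ompi_info tool reports an mpi_built_with_cuda_support parameter. The sketch below wraps that check. The function name ompi_has_cuda is hypothetical, and it reads the ompi_info output from stdin so it can be tried without an MPI installation.

```shell
# Hypothetical helper: read `ompi_info --parsable --all` output on stdin and
# succeed if Open MPI reports that it was built with CUDA support.
ompi_has_cuda() {
    grep -q 'mpi_built_with_cuda_support:value:true'
}

# Example use:
#   <openmpi-install-path>/bin/ompi_info --parsable --all | ompi_has_cuda \
#       || echo "OpenMPI was built without CUDA support" >&2
```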

Running M-Star

References:

# Set these variables if using UCX
export UCX_MEM_CUDA_HOOK_MODE=none
export UCX_MEMTYPE_CACHE=n

mpirun -np 8 \
    -x UCX_MEMTYPE_CACHE \
    -x UCX_MEM_CUDA_HOOK_MODE \
    mstar-cfd-mgpu -i input.xml -o out --gpu-auto --disable-ipc
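
For repeated runs, the environment settings and mpirun command above can be wrapped in a small function. This is a sketch under stated assumptions: the function name run_mstar, its three positional arguments, and the DRY_RUN variable are all hypothetical conveniences. Only the mpirun command line itself comes from the instructions above.

```shell
# Hypothetical wrapper around the mpirun command shown above.
# $1 = number of MPI processes, $2 = input file, $3 = output location
# Set DRY_RUN=1 to print the command instead of running it.
run_mstar() {
    # UCX settings from the instructions above
    export UCX_MEM_CUDA_HOOK_MODE=none
    export UCX_MEMTYPE_CACHE=n
    # Build the full command line in the positional parameters
    set -- mpirun -np "$1" \
        -x UCX_MEMTYPE_CACHE \
        -x UCX_MEM_CUDA_HOOK_MODE \
        mstar-cfd-mgpu -i "$2" -o "$3" --gpu-auto --disable-ipc
    if [ "${DRY_RUN:-0}" = 1 ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Example use:
#   DRY_RUN=1; run_mstar 8 input.xml out   # print the command only
```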