Channel: Active questions tagged ubuntu - Stack Overflow

OpenMP does not offload when compiled with clang from inside docker image

I have a toy program that I use just to check whether OpenMP offloading works from inside my Docker containers. The Dockerfile is the following:

FROM nvidia/cuda:12.6.0-devel-ubuntu24.04

# Set environment variables
ENV CC=gcc
ENV CXX=g++

# Install dependencies
RUN apt-get update
RUN apt-get install -y build-essential
RUN apt-get install -y gcc-offload-nvptx
RUN apt-get install -y gfortran
RUN apt-get install -y libopenmpi-dev
RUN apt-get install -y openmpi-bin
RUN apt-get install -y cppcheck
RUN apt-get install -y clang-tidy-18
RUN apt-get install -y clang-format-18
RUN apt-get install -y fftw2
RUN apt-get install -y fftw-dev
RUN apt-get install -y pkg-config
RUN apt-get install -y valgrind
RUN apt-get install -y wget
RUN apt-get install -y cmake
RUN apt-get install -y python3
RUN apt-get install -y python3-pip
RUN apt-get install -y git
RUN apt-get install -y libomp-dev
RUN apt-get install -y libc++-18-dev
RUN apt-get install -y libc++abi-18-dev
ENV LD_LIBRARY_PATH="/usr/lib/llvm-18/lib:$LD_LIBRARY_PATH"

# Clean up
RUN apt-get clean
RUN rm -rf /var/lib/apt/lists/*

# Install Miniconda
RUN wget https://repo.anaconda.com/miniconda/Miniconda3-latest-Linux-x86_64.sh \
    && bash Miniconda3-latest-Linux-x86_64.sh -b

# Set work directory
WORKDIR /app

# Copy application code
COPY . /app

# Set the PATH to include Miniconda
ENV PATH="/root/miniconda3/bin:$PATH"

# Initialize conda
RUN conda init

# Install Python dependencies
RUN conda install -c conda-forge --yes --file scripts/requirements.txt

So, I use the image provided by NVIDIA. My personal computer has an RTX 4060 (compute capability 8.9), and building my toy code with both gcc and clang works and runs normally, as expected. Here are some snippets:

void ipsum::computation(lorem &lor) {
    int *lx = lor.x;
    int *ly = lor.y;
    int *lz = z;
    size_t _sz = sz;
    #pragma omp target teams distribute parallel for simd
    for (size_t i = 0; i < _sz; i++) {
        ly[i] = lx[i] + lz[i];
    }
    std::swap<int*>(lor.x, lor.y);
}

It's a simple piece of code that updates one vector by summing another into it; as I said, it's a toy.

The problem arises when I push this image to Docker Hub and then pull it on another machine. That machine has an RTX A4500 and a GTX 980 Ti. It has CUDA version 12.8, but I have found here that this is not an issue: nvidia-smi reports driver 12.8, while the runtime is 12.6 because of the Docker image. On this machine the code compiles and runs fine with gcc, but when I compile with clang and try to run it, it doesn't run on any GPU at all. With OMP_TARGET_OFFLOAD=MANDATORY set, this error comes up:

omptarget error: Consult https://openmp.llvm.org/design/Runtimes.html for debugging options.
omptarget error: No images found compatible with the installed hardware. Found 1 image(s): (sm_86)
ipsum.cpp:13:5: omptarget fatal error 1: failure of target construct while offloading is mandatory
Aborted (core dumped)
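To get more detail out of the runtime before the abort, I enable the LLVM offload runtime's verbose logging. A minimal sketch (LIBOMPTARGET_INFO is the standard libomptarget diagnostic variable; the exact messages it prints depend on the runtime build):

```shell
# Ask libomptarget to report plugin and device discovery during startup.
export OMP_TARGET_OFFLOAD=MANDATORY
export LIBOMPTARGET_INFO=16
echo "offload env: OMP_TARGET_OFFLOAD=$OMP_TARGET_OFFLOAD LIBOMPTARGET_INFO=$LIBOMPTARGET_INFO"
# then run the program, e.g.: ./ipsum
```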

I compiled with these flags in clang:

-Wall -Wextra -fopenmp -fopenmp-targets=nvptx64 -g -stdlib=libc++
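For reference, an explicit-architecture variant of that invocation would look like the following. This is a sketch, not what I ran: --offload-arch is clang's flag for selecting the NVPTX offload target, and sm_86 matches the RTX A4500.

```shell
clang++ -Wall -Wextra -fopenmp --offload-arch=sm_86 -g -stdlib=libc++ ipsum.cpp -o ipsum
```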

This error is pretty strange to me because the RTX A4500 has compute capability 8.6, and gcc offloads to it just fine. What is wrong with clang? I have checked that libomptarget-nvptx-sm_86.bc is present in my container. Additionally, omp_get_num_devices() returns zero.
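The zero-device result can be reproduced with a minimal probe, independent of the offload kernel itself. A sketch assuming only the standard OpenMP API (it also builds without -fopenmp, in which case it trivially reports zero devices):

```cpp
#include <cstdio>

#ifdef _OPENMP
#include <omp.h>
#endif

// Returns the number of offload devices the OpenMP runtime can see
// (0 when compiled without -fopenmp, or when no plugin matches the GPU).
int visible_devices() {
#ifdef _OPENMP
    return omp_get_num_devices();
#else
    return 0;
#endif
}

int main() {
    std::printf("num_devices=%d\n", visible_devices());
    return 0;
}
```

When clang's offload setup is broken, this prints num_devices=0 even though the hardware is present; under gcc on the same machine it reports the GPUs.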

