I'm making my first attempts at building a .so full of CUDA routines. I have a matrix_vector_mult.cu file which currently does nothing:
#include <stdio.h>

extern "C"
double *matrix_vector_mult(const double ** const M,
                           const double * const v,
                           const size_t num_rows,
                           const size_t num_cols)
{
    printf("Hello!\n");
    double * p = (double *) malloc(num_rows*sizeof(double));
    return p;
}
I also have a makefile, whose contents are as follows:
CC := clang
UNAME := $(shell uname -s)

ifeq ($(UNAME), Darwin)
CUDA_PATH := /Developer/NVIDIA/CUDA-6.5
CUDA_LIB := ${CUDA_PATH}/lib
endif
ifeq ($(UNAME), Linux)
CUDA_PATH := /usr/local/cuda-6.5
CUDA_LIB := ${CUDA_PATH}/lib64
endif

LIBS := -L ${CUDA_LIB} -lcudart -lcudadevrt
NVCC := ${CUDA_PATH}/bin/nvcc -ccbin ${CC}
CFLAGS := -g -std=c11 -Wextra -Wall -I include -rpath ${CUDA_LIB}
NVCCFLAGS := -g -m64 -D__STRICT_ANSI__

vpath %.cu src
vpath %.h include

all: matrix_vector_mult.o
	${CC} ${CFLAGS} -o matrix_vector_mult.so -shared -fPIC $^ ${LIBS}

matrix_vector_mult.o: matrix_vector_mult.cu
	${NVCC} ${NVCCFLAGS} -o $@ -c $^

clean:
	rm -f *.o *.so *.pyc
On Mac, this compiles just fine. However, on my Ubuntu box, I get the error message:
/usr/bin/ld: matrix_vector_mult.o: relocation R_X86_64_32S against `.rodata.str1.1' can not be used when making a shared object; recompile with -fPIC
matrix_vector_mult.o: error adding symbols: Bad value
What could be the problem? (Adding -fPIC to the compile line doesn't work.) nvcc --version yields identical information on both boxes; clang --version gives
Apple LLVM version 6.0 (clang-600.0.56) (based on LLVM 3.5svn)
on the Mac, and
Ubuntu clang version 3.4-1ubuntu3 (tags/RELEASE_34/final) (based on LLVM 3.4)
on the Ubuntu box. I somewhat doubt the slight LLVM version difference is the problem; I suspect the makefile, which I already regard as a mess. Any help appreciated.
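One variation that may be worth trying is sketched below. It assumes that the earlier -fPIC attempt passed the flag to nvcc directly, which the question does not confirm. nvcc forwards options to the host compiler through -Xcompiler, so the object rule could be adjusted as follows. This is only a sketch reusing the variables from the makefile above, and whether it resolves the relocation error on the Ubuntu box is untested.

```makefile
# Sketch only: reuses NVCC and the object rule from the makefile above.
# -Xcompiler hands the option that follows it to the host compiler
# (clang here, via -ccbin), so the host-side code in the object file
# is compiled as position-independent code.
NVCCFLAGS := -g -m64 -D__STRICT_ANSI__ -Xcompiler -fPIC

matrix_vector_mult.o: matrix_vector_mult.cu
	${NVCC} ${NVCCFLAGS} -o $@ -c $^
```

The -fPIC on the link line in the all target only affects the link step, so the object file itself also has to be compiled as position-independent code for the linker to accept it.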