Dear Bryon,
yes, we have seen this problem before. As far as I can tell it is a CUDA bug that affects some GPUs. All the eddy_cuda executables are built as “fat” executables. That means each one contains instructions for all supported architectures, and at run time the instructions that the particular GPU can handle are selected. It seems that for a few GPUs the wrong instructions are selected, and that’s when you get the “cudaFuncGetAttributes: invalid device function” error message.
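As a rough illustration of what “fat” means in practice, the sketch below lists which architectures a binary carries. It assumes the text has come from NVIDIA's `cuobjdump --list-elf <executable>`, which names each embedded image with an `sm_XX` tag. The helper name and the sample listing are made up for the example and are not real eddy_cuda output.

```python
import re
from typing import List


def embedded_architectures(listing: str) -> List[int]:
    """Return the sorted, de-duplicated sm_XX numbers found in a cuobjdump ELF listing."""
    # Each embedded image is assumed to carry a tag such as "sm_60" in its name.
    found = {int(number) for number in re.findall(r"\bsm_(\d+)\b", listing)}
    return sorted(found)


# Hypothetical listing, for illustration only (not taken from eddy_cuda).
sample = """\
ELF file    1: example.1.sm_35.cubin
ELF file    2: example.2.sm_50.cubin
ELF file    3: example.3.sm_60.cubin
"""

archs = embedded_architectures(sample)
# A Tesla P100 has compute capability 6.0, so it needs an sm_60 image.
has_p100_code = 60 in archs
```

If `has_p100_code` were false for a real listing, the binary would carry no native code for the P100 at all, which would be a different situation from the wrong image being selected.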
The solution is to compile a non-fat binary that includes only the instructions that are relevant for your GPU. I will do that and let you know when I have posted the executables for the Tesla P100.
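For anyone wanting to try such a build themselves, a minimal sketch of the compiler option involved is given below. It assumes the build uses nvcc and its `-gencode` option, and the helper name is hypothetical. The actual eddy build system is not shown here and may pass its flags differently.

```python
def gencode_flag(major: int, minor: int) -> str:
    """Build an nvcc -gencode option targeting a single compute capability."""
    arch = f"{major}{minor}"
    # arch= names the virtual architecture, code= the real GPU code to embed.
    return f"-gencode arch=compute_{arch},code=sm_{arch}"


# The Tesla P100 reports compute capability 6.0.
p100_flag = gencode_flag(6, 0)
print(p100_flag)  # -gencode arch=compute_60,code=sm_60
```

Passing only this one `-gencode` option gives a binary with code for that single architecture. Note that nvcc accepts sm_60 as a target only from CUDA 8.0 onwards.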
Meanwhile, could you please install CUDA 8.0 and try the eddy_cuda8.0 executable? It would be very useful to know whether Nvidia has solved the problem, and we don’t have any problematic GPUs to test on.
Jesper
> On 31 Oct 2017, at 19:15, Bryon Mueller wrote:
>
> I am getting a core dump while trying to run the HCP dMRI pipeline, using the HCP 3.22 release, FSL 5.0.10, a newer card (Tesla P100, 12 GB, compute capability 6.0), on a server running RH7. The error I am getting is:
>
> thrust::system_error thrown in CudaVolume::common_assignment_from_newimage_vol after resize() with message: function_attributes(): after cudaFuncGetAttributes: invalid device function
> terminate called after throwing an instance of 'thrust::system::system_error'
> what(): function_attributes(): after cudaFuncGetAttributes: invalid device function
> /opt/local/hcp-pipelines-3.22.0/Pipelines-3.22.0/DiffusionPreprocessing/scripts/run_eddy.sh: line 380: 25767 Aborted (core dumped) ${eddy_command}
>
> The last few lines in the HCP .o file show the call to eddy_cuda and then a core dump:
>
> Mon Oct 30 18:54:49 CDT 2017 - run_eddy.sh - About to issue the following eddy command:
> Mon Oct 30 18:54:49 CDT 2017 - run_eddy.sh - eddy_cuda --imain=Diffusion/eddy/Pos_Neg --mask=Diffusion/eddy/nodif_brain_mask --index=Diffusion/eddy/index.txt --acqp=Diffusion/eddy/acqparams.txt --bvecs=Diffusion/eddy/Pos_Neg.bvecs --bvals=Diffusion/eddy/Pos_Neg.bvals --fwhm=0 --topup=Diffusion/topup/topup_Pos_Neg_b0 --out=Diffusion/eddy/eddy_unwarped_images --flm=quadratic
> Entering EddyGpuUtils::LoadPredictionMaker
>
> ...................Allocated GPU # 0...................
> Mon Oct 30 18:56:24 CDT 2017 - run_eddy.sh - Completed with return value: 134
>
> I have the cuda 7.5 libraries installed and ldd shows eddy_cuda resolving all the libraries.
>
> Any suggestions?
>
> Thanks,
>
> Bryon