I am replying to this thread because my problem looks like it might be
related. If it is not, I can resubmit it as a separate issue.
I have a researcher trying to replicate an analysis that was done in
June. They are running the same commands -- one person handed over the
cluster setup script, and I am pretty sure that only the subject ID has
changed.
Here is the command we are running with the version I got for FSL 5.0.10:
[bennet@flux-build bin]$ ls -l eddy_cuda-5.0.10
-rwxr-xr-x 1 bennet swinstaller 40525310 Apr 24 2017 eddy_cuda-5.0.10
[bennet@flux-build bin]$ md5sum eddy_cuda
afa454d92c75542924ca36313b70d36c eddy_cuda
[hajenna@nyx7500 dwipreproc-tmp-FDLY57]$ eddy_cuda \
--imain=eddy_in.nii --mask=eddy_mask.nii \
--acqp=eddy_config.txt --index=eddy_indices.txt \
--bvecs=bvecs --bvals=bvals --niter=8 --fwhm=10,6,4,2,0,0,0,0 \
--repol --ol_type=both --mporder=8 --s2v_niter=8 --dont_peas \
--ol_type=both --mporder=8 --s2v_niter=8 --slspec=slspec.txt \
--out=dwi_post_eddy
Entering EddyGpuUtils::LoadPredictionMaker
...................Allocated GPU # 0...................
Entering EddyGpuUtils::LoadPredictionMaker
Entering EddyGpuUtils::LoadPredictionMaker
Entering EddyGpuUtils::LoadPredictionMaker
Entering EddyGpuUtils::LoadPredictionMaker
Entering EddyGpuUtils::LoadPredictionMaker
Entering EddyGpuUtils::LoadPredictionMaker
Entering EddyGpuUtils::LoadPredictionMaker
Entering EddyGpuUtils::LoadPredictionMaker
Segmentation violation, Address not mapped, Offending address = (nil)
eddy_cuda ) [0x47d73a] [
eddy_cuda ) [0x4ab1b1] [
eddy_cuda ) [0x49c517] [
eddy_cuda ) [0x4096e3] [
eddy_cuda ) [0x40ca08] [
eddy_cuda ) [0x40db2c] [
/lib64/libc.so.6 __libc_start_main [0x2b3504d84445]
eddy_cuda ) [0x405f69] [
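The listing above shows `eddy_cuda-5.0.10`, while the checksum and the command use `eddy_cuda`. A quick way to confirm that the two names refer to the same binary is to compare their checksums. This is only a sketch: `same_md5` is a hypothetical helper name, and it assumes both files sit in the current directory.

```shell
#!/bin/sh
# same_md5 FILE_A FILE_B
# Exit status 0 if both files exist and have identical MD5 sums,
# 1 if the sums differ, 2 if either file is missing.
same_md5() {
    [ -f "$1" ] && [ -f "$2" ] || return 2
    sum_a=$(md5sum "$1" | cut -d' ' -f1)
    sum_b=$(md5sum "$2" | cut -d' ' -f1)
    [ "$sum_a" = "$sum_b" ]
}

# Example (file names from the listing above; adjust the paths as needed):
# same_md5 eddy_cuda eddy_cuda-5.0.10 && echo "same binary" || echo "different or missing"
```

If `eddy_cuda` is a symlink to the versioned file, the sums match trivially, which is still a useful confirmation.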
I also tried the version that I got for FSL 5.0.11, with this result:
[bennet@flux-build bin]$ ls -l eddy-5.0.11_cuda7.5
-rwxrwxr-x 1 bennet swinstaller 33505739 Sep 21 2017 eddy-5.0.11_cuda7.5
[bennet@flux-build bin]$ md5sum eddy-5.0.11_cuda7.5
5e7edd5288d3b0b7834e9ca244bf7dee eddy-5.0.11_cuda7.5
[hajenna@nyx7500 dwipreproc-tmp-FDLY57]$ eddy-5.0.11_cuda7.5 \
--imain=eddy_in.nii --mask=eddy_mask.nii --acqp=eddy_config.txt \
--index=eddy_indices.txt --bvecs=bvecs --bvals=bvals --niter=8 \
--fwhm=10,6,4,2,0,0,0,0 --repol --ol_type=both --mporder=8 \
--s2v_niter=8 --dont_peas --ol_type=both --mporder=8 --s2v_niter=8 \
--slspec=slspec.txt --out=dwi_post_eddy
...................Allocated GPU # 0...................
Segmentation violation, Unknown reason, Offending address = (nil)
eddy-5.0.11_cuda7.5 ) [0x49ff01] [?
eddy-5.0.11_cuda7.5 ) [0x4c62b1] [?
eddy-5.0.11_cuda7.5 ) [0x4bb8c9] [?
eddy-5.0.11_cuda7.5 ) [0x412c0d] [?
eddy-5.0.11_cuda7.5 ) [0x413974] [?
eddy-5.0.11_cuda7.5 ) [0x40798e] [?
/lib64/libc.so.6 __libc_start_main [0x2b2d00589445]
eddy-5.0.11_cuda7.5 ) [0x40ef31] [?
Here is the `nvidia-smi` output for the card in the machine (there are
more GPUs, but I thought the rest would be redundant).
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 390.48                 Driver Version: 390.48                    |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|===============================+======================+======================|
|   0  Tesla K20Xm         On   | 00000000:09:00.0 Off |                    0 |
| N/A   27C    P8    16W / 235W |      0MiB /  5700MiB |      0%   E. Process |
+-------------------------------+----------------------+----------------------+
We use environment modules. With the cuda/7.5 module loaded, we get:
[bennet@nyx7500 ~]$ nvcc --version
nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2015 NVIDIA Corporation
Built on Tue_Aug_11_14:27:32_CDT_2015
Cuda compilation tools, release 7.5, V7.5.17
[bennet@nyx7500 ~]$ uname -r
3.10.0-693.11.6.el7.x86_64
If I swap in the cuda/6.5 module instead, eddy_cuda reports that it cannot find libcudart.so.7.5:
[hajenna@nyx7500 dwipreproc-tmp-FDLY57]$ eddy_cuda --imain=eddy_in.nii \
--mask=eddy_mask.nii --acqp=eddy_config.txt --index=eddy_indices.txt \
--bvecs=bvecs --bvals=bvals --niter=8 --fwhm=10,6,4,2,0,0,0,0 --repol \
--ol_type=both --mporder=8 --s2v_niter=8 --dont_peas --ol_type=both \
--mporder=8 --s2v_niter=8 --slspec=slspec.txt --out=dwi_post_eddy
eddy_cuda: error while loading shared libraries: libcudart.so.7.5:
cannot open shared object file: No such file or directory
So I am pretty sure the binary is being matched with the CUDA 7.5 libraries.
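One way to confirm which CUDA runtime the binary actually resolves, rather than inferring it from the loader error, is to ask the dynamic loader directly with `ldd`. The sketch below is only a suggestion: `list_cuda_libs` is a hypothetical helper name, and the grep pattern is my guess at which lines are of interest (anything mentioning cuda, plus any unresolved library).

```shell
#!/bin/sh
# list_cuda_libs BINARY
# Prints the CUDA-related shared libraries that the dynamic loader resolves
# for BINARY, along with any libraries reported as "not found".
# Returns 2 if BINARY is not an executable file.
list_cuda_libs() {
    [ -x "$1" ] || { echo "not an executable: $1" >&2; return 2; }
    ldd "$1" | grep -i -E 'cuda|not found'
}

# Example (run on the GPU node with the cuda/7.5 module loaded):
# list_cuda_libs "$(command -v eddy_cuda)"
```

Running it once with each module loaded should show whether libcudart.so.7.5 resolves to the path the cuda/7.5 module provides.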