I think the files are too big for Dropbox.

Saad/FSL group,
Is there a way that I can upload my files and logs to the FSL listserv and
have someone check it out?

Below is an example table showing the very different results I got when
rerunning one of the first 5 subjects, which initially had good results.
I've only included a small subset of the tracks (12 of 256) because it took
around 2 days to run all of them the first time. waytotal_good is the
initial run, when the results looked good, and waytotal_bad is what I got
after rerunning the same subject following the weird results from the
remaining 35 subjects. I didn't expect the results to be identical when I
reran it, but I did expect high overlap. You can see that this isn't the
case, which is why I believe something is wrong.


track_names      waytotal_good  waytotal_bad
lFEF_llatFEF            161318            45
lFEF_lpreSMA             13236             0
lFEF_rlatFEF                 0             0
lFEF_rpreSMA               731             0
llatFEF_lFEF            101516            59
llatFEF_lpreSMA            805             0
llatFEF_rlatFEF              2             0
llatFEF_rpreSMA             16             0
lpreSMA_lFEF              1161             0
lpreSMA_llatFEF             18             0
lpreSMA_rlatFEF              0             0
lpreSMA_rpreSMA          34702             0
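
To put a number on the mismatch in the table, here is a minimal Python sketch that compares the two runs track by track. The values are copied from the table; the helper name compare_runs and the choice to report a bad/good ratio are illustrative assumptions, not something FSL provides.

```python
# Waytotals copied from the table: track -> (waytotal_good, waytotal_bad).
WAYTOTALS = {
    "lFEF_llatFEF": (161318, 45),
    "lFEF_lpreSMA": (13236, 0),
    "lFEF_rlatFEF": (0, 0),
    "lFEF_rpreSMA": (731, 0),
    "llatFEF_lFEF": (101516, 59),
    "llatFEF_lpreSMA": (805, 0),
    "llatFEF_rlatFEF": (2, 0),
    "llatFEF_rpreSMA": (16, 0),
    "lpreSMA_lFEF": (1161, 0),
    "lpreSMA_llatFEF": (18, 0),
    "lpreSMA_rlatFEF": (0, 0),
    "lpreSMA_rpreSMA": (34702, 0),
}


def compare_runs(waytotals):
    """Return per-track bad/good ratios and the tracks that dropped to zero.

    Hypothetical helper for illustration only; it is not part of FSL.
    """
    ratios = {}
    lost = []
    for track, (good, bad) in waytotals.items():
        if good == 0:
            # No streamlines in the good run either, so the ratio is undefined.
            ratios[track] = None
            continue
        ratios[track] = bad / good
        if bad == 0:
            # Nonzero in the good run, zero in the rerun.
            lost.append(track)
    return ratios, lost


if __name__ == "__main__":
    ratios, lost = compare_runs(WAYTOTALS)
    for track, ratio in ratios.items():
        shown = "n/a" if ratio is None else f"{ratio:.6f}"
        print(f"{track:<18}{shown}")
    print(f"{len(lost)} tracks went from a nonzero waytotal to 0")
```

A ratio near 1 would indicate the high overlap expected between two runs of the same subject. Tracks that were already 0 in the good run are reported as n/a rather than counted as lost.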


On Sun, Nov 10, 2013 at 2:14 PM, Niels Bergsland <[log in to unmask]> wrote:

> Okay, well I would say that it doesn't seem like it is being killed due to
> running out of memory. I personally don't have much experience with
> probtrack but maybe if you're able to upload your inputs to an external
> site (e.g. Dropbox) along with the exact command you are running, the
> problem can be tracked down.
>
> On Sunday, November 10, 2013, Dana Wagshal wrote:
>
>> Hi Niels,
>>
>> I entered dmesg | grep kill and here was the output:
>>
>> [    7.788513] init: cloud-init-nonet main process (559) killed by TERM signal
>> [    8.675361] init: failsafe main process (910) killed by TERM signal
>> [    9.473747] init: gdm main process (1201) killed by TERM signal
>>
>> I also ran the jobs again and entered dmesg, and here was the output:
>> [   11.368531] init: bluetooth main process (1530) terminated with status 1
>> [   11.368590] init: bluetooth main process ended, respawning
>> [   11.556262] init: bluetooth main process (1630) terminated with status 1
>> [   11.556290] init: bluetooth main process ended, respawning
>> [   11.972321] init: bluetooth main process (1699) terminated with status 1
>> [   11.972349] init: bluetooth main process ended, respawning
>> [   12.156295] init: bluetooth main process (1747) terminated with status 1
>> [   12.156326] init: bluetooth main process ended, respawning
>> [   12.524262] init: bluetooth main process (1838) terminated with status 1
>> [   12.524286] init: bluetooth main process ended, respawning
>> [   12.752291] init: bluetooth main process (1938) terminated with status 1
>> [   12.752318] init: bluetooth respawning too fast, stopped
>> [   17.035759] init: plymouth-stop pre-start process (2460) terminated with status 1
>> [162326.205079] SGI XFS with ACLs, security attributes, realtime, large block/inode numbers, no debug enabled
>> [162326.206183] SGI XFS Quota Management subsystem
>> [162326.308156] Btrfs loaded
>>
>>
>> Then I entered sudo grep kill /var/log/syslog (I use Ubuntu) but nothing
>> came up. I also looked at the probtrack logs and the only thing in them was
>> the probtrack command.
>>
>> Do these outputs tell you anything?
>>
>>
>> On Sun, Nov 10, 2013 at 8:35 AM, Niels Bergsland <[log in to unmask]> wrote:
>>
>> Hi Dana,
>> Unfortunately these messages don't indicate that the jobs are being
>> killed due to memory problems. Can you post the output of
>> dmesg | grep kill? Alternatively, try running your job again and,
>> immediately after it stops, run dmesg again. If you have sudo access, you
>> can also take a look at either /var/log/messages or /var/log/syslog,
>> depending on your system (messages under RedHat, syslog under Ubuntu),
>> e.g. sudo grep kill /var/log/messages. Do the probtrack logs indicate
>> anything out of the ordinary, by the way?
>>
>>
>> On Fri, Nov 8, 2013 at 6:16 PM, Dana Wagshal <[log in to unmask]> wrote:
>>
>> Thank you for the suggestions Niels!
>>
>> It seems like you were right and it is due to a memory problem (please
>> see below). How much memory do I need to run probtrackx2 with 5000
>> streamlines and run multiple jobs at once? I have 256 jobs per subject with
>> 40 subjects.
>>
>> [   11.368531] init: bluetooth main process (1530) terminated with status 1
>> [   11.368590] init: bluetooth main process ended, respawning
>> [   11.556262] init: bluetooth main process (1630) terminated with status 1
>> [   11.556290] init: bluetooth main process ended, respawning
>> [   11.972321] init: bluetooth main process (1699) terminated with status 1
>> [   11.972349] init: bluetooth main process ended, respawning
>> [   12.156295] init: bluetooth main process (1747) terminated with status 1
>> [   12.156326] init: bluetooth main process ended, respawning
>> [   12.524262] init: bluetooth main process (1838) terminated with status 1
>> [   12.524286] init: bluetooth main process ended, respawning
>> [   12.752291] init: bluetooth main process (1938) terminated with status 1
>> [   12.752318] init: bluetooth respawning too fast, stopped
>> [   17.035759] init: plymouth-stop pre-start process (2460) terminated with status 1
>>
>>
>>
>> On Thu, Nov 7, 2013 at 11:58 PM, Niels Bergsland <[log in to unmask]> wrote:
>>
>> Hm, okay - well you can check if the process has been killed due to
>> memory problems by running the command dmesg as soon as your job
>> finishes. The most recent entries will be at the bottom, so look for
>> messages about processes being killed.
>>
>>
>>
>>
>> On Fri, Nov 8, 2013 at 1:21 AM, Dana Wagshal <[log in to unmask]> wrote:
>>
>> My data is run on cloud computing servers. The IT person told me that the
>> cloud is a collection of high-performance, fully provisioned servers that
>> provide a shared infrastructure to run individual virtual instances. I have
>> an instance that belongs to me, but I'm on an infrastructure hosting other
>> individual virtual instances.
>>
>> Is there a specific way or command to check to see if I'm running into
>> memory problems?
>>
>>
>> On Thu, Nov 7, 2013 at 9:23 AM, Niels Bergsland <[log in to unmask]> wrote:
>>
>> Hi Dana,
>> Is it a shared system that you're running on? Is it possible you're
>> running into memory problems?
>>
>>
>> On Thu, Nov 7, 2013 at 6:21 PM, Dana Wagshal <[log in to unmask]> wrote:
>>
>> Hi Saad,
>>
>> --
>> -Dana
>>
>
>
> --
> Niels Bergsland
> Integration Director
> Buffalo Neuroimaging Analysis Center
> 100 High St. Buffalo NY 14203
> [log in to unmask]
> (716) 859-7677
>



-- 
-Dana