Just a brief follow-up to my previous post:
It appears to me that the only way currently to get access to the null
distribution text files from 'randomise' is via the -P flag, which also
results in output of the permutation vector (*perm*txt file).
So, as currently constituted, if you want access to the null
distribution text files, you have to accept the very slow writing of the
permutation vector file. For a "large" number of permutations (e.g.
20000+) the writing of this vector file would become so slow as to be
basically unfeasible. I guess one can always kill the 'randomise'
process while it is writing the permutation vector file, but that
requires manual intervention and prevents scripting of a bunch of
serially-executed 'randomise' processes.
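
For what it's worth, the manual kill could in principle be scripted. Below
is a rough sketch of a watchdog, not something I have validated against
'randomise' itself. The function name run_until_file is hypothetical, and
the sketch assumes that (a) the null distribution files are already
complete on disk by the time the *perm*txt file first appears, and (b)
killing the process at that point leaves the other outputs intact. Both
would need to be checked for a given build.

```shell
# Hypothetical helper (not part of FSL): run a command in the background
# and terminate it as soon as a file matching a glob pattern appears.
# Usage: run_until_file PATTERN COMMAND [ARGS...]
# Note: PATTERN is expanded unquoted, so it must not contain spaces.
run_until_file() {
    pattern=$1; shift
    "$@" &                       # start the command in the background
    pid=$!
    # Poll once per second for as long as the command is still running.
    while kill -0 "$pid" 2>/dev/null; do
        for f in $pattern; do    # unquoted on purpose: expand the glob
            if [ -e "$f" ]; then
                kill "$pid" 2>/dev/null
                break 2          # leave both the for and while loops
            fi
        done
        sleep 1
    done
    wait "$pid" 2>/dev/null      # reap the child, whether killed or done
    return 0
}

# Assumed usage; the glob for the permutation vector file is a guess
# based on the *perm*txt naming mentioned above:
# run_until_file "${outstem}_negCorr*perm*txt" \
#     randomise -i $input -o ${outstem}_negCorr ... -P
```

If the command finishes on its own before any matching file shows up, the
loop simply ends and the function returns, so a series of such calls can
still run one after another in a script.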
cheers,
-MH
On Mon, 2010-03-15 at 14:51 -0500, Michael Harms wrote:
> Hello,
>
> What follows is a possible suggestion for improved efficiency of the
> *perm*txt component of 'randomise'. (My observations are based on using
> v2.5 of build 414).
>
> I'm currently running a script with multiple 'randomise' lines of the
> form:
>
> randomise -i $input -o ${outstem}_negCorr -d $DM -t $confileNeg -e
> $group -m $mask -x -c 3.15 -C 3.15 -T -n 10000 -R -P
>
> Looking at the time stamps of the files, I noticed that the vast
> majority of the overall time seems to be spent writing out the
> *perm*.txt files.
>
> Specifically, in this instance, it is taking about 40 minutes to
> generate all the actual statistics for each invoked instance of
> 'randomise', but over 6 hours to complete the writing of the *perm*.txt
> file. These particular perm files are 10k lines, each line having 150
> values, for a total final size of about 15 MB.
>
> I wouldn't expect the writing of a 15 MB text file to take over 6 hours.
> Based on the use of 'wc' to monitor the progress of writing the file, it
> appears that the process becomes increasingly slower as the file grows
> larger. I'm guessing that perhaps the file is created through some slow
> concatenation or looping method, rather than a direct writing of the
> file in one fell swoop...?
>
> So, as a suggestion for a future improvement, perhaps the mechanism for
> writing the *perm*txt files could be altered so as to increase the
> efficiency dramatically?
>
> cheers,
> -MH