Hi Michael,
Thanks for sharing your scripts!
> Just to make sure that I'm interpreting things clearly, when I aggregate the resulting files across subjects, I get a group z-statistic for relating a particular contrast (i.e., temporal effect of the design) to the IC. So, a group-wide effect of z=+/-1.96 would be significant at p < .05 (two-tailed) and would suggest a link between that contrast and the IC. Is that right?
Yep
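For reference, turning a z into its two-tailed p in R is just (quick sketch):

```r
# two-tailed p-value for a z-statistic; z = +/-1.96 gives p ~ .05
z_to_p <- function(z) 2 * pnorm(-abs(z))
z_to_p(1.96)  # ~0.05
```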
> And to compare two groups (teens vs. adults), I would run an independent samples t-test on the z-stat values for one group against the z-stat values of the other.
>
Yep - assuming homoscedasticity, this then corresponds to a proper between-group analysis
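Something along these lines (sketch with made-up placeholder values, not real data; `var.equal = TRUE` encodes the homoscedasticity assumption):

```r
# per-contrast group comparison on the subject-level z-stats;
# the values below are placeholders standing in for one row of aggregated z-stats
set.seed(1)
adultZ <- rnorm(14, mean = 0.5)  # 14 adults
teenZ  <- rnorm(12, mean = 0.2)  # 12 teens

# var.equal = TRUE -> pooled-variance (homoscedastic) two-sample t-test
res <- t.test(adultZ, teenZ, var.equal = TRUE)
res$statistic  # pooled t with 14 + 12 - 2 = 24 df
res$p.value
```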
> Lastly, I agree that averaging z-values seems suboptimal and I would like to do better by aggregating betas and variances, but I'm not sure how to get these statistics (or how to combine them properly). The -o parameter to fsl_glm gives the betas, but I don't see an equivalent for variance (there are --out_cope and --out_varcb for COPEs/VARCOPEs, and --out_sigsq for residual noise variance).
You're interested in differential contrasts, so the COPE/VARCOPE, rather than the raw betas, is what you want
> And if I had the variance estimates for each beta, how would I combine these to compute a contrast of interest? I've tested differences in regression coefficients before ((b1 - b2)/pooled SE), so that would be my gut-instinct approach, but that's probably wrong.
Correct, just use the COPE/VARCOPE for all of this.
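In case it helps: combining a subject's per-run COPE/VARCOPE is usually done with inverse-variance (fixed-effects) weighting rather than a plain average. A minimal sketch (function and variable names are mine, not fsl_glm output names):

```r
# fixed-effects combination of per-run estimates:
# weighted mean with weights 1/varcope; combined variance is 1/sum(weights)
combine_runs <- function(cope, varcope) {
  w    <- 1 / varcope
  beta <- sum(w * cope) / sum(w)
  v    <- 1 / sum(w)
  c(cope = beta, varcope = v, z = beta / sqrt(v))
}

# with equal variances this reduces to the plain mean, but noisier runs
# would be down-weighted automatically
combine_runs(cope = c(1.2, 0.8, 1.0, 1.1), varcope = c(0.25, 0.25, 0.25, 0.25))
```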
hth
Christian
>
> Thanks,
> Michael
>
> ---
> R program for extracting and aggregating files follows
> ---
> setwd("/mnt/Terabyte2/bars_ica/Melodic_Config/MelodicConcatOUT.gica/groupmelodic.ica/report/timeStats/zstats")
> runs <- 1:4
> components <- 1:54
> numSubjects <- 26
> numContrasts <- 15
> glmFitCols <- numSubjects * length(runs) + 1
> adultSubjects <- 1:14 #adults
> teenSubjects <- 15:26 #teens
>
> #my data are ordered such that inputs are clumped by subject
> #example: s1_run1 s1_run2 s1_run3 s1_run4 s2_run1 s2_run2 ...
> #thus, want to define which cols are relevant within each fsl_glm file (one per run)
> #1+ included to drop the first column, which is the eigenseries
> keepCols <- sapply(runs, function(r) 1 + seq(r, numSubjects*length(runs), by=length(runs)))
>
> #for each component, read output for all runs
> #and aggregate across runs per subject
> for (component in components) {
> allRuns <- vector(mode="list", length=length(runs))
> for (run in runs) {
>
> filename <- paste("IC", component, "_run", run, "_zstat.out", sep="")
> thisOutput <- read.table(filename)
>
> #all.equal returns a character string (not FALSE) on mismatch, so wrap in isTRUE
> if (!isTRUE(all.equal(dim(thisOutput), c(numContrasts, glmFitCols))))
>   stop("Output file does not match expected dimensions (", numContrasts, "x", glmFitCols, "): ", filename)
>
> keepData <- thisOutput[, keepCols[, run] ] #only retain columns in the glm_fit file that pertain to this run
>
> names(keepData) <- paste("s", 1:numSubjects, "_", run, sep="")
> allRuns[[run]] <- keepData #append this run to the list
> }
>
> #combine runs as one data.frame
> allRuns <- do.call("cbind", allRuns)
>
> #setup dummy data.frame to store aggregate data
> aggData <- data.frame(matrix(NA_real_, nrow=numContrasts, ncol=numSubjects))
> names(aggData) <- paste("s", 1:numSubjects, sep="")
>
> for (sub in 1:numSubjects) {
> #define the four columns belonging to this subject
> subCols <- sub + 0:(length(runs)-1)*numSubjects
> aggData[,sub] <- rowMeans(allRuns[,subCols])
> }
>
> aggData$avgAllSubs <- rowMeans(aggData) #mean for each contrast across all subjects
> aggData$avgAdults <- rowMeans(aggData[,adultSubjects])
> aggData$avgTeens <- rowMeans(aggData[,teenSubjects])
>
> write.table(aggData, file=paste("IC", component, "_zstat_aggregate.out", sep=""), row.names=FALSE, col.names=TRUE)
>
> }
>