Dr. Woolrich-

Thanks for all your assistance on this issue. I have another follow-up question.

It is my understanding that bootstrapping/permuting the data and using non-parametric analyses will also correct for the heteroskedasticity implied by a nested twin-pair design. If this is correct, would using randomise to permute and threshold the data provide me with robust results in this case? For instance, once I obtain single-subject means on the contrast of interest, I use avwmerge to concatenate them into a 48-volume 4D image. Then I run:
    randomise -i AllSubjs.nii.gz -o AllSubjs_clust -1 -n 10000 -c 2.3
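
For reference, the concatenation step itself would be something along these lines (the filenames here are just placeholders for the 48 single-subject COPE images):

    avwmerge -t AllSubjs.nii.gz subj01_cope.nii.gz subj02_cope.nii.gz ... subj48_cope.nii.gz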

Would such a procedure suitably control for both heteroskedasticity and family-wise error inflation, or am I barking up the wrong tree?

Thanks,
Jim Porter
TRiCAM Lab Coordinator
Elliott Hall N437
612.624.3892
www.psych.umn.edu/research/tricam 


Mark Woolrich wrote:
Hi Jim,


1) Would I be correct in believing that choosing FLAME 1 over Simple OLS in FEAT is analogous to choosing GLS over OLS in a simple linear regression, and as such, FLAME 1 represents a method that is going to generate robust results in the face of violations of regression assumptions such as I have presented here? 

FLAME (1 or 2) models all of the variance components at each level of the hierarchy (e.g. within-subject variance, between-subject variance). This means that, for example, when inferring at the second level, the within-subject variance from the first level gets used. This is useful because it means that bad subjects with high first-level within-subject variance are downweighted compared to good subjects with low first-level within-subject variance. Hence, FLAME deals with the heteroskedasticity due to differences in the first level within-subject variances in a manner which is analogous to variance weighting in GLS. Note though, that using first level within-subject variances does not mop up your issue with having data from identical twins.
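
Schematically, what FLAME is doing at the second level is a variance-weighted (GLS-style) fit along the lines of

    beta = inv(X' W X) X' W y,   with  W = diag( 1 / (sigma_between^2 + sigma_within,i^2) )

so a subject with a large first-level within-subject variance (sigma_within,i^2) gets a small weight. This is only a sketch of the idea; the actual FLAME estimator handles the variance components in a Bayesian framework rather than plugging in fixed values.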

2) Or, is the Metropolis-Hastings procedure not enough, and I have to use FLAME 1+2 to implement the full Monte Carlo simulation, which should most definitely be a fully robust procedure?

Just to clarify - FLAME 1 is a fast approximation to the solution based on maximising marginal posterior distributions in a Bayesian framework, whereas FLAME 2 is a slower, more accurate approach which uses the Markov Chain Monte Carlo technique of Metropolis-Hastings. Both deal with heteroskedasticity from lower levels in the model.

3) Or, are neither of these methods sufficient to overcome heteroskedasticity, and I need to account for subject clustering in my design matrices, either by indicating group membership or by entering extra covariate EVs? (If so, tips on how to best do this would be much appreciated.)

So, as mentioned above, we need to do something more to deal with the fact that you have data from identical twins in the higher-level GLM. Otherwise the expected correlation between twins in the data will go unaccounted for and violate the noise assumptions. Although in my last email I thought we were talking about two sessions for each subject, the easiest way to deal with the correlations between twins is the same. As I understand it, you want EVs to model out the "twin means": one EV for each set of twins, where each EV picks out the 2 subjects which make up that twin. These EVs effectively account for the correlation between twins. Then you have a further EV which models your behavioural data.
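
For example, with 48 subjects in 24 twin pairs (inputs ordered so that the two members of each pair are adjacent), the higher-level design matrix would look something like this, where EV1-EV24 are the pair EVs and EV25 is the behavioural EV:

    subject 1  (pair 1):   1 0 0 ... 0   b1
    subject 2  (pair 1):   1 0 0 ... 0   b2
    subject 3  (pair 2):   0 1 0 ... 0   b3
    subject 4  (pair 2):   0 1 0 ... 0   b4
    ...
    subject 47 (pair 24):  0 0 0 ... 1   b47
    subject 48 (pair 24):  0 0 0 ... 1   b48

Here b1-b48 are the behavioural scores, and the contrast of interest is [0 0 ... 0 1], i.e. zero on all of the pair EVs and one on the behavioural EV.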

Hope that helps.

Cheers, Mark.


----
Dr Mark Woolrich
EPSRC Advanced Research Fellow, University Research Lecturer

Oxford University Centre for Functional MRI of the Brain (FMRIB),
John Radcliffe Hospital, Headington, Oxford OX3 9DU, UK.

Tel: (+44)1865-222782 Homepage: http://www.fmrib.ox.ac.uk/~woolrich




On 11 Jul 2007, at 15:43, James N. Porter wrote:

Mark-

Sorry for the confusion; I should have been more explicit (the old speed-for-accuracy trade-off). It's not that we have a repeated-measures design; rather, by "twin design" I mean that our sample is composed entirely of sets of monozygotic (identical) twins, so we fully expect their shared genetic makeup to be a factor in how much error covariance we need to account for in our analysis design.

Since we expect there to be heteroskedasticity within twin pairs, we (I believe) need to use a clustered/nested design to account for it. The analogy often used is looking at the effect of some factor (like teaching methods) upon standardized test results across multiple schools. Even though we may have 150 individual students in the study, we can't say they are fully independent, and we would need to nest our subjects to account for error covariance driven by the fact that 50 students come from school A, 50 from school B, and 50 from school C. Similarly, in our neuroimaging analysis here we need to account for the error covariance driven by the fact that our 48 individuals are clustered in 24 pairs.

I know that if I were doing a simple linear regression and had reason to believe there was autocorrelation or clustering in my data, then I could not get away with using simple Ordinary Least Squares methods and would have to use Generalized Least Squares or Weighted Least Squares to impose corrections for inflated standard errors. When I look at the pull-down menu on the Stats tab in FEAT, I see Fixed Effects, Simple OLS, FLAME 1, and FLAME 1+2. It seems to me that they are ordered from least to most robust to violations of the assumptions of linear regression (independence of observations, homoskedasticity, no multicollinearity, etc). So, I guess this brings me back to my original inquiry:

1) Would I be correct in believing that choosing FLAME 1 over Simple OLS in FEAT is analogous to choosing GLS over OLS in a simple linear regression, and as such, FLAME 1 represents a method that is going to generate robust results in the face of violations of regression assumptions such as I have presented here?

2) Or, is the Metropolis-Hastings procedure not enough, and I have to use FLAME 1+2 to implement the full Monte Carlo simulation, which should most definitely be a fully robust procedure?

3) Or, are neither of these methods sufficient to overcome heteroskedasticity, and I need to account for subject clustering in my design matrices, either by indicating group membership or by entering extra covariate EVs? (If so, tips on how to best do this would be much appreciated.)

Thanks again for your assistance,

Jim Porter
TRiCAM Lab Coordinator
Elliott Hall N437
612.624.3892
www.psych.umn.edu/research/tricam 


Mark Woolrich wrote:
Hi James,

Apologies, I am not familiar with some of the language you are using. When you say a twin design, are you talking about having two FMRI sessions for each subject? In that case you need to do something akin to a paired t-test, where you have EVs to model each of the subject means across the two sessions. Then, alongside those, you have an EV to model your behavioural data.
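
For example, with N subjects and two sessions each, the design would look something like this (one row per lower-level input; EV1-EVN are the subject-mean EVs and the final EV carries the behavioural regressor, which is the one the contrast goes on):

    subject 1, session 1:  1 0 ... 0   b
    subject 1, session 2:  1 0 ... 0   b
    subject 2, session 1:  0 1 ... 0   b
    subject 2, session 2:  0 1 ... 0   b
    ...

where each b is the behavioural value for that input.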

FLAME 1 has most of the key benefits associated with FLAME 2 with respect to using variances from the first level; it just trades off a bit of accuracy in the interest of speed, but is perfectly adequate for most purposes.

Apologies if I have misunderstood.

Cheers, Mark.

----
Dr Mark Woolrich
EPSRC Advanced Research Fellow, University Research Lecturer

Oxford University Centre for Functional MRI of the Brain (FMRIB),
John Radcliffe Hospital, Headington, Oxford OX3 9DU, UK.

Tel: (+44)1865-222782 Homepage: http://www.fmrib.ox.ac.uk/~woolrich




On 10 Jul 2007, at 19:48, James N. Porter wrote:


I would like to know how to properly account for within-pair error correlation in FEAT in a twin design. We have measured a continuous variable for all subjects that we are using as a regressor in our fMRI analysis, but we recognize that this variable may not have error that is independent of the twin-pair sampling. In our behavioral analysis in STATA, we can easily implement a design that clusters/nests the twin pairs to obtain robust standard errors. For the neuroimaging analysis, the path is not so clear. I have the following questions:

1) We believe that FLAME 2's full MCMC sampling would automatically produce robust standard errors in this case, but we would like to save time by just running FLAME 1. Does the Metropolis-Hastings sampling procedure also result in robust standard errors in this case?

2) At the higher level, would indicating separate Group membership in FEAT's GLM setup (i.e. Inputs 1&2 = Group 1, Inputs 3&4 = Group 2, etc.) be the same as clustering twin pairs?