Dear Paul,

Randomise uses a general multiple regression framework that should have no problem with this. In fact, Ged Ridgway looked at this issue in some simulations in his thesis and found that the method randomise uses is accurate, or if anything slightly conservative.

So I'd expect you'd find the same result with standard parametric multiple linear regression. Correlated regressors can degrade sensitivity, but the opposite can also occur: a variable that explains little variability on its own can become significant in the presence of another (correlated) variable.

For a non-imaging example, consider a response of absolute body fat and the predictors weight and height (to avoid thinking about outliers, let's assume the clinically obese are excluded). Height and weight are positively correlated, but neither may predict absolute body fat well alone: a weight-only model ignores variation in overall body size, and a height-only model ignores the obvious importance of weight. Hence both are needed for a good fit; in particular, weight would surely be more significant in the joint model, where it is adjusted for height.
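
As a rough illustration of this effect, here is a small simulation sketch using ordinary parametric regression (it does not use randomise or permutation testing). Everything in it is an assumption chosen for illustration: the names `ols_t` and `simulate`, the correlation of 0.8, the noise level, and a response that depends on the difference between the two predictors. With these settings each predictor will typically show a weak t-value when fitted alone and a much stronger one in the joint model, though the exact numbers vary with the seed.

```python
import numpy as np

def ols_t(X, y):
    """OLS coefficients and t-statistics; X must already contain an intercept column."""
    n, p = X.shape
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    sigma2 = resid @ resid / (n - p)          # residual variance estimate
    se = np.sqrt(sigma2 * np.diag(np.linalg.inv(X.T @ X)))
    return beta, beta / se

def simulate(n=162, rho=0.8, noise=2.0, seed=0):
    """Hypothetical data: two correlated predictors, response driven by x1 - x2."""
    rng = np.random.default_rng(seed)
    x1 = rng.standard_normal(n)
    x2 = rho * x1 + np.sqrt(1 - rho**2) * rng.standard_normal(n)
    y = x1 - x2 + noise * rng.standard_normal(n)
    one = np.ones(n)
    _, t1 = ols_t(np.column_stack([one, x1]), y)        # x1 only
    _, t2 = ols_t(np.column_stack([one, x2]), y)        # x2 only
    _, tj = ols_t(np.column_stack([one, x1, x2]), y)    # both together
    return {"x1_alone": t1[1], "x2_alone": t2[1],
            "x1_joint": tj[1], "x2_joint": tj[2]}

if __name__ == "__main__":
    for name, t in simulate().items():
        print(f"{name}: t = {t:.2f}")
```

The marginal slope for x1 here is only 1 - rho, because x2 partly cancels it, whereas the joint model recovers the full coefficient of 1. That is the same mechanism as in the weight and height example.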

Hence you've learned something cool about FA, delayed & immediate recall. I'd encourage you to extract some data and make some Voodoo plots; specifically simple scatterplots and added variable plots (aka partial regression plots) to understand what's going on. 
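
For the added variable plots, a minimal sketch of the computation is below, assuming the extracted values (say, mean FA in a cluster) are already in NumPy arrays. The function names `residualise` and `added_variable` are hypothetical helpers, not part of FSL. The idea is to regress both the response and the regressor of interest on all the other regressors, then plot one set of residuals against the other.

```python
import numpy as np

def residualise(v, Z):
    """Residuals of v after least-squares regression on the columns of Z."""
    coef, *_ = np.linalg.lstsq(Z, v, rcond=None)
    return v - Z @ coef

def added_variable(y, x, others):
    """Coordinates for an added variable (partial regression) plot of y against x.

    `others` holds the remaining regressors, one per column. An intercept
    column is added here, so both returned residual vectors have zero mean.
    """
    n = len(y)
    Z = np.column_stack([np.ones(n), others])
    return residualise(x, Z), residualise(y, Z)
```

The two returned vectors can be passed to any scatterplot routine (for example matplotlib's `plt.scatter`). The least-squares slope through those points equals the coefficient of `x` in the full multiple regression, so the plot shows exactly the relationship the joint model is testing.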

-Tom



From: Paul Borghesani <[log in to unmask]>
Date: 19 February 2011 02:49:02 GMT
Subject: [FSL] correlated variables in randomise
Reply-To: FSL - FMRIB's Software Library <[log in to unmask]>

Hi -

I've been using TBSS/randomise to explore the association between cognition and white matter structure. I've noticed that when I put two highly correlated variables into the model (e.g., immediate and delayed recall abilities, which unsurprisingly have an R value between 0.6 and 0.8), I often get a surprising number of significant voxels.

For instance - with 162 FA maps from subjects of varying ages.
If delayed recall, age and gender are modeled in randomise, the corrected t-stat map for delayed recall is entirely insignificant.
If immediate recall, age and gender are modeled in randomise, the corrected t-stat map for immediate recall is entirely insignificant.

However,
If both immediate and delayed recall, age and gender are modeled in randomise, numerous voxels in both the immediate and delayed recall t-stat maps become significant.

FYI - age and gender typically have significant voxels in this sample.

I recognize that interpreting what immediate recall means when one "controls for" delayed recall is quite ambiguous, so I am not suggesting this is a "good" model. However, I was curious about the randomise results. Does randomise produce accurate t/p-values if two correlated covariates are included in the model? Should I not be including more than one continuous covariate when using randomise? Should I believe these somewhat surprising results?

Thanks for the help.

Sincerely,

Paul Borghesani
University of Washington