Hi Lara,
Please find my responses below.
Hope this helps,
Bryan
> On 28 Jan 2020, at 18:12, Lara Foland-Ross <[log in to unmask]> wrote:
>
> Hello,
>
> I have a multi-site study in which I'd like to compare change in fMRI data between two groups of subjects (patients and controls) over time (time 1, time 2), as well as the main effect of time and main effect of group. Although some of my subjects only have one time point, I'd like to use all scans in my analysis. In my model, I need to control for age at baseline, sex, scan site and duration of time between time points.
>
> It seems that the modified Sandwich Estimator may be best suited for this analysis. However, I'm having some trouble understanding the user guide on FSL's website. The SWE help pages in the SPM/Matlab literature are similarly confusing. Therefore, any help you can provide would be greatly appreciated. My specific questions are as follows:
Sorry that the help pages are confusing. We will try to improve them soon.
>
> 1. Is the modified Sandwich Estimator appropriate for this analysis?
Yes, it can account for any type of within-subject covariance structure in your data. However, your study also has multiple sites, and accounting for the correlation between subjects within each site might be a bit tricky. I can see two ways of doing this with the SwE method:
1) The simplest is to add an intercept for each site as a fixed effect covariate in your model. This assumes that the correlation between each pair of subjects within each site is constant and the same across sites. In many cases, I believe this should be sufficient to account for the within-site correlation.
2) The SwE could also account for a more complex within-site correlation structure by treating each site, rather than each subject, as a cluster. Nevertheless, this is generally not a good idea because it would reduce the number of degrees of freedom available for statistical testing to approximately the number of sites instead of the number of subjects. Therefore, this solution only seems reasonable if you have many sites with a small number of subjects in each of them, which is rarely the case.
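As a rough illustration of option 1), the site intercepts are simply one 0/1 column per site in the design matrix. The sketch below uses NumPy; the function name `site_intercepts` and the site labels are made up for the example and are not part of the SwE tools.

```python
import numpy as np

def site_intercepts(site_labels):
    """Return one 0/1 indicator column per site, plus the sorted site names."""
    sites = sorted(set(site_labels))
    cols = np.array([[1.0 if label == site else 0.0 for site in sites]
                     for label in site_labels])
    return cols, sites

# Hypothetical example: five scans acquired at three sites.
scan_sites = ["A", "A", "B", "C", "C"]
X_site, site_names = site_intercepts(scan_sites)
print(site_names)
print(X_site)
```

Note that a full set of site columns sums to a column of ones. If the model already contains another full set of intercepts (for example, one per group), one site column has to be dropped, otherwise the design matrix is rank deficient.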
> 2. In the design.sub file, do I include only 3 columns (indicating subject ID, time point and group)? And place age at baseline, sex, scan site and duration of time between time points in the design.mat file?
Yes, covariates such as age at baseline, patient group, control group, sex, scan site and duration of time should all be placed in the design.mat file. This file represents the full design matrix, with all the covariates for your model.
Note that the design.sub file does not contain any covariates and will not be used to add covariates to the design matrix. It is used only to label each scan in terms of the subject, visit category and homogeneous group it belongs to, and this labelling is only used to estimate the covariance of the data. Also, please note that how you fill in the design.sub file depends on the choice between the classic SwE (default option) and the modified SwE (--modified option). With the default option, the design.sub file needs only one column indicating the subject ID number for each scan. With the second option, you do indeed need to provide three columns: the subject ID number, the visit category number and the homogeneous group number.
The advantage of the classic SwE is that it allows a different covariance matrix for each subject, modelling any form of subject heteroskedasticity. This is particularly useful when the time between scans varies widely across subjects. The main issue is that the parametric tests (default option) with the classic SwE tend to be conservative, particularly in small samples. A way around this is to use the Wild Bootstrap (option --wb) to make non-parametric inference, or to use the modified SwE (--modified option).
The modified SwE assumes that the subjects can be classified into homogeneous groups, with all subjects in a group sharing a common covariance matrix. Its advantage is that it typically yields more powerful parametric tests (particularly in small samples) than the classic SwE. The first disadvantage is that it assumes group heteroskedasticity instead of subject heteroskedasticity, which is a stronger assumption. The second disadvantage is that you need to define consistent visit categories within each homogeneous group of subjects (e.g., visit at baseline, visit at 6 months, visit at 12 months), and this is not always possible.
But if you can define such homogeneous groups, you then have to fill in the visit category and group information in the second and third columns of the design.sub file. It is important to note that the second column expects visit category numbers like 1, 2 and not visit times like 0, 0.345, 0.389,…
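To make the three-column layout for the modified SwE concrete, here is a minimal sketch that assembles the labels as an integer array and rejects visit times used in place of visit categories. The helper name `build_sub_columns`, the example labels and the group coding (1 = controls, 2 = patients) are assumptions for illustration only; converting the array into the file format that FSL expects is not covered here.

```python
import numpy as np

def build_sub_columns(subject_ids, visit_categories, group_numbers):
    """Stack subject ID, visit category and group number, one row per scan."""
    visits = np.asarray(visit_categories, dtype=float)
    # Visit categories must be whole numbers (1, 2, ...), not elapsed times.
    if not np.all(np.mod(visits, 1) == 0):
        raise ValueError("visit categories must be integers such as 1, 2")
    return np.column_stack([subject_ids, visits, group_numbers]).astype(int)

# Hypothetical example: subject 2 has only one time point.
subjects = [1, 1, 2, 3, 3]
visits = [1, 2, 1, 1, 2]
groups = [1, 1, 1, 2, 2]   # assumed coding: 1 = controls, 2 = patients
sub = build_sub_columns(subjects, visits, groups)
print(sub)
```

For the classic SwE, only the first column (the subject ID) would be needed.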
> 3. For subjects with only one time point, how do I enter duration of time between time points?
In general, a good practical way to specify age and time between scans in a longitudinal model is as follows:
1) Compute the within-subject average age for each subject (for example, if a subject has a scan at 60 and another at 66, the average value would be 63) and assign these values to a purely cross-sectional covariate, where the value is repeated for each time point within each subject. It is generally preferable to also centre this covariate before entering it in the model.
2) Subtract the within-subject average age computed above from the original age covariate (for the example above, -3 would be assigned to the first scan and 3 to the second scan), yielding a purely longitudinal age covariate. This automatically assigns the value 0 to all the subjects with only one time point, which makes sense as these subjects do not bring any longitudinal information. Note that this covariate is already centred and is actually orthogonal to the cross-sectional age covariate computed in 1), which is convenient.
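The two steps above can be sketched in a few lines of NumPy. The function name `split_age` and the ages are invented for the example, and the cross-sectional covariate is centred across scans here (one possible choice).

```python
import numpy as np

def split_age(subject_ids, ages):
    """Split age into cross-sectional and longitudinal covariates."""
    subject_ids = np.asarray(subject_ids)
    ages = np.asarray(ages, dtype=float)
    subj_mean = np.empty_like(ages)
    for s in np.unique(subject_ids):
        mask = subject_ids == s
        subj_mean[mask] = ages[mask].mean()   # step 1: within-subject mean age
    age_long = ages - subj_mean               # step 2: deviation from that mean
    age_cross = subj_mean - subj_mean.mean()  # centre the cross-sectional part
    return age_cross, age_long

# Hypothetical example: subject 2 has a single scan.
subjects = [1, 1, 2, 3, 3]
ages = [60, 66, 50, 40, 44]
age_cross, age_long = split_age(subjects, ages)
print(age_long)                     # subject 1 gets -3 and 3, subject 2 gets 0
print(np.dot(age_cross, age_long))  # 0: the two covariates are orthogonal
```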
Now, one issue with the specification above is that the age at baseline might be more meaningful than the average age across time points (for example, in a study where a drug is given at baseline). In such cases, it might be preferable to keep the age at baseline as the cross-sectional covariate of age. The longitudinal covariate of age can then be built as (the age at scanning time - the age at baseline), in which case the values corresponding to the baseline scans are always 0. One issue with this way of specifying the two covariates of age is that they are typically correlated, which may not be optimal.
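A sketch of this baseline variant is given below, again with invented ages. It assumes that the baseline scan of each subject is the earliest one in the data (the minimum age), and the name `baseline_split` is hypothetical.

```python
import numpy as np

def baseline_split(subject_ids, ages):
    """Return centred baseline age and time elapsed since baseline."""
    subject_ids = np.asarray(subject_ids)
    ages = np.asarray(ages, dtype=float)
    baseline = np.empty_like(ages)
    for s in np.unique(subject_ids):
        mask = subject_ids == s
        baseline[mask] = ages[mask].min()   # assumption: earliest scan = baseline
    time_since = ages - baseline            # 0 for every baseline scan
    return baseline - baseline.mean(), time_since

subjects = [1, 1, 2, 3, 3]
ages = [60, 66, 50, 40, 44]
base_c, time_since = baseline_split(subjects, ages)
print(time_since)
print(np.dot(base_c, time_since))  # non-zero here: the covariates are not orthogonal
```

In this toy data set the older subject also has the longer follow-up, so the two covariates are correlated.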
> 4. Do I need to mean center any variables?
In general, yes, except for the intercepts.
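As a small illustration, centring can be applied column by column while leaving the intercept columns untouched. The design below (two group intercepts followed by age) and the helper name `centre_columns` are assumptions made for the example.

```python
import numpy as np

def centre_columns(X, intercept_cols):
    """Mean-centre every column except the listed intercept columns."""
    X = np.array(X, dtype=float)
    for j in range(X.shape[1]):
        if j not in intercept_cols:
            X[:, j] -= X[:, j].mean()
    return X

# Hypothetical design: control intercept, patient intercept, age.
X = [[1, 0, 60],
     [1, 0, 66],
     [0, 1, 50],
     [0, 1, 40]]
Xc = centre_columns(X, intercept_cols={0, 1})
print(Xc)
```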
>
> Thanks in advance for your guidance,
> Lara
>
> ########################################################################
>
> To unsubscribe from the FSL list, click the following link:
> https://www.jiscmail.ac.uk/cgi-bin/webadmin?SUBED1=FSL&A=1