Hi Hajime,

Please, see below:

My understanding of the first part of Q3 has been confirmed by you. Thank you very much. However, I am confused about the second part, where you wrote "No. Each page shows the same experiment being repeated again." What does each data point (circle or triangle) from the 10 subjects on each page (p.9 to 14) actually represent?

Each datum (circle or triangle) represents one measurement. Perhaps the simplest is to interpret these examples as one datum per subject (say, a single voxel in the brain image compared across various subjects).

In each experiment, we perform many scans.

Not necessarily, and the simplest (really the simplest) is to think that there is just one image per subject. This image can be a summary of the results from the 1st-level FMRI analysis, which summarises the information from multiple scans, or perhaps a structural scan used in a VBM analysis (after segmentation, etc), or maybe an FA map (after projection to a skeleton, etc).

There is more than one data point. For example, if each experiment lasts 20 minutes and the TR is 5 sec, then, for each subject and each specific voxel in each experiment, there are (60/5)*20 = 240 scans. In other words, there are 240 rather than 1 data point. Am I missing something? Perhaps each data point from each of the 10 subjects on p.9 to 14 is the average of the 240 scans?

It is indeed possible to interpret this as multiple scans of an FMRI experiment of a single subject, as the maths are the same, and the GLM is constructed in the same way, but in that case, it would be a rather unusual experiment, with just 2 conditions that never alternate, each with 5 scans, without HRF convolution, etc. Even for a PET experiment this would be quite unusual.

I'm afraid, though, that you might be missing the point: the slides are showing the principles that underlie the idea of comparing the result of an experiment (the statistic) with the distribution of that statistic when there is no actual effect, which allows a p-value to be calculated. These principles are not specific to brain imaging and apply to any data. The distribution can be seen empirically when the same experiment is repeated ad infinitum, each time with new data and with no true effect present. Whether the data are voxels in the brain, heights of the subjects, or levels of some metabolite in the blood, it's all the same.
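
In case a concrete illustration helps, below is a minimal sketch in Python (numpy/scipy), purely illustrative and not FSL code, of exactly this idea: the same two-group experiment, with no true effect, repeated many times, with the resulting t-statistics collected to form the null distribution shown in the slides.

# Minimal illustration (not FSL code): build the null distribution of a
# two-sample t-statistic by repeating the same "experiment" many times with
# no true group difference. Two groups of 5 match the slides' example.
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
n_per_group, n_repeats = 5, 10000

t_values = np.empty(n_repeats)
for i in range(n_repeats):
    group1 = rng.normal(0.0, 1.0, n_per_group)   # no true effect: both groups
    group2 = rng.normal(0.0, 1.0, n_per_group)   # come from the same distribution
    t_values[i] = stats.ttest_ind(group1, group2).statistic

# With 5 + 5 observations and 2 parameters, dof = 8; a histogram of t_values
# approximates the theoretical t-distribution with 8 degrees of freedom.
print(np.mean(np.abs(t_values) >= 2.306))   # ~0.05: the two-sided 5% tail for t, dof = 8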

All the best,

Anderson





"
3. So, from p9 to 14, the author measured the same voxel from all 10 subjects (5 from each group). Each page represents data from the same voxel obtained from a scan of 10 subjects.

Yes.

For example, the data on p.9 were gathered at t = 1, the data on p.10 were gathered at t = 2, etc.

No. Each page shows the same experiment being repeated again.
"

Thank you.

Hajime




Date: Sat, 29 Mar 2014 18:48:52 +0000
From: [log in to unmask]
Subject: Re: [FSL] Questions about Permutation Testing (Randomise) and Multi-Subject Stats Part 1
To: [log in to unmask]

Hi Hajime,

As I wrote earlier, and also off-list, the slides are not meant to cover everything and are not a substitute for reading the actual literature -- papers and books. I strongly advise that you try to follow one or more of these books instead of the slides, which are just an overview.

As for the last (in all senses) batch of questions, please, see below:

1. So, on p.3, there are about 7 subjects in the left figure and in the right
figure, there are two groups each of which has about 5 subjects?

Yes.

2. I think this is the cause of the confusion. You mentioned that on p.7, the line joins two groups each of which has 10 subjects. This seems to be different from the GLM figure on p.34 in feat1_part2 under the heading "Estimation: Finding the "best" parameter values". In that figure, the Y is a time series from the same subject. Am I right?

Yes, there it is a time series. In any case, a line connecting the points is just a visual aid. The observations -- subjects or time series -- are discrete, not continuous.

3. So, from p9 to 14, the author measured the same voxel from all 10 subjects (5 from each group). Each page represents data from the same voxel obtained from a scan of 10 subjects.

Yes.

For example, the data on p.9 were gathered at t = 1, the data on p.10 were gathered at t = 2, etc.

No. Each page shows the same experiment being repeated again. The resulting statistic is a random number that, in the absence of an effect, follows a certain distribution. This distribution arises from the repetition of the same experiment over and over again. We don't actually need to do that (although some did do that, back in the 19th century), because the distribution can be derived mathematically if some assumptions about the data are made, or using permutation tests. What these slides are showing is the concept of what the distribution means.

You'll find the same in the first pages of various introductory statistics textbooks. For instance, Figure 3.1 of Keppel and Wickens' book "Design and Analysis: a Researcher's Handbook" (4th ed, 2004).

If the brain of these 10 subjects were scanned for 1000 time steps, there would have been 1000 data points (t-values) that made up the distribution.

No. As above, the distribution is a theoretical function that tells us how the statistic (a random variable) behaves when there is no true effect. How many data points there will be depends: infinitely many for a theoretical distribution, or, for an empirical distribution (as this one, which is just a didactic example), a fixed number equal to the number of times the experiment was repeated.

If the brain were subdivided into 500 voxels, there would be 500 histograms like the one shown on p.14. Am I right?

Yes, each voxel can in principle have its own distribution, and this can be the case for some modalities, but otherwise the same distribution is generally assumed for all of them.

4. p.76 "How do we choose the (arbitrary!) z-threshold?" Item 1. What RFT assumptions are you talking about?

The random field theory tries to calculate how many regions ("blobs") are above a certain threshold. For the theory to work, these blobs must not contain holes (remember, this is a 3D image), nor be "hollow". This is the RFT assumption that this slide is talking about. If the threshold is too low, various blobs coalesce, forming bigger ones that can contain holes (like "handles") or can be hollow. Then the value that RFT calculates is no longer correct. RFT requires a high threshold to meet this assumption.

5. p.89 "Oh dear! What now?" Item 1 "But, hinges on lots of assumptions about the data". Could you please let me know what assumptions?

This slide is saying that we could use Monte Carlo methods to derive the distribution of the statistic when there is no actual effect. However, to use Monte Carlo, we need to know the distribution itself; because that distribution isn't really known, assumptions need to be made about it. If these assumptions aren't valid, the results are incorrect. This can be solved by replacing Monte Carlo with permutation tests, which is what this slide and the following ones show.

6. p.95 "Permutations for dummies": Last three points.
a. Why can we use "classical" statistics (e.g. the t-test) when the data have a strange distribution?

The t-statistic does not necessarily follow a t-distribution. Likewise, the F-statistic does not necessarily follow an F-distribution. These distributions only arise if the errors of the model (the part that doesn't fit) all have a normal distribution with zero mean, and are further all independent from each other. If these assumptions aren't met, the classical statistics can still be calculated, but their behaviour will not be the same as predicted by theory. Permutation tests solve the problem.
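
In case it helps, here is a minimal sketch in Python (numpy) of the general idea, purely illustrative -- this is not how randomise is implemented, and the data are made up: compute the statistic on the data as observed, then recompute it after shuffling the group labels many times, and take the p-value as the proportion of shuffled statistics at least as large as the observed one.

# Illustrative permutation test for a two-group comparison (not FSL's randomise,
# just the underlying idea): the reference distribution comes from the data
# themselves, via shuffling of the group labels, rather than from the t table.
import numpy as np

def two_group_t(y, labels):
    a, b = y[labels == 0], y[labels == 1]
    sp2 = ((a.size - 1) * a.var(ddof=1) + (b.size - 1) * b.var(ddof=1)) / (a.size + b.size - 2)
    return (a.mean() - b.mean()) / np.sqrt(sp2 * (1.0 / a.size + 1.0 / b.size))

rng = np.random.default_rng(42)
y = np.array([3.1, 2.8, 3.5, 2.9, 3.3, 2.2, 2.5, 2.0, 2.7, 2.4])   # made-up data, one voxel
labels = np.array([0, 0, 0, 0, 0, 1, 1, 1, 1, 1])                   # 2 groups of 5

t_obs = two_group_t(y, labels)
n_perm = 5000
t_perm = np.array([two_group_t(y, rng.permutation(labels)) for _ in range(n_perm)])

# One-sided p-value: proportion of permutations with a statistic >= the observed one
p = np.mean(t_perm >= t_obs)
print(t_obs, p)

In practice randomise does much more than this (exchangeability blocks, sign-flipping, correction for multiple testing, etc.); the above is only the bare-bones idea.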

Isn't a bell curve required to use the t-test?

It is required to compute a p-value using the t-distribution as the reference. Otherwise, the t-statistic follows some other (unknown) distribution.

b. What do you mean by "Need to ensure exchangability" and "Don't hold your breath"?

Like any other test, permutation methods have their requirements, although these are much simpler: one is that the errors need to be "exchangeable", that is, permuting them does not change their joint (multivariate) distribution. This is fine for most multi-subject designs, although it requires some extra care when working with repeated measurements. For a more extensive discussion, please, see our recent "randomise" paper: http://www.sciencedirect.com/science/article/pii/S1053811914000913

7. on p.98 "False Discovery Rate". Please clarify the following:

a. "5% of all voxels are false positives". It means 5% of all voxels in one brain map are false positive. Am I right?

No. It means that 5% (on average) of all voxels that are declared significant are expected to be false positives. Not 5% of all voxels in the brain.

b. "5% of all experiments have one or more false positive voxels". It means if we have 100 statistical brain maps, 5% of them have 1 or more false positive voxels. Am I right?

Exactly, this is the definition of FWE, where each brain map is a "family".

c. "On average 5% of significant voxels are false positives". Do you mean 5% of 5% because to be statistically significant, we commonly use a p-value of 5% as the threshold.

Yes, 5% has been the common threshold for about a century now, and the same idea has been ported to FDR.

So, assuming that there are 100 voxels in one brain map. Under the False Discovery Rate, about 5%x5% (i.e. 0.25) of them are false positives? I am a bit confused.

Knowing only the size of the brain map we can't say anything. But if in your 100-voxel map 40 voxels are declared significant with FDR at 5%, then 5% of these 40, i.e., 2 voxels, are expected to be false positives. We don't know, however, which 2 are false. Nor do we know whether it is really 2 -- it could well be just 1, none, 3, 4 or perhaps even 5. It is on average (if the same experiment is repeated many times) that 5% are false positives.
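
If it helps to see the mechanics, here is a minimal sketch in Python (numpy) of the Benjamini-Hochberg procedure, which is one standard way of controlling FDR at a level q. It is shown only to illustrate the concept, not necessarily what any particular tool implements, and the p-values are simulated.

# Benjamini-Hochberg procedure (a standard way of controlling FDR at level q):
# sort the p-values, find the largest k with p_(k) <= k*q/m, and declare the
# k smallest p-values significant.
import numpy as np

def bh_threshold(pvals, q=0.05):
    m = pvals.size
    order = np.argsort(pvals)
    sorted_p = pvals[order]
    below = sorted_p <= (np.arange(1, m + 1) / m) * q
    if not below.any():
        return np.zeros(m, dtype=bool)           # nothing survives
    k = np.max(np.nonzero(below)[0])             # largest index meeting the criterion
    significant = np.zeros(m, dtype=bool)
    significant[order[:k + 1]] = True
    return significant

# Example: 100 "voxels", 40 with a strong true effect, 60 tested under the null.
rng = np.random.default_rng(1)
p_signal = rng.uniform(0, 0.001, 40)             # voxels with a true effect: tiny p-values
p_null = rng.uniform(0, 1, 60)                   # null voxels: uniform p-values
pvals = np.concatenate([p_signal, p_null])

sig = bh_threshold(pvals, q=0.05)
print(sig.sum(), "voxels declared significant")
# On average over many repetitions, ~5% of the declared voxels are false positives.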

8. p.100 There are three sets of 10 figures. What are you trying to show here?

Look at the slide just before that. There are 10 images with random noise. Then a circular region with signal is added to each of them, simulating what would happen if we were repeating the same experiment 10 times. In slide 100, the correction is made using three approaches, all at the 10% level (instead of the conventional 5%):

(1) No correction: note that it contains many voxels outside the circular region, all false positives, as no true signal had been added there.

(2) Correction using FWE: note that in only 1 map out of 10 (so, 10%) there are one or more false positives (it's in the 9th map from left to right). However, most of the true signal goes away too (i.e., many false negatives).

(3) Correction using FDR: it's somewhat of a compromise between the stringency of FWE and the large number of errors of the uncorrected maps. If you patiently count the voxels outside the red circle (false positives), they should be on average 10% of the total positives in each map.

9. p.101 "FDR for dummies" Item 1. Could you please tell me what assumptions are you talking about?

I think that the message here is that FDR requires the p-values at each voxel to be exact, that is, if the null is true, the p-values must be uniformly distributed between 0 and 1 in order to guarantee the average proportion of false discoveries. This will be the case if the (other) assumptions needed to calculate these p-values are valid for whatever test is being done.

All the best,

Anderson


On 24/03/2014 23:56, brain human wrote:
Hi Anderson,

Thank you for your reply. I have some of these books. They are even more difficult to understand. I need to process some data for my experiment conducted elsewhere last year. Before the experiment, I was told that I would be provided with the training to do it. However, there has been no training. I am left to do the studying on my own. There are no fMRI people at my university. As you can imagine, it is a tough job without help from FSL experts like you. Could you please help? I need to verify my understanding. Thank you.

Hajime


Date: Mon, 24 Mar 2014 15:02:26 +0000
From: [log in to unmask]
Subject: Re: [FSL] Questions about Permutation Testing (Randomise) and Multi-Subject Stats Part 1
To: [log in to unmask]

Hi,

Sorry, but I think I have already answered some of these questions for you. My advice is that you should not try to understand all of the inference using just a couple of slides from a course that you apparently didn't attend. Please, search for one or more of the books listed below in the library of your institution to have an introduction (or purchase at least one of them), then try reading the relevant papers.

http://www.amazon.co.uk/Functional-Magnetic-Resonance-Imaging-Huettel/dp/0878932860
http://www.amazon.co.uk/Handbook-Functional-MRI-Data-Analysis/dp/0521517664
http://www.amazon.co.uk/Statistical-Analysis-FMRI-Gregory-Ashby/dp/0262015048
http://www.amazon.co.uk/Functional-Magnetic-Resonance-Imaging-Introduction/dp/019852773X
http://www.amazon.co.uk/Statistical-Parametric-Mapping-Analysis-Functional/dp/0123725607

It would also be good to know your name; it's a bit strange to reply without having a real name attached to it. Sorry that I'm not all that familiar with internet anonymity.

All the best,

Anderson


On 24.03.14 13:58, brain human wrote:
Hi Anderson,

Thank you very much for the explanations. Now it is getting clearer.

1. So, on p.3, there are about 7 subjects in the left figure and in the right
figure, there are two groups each of which has about 5 subjects?

2. I think this is the cause of the confusion. You mentioned that on p.7, the line joins two groups each of which has 10 subjects. This seems to be different from the GLM figure on p.34 in feat1_part2 under the heading "Estimation: Finding the "best" parameter values". In that figure, the Y is a time series from the same subject. Am I right?

3. So, from p9 to 14, the author measured the same voxel from all 10 subjects (5 from each group). Each page represents data from the same voxel obtained from a scan of 10 subjects. For example, the data on p.9 were gathered at t = 1, the data on p.10 were gathered at t = 2, etc. If the brain of these 10 subjects were scanned for 1000 time steps, there would have been 1000 data points (t-values) that made up the distribution. If the brain were subdivided into 500 voxels, there would be 500 histograms like the one shown on p.14. Am I right?

4. p.76 "How do we choose the (arbitrary!) z-threshold?" Item 1. What RFT assumptions are you talking about?

5. p.89 "Oh dear! What now?" Item 1 "But, hinges on lots of assumptions about the data". Could you please let me know what assumptions?

6. p.95 "Permutations for dummies": Last three points.
a. Why can we use "classical" statistics (e.g. the t-test) when the data have a strange distribution? Isn't a bell curve required to use the t-test?
b. What do you mean by "Need to ensure exchangability" and "Don't hold your breath"?

7. on p.98 "False Discovery Rate". Please clarify the following:

a. "5% of all voxels are false positives". It means 5% of all voxels in one brain map are false positive. Am I right?

b. "5% of all experiments have one or more false positive voxels". It means if we have 100 statistical brain maps, 5% of them have 1 or more false positive voxels. Am I right?

c. "On average 5% of significant voxels are false positives". Do you mean 5% of 5% because to be statistically significant, we commonly use a p-value of 5% as the threshold. So, assuming that there are 100 voxels in one brain map. Under the False Discovery Rate, about 5%x5% (i.e. 0.25) of them are false positives? I am a bit confused.

8. p.100 There are three sets of 10 figures. What are you trying to show here?

9. p.101 "FDR for dummies" Item 1. Could you please tell me what assumptions are you talking about?

Thank you very much for your help once again.



Date: Sun, 23 Mar 2014 15:33:10 +0000
From: [log in to unmask]
Subject: Re: [FSL] Questions about Permutation Testing (Randomise) and Multi-Subject Stats Part 1
To: [log in to unmask]

Hi

Please, see below:


On 22.03.14 01:22, brain human wrote:
Hi Anderson,

Thank you very much for the explanations. I am almost there.

In regard to 1a:
------------------
On p.7, there are two subjects. One is called subject1 (or group 1 if you like) and the other is called subject 2 (or group 2).

Nope, I'm afraid. What's being represented aren't 2 subjects, but 2 groups of subjects. Each group has 10 subjects, for a total of 20 subjects.

The top 10 data points, represented by blue dots, are measurements from a voxel of subject 1. The bottom 10 data points, represented by red triangles, are measurements from the same specific voxel of subject 2.

Nope... the top 10 data points, represented by blue dots, are measurements from a voxel across 10 subjects, i.e., a certain location in the brain for subject #1, then a measurement in the same location for subject #2, etc.

In each curve, the line just joins the data points together for presentation purpose.

Yes.

It does not mean the 20 data points are related (actually the top 10 points are from subject 1 and the bottom 10 points are from subject 2 so there could be two separate lines one joining the 10 data points from subject 1 and the other joining the 10 data points from subject 2). Am I right?

Yes, except that these aren't 2 subjects, but 2 groups (note that it is possible to formulate the same thing using just 2 subjects, for instance an FMRI experiment with 10 timepoints for each, but a comparison like that probably wouldn't be very useful, and isn't what is being represented here).

In what situation do we consider the 10 data points as data from 10 different groups (i.e. only 1 subject in each of the 10 groups)?

Here it's 20 subjects allocated into 2 groups of the same size. It's possible to have a higher level in which 10 groups are compared (see the subsequent FEAT presentation), but this isn't what's being shown here.

For the curves from p. 8 to 14, it is the same idea but the author only showed 5 data points from subject1 (could be called group 1) and 5 from subject2 (could be called group 2). Am I right? I am confused about why the number of data points was reduced by half from 20 to 10.

Yes, it's just an example. Fewer datapoints make it simpler.


Actually, "specific voxel" is what we want. However, in reality, even it is from the same subject, due to head movements, we can't measure from the same location in the brain in all scans. i.e. Let voxel A be the part of the brain located at (10.1, 20.3) in one scan.
Even from the same subject, in another scan, the nearby part of the brain will occupy the same coordinate. Nevertheless, for analysis, we assume that this coordinate is occupied by the same part of the brain regardless of the subject and the scan number. Am I right?

Yes, but note that there are methods to minimise these effects, from devices to restrain head movement, to motion correction, registration (within and across subjects), and inclusion of nuisance regressors as confounds in the analyses.


In regard to 2
----------------

"Yes, remember it's just an example. Although the same principles apply to time series, forget time series here, and consider this as a multi-subject study, in which each observation (each subject) is independent of the others."

By "each subject", do you mean 2 different subjects (subject 1 represented by blue circles and subject 2 represented by red triangles) or 10 different subjects (or called groups)? I think it is the former but you mentioned that in a multi-subject study, each observation (20 dots on p.7 and 10 dots on p.8 to 14) is independent of the others. I am a bit confused. Could you please explain?

It's because it's not 2 subjects being represented here, but 2 groups of subjects.

"At this first part of the presentation..."
"In a later part of the presentation..."

Could you please let me know the page range for the first part and that for the 2nd part? I want to make sure that we are talking about the same thing.

By "first part" I mean the topic "Null-hypothesis and Null-distribution" (see the Outline slide). For the "later part" I mean the slides that discuss randomise (from slide 84 to 95).

In regard to 3
----------------

Do you mean in the given example, t8=2.64 means that the t-test of the two means x1 and x2 yields a t-value of 2.64?

I meant it "yielded" (or "yold") 2.64, not "yields", i.e., in this example the result was 2.64, but it could have been different with different data.

All the best,

Anderson




Thank you very much for your help.



Date: Fri, 21 Mar 2014 15:49:19 +0000
From: [log in to unmask]
Subject: Re: [FSL] Questions about Permutation Testing (Randomise) and Multi-Subject Stats Part 1
To: [log in to unmask]

Hi,

Please see below.

On 21.03.14 08:04, brain human wrote:
Hi Anderson,

Thank you for your reply. Could you please let me know the following?

1. On p.3, "The task of classical inference", there are two figures. The figure on the left shows about 8 observations (blue circles) from a group. On the right, there are several observations in both Groups1 and 2. I understood that. However, I started to get confused when I saw a curve made of both blue dots and red triangles. For example, see the curves on p.9 and p.14.

a. What are those curves on p.9 and p.14? How are they different?

These are just different ways of representing the data. In p.3, the horizontal axis has groups (1 or 2) and the vertical axis the values of the measurements. In p.9, the horizontal axis (not shown) contains the values of the measurements, and the vertical axis (not shown) contains a subject index (i.e., subject 1 = 1, subject 2 = 2, etc).

They tell the same information, and are just shown in different ways.

b. I think the checkerbox in the figures on p.9 and 14 is the design matrix, where white boxes represent ON and black boxes represent OFF. There are two columns, probably corresponding to regressors X1 and X2. In that case, what do the GLM figures from p.9 to 14 mean?

Exactly. The checkerbox is the design matrix, the "X", which here contains 2 columns (EVs) and 10 rows. Each row is [1 0] for group 1, and [0 1] for group 2.
The line on the left shows the observations, indexed by subject and represented graphically. It is the "Y".
And the model is Y = X*b + e.
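
If it helps to see it concretely, here is a minimal sketch in Python (numpy) of that same two-group model and its least-squares fit; it is purely illustrative (not FSL code), and the numbers are made up.

# The two-group GLM from the slides, Y = X*b + e: one row per subject,
# [1 0] for group 1 and [0 1] for group 2, fitted by ordinary least squares.
import numpy as np

X = np.zeros((10, 2))
X[:5, 0] = 1          # first 5 subjects belong to group 1
X[5:, 1] = 1          # last 5 subjects belong to group 2

rng = np.random.default_rng(0)
Y = X @ np.array([3.0, 2.5]) + rng.normal(0, 0.5, 10)    # made-up data for one voxel

b_hat, *_ = np.linalg.lstsq(X, Y, rcond=None)            # beta estimates
e_hat = Y - X @ b_hat                                    # residuals, the "e"
print(b_hat)   # close to [3.0, 2.5]: with this design, each beta is just a group mean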

2. On p.14, "And if we do this til the cows come home", what do you actually mean?

This means that if you keep doing this for a long time, i.e., if you keep repeating the same experiment with different data when there is no actual effect, the distribution of the statistic looks like the ones being constructed in the plots.

It seems that from p.8 to 14, the author is trying to collect different time series of data (from the same specific brain voxel of the same subject) to calculate different t-values.

Yes, remember it's just an example. Although the same principles apply to time series, forget time series here, and consider this as a multi-subject study, in which each observation (each subject) is independent of the others.

Then, make a distribution like the one on p.14. I am a bit confused. Also, is the distribution made based on the data from only 1 voxel of different subjects (10 subjects - 5 from each group)?

At this first part of the presentation, the idea is to show that, repeating the same experiment many times when the null hypothesis is true (i.e., when there is no actual effect), we find statistics that are distributed as shown in these images.

In a later part of the presentation, an empirical distribution is constructed by means of permuting the group labels. These are two different things, although they share similarities. The first is to show the concepts upon which these tests rely. The second is how to make use of the data in practice to find the best possible results without too many assumptions.

3. On p.20, does t8 = 2.64 mean the t-test of the two means x1 and x2 yields a t-value of 2.64?

It means that here, in this example, it yielded the value 2.64. In other experiments, the value can be different.

4. p.27, I now know that the value at each voxel of a z-map is assumed to follow a normal distribution. What is z-map?

Each voxel contains data that are used to make a test. This test gives a statistic, such as t, F, etc. It's also possible to have a z-statistic for each voxel. When the voxels are organised side-by-side, so that their positions represent actual locations, we have a map, just like maps in geography, used to locate places. Also just like pictures taken with digital cameras are bitmaps (maps of bits, which constitute the pixels). Here, each voxel contains a statistic, hence a statistical map. If the statistic is z, it's a z-map.

5. On p.30, we are interested in thresholding the data so that only "ONCE" in 20 studies do we find "A" voxel above this threshold. Do you mean we are interested in thresholding the data so that, given 20 brain maps (z-maps?), only "one" of them contains "one" voxel above this threshold, or that among all the voxels in ONE brain map, only 5% (1/20) are statistically significant?

Almost there. The word "once" means a single time. With FWER we want that, on average, out of 20 studies (or 20 maps), in only 1 do we find one or more voxels above the threshold even if there is no actual effect. I.e., one or more false positives in a single map, out of 20 maps (rather than a single false positive in a single map).
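
If a numerical illustration helps, below is a tiny sketch in Python (numpy/scipy) of that definition. It uses Bonferroni only as the simplest stand-in for an FWER-controlling threshold (not what FSL itself does), and simulates pure-noise z-maps to show how often a map contains at least one voxel above the threshold.

# Family-wise error: the proportion of (null) maps containing one or more
# false positives. Bonferroni is used here only as the simplest illustration
# of a threshold that keeps this proportion at ~5%.
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
n_maps, n_voxels, alpha = 2000, 1000, 0.05

z_maps = rng.normal(size=(n_maps, n_voxels))              # pure-noise z-maps

z_unc = stats.norm.isf(alpha)                             # uncorrected threshold
z_bonf = stats.norm.isf(alpha / n_voxels)                 # Bonferroni threshold

fwe_unc = np.mean((z_maps > z_unc).any(axis=1))           # ~1.0: nearly every map has a false positive
fwe_bonf = np.mean((z_maps > z_bonf).any(axis=1))         # ~0.05: about 1 map in 20
print(fwe_unc, fwe_bonf)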

6. Somewhere, the "family-wise error" is defined as the probability of making "more than one" false discovery (type 1 error) among all the hypotheses when performing multiple hypothesis tests. How is this related to the statement in Item 5 listed above?

I think I explained above. Let me know if not clear.

Also note that the slides are supposed to give just an overview in the 50 min or so of the lecture. To have a full understanding, please, read the literature. A good starting point is the book by Poldrack, Mumford and Nichols, "Handbook of Functional MRI Data Analysis".

All the best,

Anderson




> Date: Sun, 16 Mar 2014 17:36:53 +0000
> From: [log in to unmask]
> Subject: Re: [FSL] Questions about Permutation Testing (Randomise) and Multi-Subject Stats Part 1
> To: [log in to unmask]
>
> Hi,
>
> Please, see below:
>
> > 1. p.19 "Tools of classical inference": What is t8 in the histogram? Is 8 the degrees of freedom? If so, why the dof = 8 here?
>
> Yes, it's the dof. If you look at some slides earlier, e.g., #14, there
> are 10 observations and 2 parameters being estimated (the two betas).
> 10-2=8.
>
> > 2. p.19: what is "e ~ N(0, sigma^2)"? I assume sigma^2 is the variance?
>
> This is a standard notation to indicate that the errors are assumed to
> follow a normal distribution (N), with parameters 0 (mean) and sigma^2
> (variance).
>
> > 3. p.27 "What happens when we apply this to imaging data?": What is "voxel ~N" in "z-map where each voxel ~N"?
>
> This should be read as "z-map where the value at each voxel is assumed
> to follow a normal distribution". It's something very clear when
> attending the FSL course.
>
> > 4. p.76 "How do we choose the (arbitrary!) z-threshold?": What are the RFT assumptions in Item 1?
>
> RFT is based on many assumptions, and a crucial one for this slide is that
> the threshold needs to be high, such that there are (almost) no holes in
> the excursion set (i.e. the set of voxels that survive the threshold).
> When the threshold is high, there is an expression to estimate how many
> regions survive the threshold, and this demands that each region doesn't
> contain holes (or "handles"), nor that it's hollow, things that can
> happen at lower thresholds. To have a general introduction to this, see
> Worsley's paper "The geometry of random images", published in Chance, 1996.
>
> > 5. p.80 "Qualitative example": What are the advantages for enhancing the signal? Why do we do that?
>
> Because it gives more power. Imaging experiments seek signal, but that
> signal can sometimes be buried in noise. TFCE is a (good) method that
> integrates the strength and the extent of relatively weak(er) signals,
> to boost them so that they are more likely to be identified and localised.
>
> > 6. p.81 "TFCE for FSL-VBM": What are you trying to show in this slide?
>
> The top row shows TFCE maps after correction for multiple testing. The
> bottom row shows both voxel-level (cold colours) and cluster-level (hot
> colours), also after correction for multiple testing. We can see that,
> given adequate control over the error rate, TFCE finds more effects,
> some of them that would otherwise remain undetected. And even for the
> findings that overlap, TFCE has lower p-values. So, it's more powerful.
>
> > 7. p.82 "TFCE for TBSS": What are you trying to show here? What is the 2 in red and 3 in green?
>
> This is a comparison of cluster-level vs. TFCE for TBSS. The
> cluster-level inference requires a threshold, and the results change
> depending on the choice of the threshold. The red results are those
> using a threshold = 2, the green are for a threshold = 3. TFCE is again
> more powerful (the type I error rate is controlled in all cases).
>
> > 8. p.85 "Example: VBM-style analysis": At the bottom right, there is an unhappy face with "~N?" on top. Could you please let me know what you are trying to show here?
>
> This is an example of what could happen, e.g., in a VBM study in which
> the voxels are labelled either as entirely gray matter or not. In this
> case, fitting the GLM produces acceptable results (the betas). However,
> the errors have a bimodal distribution that is clearly not normal (the
> histogram at the lower-right corner of the page). If the errors have a
> distribution that is not normal, the assumptions of parametric tests are
> not valid, and we can no longer use distributions as t, F or normal to
> compute p-values. Also, the RFT falls apart. Hence the sad face. But as
> the next slide shows, this can be resolved with permutation tests.
>
> Note that to be didactic, this is a somewhat extreme example. Even VBM
> is not all that non-normal, and the voxels are in fact labelled in a
> fuzzy fashion (voxels have a probability between 0 and 1 of belonging to
> GM, or alternatively, the fraction of GM within a voxel can be
> estimated, also between 0 and 1, rather than exactly 0 or 1). Still,
> parametric methods are not recommended because of these distributional
> issues, among others.
>
> > 9. p.85: What does the figure (lines connecting two columns of mixed blue circles and red triangles) on the bottom left mean?
>
> This is just a way of representing the data. The blue circles are group
> #1, the red triangles are the group #2. The points (circles or
> triangles) that are toward the left represent 0 (voxel labelled as not
> GM), and those toward the right represent 1 (voxel labelled as GM).
>
> > 10. p.89 "Oh dear! What now?": How did you get the values 978 and 5000? What is "C.f."?
>
> In this example, after 5000 permutations, in 978 of these the statistic
> was found to be larger than or equal to the unpermuted one. This means a
> permutation p-value p = 978/5000 = 0.1956, or simply 19.6%. The 5000 is
> chosen by whomever is doing the analysis. It could as well have been
> 10000 or 20000. The 978 is the result found, and here it's just an
> example. It could have been any other value.
>
> The "C.f." I believe is a small typo; it should be "cf.", as in the
> Latin "confer", meaning that the above value, 19.6%, should be compared
> to what would be obtained using parametric methods. A t-statistic = 0.86
> with dof = 18 has a p-value = 0.20055, or 20.05% as shown in the slide.
>
> Note that the difference is not so dramatic, because the t-test is quite
> robust to certain departures from normality if the sample size isn't too
> tiny. However, the problems become far more serious when it comes to
> multiple testing correction, something mandatory in any serious imaging
> experiment.
>
> All the best,
>
> Anderson