Hi,
I’ll answer inline.
> On 7. Apr 2017, at 00:22, Mingbo Cai wrote:
>
> Hello,
>
> I am developing an algorithm for fMRI which needs to run ICA multiple times. I want to be able to automatically determine the number of components needed to model the data, similarly to how Melodic does. I read through the documentation but I am still not 100% sure that I understand exactly how it is implemented, so I really hope that I can get some insights.
>
> (1) If I understand correctly, the lap argument for the --dimest option of melodic corresponds to the Laplace approximation method referred to in the documentation. And it seems that the equation for the likelihood of the data given the number of components is the one in Thomas Minka's 2000 paper Automatic choice of dimensionality for PCA. Can I confirm that the lap method in melodic is exactly the method by Minka?
Yes, melodic implements the approach published by Minka in 2000.
> Is this true? And is it true that it can only work when doing spatial ICA (where the sample size is larger than the number of features)?
No, it works on the eigenspectrum of the data covariance, so it is not restricted to spatial ICA, but the resulting estimate will always be smaller than the smaller of the two data dimensions. Also keep in mind that in melodic the dimensionality is selected on the eigenspectrum after adjusting it for finite sampling, using the Wishart distribution.
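For reference, Minka's Laplace evidence can be sketched in a few lines of Python. This is only an illustration of the published formula applied to the raw sample eigenvalues: the function name laplace_log_evidence is made up for this example, it is not melodic's code, and it leaves out the Wishart-based adjustment of the eigenspectrum mentioned above.

```python
import numpy as np
from math import lgamma, log, pi


def laplace_log_evidence(eigvals, k, n_samples):
    """Laplace approximation to log p(D | k) for PPCA, after Minka (2000).

    eigvals   : covariance eigenvalues sorted in descending order
                (assumed positive and distinct)
    k         : candidate dimensionality, 1 <= k < len(eigvals)
    n_samples : number of samples N used to estimate the covariance
    Hypothetical helper for illustration; not taken from melodic.
    """
    lam = np.asarray(eigvals, dtype=float)
    d = lam.size
    if not 1 <= k < d:
        raise ValueError("k must satisfy 1 <= k < number of eigenvalues")

    v = lam[k:].mean()  # ML estimate of the noise variance

    # log prior over the k-dimensional principal subspace
    log_pu = -k * log(2.0)
    for i in range(1, k + 1):
        log_pu += lgamma((d - i + 1) / 2.0) - (d - i + 1) / 2.0 * log(pi)

    # likelihood terms that depend on k
    log_lik = -0.5 * n_samples * (np.sum(np.log(lam[:k])) + (d - k) * log(v))

    # m: number of free parameters describing the subspace
    m = d * k - k * (k + 1) / 2.0

    # log-determinant of the Hessian, with the noise eigenvalues replaced by v
    lam_hat = lam.copy()
    lam_hat[k:] = v
    log_det_az = 0.0
    for i in range(k):
        for j in range(i + 1, d):
            log_det_az += log(
                (lam[i] - lam[j]) * (1.0 / lam_hat[j] - 1.0 / lam_hat[i])
            ) + log(n_samples)

    return float(
        log_pu
        + log_lik
        + 0.5 * (m + k) * log(2.0 * pi)
        - 0.5 * log_det_az
        - 0.5 * k * log(n_samples)
    )


# Toy usage: 500 samples, 10 features, 3 strong directions plus unit noise.
rng = np.random.default_rng(0)
X = rng.standard_normal((500, 10))
X[:, :3] *= 5.0
lam = np.linalg.eigvalsh(np.cov(X, rowvar=False, bias=True))[::-1]
scores = [laplace_log_evidence(lam, k, X.shape[0]) for k in range(1, lam.size)]
best_k = int(np.argmax(scores)) + 1
```

The candidate k with the largest score is the selected dimensionality. Only the eigenvalues and N enter, which is why the same computation applies to either orientation of the data matrix.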
>
> (2) Additional question: if I were to use BIC to choose the dimensionality instead (the documentation shows that Laplace approximation and BIC both appear to perform best), would the correct way be, for each candidate dimensionality, to calculate the likelihood under the probabilistic PCA framework and add the penalization term? Or instead, to use the ICA components within the PPCA framework to calculate the likelihood and add the penalization?
The likelihood depends on the residual space and that’s identical between PCA and ICA - it’s just that ICA and PCA span that space differently. As such, the choice of dimensionality can be based on PPCA.
>
> (3) Also, in terms of the number of parameters in BIC, should it be the number of components * samples in each component, or the total size of the unmixing matrix? Or something else?
>
Both the selected dimensionality and N (the total number of samples) enter into the penalty terms, if I remember correctly.
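As a rough illustration of how the two quantities enter, here is a minimal Python sketch of the BIC approximation given in Minka's paper for PPCA. The parameter count m + k used below (m for the subspace, k for the retained eigenvalues) is taken from that paper and is an assumption with respect to melodic, whose exact penalty is not confirmed here; the function name ppca_bic_score is invented for this example.

```python
import numpy as np


def ppca_bic_score(eigvals, k, n_samples):
    """BIC approximation to log p(D | k) for PPCA, after Minka (2000).

    eigvals   : covariance eigenvalues sorted in descending order
    k         : candidate dimensionality, 1 <= k < len(eigvals)
    n_samples : number of samples N
    Terms that do not depend on k are dropped. Hypothetical helper.
    """
    lam = np.asarray(eigvals, dtype=float)
    d = lam.size
    if not 1 <= k < d:
        raise ValueError("k must satisfy 1 <= k < number of eigenvalues")

    v = lam[k:].mean()  # ML noise variance from the discarded eigenvalues

    # maximised PPCA log-likelihood, up to constants independent of k
    log_lik = -0.5 * n_samples * (np.sum(np.log(lam[:k])) + (d - k) * np.log(v))

    # free parameters: m for the subspace plus k retained eigenvalues
    m = d * k - k * (k + 1) / 2.0
    penalty = 0.5 * (m + k) * np.log(n_samples)

    return float(log_lik - penalty)


# Toy usage: 500 samples, 10 features, 3 strong directions plus unit noise.
rng = np.random.default_rng(0)
X = rng.standard_normal((500, 10))
X[:, :3] *= 5.0
lam = np.linalg.eigvalsh(np.cov(X, rowvar=False, bias=True))[::-1]
bic_scores = [ppca_bic_score(lam, k, X.shape[0]) for k in range(1, lam.size)]
bic_best_k = int(np.argmax(bic_scores)) + 1
```

The likelihood part uses only the PCA eigenvalues, in line with the point that the choice can be based on PPCA, and the penalty grows with both the candidate dimensionality (through m + k) and log N.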
hth
Christian
> Thank you very much in advance!
>
> Mingbo Cai
> Postdoctoral Research Associate
> Niv Lab
> Princeton Neuroscience Institute