Hi,
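The variance formula you used already centres the data: it subtracts the mean from each value before squaring, which is why you get 2/3 for both (1) and (2).
A PCA calculation should centre the data as well, for the same reason: to separate the mean from the variation about the mean. Without centring, the first component mostly describes where the mean is rather than how the samples vary around it.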
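If it helps to see this numerically, here is a minimal sketch using your six numbers. It assumes Python with NumPy (neither was mentioned in your message), it uses the divide-by-n variance to match your 2/3, and `first_direction` is just a helper name I made up for the illustration.

```python
import numpy as np

# Raw data from the question: rows are samples s1..s3, columns are v1, v2.
X = np.array([[1.0, 101.0],
              [2.0, 102.0],
              [3.0, 103.0]])

# Column-centred data: subtract each variable's mean.
Xc = X - X.mean(axis=0)

# The variance (dividing by n) subtracts the mean internally,
# so raw and centred data give the same value for each variable.
var_raw = X.var(axis=0)
var_centred = Xc.var(axis=0)


def first_direction(A):
    """Leading right singular vector of A (hypothetical helper).

    The sign is flipped if needed so that the result is comparable
    between runs; singular vectors are only defined up to sign.
    """
    _, _, vt = np.linalg.svd(A, full_matrices=False)
    v = vt[0]
    return v if v[np.argmax(np.abs(v))] > 0 else -v


# "PCA" without centring versus with centring.
pc1_raw = first_direction(X)
pc1_centred = first_direction(Xc)

print("variance, raw:     ", var_raw)
print("variance, centred: ", var_centred)
print("first direction, raw:    ", pc1_raw)
print("first direction, centred:", pc1_centred)
```

The two variance lines agree, but the two directions do not. With the raw data the first direction points almost entirely along v2, towards the mean at (2, 102). With the centred data it points equally along v1 and v2, which is the direction in which the samples actually vary.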
I hope this helps.
All the best,
Mark
On 23 May 2014, at 15:40, Mark wrote:
> Hi,
>
> Not sure if it's appropriate to ask this here but hope someone can give me some hints.
>
> I'm a beginner in PCA, and I know that we should usually center our data before analyzing the score matrix and loading matrix. Say the raw data come from 3 samples (s1, s2, s3) and have 2 variables (v1, v2).
>
> (1).
> If the raw data are:
> for s1: v1=1; v2=101
> for s2: v1=2; v2=102
> for s3: v1=3; v2=103
>
> (2).
> After centering, the data should be:
> for s1: v1=-1; v2=-1
> for s2: v1=0; v2=0
> for s3: v1=1; v2=1
>
> Since, for both v1 and v2, the variance is 2/3 in both (1) and (2), why do we need to center the data?
>
> Thank you so much.
>
> Mark