Just a quick question. I am trying to fit an LDA model on a restricted
subsample. My X is a numeric matrix and Y is a factor vector of responses.
fit <- lda( Y[1:50] ~ X[1:50, ] )
gives the following error message: variables are collinear in:
lda.default(x, grouping, ...)
I am guessing this is a rank-deficiency problem, as I have about 80
variables and lda works with subsamples of size 80 and above.
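As a rough check of this guess, the rank of the subsample can be compared
with the number of variables. The sketch below uses simulated data in place
of my X (the 200 x 80 dimensions and the random values are assumptions), and
it looks at the uncentred matrix only, so it illustrates the idea rather than
reproducing exactly what lda does internally:

```r
# Simulated stand-in for the real data (assumption): 200 rows, 80 predictors.
set.seed(1)
X <- matrix(rnorm(200 * 80), nrow = 200, ncol = 80)

# Rank of the 50-row subsample, from base R's QR decomposition.
sub_rank <- qr(X[1:50, ])$rank
n_vars   <- ncol(X)

# TRUE means the subsample cannot have full column rank:
# with 50 rows the rank is at most 50, which is below 80 variables.
sub_rank < n_vars
```

With fewer rows than columns the rank can never reach the number of
variables, which would be consistent with the message appearing only for
small subsamples.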
Q1: Is my interpretation of the error message correct?
Q2: I am using the fit for prediction purposes etc. Is this likely to be
affected, i.e. how serious is this problem?
Q3: Is there a good website on sound statistical theory and practice for
overcoming rank deficiency, if this is indeed the source of the
error?
Many thanks, Adai.