Dear list
Suppose I have the following linear regression
Y = X\beta + \epsilon
Where X is an incidence matrix and beta is a vector of effects.
But, assume X is partitioned as X = [X_1 X_2 X_3], so I have a
partitioned regression as:
Y = X_1\beta_1 + X_2\beta_2 + X_3\beta_3 + \epsilon (model 1)
In the particular problem I am investigating, assume the partitioned
regression above is known to be "correct" in that it includes all
relevant variables of interest. Now, assume that I am really only
interested in the set of effects in \beta_3, but disregarding \beta_1
from the regression would give biased estimates of \beta_3. That is,
ignoring X_1 and \beta_1 gives
Y = X_2\beta_2 + X_3\beta_3 + \epsilon (model 2)
It is simple to work out whether B_3, the estimate of \beta_3, is
biased when X_1 is omitted as in model 2, assuming model 1 is the
correct specification. I have already done this part symbolically.
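For reference, the usual textbook form of that bias can be written as
below. This is a sketch, not my own derivation: it assumes [X_2 X_3]
has full column rank, and the residual-maker M_2 is notation introduced
here only for illustration.

```latex
% M_2 projects onto the orthogonal complement of the columns of X_2.
M_2 = I - X_2 (X_2' X_2)^{-1} X_2'

% Model 2 estimate of \beta_3 (Frisch-Waugh-Lovell form):
B_{3m2} = (X_3' M_2 X_3)^{-1} X_3' M_2 Y

% Taking expectations under model 1, and using M_2 X_2 = 0:
E[B_{3m2}] - \beta_3 = (X_3' M_2 X_3)^{-1} X_3' M_2 X_1 \beta_1
```

The bias vanishes when X_3' M_2 X_1 = 0 or when \beta_1 = 0.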
However, I also want to work out symbolically whether the mean squared
error of B_3 is larger under model 2 than under model 1.
To do this, I believe I need to compare
var(B_{3m2}) - var(B_{3m1}) = Q
where the subscripts m1 and m2 denote the estimates of \beta_3 from
models 1 and 2, respectively. Because var(B_{3m2}) and var(B_{3m1})
are matrices, the resulting matrix Q will be positive semidefinite if
B_{3m1} is more efficient than B_{3m2}.
Over the past few days I have worked through a fair amount of matrix
algebra to derive the variance/covariance matrix of both estimators.
That is, what is the variance/covariance matrix of B_3 with and without
X_1 and \beta_1 in the model?
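One compact way to write the two matrices is sketched below. It assumes
var(\epsilon) = \sigma^2 I and full column rank of [X_1 X_2 X_3],
neither of which I stated above. The symbols M_2, M_{12} and X_{12} are
introduced here only for illustration.

```latex
% Residual-maker matrices for X_2 alone and for X_{12} = [X_1 X_2].
M_2    = I - X_2 (X_2' X_2)^{-1} X_2'
M_{12} = I - X_{12} (X_{12}' X_{12})^{-1} X_{12}', \quad X_{12} = [X_1 \; X_2]

% Sampling covariance of the estimate of \beta_3 under each model:
var(B_{3m1}) = \sigma^2 (X_3' M_{12} X_3)^{-1}
var(B_{3m2}) = \sigma^2 (X_3' M_2 X_3)^{-1}

% The two inverted matrices differ by a quadratic form in X_3:
X_3' M_2 X_3 - X_3' M_{12} X_3 = X_3' (M_2 - M_{12}) X_3
```

Written this way, the comparison reduces to the definiteness of
M_2 - M_{12}, which is a difference of two projection matrices.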
I have done so, but comparing the matrices to determine whether one is
more efficient than the other seems impossible to solve symbolically.
At this point, I need to pause and see if there is a smarter or more
efficient way to determine whether the estimates from model 1 are more
efficient than the estimates from model 2.
Because I am interested in a general problem, and not working with data,
I am interested in doing this symbolically. If I had data, I could of
course do this numerically. But, I am trying to see if a certain model
that is used in educational evaluations is both biased and has larger
mean squared error than the model I believe should be used.
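To illustrate the numerical version of the comparison, here is a
minimal sketch in Python with NumPy. Everything in it is an assumption
for illustration only: the simulated X_1, X_2, X_3 (Gaussian columns
rather than incidence matrices), the dimensions, \sigma^2 = 1, and the
value of \beta_1. It evaluates Q for one made-up design, so it is a
sanity check on the algebra and not a symbolic result. The last part
adds the outer product of the bias to the model 2 variance, since the
mean squared error matrix is variance plus squared bias.

```python
import numpy as np

# Hypothetical design: dimensions and Gaussian columns are assumptions.
rng = np.random.default_rng(0)
n, k1, k2, k3 = 200, 2, 3, 2
X1 = rng.normal(size=(n, k1))
X2 = rng.normal(size=(n, k2))
X3 = rng.normal(size=(n, k3))
sigma2 = 1.0                      # assumed error variance
beta1 = np.array([0.5, -0.3])     # hypothetical omitted effects


def cov_last_block(X, k, sigma2):
    """Trailing k x k block of sigma2 * (X'X)^{-1}."""
    full = sigma2 * np.linalg.inv(X.T @ X)
    return full[-k:, -k:]


# Covariance of the estimate of beta_3 under each model.
var_b3_m1 = cov_last_block(np.hstack([X1, X2, X3]), k3, sigma2)
var_b3_m2 = cov_last_block(np.hstack([X2, X3]), k3, sigma2)

# Q as defined in the post; its eigenvalues give its definiteness.
Q = var_b3_m2 - var_b3_m1
eigvals_Q = np.linalg.eigvalsh((Q + Q.T) / 2)
print("eigenvalues of Q:", eigvals_Q)

# Bias of the model 2 coefficients when model 1 is true:
# (Z'Z)^{-1} Z' X1 beta1, with Z = [X2 X3]; keep the beta_3 part.
Z = np.hstack([X2, X3])
bias_b3 = np.linalg.solve(Z.T @ Z, Z.T @ X1 @ beta1)[-k3:]

# MSE matrix of model 2 = variance + bias * bias'.
# Model 1 is unbiased here, so its MSE matrix equals its variance.
mse_b3_m2 = var_b3_m2 + np.outer(bias_b3, bias_b3)
D = mse_b3_m2 - var_b3_m1
eigvals_D = np.linalg.eigvalsh((D + D.T) / 2)
print("eigenvalues of MSE difference:", eigvals_D)
```

The signs of the printed eigenvalues show, for this one design, whether
each difference is positive or negative (semi)definite.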
I would appreciate any reactions. This is my first post to the list.
I've read the posting guide and FAQs and hope I have provided all
relevant information. If not, I am happy to provide any other relevant
details.
Harold
hdoran at air dot org