Hi allstat,
I received very few replies to my question below, but they were helpful
ones. There are a few ways to deal with heterogeneous variances in Stata,
but curiously none really solves my problem! Here I'll list them:
-vwls-
-reg- with aweights
-reg- with robust standard errors
-rreg-
-gllamm-
Consider my problem below. Say I want to regress age on sex controlling
for country, but I believe the variances differ between the four countries
that I've collected my data from, though not between the sexes.
Doing -xi: vwls age i.sex i.country- will assume that all eight groups
defined by sex and country have different variances. Doing -reg- with
aweights requires me to obtain the variances first, perhaps from a
regression without the aweights, but then the residual variances after
the weighted regression will differ from those of the first. Doing
-reg- with robust SE does not adjust my coefficients for the variances,
and does not take my variance structure into account. Moreover, in
practice there seem to be particular problems when sub-groups are small.
Using -rreg- suffers from the same problem as -reg- with robust, although
the coefficients are now adjusted for the variance differences. And I just
cannot get -xi: gllamm age i.sex i.country, i(id) s(ctry)- to work when I
define ctry via -tab country, gen(z)- and -eq ctry: z1 z2 z3 z4-, where
there is just one id per subject. In any case, it's too much effort to use
-gllamm- to solve a simple problem like this!
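For reference, the two-step -reg- with aweights route described above
could be sketched roughly as follows. This is only a sketch, assuming
the variables are called age, sex and country as in my example; the
names ehat, e2, s2 and w are made up for illustration.

```stata
* Step 1: ordinary regression, ignoring the unequal variances
xi: regress age i.sex i.country
predict double ehat, residuals

* Step 2: estimate one residual variance per country
* (ehat, e2, s2 and w are hypothetical variable names)
generate double e2 = ehat^2
egen double s2 = mean(e2), by(country)

* Step 3: refit with analytic weights inversely proportional
* to the estimated country-specific variance
generate double w = 1/s2
xi: regress age i.sex i.country [aweight = w]
```

Steps 2 and 3 could be repeated with the new residuals until the
estimated variances settle down, which is one way round the problem
that the residual variances change after weighting.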
I think what I want is an approach half-way between OLS regression and
robust regression (rreg). Robust regression calculates the variance
matrix from the residuals of the fitted model, but surely if we simply
averaged the squared residuals within each group (country), we'd have
a more appropriate variance matrix.
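To make the idea concrete, here is a rough sketch of how such a
variance matrix might be put together by hand. It keeps the OLS
coefficients and builds a sandwich-type matrix in which each
observation's squared residual is replaced by its country average. It
assumes the variables age, sex and country from my example; r, r2 and
v_ctry are made-up names, and no degrees-of-freedom correction is
applied, so treat it as an illustration rather than a tested estimator.

```stata
* OLS fit; xi: creates the _I* dummy variables used below
xi: regress age i.sex i.country
predict double r, residuals

* Average squared residual within each country
* (r, r2 and v_ctry are hypothetical variable names)
generate double r2 = r^2
egen double v_ctry = mean(r2), by(country)

* X'X and X'DX, where D is diagonal with v_ctry on the diagonal;
* -matrix accum- adds the constant as the last column by default
matrix accum XX = _I*
matrix accum XDX = _I* [iweight = v_ctry]

* Sandwich-type variance matrix for the OLS coefficients
matrix V = invsym(XX) * XDX * invsym(XX)
matrix list V
```

The square roots of the diagonal of V would then serve as standard
errors for the coefficients reported by the first -regress-.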
I think this is a logical, simple extension to the robust regression
technique, although I haven't read any papers concerning it. If
anyone knows of such a paper, it'd be great to let me know. If it proves
to be theoretically sound, then perhaps someone would like to write a
Stata program for it, as I'm sure this is quite a common problem,
particularly if it can be extended to more complicated linear models.
Tim