Hello Everyone,
Currently I am conducting a missing value analysis, but am puzzled by
some of the output in SPSS.
Initially I tried to use the SPSS facility to estimate the missing
values by means of multiple regression (with no adjustment of the
regression estimates).
For the hypothetical data set at the foot of this page, SPSS produced
the following table of (correct) estimated means:
Summary of Estimated Means
              X1      X2      X3
Listwise      2.3846  2.4615  2.9231
All Values    2.5000  2.7647  3.1111
Regression    2.3846  2.4615  2.9231
However, the following table implies that the standard deviation of the
values predicted by regression is the same as the standard deviation
from the 'listwise' scenario. By my calculations, for example, the
standard deviation of the x1 values predicted by regression should be
0.674268, using the multiple regression model
x1 = 3.2745 - 0.5122*x2 + 0.1269*x3 to predict values of x1.
Summary of Estimated Standard Deviations
              X1      X2      X3
Listwise       .9608  1.3301  1.5525
All Values    1.2948  1.4803  1.6764
Regression     .9608  1.3301  1.5525
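The hand calculation can be reproduced outside SPSS. What follows is a minimal sketch, assuming Python with NumPy (not part of SPSS), that fits x1 on x2 and x3 by ordinary least squares using only the 13 complete cases from the hypothetical data. It assumes that the 0.674 figure refers to the standard deviation of the fitted x1 values over those complete cases; that reading is an assumption, not something the SPSS output confirms.

```python
import numpy as np

# The 13 complete (listwise) cases from the hypothetical data: x1, x2, x3.
complete = np.array([
    [2, 2, 2], [4, 1, 1], [3, 3, 5], [2, 2, 5], [2, 5, 4],
    [1, 4, 2], [2, 3, 1], [2, 1, 1], [3, 2, 3], [2, 3, 5],
    [3, 1, 4], [4, 1, 3], [1, 4, 2],
], dtype=float)

x1 = complete[:, 0]
# Design matrix with columns: intercept, x2, x3.
design = np.column_stack([np.ones(len(complete)), complete[:, 1:]])

# Ordinary least squares fit of x1 on x2 and x3.
coef, *_ = np.linalg.lstsq(design, x1, rcond=None)
fitted = design @ coef

# Sample standard deviations (n - 1 in the denominator).
sd_observed = float(np.std(x1, ddof=1))
sd_fitted = float(np.std(fitted, ddof=1))

print("coefficients (intercept, x2, x3):", np.round(coef, 4))
print("SD of observed x1:", round(sd_observed, 4))
print("SD of fitted x1:  ", round(sd_fitted, 4))
```

Under these assumptions the coefficients agree with the model quoted above, the observed x1 values give the listwise figure of .9608, and the fitted values give a smaller standard deviation of roughly 0.674.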
Additionally, when I request a file containing imputed values using
'multiple regression (with no adjustment)', SPSS has merely replaced the
missing values by 2.5 for x1, 2.76 for x2 and 3.11 for x3 (i.e. the mean
values of the original variables). Surely this is incorrect. If
multiple regression is used for imputation, then surely the imputed
values should differ when the values of the independent variables
differ? I am a little concerned about how SPSS is imputing
missing values, especially if I choose the more complex scenarios (e.g.
the EM method and the multiple regression method with parameter
adjustment), where I cannot check its results by hand.
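To illustrate the expectation that regression-imputed values should vary with the predictors, here is a minimal sketch, again assuming Python with NumPy. It fits x1 on x2 and x3 using the complete cases only and then predicts x1 for the two cases where x1 is missing. This shows what a plain, unadjusted regression imputation based on the complete cases would give; it is not a reconstruction of the SPSS algorithm, which may estimate the regression differently.

```python
import numpy as np

nan = np.nan
# Hypothetical data from the foot of the post: x1, x2, x3 (nan = missing).
data = np.array([
    [2, 2, 2], [5, nan, 2], [4, 1, 1], [3, 3, 5], [2, 2, 5],
    [2, 5, 4], [1, 4, 2], [2, 3, 1], [1, 2, nan], [2, 1, 1],
    [5, 3, nan], [3, 2, 3], [2, nan, 4], [1, nan, 5], [nan, 4, 6],
    [2, 3, 5], [3, 1, 4], [4, 1, 3], [1, 4, 2], [nan, 6, 1],
])

# Fit x1 = b0 + b2*x2 + b3*x3 on the complete cases only.
complete = data[~np.isnan(data).any(axis=1)]
design = np.column_stack([np.ones(len(complete)), complete[:, 1:]])
coef, *_ = np.linalg.lstsq(design, complete[:, 0], rcond=None)

# Cases where x1 is missing but both predictors are observed.
target = np.isnan(data[:, 0]) & ~np.isnan(data[:, 1:]).any(axis=1)
predictors = data[target][:, 1:]
imputed = coef[0] + predictors @ coef[1:]

for (x2, x3), value in zip(predictors, imputed):
    print(f"x2={x2:.0f}, x3={x3:.0f} -> imputed x1={value:.3f}")
```

With this data the two imputed x1 values come out at roughly 1.99 and 0.33, so neither equals the overall x1 mean of 2.5 and they differ from each other.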
Has anyone noticed these anomalies too?
Many thanks,
Kim.
Hypothetical Data
x1 x2 x3
2 2 2
5 . 2
4 1 1
3 3 5
2 2 5
2 5 4
1 4 2
2 3 1
1 2 .
2 1 1
5 3 .
3 2 3
2 . 4
1 . 5
. 4 6
2 3 5
3 1 4
4 1 3
1 4 2
. 6 1
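The 'Listwise' and 'All Values' rows of both summary tables can be checked directly from this data. A minimal sketch, assuming Python with NumPy and sample standard deviations (n - 1 in the denominator), with missing values coded as nan:

```python
import numpy as np

nan = np.nan
# Hypothetical data: columns x1, x2, x3 (nan = missing).
data = np.array([
    [2, 2, 2], [5, nan, 2], [4, 1, 1], [3, 3, 5], [2, 2, 5],
    [2, 5, 4], [1, 4, 2], [2, 3, 1], [1, 2, nan], [2, 1, 1],
    [5, 3, nan], [3, 2, 3], [2, nan, 4], [1, nan, 5], [nan, 4, 6],
    [2, 3, 5], [3, 1, 4], [4, 1, 3], [1, 4, 2], [nan, 6, 1],
])

# Listwise: keep only cases with no missing values on any variable.
listwise = data[~np.isnan(data).any(axis=1)]
listwise_means = listwise.mean(axis=0)
listwise_sds = listwise.std(axis=0, ddof=1)

# All values: use every observed value of each variable separately.
all_means = np.nanmean(data, axis=0)
all_sds = np.nanstd(data, axis=0, ddof=1)

print("Listwise   means:", np.round(listwise_means, 4))
print("All values means:", np.round(all_means, 4))
print("Listwise   SDs:  ", np.round(listwise_sds, 4))
print("All values SDs:  ", np.round(all_sds, 4))
```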