Dear all,
I am working on a longitudinal study of children in the UK and am
trying the PAN package in R to impute missing data, since it meets the
critical criterion of taking into account each subject's individual
trend over time as well as the population trend over time.

To validate the procedure, I started by deleting some known values of a
relatively stable and predictable variable, height. We have 6 annual
measures of height on 300 children. I imputed the deleted values using
PAN and compared the imputed values with the real values I had deleted.
In most individuals the imputed values fit the individual trend
extremely well. However, for a handful of individuals the imputed value
was lower than the previous (real) value of height, or higher than the
next (real) value, making it appear that height went down, which in
reality it never does.

So my question is: why does this happen, when it seems to work so well
for the majority of individuals? Am I doing something wrong?

As a novice user of R (and new to this area of statistics), I wondered
if anyone could point me in the right direction, since the mixed-effects
design (plus the potential ease and speed) of the PAN procedure for
longitudinal data imputation is very appealing.

I would very much appreciate any advice you could give me. Many thanks
in advance.
Jo Hosking
Code and a small sample of the data are shown below (I can supply more
data to anyone willing).
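library(pan)  # provides pan()

# Read the long-format data (one row per child per visit)
impht.data <- read.delim("impht_long_trunc.dat", header = TRUE)
impht.data$sex <- factor(impht.data$sex, labels = c("Boys", "Girls"))
impht.data$visit <- factor(impht.data$visit)
impht.data$code <- factor(impht.data$code)

y <- impht.data$htmiss   # height with some known values deleted
subj <- impht.data$code  # subject identifier
# cbind() stores the factors sex and visit as their integer codes;
# note that pred contains no intercept (constant) column
pred <- cbind(impht.data$age, impht.data$sex, impht.data$visit)
xcol <- 1:3  # columns of pred with fixed effects: age, sex, visit
zcol <- 1    # column of pred with a random effect: age
prior <- list(a = 1, Binv = 1, c = 1, Dinv = 1)
ht1 <- pan(y, subj, pred, xcol, zcol, prior, seed = 13579, iter = 1000)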
code  sex  visit  age   ht      htmiss
1     2    1      4.87  105     105
1     2    2      5.86  109.6   NA
1     2    3      6.88  116.4   116.4
1     2    4      7.72  121.2   121.2
1     2    5      8.72  126.7   126.7
1     2    6      9.71  132.3   132.3
2     2    1      4.84  107.1   107.1
2     2    2      6     115.7   115.7
2     2    3      6.86  121.4   121.4
2     2    4      7.69  126.5   126.5
2     2    5      8.7   134.15  134.15
2     2    6      9.76  140     NA
3     2    1      4.62  103     103
3     2    2      5.69  108.9   108.9
3     2    3      6.87  115.1   NA
3     2    4      7.55  118.6   118.6
3     2    5      8.46  123.6   123.6
3     2    6      9.63  128.9   128.9
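
To make the problem concrete, the check I describe above (comparing an
imputed value with the real one, and looking for an apparent drop in
height) could be sketched roughly as follows. This is only a minimal
sketch in base R: the helper flag_decreases and the column htimp are
hypothetical names, and the imputed value 119.4 is made up purely for
illustration, not actual PAN output. In practice the completed series
would be taken from the fitted object (I am assuming it is returned as
ht1$y).

```r
# Flag visits where a child's height is lower than at the previous visit.
# Rows are assumed to be sorted by child and then by visit.
flag_decreases <- function(code, ht) {
  # Within-child change since the previous visit (NA at each first visit)
  change <- ave(ht, code, FUN = function(h) c(NA, diff(h)))
  !is.na(change) & change < 0
}

# Child 3 from the sample data; the visit 3 height (real value 115.1)
# is the one that was deleted and then imputed.
child3 <- data.frame(
  code  = 3,
  visit = 1:6,
  ht    = c(103, 108.9, 115.1, 118.6, 123.6, 128.9),
  htimp = c(103, 108.9, 119.4, 118.6, 123.6, 128.9)  # 119.4 is made up
)

# Imputation error at the deleted visit (imputed minus real)
imp_error <- child3$htimp[3] - child3$ht[3]

# Visits at which the completed series appears to go down
drop_visits <- child3$visit[flag_decreases(child3$code, child3$htimp)]
```

With this made-up value the imputed height at visit 3 is higher than
the real height at visit 4, so visit 4 is flagged as an apparent
decrease, which is the pattern I see in a handful of children.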