In case you're interested, there is a solution to this problem in the
multilevel SEM framework -- thanks to the Mplus team for pointing this
out -- which I describe in this Instats blog post.
The problem behind these large correlations is that, with bounded survey
response distributions, the group mean can become collinear with the
within-group variance. As the scores within a group tend toward the
lower or upper boundary of a response distribution (e.g., tending toward
1 or 5 on a 5-point Likert scale), the variance within the group goes
down by design -- the group members' scores become more similar as they
get compressed against the boundary of the response options. The net
result is that the group mean and the within-group variance (the
location-scale parameters) are strongly correlated: positively when the
scores tend to be massed at the lower boundary, and negatively when they
tend to be massed at the upper boundary.
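The mechanism can be reproduced in a quick simulation. What follows is a minimal sketch in Python, not the Mplus analysis itself: the thresholds, group sizes, parameter ranges, and function names are all assumptions chosen for illustration. Group locations and scales are drawn independently on a latent continuous variable, which is then cut into a 5-point response, so any correlation between the observed group means and variances comes from the bounded response format alone.

```python
import random
import statistics


def pearson(x, y):
    """Plain Pearson correlation, written out to avoid version-specific helpers."""
    mx, my = statistics.fmean(x), statistics.fmean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


def to_likert(value, thresholds=(-1.5, -0.5, 0.5, 1.5)):
    """Cut a latent score into a 1-5 response (thresholds are hypothetical)."""
    return 1 + sum(value > t for t in thresholds)


def simulate(n_groups=500, group_size=30, centre=0.0, seed=1):
    """Return (latent r, observed r) between group means and within-group variances."""
    rng = random.Random(seed)
    lat_means, lat_vars, obs_means, obs_vars = [], [], [], []
    for _ in range(n_groups):
        # Location and scale are drawn independently of each other, so any
        # correlation in the observed scores is produced by categorisation.
        location = rng.gauss(centre, 0.7)
        scale = rng.uniform(0.6, 1.2)
        latent = [rng.gauss(location, scale) for _ in range(group_size)]
        observed = [to_likert(v) for v in latent]
        lat_means.append(statistics.fmean(latent))
        lat_vars.append(statistics.variance(latent))
        obs_means.append(statistics.fmean(observed))
        obs_vars.append(statistics.variance(observed))
    return pearson(lat_means, lat_vars), pearson(obs_means, obs_vars)


if __name__ == "__main__":
    # centre < 0 masses scores near 1; centre > 0 masses them near 5.
    for label, centre in (("lower boundary", -1.5), ("upper boundary", 1.5)):
        latent_r, observed_r = simulate(centre=centre)
        print(f"{label}: latent r = {latent_r:.3f}, observed r = {observed_r:.3f}")
```

Under these assumptions the latent correlation should stay near zero, while the observed correlation should come out clearly positive when scores mass at the lower boundary and clearly negative when they mass at the upper boundary.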
As this Mplus output shows, the problem can be eliminated by properly
treating the data as
ordinal. I was initially trying to use a censored (Tobit) model or a
two-part model to address the problem, but much simpler and in this case
better is the attached ordinal response approach -- similar to a 2PL
polytomous IRT model with a probit link and Bayes estimator. This
specification appropriately reflects the categorical and bounded nature
of the observed data.
Setting the model up this way requires having enough scale items to work
with the latent factor at the within-group level, rather than using a
scale mean to define the within-group variance as a random B-level
variable. This should always be possible when the problem of large
B-level location-scale correlations is caused by categorical item-level
data. The resulting correlations between the random latent within-group
variance and the latent factor at the B level are striking when compared
to their continuous counterparts:
Continuous version of B-level correlations (output file continuous.out):
COR(TSAT_B, SATV) = .858
COR(TWK_B, TWKV) = .768
Categorical version of B-level correlations (output file categorical.out):
COR(TSAT_B, SATV) = -.099
COR(TWK_B, TWKV) = -.132
The net result is that, unlike in the continuous case, with the proper
categorical data specification you can now treat the group means and the
within-group variances as distinct B-level constructs. You can also look
at latent interactions between the B-level group means and random
variances to meaningfully evaluate substantive hypotheses about
location-scale interactions at the group level (keeping in mind for
interpretation that the random variance is actually a logged version of
this variable). Although some researchers have recognized this
possibility, they have raised concerns such as estimation difficulties
(Mestdagh et al., 2018: p. 694), but Mplus's Bayes estimator renders
this concern irrelevant.
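On the point about interpretation: because the random variance is on a log scale, a shift of a given size corresponds to a multiplicative change in the variance itself. A minimal sketch in Python of that conversion (the function name and the example shifts are hypothetical, and this is plain arithmetic rather than anything taken from the Mplus output):

```python
import math


def variance_ratio(delta_log_variance):
    """Multiplicative change in the variance implied by a shift on the log scale."""
    return math.exp(delta_log_variance)


if __name__ == "__main__":
    # Hypothetical shifts on the log-variance scale, for illustration only.
    for delta in (-1.0, 0.5, 1.0):
        ratio = variance_ratio(delta)
        print(
            f"shift {delta:+.1f} on the log scale -> "
            f"variance x {ratio:.2f}, SD x {math.sqrt(ratio):.2f}"
        )
```

So a group one unit higher on the logged variance has e (about 2.72) times the within-group variance, not one unit more of it.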
Because these models are now viable, including latent interactions
between group means and random within-group variances, additional
research can be done to illustrate them and offer further advice on
their applications.
If you or your colleagues/PhD students would like to learn more about
these kinds of models and how to estimate them, we still have a few
places left in my upcoming 3-day seminar on Multilevel SEM in Mplus:
Location-Scale Models, running March 1-3. Hope to see you there!
Best wishes, and you can find the data and Mplus files here in case you want to have a play.