Thank you all for your responses thus far.
Perhaps I should have provided more information originally. We are
regressing Forest Stand Biomass (tonnes/ha) against Forest Stand Height (m)
and Stand Crown Closure. Both biomass and height are continuous variables.
The problem, however, is that the Crown Closure term is really a
four-category variable (A, B, C, D); up to now we have assigned each class
its midpoint value to "make it" artificially continuous.
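
To make the two possible treatments of Crown Closure concrete, here is a minimal sketch in Python (NumPy only) that codes the same classes once as midpoints and once as indicator (dummy) columns. The midpoint values, the choice of A as the reference class, and the function names are assumptions made up for illustration; they are not our actual values.

```python
import numpy as np

CLASSES = ["A", "B", "C", "D"]
# Hypothetical class midpoints (percent closure); placeholders only.
MIDPOINTS = {"A": 80.0, "B": 60.0, "C": 40.0, "D": 20.0}

def midpoint_coding(classes):
    """One numeric column: each class replaced by its midpoint."""
    return np.array([MIDPOINTS[c] for c in classes], dtype=float)

def indicator_coding(classes, reference="A"):
    """One 0/1 column per non-reference class (here B, C, D)."""
    levels = [c for c in CLASSES if c != reference]
    return np.array(
        [[1.0 if c == lvl else 0.0 for lvl in levels] for c in classes]
    )

closure = ["A", "C", "D", "B", "C"]   # made-up sample of stands
x_mid = midpoint_coding(closure)      # shape (5,)
x_ind = indicator_coding(closure)     # shape (5, 3)
```

With midpoints the model is forced to treat the class spacing as numerically meaningful, whereas the indicator columns let each class have its own effect on biomass, at the cost of two extra parameters.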
Height shows a non-linear, roughly S-shaped trend with biomass. The crown
closure data, due to its "class" format, shows only a very weak
inverse-U-type trend with biomass. We need to include Crown closure in our
model, however, because it is a surrogate for tree stem density and age
(forest structure) which, in turn, greatly affects biomass distribution. It
is really the only surrogate we have in our dataset for this.
We know that regression modelling using an artificially continuous variable
is not ideal. We used a square root transformation on the biomass variable
since it appears to linearise the trend height has with biomass and it
provides a relatively good fit in terms of R2 and SEE. We have tried using
different regression models including two non-linear models which
approximate an S-Shaped curve. These models provide decent "fits" but
appear to greatly underestimate biomass at high Height and Crown closure
values. The SQRT transformation of Y appears, as I said, to linearise the
trend, providing a better overall fit. Would you suggest another
approach altogether, such as a GLM procedure? Do you think that the way
we are treating the Crown Closure term is inappropriate?
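
For reference, the square-root approach described above can be sketched roughly as follows in Python (NumPy only). Everything here is an assumption for illustration: the data are simulated, the coefficients are made up, and Crown Closure is entered as indicator columns rather than midpoints. One point the sketch tries to show: if sqrt(Y) = mu + e with Var(e) = sigma^2, then E[Y] = mu^2 + sigma^2, so simply squaring the fitted values gives predictions of mean biomass that are somewhat too low.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 200

# Simulated stands (NOT real data): height in m, closure class 0..3 for A..D.
height = rng.uniform(5.0, 35.0, n)
closure = rng.integers(0, 4, n)

# Design matrix: intercept, height, indicators for classes B, C, D.
ind = (closure[:, None] == np.arange(1, 4)).astype(float)
X = np.column_stack([np.ones(n), height, ind])

# Made-up coefficients on the sqrt(biomass) scale.
true_beta = np.array([2.0, 0.4, 1.0, 1.5, 0.5])
sqrt_y = X @ true_beta + rng.normal(0.0, 0.5, n)
biomass = sqrt_y ** 2

# Ordinary least squares on the transformed response.
beta, *_ = np.linalg.lstsq(X, np.sqrt(biomass), rcond=None)
resid = np.sqrt(biomass) - X @ beta
sigma2 = resid @ resid / (n - X.shape[1])   # residual variance estimate

# Back-transform to tonnes/ha: naive squaring vs. adding sigma^2.
naive = (X @ beta) ** 2
corrected = naive + sigma2
```

Comparing `naive.mean()` and `corrected.mean()` against `biomass.mean()` shows the size of the back-transformation bias on the simulated data; whether it matters for our stands would depend on the residual variance we actually observe.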
Thanks again for all your help.
Brad.