Hello all

Does anyone have any views on the following?

I am about to conduct an ordinal logistic regression using variable
selection.  The package available to me is SAS.

The data are as follows:

There are X items, and Y people were available to evaluate them.
Each item is evaluated on a 1...5 scale (1=strongly agree...5=strongly
disagree) for a single question.

Due to the size of the evaluation task, each person is allocated half of
the X items to evaluate.  Each person is oblivious to the responses of
other people.

At the end of the evaluation task, it was hoped that the items would
have been evaluated an equal number of times but, due to time
constraints, some people did not evaluate all of their items.

Each of the items has values for Z independent variables.  My job is to
model the responses (which have values ranging from 1,...,5) based on
these independent variables.

Say we ended up with 100 evaluations (i.e. responses) altogether.  I am
a little concerned that any one item has been evaluated by several
people so we don't have 100 distinct vectors of explanatory variables.
We have instead:

Item 1 with explanatory vector 1 having Z1 responses from Z1 people.
Item 2 with explanatory vector 2 having Z2 responses from Z2 people.
Item 3 with explanatory vector 3 having Z3 responses from Z3 people.
Etc.
(and of course there is overlap between the people in groups Z1,Z2 and
Z3...etc)

My question is: could we still enter these data into SAS as 100
separate cases (ignoring the fact that patterns of explanatory
variables are replicated)?

i.e.
Case  X1   X2   X3   X4....XP  Response
1      .    .    .    .     .     .
2      .    .    .    .     .     .
3      .    .    .    .     .     .
.
.
.
100    .    .    .    .     .     .

Note that SAS calls this single-trial syntax (events/trials syntax is
not available for ordinal response data).
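
To make the layout above concrete, here is a small sketch in Python
(not SAS) of how the one-row-per-evaluation table would be built.  The
item names, person labels, covariate values and responses are all
made up for illustration; only the shape of the result matters.

```python
# Hypothetical item-level explanatory vectors (one per item).
item_covariates = {
    "item1": {"x1": 0.2, "x2": 1.0},
    "item2": {"x1": 0.7, "x2": 0.0},
    "item3": {"x1": 0.5, "x2": 1.0},
}

# Hypothetical evaluations: (person, item, response on the 1..5 scale).
evaluations = [
    ("p1", "item1", 2),
    ("p1", "item2", 4),
    ("p2", "item1", 3),
    ("p2", "item3", 5),
    ("p3", "item2", 4),
]


def build_cases(evaluations, item_covariates):
    """Return one row per evaluation, repeating the item's covariates."""
    cases = []
    for case_no, (person, item, response) in enumerate(evaluations, start=1):
        row = {"case": case_no, "person": person, "item": item}
        row.update(item_covariates[item])  # same vector for every rater of this item
        row["response"] = response
        cases.append(row)
    return cases


cases = build_cases(evaluations, item_covariates)
for row in cases:
    print(row)
```

Each evaluation becomes its own case, so an item rated by several
people contributes several rows with identical X values.  The person
and item labels are kept in each row only so that the replication
stays visible.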

(Having said all of this, there is the possibility that certain
characteristics of the respondents, e.g. 'age', may have an effect on the
response.  Thus, if we added 'age' to the set of predictors, we would
have fewer replicates of each explanatory variable vector.)
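
As a rough illustration of that last point, the following Python sketch
counts distinct explanatory patterns with and without a respondent-level
predictor.  The rows and the ages in them are invented for the example.

```python
from collections import Counter

# Hypothetical rows: (item covariate x1, item covariate x2, respondent age).
rows = [
    (0.2, 1.0, 34),
    (0.2, 1.0, 51),
    (0.7, 0.0, 34),
    (0.7, 0.0, 34),
    (0.5, 1.0, 51),
]


def pattern_counts(rows, n_columns):
    """Count how often each pattern of the first n_columns predictors occurs."""
    return Counter(row[:n_columns] for row in rows)


without_age = pattern_counts(rows, 2)  # item covariates only
with_age = pattern_counts(rows, 3)     # item covariates plus age

print(len(without_age), "distinct patterns without age")
print(len(with_age), "distinct patterns with age")
```

Adding a respondent-level column splits a replicated pattern only where
the raters differ on that column, so some replication can remain.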

Many thanks for your views on this,
All the Best,
Kim.