Dear all,
here is a summary of the responses I got from asking for comments on the statement "Multivariate analysis differs from the more traditionally used univariate methods by focusing on two ore more variables simmultaneously, i.e. by taking the complex net of inter-correlations between the variables into consideration".
Thanks to everyone who answered, and sorry it's taken me so long to post a summary, but as some of you have pointed out I haven't been very articulate (as well as spelling simultaneoulsy wrong) so it has taken me a while to process all the replies of the replies of the replies. I have definitely learnt my lesson!
What was troubling me the most was that univariate methods take into consideration correlations between variables as well (in the form of interactions) and I couldn't quite get my head round how that fitted the "focusing on two or more variables simultaneously" bit. The missing link was fairly obvious: multivariate statistics deals with multiple response variables rather than multiple explanatory variables, but sometimes you need people to tell you what you already know for it to sink in properly...
I have pasted below all the replies I received; they all contain very informative comments on multivariate analysis.
Regards to all, hope you have a nice weekend
Sandra
......................................................................................................
Robert Grant wrote:
It seems quite a good succinct statement. I would prefer "Multivariate
analysis expands on the simpler univariate methods by..." and it would be
helpful to distinguish that you are referring to response variables (if
your audience will understand that) because lots of people start off
thinking that univariate methods deal with two variables: the x and the y,
and of course they have a point. It's also terribly difficult to explain
interactions succinctly, but I think the common sense interpretation of the
word gets most of the message across, so would prefer it to
inter-correlations (after all, interaction might not be linear correlation).
.........................................................................................................
Franck-Olivier Le Brun said:
I would say that contrary to what is said in the statement, the more
correlated the variables are, the less accurate (and even wrong) will often
be the parameter estimations in multivariate models. Some complex regression
models such as Partial Least Squares regression allow for correlations among
predictors but classical methods can give misleading results in the presence
of highly correlated variables.
What can be said on multivariate models is that they take into account
multiple confounding factors and enable the evaluation of interactions.
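
[Illustration, not part of Franck-Olivier's reply: a minimal sketch of why correlated predictors hurt parameter estimates. In an ordinary least-squares fit with exactly two predictors, the sampling variance of each slope is multiplied by 1 / (1 - r^2), the variance inflation factor, where r is the correlation between the predictors. The data values and function names below are hypothetical and chosen only to make the contrast visible.]

```python
import math


def pearson_r(x, y):
    """Sample Pearson correlation between two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def variance_inflation(x1, x2):
    """Variance inflation factor for a two-predictor regression."""
    r = pearson_r(x1, x2)
    return 1.0 / (1.0 - r * r)


# Hypothetical predictor values (assumptions, not data from the thread).
x1 = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
x2_unrelated = [1.0, -1.0, 0.0, 0.0, -1.0, 1.0]  # uncorrelated with x1
x2_collinear = [1.1, 1.9, 3.2, 3.9, 5.1, 5.8]    # nearly a copy of x1

vif_unrelated = variance_inflation(x1, x2_unrelated)
vif_collinear = variance_inflation(x1, x2_collinear)
print(vif_unrelated, vif_collinear)
```

[With uncorrelated predictors the factor is 1, so nothing is lost; with the nearly collinear pair it exceeds 100, which is the sense in which classical estimates become unstable.]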
............................................................................................................
Phillip Good wrote:
"The value of an analysis based on simultaneous observations on several
variables--for example, height, weight, blood pressure, and cholesterol
level, is that it can be used to detect subtle changes that might not be
detectable, except with very large, prohibitively expensive samples, were
you to consider only one variable at a time."
[...]
"Multivariate analysis lets us take advantage of the complex net of
inter-correlations among variables. By focusing on two or more variables
simultaneously, multivariate analysis requires fewer observations to reach
definitive conclusions."
I'd be happy to provide you with a preview copy of the chapter on
multivariate analysis from the forthcoming 3rd edition of my text on testing
hypotheses. [which he kindly has, thank you Phillip]
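
[Illustration, not part of Phillip's reply: a minimal sketch of one configuration in which looking at two variables jointly detects a shift that neither variable shows alone. It assumes a known covariance matrix with unit variances and correlation 0.9, a hypothetical mean shift of (+0.5, -0.5), 10 observations, and that the observed mean difference equals the assumed shift exactly. The function name is hypothetical. The advantage depends on the direction of the shift relative to the correlation; a shift along the correlation would gain far less.]

```python
import math


def mahalanobis_sq_2d(shift, var1, var2, rho):
    """Squared Mahalanobis length of a 2-D shift under a known covariance."""
    d1, d2 = shift
    cov = rho * math.sqrt(var1 * var2)
    det = var1 * var2 - cov * cov
    # d' Sigma^{-1} d, with the 2x2 inverse written out explicitly.
    return (var2 * d1 * d1 - 2.0 * cov * d1 * d2 + var1 * d2 * d2) / det


n = 10               # assumed sample size
shift = (0.5, -0.5)  # assumed mean shift, in standard-deviation units
rho = 0.9            # assumed correlation between the two variables

# One variable at a time: z statistic for each mean (unit variances).
z_each = [abs(d) * math.sqrt(n) for d in shift]

# Both variables together: n * d' Sigma^{-1} d, chi-square with 2 df
# under the null hypothesis when the covariance is known.
joint_stat = n * mahalanobis_sq_2d(shift, 1.0, 1.0, rho)

z_crit = 1.96                      # two-sided 5% point of the normal
chi2_crit = -2.0 * math.log(0.05)  # upper 5% point of chi-square, 2 df

print(z_each, z_crit)
print(joint_stat, chi2_crit)
```

[Under these assumptions each univariate statistic falls short of its critical value, while the joint statistic exceeds its critical value by a wide margin.]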
...........................................................................................................
Nick Cox replied:
What's to discuss? This is virtually a _definition_
[...]
As it happens, I'm reviewing a book on multivariate
at present and am puzzled by the apparent definition
it uses. I'm pretty clear that what's crucial is
not having many variables, but having many _response_
variables. Thus multiple regression is not (strictly)
multivariate, but principal components certainly is.
............................................................................................................
Jay Warner wrote:
If one starts with model equations (and 'frequentist' views), then
univariate analyses could be called those which have a single
independent variable and a single response/dependent variable. One
could then say that 'multivariate' includes more than one independent
variable. I believe this is what your statement below is saying.
What is lost in the discussion is that there are things which happen as
a result of the relationships between the independent variables (what I
call factors). First, there are interaction effects, involving
x(1)*x(2) effects on the response. These are commonly called
'interactions' and are of great value in understanding and controlling a
response or the mechanism of behavior. An analysis which ignores these
possible effects is doomed to disaster.
Second, there are effects due to the relationships between the factors.
These are insidious, as they usually are totally ignored by those first
undertaking an analysis. A properly designed experiment sets the factor
levels so that they are truly independent of one another - hence, DoE
and all that goes with that. When the factors are not selected but
simply observed, the possibility of relationships is real and frequently
ignored as I said above. Then all Creation bursts forth, and the
interpretation of results becomes very difficult and argued.
A 'multivariate' analysis that did not discuss these two issues would,
IMHO, be more naive and misleading than useful.
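
[Illustration, not part of Jay's reply: a minimal sketch of an x(1)*x(2) interaction in a designed experiment. It assumes a balanced two-level full factorial with factors coded -1/+1, so the model columns are orthogonal and each least-squares coefficient is simply the average of (column * response). The response values and the function name are hypothetical.]

```python
def fit_two_level_factorial(runs):
    """Coefficients for y = b0 + b1*x1 + b2*x2 + b12*x1*x2.

    `runs` is a list of (x1, x2, y) from a balanced full factorial with
    x1 and x2 coded as -1 or +1. Because the coded columns are orthogonal,
    each coefficient is the mean of (column * y).
    """
    n = len(runs)
    b0 = sum(y for _, _, y in runs) / n
    b1 = sum(x1 * y for x1, _, y in runs) / n
    b2 = sum(x2 * y for _, x2, y in runs) / n
    b12 = sum(x1 * x2 * y for x1, x2, y in runs) / n
    return b0, b1, b2, b12


# Hypothetical responses, one run per factor combination.
runs = [(-1, -1, 11.0), (1, -1, 9.0), (-1, 1, 5.0), (1, 1, 15.0)]

b0, b1, b2, b12 = fit_two_level_factorial(runs)
print(b0, b1, b2, b12)

# Slope of y in x1 at each level of x2: b1 + b12 * x2.
slope_at_low_x2 = b1 - b12
slope_at_high_x2 = b1 + b12
print(slope_at_low_x2, slope_at_high_x2)
```

[In this made-up data the main effect of x2 is zero, yet x2 reverses the sign of the x1 effect; a main-effects-only reading would miss that entirely.]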
[and then when I pointed out that as far as I knew multivariate statistics
was concerned more with multiple responses, Jay replied]
Interesting discussion. I rarely consider multiple response variables,
because in my world they are clearly responses, and interactions between
them are of no direct, analytical concern.
I did do an analysis involving some 120 responses once, in which I used
PCA to show that only 6 of them were seriously different from the
others. But the way I usually set up a project, I know which are the
factors (explanatory), and which are the responses, before starting.
As you use the terms, it seems that '-variate' suggests the response
variable(s). Hmmm... I don't have access to a book of 'standard'
statistical terminology, but I would think that the collected thought of
the practitioners would have decided already whether your usage (as I
interpret it) is generally accepted.
.....................................................................................................
Brian G Miller wrote:
Apart from misspellings, it would be hard to disagree with this. Which
aspect did you wish to discuss?
[and in the next email, after I asked how the interaction terms of
multiple regression and ANOVA fitted in the statement, Brian added
what follows]
Of course it is true that in univariate regression with more than one
predictor, interactions and correlations are important. But I think most
people would consider multivariate to refer to multiple response variables
rather than predictors.
......................................................................................................
John Bibby replied:
I think this is a pretty good summary. (My only problem would be in your use
of the word 'tradition'. It suggests that multivariate analysis is NOT
traditional, whereas in my view it has established a tradition of its own -
based on the linear model - and often these may not be the best available
methods. Other methods - graphical or NON-traditional multivariate - may be
more appropriate.)