To All:
Over the past ten years I have been developing a method that allows us to
infer formal causation from correlations. The notion of formal causation is
taken from Aristotle. No reference to time or experimentation is needed for
the inference.
The method, CR, was introduced in a 1991 paper that can be accessed from
the following website: http://www.wynja.com/chambers/regression.html
The rationale is explained well enough in the paper but a step in the
calculations was left out. The GAUSS code for the complete analysis is
listed below.
I am looking for someone who will work out the proofs needed to justify the
method. They should amount to little more than a series of simple
arithmetic tautologies, but such proofs would nonetheless be crucial. I am
a psychologist, however, and such proofs are beyond me. Several journal
editors have invited me to submit papers on the method, and the papers
would be better if a co-author presented the proofs.
CR has been the topic of intensive debate on the SEMNET newsgroup over the
past ten months. SEMNET is dedicated to structural equation models but, in
the end, appears to limit its interest to LISREL-type approaches. Over
those ten months (or, for that matter, the past ten years) no one has been
able to disprove the CR method, and a number of eminent persons have
admitted that it appears to be on to something.
Until now, CR calculations have been limited to uniformly distributed
causes, but recent work (introduced below) has extended the method's
applicability to normally distributed variables.
I do not wish to get into any debate on allstat concerning the validity of
CR. I assume allstat is not the proper forum for such debate. This post is
meant as an invitation to private exchanges.
William Chambers, PhD
******************** The Most Recent Gauss Code ********************
The following CR program demonstrates that negative D-squares
characterize the causal relations in the model y=x1+x2, where x1 and
x2 are NORMALLY distributed. The essence of the method is that all cases
in which any of y, x1, or x2 lies above z=+1.0 or below z=-1.0 are deleted
from the data. This removes those extreme observations that are based
exclusively on extreme values of x1 or x2, focusing the analysis on that
range of y in which all remaining values of x1 and x2 are conjugated.
For data in which there are latent variables, these latent variables should
be accessible via regression residuals, in the same way they are now
formulated during the standard CR calculations.
The method is not as powerful when this trimming method is used but the
D-squares do tend to be negative, demonstrating that CR can be used with
normally distributed variables, after they are trimmed. Perhaps a
correction for range restriction would be in order. If this trimming
method is to be used, the researcher should be careful to trim ALL
variables on a pairwise basis.
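
For readers without GAUSS, the trimming step described above can be
sketched in Python. This is an illustrative translation, not part of the
original program: the function names zscores and trim_cases are
hypothetical, the seed is arbitrary, and the sample standard deviation
(n - 1 denominator) is assumed to match stdc. Like the missex/packr step
in the listing, it drops a case whenever any one of its z-scores lies
beyond +/- 1.00.

```python
import random
import statistics


def zscores(values):
    """Standardize a list using the sample standard deviation (n - 1)."""
    mean = statistics.fmean(values)
    sd = statistics.stdev(values)
    return [(v - mean) / sd for v in values]


def trim_cases(columns, cutoff=1.0):
    """Keep only the cases whose z-score is within +/- cutoff on every column."""
    n_cases = len(columns[0])
    keep = [i for i in range(n_cases)
            if all(abs(col[i]) <= cutoff for col in columns)]
    return [[col[i] for i in keep] for col in columns]


random.seed(0)  # arbitrary seed, used only to make the run repeatable
n = 100
x1 = [random.gauss(0, 1) for _ in range(n)]
x2 = [random.gauss(0, 1) for _ in range(n)]
y = [a + b for a, b in zip(x1, x2)]  # the causal model y = x1 + x2

zx1, zx2, zy = trim_cases([zscores(x1), zscores(x2), zscores(y)])
print(len(zx1), "of", n, "cases retained after trimming")
```

Because a case is removed if any of its three z-scores is extreme, the
three trimmed lists always stay aligned case by case.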
The Gauss Program:
format /m1/rd 5,2; /* Gauss format statement */
n=100; /* declares sample size */
x1=rndn(n,1); /* creates normal distribution for x1 */
x2=rndn(n,1); /* '' '' x2 */
y=x1+x2; /* the causal model */
x1dev=x1-meanc(x1); /* creates zscores for x1 */
zx1=x1dev./stdc(x1); extzx1=abs(zx1);
x2dev=x2-meanc(x2);
zx2=x2dev./stdc(x2); extzx2=abs(zx2); /* zscores for x2 */
ydev=y-meanc(y);
zy=ydev./stdc(y); extzy=abs(zy); /* zscores for y */
z=zx1~zx2~zy; /*concatenates Z scores */
/* converts z values beyond +/- 1.00 into missing values */
zz=missex(z,(z.>1).or (z.<-1));
zzz=zz;
/* removes cases with missing values */
zzzz=packr(zzz);
zx1=submat(zzzz,0,1); /* pulls out trimmed zx1 */
zx2=submat(zzzz,0,2); /* pulls out trimmed zx2 */
zy=submat(zzzz,0,3); /* pulls out trimmed zy */
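/* begin standard CR analysis: rde(b) pass, with zx1 as a and zy as b */
a=zx1;
b=zy;
x=a~b;
c=corrx(x);
beta=submat(c,1,2); /* correlation between a and b */
respa= b-(beta.*a); /* what is left of b after removing beta*a */
t=corrx(zx1~zx2~zy~respa);
tv=t./abs(t); /* signs of the correlations */
ts=(t.*t).*tv; /* signed squared correlations */
diff=abs(a-respa); /* absolute difference between a and respa */
exta=abs(a);
extb=abs(b); /* extremity of b */
dat=extb~diff;
rdat=corrx(dat);
vdat=rdat./abs(rdat);
sdatb=(rdat.*rdat).*vdat; /* signed squared correlation of extb with diff */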
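/* Begin rde(a) pass: roles reversed, with zy as a and zx1 as b */
a=zy;
b=zx1;
x=a~b;
c=corrx(x);
beta=submat(c,1,2); /* correlation between a and b */
respa= b-(beta.*a); /* what is left of b after removing beta*a */
t=corrx(zx1~zx2~zy~respa);
tv=t./abs(t); /* signs of the correlations */
ts=(t.*t).*tv; /* signed squared correlations */
diff=abs(a-respa); /* absolute difference between a and respa */
exta=abs(a);
extb=abs(b); /* extremity of b */
dat=extb~diff;
rdat=corrx(dat);
vdat=rdat./abs(rdat);
sdata=(rdat.*rdat).*vdat; /* signed squared correlation of extb with diff */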
print "rde(b)";sdatb;
print "rde(a)";sdata;
d=sdatb-sdata;
print "d"; d;
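
The two passes and the final difference can also be followed in Python.
The sketch below is an illustrative translation under stated assumptions,
not the author's program: the helper names (zscores, pearson, rde) are
hypothetical, the seed is arbitrary, and it reports only the off-diagonal
element of the 2x2 matrices that the GAUSS listing prints. As in the
listing, beta is the correlation computed on the trimmed z-scores, which
are not re-standardized after trimming.

```python
import math
import random
import statistics


def zscores(values):
    """Standardize with the sample standard deviation (n - 1 denominator)."""
    mean = statistics.fmean(values)
    sd = statistics.stdev(values)
    return [(v - mean) / sd for v in values]


def pearson(u, v):
    """Pearson correlation between two equal-length lists."""
    mu, mv = statistics.fmean(u), statistics.fmean(v)
    suv = sum((a - mu) * (b - mv) for a, b in zip(u, v))
    suu = sum((a - mu) ** 2 for a in u)
    svv = sum((b - mv) ** 2 for b in v)
    return suv / math.sqrt(suu * svv)


def rde(a, b):
    """One pass: signed squared correlation of |b| with |a - (b - beta*a)|."""
    beta = pearson(a, b)
    resid = [bi - beta * ai for ai, bi in zip(a, b)]
    diff = [abs(ai - ri) for ai, ri in zip(a, resid)]
    extb = [abs(bi) for bi in b]
    r = pearson(extb, diff)
    return r * abs(r)  # r squared, carrying the sign of r


random.seed(0)  # arbitrary seed, used only to make the run repeatable
n = 100
x1 = [random.gauss(0, 1) for _ in range(n)]
x2 = [random.gauss(0, 1) for _ in range(n)]
y = [p + q for p, q in zip(x1, x2)]  # the causal model y = x1 + x2

# Trim: keep a case only if all three z-scores lie within +/- 1.00.
cols = [zscores(x1), zscores(x2), zscores(y)]
keep = [i for i in range(n) if all(abs(c[i]) <= 1.0 for c in cols)]
zx1, zx2, zy = ([c[i] for i in keep] for c in cols)

rde_b = rde(zx1, zy)  # first pass of the listing: a = zx1, b = zy
rde_a = rde(zy, zx1)  # second pass: a = zy, b = zx1
d = rde_b - rde_a
print("rde(b)", rde_b)
print("rde(a)", rde_a)
print("d", d)
```

A single run says little about the sign of d; repeating the run over many
seeds would be needed to see how the values tend to fall.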