Dear ALLSTAT,

Here is what may be a stupid question. The vocabulary of
Proto-Indo-European (a language assumed to be ancestral to English,
Russian, Hindi, etc.) can be reconstructed essentially by the following
rule: where a cognate word appears in at least three of the twelve
families of Indo-European languages (Germanic, Slavic, Italic, ...,
Iranian, Tocharian, ...), it can be reconstructed to
Proto-Indo-European. We might then want to do some analysis of the
Proto-Indo-European vocabulary, but what we actually have is a
'censored' representation of the original data: we only see items that
survive (where survival is a random process of some kind) in three or
more of the twelve families. In principle, we can't recover items that
survive 0, 1 or 2 times.

So my question is: does there exist any systematic treatment of this
kind of 'censored' data and how it relates to the original uncensored
dataset?
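
To make the setup concrete, here is a minimal sketch in Python. It rests
on assumptions that are mine and almost certainly too simple: every word
survives independently in each of the twelve families, with the same
probability p for every word and every family. Under that assumption the
number of families in which a word survives is Binomial(12, p), and what
we see is that distribution restricted to counts of 3 or more. (Strictly
speaking this is truncation rather than censoring, since the unseen
items are missing altogether and even their number is unknown.) The
function names and the simulated figures (5000 words, p = 0.2) are
purely illustrative, not real linguistic data.

```python
import math
import random
from collections import Counter

N_FAMILIES = 12  # number of Indo-European families, as in the question
THRESHOLD = 3    # a word is reconstructible if it survives in >= 3 families


def binom_pmf(k, n, p):
    """P(K = k) for K ~ Binomial(n, p)."""
    return math.comb(n, k) * p**k * (1 - p) ** (n - k)


def prob_observed(p, n=N_FAMILIES, t=THRESHOLD):
    """P(K >= t): the chance that a word is visible to us at all."""
    return 1.0 - sum(binom_pmf(k, n, p) for k in range(t))


def truncated_loglik(p, freq, n=N_FAMILIES, t=THRESHOLD):
    """Log-likelihood of the observed counts under a binomial truncated at t."""
    log_norm = math.log(prob_observed(p, n, t))
    return sum(m * (math.log(binom_pmf(k, n, p)) - log_norm)
               for k, m in freq.items())


def fit_p(counts, n=N_FAMILIES, t=THRESHOLD):
    """Crude maximum-likelihood estimate of p by grid search (assumed model)."""
    freq = Counter(counts)
    grid = [i / 1000 for i in range(1, 1000)]
    return max(grid, key=lambda p: truncated_loglik(p, freq, n, t))


def simulate(n_words, p, seed=0, n=N_FAMILIES, t=THRESHOLD):
    """Simulate survival counts, then discard words seen in fewer than t families."""
    rng = random.Random(seed)
    all_counts = [sum(rng.random() < p for _ in range(n))
                  for _ in range(n_words)]
    return [k for k in all_counts if k >= t]


# Illustrative run on simulated data only.
observed = simulate(n_words=5000, p=0.2)
p_hat = fit_p(observed)
# Scale the observed count up by the estimated chance of being observed.
n_hat = len(observed) / prob_observed(p_hat)
print(f"observed words: {len(observed)}")
print(f"estimated p: {p_hat:.3f}")
print(f"estimated original vocabulary size: {n_hat:.0f}")
```

If the survival model were even roughly right, p_hat would estimate the
survival probability from the visible words alone, and n_hat would give
a rough guess at how many words there were before the 0, 1 and 2 cases
were lost. With real data the equal-probability assumption is the first
thing I would expect to fail.
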

Grateful for any illumination,
Howard Turner