Hi everyone!
I have an unusual stats question...I am a sociolinguist with degrees in clinical and forensic psychology. I am currently working on a project involving the examination of criminal aliases. What I have found is that fugitive offenders tend to select names that are a variation of their real name, and that certain combinatory patterns are clearly favoured over others.

In thinking about these two findings, it came to me that this situation is very much like genetics...there is a basic set of elements, Adenine (A), Guanine (G), Cytosine (C), and Thymine (T), and from these four bases we can create a seemingly endless combination of strings, some of them relatively long and others comparatively short (e.g. CCAAGTAC vs. TAAGGGCCA vs. AGT vs. CCA). Despite the theoretically infinite array of possible combinations, an examination of those that actually exist reveals that some combinations are more frequent than others. My question is this: Is there an elegant yet robust way to determine which combinations have the highest POTENTIAL of occurring?

And this leads me to my second analogy...In thinking about this question, it came to me that predicting name combinations might well be like predicting the weather. If we know that it rained on Monday, Tuesday, and Wednesday, we can calculate the probability that it will also rain on Thursday based on these past observations. AND, if we assemble information about weather conditions over a longer period of time, say hundreds of years rather than a few days, we should be able to reduce (at least theoretically) our error rate when we make a forecast. The larger our set of previous observations, the greater our degree of accuracy...assuming the formula is solid and the observations are valid. SO...here is my question: is there anyone out there who knows something about memory, sequencing, and inferential stats who can help me out?
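To make the idea a bit more concrete, here is a rough sketch of the kind of calculation I have in mind. It is only an assumption on my part that a simple first-order Markov model (each element depends only on the one before it) is a sensible starting point, and the function names are just ones I made up for illustration. It counts how often each element follows another in the example strings above and turns those counts into probabilities:

```python
from collections import Counter, defaultdict


def transition_probabilities(sequences):
    """Estimate P(next element | current element) from observed sequences."""
    counts = defaultdict(Counter)
    for seq in sequences:
        # Pair each element with the one that follows it
        for current, nxt in zip(seq, seq[1:]):
            counts[current][nxt] += 1
    probs = {}
    for current, followers in counts.items():
        total = sum(followers.values())
        probs[current] = {nxt: n / total for nxt, n in followers.items()}
    return probs


def sequence_probability(seq, probs):
    """Multiply the transition probabilities along seq.

    A transition that was never observed contributes 0.0 here;
    a real analysis would probably want some form of smoothing.
    """
    p = 1.0
    for current, nxt in zip(seq, seq[1:]):
        p *= probs.get(current, {}).get(nxt, 0.0)
    return p


# The four example strings from above, used purely as toy data
observed = ["CCAAGTAC", "TAAGGGCCA", "AGT", "CCA"]
probs = transition_probabilities(observed)
```

With something like this, `sequence_probability("AGT", probs)` would score a candidate string against what has already been observed, and a larger set of observations should give steadier estimates. Whether this is the right model for aliases is exactly what I am unsure about.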
Any assistance you can offer would be GREATLY appreciated!!!!
Cheers from almost sunny Germany!
Nick
P.S.
H-E-L-P!