Sorry this summary has taken so long to send out. Thank you all
tremendously for your input. A copy of my original question is at the bottom.
Because a number of people asked, and their suggestions
depended on the details, here’s a little more background on what I’m
looking at.
It’s a control vs. intervention study where the cohorts are not
determined by geographic location. So mixing does occur, and the intervention
does not have hard boundaries, so a bleed-over effect or dilution of the intervention
is a real possibility. There are roughly 10k observations per year, sampled
over 6 years, with a cohort of about 5k specifically followed each year. The data
will be quite unbalanced: not everyone will have the same number of
longitudinal observations. Since the county is so large, my areas of interest
will be neighborhoods on the census tract scale, of which there are at least a
few hundred. I don’t have the specific details figured out yet, but I am
expecting a large, complex correlation and design matrix structure that would
cause inversion issues.
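To put a rough number on the inversion concern, here is a back-of-the-envelope sketch of how much memory a dense, double-precision square matrix would need. The function name is made up for illustration, the 45,000 figure is the observation count from the original email, and 300 is only an assumed stand-in for "a few hundred" census tracts. The actual model structure is not decided yet, so treat this as an order-of-magnitude check, not a statement about the final design.

```python
def dense_matrix_gib(n, bytes_per_value=8):
    """Memory in GiB for a dense n x n matrix of 8-byte doubles."""
    return n * n * bytes_per_value / 2**30


# Sizes are assumptions for illustration only.
cases = [
    ("observation-level, n = 45,000", 45_000),
    ("tract-level, n = 300 (assumed)", 300),
]

for label, n in cases:
    print(f"{label}: {dense_matrix_gib(n):.4f} GiB")
```

Under these assumptions the observation-level matrix needs about 15 GiB before any inversion workspace is counted, while a tract-level matrix is well under a megabyte. That gap is one reason to structure the spatial correlation at the tract level where possible.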
As expected, the responses were pretty uniform across the
options! I was already planning on taking a sample of observations and building
the model in WinBUGS to begin with, but it was very helpful to hear about cases
where people have successfully used BUGS on more complex models and datasets,
and to be advised not to give up on BUGS completely before I even begin.
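One way to take that initial sample, given how unbalanced the data are, is to sample people instead of individual rows, so that each selected person keeps their full longitudinal history. The sketch below is only an illustration: the record layout (subject_id, year, tract_id), the function name, and the toy data are all hypothetical, and sampling by subject is an assumption about how the pilot subset might be drawn, not a description of what was actually done.

```python
import random
from collections import defaultdict


def sample_subjects(records, n_subjects, seed=0):
    """Return all records for a random subset of subjects.

    records: iterable of (subject_id, year, tract_id) tuples (hypothetical layout).
    """
    # Group records by subject so each trajectory stays intact.
    by_subject = defaultdict(list)
    for rec in records:
        by_subject[rec[0]].append(rec)

    rng = random.Random(seed)  # fixed seed makes the subset reproducible
    ids = sorted(by_subject)
    chosen = rng.sample(ids, min(n_subjects, len(ids)))
    return [rec for sid in chosen for rec in by_subject[sid]]


# Toy unbalanced data: subjects have between 1 and 4 observations.
records = [
    (1, 2001, "A"), (1, 2002, "A"), (1, 2003, "B"),
    (2, 2001, "B"),
    (3, 2002, "C"), (3, 2004, "C"),
    (4, 2001, "A"), (4, 2002, "A"), (4, 2003, "A"), (4, 2004, "A"),
]

subset = sample_subjects(records, n_subjects=2, seed=1)
print(len({rec[0] for rec in subset}), "subjects,", len(subset), "records")
```

The subset can then be exported for a pilot WinBUGS run. Sampling rows at random instead would break up trajectories and make the within-person correlation harder to estimate.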
In terms of speed, CPU and RAM requirements, and ease of
learning/use, there was no real consensus. One person would say that language A
is faster or less memory/CPU intensive than language B, and another person would
say the opposite. I think this comes down to personal taste/familiarity with
the language, and differences in the computers used.
BUGS, C and Fortran all got equal support (7 replies for
each), Matlab (which I had forgotten about) got 4, R alone got 9, and R with C got
6. The big winner here seems to be R and C in some combo.
So my final decision will be to see how WinBUGS behaves, then
branch out to R, and if I’m still having difficulties, try to include C. My
reasons are mainly due to familiarity with the languages and programs: I use
SAS for the main part of data management and analysis, R for graphics (but am working
on learning more analysis with it), and WinBUGS for any Bayesian modeling that
I come across. The main reason is that I really don’t -want- to learn another
language! I would rather become more proficient at R and BUGS. It was extremely
helpful to hear that there are plenty of user packages, support and success
stories out there.
Another suggestion that I found interesting to look into is cluster
computing. My only reservation is that if a model needs multiple days on a
cluster before you can see any results, those results become a little less
practical to use (if that makes any sense).
Again, I really appreciate the time you spent responding, and
value your responses. There was a lot more information regarding what people
have experienced than I can summarize. If anyone would like a full compilation
of responses, I will be happy to put it together for you.
-Robin
<=== Original Email ===>
I’m just starting out on my dissertation work,
which is going to involve a spatio-temporal analysis of a dataset that consists
of ~45,000 observations.
From what I’ve heard, experienced and read about
in the archives, this would basically kill BUGS.
So it may come down to having to write my own MCMC
sampler program in some language such as C, C++, Fortran or Java. I’ve
talked to three people so far today, and all three are adamant that their
choice of language is better than the others (of course they all chose
differently).
So I’m posing the same question to the folks out
there that have worked with some of these languages.
What would be the easiest language to learn, and what would
be the most CPU/memory efficient language for use on a non-top-of-the-line PC?
Robin Jeffries
Dr.P.H. Student
UCLA Department of Biostatistics
530-624-0428
http://sites.google.com/site/biostatjeffries/Home