It is the lack of compatibility between different versions mentioned by
Ethan that really put me off learning Python. In contrast, the FORTRAN-66
program SHELX76 still compiles and runs correctly with any modern Fortran
compiler. The only significant 'new' features that I now use are dynamic
array allocation (introduced in Fortran 90) and OpenMP support for
multiple CPUs, but even programs using OpenMP would still work with older
compilers, because the OpenMP directives would be treated as comments.

George

On 09/12/2012 08:28 PM, Ethan Merritt wrote:
> On Wednesday, September 12, 2012 09:52:09 am Jacob Keller wrote:
>>> For the specific purpose you list -
>>>   input from tab-delimited data
>>>   output to simple statistical summaries and (I assume) plots
>>> - it sounds like gnuplot could do the job nicely.
>>>
>> I wasn't aware that gnuplot can do calculations--can it? I was probably
>> going to use it somewhere as a plotting option.
>
> Here's a simple-minded example using a dump of the current contents
> of the PDB from www.pdb.org as a comma-separated file with ~65000 entries.
> The input file was previously filtered to contain only X-ray structures
> between 1 and 4 Angstroms resolution.
>
> gnuplot> !head -3 PDB.csv
> PDB ID,R Observed,R All,R Work,R Free,Refinement Resolution
> "100D","0.145","","0.145","","1.90"
> "101D","0.163","","","0.252","2.25"
>
> gnuplot> set datafile separator ","
> gnuplot> set datafile nofpe_trap   # trap handling greatly slows large data sets
> gnuplot> stats 'PDB.csv' using "R Observed" prefix "Robs"
>
> * FILE:
>   Records:      63029
>   Out of range:     0
>   Invalid:          0
>   Blank:            2
>   Data Blocks:      2
>
> * COLUMN:
>   Mean:         0.1982
>   Std Dev:      0.0334
>   Sum:      12494.6900
>   Sum Sq.:   2547.3068
>
>   Minimum:      0.0450 [24518]
>   Maximum:      0.9700 [45024]
>   Quartile:     0.1770
>   Median:       0.1970
>   Quartile:     0.2180
>
> gnuplot> print Robs_mean
> 0.198237160672072
>
> gnuplot> # calculate correlation of Robs with Resolution
> gnuplot> stats 'PDB.csv' using "R Observed":"Refinement Resolution" nooutput
> gnuplot> print STATS_correlation
> 0.595763711910418
>
> I've attached graphical output of the same data after some sorting,
> filtering, binning, etc., with output to a PDF file.
>
> You can do all this in R also. R has a larger collection of statistics
> options, but is not as good at dealing with really large data sets.
> IMHO gnuplot has more flexible options for graphical output.
>
>>> Otherwise I'd recommend perl, and dis-recommend python.
>>
>> Why are you dis-ing python? Seems everybody loves it...
>
> I'm sure you can google for many "reasons I hate Python" lists.
>
> Mine would start
> 1) sensitive to white space == fail
> 2) dynamic typing makes it nearly impossible to verify program correctness,
>    and very hard to debug problems that arise from unexpected input or
>    a mismatch between caller and callee.
> 3) the language developers don't care about backward compatibility;
>    it seems version 2.n+1 always breaks code written for version 2.n,
>    and let's not even talk about version 3
> 4) sloooow unless you use it simply as a wrapper for C++,
>    in which case why not just use C++ or C to begin with?
> 5) not thread-safe
>
> you did ask...
>
> Ethan

-- 
Prof. George M. Sheldrick FRS
Dept. Structural Chemistry,
University of Goettingen,
Tammannstr. 4,
D37077 Goettingen, Germany
Tel. +49-551-39-3021 or -3068
Fax. +49-551-39-22582