Hi all,
I have run into an issue that affects a number of CCP4 programs
(and my own code as well).
The problem
============
Programs that produce TLSOUT descriptions of TLS parameters write the
tensor records (T, L, S) using the equivalent of Fortran format (9F8.4).
Here are two examples:
TLS
RANGE 'A 209.' 'A 220.' ALL
ORIGIN 7.895 -62.178 -23.423
T 0.9518 0.3476 0.5619 -0.0034 0.2309 0.0373
L 18.9522 22.2045 0.6690-19.0722 -1.8277 2.8987
S -0.2024 -0.2744 0.5817 0.2971 -0.1614 -1.0850 -0.1876 0.0462
TLS
RANGE 'B 15.' 'B 21.' ALL
ORIGIN 13.302 -6.004 38.582
T 0.0184 0.0726 0.0102 0.0294 -0.0001 0.0303
L 10.9899108.7779 8.3249 28.9340 -8.6452-15.7119
S -0.5438 0.6025 -0.5212 0.5483 3.1784 1.6311 -0.1214 -0.1901
You see the problem: F8.4 leaves only three columns for the sign and the
integer part. If any element of the T, L, or S tensors is 100 or larger, or
-10 or smaller, it fills all eight columns and runs into the number before it.
This affects several hundred files in the current PDB, including some
from my lab, e.g. 3BJE and 3I7F, which I used as examples above.
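The run-together fields are easy to reproduce. The sketch below is only an illustration: it uses Python's `8.4f` format specifier as a stand-in for Fortran F8.4, and the function name `write_f8_4` is made up for this example. It writes the L tensor of the second group above with no separator between fields.

```python
def write_f8_4(values):
    """Concatenate values as (9F8.4) would: 8 columns each, no separator."""
    return "".join(f"{v:8.4f}" for v in values)

# L tensor of the second example group above
l_tensor = [10.9899, 108.7779, 8.3249, 28.9340, -8.6452, -15.7119]

record = write_f8_4(l_tensor)
print(repr(record))

# A whitespace-delimited reader sees fewer tokens than there are values.
tokens = record.split()
print(len(l_tensor), "values written,", len(tokens), "tokens found")
print(tokens)
```

Six values go in, but only four whitespace-separated tokens come out, two of which are not valid numbers.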
If a program reads this in using a corresponding Fortran fixed format, OK.
But the current versions of TLSANL and REFMAC5 don't do this;
they instead use a home-grown free format input routine.
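The difference between the two reading strategies can be sketched as follows. This is not the code used by TLSANL or REFMAC5; `read_fixed` and `read_free` are hypothetical helpers, and the record is assumed to have had its leading keyword already stripped so that the numbers start in column 1.

```python
def read_fixed(numeric_part, width=8):
    """Slice the record into fixed-width columns, like a Fortran F8.4 read."""
    return [float(numeric_part[i:i + width])
            for i in range(0, len(numeric_part), width)]

def read_free(numeric_part):
    """Whitespace-delimited read; raises ValueError on a fused token."""
    return [float(tok) for tok in numeric_part.split()]

# Numeric part of an L record written with 8 columns per value (assumed layout)
record = " 10.9899108.7779  8.3249 28.9340 -8.6452-15.7119"

print(read_fixed(record))          # all six values recovered

try:
    read_free(record)
except ValueError as err:
    print("free-format read failed:", err)
```

The fixed-width reader never looks for blanks, so it is unaffected; the free-format reader has nothing to split on where two fields touch.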
TLSANL
============
When TLSANL hits one of these files, it exits with the message
****** INVALID CHARACTER "." AT POSITION 16.
*** FORMAT ERROR ON RECORD:
L 10.9899108.7779 8.3249 28.9340 -8.6452-15.7119
*** L OR S SPECIFICATION MISSING FOR TLS GROUP 12 12
This is annoying, but at least it's obvious that something went wrong.
Refmac
=============
When Refmac hits one of these, the result is more insidious.
Instead of exiting with an error message, it prints a small warning,
stores a mangled set of values, and continues.
Here is a snippet from the log file from refinement of 3I7F
(Note the warning after the L record.)
Data line--- TLS
Data line--- RANGE 'A 209.' 'A 220.' ALL
Data line--- ORIGIN 7.895 -62.178 -23.423
Data line--- T 0.9518 0.3476 0.5619 -0.0034 0.2309 0.0373
Data line--- L 18.9522 22.2045 0.6690-19.0722 -1.8277 2.8987
*** Warning
Illegal number in field 4
Data line--- S -0.2024 -0.2744 0.5817 0.2971 -0.1614 -1.0850 -0.1876 0.0462
[snip]
Initial TLS parameters
TLS group 3:
T tensor ( 3) = 0.952 0.348 0.562 -0.003 0.231 0.037
L tensor ( 3) = 18.952 22.204 0.000 -1.828 2.899 0.000
S tensor ( 3) = -0.202 -0.274 0.582 0.297 -0.161 -1.085 -0.188 0.046
So the input tensor L
L 18.9522 22.2045 0.6690-19.0722 -1.8277 2.8987
has become
L 18.952 22.204 0.000 -1.828 2.899 0.000
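The run-together field has been replaced by 0.000, the values after it have
shifted one slot to the left, and the last element has been zero-filled.
This perturbs the refinement, sometimes fatally.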
In fact, I think this is the reason I have had problems with TLS refinement
in recent refmac versions. Every time you recycle the TLSOUT as the TLSIN for
the next refinement round, you risk kicking one or more of the TLS group
descriptions out into the next county.
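Until the writing programs are changed, an existing file can be checked and repaired before it is reused as TLSIN. The sketch below is one possible approach, not part of any CCP4 program. It assumes that every tensor element carries exactly four decimals, as the (9F8.4) format implies, so a number always ends four digits after its decimal point. The names `repair_record`, `NUMBER`, and `EXPECTED` are invented for this example, and the value counts per record are taken from the examples shown earlier.

```python
import re

# With exactly four decimals per value, each number ends four digits
# after its decimal point, even when no blank follows it.
NUMBER = re.compile(r"-?\d+\.\d{4}")

# Number of values per record, as in the example records above.
EXPECTED = {"T": 6, "L": 6, "S": 8}

def repair_record(line):
    """Return a T/L/S record with blank-separated values; pass other lines through."""
    line = line.rstrip("\n")
    fields = line.split(None, 1)
    if len(fields) < 2 or fields[0] not in EXPECTED:
        return line
    key, rest = fields
    values = NUMBER.findall(rest)
    if len(values) != EXPECTED[key]:
        raise ValueError(f"cannot recover {EXPECTED[key]} values from: {line!r}")
    return key + "".join(f" {v:>9}" for v in values)

print(repair_record("L 18.9522 22.2045 0.6690-19.0722 -1.8277 2.8987"))
print(repair_record("ORIGIN 7.895 -62.178 -23.423"))
```

Records that cannot be split into the expected number of values raise an error instead of being silently altered, which is the opposite of the behaviour described above.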
TLSMD
========
The TLSIN files produced by the TLSMD server can trigger the same problem.
This may explain why some people have reported problems when using the pair of
files (XYZIN, TLSIN) returned by the server for refmac refinement.
Possible solutions
======================
The obvious fix is to change all programs that create a TLSOUT file to
guarantee that the tensor elements are separated by whitespace.
At a minimum, this includes
tlsmd tlsextract (from my lab, I've already changed our in-house copies)
refmac5
anisoanl
The new format could either be the equivalent of Fortran
(9(1X,F8.4)) or (9(1X,F7.3))
The first of these preserves the precision, but would break any program that
uses a fixed format input statement describing the current format.
The second is backward compatible with existing fixed format input programs,
but loses one decimal of precision.
Both TLSANL and REFMAC are happy if you add the extra whitespace, but I don't
know what other programs out there might break because they use fixed format input.
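The trade-off between the two options can be sketched in a few lines. The assumption here is that the backward-compatible variant keeps each field eight columns wide in total (one blank plus a seven-column number with three decimals), while the full-precision variant grows to nine columns. `write_separated` is a hypothetical helper, and Python format specifiers stand in for the Fortran edit descriptors.

```python
def write_separated(values, width, decimals):
    """One blank, then the number right-justified in `width` columns."""
    return "".join(f" {v:{width}.{decimals}f}" for v in values)

l_tensor = [10.9899, 108.7779, 8.3249, 28.9340, -8.6452, -15.7119]

wide = write_separated(l_tensor, 8, 4)     # 9 columns per value, 4 decimals
narrow = write_separated(l_tensor, 7, 3)   # 8 columns per value, 3 decimals

# Both variants are safe for a whitespace-delimited reader.
print(wide.split())
print(narrow.split())

# Only the 8-column variant still lines up with an 8-column fixed read.
fixed = [float(narrow[i:i + 8]) for i in range(0, len(narrow), 8)]
print(fixed)
```

A Fortran F8.4 read of the narrower fields should still give the right values, because a decimal point present in the input field overrides the decimal count in the edit descriptor.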
Please discuss. I want to modify the TLSMD server output accordingly.
cheers,
Ethan
--
Ethan A Merritt
Biomolecular Structure Center
University of Washington, Seattle 98195-7742