Hello Everyone,
Thanks for all the help. The key to finding the problem was following up on Tim Gruene's suggestion to compare the data sets directly. It appears that an error occurred during conversion from I to F; until I find the log file for the conversion, I can't say what was done.
Longer version:
When I compared the "good" and "bad" data sets directly, the R between them was about 0.15, instead of the 0.07 I was expecting.
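For anyone who wants to repeat this kind of comparison by hand, here is a minimal sketch of an R between two sets of amplitudes. It is not the calculation the scaling programs perform: it assumes the two lists are already matched reflection by reflection, applies a single overall least-squares scale factor, and the function name scaled_r_factor is made up for illustration.

```python
def scaled_r_factor(f_ref, f_other):
    """Return sum|F_ref - k*F_other| / sum(F_ref).

    k is the single least-squares scale factor that best maps f_other
    onto f_ref. Both lists must hold amplitudes for the same
    reflections, in the same order (an assumption of this sketch).
    """
    if len(f_ref) != len(f_other) or not f_ref:
        raise ValueError("need two non-empty lists of equal length")
    # Overall scale minimising sum (F_ref - k*F_other)^2
    k = sum(a * b for a, b in zip(f_ref, f_other)) / sum(b * b for b in f_other)
    numerator = sum(abs(a - k * b) for a, b in zip(f_ref, f_other))
    return numerator / sum(f_ref)


# Two data sets that differ only by a scale factor agree perfectly.
print(scaled_r_factor([1.0, 2.0, 3.0], [2.0, 4.0, 6.0]))
```

A real comparison would also need resolution-dependent scaling and proper handling of missing reflections, which this sketch leaves out.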
Yesterday, I reintegrated the images using the same program that generated the "bad" data (CrystalClear; sorry to be opaque, but I didn't want to inspire a lot of discussion about various integration programs when I was pretty sure the program wasn't at fault) and ended up with a data set that agreed with the "good" data (XDS). (Yeah, I should've done this before sending a message to ccp4bb.) The R for scaling the new CrystalClear data set against the XDS data set was 0.07, and refinement behaved as expected and agreed with that of the XDS data.
I have been unable to find the log file for the conversion from integrated I to mtz F (it's on some computer somewhere, I'm sure), but I did find the original ScalAveraged.ref file for the "bad" data and reimported it using the import scaled data task in ccp4i. That data set is also good. So, I conclude that something went wrong during the import into ccp4. Tim suggested that the data may have been converted to amplitudes twice; perhaps that's it. Anyway, now I know where the problem arose.
Several people suggested checking statistics using phenix polygon and other analysis tools in phenix. I agree that those are nice tools (and we had done that). However, they only tell you how your statistics differ from the median, and they often give no hint as to how a problem might have arisen.
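To illustrate why a double conversion to amplitudes would show up when two data sets are compared directly, here is a toy sketch. Everything in it is an assumption: the intensities are made-up numbers, the conversion is a plain square root (real conversion programs treat weak and negative intensities statistically), and the helper r_factor is hypothetical. It does not reproduce the 0.15 seen here and is not a claim about what actually happened during the import.

```python
import math


def r_factor(f_ref, f_other):
    """sum|F_ref - k*F_other| / sum(F_ref), with one least-squares scale k."""
    k = sum(a * b for a, b in zip(f_ref, f_other)) / sum(b * b for b in f_other)
    return sum(abs(a - k * b) for a, b in zip(f_ref, f_other)) / sum(f_ref)


# Made-up intensities, purely for illustration.
intensities = [25.0, 100.0, 400.0, 900.0, 2500.0]

# Correct path: convert I to F once (simplified here to F = sqrt(I)).
f_once = [math.sqrt(i) for i in intensities]

# Faulty path: the amplitudes are mistaken for intensities and converted again.
f_twice = [math.sqrt(f) for f in f_once]

print("R, correct vs correct:         ", r_factor(f_once, f_once))
print("R, correct vs double-converted:", r_factor(f_once, f_twice))
```

Because the second square root compresses strong and weak reflections toward each other, no single scale factor can bring the two sets back into agreement, so the R between them stays well above zero.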
Again, thanks for all the help.
Sue
On Jun 26, 2013, at 8:54 AM, Tim Gruene wrote:
> -----BEGIN PGP SIGNED MESSAGE-----
> Hash: SHA1
>
> Dear Sue,
>
> if you made your rmsd (bonds) 20-30 times smaller I would agree they
> were not too loose. 0.14A is pretty high. So two suggestions:
> a) check the molprobity report of your PDB if its geometry is sane
> b) check the CC plot of one data set against the other one to check if
> the problem is due to two different data or due to the PDB file (xprep
> can do this plot conveniently).
>
> Did you check if you converted the data twice to amplitudes, or maybe
> not at all?
>
> Best,
> Tim
>
> On 06/26/2013 05:44 PM, Roberts, Sue A - (suer) wrote:
>> Hello Everyone
>>
>> I have two data sets, from the same crystal form (space group P32)
>> of the same protein, collected at 100 K at SSRL, about 2.2 A
>> resolution, that are refining to R = 0.14, Rf = 0.26 (refmac/TLS).
>> This is a molecular replacement solution, from a model with about
>> 40% homology. After MR, density was apparent for some missing or
>> misbuilt residues, so I don't think the structure is stuck in the
>> wrong place. The Fo-Fc map is essentially featureless. The 2Fo-Fc
>> map doesn't look as good as it should - for instance, there are
>> very few water molecules to be found. The data reduction
>> statistics look OK, the resolution cutoff is pretty conservative.
>> There is one molecule in the asymmetric unit, so no NCS. There is
>> no twinning either.
>>
>> It seemed to me that the R is too low, not Rf too high. More
>> normally, R ends up about .18 - .20 for a data set at this
>> resolution.
>>
>> I reprocessed the images with a different data processing program
>> and redid the MR. The data reduction statistics look similar, the
>> resolution is the same, but now the structure refines to R = 0.20,
>> Rf = 0.24 (same free R set of reflections chosen, still
>> refmac/TLS.) The maps look more normal. Further rebuilding took us
>> to R = 0.18, Rf = 0.22
>>
>> So, the question I have (and that I've been asked by the student
>> and PI) is: What was the problem with the original data set?
>> What should I be looking for in the data reduction log files, for
>> instance, or in the refinement log? The large R - free R spread
>> is characteristic of overfitting, but the geometry is not too
>> loose (rmsd bonds = 0.14), there are plenty of reflections (both
>> working and free).
>>
>> Can anyone point me toward a reason R would be low?
>>
>> Thanks
>>
>> Sue
>>
>>
>> Dr. Sue A. Roberts Dept. of Chemistry and Biochemistry University
>> of Arizona 1041 E. Lowell St., Tucson, AZ 85721 Phone: 520 621
>> 8171 or 520 621 4168
>> http://www.cbc.arizona.edu/xray or
>> http://www.cbc.arizona.edu/facilities/x-ray_diffraction
>>
>>
>
> - --
> - --
> Dr Tim Gruene
> Institut fuer anorganische Chemie
> Tammannstr. 4
> D-37077 Goettingen
>
> GPG Key ID = A46BEE1A
>
> -----BEGIN PGP SIGNATURE-----
> Version: GnuPG v1.4.12 (GNU/Linux)
> Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org/
>
> iD8DBQFRyw6vUxlJ7aRr7hoRAq4HAKCJJf+FfRVT7u3UOrty0vTOFMN+mgCgtHz8
> MYe+23hH+MKy/7E/h2w25+Q=
> =WAsD
> -----END PGP SIGNATURE-----
Dr. Sue A. Roberts
Dept. of Chemistry and Biochemistry
University of Arizona
1041 E. Lowell St., Tucson, AZ 85721
Phone: 520 621 8171 or 520 621 4168
http://www.cbc.arizona.edu/xray or
http://www.cbc.arizona.edu/facilities/x-ray_diffraction