Ok, I'll bite.
"I dare anyone who considers themself an expert macromolecular crystallographer to find a way to build out of this map."
I put emphasis on "this map".
"Short of actually cheating (see below), there doesn't seem to be any automated way to arrive at a solved structure from these phases"
I put emphasis on "these phases".
I think the real challenge (and one that makes for an excellent macromolecular crystallographer) is how well one can interpret a map with poor phases.
That being said, I think a recalculation of the map using any other information besides the map itself should not be allowed.
PS. I'd like to see what the pre-DM phases look like. There's a huge chunk of the protein that is completely flattened out in impossible.mtz.
F
On Jan 12, 2013, at 1:50 PM, James Holton wrote:
>
> Whoops! Sorry folks. I made a mistake with the I(+)/I(-) entries. They had the wrong axis convention relative to 3dko and the F in the same file. Sorry about that.
>
> The files on the website now should be right.
> http://bl831.als.lbl.gov/~jamesh/challenge/possible.mtz
> http://bl831.als.lbl.gov/~jamesh/challenge/impossible.mtz
>
> md5 sums:
> c4bdb32a08c884884229e8080228d166 impossible.mtz
> caf05437132841b595be1c0dc1151123 possible.mtz
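For anyone who downloaded the files before the fix, here is a minimal sketch of checking local copies against the md5 sums quoted above. It assumes the two files sit in the current directory under the names given; the function names md5_of_file and check_files are just illustrative, not part of any crystallography package.

```python
import hashlib
from pathlib import Path

# md5 sums copied from the message above
EXPECTED = {
    "impossible.mtz": "c4bdb32a08c884884229e8080228d166",
    "possible.mtz": "caf05437132841b595be1c0dc1151123",
}


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex md5 digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def check_files(expected, directory="."):
    """Map each file name to True (match), False (mismatch) or None (not found)."""
    results = {}
    for name, want in expected.items():
        path = Path(directory) / name
        if path.is_file():
            results[name] = md5_of_file(path) == want.lower()
        else:
            results[name] = None
    return results


if __name__ == "__main__":
    labels = {True: "OK", False: "MISMATCH", None: "not found"}
    for name, ok in check_files(EXPECTED).items():
        print(f"{name}: {labels[ok]}")
```

A MISMATCH most likely means the copy predates the corrected I(+)/I(-) columns and should be downloaded again.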
>
> -James Holton
> MAD Scientist
>
> On 1/12/2013 8:25 AM, James Holton wrote:
>>
>> Fair enough!
>>
>> I have just now added DANO and I(+)/I(-) to the files. I'll be very interested to see what you can come up with! For the record, the phases therein came from running mlphare with default parameters but exactly the correct heavy-atom constellation (all the sulfur atoms in 3dko), and then running dm with default parameters.
>>
>> Yes, there are other ways to run mlphare and dm that give better phases, but I was only able to determine those parameters by "cheating" (comparing the resulting map to the right answer), so I don't think it is "fair" to use those maps.
>>
>> I have had a few questions about what is "cheating" and what is not cheating. I don't have a problem with the use of sequence information because that actually is something that you realistically would know about your protein when you sat down to collect data. The sequence of this molecule is that of 3dko:
>> http://bl831.als.lbl.gov/~jamesh/challenge/seq.pir
>>
>> I also don't have a problem with anyone actually using an automation program to _help_ them solve the "impossible" dataset as long as they can explain what they did. Simply putting the above sequence into BALBES would, of course, be cheating! I suppose one could try eliminating 3dko and its "homologs" from the BALBES search, but that, in and of itself, is perhaps relevant to the challenge: "what is the most distant homolog that still allows you to solve the structure?". That, I think, is also a stringent test of model-building skill.
>>
>> I have already tried ARP/wARP, phenix.autobuild and buccaneer/refmac. With default parameters, all of these programs fail on both the "possible" and "impossible" datasets. It was only with some substantial tweaking that I found a way to get phenix.autobuild to crack the "possible" dataset (using 20 models in parallel). I have not yet found a way to get any automation program to build its way out of the "impossible" dataset. Personally, I think that the breakthrough might be something like what Tom Terwilliger mentioned. If you build a good enough starting set of atoms, then I think an automation program should be able to take you the rest of the way. If that is the case, then it means people like Tom who develop such programs for us might be able to use that insight to improve the software, and that is something that will benefit all of us.
>>
>> Or, it is entirely possible that I'm just not running the current software properly! If so, I'd love it if someone who knows better (such as their developers) could enlighten me.
>>
>> -James Holton
>> MAD Scientist
>>
>> On 1/12/2013 3:07 AM, Pavol Skubak wrote:
>>>
>>> Dear James,
>>>
>>> your challenge in its current form ignores an important source
>>> of information for model building that is available for your
>>> simulated data - namely, it does not allow the use of anomalous
>>> phase information in the model building. In difficult cases on
>>> the edge of success such as this one, this typically makes
>>> the difference between building and not building.
>>>
>>> If you can make the F+/F- and Se substructure available, we
>>> can test whether this is the case indeed. However, while I
>>> expect this would push the challenge further significantly,
>>> most likely you would be able to decrease the Se incorporation
>>> of your simulated data further to such levels that the anomalous
>>> signal is again no longer sufficient to build the structure. And
>>> most likely, there would again exist an edge where a small
>>> decrease in the Se incorporation would lead from a model built
>>> to no model built.
>>>
>>> Best regards,
>>>
>>> --
>>> Pavol Skubak
>>> Biophysical Structural Chemistry
>>> Gorlaeus Laboratories
>>> Einsteinweg 55
>>> Leiden University
>>> LEIDEN 2333CC
>>> the Netherlands
>>> tel: 0031715274414
>>> web: http://bsc.lic.leidenuniv.nl/people/skubak-0
>>
>