It was brought to my attention that the link to the preprint I provided
below doesn't work, but this one does:
https://www.biorxiv.org/content/early/2018/08/18/394965
Thanks to Folmer Fredslund for pointing this out to me!
-James Holton
MAD Scientist
On 9/21/2018 3:50 PM, James Holton wrote:
> For teaching purposes I have found that controlled pairs of data sets
> are most instructive. You are right that an easy one-button-push
> processing run tells you nothing, but neither does a
> bang-it-crashed-now-what data set. Most useful are two data sets that
> are identical in every respect but one, and that one thing is the
> point you are trying to get across. It's hard to collect such
> perfectly paired data sets, so I ended up just simulating them. I
> deliberately chose a high-symmetry space group to keep the download
> size small. You can download them from here:
>
> http://bl831.als.lbl.gov/~jamesh/workshop/
>
> These five datasets represent the four biggest problems I see users
> have when trying to solve structures: 1) poor anomalous signal, 2)
> overlaps from a bad crystal orientation, 3) hidden radiation damage to
> sites, and 4) ice rings. The 5th "goodsignal" dataset is the positive
> control.
>
> The web page contains everything from images to processed MTZ files,
> maps and the "right answer" in pdb and mtz format. A slightly more
> "realistic" version with a bigger download size is here:
>
> http://bl831.als.lbl.gov/~jamesh/workshop2/
>
> This is the one I used for my "weak anomalous challenge" a few years
> back. The teaching advantage is that you can use the image-mixer
> script to modulate the severity of problems like ice rings and
> anomalous signal. If you make a competition of it, people tend to get
> more interested.
>
> When it comes to beam centers, it is not all that hard to take a data
> set with a "correct" beam center and just edit the headers. How you do
> this depends on the file format, but I have some instructions for
> editing images in general here:
>
> http://bl831.als.lbl.gov/~jamesh/bin_stuff/
>
> In general, you can usually separate the header from the data with the
> unix command "head" or "dd", edit the header with your favorite text
> editor, and then put the two parts back together with "cat". As for
> which beam center is "correct", it is important to tell your students
> that that depends on which software you are using. I wrote all this
> down in the last paragraph on page 7 of this doc:
>
> https://submit.biorxiv.org/submission/pdf?msid=BIORXIV/2018/394965
>
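The split / edit / rejoin recipe quoted above ("head" or "dd", a text editor, then "cat") can be sketched in Python. This is only an illustration under stated assumptions, not the method from the linked instructions: it assumes an SMV-style image (an ASCII header whose length is declared by a HEADER_BYTES entry, followed by binary pixel data) and assumes the beam center is stored under the keys BEAM_CENTER_X and BEAM_CENTER_Y. The function name set_beam_center is made up for this example. Other file formats will need a different approach, and the units and axis convention of the values still depend on the processing software.

```python
import re


def set_beam_center(image_bytes, x, y):
    """Return a copy of an SMV-style image with new beam-center header values.

    Assumes (for illustration) an ASCII header containing HEADER_BYTES,
    BEAM_CENTER_X and BEAM_CENTER_Y entries, followed by binary pixel data.
    """
    # The header declares its own length, e.g. "HEADER_BYTES=  512;".
    match = re.search(rb"HEADER_BYTES=\s*(\d+);", image_bytes[:4096])
    if match is None:
        raise ValueError("no HEADER_BYTES entry found; not an SMV-style image?")
    header_len = int(match.group(1))

    # Equivalent of "head -c" / "dd": separate the header from the pixel data.
    header = image_bytes[:header_len].decode("ascii")
    pixels = image_bytes[header_len:]

    # Equivalent of the text-editor step: rewrite the two beam-center entries.
    for key, value in (("BEAM_CENTER_X", x), ("BEAM_CENTER_Y", y)):
        header, count = re.subn(rf"\b{key}=[^;]*;", f"{key}={value:.3f};", header)
        if count != 1:
            raise ValueError(f"expected exactly one {key} entry, found {count}")

    # The header must keep its declared length, otherwise the pixel data shifts.
    header = header.rstrip(" ")
    if len(header) > header_len:
        raise ValueError("edited header no longer fits in HEADER_BYTES")
    header = header.ljust(header_len)

    # Equivalent of "cat": put the two parts back together.
    return header.encode("ascii") + pixels
```

To use it, read the original image with open(path, "rb").read(), pass the bytes through set_beam_center, and write the result to a new file so the original stays untouched. Keeping the header at its declared length is the one detail that is easy to get wrong when doing this by hand.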
> This doc also describes another simulated data set that demonstrates
> the challenges of combining lots of short wedges together. That may be
> too advanced a topic for your students, or maybe not. As you can
> guess, I'm experimenting with bioRxiv. So far, no comments.
>
> Good luck with your class!
>
> -James Holton
> MAD Scientist
>
>
> On 9/19/2018 5:15 PM, Whitley, Matthew J wrote:
>> Dear colleagues,
>>
>> For teaching purposes, I am looking for a small number (< 5) of
>> macromolecular diffraction datasets (raw images) that might be
>> considered 'difficult' for a beginning crystallography student to
>> process. By 'difficult' I generally mean not able to be processed
>> automatically by a common processing package (XDS, Mosflm, DIALS, etc)
>> using default settings, i.e., no black box "click and done" processing.
>> The datasets I am looking for would have some stumbling block such as
>> incorrect experimental parameters recorded in the image headers,
>> multiple lattices that cause indexing to fail, datasets for which
>> determining the correct space group is tricky, datasets for experiments
>> in which the crystal slipped or moved in the beam, or anything else you
>> can think of. The idea is for these beginning students to examine
>> several datasets that highlight various phenomena that can lead one
>> astray during processing.
>>
>> A good candidate dataset would also ideally comprise a modest number of
>> images so as to keep integration time to a minimum. Factors that are
>> mostly irrelevant for my purpose: resolution (as long as better than
>> ~3.5 Å), source (home vs synchrotron), presence/absence of anomalous
>> scattering, presence/absence of ligands, monomeric vs oligomeric
>> structures, etc. Also, to be clear, I am not looking for datasets that
>> have so many pathologies that they would require many long hours of work
>> for an expert to process correctly.
>>
>> I have checked public repositories such as proteindiffraction.org and
>> SBGrid databank, but all of the datasets I acquired from these sources
>> process satisfactorily with little effort, and in any event I know of no
>> way to search for 'challenging' datasets. (I also wonder whether
>> anybody is in the habit of depositing, shall we say, less-than-pristine
>> images to public repositories?)
>>
>> If you know of such a dataset that is already publicly available, or if
>> you have such a dataset that you are willing to share for solely
>> educational purposes, I would appreciate hearing from you, either on- or
>> off-list.
>>
>> Thank you in advance for your suggestions.
>>
>> Matthew
>>
>
> ########################################################################
>
> To unsubscribe from the CCP4BB list, click the following link:
> https://www.jiscmail.ac.uk/cgi-bin/webadmin?SUBED1=CCP4BB&A=1