Great! Next scripting question that I’m puzzling over - how to take a set of sequences (present in a single fasta file), in arbitrary order, match and align them to their corresponding homologous subunits in a PDB file, and use align_and_mutate to correct the sequence of each subunit (and delete the gaps).
Right now, I’m stuck on the first step - how to take a single sequence from a fasta file and output the matching chain ID.
I can use:
import subprocess
align_to_closest_chain("%s"%(subprocess.check_output(["tail","+2","test.fasta"])).upper(), 0.95)
to check for a matching chain, but the output of align_to_closest_chain() is either 0 or 1.
The matching chain id is present in the console output, but I haven’t yet figured out how to parse that and do anything useful with it - the console output does not seem be easily accessible via either stdout or stderr (was hoping I could redirect the console output to a file and grep it for the chain id - ugly but I don’t know a better way).
Also it is not immediately clear to me how the match_fraction parameter relates to sequence identity or alignment score for a given match (The alignment also seems to be case sensitive, which doesn’t seem right, hence the .upper() in the above).
What am i doing wrong - thoughts/suggestions? Would it be better to do the sequence handling steps externally?
Oliver.
On Dec 30, 2013, at 2:43 PM, Paul Emsley <[log in to unmask]> wrote:
> On 30/12/13 17:42, Oliver Clarke wrote:
>> And here is a more compact/elegant version courtesy of Paul that accomplishes a similar function in half the code - morphing with a radius decreasing from 12 to 6 Å over an arbitrary number of cycles, using a map initially blurred with a B-factor of 100, decreasing incrementally to 0 (the original map) over the course of the procedure:
>>
>
> I did some experimenting and I have been won over to the idea of an initial RBR (into a blurred map), so adding that and fixing the typos, reso_morph becomes:
>
>
> def reso_morph(imol, imol_map, n_rounds):
>
> turn_off_backup(imol)
> set_refinement_immediate_replacement(1)
>
> sharpen(imol_map, 150)
> for ch_id in chain_ids(imol):
> if is_polymer_chain(imol, ch_id):
> rigid_body_refine_by_atom_selection(imol, '//'+ch_id)
> accept_regularizement()
>
> for round in range(n_rounds):
> f = float(round)/float(n_rounds)
> for ch_id in chain_ids(imol):
> if is_polymer_chain(imol, ch_id):
>
> # play with these numbers
> radius = 8 * (2 - f) - 3
> sf = 120 * (1 - f)
>
> sharpen(imol_map, sf)
> morph_fit_chain(imol, ch_id, radius)
>
>
|