Grant,

Python can do this with string slicing: rather than splitting on whitespace, you can pull out part of a text string by character position.

For example:

mystring='abcdefg'
mystring[0:3]

will return...
'abc'

and...

mystring[3:5]

will return 'de' (the end index itself is not included), etc.

The PDB format is fixed-width: each field always occupies the same character columns. So you can use this approach to get the values you want even if there is no whitespace between columns.
http://deposit.rcsb.org/adit/docs/pdb_atom_format.html
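
As a rough sketch of how that could look for your two example lines: the function name below is just something I made up, and the slice positions assume the standard ATOM record layout from the format description (occupancy in columns 55-60, B-factor in columns 61-66). Python counts from 0, so those become [54:60] and [60:66].

```python
# Hypothetical helper, not part of any library.
# Assumes the standard fixed-width ATOM/HETATM record layout.
def occupancy_and_bfactor(line):
    occupancy = float(line[54:60])  # PDB columns 55-60
    bfactor = float(line[60:66])    # PDB columns 61-66
    return occupancy, bfactor

line = "ATOM    608  SG  CYS A  47      12.866 -28.741  -1.611  1.00201.10           S"

# Check the start of the record rather than using '"ATOM" in line',
# which would also match any other line that happens to contain "ATOM".
if line.startswith(("ATOM", "HETATM")):
    occ, b = occupancy_and_bfactor(line)
    print(occ, b)
```

float() ignores the leading spaces in each field, but it will raise a ValueError on a truncated line where the field is empty, so you may want to guard against short lines in real files.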

Also, for more info on splitting text, and on Python in general, see this site...
http://www.alan-g.me.uk/l2p/index.htm


Best wishes,

- Allister


On 6 Jun 2013, at 05:37, GRANT MILLS wrote:

Dear CCP4BB,

I'm trying to write a simple python script to retrieve and manipulate PDB data using the following code:

#for line in open("PDBfile.pdb"):
#    if "ATOM" in line:
#        column=line.split()
#        c4=column[4]

and then writing to a new document with:

#with open("selection.pdb", "a") as myfile:
#        myfile.write(c4+"\n")

This works, except when the PDB file contains columns that run together, such as the occupancy and B-factor in the following:

ATOM    608  SG  CYS A  47      12.866 -28.741  -1.611  1.00201.10           S  
ATOM    609  OXT CYS A  47      14.622 -24.151  -1.842  1.00100.24           O 

My script seems to miscount the columns and read the two as one column. Does anyone know how to avoid this? (PS, I've googled this like crazy but I either don't understand or the link is irrelevant)

Any advice would help.
Thanks for your time,
Grant