Grant,

Python can do this sort of thing with string slicing - you can split a text string by character position rather than by whitespace.

For example:

mystring='abcdefg'

mystring[0:3]

will return...

'abc'

and...

mystring[3:5]

will return 'de' etc. (the character at the end index is not included).

The PDB format is fixed-width - each field always occupies the same character columns - so you can use this approach to get the values you want even if there is no whitespace between columns.

http://deposit.rcsb.org/adit/docs/pdb_atom_format.html
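
As a rough sketch of how that could look for an ATOM record: the function name parse_atom_line and the sample line built below are only illustrations, and the slice positions are my reading of the column table at the link above, so do check them against it.

```python
def parse_atom_line(line):
    """Pull fields out of one ATOM record by character position.

    Slice indices are 0-based with an exclusive end, so PDB columns
    55-60 (occupancy) become line[54:60], and so on.
    """
    return {
        "serial": int(line[6:11]),
        "name": line[12:16].strip(),
        "resname": line[17:20].strip(),
        "chain": line[21],
        "resseq": int(line[22:26]),
        "x": float(line[30:38]),
        "y": float(line[38:46]),
        "z": float(line[46:54]),
        "occupancy": float(line[54:60]),
        "bfactor": float(line[60:66]),
    }


# Build a sample record with explicit field widths so the columns line up.
# The values are taken from the first ATOM line in the question.
example = "ATOM  %5d %-4s %3s %1s%4d    %8.3f%8.3f%8.3f%6.2f%6.2f" % (
    608, " SG", "CYS", "A", 47, 12.866, -28.741, -1.611, 1.00, 201.10)

atom = parse_atom_line(example)
print(atom["occupancy"], atom["bfactor"])

# Splitting on whitespace, by contrast, glues the last two fields together.
print(example.split()[-1])
```

In your loop you could call something like this on each line that starts with "ATOM" and write out whichever field you need.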

Also, for more info on splitting text - and Python in general - see this site...

http://www.alan-g.me.uk/l2p/index.htm

Best wishes,

- Allister

On 6 Jun 2013, at 05:37, GRANT MILLS wrote:

Dear CCP4BB,

I'm trying to write a simple python script to retrieve and manipulate PDB data using the following code:

#for line in open("PDBfile.pdb"):
#    if "ATOM" in line:
#        column=line.split()
#        c4=column[4]

and then writing to a new document with:

#with open("selection.pdb", "a") as myfile:
#        myfile.write(c4+"\n")

This works except when the PDB file contains columns which run together, such as the occupancy and B-factor in the following:

ATOM    608 SG CYS A 47      12.866 -28.741 -1.611 1.00201.10           S
ATOM    609 OXT CYS A 47      14.622 -24.151 -1.842 1.00100.24           O

My script seems to miscount the columns and read the two as one column. Does anyone know how to avoid this? (PS, I've googled this like crazy but either I don't understand the results or the links are irrelevant.)

Any advice would help.
Thanks for your time,
Grant