Hi Leigh,
First of all, thanks for trying out the Data Model and the Format
Converter. Concerning the following:
> I am currently writing a program which reads in CNS distance restraints
> and does some processing on them. From the
> cpp.python.cns.distanceConstraintsIO docs, I can see the
> distanceConstraints class and its methods, and I have also looked at the
> python code for distanceConstraintsIO.py. I can create an instance of
> CnsDistanceCOnstraintFile, and use it to read in my data.
>
> What I cannot find is a definition of the attributes used for a cns
> distance constraint. If I look at the code (distanceConstraintsIO.py), I
> can see that there are attributes called targetDist, minusDist, etc. But
> if i look at the data model cpp.app.nmr.abstractConstraint, there are no
> attributes called targetDist etc. the same goes for the data model
> cpp.app.nmr.distanceconstraint. Where are the data model definitions that
> formatConverter is using? Or should I not be using the formatConverter
> routines at all, and just using the cpp.app.nmr.distanceconstraint data
> model and writing my own routines to do input and output?
The problem you are having arises because the Format Converter itself
works in two layers:
- A parser layer (in ccp.format). This layer parses the CNS, NmrView, ...
files and puts the data in temporary classes, or writes out the data from
temporary classes to a file.
- A generic conversion layer (in ccpnmr.format.converters). Here classes
are defined to import the data from the parser layer into the Data Model,
and to export Data Model information to the parser layer classes. The main
class here is DataFormat, with format specific information defined in
CnsFormat, CyanaFormat, ... .
This distinction was made because a lot of the conversion code for
importing, for example, distance constraints into the Data Model is
generic - it doesn't matter whether you're importing CYANA or CNS data, a
lot of the operations are exactly the same.
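To make the two layers concrete, here is a minimal stand-alone sketch. It
does not use the real ccp.format or ccpnmr.format.converters code: every
class and function name in it (TempDistanceConstraint, parse_cns_line,
DataFormatSketch, CnsFormatSketch, GenericConstraint) is hypothetical, and
the parser only handles one-line 'assign' statements with simple
'resid N and name X' selections. It assumes the usual CNS convention that
the three numbers are the target distance, the minus and the plus
deviation.

```python
import re

# --- Parser layer (hypothetical): format-specific, no Data Model knowledge ---

class TempDistanceConstraint(object):
    """Temporary holder for one parsed CNS-style restraint."""
    def __init__(self, atoms, targetDist, minusDist, plusDist):
        self.atoms = atoms              # [(resid, atomName), (resid, atomName)]
        self.targetDist = targetDist
        self.minusDist = minusDist
        self.plusDist = plusDist

_SELECTION = r"\(\s*resid\s+(\d+)\s+and\s+name\s+([^\s)]+)\s*\)"
_ASSIGN = re.compile(
    r"assign\s*" + _SELECTION + r"\s*" + _SELECTION +
    r"\s+([\d.]+)\s+([\d.]+)\s+([\d.]+)", re.IGNORECASE)

def parse_cns_line(line):
    """Return a TempDistanceConstraint, or None if the line does not match."""
    match = _ASSIGN.match(line.strip())
    if match is None:
        return None
    r1, a1, r2, a2, d, dminus, dplus = match.groups()
    return TempDistanceConstraint(
        [(int(r1), a1), (int(r2), a2)], float(d), float(dminus), float(dplus))

# --- Conversion layer (hypothetical): generic logic, format-specific hook ---

class GenericConstraint(object):
    """Stand-in for a Data Model constraint with lower and upper limits."""
    def __init__(self, atoms, lowerLimit, upperLimit):
        self.atoms = atoms
        self.lowerLimit = lowerLimit
        self.upperLimit = upperLimit

class DataFormatSketch(object):
    """Import logic shared by all formats; subclasses supply parseLine()."""
    def readDistanceConstraints(self, lines):
        constraints = []
        for line in lines:
            temp = self.parseLine(line)
            if temp is not None:
                constraints.append(self.toGeneric(temp))
        return constraints

    def toGeneric(self, temp):
        # Same operation whatever the source format was.
        return GenericConstraint(temp.atoms,
                                 temp.targetDist - temp.minusDist,
                                 temp.targetDist + temp.plusDist)

class CnsFormatSketch(DataFormatSketch):
    """Only the parsing step is CNS-specific."""
    def parseLine(self, line):
        return parse_cns_line(line)

example = ["assign (resid 1 and name HA) (resid 5 and name HN) 3.0 1.2 1.5",
           "! a comment line that the parser skips"]
for constraint in CnsFormatSketch().readDistanceConstraints(example):
    print("%s %.1f-%.1f" % (constraint.atoms, constraint.lowerLimit,
                            constraint.upperLimit))
```

In this sketch, supporting another format would only mean writing a
different parseLine; the conversion to lower and upper limits is shared.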
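So, in response to your question: the attributes you found (targetDist,
minusDist, ...) belong to the temporary classes of the parser layer, not
to the Data Model. There are basically two ways you could proceed: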
- Use only the parser layer for parsing (and writing) the CNS files, and
then use the temporary classes in there to do your processing. This way
you are not using the Data Model at all and your code will work only for
CNS.
- Use the whole FormatConverter setup to import the CNS constraints into
the Data Model, then do your processing inside the Data Model, and export
the files in CNS again. This way your code works with the Data Model
itself, and can be used for any constraint list that's stored inside the
Data Model, including lists imported from any other format supported by
the FormatConverter (e.g. CYANA).
We definitely recommend the second route - this way your code can be
directly used by anyone working with the Data Model.
Now how would you go about this? Attached is an example script that reads
in a CNS constraint list and a CNS sequence (from a coordinate file - you
can also import a sequence from any other supported format like XEasy or
Fasta). Note that all of this can be done via the FormatConverter GUI as
well, although the GUI cannot be customized to the same extent as a
script.
After importing the sequence and constraints you have to run
'linkResonances' - this is essential when importing NMR or NMR-derived
data. Initially all NMR data is linked to 'Resonance' objects. This means
that you can store all the NMR information without knowing exactly which
atom the 'Resonance' corresponds to. This is also why you have to read in
a sequence in the example below - this creates 'Molecular System',
'Chain', 'Residue', and 'Atom' objects that describe the molecule(s)
you're studying. The 'linkResonances' script, then, allows you to
(semi-)automatically link the 'Resonance' to the 'Atom' objects (see
http://www.ebi.ac.uk/msd-srv/docs/NMR/NMRtoolkit/linkResonances.html for
some more information).
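The idea behind linking can be illustrated with a small stand-alone
sketch. This is not the real linkResonances code and makes no claim about
how it works internally: the Atom and Resonance classes and the
link_resonances function below are hypothetical, and the matching rule
(exact residue number and atom name, with a trailing '*' treated as a
prefix wildcard) is an assumption chosen for illustration.

```python
class Atom(object):
    """Hypothetical stand-in for an atom created by reading a sequence."""
    def __init__(self, seqCode, name):
        self.seqCode = seqCode
        self.name = name

class Resonance(object):
    """Hypothetical stand-in: NMR data is attached here, atoms come later."""
    def __init__(self, seqCode, atomName):
        self.seqCode = seqCode
        self.atomName = atomName
        self.atoms = []                 # stays empty until linked

def link_resonances(resonances, atoms):
    """Link each resonance to matching atoms; return the unlinked ones."""
    by_residue = {}
    for atom in atoms:
        by_residue.setdefault(atom.seqCode, []).append(atom)

    unlinked = []
    for resonance in resonances:
        candidates = by_residue.get(resonance.seqCode, [])
        name = resonance.atomName
        if name.endswith("*"):
            # Assumed wildcard rule: 'HB*' matches HB2, HB3, ...
            matches = [a for a in candidates if a.name.startswith(name[:-1])]
        else:
            matches = [a for a in candidates if a.name == name]
        if matches:
            resonance.atoms = matches
        else:
            unlinked.append(resonance)
    return unlinked

atoms = [Atom(1, "HA"), Atom(1, "HB2"), Atom(1, "HB3"), Atom(5, "HN")]
resonances = [Resonance(1, "HA"), Resonance(1, "HB*"), Resonance(7, "HN")]
leftover = link_resonances(resonances, atoms)
for resonance in resonances:
    names = ",".join(sorted(a.name for a in resonance.atoms))
    print("%d.%s -> [%s]" % (resonance.seqCode, resonance.atomName, names))
```

The point of the sketch is only that the NMR data exists and is usable
before the link is made, and that resonances without a matching atom are
reported rather than lost.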
Once this is done, you can access the information inside the Data Model
proper - I've included some code for this in the attached example script.
You could also, for example, import coordinate files with the
FormatConverter and then write scripts to analyze your constraint lists
compared to the coordinates.
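As a rough idea of what such an analysis could look like, here is a
minimal stand-alone sketch that checks distance limits against
coordinates. It does not use the Data Model: the plain tuples and
dictionaries, the atom labels such as '1.HA', and the function names
distance and find_violations are all hypothetical choices made for
illustration.

```python
import math

def distance(a, b):
    """Euclidean distance between two (x, y, z) tuples."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def find_violations(constraints, coords, tolerance=0.0):
    """Return (atom1, atom2, actual, excess) for each violated constraint.

    constraints: list of (atom1, atom2, lowerLimit, upperLimit)
    coords:      dict mapping atom label -> (x, y, z)
    excess is negative when the atoms are too close, positive when too far.
    """
    violations = []
    for atom1, atom2, lower, upper in constraints:
        actual = distance(coords[atom1], coords[atom2])
        if actual < lower - tolerance:
            violations.append((atom1, atom2, actual, actual - lower))
        elif actual > upper + tolerance:
            violations.append((atom1, atom2, actual, actual - upper))
    return violations

coords = {"1.HA": (0.0, 0.0, 0.0),
          "5.HN": (3.0, 4.0, 0.0),
          "2.HN": (2.0, 0.0, 0.0)}
constraints = [("1.HA", "5.HN", 1.8, 4.5),
               ("1.HA", "2.HN", 1.8, 3.0)]
for atom1, atom2, actual, excess in find_violations(constraints, coords):
    print("%s - %s: distance %.2f, violation %+.2f"
          % (atom1, atom2, actual, excess))
```

With a real project the atom pairs and limits would come from the
constraint items in the Data Model and the coordinates from an imported
structure.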
I hope this clarifies things - do get in touch if you're having problems
with the script or want more information. We are planning, by the way, to
clearly index the scripts we have written so far to do useful things
within the Data Model (e.g. making chemical shift lists from peak lists,
...).
Bye,
Wim.
import Tkinter, os, string

from ccpnmr.format.converters.CnsFormat import CnsFormat
from memops.api import Implementation

if __name__ == '__main__':

    #
    # Open a Tk window for handling the popups...
    #
    root = Tkinter.Tk()

    #
    # Create a CCPN Data Model Project (this is the root object within the
    # Data Model)
    #
    ccpnProject = Implementation.Project(name = 'My_project')

    #
    # Create the FormatConverter CnsFormat object
    #
    cnsFormat = CnsFormat(ccpnProject,root)

    #
    # Read in a sequence - this will create the molecular system with
    # all the atom information.
    #
    # Note that a lot of the popups can be avoided when the right information
    # is passed in (see ccpnmr.format.converters.DataFormat, the readSequence
    # function in the DataFormat class)
    #
    ccpnChains = cnsFormat.readSequence('my_coord_file.pdb')

    #
    # Read in a distance constraint list
    #
    ccpnConstraintList = cnsFormat.readDistanceConstraints('my_constraint_file.tbl')

    #
    # Do some preliminary Data Model navigation to get input parameter for
    # linkResonances
    #
    # An nmrConstraintHead links a group of constraint files
    # A structureGeneration links an nmrConstraintHead with a set of structures
    #
    nmrConstraintHead = ccpnConstraintList.nmrConstraintHead
    structureGeneration = nmrConstraintHead.structureGenerations[0]

    #
    # Run linkResonances (this will generate a lot of output to the shell)
    #
    # Many options are available - see ccpnmr.format.process.linkResonances
    #
    # The current options are the 'safest' to maintain the original information,
    # although bear in mind that here all atoms in the original list are
    # considered to be stereospecifically assigned
    #
    cnsFormat.linkResonances(
        globalStereoAssign = 1,
        setSingleProchiral = 1,
        setSinglePossEquiv = 1,
        strucGen = structureGeneration
    )

    #
    # Save the CCPN project as XML files
    #
    # For default save it will use the <project_name>.xml file for the main
    # information and save other XML files in the <project_name> directory
    #
    if not os.path.exists(ccpnProject.name):
        os.mkdir(ccpnProject.name)
    ccpnProject.saveModified()

    #
    # Navigate the Data Model, get a list of atoms per constraint item
    #
    for distConstr in ccpnConstraintList.constraints:
        print "Constraint %d: %.1f-%.1f" % (distConstr.serial, distConstr.lowerLimit, distConstr.upperLimit)
        for constrItem in distConstr.items:
            #
            # Now list the atoms for each of the two resonances associated with this item
            #
            atomList = []
            for resonance in constrItem.resonances:
                atomList.append([])
                if resonance.resonanceSet:
                    for atomSet in resonance.resonanceSet.atomSets:
                        for atom in atomSet.atoms:
                            atomList[-1].append("%d.%s" % (atom.residue.seqCode,atom.name))
                atomList[-1].sort()
                atomList[-1] = string.join(atomList[-1],',')
            print "    (%s) - (%s)" % (atomList[0],atomList[1])
        print

    #
    # Finally, note that you can read a CCPN project back in as well... use
    # the following as an example:
    #
    #
    # from memops.general.Io import loadXmlProjectFile
    #
    # ccpnProject = loadXmlProjectFile(file = 'My_project.xml')
    #
    #