On Tue, Jul 26, 2011 at 10:46 AM, Bernhard Rupp (Hofkristallrat a.D.) wrote:
Is there a simple way (or already an existing list) to extract/parse from
the heterodictionaries or monomer libraries which 3-letter symbols are
actually modified standard amino acids (as compared to bona fide ligands,
solvent molecules etc)?

You could start by searching for "L-peptide" in the CIF files (script appended).  This won't actually tell you which are modified standard amino acids, or which occur as part of protein chains, but it will narrow down the list.  (Piping the output into "grep TYROSINE" yields 46 entries.)

-Nat

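#!/bin/sh
# Print every line mentioning "L-peptide" in the monomer library CIF files.

# Location of the CCP4 monomer library; adjust for your installation.
MON_LIB="/Applications/ccp4-6.2.0/lib/data/monomers"
for dir_name in "${MON_LIB}"/*; do
  if [ -d "${dir_name}" ]; then
    # -F matches the string literally; /dev/null makes grep always print the file name.
    grep -F "L-peptide" "${dir_name}"/*.cif /dev/null
  fi
done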
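
If only the list of symbols is wanted rather than the full matching lines, the output can be reduced further. The following is a rough sketch, not a tested recipe: the function name list_l_peptide_codes is made up for illustration, and it assumes that each matching line begins with the component code as its first whitespace-delimited field, which should be checked against a few real entries before relying on it.

```shell
#!/bin/sh
# Sketch: print the unique codes of entries whose CIF line mentions "L-peptide".
# Assumption: the first whitespace-delimited field of a matching line is the code.

list_l_peptide_codes() {
  # $1: root directory of the monomer library
  find "$1" -type f -name '*.cif' -exec grep -hF "L-peptide" {} + |
    awk '{ print $1 }' |
    sort -u
}
```

Calling it as list_l_peptide_codes "$MON_LIB" would print one code per line. As with the script above, this only narrows the list down; it does not tell you which entries are modified standard amino acids.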