That is a different question; I thought you already knew which atoms were duplicates.
To identify them, there are programs that will print a list of pairs of atoms within
a threshold distance from each other.
But in this case I think the plain old "sort" program will be easiest.
1. Get all the waters in one file
grep HOH mypdb.pdb >waters1.pdb
(or awk '$1~/HETATM/ && $4~/HOH/' mypdb.pdb > waters1.pdb)
(you may need to adjust this depending on what you call the waters)
2. sort -nk7 waters1.pdb >! waters2.pdb
This sorts the atoms on the X-coordinate value, so identically positioned waters will be adjacent in the file. (The >! redirect is csh/tcsh syntax for forcing an overwrite; in bash use > or >| instead.)
3. Go through waters2.pdb with a text editor, looking at the 7th column. Find pairs or clusters of adjacent lines with the same x,y,z values, and delete all but one line in each group.
4. This is still pdb format, so you can merge the remaining atoms into a copy of your pdb file with the waters removed, made by something like
awk '$1!~/HETATM/ || $4!~/HOH/' mypdb.pdb > nowaters.pdb
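If the hand-editing in step 3 gets tedious, a one-line awk filter could do the same job without sorting first. This is only a sketch under the same assumptions as the commands above (x, y, z are whitespace-separated columns 7, 8 and 9); the function name dedupe_waters and the output file name are made up for illustration.

```shell
# Keep only the first water seen at each exact (x, y, z) position.
# Assumes x, y, z are whitespace-separated fields 7, 8, 9, as above.
# Positions are compared as text, so 1.000 and 1.0 would NOT match;
# that is fine for coordinates written in the usual %8.3f format.
dedupe_waters() {
    awk '!seen[$7 FS $8 FS $9]++' "$@"
}

# Hypothetical usage:
#   dedupe_waters waters1.pdb > waters_unique.pdb
```

The filter keeps whichever copy comes first in the file, so if the duplicates differ in B-factor or occupancy you may still want to look at them before deciding which one to keep.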
The awk and sort commands here assume you have a plain vanilla pdb file: no alternate conformations, no insertion codes, fewer than 9999 atoms, and a chain letter present, so that the X coordinate is the 7th column. They also assume there are no ANISOU records for the waters. See the recent discussion on the advisability of using shell commands vs custom tools for modifying pdb files.
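For the threshold-distance approach mentioned at the top, something like the following might serve if no dedicated program is at hand. It is a rough sketch, not a tested tool: it reads the coordinates from the fixed PDB columns (x in 31-38, y in 39-46, z in 47-54, residue name in 18-20) rather than from whitespace-separated fields, so it does not depend on the column-counting assumptions above. The function name close_waters and the CUTOFF variable are invented for this example.

```shell
# List pairs of HOH atoms closer than CUTOFF Angstrom (default 0.1).
# Prints: serial_1 serial_2 distance
# Reads fixed PDB columns, so fused or missing fields do not matter.
# Every pair is compared (O(n^2)), fine for a few thousand waters.
close_waters() {
    awk -v cut="${CUTOFF:-0.1}" '
        /^(ATOM  |HETATM)/ && substr($0, 18, 3) == "HOH" {
            n++
            ser[n] = substr($0, 7, 5) + 0    # atom serial number
            x[n] = substr($0, 31, 8) + 0
            y[n] = substr($0, 39, 8) + 0
            z[n] = substr($0, 47, 8) + 0
        }
        END {
            for (i = 1; i < n; i++)
                for (j = i + 1; j <= n; j++) {
                    dx = x[i] - x[j]; dy = y[i] - y[j]; dz = z[i] - z[j]
                    d2 = dx * dx + dy * dy + dz * dz
                    if (d2 < cut * cut)
                        printf "%s %s %.3f\n", ser[i], ser[j], sqrt(d2)
                }
        }' "$@"
}

# Hypothetical usage:
#   CUTOFF=0.5 close_waters mypdb.pdb
```

This only reports the pairs; deciding which atom of each pair to delete is still left to you.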
On 10/29/2014 11:53 PM, luzuok wrote:
> Dear Nicolas,
> It is really time-consuming! Philip told me to run the structure through the PDB validation server. It will report an error if there are duplicate molecules. Then I can find them directly in a text editor.
> I think it is better for COOT to solve this issue.
>
> Best regards!
> Lu Zuokun
>
>
>
> --
> Lu Zuokun
> Nankai University, New Biology Station A202
>
>
> At 2014-10-29 22:29:35, "FOOS Nicolas" <[log in to unmask]> wrote:
>>Dear Lu,
>>
>>One simple solution is to remove the duplicate water molecules with a text editor, for example. How practical that is depends on how many times the water molecules were duplicated and on how many water molecules your model has.
>>In Coot you can remove them graphically but, to my knowledge, not automatically, and it may be time-consuming.
>>
>>Hope this helps
>>Nicolas
>>
>>________________________________________
>>From: CCP4 bulletin board [[log in to unmask]] on behalf of luzuok [[log in to unmask]]
>>Sent: Wednesday, 29 October 2014 13:08
>>To: [log in to unmask]
>>Subject: [ccp4bb] water at the same exactly position
>>
>>Dear all,
>> I found that there are some water molecules in my pdb that share the same position. This may be caused by merging molecules in Coot. It seems that I have merged water molecules into my protein more than once.
>>Can anyone tell me how to fix this problem?
>>
>>Best regards!
>>Lu Zuokun
>>
>>
>>
>>
>>--
>>Lu Zuokun
>>Nankai University, New Biology Station A202
>>
>
>
>