Why molecular weight? That's just arbitrary.
There is a simple way of referring to a protein which avoids any
ambiguity - by its sequence. When referring to a protein, we should use
its sequence as an identifier. No ambiguity.
Now, I understand that some smart people in America are now solving
proteins of more than a dozen aa in length. For these, quoting the whole
sequence could be a bit long. Fortunately this is a solved problem: all
we need to do is quote a CRC64 hash of the ASCII representation of the
protein sequence. This gives a name space big enough that we can name
about 4 billion proteins before the probability of a name clash becomes
significant.
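
For anyone who wants to try this at home, here is a rough sketch of the
naming scheme in Python. I haven't said which CRC64 variant to use, so the
ECMA-182 polynomial below is an assumption, as are the function names
(crc64, protein_name, clash_probability) and the choice to upper-case the
sequence before hashing. The last function is the usual birthday
approximation, which is where the "about 4 billion" (2^32) figure comes
from.

```python
import math

# Assumption: CRC-64/ECMA-182 polynomial (MSB-first, init 0, no final XOR).
# Any other CRC64 variant would work as long as everyone uses the same one.
CRC64_POLY = 0x42F0E1EBA9EA3693
MASK64 = 0xFFFFFFFFFFFFFFFF


def crc64(data: bytes) -> int:
    """Bitwise CRC64 of a byte string (slow but easy to check by eye)."""
    crc = 0
    for byte in data:
        crc ^= byte << 56              # feed the next byte into the top bits
        for _ in range(8):
            if crc & (1 << 63):        # top bit set: shift and subtract poly
                crc = ((crc << 1) ^ CRC64_POLY) & MASK64
            else:
                crc = (crc << 1) & MASK64
    return crc


def protein_name(sequence: str) -> str:
    """Hypothetical helper: 16 hex digits naming a one-letter aa sequence."""
    return f"{crc64(sequence.upper().encode('ascii')):016X}"


def clash_probability(n_names: int, bits: int = 64) -> float:
    """Birthday approximation: chance of at least one clash among n_names."""
    pairs = n_names * (n_names - 1) / 2
    return 1.0 - math.exp(-pairs / 2**bits)


if __name__ == "__main__":
    print(protein_name("MKVLAAGIVGL"))    # made-up example sequence
    print(clash_probability(2**32))       # clash odds at about 4 billion names
```

With a 64-bit name the clash probability only becomes appreciable once
the number of named proteins approaches 2^32, so we have some headroom.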
James Stroud wrote:
> I think actually *naming* the proteins would be too extreme. Even the
> current alpha-numeric system is overwrought. I liked it better when we
> just called proteins "p75" or "p105". For instance, how many proteins in
> the human genome are 75 kD, anyway? My guess is not enough to make the
> situation ambiguous in any catastrophic way.