KEELE UNIVERSITY
CENTRE FOR MEDICAL STATISTICS
SEMINAR SERIES on BIOSTATISTICS 2001/2002
NO 10: Wednesday, March 6th 2002, at 2:30 pm
"Percolation of annotation errors in a database of protein sequences"
Wally Gilks
MRC Biostatistics Unit, Cambridge
Protein sequence databases curate information on the sequence, structure
and function of proteins. Genome sequencing projects have led to a rapid
increase in protein sequence information, but reliable, experimentally
verified information on protein function lags a long way behind. To
address this deficit, functional annotation in protein databases is
often inferred by sequence similarity to other annotated proteins, with
the attendant possibility of error. Now, the functional annotation in
these other proteins may itself have been acquired through sequence
similarity to yet other proteins, and it is generally not possible to
determine how the functional annotation of any given protein has been
acquired. Thus the possibility of chains of misannotation arises. With
some simple assumptions, we develop a dynamical probabilistic model for
these misannotation chains, and explore the consequences for annotation
quality.
All welcome!!
Venue:
Room 2.22, Third Floor
MacKay Building
Keele University
http://www.keele.ac.uk
http://www.keele.ac.uk/depts/ma/seminars/medstats.html
http://www.keele.ac.uk/university/campus/maps/
*****************************
Janet Drewery
Centre for Medical Statistics
MacKay Building
Keele University
Keele, Staffs ST5 5BG
Tel: (01782) 583269
Fax: (01782) 583269/584268
*****************************