Statistics Seminar
Cardiff University School of Mathematics, Senghennydd Road, Cardiff
Room M/0.37
2.30pm Monday 19th April 1999
Professor David Siegmund
An Approximate P-value for Sequence Alignments
Abstract:
An important step in learning the function of a new gene (DNA
sequence) or protein amino acid sequence is to compare the new
sequence with existing sequences whose function is already known.
Assume that two sequences from a finite alphabet are optimally
aligned according to a scoring system that penalizes mismatches and
indels (insertions and deletions). To evaluate the quality of that
alignment, an approximate p-value is obtained for the case that
(i) the letters in each sequence are independent and identically
distributed and (ii) the penalty for starting a new interval
of indels is large compared to the penalties for a mismatch and for
extending an existing interval of indels. I will explain the
scientific background of problems of sequence alignment and its
relation to problems of queueing theory and change-point detection.
If time permits, I will give an idea of the proof of the p-value
approximation, which extends the method of Pollak and Yakir (1998)
and of Siegmund and Yakir (1999), a method motivated by the study
of change-point problems.
Enquiries to Terence Iles