Postdoctoral position in Statistics/Bioinformatics/Computer science at AgroParisTech / LBBE (Lyon)
Subject: Methods for Identifying Alternative Splicing (and Associated Genetic Variants) from NGS Data.
New sequencing technologies can now be applied to the study of mRNA, through the RNA-seq protocol. These techniques yield large amounts of short reads, which then need to be reassembled to identify and quantify the full-length mRNAs initially present in the sample. While this problem is simple when there is only one mRNA per gene, it becomes challenging when a gene gives rise to several alternative splicing variants with different exon content. Recent studies estimate that up to 90% of multi-exon genes in humans are alternatively spliced. Hence, what was once thought to be an exception seems in fact to be the rule. Identifying and quantifying all the variants of a gene is therefore a major challenge in the field, for which various methods have been proposed, including penalized regression approaches [Li et al., 2011, Bernard et al., 2014].
Unambiguously assigning an expression level to each variant is, however, not always possible, essentially because sequencing reads are short and genes may have many (unannotated) variants [Lacroix et al., 2008]. In practice, several sets of variants may give solutions which are equally good in terms of penalized or constrained likelihood. Yet most methods output only one best solution, without mentioning that there could be other equally good ones. This is not satisfactory and could lead to spurious biological conclusions.
A first step of this postdoc would be to carefully analyze the conditions under which a solution is unique and, most importantly, when it is not, to characterize the set of solutions and propose sub-solutions that may be unique. The candidate will examine this problem in the context of shallow sequencing with long reads and/or deep sequencing with short reads.
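As a toy illustration of the non-uniqueness described above, the following sketch considers a hypothetical gene with exons A to E, where B and D are each optional. It assumes (for illustration only, not as a description of the cited methods) that reads are so short that each one informs a single local event: inclusion or skipping of B, or inclusion or skipping of D. The gene, the design matrix and the function names are all invented for this example.

```python
from fractions import Fraction

# Hypothetical toy gene: exons A-B-C-D-E, where B and D are each optional.
ISOFORMS = ["ABCDE", "ABCE", "ACDE", "ACE"]

# Rows are local read features; columns follow the order of ISOFORMS.
# An entry is 1 when the isoform can produce reads for that feature.
DESIGN = [
    [1, 1, 0, 0],  # reads covering exon B
    [0, 0, 1, 1],  # reads spanning the A-C junction (B skipped)
    [1, 0, 1, 0],  # reads covering exon D
    [0, 1, 0, 1],  # reads spanning the C-E junction (D skipped)
]


def expected_counts(design, abundances):
    """Expected signal per feature for a vector of isoform abundances."""
    return [sum(x * a for x, a in zip(row, abundances)) for row in design]


def rank(matrix):
    """Matrix rank by exact Gaussian elimination over the rationals."""
    rows = [[Fraction(v) for v in row] for row in matrix]
    r = 0
    for col in range(len(rows[0])):
        pivot = next((i for i in range(r, len(rows)) if rows[i][col] != 0), None)
        if pivot is None:
            continue
        rows[r], rows[pivot] = rows[pivot], rows[r]
        for i in range(r + 1, len(rows)):
            factor = rows[i][col] / rows[r][col]
            rows[i] = [a - factor * b for a, b in zip(rows[i], rows[r])]
        r += 1
    return r


# Two different sparse explanations of the same data.
solution_1 = [1, 0, 0, 1]  # only ABCDE and ACE are expressed
solution_2 = [0, 1, 1, 0]  # only ABCE and ACDE are expressed

print("rank of design:", rank(DESIGN), "for", len(ISOFORMS), "isoforms")
print("expected counts, solution 1:", expected_counts(DESIGN, solution_1))
print("expected counts, solution 2:", expected_counts(DESIGN, solution_2))
```

In this toy setting the design matrix has rank 3 for 4 isoforms, so the abundances are not identifiable: both solutions predict the same counts and have the same total abundance, so a likelihood with an L1-type penalty cannot distinguish them. Reads long enough to span both B and D would add features that separate the two solutions.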
Depending on the profile of the candidate, other directions for this postdoc include working on the identification of genetic variants associated with alternative splicing. Indeed, while many methods have been published to identify genetic variants on the one hand, and alternative splicing on the other hand, few methods [Monlong et al., 2014] try to connect the two fields and take advantage of the wealth of sequences generated at both the DNA and RNA levels for the same individuals.
Context
The postdoctoral fellow will work either in the UMR 518 AgroParisTech-INRA de Mathématiques et Informatique Appliquées (MIA) in Paris or in the UMR CNRS 5558 de Biométrie et de Biologie Évolutive (LBBE) in Lyon. Both groups specialize in statistics and bioinformatics.
This postdoctoral position is offered in the framework of the ABS4NGS ANR project (https://sites.google.com/site/abs4ngs/), which started at the end of 2012. The postdoctoral fellow will thus discuss and collaborate with the different partners of this project, among them Institut Curie.
Background
The applicant should have a strong background in applied statistics, bioinformatics and/or computer science; typically a PhD in one of these fields. A strong experience in programming is also desirable.
Location.
• UMR 518 AgroParisTech-INRA MIA: 16 Rue Claude Bernard, Paris 5ème, www.agroparistech.fr/mia/
• UMR CNRS 5558 LBBE: Campus de La Doua - Université Claude Bernard - Lyon 1, 16 rue Raphael Dubois (Bâtiment Grégoire Mendel), Villeurbanne, lbbe.univ-lyon1.fr/
Salary.
Depending on the applicant's past experience: 1968 euros per month (2398 euros before taxes) or 2142 euros per month (2611 euros before taxes).
Duration.
The position is for 18 months.
Contact.
• MIA 518: Stéphane Robin: [log in to unmask]
• LBBE: Laurent Jacob: [log in to unmask], Vincent Lacroix: [log in to unmask]
References
• Elsa Bernard, Laurent Jacob, Julien Mairal, and Jean-Philippe Vert. Efficient RNA isoform identification and quantification from RNA-seq data with network flows. Bioinformatics, 30(17):2447–2455, 2014. doi: 10.1093/bioinformatics/btu317. URL http://dx.doi.org/10.1093/bioinformatics/btu317.
• Vincent Lacroix, Michael Sammeth, Roderic Guigó, and Anne Bergeron. Exact transcriptome reconstruction from short sequence reads. In Keith A. Crandall and Jens Lagergren, editors, Algorithms in Bioinformatics, 8th International Workshop, WABI 2008, Karlsruhe, Germany, September 15-19, 2008. Proceedings, volume 5251 of Lecture Notes in Computer Science, pages 50–63. Springer, 2008. ISBN 978-3-540-87360-0. doi: 10.1007/978-3-540-87361-7_5. URL http://dx.doi.org/10.1007/978-3-540-87361-7_5.
• Jingyi Jessica Li, Ci-Ren Jiang, James B Brown, Haiyan Huang, and Peter J Bickel. Sparse linear modeling of next-generation mRNA sequencing (RNA-seq) data for isoform discovery and abundance estimation. Proc Natl Acad Sci U S A, 108(50):19867–19872, Dec 2011. doi: 10.1073/pnas.1113972108. URL http://dx.doi.org/10.1073/pnas.1113972108.
• Jean Monlong, Miquel Calvo, Pedro G. Ferreira, and Roderic Guigó. Identification of genetic variants associated with alternative splicing using sQTLseekeR. Nature Communications, 5:4698, 2014. doi: 10.1038/ncomms5698.