

TUNICATA Archives



TUNICATA@JISCMAIL.AC.UK



TUNICATA October 2018


Subject: Introduction to bioinformatics for DNA and RNA sequence analysis (IBDR01)
From: Oliver Hooker
Date: Mon, 15 Oct 2018 16:45:57 +0100

Introduction to bioinformatics for DNA and RNA sequence analysis (IBDR01)

https://www.prinformatics.com/course/introduction-to-bioinformatics-for-dna-and-rna-sequence-analysis-ibdr01/

This course will be delivered by Malachi Griffith from 29th October to 2nd November 2018 in Glasgow city centre.

Please feel free to share.

Course Overview:
Analysis of high-throughput genome and transcriptome data is a major component of many research projects, ranging from large-scale precision medicine efforts to focused investigations in model systems. This analysis involves the identification of specific genome or transcriptome features that predispose individuals to disease, predict response to therapies, influence diagnosis/prognosis, or provide mechanistic insights into disease models. During this course (IBDR01), students will perform an example end-to-end bioinformatics analysis of genome (WGS and exome) and transcriptome (RNA-seq) data. Students will start with raw sequence data for a hypothetical case, learn to install and use the tools needed to analyze this data on the cloud, and visualize and interpret results. After completing the course, students should be in a position to (1) understand raw sequence data formats, (2) perform bioinformatics analyses on the cloud, (3) run complete analysis pipelines for alignment, variant calling, annotation, and RNA-seq (transcriptome analysis approaches will be a major component of the workshop), (4) visualize and interpret whole genome, exome and RNA-seq results, (5) leverage the identification of passenger variants for immunotherapy applications, and (6) begin to place these results in a clinical context by use of variant knowledgebases. The data, tools, and analysis will be most directly relevant to human genomics and bioinformatics research. However, many of the skills and concepts covered will be applicable to other human diseases and model organisms. Furthermore, many analysis concepts covered during the workshop will be broadly applicable to other “big data” research problems. All course materials (including copies of presentations, practical exercises, data files, and example scripts prepared by the instructing team) will be provided electronically to participants.

Monday 29th – Classes from 09:30 to 17:30

Session 1. Introduction to genomics and bioinformatics.
In this session, students will be introduced to key concepts of genomics and their application to genomics research and precision medicine in cancer. An introduction to next-generation sequencing platforms and related bioinformatics approaches will also be provided. Core concepts and tools introduced: fundamentals of genome and transcriptome analysis, next-generation sequencing, precision/personalized medicine approaches (using cancer as an exemplar disease).

Session 2. Introduction to genomics data, file formats, QC, and cloud analysis.
In this session, students will be introduced to a hypothetical patient case and related samples to be analyzed throughout the course. Students will be provided with an introduction to the whole genome, exome, transcriptome and other data sets we have generated for this test case. Information on where to get the raw data and how to access it (and other test data) will be provided. Using this data as an example, the students will learn the fundamentals of next-generation sequencing (NGS) data formats. The students will also be introduced to accessory files needed for analysis, including reference genomes, reference transcriptomes, and annotation files. Tools for QC analysis of raw data will be demonstrated. Since most analysis will be performed on the cloud, each student will learn how to launch and log into their own cloud compute environment. Students will learn how to install bioinformatics tools and learn to use some of the most broadly useful toolkits for NGS data. Core concepts and tools introduced: file formats (FASTA, FASTQ, SAM/BAM/CRAM, VCF, GTF), bedtools, Picard, samtools, FastQC, cloud computing (AWS, EC2).
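As an illustration of the FASTQ format named above (not taken from the course materials), here is a minimal Python sketch that reads four-line FASTQ records and decodes the quality string. It assumes Phred+33 encoding and unwrapped records; the function names `read_fastq` and `mean_quality` and the two example reads are hypothetical.

```python
from io import StringIO


def read_fastq(handle):
    """Yield (read_id, sequence, qualities) from a four-line-per-record FASTQ stream.

    Assumes Phred+33 quality encoding and that sequences are not line-wrapped.
    """
    while True:
        header = handle.readline().rstrip("\n")
        if not header:  # end of input
            return
        seq = handle.readline().rstrip("\n")
        handle.readline()  # the '+' separator line is ignored
        qual = handle.readline().rstrip("\n")
        if not header.startswith("@") or len(seq) != len(qual):
            raise ValueError(f"malformed FASTQ record near {header!r}")
        # Phred+33: quality score = ASCII code of the character minus 33
        yield header[1:], seq, [ord(c) - 33 for c in qual]


def mean_quality(qualities):
    """Arithmetic mean of the per-base Phred scores of one read."""
    return sum(qualities) / len(qualities)


# Two made-up reads, purely for demonstration.
example = "@read1\nACGT\n+\nIIII\n@read2\nGGCA\n+\n!!5I\n"
records = list(read_fastq(StringIO(example)))
for read_id, seq, quals in records:
    print(read_id, seq, quals, mean_quality(quals))
```

Dedicated tools such as FastQC report far more than a mean score; the sketch only shows how the quality line maps onto numbers.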

Tuesday 30th – Classes from 09:30 to 17:30

Session 3. Primary genome data analysis (sequence alignment and visualization).
In this session, we will start analyzing NGS data at the command line. Students will log into the cloud and, starting with their own copy of the raw data, will align the whole genome and exome data to a reference genome. Following alignment, students will conduct a second quality analysis of the data and learn to visualize alignments in IGV. Core concepts and tools introduced: alignment algorithms, reference indexes, BWA, BWA-MEM, alignment indexes, alignment flags, genome browsers, duplicate marking, alignment merging and sorting, IGV.
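The "alignment flags" mentioned above are the bitwise FLAG field of a SAM/BAM record. The following Python sketch, which is not part of the course materials, decodes a FLAG integer into the bit meanings defined in the SAM specification; the short names in `SAM_FLAGS` and the function `decode_flag` are illustrative labels.

```python
# Bit values follow the SAM specification; the short names are our own labels.
SAM_FLAGS = {
    0x1: "paired",
    0x2: "proper_pair",
    0x4: "unmapped",
    0x8: "mate_unmapped",
    0x10: "reverse",
    0x20: "mate_reverse",
    0x40: "first_in_pair",
    0x80: "second_in_pair",
    0x100: "secondary",
    0x200: "qc_fail",
    0x400: "duplicate",
    0x800: "supplementary",
}


def decode_flag(flag):
    """Return the names of all bits set in a SAM FLAG value, lowest bit first."""
    return [name for bit, name in SAM_FLAGS.items() if flag & bit]


print(decode_flag(99))    # a typical first-in-pair, properly paired read
print(decode_flag(1107))  # the same kind of read on the reverse strand, marked duplicate
```

The duplicate bit (0x400) is the one set by duplicate-marking tools, which is why flag decoding is useful when checking alignments by hand.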

Session 4. Whole genome and exome variant calling and annotation.
In this session, we will introduce different algorithms for identifying sequence variations of various types from either whole genome or exome data (or both). Both germline and somatic variant calling will be covered. For each, students will learn strategies for identifying false positives and increasing confidence in individual predictions by manual or secondary examination of the alignments. Variant types detected will include single nucleotide variants (SNVs), small insertions and deletions (indels), copy number variants (CNVs) and structural variants (SVs). Students will learn strategies for visualizing and presenting variants of each type. After producing filtered variant results of each type, annotation methods and resources relevant to each variant type will be demonstrated. Core concepts and tools introduced: germline variation, somatic variation, variant calling, false positives, false negatives, alignment artifacts, manual review, svviz, Manta, GATK, Strelka, MuTect, VarScan, CopyCat, Lumpy.

Wednesday 31st – Classes from 09:30 to 17:30

Session 5. RNA-seq analysis (introduction, alignment and abundance estimation).
In this session, students will learn about fundamentals of RNA-seq data analysis and perform initial QC and alignment of the raw transcriptome data. Appropriate sample comparisons for RNA-seq and other experimental design and analysis considerations will be discussed in detail. Core concepts and tools introduced: reference transcriptomes, normalization, batch effects, replicates, spliced alignment algorithms, RNA-seq data trimming, RNA assembly algorithms, RNASeqQC, HISAT, StringTie.
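To make the "normalization" concept above concrete, here is a small Python sketch of transcripts-per-million (TPM) scaling, a common abundance unit. It is a generic illustration, not the method of any particular tool on the list; the function name `tpm`, the gene names, and the counts and lengths are all made up.

```python
def tpm(counts, lengths_bp):
    """Convert raw read counts to transcripts per million (TPM).

    counts:     dict of gene -> number of reads assigned to that gene
    lengths_bp: dict of gene -> transcript length in base pairs
    """
    # Step 1: reads per kilobase, which corrects for transcript length.
    rates = {gene: counts[gene] / (lengths_bp[gene] / 1000.0) for gene in counts}
    total = sum(rates.values())
    if total == 0:
        raise ValueError("no reads assigned to any gene")
    # Step 2: rescale so the values in one sample sum to one million.
    return {gene: rate / total * 1e6 for gene, rate in rates.items()}


# Hypothetical counts and lengths for three genes.
counts = {"geneA": 100, "geneB": 200, "geneC": 300}
lengths = {"geneA": 1000, "geneB": 2000, "geneC": 1000}
result = tpm(counts, lengths)
for gene, value in result.items():
    print(gene, round(value, 1))
```

Note that geneA and geneB end up with the same TPM even though geneB has twice the reads, because geneB is twice as long.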

Session 6. RNA-seq analysis (fusions, differential expression, and clustering).
The uses of transcriptome data in biological research are remarkably varied. Students will pursue several strategies in this section. Fusion detection, an RNA-seq-specific variant detection approach, will be performed. The expression abundance results from the previous section will be used to identify a list of highly expressed genes. Comparison to RNA-seq data from a cohort of related samples will be used to identify expression outliers. Expression clustering algorithms will be used to stratify our case using a known expression signature. More advanced classification and pathway-based approaches to stratification will be briefly introduced. Core concepts and tools introduced: fusion calling, outlier analysis, expression clustering, stratification, heatmaps, Ballgown, pizzly.
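One simple way to picture the cohort comparison described above is a z-score of the case against the cohort on a log scale. The Python sketch below is an assumption-laden illustration, not the course's method: the function names, the log2(x + 1) transform, the threshold of 3, and the expression values are all hypothetical.

```python
import math
from statistics import mean, stdev


def expression_zscore(case_value, cohort_values):
    """Z-score of one case against a cohort, computed on log2(x + 1) values."""
    logs = [math.log2(v + 1) for v in cohort_values]
    return (math.log2(case_value + 1) - mean(logs)) / stdev(logs)


def is_outlier(case_value, cohort_values, threshold=3.0):
    """Flag the case when its absolute z-score exceeds the (arbitrary) threshold."""
    return abs(expression_zscore(case_value, cohort_values)) > threshold


# Made-up abundance values for one gene in five cohort samples.
cohort = [7, 15, 31, 15, 7]
print(expression_zscore(255, cohort), is_outlier(255, cohort))
print(expression_zscore(15, cohort), is_outlier(15, cohort))
```

With a cohort this small the standard deviation is unstable, so a real analysis would want more samples and possibly a robust statistic such as the median absolute deviation.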

Thursday 1st – Classes from 09:30 to 17:30

Session 7. Prioritization, visualization and interpretation.
In this session, students will learn about procedures for refining the final results obtained from the previous analyses of our case data. Genome and transcriptome variant observations will be prioritized according to various annotation strategies. These vary from algorithmic predictions of pathogenicity to intersecting with results from population databases. Students will also learn how to integrate results from the DNA and RNA-seq analyses. For example, variants will be prioritized according to their expression status, allele-specific expression bias, and the abundance of associated genes. Fusions predicted in the RNA will be confirmed in the DNA. Visualization techniques will be used to place variant observations from our case in the context of a cohort of previously sequenced cases with the same disease. A group discussion will tackle how to approach creating a final clinical interpretation for our example patient. Core concepts and tools introduced: allele-specific expression, clonality, GenVisR, gnomAD, CADD, bam-readcount, integrate.

Session 8. Gene/variant knowledgebases and clinical actionability.
In this session, students will learn the fundamentals of interpreting genome and transcriptome observations in a clinical context. The final candidate observations for our example case will be examined using various clinical interpretation tools and databases. Core concepts and tools introduced: druggability, actionability, sensitivity, resistance, predictive variants, diagnostic variants, prognostic variants, predisposing variants, the ACMG and AMP guidelines for clinical actionability, variant knowledgebases, cBioPortal, CIViC, ClinVar, DGIdb, PharmGKB.

Friday 2nd – Classes from 09:30 to 16:00

Session 9. Leveraging passenger variants (monitoring and immunogenomics).
Up until this point, we have been focused on identifying, annotating and interpreting variants that are potentially relevant to disease biology or clinical interpretation. These are variants that are deemed functional, actionable, or of some known clinical relevance. What about those variants that may be unusual or unique to this case but of no known significance? What about the “passenger” variants? In this section, we will explore two broad strategies that leverage passenger variants in a clinically useful way (using cancer as an exemplar disease for this approach). First, we will examine their potential use in tracking response to therapy. Second, we will explore the possible immunogenomic implications of passenger variants by designing a personalized cancer vaccine for our example case. Core concepts and tools introduced: cfDNA, serial analysis, immunotherapy, pVacTools.

Session 10. Application to your own data.
Optional free afternoon to cover previous modules or consult with the team of instructors. In this session, students will be free to work on the previously covered sections, either on their own or in groups. Students can also consult with the team of instructors on their own experiments or get practical advice for analyzing their own data. Our hope is to make this session as interactive and useful as possible.

To learn more about the team of instructors, please visit www.griffithlab.org and http://genome.wustl.edu/people/groups/detail/griffith-lab/.
