Introduction to identifying and characterising somatic variants

  • Sheffield - 10th May 2019
  • 9:30am - 5pm
  • Pam Liversidge Building, Design Studio 1, D06

Overview

Healthcare systems around the globe are transforming medical practice to incorporate genetic variation as vital information for diagnosis and treatment planning. Genome sequencing technologies allow us to detect all variants in a patient’s genome, and international collaborative efforts such as the 100,000 Genomes Project, The Cancer Genome Atlas (TCGA) and the International Cancer Genome Consortium (ICGC) have begun to catalogue and release data on genomic variation in a variety of disease types.

However, such datasets pose new challenges in how the data are analysed, annotated and interpreted. These challenges are not trivial and can be daunting to the clinician or biomedical scientist.

This course covers state-of-the-art and best-practice tools for the analysis of cancer genomes. We describe, and give hands-on experience of, the entire analysis workflow from raw data generated by a sequencing machine to deriving variant calls (e.g. single nucleotide variants, SNVs) that are ready for downstream analysis, interpretation and prioritisation.

We will describe the steps involved in going from a sequencing library to a prioritised, clinically-relevant list of DNA variants. Practical sessions will use the user-friendly Galaxy interface (https://usegalaxy.eu/) to demonstrate tasks such as alignment, quality control, variant calling and annotation.

Materials

Who should attend this course?

Healthcare providers and researchers who want an appreciation of how somatic variants are identified and annotated, but who do not necessarily work on this problem routinely.

Objectives:- After this course you should be able to:

  • Understand the main file formats used for NGS analysis (BAM, VCF, BED etc.), what is included in each file and the appropriate tools for manipulating each file
  • Understand the concepts and challenges involved in calling somatic SNVs
  • Manually review a set of variant calls
  • Communicate more effectively with the bioinformaticians dealing with your data

Aims:- During this course you will learn about:

  • The nature and scale of NGS data and the requirement for sophisticated computational methods
  • The theory behind current methods for calling somatic SNVs from NGS data, and their outputs
  • Given a set of called SNVs, how to i) assess quantitatively and qualitatively which calls are likely to be “real” and ii) assess which calls might be biologically meaningful and warrant further investigation
  • Annotating a set of somatic variants using public resources
  • Exploring NGS data using interactive tools such as the Integrative Genomics Viewer (IGV)

Prerequisites

  • No prior programming experience is required. Prior knowledge of NGS technologies would be an advantage, but is not required.

Instructors

  • Dr Matthew Parker, Clinical Bioinformatics Core Scientist
  • Dr Mark Dunning, Bioinformatics Core Director

Timetable (provisional)

  • 09:30 - 10:30 - Intro & Somatic Variants: Quick Recap of Pre-material
  • 10:30 - 13:00 - FastQ Files, QC and Alignment (coffee during)
  • 13:00 - 14:00 - Lunch (not provided)
  • 14:00 - 15:00 - Variant Calling & Annotation
  • 15:00 - 16:30 - Filtering & Manual Review of Variant Calls

Registration

Registration is open now


For queries relating to collaborating with the Bioinformatics Core team on projects: bioinformatics-core@sheffield.ac.uk

Subscribe to this Google Group to join our mailing list and be notified when we advertise talks and workshops. You can also connect with us on LinkedIn.

Requests for a Bioinformatics support clinic can be made via the Research Software Engineering (RSE) code clinic system. This is monitored by Bioinformatics Core staff, so we will ensure that the appropriate expertise (which may involve individuals from multiple teams) is available to help you.

Queries regarding sequencing and library preparation provision at The University of Sheffield should be directed to the Multi-omics facility in SITraN or the Genomics Laboratory in Biosciences.