Introduction to identifying and characterising somatic variants

  • Sheffield - 10th May 2019
  • 09:30am - 5pm
  • Pam Liversidge Building, Design Studio 1, D06


Healthcare systems around the globe are transforming medical practice to incorporate genetic variation as vital information for diagnosis and treatment planning. Genome sequencing technologies allow us to detect all variants in a patient’s genome, and international collaborative efforts such as The 100,000 Genomes Project, The Cancer Genome Atlas (TCGA) and the International Cancer Genome Consortium (ICGC) have begun to catalogue and release data on genomic variation in a variety of disease types.

However, such datasets pose new challenges in how the data are analysed, annotated and interpreted. These challenges are not trivial and can be daunting to the clinician or biomedical scientist.

This course covers state-of-the-art and best-practice tools for the analysis of cancer genomes. We describe, and give hands-on experience of, the entire analysis workflow from raw data generated by a sequencing machine to deriving variant calls (e.g. Single Nucleotide Variants) that are ready for downstream analysis, interpretation and prioritisation.

We will describe the steps involved in going from a sequencing library to a prioritised, clinically-relevant list of DNA variants. Practical sessions will use the user-friendly Galaxy interface to demonstrate tasks such as alignment, quality control, variant-calling and annotation.


Who should attend this course?

Healthcare providers and researchers who want to gain an appreciation of how somatic variants are identified and annotated, but who do not necessarily work on this problem routinely.

Objectives:- After this course you should be able to:

  • Understand the main file formats used for NGS analysis (BAM, VCF, BED, etc.), what is included in each file and the appropriate tools for manipulating each file
  • Understand the concepts and challenges involved in calling somatic SNVs
  • Manually review a set of variant calls
  • Communicate more effectively with the Bioinformaticians dealing with your data

Aims:- During this course you will learn about:

  • An appreciation of the nature and scale of NGS data and the requirement for sophisticated computational methods
  • The theory behind current methods for calling somatic SNVs from NGS data, and their outputs
  • Given a set of called SNVs, how to i) assess quantitatively and qualitatively which calls might be “real” or not, and ii) assess which calls might be biologically meaningful and warrant further investigation
  • Annotating a set of somatic variants using public resources
  • Exploration of NGS data using interactive tools such as the Integrative Genomics Viewer (IGV)


  • No prior programming experience is required. Prior knowledge of NGS technologies would be an advantage, but is not required.


  • Dr Matthew Parker, Clinical Bioinformatics Core Scientist
  • Dr Mark Dunning, Bioinformatics Core Director

Timetable (provisional)

  • 09:30 - 10:30 - Intro & Somatic Variants Quick Recap of pre-material
  • 10:30 - 13:00 - FastQ Files, QC and Alignment (coffee during)
  • 13:00 - 14:00 - Lunch (not provided)
  • 14:00 - 15:00 - Variant Calling & Annotation
  • 15:00 - 16:30 - Filtering & Manual Review of Variant Calls


Registration is open now