Introduction to RNA-seq analysis in R

Workshop Information

  • Monday 22nd July 13:00 - 16:00
  • Wednesday 24th July 13:00 - 16:00
  • Friday 26th July 13:00 - 16:00

Online, and The Diamond Workroom 2 (The University of Sheffield)

You will need to attend all three sessions, or be commited to catch-up on any materials you missed, in order to complete the workshop

Some basic R knowledge is assumed (and is essential). Without it, you will struggle on this course. If you are not familiar with the R statistical programming language we strongly encourage you to work through an introductory R course before attempting these materials. We recommend reviewing Parts 1 and 2 of our R introductory course before deciding if you can attend

You can sign-up to attend either in-person or online using this link

Materials

Overview

In this workshop, you will be learning how to analyse RNA-seq count data, using R. This will include reading the data into R, quality control and performing differential expression analysis and gene set testing, with a focus on the DESEq2 analysis workflow. You will learn how to generate common plots for analysis and visualisation of gene expression data, such as boxplots and heatmaps. You will also be learning how alignment and counting of raw RNA-seq data can be performed in R. This workshop is aimed at biologists interested in learning how to perform differential expression analysis of RNA-seq data when reference genomes are available.

Who should attend this course?

Researchers in life sciences who want to get an appreciation for the computational steps involved in RNA-seq analysis, and how to execute best-practice RNA-seq workflows in R.

Objectives:- After this course you should be able to:

  • Design properly your RNA-Seq experiments considering advantages and limitations of RNA-seq assays
  • Assess the quality of your datasets
  • Know what tools are available in Bioconductor for RNA-seq data analysis and understand the basic object-types that are utilised
  • Produce a list of differentially expressed genes from an RNA-seq experiment

Aims:- During this course you will learn about:

  • RNA sequencing technology and considerations on experiment design
  • Sources of variation in RNA-seq data
  • Differential expression analysis using DEseq2
  • Annotation resources in Bioconductor
  • Identifying over-represented gene sets among a list of differentially expressed genes

Software installation

You will need to bring an internet-enabled laptop to the course and install the latest versions of both R and RStudio before coming to the course

1) Install R and RStudio

Windows

Install R by downloading and running this .exe file from CRAN. Also, please download and run the RStudio installer for Windows. Note that if you have separate user and admin accounts, you should run the installers as administrator (right-click on .exe file and select “Run as administrator” instead of double-clicking). Otherwise problems may occur later, for example when installing R packages.

Mac

Install R by downloading and running this .pkg file from CRAN. Also, please download and run the RStudio installer for Mac

Linux

You can download the binary files for your distribution from CRAN. Or you can use your package manager (e.g. for Debian/Ubuntu run sudo apt-get install r-base and for Fedora run sudo yum install R). Also, please download and run the RStudio installer.

2) Install the R packages required for the course

Enter the following into the console prompt in RStudio

source("https://raw.githubusercontent.com/sheffield-bioinformatics-core/rnaseq-r-online/main/install_bioc_packages.R")

You can check that this worked by running

source("https://raw.githubusercontent.com/sheffield-bioinformatics-core/rnaseq-r-online/main/check_packages.R")

3) Download and unzip this file containing the data for the course

4) (Optional) Watch these short introductory videos

Instructors

  • Dr. Mark Dunning, Bioinformatics Core Director
  • Dr. Emily Chambers, Bioinformatics Core Scientist
  • Dr. Akshay Bhat, Cancer Bioinformatician
  • Lara Oberski, PhD Student

Timetable (provisional)

  • Session 1 Monday 22nd July 13:00 13:00 - 16:00
    • Importing RNA-seq counts into R
    • Quality assessment of raw counts
    • Checking sources of variation using PCA and clustering
  • Session 2 Wednesday 24th July 13:00 - 16:00
    • Differential expression using DESeq2 to generate gene-lists
    • Manipulating and filtering gene-lists
    • Basic plotting of differential-expression results
    • More-advanced designs using DESeq2
  • Session 3 Friday 26th July 13:00 - 16:00
    • Advanced plotting with heatmaps
    • Using Bioconductor resources to interrogate Gene Ontologies
    • Identifying over-represented and enriched gene sets

Feedback


For queries relating to collaborating with the Bioinformatics Core team on projects: bioinformatics-core@sheffield.ac.uk

Join our mailing list so as to be notified when we advertise talks and workshops by subscribing to this Google Group. You can also connect with us on Linkedin.

Requests for a Bioinformatics support clinic can be made via the Research Software Engineering (RSE) code clinic system. This is monitored by Bioinformatics Core staff, so we will ensure the appropriate expertise (which may involve individuals from multiple teams) will be available to help you

Queries regarding sequencing and library preparation provision at The University of Sheffield should be directed to the Multi-omics facility in SITraN or the Genomics Laboratory in Biosciences.