Last modified: 17 Feb 2020

Course Introduction

  • Crash Course in Prostate Bioinformatics
  • University of Oxford
  • 18th - 20th February 2019, 09:30 - 17:00

Overview

  • Processing of (bulk) RNA-seq data using R
    • with a short single-cell teaser at the end
  • Survival Analysis
  • Quality assessment, differential expression and visualisation
  • Using the DESeq2 workflow
    • other popular tools are available
  • Assuming that samples have already been aligned and counted
    • and that the experiment was appropriately designed
  • Materials to be shared online
  • First we have to learn some R…

History of R

Powerful data manipulation and graphics capabilities

Notable uses

Topics covered

  • The RStudio environment
  • Importing data from a spreadsheet
  • Filtering data
  • Plotting
  • Calculating numerical summaries
  • Reporting
  • Joining data from multiple spreadsheets

Packages covered

Can't we just do these things in Excel?

  • Spreadsheets are a common entry point for many types of analysis, and Excel is widely used, but it
    • can be unwieldy and struggles with large amounts of data
    • is error prone (e.g. gene symbols turning into dates)
    • makes it tedious and time consuming to repeatedly process multiple files
  • How can you, or someone else, repeat what you did several months or years down the line?

Facilitating reproducible research

Course structure

  • Live coding
    • no more slides!
  • Exercises
  • Tea and coffee breaks!
  • You can view the course notes online

Example plots

  • By the end of Day 1 we will be creating plots like this in a few lines of code

Resources