Last modified: 02 Mar 2020

Course Introduction

  • Data Manipulation and Visualisation
  • University of Sheffield
  • 3rd March 2020 09:30 - 17:00

History of R

Powerful data manipulation and graphics capabilities

Notable uses

Topics covered

  • The RStudio environment
  • Importing data from a spreadsheet
  • Filtering data
  • Plotting
  • Calculating numerical summaries
  • Reporting
  • Joining data from multiple spreadsheets

Packages covered

Not covered in this course

  • Statistics (covered on 5th March)
  • Specific applications
    • e.g. Bioinformatics

Can't we just do these things in Excel?

  • Spreadsheets are a common entry point for many types of analysis, and Excel is widely used, but it is:
    • unwieldy when dealing with large amounts of data
    • error-prone (e.g. gene symbols turning into dates)
    • tedious and time-consuming when repeatedly processing multiple files
  • How can you, or someone else, repeat what you did several months or years down the line?

Facilitating reproducible research

Course structure

  • Live coding
    • no more slides!
  • Exercises
    • sticky notes to indicate progress
  • Tea and coffee breaks!
  • You can view the course notes online if you miss anything

Course Data

  • We will use data from the gapminder project

Example plots

  • By the end of the course we will be creating plots like this in a few lines of code

Resources