web : sbc.shef.ac.uk
twitter: SheffBioinfCore
email: bioinformatics-core@sheffield.ac.uk
The course will comprise five sessions.
Please only sign up for the course if you are available on these dates, or are prepared to devote time to reviewing any sessions that you miss.
As a researcher, you will encounter research data in many forms, ranging from measurements, numbers and images to documents and publications. Whether you create, receive or collect data, you will certainly need to organise it at some stage of your project. This workshop will provide an overview of some basic principles on how we can work with data more effectively. We will discuss the best practices for research data management and organisation so that our research is auditable and reproducible by ourselves, and others, in the future.
You will need to install the OpenRefine software, which is free and available for Windows and macOS. Download links are provided below:
Windows Download Mac Download
As the data generated from high-throughput biological experiments increase in volume and become more complex, the ability to manipulate and visualise data is a highly desirable skill in academia and industry. Whilst familiar tools such as Excel allow basic manipulations, they often do not scale to larger datasets and are not amenable to reproducible analysis.
R is a highly regarded, free software environment for statistical analysis, with many useful features that promote and facilitate reproducible research.
In this course, we give an introduction to the R environment and explain how it can be used to import, manipulate and visualise tabular data.
After the course you should feel confident to start exploring your own dataset using the materials and references provided.
These instructions are also described in a video: https://youtu.be/QIubJ8W8R4g
Install R by downloading and running this .exe file from CRAN. Also, please install the free RStudio IDE. Note that if you have separate user and admin accounts, you should run the installers as administrator (right-click on the .exe file and select “Run as administrator” instead of double-clicking). Otherwise, problems may occur later, for example when installing R packages.
Install R by downloading and running this .pkg file from CRAN. Also, please install the free RStudio IDE.
You can download the binary files for your distribution from CRAN, or use your package manager. For Debian/Ubuntu, run:
sudo apt-get install r-base
For Fedora, run:
sudo yum install R
Also, please install the free RStudio IDE.
Please download and extract (un-zip) this zip file into the directory on your computer that you wish to work in.
Create an RStudio project using the menu File -> New Project -> Existing Directory and browse to the directory that you extracted the zip file to. RStudio will refresh so that the working directory corresponds to the course data folder.
Type the following into the R console to install some extra R packages required for the workshop:
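If you want to confirm that the project is pointing at the right folder, a quick check along the following lines can be typed into the R console. This is only a sketch using base R functions; the file names you see will depend on the contents of the zip file, which are not listed here.

```r
# Show the folder R is currently working in; inside the RStudio project
# this should be the directory the course zip file was extracted into.
getwd()

# List the files R can see in that folder; the extracted course files
# should appear in this list.
list.files()
```

If the path printed by getwd() is not the course data folder, re-open the project from the File menu before continuing.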
install.packages("dplyr")
install.packages("ggplot2")
install.packages("readr")
Mac users may get the following error message when trying to install these packages:
xcrun error: inactive developer path (/Library/Developer/CommandLineTools), missing xcrun at:.....
If this is the case, you will need to follow the instructions from this link to install “Xcode”.
Windows users might get a message that Rtools is required. This shouldn’t be necessary for these packages, but you might need it for other packages. It can be downloaded from:
https://cran.r-project.org/bin/windows/Rtools/
source("https://raw.githubusercontent.com/sheffield-bioinformatics-core/r-online/master/check_packages.R")
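The contents of the check_packages.R script are not shown here. If you cannot reach that URL, a rough offline equivalent might look like the sketch below. The helper find_missing is a hypothetical name introduced for illustration and is not part of the course materials; it relies only on the base R function requireNamespace.

```r
# Packages that the workshop instructions ask you to install
required <- c("dplyr", "ggplot2", "readr")

# Hypothetical helper (not part of the course materials): returns the
# names in `pkgs` that cannot be found in the current R library paths.
find_missing <- function(pkgs) {
  found <- vapply(pkgs, requireNamespace, logical(1), quietly = TRUE)
  pkgs[!found]
}

not_installed <- find_missing(required)

if (length(not_installed) == 0) {
  message("All required packages are installed.")
} else {
  message("Still missing: ", paste(not_installed, collapse = ", "))
}
```

Any package reported as missing can be installed by re-running the corresponding install.packages() line.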
Please watch this short presentation (<10 minutes) before attending the workshop.
This course provides a refresher on the foundations of statistical analysis. The course is aimed at scientists at all levels – especially those whose formal education likely included statistics, but who have not perhaps put this into practice since. The focus of the course is on understanding the principles behind statistical testing, how to choose and execute the most appropriate test for your data, and how to interpret the result.