Welcome

Thank you for RSVPing to attend the R for Biology Data Science Workshop Series. Below are a few instructions; completing them before the workshop sessions will improve your experience.

Data organization

We need to format our data so that R will read it correctly.

Our data should be set up as a rectangle with observations (samples) in rows and variables (treatment groups, results, etc.) in columns. Below is an example:

Sample  Trial  Treatment  Result
1       A      Control    0.1229292
2       A      Control    0.8274256
3       A      Control    0.1485878
4       A      Treated    0.0860430
5       A      Treated    0.3144540
6       A      Treated    0.2195038
7       B      Control    0.0906801
8       B      Control    0.1097931
9       B      Control    0.3352014
10      B      Treated    0.1403315
11      B      Treated    0.2441249
12      B      Treated    0.4332501
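As a rough illustration of why this layout matters, the sketch below reads the example table into R using only base functions. The object names raw_text and experiment are made up for this example, and the commented-out file name is hypothetical; in practice the data would come from your own file.

```r
# The example table, stored as whitespace-separated text.
# (raw_text and experiment are illustrative names, not part of the workshop materials.)
raw_text <- "Sample Trial Treatment Result
1 A Control 0.1229292
2 A Control 0.8274256
3 A Control 0.1485878
4 A Treated 0.0860430
5 A Treated 0.3144540
6 A Treated 0.2195038
7 B Control 0.0906801
8 B Control 0.1097931
9 B Control 0.3352014
10 B Treated 0.1403315
11 B Treated 0.2441249
12 B Treated 0.4332501"

# header = TRUE tells R that the first line holds the column names.
experiment <- read.table(text = raw_text, header = TRUE, stringsAsFactors = FALSE)

# One row per sample, one column per variable.
str(experiment)

# With a real file the call would look similar (hypothetical file name):
# experiment <- read.csv("my_data.csv")
```

Because every row is one sample and every column is one variable, R can work out the type of each column on its own.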

It is best to import your raw data rather than calculating averages or other summaries in Excel beforehand. We will use dplyr in Session 3 to summarize data and calculate means by group or other explanatory variables.
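To give a sense of what that summarizing step might look like, here is a minimal sketch that assumes dplyr is already installed. It rebuilds the example table by hand; the names experiment, group_means, mean_result, and n are chosen for this illustration only, and the approach covered in Session 3 may differ.

```r
library(dplyr)

# Rebuild the example table as a data frame (raw values only, no averages).
experiment <- data.frame(
  Sample = 1:12,
  Trial = rep(c("A", "B"), each = 6),
  Treatment = rep(rep(c("Control", "Treated"), each = 3), times = 2),
  Result = c(0.1229292, 0.8274256, 0.1485878, 0.0860430, 0.3144540, 0.2195038,
             0.0906801, 0.1097931, 0.3352014, 0.1403315, 0.2441249, 0.4332501)
)

# Calculate the mean Result and the number of samples
# for each Trial and Treatment combination.
group_means <- experiment %>%
  group_by(Trial, Treatment) %>%
  summarise(mean_result = mean(Result), n = n(), .groups = "drop")

print(group_means)
```

Keeping the raw values in the file means a summary like this can be recalculated or regrouped at any time without going back to the spreadsheet.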

Now that our data is ready for R, we need to install R and RStudio.

Install R and RStudio

It is best if we have the latest versions of R and RStudio (3.5.2 and 1.1.463, respectively, at the time of writing).

To find your RStudio version, go to the menu: RStudio >> About RStudio (on Windows and Linux, Help >> About RStudio).

Run the code below to find your R version.

getRversion()

or

sessionInfo()
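If you would rather have R make the comparison for you, the sketch below checks the installed version against a target. Treating 3.5.2 as a minimum is an assumption made for this example, and minimum_version is an illustrative name.

```r
# Assumption for this example: treat the version named above as the minimum.
minimum_version <- "3.5.2"

# getRversion() returns a version object that can be compared with a string.
installed_version <- getRversion()

if (installed_version >= minimum_version) {
  message("R ", as.character(installed_version), " is recent enough for the workshop.")
} else {
  message("R ", as.character(installed_version), " is older than ",
          minimum_version, "; consider updating.")
}
```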

Install Packages

R packages are collections of functions and documentation that can be shared and reused with multiple datasets.

poppr is an example of a popular package that was created by Zhian Kamvar and the Grünwald Lab.

In general, we need to install packages before we can use them. In this workshop series, we will use the tidyverse, a collection of packages that we will use to keep our data tidy. It includes ggplot2 and dplyr, the two packages we will use in the next two sessions.

We can install the package by running the code below:

install.packages("tidyverse")

Note that you only need to install this package one time. Once it is installed, you can simply load it with library(tidyverse) in each new R session where you want to use it.

Also, if you check the Packages tab in RStudio, you will see that ggplot2, dplyr, and several other packages were installed as well.
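One way to combine the install-once and load-every-session steps is sketched below. It assumes an internet connection and uses the cloud CRAN mirror as the download source; that mirror is a choice made for this example, not a workshop requirement.

```r
# Install tidyverse only if it is not already available.
# (The repos URL is the cloud CRAN mirror, chosen for this example.)
if (!requireNamespace("tidyverse", quietly = TRUE)) {
  install.packages("tidyverse", repos = "https://cloud.r-project.org")
}

# Load the package; this step is repeated in every new R session.
library(tidyverse)

# Confirm that ggplot2 and dplyr came along with the installation.
core_installed <- c(
  ggplot2 = requireNamespace("ggplot2", quietly = TRUE),
  dplyr = requireNamespace("dplyr", quietly = TRUE)
)
print(core_installed)
```

The requireNamespace() check means that running the script again does not trigger another download.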