Redcedar Data Analyses Instructions

Instructions

We will use a Github Repository to host Github Pages written with R Markdown to document methods in analyses.

Set Up

  • Getting set up
    • Install R and Rstudio
    • Create a login on github
    • Install git. Optionally install GitHub Desktop. Rstudio also has built in github functionality, but you need to install git on your computer then reference git exe in GIT section of ‘global options’ in RStudio. This webpage is helpful.
      • Note the above step may may be necessary if the you receive errors indicating files are read only when trying to save the r files.
    • Clone the Github Repository to your computer.
    • Open Rstudio project ‘open.Rproj’ in rStudio.

Github and R Markdown

We will document methods with R Markdown. Please create a R Markdown document to track methods even if the analyses are not completed in R (e.g. with ArcGIS).

  • Creating a github page to document methods
    • Browse to analyses (open/redcedar/analyses) folder
    • Create a new folder for your analysis (e.g. cluster)
    • Create a html markdown document in the new folder
    • List the contributors to the analysis in the Rmarkdown YAML
    • Copy top table from other index.Rmd files to include navigation pane (redcedar, data, analyses, etc))
    • Clear default text if desired
    • Add a ‘welcome or purpose header and description’
    • Save the markdown document as index.Rmd
    • Knit the document into index.html
    • Commit and Push changes to github repository
      • Describe changes or progress for other collaborators as commit message.

Completing analyses

  • Completing analyses
    • Please complete analyses using data in the data folder
      • avoid copying the data into new locations, lets try to work from the same dataset throughout.
      • data can be downloaded here or read inline using the command below.
      • carefully document any changes to the data (wrangling) made before conducting analyses
    • Include information about the software, packages, and versions used for analyses.
    • Please document all steps throughout methods so they can be easily reproduced by novice data scientist.
    • Use the ‘reclassified.tree.canopy.symptoms’ column as the response variable when possible, however, use the ‘binary.tree.canopy.symptoms’ column as the response variable when there is not enough resolution to split data into 5 groups.
data <- read.csv(https://github.com/jmhulbert/open/raw/main/redcedar/data/data-modified.csv)
  • File storage
    • Please save R analyses and related (small) datasets in the github repository.
    • All other files (e.g. GIS files) can be stored locally and shared later.

Sharing analyses and methods

Link to your analysis page from the main active analyses page when ready.

  • Create link in analyses page
    • Browse analyses folder and open index.Rmd file
    • Add the link to your new analysis folder/page
    • Add your name as a contributor in the Rmarkdown YAML
    • Knit the document into index.html
    • Commit and push changes to github repository.
      • Describe changes or progress for other collaborators as commit message.