Getting Started
R and RStudio
All of our work for this course will be done in the R language and we will be working with R in RStudio. RStudio is an integrated development environment (IDE) develop by a company named Posit. Please be sure to download and install the most recent versions of R and R Studio from Posit’s website.
It is a good idea to periodically update R and RStudio. RStudio will prompt you when it is time to update and you can follow the same process of downloading and installing from the Posit website that we just did here. For R, there are a number of ways to do it, but the easiest is to use packages like installr for Windows and updateR for Mac. Here is a good blog post that walks you through the steps of how to update R using these packages. I usually update R once a semester.
We are going to be using a number of R packages throughout the course. One essential set of packages are those that comprise the Tidyverse, but especially readr
, dplyr
, ggplot2
and tidyr
. You can install the entire Tidyverse collection of packages by typing install.packages("tidyverse")
in your console. We will talk about these packages in detail as we go through the course, but have a look at this basic description now to gain some basic familiarity.
Another thing that you really want to do is to make sure that you have the native pipe operator (|>
) enabled. In RStudio, go to Tools>Global Options, then go to Code and select “Use native pipe operator.”
While you are here, you can also go to Appearance to select a different editor theme or to Pane Layout to change how the four panes in R Studio are organized. Next, familiarize yourself with how to expand and minimize the four windows. The most important window that I want to highlight here is the source window. This is where we are going to be working most of the time in this course. And if I tell you to send your source code, I mean to send the file that you are working on in this window. This could be a Quarto document, an R script or an app.R
file for a Shiny app.
The next window is the Console and there we also see tabs for Terminal and Background Jobs. The console is where you can run R code one line at a time. The terminal is relevant for more advanced users and we will make some use of it when we talk about publishing Quarto documents. Background Jobs is going to be helpful when we want troubleshoot a Quarto document that is not rendering properly.
From there, the next pane we want to explore is Environment, History, etc. Environment tells us what files are currently available to us. The other important tab here is Git which will be where we push things to GitHub. You will be using this a lot in the course.
Finally, we see a pane with Files, Plots, Packages etc. Files tells us what files are in our project folder and enables us to copy, and delete files associated with our project. Plots is a window for viewing our visualizations. And Packages shows us what packages are available and loaded into our environment.
Before you move on to the next section, take some time to familiarize yourself with the various user-friendly buttons and shortcuts available to you like the drop down menu for the pane layout, a spell checker, a button for inserting a code chunk and other features that you can play around with.
Quarto
Once you have R, R Studio and Quarto installed, you are ready to start integrating text and code with Quarto. Quarto is an open source publishing platform that enables you to integrate text with code. If you have used R Markdown before then Quarto will look familiar to you because Quarto is the next generation of R Markdown.
RStudio comes with a version of Quarto already installed, but it can be useful to install the most recent version separately and because doing so will allow you to use Quarto with another IDE like VS Code. You can install the most recent version of Quarto by visiting this page and selecting the version for your operating system.
Now take a little time to create a Quarto project in R Studio and make sure everything is working properly. But before you get started, create a new folder(directory) for this course on your computer somewhere. Once that is done, go to File > New Project. Then select Quarto Project and name the project something like “test-project” in the Directory name field. Next, select Browse and navigate to the folder that you created for this course. Select Create Project.
You will notice that in your new project folder there is a file with an .Rproj extension. The .Rproj file is what tells RStudio which files are associated with the project and it obviates the need to set the working directory. It also makes it possible to share the folder with anyone who is running R and RStudio and have them run your code without having to set a working directory. This is what we refer to as a project-based workflow and we will use it for every assignment in this class.
Now try rendering the document with the Render toggle button. By default, Quarto renders an .html file that it will open in a browser and save to your project folder.
Next we want to try rendering a .pdf document. To do this, we have to install tinytex, a lightweight version of LaTeX. To do this, go to the Terminal and type quarto install tinytex
. Now, change the format from html
to pdf
by inserting format: pdf
in the YAML header. Then render the document again. A .pdf document should open up.
Now take a few minutes and try changing more of the code in the YAML header. You can try changing the title, adding a subtitle (subtitle:
) or changing the execution options. By default, Quarto uses the visual editor but behind the scenes it is using Markdown. Try and edit some text using the toggle buttons available in the visual editor and then switch to Source to view the underlying Markdown code. Play with the R code chunks embedded in the document or try adding new code chunks.
You may already have some experience writing in Markdown, which is a lightweight markup language that enables you to format plaintext files. If you have not used Markdown, or if your memory is hazy, don’t worry: it is really easy to learn. Have a look at this Markdown cheat sheet and try to familiarize yourself with its basic syntax. Finally, take some time to get familiar with the Guide and Reference sections of the Quarto website. Then take a look at the gallery so that you can get an idea of the kinds of things you can produce with Quarto.
GitHub (Optional)
GitHub is a platform for hosting version control repositories. In this course we will learn to use GitHub to store, manage and collaborate on code.
One central concept you want to be familiar with is a repository or “repo” for short. Repos are essentially folders where code can be stored and then accessed and changed by multiple users. All of the assignments for this course will be managed in repos.
GitHub repos are managed using a version control system named Git. Git allows developers to make and track changes to the code stored in the repo. Git also enables users to create branches to work on the code without affecting the main codebase and then merge those changes back into the main branch when they are ready.
The first thing you are going to want to do is to register a GitHub account. From there, you want to install Git and initiate Git using the usethis package. Be sure to install usethis
first (install.packages("usethis")
) if you don’t have it installed already. Then load usethis
(library(usethis)
) and type use_git_config(user.name = "Jane Doe", user.email = "jane@example.org")
substituting the name and email associated with your GitHub account.
Next, you need to generate a personal access token (PAT) by typing create_github_token()
and set your credentials with the gitcreds package. This is pretty simple once you have your PAT. After installing and loading gitcreds
(install.packages("gitcreds")
; library(gitcreds)
), you can type gitcreds_get()
to see whether you already have credentials set up. Then type gitcreds_set()
to enter your new credentials (your username and your password, which will be the PAT that you just generated).
Now, you can create a GitHub repo and clone it to your computer in RStudio. There a number of ways to clone a repo, but the recommended method for this course involves creating a new project in RStudio and selecting “Version Control” for the method (instead of “New Directory” as we did before). From there, select “Git”, copy the URL from the green “Code” tab in the GitHub repo, and paste it into “Repository URL” field. Next, select the directory where you want the repository to be created and click “create project.”