7 R markdown

In the chapter on R Scripts, you learned how to create R scripts – plain text files that contain R code and comments. These R scripts are kind of a big deal because they give us a simple and effective tool for saving, modifying, and sharing our R code. If it weren’t for the existence of R markdown files, we would probably do all of the coding in this book using R scripts. However, R markdown files do exist and they are AWESOME! So, I’m actually going to suggest that you use them instead of R scripts the majority of the time.

It’s actually kind of difficult for me to describe what an R markdown file is if you’ve never seen or heard of one before. Therefore, I’m going to start with an example and work backwards from there. Figure 7.1 below is an R markdown file. It includes the exact same R code and comments as the example we saw in the chapter on creating R scripts. 6.3

Example R markdown file.

Figure 7.1: Example R markdown file.

Click here to download the R markdown file

Notice that the results are embedded directly in the R markdown file immediately below the R code (e.g., between lines 19 and 20)!

Once rendered, this R markdown file creates the HTML file you see below in figure 7.2. HTML files are what websites are made out of, and I’ll walk you through how to create them from R markdown files later in this chapter.

Preview of HTML file created from an R markdown file.

Figure 7.2: Preview of HTML file created from an R markdown file.

Click here to download the HTML Notebook file.

Notice how everything is nicely formatted and easy to read!

When you create R markdown files on your computer, the rendered HTML file is saved in the same folder by default. 7.3

HTML Notebook file and R markdown file on MacOS.

Figure 7.3: HTML Notebook file and R markdown file on MacOS.

In the figure above, the HTML Notebook file is highlighted in blue and ends with the .nb.html file extension. The R markdown file is below the HTML Notebook file and ends with the .Rmd file extension. Both of these files can be modified, saved, and shared with others.

7.1 What is R markdown?

There is literally an entire book about R markdown (and it’s worth reading at some point). Therefore, I’m only going to hit some of the highlights in this chapter. As a starting point, you can think of R markdown files as being a mix of R scripts, the R console, and a Microsoft Word or Google Doc document. I say this because:

  • The R code that you would otherwise write in R scripts is written in R code chunks when you use R markdown files. In figure 7.1 there are R code chunks at lines 8 to 10, 12 to 14, 16 to 19, 25 to 27, and 31 to 33.

  • Instead of having to flip back and forth between your source pane and your console (or viewer) pane in RStudio, the results from your R code are embedded directly in the R markdown file, right alongside the code that generated them. In figure 7.1 there are embedded results between lines 19 and 20, between lines 27 and 28, and between lines 33 and 34.

  • When creating a document in Microsoft Word or Google Docs, you may format text headers to help organize your document, you may format your text to emphasize certain words, you may add tables to help organize concepts or data, you may add links to other resources, and you may add pictures or charts to help you clearly communicate ideas to yourself or others. Similarly, R markdown files allow you to surround your R code with formatted text, tables, links, pictures, and charts directly in your document.

Even when I don’t share my R markdown files with anyone else, I find that the added functionality described above really helps me organize my data analysis more effectively, and helps me understand what I was doing if I come back to the analysis at some point in the future.

But, R markdown really shines when I do want to share my analysis or results with others. To get an idea of what I’m talking about, please take a look at the R markdown gallery and view some of the amazing things you can do with R markdown. As you can see, these R markdown files mix R code with other kinds of text and images to create documents, websites, presentations, and more.

7.2 Why use R markdown?

At this point, you may be thinking “Ok, that R markdown gallery has some cool stuff, but this also looks complicated. Why shouldn’t I just use a basic R script for the little R program I’m writing?” If that’s what you’re thinking, you have a valid point. R markdown files are slightly more complicated than basic R scripts. However, after reading in the sections below, I think you will find that getting started with R markdown doesn’t have to be super complicated, and the benefits provided make the initial investment in learning R markdown worth your time.

7.3 Create an R Notebook

RStudio makes it very easy to create your own R markdown file, of which there are several types. In this chapter, I’m going to show you how to create a really commonly used type of R markdown file called an R Notebook.

The process is actually really similar to the process we used to create an R script. Start by clicking on the icon shown below. 7.4

Click the new source file icon.

Figure 7.4: Click the new source file icon.

As before, you’ll be presented with a dropdown box that lists a bunch of different file types that you can create. This time, we’ll click R Notebook instead of R script. 7.5

New source file options.

Figure 7.5: New source file options.

At this point you may have noticed that right below R Notebook in the dropdown menu is an option that says R Markdown... and you may be confused about why we aren’t choosing that option. Great observation! As I said at the beginning of this chapter, R markdown files have a ton of functionality. Usually, when things have a ton of functionality (e.g., phones, cars, computers) it comes at a cost. That cost is complexity. R markdown files are no exception. They are awesome, but it can take some time to learn how to take advantage of all they have to offer. R Notebooks ARE R markdown files, but they have some default settings that make it quick and easy for us to jump right into using them for doing some interactive R coding.

🗒Side Note: When I say “interactive R coding” I mean, type some R code, submit, see the result, type some more R code, submit it, see the result…

After you click the R Notebook option in the dropdown menu, a new untitled R Notebook file will appear in the source pane. This R Notebook will even include some example text and code meant to help get you started. We are typically going to erase all the example stuff and write our own text and code, but for now I will use it to highlight some key components of R markdown files. 7.6

A blank R script in the source pane.

Figure 7.6: A blank R script in the source pane.

First, notice lines 1 through 4 in the example above. These lines make up something called the YAML header (pronounced yamel). You don’t need to know what YAML means, but you do need to know that this is one of the defining features of all R markdown files. Essentially, The YAML header turns plain text files into R markdown files. We’ll talk more about the details of the YAML header soon.

Second, notice lines 10 through 12. These lines make up something called an R code chunk. Code chunks in R markdown files always start with three backticks ( ` ) and a pair of curly braces ({}), and they always end with three more backticks. We know that this code chunk contains R code because of the “r” inside of the curly braces. You can also create code chunks that will run other languages (e.g., python), but we won’t do that in this book. In this book, we will exclusively use the R language. You can think of each R code chunk as a mini R script. We’ll talk more about the details of code chunks soon.

Third, notice lines 6, 8, 14, and 18. These lines contain text instructions to help you use R Notebooks, but in a real analysis you would use formatted text like this to add context around the analysis in the code chunks. For now, you can think of this as being very similar to the comments we wrote in our R scripts. However, this text is actually something called markdown, which allows us to do lots of cool things that the comments in our R scripts aren’t able to do. For example, line 6 has a link to a website embedded in it, and lines 8, 14, 16, and 18 all include text that is being formatted (the orange text wrapped in asterisks). In this case, the text is being italicized.

And that is all you have to do to create an basic R Notebook. Next, I’m going to give you a few more details about each of the key components of the R Notebook that I briefly introduced above.

7.4 YAML headers

As I said before, the YAML header is really what makes an R markdown file an R markdown file. The YAML header always begins and ends with dash-dash-dash (---) typed on its own line (1 & 4 above). 7.6 The stuff written inside the YAML header generally falls into two categories:

  1. Stuff about the R markdown file itself. For example, the YAML header we saw above gives that R markdown file a title. The title is added to the file by adding the title keyword, followed by a colon (:), followed by a character string wrapped in quotes. Other examples include author and date.

  2. Stuff that tells R how to process the R markdown file. What do I mean by that? Well, remember the R markdown gallery you saw earlier? That gallery includes Word documents, PDF documents, websites, and more. But all of those different document types started as an R markdown file similar to the one in figure 7.6. R will create a PDF or a Word document or a website from the R markdown file based on the instructions you give it inside the YAML header. For example, the YAML header we saw above tells R to create an HTML Notebook from that R markdown file. This output type is selected by adding the output keyword, followed by a colon (:), followed by the html_notebook keyword.

What does an HTML Notebook look like? Well, if you hit the Preview button in RStudio:

RStudio's preview button. Only visible when an R Notebook is open.

Figure 7.7: RStudio’s preview button. Only visible when an R Notebook is open.

R will ask you to save your R markdown file. After you save it, R will automatically create (or render) a new HTML Notebook file and save it in the same location where your R markdown file is saved. Additionally, a little browser window will pop up and give you a preview of what the rendered HTML Notebook looks like. 7.8

An HTML Notebook created using an R markdown file.

Figure 7.8: An HTML Notebook created using an R markdown file.

Notice how all the formatting that was applied when R rendered the HTML Notebook file. For example, the title – “R Notebook” – is in big bold letters at the top of the screen, the words “R Markdown” in the first line of text are now a clickable link to another website, and the word “Run” in the second line of text is now italicized.

I can imagine that this section may seem a little confusing to some of you right now. If so, don’t worry. You don’t really need to understand the YAML header right now. Remember, when you create a new R Notebook file in the manner I described above, the YAML header is already there. You will probably want to change the title, but that’s about it.

7.5 R code chunks

As I said above, R code chunks always start out with three backticks ( ` ) and a pair of curly braces with an “r” in them ({r}), and they always end with three more backticks. Typing that over and over can be tedious, so RStudio provides a keyboard shortcut for inserting R code chunks into your R markdown files.

On Mac type option + command + i.

On Windows type control + alt + i

Inside your code chunk, you can type anything that you would otherwise type in the console or in an R script. You can then click the little green arrow in the top right corner of the code chunk to submit it to R and see the result. 7.9

The results of an R code chunk embedded in an R Notebook.

Figure 7.9: The results of an R code chunk embedded in an R Notebook.

7.6 Markdown

Many of you have probably heard of HTML and CSS before. HTML stands for hypertext markup language and CSS stands for cascading style sheets. Together, HTML and CSS are used to create and style every website you’ve ever seen. Remember that R Notebooks created from our R markdown files are HTML files. They will open in any web browser and behave just like any other website. Therefore, you can manipulate and style them using HTML and CSS just like any other website. However, it takes a lot of time and effort to learn HTML and CSS. So, markdown was created as an easier-to-use alternative. Think of it as HTML and CSS lite. It can’t fully replace HTML and CSS, but it is much easier to learn, and you can use it to do many of the main things you would want to do with HTML and CSS.

For example, in figures 7.6 and 7.8 you saw that wrapping your text with single asterisks (*) italicizes that text, and that using a combination of brackets and parentheses [Text](Link) can turn your text into a clickable link.

There are a ton of other things you can do with markdown, and I recommend checking out RStudio’s R markdown cheat sheet if you’re interested in learning more. You can download it (any many other cheat sheets) here. The cheat sheet is a little bit busy and may feel overwhelming at first. So, I suggest starting with the section called “Pandoc’s Markdown” on the second page of the cheat sheet. Just play around with some of the formatting options and get a feel for what they do.

Having said that, it’s totally fine if you don’t care to try to tackle learning markdown syntax right now. You don’t really need markdown to follow along with the rest of the book. However, I still suggest using R Notebook files for writing, saving, modifying, and sharing your R code – even if you don’t plan to format them with markdown syntax.