33  Introduction to Repeated Operations

This part of the book is all about the DRY principle. We first discussed the DRY principle in the section on creating and modifying multiple columns. As a reminder, DRY is an acronym for “Don’t Repeat Yourself.” But, what does that mean?

Well, think back to the conditional operations chapter. In that chapter, we compared conditional statements in R with asking our daughters to wear a raincoat if it’s raining. To extend the analogy, now imagine that we wake up one morning and say, “please wear your raincoat if it’s raining today, July 1st.” Then, we wake up the next morning and say, “please wear your raincoat if it’s raining today, July 2nd.” Then, we wake up the morning after that and say, “please wear your raincoat if it’s raining today, July 3rd.” And that pattern continues every morning until our daughters move out of the house. That’s a ton of repetition! Wouldn’t it be much more efficient to say, “please wear your raincoat on every day that it rains,” just once?

The same logic applies to our R code. We often want to do the same (or very similar) thing multiple times. This can result in many lines of code that are very similar and unnecessarily repetitive, and this unnecessary repetition can occur in all phases of our projects.

[Figure: Project phases]

For example:

In all of these situations we are asking our R code to do something repeatedly, or iteratively, but with a slight change each time. We can write a separate chunk of code for each time we want to do that thing, or we can write one chunk of code that asks R to do that thing over and over. Writing code in the latter way will often result in R programs that:

Note

When we say “one line of code” above, we mean it figuratively. The code we use to remove unnecessary repetition will not necessarily be on one line; however, it should generally require less typing than code that includes unnecessary repetition.
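As a small illustration of the kind of repetition we have in mind, here is a minimal sketch in base R. The data frame `study` and its columns are made up for this example, and the for loop is just one of several ways to remove the repetition.

```r
# Hypothetical example data: three numeric columns
study <- data.frame(
  age    = c(32, 45, 28, 51),
  height = c(160, 175, 168, 180),
  weight = c(60, 82, 71, 90)
)

# Repetitive version: the same operation typed once per column.
# Every new column means another nearly identical line.
mean_age    <- mean(study$age)
mean_height <- mean(study$height)
mean_weight <- mean(study$weight)

# Less repetitive version: one chunk of code that asks R to repeat
# the operation for every column, changing only the column each time.
col_means <- numeric(0)
for (col in names(study)) {
  col_means[col] <- mean(study[[col]])
}

col_means
```

Both versions compute the same three means. The second version takes more than one line, but it does not grow when we add more columns to the data frame.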

So, writing code that is highly repetitive is usually not a great idea, and this part of the book is all about teaching you to recognize and remove unnecessary repetition from your code. As is often the case with R, there are several methods we can use.

33.1 Multiple methods for repeated operations in R

In the chapters that follow, we will learn four different methods for removing unnecessary repetition from our code. They are:

Four methods for removing unnecessary repetition
  1. Writing our own functions that can be reused throughout our code.

  2. Using dplyr’s column-wise operations.

  3. Using for loops.

  4. Using the purrr package.

It’s also important to recognize that each of the methods above can be used independently or in combination with each other. We will see examples of both.
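To give a rough feel for what each method looks like, here is a brief sketch that applies all four to the same toy data. It assumes the dplyr and purrr packages are installed. The data frame `study` and the function name `center` are hypothetical, and the later chapters cover each method properly.

```r
library(dplyr)
library(purrr)

# Hypothetical example data
study <- tibble(
  age    = c(32, 45, 28, 51),
  height = c(160, 175, 168, 180),
  weight = c(60, 82, 71, 90)
)

# 1. Writing our own function (center is a made-up name):
#    subtract the mean from each value of a numeric vector.
center <- function(x) {
  x - mean(x)
}

# 2. dplyr column-wise operations: apply our function to every
#    column with across(). This also combines methods 1 and 2.
centered <- study %>%
  mutate(across(everything(), center))

# 3. A for loop: calculate the mean of each column, one at a time.
means_loop <- numeric(ncol(study))
names(means_loop) <- names(study)
for (col in names(study)) {
  means_loop[[col]] <- mean(study[[col]])
}

# 4. The purrr package: map_dbl() applies mean() to each column
#    and returns a named numeric vector.
means_purrr <- map_dbl(study, mean)
```

In this sketch, methods 3 and 4 produce the same named vector of column means, which is one reason the choice between them often comes down to readability and personal preference.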

33.2 Tidy evaluation

In case it isn’t obvious to you by now, we’re fans of the tidyverse packages (e.g., dplyr, ggplot2, and tidyr). We use dplyr, in particular, in virtually every single one of our R programs. The use of non-standard evaluation is just one of the many aspects of the tidyverse packages that we’re fans of. As a reminder, among other things, non-standard evaluation is what allows us to refer to data frame columns without using dollar sign or bracket notation (i.e., data masking). However, non-standard evaluation will create some challenges for us when we try to use functions from tidyverse packages inside of functions and for loops that we write ourselves. Therefore, we will have to learn more about tidy evaluation if we want to continue to use the tidyverse packages that we’ve been using throughout the book so far.

Tidy evaluation can be tricky even for experienced R programmers to wrap their heads around at first. Therefore, it might not be productive for us to try to learn a lot about the theory behind, or internals of, tidy evaluation as a standalone concept. Instead, in the chapters that follow, we plan to sprinkle in just enough tidy evaluation to accomplish the task at hand. As a little preview, a telltale sign that we are using tidy evaluation will be when you start seeing the {{ (pronounced “curly-curly”) operator and the !! (pronounced “bang-bang”) operator. Hopefully, this will all make more sense in the next chapter when we start to get into some examples.
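As a small preview of the challenge described above, the sketch below wraps a dplyr call in two functions of our own. It assumes dplyr is installed and that no object named `age` exists outside of the data frame. The data frame `study` and the function names `mean_naive` and `mean_embraced` are made up for illustration.

```r
library(dplyr)

# Hypothetical example data
study <- tibble(
  age    = c(32, 45, 28, 51),
  height = c(160, 175, 168, 180)
)

# Data masking lets us use the bare column name inside summarise()
study %>% summarise(mean = mean(age))

# Naive wrapper: col is passed straight through. When dplyr evaluates
# mean(col), R looks for an object called age outside of the data
# frame, does not find one, and throws an error.
mean_naive <- function(df, col) {
  df %>% summarise(mean = mean(col))
}

naive_result <- tryCatch(
  mean_naive(study, age),
  error = function(e) "error"
)

# Wrapper that embraces the argument with {{ }}. This tells dplyr to
# treat whatever the user typed for col as a column of df.
mean_embraced <- function(df, col) {
  df %>% summarise(mean = mean({{ col }}))
}

embraced_result <- mean_embraced(study, age)
```

Under those assumptions, `naive_result` holds the string "error" because the naive wrapper fails, while `embraced_result` is a one-row tibble containing the mean of the `age` column.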

We recommend the following resources for those of you who are interested in developing a deeper understanding of rlang and tidy evaluation:

  1. Programming with dplyr. Accessed July 31, 2020. https://dplyr.tidyverse.org/articles/programming.html

  2. Wickham H. Introduction. In: Advanced R. Accessed July 31, 2020. https://adv-r.hadley.nz/metaprogramming.html

Now, let’s learn how to write our own functions!🤓