I’m going to start the introduction by writing down some basic goals that underlie the construction and content of this book. I’m writing this for you, the reader, but also to hold myself accountable as I write. So, feel free to read if you are interested or skip ahead if you aren’t.

The goals of this book are:

  1. To teach you how to use R and RStudio as tools for applied epidemiology. The goal is not to turn you into a computer scientist or a hard-core R programmer. Therefore, readers who are experienced programmers may catch some technical inaccuracies in what I consider to be the fine points of what R is doing “under the hood.”

  2. To make this writing as accessible and practically useful as possible without stripping out all of the complexity that makes doing epidemiology in real life a challenge. In other words, I’m going to try to give you all the tools you need to do epidemiology in “real-world” (as opposed to ideal) conditions without providing a whole bunch of extraneous (often theoretical) stuff that detracts from doing. Having said that, I will strive to add links to the other (often theoretical) stuff for readers who are interested.

  3. To teach you to accomplish common tasks, rather than teach you to use functions. In many R courses and texts, I’ve noticed a focus on learning all the things a function, or set of related functions, can do. It’s then up to you, the reader, to sift through all of these capabilities and decide which, if any, will accomplish the task you are actually working on. Instead, I will strive to start with the end in mind. What is the task we are actually trying to accomplish? What are some functions/methods I could use to accomplish that task? What are the strengths and limitations of each?

  4. To start each concept with the end result and then deconstruct how we arrived at that result, where possible. I find that it is easier for me to understand new concepts when learning them as a component of a final product.

  5. To learn concepts with data instead of (or alongside) mathematical formulas and text descriptions, where possible. I find that it is easier for me to understand new concepts by seeing them in action.

Text conventions used in this book

  • Bold text is used to highlight important terms, file names, and file extensions.

  • Highlighted inline code is used to emphasize small sections of R code and program elements such as variable or function names.