R for Epidemiology
About the Authors
- Brad Cannell
- Melvin Livingston

53 Missing Data

This chapter is under heavy development and may still undergo significant changes.