During the course of a project,
you may want to repeat a similar analysis across multiple R Markdown
files. To avoid duplicated code across your files (which is difficult to
update), there are multiple strategies you can use to share common
code:
- To share R code like function definitions, you can put this code in an R script and import it in each file with the function source()
- To share common R Markdown text and code chunks, you can use child documents
- To share common templates, you can use the function knitr::knit_expand()
Each of these strategies is detailed below, with a special emphasis
on how to use them within the workflowr framework. To source scripts or
use child documents, we suggest the here package, which locates the root
directory of your project regardless of which directory your script or
analysis file is in. This keeps the paths to shared files cleaner.
Overview of directories
First, a quick overview of the directories in a workflowr project.
This is critical for importing these shared files.
In a standard R Markdown file, the code is executed in the directory
where the R Markdown file is saved. Thus any paths to files in the R
Markdown file should be relative to this directory. However, the
directory where the code is executed, referred to as the “knit
directory” in the workflowr documentation, can be configured. The
default for a new workflowr project is to run the code in the root of
the workflowr project (this is defined in the file _workflowr.yml; see
?wflow_html for configuration details). Thus any filepaths should be
relative to the root of the project. As an example, if you have shared R
functions defined in the file ~/Desktop/myproject/code/common.R, the
relative filepath from the root of the project directory would be
"code/common.R".
Share R code with source()
If you have R code you want to re-use across multiple R Markdown
files, the most straightforward option is to save this code in an R
script, e.g. code/functions.R.
Then in each R Markdown file that needs to use the code defined in
that file, you can use source() to load it. If the code in
your workflowr project is executed in the root of the project directory
(which is the default behavior for new workflowr projects), then you
would add the following chunk:
```{r shared-code}
source("code/functions.R")
```
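As a minimal sketch of this pattern, the block below writes a stand-in for code/functions.R to a temporary file and sources it. The helper as_percent() is hypothetical; it only illustrates that objects defined in the script become available after source().

```r
# Stand-in for code/functions.R, written to a temporary file.
# The function name and body are hypothetical.
script <- tempfile(fileext = ".R")
writeLines(c(
  "# Convert a proportion to a percentage string",
  "as_percent <- function(x, digits = 1) {",
  "  paste0(round(100 * x, digits), \"%\")",
  "}"
), script)

# After sourcing, the function is defined in the current session.
source(script)
result <- as_percent(0.25)
result
```

Keeping helpers like this in one script means a fix to the function reaches every analysis that sources it.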
On the other hand, if you have changed the value of knit_root_dir in the
file _workflowr.yml, you need to ensure that the filepath to the R script
is relative to this directory. For example, if you set
knit_root_dir: "analysis", you would use this code chunk:
```{r shared-code}
source("../code/functions.R")
```
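The difference between the two chunks above can be reproduced outside of workflowr. The following sketch, which uses a throwaway directory layout with hypothetical names, changes the working directory to mimic the two knit directories and checks which relative path finds the script.

```r
# Create a temporary project with code/ and analysis/ directories.
root <- tempfile("myproject")
dir.create(file.path(root, "code"), recursive = TRUE)
dir.create(file.path(root, "analysis"))
writeLines("shared_value <- 42", file.path(root, "code", "functions.R"))

old_wd <- getwd()

# Knit directory is the project root: the path is relative to the root.
setwd(root)
from_root <- file.exists("code/functions.R")

# Knit directory is analysis/: the same file is one level up.
setwd(file.path(root, "analysis"))
from_analysis_plain <- file.exists("code/functions.R")
from_analysis_up <- file.exists("../code/functions.R")

# Restore the original working directory.
setwd(old_wd)

c(from_root, from_analysis_plain, from_analysis_up)
```

Only the path that matches the current knit directory resolves, which is why a change to knit_root_dir requires updating every relative path.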
To avoid having to figure out the correct relative path (or having to
update it in the future if you were to change knit_root_dir), you can use
here::here(), which always resolves paths from the project root.
Additionally, it helps readability when using child documents, as
discussed below.
```{r shared-code}
source(here::here("code/functions.R"))
```
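To give a rough idea of why a root-based path is independent of the knit directory, here is a simplified base R sketch of the general technique: walk up from the current directory until a marker file is found. This is not the here package's implementation; the function find_project_root() and the choice of _workflowr.yml as the marker are assumptions made for illustration.

```r
# Hypothetical helper: search upward for a directory containing `marker`.
find_project_root <- function(start = getwd(), marker = "_workflowr.yml") {
  dir <- normalizePath(start, winslash = "/", mustWork = TRUE)
  repeat {
    if (file.exists(file.path(dir, marker))) {
      return(dir)
    }
    parent <- dirname(dir)
    # Stop at the filesystem root, where dirname() no longer changes the path.
    if (identical(parent, dir)) {
      stop("No project root found above ", start)
    }
    dir <- parent
  }
}

# Throwaway project: the marker sits in the root, the search starts in analysis/.
root <- tempfile("proj")
dir.create(file.path(root, "analysis"), recursive = TRUE)
file.create(file.path(root, "_workflowr.yml"))

found <- find_project_root(file.path(root, "analysis"))
script_path <- file.path(found, "code", "functions.R")
```

Because the search starts from wherever the code runs and ends at the same root, the resulting path is the same whether the knit directory is the root or analysis/.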
Share child documents with chunk option
To share text and code chunks across R Markdown files, you can use child documents, a
feature of the knitr package.
Here is an example of a simple R Markdown file that you can use to
test this feature. Note that it contains an H2 header, some regular
text, and a code chunk.
## Header in child document
Text in child document.
```{r child-code-chunk}
str(mtcars)
```
You can save this child document anywhere in the workflowr project
with one critical exception: it cannot be saved in the R Markdown
directory (analysis/ by default) with the file extension .Rmd or .rmd.
This is because workflowr expects every R Markdown file in this directory
to be a standalone analysis that has a 1:1 correspondence with an HTML
file in the website directory (docs/ by default). We recommend saving
child documents in a subdirectory of the R Markdown directory,
e.g. analysis/child/ex-child.Rmd.
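The rule above can be checked on a throwaway layout. The sketch below is only an illustration in base R, not workflowr's own logic: it assumes that "standalone analyses" are the .Rmd or .rmd files sitting directly inside the R Markdown directory, and the file names are hypothetical.

```r
# Build a temporary analysis/ directory with one standalone file
# and one child document in a subdirectory (hypothetical names).
analysis_dir <- file.path(tempfile("proj"), "analysis")
dir.create(file.path(analysis_dir, "child"), recursive = TRUE)
file.create(file.path(analysis_dir, "index.Rmd"))
file.create(file.path(analysis_dir, "child", "ex-child.Rmd"))

# A non-recursive listing only sees R Markdown files at the top level,
# so the child document in analysis/child/ is not among them.
top_level <- list.files(analysis_dir, pattern = "\\.[Rr]md$")
top_level
```

Under that assumption, only index.Rmd is listed, while ex-child.Rmd stays out of the top level of the directory.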
To include the content of the child document, you can reference it
using here::here() in your chunk options.
```{r parent, child = here::here("analysis/child/ex-child.Rmd")}
```
However, this fails if you wish to include plots in the code chunks
of the child documents. It will not generate an error, but the plot will
be missing. In a situation like this, you would want to generate the plot
within the parent R Markdown file or use knitr::knit_expand() as
described in the next section.
Share templates with knit_expand()
If you need to pass parameters to the code in your child document,
then you can use knitr::knit_expand(). This strategy has the added
benefit that it can handle plots in the child document. However, this
requires setting knit_root_dir: "analysis" in the file _workflowr.yml
for plots to work properly.
Below is an example child document with one variable to be expanded:
{{title}} refers to a species in the iris data set. The value assigned
will be used to filter the iris data set and label the section, chunk,
and plot. We will refer to this file as analysis/child/iris.Rmd.
## {{title}}
```{r plot_{{title}}}
# Assumes dplyr and ggplot2 are attached in the parent document
iris %>%
  filter(Species == "{{title}}") %>%
  ggplot() +
  aes(x = Sepal.Length, y = Sepal.Width) +
  geom_point() +
  labs(title = "{{title}}")
```
To generate a plot using the species "setosa", you can expand the child
document in a hidden code chunk:
```{r, include = FALSE}
src <- knitr::knit_expand(
  file = here::here("analysis/child/iris.Rmd"),
  title = "setosa"
)
```
and then later knit it using an inline code expression:
`r knitr::knit(text = unlist(src))`
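To inspect what the expansion step produces before knitting, it can help to try knitr::knit_expand() on a small in-memory template. The sketch below passes a one-line template through the text argument instead of file; the template string is a stand-in for the contents of the child document.

```r
# A one-line stand-in for the child template, supplied via `text`.
template <- "## {{title}}"

# {{title}} is replaced by the value passed as the `title` argument.
expanded <- knitr::knit_expand(text = template, title = "setosa")
expanded
#> [1] "## setosa"
```

The result is plain R Markdown source as a character string, which is why it still has to be passed to knitr::knit() to be rendered.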
Using knitr::knit_expand() gives you the flexibility to generate multiple
plots along with custom headers, figure labels, and more. For example, if
you want to generate a scatter plot for each Species in the iris data
set, you can call knitr::knit_expand() within a lapply() or purrr::map()
call:
```{r, include = FALSE}
src <- lapply(
  sort(unique(iris$Species)),
  FUN = function(x) {
    knitr::knit_expand(
      file = here::here("analysis/child/iris.Rmd"),
      title = x
    )
  }
)
```
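The same loop can be tried without the child file to see what src contains. In this sketch the template is an in-memory stand-in for analysis/child/iris.Rmd, and the species are converted to character explicitly for clarity.

```r
# In-memory stand-in for the child template.
template <- "## {{title}}"

# One expanded document per species in the iris data set.
species <- as.character(sort(unique(iris$Species)))
src <- lapply(species, function(x) {
  knitr::knit_expand(text = template, title = x)
})

# unlist() flattens the list into the character vector passed to knit().
headers <- unlist(src)
headers
#> [1] "## setosa"     "## versicolor" "## virginica"
```

Each element is a separately expanded copy of the template, so chunk labels built from {{title}} stay unique across the three copies.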
This example code loops through each unique value of iris$Species and
sends it to the template as the variable title. title is inserted into
the header, the chunk label, the dplyr::filter() call, and the title of
the plot. This generates three plots with custom plot titles and labels
while keeping your analysis flow clean and simple.
Remember to insert knitr::knit(text = unlist(src)) in an inline R
expression as noted above to knit the code in the desired location of
your main document.
Read the knitr::knit_expand() vignette for more information.
vignette("knit_expand", package = "knitr")