The workflowr package combines many powerful tools in order to produce a research website. It is absolutely not necessary to understand all the underlying tools to take advantage of workflowr, and in fact that is one of the primary goals of workflowr: to allow researchers to focus on their analyses without having to worry too much about the technical details. However, if you are interested in implementing advanced customization options, contributing to workflowr, or simply want to learn more about these tools, the sections below provide some explanations of how workflowr works.
R is the computer programming language used to perform the analysis. knitr is an R package that executes code chunks in an R Markdown file to create a Markdown file. Markdown is a lightweight markup language that is easier to read and write than HTML. rmarkdown is an R package that combines the functionality of knitr and the document converter pandoc. Pandoc powers the conversion of knitr-produced Markdown files into HTML, Word, or PDF documents. Additionally, newer versions of rmarkdown contain functions for building websites. The styling of the websites is performed by the web framework Bootstrap. Bootstrap implements the navigation bar at the top of the website, has many available themes to customize the look of the site, and dynamically adjusts the website so it can be viewed on a desktop, tablet, or mobile device. The rmarkdown website configuration file _site.yml allows convenient customization of the Bootstrap navigation bar and theme.
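To make the role of the configuration file more concrete, here is a minimal sketch that writes a small _site.yml to a temporary directory using only base R. The site name, page names, and the cosmo theme are placeholder assumptions chosen for illustration; a real project would contain its own values.

```r
# Illustrative _site.yml contents. All values are placeholders (assumptions).
site_yml <- c(
  'name: "myproject"',
  'navbar:',
  '  title: "myproject"',
  '  left:',
  '    - text: "Home"',
  '      href: index.html',
  '    - text: "About"',
  '      href: about.html',
  'output:',
  '  html_document:',
  '    theme: cosmo'
)

# Write the file to a temporary directory so no real project is modified.
site_file <- file.path(tempdir(), "_site.yml")
writeLines(site_yml, site_file)

# Display the file as it would appear on disk.
cat(readLines(site_file), sep = "\n")
```

The navbar entries define the links in the Bootstrap navigation bar, and the theme entry selects one of the available Bootstrap themes.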
Git is a distributed version control system (VCS) that tracks code development. It has many powerful features, but only a handful of the main functions are required to use workflowr. git2r is an R package that provides an interface to libgit2, a portable, pure C implementation of the Git core methods (this is why you don’t need to install Git before using workflowr). GitHub is a website that hosts Git repositories and additionally provides collaboration tools for developing software. GitHub Pages is a GitHub service that offers free hosting of static websites. When the HTML files for the website are placed in the subdirectory docs/, GitHub Pages serves them online.
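As a rough illustration of the handful of Git operations involved, the sketch below uses git2r to create a repository, stage a file, and commit it, all without a separate Git installation. The directory, file name, user name, and email are placeholders, and the code is guarded so that it only runs if git2r is installed. It is not a description of how workflowr itself calls git2r.

```r
# Only run if git2r is available.
if (requireNamespace("git2r", quietly = TRUE)) {
  # Create a throwaway directory to hold the demonstration repository.
  repo_dir <- tempfile("demo-repo-")
  dir.create(repo_dir)

  # Initialize the repository and set a placeholder identity for commits.
  repo <- git2r::init(repo_dir)
  git2r::config(repo, user.name = "Example User",
                user.email = "user@example.com")

  # Create a file, stage it, and commit it.
  writeLines("# demo", file.path(repo_dir, "README.md"))
  git2r::add(repo, "README.md")
  git2r::commit(repo, message = "Add README")

  # Inspect the state of the working directory after the commit.
  print(git2r::status(repo))
}
```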
To aid reproducibility, workflowr provides an R Markdown output format, wflow_html(), that automatically sets a seed for random number generation, records the session information, and reports the status of the Git repository (so you always know which version of the code produced the results contained in that particular file). These options are controlled by the settings in _workflowr.yml. workflowr also provides a custom site generator, wflow_site(), that enables wflow_html() to work with R Markdown websites. The website options are controlled in analysis/_site.yml.
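The effect of setting a seed and recording session information can be sketched in base R. This is an illustration of the idea only, not workflowr's implementation, and the seed value is an arbitrary placeholder.

```r
# Arbitrary placeholder seed (an assumption for this sketch).
seed <- 12345

# Setting the same seed before each draw yields identical random numbers.
set.seed(seed)
first_draw <- rnorm(3)
set.seed(seed)
second_draw <- rnorm(3)
print(identical(first_draw, second_draw))

# Capture the R version and attached packages used for the analysis.
session <- sessionInfo()
print(session$R.version$version.string)
```

Fixing the seed makes results that depend on random numbers repeatable, and the session information documents the software environment that produced them.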
workflowr saves the figures into an organized, hierarchical directory structure within analysis/. For example, the first figure generated by the chunk named plot-data in the file filename.Rmd will be saved as analysis/figure/filename.Rmd/plot-data-1.png. Furthermore, the figure files are moved to docs/ when render_site is run (this is the rmarkdown package function called by wflow_build, wflow_publish, and the RStudio Knit button).
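The naming scheme in the example above can be expressed as a small base R helper. The function figure_path() is hypothetical and exists only for this sketch; it is not part of workflowr, and the png extension is assumed from the example.

```r
# Hypothetical helper: build the path of the n-th figure produced by a
# named chunk in a given R Markdown file, following the pattern
# analysis/figure/<file>/<chunk>-<n>.<ext>.
figure_path <- function(rmd, chunk, index = 1, ext = "png",
                        dir = "analysis") {
  file.path(dir, "figure", basename(rmd),
            paste0(chunk, "-", index, ".", ext))
}

# First figure from the chunk plot-data in filename.Rmd.
first_figure <- figure_path("filename.Rmd", "plot-data")
print(first_figure)

# Second figure from the same chunk.
second_figure <- figure_path("filename.Rmd", "plot-data", index = 2)
print(second_figure)
```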
The figures have to be committed to the Git repository in docs/ in order to be displayed properly on the website. wflow_publish automatically commits the figures in docs/ corresponding to new or updated R Markdown files, and analysis/figure/ is in the .gitignore file to prevent accidentally committing duplicate files.
Because workflowr requires the figures to be saved to a specific location in order to function properly, it will override any custom setting of the knitr option fig.path (which controls where figure files are saved) and insert a warning into the HTML file to alert the user that their value for fig.path was ignored.
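For context, the sketch below shows how the fig.path chunk option is ordinarily set and read back with knitr outside of workflowr. The directory name my-figures/ is a placeholder, and the code is guarded so that it only runs if knitr is installed. In a workflowr project, a custom value like this one would be overridden as described above.

```r
# Only run if knitr is available.
if (requireNamespace("knitr", quietly = TRUE)) {
  # Remember the current value so it can be restored afterwards.
  old_fig_path <- knitr::opts_chunk$get("fig.path")

  # Set a custom figure location (placeholder directory name).
  knitr::opts_chunk$set(fig.path = "my-figures/")
  custom_fig_path <- knitr::opts_chunk$get("fig.path")
  print(custom_fig_path)

  # Restore the previous value to avoid side effects.
  knitr::opts_chunk$set(fig.path = old_fig_path)
}
```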
Posit Software, PBC is a company that develops open source software for R users. They are the principal developers of RStudio, an integrated development environment (IDE) for R, and the rmarkdown package. Because of this tight integration, new developments in the rmarkdown package are quickly incorporated into the RStudio IDE. While not strictly required for using workflowr, using RStudio provides many benefits, including:
RStudio projects make it easier to set up your R environment, e.g. set the correct working directory, and quickly switch between different projects
The Git pane allows you to conveniently view your changes and run the main Git functions
The Viewer pane displays the rendered HTML results for immediate feedback
Clicking the Knit button automatically uses the Bootstrap options specified in _site.yml and moves the rendered HTML to the website subdirectory docs/ (requires version 1.0 or greater)
Includes an up-to-date copy of pandoc so you don’t have to install or update it
Tons of other cool features like debugging and source code inspection
Another key R package used by workflowr is rprojroot. This package finds the root of the repository, so workflowr functions like wflow_build will work the same regardless of the current working directory. Specifically, rprojroot searches for the RStudio project .Rproj file at the base of the workflowr project (so don’t delete it!).
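The root lookup can be sketched with rprojroot directly. The example builds a throwaway project in a temporary directory, with a placeholder .Rproj file and an analysis/ subdirectory, and then locates the project root starting from the subdirectory. The project and file names are assumptions for illustration, and the code is guarded so that it only runs if rprojroot is installed.

```r
# Only run if rprojroot is available.
if (requireNamespace("rprojroot", quietly = TRUE)) {
  # Build a throwaway project with a subdirectory.
  project_dir <- tempfile("myproject-")
  dir.create(file.path(project_dir, "analysis"), recursive = TRUE)

  # A minimal RStudio project file; the name is a placeholder.
  writeLines("Version: 1.0", file.path(project_dir, "myproject.Rproj"))

  # Start the search inside analysis/ and walk up to the project root.
  found_root <- rprojroot::find_root(
    rprojroot::is_rstudio_project,
    path = file.path(project_dir, "analysis")
  )

  # The directory found matches the project directory created above.
  print(basename(found_root) == basename(project_dir))
}
```

Because the search walks up from the starting directory until it finds the .Rproj file, the same root is returned whether the code runs from the project root or from a subdirectory.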
How the code, results, and figures are executed and displayed can be customized using knitr chunk and package options
How R Markdown websites are configured
Directions to publish a GitHub Pages site using the docs/ subdirectory