1 Introduction

This is about preparing Rmarkdown documents that exploit the special features available in Web pages. It is a work in progress.

2 First, Study the Rmarkdown Basics

The stationery package includes a vignette that introduces the markdown philosophy and the Rmarkdown version of it. It shows how to use R (R Core Team 2018) code chunks. This document is focused on the special features that might be obtained with HTML documents.

3 How to Compile the Document

The stationery package includes a vignette, stationery, that explains the process of compiling the document. The document can be compiled either by starting R and using the stationery function named rmd2html, or from the command line using the shell script rmd2html.sh that we provide with the package.
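
As a minimal sketch of the first route, the call below compiles a document from within an R session. The file name is a placeholder, and the guards are there only so that the snippet does nothing harmful when the package or the file is absent.

```r
## Hypothetical file name; replace it with your own Rmarkdown document.
rmd_file <- "filename.Rmd"

## Compile only if the stationery package is installed and the file exists.
if (requireNamespace("stationery", quietly = TRUE) && file.exists(rmd_file)) {
    stationery::rmd2html(rmd_file)
} else {
    message("Skipping: need the stationery package and ", rmd_file)
}
```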

The rendered output is an HTML file that can be opened using any browser. The HTML document has figures and cascading style sheets embedded in it, so it is nearly self-contained (it relies on the MathJax web server and possibly some external JavaScript).

4 Special Features for Rmd into HTML documents.

Rmarkdown intended for an HTML backend can include HTML code. If Rmarkdown is missing syntax to achieve some purpose, then the HTML approach will generally get the job done.

Because many Rmarkdown authors are unfamiliar with HTML code, quite a few syntactic shortcuts have been developed. As we explained in the Rmarkdown vignette, it is preferable to use the Rmarkdown syntax when it is available because this improves the portability of the document. However, when no markdown syntax exists, one must improvise.

In this section, we first emphasize 2 special features that are provided in our cascading style sheet that facilitate use of some pleasant HTML markup strategies. These are 1) colored callouts and 2) tabbed subsections.

4.1 Colored callouts

The stylesheet includes style code for “callout” sections. These were adapted from the HTML stylesheets in the bootstrap project.

A colored callout must begin as a level-4 markdown heading. The syntax begins with ####, followed by the heading text and then, in braces, the names of the style classes (this part is, actually, HTML style code). The colors we provide are “gray”, “red”, “orange”, “blue”, and “green”.
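
The heading line can also be generated from R, which avoids mistyping the class names. This is a minimal sketch: callout_heading is a hypothetical helper (it is not part of stationery), and it assumes the five class names listed above.

```r
## Build the markdown heading line for a colored callout.
## 'callout_heading' is a hypothetical helper, not part of stationery.
callout_heading <- function(title,
                            color = c("gray", "red", "orange", "blue", "green")) {
    ## match.arg() rejects any color outside the five provided ones;
    ## the first entry, "gray", is the default.
    color <- match.arg(color)
    sprintf("#### %s {.bs-callout .bs-callout-%s}", title, color)
}

## In a chunk with results="asis", cat() writes the heading into the document.
cat("\n", callout_heading("Red Callout", "red"), "\n\n", sep = "")
```

The blank lines written around the heading are a precaution so that the heading starts a new block.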

4.1.1 Demonstrating callouts

4.1.1.1 Gray Callout

The gray callout is created by this Rmarkdown code:

```
#### Gray Callout {.bs-callout .bs-callout-gray}
```

Perhaps “gray” is for wisdom. Perhaps it is just a visual separator between exciting colors like red and blue!

4.1.1.2 Red Callout

Syntax:

```
#### Red Callout {.bs-callout .bs-callout-red}
```

Red callout is for danger, in the eyes of some authors. Other authors just think it is pretty.

4.1.1.3 Orange Callout

Orange might be used for examples.

```
#### Orange Callout {.bs-callout .bs-callout-orange}
```

4.1.1.4 Blue Callout

```
#### Blue Callout {.bs-callout .bs-callout-blue}
```

Blue is for correct answers, at least according to the color Nazis.

4.1.1.5 Green Callout

```
#### Green Callout {.bs-callout .bs-callout-green}
```

Green is the color of the Earth, of course, so we use it for ideas, suggestions, or whatever we like.

4.1.1.6 What is the meaning of the colors

At one time, we were naming these things by their purpose rather than their colors. The purpose <==> color mapping was

| purpose | color  |
|:--------|:-------|
| info    | blue   |
| warning | orange |
| danger  | red    |

However, we concluded that some people might like to use red for warnings or orange for danger. We are all about diversity and concluded it was superficial to use purpose-based names. Some of us use the colored callout regions simply for decoration, so we don’t name them by purpose anymore.

Some of our older Rmarkdown documents do use that approach, however.

4.1.1.7 Other structures can be embedded in colored callout

This is an R code chunk embedded inside the red colored callout:

dat <- data.frame(x = rnorm(1000), y = rpois(1000, lambda = 7))
summary(dat)
       x                   y         
 Min.   :-3.128437   Min.   : 0.000  
 1st Qu.:-0.647860   1st Qu.: 5.000  
 Median : 0.054540   Median : 7.000  
 Mean   : 0.005542   Mean   : 6.809  
 3rd Qu.: 0.691346   3rd Qu.: 8.000  
 Max.   : 2.786802   Max.   :17.000  
hist(dat$x, xlab = "Monkey Weight (deviations)", main = "Histogram", prob = TRUE, ylim = c(0, 1))

Note that the colored callouts, which were level 4 headings, are terminated when the next heading is declared at level 2.

4.2 Interactive Tabs

This is the only feature that truly differentiates the HTML backend from PDF. The user can “interact” with the tabs. The major benefit is that a section in which there are, say, 5 large subsections can be made to seem shorter by “hiding” the subsections under the tabs.

In our style sheet, tabs are created in two steps. First, a level two markdown header (##) is introduced with the flag {.tabset .tabset-fade}. The tabs within that group are created by level 3 headers (###). To close down the tabbed section, it is necessary to introduce a new level 1 or 2 header.

Please note it is VERY IMPORTANT to include a blank line before a new tabbed section begins. If the line is omitted, then the new section will not be created properly.
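
The two steps can be sketched as a small R function that assembles the markdown lines, including the blank lines that the structure depends on. The function tabset_lines and the example texts are hypothetical; only the header syntax comes from the description above.

```r
## Assemble the markdown lines for a tabbed section.
## 'tabset_lines' is a hypothetical helper written for this illustration.
tabset_lines <- function(title, tabs) {
    header <- sprintf("## %s {.tabset .tabset-fade}", title)
    ## Each tab is a level-3 header followed by its text, with blank lines
    ## between elements so that every header starts a new block.
    body <- unlist(lapply(names(tabs), function(tab) {
        c(sprintf("### %s", tab), "", tabs[[tab]], "")
    }))
    ## The leading "" is the blank line required before the tabbed section.
    c("", header, "", body)
}

states <- list(Kansas = "Items about our fine state",
               Missouri = "Items about another fine state")
writeLines(tabset_lines("A very basic tabbed structure", states))
```

A new level 1 or level 2 header after these lines closes the tabbed section.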

4.3 A very basic tabbed structure

As demonstrated by this paragraph, commentary before the level-3 tabbed headers is allowed. In fact, one can introduce any number of paragraphs before the first level 3 header is inserted to begin the tabbed subsections.

4.3.1 Kansas

Items about our fine state

4.3.2 Missouri

Items about another fine state, which is not quite as good as Kansas

4.3.3 New York

My baby daughter exclaimed “New York stinks!” in 1990. Last time I was there, it was still correct to say that.

4.3.4 Connecticut

If you could retire as a rich person, this might be the right place to go.

4.4 We need to fix up the style a little bit

The “hidden” subsections are labeled, but not vividly, and our CSS is to blame. Or the CSS inherited from others is inadequate. Also we need to more easily color and dramatize these tabs. As discussed next, some raw HTML markup is needed to obtain colors.

4.4.1 I want more beautiful tabs!

The only way (that we know of) to get colors is to wrap the tab headers in a <span style> as shown below. This might be useful to draw attention to the tabs. Blue is the default color.

Note that it is necessary to declare the level-2 header again, to start a new tabset:

## A level-2 heading launches a new tabset, with color via HTML markup {.tabset .tabset-fade}

Followed by the tab captions, which are inside level-3 headers, including color markup:

### <span style="color:orange">An orange tab</span>

Here is the working example:

4.5 A level-2 heading launches a new tabset, with color via HTML markup

4.5.1 A red tab

4.5.1.1 This Red callout embedded under the red tab

Commentary about red stuff. We have embedded a red callout box here to have some pizzazz. Click “An orange tab” where we’ve hidden some R output.

4.5.2 An orange tab

Let’s try some R code within this tabbed level 3 section:

dat <- data.frame(x=rgamma(1000, 1.4))
hist(dat$x)

4.5.3 A tab with no special color is blue

words here!

4.6 Inserting images: Use HTML code

Pictures or graphics can be inserted into Rmarkdown documents. The usual markdown syntax for image inserts is

![alt text](image/location/file.png "Image Title Text")

That syntax is somewhat limiting, mostly because we cannot resize the images. Another limitation is that some graphics formats are not allowed. The suggested file formats are svg, png, and jpg, so graphics in pdf will not be usable as is.

To resize images, we need to resort to raw HTML code, which seems somewhat disappointing to many authors. HTML allows rescaling. We can specify both the width and the height of the image. In this example code, a png format file named “plot1.png” is inserted in the document.

<img src="ext_img/plot1.png" alt = "Floating .png"
  width  = "308"
  height = "216">

Authors who need to use graphics saved in other formats will need to convert to png, jpg, or svg. The gold standard of format converters is the convert command of the ImageMagick suite of tools. It is also possible to open a PDF in some editors, such as the GNU Image Manipulation Program (GIMP), and save as an image format. There are some websites that might be useful for this purpose, such as http://pdf2png.com.
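
The ImageMagick conversion can be driven from R. The following is only a sketch: convert_args and pdf_to_png are hypothetical helpers, the file names are placeholders, and the density of 150 dots per inch is an arbitrary choice. It assumes an ImageMagick installation that provides a convert executable on the search path (ImageMagick 7 installations name the main program magick, and on Windows an unrelated system utility is also called convert).

```r
## Build the argument vector for an ImageMagick PDF-to-PNG conversion.
## 'convert_args' and 'pdf_to_png' are hypothetical helpers; the density
## value (dots per inch used to rasterize the PDF) is an arbitrary choice.
convert_args <- function(pdf, png, density = 150) {
    c("-density", as.character(density), shQuote(pdf), shQuote(png))
}

## Run the conversion, but only when the program and the input file exist.
pdf_to_png <- function(pdf, png, density = 150) {
    exe <- Sys.which("convert")
    if (!nzchar(exe) || !file.exists(pdf)) {
        message("Skipping: 'convert' or the input file was not found")
        return(invisible(FALSE))
    }
    status <- system2(exe, convert_args(pdf, png, density))
    invisible(status == 0)
}

## Show the arguments that would be passed for a placeholder file pair.
convert_args("ext_img/plot1.pdf", "ext_img/plot1.png")
```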

4.7 International characters

If you can figure out how to insert characters with accents, they will display correctly. For example, Karl Gustav Jöreskog, Dag Sörbom, and Linda Muthén and Bengt Muthén. These are entered at the keyboard using editor-specific tools.
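
When the accented text lives inside an R chunk rather than in the prose, Unicode escapes are an alternative to editor-specific input methods. This small example only assumes base R; the escapes \u00f6 and \u00e9 stand for ö and é.

```r
## Unicode escapes give the same characters as typing them directly.
authors <- c("Karl Gustav J\u00f6reskog", "Dag S\u00f6rbom",
             "Linda Muth\u00e9n", "Bengt Muth\u00e9n")
print(authors)
```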

5 Illustration of Chunk Features

In the stationery package vignette named code_chunks, we explain the idea that in both LaTeX and Rmarkdown, one can insert R code chunks that will be processed. There, we spell out a list of requirements for any chunk-based system along with examples.

We run the same code chunks here, to compare the HTML output with PDF from the code_chunks vignette.

5.1 Chunks that do not generate graphics

  1. A chunk that is evaluated, echoed, both input and output. This is a standard chunk, no chunk options are used:

    The user will see both the input code and the output, each in a separate box:

    set.seed(234234)
    x <- rnorm(100)
    mean(x)
    [1] -0.1004232

    Notice the code highlighting is not entirely successful, and is different from results we see in other backends.

  2. A chunk with commands that are echoed into the document, but not evaluated (eval=F).

    When the document is compiled, the reader will see the depiction of the code, which is (by default) beautified and reformatted:

    set.seed(234234)
    x <- rnorm(100)
    mean(x)
  3. A chunk that is evaluated, with output displayed, but code is not echoed (echo=F). It is not necessary to specify eval=T because that is a default.

    The user will not see any code that runs, but only a result:

    [1] 0.2024592
  4. A hidden code chunk. A chunk that is evaluated, but neither is the input nor output displayed (include=F)

    What is the grammatically correct way to say “did you see nothing?” You should not even see an empty box? After that, the object x exists in the on-going R session, it can be put to use.
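
To make the four variants concrete, the sketch below prints the opening line that each kind of chunk would carry in the Rmarkdown source. The helper chunk_header and the chunk labels are hypothetical; the options (eval, echo, include) are the ones named in the list above.

```r
## Compose the opening line of a knitr chunk from a label and chunk options.
## 'chunk_header' is a hypothetical helper used only to spell out the options.
chunk_header <- function(label, ...) {
    opts <- list(...)
    fence <- strrep("`", 3)
    if (length(opts) == 0) {
        return(sprintf("%s{r %s}", fence, label))
    }
    ## deparse() turns FALSE into the text "FALSE", and so on.
    opt_text <- paste(names(opts), vapply(opts, deparse, character(1)),
                      sep = "=", collapse = ", ")
    sprintf("%s{r %s, %s}", fence, label, opt_text)
}

headers <- c(chunk_header("standard"),
             chunk_header("not-run", eval = FALSE),
             chunk_header("no-echo", echo = FALSE),
             chunk_header("hidden", include = FALSE))
writeLines(headers)
```

In the source document, each of these lines is followed by the R code and then a closing line of three backticks.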

5.2 Chunks with graphics

  1. A chunk that creates a graph, and allows it to be inserted into the document, but the code is not echoed for the reader to see.

  2. Save a graph in a file and display it at a later point.

    This can be achieved by specifying fig.show="hold", echo=F. Optionally we can specify the height and width of the figure with fig.height and fig.width (which are always in inches). The file will be saved in the current working directory.

    hist(x, main = "Another Histogram")
  3. A chunk that shows a series of plotting commands.

    This is a named chunk that is not evaluated, but it is displayed to the reader. The same code is then put to use twice in what follows.

    par(mar = c(3,2,0.5,0.5))
    cax <- 0.7 ## cex.axis
    plot(c(0, 1), c(0, 1), xlim = c(0,1), ylim = c(0,1), type = "n", ann = FALSE, axes = FALSE)
    rect(0, 0, 1, 1, col = "light grey", border = "grey")
    axis(1, tck = 0.01, pos = 0, cex.axis = cax, padj = -2.8, lwd = 0.3,
          at = seq(0, 1, by = 0.2), labels = c("", seq(0.2,0.8, by=0.2), ""))
    axis(2, tck = 0.01, pos = 0, cex.axis = cax, padj = 2.8, lwd = 0.3,
         at = seq(0, 1, by = 0.2), labels = c("", seq(0.2,0.8, by=0.2), ""))
    mtext(expression(x), side = 1, line = 0.5, at = .5, cex = cax)
    mtext(expression(y), side = 2, line = 0.5, at = .5, cex = cax)
    mtext(c("Min x", "Max x"), side = 1, line = -0.5, at = c(0.05, 0.95), cex = cax)
    mtext(c("Min y", "Max y"), side = 2, line = -0.5, at = c(0.05, 0.95), cex = cax)
    lines(c(.6, .6, 0), c(0, .6, .6), lty = "dashed")
    text(.6, .6, expression(paste("The location ",
                    group("(",list(x[i] == .6, y[i] == .6),")"))), pos = 3, cex = cax + 0.1)
    points(.6, .6, pch = 16)

    The first re-use of this code simply runs the whole chunk, and keeps the final figure. This figure is a png file that is embedded in the HTML document.

    A Special Figure

    A special feature of knitr is the ability to keep the intermediate plots that are produced by each line. An inspection of the tmpout directory shows that this code created several graphs. Observe there are several files:

    list.files("tmpout", pattern="p-chunk76.*png") 
     [1] "p-chunk76-1.png"  "p-chunk76-10.png" "p-chunk76-11.png"
     [4] "p-chunk76-2.png"  "p-chunk76-3.png"  "p-chunk76-4.png" 
     [7] "p-chunk76-5.png"  "p-chunk76-6.png"  "p-chunk76-7.png" 
    [10] "p-chunk76-8.png"  "p-chunk76-9.png" 

    In a way that is rather similar to the PDF backend, we use a backend-specific table structure to display four of the images. The display of the table’s caption is controlled by the style sheet.

<table border="0" cellpadding="0">
<caption>Figure: Table Array of Four Graphics</caption>
<tr><td><img src="tmpout/p-chunk76-4.png" height=350 width=350 alt = "a png"></td>
<td><img src="tmpout/p-chunk76-8.png" height=350 width=350 alt = "b png"></td></tr>
<tr><td><img src="tmpout/p-chunk76-9.png" height=350 width=350 alt = "c png"></td>
<td><img src="tmpout/p-chunk76-11.png" height=350 width=350 alt = "d png"> </td></tr>
</table>

Figure: Table Array of Four Graphics
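
The table markup can also be generated rather than typed by hand. The following is a sketch under stated assumptions: img_table is a hypothetical helper, the file names are the ones listed in tmpout above, and the output is meant to be written from a chunk that uses results="asis". The lines are produced without leading spaces.

```r
## Arrange image files in an HTML table, 'ncol' images per row.
## 'img_table' is a hypothetical helper; the attribute values mirror the
## hand-written table, except that the alt text is the file name.
img_table <- function(files, ncol = 2, size = 350, caption = NULL) {
    cells <- sprintf('<td><img src="%s" height=%d width=%d alt="%s"></td>',
                     files, as.integer(size), as.integer(size), basename(files))
    ## Split the cells into consecutive groups of 'ncol' to form the rows.
    row_id <- ceiling(seq_along(cells) / ncol)
    rows <- vapply(split(cells, row_id),
                   function(x) paste0("<tr>", paste(x, collapse = ""), "</tr>"),
                   character(1))
    c('<table border="0" cellpadding="0">',
      if (!is.null(caption)) sprintf("<caption>%s</caption>", caption),
      unname(rows),
      "</table>")
}

files <- file.path("tmpout", sprintf("p-chunk76-%d.png", c(4, 8, 9, 11)))
writeLines(img_table(files, caption = "Figure: Table Array of Four Graphics"))
```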

5.3 Chunks with tables

Markdown includes a rather crude table-making syntax. We used it above to display the purpose-to-color relationship of the colored callout boxes. For most serious analysis, that type of table will not be sufficient.

The chunk option results="asis" is used to display HTML markup that can be created by R functions. The cascading style sheet will play an important role in the final display. If we are unhappy with the rendering of the tables, we should concentrate on fixing the CSS, rather than finger-painting borders and such. While working on this project we discovered a flaw in the pandoc processing engine that caused tables to fail. If the HTML generated by the chunk includes spaces, pandoc can be fooled into thinking the text is markdown rather than HTML.
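
One way to guard against this is to clean the generated HTML before it is written with cat(). The sketch below assumes that the trouble comes from leading indentation and blank lines (markdown treats lines indented by four or more spaces as code); strip_html_indent is a hypothetical helper, not a function from any of the packages used here, and the sample HTML is invented for the demonstration.

```r
## Remove leading indentation and blank lines from generated HTML.
## 'strip_html_indent' is a hypothetical helper. It assumes that leading
## whitespace and empty lines are what cause the trouble.
strip_html_indent <- function(html) {
    lines <- unlist(strsplit(html, "\n", fixed = TRUE))
    lines <- sub("^[ \t]+", "", lines)
    paste(lines[nzchar(lines)], collapse = "\n")
}

## Invented sample: an indented table with an empty line inside it.
raw <- "<table>\n    <tr>\n        <td>Amod</td>\n\n    </tr>\n</table>"
cat(strip_html_indent(raw), "\n")
```

Spaces inside a line, such as those within a table cell, are left alone.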

  1. Using results="asis" to display HTML markup: Regression Tables

There are many packages that can create near-publication-quality regression tables. Here is an example of how the rockchalk package can create HTML code for a regression table. In this case, we run the code chunk to generate the HTML code, and then we have to manually purge the extra spaces in the output so that pandoc will not corrupt the output.

cat(or1)

|             | Amod      | Bmod      | Gmod      |
|:------------|:----------|:----------|:----------|
|             | Estimate  | Estimate  | Estimate  |
|             | (S.E.)    | (S.E.)    | (S.E.)    |
| (Intercept) | 30.245*** | 29.774*** | 30.013*** |
|             | (0.618)   | (0.522)   | (0.490)   |
| x1          | 1.546*    | _         | 2.217***  |
|             | (0.692)   |           | (0.555)   |
| x2          | _         | 3.413***  | 3.717***  |
|             |           | (0.512)   | (0.483)   |
| N           | 100       | 100       | 100       |
| RMSE        | 6.121     | 5.205     | 4.849     |
| R2          | 0.048     | 0.312     | 0.409     |
| adj R2      | 0.039     | 0.305     | 0.397     |

Note: * p ≤ 0.05, ** p ≤ 0.01, *** p ≤ 0.001

  2. General purpose HTML output from the pander package

The package pander offers flexibility and functionality. It can display an R table, such as the coefficient object generated by a regression summary

library(pander)    
sum <- summary(m1)
pander(sum$coefficients)
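
|             | Estimate | Std. Error | t value | Pr(>\|t\|) |
|:------------|:---------|:-----------|:--------|:-----------|
| (Intercept) | 30.25    | 0.6176     | 48.97   | 1.042e-70  |
| x1          | 1.546    | 0.6924     | 2.232   | 0.02789    |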

and it can also display a matrix created by the package psych

library(psych)
pander(describe(dat))
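
Table continues below

|    | vars | n   | mean    | sd     | median   | trimmed  | mad    | min    |
|:---|:-----|:----|:--------|:-------|:---------|:---------|:-------|:-------|
| x1 | 1    | 100 | -0.1192 | 0.8884 | -0.02763 | -0.09578 | 0.9889 | -2.614 |
| x2 | 2    | 100 | 0.0841  | 1.021  | 0.1955   | 0.1211   | 0.9755 | -2.562 |
| y1 | 3    | 100 | 30.06   | 6.243  | 30.65    | 30.06    | 6.384  | 16.59  |
| y2 | 4    | 100 | 0.2453  | 5.16   | 0.7159   | 0.3888   | 4.532  | -13.2  |

|    | max   | range | skew    | kurtosis | se      |
|:---|:------|:------|:--------|:---------|:--------|
| x1 | 1.912 | 4.526 | -0.2958 | -0.1176  | 0.08884 |
| x2 | 2.233 | 4.794 | -0.3634 | -0.2852  | 0.1021  |
| y1 | 48.44 | 31.85 | 0.03705 | -0.2777  | 0.6243  |
| y2 | 11.32 | 24.52 | -0.2685 | -0.2653  | 0.516   |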

6 Dealing with missing features in HTML documents

Some document elements that are available in PDF output are missing in Rmarkdown to HTML conversion. The most serious missing pieces are numbered and labeled “floating” tables, figures, and equations. These losses seem nearly fatal for the HTML backend and are a strong reason why one should prefer PDF.

Nevertheless, for Web pages, some authors truly prefer HTML output (maybe because they like colored callouts and tabbed sections). As a result, we have some workarounds for these problems.

6.1 Equation Numbering

In “display equation” mathematics, we want to insert numbered equations and then refer to them. Unfortunately, Rmarkdown to HTML does not support auto-numbering equations. However, one can number equations manually by adding \tag{} to the end of equations. For example,

\[ + - = \approx \ne \ge \lt \pm\tag{1} \]

\[ \pi \approx 3.1415927\tag{2} \]

\[ a_i \ge 0~~~\forall i\tag{3} \]

\[ x \lt 15\tag{4} \]

Unfortunately, when new equations are inserted, it will be necessary to manually renumber these. In addition, there is no HTML backend method to then refer to equation (3) without explicitly typing in the equation number.
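
Because the document is processed by R before it becomes HTML, one possible workaround is to let R keep track of the numbers. The sketch below is not a feature of stationery or knitr; eq_counter and its tag and ref functions are hypothetical names, and the labels are invented. Numbers are handed out in the order in which labels are first tagged.

```r
## A small registry that hands out equation numbers in order of first use.
## 'eq_counter', 'tag' and 'ref' are hypothetical names for this sketch.
eq_counter <- function() {
    labels <- character(0)
    list(
        ## Register the label if it is new, then return its \tag{} markup.
        tag = function(label) {
            if (!label %in% labels) labels <<- c(labels, label)
            sprintf("\\tag{%d}", match(label, labels))
        },
        ## Return "(n)" for a label that has already been tagged.
        ref = function(label) {
            stopifnot(label %in% labels)
            sprintf("(%d)", match(label, labels))
        }
    )
}

eq <- eq_counter()
cat(eq$tag("symbols"), "\n")   # \tag{1}
cat(eq$tag("pi"), "\n")        # \tag{2}
cat(eq$ref("pi"), "\n")        # (2)
```

In the document, an inline R expression calling eq$tag("pi") would take the place of the hand-typed tag, and eq$ref("pi") would supply the number in the text. This relies on inline R code being evaluated before the math is rendered, which should be checked in your own setup, and it only supports references to equations that have already appeared.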

6.2 Cross references

HTML does offer its own form of cross referencing by hyperlink anchors, however. Suppose we want the reader to be able to click a link that goes to a figure that we have presented previously. We go to that figure and insert HTML code along these lines:

<a name="specialfig"></a>

When we want to write something like “click here to see the special figure”, the HTML markup is

<a href="#specialfig">click here to see the special figure</a>

7 Policies about writing in these documents.

  1. Use these callouts to attract attention.

  2. Blank rows separate paragraphs.

  3. The character width of rows should be 80 or less. I have no idea how anybody thinks they have a right to impose an infinitely long row, but it’s bad. Edit the document with Emacs, run M-q to get re-positioned text. If your editor cannot do that, quit using it.

  4. Make sure to compile using the kutils.css located in the stationery package. For example, stationery::rmd2html("filename.Rmd")
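
The row-width policy can be checked mechanically. This is a minimal sketch in base R; long_lines is a hypothetical helper, and the temporary file stands in for a real Rmarkdown document.

```r
## Report the lines of a file that are wider than 'width' characters.
## 'long_lines' is a hypothetical helper, not part of stationery.
long_lines <- function(path, width = 80) {
    lines <- readLines(path, warn = FALSE)
    too_long <- which(nchar(lines, type = "chars") > width)
    data.frame(line = too_long,
               nchar = nchar(lines[too_long], type = "chars"))
}

## Demonstration on a temporary file with one short and one long line.
tmp <- tempfile(fileext = ".Rmd")
writeLines(c("A short line.", strrep("x", 95)), tmp)
long_lines(tmp)
```

Lines that hold long URLs or raw HTML may be flagged as well, so the report is a list of candidates to inspect rather than a list of errors.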

8 Session Info

R Under development (unstable) (2019-10-26 r77334)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: Ubuntu 19.10

Matrix products: default
BLAS:   /tmp/r-devel/lib/R/lib/libRblas.so
LAPACK: /tmp/r-devel/lib/R/lib/libRlapack.so

locale:
 [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C              
 [3] LC_TIME=en_US.UTF-8        LC_COLLATE=C              
 [5] LC_MONETARY=en_US.UTF-8    LC_MESSAGES=en_US.UTF-8   
 [7] LC_PAPER=en_US.UTF-8       LC_NAME=C                 
 [9] LC_ADDRESS=C               LC_TELEPHONE=C            
[11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C       

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods  
[7] base     

other attached packages:
[1] psych_1.8.12       pander_0.6.3       rockchalk_1.8.144 
[4] stationery_0.98.24

loaded via a namespace (and not attached):
 [1] zip_2.0.4        Rcpp_1.0.2       compiler_4.0.0  
 [4] nloptr_1.2.1     plyr_1.8.4       highr_0.8       
 [7] tools_4.0.0      boot_1.3-23      digest_0.6.22   
[10] lme4_1.1-21      evaluate_0.14    nlme_3.1-141    
[13] lattice_0.20-38  rlang_0.4.1      openxlsx_4.1.0.1
[16] Matrix_1.2-17    parallel_4.0.0   yaml_2.2.0      
[19] pbivnorm_0.6.0   xfun_0.10        stringr_1.4.0   
[22] knitr_1.25       stats4_4.0.0     grid_4.0.0      
[25] foreign_0.8-72   rmarkdown_1.16   lavaan_0.6-5    
[28] carData_3.0-2    minqa_1.2.4      magrittr_1.5    
[31] htmltools_0.4.0  MASS_7.3-51.4    kutils_1.69     
[34] splines_4.0.0    mnormt_1.5-5     xtable_1.8-4    
[37] stringi_1.4.3   

Available under Creative Commons license 3.0 CC BY

References

R Core Team. 2018. R: A Language and Environment for Statistical Computing. Vienna, Austria: R Foundation for Statistical Computing. https://cran.r-project.org/doc/manuals/r-release/fullrefman.pdf.