What happens when a tibble is printed? This vignette documents the
control flow and the data flow, explains the design choices, and shows
the default implementation for the "tbl" class. It is mainly of
interest for implementers of table subclasses. Customizing the
formatting of a vector class in a tibble is described in
vignette("pillar", package = "vctrs"). The different customization
options are showcased in vignette("extending").
Fit into pre-specified width, distributing across multiple tiers if necessary
Optionally shrink and stretch individual columns
Header, body and footer for the tibble
Custom components for the pillars in a tibble, top-aligned
Customization of the entire output and of the pillars
Support for data frame columns (packed data frames) and matrix/array columns
Pillars are always shown from left to right, no “holes” in the colonnade
Printing pillars should take time proportional to the number of characters printed, and be “fast enough”.
The overall control and data flow are illustrated in the diagram below. Boxes are functions and methods. Solid lines are function calls. Dotted lines represent information that a function receives via its arguments or, in the case of options, actively queries.
[Diagram: control and data flow when printing a tibble]
The pillar package uses debugme for debugging. Activating debugging
for pillar is another way to track the control flow, see
vignette("debugme") for details.
A tibble is a list of columns of class "tbl_df" and "tbl". Printing is
designed to work for non-data-frame table-like objects such as lazy
tables. The print.tbl() method calls format() for the object and prints
the output.
tbl <- tibble::tibble(a = 1:3, b = tibble::tibble(c = 4:6, d = 7:9), e = 10:12)
print(tbl, width = 23)
#> # A tibble: 3 × 3
#> a b$c e
#> <int> <int> <int>
#> 1 1 4 10
#> 2 2 5 11
#> 3 3 6 12
#> # ℹ 1 more variable:
#> # b$d <int>
str(tbl)
#> tibble [3 × 3] (S3: tbl_df/tbl/data.frame)
#> $ a: int [1:3] 1 2 3
#> $ b: tibble [3 × 2] (S3: tbl_df/tbl/data.frame)
#> ..$ c: int [1:3] 4 5 6
#> ..$ d: int [1:3] 7 8 9
#> $ e: int [1:3] 10 11 12
pillar:::print.tbl()
The format.tbl() method creates a setup object, and uses that object to
format header, body and footer.
pillar:::format.tbl()
While it’s possible to extend or override these methods for your "tbl"
subclass, often overriding the more specialized methods shown below is
sufficient.
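As a rough sketch of the flow described above, the three formatting generics can be called by hand on a single setup object. The calls use pillar's exported generics; combining the pieces with c() illustrates the idea and is not a copy of the internal method.

```r
library(pillar)

tbl <- tibble::tibble(a = 1:3, b = tibble::tibble(c = 4:6, d = 7:9), e = 10:12)

# Compute the setup object once, with the desired width baked in
setup <- tbl_format_setup(tbl, width = 24)

# Each generic receives the table and the shared setup object
header <- tbl_format_header(tbl, setup)
body <- tbl_format_body(tbl, setup)
footer <- tbl_format_footer(tbl, setup)

# Combine the pieces into one character vector and print it
out <- c(header, body, footer)
writeLines(out)
```

A subclass that only needs a different footer, for example, can override tbl_format_footer() and leave the other pieces alone.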
Most of the work for formatting actually happens in tbl_format_setup().
The desired output width is baked into the setup object and must be
available when calling. Setup objects print like a tibble but with a
clear separation of header, body, and footer.
setup <- tbl_format_setup(tbl, width = 24)
setup
#> <pillar_tbl_format_setup>
#> <tbl_format_header(setup)>
#> # A tibble: 3 × 3
#> <tbl_format_body(setup)>
#> a b$c e
#> <int> <int> <int>
#> 1 1 4 10
#> 2 2 5 11
#> 3 3 6 12
#> <tbl_format_footer(setup)>
#> # ℹ 1 more variable:
#> # b$d <int>
A setup object is required here to avoid computing information twice. For instance, the dimensions shown in the header or the extra columns displayed in the footer are available only after the body has been computed.
The generic dispatches over the container, so that you can override it if necessary. It is responsible for assigning default values to arguments before passing them on to the method.
tbl_format_setup()
tbl_format_setup <- function (x, width = NULL, ..., setup = list(tbl_sum = tbl_sum(x)),
    n = NULL, max_extra_cols = NULL, max_footer_lines = NULL,
    focus = NULL)
{
    "!!!!DEBUG tbl_format_setup()"
    width <- get_width_print(width)
    n <- get_n_print(n, tbl_nrow(x))
    max_extra_cols <- get_max_extra_cols(max_extra_cols)
    max_footer_lines <- get_max_footer_lines(max_footer_lines)
    out <- tbl_format_setup_dispatch(x, width, ..., setup = setup,
        n = n, max_extra_cols = max_extra_cols, max_footer_lines = max_footer_lines,
        focus = focus)
    return(out)
    UseMethod("tbl_format_setup")
}
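The pattern used by the generic (resolve defaults first, then dispatch) can be sketched in base R. All names here (describe(), describe_dispatch(), the "my_tbl" class) are hypothetical, and the sketch assumes that the internal dispatch helper simply calls UseMethod() under the user-facing generic's name.

```r
# Hypothetical generic: fills in defaults, then hands over to dispatch
describe <- function(x, width = NULL, ...) {
  if (is.null(width)) {
    width <- getOption("width", 80L)
  }
  describe_dispatch(x, width, ...)
}

# Internal helper that performs the actual S3 dispatch under the
# user-facing generic's name
describe_dispatch <- function(x, width, ...) {
  UseMethod("describe")
}

# Methods always receive a concrete width, never NULL
describe.default <- function(x, width, ...) {
  paste0("<", class(x)[[1]], "> formatted for width ", width)
}

describe.my_tbl <- function(x, width, ...) {
  paste0("my_tbl: ", NextMethod())
}

x <- structure(list(), class = "my_tbl")
describe(x, width = 40)
describe(1:3)
```

With this split, method authors do not have to repeat the default handling, and callers can still leave arguments unset.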
The default implementation converts the input to a data frame via
as.data.frame(head(x)), and returns an object constructed with
new_tbl_format_setup() that contains the data frame and additional
information. If you override this method, e.g. to incorporate more
information, you can add new items to the default setup object, but you
should not overwrite existing items.
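A minimal sketch of such an override, assuming a hypothetical subclass "my_tbl" and a hypothetical item name my_note: the method calls the default implementation through NextMethod(), adds one new item, and leaves the existing items untouched. A matching footer method then picks the item up.

```r
library(pillar)

# Hypothetical subclass for illustration
x <- tibble::tibble(a = 1:3)
class(x) <- c("my_tbl", class(x))

# Extend the default setup object with one new item
tbl_format_setup.my_tbl <- function(x, width, ...) {
  setup <- NextMethod()
  setup$my_note <- "formatted by my_tbl"
  setup
}

# Use the new item when formatting the footer
tbl_format_footer.my_tbl <- function(x, setup, ...) {
  c(NextMethod(), paste0("# ", setup$my_note))
}

setup <- tbl_format_setup(x, width = 40)
footer <- tbl_format_footer(x, setup)
footer
```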
pillar:::tbl_format_setup.tbl()
tbl_format_setup.tbl <- function (x, width, ..., setup, n, max_extra_cols, max_footer_lines,
    focus)
{
    "!!!!DEBUG tbl_format_setup.tbl()"
    if (is.null(setup)) {
        tbl_sum <- tbl_sum(x)
        return(new_tbl_format_setup(width, tbl_sum, rows_total = NA))
    }
    else {
        tbl_sum <- setup$tbl_sum
    }
    rows <- tbl_nrow(x)
    lazy <- is.na(rows)
    if (lazy) {
        df <- as.data.frame(head(x, n + 1))
        if (nrow(df) <= n) {
            rows <- nrow(df)
        }
        else {
            df <- vec_head(df, n)
        }
    }
    else {
        df <- df_head(x, n)
    }
    if (is.na(rows)) {
        needs_dots <- (nrow(df) >= n)
    }
    else {
        needs_dots <- (rows > n)
    }
    if (needs_dots) {
        rows_missing <- rows - n
    }
    else {
        rows_missing <- 0
    }
    rownames(df) <- NULL
    colonnade <- ctl_colonnade(df, has_row_id = if (!lazy &&
        .row_names_info(x) > 0)
        "*"
    else TRUE, width = width, controller = x, focus = focus)
    body <- colonnade$body
    extra_cols <- colonnade$extra_cols
    extra_cols_total <- length(extra_cols)
    if (extra_cols_total > max_extra_cols) {
        length(extra_cols) <- max_extra_cols
    }
    abbrev_cols <- colonnade$abbrev_cols
    new_tbl_format_setup(x = x, df = df, width = width, tbl_sum = tbl_sum,
        body = body, rows_missing = rows_missing, rows_total = rows,
        extra_cols = extra_cols, extra_cols_total = extra_cols_total,
        max_footer_lines = max_footer_lines, abbrev_cols = abbrev_cols)
}
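The lazy branch in the method above requests n + 1 rows so that it can tell whether more rows exist without counting them all. The idea can be sketched in base R; peek_rows() and fetch() are hypothetical names, with fetch(k) standing in for as.data.frame(head(x, k)) on a lazy table.

```r
# Hypothetical helper: ask for one row more than will be shown
peek_rows <- function(fetch, n) {
  df <- fetch(n + 1)
  if (nrow(df) <= n) {
    # The source ran dry, so the total row count is now known
    list(df = df, rows_total = nrow(df), more_rows = FALSE)
  } else {
    # At least one more row exists: keep n rows, total stays unknown
    list(df = utils::head(df, n), rows_total = NA_integer_, more_rows = TRUE)
  }
}

# A plain data frame plays the role of the remote source
source_df <- data.frame(id = 1:5)
fetch <- function(k) utils::head(source_df, k)

small <- peek_rows(fetch, n = 10)
large <- peek_rows(fetch, n = 3)
```

In the first call the source has fewer rows than requested, so the total is known. In the second call only the existence of further rows is known, which is enough to decide whether the output must indicate truncation.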
At the core, the internal function ctl_colonnade() composes the body.
Its functionality and the customization points it offers are detailed
in the “Colonnade” section below.
The internal function ctl_colonnade() composes the body.
It performs several tasks, including:
Creating the pillar objects via ctl_new_pillar_list(), ctl_new_pillar() and ultimately pillar() and pillar_shaft()
Formatting each pillar via its format() function, passing the now known width
In the following, the first and the fourth steps are discussed.
The initial tibble is passed to ctl_new_pillar_list(), which eventually
calls ctl_new_pillar() once or several times. For each top-level
column, one pillar object is constructed. The loop is terminated when
the available width is exhausted even considering the minimum width.
The ctl_new_pillar_list() generic dispatches on the container:
ctl_new_pillar_list(tbl, tbl$a, width = 20)
#> [[1]]
#> <pillar>
#> <int>
#> 1
#> 2
#> 3
#>
#> attr(,"remaining_width")
#> [1] 14
#> attr(,"simple")
#> [1] TRUE
ctl_new_pillar_list(tbl, tbl$b, width = 20)
#> [[1]]
#> <pillar>
#> c
#> <int>
#> 4
#> 5
#> 6
#>
#> [[2]]
#> <pillar>
#> d
#> <int>
#> 7
#> 8
#> 9
#>
#> attr(,"extra")
#> character(0)
#> attr(,"remaining_width")
#> [1] 8
#> attr(,"simple")
#> [1] FALSE
In a tibble, each column can itself be a data frame, matrix, or even
array; such columns are called compound columns. They are decomposed
into sub-pillars and returned as a list of pillars. Regular vectors are
forwarded to ctl_new_pillar() and returned as a list of length one.
Implementers of "tbl" subclasses will rarely if ever need to extend or
override this method.
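The difference between compound and regular columns can be seen by calling the generic directly. In this sketch an empty tibble serves as the controller so that the "tbl" method is used; the width of 60 is an arbitrary choice that leaves enough room.

```r
library(pillar)

# A matrix stands in for a matrix column of a tibble
m <- matrix(1:6, ncol = 2)
compound <- ctl_new_pillar_list(tibble::tibble(), m, width = 60)

# A regular vector is wrapped in a list of length one
single <- ctl_new_pillar_list(tibble::tibble(), 1:3, width = 60)

length(compound)
length(single)
class(single[[1]])
```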
pillar:::ctl_new_pillar_list.tbl()
ctl_new_pillar_list.tbl <- function (controller, x, width, ..., title = NULL, first_pillar = NULL)
{
    "!!!!DEBUG ctl_new_pillar_list.tbl(`v(width)`, `v(title)`)"
    if (is.data.frame(x)) {
        new_data_frame_pillar_list(x, controller, width, title = title,
            first_pillar = first_pillar)
    }
    else if (is.matrix(x) && !inherits(x, c("Surv", "Surv2"))) {
        new_matrix_pillar_list(x, controller, width, title = title,
            first_pillar = first_pillar)
    }
    else if (is.array(x) && length(dim(x)) > 2) {
        new_array_pillar_list(x, controller, width, title = title,
            first_pillar = first_pillar)
    }
    else {
        if (is.null(first_pillar)) {
            first_pillar <- ctl_new_pillar(controller, x, width,
                ..., title = prepare_title(title))
        }
        new_single_pillar_list(first_pillar, width)
    }
}
The ctl_new_pillar() method is called for columns that are not data
frames or arrays, and also dispatches over the container.
pillar:::ctl_new_pillar.tbl()
The default method calls pillar() directly, passing the maximum width
available.
pillar()
Formatting for title and type is provided by new_pillar_title() and
new_pillar_type(). The body can be customized by implementing
pillar_shaft() for a vector class, see
vignette("pillar", package = "vctrs") for details. If title or type
don’t fit the available width, pillar_shaft() is never called.
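A quick way to see these pieces is to build a pillar by hand and inspect its component names. This is only an illustration: the column values and the title "a" are made up.

```r
library(pillar)

# Build a pillar for an integer vector with a title
p <- pillar(1:3, title = "a")

# The components correspond to title, type and body (data)
names(p)
```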
This function now returns NULL if the width is insufficient to contain
the data. It is possible to change the appearance of pillars by
overriding or extending ctl_new_pillar().
Pillar objects share the same structure and are ultimately constructed
with new_pillar().
new_pillar()
new_pillar <- function (components, ..., width = NULL, class = NULL, extra = deprecated())
{
    "!!!!DEBUG new_pillar(`v(width)`, `v(class)`)"
    if (is_present(extra)) {
        deprecate_warn("1.7.0", "pillar::new_pillar(extra = )")
    }
    check_dots_empty()
    if (length(components) > 0 && !is_named(components)) {
        abort("All components must have names.")
    }
    structure(components, width = width, class = c(class, "pillar"))
}
A pillar is stored as a list of components. Each pillar represents only one simple (atomic) column; compound columns are always represented as multiple pillar objects.
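As a small sketch of this structure, a pillar can be assembled from named components with new_pillar(). The choice of components here (a title and a data shaft, without a type) is purely illustrative, and the sketch assumes that pillar_component() accepts the objects returned by new_pillar_title() and pillar_shaft().

```r
library(pillar)

# Assemble a pillar from two named components
p <- new_pillar(list(
  title = pillar_component(new_pillar_title("x")),
  data = pillar_component(pillar_shaft(1:3))
))

class(p)
names(p)
```

Because all components must be named, a custom method can pick individual components from an existing pillar and pass a modified list to new_pillar().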
When a pillar object is constructed, it has a minimum and a desired
(maximum) width. The final width is not known yet, because it depends
on the number and width of other pillar objects that may not even be
constructed. It is passed to format(), which uses the desired width if
none is given: